@promptbook/markitdown 0.86.31 → 0.88.0-10
- package/README.md +21 -43
- package/esm/index.es.js +83 -10
- package/esm/index.es.js.map +1 -1
- package/esm/typings/src/_packages/node.index.d.ts +2 -0
- package/esm/typings/src/_packages/types.index.d.ts +2 -0
- package/esm/typings/src/_packages/utils.index.d.ts +2 -0
- package/esm/typings/src/cli/cli-commands/common/handleActionErrors.d.ts +11 -0
- package/esm/typings/src/execution/ExecutionTask.d.ts +24 -0
- package/esm/typings/src/scrapers/_common/register/$provideScriptingForNode.d.ts +11 -0
- package/esm/typings/src/scripting/javascript/JavascriptEvalExecutionTools.d.ts +1 -1
- package/esm/typings/src/scripting/javascript/JavascriptExecutionTools.d.ts +1 -1
- package/esm/typings/src/scripting/javascript/postprocessing-functions.d.ts +1 -1
- package/esm/typings/src/scripting/javascript/utils/extractVariablesFromJavascript.d.ts +1 -1
- package/esm/typings/src/utils/serialization/jsonStringsToJsons.d.ts +9 -0
- package/esm/typings/src/utils/serialization/jsonStringsToJsons.test.d.ts +1 -0
- package/package.json +2 -2
- package/umd/index.umd.js +82 -9
- package/umd/index.umd.js.map +1 -1
- /package/esm/typings/src/_packages/{execute-javascript.index.d.ts → javascript.index.d.ts} +0 -0
package/README.md
CHANGED
@@ -23,6 +23,10 @@
 
 
 
+<blockquote style="color: #ff8811">
+<b>⚠ Warning:</b> This is a pre-release version of the library. It is not yet ready for production use. Please look at <a href="https://www.npmjs.com/package/@promptbook/core?activeTab=versions">latest stable release</a>.
+</blockquote>
+
 ## 📦 Package `@promptbook/markitdown`
 
 - Promptbooks are [divided into several](#-packages) packages, all are published from [single monorepo](https://github.com/webgptorg/promptbook).
@@ -54,8 +58,6 @@ Rest of the documentation is common for **entire promptbook ecosystem**:
 
 During the computer revolution, we have seen [multiple generations of computer languages](https://github.com/webgptorg/promptbook/discussions/180), from the physical rewiring of the vacuum tubes through low-level machine code to the high-level languages like Python or JavaScript. And now, we're on the edge of the **next revolution**!
 
-
-
 It's a revolution of writing software in **plain human language** that is understandable and executable by both humans and machines – and it's going to change everything!
 
 The incredible growth in power of microprocessors and the Moore's Law have been the driving force behind the ever-more powerful languages, and it's been an amazing journey! Similarly, the large language models (like GPT or Claude) are the next big thing in language technology, and they're set to transform the way we interact with computers.
@@ -112,27 +114,28 @@ Promptbook project is ecosystem of multiple projects and tools, following is a l
 </tbody>
 </table>
 
+Hello world examples:
+
+- [Hello world](https://github.com/webgptorg/hello-world)
+- [Hello world in Node.js](https://github.com/webgptorg/hello-world-node-js)
+- [Hello world in Next.js](https://github.com/webgptorg/hello-world-next-js)
+
 We also have a community of developers and users of **Promptbook**:
 
 - [Discord community](https://discord.gg/x3QWNaa89N)
 - [Landing page `ptbk.io`](https://ptbk.io)
 - [Github discussions](https://github.com/webgptorg/promptbook/discussions)
 - [LinkedIn `Promptbook`](https://linkedin.com/company/promptbook)
-- [Facebook `Promptbook`](https://www.facebook.com/61560776453536)
+- [Facebook `Promptbook`](https://www.facebook.com/61560776453536)
 
 And **Promptbook.studio** branded socials:
 
-
-
 - [Instagram `@promptbook.studio`](https://www.instagram.com/promptbook.studio/)
 
 And **Promptujeme** sub-brand:
 
 _/Subbrand for Czech clients/_
 
-
-
-
 - [Promptujeme.cz](https://www.promptujeme.cz/)
 - [Facebook `Promptujeme`](https://www.facebook.com/promptujeme/)
 
@@ -150,8 +153,6 @@ _/Sub-brand for images and graphics generated via Promptbook prompting/_
 
 ## 💙 The Book language
 
-
-
 Following is the documentation and blueprint of the [Book language](https://github.com/webgptorg/book).
 
 Book is a language that can be used to write AI applications, agents, workflows, automations, knowledgebases, translators, sheet processors, email automations and more. It allows you to harness the power of AI models in human-like terms, without the need to know the specifics and technicalities of the models.
@@ -201,8 +202,6 @@ Personas can have access to different knowledge, tools and actions. They can als
 
 - [PERSONA](https://github.com/webgptorg/promptbook/blob/main/documents/commands/PERSONA.md)
 
-
-
 ### **How:** Knowledge, Instruments and Actions
 
 The resources used by the personas are used to do the work.
@@ -249,7 +248,7 @@ Or you can install them separately:
 - ⭐ **[@promptbook/utils](https://www.npmjs.com/package/@promptbook/utils)** - Utility functions used in the library but also useful for individual use in preprocessing and postprocessing LLM inputs and outputs
 - **[@promptbook/markdown-utils](https://www.npmjs.com/package/@promptbook/markdown-utils)** - Utility functions used for processing markdown
 - _(Not finished)_ **[@promptbook/wizzard](https://www.npmjs.com/package/@promptbook/wizzard)** - Wizard for creating+running promptbooks in single line
-- **[@promptbook/
+- **[@promptbook/javascript](https://www.npmjs.com/package/@promptbook/javascript)** - Execution tools for javascript inside promptbooks
 - **[@promptbook/openai](https://www.npmjs.com/package/@promptbook/openai)** - Execution tools for OpenAI API, wrapper around OpenAI SDK
 - **[@promptbook/anthropic-claude](https://www.npmjs.com/package/@promptbook/anthropic-claude)** - Execution tools for Anthropic Claude API, wrapper around Anthropic Claude SDK
 - **[@promptbook/vercel](https://www.npmjs.com/package/@promptbook/vercel)** - Adapter for Vercel functionalities
@@ -278,16 +277,9 @@ Or you can install them separately:
 
 ## 📚 Dictionary
 
-
-
-
-
-
-### 📚 Dictionary
-
 The following glossary is used to clarify certain concepts:
 
-
+### General LLM / AI terms
 
 - **Prompt drift** is a phenomenon where the AI model starts to generate outputs that are not aligned with the original prompt. This can happen due to the model's training data, the prompt's wording, or the model's architecture.
 - **Pipeline, workflow or chain** is a sequence of tasks that are executed in a specific order. In the context of AI, a pipeline can refer to a sequence of AI models that are used to process data.
@@ -298,13 +290,9 @@ The following glossary is used to clarify certain concepts:
 - **Retrieval-augmented generation** is a machine learning paradigm where a model generates text by retrieving relevant information from a large database of text. This approach combines the benefits of generative models and retrieval models.
 - **Longtail** refers to non-common or rare events, items, or entities that are not well-represented in the training data of machine learning models. Longtail items are often challenging for models to predict accurately.
 
+_Note: This section is not complete dictionary, more list of general AI / LLM terms that has connection with Promptbook_
 
-
-_Note: Thos section is not complete dictionary, more list of general AI / LLM terms that has connection with Promptbook_
-
-
-
-#### 💯 Core concepts
+### 💯 Core concepts
 
 - [📚 Collection of pipelines](https://github.com/webgptorg/promptbook/discussions/65)
 - [📯 Pipeline](https://github.com/webgptorg/promptbook/discussions/64)
@@ -317,7 +305,7 @@ _Note: Thos section is not complete dictionary, more list of general AI / LLM te
 - [🔣 Words not tokens](https://github.com/webgptorg/promptbook/discussions/29)
 - [☯ Separation of concerns](https://github.com/webgptorg/promptbook/discussions/32)
 
-
+#### Advanced concepts
 
 - [📚 Knowledge (Retrieval-augmented generation)](https://github.com/webgptorg/promptbook/discussions/41)
 - [🌏 Remote server](https://github.com/webgptorg/promptbook/discussions/89)
@@ -334,17 +322,9 @@ _Note: Thos section is not complete dictionary, more list of general AI / LLM te
 
 
 
-
+## 🚂 Promptbook Engine
 
--
-- Application mode
-
-
-
-## 🔌 Usage in Typescript / Javascript
-
-- [Simple usage](./examples/usage/simple-script)
-- [Usage with client and remote server](./examples/usage/remote)
+![Schema of Promptbook Engine](./documents/diagrams/schema-of-promptbook-engine.svg)
 
 ## ➕➖ When to use Promptbook?
 
@@ -405,20 +385,18 @@ Promptbook project is under [BUSL 1.1 is an SPDX license](https://spdx.org/licen
 
 See [TODO.md](./TODO.md)
 
-
-
 ## 🤝 Partners
 
 <div style="display: flex; align-items: center; gap: 20px;">
 
 <a href="https://promptbook.studio/">
-<img src="./design/promptbook-studio-logo.png" alt="Partner 3" height="
+<img src="./design/promptbook-studio-logo.png" alt="Partner 3" height="70">
 </a>
 
 <a href="https://technologickainkubace.org/en/about-technology-incubation/about-the-project/">
-<img src="./other/partners/CI-Technology-Incubation.png" alt="Technology Incubation" height="
+<img src="./other/partners/CI-Technology-Incubation.png" alt="Technology Incubation" height="70">
 </a>
-
+
 </div>
 
 ## 🖋️ Contributing
package/esm/index.es.js
CHANGED
@@ -5,7 +5,7 @@ import hexEncoder from 'crypto-js/enc-hex';
 import { basename, join, dirname } from 'path';
 import { format } from 'prettier';
 import parserHtml from 'prettier/parser-html';
-import {
+import { Subject } from 'rxjs';
 import { randomBytes } from 'crypto';
 import { forTime } from 'waitasecond';
 import sha256 from 'crypto-js/sha256';
@@ -26,7 +26,7 @@ const BOOK_LANGUAGE_VERSION = '1.0.0';
  * @generated
  * @see https://github.com/webgptorg/promptbook
  */
-const PROMPTBOOK_ENGINE_VERSION = '0.
+const PROMPTBOOK_ENGINE_VERSION = '0.88.0-10';
 /**
  * TODO: string_promptbook_version should be constrained to the all versions of Promptbook engine
  * Note: [💞] Ignore a discrepancy between file name and entity name
@@ -2068,6 +2068,36 @@ function $randomToken(randomness) {
  * TODO: Maybe use nanoid instead https://github.com/ai/nanoid
  */
 
+/**
+ * Recursively converts JSON strings to JSON objects
+
+ * @public exported from `@promptbook/utils`
+ */
+function jsonStringsToJsons(object) {
+    if (object === null) {
+        return object;
+    }
+    if (Array.isArray(object)) {
+        return object.map(jsonStringsToJsons);
+    }
+    if (typeof object !== 'object') {
+        return object;
+    }
+    const newObject = { ...object };
+    for (const [key, value] of Object.entries(object)) {
+        if (typeof value === 'string' && isValidJsonString(value)) {
+            newObject[key] = JSON.parse(value);
+        }
+        else {
+            newObject[key] = jsonStringsToJsons(value);
+        }
+    }
+    return newObject;
+}
+/**
+ * TODO: Type the return type correctly
+ */
+
 /**
  * This error indicates problems parsing the format value
  *
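Reviewer note: the sketch below exercises the added `jsonStringsToJsons` on a small input to show what the recursion does and does not convert. The function body is copied from the hunk above; `isValidJsonString` is not shown in this diff, so the version here is a hypothetical stand-in based on `JSON.parse`, and the input field names are made up for illustration.

```javascript
// Hypothetical stand-in: the library's real `isValidJsonString` is not part of this diff
function isValidJsonString(value) {
    try {
        JSON.parse(value);
        return true;
    } catch (error) {
        return false;
    }
}

// Copied from the added hunk
function jsonStringsToJsons(object) {
    if (object === null) {
        return object;
    }
    if (Array.isArray(object)) {
        return object.map(jsonStringsToJsons);
    }
    if (typeof object !== 'object') {
        return object;
    }
    const newObject = { ...object };
    for (const [key, value] of Object.entries(object)) {
        if (typeof value === 'string' && isValidJsonString(value)) {
            newObject[key] = JSON.parse(value);
        } else {
            newObject[key] = jsonStringsToJsons(value);
        }
    }
    return newObject;
}

// Illustrative input; the keys are assumptions, not taken from the package
const input = {
    title: 'Hello',
    outputParameters: '{"answer":"42","tags":["a","b"]}',
    nested: { usage: '{"tokens":10}' },
    list: ['{"x":1}', { y: '[1,2]' }],
};

const output = jsonStringsToJsons(input);
console.log(JSON.stringify(output, null, 4));
```

Two behaviours follow directly from the code as written. Only string values of object properties are parsed, so a JSON string sitting directly inside an array (`list[0]` here) stays a string. The parsed value is not recursed into again, so `"42"` inside `outputParameters` stays a string. The input object itself is not mutated.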
@@ -2294,21 +2324,43 @@ function assertsTaskSuccessful(executionResult) {
 function createTask(options) {
     const { taskType, taskProcessCallback } = options;
     const taskId = `${taskType.toLowerCase().substring(0, 4)}-${$randomToken(8 /* <- TODO: To global config + Use Base58 to avoid simmilar char conflicts */)}`;
-
+    let status = 'RUNNING';
+    const createdAt = new Date();
+    let updatedAt = createdAt;
+    const errors = [];
+    const warnings = [];
+    let currentValue = {};
+    const partialResultSubject = new Subject();
+    // <- Note: Not using `BehaviorSubject` because on error we can't access the last value
     const finalResultPromise = /* not await */ taskProcessCallback((newOngoingResult) => {
+        Object.assign(currentValue, newOngoingResult);
+        // <- TODO: assign deep
         partialResultSubject.next(newOngoingResult);
     });
     finalResultPromise
         .catch((error) => {
+            errors.push(error);
             partialResultSubject.error(error);
         })
-        .then((
-            if (
+        .then((executionResult) => {
+            if (executionResult) {
                 try {
-
-
+                    updatedAt = new Date();
+                    errors.push(...executionResult.errors);
+                    warnings.push(...executionResult.warnings);
+                    // <- TODO: !!! Only unique errors and warnings should be added (or filtered)
+                    // TODO: [🧠] !!! errors, warning, isSuccessful are redundant both in `ExecutionTask` and `ExecutionTask.currentValue`
+                    //            Also maybe move `ExecutionTask.currentValue.usage` -> `ExecutionTask.usage`
+                    //            And delete `ExecutionTask.currentValue.preparedPipeline`
+                    assertsTaskSuccessful(executionResult);
+                    status = 'FINISHED';
+                    currentValue = jsonStringsToJsons(executionResult);
+                    // <- TODO: [🧠] Is this a good idea to convert JSON strins to JSONs?
+                    partialResultSubject.next(executionResult);
                 }
                 catch (error) {
+                    status = 'ERROR';
+                    errors.push(error);
                     partialResultSubject.error(error);
                 }
             }
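Reviewer note: the hunk above merges each partial result into `currentValue` with `Object.assign` and leaves a `TODO: assign deep`. The sketch below shows what that shallow merge means in practice. The partial results and their field names are hypothetical; only the `Object.assign` call mirrors the diff.

```javascript
// Hypothetical partial results; the field names are illustrative only
const partialResults = [
    { usage: { price: 1, tokens: 10 }, isSuccessful: false },
    { usage: { price: 2 } },
];

const currentValue = {};
for (const partialResult of partialResults) {
    // Same call as in the hunk: top-level keys are overwritten, nested objects are replaced rather than merged
    Object.assign(currentValue, partialResult);
}

console.log(currentValue);
```

After the second update, `usage` is the object from the second partial result, so `tokens` from the first one is gone, while the untouched top-level key `isSuccessful` survives. That is presumably the gap the `assign deep` TODO refers to.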
@@ -2325,12 +2377,33 @@ function createTask(options) {
     return {
         taskType,
         taskId,
+        get status() {
+            return status;
+            // <- Note: [1] Theese must be getters to allow changing the value in the future
+        },
+        get createdAt() {
+            return createdAt;
+            // <- Note: [1]
+        },
+        get updatedAt() {
+            return updatedAt;
+            // <- Note: [1]
+        },
         asPromise,
         asObservable() {
             return partialResultSubject.asObservable();
         },
+        get errors() {
+            return errors;
+            // <- Note: [1]
+        },
+        get warnings() {
+            return warnings;
+            // <- Note: [1]
+        },
         get currentValue() {
-            return
+            return currentValue;
+            // <- Note: [1]
         },
     };
 }
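Reviewer note: the `Note: [1]` comments say these fields must be getters so the value can change later. A minimal sketch of why, assuming a simplified task object: `createTaskSketch`, `statusSnapshot` and `finish` are hypothetical names used only here; in the real hunk the transition to `'FINISHED'` or `'ERROR'` happens inside the promise chain.

```javascript
// Hypothetical, simplified stand-in for `createTask`; not the package's API
function createTaskSketch() {
    let status = 'RUNNING';
    return {
        // Plain property: the value is copied once, when the object is created
        statusSnapshot: status,
        // Getter: reads the closure variable on every access
        get status() {
            return status;
        },
        // Stand-in for the promise chain that reassigns `status` in the real code
        finish() {
            status = 'FINISHED';
        },
    };
}

const task = createTaskSketch();
const before = task.status;
task.finish();
const after = task.status;

console.log(before, after, task.statusSnapshot);
```

The getter reflects the reassignment while the plain property keeps the value it had at creation. This matters most for `status`, `updatedAt` and `currentValue`, which the diff reassigns; `errors` and `warnings` are arrays mutated in place with `push`.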
@@ -3641,7 +3714,7 @@ function valueToString(value) {
  * @param script from which to extract the variables
  * @returns the list of variable names
  * @throws {ParseError} if the script is invalid
- * @public exported from `@promptbook/
+ * @public exported from `@promptbook/javascript`
  */
 function extractVariablesFromJavascript(script) {
     const variables = new Set();
@@ -4705,7 +4778,7 @@ async function executeAttempts(options) {
     Last result:
     ${block($ongoingTaskResult.$resultString === null
         ? 'null'
-        : $ongoingTaskResult.$resultString
+        : spaceTrim$1($ongoingTaskResult.$resultString)
             .split('\n')
             .map((line) => `> ${line}`)
             .join('\n'))}