soustack 0.1.3 → 0.2.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -21
- package/README.md +301 -203
- package/dist/cli/index.js +1763 -1365
- package/dist/cli/index.js.map +1 -1
- package/dist/index.d.mts +60 -135
- package/dist/index.d.ts +60 -135
- package/dist/index.js +1141 -1455
- package/dist/index.js.map +1 -1
- package/dist/index.mjs +1140 -1443
- package/dist/index.mjs.map +1 -1
- package/dist/scrape.d.mts +308 -0
- package/dist/scrape.d.ts +308 -0
- package/dist/scrape.js +819 -0
- package/dist/scrape.js.map +1 -0
- package/dist/scrape.mjs +814 -0
- package/dist/scrape.mjs.map +1 -0
- package/package.json +86 -75
- package/src/profiles/.gitkeep +0 -0
- package/src/profiles/base.schema.json +9 -0
- package/src/profiles/cookable.schema.json +18 -0
- package/src/profiles/illustrated.schema.json +48 -0
- package/src/profiles/quantified.schema.json +43 -0
- package/src/profiles/scalable.schema.json +75 -0
- package/src/profiles/schedulable.schema.json +43 -0
- package/src/schema.json +63 -24
- package/src/soustack.schema.json +344 -0
package/LICENSE
CHANGED
@@ -1,21 +1,21 @@
-MIT License
-
-Copyright (c) 2024 Richard Herold
-
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
+MIT License
+
+Copyright (c) 2024 Richard Herold
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
CHANGED
@@ -1,203 +1,301 @@
-# Soustack Core
-
-> **The Logic Engine for Computational Recipes.**
-
-[](https://www.npmjs.com/package/soustack)
-[](https://opensource.org/licenses/MIT)
-[](https://www.typescriptlang.org/)
-
-**Soustack Core** is the reference implementation for the [Soustack Standard](https://github.com/soustack
-
----
-
-## 💡 The Value Proposition
-
-Most recipe formats (like Schema.org) are **descriptive**—they tell you _what_ a recipe is.
-Soustack is **computational**—it understands _how_ a recipe behaves.
-
-### The Problems We Solve:
-
-1. **The "Salty Soup" Problem (Intelligent Scaling):**
-   - _Old Way:_ Doubling a recipe doubles every ingredient blindly.
-   - _Soustack:_ Understands that salt scales differently than flour, and frying oil shouldn't scale at all. It supports **Linear**, **Fixed**, **Discrete**, and **Baker's Percentage** scaling modes.
-2. **The "Lying Prep Time" Problem:**
-   - _Old Way:_ Authors guess "Prep: 15 mins."
-   - _Soustack:_ Calculates total time dynamically based on the active/passive duration of every step.
-3. **The "Timing Clash" Problem:**
-   - _Old Way:_ A flat list of instructions.
-   - _Soustack:_ A **Dependency Graph** that knows you can chop vegetables while the water boils.
-
----
-
-## 📦 Installation
-
-```bash
-npm install soustack
-```
-
-## What
-
-- **Validation**: `validateRecipe()` validates Soustack JSON against the bundled schema.
-- **Scaling & Computation**: `scaleRecipe()`
-- **
--
--
-
--
-- `
[old lines 48-79: content not preserved in this capture]
-const
[old lines 81-135: content not preserved in this capture]
-//
-const
[old lines 138-155: content not preserved in this capture]
-if (
[old lines 157-161: content not preserved in this capture]
-}
[old lines 163-167: content not preserved in this capture]
--
[old lines 169-183: content not preserved in this capture]
-```
[old lines 185-190: content not preserved in this capture]
-**
[old lines 192-203: content not preserved in this capture]
+# Soustack Core
+
+> **The Logic Engine for Computational Recipes.**
+
+[](https://www.npmjs.com/package/soustack)
+[](https://opensource.org/licenses/MIT)
+[](https://www.typescriptlang.org/)
+
+**Soustack Core** is the reference implementation for the [Soustack Standard](https://github.com/RichardHerold/soustack-spec). It provides the validation, parsing, and scaling logic required to turn static recipe data into dynamic, computable objects.
+
+---
+
+## 💡 The Value Proposition
+
+Most recipe formats (like Schema.org) are **descriptive**—they tell you _what_ a recipe is.
+Soustack is **computational**—it understands _how_ a recipe behaves.
+
+### The Problems We Solve:
+
+1. **The "Salty Soup" Problem (Intelligent Scaling):**
+   - _Old Way:_ Doubling a recipe doubles every ingredient blindly.
+   - _Soustack:_ Understands that salt scales differently than flour, and frying oil shouldn't scale at all. It supports **Linear**, **Fixed**, **Discrete**, **Proportional**, and **Baker's Percentage** scaling modes.
+2. **The "Lying Prep Time" Problem:**
+   - _Old Way:_ Authors guess "Prep: 15 mins."
+   - _Soustack:_ Calculates total time dynamically based on the active/passive duration of every step.
+3. **The "Timing Clash" Problem:**
+   - _Old Way:_ A flat list of instructions.
+   - _Soustack:_ A **Dependency Graph** that knows you can chop vegetables while the water boils.
+
+---
+
+## 📦 Installation
+
+```bash
+npm install soustack
+```
+
+## What's Included
+
+- **Validation**: `validateRecipe()` validates Soustack JSON against the bundled schema.
+- **Scaling & Computation**: `scaleRecipe()` scales a recipe while honoring per-ingredient scaling rules and instruction timing.
+- **Schema.org Conversion**:
+  - `fromSchemaOrg()` (Schema.org JSON-LD → Soustack)
+  - `toSchemaOrg()` (Soustack → Schema.org JSON-LD)
+- **Web Extraction**:
+  - Browser-safe HTML parsing: `extractSchemaOrgRecipeFromHTML()` (convert to Soustack with `fromSchemaOrg()`)
+  - Node-only scraping entrypoint: `scrapeRecipe()` and helpers via `import { ... } from 'soustack/scrape'`
+
+## 🚀 Quickstart
+
+Validate and scale a recipe in just a few lines:
+
+```ts
+import { validateRecipe, scaleRecipe } from 'soustack';
+
+// Validate against the bundled Soustack schema
+const { valid, errors, warnings } = validateRecipe(recipe);
+if (!valid) {
+  throw new Error(JSON.stringify(errors, null, 2));
+}
+if (warnings?.length) {
+  console.warn('Non-blocking warnings', warnings);
+}
+
+// Scale to a new yield (multiplier, target yield, or servings)
+const scaled = scaleRecipe(recipe, { multiplier: 2 });
+```
+
+### Profile-aware validation
+
+Use profiles to enforce integration contracts (e.g., **Block** vs **Integrator** payloads).
+
+```ts
+import { detectProfiles, validateRecipe } from 'soustack';
+
+// Discover which profiles a recipe already satisfies
+const profiles = detectProfiles(recipe); // e.g. ['block']
+
+// Validate while requiring specific profiles
+const result = validateRecipe(recipe, { profiles: ['block', 'integrator'] });
+if (!result.valid) {
+  console.error('Profile validation failed', result.errors);
+}
+```
+
+### Browser-safe vs. Node-only entrypoints
+
+- **Browser-safe:** `import { extractSchemaOrgRecipeFromHTML, fromSchemaOrg, validateRecipe, scaleRecipe } from 'soustack';`
+  - Ships without Node fetch/cheerio dependencies.
+- **Node-only scraping:** `import { scrapeRecipe, extractRecipeFromHTML, extractSchemaOrgRecipeFromHTML } from 'soustack/scrape';`
+  - Includes HTTP fetching, retries, and cheerio-based parsing for server environments.
+
+## Spec compatibility & bundled schemas
+
+- Targets Soustack spec **v0.2.1** (`spec/SOUSTACK_SPEC_VERSION`, exported as `SOUSTACK_SPEC_VERSION`).
+- Ships the base schema plus profile schemas in `spec/` and mirrors them into `src/` for consumers.
+- Vendored fixtures live in `spec/fixtures` so tests can run offline, and version drift can be checked via `npm run validate:version`.
+
|
+
## Programmatic Usage
|
|
100
|
+
|
|
101
|
+
```ts
|
|
102
|
+
import {
|
|
103
|
+
extractSchemaOrgRecipeFromHTML,
|
|
104
|
+
fromSchemaOrg,
|
|
105
|
+
toSchemaOrg,
|
|
106
|
+
validateRecipe,
|
|
107
|
+
scaleRecipe,
|
|
108
|
+
} from 'soustack';
|
|
109
|
+
import {
|
|
110
|
+
scrapeRecipe,
|
|
111
|
+
extractRecipeFromHTML,
|
|
112
|
+
extractSchemaOrgRecipeFromHTML as extractSchemaOrgRecipeFromHTMLNode,
|
|
113
|
+
} from 'soustack/scrape';
|
|
114
|
+
|
|
115
|
+
// Validate a Soustack recipe JSON object with profile enforcement
|
|
116
|
+
const validation = validateRecipe(recipe, { profiles: ['block'] });
|
|
117
|
+
if (!validation.valid) {
|
|
118
|
+
console.error(validation.errors);
|
|
119
|
+
}
|
|
120
|
+
|
|
121
|
+
// Scale a recipe to a target yield amount (returns a "computed recipe")
|
|
122
|
+
const scaled = scaleRecipe(recipe, { multiplier: 2 });
|
|
123
|
+
|
|
124
|
+
// Scrape a URL into a Soustack recipe (Node.js only, throws if no recipe is found)
|
|
125
|
+
const scraped = await scrapeRecipe('https://example.com/recipe');
|
|
126
|
+
|
|
127
|
+
// Browser: fetch your own HTML, then parse and convert
|
|
128
|
+
const html = await fetch('https://example.com/recipe').then((r) => r.text());
|
|
129
|
+
const schemaOrgRecipe = extractSchemaOrgRecipeFromHTML(html);
|
|
130
|
+
const recipe = schemaOrgRecipe ? fromSchemaOrg(schemaOrgRecipe) : null;
|
|
131
|
+
|
|
132
|
+
// Node: parse raw HTML with cheerio-powered extractor
|
|
133
|
+
const nodeSchemaOrg = extractSchemaOrgRecipeFromHTMLNode(html);
|
|
134
|
+
const nodeRecipe = extractRecipeFromHTML(html);
|
|
135
|
+
|
|
136
|
+
// Convert Schema.org → Soustack
|
|
137
|
+
const soustack = fromSchemaOrg(schemaOrgJsonLd);
|
|
138
|
+
|
|
139
|
+
// Convert Soustack → Schema.org
|
|
140
|
+
const jsonLd = toSchemaOrg(recipe);
|
|
141
|
+
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
## 🪶 Core-lite (browser) Schema.org conversion
|
|
145
|
+
|
|
146
|
+
Need to stay browser-only? Import the core bundle (no `fetch`, no cheerio) and perform Schema.org extraction and conversion entirely client-side:
|
|
147
|
+
|
|
148
|
+
```ts
|
|
149
|
+
import { extractSchemaOrgRecipeFromHTML, fromSchemaOrg, toSchemaOrg } from 'soustack';
|
|
150
|
+
|
|
151
|
+
async function convert(url: string) {
|
|
152
|
+
const html = await fetch(url).then((r) => r.text());
|
|
153
|
+
|
|
154
|
+
// Pure DOMParser-based extraction (works in modern browsers)
|
|
155
|
+
const schemaOrg = extractSchemaOrgRecipeFromHTML(html);
|
|
156
|
+
if (!schemaOrg) throw new Error('No Schema.org recipe found');
|
|
157
|
+
|
|
158
|
+
// Convert to Soustack and back to Schema.org JSON-LD if needed
|
|
159
|
+
const soustackRecipe = fromSchemaOrg(schemaOrg);
|
|
160
|
+
const jsonLd = toSchemaOrg(soustackRecipe);
|
|
161
|
+
|
|
162
|
+
return { soustackRecipe, jsonLd };
|
|
163
|
+
}
|
|
164
|
+
```
|
|
165
|
+
|
|
166
|
+
## 🔁 Schema.org Conversion
|
|
167
|
+
|
|
168
|
+
Use the helpers to move between Schema.org JSON-LD and Soustack's structured recipe format. The conversion automatically handles image normalization, supporting multiple image formats from Schema.org.
|
|
169
|
+
|
|
170
|
+
```ts
|
|
171
|
+
import { fromSchemaOrg, toSchemaOrg, normalizeImage } from 'soustack';
|
|
172
|
+
|
|
173
|
+
// Convert Schema.org → Soustack (automatically normalizes images)
|
|
174
|
+
const soustackRecipe = fromSchemaOrg(schemaOrgJsonLd);
|
|
175
|
+
// Recipe images: string | string[] | undefined
|
|
176
|
+
// Instruction images: optional image URL per step
|
|
177
|
+
|
|
178
|
+
// Convert Soustack → Schema.org (preserves images)
|
|
179
|
+
const schemaOrgRecipe = toSchemaOrg(soustackRecipe);
|
|
180
|
+
|
|
181
|
+
// Manual image normalization (if needed)
|
|
182
|
+
const normalized = normalizeImage(schemaOrgImage);
|
|
183
|
+
// Handles: strings, arrays, ImageObjects with url/contentUrl
|
|
184
|
+
```
|
|
185
|
+
|
|
186
|
+
### Image Format Support
|
|
187
|
+
|
|
188
|
+
Soustack supports flexible image formats:
|
|
189
|
+
|
|
190
|
+
- **Recipe-level images**: Single URL (`string`) or multiple URLs (`string[]`)
|
|
191
|
+
- **Instruction-level images**: Optional `image` property on instruction objects
|
|
192
|
+
- **Automatic normalization**: Schema.org ImageObjects are automatically converted to URLs during import
|
|
193
|
+
|
|
194
|
+
Example recipe with images:
|
|
195
|
+
|
|
196
|
+
```ts
|
|
197
|
+
const recipe = {
|
|
198
|
+
name: "Chocolate Cake",
|
|
199
|
+
image: ["https://example.com/hero.jpg", "https://example.com/gallery.jpg"],
|
|
200
|
+
instructions: [
|
|
201
|
+
"Mix dry ingredients",
|
|
202
|
+
{ text: "Decorate the cake", image: "https://example.com/decorate.jpg" },
|
|
203
|
+
"Serve"
|
|
204
|
+
]
|
|
205
|
+
};
|
|
206
|
+
```
|
|
207
|
+
|
|
+## 🧰 Web Scraping
+
+### Node.js: `scrapeRecipe()`
+
+`scrapeRecipe(url, options)` fetches a recipe page and extracts Schema.org data. **Node.js only** due to CORS restrictions.
+
+Options:
+
+- `timeout` (ms, default `10000`)
+- `userAgent` (string, optional)
+- `maxRetries` (default `2`, retries on non-4xx failures)
+
+```ts
+import { scrapeRecipe } from 'soustack';
+
+const recipe = await scrapeRecipe('https://example.com/recipe', {
+  timeout: 15000,
+  maxRetries: 3,
+});
+```
+
+### Browser: `extractSchemaOrgRecipeFromHTML()`
+
+`extractSchemaOrgRecipeFromHTML(html)` extracts the raw Schema.org recipe data from HTML. Returns `null` if no recipe is found. Use this when you need to inspect, debug, or convert Schema.org data in browser builds without dragging in Node dependencies.
+
+```ts
+import { extractSchemaOrgRecipeFromHTML, fromSchemaOrg } from 'soustack';
+
+// In browser: fetch HTML yourself
+const response = await fetch('https://example.com/recipe');
+const html = await response.text();
+
+// Extract Schema.org format (for inspection/modification)
+const schemaOrgRecipe = extractSchemaOrgRecipeFromHTML(html);
+
+if (schemaOrgRecipe) {
+  // Inspect or modify Schema.org data before converting
+  console.log('Found recipe:', schemaOrgRecipe.name);
+
+  // Convert to Soustack format when ready
+  const soustackRecipe = fromSchemaOrg(schemaOrgRecipe);
+}
+```
+
+### Node-only scraping: `soustack/scrape`
+
+For server-side scraping with built-in fetching and cheerio-based parsing, use the dedicated entrypoint:
+
+```ts
+import { scrapeRecipe, extractRecipeFromHTML, fetchPage } from 'soustack/scrape';
+
+// Fetch and parse a URL directly
+const recipe = await scrapeRecipe('https://example.com/recipe');
+
+// Or work with already-downloaded HTML
+const html = await fetchPage('https://example.com/recipe');
+const parsed = extractRecipeFromHTML(html);
+```
+
+### CLI
+
+```bash
+# Validate with profiles (JSON output for pipelines)
+npx soustack validate recipe.soustack.json --profile block --strict --json
+
+# Repo-wide test run (validates every *.soustack.json)
+npx soustack test --profile block
+
+# Convert Schema.org ↔ Soustack
+npx soustack convert --from schemaorg --to soustack recipe.jsonld -o recipe.soustack.json
+npx soustack convert --from soustack --to schemaorg recipe.soustack.json -o recipe.jsonld
+
+# Import (scrape) or scale from the CLI
+npx soustack import --url "https://example.com/recipe" -o recipe.soustack.json
+npx soustack scale recipe.soustack.json 2
+```
+
+## 🔄 Keeping the Schema in Sync
+
+The schema files in this repository are **copies** of the official standard. The source of truth lives in [RichardHerold/soustack-spec](https://github.com/RichardHerold/soustack-spec).
+
+**Do not edit any synced schema artifacts manually** (`src/schema.json`, `src/soustack.schema.json`, `src/profiles/*.schema.json`).
+
+To update to the latest tagged version of the standard, run:
+
+```bash
+npm run sync:spec
+```
+
+## Development
+
+```bash
+npm test
+```
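
The added README text says image normalization "Handles: strings, arrays, ImageObjects with url/contentUrl" and that recipe images end up as `string | string[] | undefined`. The sketch below illustrates one way such a normalizer might behave. It is not the package's implementation: the name `normalizeImageSketch`, the `ImageObjectLike` type, and the choice to prefer `url` over `contentUrl` are all assumptions made for illustration.

```ts
// Hypothetical input shapes for illustration; not the package's actual types.
type ImageObjectLike = { url?: string; contentUrl?: string };
type SchemaOrgImageLike =
  | string
  | ImageObjectLike
  | Array<string | ImageObjectLike>;

// Sketch of a normalizer: returns one URL, a list of URLs, or undefined.
function normalizeImageSketch(
  image: SchemaOrgImageLike | null | undefined,
): string | string[] | undefined {
  if (image == null) return undefined;

  // Reduce a single entry to a URL string, if it carries one.
  const toUrl = (item: string | ImageObjectLike): string | undefined => {
    if (typeof item === 'string') return item.trim() || undefined;
    // Assumed precedence: `url` first, then `contentUrl`.
    return item.url || item.contentUrl || undefined;
  };

  // Treat a single value as a one-element list so both cases share a path.
  const items = Array.isArray(image) ? image : [image];
  const urls = items
    .map(toUrl)
    .filter((u): u is string => typeof u === 'string' && u.length > 0);

  if (urls.length === 0) return undefined;
  return urls.length === 1 ? urls[0] : urls;
}
```

Under these assumptions a single usable entry collapses to a plain string, several entries stay an array, and entries without a URL are dropped. The real `normalizeImage` export may differ on any of these points.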