soustack 0.2.1 → 0.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -21
- package/README.md +394 -244
- package/dist/cli/index.js +1672 -1387
- package/dist/cli/index.js.map +1 -1
- package/dist/index.d.mts +159 -151
- package/dist/index.d.ts +159 -151
- package/dist/index.js +1731 -1641
- package/dist/index.js.map +1 -1
- package/dist/index.mjs +1725 -1628
- package/dist/index.mjs.map +1 -1
- package/dist/scrape.d.mts +334 -0
- package/dist/scrape.d.ts +334 -0
- package/dist/scrape.js +921 -0
- package/dist/scrape.js.map +1 -0
- package/dist/scrape.mjs +916 -0
- package/dist/scrape.mjs.map +1 -0
- package/package.json +89 -75
- package/src/profiles/.gitkeep +0 -0
- package/src/profiles/base.schema.json +9 -0
- package/src/profiles/cookable.schema.json +18 -0
- package/src/profiles/illustrated.schema.json +48 -0
- package/src/profiles/quantified.schema.json +43 -0
- package/src/profiles/scalable.schema.json +75 -0
- package/src/profiles/schedulable.schema.json +43 -0
- package/src/schema.json +56 -23
- package/src/soustack.schema.json +356 -0
package/README.md
CHANGED
|
@@ -1,244 +1,394 @@
|
|
|
1
|
-
# Soustack Core
|
|
2
|
-
|
|
3
|
-
> **The Logic Engine for Computational Recipes.**
|
|
4
|
-
|
|
5
|
-
[npm package](https://www.npmjs.com/package/soustack)
|
|
6
|
-
[MIT License](https://opensource.org/licenses/MIT)
|
|
7
|
-
[TypeScript](https://www.typescriptlang.org/)
|
|
8
|
-
|
|
9
|
-
**Soustack Core** is the reference implementation for the [Soustack Standard](https://github.com/soustack
|
|
10
|
-
|
|
11
|
-
---
|
|
12
|
-
|
|
13
|
-
## 💡 The Value Proposition
|
|
14
|
-
|
|
15
|
-
Most recipe formats (like Schema.org) are **descriptive**: they tell you _what_ a recipe is.
|
|
16
|
-
Soustack is **computational**: it understands _how_ a recipe behaves.
|
|
17
|
-
|
|
18
|
-
### The Problems We Solve:
|
|
19
|
-
|
|
20
|
-
1. **The "Salty Soup" Problem (Intelligent Scaling):**
|
|
21
|
-
- _Old Way:_ Doubling a recipe doubles every ingredient blindly.
|
|
22
|
-
- _Soustack:_ Understands that salt scales differently than flour, and frying oil shouldn't scale at all. It supports **Linear**, **Fixed**, **Discrete**, and **Baker's Percentage** scaling modes.
|
|
23
|
-
2. **The "Lying Prep Time" Problem:**
|
|
24
|
-
- _Old Way:_ Authors guess "Prep: 15 mins."
|
|
25
|
-
- _Soustack:_ Calculates total time dynamically based on the active/passive duration of every step.
|
|
26
|
-
3. **The "Timing Clash" Problem:**
|
|
27
|
-
- _Old Way:_ A flat list of instructions.
|
|
28
|
-
- _Soustack:_ A **Dependency Graph** that knows you can chop vegetables while the water boils.
|
|
29
|
-
|
|
30
|
-
---
|
|
31
|
-
|
|
32
|
-
## 📦 Installation
|
|
33
|
-
|
|
34
|
-
```bash
|
|
35
|
-
npm install soustack
|
|
36
|
-
```
|
|
37
|
-
|
|
38
|
-
## What's Included
|
|
39
|
-
|
|
40
|
-
- **Validation**: `validateRecipe()` validates Soustack JSON against the bundled schema.
|
|
41
|
-
- **Scaling & Computation**: `scaleRecipe()`
|
|
1
|
+
# Soustack Core
|
|
2
|
+
|
|
3
|
+
> **The Logic Engine for Computational Recipes.**
|
|
4
|
+
|
|
5
|
+
[npm package](https://www.npmjs.com/package/soustack)
|
|
6
|
+
[MIT License](https://opensource.org/licenses/MIT)
|
|
7
|
+
[TypeScript](https://www.typescriptlang.org/)
|
|
8
|
+
|
|
9
|
+
**Soustack Core** is the reference implementation for the [Soustack Standard](https://github.com/RichardHerold/soustack-spec). It provides the validation, parsing, and scaling logic required to turn static recipe data into dynamic, computable objects.
|
|
10
|
+
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
## 💡 The Value Proposition
|
|
14
|
+
|
|
15
|
+
Most recipe formats (like Schema.org) are **descriptive**: they tell you _what_ a recipe is.
|
|
16
|
+
Soustack is **computational**: it understands _how_ a recipe behaves.
|
|
17
|
+
|
|
18
|
+
### The Problems We Solve:
|
|
19
|
+
|
|
20
|
+
1. **The "Salty Soup" Problem (Intelligent Scaling):**
|
|
21
|
+
- _Old Way:_ Doubling a recipe doubles every ingredient blindly.
|
|
22
|
+
- _Soustack:_ Understands that salt scales differently than flour, and frying oil shouldn't scale at all. It supports **Linear**, **Fixed**, **Discrete**, **Proportional**, and **Baker's Percentage** scaling modes.
|
|
23
|
+
2. **The "Lying Prep Time" Problem:**
|
|
24
|
+
- _Old Way:_ Authors guess "Prep: 15 mins."
|
|
25
|
+
- _Soustack:_ Calculates total time dynamically based on the active/passive duration of every step.
|
|
26
|
+
3. **The "Timing Clash" Problem:**
|
|
27
|
+
- _Old Way:_ A flat list of instructions.
|
|
28
|
+
- _Soustack:_ A **Dependency Graph** that knows you can chop vegetables while the water boils.
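
The scaling behaviour described in item 1 can be pictured with a small sketch. This is not Soustack's implementation: the `ScalingRule` type, the `scaleQuantity` helper, and the choice to round discrete items up are assumptions made for illustration, and only three of the listed modes are shown.

```ts
// Hypothetical types for illustration only; not the Soustack API.
type ScalingRule =
  | { mode: 'linear' }    // scales with the multiplier (e.g. flour)
  | { mode: 'fixed' }     // never scales (e.g. frying oil)
  | { mode: 'discrete' }; // whole units only (e.g. eggs)

function scaleQuantity(quantity: number, multiplier: number, rule: ScalingRule): number {
  switch (rule.mode) {
    case 'linear':
      return quantity * multiplier;
    case 'fixed':
      return quantity;
    case 'discrete':
      // Assumption: round up so the scaled recipe is never short an item.
      return Math.ceil(quantity * multiplier);
  }
}

const flourGrams = scaleQuantity(250, 2, { mode: 'linear' }); // 500
const oilMl = scaleQuantity(500, 2, { mode: 'fixed' });       // 500
const eggs = scaleQuantity(3, 1.5, { mode: 'discrete' });     // 5
```

Keeping the rule on each ingredient, rather than on the recipe, is what lets one multiplier produce different results per line item.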
|
|
29
|
+
|
|
30
|
+
---
|
|
31
|
+
|
|
32
|
+
## 📦 Installation
|
|
33
|
+
|
|
34
|
+
```bash
|
|
35
|
+
npm install soustack
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
## What's Included
|
|
39
|
+
|
|
40
|
+
- **Validation**: `validateRecipe()` validates Soustack JSON against the bundled schema.
|
|
41
|
+
- **Scaling & Computation**: `scaleRecipe()` scales a recipe while honoring per-ingredient scaling rules and instruction timing.
|
|
42
|
+
- **Schema.org Conversion**:
|
|
43
|
+
- `fromSchemaOrg()` (Schema.org JSON-LD → Soustack)
|
|
44
|
+
- `toSchemaOrg()` (Soustack → Schema.org JSON-LD)
|
|
45
|
+
- **Web Extraction**:
|
|
46
|
+
- Browser-safe HTML parsing: `extractSchemaOrgRecipeFromHTML()` (convert to Soustack with `fromSchemaOrg()`)
|
|
47
|
+
- Node-only scraping entrypoint: `scrapeRecipe()` and helpers via `import { ... } from 'soustack/scrape'`
|
|
48
|
+
- **Unit Conversion**: `convertLineItemToMetric()` converts ingredient line items from imperial volumes/masses into metric with deterministic rounding and ingredient-aware equivalencies.
|
|
49
|
+
|
|
50
|
+
## Quickstart
|
|
51
|
+
|
|
52
|
+
Validate and scale a recipe in just a few lines:
|
|
53
|
+
|
|
54
|
+
```ts
|
|
55
|
+
import { validateRecipe, scaleRecipe } from 'soustack';
|
|
56
|
+
|
|
57
|
+
// Validate against the bundled Soustack schema
|
|
58
|
+
const { valid, errors, warnings } = validateRecipe(recipe);
|
|
59
|
+
if (!valid) {
|
|
60
|
+
throw new Error(JSON.stringify(errors, null, 2));
|
|
61
|
+
}
|
|
62
|
+
if (warnings?.length) {
|
|
63
|
+
console.warn('Non-blocking warnings', warnings);
|
|
64
|
+
}
|
|
65
|
+
|
|
66
|
+
// Scale to a new yield (multiplier, target yield, or servings)
|
|
67
|
+
const scaled = scaleRecipe(recipe, { multiplier: 2 });
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
### Profile-aware validation
|
|
71
|
+
|
|
72
|
+
Use profiles to enforce integration contracts. Available profiles:
|
|
73
|
+
- **minimal**: Basic recipe structure with minimal requirements
|
|
74
|
+
- **core**: Enhanced profile with structured ingredients and instructions
|
|
75
|
+
|
|
76
|
+
```ts
|
|
77
|
+
import { detectProfiles, validateRecipe } from 'soustack';
|
|
78
|
+
|
|
79
|
+
// Discover which profiles a recipe already satisfies
|
|
80
|
+
const profiles = detectProfiles(recipe); // e.g. ['minimal', 'core']
|
|
81
|
+
|
|
82
|
+
// Validate with a specific profile (defaults to 'core' if not specified)
|
|
83
|
+
const result = validateRecipe(recipe, { profile: 'minimal' });
|
|
84
|
+
if (!result.valid) {
|
|
85
|
+
console.error('Profile validation failed', result.errors);
|
|
86
|
+
}
|
|
87
|
+
|
|
88
|
+
// Validate with modules
|
|
89
|
+
const recipeWithModules = {
|
|
90
|
+
profile: 'minimal',
|
|
91
|
+
modules: ['nutrition@1', 'times@1'],
|
|
92
|
+
name: 'Test Recipe',
|
|
93
|
+
ingredients: ['1 cup flour'],
|
|
94
|
+
instructions: ['Mix'],
|
|
95
|
+
nutrition: { calories: 100, protein_g: 5 }, // Module payload required if declared
|
|
96
|
+
times: { prepMinutes: 10, cookMinutes: 20, totalMinutes: 30 }, // v0.3: uses *Minutes fields
|
|
97
|
+
};
|
|
98
|
+
const result2 = validateRecipe(recipeWithModules);
|
|
99
|
+
// Validates using: base + minimal profile + nutrition@1 module + times@1 module
|
|
100
|
+
// Module contract: if module is declared, payload must exist (and vice versa)
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
### Imperial → metric ingredient conversion
|
|
104
|
+
|
|
105
|
+
```ts
|
|
106
|
+
import { convertLineItemToMetric } from 'soustack';
|
|
107
|
+
|
|
108
|
+
const flour = convertLineItemToMetric(
|
|
109
|
+
{ ingredient: 'flour', quantity: 2, unit: 'cup' },
|
|
110
|
+
'mass'
|
|
111
|
+
);
|
|
112
|
+
// -> { ingredient: 'flour', quantity: 240, unit: 'g', notes: 'Converted using 120g per cup...' }
|
|
113
|
+
|
|
114
|
+
const liquid = convertLineItemToMetric(
|
|
115
|
+
{ ingredient: 'milk', quantity: 2, unit: 'cup' },
|
|
116
|
+
'volume'
|
|
117
|
+
);
|
|
118
|
+
// -> { ingredient: 'milk', quantity: 473, unit: 'ml' }
|
|
119
|
+
```
|
|
120
|
+
|
|
121
|
+
The converter rounds using "sane" defaults (1 g/ml under 1 kg/1 L, then 5 g/10 ml and 2 decimal places for kg/L) and surfaces typed errors:
|
|
122
|
+
|
|
123
|
+
- `UnknownUnitError` for unsupported unit tokens
|
|
124
|
+
- `UnsupportedConversionError` if you request a mismatched dimension
|
|
125
|
+
- `MissingEquivalencyError` when no volume-to-mass density is registered for the ingredient/unit combo
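
The rounding defaults mentioned above admit more than one reading. The sketch below shows one plausible interpretation, not the library's actual code: `roundMetric` and `roundLargeUnit` are hypothetical helpers, and the 1000 g / 1000 ml threshold is an assumption taken from "under 1 kg/1 L".

```ts
// One possible reading of the rounding rule; helper names and thresholds are assumptions.
function roundMetric(value: number, unit: 'g' | 'ml'): number {
  if (value < 1000) {
    // Under 1 kg / 1 L: nearest whole gram or millilitre.
    return Math.round(value);
  }
  // At or above 1 kg / 1 L: snap to the nearest 5 g or 10 ml.
  const step = unit === 'g' ? 5 : 10;
  return Math.round(value / step) * step;
}

// Quantities already expressed in kg or L keep two decimal places.
function roundLargeUnit(value: number): number {
  return Math.round(value * 100) / 100;
}

const milkMl = roundMetric(473.176, 'ml');  // 473
const flourG = roundMetric(1362.4, 'g');    // 1360
const flourKg = roundLargeUnit(1.3624);     // 1.36
```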
|
|
126
|
+
|
|
127
|
+
### Browser-safe vs. Node-only entrypoints
|
|
128
|
+
|
|
129
|
+
- **Browser-safe:** `import { extractSchemaOrgRecipeFromHTML, fromSchemaOrg, validateRecipe, scaleRecipe } from 'soustack';`
|
|
130
|
+
- Ships without Node fetch/cheerio dependencies.
|
|
131
|
+
- **Node-only scraping:** `import { scrapeRecipe, extractRecipeFromHTML, extractSchemaOrgRecipeFromHTML } from 'soustack/scrape';`
|
|
132
|
+
- Includes HTTP fetching, retries, and cheerio-based parsing for server environments.
|
|
133
|
+
|
|
134
|
+
## Spec compatibility & bundled schemas
|
|
135
|
+
|
|
136
|
+
- Targets Soustack spec **v0.3.0** (`spec/SOUSTACK_SPEC_VERSION`, exported as `SOUSTACK_SPEC_VERSION`).
|
|
137
|
+
- Ships the base schema, profile schemas, and module schemas in `spec/schemas/recipe/` and mirrors them into `src/schemas/recipe/` for consumers.
|
|
138
|
+
- Vendored fixtures live in `spec/fixtures` so tests can run offline, and version drift can be checked via `npm run validate:version`.
|
|
139
|
+
|
|
140
|
+
### Composed Validation Model
|
|
141
|
+
|
|
142
|
+
Soustack v0.3.0 uses a **composed validation model** where recipes are validated using JSON Schema's `allOf` composition:
|
|
143
|
+
|
|
144
|
+
```json
|
|
145
|
+
{
|
|
146
|
+
"allOf": [
|
|
147
|
+
{ "$ref": "base.schema.json" },
|
|
148
|
+
{ "$ref": "profiles/{profile}.schema.json" },
|
|
149
|
+
{ "$ref": "modules/{module1}/{version}.schema.json" },
|
|
150
|
+
{ "$ref": "modules/{module2}/{version}.schema.json" }
|
|
151
|
+
]
|
|
152
|
+
}
|
|
153
|
+
```
|
|
154
|
+
|
|
155
|
+
The validator composes three layers:
|
|
156
|
+
- **Base schema**: Defines the core recipe structure (`@type`, `name`, `ingredients`, `instructions`, `profile`, `modules`)
|
|
157
|
+
- **Profile overlay**: Adds profile-specific requirements (e.g., `minimal` or `core`)
|
|
158
|
+
- **Module overlays**: Each declared module adds its own validation rules
|
|
159
|
+
|
|
160
|
+
**Defaults:**
|
|
161
|
+
- If `profile` is missing, it defaults to `"core"`
|
|
162
|
+
- If `modules` is missing, it defaults to `[]`
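
As a rough illustration of the composition and defaults described above, the sketch below builds the `allOf` list for a recipe. `composeSchema` is a hypothetical helper, not an export of the package, and the relative `$ref` paths simply mirror the JSON example.

```ts
interface ComposedSchema {
  allOf: Array<{ $ref: string }>;
}

// Hypothetical helper: applies the documented defaults, then lists base, profile, and module schemas.
function composeSchema(recipe: { profile?: string; modules?: string[] }): ComposedSchema {
  const profile = recipe.profile ?? 'core'; // documented default
  const modules = recipe.modules ?? [];     // documented default
  const moduleRefs = modules.map((id) => {
    const [moduleName, version] = id.split('@');
    return { $ref: `modules/${moduleName}/${version}.schema.json` };
  });
  return {
    allOf: [
      { $ref: 'base.schema.json' },
      { $ref: `profiles/${profile}.schema.json` },
      ...moduleRefs,
    ],
  };
}

const composed = composeSchema({ profile: 'minimal', modules: ['nutrition@1', 'times@1'] });
// composed.allOf has four entries: base, minimal profile, nutrition/1, times/1
```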
|
|
163
|
+
|
|
164
|
+
**Module Contract:** Modules enforce a symmetric contract:
|
|
165
|
+
- If a module is declared in `modules`, the corresponding payload must exist
|
|
166
|
+
- If a payload exists (e.g., `nutrition`, `times`), the module must be declared
|
|
167
|
+
- The validator automatically infers modules from payloads and enforces this contract
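
The symmetric contract above can be sketched as a check in both directions. The `PAYLOAD_MODULES` mapping and `checkModuleContract` function are assumptions for illustration; the real validator may report violations differently.

```ts
// Hypothetical mapping from payload key to module identifier (two modules shown).
const PAYLOAD_MODULES: Record<string, string> = {
  nutrition: 'nutrition@1',
  times: 'times@1',
};

function checkModuleContract(recipe: Record<string, unknown>): string[] {
  const declared = new Set<string>(
    Array.isArray(recipe.modules) ? (recipe.modules as string[]) : []
  );
  const problems: string[] = [];
  for (const [payloadKey, moduleId] of Object.entries(PAYLOAD_MODULES)) {
    const hasPayload = recipe[payloadKey] !== undefined;
    const isDeclared = declared.has(moduleId);
    if (isDeclared && !hasPayload) {
      problems.push(`${moduleId} is declared but "${payloadKey}" is missing`);
    }
    if (hasPayload && !isDeclared) {
      problems.push(`"${payloadKey}" is present but ${moduleId} is not declared`);
    }
  }
  return problems;
}

const contractProblems = checkModuleContract({
  modules: ['nutrition@1'],
  nutrition: { calories: 100, protein_g: 5 },
  times: { prepMinutes: 10 }, // payload without a matching declaration
});
// contractProblems contains one entry, for the undeclared times@1 module
```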
|
|
168
|
+
|
|
169
|
+
**Caching:** Validators are cached by `${profile}::${sortedModules.join(",")}` for performance.
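
A minimal sketch of that cache key follows. Only the key format comes from the text above; `cacheKey`, `getValidator`, and the `Validator` type are hypothetical names.

```ts
type Validator = (recipe: unknown) => boolean;

const validatorCache = new Map<string, Validator>();

// Sorting makes the key independent of the order in which modules are declared.
function cacheKey(profile: string, modules: string[]): string {
  const sortedModules = [...modules].sort();
  return `${profile}::${sortedModules.join(',')}`;
}

// Builds a validator only on the first request for a given profile/module set.
function getValidator(profile: string, modules: string[], build: () => Validator): Validator {
  const key = cacheKey(profile, modules);
  let validator = validatorCache.get(key);
  if (!validator) {
    validator = build();
    validatorCache.set(key, validator);
  }
  return validator;
}

const exampleKey = cacheKey('core', ['times@1', 'nutrition@1']); // 'core::nutrition@1,times@1'
```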
|
|
170
|
+
|
|
171
|
+
### Module Resolution
|
|
172
|
+
|
|
173
|
+
Modules are resolved to schema references using the pattern:
|
|
174
|
+
- Module identifier format: `<name>@<version>` (e.g., `nutrition@1`, `schedule@1`)
|
|
175
|
+
- Schema reference: `https://soustack.org/schemas/recipe/modules/<name>/<version>.schema.json`
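
The resolution pattern can be sketched as a small parser. `resolveModuleSchema` is a hypothetical helper, and the identifier regex (lowercase name, integer version) is an assumption; only the `<name>@<version>` format and the URL layout come from the list above.

```ts
const MODULE_SCHEMA_BASE = 'https://soustack.org/schemas/recipe/modules';

function resolveModuleSchema(moduleId: string): string {
  // Assumption: names are lowercase with optional digits/hyphens, versions are integers.
  const match = /^([a-z][a-z0-9-]*)@(\d+)$/.exec(moduleId);
  if (!match) {
    throw new Error(`Invalid module identifier: ${moduleId}`);
  }
  const [, moduleName, version] = match;
  return `${MODULE_SCHEMA_BASE}/${moduleName}/${version}.schema.json`;
}

const nutritionSchemaUrl = resolveModuleSchema('nutrition@1');
// 'https://soustack.org/schemas/recipe/modules/nutrition/1.schema.json'
```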
|
|
176
|
+
|
|
177
|
+
The module registry (`schemas/registry/modules.json`) defines which modules are available and their properties, including:
|
|
178
|
+
- `schemaOrgMappable`: Whether the module can be converted to Schema.org format
|
|
179
|
+
- `minProfile`: Minimum profile required to use the module
|
|
180
|
+
- `allowedOnMinimal`: Whether the module can be used with the minimal profile
|
|
181
|
+
|
|
182
|
+
**Available Modules (v0.3.0):**
|
|
183
|
+
- `attribution@1`: Source attribution (url, author, datePublished)
|
|
184
|
+
- `taxonomy@1`: Classification (keywords, category, cuisine)
|
|
185
|
+
- `media@1`: Images and videos (images, videos arrays)
|
|
186
|
+
- `times@1`: Timing information (prepMinutes, cookMinutes, totalMinutes)
|
|
187
|
+
- `nutrition@1`: Nutritional data (calories, protein_g as numbers)
|
|
188
|
+
- `schedule@1`: Task scheduling (requires core profile, includes instruction dependencies)
|
|
189
|
+
|
|
190
|
+
## Programmatic Usage
|
|
191
|
+
|
|
192
|
+
```ts
|
|
193
|
+
import {
|
|
194
|
+
extractSchemaOrgRecipeFromHTML,
|
|
195
|
+
fromSchemaOrg,
|
|
196
|
+
toSchemaOrg,
|
|
197
|
+
validateRecipe,
|
|
198
|
+
scaleRecipe,
|
|
199
|
+
} from 'soustack';
|
|
200
|
+
import {
|
|
201
|
+
scrapeRecipe,
|
|
202
|
+
extractRecipeFromHTML,
|
|
203
|
+
extractSchemaOrgRecipeFromHTML as extractSchemaOrgRecipeFromHTMLNode,
|
|
204
|
+
} from 'soustack/scrape';
|
|
205
|
+
|
|
206
|
+
// Validate a Soustack recipe JSON object with profile enforcement
|
|
207
|
+
const validation = validateRecipe(recipe, { profile: 'core' });
|
|
208
|
+
if (!validation.valid) {
|
|
209
|
+
console.error(validation.errors);
|
|
210
|
+
}
|
|
211
|
+
|
|
212
|
+
// Scale a recipe by a multiplier (returns a "computed recipe")
|
|
213
|
+
const scaled = scaleRecipe(recipe, { multiplier: 2 });
|
|
214
|
+
|
|
215
|
+
// Scrape a URL into a Soustack recipe (Node.js only, throws if no recipe is found)
|
|
216
|
+
const scraped = await scrapeRecipe('https://example.com/recipe');
|
|
217
|
+
|
|
218
|
+
// Browser: fetch your own HTML, then parse and convert
|
|
219
|
+
const html = await fetch('https://example.com/recipe').then((r) => r.text());
|
|
220
|
+
const schemaOrgRecipe = extractSchemaOrgRecipeFromHTML(html);
|
|
221
|
+
const browserRecipe = schemaOrgRecipe ? fromSchemaOrg(schemaOrgRecipe) : null;
|
|
222
|
+
|
|
223
|
+
// Node: parse raw HTML with cheerio-powered extractor
|
|
224
|
+
const nodeSchemaOrg = extractSchemaOrgRecipeFromHTMLNode(html);
|
|
225
|
+
const nodeRecipe = extractRecipeFromHTML(html);
|
|
226
|
+
|
|
227
|
+
// Convert Schema.org → Soustack
|
|
228
|
+
const soustack = fromSchemaOrg(schemaOrgJsonLd);
|
|
229
|
+
|
|
230
|
+
// Convert Soustack → Schema.org
|
|
231
|
+
const jsonLd = toSchemaOrg(recipe);
|
|
232
|
+
|
|
233
|
+
```
|
|
234
|
+
|
|
235
|
+
## 🪶 Core-lite (browser) Schema.org conversion
|
|
236
|
+
|
|
237
|
+
Need to stay browser-only? Import the core bundle (no Node `fetch` or cheerio dependencies) and perform Schema.org extraction and conversion entirely client-side:
|
|
238
|
+
|
|
239
|
+
```ts
|
|
240
|
+
import { extractSchemaOrgRecipeFromHTML, fromSchemaOrg, toSchemaOrg } from 'soustack';
|
|
241
|
+
|
|
242
|
+
async function convert(url: string) {
|
|
243
|
+
const html = await fetch(url).then((r) => r.text());
|
|
244
|
+
|
|
245
|
+
// Pure DOMParser-based extraction (works in modern browsers)
|
|
246
|
+
const schemaOrg = extractSchemaOrgRecipeFromHTML(html);
|
|
247
|
+
if (!schemaOrg) throw new Error('No Schema.org recipe found');
|
|
248
|
+
|
|
249
|
+
// Convert to Soustack and back to Schema.org JSON-LD if needed
|
|
250
|
+
const soustackRecipe = fromSchemaOrg(schemaOrg);
|
|
251
|
+
const jsonLd = toSchemaOrg(soustackRecipe);
|
|
252
|
+
|
|
253
|
+
return { soustackRecipe, jsonLd };
|
|
254
|
+
}
|
|
255
|
+
```
|
|
256
|
+
|
|
257
|
+
## Schema.org Conversion
|
|
258
|
+
|
|
259
|
+
Use the helpers to move between Schema.org JSON-LD and Soustack's structured recipe format. The conversion automatically handles image normalization, supporting multiple image formats from Schema.org.
|
|
260
|
+
|
|
261
|
+
**BREAKING CHANGE in v0.3.0:** `toSchemaOrg()` now targets the **minimal profile** and only includes modules that are marked as `schemaOrgMappable` in the modules registry. Non-mappable modules (e.g., `nutrition@1`, `schedule@1`) are excluded from the conversion.
|
|
262
|
+
|
|
263
|
+
```ts
|
|
264
|
+
import { fromSchemaOrg, toSchemaOrg, normalizeImage } from 'soustack';
|
|
265
|
+
|
|
266
|
+
// Convert Schema.org → Soustack (automatically normalizes images)
|
|
267
|
+
const soustackRecipe = fromSchemaOrg(schemaOrgJsonLd);
|
|
268
|
+
// Recipe images: string | string[] | undefined
|
|
269
|
+
// Instruction images: optional image URL per step
|
|
270
|
+
|
|
271
|
+
// Convert Soustack → Schema.org (preserves images)
|
|
272
|
+
const schemaOrgRecipe = toSchemaOrg(soustackRecipe);
|
|
273
|
+
|
|
274
|
+
// Manual image normalization (if needed)
|
|
275
|
+
const normalized = normalizeImage(schemaOrgImage);
|
|
276
|
+
// Handles: strings, arrays, ImageObjects with url/contentUrl
|
|
277
|
+
```
|
|
278
|
+
|
|
279
|
+
### Image Format Support
|
|
280
|
+
|
|
281
|
+
Soustack supports flexible image formats:
|
|
282
|
+
|
|
283
|
+
- **Recipe-level images**: Single URL (`string`) or multiple URLs (`string[]`)
|
|
284
|
+
- **Instruction-level images**: Optional `image` property on instruction objects
|
|
285
|
+
- **Automatic normalization**: Schema.org ImageObjects are automatically converted to URLs during import
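
To make the normalization rule concrete, here is a sketch of how the accepted shapes could collapse to URLs. `normalizeImageSketch` and the `SchemaOrgImage` type are hypothetical stand-ins; the package's own `normalizeImage()` may differ in details such as which field it prefers.

```ts
type ImageObjectLike = { url?: string; contentUrl?: string };
type SchemaOrgImage = string | ImageObjectLike | Array<string | ImageObjectLike>;

function normalizeImageSketch(image: SchemaOrgImage | undefined): string | string[] | undefined {
  if (image === undefined) return undefined;
  const items = Array.isArray(image) ? image : [image];
  const urls = items
    // Assumption: prefer `url`, fall back to `contentUrl`.
    .map((item) => (typeof item === 'string' ? item : (item.url ?? item.contentUrl)))
    .filter((url): url is string => typeof url === 'string' && url.length > 0);
  if (urls.length === 0) return undefined;
  // A single image stays a string; several become a string[].
  return urls.length === 1 ? urls[0] : urls;
}

const single = normalizeImageSketch({ url: 'https://example.com/hero.jpg' });
const gallery = normalizeImageSketch([
  'https://example.com/hero.jpg',
  { contentUrl: 'https://example.com/gallery.jpg' },
]);
```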
|
|
286
|
+
|
|
287
|
+
Example recipe with images:
|
|
288
|
+
|
|
289
|
+
```ts
|
|
290
|
+
const recipe = {
|
|
291
|
+
name: "Chocolate Cake",
|
|
292
|
+
image: ["https://example.com/hero.jpg", "https://example.com/gallery.jpg"],
|
|
293
|
+
instructions: [
|
|
294
|
+
"Mix dry ingredients",
|
|
295
|
+
{ text: "Decorate the cake", image: "https://example.com/decorate.jpg" },
|
|
296
|
+
"Serve"
|
|
297
|
+
]
|
|
298
|
+
};
|
|
299
|
+
```
|
|
300
|
+
|
|
301
|
+
## 🧰 Web Scraping
|
|
302
|
+
|
|
303
|
+
### Node.js: `scrapeRecipe()`
|
|
304
|
+
|
|
305
|
+
`scrapeRecipe(url, options)` fetches a recipe page and extracts Schema.org data. **Node.js only** due to CORS restrictions.
|
|
306
|
+
|
|
307
|
+
Options:
|
|
308
|
+
|
|
309
|
+
- `timeout` (ms, default `10000`)
|
|
310
|
+
- `userAgent` (string, optional)
|
|
311
|
+
- `maxRetries` (default `2`, retries on non-4xx failures)
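
The retry rule in the last option ("retries on non-4xx failures") can be sketched as follows. This is an illustration, not the scraper's code: `fetchWithRetry`, `PageResult`, and the injectable `Fetcher` are hypothetical, and timeout handling is left out for brevity.

```ts
interface PageResult {
  status: number;
  body: string;
}

// Injecting the fetcher keeps the sketch independent of any particular HTTP client.
type Fetcher = (url: string) => Promise<PageResult>;

async function fetchWithRetry(url: string, fetcher: Fetcher, maxRetries = 2): Promise<string> {
  let lastError: unknown = new Error(`No attempt made for ${url}`);
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    let page: PageResult;
    try {
      page = await fetcher(url);
    } catch (error) {
      lastError = error; // network-level failure: eligible for retry
      continue;
    }
    if (page.status >= 200 && page.status < 300) return page.body;
    lastError = new Error(`HTTP ${page.status} for ${url}`);
    // 4xx responses will not improve on retry, so stop immediately.
    if (page.status >= 400 && page.status < 500) break;
  }
  throw lastError;
}
```

With the default `maxRetries` of 2, a page is requested at most three times before the last error is surfaced.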
|
|
312
|
+
|
|
313
|
+
```ts
|
|
314
|
+
import { scrapeRecipe } from 'soustack/scrape';
|
|
315
|
+
|
|
316
|
+
const recipe = await scrapeRecipe('https://example.com/recipe', {
|
|
317
|
+
timeout: 15000,
|
|
318
|
+
maxRetries: 3,
|
|
319
|
+
});
|
|
320
|
+
```
|
|
321
|
+
|
|
322
|
+
### Browser: `extractSchemaOrgRecipeFromHTML()`
|
|
323
|
+
|
|
324
|
+
`extractSchemaOrgRecipeFromHTML(html)` extracts the raw Schema.org recipe data from HTML. Returns `null` if no recipe is found. Use this when you need to inspect, debug, or convert Schema.org data in browser builds without dragging in Node dependencies.
|
|
325
|
+
|
|
326
|
+
```ts
|
|
327
|
+
import { extractSchemaOrgRecipeFromHTML, fromSchemaOrg } from 'soustack';
|
|
328
|
+
|
|
329
|
+
// In browser: fetch HTML yourself
|
|
330
|
+
const response = await fetch('https://example.com/recipe');
|
|
331
|
+
const html = await response.text();
|
|
332
|
+
|
|
333
|
+
// Extract Schema.org format (for inspection/modification)
|
|
334
|
+
const schemaOrgRecipe = extractSchemaOrgRecipeFromHTML(html);
|
|
335
|
+
|
|
336
|
+
if (schemaOrgRecipe) {
|
|
337
|
+
// Inspect or modify Schema.org data before converting
|
|
338
|
+
console.log('Found recipe:', schemaOrgRecipe.name);
|
|
339
|
+
|
|
340
|
+
// Convert to Soustack format when ready
|
|
341
|
+
const soustackRecipe = fromSchemaOrg(schemaOrgRecipe);
|
|
342
|
+
}
|
|
343
|
+
```
|
|
344
|
+
|
|
345
|
+
### Node-only scraping: `soustack/scrape`
|
|
346
|
+
|
|
347
|
+
For server-side scraping with built-in fetching and cheerio-based parsing, use the dedicated entrypoint:
|
|
348
|
+
|
|
349
|
+
```ts
|
|
350
|
+
import { scrapeRecipe, extractRecipeFromHTML, fetchPage } from 'soustack/scrape';
|
|
351
|
+
|
|
352
|
+
// Fetch and parse a URL directly
|
|
353
|
+
const recipe = await scrapeRecipe('https://example.com/recipe');
|
|
354
|
+
|
|
355
|
+
// Or work with already-downloaded HTML
|
|
356
|
+
const html = await fetchPage('https://example.com/recipe');
|
|
357
|
+
const parsed = extractRecipeFromHTML(html);
|
|
358
|
+
```
|
|
359
|
+
|
|
360
|
+
### CLI
|
|
361
|
+
|
|
362
|
+
```bash
|
|
363
|
+
# Validate with profiles (JSON output for pipelines)
|
|
364
|
+
npx soustack validate recipe.soustack.json --profile block --strict --json
|
|
365
|
+
|
|
366
|
+
# Repo-wide test run (validates every *.soustack.json)
|
|
367
|
+
npx soustack test --profile block
|
|
368
|
+
|
|
369
|
+
# Convert Schema.org ↔ Soustack
|
|
370
|
+
npx soustack convert --from schemaorg --to soustack recipe.jsonld -o recipe.soustack.json
|
|
371
|
+
npx soustack convert --from soustack --to schemaorg recipe.soustack.json -o recipe.jsonld
|
|
372
|
+
|
|
373
|
+
# Import (scrape) or scale from the CLI
|
|
374
|
+
npx soustack import --url "https://example.com/recipe" -o recipe.soustack.json
|
|
375
|
+
npx soustack scale recipe.soustack.json 2
|
|
376
|
+
```
|
|
377
|
+
|
|
378
|
+
## Keeping the Schema in Sync
|
|
379
|
+
|
|
380
|
+
The schema files in this repository are **copies** of the official standard. The source of truth lives in [RichardHerold/soustack-spec](https://github.com/RichardHerold/soustack-spec).
|
|
381
|
+
|
|
382
|
+
**Do not edit any synced schema artifacts manually** (`src/schema.json`, `src/soustack.schema.json`, `src/profiles/*.schema.json`).
|
|
383
|
+
|
|
384
|
+
To update to the latest tagged version of the standard, run:
|
|
385
|
+
|
|
386
|
+
```bash
|
|
387
|
+
npm run sync:spec
|
|
388
|
+
```
|
|
389
|
+
|
|
390
|
+
## Development
|
|
391
|
+
|
|
392
|
+
```bash
|
|
393
|
+
npm test
|
|
394
|
+
```
|