@uniweb/semantic-parser 1.0.0

@@ -0,0 +1,928 @@
1
+ # Content Mapping Patterns
2
+
3
+ This guide shows how to use the mapping utilities to transform parsed content into component-specific formats.
4
+
5
+ ## Overview
6
+
7
+ The parser provides mapping utilities designed for two contexts:
8
+
9
+ - **Visual Editor Mode** (default): Gracefully handles content with silent cleanup, perfect for non-technical users
10
+ - **Build Mode**: Validates content and warns about issues, ideal for development workflows
11
+
12
+ ### Mapping Tools
13
+
14
+ 1. **Type System**: Automatic transformation based on field types (plaintext, richtext, excerpt, etc.)
15
+ 2. **Helpers**: General-purpose utility functions
16
+ 3. **Accessor**: Path-based extraction with schema support
17
+ 4. **Extractors**: Pre-built patterns for common components
18
+
19
+ ## Type System (Recommended)
20
+
21
+ The type system automatically transforms content based on component requirements, which suits visual editors whose users are not expected to know HTML or Markdown.
22
+
23
+ ### Visual Editor Mode (Default)
24
+
25
+ Gracefully handles content issues with silent, automatic cleanup:
26
+
27
+ ```js
28
+ const schema = {
29
+ title: {
30
+ path: "groups.main.header.title",
31
+ type: "plaintext", // Auto-strips HTML markup
32
+ maxLength: 60 // Auto-truncates with smart boundaries
33
+ },
34
+ description: {
35
+ path: "groups.main.body.paragraphs",
36
+ type: "excerpt", // Auto-creates excerpt from paragraphs
37
+ maxLength: 150
38
+ },
39
+ image: {
40
+ path: "groups.main.body.imgs[0].url",
41
+ type: "image", // Normalizes image data
42
+ defaultValue: "/placeholder.jpg",
43
+ treatEmptyAsDefault: true
44
+ }
45
+ };
46
+
47
+ // Visual editor mode (default) - silent cleanup
48
+ const data = mappers.extractBySchema(parsed, schema);
49
+ // {
50
+ // title: "Welcome to Our Platform", // <strong> tags stripped
51
+ // description: "Get started with...", // Truncated, markup removed
52
+ // image: "/hero.jpg" or "/placeholder.jpg"
53
+ // }
54
+ ```
55
+
56
+ ### Build Mode
57
+
58
+ Validates content and provides warnings for developers:
59
+
60
+ ```js
61
+ const data = mappers.extractBySchema(parsed, schema, { mode: 'build' });
62
+
63
+ // Console output:
64
+ // ⚠️ [title] Field contains HTML markup but expects plain text (auto-fixed)
65
+ // ⚠️ [title] Text is 65 characters (max: 60) (auto-fixed)
66
+ ```
67
+
68
+ ### Available Field Types
69
+
70
+ #### `plaintext`
71
+
72
+ Strips all HTML markup, returning clean text. Perfect for titles, labels, and anywhere HTML shouldn't appear.
73
+
74
+ ```js
75
+ {
76
+ title: {
77
+ path: "groups.main.header.title",
78
+ type: "plaintext",
79
+ maxLength: 60, // Auto-truncate
80
+ boundary: "word", // or "sentence", "character"
81
+ ellipsis: "...",
82
+ transform: (text) => text.toUpperCase() // Additional transform
83
+ }
84
+ }
85
+
86
+ // Input: "Welcome to <strong>Our Platform</strong>"
87
+ // Output: "Welcome to Our Platform"
88
+ ```
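To make the `plaintext` behavior concrete, here is a minimal sketch of tag stripping followed by word-boundary truncation. The function `toPlainText` and its option handling are hypothetical and written only for illustration; the parser's actual implementation may differ, and a regex-based strip like this is not a substitute for real sanitization.

```javascript
// Hypothetical sketch; not the parser's own implementation.
function toPlainText(html, { maxLength = Infinity, ellipsis = "..." } = {}) {
  // Remove anything that looks like a tag, then collapse whitespace.
  const text = String(html)
    .replace(/<[^>]*>/g, "")
    .replace(/\s+/g, " ")
    .trim();
  if (text.length <= maxLength) return text;

  // Cut at the limit, then back up to the last space so no word is split.
  const slice = text.slice(0, maxLength);
  const lastSpace = slice.lastIndexOf(" ");
  const cut = lastSpace > 0 ? slice.slice(0, lastSpace) : slice;
  return cut + ellipsis; // the ellipsis is appended on top of the cut text
}

console.log(toPlainText("Welcome to <strong>Our Platform</strong>"));
// "Welcome to Our Platform"
console.log(toPlainText("Welcome to <strong>Our Platform</strong>", { maxLength: 12 }));
// "Welcome to..."
```

Note that this sketch appends the ellipsis after cutting, so the result can be slightly longer than `maxLength`; whether the library counts the ellipsis toward the limit is not stated in this guide.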
89
+
90
+ #### `richtext`
91
+
92
+ Preserves safe HTML while removing dangerous tags (script, iframe, etc.).
93
+
94
+ ```js
95
+ {
96
+ description: {
97
+ path: "groups.main.body.paragraphs[0]",
98
+ type: "richtext",
99
+ allowedTags: ["strong", "em", "a", "br"], // Customize allowed tags
100
+ stripTags: ["script", "style"] // Additional tags to remove
101
+ }
102
+ }
103
+
104
+ // Input: "Text with <strong>bold</strong> and <script>bad</script>"
105
+ // Output: "Text with <strong>bold</strong> and "
106
+ ```
107
+
108
+ #### `excerpt`
109
+
110
+ Auto-generates excerpt from content, stripping markup and truncating intelligently.
111
+
112
+ ```js
113
+ {
114
+ excerpt: {
115
+ path: "groups.main.body.paragraphs",
116
+ type: "excerpt",
117
+ maxLength: 150,
118
+ boundary: "word", // or "sentence"
119
+ preferFirstSentence: true // Use first sentence if short enough
120
+ }
121
+ }
122
+
123
+ // Input: ["Long paragraph with <em>formatting</em>...", "More text..."]
124
+ // Output: "Long paragraph with formatting..."
125
+ ```
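One plausible reading of `preferFirstSentence` is sketched below: join the paragraphs, strip markup, return the first sentence if it fits, and otherwise fall back to word-boundary truncation. `makeExcerpt` is a hypothetical name and the sentence detection is deliberately naive; treat this as an assumption about the behavior, not the library's code.

```javascript
// Hypothetical sketch of excerpt generation; the real logic may differ.
function makeExcerpt(paragraphs, { maxLength = 150, preferFirstSentence = true } = {}) {
  // Strip tags from each paragraph and merge everything into one string.
  const text = paragraphs
    .map((p) => String(p).replace(/<[^>]*>/g, ""))
    .join(" ")
    .replace(/\s+/g, " ")
    .trim();

  if (preferFirstSentence) {
    // Naive sentence match: everything up to the first ., ! or ?
    // that is followed by whitespace or the end of the text.
    const match = text.match(/^.*?[.!?](?=\s|$)/);
    if (match && match[0].length <= maxLength) return match[0];
  }

  if (text.length <= maxLength) return text;

  // Fall back to cutting at the last space before the limit.
  const slice = text.slice(0, maxLength);
  const lastSpace = slice.lastIndexOf(" ");
  return (lastSpace > 0 ? slice.slice(0, lastSpace) : slice) + "...";
}

const paragraphs = ["Fast <em>and</em> simple. More details follow here.", "Second paragraph."];
console.log(makeExcerpt(paragraphs, { maxLength: 40 }));
// "Fast and simple."
```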
126
+
127
+ #### `number`
128
+
129
+ Parses and optionally formats numbers.
130
+
131
+ ```js
132
+ {
133
+ price: {
134
+ path: "groups.main.header.title",
135
+ type: "number",
136
+ format: {
137
+ decimals: 2,
138
+ thousands: ",",
139
+ decimal: "."
140
+ }
141
+ }
142
+ }
143
+
144
+ // Input: "1234.567"
145
+ // Output: "1,234.57"
146
+ ```
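The `format` options can be understood as a three-step process: round to a fixed number of decimals, group the integer digits, then join with the chosen separators. The sketch below reproduces the example above under that assumption; `formatNumber` is a hypothetical helper, not a function exported by the package.

```javascript
// Hypothetical sketch of the number formatting step.
function formatNumber(input, { decimals = 2, thousands = ",", decimal = "." } = {}) {
  const value = Number.parseFloat(input);
  if (Number.isNaN(value)) return null; // unparseable input

  // Round to a fixed number of decimals and split integer/fraction parts.
  const [intPart, fracPart] = Math.abs(value).toFixed(decimals).split(".");

  // Insert the thousands separator between every group of three digits.
  const grouped = intPart.replace(/\B(?=(\d{3})+(?!\d))/g, thousands);

  const sign = value < 0 ? "-" : "";
  return sign + grouped + (fracPart ? decimal + fracPart : "");
}

console.log(formatNumber("1234.567"));
// "1,234.57"
console.log(formatNumber("1234567.891", { decimals: 2, thousands: ".", decimal: "," }));
// "1.234.567,89"
```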
147
+
148
+ #### `image`
149
+
150
+ Normalizes image data structure.
151
+
152
+ ```js
153
+ {
154
+ image: {
155
+ path: "groups.main.body.imgs[0]",
156
+ type: "image",
157
+ defaultValue: "/placeholder.jpg",
158
+ defaultAlt: "Image"
159
+ }
160
+ }
161
+
162
+ // Input: "/hero.jpg" or { url: "/hero.jpg", alt: "Hero" }
163
+ // Output: { url: "/hero.jpg", alt: "Hero", caption: null }
164
+ ```
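The normalization described above accepts either a URL string or an object and always yields the same shape. A minimal sketch of that idea follows; `normalizeImage` is a hypothetical function, and the handling of missing values (returning `null` when there is no value and no default) is an assumption.

```javascript
// Hypothetical sketch of image normalization.
function normalizeImage(value, { defaultValue = null, defaultAlt = "" } = {}) {
  // Treat null, undefined and "" as missing and fall back to the default.
  const source = value == null || value === "" ? defaultValue : value;
  if (source == null) return null;

  // A bare string is taken to be the image URL.
  if (typeof source === "string") {
    return { url: source, alt: defaultAlt, caption: null };
  }

  // An object keeps its own fields, with defaults filling the gaps.
  return {
    url: source.url ?? defaultValue,
    alt: source.alt ?? defaultAlt,
    caption: source.caption ?? null,
  };
}

console.log(normalizeImage({ url: "/hero.jpg", alt: "Hero" }));
console.log(normalizeImage("/hero.jpg", { defaultAlt: "Image" }));
console.log(normalizeImage(undefined, { defaultValue: "/placeholder.jpg", defaultAlt: "Image" }));
```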
165
+
166
+ #### `link`
167
+
168
+ Normalizes link data structure.
169
+
170
+ ```js
171
+ {
172
+ cta: {
173
+ path: "groups.main.body.links[0]",
174
+ type: "link"
175
+ }
176
+ }
177
+
178
+ // Input: "http://example.com" or { href: "/page", label: "Click" }
179
+ // Output: { href: "/page", label: "Click", target: "_self" }
180
+ ```
181
+
182
+ ### Validation for UI Hints
183
+
184
+ Get validation results without extracting data - perfect for showing hints in visual editors:
185
+
186
+ ```js
187
+ const hints = mappers.validateSchema(parsed, schema, { mode: 'visual-editor' });
188
+
189
+ // {
190
+ // title: [{
191
+ // type: 'max_length',
192
+ // severity: 'info',
193
+ // message: 'Text is 65 characters (max: 60)',
194
+ // autoFix: true
195
+ // }],
196
+ // image: [{
197
+ // type: 'required',
198
+ // severity: 'error',
199
+ // message: 'Required image is missing',
200
+ // autoFix: false
201
+ // }]
202
+ // }
203
+
204
+ // Use in UI:
205
+ // Title field: ℹ️ "Title is a bit long (will be trimmed to fit)"
206
+ // Image field: ⚠️ "Image is required"
207
+ ```
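An editor UI could turn the hints object shown above into per-field messages along these lines. This sketch assumes only the hint shape from the example (`type`, `severity`, `message`, `autoFix`); the function name `hintsToMessages`, the icon mapping, and the `blocking` flag are illustrative choices, not part of the package.

```javascript
// Hypothetical mapping from severity to an icon shown next to the field.
const ICONS = { info: "ℹ️", error: "⚠️" };

// Flatten { field: [issue, ...] } into a list of display-ready messages.
function hintsToMessages(hints) {
  return Object.entries(hints).flatMap(([field, issues]) =>
    issues.map((issue) => ({
      field,
      text: `${ICONS[issue.severity] ?? "•"} ${issue.message}`,
      // Only errors that cannot be fixed automatically should block saving.
      blocking: issue.severity === "error" && !issue.autoFix,
    }))
  );
}

const messages = hintsToMessages({
  title: [{ type: "max_length", severity: "info", message: "Text is 65 characters (max: 60)", autoFix: true }],
  image: [{ type: "required", severity: "error", message: "Required image is missing", autoFix: false }],
});
console.log(messages);
```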
208
+
209
+ ### Real-World Example
210
+
211
+ ```js
212
+ // Component declares its content requirements
213
+ const componentSchema = {
214
+ brand: {
215
+ path: "groups.main.header.pretitle",
216
+ type: "plaintext",
217
+ maxLength: 20,
218
+ transform: (text) => text.toUpperCase()
219
+ },
220
+ title: {
221
+ path: "groups.main.header.title",
222
+ type: "plaintext",
223
+ maxLength: 60,
224
+ required: true
225
+ },
226
+ subtitle: {
227
+ path: "groups.main.header.subtitle",
228
+ type: "plaintext",
229
+ maxLength: 100
230
+ },
231
+ description: {
232
+ path: "groups.main.body.paragraphs",
233
+ type: "excerpt",
234
+ maxLength: 200
235
+ },
236
+ image: {
237
+ path: "groups.main.body.imgs[0].url",
238
+ type: "image",
239
+ defaultValue: "/placeholder.jpg"
240
+ },
241
+ cta: {
242
+ path: "groups.main.body.links[0]",
243
+ type: "link"
244
+ }
245
+ };
246
+
247
+ // Engine extracts and transforms for component
248
+ const componentData = mappers.extractBySchema(parsed, componentSchema);
249
+
250
+ // Component receives clean, validated data:
251
+ // {
252
+ // brand: "NEW PRODUCT",
253
+ // title: "Welcome to Our Platform",
254
+ // subtitle: "Get started today",
255
+ // description: "Transform how you create content...",
256
+ // image: "/hero.jpg",
257
+ // cta: { href: "/signup", label: "Get Started", target: "_self" }
258
+ // }
259
+ ```
260
+
261
+ ---
262
+
263
+ ## Quick Start
264
+
265
+ ```js
266
+ import { parseContent, mappers } from "@uniwebcms/semantic-parser";
267
+
268
+ const parsed = parseContent(doc);
269
+
270
+ // Use a pre-built extractor
271
+ const heroData = mappers.extractors.hero(parsed);
272
+
273
+ // Or use schema-based extraction
274
+ const customData = mappers.extractBySchema(parsed, {
275
+ title: "groups.main.header.title",
276
+ image: { path: "groups.main.body.imgs[0].url", defaultValue: "/placeholder.jpg" }
277
+ });
278
+ ```
279
+
280
+ ## Helper Utilities
281
+
282
+ ### Array Helpers
283
+
284
+ ```js
285
+ const { helpers } = mappers;
286
+
287
+ // Get first item with default
288
+ const image = helpers.first(images, "/default.jpg");
289
+
290
+ // Get last item
291
+ const lastParagraph = helpers.last(paragraphs);
292
+
293
+ // Transform array
294
+ const titles = helpers.transformArray(items, item => item.header.title);
295
+
296
+ // Filter and transform
297
+ const h2s = helpers.filterArray(headings, h => h.level === 2, h => h.content);
298
+
299
+ // Join text
300
+ const description = helpers.joinText(paragraphs, " ");
301
+
302
+ // Compact (remove null/undefined/empty)
303
+ const cleanArray = helpers.compact([null, "text", "", undefined, "more"]);
304
+ // => ["text", "more"]
305
+ ```
306
+
307
+ ### Object Helpers
308
+
309
+ ```js
310
+ // Get nested value safely
311
+ const title = helpers.get(parsed, "groups.main.header.title", "Untitled");
312
+
313
+ // Pick specific properties
314
+ const metadata = helpers.pick(parsed.groups.main, ["header", "banner"]);
315
+
316
+ // Omit properties
317
+ const withoutMetadata = helpers.omit(item, ["metadata"]);
318
+ ```
319
+
320
+ ### Validation
321
+
322
+ ```js
323
+ // Check if value exists (not null/undefined/empty string)
324
+ if (helpers.exists(title)) {
325
+ // title has a value
326
+ }
327
+
328
+ // Validate required fields
329
+ const validation = helpers.validateRequired(data, ["title", "image"]);
330
+ if (!validation.valid) {
331
+ console.log("Missing fields:", validation.missing);
332
+ }
333
+ ```
334
+
335
+ ### Safe Extraction
336
+
337
+ ```js
338
+ // Wrap extraction in try-catch
339
+ const safeExtractor = helpers.safe((parsed) => {
340
+ return parsed.groups.main.header.title.toUpperCase();
341
+ }, "DEFAULT");
342
+
343
+ const title = safeExtractor(parsed); // Won't throw if path is invalid
344
+ ```
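Conceptually, a wrapper like `helpers.safe` is a higher-order function that catches any error thrown by the extractor and returns the fallback instead. The following sketch shows that pattern with a hypothetical `safeSketch`; it is not the library's source, and details such as whether `undefined` results also trigger the fallback are left out.

```javascript
// Hypothetical sketch of a "safe" wrapper around an extractor function.
function safeSketch(fn, fallback) {
  return (...args) => {
    try {
      return fn(...args);
    } catch {
      // Any thrown error (for example reading a property of undefined)
      // results in the fallback value.
      return fallback;
    }
  };
}

const getTitle = safeSketch((parsed) => parsed.groups.main.header.title.toUpperCase(), "DEFAULT");

console.log(getTitle({ groups: { main: { header: { title: "hello" } } } }));
// "HELLO"
console.log(getTitle({}));
// "DEFAULT"
```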
345
+
346
+ ## Path-Based Accessor
347
+
348
+ ### Basic Usage
349
+
350
+ ```js
351
+ const { accessor } = mappers;
352
+
353
+ // Simple path
354
+ const title = accessor.getByPath(parsed, "groups.main.header.title");
355
+
356
+ // Array index notation
357
+ const firstImage = accessor.getByPath(parsed, "groups.main.body.imgs[0].url");
358
+
359
+ // With default value
360
+ const image = accessor.getByPath(parsed, "groups.main.body.imgs[0].url", {
361
+ defaultValue: "/placeholder.jpg"
362
+ });
363
+
364
+ // With transformation
365
+ const description = accessor.getByPath(parsed, "groups.main.body.paragraphs", {
366
+ transform: (paragraphs) => paragraphs.join(" ")
367
+ });
368
+
369
+ // Required field (throws if missing)
370
+ const requiredTitle = accessor.getByPath(parsed, "groups.main.header.title", {
371
+ required: true
372
+ });
373
+ ```
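For readers curious how a path such as `groups.main.body.imgs[0].url` can be resolved, here is a small self-contained sketch. `getByPathSketch` is hypothetical: it mirrors the options shown above (`defaultValue`, `transform`, `required`), but the exact semantics of the real `accessor.getByPath` (for instance how `null` is treated) are assumptions.

```javascript
// Hypothetical sketch of path resolution with dot and [index] notation.
function getByPathSketch(obj, path, { defaultValue, transform, required = false } = {}) {
  // "groups.main.body.imgs[0].url" -> ["groups", "main", "body", "imgs", "0", "url"]
  const keys = path.replace(/\[(\d+)\]/g, ".$1").split(".").filter(Boolean);

  // Walk the object one key at a time, stopping at the first gap.
  let current = obj;
  for (const key of keys) {
    if (current == null) break;
    current = current[key];
  }

  if (current === undefined || current === null) {
    if (required) throw new Error(`Missing required path: ${path}`);
    return defaultValue;
  }
  return transform ? transform(current) : current;
}

const sample = {
  groups: { main: { body: { imgs: [{ url: "/a.jpg" }], paragraphs: ["One.", "Two."] } } },
};

console.log(getByPathSketch(sample, "groups.main.body.imgs[0].url"));
// "/a.jpg"
console.log(getByPathSketch(sample, "groups.main.body.imgs[1].url", { defaultValue: "/placeholder.jpg" }));
// "/placeholder.jpg"
console.log(getByPathSketch(sample, "groups.main.body.paragraphs", { transform: (p) => p.join(" ") }));
// "One. Two."
```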
374
+
375
+ ### Schema-Based Extraction
376
+
377
+ Extract multiple fields at once using a schema:
378
+
379
+ ```js
380
+ const schema = {
381
+ // Shorthand: just the path
382
+ title: "groups.main.header.title",
383
+
384
+ // Full config with options
385
+ image: {
386
+ path: "groups.main.body.imgs[0].url",
387
+ defaultValue: "/placeholder.jpg"
388
+ },
389
+
390
+ description: {
391
+ path: "groups.main.body.paragraphs",
392
+ transform: (p) => p.join(" ")
393
+ },
394
+
395
+ cta: {
396
+ path: "groups.main.body.links[0]",
397
+ required: false
398
+ }
399
+ };
400
+
401
+ const data = accessor.extractBySchema(parsed, schema);
402
+ // {
403
+ // title: "...",
404
+ // image: "..." or "/placeholder.jpg",
405
+ // description: "...",
406
+ // cta: {...} or null
407
+ // }
408
+ ```
409
+
410
+ ### Array Mapping
411
+
412
+ Extract data from an array of items:
413
+
414
+ ```js
415
+ // Simple: extract single field from each item
416
+ const titles = accessor.mapArray(parsed, "groups.items", "header.title");
417
+ // ["Item 1", "Item 2", "Item 3"]
418
+
419
+ // Complex: extract multiple fields from each item
420
+ const cards = accessor.mapArray(parsed, "groups.items", {
421
+ title: "header.title",
422
+ text: { path: "body.paragraphs", transform: p => p.join(" ") },
423
+ image: { path: "body.imgs[0].url", defaultValue: "/default.jpg" }
424
+ });
425
+ // [
426
+ // { title: "...", text: "...", image: "..." },
427
+ // { title: "...", text: "...", image: "..." }
428
+ // ]
429
+ ```
430
+
431
+ ### Path Helpers
432
+
433
+ ```js
434
+ // Check if path exists
435
+ if (accessor.hasPath(parsed, "groups.main.banner.url")) {
436
+ // Banner exists
437
+ }
438
+
439
+ // Get first existing path
440
+ const image = accessor.getFirstExisting(parsed, [
441
+ "groups.main.banner.url",
442
+ "groups.main.body.imgs[0].url",
443
+ "groups.items[0].body.imgs[0].url"
444
+ ], "/fallback.jpg");
445
+ ```
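The fallback chain in `getFirstExisting` amounts to trying each path in order and returning the first usable value. A sketch of that logic, using hypothetical helpers `resolvePath` and `firstExistingSketch`, might look like this; treating an empty string as "not existing" is an assumption.

```javascript
// Hypothetical path resolver: returns undefined as soon as a segment is missing.
const resolvePath = (obj, path) =>
  path
    .replace(/\[(\d+)\]/g, ".$1")
    .split(".")
    .reduce((current, key) => (current == null ? undefined : current[key]), obj);

// Return the value at the first path that yields something usable.
function firstExistingSketch(obj, paths, fallback) {
  for (const path of paths) {
    const value = resolvePath(obj, path);
    if (value !== undefined && value !== null && value !== "") return value;
  }
  return fallback;
}

const content = { groups: { main: { body: { imgs: [{ url: "/body.jpg" }] } }, items: [] } };

console.log(
  firstExistingSketch(content, ["groups.main.banner.url", "groups.main.body.imgs[0].url"], "/fallback.jpg")
);
// "/body.jpg"
```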
446
+
447
+ ## Pre-Built Extractors
448
+
449
+ ### Hero Component
450
+
451
+ Large header with title, image, and CTA:
452
+
453
+ ```js
454
+ const heroData = mappers.extractors.hero(parsed);
455
+ // {
456
+ // title: "Welcome",
457
+ // subtitle: "Get started today",
458
+ // kicker: "NEW",
459
+ // description: "Join thousands of users...",
460
+ // image: "/hero.jpg",
461
+ // imageAlt: "Hero image",
462
+ // banner: "/banner.jpg",
463
+ // cta: { href: "/signup", label: "Get Started" },
464
+ // button: { content: "Learn More", attrs: {...} }
465
+ // }
466
+ ```
467
+
468
+ ### Card Component
469
+
470
+ ```js
471
+ // Single card from main content
472
+ const card = mappers.extractors.card(parsed);
473
+
474
+ // Multiple cards from items
475
+ const cards = mappers.extractors.card(parsed, { useItems: true });
476
+
477
+ // Specific card by index
478
+ const firstCard = mappers.extractors.card(parsed, { useItems: true, itemIndex: 0 });
479
+ ```
480
+
481
+ ### Article Content
482
+
483
+ ```js
484
+ const article = mappers.extractors.article(parsed);
485
+ // {
486
+ // title: "Article Title",
487
+ // subtitle: "Subtitle",
488
+ // kicker: "FEATURED",
489
+ // author: "John Doe",
490
+ // date: "2024-01-01",
491
+ // banner: "/banner.jpg",
492
+ // content: ["paragraph 1", "paragraph 2"],
493
+ // images: [...],
494
+ // videos: [...],
495
+ // links: [...]
496
+ // }
497
+ ```
498
+
499
+ ### Statistics
500
+
501
+ ```js
502
+ const stats = mappers.extractors.stats(parsed);
503
+ // [
504
+ // { value: "12", label: "Partner Labs", description: "..." },
505
+ // { value: "$25M", label: "Grant Funding", description: "..." }
506
+ // ]
507
+ ```
508
+
509
+ ### Navigation Menu
510
+
511
+ ```js
512
+ const nav = mappers.extractors.navigation(parsed);
513
+ // [
514
+ // {
515
+ // label: "Products",
516
+ // href: "/products",
517
+ // children: [
518
+ // { label: "Product 1", href: "/products/1", icon: "..." }
519
+ // ]
520
+ // }
521
+ // ]
522
+ ```
523
+
524
+ ### Features List
525
+
526
+ ```js
527
+ const features = mappers.extractors.features(parsed);
528
+ // [
529
+ // {
530
+ // title: "Fast Performance",
531
+ // subtitle: "Lightning quick",
532
+ // description: "Our platform is optimized...",
533
+ // icon: "<svg>...</svg>",
534
+ // image: "/feature.jpg",
535
+ // link: { href: "/learn-more", label: "Learn More" }
536
+ // }
537
+ // ]
538
+ ```
539
+
540
+ ### Testimonials
541
+
542
+ ```js
543
+ // Single testimonial
544
+ const testimonial = mappers.extractors.testimonial(parsed);
545
+
546
+ // Multiple testimonials from items
547
+ const testimonials = mappers.extractors.testimonial(parsed, { useItems: true });
548
+ // [
549
+ // {
550
+ // quote: "This product changed our workflow completely!",
551
+ // author: "Jane Smith",
552
+ // role: "CEO",
553
+ // company: "Acme Inc",
554
+ // image: "/jane.jpg",
555
+ // imageAlt: "Jane Smith"
556
+ // }
557
+ // ]
558
+ ```
559
+
560
+ ### FAQ
561
+
562
+ ```js
563
+ const faqs = mappers.extractors.faq(parsed);
564
+ // [
565
+ // {
566
+ // question: "How does it work?",
567
+ // answer: "Our platform uses advanced algorithms...",
568
+ // links: [...]
569
+ // }
570
+ // ]
571
+ ```
572
+
573
+ ### Pricing Tiers
574
+
575
+ ```js
576
+ const tiers = mappers.extractors.pricing(parsed);
577
+ // [
578
+ // {
579
+ // name: "Pro",
580
+ // price: "$29/month",
581
+ // description: "For growing teams",
582
+ // features: ["Unlimited users", "API access", "Priority support"],
583
+ // cta: { href: "/signup", label: "Start Free Trial" },
584
+ // highlighted: true
585
+ // }
586
+ // ]
587
+ ```
588
+
589
+ ### Team Members
590
+
591
+ ```js
592
+ const team = mappers.extractors.team(parsed);
593
+ // [
594
+ // {
595
+ // name: "Dr. Sarah Chen",
596
+ // role: "Lead Researcher",
597
+ // department: "Neuroscience",
598
+ // bio: "Dr. Chen specializes in...",
599
+ // image: "/sarah.jpg",
600
+ // imageAlt: "Dr. Sarah Chen",
601
+ // links: [{ href: "https://twitter.com/...", label: "Twitter" }]
602
+ // }
603
+ // ]
604
+ ```
605
+
606
+ ### Gallery
607
+
608
+ ```js
609
+ // All images
610
+ const allImages = mappers.extractors.gallery(parsed);
611
+
612
+ // Only from main content
613
+ const mainImages = mappers.extractors.gallery(parsed, { source: "main" });
614
+
615
+ // Only from items
616
+ const itemImages = mappers.extractors.gallery(parsed, { source: "items" });
617
+ // [
618
+ // { url: "/image1.jpg", alt: "Image 1", caption: "Caption 1" },
619
+ // { url: "/image2.jpg", alt: "Image 2", caption: "Caption 2" }
620
+ // ]
621
+ ```
622
+
623
+ ## Combining Utilities
624
+
625
+ You can combine helpers, accessors, and extractors for complex transformations:
626
+
627
+ ```js
628
+ const { helpers, accessor, extractors } = mappers;
629
+
630
+ // Start with a pre-built extractor
631
+ const baseData = extractors.hero(parsed);
632
+
633
+ // Enhance with custom fields
634
+ const enhancedData = {
635
+ ...baseData,
636
+ // Add custom field using accessor
637
+ customField: accessor.getByPath(parsed, "groups.main.metadata.custom"),
638
+
639
+ // Transform array using helper
640
+ relatedPosts: helpers.transformArray(
641
+ accessor.getByPath(parsed, "groups.items", { defaultValue: [] }),
642
+ item => ({
643
+ title: item.header.title,
644
+ link: helpers.first(item.body.links)
645
+ })
646
+ ),
647
+
648
+ // Safe extraction with fallback
649
+ safeData: helpers.safe(() => {
650
+ return parsed.groups.main.complexPath.deepValue.toUpperCase();
651
+ }, "DEFAULT")
652
+ };
653
+ ```
654
+
655
+ ## Engine Integration Example
656
+
657
+ In your component engine, you might use mappers like this:
658
+
659
+ ```js
660
+ // Component provides a schema
661
+ const componentSchema = {
662
+ content: {
663
+ type: "hero", // Use pre-built extractor
664
+ // OR
665
+ mapping: { // Use custom mapping
666
+ brand: "groups.main.header.pretitle",
667
+ title: "groups.main.header.title",
668
+ subtitle: "groups.main.header.subtitle",
669
+ image: { path: "groups.main.body.imgs[0].url", defaultValue: "/default.jpg" },
670
+ actions: {
671
+ path: "groups.main.body.links",
672
+ transform: links => links.map(l => ({ label: l.label, type: "primary" }))
673
+ }
674
+ }
675
+ }
676
+ };
677
+
678
+ // Engine maps content before passing to component
679
+ function prepareComponentData(doc, schema) {
680
+ const parsed = parseContent(doc);
681
+
682
+ if (schema.content.type) {
683
+ // Use named extractor
684
+ return mappers.extractors[schema.content.type](parsed);
685
+ } else if (schema.content.mapping) {
686
+ // Use custom schema
687
+ return mappers.accessor.extractBySchema(parsed, schema.content.mapping);
688
+ }
689
+
690
+ // Fallback to standard parsed structure
691
+ return parsed;
692
+ }
693
+ ```
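Looking up an extractor by a name that comes from a schema, as in `mappers.extractors[schema.content.type]`, fails with an unhelpful `TypeError` if the name is unknown, and a name like `toString` would resolve to an inherited function. A defensive lookup could be sketched as follows; `pickExtractor` and the stub `extractors` object are hypothetical and stand in for the real `mappers.extractors`.

```javascript
// Hypothetical guard around a name-based extractor lookup.
function pickExtractor(extractors, type) {
  // Only accept the object's own properties, not inherited ones.
  const own = Object.prototype.hasOwnProperty.call(extractors, type);
  const extractor = own ? extractors[type] : undefined;
  if (typeof extractor !== "function") {
    throw new Error(`Unknown extractor type: ${type}`);
  }
  return extractor;
}

// Stub registry used only for this illustration.
const extractors = {
  hero: (parsed) => ({ title: parsed.title }),
};

console.log(pickExtractor(extractors, "hero")({ title: "Welcome" }));

try {
  pickExtractor(extractors, "carousel");
} catch (error) {
  console.log(error.message);
  // "Unknown extractor type: carousel"
}
```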
694
+
695
+ ## Rendering Extracted Content
696
+
697
+ After extracting content, you need to render it in your components. The parser works with content that may contain paragraph arrays, rich HTML, and formatting marks.
698
+
699
+ ### Text Component Pattern
700
+
701
+ A **Text component** is recommended for rendering extracted content. See the [Text Component Reference](./text-component-reference.md) for a complete implementation guide.
702
+
703
+ #### Why Use a Text Component?
704
+
705
+ The parser's extractors return content in flexible formats:
706
+ - **Arrays of paragraphs** - `["Para 1", "Para 2"]`
707
+ - **Rich HTML** - `"Welcome to <strong>our platform</strong>"`
708
+ - **Color marks** - `"Title with <mark class='brand'>highlight</mark>"`
709
+
710
+ A Text component handles all these cases automatically.
711
+
712
+ #### Quick Example
713
+
714
+ ```jsx
715
+ import { parseContent, mappers } from '@uniwebcms/semantic-parser';
716
+ import { H1, H2, P } from './components/Text'; // See docs/text-component-reference.md
717
+
718
+ const parsed = parseContent(doc);
719
+ const hero = mappers.extractors.hero(parsed);
720
+
721
+ // Simple rendering
722
+ <>
723
+ <H1 text={hero.title} />
724
+ {hero.subtitle && <H2 text={hero.subtitle} />}
725
+ <P text={hero.description} />
726
+ </>
727
+ ```
728
+
729
+ #### Handling Paragraph Arrays
730
+
731
+ Extractors return paragraph arrays to preserve structure:
732
+
733
+ ```jsx
734
+ // hero.description is an array: ["First para", "Second para"]
735
+ <P text={hero.description} />
736
+ // Renders: <p>First para</p><p>Second para</p>
737
+
738
+ // If you need a single string, use joinParagraphs
739
+ import { joinParagraphs } from '@uniwebcms/semantic-parser/mappers/helpers';
740
+
741
+ <P text={joinParagraphs(hero.description, '\n\n')} />
742
+ // Renders: <p>First para\n\nSecond para</p>
743
+ ```
744
+
745
+ #### Multi-line Headings
746
+
747
+ ```jsx
748
+ // heading.title might be an array for multi-line titles
749
+ <H1 text={heading.title} />
750
+
751
+ // Example: ["Welcome to", "Our Platform"]
752
+ // Renders: <h1><div>Welcome to</div><div>Our Platform</div></h1>
753
+ ```
754
+
755
+ #### Complete Integration Example
756
+
757
+ ```jsx
758
+ import { parseContent, mappers } from '@uniwebcms/semantic-parser';
759
+ import { H1, H2, H3, P } from './components/Text';
760
+
761
+ function HeroSection({ document }) {
762
+ // Parse and extract
763
+ const parsed = parseContent(document);
764
+ const hero = mappers.extractors.hero(parsed);
765
+
766
+ return (
767
+ <section className="hero">
768
+ {hero.kicker && <div className="kicker">{hero.kicker}</div>}
769
+ <H1 text={hero.title} className="hero-title" />
770
+ {hero.subtitle && <H2 text={hero.subtitle} className="hero-subtitle" />}
771
+ <P text={hero.description} className="hero-description" />
772
+ {hero.image && <img src={hero.image} alt={hero.imageAlt} />}
773
+ {hero.cta && (
774
+ <a href={hero.cta.href} className="cta-button">
775
+ {hero.cta.label}
776
+ </a>
777
+ )}
778
+ </section>
779
+ );
780
+ }
781
+ ```
782
+
783
+ #### Rendering Lists
784
+
785
+ ```jsx
786
+ function FeaturesList({ document }) {
787
+ const parsed = parseContent(document);
788
+ const features = mappers.extractors.features(parsed);
789
+
790
+ return (
791
+ <div className="features-grid">
792
+ {features.map((feature, i) => (
793
+ <div key={i} className="feature-card">
794
+ {feature.icon && <img src={feature.icon} alt="" />}
795
+ <H3 text={feature.title} />
796
+ {feature.subtitle && <P text={feature.subtitle} className="subtitle" />}
797
+ <P text={feature.description} />
798
+ </div>
799
+ ))}
800
+ </div>
801
+ );
802
+ }
803
+ ```
804
+
805
+ ### Sanitization Strategy
806
+
807
+ **Important:** Sanitize at the engine level, not in components.
808
+
809
+ ```jsx
810
+ // ✅ Good - sanitize during data preparation
811
+ import { mappers } from '@uniwebcms/semantic-parser';
+ import { sanitizeHtml } from '@uniwebcms/semantic-parser/mappers/types';
812
+
813
+ function prepareHeroData(parsed) {
814
+ const hero = mappers.extractors.hero(parsed);
815
+
816
+ return {
817
+ ...hero,
818
+ title: sanitizeHtml(hero.title, {
819
+ allowedTags: ['strong', 'em', 'mark', 'span'],
820
+ allowedAttr: ['class', 'data-variant']
821
+ }),
822
+ description: hero.description.map(p => sanitizeHtml(p))
823
+ };
824
+ }
825
+
826
+ const safeHeroData = prepareHeroData(parsed);
827
+ <H1 text={safeHeroData.title} />
828
+ ```
829
+
830
+ ```jsx
831
+ // ❌ Avoid - sanitizing in component on every render
832
+ function Hero({ data }) {
833
+ const safeTitle = sanitizeHtml(data.title); // Runs every render!
834
+ return <H1 text={safeTitle} />;
835
+ }
836
+ ```
837
+
838
+ #### When to Sanitize
839
+
840
+ - **Always**: External content, user-generated content
841
+ - **Optional**: Trusted TipTap editor with locked schema
842
+ - **Never needed**: Hard-coded content in your app
843
+
844
+ See [Text Component Reference - Sanitization](./text-component-reference.md#sanitization-tools) for detailed guidance.
845
+
846
+ ### Helper Functions for Rendering
847
+
848
+ ```javascript
849
+ import {
850
+ joinParagraphs,
851
+ excerptFromParagraphs,
852
+ countWords
853
+ } from '@uniwebcms/semantic-parser/mappers/helpers';
854
+
855
+ // Join paragraphs for single-string display
856
+ const singlePara = joinParagraphs(hero.description, ' ');
857
+
858
+ // Create excerpt for preview
859
+ const excerpt = excerptFromParagraphs(article.content, {
860
+ maxLength: 150
861
+ });
862
+
863
+ // Count words for reading time estimate
864
+ const wordCount = countWords(article.content);
865
+ const readingTime = Math.ceil(wordCount / 200); // ~200 words/min
866
+ ```
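The reading-time estimate above depends on how words are counted across a paragraph array. The sketch below shows one way to do it without the library: `countWordsSketch` and `readingTimeMinutes` are hypothetical names, and the assumption is that markup should not count as words.

```javascript
// Hypothetical word counter that accepts a string or an array of paragraphs.
function countWordsSketch(content) {
  const paragraphs = Array.isArray(content) ? content : [content];
  return paragraphs
    .map((p) => String(p ?? "").replace(/<[^>]*>/g, " ")) // replace tags with spaces
    .join(" ")
    .split(/\s+/)
    .filter(Boolean).length; // ignore empty tokens from extra whitespace
}

// Round up and never report less than one minute.
function readingTimeMinutes(content, wordsPerMinute = 200) {
  return Math.max(1, Math.ceil(countWordsSketch(content) / wordsPerMinute));
}

console.log(countWordsSketch(["Hello <em>brave</em> world", "Second para"]));
// 5
console.log(readingTimeMinutes(["Hello <em>brave</em> world", "Second para"]));
// 1
```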
867
+
868
+ ### Color Marks in Headings
869
+
870
+ The parser supports color marks for visual emphasis:
871
+
872
+ ```jsx
873
+ // Content with color mark
874
+ const title = "Welcome to <mark class='brand'>Our Platform</mark>";
875
+
876
+ <H1 text={title} />
877
+ ```
878
+
879
+ **CSS for Color Marks:**
880
+
881
+ ```css
882
+ mark.brand {
883
+ background: linear-gradient(
884
+ 120deg,
885
+ var(--brand-color) 0%,
886
+ var(--brand-color) 100%
887
+ );
888
+ background-repeat: no-repeat;
889
+ background-size: 100% 40%;
890
+ background-position: 0 85%;
891
+ color: inherit;
892
+ padding: 0;
893
+ }
894
+ ```
895
+
896
+ **Ensure sanitization allows marks:**
897
+
898
+ ```javascript
899
+ sanitizeHtml(content, {
900
+ allowedTags: ['strong', 'em', 'mark', 'span'],
901
+ allowedAttr: ['class', 'data-variant']
902
+ });
903
+ ```
904
+
905
+ ## Best Practices
906
+
907
+ 1. **Start with extractors**: Use pre-built patterns when they match your needs
908
+ 2. **Customize gradually**: Override specific fields from extractors if needed
909
+ 3. **Use schemas for clarity**: Schema-based extraction is self-documenting
910
+ 4. **Provide defaults**: Always specify default values for optional fields
911
+ 5. **Safe extraction**: Use `helpers.safe()` when accessing uncertain paths
912
+ 6. **Validate**: Use `validateRequired()` for critical fields
913
+ 7. **Type safety**: Consider adding TypeScript definitions for your schemas
914
+ 8. **Sanitize at engine level**: Sanitize once during data preparation, not in components
915
+ 9. **Preserve arrays**: Keep paragraph arrays when possible for better rendering control
916
+ 10. **Use Text component**: Adopt the reference Text component for consistent rendering
917
+
918
+ ## Contributing Patterns
919
+
920
+ If you develop a common pattern that could benefit others, consider contributing it as a new extractor. Common patterns include:
921
+
922
+ - Product cards
923
+ - Event listings
924
+ - Timeline entries
925
+ - Contact forms
926
+ - Newsletter signups
927
+ - Social proof sections
928
+ - Comparison tables