npm - devlyn-cli - Versions diffs - 0.5.2 → 0.5.3 - Mend

devlyn-cli 0.5.2 → 0.5.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (34)

package/bin/devlyn.js +1 -0
package/optional-skills/dokkit/ANALYSIS.md +198 -0
package/optional-skills/dokkit/COMMANDS.md +365 -0
package/optional-skills/dokkit/DOCX-XML.md +76 -0
package/optional-skills/dokkit/EXPORT.md +102 -0
package/optional-skills/dokkit/FILLING.md +377 -0
package/optional-skills/dokkit/HWPX-XML.md +73 -0
package/optional-skills/dokkit/IMAGE-SOURCING.md +127 -0
package/optional-skills/dokkit/INGESTION.md +65 -0
package/optional-skills/dokkit/SKILL.md +153 -0
package/optional-skills/dokkit/STATE.md +60 -0
package/optional-skills/dokkit/references/docx-field-patterns.md +151 -0
package/optional-skills/dokkit/references/docx-structure.md +58 -0
package/optional-skills/dokkit/references/field-detection-patterns.md +130 -0
package/optional-skills/dokkit/references/hwpx-field-patterns.md +461 -0
package/optional-skills/dokkit/references/hwpx-structure.md +159 -0
package/optional-skills/dokkit/references/image-opportunity-heuristics.md +121 -0
package/optional-skills/dokkit/references/image-xml-patterns.md +338 -0
package/optional-skills/dokkit/references/section-image-interleaving.md +346 -0
package/optional-skills/dokkit/references/section-range-detection.md +118 -0
package/optional-skills/dokkit/references/state-schema.md +143 -0
package/optional-skills/dokkit/references/supported-formats.md +67 -0
package/optional-skills/dokkit/scripts/compile_hwpx.py +134 -0
package/optional-skills/dokkit/scripts/detect_fields.py +301 -0
package/optional-skills/dokkit/scripts/detect_fields_hwpx.py +286 -0
package/optional-skills/dokkit/scripts/export_pdf.py +99 -0
package/optional-skills/dokkit/scripts/parse_hwpx.py +185 -0
package/optional-skills/dokkit/scripts/parse_image_with_gemini.py +159 -0
package/optional-skills/dokkit/scripts/parse_xlsx.py +98 -0
package/optional-skills/dokkit/scripts/source_images.py +365 -0
package/optional-skills/dokkit/scripts/validate_docx.py +142 -0
package/optional-skills/dokkit/scripts/validate_hwpx.py +281 -0
package/optional-skills/dokkit/scripts/validate_state.py +132 -0
package/package.json +1 -1

package/bin/devlyn.js CHANGED

@@ -70,6 +70,7 @@ const OPTIONAL_ADDONS = [
   { name: 'prompt-engineering', desc: 'Claude 4 prompt optimization using Anthropic best practices', type: 'local' },
   { name: 'better-auth-setup', desc: 'Production-ready Better Auth + Hono + Drizzle + PostgreSQL auth setup', type: 'local' },
   { name: 'pyx-scan', desc: 'Check whether an AI agent skill is safe before installing', type: 'local' },
+  { name: 'dokkit', desc: 'Document template filling for DOCX/HWPX — ingest, fill, review, export', type: 'local' },
   // Local optional commands (copied to .claude/commands/)
   { name: 'devlyn.pencil-sync', desc: 'Sync designs between codebase and Pencil (.pen files) via MCP', type: 'command' },
   // External skill packs (installed via npx skills add)

package/optional-skills/dokkit/ANALYSIS.md ADDED

@@ -0,0 +1,198 @@
+# Analysis Knowledge
+Template analysis patterns and field detection strategies for the dokkit-analyzer agent. Covers field identification, confidence scoring, and the analysis output schema.
+## Table of Contents
+- [Field Detection Strategy](#field-detection-strategy)
+- [Section Detection](#section-detection)
+- [Cross-Language Mapping](#cross-language-mapping)
+- [Confidence Scoring](#confidence-scoring)
+- [Analysis Output Format](#analysis-output-format)
+---
+## Field Detection Strategy
+Detect ALL fillable locations in a template. Fields appear in these patterns:
+### 1. Placeholder Text
+- `{{field_name}}` or `<<field_name>>` — explicit placeholders
+- `[field_name]` or `(field_name)` — bracket patterns
+- `___` (underscores) — blank line indicators
+- `...` (dots) — fill-in indicators
+### 2. Empty Table Cells
+In form-like documents (especially Korean templates):
+- A label cell (e.g., "Name") with an adjacent empty cell = fill target
+- Pattern: `[Label Cell] [Empty Cell]`
+### 3. Instruction Text
+Text telling the user what to enter:
+- "(enter name here)", "(type your answer)"
+- Korean: "(날짜를 입력하세요)", "(내용을 기재)"
+- These should be REPLACED with the actual value
+### 4. Form Controls (DOCX only)
+- Content controls (`w:sdt`) with explicit placeholder values
+- Legacy form fields (`w:fldChar`)
+### 5. Underline Runs
+Runs styled with underline containing only spaces or underscores:
+- Indicates a blank line for handwriting
+- In digital filling, replace with the value
+### 6. Image Fields
+Fields requiring an image rather than text:
+- `{{사진}}`, `{{photo}}`, `<<signature>>` — image placeholder text
+- Existing `<w:drawing>` (DOCX) or `<hp:pic>` (HWPX) in table cells
+- Empty cells adjacent to cells with image keywords
+**Image keywords** (Korean): 사진, 증명사진, 여권사진, 로고, 서명, 날인, 도장, 직인
+**Image keywords** (English): Photo, Picture, Logo, Signature, Stamp, Seal, Image, Portrait
+**Classification** (`image_type`): `photo`, `logo`, `signature`, or `figure`
+### 7. Writing Tip Boxes (작성 팁)
+Standalone 1x1 tables with DASH borders containing guidance text:
+- HWPX: `rowCnt="1"`, `colCnt="1"` with `※` text
+- DOCX: Single `<w:tr>/<w:tc>` with dashed borders
+- Often styled in red (#FF0000)
+Detect as `field_type: "tip_box"` with `action: "delete"`.
+**Container types**:
+- `"standalone"` — top-level 1x1 table between other content
+- `"nested"` — inside `<hp:subList>` within a fill-target cell; include `parent_field_id`
+**`has_formatting` flag**: For mapped fields where `mapped_value` is >100 chars and contains markdown syntax (`**bold**`, `## heading`, `- bullet`, `1. numbered`), set `has_formatting: true`.
+## Section Detection
+Group fields into logical sections:
+1. Use document headings (H1, H2) as section boundaries
+2. In table-based forms, use spanning header rows
+3. In Korean templates, look for: "인적사항", "학력", "경력", "자격증"
+4. If no clear sections, use "General" as default
+## Cross-Language Mapping
+Common Korean-English field equivalents:
+| Korean | English |
+|--------|---------|
+| 성명 / 이름 | Name / Full Name |
+| 생년월일 | Date of Birth |
+| 주소 | Address |
+| 전화번호 / 연락처 | Phone / Contact |
+| 이메일 | Email |
+| 학력 | Education |
+| 경력 | Work Experience |
+| 자격증 | Certifications |
+| 직위 / 직책 | Position / Title |
+| 회사명 | Company Name |
+| 기간 | Period / Duration |
+## Confidence Scoring
+### High Confidence
+- Exact label match between source and template field
+- Unambiguous data (one clear value in sources)
+- Same language label match
+### Medium Confidence
+- Semantic match (different wording, same meaning)
+- Cross-language match (Korean-English)
+- Multiple candidate values in sources
+- Partial data match
+### Low Confidence
+- Indirect inference (value derived from context)
+- Ambiguous mapping (could match multiple fields)
+- Best guess from limited data
+## Analysis Output Format
+Write to `.dokkit/analysis.json`:
+```json
+{
+  "template": {
+    "file_path": "...",
+    "file_type": "docx|hwpx",
+    "display_name": "..."
+  },
+  "sections": [
+    {
+      "name": "Section Name",
+      "fields": [
+        {
+          "id": "field_001",
+          "label": "Field Label",
+          "field_type": "placeholder_text|empty_cell|underline|form_control|instruction_text|image|tip_box|section_content|table_content",
+          "xml_path": {
+            "file": "word/document.xml",
+            "element_path": "body/tbl[0]/tr[1]/tc[2]/p[0]/r[0]",
+            "namespaced_path": "w:body/w:tbl[0]/w:tr[1]/w:tc[2]/w:p[0]/w:r[0]"
+          },
+          "pattern": "{{name}}",
+          "current_content": "{{name}}",
+          "mapped_value": "John Doe",
+          "source": "resume.pdf",
+          "source_location": "key_value_pairs.Name",
+          "confidence": "high",
+          "has_formatting": false
+        },
+        {
+          "id": "field_015",
+          "label": "tip box label",
+          "field_type": "tip_box",
+          "action": "delete",
+          "container": "standalone",
+          "xml_path": { "file": "...", "element_path": "...", "namespaced_path": "..." },
+          "pattern": "(tip box: 1x1 table)",
+          "current_content": "※ 작성 팁: ...",
+          "mapped_value": null,
+          "confidence": "high"
+        },
+        {
+          "id": "field_020",
+          "label": "사진",
+          "field_type": "image",
+          "image_type": "photo",
+          "xml_path": { "file": "...", "element_path": "...", "namespaced_path": "..." },
+          "pattern": "(empty cell, image label)",
+          "current_content": "",
+          "image_source": "ingested",
+          "image_file": ".dokkit/sources/photo.jpg",
+          "dimensions": { "width_emu": 1260000, "height_emu": 1620000 },
+          "confidence": "high"
+        }
+      ]
+    }
+  ],
+  "summary": {
+    "total_fields": 22,
+    "mapped": 18,
+    "unmapped": 4,
+    "high_confidence": 15,
+    "medium_confidence": 2,
+    "low_confidence": 1,
+    "image_fields": 2,
+    "image_fields_sourced": 1,
+    "image_fields_pending": 1,
+    "tip_boxes": 3
+  }
+}
+```
+### Critical Rules for Analysis Output
+- For `table_content` fields that are pre-filled from source: set `mapped_value: null` with `action: "preserve"`. NEVER set `mapped_value` to a placeholder string — the filler treats any non-null `mapped_value` as literal data and will destroy the table.
+- For `image` fields: search `.dokkit/sources/` for matching images first. Set `image_source: "ingested"` if found, or leave `image_file: null` (pending).
+- For `section_content` fields: scan for visual enhancement opportunities (max 3 per field, max 12 total). Record with `generation_prompt`, `dimensions`, `status: "pending"`.
+## References
+See `references/field-detection-patterns.md` for advanced detection heuristics (9 DOCX + 6 HWPX).
+See `references/image-opportunity-heuristics.md` for AI image opportunity detection in section content.
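
Note on `ANALYSIS.md` (not part of the diff): the "Placeholder Text" patterns listed under Field Detection Strategy can be sketched with a few regular expressions. This is a minimal illustration only, not the logic of the package's `scripts/detect_fields.py`. The names `PLACEHOLDER_PATTERNS` and `find_placeholders` are hypothetical, and the single-bracket forms (`[field_name]`, `(field_name)`) are left out on the assumption that they would need extra context to avoid false positives in ordinary prose.

```python
import re

# Hypothetical pattern table: the kinds and regexes are illustrative only.
PLACEHOLDER_PATTERNS = [
    ("double_brace", re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")),  # {{field_name}}
    ("double_angle", re.compile(r"<<\s*([^<>]+?)\s*>>")),      # <<field_name>>
    ("underscores", re.compile(r"_{3,}")),                     # ___ blank line
    ("dots", re.compile(r"\.{3,}")),                           # ... fill-in
]


def find_placeholders(text):
    """Return (kind, name, start) tuples for each placeholder, in text order."""
    hits = []
    for kind, pattern in PLACEHOLDER_PATTERNS:
        for match in pattern.finditer(text):
            # Only the explicit placeholder forms capture a field name.
            name = match.group(1) if pattern.groups else None
            hits.append((kind, name, match.start()))
    return sorted(hits, key=lambda hit: hit[2])


sample = "Name: {{name}}  Photo: <<signature>>  Date: ____"
for hit in find_placeholders(sample):
    print(hit)
```

Because the character classes exclude only the delimiters, Korean names such as `{{사진}}` are matched the same way as English ones.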

package/optional-skills/dokkit/COMMANDS.md ADDED

@@ -0,0 +1,365 @@
+# Dokkit Command Reference
+Complete workflows for all 9 subcommands. Loaded automatically into context when `/dokkit` is invoked.
+## Table of Contents
+- [init](#init) — Initialize workspace
+- [sources](#sources) — Source dashboard
+- [preview](#preview) — PDF preview
+- [ingest](#ingest) — Ingest source documents
+- [fill](#fill) — End-to-end fill pipeline
+- [fill-doc](#fill-doc) — Analyze and fill template
+- [modify](#modify) — Targeted changes
+- [review](#review) — Confidence review
+- [export](#export) — Export to format
+---
+## init
+Initialize or reset the `.dokkit/` workspace for a new document filling session.
+### Arguments
+- `--force` or `-f`: Skip confirmation and reset without asking
+- `--keep-sources`: Reset template/output but preserve ingested sources
+### Procedure
+1. Check if `.dokkit/` already exists
+2. If it exists and `--force` is not passed, ask the user to confirm reset
+3. If `--keep-sources` is used, preserve `.dokkit/sources/` and source entries in state.json
+4. Create the workspace structure:
+   ```
+   .dokkit/
+   ├── sources/
+   ├── template_work/
+   ├── output/
+   ├── images/
+   └── state.json
+   ```
+5. Initialize `state.json`:
+   ```json
+   {
+     "version": "1.0",
+     "created": "<ISO timestamp>",
+     "sources": [],
+     "template": null,
+     "analysis": null,
+     "filled_document": null,
+     "exports": []
+   }
+   ```
+6. Validate the state file
+7. Report success with next step guidance
+### Output
+```
+Dokkit workspace initialized at .dokkit/
+  sources/       — ready for /dokkit ingest
+  template_work/ — ready for /dokkit fill
+  output/        — ready for /dokkit export
+  state.json     — initialized
+Next: Use /dokkit ingest <file> to add source documents.
+```
+### Rules
+- Inline command — do NOT fork to any agent
+- If resetting, warn about data loss unless --force is used
+---
+## sources
+Display all ingested source documents with their status, type, and summary.
+### Procedure
+1. Read `.dokkit/state.json`
+2. If `.dokkit/` does not exist, show error: "No workspace found. Run `/dokkit init` first."
+3. If no sources exist, show empty state with supported formats list
+4. For each source, display: name, type, status, summary
+5. Show total count and any errors
+### Output
+```
+Ingested Sources (3 total)
+ #  Name                Type   Status   Summary
+ 1  resume.pdf          PDF    ready    Personal resume with education and work history
+ 2  transcript.xlsx     XLSX   ready    Academic transcript with grades and courses
+ 3  scan.png            PNG    error    OCR failed — image too blurry
+Use /dokkit ingest <file> to add more sources.
+```
+### Rules
+- Inline command — do NOT fork to any agent
+- Read-only: only reads state.json, never modifies anything
+---
+## preview
+Generate a visual preview of the current filled document as PDF.
+### Procedure
+1. Read `.dokkit/state.json` to check document status
+2. If no filled document exists, show error: "No filled document. Run `/dokkit fill <template>` first."
+3. Compile the current `template_work/` into a temporary file
+4. Convert to PDF using LibreOffice: `soffice --headless --convert-to pdf --outdir .dokkit/output/ <file>`
+5. Report the preview file path
+### Output
+```
+Preview generated: .dokkit/output/preview_<name>.pdf
+Open this file to see how the filled document looks.
+```
+### Rules
+- Inline command — do NOT fork to any agent
+- If LibreOffice is not available, show error with install guidance
+- Preview is temporary — `/dokkit export` creates the final output
+---
+## ingest
+Parse one or more source documents and add them to the workspace for template filling.
+### Arguments
+One or more file paths (space-separated or comma-separated).
+<example>
+`/dokkit ingest docs/resume.pdf`
+`/dokkit ingest docs/resume.pdf docs/financials.xlsx docs/photo.jpg`
+</example>
+### Procedure
+1. Parse remaining arguments to extract file paths
+2. Validate each file path exists. Show error for missing files, continue with valid ones.
+3. **Auto-initialize workspace**: If `.dokkit/` does not exist, create it with initial state.json. Report: "Workspace initialized at .dokkit/"
+4. **Ingest each file** sequentially by spawning the **dokkit-ingestor** agent:
+   - Pass the file path as context
+   - The agent parses the file, writes to `.dokkit/sources/`, updates `state.json`
+   - Report progress: "Ingested 1/3: resume.pdf (ready)"
+5. **Show sources dashboard** after all files complete
+### Delegation
+For each file, spawn the dokkit-ingestor agent:
+> "Ingest the source document at `<file_path>`. Follow the dokkit-ingestor agent instructions. The workspace is at `.dokkit/`."
+### Rules
+- Auto-initialize workspace if `.dokkit/` does not exist — do NOT tell user to run `/dokkit init`
+- Supported formats: PDF, DOCX, XLSX, CSV, PPTX, HWPX, PNG, JPG, TXT, MD, JSON, HTML
+- If a format is unsupported, show error with supported formats list and skip that file
+- If no valid files are provided, show error with usage example
+- Always show sources dashboard after ingestion completes
+---
+## fill
+Fully automated document filling pipeline: analyze, fill, review, auto-fix, and export in one step.
+### Arguments
+File path to the template document (DOCX or HWPX).
+<example>
+`/dokkit fill docs/template.hwpx`
+`/dokkit fill form.docx`
+</example>
+### Procedure
+**Phase 1 — Validate**:
+1. Validate the template exists and is DOCX or HWPX
+2. Check `.dokkit/` workspace exists — if not, show error: "No workspace found. Run `/dokkit ingest <files>` first."
+3. Check at least one source has status "ready" — if not, show error: "No sources ingested."
+4. Report: "Starting fill pipeline with N sources -> template_name"
+**Phase 2 — Analyze**:
+5. Spawn the **dokkit-analyzer** agent to detect fields, map to sources, write `analysis.json`
+6. Report: "Found N fields (X mapped, Y unmapped, Z images)"
+**Phase 3 — Source Images**:
+7. **Cell-level images**: For each `field_type: "image"` with `image_file: null` and `image_type: "figure"`:
+   - Run: `python scripts/source_images.py generate --prompt "<prompt>" --preset technical_illustration --output-dir .dokkit/images/ --project-dir . --lang ko`
+   - Parse `__RESULT__` JSON, update `analysis.json`
+   - Skip photo/signature types (require user-provided files)
+   - Default `--lang ko` (Korean only). Override with user instruction if needed.
+8. **Section content images**: For each `image_opportunities` entry with `status: "pending"`:
+   - Run: `python scripts/source_images.py generate --prompt "<generation_prompt>" --preset <preset> --output-dir .dokkit/images/ --project-dir . --lang ko`
+   - On failure: set `status: "skipped"`, log reason
+   - Use `--lang ko+en` if the content contains technical terms that benefit from English (e.g., architecture diagrams with API names).
+9. Report: "Sourced X/Y images"
+**Phase 4 — Fill**:
+10. Spawn the **dokkit-filler** agent in fill mode
+**Phase 5 — Review and Auto-Fix Loop**:
+11. Evaluate fill result: count fields by confidence, identify fixable issues
+12. **Auto-fix**: For fixable issues, spawn **dokkit-filler** in modify mode
+    - Re-map low-confidence fields where better data exists
+    - Fix formatting issues (date formats, truncated text)
+    - Do NOT auto-fix: unfilled fields, image fields without sources
+13. If auto-fix made changes, re-evaluate. Maximum 2 iterations.
+14. Present **final review** table (section-by-section with confidence)
+**Phase 6 — Export**:
+15. Export in same format as input template via **dokkit-exporter** agent
+16. Report output path and file size
+**Phase 7 — Next Steps**:
+17. Offer: `/dokkit modify "..."`, `/dokkit export pdf`, `/dokkit review`
+### Delegation
+**Agent 1 — Analyzer** (dokkit-analyzer):
+> "Analyze the template at `<path>`. Detect all fillable fields INCLUDING image fields. Map to sources. Write `analysis.json`."
+**Agent 2 — Filler** (dokkit-filler, fill mode):
+> "Fill the template using `analysis.json`. Mode: fill. Insert images where `image_file` is populated. Interleave section content images at anchor points."
+**Agent 2b — Filler** (dokkit-filler, modify mode — auto-fix, if needed):
+> "Modify the filled document. Mode: modify. Fix: `<list of issues>`."
+**Agent 3 — Exporter** (dokkit-exporter):
+> "Export the filled document. Format: `<format>`. Compile from `.dokkit/template_work/` and save to `.dokkit/output/`."
+### Rules
+- At least one source must be ingested before filling
+- Auto-fix loop runs maximum 2 iterations
+- Auto-fix does NOT fill fields with missing source data
+- Always show the full review table before exporting
+- If any phase fails, show the error and stop — do NOT proceed
+---
+## fill-doc
+Analyze a template and fill its fields using ingested source data. Does NOT auto-fix or export.
+### Arguments
+File path to the template document (DOCX or HWPX).
+<example>
+`/dokkit fill-doc docs/template.docx`
+</example>
+### Procedure
+1. Validate the template exists and is DOCX or HWPX
+2. Check `.dokkit/` workspace exists with at least one ready source
+3. **Analyze**: Spawn the **dokkit-analyzer** agent
+4. **Source Images**: Same as `/dokkit fill` Phase 3 (cell-level + section content)
+5. **Fill**: Spawn the **dokkit-filler** agent in fill mode
+6. Present review summary
+### Delegation
+**First**: Spawn the dokkit-analyzer agent:
+> "Analyze the template at `<path>`. Detect all fillable fields INCLUDING image fields. Map to sources. Write `analysis.json`."
+**Image sourcing** (inline, between agents):
+- **Pass A — Cell-level**: For `field_type: "image"` with `image_file: null` and `image_type: "figure"`, run `python scripts/source_images.py generate --prompt "..." --preset ... --output-dir .dokkit/images/ --project-dir . --lang ko`
+- **Pass B — Section content**: For `image_opportunities` with `status: "pending"`, run `python scripts/source_images.py generate --prompt "..." --preset ... --output-dir .dokkit/images/ --project-dir . --lang ko`
+- Default language is `ko` (Korean only). Use `--lang ko+en` for mixed content, or `--lang en` for English-only.
+**Then**: Spawn the dokkit-filler agent in fill mode:
+> "Fill the template using `analysis.json`. Mode: fill. Insert images where populated. Interleave section content images at anchor points."
+### Rules
+- Template must be DOCX or HWPX
+- Analyzer runs FIRST, then filler
+- Original template is never modified
+---
+## modify
+Apply targeted changes to the filled document based on natural language instructions.
+### Arguments
+A natural language instruction describing the change.
+<example>
+`/dokkit modify "Change the phone number to 010-1234-5678"`
+`/dokkit modify "Re-do the education section using the transcript"`
+`/dokkit modify "Use YYYY-MM-DD format for all dates"`
+</example>
+### Procedure
+1. Check `.dokkit/state.json` for an active filled document. If none, show error: "No filled document. Run `/dokkit fill <template>` first."
+2. Spawn the **dokkit-filler** agent in modify mode
+### Delegation
+> "Modify the filled document. Mode: modify. User instruction: `<instruction>`. Read `analysis.json` for field locations and make surgical changes."
+### Rules
+- A filled document must exist
+- Only modify targeted fields — do not re-process the entire document
+- Manual overrides get confidence "high"
+---
+## review
+Present the filled document for review with section-by-section confidence annotations.
+### Arguments
+Optional: section name or action.
+<example>
+`/dokkit review` — review all sections
+`/dokkit review "Personal Information"` — review specific section
+`/dokkit review approve` — mark document as finalized
+</example>
+### Procedure
+1. Check `.dokkit/state.json` for an active filled document. If none, show error.
+2. Spawn the **dokkit-filler** agent in review mode
+### Delegation
+> "Review the filled document. Mode: review. Read `analysis.json` and present section-by-section review with confidence annotations."
+If section or action specified:
+> "Focus on section: `<section>` / Action: `<action>`"
+### Rules
+- A filled document must exist
+- Review is read-only: it shows status and changes no document content
+- The one exception is the "approve" action, which sets document status to "finalized"
+---
+## export
+Compile and export the filled document in the specified format.
+### Arguments
+Output format: `docx`, `hwpx`, or `pdf`.
+<example>
+`/dokkit export docx`
+`/dokkit export pdf`
+</example>
+### Procedure
+1. Check `.dokkit/state.json` for a filled document. If none, show error.
+2. Validate the requested format is supported
+3. Spawn the **dokkit-exporter** agent
+### Delegation
+> "Export the filled document. Format: `<format>`. Compile from `.dokkit/template_work/` and save to `.dokkit/output/`."
+### Rules
+- Supported formats: docx, hwpx, pdf
+- Cross-format exports show a warning about potential formatting differences
+- Same-format exports preserve 100% formatting fidelity
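
Note on `COMMANDS.md` (not part of the diff): the `init` procedure (steps 1 to 5) can be sketched in Python as below. This is a sketch under stated assumptions, not the package's implementation. The function name `init_workspace` and its parameters are hypothetical, the interactive confirmation is replaced by an exception, and state validation (step 6) is omitted.

```python
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path

# Directory names taken from the workspace layout in the init section.
SUBDIRS = ("sources", "template_work", "output", "images")


def init_workspace(root=".", force=False, keep_sources=False):
    """Create or reset <root>/.dokkit and return the path to state.json."""
    ws = Path(root) / ".dokkit"
    state_path = ws / "state.json"
    kept_sources = []
    if ws.exists():
        if not force:
            # The documented command asks the user; this sketch just refuses.
            raise FileExistsError(f"{ws} exists; pass force=True to reset")
        if keep_sources and state_path.exists():
            old_state = json.loads(state_path.read_text(encoding="utf-8"))
            kept_sources = old_state.get("sources", [])
        for child in list(ws.iterdir()):
            if keep_sources and child.name == "sources":
                continue  # preserve ingested source files
            if child.is_dir():
                shutil.rmtree(child)
            else:
                child.unlink()
    for name in SUBDIRS:
        (ws / name).mkdir(parents=True, exist_ok=True)
    state = {
        "version": "1.0",
        "created": datetime.now(timezone.utc).isoformat(),
        "sources": kept_sources,
        "template": None,
        "analysis": None,
        "filled_document": None,
        "exports": [],
    }
    state_path.write_text(
        json.dumps(state, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return state_path
```

Calling `init_workspace(force=True, keep_sources=True)` mirrors `--force --keep-sources`: the `sources/` directory and the `sources` entries in `state.json` survive, and everything else is recreated.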

package/optional-skills/dokkit/DOCX-XML.md ADDED

@@ -0,0 +1,76 @@
+# DOCX XML Knowledge
+Open XML structure for surgical DOCX document editing.
+## DOCX Structure
+A DOCX file is a ZIP archive:
+```
+[Content_Types].xml          — MIME type mappings
+_rels/.rels                  — root relationships
+word/
+  document.xml               — main document body (PRIMARY TARGET)
+  styles.xml                 — style definitions
+  numbering.xml              — list numbering definitions
+  settings.xml               — document settings
+  fontTable.xml              — font declarations
+  theme/theme1.xml           — theme colors/fonts
+  media/                     — embedded images
+  _rels/document.xml.rels    — document relationships
+docProps/
+  app.xml                    — application metadata
+  core.xml                   — document metadata
+```
+## Key XML Elements
+### Namespace
+```xml
+xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
+```
+### Document Body
+```xml
+<w:body>
+  <w:p>           <!-- paragraph -->
+    <w:pPr>       <!-- paragraph properties -->
+    <w:r>         <!-- run (text with formatting) -->
+      <w:rPr>     <!-- run properties (font, size, bold, etc.) -->
+      <w:t>       <!-- text content -->
+    </w:r>
+  </w:p>
+</w:body>
+```
+### Tables
+```xml
+<w:tbl>
+  <w:tblPr>       <!-- table properties -->
+  <w:tblGrid>     <!-- column widths -->
+  <w:tr>           <!-- table row -->
+    <w:trPr>       <!-- row properties -->
+    <w:tc>         <!-- table cell -->
+      <w:tcPr>     <!-- cell properties (width, merge, borders) -->
+      <w:p>        <!-- cell content (paragraph) -->
+    </w:tc>
+  </w:tr>
+</w:tbl>
+```
+### Content Controls (Structured Document Tags)
+```xml
+<w:sdt>
+  <w:sdtPr>
+    <w:alias w:val="FieldName"/>
+    <w:tag w:val="field_tag"/>
+  </w:sdtPr>
+  <w:sdtContent>
+    <w:p><w:r><w:t>Placeholder</w:t></w:r></w:p>
+  </w:sdtContent>
+</w:sdt>
+```
+## References
+See `references/docx-structure.md` for unpacking, repackaging, and critical rules.
+See `references/docx-field-patterns.md` for field detection patterns (placeholders, empty cells, underline, content controls, instruction text, tip boxes).
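
Note on `DOCX-XML.md` (not part of the diff): since a DOCX file is a ZIP archive with the body in `word/document.xml`, the content controls described above can be listed with the Python standard library alone. This is a minimal sketch, not the package's `detect_fields.py`. The function name `list_content_controls` is hypothetical, and the demo archive holds only `word/document.xml`, so it is not a complete DOCX that Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def list_content_controls(docx_file):
    """Return (alias, tag, placeholder_text) for each w:sdt in word/document.xml."""
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    controls = []
    for sdt in root.iter(f"{{{W_NS}}}sdt"):
        alias = sdt.find("w:sdtPr/w:alias", NS)
        tag = sdt.find("w:sdtPr/w:tag", NS)
        # Join every text run inside the control's content.
        text = "".join(t.text or "" for t in sdt.iterfind("w:sdtContent//w:t", NS))
        controls.append((
            alias.get(f"{{{W_NS}}}val") if alias is not None else None,
            tag.get(f"{{{W_NS}}}val") if tag is not None else None,
            text,
        ))
    return controls


# Demo: an in-memory archive containing only the document part.
document_xml = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:sdt><w:sdtPr>'
    '<w:alias w:val="FieldName"/><w:tag w:val="field_tag"/></w:sdtPr>'
    "<w:sdtContent><w:p><w:r><w:t>Placeholder</w:t></w:r></w:p></w:sdtContent>"
    "</w:sdt></w:body></w:document>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("word/document.xml", document_xml)
buffer.seek(0)
print(list_content_controls(buffer))
```

The function accepts a path or a file-like object, so the same call works on a real template file on disk.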