devlyn-cli 0.5.4 → 0.5.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (16)

package/bin/devlyn.js +160 -4
package/config/agents/evaluator.md +40 -0
package/config/commands/devlyn.evaluate.md +467 -0
package/config/commands/devlyn.team-resolve.md +2 -2
package/optional-skills/dokkit/ANALYSIS.md +32 -1
package/optional-skills/dokkit/COMMANDS.md +20 -13
package/optional-skills/dokkit/FILLING.md +19 -0
package/optional-skills/dokkit/IMAGE-SOURCING.md +2 -2
package/optional-skills/dokkit/PIPELINE.md +348 -0
package/optional-skills/dokkit/SKILL.md +169 -111
package/optional-skills/dokkit/references/docx-section-range-detection.md +147 -0
package/optional-skills/dokkit/references/image-opportunity-heuristics.md +1 -1
package/optional-skills/dokkit/scripts/fill_docx.py +819 -0
package/optional-skills/dokkit/scripts/parse_image_with_gemini.py +3 -3
package/optional-skills/dokkit/scripts/source_images.py +40 -2
package/package.json +1 -1

package/optional-skills/dokkit/SKILL.md CHANGED

@@ -1,153 +1,211 @@
 ---
 name: dokkit
 description: >
-  Document template filling system for DOCX and HWPX formats.
-  Ingests source documents, analyzes templates, detects fillable fields,
-  fills them surgically using source data, reviews with confidence scoring,
-  and exports completed documents. Supports Korean and English templates.
-  Subcommands: init, sources, preview, ingest, fill, fill-doc, modify, review, export.
-  Use when user says "fill template", "fill document", "ingest", "dokkit".
+  One-command document template filling. Put source files (회사소개서, 사업자료,
+  이미지 등) in a folder, provide a DOCX/HWPX template, and get a polished,
+  complete document with AI-generated images. Auto-iterates until perfect.
+  Supports Korean government forms: 사업계획서, 지원서, 신청서.
+  Trigger on: "fill template", "사업계획서 작성", "문서 작성", "dokkit",
+  "fill this form", "템플릿 채워줘", "complete this document", "fill document",
+  "template automation", HWPX files, 한글 templates, document generation,
+  or any task involving filling document templates with source materials.
+  Also trigger when user drops files and asks to fill or complete a template.
 user-invocable: true
 allowed-tools: Read, Write, Edit, Bash, Glob, Grep, Agent
-argument-hint: "<subcommand> [arguments]"
+argument-hint: "<template_path> <sources_folder> | improve [instruction]"
 context:
   - type: file
-    path: ${CLAUDE_SKILL_DIR}/COMMANDS.md
+    path: ${CLAUDE_SKILL_DIR}/PIPELINE.md
 ---
-# Dokkit — Document Template Filling System
+# Dokkit — One-Command Document Filling
-Surgical document filling for DOCX and HWPX templates using ingested source data. One command with 9 subcommands covering the full document filling lifecycle.
+Source folder + template → finished document. Fully automatic, iterates until perfect.
-## Subcommands
+## Usage
-| Subcommand | Arguments | Type | Description |
-|------------|-----------|------|-------------|
-| `init` | `[--force] [--keep-sources]` | Inline | Initialize or reset workspace |
-| `sources` | — | Inline | Display ingested sources dashboard |
-| `preview` | — | Inline | Generate PDF preview via LibreOffice |
-| `ingest` | `<file1> [file2] ...` | Agent | Parse source documents into workspace |
-| `fill` | `<template.docx\|hwpx>` | Agent | End-to-end: analyze, fill, review, auto-fix, export |
-| `fill-doc` | `<template.docx\|hwpx>` | Agent | Analyze template and fill fields only |
-| `modify` | `"<instruction>"` | Agent | Apply targeted changes to filled document |
-| `review` | `[section\|approve]` | Agent | Review with per-field confidence annotations |
-| `export` | `<docx\|hwpx\|pdf>` | Agent | Export filled document to format |
-## Routing
+```
+/dokkit <template_path> <sources_folder>
+/dokkit improve ["instruction"]
+```
-Parse `$ARGUMENTS` to determine the subcommand:
+- `template_path`: DOCX or HWPX template file **(required)**
+- `sources_folder`: Folder with source materials **(required)**
-1. Extract `$1` as the subcommand name
-2. Pass remaining arguments (`$2`, `$3`, ...) to the subcommand
-3. If `$1` is empty or unrecognized, display the subcommand table above with usage examples
+Both arguments are mandatory. If either is missing, show this error and stop:
+```
+Error: template과 sources 폴더를 모두 지정해주세요.
-Full workflows for each subcommand are in COMMANDS.md (auto-loaded via context).
+Usage: /dokkit <template.docx|hwpx> <sources_folder>
+Example: /dokkit docs/사업계획서_양식.docx docs/sources/김철수/
+```
 <example>
-- `/dokkit ingest docs/resume.pdf docs/transcript.xlsx` — ingest two sources
-- `/dokkit fill docs/template.hwpx` — end-to-end fill pipeline
-- `/dokkit modify "Change the phone number to 010-1234-5678"` — targeted change
-- `/dokkit export pdf` — export as PDF
+/dokkit docs/사업계획서_양식.docx docs/sources/김철수/
+/dokkit docs/template.hwpx docs/자료/
+/dokkit improve                              # 전체적으로 품질 향상
+/dokkit improve "이미지를 더 넣어줘"          # 특정 방향으로 개선
+/dokkit improve "시장분석 섹션을 더 풍부하게"  # 특정 섹션 강화
 </example>
-## Architecture
+## Pipeline Overview
-### Agents
+Six phases, fully automated. Phases 3-5 loop until quality gates pass (max 3 iterations).
-| Agent | Model | Role |
-|-------|-------|------|
-| **dokkit-ingestor** | opus | Parse source docs into `.dokkit/sources/` (.md + .json pairs) |
-| **dokkit-analyzer** | opus | Analyze templates, detect fields, map to sources. Writes `analysis.json`. READ-ONLY on templates. |
-| **dokkit-filler** | opus | Surgical XML modification using analysis.json. Three modes: fill, modify, review. |
-| **dokkit-exporter** | sonnet | Repackage ZIP archives, PDF conversion via LibreOffice. |
+| # | Phase | What Happens |
+|---|-------|-------------|
+| 1 | **Prepare** | Parse all source files → structured data |
+| 2 | **Analyze** | Detect template fields, map structure |
+| 3 | **Fill** | Generate & insert content **section-by-section** |
+| 4 | **Images** | Generate via Gemini, insert with **correct aspect ratio** |
+| 5 | **Review** | Quality gates → auto-fix failures → re-check |
+| 6 | **Export** | Compile document + PDF preview |
-### Workspace
+Progress shown as: `Phase N/6: description`
+## Core Design: Section-by-Section Generation
+This is the #1 quality improvement over previous versions.
+**Problem**: Generating all content at once → each section gets shallow attention, quality worse than manual AI.
+**Solution**: Each template section gets dedicated AI focus with full source context, exactly like asking AI to write one section at a time manually.
+For each section:
+1. Read the section's template tips and writing instructions
+2. Load ALL relevant source data into context
+3. Generate rich, persuasive, data-driven content for THIS section only
+4. Insert while preserving original formatting exactly
+5. Verify quality immediately before moving to next section
+The filler agent generates content AND inserts it — no lossy handoff between separate agents.
+## Quality Gates
-All agents communicate via the `.dokkit/` filesystem:
+ALL must pass before final export:
+| Gate | Criterion | Auto-Fix Strategy |
+|------|-----------|-------------------|
+| QG1 | Total text ≥ 7,500 chars | Re-enrich thin sections |
+| QG2 | Each section_content ≥ 500 chars | Re-generate with more detail |
+| QG3 | Zero `00.00` date placeholders | Derive from source context |
+| QG4 | Zero `OO`/`○○` name placeholders (exclude schedule table `O` marks) | Derive from source context |
+| QG5 | Zero `이미지 영역` text | Remove placeholder text |
+| QG6 | Images aspect ratio correct (within 5%) | Re-measure with PIL |
+| QG7 | No red/italic guide text in filled cells | Sanitize styles |
+| QG8 | ≥ 10 images in document | Generate additional images via `source_images.py` |
+| QG9 | Zero `XXXXX`/`XXXXXX` fake placeholders | Replace with real values or "해당없음" |
+| QG10 | TOC page numbers not `00` | Remove or update TOC entries |
+## Font & Formatting Rules
+These rules prevent the font corruption issues seen in previous versions:
+1. **Never copy guide text styling** — Template placeholders often use red (#FF0000) and italic. Strip these unconditionally when creating filled runs.
+2. **Use template default body style** — Find the document's standard body text formatting (black, regular weight) and apply it to all filled content.
+3. **HWPX charPr spacing** — Before ANY text insertion, scan ALL `<hh:charPr>` in `header.xml` and set negative `spacing` values to `"0"`. Negative spacing causes character overlap.
+4. **DOCX rPr sanitization** — When copying run properties from label cells, always remove `<w:color>` if red and `<w:i/>` (italic).
+5. **Preserve structural formatting** — Keep paragraph alignment (pPr), indentation, spacing, and table cell properties unchanged. Only modify text content and run-level styles.
+## Image Rules
+### Generation — use `source_images.py` exclusively
+**Never write inline Gemini API calls for image generation.** Always use the provided script:
+```bash
+python ${CLAUDE_SKILL_DIR}/scripts/source_images.py generate \
+  --prompt "<prompt>" --preset <preset> --output-dir .dokkit/images/ \
+  --project-dir . --lang ko
 ```
-.dokkit/
-├── state.json          # Single source of truth for session state
-├── sources/            # Ingested content (.md + .json pairs)
-├── analysis.json       # Template analysis output (from analyzer)
-├── images/             # Sourced images for template filling
-├── template_work/      # Unpacked template XML (working copy)
-└── output/             # Exported filled documents
+The script uses `gemini-3-pro-image-preview` (high quality), enforces Korean text, and applies correct aspect ratios per preset. Bypassing it results in wrong model, wrong language, wrong dimensions.
+### Prompt quality
+- Include company/product-specific details — never use generic prompts
+- For org charts: use actual names and roles from source data
+- For market charts: include specific numbers from sources
+- For tech diagrams: name the actual technologies and systems
+### Sizing — prevent distortion
+1. **Always measure actual dimensions** — After generating any image, use PIL/Pillow to get true pixel dimensions.
+2. **Preserve aspect ratio** — Calculate display size that fits within the target cell while maintaining the image's original width:height ratio.
+3. **HWPX imgDim** — Must reflect actual pixel dimensions from PIL, NOT layout constants.
+4. **DOCX EMU** — Calculate from actual pixels: `EMU = pixels × 914400 / 96`.
+5. **Never stretch** — If the image doesn't fit the cell exactly, scale down to fit within bounds (letterbox, don't fill).
+```python
+# Correct image sizing
+from PIL import Image
+img = Image.open(path)
+actual_w, actual_h = img.size
+aspect = actual_w / actual_h
+# Scale to fit within target bounds
+scale = min(target_w / actual_w, target_h / actual_h)
+display_w = int(actual_w * scale)
+display_h = int(actual_h * scale)
 ```
-### State Protocol
-Read `.dokkit/state.json` before any operation. Write state changes atomically: read current → update fields → write back → validate.
+## Architecture
+### Workspace
 ```
-init → state created (empty)
-ingest → source added to sources[]
-fill/fill-doc → template set, analysis created, filled_document created
-modify → filled_document updated
-review approve → filled_document.status = "finalized"
-export → export entry added to exports[]
+.dokkit/
+├── state.json          # Pipeline state and progress
+├── sources/            # Parsed source data (.md + .json pairs)
+├── analysis.json       # Template field map (from analyzer)
+├── images/             # Generated/sourced images
+├── template_work/      # Unpacked template XML (working copy)
+└── output/             # Final completed documents
 ```
-Validate after every write: `python ${CLAUDE_SKILL_DIR}/scripts/validate_state.py .dokkit/state.json`
+### Agents
+| Agent | Model | Role |
+|-------|-------|------|
+| **dokkit-ingestor** | opus | Parse source files → `.dokkit/sources/` |
+| **dokkit-analyzer** | opus | Detect fields & structure → `analysis.json` (NO content generation for sections) |
+| **dokkit-filler** | opus | Generate content section-by-section + fill XML + insert images + quality review |
+| **dokkit-exporter** | sonnet | Compile ZIP archives, convert to PDF |
 ### Knowledge Files
-Agent-facing knowledge bases in this skill directory:
-| File | Purpose | Agents |
-|------|---------|--------|
-| `STATE.md` | State schema and management protocol | All |
-| `INGESTION.md` | Format routing and parsing strategies | dokkit-ingestor |
-| `ANALYSIS.md` | Field detection, confidence scoring, output schema | dokkit-analyzer |
-| `FILLING.md` | XML surgery rules, matching strategy, image insertion | dokkit-analyzer, dokkit-filler |
-| `DOCX-XML.md` | Open XML structure for DOCX documents | dokkit-analyzer, dokkit-filler |
-| `HWPX-XML.md` | OWPML structure for HWPX documents | dokkit-analyzer, dokkit-filler |
-| `IMAGE-SOURCING.md` | Image generation, search, and insertion patterns | dokkit-filler |
-| `EXPORT.md` | Document compilation and format conversion | dokkit-exporter |
-Deep reference material in `references/`:
-- `state-schema.md` — Complete state.json schema
-- `supported-formats.md` — Detailed format specifications
-- `docx-structure.md`, `docx-field-patterns.md` — DOCX patterns
-- `hwpx-structure.md`, `hwpx-field-patterns.md` — HWPX patterns (10 detection patterns)
-- `field-detection-patterns.md` — Advanced heuristics (9 DOCX + 6 HWPX)
-- `section-range-detection.md` — Dynamic range detection for section_content
-- `section-image-interleaving.md` — Image interleaving algorithm
-- `image-opportunity-heuristics.md` — AI image opportunity detection
-- `image-xml-patterns.md` — Image element structures (DOCX + HWPX)
-Scripts in `scripts/`:
-- `validate_state.py` — State validation
-- `parse_xlsx.py`, `parse_hwpx.py`, `parse_image_with_gemini.py` — Custom parsers
-- `detect_fields.py`, `detect_fields_hwpx.py` — Field detection
-- `validate_docx.py`, `validate_hwpx.py` — Document validation
-- `compile_hwpx.py` — HWPX repackaging
-- `export_pdf.py` — PDF conversion
+| File | Purpose | Used By |
+|------|---------|---------|
+| `PIPELINE.md` | Detailed pipeline steps (auto-loaded) | Orchestrator |
+| `STATE.md` | State schema and management | All agents |
+| `INGESTION.md` | Source file parsing | Ingestor |
+| `ANALYSIS.md` | Field detection, structure mapping | Analyzer |
+| `FILLING.md` | XML surgery rules, image insertion | Filler |
+| `DOCX-XML.md` / `HWPX-XML.md` | XML format structures | Analyzer, Filler |
+| `IMAGE-SOURCING.md` | Image generation patterns | Filler |
+| `EXPORT.md` | Compilation and conversion | Exporter |
 ## Rules
-<rules>
-- Display errors clearly with actionable guidance. Never silently fall back to defaults.
-- Original template is never modified — copies go to `.dokkit/template_work/`.
-- Analyzer is read-only on templates. Only the filler modifies XML.
-- Confidence levels: high, medium, low (not numeric scores).
-- Signatures must be user-provided — never auto-generate them.
-- Validate state after every write with `scripts/validate_state.py`.
-- Inline commands (init, sources, preview) execute directly — do NOT spawn agents.
-- Agent-delegated commands spawn the appropriate agent(s) sequentially.
-</rules>
+1. **One command does everything** — no manual subcommands needed (except `improve` for post-fill enhancement)
+2. **Never modify the original template** — work on copies in `.dokkit/template_work/`
+3. **Section-by-section generation** — each section gets full AI attention with all source data
+4. **Aspect ratio preservation** — images never stretched or squashed
+5. **Black text only** — never inherit colored/italic guide text styles
+6. **Auto-loop** — iterate until ALL quality gates pass (max 3 iterations)
+7. **Progress reporting** — show `Phase N/6: description` at each step
+8. **Clear errors** — if something fails, show what went wrong with actionable guidance
+9. **Gemini API** — if not configured, warn and skip image generation (don't block text filling)
 ## Known Pitfalls
-Critical issues discovered through production use:
+Critical issues from production experience — these MUST be handled:
-1. **HWPX namespace stripping**: Python ET strips unused namespace declarations. Restore ALL 14 original xmlns on EVERY root element after any `tree.write()`. Applies to section0.xml, content.hpf, header.xml.
-2. **HWPX subList cell wrapping**: ~65% of cells wrap content in `<hp:subList>/<hp:p>`. Check for subList before writing content.
-3. **table_content "Pre-filled" bug**: Never set `mapped_value` to placeholder strings for `table_content` fields. Use `mapped_value: null` with `action: "preserve"`.
-4. **HWPX cellAddr rowAddr corruption**: After row insert/delete, re-index ALL `rowAddr` values. Duplicate rowAddr causes silent data loss.
-5. **HWPX `<hp:pic>` inside `<hp:run>`**: Pic as sibling of run renders invisible. Must be `<hp:run><hp:pic>...<hp:t/></hp:run>`.
-6. **HWPML units**: 1/7200 inch, NOT hundredths of mm. 1mm ~ 283.46 units. A4 text width ~ 46,648 units.
-7. **rowSpan stripping**: When cloning rows with rowSpan>1, divide cellSz height by rowSpan.
+1. **HWPX namespace stripping**: Python ET strips unused namespace declarations. Restore ALL 14 original xmlns on EVERY root element after `tree.write()`.
+2. **HWPX subList cell wrapping**: ~65% of cells use `<hp:subList>/<hp:p>`. Always check before writing.
+3. **table_content "Pre-filled" bug**: Never set `mapped_value` to placeholder strings. Use `null` with `action: "preserve"`.
+4. **HWPX cellAddr rowAddr corruption**: After row insert/delete, re-index ALL `rowAddr` values.
+5. **HWPX `<hp:pic>` placement**: Must be `<hp:run><hp:pic>...<hp:t/></hp:run>`, not pic as sibling.
+6. **HWPML units**: 1/7200 inch. 1mm ~ 283.46 units. A4 text width ~ 46,648 units.
+7. **rowSpan stripping**: Divide cellSz height by rowSpan when cloning.
 8. **HWPX pic element order**: offset, orgSz, curSz, flip, rotationInfo, renderingInfo, imgRect, imgClip, inMargin, imgDim, hc:img, sz, pos, outMargin.
-9. **HWPX post-write safety**: After ET write: (a) restore namespaces, (b) fix XML declaration to double quotes with `standalone="yes"`, (c) remove newline between `?>` and `<root>`.
-10. **compile_hwpx.py skip .bak**: Backup files must be excluded from ZIP repackaging.
+9. **Section content table preservation**: ONLY replace `<w:p>`/`<hp:p>` elements. NEVER remove `<w:tbl>`/`<hp:tbl>`.
+10. **Section range detection**: After deleting tips/instructions, ranges are STALE. Recompute dynamically.
+11. **HWPX post-write safety**: Restore namespaces → fix XML declaration → remove newline between `?>` and root.
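
The sizing rules added to SKILL.md above (measure the real pixel size, scale to fit without stretching, `EMU = pixels × 914400 / 96`) can be combined into two small helpers. This is a minimal sketch, not code from the package: `fit_within` and `px_to_emu` are hypothetical names, the 96 DPI default is taken from the formula in the diff, and in practice `actual_w` and `actual_h` would come from `Image.open(path).size` as the diff's own snippet shows.

```python
EMU_PER_INCH = 914400  # English Metric Units per inch (DOCX drawing units)


def fit_within(actual_w, actual_h, target_w, target_h):
    """Scale an image to fit inside a target box while keeping its aspect ratio.

    Hypothetical helper: mirrors the min(...) scale rule from the diff.
    """
    if min(actual_w, actual_h, target_w, target_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # The smaller of the two ratios guarantees both sides fit (letterbox).
    scale = min(target_w / actual_w, target_h / actual_h)
    return max(1, round(actual_w * scale)), max(1, round(actual_h * scale))


def px_to_emu(pixels, dpi=96):
    """Convert pixels to EMU, assuming the 96 DPI used in the diff's formula."""
    return round(pixels * EMU_PER_INCH / dpi)


# A 1600x900 image placed in an 800x800 cell keeps its 16:9 shape.
display_w, display_h = fit_within(1600, 900, 800, 800)
print(display_w, display_h)                        # 800 450
print(px_to_emu(display_w), px_to_emu(display_h))  # 7620000 4286250
```

Like the snippet in the diff, this scale factor can exceed 1 when the image is smaller than the cell; capping it at 1.0 would be one way to avoid upscaling, if that is the intended reading of "scale down to fit".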

package/optional-skills/dokkit/references/docx-section-range-detection.md ADDED

@@ -0,0 +1,147 @@
+# Section Content Range Detection (DOCX)
+## Problem
+Same as HWPX: after deleting instruction text (※ paragraphs) and tip boxes, the child indices from `analysis.json` become stale. Using stale indices destroys tables and other structural elements.
+**Additionally**, DOCX section content ranges may contain embedded `<w:tbl>` elements (schedule tables, budget tables) that must NEVER be replaced during section content filling. Unlike HWPX where tables are children of the section root, DOCX tables are direct children of `<w:body>` interspersed with paragraphs.
+## Solution: Dynamic Range Detection + Table Preservation
+### Step 1: Recompute ranges after cleanup
+After deleting instruction text and tip boxes, scan `<w:body>` children for section title markers:
+```python
+def find_docx_section_ranges(body, w_ns):
+    """Find section content ranges by locating title markers in w:body.
+    Must run AFTER tip/instruction deletion so indices are stable.
+    Returns dict mapping approximate field labels to (start, end) inclusive child index ranges.
+    """
+    children = list(body)
+    markers = {}
+    for i, child in enumerate(children):
+        text = ''.join(
+            t.text or '' for t in child.iter(f'{{{w_ns}}}t')
+        ).strip()
+        # Section title markers (numbered headings)
+        if '1.' in text and '문제' in text and ('Problem' in text or '필요성' in text):
+            markers['sec1_title'] = i
+        elif '2.' in text and '실현' in text and ('Solution' in text or '개발' in text):
+            markers['sec2_title'] = i
+        elif '3.' in text and '성장' in text and ('Scale' in text or '사업화' in text):
+            markers['sec3_title'] = i
+        elif '4.' in text and '팀' in text and ('Team' in text or '대표자' in text):
+            markers['sec4_title'] = i
+        # End markers
+        elif '사업추진' in text and '일정' in text and '협약기간' in text:
+            markers['schedule1'] = i
+        elif '사업추진' in text and '일정' in text and '전체' in text:
+            markers['schedule2'] = i
+        elif '팀 구성' in text and ('구분' in text or '직위' in text or '안' in text):
+            markers['team_table'] = i
+        elif '협력' in text and '기관' in text:
+            markers['partnership'] = i
+    # Build ranges: content starts after title + instruction text, ends before next structural element
+    ranges = {}
+    if 'sec1_title' in markers and 'sec2_title' in markers:
+        ranges['sec1'] = (markers['sec1_title'] + 1, markers['sec2_title'] - 1)
+    if 'sec2_title' in markers and 'schedule1' in markers:
+        ranges['sec2'] = (markers['sec2_title'] + 1, markers['schedule1'] - 1)
+    if 'sec3_title' in markers and 'schedule2' in markers:
+        ranges['sec3'] = (markers['sec3_title'] + 1, markers['schedule2'] - 1)
+    if 'sec4_title' in markers and 'team_table' in markers:
+        ranges['sec4'] = (markers['sec4_title'] + 1, markers['team_table'] - 1)
+    return ranges
+```
+### Step 2: CRITICAL — Only replace paragraphs, never tables
+When filling section content within the detected range, ONLY operate on `<w:p>` elements. **Skip all other element types**:
+```python
+W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
+def fill_docx_section_content(body, start_idx, end_idx, new_paragraphs):
+    """Replace paragraph content within a section range, preserving tables.
+    RULE: Only remove/replace <w:p> elements. NEVER touch <w:tbl>, <w:bookmarkStart>,
+    <w:bookmarkEnd>, <w:sectPr>, or any non-paragraph elements.
+    """
+    children = list(body)
+    # Phase 1: Identify which children to remove (paragraphs only)
+    to_remove = []
+    preserved_elements = []  # (index, element) pairs for tables etc.
+    for i in range(start_idx, min(end_idx + 1, len(children))):
+        child = children[i]
+        tag = child.tag.split('}')[-1] if '}' in child.tag else child.tag
+        if tag == 'p':
+            to_remove.append(child)
+        else:
+            # Tables, bookmarks, sectPr — preserve in their position
+            preserved_elements.append((i, child))
+    # Phase 2: Remove old paragraphs
+    for elem in to_remove:
+        body.remove(elem)
+    # Phase 3: Insert new paragraphs at the start of the range
+    # (preserved tables remain in place)
+    insert_point = start_idx
+    for new_p in new_paragraphs:
+        body.insert(insert_point, new_p)
+        insert_point += 1
+```
+### Why Tables Must Be Preserved
+The 예비창업패키지 사업계획서 template has this structure within `<w:body>`:
+```
+[19] p: "1. 문제 인식 (Problem)..." — section title
+[20] p: "※ 개발하고자 하는..." — instruction text (delete)
+[21-60] p: section content paragraphs (replace)
+[61] p: "2. 실현 가능성 (Solution)..." — section title
+[62] p: "※ 아이디어를..." — instruction text (delete)
+[63-82] p: section content paragraphs (replace)
+[83] p: "< 사업추진 일정(협약기간 내) >" — schedule heading
+[85] tbl: schedule table ← MUST PRESERVE
+[91] tbl: budget table 1 ← MUST PRESERVE
+[96] tbl: budget table 2 ← MUST PRESERVE
+```
+If the filler replaces the entire range including tables, the schedule and budget data is destroyed. The tables are handled separately as `table_content` fields.
+### Form Tables vs Section Content
+The following tables are NOT section content — they are form-filling tables with individual cell fields:
+| Body index | Content | Field type |
+|-----------|---------|------------|
+| 13 | 일반현황 table (창업아이템명, 산출물) | `empty_cell` per cell |
+| 17 | 개요(요약) table (명칭, 범주, etc.) | `empty_cell` per cell |
+| 85 | 사업추진 일정 (협약기간) | `table_content` |
+| 91 | 1단계 정부지원사업비 | `table_content` |
+| 96 | 2단계 정부지원사업비 | `table_content` |
+| 140 | 사업추진 일정 (전체) | `table_content` |
+| 160 | 팀 구성 table | `table_content` |
+| 164 | 협력 기관 table | `table_content` |
+The analyzer must classify these as their specific field types. They should NEVER be included in a `section_content` field's range.
+## Adapting for Different Templates
+Same as the HWPX version — identify section title markers, match by text content, map to field IDs. The key difference for DOCX:
+1. Body children are direct `<w:p>` and `<w:tbl>` elements (flat structure)
+2. Tables are interspersed with paragraphs at the same level
+3. The "only replace `<w:p>`" rule is universal and template-independent
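
The "only replace `<w:p>`" rule from the file above can be checked on a toy `<w:body>`. The following is a self-contained sketch under stated assumptions: `make_paragraph` and `replace_paragraphs_in_range` are hypothetical names, the body is built by hand with the standard library's `xml.etree.ElementTree` rather than loaded from a real template, and new paragraphs are inserted at the start of the range as in the diff's `fill_docx_section_content`.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
P_TAG = f"{{{W_NS}}}p"
TBL_TAG = f"{{{W_NS}}}tbl"


def make_paragraph(text):
    """Build a minimal <w:p><w:r><w:t>text</w:t></w:r></w:p> element."""
    p = ET.Element(P_TAG)
    r = ET.SubElement(p, f"{{{W_NS}}}r")
    t = ET.SubElement(r, f"{{{W_NS}}}t")
    t.text = text
    return p


def replace_paragraphs_in_range(body, start_idx, end_idx, new_paragraphs):
    """Remove only <w:p> children in [start_idx, end_idx], then insert new ones.

    Tables and any other non-paragraph children in the range are left untouched.
    """
    children = list(body)  # snapshot, so removal does not disturb iteration
    stop = min(end_idx + 1, len(children))
    for child in children[start_idx:stop]:
        if child.tag == P_TAG:
            body.remove(child)
    for offset, new_p in enumerate(new_paragraphs):
        body.insert(start_idx + offset, new_p)


# Toy body: title, old paragraph, table, old paragraph.
body = ET.Element(f"{{{W_NS}}}body")
body.append(make_paragraph("section title"))
body.append(make_paragraph("old content A"))
ET.SubElement(body, TBL_TAG)
body.append(make_paragraph("old content B"))

replace_paragraphs_in_range(body, 1, 3, [make_paragraph("new content")])
tags = [child.tag.split("}")[-1] for child in body]
print(tags)  # ['p', 'p', 'tbl']
```

Both old paragraphs are gone, the table survives, and the new paragraph sits directly after the title. Because the preserved table shifts up when paragraphs before it are removed, any indices computed earlier are stale after this call, which is the reason the file recommends recomputing ranges dynamically.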

package/optional-skills/dokkit/references/image-opportunity-heuristics.md CHANGED

@@ -113,7 +113,7 @@ Each opportunity is added to the field's `image_opportunities` array:
 - `insertion_point.strategy`: Always `"after_paragraph"` for section content
 - `insertion_point.anchor_text`: Distinctive Korean phrase from the paragraph (used by filler to locate insertion point)
 - `generation_prompt`: English prompt for AI image generation
-- `preset`: Maps to `scripts/source_images.py` preset parameter
+- `preset`: Maps to `.claude/skills/dokkit/scripts/source_images.py` preset parameter
 - `content_type`: One of `flowchart`, `diagram`, `data`, `concept`, `infographic`
 - `rationale`: Brief explanation of why an image helps here
 - `dimensions`: Default size — filler may adjust based on content_type
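
The field list in this hunk states two hard constraints: `insertion_point.strategy` is always `"after_paragraph"` for section content, and `content_type` is one of five values. A small checker for those two constraints might look like the sketch below. Everything else here is an assumption made for illustration: the function name `validate_image_opportunity`, the requirement that `anchor_text` be non-empty, and every value in the sample entry (including the preset name) are hypothetical and not taken from the package.

```python
ALLOWED_CONTENT_TYPES = {"flowchart", "diagram", "data", "concept", "infographic"}


def validate_image_opportunity(opportunity):
    """Return a list of problems found in one image_opportunities entry."""
    problems = []
    insertion_point = opportunity.get("insertion_point") or {}
    # Stated in the hunk: strategy is always "after_paragraph" for section content.
    if insertion_point.get("strategy") != "after_paragraph":
        problems.append("insertion_point.strategy must be 'after_paragraph'")
    # Assumption: the filler needs a non-empty anchor phrase to locate the spot.
    if not insertion_point.get("anchor_text"):
        problems.append("insertion_point.anchor_text is empty")
    # Stated in the hunk: content_type is one of five values.
    if opportunity.get("content_type") not in ALLOWED_CONTENT_TYPES:
        problems.append("content_type is not one of the allowed values")
    return problems


# Hypothetical entry; all values are placeholders, not real package data.
sample = {
    "insertion_point": {"strategy": "after_paragraph", "anchor_text": "시장 규모"},
    "generation_prompt": "Bar chart of market size by year",
    "preset": "example-preset",
    "content_type": "data",
    "rationale": "A chart summarizes the figures in the paragraph",
}
print(validate_image_opportunity(sample))  # []
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in an entry at once.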