devlyn-cli 0.5.1 → 0.5.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/bin/devlyn.js +1 -0
- package/optional-skills/better-auth-setup/SKILL.md +222 -11
- package/optional-skills/better-auth-setup/references/proxy-gotchas.md +148 -0
- package/optional-skills/better-auth-setup/references/proxy-setup.md +284 -0
- package/optional-skills/dokkit/ANALYSIS.md +198 -0
- package/optional-skills/dokkit/COMMANDS.md +365 -0
- package/optional-skills/dokkit/DOCX-XML.md +76 -0
- package/optional-skills/dokkit/EXPORT.md +102 -0
- package/optional-skills/dokkit/FILLING.md +377 -0
- package/optional-skills/dokkit/HWPX-XML.md +73 -0
- package/optional-skills/dokkit/IMAGE-SOURCING.md +127 -0
- package/optional-skills/dokkit/INGESTION.md +65 -0
- package/optional-skills/dokkit/SKILL.md +153 -0
- package/optional-skills/dokkit/STATE.md +60 -0
- package/optional-skills/dokkit/references/docx-field-patterns.md +151 -0
- package/optional-skills/dokkit/references/docx-structure.md +58 -0
- package/optional-skills/dokkit/references/field-detection-patterns.md +130 -0
- package/optional-skills/dokkit/references/hwpx-field-patterns.md +461 -0
- package/optional-skills/dokkit/references/hwpx-structure.md +159 -0
- package/optional-skills/dokkit/references/image-opportunity-heuristics.md +121 -0
- package/optional-skills/dokkit/references/image-xml-patterns.md +338 -0
- package/optional-skills/dokkit/references/section-image-interleaving.md +346 -0
- package/optional-skills/dokkit/references/section-range-detection.md +118 -0
- package/optional-skills/dokkit/references/state-schema.md +143 -0
- package/optional-skills/dokkit/references/supported-formats.md +67 -0
- package/optional-skills/dokkit/scripts/compile_hwpx.py +134 -0
- package/optional-skills/dokkit/scripts/detect_fields.py +301 -0
- package/optional-skills/dokkit/scripts/detect_fields_hwpx.py +286 -0
- package/optional-skills/dokkit/scripts/export_pdf.py +99 -0
- package/optional-skills/dokkit/scripts/parse_hwpx.py +185 -0
- package/optional-skills/dokkit/scripts/parse_image_with_gemini.py +159 -0
- package/optional-skills/dokkit/scripts/parse_xlsx.py +98 -0
- package/optional-skills/dokkit/scripts/source_images.py +365 -0
- package/optional-skills/dokkit/scripts/validate_docx.py +142 -0
- package/optional-skills/dokkit/scripts/validate_hwpx.py +281 -0
- package/optional-skills/dokkit/scripts/validate_state.py +132 -0
- package/package.json +1 -1
|
@@ -0,0 +1,365 @@
|
|
|
1
|
+
# Dokkit Command Reference
|
|
2
|
+
|
|
3
|
+
Complete workflows for all 9 subcommands. Loaded automatically into context when `/dokkit` is invoked.
|
|
4
|
+
|
|
5
|
+
## Table of Contents
|
|
6
|
+
|
|
7
|
+
- [init](#init) — Initialize workspace
|
|
8
|
+
- [sources](#sources) — Source dashboard
|
|
9
|
+
- [preview](#preview) — PDF preview
|
|
10
|
+
- [ingest](#ingest) — Ingest source documents
|
|
11
|
+
- [fill](#fill) — End-to-end fill pipeline
|
|
12
|
+
- [fill-doc](#fill-doc) — Analyze and fill template
|
|
13
|
+
- [modify](#modify) — Targeted changes
|
|
14
|
+
- [review](#review) — Confidence review
|
|
15
|
+
- [export](#export) — Export to format
|
|
16
|
+
|
|
17
|
+
---
|
|
18
|
+
|
|
19
|
+
## init
|
|
20
|
+
|
|
21
|
+
Initialize or reset the `.dokkit/` workspace for a new document filling session.
|
|
22
|
+
|
|
23
|
+
### Arguments
|
|
24
|
+
- `--force` or `-f`: Skip confirmation and reset without asking
|
|
25
|
+
- `--keep-sources`: Reset template/output but preserve ingested sources
|
|
26
|
+
|
|
27
|
+
### Procedure
|
|
28
|
+
|
|
29
|
+
1. Check if `.dokkit/` already exists
|
|
30
|
+
2. If it exists and `--force` is not passed, ask the user to confirm reset
|
|
31
|
+
3. If `--keep-sources` is used, preserve `.dokkit/sources/` and source entries in state.json
|
|
32
|
+
4. Create the workspace structure:
|
|
33
|
+
```
.dokkit/
├── sources/
├── template_work/
├── output/
├── images/
└── state.json
```
|
|
41
|
+
5. Initialize `state.json`:
|
|
42
|
+
```json
{
  "version": "1.0",
  "created": "<ISO timestamp>",
  "sources": [],
  "template": null,
  "analysis": null,
  "filled_document": null,
  "exports": []
}
```
|
|
53
|
+
6. Validate the state file
|
|
54
|
+
7. Report success with next step guidance
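
The procedure above can be sketched in Python roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function name `init_workspace` and its parameters are hypothetical, and the interactive confirmation from step 2 is left to the caller.

```python
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path

# Subdirectories taken from the workspace structure in step 4.
SUBDIRS = ("sources", "template_work", "output", "images")


def init_workspace(root: Path, keep_sources: bool = False) -> dict:
    """Create or reset <root>/.dokkit and return the state that was written.

    Hypothetical helper: the caller is assumed to have already asked for
    confirmation (or received --force) before calling this on an existing
    workspace.
    """
    ws = root / ".dokkit"
    state_file = ws / "state.json"

    # With keep_sources, remember the source entries before wiping state.json.
    old_sources = []
    if keep_sources and state_file.exists():
        old_state = json.loads(state_file.read_text(encoding="utf-8"))
        old_sources = old_state.get("sources", [])

    # Remove everything in the workspace except sources/ when it is preserved.
    if ws.exists():
        for child in list(ws.iterdir()):
            if keep_sources and child.name == "sources":
                continue
            if child.is_dir():
                shutil.rmtree(child)
            else:
                child.unlink()

    for name in SUBDIRS:
        (ws / name).mkdir(parents=True, exist_ok=True)

    # Same keys as the initial state.json shown in step 5.
    state = {
        "version": "1.0",
        "created": datetime.now(timezone.utc).isoformat(),
        "sources": old_sources,
        "template": None,
        "analysis": None,
        "filled_document": None,
        "exports": [],
    }
    state_file.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return state
```

Calling `init_workspace(Path("."), keep_sources=True)` would correspond to `/dokkit init --keep-sources` under these assumptions.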
|
|
55
|
+
|
|
56
|
+
### Output
|
|
57
|
+
```
Dokkit workspace initialized at .dokkit/
  sources/        — ready for /dokkit ingest
  template_work/  — ready for /dokkit fill
  output/         — ready for /dokkit export
  state.json      — initialized

Next: Use /dokkit ingest <file> to add source documents.
```
|
|
66
|
+
|
|
67
|
+
### Rules
|
|
68
|
+
- Inline command — do NOT fork to any agent
|
|
69
|
+
- If resetting, warn about data loss unless --force is used
|
|
70
|
+
|
|
71
|
+
---
|
|
72
|
+
|
|
73
|
+
## sources
|
|
74
|
+
|
|
75
|
+
Display all ingested source documents with their status, type, and summary.
|
|
76
|
+
|
|
77
|
+
### Procedure
|
|
78
|
+
|
|
79
|
+
1. Read `.dokkit/state.json`
|
|
80
|
+
2. If `.dokkit/` (or its `state.json`) does not exist, so the read in step 1 fails, show error: "No workspace found. Run `/dokkit init` first."
|
|
81
|
+
3. If no sources exist, show empty state with supported formats list
|
|
82
|
+
4. For each source, display: name, type, status, summary
|
|
83
|
+
5. Show total count and any errors
|
|
84
|
+
|
|
85
|
+
### Output
|
|
86
|
+
```
Ingested Sources (3 total)

#  Name             Type  Status  Summary
1  resume.pdf       PDF   ready   Personal resume with education and work history
2  transcript.xlsx  XLSX  ready   Academic transcript with grades and courses
3  scan.png         PNG   error   OCR failed — image too blurry

Use /dokkit ingest <file> to add more sources.
```
|
|
96
|
+
|
|
97
|
+
### Rules
|
|
98
|
+
- Inline command — do NOT fork to any agent
|
|
99
|
+
- Read-only: only reads state.json, never modifies anything
|
|
100
|
+
|
|
101
|
+
---
|
|
102
|
+
|
|
103
|
+
## preview
|
|
104
|
+
|
|
105
|
+
Generate a visual preview of the current filled document as PDF.
|
|
106
|
+
|
|
107
|
+
### Procedure
|
|
108
|
+
|
|
109
|
+
1. Read `.dokkit/state.json` to check document status
|
|
110
|
+
2. If no filled document exists, show error: "No filled document. Run `/dokkit fill <template>` first."
|
|
111
|
+
3. Compile the current `template_work/` into a temporary file
|
|
112
|
+
4. Convert to PDF using LibreOffice: `soffice --headless --convert-to pdf --outdir .dokkit/output/ <file>`
|
|
113
|
+
5. Report the preview file path
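
Steps 4 and 5 could be wrapped as in the sketch below. The `soffice` flags are the ones quoted in step 4; the helper names are hypothetical. LibreOffice names the output after the input file's stem, so the temporary file from step 3 would need to be called `preview_<name>.<ext>` to produce the path shown under Output.

```python
import shutil
import subprocess
from pathlib import Path

DEFAULT_OUT_DIR = ".dokkit/output/"


def soffice_command(input_file: str, out_dir: str = DEFAULT_OUT_DIR) -> list[str]:
    """Build the headless LibreOffice conversion command from step 4."""
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", out_dir, input_file]


def expected_pdf(input_file: str, out_dir: str = DEFAULT_OUT_DIR) -> Path:
    """LibreOffice writes <stem>.pdf into the output directory."""
    return Path(out_dir) / (Path(input_file).stem + ".pdf")


def convert_to_pdf(input_file: str, out_dir: str = DEFAULT_OUT_DIR) -> Path:
    """Run the conversion, failing early if LibreOffice is not installed."""
    if shutil.which("soffice") is None:
        raise RuntimeError(
            "LibreOffice (soffice) was not found on PATH; install it to enable previews."
        )
    subprocess.run(soffice_command(input_file, out_dir), check=True)
    return expected_pdf(input_file, out_dir)
```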
|
|
114
|
+
|
|
115
|
+
### Output
|
|
116
|
+
```
Preview generated: .dokkit/output/preview_<name>.pdf
Open this file to see how the filled document looks.
```
|
|
120
|
+
|
|
121
|
+
### Rules
|
|
122
|
+
- Inline command — do NOT fork to any agent
|
|
123
|
+
- If LibreOffice is not available, show error with install guidance
|
|
124
|
+
- Preview is temporary — `/dokkit export` creates the final output
|
|
125
|
+
|
|
126
|
+
---
|
|
127
|
+
|
|
128
|
+
## ingest
|
|
129
|
+
|
|
130
|
+
Parse one or more source documents and add them to the workspace for template filling.
|
|
131
|
+
|
|
132
|
+
### Arguments
|
|
133
|
+
One or more file paths (space-separated or comma-separated).
|
|
134
|
+
|
|
135
|
+
<example>
|
|
136
|
+
`/dokkit ingest docs/resume.pdf`
|
|
137
|
+
`/dokkit ingest docs/resume.pdf docs/financials.xlsx docs/photo.jpg`
|
|
138
|
+
</example>
|
|
139
|
+
|
|
140
|
+
### Procedure
|
|
141
|
+
|
|
142
|
+
1. Parse remaining arguments to extract file paths
|
|
143
|
+
2. Validate each file path exists. Show error for missing files, continue with valid ones.
|
|
144
|
+
3. **Auto-initialize workspace**: If `.dokkit/` does not exist, create it with initial state.json. Report: "Workspace initialized at .dokkit/"
|
|
145
|
+
4. **Ingest each file** sequentially by spawning the **dokkit-ingestor** agent:
|
|
146
|
+
- Pass the file path as context
|
|
147
|
+
- The agent parses the file, writes to `.dokkit/sources/`, updates `state.json`
|
|
148
|
+
- Report progress: "Ingested 1/3: resume.pdf (ready)"
|
|
149
|
+
5. **Show sources dashboard** after all files complete
|
|
150
|
+
|
|
151
|
+
### Delegation
|
|
152
|
+
For each file, spawn the dokkit-ingestor agent:
|
|
153
|
+
> "Ingest the source document at `<file_path>`. Follow the dokkit-ingestor agent instructions. The workspace is at `.dokkit/`."
|
|
154
|
+
|
|
155
|
+
### Rules
|
|
156
|
+
- Auto-initialize workspace if `.dokkit/` does not exist — do NOT tell user to run `/dokkit init`
|
|
157
|
+
- Supported formats: PDF, DOCX, XLSX, CSV, PPTX, HWPX, PNG, JPG, TXT, MD, JSON, HTML
|
|
158
|
+
- If a format is unsupported, show error with supported formats list and skip that file
|
|
159
|
+
- If no valid files are provided, show error with usage example
|
|
160
|
+
- Always show sources dashboard after ingestion completes
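
Argument parsing and the format check could look like the sketch below. It assumes the format is decided by file extension alone and uses only the extensions listed in the rules above; the function names are hypothetical, and paths that themselves contain commas are not handled.

```python
from pathlib import Path

# Extensions from the "Supported formats" rule, lowercased.
SUPPORTED = {"pdf", "docx", "xlsx", "csv", "pptx", "hwpx",
             "png", "jpg", "txt", "md", "json", "html"}


def split_paths(args: list[str]) -> list[str]:
    """Accept both space-separated and comma-separated file paths."""
    paths = []
    for arg in args:
        paths.extend(part.strip() for part in arg.split(",") if part.strip())
    return paths


def triage(paths: list[str]) -> tuple[list[str], list[tuple[str, str]]]:
    """Split paths into those to ingest and those to skip, with a reason."""
    accepted, rejected = [], []
    for p in paths:
        ext = Path(p).suffix.lower().lstrip(".")
        if ext not in SUPPORTED:
            rejected.append((p, "unsupported format"))
        elif not Path(p).is_file():
            rejected.append((p, "file not found"))
        else:
            accepted.append(p)
    return accepted, rejected
```

An empty `accepted` list corresponds to the "no valid files" rule, where a usage example is shown instead of spawning the ingestor.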
|
|
161
|
+
|
|
162
|
+
---
|
|
163
|
+
|
|
164
|
+
## fill
|
|
165
|
+
|
|
166
|
+
Fully automated document filling pipeline: analyze, fill, review, auto-fix, and export in one step.
|
|
167
|
+
|
|
168
|
+
### Arguments
|
|
169
|
+
File path to the template document (DOCX or HWPX).
|
|
170
|
+
|
|
171
|
+
<example>
|
|
172
|
+
`/dokkit fill docs/template.hwpx`
|
|
173
|
+
`/dokkit fill form.docx`
|
|
174
|
+
</example>
|
|
175
|
+
|
|
176
|
+
### Procedure
|
|
177
|
+
|
|
178
|
+
**Phase 1 — Validate**:
|
|
179
|
+
1. Validate the template exists and is DOCX or HWPX
|
|
180
|
+
2. Check `.dokkit/` workspace exists — if not, show error: "No workspace found. Run `/dokkit ingest <files>` first."
|
|
181
|
+
3. Check at least one source has status "ready" — if not, show error: "No sources ingested."
|
|
182
|
+
4. Report: "Starting fill pipeline with N sources -> template_name"
|
|
183
|
+
|
|
184
|
+
**Phase 2 — Analyze**:
|
|
185
|
+
5. Spawn the **dokkit-analyzer** agent to detect fields, map to sources, write `analysis.json`
|
|
186
|
+
6. Report: "Found N fields (X mapped, Y unmapped, Z images)"
|
|
187
|
+
|
|
188
|
+
**Phase 3 — Source Images**:
|
|
189
|
+
7. **Cell-level images**: For each `field_type: "image"` with `image_file: null` and `image_type: "figure"`:
|
|
190
|
+
- Run: `python scripts/source_images.py generate --prompt "<prompt>" --preset technical_illustration --output-dir .dokkit/images/ --project-dir . --lang ko`
|
|
191
|
+
- Parse `__RESULT__` JSON, update `analysis.json`
|
|
192
|
+
- Skip photo/signature types (require user-provided files)
|
|
193
|
+
- Default `--lang ko` (Korean only). Override with user instruction if needed.
|
|
194
|
+
8. **Section content images**: For each `image_opportunities` entry with `status: "pending"`:
|
|
195
|
+
- Run: `python scripts/source_images.py generate --prompt "<generation_prompt>" --preset <preset> --output-dir .dokkit/images/ --project-dir . --lang ko`
|
|
196
|
+
- On failure: set `status: "skipped"`, log reason
|
|
197
|
+
- Use `--lang ko+en` if the content contains technical terms that benefit from English (e.g., architecture diagrams with API names).
|
|
198
|
+
9. Report: "Sourced X/Y images"
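
The selection rule in step 7 and the command line it runs can be sketched as follows. The dictionary keys mirror the ones named above (`field_type`, `image_file`, `image_type`), but the overall shape of `analysis.json` is an assumption here, and the format of the `__RESULT__` output is not described in this file, so parsing it is left out.

```python
def needs_generation(field: dict) -> bool:
    """True for image fields of type "figure" that have no image file yet.

    A missing image_file key is treated like null (an assumption).
    """
    return (
        field.get("field_type") == "image"
        and field.get("image_file") is None
        and field.get("image_type") == "figure"
    )


def generate_command(prompt: str,
                     preset: str = "technical_illustration",
                     lang: str = "ko") -> list[str]:
    """Build the source_images.py invocation with the flags quoted in step 7."""
    return [
        "python", "scripts/source_images.py", "generate",
        "--prompt", prompt,
        "--preset", preset,
        "--output-dir", ".dokkit/images/",
        "--project-dir", ".",
        "--lang", lang,
    ]
```

Photo and signature fields fall through `needs_generation` as `False`, which matches the rule that they require user-provided files.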
|
|
199
|
+
|
|
200
|
+
**Phase 4 — Fill**:
|
|
201
|
+
10. Spawn the **dokkit-filler** agent in fill mode
|
|
202
|
+
|
|
203
|
+
**Phase 5 — Review and Auto-Fix Loop**:
|
|
204
|
+
11. Evaluate fill result: count fields by confidence, identify fixable issues
|
|
205
|
+
12. **Auto-fix**: For fixable issues, spawn **dokkit-filler** in modify mode
|
|
206
|
+
- Re-map low-confidence fields where better data exists
|
|
207
|
+
- Fix formatting issues (date formats, truncated text)
|
|
208
|
+
- Do NOT auto-fix: unfilled fields, image fields without sources
|
|
209
|
+
13. If auto-fix made changes, re-evaluate. Maximum 2 iterations.
|
|
210
|
+
14. Present **final review** table (section-by-section with confidence)
|
|
211
|
+
|
|
212
|
+
**Phase 6 — Export**:
|
|
213
|
+
15. Export in same format as input template via **dokkit-exporter** agent
|
|
214
|
+
16. Report output path and file size
|
|
215
|
+
|
|
216
|
+
**Phase 7 — Next Steps**:
|
|
217
|
+
17. Offer: `/dokkit modify "..."`, `/dokkit export pdf`, `/dokkit review`
|
|
218
|
+
|
|
219
|
+
### Delegation
|
|
220
|
+
|
|
221
|
+
**Agent 1 — Analyzer** (dokkit-analyzer):
|
|
222
|
+
> "Analyze the template at `<path>`. Detect all fillable fields INCLUDING image fields. Map to sources. Write `analysis.json`."
|
|
223
|
+
|
|
224
|
+
**Agent 2 — Filler** (dokkit-filler, fill mode):
|
|
225
|
+
> "Fill the template using `analysis.json`. Mode: fill. Insert images where `image_file` is populated. Interleave section content images at anchor points."
|
|
226
|
+
|
|
227
|
+
**Agent 2b — Filler** (dokkit-filler, modify mode — auto-fix, if needed):
|
|
228
|
+
> "Modify the filled document. Mode: modify. Fix: `<list of issues>`."
|
|
229
|
+
|
|
230
|
+
**Agent 3 — Exporter** (dokkit-exporter):
|
|
231
|
+
> "Export the filled document. Format: `<format>`. Compile from `.dokkit/template_work/` and save to `.dokkit/output/`."
|
|
232
|
+
|
|
233
|
+
### Rules
|
|
234
|
+
- At least one source must be ingested before filling
|
|
235
|
+
- Auto-fix loop runs maximum 2 iterations
|
|
236
|
+
- Auto-fix does NOT fill fields with missing source data
|
|
237
|
+
- Always show the full review table before exporting
|
|
238
|
+
- If any phase fails, show the error and stop — do NOT proceed
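
The bounded review loop from Phase 5 can be illustrated with the sketch below. The two callbacks are hypothetical stand-ins for the evaluation step and for spawning dokkit-filler in modify mode; only the control flow (re-evaluate after changes, at most 2 iterations) comes from the text above.

```python
from typing import Callable

MAX_AUTOFIX_ITERATIONS = 2


def review_with_autofix(
    evaluate: Callable[[], list[str]],
    apply_fixes: Callable[[list[str]], bool],
) -> tuple[list[str], int]:
    """Run evaluate/fix cycles and return (remaining issues, fix passes run).

    evaluate() returns the currently fixable issues; apply_fixes(issues)
    returns True if it changed the document.
    """
    issues = evaluate()
    passes = 0
    while issues and passes < MAX_AUTOFIX_ITERATIONS:
        changed = apply_fixes(issues)
        passes += 1
        if not changed:
            break  # nothing changed, so re-evaluating would not help
        issues = evaluate()
    return issues, passes
```

Whatever remains after the loop is what the final review table would surface to the user before export.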
|
|
239
|
+
|
|
240
|
+
---
|
|
241
|
+
|
|
242
|
+
## fill-doc
|
|
243
|
+
|
|
244
|
+
Analyze a template and fill its fields using ingested source data. Does NOT auto-fix or export.
|
|
245
|
+
|
|
246
|
+
### Arguments
|
|
247
|
+
File path to the template document (DOCX or HWPX).
|
|
248
|
+
|
|
249
|
+
<example>
|
|
250
|
+
`/dokkit fill-doc docs/template.docx`
|
|
251
|
+
</example>
|
|
252
|
+
|
|
253
|
+
### Procedure
|
|
254
|
+
|
|
255
|
+
1. Validate the template exists and is DOCX or HWPX
|
|
256
|
+
2. Check `.dokkit/` workspace exists with at least one ready source
|
|
257
|
+
3. **Analyze**: Spawn the **dokkit-analyzer** agent
|
|
258
|
+
4. **Source Images**: Same as `/dokkit fill` Phase 3 (cell-level + section content)
|
|
259
|
+
5. **Fill**: Spawn the **dokkit-filler** agent in fill mode
|
|
260
|
+
6. Present review summary
|
|
261
|
+
|
|
262
|
+
### Delegation
|
|
263
|
+
|
|
264
|
+
**First**: Spawn the dokkit-analyzer agent:
|
|
265
|
+
> "Analyze the template at `<path>`. Detect all fillable fields INCLUDING image fields. Map to sources. Write `analysis.json`."
|
|
266
|
+
|
|
267
|
+
**Image sourcing** (inline, between agents):
|
|
268
|
+
- **Pass A — Cell-level**: For `field_type: "image"` with `image_file: null` and `image_type: "figure"`, run `python scripts/source_images.py generate --prompt "..." --preset ... --output-dir .dokkit/images/ --project-dir . --lang ko`
|
|
269
|
+
- **Pass B — Section content**: For `image_opportunities` with `status: "pending"`, run `python scripts/source_images.py generate --prompt "..." --preset ... --output-dir .dokkit/images/ --project-dir . --lang ko`
|
|
270
|
+
- Default language is `ko` (Korean only). Use `--lang ko+en` for mixed content, or `--lang en` for English-only.
|
|
271
|
+
|
|
272
|
+
**Then**: Spawn the dokkit-filler agent in fill mode:
|
|
273
|
+
> "Fill the template using `analysis.json`. Mode: fill. Insert images where populated. Interleave section content images at anchor points."
|
|
274
|
+
|
|
275
|
+
### Rules
|
|
276
|
+
- Template must be DOCX or HWPX
|
|
277
|
+
- Analyzer runs FIRST, then filler
|
|
278
|
+
- Original template is never modified
|
|
279
|
+
|
|
280
|
+
---
|
|
281
|
+
|
|
282
|
+
## modify
|
|
283
|
+
|
|
284
|
+
Apply targeted changes to the filled document based on natural language instructions.
|
|
285
|
+
|
|
286
|
+
### Arguments
|
|
287
|
+
A natural language instruction describing the change.
|
|
288
|
+
|
|
289
|
+
<example>
|
|
290
|
+
`/dokkit modify "Change the phone number to 010-1234-5678"`
|
|
291
|
+
`/dokkit modify "Re-do the education section using the transcript"`
|
|
292
|
+
`/dokkit modify "Use YYYY-MM-DD format for all dates"`
|
|
293
|
+
</example>
|
|
294
|
+
|
|
295
|
+
### Procedure
|
|
296
|
+
|
|
297
|
+
1. Check `.dokkit/state.json` for an active filled document. If none, show error: "No filled document. Run `/dokkit fill <template>` first."
|
|
298
|
+
2. Spawn the **dokkit-filler** agent in modify mode
|
|
299
|
+
|
|
300
|
+
### Delegation
|
|
301
|
+
> "Modify the filled document. Mode: modify. User instruction: `<instruction>`. Read `analysis.json` for field locations and make surgical changes."
|
|
302
|
+
|
|
303
|
+
### Rules
|
|
304
|
+
- A filled document must exist
|
|
305
|
+
- Only modify targeted fields — do not re-process the entire document
|
|
306
|
+
- Manual overrides get confidence "high"
|
|
307
|
+
|
|
308
|
+
---
|
|
309
|
+
|
|
310
|
+
## review
|
|
311
|
+
|
|
312
|
+
Present the filled document for review with section-by-section confidence annotations.
|
|
313
|
+
|
|
314
|
+
### Arguments
|
|
315
|
+
Optional: section name or action.
|
|
316
|
+
|
|
317
|
+
<example>
|
|
318
|
+
`/dokkit review` — review all sections
|
|
319
|
+
`/dokkit review "Personal Information"` — review specific section
|
|
320
|
+
`/dokkit review approve` — mark document as finalized
|
|
321
|
+
</example>
|
|
322
|
+
|
|
323
|
+
### Procedure
|
|
324
|
+
|
|
325
|
+
1. Check `.dokkit/state.json` for an active filled document. If none, show error: "No filled document. Run `/dokkit fill <template>` first."
|
|
326
|
+
2. Spawn the **dokkit-filler** agent in review mode
|
|
327
|
+
|
|
328
|
+
### Delegation
|
|
329
|
+
> "Review the filled document. Mode: review. Read `analysis.json` and present section-by-section review with confidence annotations."
|
|
330
|
+
|
|
331
|
+
If section or action specified:
|
|
332
|
+
> "Focus on section: `<section>` / Action: `<action>`"
|
|
333
|
+
|
|
334
|
+
### Rules
|
|
335
|
+
- A filled document must exist
|
|
336
|
+
- Review is read-only: it shows status and never edits field content; the `approve` action is the only state change
|
|
337
|
+
- "approve" action sets document status to "finalized"
|
|
338
|
+
|
|
339
|
+
---
|
|
340
|
+
|
|
341
|
+
## export
|
|
342
|
+
|
|
343
|
+
Compile and export the filled document in the specified format.
|
|
344
|
+
|
|
345
|
+
### Arguments
|
|
346
|
+
Output format: `docx`, `hwpx`, or `pdf`.
|
|
347
|
+
|
|
348
|
+
<example>
|
|
349
|
+
`/dokkit export docx`
|
|
350
|
+
`/dokkit export pdf`
|
|
351
|
+
</example>
|
|
352
|
+
|
|
353
|
+
### Procedure
|
|
354
|
+
|
|
355
|
+
1. Check `.dokkit/state.json` for a filled document. If none, show error: "No filled document. Run `/dokkit fill <template>` first."
|
|
356
|
+
2. Validate the requested format is supported
|
|
357
|
+
3. Spawn the **dokkit-exporter** agent
|
|
358
|
+
|
|
359
|
+
### Delegation
|
|
360
|
+
> "Export the filled document. Format: `<format>`. Compile from `.dokkit/template_work/` and save to `.dokkit/output/`."
|
|
361
|
+
|
|
362
|
+
### Rules
|
|
363
|
+
- Supported formats: docx, hwpx, pdf
|
|
364
|
+
- Cross-format exports show a warning about potential formatting differences
|
|
365
|
+
- Same-format exports preserve 100% formatting fidelity
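
The format rules above reduce to a small check before the exporter is spawned. The sketch below is illustrative only: `plan_export` and the shape of its return value are hypothetical, and it assumes PDF always counts as a cross-format export because templates are DOCX or HWPX.

```python
SUPPORTED_EXPORTS = ("docx", "hwpx", "pdf")


def plan_export(template_format: str, requested: str) -> dict:
    """Validate the requested format and decide whether to warn the user."""
    fmt = requested.lower()
    if fmt not in SUPPORTED_EXPORTS:
        raise ValueError(
            f"Unsupported format {requested!r}; choose one of {', '.join(SUPPORTED_EXPORTS)}"
        )
    cross_format = fmt != template_format.lower()
    warning = (
        "Cross-format export: formatting may differ from the template."
        if cross_format else None
    )
    return {"format": fmt, "cross_format": cross_format, "warning": warning}
```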
|
|
@@ -0,0 +1,76 @@
|
|
|
1
|
+
# DOCX XML Knowledge
|
|
2
|
+
|
|
3
|
+
Open XML structure for surgical DOCX document editing.
|
|
4
|
+
|
|
5
|
+
## DOCX Structure
|
|
6
|
+
|
|
7
|
+
A DOCX file is a ZIP archive:
|
|
8
|
+
```
[Content_Types].xml          — MIME type mappings
_rels/.rels                  — root relationships
word/
  document.xml               — main document body (PRIMARY TARGET)
  styles.xml                 — style definitions
  numbering.xml              — list numbering definitions
  settings.xml               — document settings
  fontTable.xml              — font declarations
  theme/theme1.xml           — theme colors/fonts
  media/                     — embedded images
  _rels/document.xml.rels    — document relationships
docProps/
  app.xml                    — application metadata
  core.xml                   — document metadata
```
|
|
24
|
+
|
|
25
|
+
## Key XML Elements
|
|
26
|
+
|
|
27
|
+
### Namespace
|
|
28
|
+
```xml
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
```
|
|
31
|
+
|
|
32
|
+
### Document Body
|
|
33
|
+
```xml
<w:body>
  <w:p>                <!-- paragraph -->
    <w:pPr>            <!-- paragraph properties -->
    <w:r>              <!-- run (text with formatting) -->
      <w:rPr>          <!-- run properties (font, size, bold, etc.) -->
      <w:t>            <!-- text content -->
    </w:r>
  </w:p>
</w:body>
```
|
|
44
|
+
|
|
45
|
+
### Tables
|
|
46
|
+
```xml
<w:tbl>
  <w:tblPr>            <!-- table properties -->
  <w:tblGrid>          <!-- column widths -->
  <w:tr>               <!-- table row -->
    <w:trPr>           <!-- row properties -->
    <w:tc>             <!-- table cell -->
      <w:tcPr>         <!-- cell properties (width, merge, borders) -->
      <w:p>            <!-- cell content (paragraph) -->
    </w:tc>
  </w:tr>
</w:tbl>
```
|
|
59
|
+
|
|
60
|
+
### Content Controls (Structured Document Tags)
|
|
61
|
+
```xml
<w:sdt>
  <w:sdtPr>
    <w:alias w:val="FieldName"/>
    <w:tag w:val="field_tag"/>
  </w:sdtPr>
  <w:sdtContent>
    <w:p><w:r><w:t>Placeholder</w:t></w:r></w:p>
  </w:sdtContent>
</w:sdt>
```
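
As an illustration of how the structure above can be read programmatically, the sketch below lists content controls with the standard-library `xml.etree.ElementTree`. It is not taken from the skill's scripts; the function name and the returned dictionary shape are hypothetical. Only the namespace URI and element names come from this file.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}
VAL = f"{{{W}}}val"  # the w:val attribute in Clark notation


def list_content_controls(document_xml: str) -> list[dict]:
    """Return alias, tag, and current text for every w:sdt in the document."""
    root = ET.fromstring(document_xml)
    controls = []
    for sdt in root.iter(f"{{{W}}}sdt"):
        props = sdt.find("w:sdtPr", NS)
        alias = props.find("w:alias", NS) if props is not None else None
        tag = props.find("w:tag", NS) if props is not None else None
        # Concatenate all text runs inside the control's content.
        text = "".join(
            t.text or "" for t in sdt.iterfind(".//w:sdtContent//w:t", NS)
        )
        controls.append({
            "alias": alias.get(VAL) if alias is not None else None,
            "tag": tag.get(VAL) if tag is not None else None,
            "text": text,
        })
    return controls
```

Note the explicit `is not None` checks: an `Element` without children is falsy, so a plain truthiness test would wrongly skip `<w:alias/>` and `<w:tag/>`.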
|
|
72
|
+
|
|
73
|
+
## References
|
|
74
|
+
|
|
75
|
+
See `references/docx-structure.md` for unpacking, repackaging, and critical rules.
|
|
76
|
+
See `references/docx-field-patterns.md` for field detection patterns (placeholders, empty cells, underline, content controls, instruction text, tip boxes).
|
|
@@ -0,0 +1,102 @@
|
|
|
1
|
+
# Export Knowledge
|
|
2
|
+
|
|
3
|
+
Document compilation and format conversion for the dokkit-exporter agent.
|
|
4
|
+
|
|
5
|
+
## Compilation (Repackaging)
|
|
6
|
+
|
|
7
|
+
### DOCX Compilation
|
|
8
|
+
```python
import os
import zipfile

def compile_docx(work_dir: str, output_path: str) -> str:
    """Repackage a DOCX from its unpacked working directory."""
    with zipfile.ZipFile(output_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(work_dir):
            for file in files:
                file_path = os.path.join(root, file)
                # Archive names are relative to work_dir, so
                # [Content_Types].xml lands at the root of the package.
                arcname = os.path.relpath(file_path, work_dir)
                zf.write(file_path, arcname)
    return output_path
```
|
|
21
|
+
|
|
22
|
+
### HWPX Compilation
|
|
23
|
+
```python
import os
import zipfile

def compile_hwpx(work_dir: str, output_path: str) -> str:
    """Repackage HWPX. CRITICAL: mimetype must be first and uncompressed."""
    with zipfile.ZipFile(output_path, 'w') as zf:
        # Write mimetype before any other entry, stored without compression.
        mimetype_path = os.path.join(work_dir, "mimetype")
        if os.path.exists(mimetype_path):
            zf.write(mimetype_path, "mimetype", compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(work_dir):
            for file in sorted(files):
                # mimetype is already in the archive; .bak files are skipped.
                if file == "mimetype" or file.endswith(".bak"):
                    continue
                file_path = os.path.join(root, file)
                arcname = os.path.relpath(file_path, work_dir)
                zf.write(file_path, arcname, compress_type=zipfile.ZIP_DEFLATED)
    return output_path
```
|
|
41
|
+
|
|
42
|
+
### Scripts
|
|
43
|
+
```bash
python .claude/skills/dokkit/scripts/compile_hwpx.py <work_dir> <output.hwpx>
python .claude/skills/dokkit/scripts/export_pdf.py <input> <output.pdf>
```
|
|
47
|
+
|
|
48
|
+
## PDF Conversion
|
|
49
|
+
|
|
50
|
+
### Using LibreOffice
|
|
51
|
+
```bash
soffice --headless --convert-to pdf --outdir <output_dir> <input_file>
```
|
|
54
|
+
|
|
55
|
+
### Using Python Script
|
|
56
|
+
```bash
python .claude/skills/dokkit/scripts/export_pdf.py <input> <output.pdf>
```
|
|
59
|
+
|
|
60
|
+
## Cross-Format Conversion
|
|
61
|
+
|
|
62
|
+
Use LibreOffice as intermediary:
|
|
63
|
+
```bash
soffice --headless --convert-to hwpx --outdir <dir> <input.docx>
soffice --headless --convert-to docx --outdir <dir> <input.hwpx>
```
|
|
67
|
+
|
|
68
|
+
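Cross-format conversion may lose formatting fidelity, and it works only if the installed LibreOffice build has an import filter for the input format and an export filter for the target format. Always warn the user.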
|
|
69
|
+
|
|
70
|
+
## Validation
|
|
71
|
+
|
|
72
|
+
After compilation, verify:
|
|
73
|
+
1. Output file is a valid ZIP archive
|
|
74
|
+
2. File size is reasonable (> 0 bytes)
|
|
75
|
+
3. For DOCX: `[Content_Types].xml` exists at root
|
|
76
|
+
4. For HWPX: `mimetype` is the first entry, is stored uncompressed, and holds the correct value
|
|
77
|
+
|
|
78
|
+
```python
import zipfile

def validate_archive(path: str, doc_type: str) -> list[str]:
    """Return a list of structural problems; an empty list means the archive passed."""
    errors = []
    try:
        with zipfile.ZipFile(path, 'r') as zf:
            names = zf.namelist()  # entries in the order they were written
            if doc_type == "docx":
                if "[Content_Types].xml" not in names:
                    errors.append("Missing [Content_Types].xml")
            elif doc_type == "hwpx":
                if not names or names[0] != "mimetype":
                    errors.append("mimetype is not the first entry")
    except zipfile.BadZipFile:
        errors.append("Output is not a valid ZIP archive")
    return errors
```
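
The function above checks only the position of `mimetype`. A stricter check covering the compression method and, optionally, the value could look like the sketch below. The helper name is hypothetical, and the expected value is passed in by the caller because this file does not state what it should be.

```python
import zipfile
from typing import Optional


def check_hwpx_mimetype(path: str, expected: Optional[bytes] = None) -> list[str]:
    """Check that mimetype is first, stored uncompressed, and (optionally) has a given value."""
    errors = []
    with zipfile.ZipFile(path) as zf:
        infos = zf.infolist()
        if not infos or infos[0].filename != "mimetype":
            return ["mimetype is not the first entry"]
        # ZIP_STORED means the entry was written without compression.
        if infos[0].compress_type != zipfile.ZIP_STORED:
            errors.append("mimetype is compressed")
        # Only compare the value when the caller supplies the expected bytes.
        if expected is not None and zf.read("mimetype").strip() != expected:
            errors.append("mimetype has an unexpected value")
    return errors
```

A reasonable way to obtain `expected` is to read the `mimetype` entry of the original template before filling and pass those bytes in.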
|
|
96
|
+
|
|
97
|
+
## Rules
|
|
98
|
+
|
|
99
|
+
- Never modify filled XML during export — only repackage
|
|
100
|
+
- ZIP structure must match original (`[Content_Types].xml` at root for DOCX, `mimetype` first for HWPX)
|
|
101
|
+
- Skip .bak files during HWPX compilation
|
|
102
|
+
- Report clear errors if conversion tools unavailable
|