kordoc 2.2.5 → 2.3.0
- package/README.md +16 -4
- package/dist/{chunk-UU2O6D3R.js → chunk-JFTFC2BB.js} +2 -2
- package/dist/{chunk-JH5XLWJQ.js.map → chunk-JFTFC2BB.js.map} +1 -1
- package/dist/{chunk-5Y2Q3BRW.js → chunk-M3E3C5GS.js} +8 -1
- package/dist/chunk-M3E3C5GS.js.map +1 -0
- package/dist/{chunk-RQWICKON.js → chunk-OEJJPCMM.js} +369 -73
- package/dist/chunk-OEJJPCMM.js.map +1 -0
- package/dist/{chunk-JH5XLWJQ.js → chunk-Z7UPTVMX.js} +2 -2
- package/dist/{chunk-UU2O6D3R.js.map → chunk-Z7UPTVMX.js.map} +1 -1
- package/dist/{chunk-OJ4QR33V.cjs → chunk-ZNJPRRIA.cjs} +2 -2
- package/dist/{chunk-OJ4QR33V.cjs.map → chunk-ZNJPRRIA.cjs.map} +1 -1
- package/dist/cli.js +7 -4
- package/dist/cli.js.map +1 -1
- package/dist/{detect-GYK3HKD5.js → detect-I7YIS4Q6.js} +4 -2
- package/dist/index.cjs +463 -160
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +4 -2
- package/dist/index.d.ts +4 -2
- package/dist/index.js +387 -84
- package/dist/index.js.map +1 -1
- package/dist/mcp.js +5 -5
- package/dist/{parser-OIRWPKIQ.js → parser-25LF2S2J.js} +45 -42
- package/dist/{parser-OIRWPKIQ.js.map → parser-25LF2S2J.js.map} +1 -1
- package/dist/{parser-PXD73E4H.js → parser-4LKJXBPP.js} +45 -42
- package/dist/{parser-PXD73E4H.js.map → parser-4LKJXBPP.js.map} +1 -1
- package/dist/{parser-CYBX5MP4.cjs → parser-KBQZB3QY.cjs} +61 -58
- package/dist/{parser-CYBX5MP4.cjs.map → parser-KBQZB3QY.cjs.map} +1 -1
- package/dist/{watch-NSBABJ4A.js → watch-GXRBLW3Y.js} +4 -4
- package/package.json +2 -2
- package/dist/chunk-5Y2Q3BRW.js.map +0 -1
- package/dist/chunk-RQWICKON.js.map +0 -1
- package/dist/{detect-GYK3HKD5.js.map → detect-I7YIS4Q6.js.map} +0 -0
- package/dist/{watch-NSBABJ4A.js.map → watch-GXRBLW3Y.js.map} +0 -0
package/README.md
CHANGED
@@ -19,7 +19,7 @@ HWP, HWPX, PDF, XLSX, DOCX — 관공서에서 쏟아지는 모든 문서를 파
 
 Beyond simple text extraction, it automates **the entire workflow for processing official documents**.
 
-* **📄 Any document to Markdown**: Instantly converts `HWP`, `HWPX`, `PDF`, `XLSX`, and `DOCX` files to `Markdown`, the form that is easiest for an AI (LLM) to read and analyze.
+* **📄 Any document to Markdown**: Instantly converts `HWP`, `HWPX`, `HWPML`, `PDF`, `XLSX`, and `DOCX` files to `Markdown`, the form that is easiest for an AI (LLM) to read and analyze.
 * **📊 Perfect reproduction of complex tables**: Analyzes the structure of borderless PDF tables and heavily merged HWP tables and restores them as accurate Markdown tables.
 * **🔍 Automatic old/new comparison tables**: Analyzes the differences between two documents and shows what changed at a glance. (Comparing HWP against HWPX works too!)
 * **📝 Markdown back to HWPX**: Converts AI-written content back into report format (`HWPX`). Free yourself from copy-paste drudgery.
@@ -28,7 +28,15 @@ HWP, HWPX, PDF, XLSX, DOCX — 관공서에서 쏟아지는 모든 문서를 파
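 
 ---
 
-## v2.
+## v2.3.0 Changes
+
+- **📄 HWPML 2.x parser added**: Adds parsing support for XML-based Hancom documents (`.hwp` in XML form). XML-based official documents that used to fail in `npx kordoc <file.hwp>` with the `지원하지 않는 파일 형식` ("unsupported file format") error can now be converted to Markdown. They are told apart from HWP 5.x binaries automatically (XML signature detection).
+- **🧩 Nested table markers**: In HWPX/HWP5, inserts a `[중첩 테이블 #N]` ("nested table #N") marker where a nested table sat inside a cell. Large nested tables (≥3 rows and ≥2 columns) are split out into separate blocks; small ones are flattened inside the cell. For HWP5, content that used to be lost entirely is now recovered through the marker.
+- **🖼️ HWPX image extraction bug fix**: Fixes image extraction failing for HWPX files whose `binaryItemIDRef` is stored without an extension (`"image1"`). The entry is recovered by regex-matching file names inside the ZIP.
+- **📄 Improved PDF header/footer detection**: A hybrid of repeated-text patterns and y-coordinate clustering. Dynamic headers that change from page to page (chapter titles and the like) are also detected by position. The detection zone is widened from 10% to 12%.
+
+<details>
+<summary>v2.2.4 Changes</summary>
 
 - **📝 Form Filler**: Automatically fills values into official-document form templates. Supports label-value cell patterns, checkboxes (`□`→`☑`), parenthesized blanks (`일반( )통`→`일반(3)통`), and annotations (`(한자:)`→`(한자:金)`).
 - **🏛️ HWPX original-formatting preservation mode**: `fillHwpx()` manipulates the HWPX XML directly and replaces only the values while keeping 100% of the original formatting (font, size, alignment, and so on).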
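The nested-table rule in the v2.3.0 notes (nested tables with ≥3 rows and ≥2 columns become separate blocks, smaller ones are flattened into the cell) can be sketched as below. This is a minimal illustration, not kordoc's implementation: `placeNestedTable`, `nestedTableMarker`, the two constants, and the 1-based numbering of `#N` are all assumptions made for the example.

```typescript
// Hypothetical sketch: these names are illustrative, not part of kordoc's API.
type NestedTablePlacement = "separate-block" | "flatten-in-cell";

interface NestedTableSize {
  rows: number;
  cols: number;
}

// Thresholds quoted in the v2.3.0 notes: at least 3 rows and at least 2 columns.
const NESTED_MIN_ROWS = 3;
const NESTED_MIN_COLS = 2;

// Decide whether a nested table is split out or flattened into its parent cell.
function placeNestedTable(size: NestedTableSize): NestedTablePlacement {
  const isLarge = size.rows >= NESTED_MIN_ROWS && size.cols >= NESTED_MIN_COLS;
  return isLarge ? "separate-block" : "flatten-in-cell";
}

// Marker text left where the nested table used to be; N is assumed to be 1-based.
function nestedTableMarker(n: number): string {
  return `[중첩 테이블 #${n}]`;
}
```

Both conditions must hold, so a tall single-column nested table still counts as small under this reading of the notes.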
@@ -36,6 +44,8 @@ HWP, HWPX, PDF, XLSX, DOCX — 관공서에서 쏟아지는 모든 문서를 파
 - **🔧 markdownToHwpx formatting improvements**: Greatly improved support for heading/bold/italic/table formatting in the reverse conversion.
 - **🤖 MCP fill_form tool**: Adds a new MCP tool that lets AI agents fill in forms directly (8 tools in total).
 
+</details>
+
 <details>
 <summary>v2.2.1 Changes</summary>
 
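The HWPX image fix in the v2.3.0 notes recovers an extension-less `binaryItemIDRef` such as `"image1"` by regex-matching file names inside the ZIP. A minimal sketch of that idea follows, under stated assumptions: `resolveBinaryItem` and `escapeRegExp` are hypothetical helpers, and the matching rule (same base name, optional extension, any folder) is a guess at a reasonable policy rather than the library's actual pattern.

```typescript
// Hypothetical helpers; kordoc's real matching rule is not shown in this diff.

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Find the ZIP entry whose file name equals `ref`, with or without an extension.
function resolveBinaryItem(ref: string, entryNames: string[]): string | undefined {
  const pattern = new RegExp(`(?:^|/)${escapeRegExp(ref)}(?:\\.[A-Za-z0-9]+)?$`, "i");
  return entryNames.find((name) => pattern.test(name));
}
```

Anchoring the match at a path separator and at the end of the name keeps `image1` from matching `image10.png`.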
@@ -266,7 +276,7 @@ npx kordoc watch ./문서 --webhook https://api/hook # 웹훅 알림
   "mcpServers": {
     "kordoc": {
       "command": "npx",
-      "args": ["-y", "kordoc
+      "args": ["-y", "kordoc", "mcp"]
     }
   }
 }
@@ -297,7 +307,8 @@ npx kordoc watch ./문서 --webhook https://api/hook # 웹훅 알림
 | `parsePdf(buffer, options?)` | PDF only |
 | `parseXlsx(buffer, options?)` | XLSX only |
 | `parseDocx(buffer, options?)` | DOCX only |
-| `
+| `parseHwpml(buffer, options?)` | HWPML (XML-based HWP) only |
+| `detectFormat(buffer)` | `"hwpx" \| "hwp" \| "hwpml" \| "pdf" \| "xlsx" \| "docx" \| "unknown"` |
 
 ### Advanced functions
 
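The table rows above add `detectFormat(buffer)` with a new `"hwpml"` result, and the release notes say XML-based files are separated from HWP 5.x binaries by signature. The sketch below shows how signature sniffing of this kind can work in general. It is not kordoc's `detectFormat`: `sniffContainer` and `startsWithBytes` are hypothetical, and the sketch stops at the container level. It relies only on well-known magic numbers (OLE2/CFB `D0 CF 11 E0 A1 B1 1A E1`, ZIP local header `PK\x03\x04`, `%PDF`).

```typescript
// Hypothetical sketch of signature sniffing; not kordoc's detectFormat().
type SniffedKind = "ole2" | "zip" | "pdf" | "xml" | "unknown";

// True when `data` contains the bytes of `magic` starting at `offset`.
function startsWithBytes(data: Uint8Array, magic: number[], offset = 0): boolean {
  if (data.length < offset + magic.length) return false;
  return magic.every((byte, i) => data[offset + i] === byte);
}

function sniffContainer(data: Uint8Array): SniffedKind {
  // OLE2 / Compound File Binary: the container used by HWP 5.x.
  if (startsWithBytes(data, [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1])) return "ole2";
  // ZIP local file header: shared by HWPX, XLSX, and DOCX.
  if (startsWithBytes(data, [0x50, 0x4b, 0x03, 0x04])) return "zip";
  // "%PDF"
  if (startsWithBytes(data, [0x25, 0x50, 0x44, 0x46])) return "pdf";

  // XML: skip an optional UTF-8 BOM and ASCII whitespace, then expect "<".
  let i = startsWithBytes(data, [0xef, 0xbb, 0xbf]) ? 3 : 0;
  while (i < data.length && (data[i] === 0x20 || data[i] === 0x09 || data[i] === 0x0a || data[i] === 0x0d)) {
    i++;
  }
  if (i < data.length && data[i] === 0x3c) return "xml";

  return "unknown";
}
```

A `"zip"` or `"xml"` result is only a first step: telling HWPX, XLSX, and DOCX apart requires looking inside the archive, and telling HWPML from other XML requires checking the document itself, both of which this sketch leaves out.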
@@ -330,6 +341,7 @@ import type {
 |------|------|------|
 | **HWPX** (Hancom 2020+) | ZIP + XML DOM | manifest, nested tables, merged cells, corrupted-ZIP recovery |
 | **HWP 5.x** (Hancom legacy) | OLE2 + CFB | decryption of distribution documents, corrupted-CFB recovery, footnotes/hyperlinks, 21 kinds of control characters, image extraction |
+| **HWPML 2.x** (XML-based HWP) | XML DOM | HeadingType-based heading detection, merged cells, DoS protection |
 | **PDF** | pdfjs-dist | line-based tables, XY-Cut reading order, heading detection, OCR |
 | **XLSX** (Excel) | ZIP + XML DOM | shared strings, merged cells, multiple sheets, formula display |
 | **DOCX** (Word) | ZIP + XML DOM | style-based headings, numbering, footnotes, image extraction |
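For the PDF row, the v2.3.0 notes describe header/footer detection as a hybrid of repeated-text patterns and y-coordinate clustering over a 12% edge zone. The sketch below is one possible reading of that description, not the library's algorithm. Every name is hypothetical, `y` is assumed to be measured from the top of the page, fixed-width bands stand in for real clustering, and the 60% page share and the band height of 2 are invented for the example; only the 12% zone comes from the notes.

```typescript
// Hypothetical sketch; none of these names or thresholds (except 12%) come from kordoc.
interface PageTextItem {
  text: string;
  y: number; // distance from the top edge, same unit as PageItems.height (assumption)
}

interface PageItems {
  height: number;
  items: PageTextItem[];
}

const EDGE_ZONE_RATIO = 0.12; // detection zone quoted in the v2.3.0 notes
const MIN_PAGE_SHARE = 0.6; // assumed share of pages that must agree
const Y_BAND = 2; // assumed band height used to group y-coordinates

function inEdgeZone(y: number, pageHeight: number): boolean {
  const zone = pageHeight * EDGE_ZONE_RATIO;
  return y <= zone || y >= pageHeight - zone;
}

// Collapse digits so "Page 1" and "Page 2" count as the same repeated text.
function repeatedTextKey(text: string): string {
  return text.replace(/\d+/g, "#").replace(/\s+/g, " ").trim();
}

// Returns "pageIndex:itemIndex" keys for items that look like headers or footers.
function findHeaderFooterItems(pages: PageItems[]): Set<string> {
  const flagged = new Set<string>();
  if (pages.length < 3) return flagged; // too few pages to call anything repeated

  const pagesByText = new Map<string, Set<number>>();
  const pagesByBand = new Map<number, Set<number>>();
  const candidates: { page: number; index: number; textKey: string; band: number }[] = [];

  pages.forEach((page, pageIndex) => {
    page.items.forEach((item, index) => {
      if (!inEdgeZone(item.y, page.height)) return;
      const textKey = repeatedTextKey(item.text);
      if (!textKey) return;
      const band = Math.round(item.y / Y_BAND);
      candidates.push({ page: pageIndex, index, textKey, band });

      const textPages = pagesByText.get(textKey) ?? new Set<number>();
      textPages.add(pageIndex);
      pagesByText.set(textKey, textPages);

      const bandPages = pagesByBand.get(band) ?? new Set<number>();
      bandPages.add(pageIndex);
      pagesByBand.set(band, bandPages);
    });
  });

  // Flag a candidate if its text OR its vertical position recurs on enough pages.
  const minPages = Math.ceil(pages.length * MIN_PAGE_SHARE);
  for (const c of candidates) {
    const byText = (pagesByText.get(c.textKey)?.size ?? 0) >= minPages;
    const byPosition = (pagesByBand.get(c.band)?.size ?? 0) >= minPages;
    if (byText || byPosition) flagged.add(`${c.page}:${c.index}`);
  }
  return flagged;
}
```

The position pass is what lets a chapter title that changes on every page still be flagged, because its text never repeats but its vertical position does.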
package/dist/{chunk-UU2O6D3R.js → chunk-JFTFC2BB.js}
CHANGED
@@ -1,5 +1,5 @@
 // src/utils.ts
-var VERSION = true ? "2.
+var VERSION = true ? "2.3.0" : "0.0.0-dev";
 function toArrayBuffer(buf) {
   if (buf.byteOffset === 0 && buf.byteLength === buf.buffer.byteLength) {
     return buf.buffer;
@@ -447,4 +447,4 @@ export {
   HEADING_RATIO_H2,
   HEADING_RATIO_H3
 };
-//# sourceMappingURL=chunk-
+//# sourceMappingURL=chunk-JFTFC2BB.js.map
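The bundled line `var VERSION = true ? "2.3.0" : "0.0.0-dev";` is what remains after a build-time substitution. According to the `src/utils.ts` source embedded in the source map for this chunk, the version is injected through a tsup `define` and guarded with a `typeof` check. The sketch below restates that source-side pattern so it can be run unbundled; `VERSION_SKETCH` is a hypothetical name used here to keep the example separate from the package's own export.

```typescript
// Placeholder that the bundler replaces at build time; it does not exist when run from source.
declare const __KORDOC_VERSION__: string;

// `typeof` on an undeclared identifier is safe at runtime, so this falls back
// to the dev version whenever no substitution has happened.
const VERSION_SKETCH: string =
  typeof __KORDOC_VERSION__ !== "undefined" ? __KORDOC_VERSION__ : "0.0.0-dev";
```

Once the bundler substitutes the placeholder, the condition folds to a constant, which is why only the version string differs between the two releases in this hunk.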
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"sources":["../src/utils.ts","../src/table/builder.ts","../src/types.ts"],"sourcesContent":["/** kordoc 공용 유틸리티 */\n\n/** 빌드 타임에 tsup define으로 주입되는 버전 */\ndeclare const __KORDOC_VERSION__: string\nexport const VERSION: string = typeof __KORDOC_VERSION__ !== \"undefined\" ? __KORDOC_VERSION__ : \"0.0.0-dev\"\n\n/**\n * Node.js Buffer → ArrayBuffer 변환\n * pool Buffer의 공유 ArrayBuffer 문제를 안전하게 처리.\n * offset=0이고 전체 ArrayBuffer를 차지하면 복사 없이 직접 반환.\n */\nexport function toArrayBuffer(buf: Buffer): ArrayBuffer {\n if (buf.byteOffset === 0 && buf.byteLength === buf.buffer.byteLength) {\n return buf.buffer as ArrayBuffer\n }\n return buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength) as ArrayBuffer\n}\n\n/**\n * kordoc 내부 에러 클래스 — 사용자에게 노출해도 안전한 메시지만 포함.\n * MCP 에러 정제에서 instanceof로 판별하여 allowlist 패턴 매칭 없이 안전하게 통과.\n */\nexport class KordocError extends Error {\n constructor(message: string) {\n super(message)\n this.name = \"KordocError\"\n }\n}\n\n/**\n * 에러 메시지 정제 — KordocError는 그대로, 나머지는 일반 메시지로 대체.\n * 파일시스템 경로, 스택 트레이스 등 내부 정보 노출 방지.\n */\nexport function sanitizeError(err: unknown): string {\n if (err instanceof KordocError) return err.message\n return \"문서 처리 중 오류가 발생했습니다\"\n}\n\n/**\n * ZIP 엔트리 경로의 경로 순회 여부 판별.\n * 백슬래시 정규화, .., 절대경로, Windows 드라이브 문자 모두 차단.\n */\nexport function isPathTraversal(name: string): boolean {\n if (name.includes(\"\\x00\")) return true\n const normalized = name.replace(/\\\\/g, \"/\")\n const segments = normalized.split(\"/\")\n return segments.some(s => s === \"..\") || normalized.startsWith(\"/\") || /^[A-Za-z]:/.test(normalized)\n}\n\n// ─── ZIP 안전 로딩 (ZIP bomb 방지) ────────────────────\n\n/**\n * ZIP bomb 사전 검사 — Central Directory에서 비압축 합계와 엔트리 수 확인.\n * HWPX/XLSX/DOCX 등 모든 ZIP 기반 포맷에서 공통 사용.\n */\nexport function precheckZipSize(\n buffer: ArrayBuffer,\n maxUncompressedSize = 100 * 1024 * 1024,\n maxEntries = 500,\n): { totalUncompressed: number; entryCount: number } {\n try {\n const data = new 
DataView(buffer)\n const len = buffer.byteLength\n // EOCD 시그니처 역방향 스캔\n let eocdOffset = -1\n for (let i = len - 22; i >= Math.max(0, len - 65557); i--) {\n if (data.getUint32(i, true) === 0x06054b50) { eocdOffset = i; break }\n }\n if (eocdOffset < 0) return { totalUncompressed: 0, entryCount: 0 }\n\n const entryCount = data.getUint16(eocdOffset + 10, true)\n if (entryCount > maxEntries) {\n throw new KordocError(`ZIP 엔트리 수 초과: ${entryCount} (최대 ${maxEntries})`)\n }\n\n const cdSize = data.getUint32(eocdOffset + 12, true)\n const cdOffset = data.getUint32(eocdOffset + 16, true)\n if (cdOffset + cdSize > len) return { totalUncompressed: 0, entryCount }\n\n let totalUncompressed = 0\n let pos = cdOffset\n for (let i = 0; i < entryCount && pos + 46 <= cdOffset + cdSize; i++) {\n if (data.getUint32(pos, true) !== 0x02014b50) break\n totalUncompressed += data.getUint32(pos + 24, true)\n const nameLen = data.getUint16(pos + 28, true)\n const extraLen = data.getUint16(pos + 30, true)\n const commentLen = data.getUint16(pos + 32, true)\n pos += 46 + nameLen + extraLen + commentLen\n }\n\n if (totalUncompressed > maxUncompressedSize) {\n throw new KordocError(`ZIP 비압축 크기 초과: ${(totalUncompressed / 1024 / 1024).toFixed(1)}MB (최대 ${maxUncompressedSize / 1024 / 1024}MB)`)\n }\n\n return { totalUncompressed, entryCount }\n } catch (err) {\n if (err instanceof KordocError) throw err\n return { totalUncompressed: 0, entryCount: 0 }\n }\n}\n\n/** XXE/Billion Laughs 방지 — DOCTYPE 제거 (내부 DTD 서브셋 포함) */\nexport function stripDtd(xml: string): string {\n return xml.replace(/<!DOCTYPE\\s[^[>]*(\\[[\\s\\S]*?\\])?\\s*>/gi, \"\")\n}\n\n/** 하이퍼링크 URL 살균 — javascript: 등 XSS 위험 스킴 차단 */\nconst SAFE_HREF_RE = /^(?:https?:|mailto:|tel:|#)/i\nexport function sanitizeHref(href: string): string | null {\n const trimmed = href.trim()\n if (!trimmed || !SAFE_HREF_RE.test(trimmed)) return null\n return trimmed\n}\n\n// ─── 안전한 min/max (스택 오버플로 방지) ─────────────\n\n/** Math.min(...arr) 대체 — 대형 배열에서 
스택 오버플로 방지 */\nexport function safeMin(arr: number[]): number {\n let min = Infinity\n for (let i = 0; i < arr.length; i++) if (arr[i] < min) min = arr[i]\n return min\n}\n\n/** Math.max(...arr) 대체 — 대형 배열에서 스택 오버플로 방지 */\nexport function safeMax(arr: number[]): number {\n let max = -Infinity\n for (let i = 0; i < arr.length; i++) if (arr[i] > max) max = arr[i]\n return max\n}\n\n// ─── 에러 분류 ──────────────────────────────────────\n\nimport type { ErrorCode } from \"./types.js\"\n\n/** 에러를 구조화된 ErrorCode로 분류 — KordocError 메시지 패턴 매칭 */\nexport function classifyError(err: unknown): ErrorCode {\n if (!(err instanceof Error)) return \"PARSE_ERROR\"\n const msg = err.message\n if (msg.includes(\"암호화\")) return \"ENCRYPTED\"\n if (msg.includes(\"DRM\")) return \"DRM_PROTECTED\"\n if (msg.includes(\"ZIP bomb\") || msg.includes(\"ZIP 비압축 크기 초과\") || msg.includes(\"ZIP 엔트리 수 초과\")) return \"ZIP_BOMB\"\n if (msg.includes(\"bomb\") || msg.includes(\"크기 초과\") || msg.includes(\"압축 해제\")) return \"DECOMPRESSION_BOMB\"\n if (msg.includes(\"이미지 기반\")) return \"IMAGE_BASED_PDF\"\n if (msg.includes(\"섹션\") && (msg.includes(\"찾을 수 없\") || msg.includes(\"없음\"))) return \"NO_SECTIONS\"\n if (msg.includes(\"시그니처\") || msg.includes(\"복구할 수 없\")) return \"CORRUPTED\"\n return \"PARSE_ERROR\"\n}\n","/** 2-pass colSpan/rowSpan 테이블 빌더 및 Markdown 변환 */\r\n\r\nimport type { CellContext, IRBlock, IRCell, IRTable } from \"../types.js\"\r\nimport { sanitizeHref } from \"../utils.js\"\r\n\r\n/** 테이블 열 수 상한 — 한국 공공문서 기준 충분한 값 */\r\nexport const MAX_COLS = 200\r\n/** 테이블 행 수 상한 — 메모리 폭주 방지 */\r\nexport const MAX_ROWS = 10000\r\n\r\nexport function buildTable(rows: CellContext[][]): IRTable {\r\n if (rows.length > MAX_ROWS) rows = rows.slice(0, MAX_ROWS)\r\n const numRows = rows.length\r\n\r\n // colAddr/rowAddr가 있으면 직접 배치 (HWPX cellAddr, HWP5 colAddr/rowAddr)\r\n const hasAddr = rows.some(row => row.some(c => c.colAddr !== undefined && c.rowAddr !== undefined))\r\n if (hasAddr) return 
buildTableDirect(rows, numRows)\r\n\r\n // Pass 1: maxCols 계산 — 2D 배열 사용 (동적 확장)\r\n let maxCols = 0\r\n const tempOccupied: boolean[][] = Array.from({ length: numRows }, () => [])\r\n\r\n for (let rowIdx = 0; rowIdx < numRows; rowIdx++) {\r\n let colIdx = 0\r\n for (const cell of rows[rowIdx]) {\r\n while (colIdx < MAX_COLS && tempOccupied[rowIdx][colIdx]) colIdx++\r\n if (colIdx >= MAX_COLS) break\r\n\r\n for (let r = rowIdx; r < Math.min(rowIdx + cell.rowSpan, numRows); r++) {\r\n for (let c = colIdx; c < Math.min(colIdx + cell.colSpan, MAX_COLS); c++) {\r\n tempOccupied[r][c] = true\r\n }\r\n }\r\n colIdx += cell.colSpan\r\n if (colIdx > maxCols) maxCols = colIdx\r\n }\r\n }\r\n\r\n if (maxCols === 0) return { rows: 0, cols: 0, cells: [], hasHeader: false }\r\n\r\n // Pass 2: 실제 배치\r\n const grid: IRCell[][] = Array.from({ length: numRows }, () =>\r\n Array.from({ length: maxCols }, () => ({ text: \"\", colSpan: 1, rowSpan: 1 }))\r\n )\r\n const occupied: boolean[][] = Array.from({ length: numRows }, () => Array(maxCols).fill(false))\r\n\r\n for (let rowIdx = 0; rowIdx < numRows; rowIdx++) {\r\n let colIdx = 0\r\n let cellIdx = 0\r\n\r\n while (colIdx < maxCols && cellIdx < rows[rowIdx].length) {\r\n while (colIdx < maxCols && occupied[rowIdx][colIdx]) colIdx++\r\n if (colIdx >= maxCols) break\r\n\r\n const cell = rows[rowIdx][cellIdx]\r\n grid[rowIdx][colIdx] = {\r\n text: cell.text.trim(),\r\n colSpan: cell.colSpan,\r\n rowSpan: cell.rowSpan,\r\n }\r\n\r\n for (let r = rowIdx; r < Math.min(rowIdx + cell.rowSpan, numRows); r++) {\r\n for (let c = colIdx; c < Math.min(colIdx + cell.colSpan, maxCols); c++) {\r\n occupied[r][c] = true\r\n }\r\n }\r\n\r\n colIdx += cell.colSpan\r\n cellIdx++\r\n }\r\n }\r\n\r\n return trimAndReturn(grid, numRows, maxCols)\r\n}\r\n\r\n/** colAddr/rowAddr 절대 좌표 기반 직접 배치 */\r\nfunction buildTableDirect(rows: CellContext[][], numRows: number): IRTable {\r\n // 전체 셀에서 maxCols 계산 (MAX_COLS 상한 적용)\r\n let maxCols = 0\r\n for (const row 
of rows) {\r\n for (const cell of row) {\r\n const end = (cell.colAddr ?? 0) + cell.colSpan\r\n if (end > maxCols) maxCols = end\r\n }\r\n }\r\n if (maxCols > MAX_COLS) maxCols = MAX_COLS\r\n if (maxCols === 0) return { rows: 0, cols: 0, cells: [], hasHeader: false }\r\n\r\n const grid: IRCell[][] = Array.from({ length: numRows }, () =>\r\n Array.from({ length: maxCols }, () => ({ text: \"\", colSpan: 1, rowSpan: 1 }))\r\n )\r\n\r\n for (const row of rows) {\r\n for (const cell of row) {\r\n const r = cell.rowAddr ?? 0\r\n const c = cell.colAddr ?? 0\r\n if (r >= numRows || c >= maxCols || r < 0 || c < 0) continue\r\n\r\n grid[r][c] = { text: cell.text.trim(), colSpan: cell.colSpan, rowSpan: cell.rowSpan }\r\n\r\n // 병합 영역 마킹\r\n for (let dr = 0; dr < cell.rowSpan; dr++) {\r\n for (let dc = 0; dc < cell.colSpan; dc++) {\r\n if (dr === 0 && dc === 0) continue\r\n if (r + dr < numRows && c + dc < maxCols) {\r\n grid[r + dr][c + dc] = { text: \"\", colSpan: 1, rowSpan: 1 }\r\n }\r\n }\r\n }\r\n }\r\n }\r\n\r\n return trimAndReturn(grid, numRows, maxCols)\r\n}\r\n\r\n/** 빈 후행 열 제거 후 IRTable 반환 */\r\nfunction trimAndReturn(grid: IRCell[][], numRows: number, maxCols: number): IRTable {\r\n let effectiveCols = maxCols\r\n while (effectiveCols > 0) {\r\n const colEmpty = grid.every(row => !row[effectiveCols - 1]?.text?.trim())\r\n if (!colEmpty) break\r\n effectiveCols--\r\n }\r\n if (effectiveCols < maxCols && effectiveCols > 0) {\r\n const trimmed = grid.map(row => row.slice(0, effectiveCols))\r\n return { rows: numRows, cols: effectiveCols, cells: trimmed, hasHeader: numRows > 1 }\r\n }\r\n return { rows: numRows, cols: maxCols, cells: grid, hasHeader: numRows > 1 }\r\n}\r\n\r\nexport function convertTableToText(rows: CellContext[][]): string {\r\n return rows\r\n .map(row =>\r\n row\r\n .map(c => c.text.trim().replace(/\\n/g, \" \").replace(/\\|/g, \"\\\\|\"))\r\n .filter(Boolean)\r\n .join(\" / \")\r\n )\r\n .filter(Boolean)\r\n .join(\"\\n\")\r\n}\r\n\r\n/** 마크다운 GFM 
특수문자 이스케이프 — remark-gfm 오해석 방지 */\r\nfunction escapeGfm(text: string): string {\r\n // ~ → \\~ (GFM strikethrough 방지)\r\n return text.replace(/~/g, \"\\\\~\")\r\n}\r\n\r\n/** HWP 자동생성 도형/개체 대체텍스트 정규식 — 한컴오피스가 삽입하는 모든 알려진 패턴 */\r\nconst HWP_SHAPE_ALT_TEXT_RE = /(?:모서리가 둥근 |둥근 )?(?:사각형|직사각형|정사각형|원|타원|삼각형|이등변 삼각형|직각 삼각형|선|직선|곡선|화살표|굵은 화살표|이중 화살표|오각형|육각형|팔각형|별|[4-8]점별|십자|십자형|구름|구름형|마름모|도넛|평행사변형|사다리꼴|부채꼴|호|반원|물결|번개|하트|빗금|블록 화살표|수식|표|그림|개체|그리기\\s?개체|묶음\\s?개체|글상자|수식\\s?개체|OLE\\s?개체)\\s?입니다\\.?/g\r\n\r\n/** HWP PUA 특수문자 및 도형 대체텍스트 제거 — 모든 포맷 공통 */\r\nfunction sanitizeText(text: string): string {\r\n let result = text\r\n // Supplementary Private Use Area (U+F0000-U+FFFFD) — HWP 전용 기호\r\n .replace(/[\\u{F0000}-\\u{FFFFD}]/gu, \"\")\r\n // HWP 도형/개체 자동생성 대체텍스트 제거\r\n .replace(HWP_SHAPE_ALT_TEXT_RE, \"\")\r\n .replace(/ +/g, \" \")\r\n .trim()\r\n // 균등배분 스페이스 정리 (\"현 장 대 응 단 장\" → \"현장대응단장\")\r\n // 짧은 텍스트(30자 이하)에서 70%+ 토큰이 한글 1글자면 균등배분으로 판단\r\n if (result.length <= 30 && result.includes(\" \")) {\r\n const tokens = result.split(\" \")\r\n // 한글 1글자 토큰만 카운트 — ASCII 특수문자(< > & 등)는 균등배분이 아님\r\n const koreanSingleCharCount = tokens.filter(t => t.length === 1 && /[\\uAC00-\\uD7AF\\u3131-\\u318E]/.test(t)).length\r\n if (tokens.length >= 3 && koreanSingleCharCount / tokens.length >= 0.7) {\r\n result = tokens.join(\"\")\r\n }\r\n }\r\n return result\r\n}\r\n\r\n/**\r\n * 레이아웃 테이블 감지 및 해체 — IRBlock 레벨에서 수행\r\n * 적은 행(≤3) + 셀 내 줄바꿈 다량 → table 블록을 paragraph 블록들로 분해\r\n * heading 감지 전에 호출해야 해체된 텍스트에 heading 감지 적용 가능\r\n */\r\nexport function flattenLayoutTables(blocks: IRBlock[]): IRBlock[] {\r\n const result: IRBlock[] = []\r\n\r\n for (const block of blocks) {\r\n if (block.type !== \"table\" || !block.table) {\r\n result.push(block)\r\n continue\r\n }\r\n\r\n const { rows: numRows, cols: numCols, cells } = block.table\r\n\r\n // 1x1 테이블은 기존 로직(tableToMarkdown)에서 처리\r\n if (numRows === 1 && numCols === 1) {\r\n result.push(block)\r\n continue\r\n }\r\n\r\n // 레이아웃 테이블 휴리스틱\r\n if 
(numRows <= 3) {\r\n let totalNewlines = 0\r\n let totalTextLen = 0\r\n for (let r = 0; r < numRows; r++) {\r\n for (let c = 0; c < numCols; c++) {\r\n const t = cells[r]?.[c]?.text || \"\"\r\n totalNewlines += (t.match(/\\n/g) || []).length\r\n totalTextLen += t.length\r\n }\r\n }\r\n\r\n // 레이아웃 테이블 판정: 많은 줄바꿈(>5), 또는 적은 행에 비해 총 텍스트 과다(>300)\r\n if (totalNewlines > 5 || (numRows <= 2 && totalTextLen > 300)) {\r\n // 레이아웃 테이블 → 각 셀을 paragraph 블록으로 분해\r\n for (let r = 0; r < numRows; r++) {\r\n for (let c = 0; c < numCols; c++) {\r\n const cellText = cells[r]?.[c]?.text?.trim()\r\n if (!cellText) continue\r\n // 셀 내 줄바꿈을 별도 paragraph로 분리\r\n for (const line of cellText.split(\"\\n\")) {\r\n const trimmed = line.trim()\r\n if (!trimmed) continue\r\n result.push({ type: \"paragraph\", text: trimmed, pageNumber: block.pageNumber })\r\n }\r\n }\r\n }\r\n continue\r\n }\r\n }\r\n\r\n result.push(block)\r\n }\r\n\r\n return result\r\n}\r\n\r\nexport function blocksToMarkdown(blocks: IRBlock[]): string {\r\n const lines: string[] = []\r\n\r\n for (let i = 0; i < blocks.length; i++) {\r\n const block = blocks[i]\r\n\r\n // 헤딩 블록\r\n if (block.type === \"heading\" && block.text) {\r\n const prefix = \"#\".repeat(Math.min(block.level || 2, 6))\r\n const headingText = sanitizeText(block.text)\r\n if (headingText) lines.push(\"\", `${prefix} ${headingText}`, \"\")\r\n continue\r\n }\r\n\r\n // 이미지 블록 —  참조\r\n if (block.type === \"image\" && block.text) {\r\n lines.push(\"\", ``, \"\")\r\n continue\r\n }\r\n\r\n // 구분선 블록\r\n if (block.type === \"separator\") {\r\n lines.push(\"\", \"---\", \"\")\r\n continue\r\n }\r\n\r\n // 리스트 블록\r\n if (block.type === \"list\" && block.text) {\r\n const listText = sanitizeText(block.text)\r\n if (!listText) continue\r\n // 텍스트가 이미 번호로 시작하면 그대로 출력 (원래 번호 보존)\r\n const alreadyNumbered = block.listType === \"ordered\" && /^\\d+\\.\\s/.test(listText)\r\n const prefix = alreadyNumbered ? \"\" : block.listType === \"ordered\" ? \"1. 
\" : \"- \"\r\n lines.push(`${prefix}${listText}`)\r\n if (block.children) {\r\n for (const child of block.children) {\r\n const childPrefix = child.listType === \"ordered\" ? \"1.\" : \"-\"\r\n lines.push(` ${childPrefix} ${child.text || \"\"}`)\r\n }\r\n }\r\n continue\r\n }\r\n\r\n if (block.type === \"paragraph\" && block.text) {\r\n let text = sanitizeText(block.text)\r\n if (!text) continue\r\n\r\n // 별표 패턴 (기존 호환)\r\n if (/^\\[별표\\s*\\d+/.test(text)) {\r\n const nextBlock = blocks[i + 1]\r\n if (nextBlock?.type === \"paragraph\" && nextBlock.text && /관련\\)?$/.test(nextBlock.text)) {\r\n lines.push(\"\", `## ${text} ${nextBlock.text}`, \"\")\r\n i++\r\n } else {\r\n lines.push(\"\", `## ${text}`, \"\")\r\n }\r\n continue\r\n }\r\n\r\n if (/^\\([^)]*조[^)]*관련\\)$/.test(text)) {\r\n lines.push(`*${text}*`, \"\")\r\n continue\r\n }\r\n\r\n // 하이퍼링크가 있으면 텍스트에 링크 삽입 (javascript: 등 위험 스킴 제거)\r\n if (block.href) {\r\n const href = sanitizeHref(block.href)\r\n if (href) text = `[${text}](${href})`\r\n }\r\n\r\n // 각주가 있으면 괄호로 인라인 삽입\r\n if (block.footnoteText) {\r\n text += ` (주: ${block.footnoteText})`\r\n }\r\n\r\n lines.push(escapeGfm(text), \"\")\r\n } else if (block.type === \"table\" && block.table) {\r\n // 테이블 앞에 빈 줄 보장 (마크다운 렌더링 필수)\r\n if (lines.length > 0 && lines[lines.length - 1] !== \"\") {\r\n lines.push(\"\")\r\n }\r\n const tableMd = tableToMarkdown(block.table)\r\n if (tableMd) {\r\n lines.push(tableMd)\r\n lines.push(\"\")\r\n }\r\n }\r\n }\r\n\r\n return lines.join(\"\\n\").trim()\r\n}\r\n\r\n/** 병합 셀 존재 여부 확인 */\r\nfunction hasMergedCells(table: IRTable): boolean {\r\n for (const row of table.cells) {\r\n for (const cell of row) {\r\n if (cell.colSpan > 1 || cell.rowSpan > 1) return true\r\n }\r\n }\r\n return false\r\n}\r\n\r\n/** 병합 테이블 → HTML <table> 출력 (rowspan/colspan 보존) */\r\nfunction tableToHtml(table: IRTable): string {\r\n const { cells, rows: numRows, cols: numCols } = table\r\n const skip = new Set<string>()\r\n const lines: string[] = 
[\"<table>\"]\r\n\r\n for (let r = 0; r < numRows; r++) {\r\n const tag = r === 0 ? \"th\" : \"td\"\r\n const rowHtml: string[] = []\r\n for (let c = 0; c < numCols; c++) {\r\n if (skip.has(`${r},${c}`)) continue\r\n const cell = cells[r]?.[c]\r\n if (!cell) continue\r\n\r\n // 병합 영역 skip 마킹\r\n for (let dr = 0; dr < cell.rowSpan; dr++) {\r\n for (let dc = 0; dc < cell.colSpan; dc++) {\r\n if (dr === 0 && dc === 0) continue\r\n if (r + dr < numRows && c + dc < numCols) skip.add(`${r + dr},${c + dc}`)\r\n }\r\n }\r\n\r\n const text = sanitizeText(cell.text).replace(/\\n/g, \"<br>\")\r\n const attrs: string[] = []\r\n if (cell.colSpan > 1) attrs.push(`colspan=\"${cell.colSpan}\"`)\r\n if (cell.rowSpan > 1) attrs.push(`rowspan=\"${cell.rowSpan}\"`)\r\n const attrStr = attrs.length ? \" \" + attrs.join(\" \") : \"\"\r\n rowHtml.push(`<${tag}${attrStr}>${text}</${tag}>`)\r\n }\r\n if (rowHtml.length) lines.push(`<tr>${rowHtml.join(\"\")}</tr>`)\r\n }\r\n\r\n lines.push(\"</table>\")\r\n return lines.join(\"\\n\")\r\n}\r\n\r\nfunction tableToMarkdown(table: IRTable): string {\r\n if (table.rows === 0 || table.cols === 0) return \"\"\r\n\r\n const { cells, rows: numRows, cols: numCols } = table\r\n\r\n // 병합 셀이 있으면 HTML 테이블로 출력\r\n if (hasMergedCells(table)) return tableToHtml(table)\r\n\r\n // 1행 1열 → 구조화된 텍스트 (빈 셀이면 스킵)\r\n if (numRows === 1 && numCols === 1) {\r\n const content = sanitizeText(cells[0][0].text)\r\n if (!content) return \"\"\r\n return content\r\n .split(/\\n/)\r\n .map(line => {\r\n const trimmed = line.trim()\r\n if (!trimmed) return \"\"\r\n if (/^\\d+\\.\\s/.test(trimmed)) return `**${escapeGfm(trimmed)}**`\r\n if (/^[가-힣]\\.\\s/.test(trimmed)) return ` ${escapeGfm(trimmed)}`\r\n return escapeGfm(trimmed)\r\n })\r\n .filter(Boolean)\r\n .join(\"\\n\")\r\n }\r\n\r\n // 1열 다행 테이블 → 각 행을 별도 라인으로 출력 (목록성 데이터)\r\n if (numCols === 1 && numRows >= 2) {\r\n return cells\r\n .map(row => escapeGfm(sanitizeText(row[0].text)).replace(/\\n/g, \" \"))\r\n 
.filter(Boolean)\r\n .join(\"\\n\")\r\n }\r\n\r\n // 병합 셀: 행/열 병합된 셀은 빈 칸으로\r\n const display: string[][] = Array.from({ length: numRows }, () => Array(numCols).fill(\"\"))\r\n const skip = new Set<string>()\r\n\r\n for (let r = 0; r < numRows; r++) {\r\n for (let c = 0; c < numCols; c++) {\r\n if (skip.has(`${r},${c}`)) continue\r\n const cell = cells[r]?.[c]\r\n if (!cell) continue\r\n display[r][c] = escapeGfm(sanitizeText(cell.text)).replace(/\\|/g, \"\\\\|\").replace(/\\n/g, \"<br>\")\r\n\r\n // colSpan/rowSpan: 병합된 열은 빈 칸으로 유지 (텍스트 중복 방지)\r\n for (let dr = 0; dr < cell.rowSpan; dr++) {\r\n for (let dc = 0; dc < cell.colSpan; dc++) {\r\n if (dr === 0 && dc === 0) continue\r\n if (r + dr < numRows && c + dc < numCols) {\r\n skip.add(`${r + dr},${c + dc}`)\r\n }\r\n }\r\n }\r\n // colSpan > 1이면 display 열 인덱스를 건너뜀\r\n c += cell.colSpan - 1\r\n }\r\n }\r\n\r\n // rowSpan 잔류 처리:\r\n // 1) 완전 빈 행 제거\r\n // 2) \"첫 열만 값, 나머지 빈\" 행 → 다음 데이터 행의 첫 열에 값을 전파\r\n // 단, colSpan으로 인한 빈 열(skip 셀)은 이 대상이 아님\r\n const uniqueRows: string[][] = []\r\n let pendingFirstCol = \"\"\r\n for (let r = 0; r < display.length; r++) {\r\n const row = display[r]\r\n const isEmptyPlaceholder = row.every(cell => cell === \"\")\r\n if (isEmptyPlaceholder) continue\r\n\r\n // 첫 열만 값이 있고 나머지 모두 빈 행 → 다음 데이터 행의 첫 열에 전파\r\n // 단, colSpan으로 인한 빈 열(skip 셀)은 \"진짜 빈\"이 아니므로 제외\r\n const nonEmptyCols = row.filter(cell => cell !== \"\")\r\n const hasSkipInRow = row.some((_, c) => skip.has(`${r},${c}`))\r\n if (!hasSkipInRow && nonEmptyCols.length === 1 && row[0] !== \"\" && row.slice(1).every(c => c === \"\")) {\r\n pendingFirstCol = row[0]\r\n continue\r\n }\r\n\r\n // 저장된 첫 열 값을 현재 행의 빈 첫 열에 전파\r\n if (pendingFirstCol && row[0] === \"\") {\r\n row[0] = pendingFirstCol\r\n pendingFirstCol = \"\"\r\n } else {\r\n pendingFirstCol = \"\"\r\n }\r\n uniqueRows.push(row)\r\n }\r\n\r\n if (uniqueRows.length === 0) return \"\"\r\n\r\n const md: string[] = []\r\n md.push(\"| \" + uniqueRows[0].join(\" | \") + \" 
|\")\r\n md.push(\"| \" + uniqueRows[0].map(() => \"---\").join(\" | \") + \" |\")\r\n for (let i = 1; i < uniqueRows.length; i++) {\r\n md.push(\"| \" + uniqueRows[i].join(\" | \") + \" |\")\r\n }\r\n return md.join(\"\\n\")\r\n}\r\n","/** kordoc 공통 타입 정의 */\n\n// ─── 중간 표현 (Intermediate Representation) ─────────\n\nexport interface CellContext {\n text: string\n colSpan: number\n rowSpan: number\n /** HWP5 셀 열 주소 (0-based) — 병합 테이블 배치용 */\n colAddr?: number\n /** HWP5 셀 행 주소 (0-based) — 병합 테이블 배치용 */\n rowAddr?: number\n}\n\n/** 블록 타입 — v2.0에서 heading, list, image, separator 추가 */\nexport type IRBlockType = \"paragraph\" | \"table\" | \"heading\" | \"list\" | \"image\" | \"separator\"\n\nexport interface IRBlock {\n type: IRBlockType\n text?: string\n table?: IRTable\n /** 헤딩 레벨 (1-6), type=\"heading\"일 때 사용 */\n level?: number\n /** 원본 페이지 번호 (1-based) */\n pageNumber?: number\n /** 바운딩 박스 — PDF에서만 제공 */\n bbox?: BoundingBox\n /** 텍스트 스타일 정보 (선택) */\n style?: InlineStyle\n /** 리스트 타입, type=\"list\"일 때 사용 */\n listType?: \"ordered\" | \"unordered\"\n /** 중첩 리스트 아이템 */\n children?: IRBlock[]\n /** 하이퍼링크 URL */\n href?: string\n /** 각주/미주 텍스트 (인라인 삽입용) */\n footnoteText?: string\n /** 이미지 데이터 (type=\"image\"일 때) */\n imageData?: ImageData\n}\n\n/** 추출된 이미지 바이너리 데이터 */\nexport interface ImageData {\n /** 이미지 바이너리 */\n data: Uint8Array\n /** MIME 타입 (image/png, image/jpeg, image/gif, image/bmp, image/wmf, image/emf) */\n mimeType: string\n /** 원본 파일명 (있는 경우) */\n filename?: string\n}\n\n/** 바운딩 박스 — PDF 포인트 단위 (72pt = 1인치) */\nexport interface BoundingBox {\n page: number\n x: number\n y: number\n width: number\n height: number\n}\n\n/** 인라인 텍스트 스타일 */\nexport interface InlineStyle {\n bold?: boolean\n italic?: boolean\n fontSize?: number\n fontName?: string\n}\n\nexport interface IRTable {\n rows: number\n cols: number\n cells: IRCell[][]\n /** 첫 행을 헤더로 렌더링할지 여부 (현재: rows > 1이면 true — 의미적 감지가 아닌 레이아웃 힌트) */\n hasHeader: boolean\n}\n\nexport interface IRCell {\n 
text: string\n colSpan: number\n rowSpan: number\n}\n\n// ─── 메타데이터 ─────────────────────────────────────\n\n/** 문서 메타데이터 — 각 포맷에서 추출 가능한 필드만 채워짐 */\nexport interface DocumentMetadata {\n /** 문서 제목 */\n title?: string\n /** 작성자 */\n author?: string\n /** 작성 프로그램 (예: \"한글 2020\", \"Adobe Acrobat\") */\n creator?: string\n /** 생성일시 (ISO 8601) */\n createdAt?: string\n /** 수정일시 (ISO 8601) */\n modifiedAt?: string\n /** 페이지/섹션 수 */\n pageCount?: number\n /** 문서 포맷 버전 (예: HWP \"5.1.0.1\") */\n version?: string\n /** 설명 */\n description?: string\n /** 키워드 */\n keywords?: string[]\n}\n\n// ─── 파싱 옵션 ──────────────────────────────────────\n\n/** 파싱 옵션 — parse() 함수에 전달 */\nexport interface ParseOptions {\n /**\n * 파싱할 페이지/섹션 범위 (1-based).\n * - 배열: [1, 2, 3]\n * - 문자열: \"1-3\", \"1,3,5-7\"\n *\n * PDF: 정확한 페이지 단위. HWP/HWPX: 섹션 단위 근사치.\n */\n pages?: number[] | string\n /** 이미지 기반 PDF용 OCR 프로바이더 (선택) */\n ocr?: OcrProvider\n /** 진행률 콜백 — current: 현재 페이지/섹션, total: 전체 수 */\n onProgress?: (current: number, total: number) => void\n /** PDF 머리글/바닥글 자동 제거 */\n removeHeaderFooter?: boolean\n}\n\n// ─── 파싱 경고 ──────────────────────────────────────\n\n/** 파싱 중 스킵/실패한 요소 보고 */\nexport interface ParseWarning {\n /** 관련 페이지 번호 (알 수 있는 경우) */\n page?: number\n /** 경고 메시지 */\n message: string\n /** 구조화된 경고 코드 */\n code: WarningCode\n}\n\nexport type WarningCode =\n | \"SKIPPED_IMAGE\"\n | \"SKIPPED_OLE\"\n | \"TRUNCATED_TABLE\"\n | \"OCR_FALLBACK\"\n | \"UNSUPPORTED_ELEMENT\"\n | \"BROKEN_ZIP_RECOVERY\"\n | \"HIDDEN_TEXT_FILTERED\"\n | \"MALFORMED_XML\"\n | \"PARTIAL_PARSE\"\n | \"LENIENT_CFB_RECOVERY\"\n\n/** 문서 구조 (헤딩 트리) */\nexport interface OutlineItem {\n level: number\n text: string\n pageNumber?: number\n}\n\n// ─── 에러 코드 ──────────────────────────────────────\n\n/** 구조화된 에러 코드 — 프로그래밍적 에러 핸들링용 */\nexport type ErrorCode =\n | \"EMPTY_INPUT\"\n | \"UNSUPPORTED_FORMAT\"\n | \"ENCRYPTED\"\n | \"DRM_PROTECTED\"\n | \"CORRUPTED\"\n | \"DECOMPRESSION_BOMB\"\n | \"ZIP_BOMB\"\n | 
\"IMAGE_BASED_PDF\"\n | \"NO_SECTIONS\"\n | \"PARSE_ERROR\"\n | \"MISSING_DEPENDENCY\"\n\n// ─── 파싱 결과 (discriminated union) ────────────────\n\nexport type FileType = \"hwpx\" | \"hwp\" | \"pdf\" | \"xlsx\" | \"docx\" | \"unknown\"\n\ninterface ParseResultBase {\n fileType: FileType\n /** 페이지/섹션 수 — PDF: 실제 페이지 수, HWP/HWPX: 섹션 수, XLSX: 시트 수 */\n pageCount?: number\n /** 이미지 기반 PDF 여부 (텍스트 추출 불가) */\n isImageBased?: boolean\n}\n\nexport interface ParseSuccess extends ParseResultBase {\n success: true\n /** 추출된 마크다운 텍스트 */\n markdown: string\n /** 중간 표현 블록 (구조화된 데이터 접근용) */\n blocks: IRBlock[]\n /** 문서 메타데이터 */\n metadata?: DocumentMetadata\n /** 문서 구조 (헤딩 트리) — v2.0 */\n outline?: OutlineItem[]\n /** 파싱 중 발생한 경고 — v2.0 */\n warnings?: ParseWarning[]\n /** 추출된 이미지 목록 — 마크다운에서 파일명으로 참조됨 */\n images?: ExtractedImage[]\n}\n\n/** 추출된 이미지 — ParseSuccess.images에 포함 */\nexport interface ExtractedImage {\n /** 마크다운에서 참조되는 파일명 (예: image_001.png) */\n filename: string\n /** 이미지 바이너리 */\n data: Uint8Array\n /** MIME 타입 */\n mimeType: string\n}\n\nexport interface ParseFailure extends ParseResultBase {\n success: false\n /** 오류 메시지 */\n error: string\n /** 구조화된 에러 코드 */\n code?: ErrorCode\n}\n\nexport type ParseResult = ParseSuccess | ParseFailure\n\n// ─── 문서 비교 (Diff) ───────────────────────────────\n\nexport type DiffChangeType = \"added\" | \"removed\" | \"modified\" | \"unchanged\"\n\nexport interface BlockDiff {\n type: DiffChangeType\n /** 원본 블록 (added이면 undefined) */\n before?: IRBlock\n /** 변경 후 블록 (removed이면 undefined) */\n after?: IRBlock\n /** modified 테이블의 셀 단위 diff */\n cellDiffs?: CellDiff[][]\n /** 유사도 (0-1) */\n similarity?: number\n}\n\nexport interface CellDiff {\n type: DiffChangeType\n before?: string\n after?: string\n}\n\nexport interface DiffResult {\n stats: { added: number; removed: number; modified: number; unchanged: number }\n diffs: BlockDiff[]\n}\n\n// ─── 양식 인식 ──────────────────────────────────────\n\nexport interface FormField {\n label: 
string\n value: string\n /** 0-based 소스 행 */\n row: number\n /** 0-based 소스 열 */\n col: number\n}\n\nexport interface FormResult {\n fields: FormField[]\n /** 양식 확신도 (0-1) */\n confidence: number\n}\n\n// ─── OCR 프로바이더 ─────────────────────────────────\n\n/** 사용자 제공 OCR 함수 — 페이지 이미지를 받아 텍스트 반환 */\nexport type OcrProvider = (\n pageImage: Uint8Array,\n pageNumber: number,\n mimeType: \"image/png\"\n) => Promise<string>\n\n// ─── Watch 모드 ─────────────────────────────────────\n\nexport interface WatchOptions {\n dir: string\n outDir?: string\n webhook?: string\n format?: \"markdown\" | \"json\"\n pages?: string\n silent?: boolean\n}\n\n// ─── 헤딩 감지 공통 임계값 ──────────────────────────\n\n/** 폰트 크기 비율 → heading level (전 파서 공통) */\nexport const HEADING_RATIO_H1 = 1.5\nexport const HEADING_RATIO_H2 = 1.3\nexport const HEADING_RATIO_H3 = 1.15\n\n// ─── 내부 파서 반환 타입 ─────────────────────────────\n\n/** 내부 파서가 index.ts에 반환하는 공통 타입 (HWP5/HWPX/PDF/XLSX/DOCX) */\nexport interface InternalParseResult {\n markdown: string\n blocks: IRBlock[]\n metadata?: DocumentMetadata\n outline?: OutlineItem[]\n warnings?: ParseWarning[]\n images?: ExtractedImage[]\n /** PDF 전용: 이미지 기반 PDF 여부 */\n isImageBased?: 
boolean\n}\n"],"mappings":";;;AAIO,IAAM,UAAkB,OAA4C,UAAqB;AAOzF,SAAS,cAAc,KAA0B;AACtD,MAAI,IAAI,eAAe,KAAK,IAAI,eAAe,IAAI,OAAO,YAAY;AACpE,WAAO,IAAI;AAAA,EACb;AACA,SAAO,IAAI,OAAO,MAAM,IAAI,YAAY,IAAI,aAAa,IAAI,UAAU;AACzE;AAMO,IAAM,cAAN,cAA0B,MAAM;AAAA,EACrC,YAAY,SAAiB;AAC3B,UAAM,OAAO;AACb,SAAK,OAAO;AAAA,EACd;AACF;AAMO,SAAS,cAAc,KAAsB;AAClD,MAAI,eAAe,YAAa,QAAO,IAAI;AAC3C,SAAO;AACT;AAMO,SAAS,gBAAgB,MAAuB;AACrD,MAAI,KAAK,SAAS,IAAM,EAAG,QAAO;AAClC,QAAM,aAAa,KAAK,QAAQ,OAAO,GAAG;AAC1C,QAAM,WAAW,WAAW,MAAM,GAAG;AACrC,SAAO,SAAS,KAAK,OAAK,MAAM,IAAI,KAAK,WAAW,WAAW,GAAG,KAAK,aAAa,KAAK,UAAU;AACrG;AAQO,SAAS,gBACd,QACA,sBAAsB,MAAM,OAAO,MACnC,aAAa,KACsC;AACnD,MAAI;AACF,UAAM,OAAO,IAAI,SAAS,MAAM;AAChC,UAAM,MAAM,OAAO;AAEnB,QAAI,aAAa;AACjB,aAAS,IAAI,MAAM,IAAI,KAAK,KAAK,IAAI,GAAG,MAAM,KAAK,GAAG,KAAK;AACzD,UAAI,KAAK,UAAU,GAAG,IAAI,MAAM,WAAY;AAAE,qBAAa;AAAG;AAAA,MAAM;AAAA,IACtE;AACA,QAAI,aAAa,EAAG,QAAO,EAAE,mBAAmB,GAAG,YAAY,EAAE;AAEjE,UAAM,aAAa,KAAK,UAAU,aAAa,IAAI,IAAI;AACvD,QAAI,aAAa,YAAY;AAC3B,YAAM,IAAI,YAAY,+CAAiB,UAAU,kBAAQ,UAAU,GAAG;AAAA,IACxE;AAEA,UAAM,SAAS,KAAK,UAAU,aAAa,IAAI,IAAI;AACnD,UAAM,WAAW,KAAK,UAAU,aAAa,IAAI,IAAI;AACrD,QAAI,WAAW,SAAS,IAAK,QAAO,EAAE,mBAAmB,GAAG,WAAW;AAEvE,QAAI,oBAAoB;AACxB,QAAI,MAAM;AACV,aAAS,IAAI,GAAG,IAAI,cAAc,MAAM,MAAM,WAAW,QAAQ,KAAK;AACpE,UAAI,KAAK,UAAU,KAAK,IAAI,MAAM,SAAY;AAC9C,2BAAqB,KAAK,UAAU,MAAM,IAAI,IAAI;AAClD,YAAM,UAAU,KAAK,UAAU,MAAM,IAAI,IAAI;AAC7C,YAAM,WAAW,KAAK,UAAU,MAAM,IAAI,IAAI;AAC9C,YAAM,aAAa,KAAK,UAAU,MAAM,IAAI,IAAI;AAChD,aAAO,KAAK,UAAU,WAAW;AAAA,IACnC;AAEA,QAAI,oBAAoB,qBAAqB;AAC3C,YAAM,IAAI,YAAY,sDAAmB,oBAAoB,OAAO,MAAM,QAAQ,CAAC,CAAC,oBAAU,sBAAsB,OAAO,IAAI,KAAK;AAAA,IACtI;AAEA,WAAO,EAAE,mBAAmB,WAAW;AAAA,EACzC,SAAS,KAAK;AACZ,QAAI,eAAe,YAAa,OAAM;AACtC,WAAO,EAAE,mBAAmB,GAAG,YAAY,EAAE;AAAA,EAC/C;AACF;AAGO,SAAS,SAAS,KAAqB;AAC5C,SAAO,IAAI,QAAQ,0CAA0C,EAAE;AACjE;AAGA,IAAM,eAAe;AACd,SAAS,aAAa,MAA6B;AACxD,QAAM,UAAU,KAAK,KAAK;AAC1B,MAAI,CAAC,WAAW,CAAC,aAAa,KAAK,OAAO,EAAG,QAAO;AACpD,SAAO;AACT;AAKO,SAAS,QAAQ,KAAuB;AAC7C,MAAI,MAAM;AACV,WAAS,IAAI,GAAG,IAA
I,IAAI,QAAQ,IAAK,KAAI,IAAI,CAAC,IAAI,IAAK,OAAM,IAAI,CAAC;AAClE,SAAO;AACT;AAGO,SAAS,QAAQ,KAAuB;AAC7C,MAAI,MAAM;AACV,WAAS,IAAI,GAAG,IAAI,IAAI,QAAQ,IAAK,KAAI,IAAI,CAAC,IAAI,IAAK,OAAM,IAAI,CAAC;AAClE,SAAO;AACT;AAOO,SAAS,cAAc,KAAyB;AACrD,MAAI,EAAE,eAAe,OAAQ,QAAO;AACpC,QAAM,MAAM,IAAI;AAChB,MAAI,IAAI,SAAS,oBAAK,EAAG,QAAO;AAChC,MAAI,IAAI,SAAS,KAAK,EAAG,QAAO;AAChC,MAAI,IAAI,SAAS,UAAU,KAAK,IAAI,SAAS,kDAAe,KAAK,IAAI,SAAS,4CAAc,EAAG,QAAO;AACtG,MAAI,IAAI,SAAS,MAAM,KAAK,IAAI,SAAS,2BAAO,KAAK,IAAI,SAAS,2BAAO,EAAG,QAAO;AACnF,MAAI,IAAI,SAAS,iCAAQ,EAAG,QAAO;AACnC,MAAI,IAAI,SAAS,cAAI,MAAM,IAAI,SAAS,4BAAQ,KAAK,IAAI,SAAS,cAAI,GAAI,QAAO;AACjF,MAAI,IAAI,SAAS,0BAAM,KAAK,IAAI,SAAS,kCAAS,EAAG,QAAO;AAC5D,SAAO;AACT;;;AC5IO,IAAM,WAAW;AAEjB,IAAM,WAAW;AAEjB,SAAS,WAAW,MAAgC;AACzD,MAAI,KAAK,SAAS,SAAU,QAAO,KAAK,MAAM,GAAG,QAAQ;AACzD,QAAM,UAAU,KAAK;AAGrB,QAAM,UAAU,KAAK,KAAK,SAAO,IAAI,KAAK,OAAK,EAAE,YAAY,UAAa,EAAE,YAAY,MAAS,CAAC;AAClG,MAAI,QAAS,QAAO,iBAAiB,MAAM,OAAO;AAGlD,MAAI,UAAU;AACd,QAAM,eAA4B,MAAM,KAAK,EAAE,QAAQ,QAAQ,GAAG,MAAM,CAAC,CAAC;AAE1E,WAAS,SAAS,GAAG,SAAS,SAAS,UAAU;AAC/C,QAAI,SAAS;AACb,eAAW,QAAQ,KAAK,MAAM,GAAG;AAC/B,aAAO,SAAS,YAAY,aAAa,MAAM,EAAE,MAAM,EAAG;AAC1D,UAAI,UAAU,SAAU;AAExB,eAAS,IAAI,QAAQ,IAAI,KAAK,IAAI,SAAS,KAAK,SAAS,OAAO,GAAG,KAAK;AACtE,iBAAS,IAAI,QAAQ,IAAI,KAAK,IAAI,SAAS,KAAK,SAAS,QAAQ,GAAG,KAAK;AACvE,uBAAa,CAAC,EAAE,CAAC,IAAI;AAAA,QACvB;AAAA,MACF;AACA,gBAAU,KAAK;AACf,UAAI,SAAS,QAAS,WAAU;AAAA,IAClC;AAAA,EACF;AAEA,MAAI,YAAY,EAAG,QAAO,EAAE,MAAM,GAAG,MAAM,GAAG,OAAO,CAAC,GAAG,WAAW,MAAM;AAG1E,QAAM,OAAmB,MAAM;AAAA,IAAK,EAAE,QAAQ,QAAQ;AAAA,IAAG,MACvD,MAAM,KAAK,EAAE,QAAQ,QAAQ,GAAG,OAAO,EAAE,MAAM,IAAI,SAAS,GAAG,SAAS,EAAE,EAAE;AAAA,EAC9E;AACA,QAAM,WAAwB,MAAM,KAAK,EAAE,QAAQ,QAAQ,GAAG,MAAM,MAAM,OAAO,EAAE,KAAK,KAAK,CAAC;AAE9F,WAAS,SAAS,GAAG,SAAS,SAAS,UAAU;AAC/C,QAAI,SAAS;AACb,QAAI,UAAU;AAEd,WAAO,SAAS,WAAW,UAAU,KAAK,MAAM,EAAE,QAAQ;AACxD,aAAO,SAAS,WAAW,SAAS,MAAM,EAAE,MAAM,EAAG;AACrD,UAAI,UAAU,QAAS;AAEvB,YAAM,OAAO,KAAK,MAAM,EAAE,OAAO;AACjC,WAAK,MAAM,EAAE,MAAM,IAAI;AAAA,QACrB,MAAM,KAAK,KAAK,KAAK;A
AAA,QACrB,SAAS,KAAK;AAAA,QACd,SAAS,KAAK;AAAA,MAChB;AAEA,eAAS,IAAI,QAAQ,IAAI,KAAK,IAAI,SAAS,KAAK,SAAS,OAAO,GAAG,KAAK;AACtE,iBAAS,IAAI,QAAQ,IAAI,KAAK,IAAI,SAAS,KAAK,SAAS,OAAO,GAAG,KAAK;AACtE,mBAAS,CAAC,EAAE,CAAC,IAAI;AAAA,QACnB;AAAA,MACF;AAEA,gBAAU,KAAK;AACf;AAAA,IACF;AAAA,EACF;AAEA,SAAO,cAAc,MAAM,SAAS,OAAO;AAC7C;AAGA,SAAS,iBAAiB,MAAuB,SAA0B;AAEzE,MAAI,UAAU;AACd,aAAW,OAAO,MAAM;AACtB,eAAW,QAAQ,KAAK;AACtB,YAAM,OAAO,KAAK,WAAW,KAAK,KAAK;AACvC,UAAI,MAAM,QAAS,WAAU;AAAA,IAC/B;AAAA,EACF;AACA,MAAI,UAAU,SAAU,WAAU;AAClC,MAAI,YAAY,EAAG,QAAO,EAAE,MAAM,GAAG,MAAM,GAAG,OAAO,CAAC,GAAG,WAAW,MAAM;AAE1E,QAAM,OAAmB,MAAM;AAAA,IAAK,EAAE,QAAQ,QAAQ;AAAA,IAAG,MACvD,MAAM,KAAK,EAAE,QAAQ,QAAQ,GAAG,OAAO,EAAE,MAAM,IAAI,SAAS,GAAG,SAAS,EAAE,EAAE;AAAA,EAC9E;AAEA,aAAW,OAAO,MAAM;AACtB,eAAW,QAAQ,KAAK;AACtB,YAAM,IAAI,KAAK,WAAW;AAC1B,YAAM,IAAI,KAAK,WAAW;AAC1B,UAAI,KAAK,WAAW,KAAK,WAAW,IAAI,KAAK,IAAI,EAAG;AAEpD,WAAK,CAAC,EAAE,CAAC,IAAI,EAAE,MAAM,KAAK,KAAK,KAAK,GAAG,SAAS,KAAK,SAAS,SAAS,KAAK,QAAQ;AAGpF,eAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,iBAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,cAAI,OAAO,KAAK,OAAO,EAAG;AAC1B,cAAI,IAAI,KAAK,WAAW,IAAI,KAAK,SAAS;AACxC,iBAAK,IAAI,EAAE,EAAE,IAAI,EAAE,IAAI,EAAE,MAAM,IAAI,SAAS,GAAG,SAAS,EAAE;AAAA,UAC5D;AAAA,QACF;AAAA,MACF;AAAA,IACF;AAAA,EACF;AAEA,SAAO,cAAc,MAAM,SAAS,OAAO;AAC7C;AAGA,SAAS,cAAc,MAAkB,SAAiB,SAA0B;AAClF,MAAI,gBAAgB;AACpB,SAAO,gBAAgB,GAAG;AACxB,UAAM,WAAW,KAAK,MAAM,SAAO,CAAC,IAAI,gBAAgB,CAAC,GAAG,MAAM,KAAK,CAAC;AACxE,QAAI,CAAC,SAAU;AACf;AAAA,EACF;AACA,MAAI,gBAAgB,WAAW,gBAAgB,GAAG;AAChD,UAAM,UAAU,KAAK,IAAI,SAAO,IAAI,MAAM,GAAG,aAAa,CAAC;AAC3D,WAAO,EAAE,MAAM,SAAS,MAAM,eAAe,OAAO,SAAS,WAAW,UAAU,EAAE;AAAA,EACtF;AACA,SAAO,EAAE,MAAM,SAAS,MAAM,SAAS,OAAO,MAAM,WAAW,UAAU,EAAE;AAC7E;AAEO,SAAS,mBAAmB,MAA+B;AAChE,SAAO,KACJ;AAAA,IAAI,SACH,IACG,IAAI,OAAK,EAAE,KAAK,KAAK,EAAE,QAAQ,OAAO,GAAG,EAAE,QAAQ,OAAO,KAAK,CAAC,EAChE,OAAO,OAAO,EACd,KAAK,KAAK;AAAA,EACf,EACC,OAAO,OAAO,EACd,KAAK,IAAI;AACd;AAGA,SAAS,UAAU,MAAsB;AAEvC,SAAO,KAAK,QAAQ,MAAM,KAAK;AACjC;AAGA,IAAM,wBAAwB;AAG9B,SAAS,aAAa,MAAsB;A
AC1C,MAAI,SAAS,KAEV,QAAQ,2BAA2B,EAAE,EAErC,QAAQ,uBAAuB,EAAE,EACjC,QAAQ,QAAQ,GAAG,EACnB,KAAK;AAGR,MAAI,OAAO,UAAU,MAAM,OAAO,SAAS,GAAG,GAAG;AAC/C,UAAM,SAAS,OAAO,MAAM,GAAG;AAE/B,UAAM,wBAAwB,OAAO,OAAO,OAAK,EAAE,WAAW,KAAK,+BAA+B,KAAK,CAAC,CAAC,EAAE;AAC3G,QAAI,OAAO,UAAU,KAAK,wBAAwB,OAAO,UAAU,KAAK;AACtE,eAAS,OAAO,KAAK,EAAE;AAAA,IACzB;AAAA,EACF;AACA,SAAO;AACT;AAOO,SAAS,oBAAoB,QAA8B;AAChE,QAAM,SAAoB,CAAC;AAE3B,aAAW,SAAS,QAAQ;AAC1B,QAAI,MAAM,SAAS,WAAW,CAAC,MAAM,OAAO;AAC1C,aAAO,KAAK,KAAK;AACjB;AAAA,IACF;AAEA,UAAM,EAAE,MAAM,SAAS,MAAM,SAAS,MAAM,IAAI,MAAM;AAGtD,QAAI,YAAY,KAAK,YAAY,GAAG;AAClC,aAAO,KAAK,KAAK;AACjB;AAAA,IACF;AAGA,QAAI,WAAW,GAAG;AAChB,UAAI,gBAAgB;AACpB,UAAI,eAAe;AACnB,eAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,iBAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,gBAAM,IAAI,MAAM,CAAC,IAAI,CAAC,GAAG,QAAQ;AACjC,4BAAkB,EAAE,MAAM,KAAK,KAAK,CAAC,GAAG;AACxC,0BAAgB,EAAE;AAAA,QACpB;AAAA,MACF;AAGA,UAAI,gBAAgB,KAAM,WAAW,KAAK,eAAe,KAAM;AAE7D,iBAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,mBAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,kBAAM,WAAW,MAAM,CAAC,IAAI,CAAC,GAAG,MAAM,KAAK;AAC3C,gBAAI,CAAC,SAAU;AAEf,uBAAW,QAAQ,SAAS,MAAM,IAAI,GAAG;AACvC,oBAAM,UAAU,KAAK,KAAK;AAC1B,kBAAI,CAAC,QAAS;AACd,qBAAO,KAAK,EAAE,MAAM,aAAa,MAAM,SAAS,YAAY,MAAM,WAAW,CAAC;AAAA,YAChF;AAAA,UACF;AAAA,QACF;AACA;AAAA,MACF;AAAA,IACF;AAEA,WAAO,KAAK,KAAK;AAAA,EACnB;AAEA,SAAO;AACT;AAEO,SAAS,iBAAiB,QAA2B;AAC1D,QAAM,QAAkB,CAAC;AAEzB,WAAS,IAAI,GAAG,IAAI,OAAO,QAAQ,KAAK;AACtC,UAAM,QAAQ,OAAO,CAAC;AAGtB,QAAI,MAAM,SAAS,aAAa,MAAM,MAAM;AAC1C,YAAM,SAAS,IAAI,OAAO,KAAK,IAAI,MAAM,SAAS,GAAG,CAAC,CAAC;AACvD,YAAM,cAAc,aAAa,MAAM,IAAI;AAC3C,UAAI,YAAa,OAAM,KAAK,IAAI,GAAG,MAAM,IAAI,WAAW,IAAI,EAAE;AAC9D;AAAA,IACF;AAGA,QAAI,MAAM,SAAS,WAAW,MAAM,MAAM;AACxC,YAAM,KAAK,IAAI,YAAY,MAAM,IAAI,KAAK,EAAE;AAC5C;AAAA,IACF;AAGA,QAAI,MAAM,SAAS,aAAa;AAC9B,YAAM,KAAK,IAAI,OAAO,EAAE;AACxB;AAAA,IACF;AAGA,QAAI,MAAM,SAAS,UAAU,MAAM,MAAM;AACvC,YAAM,WAAW,aAAa,MAAM,IAAI;AACxC,UAAI,CAAC,SAAU;AAEf,YAAM,kBAAkB,MAAM,aAAa,aAAa,WAAW,KAAK,QAAQ;AAChF,YAAM,SAAS,kBAAkB,KAAK,MAAM,aAAa,YAAY,QAAQ;AAC7E,YAAM,KAAK,GAAG,MA
AM,GAAG,QAAQ,EAAE;AACjC,UAAI,MAAM,UAAU;AAClB,mBAAW,SAAS,MAAM,UAAU;AAClC,gBAAM,cAAc,MAAM,aAAa,YAAY,OAAO;AAC1D,gBAAM,KAAK,KAAK,WAAW,IAAI,MAAM,QAAQ,EAAE,EAAE;AAAA,QACnD;AAAA,MACF;AACA;AAAA,IACF;AAEA,QAAI,MAAM,SAAS,eAAe,MAAM,MAAM;AAC5C,UAAI,OAAO,aAAa,MAAM,IAAI;AAClC,UAAI,CAAC,KAAM;AAGX,UAAI,cAAc,KAAK,IAAI,GAAG;AAC5B,cAAM,YAAY,OAAO,IAAI,CAAC;AAC9B,YAAI,WAAW,SAAS,eAAe,UAAU,QAAQ,SAAS,KAAK,UAAU,IAAI,GAAG;AACtF,gBAAM,KAAK,IAAI,MAAM,IAAI,IAAI,UAAU,IAAI,IAAI,EAAE;AACjD;AAAA,QACF,OAAO;AACL,gBAAM,KAAK,IAAI,MAAM,IAAI,IAAI,EAAE;AAAA,QACjC;AACA;AAAA,MACF;AAEA,UAAI,sBAAsB,KAAK,IAAI,GAAG;AACpC,cAAM,KAAK,IAAI,IAAI,KAAK,EAAE;AAC1B;AAAA,MACF;AAGA,UAAI,MAAM,MAAM;AACd,cAAM,OAAO,aAAa,MAAM,IAAI;AACpC,YAAI,KAAM,QAAO,IAAI,IAAI,KAAK,IAAI;AAAA,MACpC;AAGA,UAAI,MAAM,cAAc;AACtB,gBAAQ,aAAQ,MAAM,YAAY;AAAA,MACpC;AAEA,YAAM,KAAK,UAAU,IAAI,GAAG,EAAE;AAAA,IAChC,WAAW,MAAM,SAAS,WAAW,MAAM,OAAO;AAEhD,UAAI,MAAM,SAAS,KAAK,MAAM,MAAM,SAAS,CAAC,MAAM,IAAI;AACtD,cAAM,KAAK,EAAE;AAAA,MACf;AACA,YAAM,UAAU,gBAAgB,MAAM,KAAK;AAC3C,UAAI,SAAS;AACX,cAAM,KAAK,OAAO;AAClB,cAAM,KAAK,EAAE;AAAA,MACf;AAAA,IACF;AAAA,EACF;AAEA,SAAO,MAAM,KAAK,IAAI,EAAE,KAAK;AAC/B;AAGA,SAAS,eAAe,OAAyB;AAC/C,aAAW,OAAO,MAAM,OAAO;AAC7B,eAAW,QAAQ,KAAK;AACtB,UAAI,KAAK,UAAU,KAAK,KAAK,UAAU,EAAG,QAAO;AAAA,IACnD;AAAA,EACF;AACA,SAAO;AACT;AAGA,SAAS,YAAY,OAAwB;AAC3C,QAAM,EAAE,OAAO,MAAM,SAAS,MAAM,QAAQ,IAAI;AAChD,QAAM,OAAO,oBAAI,IAAY;AAC7B,QAAM,QAAkB,CAAC,SAAS;AAElC,WAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,UAAM,MAAM,MAAM,IAAI,OAAO;AAC7B,UAAM,UAAoB,CAAC;AAC3B,aAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,UAAI,KAAK,IAAI,GAAG,CAAC,IAAI,CAAC,EAAE,EAAG;AAC3B,YAAM,OAAO,MAAM,CAAC,IAAI,CAAC;AACzB,UAAI,CAAC,KAAM;AAGX,eAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,iBAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,cAAI,OAAO,KAAK,OAAO,EAAG;AAC1B,cAAI,IAAI,KAAK,WAAW,IAAI,KAAK,QAAS,MAAK,IAAI,GAAG,IAAI,EAAE,IAAI,IAAI,EAAE,EAAE;AAAA,QAC1E;AAAA,MACF;AAEA,YAAM,OAAO,aAAa,KAAK,IAAI,EAAE,QAAQ,OAAO,MAAM;AAC1D,YAAM,QAAkB,CAAC;AACzB,UAAI,KAAK,UAAU,EAAG,OAAM,KAAK,YAAY,KAAK,OAAO,GAAG;AAC5D,UAAI,KAAK,UAAU,EAAG,OAAM,KAAK,
YAAY,KAAK,OAAO,GAAG;AAC5D,YAAM,UAAU,MAAM,SAAS,MAAM,MAAM,KAAK,GAAG,IAAI;AACvD,cAAQ,KAAK,IAAI,GAAG,GAAG,OAAO,IAAI,IAAI,KAAK,GAAG,GAAG;AAAA,IACnD;AACA,QAAI,QAAQ,OAAQ,OAAM,KAAK,OAAO,QAAQ,KAAK,EAAE,CAAC,OAAO;AAAA,EAC/D;AAEA,QAAM,KAAK,UAAU;AACrB,SAAO,MAAM,KAAK,IAAI;AACxB;AAEA,SAAS,gBAAgB,OAAwB;AAC/C,MAAI,MAAM,SAAS,KAAK,MAAM,SAAS,EAAG,QAAO;AAEjD,QAAM,EAAE,OAAO,MAAM,SAAS,MAAM,QAAQ,IAAI;AAGhD,MAAI,eAAe,KAAK,EAAG,QAAO,YAAY,KAAK;AAGnD,MAAI,YAAY,KAAK,YAAY,GAAG;AAClC,UAAM,UAAU,aAAa,MAAM,CAAC,EAAE,CAAC,EAAE,IAAI;AAC7C,QAAI,CAAC,QAAS,QAAO;AACrB,WAAO,QACJ,MAAM,IAAI,EACV,IAAI,UAAQ;AACX,YAAM,UAAU,KAAK,KAAK;AAC1B,UAAI,CAAC,QAAS,QAAO;AACrB,UAAI,WAAW,KAAK,OAAO,EAAG,QAAO,KAAK,UAAU,OAAO,CAAC;AAC5D,UAAI,aAAa,KAAK,OAAO,EAAG,QAAO,KAAK,UAAU,OAAO,CAAC;AAC9D,aAAO,UAAU,OAAO;AAAA,IAC1B,CAAC,EACA,OAAO,OAAO,EACd,KAAK,IAAI;AAAA,EACd;AAGA,MAAI,YAAY,KAAK,WAAW,GAAG;AACjC,WAAO,MACJ,IAAI,SAAO,UAAU,aAAa,IAAI,CAAC,EAAE,IAAI,CAAC,EAAE,QAAQ,OAAO,GAAG,CAAC,EACnE,OAAO,OAAO,EACd,KAAK,IAAI;AAAA,EACd;AAGA,QAAM,UAAsB,MAAM,KAAK,EAAE,QAAQ,QAAQ,GAAG,MAAM,MAAM,OAAO,EAAE,KAAK,EAAE,CAAC;AACzF,QAAM,OAAO,oBAAI,IAAY;AAE7B,WAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,aAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,UAAI,KAAK,IAAI,GAAG,CAAC,IAAI,CAAC,EAAE,EAAG;AAC3B,YAAM,OAAO,MAAM,CAAC,IAAI,CAAC;AACzB,UAAI,CAAC,KAAM;AACX,cAAQ,CAAC,EAAE,CAAC,IAAI,UAAU,aAAa,KAAK,IAAI,CAAC,EAAE,QAAQ,OAAO,KAAK,EAAE,QAAQ,OAAO,MAAM;AAG9F,eAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,iBAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,cAAI,OAAO,KAAK,OAAO,EAAG;AAC1B,cAAI,IAAI,KAAK,WAAW,IAAI,KAAK,SAAS;AACxC,iBAAK,IAAI,GAAG,IAAI,EAAE,IAAI,IAAI,EAAE,EAAE;AAAA,UAChC;AAAA,QACF;AAAA,MACF;AAEA,WAAK,KAAK,UAAU;AAAA,IACtB;AAAA,EACF;AAMA,QAAM,aAAyB,CAAC;AAChC,MAAI,kBAAkB;AACtB,WAAS,IAAI,GAAG,IAAI,QAAQ,QAAQ,KAAK;AACvC,UAAM,MAAM,QAAQ,CAAC;AACrB,UAAM,qBAAqB,IAAI,MAAM,UAAQ,SAAS,EAAE;AACxD,QAAI,mBAAoB;AAIxB,UAAM,eAAe,IAAI,OAAO,UAAQ,SAAS,EAAE;AACnD,UAAM,eAAe,IAAI,KAAK,CAAC,GAAG,MAAM,KAAK,IAAI,GAAG,CAAC,IAAI,CAAC,EAAE,CAAC;AAC7D,QAAI,CAAC,gBAAgB,aAAa,WAAW,KAAK,IAAI,CAAC,MAAM,MAAM,IAAI,MAAM,CAAC,EAAE,MAA
M,OAAK,MAAM,EAAE,GAAG;AACpG,wBAAkB,IAAI,CAAC;AACvB;AAAA,IACF;AAGA,QAAI,mBAAmB,IAAI,CAAC,MAAM,IAAI;AACpC,UAAI,CAAC,IAAI;AACT,wBAAkB;AAAA,IACpB,OAAO;AACL,wBAAkB;AAAA,IACpB;AACA,eAAW,KAAK,GAAG;AAAA,EACrB;AAEA,MAAI,WAAW,WAAW,EAAG,QAAO;AAEpC,QAAM,KAAe,CAAC;AACtB,KAAG,KAAK,OAAO,WAAW,CAAC,EAAE,KAAK,KAAK,IAAI,IAAI;AAC/C,KAAG,KAAK,OAAO,WAAW,CAAC,EAAE,IAAI,MAAM,KAAK,EAAE,KAAK,KAAK,IAAI,IAAI;AAChE,WAAS,IAAI,GAAG,IAAI,WAAW,QAAQ,KAAK;AAC1C,OAAG,KAAK,OAAO,WAAW,CAAC,EAAE,KAAK,KAAK,IAAI,IAAI;AAAA,EACjD;AACA,SAAO,GAAG,KAAK,IAAI;AACrB;;;ACnLO,IAAM,mBAAmB;AACzB,IAAM,mBAAmB;AACzB,IAAM,mBAAmB;","names":[]}
|
|
1
|
+
{"version":3,"sources":["../src/utils.ts","../src/table/builder.ts","../src/types.ts"],"sourcesContent":["/** kordoc 공용 유틸리티 */\n\n/** 빌드 타임에 tsup define으로 주입되는 버전 */\ndeclare const __KORDOC_VERSION__: string\nexport const VERSION: string = typeof __KORDOC_VERSION__ !== \"undefined\" ? __KORDOC_VERSION__ : \"0.0.0-dev\"\n\n/**\n * Node.js Buffer → ArrayBuffer 변환\n * pool Buffer의 공유 ArrayBuffer 문제를 안전하게 처리.\n * offset=0이고 전체 ArrayBuffer를 차지하면 복사 없이 직접 반환.\n */\nexport function toArrayBuffer(buf: Buffer): ArrayBuffer {\n if (buf.byteOffset === 0 && buf.byteLength === buf.buffer.byteLength) {\n return buf.buffer as ArrayBuffer\n }\n return buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength) as ArrayBuffer\n}\n\n/**\n * kordoc 내부 에러 클래스 — 사용자에게 노출해도 안전한 메시지만 포함.\n * MCP 에러 정제에서 instanceof로 판별하여 allowlist 패턴 매칭 없이 안전하게 통과.\n */\nexport class KordocError extends Error {\n constructor(message: string) {\n super(message)\n this.name = \"KordocError\"\n }\n}\n\n/**\n * 에러 메시지 정제 — KordocError는 그대로, 나머지는 일반 메시지로 대체.\n * 파일시스템 경로, 스택 트레이스 등 내부 정보 노출 방지.\n */\nexport function sanitizeError(err: unknown): string {\n if (err instanceof KordocError) return err.message\n return \"문서 처리 중 오류가 발생했습니다\"\n}\n\n/**\n * ZIP 엔트리 경로의 경로 순회 여부 판별.\n * 백슬래시 정규화, .., 절대경로, Windows 드라이브 문자 모두 차단.\n */\nexport function isPathTraversal(name: string): boolean {\n if (name.includes(\"\\x00\")) return true\n const normalized = name.replace(/\\\\/g, \"/\")\n const segments = normalized.split(\"/\")\n return segments.some(s => s === \"..\") || normalized.startsWith(\"/\") || /^[A-Za-z]:/.test(normalized)\n}\n\n// ─── ZIP 안전 로딩 (ZIP bomb 방지) ────────────────────\n\n/**\n * ZIP bomb 사전 검사 — Central Directory에서 비압축 합계와 엔트리 수 확인.\n * HWPX/XLSX/DOCX 등 모든 ZIP 기반 포맷에서 공통 사용.\n */\nexport function precheckZipSize(\n buffer: ArrayBuffer,\n maxUncompressedSize = 100 * 1024 * 1024,\n maxEntries = 500,\n): { totalUncompressed: number; entryCount: number } {\n try {\n const data = new 
DataView(buffer)\n const len = buffer.byteLength\n // EOCD 시그니처 역방향 스캔\n let eocdOffset = -1\n for (let i = len - 22; i >= Math.max(0, len - 65557); i--) {\n if (data.getUint32(i, true) === 0x06054b50) { eocdOffset = i; break }\n }\n if (eocdOffset < 0) return { totalUncompressed: 0, entryCount: 0 }\n\n const entryCount = data.getUint16(eocdOffset + 10, true)\n if (entryCount > maxEntries) {\n throw new KordocError(`ZIP 엔트리 수 초과: ${entryCount} (최대 ${maxEntries})`)\n }\n\n const cdSize = data.getUint32(eocdOffset + 12, true)\n const cdOffset = data.getUint32(eocdOffset + 16, true)\n if (cdOffset + cdSize > len) return { totalUncompressed: 0, entryCount }\n\n let totalUncompressed = 0\n let pos = cdOffset\n for (let i = 0; i < entryCount && pos + 46 <= cdOffset + cdSize; i++) {\n if (data.getUint32(pos, true) !== 0x02014b50) break\n totalUncompressed += data.getUint32(pos + 24, true)\n const nameLen = data.getUint16(pos + 28, true)\n const extraLen = data.getUint16(pos + 30, true)\n const commentLen = data.getUint16(pos + 32, true)\n pos += 46 + nameLen + extraLen + commentLen\n }\n\n if (totalUncompressed > maxUncompressedSize) {\n throw new KordocError(`ZIP 비압축 크기 초과: ${(totalUncompressed / 1024 / 1024).toFixed(1)}MB (최대 ${maxUncompressedSize / 1024 / 1024}MB)`)\n }\n\n return { totalUncompressed, entryCount }\n } catch (err) {\n if (err instanceof KordocError) throw err\n return { totalUncompressed: 0, entryCount: 0 }\n }\n}\n\n/** XXE/Billion Laughs 방지 — DOCTYPE 제거 (내부 DTD 서브셋 포함) */\nexport function stripDtd(xml: string): string {\n return xml.replace(/<!DOCTYPE\\s[^[>]*(\\[[\\s\\S]*?\\])?\\s*>/gi, \"\")\n}\n\n/** 하이퍼링크 URL 살균 — javascript: 등 XSS 위험 스킴 차단 */\nconst SAFE_HREF_RE = /^(?:https?:|mailto:|tel:|#)/i\nexport function sanitizeHref(href: string): string | null {\n const trimmed = href.trim()\n if (!trimmed || !SAFE_HREF_RE.test(trimmed)) return null\n return trimmed\n}\n\n// ─── 안전한 min/max (스택 오버플로 방지) ─────────────\n\n/** Math.min(...arr) 대체 — 대형 배열에서 
스택 오버플로 방지 */\nexport function safeMin(arr: number[]): number {\n let min = Infinity\n for (let i = 0; i < arr.length; i++) if (arr[i] < min) min = arr[i]\n return min\n}\n\n/** Math.max(...arr) 대체 — 대형 배열에서 스택 오버플로 방지 */\nexport function safeMax(arr: number[]): number {\n let max = -Infinity\n for (let i = 0; i < arr.length; i++) if (arr[i] > max) max = arr[i]\n return max\n}\n\n// ─── 에러 분류 ──────────────────────────────────────\n\nimport type { ErrorCode } from \"./types.js\"\n\n/** 에러를 구조화된 ErrorCode로 분류 — KordocError 메시지 패턴 매칭 */\nexport function classifyError(err: unknown): ErrorCode {\n if (!(err instanceof Error)) return \"PARSE_ERROR\"\n const msg = err.message\n if (msg.includes(\"암호화\")) return \"ENCRYPTED\"\n if (msg.includes(\"DRM\")) return \"DRM_PROTECTED\"\n if (msg.includes(\"ZIP bomb\") || msg.includes(\"ZIP 비압축 크기 초과\") || msg.includes(\"ZIP 엔트리 수 초과\")) return \"ZIP_BOMB\"\n if (msg.includes(\"bomb\") || msg.includes(\"크기 초과\") || msg.includes(\"압축 해제\")) return \"DECOMPRESSION_BOMB\"\n if (msg.includes(\"이미지 기반\")) return \"IMAGE_BASED_PDF\"\n if (msg.includes(\"섹션\") && (msg.includes(\"찾을 수 없\") || msg.includes(\"없음\"))) return \"NO_SECTIONS\"\n if (msg.includes(\"시그니처\") || msg.includes(\"복구할 수 없\")) return \"CORRUPTED\"\n return \"PARSE_ERROR\"\n}\n","/** 2-pass colSpan/rowSpan 테이블 빌더 및 Markdown 변환 */\r\n\r\nimport type { CellContext, IRBlock, IRCell, IRTable } from \"../types.js\"\r\nimport { sanitizeHref } from \"../utils.js\"\r\n\r\n/** 테이블 열 수 상한 — 한국 공공문서 기준 충분한 값 */\r\nexport const MAX_COLS = 200\r\n/** 테이블 행 수 상한 — 메모리 폭주 방지 */\r\nexport const MAX_ROWS = 10000\r\n\r\nexport function buildTable(rows: CellContext[][]): IRTable {\r\n if (rows.length > MAX_ROWS) rows = rows.slice(0, MAX_ROWS)\r\n const numRows = rows.length\r\n\r\n // colAddr/rowAddr가 있으면 직접 배치 (HWPX cellAddr, HWP5 colAddr/rowAddr)\r\n const hasAddr = rows.some(row => row.some(c => c.colAddr !== undefined && c.rowAddr !== undefined))\r\n if (hasAddr) return 
buildTableDirect(rows, numRows)\r\n\r\n // Pass 1: maxCols 계산 — 2D 배열 사용 (동적 확장)\r\n let maxCols = 0\r\n const tempOccupied: boolean[][] = Array.from({ length: numRows }, () => [])\r\n\r\n for (let rowIdx = 0; rowIdx < numRows; rowIdx++) {\r\n let colIdx = 0\r\n for (const cell of rows[rowIdx]) {\r\n while (colIdx < MAX_COLS && tempOccupied[rowIdx][colIdx]) colIdx++\r\n if (colIdx >= MAX_COLS) break\r\n\r\n for (let r = rowIdx; r < Math.min(rowIdx + cell.rowSpan, numRows); r++) {\r\n for (let c = colIdx; c < Math.min(colIdx + cell.colSpan, MAX_COLS); c++) {\r\n tempOccupied[r][c] = true\r\n }\r\n }\r\n colIdx += cell.colSpan\r\n if (colIdx > maxCols) maxCols = colIdx\r\n }\r\n }\r\n\r\n if (maxCols === 0) return { rows: 0, cols: 0, cells: [], hasHeader: false }\r\n\r\n // Pass 2: 실제 배치\r\n const grid: IRCell[][] = Array.from({ length: numRows }, () =>\r\n Array.from({ length: maxCols }, () => ({ text: \"\", colSpan: 1, rowSpan: 1 }))\r\n )\r\n const occupied: boolean[][] = Array.from({ length: numRows }, () => Array(maxCols).fill(false))\r\n\r\n for (let rowIdx = 0; rowIdx < numRows; rowIdx++) {\r\n let colIdx = 0\r\n let cellIdx = 0\r\n\r\n while (colIdx < maxCols && cellIdx < rows[rowIdx].length) {\r\n while (colIdx < maxCols && occupied[rowIdx][colIdx]) colIdx++\r\n if (colIdx >= maxCols) break\r\n\r\n const cell = rows[rowIdx][cellIdx]\r\n grid[rowIdx][colIdx] = {\r\n text: cell.text.trim(),\r\n colSpan: cell.colSpan,\r\n rowSpan: cell.rowSpan,\r\n }\r\n\r\n for (let r = rowIdx; r < Math.min(rowIdx + cell.rowSpan, numRows); r++) {\r\n for (let c = colIdx; c < Math.min(colIdx + cell.colSpan, maxCols); c++) {\r\n occupied[r][c] = true\r\n }\r\n }\r\n\r\n colIdx += cell.colSpan\r\n cellIdx++\r\n }\r\n }\r\n\r\n return trimAndReturn(grid, numRows, maxCols)\r\n}\r\n\r\n/** colAddr/rowAddr 절대 좌표 기반 직접 배치 */\r\nfunction buildTableDirect(rows: CellContext[][], numRows: number): IRTable {\r\n // 전체 셀에서 maxCols 계산 (MAX_COLS 상한 적용)\r\n let maxCols = 0\r\n for (const row 
of rows) {\r\n for (const cell of row) {\r\n const end = (cell.colAddr ?? 0) + cell.colSpan\r\n if (end > maxCols) maxCols = end\r\n }\r\n }\r\n if (maxCols > MAX_COLS) maxCols = MAX_COLS\r\n if (maxCols === 0) return { rows: 0, cols: 0, cells: [], hasHeader: false }\r\n\r\n const grid: IRCell[][] = Array.from({ length: numRows }, () =>\r\n Array.from({ length: maxCols }, () => ({ text: \"\", colSpan: 1, rowSpan: 1 }))\r\n )\r\n\r\n for (const row of rows) {\r\n for (const cell of row) {\r\n const r = cell.rowAddr ?? 0\r\n const c = cell.colAddr ?? 0\r\n if (r >= numRows || c >= maxCols || r < 0 || c < 0) continue\r\n\r\n grid[r][c] = { text: cell.text.trim(), colSpan: cell.colSpan, rowSpan: cell.rowSpan }\r\n\r\n // 병합 영역 마킹\r\n for (let dr = 0; dr < cell.rowSpan; dr++) {\r\n for (let dc = 0; dc < cell.colSpan; dc++) {\r\n if (dr === 0 && dc === 0) continue\r\n if (r + dr < numRows && c + dc < maxCols) {\r\n grid[r + dr][c + dc] = { text: \"\", colSpan: 1, rowSpan: 1 }\r\n }\r\n }\r\n }\r\n }\r\n }\r\n\r\n return trimAndReturn(grid, numRows, maxCols)\r\n}\r\n\r\n/** 빈 후행 열 제거 후 IRTable 반환 */\r\nfunction trimAndReturn(grid: IRCell[][], numRows: number, maxCols: number): IRTable {\r\n let effectiveCols = maxCols\r\n while (effectiveCols > 0) {\r\n const colEmpty = grid.every(row => !row[effectiveCols - 1]?.text?.trim())\r\n if (!colEmpty) break\r\n effectiveCols--\r\n }\r\n if (effectiveCols < maxCols && effectiveCols > 0) {\r\n const trimmed = grid.map(row => row.slice(0, effectiveCols))\r\n return { rows: numRows, cols: effectiveCols, cells: trimmed, hasHeader: numRows > 1 }\r\n }\r\n return { rows: numRows, cols: maxCols, cells: grid, hasHeader: numRows > 1 }\r\n}\r\n\r\nexport function convertTableToText(rows: CellContext[][]): string {\r\n return rows\r\n .map(row =>\r\n row\r\n .map(c => c.text.trim().replace(/\\n/g, \" \").replace(/\\|/g, \"\\\\|\"))\r\n .filter(Boolean)\r\n .join(\" / \")\r\n )\r\n .filter(Boolean)\r\n .join(\"\\n\")\r\n}\r\n\r\n/** 마크다운 GFM 
특수문자 이스케이프 — remark-gfm 오해석 방지 */\r\nfunction escapeGfm(text: string): string {\r\n // ~ → \\~ (GFM strikethrough 방지)\r\n return text.replace(/~/g, \"\\\\~\")\r\n}\r\n\r\n/** HWP 자동생성 도형/개체 대체텍스트 정규식 — 한컴오피스가 삽입하는 모든 알려진 패턴 */\r\nconst HWP_SHAPE_ALT_TEXT_RE = /(?:모서리가 둥근 |둥근 )?(?:사각형|직사각형|정사각형|원|타원|삼각형|이등변 삼각형|직각 삼각형|선|직선|곡선|화살표|굵은 화살표|이중 화살표|오각형|육각형|팔각형|별|[4-8]점별|십자|십자형|구름|구름형|마름모|도넛|평행사변형|사다리꼴|부채꼴|호|반원|물결|번개|하트|빗금|블록 화살표|수식|표|그림|개체|그리기\\s?개체|묶음\\s?개체|글상자|수식\\s?개체|OLE\\s?개체)\\s?입니다\\.?/g\r\n\r\n/** HWP PUA 특수문자 및 도형 대체텍스트 제거 — 모든 포맷 공통 */\r\nfunction sanitizeText(text: string): string {\r\n let result = text\r\n // Supplementary Private Use Area (U+F0000-U+FFFFD) — HWP 전용 기호\r\n .replace(/[\\u{F0000}-\\u{FFFFD}]/gu, \"\")\r\n // HWP 도형/개체 자동생성 대체텍스트 제거\r\n .replace(HWP_SHAPE_ALT_TEXT_RE, \"\")\r\n .replace(/ +/g, \" \")\r\n .trim()\r\n // 균등배분 스페이스 정리 (\"현 장 대 응 단 장\" → \"현장대응단장\")\r\n // 짧은 텍스트(30자 이하)에서 70%+ 토큰이 한글 1글자면 균등배분으로 판단\r\n if (result.length <= 30 && result.includes(\" \")) {\r\n const tokens = result.split(\" \")\r\n // 한글 1글자 토큰만 카운트 — ASCII 특수문자(< > & 등)는 균등배분이 아님\r\n const koreanSingleCharCount = tokens.filter(t => t.length === 1 && /[\\uAC00-\\uD7AF\\u3131-\\u318E]/.test(t)).length\r\n if (tokens.length >= 3 && koreanSingleCharCount / tokens.length >= 0.7) {\r\n result = tokens.join(\"\")\r\n }\r\n }\r\n return result\r\n}\r\n\r\n/**\r\n * 레이아웃 테이블 감지 및 해체 — IRBlock 레벨에서 수행\r\n * 적은 행(≤3) + 셀 내 줄바꿈 다량 → table 블록을 paragraph 블록들로 분해\r\n * heading 감지 전에 호출해야 해체된 텍스트에 heading 감지 적용 가능\r\n */\r\nexport function flattenLayoutTables(blocks: IRBlock[]): IRBlock[] {\r\n const result: IRBlock[] = []\r\n\r\n for (const block of blocks) {\r\n if (block.type !== \"table\" || !block.table) {\r\n result.push(block)\r\n continue\r\n }\r\n\r\n const { rows: numRows, cols: numCols, cells } = block.table\r\n\r\n // 1x1 테이블은 기존 로직(tableToMarkdown)에서 처리\r\n if (numRows === 1 && numCols === 1) {\r\n result.push(block)\r\n continue\r\n }\r\n\r\n // 레이아웃 테이블 휴리스틱\r\n if 
(numRows <= 3) {\r\n let totalNewlines = 0\r\n let totalTextLen = 0\r\n for (let r = 0; r < numRows; r++) {\r\n for (let c = 0; c < numCols; c++) {\r\n const t = cells[r]?.[c]?.text || \"\"\r\n totalNewlines += (t.match(/\\n/g) || []).length\r\n totalTextLen += t.length\r\n }\r\n }\r\n\r\n // 레이아웃 테이블 판정: 많은 줄바꿈(>5), 또는 적은 행에 비해 총 텍스트 과다(>300)\r\n if (totalNewlines > 5 || (numRows <= 2 && totalTextLen > 300)) {\r\n // 레이아웃 테이블 → 각 셀을 paragraph 블록으로 분해\r\n for (let r = 0; r < numRows; r++) {\r\n for (let c = 0; c < numCols; c++) {\r\n const cellText = cells[r]?.[c]?.text?.trim()\r\n if (!cellText) continue\r\n // 셀 내 줄바꿈을 별도 paragraph로 분리\r\n for (const line of cellText.split(\"\\n\")) {\r\n const trimmed = line.trim()\r\n if (!trimmed) continue\r\n result.push({ type: \"paragraph\", text: trimmed, pageNumber: block.pageNumber })\r\n }\r\n }\r\n }\r\n continue\r\n }\r\n }\r\n\r\n result.push(block)\r\n }\r\n\r\n return result\r\n}\r\n\r\nexport function blocksToMarkdown(blocks: IRBlock[]): string {\r\n const lines: string[] = []\r\n\r\n for (let i = 0; i < blocks.length; i++) {\r\n const block = blocks[i]\r\n\r\n // 헤딩 블록\r\n if (block.type === \"heading\" && block.text) {\r\n const prefix = \"#\".repeat(Math.min(block.level || 2, 6))\r\n const headingText = sanitizeText(block.text)\r\n if (headingText) lines.push(\"\", `${prefix} ${headingText}`, \"\")\r\n continue\r\n }\r\n\r\n // 이미지 블록 —  참조\r\n if (block.type === \"image\" && block.text) {\r\n lines.push(\"\", ``, \"\")\r\n continue\r\n }\r\n\r\n // 구분선 블록\r\n if (block.type === \"separator\") {\r\n lines.push(\"\", \"---\", \"\")\r\n continue\r\n }\r\n\r\n // 리스트 블록\r\n if (block.type === \"list\" && block.text) {\r\n const listText = sanitizeText(block.text)\r\n if (!listText) continue\r\n // 텍스트가 이미 번호로 시작하면 그대로 출력 (원래 번호 보존)\r\n const alreadyNumbered = block.listType === \"ordered\" && /^\\d+\\.\\s/.test(listText)\r\n const prefix = alreadyNumbered ? \"\" : block.listType === \"ordered\" ? \"1. 
\" : \"- \"\r\n lines.push(`${prefix}${listText}`)\r\n if (block.children) {\r\n for (const child of block.children) {\r\n const childPrefix = child.listType === \"ordered\" ? \"1.\" : \"-\"\r\n lines.push(` ${childPrefix} ${child.text || \"\"}`)\r\n }\r\n }\r\n continue\r\n }\r\n\r\n if (block.type === \"paragraph\" && block.text) {\r\n let text = sanitizeText(block.text)\r\n if (!text) continue\r\n\r\n // 별표 패턴 (기존 호환)\r\n if (/^\\[별표\\s*\\d+/.test(text)) {\r\n const nextBlock = blocks[i + 1]\r\n if (nextBlock?.type === \"paragraph\" && nextBlock.text && /관련\\)?$/.test(nextBlock.text)) {\r\n lines.push(\"\", `## ${text} ${nextBlock.text}`, \"\")\r\n i++\r\n } else {\r\n lines.push(\"\", `## ${text}`, \"\")\r\n }\r\n continue\r\n }\r\n\r\n if (/^\\([^)]*조[^)]*관련\\)$/.test(text)) {\r\n lines.push(`*${text}*`, \"\")\r\n continue\r\n }\r\n\r\n // 하이퍼링크가 있으면 텍스트에 링크 삽입 (javascript: 등 위험 스킴 제거)\r\n if (block.href) {\r\n const href = sanitizeHref(block.href)\r\n if (href) text = `[${text}](${href})`\r\n }\r\n\r\n // 각주가 있으면 괄호로 인라인 삽입\r\n if (block.footnoteText) {\r\n text += ` (주: ${block.footnoteText})`\r\n }\r\n\r\n lines.push(escapeGfm(text), \"\")\r\n } else if (block.type === \"table\" && block.table) {\r\n // 테이블 앞에 빈 줄 보장 (마크다운 렌더링 필수)\r\n if (lines.length > 0 && lines[lines.length - 1] !== \"\") {\r\n lines.push(\"\")\r\n }\r\n const tableMd = tableToMarkdown(block.table)\r\n if (tableMd) {\r\n lines.push(tableMd)\r\n lines.push(\"\")\r\n }\r\n }\r\n }\r\n\r\n return lines.join(\"\\n\").trim()\r\n}\r\n\r\n/** 병합 셀 존재 여부 확인 */\r\nfunction hasMergedCells(table: IRTable): boolean {\r\n for (const row of table.cells) {\r\n for (const cell of row) {\r\n if (cell.colSpan > 1 || cell.rowSpan > 1) return true\r\n }\r\n }\r\n return false\r\n}\r\n\r\n/** 병합 테이블 → HTML <table> 출력 (rowspan/colspan 보존) */\r\nfunction tableToHtml(table: IRTable): string {\r\n const { cells, rows: numRows, cols: numCols } = table\r\n const skip = new Set<string>()\r\n const lines: string[] = 
[\"<table>\"]\r\n\r\n for (let r = 0; r < numRows; r++) {\r\n const tag = r === 0 ? \"th\" : \"td\"\r\n const rowHtml: string[] = []\r\n for (let c = 0; c < numCols; c++) {\r\n if (skip.has(`${r},${c}`)) continue\r\n const cell = cells[r]?.[c]\r\n if (!cell) continue\r\n\r\n // 병합 영역 skip 마킹\r\n for (let dr = 0; dr < cell.rowSpan; dr++) {\r\n for (let dc = 0; dc < cell.colSpan; dc++) {\r\n if (dr === 0 && dc === 0) continue\r\n if (r + dr < numRows && c + dc < numCols) skip.add(`${r + dr},${c + dc}`)\r\n }\r\n }\r\n\r\n const text = sanitizeText(cell.text).replace(/\\n/g, \"<br>\")\r\n const attrs: string[] = []\r\n if (cell.colSpan > 1) attrs.push(`colspan=\"${cell.colSpan}\"`)\r\n if (cell.rowSpan > 1) attrs.push(`rowspan=\"${cell.rowSpan}\"`)\r\n const attrStr = attrs.length ? \" \" + attrs.join(\" \") : \"\"\r\n rowHtml.push(`<${tag}${attrStr}>${text}</${tag}>`)\r\n }\r\n if (rowHtml.length) lines.push(`<tr>${rowHtml.join(\"\")}</tr>`)\r\n }\r\n\r\n lines.push(\"</table>\")\r\n return lines.join(\"\\n\")\r\n}\r\n\r\nfunction tableToMarkdown(table: IRTable): string {\r\n if (table.rows === 0 || table.cols === 0) return \"\"\r\n\r\n const { cells, rows: numRows, cols: numCols } = table\r\n\r\n // 병합 셀이 있으면 HTML 테이블로 출력\r\n if (hasMergedCells(table)) return tableToHtml(table)\r\n\r\n // 1행 1열 → 구조화된 텍스트 (빈 셀이면 스킵)\r\n if (numRows === 1 && numCols === 1) {\r\n const content = sanitizeText(cells[0][0].text)\r\n if (!content) return \"\"\r\n return content\r\n .split(/\\n/)\r\n .map(line => {\r\n const trimmed = line.trim()\r\n if (!trimmed) return \"\"\r\n if (/^\\d+\\.\\s/.test(trimmed)) return `**${escapeGfm(trimmed)}**`\r\n if (/^[가-힣]\\.\\s/.test(trimmed)) return ` ${escapeGfm(trimmed)}`\r\n return escapeGfm(trimmed)\r\n })\r\n .filter(Boolean)\r\n .join(\"\\n\")\r\n }\r\n\r\n // 1열 다행 테이블 → 각 행을 별도 라인으로 출력 (목록성 데이터)\r\n if (numCols === 1 && numRows >= 2) {\r\n return cells\r\n .map(row => escapeGfm(sanitizeText(row[0].text)).replace(/\\n/g, \" \"))\r\n 
.filter(Boolean)\r\n .join(\"\\n\")\r\n }\r\n\r\n // 병합 셀: 행/열 병합된 셀은 빈 칸으로\r\n const display: string[][] = Array.from({ length: numRows }, () => Array(numCols).fill(\"\"))\r\n const skip = new Set<string>()\r\n\r\n for (let r = 0; r < numRows; r++) {\r\n for (let c = 0; c < numCols; c++) {\r\n if (skip.has(`${r},${c}`)) continue\r\n const cell = cells[r]?.[c]\r\n if (!cell) continue\r\n display[r][c] = escapeGfm(sanitizeText(cell.text)).replace(/\\|/g, \"\\\\|\").replace(/\\n/g, \"<br>\")\r\n\r\n // colSpan/rowSpan: 병합된 열은 빈 칸으로 유지 (텍스트 중복 방지)\r\n for (let dr = 0; dr < cell.rowSpan; dr++) {\r\n for (let dc = 0; dc < cell.colSpan; dc++) {\r\n if (dr === 0 && dc === 0) continue\r\n if (r + dr < numRows && c + dc < numCols) {\r\n skip.add(`${r + dr},${c + dc}`)\r\n }\r\n }\r\n }\r\n // colSpan > 1이면 display 열 인덱스를 건너뜀\r\n c += cell.colSpan - 1\r\n }\r\n }\r\n\r\n // rowSpan 잔류 처리:\r\n // 1) 완전 빈 행 제거\r\n // 2) \"첫 열만 값, 나머지 빈\" 행 → 다음 데이터 행의 첫 열에 값을 전파\r\n // 단, colSpan으로 인한 빈 열(skip 셀)은 이 대상이 아님\r\n const uniqueRows: string[][] = []\r\n let pendingFirstCol = \"\"\r\n for (let r = 0; r < display.length; r++) {\r\n const row = display[r]\r\n const isEmptyPlaceholder = row.every(cell => cell === \"\")\r\n if (isEmptyPlaceholder) continue\r\n\r\n // 첫 열만 값이 있고 나머지 모두 빈 행 → 다음 데이터 행의 첫 열에 전파\r\n // 단, colSpan으로 인한 빈 열(skip 셀)은 \"진짜 빈\"이 아니므로 제외\r\n const nonEmptyCols = row.filter(cell => cell !== \"\")\r\n const hasSkipInRow = row.some((_, c) => skip.has(`${r},${c}`))\r\n if (!hasSkipInRow && nonEmptyCols.length === 1 && row[0] !== \"\" && row.slice(1).every(c => c === \"\")) {\r\n pendingFirstCol = row[0]\r\n continue\r\n }\r\n\r\n // 저장된 첫 열 값을 현재 행의 빈 첫 열에 전파\r\n if (pendingFirstCol && row[0] === \"\") {\r\n row[0] = pendingFirstCol\r\n pendingFirstCol = \"\"\r\n } else {\r\n pendingFirstCol = \"\"\r\n }\r\n uniqueRows.push(row)\r\n }\r\n\r\n if (uniqueRows.length === 0) return \"\"\r\n\r\n const md: string[] = []\r\n md.push(\"| \" + uniqueRows[0].join(\" | \") + \" 
|\")\r\n md.push(\"| \" + uniqueRows[0].map(() => \"---\").join(\" | \") + \" |\")\r\n for (let i = 1; i < uniqueRows.length; i++) {\r\n md.push(\"| \" + uniqueRows[i].join(\" | \") + \" |\")\r\n }\r\n return md.join(\"\\n\")\r\n}\r\n","/** kordoc 공통 타입 정의 */\r\n\r\n// ─── 중간 표현 (Intermediate Representation) ─────────\r\n\r\nexport interface CellContext {\r\n text: string\r\n colSpan: number\r\n rowSpan: number\r\n /** HWP5 셀 열 주소 (0-based) — 병합 테이블 배치용 */\r\n colAddr?: number\r\n /** HWP5 셀 행 주소 (0-based) — 병합 테이블 배치용 */\r\n rowAddr?: number\r\n}\r\n\r\n/** 블록 타입 — v2.0에서 heading, list, image, separator 추가 */\r\nexport type IRBlockType = \"paragraph\" | \"table\" | \"heading\" | \"list\" | \"image\" | \"separator\"\r\n\r\nexport interface IRBlock {\r\n type: IRBlockType\r\n text?: string\r\n table?: IRTable\r\n /** 헤딩 레벨 (1-6), type=\"heading\"일 때 사용 */\r\n level?: number\r\n /** 원본 페이지 번호 (1-based) */\r\n pageNumber?: number\r\n /** 바운딩 박스 — PDF에서만 제공 */\r\n bbox?: BoundingBox\r\n /** 텍스트 스타일 정보 (선택) */\r\n style?: InlineStyle\r\n /** 리스트 타입, type=\"list\"일 때 사용 */\r\n listType?: \"ordered\" | \"unordered\"\r\n /** 중첩 리스트 아이템 */\r\n children?: IRBlock[]\r\n /** 하이퍼링크 URL */\r\n href?: string\r\n /** 각주/미주 텍스트 (인라인 삽입용) */\r\n footnoteText?: string\r\n /** 이미지 데이터 (type=\"image\"일 때) */\r\n imageData?: ImageData\r\n}\r\n\r\n/** 추출된 이미지 바이너리 데이터 */\r\nexport interface ImageData {\r\n /** 이미지 바이너리 */\r\n data: Uint8Array\r\n /** MIME 타입 (image/png, image/jpeg, image/gif, image/bmp, image/wmf, image/emf) */\r\n mimeType: string\r\n /** 원본 파일명 (있는 경우) */\r\n filename?: string\r\n}\r\n\r\n/** 바운딩 박스 — PDF 포인트 단위 (72pt = 1인치) */\r\nexport interface BoundingBox {\r\n page: number\r\n x: number\r\n y: number\r\n width: number\r\n height: number\r\n}\r\n\r\n/** 인라인 텍스트 스타일 */\r\nexport interface InlineStyle {\r\n bold?: boolean\r\n italic?: boolean\r\n fontSize?: number\r\n fontName?: string\r\n}\r\n\r\nexport interface IRTable {\r\n rows: number\r\n cols: number\r\n 
cells: IRCell[][]\r\n /** 첫 행을 헤더로 렌더링할지 여부 (현재: rows > 1이면 true — 의미적 감지가 아닌 레이아웃 힌트) */\r\n hasHeader: boolean\r\n}\r\n\r\nexport interface IRCell {\r\n text: string\r\n colSpan: number\r\n rowSpan: number\r\n}\r\n\r\n// ─── 메타데이터 ─────────────────────────────────────\r\n\r\n/** 문서 메타데이터 — 각 포맷에서 추출 가능한 필드만 채워짐 */\r\nexport interface DocumentMetadata {\r\n /** 문서 제목 */\r\n title?: string\r\n /** 작성자 */\r\n author?: string\r\n /** 작성 프로그램 (예: \"한글 2020\", \"Adobe Acrobat\") */\r\n creator?: string\r\n /** 생성일시 (ISO 8601) */\r\n createdAt?: string\r\n /** 수정일시 (ISO 8601) */\r\n modifiedAt?: string\r\n /** 페이지/섹션 수 */\r\n pageCount?: number\r\n /** 문서 포맷 버전 (예: HWP \"5.1.0.1\") */\r\n version?: string\r\n /** 설명 */\r\n description?: string\r\n /** 키워드 */\r\n keywords?: string[]\r\n}\r\n\r\n// ─── 파싱 옵션 ──────────────────────────────────────\r\n\r\n/** 파싱 옵션 — parse() 함수에 전달 */\r\nexport interface ParseOptions {\r\n /**\r\n * 파싱할 페이지/섹션 범위 (1-based).\r\n * - 배열: [1, 2, 3]\r\n * - 문자열: \"1-3\", \"1,3,5-7\"\r\n *\r\n * PDF: 정확한 페이지 단위. 
HWP/HWPX: 섹션 단위 근사치.\r\n */\r\n pages?: number[] | string\r\n /** 이미지 기반 PDF용 OCR 프로바이더 (선택) */\r\n ocr?: OcrProvider\r\n /** 진행률 콜백 — current: 현재 페이지/섹션, total: 전체 수 */\r\n onProgress?: (current: number, total: number) => void\r\n /** PDF 머리글/바닥글 자동 제거 */\r\n removeHeaderFooter?: boolean\r\n}\r\n\r\n// ─── 파싱 경고 ──────────────────────────────────────\r\n\r\n/** 파싱 중 스킵/실패한 요소 보고 */\r\nexport interface ParseWarning {\r\n /** 관련 페이지 번호 (알 수 있는 경우) */\r\n page?: number\r\n /** 경고 메시지 */\r\n message: string\r\n /** 구조화된 경고 코드 */\r\n code: WarningCode\r\n}\r\n\r\nexport type WarningCode =\r\n | \"SKIPPED_IMAGE\"\r\n | \"SKIPPED_OLE\"\r\n | \"TRUNCATED_TABLE\"\r\n | \"OCR_FALLBACK\"\r\n | \"UNSUPPORTED_ELEMENT\"\r\n | \"BROKEN_ZIP_RECOVERY\"\r\n | \"HIDDEN_TEXT_FILTERED\"\r\n | \"MALFORMED_XML\"\r\n | \"PARTIAL_PARSE\"\r\n | \"LENIENT_CFB_RECOVERY\"\r\n\r\n/** 문서 구조 (헤딩 트리) */\r\nexport interface OutlineItem {\r\n level: number\r\n text: string\r\n pageNumber?: number\r\n}\r\n\r\n// ─── 에러 코드 ──────────────────────────────────────\r\n\r\n/** 구조화된 에러 코드 — 프로그래밍적 에러 핸들링용 */\r\nexport type ErrorCode =\r\n | \"EMPTY_INPUT\"\r\n | \"UNSUPPORTED_FORMAT\"\r\n | \"ENCRYPTED\"\r\n | \"DRM_PROTECTED\"\r\n | \"CORRUPTED\"\r\n | \"DECOMPRESSION_BOMB\"\r\n | \"ZIP_BOMB\"\r\n | \"IMAGE_BASED_PDF\"\r\n | \"NO_SECTIONS\"\r\n | \"PARSE_ERROR\"\r\n | \"MISSING_DEPENDENCY\"\r\n\r\n// ─── 파싱 결과 (discriminated union) ────────────────\r\n\r\nexport type FileType = \"hwpx\" | \"hwp\" | \"hwpml\" | \"pdf\" | \"xlsx\" | \"docx\" | \"unknown\"\r\n\r\ninterface ParseResultBase {\r\n fileType: FileType\r\n /** 페이지/섹션 수 — PDF: 실제 페이지 수, HWP/HWPX: 섹션 수, XLSX: 시트 수 */\r\n pageCount?: number\r\n /** 이미지 기반 PDF 여부 (텍스트 추출 불가) */\r\n isImageBased?: boolean\r\n}\r\n\r\nexport interface ParseSuccess extends ParseResultBase {\r\n success: true\r\n /** 추출된 마크다운 텍스트 */\r\n markdown: string\r\n /** 중간 표현 블록 (구조화된 데이터 접근용) */\r\n blocks: IRBlock[]\r\n /** 문서 메타데이터 */\r\n metadata?: DocumentMetadata\r\n /** 문서 
구조 (헤딩 트리) — v2.0 */\r\n outline?: OutlineItem[]\r\n /** 파싱 중 발생한 경고 — v2.0 */\r\n warnings?: ParseWarning[]\r\n /** 추출된 이미지 목록 — 마크다운에서 파일명으로 참조됨 */\r\n images?: ExtractedImage[]\r\n}\r\n\r\n/** 추출된 이미지 — ParseSuccess.images에 포함 */\r\nexport interface ExtractedImage {\r\n /** 마크다운에서 참조되는 파일명 (예: image_001.png) */\r\n filename: string\r\n /** 이미지 바이너리 */\r\n data: Uint8Array\r\n /** MIME 타입 */\r\n mimeType: string\r\n}\r\n\r\nexport interface ParseFailure extends ParseResultBase {\r\n success: false\r\n /** 오류 메시지 */\r\n error: string\r\n /** 구조화된 에러 코드 */\r\n code?: ErrorCode\r\n}\r\n\r\nexport type ParseResult = ParseSuccess | ParseFailure\r\n\r\n// ─── 문서 비교 (Diff) ───────────────────────────────\r\n\r\nexport type DiffChangeType = \"added\" | \"removed\" | \"modified\" | \"unchanged\"\r\n\r\nexport interface BlockDiff {\r\n type: DiffChangeType\r\n /** 원본 블록 (added이면 undefined) */\r\n before?: IRBlock\r\n /** 변경 후 블록 (removed이면 undefined) */\r\n after?: IRBlock\r\n /** modified 테이블의 셀 단위 diff */\r\n cellDiffs?: CellDiff[][]\r\n /** 유사도 (0-1) */\r\n similarity?: number\r\n}\r\n\r\nexport interface CellDiff {\r\n type: DiffChangeType\r\n before?: string\r\n after?: string\r\n}\r\n\r\nexport interface DiffResult {\r\n stats: { added: number; removed: number; modified: number; unchanged: number }\r\n diffs: BlockDiff[]\r\n}\r\n\r\n// ─── 양식 인식 ──────────────────────────────────────\r\n\r\nexport interface FormField {\r\n label: string\r\n value: string\r\n /** 0-based 소스 행 */\r\n row: number\r\n /** 0-based 소스 열 */\r\n col: number\r\n}\r\n\r\nexport interface FormResult {\r\n fields: FormField[]\r\n /** 양식 확신도 (0-1) */\r\n confidence: number\r\n}\r\n\r\n// ─── OCR 프로바이더 ─────────────────────────────────\r\n\r\n/** 사용자 제공 OCR 함수 — 페이지 이미지를 받아 텍스트 반환 */\r\nexport type OcrProvider = (\r\n pageImage: Uint8Array,\r\n pageNumber: number,\r\n mimeType: \"image/png\"\r\n) => Promise<string>\r\n\r\n// ─── Watch 모드 ─────────────────────────────────────\r\n\r\nexport 
interface WatchOptions {\r\n dir: string\r\n outDir?: string\r\n webhook?: string\r\n format?: \"markdown\" | \"json\"\r\n pages?: string\r\n silent?: boolean\r\n}\r\n\r\n// ─── 헤딩 감지 공통 임계값 ──────────────────────────\r\n\r\n/** 폰트 크기 비율 → heading level (전 파서 공통) */\r\nexport const HEADING_RATIO_H1 = 1.5\r\nexport const HEADING_RATIO_H2 = 1.3\r\nexport const HEADING_RATIO_H3 = 1.15\r\n\r\n// ─── 내부 파서 반환 타입 ─────────────────────────────\r\n\r\n/** 내부 파서가 index.ts에 반환하는 공통 타입 (HWP5/HWPX/PDF/XLSX/DOCX) */\r\nexport interface InternalParseResult {\r\n markdown: string\r\n blocks: IRBlock[]\r\n metadata?: DocumentMetadata\r\n outline?: OutlineItem[]\r\n warnings?: ParseWarning[]\r\n images?: ExtractedImage[]\r\n /** PDF 전용: 이미지 기반 PDF 여부 */\r\n isImageBased?: boolean\r\n}\r\n"],"mappings":";AAIO,IAAM,UAAkB,OAA4C,UAAqB;AAOzF,SAAS,cAAc,KAA0B;AACtD,MAAI,IAAI,eAAe,KAAK,IAAI,eAAe,IAAI,OAAO,YAAY;AACpE,WAAO,IAAI;AAAA,EACb;AACA,SAAO,IAAI,OAAO,MAAM,IAAI,YAAY,IAAI,aAAa,IAAI,UAAU;AACzE;AAMO,IAAM,cAAN,cAA0B,MAAM;AAAA,EACrC,YAAY,SAAiB;AAC3B,UAAM,OAAO;AACb,SAAK,OAAO;AAAA,EACd;AACF;AAeO,SAAS,gBAAgB,MAAuB;AACrD,MAAI,KAAK,SAAS,IAAM,EAAG,QAAO;AAClC,QAAM,aAAa,KAAK,QAAQ,OAAO,GAAG;AAC1C,QAAM,WAAW,WAAW,MAAM,GAAG;AACrC,SAAO,SAAS,KAAK,OAAK,MAAM,IAAI,KAAK,WAAW,WAAW,GAAG,KAAK,aAAa,KAAK,UAAU;AACrG;AAQO,SAAS,gBACd,QACA,sBAAsB,MAAM,OAAO,MACnC,aAAa,KACsC;AACnD,MAAI;AACF,UAAM,OAAO,IAAI,SAAS,MAAM;AAChC,UAAM,MAAM,OAAO;AAEnB,QAAI,aAAa;AACjB,aAAS,IAAI,MAAM,IAAI,KAAK,KAAK,IAAI,GAAG,MAAM,KAAK,GAAG,KAAK;AACzD,UAAI,KAAK,UAAU,GAAG,IAAI,MAAM,WAAY;AAAE,qBAAa;AAAG;AAAA,MAAM;AAAA,IACtE;AACA,QAAI,aAAa,EAAG,QAAO,EAAE,mBAAmB,GAAG,YAAY,EAAE;AAEjE,UAAM,aAAa,KAAK,UAAU,aAAa,IAAI,IAAI;AACvD,QAAI,aAAa,YAAY;AAC3B,YAAM,IAAI,YAAY,+CAAiB,UAAU,kBAAQ,UAAU,GAAG;AAAA,IACxE;AAEA,UAAM,SAAS,KAAK,UAAU,aAAa,IAAI,IAAI;AACnD,UAAM,WAAW,KAAK,UAAU,aAAa,IAAI,IAAI;AACrD,QAAI,WAAW,SAAS,IAAK,QAAO,EAAE,mBAAmB,GAAG,WAAW;AAEvE,QAAI,oBAAoB;AACxB,QAAI,MAAM;AACV,aAAS,IAAI,GAAG,IAAI,cAAc,MAAM,MAAM,WAAW,QAAQ,KAAK;AACpE,UAAI,KAAK,UAAU,KAAK,IAAI,MAAM,SA
AY;AAC9C,2BAAqB,KAAK,UAAU,MAAM,IAAI,IAAI;AAClD,YAAM,UAAU,KAAK,UAAU,MAAM,IAAI,IAAI;AAC7C,YAAM,WAAW,KAAK,UAAU,MAAM,IAAI,IAAI;AAC9C,YAAM,aAAa,KAAK,UAAU,MAAM,IAAI,IAAI;AAChD,aAAO,KAAK,UAAU,WAAW;AAAA,IACnC;AAEA,QAAI,oBAAoB,qBAAqB;AAC3C,YAAM,IAAI,YAAY,sDAAmB,oBAAoB,OAAO,MAAM,QAAQ,CAAC,CAAC,oBAAU,sBAAsB,OAAO,IAAI,KAAK;AAAA,IACtI;AAEA,WAAO,EAAE,mBAAmB,WAAW;AAAA,EACzC,SAAS,KAAK;AACZ,QAAI,eAAe,YAAa,OAAM;AACtC,WAAO,EAAE,mBAAmB,GAAG,YAAY,EAAE;AAAA,EAC/C;AACF;AAGO,SAAS,SAAS,KAAqB;AAC5C,SAAO,IAAI,QAAQ,0CAA0C,EAAE;AACjE;AAGA,IAAM,eAAe;AACd,SAAS,aAAa,MAA6B;AACxD,QAAM,UAAU,KAAK,KAAK;AAC1B,MAAI,CAAC,WAAW,CAAC,aAAa,KAAK,OAAO,EAAG,QAAO;AACpD,SAAO;AACT;AAKO,SAAS,QAAQ,KAAuB;AAC7C,MAAI,MAAM;AACV,WAAS,IAAI,GAAG,IAAI,IAAI,QAAQ,IAAK,KAAI,IAAI,CAAC,IAAI,IAAK,OAAM,IAAI,CAAC;AAClE,SAAO;AACT;AAGO,SAAS,QAAQ,KAAuB;AAC7C,MAAI,MAAM;AACV,WAAS,IAAI,GAAG,IAAI,IAAI,QAAQ,IAAK,KAAI,IAAI,CAAC,IAAI,IAAK,OAAM,IAAI,CAAC;AAClE,SAAO;AACT;AAOO,SAAS,cAAc,KAAyB;AACrD,MAAI,EAAE,eAAe,OAAQ,QAAO;AACpC,QAAM,MAAM,IAAI;AAChB,MAAI,IAAI,SAAS,oBAAK,EAAG,QAAO;AAChC,MAAI,IAAI,SAAS,KAAK,EAAG,QAAO;AAChC,MAAI,IAAI,SAAS,UAAU,KAAK,IAAI,SAAS,kDAAe,KAAK,IAAI,SAAS,4CAAc,EAAG,QAAO;AACtG,MAAI,IAAI,SAAS,MAAM,KAAK,IAAI,SAAS,2BAAO,KAAK,IAAI,SAAS,2BAAO,EAAG,QAAO;AACnF,MAAI,IAAI,SAAS,iCAAQ,EAAG,QAAO;AACnC,MAAI,IAAI,SAAS,cAAI,MAAM,IAAI,SAAS,4BAAQ,KAAK,IAAI,SAAS,cAAI,GAAI,QAAO;AACjF,MAAI,IAAI,SAAS,0BAAM,KAAK,IAAI,SAAS,kCAAS,EAAG,QAAO;AAC5D,SAAO;AACT;;;AC5IO,IAAM,WAAW;AAEjB,IAAM,WAAW;AAEjB,SAAS,WAAW,MAAgC;AACzD,MAAI,KAAK,SAAS,SAAU,QAAO,KAAK,MAAM,GAAG,QAAQ;AACzD,QAAM,UAAU,KAAK;AAGrB,QAAM,UAAU,KAAK,KAAK,SAAO,IAAI,KAAK,OAAK,EAAE,YAAY,UAAa,EAAE,YAAY,MAAS,CAAC;AAClG,MAAI,QAAS,QAAO,iBAAiB,MAAM,OAAO;AAGlD,MAAI,UAAU;AACd,QAAM,eAA4B,MAAM,KAAK,EAAE,QAAQ,QAAQ,GAAG,MAAM,CAAC,CAAC;AAE1E,WAAS,SAAS,GAAG,SAAS,SAAS,UAAU;AAC/C,QAAI,SAAS;AACb,eAAW,QAAQ,KAAK,MAAM,GAAG;AAC/B,aAAO,SAAS,YAAY,aAAa,MAAM,EAAE,MAAM,EAAG;AAC1D,UAAI,UAAU,SAAU;AAExB,eAAS,IAAI,QAAQ,IAAI,KAAK,IAAI,SAAS,KAAK,SAAS,OAAO,GAAG,KAAK;AACtE,iBAAS,IAAI,QAAQ,IAAI,KAAK,IAAI,SAAS,KAAK,SAAS,QAAQ,GAAG,KAA
K;AACvE,uBAAa,CAAC,EAAE,CAAC,IAAI;AAAA,QACvB;AAAA,MACF;AACA,gBAAU,KAAK;AACf,UAAI,SAAS,QAAS,WAAU;AAAA,IAClC;AAAA,EACF;AAEA,MAAI,YAAY,EAAG,QAAO,EAAE,MAAM,GAAG,MAAM,GAAG,OAAO,CAAC,GAAG,WAAW,MAAM;AAG1E,QAAM,OAAmB,MAAM;AAAA,IAAK,EAAE,QAAQ,QAAQ;AAAA,IAAG,MACvD,MAAM,KAAK,EAAE,QAAQ,QAAQ,GAAG,OAAO,EAAE,MAAM,IAAI,SAAS,GAAG,SAAS,EAAE,EAAE;AAAA,EAC9E;AACA,QAAM,WAAwB,MAAM,KAAK,EAAE,QAAQ,QAAQ,GAAG,MAAM,MAAM,OAAO,EAAE,KAAK,KAAK,CAAC;AAE9F,WAAS,SAAS,GAAG,SAAS,SAAS,UAAU;AAC/C,QAAI,SAAS;AACb,QAAI,UAAU;AAEd,WAAO,SAAS,WAAW,UAAU,KAAK,MAAM,EAAE,QAAQ;AACxD,aAAO,SAAS,WAAW,SAAS,MAAM,EAAE,MAAM,EAAG;AACrD,UAAI,UAAU,QAAS;AAEvB,YAAM,OAAO,KAAK,MAAM,EAAE,OAAO;AACjC,WAAK,MAAM,EAAE,MAAM,IAAI;AAAA,QACrB,MAAM,KAAK,KAAK,KAAK;AAAA,QACrB,SAAS,KAAK;AAAA,QACd,SAAS,KAAK;AAAA,MAChB;AAEA,eAAS,IAAI,QAAQ,IAAI,KAAK,IAAI,SAAS,KAAK,SAAS,OAAO,GAAG,KAAK;AACtE,iBAAS,IAAI,QAAQ,IAAI,KAAK,IAAI,SAAS,KAAK,SAAS,OAAO,GAAG,KAAK;AACtE,mBAAS,CAAC,EAAE,CAAC,IAAI;AAAA,QACnB;AAAA,MACF;AAEA,gBAAU,KAAK;AACf;AAAA,IACF;AAAA,EACF;AAEA,SAAO,cAAc,MAAM,SAAS,OAAO;AAC7C;AAGA,SAAS,iBAAiB,MAAuB,SAA0B;AAEzE,MAAI,UAAU;AACd,aAAW,OAAO,MAAM;AACtB,eAAW,QAAQ,KAAK;AACtB,YAAM,OAAO,KAAK,WAAW,KAAK,KAAK;AACvC,UAAI,MAAM,QAAS,WAAU;AAAA,IAC/B;AAAA,EACF;AACA,MAAI,UAAU,SAAU,WAAU;AAClC,MAAI,YAAY,EAAG,QAAO,EAAE,MAAM,GAAG,MAAM,GAAG,OAAO,CAAC,GAAG,WAAW,MAAM;AAE1E,QAAM,OAAmB,MAAM;AAAA,IAAK,EAAE,QAAQ,QAAQ;AAAA,IAAG,MACvD,MAAM,KAAK,EAAE,QAAQ,QAAQ,GAAG,OAAO,EAAE,MAAM,IAAI,SAAS,GAAG,SAAS,EAAE,EAAE;AAAA,EAC9E;AAEA,aAAW,OAAO,MAAM;AACtB,eAAW,QAAQ,KAAK;AACtB,YAAM,IAAI,KAAK,WAAW;AAC1B,YAAM,IAAI,KAAK,WAAW;AAC1B,UAAI,KAAK,WAAW,KAAK,WAAW,IAAI,KAAK,IAAI,EAAG;AAEpD,WAAK,CAAC,EAAE,CAAC,IAAI,EAAE,MAAM,KAAK,KAAK,KAAK,GAAG,SAAS,KAAK,SAAS,SAAS,KAAK,QAAQ;AAGpF,eAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,iBAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,cAAI,OAAO,KAAK,OAAO,EAAG;AAC1B,cAAI,IAAI,KAAK,WAAW,IAAI,KAAK,SAAS;AACxC,iBAAK,IAAI,EAAE,EAAE,IAAI,EAAE,IAAI,EAAE,MAAM,IAAI,SAAS,GAAG,SAAS,EAAE;AAAA,UAC5D;AAAA,QACF;AAAA,MACF;AAAA,IACF;AAAA,EACF;AAEA,SAAO,cAAc,MAAM,SAAS,OAAO;AAC7C;AAGA,SAAS,cAAc
,MAAkB,SAAiB,SAA0B;AAClF,MAAI,gBAAgB;AACpB,SAAO,gBAAgB,GAAG;AACxB,UAAM,WAAW,KAAK,MAAM,SAAO,CAAC,IAAI,gBAAgB,CAAC,GAAG,MAAM,KAAK,CAAC;AACxE,QAAI,CAAC,SAAU;AACf;AAAA,EACF;AACA,MAAI,gBAAgB,WAAW,gBAAgB,GAAG;AAChD,UAAM,UAAU,KAAK,IAAI,SAAO,IAAI,MAAM,GAAG,aAAa,CAAC;AAC3D,WAAO,EAAE,MAAM,SAAS,MAAM,eAAe,OAAO,SAAS,WAAW,UAAU,EAAE;AAAA,EACtF;AACA,SAAO,EAAE,MAAM,SAAS,MAAM,SAAS,OAAO,MAAM,WAAW,UAAU,EAAE;AAC7E;AAEO,SAAS,mBAAmB,MAA+B;AAChE,SAAO,KACJ;AAAA,IAAI,SACH,IACG,IAAI,OAAK,EAAE,KAAK,KAAK,EAAE,QAAQ,OAAO,GAAG,EAAE,QAAQ,OAAO,KAAK,CAAC,EAChE,OAAO,OAAO,EACd,KAAK,KAAK;AAAA,EACf,EACC,OAAO,OAAO,EACd,KAAK,IAAI;AACd;AAGA,SAAS,UAAU,MAAsB;AAEvC,SAAO,KAAK,QAAQ,MAAM,KAAK;AACjC;AAGA,IAAM,wBAAwB;AAG9B,SAAS,aAAa,MAAsB;AAC1C,MAAI,SAAS,KAEV,QAAQ,2BAA2B,EAAE,EAErC,QAAQ,uBAAuB,EAAE,EACjC,QAAQ,QAAQ,GAAG,EACnB,KAAK;AAGR,MAAI,OAAO,UAAU,MAAM,OAAO,SAAS,GAAG,GAAG;AAC/C,UAAM,SAAS,OAAO,MAAM,GAAG;AAE/B,UAAM,wBAAwB,OAAO,OAAO,OAAK,EAAE,WAAW,KAAK,+BAA+B,KAAK,CAAC,CAAC,EAAE;AAC3G,QAAI,OAAO,UAAU,KAAK,wBAAwB,OAAO,UAAU,KAAK;AACtE,eAAS,OAAO,KAAK,EAAE;AAAA,IACzB;AAAA,EACF;AACA,SAAO;AACT;AAOO,SAAS,oBAAoB,QAA8B;AAChE,QAAM,SAAoB,CAAC;AAE3B,aAAW,SAAS,QAAQ;AAC1B,QAAI,MAAM,SAAS,WAAW,CAAC,MAAM,OAAO;AAC1C,aAAO,KAAK,KAAK;AACjB;AAAA,IACF;AAEA,UAAM,EAAE,MAAM,SAAS,MAAM,SAAS,MAAM,IAAI,MAAM;AAGtD,QAAI,YAAY,KAAK,YAAY,GAAG;AAClC,aAAO,KAAK,KAAK;AACjB;AAAA,IACF;AAGA,QAAI,WAAW,GAAG;AAChB,UAAI,gBAAgB;AACpB,UAAI,eAAe;AACnB,eAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,iBAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,gBAAM,IAAI,MAAM,CAAC,IAAI,CAAC,GAAG,QAAQ;AACjC,4BAAkB,EAAE,MAAM,KAAK,KAAK,CAAC,GAAG;AACxC,0BAAgB,EAAE;AAAA,QACpB;AAAA,MACF;AAGA,UAAI,gBAAgB,KAAM,WAAW,KAAK,eAAe,KAAM;AAE7D,iBAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,mBAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,kBAAM,WAAW,MAAM,CAAC,IAAI,CAAC,GAAG,MAAM,KAAK;AAC3C,gBAAI,CAAC,SAAU;AAEf,uBAAW,QAAQ,SAAS,MAAM,IAAI,GAAG;AACvC,oBAAM,UAAU,KAAK,KAAK;AAC1B,kBAAI,CAAC,QAAS;AACd,qBAAO,KAAK,EAAE,MAAM,aAAa,MAAM,SAAS,YAAY,MAAM,WAAW,CAAC;AAAA,YAChF;AAAA,UACF;AAAA,QACF;AACA;AAAA,MACF;AAAA,IACF;AAEA,WAAO,KAAK,KAAK;AAAA,EACnB;AAEA,S
AAO;AACT;AAEO,SAAS,iBAAiB,QAA2B;AAC1D,QAAM,QAAkB,CAAC;AAEzB,WAAS,IAAI,GAAG,IAAI,OAAO,QAAQ,KAAK;AACtC,UAAM,QAAQ,OAAO,CAAC;AAGtB,QAAI,MAAM,SAAS,aAAa,MAAM,MAAM;AAC1C,YAAM,SAAS,IAAI,OAAO,KAAK,IAAI,MAAM,SAAS,GAAG,CAAC,CAAC;AACvD,YAAM,cAAc,aAAa,MAAM,IAAI;AAC3C,UAAI,YAAa,OAAM,KAAK,IAAI,GAAG,MAAM,IAAI,WAAW,IAAI,EAAE;AAC9D;AAAA,IACF;AAGA,QAAI,MAAM,SAAS,WAAW,MAAM,MAAM;AACxC,YAAM,KAAK,IAAI,YAAY,MAAM,IAAI,KAAK,EAAE;AAC5C;AAAA,IACF;AAGA,QAAI,MAAM,SAAS,aAAa;AAC9B,YAAM,KAAK,IAAI,OAAO,EAAE;AACxB;AAAA,IACF;AAGA,QAAI,MAAM,SAAS,UAAU,MAAM,MAAM;AACvC,YAAM,WAAW,aAAa,MAAM,IAAI;AACxC,UAAI,CAAC,SAAU;AAEf,YAAM,kBAAkB,MAAM,aAAa,aAAa,WAAW,KAAK,QAAQ;AAChF,YAAM,SAAS,kBAAkB,KAAK,MAAM,aAAa,YAAY,QAAQ;AAC7E,YAAM,KAAK,GAAG,MAAM,GAAG,QAAQ,EAAE;AACjC,UAAI,MAAM,UAAU;AAClB,mBAAW,SAAS,MAAM,UAAU;AAClC,gBAAM,cAAc,MAAM,aAAa,YAAY,OAAO;AAC1D,gBAAM,KAAK,KAAK,WAAW,IAAI,MAAM,QAAQ,EAAE,EAAE;AAAA,QACnD;AAAA,MACF;AACA;AAAA,IACF;AAEA,QAAI,MAAM,SAAS,eAAe,MAAM,MAAM;AAC5C,UAAI,OAAO,aAAa,MAAM,IAAI;AAClC,UAAI,CAAC,KAAM;AAGX,UAAI,cAAc,KAAK,IAAI,GAAG;AAC5B,cAAM,YAAY,OAAO,IAAI,CAAC;AAC9B,YAAI,WAAW,SAAS,eAAe,UAAU,QAAQ,SAAS,KAAK,UAAU,IAAI,GAAG;AACtF,gBAAM,KAAK,IAAI,MAAM,IAAI,IAAI,UAAU,IAAI,IAAI,EAAE;AACjD;AAAA,QACF,OAAO;AACL,gBAAM,KAAK,IAAI,MAAM,IAAI,IAAI,EAAE;AAAA,QACjC;AACA;AAAA,MACF;AAEA,UAAI,sBAAsB,KAAK,IAAI,GAAG;AACpC,cAAM,KAAK,IAAI,IAAI,KAAK,EAAE;AAC1B;AAAA,MACF;AAGA,UAAI,MAAM,MAAM;AACd,cAAM,OAAO,aAAa,MAAM,IAAI;AACpC,YAAI,KAAM,QAAO,IAAI,IAAI,KAAK,IAAI;AAAA,MACpC;AAGA,UAAI,MAAM,cAAc;AACtB,gBAAQ,aAAQ,MAAM,YAAY;AAAA,MACpC;AAEA,YAAM,KAAK,UAAU,IAAI,GAAG,EAAE;AAAA,IAChC,WAAW,MAAM,SAAS,WAAW,MAAM,OAAO;AAEhD,UAAI,MAAM,SAAS,KAAK,MAAM,MAAM,SAAS,CAAC,MAAM,IAAI;AACtD,cAAM,KAAK,EAAE;AAAA,MACf;AACA,YAAM,UAAU,gBAAgB,MAAM,KAAK;AAC3C,UAAI,SAAS;AACX,cAAM,KAAK,OAAO;AAClB,cAAM,KAAK,EAAE;AAAA,MACf;AAAA,IACF;AAAA,EACF;AAEA,SAAO,MAAM,KAAK,IAAI,EAAE,KAAK;AAC/B;AAGA,SAAS,eAAe,OAAyB;AAC/C,aAAW,OAAO,MAAM,OAAO;AAC7B,eAAW,QAAQ,KAAK;AACtB,UAAI,KAAK,UAAU,KAAK,KAAK,UAAU,EAAG,QAAO;AAAA,IACnD;AAAA,EACF;AACA,SAAO;AACT;AAGA,SAAS,YAAY,OAAwB;AAC3C,QAAM,EAAE,O
AAO,MAAM,SAAS,MAAM,QAAQ,IAAI;AAChD,QAAM,OAAO,oBAAI,IAAY;AAC7B,QAAM,QAAkB,CAAC,SAAS;AAElC,WAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,UAAM,MAAM,MAAM,IAAI,OAAO;AAC7B,UAAM,UAAoB,CAAC;AAC3B,aAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,UAAI,KAAK,IAAI,GAAG,CAAC,IAAI,CAAC,EAAE,EAAG;AAC3B,YAAM,OAAO,MAAM,CAAC,IAAI,CAAC;AACzB,UAAI,CAAC,KAAM;AAGX,eAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,iBAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,cAAI,OAAO,KAAK,OAAO,EAAG;AAC1B,cAAI,IAAI,KAAK,WAAW,IAAI,KAAK,QAAS,MAAK,IAAI,GAAG,IAAI,EAAE,IAAI,IAAI,EAAE,EAAE;AAAA,QAC1E;AAAA,MACF;AAEA,YAAM,OAAO,aAAa,KAAK,IAAI,EAAE,QAAQ,OAAO,MAAM;AAC1D,YAAM,QAAkB,CAAC;AACzB,UAAI,KAAK,UAAU,EAAG,OAAM,KAAK,YAAY,KAAK,OAAO,GAAG;AAC5D,UAAI,KAAK,UAAU,EAAG,OAAM,KAAK,YAAY,KAAK,OAAO,GAAG;AAC5D,YAAM,UAAU,MAAM,SAAS,MAAM,MAAM,KAAK,GAAG,IAAI;AACvD,cAAQ,KAAK,IAAI,GAAG,GAAG,OAAO,IAAI,IAAI,KAAK,GAAG,GAAG;AAAA,IACnD;AACA,QAAI,QAAQ,OAAQ,OAAM,KAAK,OAAO,QAAQ,KAAK,EAAE,CAAC,OAAO;AAAA,EAC/D;AAEA,QAAM,KAAK,UAAU;AACrB,SAAO,MAAM,KAAK,IAAI;AACxB;AAEA,SAAS,gBAAgB,OAAwB;AAC/C,MAAI,MAAM,SAAS,KAAK,MAAM,SAAS,EAAG,QAAO;AAEjD,QAAM,EAAE,OAAO,MAAM,SAAS,MAAM,QAAQ,IAAI;AAGhD,MAAI,eAAe,KAAK,EAAG,QAAO,YAAY,KAAK;AAGnD,MAAI,YAAY,KAAK,YAAY,GAAG;AAClC,UAAM,UAAU,aAAa,MAAM,CAAC,EAAE,CAAC,EAAE,IAAI;AAC7C,QAAI,CAAC,QAAS,QAAO;AACrB,WAAO,QACJ,MAAM,IAAI,EACV,IAAI,UAAQ;AACX,YAAM,UAAU,KAAK,KAAK;AAC1B,UAAI,CAAC,QAAS,QAAO;AACrB,UAAI,WAAW,KAAK,OAAO,EAAG,QAAO,KAAK,UAAU,OAAO,CAAC;AAC5D,UAAI,aAAa,KAAK,OAAO,EAAG,QAAO,KAAK,UAAU,OAAO,CAAC;AAC9D,aAAO,UAAU,OAAO;AAAA,IAC1B,CAAC,EACA,OAAO,OAAO,EACd,KAAK,IAAI;AAAA,EACd;AAGA,MAAI,YAAY,KAAK,WAAW,GAAG;AACjC,WAAO,MACJ,IAAI,SAAO,UAAU,aAAa,IAAI,CAAC,EAAE,IAAI,CAAC,EAAE,QAAQ,OAAO,GAAG,CAAC,EACnE,OAAO,OAAO,EACd,KAAK,IAAI;AAAA,EACd;AAGA,QAAM,UAAsB,MAAM,KAAK,EAAE,QAAQ,QAAQ,GAAG,MAAM,MAAM,OAAO,EAAE,KAAK,EAAE,CAAC;AACzF,QAAM,OAAO,oBAAI,IAAY;AAE7B,WAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,aAAS,IAAI,GAAG,IAAI,SAAS,KAAK;AAChC,UAAI,KAAK,IAAI,GAAG,CAAC,IAAI,CAAC,EAAE,EAAG;AAC3B,YAAM,OAAO,MAAM,CAAC,IAAI,CAAC;AACzB,UAAI,CAAC,KAAM;AACX,cAAQ,CAAC,EAAE,CAAC,IAAI,UAAU,aAAa,KAAK
,IAAI,CAAC,EAAE,QAAQ,OAAO,KAAK,EAAE,QAAQ,OAAO,MAAM;AAG9F,eAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,iBAAS,KAAK,GAAG,KAAK,KAAK,SAAS,MAAM;AACxC,cAAI,OAAO,KAAK,OAAO,EAAG;AAC1B,cAAI,IAAI,KAAK,WAAW,IAAI,KAAK,SAAS;AACxC,iBAAK,IAAI,GAAG,IAAI,EAAE,IAAI,IAAI,EAAE,EAAE;AAAA,UAChC;AAAA,QACF;AAAA,MACF;AAEA,WAAK,KAAK,UAAU;AAAA,IACtB;AAAA,EACF;AAMA,QAAM,aAAyB,CAAC;AAChC,MAAI,kBAAkB;AACtB,WAAS,IAAI,GAAG,IAAI,QAAQ,QAAQ,KAAK;AACvC,UAAM,MAAM,QAAQ,CAAC;AACrB,UAAM,qBAAqB,IAAI,MAAM,UAAQ,SAAS,EAAE;AACxD,QAAI,mBAAoB;AAIxB,UAAM,eAAe,IAAI,OAAO,UAAQ,SAAS,EAAE;AACnD,UAAM,eAAe,IAAI,KAAK,CAAC,GAAG,MAAM,KAAK,IAAI,GAAG,CAAC,IAAI,CAAC,EAAE,CAAC;AAC7D,QAAI,CAAC,gBAAgB,aAAa,WAAW,KAAK,IAAI,CAAC,MAAM,MAAM,IAAI,MAAM,CAAC,EAAE,MAAM,OAAK,MAAM,EAAE,GAAG;AACpG,wBAAkB,IAAI,CAAC;AACvB;AAAA,IACF;AAGA,QAAI,mBAAmB,IAAI,CAAC,MAAM,IAAI;AACpC,UAAI,CAAC,IAAI;AACT,wBAAkB;AAAA,IACpB,OAAO;AACL,wBAAkB;AAAA,IACpB;AACA,eAAW,KAAK,GAAG;AAAA,EACrB;AAEA,MAAI,WAAW,WAAW,EAAG,QAAO;AAEpC,QAAM,KAAe,CAAC;AACtB,KAAG,KAAK,OAAO,WAAW,CAAC,EAAE,KAAK,KAAK,IAAI,IAAI;AAC/C,KAAG,KAAK,OAAO,WAAW,CAAC,EAAE,IAAI,MAAM,KAAK,EAAE,KAAK,KAAK,IAAI,IAAI;AAChE,WAAS,IAAI,GAAG,IAAI,WAAW,QAAQ,KAAK;AAC1C,OAAG,KAAK,OAAO,WAAW,CAAC,EAAE,KAAK,KAAK,IAAI,IAAI;AAAA,EACjD;AACA,SAAO,GAAG,KAAK,IAAI;AACrB;;;ACnLO,IAAM,mBAAmB;AACzB,IAAM,mBAAmB;AACzB,IAAM,mBAAmB;","names":[]}
@@ -20,11 +20,17 @@ function isPdfFile(buffer) {
   const b = magicBytes(buffer);
   return b[0] === 37 && b[1] === 80 && b[2] === 68 && b[3] === 70;
 }
+function isHwpmlFile(buffer) {
+  const bytes = new Uint8Array(buffer, 0, Math.min(512, buffer.byteLength));
+  const head = new TextDecoder("utf-8", { fatal: false }).decode(bytes).replace(/^\uFEFF/, "");
+  return head.trimStart().startsWith("<?xml") && head.includes("<HWPML");
+}
 function detectFormat(buffer) {
   if (buffer.byteLength < 4) return "unknown";
   if (isZipFile(buffer)) return "hwpx";
   if (isOldHwpFile(buffer)) return "hwp";
   if (isPdfFile(buffer)) return "pdf";
+  if (isHwpmlFile(buffer)) return "hwpml";
   return "unknown";
 }
 async function detectZipFormat(buffer) {
@@ -46,7 +52,8 @@ export {
   isHwpxFile,
   isOldHwpFile,
   isPdfFile,
+  isHwpmlFile,
   detectFormat,
   detectZipFormat
 };
-//# sourceMappingURL=chunk-
+//# sourceMappingURL=chunk-M3E3C5GS.js.map
@@ -0,0 +1 @@
+
{"version":3,"sources":["../src/detect.ts"],"sourcesContent":["/** 매직 바이트 기반 파일 포맷 감지 */\r\n\r\nimport JSZip from \"jszip\"\r\nimport type { FileType } from \"./types.js\"\r\n\r\n/** 매직 바이트 뷰 생성 (복사 없이 view) */\r\nfunction magicBytes(buffer: ArrayBuffer): Uint8Array {\r\n return new Uint8Array(buffer, 0, Math.min(4, buffer.byteLength))\r\n}\r\n\r\n/** ZIP 파일 여부: PK\\x03\\x04 */\r\nexport function isZipFile(buffer: ArrayBuffer): boolean {\r\n const b = magicBytes(buffer)\r\n return b[0] === 0x50 && b[1] === 0x4b && b[2] === 0x03 && b[3] === 0x04\r\n}\r\n\r\n/** HWPX (ZIP 기반 한컴 문서): PK\\x03\\x04 — 하위 호환용 */\r\nexport function isHwpxFile(buffer: ArrayBuffer): boolean {\r\n return isZipFile(buffer)\r\n}\r\n\r\n/** HWP 5.x (OLE2 바이너리 한컴 문서): \\xD0\\xCF\\x11\\xE0 */\r\nexport function isOldHwpFile(buffer: ArrayBuffer): boolean {\r\n const b = magicBytes(buffer)\r\n return b[0] === 0xd0 && b[1] === 0xcf && b[2] === 0x11 && b[3] === 0xe0\r\n}\r\n\r\n/** PDF 문서: %PDF */\r\nexport function isPdfFile(buffer: ArrayBuffer): boolean {\r\n const b = magicBytes(buffer)\r\n return b[0] === 0x25 && b[1] === 0x50 && b[2] === 0x44 && b[3] === 0x46\r\n}\r\n\r\n/** HWPML (XML 기반 한컴 문서): <?xml ... 
<HWPML */\r\nexport function isHwpmlFile(buffer: ArrayBuffer): boolean {\r\n const bytes = new Uint8Array(buffer, 0, Math.min(512, buffer.byteLength))\r\n const head = new TextDecoder(\"utf-8\", { fatal: false }).decode(bytes).replace(/^\\uFEFF/, \"\")\r\n return head.trimStart().startsWith(\"<?xml\") && head.includes(\"<HWPML\")\r\n}\r\n\r\n/** 동기 포맷 감지 — ZIP은 모두 \"hwpx\"로 반환 (하위 호환) */\r\nexport function detectFormat(buffer: ArrayBuffer): FileType {\r\n if (buffer.byteLength < 4) return \"unknown\"\r\n if (isZipFile(buffer)) return \"hwpx\"\r\n if (isOldHwpFile(buffer)) return \"hwp\"\r\n if (isPdfFile(buffer)) return \"pdf\"\r\n if (isHwpmlFile(buffer)) return \"hwpml\"\r\n return \"unknown\"\r\n}\r\n\r\n/**\r\n * ZIP 내부 구조 기반 포맷 세분화.\r\n * HWPX, XLSX, DOCX 모두 ZIP이므로 내부 파일로 구분.\r\n */\r\nexport async function detectZipFormat(buffer: ArrayBuffer): Promise<\"hwpx\" | \"xlsx\" | \"docx\" | \"unknown\"> {\r\n try {\r\n const zip = await JSZip.loadAsync(buffer)\r\n // XLSX: xl/workbook.xml\r\n if (zip.file(\"xl/workbook.xml\")) return \"xlsx\"\r\n // DOCX: word/document.xml\r\n if (zip.file(\"word/document.xml\")) return \"docx\"\r\n // HWPX: Contents/ 또는 content.hpf 또는 mimetype\r\n if (zip.file(\"Contents/content.hpf\") || zip.file(\"mimetype\")) return \"hwpx\"\r\n // 기타 ZIP 내에 section 파일이 있으면 HWPX로 추정\r\n const hasSection = Object.keys(zip.files).some(f => f.startsWith(\"Contents/\"))\r\n if (hasSection) return \"hwpx\"\r\n return \"unknown\"\r\n } catch {\r\n return \"unknown\"\r\n 
}\r\n}\r\n"],"mappings":";;;AAEA,OAAO,WAAW;AAIlB,SAAS,WAAW,QAAiC;AACnD,SAAO,IAAI,WAAW,QAAQ,GAAG,KAAK,IAAI,GAAG,OAAO,UAAU,CAAC;AACjE;AAGO,SAAS,UAAU,QAA8B;AACtD,QAAM,IAAI,WAAW,MAAM;AAC3B,SAAO,EAAE,CAAC,MAAM,MAAQ,EAAE,CAAC,MAAM,MAAQ,EAAE,CAAC,MAAM,KAAQ,EAAE,CAAC,MAAM;AACrE;AAGO,SAAS,WAAW,QAA8B;AACvD,SAAO,UAAU,MAAM;AACzB;AAGO,SAAS,aAAa,QAA8B;AACzD,QAAM,IAAI,WAAW,MAAM;AAC3B,SAAO,EAAE,CAAC,MAAM,OAAQ,EAAE,CAAC,MAAM,OAAQ,EAAE,CAAC,MAAM,MAAQ,EAAE,CAAC,MAAM;AACrE;AAGO,SAAS,UAAU,QAA8B;AACtD,QAAM,IAAI,WAAW,MAAM;AAC3B,SAAO,EAAE,CAAC,MAAM,MAAQ,EAAE,CAAC,MAAM,MAAQ,EAAE,CAAC,MAAM,MAAQ,EAAE,CAAC,MAAM;AACrE;AAGO,SAAS,YAAY,QAA8B;AACxD,QAAM,QAAQ,IAAI,WAAW,QAAQ,GAAG,KAAK,IAAI,KAAK,OAAO,UAAU,CAAC;AACxE,QAAM,OAAO,IAAI,YAAY,SAAS,EAAE,OAAO,MAAM,CAAC,EAAE,OAAO,KAAK,EAAE,QAAQ,WAAW,EAAE;AAC3F,SAAO,KAAK,UAAU,EAAE,WAAW,OAAO,KAAK,KAAK,SAAS,QAAQ;AACvE;AAGO,SAAS,aAAa,QAA+B;AAC1D,MAAI,OAAO,aAAa,EAAG,QAAO;AAClC,MAAI,UAAU,MAAM,EAAG,QAAO;AAC9B,MAAI,aAAa,MAAM,EAAG,QAAO;AACjC,MAAI,UAAU,MAAM,EAAG,QAAO;AAC9B,MAAI,YAAY,MAAM,EAAG,QAAO;AAChC,SAAO;AACT;AAMA,eAAsB,gBAAgB,QAAoE;AACxG,MAAI;AACF,UAAM,MAAM,MAAM,MAAM,UAAU,MAAM;AAExC,QAAI,IAAI,KAAK,iBAAiB,EAAG,QAAO;AAExC,QAAI,IAAI,KAAK,mBAAmB,EAAG,QAAO;AAE1C,QAAI,IAAI,KAAK,sBAAsB,KAAK,IAAI,KAAK,UAAU,EAAG,QAAO;AAErE,UAAM,aAAa,OAAO,KAAK,IAAI,KAAK,EAAE,KAAK,OAAK,EAAE,WAAW,WAAW,CAAC;AAC7E,QAAI,WAAY,QAAO;AACvB,WAAO;AAAA,EACT,QAAQ;AACN,WAAO;AAAA,EACT;AACF;","names":[]}