python-hwpx 1.3__tar.gz → 1.4__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (38)
  1. {python-hwpx-1.3/src/python_hwpx.egg-info → python-hwpx-1.4}/PKG-INFO +2 -1
  2. {python-hwpx-1.3 → python-hwpx-1.4}/README.md +1 -0
  3. {python-hwpx-1.3 → python-hwpx-1.4}/pyproject.toml +1 -1
  4. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/document.py +18 -0
  5. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/oxml/__init__.py +6 -0
  6. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/oxml/document.py +118 -1
  7. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/package.py +88 -0
  8. {python-hwpx-1.3 → python-hwpx-1.4/src/python_hwpx.egg-info}/PKG-INFO +2 -1
  9. {python-hwpx-1.3 → python-hwpx-1.4}/tests/test_integration_hwpx_compatibility.py +89 -0
  10. {python-hwpx-1.3 → python-hwpx-1.4}/LICENSE +0 -0
  11. {python-hwpx-1.3 → python-hwpx-1.4}/setup.cfg +0 -0
  12. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/__init__.py +0 -0
  13. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/data/Skeleton.hwpx +0 -0
  14. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/opc/package.py +0 -0
  15. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/oxml/body.py +0 -0
  16. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/oxml/common.py +0 -0
  17. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/oxml/header.py +0 -0
  18. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/oxml/parser.py +0 -0
  19. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/oxml/schema.py +0 -0
  20. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/oxml/utils.py +0 -0
  21. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/templates.py +0 -0
  22. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/tools/__init__.py +0 -0
  23. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/tools/_schemas/header.xsd +0 -0
  24. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/tools/_schemas/section.xsd +0 -0
  25. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/tools/object_finder.py +0 -0
  26. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/tools/text_extractor.py +0 -0
  27. {python-hwpx-1.3 → python-hwpx-1.4}/src/hwpx/tools/validator.py +0 -0
  28. {python-hwpx-1.3 → python-hwpx-1.4}/src/python_hwpx.egg-info/SOURCES.txt +0 -0
  29. {python-hwpx-1.3 → python-hwpx-1.4}/src/python_hwpx.egg-info/dependency_links.txt +0 -0
  30. {python-hwpx-1.3 → python-hwpx-1.4}/src/python_hwpx.egg-info/entry_points.txt +0 -0
  31. {python-hwpx-1.3 → python-hwpx-1.4}/src/python_hwpx.egg-info/requires.txt +0 -0
  32. {python-hwpx-1.3 → python-hwpx-1.4}/src/python_hwpx.egg-info/top_level.txt +0 -0
  33. {python-hwpx-1.3 → python-hwpx-1.4}/tests/test_document_formatting.py +0 -0
  34. {python-hwpx-1.3 → python-hwpx-1.4}/tests/test_inline_models.py +0 -0
  35. {python-hwpx-1.3 → python-hwpx-1.4}/tests/test_memo_and_style_editing.py +0 -0
  36. {python-hwpx-1.3 → python-hwpx-1.4}/tests/test_oxml_parsing.py +0 -0
  37. {python-hwpx-1.3 → python-hwpx-1.4}/tests/test_section_headers.py +0 -0
  38. {python-hwpx-1.3 → python-hwpx-1.4}/tests/test_text_extractor_annotations.py +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: python-hwpx
- Version: 1.3
+ Version: 1.4
  Summary: A collection of Python utilities for loading and editing Hancom HWPX packages
  Author: python-hwpx Maintainers
  License: Non-Commercial License
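@@ -39,6 +39,7 @@ Requires-Dist: pytest>=7.4; extra == "test"
  - **Typed body model** – `hwpx.oxml.body` maps tables, controls, inline shapes, and change-tracking tags to data classes, and lets you inspect and modify them through `HwpxOxmlParagraph.model`/`HwpxOxmlRun.model` before writing them back to XML.
  - **Memos and field anchors** – `add_memo_with_anchor()` creates a memo and automatically inserts a MEMO field control so that the memo is displayed right away in 한/글.
  - **Header reference list lookup** – Parses bullets, paragraph properties, styles, change-tracking entries, and author information into data classes, and simplifies ID-based lookup with helpers such as `document.bullets` and `document.styles`.
+ - **Master-page, history, and version part control** – Directly edit and save the master-page/history/version parts included in the manifest through `document.master_pages`, `document.histories`, and `document.version`.
  - **Style-based text replacement** – Filters by run formatting (color, underline, `charPrIDRef`) to selectively replace or delete text. Strings split by highlight
  markers or tags are also replaced with their formatting preserved.
  - **Text extraction pipeline** – `hwpx.tools.text_extractor.TextExtractor` returns paragraph text while rendering highlights, footnotes, and controls in the way you choose.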
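@@ -9,6 +9,7 @@
  - **Typed body model** – `hwpx.oxml.body` maps tables, controls, inline shapes, and change-tracking tags to data classes, and lets you inspect and modify them through `HwpxOxmlParagraph.model`/`HwpxOxmlRun.model` before writing them back to XML.
  - **Memos and field anchors** – `add_memo_with_anchor()` creates a memo and automatically inserts a MEMO field control so that the memo is displayed right away in 한/글.
  - **Header reference list lookup** – Parses bullets, paragraph properties, styles, change-tracking entries, and author information into data classes, and simplifies ID-based lookup with helpers such as `document.bullets` and `document.styles`.
+ - **Master-page, history, and version part control** – Directly edit and save the master-page/history/version parts included in the manifest through `document.master_pages`, `document.histories`, and `document.version`.
  - **Style-based text replacement** – Filters by run formatting (color, underline, `charPrIDRef`) to selectively replace or delete text. Strings split by highlight
  markers or tags are also replaced with their formatting preserved.
  - **Text extraction pipeline** – `hwpx.tools.text_extractor.TextExtractor` returns paragraph text while rendering highlights, footnotes, and controls in the way you choose.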
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

  [project]
  name = "python-hwpx"
- version = "1.3"
+ version = "1.4"
  description = "A collection of Python utilities for loading and editing Hancom HWPX packages"
  readme = { file = "README.md", content-type = "text/markdown" }
  license = { text = "Non-Commercial License" }
@@ -13,13 +13,16 @@ from .oxml import (
      Bullet,
      HwpxOxmlDocument,
      HwpxOxmlHeader,
+     HwpxOxmlHistory,
      HwpxOxmlInlineObject,
+     HwpxOxmlMasterPage,
      HwpxOxmlMemo,
      HwpxOxmlParagraph,
      HwpxOxmlRun,
      HwpxOxmlSection,
      HwpxOxmlSectionHeaderFooter,
      HwpxOxmlTable,
+     HwpxOxmlVersion,
      MemoShape,
      ParagraphProperty,
      RunStyle,
@@ -89,6 +92,21 @@ class HwpxDocument:
          """Return the header parts referenced by the document."""
          return self._root.headers

+     @property
+     def master_pages(self) -> List[HwpxOxmlMasterPage]:
+         """Return the master-page parts declared in the manifest."""
+         return self._root.master_pages
+
+     @property
+     def histories(self) -> List[HwpxOxmlHistory]:
+         """Return document history parts referenced by the manifest."""
+         return self._root.histories
+
+     @property
+     def version(self) -> HwpxOxmlVersion | None:
+         """Return the version metadata part if present."""
+         return self._root.version
+
      @property
      def memo_shapes(self) -> dict[str, MemoShape]:
          """Return memo shapes available in the header reference lists."""
@@ -19,7 +19,9 @@ from .document import (
      DocumentNumbering,
      HwpxOxmlDocument,
      HwpxOxmlHeader,
+     HwpxOxmlHistory,
      HwpxOxmlInlineObject,
+     HwpxOxmlMasterPage,
      HwpxOxmlMemo,
      HwpxOxmlMemoGroup,
      HwpxOxmlParagraph,
@@ -30,6 +32,7 @@ from .document import (
      HwpxOxmlTable,
      HwpxOxmlTableCell,
      HwpxOxmlTableRow,
+     HwpxOxmlVersion,
      PageMargins,
      PageSize,
      RunStyle,
@@ -126,7 +129,9 @@ __all__ = [
      "DocumentNumbering",
      "HwpxOxmlDocument",
      "HwpxOxmlHeader",
+     "HwpxOxmlHistory",
      "HwpxOxmlInlineObject",
+     "HwpxOxmlMasterPage",
      "HwpxOxmlMemo",
      "HwpxOxmlMemoGroup",
      "HwpxOxmlParagraph",
@@ -137,6 +142,7 @@ __all__ = [
      "HwpxOxmlTable",
      "HwpxOxmlTableCell",
      "HwpxOxmlTableRow",
+     "HwpxOxmlVersion",
      "KeyDerivation",
      "KeyEncryption",
      "LinkInfo",
@@ -1999,6 +1999,61 @@ class HwpxOxmlParagraph:
          self.section.mark_dirty()


+ class _HwpxOxmlSimplePart:
+     """Common base for standalone XML parts that are not sections or headers."""
+
+     def __init__(
+         self,
+         part_name: str,
+         element: ET.Element,
+         document: "HwpxOxmlDocument" | None = None,
+     ):
+         self.part_name = part_name
+         self._element = element
+         self._document = document
+         self._dirty = False
+
+     @property
+     def element(self) -> ET.Element:
+         return self._element
+
+     @property
+     def document(self) -> "HwpxOxmlDocument" | None:
+         return self._document
+
+     def attach_document(self, document: "HwpxOxmlDocument") -> None:
+         self._document = document
+
+     @property
+     def dirty(self) -> bool:
+         return self._dirty
+
+     def mark_dirty(self) -> None:
+         self._dirty = True
+
+     def reset_dirty(self) -> None:
+         self._dirty = False
+
+     def replace_element(self, element: ET.Element) -> None:
+         self._element = element
+         self.mark_dirty()
+
+     def to_bytes(self) -> bytes:
+         return _serialize_xml(self._element)
+
+
+ class HwpxOxmlMasterPage(_HwpxOxmlSimplePart):
+     """Represents a master page part in the package."""
+
+
+ class HwpxOxmlHistory(_HwpxOxmlSimplePart):
+     """Represents a document history part."""
+
+
+ class HwpxOxmlVersion(_HwpxOxmlSimplePart):
+     """Represents the ``version.xml`` part."""
+
+
  class HwpxOxmlSection:
      """Represents the contents of a section XML part."""

@@ -2540,16 +2595,29 @@ class HwpxOxmlDocument:
          manifest: ET.Element,
          sections: Sequence[HwpxOxmlSection],
          headers: Sequence[HwpxOxmlHeader],
+         *,
+         master_pages: Sequence[HwpxOxmlMasterPage] | None = None,
+         histories: Sequence[HwpxOxmlHistory] | None = None,
+         version: HwpxOxmlVersion | None = None,
      ):
          self._manifest = manifest
          self._sections = list(sections)
          self._headers = list(headers)
+         self._master_pages = list(master_pages or [])
+         self._histories = list(histories or [])
+         self._version = version
          self._char_property_cache: dict[str, RunStyle] | None = None

          for section in self._sections:
              section.attach_document(self)
          for header in self._headers:
              header.attach_document(self)
+         for master_page in self._master_pages:
+             master_page.attach_document(self)
+         for history in self._histories:
+             history.attach_document(self)
+         if self._version is not None:
+             self._version.attach_document(self)

      @classmethod
      def from_package(cls, package: "HwpxPackage") -> "HwpxOxmlDocument":
@@ -2561,12 +2629,35 @@ class HwpxOxmlDocument:
          manifest = package.get_xml(package.MANIFEST_PATH)
          section_paths = package.section_paths()
          header_paths = package.header_paths()
+         master_page_paths = package.master_page_paths()
+         history_paths = package.history_paths()
+         version_path = package.version_path()

          sections = [
              HwpxOxmlSection(path, package.get_xml(path)) for path in section_paths
          ]
          headers = [HwpxOxmlHeader(path, package.get_xml(path)) for path in header_paths]
-         return cls(manifest, sections, headers)
+         master_pages = [
+             HwpxOxmlMasterPage(path, package.get_xml(path))
+             for path in master_page_paths
+             if package.has_part(path)
+         ]
+         histories = [
+             HwpxOxmlHistory(path, package.get_xml(path))
+             for path in history_paths
+             if package.has_part(path)
+         ]
+         version = None
+         if version_path and package.has_part(version_path):
+             version = HwpxOxmlVersion(version_path, package.get_xml(version_path))
+         return cls(
+             manifest,
+             sections,
+             headers,
+             master_pages=master_pages,
+             histories=histories,
+             version=version,
+         )

      @property
      def manifest(self) -> ET.Element:
@@ -2580,6 +2671,18 @@ class HwpxOxmlDocument:
      def headers(self) -> List[HwpxOxmlHeader]:
          return list(self._headers)

+     @property
+     def master_pages(self) -> List[HwpxOxmlMasterPage]:
+         return list(self._master_pages)
+
+     @property
+     def histories(self) -> List[HwpxOxmlHistory]:
+         return list(self._histories)
+
+     @property
+     def version(self) -> HwpxOxmlVersion | None:
+         return self._version
+
      def _ensure_char_property_cache(self) -> dict[str, RunStyle]:
          if self._char_property_cache is None:
              mapping: dict[str, RunStyle] = {}
@@ -2812,6 +2915,14 @@ class HwpxOxmlDocument:
                  headers_dirty = True
          if headers_dirty:
              self.invalidate_char_property_cache()
+         for master_page in self._master_pages:
+             if master_page.dirty:
+                 updates[master_page.part_name] = master_page.to_bytes()
+         for history in self._histories:
+             if history.dirty:
+                 updates[history.part_name] = history.to_bytes()
+         if self._version is not None and self._version.dirty:
+             updates[self._version.part_name] = self._version.to_bytes()
          return updates

      def reset_dirty(self) -> None:
@@ -2820,3 +2931,9 @@ class HwpxOxmlDocument:
              section.reset_dirty()
          for header in self._headers:
              header.reset_dirty()
+         for master_page in self._master_pages:
+             master_page.reset_dirty()
+         for history in self._histories:
+             history.reset_dirty()
+         if self._version is not None:
+             self._version.reset_dirty()
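The dirty-flag flow added to `HwpxOxmlDocument` above can be tried in isolation. The following is a minimal sketch, not the library's code: `SimplePart` and `collect_updates` are hypothetical stand-ins for `_HwpxOxmlSimplePart` and the update-collection loop, and `ET.tostring` is assumed in place of the package's `_serialize_xml`, whose implementation is not shown in this diff. It illustrates that editing an element in place does not schedule the part for saving until `mark_dirty()` is called.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable


class SimplePart:
    """Hypothetical stand-in for a standalone XML part with a dirty flag."""

    def __init__(self, part_name: str, element: ET.Element) -> None:
        self.part_name = part_name
        self.element = element
        self.dirty = False

    def mark_dirty(self) -> None:
        self.dirty = True

    def reset_dirty(self) -> None:
        self.dirty = False

    def to_bytes(self) -> bytes:
        # Assumption: plain ElementTree serialization stands in for _serialize_xml.
        return ET.tostring(self.element, encoding="utf-8")


def collect_updates(parts: Iterable[SimplePart]) -> Dict[str, bytes]:
    """Serialize only the parts that were explicitly marked dirty."""
    return {part.part_name: part.to_bytes() for part in parts if part.dirty}


version = SimplePart("version.xml", ET.Element("version", {"appVersion": "1"}))
history = SimplePart("Contents/history.xml", ET.Element("history"))

# An in-place edit alone is invisible to the update collection.
version.element.set("appVersion", "2")
updates_before = collect_updates([version, history])

# After mark_dirty(), only the edited part is serialized.
version.mark_dirty()
updates_after = collect_updates([version, history])
print(sorted(updates_before), sorted(updates_after))
```

This mirrors why the round-trip test calls `mark_dirty()` after each direct element edit.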
@@ -11,6 +11,21 @@ from zipfile import ZIP_DEFLATED, ZipFile
  _OPF_NS = "http://www.idpf.org/2007/opf/"


+ def _normalized_manifest_value(element: ET.Element) -> str:
+     values = [
+         element.attrib.get("id", ""),
+         element.attrib.get("href", ""),
+         element.attrib.get("media-type", ""),
+         element.attrib.get("properties", ""),
+     ]
+     return " ".join(part.lower() for part in values if part)
+
+
+ def _manifest_matches(element: ET.Element, *candidates: str) -> bool:
+     normalized = _normalized_manifest_value(element)
+     return any(candidate in normalized for candidate in candidates if candidate)
+
+
  def _ensure_bytes(value: bytes | str | ET.Element) -> bytes:
      if isinstance(value, bytes):
          return value
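The matching helpers above perform a case-insensitive substring test over the joined `id`, `href`, `media-type`, and `properties` attributes. The sketch below re-implements that logic with only the standard library, under non-private names, to show what it accepts. The three manifest items are hypothetical, and the last one illustrates a consequence of substring matching: an item whose id merely contains "version" also matches.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf/"


def normalized_manifest_value(element: ET.Element) -> str:
    """Join the lower-cased id/href/media-type/properties attributes."""
    keys = ("id", "href", "media-type", "properties")
    values = [element.attrib.get(key, "") for key in keys]
    return " ".join(value.lower() for value in values if value)


def manifest_matches(element: ET.Element, *candidates: str) -> bool:
    """True if any non-empty candidate is a substring of the joined attributes."""
    normalized = normalized_manifest_value(element)
    return any(candidate in normalized for candidate in candidates if candidate)


def make_item(item_id: str, href: str) -> ET.Element:
    """Build a hypothetical OPF manifest item for the demonstration."""
    return ET.Element(
        f"{{{OPF_NS}}}item",
        {"id": item_id, "href": href, "media-type": "application/xml"},
    )


master = make_item("master-page-0", "Contents/masterPages/masterPage0.xml")
section = make_item("section0", "Contents/section0.xml")
conversion = make_item("conversion-log", "Contents/conversion.xml")

print(manifest_matches(master, "masterpage", "master-page"))  # True
print(manifest_matches(section, "history"))  # False
print(manifest_matches(conversion, "version"))  # True, "conversion" contains "version"
```

Whether such loose matches matter in practice depends on the manifests encountered; the diff does not say.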
@@ -38,6 +53,10 @@ class HwpxPackage:
          self._spine_cache: list[str] | None = None
          self._section_paths_cache: list[str] | None = None
          self._header_paths_cache: list[str] | None = None
+         self._master_page_paths_cache: list[str] | None = None
+         self._history_paths_cache: list[str] | None = None
+         self._version_path_cache: str | None = None
+         self._version_path_cache_resolved = False

      # -- construction ----------------------------------------------------
      @classmethod
@@ -85,6 +104,12 @@ class HwpxPackage:
              self._spine_cache = None
              self._section_paths_cache = None
              self._header_paths_cache = None
+             self._master_page_paths_cache = None
+             self._history_paths_cache = None
+             self._version_path_cache = None
+             self._version_path_cache_resolved = False
+         elif part_name == "version.xml":
+             self._version_path_cache_resolved = False

      def get_xml(self, part_name: str) -> ET.Element:
          return ET.fromstring(self.get_part(part_name))
@@ -101,6 +126,11 @@ class HwpxPackage:
              self._manifest_tree = self.get_xml(self.MANIFEST_PATH)
          return self._manifest_tree

+     def _manifest_items(self) -> list[ET.Element]:
+         manifest = self.manifest_tree()
+         ns = {"opf": _OPF_NS}
+         return list(manifest.findall("./opf:manifest/opf:item", ns))
+
      def _resolve_spine_paths(self) -> list[str]:
          if self._spine_cache is None:
              manifest = self.manifest_tree()
@@ -155,6 +185,64 @@ class HwpxPackage:
              self._header_paths_cache = paths
          return list(self._header_paths_cache)

+     def master_page_paths(self) -> list[str]:
+         if self._master_page_paths_cache is None:
+             from pathlib import PurePosixPath
+
+             paths = [
+                 item.attrib.get("href", "")
+                 for item in self._manifest_items()
+                 if _manifest_matches(item, "masterpage", "master-page")
+                 and item.attrib.get("href")
+             ]
+
+             if not paths:
+                 paths = [
+                     name
+                     for name in self._parts.keys()
+                     if "master" in PurePosixPath(name).name.lower()
+                     and "page" in PurePosixPath(name).name.lower()
+                 ]
+
+             self._master_page_paths_cache = paths
+         return list(self._master_page_paths_cache)
+
+     def history_paths(self) -> list[str]:
+         if self._history_paths_cache is None:
+             from pathlib import PurePosixPath
+
+             paths = [
+                 item.attrib.get("href", "")
+                 for item in self._manifest_items()
+                 if _manifest_matches(item, "history")
+                 and item.attrib.get("href")
+             ]
+
+             if not paths:
+                 paths = [
+                     name
+                     for name in self._parts.keys()
+                     if "history" in PurePosixPath(name).name.lower()
+                 ]
+
+             self._history_paths_cache = paths
+         return list(self._history_paths_cache)
+
+     def version_path(self) -> str | None:
+         if not self._version_path_cache_resolved:
+             path: str | None = None
+             for item in self._manifest_items():
+                 if _manifest_matches(item, "version"):
+                     href = item.attrib.get("href", "").strip()
+                     if href:
+                         path = href
+                         break
+             if path is None and self.has_part("version.xml"):
+                 path = "version.xml"
+             self._version_path_cache = path
+             self._version_path_cache_resolved = True
+         return self._version_path_cache
+
      # -- saving ----------------------------------------------------------
      def save(
          self,
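Unlike the list caches, `version_path()` above pairs its cached value with a separate `_version_path_cache_resolved` flag. A plausible reason is that `None` is itself a valid result ("no version part"), so `None` cannot double as the "not computed yet" marker. The sketch below shows that pattern under stated assumptions: `OptionalPathCache`, its `resolver` callback, and the `calls` counter are hypothetical names introduced only for illustration.

```python
from typing import Callable, Optional


class OptionalPathCache:
    """Caches a lookup whose legitimate result may be None."""

    def __init__(self, resolver: Callable[[], Optional[str]]) -> None:
        self._resolver = resolver
        self._value: Optional[str] = None
        self._resolved = False  # separate flag: None alone cannot mean "unset"
        self.calls = 0  # counts resolver invocations, for demonstration only

    def get(self) -> Optional[str]:
        if not self._resolved:
            self.calls += 1
            self._value = self._resolver()
            self._resolved = True
        return self._value

    def invalidate(self) -> None:
        """Forget the cached result, e.g. after the manifest changes."""
        self._value = None
        self._resolved = False


# A package without a version part: the lookup legitimately yields None.
cache = OptionalPathCache(lambda: None)
first = cache.get()
second = cache.get()
print(first, second, cache.calls)  # None None 1

# After invalidation the resolver runs again.
cache.invalidate()
cache.get()
print(cache.calls)  # 2
```

With a plain `if self._value is None` check, the resolver would rerun on every call for packages that have no version part.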
@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: python-hwpx
- Version: 1.3
+ Version: 1.4
  Summary: A collection of Python utilities for loading and editing Hancom HWPX packages
  Author: python-hwpx Maintainers
  License: Non-Commercial License
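@@ -39,6 +39,7 @@ Requires-Dist: pytest>=7.4; extra == "test"
  - **Typed body model** – `hwpx.oxml.body` maps tables, controls, inline shapes, and change-tracking tags to data classes, and lets you inspect and modify them through `HwpxOxmlParagraph.model`/`HwpxOxmlRun.model` before writing them back to XML.
  - **Memos and field anchors** – `add_memo_with_anchor()` creates a memo and automatically inserts a MEMO field control so that the memo is displayed right away in 한/글.
  - **Header reference list lookup** – Parses bullets, paragraph properties, styles, change-tracking entries, and author information into data classes, and simplifies ID-based lookup with helpers such as `document.bullets` and `document.styles`.
+ - **Master-page, history, and version part control** – Directly edit and save the master-page/history/version parts included in the manifest through `document.master_pages`, `document.histories`, and `document.version`.
  - **Style-based text replacement** – Filters by run formatting (color, underline, `charPrIDRef`) to selectively replace or delete text. Strings split by highlight
  markers or tags are also replaced with their formatting preserved.
  - **Text extraction pipeline** – `hwpx.tools.text_extractor.TextExtractor` returns paragraph text while rendering highlights, footnotes, and controls in the way you choose.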
@@ -1,6 +1,7 @@
  from __future__ import annotations

  import io
+ import xml.etree.ElementTree as ET
  from pathlib import Path
  from zipfile import ZIP_DEFLATED, ZIP_STORED, ZipFile

@@ -9,6 +10,7 @@ import pytest
  from hwpx.document import HwpxDocument
  from hwpx.package import HwpxPackage
  from hwpx.tools import load_default_schemas, validate_document
+ from hwpx.templates import blank_document_bytes

  _MIMETYPE = b"application/hwp+zip"
  _VERSION_XML = (
@@ -137,3 +139,90 @@ def test_fixture_validates_against_reference_schemas(

      bytes_report = validate_document(sample_document_bytes)
      assert bytes_report.ok, "Generated sample failed schema validation from bytes"
+
+
+ def test_master_page_history_and_version_round_trip(tmp_path: Path) -> None:
+     package = HwpxPackage.open(blank_document_bytes())
+
+     manifest = package.manifest_tree()
+     ns = {"opf": "http://www.idpf.org/2007/opf/"}
+     manifest_list = manifest.find(f"{{{ns['opf']}}}manifest")
+     assert manifest_list is not None
+
+     def add_manifest_item(item_id: str, href: str) -> None:
+         ET.SubElement(
+             manifest_list,
+             f"{{{ns['opf']}}}item",
+             {"id": item_id, "href": href, "media-type": "application/xml"},
+         )
+
+     add_manifest_item("master-page-0", "Contents/masterPages/masterPage0.xml")
+     add_manifest_item("history", "Contents/history.xml")
+     add_manifest_item("version", "version.xml")
+     package.set_xml(package.MANIFEST_PATH, manifest)
+
+     hm_ns = "http://www.hancom.co.kr/hwpml/2011/master-page"
+     master_root = ET.Element(f"{{{hm_ns}}}masterPage")
+     ET.SubElement(
+         master_root,
+         f"{{{hm_ns}}}masterPageItem",
+         {"id": "0", "type": "BOTH", "name": "초기 바탕쪽"},
+     )
+     package.set_xml("Contents/masterPages/masterPage0.xml", master_root)
+
+     hhs_ns = "http://www.hancom.co.kr/hwpml/2011/history"
+     history_root = ET.Element(f"{{{hhs_ns}}}history")
+     history_entry = ET.SubElement(history_root, f"{{{hhs_ns}}}historyEntry", {"id": "0"})
+     comment = ET.SubElement(history_entry, f"{{{hhs_ns}}}comment")
+     comment.text = "초기 내역"
+     package.set_xml("Contents/history.xml", history_root)
+
+     document = HwpxDocument.from_package(package)
+
+     assert len(document.master_pages) == 1
+     assert len(document.histories) == 1
+     version_part = document.version
+     assert version_part is not None
+
+     master_page = document.master_pages[0]
+     master_item = master_page.element.find(f"{{{hm_ns}}}masterPageItem")
+     assert master_item is not None
+     master_item.set("name", "검토용 바탕쪽")
+     master_page.mark_dirty()
+
+     history_part = document.histories[0]
+     history_comment = history_part.element.find(
+         f"{{{hhs_ns}}}historyEntry/{{{hhs_ns}}}comment"
+     )
+     assert history_comment is not None
+     history_comment.text = "업데이트된 변경 기록"
+     history_part.mark_dirty()
+
+     version_part.element.set("appVersion", "15.0.0.100 WIN32")
+     version_part.mark_dirty()
+
+     output_path = tmp_path / "master_history_roundtrip.hwpx"
+     document.save(output_path)
+
+     reopened = HwpxDocument.open(output_path)
+     assert reopened.master_pages
+     assert reopened.histories
+     reopened_version = reopened.version
+     assert reopened_version is not None
+
+     reopened_master_item = reopened.master_pages[0].element.find(
+         f"{{{hm_ns}}}masterPageItem"
+     )
+     assert reopened_master_item is not None
+     assert reopened_master_item.get("name") == "검토용 바탕쪽"
+
+     reopened_history_comment = reopened.histories[0].element.find(
+         f"{{{hhs_ns}}}historyEntry/{{{hhs_ns}}}comment"
+     )
+     assert reopened_history_comment is not None
+     assert reopened_history_comment.text == "업데이트된 변경 기록"
+
+     assert reopened_version.element.get("appVersion") == "15.0.0.100 WIN32"
+     assert "Contents/masterPages/masterPage0.xml" in reopened.package.master_page_paths()
+     assert "Contents/history.xml" in reopened.package.history_paths()
+     assert reopened.package.version_path() == "version.xml"