markdown-to-confluence 0.2.5__py3-none-any.whl → 0.2.7__py3-none-any.whl

@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: markdown-to-confluence
3
- Version: 0.2.5
3
+ Version: 0.2.7
4
4
  Summary: Publish Markdown files to Confluence wiki
5
5
  Home-page: https://github.com/hunyadi/md2conf
6
6
  Author: Levente Hunyadi
@@ -22,10 +22,10 @@ Requires-Python: >=3.8
22
22
  Description-Content-Type: text/markdown
23
23
  License-File: LICENSE
24
24
  Requires-Dist: lxml>=5.3
25
- Requires-Dist: types-lxml>=2024.8.7
26
- Requires-Dist: markdown>=3.6
27
- Requires-Dist: types-markdown>=3.6
28
- Requires-Dist: pymdown-extensions>=10.9
25
+ Requires-Dist: types-lxml>=2024.11.8
26
+ Requires-Dist: markdown>=3.7
27
+ Requires-Dist: types-markdown>=3.7
28
+ Requires-Dist: pymdown-extensions>=10.12
29
29
  Requires-Dist: pyyaml>=6.0
30
30
  Requires-Dist: types-PyYAML>=6.0
31
31
  Requires-Dist: requests>=2.32
@@ -50,7 +50,7 @@ This Python package
50
50
  * Link to [sections on the same page](#getting-started) or [external locations](http://example.com/)
51
51
  * Ordered and unordered lists
52
52
  * Code blocks (e.g. Python, JSON, XML)
53
- * Image references (uploaded as Confluence page attachments)
53
+ * Images (uploaded as Confluence page attachments or hosted externally)
54
54
  * Tables
55
55
  * [Table of contents](https://docs.gitlab.com/ee/user/markdown.html#table-of-contents)
56
56
  * [Admonitions](https://python-markdown.github.io/extensions/admonition/) and alert boxes in [GitHub](https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#alerts) and [GitLab](https://docs.gitlab.com/ee/development/documentation/styleguide/#alert-boxes)
@@ -75,11 +75,11 @@ npm install -g @mermaid-js/mermaid-cli
75
75
 
76
76
  In order to get started, you will need
77
77
 
78
- * your organization domain name (e.g. `instructure.atlassian.net`),
78
+ * your organization domain name (e.g. `example.atlassian.net`),
79
79
  * base path for Confluence wiki (typically `/wiki/` for managed Confluence, `/` for on-premise)
80
80
  * your Confluence username (e.g. `levente.hunyadi@instructure.com`) (only if required by your deployment),
81
81
  * a Confluence API token (a string of alphanumeric characters), and
82
- * the space key in Confluence (e.g. `DAP`) you are publishing content to.
82
+ * the space key in Confluence (e.g. `SPACE`) you are publishing content to.
83
83
 
84
84
  ### Obtaining an API token
85
85
 
@@ -93,11 +93,11 @@ In order to get started, you will need
93
93
  Confluence organization domain, base path, username, API token and space key can be specified at runtime or set as Confluence environment variables (e.g. add to your `~/.profile` on Linux, or `~/.bash_profile` or `~/.zshenv` on MacOS):
94
94
 
95
95
  ```bash
96
- export CONFLUENCE_DOMAIN='instructure.atlassian.net'
96
+ export CONFLUENCE_DOMAIN='example.atlassian.net'
97
97
  export CONFLUENCE_PATH='/wiki/'
98
98
  export CONFLUENCE_USER_NAME='levente.hunyadi@instructure.com'
99
99
  export CONFLUENCE_API_KEY='0123456789abcdef'
100
- export CONFLUENCE_SPACE_KEY='DAP'
100
+ export CONFLUENCE_SPACE_KEY='SPACE'
101
101
  ```
102
102
 
103
103
  On Windows, these can be set via system properties.
@@ -129,7 +129,7 @@ The above tells the tool to synchronize the Markdown file with the given Conflue
129
129
  If you work in an environment where there are multiple Confluence spaces, and some Markdown pages may go into one space, whereas other pages may go into another, you can set the target space on a per-document basis:
130
130
 
131
131
  ```markdown
132
- <!-- confluence-space-key: DAP -->
132
+ <!-- confluence-space-key: SPACE -->
133
133
  ```
134
134
 
135
135
  This overrides the default space set via command-line arguments or environment variables.
@@ -146,9 +146,17 @@ Provide generated-by prompt text in the Markdown file with a tag:
146
146
 
147
147
  Alternatively, use the `--generated-by GENERATED_BY` option. The tag takes precedence.
148
148
 
149
+ ### Publishing a single page
150
+
151
+ *md2conf* has two modes of operation: *single-page mode* and *directory mode*.
152
+
153
+ In single-page mode, you specify a single Markdown file as the source, which can contain absolute links to external locations (e.g. `https://example.com`) but not relative links to other pages (e.g. `local.md`). In other words, the page must be stand-alone.
154
+
149
155
  ### Publishing a directory
150
156
 
151
- *md2conf* allows you to convert and publish a directory of Markdown files rather than a single Markdown file if you pass a directory as `mdpath`. This will traverse the specified directory recursively, and synchronize each Markdown file.
157
+ *md2conf* allows you to convert and publish a directory of Markdown files rather than a single Markdown file in *directory mode* if you pass a directory as the source. This will traverse the specified directory recursively, and synchronize each Markdown file.
158
+
159
+ First, *md2conf* builds an index of pages in the directory hierarchy. The index maps each Markdown file path to a Confluence page ID. Whenever a relative link is encountered in a Markdown file, the relative link is replaced with a Confluence URL to the referenced page with the help of the index. All relative links must point to Markdown files that are located in the directory hierarchy.
152
160
 
153
161
  If a Markdown file doesn't yet pair up with a Confluence page, *md2conf* creates a new page and assigns a parent. Parent-child relationships are reflected in the navigation panel in Confluence. You can set a root page ID with the command-line option `-r`, which constitutes the topmost parent. (This could correspond to the landing page of your Confluence space. The Confluence page ID is always revealed when you edit a page.) Whenever a directory contains the file `index.md` or `README.md`, this page becomes the future parent page, and all Markdown files in this directory (and possibly nested directories) become its child pages (unless they already have a page ID). However, if an `index.md` or `README.md` file is subsequently found in one of the nested directories, it becomes the parent page of that directory, and any of its subdirectories.
154
162
 
@@ -216,7 +224,7 @@ You can run the Docker container via `docker run` or via `Dockerfile`. Either ca
216
224
  With `docker run`, you can pass Confluence domain, user, API and space key directly to `docker run`:
217
225
 
218
226
  ```sh
219
- docker run --rm --name md2conf -v $(pwd):/data leventehunyadi/md2conf:latest -d instructure.atlassian.net -u levente.hunyadi@instructure.com -a 0123456789abcdef -s DAP ./
227
+ docker run --rm --name md2conf -v $(pwd):/data leventehunyadi/md2conf:latest -d example.atlassian.net -u levente.hunyadi@instructure.com -a 0123456789abcdef -s SPACE ./
220
228
  ```
221
229
 
222
230
  Alternatively, you can use a separate file `.env` to pass these parameters as environment variables:
@@ -234,11 +242,11 @@ With the `Dockerfile` approach, you can extend the base image:
234
242
  ```Dockerfile
235
243
  FROM leventehunyadi/md2conf:latest
236
244
 
237
- ENV CONFLUENCE_DOMAIN='instructure.atlassian.net'
245
+ ENV CONFLUENCE_DOMAIN='example.atlassian.net'
238
246
  ENV CONFLUENCE_PATH='/wiki/'
239
247
  ENV CONFLUENCE_USER_NAME='levente.hunyadi@instructure.com'
240
248
  ENV CONFLUENCE_API_KEY='0123456789abcdef'
241
- ENV CONFLUENCE_SPACE_KEY='DAP'
249
+ ENV CONFLUENCE_SPACE_KEY='SPACE'
242
250
 
243
251
  CMD ["./"]
244
252
  ```
@@ -248,5 +256,5 @@ Alternatively,
248
256
  ```Dockerfile
249
257
  FROM leventehunyadi/md2conf:latest
250
258
 
251
- CMD ["-d", "instructure.atlassian.net", "-u", "levente.hunyadi@instructure.com", "-a", "0123456789abcdef", "-s", "DAP", "./"]
259
+ CMD ["-d", "example.atlassian.net", "-u", "levente.hunyadi@instructure.com", "-a", "0123456789abcdef", "-s", "SPACE", "./"]
252
260
  ```
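The *Publishing a directory* section of the README changed above describes an index that maps each Markdown file path to a Confluence page ID and uses it to rewrite relative links. The following is a minimal sketch of that idea, not the package's implementation. The function name `resolve_link`, the `base_url` parameter, and the `viewpage.action` URL layout are assumptions made for illustration.

```python
from pathlib import Path
from typing import Dict, Optional


def resolve_link(
    source: Path, href: str, index: Dict[Path, str], base_url: str
) -> Optional[str]:
    """Map a relative Markdown link to a page URL (hypothetical URL layout)."""
    # Resolve the link relative to the directory of the file that contains it.
    target = (source.parent / href).resolve()
    # Look up the Confluence page ID recorded for that Markdown file.
    page_id = index.get(target)
    if page_id is None:
        return None  # the link does not point at an indexed Markdown file
    return f"{base_url}/pages/viewpage.action?pageId={page_id}"


# Usage: one indexed page, one link that resolves and one that does not.
index = {Path("/docs/guide/setup.md").resolve(): "12345"}
print(resolve_link(Path("/docs/index.md"), "guide/setup.md", index, "https://example.atlassian.net/wiki"))
print(resolve_link(Path("/docs/index.md"), "../other.md", index, "https://example.atlassian.net/wiki"))
```

A link that is missing from the index returns `None` here; the README states that all relative links must point to Markdown files inside the directory hierarchy.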
@@ -0,0 +1,21 @@
1
+ md2conf/__init__.py,sha256=U8zdop7-AIrfwCYzWiwKfhCEPF_1QEKPt4Zwq-38LlU,402
2
+ md2conf/__main__.py,sha256=6iOI28W_d71tlnCMFpZwvkBmBt5-HazlZsz69gS4Oak,6894
3
+ md2conf/api.py,sha256=NmAbNWTrTSi2ZDGYymy70Fw6HcgrmB-Ua4re4yLJvVc,17715
4
+ md2conf/application.py,sha256=-kFpMRtSpQUU1hsiW5O73gL1X9McQWpvyAAEUxEnpuU,8869
5
+ md2conf/converter.py,sha256=S8Kka35Y99w0J00CYi-DQwsKzlHAvBfaSCf10mb1FZk,36596
6
+ md2conf/emoji.py,sha256=w9oiOIxzObAE7HTo3f6aETT1_D3t3yZwr88ynU4ENm0,1924
7
+ md2conf/entities.dtd,sha256=M6NzqL5N7dPs_eUA_6sDsiSLzDaAacrx9LdttiufvYU,30215
8
+ md2conf/matcher.py,sha256=mYMltZOLypK4O-SJugLgicOwUMem67hiNLg_kPFoJkU,3583
9
+ md2conf/mermaid.py,sha256=gqA6Hg6WcPDdR7JOClezAgNZj2Gq4pXJSgmOUlUt6Dk,2192
10
+ md2conf/processor.py,sha256=E-Na-a8tNp4CaoRPA5etcXdHXNRdgyMrf6bfKa9P7O4,4781
11
+ md2conf/properties.py,sha256=iVIc0h0XtS3Y2LCywX1C9cvmVQ0WljOMt8pl2MDMVCI,1990
12
+ md2conf/puppeteer-config.json,sha256=-dMTAN_7kNTGbDlfXzApl0KJpAWna9YKZdwMKbpOb60,159
13
+ md2conf/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
14
+ md2conf/util.py,sha256=ftf60MiW7S7rW45ipWX6efP_Sv2F2qpyIDHrGA0cBiw,743
15
+ markdown_to_confluence-0.2.7.dist-info/LICENSE,sha256=Pv43so2bPfmKhmsrmXFyAvS7M30-1i1tzjz6-dfhyOo,1077
16
+ markdown_to_confluence-0.2.7.dist-info/METADATA,sha256=76K_O_5b__MnKT7FuLXgCHX6hR5dZio3mK6RWR4DyCA,13551
17
+ markdown_to_confluence-0.2.7.dist-info/WHEEL,sha256=PZUExdf71Ui_so67QXpySuHtCi3-J3wvF4ORK6k_S8U,91
18
+ markdown_to_confluence-0.2.7.dist-info/entry_points.txt,sha256=F1zxa1wtEObtbHS-qp46330WVFLHdMnV2wQ-ZorRmX0,50
19
+ markdown_to_confluence-0.2.7.dist-info/top_level.txt,sha256=_FJfl_kHrHNidyjUOuS01ngu_jDsfc-ZjSocNRJnTzU,8
20
+ markdown_to_confluence-0.2.7.dist-info/zip-safe,sha256=AbpHGcgLb-kRsJGnwFEktk7uzpZOCcBY74-YBdrKVGs,1
21
+ markdown_to_confluence-0.2.7.dist-info/RECORD,,
@@ -1,5 +1,5 @@
1
1
  Wheel-Version: 1.0
2
- Generator: setuptools (75.3.0)
2
+ Generator: setuptools (75.6.0)
3
3
  Root-Is-Purelib: true
4
4
  Tag: py3-none-any
5
5
 
md2conf/__init__.py CHANGED
@@ -5,7 +5,7 @@ Parses Markdown files, converts Markdown content into the Confluence Storage For
5
5
  Confluence API endpoints to upload images and content.
6
6
  """
7
7
 
8
- __version__ = "0.2.5"
8
+ __version__ = "0.2.7"
9
9
  __author__ = "Levente Hunyadi"
10
10
  __copyright__ = "Copyright 2022-2024, Levente Hunyadi"
11
11
  __license__ = "MIT"
md2conf/api.py CHANGED
@@ -178,17 +178,30 @@ class ConfluenceSession:
178
178
  def upload_attachment(
179
179
  self,
180
180
  page_id: str,
181
- attachment_path: Path,
182
181
  attachment_name: str,
182
+ *,
183
+ attachment_path: Optional[Path] = None,
183
184
  raw_data: Optional[bytes] = None,
185
+ content_type: Optional[str] = None,
184
186
  comment: Optional[str] = None,
185
- *,
186
187
  space_key: Optional[str] = None,
187
188
  force: bool = False,
188
189
  ) -> None:
189
- content_type = mimetypes.guess_type(attachment_path, strict=True)[0]
190
190
 
191
- if not raw_data and not attachment_path.is_file():
191
+ if attachment_path is None and raw_data is None:
192
+ raise ConfluenceError("required: `attachment_path` or `raw_data`")
193
+
194
+ if attachment_path is not None and raw_data is not None:
195
+ raise ConfluenceError("expected: either `attachment_path` or `raw_data`")
196
+
197
+ if content_type is None:
198
+ if attachment_path is not None:
199
+ name = str(attachment_path)
200
+ else:
201
+ name = attachment_name
202
+ content_type, _ = mimetypes.guess_type(name, strict=True)
203
+
204
+ if attachment_path is not None and not attachment_path.is_file():
192
205
  raise ConfluenceError(f"file not found: {attachment_path}")
193
206
 
194
207
  try:
@@ -196,14 +209,16 @@ class ConfluenceSession:
196
209
  page_id, attachment_name, space_key=space_key
197
210
  )
198
211
 
199
- if not raw_data:
212
+ if attachment_path is not None:
200
213
  if not force and attachment.file_size == attachment_path.stat().st_size:
201
214
  LOGGER.info("Up-to-date attachment: %s", attachment_name)
202
215
  return
203
- else:
216
+ elif raw_data is not None:
204
217
  if not force and attachment.file_size == len(raw_data):
205
218
  LOGGER.info("Up-to-date embedded image: %s", attachment_name)
206
219
  return
220
+ else:
221
+ raise NotImplementedError("never occurs")
207
222
 
208
223
  id = removeprefix(attachment.id, "att")
209
224
  path = f"/content/{page_id}/child/attachment/{id}/data"
@@ -213,7 +228,7 @@ class ConfluenceSession:
213
228
 
214
229
  url = self._build_url(path)
215
230
 
216
- if not raw_data:
231
+ if attachment_path is not None:
217
232
  with open(attachment_path, "rb") as attachment_file:
218
233
  file_to_upload = {
219
234
  "comment": comment,
@@ -230,24 +245,27 @@ class ConfluenceSession:
230
245
  files=file_to_upload, # type: ignore
231
246
  headers={"X-Atlassian-Token": "no-check"},
232
247
  )
233
- else:
248
+ elif raw_data is not None:
234
249
  LOGGER.info("Uploading raw data: %s", attachment_name)
235
250
 
251
+ raw_file = io.BytesIO(raw_data)
252
+ raw_file.name = attachment_name
236
253
  file_to_upload = {
237
254
  "comment": comment,
238
255
  "file": (
239
256
  attachment_name, # will truncate path component
240
- io.BytesIO(raw_data), # type: ignore
257
+ raw_file, # type: ignore
241
258
  content_type,
242
259
  {"Expires": "0"},
243
260
  ),
244
261
  }
245
-
246
262
  response = self.session.post(
247
263
  url,
248
264
  files=file_to_upload, # type: ignore
249
265
  headers={"X-Atlassian-Token": "no-check"},
250
266
  )
267
+ else:
268
+ raise NotImplementedError("never occurs")
251
269
 
252
270
  response.raise_for_status()
253
271
  data = response.json()
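The `upload_attachment` hunks above make `attachment_path` and `raw_data` mutually exclusive keyword arguments and guess the content type only when none is passed. The sketch below isolates that argument handling under stated assumptions: `pick_content_type` is a hypothetical helper, and it raises `ValueError` where the package raises its own `ConfluenceError`.

```python
import mimetypes
from pathlib import Path
from typing import Optional


def pick_content_type(
    attachment_name: str,
    *,
    attachment_path: Optional[Path] = None,
    raw_data: Optional[bytes] = None,
    content_type: Optional[str] = None,
) -> Optional[str]:
    """Validate the data source and choose a MIME type (illustrative helper)."""
    # Exactly one of the two data sources must be supplied.
    if (attachment_path is None) == (raw_data is None):
        raise ValueError("expected: exactly one of `attachment_path` or `raw_data`")
    # An explicit content type always wins over guessing.
    if content_type is not None:
        return content_type
    # Guess from the file path if there is one, else from the attachment name.
    name = str(attachment_path) if attachment_path is not None else attachment_name
    guessed, _ = mimetypes.guess_type(name, strict=True)
    return guessed


# Usage: raw bytes carry no path, so the attachment name drives the guess.
print(pick_content_type("diagram.png", raw_data=b"\x89PNG"))
```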
@@ -402,12 +420,23 @@ class ConfluenceSession:
402
420
  new_content: str,
403
421
  *,
404
422
  space_key: Optional[str] = None,
423
+ title: Optional[str] = None,
405
424
  ) -> None:
425
+ """
426
+ Update a page via the Confluence API.
427
+
428
+ :param page_id: The Confluence page ID.
429
+ :param new_content: Confluence Storage Format XHTML.
430
+ :param space_key: The Confluence space key (unless the default space is to be used).
431
+ :param title: New title to assign to the page. Needs to be unique within a space.
432
+ """
433
+
406
434
  page = self.get_page(page_id, space_key=space_key)
435
+ new_title = title or page.title
407
436
 
408
437
  try:
409
438
  old_content = sanitize_confluence(page.content)
410
- if old_content == new_content:
439
+ if page.title == new_title and old_content == new_content:
411
440
  LOGGER.info("Up-to-date page: %s", page_id)
412
441
  return
413
442
  except ParseError as exc:
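The hunk above changes the up-to-date check in `update_page` so that a page is skipped only when both its title and its content are unchanged. A minimal sketch of that decision, assuming plain strings for titles and content; `needs_update` is a hypothetical name that does not appear in the package.

```python
from typing import Optional


def needs_update(
    old_title: str, old_content: str, new_content: str, title: Optional[str] = None
) -> bool:
    """Return True when the page title or body differs from what is stored."""
    # Fall back to the current title when no new title is given.
    new_title = title or old_title
    return not (old_title == new_title and old_content == new_content)


# Usage: identical content with a new title still triggers an update.
print(needs_update("Guide", "<p>text</p>", "<p>text</p>", title="User guide"))
```

Because the fallback uses `or`, an empty string for `title` behaves like `None` and keeps the existing title.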
@@ -417,7 +446,7 @@ class ConfluenceSession:
417
446
  data = {
418
447
  "id": page_id,
419
448
  "type": "page",
420
- "title": page.title, # title needs to be unique within a space so the original title is maintained
449
+ "title": new_title,
421
450
  "space": {"key": space_key or self.space_key},
422
451
  "body": {"storage": {"value": new_content, "representation": "storage"}},
423
452
  "version": {"minorEdit": True, "number": page.version + 1},
md2conf/application.py CHANGED
@@ -11,8 +11,6 @@ import os.path
11
11
  from pathlib import Path
12
12
  from typing import Dict, List, Optional
13
13
 
14
- import yaml
15
-
16
14
  from .api import ConfluencePage, ConfluenceSession
17
15
  from .converter import (
18
16
  ConfluenceDocument,
@@ -20,7 +18,7 @@ from .converter import (
20
18
  ConfluencePageMetadata,
21
19
  ConfluenceQualifiedID,
22
20
  attachment_name,
23
- extract_frontmatter,
21
+ extract_frontmatter_title,
24
22
  extract_qualified_id,
25
23
  read_qualified_id,
26
24
  )
@@ -52,17 +50,31 @@ class Application:
52
50
  else:
53
51
  raise ValueError(f"expected: valid file or directory path; got: {path}")
54
52
 
55
- def synchronize_page(self, page_path: Path) -> None:
53
+ def synchronize_page(
54
+ self, page_path: Path, root_dir: Optional[Path] = None
55
+ ) -> None:
56
56
  "Synchronizes a single Markdown page with Confluence."
57
57
 
58
58
  page_path = page_path.resolve(True)
59
- self._synchronize_page(page_path, {})
59
+ if root_dir is None:
60
+ root_dir = page_path.parent
61
+ else:
62
+ root_dir = root_dir.resolve(True)
60
63
 
61
- def synchronize_directory(self, local_dir: Path) -> None:
64
+ self._synchronize_page(page_path, root_dir, {})
65
+
66
+ def synchronize_directory(
67
+ self, local_dir: Path, root_dir: Optional[Path] = None
68
+ ) -> None:
62
69
  "Synchronizes a directory of Markdown pages with Confluence."
63
70
 
64
- LOGGER.info("Synchronizing directory: %s", local_dir)
65
71
  local_dir = local_dir.resolve(True)
72
+ if root_dir is None:
73
+ root_dir = local_dir
74
+ else:
75
+ root_dir = root_dir.resolve(True)
76
+
77
+ LOGGER.info("Synchronizing directory: %s", local_dir)
66
78
 
67
79
  # Step 1: build index of all page metadata
68
80
  page_metadata: Dict[Path, ConfluencePageMetadata] = {}
@@ -76,17 +88,18 @@ class Application:
76
88
 
77
89
  # Step 2: convert each page
78
90
  for page_path in page_metadata.keys():
79
- self._synchronize_page(page_path, page_metadata)
91
+ self._synchronize_page(page_path, root_dir, page_metadata)
80
92
 
81
93
  def _synchronize_page(
82
94
  self,
83
95
  page_path: Path,
96
+ root_dir: Path,
84
97
  page_metadata: Dict[Path, ConfluencePageMetadata],
85
98
  ) -> None:
86
99
  base_path = page_path.parent
87
100
 
88
101
  LOGGER.info("Synchronizing page: %s", page_path)
89
- document = ConfluenceDocument(page_path, self.options, page_metadata)
102
+ document = ConfluenceDocument(page_path, self.options, root_dir, page_metadata)
90
103
 
91
104
  if document.id.space_key:
92
105
  with self.api.switch_space(document.id.space_key):
@@ -159,7 +172,7 @@ class Application:
159
172
  document = f.read()
160
173
 
161
174
  qualified_id, document = extract_qualified_id(document)
162
- frontmatter, document = extract_frontmatter(document)
175
+ frontmatter_title, _ = extract_frontmatter_title(document)
163
176
 
164
177
  if qualified_id is not None:
165
178
  confluence_page = self.api.get_page(
@@ -172,15 +185,8 @@ class Application:
172
185
  )
173
186
 
174
187
  # assign title from frontmatter if present
175
- if title is None and frontmatter is not None:
176
- properties = yaml.safe_load(frontmatter)
177
- if isinstance(properties, dict):
178
- property_title = properties.get("title")
179
- if isinstance(property_title, str):
180
- title = property_title
181
-
182
188
  confluence_page = self._create_page(
183
- absolute_path, document, title, parent_id
189
+ absolute_path, document, title or frontmatter_title, parent_id
184
190
  )
185
191
 
186
192
  return ConfluencePageMetadata(
@@ -221,21 +227,20 @@ class Application:
221
227
  for image in document.images:
222
228
  self.api.upload_attachment(
223
229
  document.id.page_id,
224
- base_path / image,
225
230
  attachment_name(image),
231
+ attachment_path=base_path / image,
226
232
  )
227
233
 
228
- for image, data in document.embedded_images.items():
234
+ for name, data in document.embedded_images.items():
229
235
  self.api.upload_attachment(
230
236
  document.id.page_id,
231
- Path("EMB") / image,
232
- attachment_name(image),
237
+ name,
233
238
  raw_data=data,
234
239
  )
235
240
 
236
241
  content = document.xhtml()
237
242
  LOGGER.debug("Generated Confluence Storage Format document:\n%s", content)
238
- self.api.update_page(document.id.page_id, content)
243
+ self.api.update_page(document.id.page_id, content, title=document.title)
239
244
 
240
245
  def _update_markdown(
241
246
  self,
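Both `synchronize_page` and `synchronize_directory` above gain an optional `root_dir` argument that defaults to the page's parent directory or to the directory itself. The sketch below merges the two defaults into one hypothetical helper, `effective_root`, purely to illustrate the rule; the package keeps them in separate methods.

```python
from pathlib import Path
from typing import Optional


def effective_root(source: Path, root_dir: Optional[Path] = None) -> Path:
    """Pick the root directory that relative links are checked against."""
    # strict=True raises FileNotFoundError if the path does not exist.
    source = source.resolve(True)
    if root_dir is not None:
        return root_dir.resolve(True)
    # A directory is its own root; a single page uses its parent directory.
    return source if source.is_dir() else source.parent


# Usage: the current directory acts as its own root when none is given.
print(effective_root(Path(".")))
```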
md2conf/converter.py CHANGED
@@ -18,11 +18,12 @@ import uuid
18
18
  import xml.etree.ElementTree
19
19
  from dataclasses import dataclass
20
20
  from pathlib import Path
21
- from typing import Any, Dict, List, Literal, Optional, Tuple
21
+ from typing import Any, Dict, List, Literal, Optional, Tuple, Union
22
22
  from urllib.parse import ParseResult, urlparse, urlunparse
23
23
 
24
24
  import lxml.etree as ET
25
25
  import markdown
26
+ import yaml
26
27
  from lxml.builder import ElementMaker
27
28
 
28
29
  from . import mermaid
@@ -301,9 +302,10 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
301
302
 
302
303
  options: ConfluenceConverterOptions
303
304
  path: Path
304
- base_path: Path
305
+ base_dir: Path
306
+ root_dir: Path
305
307
  links: List[str]
306
- images: List[str]
308
+ images: List[Path]
307
309
  embedded_images: Dict[str, bytes]
308
310
  page_metadata: Dict[Path, ConfluencePageMetadata]
309
311
 
@@ -311,12 +313,14 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
311
313
  self,
312
314
  options: ConfluenceConverterOptions,
313
315
  path: Path,
316
+ root_dir: Path,
314
317
  page_metadata: Dict[Path, ConfluencePageMetadata],
315
318
  ) -> None:
316
319
  super().__init__()
317
320
  self.options = options
318
321
  self.path = path
319
- self.base_path = path.parent
322
+ self.base_dir = path.parent
323
+ self.root_dir = root_dir
320
324
  self.links = []
321
325
  self.images = []
322
326
  self.embedded_images = {}
@@ -347,8 +351,8 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
347
351
  heading.text = None
348
352
 
349
353
  def _transform_link(self, anchor: ET._Element) -> Optional[ET._Element]:
350
- url = anchor.attrib["href"]
351
- if is_absolute_url(url):
354
+ url = anchor.attrib.get("href")
355
+ if url is None or is_absolute_url(url):
352
356
  return None
353
357
 
354
358
  LOGGER.debug("Found link %s relative to %s", url, self.path)
@@ -383,9 +387,9 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
383
387
  # convert the relative URL to absolute URL based on the base path value, then look up
384
388
  # the absolute path in the page metadata dictionary to discover the relative path
385
389
  # within Confluence that should be used
386
- absolute_path = (self.base_path / relative_url.path).absolute()
387
- if not str(absolute_path).startswith(str(self.base_path)):
388
- msg = f"relative URL {url} points to outside base path: {self.base_path}"
390
+ absolute_path = (self.base_dir / relative_url.path).resolve(True)
391
+ if not str(absolute_path).startswith(str(self.root_dir)):
392
+ msg = f"relative URL {url} points to outside root path: {self.root_dir}"
389
393
  if self.options.ignore_invalid_url:
390
394
  LOGGER.warning(msg)
391
395
  anchor.attrib.pop("href")
@@ -393,8 +397,6 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
393
397
  else:
394
398
  raise DocumentError(msg)
395
399
 
396
- relative_path = os.path.relpath(absolute_path, self.base_path)
397
-
398
400
  link_metadata = self.page_metadata.get(absolute_path)
399
401
  if link_metadata is None:
400
402
  msg = f"unable to find matching page for URL: {url}"
@@ -405,6 +407,7 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
405
407
  else:
406
408
  raise DocumentError(msg)
407
409
 
410
+ relative_path = os.path.relpath(absolute_path, self.base_dir)
408
411
  LOGGER.debug(
409
412
  "found link to page %s with metadata: %s", relative_path, link_metadata
410
413
  )
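The hunks above test whether a resolved link target lies under `root_dir` by comparing string prefixes. As a point of comparison, here is a sketch of a component-wise containment test based on `relative_to`, which is available on Python 3.8. The helper name `is_within` is hypothetical, and `PurePosixPath` is used only so that the example behaves the same on every platform.

```python
from pathlib import PurePosixPath


def is_within(path: PurePosixPath, root: PurePosixPath) -> bool:
    """Return True if `path` is located under `root`, compared by components."""
    try:
        # relative_to raises ValueError when `path` is not under `root`.
        path.relative_to(root)
    except ValueError:
        return False
    return True


# Usage: a sibling directory with a shared name prefix is not inside the root.
print(is_within(PurePosixPath("/docs/guide/page.md"), PurePosixPath("/docs")))
print(is_within(PurePosixPath("/docs-old/page.md"), PurePosixPath("/docs")))
```

For comparison, `"/docs-old/page.md".startswith("/docs")` evaluates to true, which is why the sketch compares path components rather than characters.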
@@ -430,31 +433,72 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
430
433
  return None
431
434
 
432
435
  def _transform_image(self, image: ET._Element) -> ET._Element:
433
- path: str = image.attrib["src"]
436
+ src = image.attrib.get("src")
437
+
438
+ if not src:
439
+ raise DocumentError("image lacks `src` attribute")
440
+
441
+ attributes: Dict[str, Any] = {
442
+ ET.QName(namespaces["ac"], "align"): "center",
443
+ ET.QName(namespaces["ac"], "layout"): "center",
444
+ }
445
+ width = image.attrib.get("width")
446
+ if width is not None:
447
+ attributes.update({ET.QName(namespaces["ac"], "width"): width})
448
+ height = image.attrib.get("height")
449
+ if height is not None:
450
+ attributes.update({ET.QName(namespaces["ac"], "height"): height})
451
+
452
+ caption = image.attrib.get("alt")
453
+
454
+ if is_absolute_url(src):
455
+ return self._transform_external_image(src, caption, attributes)
456
+ else:
457
+ return self._transform_attached_image(Path(src), caption, attributes)
458
+
459
+ def _transform_external_image(
460
+ self, url: str, caption: Optional[str], attributes: Dict[str, Any]
461
+ ) -> ET._Element:
462
+ "Emits Confluence Storage Format XHTML for an external image."
463
+
464
+ elements: List[ET._Element] = []
465
+ elements.append(
466
+ RI(
467
+ "url",
468
+ # refers to an external image
469
+ {ET.QName(namespaces["ri"], "value"): url},
470
+ )
471
+ )
472
+ if caption is not None:
473
+ elements.append(AC("caption", HTML.p(caption)))
474
+
475
+ return AC("image", attributes, *elements)
476
+
477
+ def _transform_attached_image(
478
+ self, path: Path, caption: Optional[str], attributes: Dict[str, Any]
479
+ ) -> ET._Element:
480
+ "Emits Confluence Storage Format XHTML for an attached image."
434
481
 
435
482
  # prefer PNG over SVG; Confluence displays SVG in wrong size, and text labels are truncated
436
- if path and is_relative_url(path):
437
- relative_path = Path(path)
438
- if (
439
- relative_path.suffix == ".svg"
440
- and (self.base_path / relative_path.with_suffix(".png")).exists()
441
- ):
442
- path = str(relative_path.with_suffix(".png"))
483
+ png_file = path.with_suffix(".png")
484
+ if path.suffix == ".svg" and (self.base_dir / png_file).exists():
485
+ path = png_file
443
486
 
444
487
  self.images.append(path)
445
- caption = image.attrib["alt"]
446
- return AC(
447
- "image",
448
- {
449
- ET.QName(namespaces["ac"], "align"): "center",
450
- ET.QName(namespaces["ac"], "layout"): "center",
451
- },
488
+ image_name = attachment_name(path)
489
+
490
+ elements: List[ET._Element] = []
491
+ elements.append(
452
492
  RI(
453
493
  "attachment",
454
- {ET.QName(namespaces["ri"], "filename"): attachment_name(path)},
455
- ),
456
- AC("caption", HTML.p(caption)),
494
+ # refers to an attachment uploaded alongside the page
495
+ {ET.QName(namespaces["ri"], "filename"): image_name},
496
+ )
457
497
  )
498
+ if caption is not None:
499
+ elements.append(AC("caption", HTML.p(caption)))
500
+
501
+ return AC("image", attributes, *elements)
458
502
 
459
503
  def _transform_block(self, code: ET._Element) -> ET._Element:
460
504
  language = code.attrib.get("class")
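The image hunk above splits `_transform_image` into an external branch that emits `ri:url` and an attached branch that emits `ri:attachment`. The sketch below reproduces only that dispatch using the standard library's `ElementTree` instead of `lxml`. The namespace URIs, the scheme-based URL test, and the function name `image_element` are assumptions for illustration, and the attachment file name is not sanitized the way the package does with `attachment_name`.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from urllib.parse import urlparse

# Namespace URIs are assumptions for this sketch.
AC = "http://atlassian.com/content"
RI = "http://atlassian.com/resource/identifier"


def image_element(src: str, caption: Optional[str] = None) -> ET.Element:
    """Build an `ac:image` element for an external or an attached image."""
    image = ET.Element(f"{{{AC}}}image")
    if urlparse(src).scheme in ("http", "https"):
        # External image: reference it by URL.
        ET.SubElement(image, f"{{{RI}}}url", {f"{{{RI}}}value": src})
    else:
        # Local image: reference an attachment uploaded alongside the page.
        ET.SubElement(image, f"{{{RI}}}attachment", {f"{{{RI}}}filename": src})
    if caption is not None:
        # The caption is optional, mirroring the `alt` handling in the hunk.
        cap = ET.SubElement(image, f"{{{AC}}}caption")
        ET.SubElement(cap, "p").text = caption
    return image


# Usage: serialize an external image reference with a caption.
print(ET.tostring(image_element("https://example.com/a.png", "Example"), encoding="unicode"))
```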
@@ -757,6 +801,9 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
757
801
  tail: str = child.tail
758
802
  child.tail = tail.replace("\n", " ")
759
803
 
804
+ if not isinstance(child.tag, str):
805
+ return None
806
+
760
807
  if self.options.heading_anchors:
761
808
  # <h1>...</h1>
762
809
  # <h2>...</h2> ...
@@ -894,6 +941,20 @@ def extract_frontmatter(text: str) -> Tuple[Optional[str], str]:
894
941
  return extract_value(r"(?ms)\A---$(.+?)^---$", text)
895
942
 
896
943
 
944
+ def extract_frontmatter_title(text: str) -> Tuple[Optional[str], str]:
945
+ frontmatter, text = extract_frontmatter(text)
946
+
947
+ title: Optional[str] = None
948
+ if frontmatter is not None:
949
+ properties = yaml.safe_load(frontmatter)
950
+ if isinstance(properties, dict):
951
+ property_title = properties.get("title")
952
+ if isinstance(property_title, str):
953
+ title = property_title
954
+
955
+ return title, text
956
+
957
+
897
958
  def read_qualified_id(absolute_path: Path) -> Optional[ConfluenceQualifiedID]:
898
959
  "Reads the Confluence page ID and space key from a Markdown document."
899
960
 
@@ -931,8 +992,9 @@ class ConfluenceDocumentOptions:
931
992
 
932
993
  class ConfluenceDocument:
933
994
  id: ConfluenceQualifiedID
995
+ title: Optional[str]
934
996
  links: List[str]
935
- images: List[str]
997
+ images: List[Path]
936
998
 
937
999
  options: ConfluenceDocumentOptions
938
1000
  root: ET._Element
@@ -941,10 +1003,11 @@ class ConfluenceDocument:
941
1003
  self,
942
1004
  path: Path,
943
1005
  options: ConfluenceDocumentOptions,
1006
+ root_dir: Path,
944
1007
  page_metadata: Dict[Path, ConfluencePageMetadata],
945
1008
  ) -> None:
946
1009
  self.options = options
947
- path = path.absolute()
1010
+ path = path.resolve(True)
948
1011
 
949
1012
  with open(path, "r", encoding="utf-8") as f:
950
1013
  text = f.read()
@@ -968,7 +1031,7 @@ class ConfluenceDocument:
968
1031
  )
969
1032
 
970
1033
  # extract frontmatter
971
- frontmatter, text = extract_frontmatter(text)
1034
+ self.title, text = extract_frontmatter_title(text)
972
1035
 
973
1036
  # convert to HTML
974
1037
  html = markdown_to_html(text)
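`ConfluenceDocument` above now stores the front-matter title returned by `extract_frontmatter_title`. The sketch below shows the shape of that extraction using only the standard library. It reuses the front-matter pattern visible in the diff, but a second regular expression stands in for `yaml.safe_load`, so only plain `title: value` lines are handled. The name `frontmatter_title` is hypothetical.

```python
import re
from typing import Optional, Tuple


def frontmatter_title(text: str) -> Tuple[Optional[str], str]:
    """Split off front matter and return (title, remaining text)."""
    # Same front-matter pattern as in the diff: a block fenced by `---` lines.
    match = re.match(r"(?ms)\A---$(.+?)^---$", text)
    if match is None:
        return None, text
    body = text[match.end():]
    # Simplification: read a single-line `title:` key instead of parsing YAML.
    title_match = re.search(r"(?m)^title:[ \t]*(.+?)[ \t]*$", match.group(1))
    title = title_match.group(1).strip("'\"") if title_match else None
    return title, body


# Usage: the title is taken from the front matter; the body is returned separately.
print(frontmatter_title("---\ntitle: My page\n---\n# Heading\n"))
```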
@@ -998,6 +1061,7 @@ class ConfluenceDocument:
998
1061
  webui_links=self.options.webui_links,
999
1062
  ),
1000
1063
  path,
1064
+ root_dir,
1001
1065
  page_metadata,
1002
1066
  )
1003
1067
  converter.visit(self.root)
@@ -1009,7 +1073,7 @@ class ConfluenceDocument:
1009
1073
  return elements_to_string(self.root)
1010
1074
 
1011
1075
 
1012
- def attachment_name(name: str) -> str:
1076
+ def attachment_name(name: Union[Path, str]) -> str:
1013
1077
  """
1014
1078
  Safe name for use with attachment uploads.
1015
1079
 
@@ -1018,7 +1082,7 @@ def attachment_name(name: str) -> str:
1018
1082
  * Special characters: hyphen (-), underscore (_), period (.)
1019
1083
  """
1020
1084
 
1021
- return re.sub(r"[^\-0-9A-Za-z_.]", "_", name)
1085
+ return re.sub(r"[^\-0-9A-Za-z_.]", "_", str(name))
1022
1086
 
1023
1087
 
1024
1088
  def sanitize_confluence(html: str) -> str:
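The `attachment_name` hunks above widen the parameter to accept a path as well as a string. The function body is short enough to try in isolation; the version below copies the regular expression from the diff and uses `PurePosixPath` in place of `Path` so that the example output does not depend on the platform's path separator.

```python
import re
from pathlib import PurePosixPath
from typing import Union


def attachment_name(name: Union[PurePosixPath, str]) -> str:
    """Replace every character outside [-0-9A-Za-z_.] with an underscore."""
    return re.sub(r"[^\-0-9A-Za-z_.]", "_", str(name))


# Usage: directory separators and spaces are both replaced.
print(attachment_name(PurePosixPath("figures/my diagram.png")))
```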
md2conf/mermaid.py CHANGED
@@ -56,6 +56,10 @@ def render(source: str, output_format: Literal["png", "svg"] = "png") -> bytes:
56
56
  filename,
57
57
  "--outputFormat",
58
58
  output_format,
59
+ "--backgroundColor",
60
+ "transparent",
61
+ "--scale",
62
+ "2",
59
63
  ]
60
64
  root = os.path.dirname(__file__)
61
65
  if is_docker():
md2conf/processor.py CHANGED
@@ -10,7 +10,7 @@ import hashlib
10
10
  import logging
11
11
  import os
12
12
  from pathlib import Path
13
- from typing import Dict, List
13
+ from typing import Dict, List, Optional
14
14
 
15
15
  from .converter import (
16
16
  ConfluenceDocument,
@@ -42,15 +42,22 @@ class Processor:
42
42
  if path.is_dir():
43
43
  self.process_directory(path)
44
44
  elif path.is_file():
45
- self.process_page(path, {})
45
+ self.process_page(path)
46
46
  else:
47
47
  raise ValueError(f"expected: valid file or directory path; got: {path}")
48
48
 
49
- def process_directory(self, local_dir: Path) -> None:
49
+ def process_directory(
50
+ self, local_dir: Path, root_dir: Optional[Path] = None
51
+ ) -> None:
50
52
  "Recursively scans a directory hierarchy for Markdown files."
51
53
 
52
- LOGGER.info("Synchronizing directory: %s", local_dir)
53
54
  local_dir = local_dir.resolve(True)
55
+ if root_dir is None:
56
+ root_dir = local_dir
57
+ else:
58
+ root_dir = root_dir.resolve(True)
59
+
60
+ LOGGER.info("Synchronizing directory: %s", local_dir)
54
61
 
55
62
  # Step 1: build index of all page metadata
56
63
  page_metadata: Dict[Path, ConfluencePageMetadata] = {}
@@ -59,15 +66,28 @@ class Processor:
59
66
 
60
67
  # Step 2: convert each page
61
68
  for page_path in page_metadata.keys():
62
- self.process_page(page_path, page_metadata)
69
+ self._process_page(page_path, root_dir, page_metadata)
63
70
 
64
- def process_page(
65
- self, path: Path, page_metadata: Dict[Path, ConfluencePageMetadata]
66
- ) -> None:
71
+ def process_page(self, path: Path, root_dir: Optional[Path] = None) -> None:
67
72
  "Processes a single Markdown file."
68
73
 
69
74
  path = path.resolve(True)
70
- document = ConfluenceDocument(path, self.options, page_metadata)
75
+ if root_dir is None:
76
+ root_dir = path.parent
77
+ else:
78
+ root_dir = root_dir.resolve(True)
79
+
80
+ self._process_page(path, root_dir, {})
81
+
82
+ def _process_page(
83
+ self,
84
+ path: Path,
85
+ root_dir: Path,
86
+ page_metadata: Dict[Path, ConfluencePageMetadata],
87
+ ) -> None:
88
+ "Processes a single Markdown file."
89
+
90
+ document = ConfluenceDocument(path, self.options, root_dir, page_metadata)
71
91
  content = document.xhtml()
72
92
  with open(path.with_suffix(".csf"), "w", encoding="utf-8") as f:
73
93
  f.write(content)
@@ -1,21 +0,0 @@
1
- md2conf/__init__.py,sha256=0eak9lvskuCqGJnGeno6SHoCiBFAX5IQLHVBx1LV0w8,402
2
- md2conf/__main__.py,sha256=6iOI28W_d71tlnCMFpZwvkBmBt5-HazlZsz69gS4Oak,6894
3
- md2conf/api.py,sha256=EZSHbuH5O9fPyW7iLAX0Fqw8njXmvd6sEbgseP-eUUc,16498
4
- md2conf/application.py,sha256=hmfLiofGulN8zUw2uXuueohCkDh978sqLkoUot928qM,8796
5
- md2conf/converter.py,sha256=8X8tNELqwAaZYSVvczJl_ZpJL9tu2ImCBXaQBQvGgeM,34413
6
- md2conf/emoji.py,sha256=w9oiOIxzObAE7HTo3f6aETT1_D3t3yZwr88ynU4ENm0,1924
7
- md2conf/entities.dtd,sha256=M6NzqL5N7dPs_eUA_6sDsiSLzDaAacrx9LdttiufvYU,30215
8
- md2conf/matcher.py,sha256=mYMltZOLypK4O-SJugLgicOwUMem67hiNLg_kPFoJkU,3583
9
- md2conf/mermaid.py,sha256=Tsibd1aOn4hRYv6emQg0hrZMPTkflIeXHVbZ7nQ5lSc,2108
10
- md2conf/processor.py,sha256=tUt5D4_D3uhofg2Bn23owBJmkVHj4tSll0zI95J6cdk,4243
11
- md2conf/properties.py,sha256=iVIc0h0XtS3Y2LCywX1C9cvmVQ0WljOMt8pl2MDMVCI,1990
12
- md2conf/puppeteer-config.json,sha256=-dMTAN_7kNTGbDlfXzApl0KJpAWna9YKZdwMKbpOb60,159
13
- md2conf/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
14
- md2conf/util.py,sha256=ftf60MiW7S7rW45ipWX6efP_Sv2F2qpyIDHrGA0cBiw,743
15
- markdown_to_confluence-0.2.5.dist-info/LICENSE,sha256=Pv43so2bPfmKhmsrmXFyAvS7M30-1i1tzjz6-dfhyOo,1077
16
- markdown_to_confluence-0.2.5.dist-info/METADATA,sha256=E7j_aFJ7rT4SOpoUIa40G2QJL_7PjuXBA5JvdANRIdc,12764
17
- markdown_to_confluence-0.2.5.dist-info/WHEEL,sha256=P9jw-gEje8ByB7_hXoICnHtVCrEwMQh-630tKvQWehc,91
18
- markdown_to_confluence-0.2.5.dist-info/entry_points.txt,sha256=F1zxa1wtEObtbHS-qp46330WVFLHdMnV2wQ-ZorRmX0,50
19
- markdown_to_confluence-0.2.5.dist-info/top_level.txt,sha256=_FJfl_kHrHNidyjUOuS01ngu_jDsfc-ZjSocNRJnTzU,8
20
- markdown_to_confluence-0.2.5.dist-info/zip-safe,sha256=AbpHGcgLb-kRsJGnwFEktk7uzpZOCcBY74-YBdrKVGs,1
21
- markdown_to_confluence-0.2.5.dist-info/RECORD,,