markdown-to-confluence 0.2.0__py3-none-any.whl → 0.2.2__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/METADATA +35 -12
- markdown_to_confluence-0.2.2.dist-info/RECORD +19 -0
- md2conf/__init__.py +1 -1
- md2conf/__main__.py +44 -2
- md2conf/api.py +32 -0
- md2conf/application.py +108 -39
- md2conf/converter.py +131 -34
- md2conf/matcher.py +83 -0
- md2conf/mermaid.py +41 -19
- md2conf/processor.py +45 -20
- md2conf/properties.py +4 -3
- md2conf/puppeteer-config.json +8 -0
- markdown_to_confluence-0.2.0.dist-info/RECORD +0 -17
- {markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/LICENSE +0 -0
- {markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/WHEEL +0 -0
- {markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/entry_points.txt +0 -0
- {markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/top_level.txt +0 -0
- {markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/zip-safe +0 -0
{markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/METADATA CHANGED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: markdown-to-confluence
-Version: 0.2.0
+Version: 0.2.2
 Summary: Publish Markdown files to Confluence wiki
 Home-page: https://github.com/hunyadi/md2conf
 Author: Levente Hunyadi
@@ -51,7 +51,7 @@ This Python package
 * Image references (uploaded as Confluence page attachments)
 * Tables
 * [Table of contents](https://docs.gitlab.com/ee/user/markdown.html#table-of-contents)
-* [Admonitions](https://python-markdown.github.io/extensions/admonition/) and [
+* [Admonitions](https://python-markdown.github.io/extensions/admonition/) and alert boxes in [GitHub](https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#alerts) and [GitLab](https://docs.gitlab.com/ee/development/documentation/styleguide/#alert-boxes)
 * [Collapsed sections](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/organizing-information-with-collapsed-sections)
 * [Mermaid diagrams](https://mermaid.live/) in code blocks (converted to images)
 
@@ -144,21 +144,28 @@ Provide generated-by prompt text in the Markdown file with a tag:
 
 Alternatively, use the `--generated-by GENERATED_BY` option. The tag takes precedence.
 
+### Ignoring files
+
+Skip files in a directory with rules defined in `.mdignore`. Each rule should occupy a single line. Rules follow the syntax of [fnmatch](https://docs.python.org/3/library/fnmatch.html#fnmatch.fnmatch). Specifically, `?` matches any single character, and `*` matches zero or more characters. For example, use `up-*.md` to exclude Markdown files that start with `up-`. Lines that start with `#` are treated as comments.
+
+Files that don't have the extension `*.md` are skipped automatically. Hidden directories (whose name starts with `.`) are not recursed into.
+
 ### Running the tool
 
 You execute the command-line tool `md2conf` to synchronize the Markdown file with Confluence:
 
 ```sh
-$ python3 -m md2conf sample/
+$ python3 -m md2conf sample/index.md
 ```
 
 Use the `--help` switch to get a full list of supported command-line options:
 
 ```console
 $ python3 -m md2conf --help
-usage: md2conf [-h] [-d DOMAIN] [-p PATH] [-u USERNAME] [-a APIKEY] [-s SPACE]
-               [
-               [--render-mermaid-format {png,svg}] [--heading-anchors]
+usage: md2conf [-h] [--version] [-d DOMAIN] [-p PATH] [-u USERNAME] [-a APIKEY] [-s SPACE]
+               [-l {debug,info,warning,error,critical}] [-r ROOT_PAGE] [--generated-by GENERATED_BY] [--no-generated-by]
+               [--render-mermaid] [--no-render-mermaid] [--render-mermaid-format {png,svg}] [--heading-anchors]
+               [--ignore-invalid-url] [--local] [--headers [KEY=VALUE ...]] [--webui-links]
                mdpath
 
 positional arguments:
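The `.mdignore` semantics described above map directly onto Python's `fnmatch`; a minimal sketch of evaluating such rules (the rule strings below are illustrative, not shipped with the package):

```python
from fnmatch import fnmatch

# each non-comment line of .mdignore is one pattern; a file is
# skipped if any pattern matches its name
rules = ["up-*.md", "draft-?.md"]

def is_ignored(name: str, rules: list) -> bool:
    """Return True if the file name matches any ignore rule."""
    return any(fnmatch(name, rule) for rule in rules)

print(is_ignored("up-linux.md", rules))  # True: matches "up-*.md"
print(is_ignored("index.md", rules))     # False: matches no rule
```

Because `fnmatch` is shell-style matching rather than regex, `*` and `?` behave exactly as the README states.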
@@ -166,6 +173,7 @@ positional arguments:
 
 options:
   -h, --help            show this help message and exit
+  --version             show program's version number and exit
   -d DOMAIN, --domain DOMAIN
                         Confluence organization domain.
   -p PATH, --path PATH  Base path for Confluence (default: '/wiki/').
@@ -188,20 +196,35 @@ options:
   --heading-anchors     Place an anchor at each section heading with GitHub-style same-page identifiers.
   --ignore-invalid-url  Emit a warning but otherwise ignore relative URLs that point to ill-specified locations.
   --local               Write XHTML-based Confluence Storage Format files locally without invoking Confluence API.
+  --headers [KEY=VALUE ...]
+                        Apply custom headers to all Confluence API requests.
+  --webui-links         Enable Confluence Web UI links.
 ```
 
-### Using the
+### Using the Docker container
+
+You can run the Docker container via `docker run` or via `Dockerfile`. Either can accept the environment variables or arguments similar to the Python options. The final argument `./` corresponds to `mdpath` in the command-line utility.
+
+With `docker run`, you can pass Confluence domain, user, API and space key directly to `docker run`:
 
-
+```sh
+docker run --rm --name md2conf -v $(pwd):/data leventehunyadi/md2conf -d instructure.atlassian.net -u levente.hunyadi@instructure.com -a 0123456789abcdef -s DAP ./
+```
+
+Alternatively, you can use a separate file `.env` to pass these parameters as environment variables:
 
 ```sh
-docker run --rm --
+docker run --rm --env-file .env --name md2conf -v $(pwd):/data leventehunyadi/md2conf ./
 ```
 
-
+In each case, `-v $(pwd):/data` maps the current directory to Docker container's `WORKDIR` such *md2conf* can scan files and directories in the local file system.
+
+Note that the entry point for the Docker container's base image is `ENTRYPOINT ["python3", "-m", "md2conf"]`.
+
+With the `Dockerfile` approach, you can extend the base image:
 
 ```Dockerfile
-FROM
+FROM leventehunyadi/md2conf:latest
 
 ENV CONFLUENCE_DOMAIN='instructure.atlassian.net'
 ENV CONFLUENCE_PATH='/wiki/'
@@ -215,7 +238,7 @@ CMD ["./"]
 Alternatively,
 
 ```Dockerfile
-FROM
+FROM leventehunyadi/md2conf:latest
 
 CMD ["-d", "instructure.atlassian.net", "-u", "levente.hunyadi@instructure.com", "-a", "0123456789abcdef", "-s", "DAP", "./"]
 ```
markdown_to_confluence-0.2.2.dist-info/RECORD ADDED

@@ -0,0 +1,19 @@
+md2conf/__init__.py,sha256=1DSbQlz0zNxil7Lbsh7VjmGvJdtKhOjtd67r2elUSjE,402
+md2conf/__main__.py,sha256=_qUspNQmQdhpH4Myh9vXDcauPyUx_FyEzNtaW_c8ytY,6601
+md2conf/api.py,sha256=UZ7mkeE1d_f_bACj8LC-t6d4EqXFQCufbeVVdi4FsTs,16947
+md2conf/application.py,sha256=mQusGnzu-ssFn9-aC_rGsqsWpDtw8qFJDnPW7cRkXC0,7762
+md2conf/converter.py,sha256=_zFk-H4NZuY2Y58enVGgFNubOJv9EI2u8tS7RQRiD3A,30391
+md2conf/entities.dtd,sha256=M6NzqL5N7dPs_eUA_6sDsiSLzDaAacrx9LdttiufvYU,30215
+md2conf/matcher.py,sha256=SAmXQzQNan05jVcmZ8PEONynj-SEcVrkCHyXvBxEi2Q,2690
+md2conf/mermaid.py,sha256=a7PVcd7kcFBOMw7Z2mOfvWC1JIVR4Q1EkkanLk1SLx0,1981
+md2conf/processor.py,sha256=V_kxpk4da8vzSLx4Zixhf1sEWdVIxKZeJocJvWhOK6Y,4020
+md2conf/properties.py,sha256=2l1tW8HmnrEsXN4-Dtby2tYJQTG1MirRpM3H6ykjQ4c,1858
+md2conf/puppeteer-config.json,sha256=-dMTAN_7kNTGbDlfXzApl0KJpAWna9YKZdwMKbpOb60,159
+md2conf/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+markdown_to_confluence-0.2.2.dist-info/LICENSE,sha256=Pv43so2bPfmKhmsrmXFyAvS7M30-1i1tzjz6-dfhyOo,1077
+markdown_to_confluence-0.2.2.dist-info/METADATA,sha256=a_CQkC2-De5lcIAudWShsx0m1DIAtA6utrsJKcAi20I,11571
+markdown_to_confluence-0.2.2.dist-info/WHEEL,sha256=GV9aMThwP_4oNCtvEC2ec3qUYutgWeAzklro_0m4WJQ,91
+markdown_to_confluence-0.2.2.dist-info/entry_points.txt,sha256=F1zxa1wtEObtbHS-qp46330WVFLHdMnV2wQ-ZorRmX0,50
+markdown_to_confluence-0.2.2.dist-info/top_level.txt,sha256=_FJfl_kHrHNidyjUOuS01ngu_jDsfc-ZjSocNRJnTzU,8
+markdown_to_confluence-0.2.2.dist-info/zip-safe,sha256=AbpHGcgLb-kRsJGnwFEktk7uzpZOCcBY74-YBdrKVGs,1
+markdown_to_confluence-0.2.2.dist-info/RECORD,,
md2conf/__init__.py
CHANGED
@@ -5,7 +5,7 @@ Parses Markdown files, converts Markdown content into the Confluence Storage For
 Confluence API endpoints to upload images and content.
 """
 
-__version__ = "0.2.0"
+__version__ = "0.2.2"
 __author__ = "Levente Hunyadi"
 __copyright__ = "Copyright 2022-2024, Levente Hunyadi"
 __license__ = "MIT"
md2conf/__main__.py
CHANGED
@@ -2,11 +2,13 @@ import argparse
 import logging
 import os.path
 import sys
+import typing
 from pathlib import Path
-from typing import Optional
+from typing import Any, Literal, Optional, Sequence, Union
 
 import requests
 
+from . import __version__
 from .api import ConfluenceAPI
 from .application import Application
 from .converter import ConfluenceDocumentOptions
@@ -24,12 +26,37 @@ class Arguments(argparse.Namespace):
     loglevel: str
     ignore_invalid_url: bool
     heading_anchors: bool
+    root_page: Optional[str]
     generated_by: Optional[str]
+    render_mermaid: bool
+    diagram_output_format: Literal["png", "svg"]
+    webui_links: bool
+
+
+class KwargsAppendAction(argparse.Action):
+    """Append key-value pairs to a dictionary"""
+
+    def __call__(
+        self,
+        parser: argparse.ArgumentParser,
+        namespace: argparse.Namespace,
+        values: Union[None, str, Sequence[Any]],
+        option_string: Optional[str] = None,
+    ) -> None:
+        try:
+            d = dict(map(lambda x: x.split("="), typing.cast(Sequence[str], values)))
+        except ValueError:
+            raise argparse.ArgumentError(
+                self,
+                f'Could not parse argument "{values}". It should follow the format: k1=v1 k2=v2 ...',
+            )
+        setattr(namespace, self.dest, d)
 
 
 def main() -> None:
     parser = argparse.ArgumentParser()
     parser.prog = os.path.basename(os.path.dirname(__file__))
+    parser.add_argument("--version", action="version", version=__version__)
     parser.add_argument(
         "mdpath", help="Path to Markdown file or directory to convert and publish."
     )
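The `KwargsAppendAction` added above backs the new `--headers [KEY=VALUE ...]` option; a simplified, self-contained variant (the class and option names here are illustrative) behaves like this:

```python
import argparse
from typing import Any, Optional, Sequence, Union

class KeyValueAction(argparse.Action):
    """Collect KEY=VALUE pairs into a dictionary (simplified sketch of the action above)."""

    def __call__(
        self,
        parser: argparse.ArgumentParser,
        namespace: argparse.Namespace,
        values: Union[None, str, Sequence[Any]],
        option_string: Optional[str] = None,
    ) -> None:
        try:
            # split each item on the first "=" so values may themselves contain "="
            pairs = dict(item.split("=", 1) for item in values or [])
        except ValueError:
            raise argparse.ArgumentError(self, "expected format: k1=v1 k2=v2 ...")
        setattr(namespace, self.dest, pairs)

parser = argparse.ArgumentParser()
parser.add_argument("--headers", nargs="*", action=KeyValueAction, metavar="KEY=VALUE")
args = parser.parse_args(["--headers", "X-Trace=1", "X-Team=docs"])
print(args.headers)  # {'X-Trace': '1', 'X-Team': 'docs'}
```

An item without an `=` yields a one-element list, which makes `dict()` raise `ValueError` and surfaces as an argparse error, mirroring the behavior of the shipped action.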
@@ -119,6 +146,20 @@ def main() -> None:
         default=False,
         help="Write XHTML-based Confluence Storage Format files locally without invoking Confluence API.",
     )
+    parser.add_argument(
+        "--headers",
+        nargs="*",
+        required=False,
+        action=KwargsAppendAction,
+        metavar="KEY=VALUE",
+        help="Apply custom headers to all Confluence API requests.",
+    )
+    parser.add_argument(
+        "--webui-links",
+        action="store_true",
+        default=False,
+        help="Enable Confluence Web UI links.",
+    )
 
     args = Arguments()
     parser.parse_args(namespace=args)
@@ -139,9 +180,10 @@ def main() -> None:
         root_page_id=args.root_page,
         render_mermaid=args.render_mermaid,
         diagram_output_format=args.diagram_output_format,
+        webui_links=args.webui_links,
     )
     properties = ConfluenceProperties(
-        args.domain, args.path, args.username, args.apikey, args.space
+        args.domain, args.path, args.username, args.apikey, args.space, args.headers
     )
     if args.local:
         Processor(options, properties).process(args.mdpath)
md2conf/api.py
CHANGED
@@ -98,6 +98,10 @@ class ConfluenceAPI:
         session.headers.update(
             {"Authorization": f"Bearer {self.properties.api_key}"}
         )
+
+        if self.properties.headers:
+            session.headers.update(self.properties.headers)
+
         self.session = ConfluenceSession(
             session,
             self.properties.domain,
@@ -352,6 +356,34 @@ class ConfluenceSession:
             content=typing.cast(str, storage["value"]),
         )
 
+    def get_page_ancestors(
+        self, page_id: str, *, space_key: Optional[str] = None
+    ) -> Dict[str, str]:
+        """
+        Retrieve Confluence wiki page ancestors.
+
+        :param page_id: The Confluence page ID.
+        :param space_key: The Confluence space key (unless the default space is to be used).
+        :returns: Dictionary of ancestor page ID to title, with topmost ancestor first.
+        """
+
+        path = f"/content/{page_id}"
+        query = {
+            "spaceKey": space_key or self.space_key,
+            "expand": "ancestors",
+        }
+        data = typing.cast(Dict[str, JsonType], self._invoke(path, query))
+        ancestors = typing.cast(List[JsonType], data["ancestors"])
+
+        # from the JSON array of ancestors, extract the "id" and "title"
+        results: Dict[str, str] = {}
+        for node in ancestors:
+            ancestor = typing.cast(Dict[str, JsonType], node)
+            id = typing.cast(str, ancestor["id"])
+            title = typing.cast(str, ancestor["title"])
+            results[id] = title
+        return results
+
     def get_page_version(
         self,
         page_id: str,
md2conf/application.py
CHANGED
@@ -1,16 +1,19 @@
 import logging
 import os.path
 from pathlib import Path
-from typing import Dict, Optional
+from typing import Dict, List, Optional
 
-from .api import ConfluenceSession
+from .api import ConfluencePage, ConfluenceSession
 from .converter import (
     ConfluenceDocument,
     ConfluenceDocumentOptions,
     ConfluencePageMetadata,
+    ConfluenceQualifiedID,
     attachment_name,
     extract_qualified_id,
+    read_qualified_id,
 )
+from .matcher import Matcher, MatcherOptions
 
 LOGGER = logging.getLogger(__name__)
 
@@ -45,28 +48,19 @@ class Application:
     def synchronize_directory(self, local_dir: Path) -> None:
         "Synchronizes a directory of Markdown pages with Confluence."
 
-        page_metadata: Dict[Path, ConfluencePageMetadata] = {}
         LOGGER.info(f"Synchronizing directory: {local_dir}")
 
         # Step 1: build index of all page metadata
-
-
-
-
-
-
-
-
-            if docfile.suffix.lower() != ".md":
-                continue
-            metadata = self._get_or_create_page(docfile)
-
-            LOGGER.debug(f"indexed {docfile} with metadata: {metadata}")
-            page_metadata[docfile] = metadata
-
-        LOGGER.info(f"indexed {len(page_metadata)} pages")
+        page_metadata: Dict[Path, ConfluencePageMetadata] = {}
+        root_id = (
+            ConfluenceQualifiedID(self.options.root_page_id, self.api.space_key)
+            if self.options.root_page_id
+            else None
+        )
+        self._index_directory(local_dir, root_id, page_metadata)
+        LOGGER.info(f"indexed {len(page_metadata)} page(s)")
 
-        # Step 2:
+        # Step 2: convert each page
         for page_path in page_metadata.keys():
             self._synchronize_page(page_path, page_metadata)
 
@@ -86,8 +80,53 @@ class Application:
         else:
             self._update_document(document, base_path)
 
+    def _index_directory(
+        self,
+        local_dir: Path,
+        root_id: Optional[ConfluenceQualifiedID],
+        page_metadata: Dict[Path, ConfluencePageMetadata],
+    ) -> None:
+        "Indexes Markdown files in a directory recursively."
+
+        LOGGER.info(f"Indexing directory: {local_dir}")
+
+        matcher = Matcher(MatcherOptions(source=".mdignore", extension="md"), local_dir)
+
+        files: List[Path] = []
+        directories: List[Path] = []
+        for entry in os.scandir(local_dir):
+            if matcher.is_excluded(entry.name):
+                continue
+
+            if entry.is_file():
+                files.append((Path(local_dir) / entry.name).absolute())
+            elif entry.is_dir():
+                directories.append((Path(local_dir) / entry.name).absolute())
+
+        # make page act as parent node in Confluence
+        parent_id: Optional[ConfluenceQualifiedID] = None
+        if "index.md" in files:
+            parent_id = read_qualified_id(Path(local_dir) / "index.md")
+        elif "README.md" in files:
+            parent_id = read_qualified_id(Path(local_dir) / "README.md")
+
+        if parent_id is None:
+            parent_id = root_id
+
+        for doc in files:
+            metadata = self._get_or_create_page(doc, parent_id)
+            LOGGER.debug(f"indexed {doc} with metadata: {metadata}")
+            page_metadata[doc] = metadata
+
+        for directory in directories:
+            self._index_directory(Path(local_dir) / directory, parent_id, page_metadata)
+
     def _get_or_create_page(
-        self,
+        self,
+        absolute_path: Path,
+        parent_id: Optional[ConfluenceQualifiedID],
+        *,
+        title: Optional[str] = None,
     ) -> ConfluencePageMetadata:
         """
         Creates a new Confluence page if no page is linked in the Markdown document.
@@ -103,23 +142,13 @@
                 qualified_id.page_id, space_key=qualified_id.space_key
             )
         else:
-            if
+            if parent_id is None:
                 raise ValueError(
-                    "expected:
+                    f"expected: parent page ID for Markdown file with no linked Confluence page: {absolute_path}"
                 )
 
-
-
-            title = absolute_path.stem
-
-            confluence_page = self.api.get_or_create_page(
-                title, self.options.root_page_id
-            )
-            self._update_markdown(
-                absolute_path,
-                document,
-                confluence_page.id,
-                confluence_page.space_key,
+            confluence_page = self._create_page(
+                absolute_path, document, title, parent_id
             )
 
         return ConfluencePageMetadata(
@@ -130,7 +159,32 @@
             title=confluence_page.title or "",
         )
 
+    def _create_page(
+        self,
+        absolute_path: Path,
+        document: str,
+        title: Optional[str],
+        parent_id: ConfluenceQualifiedID,
+    ) -> ConfluencePage:
+        "Creates a new Confluence page when Markdown file doesn't have an embedded page ID yet."
+
+        # use file name without extension if no title is supplied
+        if title is None:
+            title = absolute_path.stem
+
+        confluence_page = self.api.get_or_create_page(
+            title, parent_id.page_id, space_key=parent_id.space_key
+        )
+        self._update_markdown(
+            absolute_path,
+            document,
+            confluence_page.id,
+            confluence_page.space_key,
+        )
+        return confluence_page
+
     def _update_document(self, document: ConfluenceDocument, base_path: Path) -> None:
+        "Saves a new version of a Confluence document."
 
         for image in document.images:
             self.api.upload_attachment(
@@ -158,8 +212,23 @@
         page_id: str,
         space_key: Optional[str],
     ) -> None:
+        "Writes the Confluence page ID and space key at the beginning of the Markdown file."
+
+        content: List[str] = []
+
+        # check if the file has frontmatter
+        index = 0
+        if document.startswith("---\n"):
+            index = document.find("\n---\n", 4) + 4
+
+        # insert the Confluence keys after the frontmatter
+        content.append(document[:index])
+
+        content.append(f"<!-- confluence-page-id: {page_id} -->")
+        if space_key:
+            content.append(f"<!-- confluence-space-key: {space_key} -->")
+
+        content.append(document[index:])
+
         with open(path, "w", encoding="utf-8") as file:
-            file.write(
-            if space_key:
-                file.write(f"<!-- confluence-space-key: {space_key} -->\n")
-            file.write(document)
+            file.write("\n".join(content))
md2conf/converter.py
CHANGED
@@ -4,11 +4,11 @@ import hashlib
 import importlib.resources as resources
 import logging
 import os.path
-import pathlib
 import re
 import sys
 import uuid
 from dataclasses import dataclass
+from pathlib import Path
 from typing import Dict, List, Literal, Optional, Tuple
 from urllib.parse import ParseResult, urlparse, urlunparse
 
@@ -36,6 +36,15 @@ class ParseError(RuntimeError):
     pass
 
 
+def starts_with_any(text: str, prefixes: List[str]) -> bool:
+    "True if text starts with any of the listed prefixes."
+
+    for prefix in prefixes:
+        if text.startswith(prefix):
+            return True
+    return False
+
+
 def is_absolute_url(url: str) -> bool:
     urlparts = urlparse(url)
     return bool(urlparts.scheme) or bool(urlparts.netloc)
@@ -61,7 +70,7 @@ def markdown_to_html(content: str) -> str:
 )
 
 
-def _elements_from_strings(dtd_path:
+def _elements_from_strings(dtd_path: Path, items: List[str]) -> ET._Element:
     """
     Creates a fragment of several XML nodes from their string representation wrapped in a root element.
 
@@ -240,30 +249,32 @@ class ConfluenceConverterOptions:
     conversion rules for the identifier.
     :param render_mermaid: Whether to pre-render Mermaid diagrams into PNG/SVG images.
     :param diagram_output_format: Target image format for diagrams.
+    :param web_links: When true, convert relative URLs to Confluence Web UI links.
     """
 
     ignore_invalid_url: bool = False
     heading_anchors: bool = False
     render_mermaid: bool = False
     diagram_output_format: Literal["png", "svg"] = "png"
+    webui_links: bool = False
 
 
 class ConfluenceStorageFormatConverter(NodeVisitor):
     "Transforms a plain HTML tree into the Confluence storage format."
 
     options: ConfluenceConverterOptions
-    path:
-    base_path:
+    path: Path
+    base_path: Path
     links: List[str]
     images: List[str]
     embedded_images: Dict[str, bytes]
-    page_metadata: Dict[
+    page_metadata: Dict[Path, ConfluencePageMetadata]
 
     def __init__(
         self,
         options: ConfluenceConverterOptions,
-        path:
-        page_metadata: Dict[
+        path: Path,
+        page_metadata: Dict[Path, ConfluencePageMetadata],
     ) -> None:
         super().__init__()
         self.options = options
@@ -347,10 +358,15 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
         )
         self.links.append(url)
 
+        if self.options.webui_links:
+            page_url = f"{link_metadata.base_path}pages/viewpage.action?pageId={link_metadata.page_id}"
+        else:
+            page_url = f"{link_metadata.base_path}spaces/{link_metadata.space_key}/pages/{link_metadata.page_id}/{link_metadata.title}"
+
         components = ParseResult(
             scheme="https",
             netloc=link_metadata.domain,
-            path=
+            path=page_url,
             params="",
             query="",
             fragment=relative_url.fragment,
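Depending on `webui_links`, the converter above emits one of two Confluence URL shapes. A sketch with made-up metadata values (the domain, space key, page ID, and title below are placeholders standing in for `ConfluencePageMetadata` fields):

```python
from urllib.parse import ParseResult, urlunparse

# placeholder values for illustration only
domain, base_path, space_key, page_id, title = (
    "example.atlassian.net", "/wiki/", "DOCS", "12345", "Getting-Started",
)

def page_link(webui_links: bool) -> str:
    # same two shapes as the converter: Web UI action URL vs. pretty space URL
    if webui_links:
        path = f"{base_path}pages/viewpage.action?pageId={page_id}"
    else:
        path = f"{base_path}spaces/{space_key}/pages/{page_id}/{title}"
    return urlunparse(ParseResult("https", domain, path, "", "", ""))

print(page_link(True))
print(page_link(False))
```

Note the `pageId` query string travels inside the `path` component here, mirroring how the diff builds its `ParseResult`.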
@@ -365,7 +381,7 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
 
         # prefer PNG over SVG; Confluence displays SVG in wrong size, and text labels are truncated
         if path and is_relative_url(path):
-            relative_path =
+            relative_path = Path(path)
             if (
                 relative_path.suffix == ".svg"
                 and (self.base_path / relative_path.with_suffix(".png")).exists()
@@ -541,43 +557,83 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
             *content,
         )
 
-    def
+    def _transform_github_alert(self, elem: ET._Element) -> ET._Element:
+        content = elem[0]
+        if content.text is None:
+            raise DocumentError("empty content")
+
+        class_name: Optional[str] = None
+        skip = 0
+
+        pattern = re.compile(r"^\[!([A-Z]+)\]\s*")
+        match = pattern.match(content.text)
+        if match:
+            skip = len(match.group(0))
+            alert = match.group(1)
+            if alert == "NOTE":
+                class_name = "note"
+            elif alert == "TIP":
+                class_name = "tip"
+            elif alert == "IMPORTANT":
+                class_name = "tip"
+            elif alert == "WARNING":
+                class_name = "warning"
+            elif alert == "CAUTION":
+                class_name = "warning"
+            else:
+                raise DocumentError(f"unsupported GitHub alert: {alert}")
+
+        return self._transform_alert(elem, class_name, skip)
+
+    def _transform_gitlab_alert(self, elem: ET._Element) -> ET._Element:
+        content = elem[0]
+        if content.text is None:
+            raise DocumentError("empty content")
+
+        class_name: Optional[str] = None
+        skip = 0
+
+        pattern = re.compile(r"^(FLAG|NOTE|WARNING|DISCLAIMER):\s*")
+        match = pattern.match(content.text)
+        if match:
+            skip = len(match.group(0))
+            alert = match.group(1)
+            if alert == "FLAG":
+                class_name = "note"
+            elif alert == "NOTE":
+                class_name = "note"
+            elif alert == "WARNING":
+                class_name = "warning"
+            elif alert == "DISCLAIMER":
+                class_name = "info"
+            else:
+                raise DocumentError(f"unsupported GitLab alert: {alert}")
+
+        return self._transform_alert(elem, class_name, skip)
+
+    def _transform_alert(
+        self, elem: ET._Element, class_name: Optional[str], skip: int
+    ) -> ET._Element:
         """
-        Creates an info, tip, note or warning panel from a GitHub alert.
+        Creates an info, tip, note or warning panel from a GitHub or GitLab alert.
 
         Transforms
-        [GitHub alert](https://docs.github.com/
+        [GitHub alert](https://docs.github.com/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#alerts)
+        or [GitLab alert](https://docs.gitlab.com/ee/development/documentation/styleguide/#alert-boxes)
         syntax into one of the Confluence structured macros *info*, *tip*, *note*, or *warning*.
         """
 
-        pattern = re.compile(r"^\[!([A-Z]+)\]\s*")
-
         content = elem[0]
         if content.text is None:
             raise DocumentError("empty content")
 
-
-        if match is None:
+        if class_name is None:
             raise DocumentError("not an alert")
-        alert = match.group(1)
-
-        if alert == "NOTE":
-            class_name = "note"
-        elif alert == "TIP":
-            class_name = "tip"
-        elif alert == "IMPORTANT":
-            class_name = "tip"
-        elif alert == "WARNING":
-            class_name = "warning"
-        elif alert == "CAUTION":
-            class_name = "warning"
-        else:
-            raise DocumentError(f"unsupported alert: {alert}")
 
         for e in elem:
             self.visit(e)
 
-        content.text =
+        content.text = content.text[skip:]
         return AC(
             "structured-macro",
             {
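The two dispatchers added above recognize alerts by different prefixes: GitHub uses `[!NOTE]`-style markers, GitLab uses `NOTE:`-style ones. A sketch of just the pattern matching (the `classify` helper is illustrative, not part of the package):

```python
import re

# GitHub alerts appear as "[!NOTE] ..." in the first paragraph of a blockquote
GITHUB = re.compile(r"^\[!([A-Z]+)\]\s*")
# GitLab alerts appear as "FLAG:", "NOTE:", "WARNING:" or "DISCLAIMER:"
GITLAB = re.compile(r"^(FLAG|NOTE|WARNING|DISCLAIMER):\s*")

def classify(text: str) -> str:
    """Label a blockquote paragraph by the alert syntax it uses, if any."""
    m = GITHUB.match(text)
    if m:
        return f"github:{m.group(1)}"
    m = GITLAB.match(text)
    if m:
        return f"gitlab:{m.group(1)}"
    return "plain"

print(classify("[!TIP] Use --local for a dry run."))  # github:TIP
print(classify("DISCLAIMER: Subject to change."))     # gitlab:DISCLAIMER
print(classify("Just a quote."))                      # plain
```

In the converter, `len(match.group(0))` then tells `_transform_alert` how many leading characters to strip before the remaining text becomes the panel body.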
@@ -671,7 +727,22 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
             and child[0].text is not None
             and child[0].text.startswith("[!")
         ):
-            return self.
+            return self._transform_github_alert(child)
+
+        # Alerts in GitLab
+        # <blockquote>
+        # <p>DISCLAIMER: ...</p>
+        # </blockquote>
+        elif (
+            child.tag == "blockquote"
+            and len(child) > 0
+            and child[0].tag == "p"
+            and child[0].text is not None
+            and starts_with_any(
+                child[0].text, ["FLAG:", "NOTE:", "WARNING:", "DISCLAIMER:"]
+            )
+        ):
+            return self._transform_gitlab_alert(child)
 
         # <details markdown="1">
         # <summary>...</summary>
@@ -726,8 +797,14 @@ class ConfluenceQualifiedID:
     page_id: str
     space_key: Optional[str] = None
 
+    def __init__(self, page_id: str, space_key: Optional[str] = None):
+        self.page_id = page_id
+        self.space_key = space_key
+
 
 def extract_qualified_id(string: str) -> Tuple[Optional[ConfluenceQualifiedID], str]:
+    "Extracts the Confluence page ID and space key from a Markdown document."
+
     page_id, string = extract_value(r"<!--\s+confluence-page-id:\s*(\d+)\s+-->", string)
 
     if page_id is None:
@@ -741,6 +818,16 @@ def extract_qualified_id(string: str) -> Tuple[Optional[ConfluenceQualifiedID],
|
|
|
741
818
|
return ConfluenceQualifiedID(page_id, space_key), string
|
|
742
819
|
|
|
743
820
|
|
|
821
|
+
def read_qualified_id(absolute_path: Path) -> Optional[ConfluenceQualifiedID]:
|
|
822
|
+
"Reads the Confluence page ID and space key from a Markdown document."
|
|
823
|
+
|
|
824
|
+
with open(absolute_path, "r", encoding="utf-8") as f:
|
|
825
|
+
document = f.read()
|
|
826
|
+
|
|
827
|
+
qualified_id, _ = extract_qualified_id(document)
|
|
828
|
+
return qualified_id
|
|
829
|
+
|
|
830
|
+
|
|
744
831
|
@dataclass
|
|
745
832
|
class ConfluenceDocumentOptions:
|
|
746
833
|
"""
|
|
@@ -754,6 +841,7 @@ class ConfluenceDocumentOptions:
|
|
|
754
841
|
:param show_generated: Whether to display a prompt "This page has been generated with a tool."
|
|
755
842
|
:param render_mermaid: Whether to pre-render Mermaid diagrams into PNG/SVG images.
|
|
756
843
|
:param diagram_output_format: Target image format for diagrams.
|
|
844
|
+
:param webui_links: When true, convert relative URLs to Confluence Web UI links.
|
|
757
845
|
"""
|
|
758
846
|
|
|
759
847
|
ignore_invalid_url: bool = False
|
|
@@ -762,6 +850,7 @@ class ConfluenceDocumentOptions:
|
|
|
762
850
|
root_page_id: Optional[str] = None
|
|
763
851
|
render_mermaid: bool = False
|
|
764
852
|
diagram_output_format: Literal["png", "svg"] = "png"
|
|
853
|
+
webui_links: bool = False
|
|
765
854
|
|
|
766
855
|
|
|
767
856
|
class ConfluenceDocument:
|
|
@@ -774,9 +863,9 @@ class ConfluenceDocument:
|
|
|
774
863
|
|
|
775
864
|
def __init__(
|
|
776
865
|
self,
|
|
777
|
-
path:
|
|
866
|
+
path: Path,
|
|
778
867
|
options: ConfluenceDocumentOptions,
|
|
779
|
-
page_metadata: Dict[
|
|
868
|
+
page_metadata: Dict[Path, ConfluencePageMetadata],
|
|
780
869
|
) -> None:
|
|
781
870
|
self.options = options
|
|
782
871
|
path = path.absolute()
|
|
@@ -786,6 +875,13 @@ class ConfluenceDocument:
|
|
|
786
875
|
|
|
787
876
|
# extract Confluence page ID
|
|
788
877
|
qualified_id, text = extract_qualified_id(text)
|
|
878
|
+
if qualified_id is None:
|
|
879
|
+
# look up Confluence page ID in metadata
|
|
880
|
+
metadata = page_metadata.get(path)
|
|
881
|
+
if metadata is not None:
|
|
882
|
+
qualified_id = ConfluenceQualifiedID(
|
|
883
|
+
metadata.page_id, metadata.space_key
|
|
884
|
+
)
|
|
789
885
|
if qualified_id is None:
|
|
790
886
|
raise ValueError("missing Confluence page ID")
|
|
791
887
|
self.id = qualified_id
|
|
@@ -823,6 +919,7 @@ class ConfluenceDocument:
|
|
|
823
919
|
heading_anchors=self.options.heading_anchors,
|
|
824
920
|
render_mermaid=self.options.render_mermaid,
|
|
825
921
|
diagram_output_format=self.options.diagram_output_format,
|
|
922
|
+
webui_links=self.options.webui_links,
|
|
826
923
|
),
|
|
827
924
|
path,
|
|
828
925
|
page_metadata,
|
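The page-ID marker that `extract_qualified_id` searches for can be illustrated standalone. The `extract_value` helper below is a hypothetical stand-in written only to demonstrate the regex from the diff (the real helper lives elsewhere in md2conf and may differ):

```python
import re
from typing import Optional, Tuple


def extract_value(pattern: str, text: str) -> Tuple[Optional[str], str]:
    # Hypothetical stand-in: return capture group 1 and the text with
    # the matched marker comment removed.
    match = re.search(pattern, text)
    if match is None:
        return None, text
    return match.group(1), text[: match.start()] + text[match.end() :]


document = "<!-- confluence-page-id: 123456 -->\n# Title\n"
page_id, rest = extract_value(
    r"<!--\s+confluence-page-id:\s*(\d+)\s+-->", document
)
print(page_id)  # → 123456
```

With the 0.2.2 changes, a document lacking this marker can still resolve its ID via the page-metadata index built during a directory scan.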
md2conf/matcher.py
ADDED

```diff
@@ -0,0 +1,83 @@
+import os.path
+from dataclasses import dataclass
+from fnmatch import fnmatch
+from pathlib import Path
+from typing import Iterable, List, Optional
+
+
+@dataclass
+class MatcherOptions:
+    """
+    Options for checking against a list of exclude/include patterns.
+
+    :param source: File name to read exclusion rules from.
+    :param extension: Extension to narrow down search to.
+    """
+
+    source: str
+    extension: Optional[str] = None
+
+    def __post_init__(self) -> None:
+        if self.extension is not None and not self.extension.startswith("."):
+            self.extension = f".{self.extension}"
+
+
+class Matcher:
+    "Compares file and directory names against a list of exclude/include patterns."
+
+    options: MatcherOptions
+    rules: List[str]
+
+    def __init__(self, options: MatcherOptions, directory: Path) -> None:
+        self.options = options
+        if os.path.exists(directory / options.source):
+            with open(directory / options.source, "r") as f:
+                rules = f.read().splitlines()
+            self.rules = [rule for rule in rules if rule and not rule.startswith("#")]
+        else:
+            self.rules = []
+
+    def extension_matches(self, name: str) -> bool:
+        "True if the file name has the expected extension."
+
+        return self.options.extension is None or name.endswith(self.options.extension)
+
+    def is_excluded(self, name: str) -> bool:
+        "True if the file or directory name matches any of the exclusion patterns."
+
+        if name.startswith("."):
+            return True
+
+        if not self.extension_matches(name):
+            return True
+
+        for rule in self.rules:
+            if fnmatch(name, rule):
+                return True
+        else:
+            return False
+
+    def is_included(self, name: str) -> bool:
+        "True if the file or directory name matches none of the exclusion patterns."
+
+        return not self.is_excluded(name)
+
+    def filter(self, items: Iterable[str]) -> List[str]:
+        """
+        Returns only those elements from the input that don't match any of the exclusion rules.
+
+        :param items: A list of names to filter.
+        :returns: A filtered list of names that didn't match any of the exclusion rules.
+        """
+
+        return [item for item in items if self.is_included(item)]
+
+    def scandir(self, path: Path) -> List[str]:
+        """
+        Returns only those entries in a directory whose name doesn't match any of the exclusion rules.
+
+        :param path: Directory to scan.
+        :returns: A filtered list of entries whose name didn't match any of the exclusion rules.
+        """
+
+        return self.filter(entry.name for entry in os.scandir(path))
```
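The exclusion semantics of the new `Matcher` can be sketched without touching the filesystem. This is a condensed, standalone reimplementation for illustration (it does not import md2conf), with a hypothetical `.mdignore` rule list:

```python
from fnmatch import fnmatch

# Hypothetical .mdignore contents: one glob per line, comments stripped.
rules = ["drafts-*.md", "archive.md"]


def is_excluded(name: str, extension: str = ".md") -> bool:
    # Condensed version of Matcher.is_excluded: hidden entries, names with
    # the wrong extension, and anything matching a rule are skipped.
    if name.startswith("."):
        return True
    if not name.endswith(extension):
        return True
    return any(fnmatch(name, rule) for rule in rules)


names = ["index.md", "drafts-2023.md", ".mdignore", "notes.txt"]
kept = [name for name in names if not is_excluded(name)]
print(kept)  # → ['index.md']
```

Note that `fnmatch` uses shell-style globs (`*`, `?`, `[seq]`), not full gitignore syntax, so directory-scoped patterns like `docs/**` behave differently than in `.gitignore`.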
md2conf/mermaid.py
CHANGED

```diff
@@ -1,17 +1,37 @@
+import logging
 import os
 import os.path
 import shutil
 import subprocess
 from typing import Literal
 
+LOGGER = logging.getLogger(__name__)
+
+
+def is_docker() -> bool:
+    "True if the application is running in a Docker container."
+
+    return (
+        os.environ.get("CHROME_BIN") == "/usr/bin/chromium-browser"
+        and os.environ.get("PUPPETEER_SKIP_DOWNLOAD") == "true"
+    )
+
+
+def get_mmdc() -> str:
+    "Path to the Mermaid diagram converter."
+
+    if is_docker():
+        return "/home/md2conf/node_modules/.bin/mmdc"
+    elif os.name == "nt":
+        return "mmdc.cmd"
+    else:
+        return "mmdc"
+
 
 def has_mmdc() -> bool:
     "True if Mermaid diagram converter is available on the OS."
 
-
-        executable = "mmdc.cmd"
-    else:
-        executable = "mmdc"
+    executable = get_mmdc()
     return shutil.which(executable) is not None
 
 
@@ -20,20 +40,21 @@ def render(source: str, output_format: Literal["png", "svg"] = "png") -> bytes:
 
     filename = f"tmp_mermaid.{output_format}"
 
-
-
-
-
+    cmd = [
+        get_mmdc(),
+        "--input",
+        "-",
+        "--output",
+        filename,
+        "--outputFormat",
+        output_format,
+    ]
+    if is_docker():
+        cmd.extend(
+            ["-p", os.path.join(os.path.dirname(__file__), "puppeteer-config.json")]
+        )
+    LOGGER.debug(f"Executing: {' '.join(cmd)}")
     try:
-        cmd = [
-            executable,
-            "--input",
-            "-",
-            "--output",
-            filename,
-            "--outputFormat",
-            output_format,
-        ]
         proc = subprocess.Popen(
             cmd,
             stdout=subprocess.PIPE,
@@ -41,10 +62,11 @@ def render(source: str, output_format: Literal["png", "svg"] = "png") -> bytes:
             stderr=subprocess.PIPE,
             text=False,
         )
-        proc.communicate(input=source.encode("utf-8"))
+        stdout, stderr = proc.communicate(input=source.encode("utf-8"))
         if proc.returncode:
             raise RuntimeError(
-                f"failed to convert Mermaid diagram; exit code: {proc.returncode}"
+                f"failed to convert Mermaid diagram; exit code: {proc.returncode}, "
+                f"output:\n{stdout.decode('utf-8')}\n{stderr.decode('utf-8')}"
            )
     with open(filename, "rb") as image:
         return image.read()
```
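The `mmdc` argument list assembled in `render()` can be previewed without invoking the converter. This sketch only rebuilds the command; the `"md2conf"` directory passed to `os.path.join` is illustrative (the real code uses the module's own directory for `puppeteer-config.json`):

```python
import os.path
from typing import List


def build_mmdc_command(output_format: str, docker: bool = False) -> List[str]:
    # Mirrors the argument list in render(): read the diagram from stdin,
    # write the rendered image to a temporary file.
    cmd = [
        "mmdc",
        "--input", "-",
        "--output", f"tmp_mermaid.{output_format}",
        "--outputFormat", output_format,
    ]
    if docker:
        # Inside the container, pass a Puppeteer config (illustrative path).
        cmd.extend(["-p", os.path.join("md2conf", "puppeteer-config.json")])
    return cmd


print(" ".join(build_mmdc_command("svg")))
# → mmdc --input - --output tmp_mermaid.svg --outputFormat svg
```

Capturing `stdout`/`stderr` from `communicate()` (as the 0.2.2 change does) is what makes the new error message actionable when Chromium or Puppeteer is misconfigured.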
md2conf/processor.py
CHANGED

```diff
@@ -1,14 +1,17 @@
+import hashlib
 import logging
 import os
 from pathlib import Path
-from typing import Dict
+from typing import Dict, List
 
 from .converter import (
     ConfluenceDocument,
     ConfluenceDocumentOptions,
     ConfluencePageMetadata,
+    ConfluenceQualifiedID,
     extract_qualified_id,
 )
+from .matcher import Matcher, MatcherOptions
 from .properties import ConfluenceProperties
 
 LOGGER = logging.getLogger(__name__)
@@ -37,28 +40,14 @@ class Processor:
     def process_directory(self, local_dir: Path) -> None:
         "Recursively scans a directory hierarchy for Markdown files."
 
-        page_metadata: Dict[Path, ConfluencePageMetadata] = {}
         LOGGER.info(f"Synchronizing directory: {local_dir}")
 
         # Step 1: build index of all page metadata
-
-
-
-            for file_name in files:
-                # Reconstitute Path object back
-                docfile = (Path(root) / file_name).absolute()
-
-                # Skip non-markdown files
-                if docfile.suffix.lower() != ".md":
-                    continue
-
-                metadata = self._get_page(docfile)
-                LOGGER.debug(f"indexed {docfile} with metadata: {metadata}")
-                page_metadata[docfile] = metadata
-
-        LOGGER.info(f"indexed {len(page_metadata)} pages")
+        page_metadata: Dict[Path, ConfluencePageMetadata] = {}
+        self._index_directory(local_dir, page_metadata)
+        LOGGER.info(f"indexed {len(page_metadata)} page(s)")
 
-        # Step 2:
+        # Step 2: convert each page
         for page_path in page_metadata.keys():
             self.process_page(page_path, page_metadata)
 
@@ -72,6 +61,36 @@ class Processor:
         with open(path.with_suffix(".csf"), "w", encoding="utf-8") as f:
             f.write(content)
 
+    def _index_directory(
+        self,
+        local_dir: Path,
+        page_metadata: Dict[Path, ConfluencePageMetadata],
+    ) -> None:
+        "Indexes Markdown files in a directory recursively."
+
+        LOGGER.info(f"Indexing directory: {local_dir}")
+
+        matcher = Matcher(MatcherOptions(source=".mdignore", extension="md"), local_dir)
+
+        files: List[Path] = []
+        directories: List[Path] = []
+        for entry in os.scandir(local_dir):
+            if matcher.is_excluded(entry.name):
+                continue
+
+            if entry.is_file():
+                files.append((Path(local_dir) / entry.name).absolute())
+            elif entry.is_dir():
+                directories.append((Path(local_dir) / entry.name).absolute())
+
+        for doc in files:
+            metadata = self._get_page(doc)
+            LOGGER.debug(f"indexed {doc} with metadata: {metadata}")
+            page_metadata[doc] = metadata
+
+        for directory in directories:
+            self._index_directory(Path(local_dir) / directory, page_metadata)
+
     def _get_page(self, absolute_path: Path) -> ConfluencePageMetadata:
         "Extracts metadata from a Markdown file."
 
@@ -80,7 +99,13 @@ class Processor:
 
         qualified_id, document = extract_qualified_id(document)
         if qualified_id is None:
-
+            if self.options.root_page_id is not None:
+                hash = hashlib.md5(document.encode("utf-8"))
+                digest = "".join(f"{c:x}" for c in hash.digest())
+                LOGGER.info(f"Identifier '{digest}' assigned to page: {absolute_path}")
+                qualified_id = ConfluenceQualifiedID(digest)
+            else:
+                raise ValueError("required: page ID for local output")
 
         return ConfluencePageMetadata(
             domain=self.properties.domain,
```
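When no page-ID comment is present and a root page is configured, `_get_page` now derives an identifier from an MD5 hash of the document. Reproducing the diff's exact expression shows a subtlety: the byte-wise `f"{c:x}"` join does not zero-pad, so bytes below `0x10` lose a digit and the identifier can be shorter than the 32 characters `hexdigest()` would give:

```python
import hashlib

document = "# My page\n"
h = hashlib.md5(document.encode("utf-8"))

# The expression from the diff: hex-format each byte, unpadded.
digest = "".join(f"{c:x}" for c in h.digest())

print(digest)        # unpadded rendering, <= 32 chars
print(h.hexdigest()) # zero-padded reference rendering, always 32 chars
```

The identifier is still deterministic per document content, which is all the index needs, but it is not byte-for-byte interchangeable with a standard hex digest.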
md2conf/properties.py
CHANGED

```diff
@@ -1,5 +1,5 @@
 import os
-from typing import Optional
+from typing import Dict, Optional
 
 
 class ConfluenceError(RuntimeError):
@@ -12,6 +12,7 @@ class ConfluenceProperties:
     space_key: str
     user_name: Optional[str]
     api_key: str
+    headers: Optional[Dict[str, str]]
 
     def __init__(
         self,
@@ -20,6 +21,7 @@ class ConfluenceProperties:
         user_name: Optional[str] = None,
         api_key: Optional[str] = None,
         space_key: Optional[str] = None,
+        headers: Optional[Dict[str, str]] = None,
     ) -> None:
         opt_domain = domain or os.getenv("CONFLUENCE_DOMAIN")
         opt_base_path = base_path or os.getenv("CONFLUENCE_PATH")
@@ -48,5 +50,4 @@ class ConfluenceProperties:
         self.user_name = opt_user_name
         self.api_key = opt_api_key
         self.space_key = opt_space_key
-        self.
-        self.space_key = opt_space_key
+        self.headers = headers
```
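Each field in `ConfluenceProperties.__init__` follows the same "explicit argument, else environment variable" fallback, as the `opt_domain` and `opt_base_path` lines above show. A minimal sketch of that pattern (standalone; `resolve_setting` is a name invented here for illustration):

```python
import os
from typing import Optional


def resolve_setting(explicit: Optional[str], env_var: str) -> Optional[str]:
    # An explicitly passed value wins; otherwise fall back to the
    # environment, mirroring ConfluenceProperties.__init__.
    return explicit or os.getenv(env_var)


os.environ["CONFLUENCE_DOMAIN"] = "example.atlassian.net"
print(resolve_setting(None, "CONFLUENCE_DOMAIN"))
# → example.atlassian.net
print(resolve_setting("other.example.com", "CONFLUENCE_DOMAIN"))
# → other.example.com
```

The new `headers` parameter has no environment fallback in the diff; it is stored as-is, presumably for callers that need extra HTTP headers (e.g. behind a proxy).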
markdown_to_confluence-0.2.0.dist-info/RECORD
DELETED

```diff
@@ -1,17 +0,0 @@
-md2conf/__init__.py,sha256=1KRpqiilQTkQz-oL8-HFPnI_6_3-_H0dq-SxQxDw56s,402
-md2conf/__main__.py,sha256=tWMEA_spxUTNNgViHtjsA85NzJixX-0G2zCq8BO3y_E,5230
-md2conf/api.py,sha256=Oc4FAQBNs85U8s-lbY0XwLBUcjm3Sd0_W59N4H3XAnE,15768
-md2conf/application.py,sha256=NnF84-cdW2cZUbU6VeHvuEg6g5NL5M9o2cpOSU7uv7o,5548
-md2conf/converter.py,sha256=XY7D8zpsVS7_PZzywciQ5YT2SHH5t1udPU5s2aPsmqs,27040
-md2conf/entities.dtd,sha256=M6NzqL5N7dPs_eUA_6sDsiSLzDaAacrx9LdttiufvYU,30215
-md2conf/mermaid.py,sha256=3zawPXHXkCDhEK-WNtCH-gTqsLBDRzLrmlSo8ZW-Ii8,1371
-md2conf/processor.py,sha256=3JZkbFtMjbtnQLEm6wFum96ldjZ9xNJuL8JjFadyGmg,3084
-md2conf/properties.py,sha256=oXvtPssbougM1BTE9ytcD_1Yjc3nd7DDSHqEr0QoZAU,1811
-md2conf/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
-markdown_to_confluence-0.2.0.dist-info/LICENSE,sha256=Pv43so2bPfmKhmsrmXFyAvS7M30-1i1tzjz6-dfhyOo,1077
-markdown_to_confluence-0.2.0.dist-info/METADATA,sha256=nxwG4F2TX1do0lk38BCFIUMwqv6y2edErd2_5M-4la4,10023
-markdown_to_confluence-0.2.0.dist-info/WHEEL,sha256=GV9aMThwP_4oNCtvEC2ec3qUYutgWeAzklro_0m4WJQ,91
-markdown_to_confluence-0.2.0.dist-info/entry_points.txt,sha256=F1zxa1wtEObtbHS-qp46330WVFLHdMnV2wQ-ZorRmX0,50
-markdown_to_confluence-0.2.0.dist-info/top_level.txt,sha256=_FJfl_kHrHNidyjUOuS01ngu_jDsfc-ZjSocNRJnTzU,8
-markdown_to_confluence-0.2.0.dist-info/zip-safe,sha256=AbpHGcgLb-kRsJGnwFEktk7uzpZOCcBY74-YBdrKVGs,1
-markdown_to_confluence-0.2.0.dist-info/RECORD,,
```

{markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/LICENSE
RENAMED
File without changes

{markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/WHEEL
RENAMED
File without changes

{markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/entry_points.txt
RENAMED
File without changes

{markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/top_level.txt
RENAMED
File without changes

{markdown_to_confluence-0.2.0.dist-info → markdown_to_confluence-0.2.2.dist-info}/zip-safe
RENAMED
File without changes