PyPI - mdproc - Versions diffs - 0.2.1__tar.gz → 0.3.0__tar.gz - Mend

mdproc-0.2.1.tar.gz → mdproc-0.3.0.tar.gz

Files changed (26)

mdproc-0.3.0/.gitignore +5 -0
mdproc-0.3.0/.vscode/launch.json +41 -0
mdproc-0.3.0/.vscode/settings.json +6 -0
{mdproc-0.2.1 → mdproc-0.3.0}/PKG-INFO +37 -10
mdproc-0.3.0/README.md +69 -0
{mdproc-0.2.1 → mdproc-0.3.0}/pyproject.toml +11 -3
mdproc-0.3.0/src/mdproc/demo_mermaid.md +12 -0
mdproc-0.3.0/src/mdproc/demo_mermaid_mm2img.md +5 -0
mdproc-0.3.0/src/mdproc/demo_table.md +10 -0
mdproc-0.3.0/src/mdproc/demo_table_tb2img.md +3 -0
mdproc-0.3.0/src/mdproc/extract_tables.py +64 -0
mdproc-0.3.0/src/mdproc/mdforzhihu.py +48 -0
mdproc-0.3.0/src/mdproc/mdmermaid2img.py +282 -0
mdproc-0.3.0/src/mdproc/mdtable2img.py +118 -0
mdproc-0.3.0/src/mdproc/mermaid2img.py +69 -0
mdproc-0.3.0/src/mdproc/mermaid2img_playwright.py +175 -0
mdproc-0.3.0/src/mdproc/mermaid2img_playwright_cdn.py +169 -0
mdproc-0.2.1/.gitignore +0 -4
mdproc-0.2.1/.vscode/launch.json +0 -12
mdproc-0.2.1/README.md +0 -44
{mdproc-0.2.1 → mdproc-0.3.0}/LICENSE +0 -0
{mdproc-0.2.1 → mdproc-0.3.0}/src/mdproc/cos_uploader.py +0 -0
{mdproc-0.2.1 → mdproc-0.3.0}/src/mdproc/demo.md +0 -0
{mdproc-0.2.1 → mdproc-0.3.0}/src/mdproc/demo_output.md +0 -0
{mdproc-0.2.1 → mdproc-0.3.0}/src/mdproc/mdimgupload.py +0 -0
{mdproc-0.2.1 → mdproc-0.3.0}/src/mdproc/mdproc.py +0 -0

mdproc-0.3.0/.gitignore ADDED

@@ -0,0 +1,5 @@
+.env
+.venv
+__pycache__/
+dist/
+src/mdproc/assets/mermaid.bundle.js

mdproc-0.3.0/.vscode/launch.json ADDED

@@ -0,0 +1,41 @@
+{
+    "configurations": [
+        {
+            "name": "Python Debugger: Current File",
+            "type": "debugpy",
+            "request": "launch",
+            "program": "${file}",
+            "console": "integratedTerminal"
+        },
+        {
+            "name": "Python Debugger: mdimgupload",
+            "type": "debugpy",
+            "request": "launch",
+            "module": "mdproc.mdimgupload",
+            "args": [
+                "mdproc/demo.md"
+            ],
+            "cwd": "${workspaceFolder}/src",
+        },
+        {
+            "name": "Python Debugger: mdtable2img",
+            "type": "debugpy",
+            "request": "launch",
+            "module": "mdproc.mdtable2img",
+            "args": [
+                "mdproc/demo_table.md"
+            ],
+            "cwd": "${workspaceFolder}/src",
+        },
+        {
+            "name": "Python Debugger: mdmermaid2img",
+            "type": "debugpy",
+            "request": "launch",
+            "module": "mdproc.mdmermaid2img",
+            "args": [
+                "mdproc/demo_mermaid.md"
+            ],
+            "cwd": "${workspaceFolder}/src",
+        }
+    ]
+}

mdproc-0.3.0/.vscode/settings.json ADDED

@@ -0,0 +1,6 @@
+{
+    "editor.formatOnSave": true,
+    "editor.codeActionsOnSave": {
+        "source.fixAll.ruff": "always"
+    }
+}

{mdproc-0.2.1 → mdproc-0.3.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: mdproc
-Version: 0.2.1
+Version: 0.3.0
 Summary: A tool to process markdown files.
 Project-URL: Homepage, https://github.com/honghe/mdproc
 Project-URL: Repository, https://github.com/honghe/mdproc
@@ -19,6 +19,8 @@ Classifier: Topic :: Multimedia :: Video
 Requires-Python: >=3.10
 Requires-Dist: cos-python-sdk-v5
 Requires-Dist: httpx
+Requires-Dist: markdown-it-py
+Requires-Dist: playwright
 Requires-Dist: python-dotenv
 Provides-Extra: dev
 Requires-Dist: build; extra == 'dev'
@@ -32,10 +34,13 @@ A simple Python tool to process markdown files.
 ## Features
 - Markdown Image Uploader to COS.
+- Convert Markdown tables to images and upload to COS.
+- Convert mermaid chart to image. (dependency `npm install -g @mermaid-js/mermaid-cli`)
 ## Config
 `.env` or configure environment variables:
 ```
 COS_SECRET_ID=<xyz>
 COS_SECRET_KEY=<xyz>
@@ -45,27 +50,49 @@ COS_BUCKET=<xyz>
 ## Usage
-1. Install dependencies:
-    ```bash
-    pip install mdproc
-    ```
-2. Run the script:
-    ```bash
-    mdproc-imgupload your_markdown.md
-    ```
+- Install dependencies:
+  ```bash
+  pip install mdproc
+  # for mdproc-table2img and mdproc-mermaid2img
+  playwright install chromium
+  ```
+- Markdown images upload:
+  ```bash
+  mdproc-imgupload your_markdown.md
+  ```
+- Markdown table to image:
+  ```bash
+   mdproc-table2img your_markdown.md
+  ```
+- Markdown mermaid to image:
+  ```bash
+   mdproc-mermaid2img your_markdown.md
+  ```
 ## Demo
 demo.md:
 ```
 ![first-version](https://www.python.org/static/img/python-logo.png)
 ```
 demo_output.md
 ```
 ![first-version](https://pic-1251484506.cos.ap-guangzhou.myqcloud.com/imgs/python-logo_ae79195a.png)
 ```
+## mermaid2img Benchmark
+Note: The browser is Chromium. mermaid-cli uses Puppeteer.
+| mermaid2img | Cold Start /s | Warm Start /s |
+| --------------------------------- | ------------- | ------------- |
+| playwright (Mermaid.js CDN) | 2.5 | 1.5 |
+| playwright (local mermaid bundle) | 2.5 | 1.5 |
+| mermaid-cli | 5.7 | 3.7 |
 ## License
-Apache License
+Apache License

mdproc-0.3.0/README.md ADDED

@@ -0,0 +1,69 @@
+# mdproc
+A simple Python tool to process markdown files.
+## Features
+- Markdown Image Uploader to COS.
+- Convert Markdown tables to images and upload to COS.
+- Convert mermaid chart to image. (dependency `npm install -g @mermaid-js/mermaid-cli`)
+## Config
+`.env` or configure environment variables:
+```
+COS_SECRET_ID=<xyz>
+COS_SECRET_KEY=<xyz>
+COS_REGION=<xyz>
+COS_BUCKET=<xyz>
+```
+## Usage
+- Install dependencies:
+  ```bash
+  pip install mdproc
+  # for mdproc-table2img and mdproc-mermaid2img
+  playwright install chromium
+  ```
+- Markdown images upload:
+  ```bash
+  mdproc-imgupload your_markdown.md
+  ```
+- Markdown table to image:
+  ```bash
+   mdproc-table2img your_markdown.md
+  ```
+- Markdown mermaid to image:
+  ```bash
+   mdproc-mermaid2img your_markdown.md
+  ```
+## Demo
+demo.md:
+```
+![first-version](https://www.python.org/static/img/python-logo.png)
+```
+demo_output.md
+```
+![first-version](https://pic-1251484506.cos.ap-guangzhou.myqcloud.com/imgs/python-logo_ae79195a.png)
+```
+## mermaid2img Benchmark
+Note: The browser is Chromium. mermaid-cli uses Puppeteer.
+| mermaid2img | Cold Start /s | Warm Start /s |
+| --------------------------------- | ------------- | ------------- |
+| playwright (Mermaid.js CDN) | 2.5 | 1.5 |
+| playwright (local mermaid bundle) | 2.5 | 1.5 |
+| mermaid-cli | 5.7 | 3.7 |
+## License
+Apache License
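
The Config section of the README lists four COS variables that are read from `.env` or the environment. A minimal sketch of a pre-flight check follows, assuming all four are required (the README does not say whether any is optional). `missing_cos_settings` is a hypothetical helper and is not part of mdproc.

```python
import os

# Variable names are taken from the README's Config section.
REQUIRED_COS_VARS = ("COS_SECRET_ID", "COS_SECRET_KEY", "COS_REGION", "COS_BUCKET")


def missing_cos_settings(env=None):
    """Return the names of COS variables that are unset or empty.

    Hypothetical helper for illustration; mdproc itself does not ship it.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_COS_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_cos_settings()
    if missing:
        print("Missing COS settings:", ", ".join(missing))
    else:
        print("All COS settings are present.")
```

Running a check like this before an upload would surface a missing key early, rather than after the images have already been rendered.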

{mdproc-0.2.1 → mdproc-0.3.0}/pyproject.toml RENAMED

@@ -4,12 +4,18 @@ build-backend = "hatchling.build"
 [project]
 name = "mdproc"
-version = "0.2.1"
+version = "0.3.0"
 description = "A tool to process markdown files."
 authors = [{ name = "Honghe" }]
 readme = "README.md"
 requires-python = ">=3.10"
-dependencies = ["httpx", "python-dotenv", "cos-python-sdk-v5"]
+dependencies = [
+    "httpx",
+    "python-dotenv",
+    "cos-python-sdk-v5",
+    "markdown-it-py",
+    "playwright",
+]
 keywords = ["markdown", "jpg", "png", "process"]
 classifiers = [
@@ -27,7 +33,9 @@ classifiers = [
 [project.scripts]
 mdproc = "mdproc.mdproc:main"
 mdproc-imgupload = "mdproc.mdimgupload:main"
+mdproc-forzhihu = "mdproc.mdforzhihu:main"
+mdproc-table2img = "mdproc.mdtable2img:main"
+mdproc-mermaid2img = "mdproc.mdmermaid2img:main"
 [project.urls]
 Homepage = "https://github.com/honghe/mdproc"
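
Each `[project.scripts]` entry above maps a command name to a `module:function` target. The sketch below shows roughly how such a target string resolves to a callable. `resolve_entry_point` is a hypothetical helper, and a standard-library target stands in for the mdproc modules so the example runs without the package installed.

```python
from importlib import import_module


def resolve_entry_point(spec):
    """Resolve a 'package.module:function' string to the callable it names.

    Hypothetical helper for illustration; installers generate their own
    launcher scripts for [project.scripts] entries.
    """
    module_name, _, attr = spec.partition(":")
    return getattr(import_module(module_name), attr)


if __name__ == "__main__":
    # Stand-in target from the standard library. With mdproc installed,
    # "mdproc.mdtable2img:main" would resolve the same way.
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"ok": True}))
```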

mdproc-0.3.0/src/mdproc/demo_mermaid.md ADDED

@@ -0,0 +1,12 @@
+This is doc.
+```mermaid
+graph TD
+    A[Start] --> B{Is it?}
+    B -->|Yes| C[OK]
+    C --> D[Rethink]
+    D --> B
+    B ---->|No| E[End]
+```
+Done.
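
To illustrate how a file like this demo is turned into its `_mm2img.md` counterpart, here is a minimal sketch that applies the same fence regex used by `extract_mermaid_code()` in `mdmermaid2img.py` to a shortened copy of the demo. The image URL is a placeholder, not a real upload result, and the rendering and upload steps are left out.

```python
import re

# Same pattern as extract_mermaid_code() in mdmermaid2img.py.
MERMAID_FENCE = re.compile(r"```mermaid\n(.*?)\n```", re.DOTALL)

FENCE = "`" * 3  # built at runtime so no line of this example starts with a fence

demo = "\n".join(
    [
        "This is doc.",
        FENCE + "mermaid",
        "graph TD",
        "    A[Start] --> B{Is it?}",
        "    B -->|Yes| C[OK]",
        FENCE,
        "Done.",
    ]
)

match = MERMAID_FENCE.search(demo)
mermaid_code = match.group(1).strip()  # diagram source without the fences

# Placeholder URL, used only to show the shape of the replacement.
replaced = demo.replace(match.group(0), "![mermaid 1](https://example.com/mermaid_1.png)")

if __name__ == "__main__":
    print(mermaid_code)
    print(replaced)
```

Note that the pattern expects the opening fence to be exactly three backticks followed by `mermaid` and a newline, so fences with trailing spaces or extra attributes would not match.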

mdproc-0.3.0/src/mdproc/demo_mermaid_mm2img.md ADDED

@@ -0,0 +1,5 @@
+This is doc.
+![mermaid 1](https://pic-1251484506.cos.ap-guangzhou.myqcloud.com/imgs/mermaid_1031221472.png)
+Done.

mdproc-0.3.0/src/mdproc/demo_table.md ADDED

@@ -0,0 +1,10 @@
+table
+| ID | txt_0      | txt_1       | txt_2       | txt_3       | txt_4       | txt_5       | txt_6       | txt_7       | txt_8       | txt_9       | cat_Appetizers & Sides | cat_Aussie Pub Classics | cat_Burgers & Sandwiches | cat_Drinks & Desserts | cat_Mexican Specialties | cat_Pasta & Risotto | cat_Pizzas | cat_Salads & Healthy Options | is_coffee | price  |
+| ------- | ---------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ---------------------- | ----------------------- | ------------------------ | --------------------- | ----------------------- | ------------------- | ---------- | ---------------------------- | --------- | ------ |
+| 8       | 0.3354452  | 0.36037982  | -0.04443971 | 0.14370468  | -0.19956689 | -0.17493485 | -0.18741444 | -0.02776922 | -0.07173516 | -0.11751403 | 0                      | 0                       | 1                        | 0                     | 0                       | 0                   | 0          | 0                            | 0         | 0.3887 |
+| 42      | 0.3015529  | 0.28032377  | 0.03035132  | 0.21287075  | 0.04236558  | -0.054545   | -0.10349114 | -0.13550489 | -0.04504355 | -0.22817583 | 0                      | 0                       | 0                        | 0                     | 0                       | 0                   | 1          | 0                            | 0         | 0.5832 |
+| 61      | 0.53950787 | -0.020039   | -0.36858445 | -0.10636957 | 0.00259933  | 0.15990224  | 0.04153050  | 0.11348728  | -0.02482079 | -0.23463035 | 0                      | 1                       | 0                        | 0                     | 0                       | 0                   | 0          | 0                            | 0         | 0.6110 |
+| 101     | 0.20630628 | -0.04121789 | 0.11134595  | -0.2160106  | 0.00511632  | -0.20131038 | 0.05482014  | -0.19734132 | 0.35356910  | 0.23985470  | 0                      | 0                       | 0                        | 0                     | 1                       | 0                   | 0          | 0                            | 0         | 0.2765 |
+end

mdproc-0.3.0/src/mdproc/demo_table_tb2img.md ADDED

@@ -0,0 +1,3 @@
+table
+![Table 1](https://pic-1251484506.cos.ap-guangzhou.myqcloud.com/imgs/table_1.png)
+end

mdproc-0.3.0/src/mdproc/extract_tables.py ADDED

@@ -0,0 +1,64 @@
+from markdown_it import MarkdownIt
+def extract_raw_tables(md_text):
+    """
+    Extracts the raw markdown strings of tables from a given markdown text.
+    """
+    # Configure the parser to enable tables (GFM-like is a good preset)
+    md = MarkdownIt("gfm-like", {"linkify": False}).enable("table")
+    # Parse the markdown into tokens
+    tokens = md.parse(md_text, {})
+    raw_tables = []
+    lines = md_text.splitlines()
+    for token in tokens:
+        # markdown-it-py records the source map on the opening token only;
+        # table_close carries no map, so read [start, end) from table_open.
+        if token.type == "table_open" and token.map:
+            start, end = token.map
+            raw_tables.append("\n".join(lines[start:end]))
+    return raw_tables
+def main():
+    # Example usage:
+    markdown_content = """
+    Here is some introductory text.
+    | Header 1 | Header 2 |
+    |---|---|
+    | Cell 1 | Cell 2 |
+    | Cell 3 | Cell 4 |
+    Some text in between.
+    | Name | Age |
+    |---|---|
+    | Alice | 30 |
+    | Bob | 25 |
+    """
+    tables = extract_raw_tables(markdown_content)
+    for i, table_str in enumerate(tables):
+        print(f"--- Table {i + 1} Raw String ---")
+        print(table_str)
+        print("----------------------------\n")
+if __name__ == "__main__":
+    main()

mdproc-0.3.0/src/mdproc/mdforzhihu.py ADDED

@@ -0,0 +1,48 @@
+import re
+import os
+import argparse
+def main():
+    # delete (multi) empty lines before and after img tags
+    parser = argparse.ArgumentParser(
+        description="Process markdown file for Zhihu."
+    )
+    parser.add_argument("input_file", help="Path to the input markdown file.")
+    args = parser.parse_args()
+    input_file = args.input_file
+    output_file = f"{os.path.splitext(input_file)[0]}_4zhihu.md"
+    with open(input_file, "r", encoding="utf-8") as f:
+        lines = f.readlines()
+    new_lines = []
+    i = 0
+    n = len(lines)
+    img_tag_count = 0
+    removed_empty_count = 0
+    while i < n:
+        line = lines[i]
+        stripped = line.strip()
+        if re.match(r"!\[.*?\]\(.*?\)", stripped):
+            img_tag_count += 1
+            # Remove all empty lines before img tag
+            before = len(new_lines)
+            while new_lines and new_lines[-1].strip() == "":
+                new_lines.pop()
+            removed_empty_count += before - len(new_lines)
+            new_lines.append(line)
+            # Skip all empty lines after img tag
+            j = i + 1
+            after = 0
+            while j < n and lines[j].strip() == "":
+                after += 1
+                j += 1
+            removed_empty_count += after
+            i = j
+        else:
+            new_lines.append(line)
+            i += 1
+    with open(output_file, "w", encoding="utf-8") as f:
+        f.writelines(new_lines)
+    print(f"Image tags: {img_tag_count}, removed empty lines: {removed_empty_count}")
+if __name__ == "__main__":
+    main()
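
The loop in `mdforzhihu.py` removes blank lines directly before and after each image line. The same idea can be sketched as a pure function that is easier to test in isolation. `tighten_images` is a hypothetical name; the regex is the one used in the file above.

```python
import re

IMG_TAG = re.compile(r"!\[.*?\]\(.*?\)")  # same pattern as mdforzhihu.py


def tighten_images(lines):
    """Drop blank lines directly before and after each image line.

    Hypothetical helper that mirrors the loop in mdforzhihu.main().
    """
    out = []
    skip_blanks = False
    for line in lines:
        is_blank = line.strip() == ""
        if skip_blanks and is_blank:
            continue  # blank line right after an image
        skip_blanks = False
        if IMG_TAG.match(line.strip()):
            while out and out[-1].strip() == "":
                out.pop()  # blank lines right before the image
            skip_blanks = True
        out.append(line)
    return out


if __name__ == "__main__":
    sample = ["intro\n", "\n", "![pic](a.png)\n", "\n", "text\n"]
    print("".join(tighten_images(sample)), end="")
```

Blank lines that are not adjacent to an image are kept, so paragraph breaks elsewhere in the document survive.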

mdproc-0.3.0/src/mdproc/mdmermaid2img.py ADDED

@@ -0,0 +1,282 @@
+# -*- coding: utf-8 -*-
+"""
+Process Mermaid charts in Markdown documents: Convert → Upload → Replace.
+TRUE 3-STEP WORKFLOW (writes file ONCE):
+1. convert_mermaid_in_markdown() - Convert mermaid code blocks to images
+2. upload_mermaid_images_to_cos() - Upload images to COS
+3. replace_mermaid_with_images() - Replace mermaid blocks with COS image links
+Use the unified pipeline: process_mermaid_markdown_3steps()
+- Reads file once
+- Does all replacements in memory
+- Writes file once at the end
+"""
+from dotenv import load_dotenv
+import argparse
+import os
+import re
+import tempfile
+from pathlib import Path
+from typing import Optional, Tuple, Dict, List
+from .mermaid2img_playwright import render_mermaid_playwright
+from .cos_uploader import upload
+load_dotenv()
+def extract_mermaid_code(markdown_content: str) -> list[Tuple[str, str]]:
+    """
+    Extract mermaid code blocks from markdown content.
+    Args:
+        markdown_content: The markdown text content
+    Returns:
+        List of tuples (mermaid_code, original_block) where:
+        - mermaid_code: Clean mermaid code without markdown fences
+        - original_block: Original markdown block including fences
+    """
+    # Pattern to match ```mermaid ... ```
+    pattern = r"```mermaid\n(.*?)\n```"
+    matches = re.finditer(pattern, markdown_content, re.DOTALL)
+    results = []
+    for match in matches:
+        mermaid_code = match.group(1).strip()
+        original_block = match.group(0)
+        results.append((mermaid_code, original_block))
+    return results
+def mermaid_to_image(
+    mermaid_code: str,
+    output_dir: Optional[str] = None,
+    theme: str = "default",
+    scale: int = 2,
+) -> str:
+    """
+    Convert mermaid code to image file.
+    Args:
+        mermaid_code: Raw mermaid diagram code (without markdown fences)
+        output_dir: Directory to save the image. If None, uses temp directory
+        theme: Theme for rendering ("default" or "dark")
+        scale: Scale factor for image
+    Returns:
+        Path to the generated image file
+    """
+    if output_dir is None:
+        output_dir = os.path.join(tempfile.gettempdir(), "mermaid2img")
+    os.makedirs(output_dir, exist_ok=True)
+    # Generate a filename from the built-in hash of the mermaid code.
+    # Note: str hashes are salted per process, so the name differs between runs.
+    code_hash = hash(mermaid_code) & 0x7FFFFFFF
+    output_filename = f"mermaid_{code_hash}.png"
+    output_path = os.path.join(output_dir, output_filename)
+    # Render mermaid code to image
+    render_mermaid_playwright(mermaid_code, output_path, theme=theme, scale=scale)
+    return output_path
+def convert_mermaid_in_markdown(
+    markdown_content: str,
+    img_output_dir: Optional[str] = None,
+    theme: str = "default",
+    scale: int = 1,
+) -> Tuple[str, Dict[str, str]]:
+    """
+    Convert all mermaid charts in markdown to images.
+    DOES NOT modify markdown content, only generates images.
+    Args:
+        markdown_content: The markdown text content
+        img_output_dir: Directory to save images. If None, uses temp directory
+        theme: Theme for rendering ("default" or "dark")
+        scale: Scale factor for image
+    Returns:
+        Tuple of (markdown_unchanged, image_map_dict)
+        where image_map_dict contains {original_block: img_path}
+    """
+    # Extract mermaid blocks
+    mermaid_blocks = extract_mermaid_code(markdown_content)
+    if not mermaid_blocks:
+        print("No mermaid blocks found.")
+        return markdown_content, {}
+    print(f"Found {len(mermaid_blocks)} mermaid blocks")
+    image_map = {}  # {original_block: img_path}
+    # Convert each mermaid block to image
+    for i, (mermaid_code, original_block) in enumerate(mermaid_blocks, 1):
+        try:
+            print(f"Converting block {i}/{len(mermaid_blocks)}...")
+            # Convert to image
+            img_path = mermaid_to_image(mermaid_code, img_output_dir, theme, scale)
+            print(f"  Generated: {img_path}")
+            # Store mapping
+            image_map[original_block] = img_path
+        except Exception as e:
+            print(f"  Error: {e}")
+            continue
+    return markdown_content, image_map
+def upload_mermaid_images_to_cos(local_image_paths: List[str]) -> Dict[str, str]:
+    """
+    Upload images to COS and map to URLs.
+    Args:
+        local_image_paths: List of local image file paths
+    Returns:
+        Dictionary mapping {img_path: cos_url}
+    """
+    upload_results = {}
+    for i, img_path in enumerate(local_image_paths, 1):
+        try:
+            print(f"Uploading image {i}/{len(local_image_paths)}...")
+            print(f"  Source: {img_path}")
+            cos_url = upload(Path(img_path))
+            upload_results[img_path] = cos_url
+            print(f"  COS URL: {cos_url}")
+        except Exception as e:
+            print(f"  Upload failed: {e}")
+            continue
+    return upload_results
+def replace_mermaid_with_images(
+    markdown_content: str,
+    mermaid_to_img_map: Dict[str, str],
+    img_to_url_map: Dict[str, str],
+) -> str:
+    """
+    Replace mermaid code blocks with image links that point to COS URLs.
+    Args:
+        markdown_content: Original markdown text
+        mermaid_to_img_map: Dictionary mapping {original_mermaid_block: img_path}
+        img_to_url_map: Dictionary mapping {img_path: cos_url}
+    Returns:
+        Updated markdown content with image links
+    Raises:
+        ValueError: If an image path has no COS URL in img_to_url_map
+    """
+    updated_content = markdown_content
+    for i, (original_block, img_path) in enumerate(mermaid_to_img_map.items(), 1):
+        # Look up the COS URL; there is no local-path fallback
+        if img_to_url_map and img_path in img_to_url_map:
+            image_url = img_to_url_map[img_path]
+            print(f"  Using COS URL: {image_url}")
+        else:
+            raise ValueError(f"No COS URL found for image path: {img_path}")
+        # Create markdown image link
+        image_link = f"![mermaid {i}]({image_url})"
+        # Replace original mermaid block
+        updated_content = updated_content.replace(original_block, image_link)
+    return updated_content
+def process_mermaid_markdown_3steps(
+    markdown_path: str,
+    output_path: Optional[str] = None,
+    theme: str = "default",
+    scale: int = 1,
+    img_output_dir: Optional[str] = None,
+):
+    """
+    Process markdown in 3 steps: Convert → Upload → Replace links.
+    Write file only ONCE at the end.
+    Args:
+        markdown_path: Path to input markdown file
+        output_path: Path to output markdown file. If None, overwrites input
+        theme: Theme for rendering ("default" or "dark")
+        scale: Scale factor for image
+        img_output_dir: Directory to save images. If None, uses temp directory
+    Returns:
+        Tuple of (final_markdown_content, results_dict)
+    """
+    # Read markdown file once
+    with open(markdown_path, "r", encoding="utf-8") as f:
+        markdown_content = f.read()
+    if output_path is None:
+        output_path = markdown_path
+    # ===== STEP 1: Convert mermaid to images =====
+    print("STEP 1: Converting mermaid charts to images...")
+    _, mermaid_to_img_map = convert_mermaid_in_markdown(
+        markdown_content, img_output_dir, theme, scale
+    )
+    if not mermaid_to_img_map:
+        print("No mermaid blocks found. Writing unchanged content.")
+        with open(output_path, "w", encoding="utf-8") as f:
+            f.write(markdown_content)
+        return markdown_content, {}
+    results = {"images": mermaid_to_img_map}
+    # ===== STEP 2: Upload  =====
+    print("STEP 2: Uploading images to COS...")
+    image_paths = list(mermaid_to_img_map.values())
+    img_to_url_map = upload_mermaid_images_to_cos(image_paths)
+    if img_to_url_map:
+        results["cos_urls"] = img_to_url_map
+        print(f"Uploaded {len(img_to_url_map)} images successfully.")
+    # ===== STEP 3: Replace in memory =====
+    print("STEP 3: Replacing mermaid blocks with image links...")
+    final_content = replace_mermaid_with_images(
+        markdown_content, mermaid_to_img_map, img_to_url_map
+    )
+    # Write file ONCE
+    print("Writing output file...")
+    with open(output_path, "w", encoding="utf-8") as f:
+        f.write(final_content)
+    print(f"Output saved to: {output_path}")
+    return final_content, results
+def main():
+    parser = argparse.ArgumentParser(
+        description="Convert Mermaid charts in a Markdown file to images and upload to COS."
+    )
+    parser.add_argument("input_file", help="Path to the input markdown file.")
+    args = parser.parse_args()
+    input_file = args.input_file
+    output_file = f"{os.path.splitext(input_file)[0]}_mm2img.md"
+    process_mermaid_markdown_3steps(input_file, output_path=output_file, scale=2)
+if __name__ == "__main__":
+    main()
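
`mermaid_to_image()` derives the PNG filename from Python's built-in `hash()`. String hashes are randomized per interpreter process by default, so the same diagram can get a different filename on each run. If stable names matter (for example, to avoid uploading duplicates), one option is a content digest. The sketch below is an alternative, not what the package does, and `stable_mermaid_filename` is a hypothetical name.

```python
import hashlib


def stable_mermaid_filename(mermaid_code, length=10):
    """Return a filename that is identical across runs for the same diagram.

    Hypothetical alternative to hash(); uses a truncated SHA-256 digest.
    """
    digest = hashlib.sha256(mermaid_code.encode("utf-8")).hexdigest()
    return f"mermaid_{digest[:length]}.png"


if __name__ == "__main__":
    print(stable_mermaid_filename("graph TD\n    A --> B"))
```

Truncating the digest keeps filenames short at the cost of a small collision risk; a longer `length` reduces that risk.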

mdproc-0.3.0/src/mdproc/mdtable2img.py ADDED

@@ -0,0 +1,118 @@
+"""
+Convert Markdown tables to images and upload them to COS.
+Tables with few columns keep their natural width.
+Tables with many columns spread out horizontally.
+"""
+import argparse
+import os
+import re
+import tempfile
+from pathlib import Path
+from dotenv import load_dotenv
+from markdown_it import MarkdownIt
+from playwright.sync_api import sync_playwright
+from .cos_uploader import upload
+load_dotenv()
+def extract_tables(md_text):
+    # A regex is simpler than the markdown-it table extractor for our use case
+    table_pattern = re.compile(r"(?:^\s*\|.*\|\s*\n)+", re.MULTILINE)
+    return [m.group(0) for m in table_pattern.finditer(md_text)]
+def table_to_image(md_text, output_path):
+    # md_text = """
+    # | A | B | C | D | E | F | G |
+    # |---|---|---|---|---|---|---|
+    # | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+    # """
+    html_table = (
+        MarkdownIt("gfm-like", {"linkify": False}).enable("table").render(md_text)
+    )
+    html_table = html_table.replace("<table", '<table id="mdtable2img"', 1)
+    html = f"""
+    <html>
+    <head>
+    <style>
+    /* Usually at most 800px wide, but can be wider when there are no sentences to break. */
+    table {{
+        border-collapse: collapse;
+        width: auto;
+        max-width: 800px;
+        table-layout: auto;
+        font-size: 14px;
+    }}
+    td, th {{
+        border: 1px solid #333;
+        padding: 6px 10px;
+        white-space: pre-line;
+    }}
+    </style>
+    </head>
+    <body>
+    {html_table}
+    </body>
+    </html>
+    """
+    with sync_playwright() as p:
+        browser = p.chromium.launch()
+        page = browser.new_page(viewport={"width": 2000, "height": 800})
+        page.set_content(html)
+        table_locator = page.locator("#mdtable2img")
+        table_locator.screenshot(path=output_path)
+        browser.close()
+def main():
+    parser = argparse.ArgumentParser(
+        description="Convert tables in a Markdown file to images and upload to COS."
+    )
+    parser.add_argument("input_file", help="Path to the input markdown file.")
+    args = parser.parse_args()
+    input_file = args.input_file
+    output_file = f"{os.path.splitext(input_file)[0]}_tb2img.md"
+    with open(input_file, "r", encoding="utf-8") as f:
+        content = f.read()
+    # Directory to store temporary imgs
+    img_dir = os.path.join(tempfile.gettempdir(), "mdtable2img")
+    os.makedirs(img_dir, exist_ok=True)
+    # Process tables and convert to images
+    tables = extract_tables(content)
+    print(f"Found {len(tables)} tables")
+    images = []
+    for i, table_md in enumerate(tables):
+        img_path = os.path.join(img_dir, f"table_{i + 1}.png")
+        table_to_image(table_md, img_path)
+        print(f"Converted table {i + 1} to image: {img_path}")
+        images.append(img_path)
+    print(f"Converted {len(images)} tables to images.")
+    # Upload images to COS and replace in markdown
+    for i, img_path in enumerate(images):
+        cos_url = upload(Path(img_path))
+        # Replace the first occurrence of the table markdown with image markdown
+        table_md = tables[i]
+        img_md = f"![Table {i + 1}]({cos_url})\n"
+        content = content.replace(table_md, img_md, 1)
+    print(f"Uploaded {len(images)} table images to COS.")
+    with open(output_file, "w", encoding="utf-8") as f:
+        f.write(content)
+    print(f"Processed markdown saved to {output_file}")
+if __name__ == "__main__":
+    main()
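
The regex in `extract_tables()` treats any run of consecutive lines that start and end with `|` as one table. A minimal sketch of that extraction and the follow-up replacement is shown below on a cut-down version of the demo table. The image URL is a placeholder, and the rendering and upload steps are omitted.

```python
import re

# Same pattern as extract_tables() in mdtable2img.py.
TABLE_PATTERN = re.compile(r"(?:^\s*\|.*\|\s*\n)+", re.MULTILINE)

doc = (
    "table\n"
    "| ID | price |\n"
    "| -- | ----- |\n"
    "| 8  | 0.39  |\n"
    "end\n"
)

tables = [m.group(0) for m in TABLE_PATTERN.finditer(doc)]

# Placeholder URL, used only to show the shape of the replacement.
result = doc.replace(tables[0], "![Table 1](https://example.com/table_1.png)\n", 1)

if __name__ == "__main__":
    print(f"Found {len(tables)} table(s)")
    print(result, end="")
```

Two properties of the pattern are worth keeping in mind. Because `\s` also matches newlines, a blank line directly before a table becomes part of the match. The final table row also needs a trailing newline to be included.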

mdproc-0.3.0/src/mdproc/mermaid2img.py ADDED

@@ -0,0 +1,69 @@
+import tempfile
+import os
+import shutil
+import subprocess
+def render_mermaid_cli(code: str, output_path: str, theme="default", scale=1):
+    """
+    Render mermaid code to image using mermaid-cli.
+    Supports Chinese and other Unicode characters.
+    """
+    mmdc_path = os.environ.get("MMDC_PATH") or shutil.which("mmdc")
+    if mmdc_path and os.name == "nt":
+        candidate_cmd = f"{mmdc_path}.cmd"
+        if os.path.exists(candidate_cmd):
+            mmdc_path = candidate_cmd
+    if not mmdc_path:
+        raise FileNotFoundError(
+            "mmdc not found. Add it to PATH or set MMDC_PATH to its full path."
+        )
+    cmd = [
+        mmdc_path,
+        "-i",
+        "-",
+        "-o",
+        output_path,
+        "--theme",
+        theme,
+        "--scale",
+        str(scale),
+        "--backgroundColor",
+        "white",
+    ]
+    process = subprocess.run(
+        cmd,
+        input=code.encode(
+            "utf-8"
+        ),  # Explicitly specify UTF-8 encoding to support Chinese
+        capture_output=True,
+        text=False,  # Receive as bytes to avoid encoding issues
+    )
+    if process.returncode != 0:
+        stderr_msg = process.stderr.decode("utf-8", errors="replace")
+        print("Error:", stderr_msg)
+        raise RuntimeError(f"mermaid-cli execution failed: {stderr_msg}")
+def main():
+    demo_code = """
+    flowchart TD
+        A[开始] --> B{Is it?}
+        B -->|Yes| C[OK]
+        C --> D[Rethink]
+        D --> B
+        B ---->|No| E[End]
+    """
+    img_dir = os.path.join(tempfile.gettempdir(), "mermaid2img")
+    os.makedirs(img_dir, exist_ok=True)
+    output_path = os.path.join(img_dir, "output.png")
+    render_mermaid_cli(demo_code, output_path, theme="default", scale=1)
+    print(f"Image saved to: {output_path}")
+if __name__ == "__main__":
+    main()
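
`render_mermaid_cli()` sends the diagram source to the child process as UTF-8 bytes on stdin and decodes stderr itself, which avoids depending on the platform's default encoding. The sketch below shows the same pattern in isolation. The Python interpreter stands in for `mmdc` so the example runs without mermaid-cli installed, and `pipe_utf8` is a hypothetical helper.

```python
import subprocess
import sys

# Stand-in child process: the Python interpreter instead of mmdc.
# It reads raw bytes from stdin and reports how many it received.
CHILD = "import sys; data = sys.stdin.buffer.read(); sys.stdout.write(str(len(data)))"


def pipe_utf8(text):
    """Send text to a child process as UTF-8 bytes and return its stdout.

    Hypothetical helper that mirrors the subprocess.run() call in
    render_mermaid_cli().
    """
    process = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=text.encode("utf-8"),  # explicit encoding, independent of locale
        capture_output=True,
        text=False,  # exchange bytes and decode manually
    )
    if process.returncode != 0:
        raise RuntimeError(process.stderr.decode("utf-8", errors="replace"))
    return process.stdout.decode("utf-8")


if __name__ == "__main__":
    # Each of the two Chinese characters takes three bytes in UTF-8.
    print(pipe_utf8("A[开始]"))
```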

mdproc-0.3.0/src/mdproc/mermaid2img_playwright.py ADDED

@@ -0,0 +1,175 @@
+# -*- coding: utf-8 -*-
+"""
+Render Mermaid diagrams to images using Playwright.
+Alternative to mermaid-cli that uses browser rendering.
+"""
+import os
+import tempfile
+from pathlib import Path
+from playwright.sync_api import sync_playwright
+def render_mermaid_playwright(
+    mermaid_code: str,
+    output_path: str,
+    theme: str = "default",
+    background_color: str = "white",
+    scale: float = 2.0,
+    layout: str = "elk",
+) -> None:
+    """
+    Render mermaid diagram to PNG image using Playwright.
+    Args:
+        mermaid_code: Raw mermaid diagram code (without ```mermaid fences)
+        output_path: Path to save the output PNG image
+        theme: Mermaid theme ("default", "dark", "forest", "neutral")
+        background_color: Background color (CSS color)
+        scale: Device scale factor for higher resolution (default 2.0)
+        layout: Layout engine for flowchart ("dagre" or "elk"). Only applies to flowchart type.
+    Raises:
+        RuntimeError: If rendering fails
+    """
+    # Determine if we need flowchart layout config
+    # ELK layout only works for flowchart diagrams
+    is_flowchart = "flowchart" in mermaid_code.lower()
+    if is_flowchart and layout != "dagre":
+        flowchart_config = f"""
+            flowchart: {{
+                defaultRenderer: '{layout}'
+            }},"""
+    else:
+        flowchart_config = ""
+    # Get the absolute path to your local bundle
+    assets_dir = Path(__file__).parent / "assets"
+    # Copy from https://github.com/Honghe/mermaid-bundle/blob/master/mermaid.bundle.js
+    mermaid_bundle_path = (assets_dir / "mermaid.bundle.js").absolute().as_uri()
+    # HTML template with Mermaid.js
+    html_template = """
+<!DOCTYPE html>
+<html>
+<head>
+    <meta charset="UTF-8">
+    <script src="{mermaid_bundle_path}"></script>
+    <script type="module">
+        mermaid.initialize({{
+            startOnLoad: true,
+            theme: '{theme}',
+            securityLevel: 'loose',
+            {flowchart_config}
+        }});
+    </script>
+    <style>
+        body {{
+            margin: 0;
+            padding: 20px;
+            background-color: {background_color};
+            display: flex;
+            justify-content: center;
+            align-items: center;
+            min-height: 100vh;
+        }}
+        #diagram {{
+            max-width: 100%;
+        }}
+    </style>
+</head>
+<body>
+    <div class="mermaid" id="diagram">
+{mermaid_code}
+    </div>
+</body>
+</html>
+"""
+    html_content = html_template.format(
+        theme=theme,
+        background_color=background_color,
+        mermaid_code=mermaid_code,
+        flowchart_config=flowchart_config,
+        mermaid_bundle_path=mermaid_bundle_path,
+    )
+    # Create temporary HTML file
+    with tempfile.NamedTemporaryFile(
+        mode="w", encoding="utf-8", suffix=".html", delete=False
+    ) as f:
+        temp_html_path = f.name
+        f.write(html_content)
+    try:
+        with sync_playwright() as p:
+            # Launch browser in headless mode
+            browser = p.chromium.launch(
+                headless=True,
+            )
+            context = browser.new_context(
+                viewport={"width": 800, "height": 800},
+                device_scale_factor=scale,
+            )
+            page = context.new_page()
+            # Load HTML file
+            page.goto(Path(temp_html_path).as_uri())
+            # Wait for mermaid to render
+            page.wait_for_selector("#diagram svg", timeout=3000)
+            # Locate the diagram container so the screenshot is cropped to it
+            diagram = page.locator("#diagram")
+            # Take screenshot
+            diagram.screenshot(path=output_path, type="png")
+            browser.close()
+    except Exception as e:
+        raise RuntimeError(f"Failed to render mermaid diagram: {e}") from e
+    finally:
+        # Clean up temporary HTML file
+        if os.path.exists(temp_html_path):
+            os.remove(temp_html_path)
+def main():
+    """Demo: render mermaid diagram using Playwright."""
+    demo_code = """
+flowchart TD
+    A[Start] --> B["Popen()"]
+    B --> C[Child process starts<br>runs independently]
+    B --> D[Main process continues]
+    D --> E{Need<br>child result?}
+    E -->|No| D
+    E -->|Yes| F["P.wait()"]
+    F -->|Blocks until done| G[Child process exits]
+    G --> H[Get returncode]
+    H --> I["Safe to read stdout/stderr<br>(if PIPE was used)"]
+    I --> J[End]
+"""
+    # Create output directory
+    img_dir = os.path.join(tempfile.gettempdir(), "mermaid2img")
+    os.makedirs(img_dir, exist_ok=True)
+    output_path = os.path.join(img_dir, "output_playwright.png")
+    print("Rendering mermaid diagram with Playwright...")
+    render_mermaid_playwright(
+        demo_code,
+        output_path,
+        theme="default",
+        background_color="white",
+        scale=2.0,
+        layout="elk",
+    )
+    print(f"Image saved to: {output_path}")
+if __name__ == "__main__":
+    main()
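The file above depends on two details that are easy to get wrong: literal braces in the embedded CSS and JavaScript must be doubled for `str.format`, and the temporary file path has to become a valid `file://` URL. The following is a minimal sketch of both, not the code shipped in mdproc; `TEMPLATE`, `build_html`, and `to_file_url` are hypothetical names introduced only for illustration.

```python
from pathlib import Path

# Hypothetical, trimmed-down template; not the one shipped in mdproc.
TEMPLATE = """<style>body {{ background-color: {background_color}; }}</style>
<div class="mermaid" id="diagram">
{mermaid_code}
</div>"""


def build_html(mermaid_code: str, background_color: str = "white") -> str:
    """Fill the template; '{{' and '}}' come out as literal braces."""
    return TEMPLATE.format(
        mermaid_code=mermaid_code, background_color=background_color
    )


def to_file_url(path: str) -> str:
    """Turn a filesystem path into a file:// URL on any platform."""
    return Path(path).resolve().as_uri()


html = build_html("flowchart TD\n    A --> B{Is it?}")
```

Braces inside the substituted values are not re-parsed by `str.format`, so Mermaid decision nodes such as `B{Is it?}` pass through unchanged.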

mdproc-0.3.0/src/mdproc/mermaid2img_playwright_cdn.py ADDED

@@ -0,0 +1,169 @@
+# -*- coding: utf-8 -*-
+"""
+Render Mermaid diagrams to images using Playwright.
+Alternative to mermaid-cli that uses browser rendering.
+"""
+import os
+import tempfile
+from pathlib import Path
+from playwright.sync_api import sync_playwright
+def render_mermaid_playwright(
+    mermaid_code: str,
+    output_path: str,
+    theme: str = "default",
+    background_color: str = "white",
+    scale: float = 2.0,
+    layout: str = "elk",
+) -> None:
+    """
+    Render mermaid diagram to PNG image using Playwright.
+    Args:
+        mermaid_code: Raw mermaid diagram code (without ```mermaid fences)
+        output_path: Path to save the output PNG image
+        theme: Mermaid theme ("default", "dark", "forest", "neutral")
+        background_color: Background color (CSS color)
+        scale: Device scale factor for higher resolution (default 2.0)
+        layout: Layout engine for flowchart ("dagre" or "elk"). Only applies to flowchart type.
+    Raises:
+        RuntimeError: If rendering fails
+    """
+    # Determine if we need flowchart layout config
+    # ELK layout only works for flowchart diagrams
+    is_flowchart = "flowchart" in mermaid_code.lower()
+    if is_flowchart and layout != "dagre":
+        flowchart_config = f"""
+            flowchart: {{
+                defaultRenderer: '{layout}'
+            }},"""
+    else:
+        flowchart_config = ""
+    # Local bundle path; unused by the CDN template below (str.format ignores the extra keyword)
+    assets_dir = Path(__file__).parent / "assets"
+    mermaid_bundle_path = (assets_dir / "mermaid.bundle.js").absolute().as_uri()
+    # HTML template with Mermaid.js
+    html_template = """
+<!DOCTYPE html>
+<html>
+<head>
+    <meta charset="UTF-8">
+    <script type="module">
+        import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
+        import elkLayouts from 'https://cdn.jsdelivr.net/npm/@mermaid-js/layout-elk@0/dist/mermaid-layout-elk.esm.min.mjs';
+        mermaid.registerLayoutLoaders(elkLayouts);
+        mermaid.initialize({{
+            startOnLoad: true,
+            theme: '{theme}',
+            securityLevel: 'loose',
+            {flowchart_config}
+        }});
+    </script>
+    <style>
+        body {{
+            margin: 0;
+            padding: 20px;
+            background-color: {background_color};
+            display: flex;
+            justify-content: center;
+            align-items: center;
+            min-height: 100vh;
+        }}
+        #diagram {{
+            max-width: 100%;
+        }}
+    </style>
+</head>
+<body>
+    <div class="mermaid" id="diagram">
+{mermaid_code}
+    </div>
+</body>
+</html>
+"""
+    html_content = html_template.format(
+        theme=theme,
+        background_color=background_color,
+        mermaid_code=mermaid_code,
+        flowchart_config=flowchart_config,
+        mermaid_bundle_path=mermaid_bundle_path,
+    )
+    # Create temporary HTML file
+    with tempfile.NamedTemporaryFile(
+        mode="w", encoding="utf-8", suffix=".html", delete=False
+    ) as f:
+        temp_html_path = f.name
+        f.write(html_content)
+    try:
+        with sync_playwright() as p:
+            # Launch browser in headless mode
+            browser = p.chromium.launch(headless=True)
+            context = browser.new_context(
+                viewport={"width": 800, "height": 800},
+                device_scale_factor=scale,
+            )
+            page = context.new_page()
+            # Load HTML file
+            page.goto(Path(temp_html_path).as_uri())
+            # Wait for mermaid to render
+            page.wait_for_selector("#diagram svg", timeout=3000)
+            # Locate the diagram container so the screenshot is cropped to it
+            diagram = page.locator("#diagram")
+            # Take screenshot
+            diagram.screenshot(path=output_path, type="png")
+            browser.close()
+    except Exception as e:
+        raise RuntimeError(f"Failed to render mermaid diagram: {e}") from e
+    finally:
+        # Clean up temporary HTML file
+        if os.path.exists(temp_html_path):
+            os.remove(temp_html_path)
+def main():
+    """Demo: render mermaid diagram using Playwright."""
+    demo_code = """
+flowchart TD
+    A[Start] --> B{Is it?}
+    B -->|Yes| C[OK]
+    C --> D[Rethink]
+    D --> B
+    B ---->|No| E[End]
+"""
+    # Create output directory
+    img_dir = os.path.join(tempfile.gettempdir(), "mermaid2img")
+    os.makedirs(img_dir, exist_ok=True)
+    output_path = os.path.join(img_dir, "output_playwright.png")
+    print("Rendering mermaid diagram with Playwright...")
+    render_mermaid_playwright(
+        demo_code,
+        output_path,
+        theme="default",
+        background_color="white",
+        scale=2.0,
+        layout="elk",
+    )
+    print(f"Image saved to: {output_path}")
+if __name__ == "__main__":
+    main()
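The layout selection at the top of `render_mermaid_playwright` can be pulled into a small pure function, which makes it checkable without launching a browser. This is only a sketch under that assumption; `flowchart_config_for` is a hypothetical name that does not exist in mdproc, and it deliberately mirrors the substring check used in the diffed code.

```python
def flowchart_config_for(mermaid_code: str, layout: str = "elk") -> str:
    """Return the JS fragment for mermaid.initialize, or '' if none is needed."""
    # Same substring test as the diffed code: it also matches the word
    # "flowchart" appearing anywhere in another diagram type.
    is_flowchart = "flowchart" in mermaid_code.lower()
    if is_flowchart and layout != "dagre":
        # Doubled braces produce literal '{' and '}' inside the f-string.
        return f"flowchart: {{ defaultRenderer: '{layout}' }},"
    return ""


config = flowchart_config_for("flowchart TD\n    A --> B")
```

Because dagre is Mermaid's default flowchart renderer, the function returns an empty fragment for it and for any diagram that is not a flowchart.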

mdproc-0.2.1/.gitignore DELETED

@@ -1,4 +0,0 @@
-.env
-.venv
-__pycache__/
-dist/

mdproc-0.2.1/.vscode/launch.json DELETED

@@ -1,12 +0,0 @@
-{
-    "configurations": [
-        {
-            "name": "Python Debugger: Module",
-            "type": "debugpy",
-            "request": "launch",
-            "module": "mdproc.mdimgupload",
-            "args": ["mdproc/demo.md"],
-            "cwd": "${workspaceFolder}/src",
-        }
-    ]
-}

mdproc-0.2.1/README.md DELETED

@@ -1,44 +0,0 @@
-# mdproc
-A simple Python tool to process markdown files.
-## Features
-- Markdown Image Uploader to COS.
-## Config
-Set the following in a `.env` file or as environment variables:
-```
-COS_SECRET_ID=<xyz>
-COS_SECRET_KEY=<xyz>
-COS_REGION=<xyz>
-COS_BUCKET=<xyz>
-```
-## Usage
-1. Install dependencies:
-    ```bash
-    pip install mdproc
-    ```
-2. Run the script:
-    ```bash
-    mdproc-imgupload your_markdown.md
-    ```
-## Demo
-demo.md:
-```
-![first-version](https://www.python.org/static/img/python-logo.png)
-```
-demo_output.md:
-```
-![first-version](https://pic-1251484506.cos.ap-guangzhou.myqcloud.com/imgs/python-logo_ae79195a.png)
-```
-## License
-Apache License

{mdproc-0.2.1 → mdproc-0.3.0}/LICENSE RENAMED

File without changes

{mdproc-0.2.1 → mdproc-0.3.0}/src/mdproc/cos_uploader.py RENAMED

File without changes

{mdproc-0.2.1 → mdproc-0.3.0}/src/mdproc/demo.md RENAMED

File without changes

{mdproc-0.2.1 → mdproc-0.3.0}/src/mdproc/demo_output.md RENAMED

File without changes

{mdproc-0.2.1 → mdproc-0.3.0}/src/mdproc/mdimgupload.py RENAMED

File without changes

{mdproc-0.2.1 → mdproc-0.3.0}/src/mdproc/mdproc.py RENAMED

File without changes
