slack-markdown-parser 2.4.0__tar.gz → 2.4.2__tar.gz
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/CHANGELOG.md +14 -0
- {slack_markdown_parser-2.4.0/slack_markdown_parser.egg-info → slack_markdown_parser-2.4.2}/PKG-INFO +1 -1
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/docs/spec-ja.md +9 -2
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/docs/spec.md +9 -2
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/pyproject.toml +1 -1
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/slack_markdown_parser/__init__.py +1 -1
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/slack_markdown_parser/converter.py +80 -15
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2/slack_markdown_parser.egg-info}/PKG-INFO +1 -1
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/LICENSE +0 -0
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/MANIFEST.in +0 -0
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/README-ja.md +0 -0
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/README.md +0 -0
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/setup.cfg +0 -0
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/slack_markdown_parser/py.typed +0 -0
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/slack_markdown_parser.egg-info/SOURCES.txt +0 -0
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/slack_markdown_parser.egg-info/dependency_links.txt +0 -0
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/slack_markdown_parser.egg-info/requires.txt +0 -0
- {slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/slack_markdown_parser.egg-info/top_level.txt +0 -0
@@ -6,6 +6,20 @@ The format is based on Keep a Changelog, and the project follows Semantic Versio
 
 ## [Unreleased]
 
+## [2.4.2] - 2026-05-29
+
+### Fixed
+
+- Stopped an unbalanced emphasis delimiter from corrupting unrelated, well-formed spans in the same block. The bold/italic/strikethrough patterns are matched with `re.DOTALL`, so a single stray `**` (for example a whitespace-flanked literal `**` in `閉じ ** が`, or an unclosed marker) shifted marker pairing across the whole block and flipped the protective ZWSP of nearby punctuation-terminated bold to the broken *outer* position, re-exposing the literal markers on Slack. `EMPHASIS_PATTERNS` now enforces CommonMark's minimal flanking requirement — an opening run is not followed by whitespace and a closing run is not preceded by whitespace — so a non-flanking stray marker stays literal and no longer disturbs its neighbours.
+- Bounded the `**` and `~~` emphasis bodies to a single delimiter run so a dangling opener with no valid closer of its own (for example `**oops **` or `**: x **` before a later `**…%**`) can no longer scan past the literal stray and steal a following well-formed span's closing marker, which had misplaced that span's protective ZWSP. The single-`*` italic body is intentionally left unbounded because italics legitimately wrap `**bold**`.
+
+## [2.4.1] - 2026-05-29
+
+### Fixed
+
+- Stopped punctuation-terminated emphasis from leaking its literal markers (`**`, `*`, `~~`) in `markdown` blocks. A ZWSP placed just outside a closing marker broke Slack's CommonMark right-flanking check whenever the last inner character was punctuation (e.g. `- **項目:**` at a line end, or `**70.9%→83.0%**、` before CJK punctuation), exposing the raw markers. Chunk boundaries are now treated as safe so no stray ZWSP is appended at line/text ends, and when a marker sits against inner punctuation a ZWSP is inserted just inside it so the run flanks via rule 2a regardless of the following character — including before CJK text and CJK punctuation that Slack does not accept as a flanking neighbor.
+- Stopped preserving English-like punctuation-flanked emphasis raw when its tight neighbor is non-ASCII punctuation (e.g. `**APIYI (apiyi.com)**。` or `Score **70.9%→83.0%**、`). Slack only accepts ASCII punctuation/whitespace as a flanking neighbor, so these now receive the inner ZWSP protection instead of being emitted unchanged.
+
 ## [2.4.0] - 2026-05-14
 
 ### Added
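The second 2.4.2 entry can be checked in isolation. The sketch below copies the bounded `**` pattern from the converter.py hunk of this diff and compares it with a hypothetical variant that has the same whitespace guards but an unbounded body. That variant is an assumption made for illustration; it is not claimed to be the exact pattern of any released version.

```python
import re

# Bold pattern as shown in the 2.4.2 converter.py hunk: whitespace guards plus
# a body that may not contain another ``**`` run.
BOLD_BOUNDED = re.compile(
    r"(?<!\*)\*\*(?!\s)((?:(?!\*\*).)+?)(?<!\s)\*\*(?!\*)", flags=re.DOTALL
)

# Hypothetical comparison pattern (assumption): same guards, unbounded body.
BOLD_UNBOUNDED = re.compile(
    r"(?<!\*)\*\*(?!\s)(.+?)(?<!\s)\*\*(?!\*)", flags=re.DOTALL
)


def matched_spans(pattern: re.Pattern, text: str) -> list:
    """Return every full token the pattern pairs up in ``text``."""
    return [m.group(0) for m in pattern.finditer(text)]


sample = "**oops ** and **70.9%→83.0%**"
unbounded = matched_spans(BOLD_UNBOUNDED, sample)
bounded = matched_spans(BOLD_BOUNDED, sample)

print(unbounded)  # the dangling opener runs on to the later span's closer
print(bounded)    # only the well-formed span is paired
```

With the unbounded body the dangling `**oops` opener pairs with the final `**`, so the well-formed span is never seen as its own token. Bounding the body leaves the stray markers literal.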
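@@ -166,16 +166,23 @@ LLMs may omit the outer pipes, drop the separator row, produce mismatched column counts
 
 ### Target patterns
 
-
+For the formatting markers below, zero-width spaces (`U+200B`) are inserted only where needed to keep Slack's formatting boundaries intact without changing the visible appearance.
 
 - `` `code` ``: inline code
 - `**bold**`: bold
 - `*italic*`: italic
 - `~~strike~~`: strikethrough
 
+Rules:
+
+- The start and end of a chunk (line start, line end, text edge, or a fenced code block boundary) are treated as safe, and no zero-width space is added there.
+- When one outer side is tight against surrounding non-boundary text, a zero-width space is added on that side only. The safe (boundary) side is left as is.
+- When the inner side of an emphasis marker (`**`, `*`, `~~`) is tight against punctuation (for example `**注意:**` or `**70.9%→83.0%**`), a zero-width space is inserted inside the marker. This makes the marker's inner neighbor a non-punctuation character, so Slack's CommonMark right-/left-flanking check succeeds whatever follows. It also works immediately before CJK text and CJK punctuation (`、` / `。`), which Slack does not accept as a flanking neighbor. Inline code is not subject to flanking rules, so it is excluded from this rule.
+- Emphasis delimiters are recognized only when they satisfy CommonMark's minimal flanking condition: an opening run is not immediately followed by whitespace, and a closing run is not immediately preceded by whitespace. A lone marker with whitespace on both sides (for example the literal `**` in `閉じ ** が`) and any other unpaired marker are left as is. This prevents a single extra marker from shifting the pairing of nearby well-formed formatting and inserting zero-width spaces in the wrong place.
+
 Exception:
 
--
+- When the formatted content is English-like text and the only tight neighbors are **ASCII** punctuation, the original token is kept as is. This avoids adding unnecessary zero-width spaces in cases such as `**APIYI (apiyi.com)**:` that Slack already renders correctly. When non-ASCII punctuation such as `、` or `。` is adjacent, the token is not preserved and is protected by the inner zero-width space described above.
 
 ### Excluded regions
 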
@@ -165,16 +165,23 @@ In languages such as Japanese, Chinese, and Korean that do not usually put space
 
 ### Target patterns
 
-
+The library inserts zero-width spaces (`U+200B`) only where they are needed to keep Slack's formatting boundaries intact, without changing the visible layout, for each formatting token below:
 
 - `` `code` ``: inline code
 - `**bold**`: bold
 - `*italic*`: italic
 - `~~strike~~`: strikethrough
 
+Rules:
+
+- The start and end of a chunk (a line/text boundary, or the edge of a fenced code block) are treated as safe; no zero-width space is added there.
+- When an outer edge is tight against surrounding non-boundary text, only that edge is padded with a zero-width space. The safe (boundary) edge is left clean.
+- When an emphasis marker (`**`, `*`, `~~`) sits directly against punctuation on its inner side (for example `**注意:**` or `**70.9%→83.0%**`), a zero-width space is inserted just *inside* the marker. This makes the marker's inner neighbor a non-punctuation character, so Slack's CommonMark right-/left-flanking check succeeds regardless of what surrounds the token — including before CJK text and CJK punctuation (`、` / `。`), which Slack does not accept as a flanking neighbor. Inline code spans are exempt from this rule because they do not obey flanking rules.
+- Emphasis delimiters are recognized only when they satisfy CommonMark's minimal flanking rule: an opening run is not immediately followed by whitespace, and a closing run is not immediately preceded by whitespace. A stray, whitespace-flanked marker (for example the literal `**` in `閉じ ** が`), or an otherwise unbalanced marker, is left untouched. This prevents one dangling marker from shifting the pairing of nearby well-formed spans and misplacing their zero-width spaces.
+
 Exception:
 
-- If the token body is English-like text and
+- If the token body is English-like text and its only tight neighbors are **ASCII** punctuation characters, the raw token is preserved. This avoids over-correcting spans such as `**APIYI (apiyi.com)**:` that Slack already renders correctly without extra zero-width spaces. A non-ASCII punctuation neighbor such as `、` or `。` is not preserved — it is protected by the inner zero-width space described above.
 
 ### Excluded regions
 
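The first two rules (chunk boundaries are safe, and only a tight outer edge is padded) can be sketched as a small standalone function. `pad_outer_edges` and `BOUNDARY_CHARS` are hypothetical names for this illustration. The library's actual boundary set is not shown in this diff, so plain whitespace is assumed here.

```python
import re

ZWSP = "\u200b"
# Assumption: plain whitespace stands in for the library's boundary characters.
BOUNDARY_CHARS = frozenset(" \t\n")


def pad_outer_edges(source: str, start: int, end: int) -> str:
    """Pad only the outer edges of source[start:end] that touch other text."""
    token = source[start:end]
    # A chunk edge or a boundary character next to the marker is safe.
    before_safe = start == 0 or source[start - 1] in BOUNDARY_CHARS
    after_safe = end == len(source) or source[end] in BOUNDARY_CHARS
    prefix = "" if before_safe else ZWSP
    suffix = "" if after_safe else ZWSP
    return f"{prefix}{token}{suffix}"


def pad_first_bold(source: str) -> str:
    """Locate the first simple ``**...**`` span and pad its outer edges."""
    m = re.search(r"\*\*.+?\*\*", source)
    return pad_outer_edges(source, m.start(), m.end())


at_line_start = pad_first_bold("**太字**です")
mid_sentence = pad_first_bold("これは**太字**です")
spaced = pad_first_bold("see **bold** now")
```

Only edges that touch non-boundary text receive a zero-width space. A span at the start of a line keeps its leading edge clean, and a span surrounded by spaces is returned unchanged.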
{slack_markdown_parser-2.4.0 → slack_markdown_parser-2.4.2}/slack_markdown_parser/converter.py
RENAMED
@@ -32,10 +32,24 @@ STANDALONE_IMAGE_PATTERN = re.compile(
 )
 MARKDOWN_LINK_PATTERN = re.compile(r"\[[^\]\n]+\]\([^\)\n]+\)")
 INLINE_CODE_SPAN_PATTERN = re.compile(r"(?<!`)`[^`\n]+`(?!`)", flags=re.DOTALL)
+# Emphasis delimiters must satisfy CommonMark's minimal flanking requirement:
+# an opening run is not followed by whitespace and a closing run is not preceded
+# by whitespace. Enforcing this keeps a stray, whitespace-flanked delimiter
+# (e.g. the literal ``**`` in ``閉じ ** が``) from being paired at all.
+#
+# For ``**`` and ``~~`` the body additionally may not contain the same delimiter
+# run (``(?:(?!\*\*).)+?`` / ``(?:(?!~~).)+?``). Without this, a dangling opener
+# with no valid closer of its own (``**oops ** and **70.9%→83.0%**``) would scan
+# past the literal stray and steal a *later* well-formed span's closing marker,
+# shifting the pairing and corrupting that span's ZWSP placement. Bounding the
+# body to a single run makes the regex pair the same markers CommonMark does.
+# (The single-``*`` italic body is intentionally not bounded this way: italics
+# legitimately wrap ``**bold**`` and ``*`` is heavily overloaded, so it keeps the
+# whitespace guard only.)
 EMPHASIS_PATTERNS = (
-    re.compile(r"(?<!\*)\*\*(
-    re.compile(r"(?<!\*)\*(?!\*)(.+?)(?<!\*)\*(?!\*)", flags=re.DOTALL),
-    re.compile(r"~~(
+    re.compile(r"(?<!\*)\*\*(?!\s)((?:(?!\*\*).)+?)(?<!\s)\*\*(?!\*)", flags=re.DOTALL),
+    re.compile(r"(?<!\*)\*(?!\*)(?!\s)(.+?)(?<!\s)(?<!\*)\*(?!\*)", flags=re.DOTALL),
+    re.compile(r"~~(?!\s)((?:(?!~~).)+?)(?<!\s)~~", flags=re.DOTALL),
 )
 INLINE_CODE_PLACEHOLDER_PATTERN = re.compile(r"\ufff0code\d+\ufff1")
 PROTECTED_UNDERSCORE_SPAN_PATTERN = re.compile(
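A short check of the whitespace guard, using the new bold and italic patterns copied from the hunk above. `BOLD_NO_GUARD` is a hypothetical pattern with no flanking guard at all, included only for contrast. The removed 2.4.0 lines are truncated in this diff, so it should not be read as the old implementation.

```python
import re

# Copied from the 2.4.2 side of the hunk.
BOLD = re.compile(
    r"(?<!\*)\*\*(?!\s)((?:(?!\*\*).)+?)(?<!\s)\*\*(?!\*)", flags=re.DOTALL
)
ITALIC = re.compile(
    r"(?<!\*)\*(?!\*)(?!\s)(.+?)(?<!\s)(?<!\*)\*(?!\*)", flags=re.DOTALL
)

# Hypothetical contrast pattern (assumption): no whitespace guard, unbounded body.
BOLD_NO_GUARD = re.compile(r"(?<!\*)\*\*(.+?)\*\*(?!\*)", flags=re.DOTALL)


def matched_spans(pattern: re.Pattern, text: str) -> list:
    """Return every full token the pattern pairs up in ``text``."""
    return [m.group(0) for m in pattern.finditer(text)]


stray = "閉じ ** が **注意:**"
no_guard = matched_spans(BOLD_NO_GUARD, stray)
guarded = matched_spans(BOLD, stray)

# The italic body stays unbounded so it can wrap a bold span.
italic_wrapping_bold = matched_spans(ITALIC, "*see **bold** here*")
```

Without a guard, the whitespace-flanked `**` pairs with the opener of `**注意:**` and the real span is lost. With the guard, the stray marker is skipped and the real span is matched on its own.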
@@ -515,6 +529,13 @@ def _should_preserve_raw_punctuation_emphasis(
         return False
     if any(not _is_punctuation_like(char, boundary_chars) for char in tight_chars):
         return False
+    # Slack only accepts ASCII punctuation (and whitespace) as a flanking
+    # neighbor. A non-ASCII punctuation neighbor — e.g. the CJK comma/period
+    # ``、``/``。`` — does not satisfy the right-/left-flanking rule, so the
+    # token must not be preserved raw; it needs the inner-ZWSP protection in
+    # ``wrap_match`` instead.
+    if any(ord(char) > 127 for char in tight_chars):
+        return False
     if any(_is_han_or_kana_char(char) or _is_hangul_char(char) for char in token_text):
         return False
 
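The added guard reduces to one `ord(char) > 127` test over the tight neighbors. The sketch below reproduces that test in isolation. `tight_neighbors` is a hypothetical helper written for this example, since the way the library actually collects `tight_chars` is not visible in this hunk.

```python
def tight_neighbors(source: str, start: int, end: int) -> list:
    """Hypothetical helper: non-whitespace characters touching the token edges."""
    chars = []
    if start > 0 and not source[start - 1].isspace():
        chars.append(source[start - 1])
    if end < len(source) and not source[end].isspace():
        chars.append(source[end])
    return chars


def has_non_ascii_neighbor(tight_chars) -> bool:
    """Same test as the added guard: any neighbor outside the ASCII range."""
    return any(ord(char) > 127 for char in tight_chars)


token = "**APIYI (apiyi.com)**"

ascii_case = token + ":"   # ASCII colon: still eligible to stay raw
cjk_case = token + "。"     # CJK full stop: must not stay raw

ascii_result = has_non_ascii_neighbor(tight_neighbors(ascii_case, 0, len(token)))
cjk_result = has_non_ascii_neighbor(tight_neighbors(cjk_case, 0, len(token)))
```

A `True` result corresponds to the new early `return False`: the token is not preserved raw and falls through to the inner zero-width-space handling in `wrap_match`.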
@@ -703,21 +724,65 @@ def _format_markdown_with_spacing_metadata(text: str) -> tuple[str, list[int]]:
 
     def wrap_match(match: re.Match[str], source: str) -> str:
         start, end = match.start(), match.end()
-
-
+        token = match.group(0)
+        # The start/end of the chunk are effective boundaries: there is no
+        # adjacent text to separate the marker from, so they are safe. Treating
+        # them as unsafe used to append a ZWSP right after a closing marker, and
+        # when the last content character was punctuation (e.g. ``**注意:**``)
+        # the trailing ZWSP made Slack fail the CommonMark right-flanking check
+        # and exposed the literal ``**``.
+        before_safe = start == 0 or source[start - 1] in boundary_chars
+        after_safe = end == len(source) or source[end] in boundary_chars
         if before_safe and after_safe:
-            return
+            return token
         if _should_preserve_raw_punctuation_emphasis(
-            source, start, end,
+            source, start, end, token, boundary_chars
         ):
-            return
-
-        # When
-        #
-        #
-        prefix = ZWSP
-        suffix = ZWSP
-
+            return token
+
+        # When an outer edge is tightly coupled to surrounding text, pad only
+        # that edge so Slack can treat the decoration as a standalone span.
+        # Padding a safe edge is unnecessary noise.
+        prefix = "" if before_safe else ZWSP
+        suffix = "" if after_safe else ZWSP
+
+        # Emphasis markers (``*``/``**``/``~~``) obey CommonMark delimiter-run
+        # flanking rules; inline code spans (``` `…` ```) do not. When an
+        # emphasis marker sits directly against punctuation on its inner side
+        # (``**注意:**``, ``**70%→83%**``) Slack treats the run as a delimiter
+        # only when the *outer* neighbour is whitespace or ASCII punctuation; a
+        # following CJK character or CJK punctuation (e.g. ``、``) — and even a
+        # ZWSP placed just outside the marker — leaves the literal ``**``
+        # exposed. Inserting a ZWSP just *inside* the marker makes its inner
+        # neighbour a non-punctuation character, so the run flanks via rule 2a
+        # regardless of what surrounds the token.
+        marker_char = token[0]
+        if marker_char != "`":
+            marker_len = len(token) - len(token.lstrip(marker_char))
+            open_marker = token[:marker_len]
+            inner = token[marker_len : len(token) - marker_len]
+            close_marker = token[len(token) - marker_len :]
+            inner_prefix = (
+                ZWSP if inner and _is_punctuation_like(inner[0], boundary_chars) else ""
+            )
+            inner_suffix = (
+                ZWSP
+                if inner and _is_punctuation_like(inner[-1], boundary_chars)
+                else ""
+            )
+            if inner_prefix or inner_suffix:
+                token = (
+                    f"{open_marker}{inner_prefix}{inner}{inner_suffix}{close_marker}"
+                )
+                # The inner ZWSP already lets the marker flank correctly, so an
+                # outer ZWSP on the same edge is redundant — and after a closing
+                # marker it is precisely what would re-break rendering.
+                if inner_prefix:
+                    prefix = ""
+                if inner_suffix:
+                    suffix = ""
+
+        return f"{prefix}{token}{suffix}"
 
     def wrap_nested_code_emphasis_match(
         match: re.Match[str],
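The inner-edge step of `wrap_match` can be tried on its own with the sketch below. It follows the marker-splitting logic shown in the hunk, but `is_punctuation_like` is a stand-in built on `unicodedata` (an assumption), because the library's private `_is_punctuation_like` helper and its `boundary_chars` argument are not part of this diff. `protect_inner_edges` is likewise a hypothetical name.

```python
import unicodedata

ZWSP = "\u200b"


def is_punctuation_like(char: str) -> bool:
    """Stand-in (assumption): treat Unicode punctuation categories as punctuation."""
    return unicodedata.category(char).startswith("P")


def protect_inner_edges(token: str) -> str:
    """Insert a ZWSP just inside a marker that touches inner punctuation."""
    marker_char = token[0]
    if marker_char == "`":
        # Inline code does not follow flanking rules, so it is left alone.
        return token
    # Length of the opening delimiter run, e.g. 2 for ``**`` and ``~~``.
    marker_len = len(token) - len(token.lstrip(marker_char))
    open_marker = token[:marker_len]
    inner = token[marker_len : len(token) - marker_len]
    close_marker = token[len(token) - marker_len :]
    inner_prefix = ZWSP if inner and is_punctuation_like(inner[0]) else ""
    inner_suffix = ZWSP if inner and is_punctuation_like(inner[-1]) else ""
    return f"{open_marker}{inner_prefix}{inner}{inner_suffix}{close_marker}"


colon_case = protect_inner_edges("**注意:**")
percent_case = protect_inner_edges("**70.9%→83.0%**")
paren_case = protect_inner_edges("~~(old)~~")
plain_case = protect_inner_edges("**bold**")
code_case = protect_inner_edges("`a.`")
```

The sketch leaves out the outer `prefix`/`suffix` handling. In the diff, an outer zero-width space is dropped on any edge that already received an inner one.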
File without changes