red_quilt 0.6.1 → 0.7.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +19 -0
- data/LICENSE.txt +21 -0
- data/README.md +63 -105
- data/docs/api.md +132 -0
- data/docs/{architecture.md → architecture.ja.md} +3 -3
- data/docs/{arena-usage.md → arena-usage.ja.md} +38 -30
- data/docs/{commonmark-conformance.md → commonmark-conformance.ja.md} +26 -10
- data/lib/red_quilt/cli.rb +1 -0
- data/lib/red_quilt/document.rb +4 -2
- data/lib/red_quilt/renderer/html.rb +8 -2
- data/lib/red_quilt/slug.rb +38 -0
- data/lib/red_quilt/tilt.rb +42 -0
- data/lib/red_quilt/version.rb +1 -1
- data/lib/red_quilt.rb +3 -2
- data/sig/red_quilt.rbs +199 -8
- metadata +15 -10
- data/.rspec +0 -3
- data/ast-spec.md +0 -1227
- data/mise.toml +0 -2
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 45de1415fe8cc64323b390d8cedc4d6691fa6c0ab17337fac8aa956cef8bd107
|
|
4
|
+
data.tar.gz: c3392372dd65fe74149678dbb9b5ebc8cc389b6f44ca4b0013e6eadfd565568d
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: c7f3c0226e9a2e2cac06273147dfd497f45566ba99d766cee30ccd6fbff728669ab1d9ab112c93de33cce405e2cea9c8789e4f431a70c10d546e2bc2309c8c5b
|
|
7
|
+
data.tar.gz: 69cae77089c64fbf8821cc6340e0c0ad8749004fbf03ec5acf1b55daa388404db5a64a276f48267ab9e58a66a21eb2892b9f120b942e105930862c94e29de7fe
|
data/CHANGELOG.md
CHANGED
|
@@ -5,6 +5,25 @@ All notable changes to this project are documented in this file.
|
|
|
5
5
|
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
|
6
6
|
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
7
7
|
|
|
8
|
+
## [0.7.0] - 2026-05-29
|
|
9
|
+
|
|
10
|
+
### Added
|
|
11
|
+
|
|
12
|
+
- Optional Tilt template adapter, registered for the common markdown
|
|
13
|
+
extensions (`.md`, `.markdown`, …).
|
|
14
|
+
- Opt-in heading anchor ids via the `heading_ids:` option on `render_html` /
|
|
15
|
+
`Document#to_html`. Slugs follow GitHub's scheme but preserve Unicode, so
|
|
16
|
+
non-ASCII (e.g. Japanese) headings stay readable; duplicates within a
|
|
17
|
+
document get `-1`, `-2`, … suffixes.
|
|
18
|
+
|
|
19
|
+
### Fixed
|
|
20
|
+
|
|
21
|
+
- `require "red_quilt/cli"` on its own now works (cli.rb requires red_quilt).
|
|
22
|
+
|
|
23
|
+
### Internal
|
|
24
|
+
|
|
25
|
+
- Add LICENSE and gemspec metadata; move the API reference to `docs/api.md`.
|
|
26
|
+
|
|
8
27
|
## [0.6.1] - 2026-05-29
|
|
9
28
|
|
|
10
29
|
### Added
|
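The heading-id scheme described in the 0.7.0 changelog entry above (GitHub-style slugs that keep Unicode, with `-1`, `-2`, … suffixes for duplicates) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the code added in `lib/red_quilt/slug.rb`: the class name `SlugSketch`, the method `slug_for`, and the exact set of characters kept are all hypothetical.

```ruby
# Hypothetical sketch of GitHub-style, Unicode-preserving heading slugs.
# SlugSketch and slug_for are illustrative names, not RedQuilt's API.
class SlugSketch
  def initialize
    # How many times each base slug has been issued for this document.
    @seen = Hash.new(0)
  end

  def slug_for(heading_text)
    base = heading_text.downcase
                       .gsub(/[^\p{L}\p{M}\p{N}_\- ]/, "") # assumed rule: keep letters, marks, digits, "_", "-", space
                       .tr(" ", "-")                       # spaces become hyphens
    count = @seen[base]
    @seen[base] += 1
    # First occurrence keeps the bare slug; repeats get -1, -2, ...
    count.zero? ? base : "#{base}-#{count}"
  end
end

slugs = SlugSketch.new
puts slugs.slug_for("Hello World")
puts slugs.slug_for("はじめに")
puts slugs.slug_for("Hello World")
```

One `SlugSketch` instance per document keeps the duplicate counter scoped to that document, which matches the "duplicates within a document" wording. A real implementation would also have to decide what happens when a suffixed slug collides with a literal heading; this sketch does not handle that.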
data/LICENSE.txt
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
The MIT License (MIT)
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 Masayoshi Takahashi (takahashim)
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in
|
|
13
|
+
all copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
|
21
|
+
THE SOFTWARE.
|
data/README.md
CHANGED
|
@@ -37,85 +37,78 @@ RedQuilt.render_html("Hi <em>tag</em>", allow_html: true)
|
|
|
37
37
|
# => "<p>Hi <em>tag</em></p>\n"
|
|
38
38
|
```
|
|
39
39
|
|
|
40
|
-
|
|
40
|
+
### Options
|
|
41
|
+
|
|
42
|
+
`RedQuilt.parse` and `RedQuilt.render_html` accept:
|
|
41
43
|
|
|
42
|
-
|
|
44
|
+
| Option | Default | Effect |
|
|
45
|
+
|--------|---------|--------|
|
|
46
|
+
| `allow_html:` | `false` | Pass raw HTML through instead of escaping it |
|
|
47
|
+
| `disallow_raw_html:` | `false` | With `allow_html`, still neutralize GFM's dangerous tags (`<script>`, `<iframe>`, …) |
|
|
48
|
+
| `extended_autolinks:` | `false` | GFM: linkify bare `http(s)://` / `www.` / email addresses |
|
|
49
|
+
| `footnotes:` | `false` | GFM footnotes (see below) |
|
|
50
|
+
| `lint:` | `false` | Collect lint diagnostics (empty links, missing image alt, heading-level skips) |
|
|
51
|
+
|
|
52
|
+
### Footnotes (opt-in)
|
|
43
53
|
|
|
44
54
|
```ruby
|
|
45
|
-
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
doc.source_map # Line/column lookup (lazy memoized)
|
|
54
|
-
doc.allow_html? # Check HTML pass-through setting
|
|
55
|
+
RedQuilt.render_html(<<~MD, footnotes: true)
|
|
56
|
+
Here is a reference.[^1]
|
|
57
|
+
|
|
58
|
+
[^1]: And the footnote text.
|
|
59
|
+
MD
|
|
60
|
+
# The reference becomes a superscript link, and a trailing
|
|
61
|
+
# <section class="footnotes"> lists the referenced definitions (in
|
|
62
|
+
# first-reference order) with backrefs.
|
|
55
63
|
```
|
|
56
64
|
|
|
57
|
-
###
|
|
65
|
+
### Diagnostics
|
|
58
66
|
|
|
59
|
-
|
|
60
|
-
node = doc.root.children.first
|
|
67
|
+
Parsing never raises on malformed input; warnings are collected on the document.
|
|
61
68
|
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
node.text # String (concatenated child text)
|
|
69
|
+
```ruby
|
|
70
|
+
doc = RedQuilt.parse("[x](javascript:alert(1))", lint: true)
|
|
71
|
+
doc.diagnostics.map(&:rule) # => [:unsafe_url]
|
|
72
|
+
doc.diagnostics.first.severity # => :warning
|
|
73
|
+
```
|
|
68
74
|
|
|
69
|
-
|
|
70
|
-
node.source_span # SourceSpan with start_byte, end_byte
|
|
75
|
+
### Heading anchors (opt-in)
|
|
71
76
|
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
77
|
+
`render_html` / `to_html` accept `heading_ids:` to give every heading a
|
|
78
|
+
slugified `id` for anchor links. Slugs follow GitHub's scheme but keep Unicode
|
|
79
|
+
intact, so Japanese headings stay readable; duplicates get `-1`, `-2` suffixes.
|
|
75
80
|
|
|
76
|
-
|
|
77
|
-
|
|
81
|
+
```ruby
|
|
82
|
+
RedQuilt.render_html("# Hello World\n\n## はじめに", heading_ids: true)
|
|
83
|
+
# => "<h1 id=\"hello-world\">Hello World</h1>\n<h2 id=\"はじめに\">はじめに</h2>\n"
|
|
78
84
|
```
|
|
79
85
|
|
|
80
|
-
###
|
|
86
|
+
### Tilt integration
|
|
87
|
+
|
|
88
|
+
RedQuilt ships a [Tilt](https://github.com/jeremyevans/tilt) adapter.
|
|
89
|
+
NOTE: It is not loaded by default; require it explicitly and add `tilt` to your own bundle:
|
|
81
90
|
|
|
82
91
|
```ruby
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
|
|
92
|
+
require "red_quilt/tilt"
|
|
93
|
+
|
|
94
|
+
Tilt.new("page.md").render # => HTML
|
|
95
|
+
Tilt.new("page.md", footnotes: true).render
|
|
87
96
|
```
|
|
88
97
|
|
|
89
|
-
|
|
90
|
-
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
-
|
|
94
|
-
-
|
|
95
|
-
-
|
|
96
|
-
-
|
|
97
|
-
- Block quotes: `> quote text`
|
|
98
|
-
- Lists: Ordered (`1.`) and unordered (`-`, `*`, `+`)
|
|
99
|
-
- List items: Nested blocks, tight/loose detection
|
|
100
|
-
- Tables: GFM syntax with header/body rows
|
|
101
|
-
- Raw HTML blocks: 7 types (script, comment, etc.)
|
|
102
|
-
- Link reference definitions: `[foo]: /url "title"`
|
|
103
|
-
|
|
104
|
-
### Inline elements
|
|
105
|
-
|
|
106
|
-
- Text: Plain strings
|
|
107
|
-
- Emphasis/Strong: `*em*`, `**strong**`, `_em_`, `__strong__`
|
|
108
|
-
- Code spans: `` `code` ``
|
|
109
|
-
- Links: `[text](/url)`, `[text](/url "title")`, reference links
|
|
110
|
-
- Images: `![alt](/url)`, `![alt](/url "title")`, reference images
|
|
111
|
-
- Soft/Hard line breaks: Implicit (soft) and explicit `\` or two spaces
|
|
112
|
-
- Raw HTML inline: `<a href="#">link</a>`
|
|
113
|
-
- Autolinks: `<http://example.com>`, `<user@example.com>`
|
|
114
|
-
- Character references: `&`, `'`, etc.
|
|
98
|
+
Native options (`allow_html:`, `footnotes:`, …) pass straight through; Tilt's `escape_html:` convention is also honored.
|
|
99
|
+
|
|
100
|
+
## Documentation
|
|
101
|
+
|
|
102
|
+
- [API reference](docs/api.md) — `Document` / `NodeRef` / `SourceSpan`, supported syntax, and usage examples
|
|
103
|
+
- [Architecture overview](docs/architecture.ja.md) (Japanese)
|
|
104
|
+
- [Arena usage guide](docs/arena-usage.ja.md) (Japanese)
|
|
105
|
+
- [CommonMark conformance notes](docs/commonmark-conformance.ja.md) (Japanese)
|
|
115
106
|
|
|
116
107
|
## CommonMark Compatibility
|
|
117
108
|
|
|
118
109
|
RedQuilt achieves 100% compliance with the CommonMark v0.31.2 specification.
|
|
110
|
+
See the [conformance notes](docs/commonmark-conformance.ja.md) for GFM
|
|
111
|
+
extensions and intentional deviations.
|
|
119
112
|
|
|
120
113
|
## Command-line Tool
|
|
121
114
|
|
|
@@ -139,8 +132,11 @@ redquilt --format json input.md
|
|
|
139
132
|
# Standalone HTML document with title
|
|
140
133
|
redquilt --standalone --title "My Document" input.md
|
|
141
134
|
|
|
142
|
-
# Enable GFM extended autolinks
|
|
143
|
-
redquilt --extended-autolinks input.md
|
|
135
|
+
# Enable GFM extended autolinks / footnotes
|
|
136
|
+
redquilt --extended-autolinks --footnotes input.md
|
|
137
|
+
|
|
138
|
+
# Standalone page with the bare template (no embedded CSS)
|
|
139
|
+
redquilt --theme none input.md
|
|
144
140
|
```
|
|
145
141
|
|
|
146
142
|
### Options
|
|
@@ -148,12 +144,16 @@ redquilt --extended-autolinks input.md
|
|
|
148
144
|
```
|
|
149
145
|
--format FORMAT Output format: html (default), ast, json
|
|
150
146
|
--allow-html Pass raw HTML through to the output
|
|
147
|
+
--disallow-raw-html With --allow-html, filter GFM's dangerous tags
|
|
151
148
|
--extended-autolinks Linkify bare URLs and email addresses (GFM)
|
|
149
|
+
--footnotes Enable GFM footnotes
|
|
150
|
+
--lint Collect lint diagnostics
|
|
152
151
|
--[no-]standalone Wrap HTML in full document (default: on)
|
|
153
152
|
--auto-title Use the first heading's text as <title>
|
|
154
153
|
--title TITLE Explicit <title> text
|
|
155
154
|
--lang LANG html lang attribute (default: "en")
|
|
156
155
|
--css URL Add a stylesheet link
|
|
156
|
+
--theme THEME Embedded stylesheet: default (default) or none
|
|
157
157
|
--diagnostics Print diagnostics to stderr
|
|
158
158
|
--diagnostics-only Print diagnostics only (suppress output)
|
|
159
159
|
-h, --help Show help
|
|
@@ -187,6 +187,8 @@ In link/image destinations, only these schemes are permitted:
|
|
|
187
187
|
|
|
188
188
|
All other schemes (`javascript:`, `data:`, `vbscript:`, etc.) are blocked by replacing the URL with an empty string.
|
|
189
189
|
|
|
190
|
+
Autolinks (`<scheme:...>`) follow CommonMark and allow arbitrary schemes, so they use a denylist instead: only the script-executing schemes `javascript:`, `vbscript:`, and `data:` are blocked.
|
|
191
|
+
|
|
190
192
|
### Opting into HTML pass-through
|
|
191
193
|
|
|
192
194
|
```ruby
|
|
@@ -196,50 +198,6 @@ RedQuilt.render_html(user_markdown, allow_html: true)
|
|
|
196
198
|
# This passes HTML blocks and inline tags through unchanged
|
|
197
199
|
```
|
|
198
200
|
|
|
199
|
-
## Usage Examples
|
|
200
|
-
|
|
201
|
-
### Extract all headings
|
|
202
|
-
|
|
203
|
-
```ruby
|
|
204
|
-
doc = RedQuilt.parse(source)
|
|
205
|
-
headings = doc.root.find_all(:heading)
|
|
206
|
-
|
|
207
|
-
headings.each do |node|
|
|
208
|
-
level = node.to_h[:attributes][:level]
|
|
209
|
-
text = node.text
|
|
210
|
-
puts "#{'#' * level} #{text}"
|
|
211
|
-
end
|
|
212
|
-
```
|
|
213
|
-
|
|
214
|
-
### Walk the AST with line numbers
|
|
215
|
-
|
|
216
|
-
```ruby
|
|
217
|
-
doc = RedQuilt.parse(source)
|
|
218
|
-
|
|
219
|
-
doc.root.walk do |node|
|
|
220
|
-
loc = node.source_location
|
|
221
|
-
if loc
|
|
222
|
-
puts "#{node.type} at line #{loc[:start_line]}"
|
|
223
|
-
end
|
|
224
|
-
end
|
|
225
|
-
```
|
|
226
|
-
|
|
227
|
-
### Export and transform
|
|
228
|
-
|
|
229
|
-
```ruby
|
|
230
|
-
doc = RedQuilt.parse("# Title\n\nBody with [link](/url)")
|
|
231
|
-
ast = doc.to_ast
|
|
232
|
-
|
|
233
|
-
# Print AST structure (for debugging)
|
|
234
|
-
pp ast
|
|
235
|
-
|
|
236
|
-
# Process nodes
|
|
237
|
-
doc.root.find_all(:link).each do |link|
|
|
238
|
-
attrs = link.to_h[:attributes]
|
|
239
|
-
puts "Link: #{link.text} → #{attrs[:destination]}"
|
|
240
|
-
end
|
|
241
|
-
```
|
|
242
|
-
|
|
243
201
|
## Development
|
|
244
202
|
|
|
245
203
|
### Running tests
|
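The README hunk above describes two URL policies: link/image destinations use an allowlist and blocked URLs are replaced with an empty string, while autolinks use a denylist of `javascript:`, `vbscript:`, and `data:`. A minimal sketch of that split might look like the following. The helper names are hypothetical, the contents of `ALLOWED_SCHEMES` are an assumption (the real list is outside the lines shown in this diff), and treating scheme-less relative URLs as allowed is also an assumption.

```ruby
# Hypothetical sketch of the two URL policies; not RedQuilt's implementation.
# ASSUMPTION: the real allowlist is not visible in this diff.
ALLOWED_SCHEMES = %w[http https mailto].freeze
# From the README text: the script-executing schemes blocked in autolinks.
DENIED_AUTOLINK_SCHEMES = %w[javascript vbscript data].freeze

# Returns the lowercased scheme, or nil when the URL has none (relative URL).
def scheme_of(url)
  url[/\A([A-Za-z][A-Za-z0-9+.\-]*):/, 1]&.downcase
end

# Link/image destinations: allowlist. Blocked URLs become "".
# ASSUMPTION: URLs without a scheme pass through unchanged.
def sanitize_destination(url)
  scheme = scheme_of(url)
  (scheme.nil? || ALLOWED_SCHEMES.include?(scheme)) ? url : ""
end

# Autolinks: arbitrary schemes are fine unless they are on the denylist.
def autolink_allowed?(url)
  !DENIED_AUTOLINK_SCHEMES.include?(scheme_of(url))
end

puts sanitize_destination("javascript:alert(1)").inspect
puts autolink_allowed?("irc://example.com/chan")
```

The allowlist fails closed for unknown schemes, which suits untrusted destinations. The denylist keeps CommonMark's arbitrary-scheme autolinks working while still rejecting the three schemes the README names.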
data/docs/api.md
ADDED
|
@@ -0,0 +1,132 @@
|
|
|
1
|
+
# RedQuilt API Reference
|
|
2
|
+
|
|
3
|
+
Detailed API, supported syntax, and usage examples. For installation and a
|
|
4
|
+
quick start, see the [README](../README.md).
|
|
5
|
+
|
|
6
|
+
## Document
|
|
7
|
+
|
|
8
|
+
```ruby
|
|
9
|
+
doc = RedQuilt.parse("# Title\n\nBody")
|
|
10
|
+
|
|
11
|
+
doc.root # Root node (NodeRef)
|
|
12
|
+
doc.walk # Traverse all nodes (block: { |node| ... } or Enumerator)
|
|
13
|
+
doc.to_html # Render as HTML (see options below)
|
|
14
|
+
doc.to_ast # Export complete AST as Hash
|
|
15
|
+
doc.to_json # Export as MDAST-compatible JSON
|
|
16
|
+
doc.to_mdast # Export as MDAST Hash
|
|
17
|
+
doc.source_map # Line/column lookup (lazy memoized)
|
|
18
|
+
doc.diagnostics # Array of RedQuilt::Diagnostic collected while parsing
|
|
19
|
+
doc.allow_html? # Check HTML pass-through setting
|
|
20
|
+
doc.disallow_raw_html? # Check GFM disallowed-raw-HTML filtering setting
|
|
21
|
+
|
|
22
|
+
# Standalone document with an embedded theme:
|
|
23
|
+
doc.to_html(standalone: true, theme: :default, title: "My Doc", lang: "en")
|
|
24
|
+
# theme: :default (compact, dark-mode-aware stylesheet) or :none (bare).
|
|
25
|
+
# css: "style.css" links an external stylesheet instead.
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
## NodeRef (AST node wrapper)
|
|
29
|
+
|
|
30
|
+
```ruby
|
|
31
|
+
node = doc.root.children.first
|
|
32
|
+
|
|
33
|
+
# Traversal
|
|
34
|
+
node.type # :heading, :paragraph, :link, etc. (Symbol)
|
|
35
|
+
node.children # Array[NodeRef]
|
|
36
|
+
node.walk # Enumerator[NodeRef] or { |node| ... } block
|
|
37
|
+
node.find_all(:link) # Array[NodeRef] with matching type
|
|
38
|
+
node.text # String (concatenated child text)
|
|
39
|
+
|
|
40
|
+
# Position information (byte offset)
|
|
41
|
+
node.source_span # SourceSpan with start_byte, end_byte
|
|
42
|
+
|
|
43
|
+
# Position information (line/column)
|
|
44
|
+
node.source_location # { start_line, start_column, end_line, end_column }
|
|
45
|
+
# line: 1-indexed, column: 0-indexed (character-based)
|
|
46
|
+
|
|
47
|
+
# AST export
|
|
48
|
+
node.to_h # Export subtree as Hash[Symbol, untyped]
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
## SourceSpan
|
|
52
|
+
|
|
53
|
+
```ruby
|
|
54
|
+
span = node.source_span
|
|
55
|
+
span.start_byte # Integer (0-indexed byte offset)
|
|
56
|
+
span.end_byte # Integer (exclusive)
|
|
57
|
+
span.length # Computed: end_byte - start_byte
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
## Supported Syntax
|
|
61
|
+
|
|
62
|
+
### Block elements
|
|
63
|
+
|
|
64
|
+
- Paragraphs: Plain text blocks
|
|
65
|
+
- Headings: ATX headings (`# Title`)
|
|
66
|
+
- Thematic breaks: `---`, `***`, `___`
|
|
67
|
+
- Code blocks: Indented and fenced (with info string)
|
|
68
|
+
- Block quotes: `> quote text`
|
|
69
|
+
- Lists: Ordered (`1.`) and unordered (`-`, `*`, `+`)
|
|
70
|
+
- List items: Nested blocks, tight/loose detection
|
|
71
|
+
- Tables: GFM syntax with header/body rows
|
|
72
|
+
- Raw HTML blocks: 7 types (script, comment, etc.)
|
|
73
|
+
- Link reference definitions: `[foo]: /url "title"`
|
|
74
|
+
- Footnote definitions: `[^label]: …` (GFM, opt-in via `footnotes: true`)
|
|
75
|
+
|
|
76
|
+
### Inline elements
|
|
77
|
+
|
|
78
|
+
- Text: Plain strings
|
|
79
|
+
- Emphasis/Strong: `*em*`, `**strong**`, `_em_`, `__strong__`
|
|
80
|
+
- Strikethrough: `~~text~~` (GFM)
|
|
81
|
+
- Code spans: `` `code` ``
|
|
82
|
+
- Links: `[text](/url)`, `[text](/url "title")`, reference links
|
|
83
|
+
- Images: `![alt](/url)`, `![alt](/url "title")`, reference images
|
|
84
|
+
- Soft/Hard line breaks: Implicit (soft) and explicit `\` or two spaces
|
|
85
|
+
- Raw HTML inline: `<a href="#">link</a>`
|
|
86
|
+
- Autolinks: `<http://example.com>`, `<user@example.com>`
|
|
87
|
+
- Footnote references: `[^label]` (GFM, opt-in via `footnotes: true`)
|
|
88
|
+
- Character references: `&`, `'`, etc.
|
|
89
|
+
|
|
90
|
+
## Usage Examples
|
|
91
|
+
|
|
92
|
+
### Extract all headings
|
|
93
|
+
|
|
94
|
+
```ruby
|
|
95
|
+
doc = RedQuilt.parse(source)
|
|
96
|
+
headings = doc.root.find_all(:heading)
|
|
97
|
+
|
|
98
|
+
headings.each do |node|
|
|
99
|
+
level = node.to_h[:attributes][:level]
|
|
100
|
+
text = node.text
|
|
101
|
+
puts "#{'#' * level} #{text}"
|
|
102
|
+
end
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
### Walk the AST with line numbers
|
|
106
|
+
|
|
107
|
+
```ruby
|
|
108
|
+
doc = RedQuilt.parse(source)
|
|
109
|
+
|
|
110
|
+
doc.root.walk do |node|
|
|
111
|
+
loc = node.source_location
|
|
112
|
+
if loc
|
|
113
|
+
puts "#{node.type} at line #{loc[:start_line]}"
|
|
114
|
+
end
|
|
115
|
+
end
|
|
116
|
+
```
|
|
117
|
+
|
|
118
|
+
### Export and transform
|
|
119
|
+
|
|
120
|
+
```ruby
|
|
121
|
+
doc = RedQuilt.parse("# Title\n\nBody with [link](/url)")
|
|
122
|
+
ast = doc.to_ast
|
|
123
|
+
|
|
124
|
+
# Print AST structure (for debugging)
|
|
125
|
+
pp ast
|
|
126
|
+
|
|
127
|
+
# Process nodes
|
|
128
|
+
doc.root.find_all(:link).each do |link|
|
|
129
|
+
attrs = link.to_h[:attributes]
|
|
130
|
+
puts "Link: #{link.text} → #{attrs[:destination]}"
|
|
131
|
+
end
|
|
132
|
+
```
|
|
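The position conventions listed in `docs/api.md` above (byte spans with an exclusive `end_byte`; 1-indexed lines and 0-indexed, character-based columns) can be illustrated without the gem. The sketch below is hypothetical: `SpanSketch` and `line_and_column` are stand-ins rather than RedQuilt's `SourceSpan` or source map, and the byte offset is assumed to fall on a character boundary.

```ruby
# Hypothetical stand-in for a byte span; only the documented readers are modeled.
SpanSketch = Struct.new(:start_byte, :end_byte) do
  # end_byte is exclusive, so the length is a plain difference.
  def length
    end_byte - start_byte
  end
end

# Converts a byte offset into [line, column]:
# line is 1-indexed, column is 0-indexed and counted in characters.
# ASSUMPTION: byte_offset lies on a character boundary.
def line_and_column(source, byte_offset)
  prefix = source.byteslice(0, byte_offset)
  line = prefix.count("\n") + 1
  last_newline = prefix.rindex("\n")
  column = last_newline ? prefix.length - last_newline - 1 : prefix.length
  [line, column]
end

source = "# Title\n\nこんにちは world"
span = SpanSketch.new(9, 24)
puts source.byteslice(span.start_byte, span.length)
p line_and_column(source, span.end_byte)
```

The Japanese text occupies 15 bytes but only 5 characters, which is why byte spans and character-based columns have to be tracked separately.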
@@ -14,7 +14,7 @@ Source (Markdown String)
|
|
|
14
14
|
│ (list.rb, blockquote.rb, reference_definition.rb)
|
|
15
15
|
│
|
|
16
16
|
▼Arena (raw inline spans)
|
|
17
|
-
│ body text of paragraph / heading / table cell: byte span
|
|
17
|
+
│ body text of paragraph / heading / table cell is held as a byte span or a str1 literal
|
|
18
18
|
│
|
|
19
19
|
▼InlinePass (lib/red_quilt/inline_pass.rb)
|
|
20
20
|
│ ├─Inline::Lexer (lib/red_quilt/inline/lexer.rb)
|
|
@@ -47,9 +47,9 @@ Minimal preprocessing per CommonMark §2.3/2.4. Line-ending normalization of `\r\n`/`\r` → `\n` and
|
|
|
47
47
|
|
|
48
48
|
### InlinePass / Lexer / Builder
|
|
49
49
|
- Target selection: scans and processes each inline target (paragraph / heading / table cell).
|
|
50
|
-
- Lexer: the target's byte span
|
|
50
|
+
- Lexer: scans the target's byte span, or the range of its str1 literal, into Tokens (parallel array).
|
|
51
51
|
- Builder ① linear pass: resolves code spans / links / images / autolinks / simple inlines.
|
|
52
|
-
- Builder ② process_emphasis: collapses the delimiter stack to finalize emphasis / strong (CommonMark §6.2).
|
|
52
|
+
- Builder ② process_emphasis: collapses the delimiter stack to finalize emphasis / strong / strikethrough (CommonMark §6.2; strikethrough is a GFM extension).
|
|
53
53
|
- Footnote references: resolves `[^label]` via `FootnoteRegistry`, numbers them in first-reference order, and generates `FOOTNOTE_REFERENCE`.
|
|
54
54
|
|
|
55
55
|
### FootnotePass (`footnotes: true`)
|
|
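The first-reference numbering mentioned for footnote references in this hunk (and in the README, which says the footnotes section lists referenced definitions in first-reference order) can be sketched in isolation. This is a hypothetical, line-based illustration only: `number_footnotes` is not a RedQuilt method, it assumes single-line definitions, and it ignores code spans and the other contexts a real parser must respect.

```ruby
# Hypothetical sketch: number footnotes in the order they are first referenced.
# Returns [[number, label, definition_text], ...] for referenced labels only.
def number_footnotes(markdown)
  definitions = {}

  # Pull out single-line definitions such as "[^label]: text".
  body = markdown.lines.reject do |line|
    match = line.match(/\A\[\^([^\]]+)\]:\s*(.*)$/)
    definitions[match[1]] = match[2].strip if match
    match
  end.join

  # Remaining "[^label]" occurrences are references; keep first-seen order
  # and drop labels that have no definition.
  order = body.scan(/\[\^([^\]]+)\]/).flatten.uniq
              .select { |label| definitions.key?(label) }

  order.each_with_index.map { |label, index| [index + 1, label, definitions[label]] }
end

sample = "B first.[^b] Then a.[^a] Again b.[^b]\n\n[^a]: Alpha\n[^b]: Beta\n[^c]: Unused\n"
number_footnotes(sample).each { |number, label, text| puts "#{number}. [#{label}] #{text}" }
```

Here `b` is numbered 1 even though `a` is defined first, and the unreferenced `c` is omitted, which matches the "referenced definitions, in first-reference order" description.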
@@ -78,19 +78,19 @@ The key points for working with Arena are as follows.
|
|
|
78
78
|
|
|
79
79
|
---
|
|
80
80
|
|
|
81
|
-
## 1
|
|
81
|
+
## 1. Design essentials
|
|
82
82
|
|
|
83
83
|
Arena represents the AST not as a "tree of objects" but as a [parallel array](https://en.wikipedia.org/wiki/Parallel_array).
|
|
84
84
|
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
|
|
85
|
+
- Nodes are identified by an integer ID (`node_id`)
|
|
86
|
+
- The attributes for each ID (parent / source span / payload) are kept as columns, each in its own Array
|
|
87
|
+
- Adding a node takes only an assignment at the end of each Array and creates no new Ruby objects at all
|
|
88
88
|
|
|
89
89
|
As a result, Arena has the following properties.
|
|
90
90
|
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
91
|
+
- Hot paths only need to pass around Integer IDs
|
|
92
|
+
- Good memory locality and low GC pressure
|
|
93
|
+
- Nodes can be treated as "lightweight" values, which makes the Renderer / Builder easy to inline
|
|
94
94
|
|
|
95
95
|
### Column list
|
|
96
96
|
|
|
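The parallel-array design described in this hunk (integer node ids, one Array per attribute, nodes added by appending to every column) can be sketched in miniature. `MiniArena` below is a hypothetical illustration and not `RedQuilt::Arena`: it models only a few columns, uses Symbols where the real arena uses Integer NodeType constants, and its method bodies are assumptions modeled on the behavior the document describes.

```ruby
# Hypothetical miniature of a parallel-array arena; not RedQuilt::Arena.
class MiniArena
  NO_NODE = -1 # sentinel: no parent / child / sibling

  def initialize(source)
    @source = source
    @type = []
    @parent = []
    @first_child = []
    @last_child = []
    @next_sibling = []
    @source_start = []
    @source_len = []
    @str1 = []
  end

  # Appends one entry to every column and returns the new node id.
  def add_node(type, source_start: -1, source_len: 0, str1: nil)
    id = @type.length
    @type << type
    @parent << NO_NODE
    @first_child << NO_NODE
    @last_child << NO_NODE
    @next_sibling << NO_NODE
    @source_start << source_start
    @source_len << source_len
    @str1 << str1
    id
  end

  # Links child_id as the last child of parent_id.
  def append_child(parent_id, child_id)
    @parent[child_id] = parent_id
    last = @last_child[parent_id]
    if last == NO_NODE
      @first_child[parent_id] = child_id
    else
      @next_sibling[last] = child_id
    end
    @last_child[parent_id] = child_id
  end

  # Yields child ids by following sibling links; no Array is built.
  def each_child(parent_id)
    id = @first_child[parent_id]
    while id != NO_NODE
      yield id
      id = @next_sibling[id]
    end
  end

  # str1 wins; otherwise slice the source; nil when the node has neither.
  def text(id)
    return @str1[id] if @str1[id]
    return nil if @source_start[id] < 0

    @source.byteslice(@source_start[id], @source_len[id])
  end

  def type(id)
    @type[id]
  end
end

arena = MiniArena.new("hello *world*")
para = arena.add_node(:paragraph, source_start: 0, source_len: 13)
text_id = arena.add_node(:text, source_start: 0, source_len: 6)
em_id = arena.add_node(:emphasis, source_start: 6, source_len: 7)
arena.append_child(para, text_id)
arena.append_child(para, em_id)
arena.each_child(para) { |id| puts "#{arena.type(id)}: #{arena.text(id).inspect}" }
```

Because every node is just an index into the columns, traversal touches only Integers and the shared source String, which is the property the text credits for low GC pressure.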
@@ -98,13 +98,13 @@ Arena represents the AST not as a "tree of objects" but as a [parallel array](h
|
|
|
98
98
|
|------|------|
|
|
99
99
|
| `@type` | NodeType (Integer定数) |
|
|
100
100
|
| `@parent` / `@first_child` / `@last_child` / `@next_sibling` / `@prev_sibling` | Parent, child, and sibling links. Values are node ids (`NO_NODE` means "none") |
|
|
101
|
-
| `@source_start` / `@source_len` | Byte range within the document source. `source_start < 0` means "span
|
|
101
|
+
| `@source_start` / `@source_len` | Byte range within the document source. `source_start < 0` means "no span" |
|
|
102
102
|
| `@int1` / `@int2` / `@int3` | Integer slots whose meaning depends on the NodeType (default `0`) |
|
|
103
103
|
| `@str1` / `@str2` | String slots whose meaning depends on the NodeType (default `nil`) |
|
|
104
104
|
|
|
105
105
|
---
|
|
106
106
|
|
|
107
|
-
## 2
|
|
107
|
+
## 2. Invariants
|
|
108
108
|
|
|
109
109
|
These assumptions always hold when working with Arena.
|
|
110
110
|
|
|
@@ -117,7 +117,7 @@ These assumptions always hold when working with Arena.
|
|
|
117
117
|
4. `NO_NODE` = -1
|
|
118
118
|
A sentinel meaning that no parent or sibling exists. It is available as the `Arena::NO_NODE` constant.
|
|
119
119
|
5. `source_start < 0` means "no span"
|
|
120
|
-
|
|
120
|
+
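In this case a leaf node usually keeps its content as a literal in `@str1` (e.g. a paragraph unwrapped from a blockquote, or TEXT after entity decoding). However, some NodeTypes, such as container inlines, have no span and still do not use `str1`, building their content from child nodes instead.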
|
|
121
121
|
|
|
122
122
|
---
|
|
123
123
|
|
|
@@ -125,7 +125,7 @@ These assumptions always hold when working with Arena.
|
|
|
125
125
|
|
|
126
126
|
Reading Arena's public methods as the following 3 layers makes the intent easier to grasp.
|
|
127
127
|
|
|
128
|
-
### 3.1Structure operations (mutators)
|
|
128
|
+
### 3.1 Structure operations (mutators)
|
|
129
129
|
|
|
130
130
|
APIs for building and editing the tree. They assume you pass a `valid id`, so safety checks are minimal.
|
|
131
131
|
|
|
@@ -139,7 +139,7 @@ Reading Arena's public methods as the following 3 layers makes the intent
|
|
|
139
139
|
| `update_span(id, start_byte, end_byte)` | Resets the source span |
|
|
140
140
|
| `update_str1(id, value)` / `update_int3(id, value)` | Rewrites an individual slot |
|
|
141
141
|
|
|
142
|
-
### 3.2Structure lookup (raw id accessors)
|
|
142
|
+
### 3.2 Structure lookup (raw id accessors)
|
|
143
143
|
|
|
144
144
|
These return raw column values, which may be `NO_NODE`. The `raw_X_id` naming convention signals that "the return value is a node id and may be -1 (`NO_NODE`)".
|
|
145
145
|
|
|
@@ -149,7 +149,7 @@ Reading Arena's public methods as the following 3 layers makes the intent
|
|
|
149
149
|
| `raw_first_child_id(id)` / `raw_last_child_id(id)` | Child id or `NO_NODE` |
|
|
150
150
|
| `raw_next_sibling_id(id)` / `raw_prev_sibling_id(id)` | Sibling id or `NO_NODE` |
|
|
151
151
|
|
|
152
|
-
### 3.3Payload lookup (column accessors)
|
|
152
|
+
### 3.3 Payload lookup (column accessors)
|
|
153
153
|
|
|
154
154
|
These return each column as-is. Infer from the return type whether a sentinel may be returned.
|
|
155
155
|
|
|
@@ -161,7 +161,7 @@ Reading Arena's public methods as the following 3 layers makes the intent
|
|
|
161
161
|
| `int1(id)` / `int2(id)` / `int3(id)` | Integer (default 0) |
|
|
162
162
|
| `str1(id)` / `str2(id)` | String or `nil` |
|
|
163
163
|
|
|
164
|
-
### 3.4Semantic accessors
|
|
164
|
+
### 3.4 Semantic accessors
|
|
165
165
|
|
|
166
166
|
These interpret the low-level columns and return "easy-to-use values". They may return `nil` in order to state explicitly that a value is absent.
|
|
167
167
|
|
|
@@ -170,7 +170,7 @@ Reading Arena's public methods as the following 3 layers makes the intent
|
|
|
170
170
|
| `source_span(id)` | `SourceSpan` or `nil` (when there is no span) |
|
|
171
171
|
| `text(id)` | `str1` if present, otherwise `source.byteslice(...)`; `nil` if neither is available |
|
|
172
172
|
|
|
173
|
-
### 3.5Traversal
|
|
173
|
+
### 3.5 Traversal
|
|
174
174
|
|
|
175
175
|
| Method | Purpose |
|
|
176
176
|
|----------|------|
|
|
@@ -188,8 +188,8 @@ Reading Arena's public methods as the following 3 layers makes the intent
|
|
|
188
188
|
| NodeType | int1 | int2 | int3 | str1 | str2 |
|
|
189
189
|
|----------|------|------|------|------|------|
|
|
190
190
|
| `DOCUMENT` | - | - | - | - | - |
|
|
191
|
-
| `PARAGRAPH` | - | - | - | (transformed
|
|
192
|
-
| `HEADING` | level (1-6) | - | - | (transformed
|
|
191
|
+
| `PARAGRAPH` | - | - | - | Joined literal when needed (e.g. when transformed, or when leading indent is stripped) | - |
|
|
192
|
+
| `HEADING` | level (1-6) | - | - | Inline literal when needed (e.g. when transformed, or for setext headings) | - |
|
|
193
193
|
| `THEMATIC_BREAK` | - | - | - | - | - |
|
|
194
194
|
| `BLOCKQUOTE` | - | - | - | - | - |
|
|
195
195
|
| `LIST` | ordered? (0/1) | start_number | tight? (1=tight) | marker (`-`/`*`/`+`/`.`/`)`) | - |
|
|
@@ -222,12 +222,14 @@ Reading Arena's public methods as the following 3 layers makes the intent
|
|
|
222
222
|
### Source span conventions
|
|
223
223
|
|
|
224
224
|
- `source_start` / `source_len`: bytes of the original document (absolute byte offset)
|
|
225
|
-
- `source_start < 0`: span
|
|
226
|
-
- block
|
|
225
|
+
- `source_start < 0`: no span. Leaf nodes usually keep their content as a literal in `str1`, but a container inline may have only child nodes.
|
|
226
|
+
- Block node spans fall into two groups by purpose.
|
|
227
|
+
- The span of an inline target (paragraph / heading / table cell) doubles as the byte range that InlinePass lexes directly, so it points at the inline body with `#` and prefixes removed.
|
|
228
|
+
- Everything else (list / blockquote / table / code / html block, etc.) is not used for lexing and carries only structural, line-oriented position information.
|
|
227
229
|
|
|
228
230
|
---
|
|
229
231
|
|
|
230
|
-
## 5
|
|
232
|
+
## 5. Typical usage
|
|
231
233
|
|
|
232
234
|
### 5.1 Create an Arena and build a small AST
|
|
233
235
|
|
|
@@ -259,7 +261,7 @@ arena.text(inner_id) # => "world"
|
|
|
259
261
|
arena.source_span(em_id) # => #<SourceSpan @start_byte=6 @end_byte=13>
|
|
260
262
|
```
|
|
261
263
|
|
|
262
|
-
### 5.2Loop over siblings (hot path)
|
|
264
|
+
### 5.2 Loop over siblings (hot path)
|
|
263
265
|
|
|
264
266
|
```ruby
|
|
265
267
|
arena.each_child(para_id) do |child_id|
|
|
@@ -281,19 +283,26 @@ arena.child_ids(para_id).map { |id| arena.type_name(id) }
|
|
|
281
283
|
# => [:text, :emphasis]
|
|
282
284
|
```
|
|
283
285
|
|
|
284
|
-
### 5.3Move nodes to a different parent
|
|
286
|
+
### 5.3 Move nodes to a different parent
|
|
287
|
+
|
|
288
|
+
`reparent` replaces the children of the destination node, so the destination should normally be a newly created, empty node.
|
|
285
289
|
|
|
286
290
|
```ruby
|
|
287
|
-
# `em_id
|
|
291
|
+
# Move the children of `em_id` directly under the new strong_id
|
|
292
|
+
strong_id = arena.add_node(RedQuilt::NodeType::STRONG,
|
|
293
|
+
source_start: arena.source_start(em_id),
|
|
294
|
+
source_len: arena.source_len(em_id))
|
|
295
|
+
arena.insert_before(arena.raw_parent_id(em_id), em_id, strong_id)
|
|
296
|
+
|
|
288
297
|
first = arena.raw_first_child_id(em_id)
|
|
289
298
|
last = arena.raw_last_child_id(em_id)
|
|
290
|
-
arena.reparent(
|
|
299
|
+
arena.reparent(strong_id, first, last) if first != RedQuilt::Arena::NO_NODE
|
|
291
300
|
|
|
292
|
-
# em_id
|
|
301
|
+
# Detach the now-empty em_id. strong_id stays where em_id was.
|
|
293
302
|
arena.detach(em_id)
|
|
294
303
|
```
|
|
295
304
|
|
|
296
|
-
### 5.4Replace a node
|
|
305
|
+
### 5.4 Replace a node
|
|
297
306
|
|
|
298
307
|
```ruby
|
|
299
308
|
# Replace em_id with strong_id (contents unchanged)
|
|
@@ -309,7 +318,7 @@ arena.reparent(strong_id, first, last) if first != RedQuilt::Arena::NO_NODE
|
|
|
309
318
|
arena.detach(em_id)
|
|
310
319
|
```
|
|
311
320
|
|
|
312
|
-
### 5.5Update column values directly
|
|
321
|
+
### 5.5 Update column values directly
|
|
313
322
|
|
|
314
323
|
```ruby
|
|
315
324
|
# The heading level lives in int1, but there is no dedicated setter for it, so
|
|
@@ -323,7 +332,7 @@ arena.update_span(text_id, 0, 12)
|
|
|
323
332
|
|
|
324
333
|
---
|
|
325
334
|
|
|
326
|
-
## 6
|
|
335
|
+
## 6. Performance notes
|
|
327
336
|
|
|
328
337
|
#### Use `each_child` on hot paths
|
|
329
338
|
|
|
@@ -331,7 +340,7 @@ arena.update_span(text_id, 0, 12)
|
|
|
331
340
|
|
|
332
341
|
#### `text(id)` prefers str1
|
|
333
342
|
|
|
334
|
-
Unnecessary `byteslice
|
|
343
|
+
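To avoid triggering unnecessary `byteslice` calls, the default is to leave `str1` as `nil` for content that can be recovered from the source. Use `str1` where a literal is needed for correctness, such as TEXT after entity decoding, code/html literals, table cells, and transformed/literal inline targets.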
|
|
335
344
|
|
|
336
345
|
#### `source_span(id)` allocates a `SourceSpan` on every call
|
|
337
346
|
|
|
@@ -360,4 +369,3 @@ arena.update_span(text_id, 0, 12)
|
|
|
360
369
|
#### Do not change `@source` afterwards
|
|
361
370
|
|
|
362
371
|
If you do, the return values of `text` / `source_span` silently break
|
|
363
|
-
|