ya_multilingual_markdown 0.0.2 → 0.0.3
- checksums.yaml +4 -4
- data/.github/workflows/ci.yml +19 -0
- data/.rubocop.yml +2 -0
- data/Gemfile +2 -0
- data/LICENSE +18 -0
- data/README.md +352 -0
- data/README_ja.md +352 -0
- data/Rakefile +13 -0
- data/lib/ya_multilingual_markdown/cli.rb +140 -0
- data/lib/ya_multilingual_markdown/converter/lang_filtered_html.rb +432 -0
- data/lib/ya_multilingual_markdown/converter/lang_filtered_kramdown.rb +306 -0
- data/lib/ya_multilingual_markdown/converter/lang_filtered_markdown.rb +14 -0
- data/lib/ya_multilingual_markdown/document.rb +186 -0
- data/lib/ya_multilingual_markdown/parser/ya_multilingual_markdown.rb +434 -0
- data/lib/ya_multilingual_markdown/version.rb +3 -0
- data/lib/ya_multilingual_markdown.rb +13 -924
- data/test/cli_test.rb +189 -0
- data/test/document_test.rb +207 -0
- data/test/lang_filtered_html_converter_test.rb +712 -0
- data/test/lang_filtered_kramdown_converter_test.rb +684 -0
- data/test/lang_filtered_markdown_converter_test.rb +44 -0
- data/test/test_helper.rb +4 -0
- data/test/ya_multilingual_markdown_parser_test.rb +327 -0
- data/test/ya_multilingual_markdown_test.rb +26 -0
- data/ya_multilingual_markdown.gemspec +24 -0
- metadata +63 -39
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 85b65af64f47695df849bc48706b3fe3b2f691cbb7205d9edd9dbdaf7627bbee
|
|
4
|
+
data.tar.gz: 0d95434a2e33c236b7736b2825b31f774db3249cc2c14b70f8b5caf011699b6b
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 640cdf404f1ecead1e4483816e7fb0adee0c233479ccb02567c69eaf0cc9c9ebc1b64e7250a3e102609d0d096f874c0862c4d5ec7b50c913f3bc0df7120acae9
|
|
7
|
+
data.tar.gz: 24bae476fa264efdd35258ea493b4b583410d2f5ca227105a4ed867f324fad481bc7efed73e869af31d76f339df3e4288eb9d713847a0cd259f1dddc644f5f5c
|
data/.github/workflows/ci.yml
ADDED
|
@@ -0,0 +1,19 @@
|
|
|
1
|
+
# see https://github.com/ruby/setup-ruby
|
|
2
|
+
|
|
3
|
+
name: CI
|
|
4
|
+
on: [push, pull_request]
|
|
5
|
+
jobs:
|
|
6
|
+
  test:
|
|
7
|
+
    strategy:
|
|
8
|
+
      fail-fast: false
|
|
9
|
+
      matrix:
|
|
10
|
+
        os: [ubuntu-latest, macos-latest]
|
|
11
|
+
        ruby: ["3.3", "3.4"]
|
|
12
|
+
    runs-on: ${{ matrix.os }}
|
|
13
|
+
    steps:
|
|
14
|
+
    - uses: actions/checkout@v5
|
|
15
|
+
    - uses: ruby/setup-ruby@v1
|
|
16
|
+
      with:
|
|
17
|
+
        ruby-version: ${{ matrix.ruby }}
|
|
18
|
+
        bundler-cache: true
|
|
19
|
+
    - run: bundle exec rake
|
data/.rubocop.yml
ADDED
data/Gemfile
ADDED
data/LICENSE
ADDED
|
@@ -0,0 +1,18 @@
|
|
|
1
|
+
Copyright and condition of use of main portion of the source:
|
|
2
|
+
|
|
3
|
+
----
|
|
4
|
+
|
|
5
|
+
Copyright 2020 Hisashi Morita
|
|
6
|
+
|
|
7
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
|
8
|
+
|
|
9
|
+
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
|
10
|
+
|
|
11
|
+
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
|
12
|
+
|
|
13
|
+
----
|
|
14
|
+
|
|
15
|
+
Copyright of kramdown's parse_emphasis, from which parse_ml_emphasis derived:
|
|
16
|
+
|
|
17
|
+
Copyright (C) 2009-2019 Thomas Leitner <t_leitner@gmx.at>
|
|
18
|
+
This file is part of kramdown which is licensed under the MIT.
|
data/README.md
ADDED
|
@@ -0,0 +1,352 @@
|
|
|
1
|
+
# YAMultilingualMarkdown
|
|
2
|
+
|
|
3
|
+
* English / [Japanese](README_ja.md)
|
|
4
|
+
|
|
5
|
+
YAMultilingualMarkdown is a utility to convert Yet Another Multilingual Markdown to HTML and other formats.
|
|
6
|
+
|
|
7
|
+
Yet Another Multilingual Markdown (format) is a Markdown dialect designed for hosting multilingual content. YAMultilingualMarkdown (tool) converts Yet Another Multilingual Markdown to other formats while extracting only the content in specified language(s).
|
|
8
|
+
|
|
9
|
+
## Usage
|
|
10
|
+
|
|
11
|
+
### Synopsis
|
|
12
|
+
|
|
13
|
+
```shell
|
|
14
|
+
ya_multilingual_markdown [OPTIONS] [FILE]
|
|
15
|
+
```
|
|
16
|
+
|
|
17
|
+
### Options
|
|
18
|
+
|
|
19
|
+
Type `ya_multilingual_markdown --help` to show command line options.
|
|
20
|
+
|
|
21
|
+
```
|
|
22
|
+
Convert Yet Another Multilingual Markdown to HTML and other formats
|
|
23
|
+
|
|
24
|
+
Usage:
|
|
25
|
+
ya_multilingual_markdown [OPTIONS] [FILE]
|
|
26
|
+
|
|
27
|
+
Options:
|
|
28
|
+
--output-format=FORMAT Output format
|
|
29
|
+
(html|kramdown|markdown)
|
|
30
|
+
(default: html)
|
|
31
|
+
--langs=LANG1,LANG2,... Languages to be included
|
|
32
|
+
(omitting this option implies all)
|
|
33
|
+
--heading-lang-sep=STRING Languages separator in headings
|
|
34
|
+
(default: " / ")
|
|
35
|
+
--lang-attr-name=STRING Attribute name for language
|
|
36
|
+
(default: lang)
|
|
37
|
+
--html-output-type=TYPE Output type for HTML output
|
|
38
|
+
(fragment|document)
|
|
39
|
+
(default: fragment)
|
|
40
|
+
--html-template-file=FILE HTML document template file in eRuby format
|
|
41
|
+
--html-link-suffixes=FROM:TO,...
|
|
42
|
+
Link suffixes to rewrite in HTML output
|
|
43
|
+
(default: .md:.html)
|
|
44
|
+
--show-default-html-template Show default HTML document template
|
|
45
|
+
--log-level=SEVERITY Log level
|
|
46
|
+
(unknown|fatal|error|warn|info|debug)
|
|
47
|
+
(default: warn)
|
|
48
|
+
--help Show help message
|
|
49
|
+
--version Show version
|
|
50
|
+
```
|
|
51
|
+
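The `--html-link-suffixes=FROM:TO,...` option listed above rewrites link suffixes in HTML output. As a rough illustration of that idea only, the following Ruby sketch parses such a spec and applies it to a link target. The helper names `parse_suffix_spec` and `rewrite_link` are hypothetical, and leaving the `#fragment` part untouched is an assumption; the tool's actual behavior may differ.

```ruby
# Hypothetical helpers for illustration; not part of the gem's API.

# Parse a "FROM:TO,..." spec into [from, to] pairs.
def parse_suffix_spec(spec)
  spec.split(",").map { |pair| pair.split(":", 2) }
end

# Rewrite the suffix of a link target. The "#fragment" part, if any,
# is kept as-is (an assumption made for this sketch).
def rewrite_link(href, pairs)
  path, sep, fragment = href.partition("#")
  from, to = pairs.find { |f, _| path.end_with?(f) }
  path = path.delete_suffix(from) + to if from
  path + sep + fragment
end

pairs = parse_suffix_spec(".md:.html")
puts rewrite_link("README_ja.md", pairs)
puts rewrite_link("guide.md#usage", pairs)
puts rewrite_link("logo.png", pairs)
```

Targets whose suffix matches no pair pass through unchanged, so images and external links are not affected in this sketch.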
|
|
52
|
+
## Examples
|
|
53
|
+
|
|
54
|
+
### Multilingual contents: headings, paragraphs, and other elements
|
|
55
|
+
|
|
56
|
+
A simple Yet Another Multilingual Markdown document looks like the following (`snow_white.md`):
|
|
57
|
+
|
|
58
|
+
```markdown
|
|
59
|
+
# Schneeweißchen
|
|
60
|
+
{: lang="de"}
|
|
61
|
+
|
|
62
|
+
# Little Snow-white
|
|
63
|
+
{: lang="en"}
|
|
64
|
+
|
|
65
|
+
Es war einmal mitten im Winter,...
|
|
66
|
+
{: lang="de"}
|
|
67
|
+
|
|
68
|
+
Once upon a time in the middle of winter,...
|
|
69
|
+
{: lang="en"}
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
(Attributes are added to block elements with kramdown-style, i.e. PHP Markdown Extra-style, extended syntax: `{: name="value"}`.)
|
|
73
|
+
|
|
74
|
+
#### Keep all languages
|
|
75
|
+
|
|
76
|
+
Without language-related options, the output will contain all languages.
|
|
77
|
+
|
|
78
|
+
```shell
|
|
79
|
+
ya_multilingual_markdown snow_white.md
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
Excerpt from the output:
|
|
83
|
+
|
|
84
|
+
```html
|
|
85
|
+
<h1><span lang="de">Schneeweißchen</span> / <span lang="en">Little Snow-white</span></h1>
|
|
86
|
+
|
|
87
|
+
<p lang="de">Es war einmal mitten im Winter,...</p>
|
|
88
|
+
|
|
89
|
+
<p lang="en">Once upon a time in the middle of winter,...</p>
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
In a browser, the above output may look like the following:
|
|
93
|
+
|
|
94
|
+
> **Schneeweißchen / Little Snow-white**
|
|
95
|
+
>
|
|
96
|
+
> Es war einmal mitten im Winter, ...
|
|
97
|
+
>
|
|
98
|
+
> Once upon a time in the middle of winter, ...
|
|
99
|
+
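The excerpt above shows the two headings merged into a single `<h1>`, with one `<span>` per language joined by the heading separator (default `" / "`). A minimal Ruby sketch of that output shape follows. The `Heading` struct and `merged_h1` method are hypothetical names used for illustration and say nothing about how the gem itself is implemented.

```ruby
require "erb"

# Hypothetical data model for illustration; not the gem's internals.
Heading = Struct.new(:lang, :text)

# Render a run of per-language headings as one <h1>, joining the
# language spans with the separator (README default: " / ").
def merged_h1(headings, sep: " / ")
  spans = headings.map do |h|
    %(<span lang="#{h.lang}">#{ERB::Util.html_escape(h.text)}</span>)
  end
  "<h1>#{spans.join(sep)}</h1>"
end

headings = [
  Heading.new("de", "Schneeweißchen"),
  Heading.new("en", "Little Snow-white"),
]
puts merged_h1(headings)
```

Passing a different `sep:` mimics the effect of `--heading-lang-sep=STRING` in this sketch.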
|
|
100
|
+
#### Extract single language
|
|
101
|
+
|
|
102
|
+
With option `--langs=en`, the output will contain only the elements whose `lang` attribute is `en` (and elements without a `lang` attribute).
|
|
103
|
+
|
|
104
|
+
```shell
|
|
105
|
+
ya_multilingual_markdown --langs=en snow_white.md
|
|
106
|
+
```
|
|
107
|
+
|
|
108
|
+
Excerpt from the output:
|
|
109
|
+
|
|
110
|
+
```html
|
|
111
|
+
<h1><span lang="en">Little Snow-white</span></h1>
|
|
112
|
+
|
|
113
|
+
|
|
114
|
+
<p lang="en">Once upon a time in the middle of winter,...</p>
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
In a browser, the above output may look like the following:
|
|
118
|
+
|
|
119
|
+
> **Little Snow-white**
|
|
120
|
+
>
|
|
121
|
+
> Once upon a time in the middle of winter, ...
|
|
122
|
+
|
|
123
|
+
#### Extract multiple languages
|
|
124
|
+
|
|
125
|
+
With option `--langs=de,en`, the output will contain elements with `lang` set to `de` or `en` (and elements without `lang`).
|
|
126
|
+
|
|
127
|
+
```shell
|
|
128
|
+
ya_multilingual_markdown --langs=de,en snow_white.md
|
|
129
|
+
```
|
|
130
|
+
|
|
131
|
+
Excerpt from the output:
|
|
132
|
+
|
|
133
|
+
```html
|
|
134
|
+
<h1><span lang="de">Schneeweißchen</span> / <span lang="en">Little Snow-white</span></h1>
|
|
135
|
+
|
|
136
|
+
<p lang="de">Es war einmal mitten im Winter,...</p>
|
|
137
|
+
|
|
138
|
+
<p lang="en">Once upon a time in the middle of winter,...</p>
|
|
139
|
+
```
|
|
140
|
+
|
|
141
|
+
In a browser, the output may look like the following:
|
|
142
|
+
|
|
143
|
+
> **Schneeweißchen / Little Snow-white**
|
|
144
|
+
>
|
|
145
|
+
> Es war einmal mitten im Winter, ...
|
|
146
|
+
>
|
|
147
|
+
> Once upon a time in the middle of winter, ...
|
|
148
|
+
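The selection rule described in the examples above (keep elements whose `lang` is requested, plus elements without `lang`; omitting `--langs` keeps everything) can be sketched in a few lines of Ruby. The `Element` struct and `filter_by_lang` method are hypothetical and only model the rule; they are not the gem's API.

```ruby
# Hypothetical element model for illustration; not the gem's internals.
Element = Struct.new(:tag, :lang, :text)

# Keep an element when it has no lang, or when its lang is requested.
# A nil or empty `langs` means "all languages", like omitting --langs.
def filter_by_lang(elements, langs)
  return elements if langs.nil? || langs.empty?

  elements.select { |e| e.lang.nil? || langs.include?(e.lang) }
end

doc = [
  Element.new("p", "de", "Es war einmal mitten im Winter,..."),
  Element.new("p", "en", "Once upon a time in the middle of winter,..."),
  Element.new("hr", nil, ""),
]

filter_by_lang(doc, ["en"]).each { |e| puts "#{e.tag} lang=#{e.lang.inspect}" }
```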
|
|
149
|
+
### Metadata in YAML front matter
|
|
150
|
+
|
|
151
|
+
Document metadata can be stored in the document using Jekyll-style YAML front matter.
|
|
152
|
+
|
|
153
|
+
A simple Yet Another Multilingual Markdown document with YAML front matter looks like the following (`snow_white_with_metadata.md`):
|
|
154
|
+
|
|
155
|
+
```
|
|
156
|
+
---
|
|
157
|
+
title: Little Snow-white
|
|
158
|
+
author:
|
|
159
|
+
- Jacob Ludwig Karl Grimm
|
|
160
|
+
- Wilhelm Carl Grimm
|
|
161
|
+
meta:
|
|
162
|
+
- name: original title
|
|
163
|
+
  content: Schneeweißchen
|
|
164
|
+
  lang: de
|
|
165
|
+
- name: translator
|
|
166
|
+
  content: Margaret Hunt
|
|
167
|
+
  lang: en
|
|
168
|
+
---
|
|
169
|
+
...
|
|
170
|
+
```
|
|
171
|
+
|
|
172
|
+
(The key `author` is a shortcut to `<meta name="author" .../>`.)
|
|
173
|
+
|
|
174
|
+
Let us include all languages in the output:
|
|
175
|
+
|
|
176
|
+
```shell
|
|
177
|
+
ya_multilingual_markdown snow_white_with_metadata.md
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
Excerpt from the output:
|
|
181
|
+
|
|
182
|
+
```html
|
|
183
|
+
<title>Little Snow-white</title>
|
|
184
|
+
<meta name="author" content="Jacob Ludwig Karl Grimm" />
|
|
185
|
+
<meta name="author" content="Wilhelm Carl Grimm" />
|
|
186
|
+
<meta name="original title" content="Schneeweißchen" lang="de" />
|
|
187
|
+
<meta name="translator" content="Margaret Hunt" lang="en" />
|
|
188
|
+
<p>...</p>
|
|
189
|
+
```
|
|
190
|
+
|
|
191
|
+
You can filter metadata by language.
|
|
192
|
+
|
|
193
|
+
Let us include only `en` (thus excluding `de`) in the output:
|
|
194
|
+
|
|
195
|
+
```shell
|
|
196
|
+
ya_multilingual_markdown --langs=en snow_white_with_metadata.md
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
Excerpt from the output:
|
|
200
|
+
|
|
201
|
+
```html
|
|
202
|
+
<title>Little Snow-white</title>
|
|
203
|
+
<meta name="author" content="Jacob Ludwig Karl Grimm" />
|
|
204
|
+
<meta name="author" content="Wilhelm Carl Grimm" />
|
|
205
|
+
<meta name="translator" content="Margaret Hunt" lang="en" />
|
|
206
|
+
<p>...</p>
|
|
207
|
+
```
|
|
208
|
+
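The mapping from front matter to `<meta>` tags shown in this section, including the `author` shortcut and the language filter, can be approximated with Ruby's standard `yaml` library. This is a sketch under stated assumptions: `meta_tags` is a hypothetical helper, and the attribute order simply follows the key order in the YAML.

```ruby
require "yaml"
require "erb"

FRONT_MATTER = <<~YAML
  title: Little Snow-white
  author:
    - Jacob Ludwig Karl Grimm
    - Wilhelm Carl Grimm
  meta:
    - name: original title
      content: Schneeweißchen
      lang: de
    - name: translator
      content: Margaret Hunt
      lang: en
YAML

# Hypothetical helper for illustration; not the gem's API.
# Expands the `author` shortcut, then keeps entries that have no lang
# or whose lang is in `langs` (nil means "all languages").
def meta_tags(front_matter, langs = nil)
  authors = Array(front_matter["author"]).map do |name|
    { "name" => "author", "content" => name }
  end
  entries = authors + Array(front_matter["meta"])
  entries = entries.select { |m| m["lang"].nil? || langs.include?(m["lang"]) } if langs
  entries.map do |m|
    attrs = m.map { |k, v| %(#{k}="#{ERB::Util.html_escape(v)}") }.join(" ")
    "<meta #{attrs} />"
  end
end

puts meta_tags(YAML.safe_load(FRONT_MATTER), ["en"])
```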
|
|
209
|
+
### Output complete HTML document
|
|
210
|
+
|
|
211
|
+
Use `--html-output-type=document` to print a complete HTML document rather than an HTML fragment.
|
|
212
|
+
|
|
213
|
+
Input (`snow_white_with_title.md`):
|
|
214
|
+
|
|
215
|
+
```
|
|
216
|
+
---
|
|
217
|
+
title: Little Snow-white
|
|
218
|
+
---
|
|
219
|
+
Once upon a time in the middle of winter, ...
|
|
220
|
+
```
|
|
221
|
+
|
|
222
|
+
Command line:
|
|
223
|
+
|
|
224
|
+
```shell
|
|
225
|
+
ya_multilingual_markdown --html-output-type=document snow_white_with_title.md
|
|
226
|
+
```
|
|
227
|
+
|
|
228
|
+
Output:
|
|
229
|
+
|
|
230
|
+
```html
|
|
231
|
+
<!DOCTYPE html>
|
|
232
|
+
<html>
|
|
233
|
+
<head>
|
|
234
|
+
<title>Little Snow-white</title>
|
|
235
|
+
</head>
|
|
236
|
+
<body>
|
|
237
|
+
<p>Once upon a time in the middle of winter, ...</p>
|
|
238
|
+
</body>
|
|
239
|
+
</html>
|
|
240
|
+
```
|
|
241
|
+
|
|
242
|
+
You can provide a custom template using `--html-template-file=FILE`. Templates must be in [eRuby](https://docs.ruby-lang.org/en/master/ERB.html) format. Use `--show-default-html-template` to see the built-in default template.
|
|
243
|
+
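For readers unfamiliar with eRuby, the sketch below renders a small template with Ruby's standard `erb` library. The template and the variable names `title` and `body` are assumptions made for this illustration; run `--show-default-html-template` to see which variables the built-in template actually uses.

```ruby
require "erb"

# Hypothetical template: `title` and `body` are assumed variable names,
# not necessarily the ones the tool passes to its templates.
TEMPLATE = <<~HTML
  <!DOCTYPE html>
  <html>
  <head>
  <title><%= ERB::Util.html_escape(title) %></title>
  </head>
  <body>
  <%= body %>
  </body>
  </html>
HTML

# Render the template with the given values bound as local variables.
def render_document(template, title:, body:)
  ERB.new(template).result_with_hash(title: title, body: body)
end

puts render_document(
  TEMPLATE,
  title: "Little Snow-white",
  body: "<p>Once upon a time in the middle of winter, ...</p>"
)
```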
|
|
244
|
+
## Installation
|
|
245
|
+
|
|
246
|
+
```
|
|
247
|
+
gem install ya_multilingual_markdown
|
|
248
|
+
```
|
|
249
|
+
|
|
250
|
+
or
|
|
251
|
+
|
|
252
|
+
```
|
|
253
|
+
git clone https://github.com/hisashim/ya_multilingual_markdown
|
|
254
|
+
cd ya_multilingual_markdown
|
|
255
|
+
rake install
|
|
256
|
+
```
|
|
257
|
+
|
|
258
|
+
## Requirements
|
|
259
|
+
|
|
260
|
+
Runtime requirements:
|
|
261
|
+
|
|
262
|
+
* [Ruby](https://www.ruby-lang.org/)
|
|
263
|
+
* [kramdown](https://kramdown.gettalong.org/)
|
|
264
|
+
  - [Rouge](https://github.com/rouge-ruby/rouge/)
|
|
265
|
+
* [kramdown-parser-gfm](https://github.com/kramdown/parser-gfm/)
|
|
266
|
+
* [kramdown-math-katex](https://github.com/kramdown/math-katex/)
|
|
267
|
+
|
|
268
|
+
Development requirements (in addition to runtime requirements):
|
|
269
|
+
|
|
270
|
+
* [Rake](https://ruby.github.io/rake/)
|
|
271
|
+
* [Bundler](https://bundler.io/)
|
|
272
|
+
* [Minitest](https://github.com/minitest/minitest)
|
|
273
|
+
* [Rubocop](https://github.com/rubocop/rubocop) (optional)
|
|
274
|
+
  - [Shopify's Ruby Style Guide](https://github.com/Shopify/ruby-style-guide)
|
|
275
|
+
|
|
276
|
+
## Notes
|
|
277
|
+
|
|
278
|
+
### Limitations and known problems
|
|
279
|
+
|
|
280
|
+
* Only a small subset of kramdown's extended syntax is supported, although YAMultilingualMarkdown is built upon kramdown.
|
|
281
|
+
|
|
282
|
+
* For multilingual headings, the ALD (Attribute List Definition) of each heading must be placed _after_ the heading, not before it.
|
|
283
|
+
|
|
284
|
+
Supported:
|
|
285
|
+
```
|
|
286
|
+
# Schneeweißchen
|
|
287
|
+
{: lang="de"}
|
|
288
|
+
|
|
289
|
+
# Little Snow-white
|
|
290
|
+
{: lang="en"}
|
|
291
|
+
```
|
|
292
|
+
|
|
293
|
+
Not supported:
|
|
294
|
+
```
|
|
295
|
+
{: lang="de"}
|
|
296
|
+
# Schneeweißchen
|
|
297
|
+
|
|
298
|
+
{: lang="en"}
|
|
299
|
+
# Little Snow-white
|
|
300
|
+
```
|
|
301
|
+
|
|
302
|
+
This compromise allows an id to be written either before the first heading or after the last one, while keeping the implementation small.
|
|
303
|
+
|
|
304
|
+
```
|
|
305
|
+
{: #title}
|
|
306
|
+
# Schneeweißchen
|
|
307
|
+
{: lang="de"}
|
|
308
|
+
|
|
309
|
+
# Little Snow-white
|
|
310
|
+
{: lang="en"}
|
|
311
|
+
```
|
|
312
|
+
|
|
313
|
+
```
|
|
314
|
+
# Schneeweißchen
|
|
315
|
+
{: lang="de"}
|
|
316
|
+
|
|
317
|
+
# Little Snow-white
|
|
318
|
+
{: lang="en"}
|
|
319
|
+
{: #title}
|
|
320
|
+
```
|
|
321
|
+
|
|
322
|
+
### Motivation
|
|
323
|
+
|
|
324
|
+
Yet Another Multilingual Markdown and its processor were born out of the need for a manuscript format for translated books.
|
|
325
|
+
|
|
326
|
+
Having a side-by-side version of the galley proof that includes both the original and translated texts helps translators review their work. Being able to search and edit manuscripts in a (sort of) side-by-side format is also useful.
|
|
327
|
+
|
|
328
|
+
While placing translated text in separate files from the original is a common and effective approach for localization/multilingualization projects, a format allowing multiple languages within a single file comes in handy for small projects. Yet Another Multilingual Markdown is an attempt to develop a proof of concept for such a format and a processing tool.
|
|
329
|
+
|
|
330
|
+
### See also
|
|
331
|
+
|
|
332
|
+
* [Requirements for Japanese Text Layout](https://w3c.github.io/jlreq/) is an excellent example of a multilingual document in HTML format.
|
|
333
|
+
|
|
334
|
+
* Lightweight text formats and processing tools that allow multiple languages to be written in a single file (they do not necessarily support, or aim at, extracting or presenting multiple languages side by side):
|
|
335
|
+
  - [Multilang Preprocessor](https://github.com/polm/multilang-filter)
|
|
336
|
+
  - [greple](https://github.com/kaz-utashiro/greple)
|
|
337
|
+
  - [Multilingual Markdown Generator](https://mmg.ryul1206.dev/)
|
|
338
|
+
|
|
339
|
+
## License
|
|
340
|
+
|
|
341
|
+
This software is distributed under the terms of the [MIT license](LICENSE).
|
|
342
|
+
|
|
343
|
+
## Acknowledgments
|
|
344
|
+
|
|
345
|
+
Many thanks to:
|
|
346
|
+
|
|
347
|
+
* Koichi Sasada, whose manuscript preprocessor inspired me to come up with a lightweight markup format that features multilingualization.
|
|
348
|
+
* kramdown developers
|
|
349
|
+
|
|
350
|
+
## Contributors
|
|
351
|
+
|
|
352
|
+
* [Hisashi Morita](https://github.com/hisashim) - creator and maintainer
|