@markuplint/spec-generator 5.0.0-alpha.1 → 5.0.0-alpha.3
- package/ARCHITECTURE.ja.md +5 -1
- package/ARCHITECTURE.md +5 -1
- package/CHANGELOG.md +10 -0
- package/README.md +1 -1
- package/docs/modules.ja.md +27 -8
- package/docs/modules.md +27 -8
- package/docs/scraping.ja.md +27 -0
- package/docs/scraping.md +27 -0
- package/lib/aria.js +3 -2
- package/lib/html-elements.d.ts +2 -2
- package/lib/html-elements.js +29 -8
- package/lib/mathml.d.ts +8 -0
- package/lib/mathml.js +29 -0
- package/lib/utils.d.ts +6 -5
- package/lib/utils.js +8 -6
- package/package.json +5 -5
package/ARCHITECTURE.ja.md
CHANGED
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
## 概要
|
|
4
4
|
|
|
5
|
-
`@markuplint/spec-generator` は、markuplint の拡張仕様 JSON を生成するビルドツールです。W3C および MDN のウェブ標準ドキュメントをスクレイピングし、HTML/SVG 要素仕様、グローバル属性、ARIA ロール・プロパティ、コンテンツモデル定義を集約して、`@markuplint/html-spec` が消費する単一の `index.json` ファイルに出力します。
|
|
5
|
+
`@markuplint/spec-generator` は、markuplint の拡張仕様 JSON を生成するビルドツールです。W3C および MDN のウェブ標準ドキュメントをスクレイピングし、HTML/SVG/MathML 要素仕様、グローバル属性、ARIA ロール・プロパティ、コンテンツモデル定義を集約して、`@markuplint/html-spec` が消費する単一の `index.json` ファイルに出力します。
|
|
6
6
|
|
|
7
7
|
このパッケージは直接利用するために公開されるものではありません。`@markuplint/html-spec/build.mjs` からのみ呼び出されます。
|
|
8
8
|
|
|
@@ -16,6 +16,7 @@ src/
|
|
|
16
16
|
├── aria.ts — W3C ARIA 仕様のスクレイピング(ロール、プロパティ、ステート)
|
|
17
17
|
├── global-attrs.ts — グローバル属性定義の読み込み
|
|
18
18
|
├── svg.ts — MDN から SVG 非推奨要素名を取得
|
|
19
|
+
├── mathml.ts — MDN から MathML 非推奨要素名を取得
|
|
19
20
|
├── fetch.ts — HTTP フェッチ(プロセス内キャッシュ+プログレスバー付き)
|
|
20
21
|
├── read-json.ts — コメント除去付き JSON ファイル読み込み+ glob 対応
|
|
21
22
|
└── utils.ts — 共有ヘルパー関数(ソート、重複排除、名前解析)
|
|
@@ -40,6 +41,7 @@ flowchart TD
|
|
|
40
41
|
subgraph scraping ["ウェブスクレイピング"]
|
|
41
42
|
mdnHTML["MDN HTML 要素ページ\n(scraping.ts)"]
|
|
42
43
|
mdnSVG["MDN SVG 要素インデックス\n(svg.ts)"]
|
|
44
|
+
mdnMathML["MDN MathML 要素インデックス\n(mathml.ts)"]
|
|
43
45
|
ariaSpecs["W3C ARIA 1.1 / 1.2 / 1.3\n(aria.ts)"]
|
|
44
46
|
graphicsAria["Graphics ARIA\n(aria.ts)"]
|
|
45
47
|
htmlAria["HTML-ARIA マッピング\n(aria.ts)"]
|
|
@@ -62,6 +64,7 @@ flowchart TD
|
|
|
62
64
|
getElements --> specFiles
|
|
63
65
|
getElements --> mdnHTML
|
|
64
66
|
getElements --> mdnSVG
|
|
67
|
+
getElements --> mdnMathML
|
|
65
68
|
getGlobalAttrs --> commonAttrs
|
|
66
69
|
getAria --> ariaSpecs
|
|
67
70
|
getAria --> graphicsAria
|
|
@@ -84,6 +87,7 @@ flowchart TD
|
|
|
84
87
|
| `aria.ts` | `getAria()` | W3C ARIA 仕様からロール、プロパティ、ステートをスクレイピング |
|
|
85
88
|
| `global-attrs.ts` | `getGlobalAttrs()` | JSON からグローバル属性定義を読み込み |
|
|
86
89
|
| `svg.ts` | `getSVGElementList()` | MDN から非推奨 SVG 要素名を取得 |
|
|
90
|
+
| `mathml.ts` | `getMathMLElementList()` | MDN から非推奨 MathML 要素名を取得 |
|
|
87
91
|
| `fetch.ts` | `fetch()`, `fetchText()`, `getReferences()` | 2層キャッシュとプログレスバー付き HTTP フェッチ |
|
|
88
92
|
| `read-json.ts` | `readJson()`, `readJsons()` | コメント除去と glob マッチング付き JSON 読み込み |
|
|
89
93
|
| `utils.ts` | `nameCompare()`, `sortObjectByKey()`, `arrayUnique()`, `getName()`, `getThisOutline()`, `mergeAttributes()`, `keys()` | 共有ユーティリティ |
|
package/ARCHITECTURE.md
CHANGED
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
## Overview
|
|
4
4
|
|
|
5
|
-
`@markuplint/spec-generator` is a build tool that generates the markuplint extended specification JSON. It scrapes W3C and MDN web standards documentation, aggregates HTML/SVG element specs, global attributes, ARIA roles and properties, and content model definitions into a single `index.json` file consumed by `@markuplint/html-spec`.
|
|
5
|
+
`@markuplint/spec-generator` is a build tool that generates the markuplint extended specification JSON. It scrapes W3C and MDN web standards documentation, aggregates HTML/SVG/MathML element specs, global attributes, ARIA roles and properties, and content model definitions into a single `index.json` file consumed by `@markuplint/html-spec`.
|
|
6
6
|
|
|
7
7
|
This package is not published for direct use. It is invoked exclusively from `@markuplint/html-spec/build.mjs`.
|
|
8
8
|
|
|
@@ -16,6 +16,7 @@ src/
|
|
|
16
16
|
├── aria.ts — W3C ARIA specification scraping (roles, properties, states)
|
|
17
17
|
├── global-attrs.ts — Global attribute definition loader
|
|
18
18
|
├── svg.ts — SVG deprecated element list fetcher
|
|
19
|
+
├── mathml.ts — MathML deprecated element list fetcher
|
|
19
20
|
├── fetch.ts — HTTP fetch with in-process caching and progress bar
|
|
20
21
|
├── read-json.ts — JSON file reader with comment stripping and glob support
|
|
21
22
|
└── utils.ts — Shared helper functions (sorting, deduplication, name parsing)
|
|
@@ -40,6 +41,7 @@ flowchart TD
|
|
|
40
41
|
subgraph scraping ["Web Scraping"]
|
|
41
42
|
mdnHTML["MDN HTML Element Pages\n(scraping.ts)"]
|
|
42
43
|
mdnSVG["MDN SVG Element Index\n(svg.ts)"]
|
|
44
|
+
mdnMathML["MDN MathML Element Index\n(mathml.ts)"]
|
|
43
45
|
ariaSpecs["W3C ARIA 1.1 / 1.2 / 1.3\n(aria.ts)"]
|
|
44
46
|
graphicsAria["Graphics ARIA\n(aria.ts)"]
|
|
45
47
|
dpubAria["DPub ARIA\n(aria.ts)"]
|
|
@@ -63,6 +65,7 @@ flowchart TD
|
|
|
63
65
|
getElements --> specFiles
|
|
64
66
|
getElements --> mdnHTML
|
|
65
67
|
getElements --> mdnSVG
|
|
68
|
+
getElements --> mdnMathML
|
|
66
69
|
getGlobalAttrs --> commonAttrs
|
|
67
70
|
getAria --> ariaSpecs
|
|
68
71
|
getAria --> graphicsAria
|
|
@@ -86,6 +89,7 @@ flowchart TD
|
|
|
86
89
|
| `aria.ts` | `getAria()` | Scrapes W3C ARIA specs for roles, properties, states |
|
|
87
90
|
| `global-attrs.ts` | `getGlobalAttrs()` | Reads global attribute definitions from JSON |
|
|
88
91
|
| `svg.ts` | `getSVGElementList()` | Fetches deprecated SVG element names from MDN |
|
|
92
|
+
| `mathml.ts` | `getMathMLElementList()` | Fetches deprecated MathML element names from MDN |
|
|
89
93
|
| `fetch.ts` | `fetch()`, `fetchText()`, `getReferences()` | HTTP fetching with dual-layer cache and progress bar |
|
|
90
94
|
| `read-json.ts` | `readJson()`, `readJsons()` | JSON reading with comment stripping and glob matching |
|
|
91
95
|
| `utils.ts` | `nameCompare()`, `sortObjectByKey()`, `arrayUnique()`, `getName()`, `getThisOutline()`, `mergeAttributes()`, `keys()` | Shared utilities |
|
package/CHANGELOG.md
CHANGED
|
@@ -3,6 +3,16 @@
|
|
|
3
3
|
All notable changes to this project will be documented in this file.
|
|
4
4
|
See [Conventional Commits](https://conventionalcommits.org) for commit guidelines.
|
|
5
5
|
|
|
6
|
+
# [5.0.0-alpha.3](https://github.com/markuplint/markuplint/compare/v5.0.0-alpha.2...v5.0.0-alpha.3) (2026-02-26)
|
|
7
|
+
|
|
8
|
+
**Note:** Version bump only for package @markuplint/spec-generator
|
|
9
|
+
|
|
10
|
+
# [5.0.0-alpha.2](https://github.com/markuplint/markuplint/compare/v5.0.0-alpha.1...v5.0.0-alpha.2) (2026-02-23)
|
|
11
|
+
|
|
12
|
+
### Features
|
|
13
|
+
|
|
14
|
+
- **spec-generator:** add MathML element scraping support ([0c56ba4](https://github.com/markuplint/markuplint/commit/0c56ba4c8ca1a2e401da5cfea0f5cf8f9056f737))
|
|
15
|
+
|
|
6
16
|
# [5.0.0-alpha.1](https://github.com/markuplint/markuplint/compare/v5.0.0-alpha.0...v5.0.0-alpha.1) (2026-02-22)
|
|
7
17
|
|
|
8
18
|
**Note:** Version bump only for package @markuplint/spec-generator
|
package/README.md
CHANGED
|
@@ -27,7 +27,7 @@ You normally don't run this directly; use:
|
|
|
27
27
|
|
|
28
28
|
1. **Read element sources** -- Load every `src/spec.*.jsonc` and infer the element name from the filename
|
|
29
29
|
2. **Enrich from MDN** -- Fetch MDN element pages for descriptions, categories, and attribute metadata (manual specs take precedence)
|
|
30
|
-
3. **Add obsolete elements** -- Inject HTML obsolete elements and deprecated
|
|
30
|
+
3. **Add obsolete elements** -- Inject HTML obsolete elements, deprecated SVG elements, and deprecated MathML elements
|
|
31
31
|
4. **Load shared data** -- Read global attributes and content model definitions
|
|
32
32
|
5. **Build ARIA definitions** -- Scrape WAI-ARIA (1.1/1.2/1.3), Graphics-ARIA, DPub-ARIA, and HTML-ARIA
|
|
33
33
|
6. **Emit Extended Spec JSON** -- Write `{ cites, def, specs }` to `index.json`
|
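Step 3 of the pipeline above can be pictured roughly as follows. This is only a sketch under stated assumptions: `makeObsoleteStubs` and the `{ name, obsolete: true }` stub shape are hypothetical stand-ins and not the package's `fetchObsoleteElements()`, and the SVG/MathML names are sample values standing in for the lists the real build fetches from MDN.

```javascript
// Hypothetical helper: builds placeholder entries for obsolete names
// that do not have a hand-written spec yet. The stub shape is an assumption.
function makeObsoleteStubs(obsoleteNames, specs) {
  const known = new Set(specs.map((spec) => spec.name));
  return [...new Set(obsoleteNames)]
    .filter((name) => !known.has(name))
    .map((name) => ({ name, obsolete: true }));
}

const specs = [{ name: 'a' }, { name: 'param' }];
const htmlObsolete = ['applet', 'param'];
const svgDeprecated = ['svg_altGlyph']; // sample value; fetched from MDN in the real build
const mathmlDeprecated = ['mml_mfenced']; // sample value; fetched from MDN in the real build

const stubs = makeObsoleteStubs(
  [...htmlObsolete, ...svgDeprecated, ...mathmlDeprecated],
  specs,
);
console.log(stubs.map((stub) => stub.name));
```

Names that already have a local spec (`param` in this sample) are skipped, which matches the "manual specs take precedence" rule stated in step 2.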
package/docs/modules.ja.md
CHANGED
|
@@ -45,25 +45,26 @@
|
|
|
45
45
|
|
|
46
46
|
## html-elements.ts
|
|
47
47
|
|
|
48
|
-
HTML
|
|
48
|
+
HTML、SVG、MathML 要素仕様の完全なリストを構築します。
|
|
49
49
|
|
|
50
50
|
### `getElements(filePattern: string): Promise<ExtendedElementSpec[]>`
|
|
51
51
|
|
|
52
52
|
**フロー:**
|
|
53
53
|
|
|
54
54
|
1. `readJsons()` で glob パターンにマッチする全仕様ファイルを読み込み。要素名は正規表現 `spec.([\w-]+).jsonc` でファイル名から抽出(例: `spec.a.jsonc` → `a`)
|
|
55
|
-
2. `getSVGElementList()` で非推奨 SVG 要素リストを取得
|
|
55
|
+
2. `getSVGElementList()` で非推奨 SVG 要素リストを取得、`getMathMLElementList()` で非推奨 MathML 要素リストを取得
|
|
56
56
|
3. `fetchObsoleteElements()` で既存の仕様にない非推奨要素のスタブを生成
|
|
57
57
|
4. 各要素について MDN URL を構築し、`fetchHTMLElement()` でメタデータをスクレイピング:
|
|
58
58
|
- 見出し要素(`h1`-`h6`)は MDN パス `Heading_Elements` にマッピング
|
|
59
59
|
- SVG 要素は `/Web/SVG/Reference/Element/<name>` パスを使用
|
|
60
|
+
- MathML 要素は `/Web/MathML/Element/<name>` パスを使用
|
|
60
61
|
- HTML 要素は `/Web/HTML/Reference/Elements/<name>` パスを使用
|
|
61
62
|
5. スクレイピングデータとローカル仕様データをマージ。**ローカル仕様データが優先**:
|
|
62
63
|
- `cite` -- ローカル値があればそちらを使用、なければ MDN URL
|
|
63
64
|
- `description`, `categories`, `omission` -- MDN から
|
|
64
65
|
- `contentModel`, `aria` -- ローカル仕様のみ(スクレイピングされない)
|
|
65
66
|
- `attributes` -- 属性名ごとにマージ; ローカルエントリが MDN エントリをオーバーライド
|
|
66
|
-
6. アルファベット順にソートし、
|
|
67
|
+
6. アルファベット順にソートし、MathML 要素を HTML 要素の後、SVG 要素をその後に配置
|
|
67
68
|
|
|
68
69
|
### `obsoleteList`
|
|
69
70
|
|
|
@@ -71,7 +72,24 @@ HTML および SVG 要素仕様の完全なリストを構築します。
|
|
|
71
72
|
|
|
72
73
|
`applet`, `acronym`, `bgsound`, `dir`, `frame`, `frameset`, `noframes`, `isindex`, `keygen`, `listing`, `menuitem`, `nextid`, `noembed`, `param`, `plaintext`, `rb`, `rtc`, `strike`, `xmp`, `basefont`, `big`, `blink`, `center`, `font`, `marquee`, `multicol`, `nobr`, `spacer`, `tt`
|
|
73
74
|
|
|
74
|
-
MDN から取得した非推奨 SVG 要素と組み合わせて、完全な非推奨セットを構成します。
|
|
75
|
+
MDN から取得した非推奨 SVG・MathML 要素と組み合わせて、完全な非推奨セットを構成します。
|
|
76
|
+
|
|
77
|
+
---
|
|
78
|
+
|
|
79
|
+
## mathml.ts
|
|
80
|
+
|
|
81
|
+
### `getMathMLElementList(): Promise<string[]>`
|
|
82
|
+
|
|
83
|
+
MDN MathML 要素インデックスページ(`https://developer.mozilla.org/en-US/docs/Web/MathML/Element`)をフェッチし、非推奨および非標準の MathML 要素名を抽出します。
|
|
84
|
+
|
|
85
|
+
**処理:**
|
|
86
|
+
|
|
87
|
+
1. メインコンテンツ領域内の全 `<td> <code>` 要素を検索
|
|
88
|
+
2. `m` で始まる名前(MathML の命名規則)でフィルタリング
|
|
89
|
+
3. 含まれる `<tr>` の `.icon-deprecated` または `.icon-nonstandard` アイコンクラスを確認
|
|
90
|
+
4. 各名前に `mml_` プレフィックスを付加(例: `maction` → `mml_maction`)
|
|
91
|
+
|
|
92
|
+
SVG と異なり、MathML には「Obsolete and deprecated elements」の独立セクションはない。要素テーブル内のアイコンでインラインにステータスが表示される。
|
|
75
93
|
|
|
76
94
|
---
|
|
77
95
|
|
|
@@ -250,7 +268,8 @@ MDN SVG 要素インデックスページ(`https://developer.mozilla.org/en-US
|
|
|
250
268
|
|
|
251
269
|
要素名文字列をパース:
|
|
252
270
|
|
|
253
|
-
| 入力 | `localName` | `namespace`
|
|
254
|
-
| -------------- | ----------- |
|
|
255
|
-
| `"div"` | `"div"` | `undefined`
|
|
256
|
-
| `"svg_circle"` | `"circle"` | `"http://www.w3.org/2000/svg"`
|
|
271
|
+
| 入力 | `localName` | `namespace` | `ml` |
|
|
272
|
+
| -------------- | ----------- | -------------------------------------- | ---------- |
|
|
273
|
+
| `"div"` | `"div"` | `undefined` | `"HTML"` |
|
|
274
|
+
| `"svg_circle"` | `"circle"` | `"http://www.w3.org/2000/svg"` | `"SVG"` |
|
|
275
|
+
| `"mml_math"` | `"math"` | `"http://www.w3.org/1998/Math/MathML"` | `"MathML"` |
|
package/docs/modules.md
CHANGED
|
@@ -45,25 +45,26 @@ Coordinates three parallel data-gathering tasks, assembles the results into an `
|
|
|
45
45
|
|
|
46
46
|
## html-elements.ts
|
|
47
47
|
|
|
48
|
-
Builds the complete list of HTML and
|
|
48
|
+
Builds the complete list of HTML, SVG, and MathML element specifications.
|
|
49
49
|
|
|
50
50
|
### `getElements(filePattern: string): Promise<ExtendedElementSpec[]>`
|
|
51
51
|
|
|
52
52
|
**Flow:**
|
|
53
53
|
|
|
54
54
|
1. Read all spec files matching the glob pattern via `readJsons()`. Element names are extracted from filenames using the regex `spec.([\w-]+).jsonc` (e.g., `spec.a.jsonc` becomes `a`)
|
|
55
|
-
2. Fetch the deprecated SVG element list via `getSVGElementList()`
|
|
55
|
+
2. Fetch the deprecated SVG element list via `getSVGElementList()` and MathML element list via `getMathMLElementList()`
|
|
56
56
|
3. Generate stubs for obsolete elements not already present via `fetchObsoleteElements()`
|
|
57
57
|
4. For each element, construct the MDN URL and scrape metadata via `fetchHTMLElement()`:
|
|
58
58
|
- Heading elements (`h1`-`h6`) are mapped to MDN path `Heading_Elements`
|
|
59
59
|
- SVG elements use the path `/Web/SVG/Reference/Element/<name>`
|
|
60
|
+
- MathML elements use the path `/Web/MathML/Element/<name>`
|
|
60
61
|
- HTML elements use `/Web/HTML/Reference/Elements/<name>`
|
|
61
62
|
5. Merge scraped data with local spec data. **Local spec data takes precedence**:
|
|
62
63
|
- `cite` -- local value wins if present, otherwise MDN URL
|
|
63
64
|
- `description`, `categories`, `omission` -- from MDN
|
|
64
65
|
- `contentModel`, `aria` -- from local spec only (never scraped)
|
|
65
66
|
- `attributes` -- merged per attribute name; local entries override MDN entries
|
|
66
|
-
6. Sort alphabetically, with
|
|
67
|
+
6. Sort alphabetically, with MathML elements placed after HTML elements and SVG elements after MathML
|
|
67
68
|
|
|
68
69
|
### `obsoleteList`
|
|
69
70
|
|
|
@@ -71,7 +72,24 @@ A hardcoded list of 31 non-conforming HTML elements:
|
|
|
71
72
|
|
|
72
73
|
`applet`, `acronym`, `bgsound`, `dir`, `frame`, `frameset`, `noframes`, `isindex`, `keygen`, `listing`, `menuitem`, `nextid`, `noembed`, `param`, `plaintext`, `rb`, `rtc`, `strike`, `xmp`, `basefont`, `big`, `blink`, `center`, `font`, `marquee`, `multicol`, `nobr`, `spacer`, `tt`
|
|
73
74
|
|
|
74
|
-
These are combined with deprecated SVG elements fetched from MDN to form the complete obsolete set.
|
|
75
|
+
These are combined with deprecated SVG and MathML elements fetched from MDN to form the complete obsolete set.
|
|
76
|
+
|
|
77
|
+
---
|
|
78
|
+
|
|
79
|
+
## mathml.ts
|
|
80
|
+
|
|
81
|
+
### `getMathMLElementList(): Promise<string[]>`
|
|
82
|
+
|
|
83
|
+
Fetches the MDN MathML element index page (`https://developer.mozilla.org/en-US/docs/Web/MathML/Element`) and extracts deprecated and non-standard MathML element names.
|
|
84
|
+
|
|
85
|
+
**Processing:**
|
|
86
|
+
|
|
87
|
+
1. Find all `<td> <code>` elements in the main content area
|
|
88
|
+
2. Filter for names starting with `m` (MathML naming convention)
|
|
89
|
+
3. Check the containing `<tr>` for `.icon-deprecated` or `.icon-nonstandard` icon classes
|
|
90
|
+
4. Prefix each name with `mml_` (e.g., `maction` becomes `mml_maction`)
|
|
91
|
+
|
|
92
|
+
Unlike SVG, MathML does not have a separate "Obsolete and deprecated elements" section. Status is indicated inline with icons in the element table.
|
|
75
93
|
|
|
76
94
|
---
|
|
77
95
|
|
|
@@ -250,7 +268,8 @@ Returns `Object.keys()` with a custom type cast.
|
|
|
250
268
|
|
|
251
269
|
Parses an element name string:
|
|
252
270
|
|
|
253
|
-
| Input | `localName` | `namespace`
|
|
254
|
-
| -------------- | ----------- |
|
|
255
|
-
| `"div"` | `"div"` | `undefined`
|
|
256
|
-
| `"svg_circle"` | `"circle"` | `"http://www.w3.org/2000/svg"`
|
|
271
|
+
| Input | `localName` | `namespace` | `ml` |
|
|
272
|
+
| -------------- | ----------- | -------------------------------------- | ---------- |
|
|
273
|
+
| `"div"` | `"div"` | `undefined` | `"HTML"` |
|
|
274
|
+
| `"svg_circle"` | `"circle"` | `"http://www.w3.org/2000/svg"` | `"SVG"` |
|
|
275
|
+
| `"mml_math"` | `"math"` | `"http://www.w3.org/1998/Math/MathML"` | `"MathML"` |
|
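As a rough illustration of flow step 4 in `getElements()`, the sketch below derives an MDN URL from a prefixed element name. `buildMdnUrl` is a hypothetical name; the URL template is copied from the `cite` expression in `lib/html-elements.js`. Note that this template yields `/Web/MathML/Reference/Element/<name>` for MathML, whereas the list item above documents `/Web/MathML/Element/<name>`; which of the two MDN currently serves is not verified here.

```javascript
// Hypothetical helper: maps "div", "svg_circle", "mml_math", ... to an MDN URL.
function buildMdnUrl(origin) {
  const [, rawNs, localName = origin] = origin.match(/^(?:(svg|mml)_)?([\w-]+)/i) ?? [];
  const ns = rawNs?.toLowerCase();
  const ml = ns === 'svg' ? 'SVG' : ns === 'mml' ? 'MathML' : 'HTML';
  // All six heading elements share a single MDN page.
  const urlTagName = /^h[1-6]$/i.test(localName) ? 'Heading_Elements' : localName;
  // Only the HTML reference uses the plural "Elements" path segment.
  return `https://developer.mozilla.org/en-US/docs/Web/${ml}/Reference/Element${ml === 'HTML' ? 's' : ''}/${urlTagName}`;
}

for (const name of ['h2', 'svg_circle', 'mml_math']) {
  console.log(name, '->', buildMdnUrl(name));
}
```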
package/docs/scraping.ja.md
CHANGED
|
@@ -20,6 +20,12 @@ SVG 要素:
|
|
|
20
20
|
https://developer.mozilla.org/en-US/docs/Web/SVG/Reference/Element/<name>
|
|
21
21
|
```
|
|
22
22
|
|
|
23
|
+
MathML 要素:
|
|
24
|
+
|
|
25
|
+
```
|
|
26
|
+
https://developer.mozilla.org/en-US/docs/Web/MathML/Element/<name>
|
|
27
|
+
```
|
|
28
|
+
|
|
23
29
|
**特殊ケース:** 見出し要素(`h1`-`h6`)は単一のページにマッピング:
|
|
24
30
|
|
|
25
31
|
```
|
|
@@ -122,6 +128,27 @@ https://developer.mozilla.org/en-US/docs/Web/SVG/Element
|
|
|
122
128
|
|
|
123
129
|
---
|
|
124
130
|
|
|
131
|
+
## MDN MathML インデックススクレイピング
|
|
132
|
+
|
|
133
|
+
**モジュール:** `mathml.ts`
|
|
134
|
+
|
|
135
|
+
### 対象
|
|
136
|
+
|
|
137
|
+
```
|
|
138
|
+
https://developer.mozilla.org/en-US/docs/Web/MathML/Element
|
|
139
|
+
```
|
|
140
|
+
|
|
141
|
+
### 抽出プロセス
|
|
142
|
+
|
|
143
|
+
1. メインコンテンツ領域内の全 `<td> <code>` 要素を検索
|
|
144
|
+
2. `m` で始まる要素名(MathML の慣例)でフィルタリング
|
|
145
|
+
3. 含まれる `<tr>` の非推奨/非標準アイコンクラス(`.icon-deprecated`, `.icon-nonstandard`)を確認
|
|
146
|
+
4. 各非推奨/非標準要素名に `mml_` プレフィックスを付加(例: `maction` → `mml_maction`)
|
|
147
|
+
|
|
148
|
+
SVG と異なり、MathML には「Obsolete and deprecated elements」の独立したセクションはない。代わりに、要素テーブル内のアイコンでインラインに非推奨/非標準ステータスが表示される。
|
|
149
|
+
|
|
150
|
+
---
|
|
151
|
+
|
|
125
152
|
## WAI-ARIA スクレイピング
|
|
126
153
|
|
|
127
154
|
**モジュール:** `aria.ts`
|
package/docs/scraping.md
CHANGED
|
@@ -20,6 +20,12 @@ SVG elements:
|
|
|
20
20
|
https://developer.mozilla.org/en-US/docs/Web/SVG/Reference/Element/<name>
|
|
21
21
|
```
|
|
22
22
|
|
|
23
|
+
MathML elements:
|
|
24
|
+
|
|
25
|
+
```
|
|
26
|
+
https://developer.mozilla.org/en-US/docs/Web/MathML/Element/<name>
|
|
27
|
+
```
|
|
28
|
+
|
|
23
29
|
**Special case:** Heading elements (`h1`-`h6`) are mapped to a single page:
|
|
24
30
|
|
|
25
31
|
```
|
|
@@ -122,6 +128,27 @@ https://developer.mozilla.org/en-US/docs/Web/SVG/Element
|
|
|
122
128
|
|
|
123
129
|
---
|
|
124
130
|
|
|
131
|
+
## MDN MathML Index Scraping
|
|
132
|
+
|
|
133
|
+
**Module:** `mathml.ts`
|
|
134
|
+
|
|
135
|
+
### Target
|
|
136
|
+
|
|
137
|
+
```
|
|
138
|
+
https://developer.mozilla.org/en-US/docs/Web/MathML/Element
|
|
139
|
+
```
|
|
140
|
+
|
|
141
|
+
### Extraction Process
|
|
142
|
+
|
|
143
|
+
1. Find all `<td> <code>` elements in the main content area
|
|
144
|
+
2. Filter for element names starting with `m` (MathML convention)
|
|
145
|
+
3. Check the containing `<tr>` for deprecated/non-standard icon classes (`.icon-deprecated`, `.icon-nonstandard`)
|
|
146
|
+
4. Prefix each deprecated/non-standard element name with `mml_` (e.g., `maction` becomes `mml_maction`)
|
|
147
|
+
|
|
148
|
+
Unlike SVG, MathML does not have a separate "Obsolete and deprecated elements" section. Instead, deprecated/non-standard status is indicated inline with icons in the element table.
|
|
149
|
+
|
|
150
|
+
---
|
|
151
|
+
|
|
125
152
|
## WAI-ARIA Scraping
|
|
126
153
|
|
|
127
154
|
**Module:** `aria.ts`
|
package/lib/aria.js
CHANGED
|
@@ -390,8 +390,9 @@ async function getAriaInHtml() {
|
|
|
390
390
|
for (const $implicitProp of $implicitProps) {
|
|
391
391
|
const htmlAttrName = $($implicitProp).find('th:nth-of-type(1) a').eq(0).text();
|
|
392
392
|
if (htmlAttrName === 'contenteditable') {
|
|
393
|
-
// FIXME:
|
|
394
|
-
//
|
|
393
|
+
// FIXME: The contenteditable attribute's ARIA semantics depend on ancestor elements
|
|
394
|
+
// (an element inherits contenteditable from its parent). Evaluating this requires
|
|
395
|
+
// cross-element ancestor traversal which is not available during spec generation.
|
|
395
396
|
continue;
|
|
396
397
|
}
|
|
397
398
|
const implicitProp = $($implicitProp).find('td:nth-of-type(1) code').eq(0).text();
|
package/lib/html-elements.d.ts
CHANGED
|
@@ -1,8 +1,8 @@
|
|
|
1
1
|
import type { ExtendedElementSpec } from '@markuplint/ml-spec';
|
|
2
2
|
/**
|
|
3
|
-
* Builds the complete list of HTML and
|
|
3
|
+
* Builds the complete list of HTML, SVG, and MathML element specifications by reading local JSON spec files,
|
|
4
4
|
* enriching them with data scraped from MDN, and appending obsolete/deprecated elements.
|
|
5
|
-
* Elements are sorted alphabetically with
|
|
5
|
+
* Elements are sorted alphabetically with MathML elements after HTML and SVG elements after MathML.
|
|
6
6
|
*
|
|
7
7
|
* @param filePattern - An absolute glob pattern matching the per-element JSON spec files
|
|
8
8
|
* @returns A sorted array of extended element specification objects
|
package/lib/html-elements.js
CHANGED
|
@@ -1,3 +1,4 @@
|
|
|
1
|
+
import { getMathMLElementList } from './mathml.js';
|
|
1
2
|
import { readJsons } from './read-json.js';
|
|
2
3
|
import { fetchHTMLElement, fetchObsoleteElements } from './scraping.js';
|
|
3
4
|
import { getSVGElementList } from './svg.js';
|
|
@@ -39,9 +40,16 @@ const obsoleteList = [
|
|
|
39
40
|
'tt',
|
|
40
41
|
];
|
|
41
42
|
/**
|
|
42
|
-
*
|
|
43
|
+
* Namespace sort order: HTML first, then MathML, then SVG.
|
|
44
|
+
*/
|
|
45
|
+
const namespaceSortOrder = {
|
|
46
|
+
'http://www.w3.org/2000/svg': 2,
|
|
47
|
+
'http://www.w3.org/1998/Math/MathML': 1,
|
|
48
|
+
};
|
|
49
|
+
/**
|
|
50
|
+
* Builds the complete list of HTML, SVG, and MathML element specifications by reading local JSON spec files,
|
|
43
51
|
* enriching them with data scraped from MDN, and appending obsolete/deprecated elements.
|
|
44
|
-
* Elements are sorted alphabetically with
|
|
52
|
+
* Elements are sorted alphabetically with MathML elements after HTML and SVG elements after MathML.
|
|
45
53
|
*
|
|
46
54
|
* @param filePattern - An absolute glob pattern matching the per-element JSON spec files
|
|
47
55
|
* @returns A sorted array of extended element specification objects
|
|
@@ -55,23 +63,34 @@ export async function getElements(filePattern) {
|
|
|
55
63
|
...body,
|
|
56
64
|
};
|
|
57
65
|
});
|
|
58
|
-
const
|
|
59
|
-
const obsoleteElements = fetchObsoleteElements([...obsoleteList, ...
|
|
66
|
+
const [svgDeprecatedList, mathmlDeprecatedList] = await Promise.all([getSVGElementList(), getMathMLElementList()]);
|
|
67
|
+
const obsoleteElements = fetchObsoleteElements([...obsoleteList, ...svgDeprecatedList, ...mathmlDeprecatedList], specs);
|
|
60
68
|
specs.push(...obsoleteElements);
|
|
61
69
|
specs = await Promise.all(specs.map(async (el) => {
|
|
62
70
|
const { localName, namespace, ml } = getName(el.name);
|
|
63
71
|
const urlTagName = /^h[1-6]$/i.test(localName) ? 'Heading_Elements' : localName;
|
|
64
72
|
// https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/a
|
|
65
73
|
// https://developer.mozilla.org/en-US/docs/Web/SVG/Reference/Element/a
|
|
74
|
+
// https://developer.mozilla.org/en-US/docs/Web/MathML/Reference/Element/math
|
|
66
75
|
const cite = `https://developer.mozilla.org/en-US/docs/Web/${ml}/Reference/Element${ml === 'HTML' ? 's' : ''}/${urlTagName}`;
|
|
67
76
|
const mdnData = await fetchHTMLElement(cite);
|
|
68
77
|
// @ts-ignore
|
|
69
78
|
delete el.name;
|
|
70
79
|
// @ts-ignore
|
|
71
80
|
delete el.namespace;
|
|
81
|
+
let qualifiedName;
|
|
82
|
+
if (namespace === 'http://www.w3.org/2000/svg') {
|
|
83
|
+
qualifiedName = `svg:${localName}`;
|
|
84
|
+
}
|
|
85
|
+
else if (namespace === 'http://www.w3.org/1998/Math/MathML') {
|
|
86
|
+
qualifiedName = `mml:${localName}`;
|
|
87
|
+
}
|
|
88
|
+
else {
|
|
89
|
+
qualifiedName = localName;
|
|
90
|
+
}
|
|
72
91
|
const spec = {
|
|
73
92
|
// @ts-ignore
|
|
74
|
-
name:
|
|
93
|
+
name: qualifiedName,
|
|
75
94
|
namespace,
|
|
76
95
|
cite: el.cite ?? mdnData.cite,
|
|
77
96
|
description: mdnData.description,
|
|
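The `qualifiedName` branch added in this hunk can be read as a small pure function. The sketch below restates it in isolation; `toQualifiedName` and the two constants are hypothetical names introduced only for illustration.

```javascript
const SVG_NS = 'http://www.w3.org/2000/svg';
const MATHML_NS = 'http://www.w3.org/1998/Math/MathML';

// Hypothetical helper wrapping the if/else chain from the hunk above.
function toQualifiedName(localName, namespace) {
  if (namespace === SVG_NS) {
    return `svg:${localName}`;
  }
  if (namespace === MATHML_NS) {
    return `mml:${localName}`;
  }
  // HTML elements carry no namespace and keep their plain local name.
  return localName;
}

console.log(toQualifiedName('circle', SVG_NS));
console.log(toQualifiedName('math', MATHML_NS));
console.log(toQualifiedName('div', undefined));
```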
@@ -126,7 +145,9 @@ export async function getElements(filePattern) {
|
|
|
126
145
|
};
|
|
127
146
|
return spec;
|
|
128
147
|
}));
|
|
129
|
-
return specs
|
|
130
|
-
.
|
|
131
|
-
|
|
148
|
+
return specs.toSorted(nameCompare).toSorted((a, b) => {
|
|
149
|
+
const orderA = namespaceSortOrder[a.namespace ?? ''] ?? 0;
|
|
150
|
+
const orderB = namespaceSortOrder[b.namespace ?? ''] ?? 0;
|
|
151
|
+
return orderA - orderB;
|
|
152
|
+
});
|
|
132
153
|
}
|
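The two chained sorts at the end of `getElements()` rely on sort stability: the first pass orders by name, the second groups by namespace without disturbing the name order inside each group. The following is a minimal sketch of that idea, not the package's code. The `nameCompare` shown here is an assumed stand-in (the real one lives in `utils.js` and is not part of this diff), `sortSpecs` is a hypothetical wrapper, and `slice().sort()` replaces `toSorted()` so the sketch also runs on older runtimes.

```javascript
const namespaceSortOrder = {
  'http://www.w3.org/2000/svg': 2,
  'http://www.w3.org/1998/Math/MathML': 1,
};

// Assumed stand-in for utils.nameCompare: case-insensitive comparison of `name`.
function nameCompare(a, b) {
  const nameA = a.name.toLowerCase();
  const nameB = b.name.toLowerCase();
  return nameA < nameB ? -1 : nameA > nameB ? 1 : 0;
}

function sortSpecs(specs) {
  const byName = specs.slice().sort(nameCompare);
  // Array.prototype.sort is stable, so names stay ordered within each namespace group.
  return byName.slice().sort((a, b) => {
    const orderA = namespaceSortOrder[a.namespace ?? ''] ?? 0; // HTML (no namespace) sorts first
    const orderB = namespaceSortOrder[b.namespace ?? ''] ?? 0;
    return orderA - orderB;
  });
}

const sorted = sortSpecs([
  { name: 'svg:circle', namespace: 'http://www.w3.org/2000/svg' },
  { name: 'video' },
  { name: 'mml:math', namespace: 'http://www.w3.org/1998/Math/MathML' },
  { name: 'svg:a', namespace: 'http://www.w3.org/2000/svg' },
  { name: 'a' },
]);
console.log(sorted.map((spec) => spec.name));
// → [ 'a', 'video', 'mml:math', 'svg:a', 'svg:circle' ]
```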
package/lib/mathml.d.ts
ADDED
|
@@ -0,0 +1,8 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Fetches the MDN MathML element index page and extracts the list of
|
|
3
|
+
* deprecated and non-standard MathML element names.
|
|
4
|
+
* Each name is prefixed with `"mml_"` to distinguish it from HTML/SVG elements.
|
|
5
|
+
*
|
|
6
|
+
* @returns An array of deprecated/non-standard MathML element names (e.g., `["mml_maction", ...]`)
|
|
7
|
+
*/
|
|
8
|
+
export declare function getMathMLElementList(): Promise<string[]>;
|
package/lib/mathml.js
ADDED
|
@@ -0,0 +1,29 @@
|
|
|
1
|
+
import { fetch } from './fetch.js';
|
|
2
|
+
/**
|
|
3
|
+
* Fetches the MDN MathML element index page and extracts the list of
|
|
4
|
+
* deprecated and non-standard MathML element names.
|
|
5
|
+
* Each name is prefixed with `"mml_"` to distinguish it from HTML/SVG elements.
|
|
6
|
+
*
|
|
7
|
+
* @returns An array of deprecated/non-standard MathML element names (e.g., `["mml_maction", ...]`)
|
|
8
|
+
*/
|
|
9
|
+
export async function getMathMLElementList() {
|
|
10
|
+
const index = 'https://developer.mozilla.org/en-US/docs/Web/MathML/Element';
|
|
11
|
+
const $ = await fetch(index);
|
|
12
|
+
const deprecatedList = [];
|
|
13
|
+
// MathML element page lists deprecated/non-standard status inline with icons
|
|
14
|
+
// rather than having a separate section like SVG
|
|
15
|
+
$('main#content td code').each((_, el) => {
|
|
16
|
+
const $el = $(el);
|
|
17
|
+
const text = $el.text().trim().replaceAll(/<|>/g, '');
|
|
18
|
+
if (!text || !text.startsWith('m')) {
|
|
19
|
+
return;
|
|
20
|
+
}
|
|
21
|
+
const $row = $el.closest('tr');
|
|
22
|
+
const hasDeprecated = $row.find('.icon.icon-deprecated, .icon-deprecated').length > 0;
|
|
23
|
+
const hasNonStandard = $row.find('.icon.icon-nonstandard, .icon-nonstandard').length > 0;
|
|
24
|
+
if (hasDeprecated || hasNonStandard) {
|
|
25
|
+
deprecatedList.push('mml_' + text);
|
|
26
|
+
}
|
|
27
|
+
});
|
|
28
|
+
return deprecatedList;
|
|
29
|
+
}
|
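The filtering logic in `getMathMLElementList()` can be exercised without a network request or an HTML parser. The sketch below swaps the cheerio DOM for plain objects, so `collectDeprecatedMathML` and the row shape (`code`, `iconClasses`) are hypothetical, and the sample flags are illustrative rather than MDN's actual statuses.

```javascript
// Each row is a plain-object stand-in for a <tr> of the MDN element table:
// `code` is the text of its <td><code>, `iconClasses` the icon classes found in that row.
function collectDeprecatedMathML(rows) {
  const deprecatedList = [];
  for (const row of rows) {
    const text = row.code.trim().replace(/<|>/g, '');
    // Same guard as the hunk above: only names starting with "m" are considered.
    if (!text || !text.startsWith('m')) {
      continue;
    }
    const flagged =
      row.iconClasses.includes('icon-deprecated') || row.iconClasses.includes('icon-nonstandard');
    if (flagged) {
      deprecatedList.push('mml_' + text);
    }
  }
  return deprecatedList;
}

const rows = [
  { code: '<math>', iconClasses: [] },
  { code: '<mfenced>', iconClasses: ['icon-deprecated', 'icon-nonstandard'] },
  { code: '<semantics>', iconClasses: ['icon-deprecated'] }, // made-up flag, to exercise the "m" filter
  { code: '<maction>', iconClasses: ['icon-deprecated'] },
];
console.log(collectDeprecatedMathML(rows));
// → [ 'mml_mfenced', 'mml_maction' ]
```

One consequence of the `m` prefix filter is that MathML elements whose names do not start with `m` (for example `semantics` or `annotation`) would never be reported, even if they were flagged in the table.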
package/lib/utils.d.ts
CHANGED
|
@@ -14,7 +14,7 @@ type HasName = {
|
|
|
14
14
|
* @param b - The second item to compare, either a string or an object with a `name` property
|
|
15
15
|
* @returns A negative number if `a` comes before `b`, positive if after, or zero if equal
|
|
16
16
|
*/
|
|
17
|
-
export declare function nameCompare(a: HasName | string, b: HasName | string): 1 |
|
|
17
|
+
export declare function nameCompare(a: HasName | string, b: HasName | string): 1 | -1 | 0;
|
|
18
18
|
/**
|
|
19
19
|
* Creates a new object with the same key-value pairs, sorted alphabetically by key.
|
|
20
20
|
*
|
|
@@ -63,14 +63,15 @@ export declare function mergeAttributes<T>(fromDocs: T, fromJSON: T): T;
|
|
|
63
63
|
export declare function keys<T, K = keyof T>(object: T): K[];
|
|
64
64
|
/**
|
|
65
65
|
* Parses an element name string to extract the local name, namespace, and markup language type.
|
|
66
|
-
* Handles SVG-prefixed names (e.g., `"svg_circle"`)
|
|
66
|
+
* Handles SVG-prefixed names (e.g., `"svg_circle"`), MathML-prefixed names (e.g., `"mml_math"`),
|
|
67
|
+
* and plain HTML names.
|
|
67
68
|
*
|
|
68
|
-
* @param origin - The raw element name, optionally prefixed with `"svg_"` for SVG elements
|
|
69
|
-
* @returns An object containing the `localName`, optional
|
|
69
|
+
* @param origin - The raw element name, optionally prefixed with `"svg_"` or `"mml_"` for SVG/MathML elements
|
|
70
|
+
* @returns An object containing the `localName`, optional `namespace` URI, and `ml` type (`"SVG"`, `"MathML"`, or `"HTML"`)
|
|
70
71
|
*/
|
|
71
72
|
export declare function getName(origin: string): {
|
|
72
73
|
localName: string;
|
|
73
|
-
namespace: "http://www.w3.org/2000/svg" | undefined;
|
|
74
|
+
namespace: "http://www.w3.org/2000/svg" | "http://www.w3.org/1998/Math/MathML" | undefined;
|
|
74
75
|
ml: string;
|
|
75
76
|
};
|
|
76
77
|
export {};
|
package/lib/utils.js
CHANGED
|
@@ -109,16 +109,18 @@ export function keys(object) {
|
|
|
109
109
|
}
|
|
110
110
|
/**
|
|
111
111
|
* Parses an element name string to extract the local name, namespace, and markup language type.
|
|
112
|
-
* Handles SVG-prefixed names (e.g., `"svg_circle"`)
|
|
112
|
+
* Handles SVG-prefixed names (e.g., `"svg_circle"`), MathML-prefixed names (e.g., `"mml_math"`),
|
|
113
|
+
* and plain HTML names.
|
|
113
114
|
*
|
|
114
|
-
* @param origin - The raw element name, optionally prefixed with `"svg_"` for SVG elements
|
|
115
|
-
* @returns An object containing the `localName`, optional
|
|
115
|
+
* @param origin - The raw element name, optionally prefixed with `"svg_"` or `"mml_"` for SVG/MathML elements
|
|
116
|
+
* @returns An object containing the `localName`, optional `namespace` URI, and `ml` type (`"SVG"`, `"MathML"`, or `"HTML"`)
|
|
116
117
|
*/
|
|
117
118
|
export function getName(origin) {
|
|
118
|
-
const [,
|
|
119
|
+
const [, rawNs, localName] = origin.match(/^(?:(svg|mml)_)?([\w-]+)/i) ?? [];
|
|
120
|
+
const ns = rawNs?.toLowerCase();
|
|
119
121
|
const name = localName ?? origin;
|
|
120
|
-
const ml = ns === 'svg' ? 'SVG' : 'HTML';
|
|
121
|
-
const namespace = ns === 'svg' ? 'http://www.w3.org/2000/svg' : undefined;
|
|
122
|
+
const ml = ns === 'svg' ? 'SVG' : ns === 'mml' ? 'MathML' : 'HTML';
|
|
123
|
+
const namespace = ns === 'svg' ? 'http://www.w3.org/2000/svg' : ns === 'mml' ? 'http://www.w3.org/1998/Math/MathML' : undefined;
|
|
122
124
|
return {
|
|
123
125
|
localName: name,
|
|
124
126
|
namespace,
|
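For reference, the updated `getName()` can be tried in isolation. The sketch below reassembles the function from the added lines of this hunk; the final `ml` property of the returned object is cut off in the hunk and is assumed here from the return type declared in `utils.d.ts`.

```javascript
function getName(origin) {
  // Optional "svg_" or "mml_" prefix (case-insensitive), then the local name.
  const [, rawNs, localName] = origin.match(/^(?:(svg|mml)_)?([\w-]+)/i) ?? [];
  const ns = rawNs?.toLowerCase();
  const name = localName ?? origin;
  const ml = ns === 'svg' ? 'SVG' : ns === 'mml' ? 'MathML' : 'HTML';
  const namespace =
    ns === 'svg'
      ? 'http://www.w3.org/2000/svg'
      : ns === 'mml'
        ? 'http://www.w3.org/1998/Math/MathML'
        : undefined;
  return { localName: name, namespace, ml };
}

console.log(getName('div'));
console.log(getName('svg_circle'));
console.log(getName('mml_math'));
```

Because the prefix is lowercased before comparison, an input such as `"SVG_circle"` resolves to the same namespace as `"svg_circle"`.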
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@markuplint/spec-generator",
|
|
3
|
-
"version": "5.0.0-alpha.
|
|
3
|
+
"version": "5.0.0-alpha.3",
|
|
4
4
|
"description": "Generates @markuplint/html-spec",
|
|
5
5
|
"repository": "git@github.com:markuplint/markuplint.git",
|
|
6
6
|
"author": "Yusuke Hirao <yusukehirao@me.com>",
|
|
@@ -28,15 +28,15 @@
|
|
|
28
28
|
"ajv": "8.18.0",
|
|
29
29
|
"cheerio": "1.2.0",
|
|
30
30
|
"cli-progress": "3.12.0",
|
|
31
|
-
"fast-xml-parser": "5.
|
|
31
|
+
"fast-xml-parser": "5.4.1",
|
|
32
32
|
"glob": "13.0.6",
|
|
33
33
|
"jsonc-parser": "3.3.1"
|
|
34
34
|
},
|
|
35
35
|
"devDependencies": {
|
|
36
|
-
"@markuplint/ml-spec": "5.0.0-alpha.
|
|
37
|
-
"@markuplint/test-tools": "5.0.0-alpha.
|
|
36
|
+
"@markuplint/ml-spec": "5.0.0-alpha.3",
|
|
37
|
+
"@markuplint/test-tools": "5.0.0-alpha.3",
|
|
38
38
|
"@types/cli-progress": "3.11.6",
|
|
39
39
|
"type-fest": "5.4.4"
|
|
40
40
|
},
|
|
41
|
-
"gitHead": "
|
|
41
|
+
"gitHead": "2fbdf26daa3d021ac628ccc2f59f0eeae6ddd53d"
|
|
42
42
|
}
|