reffy 20.0.16 → 21.0.0

Files changed (3)
  1. package/README.md +151 -151
  2. package/package.json +6 -6
  3. package/src/lib/mock-server.js +202 -218
package/README.md CHANGED
@@ -1,151 +1,151 @@
1
- # Reffy
2
-
3
- <img align="right" width="256" height="256" src="images/reffy-512.png" alt="Reffy, represented as a brave little worm with a construction helmet, ready to crawl specs">
4
-
5
- Reffy is a **Web spec crawler** tool. It is notably used to update [Webref](https://github.com/w3c/webref#webref) every 6 hours.
6
-
7
- The code features a generic crawler that can fetch Web specifications and generate machine-readable extracts out of them. Created extracts include lists of CSS properties, definitions, IDL, links and references contained in the specification.
8
-
9
- ## How to use
10
-
11
- ### Pre-requisites
12
-
13
- To install Reffy, you need [Node.js](https://nodejs.org/en/) 20.12.1 or greater (the crawler itself may still run with earlier versions of Node.js but without any guarantee).
14
-
15
- ### Installation
16
-
17
- Reffy is available as an NPM package. To install the package globally, run:
18
-
19
- ```bash
20
- npm install -g reffy
21
- ```
22
-
23
- This will install Reffy as a command-line interface tool.
24
-
25
- The list of specs crawled by default evolves regularly. To make sure that you run the latest version, use:
26
-
27
- ```bash
28
- npm update -g reffy
29
- ```
30
-
31
- ### Launch Reffy
32
-
33
- Reffy crawls requested specifications and runs a set of processing modules on the content fetched to create relevant extracts from each spec. Which specs get crawled, and which processing modules get run depend on how the crawler gets called. By default, the crawler crawls all specs defined in [browser-specs](https://github.com/w3c/browser-specs/) and runs all core processing modules defined in the [`browserlib`](https://github.com/w3c/reffy/tree/main/src/browserlib) folder.
34
-
35
- Reffy can also run post-processing modules on the results of the crawl to create additional views of the data extracted from the spec during the crawl.
36
-
37
- Crawl results will either be returned to the console or saved in individual files in a report folder when the `--output` parameter is set.
38
-
39
- Examples of information that can be extracted from the specs:
40
-
41
- 1. Generic information such as the title of the spec or the URL of the Editor's Draft. This information is typically copied over from [browser-specs](https://github.com/w3c/browser-specs/).
42
- 2. The list of terms that the spec defines, in a format suitable for ingestion in cross-referencing tools such as [ReSpec](https://respec.org/xref/).
43
- 3. The list of IDs, the list of headings and the list of links in the spec.
44
- 4. The list of normative/informative references found in the spec.
45
- 5. Extended information about WebIDL term definitions and references that the spec contains
46
- 6. For CSS specs, the list of CSS properties, descriptors and value spaces that the spec defines.
47
-
48
- The crawler can be fully parameterized to crawl a specific list of specs and run a custom set of processing modules on them. For example:
49
-
50
- - To extract the raw IDL defined in Fetch, run:
51
- ```bash
52
- reffy --spec fetch --module idl
53
- ```
54
- - To retrieve the list of specs that the HTML spec references, run (noting that crawling the HTML spec takes some time due to it being a multipage spec):
55
- ```bash
56
- reffy --spec html --module refs
57
- ```
58
- - To extract the list of CSS properties defined in CSS Flexible Box Layout Module Level 1, run:
59
- ```bash
60
- reffy --spec css-flexbox-1 --module css
61
- ```
62
- - To extract the list of terms defined in WAI ARIA 1.2, run:
63
- ```bash
64
- reffy --spec wai-aria-1.2 --module dfns
65
- ```
66
- - To run an hypothetical `extract-editors.mjs` processing module and create individual spec extracts with the result of the processing under an `editors` folder for all specs in browser-specs, run:
67
- ```bash
68
- reffy --output reports/test --module editors:extract-editors.mjs
69
- ```
70
-
71
- You may add `--terse` (or `-t`) to the above commands to access the extracts directly.
72
-
73
- Run `reffy -h` for a complete list of options and usage details.
74
-
75
-
76
- Some notes:
77
-
78
- * The crawler may take a few minutes, depending on the number of specs it needs to crawl.
79
- * The crawler uses a local cache for HTTP exchanges. It will create and fill a `.cache` subfolder in particular.
80
- * If you cloned the repo instead of installing Reffy globally, replace `reffy` width `node reffy.js` in the above example to run Reffy.
81
-
82
-
83
- ## Additional tools
84
-
85
- Additional CLI tools in the `src/cli` folder complete the main specs crawler.
86
-
87
-
88
- ### WebIDL parser
89
-
90
- The **WebIDL parser** takes the relative path to an IDL extract and generates a JSON structure that describes WebIDL term definitions and references that the spec contains. The parser uses [WebIDL2](https://github.com/darobin/webidl2.js/) to parse the WebIDL content found in the spec. To run the WebIDL parser: `node src/cli/parse-webidl.js [idlfile]`
91
-
92
- To create the WebIDL extract in the first place, you will need to run the `idl` module in Reffy, as in:
93
-
94
- ```bash
95
- reffy --spec fetch --module idl > fetch.idl
96
- ```
97
-
98
-
99
- ### Crawl results merger
100
-
101
- The **crawl results merger** merges a new JSON crawl report into a reference one. This tool is typically useful to replace the crawl results of a given specification with the results of a new run of the crawler on that specification. To run the crawl results merger: `node src/cli/merge-crawl-results.js [new crawl report] [reference crawl report] [crawl report to create]`
102
-
103
-
104
- ### Analysis tools
105
-
106
- Starting with Reffy v5, analysis tools that used to be part of Reffy's suite of tools to study extracts and create human-readable reports of potential spec anomalies migrated to a companion tool named [Strudy](https://github.com/w3c/strudy). The actual reports get published in a separate [w3c/webref-analysis](https://github.com/w3c/webref-analysis) repository as well.
107
-
108
-
109
- ### WebIDL terms explorer
110
-
111
- See the related **[WebIDLPedia](https://dontcallmedom.github.io/webidlpedia)** project and its [repo](https://github.com/dontcallmedom/webidlpedia).
112
-
113
-
114
- ## Technical notes
115
-
116
- Reffy should be able to parse most of the W3C/WHATWG specifications that define CSS and/or WebIDL terms (both published versions and Editor's Drafts), and more generally speaking specs authored with one of [Bikeshed](https://tabatkins.github.io/bikeshed/) or [ReSpec](https://respec.org/docs/). Reffy can also parse certain IETF specs to some extent, and may work with other types of specs as well.
117
-
118
- ### List of specs to crawl
119
-
120
- Reffy crawls specs defined in [w3c/browser-specs](https://github.com/w3c/browser-specs/). If you believe a spec is missing, please check the [Spec selection criteria](https://github.com/w3c/browser-specs/#spec-selection-criteria) and create an issue (or prepare a pull request) against the [w3c/browser-specs](https://github.com/w3c/browser-specs/) repository.
121
-
122
- ### Crawling a spec
123
-
124
- Given some spec info, the crawler basically goes through the following steps:
125
-
126
- 1. Load the URL through Puppeteer.
127
- 2. If the document contains a "head" section that includes a link whose label looks like "single page", go back to step 2 and load the target of that link instead. This makes the crawler load the single page version of multi-page specifications such as HTML5.
128
- 3. If the document is a multi-page spec without a "single page" version, load the individual subpage and add their content to the bottom of the first page to create a single page version.
129
- 4. If the document uses ReSpec, let ReSpec finish its generation work.
130
- 5. Run internal tools on the generated document to build the relevant information.
131
-
132
- The crawler processes 4 specifications at a time. Network and parsing errors should be reported in the crawl results.
133
-
134
- ### Config parameters
135
-
136
- The crawler reads parameters from the `config.json` file. Optional parameters:
137
-
138
- * `cacheRefresh`: set this flag to `never` to tell the crawler to use the cache entry for a URL directly, instead of sending a conditional HTTP request to check whether the entry is still valid. This parameter is typically useful when developing Reffy's code to work offline.
139
- * `resetCache`: set this flag to `true` to tell the crawler to reset the contents of the local cache when it starts.
140
-
141
-
142
- ## Contributing
143
-
144
- Authors so far are [François Daoust](https://github.com/tidoust/) and [Dominique Hazaël-Massieux](https://github.com/dontcallmedom/).
145
-
146
- Additional ideas, bugs and/or code contributions are most welcome. Create [issues on GitHub](https://github.com/w3c/reffy/issues) as needed!
147
-
148
-
149
- ## Licensing
150
-
151
- The code is available under an [MIT license](LICENSE).
1
+ # Reffy
2
+
3
+ <img align="right" width="256" height="256" src="images/reffy-512.png" alt="Reffy, represented as a brave little worm with a construction helmet, ready to crawl specs">
4
+
5
+ Reffy is a **Web spec crawler** tool. It is notably used to update [Webref](https://github.com/w3c/webref#webref) every 6 hours.
6
+
7
+ The code features a generic crawler that can fetch Web specifications and generate machine-readable extracts out of them. Created extracts include lists of CSS properties, definitions, IDL, links and references contained in the specification.
8
+
9
+ ## How to use
10
+
11
+ ### Pre-requisites
12
+
13
+ To install Reffy, you need [Node.js](https://nodejs.org/en/) 22.19.0 or greater (the crawler itself may still run with earlier versions of Node.js but without any guarantee).
14
+
15
+ ### Installation
16
+
17
+ Reffy is available as an NPM package. To install the package globally, run:
18
+
19
+ ```bash
20
+ npm install -g reffy
21
+ ```
22
+
23
+ This will install Reffy as a command-line interface tool.
24
+
25
+ The list of specs crawled by default evolves regularly. To make sure that you run the latest version, use:
26
+
27
+ ```bash
28
+ npm update -g reffy
29
+ ```
30
+
31
+ ### Launch Reffy
32
+
33
+ Reffy crawls requested specifications and runs a set of processing modules on the content fetched to create relevant extracts from each spec. Which specs get crawled, and which processing modules get run depend on how the crawler gets called. By default, the crawler crawls all specs defined in [browser-specs](https://github.com/w3c/browser-specs/) and runs all core processing modules defined in the [`browserlib`](https://github.com/w3c/reffy/tree/main/src/browserlib) folder.
34
+
35
+ Reffy can also run post-processing modules on the results of the crawl to create additional views of the data extracted from the spec during the crawl.
36
+
37
+ Crawl results will either be returned to the console or saved in individual files in a report folder when the `--output` parameter is set.
38
+
39
+ Examples of information that can be extracted from the specs:
40
+
41
+ 1. Generic information such as the title of the spec or the URL of the Editor's Draft. This information is typically copied over from [browser-specs](https://github.com/w3c/browser-specs/).
42
+ 2. The list of terms that the spec defines, in a format suitable for ingestion in cross-referencing tools such as [ReSpec](https://respec.org/xref/).
43
+ 3. The list of IDs, the list of headings and the list of links in the spec.
44
+ 4. The list of normative/informative references found in the spec.
45
+ 5. Extended information about WebIDL term definitions and references that the spec contains
46
+ 6. For CSS specs, the list of CSS properties, descriptors and value spaces that the spec defines.
47
+
48
+ The crawler can be fully parameterized to crawl a specific list of specs and run a custom set of processing modules on them. For example:
49
+
50
+ - To extract the raw IDL defined in Fetch, run:
51
+ ```bash
52
+ reffy --spec fetch --module idl
53
+ ```
54
+ - To retrieve the list of specs that the HTML spec references, run (noting that crawling the HTML spec takes some time due to it being a multipage spec):
55
+ ```bash
56
+ reffy --spec html --module refs
57
+ ```
58
+ - To extract the list of CSS properties defined in CSS Flexible Box Layout Module Level 1, run:
59
+ ```bash
60
+ reffy --spec css-flexbox-1 --module css
61
+ ```
62
+ - To extract the list of terms defined in WAI ARIA 1.2, run:
63
+ ```bash
64
+ reffy --spec wai-aria-1.2 --module dfns
65
+ ```
66
+ - To run an hypothetical `extract-editors.mjs` processing module and create individual spec extracts with the result of the processing under an `editors` folder for all specs in browser-specs, run:
67
+ ```bash
68
+ reffy --output reports/test --module editors:extract-editors.mjs
69
+ ```
70
+
71
+ You may add `--terse` (or `-t`) to the above commands to access the extracts directly.
72
+
73
+ Run `reffy -h` for a complete list of options and usage details.
74
+
75
+
76
+ Some notes:
77
+
78
+ * The crawler may take a few minutes, depending on the number of specs it needs to crawl.
79
+ * The crawler uses a local cache for HTTP exchanges. It will create and fill a `.cache` subfolder in particular.
80
+ * If you cloned the repo instead of installing Reffy globally, replace `reffy` width `node reffy.js` in the above example to run Reffy.
81
+
82
+
83
+ ## Additional tools
84
+
85
+ Additional CLI tools in the `src/cli` folder complete the main specs crawler.
86
+
87
+
88
+ ### WebIDL parser
89
+
90
+ The **WebIDL parser** takes the relative path to an IDL extract and generates a JSON structure that describes WebIDL term definitions and references that the spec contains. The parser uses [WebIDL2](https://github.com/darobin/webidl2.js/) to parse the WebIDL content found in the spec. To run the WebIDL parser: `node src/cli/parse-webidl.js [idlfile]`
91
+
92
+ To create the WebIDL extract in the first place, you will need to run the `idl` module in Reffy, as in:
93
+
94
+ ```bash
95
+ reffy --spec fetch --module idl > fetch.idl
96
+ ```
97
+
98
+
99
+ ### Crawl results merger
100
+
101
+ The **crawl results merger** merges a new JSON crawl report into a reference one. This tool is typically useful to replace the crawl results of a given specification with the results of a new run of the crawler on that specification. To run the crawl results merger: `node src/cli/merge-crawl-results.js [new crawl report] [reference crawl report] [crawl report to create]`
102
+
103
+
104
+ ### Analysis tools
105
+
106
+ Starting with Reffy v5, analysis tools that used to be part of Reffy's suite of tools to study extracts and create human-readable reports of potential spec anomalies migrated to a companion tool named [Strudy](https://github.com/w3c/strudy). The actual reports get published in a separate [w3c/webref-analysis](https://github.com/w3c/webref-analysis) repository as well.
107
+
108
+
109
+ ### WebIDL terms explorer
110
+
111
+ See the related **[WebIDLPedia](https://dontcallmedom.github.io/webidlpedia)** project and its [repo](https://github.com/dontcallmedom/webidlpedia).
112
+
113
+
114
+ ## Technical notes
115
+
116
+ Reffy should be able to parse most of the W3C/WHATWG specifications that define CSS and/or WebIDL terms (both published versions and Editor's Drafts), and more generally speaking specs authored with one of [Bikeshed](https://tabatkins.github.io/bikeshed/) or [ReSpec](https://respec.org/docs/). Reffy can also parse certain IETF specs to some extent, and may work with other types of specs as well.
117
+
118
+ ### List of specs to crawl
119
+
120
+ Reffy crawls specs defined in [w3c/browser-specs](https://github.com/w3c/browser-specs/). If you believe a spec is missing, please check the [Spec selection criteria](https://github.com/w3c/browser-specs/#spec-selection-criteria) and create an issue (or prepare a pull request) against the [w3c/browser-specs](https://github.com/w3c/browser-specs/) repository.
121
+
122
+ ### Crawling a spec
123
+
124
+ Given some spec info, the crawler basically goes through the following steps:
125
+
126
+ 1. Load the URL through Puppeteer.
127
+ 2. If the document contains a "head" section that includes a link whose label looks like "single page", go back to step 2 and load the target of that link instead. This makes the crawler load the single page version of multi-page specifications such as HTML5.
128
+ 3. If the document is a multi-page spec without a "single page" version, load the individual subpage and add their content to the bottom of the first page to create a single page version.
129
+ 4. If the document uses ReSpec, let ReSpec finish its generation work.
130
+ 5. Run internal tools on the generated document to build the relevant information.
131
+
132
+ The crawler processes 4 specifications at a time. Network and parsing errors should be reported in the crawl results.
133
+
134
+ ### Config parameters
135
+
136
+ The crawler reads parameters from the `config.json` file. Optional parameters:
137
+
138
+ * `cacheRefresh`: set this flag to `never` to tell the crawler to use the cache entry for a URL directly, instead of sending a conditional HTTP request to check whether the entry is still valid. This parameter is typically useful when developing Reffy's code to work offline.
139
+ * `resetCache`: set this flag to `true` to tell the crawler to reset the contents of the local cache when it starts.
140
+
141
+
142
+ ## Contributing
143
+
144
+ Authors so far are [François Daoust](https://github.com/tidoust/) and [Dominique Hazaël-Massieux](https://github.com/dontcallmedom/).
145
+
146
+ Additional ideas, bugs and/or code contributions are most welcome. Create [issues on GitHub](https://github.com/w3c/reffy/issues) as needed!
147
+
148
+
149
+ ## Licensing
150
+
151
+ The code is available under an [MIT license](LICENSE).
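The only substantive README change in this diff is the Node.js prerequisite, which moves to 22.19.0. As a rough illustration, a script could check the running version against that minimum before invoking Reffy. This is a minimal sketch and not part of the package: the `satisfiesMinimum` helper and the `MINIMUM_NODE` constant are hypothetical names, and the comparison ignores pre-release tags.

```javascript
// Hypothetical helper, not part of Reffy: compares dotted
// "major.minor.patch" versions numerically, ignoring pre-release tags.
function satisfiesMinimum(version, minimum) {
  const actual = version.split('.').map((part) => parseInt(part, 10));
  const wanted = minimum.split('.').map((part) => parseInt(part, 10));
  for (let i = 0; i < 3; i++) {
    if (actual[i] > wanted[i]) return true;
    if (actual[i] < wanted[i]) return false;
  }
  return true; // identical versions satisfy ">="
}

// Minimum taken from the updated README prerequisite.
const MINIMUM_NODE = '22.19.0';

if (!satisfiesMinimum(process.versions.node, MINIMUM_NODE)) {
  console.warn(
    `Node.js ${process.versions.node} is older than ${MINIMUM_NODE}; ` +
    'the crawler may still run, but without any guarantee.'
  );
}
```

The comparison is numeric rather than lexical, so `22.9.0` is correctly treated as older than `22.19.0`.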
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "reffy",
3
- "version": "20.0.16",
3
+ "version": "21.0.0",
4
4
  "description": "W3C/WHATWG spec dependencies exploration companion. Features a short set of tools to study spec references as well as WebIDL term definitions and references found in W3C specifications.",
5
5
  "repository": {
6
6
  "type": "git",
@@ -27,7 +27,7 @@
27
27
  ],
28
28
  "license": "MIT",
29
29
  "engines": {
30
- "node": ">=20.18.1"
30
+ "node": ">=22.19.0"
31
31
  },
32
32
  "type": "module",
33
33
  "main": "index.js",
@@ -37,16 +37,16 @@
37
37
  "ajv-formats": "3.0.1",
38
38
  "commander": "14.0.3",
39
39
  "fetch-filecache-for-crawling": "5.1.1",
40
- "puppeteer": "24.42.0",
40
+ "puppeteer": "24.43.0",
41
41
  "semver": "^7.3.5",
42
- "web-specs": "3.84.0",
42
+ "web-specs": "3.85.0",
43
43
  "webidl2": "24.5.0"
44
44
  },
45
45
  "devDependencies": {
46
46
  "respec": "37.0.0",
47
47
  "respec-hljs": "2.1.1",
48
- "rollup": "4.60.2",
49
- "undici": "^7.0.0"
48
+ "rollup": "4.60.3",
49
+ "undici": "^8.2.0"
50
50
  },
51
51
  "overrides": {
52
52
  "puppeteer": "$puppeteer"
package/src/lib/mock-server.js CHANGED
@@ -1,218 +1,202 @@
1
- /**
2
- * Setup a proxy server that intercepts some network requests. To be used in
3
- * tests not to hit the network.
4
- *
5
- * @module mock-server
6
- */
7
-
8
- import { MockAgent, setGlobalDispatcher } from 'undici';
9
- import path from 'node:path';
10
- import { existsSync, readFileSync } from 'node:fs';
11
- import { fileURLToPath } from 'node:url';
12
-
13
- const scriptPath = path.dirname(fileURLToPath(import.meta.url));
14
-
15
- /**
16
- * Determine the path to the "node_modules" folder. The path depends on whether
17
- * Reffy is run directly, or installed as a library.
18
- *
19
- * @function
20
- * @return {String} Path to the node_modules folder.
21
- */
22
- function getModulesFolder() {
23
- const rootFolder = path.resolve(scriptPath, '../..');
24
- let folder = path.resolve(rootFolder, 'node_modules');
25
- if (existsSync(folder)) {
26
- return folder;
27
- }
28
- folder = path.resolve(rootFolder, '..');
29
- return folder;
30
- }
31
- const modulesFolder = getModulesFolder();
32
-
33
- const mockSpecs = {
34
- "/woff/woff2/": {
35
- html: `
36
- <title>WOFF2</title>
37
- <body>
38
- <dfn id='foo' data-dfn-type="dfn">Foo</dfn>
39
- <a href="https://www.w3.org/TR/bar/#baz">bar</a>
40
- <ul class='toc'><li><a href='page.html'>page</a></ul>`,
41
- pages: {
42
- "page.html": `<h2 id='bar'>Heading in subpage</h2>`
43
- }
44
- },
45
- "/mediacapture-output/": `
46
- <script>respecConfig = { shortName: 'test' };</script>
47
- <script src='https://www.w3.org/Tools/respec/respec-w3c'></script>
48
- <div id=abstract></div>
49
- <pre class='idl'>[Exposed=Window] interface Foo { attribute DOMString bar; };</pre>`,
50
- "/accelerometer/": `<html><meta name='document-revision' content='c0917d216986f88bdd43c72c0b13352c71f283aa'>
51
- <h2>Normative references</h2>
52
- <dl>
53
- <dt>FOO</dt>
54
- <dd><a href='https://www.w3.org/TR/Foo'>Foo</a></dd>
55
- </dl>`,
56
- "/pointerlock/": `<html>
57
- <h1>Pointer Lock 2.0`,
58
- "/TR/remote-playback/": {
59
- html: `<title>Published version</title>
60
- <body><h1>Published version</h1></body>`,
61
- domain: 'https://www.w3.org'
62
- }
63
- };
64
-
65
- const respecHiglight = readFileSync(
66
- path.join(modulesFolder, "respec-hljs", "dist", "respec-highlight.js"),
67
- 'utf8'
68
- );
69
- const respecW3C = readFileSync(
70
- path.join(modulesFolder, "respec", "builds", "respec-w3c.js"),
71
- 'utf8'
72
- );
73
-
74
- const mockAgent = new MockAgent();
75
- setGlobalDispatcher(mockAgent);
76
- mockAgent.disableNetConnect();
77
- // for chrome devtool protocol
78
- mockAgent.enableNetConnect('127.0.0.1');
79
-
80
- for (const [path, desc] of Object.entries(mockSpecs)) {
81
- mockAgent.get(desc.domain || "https://w3c.github.io")
82
- .intercept({ method: "GET", path })
83
- .reply(200, desc.html || desc, {
84
- headers: { "Content-Type": "text/html" }
85
- })
86
- .persist();
87
-
88
- for (const [page, pageContent] of Object.entries(desc.pages || {})) {
89
- mockAgent.get(desc.domain || "https://w3c.github.io")
90
- .intercept({ method: "GET", path: path + page })
91
- .reply(200, pageContent, {
92
- headers: { "Content-Type": "text/html" }
93
- })
94
- .persist();
95
- }
96
- }
97
-
98
-
99
- // Handling requests generated by ReSpec documents
100
- mockAgent
101
- .get("https://api.specref.org")
102
- .intercept({ method: "GET", path: "/bibrefs?refs=webidl" })
103
- .reply(200, { webidl: { href: "https://webidl.spec.whatwg.org/" } }, {
104
- headers: {
105
- "Content-Type": "application/json",
106
- "Access-Control-Allow-Origin": "*"
107
- }
108
- })
109
- .persist();
110
-
111
- mockAgent
112
- .get("https://www.w3.org")
113
- .intercept({ method: "GET", path: "/scripts/TR/2021/fixup.js" })
114
- .reply(200, '')
115
- .persist();
116
-
117
- mockAgent
118
- .get("https://www.w3.org")
119
- .intercept({ method: "GET", path: "/StyleSheets/TR/2021/logos/W3C" })
120
- .reply(200, '')
121
- .persist();
122
-
123
- mockAgent
124
- .get("https://www.w3.org")
125
- .intercept({ method: "GET", path: "/Tools/respec/respec-highlight" })
126
- .reply(200, respecHiglight, {
127
- headers: { "Content-Type": "application/js" }
128
- })
129
- .persist();
130
-
131
- mockAgent
132
- .get("https://www.w3.org")
133
- .intercept({ method: "GET", path: "/Tools/respec/respec-w3c" })
134
- .reply(200, respecW3C, {
135
- headers: { "Content-Type": "application/js" }
136
- })
137
- .persist();
138
-
139
- mockAgent
140
- .get("https://www.w3.org")
141
- .intercept({ method: "GET", path: "/TR/idontexist/" })
142
- .reply(404, '');
143
-
144
- mockAgent
145
- .get("https://www.w3.org")
146
- .intercept({ method: "GET", path: "/TR/ididnotchange/" })
147
- .reply(({ headers }) => {
148
- // NB: Before Node.js v18.17.0, the headers parameters is not an instance
149
- // of Headers as suggested in examples, but rather an array that alternates
150
- // header names and header values. Bug detailed at:
151
- // https://github.com/nodejs/undici/issues/2078
152
- // Bug fix was integrated in Node.js v18.17.0.
153
- // Code below can be simplified when support for Node.js v18 gets dropped.
154
- let value;
155
- if (Array.isArray(headers)) {
156
- const pos = headers.findIndex(h => h === 'If-Modified-Since');
157
- if (pos === -1) {
158
- return { statusCode: 200, data: 'Unexpected If-Modified-Since header' };
159
- }
160
- value = headers[pos+1];
161
- }
162
- else {
163
- value = headers['If-Modified-Since'];
164
- }
165
- if (value === "Fri, 11 Feb 2022 00:00:42 GMT") {
166
- return { statusCode: 304 };
167
- } else {
168
- return { statusCode: 200, data: 'Unexpected If-Modified-Since header' };
169
- }
170
- });
171
-
172
- mockAgent
173
- .get("https://www.w3.org")
174
- .intercept({ method: "GET", path: "/TR/iredirect/" })
175
- .reply(200,
176
- `<!DOCTYPE html><script>window.location = '/TR/recentlyupdated/';</script>`,
177
- {
178
- headers: {
179
- "Content-Type": "text/html",
180
- "Last-Modified": "Fri, 11 Feb 2022 00:00:42 GMT"
181
- }
182
- }
183
- );
184
-
185
- mockAgent
186
- .get("https://www.w3.org")
187
- .intercept({ method: "GET", path: "/TR/recentlyupdated/" })
188
- .reply(200,
189
- `<html><title>Recently updated</title>
190
- <h1>Recently updated</h1>`,
191
- {
192
- headers: {
193
- "Content-Type": "text/html",
194
- "Last-Modified": (new Date()).toString()
195
- }
196
- }
197
- );
198
-
199
- mockAgent
200
- .get("https://drafts.csswg.org")
201
- .intercept({ method: "GET", path: "/server-hiccup/" })
202
- .reply(200,
203
- `<html><title>Server hiccup</title>
204
- <h1> Index of Server Hiccup Module Level 42 </h1>`,
205
- { headers: { "Content-Type": "text/html" } })
206
- .persist();
207
-
208
- /*nock.emitter.on('error', function (err) {
209
- console.error(err);
210
- });
211
- nock.emitter.on('no match', function(req, options, requestBody) {
212
- // 127.0.0.1 is used by the devtool protocol, we ignore it
213
- if (req && req.hostname !== '127.0.0.1') {
214
- console.error("No match for nock request on " + (options ? options.href : req.href));
215
- }
216
- });*/
217
-
218
- export default mockAgent;
1
+ /**
2
+ * Setup a proxy server that intercepts some network requests. To be used in
3
+ * tests not to hit the network.
4
+ *
5
+ * @module mock-server
6
+ */
7
+
8
+ import { MockAgent, setGlobalDispatcher, install } from 'undici';
9
+ import path from 'node:path';
10
+ import { existsSync, readFileSync } from 'node:fs';
11
+ import { fileURLToPath } from 'node:url';
12
+
13
+ const scriptPath = path.dirname(fileURLToPath(import.meta.url));
14
+
15
+ /**
16
+ * Determine the path to the "node_modules" folder. The path depends on whether
17
+ * Reffy is run directly, or installed as a library.
18
+ *
19
+ * @function
20
+ * @return {String} Path to the node_modules folder.
21
+ */
22
+ function getModulesFolder() {
23
+ const rootFolder = path.resolve(scriptPath, '../..');
24
+ let folder = path.resolve(rootFolder, 'node_modules');
25
+ if (existsSync(folder)) {
26
+ return folder;
27
+ }
28
+ folder = path.resolve(rootFolder, '..');
29
+ return folder;
30
+ }
31
+ const modulesFolder = getModulesFolder();
32
+
33
+ const mockSpecs = {
34
+ "/woff/woff2/": {
35
+ html: `
36
+ <title>WOFF2</title>
37
+ <body>
38
+ <dfn id='foo' data-dfn-type="dfn">Foo</dfn>
39
+ <a href="https://www.w3.org/TR/bar/#baz">bar</a>
40
+ <ul class='toc'><li><a href='page.html'>page</a></ul>`,
41
+ pages: {
42
+ "page.html": `<h2 id='bar'>Heading in subpage</h2>`
43
+ }
44
+ },
45
+ "/mediacapture-output/": `
46
+ <script>respecConfig = { shortName: 'test' };</script>
47
+ <script src='https://www.w3.org/Tools/respec/respec-w3c'></script>
48
+ <div id=abstract></div>
49
+ <pre class='idl'>[Exposed=Window] interface Foo { attribute DOMString bar; };</pre>`,
50
+ "/accelerometer/": `<html><meta name='document-revision' content='c0917d216986f88bdd43c72c0b13352c71f283aa'>
51
+ <h2>Normative references</h2>
52
+ <dl>
53
+ <dt>FOO</dt>
54
+ <dd><a href='https://www.w3.org/TR/Foo'>Foo</a></dd>
55
+ </dl>`,
56
+ "/pointerlock/": `<html>
57
+ <h1>Pointer Lock 2.0`,
58
+ "/TR/remote-playback/": {
59
+ html: `<title>Published version</title>
60
+ <body><h1>Published version</h1></body>`,
61
+ domain: 'https://www.w3.org'
62
+ }
63
+ };
64
+
65
+ const respecHiglight = readFileSync(
66
+ path.join(modulesFolder, "respec-hljs", "dist", "respec-highlight.js"),
67
+ 'utf8'
68
+ );
69
+ const respecW3C = readFileSync(
70
+ path.join(modulesFolder, "respec", "builds", "respec-w3c.js"),
71
+ 'utf8'
72
+ );
73
+
74
+ install();
75
+ const mockAgent = new MockAgent();
76
+ setGlobalDispatcher(mockAgent);
77
+ mockAgent.disableNetConnect();
78
+ // for chrome devtool protocol
79
+ mockAgent.enableNetConnect('127.0.0.1');
80
+
81
+ for (const [path, desc] of Object.entries(mockSpecs)) {
82
+ mockAgent.get(desc.domain || "https://w3c.github.io")
83
+ .intercept({ method: "GET", path })
84
+ .reply(200, desc.html || desc, {
85
+ headers: { "Content-Type": "text/html" }
86
+ })
87
+ .persist();
88
+
89
+ for (const [page, pageContent] of Object.entries(desc.pages || {})) {
90
+ mockAgent.get(desc.domain || "https://w3c.github.io")
91
+ .intercept({ method: "GET", path: path + page })
92
+ .reply(200, pageContent, {
93
+ headers: { "Content-Type": "text/html" }
94
+ })
95
+ .persist();
96
+ }
97
+ }
98
+
99
+
100
+ // Handling requests generated by ReSpec documents
101
+ mockAgent
102
+ .get("https://api.specref.org")
103
+ .intercept({ method: "GET", path: "/bibrefs?refs=webidl" })
104
+ .reply(200, { webidl: { href: "https://webidl.spec.whatwg.org/" } }, {
105
+ headers: {
106
+ "Content-Type": "application/json",
107
+ "Access-Control-Allow-Origin": "*"
108
+ }
109
+ })
110
+ .persist();
111
+
112
+ mockAgent
113
+ .get("https://www.w3.org")
114
+ .intercept({ method: "GET", path: "/scripts/TR/2021/fixup.js" })
115
+ .reply(200, '')
116
+ .persist();
117
+
118
+ mockAgent
119
+ .get("https://www.w3.org")
120
+ .intercept({ method: "GET", path: "/StyleSheets/TR/2021/logos/W3C" })
121
+ .reply(200, '')
122
+ .persist();
123
+
124
+ mockAgent
125
+ .get("https://www.w3.org")
126
+ .intercept({ method: "GET", path: "/Tools/respec/respec-highlight.js" })
127
+ .reply(200, respecHiglight, {
128
+ headers: { "Content-Type": "application/js" }
129
+ })
130
+ .persist();
131
+
132
+ mockAgent
133
+ .get("https://www.w3.org")
134
+ .intercept({ method: "GET", path: "/Tools/respec/respec-w3c" })
135
+ .reply(200, respecW3C, {
136
+ headers: { "Content-Type": "application/js" }
137
+ })
138
+ .persist();
139
+
140
+ mockAgent
141
+ .get("https://www.w3.org")
142
+ .intercept({ method: "GET", path: "/TR/idontexist/" })
143
+ .reply(404, '');
144
+
145
+ mockAgent
146
+ .get("https://www.w3.org")
147
+ .intercept({ method: "GET", path: "/TR/ididnotchange/" })
148
+ .reply(({ headers }) => {
149
+ if (headers['If-Modified-Since'] === "Fri, 11 Feb 2022 00:00:42 GMT") {
150
+ return { statusCode: 304 };
151
+ } else {
152
+ return { statusCode: 200, data: 'Unexpected If-Modified-Since header' };
153
+ }
154
+ });
155
+
156
+ mockAgent
157
+ .get("https://www.w3.org")
158
+ .intercept({ method: "GET", path: "/TR/iredirect/" })
159
+ .reply(200,
160
+ `<!DOCTYPE html><script>window.location = '/TR/recentlyupdated/';</script>`,
161
+ {
162
+ headers: {
163
+ "Content-Type": "text/html",
164
+ "Last-Modified": "Fri, 11 Feb 2022 00:00:42 GMT"
165
+ }
166
+ }
167
+ );
168
+
169
+ mockAgent
170
+ .get("https://www.w3.org")
171
+ .intercept({ method: "GET", path: "/TR/recentlyupdated/" })
172
+ .reply(200,
173
+ `<html><title>Recently updated</title>
174
+ <h1>Recently updated</h1>`,
175
+ {
176
+ headers: {
177
+ "Content-Type": "text/html",
178
+ "Last-Modified": (new Date()).toString()
179
+ }
180
+ }
181
+ );
182
+
183
+ mockAgent
184
+ .get("https://drafts.csswg.org")
185
+ .intercept({ method: "GET", path: "/server-hiccup/" })
186
+ .reply(200,
187
+ `<html><title>Server hiccup</title>
188
+ <h1> Index of Server Hiccup Module Level 42 </h1>`,
189
+ { headers: { "Content-Type": "text/html" } })
190
+ .persist();
191
+
192
+ /*nock.emitter.on('error', function (err) {
193
+ console.error(err);
194
+ });
195
+ nock.emitter.on('no match', function(req, options, requestBody) {
196
+ // 127.0.0.1 is used by the devtool protocol, we ignore it
197
+ if (req && req.hostname !== '127.0.0.1') {
198
+ console.error("No match for nock request on " + (options ? options.href : req.href));
199
+ }
200
+ });*/
201
+
202
+ export default mockAgent;
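The main behavioural simplification in this file is the `/TR/ididnotchange/` reply callback, which no longer handles the array form of `headers` that older Node.js versions passed in. The sketch below isolates that logic as a plain function so it can be read and exercised without undici. The names `replyToConditionalGet` and `EXPECTED_DATE` are hypothetical, and the sketch assumes, as the new code does, that `headers` is a plain object keyed by `If-Modified-Since`.

```javascript
// Hypothetical standalone version of the simplified reply callback.
// Assumes `headers` is a plain object, as the new code in the diff does.
const EXPECTED_DATE = 'Fri, 11 Feb 2022 00:00:42 GMT';

function replyToConditionalGet({ headers }) {
  if (headers['If-Modified-Since'] === EXPECTED_DATE) {
    // The cached copy is still valid: answer "Not Modified" with no body.
    return { statusCode: 304 };
  }
  // Any other value (or a missing header) is flagged in the response body.
  return { statusCode: 200, data: 'Unexpected If-Modified-Since header' };
}

// Usage: a matching conditional request, then a request without the header.
const matching = replyToConditionalGet({
  headers: { 'If-Modified-Since': EXPECTED_DATE }
});
const missing = replyToConditionalGet({ headers: {} });
console.log(matching.statusCode); // 304
console.log(missing.statusCode);  // 200
```

Keeping the callback free of the array branch is only safe because the package now requires a Node.js version well past the one where the old `headers` format was fixed.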