whatwg-mimetype 4.0.0 → 5.0.0

package/README.md CHANGED
@@ -1,9 +1,9 @@
1
- # Parse, serialize, and manipulate MIME types
1
+ # Parse, serialize, manipulate, and sniff MIME types
2
2
 
3
3
  This package will parse [MIME types](https://mimesniff.spec.whatwg.org/#understanding-mime-types) into a structured format, which can then be manipulated and serialized:
4
4
 
5
5
  ```js
6
- const MIMEType = require("whatwg-mimetype");
6
+ const { MIMEType } = require("whatwg-mimetype");
7
7
 
8
8
  const mimeType = new MIMEType(`Text/HTML;Charset="utf-8"`);
9
9
 
@@ -22,24 +22,42 @@ console.assert(mimeType.isHTML() === true);
22
22
  console.assert(mimeType.isXML() === false);
23
23
  ```
24
24
 
25
- Parsing is a fairly complex process; see [the specification](https://mimesniff.spec.whatwg.org/#parsing-a-mime-type) for details (and similarly [for serialization](https://mimesniff.spec.whatwg.org/#serializing-a-mime-type)).
25
+ It can also [determine the MIME type of a resource](https://mimesniff.spec.whatwg.org/#determining-the-computed-mime-type-of-a-resource) by sniffing its contents:
26
26
 
27
- This package's algorithms conform to those of the WHATWG [MIME Sniffing Standard](https://mimesniff.spec.whatwg.org/), and is aligned up to commit [8e9a7dd](https://github.com/whatwg/mimesniff/commit/8e9a7dd90717c595a4e4d982cd216e4411d33736).
27
+ ```js
28
+ const { computedMIMEType } = require("whatwg-mimetype");
29
+
30
+ const pngBytes = new Uint8Array([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]);
31
+
32
+ computedMIMEType(pngBytes).essence; // "image/png"
33
+ computedMIMEType(pngBytes, { contentTypeHeader: "image/gif" }).essence; // "image/png"
34
+ computedMIMEType(pngBytes, { contentTypeHeader: "text/html" }).essence; // "text/html"
35
+ ```
36
+
37
+ Parsing, serialization, and sniffing are all fairly complex processes; see the specification for more details:
38
+
39
+ * [parsing](https://mimesniff.spec.whatwg.org/#parsing-a-mime-type)
40
+ * [serialization](https://mimesniff.spec.whatwg.org/#serializing-a-mime-type)
41
+ * [sniffing](https://mimesniff.spec.whatwg.org/#determining-the-computed-mime-type-of-a-resource)
42
+
43
+ This package's algorithms conform to those of the WHATWG [MIME Sniffing Standard](https://mimesniff.spec.whatwg.org/), as of [commit `e7594d5`](https://github.com/whatwg/mimesniff/commit/e7594d531819053508447f688a3bde465531aca5).
44
+
45
+ ## APIs
28
46
 
29
- ## `MIMEType` API
47
+ ### `MIMEType`
30
48
 
31
- This package's main module's default export is a class, `MIMEType`. Its constructor takes a string which it will attempt to parse into a MIME type; if parsing fails, an `Error` will be thrown.
49
+ The `MIMEType` class's constructor takes a string which it will attempt to parse into a MIME type; if parsing fails, an `Error` will be thrown.
32
50
 
33
- ### The `parse()` static factory method
51
+ #### The `parse()` static factory method
34
52
 
35
53
  As an alternative to the constructor, you can use `MIMEType.parse(string)`. The only difference is that `parse()` will return `null` on failed parsing, whereas the constructor will throw. It thus makes the most sense to use the constructor in cases where unparseable MIME types would be exceptional, and use `parse()` when dealing with input from some unconstrained source.
36
54
 
37
- ### Properties
55
+ #### Properties
38
56
 
39
- - `type`: the MIME type's [type](https://mimesniff.spec.whatwg.org/#mime-type-type), e.g. `"text"`
40
- - `subtype`: the MIME type's [subtype](https://mimesniff.spec.whatwg.org/#mime-type-subtype), e.g. `"html"`
41
- - `essence`: the MIME type's [essence](https://mimesniff.spec.whatwg.org/#mime-type-essence), e.g. `"text/html"`
42
- - `parameters`: an instance of `MIMETypeParameters`, containing this MIME type's [parameters](https://mimesniff.spec.whatwg.org/#mime-type-parameters)
57
+ * `type`: the MIME type's [type](https://mimesniff.spec.whatwg.org/#mime-type-type), e.g. `"text"`
58
+ * `subtype`: the MIME type's [subtype](https://mimesniff.spec.whatwg.org/#mime-type-subtype), e.g. `"html"`
59
+ * `essence`: the MIME type's [essence](https://mimesniff.spec.whatwg.org/#mime-type-essence), e.g. `"text/html"`
60
+ * `parameters`: an instance of `MIMETypeParameters`, containing this MIME type's [parameters](https://mimesniff.spec.whatwg.org/#mime-type-parameters)
43
61
 
44
62
  `type` and `subtype` can be changed. They will be validated to be non-empty and only contain [HTTP token code points](https://mimesniff.spec.whatwg.org/#http-token-code-point).
45
63
 
@@ -47,16 +65,16 @@ As an alternative to the constructor, you can use `MIMEType.parse(string)`. The
47
65
 
48
66
  `parameters` is also a getter, but the contents of the `MIMETypeParameters` object are mutable, as described below.
49
67
 
50
- ### Methods
68
+ #### Methods
51
69
 
52
- - `toString()` serializes the MIME type to a string
53
- - `isHTML()`: returns true if this instance represents [a HTML MIME type](https://mimesniff.spec.whatwg.org/#html-mime-type)
54
- - `isXML()`: returns true if this instance represents [an XML MIME type](https://mimesniff.spec.whatwg.org/#xml-mime-type)
55
- - `isJavaScript({ prohibitParameters })`: returns true if this instance represents [a JavaScript MIME type](https://html.spec.whatwg.org/multipage/scripting.html#javascript-mime-type). `prohibitParameters` can be set to true to disallow any parameters, i.e. to test if the MIME type's serialization is a [JavaScript MIME type essence match](https://mimesniff.spec.whatwg.org/#javascript-mime-type-essence-match).
70
+ * `toString()` serializes the MIME type to a string
71
+ * `isHTML()`: returns true if this instance represents [a HTML MIME type](https://mimesniff.spec.whatwg.org/#html-mime-type)
72
+ * `isXML()`: returns true if this instance represents [an XML MIME type](https://mimesniff.spec.whatwg.org/#xml-mime-type)
73
+ * `isJavaScript({ prohibitParameters })`: returns true if this instance represents [a JavaScript MIME type](https://html.spec.whatwg.org/multipage/scripting.html#javascript-mime-type). `prohibitParameters` can be set to true to disallow any parameters, i.e. to test if the MIME type's serialization is a [JavaScript MIME type essence match](https://mimesniff.spec.whatwg.org/#javascript-mime-type-essence-match).
56
74
 
57
75
  _Note: the `isHTML()`, `isXML()`, and `isJavaScript()` methods are speculative, and may be removed or changed in future major versions. See [whatwg/mimesniff#48](https://github.com/whatwg/mimesniff/issues/48) for brainstorming in this area. Currently we implement these mainly because they are useful in jsdom._
58
76
 
59
- ## `MIMETypeParameters` API
77
+ ### `MIMETypeParameters`
60
78
 
61
79
  The `MIMETypeParameters` class, instances of which are returned by `mimeType.parameters`, has equivalent surface API to a [JavaScript `Map`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map).
62
80
 
@@ -65,6 +83,8 @@ However, `MIMETypeParameters` methods will always interpret their arguments as a
65
83
  Some examples:
66
84
 
67
85
  ```js
86
+ const { MIMEType } = require("whatwg-mimetype");
87
+
68
88
  const mimeType = new MIMEType(`x/x;a=b;c=D;E="F"`);
69
89
 
70
90
  // Logs:
@@ -87,15 +107,26 @@ console.assert(mimeType.toString() === "x/x;a=b;c=d;e=F;q=X");
87
107
  mimeType.parameters.set("@", "x");
88
108
  ```
89
109
 
90
- ## Raw parsing/serialization APIs
110
+ ### `computedMIMEType()`
91
111
 
92
- If you want primitives on which to build your own API, you can get direct access to the parsing and serialization algorithms as follows:
112
+ The `computedMIMEType(resource, options)` function determines a resource's MIME type using the full [MIME type sniffing algorithm](https://mimesniff.spec.whatwg.org/#determining-the-computed-mime-type-of-a-resource). This includes:
93
113
 
94
- ```js
95
- const parse = require("whatwg-mimetype/parser");
96
- const serialize = require("whatwg-mimetype/serialize");
97
- ```
114
+ * complexities around how the supplied `Content-Type` header or other external information interacts with (but usually does not override) the sniffing process;
115
+ * the ability to set a "no-sniff" flag, like the one used when `X-Content-Type-Options: nosniff` is present, which mostly (but not entirely) prevents sniffing; and
116
+ * the "Apache bug" check which occurs when the `Content-Type` header is one of several specific values.
117
+
118
+ That is, this doesn't just implement the [rules for identifying an unknown MIME type](https://mimesniff.spec.whatwg.org/#rules-for-identifying-an-unknown-mime-type). It gives you everything you need for a full browser-compatible MIME sniffing procedure.
119
+
120
+ #### Arguments
121
+
122
+ * **`resource`** (`Uint8Array`): The resource bytes.
123
+ * **`options.contentTypeHeader`** (`string`): The Content-Type header value, for HTTP resources.
124
+ * **`options.providedType`** (`string`): The MIME type from the filesystem or another protocol (like FTP), for non-HTTP resources.
125
+ * **`options.noSniff`** (`boolean`, default `false`): Whether the `X-Content-Type-Options: nosniff` header was present.
126
+ * **`options.isSupported`** (`function`, default `() => true`): A predicate called with a `MIMEType` instance to check if an image/audio/video MIME type is supported by the user agent. If it returns `false`, sniffing is skipped for that type.
127
+
128
+ The `contentTypeHeader` and `providedType` options will be stringified, so you could also supply a `MIMEType` (or a [Node.js `util` module `MIMEType`](https://nodejs.org/api/util.html#class-utilmimetype)).
98
129
 
99
- `parse(string)` returns an object containing the `type` and `subtype` strings, plus `parameters`, which is a `Map`. This is roughly our equivalent of the spec's [MIME type record](https://mimesniff.spec.whatwg.org/#mime-type). If parsing fails, it instead returns `null`.
130
+ #### Return value
100
131
 
101
- `serialize(record)` operates on such an object, giving back a string according to the serialization algorithm.
132
+ A `MIMEType` instance representing the computed MIME type.
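
The PNG example in the README hunk above comes down to comparing the leading bytes of a resource against a known signature under a bit mask. The following is a minimal sketch of that comparison, not the package's implementation: `matchesPattern` and `toBytes` are hypothetical helpers written for illustration, and the sketch leaves out the skipping of ignored leading bytes that `lib/sniff.js` performs.

```js
// Hypothetical helper (not an export of whatwg-mimetype): masked byte comparison.
function matchesPattern(bytes, pattern, mask) {
  if (bytes.length < pattern.length) {
    return false;
  }
  for (let i = 0; i < pattern.length; i++) {
    // Only the bits selected by the mask take part in the comparison.
    if ((bytes[i] & mask[i]) !== (pattern[i] & mask[i])) {
      return false;
    }
  }
  return true;
}

// Hypothetical helper: turn an ASCII string into bytes for the examples below.
const toBytes = (s) => Uint8Array.from(Array.from(s, (c) => c.charCodeAt(0)));

// PNG signature: every bit of every byte must match.
const pngPattern = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];
const pngMask = pngPattern.map(() => 0xFF);

// "<HTML": the 0xDF mask clears bit 0x20, so ASCII letters match in either case.
const htmlPattern = [0x3C, 0x48, 0x54, 0x4D, 0x4C];
const htmlMask = [0xFF, 0xDF, 0xDF, 0xDF, 0xDF];

console.log(matchesPattern(new Uint8Array(pngPattern), pngPattern, pngMask)); // true
console.log(matchesPattern(toBytes("<html>"), htmlPattern, htmlMask)); // true
console.log(matchesPattern(toBytes("<HTML>"), htmlPattern, htmlMask)); // true
console.log(matchesPattern(toBytes("<htm"), htmlPattern, htmlMask)); // false (too short)
```

The mask is what lets one table entry cover both `<html` and `<HTML` without listing each casing separately.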
package/lib/index.js ADDED
@@ -0,0 +1,4 @@
1
+ "use strict";
2
+
3
+ exports.MIMEType = require("./mime-type.js");
4
+ exports.computedMIMEType = require("./sniff.js");
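
Since the entry point now exposes named exports, code that has to run against both 4.x and 5.x needs to handle either module shape. Below is a minimal sketch, assuming the 4.x module value is the class itself (as the removed README line `const MIMEType = require("whatwg-mimetype")` suggests). `resolveMIMEType` is a hypothetical helper, and the two module objects are stand-ins rather than the real package.

```js
// Hypothetical helper: accept either module shape and return the MIMEType class.
function resolveMIMEType(mod) {
  // 4.x style: the module value is the class itself.
  if (typeof mod === "function") {
    return mod;
  }
  // 5.x style: the class is a named export.
  if (mod && typeof mod.MIMEType === "function") {
    return mod.MIMEType;
  }
  throw new TypeError("Unrecognized whatwg-mimetype module shape");
}

// Stand-ins for the two module shapes (illustration only).
class FakeMIMEType {}
const v4Module = FakeMIMEType;
const v5Module = { MIMEType: FakeMIMEType, computedMIMEType() {} };

console.log(resolveMIMEType(v4Module) === FakeMIMEType); // true
console.log(resolveMIMEType(v5Module) === FakeMIMEType); // true
```

In real code the argument would be the result of `require("whatwg-mimetype")`.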
package/lib/mime-type.js CHANGED
@@ -23,7 +23,7 @@ module.exports = class MIMEType {
23
23
  static parse(string) {
24
24
  try {
25
25
  return new this(string);
26
- } catch (e) {
26
+ } catch {
27
27
  return null;
28
28
  }
29
29
  }
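
The hunk above only drops the unused `e` binding; the contract that the constructor throws while `parse()` returns `null` is unchanged. Here is a minimal sketch of that contract, using a hypothetical `TinyType` class with a deliberately simplistic validity check in place of the real parser:

```js
// Hypothetical stand-in for MIMEType; the validity check is intentionally naive.
class TinyType {
  constructor(string) {
    const parts = String(string).split("/");
    if (parts.length !== 2 || parts[0] === "" || parts[1] === "") {
      throw new Error(`Could not parse "${string}"`);
    }
    this.type = parts[0];
    this.subtype = parts[1];
  }

  // Same shape as the static factory in the hunk above: swallow the error, return null.
  static parse(string) {
    try {
      return new this(string);
    } catch {
      return null;
    }
  }
}

console.log(TinyType.parse("text/html").subtype); // "html"
console.log(TinyType.parse("nonsense")); // null
```

The binding-less `catch` is standard syntax (ES2019), so behavior is the same on any runtime that can parse it.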
package/lib/sniff.js ADDED
@@ -0,0 +1,751 @@
1
+ "use strict";
2
+ const MIMEType = require("./mime-type.js");
3
+
4
+ // Normalize a MIME type input (string or MIMEType-like object) to a MIMEType.
5
+ // Returns null if parsing fails (including for undefined input).
6
+ function normalizeMIMEType(input) {
7
+ return MIMEType.parse(`${input}`);
8
+ }
9
+
10
+ // https://mimesniff.spec.whatwg.org/#xml-mime-type
11
+ function isXMLMIMEType(mimeType) {
12
+ return mimeType.subtype.endsWith("+xml") ||
13
+ (mimeType.type === "text" && mimeType.subtype === "xml") ||
14
+ (mimeType.type === "application" && mimeType.subtype === "xml");
15
+ }
16
+
17
+ // https://mimesniff.spec.whatwg.org/#html-mime-type
18
+ function isHTMLMIMEType(mimeType) {
19
+ return mimeType.type === "text" && mimeType.subtype === "html";
20
+ }
21
+
22
+ // https://mimesniff.spec.whatwg.org/#resource-header
23
+ const RESOURCE_HEADER_LENGTH = 1445;
24
+
25
+ function getResourceHeader(resource) {
26
+ if (resource.length <= RESOURCE_HEADER_LENGTH) {
27
+ return resource;
28
+ }
29
+ return resource.subarray(0, RESOURCE_HEADER_LENGTH);
30
+ }
31
+
32
+ // https://mimesniff.spec.whatwg.org/#image-mime-type
33
+ function isImageMIMEType(mimeType) {
34
+ return mimeType.type === "image";
35
+ }
36
+
37
+ // https://mimesniff.spec.whatwg.org/#audio-or-video-mime-type
38
+ function isAudioOrVideoMIMEType(mimeType) {
39
+ return mimeType.type === "audio" ||
40
+ mimeType.type === "video" ||
41
+ (mimeType.type === "application" && mimeType.subtype === "ogg");
42
+ }
43
+
44
+ // https://mimesniff.spec.whatwg.org/#whitespace-byte
45
+ function isWhitespaceByte(byte) {
46
+ return byte === 0x09 || byte === 0x0A || byte === 0x0C || byte === 0x0D || byte === 0x20;
47
+ }
48
+
49
+ // https://mimesniff.spec.whatwg.org/#binary-data-byte
50
+ function isBinaryDataByte(byte) {
51
+ return (byte >= 0x00 && byte <= 0x08) ||
52
+ byte === 0x0B ||
53
+ (byte >= 0x0E && byte <= 0x1A) ||
54
+ (byte >= 0x1C && byte <= 0x1F);
55
+ }
56
+
57
+ // https://mimesniff.spec.whatwg.org/#pattern-matching-algorithm
58
+ function matchesSignature(resource, signature) {
59
+ const { pattern, mask, ignoredLeadingBytes, mimeType } = signature;
60
+
61
+ let s = 0;
62
+ if (ignoredLeadingBytes) {
63
+ while (s < resource.length && ignoredLeadingBytes(resource[s])) {
64
+ s++;
65
+ }
66
+ }
67
+
68
+ if (resource.length < s + pattern.length) {
69
+ return null;
70
+ }
71
+
72
+ for (let i = 0; i < pattern.length; i++) {
73
+ if ((resource[s + i] & mask[i]) !== (pattern[i] & mask[i])) {
74
+ return null;
75
+ }
76
+ }
77
+
78
+ return mimeType;
79
+ }
80
+
81
+ // https://mimesniff.spec.whatwg.org/#rules-for-identifying-an-unknown-mime-type
82
+ const step1Table = [
83
+ // <!DOCTYPE HTML TT
84
+ {
85
+ pattern: [0x3C, 0x21, 0x44, 0x4F, 0x43, 0x54, 0x59, 0x50, 0x45, 0x20, 0x48, 0x54, 0x4D, 0x4C, 0x20],
86
+ mask: [0xFF, 0xFF, 0xDF, 0xDF, 0xDF, 0xDF, 0xDF, 0xDF, 0xDF, 0xFF, 0xDF, 0xDF, 0xDF, 0xDF, 0xE1],
87
+ ignoredLeadingBytes: isWhitespaceByte,
88
+ mimeType: "text/html"
89
+ },
90
+ // <HTML TT
91
+ {
92
+ pattern: [0x3C, 0x48, 0x54, 0x4D, 0x4C, 0x20],
93
+ mask: [0xFF, 0xDF, 0xDF, 0xDF, 0xDF, 0xE1],
94
+ ignoredLeadingBytes: isWhitespaceByte,
95
+ mimeType: "text/html"
96
+ },
97
+ // <HEAD TT
98
+ {
99
+ pattern: [0x3C, 0x48, 0x45, 0x41, 0x44, 0x20],
100
+ mask: [0xFF, 0xDF, 0xDF, 0xDF, 0xDF, 0xE1],
101
+ ignoredLeadingBytes: isWhitespaceByte,
102
+ mimeType: "text/html"
103
+ },
104
+ // <SCRIPT TT
105
+ {
106
+ pattern: [0x3C, 0x53, 0x43, 0x52, 0x49, 0x50, 0x54, 0x20],
107
+ mask: [0xFF, 0xDF, 0xDF, 0xDF, 0xDF, 0xDF, 0xDF, 0xE1],
108
+ ignoredLeadingBytes: isWhitespaceByte,
109
+ mimeType: "text/html"
110
+ },
111
+ // <IFRAME TT
112
+ {
113
+ pattern: [0x3C, 0x49, 0x46, 0x52, 0x41, 0x4D, 0x45, 0x20],
114
+ mask: [0xFF, 0xDF, 0xDF, 0xDF, 0xDF, 0xDF, 0xDF, 0xE1],
115
+ ignoredLeadingBytes: isWhitespaceByte,
116
+ mimeType: "text/html"
117
+ },
118
+ // <H1 TT
119
+ {
120
+ pattern: [0x3C, 0x48, 0x31, 0x20],
121
+ mask: [0xFF, 0xDF, 0xFF, 0xE1],
122
+ ignoredLeadingBytes: isWhitespaceByte,
123
+ mimeType: "text/html"
124
+ },
125
+ // <DIV TT
126
+ {
127
+ pattern: [0x3C, 0x44, 0x49, 0x56, 0x20],
128
+ mask: [0xFF, 0xDF, 0xDF, 0xDF, 0xE1],
129
+ ignoredLeadingBytes: isWhitespaceByte,
130
+ mimeType: "text/html"
131
+ },
132
+ // <FONT TT
133
+ {
134
+ pattern: [0x3C, 0x46, 0x4F, 0x4E, 0x54, 0x20],
135
+ mask: [0xFF, 0xDF, 0xDF, 0xDF, 0xDF, 0xE1],
136
+ ignoredLeadingBytes: isWhitespaceByte,
137
+ mimeType: "text/html"
138
+ },
139
+ // <TABLE TT
140
+ {
141
+ pattern: [0x3C, 0x54, 0x41, 0x42, 0x4C, 0x45, 0x20],
142
+ mask: [0xFF, 0xDF, 0xDF, 0xDF, 0xDF, 0xDF, 0xE1],
143
+ ignoredLeadingBytes: isWhitespaceByte,
144
+ mimeType: "text/html"
145
+ },
146
+ // <A TT
147
+ {
148
+ pattern: [0x3C, 0x41, 0x20],
149
+ mask: [0xFF, 0xDF, 0xE1],
150
+ ignoredLeadingBytes: isWhitespaceByte,
151
+ mimeType: "text/html"
152
+ },
153
+ // <STYLE TT
154
+ {
155
+ pattern: [0x3C, 0x53, 0x54, 0x59, 0x4C, 0x45, 0x20],
156
+ mask: [0xFF, 0xDF, 0xDF, 0xDF, 0xDF, 0xDF, 0xE1],
157
+ ignoredLeadingBytes: isWhitespaceByte,
158
+ mimeType: "text/html"
159
+ },
160
+ // <TITLE TT
161
+ {
162
+ pattern: [0x3C, 0x54, 0x49, 0x54, 0x4C, 0x45, 0x20],
163
+ mask: [0xFF, 0xDF, 0xDF, 0xDF, 0xDF, 0xDF, 0xE1],
164
+ ignoredLeadingBytes: isWhitespaceByte,
165
+ mimeType: "text/html"
166
+ },
167
+ // <B TT
168
+ {
169
+ pattern: [0x3C, 0x42, 0x20],
170
+ mask: [0xFF, 0xDF, 0xE1],
171
+ ignoredLeadingBytes: isWhitespaceByte,
172
+ mimeType: "text/html"
173
+ },
174
+ // <BODY TT
175
+ {
176
+ pattern: [0x3C, 0x42, 0x4F, 0x44, 0x59, 0x20],
177
+ mask: [0xFF, 0xDF, 0xDF, 0xDF, 0xDF, 0xE1],
178
+ ignoredLeadingBytes: isWhitespaceByte,
179
+ mimeType: "text/html"
180
+ },
181
+ // <BR TT
182
+ {
183
+ pattern: [0x3C, 0x42, 0x52, 0x20],
184
+ mask: [0xFF, 0xDF, 0xDF, 0xE1],
185
+ ignoredLeadingBytes: isWhitespaceByte,
186
+ mimeType: "text/html"
187
+ },
188
+ // <P TT
189
+ {
190
+ pattern: [0x3C, 0x50, 0x20],
191
+ mask: [0xFF, 0xDF, 0xE1],
192
+ ignoredLeadingBytes: isWhitespaceByte,
193
+ mimeType: "text/html"
194
+ },
195
+ // <!-- TT
196
+ {
197
+ pattern: [0x3C, 0x21, 0x2D, 0x2D, 0x20],
198
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0xE1],
199
+ ignoredLeadingBytes: isWhitespaceByte,
200
+ mimeType: "text/html"
201
+ },
202
+ // <?xml
203
+ {
204
+ pattern: [0x3C, 0x3F, 0x78, 0x6D, 0x6C],
205
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0xFF],
206
+ ignoredLeadingBytes: isWhitespaceByte,
207
+ mimeType: "text/xml"
208
+ },
209
+ // %PDF-
210
+ {
211
+ pattern: [0x25, 0x50, 0x44, 0x46, 0x2D],
212
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0xFF],
213
+ mimeType: "application/pdf"
214
+ }
215
+ ];
216
+
217
+ // https://mimesniff.spec.whatwg.org/#rules-for-identifying-an-unknown-mime-type
218
+ const step2Table = [
219
+ // %!PS-Adobe-
220
+ {
221
+ pattern: [0x25, 0x21, 0x50, 0x53, 0x2D, 0x41, 0x64, 0x6F, 0x62, 0x65, 0x2D],
222
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF],
223
+ mimeType: "application/postscript"
224
+ },
225
+ // UTF-16BE BOM
226
+ {
227
+ pattern: [0xFE, 0xFF, 0x00, 0x00],
228
+ mask: [0xFF, 0xFF, 0x00, 0x00],
229
+ mimeType: "text/plain"
230
+ },
231
+ // UTF-16LE BOM
232
+ {
233
+ pattern: [0xFF, 0xFE, 0x00, 0x00],
234
+ mask: [0xFF, 0xFF, 0x00, 0x00],
235
+ mimeType: "text/plain"
236
+ },
237
+ // UTF-8 BOM
238
+ {
239
+ pattern: [0xEF, 0xBB, 0xBF, 0x00],
240
+ mask: [0xFF, 0xFF, 0xFF, 0x00],
241
+ mimeType: "text/plain"
242
+ }
243
+ ];
244
+
245
+ // https://mimesniff.spec.whatwg.org/#matching-an-image-type-pattern
246
+ const imageSignatures = [
247
+ { pattern: [0x00, 0x00, 0x01, 0x00], mask: [0xFF, 0xFF, 0xFF, 0xFF], mimeType: "image/x-icon" },
248
+ { pattern: [0x00, 0x00, 0x02, 0x00], mask: [0xFF, 0xFF, 0xFF, 0xFF], mimeType: "image/x-icon" },
249
+ { pattern: [0x42, 0x4D], mask: [0xFF, 0xFF], mimeType: "image/bmp" },
250
+ {
251
+ pattern: [0x47, 0x49, 0x46, 0x38, 0x37, 0x61],
252
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF],
253
+ mimeType: "image/gif"
254
+ },
255
+ {
256
+ pattern: [0x47, 0x49, 0x46, 0x38, 0x39, 0x61],
257
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF],
258
+ mimeType: "image/gif"
259
+ },
260
+ {
261
+ pattern: [0x52, 0x49, 0x46, 0x46, 0x00, 0x00, 0x00, 0x00, 0x57, 0x45, 0x42, 0x50, 0x56, 0x50],
262
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF],
263
+ mimeType: "image/webp"
264
+ },
265
+ {
266
+ pattern: [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A],
267
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF],
268
+ mimeType: "image/png"
269
+ },
270
+ { pattern: [0xFF, 0xD8, 0xFF], mask: [0xFF, 0xFF, 0xFF], mimeType: "image/jpeg" }
271
+ ];
272
+
273
+ // https://mimesniff.spec.whatwg.org/#matching-an-audio-or-video-type-pattern
274
+ const audioVideoSignatures = [
275
+ {
276
+ pattern: [0x46, 0x4F, 0x52, 0x4D, 0x00, 0x00, 0x00, 0x00, 0x41, 0x49, 0x46, 0x46],
277
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0xFF],
278
+ mimeType: "audio/aiff"
279
+ },
280
+ { pattern: [0x49, 0x44, 0x33], mask: [0xFF, 0xFF, 0xFF], mimeType: "audio/mpeg" },
281
+ {
282
+ pattern: [0x4F, 0x67, 0x67, 0x53, 0x00],
283
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0xFF],
284
+ mimeType: "application/ogg"
285
+ },
286
+ {
287
+ pattern: [0x4D, 0x54, 0x68, 0x64, 0x00, 0x00, 0x00, 0x06],
288
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF],
289
+ mimeType: "audio/midi"
290
+ },
291
+ {
292
+ pattern: [0x52, 0x49, 0x46, 0x46, 0x00, 0x00, 0x00, 0x00, 0x41, 0x56, 0x49, 0x20],
293
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0xFF],
294
+ mimeType: "video/avi"
295
+ },
296
+ {
297
+ pattern: [0x52, 0x49, 0x46, 0x46, 0x00, 0x00, 0x00, 0x00, 0x57, 0x41, 0x56, 0x45],
298
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0xFF],
299
+ mimeType: "audio/wave"
300
+ }
301
+ ];
302
+
303
+ // https://mimesniff.spec.whatwg.org/#signature-for-mp4
304
+ function matchMP4(resource) {
305
+ if (resource.length < 12) {
306
+ return null;
307
+ }
308
+ // Bytes 4-7 must be "ftyp"
309
+ if (resource[4] !== 0x66 || resource[5] !== 0x74 ||
310
+ resource[6] !== 0x79 || resource[7] !== 0x70) {
311
+ return null;
312
+ }
313
+ const length = (resource[0] << 24) | (resource[1] << 16) | (resource[2] << 8) | resource[3];
314
+ if (length < 12 || length > resource.length) {
315
+ return null;
316
+ }
317
+ const brand = String.fromCharCode(resource[8], resource[9], resource[10], resource[11]);
318
+ const mp4Brands = ["mp41", "mp42", "isom", "iso2", "mmp4", "M4V ", "M4A ", "M4P ", "avc1"];
319
+ if (mp4Brands.includes(brand)) {
320
+ return "video/mp4";
321
+ }
322
+ for (let i = 16; i + 4 <= length && i + 4 <= resource.length; i += 4) {
323
+ const compat = String.fromCharCode(resource[i], resource[i + 1], resource[i + 2], resource[i + 3]);
324
+ if (mp4Brands.includes(compat)) {
325
+ return "video/mp4";
326
+ }
327
+ }
328
+ return null;
329
+ }
330
+
331
+ // https://mimesniff.spec.whatwg.org/#parse-a-vint
332
+ //
333
+ // Note: The spec has a bug: it takes an "iter" parameter but never uses it, always starting from index 0.
334
+ // See https://github.com/whatwg/mimesniff/issues/167. We implement the intended behavior.
335
+ //
336
+ // The spec algorithm also does extra processing which is not actually needed by its single caller:
337
+ // https://github.com/whatwg/mimesniff/issues/146. We omit that processing.
338
+ function parseVint(sequence, iter) {
339
+ let mask = 128;
340
+ const maxVintLength = 8;
341
+ let numberSize = 1;
342
+ while (numberSize < maxVintLength && numberSize < sequence.length) {
343
+ if ((sequence[iter] & mask) !== 0) {
344
+ break;
345
+ }
346
+ mask >>= 1;
347
+ ++numberSize;
348
+ }
349
+ return numberSize;
350
+ }
351
+
352
+ // https://mimesniff.spec.whatwg.org/#matching-a-padded-sequence
353
+ function matchPaddedSequence(sequence, offset, pattern) {
354
+ // Skip leading 0x00 bytes
355
+ while (offset < sequence.length && sequence[offset] === 0x00) {
356
+ offset++;
357
+ }
358
+ // Check if pattern matches at current offset
359
+ if (sequence.length < offset + pattern.length) {
360
+ return false;
361
+ }
362
+ for (let i = 0; i < pattern.length; i++) {
363
+ if (sequence[offset + i] !== pattern[i]) {
364
+ return false;
365
+ }
366
+ }
367
+ return true;
368
+ }
369
+
370
+ // https://mimesniff.spec.whatwg.org/#signature-for-webm
371
+ function matchWebM(resource) {
372
+ const { length } = resource;
373
+ // Step 3: If length < 4, return false
374
+ if (length < 4) {
375
+ return null;
376
+ }
377
+ // Step 4: Check EBML header 0x1A 0x45 0xDF 0xA3
378
+ if (resource[0] !== 0x1A || resource[1] !== 0x45 ||
379
+ resource[2] !== 0xDF || resource[3] !== 0xA3) {
380
+ return null;
381
+ }
382
+ // Step 5-6: Search for DocType element (0x42 0x82) in bytes 4-37
383
+ let iter = 4;
384
+ while (iter < length && iter < 38) {
385
+ if (iter + 1 < length && resource[iter] === 0x42 && resource[iter + 1] === 0x82) {
386
+ iter += 2;
387
+ if (iter >= length) {
388
+ break;
389
+ }
390
+ const numberSize = parseVint(resource, iter);
391
+ iter += numberSize;
392
+ if (iter >= length - 4) {
393
+ break;
394
+ }
395
+ // Match padded sequence "webm" (0x77 0x65 0x62 0x6D)
396
+ if (matchPaddedSequence(resource, iter, [0x77, 0x65, 0x62, 0x6D])) {
397
+ return "video/webm";
398
+ }
399
+ }
400
+ iter++;
401
+ }
402
+ return null;
403
+ }
404
+
405
+ // https://mimesniff.spec.whatwg.org/#signature-for-mp3-without-id3
406
+
407
+ // Bitrate tables (kbps) indexed by bitrate-index (0-15)
408
+ // https://mimesniff.spec.whatwg.org/#mp3-rates-table
409
+ const mp3Rates = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0];
410
+ const mp25Rates = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, 0];
411
+
412
+ // Sample rate table indexed by samplerate-index (0-3)
413
+ // https://mimesniff.spec.whatwg.org/#mp3-sample-rate-table
414
+ const sampleRates = [44100, 48000, 32000, 0];
415
+
416
+ // https://mimesniff.spec.whatwg.org/#match-an-mp3-header
417
+ function matchMP3Header(sequence, s) {
418
+ const { length } = sequence;
419
+ // Step 1: If length < s + 4, return false
420
+ if (length < s + 4) {
421
+ return false;
422
+ }
423
+ // Step 2: If sequence[s] ≠ 0xFF or sequence[s+1] & 0xE0 ≠ 0xE0, return false
424
+ if (sequence[s] !== 0xFF || (sequence[s + 1] & 0xE0) !== 0xE0) {
425
+ return false;
426
+ }
427
+ // Step 3: Extract layer
428
+ const layer = (sequence[s + 1] & 0x06) >> 1;
429
+ // Step 4: If layer is 0, return false
430
+ if (layer === 0) {
431
+ return false;
432
+ }
433
+ // Step 5: Extract bit-rate index, return false if 15
434
+ const bitRate = (sequence[s + 2] & 0xF0) >> 4;
435
+ if (bitRate === 15) {
436
+ return false;
437
+ }
438
+ // Step 6: Extract sample-rate index, return false if 3
439
+ const sampleRate = (sequence[s + 2] & 0x0C) >> 2;
440
+ if (sampleRate === 3) {
441
+ return false;
442
+ }
443
+ // Step 9: Check final-layer (layer must be 3 for MP3)
444
+ const finalLayer = (4 - layer) & 0x03;
445
+ if (finalLayer !== 3) {
446
+ return false;
447
+ }
448
+ // Step 10: Return true
449
+ return true;
450
+ }
451
+
452
+ // https://mimesniff.spec.whatwg.org/#parse-an-mp3-frame
453
+ function parseMP3Frame(sequence, s) {
454
+ // Step 1: Extract version
455
+ const version = (sequence[s + 1] & 0x18) >> 3;
456
+ // Step 2: Extract bitrate-index
457
+ const bitrateIndex = (sequence[s + 2] & 0xF0) >> 4;
458
+ // Step 3+4: Get bitrate from appropriate table
459
+ const bitrate = (version & 0x01) !== 0 ? mp3Rates[bitrateIndex] : mp25Rates[bitrateIndex];
460
+ // Step 5: Extract samplerate-index
461
+ const samplerateIndex = (sequence[s + 2] & 0x0C) >> 2;
462
+ // Step 6: Get samplerate
463
+ const samplerate = sampleRates[samplerateIndex];
464
+ // Step 7: Extract pad
465
+ const pad = (sequence[s + 2] & 0x02) >> 1;
466
+
467
+ return { version, bitrate, samplerate, pad };
468
+ }
469
+
470
+ // https://mimesniff.spec.whatwg.org/#compute-an-mp3-frame-size
471
+ function computeMP3FrameSize(version, bitrate, samplerate, pad) {
472
+ // Step 1: Determine scale based on version
473
+ const scale = version === 1 ? 72 : 144;
474
+ // Step 2: Compute size
475
+ let size = Math.floor((bitrate * 1000 * scale) / samplerate);
476
+ // Step 3: Add padding if present
477
+ if (pad !== 0) {
478
+ size += 1;
479
+ }
480
+ // Step 4: Return size
481
+ return size;
482
+ }
483
+
484
+ // https://mimesniff.spec.whatwg.org/#signature-for-mp3-without-id3
485
+ function matchMP3WithoutID3(resource) {
486
+ const { length } = resource;
487
+ // Step 2: Let s be 0
488
+ let s = 0;
489
+ // Step 3: If match mp3 header returns false, return false
490
+ if (!matchMP3Header(resource, s)) {
491
+ return null;
492
+ }
493
+ // Step 4: Parse an mp3 frame
494
+ const { version, bitrate, samplerate, pad } = parseMP3Frame(resource, s);
495
+ // Step 5: Compute frame size
496
+ const skippedBytes = computeMP3FrameSize(version, bitrate, samplerate, pad);
497
+ // Step 6: If skipped-bytes < 4 or skipped-bytes > length - s, return false
498
+ if (skippedBytes < 4 || skippedBytes > length - s) {
499
+ return null;
500
+ }
501
+ // Step 7: Increment s by skipped-bytes
502
+ s += skippedBytes;
503
+ // Step 8: If match mp3 header returns false, return false; otherwise return true
504
+ if (!matchMP3Header(resource, s)) {
505
+ return null;
506
+ }
507
+ return "audio/mpeg";
508
+ }
509
+
510
+ // https://mimesniff.spec.whatwg.org/#matching-an-archive-type-pattern
511
+ const archiveSignatures = [
512
+ { pattern: [0x1F, 0x8B, 0x08], mask: [0xFF, 0xFF, 0xFF], mimeType: "application/x-gzip" },
513
+ { pattern: [0x50, 0x4B, 0x03, 0x04], mask: [0xFF, 0xFF, 0xFF, 0xFF], mimeType: "application/zip" },
514
+ {
515
+ pattern: [0x52, 0x61, 0x72, 0x21, 0x1A, 0x07, 0x00],
516
+ mask: [0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF],
517
+ mimeType: "application/x-rar-compressed"
518
+ }
519
+ ];
520
+
521
+ // https://mimesniff.spec.whatwg.org/#matching-an-image-type-pattern
522
+ function matchImageType(resource) {
523
+ for (const sig of imageSignatures) {
524
+ const result = matchesSignature(resource, sig);
525
+ if (result) {
526
+ return result;
527
+ }
528
+ }
529
+ return null;
530
+ }
531
+
532
+ // https://mimesniff.spec.whatwg.org/#matching-an-audio-or-video-type-pattern
533
+ function matchAudioOrVideoType(resource) {
534
+ for (const sig of audioVideoSignatures) {
535
+ const result = matchesSignature(resource, sig);
536
+ if (result) {
537
+ return result;
538
+ }
539
+ }
540
+ const mp4Result = matchMP4(resource);
541
+ if (mp4Result) {
542
+ return mp4Result;
543
+ }
544
+ const webmResult = matchWebM(resource);
545
+ if (webmResult) {
546
+ return webmResult;
547
+ }
548
+ const mp3Result = matchMP3WithoutID3(resource);
549
+ if (mp3Result) {
550
+ return mp3Result;
551
+ }
552
+ return null;
553
+ }
554
+
555
+ // https://mimesniff.spec.whatwg.org/#rules-for-text-or-binary
556
+ function distinguishTextOrBinary(resourceHeader) {
557
+ // Step 1: Let length be the number of bytes in the resource header.
558
+ const { length } = resourceHeader;
559
+
560
+ // Step 2: If the first 2 bytes match a UTF-16 BOM, return "text/plain".
561
+ if (length >= 2) {
562
+ // UTF-16 BE BOM
563
+ if (resourceHeader[0] === 0xFE && resourceHeader[1] === 0xFF) {
564
+ return "text/plain";
565
+ }
566
+ // UTF-16 LE BOM
567
+ if (resourceHeader[0] === 0xFF && resourceHeader[1] === 0xFE) {
568
+ return "text/plain";
569
+ }
570
+ }
571
+
572
+ // Step 3: If the first 3 bytes match the UTF-8 BOM, return "text/plain".
573
+ if (length >= 3) {
574
+ if (resourceHeader[0] === 0xEF && resourceHeader[1] === 0xBB && resourceHeader[2] === 0xBF) {
575
+ return "text/plain";
576
+ }
577
+ }
578
+
579
+ // Step 4: If the resource header contains no binary data bytes, return "text/plain".
580
+ for (let i = 0; i < length; i++) {
581
+ if (isBinaryDataByte(resourceHeader[i])) {
582
+ // Step 5: Return "application/octet-stream".
583
+ return "application/octet-stream";
584
+ }
585
+ }
586
+
587
+ return "text/plain";
588
+ }
589
+
+ // https://mimesniff.spec.whatwg.org/#rules-for-identifying-an-unknown-mime-type
+ function identifyAnUnknownMIMEType(resourceHeader, { sniffScriptable = false } = {}) {
+   // Step 1
+   if (sniffScriptable) {
+     for (const sig of step1Table) {
+       const result = matchesSignature(resourceHeader, sig);
+       if (result) {
+         return result;
+       }
+     }
+   }
+
+   // Step 2
+   for (const sig of step2Table) {
+     const result = matchesSignature(resourceHeader, sig);
+     if (result) {
+       return result;
+     }
+   }
+
+   // Step 3: image type pattern matching
+   for (const sig of imageSignatures) {
+     const result = matchesSignature(resourceHeader, sig);
+     if (result) {
+       return result;
+     }
+   }
+
+   // Step 4: audio/video type pattern matching
+   // https://mimesniff.spec.whatwg.org/#matching-an-audio-or-video-type-pattern
+   for (const sig of audioVideoSignatures) {
+     const result = matchesSignature(resourceHeader, sig);
+     if (result) {
+       return result;
+     }
+   }
+   // Then check MP4, WebM, and MP3-without-ID3 signatures
+   const mp4Result = matchMP4(resourceHeader);
+   if (mp4Result) {
+     return mp4Result;
+   }
+   const webmResult = matchWebM(resourceHeader);
+   if (webmResult) {
+     return webmResult;
+   }
+   const mp3Result = matchMP3WithoutID3(resourceHeader);
+   if (mp3Result) {
+     return mp3Result;
+   }
+
+   // Step 5: archive type pattern matching
+   for (const sig of archiveSignatures) {
+     const result = matchesSignature(resourceHeader, sig);
+     if (result) {
+       return result;
+     }
+   }
+
+   // Step 6: If resource header contains no binary data bytes, return text/plain
+   for (let i = 0; i < resourceHeader.length; i++) {
+     if (isBinaryDataByte(resourceHeader[i])) {
+       // Step 7: return application/octet-stream
+       return "application/octet-stream";
+     }
+   }
+
+   return "text/plain";
+ }
+
+ // Apache bug values that trigger text/binary sniffing
+ // https://mimesniff.spec.whatwg.org/#supplied-mime-type-detection-algorithm
+ const apacheBugValues = new Set([
+   "text/plain",
+   "text/plain; charset=ISO-8859-1",
+   "text/plain; charset=iso-8859-1",
+   "text/plain; charset=UTF-8"
+ ]);
+
+ // https://mimesniff.spec.whatwg.org/#supplied-mime-type-detection-algorithm
+ function detectSuppliedMIMEType({ contentTypeHeader, providedType }) {
+   let suppliedMIMEType = null;
+   let checkForApacheBug = false;
+
+   if (contentTypeHeader !== undefined) {
+     // Step 2: HTTP Content-Type header
+     suppliedMIMEType = normalizeMIMEType(contentTypeHeader);
+     if (suppliedMIMEType !== null && typeof contentTypeHeader === "string") {
+       checkForApacheBug = apacheBugValues.has(contentTypeHeader);
+     }
+   } else if (providedType !== undefined) {
+     // Steps 3-4: Filesystem or other protocol
+     suppliedMIMEType = normalizeMIMEType(providedType);
+   }
+   // Step 5: If parsing failed, suppliedMIMEType remains null (undefined per spec)
+
+   return { suppliedMIMEType, checkForApacheBug };
+ }
+
+ /**
+  * Determine the computed MIME type of a resource.
+  * https://mimesniff.spec.whatwg.org/#determining-the-computed-mime-type-of-a-resource
+  *
+  * @param {Uint8Array} resource - The resource bytes
+  * @param {object} options - Options object
+  * @param {string} options.contentTypeHeader - The Content-Type header value (for HTTP resources)
+  * @param {string} options.providedType - MIME type from filesystem or other protocol (for non-HTTP resources)
+  * @param {boolean} options.noSniff - Whether the X-Content-Type-Options: nosniff header was present
+  * @param {function} options.isSupported - Predicate to check if an image/audio/video MIME type is supported
+  * @returns {MIMEType} The computed MIME type
+  */
+ module.exports = function computedMIMEType(
+   resource,
+   { contentTypeHeader, providedType, noSniff = false, isSupported = () => true } = {}
+ ) {
+   const resourceHeader = getResourceHeader(resource);
+   const { suppliedMIMEType, checkForApacheBug } = detectSuppliedMIMEType({ contentTypeHeader, providedType });
+
+   // Step 1: If the supplied MIME type is an XML MIME type or HTML MIME type, return it
+   if (suppliedMIMEType !== null && (isXMLMIMEType(suppliedMIMEType) || isHTMLMIMEType(suppliedMIMEType))) {
+     return suppliedMIMEType;
+   }
+
+   // Step 2: If supplied MIME type is undefined, or its essence is "unknown/unknown",
+   // "application/unknown", or "*/*", execute rules for identifying an unknown MIME type
+   if (suppliedMIMEType === null ||
+       suppliedMIMEType.essence === "unknown/unknown" ||
+       suppliedMIMEType.essence === "application/unknown" ||
+       suppliedMIMEType.essence === "*/*") {
+     // sniff-scriptable flag is the inverse of no-sniff flag
+     return new MIMEType(identifyAnUnknownMIMEType(resourceHeader, { sniffScriptable: !noSniff }));
+   }
+
+   // Step 3: If the no-sniff flag is set, return the supplied MIME type
+   if (noSniff) {
+     return suppliedMIMEType;
+   }
+
+   // Step 4: If the check-for-apache-bug flag is set, execute rules for distinguishing
+   // if a resource is text or binary
+   if (checkForApacheBug) {
+     return new MIMEType(distinguishTextOrBinary(resourceHeader));
+   }
+
+   // Steps 5-6: If supplied MIME type is a supported image MIME type, execute image pattern matching
+   if (isImageMIMEType(suppliedMIMEType) && isSupported(suppliedMIMEType)) {
+     const imageResult = matchImageType(resourceHeader);
+     if (imageResult !== null) {
+       return new MIMEType(imageResult);
+     }
+   }
+
+   // Steps 7-8: If supplied MIME type is a supported audio/video type, execute audio/video matching
+   if (isAudioOrVideoMIMEType(suppliedMIMEType) && isSupported(suppliedMIMEType)) {
+     const avResult = matchAudioOrVideoType(resourceHeader);
+     if (avResult !== null) {
+       return new MIMEType(avResult);
+     }
+   }
+
+   // Step 9: Return the supplied MIME type
+   return suppliedMIMEType;
+ };
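
Note on the Apache-bug path in the file above: `checkForApacheBug` is set by an exact, case-sensitive lookup of the raw `Content-Type` string in `apacheBugValues`, so a header such as `text/plain; charset=utf-8` does not trigger text-or-binary sniffing. The following is a minimal, self-contained sketch of that behavior, not the package's code. `isBinaryDataByte` is defined outside this excerpt, so the version here is an assumption based on the MIME Sniffing Standard's definition of a binary data byte; `sketchTextOrBinary` and `sketchApacheBugSniff` are hypothetical names used only for illustration.

```javascript
// The four header values listed in the diff above.
const apacheBugValues = new Set([
  "text/plain",
  "text/plain; charset=ISO-8859-1",
  "text/plain; charset=iso-8859-1",
  "text/plain; charset=UTF-8"
]);

// Assumption: stand-in for the package's isBinaryDataByte helper, using the
// byte ranges the MIME Sniffing Standard defines as binary data bytes.
function isBinaryDataByte(byte) {
  return (byte >= 0x00 && byte <= 0x08) ||
    byte === 0x0B ||
    (byte >= 0x0E && byte <= 0x1A) ||
    (byte >= 0x1C && byte <= 0x1F);
}

// Hypothetical condensed version of the text-or-binary rules: a BOM wins
// first, then any binary data byte makes the resource binary.
function sketchTextOrBinary(bytes) {
  const startsWith = (...prefix) => prefix.every((b, i) => bytes[i] === b);
  if (startsWith(0xFE, 0xFF) || startsWith(0xFF, 0xFE) || startsWith(0xEF, 0xBB, 0xBF)) {
    return "text/plain";
  }
  return bytes.some(isBinaryDataByte) ? "application/octet-stream" : "text/plain";
}

// Hypothetical wrapper: returns null when the raw header is not one of the
// Apache-bug values, meaning this rule does not apply.
function sketchApacheBugSniff(contentTypeHeader, bytes) {
  if (!apacheBugValues.has(contentTypeHeader)) {
    return null;
  }
  return sketchTextOrBinary(bytes);
}
```

Under these assumptions, a PNG signature served as plain `text/plain` would be reclassified as `application/octet-stream` (its `0x1A` byte is a binary data byte), while the same bytes served with a lowercase `charset=utf-8` parameter would skip this rule entirely.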
package/package.json CHANGED
@@ -8,33 +8,32 @@
      "http",
      "whatwg"
    ],
-   "version": "4.0.0",
+   "version": "5.0.0",
    "author": "Domenic Denicola <d@domenic.me> (https://domenic.me/)",
    "license": "MIT",
    "repository": "jsdom/whatwg-mimetype",
-   "main": "lib/mime-type.js",
+   "main": "lib/index.js",
    "files": [
      "lib/"
    ],
    "scripts": {
      "test": "node --test",
      "coverage": "c8 node --test --experimental-test-coverage",
-     "lint": "eslint .",
-     "pretest": "node scripts/get-latest-platform-tests.js"
+     "lint": "eslint",
+     "pretest": "node scripts/get-latest-platform-tests.mjs"
    },
    "devDependencies": {
-     "@domenic/eslint-config": "^3.0.0",
-     "c8": "^8.0.1",
-     "eslint": "^8.53.0",
-     "printable-string": "^0.3.0",
-     "whatwg-encoding": "^3.0.0"
+     "@domenic/eslint-config": "^4.0.1",
+     "@exodus/bytes": "^1.9.0",
+     "c8": "^10.1.3",
+     "eslint": "^9.39.2",
+     "printable-string": "^0.3.0"
    },
    "engines": {
-     "node": ">=18"
+     "node": ">=20"
    },
    "c8": {
      "reporter": [
-       "text",
        "html"
      ],
      "exclude": [