gatsby-source-filesystem 5.5.0-next.0 → 5.6.0-next.0

package/CHANGELOG.md CHANGED
@@ -3,6 +3,14 @@
3
3
  All notable changes to this project will be documented in this file.
4
4
  See [Conventional Commits](https://conventionalcommits.org) for commit guidelines.
5
5
 
6
+ ## [5.4.0](https://github.com/gatsbyjs/gatsby/commits/gatsby-source-filesystem@5.4.0/packages/gatsby-source-filesystem) (2023-01-10)
7
+
8
+ [🧾 Release notes](https://www.gatsbyjs.com/docs/reference/release-notes/v5.4)
9
+
10
+ #### Chores
11
+
12
+ - update babel monorepo [#37386](https://github.com/gatsbyjs/gatsby/issues/37386) ([b941876](https://github.com/gatsbyjs/gatsby/commit/b94187633d94d0f0071b38ffe93380dd802ec70f))
13
+
6
14
  ### [5.3.1](https://github.com/gatsbyjs/gatsby/commits/gatsby-source-filesystem@5.3.1/packages/gatsby-source-filesystem) (2022-12-14)
7
15
 
8
16
  **Note:** Version bump only for package gatsby-source-filesystem
package/README.md CHANGED
@@ -1,35 +1,28 @@
1
1
  # gatsby-source-filesystem
2
2
 
3
- A Gatsby source plugin for sourcing data into your Gatsby application
4
- from your local filesystem.
3
+ A Gatsby plugin for sourcing data into your Gatsby application from your local filesystem.
5
4
 
6
- The plugin creates `File` nodes from files. The various "transformer"
7
- plugins can transform `File` nodes into various other types of data e.g.
8
- `gatsby-transformer-json` transforms JSON files into JSON data nodes and
9
- `gatsby-transformer-remark` transforms markdown files into `MarkdownRemark`
10
- nodes from which you can query an HTML representation of the markdown.
5
+ The plugin creates `File` nodes from files. The various [transformer plugins](https://www.gatsbyjs.com/plugins/?=gatsby-transformer) can transform `File` nodes into other types of data e.g. [`gatsby-transformer-json`](https://www.gatsbyjs.com/plugins/gatsby-transformer-json/) transforms JSON files into `JSON` nodes and [`gatsby-transformer-remark`](https://www.gatsbyjs.com/plugins/gatsby-transformer-remark/) transforms markdown files into `MarkdownRemark` nodes.
11
6
 
12
7
  ## Install
13
8
 
14
- `npm install gatsby-source-filesystem`
9
+ ```shell
10
+ npm install gatsby-source-filesystem
11
+ ```
15
12
 
16
13
  ## How to use
17
14
 
18
- ```javascript
19
- // In your gatsby-config.js
15
+ You can have multiple instances of this plugin in your `gatsby-config` to read files from different locations on your filesystem. Be sure to give each instance a unique `name`.
16
+
17
+ ```js:title=gatsby-config.js
20
18
  module.exports = {
21
19
  plugins: [
22
- // You can have multiple instances of this plugin
23
- // to read source nodes from different locations on your
24
- // filesystem.
25
- //
26
- // The following sets up the Jekyll pattern of having a
27
- // "pages" directory for Markdown files and a "data" directory
28
- // for `.json`, `.yaml`, `.csv`.
29
20
  {
30
21
  resolve: `gatsby-source-filesystem`,
31
22
  options: {
23
+ // The unique name for each instance
32
24
  name: `pages`,
25
+ // Path to the directory
33
26
  path: `${__dirname}/src/pages/`,
34
27
  },
35
28
  },
@@ -38,18 +31,37 @@ module.exports = {
38
31
  options: {
39
32
  name: `data`,
40
33
  path: `${__dirname}/src/data/`,
41
- ignore: [`**/\.*`], // ignore files starting with a dot
34
+ // Ignore files starting with a dot
35
+ ignore: [`**/\.*`],
36
+ // Use "mtime" and "inode" to fingerprint files (to check if file has changed)
37
+ fastHash: true,
42
38
  },
43
39
  },
44
40
  ],
45
41
  }
46
42
  ```
47
43
 
44
+ In the above example every file under `src/pages` and `src/data` will be made available as a `File` node inside GraphQL. You don't need to set up another instance of `gatsby-source-filesystem` for e.g. `src/data/images` (since those files are already sourced). However, if you want to be able to filter your files you can set up a new instance and later use the `sourceInstanceName`.
45
+
48
46
  ## Options
49
47
 
50
- In addition to the name and path parameters you may pass an optional `ignore` array of file globs to ignore.
48
+ ### name
49
+
50
+ **Required**
51
+
52
+ A unique name for the `gatsby-source-filesystem` instance. This name will also be a key on the `File` node called `sourceInstanceName`. You can use this e.g. for filtering.
53
+
54
+ ### path
55
+
56
+ **Required**
57
+
58
+ Path to the folder that should be sourced. Ideally an absolute path.
59
+
60
+ ### ignore
61
+
62
+ **Optional**
51
63
 
52
- They will be added to the following default list:
64
+ Array of file globs to ignore. They will be added to the following default list:
53
65
 
54
66
  ```text
55
67
  **/*.un~
@@ -62,37 +74,51 @@ They will be added to the following default list:
62
74
  ../**/dist/**
63
75
  ```
64
76
 
65
- To prevent concurrent requests overload of `processRemoteNode`, you can adjust the `200` default concurrent downloads, with `GATSBY_CONCURRENT_DOWNLOAD` environment variable.
77
+ ### fastHash
78
+
79
+ **Optional**
80
+
81
+ By default, `gatsby-source-filesystem` creates an MD5 hash of each file to determine if it has changed between sourcing. However, on sites with many large files this can lead to a significant slowdown. Thus you can enable the `fastHash` setting to use an alternative hashing mechanism.
82
+
83
+ `fastHash` uses the `mtime` and `inode` to fingerprint the files. On a modern OS this can be considered a robust way to determine if a file has changed; however, on older systems it can be unreliable. Therefore it's not enabled by default.
84
+
85
+ ### Environment variables
86
+
87
+ - `GATSBY_CONCURRENT_DOWNLOAD` (default: `200`). To avoid overloading with concurrent requests, you can configure the concurrency of `processRemoteNode`.
88
+
89
+ If you have a spotty network or slow connection, you can adjust the retries and timeouts:
90
+
91
+ - `GATSBY_STALL_RETRY_LIMIT` (default: `3`)
92
+ - `GATSBY_STALL_TIMEOUT` (default: `30000`)
93
+ - `GATSBY_CONNECTION_TIMEOUT` (default: `30000`)
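A minimal sketch of how settings like these could be read with their defaults. The variable names and default values come from the list above; the `readIntEnv` helper and the `settings` object are hypothetical and are not the plugin's own code.

```javascript
// Hypothetical helper, not part of gatsby-source-filesystem:
// read an integer setting from the environment, falling back to a default
// when the variable is unset or not a number.
function readIntEnv(name, fallback, env = process.env) {
  const parsed = Number.parseInt(env[name], 10)
  return Number.isNaN(parsed) ? fallback : parsed
}

// Defaults taken from the list above.
const settings = {
  concurrentDownloads: readIntEnv("GATSBY_CONCURRENT_DOWNLOAD", 200),
  stallRetryLimit: readIntEnv("GATSBY_STALL_RETRY_LIMIT", 3),
  stallTimeout: readIntEnv("GATSBY_STALL_TIMEOUT", 30000),
  connectionTimeout: readIntEnv("GATSBY_CONNECTION_TIMEOUT", 30000),
}

console.log(settings)
```

In a POSIX shell, a variable can be set for a single run by prefixing the command, for example `GATSBY_CONCURRENT_DOWNLOAD=50 gatsby build`.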
66
94
 
67
95
  ## How to query
68
96
 
69
- You can query file nodes like the following:
97
+ You can query the `File` nodes as follows:
70
98
 
71
99
  ```graphql
72
100
  {
73
101
  allFile {
74
- edges {
75
- node {
76
- extension
77
- dir
78
- modifiedTime
79
- }
102
+ nodes {
103
+ extension
104
+ dir
105
+ modifiedTime
80
106
  }
81
107
  }
82
108
  }
83
109
  ```
84
110
 
85
- To filter by the `name` you specified in the config, use `sourceInstanceName`:
111
+ Use [GraphiQL](https://www.gatsbyjs.com/docs/how-to/querying-data/running-queries-with-graphiql/) to explore all available keys.
112
+
113
+ To filter by the `name` you specified in the `gatsby-config`, use `sourceInstanceName`:
86
114
 
87
115
  ```graphql
88
116
  {
89
117
  allFile(filter: { sourceInstanceName: { eq: "data" } }) {
90
- edges {
91
- node {
92
- extension
93
- dir
94
- modifiedTime
95
- }
118
+ nodes {
119
+ extension
120
+ dir
121
+ modifiedTime
96
122
  }
97
123
  }
98
124
  }
@@ -102,24 +128,24 @@ To filter by the `name` you specified in the config, use `sourceInstanceName`:
102
128
 
103
129
  `gatsby-source-filesystem` exports three helper functions:
104
130
 
105
- - `createFilePath`
106
- - `createRemoteFileNode`
107
- - `createFileNodeFromBuffer`
131
+ - [`createFilePath`](#createfilepath)
132
+ - [`createRemoteFileNode`](#createremotefilenode)
133
+ - [`createFileNodeFromBuffer`](#createfilenodefrombuffer)
108
134
 
109
- ### createFilePath
135
+ ### `createFilePath`
110
136
 
111
- When building pages from files, you often want to create a URL from a file's path on the file system. E.g. if you have a markdown file at `src/content/2018-01-23-an-exploration-of-the-nature-of-reality/index.md`, you might want to turn that into a page on your site at `example.com/2018-01-23-an-exploration-of-the-nature-of-reality/`. `createFilePath` is a helper function to make this task easier.
137
+ When building pages from files, you often want to create a URL from a file's path on the filesystem. For example, if you have a markdown file at `src/content/2018-01-23-my-blog-post/index.md`, you might want to turn that into a page on your site at `example.com/blog/2018-01-23-my-blog-post/`. `createFilePath` is a helper function to make this task easier.
112
138
 
113
139
  ```javascript
114
140
  createFilePath({
115
141
  // The node you'd like to convert to a path
116
- // e.g. from a markdown, JSON, YAML file, etc
142
+ // e.g. from a markdown, JSON, YAML file, etc.
117
143
  node,
118
144
  // Method used to get a node
119
145
  // The parameter from `onCreateNode` should be passed in here
120
146
  getNode,
121
147
  // The base path for your files.
122
- // It is relative to the `options.path` setting in the `gatsby-source-filesystem` entries of your `gatsby-config.js`.
148
+ // It is relative to the `options.path` setting in the `gatsby-source-filesystem` entries of your `gatsby-config`.
123
149
  // Defaults to `src/pages`. For the example above, you'd use `src/content`.
124
150
  basePath,
125
151
  // Whether you want your file paths to contain a trailing `/` slash
@@ -128,35 +154,35 @@ createFilePath({
128
154
  })
129
155
  ```
130
156
 
131
- #### Example usage
157
+ #### Example
132
158
 
133
- ```javascript
159
+ ```js:title=gatsby-node.js
134
160
  const { createFilePath } = require(`gatsby-source-filesystem`)
135
161
 
136
162
  exports.onCreateNode = ({ node, getNode, actions }) => {
137
163
  const { createNodeField } = actions
138
164
  // Ensures we are processing only markdown files
139
165
  if (node.internal.type === "MarkdownRemark") {
140
- // Use `createFilePath` to turn markdown files in our `data/faqs` directory into `/faqs/slug`
166
+ // Use `createFilePath` to turn markdown files in our `src/content` directory into `/blog/slug`
141
167
  const relativeFilePath = createFilePath({
142
168
  node,
143
169
  getNode,
144
- basePath: "data/faqs/",
170
+ basePath: "src/content",
145
171
  })
146
172
 
147
173
  // Creates new query'able field with name of 'slug'
148
174
  createNodeField({
149
175
  node,
150
176
  name: "slug",
151
- value: `/faqs${relativeFilePath}`,
177
+ value: `/blog${relativeFilePath}`,
152
178
  })
153
179
  }
154
180
  }
155
181
  ```
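To make the path-to-URL idea concrete, here is a simplified sketch of the kind of transformation involved. It is an illustration under stated assumptions, not the implementation of `createFilePath`: the `toSlug` function is hypothetical, it takes a path that is already relative to the sourced directory, and it assumes that an `index` file name is dropped from the resulting URL.

```javascript
const path = require("node:path")

// Hypothetical illustration only. `relativePath` is assumed to be the file's
// path relative to the sourced directory, using forward slashes.
function toSlug(relativePath, { trailingSlash = true } = {}) {
  const { dir, name } = path.posix.parse(relativePath)
  // Assumption: `index` files map to their directory's URL.
  const lastSegment = name === "index" ? "" : name
  // `path.posix.join` skips empty segments and collapses repeated slashes.
  return path.posix.join("/", dir, lastSegment, trailingSlash ? "/" : "")
}

const slug = toSlug("2018-01-23-my-blog-post/index.md")
console.log(`/blog${slug}`)
```

Prefixing the result with `/blog`, as the `gatsby-node.js` example does, gives the URL shape described in the text.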
156
182
 
157
- ### createRemoteFileNode
183
+ ### `createRemoteFileNode`
158
184
 
159
- When building source plugins for remote data sources such as headless CMSs, their data will often link to files stored remotely that are often convenient to download so you can work with them locally.
185
+ When building source plugins for remote data sources (Headless CMSs, APIs, etc.), their data will often link to files stored remotely that are often convenient to download so you can work with them locally.
160
186
 
161
187
  The `createRemoteFileNode` helper makes it easy to download remote files and add them to your site's GraphQL schema.
162
188
 
@@ -166,80 +192,63 @@ While downloading the assets, special characters (regex: `/:|\/|\*|\?|"|<|>|\||\
166
192
  createRemoteFileNode({
167
193
  // The source url of the remote file
168
194
  url: `https://example.com/a-file.jpg`,
169
-
170
- // The id of the parent node (i.e. the node to which the new remote File node will be linked to.
195
+ // The id of the parent node (i.e. the node to which the new remote File node will be linked)
171
196
  parentNodeId,
172
-
173
197
  // Gatsby's cache which the helper uses to check if the file has been downloaded already. It's passed to all Node APIs.
174
198
  getCache,
175
-
176
199
  // The action used to create nodes
177
200
  createNode,
178
-
179
201
  // A helper function for creating node Ids
180
202
  createNodeId,
181
-
182
203
  // OPTIONAL
183
204
  // Adds htaccess authentication to the download request if passed in.
184
205
  auth: { htaccess_user: `USER`, htaccess_pass: `PASSWORD` },
185
-
186
206
  // OPTIONAL
187
207
  // Adds extra http headers to download request if passed in.
188
208
  httpHeaders: { Authorization: `Bearer someAccessToken` },
189
-
190
209
  // OPTIONAL
191
210
  // Sets the file extension
192
- ext: ".jpg",
211
+ ext: `.jpg`,
193
212
  })
194
213
  ```
195
214
 
196
- #### Example usage
215
+ #### Example
197
216
 
198
- The following example is pulled from [gatsby-source-wordpress](https://github.com/gatsbyjs/gatsby/tree/master/packages/gatsby-source-wordpress). Downloaded files are created as `File` nodes and then linked to the WordPress Media node, so it can be queried both as a regular `File` node and from the `localFile` field in the Media node.
217
+ The following example is pulled from the [Preprocessing External Images guide](https://www.gatsbyjs.com/docs/how-to/images-and-media/preprocessing-external-images/). Downloaded files are created as `File` nodes and then linked to the `MarkdownRemark` node, so it can be used with e.g. [`gatsby-plugin-image`](https://www.gatsbyjs.com/docs/how-to/images-and-media/using-gatsby-plugin-image/). The file node can then be queried using GraphQL.
199
218
 
200
- ```javascript
201
- const { createRemoteFileNode } = require(`gatsby-source-filesystem`)
219
+ ```js:title=gatsby-node.js
220
+ const { createRemoteFileNode } = require("gatsby-source-filesystem")
202
221
 
203
- exports.downloadMediaFiles = ({
204
- nodes,
205
- getCache,
206
- createNode,
222
+ exports.onCreateNode = async ({
223
+ node,
224
+ actions: { createNode, createNodeField },
207
225
  createNodeId,
208
- _auth,
226
+ getCache,
209
227
  }) => {
210
- nodes.map(async node => {
211
- let fileNode
212
- // Ensures we are only processing Media Files
213
- // `wordpress__wp_media` is the media file type name for WordPress
214
- if (node.__type === `wordpress__wp_media`) {
215
- try {
216
- fileNode = await createRemoteFileNode({
217
- url: node.source_url,
218
- parentNodeId: node.id,
219
- getCache,
220
- createNode,
221
- createNodeId,
222
- auth: _auth,
223
- })
224
- } catch (e) {
225
- // Ignore
226
- }
227
- }
228
+ // For all MarkdownRemark nodes that have a featured image url, call createRemoteFileNode
229
+ if (
230
+ node.internal.type === "MarkdownRemark" &&
231
+ node.frontmatter.featuredImgUrl !== null
232
+ ) {
233
+ const fileNode = await createRemoteFileNode({
234
+ url: node.frontmatter.featuredImgUrl, // string that points to the URL of the image
235
+ parentNodeId: node.id, // id of the parent node of the fileNode you are going to create
236
+ createNode, // helper function in gatsby-node to generate the node
237
+ createNodeId, // helper function in gatsby-node to generate the node id
238
+ getCache,
239
+ })
228
240
 
229
- // Adds a field `localFile` to the node
230
- // ___NODE appendix tells Gatsby that this field will link to another node
241
+ // if the file was created, extend the node with "localFile"
231
242
  if (fileNode) {
232
- node.localFile___NODE = fileNode.id
243
+ createNodeField({ node, name: "localFile", value: fileNode.id })
233
244
  }
234
- })
245
+ }
235
246
  }
236
247
  ```
237
248
 
238
- The file node can then be queried using GraphQL. See an example of this in the [gatsby-source-wordpress README](/plugins/gatsby-source-wordpress/#image-processing) where downloaded images are queried using [gatsby-transformer-sharp](/plugins/gatsby-transformer-sharp/) to use in the component [gatsby-image](/plugins/gatsby-image/).
239
-
240
249
  #### Retrieving the remote file name and extension
241
250
 
242
- The helper tries first to retrieve the file name and extension by parsing the url and the path provided (e.g. if the url is `https://example.com/image.jpg`, the extension will be inferred as `.jpg` and the name as `image`). If the url does not contain an extension, we use the [`file-type`](https://www.npmjs.com/package/file-type) package to infer the file type. Finally, the name and the extension _can_ be explicitly passed, like so:
251
+ The helper first tries to retrieve the file name and extension by parsing the url and the path provided (e.g. if the url is `https://example.com/image.jpg`, the extension will be inferred as `.jpg` and the name as `image`). If the url does not contain an extension, `createRemoteFileNode` uses the [`file-type`](https://www.npmjs.com/package/file-type) package to infer the file type. Finally, the name and the extension _can_ be explicitly passed, like so:
243
252
 
244
253
  ```javascript
245
254
  createRemoteFileNode({
@@ -250,25 +259,24 @@ createRemoteFileNode({
250
259
  createNode,
251
260
  createNodeId,
252
261
  // if necessary!
253
- ext: ".jpg",
254
- name: "image",
262
+ ext: `.jpg`,
263
+ name: `image`,
255
264
  })
256
265
  ```
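The first step described above, inferring the name and extension from the URL, can be sketched as follows. The `inferNameAndExt` helper is hypothetical and only covers URL parsing; it does not reproduce the `file-type` fallback or any other logic inside `createRemoteFileNode`.

```javascript
const path = require("node:path")

// Hypothetical helper: infer `name` and `ext` from the URL's pathname.
// Query strings and fragments are ignored because only `pathname` is parsed.
function inferNameAndExt(url) {
  const { pathname } = new URL(url)
  const { name, ext } = path.posix.parse(pathname)
  return { name, ext }
}

console.log(inferNameAndExt("https://example.com/image.jpg"))
// An empty `ext` is the case where an explicit `ext` option (or content
// sniffing) would be needed.
console.log(inferNameAndExt("https://example.com/files/download"))
```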
257
266
 
258
- ### createFileNodeFromBuffer
267
+ ### `createFileNodeFromBuffer`
259
268
 
260
269
  When working with data that isn't already stored in a file, such as when querying binary/blob fields from a database, it's helpful to cache that data to the filesystem in order to use it with other transformers that accept files as input.
261
270
 
262
- The `createFileNodeFromBuffer` helper accepts a `Buffer`, caches its contents to disk, and creates a file node that points to it.
271
+ The `createFileNodeFromBuffer` helper accepts a `Buffer`, caches its contents to disk, and creates a `File` node that points to it.
263
272
 
264
273
  The name of the file can be passed to the `createFileNodeFromBuffer` helper. If no name is given, the content hash will be used to determine the name.
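A minimal sketch of the naming rule described above: use the given name if there is one, otherwise derive a name from the content. The `fileNameFor` helper is hypothetical, and the choice of MD5 via Node's `crypto` module is an assumption made for illustration; the plugin's actual hashing function may differ.

```javascript
const crypto = require("node:crypto")

// Hypothetical helper: pick a file name for a buffer.
// Assumption: an MD5 hex digest stands in for "the content hash".
function fileNameFor(buffer, name) {
  if (name) return name
  return crypto.createHash("md5").update(buffer).digest("hex")
}

console.log(fileNameFor(Buffer.from("hello"), "logo"))
console.log(fileNameFor(Buffer.from("hello")))
```

Because the fallback name depends only on the bytes, identical buffers map to the same name.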
265
274
 
266
- ## Example usage
275
+ #### Example
267
276
 
268
277
  The following example is adapted from the source of [`gatsby-source-mysql`](https://github.com/malcolm-kee/gatsby-source-mysql):
269
278
 
270
- ```js
271
- // gatsby-node.js
279
+ ```js:title=gatsby-node.js
272
280
  const createMySqlNodes = require(`./create-nodes`)
273
281
 
274
282
  exports.sourceNodes = async ({ actions, createNodeId, getCache }, config) => {
@@ -338,11 +346,3 @@ function createMySqlNodes({ name, __sql, idField, keys }, results, ctx) {
338
346
 
339
347
  module.exports = createMySqlNodes
340
348
  ```
341
-
342
- ## Troubleshooting
343
-
344
- In case that due to spotty network, or slow connection, some remote files fail to download. Even after multiple retries and adjusting concurrent downloads, you can adjust timeout and retry settings with these environment variables:
345
-
346
- - `GATSBY_STALL_RETRY_LIMIT`, default: `3`
347
- - `GATSBY_STALL_TIMEOUT`, default: `30000`
348
- - `GATSBY_CONNECTION_TIMEOUT`, default: `30000`
@@ -4,12 +4,12 @@ const path = require(`path`);
4
4
  const fs = require(`fs-extra`);
5
5
  const mime = require(`mime`);
6
6
  const prettyBytes = require(`pretty-bytes`);
7
- const md5File = require(`md5-file`);
8
7
  const {
9
8
  createContentDigest,
10
- slash
9
+ slash,
10
+ md5File
11
11
  } = require(`gatsby-core-utils`);
12
- exports.createFileNode = async (pathToFile, createNodeId, pluginOptions = {}) => {
12
+ exports.createFileNode = async (pathToFile, createNodeId, pluginOptions = {}, cache = null) => {
13
13
  const slashed = slash(pathToFile);
14
14
  const parsedSlashed = path.parse(slashed);
15
15
  const slashedFile = {
@@ -31,7 +31,19 @@ exports.createFileNode = async (pathToFile, createNodeId, pluginOptions = {}) =>
31
31
  description: `Directory "${path.relative(process.cwd(), slashed)}"`
32
32
  };
33
33
  } else {
34
- const contentDigest = await md5File(slashedFile.absolutePath);
34
+ const key = stats.mtimeMs.toString() + stats.ino.toString();
35
+ let contentDigest;
36
+ if (pluginOptions.fastHash) {
37
+ // Skip hashing.
38
+ contentDigest = key;
39
+ } else {
40
+ // Generate a hash, but only if the file has changed.
41
+ contentDigest = cache && (await cache.get(key));
42
+ if (!contentDigest) {
43
+ contentDigest = await md5File(slashedFile.absolutePath);
44
+ if (cache) await cache.set(key, contentDigest);
45
+ }
46
+ }
35
47
  const mediaType = mime.getType(slashedFile.ext);
36
48
  internal = {
37
49
  contentDigest,
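The digest strategy introduced in this hunk can be sketched as a standalone function. This is a sketch under assumptions, not the package's code: `fileDigest` and `createMemoryCache` are hypothetical names, Node's `crypto` module stands in for `md5File`, and the cache is assumed to be any object with async `get` and `set` methods.

```javascript
const crypto = require("node:crypto")
const fs = require("node:fs/promises")

// Sketch: return a fingerprint for a file.
// With `fastHash`, the mtime + inode key is used directly and the file is not read.
// Otherwise an MD5 of the contents is computed, reusing a cached value for the same key.
async function fileDigest(filePath, { fastHash = false } = {}, cache = null) {
  const stats = await fs.stat(filePath)
  const key = stats.mtimeMs.toString() + stats.ino.toString()
  if (fastHash) return key

  let digest = cache && (await cache.get(key))
  if (!digest) {
    const contents = await fs.readFile(filePath)
    digest = crypto.createHash("md5").update(contents).digest("hex")
    if (cache) await cache.set(key, digest)
  }
  return digest
}

// Hypothetical in-memory stand-in for a cache with async `get`/`set`.
function createMemoryCache() {
  const store = new Map()
  return {
    get: async key => store.get(key),
    set: async (key, value) => {
      store.set(key, value)
    },
  }
}
```

The cache key changes whenever the modification time or inode changes, so an unchanged file skips the content read on later runs, while `fastHash` skips it on every run.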
package/gatsby-node.js CHANGED
@@ -32,10 +32,11 @@ const createFSMachine = ({
32
32
  },
33
33
  getNode,
34
34
  createNodeId,
35
- reporter
35
+ reporter,
36
+ cache
36
37
  }, pluginOptions) => {
37
38
  const createAndProcessNode = path => {
38
- const fileNodePromise = createFileNode(path, createNodeId, pluginOptions).then(fileNode => {
39
+ const fileNodePromise = createFileNode(path, createNodeId, pluginOptions, cache).then(fileNode => {
39
40
  createNode(fileNode);
40
41
  return null;
41
42
  });
@@ -190,6 +191,7 @@ exports.pluginOptionsSchema = ({
190
191
  }) => Joi.object({
191
192
  name: Joi.string(),
192
193
  path: Joi.string(),
194
+ fastHash: Joi.boolean().default(false),
193
195
  ignore: Joi.array().items(Joi.string(), Joi.object().regex(), Joi.function())
194
196
  });
195
197
  exports.sourceNodes = (api, pluginOptions) => {
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "gatsby-source-filesystem",
3
3
  "description": "Gatsby source plugin for building websites from local data. Markdown, JSON, images, YAML, CSV, and dozens of other data types supported.",
4
- "version": "5.5.0-next.0",
4
+ "version": "5.6.0-next.0",
5
5
  "author": "Kyle Mathews <mathews.kyle@gmail.com>",
6
6
  "bugs": {
7
7
  "url": "https://github.com/gatsbyjs/gatsby/issues"
@@ -10,10 +10,9 @@
10
10
  "@babel/runtime": "^7.20.7",
11
11
  "chokidar": "^3.5.3",
12
12
  "file-type": "^16.5.4",
13
- "fs-extra": "^10.1.0",
14
- "gatsby-core-utils": "^4.5.0-next.0",
15
- "md5-file": "^5.0.0",
16
- "mime": "^2.6.0",
13
+ "fs-extra": "^11.1.0",
14
+ "gatsby-core-utils": "^4.6.0-next.0",
15
+ "mime": "^3.0.0",
17
16
  "pretty-bytes": "^5.6.0",
18
17
  "valid-url": "^1.0.9",
19
18
  "xstate": "^4.34.0"
@@ -21,7 +20,7 @@
21
20
  "devDependencies": {
22
21
  "@babel/cli": "^7.20.7",
23
22
  "@babel/core": "^7.20.7",
24
- "babel-preset-gatsby-package": "^3.5.0-next.0",
23
+ "babel-preset-gatsby-package": "^3.6.0-next.0",
25
24
  "cross-env": "^7.0.3"
26
25
  },
27
26
  "homepage": "https://github.com/gatsbyjs/gatsby/tree/master/packages/gatsby-source-filesystem#readme",
@@ -47,5 +46,5 @@
47
46
  "engines": {
48
47
  "node": ">=18.0.0"
49
48
  },
50
- "gitHead": "4ef5d775d18329abc5d4e78d1e0a99e904f00cd1"
49
+ "gitHead": "ede0901d34f1224914a873d59c2a370297e47422"
51
50
  }