file-entry-cache 10.1.4 → 11.0.0-beta.2

package/README.md CHANGED
@@ -15,19 +15,25 @@
15
15
  - Ideal for processes that work on a specific set of files
16
16
  - Persists cache to Disk via `reconcile()` or `persistInterval` on `cache` options.
17
17
  - Uses `checksum` to determine if a file has changed
18
- - Supports `relative` and `absolute` paths
19
- - Ability to rename keys in the cache. Useful when renaming directories.
18
+ - Supports `relative` and `absolute` paths with configurable current working directory
19
+ - Portable cache files when using relative paths
20
20
  - ESM and CommonJS support with Typescript
21
21
 
22
22
  # Table of Contents
23
23
 
24
24
  - [Installation](#installation)
25
25
  - [Getting Started](#getting-started)
26
+ - [Changes from v10 to v11](#changes-from-v10-to-v11)
26
27
  - [Changes from v9 to v10](#changes-from-v9-to-v10)
27
28
  - [Global Default Functions](#global-default-functions)
28
29
  - [FileEntryCache Options (FileEntryCacheOptions)](#fileentrycache-options-fileentrycacheoptions)
29
30
  - [API](#api)
30
31
  - [Get File Descriptor](#get-file-descriptor)
32
+ - [Path Handling and Current Working Directory](#path-handling-and-current-working-directory)
33
+ - [Cache Portability](#cache-portability)
34
+ - [Maximum Portability with Checksums](#maximum-portability-with-checksums)
35
+ - [Handling Project Relocations](#handling-project-relocations)
36
+ - [Path Security and Traversal Prevention](#path-security-and-traversal-prevention)
31
37
  - [Using Checksums to Determine if a File has Changed (useCheckSum)](#using-checksums-to-determine-if-a-file-has-changed-usechecksum)
32
38
  - [Setting Additional Meta Data](#setting-additional-meta-data)
33
39
  - [How to Contribute](#how-to-contribute)
@@ -43,24 +49,30 @@ npm install file-entry-cache
43
49
  ```javascript
44
50
  import fileEntryCache from 'file-entry-cache';
45
51
  const cache = fileEntryCache.create('cache1');
46
- let fileDescriptor = cache.getFileDescriptor('file.txt');
52
+
53
+ // Using relative paths
54
+ let fileDescriptor = cache.getFileDescriptor('./src/file.txt');
47
55
  console.log(fileDescriptor.changed); // true as it is the first time
48
- fileDescriptor = cache.getFileDescriptor('file.txt');
56
+ console.log(fileDescriptor.key); // './src/file.txt' (stored as provided)
57
+
58
+ fileDescriptor = cache.getFileDescriptor('./src/file.txt');
49
59
  console.log(fileDescriptor.changed); // false as it has not changed
60
+
50
61
  // do something to change the file
51
- fs.writeFileSync('file.txt', 'new data foo bar');
62
+ fs.writeFileSync('./src/file.txt', 'new data foo bar');
63
+
52
64
  // check if the file has changed
53
- fileDescriptor = cache.getFileDescriptor('file.txt');
65
+ fileDescriptor = cache.getFileDescriptor('./src/file.txt');
54
66
  console.log(fileDescriptor.changed); // true
55
67
  ```
56
68
 
57
- Save it to Disk and Reconsile files that are no longer found
69
+ Save it to Disk and Reconcile files that are no longer found
58
70
  ```javascript
59
71
  import fileEntryCache from 'file-entry-cache';
60
72
  const cache = fileEntryCache.create('cache1');
61
- let fileDescriptor = cache.getFileDescriptor('file.txt');
73
+ let fileDescriptor = cache.getFileDescriptor('./src/file.txt');
62
74
  console.log(fileDescriptor.changed); // true as it is the first time
63
- fileEntryCache.reconcile(); // save the cache to disk and remove files that are no longer found
75
+ cache.reconcile(); // save the cache to disk and remove files that are no longer found
64
76
  ```
65
77
 
66
78
  Load the cache from a file:
@@ -68,33 +80,43 @@ Load the cache from a file:
68
80
  ```javascript
69
81
  import fileEntryCache from 'file-entry-cache';
70
82
  const cache = fileEntryCache.createFromFile('/path/to/cache/file');
71
- let fileDescriptor = cache.getFileDescriptor('file.txt');
83
+ let fileDescriptor = cache.getFileDescriptor('./src/file.txt');
72
84
  console.log(fileDescriptor.changed); // false as it has not changed from the saved cache.
73
85
  ```
74
86
 
87
+
88
+ # Changes from v10 to v11
89
+
90
+ **BREAKING CHANGES:**
91
+ - **`strictPaths` now defaults to `true`** - Path traversal protection is enabled by default for security. To restore v10 behavior, explicitly set `strictPaths: false`
92
+
93
+ **NEW FEATURES:**
94
+ - **Added `cwd` option** - You can now specify a custom current working directory for resolving relative paths
95
+ - **Added `strictPaths` option** - Provides protection against path traversal attacks (enabled by default)
96
+ - **Improved cache portability** - When using relative paths with the same `cwd`, cache files are portable across different environments
97
+
75
98
  # Changes from v9 to v10
76
99
 
77
100
  There have been many features added and changes made to the `file-entry-cache` class. Here are the main changes:
78
101
  - Added `cache` object to the options to allow for more control over the cache
79
- - Added `hashAlgorithm` to the options to allow for different checksum algorithms. Note that if you load from file it most likely will break if the value was something before.
80
- - Updated more on using Relative or Absolute paths. We now support both on `getFileDescriptor()`. You can read more on this in the `Get File Descriptor` section.
102
+ - Added `hashAlgorithm` to the options to allow for different checksum algorithms. Note that if you load from file it most likely will break if the value was something before.
81
103
  - Migrated to Typescript with ESM and CommonJS support. This allows for better type checking and support for both ESM and CommonJS.
82
- - Once options are passed in they get assigned as properties such as `hashAlgorithm` and `currentWorkingDirectory`. This allows for better control and access to the options. For the Cache options they are assigned to `cache` such as `cache.ttl` and `cache.lruSize`.
104
+ - Once options are passed in they get assigned as properties such as `hashAlgorithm`. For the Cache options they are assigned to `cache` such as `cache.ttl` and `cache.lruSize`.
83
105
  - Added `cache.persistInterval` to allow for saving the cache to disk at a specific interval. This will save the cache to disk at the interval specified instead of calling `reconcile()` to save. (`off` by default)
84
106
  - Added `getFileDescriptorsByPath(filePath: string): FileEntryDescriptor[]` to get all the file descriptors that start with the path specified. This is useful when you want to get all the files in a directory or a specific path.
85
- - Added `renameAbsolutePathKeys(oldPath: string, newPath: string): void` will rename the keys in the cache from the old path to the new path. This is useful when you rename a directory and want to update the cache without reanalyzing the files.
86
107
  - Using `flat-cache` v6 which is a major update. This allows for better performance and more control over the cache.
87
108
  - On `FileEntryDescriptor.meta` if using typescript you need to use the `meta.data` to set additional information. This is to allow for better type checking and to avoid conflicts with the `meta` object which was `any`.
88
109
 
89
110
  # Global Default Functions
90
- - `create(cacheId: string, cacheDirectory?: string, useCheckSum?: boolean, currentWorkingDirectory?: string)` - Creates a new instance of the `FileEntryCache` class
91
- - `createFromFile(cachePath: string, useCheckSum?: boolean, currentWorkingDirectory?: string)` - Creates a new instance of the `FileEntryCache` class and loads the cache from a file.
111
+ - `create(cacheId: string, cacheDirectory?: string, useCheckSum?: boolean, cwd?: string)` - Creates a new instance of the `FileEntryCache` class
112
+ - `createFromFile(cachePath: string, useCheckSum?: boolean, cwd?: string)` - Creates a new instance of the `FileEntryCache` class and loads the cache from a file.
92
113
 
93
114
  # FileEntryCache Options (FileEntryCacheOptions)
94
- - `currentWorkingDirectory?` - The current working directory. Used when resolving relative paths.
95
115
  - `useModifiedTime?` - If `true` it will use the modified time to determine if the file has changed. Default is `true`
96
116
  - `useCheckSum?` - If `true` it will use a checksum to determine if the file has changed. Default is `false`
97
117
  - `hashAlgorithm?` - The algorithm to use for the checksum. Default is `md5` but can be any algorithm supported by `crypto.createHash`
118
+ - `cwd?` - The current working directory for resolving relative paths. Default is `process.cwd()`
119
+ - `strictPaths?` - If `true` restricts file access to within `cwd` boundaries, preventing path traversal attacks. Default is `true`
98
120
  - `cache.ttl?` - The time to live for the cache in milliseconds. Default is `0` which means no expiration
99
121
  - `cache.lruSize?` - The number of items to keep in the cache. Default is `0` which means no limit
100
122
  - `cache.useClone?` - If `true` it will clone the data before returning it. Default is `false`
@@ -110,51 +132,121 @@ There have been many features added and changes made to the `file-entry-cache` c
110
132
  - `constructor(options?: FileEntryCacheOptions)` - Creates a new instance of the `FileEntryCache` class
111
133
  - `useCheckSum: boolean` - If `true` it will use a checksum to determine if the file has changed. Default is `false`
112
134
  - `hashAlgorithm: string` - The algorithm to use for the checksum. Default is `md5` but can be any algorithm supported by `crypto.createHash`
113
- - `currentWorkingDirectory: string` - The current working directory. Used when resolving relative paths.
114
135
  - `getHash(buffer: Buffer): string` - Gets the hash of a buffer used for checksums
115
- - `createFileKey(filePath: string): string` - Creates a key for the file path. This is used to store the data in the cache based on relative or absolute paths.
116
- - `deleteCacheFile(filePath: string): void` - Deletes the cache file
117
- - `destroy(): void` - Destroys the cache. This will also delete the cache file. If using cache persistence it will stop the interval.
118
- - `removeEntry(filePath: string): void` - Removes an entry from the cache. This can be `relative` or `absolute` paths.
136
+ - `cwd: string` - The current working directory for resolving relative paths. Default is `process.cwd()`
137
+ - `strictPaths: boolean` - If `true` restricts file access to within `cwd` boundaries. Default is `true`
138
+ - `createFileKey(filePath: string): string` - Returns the cache key for the file path (returns the path exactly as provided).
139
+ - `deleteCacheFile(): boolean` - Deletes the cache file from disk
140
+ - `destroy(): void` - Destroys the cache. This will clear the cache in memory. If using cache persistence it will stop the interval.
141
+ - `removeEntry(filePath: string): void` - Removes an entry from the cache.
119
142
  - `reconcile(): void` - Saves the cache to disk and removes any files that are no longer found.
120
143
  - `hasFileChanged(filePath: string): boolean` - Checks if the file has changed. This will return `true` if the file has changed.
121
- - `getFileDescriptor(filePath: string, options?: { useModifiedTime?: boolean, useCheckSum?: boolean, currentWorkingDirectory?: string }): FileEntryDescriptor` - Gets the file descriptor for the file. Please refer to the entire section on `Get File Descriptor` for more information.
122
- - `normalizeEntries(entries: FileEntryDescriptor[]): FileEntryDescriptor[]` - Normalizes the entries to have the correct paths. This is used when loading the cache from disk.
144
+ - `getFileDescriptor(filePath: string, options?: { useModifiedTime?: boolean, useCheckSum?: boolean }): FileEntryDescriptor` - Gets the file descriptor for the file. Please refer to the entire section on `Get File Descriptor` for more information.
145
+ - `normalizeEntries(files?: string[]): FileDescriptor[]` - Normalizes the entries. If no files are provided, it will return all cached entries.
123
146
  - `analyzeFiles(files: string[])` will return `AnalyzedFiles` object with `changedFiles`, `notFoundFiles`, and `notChangedFiles` as FileDescriptor arrays.
124
147
  - `getUpdatedFiles(files: string[])` will return an array of `FileEntryDescriptor` objects that have changed.
125
- - `getFileDescriptorsByPath(filePath: string): FileEntryDescriptor[]` will return an array of `FileEntryDescriptor` objects that starts with the path specified.
126
- - `renameAbsolutePathKeys(oldPath: string, newPath: string): void` - Renames the keys in the cache from the old path to the new path. This is useful when you rename a directory and want to update the cache without reanalyzing the files.
148
+ - `getFileDescriptorsByPath(filePath: string): FileEntryDescriptor[]` will return an array of `FileEntryDescriptor` objects that starts with the path prefix specified.
149
+ - `getAbsolutePath(filePath: string): string` - Resolves a relative path to absolute using the configured `cwd`. Returns absolute paths unchanged. When `strictPaths` is enabled, throws an error if the path resolves outside `cwd`.
150
+ - `getAbsolutePathWithCwd(filePath: string, cwd: string): string` - Resolves a relative path to absolute using a custom working directory. When `strictPaths` is enabled, throws an error if the path resolves outside the provided `cwd`.
127
151
 
128
152
  # Get File Descriptor
129
153
 
130
- The `getFileDescriptor(filePath: string, options?: { useCheckSum?: boolean, currentWorkingDirectory?: string }): FileEntryDescriptor` function is used to get the file descriptor for the file. This function will return a `FileEntryDescriptor` object that has the following properties:
154
+ The `getFileDescriptor(filePath: string, options?: { useCheckSum?: boolean, useModifiedTime?: boolean }): FileEntryDescriptor` function is used to get the file descriptor for the file. This function will return a `FileEntryDescriptor` object that has the following properties:
131
155
 
132
- - `key: string` - The key for the file. This is the relative or absolute path of the file.
156
+ - `key: string` - The cache key for the file. This is exactly the path that was provided (relative or absolute).
133
157
  - `changed: boolean` - If the file has changed since the last time it was analyzed.
134
158
  - `notFound: boolean` - If the file was not found.
135
- - `meta: FileEntryMeta` - The meta data for the file. This has the following prperties: `size`, `mtime`, `ctime`, `hash`, `data`. Note that `data` is an object that can be used to store additional information.
159
+ - `meta: FileEntryMeta` - The meta data for the file. This has the following properties: `size`, `mtime`, `hash`, `data`. Note that `data` is an object that can be used to store additional information.
136
160
  - `err` - If there was an error analyzing the file.
137
161
 
138
- We have added the ability to use `relative` or `absolute` paths. If you pass in a `relative` path it will use the `currentWorkingDirectory` to resolve the path. If you pass in an `absolute` path it will use the path as is. This is useful when you want to use `relative` paths but also want to use `absolute` paths.
162
+ ## Path Handling and Current Working Directory
139
163
 
140
- If you do not pass in `currentWorkingDirectory` in the class options or in the `getFileDescriptor` function it will use the `process.cwd()` as the default `currentWorkingDirectory`.
164
+ The cache stores paths exactly as they are provided (relative or absolute). When checking if files have changed, relative paths are resolved using the configured `cwd` (current working directory):
141
165
 
142
166
  ```javascript
143
- const fileEntryCache = new FileEntryCache();
144
- const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { currentWorkingDirectory: '/path/to/directory' });
167
+ // Default: uses process.cwd()
168
+ const cache1 = fileEntryCache.create('cache1');
169
+
170
+ // Custom working directory
171
+ const cache2 = fileEntryCache.create('cache2', './cache', false, '/project/root');
172
+ // Or with options object
173
+ const cache3 = new FileEntryCache({ cwd: '/project/root' });
174
+
175
+ // The cache key is always the provided path
176
+ const descriptor = cache2.getFileDescriptor('./src/file.txt');
177
+ console.log(descriptor.key); // './src/file.txt'
178
+ // But file operations resolve from: '/project/root/src/file.txt'
145
179
  ```
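The split between the stored key and the resolved location can be illustrated without the library. The following is a minimal sketch, not the package's implementation: `describePath` is a hypothetical helper, and it uses `path.posix` only so that the output is the same on every platform.

```javascript
const path = require('node:path').posix;

// Hypothetical helper (not part of file-entry-cache): keeps the key exactly as
// provided and resolves the on-disk location against a configurable cwd.
function describePath(filePath, cwd = process.cwd()) {
  const absolutePath = path.isAbsolute(filePath)
    ? filePath // absolute paths are used unchanged
    : path.resolve(cwd, filePath); // relative paths are resolved from cwd
  return { key: filePath, absolutePath };
}

const onMachineA = describePath('./src/file.txt', '/home/user/project');
const onMachineB = describePath('./src/file.txt', '/workspace/project');

console.log(onMachineA.key === onMachineB.key); // true: the key does not depend on cwd
console.log(onMachineA.absolutePath); // '/home/user/project/src/file.txt'
console.log(onMachineB.absolutePath); // '/workspace/project/src/file.txt'
```

Because the key never contains the machine-specific prefix, two checkouts at different locations would look up the same cache entry, which is the property the portability notes in this README rely on.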
146
180
 
147
- Since this is a relative path it will use the `currentWorkingDirectory` to resolve the path. If you want to use an absolute path you can do the following:
181
+ ### Cache Portability
182
+
183
+ Using relative paths with a consistent `cwd` (defaults to `process.cwd()`) makes cache files portable across different machines and environments. This is especially useful for CI/CD pipelines and team development.
148
184
 
149
185
  ```javascript
150
- const fileEntryCache = new FileEntryCache();
151
- const filePath = path.resolve('/path/to/directory', 'file.txt');
152
- const fileDescriptor = fileEntryCache.getFileDescriptor(filePath);
186
+ // On machine A (project at /home/user/project)
187
+ const cacheA = fileEntryCache.create('build-cache', './cache', false, '/home/user/project');
188
+ cacheA.getFileDescriptor('./src/index.js'); // Resolves to /home/user/project/src/index.js
189
+ cacheA.reconcile();
190
+
191
+ // On machine B (project at /workspace/project)
192
+ const cacheB = fileEntryCache.create('build-cache', './cache', false, '/workspace/project');
193
+ cacheB.getFileDescriptor('./src/index.js'); // Resolves to /workspace/project/src/index.js
194
+ // Cache hit! File hasn't changed since machine A
153
195
  ```
154
196
 
155
- This will save the key as the absolute path.
197
+ ### Maximum Portability with Checksums
156
198
 
157
- If there is an error when trying to get the file descriptor it will return an ``notFound` and `err` property with the error.
199
+ For maximum cache portability across different environments, use checksums (`useCheckSum: true`) along with relative paths and `cwd` which defaults to `process.cwd()`. This ensures that cache validity is determined by file content rather than modification times, which can vary across systems:
200
+
201
+ ```javascript
202
+ // Development machine
203
+ const devCache = fileEntryCache.create(
204
+ '.buildcache',
205
+ './cache', // cache directory
206
+ true // Use checksums for content-based comparison
207
+ );
208
+
209
+ // Process files using relative paths
210
+ const descriptor = devCache.getFileDescriptor('./src/index.js');
211
+ if (descriptor.changed) {
212
+ console.log('Building ./src/index.js...');
213
+ // Build process here
214
+ }
215
+ devCache.reconcile(); // Save cache
216
+
217
+ // CI/CD Pipeline or another developer's machine
218
+ const ciCache = fileEntryCache.create(
219
+ '.buildcache',
220
+ './node_modules/.cache',
221
+ true, // Same checksum setting
222
+ process.cwd() // Different absolute path, same relative structure
223
+ );
224
+
225
+ // Same relative path works across environments
226
+ const descriptor2 = ciCache.getFileDescriptor('./src/index.js');
227
+ if (!descriptor2.changed) {
228
+ console.log('Using cached result for ./src/index.js');
229
+ // Skip rebuild - file content unchanged
230
+ }
231
+ ```
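To see why content hashes are more portable than modification times, the sketch below writes two files with identical content and forces different `mtime` values, roughly what a fresh checkout on another machine would produce. It uses only Node's standard library. `contentHash` is a hypothetical helper, and `md5` is chosen only because the README documents it as the default `hashAlgorithm`; how the library computes its hash internally is not shown here.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: hash a file's content with the given algorithm.
function contentHash(filePath, algorithm = 'md5') {
  return crypto.createHash(algorithm).update(fs.readFileSync(filePath)).digest('hex');
}

// Two files with identical content in a throwaway directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'checksum-sketch-'));
const original = path.join(dir, 'a.js');
const copy = path.join(dir, 'b.js');
fs.writeFileSync(original, 'export default 1;\n');
fs.writeFileSync(copy, 'export default 1;\n');

// Force different modification times on the two files.
const older = new Date('2020-01-01T00:00:00Z');
const newer = new Date('2024-01-01T00:00:00Z');
fs.utimesSync(original, older, older);
fs.utimesSync(copy, newer, newer);

const mtimesDiffer = fs.statSync(original).mtimeMs !== fs.statSync(copy).mtimeMs;
const hashesMatch = contentHash(original) === contentHash(copy);

console.log(mtimesDiffer); // true: a time-based check would report a change
console.log(hashesMatch); // true: a content-based check would not

fs.rmSync(dir, { recursive: true, force: true }); // clean up the temp directory
```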
232
+
233
+ ### Handling Project Relocations
234
+
235
+ Cache remains valid even when projects are moved or renamed:
236
+
237
+ ```javascript
238
+ // Original location: /projects/my-app
239
+ const cache1 = fileEntryCache.create('.cache', './cache', true, '/projects/my-app');
240
+ cache1.getFileDescriptor('./src/app.js');
241
+ cache1.reconcile();
242
+
243
+ // After moving project to: /archived/2024/my-app
244
+ const cache2 = fileEntryCache.create('.cache', './cache', true, '/archived/2024/my-app');
245
+ cache2.getFileDescriptor('./src/app.js'); // Still finds cached entry!
246
+ // Cache valid as long as relative structure unchanged
247
+ ```
248
+
249
+ If there is an error when trying to get the file descriptor it will return a `notFound` and `err` property with the error.
158
250
 
159
251
  ```javascript
160
252
  const fileEntryCache = new FileEntryCache();
@@ -168,6 +260,124 @@ if (fileDescriptor.notFound) {
168
260
  }
169
261
  ```
170
262
 
263
+ # Path Security and Traversal Prevention
264
+
265
+ The `strictPaths` option provides security against path traversal attacks by restricting file access to within the configured `cwd` boundaries. **This is enabled by default (since v11)** to ensure secure defaults when processing untrusted input or when running in security-sensitive environments.
266
+
267
+ ## Basic Usage
268
+
269
+ ```javascript
270
+ // strictPaths is enabled by default for security
271
+ const cache = new FileEntryCache({
272
+ cwd: '/project/root'
273
+ });
274
+
275
+ // This will work - file is within cwd
276
+ const descriptor = cache.getFileDescriptor('./src/index.js');
277
+
278
+ // This will throw an error - attempts to access parent directory
279
+ try {
280
+ cache.getFileDescriptor('../../../etc/passwd');
281
+ } catch (error) {
282
+ console.error(error); // Path traversal attempt blocked
283
+ }
284
+
285
+ // To allow parent directory access (not recommended for untrusted input)
286
+ const unsafeCache = new FileEntryCache({
287
+ cwd: '/project/root',
288
+ strictPaths: false // Explicitly disable protection
289
+ });
290
+ ```
291
+
292
+ ## Security Features
293
+
294
+ When `strictPaths` is enabled:
295
+ - **Path Traversal Prevention**: Blocks attempts to access files outside the working directory using `../` sequences
296
+ - **Null Byte Protection**: Automatically removes null bytes from paths to prevent injection attacks
297
+ - **Path Normalization**: Cleans and normalizes paths to prevent bypass attempts
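The three protections listed above can be approximated with Node's `path` module. This is a sketch under stated assumptions, not the library's actual code: `resolveWithinCwd` is a hypothetical helper, `path.posix` is used only to keep the behavior identical across platforms, and the error text mirrors the message that the README's own example checks for.

```javascript
const path = require('node:path').posix;

// Hypothetical helper (an assumption, not the package's implementation).
// Expects `cwd` to be an absolute path.
function resolveWithinCwd(filePath, cwd) {
  // Null byte protection: strip null bytes before any resolution.
  const sanitized = filePath.replace(/\0/g, '');
  // Path normalization: resolve() collapses '.' and '..' segments.
  const absolutePath = path.resolve(cwd, sanitized);
  // Traversal prevention: the path relative to cwd must not climb out of it.
  const relative = path.relative(cwd, absolutePath);
  if (relative === '..' || relative.startsWith('../')) {
    throw new Error(`Path traversal attempt blocked: ${filePath}`);
  }
  return absolutePath;
}

console.log(resolveWithinCwd('./src/index.js', '/project/root')); // '/project/root/src/index.js'

try {
  resolveWithinCwd('../../../etc/passwd', '/project/root');
} catch (error) {
  console.error(error.message); // 'Path traversal attempt blocked: ../../../etc/passwd'
}
```

Checking the relative path after resolution, rather than searching the input for `../`, also rejects absolute paths outside `cwd` and inputs such as `./src/../../secret` that only escape once normalized.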
298
+
299
+ ## Use Cases
300
+
301
+ ### Build Tools with Untrusted Input
302
+ ```javascript
303
+ // Secure build tool configuration
304
+ const cache = fileEntryCache.create(
305
+ '.buildcache',
306
+ './cache',
307
+ true, // useCheckSum
308
+ process.cwd()
309
+ );
310
+
311
+ // Enable strict path checking for security
312
+ cache.strictPaths = true;
313
+
314
+ // Process user-provided file paths safely
315
+ function processUserFile(userProvidedPath) {
316
+ try {
317
+ const descriptor = cache.getFileDescriptor(userProvidedPath);
318
+ // Safe to process - file is within boundaries
319
+ return descriptor;
320
+ } catch (error) {
321
+ if (error.message.includes('Path traversal attempt blocked')) {
322
+ console.warn('Security: Blocked access to:', userProvidedPath);
323
+ return null;
324
+ }
325
+ throw error;
326
+ }
327
+ }
328
+ ```
329
+
330
+ ### CI/CD Environments
331
+ ```javascript
332
+ // Strict security for CI/CD pipelines
333
+ const cache = new FileEntryCache({
334
+ cwd: process.env.GITHUB_WORKSPACE || process.cwd(),
335
+ strictPaths: true, // Prevent access outside workspace
336
+ useCheckSum: true // Content-based validation
337
+ });
338
+
339
+ // All file operations are now restricted to the workspace
340
+ cache.getFileDescriptor('./src/app.js'); // ✓ Allowed
341
+ cache.getFileDescriptor('/etc/passwd'); // ✗ Blocked (absolute path outside cwd)
342
+ cache.getFileDescriptor('../../../root'); // ✗ Blocked (path traversal)
343
+ ```
344
+
345
+ ### Dynamic Security Control
346
+ ```javascript
347
+ const cache = new FileEntryCache({ cwd: '/safe/directory' });
348
+
349
+ // Start with relaxed mode for trusted operations
350
+ cache.strictPaths = false;
351
+ processInternalFiles();
352
+
353
+ // Enable strict mode for untrusted input
354
+ cache.strictPaths = true;
355
+ processUserUploadedPaths();
356
+
357
+ // Return to relaxed mode if needed
358
+ cache.strictPaths = false;
359
+ ```
360
+
361
+ ## Default Behavior
362
+
363
+ **As of v11, `strictPaths` is enabled by default** to provide secure defaults. This means:
364
+ - Path traversal attempts using `../` are blocked
365
+ - File access is restricted to within the configured `cwd`
366
+ - Null bytes in paths are automatically sanitized
367
+
368
+ ### Migrating from v10 or Earlier
369
+
370
+ If you're upgrading from v10 or earlier and need to maintain the previous behavior (for example, if your code legitimately accesses parent directories), you can explicitly disable strict paths:
371
+
372
+ ```javascript
373
+ const cache = new FileEntryCache({
374
+ cwd: process.cwd(),
375
+ strictPaths: false // Restore v10 behavior
376
+ });
377
+ ```
378
+
379
+ However, we strongly recommend keeping `strictPaths: true` and adjusting your code to work within the security boundaries, especially when processing any untrusted input.
380
+
171
381
  # Using Checksums to Determine if a File has Changed (useCheckSum)
172
382
 
173
383
  By default the `useCheckSum` is `false`. This means that the `FileEntryCache` will use the `mtime` and `ctime` to determine if the file has changed. If you set `useCheckSum` to `true` it will use a checksum to determine if the file has changed. This is useful when you want to make sure that the file has not changed at all.