npm - read-excel-file - Versions diffs - 4.0.5 → 4.1.0 - Mend

read-excel-file 4.0.5 → 4.1.0


Files changed (42)

package/.gitlab-ci.yml +15 -0
package/CHANGELOG.md +27 -0
package/LICENSE +1 -1
package/README.md +110 -17
package/bundle/index.html +22 -16
package/bundle/read-excel-file.min.js +2 -2
package/bundle/read-excel-file.min.js.map +1 -1
package/commonjs/convertMapToSchema.js +41 -0
package/commonjs/convertMapToSchema.js.map +1 -0
package/commonjs/convertMapToSchema.test.js.map +1 -0
package/commonjs/convertToJson.js +19 -9
package/commonjs/convertToJson.js.map +1 -1
package/commonjs/convertToJson.test.js.map +1 -1
package/commonjs/readXlsx.js +6 -3
package/commonjs/readXlsx.js.map +1 -1
package/commonjs/readXlsxFileContents.js +17 -4
package/commonjs/readXlsxFileContents.js.map +1 -1
package/commonjs/readXlsxFileNode.test.js.map +1 -1
package/index.d.ts.test +20 -0
package/modules/convertMapToSchema.js +34 -0
package/modules/convertMapToSchema.js.map +1 -0
package/modules/convertMapToSchema.test.js.map +1 -0
package/modules/convertToJson.js +19 -9
package/modules/convertToJson.js.map +1 -1
package/modules/convertToJson.test.js.map +1 -1
package/modules/readXlsx.js +6 -3
package/modules/readXlsx.js.map +1 -1
package/modules/readXlsxFileContents.js +14 -4
package/modules/readXlsxFileContents.js.map +1 -1
package/modules/readXlsxFileNode.test.js.map +1 -1
package/node/index.commonjs.js +6 -0
package/node/index.d.ts.test +23 -0
package/node/index.js +5 -0
package/node/package.json +9 -0
package/package.json +7 -6
package/schema/index.commonjs.js +2 -0
package/schema/index.d.ts.test +6 -0
package/schema/index.js +1 -0
package/schema/package.json +9 -0
package/types.d.ts +80 -0
package/website/index.html +105 -0
package/node.js +0 -6

package/.gitlab-ci.yml ADDED

@@ -0,0 +1,15 @@
+image: node:10
+pages:
+  script:
+  - npm install
+  - npm run build
+  - mv ./bundle ./public
+  - cp --recursive ./website/* ./public/
+  artifacts:
+    paths:
+    - public
+  only:
+  - master

package/CHANGELOG.md CHANGED

@@ -1,3 +1,30 @@
+<!--
+5.0.0 / 30.08.2020
+==================
+  * Added [TypeScript](https://github.com/catamphetamine/read-excel-file/issues/71) definitions.
+  * Removed deprecated `URL`, `Integer` and `Email` exports (use the string variants instead: `"URL"`, `"Integer"`, `"Email"`).
+  * Removed undocumented `convertToJson()` export.
+-->
+4.1.0 / 09.11.2020
+==================
+* Renamed schema entry `parse()` function: now it's called `type`. This way, `type` could be both a built-in type and a custom type.
+* Changed the built-in `"Integer"`, `"URL"` and `"Email"` types: now they're exported functions again instead of strings. Strings still work.
+* Added `map` parameter: similar to `schema` but doesn't perform any parsing or validation. Can be used to map an Excel file to an array of objects that could be parsed/validated using [`yup`](https://github.com/jquense/yup).
+* `type` of a schema entry is no longer required: if no `type` is specified, then the cell value is returned "as is" (string, or number, or boolean, or `Date`).
+4.0.8 / 08.11.2020
+==================
+* Updated `JSZip` to the latest version. The [issue](https://gitlab.com/catamphetamine/read-excel-file/-/issues/8). The [original issue](https://github.com/catamphetamine/read-excel-file/issues/54).
 4.0.0 / 25.05.2019
 ==================
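
Note on the 4.1.0 entry above: the changelog says a schema entry's `parse()` function is now called `type`. A minimal sketch of how an existing 4.0.x schema object could be upgraded follows. `migrateSchema` is a hypothetical helper written for this note and is not part of `read-excel-file`; it assumes that a nested schema is expressed as an object-valued `type`, as in the README's `'COURSE'` entry.

```js
// Hypothetical helper (not part of read-excel-file): returns a copy of a
// 4.0.x-style schema in which custom `parse()` functions are moved to `type`.
function migrateSchema(schema) {
  const migrated = {}
  for (const [column, entry] of Object.entries(schema)) {
    const { parse, ...rest } = entry
    if (typeof parse === 'function' && rest.type === undefined) {
      // The custom parser now lives under `type`.
      rest.type = parse
    } else if (parse !== undefined) {
      // `type` was already set (or `parse` is not a function): leave it alone.
      rest.parse = parse
    }
    if (rest.type !== null && typeof rest.type === 'object' && !Array.isArray(rest.type)) {
      // An object-valued `type` is treated as a nested schema and migrated too.
      rest.type = migrateSchema(rest.type)
    }
    migrated[column] = rest
  }
  return migrated
}
```

Built-in types such as `Number` or `String` are functions, so they pass through unchanged.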

package/LICENSE CHANGED

@@ -1,6 +1,6 @@
 MIT License
-Copyright (c) 2018 github.com/catamphetamine
+Copyright (c) 2018 gitlab.com/catamphetamine
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

package/README.md CHANGED

@@ -1,8 +1,16 @@
 # `read-excel-file`
-Read `*.xlsx` files in a browser or Node.js. Parse to JSON with a strict schema.
+Read small to medium `*.xlsx` files in a browser or Node.js. Parse to JSON with a strict schema.
-[Demo](https://catamphetamine.github.io/read-excel-file/)
+[Demo](https://catamphetamine.gitlab.io/read-excel-file/)
+## Restrictions
+There have been some [complaints](https://github.com/catamphetamine/read-excel-file/issues/38#issuecomment-544286628) about this library not being able to handle large `*.xlsx` spreadsheets. It's true that this library's main focus has been usability and convenience, and not performance or the ability to handle huge datasets. For example, parsing a file with 2000 rows and 20 columns takes about 3 seconds, and parsing a file with 30k+ rows may throw a `RangeError: Maximum call stack size exceeded`. So, for handling huge datasets, use something like the [`xlsx`](https://github.com/catamphetamine/read-excel-file/issues/38#issuecomment-544286628) package instead. This library is suitable for handling small to medium `*.xlsx` files.
+## GitHub
+On March 9th, 2020, GitHub, Inc. silently [banned](https://medium.com/@catamphetamine/how-github-blocked-me-and-all-my-libraries-c32c61f061d3) my account (and all my libraries) without any notice. I opened a support ticket but they didn't answer. Because of that, I had to move all my libraries to [GitLab](https://gitlab.com/catamphetamine).
 ## Install
@@ -10,6 +18,8 @@ Read `*.xlsx` files in a browser or Node.js. Parse to JSON with a strict schema.
 npm install read-excel-file --save
 ```
+If you're not using a bundler then use a [standalone version from a CDN](#cdn).
 ## Browser
 ```html
@@ -48,7 +58,11 @@ readXlsxFile(fs.createReadStream('/path/to/file')).then((rows) => {
 ## Dates
-XLSX format has no dedicated "date" type so dates are stored internally as simply numbers along with a "format" (e.g. `"MM/DD/YY"`). When using `readXlsx()` with `schema` parameter all dates get parsed correctly in any case. But if using `readXlsx()` without `schema` parameter (to get "raw" data) then this library attempts to guess whether a cell value is a date or not by examining the cell "format" (e.g. `"MM/DD/YY"`), so in most cases dates are detected and parsed automatically. For exotic cases one can pass an explicit `dateFormat` parameter (e.g. `"MM/DD/YY"`) to instruct the library to parse numbers with such "format" as dates.
+XLSX format has no dedicated "date" type so dates are stored internally as simply numbers along with a "format" (e.g. `"MM/DD/YY"`). When using `readXlsx()` with `schema` parameter all dates get parsed correctly in any case. But if using `readXlsx()` without `schema` parameter (to get "raw" data) then this library attempts to guess whether a cell value is a date or not by examining the cell "format" (e.g. `"MM/DD/YY"`), so in most cases dates are detected and parsed automatically. For exotic cases one can pass an explicit `dateFormat` parameter (e.g. `"MM/DD/YY"`) to instruct the library to parse numbers with such "format" as dates:
+```js
+readXlsxFile(file, { dateFormat: 'MM/DD/YY' })
+```
 ## JSON
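
Note on the "Dates" hunk above: the README says dates are stored as plain numbers plus a "format". The sketch below illustrates what such a number means under the commonly documented 1900 date system, where serial `25569` corresponds to 1970-01-01. `excelSerialToDate` is a hypothetical name used only here; the package's own `parseExcelDate` is not shown in this diff and may behave differently. The 1904 date system and serials below 61 (affected by the 1900 leap-year quirk) are deliberately ignored.

```js
// Assumption: 1900 date system, in which serial 25569 is 1970-01-01 (UTC).
const EXCEL_SERIAL_OF_UNIX_EPOCH = 25569
const MS_PER_DAY = 24 * 60 * 60 * 1000

// Hypothetical helper: converts a spreadsheet serial number to a JS Date (UTC).
// The fractional part of the serial encodes the time of day.
function excelSerialToDate(serial) {
  return new Date(Math.round((serial - EXCEL_SERIAL_OF_UNIX_EPOCH) * MS_PER_DAY))
}

console.log(excelSerialToDate(43183).toISOString())
// prints "2018-03-24T00:00:00.000Z"
```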
@@ -56,11 +70,11 @@ To convert rows to JSON pass `schema` option to `readXlsxFile()`. It will return
 ```js
 // An example *.xlsx document:
-// -----------------------------------------------------------------------------
-// | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE |    CONTACT     |
-// -----------------------------------------------------------------------------
-// | 03/24/2018 |         123        |   true  |  Chemistry   | (123) 456-7890 |
-// -----------------------------------------------------------------------------
+// -----------------------------------------------------------------------------------------
+// | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE |    CONTACT     |  STATUS   |
+// -----------------------------------------------------------------------------------------
+// | 03/24/2018 |         123        |   true  |  Chemistry   | (123) 456-7890 | SCHEDULED |
+// -----------------------------------------------------------------------------------------
 const schema = {
   'START DATE': {
@@ -75,6 +89,8 @@ const schema = {
     type: Number,
     required: true
   },
+  // 'COURSE' is not a real Excel file column name,
+  // it can be any string — it's just for code readability.
   'COURSE': {
     prop: 'course',
     type: {
@@ -94,13 +110,22 @@ const schema = {
   'CONTACT': {
     prop: 'contact',
     required: true,
-    parse(value) {
+    type: (value) => {
       const number = parsePhoneNumber(value)
       if (!number) {
         throw new Error('invalid')
       }
       return number
     }
+  },
+  'STATUS': {
+    prop: 'status',
+    type: String,
+    oneOf: [
+      'SCHEDULED',
+      'STARTED',
+      'FINISHED'
+    ]
   }
 }
@@ -116,30 +141,74 @@ readXlsxFile(file, { schema }).then(({ rows, errors }) => {
       title: 'Chemistry'
     },
     contact: '+11234567890',
+    status: 'SCHEDULED'
   }]
 })
 ```
+If no `type` is specified then the cell value is returned "as is".
 There are also some additional exported `type`s:
-* `"Integer"` for parsing integer `Number`s.
-* `"URL"` for parsing URLs.
-* `"Email"` for parsing email addresses.
+* `Integer` for parsing integer `Number`s.
+* `URL` for parsing URLs.
+* `Email` for parsing email addresses.
+A schema entry for a column may also define an optional `validate(value)` function for validating the parsed value: in that case, it must `throw` an `Error` if the `value` is invalid.
+#### Map
+Sometimes, a developer might want to use some other (more advanced) solution for schema parsing and validation (like [`yup`](https://github.com/jquense/yup)). If a developer passes a `map` instead of a `schema` to `readXlsxFile()`, then it would just map each data row to a JSON object without doing any parsing or validation.
+```js
+// An example *.xlsx document:
+// -----------------------------------------------------------------------------------------
+// | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE |    CONTACT     |  STATUS   |
+// -----------------------------------------------------------------------------------------
+// | 03/24/2018 |         123        |   true  |  Chemistry   | (123) 456-7890 | SCHEDULED |
+// -----------------------------------------------------------------------------------------
+const map = {
+  'START DATE': 'date',
+  'NUMBER OF STUDENTS': 'numberOfStudents',
+  'COURSE': {
+    'course': {
+      'IS FREE': 'isFree',
+      'COURSE TITLE': 'title'
+    }
+  },
+  'CONTACT': 'contact',
+  'STATUS': 'status'
+}
+readXlsxFile(file, { map }).then(({ rows }) => {
+  rows === [{
+    date: new Date(2018, 2, 24),
+    numberOfStudents: 123,
+    course: {
+      isFree: true,
+      title: 'Chemistry'
+    },
+    contact: '(123) 456-7890',
+    status: 'SCHEDULED'
+  }]
+})
+```
-A schema entry for a column can also have a `validate(value)` function for validating the parsed value. It must `throw` an `Error` if the value is invalid.
+#### Displaying schema errors
-A React component for displaying error info could look like this:
+A React component for displaying schema parsing/validation errors could look like this:
 ```js
 import { parseExcelDate } from 'read-excel-file'
 function ParseExcelError({ children: error }) {
-  // Human-readable value.
+  // Get a human-readable value.
   let value = error.value
   if (error.type === Date) {
     value = parseExcelDate(value).toString()
   }
-  // Error summary.
+  // Render error summary.
   return (
     <div>
       <code>"{error.error}"</code>
@@ -156,6 +225,8 @@ function ParseExcelError({ children: error }) {
 }
 ```
+#### Transforming rows/columns before schema is applied
 When using a `schema` there's also an optional `transformData(data)` parameter which can be used for the cases when the spreadsheet rows/columns aren't in the correct format. For example, the heading row may be missing, or there may be some purely presentational or empty rows. Example:
 ```js
@@ -163,13 +234,17 @@ readXlsxFile(file, {
   schema,
   transformData(data) {
     // Adds header row to the data.
-    return ['ID', 'NAME', ...].concat(data)
+    return [['ID', 'NAME', ...]].concat(data)
     // Removes empty rows.
     return data.filter(row => row.filter(column => column !== null).length > 0)
   }
 })
 ```
+## TypeScript
+See [testing `index.d.ts`](https://github.com/catamphetamine/read-excel-file/issues/71#issuecomment-675140448).
 ## Browser compatibility
 Node.js `*.xlsx` parser uses `xpath` and `xmldom` packages for XML parsing. The same packages could be used in a browser because [all modern browsers](https://caniuse.com/#search=domparser) (except IE 11) have native `DOMParser` built-in which could be used instead (meaning smaller footprint and better performance), but since Internet Explorer 11 support is still required, the browser version doesn't use the native `DOMParser` and instead uses `xpath` and `xmldom` packages for XML parsing just like the Node.js version.
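
Note on the `transformData(data)` example earlier in this README diff: it shows two alternative `return` statements in one function body, so only the first would ever run. A minimal sketch that applies both steps is given below. `HEADER_ROW` and its column titles are placeholders invented for this note, since the README elides the real list.

```js
// Placeholder header row (assumption): substitute the real column titles.
const HEADER_ROW = ['ID', 'NAME']

// Drops rows whose cells are all `null`, then prepends the header row.
function transformData(data) {
  const nonEmptyRows = data.filter((row) => row.some((cell) => cell !== null))
  return [HEADER_ROW].concat(nonEmptyRows)
}
```

It would be passed the same way as in the README: `readXlsxFile(file, { schema, transformData })`.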
@@ -204,6 +279,24 @@ readXlsxFile(file, { getSheets: true }).then((sheets) => {
 })
 ```
+## CDN
+One can use any npm CDN service, e.g. [unpkg.com](https://unpkg.com) or [jsdelivr.net](https://jsdelivr.net).
+```html
+<script src="https://unpkg.com/read-excel-file@4.x/bundle/read-excel-file.min.js"></script>
+<script>
+  var input = document.getElementById('input')
+  input.addEventListener('change', function() {
+    readXlsxFile(input.files[0]).then(function(rows) {
+      // `rows` is an array of rows
+      // each row being an array of cells.
+    })
+  })
+</script>
+```
 ## References
 For XML parsing [`xmldom`](https://github.com/jindw/xmldom) and [`xpath`](https://github.com/goto100/xpath) are used.
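
Note on the `map` parameter: this release adds `convertMapToSchema.js`, but its body is not shown in this view. The sketch below is only a guess at the idea, assuming a `map` can be turned into a schema whose entries have a `prop` and no `type`, so that cell values are returned "as is". `mapToSchema` is a hypothetical name and not the package's implementation.

```js
// Hypothetical sketch: converts a `map` (column title -> property name, or
// column title -> { propertyName: nestedMap }) into a schema without types.
function mapToSchema(map) {
  const schema = {}
  for (const [column, target] of Object.entries(map)) {
    if (typeof target === 'string') {
      // Plain mapping: the column value is copied to `target` unchanged.
      schema[column] = { prop: target }
    } else {
      // Nested mapping: the single key is the property name and its value
      // is another map describing the nested object's columns.
      const [prop] = Object.keys(target)
      schema[column] = { prop, type: mapToSchema(target[prop]) }
    }
  }
  return schema
}
```

Applied to the README's example, `'COURSE': { 'course': { 'IS FREE': 'isFree', 'COURSE TITLE': 'title' } }` would become an entry with `prop: 'course'` and a nested schema as its `type`.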

package/bundle/index.html CHANGED

@@ -52,14 +52,14 @@
 	</head>
 	<body>
-		<a id="main-link" href="https://github.com/catamphetamine/read-excel-file">
+		<a id="main-link" href="https://gitlab.com/catamphetamine/read-excel-file">
 			read-excel-file
 		</a>
 		<input type="file" id="input" />
 		<div style="font-size: 12px">
-			* Parsing to JSON with a strict schema is supported. <a target="_blank" href="https://github.com/catamphetamine/read-excel-file#json" style="color: #0093C4; text-decoration: none">Read more</a>.
+			* Parsing to JSON with a strict schema is supported. <a target="_blank" href="https://gitlab.com/catamphetamine/read-excel-file#json" style="color: #0093C4; text-decoration: none">Read more</a>.
 		</div>
 		<div id="result-table"></div>
@@ -75,20 +75,26 @@
 			    // each row being an array of cells.
 			    document.getElementById('result').innerText = JSON.stringify(data, null, 2)
-			    document.getElementById('result-table').innerHTML =
-			    	'<table>' +
-			    	'<tbody>' +
-			    	data.map(function (row) {
-			    		return '<tr>' +
-			    			row.map(function (cell) {
-			    				return '<td>' +
-				    				(cell === null ? '' : cell) +
-				    				'</td>'
-			    			}).join('') +
-			    			'</tr>'
-			    	}).join('') +
-			    	'</tbody>' +
-			    	'</table>'
+			    // Applying `innerHTML` hangs the browser when there're a lot of rows/columns.
+			    // For example, for a file having 2000 rows and 20 columns on a modern
+			    // mid-tier CPU it parses the file (using a "schema") for 3 seconds
+			    // (blocking) with 100% single CPU core usage.
+			    // Then applying `innerHTML` hangs the browser.
+			    // document.getElementById('result-table').innerHTML =
+			    // 	'<table>' +
+			    // 	'<tbody>' +
+			    // 	data.map(function (row) {
+			    // 		return '<tr>' +
+			    // 			row.map(function (cell) {
+			    // 				return '<td>' +
+				   //  				(cell === null ? '' : cell) +
+				   //  				'</td>'
+			    // 			}).join('') +
+			    // 			'</tr>'
+			    // 	}).join('') +
+			    // 	'</tbody>' +
+			    // 	'</table>'
 			  }, function (error) {
 			  	console.error(error)
 			  	alert("Error while parsing Excel file. See console output for the error stack trace.")