ilib-lint 2.2.1 → 2.4.0

@@ -1,6 +1,20 @@
1
1
  Release Notes
2
2
  =============
3
3
 
4
+ ### v2.4.0
5
+
6
+ - added the snake case match rule. If source strings contain only snake case and no whitespace, then the targets must be
7
+ the same. It is treated as Do Not Translate. If the target is different from the source, it is an error.
8
+
9
+ ### v2.3.0
10
+
11
+ - implemented the XML match rule. If there are XML tags and entities in the
12
+ source, then the translations must match. The order of XML tags can change,
13
+ as the grammar of other languages might require that, but the number and
14
+ type of XML tags must match or an error will be recorded.
15
+ - this rule will also record an error if the XML in the source is
16
+ well-formed, but the XML in the translation is not
17
+
4
18
  ### v2.2.1
5
19
 
6
20
  - fixed the output from the LintableFile class so that if there is only one
@@ -10,17 +24,20 @@ Release Notes
10
24
  - updated dependencies
11
25
 
12
26
  ### v2.2.0
27
+
13
28
  - added --no-return-value command-line flag to have the linter always return 0, even
14
29
  when there are errors and warnings. This still reports the results to the output.
15
30
  The intention is that the linter can be used to report results without causing
16
31
  build pipelines to fail.
17
32
 
18
33
  ### v2.1.1
34
+
19
35
  - check to make sure that every result in the results array returned by the plugins
20
36
  is not undefined so that we do not run into the problem of dereferencing undefined
21
37
  results later on, which caused some exceptions
22
38
 
23
39
  ### v2.1.0
40
+
24
41
  - fixed a bug where the quote style checker was not converting the highlight quotes properly
25
42
  - added an option `output` to write the output to a file.
26
43
  - added an option `name` to give the project name. It is useful when the config file is shared in multiple projects.
@@ -31,6 +48,7 @@ Release Notes
31
48
  longer accepted.
32
49
 
33
50
  ### v2.0.1
51
+
34
52
  - fixed loading of plugins
35
53
  - if a plugin `ilib-lint-x` exists and a different package `x`
36
54
  also exists that is unrelated to ilib-lint, and the config
@@ -127,7 +145,7 @@ Release Notes
127
145
  substituted into a replacement parameter in the source English text. Nouns
128
146
  and the articles "a", "an", and "the" are not translatable to all languages
129
147
  because of gender and plurality agreement rules.
130
- - converted all unit tests from nodeunit to jest
148
+ - converted all unit tests from nodeunit to jest
131
149
  - updated dependencies
132
150
 
133
151
  ### v1.10.0
@@ -233,7 +251,7 @@ Release Notes
233
251
  - added rule to warn against half-width kana characters
234
252
  - added rule to warn against double-byte whitespace characters
235
253
  - added rule to warn of whitespace adjacent to certain fullwidth punctuation characters
236
- - added rule to warn of a space between double-byte and single-byte character
254
+ - added rule to warn of a space between double-byte and single-byte character
237
255
  - added rule to check whether or not there is a translation for each source string in
238
256
  a resource
239
257
  - removed ability for the ICU plural rule to report results on the
@@ -0,0 +1,68 @@
1
+ # resource-snake-case
2
+
3
+ If the source string contains only snake case and no whitespace, then the target must be the same.
4
+
5
+ The source string is treated as 'Do Not Translate' because snake-cased strings are generally not meant to be translated.
6
+ Instead, they are commonly used in software as identifiers, variable names, or control strings.
7
+
8
+ ## Rule explanation
9
+ Snake case is a way of writing phrases without spaces, where spaces are replaced with underscores (`_`), and the words are typically all lowercase.
10
+
11
+ In this context, any string that conforms to the following rules is considered snake case and should not be translated:
12
+ * Words in mixed case and/or digits separated by underscores (optionally with leading and trailing whitespace), e.g.:
13
+ * snake_case,
14
+ * SOME_SCREAMING_SNAKE_CASE,
15
+ * camel_Snake_Case,
16
+ * mixed_CASE,
17
+ * RandomLY_MixED_case,
18
+ * even_MORE_RandomLY_MixED_case_with_numbers123_456,
19
+ * any_case_with_numbers123_456,
20
+ * any_Case_with_Trailing_And_Leading_WHITESPACE ,
21
+
22
+
23
+ * Single word with leading underscore, e.g.:
24
+ * _test,
25
+
26
+
27
+ * Digits with leading underscore, e.g.:
28
+ * _123,
29
+ * _123_456
30
+
31
+
32
+ * Any case with a leading underscore and/or number, e.g.:
33
+ * _test_and_retest,
34
+ * _test_And_REtest,
35
+ * _test_ANd_RETEST,
36
+ * _test_and_RETEST_and123_456,
37
+
38
+
39
+ ## Examples
40
+ ### Correct
41
+ Correctly matched snake case variations in a Spanish (es-ES) translation, where both source and target are the same:
42
+
43
+ 1. snake_case
44
+ - source: `access_granted`
45
+ - target: `access_granted`
46
+
47
+ 2. SCREAMING_SNAKE_CASE
48
+ - source: `ACCESS_GRANTED`
49
+ - target: `ACCESS_GRANTED`
50
+
51
+ 3. camel_Snake_Case
52
+ - source: `access_Granted`
53
+ - target: `access_Granted`
54
+
55
+ 4. mixed_CASE with digits
56
+ - source: `acceSS_GRantEd123_456`
57
+ - target: `acceSS_GRantEd123_456`
58
+
59
+ ### Incorrect
60
+ Incorrectly matched snake case in a Spanish translation:
61
+
62
+ 1. snake_case
63
+ - source: `access_granted`
64
+ - target: `acceso_concedido`
65
+
66
+
67
+ Problems in the above incorrect translation:
68
+ The "access_granted" snake-cased string was translated when it should have been treated as "Do Not Translate".
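The two regular expressions that this release declares for `resource-snake-case` can be exercised directly against the examples in the doc above. The following is a minimal sketch rather than the rule's actual implementation: `isSnakeCase` and `checkSnakeCase` are hypothetical helper names, and the strict-equality comparison between source and target is an assumption about how the matcher behaves.

```javascript
// The two patterns declared for the resource-snake-case rule in this diff.
const snakeCasePatterns = [
    /^\s*[a-zA-Z0-9]*(_[a-zA-Z0-9]+)+\s*$/,
    /^\s*[a-zA-Z0-9]+(_[a-zA-Z0-9]+)*_\s*$/
];

// Hypothetical helper: true if the whole string matches either pattern.
function isSnakeCase(str) {
    return snakeCasePatterns.some((re) => re.test(str));
}

// Hypothetical helper. Assumption: a snake-cased source must be copied
// to the target unchanged (strict equality), otherwise it is an error.
function checkSnakeCase(source, target) {
    if (!isSnakeCase(source)) return undefined; // rule does not apply
    return source === target ? undefined : `Do not translate "${source}"`;
}

console.log(isSnakeCase("access_granted"));   // true
console.log(isSnakeCase("Access granted"));   // false
console.log(checkSnakeCase("access_granted", "acceso_concedido"));
```

Note that a plain word with no underscore, such as `hello`, matches neither pattern, so ordinary single-word strings remain translatable.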
@@ -0,0 +1,58 @@
1
+ # resource-xml
2
+
3
+ If the source string contains XML-like tags, then the translation must contain
4
+ the same tags. The tags themselves may be reordered or nested differently than
5
+ in the source, but:
6
+
7
+ - they should include the same number of tags
8
+ - the tags should have the same name as ones in the source
9
+ - the XML must be well-formed. That is, tags are nested properly and every
10
+ open tag has a corresponding closing tag
11
+ - unnamed tags such as `<>` and `</>` are not allowed
12
+
13
+ Self closing tags such as `<p/>` are allowed.
14
+
15
+ Example of correctly matched tags in a German translation:
16
+
17
+ - source: `You must <b>wait</b> for the <a href="url">job</a>.`
18
+ - target: `Sie müssen auf den <a href="url">Job</a> <b>warten</b>.`
19
+
20
+ Example of incorrectly matched tags in a German translation:
21
+
22
+ - source: `You must <b>wait</b> for the <a href="url">job</a>.`
23
+ - target: `Sie <b>müssen</c> auf den <a href="url">Job</a> <c>warten</c>.`
24
+
25
+ Problems in the above translation:
26
+
27
+ 1. The `<b>` tag has a closing `</c>` tag, making it not well-formed
28
+ 2. The number of tags is different than the source
29
+ 3. The names of tags are different than the source
30
+
31
+ ## Exceptions for HTML Tags
32
+
33
+ HTML4 tags that are commonly written without a closing tag are allowed.
34
+ The code first checks if the tags are well-formed already. If not, then it
35
+ treats these HTML tags as if they were a self-closing tag without having
36
+ the trailing slash inside the angle brackets.
37
+
38
+ Example: `<p>` (start paragraph) is treated as if it were `<p/>`
39
+
40
+ Here is the list of HTML4 tags that are treated as if they were self-closing
41
+ if they are not well-formed:
42
+
43
+ - `<area>`
44
+ - `<base>`
45
+ - `<bdi>`
46
+ - `<bdo>`
47
+ - `<br>`
48
+ - `<embed>`
49
+ - `<hr>`
50
+ - `<img>`
51
+ - `<input>`
52
+ - `<li>`
53
+ - `<link>`
54
+ - `<option>`
55
+ - `<p>`
56
+ - `<param>`
57
+ - `<source>`
58
+ - `<track>`
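The HTML exception described in this doc can be sketched as a small string transformation. The version below is an illustration only: it hard-codes the tag list from the doc instead of importing it from `ilib-tools-common`, and the name `convertUnclosedTags` simply mirrors the method added in `ResourceXML.js` further down in this diff.

```javascript
// Tag names copied from the list in the doc above.
const htmlTags = [
    "area", "base", "bdi", "bdo", "br", "embed", "hr", "img",
    "input", "li", "link", "option", "p", "param", "source", "track"
];
const openTagRe = new RegExp(`<(${htmlTags.join("|")})>`, "g");
const endTagRe = new RegExp(`</(${htmlTags.join("|")})>`);

// Rewrite bare opening tags such as <br> into <br/>, but only when the
// string contains no closing tag for any of the listed names.
function convertUnclosedTags(str) {
    return endTagRe.test(str) ? str : str.replace(openTagRe, "<$1/>");
}

console.log(convertUnclosedTags("Line one<br>Line two")); // Line one<br/>Line two
console.log(convertUnclosedTags("<p>Hello</p>"));         // <p>Hello</p>
```

Two limitations follow from this approach: tags that carry attributes (for example `<img src="x">`) are not matched by the pattern, and a single closing tag anywhere in the string disables the conversion for every listed tag.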
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "ilib-lint",
3
- "version": "2.2.1",
3
+ "version": "2.4.0",
4
4
  "module": "./src/index.js",
5
5
  "type": "module",
6
6
  "bin": "./src/index.js",
@@ -61,16 +61,16 @@
61
61
  },
62
62
  "devDependencies": {
63
63
  "@tsconfig/node14": "^14.1.2",
64
- "@types/node": "^20.14.10",
64
+ "@types/node": "^14.0.0",
65
65
  "docdash": "^2.0.2",
66
- "ilib-lint-plugin-test": "file:test/ilib-lint-plugin-test",
67
- "ilib-lint-plugin-obsolete": "file:test/ilib-lint-plugin-obsolete",
68
66
  "i18nlint-plugin-test-old": "file:test/i18nlint-plugin-test-old",
67
+ "ilib-lint-plugin-obsolete": "file:test/ilib-lint-plugin-obsolete",
68
+ "ilib-lint-plugin-test": "file:test/ilib-lint-plugin-test",
69
69
  "jest": "^29.7.0",
70
70
  "jsdoc": "^4.0.3",
71
- "jsdoc-to-markdown": "^8.0.1",
71
+ "jsdoc-to-markdown": "^8.0.3",
72
72
  "npm-run-all": "^4.1.5",
73
- "typescript": "^5.5.3"
73
+ "typescript": "^5.5.4"
74
74
  },
75
75
  "dependencies": {
76
76
  "@formatjs/intl": "^2.10.4",
@@ -78,11 +78,12 @@
78
78
  "ilib-lint-common": "^3.0.0",
79
79
  "ilib-locale": "^1.2.2",
80
80
  "ilib-localeinfo": "^1.1.0",
81
- "ilib-tools-common": "^1.10.0",
81
+ "ilib-tools-common": "^1.11.0",
82
82
  "intl-messageformat": "^10.5",
83
83
  "json5": "^2.2.3",
84
84
  "log4js": "^6.9.1",
85
85
  "micromatch": "^4.0.7",
86
- "options-parser": "^0.4.0"
86
+ "options-parser": "^0.4.0",
87
+ "xml-js": "^1.6.11"
87
88
  }
88
89
  }
package/src/Project.js CHANGED
@@ -38,7 +38,8 @@ const rulesetDefinitions = {
38
38
  "resource-quote-style": "localeOnly",
39
39
  "resource-unique-keys": true,
40
40
  "resource-url-match": true,
41
- "resource-named-params": true
41
+ "resource-named-params": true,
42
+ "resource-snake-case": true,
42
43
  }
43
44
  };
44
45
 
@@ -61,7 +61,8 @@ class AnsiConsoleFormatter extends Formatter {
61
61
  `;
62
62
 
63
63
  // output ascii terminal escape sequences
64
- output = output.replace(/<e\d><\/e\d>/g, "\u001B[91m \u001B[0m");
64
+ output = output.replace(/<e\d><\/e\d>/g, "\u001B[91m␣\u001B[0m");
65
+ output = output.replace(/<e\d\/>/g, "\u001B[91m␣\u001B[0m");
65
66
  output = output.replace(/<e\d>/g, "\u001B[91m");
66
67
  output = output.replace(/<\/e\d>/g, "\u001B[0m");
67
68
  if (typeof(result.rule.getLink) === 'function' && result.rule.getLink()) {
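The new `<e\d\/>` replacement lets a rule mark a position with a self-closing highlight tag, which the XML rule uses as `<e0/>` at the end of the target. A minimal sketch of the four replacements applied in order, using a hypothetical helper name `highlightToAnsi`:

```javascript
// Hypothetical helper mirroring the replacements in the hunk above.
function highlightToAnsi(text) {
    const RED = "\u001B[91m";   // bright red foreground
    const RESET = "\u001B[0m";  // reset attributes
    return text
        .replace(/<e\d><\/e\d>/g, `${RED}␣${RESET}`) // empty highlight pair
        .replace(/<e\d\/>/g, `${RED}␣${RESET}`)      // self-closing highlight
        .replace(/<e\d>/g, RED)                      // start of highlight
        .replace(/<\/e\d>/g, RESET);                 // end of highlight
}

console.log(JSON.stringify(highlightToAnsi("Target: abc<e0/>")));
```

The empty and self-closing forms have to be replaced first; otherwise the generic open and close replacements would consume them and print an invisible zero-width highlight.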
@@ -36,6 +36,7 @@ import ResourceSourceICUPluralSyntax from '../rules/ResourceSourceICUPluralSynta
36
36
  import ResourceSourceICUPluralParams from '../rules/ResourceSourceICUPluralParams.js';
37
37
  import ResourceSourceICUPluralCategories from '../rules/ResourceSourceICUPluralCategories.js';
38
38
  import ResourceSourceICUUnexplainedParams from '../rules/ResourceSourceICUUnexplainedParams.js';
39
+ import ResourceXML from '../rules/ResourceXML.js';
39
40
 
40
41
  // built-in declarative rules
41
42
  export const regexRules = [
@@ -226,7 +227,19 @@ export const regexRules = [
226
227
  ],
227
228
  link: "https://github.com/ilib-js/ilib-lint/blob/main/docs/source-no-manual-date-formatting.md",
228
229
  severity: "error"
229
- }
230
+ },
231
+ {
232
+ type: "resource-matcher",
233
+ name: "resource-snake-case",
234
+ description: "Ensure that when source strings contain only snake case (words and/or numbers separated by underscores) and no whitespace, then the targets are the same",
235
+ note: "Do not translate the source string if it consists solely of snake-cased strings and/or digits. Please update the target string so it matches the source string.",
236
+ regexps: [
237
+ "^\\s*[a-zA-Z0-9]*(_[a-zA-Z0-9]+)+\\s*$",
238
+ "^\\s*[a-zA-Z0-9]+(_[a-zA-Z0-9]+)*_\\s*$"
239
+ ],
240
+ link: "https://github.com/ilib-js/ilib-lint/blob/main/docs/resource-snake-case.md",
241
+ severity: "error"
242
+ },
230
243
  ];
231
244
 
232
245
  // built-in ruleset that contains all the built-in rules
@@ -241,6 +254,7 @@ export const builtInRulesets = {
241
254
  "resource-completeness": true,
242
255
  "resource-no-translation": true,
243
256
  "resource-icu-plurals-translated": true,
257
+ "resource-xml": true,
244
258
 
245
259
  // declarative rules from above
246
260
  "resource-url-match": true,
@@ -253,6 +267,7 @@ export const builtInRulesets = {
253
267
  "resource-no-halfwidth-kana-characters": true,
254
268
  "resource-no-double-byte-space": true,
255
269
  "resource-no-space-with-fullwidth-punctuation": true,
270
+ "resource-snake-case": true,
256
271
  },
257
272
 
258
273
  source: {
@@ -310,6 +325,7 @@ class BuiltinPlugin extends Plugin {
310
325
  ResourceSourceICUPluralParams,
311
326
  ResourceSourceICUPluralCategories,
312
327
  ResourceSourceICUUnexplainedParams,
328
+ ResourceXML,
313
329
  ...regexRules
314
330
  ];
315
331
  }
@@ -145,7 +145,7 @@ class DeclarativeResourceRule extends ResourceRule {
145
145
  let results = [];
146
146
  // only need 1 regexp to match in order to trigger this rule
147
147
  for (const re of this.re) {
148
- results = results.concat(this.checkString({re, source, target, file, resource}));
148
+ results = results.concat(this.checkString({re, source, target, file, resource}) ?? []);
149
149
  if (results.length > 0) break;
150
150
  }
151
151
  results = results.filter(result => result);
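The diff does not state why the `?? []` guard was added, but one plausible reading is that `Array.prototype.concat(undefined)` appends an `undefined` element, which makes `results.length > 0` true and ends the loop before later regular expressions are tried. The sketch below demonstrates that effect with hypothetical stand-ins (`checkStringStub`, `collect`); it is not the rule's real code.

```javascript
// Hypothetical stand-in for checkString(): returns undefined when the
// regular expression finds nothing to report.
function checkStringStub(found) {
    return found ? ["problem"] : undefined;
}

// Run the loop with or without the `?? []` guard and report how many
// checks were tried before the loop ended.
function collect(checks, guard) {
    let results = [];
    let tried = 0;
    for (const found of checks) {
        tried++;
        const r = checkStringStub(found);
        results = results.concat(guard ? (r ?? []) : r);
        if (results.length > 0) break;
    }
    return { tried, results: results.filter((x) => x) };
}

console.log(collect([false, true], false)); // stops after 1 check, no results
console.log(collect([false, true], true));  // tries both, finds "problem"
```

Without the guard, the problem that the second pattern would have reported is silently lost once the trailing `filter` removes the `undefined` entry.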
@@ -0,0 +1,244 @@
1
+ /*
2
+ * ResourceXML.js - rule to check that XML in the translations matches
3
+ * the XML in the source
4
+ *
5
+ * Copyright © 2024 JEDLSoft
6
+ *
7
+ * Licensed under the Apache License, Version 2.0 (the "License");
8
+ * you may not use this file except in compliance with the License.
9
+ * You may obtain a copy of the License at
10
+ *
11
+ * http://www.apache.org/licenses/LICENSE-2.0
12
+ *
13
+ * Unless required by applicable law or agreed to in writing, software
14
+ * distributed under the License is distributed on an "AS IS" BASIS,
15
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
16
+ *
17
+ * See the License for the specific language governing permissions and
18
+ * limitations under the License.
19
+ */
20
+
21
+ import { Result } from 'ilib-lint-common';
22
+ import { xml2js } from 'xml-js';
23
+ import { selfClosingTags } from 'ilib-tools-common';
24
+ import ResourceRule from './ResourceRule.js';
25
+
26
+ const htmlTags = Object.keys(selfClosingTags).concat(["p", "li"]);
27
+ const selfClosingRe = new RegExp(`<(${htmlTags.join('|')})>`, "g");
28
+ const endTagRe = new RegExp(`</(${htmlTags.join('|')})>`);
29
+ const unnamedTagRe = /<\/?>/;
30
+
31
+ /**
32
+ * @class Represent an ilib-lint rule.
33
+ */
34
+ class ResourceXML extends ResourceRule {
35
+ /**
36
+ * Make a new rule instance.
37
+ * @constructor
38
+ */
39
+ constructor(options) {
40
+ super(options);
41
+ this.name = "resource-xml";
42
+ this.description = "Ensure that XML in translated resources matches the source";
43
+ this.sourceLocale = (options && options.sourceLocale) || "en-US";
44
+ this.link = "https://github.com/ilib-js/ilib-lint/blob/main/docs/resource-xml.md";
45
+ }
46
+
47
+ /**
48
+ * @private
49
+ * @param {Node} node a node in the AST
50
+ * @param {Object} elements an object that maps each element found to the number of times it
51
+ * has been found
52
+ */
53
+ countElements(node, elements) {
54
+ if (Array.isArray(node)) {
55
+ for (const child of node) {
56
+ this.countElements(child, elements);
57
+ }
58
+ } else {
59
+ if (node.type === "element") {
60
+ if (!elements[node.name]) {
61
+ elements[node.name] = 1;
62
+ } else {
63
+ elements[node.name]++;
64
+ }
65
+ }
66
+ if (node.elements) {
67
+ this.countElements(node.elements, elements);
68
+ }
69
+ }
70
+ }
71
+
72
+ /**
73
+ * @private
74
+ * @param {Node} sourceAst the root of the AST of the source string
75
+ * @param {Node} targetAst the root of the AST of the target string
76
+ * @param {Resource} resource the resource instance where the source
77
+ * and target strings came from
78
+ */
79
+ matchElements(sourceAst, targetAst, resource) {
80
+ // first traverse the source tree looking for elements to count
81
+ let sourceElements = {}, targetElements = {};
82
+ let problems = [];
83
+
84
+ if (sourceAst?.elements?.length > 0) {
85
+ this.countElements(sourceAst?.elements, sourceElements);
86
+ if (targetAst?.elements?.length > 0) {
87
+ this.countElements(targetAst?.elements, targetElements);
88
+ }
89
+
90
+ for (let element in sourceElements) {
91
+ if (!targetElements[element] || sourceElements[element] !== targetElements[element]) {
92
+ let opts = {
93
+ severity: "error",
94
+ rule: this,
95
+ description: `The number of XML <${element}> elements in the target (${targetElements[element] ?? 0}) does not match the number in the source (${sourceElements[element]}).`,
96
+ id: resource.getKey(),
97
+ highlight: `Target: ${resource.getTarget()}<e0/>`,
98
+ pathName: resource.getPath(),
99
+ source: resource.getSource(),
100
+ locale: resource.getTargetLocale()
101
+ };
102
+ problems.push(new Result(opts));
103
+ }
104
+ }
105
+
106
+ for (let element in targetElements) {
107
+ if (!sourceElements[element]) {
108
+ const re = new RegExp(`<(?<tag>\/?${element}\/?)>`, "g");
109
+ const highlight =
110
+ resource.getTarget().replace(re, "<e0><$<tag>></e0>");
111
+ let opts = {
112
+ severity: "error",
113
+ rule: this,
114
+ description: `The XML element <${element}> in the target does not appear in the source.`,
115
+ id: resource.getKey(),
116
+ highlight: `Target: ${highlight}`,
117
+ pathName: resource.getPath(),
118
+ source: resource.getSource(),
119
+ locale: resource.getTargetLocale()
120
+ };
121
+ problems.push(new Result(opts));
122
+ }
123
+ }
124
+ }
125
+
126
+ return problems;
127
+ }
128
+
129
+ /**
130
+ * Sometimes, the xml tags are really html, which has notorious problems
131
+ * with unclosed tags being considered valid, such as the <p> or
132
+ * <br> tags. The xml parser we are using does not recognize html,
133
+ * so we have to convert the unclosed html tags into valid xml before we
134
+ * attempt to parse them. This function does that by making those tags into
135
+ * self-closing tags. <p> becomes <p/>
136
+ *
137
+ * Note that if there is a <p> tag, we have to make sure there is also no
138
+ * </p> in the string as that is valid xml already. We should only convert
139
+ * the <p> tags when there are no </p> tags to go with it.
140
+ *
141
+ * @private
142
+ * @param {string} string the string to convert
143
+ * @returns {string}
144
+ */
145
+ convertUnclosedTags(string) {
146
+ let converted = string;
147
+
148
+ if (!endTagRe.test(string)) {
149
+ converted = string.replace(selfClosingRe, "<$1/>");
150
+ }
151
+ return converted;
152
+ }
153
+
154
+ /**
155
+ * @override
156
+ */
157
+ matchString({source, target, resource}) {
158
+ if (!target) return; // can't check "nothing" !
159
+ let srcObj, tgtObj;
160
+ let problems = [];
161
+ const prefix = '<?xml version="1.0" encoding="UTF-8"?><root>';
162
+ const suffix = '</root>';
163
+
164
+ // convert html tags to valid xml tags and wrap the strings with a prefix
165
+ // and suffix so that it forms a whole xml document before we attempt to
166
+ // call the parser on them
167
+ const wrappedSource = `${prefix}${this.convertUnclosedTags(source)}${suffix}`;
168
+ const wrappedTarget = `${prefix}${this.convertUnclosedTags(target)}${suffix}`;
169
+
170
+ // First, check the source string for problems. If there are any,
171
+ // don't even bother checking the target string for problems because
172
+ // we don't even know if they are valid problems. The translators may
173
+ // just have echoed the problems already in the source. There will be
174
+ // another rule that checks the well-formedness of the source string
175
+ // for the engineers to fix. It is not the job of this rule to report
176
+ // on the well-formedness of the source.
177
+ try {
178
+ srcObj = xml2js(wrappedSource, {
179
+ trim: false
180
+ });
181
+ } catch (e) {
182
+ // source is not well-formed, so don't even
183
+ // attempt to parse the target! Just bail.
184
+ return undefined;
185
+ }
186
+
187
+ try {
188
+ // Second, tags that have no name are a special type of un-well-formedness
189
+ // that we want to call out separately. If the target contains them, the
190
+ // xml2js parser below will find it, but it will show as an unclosed tag error.
191
+ // While that is true, it's a poor error message that doesn't help the
192
+ // translators fix the real problem, which is the unnamed tag.
193
+ if (unnamedTagRe.test(target)) {
194
+ const highlight =
195
+ target.replace(/(<\/?>)/g, "<e0>$1</e0>");
196
+ let opts = {
197
+ severity: "error",
198
+ rule: this,
199
+ description: `Empty XML elements <> and </> are not allowed in the target.`,
200
+ id: resource.getKey(),
201
+ highlight: `Target: ${highlight}`,
202
+ pathName: resource.getPath(),
203
+ source: resource.getSource(),
204
+ locale: resource.getTargetLocale()
205
+ };
206
+ problems.push(new Result(opts));
207
+ }
208
+
209
+ // Third, parse the target string for well-formedness. If it does not parse properly,
210
+ // it throws the exception handled below
211
+ tgtObj = xml2js(wrappedTarget, {
212
+ trim: false
213
+ });
214
+
215
+ // And finally match the xml elements/tags from the source to the target
216
+ problems = problems.concat(this.matchElements(srcObj, tgtObj, resource));
217
+ } catch (e) {
218
+ const lines = e.message.split(/\n/g);
219
+ // find the column number in the 3rd line of the exception message and subtract off
220
+ // the length of the prefix text we added in wrappedTarget
221
+ const column = parseInt(lines[2].substring(8), 10) - prefix.length;
222
+ // create the highlight, but make sure to escape any less than characters so that
223
+ // it does not conflict with the highlight
224
+ const highlight = column >= target.length ?
225
+ target + '<e0/>' :
226
+ target.substring(0, column) + '<e0>' + target[column] + '</e0>' + target.substring(column+1);
227
+ let opts = {
228
+ severity: "error",
229
+ rule: this,
230
+ description: `XML in translation is not well-formed. Error: ${lines[0]}`,
231
+ id: resource.getKey(),
232
+ highlight: `Target: ${highlight}`,
233
+ pathName: resource.getPath(),
234
+ source: resource.getSource(),
235
+ locale: resource.getTargetLocale()
236
+ };
237
+ problems.push(new Result(opts));
238
+ }
239
+
240
+ return problems.length < 2 ? problems[0] : problems;
241
+ }
242
+ }
243
+
244
+ export default ResourceXML;
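The element-matching step in `matchElements` boils down to comparing two tallies of tag names. The sketch below reproduces that idea without the `xml-js` dependency by using hand-built trees. The tree shape (`type`, `name`, `elements`) is an assumption about the parser's non-compact output, and `compareCounts`, `el`, and `text` are hypothetical helpers introduced only for illustration.

```javascript
// Tally element names in a tree of { type, name, elements } nodes.
function countElements(node, counts = {}) {
    if (Array.isArray(node)) {
        for (const child of node) countElements(child, counts);
    } else if (node) {
        if (node.type === "element") {
            counts[node.name] = (counts[node.name] ?? 0) + 1;
        }
        if (node.elements) countElements(node.elements, counts);
    }
    return counts;
}

// Hypothetical helper: describe every mismatch between the two tallies.
function compareCounts(sourceTree, targetTree) {
    const src = countElements(sourceTree.elements ?? []);
    const tgt = countElements(targetTree.elements ?? []);
    const problems = [];
    for (const name of Object.keys(src)) {
        if (src[name] !== (tgt[name] ?? 0)) {
            problems.push(`<${name}>: source ${src[name]}, target ${tgt[name] ?? 0}`);
        }
    }
    for (const name of Object.keys(tgt)) {
        if (!src[name]) problems.push(`<${name}> appears only in the target`);
    }
    return problems;
}

// Hand-built trees standing in for parsed source and target strings.
const el = (name, ...elements) => ({ type: "element", name, elements });
const text = (t) => ({ type: "text", text: t });
const source = { elements: [el("root", text("You must "), el("b", text("wait")), el("a", text("job")))] };
const reordered = { elements: [el("root", el("a", text("Job")), el("b", text("warten")))] };
const wrongTag = { elements: [el("root", el("a", text("Job")), el("c", text("warten")))] };

console.log(compareCounts(source, reordered)); // []
console.log(compareCounts(source, wrongTag));
```

Because only the tallies are compared, reordering or re-nesting tags in the translation is accepted, which matches the behavior described in the `resource-xml` documentation.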