jscpd-rs 0.1.0
- package/CHANGELOG.md +69 -0
- package/Cargo.lock +1323 -0
- package/Cargo.toml +54 -0
- package/LICENSE +21 -0
- package/README.md +372 -0
- package/docs/api-parity.md +49 -0
- package/docs/cloning-plan.md +281 -0
- package/docs/compat-baseline.md +535 -0
- package/docs/format-porting.md +86 -0
- package/docs/junior-task-template.md +62 -0
- package/docs/junior-workflow.md +87 -0
- package/docs/migrating-from-jscpd.md +193 -0
- package/docs/npm-release.md +116 -0
- package/docs/public-benchmark-suite.md +81 -0
- package/docs/release-checklist.md +200 -0
- package/docs/release-decisions.md +103 -0
- package/docs/release-readiness.md +51 -0
- package/docs/upstream-bugs.md +501 -0
- package/docs/upstream-issue-drafts.md +393 -0
- package/docs/user-guide.md +309 -0
- package/examples/dump_oxc_tokens.rs +112 -0
- package/examples/library_api.rs +42 -0
- package/npm/bin/jscpd-rs.js +6 -0
- package/npm/bin/jscpd-server.js +6 -0
- package/npm/lib/run-binary.js +68 -0
- package/npm/scripts/postinstall.js +50 -0
- package/package.json +53 -0
- package/skills/dry-refactoring/SKILL.md +63 -0
- package/skills/jscpd/SKILL.md +85 -0
- package/src/app.rs +512 -0
- package/src/bin/jscpd-server.rs +429 -0
- package/src/blame.rs +130 -0
- package/src/cli/config.rs +543 -0
- package/src/cli/parsing.rs +301 -0
- package/src/cli/tests.rs +543 -0
- package/src/cli.rs +671 -0
- package/src/detector/matching/secondary.rs +387 -0
- package/src/detector/matching.rs +274 -0
- package/src/detector/model.rs +190 -0
- package/src/detector/prepare.rs +71 -0
- package/src/detector/skip_local.rs +40 -0
- package/src/detector/statistics.rs +138 -0
- package/src/detector/store.rs +96 -0
- package/src/detector/tests.rs +238 -0
- package/src/detector.rs +265 -0
- package/src/files/discovery.rs +508 -0
- package/src/files/gitignore.rs +203 -0
- package/src/files/paths.rs +68 -0
- package/src/files/shebang.rs +106 -0
- package/src/files/tests.rs +523 -0
- package/src/files.rs +25 -0
- package/src/formats.rs +570 -0
- package/src/lib.rs +433 -0
- package/src/main.rs +26 -0
- package/src/report/ai.rs +125 -0
- package/src/report/badge.rs +238 -0
- package/src/report/console.rs +180 -0
- package/src/report/console_common.rs +37 -0
- package/src/report/console_full.rs +139 -0
- package/src/report/csv.rs +65 -0
- package/src/report/escape.rs +8 -0
- package/src/report/file_output.rs +28 -0
- package/src/report/html/assets.rs +47 -0
- package/src/report/html.rs +336 -0
- package/src/report/json.rs +119 -0
- package/src/report/markdown.rs +125 -0
- package/src/report/sarif.rs +302 -0
- package/src/report/silent.rs +22 -0
- package/src/report/source.rs +38 -0
- package/src/report/summary.rs +50 -0
- package/src/report/test_support.rs +133 -0
- package/src/report/threshold.rs +76 -0
- package/src/report/xcode.rs +90 -0
- package/src/report/xml.rs +119 -0
- package/src/report.rs +250 -0
- package/src/server/mcp.rs +942 -0
- package/src/server.rs +1081 -0
- package/src/tokenizer/apex.rs +97 -0
- package/src/tokenizer/blocks.rs +532 -0
- package/src/tokenizer/embedded.rs +106 -0
- package/src/tokenizer/generic.rs +511 -0
- package/src/tokenizer/hash.rs +27 -0
- package/src/tokenizer/ignore.rs +33 -0
- package/src/tokenizer/line_index.rs +33 -0
- package/src/tokenizer/markdown.rs +289 -0
- package/src/tokenizer/markup_attrs.rs +289 -0
- package/src/tokenizer/oxc/fallback.rs +275 -0
- package/src/tokenizer/oxc/jsx.rs +168 -0
- package/src/tokenizer/oxc/kind.rs +177 -0
- package/src/tokenizer/oxc/lexical.rs +67 -0
- package/src/tokenizer/oxc.rs +659 -0
- package/src/tokenizer/scan.rs +88 -0
- package/src/tokenizer/tap.rs +150 -0
- package/src/tokenizer/tests.rs +915 -0
- package/src/tokenizer.rs +328 -0
- package/src/verbose.rs +195 -0
@@ -0,0 +1,501 @@

# Upstream Bug Candidates

These are compatibility findings that look like upstream `jscpd` issues rather
than Rust clone issues. Verify each against the current upstream default branch
before filing.

Verification snapshot: on 2026-05-31, upstream remote `HEAD` resolved to
`refs/heads/master` at `50290cf`; no `refs/heads/main` was advertised. The
local upstream submodule is clean at that SHA and was used as the current
upstream checkout for quick repro verification.

## Prism JS tokenizer swallows code after nested template literals

Status: observed on the `jscpd` submodule during compatibility work.

Repro target:

```sh
FORMAT=javascript MIN_TOKENS=50 MIN_LINES=5 MAX_SIZE=1mb KEEP=1 \
  scripts/compat.sh /path/to/generated-next-app
```

Observed mismatch:

- Upstream reports a clone between:
  - generated SSR chunk A
  - generated standalone server chunk B
- The Rust/Oxc tokenizer also sees the equivalent clone in:
  - generated standalone SSR chunk C
- Upstream does not see that second candidate because Prism tokenizes a large
  minified JS range as one `string` token.

Concrete tokenization symptom in upstream tokenizer:

```text
line 3 column 284: string token length ~8194
starts with: `}f.__next_img_default=!0;let g=f},67161,...
ends before: `locale-option...
```

The source pattern around the start is a nested template expression in minified
JavaScript:

```js
return`${a.path}?url=${encodeURIComponent(b)}&w=${c}&q=${i}${b.startsWith("/")&&h?`&dpl=${h}`:""}`}...
```

Expected behavior: the tokenizer should continue parsing JavaScript after the
outer template literal instead of treating the following module body as a single
template/string token.

## Prism JS tokenizer treats `//` inside a template literal as a line comment

Status: observed on the `jscpd` submodule during compatibility work.

Related repro target:

```sh
FORMAT=javascript MIN_TOKENS=50 MIN_LINES=5 MAX_SIZE=1mb KEEP=1 \
  scripts/compat.sh /path/to/generated-next-app
```

In another generated SSR chunk, upstream tokenizes a `//` sequence inside a
template literal as a comment that runs to the end of a very large minified
line.

Observed source pattern:

```js
return`${a}//${b}${c?":"+c:""}`
```

Concrete tokenization symptom in upstream tokenizer:

```text
line 3 column 7270: comment token length ~300232
starts with: //${b}${c?":"+c:""}...
contains later ordinary module code and localeConfig data
```

Expected behavior: the `//` text inside the template literal should remain a
template string segment and should not comment out the rest of the generated
line.

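Both Prism findings above come down to locating where a template literal really ends. The following is a minimal sketch of a scanner that tracks `${ ... }` nesting. It is not Prism's or Oxc's implementation; the function names are hypothetical, and regex literals and comments inside interpolations are deliberately left out.

```javascript
// Returns the index just past the closing backtick of the template literal
// that opens at src[open]. Text such as "//" is treated as ordinary template
// content; only `${` switches to expression scanning.
function skipTemplate(src, open) {
  let i = open + 1; // first character after the opening backtick
  while (i < src.length) {
    const ch = src[i];
    if (ch === "\\") {
      i += 2; // escaped character inside the template text
    } else if (ch === "`") {
      return i + 1; // closing backtick of this template
    } else if (ch === "$" && src[i + 1] === "{") {
      i = skipExpression(src, i + 2);
    } else {
      i += 1;
    }
  }
  throw new SyntaxError("unterminated template literal");
}

// Returns the index just past the `}` that closes a `${` interpolation whose
// body starts at src[start]. Nested templates recurse into skipTemplate.
function skipExpression(src, start) {
  let depth = 1;
  let i = start;
  while (i < src.length) {
    const ch = src[i];
    if (ch === "`") {
      i = skipTemplate(src, i);
    } else if (ch === '"' || ch === "'") {
      i = skipQuoted(src, i);
    } else if (ch === "{") {
      depth += 1;
      i += 1;
    } else if (ch === "}") {
      depth -= 1;
      i += 1;
      if (depth === 0) return i;
    } else {
      i += 1;
    }
  }
  throw new SyntaxError("unterminated template interpolation");
}

// Returns the index just past the closing quote of a '...' or "..." string.
function skipQuoted(src, open) {
  const quote = src[open];
  let i = open + 1;
  while (i < src.length) {
    if (src[i] === "\\") i += 2;
    else if (src[i] === quote) return i + 1;
    else i += 1;
  }
  throw new SyntaxError("unterminated string literal");
}

// The `//` pattern from this section, followed by hypothetical trailing code.
const sample = 'return`${a}//${b}${c?":"+c:""}`;next()';
const open = sample.indexOf("`");
console.log(sample.slice(skipTemplate(sample, open))); // prints ;next()
```

With this bookkeeping, the code after the closing backtick is still available to be tokenized as ordinary JavaScript, which is the behavior both sections expect.
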
## `--blame` fails for files inside a nested Git repository when run from the parent repo

Status: observed on the `jscpd` submodule during compatibility work.

Repro from the Rust clone repository root, where `jscpd/` is a Git submodule:

```sh
node jscpd/apps/jscpd/bin/jscpd jscpd/fixtures/javascript \
  --format javascript \
  --reporters json \
  --output /tmp/jscpd-upstream-blame \
  --silent \
  --noTips \
  --blame \
  --min-tokens 20 \
  --min-lines 3 \
  --max-size 1mb \
  --exitCode 0
```

Observed failure:

```text
Error: Command failed with exit code 128: /usr/bin/git blame -w jscpd/fixtures/javascript/file_4.js
fatal: no such path 'jscpd/fixtures/javascript/file_4.js' in HEAD
```

The failure comes from `blamer@1.0.7`, which invokes `git blame -w <path>` from
the current process directory. When the scanned path is inside a nested Git
repository or submodule, the parent repository does not track the nested file
path as a regular file, so `git blame` exits with 128.

Expected behavior: blame should run from the file's own repository/worktree, or
fail per file without aborting the entire detection run.

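One way to run blame from the file's own repository is to let Git discover the repository from the file's directory via `git -C <dir>`. The sketch below only builds the argument list; `blameArgs` is a hypothetical helper, not part of `blamer` or `jscpd`, and it assumes POSIX-style `/` separators.

```javascript
// Builds the argv for `git` so that blame runs inside the directory that
// contains the file. Git then resolves the nested repository or submodule
// instead of the parent repository.
function blameArgs(filePath) {
  const slash = filePath.lastIndexOf("/");
  // No slash: the file is in the current directory. Leading slash only: root.
  const dir = slash === -1 ? "." : filePath.slice(0, slash) || "/";
  const name = filePath.slice(slash + 1);
  return ["-C", dir, "blame", "-w", "--", name];
}

console.log(blameArgs("jscpd/fixtures/javascript/file_4.js").join(" "));
```

A caller would pass this array to its process-spawning API of choice and treat a non-zero exit as a per-file failure rather than aborting the run.
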
## Pug report overextends a clone into a non-matching `style.` block

Status: observed on the `jscpd` submodule during compatibility work. The Rust
clone currently mirrors this range behavior for compatibility.

Repro target:

```sh
FORMAT=pug MIN_TOKENS=20 MIN_LINES=3 MAX_SIZE=1mb KEEP=1 scripts/compat.sh jscpd/fixtures/pug
```

Observed upstream clone:

- `jscpd/fixtures/pug/file1.pug:1-274`
- `jscpd/fixtures/pug/file2.pug:1-266`

Those ranges include the `style.` multiline plain-text block. The upstream
tokenizer emits that block as one `multiline-plain-text` token:

```text
file1.pug: token 391, lines 49-274, length 5278, md5 prefix 8a231
file2.pug: token 391, lines 49-266, length 5115, md5 prefix 9eaf8
```

The token values differ: `file1.pug` contains extra `.clones-excellent` and
`.clones-fine` CSS blocks before `.clones-danger`, while `file2.pug` jumps
directly from `.stats` to `.clones-danger`.

Expected behavior: the reported clone range should stop before the non-matching
multiline token, or the tokenizer should split the `style.` content so only
matching CSS ranges are reported.

## HAML report overextends a clone into a non-matching comment block

Status: observed on the `jscpd` submodule during compatibility work. The Rust
clone currently mirrors this range behavior for compatibility.

Repro target:

```sh
FORMAT=haml MIN_TOKENS=20 MIN_LINES=3 MAX_SIZE=1mb KEEP=1 scripts/compat.sh jscpd/fixtures/haml
```

Observed upstream clone:

- `jscpd/fixtures/haml/file1.haml:1-26`
- `jscpd/fixtures/haml/file2.haml:1-26`

The ranges include a HAML silent-comment block whose visible source differs:

```haml
-# File-specific: user settings section
.settings-section
%h2 Account Settings
%p Change your password and security preferences.
```

versus:

```haml
-# File-specific: notification preferences
.notifications-section
%h2 Notification Preferences
%p Manage how you receive alerts and updates.
```

Expected behavior: the reported clone should stop before the differing
commented block, or the tokenizer/report range logic should document that HAML
silent comments are ignored and may extend clone ranges through non-matching
source text.

## ASP.NET report overextends a clone through an inserted email form group

Status: observed on the `jscpd` submodule during compatibility work.

Repro target:

```sh
FORMAT=aspnet MIN_TOKENS=20 MIN_LINES=3 MAX_SIZE=1mb KEEP=1 scripts/compat.sh jscpd/fixtures/htmlembedded
```

Observed upstream clone:

- `jscpd/fixtures/htmlembedded/file1.aspx:18-36`
- `jscpd/fixtures/htmlembedded/file2.aspx:18-43`

The `file2.aspx` range includes an inserted email field group on lines 36-42:

```aspx
<div class="form-group">
<asp:Label ID="lblEmail" runat="server" AssociatedControlID="txtEmail" Text="Email:" />
<asp:TextBox ID="txtEmail" runat="server" CssClass="form-control" MaxLength="255" />
<asp:RequiredFieldValidator ID="rfvEmail" runat="server"
ControlToValidate="txtEmail" ErrorMessage="Email is required"
CssClass="text-danger" Display="Dynamic" />
</div>
```

Those controls are not present in the paired `file1.aspx` range. The upstream
Prism tokens also keep distinct values such as `lblEmail`, `txtEmail`,
`Email:`, `rfvEmail`, and `Email is required`, so this does not look like a
normal token normalization difference.

Expected behavior: the reported clone should stop before the inserted email
group, split around it, or report only the structurally duplicated subranges.

## React public benchmark reports overextended JavaScript clone ranges

Status: observed on the `jscpd` submodule during React public benchmark
compatibility work.

Repro target:

```sh
FORMAT=javascript MIN_TOKENS=50 MIN_LINES=5 MAX_SIZE=1mb KEEP=1 \
  scripts/compat.sh "${BENCH_ROOT:-$HOME/.cache/jscpd-rs/public-bench}/repos/react/."
```

After the Rust clone covers the real duplicated subranges, three upstream
fragments still look overextended rather than genuinely missed:

- `SyntheticMouseEvent-test.js:21-38`: upstream pairs
  `SyntheticClipboardEvent-test.js:20-34` with
  `SyntheticMouseEvent-test.js:21-38`. Lines 21-35 are duplicated setup code,
  but lines 36-38 already enter the `onMouseMove` test body and do not match
  the clipboard test's nested `describe`/`it` block.
- `ReactDOMFizzServerNode.js:179-229`: one upstream clone reports the Node
  fragment as `229-179` against `ReactDOMFizzServerEdge.js:92-165`, producing a
  reversed range. The Rust clone covers the surrounding real duplicated
  subranges, but not the reversed overextension gap.
- `ReactDOMViewTransition-test.js:39-135`: upstream pairs
  `ReactDOMSuspensePlaceholder-test.js:37-109` with
  `ReactDOMViewTransition-test.js:39-135`. Lines 39-111 cover the shared test
  helpers; lines 112-135 enter a ViewTransition-specific SuspenseList test and
  do not correspond to the SuspensePlaceholder range.

Expected behavior: clone fragments should stop at the last matching token range,
or the detector should split separate duplicated subranges instead of extending
through neighboring non-matching test code.

## Next.js TypeScript public benchmark overextended report ranges

Status: observed on the `next` public benchmark at commit `2bbb67b9` during
coverage-first compatibility work.

Repro target:

```sh
FORMAT=typescript MIN_TOKENS=50 MIN_LINES=5 MAX_SIZE=1mb STRICT=coverage KEEP=1 \
  scripts/compat.sh "${BENCH_ROOT:-$HOME/.cache/jscpd-rs/public-bench}/repos/next/."
```

After matching the upstream `console-exit` template interpolation behavior and
TypeScript array-regex tokenization, the Rust clone covers `3900/3908` upstream
clone fragments on this benchmark. The remaining missing coverage is dominated
by upstream fragments that extend across unrelated neighboring test cases or
through reversed/oversized ranges:

- `next-style-loader/index.ts:221-229`: upstream tokenizes generated JS inside a
  template literal as ordinary TypeScript and reports a clone against
  `154-165`. A broad Rust experiment that tokenized code-like template raw text
  did cover this fragment, but it increased overall Next missing coverage and
  token volume, so it was rejected as too invasive.
- `non-root-project-monorepo.test.ts:221-240` and `284-303`: inline snapshot
  blocks with similar stack traces; upstream starts/ends inside snapshot text.
- `normalize-next-data.test.ts:185-681`: upstream pairs a 22-line later test
  block with a 497-line earlier range. Rust covers the actual smaller repeated
  route-normalization blocks around that area, but not the whole overextended
  range.
- `edge-runtime-module-errors.test.ts:314-459` and `745-892`: upstream contains
  several useful repeated subranges, but some reported pairs have reversed or
  overextended endpoints such as `459-314` and `892-745`.
- `next-rs-api.test.ts:175-203` and `327-356`: a real repeated config object
  body with an upstream start before the stable matching token run.

Expected behavior: clone fragments should be split at the actual matching token
runs and should not report reversed or multi-test overextended ranges.

## Prometheus Go public benchmark overextended report ranges

Status: observed on the `prometheus` public benchmark at commit `a0524ee`
during coverage-first compatibility work.

Repro target:

```sh
FORMAT=go MIN_TOKENS=50 MIN_LINES=5 MAX_SIZE=1mb STRICT=coverage KEEP=1 \
  scripts/compat.sh "${BENCH_ROOT:-$HOME/.cache/jscpd-rs/public-bench}/repos/prometheus/."
```

The Rust clone reports more Go clones overall on this benchmark, but upstream
still has a small set of fragments whose line ranges extend beyond the matching
token run. Several are reversed ranges, and several table-driven test cases use
one early case as the paired fragment for many later cases, which makes the
reported early range span unrelated intervening cases. The public benchmark
gate allows these exact ranges while keeping them visible as ignored exceptions:

- `storage/remote/write_test.go:214-221` and `240-249`: upstream starts on the
  tail of a different config-call line; Rust starts at the following shared
  block and covers the real repeated assertions.
- `storage/remote/read_test.go:339-414`: upstream reports a reversed fragment
  (`414-339`) across multiple table entries; Rust covers the smaller repeated
  entries around that region.
- `discovery/marathon/marathon_test.go:325-478`: upstream reports a reversed
  table-test range twice.
- `discovery/hetzner/mock_test.go:58-457` and `464-517`: upstream pairs a very
  large mock implementation range with a later smaller block.
- `discovery/triton/triton.go:90-136` and `discovery/gce/gce.go:91-117`:
  upstream extends structurally similar config validation clones into
  neighboring declarations.
- `cmd/promtool/main_test.go:250-256` and `250-258`: upstream reports two
  overlapping partial ranges for the same test setup.
- `tsdb/head_read_test.go:73-94`, `122-171`, `122-213`, and `122-280`:
  upstream repeatedly pairs later table entries with broad earlier table-entry
  spans instead of the closest equivalent repeated case.
- `rules/group_test.go:42-67`: upstream includes adjacent setup lines around
  the repeated body.

Expected behavior: clone fragments should stop at the matching token run, keep
start/end ordering stable, and avoid stretching one table-driven test case
through unrelated neighboring cases.

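Reversed fragments such as `414-339` can be flagged mechanically before comparing reports. The sketch below parses the `start-end` notation used in this document; `parseRange` and `isReversed` are hypothetical helpers and do not reflect the actual gate in `scripts/compat.sh`.

```javascript
// Parses a "start-end" line range such as "414-339" into numeric endpoints.
function parseRange(text) {
  const match = /^(\d+)-(\d+)$/.exec(text);
  if (!match) {
    throw new Error(`invalid line range "${text}", expected "start-end"`);
  }
  return { start: Number(match[1]), end: Number(match[2]) };
}

// A fragment is reversed when its reported start line is after its end line.
function isReversed(range) {
  return range.start > range.end;
}

for (const text of ["339-414", "414-339"]) {
  console.log(text, isReversed(parseRange(text)) ? "reversed" : "ordered");
}
```

Flagging rather than silently swapping the endpoints keeps the upstream anomaly visible, in line with the gate's approach of listing these ranges as ignored exceptions.
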
## Option fields are exposed but unused at runtime

Status: observed on the `jscpd` submodule during compatibility work.

The option surface contains fields that look like user-facing workflow hooks,
but the current CLI/runtime does not consume them after option parsing or
defaulting:

- `cache` is defined in
  `jscpd/packages/core/src/interfaces/options.interface.ts`, defaults to `true`
  in `jscpd/packages/core/src/options.ts`, and is copied from the CLI object in
  `jscpd/apps/jscpd/src/options.ts`. There is no `--cache` CLI option and no
  runtime read of `options.cache` in core/finder/tokenizer.
- `listeners` is defined in the options interface and normalized to `[]` in
  `jscpd/apps/jscpd/src/options.ts`, but runtime subscribers are registered
  only from built-in `verbose` and progress rules.
- `tokensToSkip` appears only in the options interface. It is not consumed by
  tokenization or detector code.

Expected behavior: either document these fields as reserved/no-op, remove them
from the public option surface, or wire them to runtime behavior.

## String `minTokens` in config can corrupt token windows

Status: observed on the `jscpd` submodule during compatibility work.

Runtime config values from `.jscpd.json` and `package.json#jscpd` are merged
without the CLI numeric parsing step. Some numeric-looking strings still work
through JavaScript coercion, for example `minLines`, `maxLines`, and
`threshold`, but `minTokens` is used in token-window indexing with `+` before
numeric subtraction, so the `+` concatenates instead of adding. With a string
value such as `"5"` and a position of `1`, `position + minTokens - 1` evaluates
as `"15" - 1`, giving index `14` instead of `5`. This eventually produces an
undefined token frame and a detector crash.

Rust clone handling: `minLines`, `maxLines`, and `threshold` now accept string
numeric values where upstream continues. Config `minTokens` remains strict for
now because accepting that upstream-broken path would silently change visible
runtime behavior instead of preserving a safe compatibility contract.

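The coercion can be reproduced in isolation. In this sketch, `windowEnd` is a hypothetical stand-in for the upstream window computation, and `parseMinTokens` shows one possible strict normalizer; neither is taken from the `jscpd` source.

```javascript
// Stand-in for the window-end expression described above.
function windowEnd(position, minTokens) {
  return position + minTokens - 1;
}

// One possible normalizer: accept positive integers and integer-like strings,
// reject everything else before the value reaches index arithmetic.
function parseMinTokens(value) {
  const parsed = typeof value === "string" ? Number(value.trim()) : value;
  if (!Number.isInteger(parsed) || parsed < 1) {
    throw new TypeError(
      `minTokens must be a positive integer, got ${JSON.stringify(value)}`
    );
  }
  return parsed;
}

console.log(windowEnd(1, 5)); // numeric value: 1 + 5 - 1
console.log(windowEnd(1, "5")); // string value: "15" - 1
console.log(windowEnd(1, parseMinTokens("5"))); // normalized before use
```
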
## Malformed format mapping CLI values crash during option conversion

Status: observed on the `jscpd` submodule during compatibility work.

`--formats-exts` and `--formats-names` are parsed by splitting each semicolon
entry on `:` and then calling `.split(',')` on the second half. If an entry does
not contain `:`, option conversion throws a runtime TypeError before detection.

Repro:

```sh
node jscpd/apps/jscpd/bin/jscpd jscpd/fixtures/custom \
  --formats-exts javascript \
  --silent \
  --noTips \
  --min-tokens 20 \
  --min-lines 3 \
  --max-size 1mb
```

Observed first line:

```text
TypeError: Cannot read properties of undefined (reading 'split')
```

Rust clone handling: the CLI now mirrors this visible runtime error for
malformed `--formats-exts` and `--formats-names` values. Valid mapping strings
and empty strings continue through the upstream-compatible path.

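The failure mode follows directly from the parsing shape described in this section. The sketch below is a simplified reconstruction from that description, not the upstream source; both function names are hypothetical, and only non-empty values are modeled.

```javascript
// Reconstructed parsing shape: split entries on ";", then each entry on ":".
// When an entry has no ":", `extensions` is undefined and `.split` throws.
function parseMappingUnsafe(value) {
  const result = {};
  for (const entry of value.split(";")) {
    const [format, extensions] = entry.split(":");
    result[format] = extensions.split(",");
  }
  return result;
}

// Guarded variant: reject malformed entries with a descriptive error.
function parseMapping(value) {
  const result = {};
  for (const entry of value.split(";")) {
    const separator = entry.indexOf(":");
    if (separator <= 0) {
      throw new Error(
        `invalid format mapping "${entry}", expected "format:ext1,ext2"`
      );
    }
    result[entry.slice(0, separator)] = entry.slice(separator + 1).split(",");
  }
  return result;
}

console.log(parseMapping("javascript:es,es6;dart:dt"));

try {
  parseMappingUnsafe("javascript");
} catch (error) {
  console.log(error.name); // the unguarded path fails with a TypeError
}
```
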
## Bare optional numeric CLI flags produce accidental behavior

Status: observed on the `jscpd` submodule during compatibility work.

Several Commander options are declared with optional numeric values, for example
`--threshold [number]` and `--exitCode [number]`. When the flag is passed
without a value, Commander supplies boolean `true`.

Repro:

```sh
node jscpd/apps/jscpd/bin/jscpd jscpd/fixtures/javascript \
  --threshold \
  --silent \
  --noTips \
  --min-tokens 20 \
  --min-lines 3 \
  --max-size 1mb

node jscpd/apps/jscpd/bin/jscpd jscpd/fixtures/javascript \
  --exitCode \
  --silent \
  --noTips \
  --min-tokens 20 \
  --min-lines 3 \
  --max-size 1mb
```

Observed behavior:

- Bare `--threshold` is converted with `Number(true)`, so the threshold becomes
  `1%` and detection fails if duplication is at least 1%.
- Bare `--exitCode` stores boolean `true`; when clones are found, Node rejects
  that boolean as `process.exitCode` with `TypeError [ERR_INVALID_ARG_TYPE]`.

Expected behavior: require a numeric value, or explicitly document and normalize
the default value for bare flags.

Rust clone handling: bare `--threshold` and bare `--exitCode` are mirrored for
CLI compatibility. The `--exitCode` behavior remains an upstream bug candidate,
but preserving it is cheaper than leaving a visible CLI parity gap.

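The "require a numeric value" option could look like the sketch below. `optionalNumber` is a hypothetical helper that receives the raw value an option parser produced for a `[number]` flag; it is not part of Commander or `jscpd`.

```javascript
// Normalizes the raw value of an optional numeric flag.
//   undefined -> flag was not passed
//   true      -> flag was passed without a value
//   string    -> flag was passed with a value
function optionalNumber(name, raw) {
  if (raw === undefined) return undefined;
  if (raw === true) {
    throw new Error(`${name} requires a numeric value`);
  }
  const parsed = Number(raw);
  if (!Number.isFinite(parsed)) {
    throw new Error(`${name} expects a number, got "${raw}"`);
  }
  return parsed;
}

console.log(Number(true)); // the coercion behind the accidental 1% threshold
console.log(optionalNumber("--threshold", "10"));
```
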
## Bare optional string CLI flags produce inconsistent failures

Status: observed on the `jscpd` submodule during compatibility work.

Several Commander string options are declared with optional values. When the
flag is passed without a value, Commander supplies boolean `true`, and later
runtime code either crashes with a type error or continues depending on whether
that option is used.

Repro shape:

```sh
node jscpd/apps/jscpd/bin/jscpd <flag> \
  --silent \
  --noTips \
  jscpd/fixtures/clike/file2.c \
  --min-tokens 20 \
  --min-lines 3 \
  --max-size 1mb
```

Observed first stdout lines:

| Flag | Exit | First line |
| --- | ---: | --- |
| `--config` | 1 | `TypeError [ERR_INVALID_ARG_TYPE]: The "paths[0]" argument must be of type string. Received type boolean (true)` |
| `--ignore` | 1 | `TypeError: cli.ignore.split is not a function` |
| `--ignore-pattern` | 1 | `TypeError: cli.ignorePattern.split is not a function` |
| `--reporters` | 1 | `TypeError: cli.reporters.split is not a function` |
| `--mode` | 1 | `TypeError: mode is not a function` |
| `--format` | 1 | `TypeError: cli.format.split is not a function` |
| `--formats-exts` | 1 | `TypeError: extensions.split is not a function` |
| `--formats-names` | 1 | `TypeError: extensions.split is not a function` |
| `--output` | 0 | continues when no file-writing reporter uses `output` |

`--output --reporters json` later fails when the JSON reporter passes boolean
`true` to filesystem path creation.

Expected behavior: require string values for these flags, or normalize bare
flags before option conversion.

Rust clone handling: low-risk bare-value cases that upstream continues with are
mirrored, and the CLI gate now also mirrors the visible runtime TypeError shape
for bare `--ignore`, `--ignore-pattern`, `--reporters`, `--mode`, `--format`,
`--formats-exts`, `--formats-names`, and `--output` when a file-writing
reporter consumes the boolean output path. The `--output` case preserves the
different first-line TypeError strings from `fs.mkdirSync`-based reporters and
`path.join`-based reporters. These remain upstream bug candidates, but
preserving the visible command behavior removes a CLI parity gap.
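
The `.split is not a function` rows in the table share one cause: a boolean reaches code that expects a string. A minimal sketch of normalizing before conversion follows; `splitList` and `optionalList` are hypothetical names, and the comma separator is an assumption made for illustration.

```javascript
// Models the unguarded conversion: works for strings, throws for `true`.
function splitList(raw) {
  return raw.split(",");
}

// Normalizes a list-valued flag before conversion.
function optionalList(name, raw) {
  if (raw === undefined) return []; // flag was not passed
  if (typeof raw !== "string") {
    throw new Error(`${name} requires a value`);
  }
  return raw.split(",");
}

console.log(optionalList("--reporters", "json,html"));

try {
  splitList(true); // bare flag value
} catch (error) {
  console.log(error.name);
}
```
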