jscpd-rs 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (96)
  1. package/CHANGELOG.md +69 -0
  2. package/Cargo.lock +1323 -0
  3. package/Cargo.toml +54 -0
  4. package/LICENSE +21 -0
  5. package/README.md +372 -0
  6. package/docs/api-parity.md +49 -0
  7. package/docs/cloning-plan.md +281 -0
  8. package/docs/compat-baseline.md +535 -0
  9. package/docs/format-porting.md +86 -0
  10. package/docs/junior-task-template.md +62 -0
  11. package/docs/junior-workflow.md +87 -0
  12. package/docs/migrating-from-jscpd.md +193 -0
  13. package/docs/npm-release.md +116 -0
  14. package/docs/public-benchmark-suite.md +81 -0
  15. package/docs/release-checklist.md +200 -0
  16. package/docs/release-decisions.md +103 -0
  17. package/docs/release-readiness.md +51 -0
  18. package/docs/upstream-bugs.md +501 -0
  19. package/docs/upstream-issue-drafts.md +393 -0
  20. package/docs/user-guide.md +309 -0
  21. package/examples/dump_oxc_tokens.rs +112 -0
  22. package/examples/library_api.rs +42 -0
  23. package/npm/bin/jscpd-rs.js +6 -0
  24. package/npm/bin/jscpd-server.js +6 -0
  25. package/npm/lib/run-binary.js +68 -0
  26. package/npm/scripts/postinstall.js +50 -0
  27. package/package.json +53 -0
  28. package/skills/dry-refactoring/SKILL.md +63 -0
  29. package/skills/jscpd/SKILL.md +85 -0
  30. package/src/app.rs +512 -0
  31. package/src/bin/jscpd-server.rs +429 -0
  32. package/src/blame.rs +130 -0
  33. package/src/cli/config.rs +543 -0
  34. package/src/cli/parsing.rs +301 -0
  35. package/src/cli/tests.rs +543 -0
  36. package/src/cli.rs +671 -0
  37. package/src/detector/matching/secondary.rs +387 -0
  38. package/src/detector/matching.rs +274 -0
  39. package/src/detector/model.rs +190 -0
  40. package/src/detector/prepare.rs +71 -0
  41. package/src/detector/skip_local.rs +40 -0
  42. package/src/detector/statistics.rs +138 -0
  43. package/src/detector/store.rs +96 -0
  44. package/src/detector/tests.rs +238 -0
  45. package/src/detector.rs +265 -0
  46. package/src/files/discovery.rs +508 -0
  47. package/src/files/gitignore.rs +203 -0
  48. package/src/files/paths.rs +68 -0
  49. package/src/files/shebang.rs +106 -0
  50. package/src/files/tests.rs +523 -0
  51. package/src/files.rs +25 -0
  52. package/src/formats.rs +570 -0
  53. package/src/lib.rs +433 -0
  54. package/src/main.rs +26 -0
  55. package/src/report/ai.rs +125 -0
  56. package/src/report/badge.rs +238 -0
  57. package/src/report/console.rs +180 -0
  58. package/src/report/console_common.rs +37 -0
  59. package/src/report/console_full.rs +139 -0
  60. package/src/report/csv.rs +65 -0
  61. package/src/report/escape.rs +8 -0
  62. package/src/report/file_output.rs +28 -0
  63. package/src/report/html/assets.rs +47 -0
  64. package/src/report/html.rs +336 -0
  65. package/src/report/json.rs +119 -0
  66. package/src/report/markdown.rs +125 -0
  67. package/src/report/sarif.rs +302 -0
  68. package/src/report/silent.rs +22 -0
  69. package/src/report/source.rs +38 -0
  70. package/src/report/summary.rs +50 -0
  71. package/src/report/test_support.rs +133 -0
  72. package/src/report/threshold.rs +76 -0
  73. package/src/report/xcode.rs +90 -0
  74. package/src/report/xml.rs +119 -0
  75. package/src/report.rs +250 -0
  76. package/src/server/mcp.rs +942 -0
  77. package/src/server.rs +1081 -0
  78. package/src/tokenizer/apex.rs +97 -0
  79. package/src/tokenizer/blocks.rs +532 -0
  80. package/src/tokenizer/embedded.rs +106 -0
  81. package/src/tokenizer/generic.rs +511 -0
  82. package/src/tokenizer/hash.rs +27 -0
  83. package/src/tokenizer/ignore.rs +33 -0
  84. package/src/tokenizer/line_index.rs +33 -0
  85. package/src/tokenizer/markdown.rs +289 -0
  86. package/src/tokenizer/markup_attrs.rs +289 -0
  87. package/src/tokenizer/oxc/fallback.rs +275 -0
  88. package/src/tokenizer/oxc/jsx.rs +168 -0
  89. package/src/tokenizer/oxc/kind.rs +177 -0
  90. package/src/tokenizer/oxc/lexical.rs +67 -0
  91. package/src/tokenizer/oxc.rs +659 -0
  92. package/src/tokenizer/scan.rs +88 -0
  93. package/src/tokenizer/tap.rs +150 -0
  94. package/src/tokenizer/tests.rs +915 -0
  95. package/src/tokenizer.rs +328 -0
  96. package/src/verbose.rs +195 -0
package/CHANGELOG.md ADDED
@@ -0,0 +1,69 @@
+ # Changelog
+
+ ## 0.1.0 - 2026-05-31
+
+ First release candidate for `jscpd-rs`, a native Rust clone of upstream
+ `jscpd`.
+
+ ### Added
+
+ - Native `jscpd` CLI binary with upstream-compatible command name and help
+ shape.
+ - Native `jscpd-server` binary exposing `/`, `/api/health`, `/api/stats`,
+ `/api/check`, `/api/recheck`, and `/mcp`.
+ - Coverage-first compatibility gates against the upstream `jscpd` submodule.
+ - CLI/config support for the main upstream option surface, including Commander
+ edge cases covered by compatibility scripts.
+ - Native file discovery with `.gitignore`, global Git excludes, symlink policy,
+ shebang detection, max size, max line, custom extension, and custom filename
+ handling.
+ - Upstream-synchronized format registry with 223 formats and 206 extension
+ mappings.
+ - Native Oxc-backed JavaScript, TypeScript, JSX, and TSX token processing.
+ - Native generic tokenization for long-tail formats, plus block handling for
+ Markdown, markup, Vue, Svelte, Astro, Apex, and TAP where needed for current
+ coverage gates.
+ - Built-in native reporters: `ai`, `console`, `consoleFull`, `csv`, `html`,
+ `json`, `markdown`, `silent`, `sarif`, `threshold`, `xcode`, `xml`, and
+ `badge`.
+ - Native `git blame -w` support in reports.
+ - Native Rust API for path-based detection, in-memory `SourceFile` detection,
+ an embeddable argv runner, native tokenizer map generation, native
+ `Detector`/`Statistic`/`MemoryStore` counterparts, upstream-style default
+ options, argv option parsing, supported format listing, format lookup, and
+ both `detect_clones_and_statistic` and
+ `detect_clones_and_statistics` spellings.
+ - Source-build npm package metadata and bin shims for `npx jscpd-rs`,
+ `jscpd`, and `jscpd-server`.
+ - Public benchmark suite on pinned React, Next.js, and Prometheus revisions.
+
+ ### Compatibility And Performance
+
+ The first release is intentionally coverage-first: Rust must not miss duplicated
+ upstream lines on the same inputs/options. Additional Rust findings are allowed
+ while compatibility converges and remain visible in comparison output.
+
+ Recorded release-candidate public benchmark measurements from
+ `scripts/release-candidate.sh`:
+
+ | Case | Commit | Format | Rust avg | Upstream avg | Speedup | Compat |
+ | --- | --- | --- | ---: | ---: | ---: | --- |
+ | React | `f0dfee3` | JavaScript | 0.199097s | 10.079214s | 50.62x | pass |
+ | Next.js | `2bbb67b9` | TypeScript | 0.262433s | 14.715736s | 56.07x | pass |
+ | Prometheus | `a0524ee` | Go | 0.085239s | 4.642435s | 54.46x | pass |
+
+ ### Known First-Release Deviations
+
+ - Dynamic npm reporters, stores, listeners, and plugins are not loaded.
+ - External reporter and store names keep upstream-style warning/fallback
+ behavior where upstream continues.
+ - Exact clone pair ordering, token totals, and fragment boundaries remain
+ diagnostic as long as upstream duplicated lines are covered.
+ - HTML output is self-contained and practically compatible, not pixel-perfect.
+ - The Rust crate exposes a native Rust API, not the upstream JavaScript package
+ API.
+ - The npm package currently builds native binaries from source during install;
+ prebuilt platform packages are planned as a later publication improvement.
+ - Full Prism grammar parity for every long-tail format is not attempted in this
+ release. Formats should be promoted from generic tokenization when concrete
+ coverage gates show missed upstream lines.
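
Note on the coverage-first rule in the changelog above: the gate it describes (no upstream duplicated line may be missed, extra Rust findings are tolerated but stay visible) can be read as a set comparison over duplicated lines. The sketch below is only an illustration of that reading. `DupLine`, `CoverageReport`, `expand`, and `compare` are hypothetical names, not part of the `jscpd-rs` API, and the package's actual compatibility scripts are not shown in this diff and may work differently.

```rust
use std::collections::BTreeSet;

/// A duplicated source line: (file path, 1-based line number).
/// Hypothetical type for illustration only.
type DupLine = (String, u32);

/// Outcome of comparing upstream findings with Rust findings.
#[derive(Debug, PartialEq)]
struct CoverageReport {
    /// Lines upstream reports as duplicated that Rust did not report.
    missed: Vec<DupLine>,
    /// Lines only Rust reports; allowed, but kept for the comparison output.
    extra: Vec<DupLine>,
}

impl CoverageReport {
    /// The gate passes when no upstream duplicated line is missed.
    fn passes(&self) -> bool {
        self.missed.is_empty()
    }
}

/// Expand clone fragments given as (file, start line, end line), inclusive,
/// into the set of individual duplicated lines.
fn expand(fragments: &[(&str, u32, u32)]) -> BTreeSet<DupLine> {
    fragments
        .iter()
        .flat_map(|&(file, start, end)| (start..=end).map(move |line| (file.to_string(), line)))
        .collect()
}

/// Compare the two line sets: `missed` is upstream minus Rust,
/// `extra` is Rust minus upstream. Both come back in sorted order.
fn compare(upstream: &BTreeSet<DupLine>, rust: &BTreeSet<DupLine>) -> CoverageReport {
    CoverageReport {
        missed: upstream.difference(rust).cloned().collect(),
        extra: rust.difference(upstream).cloned().collect(),
    }
}
```

Under this reading, a Rust fragment covering lines 9 to 13 of a file passes against an upstream fragment covering lines 10 to 12, with lines 9 and 13 listed as extras. A Rust fragment covering only lines 10 to 11 fails, with line 12 listed as missed. Comparing lines rather than clone pairs matches the changelog's statement that pair ordering, token totals, and fragment boundaries are treated as diagnostic only.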