patina-cli 3.11.0 → 4.0.0

Files changed (193)
  1. package/.patina.default.yaml +29 -29
  2. package/CHANGELOG.md +53 -0
  3. package/NOTICE +21 -0
  4. package/README.md +117 -224
  5. package/README_JA.md +134 -77
  6. package/README_KR.md +132 -74
  7. package/README_ZH.md +137 -80
  8. package/SKILL.md +11 -20
  9. package/artifacts/rebaseline-2025/README.md +147 -0
  10. package/artifacts/rebaseline-2025/human-controls.public.jsonl +250 -0
  11. package/artifacts/rebaseline-2025/intake.example.jsonl +2 -0
  12. package/artifacts/rebaseline-2025/intake.local.example.jsonl +25 -0
  13. package/artifacts/rebaseline-2025/prompts.template.jsonl +7 -0
  14. package/artifacts/rebaseline-2025/sources.ko-public.jsonl +39 -0
  15. package/assets/brand/patina-badge.svg +18 -0
  16. package/assets/brand/patina-mark.svg +8 -0
  17. package/assets/demo/README.md +79 -0
  18. package/core/scoring.md +12 -12
  19. package/core/standalone-prompt.md +3 -1
  20. package/core/stylometry.md +93 -22
  21. package/docs/API.md +1554 -0
  22. package/docs/AUTHENTICATION.md +50 -26
  23. package/docs/AUTHENTICATION_KR.md +54 -29
  24. package/docs/BRANDING.md +9 -8
  25. package/docs/CLI.md +55 -14
  26. package/docs/COOKBOOK.md +8 -21
  27. package/docs/DEMO.md +32 -5
  28. package/docs/EXIT-CODES.md +2 -3
  29. package/docs/FALSE-POSITIVES.md +63 -0
  30. package/docs/FAQ.md +9 -1
  31. package/docs/FAQ_KR.md +3 -1
  32. package/docs/FLAG-PARITY.md +33 -47
  33. package/docs/ISSUE-WAVES.md +57 -0
  34. package/docs/PATTERNS-EN.md +67 -3
  35. package/docs/PATTERNS-JA.md +68 -2
  36. package/docs/PATTERNS-KO.md +70 -7
  37. package/docs/PATTERNS-ZH.md +67 -3
  38. package/docs/PATTERNS.md +5 -5
  39. package/docs/RESEARCH-DOCS-PLATFORM.md +54 -0
  40. package/docs/ROADMAP.md +46 -66
  41. package/docs/TRANSLATIONESE-KO.md +51 -0
  42. package/docs/audits/2026-05-deep-research.md +3 -1
  43. package/docs/benchmarks/README.md +51 -0
  44. package/docs/benchmarks/detector-comparison.json +69 -9
  45. package/docs/benchmarks/detector-comparison.md +10 -5
  46. package/docs/benchmarks/katfish-ko-latest.json +657 -0
  47. package/docs/benchmarks/katfish-ko-latest.md +77 -0
  48. package/docs/benchmarks/latest.json +1183 -108
  49. package/docs/benchmarks/latest.md +84 -60
  50. package/docs/benchmarks/lexicon-freshness-en-2026-05-22.json +1121 -0
  51. package/docs/benchmarks/lexicon-freshness-en-2026-05-22.md +136 -0
  52. package/docs/benchmarks/rebaseline-latest.json +381 -0
  53. package/docs/benchmarks/rebaseline-latest.md +121 -0
  54. package/docs/benchmarks/register-stratified-latest.json +164 -0
  55. package/docs/benchmarks/register-stratified-latest.md +99 -0
  56. package/docs/benchmarks/register-stratified.md +43 -0
  57. package/docs/integrations/github-action.md +44 -11
  58. package/docs/integrations/playground.md +58 -0
  59. package/docs/integrations/pre-commit.md +5 -5
  60. package/docs/integrations/release.md +5 -3
  61. package/docs/integrations/static-sites.md +83 -0
  62. package/docs/research/2025-rebaseline-plan.md +71 -2
  63. package/docs/research/2026-rebaseline.md +102 -0
  64. package/docs/research/adversarial-mps.md +41 -0
  65. package/docs/research/ai-human-metrics.md +35 -23
  66. package/docs/research/human-eval-panel.md +42 -0
  67. package/docs/research/judge-agreement.md +24 -0
  68. package/docs/research/ko-2025-corpus-sources.md +135 -0
  69. package/docs/research/lexicon-freshness-audit.md +64 -0
  70. package/docs/research/zh-ja-lexicon-calibration.md +60 -0
  71. package/docs/social/patina-launch-copy.md +173 -100
  72. package/docs/social/patina-launch-execution.md +94 -0
  73. package/docs/social/patina-launch-korean-first.md +83 -0
  74. package/docs/social/signs-of-ai-writing.md +26 -0
  75. package/docs/social/signs-of-ai-writing_KR.md +26 -0
  76. package/lexicon/ai-en.md +21 -24
  77. package/lexicon/ai-ja.md +158 -0
  78. package/lexicon/ai-ko.md +9 -9
  79. package/lexicon/ai-zh.md +158 -0
  80. package/lexicon/provenance/ai-en.json +970 -0
  81. package/lexicon/provenance/ai-ja.json +542 -0
  82. package/lexicon/provenance/ai-ko.json +866 -0
  83. package/lexicon/provenance/ai-zh.json +542 -0
  84. package/package.json +49 -8
  85. package/patterns/en-communication.md +5 -0
  86. package/patterns/en-content.md +5 -0
  87. package/patterns/en-filler.md +5 -0
  88. package/patterns/en-language.md +29 -1
  89. package/patterns/en-structure.md +5 -0
  90. package/patterns/en-style.md +5 -0
  91. package/patterns/en-viral-hook.md +42 -2
  92. package/patterns/ja-communication.md +5 -0
  93. package/patterns/ja-content.md +5 -0
  94. package/patterns/ja-filler.md +5 -0
  95. package/patterns/ja-language.md +33 -1
  96. package/patterns/ja-structure.md +12 -0
  97. package/patterns/ja-style.md +5 -0
  98. package/patterns/ja-viral-hook.md +41 -2
  99. package/patterns/ko-communication.md +5 -0
  100. package/patterns/ko-content.md +5 -0
  101. package/patterns/ko-filler.md +5 -0
  102. package/patterns/ko-language.md +33 -1
  103. package/patterns/ko-structure.md +25 -6
  104. package/patterns/ko-style.md +5 -0
  105. package/patterns/ko-viral-hook.md +38 -2
  106. package/patterns/zh-communication.md +5 -0
  107. package/patterns/zh-content.md +5 -0
  108. package/patterns/zh-filler.md +5 -0
  109. package/patterns/zh-language.md +37 -1
  110. package/patterns/zh-structure.md +12 -0
  111. package/patterns/zh-style.md +5 -0
  112. package/patterns/zh-viral-hook.md +38 -2
  113. package/playground/README.md +55 -0
  114. package/playground/analytics.js +4 -0
  115. package/playground/analyzer.js +883 -0
  116. package/playground/app.js +157 -0
  117. package/playground/data/lexicons.js +343 -0
  118. package/playground/index.html +138 -0
  119. package/playground/styles.css +267 -0
  120. package/profiles/namuwiki.md +111 -0
  121. package/scripts/adversarial-mps-report.mjs +201 -0
  122. package/scripts/badge-json.mjs +79 -0
  123. package/scripts/benchmark-report.mjs +56 -9
  124. package/scripts/check-release-metadata.mjs +0 -2
  125. package/scripts/detector-comparison.mjs +7 -7
  126. package/scripts/generate-playground-data.mjs +77 -0
  127. package/scripts/katfish-calibration.mjs +464 -0
  128. package/scripts/lexicon-freshness.mjs +485 -0
  129. package/scripts/lint.mjs +1 -1
  130. package/scripts/precommit-score.mjs +4 -3
  131. package/scripts/prose-score.mjs +81 -5
  132. package/scripts/rebaseline-intake.mjs +242 -0
  133. package/scripts/rebaseline-score.mjs +268 -0
  134. package/scripts/rebaseline-summary.mjs +773 -0
  135. package/scripts/rebaseline-web-collect.mjs +410 -0
  136. package/scripts/update-benchmark-ranges.mjs +1 -0
  137. package/src/api.js +69 -105
  138. package/src/auth.js +50 -2
  139. package/src/backends/claude-cli.js +19 -4
  140. package/src/backends/codex-cli.js +19 -3
  141. package/src/backends/contract.js +230 -1
  142. package/src/backends/gemini-cli.js +18 -5
  143. package/src/backends/index.js +87 -12
  144. package/src/backends/kimi-cli.js +161 -0
  145. package/src/cli.js +577 -567
  146. package/src/commands/doctor.js +2 -2
  147. package/src/config.js +29 -0
  148. package/src/errors.js +53 -1
  149. package/src/features/discourse-tells.js +68 -0
  150. package/src/features/index.js +82 -8
  151. package/src/features/lexicon.js +40 -6
  152. package/src/features/markup-leakage.js +69 -0
  153. package/src/features/segment.js +41 -0
  154. package/src/features/signal-strength.js +81 -0
  155. package/src/features/stylometry.js +231 -1
  156. package/src/features/translationese.js +127 -0
  157. package/src/loader.js +76 -0
  158. package/src/logger.js +22 -23
  159. package/src/model-defaults.js +55 -0
  160. package/src/ouroboros.js +31 -0
  161. package/src/output.js +102 -90
  162. package/src/prompt-builder.js +103 -68
  163. package/src/providers.js +51 -4
  164. package/src/scoring.js +210 -2
  165. package/src/security.js +75 -0
  166. package/tests/fixtures/live-quality/en/public-docs-01.md +26 -0
  167. package/tests/fixtures/live-quality/ko/public-docs-01.md +26 -0
  168. package/tests/fixtures/suspect-zones/expected-ranges.json +207 -16
  169. package/tests/fixtures/suspect-zones/ja/ai/ja-ai-04-lexicon.md +11 -0
  170. package/tests/fixtures/suspect-zones/ja/natural/ja-nat-04-lexicon-cold.md +11 -0
  171. package/tests/fixtures/suspect-zones/ko/ai/ko-ai-02.md +4 -5
  172. package/tests/fixtures/suspect-zones/ko/ai/ko-ai-07-ko-diagnostic.md +11 -0
  173. package/tests/fixtures/suspect-zones/zh/ai/zh-ai-04-lexicon.md +11 -0
  174. package/tests/fixtures/suspect-zones/zh/natural/zh-nat-04-lexicon-cold.md +11 -0
  175. package/tests/quality/README.md +188 -11
  176. package/tests/quality/adversarial-mps/fixtures.jsonl +10 -0
  177. package/tests/quality/benchmark.mjs +39 -1
  178. package/tests/quality/dogfood.mjs +5 -3
  179. package/tests/quality/live-fixtures.jsonl +2 -0
  180. package/tests/quality/live-quality.mjs +596 -0
  181. package/tests/quality/ranking-metrics.mjs +136 -0
  182. package/tests/quality/rebaseline-manifest.example.jsonl +5 -0
  183. package/vercel.json +53 -0
  184. package/SKILL-MAX.md +0 -455
  185. package/docs/internal/HARNESS.md +0 -14
  186. package/docs/internal/README.md +0 -14
  187. package/docs/internal/WARP.md +0 -23
  188. package/patina-max/SKILL.md +0 -523
  189. package/patina-max/composite.py +0 -457
  190. package/src/cache.js +0 -106
  191. package/src/commands/init.js +0 -208
  192. package/src/manifest.js +0 -162
  193. package/src/max-mode.js +0 -207
@@ -7,17 +7,17 @@ This is the latest checked-in report for patina's deterministic suspect-zone ben
  ## Current result
 
  - Status: **passing**
- - Generated at: 2026-05-20T11:14:22.284Z
- - Node: v22.17.1
+ - Generated at: 2026-06-07T05:27:17.629Z
+ - Node: v20.20.2
  - Fixture schema: v1
- - Fixtures: 34
+ - Fixtures: 39
  - Languages: 4 (en, ja, ko, zh)
- - Overall accuracy: **100.0%** [89.8%–100.0%] (n=34, Wilson score interval, 95%)
+ - Overall accuracy: **100.0%** [91.0%–100.0%] (n=39, Wilson score interval, 95%)
  - Source fixtures: `tests/fixtures/suspect-zones/**`
  - Regression ranges: `tests/fixtures/suspect-zones/expected-ranges.json` (refresh with `npm run benchmark:ranges`)
  - Reproduce: `npm run benchmark:report`
  - Raw JSON: [latest.json](latest.json)
- - Detector comparison harness: [detector-comparison.md](detector-comparison.md)
+ - Detector comparison protocol: [detector-comparison.md](detector-comparison.md)
  - 2025+ re-baseline plan: [docs/research/2025-rebaseline-plan.md](../research/2025-rebaseline-plan.md)
 
  ## Language breakdown
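
The report quotes accuracy with a 95% Wilson score interval and derives precision, recall, and F1 from TP/FP/FN/TN counts. The following is a minimal sketch of how such figures can be reproduced, assuming the standard Wilson formula with z = 1.96. The function names `wilsonInterval`, `rowMetrics`, and `pct` are hypothetical and are not taken from the package's scripts, which may compute these values differently.

```javascript
// Hypothetical helper: Wilson score interval for `successes` out of `n` trials.
// z = 1.96 is an assumed value corresponding to roughly 95% coverage.
function wilsonInterval(successes, n, z = 1.96) {
  if (n === 0) return { lower: 0, upper: 1 };
  const p = successes / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;
  const center = (p + z2 / (2 * n)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;
  // Clamp to [0, 1] to absorb floating-point overshoot at p = 0 or p = 1.
  return { lower: Math.max(0, center - half), upper: Math.min(1, center + half) };
}

// Hypothetical helper: table-row metrics from confusion counts.
// Precision, recall, and F1 fall back to 0 when their denominator is 0.
function rowMetrics({ tp, fp, fn, tn }) {
  const n = tp + fp + fn + tn;
  const precision = tp + fp === 0 ? 0 : tp / (tp + fp);
  const recall = tp + fn === 0 ? 0 : tp / (tp + fn);
  const f1 =
    precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  return {
    n,
    accuracy: (tp + tn) / n,
    precision,
    recall,
    f1,
    ci: wilsonInterval(tp + tn, n),
  };
}

// Format a fraction as a percentage with one decimal place.
const pct = (x) => `${(x * 100).toFixed(1)}%`;

// 39 of 39 fixtures correct: the lower bound reduces to 1 / (1 + z^2 / n).
const overall = wilsonInterval(39, 39);
console.log(pct(overall.lower), pct(overall.upper)); // 91.0% 100.0%
```

With a perfect score the lower bound depends only on the sample size, which is why adding five fixtures moves the reported interval from 89.8% to 91.0% without any change in accuracy.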
@@ -25,26 +25,44 @@ This is the latest checked-in report for patina's deterministic suspect-zone ben
  | lang | fixtures | accuracy | 95% CI | precision | recall | f1 | TP | FP | FN | TN |
  |---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
  | en | 11 | 100.0% | 74.1%–100.0% | 100.0% | 100.0% | 1 | 6 | 0 | 0 | 5 |
- | ja | 6 | 100.0% | 61.0%–100.0% | 100.0% | 100.0% | 1 | 3 | 0 | 0 | 3 |
- | ko | 11 | 100.0% | 74.1%–100.0% | 100.0% | 100.0% | 1 | 6 | 0 | 0 | 5 |
- | zh | 6 | 100.0% | 61.0%–100.0% | 100.0% | 100.0% | 1 | 3 | 0 | 0 | 3 |
+ | ja | 8 | 100.0% | 67.6%–100.0% | 100.0% | 100.0% | 1 | 4 | 0 | 0 | 4 |
+ | ko | 12 | 100.0% | 75.8%–100.0% | 100.0% | 100.0% | 1 | 7 | 0 | 0 | 5 |
+ | zh | 8 | 100.0% | 67.6%–100.0% | 100.0% | 100.0% | 1 | 4 | 0 | 0 | 4 |
 
  ## Detector breakdown
 
  | lang | detector | fixtures | accuracy | 95% CI | precision | recall | f1 | TP | FP | FN | TN |
  |---|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
  | en | burstiness | 11 | 100.0% | 74.1%–100.0% | 100.0% | 100.0% | 1 | 6 | 0 | 0 | 5 |
+ | en | koDiagnostics | 11 | 45.5% | 21.3%–72.0% | 0.0% | 0.0% | 0 | 0 | 0 | 6 | 5 |
  | en | lexicon | 11 | 45.5% | 21.3%–72.0% | 0.0% | 0.0% | 0 | 0 | 0 | 6 | 5 |
  | en | mattr | 11 | 45.5% | 21.3%–72.0% | 0.0% | 0.0% | 0 | 0 | 0 | 6 | 5 |
- | ja | burstiness | 6 | 100.0% | 61.0%–100.0% | 100.0% | 100.0% | 1 | 3 | 0 | 0 | 3 |
- | ja | lexicon | 6 | 50.0% | 18.8%–81.2% | 0.0% | 0.0% | 0 | 0 | 0 | 3 | 3 |
- | ja | mattr | 6 | 50.0% | 18.8%–81.2% | 0.0% | 0.0% | 0 | 0 | 0 | 3 | 3 |
- | ko | burstiness | 11 | 100.0% | 74.1%–100.0% | 100.0% | 100.0% | 1 | 6 | 0 | 0 | 5 |
- | ko | lexicon | 11 | 81.8% | 52.3%–94.9% | 100.0% | 66.7% | 0.8 | 4 | 0 | 2 | 5 |
- | ko | mattr | 11 | 45.5% | 21.3%–72.0% | 0.0% | 0.0% | 0 | 0 | 0 | 6 | 5 |
- | zh | burstiness | 6 | 100.0% | 61.0%–100.0% | 100.0% | 100.0% | 1 | 3 | 0 | 0 | 3 |
- | zh | lexicon | 6 | 50.0% | 18.8%–81.2% | 0.0% | 0.0% | 0 | 0 | 0 | 3 | 3 |
- | zh | mattr | 6 | 50.0% | 18.8%–81.2% | 0.0% | 0.0% | 0 | 0 | 0 | 3 | 3 |
+ | ja | burstiness | 8 | 87.5% | 52.9%–97.8% | 100.0% | 75.0% | 0.86 | 3 | 0 | 1 | 4 |
+ | ja | koDiagnostics | 8 | 50.0% | 21.5%–78.5% | 0.0% | 0.0% | 0 | 0 | 0 | 4 | 4 |
+ | ja | lexicon | 8 | 62.5% | 30.6%–86.3% | 100.0% | 25.0% | 0.4 | 1 | 0 | 3 | 4 |
+ | ja | mattr | 8 | 50.0% | 21.5%–78.5% | 0.0% | 0.0% | 0 | 0 | 0 | 4 | 4 |
+ | ko | burstiness | 12 | 91.7% | 64.6%–98.5% | 100.0% | 85.7% | 0.92 | 6 | 0 | 1 | 5 |
+ | ko | koDiagnostics | 12 | 58.3% | 32.0%–80.7% | 100.0% | 28.6% | 0.44 | 2 | 0 | 5 | 5 |
+ | ko | lexicon | 12 | 41.7% | 19.3%–68.0% | 0.0% | 0.0% | 0 | 0 | 0 | 7 | 5 |
+ | ko | mattr | 12 | 41.7% | 19.3%–68.0% | 0.0% | 0.0% | 0 | 0 | 0 | 7 | 5 |
+ | zh | burstiness | 8 | 87.5% | 52.9%–97.8% | 100.0% | 75.0% | 0.86 | 3 | 0 | 1 | 4 |
+ | zh | koDiagnostics | 8 | 50.0% | 21.5%–78.5% | 0.0% | 0.0% | 0 | 0 | 0 | 4 | 4 |
+ | zh | lexicon | 8 | 62.5% | 30.6%–86.3% | 100.0% | 25.0% | 0.4 | 1 | 0 | 3 | 4 |
+ | zh | mattr | 8 | 50.0% | 21.5%–78.5% | 0.0% | 0.0% | 0 | 0 | 0 | 4 | 4 |
+
+ ## Ranking diagnostics
+
+ Signal-score ranking shows whether the diagnostic `signal_score` separates hot
+ fixtures from natural fixtures before any threshold is chosen. It is computed
+ only on the checked-in fixture corpus and is not a broader model-era claim.
+
+ | scope | fixtures | positives | negatives | ROC-AUC | PR-AUC | best threshold | precision | recall | best F1 | accuracy |
+ |---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
+ | overall | 39 | 21 | 18 | 1 | 1 | 3.846 | 100.0% | 100.0% | 1 | 100.0% |
+ | en | 11 | 6 | 5 | 1 | 1 | 68.994 | 100.0% | 100.0% | 1 | 100.0% |
+ | ja | 8 | 4 | 4 | 1 | 1 | 23.167 | 100.0% | 100.0% | 1 | 100.0% |
+ | ko | 12 | 7 | 5 | 1 | 1 | 3.846 | 100.0% | 100.0% | 1 | 100.0% |
+ | zh | 8 | 4 | 4 | 1 | 1 | 6.772 | 100.0% | 100.0% | 1 | 100.0% |
 
  ## Sample sizes
 
@@ -52,12 +70,12 @@ This is the latest checked-in report for patina's deterministic suspect-zone ben
  |---|---|---:|
  | en | ai | 6 |
  | en | natural | 5 |
- | ja | ai | 3 |
- | ja | natural | 3 |
- | ko | ai | 6 |
+ | ja | ai | 4 |
+ | ja | natural | 4 |
+ | ko | ai | 7 |
  | ko | natural | 5 |
- | zh | ai | 3 |
- | zh | natural | 3 |
+ | zh | ai | 4 |
+ | zh | natural | 4 |
 
  ## Misclassifications
 
@@ -65,48 +83,54 @@ All fixtures classified correctly.
 
  ## Fixture log
 
- | fixture | lang | class | expected | predicted | ok | CV band | MATTR band | lexicon/1k | sample lexicon hits |
- |---|---|---|---|---|---:|---:|---:|---:|---|
- | en-ai-01 | en | ai | hot | hot | ✓ | 0.058 low | 0.928 high | 0 | — |
- | en-ai-02 | en | ai | hot | hot | ✓ | 0.09 low | 0.841 high | 0 | — |
- | en-ai-03 | en | ai | hot | hot | ✓ | 0.065 low | 0.828 high | 0 | — |
- | en-ai-04 | en | ai | hot | hot | ✓ | 0.07 low | 0.84 high | 0 | — |
- | en-ai-05 | en | ai | hot | hot | ✓ | 0.093 low | 0.879 high | 0 | — |
- | en-ai-06-chat-register | en | ai | hot | hot | ✓ | 0.034 low | 0.814 high | 0 | — |
- | en-nat-01 | en | natural | cold | cold | ✓ | 0.881 high | 0.898 high | 0 | — |
- | en-nat-02 | en | natural | cold | cold | ✓ | 0.886 high | 0.884 high | 0 | — |
- | en-nat-03 | en | natural | cold | cold | ✓ | 0.914 high | 0.882 high | 0 | — |
- | en-nat-04 | en | natural | cold | cold | ✓ | 0.494 mid | 0.854 high | 0 | — |
- | en-nat-05 | en | natural | cold | cold | ✓ | 0.853 high | 0.875 high | 0 | — |
- | ja-ai-01 | ja | ai | hot | hot | ✓ | 0.045 low | 0.833 high | 0 | — |
- | ja-ai-02 | ja | ai | hot | hot | ✓ | 0.23 low | 0.785 high | 0 | — |
- | ja-ai-03 | ja | ai | hot | hot | ✓ | 0.063 low | 0.795 high | 0 | — |
- | ja-nat-01 | ja | natural | cold | cold | ✓ | 0.487 mid | 0.719 high | 0 | |
- | ja-nat-02 | ja | natural | cold | cold | ✓ | 0.65 high | 0.796 high | 0 | — |
- | ja-nat-03 | ja | natural | cold | cold | ✓ | 0.395 mid | 0.807 high | 0 | — |
- | ko-ai-01 | ko | ai | hot | hot | ✓ | 0.093 low | 0.977 high | 23.256 | 추세 |
- | ko-ai-02 | ko | ai | hot | hot | ✓ | 0.073 low | 0.82 high | 19.608 | 환경 |
- | ko-ai-03 | ko | ai | hot | hot | ✓ | 0.073 low | 0.79 high | 19.608 | 추세 |
- | ko-ai-04 | ko | ai | hot | hot | ✓ | 0.098 low | 0.853 high | 0 | — |
- | ko-ai-05 | ko | ai | hot | hot | ✓ | 0.098 low | 0.853 high | 0 | |
- | ko-ai-06-chat-register | ko | ai | hot | hot | ✓ | 0.081 low | 1 high | 21.739 | 흐름 |
- | ko-nat-01 | ko | natural | cold | cold | ✓ | 0.717 high | 1 high | 0 | — |
- | ko-nat-02 | ko | natural | cold | cold | ✓ | 0.552 high | 1 high | 0 | — |
- | ko-nat-03 | ko | natural | cold | cold | ✓ | 0.68 high | 1 high | 0 | — |
- | ko-nat-04 | ko | natural | cold | cold | ✓ | 0.771 high | 0.975 high | 0 | — |
- | ko-nat-05 | ko | natural | cold | cold | ✓ | 0.996 high | 0.998 high | 0 | — |
- | zh-ai-01 | zh | ai | hot | hot | ✓ | 0.062 low | 0.902 high | 0 | — |
- | zh-ai-02 | zh | ai | hot | hot | ✓ | 0.28 low | 0.734 high | 0 | — |
- | zh-ai-03 | zh | ai | hot | hot | ✓ | 0.083 low | 0.933 high | 0 | — |
- | zh-nat-01 | zh | natural | cold | cold | ✓ | 0.506 high | 0.875 high | 0 | — |
- | zh-nat-02 | zh | natural | cold | cold | ✓ | 0.528 high | 0.936 high | 0 | — |
- | zh-nat-03 | zh | natural | cold | cold | ✓ | 0.58 high | 0.907 high | 0 | — |
+ | fixture | lang | class | expected | predicted | ok | signal | CV band | MATTR band | lexicon/1k | KO diagnostic | sample lexicon hits |
+ |---|---|---|---|---|---:|---:|---:|---:|---:|---|---|
+ | en-ai-01 | en | ai | hot | hot | ✓ | 80.512 | 0.058 low | 0.928 high | 0 | cold | — |
+ | en-ai-02 | en | ai | hot | hot | ✓ | 69.883 | 0.09 low | 0.841 high | 0 | cold | — |
+ | en-ai-03 | en | ai | hot | hot | ✓ | 78.495 | 0.065 low | 0.828 high | 0 | cold | — |
+ | en-ai-04 | en | ai | hot | hot | ✓ | 76.717 | 0.07 low | 0.84 high | 0 | cold | — |
+ | en-ai-05 | en | ai | hot | hot | ✓ | 68.994 | 0.093 low | 0.879 high | 0 | cold | — |
+ | en-ai-06-chat-register | en | ai | hot | hot | ✓ | 88.701 | 0.034 low | 0.814 high | 0 | cold | — |
+ | en-nat-01 | en | natural | cold | cold | ✓ | 0 | 0.881 high | 0.898 high | 0 | cold | — |
+ | en-nat-02 | en | natural | cold | cold | ✓ | 0 | 0.886 high | 0.884 high | 0 | cold | — |
+ | en-nat-03 | en | natural | cold | cold | ✓ | 0 | 0.914 high | 0.882 high | 0 | cold | — |
+ | en-nat-04 | en | natural | cold | cold | ✓ | 0 | 0.494 mid | 0.854 high | 0 | cold | — |
+ | en-nat-05 | en | natural | cold | cold | ✓ | 0 | 0.853 high | 0.875 high | 0 | cold | — |
+ | ja-ai-01 | ja | ai | hot | hot | ✓ | 84.959 | 0.045 low | 0.833 high | 0 | cold | — |
+ | ja-ai-02 | ja | ai | hot | hot | ✓ | 23.167 | 0.23 low | 0.785 high | 0 | cold | — |
+ | ja-ai-03 | ja | ai | hot | hot | ✓ | 79.067 | 0.063 low | 0.795 high | 0 | cold | — |
+ | ja-ai-04-lexicon | ja | ai | hot | hot | ✓ | 100 | 0.56 high | 0.803 high | 63.83 | cold | まとめると, 結論として, 重要なのは, デジタル時代において |
+ | ja-nat-01 | ja | natural | cold | cold | ✓ | 0 | 0.487 mid | 0.719 high | 0 | cold | — |
+ | ja-nat-02 | ja | natural | cold | cold | ✓ | 0 | 0.65 high | 0.796 high | 0 | cold | — |
+ | ja-nat-03 | ja | natural | cold | cold | ✓ | 0 | 0.395 mid | 0.807 high | 0 | cold | — |
+ | ja-nat-04-lexicon-cold | ja | natural | cold | cold | ✓ | 0 | 0.396 mid | 0.752 high | 0 | cold | — |
+ | ko-ai-01 | ko | ai | hot | hot | ✓ | 68.992 | 0.093 low | 0.977 high | 23.256 | cold | 추세 |
+ | ko-ai-02 | ko | ai | hot | hot | ✓ | 75.545 | 0.073 low | 0.82 high | 0 | cold | — |
+ | ko-ai-03 | ko | ai | hot | hot | ✓ | 75.545 | 0.073 low | 0.79 high | 19.608 | cold | 추세 |
+ | ko-ai-04 | ko | ai | hot | hot | ✓ | 67.314 | 0.098 low | 0.853 high | 0 | cold | — |
+ | ko-ai-05 | ko | ai | hot | hot | ✓ | 67.314 | 0.098 low | 0.853 high | 0 | hot: regular-eojeol-length, low-comma-density, low-suffix-class-diversity | — |
+ | ko-ai-06-chat-register | ko | ai | hot | hot | ✓ | 72.887 | 0.081 low | 1 high | 0 | cold | — |
+ | ko-ai-07-ko-diagnostic | ko | ai | hot | hot | ✓ | 3.846 | 0.417 mid | 0.955 high | 0 | hot: regular-eojeol-length, low-comma-density, low-suffix-class-diversity | — |
+ | ko-nat-01 | ko | natural | cold | cold | ✓ | 0 | 0.717 high | 1 high | 0 | cold | — |
+ | ko-nat-02 | ko | natural | cold | cold | ✓ | 0 | 0.552 high | 1 high | 0 | cold | — |
+ | ko-nat-03 | ko | natural | cold | cold | ✓ | 0 | 0.68 high | 1 high | 0 | cold | — |
+ | ko-nat-04 | ko | natural | cold | cold | ✓ | 0 | 0.771 high | 0.975 high | 0 | cold | — |
+ | ko-nat-05 | ko | natural | cold | cold | ✓ | 0 | 0.996 high | 0.998 high | 0 | cold | — |
+ | zh-ai-01 | zh | ai | hot | hot | ✓ | 79.272 | 0.062 low | 0.902 high | 0 | cold | — |
+ | zh-ai-02 | zh | ai | hot | hot | ✓ | 6.772 | 0.28 low | 0.734 high | 0 | cold | — |
+ | zh-ai-03 | zh | ai | hot | hot | ✓ | 72.43 | 0.083 low | 0.933 high | 0 | cold | — |
+ | zh-ai-04-lexicon | zh | ai | hot | hot | ✓ | 100 | 0.748 high | 0.894 high | 92.593 | cold | 总而言之, 总的来说, 值得注意的是, 在数字时代 |
+ | zh-nat-01 | zh | natural | cold | cold | ✓ | 0 | 0.506 high | 0.875 high | 0 | cold | — |
+ | zh-nat-02 | zh | natural | cold | cold | ✓ | 0 | 0.528 high | 0.936 high | 0 | cold | — |
+ | zh-nat-03 | zh | natural | cold | cold | ✓ | 0 | 0.58 high | 0.907 high | 0 | cold | — |
+ | zh-nat-04-lexicon-cold | zh | natural | cold | cold | ✓ | 0 | 0.387 mid | 0.931 high | 0 | cold | — |
 
  ## How to read this
 
- - **Hot** means at least one deterministic signal crossed the benchmark threshold: low burstiness CV, low MATTR, or AI-lexicon density.
+ - **Hot** means at least one deterministic signal crossed the benchmark threshold: low burstiness CV, low MATTR, AI-lexicon density, or the conservative Korean diagnostic composite.
  - **Cold** means the fixture did not cross those thresholds.
+ - **Signal** is the 0–100 diagnostic strength of the strongest deterministic trigger. It supports ranking diagnostics but does not replace the binary hot/cold regression gate.
  - The report is meant for regression tracking and contributor discussion, not for authorship accusation.
- - This deterministic corpus is intentionally small (34 fixtures across en, ja, ko, zh); do not treat 100% fixture accuracy as generalization to new models, genres, or edited AI text.
+ - This deterministic corpus is intentionally small (39 fixtures across en, ja, ko, zh); do not treat 100% fixture accuracy as generalization to new models, genres, or edited AI text.
  - Confidence intervals use Wilson score intervals for the checked-in fixture set; external threshold sweeps and 2025+ model rebaselines are separate research follow-ups tracked in [2025+ Re-baseline Plan](../research/2025-rebaseline-plan.md).
  - Broader methodology notes live in [AI/Human Metrics Research](../research/ai-human-metrics.md) and [Quality Checks](../../tests/quality/README.md).
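
The ranking diagnostics in the new report describe how well the signal score separates hot fixtures from natural ones before a threshold is picked. A rough sketch of that idea follows, using the Korean signal values from the fixture log above. It assumes ROC-AUC is the fraction of (positive, negative) pairs ranked correctly, with ties counted as half, and that the best threshold is the lowest score that maximizes F1 under a `signal >= threshold` rule. The names `rocAuc` and `bestF1Threshold` are hypothetical, and the package's own `ranking-metrics.mjs` may use a different method.

```javascript
// Hypothetical helper: ROC-AUC by brute-force pair comparison.
// Each item is { signal: number, hot: boolean }, where `hot` is the expected class.
function rocAuc(items) {
  const pos = items.filter((d) => d.hot).map((d) => d.signal);
  const neg = items.filter((d) => !d.hot).map((d) => d.signal);
  if (pos.length === 0 || neg.length === 0) return NaN;
  let wins = 0;
  for (const p of pos) {
    for (const n of neg) {
      // A correctly ordered pair scores 1, a tie scores 0.5.
      wins += p > n ? 1 : p === n ? 0.5 : 0;
    }
  }
  return wins / (pos.length * neg.length);
}

// Hypothetical helper: sweep every distinct score as a threshold and keep the
// lowest one that yields the highest F1 (assumed rule: hot if signal >= threshold).
function bestF1Threshold(items) {
  const candidates = [...new Set(items.map((d) => d.signal))].sort((a, b) => a - b);
  let best = { threshold: null, f1: -1 };
  for (const t of candidates) {
    let tp = 0;
    let fp = 0;
    let fn = 0;
    for (const d of items) {
      const predictedHot = d.signal >= t;
      if (predictedHot && d.hot) tp++;
      else if (predictedHot && !d.hot) fp++;
      else if (!predictedHot && d.hot) fn++;
    }
    const f1 = tp === 0 ? 0 : (2 * tp) / (2 * tp + fp + fn);
    if (f1 > best.f1) best = { threshold: t, f1 };
  }
  return best;
}

// Korean fixtures from the fixture log: seven AI (hot) and five natural (cold).
const ko = [
  ...[68.992, 75.545, 75.545, 67.314, 67.314, 72.887, 3.846].map((signal) => ({
    signal,
    hot: true,
  })),
  ...[0, 0, 0, 0, 0].map((signal) => ({ signal, hot: false })),
];

console.log(rocAuc(ko)); // 1
console.log(bestF1Threshold(ko)); // { threshold: 3.846, f1: 1 }
```

Under these assumptions the sweep lands on 3.846, the score of `ko-ai-07-ko-diagnostic`, which matches the `ko` row of the ranking table. Because every natural fixture scores 0, any positive threshold up to the weakest hot score separates the two classes on this corpus.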