patina-cli 3.11.0

Files changed (180)
  1. package/.patina.default.yaml +211 -0
  2. package/CHANGELOG.md +265 -0
  3. package/LICENSE +21 -0
  4. package/README.md +319 -0
  5. package/README_JA.md +254 -0
  6. package/README_KR.md +253 -0
  7. package/README_ZH.md +254 -0
  8. package/SKILL-MAX.md +455 -0
  9. package/SKILL.md +730 -0
  10. package/assets/brand/patina-icon.svg +9 -0
  11. package/assets/brand/patina-logo.svg +17 -0
  12. package/assets/social/patina-before-after.svg +46 -0
  13. package/assets/social/patina-og.svg +31 -0
  14. package/bin/patina.js +9 -0
  15. package/core/scoring.md +657 -0
  16. package/core/standalone-prompt.md +364 -0
  17. package/core/stylometry.md +754 -0
  18. package/core/voice.md +163 -0
  19. package/docs/AUTHENTICATION.md +105 -0
  20. package/docs/AUTHENTICATION_KR.md +105 -0
  21. package/docs/BRANDING.md +37 -0
  22. package/docs/CLI.md +80 -0
  23. package/docs/COMPARISON.md +38 -0
  24. package/docs/COOKBOOK.md +173 -0
  25. package/docs/DEMO.md +40 -0
  26. package/docs/ETHICS.md +27 -0
  27. package/docs/EXAMPLES.md +130 -0
  28. package/docs/EXAMPLES_KR.md +130 -0
  29. package/docs/EXIT-CODES.md +25 -0
  30. package/docs/FAQ.md +67 -0
  31. package/docs/FAQ_KR.md +65 -0
  32. package/docs/FLAG-PARITY.md +53 -0
  33. package/docs/GLOSSARY.md +123 -0
  34. package/docs/PATTERNS-EN.md +718 -0
  35. package/docs/PATTERNS-JA.md +706 -0
  36. package/docs/PATTERNS-KO.md +707 -0
  37. package/docs/PATTERNS-ZH.md +706 -0
  38. package/docs/PATTERNS.md +22 -0
  39. package/docs/ROADMAP.md +315 -0
  40. package/docs/audits/2026-05-deep-research.md +290 -0
  41. package/docs/benchmarks/detector-comparison.json +442 -0
  42. package/docs/benchmarks/detector-comparison.md +65 -0
  43. package/docs/benchmarks/latest.json +988 -0
  44. package/docs/benchmarks/latest.md +112 -0
  45. package/docs/integrations/docker.md +19 -0
  46. package/docs/integrations/github-action.md +59 -0
  47. package/docs/integrations/pre-commit.md +77 -0
  48. package/docs/integrations/release.md +43 -0
  49. package/docs/internal/HARNESS.md +14 -0
  50. package/docs/internal/README.md +14 -0
  51. package/docs/internal/WARP.md +23 -0
  52. package/docs/research/2025-rebaseline-plan.md +89 -0
  53. package/docs/research/ai-human-metrics.md +380 -0
  54. package/docs/social/gstack-cardnews.html +236 -0
  55. package/docs/social/gstack-cardnews.md +88 -0
  56. package/docs/social/gstack-thread.md +106 -0
  57. package/docs/social/patina-launch-copy.md +227 -0
  58. package/docs/superpowers/specs/2026-04-03-meaning-preservation-design.md +299 -0
  59. package/lexicon/ai-en.md +162 -0
  60. package/lexicon/ai-ko.md +159 -0
  61. package/package.json +100 -0
  62. package/patina-max/SKILL.md +523 -0
  63. package/patina-max/composite.py +457 -0
  64. package/patterns/en-communication.md +89 -0
  65. package/patterns/en-content.md +133 -0
  66. package/patterns/en-filler.md +113 -0
  67. package/patterns/en-language.md +163 -0
  68. package/patterns/en-structure.md +173 -0
  69. package/patterns/en-style.md +139 -0
  70. package/patterns/en-viral-hook.md +211 -0
  71. package/patterns/ja-communication.md +101 -0
  72. package/patterns/ja-content.md +153 -0
  73. package/patterns/ja-filler.md +123 -0
  74. package/patterns/ja-language.md +190 -0
  75. package/patterns/ja-structure.md +142 -0
  76. package/patterns/ja-style.md +147 -0
  77. package/patterns/ja-viral-hook.md +216 -0
  78. package/patterns/ko-communication.md +98 -0
  79. package/patterns/ko-content.md +154 -0
  80. package/patterns/ko-filler.md +105 -0
  81. package/patterns/ko-language.md +182 -0
  82. package/patterns/ko-structure.md +147 -0
  83. package/patterns/ko-style.md +146 -0
  84. package/patterns/ko-viral-hook.md +211 -0
  85. package/patterns/zh-communication.md +101 -0
  86. package/patterns/zh-content.md +153 -0
  87. package/patterns/zh-filler.md +118 -0
  88. package/patterns/zh-language.md +173 -0
  89. package/patterns/zh-structure.md +145 -0
  90. package/patterns/zh-style.md +159 -0
  91. package/patterns/zh-viral-hook.md +216 -0
  92. package/profiles/academic.md +53 -0
  93. package/profiles/blog.md +81 -0
  94. package/profiles/casual-conversation.md +105 -0
  95. package/profiles/code-comment.md +104 -0
  96. package/profiles/commit-message.md +99 -0
  97. package/profiles/default.md +62 -0
  98. package/profiles/email.md +52 -0
  99. package/profiles/formal.md +98 -0
  100. package/profiles/instructional.md +80 -0
  101. package/profiles/legal.md +57 -0
  102. package/profiles/marketing.md +56 -0
  103. package/profiles/medical.md +53 -0
  104. package/profiles/narrative.md +79 -0
  105. package/profiles/release-notes.md +98 -0
  106. package/profiles/social.md +56 -0
  107. package/profiles/technical.md +53 -0
  108. package/scripts/benchmark-report.mjs +252 -0
  109. package/scripts/check-release-metadata.mjs +48 -0
  110. package/scripts/detector-comparison.mjs +267 -0
  111. package/scripts/lint.mjs +40 -0
  112. package/scripts/precommit-score.mjs +31 -0
  113. package/scripts/prose-score.mjs +186 -0
  114. package/scripts/update-benchmark-ranges.mjs +108 -0
  115. package/src/api.js +330 -0
  116. package/src/auth.js +105 -0
  117. package/src/backends/claude-cli.js +112 -0
  118. package/src/backends/codex-cli.js +121 -0
  119. package/src/backends/contract.js +21 -0
  120. package/src/backends/gemini-cli.js +135 -0
  121. package/src/backends/index.js +159 -0
  122. package/src/cache.js +106 -0
  123. package/src/cli.js +1280 -0
  124. package/src/commands/doctor.js +229 -0
  125. package/src/commands/init.js +208 -0
  126. package/src/config.js +126 -0
  127. package/src/errors.js +53 -0
  128. package/src/features/index.js +96 -0
  129. package/src/features/lexicon.js +90 -0
  130. package/src/features/segment.js +49 -0
  131. package/src/features/stylometry.js +50 -0
  132. package/src/loader.js +103 -0
  133. package/src/logger.js +70 -0
  134. package/src/manifest.js +162 -0
  135. package/src/max-mode.js +207 -0
  136. package/src/ouroboros.js +233 -0
  137. package/src/output.js +480 -0
  138. package/src/prompt-builder.js +409 -0
  139. package/src/providers.js +100 -0
  140. package/src/scoring.js +531 -0
  141. package/src/security.js +133 -0
  142. package/tests/fixtures/suspect-zones/en/ai/en-ai-01.md +16 -0
  143. package/tests/fixtures/suspect-zones/en/ai/en-ai-02.md +16 -0
  144. package/tests/fixtures/suspect-zones/en/ai/en-ai-03.md +17 -0
  145. package/tests/fixtures/suspect-zones/en/ai/en-ai-04.md +15 -0
  146. package/tests/fixtures/suspect-zones/en/ai/en-ai-05.md +16 -0
  147. package/tests/fixtures/suspect-zones/en/ai/en-ai-06-chat-register.md +16 -0
  148. package/tests/fixtures/suspect-zones/en/natural/en-nat-01.md +15 -0
  149. package/tests/fixtures/suspect-zones/en/natural/en-nat-02.md +15 -0
  150. package/tests/fixtures/suspect-zones/en/natural/en-nat-03.md +15 -0
  151. package/tests/fixtures/suspect-zones/en/natural/en-nat-04.md +15 -0
  152. package/tests/fixtures/suspect-zones/en/natural/en-nat-05.md +15 -0
  153. package/tests/fixtures/suspect-zones/expected-ranges.json +939 -0
  154. package/tests/fixtures/suspect-zones/ja/ai/ja-ai-01.md +11 -0
  155. package/tests/fixtures/suspect-zones/ja/ai/ja-ai-02.md +11 -0
  156. package/tests/fixtures/suspect-zones/ja/ai/ja-ai-03.md +11 -0
  157. package/tests/fixtures/suspect-zones/ja/natural/ja-nat-01.md +11 -0
  158. package/tests/fixtures/suspect-zones/ja/natural/ja-nat-02.md +11 -0
  159. package/tests/fixtures/suspect-zones/ja/natural/ja-nat-03.md +11 -0
  160. package/tests/fixtures/suspect-zones/ko/ai/ko-ai-01.md +14 -0
  161. package/tests/fixtures/suspect-zones/ko/ai/ko-ai-02.md +16 -0
  162. package/tests/fixtures/suspect-zones/ko/ai/ko-ai-03.md +15 -0
  163. package/tests/fixtures/suspect-zones/ko/ai/ko-ai-04.md +15 -0
  164. package/tests/fixtures/suspect-zones/ko/ai/ko-ai-05.md +16 -0
  165. package/tests/fixtures/suspect-zones/ko/ai/ko-ai-06-chat-register.md +16 -0
  166. package/tests/fixtures/suspect-zones/ko/natural/ko-nat-01.md +15 -0
  167. package/tests/fixtures/suspect-zones/ko/natural/ko-nat-02.md +15 -0
  168. package/tests/fixtures/suspect-zones/ko/natural/ko-nat-03.md +15 -0
  169. package/tests/fixtures/suspect-zones/ko/natural/ko-nat-04.md +14 -0
  170. package/tests/fixtures/suspect-zones/ko/natural/ko-nat-05.md +15 -0
  171. package/tests/fixtures/suspect-zones/zh/ai/zh-ai-01.md +11 -0
  172. package/tests/fixtures/suspect-zones/zh/ai/zh-ai-02.md +11 -0
  173. package/tests/fixtures/suspect-zones/zh/ai/zh-ai-03.md +11 -0
  174. package/tests/fixtures/suspect-zones/zh/natural/zh-nat-01.md +11 -0
  175. package/tests/fixtures/suspect-zones/zh/natural/zh-nat-02.md +11 -0
  176. package/tests/fixtures/suspect-zones/zh/natural/zh-nat-03.md +11 -0
  177. package/tests/quality/README.md +121 -0
  178. package/tests/quality/benchmark.mjs +306 -0
  179. package/tests/quality/detectors.manual.example.json +31 -0
  180. package/tests/quality/dogfood.mjs +44 -0
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: ja-ai-01
3
+ language: ja
4
+ class: ai
5
+ expected_hot: true
6
+ why_designed_this_way: |
7
+ Burstiness fixture for whitespace-free Japanese. The sentence lengths are intentionally even so the ja character-token fallback catches low burstiness.
8
+ topic: 地域のカフェ
9
+ ---
10
+
11
+ 地域のカフェは住民の交流拠点として注目されている。店内では読書や仕事をする利用者も増えている。季節限定のメニューは来店動機の一つになっている。小規模なイベントも地域の関係づくりを支えている。今後は居心地のよい空間設計がさらに重要になる。
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: ja-ai-02
3
+ language: ja
4
+ class: ai
5
+ expected_hot: true
6
+ why_designed_this_way: |
7
+ Repetition fixture. Reuses チーム/開発/機能/改善/作業 across the paragraph; sentence lengths are not perfectly identical, so lexical cycling should remain part of the signal.
8
+ topic: ソフトウェア開発
9
+ ---
10
+
11
+ 開発チームは新しい機能の改善作業を続けている。チーム内では機能ごとの開発状況を共有し、改善の優先順位を確認している。各メンバーは開発作業の進捗を記録する。機能改善の流れをそろえることで、チーム全体の作業効率も安定しやすくなる。開発チームは次の機能改善にも同じ作業手順を用いる予定だ。
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: ja-ai-03
3
+ language: ja
4
+ class: ai
5
+ expected_hot: true
6
+ why_designed_this_way: |
7
+ Chat-register AI fixture with procedural, evenly paced sentences and no private details.
8
+ topic: ドキュメント確認
9
+ ---
10
+
11
+ 確認ツールは公開文書から対象の段落を読み取る。次に各段落の文体信号と語彙信号を記録する。レポートは言語別の件数と判定結果を整理する。保守者は結果を見ながら新しい例を追加できる。この手順は外部サービスへ本文を送らずに実行される。
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: ja-nat-01
3
+ language: ja
4
+ class: natural
5
+ expected_hot: false
6
+ why_designed_this_way: |
7
+ Natural Japanese counterexample with uneven rhythm, concrete sensory details, and first-person texture.
8
+ topic: 地域のカフェ
9
+ ---
10
+
11
+ 駅前のカフェは、正直そこまでおしゃれじゃない。椅子は少し沈むし、窓際の席は冬になると足元が冷える。でも雨の日に入ると、入口のマットがびしょびしょで、店主が黙ってタオルを替えているのが見える。あれが好きだ。新しい豆の説明より、常連のおじいさんが新聞を畳む音のほうを覚えている。
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: ja-nat-02
3
+ language: ja
4
+ class: natural
5
+ expected_hot: false
6
+ why_designed_this_way: |
7
+ Paired with ja-ai-02. Irregular sentence lengths and concrete debugging details should keep deterministic signals cold.
8
+ topic: ソフトウェア開発
9
+ ---
10
+
11
+ リリース前夜に一番役に立ったのは、立派なプロセス表ではなく、先輩がホワイトボードの端に書いた小さな矢印だった。そこをたどったらキャッシュの消し忘れが見つかった。笑うしかない。朝までに直したけれど、プルリクのコメント欄には眠気のせいで変な敬語が残っている。今見ても少し恥ずかしい。
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: ja-nat-03
3
+ language: ja
4
+ class: natural
5
+ expected_hot: false
6
+ why_designed_this_way: |
7
+ Natural public-document review anecdote with short fragments, varied phrasing, and a clear opinionated voice.
8
+ topic: ドキュメント確認
9
+ ---
10
+
11
+ レビューで困るのは、文章が下手なことよりも、妙に整いすぎていることだ。見出しは立派なのに、手順の三番目で急に画面名が古くなる。そこで止まる。誰かのメモが一行でも残っていれば、たぶん迷わないのに。私は完璧な説明より、直した人の手の跡が少し見える文書のほうを信じる。
@@ -0,0 +1,14 @@
1
+ ---
2
+ fixture_id: ko-ai-01
3
+ language: ko
4
+ class: ai
5
+ expected_hot: true
6
+ why_designed_this_way: |
7
+ Burstiness only. Sentence 어절 counts: 13, 12, 14, 13, 12 → mean ≈ 12.8, stddev ≈ 0.75, CV ≈ 0.059.
8
+ Uniformity is the sole AI signal — no chatbot phrases, no ~적 stacking, no hype vocabulary,
9
+ no connector overload, no bold/lists. MATTR is reasonable because content words vary across
10
+ sentences (커피, 카페, 소비자, 시장, 수입) — not a lexical repetition case.
11
+ topic: 한국 커피 산업
12
+ ---
13
+
14
+ 한국의 커피 소비량은 지난 십 년 사이 눈에 띄게 늘어났다. 전국 곳곳에 카페가 들어서면서 소비자 선택지도 함께 넓어졌다. 원두 수입액도 같은 기간 꾸준한 상승세를 기록하고 있다. 젊은 세대를 중심으로 스페셜티 커피에 대한 관심이 높아지는 추세다. 국내 로스터리 브랜드들도 시장 안에서 자리를 잡아가고 있다.
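The burstiness arithmetic quoted in the ko-ai-01 note (어절 counts 13, 12, 14, 13, 12) can be checked with a few lines of JavaScript. This is a minimal sketch that assumes a population standard deviation; `coefficientOfVariation` is a hypothetical helper, not a function from the package, and the analyzer in `src/features/stylometry.js` may compute the value differently.

```javascript
// Hypothetical helper, not part of patina-cli.
// Assumption: population standard deviation (divide by n, not n - 1).
function coefficientOfVariation(counts) {
  if (counts.length === 0) return 0;
  const mean = counts.reduce((sum, n) => sum + n, 0) / counts.length;
  if (mean === 0) return 0;
  const variance =
    counts.reduce((sum, n) => sum + (n - mean) ** 2, 0) / counts.length;
  return Math.sqrt(variance) / mean;
}

// Per-sentence 어절 counts quoted in the ko-ai-01 fixture note.
const koAi01Counts = [13, 12, 14, 13, 12];
console.log(coefficientOfVariation(koAi01Counts).toFixed(3)); // prints "0.058"
```

Under that assumption the result lands close to the CV ≈ 0.059 stated in the note. A sample standard deviation (dividing by n - 1) would give a slightly larger value.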
@@ -0,0 +1,16 @@
1
+ ---
2
+ fixture_id: ko-ai-02
3
+ language: ko
4
+ class: ai
5
+ expected_hot: true
6
+ why_designed_this_way: |
7
+ MATTR only. The paragraph cycles the same narrow vocabulary cluster throughout:
8
+ 원격근무/재택근무, 직원/구성원, 업무/일, 생산성/효율 — four content-word pairs reused
9
+ in every sentence with minor grammatical variation. Estimated raw-token MATTR (window=50):
10
+ ~0.48 (low band < 0.55). Sentence lengths vary slightly (14, 11, 15, 12, 13) giving
11
+ CV ≈ 0.14 (mid-low, not flagged by burstiness alone) — only MATTR triggers hot.
12
+ No catalogued patterns: no chatbot phrases, no ~적 stacking, no 다양한/혁신적 hype terms.
13
+ topic: 원격근무
14
+ ---
15
+
16
+ 원격근무 도입 이후 많은 기업이 직원 업무 방식을 재검토하고 있다. 재택근무 환경에서 구성원들의 업무 효율을 유지하는 것이 과제로 떠올랐다. 원격근무 확산에 따라 직원 관리 방식도 변화가 필요하다는 목소리가 나오고 있다. 재택근무 중인 구성원들의 생산성을 어떻게 측정할 것인지에 대한 논의도 이어지고 있다. 원격 환경에서 직원들의 업무 몰입도를 높이려는 기업들의 노력도 계속되고 있다.
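The MATTR estimate in the ko-ai-02 note refers to a moving-average type-token ratio over a 50-token window. A minimal sketch of that metric follows. `mattr` is a hypothetical helper, whitespace tokenization is assumed for Korean, and the fallback to a plain type-token ratio for texts no longer than the window is an assumption rather than documented package behavior.

```javascript
// Hypothetical helper, not part of patina-cli.
// Averages the type-token ratio (unique tokens / window size) over every
// contiguous window of `windowSize` tokens.
function mattr(tokens, windowSize = 50) {
  if (tokens.length === 0) return 0;
  // Assumption: texts no longer than the window fall back to plain TTR.
  if (tokens.length <= windowSize) {
    return new Set(tokens).size / tokens.length;
  }
  const windowCount = tokens.length - windowSize + 1;
  let total = 0;
  for (let start = 0; start < windowCount; start += 1) {
    const slice = tokens.slice(start, start + windowSize);
    total += new Set(slice).size / windowSize;
  }
  return total / windowCount;
}

// First sentence of the ko-ai-02 body, split on whitespace; a tiny window
// is used here only to keep the demonstration short.
const sampleTokens =
  "원격근무 도입 이후 많은 기업이 직원 업무 방식을 재검토하고 있다".split(/\s+/);
console.log(mattr(sampleTokens, 5));
```

Raw surface tokens are compared here, so inflected forms of the same Korean stem count as distinct types. The note's "raw-token MATTR" wording suggests the fixture was designed with that in mind.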
@@ -0,0 +1,15 @@
1
+ ---
2
+ fixture_id: ko-ai-03
3
+ language: ko
4
+ class: ai
5
+ expected_hot: true
6
+ why_designed_this_way: |
7
+ Both burstiness AND MATTR. Sentence 어절 counts: 11, 12, 11, 12, 11 → CV ≈ 0.044 (very low,
8
+ well under 0.25). Key vocabulary cluster cycling: 자전거/자전거도로, 도시/도심, 인프라/시설,
9
+ 이용자/시민 — repeated across all five sentences with light variation. MATTR estimate ~0.46
10
+ (low band). No catalogued patterns: no structural-repetition markers (not 3-of-3 lists),
11
+ no connector stacking, no chatbot phrases, no hype vocabulary.
12
+ topic: 도시 자전거 인프라
13
+ ---
14
+
15
+ 도시 내 자전거도로 정비가 꾸준히 이루어지면서 이용자 수가 늘고 있다. 도심 자전거 인프라 확충은 시민들의 이동 편의를 높이는 방향으로 진행되고 있다. 자전거도로 시설 개선과 함께 도시 전반의 이용자 안전도 점차 나아지고 있다. 도심 곳곳에 자전거 인프라가 갖춰지면서 시민 이용률도 올라가는 추세다. 자전거 시설이 확대될수록 도시 내 이용자 접근성이 높아질 것으로 보인다.
@@ -0,0 +1,15 @@
1
+ ---
2
+ fixture_id: ko-ai-04
3
+ language: ko
4
+ class: ai
5
+ expected_hot: true
6
+ why_designed_this_way: |
7
+ Burstiness only. Sentence 어절 counts: 10, 11, 10, 11, 10 → mean = 10.4, stddev ≈ 0.49,
8
+ CV ≈ 0.047 (extremely low). Content words span AI 코딩, 개발자, 코드, 도구, 작업 — diverse
9
+ enough that MATTR is mid-range (~0.62, not flagged). The uniformity is the sole signal.
10
+ No catalogued patterns: no 획기적/혁신적 hype, no connector stacking (한편/또한 appears 0 times),
11
+ no chatbot phrases, no bold/emoji/lists.
12
+ topic: AI 코딩 도구
13
+ ---
14
+
15
+ AI 코딩 도구의 사용이 개발자들 사이에서 빠르게 자리를 잡고 있다. 코드 자동 완성 기능은 개발 작업의 속도를 높이는 데 실질적인 도움을 준다. AI 도구를 활용한 개발자들은 반복 코드 작성 시간을 줄일 수 있었다. 코딩 지원 기능이 발전할수록 개발 작업의 범위도 넓어지고 있다. AI 도구의 활용도가 올라가면서 개발자 역량에 대한 기대치도 달라지고 있다.
@@ -0,0 +1,16 @@
1
+ ---
2
+ fixture_id: ko-ai-05
3
+ language: ko
4
+ class: ai
5
+ expected_hot: true
6
+ why_designed_this_way: |
7
+ MATTR only. Heavy lexical cycling on a small content-word set: 등산/산행, 사람들/등산객,
8
+ 산/산길, 인기/관심 — each sentence restates the same conceptual ground with minor wording
9
+ variation. Estimated MATTR ~0.44 (low band, well under 0.55). Sentence lengths show mild
10
+ variation: 13, 15, 12, 14, 16 → CV ≈ 0.11 (not flagged by burstiness). Only MATTR fires.
11
+ No catalogued patterns: no chatbot phrases, no excessive connectors, no 다양한 stacking,
12
+ no hype vocabulary, no structural repetition markers.
13
+ topic: 등산 문화
14
+ ---
15
+
16
+ 최근 몇 년 사이 등산에 관심을 갖는 사람들이 많아졌다. 주말이 되면 산을 찾는 등산객들로 산길이 붐비는 모습을 볼 수 있다. 등산 인기가 오르면서 사람들 사이에서 산행 장비에 대한 관심도 함께 높아지고 있다. 산을 즐기는 등산객이 늘어날수록 산길 관리에 대한 필요성도 커지고 있다. 등산 문화가 확산되면서 산을 찾는 사람들의 연령대도 점점 넓어지고 있다.
@@ -0,0 +1,16 @@
1
+ ---
2
+ fixture_id: ko-ai-06-chat-register
3
+ language: ko
4
+ class: ai
5
+ expected_hot: true
6
+ expected_metrics:
7
+ cv_band: low
8
+ mattr_band: high
9
+ lexicon_density_min: 0
10
+ lexicon_density_max: 80
11
+ why_designed_this_way: |
12
+ Discord bot response-register fixture rewritten into a publicly shareable form. It reflects a real operational context but includes no personal messages or private content.
13
+ topic: Discord bot project update
14
+ ---
15
+
16
+ 런타임 브리지는 컴포넌트 전용 봇 메시지를 작업 큐로 전달합니다. 스케줄러는 생성기가 브랜치를 만들기 전에 각 핸드오프를 기록합니다. 평가기는 변경 diff와 테스트 결과와 저장소 상태를 함께 확인합니다. 이 흐름은 디스코드 스레드를 읽기 쉽게 유지하면서 감사 기록을 남깁니다. 다음 실행에서도 같은 채널 바인딩을 재사용하고 중복 리스너를 피해야 합니다.
@@ -0,0 +1,15 @@
1
+ ---
2
+ fixture_id: ko-nat-01
3
+ language: ko
4
+ class: natural
5
+ expected_hot: false
6
+ why_designed_this_way: |
7
+ Paired with ko-ai-01 (한국 커피 산업). High burstiness: sentence 어절 counts are
8
+ 5, 19, 7, 22, 6 → mean ≈ 11.8, stddev ≈ 7.4, CV ≈ 0.63 (high band, well above 0.50).
9
+ Sentence 1 and 5 are short punchy fragments; sentences 2 and 4 are long and digressive.
10
+ MATTR is also high because the long sentences introduce distinct vocabulary (콩 볶는 냄새,
11
+ 줄 서서 기다리는, 프랜차이즈 공세, 동네 카페, 손님 한 명). No hot signal should fire.
12
+ topic: 한국 커피 산업
13
+ ---
14
+
15
+ 커피 얘기라면 할 말이 많다. 어렸을 때 어머니가 자판기 커피를 두 봉지씩 타서 마시던 기억이 있는데, 그게 지금은 콩 볶는 냄새 맡으러 스페셜티 카페 앞에 줄 서서 기다리는 문화로 바뀌었다. 격세지감이라고 해야 하나. 솔직히 프랜차이즈 공세에 동네 카페들이 얼마나 버텨낼지는 잘 모르겠고, 가끔은 그냥 단골집 손님 한 명으로 조용히 앉아 있고 싶다는 생각도 든다. 커피보다 그 공간이 더 그리운 것 같기도 하고.
@@ -0,0 +1,15 @@
1
+ ---
2
+ fixture_id: ko-nat-02
3
+ language: ko
4
+ class: natural
5
+ expected_hot: false
6
+ why_designed_this_way: |
7
+ Paired with ko-ai-02 (원격근무). High burstiness and high MATTR: sentence 어절 counts are 8, 21, 6, 17, 9 →
8
+ CV ≈ 0.53 (high band). Vocabulary is highly varied: 첫날, 반바지, 슬리퍼, 집중이 안 됐는데,
9
+ 냉장고, 아이 울음소리, 회의, 퇴근 시간, 이메일 — no content-word cluster reused.
10
+ MATTR estimate well above 0.70 (high). Neither signal fires hot.
11
+ Uses 1인칭 voice and an aside (사실은 솔직히), irregular rhythm throughout.
12
+ topic: 원격근무
13
+ ---
14
+
15
+ 재택근무 첫날은 솔직히 말하면 반바지에 슬리퍼 차림으로 집중이 안 됐는데, 몇 달 지나고 보니 오히려 출퇴근 두 시간을 돌려받은 게 삶 전체를 바꿔놓은 느낌이었다. 그게 전부가 아니다. 냉장고 열 때마다 죄책감이 든다거나, 아이 울음소리가 회의 중에 새나가는 것도 이제는 거의 일상이 됐다. 퇴근 시간이 없다는 건 자유이기도 하고 함정이기도 해서, 밤 열한 시에 업무 이메일을 읽고 있는 자신을 발견하면 이게 맞는 건가 싶다. 뭐가 좋은 거냐고 물어보면 대답이 달라지는 날도 있다.
@@ -0,0 +1,15 @@
1
+ ---
2
+ fixture_id: ko-nat-03
3
+ language: ko
4
+ class: natural
5
+ expected_hot: false
6
+ why_designed_this_way: |
7
+ Paired with ko-ai-03 (도시 자전거 인프라). Sentence 어절 counts: 6, 18, 9, 20, 5 →
8
+ CV ≈ 0.60 (high band). Vocabulary spans: 점선 하나, 인도 위 자전거, 핸들 꺾다, 골목 모퉁이,
9
+ 신호등, 오토바이, 불법 주차 차량, 택배 기사 — rich and non-repetitive. MATTR estimate
10
+ well above 0.70. No signal fires. Irregular rhythm with one very short exclamatory sentence
11
+ and one very long embedded clause.
12
+ topic: 도시 자전거 인프라
13
+ ---
14
+
15
+ 자전거도로라고 쓰여 있어도 반쯤은 이름만 그렇다. 점선 하나 그어놓은 인도 위로 자전거를 타다가 핸들을 꺾어 골목 모퉁이를 빠져나오면 신호등도 없고, 오토바이는 역주행에 불법 주차 차량이 절반쯤 차선을 먹고 있다. 그냥 걷는 게 낫겠다 싶은 순간이 생긴다. 그나마 한강변은 숨통이 트이는데, 거기서 십 분만 벗어나면 다시 택배 기사 전동킥보드와 시선을 맞추며 서로 비켜나야 하는 현실로 돌아온다. 재밌는 도시 맞다.
@@ -0,0 +1,14 @@
1
+ ---
2
+ fixture_id: ko-nat-04
3
+ language: ko
4
+ class: natural
5
+ expected_hot: false
6
+ why_designed_this_way: |
7
+ Paired with ko-ai-04 (AI 코딩 도구). Sentence 어절 counts: 4, 16, 23, 7, 11 →
8
+ CV ≈ 0.67 (high band). Vocabulary is varied: 탭 키 한 번, 자동 완성, 오히려 더 고민, 시니어,
9
+ 주니어, 코드 리뷰, 칭찬, 설명, 리팩터링 — no cluster reused. MATTR estimate ~0.75 (high).
10
+ Uses 1인칭 reflection and a rhetorical aside. Neither signal fires.
11
+ topic: AI 코딩 도구
12
+ ---
13
+
14
+ 처음엔 신기했다. 탭 키 한 번으로 나오는 자동 완성을 보면서 이게 맞나 싶었는데, 쓰다 보니 오히려 더 고민하게 되는 순간이 생겼다 — 이 코드가 왜 이렇게 생겼는지를 내가 설명할 수 있느냐는 거다. 시니어한테 코드 리뷰 받으면서 "이 부분 AI가 써준 거죠?" 한마디에 얼굴이 달아오른 적이 있는데, 그 이후로는 칭찬받을 만한 부분은 직접 짜고 나머지는 AI한테 넘기는 식으로 쓰고 있다. 주니어한테 설명 못 할 코드는 내 코드가 아닌 것 같아서. 리팩터링할 때는 진짜 편하긴 하다.
@@ -0,0 +1,15 @@
1
+ ---
2
+ fixture_id: ko-nat-05
3
+ language: ko
4
+ class: natural
5
+ expected_hot: false
6
+ why_designed_this_way: |
7
+ Paired with ko-ai-05 (등산 문화). Sentence 어절 counts: 19, 5, 14, 22, 4 →
8
+ CV ≈ 0.60 (high band). Vocabulary is rich and non-repetitive: 무릎 통증, 스틱, 발뒤꿈치,
9
+ 돌계단, 광고 모델, 기능성 바지, 땀 닦는 수건, 40대 아저씨 — no content cluster recycled.
10
+ MATTR estimate ~0.78 (high). Uses a digressive aside and colloquial register with 1인칭.
11
+ Neither burstiness nor MATTR fires.
12
+ topic: 등산 문화
13
+ ---
14
+
15
+ 무릎 통증이 생기고 나서야 왜 사람들이 스틱을 짚고 발뒤꿈치부터 내딛는지 이해했다. 돌계단 앞에서 잠깐 멈추게 된다. 요즘 등산로에서 마주치는 사람들 보면 광고 모델처럼 차려입은 분들이 많은데, 솔직히 기능성 바지가 어깨에 힘이 들어가 있는 게 좀 웃기기도 하지만 땀 닦는 수건 한 장 들고 올라왔다가 정상에서 바람 맞고 후회한 40대 아저씨도 있으니 뭐라고 할 처지는 아니다. 그냥 올라가게 된다.
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: zh-ai-01
3
+ language: zh
4
+ class: ai
5
+ expected_hot: true
6
+ why_designed_this_way: |
7
+ Burstiness fixture for whitespace-free Chinese. Five sentences keep nearly identical character-token counts, so the zh character-token fallback should classify burstiness as low while MATTR remains reasonably high.
8
+ topic: 城市图书馆
9
+ ---
10
+
11
+ 城市图书馆正在成为居民学习的重要空间。周末到馆人数随着社区活动持续增加。数字借阅服务让读者获取资料更加方便。安静阅读区也为学生提供稳定环境。未来运营重点将放在服务质量提升上。
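Because Chinese prose has no whitespace between words, sentence length has to be counted some other way. The sketch below shows one way a character-token fallback could work: split on sentence-final punctuation, drop remaining punctuation and whitespace, and count the characters left. `charTokenSentenceLengths` is a hypothetical name and the splitting rules are assumptions; the package's own fallback may behave differently.

```javascript
// Hypothetical helper, not part of patina-cli.
// Returns the number of character tokens in each sentence.
function charTokenSentenceLengths(text) {
  return text
    .split(/[。！？!?]+/u) // assumption: these marks end a sentence
    .map((sentence) => sentence.replace(/[\s\p{P}]/gu, "")) // drop punctuation and whitespace
    .filter((sentence) => sentence.length > 0)
    .map((sentence) => Array.from(sentence).length); // count code points, not UTF-16 units
}

console.log(charTokenSentenceLengths("借书方便了，这是真的。真烦。"));
```

Per-sentence counts like these could then feed a coefficient-of-variation calculation, so evenly sized sentences such as those in zh-ai-01 would yield a low burstiness value.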
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: zh-ai-02
3
+ language: zh
4
+ class: ai
5
+ expected_hot: true
6
+ why_designed_this_way: |
7
+ MATTR/repetition fixture. The paragraph cycles 项目/团队/流程/效率/管理 across every sentence; sentence lengths vary enough that low lexical diversity, not only rhythm, should remain visible.
8
+ topic: 项目管理
9
+ ---
10
+
11
+ 项目团队正在优化项目流程。新的项目流程让团队管理更加一致,也让项目效率更容易被跟踪。团队成员按照同一流程汇报项目进度。项目管理者继续调整流程细节,以保证团队效率保持稳定。整个项目团队都围绕流程和效率展开工作。
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: zh-ai-03
3
+ language: zh
4
+ class: ai
5
+ expected_hot: true
6
+ why_designed_this_way: |
7
+ Chat-register AI fixture. The sentences are deliberately even and procedural, mirroring assistant-style status prose without private operational details.
8
+ topic: 文档审核
9
+ ---
10
+
11
+ 审核脚本会先读取公开文档中的段落。评分器随后记录每个段落的风险信号。报告页面再汇总语言维度和样本数量。维护者可以根据结果补充新的案例。整个流程保持离线运行并避免发送隐私文本。
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: zh-nat-01
3
+ language: zh
4
+ class: natural
5
+ expected_hot: false
6
+ why_designed_this_way: |
7
+ Natural Chinese counterexample with uneven rhythm, concrete details, and varied vocabulary. It should stay cold under burstiness and MATTR.
8
+ topic: 城市图书馆
9
+ ---
10
+
11
+ 我小时候去的那间图书馆只有两排旧木桌。后来它搬进地铁站旁边的新楼,玻璃门一开就是咖啡味和小孩拖着书包跑过的声音。借书方便了,这是真的。可我最想念的还是雨天窗边那盏坏了一半的台灯,灯罩发黄,照在纸页上像一小块下午。现在那里可能早就没人记得了。
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: zh-nat-02
3
+ language: zh
4
+ class: natural
5
+ expected_hot: false
6
+ why_designed_this_way: |
7
+ Paired with zh-ai-02. Uses first-person, uneven sentence length, and concrete office incidents instead of repeated project/process vocabulary.
8
+ topic: 项目管理
9
+ ---
10
+
11
+ 表格当然有用,但我真正记住的是周三晚上那通电话。需求改了三次,设计师在群里发了一个摊手表情,后端同事干脆把外卖盒推到键盘旁边继续改接口。那天没人谈方法论。我们只是把最容易出错的两个字段圈出来,第二天早上先测它们,然后才敢把链接发给客户。效率这个词后来才被写进复盘里。
@@ -0,0 +1,11 @@
1
+ ---
2
+ fixture_id: zh-nat-03
3
+ language: zh
4
+ class: natural
5
+ expected_hot: false
6
+ why_designed_this_way: |
7
+ Natural public-doc review anecdote with short fragments, asides, and varied tokens. Designed to avoid the uniform assistant-report cadence.
8
+ topic: 文档审核
9
+ ---
10
+
11
+ 我审文档时最怕那种看起来很顺的段落。第一遍读过去没问题,第二遍才发现日期少了一位,或者截图里的按钮名字已经换掉了。真烦。比起漂亮的总述,我宁愿作者在旁边留一句“这里还没确认”,至少我知道该去问谁。干净不是没有痕迹,而是每个痕迹都有来处。
@@ -0,0 +1,121 @@
1
+ # Quality Benchmark
2
+
3
+ Deterministic measurement of patina's stylometry / lexicon signal layer
4
+ against a labeled fixture set. Runs with no LLM calls, no API key, no
5
+ network — fast enough to run on every CI build.
6
+
7
+ ## Run it
8
+
9
+ ```bash
10
+ npm run benchmark
11
+ ```
12
+
13
+ Outputs:
14
+ - A markdown table per language (accuracy, precision, recall, F1, confusion matrix)
15
+ - A list of any misclassified fixtures with their feature values
16
+ - `tests/quality/results.json` — full per-fixture log (gitignored)
17
+ - `docs/benchmarks/latest.md` / `latest.json` when run via `npm run benchmark:report`
18
+ - `docs/benchmarks/detector-comparison.md` / `.json` when run via `npm run benchmark:compare`
19
+
20
+ ## What it measures
21
+
22
+ Every fixture under `tests/fixtures/suspect-zones/{lang}/{ai|natural}/*.md`
23
+ carries an `expected_hot` label in its frontmatter. The benchmark runs
24
+ `analyzeText()` (defined in `src/features/index.js`) on the body and
25
+ compares the predicted hot/cold decision against that label. The decision
26
+ follows the 3-signal OR rule from `core/stylometry.md` §16:
27
+
28
+ ```
29
+ paragraph is SUSPECT iff
30
+ burstiness_band == "low" OR
31
+ MATTR_band == "low" OR
32
+ lexicon_density > threshold
33
+ ```
34
+
35
+ Per-language metrics use `expected_hot=true` as the positive class.
36
+
37
+ ## What it does NOT measure
38
+
39
+ - LLM-based scoring (`src/scoring.js`). The LLM is non-deterministic by
40
+ design and adds API cost / latency, so it stays out of this layer.
41
+ A separate live-mode benchmark would be its own follow-up.
42
+ - Rewrite quality (does the rewritten text read better?). That requires
43
+ human or LLM grading and lives in `tests/e2e/quality-test.js`.
44
+ That script is opt-in because it shells out to OpenCode:
45
+
46
+ ```bash
47
+ OPENCODE_AVAILABLE=1 node tests/e2e/quality-test.js
48
+ ```
49
+
50
+ The script uses `opencode/hy3-preview-free` by default. Override it with
51
+ `OPENCODE_MODEL=<provider/model>` when testing another OpenCode model.
52
+ - AUROC against a ranked score — the current decision is binary
53
+ (hot/cold), so we report accuracy + F1 instead.
54
+
55
+ ## Extending the corpus
56
+
57
+ 1. Add a new fixture markdown with frontmatter:
58
+
59
+ ```yaml
60
+ ---
61
+ fixture_id: ko-ai-06
62
+ language: ko
63
+ class: ai
64
+ expected_hot: true
65
+ expected_metrics:
66
+ cv_band: low # optional regression pin
67
+ mattr_band: high # optional regression pin
68
+ lexicon_density_min: 0 # optional regression pin
69
+ lexicon_density_max: 80 # optional regression pin
70
+ why_designed_this_way: |
71
+ Brief note on which signals you expect to fire.
72
+ topic: <subject>
73
+ ---
74
+
75
+ <one paragraph of text>
76
+ ```
77
+
78
+ 2. Drop it under `tests/fixtures/suspect-zones/{lang}/{ai|natural}/`.
79
+
80
+ 3. Add `expected_metrics` when a fixture is meant to pin a specific deterministic signal. This is useful for real-world chat-register fixtures where a future tokenizer or threshold change should fail loudly instead of silently changing the benchmark meaning.
81
+
82
+ 4. Refresh the central per-fixture regression ranges after reviewing the new fixture:
83
+
84
+ ```bash
85
+ npm run benchmark:ranges
86
+ ```
87
+
88
+ This updates `tests/fixtures/suspect-zones/expected-ranges.json`, which pins CV, MATTR, lexicon density, and detector sub-signal expectations for every fixture.
89
+
90
+ 5. Re-run `npm run benchmark` and confirm it classifies as expected.
91
+
92
+ ## Third-party detector comparison
93
+
94
+ Patina does not scrape detector websites or send fixture text to vendors. For
95
+ manual comparisons:
96
+
97
+ ```bash
98
+ cp tests/quality/detectors.manual.example.json /tmp/detectors.manual.json
99
+ $EDITOR /tmp/detectors.manual.json
100
+ node scripts/detector-comparison.mjs --input /tmp/detectors.manual.json
101
+ ```
102
+
103
+ The checked-in report always includes Patina's own deterministic analyzer. Any
104
+ third-party rows are manual, timestamped, and opt-in.
105
+
106
+ ## Tuning the thresholds
107
+
108
+ If a real-world corpus produces too many misclassifications, adjust the
109
+ bands in `.patina.default.yaml` (`stylometry.burstiness.bands`,
110
+ `stylometry.ttr.bands`, `lexicon.density_threshold`), which drive the
111
+ classification. Sweep candidate values against this benchmark and your own
112
+ corpus before updating thresholds; the shipped values come from the v3.5.1 /
113
+ v3.7 calibration documented in `core/stylometry.md` §13 and §16.
114
+
115
+ ## Languages
116
+
117
+ Currently runs on all supported pattern-pack languages: `ko`, `en`, `zh`, and
118
+ `ja`. Chinese and Japanese use a deterministic character-token fallback because
119
+ normal prose often has no whitespace; ko/en keep whitespace tokenization.
120
+ Language-specific zh/ja lexicons are still future work, so current zh/ja
121
+ fixtures are mainly burstiness/MATTR regression coverage.
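The 3-signal OR rule and the per-language metrics described in this README can be sketched in a few lines. This is an illustration under stated assumptions, not the package's `analyzeText()`: `isSuspect` and `scoreBenchmark` are hypothetical names, the feature object shape is invented for the example, and the density threshold is passed in as a parameter because its shipped value is not shown here.

```javascript
// Hypothetical helpers, not part of patina-cli.
// A paragraph is suspect (hot) if any one of the three signals fires.
function isSuspect(features, densityThreshold) {
  return (
    features.burstinessBand === "low" ||
    features.mattrBand === "low" ||
    features.lexiconDensity > densityThreshold
  );
}

// Scores predictions against expected_hot labels, treating
// expected_hot = true as the positive class.
function scoreBenchmark(rows, densityThreshold) {
  let tp = 0, fp = 0, tn = 0, fn = 0;
  for (const row of rows) {
    const predicted = isSuspect(row.features, densityThreshold);
    if (predicted && row.expectedHot) tp += 1;
    else if (predicted && !row.expectedHot) fp += 1;
    else if (!predicted && row.expectedHot) fn += 1;
    else tn += 1;
  }
  const precision = tp + fp === 0 ? 0 : tp / (tp + fp);
  const recall = tp + fn === 0 ? 0 : tp / (tp + fn);
  const f1 =
    precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  const accuracy = rows.length === 0 ? 0 : (tp + tn) / rows.length;
  return { tp, fp, tn, fn, accuracy, precision, recall, f1 };
}

// Illustrative rows only; the feature values are made up for the example.
const rows = [
  { expectedHot: true, features: { burstinessBand: "low", mattrBand: "high", lexiconDensity: 0 } },
  { expectedHot: false, features: { burstinessBand: "high", mattrBand: "high", lexiconDensity: 0 } },
];
console.log(scoreBenchmark(rows, 5));
```

Because the rule is a plain OR, a fixture such as ko-ai-06-chat-register can pin `cv_band: low` together with `mattr_band: high` and still be expected hot.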