kairos-chain 3.25.0 → 3.25.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 31f36e0972d3b7f0848a2a5334d18e5dce69ca4e9f507d0e8072ca9b263983b8
- data.tar.gz: 3d4c0ff8590721b645088b386e2d3acfc7944fae62b774776a4fd2f4f65887ab
+ metadata.gz: 2ff49b5dba49e9c78161990be5bbf9bc94252542aeafe636b0c6cd424952ccff
+ data.tar.gz: 816f240e2cea9d045b4ac0c19c622792fe603dc49d2b454ec9409a8a6ab4f74a
  SHA512:
- metadata.gz: 781cc48a6a9e55de327e2ca7cf4bee5d1e195dc9dc4c8f86e7998e36dc2d93ac301bd17c60efcacb0b63d4347d352c83c365d24cc7415e2f319f3a7276741c19
- data.tar.gz: 7f91d5619741e422a6594bddb22a8bfba4bacf31de8681424e46a91c5a7ffa3b71d1d5ecaf21306a66680db0136efe252744e9dc9ae5be3446a8d50916b8e306
+ metadata.gz: aa98790fe6f4b71d1995b6d6a6822692a246c5aff24e3d99c9087ed44e78c9dc92ee19497628335862b1b6c93d156da514a7059bb275116d6e2ee62474e091a2
+ data.tar.gz: d50285be992138c811fb27bf9bf375648bbab801b71f37c9150d980996fc418da9e8bbd0bbaed6b506d19ecb187df932395f0a70f4501772b027c2bea8695581
data/CHANGELOG.md CHANGED
@@ -4,6 +4,33 @@ All notable changes to the `kairos-chain` gem will be documented in this file.

  This project follows [Semantic Versioning](https://semver.org/).

+ ## [3.25.2] - 2026-05-07
+
+ ### Changed (L1 knowledge: reviewer evaluation feedback loop)
+
+ - `multi_llm_review_workflow` § L2 Save Points: now states explicitly that, at the
+ end of each review round, per-reviewer observations (verdict, (a)/(b)/(c)
+ breakdown, briefing-reaction shift, anomalies) are recorded in L2 context
+ under the prefix `reviewer_evaluation_observation_<reviewer>_<date>`. This
+ builds into the workflow a channel for accumulating samples for future
+ refinement of `multi_llm_reviewer_evaluation`.
+ - `multi_llm_reviewer_evaluation`: added a "Refinement Source" section at the end.
+ Naming the L2 contexts above as the source for refinement closes the L2 → L1
+ promotion loop for the reviewer profiles themselves as well (Prop 5
+ constitutive recording + Prop 6 incompleteness as driving force).
+
+ No surface expansion: only a bullet added to an existing section plus one new
+ paragraph. No new mechanism, field, or tool.
+
+ ## [3.25.1] - 2026-05-07
+
+ ### Changed (L1 knowledge: multi-LLM review)
+
+ - `multi_llm_reviewer_evaluation` v1.2 → v1.3: consolidated the knowledge of reviewer tendencies that had been scattered across harness memory (Codex 3 structural biases, Cursor vs Codex briefing-reaction data, Codex GPT-5.5 profile). Added a new section, "Reviewer Value-System Divergence", plus the (a)/(b)/(c) finding classification. Updated the Convergence Rule to apply after classification (only (a)+(b) are blocking). Renamed Cost-Benefit to "Phase 1 baseline (5 reviewers)" to make its scope explicit.
+ - `multi_llm_review_workflow`: added Step 0 (mandatory `knowledge_get multi_llm_reviewer_evaluation`) and Step 0.5 (Design Direction Block for design / docs reviews). Aligned § Convergence Rules and § Workflow Pattern step [4] to be (a)/(b)/(c)-aware. Added an invariant preface to the Step 0.5 block structure (consistent with anti-enumeration).
+
+ The design history and validation were covered by 2 rounds of self-review (Codex GPT-5.5 / Cursor Composer-2 / Claude CLI Opus 4.6 / Persona Team Opus 4.7). 4/4 APPROVE / APPROVE WITH CHANGES, no REJECT. Starting from the value-system divergence observed in Phase 2 Case A (Context Graph review loop, 2026-05-04), the experimental briefing protocol that existed only in KairosChain_2026 (project CLAUDE.md) was promoted to L1 as an operational extension.
+
  ## [3.25.0] - 2026-05-07

  ### Added (Instruction mode projection)
  ### Added (Instruction mode projection)
data/bin/kairos-chain CHANGED
@@ -348,6 +348,42 @@ when 'mode'

  mode_action = ARGV.shift || 'project'

+ if %w[-h --help help].include?(mode_action)
+ puts <<~HELP
+ Usage: kairos-chain mode <action> [--data-dir DIR]
+
+ Project the active instruction mode (Masa Mode, Tutorial Mode, ...)
+ to project-root CLAUDE.md via a managed @-import region. Required
+ for the mode body to reach Agent tool sub-agents (which do not
+ receive MCP `instructions`) and to bypass the harness truncation
+ cap on long mode bodies.
+
+ Actions:
+ project Materialize the active mode body to .claude/kairos/
+ instruction_mode.md and merge a marker region into
+ project-root CLAUDE.md. Default action when no action
+ is given. Idempotent — safe to re-run.
+ status Print the current projection state (active mode name,
+ version, artifact path/size, region presence, last
+ projection time).
+ remove Delete the projected artifact and remove the marker
+ region from CLAUDE.md. Manifest is cleared.
+
+ Options:
+ --data-dir DIR Override the .kairos/ data directory location.
+
+ Notes:
+ - The active mode is read from `instructions_mode` in
+ .kairos/skills/config.yml. Use `instructions_update` MCP tool
+ to change it; then re-run `mode project`.
+ - CLAUDE.md @-imports resolve at Claude Code session start;
+ you must restart Claude Code (`exit` then `claude`) for any
+ projection or removal to take effect.
+ - Body size policy: warn at >=150KB, refuse at >=256KB.
+ HELP
+ exit 0
+ end
+
  $LOAD_PATH.unshift File.expand_path('../lib', __dir__)
  require 'kairos_mcp'

@@ -484,6 +520,9 @@ OptionParser.new do |opts|
  puts " init [DIR] Initialize data directory with default templates"
  puts " upgrade [--apply] Check/apply template migrations after gem update"
  puts " skillset <cmd> Manage SkillSet plugins (list/install/enable/disable/remove/info)"
+ puts " mode <action> Project active instruction mode to CLAUDE.md (project/status/remove)"
+ puts ""
+ puts "Run a subcommand with -h for details, e.g. 'kairos-chain mode -h'."
  exit
  end
  end.parse!
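
The help text added above states a body size policy of "warn at >=150KB, refuse at >=256KB". The following is a minimal sketch of how such a threshold check could look. It is not the gem's implementation: the constant names, the `body_size_verdict` helper, and the assumption that 1 KB means 1024 bytes are all hypothetical.

```ruby
# Hypothetical thresholds taken from the help text; 1 KB = 1024 bytes is an assumption.
WARN_BYTES   = 150 * 1024
REFUSE_BYTES = 256 * 1024

# Hypothetical helper: classifies a mode body (a String) by its size in bytes.
# Returns :refuse, :warn, or :ok.
def body_size_verdict(body)
  size = body.bytesize
  return :refuse if size >= REFUSE_BYTES
  return :warn   if size >= WARN_BYTES

  :ok
end
```

Using `bytesize` rather than `length` matters for multi-byte text, since a size cap of this kind is presumably about bytes on disk rather than characters.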
@@ -1,4 +1,4 @@
  module KairosMcp
- VERSION = "3.25.0"
+ VERSION = "3.25.2"
  CHANGELOG_URL = "https://github.com/masaomi/KairosChain_2026/blob/main/CHANGELOG.md"
  end
@@ -234,8 +234,10 @@ The user always has the final say.
  ├── outputs: revised artifact + new review prompt
  └── L2 save: consensus + revised artifact
  |
- [4] If 0 FAIL proceed to next phase
- If FAIL → repeat from [2] with revised artifact
+ [4] Classify findings as (a)/(b)/(c) per `multi_llm_reviewer_evaluation`
+ If no (a)/(b) blocking findings proceed to next phase
+ If any (a)/(b) finding → repeat from [2] with revised artifact
+ (c) findings are recorded as advisory; non-blocking
  ```

  ## Review Types
@@ -263,10 +265,21 @@ likely to be missed by a single LLM reviewing its own design. For per-model prof

  ## Convergence Rules

- - **3/4 APPROVE** (no REJECT) = proceed to next step
- - **Any REJECT or FAIL** = revise and re-review
- - **4/4 APPROVE** = highest confidence, proceed
- - Legacy 3-reviewer mode: 2/3 APPROVE = proceed
+ The rule applies **after** orchestrator classifies each finding as (a)/(b)/(c) per
+ `multi_llm_reviewer_evaluation` § Reviewer Value-System Divergence. Only (a)+(b)
+ findings count toward the thresholds below; (c) findings are recorded as advisory
+ and never block.
+
+ - **3/4 APPROVE** (no (a)/(b) REJECT) = proceed to next step
+ - **Any (a) or (b) REJECT or FAIL** = revise and re-review
+ - **(c)-only REJECT** = record as advisory, non-blocking
+ - **4/4 APPROVE** (no (a)/(b)) = highest confidence, proceed
+ - Legacy 3-reviewer mode: 2/3 APPROVE (no (a)/(b)) = proceed
+ - Codex REJECT with (a)/(b) findings + others APPROVE = likely real issue, investigate before overriding
+ - Codex REJECT with only (c) findings = expected per Codex value-system divergence; non-blocking
+
+ For normative detail and the underlying classification, see
+ `multi_llm_reviewer_evaluation` § Convergence Rule (Updated).
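
The updated convergence rules can be read as a small decision procedure. The sketch below is one possible reading, not code from the gem. The data shape (a Hash with `:verdict` and `:findings`), the method names, and the choice to count only explicit APPROVE verdicts toward the 3/4 and 2/3 thresholds are assumptions.

```ruby
# Hypothetical data shape: each review is a Hash with
#   :verdict  => "APPROVE", "REJECT", or "FAIL"
#   :findings => Array of :a, :b, or :c (already classified by the orchestrator)
BLOCKING_CLASSES = %i[a b].freeze

# True when the review carries at least one (a) or (b) finding.
def blocking_findings?(review)
  review[:findings].any? { |klass| BLOCKING_CLASSES.include?(klass) }
end

# Returns :proceed or :revise for a 4-reviewer round or a legacy 3-reviewer round.
def convergence(reviews)
  # Any REJECT or FAIL backed by an (a) or (b) finding forces a revision.
  blocked = reviews.any? do |r|
    %w[REJECT FAIL].include?(r[:verdict]) && blocking_findings?(r)
  end
  return :revise if blocked

  # A (c)-only REJECT is advisory: it does not block, but in this sketch
  # it is also not counted as an approval (assumption).
  approvals = reviews.count { |r| r[:verdict] == "APPROVE" }
  needed = { 4 => 3, 3 => 2 }.fetch(reviews.size) # other panel sizes are out of scope
  approvals >= needed ? :proceed : :revise
end
```

For example, three APPROVE verdicts plus one REJECT whose findings are all `:c` yields `:proceed`, while the same panel with a single `:a` finding behind the REJECT yields `:revise`.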

  ### Consensus Patterns

@@ -331,6 +344,14 @@ Save to L2 context at these moments:
  - After design/implementation complete (before review)
  - After synthesis of reviews (revised version)
  - After final convergence (implementation-ready / merge-ready)
+ - **After each review round**: capture per-reviewer observations — verdict,
+ (a)/(b)/(c) classification breakdown, briefing-reaction shift (did the
+ reviewer change verdict after Step 0.5 design direction?), anomalies
+ (off-pattern findings, format failures, refusal). Tag context name with
+ prefix `reviewer_evaluation_observation_<reviewer>_<date>` so future
+ refinement of `multi_llm_reviewer_evaluation` can sample these records
+ systematically. This closes the L2→L1 promotion loop for reviewer
+ profiles themselves.
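
The naming convention in the new save point can be illustrated with a short sketch. The source only fixes the prefix and the `<reviewer>_<date>` order; the slug rules for the reviewer name, the `YYYYMMDD` date format, and the helper name `observation_context_name` are assumptions made for illustration.

```ruby
require "date"

OBSERVATION_NAME_PREFIX = "reviewer_evaluation_observation"

# Hypothetical helper: builds an L2 context name for one reviewer and one date.
# Slug rules (lowercase, non-alphanumerics collapsed to "_") and the YYYYMMDD
# date format are assumptions; the source does not specify them.
def observation_context_name(reviewer, date = Date.today)
  slug = reviewer.downcase.gsub(/[^a-z0-9]+/, "_").gsub(/\A_+|_+\z/, "")
  "#{OBSERVATION_NAME_PREFIX}_#{slug}_#{date.strftime('%Y%m%d')}"
end

observation_context_name("Codex GPT-5.5", Date.new(2026, 5, 7))
# => "reviewer_evaluation_observation_codex_gpt_5_5_20260507"
```

Keeping the date as the final segment makes it straightforward to split a name back into reviewer and date later.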

  ---

@@ -271,7 +271,8 @@ Deployment: Composer-2 or Cursor GPT-5.4
  | Reviewer | Summary |
  |----------|---------|
  | Claude Opus 4.6 | Guardian of design. Finds security threats and novel architectural alternatives |
- | Codex GPT-5.4 | Strictest judge. Last to approve, but APPROVE = highest confidence signal |
+ | Codex GPT-5.4 | Strictest judge. Classify findings (a)/(b)/(c) before treating REJECT as blocking; APPROVE is a strong signal **when reachable**, not a mandatory gate (see Phase 2 Case A caveat) |
+ | Codex GPT-5.5 | Stricter sibling of 5.4. Same value-system divergence (3 biases); apply the same classification discipline |
  | Cursor Premium | Implementation craftsman. Bug hunter for concurrency and resource management |
  | Composer-2 | Fastest pragmatist. First to determine if something is deployable |
  | Cursor GPT-5.4 | Binary sword. Clear approve-or-reject, strictest on test coverage |
@@ -292,3 +293,12 @@ Deployment: Composer-2 or Cursor GPT-5.4
  5. Some REJECTs reflect the reviewer's value system, not the artifact. The (a)/(b)/(c)
  classification (see § Reviewer Value-System Divergence) is required to separate
  blocking signal from advisory noise. Codex models in particular require this lens.
+
+ ## Refinement Source
+
+ Profiles in this knowledge are refined from accumulated L2 contexts named with prefix
+ `reviewer_evaluation_observation_<reviewer>_<date>`, recorded after each multi-LLM
+ review round per `multi_llm_review_workflow` § L2 Save Points. When updating this
+ file, sample those records to revise per-reviewer profiles, Strength Matrix entries,
+ Cost-Benefit ratings, and the value-system divergence section. This closes the
+ L2 → L1 promotion loop for reviewer profiles themselves.
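
Sampling the accumulated records by prefix could be sketched as follows. This is illustrative only: how L2 context names are actually listed is not shown in this diff, so the sketch takes a plain Array of names as input. The helper name and the assumption that the date is the final underscore-separated segment are hypothetical.

```ruby
OBSERVATION_PREFIX = "reviewer_evaluation_observation_"

# Hypothetical helper: filters context names by the observation prefix and
# groups them by reviewer. Assumes the date is the last "_"-separated segment.
def group_observations(context_names)
  context_names
    .select { |name| name.start_with?(OBSERVATION_PREFIX) }
    .group_by { |name| name.delete_prefix(OBSERVATION_PREFIX).rpartition("_").first }
end

names = [
  "reviewer_evaluation_observation_codex_gpt_5_5_20260507",
  "reviewer_evaluation_observation_codex_gpt_5_5_20260601",
  "reviewer_evaluation_observation_composer_2_20260507",
  "design_notes_20260507"
]
grouped = group_observations(names)
# grouped.keys => ["codex_gpt_5_5", "composer_2"]
```

Grouping by reviewer first keeps each profile's evidence together, which matches the per-reviewer revision described in the section above.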
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: kairos-chain
  version: !ruby/object:Gem::Version
- version: 3.25.0
+ version: 3.25.2
  platform: ruby
  authors:
  - Masaomi Hatakeyama