okstra 0.46.0 → 0.47.0

@@ -0,0 +1,570 @@
1
+ # Adversarial Phase 5.5 Verification Implementation Plan
2
+
3
+ > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
4
+
5
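+ **Goal:** Convert the Phase 5.5 convergence re-verification of the two phases `requirements-discovery` / `error-analysis` into adversarial verification, in which verifiers actively try to refute other workers' claims and the burden of proof sits on the claim.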
6
+
7
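+ **Architecture:** No new agent or stage is created. The manifest `convergence` block receives a phase-aware `adversarial` flag (injected by render.py), and the convergence skill defines the adversarial prompt, aggregation, and scoped re-inspection in its `adversarial=true` branch. The persisted verdict enum is kept as-is, and the adversarial information is recorded in a new `disagreeBasis` field.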
8
+
9
+ **Tech Stack:** Python 3 (okstra_ctl, pytest), Markdown skill/prompt documents, JSON fixtures. Build: `tools/build.mjs` (`npm run build`).
10
+
11
+ **Design rationale:** [`docs/superpowers/specs/2026-06-04-adversarial-verification-design.md`](../specs/2026-06-04-adversarial-verification-design.md)
12
+
13
+ ---
14
+
15
+ ## File structure
16
+
17
+ | File | Responsibility | Action |
18
+ |---|---|---|
19
+ | [`scripts/okstra_ctl/render.py`](../../../scripts/okstra_ctl/render.py) | Phase-aware injection of `adversarial`/`verificationMode` into the manifest `convergence` block | Modify (`_build_convergence_block`, lines 899–926) |
20
+ | `tests/test_render_convergence_adversarial.py` | Unit tests for the values render injects | Create |
21
+ | [`tests/test_convergence_state_contract.py`](../../../tests/test_convergence_state_contract.py) | Enforces state schema 1.2 plus the shape of `disagreeBasis` and `config.adversarial` | Modify |
22
+ | `tests/fixtures/convergence/adversarial-contested.json` | Fixture for an adversarial contested case | Create |
23
+ | [`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md) | Defines adversarial-mode behaviour (prompt, aggregation, scope, schema) | Modify |
24
+ | [`prompts/profiles/requirements-discovery.md`](../../../prompts/profiles/requirements-discovery.md) | Phase 5.5 adversarial declaration | Modify |
25
+ | [`prompts/profiles/error-analysis.md`](../../../prompts/profiles/error-analysis.md) | Phase 5.5 adversarial declaration | Modify |
26
+ | [`prompts/profiles/_common-contract.md`](../../../prompts/profiles/_common-contract.md) | Updates the Phase 5.5 description in the Worker interaction model | Modify |
27
+ | [`CHANGES.md`](../../../CHANGES.md) | `사용자 영향:` (user impact) entry | Modify |
28
+
29
+ Order of work: render (code) → contract tests/fixture (data) → skill document (behaviour) → profile/contract declarations → CHANGES + build + full verification.
30
+
31
+ ---
32
+
33
+ ### Task 1: Phase-aware injection of `adversarial`/`verificationMode` in render.py
34
+
35
+ **Files:**
36
+ - Create: `tests/test_render_convergence_adversarial.py`
37
+ - Modify: `scripts/okstra_ctl/render.py:899-926`
38
+
39
+ - [ ] **Step 1: Write the failing test**
40
+
41
+ Create `tests/test_render_convergence_adversarial.py`:
42
+
43
+ ```python
44
+ """_build_convergence_block: unit tests for the phase-aware adversarial defaults.
45
+
46
+ Only requirements-discovery / error-analysis receive adversarial=true with
47
+ full-reanalysis; every other phase stays on the collaborative lightweight mode.
48
+ The existing maxRounds branching is preserved regardless of the adversarial flag.
49
+ """
50
+ from __future__ import annotations
51
+
52
+ import sys
53
+ from pathlib import Path
54
+
55
+ import pytest
56
+
57
+ _REPO_ROOT = Path(__file__).resolve().parents[1]
58
+ sys.path.insert(0, str(_REPO_ROOT / "scripts"))
59
+
60
+ from okstra_ctl.render import _build_convergence_block # noqa: E402
61
+
62
+
63
+ @pytest.mark.parametrize("task_type", ["requirements-discovery", "error-analysis"])
64
+ def test_adversarial_phases_get_adversarial_full_reanalysis(task_type):
65
+     block = _build_convergence_block({"TASK_TYPE": task_type})
66
+     assert block["adversarial"] is True
67
+     assert block["verificationMode"] == "full-reanalysis"
68
+
69
+
70
+ @pytest.mark.parametrize(
71
+     "task_type",
72
+     [
73
+         "implementation-planning",
74
+         "implementation",
75
+         "final-verification",
76
+         "release-handoff",
77
+     ],
78
+ )
79
+ def test_non_adversarial_phases_stay_lightweight(task_type):
80
+     block = _build_convergence_block({"TASK_TYPE": task_type})
81
+     assert block["adversarial"] is False
82
+     assert block["verificationMode"] == "lightweight"
83
+
84
+
85
+ def test_maxrounds_unchanged_by_adversarial():
86
+     assert _build_convergence_block({"TASK_TYPE": "requirements-discovery"})["maxRounds"] == 1
87
+     assert _build_convergence_block({"TASK_TYPE": "error-analysis"})["maxRounds"] == 2
88
+ ```
89
+
90
+ - [ ] **Step 2: Confirm the test fails**
91
+
92
+ Run: `python3 -m pytest tests/test_render_convergence_adversarial.py -v`
93
+ Expected: FAIL with `KeyError: 'adversarial'` (the block does not have the key yet).
94
+
95
+ - [ ] **Step 3: Minimal implementation**
96
+
97
+ Modify the body of `_build_convergence_block` in `scripts/okstra_ctl/render.py` (currently lines 899–926). Replace the body with:
98
+
99
+ ```python
100
+ task_type = ctx.get("TASK_TYPE", "")
101
+ default_max_rounds = 1 if task_type == "requirements-discovery" else 2
102
+ adversarial_phases = {"requirements-discovery", "error-analysis"}
103
+ is_adversarial = task_type in adversarial_phases
104
+ raw_plan_verify = (ctx.get("OKSTRA_PLAN_VERIFICATION", "") or "").strip().lower()
105
+ plan_verify_enabled = raw_plan_verify != "false"
106
+ return {
107
+     "enabled": True,
108
+     "adversarial": is_adversarial,
109
+     "maxRounds": default_max_rounds,
110
+     "verificationMode": "full-reanalysis" if is_adversarial else "lightweight",
111
+     "planBodyVerification": {
112
+         "enabled": plan_verify_enabled,
113
+         "maxRounds": 1,
114
+         "gating": True,
115
+     },
116
+ }
117
+ ```
118
+
119
+ Also add one entry to the defaults list in the docstring (lines 900–912), on the line after `- verificationMode "lightweight"`:
120
+
121
+ ```python
122
+ - `adversarial` default True for `requirements-discovery` / `error-analysis`
123
+ (forces `verificationMode` to "full-reanalysis"), False otherwise
124
+ ```
125
+
126
+ - [ ] **Step 4: Confirm the tests pass**
127
+
128
+ Run: `python3 -m pytest tests/test_render_convergence_adversarial.py -v`
129
+ Expected: PASS (3 test functions; pytest collects 7 cases, 6 of them from the two parametrized tests).
130
+
131
+ - [ ] **Step 5: Commit**
132
+
133
+ ```bash
134
+ git add scripts/okstra_ctl/render.py tests/test_render_convergence_adversarial.py
135
+ git commit -m "feat(okstra_ctl/render): inject adversarial convergence mode for discovery/error-analysis"
136
+ ```
137
+
138
+ ---
139
+
140
+ ### Task 2: Convergence state contract tests + adversarial fixture
141
+
142
+ Bump the schema to 1.2 and enforce the shape of `config.adversarial` and `votes.<worker>.disagreeBasis`. The three existing 1.1 fixtures must keep passing unchanged (the new fields are optional).
143
+
144
+ **Files:**
145
+ - Create: `tests/fixtures/convergence/adversarial-contested.json`
146
+ - Modify: `tests/test_convergence_state_contract.py`
147
+
148
+ - [ ] **Step 1: Write the adversarial fixture (data that triggers the failure)**
149
+
150
+ Create `tests/fixtures/convergence/adversarial-contested.json`. It follows the requirements-discovery style (effectiveMaxRounds=1, so the single round is also the last round) and models a case in which one `counter-evidence` refutation from codex demotes F-001 to `contested`:
151
+
152
+ ```json
153
+ {
154
+ "schemaVersion": "1.2",
155
+ "taskKey": "fixture/adversarial-contested",
156
+ "config": {
157
+ "enabled": true,
158
+ "adversarial": true,
159
+ "maxRounds": 1,
160
+ "effectiveMaxRounds": 1,
161
+ "verificationMode": "full-reanalysis"
162
+ },
163
+ "findings": [
164
+ {
165
+ "findingId": "F-001",
166
+ "summary": "Login handler skips input validation",
167
+ "category": "bug",
168
+ "ticketIds": ["AD-100"],
169
+ "originWorker": "claude-worker",
170
+ "originEvidence": "src/auth/login.ts:42",
171
+ "classification": "contested",
172
+ "rounds": [
173
+ {
174
+ "round": 1,
175
+ "votes": {
176
+ "codex-worker": {
177
+ "verdict": "disagree",
178
+ "disagreeBasis": "counter-evidence",
179
+ "explanation": "src/auth/login.ts:48 already runs validateBody(); the claimed gap does not exist."
180
+ },
181
+ "gemini-worker": {
182
+ "verdict": "agree",
183
+ "disagreeBasis": null,
184
+ "explanation": "Re-read login.ts; could not break the claim."
185
+ }
186
+ }
187
+ }
188
+ ],
189
+ "consensusWorkers": ["claude-worker", "gemini-worker"],
190
+ "dissentingWorkers": ["codex-worker"]
191
+ }
192
+ ],
193
+ "roundHistory": [
194
+ {
195
+ "round": 1,
196
+ "inputQueueSize": 1,
197
+ "resolvedCount": 0,
198
+ "carriedForwardCount": 1,
199
+ "dispatches": [
200
+ {"worker": "codex-worker", "status": "completed", "durationMs": 173004},
201
+ {"worker": "gemini-worker", "status": "completed", "durationMs": 188210}
202
+ ],
203
+ "skippedWorkers": [
204
+ {"worker": "claude-worker", "reason": "no items to verify"}
205
+ ]
206
+ }
207
+ ],
208
+ "round2SkippedReason": "max-rounds-1",
209
+ "finalState": "max-rounds-reached",
210
+ "totalRounds": 1,
211
+ "finalClassificationCounts": {
212
+ "fullConsensus": 0,
213
+ "partialConsensus": 0,
214
+ "contested": 1,
215
+ "workerUnique": 0
216
+ }
217
+ }
218
+ ```
219
+
220
+ - [ ] **Step 2: Confirm the existing test breaks on the new fixture (red)**
221
+
222
+ Run: `python3 -m pytest tests/test_convergence_state_contract.py -k adversarial -v`
223
+ Expected: FAIL. `test_schema_version_is_1_1[adversarial-contested]` fails on the `"1.2" == "1.1"` assertion.
224
+
225
+ - [ ] **Step 3: Update the contract tests to accept 1.2 and validate the adversarial shape**
226
+
227
+ Edit `tests/test_convergence_state_contract.py`:
228
+
229
+ (a) Change the first line of the module docstring (line 1) from `(schema v1.1)` to `(schema v1.1 / v1.2)`.
230
+
231
+ (b) Add immediately after the `VALID_VERDICTS = {...}` definition (line 32):
232
+
233
+ ```python
234
+ VALID_DISAGREE_BASIS = {"counter-evidence", "burden-not-met", None}
235
+ ```
236
+
237
+ (c) Replace `test_schema_version_is_1_1` (lines 40–41) in full:
238
+
239
+ ```python
240
+ def test_schema_version_is_supported(fixture):
241
+     assert fixture["schemaVersion"] in {"1.1", "1.2"}
242
+ ```
243
+
244
+ (d) Append two new tests at the end of the file:
245
+
246
+ ```python
247
+ def test_disagree_basis_is_enum_when_present(fixture):
248
+     for f in fixture["findings"]:
249
+         for r in f["rounds"]:
250
+             for vote in r["votes"].values():
251
+                 if "disagreeBasis" in vote:
252
+                     assert vote["disagreeBasis"] in VALID_DISAGREE_BASIS
253
+
254
+
255
+ def test_adversarial_disagree_carries_basis(fixture):
256
+     """In an adversarial run every disagree vote must cite a refutation basis."""
257
+     if not fixture["config"].get("adversarial"):
258
+         return
259
+     for f in fixture["findings"]:
260
+         for r in f["rounds"]:
261
+             for worker, vote in r["votes"].items():
262
+                 if vote["verdict"] == "disagree":
263
+                     assert vote.get("disagreeBasis") in {"counter-evidence", "burden-not-met"}, (
264
+                         f"{f['findingId']} {worker}: adversarial disagree without disagreeBasis"
265
+                     )
266
+ ```
267
+
268
+ - [ ] **Step 4: Confirm the full contract test suite passes**
269
+
270
+ Run: `python3 -m pytest tests/test_convergence_state_contract.py -v`
271
+ Expected: PASS. The three existing fixtures (1.1) and the one new fixture (1.2) all pass. Because `config.get("adversarial")` is `None` (falsy) for the 1.1 fixtures, `test_adversarial_disagree_carries_basis` is a no-op for them.
272
+
273
+ - [ ] **Step 5: Commit**
274
+
275
+ ```bash
276
+ git add tests/test_convergence_state_contract.py tests/fixtures/convergence/adversarial-contested.json
277
+ git commit -m "test(convergence): accept schema v1.2 with adversarial config + disagreeBasis"
278
+ ```
279
+
280
+ ---
281
+
282
+ ### Task 3: Define adversarial-mode behaviour in the convergence SKILL.md
283
+
284
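+ This Task is the authoritative declaration of the adversarial *behaviour* (it cannot be enforced in code; it is an instruction in the lead/worker prompts). It edits five places. Each Edit uses existing text as its anchor.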
285
+
286
+ **Files:**
287
+ - Modify: `skills/okstra-convergence/SKILL.md`
288
+
289
+ - [ ] **Step 1: Add an `adversarial` row to the Configuration table**
290
+
291
+ Edit: insert a new row after the `| `verificationMode` | `"lightweight"` | `"lightweight"` or `"full-reanalysis"` |` row (line 48):
292
+
293
+ ```markdown
294
+ | `adversarial` | phase-aware: `true` for `requirements-discovery` / `error-analysis`, `false` otherwise | When `true`, Phase 5.5 runs in **adversarial mode** (see §"Adversarial Verification Mode"): verifiers actively try to refute each finding, the burden of proof sits on the claim, and `verificationMode` is forced to `"full-reanalysis"` scoped to the finding's cited evidence. Resolved by `scripts/okstra_ctl/render.py` `_build_convergence_block` and recorded in `config.adversarial` of the convergence state artifact. |
295
+ ```
296
+
297
+ - [ ] **Step 2: Add a new §"Adversarial Verification Mode" section**
298
+
299
+ Edit: insert a new section after the end of the "Full Re-analysis (opt-in)" block in §"Verification Mode" (the `Disadvantages: 2–3 times the cost, increased time` line, line 193):
300
+
301
+ ```markdown
302
+
303
+ ## Adversarial Verification Mode
304
+
305
+ Active only when `config.adversarial == true` (default for `requirements-discovery` and `error-analysis`; see §"Configuration"). When `false`, every rule in this section is inert and the collaborative behaviour documented elsewhere in this skill applies unchanged.
306
+
307
+ In adversarial mode the verifier's job inverts: instead of confirming a peer's finding, the verifier **tries to break it**, and the burden of proof sits on the claim — a finding survives only if refutation attempts fail.
308
+
309
+ ### Scoped full-reanalysis (BLOCKING)
310
+
311
+ Adversarial mode forces `verificationMode = "full-reanalysis"`, but the re-analysis is **scoped to the evidence the finding under attack cites** (the file paths / line ranges / log lines in its `originEvidence`), plus the immediately surrounding context. The verifier MUST NOT re-read the whole task brief, instruction-set, or `final-report-template.md`. This keeps the documented "single largest avoidable cost in requirements-discovery and error-analysis" (see §"Reverify prompt: required-reading suppression") bounded while making the refutation real rather than a text-only argument.
312
+
313
+ ### Adversarial verdict semantics
314
+
315
+ The persisted `verdict` enum is unchanged (`agree | disagree | supplement | verification-error`). The prompt-facing labels are adversarial and map down on persistence:
316
+
317
+ | Prompt label | Persisted `verdict` | Meaning |
318
+ |---|---|---|
319
+ | SURVIVES | `agree` | Actively tried to refute and failed — the claim withstood the attack. |
320
+ | SURVIVES-WITH-CAVEAT | `supplement` | Holds, but a scope limit / extra condition / precondition was found. |
321
+ | REFUTED | `disagree` | The claim was broken (or failed to prove itself). MUST carry a `disagreeBasis`. |
322
+
323
+ Each `disagree` vote records a new field `disagreeBasis`:
324
+
325
+ | `disagreeBasis` | Meaning |
326
+ |---|---|
327
+ | `counter-evidence` | The verifier cited contradicting evidence (`file:line` / log line) in `explanation`. A **hard refute**. |
328
+ | `burden-not-met` | The verifier re-inspected the cited evidence and could neither confirm nor refute → the claim failed to prove itself ("when uncertain, lean to rejection"). |
329
+
330
+ A `disagree` with `disagreeBasis == null` is a contract violation in adversarial mode — every refutation must state which of the two grounds it rests on. Bare "I disagree" without re-inspection is not allowed.
331
+
332
+ ### Adversarial classification (replaces the §"Convergence Algorithm" per-round classifier when `adversarial == true`)
333
+
334
+ `verification-error` votes are excluded from numerator and denominator exactly as in the collaborative classifier. For each finding `F` in the queue at a round:
335
+
336
+ ```text
337
+ disagrees = [v for v in non-error votes if v.verdict == "disagree"]
338
+ hard_refutes = [v for v in disagrees if v.disagreeBasis == "counter-evidence"]
339
+ all_others_disagree = (every non-discoverer non-error vote is "disagree")
340
+
341
+ IF len(disagrees) == 0:
342
+     resolve F as "full-consensus" (or "partial-consensus" if any SUPPLEMENT/caveat)
343
+ ELIF all_others_disagree:
344
+     resolve F as "worker-unique" # only the discoverer still holds it
345
+ ELIF len(hard_refutes) >= 1:
346
+     # an evidence-backed refute exists and the roster is split → the claim is disputed
347
+     carry F forward; at the LAST executed round classify it "contested"
348
+ ELIF burden-not-met disagrees are a majority of non-error votes:
349
+     carry F forward; at the LAST executed round classify it "contested"
350
+ ELSE:
351
+     # a lone weak (burden-not-met) doubt against an otherwise-surviving claim
352
+     resolve F as "partial-consensus"
353
+ ```
354
+
355
+ `contested` remains a **final classification only** (per §"Scope and Terminology"): a disputed finding is carried forward through intermediate rounds and labelled `contested` only at the last executed round. For `requirements-discovery` (`effectiveMaxRounds = 1`) the single round IS the last round, so a split-with-hard-refute finding is labelled `contested` in that one round. The final-classifier block of §"Convergence Algorithm" is unchanged; this section only changes how each round's verdicts resolve into queue actions.
356
+
357
+ Design intent: one `counter-evidence` refute is enough to deny a claim consensus (it cannot rise above `contested` no matter how many others AGREE), while a single `burden-not-met` doubt does not by itself sink an otherwise-surviving claim — only a majority of burden-not-met doubts does.
358
+ ```
359
+
360
+ - [ ] **Step 3: Add the adversarial re-verification prompt**
361
+
362
+ Edit: insert a new subsection after the point where the code fence of §"Lightweight Re-verification Prompt" ends (the ` ``` ` line following `**Verdict**: ...` on line 283):
363
+
364
+ ```markdown
365
+
366
+ ### Adversarial Re-verification Prompt
367
+
368
+ Used instead of the lightweight/full-reanalysis prompt when `config.adversarial == true`. The required anchor headers (§"Required reverify-prompt anchor headers") are identical. The `[Required reading]` clause is suppressed; only the cited-evidence paths of the items under attack are injected (see §"Adversarial Verification Mode" → Scoped full-reanalysis).
369
+
370
+ ```
371
+ You are <worker-role> performing ADVERSARIAL re-verification for <task-key> (round <N>).
372
+
373
+ ## Instructions
374
+
375
+ Your job is to BREAK each finding below, not to confirm it. For EACH finding,
376
+ open the cited evidence directly and actively search for evidence that the claim
377
+ is wrong, overstated, or unproven. Then respond with exactly one verdict:
378
+
379
+ - **REFUTED**: You broke the claim. State the basis:
380
+ - counter-evidence — you found contradicting evidence (give file:line or log line), OR
381
+ - burden-not-met — you re-inspected the cited evidence and could neither confirm
382
+ nor refute it (the claim has not proven itself).
383
+ - **SURVIVES**: You actively tried to refute it and failed — the claim withstood the attack.
384
+ - **SURVIVES-WITH-CAVEAT**: It holds, but a scope limit / extra condition / missing
385
+ precondition exists (state it).
386
+
387
+ The burden of proof is on the claim. If after inspecting the cited evidence you remain
388
+ uncertain, your verdict is REFUTED with basis = burden-not-met.
389
+
390
+ Inspect ONLY the evidence each finding cites and its immediate surroundings. Do NOT
391
+ re-read the task brief, instruction-set, or report template.
392
+
393
+ ## Findings to verify
394
+
395
+ ### F-001: <one-line summary>
396
+ **Origin**: <worker role>
397
+ **Cited evidence**: <file paths, line numbers, log lines from origin worker>
398
+
399
+ ### F-002: <one-line summary>
400
+ ...
401
+
402
+ ## Response format
403
+
404
+ ### F-001
405
+ **Verdict**: REFUTED | SURVIVES | SURVIVES-WITH-CAVEAT
406
+ **Basis** (only if REFUTED): counter-evidence | burden-not-met
407
+ **Explanation**: <2-3 sentences; for counter-evidence include the file:line you found>
408
+
409
+ ### F-002
410
+ ...
411
+ ```
412
+
413
+ When persisting votes, map SURVIVES→`agree`, SURVIVES-WITH-CAVEAT→`supplement`, REFUTED→`disagree`, and copy the stated Basis into `votes.<worker>.disagreeBasis` (null for non-REFUTED verdicts).
414
+ ```
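The response format in the adversarial prompt above is regular enough to parse mechanically. The following is a minimal sketch of such a parser, not part of okstra: `parse_response` is a hypothetical helper, and it assumes each field sits on a single line exactly as the `## Response format` template shows.

```python
from __future__ import annotations

import re

# "### F-001" headers split the response into per-finding sections.
_SECTION = re.compile(r"^### (F-\d+)[ \t]*$", re.MULTILINE)
# "**Verdict**: ...", "**Basis** (only if REFUTED): ...", "**Explanation**: ..."
_FIELD = re.compile(r"^\*\*(Verdict|Basis|Explanation)\*\*[^:]*:[ \t]*(.+)$", re.MULTILINE)


def parse_response(text: str) -> dict[str, dict[str, str]]:
    """Return {findingId: {"verdict": ..., "basis": ..., "explanation": ...}}.

    Hypothetical helper: only the first line of each field is captured, and a
    missing field (for example Basis on a SURVIVES verdict) is simply absent.
    """
    parts = _SECTION.split(text)  # [preamble, id1, body1, id2, body2, ...]
    result: dict[str, dict[str, str]] = {}
    for finding_id, body in zip(parts[1::2], parts[2::2]):
        result[finding_id] = {
            name.lower(): value.strip() for name, value in _FIELD.findall(body)
        }
    return result
```

A caller would still need to map the parsed label to the persisted `verdict` and reject a REFUTED section without a basis, as the mapping rule above requires.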
415
+
416
+ - [ ] **Step 4: Reflect `config.adversarial` + `disagreeBasis` + v1.2 in the schema (State Artifact)**
417
+
418
+ Edit (a): add `adversarial` to the `config` block of the example JSON in §"Convergence State Artifact". On the line after `"enabled": true,` (around line 330):
419
+
420
+ ```json
421
+ "adversarial": false,
422
+ ```
423
+
424
+ Edit (b): show `disagreeBasis` in one place in the `votes` entry of the same example. Replace `"codex-worker": { "verdict": "agree", "explanation": "<brief>" },` with:
425
+
426
+ ```json
427
+ "codex-worker": { "verdict": "agree", "disagreeBasis": null, "explanation": "<brief>" },
428
+ ```
429
+
430
+ Edit (c): replace the `schemaVersion` rule line in the Schema rules list (lines 386–401):
431
+
432
+ ```markdown
433
+ - `schemaVersion`: literal string `"1.2"` for adversarial-capable runs (`"1.1"` for collaborative-only runs remains valid). Readers MUST accept `"1.0"` / `"1.1"` / `"1.2"` and treat any missing field as `null`.
434
+ ```
435
+
436
+ Edit (d): add a new rule line **before** the `config.effectiveMaxRounds` rule line in the same list:
437
+
438
+ ```markdown
439
+ - `config.adversarial`: boolean. `true` when this run used adversarial verification (default for `requirements-discovery` / `error-analysis`). When `true`, `config.verificationMode` is `"full-reanalysis"` (scoped) and every `disagree` vote carries a non-null `disagreeBasis`.
440
+ ```
441
+
442
+ Edit (e): add a new rule line after the `findings[].rounds[].votes.<worker>.verdict` rule line:
443
+
444
+ ```markdown
445
+ - `findings[].rounds[].votes.<worker>.disagreeBasis`: enum `counter-evidence | burden-not-met | null`. Non-null only when `verdict == "disagree"` AND `config.adversarial == true`; `null` (or absent, treated as null) otherwise. See §"Adversarial Verification Mode".
446
+ ```
447
+
448
+ - [ ] **Step 5: Confirm document consistency with the build + workflow validation**
449
+
450
+ Run: `npm run build && bash validators/validate-workflow.sh`
451
+ Expected: build succeeds (`runtime/` synced), validator PASS.
452
+
453
+ - [ ] **Step 6: Commit**
454
+
455
+ ```bash
456
+ git add skills/okstra-convergence/SKILL.md runtime/
457
+ git commit -m "feat(skills/okstra-convergence): define adversarial Phase 5.5 verification mode"
458
+ ```
459
+
460
+ ---
461
+
462
+ ### Task 4: Declare adversarial Phase 5.5 in the profiles and the common contract
463
+
464
+ **Files:**
465
+ - Modify: `prompts/profiles/requirements-discovery.md`
466
+ - Modify: `prompts/profiles/error-analysis.md`
467
+ - Modify: `prompts/profiles/_common-contract.md`
468
+
469
+ - [ ] **Step 1: Add the declaration to the requirements-discovery profile**
470
+
471
+ Edit `prompts/profiles/requirements-discovery.md`: insert a new item **before** the `- Non-goals:` line (line 54):
472
+
473
+ ```markdown
474
+ - Cross-verification mode:
475
+   - Phase 5.5 convergence runs in **adversarial mode** for this phase (`convergence.adversarial=true`). Verifiers actively try to refute each worker's finding by directly re-inspecting the cited evidence; the burden of proof sits on the claim. See `skills/okstra-convergence/SKILL.md` §"Adversarial Verification Mode". A single evidence-backed refutation prevents a finding from reaching consensus.
476
+ ```
477
+
478
+ - [ ] **Step 2: Add the declaration to the error-analysis profile**
479
+
480
+ Edit `prompts/profiles/error-analysis.md`: insert the corresponding item **before** the `- Non-goals:` line (line 33):
481
+
482
+ ```markdown
483
+ - Cross-verification mode:
484
+   - Phase 5.5 convergence runs in **adversarial mode** for this phase (`convergence.adversarial=true`). Verifiers actively try to refute each root-cause / reproduction claim by directly re-inspecting the cited code, logs, or config; the burden of proof sits on the claim. See `skills/okstra-convergence/SKILL.md` §"Adversarial Verification Mode". A single evidence-backed refutation prevents a finding from reaching consensus.
485
+ ```
486
+
487
+ - [ ] **Step 3: Update the Phase 5.5 description in the common contract**
488
+
489
+ Edit `prompts/profiles/_common-contract.md`: in the Phase 5.5 item of "Worker interaction model" (line 17), append one sentence (within the same bullet) after the closing sentence `See `skills/okstra-convergence/SKILL.md` for the round protocol, queue invariants, and final classification (`full-consensus` / `partial-consensus` / `contested` / `worker-unique`).`:
490
+
491
+ ```markdown
492
+ For `requirements-discovery` and `error-analysis` this phase runs in **adversarial mode** (`convergence.adversarial=true`): verifiers try to refute each finding against its cited evidence and the burden of proof sits on the claim — see that skill's §"Adversarial Verification Mode".
493
+ ```
494
+
495
+ - [ ] **Step 4: Build + validate**
496
+
497
+ Run: `npm run build && bash validators/validate-workflow.sh`
498
+ Expected: build succeeds, validator PASS.
499
+
500
+ - [ ] **Step 5: Commit**
501
+
502
+ ```bash
503
+ git add prompts/profiles/requirements-discovery.md prompts/profiles/error-analysis.md prompts/profiles/_common-contract.md runtime/
504
+ git commit -m "feat(prompts/profiles): declare adversarial Phase 5.5 for discovery/error-analysis"
505
+ ```
506
+
507
+ ---
508
+
509
+ ### Task 5: CHANGES.md + full verification + final commit
510
+
511
+ **Files:**
512
+ - Modify: `CHANGES.md`
513
+
514
+ - [ ] **Step 1: Add the CHANGES.md entry**
515
+
516
+ Edit `CHANGES.md`: insert a new `###` block immediately after the `## 2026-06-04` header (line 5), above the existing first entry:
517
+
518
+ ```markdown
519
+ ### feat(convergence): make Phase 5.5 of requirements-discovery / error-analysis adversarial
520
+
521
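+ - Until now, Phase 5.5 re-verification was collaborative: the prompt asked for `AGREE/DISAGREE/SUPPLEMENT` (with agreement as the low-cost default), and aggregation was "majority AGREE → consensus", so the burden of proof sat on the refuter and even a wrong claim survived as `full-consensus` when nobody actively refuted it. The cost of false consensus is highest in the two phases where a wrong claim misleads every later phase: routing (`requirements-discovery`) and root cause (`error-analysis`). Phase 5.5 of these two phases now runs in **adversarial mode** (`convergence.adversarial=true`): verifiers directly re-inspect the cited evidence and try to break the claim (REFUTED/SURVIVES/SURVIVES-WITH-CAVEAT), and the burden of proof sits on the claim (reject when uncertain). A single evidence-backed refutation keeps the claim from reaching consensus. Re-inspection is scoped to the evidence files the finding cites plus their immediate surroundings to prevent a cost blow-up, and maxRounds is unchanged (req-discovery=1, error-analysis=2). The state artifact records `config.adversarial` and the refutation basis (`disagreeBasis ∈ counter-evidence|burden-not-met`) (schema v1.2). Convergence in the other phases remains collaborative.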
522
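+ - 사용자 영향: Takes effect after the next release + `npx -y okstra@latest install`. Cross-verification in the two phases now actively refutes worker claims, so weakly supported consensus is filtered out. The adversarial *behaviour* itself is a lead/worker prompt instruction (executed by the LLM), not a runtime-enforced rule; the only thing enforced is the shape of the state artifact (contract tests). `contested` is not a rejection but a "disputed" classification, so the finding stays in the report and the reason for the demotion (counter-evidence vs. burden not met) is recorded.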
523
+ ```
524
+
525
+ - [ ] **Step 2: Full tests + validation + build consistency**
526
+
527
+ Run:
528
+ ```bash
529
+ npm run build
530
+ python3 -m pytest tests/ -q
531
+ bash validators/validate-workflow.sh
532
+ node bin/okstra --version
533
+ ```
534
+ Expected: build succeeds, all pytest tests PASS, validator PASS, version printed.
535
+
536
+ - [ ] **Step 3: Self-review from a reviewer's viewpoint (Rule 5)**
537
+
538
+ Read the whole `git diff main...HEAD` as a reviewer seeing it for the first time, and use `grep -rn` to check the new identifiers (`adversarial`, `disagreeBasis`, `Adversarial Verification Mode`, `counter-evidence`, `burden-not-met`) for consistency:
539
+
540
+ ```bash
541
+ grep -rn "disagreeBasis\|adversarial\|Adversarial Verification Mode\|counter-evidence\|burden-not-met" skills/ prompts/ scripts/ tests/ CHANGES.md
542
+ ```
543
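+ Expected: the tokens appear with the same meaning in render.py (injection), the skill (definition), the profiles/contract (declaration), the tests/fixture (enforcement), and CHANGES (record). No token should appear anywhere it is not defined.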
544
+
545
+ - [ ] **Step 4: Final commit**
546
+
547
+ ```bash
548
+ git add CHANGES.md
549
+ git commit -m "docs(changes): log adversarial Phase 5.5 verification for discovery/error-analysis"
550
+ ```
551
+
552
+ ---
553
+
554
+ ## Self-Review (author checklist)
555
+
556
+ **1. Spec coverage**
557
+ - §2.1 phase-conditional mode → Task 1 (render) + Task 3 Step 1 (Configuration table).
558
+ - §2.2 adversarial prompt → Task 3 Step 3.
559
+ - §2.3 verdict mapping + disagreeBasis → Task 3 Step 2/4, Task 2.
560
+ - §2.4 adversarial aggregation → Task 3 Step 2.
561
+ - §2.5 scoped full-reanalysis → Task 3 Step 2 (Scoped full-reanalysis), Step 3 (prompt instruction).
562
+ - §3.1 state schema 1.2 → Task 2 + Task 3 Step 4.
563
+ - §3.2 render injection → Task 1.
564
+ - §4 the six kinds of changed files → all covered by Tasks 1–5.
565
+ - §5 enforcement honesty → Task 2 (shape enforcement), CHANGES (states that the behaviour is a prompt instruction).
566
+ - §7 acceptance criteria 1–5 → confirmed by the full verification in Task 5 Step 2.
567
+
568
+ **2. Placeholder scan:** Every code/JSON/markdown block contains real content. No TBD/TODO.
569
+
570
+ **3. Type/identifier consistency:** `adversarial` (bool), `disagreeBasis` (enum `counter-evidence|burden-not-met|null`), `verificationMode` ("full-reanalysis"), and the classification values (`full-consensus|partial-consensus|contested|worker-unique`) are confirmed to be spelled identically across Tasks.
@@ -0,0 +1,176 @@
1
+ # Phase 5.5 Adversarial Verification: Design
2
+
3
+ - Date: 2026-06-04
4
+ - Scope: Convert the **Phase 5.5 convergence re-verification** of the two phases `requirements-discovery` / `error-analysis` into an **adversarial verification** structure in which verifiers actively try to refute other workers' claims and the burden of proof sits on the claim. No separate verifier agent or new stage is created; the existing convergence re-verification loop is restructured into a phase-conditional adversarial mode.
5
+ - Out of scope
6
+   - No new worker/agent is added. The `Required workers:` roster of `requirements-discovery` / `error-analysis` is unchanged.
7
+   - The convergence behaviour of `implementation-planning` / `implementation` / `final-verification` / `release-handoff` is unchanged; they keep the current collaborative re-verification.
8
+   - The plan-body verification (`P-*` queue) of `implementation-planning` is unchanged; this design covers only the finding queue (`F-*`).
9
+   - The convergence round/queue structure itself (Round 0 grouping, queue-pruned loop, Round 2 gate) is reused as-is.
10
+ - Relationship: this document extends §"Verification Mode" and §"Lightweight Re-verification Prompt" of [`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md) with an adversarial variant **for the two phases only**. The collaborative mode definition stays in place for the other phases.
11
+
12
+ ## 1. Motivation: the current re-verification is collaborative and produces false consensus
13
+
14
+ The current Phase 5.5 re-verification is, by construction, an "agree by default" structure.
15
+
16
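+ 1. **The prompt is collaborative.** The lightweight reverify prompt ([`skills/okstra-convergence/SKILL.md:247`](../../../skills/okstra-convergence/SKILL.md)) asks for `AGREE / DISAGREE / SUPPLEMENT` and has the verifier judge whether the finding is "valid based on the evidence presented". Nothing pushes the verifier to actively break the claim, so AGREE becomes the low-cost default.
17
+ 2. **Aggregation puts the burden of proof on the refuter.** The aggregation rule ([`skills/okstra-convergence/SKILL.md:120`](../../../skills/okstra-convergence/SKILL.md)) is "majority AGREE → consensus". A claim is therefore demoted only when a majority actively refutes it. Even a wrong claim survives as `full-consensus` if nobody actively refutes it.
18
+ 3. **Lightweight mode looks at text only.** The verifier does not re-inspect the original code/logs and sees only the "evidence presented" ([`skills/okstra-convergence/SKILL.md:183`](../../../skills/okstra-convergence/SKILL.md)). An incorrect evidence citation passes straight through.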
19
+
20
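+ In particular, `requirements-discovery` (routing decisions) and `error-analysis` (root-cause analysis) are the points where **a wrong claim misleads every later phase**. False consensus costs the most in these two phases. The default stance of verification is therefore flipped from "agree" to "attempt to refute".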
21
+
22
+ ## 2. Core principles
23
+
24
+ ### 2.1 Phase-conditional adversarial mode
25
+
26
+ Adversarial verification applies **only to the two phases `requirements-discovery` and `error-analysis`**. Because the convergence skill is shared by every phase, the mode switch is expressed as a new flag in the manifest `convergence` block.
27
+
28
+ | Key | Default for the two adversarial phases | Default for other phases |
29
+ |---|---|---|
30
+ | `convergence.adversarial` | `true` | `false` |
31
+ | `convergence.verificationMode` | `"full-reanalysis"` | `"lightweight"` |
32
+ | `convergence.maxRounds` | req-discovery=`1`, error-analysis=`2` (unchanged) | unchanged |
33
+
34
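+ These defaults are injected by `_build_convergence_block` at [`scripts/okstra_ctl/render.py:899`](../../../scripts/okstra_ctl/render.py). It follows the same pattern as the existing phase-aware `maxRounds` branch (`1 if requirements-discovery else 2`). If the manifest sets a key explicitly, that value takes precedence (adversarial verification can be switched on experimentally for another phase through a manifest override, but this is not recommended as a default).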
35
+
36
+ When `adversarial=false`, every change in this design is inactive and the current collaborative behaviour runs unchanged.
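The precedence described in this section (phase-aware defaults, with explicit manifest keys winning) might be sketched as follows. This is an illustration under stated assumptions, not the code in render.py: `phase_defaults` and `resolve_convergence` are hypothetical names, how a manifest override actually reaches `_build_convergence_block` is not specified here, and forcing `full-reanalysis` whenever the resolved flag is `true` is one reading of the rule that adversarial mode always uses scoped full re-analysis.

```python
from __future__ import annotations

from typing import Any, Mapping

ADVERSARIAL_PHASES = frozenset({"requirements-discovery", "error-analysis"})


def phase_defaults(task_type: str) -> dict[str, Any]:
    """Phase-aware defaults, mirroring the table in section 2.1."""
    adversarial = task_type in ADVERSARIAL_PHASES
    return {
        "adversarial": adversarial,
        "verificationMode": "full-reanalysis" if adversarial else "lightweight",
        "maxRounds": 1 if task_type == "requirements-discovery" else 2,
    }


def resolve_convergence(
    task_type: str, manifest: Mapping[str, Any] | None = None
) -> dict[str, Any]:
    """Hypothetical merge: keys set explicitly in the manifest win over defaults."""
    block = phase_defaults(task_type)
    block.update(manifest or {})
    if block["adversarial"]:
        # Assumption: adversarial mode always runs as (scoped) full re-analysis.
        block["verificationMode"] = "full-reanalysis"
    return block
```

For example, `resolve_convergence("implementation", {"adversarial": True})` would switch an otherwise collaborative phase into adversarial mode, which is the experimental override the text mentions but does not recommend as a default.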
37
+
38
+ ### 2.2 Adversarial re-verification prompt: refutation is the job
39
+
40
+ When `adversarial=true`, the lead uses the **adversarial prompt** instead of §"Lightweight Re-verification Prompt". Core instructions:
41
+
42
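+ - "Your job is to **break** this claim. Open the cited original evidence directly, re-inspect it, and actively look for counter-evidence that would bring the claim down."
43
+ - Verdict labels (prompt surface):
44
+   - **REFUTED**: the claim was refuted. A basis must be given (see `disagreeBasis` below).
45
+   - **SURVIVES**: the verifier actively tried to refute the claim and could not break it. The claim withstood the attack.
46
+   - **SURVIVES-WITH-CAVEAT**: the claim holds, but a scope limit / extra condition / precondition was found.
47
+ - **Handling uncertainty (BLOCKING):** if, after re-inspecting the original evidence, the verifier can **neither confirm nor refute** the claim, the default verdict is **REFUTED** (`disagreeBasis = burden-not-met`). The burden of proof is on the claim, so a claim that has not proven itself does not survive.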
48
+
49
+ ### 2.3 Verdict mapping: persisted enum unchanged, adversarial information recorded in a new field
50
+
51
+ The `verdict` enum of the state artifact **stays** `{agree, disagree, supplement, verification-error}` (minimising changes to the contract-test enum). Prompt labels are mapped as follows when persisted:
52
+
53
+ | Prompt label | Persisted `verdict` |
54
+ |---|---|
55
+ | SURVIVES | `agree` |
56
+ | SURVIVES-WITH-CAVEAT | `supplement` |
57
+ | REFUTED | `disagree` |
58
+
59
+ The key adversarial information is recorded in a new field added to each vote, **`disagreeBasis`**:
60
+
61
+ | Value | Meaning |
62
+ |---|---|
63
+ | `counter-evidence` | A **hard refute** that cites counter-evidence as `file:line` (or a log line). The citation goes in `votes.<worker>.explanation`. |
64
+ | `burden-not-met` | Re-inspected but could neither confirm nor refute → the claim failed to meet its burden of proof (= "reject when uncertain"). |
65
+ | `null` | The verdict is not `disagree` (= agree/supplement/verification-error). |
66
+
67
+ `adversarial=true` 인데 verdict 가 `disagree` 이고 `disagreeBasis` 가 null 이면 contract 위반이다(§5 참조). 즉 적대적 모드의 모든 반박은 둘 중 하나의 근거를 반드시 가진다 — 근거 없는 "그냥 반대" 는 허용하지 않는다.
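The label-to-verdict mapping and the "no bare disagree" rule can be sketched as a small persistence helper. This is an illustrative sketch only: `persist_vote` is a hypothetical name, and raising `ValueError` on a missing basis is an assumption about how a lead-side helper might surface the contract violation.

```python
LABEL_TO_VERDICT = {
    "SURVIVES": "agree",
    "SURVIVES-WITH-CAVEAT": "supplement",
    "REFUTED": "disagree",
}
DISAGREE_BASES = {"counter-evidence", "burden-not-met"}


def persist_vote(label, explanation, basis=None):
    """Map a prompt-surface label to the persisted vote shape.

    Hypothetical helper; the field names follow the schema in this document.
    """
    verdict = LABEL_TO_VERDICT[label]
    if verdict == "disagree":
        # Every refutation must state one of the two bases.
        if basis not in DISAGREE_BASES:
            raise ValueError("REFUTED requires a disagreeBasis")
    else:
        # A basis is meaningful only for disagree votes.
        basis = None
    return {"verdict": verdict, "disagreeBasis": basis, "explanation": explanation}
```

Keeping the mapping in one place means the persisted enum never sees the prompt labels, which is what lets the existing contract-test enum stay untouched.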
68
+
69
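+ ### 2.4 Adversarial aggregation rule: burden of proof on the claim
70
+
71
+ When `adversarial=true`, the classification logic of §"Convergence Algorithm" is replaced by the following (the collaborative logic stays as is for `adversarial=false`). For a finding `F`, `verification-error` votes are excluded from both the numerator and the denominator (same as today):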
72
+
73
+ ```text
74
+ disagrees = [v for v in non-error votes if v.verdict == "disagree"]
75
+ hard_refutes = [v for v in disagrees if v.disagreeBasis == "counter-evidence"]
76
+
77
+ IF len(disagrees) == 0:
78
+     # nobody broke it → the claim withstood the attack
79
+     F.classification = "full-consensus"
80
+     (but "partial-consensus" if any supplement (= caveat) exists)
81
+ ELIF len(hard_refutes) >= 1:
82
+     # at least one evidence-backed refute holds → demote immediately (regardless of majority)
83
+     IF every non-discoverer disagrees:
84
+         F.classification = "worker-unique"   # effectively rejected
85
+     ELSE:
86
+         F.classification = "contested"
87
+ ELSE:
88
+     # disagrees exist but all are burden-not-met (zero hard refutes)
89
+     IF every non-discoverer disagrees:
90
+         F.classification = "worker-unique"
91
+     ELIF burden-not-met disagrees are a majority (more than half of non-error votes):
92
+         F.classification = "contested"
93
+     ELSE:
94
+         F.classification = "partial-consensus"   # a minority of weak doubts, treated as survived
95
+ ```
96
+
97
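+ Design intent:
98
+ - **One `counter-evidence` refute = demotion.** This is the user's stated requirement: "demote as soon as even one evidence-backed refute holds". Even if the majority agrees, once someone presents counter-evidence by `file:line`, the claim always drops to `contested` or lower.
99
+ - **`burden-not-met` demotes only as a majority.** One verifier saying "not sure" does not kill a claim, but when more than half report a failure of proof, the claim is demoted as not having met its burden. This implements "lean toward rejection when uncertain".
100
+ - Because the two kinds of refutation are persisted separately, the final report can trace why a finding was demoted (counter-evidence found vs. failure of proof).
101
+
102
+ For multi-round runs (error-analysis maxRounds=2), carry-forward between rounds and the final classification follow the current rules unchanged; the adversarial logic above is applied to each round's classification decision.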
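The per-round rule above can be sketched as a runnable function. This is a sketch under stated assumptions, not the skill's implementation: `classify_adversarial` is a hypothetical name, `votes` is assumed to hold only the non-discoverer verifiers' votes, "majority" is read as strictly more than half of the non-error votes, and carry-forward across rounds is left out.

```python
def classify_adversarial(votes):
    """Classify one finding from its verifiers' votes for a single round.

    Assumption: `votes` contains non-discoverer votes only, each a dict
    with `verdict` and (optionally) `disagreeBasis`.
    """
    # verification-error votes leave both numerator and denominator.
    counted = [v for v in votes if v["verdict"] != "verification-error"]
    disagrees = [v for v in counted if v["verdict"] == "disagree"]
    hard_refutes = [
        v for v in disagrees if v.get("disagreeBasis") == "counter-evidence"
    ]

    if not disagrees:
        has_caveat = any(v["verdict"] == "supplement" for v in counted)
        return "partial-consensus" if has_caveat else "full-consensus"
    if len(disagrees) == len(counted):
        # Every verifier refuted: only the discoverer still holds the claim.
        return "worker-unique"
    if hard_refutes:
        # One evidence-backed refute is enough, regardless of majority.
        return "contested"
    if len(disagrees) * 2 > len(counted):
        # Only burden-not-met doubts, but they form a majority.
        return "contested"
    return "partial-consensus"
```

Checking "everyone disagrees" before the hard-refute test gives the same outcomes as the nested form above, because both branches resolve to `worker-unique` in that case. The all-votes-errored edge case is not specified in this section and is not handled specially here.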
103
+
104
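+ ### 2.5 Scoping full-reanalysis: preventing a cost blow-up
105
+
106
+ The chosen `verificationMode="full-reanalysis"` makes verifiers re-inspect the original evidence directly. However, [`skills/okstra-convergence/SKILL.md:245`](../../../skills/okstra-convergence/SKILL.md) notes that lightweight mode avoids what it calls the "single largest avoidable cost in requirements-discovery and error-analysis". Falling back to re-reading the whole instruction-set would bring that cost straight back.
107
+
108
+ **Resolution:** the re-inspection scope of adversarial full-reanalysis is **limited to "the evidence files the finding cites plus their immediate surroundings"**. Re-reading the whole task brief / instruction-set / `final-report-template.md` is forbidden. That is, the verifier opens only the code/logs that the claim under attack points to and searches them for counter-evidence.
109
+
110
+ - Narrow the full-reanalysis branch of §"Reverify prompt: required-reading suppression (BLOCKING)" for adversarial mode: inject **only the cited evidence paths**, not the whole analysis-worker file list.
111
+ - maxRounds stays as is (req-discovery=1, error-analysis=2). One adversarial round is enough to "try breaking it once", and the cost is not multiplied by the round count.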
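To make the scoping concrete, the sketch below collects the unique file paths a set of findings cites, which is the only reading list an adversarial verifier would receive. It is illustrative only: `cited_paths` is a hypothetical helper, and both the `originEvidence` list shape and the `path:line` citation format are assumptions about how citations are stored.

```python
def cited_paths(findings):
    """Return the unique evidence paths cited by the findings, in first-seen order.

    Assumption: each finding carries `originEvidence` as a list of
    "path:line" (or bare "path") strings.
    """
    paths = []
    for finding in findings:
        for citation in finding.get("originEvidence", []):
            # Drop a trailing ":<line or range>" suffix when present.
            path = citation.rsplit(":", 1)[0] if ":" in citation else citation
            if path not in paths:
                paths.append(path)
    return paths
```

Only these paths would be injected into the reverify prompt; the task brief, instruction-set, and report template stay out of the list.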
112
+
113
+ ## 3. Data model
114
+
115
+ ### 3.1 Convergence state artifact (`runs/<task-type>/state/convergence-<task-type>-<seq>.json`)
116
+
117
+ - Bump `schemaVersion` to `"1.2"`. Readers keep accepting `"1.0"`/`"1.1"` and treat missing fields as `null`.
118
+ - New key in `config`:
119
+   - `adversarial`: boolean. Whether this run was in adversarial mode. `true` for the two current adversarial phases.
120
+ - New key in `findings[].rounds[].votes.<worker>`:
121
+   - `disagreeBasis`: enum `counter-evidence | burden-not-met | null`. Follows the rules of §2.3.
122
+ - Existing fields (`verdict` enum, `classification` enum, `finalState`, etc.) are unchanged.
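The backward-compatibility rule (accept `"1.0"`/`"1.1"`, treat missing fields as `null`) can be sketched as a small normalising reader. This is a hedged illustration: `normalize_state` is a hypothetical name, and rejecting unknown versions with `ValueError` is an assumption rather than documented behaviour.

```python
import copy

ACCEPTED_VERSIONS = {"1.0", "1.1", "1.2"}


def normalize_state(state):
    """Return a copy of a convergence state with the v1.2 fields filled in as None.

    Hypothetical reader helper; the input is left unmodified.
    """
    if state.get("schemaVersion") not in ACCEPTED_VERSIONS:
        raise ValueError("unsupported schemaVersion")
    normalized = copy.deepcopy(state)
    # Missing v1.2 fields are treated as null.
    normalized.setdefault("config", {}).setdefault("adversarial", None)
    for finding in normalized.get("findings", []):
        for round_entry in finding.get("rounds", []):
            for vote in round_entry.get("votes", {}).values():
                vote.setdefault("disagreeBasis", None)
    return normalized
```

Downstream code can then read `disagreeBasis` unconditionally, whichever schema version produced the artifact.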
123
+
124
+ ### 3.2 Manifest `convergence` block injected by render.py
125
+
126
+ `_build_convergence_block` ([`scripts/okstra_ctl/render.py:899`](../../../scripts/okstra_ctl/render.py)) additionally decides the following:
127
+
128
+ ```python
129
+ adversarial_phases = {"requirements-discovery", "error-analysis"}
130
+ is_adversarial = task_type in adversarial_phases
131
+ # ...
132
+ "adversarial": is_adversarial,
133
+ "verificationMode": "full-reanalysis" if is_adversarial else "lightweight",
134
+ ```
135
+
136
+ The existing `maxRounds` branch is left as is.
137
+
138
+ ## 4. Files to change (all source; no direct edits under `runtime/`)
139
+
140
+ 1. [`skills/okstra-convergence/SKILL.md`](../../../skills/okstra-convergence/SKILL.md)
141
+    - Add the `adversarial` key to the §"Configuration" table and state the default for the two phases.
142
+    - Add a description of adversarial mode to §"Verification Mode" (including scoped full-reanalysis).
143
+    - Add the `adversarial=true` aggregation branch (§2.4) to §"Convergence Algorithm". Keep the collaborative logic unchanged.
144
+    - Add a new "Adversarial Re-verification Prompt" (§2.2) next to §"Lightweight Re-verification Prompt".
145
+    - Narrow the full-reanalysis branch of §"Reverify prompt: required-reading suppression" to cited evidence only for adversarial mode.
146
+    - Update the §"Convergence State Artifact" schema to 1.2: `config.adversarial`, `votes.<worker>.disagreeBasis`.
147
+ 2. [`scripts/okstra_ctl/render.py:899`](../../../scripts/okstra_ctl/render.py) `_build_convergence_block`: see §3.2.
148
+ 3. [`prompts/profiles/requirements-discovery.md`](../../../prompts/profiles/requirements-discovery.md) + [`prompts/profiles/error-analysis.md`](../../../prompts/profiles/error-analysis.md): a one-line declaration that Phase 5.5 runs adversarially (the profile is the authoritative place to declare behaviour).
149
+ 4. [`prompts/profiles/_common-contract.md:16`](../../../prompts/profiles/_common-contract.md): add one line to the Phase 5.5 description under "Worker interaction model" stating that the two phases use adversarial peer review.
150
+ 5. [`tests/test_convergence_state_contract.py`](../../../tests/test_convergence_state_contract.py) + `tests/fixtures/convergence/`: accept `1.2`, validate the `disagreeBasis` enum, validate that `config.adversarial` exists, and add one adversarial fixture (one `counter-evidence` refute → `contested` case).
151
+ 6. [`CHANGES.md`](../../../CHANGES.md): a `사용자 영향:` (user impact) entry.
152
+
153
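+ ## 5. Enforcement: separating declaration from enforcement
154
+
155
+ An honest enforcement boundary:
156
+
157
+ - **The adversarial *behaviour* itself (whether the lead actually attempted refutation, whether a verifier re-inspected the evidence) cannot be enforced at runtime.** The lead and the workers are LLMs, so adversarial behaviour is induced only by the declarations and instructions in the skill and prompts. The documentation states this limitation explicitly.
158
+ - **Only the *shape* of the artifact is enforced.** `tests/test_convergence_state_contract.py` checks the fixtures for the following:
159
+   - `config.adversarial` exists as a boolean.
160
+   - `disagreeBasis` is within the enum `{counter-evidence, burden-not-met, null}`.
161
+   - In fixtures with `adversarial==true`, a `disagree` verdict implies `disagreeBasis != null`.
162
+ - The runtime `validators/validate-run.py` does not inspect convergence state (same as today). This design therefore **does not promise** enforcement of adversarial behaviour on runtime runs; the fixture contract test is the only automated check.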
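The three shape rules above could be checked along these lines. This is a sketch of the idea, not the contents of `tests/test_convergence_state_contract.py`: `contract_violations` is a hypothetical helper, and returning a list of messages (rather than asserting) is a choice made here for illustration.

```python
ALLOWED_BASES = {"counter-evidence", "burden-not-met", None}


def contract_violations(state):
    """Return human-readable shape violations for one convergence state dict."""
    problems = []
    adversarial = state.get("config", {}).get("adversarial")
    if not isinstance(adversarial, bool):
        problems.append("config.adversarial must be a boolean")
    for finding in state.get("findings", []):
        for round_entry in finding.get("rounds", []):
            for worker, vote in round_entry.get("votes", {}).items():
                basis = vote.get("disagreeBasis")
                if basis not in ALLOWED_BASES:
                    problems.append(f"{worker}: unknown disagreeBasis {basis!r}")
                elif (
                    adversarial is True
                    and vote.get("verdict") == "disagree"
                    and basis is None
                ):
                    problems.append(f"{worker}: disagree without disagreeBasis")
    return problems
```

In a pytest-based contract test, each fixture would be loaded and the returned list asserted empty; as the section notes, this verifies the artifact's shape only, never whether a verifier truly re-inspected the evidence.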
163
+
164
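+ ## 6. Cost and risk
165
+
166
+ - **Cost:** switching to full-reanalysis raises the per-round cost relative to lightweight. The cited-evidence scoping of §2.5 prevents a blow-up, and keeping maxRounds as is (req-discovery=1) limits the round multiplier.
167
+ - **Risk: false demotion (false negative):** adversarial mode can demote a true claim to `contested` (a verifier presents wrong counter-evidence). Mitigation: a `counter-evidence` refute must carry a `file:line` citation (§2.3), so the reason for demotion is recorded in the report and the user can trace and rebut it. `contested` is a "disputed" classification, not a rejection, so the finding stays in the report.
168
+ - **Risk: burden-not-met abuse:** if verifiers lazily answer "not sure" throughout, a majority of burden-not-met votes can demote a perfectly sound claim. Mitigation: the prompt allows burden-not-met only "after re-inspection", and a single burden-not-met does not demote (a majority is required, §2.4).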
169
+
170
+ ## 7. Acceptance criteria
171
+
172
+ 1. `adversarial: true` and `verificationMode: "full-reanalysis"` are injected into the manifest `convergence` block for `requirements-discovery` / `error-analysis`. Other phases keep `adversarial: false` and `lightweight`.
173
+ 2. The convergence skill defines the adversarial prompt, adversarial aggregation, and cited-evidence-scoped re-inspection under the `adversarial=true` branch. The `adversarial=false` behaviour is byte-for-byte identical to the current behaviour.
174
+ 3. State schema 1.2 documents `config.adversarial` and `votes.<worker>.disagreeBasis`, and the contract test enforces the shape rules of §5.
175
+ 4. `python3 -m pytest tests/` and `bash validators/validate-workflow.sh` pass.
176
+ 5. Both profiles and `_common-contract.md` declare the adversarial Phase 5.5.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "okstra",
3
- "version": "0.46.0",
3
+ "version": "0.47.0",
4
4
  "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
5
5
  "license": "MIT",
6
6
  "author": "devonshin",
@@ -1,5 +1,5 @@
1
1
  {
2
- "package": "0.46.0",
3
- "builtAt": "2026-06-04T11:19:42.641Z",
2
+ "package": "0.47.0",
3
+ "builtAt": "2026-06-04T12:46:31.759Z",
4
4
  "repoRoot": "/home/runner/work/okstra/okstra"
5
5
  }
@@ -250,7 +250,8 @@ Convergence is enabled by default. Configure via task-manifest.json:
250
250
 
251
251
  - `convergence.enabled`: true/false (default: true)
252
252
  - `convergence.maxRounds`: 1–3 — **phase-aware default**: `1` for `requirements-discovery`, `2` for all other task types
253
- - `convergence.verificationMode`: `"lightweight"` | `"full-reanalysis"` (default: `"lightweight"`)
253
+ - `convergence.verificationMode`: `"lightweight"` | `"full-reanalysis"` (default: `"lightweight"`; the adversarial phases below force `"full-reanalysis"`)
254
+ - `convergence.adversarial`: true/false — **phase-aware default**: `true` for `requirements-discovery` / `error-analysis`, `false` otherwise. When `true`, Phase 5.5 runs in adversarial mode (verifiers refute findings; burden of proof on the claim). See [okstra-convergence](./skills/okstra-convergence/SKILL.md) "Adversarial Verification Mode".
254
255
 
255
256
  When `task-manifest.json` does not set `convergence.maxRounds`, lead MUST resolve the effective value via the phase-aware default above before entering Phase 5.5, and record the resolved value in the convergence state artifact at `config.effectiveMaxRounds`.
256
257
 
@@ -14,7 +14,7 @@ profile document.
14
14
  - Worker interaction model (shared — read before inferring behaviour from the roster):
15
15
  - the per-profile `Required workers:` block is a **roster**, not a behaviour contract. Each role's interaction mode changes across operating phases of the same run.
16
16
  - **Phase 4 / 5 (independent analysis)**: analyser workers (`claude`, `codex`, `gemini` when opted in) produce findings independently and have no access to one another's outputs. `report-writer` does not analyse.
17
- - **Phase 5.5 (convergence — peer review by workers)**: the lead replays each analyser's findings to the *other* analysers and collects `AGREE` / `DISAGREE` / `SUPPLEMENT` verdicts across up to `effectiveMaxRounds` rounds. Workers act as peer reviewers of each other's findings in this phase; the lead mediates but does not vote. See `skills/okstra-convergence/SKILL.md` for the round protocol, queue invariants, and final classification (`full-consensus` / `partial-consensus` / `contested` / `worker-unique`).
17
+ - **Phase 5.5 (convergence — peer review by workers)**: the lead replays each analyser's findings to the *other* analysers and collects `AGREE` / `DISAGREE` / `SUPPLEMENT` verdicts across up to `effectiveMaxRounds` rounds. Workers act as peer reviewers of each other's findings in this phase; the lead mediates but does not vote. See `skills/okstra-convergence/SKILL.md` for the round protocol, queue invariants, and final classification (`full-consensus` / `partial-consensus` / `contested` / `worker-unique`). For `requirements-discovery` and `error-analysis` this phase runs in **adversarial mode** (`convergence.adversarial=true`): verifiers try to refute each finding against its cited evidence and the burden of proof sits on the claim — see that skill's §"Adversarial Verification Mode".
18
18
  - Do NOT conclude "no peer review happens" from the roster alone — every profile that lists ≥2 analyser workers runs convergence by default (`convergence.enabled=true` in `task-manifest.json`).
19
19
  - Tooling — read-only MCP availability (shared):
20
20
  - MCP is not implicit okstra context. Query an MCP server only when the task brief explicitly lists it as source material for this run. Any MCP-derived finding MUST cite server, table, and the SELECT used. MCP MUST NEVER be used as a write path — schema/data mutations go through repository migration files reviewed by humans.
@@ -30,6 +30,8 @@
30
30
  - every clarification row carries a recommended answer + one-line rationale inside the `Expected form` cell; rows that lack a recommendation are rejected as half-formed.
31
31
  - **Codebase-first ambiguity resolution (defect rule)**: any ambiguity about repro, file behavior, or symbol semantics that can be answered by `Read` / `Grep` / log inspection MUST be resolved that way and recorded with file:line (or log-line) evidence. Writing a clarification row for something the codebase or shipped logs already answer is a defect of this phase.
32
32
  - **Evidence note required inside `Statement`**: every clarification row includes `Evidence checked: <path:line>` or `Evidence checked: none — <reporter-only reason>` in the `Statement` cell. `none` is allowed ONLY when the row's nature is "only the reporter can answer this" (reporter-side data, business priority, environment they observed). A row with `none` that *could* have been answered by code or logs is a defect.
33
+ - Cross-verification mode:
34
+ - Phase 5.5 convergence runs in **adversarial mode** for this phase (`convergence.adversarial=true`). Verifiers actively try to refute each root-cause / reproduction claim by directly re-inspecting the cited code, logs, or config; the burden of proof sits on the claim. See `skills/okstra-convergence/SKILL.md` §"Adversarial Verification Mode". A single evidence-backed refutation prevents a finding from reaching consensus.
33
35
  - Non-goals:
34
36
  - implementation details unless they are necessary to validate the cause
35
37
  - **source code edits, builds, migrations, or deployments** — this run produces evidence and cause analysis only; the fix belongs to a later `implementation-planning` run followed by an `implementation` run
@@ -51,6 +51,8 @@
51
51
  - every clarification row carries a recommended answer + one-line rationale inside the `Expected form` cell; rows that lack a recommendation are rejected as half-formed.
52
52
  - **Codebase-first ambiguity resolution (defect rule)**: any ambiguity that can be answered by `Read` / `Grep` / file inspection MUST be resolved that way and recorded with file:line evidence. Writing a clarification row for something the codebase already answers is a defect of this phase.
53
53
  - **Evidence note required inside `Statement`**: every clarification row includes `Evidence checked: <path:line>` or `Evidence checked: none — <human-only reason>` in the `Statement` cell. `none` is allowed ONLY when the row's nature is "only a human can answer this" (reporter intent, business priority, external authority). A row with `none` that *could* have been answered by the codebase is a defect.
54
+ - Cross-verification mode:
55
+ - Phase 5.5 convergence runs in **adversarial mode** for this phase (`convergence.adversarial=true`). Verifiers actively try to refute each worker's finding by directly re-inspecting the cited evidence; the burden of proof sits on the claim. See `skills/okstra-convergence/SKILL.md` §"Adversarial Verification Mode". A single evidence-backed refutation prevents a finding from reaching consensus.
54
56
  - Non-goals:
55
57
  - full implementation design unless it is required to decide the next phase
56
58
  - **source code edits, plan authoring, builds, or deployments** — this run only classifies the work and routes it; deeper analysis and planning belong to subsequent phases
@@ -903,6 +903,8 @@ def _build_convergence_block(ctx: dict) -> dict:
903
903
  - `enabled` default True
904
904
  - `maxRounds` default 1 for `requirements-discovery`, 2 otherwise
905
905
  - `verificationMode` default "lightweight"
906
+ - `adversarial` default True for `requirements-discovery` / `error-analysis`
907
+ (forces `verificationMode` to "full-reanalysis"), False otherwise
906
908
  - `planBodyVerification` is implementation-planning specific; the key is
907
909
  always emitted (dead-letter on other phases) so the schema stays stable.
908
910
 
@@ -912,12 +914,15 @@ def _build_convergence_block(ctx: dict) -> dict:
912
914
  """
913
915
  task_type = ctx.get("TASK_TYPE", "")
914
916
  default_max_rounds = 1 if task_type == "requirements-discovery" else 2
917
+ adversarial_phases = {"requirements-discovery", "error-analysis"}
918
+ is_adversarial = task_type in adversarial_phases
915
919
  raw_plan_verify = (ctx.get("OKSTRA_PLAN_VERIFICATION", "") or "").strip().lower()
916
920
  plan_verify_enabled = raw_plan_verify != "false"
917
921
  return {
918
922
  "enabled": True,
923
+ "adversarial": is_adversarial,
919
924
  "maxRounds": default_max_rounds,
920
- "verificationMode": "lightweight",
925
+ "verificationMode": "full-reanalysis" if is_adversarial else "lightweight",
921
926
  "planBodyVerification": {
922
927
  "enabled": plan_verify_enabled,
923
928
  "maxRounds": 1,
@@ -46,6 +46,7 @@ Configure this in the `convergence` block of `task-manifest.json`. If the block
46
46
  | `enabled` | `true` | If `false`, skip the convergence loop and use the existing consensus/divergence method |
47
47
  | `maxRounds` | phase-aware: `1` for `requirements-discovery`, `2` otherwise (range 1–3) | Maximum number of re-verification rounds. Discovery's routing/missing-input outputs gain little from a second round; other phases (especially `error-analysis`) keep `2`. Lead resolves the effective value when the manifest omits the key and records it in `config.maxRounds` of the convergence state artifact. |
48
48
  | `verificationMode` | `"lightweight"` | `"lightweight"` or `"full-reanalysis"` |
49
+ | `adversarial` | phase-aware: `true` for `requirements-discovery` / `error-analysis`, `false` otherwise | When `true`, Phase 5.5 runs in **adversarial mode** (see §"Adversarial Verification Mode"): verifiers actively try to refute each finding, the burden of proof sits on the claim, and `verificationMode` is forced to `"full-reanalysis"` scoped to the finding's cited evidence. Resolved by `scripts/okstra_ctl/render.py` `_build_convergence_block` and recorded in `config.adversarial` of the convergence state artifact. |
49
50
 
50
51
  **Auto-disable rule (BLOCKING).** Convergence requires ≥2 analyser workers to produce a meaningful consensus tally. When the active profile's `Required workers:` block (see `prompts/profiles/*.md`) resolves to fewer than 2 analyser workers — e.g. `release-handoff` (zero analyser workers, lead-only) — the lead MUST treat `convergence.enabled` as `false` for that run regardless of manifest configuration, skip Phases 5.5 and the plan-body verification round, and record `finalState: "converged"` with `totalRounds: 0` and an explanatory note in `config` (e.g. `"autoDisabled": "fewer-than-two-analysers"`). The plan-body round inherits the same rule via its `gating=false` advisory path.
51
52
 
@@ -192,6 +193,62 @@ Use the findings as a guide, but reanalyze the original code/data yourself.
192
193
  Advantages: High accuracy
193
194
  Disadvantages: 2–3 times the cost, increased time
194
195
 
196
+ ## Adversarial Verification Mode
197
+
198
+ Active only when `config.adversarial == true` (default for `requirements-discovery` and `error-analysis`; see §"Configuration"). When `false`, every rule in this section is inert and the collaborative behaviour documented elsewhere in this skill applies unchanged.
199
+
200
+ In adversarial mode the verifier's job inverts: instead of confirming a peer's finding, the verifier **tries to break it**, and the burden of proof sits on the claim — a finding survives only if refutation attempts fail.
201
+
202
+ ### Scoped full-reanalysis (BLOCKING)
203
+
204
+ Adversarial mode forces `verificationMode = "full-reanalysis"`, but the re-analysis is **scoped to the evidence the finding under attack cites** (the file paths / line ranges / log lines in its `originEvidence`), plus the immediately surrounding context. The verifier MUST NOT re-read the whole task brief, instruction-set, or `final-report-template.md`. This keeps the documented "single largest avoidable cost in requirements-discovery and error-analysis" (see §"Reverify prompt: required-reading suppression") bounded while making the refutation real rather than a text-only argument.
205
+
206
+ ### Adversarial verdict semantics
207
+
208
+ The persisted `verdict` enum is unchanged (`agree | disagree | supplement | verification-error`). The prompt-facing labels are adversarial and map down on persistence:
209
+
210
+ | Prompt label | Persisted `verdict` | Meaning |
211
+ |---|---|---|
212
+ | SURVIVES | `agree` | Actively tried to refute and failed — the claim withstood the attack. |
213
+ | SURVIVES-WITH-CAVEAT | `supplement` | Holds, but a scope limit / extra condition / precondition was found. |
214
+ | REFUTED | `disagree` | The claim was broken (or failed to prove itself). MUST carry a `disagreeBasis`. |
215
+
216
+ Each `disagree` vote records a new field `disagreeBasis`:
217
+
218
+ | `disagreeBasis` | Meaning |
219
+ |---|---|
220
+ | `counter-evidence` | The verifier cited contradicting evidence (`file:line` / log line) in `explanation`. A **hard refute**. |
221
+ | `burden-not-met` | The verifier re-inspected the cited evidence and could neither confirm nor refute → the claim failed to prove itself ("when uncertain, lean to rejection"). |
222
+
223
+ A `disagree` with `disagreeBasis == null` is a contract violation in adversarial mode — every refutation must state which of the two grounds it rests on. Bare "I disagree" without re-inspection is not allowed.
224
+
225
+ ### Adversarial classification (replaces the §"Convergence Algorithm" per-round classifier when `adversarial == true`)
226
+
227
+ `verification-error` votes are excluded from numerator and denominator exactly as in the collaborative classifier. For each finding `F` in the queue at a round:
228
+
229
+ ```text
230
+ disagrees = [v for v in non-error votes if v.verdict == "disagree"]
231
+ hard_refutes = [v for v in disagrees if v.disagreeBasis == "counter-evidence"]
232
+ all_others_disagree = (every non-discoverer non-error vote is "disagree")
233
+
234
+ IF len(disagrees) == 0:
235
+     resolve F as "full-consensus" (or "partial-consensus" if any SUPPLEMENT/caveat)
236
+ ELIF all_others_disagree:
237
+     resolve F as "worker-unique" # only the discoverer still holds it
238
+ ELIF len(hard_refutes) >= 1:
239
+     # an evidence-backed refute exists and the roster is split → the claim is disputed
240
+     carry F forward; at the LAST executed round classify it "contested"
241
+ ELIF burden-not-met disagrees are a majority of non-error votes (per the Majority definition in the Convergence Algorithm section):
242
+     carry F forward; at the LAST executed round classify it "contested"
243
+ ELSE:
244
+     # a lone weak (burden-not-met) doubt against an otherwise-surviving claim
245
+     resolve F as "partial-consensus"
246
+ ```
247
+
248
+ `contested` remains a **final classification only** (per §"Scope and Terminology"): a disputed finding is carried forward through intermediate rounds and labelled `contested` only at the last executed round. For `requirements-discovery` (`effectiveMaxRounds = 1`) the single round IS the last round, so a split-with-hard-refute finding is labelled `contested` in that one round. The final-classifier block of §"Convergence Algorithm" is unchanged; this section only changes how each round's verdicts resolve into queue actions.
249
+
250
+ Design intent: one `counter-evidence` refute is enough to deny a claim consensus (it cannot rise above `contested` no matter how many others AGREE), while a single `burden-not-met` doubt does not by itself sink an otherwise-surviving claim — only a majority of burden-not-met doubts does. When every non-discoverer refutes (all_others_disagree), the finding is worker-unique regardless of whether those refutes were counter-evidence or burden-not-met — only the discoverer still holds it. A SUPPLEMENT/caveat with zero disagrees lands partial-consensus rather than full-consensus, because a caveat means the claim does not pass cleanly (this differs from the collaborative classifier, where SUPPLEMENT counts as full agreement).
251
+
195
252
  ## Re-verification Agent Dispatch
196
253
 
197
254
  ### Sponsorship Optimization
@@ -282,6 +339,55 @@ For each finding, respond as:
282
339
  **Verdict**: ...
283
340
  ```
284
341
 
342
+ ### Adversarial Re-verification Prompt
343
+
344
+ Used instead of the lightweight/full-reanalysis prompt when `config.adversarial == true`. The required anchor headers (§"Required reverify-prompt anchor headers") are identical. The `[Required reading]` clause is suppressed; only the cited-evidence paths of the items under attack are injected (see §"Adversarial Verification Mode" → Scoped full-reanalysis).
345
+
346
+ ```
347
+ You are <worker-role> performing ADVERSARIAL re-verification for <task-key> (round <N>).
348
+
349
+ ## Instructions
350
+
351
+ Your job is to BREAK each finding below, not to confirm it. For EACH finding,
352
+ open the cited evidence directly and actively search for evidence that the claim
353
+ is wrong, overstated, or unproven. Then respond with exactly one verdict:
354
+
355
+ - **REFUTED**: You broke the claim. State the basis:
356
+ - counter-evidence — you found contradicting evidence (give file:line or log line), OR
357
+ - burden-not-met — you re-inspected the cited evidence and could neither confirm
358
+ nor refute it (the claim has not proven itself).
359
+ - **SURVIVES**: You actively tried to refute it and failed — the claim withstood the attack.
360
+ - **SURVIVES-WITH-CAVEAT**: It holds, but a scope limit / extra condition / missing
361
+ precondition exists (state it).
362
+
363
+ The burden of proof is on the claim. If after inspecting the cited evidence you remain
364
+ uncertain, your verdict is REFUTED with basis = burden-not-met.
365
+
366
+ Inspect ONLY the evidence each finding cites and its immediate surroundings. Do NOT
367
+ re-read the task brief, instruction-set, or report template.
368
+
369
+ ## Findings to verify
370
+
371
+ ### F-001: <one-line summary>
372
+ **Origin**: <worker role>
373
+ **Cited evidence**: <file paths, line numbers, log lines from origin worker>
374
+
375
+ ### F-002: <one-line summary>
376
+ ...
377
+
378
+ ## Response format
379
+
380
+ ### F-001
381
+ **Verdict**: REFUTED | SURVIVES | SURVIVES-WITH-CAVEAT
382
+ **Basis** (only if REFUTED): counter-evidence | burden-not-met
383
+ **Explanation**: <2-3 sentences; for counter-evidence include the file:line you found>
384
+
385
+ ### F-002
386
+ ...
387
+ ```
388
+
389
+ When persisting votes, map SURVIVES→`agree`, SURVIVES-WITH-CAVEAT→`supplement`, REFUTED→`disagree`, and copy the stated Basis into `votes.<worker>.disagreeBasis` (null for non-REFUTED verdicts).
390
+
285
391
  ### Full Re-analysis Re-verification Prompt
286
392
 
287
393
  ```
@@ -324,10 +430,11 @@ Save it to `runs/<task-type>/state/convergence-<task-type>-<seq>.json`.
324
430
 
325
431
  ```json
326
432
  {
327
- "schemaVersion": "1.1",
433
+ "schemaVersion": "1.2",
328
434
  "taskKey": "<task-key>",
329
435
  "config": {
330
436
  "enabled": true,
437
+ "adversarial": false,
331
438
  "maxRounds": 2,
332
439
  "effectiveMaxRounds": 2,
333
440
  "verificationMode": "lightweight"
@@ -345,7 +452,7 @@ Save it to `runs/<task-type>/state/convergence-<task-type>-<seq>.json`.
345
452
  {
346
453
  "round": 1,
347
454
  "votes": {
348
- "codex-worker": { "verdict": "agree", "explanation": "<brief>" },
455
+ "codex-worker": { "verdict": "agree", "disagreeBasis": null, "explanation": "<brief>" },
349
456
  "gemini-worker": { "verdict": "supplement", "explanation": "<brief>" }
350
457
  }
351
458
  }
@@ -385,11 +492,13 @@ Save it to `runs/<task-type>/state/convergence-<task-type>-<seq>.json`.
385
492
 
386
493
  Schema rules:
387
494
 
388
- - `schemaVersion`: literal string `"1.1"` for new runs. Readers MUST accept `"1.0"` for historical artifacts and treat any missing v1.1 field as `null`.
495
+ - `schemaVersion`: literal string `"1.2"` for all new runs — both adversarial and collaborative. v1.2 adds `config.adversarial` and `votes.<worker>.disagreeBasis`, written as `false` / `null` respectively on collaborative runs. Readers MUST accept `"1.0"` / `"1.1"` / `"1.2"` for historical artifacts and treat any missing field as `null`.
496
+ - `config.adversarial`: boolean. `true` when this run used adversarial verification (default for `requirements-discovery` / `error-analysis`). When `true`, `config.verificationMode` is `"full-reanalysis"` (scoped) and every `disagree` vote carries a non-null `disagreeBasis`.
389
497
  - `config.effectiveMaxRounds`: the integer the lead actually used after resolving the phase-aware default (`1` for `requirements-discovery`, `2` otherwise). MUST equal `config.maxRounds` when the manifest explicitly set it.
390
498
  - `findings[].ticketIds`: array of ticket keys from Phase 4 grouping (parsed per the Round 0 step 5 rule). MAY be empty when the discovering worker tagged the finding `unknown`.
391
499
  - `findings[].rounds[].votes.<worker>.verdict`: enum, one of `agree | disagree | supplement | verification-error`. Lower-case tokens; map upper-case AGREE/DISAGREE/SUPPLEMENT verdicts emitted by workers to their lower-case form before persisting. `verification-error` is reserved for terminal non-result dispatches (§"Worker failure handling in reverify").
392
- - `findings[].classification`: enum, one of `full-consensus | partial-consensus | worker-unique | contested`. No other value is permitted in v1.1.
500
+ - `findings[].rounds[].votes.<worker>.disagreeBasis`: enum `counter-evidence | burden-not-met | null`. Non-null only when `verdict == "disagree"` AND `config.adversarial == true`; `null` (or absent, treated as null) otherwise. See §"Adversarial Verification Mode".
501
+ - `findings[].classification`: enum, one of `full-consensus | partial-consensus | worker-unique | contested`. No other value is permitted.
393
502
  - `roundHistory[].inputQueueSize`: queue size at the start of this round.
394
503
  - `roundHistory[].resolvedCount`: number of findings that exited the queue this round (sum of full+partial+worker-unique classifications produced this round).
395
504
  - `roundHistory[].carriedForwardCount`: queue size at the END of this round — the single definition. In-round insertions into the queue are forbidden, so this always equals `inputQueueSize - resolvedCount`. The pseudocode's per-item `carriedForwardCount += 1` accumulator is a counting convenience that lands on the same value; persist the post-round queue length, not the loop accumulator, if the two ever diverge.
@@ -397,7 +506,7 @@ Schema rules:
397
506
  - `roundHistory[].skippedWorkers[]`: per-worker `{worker, reason}` for workers with no items to verify OR with a non-result dispatch.
398
507
  - `round2SkippedReason`: literal enum `queue-empty | max-rounds-1 | all-reverify-non-result | not-skipped`. Always present. Use `"not-skipped"` when Round 2 actually ran. Use `"max-rounds-1"` when `effectiveMaxRounds == 1` (Round 2 was never attempted). Use `"queue-empty"` when Round 1 fully drained the queue. Use `"all-reverify-non-result"` when all Round 1 dispatches terminated as non-result.
399
508
  - `finalClassificationCounts`: post-loop counts. Required field with keys `fullConsensus`, `partialConsensus`, `contested`, `workerUnique`.
400
- - `finalState ∈ {converged, max-rounds-reached, aborted-non-result}`. Assigned by the lead at WHILE-loop exit: `converged` when the queue is empty at the end of any round; `max-rounds-reached` when the loop exits because `roundIndex == effectiveMaxRounds` with the queue still non-empty; `aborted-non-result` when the loop exits via the Worker-failure BREAK (Task 3's "Worker failure handling in reverify" rule 4). `aborted-non-result` is the new v1.1 value.
509
+ - `finalState ∈ {converged, max-rounds-reached, aborted-non-result}`. Assigned by the lead at WHILE-loop exit: `converged` when the queue is empty at the end of any round; `max-rounds-reached` when the loop exits because `roundIndex == effectiveMaxRounds` with the queue still non-empty; `aborted-non-result` when the loop exits via the Worker-failure BREAK (per the "Worker failure handling in reverify" section, rule 4). `aborted-non-result` was introduced in v1.1.
401
510
  - `totalRounds`: count of rounds actually executed (not `effectiveMaxRounds`). May be `0` when Round 0 produced no queue items (all findings reached consensus during grouping).
402
511
 
403
512
  ## Output