npm - @vigolium/piolium - Versions diffs - 0.0.2 → 0.0.3 - Mend

@vigolium/piolium 0.0.2 → 0.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/README.md +4 -2
package/agents/confirm-writer.md +58 -42
package/agents/poc-runner.md +13 -13
package/agents/test-locator.md +5 -5
package/extensions/piolium/export-results.ts +3 -2
package/extensions/piolium/modes/confirm.ts +37 -78
package/package.json +1 -1

package/README.md CHANGED

@@ -6,9 +6,11 @@
   <p align="center"><a href="https://www.vigolium.com">www.vigolium.com</a> - <a href="https://docs.vigolium.com">docs.vigolium.com</a></p>
 </p>
+![Vigolium Audit](https://github.com/vigolium/docs/blob/main/images/audit/vigolium-audit-with-piolium.png?raw=true)
 # Piolium
-Piolium is Vigolium's Pi-native repository security audit agent. It runs multi-phase source audits with specialist sub-agents, resumable state, controlled concurrency, PoC generation, and final reporting.
+Piolium is [Vigolium](https://www.vigolium.com/)'s Pi-native repository security audit agent. It runs multi-phase source audits with specialist sub-agents, resumable state, controlled concurrency, PoC generation, and final reporting.
 Piolium is packaged as a Pi extension. Once installed, it registers `/piolium-*` slash commands inside Pi sessions and also provides a standalone `piolium` launcher when installed through the quick installer.
@@ -36,7 +38,7 @@ For development from this checkout:
 ```bash
 bun install
 bun run import-archon -- --src /path/to/archon-audit
-pi install ./
+pi install .
 ```
 More install, build, release, auth, and development details are in [HACKING.md](HACKING.md).

package/agents/confirm-writer.md CHANGED

@@ -35,40 +35,48 @@ For each finding, extract:
 ### 2. Categorize Results
-Group findings into confirmation categories. Each finding gets ONE category — when both V4 and V5 produced verdicts, pick the strongest in this priority order: `confirmed-live` > `confirmed-test` > `confirmed-fp` > `analytical-only` > `unconfirmed` > `inconclusive` > `blocked` > `no-poc` > `error`.
+Group findings into confirmation categories. Each finding gets ONE category — when both V4 and V5 produced verdicts, pick the strongest in this priority order: `live-verified` > `test-verified` > `false-positive` > `analytical` > `not-reproduced` > `flaky` > `blocked` > `no-poc` > `errored`.
-The category is independent of `Documented-Intent`. A `match: yes` finding can still be `confirmed-live` — the PoC ran and the documented behavior was exactly what it produced. The reader uses both columns together to decide whether to triage further.
+The category is independent of `Documented-Intent`. A `match: yes` finding can still be `live-verified` — the PoC ran and the documented behavior was exactly what it produced. The reader uses both columns together to decide whether to triage further.
 | Category | Criteria |
 |----------|---------|
-| `confirmed-live` | PoC executed successfully against live environment (structured-output `status: confirmed`) |
-| `confirmed-test` | Generated test demonstrated the vulnerability |
-| `confirmed-fp` | fp-check determined the original draft was a false positive (drain from severity counts) |
-| `analytical-only` | Finding's `Protocol: non-exploitable` — confirmation is structural, not behavioural |
-| `unconfirmed` | PoC failed AND test could not confirm |
-| `inconclusive` | PoC's structured output reported `inconclusive` (e.g., race condition that didn't trigger) |
+| `live-verified` | PoC executed successfully against live environment (structured-output `status: confirmed`) |
+| `test-verified` | Generated test demonstrated the vulnerability |
+| `false-positive` | fp-check determined the original draft was a false positive (drain from severity counts) |
+| `analytical` | Finding's `Protocol: non-exploitable` — confirmation is structural, not behavioural |
+| `not-reproduced` | PoC ran cleanly AND/OR test ran cleanly without demonstrating the issue (covers both V4 `Confirm-Status: not-reproduced` and V5 `Confirm-Status: not-reproduced` — `Confirm-Method` tells the two apart) |
+| `flaky` | PoC's structured output reported `inconclusive` (e.g., race condition that didn't trigger deterministically) |
 | `blocked` | App unreachable, missing interpreter, missing auth token, install failure, test timeout, or no test framework |
 | `no-poc` | Finding had no PoC script and no testable code path |
-| `error` | Pipeline error during confirmation (record the failure for re-run) |
+| `errored` | Pipeline error during confirmation (record the failure for re-run) |
 **Deduplication rule**: a single finding ID appears in EXACTLY ONE category. Do not double-count when a finding was attempted by both V4 and V5 — the priority order above resolves it.
-### 3. Stage Confirmed Findings
+### 3. Stage Findings by Verdict
-Before writing the report, mirror every finding that received a verdict into `archon/confirm-workspace/confirmed-findings/`, grouped by category. This gives reviewers a single place to scan only the findings the confirmer reached a conclusion on, without having to cross-reference `confirmation-report.md` against `archon/findings/`.
+Before writing the report, mirror every finding that received a verdict into two top-level buckets under `archon/confirm-workspace/`, each grouped by category. This makes the outcome self-evident from the directory layout — a reviewer sees at a glance which findings the confirmer stood behind and which it could not, without cross-referencing `confirmation-report.md` against `archon/findings/`.
-Included categories: `confirmed-live`, `confirmed-test`, `analytical-only`, `confirmed-fp`. Findings in `unconfirmed | inconclusive | blocked | no-poc | error` are NOT staged — they remain only in `archon/findings/` and the report.
+- `archon/confirm-workspace/report-ready/<category>/` — findings the confirmer reached a positive conclusion on (the ship list). Categories: `live-verified`, `test-verified`, `analytical`, `false-positive`.
+- `archon/confirm-workspace/needs-review/<category>/` — every finding that did NOT confirm (the followup queue). Categories: `not-reproduced`, `flaky`, `blocked`, `no-poc`, `errored`.
+Both buckets are derived, disposable copies, regenerated each run. `archon/findings/` remains the canonical source of truth, and each staged `report.md` still carries the exact `Confirm-Status`, so the category folder is a convenience index, not authoritative.
 ```bash
-# Wipe any prior staging so the folder reflects only this run.
-rm -rf archon/confirm-workspace/confirmed-findings
-mkdir -p archon/confirm-workspace/confirmed-findings/{confirmed-live,confirmed-test,analytical-only,confirmed-fp}
+# Wipe any prior staging so the folders reflect only this run.
+rm -rf archon/confirm-workspace/report-ready archon/confirm-workspace/needs-review
+mkdir -p archon/confirm-workspace/report-ready/{live-verified,test-verified,analytical,false-positive}
+mkdir -p archon/confirm-workspace/needs-review/{not-reproduced,flaky,blocked,no-poc,errored}
 ```
-For each finding whose resolved category is one of the four above:
+For each finding, copy its directory into the bucket matching its resolved category from §2 — ship-list categories go to `report-ready/<category>/`, the rest to `needs-review/<category>/`:
 ```bash
-cp -R "archon/findings/<ID>-<slug>/" "archon/confirm-workspace/confirmed-findings/<category>/"
+# live-verified | test-verified | analytical | false-positive
+cp -R "archon/findings/<ID>-<slug>/" "archon/confirm-workspace/report-ready/<category>/"
+# not-reproduced | flaky | blocked | no-poc | errored
+cp -R "archon/findings/<ID>-<slug>/" "archon/confirm-workspace/needs-review/<category>/"
 ```
 `cp -R` copies the full directory (report.md, PoC scripts, `confirm-evidence/`, `confirm-test*`, etc.) so each staged entry is self-contained for review. If the source directory is missing (e.g., a finding ID survived in the report but its directory was deleted), log a warning and skip — do not abort report generation.
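
Note on the hunk above: the priority order and the two staging buckets can be sketched in a few lines of TypeScript. This is a minimal illustration built only from the category names in the diff. The helper names `resolveVerdict` and `stagingDir` are hypothetical and do not appear in the package.

```typescript
// Priority order quoted from the confirm-writer instructions (strongest first).
const PRIORITY = [
  "live-verified",
  "test-verified",
  "false-positive",
  "analytical",
  "not-reproduced",
  "flaky",
  "blocked",
  "no-poc",
  "errored",
] as const;

type Verdict = (typeof PRIORITY)[number];

// Ship-list categories; every other category goes to the follow-up queue.
const REPORT_READY: ReadonlySet<Verdict> = new Set<Verdict>([
  "live-verified",
  "test-verified",
  "analytical",
  "false-positive",
]);

// Hypothetical helper: pick the strongest verdict among those V4 and V5 produced.
function resolveVerdict(verdicts: Verdict[]): Verdict | undefined {
  let best: Verdict | undefined;
  for (const v of verdicts) {
    if (best === undefined || PRIORITY.indexOf(v) < PRIORITY.indexOf(best)) {
      best = v;
    }
  }
  return best;
}

// Hypothetical helper: staging path relative to archon/confirm-workspace/.
function stagingDir(verdict: Verdict): string {
  const bucket = REPORT_READY.has(verdict) ? "report-ready" : "needs-review";
  return `${bucket}/${verdict}`;
}

const resolved = resolveVerdict(["not-reproduced", "test-verified"]);
if (resolved !== undefined) {
  console.log(stagingDir(resolved)); // report-ready/test-verified
}
```

Because each finding resolves to a single verdict before a bucket is chosen, the deduplication rule (one finding ID in exactly one category) holds by construction in this sketch.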
@@ -87,35 +95,43 @@ Write `archon/confirmation-report.md`:
 | Confirmed at | <ISO timestamp> |
 | Environment | <method_used from env-connection.json or "test-only" or "--target URL"> |
 | Original audit mode | <mode from audit-state.json, or "unknown"> |
-| Confirmed-findings staging | `archon/confirm-workspace/confirmed-findings/` (grouped by verdict) |
+| Findings staging | `archon/confirm-workspace/report-ready/` + `needs-review/` (grouped by verdict category) |
 ## Summary
-| Status | Count | Findings |
-|--------|-------|----------|
-| confirmed-live | N | C1, H2, ... |
-| confirmed-test | N | H3, M1, ... |
-| confirmed-fp | N | ... |
-| analytical-only | N | ... |
-| unconfirmed | N | M2, ... |
-| inconclusive | N | ... |
+| Verdict | Count | Findings |
+|---------|-------|----------|
+| live-verified | N | C1, H2, ... |
+| test-verified | N | H3, M1, ... |
+| false-positive | N | ... |
+| analytical | N | ... |
+| not-reproduced | N | M2, ... |
+| flaky | N | ... |
 | blocked | N | ... |
 | no-poc | N | ... |
-| error | N | ... |
+| errored | N | ... |
-**Confirmation rate**: X/Y findings confirmed (Z%) — `confirmed-fp` and `analytical-only` are excluded from the denominator (they're not pending verification).
+**Confirmation rate**: X/Y findings confirmed (Z%) — `false-positive` and `analytical` are excluded from the denominator (they're not pending verification).
 ## Breakdown by Exploitability Class
 (read from `archon/confirm-workspace/findings-inventory.json:by_class`)
-| Class | Total | confirmed-live | confirmed-test | unconfirmed | blocked | analytical-only |
-|-------|-------|----------------|----------------|-------------|---------|-----------------|
+| Class | Total | live-verified | test-verified | not-reproduced | blocked | analytical |
+|-------|-------|---------------|---------------|----------------|---------|------------|
 | network-exploitable | N | N | N | N | N | — |
 | local-exploitable | N | — | N | N | N | — |
 | non-exploitable | N | — | — | — | — | N |
-## Confirmed Findings (Live)
+## Pre-Auth Exposure
+(cross-cut index — list every finding whose `report.md` has `Auth-Required: no`, regardless of verdict. These are exploitable without credentials and are the highest priority for client reports. Omit the section entirely if no finding has `Auth-Required: no`.)
+| ID | Title | Severity | Verdict | Vector |
+|----|-------|----------|---------|--------|
+| C1 | ... | CRITICAL | live-verified | unauthenticated HTTP |
+## Report-Ready — Live Verified
 ### <ID> — <title> [<severity>]
@@ -127,7 +143,7 @@ Write `archon/confirmation-report.md`:
 ---
-## Confirmed Findings (Test)
+## Report-Ready — Test Verified
 ### <ID> — <title> [<severity>]
@@ -139,7 +155,7 @@ Write `archon/confirmation-report.md`:
 ---
-## Unconfirmed Findings
+## Needs-Review — Not Reproduced
 ### <ID> — <title> [<severity>]
@@ -151,7 +167,7 @@ Write `archon/confirmation-report.md`:
 ---
-## Blocked Findings
+## Needs-Review — Blocked
 ### <ID> — <title> [<severity>]
@@ -215,15 +231,15 @@ If `archon/audit-state.json` exists, update the latest audit entry. Two writes:
     "environment_method": "<method_used or 'remote' or 'test-only'>",
     "target_url": "<base_url or --target URL>",
     "results": {
-      "confirmed_live": <count>,
-      "confirmed_test": <count>,
-      "confirmed_fp": <count>,
-      "analytical_only": <count>,
-      "unconfirmed": <count>,
-      "inconclusive": <count>,
+      "live_verified": <count>,
+      "test_verified": <count>,
+      "false_positive": <count>,
+      "analytical": <count>,
+      "not_reproduced": <count>,
+      "flaky": <count>,
       "blocked": <count>,
       "no_poc": <count>,
-      "error": <count>
+      "errored": <count>
     },
     "by_class": {"network-exploitable": <count>, "local-exploitable": <count>, "non-exploitable": <count>},
     "confirmation_rate": "<X/Y (Z%)>"
@@ -241,7 +257,7 @@ If `archon/audit-state.json` exists, update the latest audit entry. Two writes:
       "started_at": "<ISO timestamp>",
       "completed_at": "<ISO timestamp>",
       "target_url": "<base_url>",
-      "results": {"confirmed_live": N, "confirmed_test": N, "...": "..."}
+      "results": {"live_verified": N, "test_verified": N, "...": "..."}
     }
   ]
 }
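
Note on the `results` block above: the confirmation-rate string could be derived from the renamed count keys roughly as follows. This is a sketch under two assumptions that the diff does not state: that "confirmed" means `live_verified + test_verified`, and that the percentage is rounded to a whole number. The helper name `confirmationRate` is hypothetical.

```typescript
// Count keys as written to audit-state.json in this diff.
interface ConfirmResults {
  live_verified: number;
  test_verified: number;
  false_positive: number;
  analytical: number;
  not_reproduced: number;
  flaky: number;
  blocked: number;
  no_poc: number;
  errored: number;
}

// Hypothetical helper. Assumption: numerator is live_verified + test_verified.
function confirmationRate(r: ConfirmResults): string {
  const confirmed = r.live_verified + r.test_verified;
  // false_positive and analytical are excluded from the denominator,
  // as the report template specifies.
  const pending =
    confirmed + r.not_reproduced + r.flaky + r.blocked + r.no_poc + r.errored;
  // Assumption: round to a whole percent; report 0% when nothing is pending.
  const pct = pending === 0 ? 0 : Math.round((confirmed / pending) * 100);
  return `${confirmed}/${pending} (${pct}%)`;
}

const example: ConfirmResults = {
  live_verified: 3,
  test_verified: 2,
  false_positive: 1,
  analytical: 2,
  not_reproduced: 3,
  flaky: 1,
  blocked: 1,
  no_poc: 0,
  errored: 0,
};
console.log(confirmationRate(example)); // 5/10 (50%)
```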

package/agents/poc-runner.md CHANGED

@@ -43,9 +43,9 @@ Read the finding report at `archon/findings/<ID>-<slug>/report.md`. Extract:
 - `Protocol:` field (`http`, `grpc`, `graphql`, `websocket`, `tcp`, `local`, `non-exploitable`) — written by poc-author. Defaults to `http` if absent.
 - `Auth-Required:` field (`yes` / `no`) — defaults to `no` if absent.
 - Expected security effect (what the PoC should demonstrate)
-- Current `Confirm-Status` (skip if already `confirmed-live` from a previous run)
+- Current `Confirm-Status` (skip if already `live-verified` from a previous run)
-If `Protocol: non-exploitable`, write `Confirm-Status: analytical-only` and exit cleanly — there is no live verification to run.
+If `Protocol: non-exploitable`, write `Confirm-Status: analytical` and exit cleanly — there is no live verification to run.
 ### 2. Locate the PoC Script
@@ -158,17 +158,17 @@ Allowed `status` values: `confirmed`, `failed`, `inconclusive`.
 Parse the LAST line of `exploit.log` matching `^\{.*"status".*\}$`. Map directly:
-- `confirmed` → `Confirm-Status: confirmed-live`
-- `failed`    → `Confirm-Status: failed` (try variant 2 if not yet attempted)
-- `inconclusive` → `Confirm-Status: inconclusive` (treated like failed for V5 fallback purposes; reporter surfaces it distinctly)
+- `confirmed` → `Confirm-Status: live-verified`
+- `failed`    → `Confirm-Status: not-reproduced` (try variant 2 if not yet attempted)
+- `inconclusive` → `Confirm-Status: flaky` (treated like not-reproduced for V5 fallback purposes; reporter surfaces it distinctly)
-**Legacy PoC fallback**: if no structured line is present (older PoCs from before the contract), apply the heuristic — non-zero exit + no security marker = `failed`; security marker present = `confirmed-live`. Add `Confirm-Notes: legacy-poc-format` so the operator knows to upgrade.
+**Legacy PoC fallback**: if no structured line is present (older PoCs from before the contract), apply the heuristic — non-zero exit + no security marker = `not-reproduced`; security marker present = `live-verified`. Add `Confirm-Notes: legacy-poc-format` so the operator knows to upgrade.
-For **failed** results from variant 1: run variant 2 with a different payload encoding, alternate endpoint path, or alternative auth identity (e.g., switch `{{TOKEN_user}}` ↔ `{{TOKEN_admin}}` for privilege-escalation-shaped findings).
+For **not-reproduced** results from variant 1: run variant 2 with a different payload encoding, alternate endpoint path, or alternative auth identity (e.g., switch `{{TOKEN_user}}` ↔ `{{TOKEN_admin}}` for privilege-escalation-shaped findings).
-For **failed** results after both variants: run the `fp-check` skill on the original draft (`archon/findings/<ID>-<slug>/draft.md`) using the live evidence as context. Two outcomes:
-- fp-check confirms the draft is itself a false positive → `Confirm-Status: confirmed-fp`
-- fp-check finds the draft sound but the live PoC weak → keep `Confirm-Status: failed` and let V5 generate a reproducer test
+For **not-reproduced** results after both variants: run the `fp-check` skill on the original draft (`archon/findings/<ID>-<slug>/draft.md`) using the live evidence as context. Two outcomes:
+- fp-check confirms the draft is itself a false positive → `Confirm-Status: false-positive`
+- fp-check finds the draft sound but the live PoC weak → keep `Confirm-Status: not-reproduced` and let V5 generate a reproducer test
 Record each attempt and the fp-check verdict in `archon/findings/<ID>-<slug>/confirm-evidence/attempts.log`.
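
Note on the hunk above: the mapping from a PoC's structured output line to the renamed `Confirm-Status` values might look like the following. This is a sketch only. The regular expression and the three status values come from the diff, while the helper name `confirmStatusFromLog` and the choice to return `undefined` when no usable structured line exists (leaving the legacy heuristic to the caller) are assumptions.

```typescript
type PocStatus = "confirmed" | "failed" | "inconclusive";
type ConfirmStatus = "live-verified" | "not-reproduced" | "flaky";

// Direct mapping described in poc-runner.md after this change.
const STATUS_MAP: Record<PocStatus, ConfirmStatus> = {
  confirmed: "live-verified",
  failed: "not-reproduced",
  inconclusive: "flaky",
};

// Hypothetical helper: inspect the LAST line of exploit.log that looks like a
// structured status object. Returns undefined when there is no such line or
// it is unusable, so the caller can fall back to the legacy heuristic.
function confirmStatusFromLog(log: string): ConfirmStatus | undefined {
  const lines = log.split(/\r?\n/);
  for (let i = lines.length - 1; i >= 0; i--) {
    const line = lines[i] ?? "";
    if (!/^\{.*"status".*\}$/.test(line)) continue;
    try {
      const status = (JSON.parse(line) as { status?: unknown }).status;
      if (status === "confirmed" || status === "failed" || status === "inconclusive") {
        return STATUS_MAP[status];
      }
    } catch {
      // Malformed JSON on the matching line: treat as unusable.
    }
    return undefined; // only the last matching line is considered
  }
  return undefined;
}

const sampleLog = [
  "sending payload",
  '{"status":"failed","evidence":"","notes":"first try"}',
  "retrying",
  '{"status":"confirmed","evidence":"HTTP 200","notes":""}',
  "",
].join("\n");
console.log(confirmStatusFromLog(sampleLog)); // live-verified
```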
@@ -176,7 +176,7 @@ Record each attempt and the fp-check verdict in `archon/findings/<ID>-<slug>/con
 Write confirmation status back to the finding:
 ```
-Confirm-Status: confirmed-live | failed | inconclusive | error | blocked | confirmed-fp | analytical-only | no-poc
+Confirm-Status: live-verified | not-reproduced | flaky | errored | blocked | false-positive | analytical | no-poc
 Confirm-Timestamp: <ISO timestamp>
 Confirm-Evidence: archon/findings/<ID>-<slug>/confirm-evidence/
 Confirm-Variant-Count: <1 or 2>
@@ -184,9 +184,9 @@ Confirm-FpCheck: ran | not-run
 Confirm-Notes: <brief description of what was observed>
 ```
-If **failed** or **inconclusive** after all attempts, the finding is queued for test-locator (V5) fallback.
+If **not-reproduced** or **flaky** after all attempts, the finding is queued for test-locator (V5) fallback.
 If **blocked** (missing interpreter, missing auth token, app unreachable), the finding is queued for V5 too — V5 may succeed where the live PoC could not.
-If **confirmed-fp** or **analytical-only**, the finding skips V5 entirely.
+If **false-positive** or **analytical**, the finding skips V5 entirely.
 ## Completion

package/agents/test-locator.md CHANGED

@@ -5,7 +5,7 @@ model: sonnet
 color: blue
 permissionMode: bypassPermissions
 effort: low
-description: Confirmation phase V5 test-based verification agent that maps unconfirmed findings to existing test files, generates minimal reproducer tests targeting each vulnerability, executes them in isolation within archon/findings/<ID>/, and updates confirmation status
+description: Confirmation phase V5 test-based verification agent that maps not-reproduced / blocked / no-poc findings to existing test files, generates minimal reproducer tests targeting each vulnerability, executes them in isolation within archon/findings/<ID>/, and updates confirmation status
 ---
 You are a test mapper for the confirmation phase of a security audit. You verify findings by generating and running targeted test cases when live PoC execution is not possible.
@@ -184,17 +184,17 @@ The outer `timeout 90` is a belt-and-suspenders cap — if the runner ignores it
 ### 8. Assess Result
 - **Test passes** (exit 0): the vulnerability is confirmed — malicious input reached the sink
-  → `Confirm-Status: confirmed-test`
+  → `Confirm-Status: test-verified`
 - **Test fails** (assertion error): the application sanitized/blocked the input — not confirmed this way
-  → `Confirm-Status: unconfirmed`
+  → `Confirm-Status: not-reproduced`
 - **Test errors** (import error, syntax error, runtime crash): test couldn't execute
-  → `Confirm-Status: unconfirmed` with `Confirm-Notes` explaining the error
+  → `Confirm-Status: not-reproduced` with `Confirm-Notes` explaining the error
 ### 9. Update Finding
 Write back to the finding report:
 ```
-Confirm-Status: confirmed-test | unconfirmed | blocked
+Confirm-Status: test-verified | not-reproduced | blocked
 Confirm-Method: generated-test
 Confirm-Test: archon/findings/<ID>-<slug>/confirm-test.{ext}
 Confirm-Test-Output: archon/findings/<ID>-<slug>/confirm-test-output.log

package/extensions/piolium/export-results.ts CHANGED

@@ -2,6 +2,7 @@ import { existsSync, mkdirSync, readFileSync, readdirSync, statSync, writeFileSy
 import { basename, dirname, extname, isAbsolute, join, relative, resolve } from "node:path";
 import { splitFrontmatter } from "./agents.ts";
 import { type FindingDraft, listFindingDirs, readFindingFrontmatter } from "./findings.ts";
+import type { ConfirmVerdict } from "./modes/confirm.ts";
 export type ExportFormat = "json" | "md-dir";
@@ -197,8 +198,8 @@ function includeFinding(
 function isConfirmed(confirmStatus: string | undefined): boolean {
 	return (
-		confirmStatus === "confirmed-live" ||
-		confirmStatus === "confirmed-test" ||
+		confirmStatus === ("live-verified" satisfies ConfirmVerdict) ||
+		confirmStatus === ("test-verified" satisfies ConfirmVerdict) ||
 		confirmStatus === "confirmed"
 	);
 }

package/extensions/piolium/modes/confirm.ts CHANGED

@@ -2,7 +2,7 @@
  * Confirm mode (`/piolium-confirm`).
  *
  * Verification pass over an already-completed audit (command-defs/confirm.md,
- * archon-audit @ 2026-05-16). Seven phases:
+ * vigolium-audit @ 2026-05-20). Seven phases:
  *
  *   V1   findings inventory (env-profiler surveys & classifies findings by
  *                            exploitability: network / local / non-exploitable)
@@ -28,15 +28,7 @@
  * `piolium/confirm-workspace/env-connection.json`.
  */
-import {
-	existsSync,
-	mkdirSync,
-	readFileSync,
-	readdirSync,
-	renameSync,
-	statSync,
-	writeFileSync,
-} from "node:fs";
+import { existsSync, mkdirSync, readFileSync, readdirSync, statSync, writeFileSync } from "node:fs";
 import { basename, extname, join } from "node:path";
 import type { AgentRuntimeModel } from "../agent-runner.ts";
 import { loadAgents } from "../agents.ts";
@@ -73,12 +65,28 @@ const WORK = CONFIRM_WORKSPACE;
 const REPORT = CONFIRM_REPORT;
 export const POC_RESULTS = `${WORK}/poc-results.json`;
 export const INTENT_CORPUS = `${WORK}/intent-corpus.json`;
-const FP_RENAMES = `${WORK}/false-positive-renames.json`;
 export const CLEANUP_SUMMARY = `${WORK}/cleanup-summary.json`;
 const MAX_REDACTABLE_BYTES = 5 * 1024 * 1024;
 export const CONFIRM_AGENT_PHASES = ["V1", "V1.5", "V2", "V3", "V4", "V5", "V6"] as const;
+export const REPORT_READY_VERDICTS = [
+	"live-verified",
+	"test-verified",
+	"analytical",
+	"false-positive",
+] as const;
+export const NEEDS_REVIEW_VERDICTS = [
+	"not-reproduced",
+	"flaky",
+	"blocked",
+	"no-poc",
+	"errored",
+] as const;
+export type ConfirmVerdict =
+	| (typeof REPORT_READY_VERDICTS)[number]
+	| (typeof NEEDS_REVIEW_VERDICTS)[number];
 const TEXT_EXTENSIONS = new Set([
 	".csv",
 	".curl",
@@ -159,7 +167,7 @@ const CONFIRMATION_STANDARD = [
 	"- Write evidence under each finding's `evidence/` directory; include enough detail for replay.",
 	"- Do not mark confirmed from code plausibility alone.",
 	"- Mark `Confirm-Status: false-positive` only when real execution or a targeted reproducer proves the claimed exploit path is blocked, unreachable, or contradicted by code/runtime behavior.",
-	"- If evidence is incomplete, use `blocked`, `inconclusive`, or `unconfirmed` instead of false-positive.",
+	"- If evidence is incomplete, use `blocked`, `flaky`, or `not-reproduced` instead of false-positive.",
 ].join("\n");
 export function buildConfirmTask(phase: string, target: string | undefined): string {
@@ -214,12 +222,12 @@ export function buildConfirmTask(phase: string, target: string | undefined): str
 		case "V4":
 			return [
 				"You are running V4 (PoC Execution) of /piolium-confirm.",
-				"Read findings-inventory.json and env-connection.json. Skip non-exploitable findings as `Confirm-Status: analytical-only`; route local-only findings to V5.",
+				"Read findings-inventory.json and env-connection.json. Skip non-exploitable findings as `Confirm-Status: analytical`; route local-only findings to V5.",
 				"Before per-finding execution, run one reachability check against base_url with a 5s timeout; if unreachable, mark queued network findings `blocked` and record the reason.",
 				"For every network-exploitable finding with a PoC, execute the real PoC against the target. Use a 30s timeout per variant, max 2 variants.",
 				"Capture exact command, relevant env, HTTP request/response or stdout/stderr, and observable before/after state to `<finding-dir>/evidence/confirmed-<timestamp>.log`.",
 				"Parse structured PoC output if present: final JSON line `{status,evidence,notes}`.",
-				"Update each `report.md` with `Confirm-Status: confirmed-live | failed | blocked | analytical-only | false-positive` and `Confirm-Evidence:` pointing at the evidence file.",
+				"Update each `report.md` with `Confirm-Status: live-verified | not-reproduced | flaky | blocked | analytical | false-positive` and `Confirm-Evidence:` pointing at the evidence file.",
 				`Write aggregate results to \`${POC_RESULTS}\`.`,
 				CONFIRMATION_STANDARD,
 			].join("\n\n");
@@ -229,20 +237,29 @@ export function buildConfirmTask(phase: string, target: string | undefined): str
 				"For findings whose live PoC did not confirm, had no PoC, or are local-exploitable, generate the smallest reproducer test in the existing test framework.",
 				"Actually run the test with a 60s cap (pytest timeout, jest --testTimeout, go test -timeout, etc.).",
 				"Keep reproducer files/evidence under each finding dir and write command/output logs under `evidence/`.",
-				"Update `report.md`: `Confirm-Status: confirmed-test | failed | blocked | false-positive` and `Confirm-Evidence:`.",
+				"Update `report.md`: `Confirm-Status: test-verified | not-reproduced | blocked | false-positive` and `Confirm-Evidence:`.",
 				"Only mark `false-positive` when the reproducer proves the claimed vulnerable path is unreachable, patched, protected, or based on an invalid assumption.",
 				`Write \`${WORK}/test-mapping.json\` with per-finding verdicts and evidence pointers.`,
 				CONFIRMATION_STANDARD,
 			].join("\n\n");
-		case "V6":
+		case "V6": {
+			const reportReady = REPORT_READY_VERDICTS.join(", ");
+			const needsReview = NEEDS_REVIEW_VERDICTS.join(", ");
+			const allVerdicts = [...REPORT_READY_VERDICTS, ...NEEDS_REVIEW_VERDICTS].join(", ");
 			return [
 				"You are running V6 (Confirmation Report) of /piolium-confirm.",
-				"Read `piolium/findings/`, including any directories renamed with `FP-` after V5.",
-				`Compose \`${REPORT}\` with: confirmed-live, confirmed-test, analytical-only, blocked, inconclusive/unconfirmed, and false-positive counts.`,
-				"Include one line per finding with status, evidence pointer, and reproduction command summary.",
-				"Create a dedicated false-positive section listing every `FP-*` directory and the evidence that disproved it.",
+				"Read every `report.md` under `piolium/findings/` and treat it as the source of truth.",
+				"Stage every finding into one of two derived buckets under `piolium/confirm-workspace/` (regenerated each run, wipe prior staging first):",
+				`  - \`piolium/confirm-workspace/report-ready/<category>/\` for ${reportReady} (the ship list)`,
+				`  - \`piolium/confirm-workspace/needs-review/<category>/\` for ${needsReview} (the followup queue)`,
+				"Use `cp -R` so each staged entry is self-contained (report.md, PoC scripts, confirm-evidence/, confirm-test*).",
+				`Compose \`${REPORT}\` with: a Summary table of all nine verdicts (${allVerdicts}), a Breakdown by Exploitability Class section, and a Pre-Auth Exposure cross-cut index that lists every finding whose \`report.md\` has \`Auth-Required: no\` (omit the section if none).`,
+				"For each verdict category that has findings, include a section with one entry per finding (ID — title [severity], vulnerability class, method, evidence pointer, observation).",
+				"Confirmation rate denominator excludes `false-positive` and `analytical`.",
+				"If `piolium/audit-state.json` exists, append a new entry to `audits[-1].confirmation_history[]` and refresh `audits[-1].confirmation` with the latest run's summary — never overwrite the history array.",
 				"Include environment setup notes, target URL/base_url, cleanup result, and methodology.",
 			].join("\n\n");
+		}
 		default:
 			return "Unknown V phase.";
 	}
@@ -288,58 +305,11 @@ export function writeRemoteConnection(cwd: string, target: string): void {
 	);
 }
-function reportMarksFalsePositive(text: string): boolean {
-	return (
-		/^(?:Confirm-Status|Confirmation|Confirm-Verdict|Verdict)\s*:\s*(?:false[-_ ]positive|fp)\b/im.test(
-			text,
-		) || /"confirm_status"\s*:\s*"false[-_ ]positive"/i.test(text)
-	);
-}
-function uniqueDest(root: string, name: string): string {
-	let candidate = join(root, name);
-	let suffix = 2;
-	while (existsSync(candidate)) {
-		candidate = join(root, `${name}-${suffix}`);
-		suffix++;
-	}
-	return candidate;
-}
-export function renameFalsePositiveFindings(cwd: string): string[] {
-	const root = join(cwd, "piolium", "findings");
-	if (!existsSync(root)) return [];
-	const renames: string[] = [];
-	for (const entry of readdirSync(root).sort()) {
-		if (entry.startsWith("FP-")) continue;
-		const dir = join(root, entry);
-		try {
-			if (!statSync(dir).isDirectory()) continue;
-			const reportPath = join(dir, "report.md");
-			if (!existsSync(reportPath)) continue;
-			if (!reportMarksFalsePositive(readFileSync(reportPath, "utf8"))) continue;
-			const destName = `FP-${entry}`;
-			const dest = uniqueDest(root, destName);
-			renameSync(dir, dest);
-			renames.push(`${entry} -> ${basename(dest)}`);
-		} catch {
-			// Keep confirmation moving; V6 will still report available evidence.
-		}
-	}
-	ensureConfirmWorkdir(cwd);
-	writeFileSync(
-		join(cwd, FP_RENAMES),
-		`${JSON.stringify({ renamed_at: new Date().toISOString(), renames }, null, "\t")}\n`,
-	);
-	return renames;
-}
 export interface ConfirmCleanupResult {
 	summaryPath: string;
 	checkedFindingDirs: string[];
 	createdEvidenceDirs: string[];
 	formatIssues: string[];
-	falsePositiveRenames: string[];
 	redactedFiles: Array<{ path: string; replacements: Record<string, number> }>;
 	skippedFiles: Array<{ path: string; reason: string }>;
 }
@@ -546,7 +516,6 @@ function normalizeFindingLayout(
 export function cleanupConfirmArtifacts(cwd: string): ConfirmCleanupResult {
 	ensureConfirmWorkdir(cwd);
-	const falsePositiveRenames = renameFalsePositiveFindings(cwd);
 	const layout = normalizeFindingLayout(cwd);
 	const skippedFiles: ConfirmCleanupResult["skippedFiles"] = [];
 	const candidates: string[] = [];
@@ -560,7 +529,6 @@ export function cleanupConfirmArtifacts(cwd: string): ConfirmCleanupResult {
 	const result: ConfirmCleanupResult = {
 		summaryPath: CLEANUP_SUMMARY,
 		...layout,
-		falsePositiveRenames,
 		redactedFiles,
 		skippedFiles,
 	};
@@ -618,15 +586,6 @@ export async function runConfirmAudit(opts: RunConfirmOptions): Promise<RunConfi
 			});
 			continue;
 		}
-		if (name === "V6") {
-			const renames = renameFalsePositiveFindings(cwd);
-			if (renames.length > 0) {
-				ui?.notify?.(
-					`Renamed ${renames.length} false-positive finding folder(s) with FP- prefix.`,
-					"warning",
-				);
-			}
-		}
 		try {
 			await runAgentPhase({
 				cwd,

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
 	"name": "@vigolium/piolium",
-	"version": "0.0.2",
+	"version": "0.0.3",
 	"description": "Pi-native port of archon-audit. Multi-phase security audits with specialist sub-agents, isolated context windows, capped concurrency, and resumable state — packaged as a Pi extension.",
 	"keywords": ["pi-package", "security", "audit", "subagents", "piolium"],
 	"license": "MIT",