npm - solidity-argus - Versions diffs - 0.6.0 → 0.6.1 - Mend

solidity-argus 0.6.0 → 0.6.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (20)

package/package.json +1 -1
package/src/agents/argus-prompt.ts +25 -8
package/src/agents/audit-specialist-prompt.ts +18 -0
package/src/agents/pythia-prompt.ts +6 -3
package/src/agents/scribe-prompt.ts +4 -0
package/src/agents/sentinel-prompt.ts +7 -0
package/src/agents/themis-prompt.ts +2 -0
package/src/features/background-agent/background-manager.ts +85 -2
package/src/features/persistent-state/run-finalizer.ts +19 -3
package/src/hooks/system-prompt-hook.ts +18 -2
package/src/managers/types.ts +21 -0
package/src/shared/lineage-validator.ts +96 -0
package/src/shared/report-path-resolver.ts +8 -2
package/src/state/types.ts +1 -0
package/src/tools/forge-coverage-tool.ts +41 -5
package/src/tools/persist-deduped-tool.ts +45 -1
package/src/tools/read-findings-tool.ts +46 -5
package/src/tools/record-finding-tool.ts +3 -29
package/src/tools/report-generator-tool.ts +134 -37
package/src/tools/slither-tool.ts +62 -2

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "solidity-argus",
-  "version": "0.6.0",
+  "version": "0.6.1",
   "description": "Solidity smart contract security auditing plugin for OpenCode — 6 specialized agents, 16 tools (15 core + optional Solodit), and a curated vulnerability knowledge base",
   "keywords": [
     "solidity",

package/src/agents/argus-prompt.ts CHANGED

@@ -102,11 +102,13 @@ When the user explicitly asks for a deep or adversarial review, or when the scop
 Default deep/adversarial behavior: choose 2-4 relevant profiles, not every profile.
+Dispatch discipline is mandatory: run exactly one specialist profile per Task. Never bundle multiple profiles into the same audit-specialist prompt. If you choose 3 profiles, dispatch 3 separate audit-specialist tasks and synthesize their separate handoffs.
 Profile selection rules:
 - Privileged roles, proxies, initializers, or upgrade authority: \`access-control\`.
 - Asset/share vaults, staking, lending, or rewards: \`math-precision\`, \`invariant\`, \`economic-security\`.
 - Bridges, callbacks, queues, routers, or asynchronous flows: \`execution-trace\`, \`economic-security\`.
-- Heavy libraries, adapters, wrappers, or helpers: \`periphery\`.
+- Routers, position routers, heavy libraries, adapters, wrappers, or helpers: \`periphery\`.
 - High-value, unfamiliar, or broad adversarial requests: \`first-principles\` plus \`vector-scan\`.
 Dispatch examples:
@@ -115,6 +117,8 @@ Task(subagent_type="audit-specialist", prompt="Run specialist profile: math-prec
 Task(subagent_type="audit-specialist", prompt="Run specialist profile: vector-scan. Scope: src/. Load attack-vector-deck. Classify vectors as skip/drop/investigate and record only confirmed findings.")
 \`\`\`
+Each audit-specialist prompt must also request the structured handoff fields \`findings_recorded_ids\`, \`leads_not_recorded\`, \`tools_run\`, \`tool_failures\`, \`escalations_for_argus\`, and \`human_readable_brief\`.
 Audit-specialist findings are normal raw findings. Scribe and Themis must preserve \`reported_by_agent: "audit-specialist"\` and include them in raw -> deduped -> report parity checks.
 ### 6. Testing & Verification
@@ -213,9 +217,22 @@ Task(subagent_type="scribe", prompt="Generate the final audit report for Project
 ### Your Tools vs Subagent Tools
 **You (Argus) can use directly:**
-- \`read\`, \`bash\`, \`grep\`, \`glob\` — for reading code, running commands, searching patterns
+- \`read\`, \`bash\`, \`grep\`, \`glob\` — only for bounded scope discovery, not for executing the audit yourself
 - \`Task\` — for delegating to subagents
+### Direct-Tool Budget (CRITICAL)
+Argus is an orchestrator, not the tactical executor. Direct \`read\`/\`bash\`/\`grep\`/\`glob\` calls are capped at **8 total per user turn** and only for:
+- locating candidate scope files,
+- reading top-level project documentation,
+- checking whether the user's requested scope is ambiguous.
+After those bounded discovery calls, you MUST either:
+1. ask one concise scope-clarification question, or
+2. delegate the next audit work to Sentinel/Pythia/Audit Specialist with \`Task\`.
+Do NOT line-by-line audit contracts, enumerate every file, inspect full dependency trees, or run repeated shell/read probes directly in Argus. If more context is needed, delegate it. A broad audit request should produce early parallel delegation, not dozens of direct tool calls.
 **Only subagents can use (via Task delegation):**
 - \`argus_slither_analyze\`, \`argus_forge_test\`, \`argus_forge_fuzz\`, \`argus_forge_coverage\`, \`argus_gas_analysis\` → delegate to **sentinel**
 - \`argus_analyze_contract\`, \`argus_check_patterns\`, \`argus_proxy_detection\` → delegate to **sentinel**
@@ -249,8 +266,8 @@ Task(subagent_type="scribe", prompt="Generate the final audit report for Project
 - **Tools**: \`argus_skill_load\`, \`argus_check_patterns\`, \`argus_solodit_search\`, \`argus_analyze_contract\`, \`argus_slither_analyze\`, \`argus_proxy_detection\`, \`argus_forge_test\`, \`argus_forge_fuzz\`, \`argus_forge_coverage\`, \`argus_gas_analysis\`, \`argus_record_finding\`.
 - **Delegation Examples**:
   \`\`\`
-  Task(subagent_type="audit-specialist", prompt="Run specialist profile: math-precision. Scope: src/Vault.sol. Return FINDING/LEAD blocks and record only confirmed findings.")
-  Task(subagent_type="audit-specialist", prompt="Run specialist profile: vector-scan. Scope: src/. Load attack-vector-deck and record only confirmed findings.")
+  Task(subagent_type="audit-specialist", prompt="Run specialist profile: math-precision. Scope: src/Vault.sol. Return FINDING/LEAD blocks plus structured handoff fields. Record only confirmed findings.")
+  Task(subagent_type="audit-specialist", prompt="Run specialist profile: vector-scan. Scope: src/. Load attack-vector-deck and return structured handoff fields. Record only confirmed findings.")
   \`\`\`
 - **Constraint**: Use only for explicit deep/adversarial requests, complex protocol scopes, or Themis remediation. It returns \`FINDING\` and \`LEAD\` blocks; only confirmed findings are persisted.
@@ -567,7 +584,7 @@ STEPS:
 2. Deduplicate: group findings by vulnerability class + code location, merge into single entries. Include \`observation_ids\` on every deduped finding so each raw finding maps to exactly one report entry.
 3. Enrich: for each Critical/High finding, write specific impact and recommendation
 4. Call argus_persist_deduped with run_id and your deduped findings array — this writes the source-of-truth JSON to disk
-5. Call argus_generate_report with run_id, project_name, and scope — the tool reads deduped findings from disk
+5. Call argus_generate_report with run_id, project_name, scope, preflight_policy: "strict-fail", and quality_gate_policy: "strict-fail" — the tool reads deduped findings from disk
 Overall risk assessment: {your assessment}
 ")
@@ -588,7 +605,7 @@ After Scribe returns, check the \`<argus-context>\` injected in your system cont
 If you see \`REPORT GENERATION: INCOMPLETE\`, it means Scribe did NOT call \`argus_generate_report\` — the report file was NOT written to disk.
 **Recovery steps**:
-1. Re-dispatch Scribe with a shorter prompt: "Call argus_read_findings with run_id {run-id}, then call argus_generate_report with report_input containing the findings. The tool handles formatting."
+1. Re-dispatch Scribe with a shorter prompt: "Call argus_read_findings with run_id {run-id}, persist deduped findings if needed, then call argus_generate_report with run_id, project_name, scope, preflight_policy: 'strict-fail', and quality_gate_policy: 'strict-fail'."
 2. If Scribe fails a second time, call \`argus_generate_report\` yourself.
 **An audit is NOT complete until the report file exists on disk.**
@@ -608,14 +625,14 @@ Themis will:
 4. Return a verdict: approved or issues found
 **If Themis flags issues**, YOU are the final judge, but you must record a resolved disposition before the audit is complete:
-- If Themis found genuinely dropped findings → re-dispatch Scribe with specific correction instructions, then record status="remediated" with notes.
+- If Themis found genuinely dropped findings → re-dispatch Scribe with specific correction instructions, then re-run Themis on the regenerated report. Record status="remediated" only as an intermediate note; the audit is complete only after a fresh approved Themis disposition.
 - If Themis disagrees on severity → evaluate the evidence and either remediate the report or record status="overridden" with a concrete justification.
 - If Themis found potential false positives → assess and remediate or explicitly override with justification.
 - If Themis approves → record status="approved" with the Themis verdict.
 Record the disposition by calling \`argus_themis_disposition\` with \`status\`, \`verdict_json\`, and either \`notes\` for remediation or \`justification\` for overrides.
-If Themis returns approved=false, Argus remains the final judge but must record a disposition before the audit is complete: remediate the issue and record status="remediated", or deliberately override with status="overridden" and a concrete justification. A missing Themis verdict or missing Argus disposition means the audit is incomplete.
+If Themis returns approved=false, Argus remains the final judge but must record a disposition before the audit is complete: remediate the issue, regenerate the report, re-run Themis, and record a fresh status="approved" disposition; or deliberately override with status="overridden" and a concrete justification. A missing Themis verdict, a remediated status without a later approved Themis verdict, or missing Argus disposition means the audit is incomplete.
 **An audit is NOT complete until Themis has validated the output and Argus has recorded a resolved disposition.**

package/src/agents/audit-specialist-prompt.ts CHANGED

@@ -14,6 +14,8 @@ At task start:
 3. For \`vector-scan\`, \`first-principles\`, unfamiliar protocols, or broad adversarial review, also load \`attack-vector-deck\`.
 4. Load supporting vulnerability/protocol skills only when they materially sharpen the review.
+You must run exactly one active profile per task. If the prompt asks for multiple profiles, stop and return a LEAD asking Argus to split the work into one task per profile; do not execute a bundled multi-profile review.
 Recognized profiles:
 - \`vector-scan\`: mechanically apply the bundled attack-vector deck and classify vectors as skip/drop/investigate.
 - \`access-control\`: load \`access-control-specialist\`; map roles, modifiers, initialization, upgrade authority, and inconsistent guards.
@@ -47,6 +49,12 @@ When recording a confirmed finding with \`argus_record_finding\`, include specif
 ## OUTPUT CONTRACT
+## ANTI-LOOP CHECKPOINTS
+Emit a \`CHECKPOINT\` block after every 5 reviewed functions or when changing contracts. The checkpoint must state the active profile, last function reviewed, next function to review, tools run so far, and whether any new evidence was found.
+Do not repeat the same function, same trace, or same \`SAFE\`/\`LEAD\` assessment more than once. If a function remains unresolved after two consecutive passes with the same conclusion and no new evidence, move it to \`leads_not_recorded\` with the missing proof and continue to the next distinct target.
 Return structured blocks only:
 \`\`\`text
@@ -60,6 +68,16 @@ LEAD | contract: Name | function: func | bug_class: kebab-tag | profile: math-pr
 code_smells: what looked suspicious
 missing_proof: what still needs verification
 description: one sentence explaining the trail
+HANDOFF_JSON
+{
+  "findings_recorded_ids": ["observation-or-finding-id"],
+  "leads_not_recorded": [{ "group_key": "Name | func | bug-class", "missing_proof": "specific blocker" }],
+  "tools_run": ["argus_analyze_contract"],
+  "tool_failures": [],
+  "escalations_for_argus": [],
+  "human_readable_brief": "one paragraph summary"
+}
 \`\`\`
 Rules:
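
The `HANDOFF_JSON` fields added above lend themselves to a mechanical completeness check on the orchestrator side. A minimal sketch of such a check follows; the `Handoff` type, `REQUIRED_FIELDS`, and `parseHandoff` are hypothetical names introduced for illustration and do not appear in the package. Only the six field names come from the prompt text.

```typescript
// Hypothetical consumer-side types for the HANDOFF_JSON block. The field names
// come from the prompt above; everything else is an assumption for illustration.
interface LeadNotRecorded {
  group_key: string
  missing_proof: string
}

interface Handoff {
  findings_recorded_ids: string[]
  leads_not_recorded: LeadNotRecorded[]
  tools_run: string[]
  tool_failures: unknown[]
  escalations_for_argus: unknown[]
  human_readable_brief: string
}

const REQUIRED_FIELDS = [
  "findings_recorded_ids",
  "leads_not_recorded",
  "tools_run",
  "tool_failures",
  "escalations_for_argus",
  "human_readable_brief",
] as const

// Returns the parsed handoff, or the list of missing fields when it is incomplete.
function parseHandoff(text: string): { handoff?: Handoff; missing: string[] } {
  let parsed: unknown
  try {
    parsed = JSON.parse(text)
  } catch {
    // Unparseable JSON: report every field as missing.
    return { missing: REQUIRED_FIELDS.slice() }
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { missing: REQUIRED_FIELDS.slice() }
  }
  const record = parsed as Record<string, unknown>
  const missing: string[] = REQUIRED_FIELDS.filter((field) => !(field in record))
  if (missing.length > 0) return { missing }
  return { handoff: record as unknown as Handoff, missing }
}
```

This only checks that the fields are present; validating the shape of each field is left out to keep the sketch short.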

package/src/agents/pythia-prompt.ts CHANGED

@@ -42,9 +42,12 @@ You must follow this structured research process:
 ### 3. Cross-Referencing & Deep Dive
 - **Objective**: Connect the dots between history and the current code.
 - **Actions**:
-  - If Solodit shows that "Protocol X had a read-only reentrancy bug in function Y", check if the current contract has a similar function Y.
-  - If \`argus_check_patterns\` flags a delegatecall, search Solodit for "delegatecall storage collision" to find case studies.
-  - Synthesize the findings: "This pattern matches the 2022 Rari Capital exploit."
+   - If Solodit shows that "Protocol X had a read-only reentrancy bug in function Y", check if the current contract has a similar function Y.
+   - If \`argus_check_patterns\` flags a delegatecall, search Solodit for "delegatecall storage collision" to find case studies.
+   - Perform a bounded source read of the specific matched function or integration point before treating a precedent as applicable.
+   - Synthesize the findings: "This pattern matches the 2022 Rari Capital exploit."
+Do not record a precedent-only finding. A historical report can justify impact and recommendations, but \`argus_record_finding\` requires code-specific evidence from the current target.
 ### 4. Reporting
 - **Objective**: Deliver actionable intelligence to Argus.

package/src/agents/scribe-prompt.ts CHANGED

@@ -70,9 +70,13 @@ Argus provides you with a \`run_id\`. Your job: read findings, deduplicate, enri
    - \`project_name\`: the project name
    - \`scope\`: list of audited files
    - \`run_id\`: the run ID (the tool reads your persisted deduped findings from disk and resolves the canonical envelope automatically)
+   - \`preflight_policy: "strict-fail"\`
+   - \`quality_gate_policy: "strict-fail"\`
    **DO NOT** pass \`report_input\`, \`findings\`, \`toolsExecuted\`, \`session_id\`, or any other field — the tool reads them from durable state on disk. Passing them risks contract-mismatch failures.
+   Before this call, verify that every deduped finding file is inside the audited scope. Do not include findings outside the audited scope in the final persisted set.
 6. **Limitations disclosure**: If any tool failed or was absent, add a \`## Limitations\` section.
 7. Confirm: "Report generated via argus_generate_report: {filePath}".

package/src/agents/sentinel-prompt.ts CHANGED

@@ -106,6 +106,7 @@ You have access to a specific set of tools. Use them effectively.
 **When to use**: After running tests, to identify gaps in coverage.
 **Arguments**:
 - \`target\` (string): Path to the project directory (default ".").
+- \`ir_minimum\` (boolean): Retry coverage with \`ir_minimum: true\` when coverage fails with stack-too-deep, optimizerSteps, config parse, or instrumentation errors.
 **Interpretation**:
 - Focus on low branch coverage in critical contracts (vaults, token transfers, access control).
 - Untested code paths are prime candidates for hidden vulnerabilities.
@@ -156,6 +157,12 @@ You have access to a specific set of tools. Use them effectively.
 **Interpretation**:
 - Recording is mandatory before handing findings to Argus for final synthesis.
+### Large Tool Output Discipline
+- If any tool output or copied log exceeds 5,000 characters, summarize it in at most 10 bullets before continuing.
+- Preserve the exact failing command or tool name and preserve every artifact path needed for follow-up.
+- If a full output artifact path is available, reference that artifact path instead of embedding the full text.
+- do not paste the full output back into the conversation.
 ## SKILL SYSTEM
 Use \`argus_skill_load\` only when specialized context is needed before deep verification work.

package/src/agents/themis-prompt.ts CHANGED

@@ -84,6 +84,8 @@ Focus questions:
 Return a structured validation result, not a full report.
+Return exactly one JSON verdict. No prose after the JSON verdict.
 Use this exact shape:
 \`\`\`json

package/src/features/background-agent/background-manager.ts CHANGED

@@ -1,7 +1,12 @@
-import type { BackgroundManager } from "../../managers/types"
+import type {
+  BackgroundFailureDiagnostic,
+  BackgroundManager,
+  BackgroundTaskDiagnostic,
+  BackgroundTaskStatus,
+} from "../../managers/types"
 import { createLogger } from "../../shared/logger"
-type TaskStatus = "queued" | "running" | "completed" | "failed" | "cancelled"
+type TaskStatus = BackgroundTaskStatus
 type CompletionCallback = (taskId: string, result: unknown) => void
 export interface BackgroundTaskOptions {
@@ -25,6 +30,66 @@ interface TaskInfo {
   callbacks: Set<CompletionCallback>
 }
+function errorText(error: unknown): string {
+  if (error instanceof Error) return error.message
+  if (typeof error === "string") return error
+  try {
+    return JSON.stringify(error)
+  } catch {
+    return String(error)
+  }
+}
+export function classifyBackgroundFailure(
+  error: unknown,
+  task?: Pick<TaskInfo, "status" | "prompt">,
+): BackgroundFailureDiagnostic {
+  if (task?.status === "cancelled") {
+    return {
+      category: "cancelled",
+      retry_recommendation: "do_not_retry",
+      summary: "Background task was cancelled before completion.",
+    }
+  }
+  const text = errorText(error)
+  const lower = text.toLowerCase()
+  if (text.includes("This model does not support assistant message prefill")) {
+    return {
+      category: "model_error",
+      retry_recommendation: "retry_with_changes",
+      summary: "Provider rejected assistant prefill; retry with a fresh or shorter prompt.",
+    }
+  }
+  if (lower.includes("timed out")) {
+    const likelySizeRelated = (task?.prompt.length ?? 0) > 5_000
+    return {
+      category: "timeout",
+      retry_recommendation: likelySizeRelated ? "retry_with_changes" : "safe_to_retry",
+      summary: likelySizeRelated
+        ? "Background task timed out; retry with a shorter prompt or narrower scope."
+        : "Background task timed out; retrying is safe if upstream services are healthy.",
+    }
+  }
+  if (
+    lower.includes("argus tool") ||
+    lower.includes("command failed") ||
+    lower.includes("tool error") ||
+    lower.includes('"success":false')
+  ) {
+    return {
+      category: "tool_error",
+      retry_recommendation: "retry_with_changes",
+      summary: "Background task failed inside a tool or command invocation.",
+    }
+  }
+  return {
+    category: "unknown",
+    retry_recommendation: "retry_with_changes",
+    summary: text.length > 0 ? text : "Background task failed for an unknown reason.",
+  }
+}
 export type Dispatcher = (
   agentName: string,
   prompt: string,
@@ -185,6 +250,23 @@ export function createBackgroundManager(
     return Promise.resolve(task.result)
   }
+  function getTaskStatus(taskId: string): Promise<BackgroundTaskDiagnostic | undefined> {
+    const task = tasks.get(taskId)
+    if (!task) return Promise.resolve(undefined)
+    if (task.status === "completed") {
+      return Promise.resolve({ status: task.status, result: task.result })
+    }
+    if (task.status === "failed" || task.status === "cancelled") {
+      return Promise.resolve({
+        status: task.status,
+        error: task.error,
+        diagnostic: classifyBackgroundFailure(task.error, task),
+      })
+    }
+    return Promise.resolve({ status: task.status })
+  }
   function onComplete(
     taskIdOrCallback: string | CompletionCallback,
     callback?: CompletionCallback,
@@ -227,6 +309,7 @@ export function createBackgroundManager(
     dispatch,
     cancel,
     getResult,
+    getTaskStatus,
     onComplete,
     getActiveCount,
   }
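
The diagnostics returned by `getTaskStatus` carry a `retry_recommendation`, but the diff does not show how callers act on it. One possible consumer is sketched below. `planRetry` and the 2,000-character trim are assumptions made for illustration; only the `BackgroundFailureDiagnostic` shape is taken from the diff.

```typescript
// Shape copied from the type added to managers/types.ts in this diff.
type BackgroundFailureDiagnostic = {
  category: "model_error" | "tool_error" | "timeout" | "cancelled" | "unknown"
  retry_recommendation: "safe_to_retry" | "retry_with_changes" | "do_not_retry"
  summary: string
}

// Hypothetical helper: decide what a caller does next with a failed task.
function planRetry(
  diagnostic: BackgroundFailureDiagnostic,
  prompt: string,
): { retry: boolean; prompt?: string } {
  switch (diagnostic.retry_recommendation) {
    case "do_not_retry":
      return { retry: false }
    case "safe_to_retry":
      // Re-dispatch the same prompt unchanged.
      return { retry: true, prompt }
    case "retry_with_changes":
      // Assumption: "changes" is modelled here as trimming the prompt to an
      // arbitrary 2,000 characters; a real caller might narrow scope instead.
      return { retry: true, prompt: prompt.slice(0, 2_000) }
  }
}
```

Under this sketch a cancelled task is never retried, while a timeout on a short prompt is re-dispatched as is.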

package/src/features/persistent-state/run-finalizer.ts CHANGED

@@ -155,15 +155,21 @@ function isResolvedThemisDisposition(value: unknown): boolean {
   if (disposition?.status === "approved") {
     return disposition.verdict?.approved === true
   }
-  if (disposition?.status === "remediated") {
-    return disposition.verdict?.approved === false && hasText(disposition.notes)
-  }
   if (disposition?.status === "overridden") {
     return disposition.verdict?.approved === false && hasText(disposition.justification)
   }
   return false
 }
+function isRemediatedThemisDisposition(value: unknown): boolean {
+  const disposition = asRecord(value) as ThemisDisposition | null
+  return (
+    disposition?.status === "remediated" &&
+    disposition.verdict?.approved === false &&
+    hasText(disposition.notes)
+  )
+}
 function hasRejectedThemisVerdict(value: unknown): boolean {
   const verdict = asRecord(value) as ThemisVerdict | null
   return verdict?.approved === false
@@ -189,6 +195,16 @@ function collectThemisDispositionErrors(events: AuditEvent[]): string[] {
   if (hasResolvedDisposition) return []
+  const hasRemediatedDisposition = laterEvents.some((event) => {
+    if (event.type !== "tool.completed") return false
+    const payload = asRecord(event.payload)
+    return isRemediatedThemisDisposition(payload?.themisDisposition)
+  })
+  if (hasRemediatedDisposition) {
+    return ["remediated Themis disposition requires fresh approved Themis validation"]
+  }
   const hasUnresolvedRejection = laterEvents.some((event) => {
     if (event.type !== "tool.completed") return false
     const payload = asRecord(event.payload)

package/src/hooks/system-prompt-hook.ts CHANGED

@@ -41,14 +41,30 @@ function formatDuration(startTime: number, endTime?: number): string {
 }
 function buildToolLedgerLine(auditState: AuditState): string {
-  const taskDispatches = auditState.toolsExecuted.filter((tool) => tool.tool === "task").length
+  const taskTools = auditState.toolsExecuted.filter((tool) => tool.tool === "task")
+  const taskDispatches = taskTools.length
   const argusTools = auditState.toolsExecuted.filter((tool) => tool.tool !== "task").slice(-5)
   const entries = argusTools.map((tool) => {
     const status = tool.success ? "ok" : "failed"
     return `${tool.tool}=${status} findings=${tool.findingsCount} duration=${formatDuration(tool.startTime, tool.endTime)}`
   })
-  if (taskDispatches > 0) entries.push(`task dispatches=${taskDispatches}`)
+  if (taskDispatches > 0) {
+    const bySubagent = new Map<string, number>()
+    for (const tool of taskTools) {
+      const subagent = tool.subagent_type ?? "unknown"
+      bySubagent.set(subagent, (bySubagent.get(subagent) ?? 0) + 1)
+    }
+    const subagentSummary = [...bySubagent.entries()]
+      .sort(([a], [b]) => a.localeCompare(b))
+      .map(([subagent, count]) => `${subagent}=${count}`)
+      .join(" ")
+    entries.push(
+      subagentSummary.length > 0
+        ? `task dispatches=${taskDispatches} (${subagentSummary})`
+        : `task dispatches=${taskDispatches}`,
+    )
+  }
   return entries.length > 0 ? entries.join("; ") : "none"
 }
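
To make the new ledger format concrete, the per-subagent tally can be run on its own against sample data. This is a sketch rather than the hook itself: `TaskExecution` is a trimmed stand-in for `ToolExecution`, and `summarizeTaskDispatches` is a hypothetical wrapper around the same counting and sorting steps.

```typescript
// Trimmed stand-in for ToolExecution; only the fields used by the tally.
type TaskExecution = { tool: string; subagent_type?: string }

function summarizeTaskDispatches(tools: TaskExecution[]): string | undefined {
  const taskTools = tools.filter((t) => t.tool === "task")
  if (taskTools.length === 0) return undefined

  // Count dispatches per subagent, bucketing untyped ones as "unknown".
  const bySubagent = new Map<string, number>()
  for (const t of taskTools) {
    const key = t.subagent_type ?? "unknown"
    bySubagent.set(key, (bySubagent.get(key) ?? 0) + 1)
  }

  // Sort by subagent name so the ledger line is stable between turns.
  const summary = Array.from(bySubagent.entries())
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([name, count]) => `${name}=${count}`)
    .join(" ")
  return `task dispatches=${taskTools.length} (${summary})`
}

const ledgerLine = summarizeTaskDispatches([
  { tool: "task", subagent_type: "sentinel" },
  { tool: "task", subagent_type: "audit-specialist" },
  { tool: "task", subagent_type: "sentinel" },
  { tool: "task" },
  { tool: "argus_slither_analyze" },
])
// ledgerLine is "task dispatches=4 (audit-specialist=1 sentinel=2 unknown=1)"
```

Non-task tools are excluded from the count, as in the hook, where they are listed separately.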

package/src/managers/types.ts CHANGED

@@ -1,5 +1,20 @@
 import type { AuditState } from "../state/types"
+export type BackgroundTaskStatus = "queued" | "running" | "completed" | "failed" | "cancelled"
+export type BackgroundFailureDiagnostic = {
+  category: "model_error" | "tool_error" | "timeout" | "cancelled" | "unknown"
+  retry_recommendation: "safe_to_retry" | "retry_with_changes" | "do_not_retry"
+  summary: string
+}
+export type BackgroundTaskDiagnostic = {
+  status: BackgroundTaskStatus
+  result?: unknown
+  error?: unknown
+  diagnostic?: BackgroundFailureDiagnostic
+}
 /**
  * BackgroundManager interface
  * Handles dispatching and managing background agent tasks
@@ -27,6 +42,12 @@ export interface BackgroundManager {
    */
   getResult(taskId: string): Promise<unknown>
+  /**
+   * Get task status and structured diagnostics for completed, failed, queued, and cancelled tasks.
+   * Unknown task IDs resolve to undefined.
+   */
+  getTaskStatus(taskId: string): Promise<BackgroundTaskDiagnostic | undefined>
   /**
    * Register a callback to be invoked when a task completes
    * @param callback - Function called with (taskId, result) when task finishes

package/src/shared/lineage-validator.ts ADDED

@@ -0,0 +1,96 @@
+import type { CanonicalFinding } from "../state/schemas"
+import type { Finding } from "../state/types"
+export type LineageCountMismatch = {
+  check: string
+  observation_count?: number
+  observation_ids_length: number
+}
+export type FindingLineageResult = {
+  valid: boolean
+  raw_count: number
+  mapped_count: number
+  duplicate_observation_ids: string[]
+  phantom_observation_ids: string[]
+  missing_observation_ids: string[]
+  count_mismatches: LineageCountMismatch[]
+}
+type FindingLike = Pick<Finding, "check"> & {
+  id?: string
+  observation_id?: string
+  observation_ids?: unknown
+  observation_count?: unknown
+}
+function sorted(values: Iterable<string>): string[] {
+  return Array.from(new Set(values)).sort((a, b) => a.localeCompare(b))
+}
+function observationIds(value: FindingLike): string[] {
+  if (!Array.isArray(value.observation_ids)) return []
+  return value.observation_ids.filter((id): id is string => typeof id === "string" && id.length > 0)
+}
+function rawObservationIds(rawFindings: CanonicalFinding[]): string[] {
+  return rawFindings
+    .map((finding) => finding.observation_id)
+    .filter((id): id is string => typeof id === "string" && id.length > 0)
+}
+export function validateFindingLineage(
+  rawFindings: CanonicalFinding[],
+  dedupedFindings: FindingLike[],
+): FindingLineageResult {
+  const rawIds = new Set(rawObservationIds(rawFindings))
+  const mappedIds: string[] = []
+  const seen = new Set<string>()
+  const duplicates = new Set<string>()
+  const countMismatches: LineageCountMismatch[] = []
+  for (const finding of dedupedFindings) {
+    const ids = observationIds(finding)
+    const suppliedCount = finding.observation_count
+    if (ids.length === 0 || (suppliedCount != null && suppliedCount !== ids.length)) {
+      countMismatches.push({
+        check: finding.check || finding.id || "(unknown finding)",
+        observation_count: typeof suppliedCount === "number" ? suppliedCount : undefined,
+        observation_ids_length: ids.length,
+      })
+    }
+    for (const id of ids) {
+      mappedIds.push(id)
+      if (seen.has(id)) {
+        duplicates.add(id)
+      }
+      seen.add(id)
+    }
+  }
+  const mappedSet = new Set(mappedIds)
+  const phantom = mappedIds.filter((id) => !rawIds.has(id))
+  const missing = Array.from(rawIds).filter((id) => !mappedSet.has(id))
+  const duplicateIds = sorted(duplicates)
+  const phantomIds = sorted(phantom)
+  const missingIds = sorted(missing)
+  countMismatches.sort((a, b) => a.check.localeCompare(b.check))
+  return {
+    valid:
+      duplicateIds.length === 0 &&
+      phantomIds.length === 0 &&
+      missingIds.length === 0 &&
+      countMismatches.length === 0 &&
+      mappedIds.length === rawIds.size,
+    raw_count: rawIds.size,
+    mapped_count: mappedIds.length,
+    duplicate_observation_ids: duplicateIds,
+    phantom_observation_ids: phantomIds,
+    missing_observation_ids: missingIds,
+    count_mismatches: countMismatches,
+  }
+}
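
The three ID categories the validator reports (duplicate, phantom, missing) are easiest to follow on a small input. The sketch below is a reduced illustration of those set comparisons, not the package's `validateFindingLineage`. `classifyLineage` and the `obs-*` IDs are made up for the example.

```typescript
// Reduced illustration of the lineage categories: raw observation IDs on one
// side, the observation_ids arrays of each deduped finding on the other.
function classifyLineage(rawIds: string[], dedupedIdGroups: string[][]) {
  const raw = new Set(rawIds)
  const seen = new Set<string>()
  const duplicates = new Set<string>()

  for (const group of dedupedIdGroups) {
    for (const id of group) {
      // An ID mapped by more than one deduped finding is a duplicate.
      if (seen.has(id)) duplicates.add(id)
      seen.add(id)
    }
  }

  return {
    duplicate: Array.from(duplicates).sort(),
    // Phantom: referenced by a deduped finding but absent from the raw set.
    phantom: Array.from(seen).filter((id) => !raw.has(id)).sort(),
    // Missing: present in the raw set but not mapped by any deduped finding.
    missing: Array.from(raw).filter((id) => !seen.has(id)).sort(),
  }
}

const lineage = classifyLineage(
  ["obs-1", "obs-2", "obs-3"],
  [["obs-1", "obs-2"], ["obs-2", "obs-9"]],
)
// lineage.duplicate is ["obs-2"], lineage.phantom is ["obs-9"], lineage.missing is ["obs-3"]
```

The validator in the diff additionally records a count mismatch for any deduped finding whose `observation_ids` is empty or disagrees with a supplied `observation_count`, and it requires the mapped count to equal the raw count before reporting `valid: true`.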

package/src/shared/report-path-resolver.ts CHANGED

@@ -16,6 +16,8 @@ export interface ReportPathOptions {
   outputDir: string
   /** Optional run_id for run-scoped naming */
   runId?: string
+  /** Optional caller-supplied report revision. Base report is revision 1. */
+  revision?: number
 }
 export interface ResolvedReportPath {
@@ -46,7 +48,7 @@ export function sanitizeContractName(name: string): string {
 }
 export function resolveReportPath(options: ReportPathOptions): ResolvedReportPath {
-  const { contractName, date, outputDir, runId } = options
+  const { contractName, date, outputDir, runId, revision } = options
   if (!contractName || contractName.trim() === "") {
     throw new ReportPathError("contractName must not be empty")
@@ -54,12 +56,16 @@ export function resolveReportPath(options: ReportPathOptions): ResolvedReportPat
   if (!outputDir || outputDir.trim() === "") {
     throw new ReportPathError("outputDir must not be empty")
   }
+  if (revision != null && (!Number.isInteger(revision) || revision < 2)) {
+    throw new ReportPathError("revision must be an integer greater than or equal to 2")
+  }
   const resolvedDate = date ?? new Date()
   const dateStr = formatReportDate(resolvedDate)
   const sanitizedName = sanitizeContractName(contractName)
   const runIdSuffix = runId ? `-${runId.substring(0, 8)}` : ""
-  const filename = `${sanitizedName}-security-audit-${dateStr}${runIdSuffix}.md`
+  const revisionSuffix = revision == null ? "" : `-r${revision}`
+  const filename = `${sanitizedName}-security-audit-${dateStr}${runIdSuffix}${revisionSuffix}.md`
   const filePath = join(outputDir, filename)
   const canonicalId = runId ?? filename
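
The filename logic with the new revision suffix can be exercised separately from the resolver. In the sketch below `reportFilename` is a hypothetical helper that takes an already-sanitized name and a preformatted date string, because `sanitizeContractName` and `formatReportDate` are not shown in this diff. The date format and run ID used are placeholders.

```typescript
// Hypothetical helper mirroring the filename construction in the diff.
// `name` and `dateStr` are assumed to be sanitized and formatted already.
function reportFilename(
  name: string,
  dateStr: string,
  runId?: string,
  revision?: number,
): string {
  // The base report is revision 1, so an explicit revision starts at 2.
  if (revision != null && (!Number.isInteger(revision) || revision < 2)) {
    throw new Error("revision must be an integer greater than or equal to 2")
  }
  const runIdSuffix = runId ? `-${runId.substring(0, 8)}` : ""
  const revisionSuffix = revision == null ? "" : `-r${revision}`
  return `${name}-security-audit-${dateStr}${runIdSuffix}${revisionSuffix}.md`
}

const baseReport = reportFilename("vault", "2025-01-15", "abcdef1234567890")
// baseReport is "vault-security-audit-2025-01-15-abcdef12.md"
const secondRevision = reportFilename("vault", "2025-01-15", "abcdef1234567890", 2)
// secondRevision is "vault-security-audit-2025-01-15-abcdef12-r2.md"
```

Passing `revision: 1` throws, so the unsuffixed filename stays reserved for the base report.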

package/src/state/types.ts CHANGED

@@ -95,6 +95,7 @@ export interface ToolExecution {
   success: boolean
   findingsCount: number
   findingCounts?: FindingCounts
+  subagent_type?: string
 }
 export interface FindingCounts {

package/src/tools/forge-coverage-tool.ts CHANGED

@@ -40,6 +40,8 @@ type ForgeCoverageResult = {
   report: ForgeCoverageReport
   executionTime: number
   error?: string
+  hint?: string
+  suggested_command?: string
 }
 export type ForgeCommandRunner = (
@@ -73,6 +75,36 @@ function isStackTooDeep(stderr: string): boolean {
   return /stack too deep/i.test(stderr)
 }
+function classifyCoverageFailure(
+  stderr: string,
+  args: NormalizedForgeCoverageArgs,
+): Pick<ForgeCoverageResult, "hint" | "suggested_command"> | undefined {
+  if (
+    !/(optimizerSteps|unsupported optimizer|config parse|failed to parse|instrumentation)/i.test(
+      stderr,
+    )
+  ) {
+    return undefined
+  }
+  const command = buildCoverageCommand({ ...args, ir_minimum: true }).join(" ")
+  return {
+    hint:
+      `Forge coverage failed for ${args.target} while parsing or instrumenting project configuration. ` +
+      "If foundry.toml uses optimizerSteps or unsupported optimizer settings, run a scoped coverage command or temporarily adjust coverage-only config manually; Argus will not edit foundry.toml.",
+    suggested_command: command,
+  }
+}
+function shouldRetryWithIrMinimum(stderr: string): boolean {
+  return (
+    isStackTooDeep(stderr) ||
+    /(optimizerSteps|unsupported optimizer|config parse|failed to parse|instrumentation)/i.test(
+      stderr,
+    )
+  )
+}
 function parsePercent(input: string): number {
   const match = input.match(/(\d+(?:\.\d+)?)%/)
   if (!match?.[1]) {
@@ -165,11 +197,15 @@ export async function executeForgeCoverage(
   const normalizedArgs = normalizeArgs(args, context)
   context.metadata({ title: `Run forge coverage: ${normalizedArgs.target}` })
-  const fail = (error: string): ForgeCoverageResult => ({
+  const fail = (
+    error: string,
+    diagnostics?: Pick<ForgeCoverageResult, "hint" | "suggested_command">,
+  ): ForgeCoverageResult => ({
     success: false,
     report: { files: [], summary: { ...EMPTY_SUMMARY } },
     executionTime: Date.now() - startedAt,
     error,
+    ...diagnostics,
   })
   try {
@@ -181,7 +217,7 @@ export async function executeForgeCoverage(
     if (
       runResult.exitCode !== 0 &&
       !normalizedArgs.ir_minimum &&
-      isStackTooDeep(runResult.stderr)
+      shouldRetryWithIrMinimum(runResult.stderr)
     ) {
       runResult = await runCommand(buildCoverageCommand(normalizedArgs, true), {
         signal: context.abort,
@@ -190,9 +226,9 @@ export async function executeForgeCoverage(
     }
     if (runResult.exitCode !== 0) {
-      return fail(
-        runResult.stderr.trim() || `forge coverage exited with code ${runResult.exitCode}`,
-      )
+      const error =
+        runResult.stderr.trim() || `forge coverage exited with code ${runResult.exitCode}`
+      return fail(error, classifyCoverageFailure(error, normalizedArgs))
     }
     let report: ForgeCoverageReport
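
As a rough illustration of the broadened retry condition in this file, the sketch below reproduces the added stderr pattern and runs it against a few sample strings. The `isStackTooDeep` body is a hypothetical stand-in, since the real helper sits outside the hunks shown here, and the sample messages are invented for illustration rather than captured from forge.

```typescript
// Hypothetical stand-in: the real isStackTooDeep lives outside the hunks shown above.
function isStackTooDeep(stderr: string): boolean {
  return /stack too deep/i.test(stderr)
}

// Pattern copied from the added shouldRetryWithIrMinimum in this diff.
function shouldRetryWithIrMinimum(stderr: string): boolean {
  return (
    isStackTooDeep(stderr) ||
    /(optimizerSteps|unsupported optimizer|config parse|failed to parse|instrumentation)/i.test(
      stderr,
    )
  )
}

// Invented stderr samples, not real forge output.
const coverageSamples: string[] = [
  "CompilerError: Stack too deep.",
  "Error: failed to parse foundry.toml",
  "unknown field optimizerSteps",
  "Test suite failed: 2 assertions did not hold",
]

for (const stderr of coverageSamples) {
  const decision = shouldRetryWithIrMinimum(stderr) ? "retry" : "no retry"
  console.log(`${decision}: ${stderr}`)
}
```

Under these assumptions, only the last sample (an ordinary test failure) would skip the `ir_minimum` retry. The first three match either the stand-in stack check or the new configuration pattern.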

package/src/tools/persist-deduped-tool.ts CHANGED

@@ -1,7 +1,8 @@
-import { mkdir, writeFile } from "node:fs/promises"
+import { mkdir, readFile, writeFile } from "node:fs/promises"
 import { dirname } from "node:path"
 import { type ToolContext, tool } from "@opencode-ai/plugin"
 import { createAuditArtifactResolver } from "../shared/audit-artifact-resolver"
+import { validateFindingLineage } from "../shared/lineage-validator"
 import { createLogger } from "../shared/logger"
 import { resolveProjectDir } from "../shared/project-utils"
 import { isNonEmptyString } from "../shared/type-guards"
@@ -22,6 +23,28 @@ export interface DedupedFindingsArtifact {
   findings: CanonicalFinding[]
 }
+async function loadRawFindings(
+  runId: string,
+  projectDir: string,
+): Promise<CanonicalFinding[] | null> {
+  const findingsFile = createAuditArtifactResolver(runId, projectDir).paths().findingsFile
+  try {
+    const parsed = JSON.parse(await readFile(findingsFile, "utf8"))
+    if (!parsed || !Array.isArray(parsed.findings)) return null
+    return parsed.findings
+  } catch {
+    return null
+  }
+}
+function missingRawFindings(runId: string): string {
+  return JSON.stringify({
+    success: false,
+    error: "MissingRawFindingsError",
+    message: `Cannot verify deduped lineage because .argus/runs/${runId}/findings.json is missing or invalid`,
+  })
+}
 export async function executePersistDeduped(
   args: PersistDedupedArgs,
   context: ToolContext,
@@ -55,6 +78,27 @@ export async function executePersistDeduped(
   const projectDir = resolveProjectDir(context)
   const resolver = createAuditArtifactResolver(args.run_id, projectDir)
   const dedupedPath = resolver.paths().dedupedFindingsFile
+  const rawFindings = await loadRawFindings(args.run_id, projectDir)
+  if (!rawFindings) {
+    return missingRawFindings(args.run_id)
+  }
+  const lineage = validateFindingLineage(rawFindings, findings)
+  if (!lineage.valid) {
+    return JSON.stringify({
+      success: false,
+      error: "LineageError",
+      lineage: {
+        raw_count: lineage.raw_count,
+        mapped_count: lineage.mapped_count,
+        duplicate_observation_ids: lineage.duplicate_observation_ids,
+        phantom_observation_ids: lineage.phantom_observation_ids,
+        missing_observation_ids: lineage.missing_observation_ids,
+        count_mismatches: lineage.count_mismatches,
+      },
+    })
+  }
   const artifact: DedupedFindingsArtifact = {
     run_id: args.run_id,

package/src/tools/read-findings-tool.ts CHANGED

@@ -12,6 +12,8 @@ import type { AuditState } from "../state/types"
 type ReadFindingsArgs = {
   run_id: string
+  findings_offset?: number
+  findings_limit?: number
 }
 type ReportFinding = Omit<
@@ -42,6 +44,11 @@ type CompactReportInput = Omit<
   run_id: string
   findings: ReportFinding[]
   toolsExecuted: ReportToolExecution[]
+  findingsPage?: {
+    offset: number
+    limit: number
+    total: number
+  }
 }
 type ReadFindingsInlineResult = {
@@ -98,13 +105,31 @@ function stripInternalKeys(obj: object, keysToStrip: ReadonlySet<string>): Recor
   return result
 }
-function buildCompactInput(reportInput: ReportInput): CompactReportInput {
+function normalizePageArgs(args: ReadFindingsArgs): { offset: number; limit: number } | null {
+  if (args.findings_offset == null && args.findings_limit == null) return null
+  const offset = args.findings_offset ?? 0
+  const limit = args.findings_limit ?? 50
+  if (!Number.isInteger(offset) || offset < 0) {
+    throw new Error("findings_offset must be a non-negative integer")
+  }
+  if (!Number.isInteger(limit) || limit < 1 || limit > 500) {
+    throw new Error("findings_limit must be an integer between 1 and 500")
+  }
+  return { offset, limit }
+}
+function buildCompactInput(
+  reportInput: ReportInput,
+  page: { offset: number; limit: number } | null = null,
+): CompactReportInput {
+  const rawFindings = page
+    ? reportInput.findings.slice(page.offset, page.offset + page.limit)
+    : reportInput.findings
   return {
     run_id: reportInput.run_id,
     projectDir: reportInput.projectDir,
-    findings: reportInput.findings.map(
-      (f) => stripInternalKeys(f, FINDING_INTERNAL_KEYS) as ReportFinding,
-    ),
+    findings: rawFindings.map((f) => stripInternalKeys(f, FINDING_INTERNAL_KEYS) as ReportFinding),
     toolsExecuted: reportInput.toolsExecuted.map(
       (t) => stripInternalKeys(t, TOOL_EXECUTION_INTERNAL_KEYS) as ReportToolExecution,
     ),
@@ -118,6 +143,13 @@ function buildCompactInput(reportInput: ReportInput): CompactReportInput {
     ...(reportInput.proxyContracts && { proxyContracts: reportInput.proxyContracts }),
     ...(reportInput.patternVersion && { patternVersion: reportInput.patternVersion }),
     ...(reportInput.skillsLoaded && { skillsLoaded: reportInput.skillsLoaded }),
+    ...(page && {
+      findingsPage: {
+        offset: page.offset,
+        limit: page.limit,
+        total: reportInput.findings.length,
+      },
+    }),
   }
 }
@@ -339,7 +371,8 @@ export async function executeReadFindings(
   const projectDir = resolveProjectDir(context)
   const reportInput = readAuditStateAsReportInput(projectDir, runId)
-  const compactInput = buildCompactInput(reportInput)
+  const page = normalizePageArgs(args)
+  const compactInput = buildCompactInput(reportInput, page)
   const inlineJson = JSON.stringify({
     success: true,
@@ -383,6 +416,14 @@ export const readFindingsTool = tool({
     "Read the materialized ReportInput artifact from disk for a given run. Returns the canonical findings, tools executed, scope, and all enrichment data. Scribe should call this before generating the report.",
   args: {
     run_id: tool.schema.string().describe("The run ID to read findings for."),
+    findings_offset: tool.schema
+      .number()
+      .optional()
+      .describe("Optional zero-based finding offset for paged inline retrieval."),
+    findings_limit: tool.schema
+      .number()
+      .optional()
+      .describe("Optional finding page size for inline retrieval (1-500)."),
   },
   async execute(args, context) {
     return executeReadFindings(args, context)
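
The paging behaviour added to `read-findings-tool.ts` can be sketched in isolation as follows. The validation logic is copied from the diff. `PageArgs`, `pageFindings`, and the sample data are hypothetical names introduced only for this example, and the real tool returns a larger `CompactReportInput` object.

```typescript
// Hypothetical reduced argument type; the real ReadFindingsArgs also carries run_id.
type PageArgs = { findings_offset?: number; findings_limit?: number }
type Page = { offset: number; limit: number }

// Validation copied from the added normalizePageArgs in this diff.
function normalizePageArgs(args: PageArgs): Page | null {
  if (args.findings_offset == null && args.findings_limit == null) return null
  const offset = args.findings_offset ?? 0
  const limit = args.findings_limit ?? 50
  if (!Number.isInteger(offset) || offset < 0) {
    throw new Error("findings_offset must be a non-negative integer")
  }
  if (!Number.isInteger(limit) || limit < 1 || limit > 500) {
    throw new Error("findings_limit must be an integer between 1 and 500")
  }
  return { offset, limit }
}

// Hypothetical helper mirroring the slice and findingsPage metadata in buildCompactInput.
function pageFindings<T>(findings: T[], page: Page | null) {
  const items = page ? findings.slice(page.offset, page.offset + page.limit) : findings
  return {
    findings: items,
    ...(page && { findingsPage: { ...page, total: findings.length } }),
  }
}

// 120 placeholder findings; request the page starting at offset 100.
const allFindings = Array.from({ length: 120 }, (_, i) => `finding-${i}`)
const lastPage = pageFindings(allFindings, normalizePageArgs({ findings_offset: 100 }))
console.log(lastPage.findings.length, lastPage.findingsPage)
```

Omitting both arguments keeps the pre-0.6.1 behaviour of returning every finding with no `findingsPage` field. Supplying only an offset falls back to the default limit of 50.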

package/src/tools/record-finding-tool.ts CHANGED

@@ -1,7 +1,7 @@
 import { type ToolContext, tool } from "@opencode-ai/plugin"
 import { isNonEmptyString } from "../shared/type-guards"
 import { normalizeToCanonicalFinding } from "../state/adapters"
-import { SCHEMA_VERSION } from "../state/schemas"
+import { type CanonicalFinding, SCHEMA_VERSION } from "../state/schemas"
 import type { ArgusAgentName } from "../state/types"
 type RecordFindingArgs = {
@@ -12,20 +12,7 @@ type RecordFindingArgs = {
 type RecordFindingResponse = {
   success: boolean
   count: number
-  findings: Array<{
-    id: string
-    check: string
-    severity: string
-    confidence: string
-    file: string
-    description: string
-    lines: [number, number]
-    source: string
-    reported_by_agent: string
-    impact?: string
-    recommendation?: string
-    proofOfConcept?: string
-  }>
+  findings: CanonicalFinding[]
   schema_version: string
   note: string
   enrichment_warnings?: string[]
@@ -202,20 +189,7 @@ export async function executeRecordFinding(
   const response: RecordFindingResponse = {
     success: true,
     count: findings.length,
-    findings: findings.map((f) => ({
-      id: f.id,
-      check: f.check,
-      severity: f.severity,
-      confidence: f.confidence,
-      file: f.file,
-      description: f.description,
-      lines: f.lines,
-      source: f.source,
-      reported_by_agent: f.reported_by_agent,
-      ...(f.impact !== undefined ? { impact: f.impact } : {}),
-      ...(f.recommendation !== undefined ? { recommendation: f.recommendation } : {}),
-      ...(f.proofOfConcept !== undefined ? { proofOfConcept: f.proofOfConcept } : {}),
-    })),
+    findings,
     schema_version: SCHEMA_VERSION,
     note: "Findings recorded to event journal. The system assigns the canonical run_id automatically — use the run_id from <argus-context> for Scribe dispatch.",
     ...(enrichmentWarnings.length > 0

package/src/tools/report-generator-tool.ts CHANGED

@@ -9,6 +9,7 @@ import { createAuditArtifactResolver } from "../shared/audit-artifact-resolver"
 import type { DropDiagnostic } from "../shared/drop-diagnostics"
 import { createDropDiagnosticsCollector } from "../shared/drop-diagnostics"
 import { computeMissingKeyTools } from "../shared/key-tools"
+import { validateFindingLineage } from "../shared/lineage-validator"
 import { createLogger } from "../shared/logger"
 import { resolveProjectDir } from "../shared/project-utils"
 import { resolveReportPath } from "../shared/report-path-resolver"
@@ -38,6 +39,8 @@ type ReportGeneratorArgs = {
   preflight_policy?: PreflightPolicy
   tool_coverage_policy?: ToolCoveragePolicy
   run_id?: string
+  revision?: number
+  force?: boolean
 }
 type FindingsCount = {
@@ -131,6 +134,30 @@ async function checkDuplicateWrite(
   return null
 }
+async function checkSafeForceOverwrite(
+  filePath: string,
+  runId: string,
+): Promise<{ code: string; message: string } | null> {
+  if (!existsSync(filePath)) return null
+  try {
+    const existingContent = await Bun.file(filePath).text()
+    const existingRunId = extractReportRunId(existingContent)
+    if (existingRunId === runId) return null
+    return {
+      code: "INSECURE_OVERWRITE_REFUSED",
+      message:
+        existingRunId == null
+          ? `Refusing to force overwrite ${filePath}: existing file has no Argus report metadata.`
+          : `Refusing to force overwrite ${filePath}: existing report belongs to run_id "${existingRunId}", not "${runId}".`,
+    }
+  } catch (err) {
+    return {
+      code: "INSECURE_OVERWRITE_REFUSED",
+      message: `Refusing to force overwrite ${filePath}: existing file could not be read (${err instanceof Error ? err.message : String(err)}).`,
+    }
+  }
+}
 const SEVERITY_ORDER: FindingSeverity[] = ["Critical", "High", "Medium", "Low", "Informational"]
 const SEVERITY_PREFIX: Record<FindingSeverity, string> = {
@@ -767,6 +794,23 @@ function shouldIncludeFinding(finding: Finding, threshold: SeverityThreshold): b
   return FINDING_WEIGHT[finding.severity] >= THRESHOLD_WEIGHT[threshold]
 }
+function normalizeScopePath(value: string): string {
+  return value.replace(/^\.\//, "").replace(/\/+$|\\+$/g, "")
+}
+function isFindingInScope(finding: Finding, scope: string[]): boolean {
+  if (scope.length === 0) return true
+  const file = normalizeScopePath(finding.file)
+  return scope.some((entry) => {
+    const scoped = normalizeScopePath(entry)
+    return file === scoped || file.startsWith(`${scoped}/`)
+  })
+}
+function collectOutOfScopeFindings(findings: Finding[], scope: string[]): Finding[] {
+  return findings.filter((finding) => !isFindingInScope(finding, scope))
+}
 function calculateCounts(findings: Finding[]): FindingsCount {
   const counts = emptyCounts()
@@ -877,31 +921,6 @@ function hasDedupLineage(findings: Finding[]): boolean {
   })
 }
-function observationIdsForFinding(finding: Finding): string[] {
-  const observationIds = (finding as { observation_ids?: unknown }).observation_ids
-  if (Array.isArray(observationIds)) {
-    return observationIds.filter((id): id is string => typeof id === "string" && id.length > 0)
-  }
-  return typeof finding.observation_id === "string" && finding.observation_id.length > 0
-    ? [finding.observation_id]
-    : []
-}
-function compareObservationLineage(
-  eventFindings: Finding[],
-  reportFindings: Finding[],
-): { missing: string[]; extra: string[]; matches: boolean } {
-  const expected = new Set(eventFindings.flatMap(observationIdsForFinding))
-  const actual = new Set(reportFindings.flatMap(observationIdsForFinding))
-  const missing = Array.from(expected)
-    .filter((id) => !actual.has(id))
-    .sort((a, b) => a.localeCompare(b))
-  const extra = Array.from(actual)
-    .filter((id) => !expected.has(id))
-    .sort((a, b) => a.localeCompare(b))
-  return { missing, extra, matches: missing.length === 0 && extra.length === 0 }
-}
 export function validateReportQuality(
   findings: Finding[],
   policy: QualityGatePolicy,
@@ -1211,6 +1230,18 @@ export async function executeReportGeneration(
   const qualityGatePolicy = args.quality_gate_policy ?? "warn"
   const toolCoveragePolicy = args.tool_coverage_policy ?? "enforce"
   const expectedRunId = resolveExpectedRunId(args, context, deps)
+  const invalidRegenerationOptions =
+    args.force === true && args.revision != null
+      ? {
+          code: "INVALID_REGENERATION_OPTIONS",
+          message: "force and revision must not both be set.",
+        }
+      : args.revision != null && (!Number.isInteger(args.revision) || args.revision < 2)
+        ? {
+            code: "INVALID_REGENERATION_OPTIONS",
+            message: "revision must be an integer greater than or equal to 2.",
+          }
+        : null
   // Ensure report-input.json is materialized before attempting disk lookup.
   // Scribe may call generate_report without calling read_findings first,
@@ -1235,6 +1266,18 @@ export async function executeReportGeneration(
   const preflightPolicy = args.preflight_policy ?? "warn"
   let preflightWarningSection: string | null = null
   const warningBullets: string[] = []
+  const state = reportInputToAuditState(reportInput)
+  const scope = args.scope.length > 0 ? args.scope : reportInput.scope
+  const finalFindings = dedupeFindingsForFinalOutput(reportInput.findings)
+  const outOfScopeFindings = collectOutOfScopeFindings(finalFindings, scope)
+  if (outOfScopeFindings.length > 0) {
+    const locations = outOfScopeFindings.map(formatLocation).join(", ")
+    const message = `findings outside audited scope: ${locations}`
+    if (preflightPolicy === "strict-fail") {
+      throw new Error(`Preflight failed (strict-fail): ${message}`)
+    }
+    warningBullets.push(`- ${message}`)
+  }
   // Hard gate: refuse to generate a report if key audit tools have not been executed
   if (toolCoveragePolicy !== "skip") {
@@ -1285,11 +1328,24 @@ export async function executeReportGeneration(
     const inputFindings = dedupeFindingsForFinalOutput(reportInput.findings)
     const hasLineage = hasDedupLineage(reportInput.findings)
     const shouldCheckParity = eventFindings.length === inputFindings.length || hasLineage
+    const lineage = hasLineage
+      ? validateFindingLineage(projectFindings(events), reportInput.findings)
+      : null
     const parity = shouldCheckParity
-      ? hasLineage
-        ? compareObservationLineage(projectFindings(events), reportInput.findings)
-        : compareIssueFingerprintSets(eventFindings, inputFindings)
-      : { missing: [], extra: [], matches: true }
+      ? lineage
+        ? {
+            missing: lineage.missing_observation_ids,
+            extra: lineage.phantom_observation_ids,
+            duplicates: lineage.duplicate_observation_ids,
+            countMismatches: lineage.count_mismatches,
+            matches: lineage.valid,
+          }
+        : {
+            ...compareIssueFingerprintSets(eventFindings, inputFindings),
+            duplicates: [],
+            countMismatches: [],
+          }
+      : { missing: [], extra: [], duplicates: [], countMismatches: [], matches: true }
     if (!shouldCheckParity) {
       const unverifiableSummary = `event_findings=${eventFindings.length}, report_findings=${inputFindings.length}`
@@ -1320,6 +1376,14 @@ export async function executeReportGeneration(
       if (parity.extra.length > 0) {
         warningBullets.push(`- Extra ${parityLabel}: ${parity.extra.join(", ")}`)
       }
+      if (parity.duplicates.length > 0) {
+        warningBullets.push(`- Duplicate ${parityLabel}: ${parity.duplicates.join(", ")}`)
+      }
+      if (parity.countMismatches.length > 0) {
+        warningBullets.push(
+          `- Observation count mismatches: ${parity.countMismatches.map((item) => item.check).join(", ")}`,
+        )
+      }
     }
   } catch (err) {
     if (err instanceof Error && err.message.startsWith("Preflight failed (strict-fail)")) {
@@ -1341,9 +1405,6 @@ export async function executeReportGeneration(
     ].join("\n")
   }
-  const state = reportInputToAuditState(reportInput)
-  const scope = args.scope.length > 0 ? args.scope : reportInput.scope
-  const finalFindings = dedupeFindingsForFinalOutput(reportInput.findings)
   const findings = sortFindingsDeterministically(
     finalFindings.filter((finding) => shouldIncludeFinding(finding, threshold)),
   )
@@ -1442,6 +1503,7 @@ export async function executeReportGeneration(
     date: new Date(auditDate),
     outputDir: ".opencode/reports/",
     runId: runId || undefined,
+    revision: args.revision,
   })
   const result: ReportGenerationResult = {
@@ -1454,6 +1516,11 @@ export async function executeReportGeneration(
     contractDiagnostics: diagnostics,
   }
+  if (invalidRegenerationOptions) {
+    result.error = invalidRegenerationOptions
+    return result
+  }
   try {
     const loadConfig = deps.loadConfig ?? loadArgusConfig
     const projectDir = resolveProjectDir(context)
@@ -1468,14 +1535,28 @@ export async function executeReportGeneration(
       }
       return result
     }
-    const fullPath = path.join(resolvedOutput, canonicalFilename)
+    const { filePath: fullPath } = resolveReportPath({
+      contractName: args.project_name,
+      date: new Date(auditDate),
+      outputDir: resolvedOutput,
+      runId: runId || undefined,
+      revision: args.revision,
+    })
     // Single-writer policy: check for duplicate writes with same run_id
     if (runId) {
-      const duplicateError = await checkDuplicateWrite(fullPath, runId)
-      if (duplicateError) {
-        result.error = duplicateError
-        return result
+      if (args.force === true) {
+        const forceError = await checkSafeForceOverwrite(fullPath, runId)
+        if (forceError) {
+          result.error = forceError
+          return result
+        }
+      } else {
+        const duplicateError = await checkDuplicateWrite(fullPath, runId)
+        if (duplicateError) {
+          result.error = duplicateError
+          return result
+        }
       }
     }
@@ -1505,6 +1586,10 @@ export const reportGeneratorTool = tool({
       .enum(["critical", "high", "medium", "low", "informational"])
       .default("informational"),
     preflight_policy: tool.schema.enum(["warn", "strict-fail"]).optional(),
+    quality_gate_policy: tool.schema
+      .enum(["warn", "strict-fail"])
+      .optional()
+      .describe("Controls whether report quality gate violations warn or fail generation."),
     tool_coverage_policy: tool.schema
       .enum(["enforce", "warn", "skip"])
       .optional()
@@ -1518,6 +1603,18 @@ export const reportGeneratorTool = tool({
       .describe(
         "The canonical run ID from <argus-context>. The tool reads the materialized report-input.json from disk using this ID.",
       ),
+    revision: tool.schema
+      .number()
+      .optional()
+      .describe(
+        "Caller-supplied report revision. Must be an integer >= 2 and writes a -r{revision} file.",
+      ),
+    force: tool.schema
+      .boolean()
+      .optional()
+      .describe(
+        "Overwrite only the base canonical report path when existing Argus metadata matches the same run_id.",
+      ),
   },
   async execute(args, context) {
     const result = await executeReportGeneration(args, context)
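
To make the new out-of-scope preflight check in `report-generator-tool.ts` easier to follow, here is a minimal standalone sketch of the path matching it relies on. The two functions are copied from the diff. `ScopedFinding` is a hypothetical reduced type standing in for the real `Finding`, and the file paths are invented.

```typescript
// Hypothetical reduced type: the real Finding has many more fields.
type ScopedFinding = { file: string }

// Strips a leading "./" and any trailing slashes or backslashes (copied from the diff).
function normalizeScopePath(value: string): string {
  return value.replace(/^\.\//, "").replace(/\/+$|\\+$/g, "")
}

// A finding is in scope when its file equals a scope entry or sits beneath it (copied from the diff).
function isFindingInScope(finding: ScopedFinding, scope: string[]): boolean {
  if (scope.length === 0) return true
  const file = normalizeScopePath(finding.file)
  return scope.some((entry) => {
    const scoped = normalizeScopePath(entry)
    return file === scoped || file.startsWith(`${scoped}/`)
  })
}

// Invented example paths.
const scopeCandidates: ScopedFinding[] = [
  { file: "./src/Vault.sol" },
  { file: "src-old/Vault.sol" },
  { file: "lib/forge-std/Test.sol" },
]

for (const finding of scopeCandidates) {
  console.log(finding.file, isFindingInScope(finding, ["src/"]))
}
```

Matching on `${scoped}/` rather than a bare prefix means a sibling directory such as `src-old` does not count as part of `src`. An empty scope list treats every finding as in scope.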

package/src/tools/slither-tool.ts CHANGED

@@ -1,5 +1,13 @@
 import { createHash } from "node:crypto"
-import { existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs"
+import {
+  existsSync,
+  mkdtempSync,
+  readdirSync,
+  readFileSync,
+  rmSync,
+  statSync,
+  writeFileSync,
+} from "node:fs"
 import { tmpdir } from "node:os"
 import { dirname, isAbsolute, join, resolve } from "node:path"
 import { type ToolContext, tool } from "@opencode-ai/plugin"
@@ -63,6 +71,8 @@ export type SlitherAnalyzeResult = {
   executionTime: number
   errors: string[]
   error?: string
+  hint?: string
+  suggested_command?: string
 }
 function mapSeverity(impact?: string): FindingSeverity {
@@ -151,6 +161,50 @@ function shouldTryFlattenFallback(errors: string[], stderr: string): boolean {
   return FALLBACK_TRIGGERS.some((trigger) => combined.includes(trigger))
 }
+function isMixedPragmaSlitherFailure(errors: string[], stderr: string): boolean {
+  const combined = [...errors, stderr].join(" ")
+  return (
+    /(CryticCompileError|Slither exited with code 1)/i.test(combined) &&
+    /(solc|pragma|requires different compiler version|different compiler version|compiler version)/i.test(
+      combined,
+    )
+  )
+}
+function containsSolidityFile(dir: string): boolean {
+  try {
+    for (const entry of readdirSync(dir)) {
+      const fullPath = join(dir, entry)
+      const stat = statSync(fullPath)
+      if (stat.isFile() && entry.endsWith(".sol")) return true
+      if (stat.isDirectory() && containsSolidityFile(fullPath)) return true
+    }
+  } catch {
+    return false
+  }
+  return false
+}
+function mixedPragmaDiagnostics(
+  args: SlitherArgs,
+  projectDir: string,
+  errors: string[],
+  stderr: string,
+): Pick<SlitherAnalyzeResult, "hint" | "suggested_command"> | undefined {
+  if (!isMixedPragmaSlitherFailure(errors, stderr)) return undefined
+  const target = resolve(projectDir, args.target)
+  const srcCandidate = join(target, "src")
+  const suggestion =
+    existsSync(srcCandidate) && containsSolidityFile(srcCandidate) ? srcCandidate : undefined
+  return {
+    hint: "Try narrowing target to a single-pragma subdirectory and check foundry.toml/remappings for mixed compiler or vendored dependency scope issues.",
+    suggested_command: suggestion
+      ? buildCommand({ ...args, target: suggestion }).join(" ")
+      : undefined,
+  }
+}
 const parseSolcVersion = parseSolcVersionShared
 const extractContractNames = extractContractNamesShared
 const hasBinary = hasBinaryShared
@@ -488,7 +542,8 @@ export async function executeSlitherAnalyze(
       payload = JSON.parse(runResult.stdout) as SlitherPayload
     } catch (error) {
       const message = error instanceof Error ? error.message : "Unknown parse error"
-      if (args.via_ir || shouldTryFlattenFallback(errors, runResult.stderr)) {
+      const diagnostics = mixedPragmaDiagnostics(args, projectDir, errors, runResult.stderr)
+      if (!diagnostics && (args.via_ir || shouldTryFlattenFallback(errors, runResult.stderr))) {
         const fallbackResult = await flattenFallback(args, context, {
           ...getDefaultFlattenDeps(),
           runCommand,
@@ -503,6 +558,7 @@ export async function executeSlitherAnalyze(
         executionTime: Date.now() - startedAt,
         errors,
         error: `Slither output parse error: ${message}`,
+        ...diagnostics,
       }
     }
@@ -513,9 +569,12 @@ export async function executeSlitherAnalyze(
     const findings = parseFindings(payload)
     const success = findings.length > 0 || (runResult.exitCode === 0 && payload.success !== false)
+    const diagnostics = mixedPragmaDiagnostics(args, projectDir, errors, runResult.stderr)
     if (
       !success &&
       findings.length === 0 &&
+      !diagnostics &&
       (args.via_ir || shouldTryFlattenFallback(errors, runResult.stderr))
     ) {
       const fallbackResult = await flattenFallback(args, context, {
@@ -532,6 +591,7 @@ export async function executeSlitherAnalyze(
       findings,
       executionTime: Date.now() - startedAt,
       errors,
+      ...diagnostics,
     }
   } catch (error) {
     const message = error instanceof Error ? error.message : "Unknown error"
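
The mixed-pragma classifier added to `slither-tool.ts` requires two signals at once: a compile-style failure and a mention of the compiler or pragma. The sketch below copies that predicate from the diff and tries it on a few sample messages. The sample error strings are invented for illustration and are not real Slither output.

```typescript
// Predicate copied from the added isMixedPragmaSlitherFailure in this diff.
function isMixedPragmaSlitherFailure(errors: string[], stderr: string): boolean {
  const combined = [...errors, stderr].join(" ")
  return (
    /(CryticCompileError|Slither exited with code 1)/i.test(combined) &&
    /(solc|pragma|requires different compiler version|different compiler version|compiler version)/i.test(
      combined,
    )
  )
}

// Invented samples: [errors, stderr] pairs.
const slitherSamples: Array<[string[], string]> = [
  [["CryticCompileError: compilation failed"], "Source file requires different compiler version"],
  [[], "Slither exited with code 1: invalid option"],
  [[], "Source file requires different compiler version"],
]

for (const [errors, stderr] of slitherSamples) {
  console.log(isMixedPragmaSlitherFailure(errors, stderr), errors, stderr)
}
```

Only the first sample carries both signals. In the diff, a positive classification attaches `hint` and `suggested_command` to the result and skips the flatten fallback.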