solidity-argus 0.5.7 → 0.5.8
- package/AGENTS.md +6 -6
- package/README.md +10 -10
- package/package.json +1 -1
- package/src/agents/argus-prompt.ts +3 -3
- package/src/agents/pythia-prompt.ts +3 -2
- package/src/agents/sentinel-prompt.ts +1 -1
- package/src/agents/themis-prompt.ts +1 -1
- package/src/cli/commands/doctor.ts +155 -1
- package/src/cli/commands/install.ts +5 -2
- package/src/constants/defaults.ts +5 -5
- package/src/create-hooks.ts +0 -1
- package/src/features/persistent-state/findings-materializer.ts +1 -4
- package/src/features/persistent-state/run-finalizer.ts +63 -7
- package/src/hooks/config-handler.ts +1 -1
- package/src/tools/persist-deduped-tool.ts +1 -1
- package/src/tools/record-finding-tool.ts +26 -9
- package/src/tools/report-generator-tool.ts +28 -9
package/AGENTS.md
CHANGED
|
@@ -12,33 +12,33 @@ CLI: `argus doctor`, `argus init`, `argus install`.
|
|
|
12
12
|
|
|
13
13
|
**Role**: Primary security audit orchestrator
|
|
14
14
|
**Description**: Argus Panoptes, the All-Seeing Guardian. Coordinates full Solidity security audits by dispatching Sentinel (analysis), Pythia (research), Scribe (reporting), and Themis (validation). Follows a rigorous 7-step methodology: Reconnaissance, Automated Scanning, Manual Review, Attack Surface Mapping, Vulnerability Research, Testing & Verification, and Reporting.
|
|
15
|
-
**Model**: anthropic/claude-opus-4-
|
|
15
|
+
**Model**: anthropic/claude-opus-4-7
|
|
16
16
|
**Tools**: 14 orchestrator-accessible argus_* tools (argus_slither_analyze, argus_analyze_contract, argus_check_patterns, argus_proxy_detection, argus_solodit_search, argus_forge_test, argus_gas_analysis, argus_forge_fuzz, argus_forge_coverage, argus_skill_load, argus_generate_report, argus_record_finding, argus_read_findings, argus_sync_knowledge). `argus_persist_deduped` is reserved for Scribe.
|
|
17
17
|
|
|
18
18
|
## sentinel
|
|
19
19
|
|
|
20
20
|
**Role**: Static analysis and testing specialist
|
|
21
21
|
**Description**: Finds vulnerabilities through Slither static analysis, Foundry testing, fuzzing, and pattern matching. The tactical executor — runs tools, writes PoC tests, and verifies findings. Dispatched by Argus during Automated Scanning and Testing & Verification phases.
|
|
22
|
-
**Model**: anthropic/claude-sonnet-4-
|
|
22
|
+
**Model**: anthropic/claude-sonnet-4-7
|
|
23
23
|
**Tools**: argus_slither_analyze, argus_forge_test, argus_gas_analysis, argus_forge_fuzz, argus_forge_coverage, argus_analyze_contract, argus_check_patterns, argus_proxy_detection, argus_record_finding, skill
|
|
24
24
|
|
|
25
25
|
## pythia
|
|
26
26
|
|
|
27
27
|
**Role**: Vulnerability researcher
|
|
28
28
|
**Description**: Consults Solodit, SCVD, and the knowledge base to find historical precedents and known attack vectors. Searches 7,769+ real-world audit findings and 51 curated vulnerability pattern files. Dispatched by Argus during Vulnerability Research phase.
|
|
29
|
-
**Model**: anthropic/claude-sonnet-4-
|
|
29
|
+
**Model**: anthropic/claude-sonnet-4-7
|
|
30
30
|
**Tools**: argus_solodit_search, argus_check_patterns, argus_record_finding, skill
|
|
31
31
|
|
|
32
32
|
## scribe
|
|
33
33
|
|
|
34
34
|
**Role**: Audit report writer
|
|
35
35
|
**Description**: Transforms raw findings into professional markdown audit reports. Produces structured output with severity classifications (Critical/High/Medium/Low/Informational), impact assessments, proof-of-concept steps, and actionable recommendations. Dispatched by Argus only after all analysis is complete.
|
|
36
|
-
**Model**: anthropic/claude-sonnet-4-
|
|
36
|
+
**Model**: anthropic/claude-sonnet-4-7
|
|
37
37
|
**Tools**: argus_read_findings, argus_persist_deduped, argus_generate_report, skill
|
|
38
38
|
|
|
39
39
|
## themis
|
|
40
40
|
|
|
41
41
|
**Role**: Audit quality gate
|
|
42
|
-
**Description**: Independent cross-validation agent running on GPT-5.
|
|
43
|
-
**Model**: openai/gpt-5.
|
|
42
|
+
**Description**: Independent cross-validation agent running on GPT-5.5 (different LLM provider for reasoning diversity). Validates pipeline integrity: compares raw findings against Scribe's deduped output and the final report. Performs second-opinion research via Solodit and vulnerability skill checklists. Returns a structured verdict to Argus who makes the final decision. Dispatched by Argus after Scribe completes.
|
|
43
|
+
**Model**: openai/gpt-5.5
|
|
44
44
|
**Tools**: argus_read_findings, argus_solodit_search, argus_check_patterns, argus_skill_load, skill
|
package/README.md
CHANGED
|
@@ -65,11 +65,11 @@ Argus will automatically:
|
|
|
65
65
|
|
|
66
66
|
| Agent | Role | Model |
|
|
67
67
|
|-------|------|-------|
|
|
68
|
-
| `@argus` | Orchestrator — coordinates the full audit | claude-opus-4-
|
|
69
|
-
| `@sentinel` | Static analysis & testing specialist | claude-sonnet-4-
|
|
70
|
-
| `@pythia` | Vulnerability researcher | claude-sonnet-4-
|
|
71
|
-
| `@scribe` | Audit report writer | claude-sonnet-4-
|
|
72
|
-
| `@themis` | Independent audit quality gate | gpt-5.
|
|
68
|
+
| `@argus` | Orchestrator — coordinates the full audit | claude-opus-4-7 |
|
|
69
|
+
| `@sentinel` | Static analysis & testing specialist | claude-sonnet-4-7 |
|
|
70
|
+
| `@pythia` | Vulnerability researcher | claude-sonnet-4-7 |
|
|
71
|
+
| `@scribe` | Audit report writer | claude-sonnet-4-7 |
|
|
72
|
+
| `@themis` | Independent audit quality gate | gpt-5.5 |
|
|
73
73
|
|
|
74
74
|
### @argus — The Orchestrator
|
|
75
75
|
Argus Panoptes is the lead auditor. It follows a 7-step methodology (Reconnaissance, Automated Scanning, Manual Review, Attack Surface Mapping, Vulnerability Research, Testing & Verification, Reporting) and delegates to Sentinel, Pythia, Scribe, and Themis as needed.
|
|
@@ -284,11 +284,11 @@ Create `.argus/solidity-argus.jsonc` in your project root. `.opencode/solidity-a
|
|
|
284
284
|
```jsonc
|
|
285
285
|
{
|
|
286
286
|
"agents": {
|
|
287
|
-
"argus": { "model": "anthropic/claude-opus-4-
|
|
288
|
-
"sentinel": { "model": "anthropic/claude-sonnet-4-
|
|
289
|
-
"pythia": { "model": "anthropic/claude-sonnet-4-
|
|
290
|
-
"scribe": { "model": "anthropic/claude-sonnet-4-
|
|
291
|
-
"themis": { "model": "openai/gpt-5.
|
|
287
|
+
"argus": { "model": "anthropic/claude-opus-4-7" },
|
|
288
|
+
"sentinel": { "model": "anthropic/claude-sonnet-4-7" },
|
|
289
|
+
"pythia": { "model": "anthropic/claude-sonnet-4-7" },
|
|
290
|
+
"scribe": { "model": "anthropic/claude-sonnet-4-7" },
|
|
291
|
+
"themis": { "model": "openai/gpt-5.5" }
|
|
292
292
|
},
|
|
293
293
|
|
|
294
294
|
"tools": {
|
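The `agents` block above overrides per-agent models. Here is a minimal sketch of how such an override could fall back to the new defaults, assuming the same `??` pattern that the `config-handler.ts` hunk in this diff uses for Themis. `SketchConfig`, `resolveModel`, and the `example/custom-model` string are hypothetical names introduced for illustration; only the default model strings come from this release.

```typescript
// Defaults as listed in this release's constants/defaults.ts.
const DEFAULT_MODELS = {
  argus: "anthropic/claude-opus-4-7",
  sentinel: "anthropic/claude-sonnet-4-7",
  pythia: "anthropic/claude-sonnet-4-7",
  scribe: "anthropic/claude-sonnet-4-7",
  themis: "openai/gpt-5.5",
} as const

type AgentName = keyof typeof DEFAULT_MODELS

// Hypothetical, trimmed-down config shape (the real ArgusConfig has more fields).
type SketchConfig = { agents?: Partial<Record<AgentName, { model?: string }>> }

// Hypothetical helper: a per-agent override wins, otherwise the default applies.
function resolveModel(config: SketchConfig, agent: AgentName): string {
  return config.agents?.[agent]?.model ?? DEFAULT_MODELS[agent]
}

// Only Sentinel is overridden here; every other agent keeps its default.
const userConfig: SketchConfig = { agents: { sentinel: { model: "example/custom-model" } } }
const sentinelModel = resolveModel(userConfig, "sentinel")
const themisModel = resolveModel(userConfig, "themis")
console.log(sentinelModel, themisModel)
```

Under this assumption, leaving an agent out of `.argus/solidity-argus.jsonc` is enough to pick up the 0.5.8 defaults.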
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "solidity-argus",
|
|
3
|
-
"version": "0.5.
|
|
3
|
+
"version": "0.5.8",
|
|
4
4
|
"description": "Solidity smart contract security auditing plugin for OpenCode — 5 specialized agents, 15 tools (14 core + optional Solodit), and a curated vulnerability knowledge base",
|
|
5
5
|
"keywords": [
|
|
6
6
|
"solidity",
|
package/src/agents/argus-prompt.ts
CHANGED
|
@@ -229,7 +229,7 @@ Task(subagent_type="scribe", prompt="Generate the final audit report for Project
|
|
|
229
229
|
- **Constraint**: Only invoke Scribe after all analysis and testing are complete.
|
|
230
230
|
|
|
231
231
|
### **@themis** (The Quality Gate)
|
|
232
|
-
- **Role**: Independent audit validation using a different LLM provider (GPT-5.
|
|
232
|
+
- **Role**: Independent audit validation using a different LLM provider (GPT-5.5).
|
|
233
233
|
- **Tools**: \`argus_read_findings\`, \`argus_solodit_search\`, \`argus_check_patterns\`, \`argus_skill_load\`
|
|
234
234
|
- **Delegation Examples**:
|
|
235
235
|
\`\`\`
|
|
@@ -255,7 +255,7 @@ When building the final report or synthesizing findings:
|
|
|
255
255
|
2. **Secondary source**: Tool transcript text (use only when durable evidence is unavailable or incomplete).
|
|
256
256
|
3. **Never** synthesize findings from ephemeral background transcript retrieval alone if durable state evidence exists.
|
|
257
257
|
4. **Manual-finding durability**: If Argus, Sentinel, or Pythia identifies a finding outside analyzer tool payloads, they must call \
|
|
258
|
-
\`argus_record_finding\` before proceeding. The JSON payload
|
|
258
|
+
\`argus_record_finding\` before proceeding. The JSON payload should include \`impact\`, \`recommendation\`, and \`proofOfConcept\` fields whenever they are known. Missing enrichment is recorded with warnings rather than rejected, but Scribe must enrich final Critical/High findings before reporting.
|
|
259
259
|
5. **Report parity rule**: Scribe must not include findings in \`report_input\` unless they are event-backed (recorded via tools/events).
|
|
260
260
|
|
|
261
261
|
**Bounded background fan-out**: For deep audits, limit concurrent high-context background delegations to max 2 at a time. Split larger workloads into sequential waves. This prevents retrieval blind spots from simultaneous long-running tasks.
|
|
@@ -365,7 +365,7 @@ Your subagents have access to these specialized tools. Know when to delegate eac
|
|
|
365
365
|
"proofOfConcept": "Steps to reproduce or reference to PoC test"
|
|
366
366
|
}
|
|
367
367
|
\`\`\`
|
|
368
|
-
- **CRITICAL**: For Critical and High findings, \`impact\`, \`recommendation\`, and \`proofOfConcept\` are MANDATORY.
|
|
368
|
+
- **CRITICAL**: For Critical and High final report findings, \`impact\`, \`recommendation\`, and \`proofOfConcept\` are MANDATORY. For any finding with \`source: "slither"\`, preserve the finding even when enrichment is not ready, but add these three fields before final Scribe persistence whenever possible. \`argus_record_finding\` warns on incomplete Slither enrichment instead of dropping the finding. Preferred field names: \`check\`, \`file\`, \`lines\`. The aliases \`title\`/\`name\` → \`check\` and \`location\` → \`file\` are accepted but canonical names are preferred. Instruct Sentinel and Pythia accordingly when delegating.
|
|
369
369
|
|
|
370
370
|
- **\`argus_sync_knowledge\`**:
|
|
371
371
|
- **Use**: Maintenance.
|
package/src/agents/pythia-prompt.ts
CHANGED
|
@@ -103,11 +103,12 @@ You have two primary tools. Master them.
|
|
|
103
103
|
"lines": [startLine, endLine],
|
|
104
104
|
"source": "manual",
|
|
105
105
|
"impact": "Specific impact based on the historical precedent (e.g., 'Total vault drain via flash loan, similar to $X loss in Protocol Y')",
|
|
106
|
-
"recommendation": "Specific mitigation from the precedent audit report"
|
|
106
|
+
"recommendation": "Specific mitigation from the precedent audit report",
|
|
107
|
+
"proofOfConcept": "Steps to reproduce, exploit sketch, or reference to the historical exploit/audit evidence"
|
|
107
108
|
}
|
|
108
109
|
\`\`\`
|
|
109
110
|
|
|
110
|
-
**CRITICAL**: For Critical and High findings, \`impact\` and \`
|
|
111
|
+
**CRITICAL**: For Critical and High final report findings, \`impact\`, \`recommendation\`, and \`proofOfConcept\` are MANDATORY. \`argus_record_finding\` preserves incomplete findings with warnings rather than dropping them, but Scribe must enrich them before final reporting. Use your Solodit research to write specific, precedent-backed impact, recommendation, and proof-of-concept text — not generic placeholders.
|
|
111
112
|
|
|
112
113
|
**Interpretation**:
|
|
113
114
|
- A finding is not report-ready until it has been recorded through this tool.
|
package/src/agents/sentinel-prompt.ts
CHANGED
|
@@ -151,7 +151,7 @@ You have access to a specific set of tools. Use them effectively.
|
|
|
151
151
|
}
|
|
152
152
|
\`\`\`
|
|
153
153
|
|
|
154
|
-
**CRITICAL**: For Critical and High findings, \`impact\`, \`recommendation\`, and \`proofOfConcept\` are MANDATORY.
|
|
154
|
+
**CRITICAL**: For Critical and High findings, \`impact\`, \`recommendation\`, and \`proofOfConcept\` are MANDATORY. For any finding with \`source: "slither"\`, preserve the finding even when enrichment is not ready, but add these three fields before final Scribe persistence whenever possible. \`argus_record_finding\` warns on incomplete Slither enrichment instead of dropping the finding. Do not use generic placeholders — be specific to the vulnerability.
|
|
155
155
|
|
|
156
156
|
**Interpretation**:
|
|
157
157
|
- Recording is mandatory before handing findings to Argus for final synthesis.
|
package/src/agents/themis-prompt.ts
CHANGED
|
@@ -5,7 +5,7 @@ export const THEMIS_PROMPT = `You are **Themis**, the Quality Gate of Argus Pano
|
|
|
5
5
|
You are the final validation and review agent in the audit pipeline. You do not run the full audit from scratch and you do not write the final report. You verify that the pipeline output is complete, consistent, and defensible.
|
|
6
6
|
|
|
7
7
|
Model context:
|
|
8
|
-
- You run on **OpenAI GPT-5.
|
|
8
|
+
- You run on **OpenAI GPT-5.5**.
|
|
9
9
|
- This is intentionally a different provider than the other Argus agents (Claude) to increase reasoning diversity for final quality checks.
|
|
10
10
|
|
|
11
11
|
Your core responsibilities are:
|
package/src/cli/commands/doctor.ts
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
import { existsSync, readdirSync, readFileSync } from "node:fs"
|
|
2
|
-
import {
|
|
2
|
+
import { homedir } from "node:os"
|
|
3
|
+
import { basename, dirname, extname, join, resolve } from "node:path"
|
|
3
4
|
import { loadArgusConfig } from "../../config/loader"
|
|
4
5
|
import type { ArgusConfig } from "../../config/types"
|
|
5
6
|
import { createLogger } from "../../shared/logger"
|
|
@@ -133,6 +134,143 @@ export function buildSkillHealthReport(
|
|
|
133
134
|
}
|
|
134
135
|
}
|
|
135
136
|
|
|
137
|
+
// ─────────────────────────────────────────────────────────────────────────────
|
|
138
|
+
// Install-drift detection
|
|
139
|
+
//
|
|
140
|
+
// OpenCode's plugin resolver walks up the filesystem looking up `node_modules`
|
|
141
|
+
// directories. A stale copy of solidity-argus hoisted to a higher-precedence
|
|
142
|
+
// location (typically `~/.cache/opencode/node_modules/solidity-argus`) will
|
|
143
|
+
// SHADOW the canonical install under `~/.cache/opencode/packages/...`. The
|
|
144
|
+
// shadowing install is loaded silently, leading to confusing failures like
|
|
145
|
+
// `undefined is not an object (evaluating 'result.toLowerCase')` on every MCP
|
|
146
|
+
// call (older versions lacked defensive guards in `tool.execute.after`).
|
|
147
|
+
//
|
|
148
|
+
// This check enumerates known install locations and flags drift.
|
|
149
|
+
// ─────────────────────────────────────────────────────────────────────────────
|
|
150
|
+
|
|
151
|
+
export type ArgusInstallSource =
|
|
152
|
+
| "current"
|
|
153
|
+
| "hoisted-cache"
|
|
154
|
+
| "package-cache"
|
|
155
|
+
| "user-config"
|
|
156
|
+
| "project-local"
|
|
157
|
+
|
|
158
|
+
export type ArgusInstall = {
|
|
159
|
+
source: ArgusInstallSource
|
|
160
|
+
path: string
|
|
161
|
+
version: string | null
|
|
162
|
+
}
|
|
163
|
+
|
|
164
|
+
export type InstallDriftReport = {
|
|
165
|
+
current: ArgusInstall | null
|
|
166
|
+
installs: ArgusInstall[]
|
|
167
|
+
errors: string[]
|
|
168
|
+
warnings: string[]
|
|
169
|
+
}
|
|
170
|
+
|
|
171
|
+
function readPackageVersion(packageRoot: string): string | null {
|
|
172
|
+
try {
|
|
173
|
+
const raw = readFileSync(join(packageRoot, "package.json"), "utf8")
|
|
174
|
+
const parsed = JSON.parse(raw) as { version?: unknown }
|
|
175
|
+
return typeof parsed.version === "string" ? parsed.version : null
|
|
176
|
+
} catch {
|
|
177
|
+
return null
|
|
178
|
+
}
|
|
179
|
+
}
|
|
180
|
+
|
|
181
|
+
function getCurrentArgusInstall(): ArgusInstall | null {
|
|
182
|
+
// doctor.ts lives at <packageRoot>/src/cli/commands/doctor.ts
|
|
183
|
+
const packageRoot = resolve(import.meta.dir, "../../..")
|
|
184
|
+
if (!existsSync(join(packageRoot, "package.json"))) return null
|
|
185
|
+
const version = readPackageVersion(packageRoot)
|
|
186
|
+
return { source: "current", path: packageRoot, version }
|
|
187
|
+
}
|
|
188
|
+
|
|
189
|
+
export function enumerateArgusInstallCandidates(
|
|
190
|
+
cwd: string,
|
|
191
|
+
home: string,
|
|
192
|
+
): Array<{ source: ArgusInstallSource; path: string }> {
|
|
193
|
+
return [
|
|
194
|
+
{
|
|
195
|
+
source: "hoisted-cache",
|
|
196
|
+
path: join(home, ".cache", "opencode", "node_modules", "solidity-argus"),
|
|
197
|
+
},
|
|
198
|
+
{
|
|
199
|
+
source: "package-cache",
|
|
200
|
+
path: join(
|
|
201
|
+
home,
|
|
202
|
+
".cache",
|
|
203
|
+
"opencode",
|
|
204
|
+
"packages",
|
|
205
|
+
"solidity-argus@latest",
|
|
206
|
+
"node_modules",
|
|
207
|
+
"solidity-argus",
|
|
208
|
+
),
|
|
209
|
+
},
|
|
210
|
+
{
|
|
211
|
+
source: "user-config",
|
|
212
|
+
path: join(home, ".config", "opencode", "node_modules", "solidity-argus"),
|
|
213
|
+
},
|
|
214
|
+
{
|
|
215
|
+
source: "project-local",
|
|
216
|
+
path: join(cwd, "node_modules", "solidity-argus"),
|
|
217
|
+
},
|
|
218
|
+
]
|
|
219
|
+
}
|
|
220
|
+
|
|
221
|
+
function findArgusInstalls(cwd: string, home: string): ArgusInstall[] {
|
|
222
|
+
const installs: ArgusInstall[] = []
|
|
223
|
+
for (const { source, path } of enumerateArgusInstallCandidates(cwd, home)) {
|
|
224
|
+
if (existsSync(path)) {
|
|
225
|
+
installs.push({ source, path, version: readPackageVersion(path) })
|
|
226
|
+
}
|
|
227
|
+
}
|
|
228
|
+
return installs
|
|
229
|
+
}
|
|
230
|
+
|
|
231
|
+
export function detectInstallDrift(
|
|
232
|
+
current: ArgusInstall | null,
|
|
233
|
+
installs: ArgusInstall[],
|
|
234
|
+
): { errors: string[]; warnings: string[] } {
|
|
235
|
+
const errors: string[] = []
|
|
236
|
+
const warnings: string[] = []
|
|
237
|
+
|
|
238
|
+
const hoisted = installs.find((i) => i.source === "hoisted-cache")
|
|
239
|
+
const pkgCache = installs.find((i) => i.source === "package-cache")
|
|
240
|
+
|
|
241
|
+
// Highest-confidence error: hoisted cache shadows the canonical cache with a
|
|
242
|
+
// DIFFERENT version. OpenCode will load the wrong one.
|
|
243
|
+
if (hoisted && pkgCache && hoisted.version !== pkgCache.version) {
|
|
244
|
+
errors.push(
|
|
245
|
+
`Stale install shadowing canonical version:\n` +
|
|
246
|
+
` ${hoisted.path} (v${hoisted.version ?? "unknown"})\n` +
|
|
247
|
+
` shadows ${pkgCache.path} (v${pkgCache.version ?? "unknown"}).\n` +
|
|
248
|
+
` OpenCode will load v${hoisted.version ?? "unknown"} instead of v${pkgCache.version ?? "unknown"}.\n` +
|
|
249
|
+
` Fix: rm -rf "${hoisted.path}"`,
|
|
250
|
+
)
|
|
251
|
+
return { errors, warnings }
|
|
252
|
+
}
|
|
253
|
+
|
|
254
|
+
// Lower-confidence: hoisted install drifts from the version the doctor CLI
|
|
255
|
+
// is itself running as (typical when the user upgraded via bunx/opencode).
|
|
256
|
+
if (hoisted && current?.version && hoisted.version && hoisted.version !== current.version) {
|
|
257
|
+
warnings.push(
|
|
258
|
+
`Possible stale install (drift from running version):\n` +
|
|
259
|
+
` ${hoisted.path} (v${hoisted.version}) differs from current (v${current.version}).\n` +
|
|
260
|
+
` Fix: rm -rf "${hoisted.path}"`,
|
|
261
|
+
)
|
|
262
|
+
}
|
|
263
|
+
|
|
264
|
+
return { errors, warnings }
|
|
265
|
+
}
|
|
266
|
+
|
|
267
|
+
export function buildInstallDriftReport(cwd: string, home: string): InstallDriftReport {
|
|
268
|
+
const current = getCurrentArgusInstall()
|
|
269
|
+
const installs = findArgusInstalls(cwd, home)
|
|
270
|
+
const { errors, warnings } = detectInstallDrift(current, installs)
|
|
271
|
+
return { current, installs, errors, warnings }
|
|
272
|
+
}
|
|
273
|
+
|
|
136
274
|
const NON_SKILL_FILENAMES = new Set(["README.md", "INVENTORY.md", "CHANGELOG.md", "LICENSE.md"])
|
|
137
275
|
|
|
138
276
|
function scanMarkdownFiles(dir: string, maxDepth = 8): string[] {
|
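The drift check added above boils down to two version comparisons. The sketch below is a condensed restatement of that decision logic with sample inputs, not the shipped function: `classifyDrift`, the `/home/alice` paths, and the version pairs are hypothetical, and the messages are reduced to a three-way label.

```typescript
type InstallSource = "current" | "hoisted-cache" | "package-cache" | "user-config" | "project-local"
type Install = { source: InstallSource; path: string; version: string | null }

// Condensed form of the comparisons in detectInstallDrift (labels instead of messages).
function classifyDrift(current: Install | null, installs: Install[]): "error" | "warning" | "ok" {
  const hoisted = installs.find((i) => i.source === "hoisted-cache")
  const pkgCache = installs.find((i) => i.source === "package-cache")

  // A hoisted copy whose version differs from the canonical cache shadows it: hard error.
  if (hoisted && pkgCache && hoisted.version !== pkgCache.version) return "error"

  // A hoisted copy that only differs from the running CLI version: softer warning.
  if (hoisted && current?.version && hoisted.version && hoisted.version !== current.version) {
    return "warning"
  }
  return "ok"
}

// Hypothetical sample data.
const current: Install = { source: "current", path: "/home/alice/pkg", version: "0.5.8" }
const hoistedOld: Install = {
  source: "hoisted-cache",
  path: "/home/alice/.cache/opencode/node_modules/solidity-argus",
  version: "0.5.7",
}
const canonical: Install = {
  source: "package-cache",
  path: "/home/alice/.cache/opencode/packages/solidity-argus@latest/node_modules/solidity-argus",
  version: "0.5.8",
}

const shadowed = classifyDrift(current, [hoistedOld, canonical])
const drifted = classifyDrift(current, [hoistedOld])
const clean = classifyDrift(current, [canonical])
console.log(shadowed, drifted, clean)
```

As in the diff, the error branch returns early, so a shadowing install is reported once rather than as both an error and a warning.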
|
@@ -237,6 +375,22 @@ export const doctorCommand: CliCommand = {
|
|
|
237
375
|
cliOutput.log(`${YELLOW}⚠${RESET} Project: no Solidity project detected`)
|
|
238
376
|
}
|
|
239
377
|
|
|
378
|
+
const driftReport = buildInstallDriftReport(cwd, homedir())
|
|
379
|
+
if (driftReport.errors.length === 0 && driftReport.warnings.length === 0) {
|
|
380
|
+
const versionStr = driftReport.current?.version
|
|
381
|
+
? ` (current: v${driftReport.current.version})`
|
|
382
|
+
: ""
|
|
383
|
+
cliOutput.log(`${GREEN}✓${RESET} Install drift: none detected${versionStr}`)
|
|
384
|
+
} else {
|
|
385
|
+
for (const err of driftReport.errors) {
|
|
386
|
+
cliOutput.log(`${RED}✗${RESET} Install drift: ${err}`)
|
|
387
|
+
hasFailure = true
|
|
388
|
+
}
|
|
389
|
+
for (const warn of driftReport.warnings) {
|
|
390
|
+
cliOutput.log(`${YELLOW}⚠${RESET} Install drift: ${warn}`)
|
|
391
|
+
}
|
|
392
|
+
}
|
|
393
|
+
|
|
240
394
|
if (projectType === "foundry" && detectViaIr(cwd)) {
|
|
241
395
|
cliOutput.log(
|
|
242
396
|
`${YELLOW}⚠${RESET} via_ir: enabled in foundry.toml — Slither will use flatten fallback`,
|
package/src/cli/commands/install.ts
CHANGED
|
@@ -64,7 +64,8 @@ function addPluginToConfig(configPath: string): { added: boolean; ok: boolean }
|
|
|
64
64
|
|
|
65
65
|
export const installCommand: CliCommand = {
|
|
66
66
|
name: "install",
|
|
67
|
-
description:
|
|
67
|
+
description:
|
|
68
|
+
"Register solidity-argus in your OpenCode config (use --global for ~/.config/opencode)",
|
|
68
69
|
async execute(args: string[]): Promise<number> {
|
|
69
70
|
const isGlobal = args.includes("--global") || args.includes("-g")
|
|
70
71
|
const local = localConfigPath()
|
|
@@ -85,7 +86,9 @@ export const installCommand: CliCommand = {
|
|
|
85
86
|
` Installing globally would write to ${global} and load solidity-argus in EVERY OpenCode session.`,
|
|
86
87
|
)
|
|
87
88
|
cliOutput.warn(` To install globally on purpose, re-run with: argus install --global`)
|
|
88
|
-
cliOutput.warn(
|
|
89
|
+
cliOutput.warn(
|
|
90
|
+
` To install for this project, first create an opencode.json in this directory.`,
|
|
91
|
+
)
|
|
89
92
|
|
|
90
93
|
const proceed = await confirm("Install globally anyway?", false)
|
|
91
94
|
if (!proceed) {
|
package/src/constants/defaults.ts
CHANGED
|
@@ -1,9 +1,9 @@
|
|
|
1
1
|
export const DEFAULT_MODELS = {
|
|
2
|
-
argus: "anthropic/claude-opus-4-
|
|
3
|
-
sentinel: "anthropic/claude-sonnet-4-
|
|
4
|
-
pythia: "anthropic/claude-sonnet-4-
|
|
5
|
-
scribe: "anthropic/claude-sonnet-4-
|
|
6
|
-
themis: "openai/gpt-5.
|
|
2
|
+
argus: "anthropic/claude-opus-4-7",
|
|
3
|
+
sentinel: "anthropic/claude-sonnet-4-7",
|
|
4
|
+
pythia: "anthropic/claude-sonnet-4-7",
|
|
5
|
+
scribe: "anthropic/claude-sonnet-4-7",
|
|
6
|
+
themis: "openai/gpt-5.5",
|
|
7
7
|
} as const
|
|
8
8
|
|
|
9
9
|
export const DEFAULT_STEPS = 50 as const
|
package/src/create-hooks.ts
CHANGED
|
package/src/features/persistent-state/findings-materializer.ts
CHANGED
|
@@ -12,10 +12,7 @@ import type { CanonicalFinding, CanonicalToolExecution, ReportInput } from "../.
|
|
|
12
12
|
import { SCHEMA_VERSION } from "../../state/schemas"
|
|
13
13
|
import { readEvents } from "./event-sink"
|
|
14
14
|
|
|
15
|
-
export type MaterializeFindingsTrigger =
|
|
16
|
-
| "session.idle"
|
|
17
|
-
| "session.deleted"
|
|
18
|
-
| "tool.execute.after"
|
|
15
|
+
export type MaterializeFindingsTrigger = "session.idle" | "session.deleted" | "tool.execute.after"
|
|
19
16
|
|
|
20
17
|
export interface MaterializeFindingsForRunOptions {
|
|
21
18
|
failFast?: boolean
|
package/src/features/persistent-state/run-finalizer.ts
CHANGED
|
@@ -83,6 +83,54 @@ function asRecord(value: unknown): Record<string, unknown> | null {
|
|
|
83
83
|
return null
|
|
84
84
|
}
|
|
85
85
|
|
|
86
|
+
function isGenerateReportCompletion(event: AuditEvent): boolean {
|
|
87
|
+
if (event.type !== "tool.completed") return false
|
|
88
|
+
const payload = asRecord(event.payload)
|
|
89
|
+
if (!payload) return false
|
|
90
|
+
return payload.tool === "argus_generate_report" || payload.name === "argus_generate_report"
|
|
91
|
+
}
|
|
92
|
+
|
|
93
|
+
async function collectReportCompletenessErrors(events: AuditEvent[]): Promise<string[]> {
|
|
94
|
+
const errors: string[] = []
|
|
95
|
+
const reportEvents = events.filter(isGenerateReportCompletion)
|
|
96
|
+
|
|
97
|
+
for (const event of reportEvents) {
|
|
98
|
+
const payload = asRecord(event.payload)
|
|
99
|
+
const filePath = payload?.filePath
|
|
100
|
+
if (typeof filePath !== "string" || filePath.length === 0) continue
|
|
101
|
+
|
|
102
|
+
try {
|
|
103
|
+
const report = await Bun.file(filePath).text()
|
|
104
|
+
if (report.includes("## ⚠ Completeness Warning")) {
|
|
105
|
+
errors.push("generated report contains Completeness Warning")
|
|
106
|
+
}
|
|
107
|
+
} catch {
|
|
108
|
+
// Missing report files are handled by report-generation/tool-tracking gates.
|
|
109
|
+
}
|
|
110
|
+
}
|
|
111
|
+
|
|
112
|
+
return errors
|
|
113
|
+
}
|
|
114
|
+
|
|
115
|
+
function collectReportQualityGateErrors(events: AuditEvent[]): string[] {
|
|
116
|
+
const errors: string[] = []
|
|
117
|
+
const reportEvents = events.filter(isGenerateReportCompletion)
|
|
118
|
+
|
|
119
|
+
for (const event of reportEvents) {
|
|
120
|
+
const payload = asRecord(event.payload)
|
|
121
|
+
const qualityGates = asRecord(payload?.qualityGates)
|
|
122
|
+
if (qualityGates?.passed !== false) continue
|
|
123
|
+
|
|
124
|
+
const violations = Array.isArray(qualityGates.violations)
|
|
125
|
+
? qualityGates.violations.filter((entry): entry is string => typeof entry === "string")
|
|
126
|
+
: []
|
|
127
|
+
const details = violations.length > 0 ? `: ${violations.join("; ")}` : ""
|
|
128
|
+
errors.push(`generated report failed quality gates${details}`)
|
|
129
|
+
}
|
|
130
|
+
|
|
131
|
+
return errors
|
|
132
|
+
}
|
|
133
|
+
|
|
86
134
|
function collectParentChildIntegrityErrors(events: AuditEvent[]): string[] {
|
|
87
135
|
const errors: string[] = []
|
|
88
136
|
const parentByChild = new Map<string, string>()
|
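To see what the new quality-gate check reports, the sketch below feeds a few sample events through a condensed version of the logic in `collectReportQualityGateErrors`. The `AuditEvent` shape is trimmed to the two fields used here, the body of `asRecord` is an assumption (the diff shows only its signature), and the violation text is made up.

```typescript
type AuditEvent = { type: string; payload?: unknown }

// Assumed implementation: plain objects pass through, everything else becomes null.
function asRecord(value: unknown): Record<string, unknown> | null {
  return typeof value === "object" && value !== null && !Array.isArray(value)
    ? (value as Record<string, unknown>)
    : null
}

function qualityGateErrors(events: AuditEvent[]): string[] {
  const errors: string[] = []
  for (const event of events) {
    if (event.type !== "tool.completed") continue
    const payload = asRecord(event.payload)
    if (!payload) continue
    // Only completions of the report generator are inspected.
    if (payload.tool !== "argus_generate_report" && payload.name !== "argus_generate_report") continue

    const gates = asRecord(payload.qualityGates)
    // Only an explicit `passed: false` counts as a failure.
    if (gates?.passed !== false) continue

    // Non-string violation entries are dropped before formatting.
    const violations = Array.isArray(gates.violations)
      ? gates.violations.filter((entry): entry is string => typeof entry === "string")
      : []
    const details = violations.length > 0 ? `: ${violations.join("; ")}` : ""
    errors.push(`generated report failed quality gates${details}`)
  }
  return errors
}

// Hypothetical sample events.
const sampleEvents: AuditEvent[] = [
  { type: "tool.completed", payload: { tool: "argus_generate_report", qualityGates: { passed: true } } },
  {
    type: "tool.completed",
    payload: {
      name: "argus_generate_report",
      qualityGates: { passed: false, violations: ["High finding missing impact", 42] },
    },
  },
  { type: "tool.completed", payload: { tool: "argus_slither_analyze", qualityGates: { passed: false } } },
]
const gateErrors = qualityGateErrors(sampleEvents)
console.log(gateErrors)
```

A report event with no `qualityGates` field is treated as passing under this logic, since only `passed === false` produces an error.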
|
@@ -257,17 +305,25 @@ export async function finalizeRun(
|
|
|
257
305
|
const hasEventsAfterExistingFinalization =
|
|
258
306
|
existingResult !== null && existingResult.finalizedIndex < events.length - 1
|
|
259
307
|
if (existingResult?.invariantsPassed && !hasEventsAfterExistingFinalization) {
|
|
260
|
-
|
|
261
|
-
|
|
262
|
-
|
|
263
|
-
|
|
264
|
-
|
|
265
|
-
|
|
266
|
-
|
|
308
|
+
const reportErrors = [
|
|
309
|
+
...(await collectReportCompletenessErrors(events)),
|
|
310
|
+
...collectReportQualityGateErrors(events),
|
|
311
|
+
]
|
|
312
|
+
if (reportErrors.length === 0) {
|
|
313
|
+
return {
|
|
314
|
+
success: existingResult.success,
|
|
315
|
+
invariantsPassed: existingResult.invariantsPassed,
|
|
316
|
+
errors: existingResult.errors,
|
|
317
|
+
warnings: existingResult.warnings,
|
|
318
|
+
runId: existingResult.runId,
|
|
319
|
+
timestamp: existingResult.timestamp,
|
|
320
|
+
}
|
|
267
321
|
}
|
|
268
322
|
}
|
|
269
323
|
|
|
270
324
|
const { errors, warnings } = collectInvariantErrors(events)
|
|
325
|
+
errors.push(...(await collectReportCompletenessErrors(events)))
|
|
326
|
+
errors.push(...collectReportQualityGateErrors(events))
|
|
271
327
|
const invariantsPassed = errors.length === 0
|
|
272
328
|
const sessionId = events.at(-1)?.session_id ?? ""
|
|
273
329
|
|
package/src/hooks/config-handler.ts
CHANGED
|
@@ -185,7 +185,7 @@ export function createConfigHandler(
|
|
|
185
185
|
mode: "subagent",
|
|
186
186
|
model: argusConfig.agents?.themis?.model ?? DEFAULT_MODELS.themis,
|
|
187
187
|
steps: argusConfig.agents?.themis?.steps ?? DEFAULT_STEPS,
|
|
188
|
-
description: "Audit quality gate — independent cross-validation (GPT-5.
|
|
188
|
+
description: "Audit quality gate — independent cross-validation (GPT-5.5)",
|
|
189
189
|
prompt: THEMIS_PROMPT,
|
|
190
190
|
permission: {
|
|
191
191
|
argus_read_findings: "allow",
|
package/src/tools/persist-deduped-tool.ts
CHANGED
|
@@ -85,7 +85,7 @@ export const persistDedupedTool = tool({
|
|
|
85
85
|
deduped_findings: tool.schema
|
|
86
86
|
.string()
|
|
87
87
|
.describe(
|
|
88
|
-
"Serialized JSON array of deduplicated and enriched findings. Each finding should have: check, severity, confidence, description, file, lines, source, impact, recommendation.",
|
|
88
|
+
"Serialized JSON array of deduplicated and enriched findings. Each finding should have: check, severity, confidence, description, file, lines, source, impact, recommendation, proofOfConcept.",
|
|
89
89
|
),
|
|
90
90
|
},
|
|
91
91
|
async execute(args, context) {
|
package/src/tools/record-finding-tool.ts
CHANGED
|
@@ -28,6 +28,8 @@ type RecordFindingResponse = {
|
|
|
28
28
|
}>
|
|
29
29
|
schema_version: string
|
|
30
30
|
note: string
|
|
31
|
+
enrichment_warnings?: string[]
|
|
32
|
+
enrichment_hint?: string
|
|
31
33
|
}
|
|
32
34
|
|
|
33
35
|
type ParseResult = { ok: true; data: Record<string, unknown>[] } | { ok: false; error: string }
|
|
@@ -79,6 +81,16 @@ function errorResponse(error: string): string {
|
|
|
79
81
|
})
|
|
80
82
|
}
|
|
81
83
|
|
|
84
|
+
function collectMissingEnrichmentFields(
|
|
85
|
+
finding: ReturnType<typeof normalizeToCanonicalFinding>["data"],
|
|
86
|
+
): string[] {
|
|
87
|
+
const missing: string[] = []
|
|
88
|
+
if (!isNonEmptyString(finding.impact)) missing.push("impact")
|
|
89
|
+
if (!isNonEmptyString(finding.recommendation)) missing.push("recommendation")
|
|
90
|
+
if (!isNonEmptyString(finding.proofOfConcept)) missing.push("proofOfConcept")
|
|
91
|
+
return missing
|
|
92
|
+
}
|
|
93
|
+
|
|
82
94
|
export async function executeRecordFinding(
|
|
83
95
|
args: RecordFindingArgs,
|
|
84
96
|
context: ToolContext,
|
|
@@ -160,16 +172,21 @@ export async function executeRecordFinding(
|
|
|
160
172
|
return errorResponse(`Failed to record finding(s): ${errors.join("; ")}`)
|
|
161
173
|
}
|
|
162
174
|
|
|
163
|
-
// Warn when
|
|
175
|
+
// Warn when report-quality enrichment is missing without dropping findings.
|
|
164
176
|
const enrichmentWarnings: string[] = []
|
|
165
177
|
const HIGH_SEVERITIES = new Set(["Critical", "High"])
|
|
166
178
|
for (const f of findings) {
|
|
167
|
-
|
|
168
|
-
const missing: string[] = []
|
|
169
|
-
if (!f.impact) missing.push("impact")
|
|
170
|
-
if (!f.recommendation) missing.push("recommendation")
|
|
171
|
-
if (!f.proofOfConcept) missing.push("proofOfConcept")
|
|
179
|
+
const missing = collectMissingEnrichmentFields(f)
|
|
172
180
|
if (missing.length > 0) {
|
|
181
|
+
if (f.source === "slither") {
|
|
182
|
+
enrichmentWarnings.push(
|
|
183
|
+
`[${f.severity}] Slither finding ${f.check} in ${f.file} is missing: ${missing.join(", ")}. The finding was recorded, but Scribe must enrich it before final reporting.`,
|
|
184
|
+
)
|
|
185
|
+
continue
|
|
186
|
+
}
|
|
187
|
+
|
|
188
|
+
if (!HIGH_SEVERITIES.has(f.severity)) continue
|
|
189
|
+
|
|
173
190
|
enrichmentWarnings.push(
|
|
174
191
|
`[${f.severity}] ${f.check} in ${f.file} is missing: ${missing.join(", ")}. Quality gate will flag this.`,
|
|
175
192
|
)
|
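The warning rules in this hunk can be exercised with a few sample findings. The sketch below is a simplified restatement under stated assumptions: `SketchFinding` is a trimmed stand-in for the canonical finding type, the body of `isNonEmptyString` is a guess (the diff only calls it), and the warning strings are shortened.

```typescript
type SketchFinding = {
  check: string
  severity: string
  file: string
  source: string
  impact?: string
  recommendation?: string
  proofOfConcept?: string
}

// Assumed definition: the real helper is not shown in the diff.
const isNonEmptyString = (v: unknown): v is string => typeof v === "string" && v.trim().length > 0

function missingEnrichment(f: SketchFinding): string[] {
  const missing: string[] = []
  if (!isNonEmptyString(f.impact)) missing.push("impact")
  if (!isNonEmptyString(f.recommendation)) missing.push("recommendation")
  if (!isNonEmptyString(f.proofOfConcept)) missing.push("proofOfConcept")
  return missing
}

function enrichmentWarnings(findings: SketchFinding[]): string[] {
  const HIGH_SEVERITIES = new Set(["Critical", "High"])
  const warnings: string[] = []
  for (const f of findings) {
    const missing = missingEnrichment(f)
    if (missing.length === 0) continue
    // Slither findings warn at any severity; nothing is dropped.
    if (f.source === "slither") {
      warnings.push(`[${f.severity}] Slither finding ${f.check} in ${f.file} is missing: ${missing.join(", ")}`)
      continue
    }
    // Other sources only warn for Critical and High.
    if (!HIGH_SEVERITIES.has(f.severity)) continue
    warnings.push(`[${f.severity}] ${f.check} in ${f.file} is missing: ${missing.join(", ")}`)
  }
  return warnings
}

// Hypothetical sample findings.
const sampleWarnings = enrichmentWarnings([
  { check: "reentrancy-eth", severity: "Medium", file: "src/Vault.sol", source: "slither" },
  { check: "unchecked-return", severity: "Low", file: "src/Vault.sol", source: "manual" },
  { check: "access-control", severity: "High", file: "src/Admin.sol", source: "manual", impact: "Owner takeover", recommendation: "Add onlyOwner" },
])
console.log(sampleWarnings)
```

In this sample the Medium Slither finding and the High manual finding each produce a warning, while the Low manual finding is recorded silently.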
|
@@ -199,7 +216,7 @@ export async function executeRecordFinding(
|
|
|
199
216
|
? {
|
|
200
217
|
enrichment_warnings: enrichmentWarnings,
|
|
201
218
|
enrichment_hint:
|
|
202
|
-
"Critical and High findings MUST include impact, recommendation, and proofOfConcept fields.
|
|
219
|
+
"Critical and High findings MUST include impact, recommendation, and proofOfConcept fields. Slither findings should include all three fields before Scribe persists deduped findings; incomplete Slither records are preserved but will be flagged by report quality gates if not enriched downstream.",
|
|
203
220
|
}
|
|
204
221
|
: {}),
|
|
205
222
|
}
|
|
@@ -215,13 +232,13 @@ export const recordFindingTool = tool({
|
|
|
215
232
|
.string()
|
|
216
233
|
.optional()
|
|
217
234
|
.describe(
|
|
218
|
-
'Serialized JSON object for a single finding. Required fields: check (string, e.g. "reentrancy-eth"), severity (Critical|High|Medium|Low|Informational), confidence (High|Medium|Low), description (string), file (relative path, e.g. "src/Vault.sol"), lines ([startLine, endLine] tuple), source ("manual"). Optional: impact, recommendation, proofOfConcept (mandatory for Critical/High).',
|
|
235
|
+
'Serialized JSON object for a single finding. Required fields: check (string, e.g. "reentrancy-eth"), severity (Critical|High|Medium|Low|Informational), confidence (High|Medium|Low), description (string), file (relative path, e.g. "src/Vault.sol"), lines ([startLine, endLine] tuple), source ("manual"|"slither"|"pattern"|"scvd"|"solodit"|"fuzz"). Optional: impact, recommendation, proofOfConcept (mandatory for Critical/High final report findings; strongly recommended for Slither-source findings before Scribe persistence).',
|
|
219
236
|
),
|
|
220
237
|
findings: tool.schema
|
|
221
238
|
.string()
|
|
222
239
|
.optional()
|
|
223
240
|
.describe(
|
|
224
|
-
"Serialized JSON array of finding objects. Each object requires the same fields as the finding parameter: check, severity, confidence, description, file, lines, source. Aliases title/name → check and location → file are accepted but canonical names are preferred.",
|
|
241
|
+
"Serialized JSON array of finding objects. Each object requires the same fields as the finding parameter: check, severity, confidence, description, file, lines, source. impact, recommendation, and proofOfConcept are mandatory for Critical/High final report findings and strongly recommended for Slither-source findings before Scribe persistence. Aliases title/name → check and location → file are accepted but canonical names are preferred.",
|
|
225
242
|
),
|
|
226
243
|
},
|
|
227
244
|
async execute(args, context) {
|
|
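The updated parameter descriptions above list the fields a finding carries and which ones the quality gate expects on Critical and High findings. A minimal sketch of that payload and of an enrichment check follows. The type names, `ENRICHMENT_FIELDS`, and `missingEnrichmentFields` are hypothetical and introduced only for illustration. Only the field names, the severity and source values, and the `missing.join(", ")` warning format come from the diff.

```typescript
// Hypothetical types for illustration; field names follow the parameter description in the diff.
type Severity = "Critical" | "High" | "Medium" | "Low" | "Informational"
type Source = "manual" | "slither" | "pattern" | "scvd" | "solodit" | "fuzz"

interface FindingPayload {
  check: string
  severity: Severity
  confidence: "High" | "Medium" | "Low"
  description: string
  file: string
  lines: [number, number]
  source: Source
  impact?: string
  recommendation?: string
  proofOfConcept?: string
}

// Assumed helper: the three optional fields the enrichment hint names.
const ENRICHMENT_FIELDS = ["impact", "recommendation", "proofOfConcept"] as const

// Returns the enrichment fields that are absent or blank on a finding.
function missingEnrichmentFields(f: FindingPayload): string[] {
  return ENRICHMENT_FIELDS.filter((key) => {
    const value = f[key]
    return typeof value !== "string" || value.trim().length === 0
  })
}

const exampleFinding: FindingPayload = {
  check: "reentrancy-eth",
  severity: "High",
  confidence: "Medium",
  description: "External call before state update in withdraw()",
  file: "src/Vault.sol",
  lines: [42, 57],
  source: "slither",
  impact: "An attacker can drain the vault balance.",
}

// The tool parameter is described as a serialized JSON object, so the payload is stringified.
const serializedFinding = JSON.stringify(exampleFinding)

console.log(missingEnrichmentFields(exampleFinding).join(", ")) // recommendation, proofOfConcept
```

Under these assumptions, a Slither-sourced High finding with only `impact` set would still be recorded, but it would carry a warning naming the two missing fields.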
@@ -627,9 +627,7 @@ function parseReportInputPayload(
|
|
|
627
627
|
dedupedArtifact.findings,
|
|
628
628
|
effectiveRunId,
|
|
629
629
|
projectDir,
|
|
630
|
-
typeof dedupedArtifact.deduped_by === "string"
|
|
631
|
-
? dedupedArtifact.deduped_by
|
|
632
|
-
: "scribe",
|
|
630
|
+
typeof dedupedArtifact.deduped_by === "string" ? dedupedArtifact.deduped_by : "scribe",
|
|
633
631
|
)
|
|
634
632
|
const merged: Record<string, unknown> = {
|
|
635
633
|
...baseInput,
|
|
@@ -658,10 +656,7 @@ function parseReportInputPayload(
|
|
|
658
656
|
) {
|
|
659
657
|
merged.schema_version = SCHEMA_VERSION
|
|
660
658
|
}
|
|
661
|
-
if (
|
|
662
|
-
typeof merged.projectDir !== "string" ||
|
|
663
|
-
(merged.projectDir as string).length === 0
|
|
664
|
-
) {
|
|
659
|
+
if (typeof merged.projectDir !== "string" || (merged.projectDir as string).length === 0) {
|
|
665
660
|
merged.projectDir = projectDir
|
|
666
661
|
}
|
|
667
662
|
if (!Array.isArray(merged.scope)) {
|
|
@@ -858,6 +853,13 @@ function sortFindingsDeterministically(findings: Finding[]): Finding[] {
|
|
|
858
853
|
return [...findings].sort(compareFindingsDeterministically)
|
|
859
854
|
}
|
|
860
855
|
|
|
856
|
+
function hasDedupLineage(findings: Finding[]): boolean {
|
|
857
|
+
return findings.some((finding) => {
|
|
858
|
+
const observationIds = (finding as { observation_ids?: unknown }).observation_ids
|
|
859
|
+
return Array.isArray(observationIds) && observationIds.length > 0
|
|
860
|
+
})
|
|
861
|
+
}
|
|
862
|
+
|
|
861
863
|
export function validateReportQuality(
|
|
862
864
|
findings: Finding[],
|
|
863
865
|
policy: QualityGatePolicy,
|
|
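The `hasDedupLineage` helper added in the hunk above returns true when at least one finding carries a non-empty `observation_ids` array. The sketch below reproduces that check against a reduced, hypothetical `FindingLike` shape. The package's real `Finding` type is not shown in this diff, and the sample data is invented for illustration.

```typescript
// Hypothetical minimal shape; the package's actual Finding type has more fields.
interface FindingLike {
  check: string
  observation_ids?: unknown
}

// Same logic as the added helper: true when any finding has a non-empty observation_ids array.
function hasDedupLineageSketch(findings: FindingLike[]): boolean {
  return findings.some((finding) => {
    const observationIds = finding.observation_ids
    return Array.isArray(observationIds) && observationIds.length > 0
  })
}

// One finding records the observations it was merged from.
const withLineage: FindingLike[] = [
  { check: "reentrancy-eth", observation_ids: ["obs-1", "obs-2"] },
  { check: "unchecked-transfer" },
]

// Neither an empty array nor a non-array value counts as lineage.
const withoutLineage: FindingLike[] = [
  { check: "reentrancy-eth", observation_ids: [] },
  { check: "unchecked-transfer", observation_ids: "obs-3" },
]

console.log(hasDedupLineageSketch(withLineage)) // true
console.log(hasDedupLineageSketch(withoutLineage)) // false
```

Because the check uses `some`, a single finding with lineage is enough for the whole set to count as having it, and an empty findings list yields false.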
@@ -1154,7 +1156,7 @@ export async function executeReportGeneration(
|
|
|
1154
1156
|
deps: ReportGenerationDependencies = {},
|
|
1155
1157
|
): Promise<ReportGenerationResult> {
|
|
1156
1158
|
const includeExecutiveSummary = args.include_executive_summary ?? true
|
|
1157
|
-
const threshold = args.severity_threshold ?? "
|
|
1159
|
+
const threshold = args.severity_threshold ?? "informational"
|
|
1158
1160
|
const qualityGatePolicy = args.quality_gate_policy ?? "warn"
|
|
1159
1161
|
const toolCoveragePolicy = args.tool_coverage_policy ?? "enforce"
|
|
1160
1162
|
const expectedRunId = resolveExpectedRunId(args, context, deps)
|
|
@@ -1230,7 +1232,24 @@ export async function executeReportGeneration(
|
|
|
1230
1232
|
|
|
1231
1233
|
const eventFindings = dedupeFindingsForFinalOutput(projectFindings(events))
|
|
1232
1234
|
const inputFindings = dedupeFindingsForFinalOutput(reportInput.findings)
|
|
1233
|
-
const
|
|
1235
|
+
const hasLineage = hasDedupLineage(reportInput.findings)
|
|
1236
|
+
const shouldCheckParity = eventFindings.length === inputFindings.length || hasLineage
|
|
1237
|
+
const parity = shouldCheckParity
|
|
1238
|
+
? compareIssueFingerprintSets(eventFindings, inputFindings)
|
|
1239
|
+
: { missing: [], extra: [], matches: true }
|
|
1240
|
+
|
|
1241
|
+
if (!shouldCheckParity) {
|
|
1242
|
+
const unverifiableSummary = `event_findings=${eventFindings.length}, report_findings=${inputFindings.length}`
|
|
1243
|
+
if (preflightPolicy === "strict-fail") {
|
|
1244
|
+
throw new Error(
|
|
1245
|
+
`Preflight failed (strict-fail): finding parity not verifiable (${unverifiableSummary}; missing observation_ids)`,
|
|
1246
|
+
)
|
|
1247
|
+
}
|
|
1248
|
+
|
|
1249
|
+
warningBullets.push(
|
|
1250
|
+
`- Finding parity not verifiable: ${unverifiableSummary}; deduped findings must include observation_ids to prove merged observations were preserved`,
|
|
1251
|
+
)
|
|
1252
|
+
}
|
|
1234
1253
|
|
|
1235
1254
|
if (!parity.matches) {
|
|
1236
1255
|
const mismatchSummary = `missing=${parity.missing.length}, extra=${parity.extra.length}`
|