aegis-audit 2.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 rsh1k
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,131 @@
1
+ <div align="center">
2
+
3
+ # 🛡️ Aegis
4
+
5
+ ### AI-Powered Smart Contract Security Auditor
6
+
7
+ *Detect vulnerabilities before attackers do — mapped to OWASP, MITRE ATT&CK, CWE & NIST.*
8
+
9
+ [![License: MIT](https://img.shields.io/badge/License-MIT-00e6b4.svg)](LICENSE)
10
+ [![Node](https://img.shields.io/badge/node-%3E%3D18-339933.svg)](https://nodejs.org)
11
+ [![OWASP SC Top 10](https://img.shields.io/badge/OWASP-SC%20Top%2010%20(2026)-185fa5.svg)](https://owasp.org/www-project-smart-contract-top-10/)
12
+ [![Benchmarked](https://img.shields.io/badge/benchmark-SmartBugs%20Curated-854f0b.svg)](benchmark/README.md)
13
+ [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-00e6b4.svg)](CONTRIBUTING.md)
14
+
15
+ </div>
16
+
17
+ ---
18
+
19
+ Aegis is a command-line security auditor for Solidity smart contracts. It combines a deterministic static-analysis engine with an AI semantic layer, maps every finding to industry frameworks, and produces enterprise-grade compliance artifacts (SARIF, SBOM, signed audit logs). Its detection accuracy is measured against an academic benchmark — not asserted.
20
+
21
+ ```bash
22
+ npm install -g aegis-audit
23
+ aegis audit ./contracts/MyToken.sol
24
+ ```
25
+
26
+ ## Why Aegis
27
+
28
+ Most scanners hand you a list of bugs. Aegis is built for teams that ship to production:
29
+
30
+ - **Full OWASP SC Top 10 (2026) coverage** — including the categories that actually cause losses. Access control alone was $953M of $1.42B in 2024 losses, far ahead of reentrancy.
31
+ - **Framework traceability** — every finding carries its `SC0X:2026`, `CWE-XXX`, and MITRE `TXXXX` identifiers for audit and compliance reporting.
32
+ - **Red-team attack-path synthesis** — chains individual findings into the multi-step exploits an APT would actually run (flash-loan price manipulation, proxy takeover, recursive drain).
33
+ - **Offline mode** — `--offline` runs all static detectors without your source code ever leaving the machine. Built for proprietary and regulated codebases.
34
+ - **CI/CD native** — SARIF 2.1.0 output, configurable fail thresholds, proper exit codes.
35
+ - **NIST SSDF (SP 800-218) outputs** — CycloneDX SBOM generation, encrypted key storage, and a tamper-evident hash-chained audit log.
36
+ - **Measured, not claimed** — ships with a benchmark harness scored against the SmartBugs Curated dataset.
37
+
38
+ ## Install
39
+
40
+ ```bash
41
+ npm install -g aegis-audit
42
+ aegis config # set encrypted API key (or use --offline)
43
+ ```
44
+
45
+ Create an Anthropic API key at [console.anthropic.com](https://console.anthropic.com). Enterprises should prefer setting `ANTHROPIC_API_KEY` via a secrets manager, or use `--offline`.
46
+
47
+ ## Usage
48
+
49
+ ```bash
50
+ # Audit a local file, a folder, or a verified on-chain address
51
+ aegis audit ./contracts/MyToken.sol
52
+ aegis audit ./contracts/
53
+ aegis audit 0x1f9840a85d5aF5bf1D1762F925BDADdC4201F984 --network ethereum
54
+
55
+ # Enterprise / regulated: never transmit source code
56
+ aegis audit ./contracts/ --offline
57
+
58
+ # Generate compliance + CI artifacts
59
+ aegis audit ./contracts/ --sarif results.sarif --sbom sbom.json --output report.md
60
+
61
+ # CI gate (non-zero exit on high+ findings)
62
+ aegis audit ./contracts/ --ci --fail-on high
63
+
64
+ # Measure detector accuracy against labeled datasets
65
+ aegis benchmark
66
+ aegis benchmark --fetch-smartbugs
67
+ ```
68
+
69
+ ## What it detects — OWASP Smart Contract Top 10 (2026)
70
+
71
+ | ID | Category | Detectors |
72
+ |----|----------|-----------|
73
+ | SC01 | Access Control | Missing modifiers, `tx.origin` auth, unprotected `selfdestruct` |
74
+ | SC02 | Business Logic | Precision loss, unbounded loops (DoS), weak randomness, timestamp logic |
75
+ | SC03 | Price Oracle Manipulation | Spot-price-as-oracle detection |
76
+ | SC04 | Flash Loan | Callback-invariant flags |
77
+ | SC05 | Input Validation | Missing zero-address checks |
78
+ | SC06 | Unchecked External Calls | Unchecked `.call`, unsafe ERC20 transfer |
79
+ | SC07 / SC09 | Arithmetic / Overflow | Pre-0.8 Solidity, `unchecked` blocks |
80
+ | SC08 | Reentrancy | External-call-before-state, missing guards |
81
+ | SC10 | Proxy & Upgradeability | Unprotected initializer, `delegatecall` |
82
+
83
+ Plus a **Claude AI semantic layer** for business-logic and economic attacks that pattern matching misses.
84
+
85
+ ## Accuracy
86
+
87
+ Aegis ships with a benchmark harness so detection is evidence-based. On the academic [SmartBugs Curated](https://github.com/smartbugs/smartbugs-curated) dataset (143 labeled contracts), the deterministic static layer alone scores:
88
+
89
+ | Metric | Value |
90
+ |--------|-------|
91
+ | Overall recall | 62.6% |
92
+ | Reentrancy precision | 94.1% |
93
+ | Unchecked-call recall | 76.9% |
94
+ | Full-dataset scan time | < 1 second |
95
+
96
+ The AI layer adds semantic recall on top of this floor. Full per-category numbers and methodology are in [`benchmark/README.md`](benchmark/README.md). For comparison, the ICSE 2020 study found individual mature tools each detect only a fraction of the dataset — which is why running multiple tools, plus a human audit, is the recommended practice.
97
+
98
+ ## CI example (GitHub Actions)
99
+
100
+ ```yaml
101
+ - name: Aegis audit
102
+ env:
103
+ ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
104
+ run: |
105
+ npm install -g aegis-audit
106
+ aegis audit ./contracts/ --ci --fail-on high --sarif results.sarif
107
+ - uses: github/codeql-action/upload-sarif@v3
108
+ with:
109
+ sarif_file: results.sarif
110
+ ```
111
+
112
+ ## Security & threat model
113
+
114
+ This tool is itself part of your software supply chain. State-sponsored groups (e.g. Lazarus/BlueNoroff, MITRE G0032) actively target Web3 developer toolchains via malicious packages (T1195). Accordingly:
115
+
116
+ - API keys are encrypted at rest (AES-256-GCM); enterprises should prefer a secrets manager.
117
+ - `--offline` guarantees no source transmission.
118
+ - The audit log is append-only and hash-chained — any edit to history is detectable via `aegis config`.
119
+ - All detector patterns are bounded against regex denial-of-service.
120
+
121
+ ## Disclaimer
122
+
123
+ Aegis is an AI-assisted automated scanner. It is **not** a substitute for a professional manual audit, formal verification, or economic review. Automated analysis produces both false positives and false negatives. For high-value or production deployments, commission an independent human audit and, where applicable, formal verification of critical invariants.
124
+
125
+ ## Contributing
126
+
127
+ Contributions are welcome — see [CONTRIBUTING.md](CONTRIBUTING.md). The most valuable contributions right now are new detectors (front-running/MEV, improved DoS) with corresponding benchmark fixtures.
128
+
129
+ ## License
130
+
131
+ [MIT](LICENSE) © 2026 rsh1k
@@ -0,0 +1,94 @@
1
+ # Aegis Benchmark
2
+
3
+ Measures the accuracy of Aegis's **static detector layer** against labeled
4
+ vulnerable contracts, so claims about detection are evidence-based, not asserted.
5
+
6
+ ## Run it
7
+
8
+ ```bash
9
+ # Quick run against the 7 built-in labeled fixtures (offline, no deps)
10
+ aegis benchmark
11
+
12
+ # Full academic benchmark: clone & score SmartBugs Curated (143 contracts)
13
+ aegis benchmark --fetch-smartbugs
14
+
15
+ # Your own labeled dataset
16
+ aegis benchmark --dataset ./my-labeled-contracts/
17
+
18
+ # Machine-readable output for CI dashboards
19
+ aegis benchmark --fetch-smartbugs --output benchmark.json
20
+ ```
21
+
22
+ ## Dataset
23
+
24
+ The full benchmark uses [SmartBugs Curated](https://github.com/smartbugs/smartbugs-curated)
25
+ — the academic standard, 143 contracts with 208 tagged vulnerabilities across 10
26
+ DASP categories. It was used to compare 9 analysis tools in the ICSE 2020 study
27
+ (Durieux et al.).
28
+
29
+ Ground-truth labels are read from the dataset's own annotations:
30
+ `// <yes> <report> CATEGORY` markers, falling back to the category folder name.
31
+
32
+ ## How scoring works
33
+
34
+ - **Positive** for category C = contract carries a `<yes>` marker for C.
35
+ - **Detection** of C = Aegis emits any OWASP SC 2026 ID mapped from C (see `mapping.js`).
36
+ - Scoring is **contract-level per category**, matching how SmartBugs tool
37
+ comparisons report.
38
+ - We compute, per category and overall: precision, recall, F1, and
39
+ false-negative rate.
40
+ - A separate **clean-contract false-positive rate** is measured against
41
+ contracts with no `<yes>` markers.
42
+
43
+ ## Honest scope limits
44
+
45
+ - **Static layer only.** The Claude AI semantic layer is non-deterministic and is
46
+ not scored here. Production recall with the AI layer enabled is expected to be at least as high, but the
47
+ static floor is what you can rely on deterministically.
48
+ - **Out of scope:** `short_addresses` (an ABI/calldata-level issue not visible in
49
+ source) and `other` (unspecified) are excluded from recall so the tool isn't
50
+ credited or penalized for classes it doesn't claim to cover.
51
+ - **Precision on tiny fixture sets is pessimistic** because vulnerable fixtures
52
+ often contain multiple real issues but are labeled for only one category; the
53
+ extra true findings count against precision. This evens out on the full dataset.
54
+
55
+ ## What good numbers look like
56
+
57
+ A non-zero false-negative rate is expected and is the entire reason this tool
58
+ must not be the only gate before deploying high-value contracts. The benchmark
59
+ exists to quantify that gap, not to hide it. Track recall over time as detectors
60
+ improve; treat any regression as a release blocker.
61
+
62
+ ## Baseline results (static layer, SmartBugs Curated, 143 contracts)
63
+
64
+ Measured with `aegis benchmark --fetch-smartbugs`. These are the deterministic
65
+ static-layer numbers; the Claude AI layer adds further semantic recall on top.
66
+
67
+ | Category | Support | Recall | Precision |
68
+ |---|---|---|---|
69
+ | Reentrancy | 31 | 51.6% | 94.1% |
70
+ | Access Control | 18 | 50.0% | 30.0% |
71
+ | Arithmetic | 15 | 93.3% | 10.2%* |
72
+ | Unchecked Low-Level Calls | 52 | 76.9% | 61.5% |
73
+ | Denial of Service | 6 | 33.3% | 8.0% |
74
+ | Bad Randomness | 8 | 50.0% | 16.0% |
75
+ | Front Running | 4 | 0.0%** | 0.0% |
76
+ | Time Manipulation | 5 | 40.0% | 8.0% |
77
+ | **Overall (micro)** | **139** | **62.6%** | **24.9%** |
78
+
79
+ \* Arithmetic precision is depressed because most pre-0.8 contracts trip the
80
+ overflow detector broadly; on modern 0.8+ code this is far lower noise.
81
+ \** No dedicated front-running detector exists yet — scored honestly as 0.
82
+
83
+ For comparison, the ICSE 2020 study (Durieux et al.) found that individual
84
+ mature tools (Slither, Mythril, etc.) each detected only a fraction of the
85
+ dataset, which is why running multiple tools — plus a human audit — is the
86
+ recommended practice. Aegis is one layer in that stack, not a replacement
87
+ for it.
88
+
89
+ ### Known gaps (tracked for improvement)
90
+ - No front-running / MEV detector.
91
+ - DoS and time-manipulation recall are low; detectors need refinement.
92
+ - Precision is noisy on legacy Solidity; modern-code precision is higher.
93
+ - Performance: all detectors are bounded to avoid regex DoS — full 143-contract
94
+ scan runs in under 1 second.
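As a worked check of the baseline table above, the reentrancy row can be reproduced from contract-level confusion counts. The counts used here (16 true positives, 15 false negatives, 1 false positive) are not stated in the README; they are an assumption, inferred as the integers consistent with a support of 31, 51.6% recall, and 94.1% precision. The helper name `scoreCategory` is illustrative and not part of the package.

```javascript
// Hypothetical helper: derive the reported metrics from contract-level counts.
function scoreCategory({ tp, fn, fp }) {
  const support = tp + fn; // contracts labeled positive for the category
  const precision = tp + fp > 0 ? tp / (tp + fp) : null;
  const recall = support > 0 ? tp / support : null;
  // F1 is the harmonic mean of precision and recall.
  let f1 = null;
  if (precision !== null && recall !== null) {
    f1 = precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  }
  const falseNegativeRate = support > 0 ? fn / support : null;
  return { support, precision, recall, f1, falseNegativeRate };
}

// Assumed counts for the reentrancy row (inferred, not taken from the package).
const reentrancy = scoreCategory({ tp: 16, fn: 15, fp: 1 });
const pct = (x) => (x * 100).toFixed(1) + '%';

console.log('recall   ', pct(reentrancy.recall));
console.log('precision', pct(reentrancy.precision));
console.log('F1       ', pct(reentrancy.f1));
```

Under these assumed counts the recall and precision match the table (51.6% and 94.1%), and F1 comes out at 66.7%. One deliberate difference from the runner: this sketch reports an F1 of 0 when a category has detections but no true positives, whereas the runner's truthiness check yields `null` in that case.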
@@ -0,0 +1,12 @@
1
+ /*
2
+ * @source: SmartBugs Curated (fixture)
3
+ * @vulnerable_at_lines: 10
4
+ */
5
+ pragma solidity ^0.4.24;
6
+ contract Unprotected {
7
+ address public owner;
8
+ function Unprotected() public { owner = msg.sender; }
9
+ // <yes> <report> ACCESS_CONTROL
10
+ function setOwner(address newOwner) public { owner = newOwner; }
11
+ function withdraw() public { require(msg.sender == owner); msg.sender.transfer(address(this).balance); }
12
+ }
@@ -0,0 +1,14 @@
1
+ /*
2
+ * @source: SmartBugs Curated (fixture)
3
+ * @vulnerable_at_lines: 12
4
+ */
5
+ pragma solidity ^0.4.24;
6
+ contract IntegerOverflow {
7
+ mapping(address => uint256) public balances;
8
+ function transfer(address to, uint256 value) public {
9
+ require(balances[msg.sender] - value >= 0);
10
+ balances[msg.sender] -= value;
11
+ // <yes> <report> ARITHMETIC
12
+ balances[to] += value;
13
+ }
14
+ }
@@ -0,0 +1,12 @@
1
+ /*
2
+ * @source: SmartBugs Curated (fixture)
3
+ * @vulnerable_at_lines: 9
4
+ */
5
+ pragma solidity ^0.4.24;
6
+ contract Lottery {
7
+ function pickWinner(uint guess) public view returns (bool) {
8
+ // <yes> <report> BAD_RANDOMNESS
9
+ uint rand = uint(keccak256(abi.encodePacked(block.timestamp, block.difficulty)));
10
+ return guess == rand % 100;
11
+ }
12
+ }
@@ -0,0 +1,24 @@
1
+ /*
2
+ * @source: fixture (intentionally safe — no <yes> markers)
3
+ */
4
+ pragma solidity 0.8.24;
5
+ import "@openzeppelin/contracts/access/Ownable.sol";
6
+ import "@openzeppelin/contracts/utils/ReentrancyGuard.sol";
7
+ contract SafeVault is Ownable, ReentrancyGuard {
8
+ mapping(address => uint256) private balances;
9
+ event Deposited(address indexed who, uint256 amount);
10
+ event Withdrawn(address indexed who, uint256 amount);
11
+ constructor() Ownable(msg.sender) {}
12
+ function deposit() external payable {
13
+ require(msg.value > 0, "zero");
14
+ balances[msg.sender] += msg.value;
15
+ emit Deposited(msg.sender, msg.value);
16
+ }
17
+ function withdraw(uint256 amount) external nonReentrant {
18
+ require(balances[msg.sender] >= amount, "insufficient");
19
+ balances[msg.sender] -= amount;
20
+ (bool ok, ) = msg.sender.call{value: amount}("");
21
+ require(ok, "transfer failed");
22
+ emit Withdrawn(msg.sender, amount);
23
+ }
24
+ }
@@ -0,0 +1,16 @@
1
+ /*
2
+ * @source: SmartBugs Curated (fixture)
3
+ * @vulnerable_at_lines: 12
4
+ */
5
+ pragma solidity ^0.4.19;
6
+ contract SimpleDAO {
7
+ mapping (address => uint) public credit;
8
+ function donate(address to) payable public { credit[to] += msg.value; }
9
+ function withdraw(uint amount) public {
10
+ if (credit[msg.sender] >= amount) {
11
+ // <yes> <report> REENTRANCY
12
+ bool res = msg.sender.call.value(amount)();
13
+ credit[msg.sender] -= amount;
14
+ }
15
+ }
16
+ }
@@ -0,0 +1,11 @@
1
+ /*
2
+ * @source: SmartBugs Curated (fixture)
3
+ * @vulnerable_at_lines: 9
4
+ */
5
+ pragma solidity ^0.4.25;
6
+ contract TimedCrowdsale {
7
+ function isSaleFinished() view public returns (bool) {
8
+ // <yes> <report> TIME_MANIPULATION
9
+ return block.timestamp >= 1546300800;
10
+ }
11
+ }
@@ -0,0 +1,11 @@
1
+ /*
2
+ * @source: SmartBugs Curated (fixture)
3
+ * @vulnerable_at_lines: 9
4
+ */
5
+ pragma solidity ^0.5.0;
6
+ contract UncheckedReturn {
7
+ function withdraw(address payable to, uint amount) public {
8
+ // <yes> <report> UNCHECKED_LL_CALLS
9
+ to.call.value(amount)("");
10
+ }
11
+ }
@@ -0,0 +1,45 @@
1
+ // ─────────────────────────────────────────────────────────────────────────────
2
+ // Benchmark category mapping
3
+ // SmartBugs Curated uses the DASP taxonomy (// <yes> <report> CATEGORY markers).
4
+ // Aegis emits OWASP SC Top 10 (2026) IDs. This maps between them so we can
5
+ // score detections against ground-truth labels.
6
+ //
7
+ // DASP categories in the dataset directory names:
8
+ // reentrancy, access_control, arithmetic, unchecked_low_level_calls,
9
+ // denial_of_service, bad_randomness, front_running, time_manipulation,
10
+ // short_addresses, other
11
+ // ─────────────────────────────────────────────────────────────────────────────
12
+
13
+ // Map each DASP category -> the OWASP SC 2026 IDs Aegis would emit for it.
14
+ // A detection "counts" for a labeled contract if Aegis reports ANY of the
15
+ // mapped OWASP IDs.
16
+ export const DASP_TO_OWASP = {
17
+ reentrancy: ['SC08:2026'],
18
+ access_control: ['SC01:2026'],
19
+ arithmetic: ['SC07:2026', 'SC09:2026'],
20
+ unchecked_low_level_calls: ['SC06:2026'],
21
+ denial_of_service: ['SC02:2026'], // DoS detector tagged SC02 in our engine
22
+ bad_randomness: ['SC02:2026'], // weak-randomness detector tagged SC02
23
+ front_running: ['SC02:2026'], // no dedicated detector; logic family
24
+ time_manipulation: ['SC02:2026'], // timestamp detector tagged SC02
25
+ short_addresses: [], // not detectable from source (ABI-level)
26
+ other: [], // unspecified; excluded from scoring
27
+ };
28
+
29
+ // Categories we make NO claim to detect. Excluded from recall scoring so the
30
+ // benchmark is honest about scope rather than penalizing undetectable classes.
31
+ export const OUT_OF_SCOPE = ['short_addresses', 'other'];
32
+
33
+ // Human-readable labels for the report.
34
+ export const DASP_LABELS = {
35
+ reentrancy: 'Reentrancy',
36
+ access_control: 'Access Control',
37
+ arithmetic: 'Arithmetic',
38
+ unchecked_low_level_calls: 'Unchecked Low-Level Calls',
39
+ denial_of_service: 'Denial of Service',
40
+ bad_randomness: 'Bad Randomness',
41
+ front_running: 'Front Running',
42
+ time_manipulation: 'Time Manipulation',
43
+ short_addresses: 'Short Addresses (out of scope)',
44
+ other: 'Other (out of scope)',
45
+ };
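One consequence of the mapping above is that four DASP categories share the single ID `SC02:2026`. The sketch below applies the same "any mapped ID counts" rule to a trimmed copy of the table to show the fan-out; `categoriesFor` is a hypothetical name, not a function from the package.

```javascript
// Trimmed copy of the mapping above: one distinct category plus the four
// that all map to SC02:2026.
const DASP_TO_OWASP = {
  reentrancy: ['SC08:2026'],
  denial_of_service: ['SC02:2026'],
  bad_randomness: ['SC02:2026'],
  front_running: ['SC02:2026'],
  time_manipulation: ['SC02:2026'],
};

// Hypothetical helper: a category is "detected" if any of its mapped IDs was emitted.
function categoriesFor(emittedOwaspIds) {
  const hits = new Set(emittedOwaspIds);
  return Object.entries(DASP_TO_OWASP)
    .filter(([, ids]) => ids.some((id) => hits.has(id)))
    .map(([dasp]) => dasp);
}

// A single SC02 finding (for example, a timestamp check) is credited to four categories.
const credited = categoriesFor(['SC02:2026']);
console.log(credited);
```

For a contract labeled only `TIME_MANIPULATION`, the other three credited categories would be scored as false positives. That is a plausible contributor to the low precision figures in the SC02-mapped rows, though the diff does not confirm it.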
@@ -0,0 +1,164 @@
1
+ // ─────────────────────────────────────────────────────────────────────────────
2
+ // Aegis benchmark runner
3
+ // Evaluates the static detector engine against labeled vulnerable contracts
4
+ // (SmartBugs Curated format). Computes per-category and overall
5
+ // precision / recall / F1 / false-negative-rate.
6
+ //
7
+ // Methodology notes (stated for honesty):
8
+ // - Ground truth = DASP categories marked with "// <yes> <report> CATEGORY".
9
+ // - A contract is a POSITIVE for category C if it carries a <yes> marker for C.
10
+ // - Aegis "detects" C if it emits any OWASP ID mapped from C (see mapping.js).
11
+ // - Detection is scored at CONTRACT level per category (not line level), matching
12
+ // how most SmartBugs tool comparisons report (Durieux et al., ICSE 2020).
13
+ // - short_addresses and other are OUT OF SCOPE and excluded from recall.
14
+ // - This measures the STATIC layer only; the Claude AI layer is not benchmarked
15
+ // here because its output is non-deterministic. Real recall in production is
16
+ // >= static-only recall.
17
+ // ─────────────────────────────────────────────────────────────────────────────
18
+
19
+ import fs from 'fs';
20
+ import path from 'path';
21
+ import { fileURLToPath } from 'url';
22
+ import { runDetectors } from '../src/analyzers/detectors.js';
23
+ import { DASP_TO_OWASP, OUT_OF_SCOPE, DASP_LABELS } from './mapping.js';
24
+
25
+ const __dirname = path.dirname(fileURLToPath(import.meta.url));
26
+
27
+ // Map a "<report> CATEGORY" token to a DASP key.
28
+ const REPORT_TOKEN_TO_DASP = {
29
+ REENTRANCY: 'reentrancy',
30
+ ACCESS_CONTROL: 'access_control',
31
+ ARITHMETIC: 'arithmetic',
32
+ UNCHECKED_LL_CALLS: 'unchecked_low_level_calls',
33
+ UNCHECKED_LOW_LEVEL_CALLS: 'unchecked_low_level_calls',
34
+ DENIAL_OF_SERVICE: 'denial_of_service',
35
+ DOS: 'denial_of_service',
36
+ BAD_RANDOMNESS: 'bad_randomness',
37
+ FRONT_RUNNING: 'front_running',
38
+ TIME_MANIPULATION: 'time_manipulation',
39
+ SHORT_ADDRESSES: 'short_addresses',
40
+ OTHER: 'other',
41
+ };
42
+
43
+ // Extract ground-truth DASP categories from a contract's annotations.
44
+ // Falls back to the parent directory name (SmartBugs organizes by category folder).
45
+ export function groundTruthCategories(source, filePath) {
46
+ const cats = new Set();
47
+ const re = /\/\/\s*<yes>\s*<report>\s*([A-Z_]+)/g;
48
+ let m;
49
+ while ((m = re.exec(source)) !== null) {
50
+ const dasp = REPORT_TOKEN_TO_DASP[m[1]];
51
+ if (dasp) cats.add(dasp);
52
+ }
53
+ // Directory-name fallback (full SmartBugs layout: dataset/<category>/file.sol)
54
+ if (cats.size === 0 && filePath) {
55
+ const parent = path.basename(path.dirname(filePath));
56
+ if (DASP_TO_OWASP[parent]) cats.add(parent);
57
+ }
58
+ return [...cats];
59
+ }
60
+
61
+ // Which categories did Aegis detect for this source?
62
+ export function detectedCategories(source) {
63
+ const findings = runDetectors(source);
64
+ const owaspHits = new Set(findings.map(f => f.owasp));
65
+ const detected = new Set();
66
+ for (const [dasp, owaspIds] of Object.entries(DASP_TO_OWASP)) {
67
+ if (owaspIds.some(id => owaspHits.has(id))) detected.add(dasp);
68
+ }
69
+ return { detected: [...detected], findings };
70
+ }
71
+
72
+ // Run the benchmark over a directory of .sol files (recursively).
73
+ export function runBenchmark(datasetDir) {
74
+ const files = collectSol(datasetDir);
75
+
76
+ // Per-category confusion counts
77
+ const cats = Object.keys(DASP_TO_OWASP).filter(c => !OUT_OF_SCOPE.includes(c));
78
+ const stats = {};
79
+ for (const c of cats) stats[c] = { tp: 0, fn: 0, fp: 0, support: 0 };
80
+
81
+ let cleanContracts = 0;
82
+ let cleanFalsePositives = 0;
83
+ const perContract = [];
84
+
85
+ for (const file of files) {
86
+ const source = fs.readFileSync(file, 'utf8');
87
+ const truth = groundTruthCategories(source, file).filter(c => !OUT_OF_SCOPE.includes(c));
88
+ const { detected } = detectedCategories(source);
89
+ const detInScope = detected.filter(c => !OUT_OF_SCOPE.includes(c));
90
+
91
+ // Clean contract (no ground-truth vulns): measure false positives
92
+ const isClean = truth.length === 0;
93
+ if (isClean) {
94
+ cleanContracts++;
95
+ if (detInScope.length > 0) cleanFalsePositives++;
96
+ }
97
+
98
+ for (const c of cats) {
99
+ const isTrue = truth.includes(c);
100
+ const isDet = detInScope.includes(c);
101
+ if (isTrue) stats[c].support++;
102
+ if (isTrue && isDet) stats[c].tp++;
103
+ else if (isTrue && !isDet) stats[c].fn++;
104
+ else if (!isTrue && isDet) stats[c].fp++;
105
+ }
106
+
107
+ perContract.push({
108
+ file: path.relative(datasetDir, file),
109
+ truth, detected: detInScope,
110
+ hit: truth.length > 0 && truth.every(c => detInScope.includes(c)),
111
+ });
112
+ }
113
+
114
+ // Compute metrics
115
+ const report = { categories: {}, overall: {}, clean: {} };
116
+ let totTP = 0, totFN = 0, totFP = 0, totSupport = 0;
117
+
118
+ for (const c of cats) {
119
+ const { tp, fn, fp, support } = stats[c];
120
+ const precision = (tp + fp) ? tp / (tp + fp) : null;
121
+ const recall = support ? tp / support : null;
122
+ const f1 = (precision && recall) ? (2 * precision * recall) / (precision + recall) : null;
123
+ report.categories[c] = {
124
+ label: DASP_LABELS[c], support, tp, fn, fp,
125
+ precision, recall, f1,
126
+ falseNegativeRate: support ? fn / support : null,
127
+ };
128
+ totTP += tp; totFN += fn; totFP += fp; totSupport += support;
129
+ }
130
+
131
+ const microPrecision = (totTP + totFP) ? totTP / (totTP + totFP) : null;
132
+ const microRecall = totSupport ? totTP / totSupport : null;
133
+ const microF1 = (microPrecision && microRecall)
134
+ ? (2 * microPrecision * microRecall) / (microPrecision + microRecall) : null;
135
+
136
+ report.overall = {
137
+ contractsTested: files.length,
138
+ totalVulnInstances: totSupport,
139
+ truePositives: totTP, falseNegatives: totFN, falsePositives: totFP,
140
+ microPrecision, microRecall, microF1,
141
+ falseNegativeRate: totSupport ? totFN / totSupport : null,
142
+ };
143
+ report.clean = {
144
+ cleanContracts,
145
+ cleanFalsePositives,
146
+ falsePositiveRate: cleanContracts ? cleanFalsePositives / cleanContracts : null,
147
+ };
148
+ report.perContract = perContract;
149
+ return report;
150
+ }
151
+
152
+ function collectSol(dir) {
153
+ const out = [];
154
+ if (!fs.existsSync(dir)) return out;
155
+ for (const entry of fs.readdirSync(dir)) {
156
+ const full = path.join(dir, entry);
157
+ const stat = fs.statSync(full);
158
+ if (stat.isDirectory()) out.push(...collectSol(full));
159
+ else if (entry.endsWith('.sol')) out.push(full);
160
+ }
161
+ return out;
162
+ }
163
+
164
+ export const FIXTURES_DIR = path.join(__dirname, 'fixtures');
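The marker parsing in `groundTruthCategories` can be exercised on its own. The following is a minimal sketch that reuses the runner's regular expression against an in-memory source string; the token table is trimmed, the directory-name fallback is omitted, and `markerCategories` and the sample contract are illustrative names only.

```javascript
// Token table trimmed from REPORT_TOKEN_TO_DASP in the runner above.
const TOKEN_TO_DASP = {
  REENTRANCY: 'reentrancy',
  UNCHECKED_LL_CALLS: 'unchecked_low_level_calls',
  TIME_MANIPULATION: 'time_manipulation',
};

// Collect the distinct DASP categories named by "// <yes> <report> X" markers.
function markerCategories(source) {
  const cats = new Set();
  for (const m of source.matchAll(/\/\/\s*<yes>\s*<report>\s*([A-Z_]+)/g)) {
    const dasp = TOKEN_TO_DASP[m[1]];
    if (dasp) cats.add(dasp); // unknown tokens are ignored, as in the runner
  }
  return [...cats];
}

// Illustrative source: a repeated marker, a marker without spaces, and an unknown token.
const sample = [
  'contract C {',
  '  // <yes> <report> REENTRANCY',
  '  function a() public {}',
  '  // <yes> <report> REENTRANCY',
  '  function b() public {}',
  '  //<yes><report> TIME_MANIPULATION',
  '  // <yes> <report> MADE_UP_TOKEN',
  '}',
].join('\n');

console.log(markerCategories(sample));
```

Because categories are collected into a `Set`, a contract with several markers for the same category contributes one positive, which matches the contract-level scoring described in the methodology notes.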
package/index.js ADDED
@@ -0,0 +1,41 @@
1
+ #!/usr/bin/env node
2
+ import { program } from 'commander';
3
+ import { auditCommand } from './src/commands/audit.js';
4
+ import { configCommand } from './src/commands/config.js';
5
+ import { benchmarkCommand } from './src/commands/benchmark.js';
6
+ import { banner } from './src/ui/banner.js';
7
+
8
+ await banner();
9
+
10
+ program
11
+ .name('aegis')
12
+ .description('AI-powered smart contract security auditor - OWASP SC Top 10 (2026), MITRE, NIST SSDF')
13
+ .version('2.1.0');
14
+
15
+ program
16
+ .command('audit <target>')
17
+ .description('Audit a contract by address, .sol file, or folder')
18
+ .option('-n, --network <network>', 'network (ethereum|base|arbitrum|polygon|optimism|bsc)', 'ethereum')
19
+ .option('-o, --output <path>', 'save markdown report (e.g. report.md)')
20
+ .option('--sarif <path>', 'write SARIF 2.1.0 for CI ingestion (e.g. results.sarif)')
21
+ .option('--sbom <path>', 'write CycloneDX SBOM (NIST SSDF PS.3)')
22
+ .option('--offline', 'static detectors only; source code never leaves your machine')
23
+ .option('--ci', 'CI mode: exit non-zero when findings breach --fail-on threshold')
24
+ .option('--fail-on <level>', 'CI fail threshold (critical|high|medium)', 'high')
25
+ .option('--json', 'machine-readable JSON output')
26
+ .action(auditCommand);
27
+
28
+ program
29
+ .command('config')
30
+ .description('Set encrypted API key and verify audit-log integrity')
31
+ .action(configCommand);
32
+
33
+ program
34
+ .command('benchmark')
35
+ .description('Measure detector accuracy against labeled vulnerable contracts')
36
+ .option('--dataset <dir>', 'custom dataset directory of labeled .sol files')
37
+ .option('--fetch-smartbugs', 'clone & use SmartBugs Curated (143 contracts)')
38
+ .option('-o, --output <path>', 'write full JSON benchmark report')
39
+ .action(benchmarkCommand);
40
+
41
+ program.parse();
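The `--ci` and `--fail-on` options above describe a severity gate, but the code behind `auditCommand` is not shown in this section. The sketch below is one way such a gate could work, under stated assumptions: the severity ordering, the `severity` field on findings, and the helper name `ciExitCode` are all hypothetical and may differ from the package.

```javascript
// Assumed severity ordering; the package's real ranking is not shown in this diff.
const SEVERITY_RANK = { low: 1, medium: 2, high: 3, critical: 4 };

// Hypothetical helper: returns 1 when any finding is at or above the threshold, else 0.
function ciExitCode(findings, failOn = 'high') {
  const threshold = SEVERITY_RANK[failOn];
  if (threshold === undefined) {
    throw new Error(`unknown --fail-on level: ${failOn}`);
  }
  // Findings with an unrecognized severity rank as 0 and never trip the gate.
  const breached = findings.some((f) => (SEVERITY_RANK[f.severity] ?? 0) >= threshold);
  return breached ? 1 : 0;
}

// Illustrative finding; only the `owasp` field is known from the benchmark runner.
const findings = [{ owasp: 'SC05:2026', severity: 'medium' }];

console.log(ciExitCode(findings, 'high'));   // medium is below the high threshold
console.log(ciExitCode(findings, 'medium')); // medium meets the medium threshold
```

In a CLI, the returned value would typically be assigned to `process.exitCode` so that buffered output is flushed before the process ends.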
package/package.json ADDED
@@ -0,0 +1,38 @@
1
+ {
2
+ "name": "aegis-audit",
3
+ "version": "2.1.0",
4
+ "description": "AI-powered smart contract security auditor — OWASP SC Top 10 (2026), MITRE ATT&CK, CWE, and NIST SSDF compliance outputs with a benchmarked detection engine",
5
+ "type": "module",
6
+ "main": "index.js",
7
+ "bin": { "aegis": "index.js" },
8
+ "scripts": {
9
+ "start": "node index.js",
10
+ "benchmark": "node index.js benchmark"
11
+ },
12
+ "keywords": [
13
+ "solidity", "smart-contract", "security", "audit", "auditor",
14
+ "blockchain", "ethereum", "web3", "defi", "cli",
15
+ "owasp", "mitre-attack", "cwe", "nist-ssdf", "sarif", "sbom",
16
+ "vulnerability", "static-analysis", "reentrancy"
17
+ ],
18
+ "author": "rsh1k",
19
+ "license": "MIT",
20
+ "homepage": "https://github.com/rsh1k/cd13282-blockchain-with-solidity-project#readme",
21
+ "repository": {
22
+ "type": "git",
23
+ "url": "git+https://github.com/rsh1k/cd13282-blockchain-with-solidity-project.git"
24
+ },
25
+ "bugs": {
26
+ "url": "https://github.com/rsh1k/cd13282-blockchain-with-solidity-project/issues"
27
+ },
28
+ "dependencies": {
29
+ "@inquirer/prompts": "^7.5.2",
30
+ "axios": "^1.16.1",
31
+ "boxen": "^8.0.1",
32
+ "chalk": "^5.6.2",
33
+ "commander": "^15.0.0",
34
+ "ora": "^9.4.0"
35
+ },
36
+ "engines": { "node": ">=18.0.0" },
37
+ "files": ["index.js", "src/", "benchmark/", "LICENSE", "README.md"]
38
+ }