cairn-security-agent-audit 0.1.0__tar.gz
- cairn_security_agent_audit-0.1.0/LICENSE +21 -0
- cairn_security_agent_audit-0.1.0/PKG-INFO +319 -0
- cairn_security_agent_audit-0.1.0/README.md +297 -0
- cairn_security_agent_audit-0.1.0/cairn_audit_agent_logs.py +693 -0
- cairn_security_agent_audit-0.1.0/cairn_demo.py +58 -0
- cairn_security_agent_audit-0.1.0/cairn_inspect_log_schema.py +140 -0
- cairn_security_agent_audit-0.1.0/cairn_pilot_from_raw_logs.py +138 -0
- cairn_security_agent_audit-0.1.0/cairn_security_agent_audit/__init__.py +3 -0
- cairn_security_agent_audit-0.1.0/cairn_security_agent_audit/samples/__init__.py +0 -0
- cairn_security_agent_audit-0.1.0/cairn_security_agent_audit/samples/pentest_trace_sample.jsonl +8 -0
- cairn_security_agent_audit-0.1.0/cairn_security_agent_audit.egg-info/PKG-INFO +319 -0
- cairn_security_agent_audit-0.1.0/cairn_security_agent_audit.egg-info/SOURCES.txt +18 -0
- cairn_security_agent_audit-0.1.0/cairn_security_agent_audit.egg-info/dependency_links.txt +1 -0
- cairn_security_agent_audit-0.1.0/cairn_security_agent_audit.egg-info/entry_points.txt +4 -0
- cairn_security_agent_audit-0.1.0/cairn_security_agent_audit.egg-info/top_level.txt +7 -0
- cairn_security_agent_audit-0.1.0/cairn_security_audit_ui.py +418 -0
- cairn_security_agent_audit-0.1.0/pilot/__init__.py +0 -0
- cairn_security_agent_audit-0.1.0/pilot/ingest_agent_trace.py +258 -0
- cairn_security_agent_audit-0.1.0/pyproject.toml +48 -0
- cairn_security_agent_audit-0.1.0/setup.cfg +4 -0
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 fraQtl

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@@ -0,0 +1,319 @@
Metadata-Version: 2.4
Name: cairn-security-agent-audit
Version: 0.1.0
Summary: Local/offline audit for AI security-agent traces: repeated tool-output work, stale replay risk, and token-savings reports.
Author: fraQtl
License-Expression: MIT
Project-URL: Homepage, https://github.com/fraqtl-ai/cairn-security-agent-audit
Project-URL: Repository, https://github.com/fraqtl-ai/cairn-security-agent-audit
Project-URL: Issues, https://github.com/fraqtl-ai/cairn-security-agent-audit/issues
Keywords: ai-agents,security,pentest,audit,caching,observability
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Quality Assurance
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: license-file

<p align="center">
  <img src="docs/logo.png" alt="fraQtl" width="150"/>
</p>

<h1 align="center">CAIRN Security Agent Audit</h1>

<p align="center">
  Local/offline audit for AI-pentest traces: repeated tool-output work, stale replay risk, protected-lane blocks, and token-savings reports.
</p>

<p align="center">
  <a href="docs/CACHE_CONTROL_FOR_SECURITY_AGENTS.md">Proof page</a> ·
  <a href="OPEN_CORE.md">Open-core</a> ·
  <a href="#quickstart">Quickstart</a> ·
  <a href="#try-your-own-logs">Try your logs</a> ·
  <a href="#have-different-logs">One redacted event</a> ·
  <a href="#what-the-report-shows">Report output</a> ·
  <a href="#product-direction">Product direction</a>
</p>

---

CAIRN audits traces from security agents that call scanners, shells, exploit
frameworks, HTTP clients, search tools, and file inspection commands.

It answers one practical question:

```text
Are AI-pentest agents repeatedly re-reading expensive tool outputs, and where
would exact replay be stale or unsafe because target/session state changed?
```

This repository is the free, open-source audit slice of CAIRN. It is local, CLI-first, and
audit-only. It does not run pentests, does not need live target access, and does
not auto-serve cached outputs. The commercial CAIRN Runtime is a separate protected sidecar for production reuse decisions.

CAIRN is cache-control for security agents, not generic caching:

```text
Agent trace
-> normalize tool events
-> compare protected target/session state
-> choose the safest action

same work + same protected state -> EXACT_CACHE
related work + changed/partial state -> DELTA_SERVE
uncertain or first-seen work -> LIVE_CALL
unsafe protected-state mismatch -> BLOCK_REUSE
```
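
The decision map above can be read as a small lookup. The sketch below is illustrative only: the function name `choose_action` and the state labels `"same"`, `"changed"`, and `"unsafe"` are assumptions made for this example, and the README does not describe how CAIRN actually tells a changed state from an unsafe mismatch.

```python
def choose_action(seen_before: bool, state: str) -> str:
    """Map one observation to an audit action (hypothetical sketch).

    `state` is an assumed label for the protected target/session comparison:
    "same", "changed", or "unsafe". Any other value is treated as uncertain.
    """
    if not seen_before:
        # First-seen work always goes live.
        return "LIVE_CALL"
    if state == "same":
        return "EXACT_CACHE"
    if state == "changed":
        return "DELTA_SERVE"
    if state == "unsafe":
        return "BLOCK_REUSE"
    # Uncertain comparison: fall back to the live call.
    return "LIVE_CALL"


if __name__ == "__main__":
    for seen, state in [(False, "same"), (True, "same"), (True, "changed"),
                        (True, "unsafe"), (True, "unknown")]:
        print(seen, state, "->", choose_action(seen, state))
```

Falling through to `LIVE_CALL` for any unrecognized label mirrors the "uncertain or first-seen work" row: when in doubt, the sketch never suggests reuse.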

> **Open-core scope:** this repo is the open audit slice of CAIRN, not the full
> runtime product. The protected runtime sidecar, production serving layer,
> enterprise dashboard/history, custom mappers, support, and commercial deployment
> are not included here. Use this repo to test whether a repeated-work signal
> exists in your AI-pentest traces.

## What It Does

Given JSON/JSONL security-agent logs, CAIRN:

- normalizes trace records into a common audit schema,
- groups repeated tool-output/context work,
- compares protected target/session state,
- marks exact-cache stale-risk events,
- classifies `LIVE_CALL`, `EXACT_CACHE`, `DELTA_SERVE`, and `BLOCK_REUSE`,
- estimates point-token and carried-context savings,
- writes terminal JSON plus `summary.json` and `summary.md`; `report.html` is optional with `--html`.
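
To make the "groups repeated tool-output/context work" step concrete, here is a minimal sketch of counting re-reads. It assumes events are dictionaries with `tool` and `command` keys (as in the sample row under Input Shape) and that two events are "the same work" when those two fields match after whitespace normalization. The helper names `work_key` and `count_rereads` are hypothetical, and CAIRN's real grouping may be more elaborate.

```python
from collections import Counter


def work_key(event: dict) -> tuple[str, str]:
    """Build a grouping key from tool name and whitespace-normalized command."""
    command = " ".join(str(event.get("command", "")).split())
    return (str(event.get("tool", "")), command)


def count_rereads(events: list[dict]) -> tuple[int, Counter]:
    """Count events whose key was already seen earlier in the trace."""
    seen: Counter = Counter()
    rereads = 0
    for event in events:
        key = work_key(event)
        if seen[key] > 0:
            rereads += 1
        seen[key] += 1
    return rereads, seen


if __name__ == "__main__":
    trace = [
        {"tool": "shell", "command": "nmap -sV 10.0.0.5"},
        {"tool": "shell", "command": "curl http://10.0.0.5/"},
        {"tool": "shell", "command": "nmap  -sV   10.0.0.5"},
    ]
    rereads, _ = count_rereads(trace)
    print(f"{rereads} re-read(s) out of {len(trace)} events")
```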

## Quickstart

Install from the repo:

```bash
git clone https://github.com/fraqtl-ai/cairn-security-agent-audit.git
cd cairn-security-agent-audit
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -e .
```

After PyPI release, this becomes:

```bash
pip install cairn-security-agent-audit
```

Run the included pentest sample:

```bash
cairn-demo
```

Terminal-only JSON:

```bash
cairn-demo --json-only | less
```

From a repo clone, `./demo.sh` also works without installing.

The sample prints a JSON summary in the terminal and writes JSON/Markdown receipts. For a larger public benchmark HTML example, open:

```text
examples/autopenbench/report.html
```

## Try Your Own Logs

If your trace is JSONL:

```bash
cairn-audit \
  --input your_trace.jsonl \
  --out report \
  --price-input-per-m 3.0 \
  --no-cleaned-trace
```

If your trace is one JSON file:

```bash
cairn-audit \
  --input your_trace.json \
  --out report \
  --price-input-per-m 3.0 \
  --no-cleaned-trace
```

If your traces are a directory of JSON logs:

```bash
cairn-audit \
  --input logs/ \
  --glob '*.json' \
  --out report \
  --price-input-per-m 3.0 \
  --no-cleaned-trace
```

Outputs:

```text
report/summary.json
report/summary.md
report/normalization_summary.json
```

## Input Shape

Preferred input is one JSON object per tool event:

```json
{"session_id":"run-1","step":1,"tool":"shell","command":"nmap -sV 10.0.0.5","output":"PORT 22 open ssh...","output_tokens":900,"before":{"fingerprint":"target-a"},"after":{"fingerprint":"target-a"}}
```
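
Before running an audit, it can help to confirm that a JSONL export parses into rows of this shape. The following sketch uses only the standard library and the sample row above; the helper names `read_events` and `state_unchanged` are made up for this example and are not part of the CAIRN CLI.

```python
import json
from typing import Iterable, Iterator

# The sample row from this README, one JSON object per line.
SAMPLE_ROW = (
    '{"session_id":"run-1","step":1,"tool":"shell",'
    '"command":"nmap -sV 10.0.0.5","output":"PORT 22 open ssh...",'
    '"output_tokens":900,"before":{"fingerprint":"target-a"},'
    '"after":{"fingerprint":"target-a"}}'
)


def read_events(lines: Iterable[str]) -> Iterator[dict]:
    """Parse JSONL text, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def state_unchanged(event: dict) -> bool:
    """True when both fingerprints are present and equal."""
    before = (event.get("before") or {}).get("fingerprint")
    after = (event.get("after") or {}).get("fingerprint")
    return before is not None and before == after


if __name__ == "__main__":
    for event in read_events([SAMPLE_ROW]):
        print(event["tool"], event["command"], state_unchanged(event))
```

To check a real export, pass an open file handle to `read_events` instead of the one-element list.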

Useful fields:

```text
session_id or run_id
step index or timestamp
tool/action name
command/action text
stdout/stderr/observation/output text
target/session/provenance hints if available
input/output token counts if available
```
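
If your export uses different key names for these fields, a small pre-mapping step can bring rows closer to the preferred shape. The sketch below is an assumption-heavy illustration: the alias table is invented from the field list above (for example, `action_text` and `input` are guesses at plausible key names), and it is not CAIRN's own normalizer.

```python
# Hypothetical alias table: canonical field -> candidate source keys, in priority order.
ALIASES = {
    "session_id": ("session_id", "run_id"),
    "step": ("step", "timestamp"),
    "tool": ("tool", "action"),
    "command": ("command", "action_text", "input"),
    "output": ("output", "observation", "stdout", "stderr"),
}


def normalize(raw: dict) -> dict:
    """Pick the first non-missing alias for each canonical field."""
    event = {}
    for field, candidates in ALIASES.items():
        event[field] = next(
            (raw[name] for name in candidates if raw.get(name) is not None),
            None,  # Leave the field empty rather than guessing a value.
        )
    return event


if __name__ == "__main__":
    row = {"run_id": "r1", "timestamp": "2026-01-01T00:00:00Z",
           "action": "shell", "command": "ls -la", "observation": "total 0"}
    print(normalize(row))
```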

If you do not know whether your export has the right fields, inspect the shape:

```bash
cairn-inspect \
  --input your_trace.jsonl \
  --out report/schema_inspection.json
```

You can share `report/schema_inspection.json` or one redacted example row
without sharing raw logs.

If output or observation text is missing, CAIRN can still show repeated-work and
stale-risk structure, but token-savings and delta-serving estimates will be
weaker. If target/session fingerprints are missing, CAIRN can still run with
conservative proxy fingerprints, but real fingerprints make the protected-lane
analysis stronger.
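
The README does not say how the proxy fingerprints are derived. As one possible illustration, a conservative proxy could hash the session id together with any IPv4 targets found in the command, so that events only share a fingerprint when both match. Everything in this sketch (the `proxy_fingerprint` name, the `proxy-` prefix, the choice of SHA-256 and IPv4 extraction) is an assumption for the example.

```python
import hashlib
import re

# Loose IPv4 matcher; good enough for a sketch, not a validator.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def proxy_fingerprint(event: dict) -> str:
    """Return the explicit fingerprint if present, else a derived proxy."""
    explicit = (event.get("before") or {}).get("fingerprint")
    if explicit:
        return explicit
    # Conservative: a different session or a different target set gives a different value.
    targets = sorted(set(IPV4.findall(str(event.get("command", "")))))
    basis = "|".join([str(event.get("session_id", "")), *targets])
    return "proxy-" + hashlib.sha256(basis.encode("utf-8")).hexdigest()[:16]


if __name__ == "__main__":
    print(proxy_fingerprint({"session_id": "run-1", "command": "nmap -sV 10.0.0.5"}))
```

A proxy like this can only be weaker than a real fingerprint, since it knows nothing about what changed on the target between two calls.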

## Have Different Logs?

If your logs do not map cleanly, do not prepare a full export first. Send or inspect one redacted event instead. The minimum useful shape is:

```text
session_id, timestamp or step, tool name, tool input/command, output/observation, target/session state if available
```

See [One Redacted Event](docs/ONE_REDACTED_EVENT.md) for exactly what to share and what to redact. One event is enough to adapt the mapper; the full audit can still run locally inside your environment.
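
As a rough starting point for preparing such an event, the sketch below masks IPv4 addresses in every string of a row while keeping its structure intact. It is not taken from `docs/ONE_REDACTED_EVENT.md`, which remains the reference for what to redact; the `redact` helper and the `<ip>` placeholder are assumptions, and real logs usually need more than IP masking (hostnames, credentials, paths).

```python
import json
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(value):
    """Recursively replace IPv4 addresses in strings; keep keys and shape."""
    if isinstance(value, str):
        return IPV4.sub("<ip>", value)
    if isinstance(value, dict):
        return {key: redact(item) for key, item in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value  # Numbers, booleans, and None pass through unchanged.


if __name__ == "__main__":
    event = {"session_id": "run-1", "step": 1, "tool": "shell",
             "command": "nmap -sV 10.0.0.5",
             "before": {"fingerprint": "target-a"}}
    print(json.dumps(redact(event)))
```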

## What The Report Shows

The report is designed to be readable by product and engineering teams:

| Area | What CAIRN reports |
|---|---|
| Repeated work | Events audited, re-reads, repeated-work percentage |
| Tool families | Top repeated commands/tools by carried-context savings |
| Safety | Protected-lane blocks and exact-cache stale-risk events |
| Opportunities | `EXACT_CACHE`, `DELTA_SERVE`, `LIVE_CALL`, `BLOCK_REUSE` |
| Savings | Point tokens avoided, carried-context tokens avoided, dollar estimate |
| Examples | Concrete commands/actions that created the signal |

Public reference result from AutoPenBench / genai-pentest-paper logs:

```text
2,764 tool events audited
1,031 re-reads
37.30% repeated work
548,335 point tokens avoided
3,698,589 carried-context tokens avoided
1,016 protected-lane blocks
0 stale serves
0 false hits
```
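
The repeated-work percentage in that result is simply re-reads divided by events audited, which a few lines of Python can confirm. The dollar conversion in the same sketch is an assumption: it treats `--price-input-per-m` as dollars per million input tokens and multiplies it by the tokens avoided, which may not be exactly how CAIRN computes its estimate.

```python
def repeated_work_pct(rereads: int, events_audited: int) -> float:
    """Share of audited events that were re-reads, as a percentage."""
    return 100.0 * rereads / events_audited


def estimate_dollars(tokens_avoided: int, price_per_million: float) -> float:
    """Assumed conversion: tokens avoided times price per million tokens."""
    return tokens_avoided * price_per_million / 1_000_000


if __name__ == "__main__":
    print(f"{repeated_work_pct(1_031, 2_764):.2f}% repeated work")
    print(f"${estimate_dollars(548_335, 3.0):.2f} at point-token level")
    print(f"${estimate_dollars(3_698_589, 3.0):.2f} at carried-context level")
```

At the example price of 3.0 per million tokens used elsewhere in this README, the figures above work out to roughly $1.65 for point tokens and about $11.10 for carried context.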

Top repeated families in that public run included:

```text
nmap, curl, ssh, msfconsole, searchsploit, find, cat
```

## How To Read The Actions

`LIVE_CALL`

```text
First time seeing this work, or not enough evidence to reuse safely.
```

`EXACT_CACHE`

```text
Same work repeated and protected state still matches. Exact reuse would be safe.
```

`DELTA_SERVE`

```text
Related output repeated, but exact replay is not the right safety choice.
Prior output can still shrink context/reporting burden while staying live-aware.
```

`BLOCK_REUSE`

```text
Repeated work exists, but reuse should be blocked.
```

## Open-Core Model

This repository is MIT-licensed and contains the local audit slice: CLI, schema inspector, sample traces, and JSON/Markdown report generation. HTML output is available, but the main product path is terminal-first.

The paid/commercial layer is CAIRN Runtime: a protected sidecar for production reuse decisions, custom trace mappers, dashboard/history, deployment support, and enterprise licensing.

See [Open-Core Model](OPEN_CORE.md).

## Product Direction

This repository is the audit slice, not the full CAIRN runtime product.

Design-partner path:

1. **Offline audit**: run CAIRN on existing AI-pentest traces and measure the
   repeated-work signal.
2. **Local dashboard**: review repeated tool families, stale-risk examples, and
   savings over time without raw logs leaving the customer environment.
3. **Protected runtime sidecar**: integrate around one high-volume tool family.
   CAIRN observes tool calls plus target/session state, exact-caches only inside
   safe provenance cells, delta-serves when appropriate, and falls back live when
   state changed or is uncertain.

The goal is not to replay pentest results blindly. The goal is to reduce
repeated context/tool-output cost while refusing stale replay across protected
target/session changes.

## Boundaries

CAIRN Security Agent Audit is:

- local,
- audit-only,
- trace/report oriented,
- designed to avoid stale replay.

It is not a vulnerability scanner, pentest runner, live target automation, or
production serving layer.

## License

MIT License. See [LICENSE](LICENSE).

Commercial CAIRN Runtime, private integrations, managed deployments, support, and enterprise licensing are separate from this open audit package. See [Open-Core Model](OPEN_CORE.md).