pentesting 0.90.10 → 0.92.5
- package/ARCHITECTURE.md +409 -0
- package/LICENSE +28 -0
- package/README.md +24 -274
- package/bin/pentesting.mjs +0 -0
- package/lib/runtime.mjs +1 -5
- package/package.json +28 -37
- package/pentesting-logo.svg +13 -0
- package/scripts/postinstall.mjs +0 -0
package/ARCHITECTURE.md
ADDED
|
@@ -0,0 +1,409 @@
|
|
|
1
|
+
# Architecture
|
|
2
|
+
|
|
3
|
+
This document reflects the current `main` branch architecture. It is the single,
|
|
4
|
+
detailed reference for how the runtime is structured; the [README](README.md)
|
|
5
|
+
stays intentionally short and usage-focused.
|
|
6
|
+
|
|
7
|
+
> **Engine vs product.** The orchestration engine is **`builder`** — a
|
|
8
|
+
> general-purpose, local-first Rust agent runtime. **`pentesting`** is the
|
|
9
|
+
> published, security-focused distribution of that same engine: the npm package
|
|
10
|
+
> and the command you run. They share one binary; `pentesting` is `builder` with
|
|
11
|
+
> a security skill set and banner. The model proposes actions, while the Rust
|
|
12
|
+
> runtime owns routing, tool policy, evidence capture, verification, and
|
|
13
|
+
> completion adjudication.
|
|
14
|
+
|
|
15
|
+
## Design Pillars
|
|
16
|
+
|
|
17
|
+
- **Runtime-Owned Adjudication** — Model output is a *candidate* until a runtime
|
|
18
|
+
acceptance-gate lattice verifies and closes the task (outcome: complete /
|
|
19
|
+
replan / blocked).
|
|
20
|
+
- **Local-First State** — Runtime state and a markdown knowledge graph live on
|
|
21
|
+
disk. No external storage service.
|
|
22
|
+
- **Weak-Model Hardening** — Rust-enforced classifiers, validators, and
|
|
23
|
+
self-reviews guard against hallucination.
|
|
24
|
+
- **Lineage & Horizons** — Run lineage, objective IDs, and horizon-aware memory
|
|
25
|
+
reuse, all in local state.
|
|
26
|
+
- **Policy-Gated Tools** — `Allow/Deny/Confirm` gates shell, file, and network
|
|
27
|
+
access (policy-based today; OS sandboxing is roadmap).
|
|
28
|
+
- **Intelligent Queuing** — Queued inputs are deduplicated and prioritized live;
|
|
29
|
+
obsolete tasks are preempted, not FIFO-replayed.
|
|
30
|
+
- **Lab Session Multiplexing** — `shell-listener` runs multiple authorized TCP
|
|
31
|
+
sessions with per-session routing and PTY upgrades.
|
|
32
|
+
- **Dynamic Agent Profiles** — No fixed persona: the runtime *derives* a profile
|
|
33
|
+
per request from the prompt and can overlay a named autonomy profile, driving
|
|
34
|
+
tool scope, phase, and memory weighting.
|
|
35
|
+
- **Ebbinghaus-Inspired Memory** — Memories carry a *strength* that fades like
|
|
36
|
+
human memory and reinforces on recall. Faded notes are de-referenced, never
|
|
37
|
+
destroyed.
|
|
38
|
+
- **Git-Backed Rewind** — Working-tree checkpoint/restore keeps autonomous edits
|
|
39
|
+
reversible.
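
The `Allow/Deny/Confirm` gating named in the pillars above can be pictured roughly as follows. This is a minimal sketch, not the runtime's Rust implementation: the rule table, the `evaluate` and `permitted` helpers, and the deny-by-default fallback are all hypothetical names and choices made for illustration.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    CONFIRM = "confirm"


# Hypothetical rules, checked in order; the first matching rule decides.
RULES = [
    ("shell", lambda args: "rm -rf" in args.get("command", ""), Decision.DENY),
    ("shell", lambda args: True, Decision.CONFIRM),
    ("read", lambda args: True, Decision.ALLOW),
]


def evaluate(tool, args, rules=RULES, default=Decision.DENY):
    """Return the policy decision for one proposed tool call."""
    for rule_tool, matches, decision in rules:
        if rule_tool == tool and matches(args):
            return decision
    return default  # tools without a rule fall back to the default (assumed: deny)


def permitted(tool, args, ask_user):
    """True if the call may run; CONFIRM defers to the ask_user callback."""
    decision = evaluate(tool, args)
    if decision is Decision.CONFIRM:
        return bool(ask_user(tool, args))
    return decision is Decision.ALLOW
```

The point of the sketch is that the model only proposes a call; whether it runs is decided outside the model, by a table the runtime owns.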
|
|
40
|
+
|
|
41
|
+
## Runtime Flow
|
|
42
|
+
|
|
43
|
+
```mermaid
|
|
44
|
+
flowchart TD
|
|
45
|
+
U[User request] --> CLI[builder_main CLI / TUI]
|
|
46
|
+
CLI --> API[builder_api facade]
|
|
47
|
+
API --> APP[builder_app coordination]
|
|
48
|
+
APP --> CLS[Intent classifier]
|
|
49
|
+
CLS --> STR[Dynamic profile + execution strategy]
|
|
50
|
+
STR --> AG[Active agent]
|
|
51
|
+
AG --> DEC{Delegate?}
|
|
52
|
+
DEC -- no --> TOOLS[Policy-gated tools]
|
|
53
|
+
DEC -- yes --> SUB[Subagent task packet]
|
|
54
|
+
SUB --> TOOLS
|
|
55
|
+
TOOLS --> EV[Evidence + state cards]
|
|
56
|
+
EV --> VER[Verification gates]
|
|
57
|
+
VER --> DONE[Completion adjudication]
|
|
58
|
+
DONE --> STORE[Local runtime state]
|
|
59
|
+
STORE --> CTX[Next-turn context]
|
|
60
|
+
CTX --> APP
|
|
61
|
+
```
|
|
62
|
+
|
|
63
|
+
Plain view:
|
|
64
|
+
|
|
65
|
+
```text
|
|
66
|
+
user input
|
|
67
|
+
-> CLI / interactive prompt loop
|
|
68
|
+
-> BuilderAPI facade
|
|
69
|
+
-> BuilderApp classification and strategy
|
|
70
|
+
-> active agent or delegated subagent
|
|
71
|
+
-> policy-gated tool execution
|
|
72
|
+
-> evidence, workflow state, and verification records
|
|
73
|
+
-> completion adjudication
|
|
74
|
+
-> local state and next-turn context
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
Builder is a **domain-neutral** runtime with a security focus. Development,
|
|
78
|
+
pentesting, CTF, audit, and release work layer in via skills — never baked into
|
|
79
|
+
the core.
|
|
80
|
+
|
|
81
|
+
## Dynamic Agent Profiles
|
|
82
|
+
|
|
83
|
+
There is no fixed persona. The runtime derives a profile per request, optionally
|
|
84
|
+
overlays a named autonomy profile, then reuses tool scope, phase, and memory
|
|
85
|
+
weighting.
|
|
86
|
+
|
|
87
|
+
```mermaid
|
|
88
|
+
flowchart LR
|
|
89
|
+
R[Request] --> GEN["Derive<br/>dynamic profile"]
|
|
90
|
+
GEN --> REC["Overlay<br/>named autonomy profile"]
|
|
91
|
+
REC --> USE["Reuse<br/>tool scope · phase · memory weight"]
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
Request to closure:
|
|
95
|
+
|
|
96
|
+
```mermaid
|
|
97
|
+
flowchart TD
|
|
98
|
+
U[User request] --> C[Intent classifier]
|
|
99
|
+
C --> P[Dynamic profile<br/>task shape + tool scope + rigor]
|
|
100
|
+
P --> R[Runtime router]
|
|
101
|
+
R --> A[Active agent]
|
|
102
|
+
A --> T{Need delegation?}
|
|
103
|
+
T -- no --> O[Tool execution]
|
|
104
|
+
T -- yes --> D[Agent tool]
|
|
105
|
+
D --> CO[coordinator]
|
|
106
|
+
D --> I[investigator]
|
|
107
|
+
D --> OP[operator]
|
|
108
|
+
D --> RV[reviewer]
|
|
109
|
+
D --> V[verifier]
|
|
110
|
+
D --> W[report-writer]
|
|
111
|
+
CO --> O
|
|
112
|
+
I --> O
|
|
113
|
+
OP --> O
|
|
114
|
+
RV --> O
|
|
115
|
+
V --> O
|
|
116
|
+
W --> O
|
|
117
|
+
O --> G[Completion gates]
|
|
118
|
+
G --> M[Memory + artifacts]
|
|
119
|
+
M --> U
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
## Agent Team
|
|
123
|
+
|
|
124
|
+
The core agent team is domain-neutral. Each agent ships as a markdown persona
|
|
125
|
+
definition under `crates/builder_repo/src/agents/`.
|
|
126
|
+
|
|
127
|
+
| Agent | Default role | Runtime profile |
|
|
128
|
+
| --- | --- | --- |
|
|
129
|
+
| `builder` | Hands-on implementation, refactoring, local file changes, tests | Implement + broad local write |
|
|
130
|
+
| `planner` | Implementation plans and risk breakdowns | Plan + read-only |
|
|
131
|
+
| `researcher` | Read-only codebase and reference research | Investigate + read-only |
|
|
132
|
+
| `coordinator` | Splits broad work into owned packets and consolidates results | Coordinate + broad local write |
|
|
133
|
+
| `investigator` | Evidence gathering across code, logs, commands, APIs, and behavior | Investigate + shell diagnostics |
|
|
134
|
+
| `operator` | Builds, tests, packaging, service startup, and runtime workflows | Implement + broad local write |
|
|
135
|
+
| `reviewer` | Findings-first technical review | Review + strict read-only |
|
|
136
|
+
| `verifier` | Reproduction, build/test proof, and completion claim checks | Verify + strict read-only |
|
|
137
|
+
| `report-writer` | Reports, handoffs, release notes, and reproducibility records | Implement + bounded write |
|
|
138
|
+
|
|
139
|
+
> Distinct tool-scope is runtime-enforced for `builder`, `planner`, and
|
|
140
|
+
> `researcher`; the remaining personas share a task-shaped generic profile
|
|
141
|
+
> (read-only when the task shape is review/verify, broad local write otherwise)
|
|
142
|
+
> and differ by their prompt instructions.
|
|
143
|
+
|
|
144
|
+
## Named Autonomy Profiles
|
|
145
|
+
|
|
146
|
+
Optionally pin the top-level profile (`autonomy_profile = "ctf-competition"`);
|
|
147
|
+
leave unset for classifier-driven defaults. Delegated subagent contracts stay
|
|
148
|
+
intact. Defined in `crates/builder_domain/src/autonomy_profile.rs`.
|
|
149
|
+
|
|
150
|
+
| Profile | Purpose |
|
|
151
|
+
| --- | --- |
|
|
152
|
+
| `general-agent` | Broad autonomous local orchestration for mixed tasks |
|
|
153
|
+
| `local-builder` | Hands-on implementation with fresh evidence retrieval |
|
|
154
|
+
| `ctf-competition` | Competition/lab workflow backed by `ctf-competition` and `pentesting-methodology` skills |
|
|
155
|
+
| `enterprise-review` | Strict, review-heavy profile with read-only dynamic scope |
|
|
156
|
+
|
|
157
|
+
## Engagement Metadata
|
|
158
|
+
|
|
159
|
+
Runs can attach typed engagement context — scope, phase, tags, and standard refs
|
|
160
|
+
(PTES, MITRE ATT&CK, OWASP, CWE/CAPEC, NIST CSF, CIS Controls) — to workflow
|
|
161
|
+
metadata. `/workflow report` exports it as a Markdown handoff.
|
|
162
|
+
|
|
163
|
+
## Memory & Knowledge
|
|
164
|
+
|
|
165
|
+
Memory is local-first and split across two complementary layers, with no vector DB and no cloud:
|
|
166
|
+
|
|
167
|
+
- **Conversation memories** — facts/preferences/gotchas distilled from sessions,
|
|
168
|
+
persisted in local runtime state and recalled into each prompt by hybrid
|
|
169
|
+
retrieval. These carry a *strength* that fades like human memory, so the store
|
|
170
|
+
stays sharp instead of rotting.
|
|
171
|
+
- **Knowledge vault** — an optional Obsidian-style set of markdown notes under
|
|
172
|
+
`.builder/knowledge` (wiki-links, backlinks, tags). The agent reads it on every
|
|
173
|
+
turn and grows it with the `write` tool; it is never auto-deleted.
|
|
174
|
+
|
|
175
|
+
Both layers feed the same strength-weighted hybrid retrieval.
|
|
176
|
+
|
|
177
|
+
**Storage** — Ebbinghaus lifecycle (decay · reinforce · floor, never deleted):
|
|
178
|
+
|
|
179
|
+
```mermaid
|
|
180
|
+
flowchart TD
|
|
181
|
+
N[New memory] --> S["strength = quality × recall × e^(-λ·age)"]
|
|
182
|
+
S --> U{recalled?}
|
|
183
|
+
U -- yes --> RE[reinforce ↑ · reset age]
|
|
184
|
+
U -- no --> D[decay over time]
|
|
185
|
+
RE --> S
|
|
186
|
+
D --> F{below floor?}
|
|
187
|
+
F -- no --> S
|
|
188
|
+
F -- yes --> AR[de-reference: archive / tombstone]
|
|
189
|
+
AR -. recoverable on disk .-> N
|
|
190
|
+
```
|
|
191
|
+
|
|
192
|
+
Kinds fade at different speeds (procedural outlives episodic); bi-temporal
|
|
193
|
+
`event_time` vs `ingestion_time` lets newer facts supersede stale ones.
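
The strength formula in the diagram above can be worked through with a small sketch. The decay rates, the floor, and the reinforcement boost below are illustrative assumptions, and the function names are hypothetical; only the shape `quality × recall × e^(-λ·age)` and the "procedural outlives episodic" ordering come from this document.

```python
import math

# Hypothetical decay rates per day; procedural memories fade slower than episodic ones.
DECAY_PER_DAY = {"procedural": 0.01, "episodic": 0.10}
FLOOR = 0.05  # hypothetical threshold below which a memory is de-referenced


def strength(quality, recall, kind, age_days):
    """strength = quality * recall * e^(-lambda * age)."""
    return quality * recall * math.exp(-DECAY_PER_DAY[kind] * age_days)


def reinforce(recall, boost=0.2):
    """Explicit reinforcement on recall: raise the recall factor and reset age to 0."""
    return recall + boost, 0.0


def is_dereferenced(quality, recall, kind, age_days):
    """Below the floor a memory is archived or tombstoned, not deleted."""
    return strength(quality, recall, kind, age_days) < FLOOR
```

With these assumed constants, an episodic memory of quality 0.8 falls below the floor after 30 days without recall, while a procedural memory of the same quality stays well above it.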
|
|
194
|
+
|
|
195
|
+
**Retrieval** — hybrid fuse, strength-weighted, read-only:
|
|
196
|
+
|
|
197
|
+
```mermaid
|
|
198
|
+
flowchart LR
|
|
199
|
+
Q[Query] --> L[Lexical]
|
|
200
|
+
Q --> SE[Semantic]
|
|
201
|
+
Q --> G[Graph]
|
|
202
|
+
L --> RRF[RRF fuse]
|
|
203
|
+
SE --> RRF
|
|
204
|
+
G --> RRF
|
|
205
|
+
RRF --> RR[rerank: phase · recency · task]
|
|
206
|
+
RR --> W[weight by strength]
|
|
207
|
+
W --> P[Prompt context]
|
|
208
|
+
```
|
|
209
|
+
|
|
210
|
+
Faded, private, or unsafe memories are held back from the prompt. Lookups never
|
|
211
|
+
write — reinforce/archive/supersede are explicit, never search side effects.
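
The retrieval pipeline above can be sketched as a pure function. This is an illustration under stated assumptions: `k=60` is the conventional reciprocal rank fusion constant rather than a value taken from Builder, the contextual rerank step (phase, recency, task) is omitted, and `rrf_fuse`, `retrieve`, and the `floor` parameter are hypothetical names.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


def retrieve(lexical, semantic, graph, strength, floor=0.05):
    """Fuse three ranked lists, weight by memory strength, hold back faded items.

    Read-only: neither the rankings nor the strength table are modified.
    """
    fused = rrf_fuse([lexical, semantic, graph])
    weighted = {
        doc_id: score * strength.get(doc_id, 0.0)
        for doc_id, score in fused.items()
        if strength.get(doc_id, 0.0) >= floor  # faded memories stay out of the prompt
    }
    return sorted(weighted, key=weighted.get, reverse=True)
```

Because the lookup returns a new list and never touches the strength table, reinforcement has to be a separate, explicit call, which matches the "lookups never write" rule.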
|
|
212
|
+
|
|
213
|
+
## Storage Strategy
|
|
214
|
+
|
|
215
|
+
Builder uses one operational storage strategy: local on-disk runtime state plus
|
|
216
|
+
markdown-native knowledge artifacts.
|
|
217
|
+
|
|
218
|
+
```mermaid
|
|
219
|
+
flowchart LR
|
|
220
|
+
W[Workspace files] --> AD[Source adapters]
|
|
221
|
+
S[Skills] --> AD
|
|
222
|
+
M[Conversation memories] --> AD
|
|
223
|
+
N[Markdown notes] --> AD
|
|
224
|
+
AD --> IDX[Local lexical + semantic + graph index]
|
|
225
|
+
IDX --> RR[Contextual reranker]
|
|
226
|
+
RR --> PC[Prompt context]
|
|
227
|
+
PC --> RUN[Agent run]
|
|
228
|
+
RUN --> LS[.builder local state]
|
|
229
|
+
RUN --> ART[Local artifacts and reports]
|
|
230
|
+
```
|
|
231
|
+
|
|
232
|
+
Key invariants:
|
|
233
|
+
|
|
234
|
+
- Runtime state is local-first and does not require an external data service.
|
|
235
|
+
- The markdown knowledge graph is derived from local facts: notes, skills,
|
|
236
|
+
conversation memory, wiki-links, backlinks, and run artifacts.
|
|
237
|
+
- Graph data is a derived retrieval view, not a second source of truth.
|
|
238
|
+
- Workflow runs, evidence, state cards, memories, and large-output artifacts
|
|
239
|
+
persist through local repositories.
|
|
240
|
+
- Domain methods such as development, CTF practice, pentesting methodology,
|
|
241
|
+
audits, and release work live in skills and task instructions; they do not
|
|
242
|
+
change the core runtime identity.
|
|
243
|
+
|
|
244
|
+
## Crate Map
|
|
245
|
+
|
|
246
|
+
| Layer | Crates | Responsibility |
|
|
247
|
+
| --- | --- | --- |
|
|
248
|
+
| Interaction | `builder_main`, `builder_ratatui`, `builder_select`, `builder_markdown_stream` | CLI, TUI, prompt flow, rendering, selector fallback |
|
|
249
|
+
| Facade | `builder_api` | Thin application boundary used by entry points |
|
|
250
|
+
| Coordination | `builder_app` | classification, routing, execution strategy, orchestration, completion gates |
|
|
251
|
+
| Contracts | `builder_domain` | typed IDs, tools, workflow records, engagement metadata, policies |
|
|
252
|
+
| Capabilities | `builder_services` | tool services, auth, provider support, file/search/shell/fetch operations |
|
|
253
|
+
| Persistence | `builder_repo`, `builder_provider_repo` | local repositories, provider catalog, chat repository |
|
|
254
|
+
| Knowledge | `builder_knowledge`, `builder_workspace_index`, `builder_embed` | markdown note adapters, graph parsing, hybrid retrieval, local semantic scoring |
|
|
255
|
+
| Infrastructure | `builder_config`, `builder_infra`, `builder_fs`, `builder_walker` | config, environment, filesystem, process/runtime helpers |
|
|
256
|
+
| Support | `builder_display`, `builder_stream`, `builder_template`, `builder_json_repair`, `builder_snaps`, `builder_brand`, `builder_tracker`, `builder_test_kit`, `builder_tool_macros` | formatting, streaming, templates, JSON repair, snapshots, branding/theming, version tracking, tests, tool macros |
|
|
257
|
+
|
|
258
|
+
## Tool Surface
|
|
259
|
+
|
|
260
|
+
The runtime tool catalog covers file, shell, network-fetch, planning, skill, todo,
|
|
261
|
+
and delegation operations. Core tools:
|
|
262
|
+
|
|
263
|
+
- `read`, `write`, `patch`, `multi_patch`, `undo`, `remove`
|
|
264
|
+
- `fs_search`, `sem_search`
|
|
265
|
+
- `shell`, `fetch`
|
|
266
|
+
- `plan`, `skill`, `todo_read`, `todo_write`
|
|
267
|
+
- `task` and dynamically registered agent tools
|
|
268
|
+
|
|
269
|
+
The full catalog (`builder_domain/src/tools/catalog.rs`) also includes web
|
|
270
|
+
search, memory, verification primitives (flag check, best-of-N verify, finding),
|
|
271
|
+
session/process control, and compaction tools. Direct raw storage-query tooling
|
|
272
|
+
is not part of the active runtime surface.
|
|
273
|
+
|
|
274
|
+
## Verification And Completion
|
|
275
|
+
|
|
276
|
+
Completion is runtime-owned:
|
|
277
|
+
|
|
278
|
+
```text
|
|
279
|
+
candidate answer
|
|
280
|
+
-> evidence check
|
|
281
|
+
-> verification record
|
|
282
|
+
-> acceptance-gate check
|
|
283
|
+
-> adjudication
|
|
284
|
+
-> complete / replan / blocked
|
|
285
|
+
```
|
|
286
|
+
|
|
287
|
+
The key rule is that a model response is not considered done until the runtime
|
|
288
|
+
has enough recorded evidence to close the task.
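
One way to picture the adjudication step is the sketch below. The three outcomes come from this document; the candidate record layout, the `adjudicate` function, and the exact rule for choosing between the outcomes are assumptions made for illustration.

```python
from enum import Enum


class Outcome(Enum):
    COMPLETE = "complete"
    REPLAN = "replan"
    BLOCKED = "blocked"


def adjudicate(candidate):
    """Decide whether a candidate answer closes the task.

    candidate is a hypothetical record with:
      evidence: list of recorded evidence items
      verified: True once a verification record exists
      gates:    dict of acceptance-gate name -> passed?
      blocker:  description of an external blocker, or None
    """
    if candidate.get("blocker"):
        return Outcome.BLOCKED
    has_evidence = bool(candidate.get("evidence"))
    verified = bool(candidate.get("verified"))
    gates_ok = all(candidate.get("gates", {}).values())
    if has_evidence and verified and gates_ok:
        return Outcome.COMPLETE
    return Outcome.REPLAN  # not enough recorded proof yet; try again
```

A model response with no recorded evidence therefore never yields `COMPLETE`, however confident its wording.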
|
|
289
|
+
|
|
290
|
+
## Distribution — single source, two surfaces
|
|
291
|
+
|
|
292
|
+
Pentesting and Builder share the **same Rust runtime binary**. The `pentesting`
|
|
293
|
+
npm package is a thin distribution facade:
|
|
294
|
+
|
|
295
|
+
```text
|
|
296
|
+
npm install -g pentesting
|
|
297
|
+
│
|
|
298
|
+
▼
|
|
299
|
+
pentesting CLI (Node.js shim)
|
|
300
|
+
│ resolves or downloads the matching Builder release asset
|
|
301
|
+
▼
|
|
302
|
+
Builder binary (Rust) ← single runtime engine
|
|
303
|
+
│ PENTESTING_PRODUCT_NAME=pentesting
|
|
304
|
+
▼
|
|
305
|
+
Interactive TUI with "pentesting" banner
|
|
306
|
+
```
|
|
307
|
+
|
|
308
|
+
- The npm package installs a launcher, **not** a second agent runtime.
|
|
309
|
+
- It resolves or downloads the correct release asset from
|
|
310
|
+
`agnusdei1207/pentesting-public`.
|
|
311
|
+
- It forwards arguments directly into the Rust binary — no command translation
|
|
312
|
+
or compatibility shims.
|
|
313
|
+
- If a change would add orchestration, memory, or prompt logic into the npm
|
|
314
|
+
layer, that change belongs upstream in the Rust runtime.
|
|
315
|
+
|
|
316
|
+
## Supported Runtime Targets
|
|
317
|
+
|
|
318
|
+
The npm launcher currently resolves managed native release assets for the
|
|
319
|
+
targets below. Other operating systems can run the Docker image or use
|
|
320
|
+
`PENTESTING_BIN` to point at a locally built compatible binary.
|
|
321
|
+
|
|
322
|
+
| OS | CPU | Release asset |
|
|
323
|
+
| --- | --- | --- |
|
|
324
|
+
| Linux | x64 | `pentesting-x86_64-unknown-linux-musl` |
|
|
325
|
+
| Android | arm64 | `pentesting-aarch64-linux-android` |
|
|
326
|
+
|
|
327
|
+
`/update` (and the npm postinstall) resolves the matching asset for the current
|
|
328
|
+
supported OS/CPU target from `agnusdei1207/pentesting-public`.
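
The target lookup can be sketched as a small table. The asset names, the default repository, and the `PENTESTING_BIN` and `PENTESTING_REPO` variables come from this document; the `resolve_asset` function, its return shape, and the lowercase OS/CPU keys are hypothetical, and the real launcher is a Node.js shim rather than Python.

```python
# Supported targets, taken from the table above.
ASSETS = {
    ("linux", "x64"): "pentesting-x86_64-unknown-linux-musl",
    ("android", "arm64"): "pentesting-aarch64-linux-android",
}
DEFAULT_REPO = "agnusdei1207/pentesting-public"


def resolve_asset(os_name, cpu, env=None):
    """Pick a local binary override or the managed release asset for a target."""
    env = env or {}
    override = env.get("PENTESTING_BIN")
    if override:
        # A locally provided binary takes the place of the managed download.
        return {"binary": override}
    asset = ASSETS.get((os_name, cpu))
    if asset is None:
        raise LookupError(f"no managed release asset for {os_name}/{cpu}")
    return {"repo": env.get("PENTESTING_REPO", DEFAULT_REPO), "asset": asset}
```

On an unsupported target the sketch raises instead of guessing, which is where the Docker image or `PENTESTING_BIN` comes in.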
|
|
329
|
+
|
|
330
|
+
## Configuration
|
|
331
|
+
|
|
332
|
+
Pentesting reads a user-global config from `~/.pentesting/.pentesting.toml`,
|
|
333
|
+
overridden by project-local `.pentesting.toml` files discovered by walking up
|
|
334
|
+
from the working directory. The storage config is intentionally small:
|
|
335
|
+
|
|
336
|
+
```toml
|
|
337
|
+
[storage]
|
|
338
|
+
backend = "local" # Local md/fs runtime state
|
|
339
|
+
```
|
|
340
|
+
|
|
341
|
+
The same shape is represented in `builder.schema.json`, `.pentesting.toml`, and
|
|
342
|
+
the interactive `/config` UI.
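
The config discovery described at the top of this section (user-global file first, then project-local files found by walking up from the working directory) might look like the sketch below. The `discover_configs` function is hypothetical, and the assumption that the file nearest to the working directory is loaded last, and so wins, is not confirmed by this document.

```python
from pathlib import Path


def discover_configs(start, home):
    """Return config paths in load order: global first, nearest project file last."""
    found = []
    global_cfg = Path(home) / ".pentesting" / ".pentesting.toml"
    if global_cfg.is_file():
        found.append(global_cfg)
    start = Path(start).resolve()
    # Candidates from the working directory up to the filesystem root.
    project = [d / ".pentesting.toml" for d in (start, *start.parents)]
    # Reverse so outer directories load first and inner ones override them.
    found.extend(p for p in reversed(project) if p.is_file())
    return found
```

Later entries in the returned list would override earlier ones when the files are merged.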
|
|
343
|
+
|
|
344
|
+
Environment variables use the `PENTESTING_` prefix, with `__` for nested keys —
|
|
345
|
+
e.g. `PENTESTING_SESSION__MODEL_ID=...`. Set `PENTESTING_CONFIG` to override the
|
|
346
|
+
global config directory.
|
|
347
|
+
|
|
348
|
+
| Variable | Description |
|
|
349
|
+
| --- | --- |
|
|
350
|
+
| `PENTESTING_BIN` | Use an already-installed Builder binary instead of the managed download. |
|
|
351
|
+
| `PENTESTING_PRODUCT_NAME` | Runtime banner label. The `pentesting` launcher sets this to `pentesting` automatically. |
|
|
352
|
+
| `PENTESTING_REPO` | Override the public release repo used for binary downloads. Defaults to `agnusdei1207/pentesting-public`. |
|
|
353
|
+
| `PENTESTING_SKIP_DOWNLOAD` | Skip the postinstall binary download. Useful in CI or when `PENTESTING_BIN` will be provided later. |
|
|
354
|
+
| `PENTESTING_CONFIG` | Override the global config directory. |
|
|
355
|
+
|
|
356
|
+
### Backward compatibility — `builder` ↔ `pentesting`
|
|
357
|
+
|
|
358
|
+
The runtime engine is still `builder` under the hood, so legacy names keep
|
|
359
|
+
working. Use whichever you like; the `PENTESTING_*` form wins when both are set.
|
|
360
|
+
|
|
361
|
+
| Surface | Canonical | Legacy (still accepted) |
|
|
362
|
+
| :--- | :--- | :--- |
|
|
363
|
+
| Env vars | `PENTESTING_*` | `BUILDER_*` |
|
|
364
|
+
| Config-dir override | `PENTESTING_CONFIG` | `BUILDER_CONFIG` |
|
|
365
|
+
| Global config file | `~/.pentesting/.pentesting.toml` | `~/.builder/.builder.toml`, `~/builder/.builder.toml` |
|
|
366
|
+
| Project config file | `.pentesting.toml` | `.builder.toml` |
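
The environment-variable rules (a `__` separator for nested keys, with `PENTESTING_*` winning over legacy `BUILDER_*`) can be sketched as below. The `env_overrides` function and the lowercasing of key names are assumptions; the sketch also treats every prefixed variable as a config key, whereas launcher variables such as `PENTESTING_BIN` are presumably handled separately.

```python
def env_overrides(environ):
    """Collect BUILDER_* and PENTESTING_* variables into one nested dict."""
    result = {}
    # Legacy prefix first, so the canonical prefix overwrites it when both are set.
    for prefix in ("BUILDER_", "PENTESTING_"):
        for name, value in environ.items():
            if not name.startswith(prefix):
                continue
            path = name[len(prefix):].lower().split("__")
            node = result
            for key in path[:-1]:
                child = node.get(key)
                if not isinstance(child, dict):
                    child = node[key] = {}  # create (or replace a scalar with) a table
                node = child
            node[path[-1]] = value
    return result
```

For example, `PENTESTING_SESSION__MODEL_ID` lands at `result["session"]["model_id"]` under this mapping.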
|
|
367
|
+
|
|
368
|
+
## Interactive Commands
|
|
369
|
+
|
|
370
|
+
Inside an interactive session, these commands inspect and drive runtime state
|
|
371
|
+
(`/help` lists the full set):
|
|
372
|
+
|
|
373
|
+
```text
|
|
374
|
+
/status Show the current run phase, active tasks, gates, hooks, and budget signals
|
|
375
|
+
/workflow Show the current focus and recent workflow steps for the active conversation
|
|
376
|
+
/workflow report
|
|
377
|
+
Export the active run, engagement metadata, evidence, and large outputs to Markdown
|
|
378
|
+
/context Show recent context-budget snapshots for the current conversation
|
|
379
|
+
/memory Show stored conversation memories for the current conversation
|
|
380
|
+
/tools List the currently available tools and schemas
|
|
381
|
+
/agent Switch the active agent
|
|
382
|
+
/conversation Browse conversations for the active workspace
|
|
383
|
+
/goal <task> Set the active goal
|
|
384
|
+
/auto Toggle autonomous mode for the current goal
|
|
385
|
+
/update Download and apply the latest public release asset for supported native targets
|
|
386
|
+
/help Show all commands
|
|
387
|
+
/exit Quit
|
|
388
|
+
```
|
|
389
|
+
|
|
390
|
+
## Security Domain Skills
|
|
391
|
+
|
|
392
|
+
The `ctf-competition` and `pentesting-methodology` skills map authorized
|
|
393
|
+
assessments to standard frameworks:
|
|
394
|
+
|
|
395
|
+
- **PTES** (Penetration Testing Execution Standard)
|
|
396
|
+
- **MITRE ATT&CK** tactics and techniques
|
|
397
|
+
- **OWASP** Top 10 and Testing Guide
|
|
398
|
+
- **CWE/CAPEC** weakness and attack pattern catalogs
|
|
399
|
+
- **NIST CSF** and **CIS Controls**
|
|
400
|
+
|
|
401
|
+
### Shell listener for authorized labs
|
|
402
|
+
|
|
403
|
+
```bash
|
|
404
|
+
pentesting shell-listener --bind 127.0.0.1 --port 4444
|
|
405
|
+
```
|
|
406
|
+
|
|
407
|
+
The listener manages multiple accepted TCP sessions with per-session routing, buffered
|
|
408
|
+
output, raw byte logging, and PTY-upgrade helpers. Bound to loopback by default;
|
|
409
|
+
`--allow-remote` is an explicit opt-in gate.
|
package/LICENSE
ADDED
|
@@ -0,0 +1,28 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 agnusdei1207
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
|
22
|
+
|
|
23
|
+
---
|
|
24
|
+
|
|
25
|
+
Note: This MIT license covers the published `pentesting` npm wrapper (the
|
|
26
|
+
launcher in `bin/` and `lib/` plus its install scripts). The Builder runtime
|
|
27
|
+
engine that the wrapper downloads and executes is a separate, proprietary
|
|
28
|
+
component and is not licensed under these terms.
|
package/README.md
CHANGED
|
@@ -13,50 +13,26 @@
|
|
|
13
13
|
|
|
14
14
|
---
|
|
15
15
|
|
|
16
|
-
> **Role:** The published npm package and the command you run are both **`pentesting`**. The internal orchestration engine is **`builder`** — `pentesting` downloads and runs it under the hood. `npm i -g pentesting` installs the **`pentesting`** command (never a `builder` command); the engine name does not surface in normal use.
|
|
17
|
-
|
|
18
|
-
---
|
|
19
|
-
|
|
20
|
-
## 🎯 Key Design Pillars
|
|
21
|
-
|
|
22
|
-
* **Runtime-Owned Adjudication** — Model output is a *candidate* until a 5-gate completion lattice verifies and closes the task.
|
|
23
|
-
* **Local-First State** — Runtime state and a markdown knowledge graph live on disk. No external storage service.
|
|
24
|
-
* **Weak-Model Hardening** — Rust-enforced classifiers, validators, and self-reviews guard against hallucination.
|
|
25
|
-
* **Lineage & Horizons** — Run lineage, objective IDs, and horizon-aware memory reuse, all in local state.
|
|
26
|
-
* **Policy-Gated Tools** — `Allow/Deny/Confirm` gates shell, file, and network access (policy-based today; OS sandboxing is roadmap).
|
|
27
|
-
* **Intelligent Queuing** — Queued inputs are deduplicated and prioritized live; obsolete tasks are preempted, not FIFO-replayed.
|
|
28
|
-
* **Lab Session Multiplexing** — `shell-listener` runs multiple authorized TCP sessions with per-session routing and PTY upgrades.
|
|
29
|
-
* **Dynamic Agent Profiles** — No fixed persona: the classifier *generates* a profile per request, *recalls* reusable templates, *reuses* them to drive tool scope, phase, and memory weighting.
|
|
30
|
-
* **Ebbinghaus-Inspired Memory** — Memories carry a *strength* that fades like human memory and reinforces on recall. Faded notes are de-referenced, never destroyed.
|
|
31
|
-
* **Git-Backed Rewind** — Working-tree checkpoint/restore keeps autonomous edits reversible.
|
|
32
|
-
|
|
33
|
-
---
|
|
34
|
-
|
|
35
16
|
## 🚀 Quick Start
|
|
36
17
|
|
|
37
|
-
### Install via npm
|
|
38
|
-
|
|
39
18
|
```bash
|
|
40
19
|
npm install -g pentesting
|
|
41
20
|
pentesting
|
|
42
21
|
```
|
|
43
22
|
|
|
44
|
-
|
|
23
|
+
Or run it with Docker:
|
|
45
24
|
|
|
46
25
|
```bash
|
|
47
|
-
docker run -it --rm
|
|
48
|
-
-v "$(pwd):/workspace" \
|
|
49
|
-
-w /workspace \
|
|
50
|
-
agnusdei1207/pentesting:latest
|
|
26
|
+
docker run -it --rm -v "$(pwd):/workspace" -w /workspace agnusdei1207/pentesting:latest
|
|
51
27
|
```
|
|
52
28
|
|
|
53
|
-
|
|
29
|
+
Or via Docker Compose:
|
|
54
30
|
|
|
55
31
|
```bash
|
|
56
32
|
PENTESTING_PROJECT_DIR=/path/to/project docker compose run pentesting
|
|
57
33
|
```
|
|
58
34
|
|
|
59
|
-
###
|
|
35
|
+
### Common commands
|
|
60
36
|
|
|
61
37
|
```bash
|
|
62
38
|
pentesting # Interactive TUI
|
|
@@ -65,271 +41,45 @@ pentesting shell-listener --bind 127.0.0.1 --port 4444 # Authorized lab listen
|
|
|
65
41
|
pentesting --version
|
|
66
42
|
```
|
|
67
43
|
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
## 🧩 Orchestration Map
|
|
71
|
-
|
|
72
|
-
Pentesting is a **domain-neutral** runtime with a security focus. Development, pentesting, CTF, audit, and release work layer in via skills — never baked into the core.
|
|
73
|
-
|
|
74
|
-
**Profile flow** — generate → recall → reuse:
|
|
75
|
-
|
|
76
|
-
```mermaid
|
|
77
|
-
flowchart LR
|
|
78
|
-
R[Request] --> GEN["Generate<br/>dynamic profile"]
|
|
79
|
-
GEN --> REC["Recall<br/>named template + overlay"]
|
|
80
|
-
REC --> USE["Reuse<br/>tool scope · phase · memory weight"]
|
|
81
|
-
```
|
|
82
|
-
|
|
83
|
-
**Agent flow** — request to closure:
|
|
84
|
-
|
|
85
|
-
```mermaid
|
|
86
|
-
flowchart TD
|
|
87
|
-
U[User request] --> C[Intent classifier]
|
|
88
|
-
C --> P[Dynamic profile<br/>task shape + tool scope + rigor]
|
|
89
|
-
P --> R[Runtime router]
|
|
90
|
-
R --> A[Active agent]
|
|
91
|
-
A --> T{Need delegation?}
|
|
92
|
-
T -- no --> O[Tool execution]
|
|
93
|
-
T -- yes --> D[Agent tool]
|
|
94
|
-
D --> CO[coordinator]
|
|
95
|
-
D --> I[investigator]
|
|
96
|
-
D --> OP[operator]
|
|
97
|
-
D --> RV[reviewer]
|
|
98
|
-
D --> V[verifier]
|
|
99
|
-
D --> W[report-writer]
|
|
100
|
-
CO --> O
|
|
101
|
-
I --> O
|
|
102
|
-
OP --> O
|
|
103
|
-
RV --> O
|
|
104
|
-
V --> O
|
|
105
|
-
W --> O
|
|
106
|
-
O --> G[Completion gates]
|
|
107
|
-
G --> M[Memory + artifacts]
|
|
108
|
-
M --> U
|
|
109
|
-
```
|
|
110
|
-
|
|
111
|
-
### Built-in agent team
|
|
112
|
-
|
|
113
|
-
| Agent | Default role | Runtime profile |
|
|
114
|
-
| --- | --- | --- |
|
|
115
|
-
| `builder` | Hands-on implementation, refactoring, local file changes, tests | Implement + broad local write |
|
|
116
|
-
| `planner` | Implementation plans and risk breakdowns | Plan + read-only |
|
|
117
|
-
| `researcher` | Read-only codebase and reference research | Investigate + read-only |
|
|
118
|
-
| `coordinator` | Splits broad work into owned packets and consolidates results | Coordinate + broad local write |
|
|
119
|
-
| `investigator` | Evidence gathering across code, logs, commands, APIs, and behavior | Investigate + shell diagnostics |
|
|
120
|
-
| `operator` | Builds, tests, packaging, service startup, and runtime workflows | Implement + broad local write |
|
|
121
|
-
| `reviewer` | Findings-first technical review | Review + strict read-only |
|
|
122
|
-
| `verifier` | Reproduction, build/test proof, and completion claim checks | Verify + strict read-only |
|
|
123
|
-
| `report-writer` | Reports, handoffs, release notes, and reproducibility records | Implement + bounded write |
|
|
124
|
-
|
|
125
|
-
### Named autonomy profiles
|
|
126
|
-
|
|
127
|
-
Optionally pin the top-level profile (`autonomy_profile = "ctf-competition"`); leave unset for classifier-driven defaults. Delegated subagent contracts stay intact.
|
|
128
|
-
|
|
129
|
-
| Profile | Purpose |
|
|
130
|
-
| --- | --- |
|
|
131
|
-
| `general-agent` | Broad autonomous local orchestration for mixed tasks |
|
|
132
|
-
| `local-builder` | Hands-on implementation with fresh evidence retrieval |
|
|
133
|
-
| `ctf-competition` | Competition/lab workflow backed by `ctf-competition` and `pentesting-methodology` skills |
|
|
134
|
-
| `enterprise-review` | Strict, review-heavy profile with read-only dynamic scope |
|
|
135
|
-
|
|
136
|
-
### Engagement metadata
|
|
137
|
-
|
|
138
|
-
Runs can attach typed engagement context — scope, phase, tags, and standard refs (PTES, MITRE ATT&CK, OWASP, CWE/CAPEC, NIST CSF, CIS Controls) — to workflow metadata. `/workflow report` exports it as a Markdown handoff.
|
|
139
|
-
|
|
140
|
-
### Storage and graph strategy
|
|
141
|
-
|
|
142
|
-
- **Local-first** — runtime state, notes, scratchpad, and graph inputs live on disk.
|
|
143
|
-
- **Derived graph** — built from markdown notes, wiki-links/backlinks, skills, and memory; a view, not a second source of truth.
|
|
144
|
-
- **Separate skills** — domain skills (`ctf-competition`, audit, release, dev) teach method without changing runtime identity.
|
|
145
|
-
|
|
146
|
-
---
|
|
147
|
-
|
|
148
|
-
## 🧠 Memory & Knowledge
|
|
149
|
-
|
|
150
|
-
One local-first store — plain markdown notes on disk, no vector DB, no cloud. Memories carry a *strength* that fades like human memory, so the store stays sharp instead of rotting.
|
|
151
|
-
|
|
152
|
-
**Storage** — Ebbinghaus lifecycle (decay · reinforce · floor, never deleted):
|
|
153
|
-
|
|
154
|
-
```mermaid
|
|
155
|
-
flowchart TD
|
|
156
|
-
N[New memory] --> S["strength = quality × recall × e^-λ·age"]
|
|
157
|
-
S --> U{recalled?}
|
|
158
|
-
U -- yes --> RE[reinforce ↑ · reset age]
|
|
159
|
-
U -- no --> D[decay over time]
|
|
160
|
-
RE --> S
|
|
161
|
-
D --> F{below floor?}
|
|
162
|
-
F -- no --> S
|
|
163
|
-
F -- yes --> AR[de-reference: archive / tombstone]
|
|
164
|
-
AR -. recoverable on disk .-> N
|
|
165
|
-
```
|
|
166
|
-
|
|
167
|
-
Kinds fade at different speeds (procedural outlives episodic); bi-temporal `event_time` vs `ingestion_time` lets newer facts supersede stale ones.
|
|
168
|
-
|
|
169
|
-
**Retrieval** — hybrid fuse, strength-weighted, read-only:
|
|
170
|
-
|
|
171
|
-
```mermaid
|
|
172
|
-
flowchart LR
|
|
173
|
-
Q[Query] --> L[Lexical]
|
|
174
|
-
Q --> SE[Semantic]
|
|
175
|
-
Q --> G[Graph]
|
|
176
|
-
L --> RRF[RRF fuse]
|
|
177
|
-
SE --> RRF
|
|
178
|
-
G --> RRF
|
|
179
|
-
RRF --> RR[rerank: phase · recency · task]
|
|
180
|
-
RR --> W[weight by strength]
|
|
181
|
-
W --> P[Prompt context]
|
|
182
|
-
```
|
|
183
|
-
|
|
184
|
-
Faded, private, or unsafe memories are held back from the prompt. Lookups never write — reinforce/archive/supersede are explicit, never search side effects.
|
|
44
|
+
Inside a session, `/help` lists every command and `/update` pulls the latest release for supported native targets.
|
|
185
45
|
|
|
186
46
|
---
|
|
187
47
|
|
|
188
|
-
##
|
|
189
|
-
|
|
190
|
-
Inside an interactive pentesting session, these commands are the fastest way to inspect state:
|
|
191
|
-
|
|
192
|
-
```text
|
|
193
|
-
/status Show the current run phase, active tasks, gates, hooks, and budget signals
|
|
194
|
-
/workflow Show the current focus and recent workflow steps for the active conversation
|
|
195
|
-
/workflow report
|
|
196
|
-
Export the active run, engagement metadata, evidence, and large outputs to Markdown
|
|
197
|
-
/context Show recent context-budget snapshots for the current conversation
|
|
198
|
-
/memory Show stored conversation memories for the current conversation
|
|
199
|
-
/tools List the currently available tools and schemas
|
|
200
|
-
/agent Switch the active agent
|
|
201
|
-
/conversation Browse conversations for the active workspace
|
|
202
|
-
/goal <task> Set the active goal
|
|
203
|
-
/auto Toggle autonomous mode for the current goal
|
|
204
|
-
/help Show all commands
|
|
205
|
-
/exit Quit
|
|
206
|
-
```
|
|
48
|
+
## ⚙️ Configuration
|
|
207
49
|
|
|
208
|
-
|
|
209
|
-
|
|
210
|
-
---
|
|
211
|
-
|
|
212
|
-
## ⚙️ Configuration (`.pentesting.toml`)
|
|
213
|
-
|
|
214
|
-
Pentesting reads a user-global config from `~/.pentesting/.pentesting.toml`, overridden by project-local `.pentesting.toml` files discovered by walking up from the working directory.
|
|
50
|
+
Pentesting reads `~/.pentesting/.pentesting.toml`, overridden by a project-local `.pentesting.toml`:
|
|
215
51
|
|
|
216
52
|
```toml
|
|
217
53
|
[storage]
|
|
218
|
-
backend = "local"
|
|
219
|
-
```
|
|
220
|
-
|
|
221
|
-
Environment variables use the `PENTESTING_` prefix, with `__` for nested keys — e.g. `PENTESTING_SESSION__MODEL_ID=...`. Set `PENTESTING_CONFIG` to override the global config directory.
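
To make the naming rule concrete, here is a small sketch of how a prefixed variable could map onto a nested config key. It assumes keys are lower-cased and that `__` is the only nesting separator; the function name `envToConfig` is hypothetical, and the real runtime may special-case variables such as `PENTESTING_CONFIG` that this sketch treats like any other.

```javascript
// Map PENTESTING_-prefixed variables onto a nested object.
// Example: PENTESTING_SESSION__MODEL_ID -> config.session.model_id
function envToConfig(env, prefix = "PENTESTING_") {
  const config = {};
  for (const [name, value] of Object.entries(env)) {
    if (!name.startsWith(prefix)) continue; // ignore unrelated variables
    const path = name.slice(prefix.length).toLowerCase().split("__");
    let node = config;
    // Walk (and create) intermediate tables for every segment but the last.
    for (const key of path.slice(0, -1)) {
      node = node[key] ??= {};
    }
    node[path.at(-1)] = value;
  }
  return config;
}
```

In a Node.js session, `envToConfig(process.env)` would yield the overrides as a plain object under these assumptions.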
|
|
222
|
-
|
|
223
|
-
### Backward compatibility — `builder` ↔ `pentesting`
|
|
224
|
-
|
|
225
|
-
The runtime engine is still `builder` under the hood, so legacy names keep working. Use whichever you like; the `PENTESTING_*` form wins when both are set.
|
|
226
|
-
|
|
227
|
-
| Surface | Canonical | Legacy (still accepted) |
|
|
228
|
-
| :--- | :--- | :--- |
|
|
229
|
-
| Env vars | `PENTESTING_*` | `BUILDER_*` |
|
|
230
|
-
| Config-dir override | `PENTESTING_CONFIG` | `BUILDER_CONFIG` |
|
|
231
|
-
| Global config file | `~/.pentesting/.pentesting.toml` | `~/.builder/.builder.toml`, `~/builder/.builder.toml` |
|
|
232
|
-
| Project config file | `.pentesting.toml` | `.builder.toml` |
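
The precedence rule for environment variables in the table above ("the `PENTESTING_*` form wins when both are set") could look like the following. This is only a sketch: `readSetting` is a hypothetical helper, and how the runtime treats empty values is not documented here, so the sketch treats any defined value as set.

```javascript
// Look up a setting by suffix, preferring the canonical PENTESTING_ name
// and falling back to the legacy BUILDER_ alias.
function readSetting(env, suffix) {
  return env[`PENTESTING_${suffix}`] ?? env[`BUILDER_${suffix}`];
}
```

For example, `readSetting(process.env, "CONFIG")` would return `PENTESTING_CONFIG` if present and `BUILDER_CONFIG` otherwise.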
|
|
233
|
-
|
|
234
|
-
---
|
|
235
|
-
|
|
236
|
-
## 🔐 Pentesting-Specific Notes
|
|
237
|
-
|
|
238
|
-
### Architecture — single source, two surfaces
|
|
239
|
-
|
|
240
|
-
Pentesting and Builder share the **same Rust runtime binary**. The `pentesting` npm package is a thin distribution facade:
|
|
241
|
-
|
|
242
|
-
```text
|
|
243
|
-
npm install -g pentesting
|
|
244
|
-
│
|
|
245
|
-
▼
|
|
246
|
-
pentesting CLI (Node.js shim)
|
|
247
|
-
│ resolves or downloads the matching Builder release asset
|
|
248
|
-
▼
|
|
249
|
-
Builder binary (Rust) ← single runtime engine
|
|
250
|
-
│ PENTESTING_PRODUCT_NAME=pentesting
|
|
251
|
-
▼
|
|
252
|
-
Interactive TUI with "pentesting" banner
|
|
253
|
-
```
|
|
254
|
-
|
|
255
|
-
- The npm package installs a launcher, **not** a second agent runtime.
|
|
256
|
-
- It resolves or downloads the correct release asset from `agnusdei1207/pentesting-public`.
|
|
257
|
-
- It forwards arguments directly into the Rust binary — no command translation or compatibility shims.
|
|
258
|
-
- If a change would add orchestration, memory, or prompt logic into the npm layer, that change belongs upstream in the Rust runtime.
|
|
259
|
-
|
|
260
|
-
### Security domain skills
|
|
261
|
-
|
|
262
|
-
The `ctf-competition` and `pentesting-methodology` skills map authorized assessments to standard frameworks:
|
|
263
|
-
|
|
264
|
-
- **PTES** (Penetration Testing Execution Standard)
|
|
265
|
-
- **MITRE ATT&CK** tactics and techniques
|
|
266
|
-
- **OWASP** Top 10 and Testing Guide
|
|
267
|
-
- **CWE/CAPEC** weakness and attack pattern catalogs
|
|
268
|
-
- **NIST CSF** and **CIS Controls**
|
|
269
|
-
|
|
270
|
-
### Shell listener for authorized labs
|
|
271
|
-
|
|
272
|
-
```bash
|
|
273
|
-
pentesting shell-listener --bind 127.0.0.1 --port 4444
|
|
54
|
+
backend = "local" # Local md/fs runtime state
|
|
274
55
|
```
|
|
275
56
|
|
|
276
|
-
|
|
277
|
-
|
|
278
|
-
---
|
|
279
|
-
|
|
280
|
-
## 📦 Supported Runtime Targets
|
|
281
|
-
|
|
282
|
-
| OS | CPU | Release asset |
|
|
283
|
-
| --- | --- | --- |
|
|
284
|
-
| Linux | x64 | `pentesting-x86_64-unknown-linux-musl` |
|
|
285
|
-
| Linux | arm64 | `pentesting-aarch64-unknown-linux-musl` |
|
|
286
|
-
| macOS | x64 | `pentesting-x86_64-apple-darwin` |
|
|
287
|
-
| macOS | arm64 | `pentesting-aarch64-apple-darwin` |
|
|
288
|
-
| Windows | x64 | `pentesting-x86_64-pc-windows-msvc.exe` |
|
|
289
|
-
| Windows | arm64 | `pentesting-aarch64-pc-windows-msvc.exe` |
|
|
290
|
-
| Android | arm64 | `pentesting-aarch64-linux-android` |
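
The table above can be read as a lookup from Node's `process.platform` and `process.arch` values to a release asset name. The sketch below is an assumption about how such a lookup might be written, not the launcher's actual code; `ASSETS` and `assetNameFor` are hypothetical names. Node reports macOS as `darwin` and Windows as `win32`.

```javascript
// Asset names copied from the table above, keyed by "<platform>-<arch>".
const ASSETS = {
  "linux-x64": "pentesting-x86_64-unknown-linux-musl",
  "linux-arm64": "pentesting-aarch64-unknown-linux-musl",
  "darwin-x64": "pentesting-x86_64-apple-darwin",
  "darwin-arm64": "pentesting-aarch64-apple-darwin",
  "win32-x64": "pentesting-x86_64-pc-windows-msvc.exe",
  "win32-arm64": "pentesting-aarch64-pc-windows-msvc.exe",
  "android-arm64": "pentesting-aarch64-linux-android",
};

// Resolve the asset for a target, failing loudly on unsupported combinations.
function assetNameFor(platform, arch) {
  const name = ASSETS[`${platform}-${arch}`];
  if (!name) {
    throw new Error(`unsupported target: ${platform}/${arch}`);
  }
  return name;
}
```

A caller would typically pass `process.platform` and `process.arch`.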
|
|
291
|
-
|
|
292
|
-
---
|
|
293
|
-
|
|
294
|
-
## 🌍 Environment Variables
|
|
57
|
+
Environment variables use the `PENTESTING_` prefix (`__` for nested keys, e.g. `PENTESTING_SESSION__MODEL_ID=...`). Full variable list and the legacy `BUILDER_*` aliases are documented in [`ARCHITECTURE.md`](ARCHITECTURE.md#configuration).
|
|
295
58
|
|
|
296
|
-
|
|
297
|
-
| --- | --- |
|
|
298
|
-
| `PENTESTING_BIN` | Use an already-installed Builder binary instead of the managed download. |
|
|
299
|
-
| `PENTESTING_PRODUCT_NAME` | Runtime banner label. The `pentesting` launcher sets this to `pentesting` automatically. |
|
|
300
|
-
| `PENTESTING_REPO` | Override the public release repo used for binary downloads. Defaults to `agnusdei1207/pentesting-public`. |
|
|
301
|
-
| `PENTESTING_SKIP_DOWNLOAD` | Skip the postinstall binary download. Useful in CI or when `PENTESTING_BIN` will be provided later. |
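
One way to picture how these variables interact is a small decision function. The sketch is hypothetical: the name `resolveBinaryPlan`, the returned shape, and the rule that any non-empty `PENTESTING_SKIP_DOWNLOAD` value counts as "skip" are assumptions, since the table does not say how the value is parsed.

```javascript
// Decide where the runtime binary comes from, based on the variables above.
function resolveBinaryPlan(env) {
  // An explicitly provided binary takes the place of the managed download.
  if (env.PENTESTING_BIN) {
    return { kind: "external", path: env.PENTESTING_BIN };
  }
  // Default repo as stated in the table.
  const repo = env.PENTESTING_REPO ?? "agnusdei1207/pentesting-public";
  // Assumed: any non-empty value skips the postinstall download.
  if (env.PENTESTING_SKIP_DOWNLOAD) {
    return { kind: "skip", repo };
  }
  return { kind: "download", repo };
}
```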
|
|
59
|
+
> **Note:** The command you run is always **`pentesting`**. The internal engine is **`builder`** — `pentesting` downloads and runs it under the hood; the engine name never surfaces in normal use.
|
|
302
60
|
|
|
303
61
|
---
|
|
304
62
|
|
|
305
|
-
## 📖
|
|
63
|
+
## 📖 Documentation
|
|
306
64
|
|
|
307
|
-
* [`
|
|
65
|
+
* [`ARCHITECTURE.md`](ARCHITECTURE.md) — Runtime flow, agent team, memory model, crate map, tool surface, and supported targets.
|
|
66
|
+
* [Public site](https://agnusdei1207.github.io/pentesting-public/) — Landing page and public runtime entry surface.
|
|
308
67
|
* [`compose.yaml`](https://github.com/agnusdei1207/pentesting-public) — Docker Compose facade for pentesting sessions.
|
|
309
68
|
|
|
310
69
|
---
|
|
311
70
|
|
|
312
|
-
|
|
313
|
-
|
|
314
|
-
<br/>
|
|
315
|
-
|
|
316
|
-
<img src="https://api.iconify.design/twemoji:flag-ireland.svg" width="36" height="36" alt="Ireland" />
|
|
317
|
-
<img src="https://api.iconify.design/twemoji:flag-south-korea.svg" width="36" height="36" alt="South Korea" />
|
|
318
|
-
<img src="https://api.iconify.design/twemoji:flag-germany.svg" width="36" height="36" alt="Germany" />
|
|
319
|
-
<img src="https://api.iconify.design/twemoji:flag-italy.svg" width="36" height="36" alt="Italy" />
|
|
320
|
-
<img src="https://api.iconify.design/twemoji:flag-netherlands.svg" width="36" height="36" alt="Netherlands" />
|
|
321
|
-
<img src="https://api.iconify.design/twemoji:flag-japan.svg" width="36" height="36" alt="Japan" />
|
|
322
|
-
<img src="https://api.iconify.design/twemoji:flag-belgium.svg" width="36" height="36" alt="Belgium" />
|
|
323
|
-
<img src="https://api.iconify.design/twemoji:flag-spain.svg" width="36" height="36" alt="Spain" />
|
|
324
|
-
<img src="https://api.iconify.design/twemoji:flag-portugal.svg" width="36" height="36" alt="Portugal" />
|
|
325
|
-
<img src="https://api.iconify.design/twemoji:flag-austria.svg" width="36" height="36" alt="Austria" />
|
|
71
|
+
## 🎹 From the Developer
|
|
326
72
|
|
|
327
|
-
|
|
328
|
-
|
|
329
|
-
|
|
330
|
-
|
|
331
|
-
[](#)
|
|
73
|
+
<div align="center">
|
|
74
|
+
<img src="https://github.com/user-attachments/assets/abe73474-f27b-4536-b358-fedef1e461f0" alt="Chopin Ballade No.4" width="600" />
|
|
75
|
+
</div>
|
|
332
76
|
|
|
333
77
|
<br/>
|
|
334
78
|
|
|
335
|
-
|
|
79
|
+
> "I believe playing the piano is also a form of orchestration."
|
|
80
|
+
>
|
|
81
|
+
> The harmony of polyphony — multiple voices — and homophony — a single melodic line.
|
|
82
|
+
>
|
|
83
|
+
> Each voice sings its most beautiful song from its own place, yet when combined, they create one grand, beautiful melody. I believe this structure is no different from AI agents.
|
|
84
|
+
>
|
|
85
|
+
> — *agnusdei1207*
|
package/bin/pentesting.mjs
CHANGED
|
File without changes
|
package/lib/runtime.mjs
CHANGED
|
@@ -119,11 +119,7 @@ export async function installManagedBuilder(options = {}) {
|
|
|
119
119
|
|
|
120
120
|
await mkdir(MANAGED_BINARY_DIR, { recursive: true });
|
|
121
121
|
const downloadUrl = releaseAssetUrl(target.assetName, { repo, releaseTag });
|
|
122
|
-
const response = await fetch(downloadUrl, {
|
|
123
|
-
headers: {
|
|
124
|
-
"user-agent": `pentesting-npm/${packageVersion()}`,
|
|
125
|
-
},
|
|
126
|
-
});
|
|
122
|
+
const response = await fetch(downloadUrl);
|
|
127
123
|
|
|
128
124
|
if (!response.ok || !response.body) {
|
|
129
125
|
throw new Error(
|
package/package.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "pentesting",
|
|
3
|
-
"version": "0.
|
|
4
|
-
"builderReleaseTag": "v0.
|
|
3
|
+
"version": "0.92.5",
|
|
4
|
+
"builderReleaseTag": "v0.92.5",
|
|
5
5
|
"description": "pentesting — security-focused agent runtime (internal engine: builder). Thin npm facade that downloads the managed Builder binary and forwards arguments.",
|
|
6
6
|
"license": "MIT",
|
|
7
7
|
"author": "agnusdei1207",
|
|
@@ -13,7 +13,9 @@
|
|
|
13
13
|
"bin",
|
|
14
14
|
"lib",
|
|
15
15
|
"scripts/postinstall.mjs",
|
|
16
|
-
"README.md"
|
|
16
|
+
"README.md",
|
|
17
|
+
"ARCHITECTURE.md",
|
|
18
|
+
"pentesting-logo.svg"
|
|
17
19
|
],
|
|
18
20
|
"engines": {
|
|
19
21
|
"node": ">=18.18.0"
|
|
@@ -46,45 +48,34 @@
|
|
|
46
48
|
"scripts": {
|
|
47
49
|
"postinstall": "node ./scripts/postinstall.mjs",
|
|
48
50
|
"prepublishOnly": "npm run verify",
|
|
51
|
+
"consistency": "bash scripts/check-project-consistency.sh",
|
|
52
|
+
"audit": "bash scripts/audit-project.sh",
|
|
49
53
|
"test": "node --test tests/*.test.mjs",
|
|
50
54
|
"preflight:local": "bash scripts/preflight-local.sh",
|
|
51
|
-
"verify": "npm run
|
|
52
|
-
"
|
|
53
|
-
"
|
|
54
|
-
"check:smoke": "sh -c 'npm run docker:builder:build && docker run --rm -v builder-workspace:/workspace agnusdei1207/pentesting:latest --version'",
|
|
55
|
-
"eval": "tsx benchmarks/cli.ts",
|
|
56
|
-
"eval:all": "npm run build && find benchmarks/evals -name task.yml -exec dirname {} \\; | sort | while read -r d; do echo \"\\n=== Running $(basename $d) ===\"; tsx benchmarks/cli.ts \"$d\" || true; done",
|
|
57
|
-
"pentesting:help": "./scripts/pentesting-release-help.sh",
|
|
58
|
-
"pentesting:status": "./scripts/pentesting-release-status.sh",
|
|
55
|
+
"verify": "npm run consistency && npm run test",
|
|
56
|
+
"pentesting:help": "bash scripts/pentesting-release-help.sh",
|
|
57
|
+
"pentesting:status": "bash scripts/pentesting-release-status.sh",
|
|
59
58
|
"pentesting:test": "npm run test",
|
|
60
59
|
"pentesting:verify": "npm run verify",
|
|
61
60
|
"pentesting:pack:dry-run": "npm pack --dry-run",
|
|
62
|
-
"pentesting:check": "
|
|
63
|
-
"pentesting:publish": "
|
|
64
|
-
"pentesting:publish:dry-run": "DRY_RUN=true
|
|
65
|
-
"
|
|
66
|
-
"
|
|
67
|
-
"
|
|
68
|
-
"
|
|
69
|
-
"
|
|
70
|
-
"
|
|
71
|
-
"public:sync": "
|
|
72
|
-
"public:
|
|
73
|
-
"
|
|
74
|
-
"
|
|
75
|
-
"
|
|
76
|
-
"docker:
|
|
77
|
-
"
|
|
78
|
-
"
|
|
79
|
-
"
|
|
80
|
-
"docker:base:push": "if [ -n \"$DOCKER_PASSWORD\" ]; then echo \"$DOCKER_PASSWORD\" | docker login -u \"${DOCKER_USERNAME:-agnusdei1207}\" --password-stdin; fi && npm run docker:base:build && docker push agnusdei1207/pentesting-build-base:1.95 && docker push agnusdei1207/pentesting-runtime-base:26.04",
|
|
81
|
-
"docker:clean": "docker stop $(docker ps -q) 2>/dev/null || true && docker system prune -af",
|
|
82
|
-
"release": "./scripts/release-all.sh patch",
|
|
83
|
-
"release:dry": "DRY_RUN=true ./scripts/release-all.sh patch",
|
|
84
|
-
"release:local": "./scripts/run-release-in-docker.sh ./scripts/build-release-local.sh",
|
|
85
|
-
"release:local:dry": "DRY_RUN=true ./scripts/run-release-in-docker.sh ./scripts/build-release-local.sh",
|
|
86
|
-
"release:backfill": "./scripts/run-release-in-docker.sh ./scripts/backfill-release-local.sh",
|
|
87
|
-
"release:backfill:dry": "DRY_RUN=true ./scripts/run-release-in-docker.sh ./scripts/backfill-release-local.sh"
|
|
61
|
+
"pentesting:check": "bash scripts/check-pentesting-package.sh",
|
|
62
|
+
"pentesting:publish": "bash scripts/publish-pentesting-package.sh",
|
|
63
|
+
"pentesting:publish:dry-run": "DRY_RUN=true bash scripts/publish-pentesting-package.sh",
|
|
64
|
+
"release:npm": "bash scripts/publish-pentesting-package.sh patch",
|
|
65
|
+
"release:npm:minor": "bash scripts/publish-pentesting-package.sh minor",
|
|
66
|
+
"release:npm:major": "bash scripts/publish-pentesting-package.sh major",
|
|
67
|
+
"release:npm:dry": "DRY_RUN=true bash scripts/publish-pentesting-package.sh patch",
|
|
68
|
+
"release:npm:minor:dry": "DRY_RUN=true bash scripts/publish-pentesting-package.sh minor",
|
|
69
|
+
"release:npm:major:dry": "DRY_RUN=true bash scripts/publish-pentesting-package.sh major",
|
|
70
|
+
"public:sync": "bash scripts/sync-public-repo.sh",
|
|
71
|
+
"public:mirror-release": "bash scripts/mirror-public-release.sh",
|
|
72
|
+
"release:backfill": "bash scripts/backfill-release-local.sh",
|
|
73
|
+
"check": "sh -c 'npm run docker:build && docker run -it --rm -v builder-workspace:/workspace -e ANTHROPIC_BASE_URL -e ANTHROPIC_AUTH_TOKEN -e ANTHROPIC_MODEL -e ANTHROPIC_API_KEY -e MINIMAX_API_KEY -e OPENAI_API_KEY -e OPENAI_BASE_URL -e GEMINI_API_KEY -e DEEPSEEK_API_KEY agnusdei1207/pentesting:latest'",
|
|
74
|
+
"docker:build": "(docker image inspect agnusdei1207/pentesting-build-base:1.96 >/dev/null 2>&1 && docker image inspect agnusdei1207/pentesting-runtime-base:26.04 >/dev/null 2>&1 || npm run docker:base:build) && docker build --build-arg APP_VERSION=$(git describe --tags --abbrev=0 2>/dev/null || echo dev) -t agnusdei1207/pentesting:latest .",
|
|
75
|
+
"docker:base:build": "docker build -t agnusdei1207/pentesting-build-base:1.96 -f docker/build-base.Dockerfile . && docker build -t agnusdei1207/pentesting-runtime-base:26.04 -f docker/runtime-base.Dockerfile .",
|
|
76
|
+
"release:patch": "./scripts/release-all.sh patch",
|
|
77
|
+
"release:minor": "./scripts/release-all.sh minor",
|
|
78
|
+
"release:major": "./scripts/release-all.sh major"
|
|
88
79
|
},
|
|
89
80
|
"devDependencies": {
|
|
90
81
|
"@ai-sdk/google-vertex": "^4.0.47",
|
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" aria-hidden="true">
|
|
2
|
+
<g
|
|
3
|
+
fill="none"
|
|
4
|
+
stroke="#00d4aa"
|
|
5
|
+
stroke-linecap="round"
|
|
6
|
+
stroke-linejoin="round"
|
|
7
|
+
stroke-width="1.5"
|
|
8
|
+
>
|
|
9
|
+
<path d="m7 7l1.227 1.057C8.742 8.502 9 8.724 9 9s-.258.498-.773.943L7 11" />
|
|
10
|
+
<path d="M11 11h3" />
|
|
11
|
+
<path d="M12 21c3.75 0 5.625 0 6.939-.955a5 5 0 0 0 1.106-1.106C21 17.625 21 15.749 21 12s0-5.625-.955-6.939a5 5 0 0 0-1.106-1.106C17.625 3 15.749 3 12 3s-5.625 0-6.939.955A5 5 0 0 0 3.955 5.06C3 6.375 3 8.251 3 12s0 5.625.955 6.939a5 5 0 0 0 1.106 1.106C6.375 21 8.251 21 12 21" />
|
|
12
|
+
</g>
|
|
13
|
+
</svg>
|
package/scripts/postinstall.mjs
CHANGED
|
File without changes
|