pentesting 0.73.7 → 0.73.9

This diff shows the changes between these publicly released package versions as they appear in their registries, and is provided for informational purposes only.
package/README.md CHANGED
@@ -1,12 +1,15 @@
 <div align="center">

- <img src="https://api.iconify.design/game-icons:fizzing-flask.svg?color=%232496ED" width="80" height="80" alt="Pentesting Agent" />
+ <a href="https://agnusdei1207.github.io/brainscience/pentesting/">
+ <img src="https://api.iconify.design/game-icons:fizzing-flask.svg?color=%232496ED" width="80" height="80" alt="Pentesting Agent" />
+ </a>

 # pentesting
 > **Autonomous Offensive Security AI Agent**

- [![npm](https://img.shields.io/badge/npm-pentesting-2496ED)](https://www.npmjs.org/package/pentesting)
- [![docker](https://img.shields.io/badge/docker-pentesting-2496ED)](https://hub.docker.com/r/agnusdei1207/pentesting)
+ [![npm](https://img.shields.io/badge/npm-pentesting-CB3837?style=flat-square&logo=npm&logoColor=white)](https://www.npmjs.com/package/pentesting)
+ [![docker](https://img.shields.io/badge/docker-pentesting-2496ED?style=flat-square&logo=docker&logoColor=white)](https://hub.docker.com/r/agnusdei1207/pentesting)
+ [![docs](https://img.shields.io/badge/docs-brainscience-10B981?style=flat-square&logo=readthedocs&logoColor=white)](https://agnusdei1207.github.io/brainscience/pentesting/)

 </div>

@@ -30,48 +33,26 @@

 ## Purpose

- Pentesting support tool. Can autonomously execute network penetration tests or assist with generic Capture The Flag (CTF) challenges (such as Reverse Engineering, Cryptography, and binary analysis) without requiring a specific network target.
+ Autonomous network penetration testing and CTF assistant. Supports offensive security workflows including recon, exploit, and post-exploitation.

 ## Quick Start

- ### z.ai GLM Coding Plan Max (Recommended)
-
- Web search is included in the subscription — **no separate Search API key required**.
-
- If you want the repository default start path, export your env vars locally and run:
-
- ```bash
- export PENTEST_API_KEY="your_z_ai_key"
- export PENTEST_BASE_URL="https://api.z.ai/api/anthropic"
- export PENTEST_MODEL="glm-4.7"
- npm run start
- ```
-
- `npm run start -- -t 10.10.10.5` passes CLI arguments through to the container entrypoint.
- Use `npm run start:local` only if you explicitly want the non-container Node runtime.
+ ### 🐳 Docker (Recommended)

 ```bash
 docker run -it --rm \
- -e PENTEST_API_KEY="your_z_ai_key" \
+ -e PENTEST_API_KEY="your_key" \
 -e PENTEST_BASE_URL="https://api.z.ai/api/anthropic" \
 -e PENTEST_MODEL="glm-4.7" \
 agnusdei1207/pentesting
 ```

- Enable container Tor mode by adding `-e PENTEST_TOR=true` to the same `docker run` command.
-
- ### External Search API (Optional)
-
- For providers other than z.ai, or to use a dedicated search backend.
+ ### 🐉 Kali Linux (Native)

 ```bash
- docker run -it --rm \
- -e PENTEST_API_KEY="your_api_key" \
- -e PENTEST_BASE_URL="https://open.bigmodel.cn/api/paas/v4" \
- -e PENTEST_MODEL="glm-4-plus" \
- -e SEARCH_API_KEY="your_brave_api_key" \
- -e SEARCH_API_URL="https://api.search.brave.com/res/v1/web/search" \
- agnusdei1207/pentesting
+ npm install -g pentesting
+ export PENTEST_API_KEY="your_key"
+ pentesting
 ```

 ### Environment Variables
@@ -79,38 +60,10 @@ docker run -it --rm \
 | Variable | Required | Description |
 |----------|----------|-------------|
 | `PENTEST_API_KEY` | ✅ | LLM API key |
- | `PENTEST_BASE_URL` | ❌ | Custom API endpoint (web search auto-enabled when URL contains `z.ai`) |
- | `PENTEST_MODEL` | ❌ | Model override (defaults depend on provider/runtime; examples use `glm-4.7`) |
- | `SEARCH_API_KEY` | ❌ | External search API key (not needed with z.ai) |
- | `SEARCH_API_URL` | ❌ | External search API URL (not needed with z.ai) |
- | `PENTEST_SCOPE_MODE` | ❌ | Scope mode override: `advisory` or `enforce` |
- | `PENTEST_APPROVAL_MODE` | ❌ | Approval mode override: `advisory` or `require_auto_approve` |
- | `PENTEST_TOR` | ❌ | Container-only Tor mode. When `true`, the Docker entrypoint starts Tor and launches the agent through `proxychains4` |
-
- Safety defaults:
-
- - Containerized runtime defaults to `PENTEST_SCOPE_MODE=advisory` and `PENTEST_APPROVAL_MODE=advisory`.
- - Non-container runtime defaults to `PENTEST_SCOPE_MODE=enforce` and `PENTEST_APPROVAL_MODE=require_auto_approve`.
- - Explicit env vars override those defaults.
-
- Tor notes:
-
- - Tor is supported only in the containerized runtime.
- - There is no in-app `/tor` toggle. Enable it at container startup with `-e PENTEST_TOR=true`.
- - Non-container runs ignore `PENTEST_TOR`, so local host execution stays on direct networking.
-
- ### Developer Verification
-
- ```bash
- npm run verify
- npm run verify:docker
- ```
-
- - `npm run verify` now runs typecheck, tests, and build.
- - `npm run verify:docker` builds the image and launches the Docker TUI path through `test.sh`.
- - `npm run check` prunes Docker state, runs tests and build, rebuilds the local image, and then launches the Docker TUI path.
- - `npm run check:clean` runs `npm run check:ci` after an explicit `docker system prune -af --volumes`.
-
+ | `PENTEST_BASE_URL` | ❌ | API endpoint (z.ai auto-enables web search) |
+ | `PENTEST_MODEL` | ❌ | Model (default: `glm-4.7`) |
+ | `SEARCH_API_KEY` | ❌ | External search key (not needed for z.ai) |
+ | `PENTEST_TOR` | ❌ | Enable Tor (`true`, Docker only) |
 ---

 ## Issue
@@ -134,124 +87,3 @@ we don't stop until the flag is captured.
 <br/>

 </div>
-
- ---
-
- ## Research References
-
- This section collects representative papers matched to the design themes reflected in `pentesting`.
-
- It is an inference-based reconstruction from topic overlap, not a verbatim personal reading log.
-
- ### Mapping
-
- - Offensive security agent papers inform the autonomous pentest workflow.
- - Planner-executor and heterogeneous collaboration papers inform task decomposition and coordination.
- - Multi-agent orchestration papers inform role separation, delegation, and control topology.
- - Benchmark and evaluation papers inform capability framing and validation strategy.
-
- ### Offensive Security Agents
-
- 1. [PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing](https://www.usenix.org/conference/usenixsecurity24/presentation/deng)
- USENIX Security 2024
- Relevance: autonomous pentest loop and operator-assist workflow.
-
- 2. [D-CIPHER: Dynamic Collaborative Intelligent Agents with Planning and Heterogeneous Execution for Enhanced Reasoning in Offensive Security](https://arxiv.org/abs/2502.10931)
- arXiv 2025
- Relevance: collaborative offensive agents, planning, and heterogeneous execution roles.
-
- 3. [Towards Automated Software Security Testing: Augmenting Penetration Testing through LLMs](https://conf.researchr.org/room/ssbse-2023/fse-2023-venue-golden-gate-c1)
- ESEC/FSE 2023
- Relevance: LLM-augmented penetration testing as a software engineering workflow.
-
- 4. [LLMs as Hackers: Autonomous Linux Privilege Escalation Attacks](https://arxiv.org/abs/2310.11409)
- arXiv 2023
- Relevance: offensive autonomy in post-exploitation and privilege escalation.
-
- 5. [Can LLMs Hack Enterprise Networks? Autonomous Assumed Breach Penetration-Testing Active Directory Networks](https://arxiv.org/abs/2502.04227)
- arXiv 2025
- Relevance: enterprise network movement and AD-focused agent behavior.
-
- 6. [LLM Agents can Autonomously Hack Websites](https://arxiv.org/abs/2402.06664)
- arXiv 2024
- Relevance: web exploitation agents and end-to-end task execution.
-
- 7. [LLM Agents can Autonomously Exploit One-day Vulnerabilities](https://arxiv.org/abs/2404.08144)
- arXiv 2024
- Relevance: exploit execution against known-vulnerability targets.
-
- 8. [Teams of LLM Agents can Exploit Zero-Day Vulnerabilities](https://arxiv.org/abs/2406.01637)
- arXiv 2024
- Relevance: multi-agent offensive workflows for harder vulnerability exploitation.
-
- 9. [AutoPentester: An LLM Agent-based Framework for Automated Pentesting](https://arxiv.org/abs/2510.05605)
- arXiv 2025
- Relevance: explicit automated pentesting framework alignment.
-
- ### Benchmarks and Cyber Evaluation
-
- 10. [AutoPenBench: A Vulnerability Testing Benchmark for Generative Agents](https://aclanthology.org/2025.emnlp-industry.114/)
- EMNLP Industry 2025
- Relevance: benchmark framing for generative vulnerability-testing agents.
-
- 11. [Training Language Model Agents to Find Vulnerabilities with CTF-Dojo](https://arxiv.org/abs/2508.18370)
- arXiv 2025
- Relevance: CTF-grounded vulnerability discovery and training/eval setup.
-
- 12. [Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models](https://arxiv.org/abs/2408.08926)
- arXiv 2024
- Relevance: evaluation of cybersecurity capability and misuse risk.
-
- 13. [CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale](https://arxiv.org/abs/2506.02548)
- arXiv 2025
- Relevance: large-scale realistic vulnerability evaluation.
-
- 14. [CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models](https://arxiv.org/abs/2404.13161)
- arXiv 2024
- Relevance: broad cyber eval framing and safety measurement.
-
- 15. [When LLMs Meet Cybersecurity: A Systematic Literature Review](https://arxiv.org/abs/2405.03644)
- arXiv 2024
- Relevance: survey grounding across offensive and defensive use cases.
-
- 16. [Large Language Models in Cybersecurity: State-of-the-Art](https://arxiv.org/abs/2402.00891)
- arXiv 2024
- Relevance: landscape overview for positioning the project.
-
- ### Multi-Agent Collaboration and Orchestration
-
- 17. [A Survey on Large Language Model based Autonomous Agents](https://arxiv.org/abs/2308.11432)
- arXiv 2023
- Relevance: agent architecture baseline and terminology.
-
- 18. [Large Language Model based Multi-Agents: A Survey of Progress and Challenges](https://arxiv.org/abs/2402.01680)
- arXiv 2024
- Relevance: multi-agent coordination patterns and failure modes.
-
- 19. [AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation](https://arxiv.org/abs/2308.08155)
- arXiv 2023
- Relevance: role-based dialogue and tool-using multi-agent orchestration.
-
- 20. [MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework](https://arxiv.org/abs/2308.00352)
- arXiv 2023
- Relevance: structured role decomposition and pipeline-style collaboration.
-
- 21. [ChatDev: Communicative Agents for Software Development](https://aclanthology.org/2024.acl-long.810/)
- ACL 2024
- Relevance: communication protocol and software-task role separation.
-
- 22. [CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society](https://arxiv.org/abs/2303.17760)
- arXiv 2023
- Relevance: agent role prompting and cooperative interaction patterns.
-
- 23. [AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors](https://arxiv.org/abs/2308.10848)
- arXiv 2023
- Relevance: multi-agent environment framing and emergent collaboration.
-
- 24. [Scaling Large-Language-Model-based Multi-Agent Collaboration](https://arxiv.org/abs/2406.07155)
- arXiv 2024
- Relevance: scale behavior and coordination bottlenecks.
-
- 25. [Multi-Agent Collaboration via Evolving Orchestration](https://arxiv.org/abs/2505.19591)
- arXiv 2025
- Relevance: orchestration policy evolution and adaptive coordination.
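The README diff above keeps the Docker-only `PENTEST_TOR` table row while removing the prose example that showed how to enable it. A minimal sketch of composing that Tor-enabled invocation, using only values that appear in the diff (the `your_key` value is a placeholder, not a real credential):

```shell
# Sketch only: build (without executing) the Tor-enabled container run
# described by the PENTEST_TOR row in the README diff above.
CMD='docker run -it --rm \
  -e PENTEST_API_KEY="your_key" \
  -e PENTEST_BASE_URL="https://api.z.ai/api/anthropic" \
  -e PENTEST_MODEL="glm-4.7" \
  -e PENTEST_TOR=true \
  agnusdei1207/pentesting'
printf '%s\n' "$CMD"
```

Per the removed "Tor notes", `PENTEST_TOR` is honored only by the containerized runtime, so the flag belongs on `docker run` rather than in a local shell profile.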
@@ -5,7 +5,7 @@ import {
  createContextExtractor,
  getLLMClient,
  getShellSupervisorLifecycleSnapshot
- } from "./chunk-UIYY4RLA.js";
+ } from "./chunk-DDLHDNOM.js";
 import {
  AGENT_ROLES,
  EVENT_TYPES,
@@ -13,14 +13,14 @@ import {
  TOOL_NAMES,
  getProcessOutput,
  listBackgroundProcesses
- } from "./chunk-FJ7PENUK.js";
+ } from "./chunk-6M7LXEMY.js";
 import {
  DETECTION_PATTERNS,
  PROCESS_EVENTS,
  PROCESS_ROLES,
  getActiveProcessSummary,
  getProcessEventLog
- } from "./chunk-KAUE3MSR.js";
+ } from "./chunk-S5ZMXFHR.js";

 // src/engine/agent-tool/completion-box.ts
 function createCompletionBox() {