pwnkit-cli 0.3.3 → 0.3.4

package/dist/README.md ADDED
@@ -0,0 +1,305 @@
1
+ <p align="center">
2
+ <img src="assets/pwnkit-icon.gif" alt="pwnkit" width="80" />
3
+ </p>
4
+
5
+ <h1 align="center">pwnkit</h1>
6
+
7
+ <p align="center">
8
+ <strong>General-purpose autonomous pentesting framework</strong><br/>
9
+ <em>Scan LLM endpoints. Audit npm packages. Review source code. Re-exploit to kill false positives.</em>
10
+ </p>
11
+
12
+ <p align="center">
13
+ <a href="https://www.npmjs.com/package/pwnkit-cli"><img src="https://img.shields.io/npm/v/pwnkit-cli?color=crimson&style=flat-square" alt="npm version" /></a>
14
+ <a href="https://github.com/peaktwilight/pwnkit/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue?style=flat-square" alt="license" /></a>
15
+ <a href="https://github.com/peaktwilight/pwnkit/actions"><img src="https://img.shields.io/github/actions/workflow/status/peaktwilight/pwnkit/ci.yml?style=flat-square" alt="CI" /></a>
16
+ <a href="https://github.com/peaktwilight/pwnkit/stargazers"><img src="https://img.shields.io/github/stars/peaktwilight/pwnkit?style=flat-square&color=gold" alt="stars" /></a>
17
+ <a href="https://pwnkit.com"><img src="https://pwnkit.com/badge/peaktwilight/pwnkit" alt="pwnkit verified" /></a>
18
+ </p>
19
+
20
+ <p align="center">
21
+ <img src="assets/demo.gif" alt="pwnkit Demo" width="700" />
22
+ </p>
23
+
24
+ <p align="center">
25
+ <a href="#quick-start">Quick Start</a> &middot;
26
+ <a href="#commands">Commands</a> &middot;
27
+ <a href="#how-it-works">How It Works</a> &middot;
28
+ <a href="#what-pwnkit-scans">What It Scans</a> &middot;
29
+ <a href="#how-it-compares">Comparison</a> &middot;
30
+ <a href="#github-action">CI/CD</a> &middot;
31
+ <a href="#built-by">About</a>
32
+ </p>
33
+
34
+ ---
35
+
36
+ pwnkit is an open-source agentic security toolkit. A research agent discovers vulnerabilities across LLM endpoints, npm packages, and Git repositories, attacks them, and writes proof-of-concept (PoC) code. A blind verify agent, given ONLY the PoC and file path and not the research agent's reasoning, then independently reproduces each finding to **kill false positives**. No templates, no static rules: multi-turn agentic reasoning that thinks like an attacker.
37
+
38
+ One command. Zero config. Every finding re-exploited or dropped.
39
+
40
+ ## Quick Start
41
+
42
+ ```bash
+ # Scan an LLM endpoint
+ npx pwnkit-cli scan --target https://your-app.com/api/chat
+
+ # Audit an npm package for vulnerabilities
+ npx pwnkit-cli audit lodash
+
+ # Deep security review of a codebase
+ npx pwnkit-cli review ./my-ai-app
+
+ # Or just point pwnkit-cli at a target; it auto-detects what to do
+ npx pwnkit-cli express # audits npm package
+ npx pwnkit-cli ./my-repo # reviews source code
+ npx pwnkit-cli https://github.com/user/repo # clones and reviews
+ ```
57
+
58
+ That's it. pwnkit discovers your attack surface, launches targeted attacks, verifies findings, and generates a report. At the default depth this takes about 3 minutes (see [Scan Depth](#scan-depth)).
59
+
60
+ ### Auto-Detect
61
+
62
+ `pwnkit-cli <target>` figures out what you mean without explicit subcommands:
63
+
64
+ | Input | What pwnkit-cli does |
65
+ |-------|-----------------|
66
+ | `pwnkit-cli express` | Treats it as an npm package name and runs `audit` |
67
+ | `pwnkit-cli ./my-repo` | Detects a local path and runs `review` |
68
+ | `pwnkit-cli https://github.com/user/repo` | Clones the repo and runs `review` |
69
+ | `pwnkit-cli https://example.com/api/chat` | Detects an LLM endpoint URL and runs `scan` |
70
+
71
+ Explicit subcommands (`scan`, `audit`, `review`) still work — auto-detect is just a convenience layer on top.
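
The routing in the table above can be sketched as a small classifier. This is a minimal illustration, not pwnkit's actual implementation: `detectCommand` and its string heuristics are hypothetical, and a real detector would likely also check whether a path exists on disk and handle Windows-style paths.

```typescript
type Command = "scan" | "audit" | "review";

// Hypothetical helper: maps a raw CLI target to the subcommand from the table.
function detectCommand(target: string): Command {
  // Any http(s) URL is either a GitHub repository or an endpoint to probe.
  if (/^https?:\/\//i.test(target)) {
    const isGitHubRepo = /^https?:\/\/(www\.)?github\.com\/[^\/]+\/[^\/]+/i.test(target);
    return isGitHubRepo ? "review" : "scan";
  }

  // ".", "..", "./x", "../x" and absolute paths are treated as local source trees.
  const looksLikePath = /^(\.{1,2}(\/|$)|\/)/.test(target);
  if (looksLikePath) {
    return "review";
  }

  // Everything else is assumed to be an npm package name (optionally with a version).
  return "audit";
}
```

Under these assumptions, `detectCommand("express")` yields `"audit"` and `detectCommand("https://example.com/api/chat")` yields `"scan"`. Checking for a URL first keeps GitHub links from being mistaken for package names.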
72
+
73
+ ## Commands
74
+
75
+ All commands are available via `npx pwnkit-cli <command>`. Explicit subcommands are optional — thanks to auto-detect, `npx pwnkit-cli <target>` works for most use cases (see [Auto-Detect](#auto-detect) above).
76
+
77
+ pwnkit ships five commands — from quick API probes to deep source-level audits:
78
+
79
+ | Command | What It Does | Example |
80
+ |---------|-------------|---------|
81
+ | **`scan`** | Probe LLM endpoints and AI APIs for vulnerabilities | `npx pwnkit-cli scan --target https://api.example.com/chat` |
82
+ | **`audit`** | Install and security-audit any npm package with static analysis + AI review | `npx pwnkit-cli audit express@4.18.2` |
83
+ | **`review`** | Deep source code security review of a local repo or GitHub URL | `npx pwnkit-cli review https://github.com/user/repo` |
84
+ | **`history`** | Browse past scans with status, depth, findings count, and duration | `npx pwnkit-cli history --limit 20` |
85
+ | **`findings`** | Query, filter, and inspect verified findings across all scans | `npx pwnkit-cli findings list --severity critical` |
86
+
87
+ ## How It Works
88
+
89
+ pwnkit runs autonomous AI agents in a research-then-verify pipeline. Each agent uses tools (`read_file`, `run_command`, `send_prompt`, `save_finding`) and makes multi-turn decisions — adapting its strategy based on what it learns:
90
+
91
+ ```mermaid
+ graph LR
+ A["Research\ndiscover + attack + PoC\nsingle agent session"] --> B["Blind Verify\ngets ONLY PoC + path\nno reasoning, no bias"]
+ B --> C["Report\nSARIF, Markdown, JSON\nonly confirmed findings"]
+ B -->|can't reproduce| D["Killed"]
+
+ style A fill:#1a1a2e,stroke:#DC2626,color:#fff
+ style B fill:#1a1a2e,stroke:#3B82F6,color:#fff
+ style C fill:#1a1a2e,stroke:#8B5CF6,color:#fff
+ style D fill:#1a1a2e,stroke:#6B7280,color:#6B7280
+ ```
102
+
103
+ | Agent | Role | What It Does |
104
+ |-------|------|-------------|
105
+ | **Research** | Discover + Attack + PoC | Maps endpoints, detects models, extracts system prompts, crafts multi-turn attacks (prompt injection, jailbreaks, tool poisoning, data exfiltration), and writes proof-of-concept code — all in one agent session |
106
+ | **Verify** | Blind validation | Gets ONLY the PoC code and file path — not the research agent's reasoning. Independently traces data flow and reproduces each finding. Can't reproduce? Killed as false positive |
107
+ | **Report** | Output | SARIF for GitHub Security tab, Markdown for humans, JSON for pipelines — only confirmed findings with severity scores and remediation |
108
+
109
+ The **blind verification is the differentiator.** The verify agent can't be biased by the research agent's reasoning — same principle as double-blind peer review. No more triaging 200 "possible prompt injections" that turn out to be nothing.
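
The blind hand-off described above can be sketched as follows. This is a minimal illustration under stated assumptions, not pwnkit's code: the `Finding` and `VerifyInput` shapes, the `Verifier` type, and `blindVerify` are all hypothetical names, and the verifier is synchronous here only to keep the sketch short (a real verify agent would run asynchronously).

```typescript
// Hypothetical shapes for illustration; pwnkit's real data model may differ.
interface Finding {
  id: string;
  reasoning: string; // research agent's notes, never shown to the verifier
  poc: string; // proof-of-concept code
  filePath: string;
}

// The verifier's entire view of a finding: PoC and path only.
interface VerifyInput {
  poc: string;
  filePath: string;
}

// Stand-in for the verify agent: returns true when the PoC reproduces.
type Verifier = (input: VerifyInput) => boolean;

function blindVerify(
  findings: Finding[],
  verify: Verifier
): { confirmed: Finding[]; killed: Finding[] } {
  const confirmed: Finding[] = [];
  const killed: Finding[] = [];
  for (const finding of findings) {
    // Build a fresh object so no extra fields (such as reasoning) leak through.
    const input: VerifyInput = { poc: finding.poc, filePath: finding.filePath };
    if (verify(input)) {
      confirmed.push(finding);
    } else {
      killed.push(finding); // could not reproduce: treated as a false positive
    }
  }
  return { confirmed, killed };
}
```

The design point is the narrow `VerifyInput` type: the verifier cannot see the research agent's reasoning because it is never passed in.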
110
+
111
+ ## What pwnkit Scans
112
+
113
+ | Target | Command | How |
114
+ |--------|---------|-----|
115
+ | **LLM Endpoints** — ChatGPT, Claude, Llama APIs, custom chatbots | `pwnkit-cli scan --target <url>` | HTTP probing + multi-turn agent attacks |
116
+ | **npm Packages** — Dependency supply chain, malicious code | `pwnkit-cli audit <package>` | Installs in sandbox, runs semgrep + AI code review |
117
+ | **Git Repositories** — Source-level security review | `pwnkit-cli review <path-or-url>` | Deep analysis with Claude Code, Codex, or Gemini CLI |
118
+ | **Auto-detect** — Give it anything | `pwnkit-cli <target>` | URL, package name, or path — pwnkit-cli figures it out |
119
+
120
+ ## Example Output
121
+
122
+ See the demo GIF at the top of this README for real scan output, or run it yourself:
123
+
124
+ ```bash
+ npx pwnkit-cli scan --target https://your-app.com/api/chat --depth quick
+ ```
127
+
128
+ For a verbose view with the animated attack replay:
129
+
130
+ ```bash
+ npx pwnkit-cli scan --target https://your-app.com/api/chat --verbose
+ ```
133
+
134
+ ## Scan Depth
135
+
136
+ | Depth | Test Cases | Time |
137
+ |-------|-----------|------|
138
+ | `quick` | ~15 | ~1 min |
139
+ | `default` | ~50 | ~3 min |
140
+ | `deep` | ~150 | ~10 min |
141
+
142
+ pwnkit is an agentic harness — bring your own AI. Use your API key (OpenRouter, Anthropic, OpenAI, Ollama), or use the Claude Code CLI or Codex CLI with your existing subscription via `--runtime claude` or `--runtime codex`.
143
+
144
+ ```bash
+ # Quick scan for CI
+ npx pwnkit-cli scan --target https://api.example.com/chat --depth quick
+
+ # Deep audit before launch
+ npx pwnkit-cli scan --target https://api.example.com/chat --depth deep
+
+ # Deep scan with Claude Code CLI
+ npx pwnkit-cli scan --target https://api.example.com/chat --depth deep --runtime claude
+
+ # Audit an npm package
+ npx pwnkit-cli audit react --depth deep --runtime claude
+
+ # Review a GitHub repo
+ npx pwnkit-cli review https://github.com/user/repo --runtime codex --depth deep
+
+ # Auto-detect: just give it a target
+ npx pwnkit-cli express
+ npx pwnkit-cli ./my-repo
+ npx pwnkit-cli https://api.example.com
+ ```
165
+
166
+ ## Runtime Modes
167
+
168
+ Bring your own agent CLI — pwnkit orchestrates it:
169
+
170
+ | Runtime | Flag | Best For |
171
+ |---------|------|----------|
172
+ | `api` | `--runtime api` | CI, quick scans — uses your API key (OpenRouter, Anthropic, OpenAI). Default |
173
+ | `claude` | `--runtime claude` | Deep analysis — spawns Claude Code CLI with your subscription |
174
+ | `codex` | `--runtime codex` | Source analysis — spawns Codex CLI |
175
+ | `gemini` | `--runtime gemini` | Large context source analysis — spawns Gemini CLI |
176
+ | `auto` | `--runtime auto` | Auto-detects installed CLIs, picks best per stage |
177
+
178
+ ## How It Compares
179
+
180
+ | Feature | pwnkit | promptfoo | garak | semgrep | nuclei |
181
+ |---------|--------|-----------|-------|---------|--------|
182
+ | **Agentic multi-turn pipeline** | Yes — Autonomous agents with tool use | No — Single runner | No — Single runner | No — Rule-based | No — Template runner |
183
+ | **Verification (no false positives)** | Yes — Re-exploits to confirm | No | No | No | No |
184
+ | **LLM endpoint scanning** | Yes — Prompt injection, jailbreaks, exfil | Yes — Red-teaming | Yes — Probes | No | No |
185
+ | **npm package audit** | Yes — Semgrep + AI review | No | No | Yes — Rules only | No |
186
+ | **Source code review** | Yes — AI-powered deep analysis | No | No | Yes — Rules only | No |
187
+ | **OWASP LLM Top 10** | Yes — Covered | Partial | Partial | N/A | N/A |
188
+ | **SARIF + GitHub Security tab** | Yes | Yes | No | Yes | Yes |
189
+ | **One command, zero config** | Yes — `npx pwnkit-cli scan` | Needs YAML config | Needs Python setup | Needs rules config | Needs templates |
190
+ | **Open source** | Yes — Apache-2.0 | Yes — MIT (acquired by OpenAI) | Yes — Apache-2.0 | Yes — LGPL / Paid Pro | Yes — MIT |
191
+ | **Pricing** | Free + bring your own AI | Varies | Free (local) | Free (OSS) / Paid (Pro) | Free |
192
+
193
+ pwnkit isn't replacing semgrep or nuclei — it covers the AI-specific attack surface they can't see. Use them together.
194
+
195
+ ## GitHub Action
196
+
197
+ Add pwnkit to your CI/CD pipeline:
198
+
199
+ ```yaml
+ name: AI Security Scan
+ on: [push, pull_request]
+
+ permissions:
+   contents: read
+   security-events: write
+
+ jobs:
+   pwnkit:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Run pwnkit
+         uses: peaktwilight/pwnkit/action@v1
+         with:
+           target: ${{ secrets.STAGING_API_URL }}
+           depth: default # quick | default | deep
+           fail-on-severity: high # critical | high | medium | low | info | none
+         env:
+           OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
+
+       - name: Upload SARIF
+         uses: github/codeql-action/upload-sarif@v3
+         with:
+           sarif_file: pwnkit-report/report.sarif
+ ```
227
+
228
+ > **API Key Priority:** pwnkit checks for `OPENROUTER_API_KEY` first, then `ANTHROPIC_API_KEY`, then `OPENAI_API_KEY`. OpenRouter gives you access to many models (including free ones) through a single key at [openrouter.ai](https://openrouter.ai).
229
+
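
The lookup order in the note above amounts to a first-match search. A minimal sketch, assuming the environment is passed in as a plain record and that empty values count as unset (both are assumptions, and `pickApiKey` is a hypothetical name rather than a pwnkit function):

```typescript
// Priority order as stated in the note: OpenRouter, then Anthropic, then OpenAI.
const KEY_PRIORITY = ["OPENROUTER_API_KEY", "ANTHROPIC_API_KEY", "OPENAI_API_KEY"] as const;
type KeyName = (typeof KEY_PRIORITY)[number];

// Returns the first configured key, or undefined when none is set.
function pickApiKey(
  env: Record<string, string | undefined>
): { name: KeyName; value: string } | undefined {
  for (const name of KEY_PRIORITY) {
    const value = env[name];
    // Assumption: blank values are treated the same as missing ones.
    if (value !== undefined && value.trim() !== "") {
      return { name, value };
    }
  }
  return undefined;
}
```

In a Node.js process this would be called as `pickApiKey(process.env)`. With both `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` set, the Anthropic key wins.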
230
+ Findings show up directly in the **Security** tab of your repository.
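
One plausible reading of the `fail-on-severity` input in the workflow above is "fail the job when any finding is at or above this level, and never fail on `none`". The exact semantics are an assumption here, and `shouldFail` is a hypothetical helper used only to make that reading concrete:

```typescript
// Severities from least to most serious, matching the values listed in the workflow comment.
const SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];
type Threshold = Severity | "none";

// Returns true when at least one finding meets or exceeds the threshold.
function shouldFail(found: Severity[], threshold: Threshold): boolean {
  if (threshold === "none") {
    return false; // assumption: "none" disables the gate entirely
  }
  const minimum = SEVERITY_ORDER.indexOf(threshold);
  return found.some((severity) => SEVERITY_ORDER.indexOf(severity) >= minimum);
}
```

With `fail-on-severity: high`, a scan that reports only `medium` findings would pass under this reading, while one `critical` finding would fail it.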
231
+
232
+ ### Badge
233
+
234
+ Add a pwnkit badge to your README:
235
+
236
+ ```markdown
+ [![pwnkit](https://pwnkit.com/badge/YOUR_ORG/YOUR_REPO)](https://pwnkit.com)
+ ```
239
+
240
+ The badge auto-updates from your GitHub Actions scan results. It shows `verified` (green), finding counts (yellow or red), or `not scanned` (gray).
241
+
242
+ Also available as a [shields.io endpoint](https://shields.io/endpoint):
243
+ ```
+ https://img.shields.io/endpoint?url=https://pwnkit.com/badge/YOUR_ORG/YOUR_REPO/shield
+ ```
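
A shields.io endpoint badge is driven by a small JSON document with `schemaVersion`, `label`, `message`, and `color` fields. The sketch below shows how the three badge states described above could map onto that schema. The `ScanSummary` shape, the message wording, and the rule for choosing red over yellow are assumptions for illustration, not the actual pwnkit.com response:

```typescript
// Response shape of the shields.io endpoint badge schema.
interface ShieldsEndpoint {
  schemaVersion: 1;
  label: string;
  message: string;
  color: string;
}

// Hypothetical summary of the latest scan; field names are illustrative.
interface ScanSummary {
  findings: number;
  hasHighOrCritical: boolean;
}

function badgeFor(summary: ScanSummary | undefined): ShieldsEndpoint {
  const base = { schemaVersion: 1 as const, label: "pwnkit" };
  if (summary === undefined) {
    return { ...base, message: "not scanned", color: "lightgrey" };
  }
  if (summary.findings === 0) {
    return { ...base, message: "verified", color: "green" };
  }
  const noun = summary.findings === 1 ? "finding" : "findings";
  // Assumption: red for any high or critical finding, yellow otherwise.
  const color = summary.hasHighOrCritical ? "red" : "yellow";
  return { ...base, message: `${summary.findings} ${noun}`, color };
}
```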
246
+
247
+ ## Findings Management
248
+
249
+ Every finding is persisted in a local SQLite database. Query across scans:
250
+
251
+ ```bash
+ # List critical findings
+ npx pwnkit-cli findings list --severity critical
+
+ # Filter by category
+ npx pwnkit-cli findings list --category prompt-injection --status confirmed
+
+ # Inspect a specific finding with full evidence
+ npx pwnkit-cli findings show NF-001
+
+ # Browse scan history
+ npx pwnkit-cli history --limit 10
+ ```
264
+
265
+ Finding lifecycle: `discovered → verified → confirmed → scored → reported` (or `false-positive` if verification fails).
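
The lifecycle above reads naturally as a small state machine. A minimal sketch, assuming that `false-positive` is reachable only from `discovered` (when verification fails) and that both `reported` and `false-positive` are terminal; the README does not spell out either detail, and `TRANSITIONS`, `canTransition`, and `advance` are hypothetical names:

```typescript
type Status =
  | "discovered"
  | "verified"
  | "confirmed"
  | "scored"
  | "reported"
  | "false-positive";

// Allowed next states for each status, following the documented order.
const TRANSITIONS: Record<Status, readonly Status[]> = {
  discovered: ["verified", "false-positive"], // assumption: failed verification branches here
  verified: ["confirmed"],
  confirmed: ["scored"],
  scored: ["reported"],
  reported: [],
  "false-positive": [],
};

function canTransition(from: Status, to: Status): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// Moves a finding forward, rejecting any step the lifecycle does not allow.
function advance(from: Status, to: Status): Status {
  if (!canTransition(from, to)) {
    throw new Error(`illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```

Encoding the transitions as data makes it easy to reject shortcuts such as `discovered` straight to `reported`.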
266
+
267
+ ## Roadmap
268
+
269
+ - [x] Core autonomous agent pipeline (research, blind verify, report)
270
+ - [x] OWASP LLM Top 10 coverage
271
+ - [x] SARIF output + GitHub Action
272
+ - [x] npm package auditing
273
+ - [x] Source code review (local + GitHub)
274
+ - [x] Multi-runtime support (Claude, Codex, Gemini)
275
+ - [x] Multi-turn agentic attacks (agents adapt payloads based on responses)
276
+ - [ ] MCP server scanning (tool poisoning, schema abuse)
277
+ - [ ] Web pentesting mode (SQLi, XSS, SSRF, auth bypass, IDOR)
278
+ - [ ] RAG pipeline security (poisoning, extraction)
279
+ - [ ] Agentic workflow testing (multi-tool chains)
280
+ - [ ] VS Code extension
281
+ - [ ] Team dashboard & historical tracking
282
+ - [ ] SOC 2 / compliance report generation
283
+
284
+ ## Built By
285
+
286
+ Created by a security researcher with [7 published CVEs](https://doruk.ch/blog) across node-forge, mysql2, uptime-kuma, liquidjs, picomatch, and jspdf.
287
+
288
+ pwnkit is a general-purpose autonomous pentesting framework. It exists because modern attack surfaces — LLM endpoints, npm supply chains, AI-powered codebases — require agents that adapt, not static rules that don't. You can't `nmap` a language model. You can't write a rule for a jailbreak that hasn't been invented yet. Static analysis alone misses logical flaws and semantic vulnerabilities that only an agent tracing data flow can find.
289
+
290
+ pwnkit uses autonomous agents that think like attackers, adapt their strategy mid-scan, and re-exploit every finding before reporting it. The result: real vulnerabilities, zero noise.
291
+
292
+ ## Contributing
293
+
294
+ Contributions welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
295
+
296
+ ```bash
+ git clone https://github.com/peaktwilight/pwnkit.git
+ cd pwnkit
+ pnpm install
+ pnpm test
+ ```
302
+
303
+ ## License
304
+
305
+ [Apache 2.0](LICENSE) — use it, fork it, ship it.