ship-safe 4.2.0 → 5.0.0

package/README.md CHANGED
@@ -9,23 +9,27 @@
9
9
  <a href="https://github.com/asamassekou10/ship-safe/actions/workflows/ci.yml"><img src="https://github.com/asamassekou10/ship-safe/actions/workflows/ci.yml/badge.svg" alt="CI" /></a>
10
10
  <a href="https://nodejs.org"><img src="https://img.shields.io/node/v/ship-safe" alt="Node.js version" /></a>
11
11
  <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License: MIT" /></a>
12
+ <a href="https://github.com/asamassekou10/ship-safe/stargazers"><img src="https://img.shields.io/github/stars/asamassekou10/ship-safe?style=social" alt="GitHub stars" /></a>
12
13
  </p>
13
14
 
14
15
  ---
15
16
 
16
- 12 security agents. 50+ attack classes. One command.
17
+ 16 security agents. 80+ attack classes. One command.
17
18
 
18
- **Ship Safe v4.0** is an AI-powered security platform that runs 12 specialized agents against your codebase — scanning for secrets, injection vulnerabilities, auth bypass, SSRF, supply chain attacks, Docker/Terraform misconfigs, CI/CD pipeline poisoning, LLM security issues, and more. It produces a prioritized remediation plan so you know exactly what to fix first.
19
+ **Ship Safe v5.0** is an AI-powered security platform that runs 16 specialized agents in parallel against your codebase — scanning for secrets, injection vulnerabilities, auth bypass, SSRF, supply chain attacks, Supabase RLS misconfigs, Docker/Terraform/Kubernetes misconfigs, CI/CD pipeline poisoning, LLM/agentic AI security, MCP server misuse, RAG poisoning, PII compliance, and more. LLM-powered deep analysis verifies exploitability of critical findings. Secrets verification probes provider APIs to check if leaked keys are still active. A dedicated CI command (`ship-safe ci`) integrates into any pipeline with threshold-based gating and SARIF output.
19
20
 
20
21
  ---
21
22
 
22
23
  ## Quick Start
23
24
 
24
25
  ```bash
25
- # Full security audit — secrets + 12 agents + deps + remediation plan
26
+ # Full security audit — secrets + 16 agents + deps + remediation plan
26
27
  npx ship-safe audit .
27
28
 
28
- # Red team scan only (12 agents, 50+ attack classes)
29
+ # LLM-powered deep analysis (Anthropic, OpenAI, Google, Ollama)
30
+ npx ship-safe audit . --deep
31
+
32
+ # Red team scan only (16 agents, 80+ attack classes)
29
33
  npx ship-safe red-team .
30
34
 
31
35
  # Quick secret scan
@@ -33,6 +37,19 @@ npx ship-safe scan .
33
37
 
34
38
  # Security health score (0-100)
35
39
  npx ship-safe score .
40
+
41
+ # CI/CD pipeline mode — compact output, exit codes, SARIF
42
+ npx ship-safe ci .
43
+
44
+ # Accept current findings, only report regressions
45
+ npx ship-safe baseline .
46
+ npx ship-safe audit . --baseline
47
+
48
+ # Check if leaked secrets are still active
49
+ npx ship-safe audit . --verify
50
+
51
+ # Environment diagnostics
52
+ npx ship-safe doctor
36
53
  ```
37
54
 
38
55
  ![ship-safe terminal demo](.github/assets/ship%20safe%20terminal.jpg)
@@ -49,11 +66,11 @@ npx ship-safe audit .
49
66
 
50
67
  ```
51
68
  ════════════════════════════════════════════════════════════
52
- Ship Safe v4.0 — Full Security Audit
69
+ Ship Safe v5.0 — Full Security Audit
53
70
  ════════════════════════════════════════════════════════════
54
71
 
55
72
  [Phase 1/4] Scanning for secrets... ✔ 49 found
56
- [Phase 2/4] Running 12 security agents... ✔ 103 findings
73
+ [Phase 2/4] Running 16 security agents... ✔ 103 findings
57
74
  [Phase 3/4] Auditing dependencies... ✔ 44 CVEs
58
75
  [Phase 4/4] Computing security score... ✔ 25/100 F
59
76
 
@@ -80,38 +97,58 @@ npx ship-safe audit .
80
97
 
81
98
  **What it runs:**
82
99
  1. **Secret scan** — 50+ patterns with entropy scoring (API keys, passwords, tokens)
83
- 2. **12 security agents** — injection, auth, SSRF, supply chain, config, LLM, mobile, git history, CI/CD, API
100
+ 2. **16 security agents** — run in parallel with per-agent timeouts and framework-aware filtering (injection, auth, SSRF, supply chain, config, Supabase RLS, LLM, MCP, agentic AI, RAG, PII, mobile, git history, CI/CD, API)
84
101
  3. **Dependency audit** — npm/pip/bundler CVE scanning
85
- 4. **Score computation** — 8-category weighted scoring (0-100, A-F)
86
- 5. **Remediation plan** — prioritized fix list grouped by severity
87
- 6. **HTML report** — standalone dark-themed report with table of contents
102
+ 4. **Secrets verification** — probes provider APIs (GitHub, Stripe, OpenAI, etc.) to check if leaked keys are still active
103
+ 5. **Deep analysis** — LLM-powered taint analysis verifies exploitability of critical/high findings (optional)
104
+ 6. **Score computation** — confidence-weighted scoring across 8 categories (0-100, A-F)
105
+ 7. **Context-aware confidence tuning** — downgrades findings in test files, docs, and comments
106
+ 8. **Remediation plan** — prioritized fix list grouped by severity
107
+ 9. **HTML report** — standalone dark-themed report with code context
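The diff does not spell out the "entropy scoring" mentioned in step 1. One common approach is Shannon entropy over the characters of a candidate string, sketched below in Python. This is an illustration, not ship-safe's actual implementation: the function names, the 16-character minimum, and the 3.5-bit threshold are all assumptions.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical threshold: random-looking tokens tend to score higher than words.
ENTROPY_THRESHOLD = 3.5


def looks_like_secret(candidate: str, min_length: int = 16) -> bool:
    """Flag long, high-entropy strings as possible secrets (heuristic only)."""
    return (
        len(candidate) >= min_length
        and shannon_entropy(candidate) >= ENTROPY_THRESHOLD
    )
```

A scanner would typically apply a check like this only to strings that already match a key-shaped pattern, which keeps ordinary long identifiers from being flagged.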
88
108
 
89
109
  **Flags:**
90
110
  - `--json` — structured JSON output (clean for piping)
91
111
  - `--sarif` — SARIF format for GitHub Code Scanning
112
+ - `--csv` — CSV export for spreadsheets
113
+ - `--md` — Markdown report
92
114
  - `--html [file]` — custom HTML report path (default: `ship-safe-report.html`)
115
+ - `--compare` — show per-category score delta vs. last scan
116
+ - `--timeout <ms>` — per-agent timeout (default: 30s)
93
117
  - `--no-deps` — skip dependency audit
94
118
  - `--no-ai` — skip AI classification
95
119
  - `--no-cache` — force full rescan (ignore cached results)
120
+ - `--baseline` — only show findings not in the baseline
121
+ - `--pdf [file]` — generate PDF report (requires Chrome/Chromium)
122
+ - `--deep` — LLM-powered taint analysis for critical/high findings
123
+ - `--local` — use local Ollama model for deep analysis
124
+ - `--model <model>` — LLM model to use for deep/AI analysis
125
+ - `--budget <cents>` — max spend in cents for deep analysis (default: 50)
126
+ - `--verify` — check if leaked secrets are still active (probes provider APIs)
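Running agents in parallel with a per-agent timeout (step 2 and the `--timeout` flag) can be sketched with Python's `asyncio`. This is only an illustration of the pattern under assumed names (`run_agent`, `run_all`, and the two demo agents are hypothetical), and it takes the timeout in seconds rather than the milliseconds the flag accepts.

```python
import asyncio


async def run_agent(name, agent, timeout_s):
    """Run one agent coroutine; a timeout yields no findings plus an error note."""
    try:
        findings = await asyncio.wait_for(agent(), timeout=timeout_s)
        return name, findings, None
    except asyncio.TimeoutError:
        return name, [], f"timed out after {timeout_s}s"


async def run_all(agents, timeout_s=30.0):
    """Run every agent concurrently so one slow agent cannot block the rest."""
    tasks = [run_agent(name, agent, timeout_s) for name, agent in agents.items()]
    return await asyncio.gather(*tasks)


# Hypothetical demo agents.
async def fast_agent():
    return ["finding-1"]


async def slow_agent():
    await asyncio.sleep(1.0)
    return ["never reported"]
```

Calling `asyncio.run(run_all({"fast": fast_agent, "slow": slow_agent}, timeout_s=0.05))` returns the fast agent's findings and a timeout note for the slow one, in the order the agents were given.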
96
127
 
97
128
  ---
98
129
 
99
- ## 12 Security Agents
130
+ ## 16 Security Agents
100
131
 
101
132
  | Agent | Category | What It Detects |
102
133
  |-------|----------|-----------------|
103
- | **InjectionTester** | Code Vulns | SQL/NoSQL injection, command injection, code injection (eval), XSS, path traversal, XXE, ReDoS, prototype pollution |
104
- | **AuthBypassAgent** | Auth | JWT vulnerabilities (alg:none, weak secrets), cookie security, CSRF, OAuth misconfig, BOLA/IDOR, weak crypto, timing attacks, TLS bypass |
134
+ | **InjectionTester** | Code Vulns | SQL/NoSQL injection, command injection, code injection (eval), XSS, path traversal, XXE, ReDoS, prototype pollution, Python f-string SQL injection, Python subprocess shell injection |
135
+ | **AuthBypassAgent** | Auth | JWT vulnerabilities (alg:none, weak secrets), cookie security, CSRF, OAuth misconfig, BOLA/IDOR, weak crypto, timing attacks, TLS bypass, Django `DEBUG = True`, Flask hardcoded secret keys |
105
136
  | **SSRFProber** | SSRF | User input in fetch/axios, cloud metadata endpoints, internal IPs, redirect following |
106
- | **SupplyChainAudit** | Supply Chain | Typosquatting (Levenshtein distance), git/URL dependencies, wildcard versions, suspicious install scripts |
107
- | **ConfigAuditor** | Config | Dockerfile (running as root, :latest tags), Terraform (public S3, open SG), Kubernetes (privileged containers), CORS, CSP, Firebase, Nginx |
137
+ | **SupplyChainAudit** | Supply Chain | Typosquatting (Levenshtein distance), git/URL dependencies, wildcard versions, suspicious install scripts, dependency confusion, scoped packages without registry pinning |
138
+ | **ConfigAuditor** | Config | Dockerfile (running as root, :latest tags), Terraform (public S3/RDS, open SG, CloudFront HTTP, Lambda admin, S3 no versioning), Kubernetes (privileged containers, `:latest` tags, missing NetworkPolicy), CORS, CSP, Firebase, Nginx |
139
+ | **SupabaseRLSAgent** | Auth | Supabase Row Level Security — `service_role` key in client code, `CREATE TABLE` without RLS, anon key inserts, unprotected storage operations |
108
140
  | **LLMRedTeam** | AI/LLM | OWASP LLM Top 10 — prompt injection, excessive agency, system prompt leakage, unbounded consumption, RAG poisoning |
141
+ | **MCPSecurityAgent** | AI/LLM | MCP server security — unvalidated tool inputs, missing auth, excessive permissions, tool poisoning |
142
+ | **AgenticSecurityAgent** | AI/LLM | OWASP Agentic AI Top 10 — agent hijacking, privilege escalation, unsafe code execution, memory poisoning |
143
+ | **RAGSecurityAgent** | AI/LLM | RAG pipeline security — unvalidated embeddings, context injection, document poisoning, vector DB access control |
144
+ | **PIIComplianceAgent** | Compliance | PII detection — SSNs, credit cards, emails, phone numbers in source code, logs, and configs |
109
145
  | **MobileScanner** | Mobile | OWASP Mobile Top 10 2024 — insecure storage, WebView JS injection, HTTP endpoints, excessive permissions, debug mode |
110
146
  | **GitHistoryScanner** | Secrets | Leaked secrets in git commit history (checks if still active in working tree) |
111
147
  | **CICDScanner** | CI/CD | OWASP CI/CD Top 10 — pipeline poisoning, unpinned actions, secret logging, self-hosted runners, script injection |
112
- | **APIFuzzer** | API | Routes without auth, missing input validation, mass assignment, unrestricted file upload, GraphQL introspection, debug endpoints |
148
+ | **APIFuzzer** | API | Routes without auth, missing input validation, mass assignment, unrestricted file upload, GraphQL introspection, debug endpoints, missing rate limiting, OpenAPI spec security issues |
113
149
  | **ReconAgent** | Recon | Attack surface discovery — frameworks, languages, auth patterns, databases, cloud providers, IaC, CI/CD pipelines |
114
- | **ScoringEngine** | Scoring | 8-category weighted scoring with trend tracking |
150
+
151
+ **Post-processors:** ScoringEngine (8-category weighted scoring), VerifierAgent (secrets liveness verification), DeepAnalyzer (LLM-powered taint analysis)
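The SupplyChainAudit row names Levenshtein distance as its typosquatting signal. A minimal Python sketch of that idea follows. The package list, the `possible_typosquats` helper, and the distance cutoff of 2 are assumptions for illustration, not values taken from ship-safe.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


# Hypothetical list of popular package names to compare against.
POPULAR_PACKAGES = ("express", "lodash", "react", "axios")


def possible_typosquats(name: str, max_distance: int = 2) -> list[str]:
    """Return popular packages that `name` is close to but not equal to."""
    return [
        known for known in POPULAR_PACKAGES
        if 0 < levenshtein(name, known) <= max_distance
    ]
```

A distance of zero is excluded on purpose: an exact match is the real package, not a lookalike.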
115
152
 
116
153
  ---
117
154
 
@@ -123,7 +160,7 @@ npx ship-safe audit .
123
160
  # Full audit with remediation plan + HTML report
124
161
  npx ship-safe audit .
125
162
 
126
- # Red team: 12 agents, 50+ attack classes
163
+ # Red team: 16 agents, 80+ attack classes
127
164
  npx ship-safe red-team .
128
165
  npx ship-safe red-team . --agents injection,auth # Run specific agents
129
166
  npx ship-safe red-team . --html report.html # HTML report
@@ -150,11 +187,57 @@ npx ship-safe agent .
150
187
 
151
188
  # Auto-fix hardcoded secrets: rewrite code + write .env
152
189
  npx ship-safe remediate .
190
+ npx ship-safe remediate . --all # Also fix agent findings (TLS, debug, XSS, etc.)
153
191
 
154
192
  # Revoke exposed keys — opens provider dashboards
155
193
  npx ship-safe rotate .
156
194
  ```
157
195
 
196
+ ### Baseline Management
197
+
198
+ ```bash
199
+ # Accept current findings as baseline
200
+ npx ship-safe baseline .
201
+
202
+ # Audit showing only new findings since baseline
203
+ npx ship-safe audit . --baseline
204
+
205
+ # Show what changed since baseline
206
+ npx ship-safe baseline --diff
207
+
208
+ # Remove baseline
209
+ npx ship-safe baseline --clear
210
+ ```
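The diff does not describe how baseline filtering works internally. One plausible way to report only regressions is to fingerprint each finding and drop those already recorded, as in this Python sketch. The finding fields (`rule`, `file`, `match`) and both function names are hypothetical.

```python
import hashlib


def fingerprint(finding: dict) -> str:
    """Stable ID built from rule, file, and matched text (not the line number)."""
    key = "|".join([finding["rule"], finding["file"], finding["match"]])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def new_findings(current: list[dict], baseline: list[dict]) -> list[dict]:
    """Return findings from `current` whose fingerprint is absent from `baseline`."""
    accepted = {fingerprint(f) for f in baseline}
    return [f for f in current if fingerprint(f) not in accepted]
```

Leaving the line number out of the fingerprint is a design choice: an accepted finding that merely moves within a file stays suppressed instead of reappearing as new.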
211
+
212
+ ### CI/CD Pipeline
213
+
214
+ ```bash
215
+ # CI mode — compact output, exit codes, threshold gating
216
+ npx ship-safe ci .
217
+ npx ship-safe ci . --threshold 80 # Custom passing score
218
+ npx ship-safe ci . --fail-on critical # Fail on severity
219
+ npx ship-safe ci . --sarif out.sarif # SARIF for GitHub
220
+ ```
221
+
222
+ ### Deep Analysis & Verification
223
+
224
+ ```bash
225
+ # LLM-powered deep analysis (Anthropic/OpenAI/Google/Ollama)
226
+ npx ship-safe audit . --deep
227
+ npx ship-safe audit . --deep --local # Use local Ollama
228
+ npx ship-safe audit . --deep --budget 50 # Cap spend at 50 cents
229
+
230
+ # Check if leaked secrets are still active
231
+ npx ship-safe audit . --verify
232
+ ```
233
+
234
+ ### Diagnostics
235
+
236
+ ```bash
237
+ # Environment check — Node.js, git, npm, API keys, cache, version
238
+ npx ship-safe doctor
239
+ ```
240
+
158
241
  ### Infrastructure Commands
159
242
 
160
243
  ```bash
@@ -192,9 +275,11 @@ claude plugin add github:asamassekou10/ship-safe
192
275
 
193
276
  | Command | Description |
194
277
  |---------|-------------|
195
- | `/ship-safe` | Full security audit — 12 agents, remediation plan, auto-fix |
278
+ | `/ship-safe` | Full security audit — 16 agents, remediation plan, auto-fix |
196
279
  | `/ship-safe-scan` | Quick scan for leaked secrets |
197
280
  | `/ship-safe-score` | Security health score (0-100) |
281
+ | `/ship-safe-deep` | LLM-powered deep taint analysis |
282
+ | `/ship-safe-ci` | CI/CD pipeline setup guide |
198
283
 
199
284
  Claude interprets the results, explains findings in plain language, and can fix issues directly in your codebase.
200
285
 
@@ -214,6 +299,10 @@ Ship Safe caches file hashes and findings in `.ship-safe/context.json`. On subse
214
299
 
215
300
  The cache is stored in `.ship-safe/` which is automatically excluded from scans.
216
301
 
302
+ ### LLM Response Caching
303
+
304
+ When AI classification is enabled (pass `--no-ai` to disable it), results are cached in `.ship-safe/llm-cache.json` with a 7-day TTL. Repeated scans within that window reuse cached classifications instead of calling the LLM again, which lowers API costs.
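A file-backed cache with a time-to-live can be sketched in a few lines of Python. The class below is an assumption-laden illustration, not the tool's code: the `TTLCache` name, the entry layout (`value`, `stored_at`), and the injectable clock are all hypothetical.

```python
import json
import time
from pathlib import Path

SEVEN_DAYS_S = 7 * 24 * 60 * 60


class TTLCache:
    """JSON-file cache whose entries expire after `ttl_s` seconds."""

    def __init__(self, path, ttl_s=SEVEN_DAYS_S, clock=time.time):
        self.path = Path(path)
        self.ttl_s = ttl_s
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self.entries = (
            json.loads(self.path.read_text()) if self.path.exists() else {}
        )

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None or self.clock() - entry["stored_at"] > self.ttl_s:
            return None  # missing or expired
        return entry["value"]

    def set(self, key, value):
        self.entries[key] = {"value": value, "stored_at": self.clock()}
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.entries))
```

A natural cache key would be a hash of the finding plus its surrounding code, so that an edited file misses the cache and is classified again.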
305
+
217
306
  ---
218
307
 
219
308
  ## Smart `.gitignore` Handling
@@ -247,7 +336,7 @@ Auto-detected from environment variables. No API key required for scanning — A
247
336
 
248
337
  ## Scoring System
249
338
 
250
- Starts at 100. Each finding deducts points by severity and category.
339
+ Starts at 100. Each finding deducts points by severity and category, weighted by confidence level (high: 100%, medium: 60%, low: 30%) to reduce noise from heuristic patterns.
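The confidence weighting can be made concrete with a small Python sketch. Only the three confidence weights come from the README text. The per-severity deductions, the category caps, and the function name are hypothetical placeholders, since this diff does not show them.

```python
CONFIDENCE_WEIGHT = {"high": 1.0, "medium": 0.6, "low": 0.3}  # from the README

# Hypothetical values: the diff does not list per-severity deductions or caps.
SEVERITY_PENALTY = {"critical": 10, "high": 5, "medium": 2, "low": 1}
CATEGORY_CAP = {"secrets": 15, "injection": 15}
DEFAULT_CAP = 10


def compute_score(findings):
    """Start at 100 and subtract capped, confidence-weighted deductions."""
    deductions = {}
    for f in findings:
        points = SEVERITY_PENALTY[f["severity"]] * CONFIDENCE_WEIGHT[f["confidence"]]
        deductions[f["category"]] = deductions.get(f["category"], 0.0) + points
    total = sum(
        min(points, CATEGORY_CAP.get(category, DEFAULT_CAP))
        for category, points in deductions.items()
    )
    return max(0, round(100 - total))
```

Under these placeholder numbers, a critical finding costs 10 points at high confidence but only 3 at low confidence, and no single category can remove more than its cap.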
251
340
 
252
341
  **8 Categories** (with weight caps):
253
342
 
@@ -291,6 +380,24 @@ npx ship-safe policy init
291
380
 
292
381
  ## CI/CD Integration
293
382
 
383
+ The dedicated `ci` command is optimized for pipelines — compact output, exit codes, threshold-based gating:
384
+
385
+ ```bash
386
+ # Basic CI — fail if score < 75
387
+ npx ship-safe ci .
388
+
389
+ # Strict — fail on any critical finding
390
+ npx ship-safe ci . --fail-on critical
391
+
392
+ # Custom threshold + SARIF for GitHub Security tab
393
+ npx ship-safe ci . --threshold 80 --sarif results.sarif
394
+
395
+ # Only check new findings (not in baseline)
396
+ npx ship-safe ci . --baseline
397
+ ```
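The gating rules in the commands above reduce to a small decision function. The Python sketch below is a guess at the logic, not the CLI's source: the function name and the exit code of 1 are hypothetical, and it assumes `--fail-on` matches the named severity and anything more severe.

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def gate_exit_code(score, findings, threshold=75, fail_on=None):
    """Return 0 to pass the pipeline, 1 to fail it."""
    if score < threshold:
        return 1
    if fail_on is not None:
        floor = SEVERITY_ORDER.index(fail_on)
        # Assumption: --fail-on matches the named severity and anything above it.
        if any(SEVERITY_ORDER.index(f["severity"]) >= floor for f in findings):
            return 1
    return 0
```

Combining both checks means a project with a passing score can still fail the gate when a single finding reaches the chosen severity.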
398
+
399
+ **GitHub Actions example:**
400
+
294
401
  ```yaml
295
402
  # .github/workflows/security.yml
296
403
  name: Security Audit
@@ -303,17 +410,17 @@ jobs:
303
410
  steps:
304
411
    - uses: actions/checkout@v4
305
412
 
306
-   - name: Full security audit
307
-     run: npx ship-safe audit . --no-ai --json
308
-
309
-   - name: Upload SARIF to GitHub Security tab
310
-     run: npx ship-safe audit . --no-ai --sarif > results.sarif
413
+   - name: Security gate
414
+     run: npx ship-safe ci . --threshold 75 --sarif results.sarif
311
415
 
312
416
    - uses: github/codeql-action/upload-sarif@v3
417
+     if: always()
313
418
      with:
314
419
        sarif_file: results.sarif
315
420
  ```
316
421
 
422
+ **Export formats:** `--json`, `--sarif`, `--csv`, `--md`, `--html`, `--pdf`
423
+
317
424
  ---
318
425
 
319
426
  ## Suppress False Positives
@@ -343,6 +450,7 @@ docs/
343
450
  | **OWASP Top 10 Mobile 2024** | M1-M10: Improper Credential Usage, Inadequate Supply Chain, Insecure Auth, Insufficient Validation, Insecure Communication, Inadequate Privacy, Binary Protections, Security Misconfiguration, Insecure Data Storage, Insufficient Cryptography |
344
451
  | **OWASP LLM Top 10 2025** | LLM01-LLM10: Prompt Injection, Sensitive Info Disclosure, Supply Chain, Data Poisoning, Improper Output Handling, Excessive Agency, System Prompt Leakage, Vector/Embedding Weaknesses, Misinformation, Unbounded Consumption |
345
452
  | **OWASP CI/CD Top 10** | CICD-SEC-1 to 10: Insufficient Flow Control, Identity Management, Dependency Chain Abuse, Poisoned Pipeline Execution, Insufficient PBAC, Credential Hygiene, Insecure System Config, Ungoverned Usage, Improper Artifact Integrity, Insufficient Logging |
453
+ | **OWASP Agentic AI Top 10** | ASI01-ASI10: Agent Hijacking, Tool Misuse, Privilege Escalation, Unsafe Code Execution, Memory Poisoning, Identity Spoofing, Excessive Autonomy, Logging Gaps, Supply Chain Attacks, Cascading Hallucination |
346
454
 
347
455
  ---
348
456
 
@@ -380,6 +488,7 @@ See [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.
380
488
  - [OWASP LLM Top 10 2025](https://genai.owasp.org/llm-top-10/)
381
489
  - [OWASP API Security Top 10 2023](https://owasp.org/API-Security/)
382
490
  - [OWASP CI/CD Top 10](https://owasp.org/www-project-top-10-ci-cd-security-risks/)
491
+ - [OWASP Agentic AI Top 10](https://owasp.org/www-project-agentic-ai-top-10/)
383
492
 
384
493
  ---
385
494