ship-safe 4.3.0 → 5.0.0

package/README.md CHANGED
@@ -14,19 +14,22 @@
 
  ---
 
- 13 security agents. 50+ attack classes. One command.
+ 16 security agents. 80+ attack classes. One command.
 
- **Ship Safe v4.3** is an AI-powered security platform that runs 13 specialized agents in parallel against your codebase — scanning for secrets, injection vulnerabilities, auth bypass, SSRF, supply chain attacks, Supabase RLS misconfigs, Docker/Terraform/Kubernetes misconfigs, CI/CD pipeline poisoning, LLM security issues, and more. Context-aware confidence tuning reduces false positives by up to 70%. Baseline support lets teams adopt incrementally accept existing debt, focus on not making it worse.
+ **Ship Safe v5.0** is an AI-powered security platform that runs 16 specialized agents in parallel against your codebase — scanning for secrets, injection vulnerabilities, auth bypass, SSRF, supply chain attacks, Supabase RLS misconfigs, Docker/Terraform/Kubernetes misconfigs, CI/CD pipeline poisoning, LLM/agentic AI security, MCP server misuse, RAG poisoning, PII compliance, and more. LLM-powered deep analysis verifies exploitability of critical findings. Secrets verification probes provider APIs to check if leaked keys are still active. A dedicated CI command (`ship-safe ci`) integrates into any pipeline with threshold-based gating and SARIF output.
 
  ---
 
  ## Quick Start
 
  ```bash
- # Full security audit — secrets + 12 agents + deps + remediation plan
+ # Full security audit — secrets + 16 agents + deps + remediation plan
  npx ship-safe audit .
 
- # Red team scan only (12 agents, 50+ attack classes)
+ # LLM-powered deep analysis (Anthropic, OpenAI, Google, Ollama)
+ npx ship-safe audit . --deep
+
+ # Red team scan only (16 agents, 80+ attack classes)
  npx ship-safe red-team .
 
  # Quick secret scan
@@ -35,10 +38,16 @@ npx ship-safe scan .
  # Security health score (0-100)
  npx ship-safe score .
 
+ # CI/CD pipeline mode — compact output, exit codes, SARIF
+ npx ship-safe ci .
+
  # Accept current findings, only report regressions
  npx ship-safe baseline .
  npx ship-safe audit . --baseline
 
+ # Check if leaked secrets are still active
+ npx ship-safe audit . --verify
+
  # Environment diagnostics
  npx ship-safe doctor
  ```
@@ -57,11 +66,11 @@ npx ship-safe audit .
 
  ```
  ════════════════════════════════════════════════════════════
- Ship Safe v4.3 — Full Security Audit
+ Ship Safe v5.0 — Full Security Audit
  ════════════════════════════════════════════════════════════
 
  [Phase 1/4] Scanning for secrets... ✔ 49 found
- [Phase 2/4] Running 13 security agents... ✔ 103 findings
+ [Phase 2/4] Running 16 security agents... ✔ 103 findings
  [Phase 3/4] Auditing dependencies... ✔ 44 CVEs
  [Phase 4/4] Computing security score... ✔ 25/100 F
 
@@ -88,12 +97,14 @@ npx ship-safe audit .
 
  **What it runs:**
  1. **Secret scan** — 50+ patterns with entropy scoring (API keys, passwords, tokens)
- 2. **13 security agents** — run in parallel with per-agent timeouts (injection, auth, SSRF, supply chain, config, Supabase RLS, LLM, mobile, git history, CI/CD, API)
+ 2. **16 security agents** — run in parallel with per-agent timeouts and framework-aware filtering (injection, auth, SSRF, supply chain, config, Supabase RLS, LLM, MCP, agentic AI, RAG, PII, mobile, git history, CI/CD, API)
  3. **Dependency audit** — npm/pip/bundler CVE scanning
- 4. **Score computation** — confidence-weighted scoring across 8 categories (0-100, A-F)
- 5. **Context-aware confidence tuning** — downgrades findings in test files, docs, and comments
- 6. **Remediation plan** — prioritized fix list grouped by severity
- 7. **HTML report** — standalone dark-themed report with code context
+ 4. **Secrets verification** — probes provider APIs (GitHub, Stripe, OpenAI, etc.) to check if leaked keys are still active
+ 5. **Deep analysis** — LLM-powered taint analysis verifies exploitability of critical/high findings (optional)
+ 6. **Score computation** — confidence-weighted scoring across 8 categories (0-100, A-F)
+ 7. **Context-aware confidence tuning** — downgrades findings in test files, docs, and comments
+ 8. **Remediation plan** — prioritized fix list grouped by severity
+ 9. **HTML report** — standalone dark-themed report with code context
 
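Editor's note: step 1 in the list above mentions entropy scoring. The diff does not show how ship-safe computes it, so the following is only a minimal sketch of the general idea (Shannon entropy in bits per character), using a hypothetical `entropy` helper that is not part of the package. Random-looking strings such as API keys tend to score higher than ordinary words.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not ship-safe's implementation:
# prints the Shannon entropy (bits per character) of its first argument.
entropy() {
  printf '%s' "$1" | awk '
    {
      n = length($0)
      # Count how often each character occurs.
      for (i = 1; i <= n; i++) count[substr($0, i, 1)]++
      # Sum -p * log2(p) over the distinct characters.
      for (c in count) {
        p = count[c] / n
        h -= p * log(p) / log(2)
      }
      printf "%.2f\n", h
    }'
}

entropy "aaaaaaaa"   # prints 0.00 (one repeated character carries no information)
entropy "abcd"       # prints 2.00 (four equally likely characters)
```

A scanner built on this idea would typically combine the score with a pattern match, since entropy alone also flags hashes and UUIDs.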
  **Flags:**
  - `--json` — structured JSON output (clean for piping)
@@ -108,10 +119,15 @@ npx ship-safe audit .
  - `--no-cache` — force full rescan (ignore cached results)
  - `--baseline` — only show findings not in the baseline
  - `--pdf [file]` — generate PDF report (requires Chrome/Chromium)
+ - `--deep` — LLM-powered taint analysis for critical/high findings
+ - `--local` — use local Ollama model for deep analysis
+ - `--model <model>` — LLM model to use for deep/AI analysis
+ - `--budget <cents>` — max spend in cents for deep analysis (default: 50)
+ - `--verify` — check if leaked secrets are still active (probes provider APIs)
 
  ---
 
- ## 13 Security Agents
+ ## 16 Security Agents
 
  | Agent | Category | What It Detects |
  |-------|----------|-----------------|
@@ -122,12 +138,17 @@ npx ship-safe audit .
  | **ConfigAuditor** | Config | Dockerfile (running as root, :latest tags), Terraform (public S3/RDS, open SG, CloudFront HTTP, Lambda admin, S3 no versioning), Kubernetes (privileged containers, `:latest` tags, missing NetworkPolicy), CORS, CSP, Firebase, Nginx |
  | **SupabaseRLSAgent** | Auth | Supabase Row Level Security — `service_role` key in client code, `CREATE TABLE` without RLS, anon key inserts, unprotected storage operations |
  | **LLMRedTeam** | AI/LLM | OWASP LLM Top 10 — prompt injection, excessive agency, system prompt leakage, unbounded consumption, RAG poisoning |
+ | **MCPSecurityAgent** | AI/LLM | MCP server security — unvalidated tool inputs, missing auth, excessive permissions, tool poisoning |
+ | **AgenticSecurityAgent** | AI/LLM | OWASP Agentic AI Top 10 — agent hijacking, privilege escalation, unsafe code execution, memory poisoning |
+ | **RAGSecurityAgent** | AI/LLM | RAG pipeline security — unvalidated embeddings, context injection, document poisoning, vector DB access control |
+ | **PIIComplianceAgent** | Compliance | PII detection — SSNs, credit cards, emails, phone numbers in source code, logs, and configs |
  | **MobileScanner** | Mobile | OWASP Mobile Top 10 2024 — insecure storage, WebView JS injection, HTTP endpoints, excessive permissions, debug mode |
  | **GitHistoryScanner** | Secrets | Leaked secrets in git commit history (checks if still active in working tree) |
  | **CICDScanner** | CI/CD | OWASP CI/CD Top 10 — pipeline poisoning, unpinned actions, secret logging, self-hosted runners, script injection |
  | **APIFuzzer** | API | Routes without auth, missing input validation, mass assignment, unrestricted file upload, GraphQL introspection, debug endpoints, missing rate limiting, OpenAPI spec security issues |
  | **ReconAgent** | Recon | Attack surface discovery — frameworks, languages, auth patterns, databases, cloud providers, IaC, CI/CD pipelines |
- | **ScoringEngine** | Scoring | 8-category weighted scoring with trend tracking |
+
+ **Post-processors:** ScoringEngine (8-category weighted scoring), VerifierAgent (secrets liveness verification), DeepAnalyzer (LLM-powered taint analysis)
 
  ---
 
@@ -139,7 +160,7 @@ npx ship-safe audit .
  # Full audit with remediation plan + HTML report
  npx ship-safe audit .
 
- # Red team: 13 agents, 50+ attack classes
+ # Red team: 16 agents, 80+ attack classes
  npx ship-safe red-team .
  npx ship-safe red-team . --agents injection,auth # Run specific agents
  npx ship-safe red-team . --html report.html # HTML report
@@ -188,6 +209,28 @@ npx ship-safe baseline --diff
  npx ship-safe baseline --clear
  ```
 
+ ### CI/CD Pipeline
+
+ ```bash
+ # CI mode — compact output, exit codes, threshold gating
+ npx ship-safe ci .
+ npx ship-safe ci . --threshold 80 # Custom passing score
+ npx ship-safe ci . --fail-on critical # Fail on severity
+ npx ship-safe ci . --sarif out.sarif # SARIF for GitHub
+ ```
+
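Editor's note: the `--threshold` and `--fail-on` flags added above amount to an exit-code decision. The sketch below is an assumption about those semantics, not code from the package. `gate` is a hypothetical function, the default of 75 is taken from the README's own "fail if score < 75" comment, and "blocking findings" stands in for findings at or above the `--fail-on` severity.

```bash
#!/usr/bin/env bash
# Hypothetical gate: returns 0 (pass) or 1 (fail), as a CI step would.
#   $1 = security score (0-100)
#   $2 = passing threshold (assumed default: 75)
#   $3 = count of findings at or above the --fail-on severity (assumed default: 0)
gate() {
  local score=$1 threshold=${2:-75} blocking=${3:-0}
  if (( blocking > 0 )); then
    echo "FAIL: $blocking blocking finding(s)"
    return 1
  fi
  if (( score < threshold )); then
    echo "FAIL: score $score is below threshold $threshold"
    return 1
  fi
  echo "PASS: score $score meets threshold $threshold"
}

gate 82 80                                          # passes
gate 25 || echo "pipeline step would exit non-zero" # fails against the default
```

Checking blocking findings before the score means a single critical finding fails the build even when the overall score is high, which is one plausible reading of "strict" mode.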
+ ### Deep Analysis & Verification
+
+ ```bash
+ # LLM-powered deep analysis (Anthropic/OpenAI/Google/Ollama)
+ npx ship-safe audit . --deep
+ npx ship-safe audit . --deep --local # Use local Ollama
+ npx ship-safe audit . --deep --budget 50 # Cap spend at 50 cents
+
+ # Check if leaked secrets are still active
+ npx ship-safe audit . --verify
+ ```
+
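Editor's note: the diff does not say how `--budget` is enforced. One simple way to picture a spend cap is a running total that stops before the next item would exceed it. In the sketch below, `within_budget` and the per-finding costs are hypothetical and purely illustrative.

```bash
#!/usr/bin/env bash
# Hypothetical budget guard, not ship-safe's implementation.
#   $1 = budget in cents; remaining arguments = estimated cost per finding (cents)
# Analyses findings in order and stops before the cap would be exceeded.
within_budget() {
  local budget=$1 spent=0 analysed=0 cost
  shift
  for cost in "$@"; do
    if (( spent + cost > budget )); then
      break
    fi
    spent=$(( spent + cost ))
    analysed=$(( analysed + 1 ))
  done
  echo "analysed=$analysed spent=$spent"
}

within_budget 50 20 20 20   # prints analysed=2 spent=40
```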
  ### Diagnostics
 
  ```bash
@@ -232,9 +275,11 @@ claude plugin add github:asamassekou10/ship-safe
 
  | Command | Description |
  |---------|-------------|
- | `/ship-safe` | Full security audit — 12 agents, remediation plan, auto-fix |
+ | `/ship-safe` | Full security audit — 16 agents, remediation plan, auto-fix |
  | `/ship-safe-scan` | Quick scan for leaked secrets |
  | `/ship-safe-score` | Security health score (0-100) |
+ | `/ship-safe-deep` | LLM-powered deep taint analysis |
+ | `/ship-safe-ci` | CI/CD pipeline setup guide |
 
  Claude interprets the results, explains findings in plain language, and can fix issues directly in your codebase.
 
@@ -335,6 +380,24 @@ npx ship-safe policy init
 
  ## CI/CD Integration
 
+ The dedicated `ci` command is optimized for pipelines — compact output, exit codes, threshold-based gating:
+
+ ```bash
+ # Basic CI — fail if score < 75
+ npx ship-safe ci .
+
+ # Strict — fail on any critical finding
+ npx ship-safe ci . --fail-on critical
+
+ # Custom threshold + SARIF for GitHub Security tab
+ npx ship-safe ci . --threshold 80 --sarif results.sarif
+
+ # Only check new findings (not in baseline)
+ npx ship-safe ci . --baseline
+ ```
+
+ **GitHub Actions example:**
+
  ```yaml
  # .github/workflows/security.yml
  name: Security Audit
@@ -347,16 +410,11 @@ jobs:
  steps:
  - uses: actions/checkout@v4
 
- - name: Full security audit
- run: npx ship-safe audit . --no-ai --json
-
- - name: Score delta vs. last scan
- run: npx ship-safe audit . --no-ai --compare
-
- - name: Upload SARIF to GitHub Security tab
- run: npx ship-safe audit . --no-ai --sarif > results.sarif
+ - name: Security gate
+ run: npx ship-safe ci . --threshold 75 --sarif results.sarif
 
  - uses: github/codeql-action/upload-sarif@v3
+ if: always()
  with:
  sarif_file: results.sarif
  ```
@@ -392,6 +450,7 @@ docs/
  | **OWASP Top 10 Mobile 2024** | M1-M10: Improper Credential Usage, Inadequate Supply Chain, Insecure Auth, Insufficient Validation, Insecure Communication, Inadequate Privacy, Binary Protections, Security Misconfiguration, Insecure Data Storage, Insufficient Cryptography |
  | **OWASP LLM Top 10 2025** | LLM01-LLM10: Prompt Injection, Sensitive Info Disclosure, Supply Chain, Data Poisoning, Improper Output Handling, Excessive Agency, System Prompt Leakage, Vector/Embedding Weaknesses, Misinformation, Unbounded Consumption |
  | **OWASP CI/CD Top 10** | CICD-SEC-1 to 10: Insufficient Flow Control, Identity Management, Dependency Chain Abuse, Poisoned Pipeline Execution, Insufficient PBAC, Credential Hygiene, Insecure System Config, Ungoverned Usage, Improper Artifact Integrity, Insufficient Logging |
+ | **OWASP Agentic AI Top 10** | ASI01-ASI10: Agent Hijacking, Tool Misuse, Privilege Escalation, Unsafe Code Execution, Memory Poisoning, Identity Spoofing, Excessive Autonomy, Logging Gaps, Supply Chain Attacks, Cascading Hallucination |
 
  ---
 
@@ -429,6 +488,7 @@ See [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.
  - [OWASP LLM Top 10 2025](https://genai.owasp.org/llm-top-10/)
  - [OWASP API Security Top 10 2023](https://owasp.org/API-Security/)
  - [OWASP CI/CD Top 10](https://owasp.org/www-project-top-10-ci-cd-security-risks/)
+ - [OWASP Agentic AI Top 10](https://owasp.org/www-project-agentic-ai-top-10/)
 
  ---