codeguard-pro 0.3.0__tar.gz

@@ -0,0 +1,398 @@
1
+ Metadata-Version: 2.4
2
+ Name: codeguard-pro
3
+ Version: 0.3.0
4
+ Summary: Inline security gate for AI coding agents: secrets, supply chain, OWASP, and MiniMax-assisted deep analysis.
5
+ Home-page: https://github.com/Miles0sage/codeguard-mcp
6
+ Author: Miles
7
+ Classifier: Development Status :: 4 - Beta
8
+ Classifier: Topic :: Security
9
+ Classifier: Topic :: Software Development :: Quality Assurance
10
+ Classifier: License :: OSI Approved :: MIT License
11
+ Classifier: Programming Language :: Python :: 3
12
+ Requires-Python: >=3.10
13
+ Description-Content-Type: text/markdown
14
+ Requires-Dist: mcp[cli]>=1.0.0
15
+ Requires-Dist: requests>=2.31.0
16
+ Dynamic: author
17
+ Dynamic: classifier
18
+ Dynamic: description
19
+ Dynamic: description-content-type
20
+ Dynamic: home-page
21
+ Dynamic: requires-dist
22
+ Dynamic: requires-python
23
+ Dynamic: summary
24
+
25
+ ![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue)
26
+ ![MCP Compatible](https://img.shields.io/badge/MCP-compatible-green)
27
+ ![Tools: 23](https://img.shields.io/badge/tools-23-orange)
28
+ ![Tests: 112](https://img.shields.io/badge/tests-112-brightgreen)
29
+ ![License: MIT](https://img.shields.io/badge/license-MIT-lightgrey)
30
+ ![pip installable](https://img.shields.io/badge/pip-installable-blue)
31
+
32
+ # CodeGuard Pro
33
+
34
+ > Stop secrets from reaching git. The inline security gate for AI coding agents.
35
+
36
+ AI coding agents (Claude Code, Cursor, Copilot) write code fast. Too fast to catch the AWS key they just hardcoded. CodeGuard sits **inline** -- agents call `security_gate` before every commit. Secrets get blocked, not just flagged.
37
+
38
+ ```
39
+ $ git commit -m "add payment integration"
40
+
41
+ CodeGuard Pro — scanning for secrets...
42
+
43
+ BLOCKED — 2 critical secret(s) found.
44
+
45
+ [CRITICAL] Stripe Secret Key (line 14, col 12)
46
+ Found: sk_l****************************eJ7z
47
+ Fix: Use environment variable: os.environ["STRIPE_SECRET_KEY"]
48
+
49
+ [CRITICAL] AWS Access Key (line 22, col 8)
50
+ Found: AKIA****************************3Q9R
51
+ Fix: Use AWS credentials file (~/.aws/credentials) or IAM roles
52
+
53
+ Commit blocked. Fix the issues above and try again.
54
+ ```
55
+
56
+ ## Quick Start
57
+
58
+ ```bash
59
+ pip install "mcp[cli]"
60
+ git clone https://github.com/Miles0sage/codeguard-mcp && cd codeguard-mcp
61
+ codeguard install # hooks into your repo's pre-commit
62
+ ```
63
+
64
+ That's it. Every `git commit` now scans for secrets automatically.
65
+
66
+ ## Features
67
+
68
+ | Feature | What it does |
69
+ |---|---|
70
+ | **Inline Security Gate** | MCP tool agents call before committing -- returns APPROVED / BLOCKED |
71
+ | **25+ Secret Patterns** | OpenAI, AWS, Stripe, GitHub, Slack, GCP, Supabase, private keys, JWTs... |
72
+ | **Auto-Fix Patches** | Returns a diff replacing secrets with `os.environ[]` lookups |
73
+ | **Pre-Commit Hook** | Blocks commits containing secrets at the git level |
74
+ | **OWASP Top 10 Scanner** | SQL injection, XSS, command injection, SSRF, path traversal, weak crypto |
75
+ | **Code Review Engine** | Duplicate functions, deep nesting, long functions, naming, unused imports |
76
+ | **MCP Server Audit** | Checks input validation, error handling, shell execution, rate limiting |
77
+ | **Full Audit** | Combined secrets + code review + security scan in one call |
78
+ | **Supply Chain Scanner** | Known-compromised package DB, typosquatting, unpinned deps, GitHub Actions SHA audit, OSV lookups |
79
+ | **MiniMax Beta** | Behavioral malware analysis for setup hooks, exfiltration, and exploit explanations |
80
+ | **Learning Loop** | Stores missed samples, generates issue-ready markdown, and keeps a reviewable detection backlog |
81
+ | **False Positive Filtering** | Skips comments, placeholders, examples, test keys |
82
+ | **Directory Scanning** | Recursive scan with smart skipping (node_modules, .git, binaries) |
83
+
84
+ ## Positioning
85
+
86
+ CodeGuard Pro is strongest as an **inline security gate for AI coding agents**.
87
+
88
+ What it does well now:
89
+ - blocks hardcoded secrets before commit
90
+ - scans code for OWASP-style issues and common injection patterns
91
+ - audits supply-chain risk before install or release
92
+ - uses MiniMax as a beta sidecar for behavioral malware analysis and exploit explanations
93
+ - saves suspicious misses into a reviewable learning corpus instead of silently rewriting rules
94
+
95
+ What it is not yet:
96
+ - a fully autonomous self-improving security platform
97
+ - proof that the AI layer catches every crypto-indirection edge case
98
+ - an enterprise policy/observability suite with mature dashboards, SSO, audit trails, and fleet management
99
+
100
+ ## MCP Integration
101
+
102
+ Add to your Claude Code config (`~/.claude.json`):
103
+
104
+ ```json
105
+ {
106
+ "mcpServers": {
107
+ "codeguard": {
108
+ "command": "python3",
109
+ "args": ["/path/to/codeguard-mcp/server.py"]
110
+ }
111
+ }
112
+ }
113
+ ```
114
+
115
+ ### 23 MCP Tools
116
+
117
+ | Tool | Purpose |
118
+ |---|---|
119
+ | `security_gate` | **The gate.** Pass a diff, get APPROVED or BLOCKED with fix patches |
120
+ | `scan_secrets_in_file` | Scan a single file for hardcoded secrets |
121
+ | `scan_secrets_in_directory` | Recursive secret scan across a project |
122
+ | `smart_review` | Code review: duplicates, nesting, naming, debug statements |
123
+ | `review_file` | Review a file from disk (auto-detects language) |
124
+ | `security_scan` | OWASP Top 10 vulnerability scan on code |
125
+ | `security_scan_file` | Security scan a file from disk |
126
+ | `security_scan_directory` | Scan all files in a directory for vulns |
127
+ | `audit_mcp_server` | Security audit for MCP servers specifically |
128
+ | `scan_package` | Check a package before install for typosquatting and compromise history |
129
+ | `scan_requirements_file` | Audit a requirements file package-by-package |
130
+ | `scan_pth_files` | Detect malicious `.pth` startup hooks in site-packages |
131
+ | `scan_requirements_unpinned` | Find loose or missing dependency pins |
132
+ | `scan_github_actions` | Audit workflow actions for unpinned mutable refs |
133
+ | `query_osv` | Query OSV.dev for a package+version |
134
+ | `query_osv_batch` | Batch query OSV.dev across packages |
135
+ | `full_audit` | Secrets + code review + security scan combined |
136
+ | `deep_analyze` | MiniMax beta taint and behavioral analysis |
137
+ | `smart_analyze` | Recommended layered entry point: fast path first, MiniMax escalation only when justified |
138
+ | `analyze_setup_py` | MiniMax beta setup.py malware verdict |
139
+ | `explain_vulnerability` | MiniMax beta exploit explanation |
140
+ | `record_learning_candidate` | Save suspicious samples for later review |
141
+ | `generate_learning_issue` | Generate issue-ready markdown from a saved sample |
142
+ | `learning_summary` | Summarize the local learning corpus and issue queue |
143
+
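To make the `scan_requirements_unpinned` row concrete, here is a minimal sketch of what "loose or missing dependency pins" can mean. It is not CodeGuard's implementation: the `unpinned` helper is hypothetical and assumes that only an exact `==` pin counts as pinned. The sample reuses the two `Requires-Dist` entries from this package's own metadata.

```python
def unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that lack an exact '==' pin.

    Hypothetical helper for illustration; real requirement syntax
    (markers, URLs, hashes) needs a proper parser.
    """
    loose = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blanks, comment-only lines, and pip options like -r / -e
        if "==" not in line:
            loose.append(line)
    return loose


sample = "mcp[cli]>=1.0.0\nrequests>=2.31.0\nflask==3.0.0\n# dev tools\n"
print(unpinned(sample))
```

Both `>=` entries are reported and the exact `flask==3.0.0` pin is not, since a lower bound still lets a later release be installed unreviewed.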
144
+ ### How Agents Use It
145
+
146
+ ```
147
+ Agent writes code
148
+ |
149
+ v
150
+ Agent runs: security_gate(diff)
151
+ |
152
+ +----+----+
153
+ | |
154
+ APPROVED BLOCKED
155
+ | |
156
+ commit apply fix_patch
157
+ re-run gate
158
+ then commit
159
+ ```
160
+
161
+ The agent gets structured JSON back:
162
+
163
+ ```json
164
+ {
165
+ "status": "BLOCKED",
166
+ "critical": 1,
167
+ "total": 1,
168
+ "report": "...",
169
+ "fix_patch": "--- line 14\n- api_key = \"sk_live_abc123...\"\n+ api_key = os.environ[\"STRIPE_SECRET_KEY\"]",
170
+ "action": "Apply the fix patch, then re-run security_gate."
171
+ }
172
+ ```
173
+
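A minimal sketch of how an agent-side wrapper might act on that response. Only the field names (`status`, `fix_patch`) come from the example above; the `next_step` helper and the strings it returns are hypothetical and not part of CodeGuard.

```python
import json


def next_step(gate_response: str) -> str:
    """Decide what to do with a security_gate result (hypothetical helper)."""
    result = json.loads(gate_response)
    status = result.get("status")
    if status == "APPROVED":
        return "commit"
    if status == "BLOCKED" and result.get("fix_patch"):
        return "apply fix_patch, then re-run security_gate"
    # Unknown status, or BLOCKED without a patch: fail closed.
    return "stop and ask a human"


example = json.dumps({
    "status": "BLOCKED",
    "critical": 1,
    "total": 1,
    "fix_patch": "--- line 14\n- api_key = ...",
})
print(next_step(example))
```

The fall-through branch fails closed on purpose: a response the wrapper does not understand should never be treated as approval.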
174
+ ## CLI Usage
175
+
176
+ ```bash
177
+ codeguard install # Install pre-commit hook
178
+ codeguard scan ./src # Scan a directory
179
+ codeguard scan app.py # Scan a single file
180
+ codeguard scan-diff # Scan staged changes
181
+ codeguard learn-add sample.py --title "obfuscated setup hook"
182
+ codeguard learn-summary
183
+ codeguard uninstall # Remove hook (restores backup)
184
+ ```
185
+
186
+ ## Recommended Flow
187
+
188
+ Use `smart_analyze` as the default code-analysis entry point:
189
+
190
+ 1. deterministic fast path runs first
191
+ 2. MiniMax escalation runs only when the result is high-risk but incomplete, behavior looks suspicious, or a deeper explanation is requested
192
+ 3. optional learning-corpus capture stores suspicious misses for review
193
+
194
+ That keeps cost and latency low while still giving you deeper analysis when regex alone is not enough.
195
+
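The layering described above can be sketched as a small dispatcher. This is an illustration of the control flow only, not how `smart_analyze` is implemented: `layered_analyze`, `Analysis`, and the three callables are hypothetical stand-ins for the deterministic scanners, the escalation rule, and the MiniMax call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Analysis:
    findings: list[str] = field(default_factory=list)
    escalated: bool = False


def layered_analyze(
    code: str,
    fast_scan: Callable[[str], list[str]],
    needs_escalation: Callable[[str, list[str]], bool],
    deep_scan: Callable[[str], list[str]],
) -> Analysis:
    """Run the cheap scan first; call the expensive one only when asked to."""
    findings = fast_scan(code)
    result = Analysis(findings=list(findings))
    if needs_escalation(code, findings):
        result.findings.extend(deep_scan(code))
        result.escalated = True
    return result


# Toy stand-ins so the sketch runs without any API key.
fast = lambda code: ["use of eval"] if "eval(" in code else []
suspicious = lambda code, findings: "base64" in code
deep = lambda code: ["possible obfuscated payload"]

plain = layered_analyze("print('hi')", fast, suspicious, deep)
odd = layered_analyze("exec(base64.b64decode(blob))", fast, suspicious, deep)
print(plain.escalated, odd.escalated)
```

Passing the scanners in as callables keeps the expensive path optional, which matches the idea of paying for deeper analysis only when the fast path leaves doubt.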
196
+ ## AI Beta
197
+
198
+ MiniMax is wired directly to the official MiniMax API using `MINIMAX_API_KEY`.
199
+
200
+ Recommended usage:
201
+ - call `smart_analyze` first for code-level analysis
202
+ - let `smart_analyze` decide whether deterministic findings are enough or whether MiniMax escalation is justified
203
+ - use `deep_analyze` directly only when you explicitly want forced AI taint/behavior review
204
+
205
+ ## Why Now
206
+
207
+ The timing is unusually good for launch:
208
+
209
+ - TeamPCP-class supply-chain attacks are active and visible
210
+ - AI coding agents let developers install packages and generate code faster than they can review it
211
+ - most competing tools still run after code is written, not inline in the agent loop
212
+
213
+ CodeGuard is strongest when positioned as the security gate that sits *inside* the AI workflow:
214
+ - before install
215
+ - before commit
216
+ - before shipping
217
+
218
+ Current verified beta capabilities:
219
+ - setup-hook malware verdicts for obfuscated `exec(base64.b64decode(...))` patterns
220
+ - behavioral credential exfiltration detection
221
+ - exploit explanation generation
222
+ - graceful degradation to deterministic scanners when no key is configured
223
+
224
+ Current limits:
225
+ - latency is materially higher than regex/supply-chain scans
226
+ - crypto-indirection edge cases are not yet proven comprehensively
227
+ - AI output should be treated as reviewable evidence, not self-modifying truth
228
+
229
+ ## Learning Loop
230
+
231
+ New threats should become review artifacts, not silent rule mutations.
232
+
233
+ Recommended flow:
234
+ 1. Save a suspicious sample with `codeguard learn-add`.
235
+ 2. Generate an issue draft with `codeguard learn-report`.
236
+ 3. Review the issue and decide whether it needs a regex rule, AST rule, AI prompt update, or documentation-only limitation.
237
+ 4. Add a regression test before promoting any new detector.
238
+
239
+ This keeps the product getting smarter without turning it into an opaque self-editing scanner.
240
+
241
+ ## Secret Patterns (25+)
242
+
243
+ | Provider | Pattern | Severity |
244
+ |---|---|---|
245
+ | OpenAI | `sk-proj-...` | CRITICAL |
246
+ | Anthropic | `sk-ant-...` | CRITICAL |
247
+ | AWS Access Key | `AKIA...` | CRITICAL |
248
+ | AWS Secret Key | `aws_secret_access_key=...` | CRITICAL |
249
+ | GitHub Token | `ghp_...` | CRITICAL |
250
+ | GitHub OAuth | `gho_...` | CRITICAL |
251
+ | GitHub App | `ghu_/ghs_/ghr_...` | CRITICAL |
252
+ | Google API Key | `AIza...` | CRITICAL |
253
+ | Google OAuth | `GOCSPX-...` | CRITICAL |
254
+ | Stripe Secret | `sk_live_/sk_test_...` | CRITICAL |
255
+ | Stripe Publishable | `pk_live_/pk_test_...` | HIGH |
256
+ | Slack Token | `xox[bpors]-...` | CRITICAL |
257
+ | Slack Webhook | `hooks.slack.com/services/...` | HIGH |
258
+ | Twilio | `SK...` | CRITICAL |
259
+ | Discord | Bot token format | CRITICAL |
260
+ | Database URL | `postgres://user:pass@...` | CRITICAL |
261
+ | JWT | `eyJ...` | HIGH |
262
+ | Supabase Key | Supabase JWT format | HIGH |
263
+ | SendGrid | `SG....` | CRITICAL |
264
+ | Cloudflare | Context-aware token match | HIGH |
265
+ | MiniMax | `sk-cp-...` | CRITICAL |
266
+ | Vercel | `vercel_...` | CRITICAL |
267
+ | Private Keys | `-----BEGIN PRIVATE KEY-----` | CRITICAL |
268
+ | Hardcoded Passwords | `password = "..."` | CRITICAL |
269
+ | Generic API Keys | `api_key = "..."` | HIGH |
270
+
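A minimal sketch of prefix-based matching for two rows of the table. The regular expressions are illustrative approximations based on the commonly documented key shapes, not CodeGuard's actual rules, and `PATTERNS`, `mask`, and `scan` are hypothetical names. The masking mirrors the `Found:` lines in the sample commit output near the top of this README, which keep the first and last four characters.

```python
import re

# Illustrative approximations only; the tool's real rules are not shown here.
PATTERNS = {
    "AWS Access Key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GitHub Token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def mask(secret: str) -> str:
    """Keep the first and last four characters and star out the rest."""
    if len(secret) <= 8:
        return "*" * len(secret)
    return secret[:4] + "*" * (len(secret) - 8) + secret[-4:]


def scan(text: str) -> list[tuple[str, int, str]]:
    """Return (pattern name, 1-based line number, masked match) per hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((name, lineno, mask(match.group())))
    return hits


fake_key = "AKIA" + "A" * 12 + "3Q9R"  # synthetic value, not a real credential
sample = f'region = "us-east-1"\naws_key = "{fake_key}"\n'
print(scan(sample))
```

Reports carry only the masked form, so the scanner's own output does not become a second place where the secret is stored.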
271
+ ## OWASP Security Checks
272
+
273
+ - **A01** Broken Access Control -- CORS wildcards, debug mode
274
+ - **A02** Cryptographic Failures -- MD5, SHA-1, DES, RC4, ECB, hardcoded keys
275
+ - **A03** Injection -- SQL (f-strings, concatenation, % formatting), command injection (os.system, eval, exec), XSS (innerHTML, document.write, dangerouslySetInnerHTML)
276
+ - **A05** Security Misconfiguration -- SSL verification disabled, binding 0.0.0.0
277
+ - **A07** Auth Failures -- JWT verification disabled, timing-unsafe comparisons
278
+ - **A09** Logging Failures -- Sensitive data in logs
279
+ - **A10** SSRF -- Unvalidated URLs in requests/httpx/fetch
280
+ - **Bonus** Path traversal, insecure deserialization (pickle, yaml.load)
281
+
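As an illustration of how a few of these checks could be detected, here is a minimal sketch using Python's standard `ast` module. It is an assumption for teaching purposes, not CodeGuard's scanner: `find_issues` is a hypothetical helper, it covers only `hashlib.md5`/`hashlib.sha1`, `eval`, `exec`, and `os.system`, and it matches direct calls by name only.

```python
import ast

WEAK_HASHES = {"md5", "sha1"}
DANGEROUS_BUILTINS = {"eval", "exec"}


def find_issues(source: str) -> list[tuple[int, str]]:
    """Return (line number, message) for a few A02/A03-style call patterns."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id in DANGEROUS_BUILTINS:
            issues.append((node.lineno, f"A03 injection risk: {func.id}()"))
        elif isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            if func.value.id == "hashlib" and func.attr in WEAK_HASHES:
                issues.append((node.lineno, f"A02 weak hash: hashlib.{func.attr}"))
            elif func.value.id == "os" and func.attr == "system":
                issues.append((node.lineno, "A03 injection risk: os.system()"))
    return sorted(issues)  # ast.walk order is not line order, so sort


sample = (
    "import hashlib, os\n"
    "digest = hashlib.md5(data).hexdigest()\n"
    "os.system('ls ' + user_input)\n"
)
for lineno, message in find_issues(sample):
    print(lineno, message)
```

Name-based matching misses aliased imports and calls built through `getattr`, which is the kind of indirection the Roadmap section lists as future work.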
282
+ ## CodeGuard vs. Others
283
+
284
+ | | CodeGuard Pro | GitGuardian | Semgrep | TruffleHog |
285
+ |---|---|---|---|---|
286
+ | **Inline agent gate** | Yes -- agents call it before commit | No | No | No |
287
+ | **Pre-commit hook** | Yes | Yes | Yes | Yes |
288
+ | **Auto-fix patches** | Yes -- returns env var replacements | No | Some | No |
289
+ | **OWASP scanner** | Yes (built-in) | No | Yes | No |
290
+ | **Code review** | Yes (built-in) | No | Partial | No |
291
+ | **MCP server audit** | Yes | No | No | No |
292
+ | **Runs locally** | Yes -- deterministic scanners make zero API calls; OSV lookups and MiniMax analysis are optional network features | Cloud | Both | Both |
293
+ | **Cost** | Free | Freemium | Freemium | Free |
294
+ | **Setup** | 3 lines | Dashboard + token | Config files | Config files |
295
+
296
+ **The difference:** Other tools scan *after* code is written. CodeGuard gates the commit *inline* -- the AI agent can't proceed until secrets are removed. No secrets reach git history. Ever.
297
+
298
+ ## Supply Chain Scanner
299
+
300
+ Audit your dependencies for known vulnerabilities before they ship:
301
+
302
+ ```bash
303
+ codeguard check requests flask
304
+ ```
305
+
306
+ ```
307
+ Checking requests==2.31.0...
308
+ [CRITICAL] CVE-2024-35195 -- Session keeps verify=False for later requests to the same host (fixed in 2.32.0)
309
+
310
+ Checking flask==3.0.0...
311
+ No known vulnerabilities.
312
+
313
+ 1 package(s) with issues. Run `codeguard check --fix` to update.
314
+ ```
315
+
316
+ Works with `pip freeze` output too:
317
+
318
+ ```bash
319
+ pip freeze | codeguard check --stdin
320
+ ```
321
+
322
+ ## `codeguard init`
323
+
324
+ Bootstrap a security policy for your repo in one command:
325
+
326
+ ```bash
327
+ codeguard init # Interactive — asks questions
328
+ codeguard init --minimal # Pre-commit hook only, zero config
329
+ codeguard init --standard # Hook + .codeguardrc with sane defaults
330
+ codeguard init --full # Hook + strict policy + CI template + supply chain audit
331
+ ```
332
+
333
+ | Level | Pre-commit hook | `.codeguardrc` | CI template | Supply chain scan | OWASP scan |
334
+ |---|---|---|---|---|---|
335
+ | `--minimal` | Yes | No | No | No | No |
336
+ | `--standard` | Yes | Yes | No | No | Yes |
337
+ | `--full` | Yes | Yes | Yes | Yes | Yes |
338
+
339
+ ## GitHub Action
340
+
341
+ Add CodeGuard to your CI pipeline:
342
+
343
+ ```yaml
344
+ # .github/workflows/codeguard.yml
345
+ name: CodeGuard Security Gate
346
+ on: [push, pull_request]
347
+
348
+ jobs:
349
+ security:
350
+ runs-on: ubuntu-latest
351
+ steps:
352
+ - uses: actions/checkout@v4
353
+ - uses: Miles0sage/codeguard-mcp@main
354
+ with:
355
+ scan-mode: full # minimal | standard | full
356
+ fail-on: critical # critical | high | medium
357
+ supply-chain: true # audit dependencies
358
+ ```
359
+
360
+ Blocks the PR if secrets or critical vulnerabilities are found. Results appear as inline annotations on the diff.
361
+
362
+ ## Roadmap
363
+
364
+ These are the right next steps if you want to push toward the bigger vision:
365
+
366
+ - **Self-improving suggestions**: auto-cluster missed samples and propose rule/test patches for review
367
+ - **Crypto edge coverage**: stronger AST handling for `getattr`, f-string assembled alg names, and simple constant folding
368
+ - **Enterprise stability**: structured logs, request IDs, provider error reporting, policy bundles, and CI-grade observability
369
+ - **Enterprise policy**: allowlists, suppressions with expiry, approval workflows, and org-wide baseline management
370
+ - **Fleet operations**: dashboards, issue sync, audit trail, multi-repo rollups, and ticketing integrations
371
+
372
+ The product can grow into those areas, but it should not claim them as finished today.
373
+
374
+ ## vs Semgrep / GitGuardian
375
+
376
+ | Capability | CodeGuard Pro | Semgrep | GitGuardian |
377
+ |---|---|---|---|
378
+ | **Inline agent gate** | Yes -- blocks before commit | No | No |
379
+ | **MCP integration** | Native (23 tools) | No | No |
380
+ | **Auto-fix patches** | Yes -- env var replacements | Partial (autofix rules) | No |
381
+ | **Supply chain scan** | Yes -- CVE lookup per package | Via Supply Chain product | No |
382
+ | **OWASP scanner** | Built-in (8 categories) | Extensive rule library | No |
383
+ | **Secret detection** | 25+ patterns, inline block | Community rules | 350+ detectors |
384
+ | **Code review** | Built-in (duplicates, nesting, naming) | Custom rules only | No |
385
+ | **Runs 100% local** | Yes for the deterministic scanners (zero API calls); OSV lookups and MiniMax analysis are optional | Both (local + cloud) | Cloud only |
386
+ | **Setup time** | `codeguard init` (10 seconds) | Config files + rules | Dashboard + token |
387
+ | **Pricing** | Free | Free tier + paid | Free tier + paid |
388
+ | **Best for** | AI agent workflows, solo devs | Large teams, custom rules | Enterprise secret mgmt |
389
+
390
+ **CodeGuard's edge:** It is the only tool designed to sit *inside* the AI agent loop. Semgrep and GitGuardian scan after code is written. CodeGuard gates the agent -- secrets never reach git history because the agent cannot proceed until they are removed.
391
+
392
+ ## Supported Languages
393
+
394
+ Python, JavaScript, TypeScript, Go, Rust, Java, Ruby -- with language-aware review rules for naming conventions, import analysis, and debug statement detection.
395
+
396
+ ## License
397
+
398
+ MIT