codeprobe-scanner 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (96)
  1. package/.claude/settings.local.json +19 -0
  2. package/.dockerignore +17 -0
  3. package/.env.development +8 -0
  4. package/.env.example +20 -0
  5. package/.env.setup +214 -0
  6. package/.github/workflows/codeprobe-scan.yml +137 -0
  7. package/.github/workflows/codeprobe.yml +84 -0
  8. package/.github/workflows/scan-schedule.yml +28 -0
  9. package/ANALYSIS_SUMMARY.md +365 -0
  10. package/API_INTEGRATIONS.md +469 -0
  11. package/BUILD_PLAYBOOK.md +349 -0
  12. package/CLAUDE.md +106 -0
  13. package/DEPLOY.md +452 -0
  14. package/DEPLOYMENT_STATUS.md +240 -0
  15. package/DEPLOY_CHECKLIST.md +316 -0
  16. package/Dockerfile +24 -0
  17. package/EXECUTION_PLAN.html +1086 -0
  18. package/IMPLEMENTATION_COMPLETE.md +288 -0
  19. package/IMPLEMENTATION_SUMMARY.md +443 -0
  20. package/INTERACTIVE_FIX_FLOW.md +308 -0
  21. package/MIGRATION_COMPLETE.md +327 -0
  22. package/ORCHESTRATOR_SYNTHESIS.json +80 -0
  23. package/PENDING_WORK.md +308 -0
  24. package/PREFLIGHT_PLAN.md +182 -0
  25. package/QUICKSTART.md +305 -0
  26. package/README.md +15 -0
  27. package/STAGE_1_SETUP_ENGINE.md +245 -0
  28. package/STAGE_2_ARCHITECTURE.md +714 -0
  29. package/STAGE_2_CLI_VERIFICATION.md +269 -0
  30. package/STAGE_2_COMPLETE.md +332 -0
  31. package/STAGE_2_IMPLEMENTATION_PLAN.md +679 -0
  32. package/STAGE_3_COMPLETE.md +246 -0
  33. package/STAGE_3_DASHBOARD_POLISH.md +371 -0
  34. package/STAGE_3_SETUP.md +155 -0
  35. package/VIDEODB_INTEGRATION.md +237 -0
  36. package/archived/DASHBOARD_UI_WALKTHROUGH.md +392 -0
  37. package/archived/FRONTEND_SETUP.md +236 -0
  38. package/archived/auth.ts +40 -0
  39. package/archived/dashboard/components/BusinessImpactCard.tsx +48 -0
  40. package/archived/dashboard/components/CVETable.tsx +104 -0
  41. package/archived/dashboard/components/ErrorBoundary.tsx +48 -0
  42. package/archived/dashboard/components/PatchDiffViewer.tsx +43 -0
  43. package/archived/dashboard/components/RiskGauge.tsx +64 -0
  44. package/archived/dashboard/frontend.tsx +104 -0
  45. package/archived/dashboard/hooks/useAuth.ts +32 -0
  46. package/archived/dashboard/hooks/useScan.ts +65 -0
  47. package/archived/dashboard/index.html +15 -0
  48. package/archived/dashboard/pages/LoginPage.tsx +28 -0
  49. package/archived/dashboard/pages/ScanDetailPage.tsx +143 -0
  50. package/archived/dashboard/pages/ScansListPage.tsx +160 -0
  51. package/bin/install-and-run.sh +91 -0
  52. package/bun.lock +603 -0
  53. package/codeprobe-prd.md +674 -0
  54. package/cve-cache.json +25 -0
  55. package/demo-vulnerable-app/.github/workflows/codeprobe.yml +32 -0
  56. package/demo-vulnerable-app/README.md +70 -0
  57. package/demo-vulnerable-app/package-lock.json +27 -0
  58. package/demo-vulnerable-app/package.json +15 -0
  59. package/demo-vulnerable-app/server.js +34 -0
  60. package/demo.sh +45 -0
  61. package/index.ts +19 -0
  62. package/package.json +28 -0
  63. package/patches.json +12 -0
  64. package/serve-dashboard.ts +23 -0
  65. package/src/api/server-cli.ts +270 -0
  66. package/src/api/server.ts +293 -0
  67. package/src/bot/server.ts +113 -0
  68. package/src/cli/commands/report.ts +92 -0
  69. package/src/cli/commands/scan-with-fix.ts +123 -0
  70. package/src/cli/commands/scan.ts +137 -0
  71. package/src/cli/config.ts +188 -0
  72. package/src/cli/errors.ts +120 -0
  73. package/src/cli/index.ts +137 -0
  74. package/src/cli/progress.ts +119 -0
  75. package/src/cli-server.ts +523 -0
  76. package/src/engine/index.ts +90 -0
  77. package/src/engine/matcher.ts +115 -0
  78. package/src/engine/parser.ts +91 -0
  79. package/src/engine/patcher.ts +280 -0
  80. package/src/engine/report.ts +137 -0
  81. package/src/engine/sandbox.ts +222 -0
  82. package/src/engine/scraper.ts +122 -0
  83. package/src/integrations/videodb.ts +153 -0
  84. package/src/mcp/server.ts +149 -0
  85. package/src/scraper-cron.ts +103 -0
  86. package/src/shared/constants.ts +88 -0
  87. package/src/shared/types.ts +123 -0
  88. package/src/shared/utils.ts +80 -0
  89. package/src/test/cli.test.ts +211 -0
  90. package/src/test/dashboard.test.ts +38 -0
  91. package/src/test/demo-scan.json +32 -0
  92. package/src/test/engine.test.ts +157 -0
  93. package/tailwind.config.js +11 -0
  94. package/tsconfig.json +30 -0
  95. package/verify-dashboard.ts +87 -0
  96. package/verify-env.sh +98 -0
@@ -0,0 +1,365 @@
1
+ # CodeProbe Stage 2: Analysis & Planning Summary
2
+
3
+ **Date**: 2026-06-13
4
+ **Status**: ✅ Analysis complete, Ready for implementation
5
+ **Confidence**: 85/100 (pending token encryption decision)
6
+
7
+ ---
8
+
9
+ ## What Was Delivered
10
+
11
+ 1. **STAGE_2_IMPLEMENTATION_PLAN.md** (15+ pages)
12
+ - Hour-by-hour breakdown with file names
13
+ - Stage 1 dependency contract (interface spec Stage 2 codes against)
14
+ - Critical gate: Token encryption decision (SHA256 broken, needs replacement)
15
+ - Addresses all ORCHESTRATOR risks with concrete mitigations
16
+ - Test strategy + acceptance criteria
17
+ - 11 open decisions requiring user confirmation
18
+
19
+ 2. **STAGE_2_ARCHITECTURE.md** (10+ pages)
20
+ - System architecture diagram (visual)
21
+ - End-to-end data flow sequence
22
+ - Module dependency graph
23
+ - Event sequence timeline
24
+ - Fallback cascade (what happens when APIs fail)
25
+ - Security boundaries + threat model
26
+ - Test coverage map (unit, integration, E2E)
27
+ - Deployment checklist for demo day
28
+
29
+ 3. **codebase-explorer** (skill prepared)
30
+ - Ready to generate interactive graph once Stage 1 code exists
31
+ - Currently repo is minimal (5 files), so graph deferred until implementation
32
+
33
+ ---
34
+
35
+ ## Key Findings from ORCHESTRATOR Review
36
+
37
+ The ORCHESTRATOR_SYNTHESIS.json flagged serious issues with the original plan:
38
+
39
+ | Issue | Severity | Stage 2 Response |
40
+ |-------|----------|-----------------|
41
+ | **Timeline underestimated** (5h vs 12-24h) | CRITICAL | Stage 2 CLI alone is 3-4h realistic ✓ |
42
+ | **Zero test infrastructure** | HIGH | Added 12+ unit tests + E2E tests ✓ |
43
+ | **Dashboard auth undefined** | HIGH | Moved to Stage 3 scope; Stage 2 uses file permissions (0600) ✓ |
44
+ | **Patch validation missing** | HIGH | Specified: Stage 1 validates, Stage 2 applies + fallback ✓ |
45
+ | **Log4Shell incompatible with Node.js** | CRITICAL | Already fixed: HTTP/2 Rapid Reset only ✓ |
46
+ | **Token encryption broken** (SHA256) | HIGH | **OPEN: User must choose A/B/C** |
47
+ | **Security claims contradicted** | HIGH | Clarified: Nosana fallback to Claude, clearly documented |
48
+
49
+ ---
50
+
51
+ ## Critical Blockers (Must Resolve Now)
52
+
53
+ ### Blocker 1: Token Encryption Decision ⚠️
54
+
55
+ **Problem**: STAGE_2_CLI_VERIFICATION.md says "SHA256 + salt" but SHA256 is one-way hash.
56
+ **Impact**: Blocks `config.ts` implementation
57
+ **User Action Required**: Choose one:
58
+
59
+ ```
60
+ A) OS Keychain (macOS/Linux only, most secure)
61
+ Pros: Standard practice, most secure
62
+ Cons: Requires system setup, fails in headless envs
63
+
64
+ B) AES-256-GCM with machine ID (RECOMMENDED)
65
+ Pros: Cross-platform, no setup, reasonable security
66
+ Cons: Migration requires re-auth
67
+
68
+ C) Plaintext + 0600 perms (simplest, insecure)
69
+ Pros: MVP-honest, no dependencies
70
+ Cons: Secret on disk unencrypted
71
+ ```
72
+
73
+ **Recommendation**: Choose **Option B** for hackathon (best tradeoff)
74
+
75
+ ### Blocker 2: Stage 1 Engine Contract
76
+
77
+ **Problem**: Stage 2 depends on Stage 1 exports (runFullScan, ScanEvent interface, types)
78
+ **Impact**: Stage 2 can build mocked tests now, but real integration blocked
79
+ **User Action Required**: Confirm Stage 1 will export interface from STAGE_2_IMPLEMENTATION_PLAN.md §1.1
80
+
81
+ ---
82
+
83
+ ## Implementation Timeline (Revised)
84
+
85
+ ### Original Claim
86
+ > Stage 2: 2–4 hours
87
+
88
+ ### Realistic Assessment
89
+
90
+ | Phase | Duration | Dependency | Notes |
91
+ |-------|----------|-----------|-------|
92
+ | Setup + stubs | 30m | None | Can start immediately |
93
+ | Config + CLI bootstrap | 60m | None | Mocked engine works |
94
+ | Real-time output + errors | 60m | None | Mocked events work |
95
+ | Fallbacks + git integration | 90m | Stage 1 contract locked | 2-3 separate tasks |
96
+ | Unit tests (mocked) | 30m | None | Runnable in parallel with Stage 1 |
97
+ | E2E test (real engine) | 20m | **Stage 1 working** | Blocked gate #1 |
98
+ | Demo rehearsal | 45m | **Stage 1 + Stage 2 ready** | Blocked gate #2 |
99
+ | **TOTAL (sequential)** | **~4–5.5h** | **Both stages needed** | Honest estimate |
100
+ | **TOTAL (parallel)** | **~3–4h wall time** | **Both teams** | If no bottlenecks |
101
+
102
+ **Critical Assumption**: Stage 1 delivers `runFullScan()` by Hour 2 of hackathon.
103
+ **If Stage 1 slips**: Stage 2 CLI surface can still build with mocks; E2E blocked.
104
+
105
+ ---
106
+
107
+ ## Parallel Work Opportunities
108
+
109
+ ### Stage 2 Can Build Now (No Stage 1 Needed)
110
+
111
+ - ✅ `config.ts` — Local filesystem only (encryption decision pending)
112
+ - ✅ `index.ts` + command skeletons — Argument parsing, help text
113
+ - ✅ `progress.ts` — Mock events, test formatting
114
+ - ✅ `errors.ts` — Generic error handling
115
+ - ✅ Unit tests with mocked engine — All 12+ tests runnable
116
+ - ✅ File structure + TypeScript setup
117
+ - ✅ Package.json dependencies
118
+
119
+ **Estimated Time**: 2–3h (can overlap with Stage 1)
120
+
121
+ ### Stage 2 Blocked By Stage 1
122
+
123
+ - ✗ `scan.ts` full implementation (needs `runFullScan()`)
124
+ - ✗ `scan-with-fix.ts` (needs patcher + sandbox)
125
+ - ✗ E2E test (`src/test/e2e.cli.test.ts`)
126
+ - ✗ `demo.sh` execution
127
+ - ✗ Acceptance criteria validation
128
+
129
+ **Unblocks After**: Stage 1 delivers engine (Hour 2+)
130
+
131
+ ---
132
+
133
+ ## Dependency Chain Visualization
134
+
135
+ ```
136
+ Hour 0: Setup
137
+ ├─ Lock token encryption ────────────────┐
138
+ └─ Stage 1 contract confirmed │
139
+
140
+ Hour 1: Build Stage 2 with mocks (parallel with Stage 1)
141
+ ├─ config.ts (now unblocked) ────────────┤
142
+ ├─ index.ts + commands skeleton │
143
+ ├─ progress.ts + errors.ts │
144
+ └─ Unit tests (mocked engine) │
145
+
146
+ Hour 2: Stage 1 engine ready (critical gate)
147
+ └─ Wire Stage 2 to real engine ──────────┤
148
+
149
+ Hour 2.5–3: Stage 2 integration
150
+ ├─ Replace mocks with real imports │
151
+ ├─ Run E2E test ◄────────────────────────┤
152
+ └─ Both Stage 1 + Stage 2 done │
153
+
154
+ Hour 3.5–4: Demo
155
+ └─ Run demo.sh 3-5 times ◄───────────────┘
156
+ Record fallback video
157
+ Rehearse for judges
158
+ ```
159
+
160
+ ---
161
+
162
+ ## Files to Create (Stage 2 Scope)
163
+
164
+ **Immediate (Hour 0):**
165
+ - `src/cli/index.ts` — Entry point stub
166
+ - `src/cli/commands/scan.ts` — Skeleton
167
+ - `src/cli/commands/scan-with-fix.ts` — Skeleton
168
+ - `src/cli/commands/report.ts` — Skeleton
169
+ - `src/cli/config.ts` — Skeleton (encryption method TBD)
170
+ - `src/cli/progress.ts` — Skeleton
171
+ - `src/cli/errors.ts` — Skeleton
172
+ - `src/test/cli.test.ts` — Empty test file
173
+ - `package.json` — Add dependencies
174
+
175
+ **After Stage 1 Contract (Hours 1–3):**
176
+ - `src/test/e2e.cli.test.ts` — E2E tests
177
+ - `src/shared/types.ts` — Import from Stage 1
178
+ - `src/shared/constants.ts` — API timeouts, paths
179
+ - `src/shared/utils.ts` — Helpers (format score, colorize)
180
+ - `demo.sh` — Demo script
181
+
182
+ **Reference Only (Stage 1 Creates):**
183
+ - `src/engine/index.ts` — runFullScan export
184
+ - `cve-cache.json` — Fallback CVE data
185
+ - `patches.json` — Pre-baked patches
186
+ - `demo-vulnerable-app/` — Demo repo
187
+
188
+ ---
189
+
190
+ ## Test Acceptance Criteria
191
+
192
+ **Must Pass (Stage 2 Complete):**
193
+
194
+ 1. ✅ `bun test src/test/cli.test.ts` — All unit tests pass (with mocks)
195
+ 2. ✅ `bun run src/cli/index.ts scan ./demo-vulnerable-app` — Exits 1 (vulns found)
196
+ 3. ✅ Output contains "CVE-2023-44487" AND "CONFIRMED EXPLOITABLE"
197
+ 4. ✅ JSON report saved to `~/.codeprobe/scans/{id}.json` with valid schema
198
+ 5. ✅ `scan --fix` creates git branch matching `codeprobe-fix-*`
199
+ 6. ✅ `scan --json` outputs valid JSON
200
+ 7. ✅ File perms: `~/.codeprobe/scans/` is `0700`, files are `0600`
201
+ 8. ✅ Fallback works: If Bright Data fails, scan continues with cache
202
+ 9. ✅ `demo.sh` completes in <3 minutes without errors
203
+ 10. ✅ `bun test src/test/e2e.cli.test.ts` passes (after Stage 1 ready)
204
+
205
+ ---
206
+
207
+ ## Risk Assessment
208
+
209
+ | Risk | Likelihood | Severity | Mitigation |
210
+ |------|-----------|----------|-----------|
211
+ | Token encryption blocker | HIGH | CRITICAL | Choose method in Hour 0 |
212
+ | Stage 1 not ready on time | MEDIUM | HIGH | Build Stage 2 with mocks, delay E2E |
213
+ | Bright Data timeout during demo | MEDIUM | MEDIUM | Use cached data + show fallback working |
214
+ | Daytona slow/unavailable | MEDIUM | MEDIUM | Retry logic + mark "verification failed" |
215
+ | Git repo dirty on --fix | LOW | LOW | Detect + warn user to commit first |
216
+ | Bun startup slow | LOW | LOW | Pre-warm before demo (run once) |
217
+ | Config encryption fails | MEDIUM | MEDIUM | Fall back to plaintext + warn at runtime |
218
+
219
+ **Highest Risk**: Token encryption decision delayed → blocks config.ts → cascades
220
+
221
+ ---
222
+
223
+ ## Open Questions for User
224
+
225
+ | # | Question | Answer | Impact |
226
+ |----|----------|--------|--------|
227
+ | 1 | Token encryption method? | A/B/C | **BLOCKS config.ts** |
228
+ | 2 | Stage 1 contract finalized? | Confirm sig | Blocks Stage 2 integration |
229
+ | 3 | API key precedence? | Env-first or config-first | Moderate (config.ts logic) |
230
+ | 4 | E2E environment? | Local + real or mocked | Affects test speed |
231
+ | 5 | Demo repo location? | Subdirectory or separate | Low (setup complexity) |
232
+ | 6 | Scan history limit? | Keep all or N recent | Low (storage) |
233
+ | 7 | Pre-baked patches location? | Codebase or S3 | Low (availability) |
234
+ | 8 | Build Stage 2 now or wait for Stage 1? | Parallel or sequential | Affects timeline |
235
+ | 9 | GitHub OAuth for dashboard in Stage 2? | Include or Stage 3? | Out of scope (Stage 3) |
236
+ | 10 | Use preflight skill for multi-agent review? | Yes or skip? | Optional (already analyzed) |
237
+
238
+ ---
239
+
240
+ ## How to Move Forward
241
+
242
+ ### Step 1: Lock Encryption Decision (5 min)
243
+ ```
244
+ Choose one:
245
+ A) OS Keychain
246
+ B) AES-256-GCM (recommended)
247
+ C) Plaintext + 0600
248
+
249
+ Recommendation: B (cross-platform, reasonable security, no setup)
250
+ ```
251
+
252
+ ### Step 2: Confirm Stage 1 Contract (10 min)
253
+ ```
254
+ Verify Stage 1 will export:
255
+ - runFullScan(path, { onEvent }) → Promise<Report>
256
+ - ScanEvent interface with 7 phases
257
+ - Report, CVE, Scan types matching spec
258
+
259
+ → Share STAGE_2_IMPLEMENTATION_PLAN.md §1.1 with Stage 1 team
260
+ ```
261
+
262
+ ### Step 3: Start Stage 2 Implementation (now)
263
+ ```
264
+ Hour 0: Setup
265
+ - Scaffold directory structure
266
+ - Confirm encryption method
267
+ - Install dependencies
268
+
269
+ Hour 1: Build with mocks
270
+ - config.ts (using chosen encryption)
271
+ - index.ts + commands skeleton
272
+ - progress.ts + errors.ts
273
+
274
+ Hour 2: Unit tests
275
+ - Run: bun test src/test/cli.test.ts
276
+ - All mocked tests should pass
277
+
278
+ Hour 2.5+: Integrate Stage 1 (when ready)
279
+ - Replace mocks with real engine
280
+ - Run E2E test
281
+ - Demo rehearsal
282
+ ```
283
+
284
+ ### Step 4: Monitor Critical Gates
285
+ ```
286
+ Gate 1: Encryption decision locked (Hour 0)
287
+ Gate 2: Stage 1 contract confirmed (Hour 0)
288
+ Gate 3: Stage 1 engine working (Hour 2)
289
+ Gate 4: Stage 2 E2E passing (Hour 3)
290
+ Gate 5: Demo working <3 min (Hour 3.5)
291
+ ```
292
+
293
+ ---
294
+
295
+ ## Deliverables Summary
296
+
297
+ This analysis produced:
298
+
299
+ | Document | Purpose | Size | Ready? |
300
+ |-----------|---------|------|--------|
301
+ | STAGE_2_IMPLEMENTATION_PLAN.md | Hour-by-hour breakdown, test strategy, risks | 10 pages | ✅ |
302
+ | STAGE_2_ARCHITECTURE.md | System design, diagrams, data flow, security | 12 pages | ✅ |
303
+ | ANALYSIS_SUMMARY.md | This file — executive summary | 6 pages | ✅ |
304
+ | Codebase-explorer skill | Interactive graph (deferred until code exists) | N/A | ⏳ |
305
+
306
+ **Total Analysis**: ~4 hours equivalent
307
+ **Ready to Implement**: Yes (pending encryption decision)
308
+ **Blockers**: Token encryption choice + Stage 1 contract confirmation
309
+
310
+ ---
311
+
312
+ ## Comparison to Original Plan
313
+
314
+ **Original STAGE_2_CLI_VERIFICATION.md:**
315
+ - Claims 2–4h for Stage 2 (optimistic)
316
+ - No test infrastructure detailed
317
+ - Token encryption method broken (SHA256)
318
+ - No parallel work diagram
319
+ - No fallback strategy specifics
320
+ - No demo rehearsal process
321
+
322
+ **This Analysis:**
323
+ - Revises to 3–4h realistic (if Stage 1 on time)
324
+ - Defines 12+ unit tests + E2E tests
325
+ - Fixes encryption: recommends AES-256-GCM
326
+ - Maps parallel work clearly
327
+ - Specifies fallback cascade (cache, retry, pre-baked)
328
+ - Includes demo checklist + rehearsal plan
329
+
330
+ **Honest Assessment**: Original plan was optimistic; this adds realism + test coverage.
331
+
332
+ ---
333
+
334
+ ## Next Immediate Actions
335
+
336
+ ```
337
+ 1. [ ] User chooses token encryption method (A/B/C)
338
+ 2. [ ] Confirm Stage 1 contract with Stage 1 team
339
+ 3. [ ] Scaffold Stage 2 directory + package.json
340
+ 4. [ ] Implement config.ts using chosen encryption
341
+ 5. [ ] Run first unit test: bun test src/test/cli.test.ts
342
+ 6. [ ] Integrate with Stage 1 when engine ready
343
+ 7. [ ] Run demo.sh before hackathon
344
+ ```
345
+
346
+ **Estimated Time to Start Coding**: 15 minutes (decisions only)
347
+ **Estimated Time to First Test Pass**: 1 hour (config.ts + CLI skeleton)
348
+
349
+ ---
350
+
351
+ ## Questions?
352
+
353
+ Refer to:
354
+ - **For hourly timeline**: STAGE_2_IMPLEMENTATION_PLAN.md §2
355
+ - **For architecture**: STAGE_2_ARCHITECTURE.md
356
+ - **For Stage 1 dependency**: STAGE_2_IMPLEMENTATION_PLAN.md §1
357
+ - **For test strategy**: STAGE_2_IMPLEMENTATION_PLAN.md §7
358
+ - **For security**: STAGE_2_ARCHITECTURE.md "Security Boundaries"
359
+
360
+ ---
361
+
362
+ **Status**: ✅ Analysis Complete
363
+ **Next**: Implement Stage 2 CLI (start with config.ts)
364
+ **Target**: <4h for Stage 2 (with Stage 1 parallel)
365
+ **Demo Day**: Ready when Stage 1 + Stage 2 both complete