@hustle-together/api-dev-tools 1.3.0 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -26,11 +26,14 @@ Five powerful slash commands for Claude Code:
  - **`/api-status [endpoint]`** - Track implementation progress and phase completion
 
  ### Enforcement Hooks
- Three Python hooks that provide **real programmatic guarantees**:
+ Six Python hooks that provide **real programmatic guarantees**:
 
+ - **`enforce-external-research.py`** - (v1.7.0) Detects external API questions and requires research before answering
  - **`enforce-research.py`** - Blocks API code writing until research is complete
- - **`track-tool-use.py`** - Logs all research activity (Context7, WebSearch, WebFetch)
- - **`api-workflow-check.py`** - Prevents stopping until required phases are complete
+ - **`enforce-interview.py`** - Verifies user questions were actually asked (prevents self-answering)
+ - **`verify-implementation.py`** - Checks that the implementation matches interview requirements
+ - **`track-tool-use.py`** - Logs all research activity (Context7, WebSearch, WebFetch, AskUserQuestion)
+ - **`api-workflow-check.py`** - Prevents stopping until required phases are complete, plus git diff verification
 
  ### State Tracking
  - **`.claude/api-dev-state.json`** - Persistent state file tracking all workflow progress
@@ -402,6 +405,71 @@ The `.claude/api-dev-state.json` file tracks:
  - All research activity is **logged** - auditable trail
  - Workflow completion is **verified** - can't stop early
 
+ ## 🔍 Gap Detection & Verification (v1.6.0+)
+
+ The workflow now includes automatic detection of common implementation gaps:
+
+ ### Gap 1: Exact Term Matching
+ **Problem:** AI paraphrases user terminology instead of using exact terms for research.
+
+ **Example:**
+ - User says: "Use Vercel AI Gateway"
+ - AI searches for: "Vercel AI SDK" (wrong!)
+
+ **Fix:** `verify-implementation.py` extracts key terms from interview answers and warns if those exact terms weren't used in research queries.
+
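+ A minimal sketch of that check, assuming whitespace-delimited terms (the stopword list and function names are illustrative, not the hook's actual internals):
+
+ ```python
+ # Illustrative sketch of exact-term matching (not the shipped hook's code).
+ STOPWORDS = {"the", "a", "an", "use", "using", "with", "for", "to", "and"}
+
+ def key_terms(answer: str) -> set[str]:
+     # Normalize an interview answer into candidate key terms.
+     return {w.strip('.,"!?').lower() for w in answer.split()} - STOPWORDS
+
+ def unresearched_terms(answer: str, queries: list[str]) -> set[str]:
+     # Terms the user said that never appeared in any research query.
+     searched = " ".join(queries).lower()
+     return {t for t in key_terms(answer) if t not in searched}
+
+ # The paraphrase from the example above is caught:
+ print(unresearched_terms("Use Vercel AI Gateway", ["Vercel AI SDK streaming"]))
+ # -> {'gateway'}
+ ```
+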
+ ### Gap 2: File Change Tracking
+ **Problem:** AI claims "all files updated" but doesn't verify which files actually changed.
+
+ **Fix:** `api-workflow-check.py` runs `git diff --name-only` and compares against tracked `files_created`/`files_modified` in state. Warns about untracked changes.
+
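+ A sketch of that comparison, assuming the state keys shown above (the function name and example path are illustrative):
+
+ ```python
+ # Illustrative sketch: compare `git diff --name-only` output with tracked files.
+ import subprocess
+
+ def untracked_changes(state: dict) -> set[str]:
+     # Files git reports as changed that the workflow state never recorded.
+     out = subprocess.run(
+         ["git", "diff", "--name-only"],
+         capture_output=True, text=True, check=True,
+     ).stdout
+     changed = {line for line in out.splitlines() if line}
+     tracked = set(state.get("files_created", [])) | set(state.get("files_modified", []))
+     return changed - tracked
+
+ state = {"files_created": ["app/api/brand/route.ts"], "files_modified": []}
+ for path in sorted(untracked_changes(state)):
+     print(f"WARNING: {path} changed but is not tracked in state")
+ ```
+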
+ ### Gap 3: Skipped Test Investigation
+ **Problem:** AI accepts "9 tests skipped" without investigating why.
+
+ **Fix:** A `verification_warnings` list in the state file tracks issues that need review; the Stop hook surfaces any unaddressed warnings.
+
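+ The tracked warnings might look like this in the state file (only `verification_warnings` itself is named by the tool; the inner field names are illustrative):
+
+ ```json
+ {
+   "verification_warnings": [
+     {
+       "type": "skipped_tests",
+       "detail": "9 tests skipped - investigate before completing",
+       "addressed": false
+     }
+   ]
+ }
+ ```
+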
+ ### Gap 4: Implementation Verification
+ **Problem:** AI marks the task complete without verifying the implementation matches the interview.
+
+ **Fix:** The Stop hook checks that (see the sketch after this list):
+ - Route files exist if endpoints were mentioned
+ - Test files are tracked
+ - Key terms from the interview appear in the implementation
+
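+ In outline, those checks amount to something like this (the state keys `endpoints` and `files_created` are assumptions for illustration):
+
+ ```python
+ # Illustrative outline of the stop-hook completion checks.
+ import os
+
+ def completion_warnings(state: dict) -> list[str]:
+     warnings = []
+     # Route files should exist for every endpoint the interview mentioned.
+     for route in state.get("endpoints", []):
+         if not os.path.exists(route):
+             warnings.append(f"endpoint route missing: {route}")
+     # At least one test file should be tracked before stopping.
+     if not any(p.endswith((".test.ts", ".spec.ts"))
+                for p in state.get("files_created", [])):
+         warnings.append("no test files tracked in files_created")
+     return warnings
+
+ print(completion_warnings({"endpoints": ["app/api/brand/route.ts"],
+                            "files_created": []}))
+ ```
+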
+ ### Gap 5: Test/Production Alignment
+ **Problem:** Test files check different environment variables than production code.
+
+ **Example:**
+ - Interview: "single gateway key"
+ - Production: uses `AI_GATEWAY_API_KEY`
+ - Test: still checks `OPENAI_API_KEY` (wrong!)
+
+ **Fix:** `verify-implementation.py` warns when test files check env vars that don't match interview requirements.
+
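+ A sketch of that check for a TypeScript codebase (the regex and function name are illustrative):
+
+ ```python
+ # Illustrative sketch: env vars read by tests but never by production code.
+ import re
+
+ ENV_RE = re.compile(r"process\.env\.([A-Z0-9_]+)")
+
+ def env_mismatches(test_src: str, prod_src: str) -> set[str]:
+     # Env vars the tests check that production code never reads.
+     return set(ENV_RE.findall(test_src)) - set(ENV_RE.findall(prod_src))
+
+ prod = "const key = process.env.AI_GATEWAY_API_KEY;"
+ test = "if (!process.env.OPENAI_API_KEY) return; // stale guard"
+ print(env_mismatches(test, prod))  # -> {'OPENAI_API_KEY'}
+ ```
+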
+ ### Gap 6: Training Data Reliance (v1.7.0+)
+ **Problem:** AI answers questions about external APIs from potentially outdated training data instead of researching first.
+
+ **Example:**
+ - User asks: "What providers does Vercel AI Gateway support?"
+ - AI answers from memory: "Groq not in gateway" (WRONG!)
+ - Reality: Groq has 4 models in the gateway (Llama variants)
+
+ **Fix:** A new `UserPromptSubmit` hook (`enforce-external-research.py`) that:
+ 1. Detects questions about external APIs/SDKs using pattern matching
+ 2. Injects context requiring research before answering
+ 3. Works for ANY API (Brandfetch, Stripe, Twilio, etc.) - not just specific ones
+ 4. Auto-allows WebSearch and Context7 without permission prompts
+
+ ```
+ USER: "What providers does Brandfetch API support?"
+
+ HOOK: Detects "Brandfetch", "API", "providers"
+
+ INJECTS: "RESEARCH REQUIRED: Use Context7/WebSearch before answering"
+
+ CLAUDE: Researches first → Gives accurate answer
+ ```
+
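+ In outline, such a hook can stay very small. A sketch, assuming Claude Code's `UserPromptSubmit` convention of prompt JSON on stdin and injected context on stdout (the detection regex is illustrative, not the shipped pattern list):
+
+ ```python
+ # Sketch of an enforce-external-research style UserPromptSubmit hook.
+ import json
+ import re
+ import sys
+
+ # Illustrative pattern; the real hook matches a broader set of phrasings.
+ API_QUESTION = re.compile(
+     r"\b(what|which|does|how)\b.*\b(api|sdk|provider|endpoint)s?\b",
+     re.IGNORECASE,
+ )
+
+ def main() -> None:
+     prompt = json.load(sys.stdin).get("prompt", "")
+     if API_QUESTION.search(prompt):
+         # Stdout from a UserPromptSubmit hook is added to the model's context.
+         print("RESEARCH REQUIRED: Use Context7/WebSearch before answering.")
+     sys.exit(0)
+
+ if __name__ == "__main__":
+     main()
+ ```
+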
  ## 🔧 Requirements
 
  - **Node.js** 14.0.0 or higher
@@ -259,6 +259,83 @@ With thorough research:
  - ✅ Robust implementation
  - ✅ Better documentation
 
+ ---
+
+ ## Research-First Schema Design (MANDATORY)
+
+ ### The Anti-Pattern: Schema-First Development
+
+ **NEVER DO THIS:**
+ - ❌ Define interfaces based on assumptions before researching
+ - ❌ Rely on training data for API capabilities
+ - ❌ Say "I think it supports..." without verification
+ - ❌ Build schemas from memory instead of documentation
+
+ **Real Example of Failure:**
+ - User asked: "What providers does Vercel AI Gateway support?"
+ - AI answered from memory: "Groq not in gateway"
+ - Reality: Groq has 4 models in the gateway (Llama variants)
+ - Root cause: no research was done before answering
+
+ ### The Correct Pattern: Research-First
+
+ **ALWAYS DO THIS:**
+
+ **Step 1: Research the Source of Truth**
+ - Use Context7 (`mcp__context7__resolve-library-id` + `get-library-docs`) for SDK docs
+ - Use WebSearch for official provider documentation
+ - Query APIs directly when possible (don't assume)
+ - Check GitHub repositories for current implementation
+
+ **Step 2: Build Schema FROM Research**
+ - Interface fields emerge from discovered capabilities
+ - Every field has a source (docs, SDK types, API response)
+ - Don't guess - verify each capability
+ - Document where each field came from (see the sketch after this list)
+
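+ A hypothetical sketch of a schema whose fields each cite their source (the class, field values, and provider name are placeholders, not verified capability data):
+
+ ```python
+ # Hypothetical capability schema; every field names where it was verified.
+ from dataclasses import dataclass, field
+
+ @dataclass
+ class ProviderCapability:
+     name: str
+     streaming: bool        # source: provider docs (cite the exact URL)
+     tool_calls: bool       # source: SDK type definitions (cite file/version)
+     verified_by: list[str] = field(default_factory=list)  # research queries used
+
+ example = ProviderCapability(
+     name="example-provider",  # placeholder, not a real capability claim
+     streaming=True,
+     tool_calls=False,
+     verified_by=["example-provider streaming support docs"],
+ )
+ ```
+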
+ **Step 3: Verify with Actual Calls**
+ - Test capabilities before marking them supported
+ - Investigate skipped tests - they're bugs, not features
+ - No "should work" - prove it works
+ - All tests must pass, not be skipped
+
+ ### Mandatory Checklist Before Answering ANY External API Question
+
+ Before responding to questions about APIs, SDKs, or external services:
+
+ ```
+ [ ] Did I use Context7 to get current documentation?
+ [ ] Did I use WebSearch for official docs?
+ [ ] Did I verify the information is current (not training data)?
+ [ ] Am I stating facts from research, not memory?
+ [ ] Have I cited my sources?
+ ```
+
+ ### Research Query Tracking
+
+ All research is now tracked in `.claude/api-dev-state.json`:
+
+ ```json
+ {
+   "research_queries": [
+     {
+       "timestamp": "2025-12-07T...",
+       "tool": "WebSearch",
+       "query": "Vercel AI Gateway Groq providers",
+       "terms": ["vercel", "gateway", "groq", "providers"]
+     },
+     {
+       "timestamp": "2025-12-07T...",
+       "tool": "mcp__context7__get-library-docs",
+       "library": "@ai-sdk/gateway",
+       "terms": ["@ai-sdk/gateway"]
+     }
+   ]
+ }
+ ```
+
+ This allows verification that specific topics were actually researched before answering.
+
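+ A sketch of that verification, assuming the `research_queries` shape above (the word-subset rule is illustrative):
+
+ ```python
+ # Illustrative check that a topic was researched before answering.
+ import json
+
+ def was_researched(topic: str, state_path: str = ".claude/api-dev-state.json") -> bool:
+     # True if every word of the topic appears in some logged query's terms.
+     with open(state_path) as f:
+         state = json.load(f)
+     words = set(topic.lower().split())
+     return any(
+         words <= set(q.get("terms", []))
+         for q in state.get("research_queries", [])
+     )
+
+ if not was_researched("vercel gateway groq"):
+     print("BLOCK: research this topic before answering")
+ ```
+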
  <claude-commands-template>
  ## Research Guidelines