@hustle-together/api-dev-tools 1.3.0 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -26,11 +26,14 @@ Five powerful slash commands for Claude Code:
  - **`/api-status [endpoint]`** - Track implementation progress and phase completion
 
  ### Enforcement Hooks
- Three Python hooks that provide **real programmatic guarantees**:
+ Six Python hooks that provide **real programmatic guarantees**:
 
+ - **`enforce-external-research.py`** - (v1.7.0) Detects external API questions and requires research before answering
  - **`enforce-research.py`** - Blocks API code writing until research is complete
- - **`track-tool-use.py`** - Logs all research activity (Context7, WebSearch, WebFetch)
- - **`api-workflow-check.py`** - Prevents stopping until required phases are complete
+ - **`enforce-interview.py`** - Verifies user questions were actually asked (prevents self-answering)
+ - **`verify-implementation.py`** - Checks that the implementation matches interview requirements
+ - **`track-tool-use.py`** - Logs all research activity (Context7, WebSearch, WebFetch, AskUserQuestion)
+ - **`api-workflow-check.py`** - Prevents stopping until required phases are complete, plus git diff verification
 
  ### State Tracking
  - **`.claude/api-dev-state.json`** - Persistent state file tracking all workflow progress
@@ -402,6 +405,71 @@ The `.claude/api-dev-state.json` file tracks:
  - All research activity is **logged** - auditable trail
  - Workflow completion is **verified** - can't stop early
 
+ ## 🔍 Gap Detection & Verification (v1.6.0+)
+
+ The workflow now includes automatic detection of common implementation gaps:
+
+ ### Gap 1: Exact Term Matching
+ **Problem:** AI paraphrases user terminology instead of using exact terms for research.
+
+ **Example:**
+ - User says: "Use Vercel AI Gateway"
+ - AI searches for: "Vercel AI SDK" (wrong!)
+
+ **Fix:** `verify-implementation.py` extracts key terms from interview answers and warns if those exact terms weren't used in research queries.
+
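+ A minimal sketch of that check, assuming whitespace-delimited terms (the stopword list and function names are illustrative, not the hook's actual internals):
+
+ ```python
+ # Illustrative sketch of exact-term matching (not the shipped hook's code).
+ STOPWORDS = {"the", "a", "an", "use", "using", "with", "for", "to", "and"}
+
+ def key_terms(answer: str) -> set[str]:
+     # Normalize an interview answer into candidate key terms.
+     return {w.strip('.,"!?').lower() for w in answer.split()} - STOPWORDS
+
+ def unresearched_terms(answer: str, queries: list[str]) -> set[str]:
+     # Terms the user said that never appeared in any research query.
+     searched = " ".join(queries).lower()
+     return {t for t in key_terms(answer) if t not in searched}
+
+ # The paraphrase from the example above is caught:
+ print(unresearched_terms("Use Vercel AI Gateway", ["Vercel AI SDK streaming"]))
+ # -> {'gateway'}
+ ```
+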
+ ### Gap 2: File Change Tracking
+ **Problem:** AI claims "all files updated" but doesn't verify which files actually changed.
+
+ **Fix:** `api-workflow-check.py` runs `git diff --name-only` and compares against tracked `files_created`/`files_modified` in state. Warns about untracked changes.
+
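+ A sketch of that comparison, assuming the state keys shown above (the function name and example path are illustrative):
+
+ ```python
+ # Illustrative sketch: compare `git diff --name-only` output with tracked files.
+ import subprocess
+
+ def untracked_changes(state: dict) -> set[str]:
+     # Files git reports as changed that the workflow state never recorded.
+     out = subprocess.run(
+         ["git", "diff", "--name-only"],
+         capture_output=True, text=True, check=True,
+     ).stdout
+     changed = {line for line in out.splitlines() if line}
+     tracked = set(state.get("files_created", [])) | set(state.get("files_modified", []))
+     return changed - tracked
+
+ state = {"files_created": ["app/api/brand/route.ts"], "files_modified": []}
+ for path in sorted(untracked_changes(state)):
+     print(f"WARNING: {path} changed but is not tracked in state")
+ ```
+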
+ ### Gap 3: Skipped Test Investigation
+ **Problem:** AI accepts "9 tests skipped" without investigating why.
+
+ **Fix:** A `verification_warnings` list in the state file tracks issues that need review; the Stop hook surfaces any unaddressed warnings.
+
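+ The tracked warnings might look like this in the state file (only `verification_warnings` itself is named by the tool; the inner field names are illustrative):
+
+ ```json
+ {
+   "verification_warnings": [
+     {
+       "type": "skipped_tests",
+       "detail": "9 tests skipped - investigate before completing",
+       "addressed": false
+     }
+   ]
+ }
+ ```
+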
+ ### Gap 4: Implementation Verification
+ **Problem:** AI marks the task complete without verifying the implementation matches the interview.
+
+ **Fix:** The Stop hook checks that (see the sketch after this list):
+ - Route files exist if endpoints were mentioned
+ - Test files are tracked
+ - Key terms from the interview appear in the implementation
+
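+ In outline, those checks amount to something like this (the state keys `endpoints` and `files_created` are assumptions for illustration):
+
+ ```python
+ # Illustrative outline of the stop-hook completion checks.
+ import os
+
+ def completion_warnings(state: dict) -> list[str]:
+     warnings = []
+     # Route files should exist for every endpoint the interview mentioned.
+     for route in state.get("endpoints", []):
+         if not os.path.exists(route):
+             warnings.append(f"endpoint route missing: {route}")
+     # At least one test file should be tracked before stopping.
+     if not any(p.endswith((".test.ts", ".spec.ts"))
+                for p in state.get("files_created", [])):
+         warnings.append("no test files tracked in files_created")
+     return warnings
+
+ print(completion_warnings({"endpoints": ["app/api/brand/route.ts"],
+                            "files_created": []}))
+ ```
+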
+ ### Gap 5: Test/Production Alignment
+ **Problem:** Test files check different environment variables than production code.
+
+ **Example:**
+ - Interview: "single gateway key"
+ - Production: uses `AI_GATEWAY_API_KEY`
+ - Test: still checks `OPENAI_API_KEY` (wrong!)
+
+ **Fix:** `verify-implementation.py` warns when test files check env vars that don't match interview requirements.
+
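+ A sketch of that check for a TypeScript codebase (the regex and function name are illustrative):
+
+ ```python
+ # Illustrative sketch: env vars read by tests but never by production code.
+ import re
+
+ ENV_RE = re.compile(r"process\.env\.([A-Z0-9_]+)")
+
+ def env_mismatches(test_src: str, prod_src: str) -> set[str]:
+     # Env vars the tests check that production code never reads.
+     return set(ENV_RE.findall(test_src)) - set(ENV_RE.findall(prod_src))
+
+ prod = "const key = process.env.AI_GATEWAY_API_KEY;"
+ test = "if (!process.env.OPENAI_API_KEY) return; // stale guard"
+ print(env_mismatches(test, prod))  # -> {'OPENAI_API_KEY'}
+ ```
+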
+ ### Gap 6: Training Data Reliance (v1.7.0+)
+ **Problem:** AI answers questions about external APIs from potentially outdated training data instead of researching first.
+
+ **Example:**
+ - User asks: "What providers does Vercel AI Gateway support?"
+ - AI answers from memory: "Groq not in gateway" (WRONG!)
+ - Reality: Groq has 4 models in the gateway (Llama variants)
+
+ **Fix:** A new `UserPromptSubmit` hook (`enforce-external-research.py`) that:
+ 1. Detects questions about external APIs/SDKs using pattern matching
+ 2. Injects context requiring research before answering
+ 3. Works for ANY API (Brandfetch, Stripe, Twilio, etc.) - not just specific ones
+ 4. Auto-allows WebSearch and Context7 without permission prompts
+
+ ```
+ USER: "What providers does Brandfetch API support?"
+
+ HOOK: Detects "Brandfetch", "API", "providers"
+
+ INJECTS: "RESEARCH REQUIRED: Use Context7/WebSearch before answering"
+
+ CLAUDE: Researches first → Gives accurate answer
+ ```
+
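+ In outline, such a hook can stay very small. A sketch, assuming Claude Code's `UserPromptSubmit` convention of prompt JSON on stdin and injected context on stdout (the detection regex is illustrative, not the shipped pattern list):
+
+ ```python
+ # Sketch of an enforce-external-research style UserPromptSubmit hook.
+ import json
+ import re
+ import sys
+
+ # Illustrative pattern; the real hook matches a broader set of phrasings.
+ API_QUESTION = re.compile(
+     r"\b(what|which|does|how)\b.*\b(api|sdk|provider|endpoint)s?\b",
+     re.IGNORECASE,
+ )
+
+ def main() -> None:
+     prompt = json.load(sys.stdin).get("prompt", "")
+     if API_QUESTION.search(prompt):
+         # Stdout from a UserPromptSubmit hook is added to the model's context.
+         print("RESEARCH REQUIRED: Use Context7/WebSearch before answering.")
+     sys.exit(0)
+
+ if __name__ == "__main__":
+     main()
+ ```
+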
  ## 🔧 Requirements
 
  - **Node.js** 14.0.0 or higher
@@ -259,6 +259,83 @@ With thorough research:
  - ✅ Robust implementation
  - ✅ Better documentation
 
+ ---
+
+ ## Research-First Schema Design (MANDATORY)
+
+ ### The Anti-Pattern: Schema-First Development
+
+ **NEVER DO THIS:**
+ - ❌ Define interfaces based on assumptions before researching
+ - ❌ Rely on training data for API capabilities
+ - ❌ Say "I think it supports..." without verification
+ - ❌ Build schemas from memory instead of documentation
+
+ **Real Example of Failure:**
+ - User asked: "What providers does Vercel AI Gateway support?"
+ - AI answered from memory: "Groq not in gateway"
+ - Reality: Groq has 4 models in the gateway (Llama variants)
+ - Root cause: no research was done before answering
+
+ ### The Correct Pattern: Research-First
+
+ **ALWAYS DO THIS:**
+
+ **Step 1: Research the Source of Truth**
+ - Use Context7 (`mcp__context7__resolve-library-id` + `get-library-docs`) for SDK docs
+ - Use WebSearch for official provider documentation
+ - Query APIs directly when possible (don't assume)
+ - Check GitHub repositories for current implementation
+
+ **Step 2: Build Schema FROM Research**
+ - Interface fields emerge from discovered capabilities
+ - Every field has a source (docs, SDK types, API response)
+ - Don't guess - verify each capability
+ - Document where each field came from (see the sketch after this list)
+
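+ A hypothetical sketch of a schema whose fields each cite their source (the class, field values, and provider name are placeholders, not verified capability data):
+
+ ```python
+ # Hypothetical capability schema; every field names where it was verified.
+ from dataclasses import dataclass, field
+
+ @dataclass
+ class ProviderCapability:
+     name: str
+     streaming: bool        # source: provider docs (cite the exact URL)
+     tool_calls: bool       # source: SDK type definitions (cite file/version)
+     verified_by: list[str] = field(default_factory=list)  # research queries used
+
+ example = ProviderCapability(
+     name="example-provider",  # placeholder, not a real capability claim
+     streaming=True,
+     tool_calls=False,
+     verified_by=["example-provider streaming support docs"],
+ )
+ ```
+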
+ **Step 3: Verify with Actual Calls**
+ - Test capabilities before marking them supported
+ - Investigate skipped tests - they're bugs, not features
+ - No "should work" - prove it works
+ - All tests must pass, not be skipped
+
+ ### Mandatory Checklist Before Answering ANY External API Question
+
+ Before responding to questions about APIs, SDKs, or external services:
+
+ ```
+ [ ] Did I use Context7 to get current documentation?
+ [ ] Did I use WebSearch for official docs?
+ [ ] Did I verify the information is current (not training data)?
+ [ ] Am I stating facts from research, not memory?
+ [ ] Have I cited my sources?
+ ```
+
+ ### Research Query Tracking
+
+ All research is now tracked in `.claude/api-dev-state.json`:
+
+ ```json
+ {
+   "research_queries": [
+     {
+       "timestamp": "2025-12-07T...",
+       "tool": "WebSearch",
+       "query": "Vercel AI Gateway Groq providers",
+       "terms": ["vercel", "gateway", "groq", "providers"]
+     },
+     {
+       "timestamp": "2025-12-07T...",
+       "tool": "mcp__context7__get-library-docs",
+       "library": "@ai-sdk/gateway",
+       "terms": ["@ai-sdk/gateway"]
+     }
+   ]
+ }
+ ```
+
+ This allows verification that specific topics were actually researched before answering.
+
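+ A sketch of that verification, assuming the `research_queries` shape above (the word-subset rule is illustrative):
+
+ ```python
+ # Illustrative check that a topic was researched before answering.
+ import json
+
+ def was_researched(topic: str, state_path: str = ".claude/api-dev-state.json") -> bool:
+     # True if every word of the topic appears in some logged query's terms.
+     with open(state_path) as f:
+         state = json.load(f)
+     words = set(topic.lower().split())
+     return any(
+         words <= set(q.get("terms", []))
+         for q in state.get("research_queries", [])
+     )
+
+ if not was_researched("vercel gateway groq"):
+     print("BLOCK: research this topic before answering")
+ ```
+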
  <claude-commands-template>
  ## Research Guidelines