@hustle-together/api-dev-tools 1.6.0 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -26,8 +26,9 @@ Five powerful slash commands for Claude Code:
  - **`/api-status [endpoint]`** - Track implementation progress and phase completion
 
  ### Enforcement Hooks
- Five Python hooks that provide **real programmatic guarantees**:
+ Six Python hooks that provide **real programmatic guarantees**:
 
+ - **`enforce-external-research.py`** - (v1.7.0) Detects external API questions and requires research before answering
  - **`enforce-research.py`** - Blocks API code writing until research is complete
  - **`enforce-interview.py`** - Verifies user questions were actually asked (prevents self-answering)
  - **`verify-implementation.py`** - Checks implementation matches interview requirements
@@ -445,6 +446,30 @@ The workflow now includes automatic detection of common implementation gaps:
 
  **Fix:** `verify-implementation.py` warns when test files check env vars that don't match interview requirements.
 
+ ### Gap 6: Training Data Reliance (v1.7.0+)
+ **Problem:** AI answers questions about external APIs from potentially outdated training data instead of researching first.
+
+ **Example:**
+ - User asks: "What providers does Vercel AI Gateway support?"
+ - AI answers from memory: "Groq not in gateway" (WRONG!)
+ - Reality: Groq has 4 models in the gateway (Llama variants)
+
+ **Fix:** New `UserPromptSubmit` hook (`enforce-external-research.py`) that:
+ 1. Detects questions about external APIs/SDKs using pattern matching
+ 2. Injects context requiring research before answering
+ 3. Works for ANY API (Brandfetch, Stripe, Twilio, etc.) - not just specific ones
+ 4. Auto-allows WebSearch and Context7 without permission prompts
+
+ ```
+ USER: "What providers does Brandfetch API support?"
+
+ HOOK: Detects "Brandfetch", "API", "providers"
+
+ INJECTS: "RESEARCH REQUIRED: Use Context7/WebSearch before answering"
+
+ CLAUDE: Researches first → Gives accurate answer
+ ```
+
 
  ## 🔧 Requirements
 
  - **Node.js** 14.0.0 or higher
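The injection flow described in the Gap 6 hunk above reduces to a small contract: read the prompt, decide, optionally emit context, always allow. A minimal sketch, not the package's code — the `"api"` substring trigger here is a stand-in for the real service-list and regex matching:

```python
import json  # the actual hook wrapper parses {"prompt": ...} from stdin


def build_injection(prompt: str) -> str:
    """Return context to inject before the prompt is answered, or "" for none.

    Simplified stand-in for enforce-external-research.py: here any mention
    of "api" triggers the requirement; the real hook uses known-service
    lists and capability-question regexes.
    """
    if "api" in prompt.lower():
        return "RESEARCH REQUIRED: Use Context7/WebSearch before answering"
    return ""


# The wrapper prints the injection to stdout and always exits 0,
# so the user's prompt proceeds whether or not context was added.
payload = json.loads('{"prompt": "What providers does Brandfetch API support?"}')
print(build_injection(payload["prompt"]))
```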
@@ -259,6 +259,83 @@ With thorough research:
  - ✅ Robust implementation
  - ✅ Better documentation
 
+ ---
+
+ ## Research-First Schema Design (MANDATORY)
+
+ ### The Anti-Pattern: Schema-First Development
+
+ **NEVER DO THIS:**
+ - ❌ Define interfaces based on assumptions before researching
+ - ❌ Rely on training data for API capabilities
+ - ❌ Say "I think it supports..." without verification
+ - ❌ Build schemas from memory instead of documentation
+
+ **Real Example of Failure:**
+ - User asked: "What providers does Vercel AI Gateway support?"
+ - AI answered from memory: "Groq not in gateway"
+ - Reality: Groq has 4 models in the gateway (Llama variants)
+ - Root cause: No research was done before answering
+
+ ### The Correct Pattern: Research-First
+
+ **ALWAYS DO THIS:**
+
+ **Step 1: Research the Source of Truth**
+ - Use Context7 (`mcp__context7__resolve-library-id` + `get-library-docs`) for SDK docs
+ - Use WebSearch for official provider documentation
+ - Query APIs directly when possible (don't assume)
+ - Check GitHub repositories for current implementation
+
+ **Step 2: Build Schema FROM Research**
+ - Interface fields emerge from discovered capabilities
+ - Every field has a source (docs, SDK types, API response)
+ - Don't guess - verify each capability
+ - Document where each field came from
+
+ **Step 3: Verify with Actual Calls**
+ - Test capabilities before marking them supported
+ - Investigate skipped tests - they're bugs, not features
+ - No "should work" - prove it works
+ - All tests must pass, not be skipped
+
+ ### Mandatory Checklist Before Answering ANY External API Question
+
+ Before responding to questions about APIs, SDKs, or external services:
+
+ ```
+ [ ] Did I use Context7 to get current documentation?
+ [ ] Did I use WebSearch for official docs?
+ [ ] Did I verify the information is current (not training data)?
+ [ ] Am I stating facts from research, not memory?
+ [ ] Have I cited my sources?
+ ```
+
+ ### Research Query Tracking
+
+ All research is now tracked in `.claude/api-dev-state.json`:
+
+ ```json
+ {
+   "research_queries": [
+     {
+       "timestamp": "2025-12-07T...",
+       "tool": "WebSearch",
+       "query": "Vercel AI Gateway Groq providers",
+       "terms": ["vercel", "gateway", "groq", "providers"]
+     },
+     {
+       "timestamp": "2025-12-07T...",
+       "tool": "mcp__context7__get-library-docs",
+       "library": "@ai-sdk/gateway",
+       "terms": ["@ai-sdk/gateway"]
+     }
+   ]
+ }
+ ```
+
+ This allows verification that specific topics were actually researched before answering.
+
  <claude-commands-template>
  ## Research Guidelines
 
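The `research_queries` log shown above enables a simple pre-answer check: a topic counts as researched only if one of its terms appears in a logged query. A minimal sketch of that lookup, with field names following the JSON structure above:

```python
def was_researched(term: str, state: dict) -> bool:
    """True if `term` matches a term of any logged research query."""
    needle = term.lower()
    for q in state.get("research_queries", []):
        if needle in (t.lower() for t in q.get("terms", [])):
            return True
    return False


# State shaped like the .claude/api-dev-state.json example above
state = {
    "research_queries": [
        {"tool": "WebSearch",
         "query": "Vercel AI Gateway Groq providers",
         "terms": ["vercel", "gateway", "groq", "providers"]},
    ]
}
```

With this state, a question about Groq passes the check while a question about Stripe would trigger the research requirement.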
@@ -0,0 +1,318 @@
+ #!/usr/bin/env python3
+ """
+ Hook: UserPromptSubmit
+ Purpose: Enforce research before answering external API/SDK questions
+
+ This hook runs BEFORE Claude processes the user's prompt. It detects
+ questions about external APIs, SDKs, or services and injects context
+ requiring Claude to research first before answering.
+
+ Philosophy: "When in doubt, research. Training data is ALWAYS potentially outdated."
+
+ Returns:
+     - Prints context to stdout (injected into conversation)
+     - Exit 0 to allow the prompt to proceed
+ """
+ import json
+ import sys
+ import re
+ from pathlib import Path
+ from datetime import datetime
+
+ # State file is in .claude/ directory (sibling to hooks/)
+ STATE_FILE = Path(__file__).parent.parent / "api-dev-state.json"
+
+ # ============================================================================
+ # PATTERN-BASED DETECTION
+ # ============================================================================
+
+ # Patterns that indicate external service/API mentions
+ EXTERNAL_SERVICE_PATTERNS = [
+     # Package names
+     r"@[\w-]+/[\w-]+",                   # @scope/package
+     r"\b[\w-]+-(?:sdk|api|js|ts|py)\b",  # something-sdk, something-api, something-js
+
+     # API/SDK keywords
+     r"\b(?:api|sdk|library|package|module|framework)\b",
+
+     # Technical implementation terms
+     r"\b(?:endpoint|route|webhook|oauth|auth|token)\b",
+
+     # Version references
+     r"\bv?\d+\.\d+(?:\.\d+)?\b",         # version numbers like v1.2.3, 2.0
+
+     # Import/require patterns
+     r"(?:import|require|from)\s+['\"][\w@/-]+['\"]",
+ ]
+
+ # Patterns that indicate asking about features/capabilities
+ CAPABILITY_QUESTION_PATTERNS = [
+     # "What does X support/have/do"
+     r"what\s+(?:does|can|are|is)\s+\w+",
+     r"what\s+\w+\s+(?:support|have|provide|offer)",
+
+     # "Does X support/have"
+     r"(?:does|can|will)\s+\w+\s+(?:support|have|handle|do|work)",
+
+     # "How to/do" questions
+     r"how\s+(?:to|do|does|can|should)\s+",
+
+     # Lists and availability
+     r"(?:list|show)\s+(?:of|all|available)",
+     r"which\s+\w+\s+(?:are|is)\s+(?:available|supported)",
+     r"all\s+(?:available|supported)\s+\w+",
+
+     # Examples and implementation
+     r"example\s+(?:of|for|using|with)",
+     r"how\s+to\s+(?:use|implement|integrate|connect|setup|configure)",
+ ]
+
+ # Common external service/company names (partial list - patterns catch the rest)
+ KNOWN_SERVICES = [
+     # AI/ML
+     "openai", "anthropic", "google", "gemini", "gpt", "claude", "llama",
+     "groq", "perplexity", "mistral", "cohere", "huggingface", "replicate",
+
+     # Cloud/Infrastructure
+     "aws", "azure", "gcp", "vercel", "netlify", "cloudflare", "supabase",
+     "firebase", "mongodb", "postgres", "redis", "elasticsearch",
+
+     # APIs/Services
+     "stripe", "twilio", "sendgrid", "mailchimp", "slack", "discord",
+     "github", "gitlab", "bitbucket", "jira", "notion", "airtable",
+     "shopify", "salesforce", "hubspot", "zendesk",
+
+     # Data/Analytics
+     "segment", "mixpanel", "amplitude", "datadog", "sentry", "grafana",
+
+     # Media/Content
+     "cloudinary", "imgix", "mux", "brandfetch", "unsplash", "pexels",
+
+     # Auth
+     "auth0", "okta", "clerk", "nextauth", "passport",
+
+     # Payments
+     "paypal", "square", "braintree", "adyen",
+ ]
+
+ # ============================================================================
+ # DETECTION LOGIC
+ # ============================================================================
+
+ def detect_external_api_question(prompt: str) -> dict:
+     """
+     Detect if the prompt is asking about external APIs/SDKs.
+
+     Returns:
+         {
+             "detected": bool,
+             "terms": list of detected terms,
+             "patterns_matched": list of pattern types matched,
+             "confidence": "high" | "medium" | "low" | "none"
+         }
+     """
+     prompt_lower = prompt.lower()
+     detected_terms = []
+     patterns_matched = []
+
+     # Check for known services
+     for service in KNOWN_SERVICES:
+         if service in prompt_lower:
+             detected_terms.append(service)
+             patterns_matched.append("known_service")
+
+     # Check external service patterns
+     for pattern in EXTERNAL_SERVICE_PATTERNS:
+         matches = re.findall(pattern, prompt_lower, re.IGNORECASE)
+         if matches:
+             detected_terms.extend(matches)
+             patterns_matched.append("external_service_pattern")
+
+     # Check capability question patterns
+     for pattern in CAPABILITY_QUESTION_PATTERNS:
+         if re.search(pattern, prompt_lower, re.IGNORECASE):
+             patterns_matched.append("capability_question")
+             break
+
+     # Deduplicate
+     detected_terms = list(set(detected_terms))
+     patterns_matched = list(set(patterns_matched))
+
+     # Determine confidence
+     if "known_service" in patterns_matched and "capability_question" in patterns_matched:
+         confidence = "high"
+     elif "known_service" in patterns_matched or len(detected_terms) >= 2:
+         confidence = "medium"
+     elif patterns_matched:
+         confidence = "low"
+     else:
+         confidence = "none"
+
+     return {
+         "detected": confidence in ["high", "medium"],
+         "terms": detected_terms[:10],  # Limit to 10 terms
+         "patterns_matched": patterns_matched,
+         "confidence": confidence,
+     }
+
+
+ def check_active_workflow() -> bool:
+     """Check if there's an active API development workflow."""
+     if not STATE_FILE.exists():
+         return False
+
+     try:
+         state = json.loads(STATE_FILE.read_text())
+         phases = state.get("phases", {})
+
+         # Check if any phase is in progress
+         for phase_data in phases.values():
+             if isinstance(phase_data, dict):
+                 if phase_data.get("status", "") in ["in_progress", "pending"]:
+                     return True
+
+         return False
+     except Exception:
+         return False
+
+
+ def check_already_researched(terms: list) -> list:
+     """Check which terms have already been researched."""
+     if not STATE_FILE.exists():
+         return []
+
+     try:
+         state = json.loads(STATE_FILE.read_text())
+         research_queries = state.get("research_queries", [])
+
+         # Also check sources in phases
+         phases = state.get("phases", {})
+         all_sources = []
+         for phase_data in phases.values():
+             if isinstance(phase_data, dict):
+                 all_sources.extend(phase_data.get("sources", []))
+
+         # Combine all research text (logged queries store "query" and a
+         # "terms" list - see enforce-research.py)
+         all_research_text = " ".join(str(s) for s in all_sources)
+         for q in research_queries:
+             if isinstance(q, dict):
+                 all_research_text += " " + str(q.get("query", ""))
+                 all_research_text += " " + " ".join(str(t) for t in q.get("terms", []))
+         all_research_text = all_research_text.lower()
+
+         # Find which terms were already researched
+         already_researched = []
+         for term in terms:
+             if term.lower() in all_research_text:
+                 already_researched.append(term)
+
+         return already_researched
+     except Exception:
+         return []
+
+
+ def log_detection(prompt: str, detection: dict) -> None:
+     """Log this detection for debugging/auditing."""
+     if not STATE_FILE.exists():
+         return
+
+     try:
+         state = json.loads(STATE_FILE.read_text())
+
+         if "prompt_detections" not in state:
+             state["prompt_detections"] = []
+
+         state["prompt_detections"].append({
+             "timestamp": datetime.now().isoformat(),
+             "prompt_preview": prompt[:100] + "..." if len(prompt) > 100 else prompt,
+             "detection": detection,
+         })
+
+         # Keep only last 20 detections
+         state["prompt_detections"] = state["prompt_detections"][-20:]
+
+         STATE_FILE.write_text(json.dumps(state, indent=2))
+     except Exception:
+         pass  # Don't fail the hook on logging errors
+
+
+ # ============================================================================
+ # MAIN
+ # ============================================================================
+
+ def main():
+     # Read hook input from stdin
+     try:
+         input_data = json.load(sys.stdin)
+     except json.JSONDecodeError:
+         # If we can't parse input, allow without injection
+         sys.exit(0)
+
+     prompt = input_data.get("prompt", "")
+
+     if not prompt:
+         sys.exit(0)
+
+     # Check if in active workflow mode (stricter enforcement)
+     active_workflow = check_active_workflow()
+
+     # Detect external API questions
+     detection = detect_external_api_question(prompt)
+
+     # Log for debugging
+     if detection["detected"] or active_workflow:
+         log_detection(prompt, detection)
+
+     # Determine if we should inject research requirement
+     should_inject = False
+
+     if active_workflow:
+         # In active workflow, ALWAYS inject for technical questions
+         if detection["confidence"] in ["high", "medium", "low"]:
+             should_inject = True
+     elif detection["detected"]:
+         # Check if already researched
+         already_researched = check_already_researched(detection["terms"])
+         unresearched_terms = [t for t in detection["terms"] if t not in already_researched]
+
+         if unresearched_terms:
+             should_inject = True
+             detection["unresearched"] = unresearched_terms
+
+     # Inject context if needed
+     if should_inject:
+         terms_str = ", ".join(detection.get("unresearched", detection["terms"])[:5])
+
+         injection = f"""
+ <user-prompt-submit-hook>
+ EXTERNAL API/SDK DETECTED: {terms_str}
+ Confidence: {detection["confidence"]}
+ {"Mode: Active API Development Workflow" if active_workflow else ""}
+
+ MANDATORY RESEARCH REQUIREMENT:
+ Before answering this question, you MUST:
+
+ 1. Use Context7 (mcp__context7__resolve-library-id + get-library-docs) to look up current documentation
+ 2. Use WebSearch to find official documentation and recent updates
+ 3. NEVER answer from training data alone - it may be outdated
+
+ Training data can be months or years old. APIs change constantly.
+ Research first. Then answer with verified, current information.
+
+ After researching, cite your sources in your response.
+ </user-prompt-submit-hook>
+ """
+         print(injection)
+
+     # Always allow the prompt to proceed
+     sys.exit(0)
+
+
+ if __name__ == "__main__":
+     main()
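The confidence tiers in `detect_external_api_question` above reduce to a small decision table. Restated here for illustration — a simplified re-implementation of the tiering step, not the module itself:

```python
def confidence(patterns_matched: list, term_count: int) -> str:
    """Mirror of the tiering logic in detect_external_api_question."""
    if "known_service" in patterns_matched and "capability_question" in patterns_matched:
        return "high"    # named service + capability-style question
    if "known_service" in patterns_matched or term_count >= 2:
        return "medium"  # one strong signal
    if patterns_matched:
        return "low"     # weak pattern hits only
    return "none"

# "detected" (and thus injection outside an active workflow) requires
# a confidence of high or medium; active workflows also inject on low.
```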
@@ -147,6 +147,29 @@ def main():
      # Add to sources list
      sources.append(source_entry)
 
+     # Also add to research_queries for prompt verification
+     research_queries = state.setdefault("research_queries", [])
+     query_entry = {
+         "timestamp": timestamp,
+         "tool": tool_name,
+     }
+
+     # Extract query/term based on tool type
+     if tool_name == "WebSearch":
+         query_entry["query"] = tool_input.get("query", "")
+         query_entry["terms"] = extract_terms(tool_input.get("query", ""))
+     elif tool_name == "WebFetch":
+         query_entry["url"] = tool_input.get("url", "")
+         query_entry["terms"] = extract_terms_from_url(tool_input.get("url", ""))
+     elif "context7" in tool_name.lower():
+         query_entry["library"] = tool_input.get("libraryName", tool_input.get("libraryId", ""))
+         query_entry["terms"] = [tool_input.get("libraryName", "").lower()]
+
+     research_queries.append(query_entry)
+
+     # Keep only last 50 queries
+     state["research_queries"] = research_queries[-50:]
+
      # Update last activity timestamp
      research["last_activity"] = timestamp
      research["source_count"] = len(sources)
@@ -190,7 +213,7 @@ def main():
  def create_initial_state():
      """Create initial state structure"""
      return {
-         "version": "1.0.0",
+         "version": "1.1.0",
          "created_at": datetime.now().isoformat(),
          "phases": {
              "scope": {"status": "not_started"},
@@ -209,7 +232,9 @@ def create_initial_state():
              "schema_matches_docs": False,
              "tests_cover_params": False,
              "all_tests_passing": False
-         }
+         },
+         "research_queries": [],
+         "prompt_detections": []
      }
 
 
@@ -225,5 +250,48 @@ def sanitize_input(tool_input):
      return sanitized
 
 
+ def extract_terms(query: str) -> list:
+     """Extract searchable terms from a query string."""
+     import re
+
+     # Remove common words and extract meaningful terms
+     stop_words = {"the", "a", "an", "is", "are", "was", "were", "be", "been",
+                   "how", "to", "do", "does", "what", "which", "for", "and", "or",
+                   "in", "on", "at", "with", "from", "this", "that", "it"}
+
+     # Extract words
+     words = re.findall(r'\b[\w@/-]+\b', query.lower())
+
+     # Filter and return
+     terms = [w for w in words if w not in stop_words and len(w) > 2]
+     return terms[:10]  # Limit to 10 terms
+
+
+ def extract_terms_from_url(url: str) -> list:
+     """Extract meaningful terms from a URL."""
+     import re
+     from urllib.parse import urlparse
+
+     try:
+         parsed = urlparse(url)
+         # Get domain parts and path parts
+         domain_parts = parsed.netloc.replace("www.", "").split(".")
+         path_parts = [p for p in parsed.path.split("/") if p]
+
+         # Combine and filter
+         all_parts = domain_parts + path_parts
+         terms = []
+         for part in all_parts:
+             # Split by common separators
+             sub_parts = re.split(r'[-_.]', part.lower())
+             terms.extend(sub_parts)
+
+         # Filter short/common terms
+         stop_terms = {"com", "org", "io", "dev", "api", "docs", "www", "http", "https"}
+         terms = [t for t in terms if t not in stop_terms and len(t) > 2]
+         return terms[:10]
+     except Exception:
+         return []
+
+
  if __name__ == "__main__":
      main()
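Worked example of the URL term extraction added above, restated without the outer try/except for brevity:

```python
import re
from urllib.parse import urlparse


def extract_terms_from_url(url: str) -> list:
    # Split the domain and path into parts, then split each part on -, _ and .
    parsed = urlparse(url)
    domain_parts = parsed.netloc.replace("www.", "").split(".")
    path_parts = [p for p in parsed.path.split("/") if p]
    terms = []
    for part in domain_parts + path_parts:
        terms.extend(re.split(r"[-_.]", part.lower()))
    # Drop generic registrar/doc tokens and anything 2 characters or shorter
    stop_terms = {"com", "org", "io", "dev", "api", "docs", "www", "http", "https"}
    return [t for t in terms if t not in stop_terms and len(t) > 2][:10]


# e.g. "https://vercel.com/docs/ai-gateway" keeps ["vercel", "gateway"]
# ("com" and "docs" are stop terms, "ai" is too short)
```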
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@hustle-together/api-dev-tools",
-   "version": "1.6.0",
+   "version": "1.7.0",
    "description": "Interview-driven API development workflow for Claude Code - Automates research, testing, and documentation",
    "main": "bin/cli.js",
    "bin": {
@@ -1,8 +1,10 @@
  {
-   "version": "1.0.0",
+   "version": "1.1.0",
    "created_at": null,
    "endpoint": null,
    "library": null,
+   "research_queries": [],
+   "prompt_detections": [],
    "phases": {
      "scope": {
        "status": "not_started",
@@ -4,6 +4,8 @@
      "WebSearch",
      "WebFetch",
      "mcp__context7",
+     "mcp__context7__resolve-library-id",
+     "mcp__context7__get-library-docs",
      "mcp__github",
      "Bash(claude mcp:*)",
      "Bash(pnpm test:*)",
@@ -14,6 +16,16 @@
      ]
    },
    "hooks": {
+     "UserPromptSubmit": [
+       {
+         "hooks": [
+           {
+             "type": "command",
+             "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/enforce-external-research.py"
+           }
+         ]
+       }
+     ],
      "PreToolUse": [
        {
          "matcher": "Write|Edit",