@apitap/core 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (236)
  1. package/LICENSE +60 -0
  2. package/README.md +362 -0
  3. package/SKILL.md +270 -0
  4. package/dist/auth/crypto.d.ts +31 -0
  5. package/dist/auth/crypto.js +66 -0
  6. package/dist/auth/crypto.js.map +1 -0
  7. package/dist/auth/handoff.d.ts +29 -0
  8. package/dist/auth/handoff.js +180 -0
  9. package/dist/auth/handoff.js.map +1 -0
  10. package/dist/auth/manager.d.ts +46 -0
  11. package/dist/auth/manager.js +127 -0
  12. package/dist/auth/manager.js.map +1 -0
  13. package/dist/auth/oauth-refresh.d.ts +16 -0
  14. package/dist/auth/oauth-refresh.js +91 -0
  15. package/dist/auth/oauth-refresh.js.map +1 -0
  16. package/dist/auth/refresh.d.ts +43 -0
  17. package/dist/auth/refresh.js +217 -0
  18. package/dist/auth/refresh.js.map +1 -0
  19. package/dist/capture/anti-bot.d.ts +15 -0
  20. package/dist/capture/anti-bot.js +43 -0
  21. package/dist/capture/anti-bot.js.map +1 -0
  22. package/dist/capture/blocklist.d.ts +6 -0
  23. package/dist/capture/blocklist.js +70 -0
  24. package/dist/capture/blocklist.js.map +1 -0
  25. package/dist/capture/body-diff.d.ts +8 -0
  26. package/dist/capture/body-diff.js +102 -0
  27. package/dist/capture/body-diff.js.map +1 -0
  28. package/dist/capture/body-variables.d.ts +13 -0
  29. package/dist/capture/body-variables.js +142 -0
  30. package/dist/capture/body-variables.js.map +1 -0
  31. package/dist/capture/domain.d.ts +8 -0
  32. package/dist/capture/domain.js +34 -0
  33. package/dist/capture/domain.js.map +1 -0
  34. package/dist/capture/entropy.d.ts +33 -0
  35. package/dist/capture/entropy.js +100 -0
  36. package/dist/capture/entropy.js.map +1 -0
  37. package/dist/capture/filter.d.ts +11 -0
  38. package/dist/capture/filter.js +49 -0
  39. package/dist/capture/filter.js.map +1 -0
  40. package/dist/capture/graphql.d.ts +21 -0
  41. package/dist/capture/graphql.js +99 -0
  42. package/dist/capture/graphql.js.map +1 -0
  43. package/dist/capture/idle.d.ts +23 -0
  44. package/dist/capture/idle.js +44 -0
  45. package/dist/capture/idle.js.map +1 -0
  46. package/dist/capture/monitor.d.ts +26 -0
  47. package/dist/capture/monitor.js +183 -0
  48. package/dist/capture/monitor.js.map +1 -0
  49. package/dist/capture/oauth-detector.d.ts +18 -0
  50. package/dist/capture/oauth-detector.js +96 -0
  51. package/dist/capture/oauth-detector.js.map +1 -0
  52. package/dist/capture/pagination.d.ts +9 -0
  53. package/dist/capture/pagination.js +40 -0
  54. package/dist/capture/pagination.js.map +1 -0
  55. package/dist/capture/parameterize.d.ts +17 -0
  56. package/dist/capture/parameterize.js +63 -0
  57. package/dist/capture/parameterize.js.map +1 -0
  58. package/dist/capture/scrubber.d.ts +5 -0
  59. package/dist/capture/scrubber.js +38 -0
  60. package/dist/capture/scrubber.js.map +1 -0
  61. package/dist/capture/session.d.ts +46 -0
  62. package/dist/capture/session.js +445 -0
  63. package/dist/capture/session.js.map +1 -0
  64. package/dist/capture/token-detector.d.ts +16 -0
  65. package/dist/capture/token-detector.js +62 -0
  66. package/dist/capture/token-detector.js.map +1 -0
  67. package/dist/capture/verifier.d.ts +17 -0
  68. package/dist/capture/verifier.js +147 -0
  69. package/dist/capture/verifier.js.map +1 -0
  70. package/dist/cli.d.ts +2 -0
  71. package/dist/cli.js +930 -0
  72. package/dist/cli.js.map +1 -0
  73. package/dist/discovery/auth.d.ts +17 -0
  74. package/dist/discovery/auth.js +81 -0
  75. package/dist/discovery/auth.js.map +1 -0
  76. package/dist/discovery/fetch.d.ts +17 -0
  77. package/dist/discovery/fetch.js +59 -0
  78. package/dist/discovery/fetch.js.map +1 -0
  79. package/dist/discovery/frameworks.d.ts +11 -0
  80. package/dist/discovery/frameworks.js +249 -0
  81. package/dist/discovery/frameworks.js.map +1 -0
  82. package/dist/discovery/index.d.ts +21 -0
  83. package/dist/discovery/index.js +219 -0
  84. package/dist/discovery/index.js.map +1 -0
  85. package/dist/discovery/openapi.d.ts +13 -0
  86. package/dist/discovery/openapi.js +175 -0
  87. package/dist/discovery/openapi.js.map +1 -0
  88. package/dist/discovery/probes.d.ts +9 -0
  89. package/dist/discovery/probes.js +70 -0
  90. package/dist/discovery/probes.js.map +1 -0
  91. package/dist/index.d.ts +25 -0
  92. package/dist/index.js +25 -0
  93. package/dist/index.js.map +1 -0
  94. package/dist/inspect/report.d.ts +52 -0
  95. package/dist/inspect/report.js +191 -0
  96. package/dist/inspect/report.js.map +1 -0
  97. package/dist/mcp.d.ts +8 -0
  98. package/dist/mcp.js +526 -0
  99. package/dist/mcp.js.map +1 -0
  100. package/dist/orchestration/browse.d.ts +38 -0
  101. package/dist/orchestration/browse.js +198 -0
  102. package/dist/orchestration/browse.js.map +1 -0
  103. package/dist/orchestration/cache.d.ts +15 -0
  104. package/dist/orchestration/cache.js +24 -0
  105. package/dist/orchestration/cache.js.map +1 -0
  106. package/dist/plugin.d.ts +17 -0
  107. package/dist/plugin.js +158 -0
  108. package/dist/plugin.js.map +1 -0
  109. package/dist/read/decoders/deepwiki.d.ts +2 -0
  110. package/dist/read/decoders/deepwiki.js +148 -0
  111. package/dist/read/decoders/deepwiki.js.map +1 -0
  112. package/dist/read/decoders/grokipedia.d.ts +2 -0
  113. package/dist/read/decoders/grokipedia.js +210 -0
  114. package/dist/read/decoders/grokipedia.js.map +1 -0
  115. package/dist/read/decoders/hackernews.d.ts +2 -0
  116. package/dist/read/decoders/hackernews.js +168 -0
  117. package/dist/read/decoders/hackernews.js.map +1 -0
  118. package/dist/read/decoders/index.d.ts +2 -0
  119. package/dist/read/decoders/index.js +12 -0
  120. package/dist/read/decoders/index.js.map +1 -0
  121. package/dist/read/decoders/reddit.d.ts +2 -0
  122. package/dist/read/decoders/reddit.js +142 -0
  123. package/dist/read/decoders/reddit.js.map +1 -0
  124. package/dist/read/decoders/twitter.d.ts +12 -0
  125. package/dist/read/decoders/twitter.js +187 -0
  126. package/dist/read/decoders/twitter.js.map +1 -0
  127. package/dist/read/decoders/wikipedia.d.ts +2 -0
  128. package/dist/read/decoders/wikipedia.js +66 -0
  129. package/dist/read/decoders/wikipedia.js.map +1 -0
  130. package/dist/read/decoders/youtube.d.ts +2 -0
  131. package/dist/read/decoders/youtube.js +69 -0
  132. package/dist/read/decoders/youtube.js.map +1 -0
  133. package/dist/read/extract.d.ts +25 -0
  134. package/dist/read/extract.js +320 -0
  135. package/dist/read/extract.js.map +1 -0
  136. package/dist/read/index.d.ts +14 -0
  137. package/dist/read/index.js +66 -0
  138. package/dist/read/index.js.map +1 -0
  139. package/dist/read/peek.d.ts +9 -0
  140. package/dist/read/peek.js +137 -0
  141. package/dist/read/peek.js.map +1 -0
  142. package/dist/read/types.d.ts +44 -0
  143. package/dist/read/types.js +3 -0
  144. package/dist/read/types.js.map +1 -0
  145. package/dist/replay/engine.d.ts +53 -0
  146. package/dist/replay/engine.js +441 -0
  147. package/dist/replay/engine.js.map +1 -0
  148. package/dist/replay/truncate.d.ts +16 -0
  149. package/dist/replay/truncate.js +92 -0
  150. package/dist/replay/truncate.js.map +1 -0
  151. package/dist/serve.d.ts +31 -0
  152. package/dist/serve.js +149 -0
  153. package/dist/serve.js.map +1 -0
  154. package/dist/skill/generator.d.ts +44 -0
  155. package/dist/skill/generator.js +419 -0
  156. package/dist/skill/generator.js.map +1 -0
  157. package/dist/skill/importer.d.ts +26 -0
  158. package/dist/skill/importer.js +80 -0
  159. package/dist/skill/importer.js.map +1 -0
  160. package/dist/skill/search.d.ts +19 -0
  161. package/dist/skill/search.js +51 -0
  162. package/dist/skill/search.js.map +1 -0
  163. package/dist/skill/signing.d.ts +16 -0
  164. package/dist/skill/signing.js +34 -0
  165. package/dist/skill/signing.js.map +1 -0
  166. package/dist/skill/ssrf.d.ts +27 -0
  167. package/dist/skill/ssrf.js +210 -0
  168. package/dist/skill/ssrf.js.map +1 -0
  169. package/dist/skill/store.d.ts +7 -0
  170. package/dist/skill/store.js +93 -0
  171. package/dist/skill/store.js.map +1 -0
  172. package/dist/stats/report.d.ts +26 -0
  173. package/dist/stats/report.js +157 -0
  174. package/dist/stats/report.js.map +1 -0
  175. package/dist/types.d.ts +214 -0
  176. package/dist/types.js +3 -0
  177. package/dist/types.js.map +1 -0
  178. package/package.json +58 -0
  179. package/src/auth/crypto.ts +92 -0
  180. package/src/auth/handoff.ts +229 -0
  181. package/src/auth/manager.ts +140 -0
  182. package/src/auth/oauth-refresh.ts +120 -0
  183. package/src/auth/refresh.ts +300 -0
  184. package/src/capture/anti-bot.ts +63 -0
  185. package/src/capture/blocklist.ts +75 -0
  186. package/src/capture/body-diff.ts +109 -0
  187. package/src/capture/body-variables.ts +156 -0
  188. package/src/capture/domain.ts +34 -0
  189. package/src/capture/entropy.ts +121 -0
  190. package/src/capture/filter.ts +56 -0
  191. package/src/capture/graphql.ts +124 -0
  192. package/src/capture/idle.ts +45 -0
  193. package/src/capture/monitor.ts +224 -0
  194. package/src/capture/oauth-detector.ts +106 -0
  195. package/src/capture/pagination.ts +49 -0
  196. package/src/capture/parameterize.ts +68 -0
  197. package/src/capture/scrubber.ts +49 -0
  198. package/src/capture/session.ts +502 -0
  199. package/src/capture/token-detector.ts +76 -0
  200. package/src/capture/verifier.ts +171 -0
  201. package/src/cli.ts +1031 -0
  202. package/src/discovery/auth.ts +99 -0
  203. package/src/discovery/fetch.ts +85 -0
  204. package/src/discovery/frameworks.ts +231 -0
  205. package/src/discovery/index.ts +256 -0
  206. package/src/discovery/openapi.ts +230 -0
  207. package/src/discovery/probes.ts +76 -0
  208. package/src/index.ts +26 -0
  209. package/src/inspect/report.ts +247 -0
  210. package/src/mcp.ts +618 -0
  211. package/src/orchestration/browse.ts +250 -0
  212. package/src/orchestration/cache.ts +37 -0
  213. package/src/plugin.ts +188 -0
  214. package/src/read/decoders/deepwiki.ts +180 -0
  215. package/src/read/decoders/grokipedia.ts +246 -0
  216. package/src/read/decoders/hackernews.ts +198 -0
  217. package/src/read/decoders/index.ts +15 -0
  218. package/src/read/decoders/reddit.ts +158 -0
  219. package/src/read/decoders/twitter.ts +211 -0
  220. package/src/read/decoders/wikipedia.ts +75 -0
  221. package/src/read/decoders/youtube.ts +75 -0
  222. package/src/read/extract.ts +396 -0
  223. package/src/read/index.ts +78 -0
  224. package/src/read/peek.ts +175 -0
  225. package/src/read/types.ts +37 -0
  226. package/src/replay/engine.ts +559 -0
  227. package/src/replay/truncate.ts +116 -0
  228. package/src/serve.ts +189 -0
  229. package/src/skill/generator.ts +473 -0
  230. package/src/skill/importer.ts +107 -0
  231. package/src/skill/search.ts +76 -0
  232. package/src/skill/signing.ts +36 -0
  233. package/src/skill/ssrf.ts +238 -0
  234. package/src/skill/store.ts +107 -0
  235. package/src/stats/report.ts +208 -0
  236. package/src/types.ts +233 -0
package/LICENSE ADDED
@@ -0,0 +1,60 @@
Business Source License 1.1

Parameters
Licensor: ApiTap Contributors
Licensed Work: ApiTap
Change Date: February 7, 2029
Change License: Apache License 2.0

Notice

Business Source License 1.1

This Business Source License (this "License") is not an Open Source license.
However, the Licensed Work will eventually be made available under an Open Source License, as stated in this License.

License text Copyright (c) 2023 Hashicorp, Inc. "Business Source License" is a trademark of Hashicorp, Inc.

---

Terms

The Licensor hereby grants you the right to copy, modify, create derivative works, and distribute the Licensed Work. However, if you receive the Licensed Work from Licensor and you do not have a license agreement with Licensor for the Licensed Work, then your rights under this License will end on the earlier of: (i) the date such proprietary rights notice is first received by you, or (ii) the Change Date.

Violation of Licensor's intellectual property rights (including patent, trademark, and/or trade secret) is prohibited.

You are granted a personal, non-exclusive, non-transferable license to use the Licensed Work in a non-competing manner.

Restrictions:
1. **No Competing Commercial Use:** You may not offer the Licensed Work as a commercial service that competes with a hosting or software-as-a-service offering by Licensor or any of its affiliates.

Non-competing uses include:
- Self-hosted deployment for your own use
- Internal company deployment
- Open source forks and contributions
- Academic or non-profit use
- Educational use
- Research use

Competing uses include (prohibited until Change Date):
- Offering hosted/cloud ApiTap as a service
- Rebranding ApiTap and selling it
- Providing ApiTap services to third parties for commercial gain

2. **Patent Rights:** Licensor grants you a license to any patent rights controlled by Licensor that are necessarily infringed by the Licensed Work as provided in source code form.

3. **Open Source Exceptions:** The restrictions in Section 1 do not apply to any fork or modification that is properly made available under an Open Source License as defined by the Open Source Initiative (www.opensource.org), provided that you make the source code of any such fork publicly available.

4. **No Other Rights:** Except as expressly stated herein, Licensor retains all right, title, and interest in the Licensed Work.

---

Change Date

On the Change Date (February 7, 2029), or if earlier, upon the occurrence of an event specified by Licensor, the Licensed Work automatically converts to the Change License, which is the Apache License 2.0. At that time, the restrictions in Section 1 will no longer apply, and the Licensed Work will be available under Apache License 2.0.

---

Disclaimer of Warranties

The Licensed Work is provided "AS-IS" without warranty of any kind, express, implied, or statutory, including but not limited to warranties of merchantability, fitness for a particular purpose, and non-infringement. In no event shall Licensor be liable for any indirect, incidental, special, exemplary, or consequential damages.
package/README.md ADDED
@@ -0,0 +1,362 @@
# ApiTap

[![npm version](https://badge.fury.io/js/apitap.svg)](https://www.npmjs.com/package/apitap)
[![tests](https://img.shields.io/badge/tests-721%20passing-brightgreen)](https://github.com/n1byn1kt/apitap)
[![license](https://img.shields.io/badge/license-BSL--1.1-blue)](./LICENSE)

**The MCP server that turns any website into an API — no docs, no SDK, no browser.**

ApiTap is an MCP server that lets AI agents browse the web through APIs instead of browsers. When an agent needs data from a website, ApiTap automatically detects the site's framework (WordPress, Next.js, Shopify, etc.), discovers its internal API endpoints, and calls them directly — returning clean JSON instead of forcing the agent to render and parse HTML. For sites that need authentication, it opens a browser window for a human to log in, captures the session tokens, and hands control back to the agent. Every site visited generates a reusable "skill file" that maps the site's APIs, so the first visit is a discovery step and every subsequent visit is a direct, instant API call. It works with any MCP-compatible LLM client and reduces token costs by 20-100x compared to browser automation.

The web was built for human eyes; ApiTap makes it native to machines.

```bash
# One tool call: discover the API + replay it
apitap browse https://techcrunch.com
✓ Discovery: WordPress detected (medium confidence)
✓ Replay: GET /wp-json/wp/v2/posts → 200 (10 articles)

# Or read content directly — no browser needed
apitap read https://en.wikipedia.org/wiki/Node.js
✓ Wikipedia decoder: ~127 tokens (vs ~4,900 raw HTML)

# Or step by step:
apitap capture https://polymarket.com # Watch API traffic
apitap show gamma-api.polymarket.com # See what was captured
apitap replay gamma-api.polymarket.com get-events # Call the API directly
```

No scraping. No browser. Just the API.

---

## How It Works

1. **Capture** — Launch a Playwright browser, visit a site, browse normally. ApiTap intercepts all network traffic via CDP.
2. **Filter** — Scoring engine separates signal from noise. Analytics, tracking pixels, and framework internals are filtered out. Only real API endpoints survive.
3. **Generate** — Captured endpoints are grouped by domain, URLs are parameterized (`/users/123` → `/users/:id`), and a JSON skill file is written to `~/.apitap/skills/`.
4. **Replay** — Read the skill file, substitute parameters, call the API with `fetch()`. Zero dependencies in the replay path.

```
Capture: Browser → Playwright listener → Filter → Skill Generator → skill.json
Replay:  Agent → Replay Engine (skill.json) → fetch() → API → JSON response
```
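
To make the **Generate** step concrete, here is a minimal sketch of ID-style parameterization. It is illustrative only — the real heuristics live in `src/capture/parameterize.ts` and `src/capture/entropy.ts` and are more involved; the regexes and placeholder names below are assumptions.

```ts
// Illustrative only — not ApiTap's actual generator code.
// Collapses ID-like path segments into named parameters.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function parameterizePath(path: string): string {
  return path
    .split("/")
    .map((seg) => {
      if (/^\d+$/.test(seg)) return ":id";               // purely numeric → :id
      if (UUID_RE.test(seg)) return ":uuid";              // UUID → :uuid
      if (/^[0-9a-f]{16,}$/i.test(seg)) return ":hash";   // long hex token → :hash
      return seg;                                         // keep static segments
    })
    .join("/");
}

console.log(parameterizePath("/users/123"));           // "/users/:id"
console.log(parameterizePath("/wp-json/wp/v2/posts")); // unchanged
```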

## Install

```bash
npm install -g apitap
```

Requires Node.js 20+. Playwright browsers are installed automatically on first capture.

## Quick Start

### Capture API traffic

```bash
# Capture from a single domain (default)
apitap capture https://polymarket.com

# Capture all domains (CDN, API subdomains, etc.)
apitap capture https://polymarket.com --all-domains

# Include response previews in the skill file
apitap capture https://polymarket.com --preview

# Stop after 30 seconds
apitap capture https://polymarket.com --duration 30
```

ApiTap opens a browser window. Browse the site normally — click around, scroll, search. Every API call is captured. Press Ctrl+C when done.

### List and explore captured APIs

```bash
# List all skill files
apitap list
✓ gamma-api.polymarket.com 3 endpoints 2m ago
✓ www.reddit.com 2 endpoints 1h ago

# Show endpoints for a domain
apitap show gamma-api.polymarket.com
[green] ✓ GET /events object (3 fields)
[green] ✓ GET /teams array (12 fields)

# Search across all skill files
apitap search polymarket
```

### Replay an endpoint

```bash
# Replay with captured defaults
apitap replay gamma-api.polymarket.com get-events

# Override parameters
apitap replay gamma-api.polymarket.com get-events limit=5 offset=10

# Machine-readable JSON output
apitap replay gamma-api.polymarket.com get-events --json
```
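
Under the hood, a replay is close to a plain `fetch()` against the stored skill file. The sketch below assumes the skill-file shape shown in the Skill Files section; it omits the auth injection, SSRF checks, and truncation that ApiTap's replay engine performs.

```ts
import { readFile } from "node:fs/promises";
import { homedir } from "node:os";
import { join } from "node:path";

// Minimal sketch of what a replay amounts to — not the replay engine itself.
async function replay(domain: string, endpointId: string, params: Record<string, string> = {}) {
  const raw = await readFile(join(homedir(), ".apitap", "skills", `${domain}.json`), "utf8");
  const skill = JSON.parse(raw);
  const endpoint = skill.endpoints.find((e: { id: string }) => e.id === endpointId);
  if (!endpoint) throw new Error(`no endpoint ${endpointId} for ${domain}`);

  // Build the URL from the captured base URL, path, and caller-supplied params.
  const url = new URL(endpoint.path, skill.baseUrl);
  for (const [key, value] of Object.entries(params)) url.searchParams.set(key, value);

  const res = await fetch(url, { method: endpoint.method, headers: endpoint.headers });
  return { status: res.status, data: await res.json() };
}

// e.g. replay("gamma-api.polymarket.com", "get-events", { limit: "5" })
```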

## Text-Mode Browsing

ApiTap includes a text-mode browsing pipeline — `peek` and `read` — that lets agents consume web content without launching a browser. Seven built-in decoders extract structured content from popular sites at a fraction of the token cost:

| Site | Decoder | Typical Tokens | vs Raw HTML |
|------|---------|----------------|-------------|
| Reddit | `reddit` | ~500 | 95% smaller |
| YouTube | `youtube` | ~36 | 99% smaller |
| Wikipedia | `wikipedia` | ~127 | 97% smaller |
| Hacker News | `hackernews` | ~200 | 90% smaller |
| Grokipedia | `grokipedia` | ~150 | 90% smaller |
| Twitter/X | `twitter` | ~80 | 95% smaller |
| Any other site | `generic` | varies | ~74% avg |

**Average token savings: 74% across 83 tested domains.**

```bash
# Triage first — zero-cost HEAD request
apitap peek https://reddit.com/r/programming
✓ accessible, recommendation: read

# Extract content — no browser needed
apitap read https://reddit.com/r/programming
✓ Reddit decoder: 12 posts, ~500 tokens

# Works for any URL — falls back to generic HTML extraction
apitap read https://example.com/blog/post
```

For MCP agents, `apitap_peek` and `apitap_read` are the fastest way to consume web content — use them before reaching for `apitap_browse` or `apitap_capture`.
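
For sites without a dedicated decoder, the generic path leans on Open Graph metadata plus HTML extraction. A bare-bones sketch of the og: part (not the shipped extractor in `src/read/extract.ts`, which does far more):

```ts
// Sketch of a minimal Open Graph fallback — illustrative, not the shipped extractor.
// Only handles <meta property="og:..." content="..."> in that attribute order.
async function readOpenGraph(url: string) {
  const html = await (await fetch(url, { headers: { accept: "text/html" } })).text();
  const meta = (property: string) =>
    html.match(
      new RegExp(`<meta[^>]+property=["']og:${property}["'][^>]+content=["']([^"']*)["']`, "i")
    )?.[1];
  return { title: meta("title"), description: meta("description"), image: meta("image") };
}
```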

## Tested Sites

ApiTap has been tested against real-world sites:

| Site | Endpoints | Tier | Replay |
|------|-----------|------|--------|
| Polymarket | 3 | Green | 200 |
| Reddit | 2 | Green | 200 |
| Discord | 4 | Green | 200 |
| GitHub | 1 | Green | 200 |
| HN (Algolia) | 1 | Yellow | 200 |
| dev.to | 2 | Green | 200 |
| CoinGecko | 6 | Green | 200 |

78% overall replay success rate across 9 tested sites (green tier: 100%).

## Why ApiTap?

**Why not just use the public API?** Most sites don't have one, or it's heavily rate-limited. The internal API that powers the SPA is often richer, faster, and already handles auth.

**Why not just use Playwright/Puppeteer?** Browser automation costs 50-200K tokens per page for an AI agent. ApiTap captures the API once, then your agent calls it directly at 1-5K tokens. No DOM, no selectors, no flaky waits.

**Why not reverse-engineer the API manually?** You could open DevTools and copy headers by hand. ApiTap does it in 30 seconds and gives you a portable file any agent can use.

**Isn't this just a MITM proxy?** No. ApiTap is read-only — it uses Chrome DevTools Protocol to observe responses. No certificate setup, no request modification, no code injection.

## Replayability Tiers

Every captured endpoint is classified by replay difficulty:

| Tier | Meaning | Replay |
|------|---------|--------|
| **Green** | Public, permissive CORS, no signing | Works with `fetch()` |
| **Yellow** | Needs auth, no signing/anti-bot | Works with stored credentials |
| **Orange** | CSRF tokens, session binding | Fragile — may need browser refresh |
| **Red** | Request signing, anti-bot (Cloudflare) | Needs full browser |

GET endpoints are auto-verified during capture by comparing Playwright responses with raw `fetch()` responses.
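
Conceptually, that verification boils down to re-issuing the captured GET outside the browser and comparing the result. A rough sketch — the tier values returned here are illustrative, and the real verifier in `src/capture/verifier.ts` applies more detailed rules:

```ts
// Illustrative verification sketch: does a raw fetch reproduce what the browser saw?
async function verifyGet(url: string, capturedStatus: number, capturedBody: unknown) {
  const res = await fetch(url, { headers: { accept: "application/json" } });
  if (res.status !== capturedStatus) {
    return { tier: "yellow", reason: "status differs without browser context" };
  }
  const body = await res.json().catch(() => null);
  const sameTopLevelShape =
    body !== null &&
    typeof body === typeof capturedBody &&
    Array.isArray(body) === Array.isArray(capturedBody);
  return sameTopLevelShape ? { tier: "green" } : { tier: "orange", reason: "response shape changed" };
}
```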

## MCP Server

ApiTap includes an MCP server with 12 tools for Claude Desktop, Cursor, Windsurf, and other MCP-compatible clients.

```bash
# Start the MCP server
apitap-mcp
```

Add to your MCP config (e.g. `claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "apitap": {
      "command": "npx",
      "args": ["apitap-mcp"]
    }
  }
}
```

### MCP Tools

| Tool | Description |
|------|-------------|
| `apitap_browse` | High-level "just get me the data" (discover + replay in one call) |
| `apitap_peek` | Zero-cost URL triage (HEAD only) |
| `apitap_read` | Extract content without a browser (7 decoders) |
| `apitap_discover` | Detect a site's APIs without launching a browser |
| `apitap_search` | Search available skill files |
| `apitap_replay` | Replay a captured API endpoint |
| `apitap_replay_batch` | Replay multiple endpoints in parallel across domains |
| `apitap_capture` | Capture API traffic via instrumented browser |
| `apitap_capture_start` | Start an interactive capture session |
| `apitap_capture_interact` | Interact with a live capture session (click, type, scroll) |
| `apitap_capture_finish` | Finish or abort a capture session |
| `apitap_auth_request` | Request human authentication for a site |

You can also serve a single skill file as a dedicated MCP server with `apitap serve <domain>` — each endpoint becomes its own tool.

## Auth Management

ApiTap automatically detects and stores auth credentials (Bearer tokens, API keys, cookies) during capture. Credentials are encrypted at rest with AES-256-GCM.

```bash
# View auth status
apitap auth api.example.com

# List all domains with stored auth
apitap auth --list

# Refresh expired tokens via browser
apitap refresh api.example.com

# Force fresh token before replay
apitap replay api.example.com get-data --fresh

# Clear stored auth
apitap auth api.example.com --clear
```
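
For reference, the AES-256-GCM + PBKDF2 scheme described above looks roughly like this with Node's built-in `node:crypto`. Parameter choices (salt/IV sizes, iteration count) and how the machine-bound secret is obtained are assumptions, not ApiTap's actual values:

```ts
import { randomBytes, pbkdf2Sync, createCipheriv, createDecipheriv } from "node:crypto";

// Sketch of at-rest credential encryption; see src/auth/crypto.ts for the real code.
function encryptCredential(plaintext: string, machineSecret: string) {
  const salt = randomBytes(16);
  const key = pbkdf2Sync(machineSecret, salt, 600_000, 32, "sha256"); // derive a 256-bit key
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { salt, iv, ciphertext, tag: cipher.getAuthTag() };
}

function decryptCredential(box: ReturnType<typeof encryptCredential>, machineSecret: string) {
  const key = pbkdf2Sync(machineSecret, box.salt, 600_000, 32, "sha256");
  const decipher = createDecipheriv("aes-256-gcm", key, box.iv);
  decipher.setAuthTag(box.tag); // GCM tag check makes tampering detectable
  return Buffer.concat([decipher.update(box.ciphertext), decipher.final()]).toString("utf8");
}
```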

## Skill Files

Skill files are JSON documents stored at `~/.apitap/skills/<domain>.json`. They contain everything needed to replay an API — endpoints, headers, query params, request bodies, pagination patterns, and response shapes.

```json
{
  "version": "1.1",
  "domain": "gamma-api.polymarket.com",
  "baseUrl": "https://gamma-api.polymarket.com",
  "endpoints": [
    {
      "id": "get-events",
      "method": "GET",
      "path": "/events",
      "queryParams": { "limit": { "type": "string", "example": "10" } },
      "headers": {},
      "responseShape": { "type": "object", "fields": ["id", "title", "slug"] }
    }
  ]
}
```
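
In TypeScript terms, the example above corresponds roughly to the following interface. Treat it as a sketch inferred from the example — the authoritative definitions are in `src/types.ts`:

```ts
// Inferred from the JSON example above; fields not shown there are not implied.
interface SkillFile {
  version: string;
  domain: string;
  baseUrl: string;
  endpoints: Array<{
    id: string;
    method: "GET" | "POST" | "PUT" | "PATCH" | "DELETE";
    path: string;
    queryParams?: Record<string, { type: string; example?: string }>;
    headers?: Record<string, string>;
    responseShape?: { type: string; fields?: string[] };
  }>;
}
```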

Skill files are portable and shareable. Auth credentials are stored separately in encrypted storage — never in the skill file itself.

### Import / Export

```bash
# Import a skill file from someone else
apitap import ./reddit-skills.json

# Import validates: signature check → SSRF scan → confirmation
```

Imported files are re-signed with your local key and marked with `imported` provenance.

## Security

ApiTap handles untrusted skill files from the internet and replays HTTP requests on your behalf. That's a high-trust position, and we treat it seriously.

### Defense in Depth

- **Auth encryption** — AES-256-GCM with PBKDF2 key derivation, keyed to your machine
- **PII scrubbing** — Emails, phones, IPs, credit cards, SSNs detected and redacted during capture
- **SSRF protection** — Multi-layer URL validation blocks access to internal networks (see below)
- **Header injection protection** — Allowlist prevents skill files from injecting dangerous HTTP headers (`Host`, `X-Forwarded-For`, `Cookie`, `Authorization`)
- **Redirect validation** — Manual redirect handling with SSRF re-check prevents redirect-to-internal-IP attacks
- **DNS rebinding prevention** — Resolved IPs are pinned to prevent TOCTOU attacks where DNS returns different IPs on second lookup
- **Skill signing** — HMAC-SHA256 signatures detect tampering; three-state provenance tracking (self/imported/unsigned) — see the sketch after this list
- **No phone-home** — Everything runs locally. No external services, no telemetry
- **Read-only capture** — Playwright intercepts responses only. No request modification or code injection
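
As a concrete illustration of the skill-signing bullet, HMAC-SHA256 signing and verification with `node:crypto` looks roughly like this. The canonicalization of the skill JSON and where the local key lives are assumptions — see `src/skill/signing.ts` for the real implementation:

```ts
import { createHmac, timingSafeEqual } from "node:crypto";

// Illustrative signing sketch, not ApiTap's exact scheme.
function signSkill(skillJson: string, localKey: Buffer): string {
  return createHmac("sha256", localKey).update(skillJson).digest("hex");
}

function verifySkill(skillJson: string, signature: string, localKey: Buffer): boolean {
  const expected = Buffer.from(signSkill(skillJson, localKey), "hex");
  const given = Buffer.from(signature, "hex");
  // Constant-time compare; length mismatch already means the signature is invalid.
  return expected.length === given.length && timingSafeEqual(expected, given);
}
```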

### Why SSRF Protection Matters

Since skill files can come from anywhere — shared by colleagues, downloaded from GitHub, or imported from untrusted sources — a malicious skill file is the primary threat vector. Here's what ApiTap defends against:

**The attack:** An attacker crafts a skill file with `baseUrl: "http://169.254.169.254"` (the AWS/cloud metadata endpoint) or `baseUrl: "http://localhost:8080"` (your internal services). When you replay an endpoint, your machine makes the request, potentially leaking cloud credentials or hitting internal APIs.

**The defense:** ApiTap validates every URL at multiple points:

```
Skill file imported
  → validateUrl(): block private IPs, internal hostnames, non-HTTP schemes
  → validateSkillFileUrls(): scan baseUrl + all endpoint example URLs

Endpoint replayed
  → resolveAndValidateUrl(): DNS lookup + verify resolved IP isn't private
  → IP pinning: fetch uses resolved IP directly (prevents DNS rebinding)
  → Header filtering: strip dangerous headers from skill file
  → Redirect check: if server redirects, validate new target before following
```

**Blocked ranges:** `127.0.0.0/8`, `10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`, `169.254.0.0/16` (cloud metadata), `0.0.0.0`, IPv6 equivalents (`::1`, `fe80::/10`, `fc00::/7`, `::ffff:` mapped addresses), `localhost`, `.local`, `.internal`, `file://`, `javascript:` schemes.
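
A simplified sketch of the address check behind these rules — deliberately partial (it covers the IPv4 ranges above and a few IPv6 cases), whereas the shipped validator in `src/skill/ssrf.ts` handles many more edge cases:

```ts
import { isIP } from "node:net";

// Illustrative private/blocked-address check, run against resolved IPs.
function isBlockedAddress(ip: string): boolean {
  if (isIP(ip) === 4) {
    const [a, b] = ip.split(".").map(Number);
    return (
      a === 0 ||                           // 0.0.0.0/8
      a === 10 ||                          // 10.0.0.0/8
      a === 127 ||                         // loopback
      (a === 169 && b === 254) ||          // link-local / cloud metadata
      (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
      (a === 192 && b === 168)             // 192.168.0.0/16
    );
  }
  const v6 = ip.toLowerCase();
  if (v6 === "::1" || v6 === "::") return true;                       // loopback / unspecified
  if (v6.startsWith("::ffff:")) return isBlockedAddress(v6.slice(7)); // IPv4-mapped (dotted form)
  return v6.startsWith("fe80:") || v6.startsWith("fc") || v6.startsWith("fd"); // fe80::/10, fc00::/7
}
```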

This is especially relevant now that [MCP servers are being used as attack vectors in the wild](https://cloud.google.com/blog/topics/threat-intelligence/distillation-experimentation-integration-ai-adversarial-use) — Google's Threat Intelligence Group recently documented underground toolkits built on compromised MCP servers. ApiTap is designed to be safe even when processing untrusted inputs.

See [docs/security-audit-v1.md](./docs/security-audit-v1.md) for the full security audit (19 findings, current posture 9/10).

## CLI Reference

All commands support `--json` for machine-readable output.

| Command | Description |
|---------|-------------|
| `apitap browse <url>` | Discover + replay in one step |
| `apitap peek <url>` | Zero-cost URL triage (HEAD only) |
| `apitap read <url>` | Extract content without a browser |
| `apitap discover <url>` | Detect APIs without launching a browser |
| `apitap capture <url>` | Capture API traffic from a website |
| `apitap list` | List available skill files |
| `apitap show <domain>` | Show endpoints for a domain |
| `apitap search <query>` | Search skill files by domain or endpoint |
| `apitap replay <domain> <id> [key=val...]` | Replay an API endpoint |
| `apitap import <file>` | Import a skill file with safety validation |
| `apitap refresh <domain>` | Refresh auth tokens via browser |
| `apitap auth [domain]` | View or manage stored auth |
| `apitap serve <domain>` | Serve a skill file as an MCP server |
| `apitap inspect <url>` | Discover APIs without saving |
| `apitap stats` | Show token savings report |
| `apitap --version` | Print version |

### Capture flags

| Flag | Description |
|------|-------------|
| `--all-domains` | Capture traffic from all domains (default: target domain only) |
| `--preview` | Include response data previews |
| `--duration <sec>` | Stop capture after N seconds |
| `--port <port>` | Connect to specific CDP port |
| `--launch` | Always launch a new browser |
| `--attach` | Only attach to existing browser |
| `--no-scrub` | Disable PII scrubbing |
| `--no-verify` | Skip auto-verification of GET endpoints |

## Development

```bash
git clone https://github.com/n1byn1kt/apitap.git
cd apitap
npm install
npm test # 721 tests, Node built-in test runner
npm run typecheck # Type checking
npm run build # Compile to dist/
npx tsx src/cli.ts capture <url> # Run from source
```

## License

[Business Source License 1.1](./LICENSE) — **free for all non-competing use** (personal, internal, educational, research, open source). Cannot be rebranded and sold as a competing service. Converts to Apache 2.0 on February 7, 2029.
package/SKILL.md ADDED
@@ -0,0 +1,270 @@
# ApiTap — The MCP Server That Turns Any Website Into an API

> No docs, no SDK, no browser. Just data.

## What It Does

ApiTap gives AI agents cheap access to web data through three layers:

1. **Read** — Decode any URL into structured text without a browser (side-channel APIs, og: tags, HTML extraction). 0-10K tokens vs 50-200K for browser automation.
2. **Replay** — Call captured API endpoints directly. 1-5K tokens per call.
3. **Capture** — Record API traffic from a headless browser session, generating reusable skill files.

## MCP Tools (12)

### Tier 0: Triage (free)

#### `apitap_peek`
Zero-cost URL triage. HTTP HEAD only — checks accessibility, bot protection, framework detection.
```
apitap_peek(url: string) → PeekResult
```
**Use when:** You want to know if a site is accessible before spending tokens. Check bot protection, detect frameworks.

**Returns:** `{ status, accessible, server, framework, botProtection, signals[], recommendation }`

`recommendation` is one of: `read` | `capture` | `auth_required` | `blocked`

**Example:**
```
apitap_peek("https://www.zillow.com") → { status: 200, recommendation: "read" }
apitap_peek("https://www.doordash.com") → { status: 403, botProtection: "cloudflare", recommendation: "blocked" }
```
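
For comparison, a peek-style triage can be approximated outside ApiTap with a single HEAD request. The header names and decision rules below are illustrative assumptions, not the exact signals `apitap_peek` inspects:

```ts
// Rough triage sketch: one HEAD request, then a recommendation from status + headers.
async function peek(url: string) {
  const res = await fetch(url, { method: "HEAD", redirect: "follow" });
  const server = res.headers.get("server") ?? undefined;
  const botProtection = res.headers.get("cf-mitigated") ? "cloudflare" : undefined;
  const recommendation =
    res.status === 401 || res.status === 403
      ? (botProtection ? "blocked" : "auth_required")
      : res.ok
        ? "read"
        : "capture";
  return { status: res.status, accessible: res.ok, server, botProtection, recommendation };
}
```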

### Tier 1: Read (0-10K tokens, no browser)

#### `apitap_read`
Extract content from any URL without a browser. Uses side-channel APIs for known sites and HTML extraction for everything else.
```
apitap_read(url: string, maxBytes?: number) → ReadResult
```
**Use when:** You need page content, article text, post data, or listing info. Always try this before capture.

**Returns:** `{ title, author, description, content (markdown), links[], images[], metadata: { source, type, publishedAt }, cost: { tokens } }`

**Site-specific decoders (free, structured):**
| Site | Side Channel | What You Get |
|------|-------------|-------------|
| Reddit | `.json` suffix | Posts, scores, comments, authors — full structured data |
| YouTube | oembed API | Title, author, channel, thumbnail |
| Wikipedia | REST API | Article summary, structured, with edit dates |
| Hacker News | Firebase API | Stories, scores, comments, real-time |
| Grokipedia | xAI public API | Full articles with citations, search, 6M+ articles |
| Twitter/X | fxtwitter API | Full tweets, articles, engagement, quotes, media |
| Everything else | og: tags + HTML extraction | Title, content as markdown, links, images |

**Examples:**
```
# Reddit — full subreddit listing, ~500 tokens
apitap_read("https://www.reddit.com/r/technology")

# Reddit post with comments
apitap_read("https://www.reddit.com/r/wallstreetbets/comments/abc123/some-post")

# YouTube — 36 tokens
apitap_read("https://www.youtube.com/watch?v=dQw4w9WgXcQ")

# Wikipedia — 116 tokens
apitap_read("https://en.wikipedia.org/wiki/Artificial_intelligence")

# Grokipedia — full article with citations, 6M+ articles
apitap_read("https://grokipedia.com/wiki/SpaceX")

# Grokipedia — search across 6M articles
apitap_read("https://grokipedia.com/search?q=artificial+intelligence")

# Grokipedia — site stats and recent activity
apitap_read("https://grokipedia.com/")

# Twitter/X — full tweet with engagement, articles, quotes
apitap_read("https://x.com/elonmusk/status/123456789")

# Twitter/X article (long-form post) — full text extracted
apitap_read("https://twitter.com/writer/status/987654321")

# Any article/blog/news — generic extraction
apitap_read("https://example.com/blog/some-article")

# Zillow listing (bypasses PerimeterX via og: tags)
apitap_read("https://www.zillow.com/homedetails/123-Main-St/12345_zpid/")
```
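
The Reddit decoder's side channel is simply the `.json` suffix on listing URLs. A standalone sketch of that channel (the trimmed fields are illustrative, not the decoder's exact output):

```ts
// Fetch a subreddit listing via Reddit's public .json side channel and trim it.
async function readSubreddit(sub: string) {
  const res = await fetch(`https://www.reddit.com/r/${sub}.json?limit=10`, {
    headers: { "user-agent": "apitap-example/0.1" },
  });
  const listing = await res.json();
  return listing.data.children.map((child: any) => ({
    title: child.data.title,
    score: child.data.score,
    comments: child.data.num_comments,
    url: `https://www.reddit.com${child.data.permalink}`,
  }));
}
```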

### Tier 2: Replay (1-5K tokens, needs skill file)

#### `apitap_search`
Find available skill files by domain or keyword.
```
apitap_search(query: string) → { found, results[] }
```
**Use when:** Looking for captured API endpoints. Search by domain name or topic.

#### `apitap_replay`
Call a captured API endpoint directly — no browser needed.
```
apitap_replay(domain: string, endpointId: string, endpointParams?: object, maxBytes?: number) → ReplayResult
```
**Use when:** A skill file exists for this domain. This is the cheapest way to get structured API data.

**Returns:** `{ status, data (JSON), domain, endpointId, tier, fromCache }`

**Example:**
```
# Get live stock quote (Robinhood, no auth needed)
apitap_replay("api.robinhood.com", "get-marketdata-quotes", { symbols: "TSLA,MSFT" })

# Get NBA scores (ESPN)
apitap_replay("site.api.espn.com", "get-apis-personalized-v2-scoreboard-header")

# Get crypto trending (CoinMarketCap)
apitap_replay("api.coinmarketcap.com", "get-data-api-v3-unified-trending-top-boost-listing")
```

#### `apitap_replay_batch`
Replay multiple endpoints in one call.
```
apitap_replay_batch(requests: Array<{ domain, endpointId, endpointParams? }>, maxBytes?: number)
```

### Tier 3: Capture (15-20K tokens, uses browser)

#### `apitap_capture`
Launch a headless browser to capture API traffic from a website.
```
apitap_capture(url: string, duration?: number) → { sessionId }
```
**Use when:** No skill file exists and `apitap_read` doesn't give you the data you need. This is expensive but creates a skill file for future free replays.

#### `apitap_capture_interact`
Send browser commands during an active capture session.
```
apitap_capture_interact(sessionId: string, action: string, ...) → result
```
Actions: `click`, `type`, `navigate`, `snapshot`, `scroll`, `wait`

#### `apitap_capture_finish`
End capture session, generate skill file, verify endpoints.
```
apitap_capture_finish(sessionId: string) → { skillFile, endpoints[] }
```

### Auto-Router

#### `apitap_browse`
Automatic escalation: cache → skill file → discover → **read** → capture_needed.
```
apitap_browse(url: string, query?: string, maxBytes?: number) → result
```
**Use when:** You don't know which tier to use. This tries the cheapest option first and escalates automatically.

### Inspection

#### `apitap_inspect`
Get details about a skill file's endpoints.
```
apitap_inspect(domain: string) → { endpoints[], metadata }
```

#### `apitap_stats`
Usage statistics across all skill files.
```
apitap_stats() → { domains, endpoints, tiers }
```

## Decision Tree

```
Need web data?

├─ Know the domain? → apitap_search → found? → apitap_replay (cheapest)

├─ Unknown URL → apitap_peek first (free)
│ ├─ recommendation: "blocked" → STOP, tell user
│ ├─ recommendation: "read" → apitap_read (no browser)
│ ├─ recommendation: "capture" → apitap_capture (browser)
│ └─ recommendation: "auth_required" → needs human login

├─ Just need article/post content → apitap_read directly

└─ Need structured API data → apitap_capture → creates skill file → future replays free
```

## Key Patterns

### Instagram profile data (login wall bypass)
Instagram blocks all normal scraping (Googlebot UA, oembed, noembed). But the **mobile API** works:
```bash
curl -s 'https://i.instagram.com/api/v1/users/web_profile_info/?params={"user_name":"TARGET_USERNAME"}' \
  -H 'User-Agent: Instagram 275.0.0.27.98 Android (33/13; 420dpi; 1080x2400; samsung; SM-G991B; o1s; exynos2100)' \
  -H 'X-IG-App-ID: 936619743392459'
```
**Returns:** Full profile JSON — bio, follower/following counts, post count, contact info (email, phone), category, highlights, recent posts with captions/engagement.

**When to use:** Need Instagram profile data, follower counts, contact info, or recent post summaries. Works without auth.

**Limitations:** Only public profiles. Rate-limited if abused. Does NOT return full post feeds — just recent edge.

### Morning news scan
```
# Scan multiple subreddits
for sub in ["technology", "wallstreetbets", "privacy"]:
    apitap_read(f"https://www.reddit.com/r/{sub}")
```

### Stock research
```
# Live quote via captured API
apitap_replay("api.robinhood.com", "get-marketdata-quotes", { symbols: "TSLA" })

# Company fundamentals
apitap_replay("api.robinhood.com", "get-fundamentals", { symbol: "TSLA" })
```

### Research any topic (dual knowledge base)
```
# 1. Read Wikipedia summary (established knowledge)
apitap_read("https://en.wikipedia.org/wiki/Topic")

# 2. Read Grokipedia article (AI-curated, with citations)
apitap_read("https://grokipedia.com/wiki/Topic")

# 3. Check Reddit discussion (community sentiment)
apitap_read("https://www.reddit.com/r/relevant_sub")

# 4. Read a linked article
apitap_read("https://news-site.com/article")
```

### Check before committing
```
# Peek first — is it worth reading?
result = apitap_peek("https://some-site.com")
if result.recommendation == "read":
    apitap_read("https://some-site.com")
elif result.recommendation == "blocked":
    # Don't waste tokens
    pass
```

## Token Economics

| Method | Cost per page | Notes |
|--------|-------------|-------|
| Browser automation | 50-200K tokens | Full DOM serialization |
| apitap_read | 0-10K tokens | No browser, side channels |
| apitap_replay | 1-5K tokens | Direct API call, needs skill file |
| apitap_peek | ~0 tokens | HEAD request only |

## CLI Usage

All MCP tools are also available as CLI commands:
```bash
apitap peek <url> [--json]
apitap read <url> [--json] [--max-bytes <n>]
apitap search <query> [--json]
apitap replay <domain> <endpointId> [--params '{}'] [--json]
apitap capture <url> [--duration <sec>] [--json]
apitap inspect <domain> [--json]
apitap stats [--json]
```

Every command supports `--json` for machine-readable output.