browseai-dev 0.2.0
- package/README.md +225 -0
- package/dist/index.d.ts +2 -0
- package/dist/index.js +616 -0
- package/dist/setup.d.ts +1 -0
- package/dist/setup.js +90 -0
- package/package.json +63 -0
package/README.md
ADDED
@@ -0,0 +1,225 @@
# browseai-dev

**Reliable research infrastructure for AI agents.** The research layer your agents are missing.

MCP server with real-time web search, evidence extraction, and structured citations. Drop into Claude Desktop, Cursor, Windsurf, LangChain, CrewAI, or any agent pipeline.

## What it does

Instead of letting your AI hallucinate, `browseai-dev` gives it real-time access to the web with **structured, cited answers**:

```
Your question → Web search → Neural rerank → Fetch pages → Extract claims → Verify → Cited answer (streamed)
```

Every answer includes:
- **Claims** with source URLs, verification status, and consensus level
- **7-factor confidence score** (0-1) — evidence-based, not LLM self-assessed, auto-calibrated from feedback
- **Source quotes** verified against actual page text via hybrid BM25 + NLI matching
- **Atomic claim decomposition** — compound facts split and verified independently
- **Execution trace** with timing
- **3 depth modes** — `"fast"` (default), `"thorough"` (auto-retry with rephrased queries), and `"deep"` (premium multi-step agentic research: iterative think-search-extract-evaluate cycles with gap analysis, up to 4 steps, targeting 0.85 confidence; requires a BAI key and sign-in, costs 3x quota, and falls back to thorough when quota is exhausted)
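
The quote verification above pairs BM25 keyword scoring with NLI entailment. As a rough illustration of the keyword half only (a sketch, not the package's actual scorer), a token-overlap check could look like:

```javascript
// Minimal sketch of keyword-based quote verification (illustrative only;
// the real package combines BM25 scoring with NLI entailment).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Fraction of quote tokens that appear in the page text.
function quoteOverlap(quote, pageText) {
  const pageTokens = new Set(tokenize(pageText));
  const quoteTokens = tokenize(quote);
  if (quoteTokens.length === 0) return 0;
  const hits = quoteTokens.filter((t) => pageTokens.has(t)).length;
  return hits / quoteTokens.length;
}

// A quote "verifies" if most of its tokens occur in the source page.
function isQuoteVerified(quote, pageText, threshold = 0.8) {
  return quoteOverlap(quote, pageText) >= threshold;
}
```

`quoteOverlap` only checks token presence; the real scorer additionally weights terms (BM25) and checks meaning (NLI).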

### Premium Features (with `BROWSE_API_KEY`)

Users with a BrowseAI Dev API key (`bai_xxx`) get enhanced verification:
- **Neural cross-encoder re-ranking** — search results re-scored by semantic query-document relevance
- **NLI semantic reranking** — evidence matched by meaning, not just keywords
- **Multi-provider search** — parallel search across multiple sources for broader coverage
- **Multi-pass consistency** — claims cross-checked across independent extraction passes
- **Deep reasoning mode** — multi-step agentic research with iterative think-search-extract-evaluate cycles, gap analysis, and cross-step claim merging (up to 4 steps, 3x quota cost)
- **Research Sessions** — persistent memory across queries

Free BAI key users get a generous daily quota (100 premium queries/day, or ~33 deep queries/day at 3x cost each). When the quota is exceeded, queries gracefully fall back to BM25 keyword verification (deep falls back to thorough). Quota resets every 24 hours.

**No account needed** — all tools work with BYOK (your own Tavily + OpenRouter keys) with no signup, no limits, and BM25 keyword verification. Sign in at [browseai.dev](https://browseai.dev) for a free BAI key to unlock premium features.

## Quick Start

```bash
npx browseai-dev setup
```

This auto-configures Claude Desktop. You'll need:
- [Tavily API key](https://tavily.com) (free tier available)
- [OpenRouter API key](https://openrouter.ai)

## Manual Setup

### Claude Desktop

Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "browseai-dev": {
      "command": "npx",
      "args": ["-y", "browseai-dev"],
      "env": {
        "SERP_API_KEY": "tvly-your-key",
        "OPENROUTER_API_KEY": "your-openrouter-key",
        "BROWSE_API_KEY": "bai_xxx"
      }
    }
  }
}
```

> `BROWSE_API_KEY` is optional for search/answer but **required for Research Memory (sessions)**. Get one free at [browseai.dev/dashboard](https://browseai.dev/dashboard).

### Cursor / Windsurf

Add to your MCP settings:

```json
{
  "browseai-dev": {
    "command": "npx",
    "args": ["-y", "browseai-dev"],
    "env": {
      "SERP_API_KEY": "tvly-your-key",
      "OPENROUTER_API_KEY": "your-openrouter-key",
      "BROWSE_API_KEY": "bai_xxx"
    }
  }
}
```

> Add `BROWSE_API_KEY` to enable Research Memory (sessions). Get one free at [browseai.dev/dashboard](https://browseai.dev/dashboard).

### HTTP Transport

Run as an HTTP server for browser-based clients, Smithery, or any HTTP-capable agent:

```bash
# Start with HTTP transport
npx browseai-dev --http

# Or set the port via environment variable
MCP_HTTP_PORT=3100 npx browseai-dev --http
```

The server exposes:
- `POST /mcp` — MCP Streamable HTTP endpoint
- `GET /health` — Health check

### Docker

```bash
docker build -t browseai-dev ./apps/mcp
docker run -p 3100:3100 -e BROWSE_API_KEY=bai_xxx browseai-dev
```

## MCP Tools

| Tool | Description |
|------|-------------|
| `browse_search` | Search the web via multi-provider search |
| `browse_open` | Fetch and parse a page into clean text |
| `browse_extract` | Extract structured knowledge from a page |
| `browse_answer` | Full pipeline: search + extract + cite. `depth`: `"fast"`, `"thorough"`, or `"deep"` |
| `browse_compare` | Compare raw LLM vs evidence-backed answer |
| `browse_session_create` | Create a research session (persistent memory across queries) |
| `browse_session_ask` | Research within a session (recalls prior knowledge, stores new claims) |
| `browse_session_recall` | Query session knowledge without new web searches |
| `browse_session_share` | Share a session publicly (returns share URL) |
| `browse_session_knowledge` | Export all claims from a session |
| `browse_session_fork` | Fork a shared session to continue the research |
| `browse_feedback` | Submit accuracy feedback on a result |

> **Note:** Session tools (`browse_session_*`) require a BrowseAI Dev API key (`bai_xxx`) for identity and ownership. Set `BROWSE_API_KEY` in your env config. BYOK users can use search/answer but cannot use sessions. Get a free API key at [browseai.dev/dashboard](https://browseai.dev/dashboard).
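
For reference, MCP clients invoke these tools over JSON-RPC with a standard `tools/call` request. A schematic call to `browse_answer` (the argument values are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "browse_answer",
    "arguments": { "query": "What causes aurora borealis?", "depth": "thorough" }
  }
}
```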

## Examples

**Quick lookup:**
> *"Use browse_answer to explain what causes aurora borealis"*

**Higher accuracy:**
> *"Use browse_answer with depth thorough to research quantum computing"*

**Deep research (multi-step, requires BAI key):**
> *"Use browse_answer with depth deep to compare CRISPR approaches for sickle cell disease"*
>
> Deep mode runs iterative think-search-extract-evaluate cycles: gap analysis identifies missing information, follow-up queries fill the gaps, and claims/sources are merged across steps with a final re-verification pass. It targets 0.85 confidence across up to 4 steps, and falls back to thorough without a BAI key or when quota is exhausted.
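
The cycle described above can be sketched as a loop. This is a simplified illustration only; `researchStep` and `followUpQuery` are hypothetical stand-ins for whatever the server does internally (including gap analysis and claim merging):

```javascript
// Illustrative deepen-until-confident loop, not the actual implementation.
// `researchStep(query, priorClaims)` stands in for one search + extract +
// verify pass and is assumed to return { claims, confidence, followUpQuery }.
async function deepResearch(query, researchStep, { maxSteps = 4, target = 0.85 } = {}) {
  let claims = [];
  let confidence = 0;
  let nextQuery = query;
  for (let step = 0; step < maxSteps && confidence < target; step++) {
    const result = await researchStep(nextQuery, claims);
    claims = [...claims, ...result.claims]; // merge claims across steps
    confidence = result.confidence;
    nextQuery = result.followUpQuery ?? nextQuery; // follow-up fills identified gaps
  }
  return { claims, confidence };
}
```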

**Contradiction detection:**
> *"Use browse_answer with depth thorough to check if coffee is good for health, and show me any contradictions"*

**Research session:**
> *"Create a session called quantum-research, then ask about quantum entanglement, then ask how entanglement is used in computing"*

**Enterprise search:**
> *"Use browse_answer to search our Elasticsearch at https://es.company.com/kb/_search for our refund policy"*

### Response structure

```json
{
  "answer": "Aurora borealis occurs when charged particles from the Sun...",
  "confidence": 0.92,
  "claims": [
    {
      "claim": "Aurora borealis is caused by solar wind particles...",
      "sources": ["https://en.wikipedia.org/wiki/Aurora"],
      "verified": true,
      "verificationScore": 0.82,
      "consensusLevel": "strong"
    }
  ],
  "sources": [
    {
      "url": "https://en.wikipedia.org/wiki/Aurora",
      "title": "Aurora - Wikipedia",
      "domain": "en.wikipedia.org",
      "quote": "An aurora is a natural light display...",
      "verified": true,
      "authority": 0.70
    }
  ],
  "contradictions": [],
  "reasoningSteps": []
}
```
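
Because the structure is plain JSON, consumers can post-process it directly. A small sketch (field names taken from the sample above) that summarizes verification coverage:

```javascript
// Summarize verification coverage of a browse_answer-style response.
function verificationSummary(response) {
  const claims = response.claims ?? [];
  const verified = claims.filter((c) => c.verified).length;
  return {
    totalClaims: claims.length,
    verifiedClaims: verified,
    verifiedRatio: claims.length ? verified / claims.length : 0,
    contradictions: (response.contradictions ?? []).length,
  };
}
```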

## Why browseai-dev?

| Feature | Raw LLM | browseai-dev |
|---------|---------|--------------|
| Sources | None | Real URLs with quotes |
| Citations | Hallucinated | Verified from pages |
| Confidence | Unknown | 7-factor evidence-based score |
| Depth | Single pass | 3 modes: fast, thorough, deep reasoning |
| Freshness | Training data | Real-time web |
| Claims | Mixed in text | Structured + linked |

## Reliability

All API calls include automatic retry with exponential backoff on transient failures (429 rate limits, 5xx server errors). Auth errors fail immediately — no wasted retries.
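
The pattern described here can be sketched as follows. This is an illustration of the policy, not the package's actual implementation; `withRetry` and its options are hypothetical names:

```javascript
// Retry-with-exponential-backoff sketch: retry 429 and 5xx responses,
// return anything else (including 401/403 auth errors) immediately.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry(fn, { retries = 3, baseDelayMs = 250 } = {}) {
  for (let attempt = 0; ; attempt++) {
    const res = await fn();
    const retryable = res.status === 429 || res.status >= 500;
    if (!retryable || attempt >= retries) return res;
    await sleep(baseDelayMs * 2 ** attempt); // 250ms, 500ms, 1s, ...
  }
}
```

Wrapping a call is then `withRetry(() => fetch(url))`; real implementations usually add jitter to the delay to avoid thundering herds.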

## Tech Stack

- **Search**: Multi-provider (parallel search across sources)
- **Parsing**: @mozilla/readability + linkedom
- **AI**: OpenRouter (100+ models)
- **Verification**: Hybrid BM25 + NLI semantic entailment
- **Protocol**: Model Context Protocol (MCP)

## Agent Skills

Pre-built skills that teach coding agents when to use BrowseAI Dev tools:

```bash
npx skills add BrowseAI-HQ/browseAIDev_Skills
```

Skills work with Claude Code, Codex CLI, Gemini CLI, Cursor, and more. [View skills →](https://github.com/BrowseAI-HQ/browseAIDev_Skills)

## Community

- [Discord](https://discord.gg/ubAuT4YQsT)
- [GitHub](https://github.com/BrowseAI-HQ/BrowseAI-Dev)

## License

MIT
package/dist/index.d.ts
ADDED
package/dist/index.js
ADDED

@@ -0,0 +1,616 @@
#!/usr/bin/env node
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { z } from "zod";
import { Readability } from "@mozilla/readability";
import { parseHTML } from "linkedom";
import { createServer } from "node:http";
import { randomUUID } from "node:crypto";

// --- Constants (inlined for standalone npm package) ---
const VERSION = "0.2.0";
const LLM_MODEL = "google/gemini-2.5-flash";
const LLM_ENDPOINT = "https://openrouter.ai/api/v1/chat/completions";
const TAVILY_ENDPOINT = "https://api.tavily.com/search";
const MAX_PAGE_CONTENT_LENGTH = 3000;

// --- API mode (BrowseAI Dev API key) ---
const BROWSE_API_KEY = process.env.BROWSE_API_KEY;
const BROWSE_API_URL = process.env.BROWSE_API_URL || "https://browseai.dev/api";
const API_MODE = !!BROWSE_API_KEY;

// --- CLI handling ---
const args = process.argv.slice(2);
if (args.includes("--help") || args.includes("-h")) {
  console.log(`
browseai-dev v${VERSION}
Open-source deep research MCP server for AI agents

Usage:
  browseai-dev              Start the MCP server (stdio transport)
  browseai-dev --http       Start the MCP server (HTTP transport)
  browseai-dev setup        Auto-configure Claude Desktop
  browseai-dev --help       Show this help
  browseai-dev --version    Show version

Environment Variables:
  BROWSE_API_KEY       BrowseAI Dev API key (get one at https://browseai.dev/dashboard)
  SERP_API_KEY         Tavily API key (get one at https://tavily.com) — BYOK mode
  OPENROUTER_API_KEY   OpenRouter API key (get one at https://openrouter.ai) — BYOK mode
  MCP_HTTP_PORT        Port for HTTP transport (default: 3100)

MCP Tools:
  browse_search             Search the web for information
  browse_open               Fetch and parse a web page
  browse_extract            Extract structured knowledge from a page
  browse_answer             Full pipeline: search + extract + answer
  browse_compare            Compare raw LLM vs evidence-backed answer
  browse_session_create     Create a research session (persistent memory)
  browse_session_ask        Research within a session (recalls prior knowledge)
  browse_session_recall     Query session knowledge without new searches
  browse_session_share      Share a session publicly via URL
  browse_session_knowledge  Export all knowledge from a session

Quick Setup:
  Option A: Use a BrowseAI Dev API key (one key for everything)
    1. Sign in at https://browseai.dev and generate an API key
    2. Run: npx browseai-dev setup
    3. Restart Claude Desktop

  Option B: Bring your own keys (BYOK)
    1. Get API keys: https://tavily.com + https://openrouter.ai
    2. Run: npx browseai-dev setup
    3. Restart Claude Desktop
`);
  process.exit(0);
}
if (args.includes("--version") || args.includes("-v")) {
  console.log(VERSION);
  process.exit(0);
}
if (args[0] === "setup") {
  import("./setup.js").then((m) => m.runSetup());
}
else {
  // --- Start MCP server ---
  startServer();
}
async function apiCall(path, body) {
  const res = await fetch(`${BROWSE_API_URL}${path}`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-API-Key": BROWSE_API_KEY,
    },
    body: JSON.stringify(body),
  });
  const data = await res.json();
  if (!res.ok || !data.success)
    throw new Error(data.error || `API failed: ${res.status}`);
  // Include quota info in result if present
  if (data.quota) {
    return { ...data.result, _quota: data.quota };
  }
  return data.result;
}
// --- Env validation ---
function getEnvKeys() {
  if (API_MODE)
    return { SERP_API_KEY: "", OPENROUTER_API_KEY: "" };
  const SERP_API_KEY = process.env.SERP_API_KEY;
  const OPENROUTER_API_KEY = process.env.OPENROUTER_API_KEY;
  if (!SERP_API_KEY || !OPENROUTER_API_KEY) {
    console.error(`
browseai-dev: Missing required environment variables

${!SERP_API_KEY ? "  SERP_API_KEY - Get one at https://tavily.com" : "  SERP_API_KEY - Set"}
${!OPENROUTER_API_KEY ? "  OPENROUTER_API_KEY - Get one at https://openrouter.ai" : "  OPENROUTER_API_KEY - Set"}

Quick fix: run 'npx browseai-dev setup' to configure automatically.
Or use a BrowseAI Dev API key: BROWSE_API_KEY=bai_xxx npx browseai-dev
`);
    process.exit(1);
  }
  return { SERP_API_KEY, OPENROUTER_API_KEY };
}
// --- In-memory cache ---
const cache = new Map();
function cacheGet(key) {
  const entry = cache.get(key);
  if (!entry || Date.now() > entry.expires) {
    cache.delete(key);
    return null;
  }
  return entry.value;
}
function cacheSet(key, value, ttl = 300) {
  cache.set(key, { value, expires: Date.now() + ttl * 1000 });
}
// --- Tavily search ---
async function tavilySearch(query, limit = 5) {
  const { SERP_API_KEY } = getEnvKeys();
  const cached = cacheGet(`search:${query}:${limit}`);
  if (cached)
    return JSON.parse(cached);
  const res = await fetch(TAVILY_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      api_key: SERP_API_KEY,
      query,
      max_results: limit,
      include_raw_content: false,
      search_depth: "basic",
    }),
  });
  if (!res.ok)
    throw new Error(`Tavily search failed: ${res.status}`);
  const data = await res.json();
  const results = data.results.map((r) => ({
    url: r.url,
    title: r.title,
    snippet: r.content,
    score: r.score,
  }));
  cacheSet(`search:${query}:${limit}`, JSON.stringify(results), 600);
  return results;
}
// --- Readability page fetch ---
async function fetchPage(url) {
  const cached = cacheGet(`page:${url}`);
  if (cached)
    return JSON.parse(cached);
  const res = await fetch(url, {
    headers: {
      "User-Agent": "Mozilla/5.0 (compatible; BrowseAI-Dev/1.0)",
      Accept: "text/html,application/xhtml+xml",
    },
    signal: AbortSignal.timeout(10000),
  });
  if (!res.ok)
    throw new Error(`Failed to fetch ${url}: ${res.status}`);
  const html = await res.text();
  const { document } = parseHTML(html);
  const reader = new Readability(document);
  const article = reader.parse();
  if (!article)
    throw new Error(`Could not parse ${url}`);
  const page = {
    title: article.title,
    content: (article.textContent ?? "").slice(0, MAX_PAGE_CONTENT_LENGTH * 2),
    excerpt: article.excerpt || "",
    siteName: article.siteName,
  };
  cacheSet(`page:${url}`, JSON.stringify(page), 1800);
  return page;
}
// --- LLM knowledge extraction (via OpenRouter) ---
async function extractKnowledge(query, pageContents) {
  const { OPENROUTER_API_KEY } = getEnvKeys();
  const res = await fetch(LLM_ENDPOINT, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: LLM_MODEL,
      messages: [
        {
          role: "system",
          content: "You are a knowledge extraction engine. Given web page content, extract structured claims with source attribution and write a clear answer. Use only extracted evidence. Never invent sources. Preserve citations. Return a JSON object using the tool provided.",
        },
        {
          role: "user",
          content: `Question: ${query}\n\nWeb sources:\n${pageContents}`,
        },
      ],
      tools: [
        {
          type: "function",
          function: {
            name: "return_knowledge",
            description: "Return extracted knowledge with claims, sources, answer, and confidence",
            parameters: {
              type: "object",
              properties: {
                answer: { type: "string" },
                confidence: { type: "number" },
                claims: {
                  type: "array",
                  items: {
                    type: "object",
                    properties: {
                      claim: { type: "string" },
                      sources: {
                        type: "array",
                        items: { type: "string" },
                      },
                    },
                    required: ["claim", "sources"],
                  },
                },
                sources: {
                  type: "array",
                  items: {
                    type: "object",
                    properties: {
                      url: { type: "string" },
                      title: { type: "string" },
                      domain: { type: "string" },
                      quote: { type: "string" },
                    },
                    required: ["url", "title", "domain", "quote"],
                  },
                },
              },
              required: ["answer", "confidence", "claims", "sources"],
              additionalProperties: false,
            },
          },
        },
      ],
      tool_choice: {
        type: "function",
        function: { name: "return_knowledge" },
      },
    }),
  });
  if (!res.ok)
    throw new Error(`LLM failed: ${res.status}`);
  const data = await res.json();
  const toolCall = data.choices?.[0]?.message?.tool_calls?.[0];
  if (!toolCall)
    throw new Error("LLM did not return structured output");
  try {
    return JSON.parse(toolCall.function.arguments);
  }
  catch (e) {
    throw new Error(`Failed to parse LLM tool call arguments: ${e.message}. Raw: ${toolCall.function.arguments}`, { cause: e });
  }
}
// --- Raw LLM call (no sources, for compare) ---
async function rawLLMAnswer(query) {
  const { OPENROUTER_API_KEY } = getEnvKeys();
  const res = await fetch(LLM_ENDPOINT, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: LLM_MODEL,
      messages: [
        {
          role: "system",
          content: "Answer the question clearly and concisely.",
        },
        { role: "user", content: query },
      ],
    }),
  });
  if (!res.ok)
    throw new Error(`LLM failed: ${res.status}`);
  const data = await res.json();
  return data.choices?.[0]?.message?.content || "No response";
}
async function answerPipeline(query) {
  const trace = [];
  const searchStart = Date.now();
  const searchResults = await tavilySearch(query);
  trace.push({
    step: "Search Web",
    duration_ms: Date.now() - searchStart,
    detail: `${searchResults.length} results`,
  });
  const scrapeStart = Date.now();
  const pages = await Promise.allSettled(searchResults.slice(0, 5).map((r) => fetchPage(r.url)));
  // Pair each fetched page with its search result so URLs stay aligned
  // even when some fetches fail.
  const successfulPages = pages
    .map((p, i) => (p.status === "fulfilled" ? { page: p.value, url: searchResults[i]?.url } : null))
    .filter((p) => p !== null);
  trace.push({
    step: "Fetch Pages",
    duration_ms: Date.now() - scrapeStart,
    detail: `${successfulPages.length} pages`,
  });
  const pageContents = successfulPages
    .map(({ page, url }, i) => `[Source ${i + 1}] URL: ${url}\nTitle: ${page.title}\n\n${page.content.slice(0, MAX_PAGE_CONTENT_LENGTH)}`)
    .join("\n\n---\n\n");
  const llmStart = Date.now();
  const knowledge = await extractKnowledge(query, pageContents);
  const llmDuration = Date.now() - llmStart;
  trace.push({
    step: "Extract Claims",
    duration_ms: Math.round(llmDuration * 0.4),
    detail: `${knowledge.claims?.length || 0} claims`,
  });
  trace.push({
    step: "Build Evidence Graph",
    duration_ms: Math.round(llmDuration * 0.1),
    detail: `${knowledge.sources?.length || 0} sources`,
  });
  trace.push({
    step: "Generate Answer",
    duration_ms: Math.round(llmDuration * 0.5),
    detail: "OpenRouter",
  });
  return {
    answer: knowledge.answer,
    claims: knowledge.claims || [],
    sources: knowledge.sources || [],
    confidence: knowledge.confidence || 0.85,
    trace,
  };
}
// --- Tool registration (shared between stdio and http) ---
function registerTools(server) {
  server.tool("browse_search", "Search the web for information on a topic. Returns URLs, titles, snippets, and relevance scores.", { query: z.string(), limit: z.number().optional() }, async ({ query, limit }) => {
    if (API_MODE) {
      const result = await apiCall("/browse/search", { query, limit: limit ?? 5 });
      return { content: [{ type: "text", text: JSON.stringify(result, null, 2) }] };
    }
    const results = await tavilySearch(query, limit ?? 5);
    return {
      content: [{ type: "text", text: JSON.stringify(results, null, 2) }],
    };
  });
  server.tool("browse_open", "Fetch and parse a web page into clean text using Readability. Strips ads, nav, and boilerplate.", { url: z.string() }, async ({ url }) => {
    if (API_MODE) {
      const result = await apiCall("/browse/open", { url });
      return { content: [{ type: "text", text: JSON.stringify(result, null, 2) }] };
    }
    const page = await fetchPage(url);
    return {
      content: [{ type: "text", text: JSON.stringify(page, null, 2) }],
    };
  });
  server.tool("browse_extract", "Extract structured knowledge (claims + sources + confidence) from a single web page using AI.", { url: z.string(), query: z.string().optional() }, async ({ url, query }) => {
    if (API_MODE) {
      const result = await apiCall("/browse/extract", { url, query });
      return { content: [{ type: "text", text: JSON.stringify(result, null, 2) }] };
    }
    const page = await fetchPage(url);
    const domain = new URL(url).hostname;
    const pageContent = `[Source 1] URL: ${url}\nTitle: ${page.title}\n\n${page.content.slice(0, MAX_PAGE_CONTENT_LENGTH)}`;
    const q = query || `Summarize the content from ${domain}`;
    const result = await extractKnowledge(q, pageContent);
    return {
      content: [{ type: "text", text: JSON.stringify(result, null, 2) }],
    };
  });
  server.tool("browse_answer", "Full deep research pipeline: search the web, fetch pages, extract claims, build evidence graph, and generate a structured answer with citations and confidence score. Use depth='thorough' for auto-retry with rephrased queries when confidence is low. Use depth='deep' for multi-step agentic research that identifies knowledge gaps and runs follow-up searches. Enterprise: use searchProvider to search internal data instead of the public web.", {
    query: z.string(),
    depth: z.enum(["fast", "thorough", "deep"]).optional().describe("Research depth: 'fast' (default), 'thorough' (auto-retry if confidence < 60%), or 'deep' (multi-step agentic research with gap analysis)"),
    searchProvider: z.object({
      type: z.enum(["tavily", "brave", "elasticsearch", "confluence", "custom"]).describe("Search backend type"),
      endpoint: z.string().optional().describe("Endpoint URL (required for elasticsearch, confluence, custom)"),
      authHeader: z.string().optional().describe("Auth header value (e.g. 'Bearer xxx')"),
      index: z.string().optional().describe("Elasticsearch index name"),
      spaceKey: z.string().optional().describe("Confluence space key"),
      dataRetention: z.enum(["normal", "none"]).optional().describe("'none' skips all caching/storage (enterprise)"),
    }).optional().describe("Enterprise: configure a custom search backend instead of public web search"),
  }, async ({ query, depth, searchProvider }) => {
    if (API_MODE) {
      const body = { query, depth: depth || "fast" };
      if (searchProvider)
        body.searchProvider = searchProvider;
      const result = await apiCall("/browse/answer", body);
      const content = [
        { type: "text", text: JSON.stringify(result, null, 2) },
      ];
      // Surface quota info so agents can inform users about premium status
      if (result._quota) {
        const q = result._quota;
        const status = q.premiumActive
          ? `Premium active (${q.used}/${q.limit} queries used today)`
          : `Premium quota exceeded (${q.used}/${q.limit}). Results use standard verification. Upgrade or wait 24h for reset.`;
        content.push({ type: "text", text: `\n---\nQuota: ${status}` });
      }
      return { content };
    }
    const result = await answerPipeline(query);
    let text = JSON.stringify(result, null, 2);
    if (depth && depth !== "fast") {
      text += `\n\n> Note: depth="${depth}" requested but BYOK mode uses standard search depth. Use a BrowseAI API key for thorough/deep modes.`;
    }
    return {
      content: [{ type: "text", text }],
    };
  });
  server.tool("browse_compare", "Compare a raw LLM answer (no sources) vs an evidence-backed answer. Shows the difference between hallucination-prone and grounded responses.", { query: z.string() }, async ({ query }) => {
    if (API_MODE) {
      const result = await apiCall("/browse/compare", { query });
      return { content: [{ type: "text", text: JSON.stringify(result, null, 2) }] };
    }
    const [rawAnswer, evidenceResult] = await Promise.all([
      rawLLMAnswer(query),
      answerPipeline(query),
    ]);
    const comparison = {
      query,
      raw_llm: {
        answer: rawAnswer,
        sources: 0,
        claims: 0,
        confidence: null,
      },
      evidence_backed: {
        answer: evidenceResult.answer,
        sources: evidenceResult.sources.length,
        claims: evidenceResult.claims.length,
        confidence: evidenceResult.confidence,
        citations: evidenceResult.sources,
      },
    };
    return {
      content: [
        { type: "text", text: JSON.stringify(comparison, null, 2) },
      ],
    };
  });
  // --- Research Memory tools (API mode only — sessions require Supabase) ---
  server.tool("browse_session_create", "Create a new research session. Sessions persist knowledge across multiple queries — each query builds on prior research.", { name: z.string().describe("Name for the session (e.g. 'wasm-research', 'react-comparison')") }, async ({ name }) => {
    if (!API_MODE) {
      return { content: [{ type: "text", text: "Research Memory requires a BrowseAI Dev API key. Set BROWSE_API_KEY to use sessions." }] };
    }
    const result = await apiCall("/session", { name });
    return { content: [{ type: "text", text: JSON.stringify(result, null, 2) }] };
  });
  server.tool("browse_session_ask", "Research a question within a session. Recalls prior knowledge, runs the research pipeline, and stores new claims. Later queries in the same session benefit from accumulated knowledge.", {
    session_id: z.string().describe("Session ID from browse_session_create"),
    query: z.string(),
    depth: z.enum(["fast", "thorough", "deep"]).optional().describe("'fast' (default), 'thorough', or 'deep' (multi-step agentic)"),
  }, async ({ session_id, query, depth }) => {
    if (!API_MODE) {
      return { content: [{ type: "text", text: "Research Memory requires a BrowseAI Dev API key." }] };
    }
    const result = await apiCall(`/session/${session_id}/ask`, { query, depth: depth || "fast" });
    return { content: [{ type: "text", text: JSON.stringify(result, null, 2) }] };
  });
  server.tool("browse_session_recall", "Query accumulated knowledge from a session without making new web searches. Returns previously verified claims relevant to the query.", {
    session_id: z.string().describe("Session ID"),
    query: z.string().describe("What to recall from session knowledge"),
|
|
471
|
+
limit: z.number().optional().describe("Max entries to return (default 10)"),
|
|
472
|
+
}, async ({ session_id, query, limit }) => {
|
|
473
|
+
if (!API_MODE) {
|
|
474
|
+
return { content: [{ type: "text", text: "Research Memory requires a BrowseAI Dev API key." }] };
|
|
475
|
+
}
|
|
476
|
+
const result = await apiCall(`/session/${session_id}/recall`, { query, limit: limit ?? 10 });
|
|
477
|
+
return { content: [{ type: "text", text: JSON.stringify(result, null, 2) }] };
|
|
478
|
+
});
|
|
479
|
+
server.tool("browse_session_share", "Share a research session publicly. Returns a shareable URL that anyone can view — great for sharing research findings with teammates, in reports, or on social media.", {
|
|
480
|
+
session_id: z.string().describe("Session ID to share"),
|
|
481
|
+
}, async ({ session_id }) => {
|
|
482
|
+
if (!API_MODE) {
|
|
483
|
+
return { content: [{ type: "text", text: "Research Memory requires a BrowseAI Dev API key." }] };
|
|
484
|
+
}
|
|
485
|
+
const result = await apiCall(`/session/${session_id}/share`, {});
|
|
486
|
+
const shareUrl = `https://browseai.dev/session/share/${result.shareId}`;
|
|
487
|
+
return {
|
|
488
|
+
content: [{
|
|
489
|
+
type: "text",
|
|
490
|
+
text: JSON.stringify({ shareId: result.shareId, url: shareUrl, message: "Session shared! Anyone with this link can view the research." }, null, 2),
|
|
491
|
+
}],
|
|
492
|
+
};
|
|
493
|
+
});
|
|
494
|
+
server.tool("browse_session_knowledge", "Export all knowledge from a research session. Returns all verified claims, sources, and confidence scores accumulated across queries.", {
|
|
495
|
+
session_id: z.string().describe("Session ID"),
|
|
496
|
+
limit: z.number().optional().describe("Max entries to return (default 50)"),
|
|
497
|
+
}, async ({ session_id, limit }) => {
|
|
498
|
+
if (!API_MODE) {
|
|
499
|
+
return { content: [{ type: "text", text: "Research Memory requires a BrowseAI Dev API key." }] };
|
|
500
|
+
}
|
|
501
|
+
const res = await fetch(`${BROWSE_API_URL}/session/${session_id}/knowledge?limit=${limit ?? 50}`, {
|
|
502
|
+
headers: { "X-API-Key": BROWSE_API_KEY },
|
|
503
|
+
});
|
|
504
|
+
const data = await res.json();
|
|
505
|
+
if (!data.success)
|
|
506
|
+
throw new Error(data.error || "Failed to export knowledge");
|
|
507
|
+
return { content: [{ type: "text", text: JSON.stringify(data.result, null, 2) }] };
|
|
508
|
+
});
|
|
509
|
+
server.tool("browse_session_fork", "Fork a shared research session to continue building on someone else's research. Creates a copy of all knowledge in your own session.", {
|
|
510
|
+
share_id: z.string().describe("Share ID from a shared session URL"),
|
|
511
|
+
}, async ({ share_id }) => {
|
|
512
|
+
if (!API_MODE) {
|
|
513
|
+
return { content: [{ type: "text", text: "Research Memory requires a BrowseAI Dev API key." }] };
|
|
514
|
+
}
|
|
515
|
+
const result = await apiCall(`/session/share/${share_id}/fork`, {});
|
|
516
|
+
return {
|
|
517
|
+
content: [{
|
|
518
|
+
type: "text",
|
|
519
|
+
text: JSON.stringify({
|
|
520
|
+
sessionId: result.session.id,
|
|
521
|
+
name: result.session.name,
|
|
522
|
+
claimsForked: result.claimsForked,
|
|
523
|
+
message: "Session forked! You can now continue researching with all the prior knowledge.",
|
|
524
|
+
}, null, 2),
|
|
525
|
+
}],
|
|
526
|
+
};
|
|
527
|
+
});
|
|
528
|
+
// --- Feedback Tool ---
|
|
529
|
+
server.tool("browse_feedback", "Submit feedback on a search result to improve future accuracy. Helps the self-learning engine tune verification thresholds.", {
|
|
530
|
+
result_id: z.string().describe("The shareId/resultId from a previous search result"),
|
|
531
|
+
rating: z.enum(["good", "bad", "wrong"]).describe("Rate the result: 'good' (accurate), 'bad' (not helpful), or 'wrong' (factually incorrect)"),
|
|
532
|
+
claim_index: z.number().int().min(0).optional().describe("Optional: index of the specific claim that was wrong"),
|
|
533
|
+
}, async ({ result_id, rating, claim_index }) => {
|
|
534
|
+
if (!API_MODE) {
|
|
535
|
+
return { content: [{ type: "text", text: "Feedback requires a BrowseAI API key (BROWSE_API_KEY). Set it to enable feedback." }] };
|
|
536
|
+
}
|
|
537
|
+
const body = { resultId: result_id, rating };
|
|
538
|
+
if (claim_index !== undefined)
|
|
539
|
+
body.claimIndex = claim_index;
|
|
540
|
+
const result = await apiCall("/browse/feedback", body);
|
|
541
|
+
return {
|
|
542
|
+
content: [{
|
|
543
|
+
type: "text",
|
|
544
|
+
text: JSON.stringify({ recorded: true, message: "Feedback recorded. This helps improve future search accuracy." }, null, 2),
|
|
545
|
+
}],
|
|
546
|
+
};
|
|
547
|
+
});
|
|
548
|
+
}
|
|
549
|
+
// --- MCP Server ---
|
|
550
|
+
function startServer() {
|
|
551
|
+
// Validate env before starting
|
|
552
|
+
getEnvKeys();
|
|
553
|
+
const useHttp = args.includes("--http") || !!process.env.MCP_HTTP_PORT;
|
|
554
|
+
const port = parseInt(process.env.MCP_HTTP_PORT || process.env.PORT || "3100", 10);
|
|
555
|
+
if (useHttp) {
|
|
556
|
+
const transports = new Map();
|
|
557
|
+
const httpServer = createServer(async (req, res) => {
|
|
558
|
+
const url = new URL(req.url || "/", `http://localhost:${port}`);
|
|
559
|
+
// Health check
|
|
560
|
+
if (url.pathname === "/health") {
|
|
561
|
+
res.writeHead(200, { "Content-Type": "application/json" });
|
|
562
|
+
res.end(JSON.stringify({ status: "ok", version: VERSION }));
|
|
563
|
+
return;
|
|
564
|
+
}
|
|
565
|
+
if (url.pathname === "/mcp") {
|
|
566
|
+
const sessionId = req.headers["mcp-session-id"];
|
|
567
|
+
if (sessionId && transports.has(sessionId)) {
|
|
568
|
+
const transport = transports.get(sessionId);
|
|
569
|
+
await transport.handleRequest(req, res);
|
|
570
|
+
return;
|
|
571
|
+
}
|
|
572
|
+
// New session
|
|
573
|
+
const transport = new StreamableHTTPServerTransport({
|
|
574
|
+
sessionIdGenerator: () => randomUUID(),
|
|
575
|
+
});
|
|
576
|
+
transport.onclose = () => {
|
|
577
|
+
if (transport.sessionId) {
|
|
578
|
+
transports.delete(transport.sessionId);
|
|
579
|
+
}
|
|
580
|
+
};
|
|
581
|
+
const server = new McpServer({
|
|
582
|
+
name: "browseai-dev",
|
|
583
|
+
version: VERSION,
|
|
584
|
+
});
|
|
585
|
+
registerTools(server);
|
|
586
|
+
await server.connect(transport);
|
|
587
|
+
if (transport.sessionId) {
|
|
588
|
+
transports.set(transport.sessionId, transport);
|
|
589
|
+
}
|
|
590
|
+
await transport.handleRequest(req, res);
|
|
591
|
+
return;
|
|
592
|
+
}
|
|
593
|
+
res.writeHead(404);
|
|
594
|
+
res.end("Not found");
|
|
595
|
+
});
|
|
596
|
+
httpServer.listen(port, () => {
|
|
597
|
+
console.error(`browseai-dev v${VERSION} MCP server running on http://localhost:${port}/mcp`);
|
|
598
|
+
});
|
|
599
|
+
}
|
|
600
|
+
else {
|
|
601
|
+
const server = new McpServer({
|
|
602
|
+
name: "browseai-dev",
|
|
603
|
+
version: VERSION,
|
|
604
|
+
});
|
|
605
|
+
registerTools(server);
|
|
606
|
+
async function run() {
|
|
607
|
+
const transport = new StdioServerTransport();
|
|
608
|
+
await server.connect(transport);
|
|
609
|
+
console.error(`browseai-dev v${VERSION} MCP server running on stdio`);
|
|
610
|
+
}
|
|
611
|
+
run().catch((err) => {
|
|
612
|
+
console.error("Failed to start browseai-dev:", err);
|
|
613
|
+
process.exit(1);
|
|
614
|
+
});
|
|
615
|
+
}
|
|
616
|
+
}
|
package/dist/setup.d.ts
ADDED
@@ -0,0 +1 @@
export declare function runSetup(): Promise<void>;
package/dist/setup.js
ADDED
@@ -0,0 +1,90 @@
import { readFileSync, writeFileSync, existsSync, mkdirSync } from "fs";
import { join } from "path";
import { createInterface } from "readline";
const rl = createInterface({ input: process.stdin, output: process.stdout });
function ask(question) {
    return new Promise((resolve) => rl.question(question, resolve));
}
function getConfigPath() {
    const platform = process.platform;
    const home = process.env.HOME || process.env.USERPROFILE || "";
    if (platform === "darwin") {
        return join(home, "Library", "Application Support", "Claude", "claude_desktop_config.json");
    }
    else if (platform === "win32") {
        return join(process.env.APPDATA || join(home, "AppData", "Roaming"), "Claude", "claude_desktop_config.json");
    }
    else {
        return join(home, ".config", "claude", "claude_desktop_config.json");
    }
}
export async function runSetup() {
    console.log(`
  browseai-dev setup
  ================
  Configure browseai-dev for Claude Desktop / Cursor / Windsurf
`);
    const browseKey = await ask("  BrowseAI Dev API key (leave blank to use your own Tavily + OpenRouter keys): ");
    let mcpEnv;
    if (browseKey.trim()) {
        mcpEnv = { BROWSE_API_KEY: browseKey.trim() };
    }
    else {
        const serpKey = await ask("  Tavily API key (get one at https://tavily.com): ");
        if (!serpKey.trim()) {
            console.log("\n  Tavily API key is required. Get one at https://tavily.com\n");
            process.exit(1);
        }
        const openrouterKey = await ask("  OpenRouter API key (get one at https://openrouter.ai): ");
        if (!openrouterKey.trim()) {
            console.log("\n  OpenRouter API key is required. Get one at https://openrouter.ai\n");
            process.exit(1);
        }
        mcpEnv = {
            SERP_API_KEY: serpKey.trim(),
            OPENROUTER_API_KEY: openrouterKey.trim(),
        };
    }
    rl.close();
    const mcpEntry = {
        command: "npx",
        args: ["-y", "browseai-dev"],
        env: mcpEnv,
    };
    const configPath = getConfigPath();
    console.log(`\n  Config path: ${configPath}`);
    let config = { mcpServers: {} };
    if (existsSync(configPath)) {
        try {
            config = JSON.parse(readFileSync(configPath, "utf-8"));
            if (!config.mcpServers)
                config.mcpServers = {};
        }
        catch {
            console.log("  Could not parse existing config, creating new one...");
        }
    }
    else {
        const dir = configPath.replace(/[/\\][^/\\]+$/, "");
        mkdirSync(dir, { recursive: true });
    }
    config.mcpServers["browseai-dev"] = mcpEntry;
    writeFileSync(configPath, JSON.stringify(config, null, 2));
    console.log(`
  Done! browseai-dev has been configured.

  Next steps:
  1. Restart Claude Desktop
  2. You should see "browseai-dev" in the MCP tools list
  3. Try asking: "Use browse_answer to explain quantum computing"

  Available tools:
    browse_search  - Search the web
    browse_open    - Fetch and parse a page
    browse_extract - Extract knowledge from a page
    browse_answer  - Full deep research pipeline
    browse_compare - Compare raw LLM vs evidence-backed answer

  Config written to: ${configPath}
`);
}
package/package.json
ADDED
@@ -0,0 +1,63 @@
{
  "name": "browseai-dev",
  "version": "0.2.0",
  "type": "module",
  "description": "Reliable research infrastructure for AI agents. MCP server with real-time web search, evidence extraction, and structured citations. The research layer for LangChain, CrewAI, and custom agents.",
  "keywords": [
    "mcp",
    "claude",
    "ai-agent",
    "web-search",
    "deep-research",
    "model-context-protocol",
    "cursor",
    "windsurf",
    "langchain",
    "crewai",
    "openai",
    "deep-search",
    "rag",
    "agent-tools",
    "research-agent",
    "web-browsing",
    "citations",
    "ai-search",
    "llm-tools",
    "autogpt",
    "llamaindex",
    "evidence",
    "fact-checking"
  ],
  "license": "MIT",
  "repository": {
    "type": "git",
    "url": "git+https://github.com/BrowseAI-HQ/BrowseAI-Dev.git",
    "directory": "apps/mcp"
  },
  "homepage": "https://browseai.dev",
  "bugs": "https://github.com/BrowseAI-HQ/BrowseAI-Dev/issues",
  "bin": {
    "browseai-dev": "dist/index.js"
  },
  "files": [
    "dist",
    "README.md"
  ],
  "scripts": {
    "dev": "tsx src/index.ts",
    "build": "tsc",
    "start": "node dist/index.js",
    "prepublishOnly": "tsc"
  },
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.12.0",
    "@mozilla/readability": "^0.6.0",
    "linkedom": "^0.18.0",
    "zod": "^4.3.6"
  },
  "devDependencies": {
    "tsx": "^4.19.0",
    "typescript": "^5.8.3",
    "@types/node": "^25.5.0"
  }
}