crawlforge-mcp-server 4.6.3 → 4.6.5

Files changed (9)

package/README.md +35 -39
package/package.json +2 -2
package/server.js +10 -18
package/src/core/AgentOrchestrator.js +13 -1
package/src/core/AuthManager.js +57 -49
package/src/core/ResearchOrchestrator.js +71 -29
package/src/server/withAuth.js +10 -5
package/src/tools/research/deepResearch.js +32 -2
package/src/tools/search/searchWeb.js +21 -12

package/README.md CHANGED

@@ -67,7 +67,9 @@
 npm install -g crawlforge-mcp-server
 ```
-### 2. Setup Your API Key
+### 2. Setup Your API Key (optional for the free local tools)
+The 15 free local tools work immediately with **no API key at all** — skip straight to step 3 if that's all you need. To unlock the metered premium tools (`search_web`, `crawl_deep`, `stealth_mode`, `agent`, …):
 ```bash
 npx crawlforge-setup
@@ -148,7 +150,9 @@ Restart Cursor to activate.
 ## 📊 Available Tools
-**Basic Tools** (1 credit each)
+CrawlForge is **open-core**: 15 tools run locally on your machine and are **completely free — no API key required**. The metered premium tools cover real infrastructure (search fees, proxies, browser farms) and need an API key.
+**Free Local Tools** (0 credits, no API key needed)
 | Tool | What it does |
 |------|--------------|
@@ -156,43 +160,33 @@ Restart Cursor to activate.
 | `extract_text` | Extract clean text from web pages |
 | `extract_links` | Get all links from a page |
 | `extract_metadata` | Extract page metadata (title, OG tags, schema.org) |
-| `scrape_template` | Structured data from well-known sites (Amazon, GitHub, LinkedIn, YouTube, Reddit, Hacker News, npm, and more) without writing selectors |
-**Advanced Tools** (2–3 credits)
-| Tool | What it does |
-|------|--------------|
-| `scrape` | **Unified single-fetch, multi-format extraction.** Pass a `formats` array (markdown/html/rawHtml/text/links/metadata/screenshot/json-schema) plus `onlyMainContent`; one fetch serves every requested format with per-format partial-success warnings |
+| `scrape` | **Unified single-fetch, multi-format extraction.** Pass a `formats` array (markdown/html/rawHtml/text/links/metadata/screenshot/json-schema) plus `onlyMainContent`; one fetch serves every requested format with per-format partial-success warnings. *The `screenshot` format is the one metered exception (2 credits — needs a server browser)* |
 | `scrape_structured` | Extract structured data with CSS selectors |
-| `search_web` | Search the web using Google Search API |
+| `scrape_template` | Structured data from well-known sites (Amazon, GitHub, LinkedIn, YouTube, Reddit, Hacker News, npm, and more) without writing selectors |
+| `extract_content` | Enhanced content extraction |
 | `summarize_content` | Generate intelligent summaries |
 | `analyze_content` | Comprehensive content analysis |
-| `extract_structured` | LLM-powered schema-driven extraction |
+| `extract_structured` | LLM-powered schema-driven extraction (your own LLM key or local Ollama) |
 | `extract_with_llm` | Natural-language extraction. **Defaults to a local Ollama model — no API key, no API costs.** Pass `provider: "openai" \| "anthropic"` with the matching key for cloud models |
-| `list_ollama_models` | List the Ollama models installed locally (free; helps you pick a `model` for `extract_with_llm`) |
-| `track_changes` | Monitor content changes over time |
+| `process_document` | Multi-format document processing |
+| `list_ollama_models` | List the Ollama models installed locally (helps you pick a `model` for `extract_with_llm`) |
 | `get_batch_results` | Retrieve paginated results for a `batch_scrape` job by `batchId` |
-**Premium Tools** (5–10 credits)
-| Tool | What it does |
-|------|--------------|
-| `agent` | **Autonomous research/extraction from a natural-language prompt — no URLs required.** Plans, gathers, and shapes an answer under hard safety stops (max steps/URLs/wall-clock enforced by the orchestrator, never the LLM) |
-| `crawl_deep` | Deep crawl entire websites |
-| `map_site` | Discover and map website structure (optional `search=` ranks the discovered URLs) |
-| `batch_scrape` | Process multiple URLs simultaneously |
-| `deep_research` | Multi-stage research with source verification |
-| `stealth_mode` | Anti-detection browser management |
-**Heavy Processing** (3–10 credits)
-| Tool | What it does |
-|------|--------------|
-| `process_document` | Multi-format document processing |
-| `extract_content` | Enhanced content extraction |
-| `scrape_with_actions` | Browser automation chains |
-| `generate_llms_txt` | Generate AI interaction guidelines |
-| `localization` | Multi-language and geo-location management |
+**Metered Premium Tools** (3–10 credits, API key required)
+| Tool | Credits | What it does |
+|------|---------|--------------|
+| `map_site` | 3 | Discover and map website structure (optional `search=` ranks the discovered URLs) |
+| `track_changes` | 3 | Monitor content changes over time |
+| `search_web` | 5 | Search the web using Google Search API |
+| `crawl_deep` | 5 | Deep crawl entire websites |
+| `batch_scrape` | 5 | Process multiple URLs simultaneously |
+| `scrape_with_actions` | 5 | Browser automation chains |
+| `generate_llms_txt` | 5 | Generate AI interaction guidelines |
+| `localization` | 5 | Multi-language and geo-location management |
+| `agent` | 8 | **Autonomous research/extraction from a natural-language prompt — no URLs required.** Plans, gathers, and shapes an answer under hard safety stops (max steps/URLs/wall-clock enforced by the orchestrator, never the LLM) |
+| `deep_research` | 10 | Multi-stage research with source verification |
+| `stealth_mode` | 10 | Anti-detection browser management |
 For the full canonical capabilities reference (all tools, CLI commands, stealth engines, research workflow), see [SKILL.md](SKILL.md).
@@ -200,15 +194,17 @@ For the full canonical capabilities reference (all tools, CLI commands, stealth
 ## 💳 Pricing
+**15 local tools are free forever — no API key, no credit card.** Credits only meter the premium tools that run on CrawlForge infrastructure.
 | Plan | Credits/Month | Best For |
 |------|---------------|----------|
 | **Free** | 1,000 | Testing & personal projects |
-| **Starter** | 5,000 | Small projects & development |
-| **Professional** | 50,000 | Professional use & production |
-| **Enterprise** | 250,000 | Large scale operations |
+| **Hobby** ($19) | 5,000 | Small projects & development |
+| **Professional** ($99) | 50,000 | Professional use & production |
+| **Business** ($399) | 250,000 | Large scale operations |
 **All plans include:**
-- Access to all 26 tools
+- Access to all 26 tools (the 15 local tools never consume credits)
 - Credits never expire and roll over month-to-month
 - API access and webhook notifications
@@ -277,7 +273,7 @@ Once configured, use these tools in your AI assistant:
 ## 🔒 Security & Privacy
-- **Secure Authentication**: API keys required for all operations (no bypass methods)
+- **Secure Authentication**: API keys required for all metered premium tools (the 15 free local tools run without one)
 - **Local Storage**: API keys stored securely at `~/.crawlforge/config.json`
 - **HTTPS Only**: All connections use encrypted HTTPS
 - **No Data Retention**: We don't store scraped data, only usage logs
@@ -291,7 +287,7 @@ Once configured, use these tools in your AI assistant:
 - **Action allowlist**: `scrape_with_actions` accepts only 7 action types (`wait`, `click`, `type`, `press`, `scroll`, `screenshot`, `executeJavaScript`). No download, file-write, or arbitrary cross-page navigation primitives exist.
 - **JavaScript gate**: The `executeJavaScript` action throws by default. Set `ALLOW_JAVASCRIPT_EXECUTION=true` at deploy time to enable (not recommended in production).
 - **MCP Elicitation** (v3.6.0): Four tools request user confirmation before executing expensive operations — `deep_research` (>50 URLs), `batch_scrape` (sync mode, >25 URLs), `crawl_deep` (projected >500 pages), `extract_structured` (schema has >3 required fields with no LLM configured). Credit-low situations also elicit. Confirmation is best-effort: if the MCP client does not support elicitation the tool proceeds (fail-open).
-- **Per-tool credit gating**: Every tool is wrapped with `withAuth()`, which checks and deducts credits before execution. Fail-closed since v3.0.18.
+- **Per-tool credit gating**: Every tool is wrapped with `withAuth()`; metered tools check and deduct credits before execution (fail-closed since v3.0.18). Free local tools (cost 0) skip the credit path entirely.
 See [docs/sandboxing-and-approvals.md](docs/sandboxing-and-approvals.md) for the full reference.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "crawlforge-mcp-server",
-  "version": "4.6.3",
+  "version": "4.6.5",
   "mcpName": "io.github.mysleekdesigns/crawlforge-mcp-server",
   "description": "CrawlForge MCP Server - Professional Model Context Protocol server with 26 web scraping, crawling, deep-research, and autonomous-extraction tools. Returns clean Markdown and structured JSON for Claude, Cursor, and any MCP client. Defaults to local Ollama for LLM extraction (no API key needed); OpenAI/Anthropic available as opt-in. Includes a unified multi-format scrape tool, an autonomous agent, pre-built site templates, and Camoufox stealth browsing.",
   "main": "server.js",
@@ -18,7 +18,7 @@
     "test": "node tests/integration/mcp-protocol-compliance.test.js",
     "test:unit": "CRAWLFORGE_CREATOR_SECRET= node --test 'tests/unit/*.test.js'",
     "test:integration": "CRAWLFORGE_CREATOR_SECRET= node --test 'tests/integration/tools/*.test.js'",
-    "test:coverage": "CRAWLFORGE_CREATOR_SECRET= c8 --reporter=text --reporter=lcov --include='src/**/*.js' --exclude='src/**/_*.js' --lines=60 --statements=60 --functions=55 --branches=45 node --test 'tests/unit/*.test.js' 'tests/integration/tools/*.test.js'",
+    "test:coverage": "CRAWLFORGE_CREATOR_SECRET= c8 --reporter=text --reporter=lcov --include='src/**/*.js' --exclude='src/**/_*.js' --lines=60 --statements=60 --functions=55 --branches=45 node --test --test-force-exit 'tests/unit/*.test.js' 'tests/integration/tools/*.test.js'",
     "test:tools": "node test-tools.js",
     "test:real-world": "node test-real-world.js",
     "test:all": "bash run-all-tests.sh",

package/server.js CHANGED

@@ -68,23 +68,15 @@ if (!AuthManager.isAuthenticated() && !AuthManager.isCreatorMode()) {
       process.exit(1);
     }
   } else {
-    console.log('');
-    console.log('╔═══════════════════════════════════════════════════════╗');
-    console.log('║        CrawlForge MCP Server - Setup Required         ║');
-    console.log('╚═══════════════════════════════════════════════════════╝');
-    console.log('');
-    console.log('Welcome! This appears to be your first time using CrawlForge.');
-    console.log('');
-    console.log('To get started, please run:');
-    console.log('  npm run setup');
-    console.log('');
-    console.log('Or set your API key via environment variable:');
-    console.log('  export CRAWLFORGE_API_KEY="your_api_key_here"');
-    console.log('');
-    console.log('Get your free API key at: https://www.crawlforge.dev/signup');
-    console.log('(Includes 1,000 free credits!)');
-    console.log('');
-    process.exit(0);
+    // Open-core Phase 2: no API key is fine — start in free-tier mode.
+    // Tier-0 tools (cost 0) run locally without a key; Tier-1 metered tools
+    // return a "not configured" error until a key is set.
+    // Status → stderr; stdout is reserved for the MCP JSON-RPC stream.
+    console.error('ℹ️  CrawlForge running in free-tier mode (no API key configured).');
+    console.error('   Free local tools work out of the box. Premium tools (search_web,');
+    console.error('   crawl_deep, stealth_mode, agent, deep_research, …) need an API key:');
+    console.error('   get one at https://www.crawlforge.dev/signup, then run `npm run setup`');
+    console.error('   or set CRAWLFORGE_API_KEY.');
   }
 }
@@ -98,7 +90,7 @@ if (configErrors.length > 0 && config.server.nodeEnv === 'production') {
 // Create the server
 const server = new McpServer({
   name: "crawlforge",
-  version: "4.6.3",
+  version: "4.6.5",
   description: "Production-ready MCP server with 26 web scraping, crawling, and content processing tools. Features MCP Resources (crawlforge://), Prompts, Sampling fallback, Elicitation, stealth browsing, deep research, structured extraction, change tracking, local-LLM extraction via Ollama, unified multi-format scrape, and autonomous agent tool.",
   homepage: "https://www.crawlforge.dev",
   icon: "https://www.crawlforge.dev/icon.png"

package/src/core/AgentOrchestrator.js CHANGED

@@ -99,7 +99,10 @@ export class AgentOrchestrator {
       const { ResearchOrchestrator } = await import('./ResearchOrchestrator.js');
       this._researchOrchestrator = new ResearchOrchestrator({
         maxUrls: 50,
-        timeLimit: DEFAULT_WALL_CLOCK_MS
+        timeLimit: DEFAULT_WALL_CLOCK_MS,
+        // Without this the orchestrator builds a keyless SearchWebTool and
+        // every pro-model search silently fails (zero sources).
+        searchConfig: this._searchConfig
       });
     }
     return this._researchOrchestrator;
@@ -147,6 +150,15 @@ export class AgentOrchestrator {
           timeLimit: wallClockMs,
           researchApproach: 'focused'
         });
+        // conductResearch never rejects — failures come back as an error payload
+        if (result?.error) {
+          return {
+            success: false,
+            degraded: true,
+            reason: `pro research failed: ${result.error}`,
+            answer: null
+          };
+        }
         return { success: true, answer: result, model: 'pro', degraded: false };
       } catch (err) {
         // Fall through to default path on pro failure
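
The `result?.error` check added in the hunk above can be looked at in isolation. The following is a minimal sketch, not the package's code: `interpretProResult` is a hypothetical helper, and the failure shape (`{ error: string }`) is assumed from the diff comment stating that `conductResearch` never rejects.

```javascript
// Hypothetical helper mirroring the added check: a resolved promise can still
// carry a failure, so the payload itself has to be inspected.
function interpretProResult(result) {
  if (result?.error) {
    return {
      success: false,
      degraded: true,
      reason: `pro research failed: ${result.error}`,
      answer: null
    };
  }
  return { success: true, answer: result, model: 'pro', degraded: false };
}

// Assumed example payloads, for illustration only.
const failedRun = interpretProResult({ error: 'All 5 search queries failed' });
const goodRun = interpretProResult({ summary: 'example findings' });

console.log(failedRun.success, failedRun.reason);
console.log(goodRun.success, goodRun.answer.summary);
```

Because the failure arrives as a resolved value, a `try/catch` around the `await` alone would not see it; the explicit payload check is what turns it into a degraded result.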

package/src/core/AuthManager.js CHANGED

@@ -238,7 +238,12 @@ class AuthManager {
     if (this.isCreatorMode()) {
       return true;
     }
+    // Open-core Phase 2: Tier-0 tools cost 0 and run without an API key
+    if (estimatedCredits === 0) {
+      return true;
+    }
     if (!this.config) {
       throw new Error('CrawlForge not configured. Run setup first.');
     }
@@ -500,54 +505,56 @@ class AuthManager {
   }
   /**
-   * Get credit cost for a tool
+   * Get credit cost for a tool.
+   *
+   * Open-core Phase 1 (docs/tier-map.md): this table is the single source of
+   * truth shared with the backend (crawlforge-website/src/lib/credits.ts).
+   * Tier 0 tools run locally on the user's machine and cost 0; Tier 1 tools
+   * are metered per COGS.
+   *
+   * @param {string} tool
+   * @param {object} [params] — invocation params; only used for per-call
+   *        exceptions (scrape's screenshot format needs a server browser).
    */
-  getToolCost(tool) {
+  getToolCost(tool, params) {
+    // Tier-0 exception: the screenshot format of `scrape` is browser-backed
+    if (tool === 'scrape' && Array.isArray(params?.formats) && params.formats.includes('screenshot')) {
+      return 2;
+    }
     const costs = {
-      // Basic tools (1 credit)
-      fetch_url: 1,
-      extract_text: 1,
-      extract_links: 1,
-      extract_metadata: 1,
-      // Advanced tools (2-3 credits)
-      scrape_structured: 2,
-      search_web: 2,
-      summarize_content: 2,
-      analyze_content: 2,
-      // Premium tools (5-10 credits)
+      // Tier 0 — free, local (key optional)
+      fetch_url: 0,
+      extract_text: 0,
+      extract_links: 0,
+      extract_metadata: 0,
+      scrape_structured: 0,
+      scrape_template: 0,
+      extract_content: 0,
+      scrape: 0, // 2 if formats includes 'screenshot' (handled above)
+      summarize_content: 0,
+      analyze_content: 0,
+      extract_with_llm: 0,
+      extract_structured: 0,
+      process_document: 0,
+      list_ollama_models: 0,
+      get_batch_results: 0, // retrieval of an already-paid batch job
+      // Tier 1 — metered (costs reflect COGS)
+      map_site: 3,
+      track_changes: 3,
+      generate_llms_txt: 5,
+      search_web: 5,
       crawl_deep: 5,
-      map_site: 5,
       batch_scrape: 5,
-      deep_research: 10,
-      stealth_mode: 10,
-      // Heavy processing (3-5 credits)
-      process_document: 3,
-      extract_content: 3,
       scrape_with_actions: 5,
-      generate_llms_txt: 3,
       localization: 5,
-      track_changes: 3,
-      // Phase 1: LLM-Powered Structured Extraction
-      extract_structured: 4,
-      // D3.3: Pre-built site templates (1 credit — same as fetch_url)
-      extract_with_llm: 5,
-      // D3.3: Pre-built site templates (1 credit per template scrape)
-      scrape_template: 1,
-      // Phase D (v4.6.0)
-      // scrape: base 2; projectCost() scales with format count
-      scrape: 2,
-      // agent: base 8; projectCost() scales with maxUrls
-      agent: 8
+      agent: 8, // projectCost() scales with maxUrls
+      deep_research: 10,
+      stealth_mode: 10
     };
-    return costs[tool] || 1;
+    return costs[tool] ?? 1;
   }
   /**
@@ -563,11 +570,11 @@ class AuthManager {
    * @returns {{ projected: number, note: string }}
    */
   projectCost(toolName, params) {
-    const base = this.getToolCost(toolName);
+    const base = this.getToolCost(toolName, params);
     // Override for tools whose cost scales with params
     let projected = base;
-    let note = 'Fixed cost per invocation.';
+    let note = base === 0 ? 'Free local tool — no credits charged.' : 'Fixed cost per invocation.';
     switch (toolName) {
       case 'batch_scrape': {
@@ -589,13 +596,14 @@ class AuthManager {
         break;
       }
       case 'extract_with_llm':
-        note = 'Includes external LLM API call cost (not billed in credits, billed by your LLM provider).';
+        note = 'Free local tool. External LLM API call billed by your LLM provider, not in credits.';
         break;
       case 'scrape': {
-        // Base 2 + 1 per format beyond the first
-        const fmtCount = Array.isArray(params?.formats) ? params.formats.length : 1;
-        projected = Math.max(base, base + Math.max(0, fmtCount - 1));
-        note = `Estimated from ${fmtCount} format(s). json format may incur external LLM cost.`;
+        // Free local tool; only the browser-backed screenshot format is metered
+        projected = base;
+        note = base > 0
+          ? 'screenshot format requires a server browser (2 credits). Other formats are free.'
+          : 'Free local tool — no credits charged. json format may incur external LLM cost.';
         break;
       }
       case 'agent': {
@@ -606,7 +614,7 @@ class AuthManager {
         break;
       }
       default:
-        note = 'Fixed cost per invocation.';
+        note = base === 0 ? 'Free local tool — no credits charged.' : 'Fixed cost per invocation.';
     }
     return { projected, note };
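
The switch from `||` to `??` in `getToolCost` matters once the table contains zeros. Below is a minimal sketch under stated assumptions: the cost table is a trimmed-down, hypothetical subset of the one in the diff, and `legacyLookup` is an illustrative name for the old fallback expression.

```javascript
// Hypothetical subset of the cost table shown in the diff.
const costs = { fetch_url: 0, scrape: 0, map_site: 3, search_web: 5 };

// Params-aware lookup, following the new logic: the screenshot format of
// `scrape` is metered, everything else reads from the table.
function getToolCost(tool, params) {
  if (tool === 'scrape' && Array.isArray(params?.formats) && params.formats.includes('screenshot')) {
    return 2;
  }
  // `??` falls back only on null/undefined, so a cost of 0 survives.
  return costs[tool] ?? 1;
}

// Old fallback expression: `||` treats 0 as falsy and turns it into 1.
function legacyLookup(tool) {
  return costs[tool] || 1;
}

console.log(getToolCost('fetch_url'));   // 0
console.log(legacyLookup('fetch_url'));  // 1
console.log(getToolCost('scrape', { formats: ['markdown', 'screenshot'] })); // 2
console.log(getToolCost('not_a_tool'));  // 1
```

With the old `||` fallback, every zero-cost entry would have been billed as 1 credit, so the operator change appears to be a prerequisite for the free tier rather than a style tweak.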

package/src/core/ResearchOrchestrator.js CHANGED

@@ -34,12 +34,14 @@ export class ResearchOrchestrator extends EventEmitter {
       enableConflictDetection = true,
       cacheEnabled = true,
       cacheTTL = 1800000, // 30 minutes
+      researchApproach = 'broad',
       searchConfig = {},
       crawlConfig = {},
       extractConfig = {},
       summarizeConfig = {}
     } = options;
+    this.researchApproach = researchApproach;
     this.maxDepth = Math.min(Math.max(1, maxDepth), 10);
     this.maxUrls = Math.min(Math.max(1, maxUrls), 1000);
     this.timeLimit = Math.min(Math.max(30000, timeLimit), 300000);
@@ -269,32 +271,50 @@ export class ResearchOrchestrator extends EventEmitter {
   }
   /**
-   * Generate research-specific query variations
+   * Generate research-specific query variations, tuned to the research approach.
+   *
+   * Academic/scientific suffixes ("peer reviewed", "research paper", "what is")
+   * only help when the caller actually asked for an academic search. Appending
+   * them to commercial or comparative topics dragged web search toward
+   * irrelevant government/academic PDFs and long-tail noise — the cause of
+   * near-empty research runs on niche commercial topics.
    */
   generateResearchVariations(topic) {
-    const variations = [];
-    // Question-based variations
-    variations.push(`what is ${topic}`);
-    variations.push(`how does ${topic} work`);
-    variations.push(`${topic} explained`);
-    variations.push(`${topic} research`);
-    variations.push(`${topic} studies`);
-    variations.push(`${topic} analysis`);
-    // Academic and authoritative variations
-    variations.push(`${topic} academic`);
-    variations.push(`${topic} scientific`);
-    variations.push(`${topic} research paper`);
-    variations.push(`${topic} peer reviewed`);
-    // Current and historical context
-    variations.push(`latest ${topic}`);
-    variations.push(`current ${topic}`);
-    variations.push(`${topic} 2024`);
-    variations.push(`${topic} trends`);
-    return variations.slice(0, 10); // Limit variations
+    const approach = this.researchApproach || 'broad';
+    if (approach === 'academic') {
+      return [
+        `${topic} research`,
+        `${topic} study`,
+        `${topic} analysis`,
+        `${topic} academic`,
+        `${topic} scientific`,
+        `${topic} research paper`,
+        `${topic} peer reviewed`,
+        `${topic} explained`
+      ];
+    }
+    if (approach === 'current_events') {
+      return [
+        `latest ${topic}`,
+        `${topic} news`,
+        `recent ${topic}`,
+        `${topic} update`,
+        `${topic} announcement`
+      ];
+    }
+    // broad / focused / comparative — commercial & general intent
+    return [
+      `${topic} review`,
+      `${topic} reviews`,
+      `${topic} comparison`,
+      `${topic} vs alternatives`,
+      `${topic} pricing`,
+      `best ${topic}`,
+      `${topic} company`
+    ];
   }
   /**
@@ -409,18 +429,20 @@ export class ResearchOrchestrator extends EventEmitter {
    */
   async gatherInitialSources(queries, options) {
     const allSources = [];
+    const searchErrors = [];
+    const attemptedQueries = queries.slice(0, 5);
     const maxSourcesPerQuery = Math.ceil(this.maxUrls / queries.length);
     await this.processWithTimeLimit(async () => {
-      const searchPromises = queries.slice(0, 5).map(async (query) => {
+      const searchPromises = attemptedQueries.map(async (query) => {
         try {
-          this.metrics.searchQueries++;
           const searchResults = await this.searchTool.execute({
             query,
             limit: maxSourcesPerQuery,
             enable_ranking: true,
             enable_deduplication: true
           });
+          this.metrics.searchQueries++;
           if (searchResults.results && searchResults.results.length > 0) {
             const processedResults = searchResults.results.map(result => ({
@@ -437,6 +459,7 @@ export class ResearchOrchestrator extends EventEmitter {
           return [];
         } catch (error) {
           this.logger.warn('Search failed for query', { query, error: error.message });
+          searchErrors.push({ query, error: error.message });
           return [];
         }
       });
@@ -445,6 +468,14 @@ export class ResearchOrchestrator extends EventEmitter {
       results.forEach(sources => allSources.push(...sources));
     });
+    // Fail loudly when every search threw (e.g. missing API key) instead of
+    // reporting a successful research run with zero sources.
+    if (searchErrors.length === attemptedQueries.length && searchErrors.length > 0) {
+      throw new Error(
+        `All ${searchErrors.length} search queries failed — first error: ${searchErrors[0].error}`
+      );
+    }
     // Deduplicate and rank sources
     const uniqueSources = this.deduplicateSources(allSources);
     const rankedSources = await this.rankSourcesByResearchValue(uniqueSources);
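
The all-failed guard added above distinguishes a partial failure from a total one. Here is a minimal sketch of that condition as a standalone function; `assertSomeSearchSucceeded` is a hypothetical name, and the error message wording is simplified rather than copied from the package.

```javascript
// Hypothetical stand-in for the guard: throw only when every attempted query
// produced an error, and tolerate partial failures.
function assertSomeSearchSucceeded(attemptedQueries, searchErrors) {
  if (searchErrors.length > 0 && searchErrors.length === attemptedQueries.length) {
    throw new Error(
      `All ${searchErrors.length} search queries failed; first error: ${searchErrors[0].error}`
    );
  }
}

// Partial failure: one of two queries failed, so no exception is raised.
assertSomeSearchSucceeded(
  ['topic review', 'topic pricing'],
  [{ query: 'topic pricing', error: 'timeout' }]
);

// Total failure: every query failed, so the guard throws.
let totalFailureMessage = null;
try {
  assertSomeSearchSucceeded(
    ['topic review'],
    [{ query: 'topic review', error: 'API key is required' }]
  );
} catch (err) {
  totalFailureMessage = err.message;
}
console.log(totalFailureMessage);
```

The `length > 0` clause keeps an empty query list from being reported as a failure, which matches the condition in the diff.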
@@ -633,8 +664,19 @@ export class ResearchOrchestrator extends EventEmitter {
           citationPotential: this.assessCitationPotential(source)
         };
-        const overallCredibility = this.calculateOverallCredibility(credibilityFactors);
+        let overallCredibility = this.calculateOverallCredibility(credibilityFactors);
+        // Down-weight topically-irrelevant sources so high-authority but
+        // off-topic pages (e.g. a .gov PDF unrelated to the query) don't
+        // dominate the results. relevanceScore is keyword-based here (no LLM):
+        // ~1 when the topic appears in the content, ~0 when it doesn't.
+        const relevance = typeof source.relevanceScore === 'number'
+          ? source.relevanceScore
+          : null;
+        if (relevance !== null) {
+          overallCredibility *= (0.4 + 0.6 * relevance);
+        }
         // Only include sources that meet minimum credibility threshold
         if (overallCredibility >= 0.3) {
           verifiedSources.push({
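
The relevance down-weighting in the last hunk is a linear scale between 0.4 (no topical match) and 1.0 (full match), applied before the 0.3 threshold. A minimal sketch of the arithmetic, assuming hypothetical credibility values; `adjustCredibility` and `MIN_CREDIBILITY` are illustrative names, not identifiers from the package.

```javascript
// Threshold taken from the diff; the constant name is an assumption.
const MIN_CREDIBILITY = 0.3;

// Scale credibility by (0.4 + 0.6 * relevance) when a numeric relevance score
// is present; leave it untouched otherwise, as the diff does.
function adjustCredibility(overallCredibility, relevanceScore) {
  if (typeof relevanceScore !== 'number') {
    return overallCredibility;
  }
  return overallCredibility * (0.4 + 0.6 * relevanceScore);
}

// Hypothetical sources: same base credibility, different relevance.
const offTopic = adjustCredibility(0.7, 0);         // roughly 0.28, below the threshold
const onTopic = adjustCredibility(0.7, 1);          // stays at 0.7
const unscored = adjustCredibility(0.7, undefined); // no score, unchanged

console.log(offTopic >= MIN_CREDIBILITY); // false
console.log(onTopic >= MIN_CREDIBILITY);  // true
console.log(unscored >= MIN_CREDIBILITY); // true
```

Since the multiplier never drops below 0.4, a source with very high base credibility (say 0.9, giving about 0.36) would still clear the threshold at zero relevance; the change down-weights off-topic pages rather than excluding them outright.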

package/src/server/withAuth.js CHANGED

@@ -4,7 +4,8 @@
  * (OpenTelemetry spans + Prometheus counters) added in v3.2.0.
  *
  * Contract:
- *   - resolves toolCost once per call
+ *   - resolves toolCost once per call (params-aware; 0-cost Tier-0 tools skip
+ *     the credit check and usage reports entirely — open-core Phase 2)
  *   - try/finally guarantees a single `tool invocation` log line per call
  *   - log payload: { toolName, paramHash, durationMs, outcome, creditCost, creatorMode }
  *   - outcome ∈ { 'success' | 'error' | 'insufficient_credits' }
@@ -35,12 +36,16 @@ export function makeWithAuth({ authManager, logger, metrics = null }) {
       const startTime = Date.now();
       const paramHash = hashParams(params);
       const creatorMode = authManager.isCreatorMode();
-      const creditCost = creatorMode ? 0 : authManager.getToolCost(toolName);
+      // Params-aware: scrape's screenshot format is metered, other formats free
+      const creditCost = creatorMode ? 0 : authManager.getToolCost(toolName, params);
+      // Open-core Phase 2: Tier-0 tools (cost 0) run locally for free — no
+      // credit check, no usage report, and no API key required.
+      const freeTier = creditCost === 0;
       let outcome = 'pending';
       let thrown = null;
       try {
-        if (!creatorMode) {
+        if (!creatorMode && !freeTier) {
           const hasCredits = await authManager.checkCredits(creditCost);
           if (!hasCredits) {
             outcome = 'insufficient_credits';
@@ -85,7 +90,7 @@ export function makeWithAuth({ authManager, logger, metrics = null }) {
           // Cost injection must never break the request path
         }
-        if (!creatorMode) {
+        if (!creatorMode && !freeTier) {
           await authManager.reportUsage(toolName, creditCost, params, 200, Date.now() - startTime);
         }
@@ -93,7 +98,7 @@ export function makeWithAuth({ authManager, logger, metrics = null }) {
       } catch (error) {
         outcome = 'error';
         thrown = error;
-        if (!creatorMode) {
+        if (!creatorMode && !freeTier) {
           await authManager.reportUsage(
             toolName,
             Math.max(1, Math.floor(creditCost * 0.5)),
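
The `!freeTier` guard on the error path is worth a closer look, because the error-charge expression has a floor of 1. The sketch below reduces the wrapper's decisions to two pure functions; both function names are hypothetical, and only the arithmetic expression is taken from the diff.

```javascript
// Hypothetical reduction of the wrapper's gating decision: metering (credit
// check and usage report) applies only outside creator mode and only to
// tools with a non-zero cost.
function shouldMeter({ creatorMode, creditCost }) {
  const freeTier = creditCost === 0;
  return !creatorMode && !freeTier;
}

// Credits reported when a metered tool throws (expression as in the diff).
function errorCharge(creditCost) {
  return Math.max(1, Math.floor(creditCost * 0.5));
}

console.log(shouldMeter({ creatorMode: false, creditCost: 0 })); // false
console.log(shouldMeter({ creatorMode: false, creditCost: 5 })); // true
console.log(errorCharge(10)); // 5
console.log(errorCharge(5));  // 2
console.log(errorCharge(0));  // 1
```

Because `errorCharge(0)` evaluates to 1, a failing free tool would have reported one credit if the error path were not also guarded by `!freeTier`.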

package/src/tools/research/deepResearch.js CHANGED

@@ -2,6 +2,7 @@ import { z } from 'zod';
 // D1.4: Elicitation helper (injected from server.js or can be used standalone)
 import { ElicitationHelper } from '../../core/ElicitationHelper.js';
 import { ResearchOrchestrator } from '../../core/ResearchOrchestrator.js';
+import { getToolConfig } from '../../constants/config.js';
 import { Logger } from '../../utils/Logger.js';
 /**
@@ -172,6 +173,20 @@ export class DeepResearchTool {
           this.buildResearchOptions(validated)
         );
+        // conductResearch never rejects — orchestrator failures come back as a
+        // handleResearchError() payload. Surface them as a failed run instead
+        // of formatting them into a success-shaped result.
+        if (researchResults?.error) {
+          this.activeSessions.delete(sessionId);
+          return {
+            success: false,
+            sessionId,
+            error: researchResults.error,
+            partialResults: validated.includeRawData ? researchResults.partialResults : undefined,
+            recommendations: researchResults.recommendations
+          };
+        }
         // Format results according to output preference
         const formattedResults = this.formatResults(researchResults, validated);
@@ -236,7 +251,15 @@ export class DeepResearchTool {
    */
   buildOrchestratorConfig(params) {
     const baseConfig = { ...this.defaultOrchestratorConfig };
+    // The orchestrator constructs its own SearchWebTool, so it needs the same
+    // config (apiKey/apiBaseUrl) as the registered search_web tool — without
+    // it every internal search throws and research returns zero sources.
+    baseConfig.searchConfig = {
+      ...getToolConfig('search_web'),
+      ...baseConfig.searchConfig
+    };
     // Add LLM configuration if provided
     if (params.llmConfig) {
       baseConfig.llmConfig = params.llmConfig;
@@ -248,7 +271,11 @@ export class DeepResearchTool {
     const scopeConfig = {
       maxUrls: params.maxUrls,
       timeLimit: params.timeLimit,
-      concurrency: params.concurrency
+      concurrency: params.concurrency,
+      // The orchestrator tunes its query expansion to the approach (commercial
+      // vs academic vs current-events); without this it always used academic
+      // variations, which poisoned commercial/comparative searches.
+      researchApproach: params.researchApproach
     };
     switch (params.researchApproach) {
@@ -259,6 +286,7 @@ export class DeepResearchTool {
           maxDepth: Math.min(params.maxDepth, 8),
           enableSourceVerification: true,
           searchConfig: {
+            ...baseConfig.searchConfig,
             enableRanking: true,
             rankingWeights: {
               authority: 0.4, // Higher weight for academic sources
@@ -275,6 +303,7 @@ export class DeepResearchTool {
           ...scopeConfig,
           maxDepth: Math.min(params.maxDepth, 6),
           searchConfig: {
+            ...baseConfig.searchConfig,
             enableRanking: true,
             rankingWeights: {
               freshness: 0.4, // Prioritize recent content
@@ -301,6 +330,7 @@ export class DeepResearchTool {
           enableConflictDetection: true,
           maxDepth: params.maxDepth,
           searchConfig: {
+            ...baseConfig.searchConfig,
             enableDeduplication: true,
             deduplicationThresholds: {
               url: 0.9,
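
The config merging in this file relies on object-spread precedence: later spreads and later keys win. A minimal sketch with made-up values; the keys `apiKey` and `apiBaseUrl` come from the diff comment, while the values and variable names are assumptions standing in for what `getToolConfig('search_web')` and the default orchestrator config would return.

```javascript
// Hypothetical stand-in for getToolConfig('search_web').
const toolConfig = {
  apiKey: 'key-from-tool-config',
  apiBaseUrl: 'https://api.example.invalid'
};

// Hypothetical stand-in for a pre-existing defaultOrchestratorConfig.searchConfig.
const existingSearchConfig = { apiBaseUrl: 'https://override.example.invalid' };

// Same order as buildOrchestratorConfig: tool config first, existing values
// second, so anything already set on the orchestrator config takes precedence.
const mergedSearchConfig = { ...toolConfig, ...existingSearchConfig };

// Same order as the per-approach branches: the merged config is spread first,
// then approach-specific keys are layered on top.
const academicSearchConfig = { ...mergedSearchConfig, enableRanking: true };

console.log(mergedSearchConfig.apiKey);
console.log(mergedSearchConfig.apiBaseUrl);
console.log(academicSearchConfig.enableRanking, academicSearchConfig.apiKey);
```

Spreading `baseConfig.searchConfig` at the start of each branch is what carries the key into the per-approach config; a branch that assigns a fresh `searchConfig` object without the spread would drop it.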

package/src/tools/search/searchWeb.js CHANGED

@@ -79,19 +79,23 @@ export class SearchWebTool {
     // Check for Creator Mode - allows search without API key for development/testing
     const isCreatorMode = isCreatorModeVerified();
+    // Open-core Phase 2: no API key is allowed at construction time (the server
+    // now starts in free-tier mode without one). The key requirement is
+    // enforced at execute() time instead, so Tier-0 tools keep working.
     if (!apiKey && !isCreatorMode) {
-      throw new Error('CrawlForge API key is required for search functionality');
-    }
-    // Create the search adapter (CrawlForge API proxy or Google Search API direct in Creator Mode)
-    try {
-      this.searchAdapter = SearchProviderFactory.createAdapter(apiKey, {
-        apiBaseUrl,
-        creatorMode: isCreatorMode
-      });
-      this.isCreatorModeFallback = !apiKey && isCreatorMode;
-    } catch (error) {
-      throw new Error(`Failed to initialize search adapter: ${error.message}`);
+      this.searchAdapter = null;
+      this.isCreatorModeFallback = false;
+    } else {
+      // Create the search adapter (CrawlForge API proxy or Google Search API direct in Creator Mode)
+      try {
+        this.searchAdapter = SearchProviderFactory.createAdapter(apiKey, {
+          apiBaseUrl,
+          creatorMode: isCreatorMode
+        });
+        this.isCreatorModeFallback = !apiKey && isCreatorMode;
+      } catch (error) {
+        throw new Error(`Failed to initialize search adapter: ${error.message}`);
+      }
     }
     this.cache = cacheEnabled ? new CacheManager({ ttl: cacheTTL }) : null;
@@ -123,6 +127,11 @@ export class SearchWebTool {
       }
       // --- end SearXNG short-circuit ---
+      // Free-tier mode: search via the CrawlForge proxy needs an API key
+      if (!this.searchAdapter) {
+        throw new Error('CrawlForge API key is required for search functionality. Get one at https://www.crawlforge.dev/signup');
+      }
       // Apply localization if specified
       let localizedParams = validated;
       if (validated.localization) {
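
The change above moves the API-key requirement from construction time to call time. The following is a minimal sketch of that pattern, not the package's class: `DeferredKeySearch` and its inline adapter are hypothetical, and `execute` is synchronous here for brevity whereas the tool's own `execute` awaits its search call.

```javascript
// Hypothetical class illustrating deferred enforcement of a required key.
class DeferredKeySearch {
  constructor({ apiKey } = {}) {
    // No throw here: a missing key just leaves the adapter unset, so the
    // object can still be constructed and registered.
    this.searchAdapter = apiKey
      ? { search: (query) => [`result for ${query}`] }
      : null;
  }

  execute({ query }) {
    // The requirement is enforced only when a search is actually attempted.
    if (!this.searchAdapter) {
      throw new Error('API key is required for search functionality');
    }
    return this.searchAdapter.search(query);
  }
}

const keyless = new DeferredKeySearch(); // constructs without error
let keylessError = null;
try {
  keyless.execute({ query: 'mcp' });
} catch (err) {
  keylessError = err.message;
}
console.log(keylessError);

const keyed = new DeferredKeySearch({ apiKey: 'example-key' });
console.log(keyed.execute({ query: 'mcp' }));
```

Deferring the check lets a server register every tool at startup without a key, while a keyless call to the metered tool still fails with an explicit message.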