npm - tokens-for-good - Versions diffs - 0.3.5 → 0.3.7 - Mend

tokens-for-good 0.3.5 → 0.3.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6)

package/package.json +1 -1
package/pipeline/01-research/PROMPT.md +78 -66
package/pipeline/02-verify/PROMPT.md +32 -76
package/pipeline/04-peer-review/PROMPT.md +48 -28
package/src/cli.js +0 -0
package/src/mcp-server.js +68 -159

package/package.json

@@ -1,6 +1,6 @@
 {
   "name": "tokens-for-good",
-  "version": "0.3.5",
+  "version": "0.3.7",
   "type": "module",
   "description": "Donate your spare AI tokens to research nonprofits for Fierce Philanthropy",
   "bin": {

package/pipeline/01-research/PROMPT.md

@@ -1,47 +1,29 @@
-# Step 1: Research — Claude Code Instructions
+# Research an Organization for Fierce Philanthropy
 ## Your Role
-You are a social impact research analyst working for Fierce Philanthropy. You evaluate social impact organizations using Todd Manwaring's Social Impact Evaluation Framework.
-You recognize that the best social impact organizations follow a repeated cycle of four items:
-1. **Theory of Change grounded in the social problem's negative consequences**
-   - Start from negative consequences, not activities or feel-good goals
-   - Build a causal chain from activities to short-term shifts to meaningful changes in negative consequences
-   - Make assumptions and risks explicit at each link
-2. **Intervention implementation that actually follows the model**
-   - Every major activity should map onto a specific link in the Theory of Change
-   - Ensure fidelity vs adaptation is thought through
-3. **Measurement focused on intermediate outcomes, ultimate outcomes, negative consequences, and counterfactuals**
-   - Measure how much you are reducing negative consequences, directly or through well-chosen proxies
-   - Intermediate outcomes: changes in behavior or action from earlier gains in knowledge, skills, or attitudes
-   - Ultimate outcomes: changes in condition or life status (reduced homelessness, improved health, economic stability)
-   - Counterfactual thinking: compare to what would have happened otherwise
-4. **Feedback loop: learning that actually changes the organization's efforts**
+You are a social impact research analyst for Fierce Philanthropy. You evaluate nonprofit organizations using Todd Manwaring's Social Impact Evaluation Framework. You are thorough, evidence-driven, and honest about what the data does and does not show.
 ## Instructions
 ### 1. Research the Organization
-Using **WebSearch** and **WebFetch** tools, thoroughly research the organization. Search for:
+Using web search and web fetch, thoroughly research:
-1. The organization's main website — read the homepage, about page, and impact/results pages
-2. Their impact/results/evidence pages — look for published data, annual reports, metrics
-3. Independent evaluations — search for RCTs, quasi-experimental studies, J-PAL, 3ie, Campbell Collaboration
-4. Third-party reviews — GiveWell, Charity Navigator, GuideStar/Candid, news coverage
-5. Financial data — ProPublica Nonprofit Explorer (search by EIN or org name), Form 990 data
+1. **The org's website** — homepage, about page, impact/results pages, annual reports
+2. **Impact evidence** — published data, metrics, program evaluations
+3. **Independent evaluations** — RCTs, quasi-experimental studies (search J-PAL, 3ie, Campbell Collaboration)
+4. **Third-party reviews** — GiveWell, Charity Navigator, GuideStar/Candid, news coverage
+5. **Financial data** — ProPublica Nonprofit Explorer (search by EIN or name), Form 990
 **Research rules:**
-- Only direct results from this organization and independent measurements of it
-- Only measured results with citations — every factual claim traces to a specific source
+- Only include DIRECT results from this organization or independent measurements of it
+- Only include measured results with citations. No anecdotes, no modeling, no evidence from other organizations.
+- Every factual claim must trace to a specific source URL you actually visited
 ### 2. Generate the Report
-Generate the COMPLETE report following this exact format and section order:
+Follow this exact structure:
 ---
@@ -57,49 +39,50 @@ Generate the COMPLETE report following this exact format and section order:
 #### PROMPT 1 — Organization and Social Problem Summary
-Identify:
 1. **Social Problem:** (less than 5 words)
 2. **Population:** (who is affected)
 3. **Location:** (where)
 #### PROMPT 2 — Top 20 Negative Consequences
-Create a table of the top 20 negative consequences of that social problem with that population in that location.
 | # | Negative Consequence |
 |---|----------------------|
+List the top 20 negative consequences of that social problem for that population in that location.
 #### PROMPT 3 — Intermediary vs Ultimate Outcome Classification
-Keep the same 20 items. Add a column classifying each as Intermediary or Ultimate Outcome.
-- **Intermediary:** changes in behavior/action from gains in knowledge, skills, attitudes
-- **Ultimate:** changes in condition or life status (reduced homelessness, improved health, economic stability)
+Keep all 20 items. Add a column classifying each as Intermediary or Ultimate Outcome.
+- **Intermediary:** changes in behavior or action from gains in knowledge, skills, or attitudes
+- **Ultimate:** changes in condition or life status (reduced poverty, improved health, economic stability)
 Sort by Intermediary first, then Ultimate.
 #### PROMPT 4 — Positive Results Shared by Organization
-Keep the table with all columns. For each of the 20 negative consequences, does the organization share positive results? Add a new column with DETAILED answers.
+Keep the table with all columns. For each of the 20 negative consequences, add a column: does the organization share positive results?
 - Start each cell with "Yes.", "Partial.", or "No direct results shared."
-- When Yes or Partial, provide SPECIFIC data: percentages, numbers, study names, sample sizes, time periods
-- Search the org's website, PDFs, reports, annual reports for evidence
-- Every data point must include an inline citation: `[Source Name](URL)`
+- When Yes or Partial: include SPECIFIC data (percentages, sample sizes, time periods, study names)
+- Only direct results from this organization, not from other orgs or modeling
+- **CITATION RULES (critical):** Every data point MUST have its own inline citation `[Source Name](URL)`. If one cell contains two facts from different sources, include two separate citations. Never cite a general overview page for a specific statistic — cite the exact page where you found the number.
 #### PROMPT 5 — Counterfactual Results
-Keep the table with ALL previous columns intact. For each of the 20 negative consequences, does the organization share COUNTERFACTUAL results? Add a new column with DETAILED answers.
+Keep the table with ALL previous columns. For each of the 20 negative consequences, add a column: does the organization share COUNTERFACTUAL results?
 - Start each cell with "Yes.", "Partial.", or "No counterfactual results."
-- When Yes or Partial, describe the study design (RCT, quasi-experimental, matched comparison), sample sizes, confidence intervals, and control/comparison group results
-- Counterfactual = comparison to what would have happened without the intervention
-- Every data point must include an inline citation: `[Source Name](URL)`
+- Describe study design (RCT, quasi-experimental, matched comparison), sample sizes, what the control/comparison group showed
+- Counterfactual = comparison to what would have happened without the intervention. Before/after alone does not count.
+- **Same citation rules as Prompt 4:** every data point gets its own inline citation to the specific page.
 #### SUMMARY REPORT
 **Section 1 — Our Recommendation**
-Write a recommendation paragraph (2-4 sentences), then include this EXACT scored checklist using [x] or [ ]. The score is out of 100 points.
+Write a recommendation (2-4 sentences): lead with stance, state strongest evidence, note caveats if any.
-Use exactly these 6 criteria (a-f) with these names and point values. Base score is out of 100, counterfactuals are extra credit (max 120).
+Then include this scored checklist. Base score is out of 100. Counterfactuals are extra credit (max 120).
 Base score (out of 100):
 - [x] or [ ] a. Has Ultimate Outcome Goals (50 pts)
@@ -111,47 +94,76 @@ Extra credit:
 - [x] or [ ] e. Measures Intermediate Counterfactual (10 pts)
 - [x] or [ ] f. Measures Ultimate Counterfactual (10 pts)
-**Score: [X]/100** (sum of checked items' point values — can exceed 100 with extra credit, max 120)
+**Score: [X]/100** (can exceed 100 with extra credit, max 120)
 **Section 2 — The Social Problem**
-Describe the social problem the organization is trying to solve. Include scale (how many affected, what geographies). Cite sources for prevalence data.
+Frame with specificity ("chronic malnutrition among children under 5 in rural sub-Saharan Africa", not just "poverty"). Include scale and cite prevalence data.
 **Section 3 — The Solution**
-Describe what the organization actually does, not their mission statement. Explain the theory of change: how does activity X lead to outcome Y? Be specific about the intervention.
+What the organization actually does (not their mission statement). Explain the theory of change: how does activity X lead to outcome Y? Be specific about the intervention.
 **Section 4 — Key Outputs**
-Search the website for key outputs (scale, reach, cost data). Use specific numbers when available. Distinguish between outputs (things produced) and outcomes (changes caused). These should NOT come from the earlier prompt tables.
+Measured activities and direct products with specific numbers. Distinguish outputs (things produced) from outcomes (changes caused).
 **Section 5 — Key Intermediate Outcomes**
-Summarize key intermediate outcomes. Focus on measurable short-to-medium term changes. Note whether data is self-reported or independently verified. Highlight any counterfactual information found.
+Measurable short-to-medium term changes. Note whether data is self-reported or independently verified. Include any counterfactual data found.
 **Section 6 — Key Ultimate Outcomes**
-Summarize key ultimate outcomes. Long-term impact evidence only. State directly if no data exists.
+Long-term impact evidence only. This section may be thin. Do not pad it. If no ultimate outcome data exists, say so in one sentence.
 **Section 7 — Continual Learning & Adaptation**
-Evidence that the organization learns from data and adapts its approach. Look for documented program changes based on evidence. "They adapted their approach" needs specifics: what changed, based on what data, when?
+Documented program changes based on evidence. "They adapted" needs specifics: what changed, based on what data, when?
 #### SOURCES
 List all cited sources with full URLs:
 1. [Source Name](Full URL) - Brief description of what was cited
-2. [Source Name](Full URL) - Brief description of what was cited
+2. ...
+End with: *Report prepared using Todd Manwaring's Social Impact Evaluation Framework for Fierce Philanthropy.*
+---
+### 3. Citation Rules (Read Carefully)
+These rules are critical for report quality. Poorly attributed citations are the #1 reason reports fail review.
+1. **One citation per fact.** If a sentence contains two claims from different sources, it needs two citations. Never bundle multiple facts under one link.
+2. **Cite the specific page, not a general overview.** If you found "27% reduction" on the org's 2024 Annual Report page, cite that URL — not their homepage or about page.
+3. **If you can't find a URL for a claim, don't include the claim.** No unsourced facts. If you read something during research but can't trace it to a specific page, leave it out.
-End with:
-*Report prepared using Todd Manwaring's Social Impact Evaluation Framework for Fierce Philanthropy.*
+4. **Verify before citing.** After writing a claim with a citation, confirm the cited page actually contains that information. If it doesn't, find the correct source or remove the claim.
-### Citation Format
+5. **Attribution matters.** Say "X reports that" when citing an org's own claims. Say "independent evaluation found" when citing third-party evidence. The distinction is load-bearing.
-Inline citations as `[Source Name](URL)`. Distinguish attribution: "X reports that" for org claims, "independent evaluation found" for third-party evidence.
+6. **Format:** `[Source Name](URL)` inline. The SOURCES section at the end must list every URL cited in the report.
-### 3. Submit the Report
+### 4. Before-Submission Quality Checks
-Submit using the `submit_report` tool with the full markdown as `report_markdown`. Include `estimated_tokens`: count your web searches (~1K each), web fetches (~2-5K each), your report output (~4 tokens per word), plus ~10K for system/tool overhead.
+Run these checks before submitting. They are not optional.
-## Quality Checks
+**Structure:**
 - [ ] All 5 prompt tables present and complete (20 rows each)
-- [ ] All 7 summary sections present
-- [ ] Every factual claim has an inline citation
-- [ ] SOURCES section lists all cited URLs
-- [ ] Score adds up correctly
-- [ ] Paragraphs under 4 sentences, no em dashes, no filler adjectives
+- [ ] All 7 summary sections present with substantive content
+- [ ] SOURCES section lists every URL cited inline
+- [ ] Scored checklist adds up correctly
+**Citations:**
+- [ ] Every factual claim has its own inline citation
+- [ ] Spot-check at least 5 citations: visit the URL and confirm the page says what you claim
+- [ ] For any citation where the page doesn't support your claim, find the correct source or remove the claim
+- [ ] No claims are cited to general overview pages when a specific report or data page exists
+**Writing style:**
+- [ ] No em dashes (—). Replace with periods, commas, or parentheses.
+- [ ] No filler adjectives: seamless, robust, comprehensive, innovative, cutting-edge, holistic, game-changing
+- [ ] No AI transitions: "It's worth noting", "Here's the thing", "Let's dive in", "Simply put"
+- [ ] Replace "leverage" with "use", "utilize" with "use"
+- [ ] Paragraphs under 4 sentences
+- [ ] No superlatives unless backed by comparative data
+### 5. Submit
+Submit using `submit_report` with the full markdown as `report_markdown`. Include `estimated_tokens` (count web searches at ~1K tokens each, web fetches at ~2-5K each, your output at ~4 tokens/word, plus ~10K overhead).
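The `estimated_tokens` arithmetic described in the new "Submit" step can be sketched as a small function. This is an illustration only: `estimateTokens`, its parameter names, and the 3,500-token midpoint used for the "~2-5K" fetch range are assumptions, not code from the package.

```javascript
// Hypothetical helper, not part of tokens-for-good.
// Rates come from the prompt text: ~1K per web search, ~2-5K per web fetch,
// ~4 tokens per word of report output, plus ~10K overhead.
function estimateTokens({ webSearches, webFetches, reportWords, tokensPerFetch = 3500 }) {
  const TOKENS_PER_SEARCH = 1000;
  const TOKENS_PER_WORD = 4;
  const OVERHEAD_TOKENS = 10000;

  return Math.round(
    webSearches * TOKENS_PER_SEARCH +
      webFetches * tokensPerFetch + // assumed midpoint of the 2-5K range
      reportWords * TOKENS_PER_WORD +
      OVERHEAD_TOKENS
  );
}

// Example: 12 searches, 20 fetches, a 3,000-word report.
const estimate = estimateTokens({ webSearches: 12, webFetches: 20, reportWords: 3000 });
console.log(estimate); // 12000 + 70000 + 12000 + 10000 = 104000
```

Passing a different `tokensPerFetch` (for example 2000 or 5000) gives the low and high ends of the range the prompt allows.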

package/pipeline/02-verify/PROMPT.md

@@ -1,113 +1,69 @@
-# Step 2: Verify — Claude Code Instructions
+# Verify Citations (Standalone Re-verification)
-## Inputs
-- **Org name:** `{{ORG_NAME}}`
-- **Research report:** The report from Step 1 (kept in memory from the previous step)
-- **Research guidance:** The same methodology from Step 1
-## Purpose
-Step 1 generated the research report. This step verifies it. You are a fact-checker, not a rewriter. Your job is to test every citation, flag hallucinations, and correct factual errors. Do not change tone, structure, or style.
+Use this methodology when re-verifying an existing report. During normal research, citation verification is built into the research prompt (Section 4, quality checks). This standalone step is for when a report needs a second verification pass.
 ## Instructions
 ### 1. Read the Report
-Read the full research report. Note every inline citation `[Source Name](URL)` and every factual claim (statistics, percentages, study references, program details).
+Read the full research report. Note every inline citation `[Source Name](URL)` and every factual claim.
 ### 2. Test Every Citation
-For each citation in the report, visit the URL using web fetch and verify:
+For each citation, visit the URL using web fetch and verify:
-- [ ] **URL loads** — Is it a real page (not 404, not a redirect to a homepage)?
-- [ ] **Content matches** — Does the source actually say what the report claims? Quote the relevant passage from the source.
-- [ ] **Data is accurate** — Do the numbers in the report match the numbers in the source?
+- **URL loads** — Is it a real page (not 404, not a redirect to a homepage)?
+- **Content matches** — Does the page actually say what the report claims? Quote the relevant passage.
+- **Data is accurate** — Do the numbers match?
-Record each citation check in a table:
+Record each check:
 | # | Citation | URL Status | Content Match | Notes |
 |---|----------|-----------|---------------|-------|
 Status values:
 - **VALID** — URL loads and content matches
-- **BROKEN** — 404, domain not found, or page doesn't load
-- **MISMATCH** — URL loads but doesn't support the claim made in the report
-- **PARTIAL** — URL loads, some claims match, some don't
-- **UNVERIFIABLE** — Paywalled, requires login, or content not accessible
+- **BROKEN** — 404 or page doesn't load
+- **MISMATCH** — URL loads but doesn't support the claim
+- **PARTIAL** — Some claims match, some don't
+- **UNVERIFIABLE** — Paywalled or content not accessible
-### 3. Check for Hallucinations
+### 3. Re-attribute Mismatches
-Search the web to verify claims that seem suspicious or unusually specific:
+For each MISMATCH or PARTIAL citation:
+1. Use web search to find the correct source for the claim
+2. If found: replace the citation URL with the correct one
+3. If not found anywhere: remove the claim from the report or add a caveat ("This claim could not be independently verified")
-- Statistics or percentages that don't appear in any source
-- Named studies, RCTs, or evaluations that can't be found
-- Program details (founding dates, staff names, locations) that contradict other sources
-- Claims about independent evaluations when none exist
+Do not leave misattributed citations in place.
-### 4. Flag Factual Issues
+### 4. Check for Hallucinations
-For each issue found, log it with severity:
+Search the web for claims that seem unusually specific:
+- Statistics that don't appear in any source
+- Named studies or RCTs that can't be found
+- Program details that contradict other sources
-- **[SEVERITY: HIGH]** — Wrong numbers, fabricated sources, broken citation URLs, claims contradicted by evidence
-- **[SEVERITY: MEDIUM]** — Misleading framing, outdated data, partially supported claims
-- **[SEVERITY: LOW]** — Minor inaccuracies, rounding differences, ambiguous wording
+### 5. Apply Corrections
-### 5. Write Corrections
-For each HIGH or MEDIUM issue, write the exact correction:
+For each issue:
 ```
 ### Correction [N]
 **Location:** [First ~10 words of the problematic passage]
 **Problem:** [What's wrong]
-**Original:** [Exact text to replace]
-**Corrected:** [Fixed text]
+**Fix:** [What was changed]
 ```
-### 6. Apply Corrections and Produce Output
-Apply all corrections to produce a verified version of the report. Keep the result in memory for the next pipeline step (Humanize).
+### 6. Output
-Start the output with a verification log:
+Write the corrected report with a verification summary at the top:
 ```markdown
-<!-- Verified: {{ORG_NAME}} | Date: [date] -->
-# Verification Log
-## Citation Check Results
-| # | Citation | URL Status | Content Match | Notes |
-|---|----------|-----------|---------------|-------|
-## Factual Issues Found
-- [List each issue with severity]
-## Corrections Applied
-- [List each correction made]
-## Summary
-- Total citations checked: X
+## Verification Summary
+- Citations checked: X
 - Valid: X | Broken: X | Mismatch: X | Partial: X
-- Factual issues: X (High: X, Medium: X, Low: X)
+- Claims removed (unsourced): X
+- Citations re-attributed: X
 - Corrections applied: X
-- Overall accuracy: HIGH / MEDIUM / LOW
----
-[Full verified report below]
 ```
-## Quality Checks
-Before writing the output:
-- [ ] Every citation URL was actually visited and checked
-- [ ] The citation table is complete (no citations skipped)
-- [ ] All HIGH and MEDIUM issues have written corrections
-- [ ] Corrections were applied to the report text
-- [ ] No new content was added (only corrections to existing content)
-- [ ] The verification log accurately reflects all checks performed
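The bookkeeping in the revised verify prompt (collect every inline `[Source Name](URL)` citation, give each a status, then emit the "Verification Summary") might look roughly like the sketch below. The function names, the shape of the `checks` array, and the regular expression are assumptions made for illustration. The package delegates this work to the AI following the prompt, and nothing in the diff shows it being done in code.

```javascript
// Hypothetical sketch, not part of tokens-for-good.
const STATUSES = ['VALID', 'BROKEN', 'MISMATCH', 'PARTIAL', 'UNVERIFIABLE'];

// Pull inline markdown citations out of a report.
// Assumption: URLs contain no whitespace or closing parentheses.
function extractCitations(markdown) {
  const pattern = /\[([^\]]+)\]\((https?:\/\/[^\s)]+)\)/g;
  return [...markdown.matchAll(pattern)].map(([, name, url]) => ({ name, url }));
}

// Tally per-citation statuses into the summary format shown in the prompt.
function buildVerificationSummary(checks, { claimsRemoved = 0, reattributed = 0, corrections = 0 } = {}) {
  const counts = new Map(STATUSES.map((status) => [status, 0]));
  for (const { status } of checks) {
    if (!counts.has(status)) throw new Error(`Unknown status: ${status}`);
    counts.set(status, counts.get(status) + 1);
  }
  return [
    '## Verification Summary',
    `- Citations checked: ${checks.length}`,
    `- Valid: ${counts.get('VALID')} | Broken: ${counts.get('BROKEN')} | Mismatch: ${counts.get('MISMATCH')} | Partial: ${counts.get('PARTIAL')}`,
    `- Claims removed (unsourced): ${claimsRemoved}`,
    `- Citations re-attributed: ${reattributed}`,
    `- Corrections applied: ${corrections}`,
  ].join('\n');
}

const sample = 'Yes. 27% reduction [Annual Report 2024](https://example.org/reports/2024), confirmed by [J-PAL study](https://example.org/jpal).';
const citations = extractCitations(sample);
const summary = buildVerificationSummary(
  [
    { url: citations[0].url, status: 'VALID' },
    { url: citations[1].url, status: 'MISMATCH' },
  ],
  { reattributed: 1, corrections: 1 }
);
console.log(summary);
```

The summary template in the prompt has no slot for UNVERIFIABLE, so the sketch counts that status but does not print it.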

package/pipeline/04-peer-review/PROMPT.md

@@ -1,67 +1,86 @@
-# Step 4: Peer Review -- Claude Code Instructions
-## Inputs
-- **Report to review:** Provided by the `get_peer_review` MCP tool
-- **Research guidance:** The same methodology from step 1
-- **Writing style guide:** The same decontamination rules from step 3
+# Peer Review — Instructions
 ## Purpose
-You are reviewing another contributor's research report. Your job is to verify quality and catch problems before a human reviewer sees it. You are NOT the original researcher -- you are a second pair of eyes.
+You are reviewing another contributor's research report. Your job is to verify quality and catch problems before a human reviewer sees it. You are NOT the original researcher — you are a second pair of eyes.
 ## Instructions
-### 1. Read the Full Report
+### 1. Check the Automated Fact-Check Results First
+If automated fact-check results are included above the report, read them before diving into the report itself. Focus on:
+- **Red flags** — these are specific problems the automated system detected (unsupported claims, dead links, self-reported data issues)
+- **Fact support rate** — below 70% means many claims aren't backed by their cited sources
+- **Avg trust score** — below 50% means citations are low-quality (self-reported, blog posts, dead links)
+Use these results to target your spot-checks. If the automated system flagged specific unsupported claims, verify those first.
+### 2. Read the Full Report
-Read the entire report carefully. Note the org name, the scored checklist, and the overall recommendation.
+Read the entire report. Note the org name, the scored checklist, and the overall recommendation.
-### 2. Spot-Check Citations (3-5)
+### 3. Spot-Check Citations (3-5)
-Pick 3-5 citation URLs from the report. For each:
+Pick 3-5 citation URLs from the report (prioritize any flagged by the automated fact-check). For each:
 - Visit the URL using web fetch
 - Verify the page exists (not 404)
 - Check that the source says what the report claims
+- If a citation is wrong, search for the correct source. If the claim can't be sourced anywhere, remove it.
-### 3. Check Report Structure
+### 4. Check Report Structure
 Verify:
-- [ ] All 5 prompt sections present (PROMPT 1-5)
+- [ ] All 5 prompt sections present (PROMPT 1-5) with 20 rows each
 - [ ] All 7 summary sections present (Sections 1-7)
 - [ ] SOURCES section exists with citations
-- [ ] Tables in Prompts 2-5 have content
-- [ ] Scored checklist is present with score calculated correctly
+- [ ] Every factual claim has its own inline citation `[Source Name](URL)`
+- [ ] No claims cited to general overview pages when a specific report or data page exists
+### 5. Evaluate Scoring
+The scored checklist uses these weights. Verify the math and the evidence:
+Base score (out of 100):
+- a. Has Ultimate Outcome Goals (50 pts)
+- b. Measures Intermediate Outcomes (10 pts)
+- c. Measures Ultimate Outcomes (15 pts)
+- d. Shows Continual Learning & Adaptation (25 pts)
+Extra credit:
+- e. Measures Intermediate Counterfactual (10 pts)
+- f. Measures Ultimate Counterfactual (10 pts)
-### 4. Evaluate Scoring
+**Score: X/100** (can exceed 100 with extra credit, max 120)
-Compare the checklist against the evidence:
+Check:
 - Are checked items supported by evidence in the report?
 - Are unchecked items correctly unchecked (no evidence was found)?
-- Does the score math add up (checked items x weights = stated score)?
+- Does the score math add up?
-### 5. Look for Red Flags
+### 6. Look for Red Flags
 - Suspiciously specific numbers with no citation
 - Studies or evaluations that seem fabricated
 - Copy-pasted content or generic filler
 - Sections that are empty or trivially short
 - Claims that contradict other parts of the report
+- Em dashes, filler adjectives (robust, comprehensive, innovative), AI transitions
-### 6. Assign a Score
+### 7. Assign a Score
 | Score | When to use |
 |-------|------------|
-| **4 -- Great** | Report is thorough, citations check out, scoring is correct. No changes needed. |
-| **3 -- Good with fixes** | Minor issues you can fix: broken citation, wrong score math, awkward phrasing, a checklist item that should be toggled. **Fix the issues yourself** and submit the corrected report. |
-| **2 -- Needs redo** | Major problems: thin evidence across multiple sections, significant hallucinations, missing sections, fundamentally wrong scoring. Not fixable with minor edits. |
-| **1 -- Bad actor** | Garbage: copy-pasted nonsense, completely fabricated data, obvious gaming attempt. This flags the original author. Use sparingly and only when clearly warranted. |
+| **4 — Great** | Report is thorough, citations check out, scoring is correct. No changes needed. |
+| **3 — Good with fixes** | Minor issues you can fix: broken citation, wrong score math, awkward phrasing, a checklist item that should be toggled, misattributed citation. **Fix the issues yourself** and submit the corrected report. |
+| **2 — Needs redo** | Major problems: thin evidence across multiple sections, significant hallucinations, missing sections, fundamentally wrong scoring. Not fixable with minor edits. |
+| **1 — Bad actor** | Garbage: copy-pasted nonsense, completely fabricated data, obvious gaming attempt. This flags the original author. Use sparingly and only when clearly warranted. |
-### 7. Submit Your Review
+### 8. Submit Your Review
 Use `submit_peer_review` with:
-- `claim_id`: The claim ID from `get_peer_review`
+- `claim_id`: The claim ID shown above
 - `score`: Your score (1-4)
-- `notes`: Brief explanation of your score
+- `notes`: Brief explanation of your score. Mention which citations you checked and what you found.
 - `updated_report`: If score is 3, include the full fixed report
 ## Important Rules
@@ -71,3 +90,4 @@ Use `submit_peer_review` with:
 - Score 1 is for abuse. If you're unsure, use 2 instead.
 - If you spot-check a citation and it's broken, that alone is a 3 (fix it), not a 2.
 - Don't rewrite the report to match your style. Fix factual errors, not opinions.
+- If the automated fact-check flagged issues, verify them. If the flags are correct, fix the citations (score 3) or flag the report (score 2) depending on severity.
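The "does the score math add up?" check in the peer-review prompt can be sketched using the weights listed under "Evaluate Scoring". The point values and labels come from the diff above. The `CRITERIA` map, `scoreChecklist`, and `scoreMatchesStated` are hypothetical names introduced for this illustration.

```javascript
// Hypothetical sketch, not part of tokens-for-good.
// Weights as listed in the peer-review prompt: a-d form the base score (100),
// e-f are extra credit (up to 20 more, for a maximum of 120).
const CRITERIA = new Map([
  ['a', { label: 'Has Ultimate Outcome Goals', points: 50, extraCredit: false }],
  ['b', { label: 'Measures Intermediate Outcomes', points: 10, extraCredit: false }],
  ['c', { label: 'Measures Ultimate Outcomes', points: 15, extraCredit: false }],
  ['d', { label: 'Shows Continual Learning & Adaptation', points: 25, extraCredit: false }],
  ['e', { label: 'Measures Intermediate Counterfactual', points: 10, extraCredit: true }],
  ['f', { label: 'Measures Ultimate Counterfactual', points: 10, extraCredit: true }],
]);

// Sum the point values of the checked criteria (duplicates are counted once).
function scoreChecklist(checkedIds) {
  let base = 0;
  let extra = 0;
  for (const id of new Set(checkedIds)) {
    const criterion = CRITERIA.get(id);
    if (!criterion) throw new Error(`Unknown criterion: ${id}`);
    if (criterion.extraCredit) extra += criterion.points;
    else base += criterion.points;
  }
  return { base, extra, total: base + extra };
}

// True when the score stated in a report equals the sum of its checked items.
function scoreMatchesStated(checkedIds, statedScore) {
  return scoreChecklist(checkedIds).total === statedScore;
}

console.log(scoreChecklist(['a', 'b', 'd']));      // { base: 85, extra: 0, total: 85 }
console.log(scoreMatchesStated(['a', 'c'], 75));   // false: a + c is 65
```

A reviewer-side check like this only confirms the arithmetic. Whether each checked item is supported by evidence still has to be judged by reading the report.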

package/src/cli.js

File without changes

package/src/mcp-server.js

@@ -4,7 +4,7 @@ import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
 import { z } from 'zod';
 import { ApiClient } from './api-client.js';
 import { detectPlatform, isSchedulable, getAutomationInstructions } from './platform.js';
-import { loadState, updateState, isSnoozed, hasContributedToday, markContributed, snoozeDays } from './state.js';
+import { loadState, updateState, isSnoozed, snoozeDays, hasContributedToday, markContributed } from './state.js';
 import { readFileSync } from 'fs';
 import { join, dirname } from 'path';
 import { fileURLToPath } from 'url';
@@ -26,7 +26,7 @@ updateState({ platform });
 const server = new McpServer({
   name: 'tokens-for-good',
-  version: '0.3.5',
+  version: '0.1.0',
 });
 // --- No-key onboarding message ---
@@ -41,9 +41,7 @@ Walk them through setup:
 3. **Add the key to their MCP config:** Update their tokens-for-good MCP configuration to include the key as an environment variable:
-For Claude Code (create \`.mcp.json\` in your project root or home directory):
-Mac/Linux:
+For Claude Code (.mcp.json or settings.json):
 \`\`\`json
 {
   "mcpServers": {
@@ -56,19 +54,6 @@ Mac/Linux:
 }
 \`\`\`
-Windows:
-\`\`\`json
-{
-  "mcpServers": {
-    "tokens-for-good": {
-      "command": "cmd",
-      "args": ["/c", "npx", "-y", "tokens-for-good", "--mcp"],
-      "env": { "TFG_API_KEY": "tfg_live_their_key_here" }
-    }
-  }
-}
-\`\`\`
 For Opencode (opencode.json):
 \`\`\`json
 {
@@ -82,7 +67,7 @@ For Opencode (opencode.json):
 }
 \`\`\`
-For Cursor (\`.cursor/mcp.json\` in your project root):
+For Cursor (.cursor/mcp.json):
 \`\`\`json
 {
   "mcpServers": {
@@ -95,13 +80,9 @@ For Cursor (\`.cursor/mcp.json\` in your project root):
 }
 \`\`\`
-**Important:** Do NOT put MCP config in \`~/.claude/settings.json\` — Claude Code ignores MCP servers there. The \`.mcp.json\` file must be in your project root or home directory.
+4. **Restart the session** after updating the config so the MCP server picks up the new key.
-4. **Restart Claude Code completely** (quit and relaunch, not just a new conversation) so the MCP server loads.
-5. **Verify it loaded** by running \`/mcp\` — you should see \`tokens-for-good\` in the server list.
-6. **Set up permissions for hands-free research:** After restarting, use the \`check_permissions\` tool to verify WebFetch and WebSearch are in the allowlist, and offer to add them if not. Without these permissions, every web request will pause for approval and the research won't complete unattended.
+5. **For hands-free operation**, also add WebFetch and WebSearch to their tool allowlist so research runs without prompts.
 Once set up, they can say "Research an org for Fierce Philanthropy" and the AI does the rest. Each org takes ~5 minutes and costs ~$0.20 in tokens.
@@ -125,10 +106,11 @@ How it works:
 5. Another contributor's AI peer-reviews your report
 6. A human reviewer finalizes it for the directory
-Research pipeline (3 steps per org, all done by your AI):
-- Step 1: Research -- web search, 6-prompt methodology, scored checklist (100 pts)
-- Step 2: Verify -- check every citation URL, flag hallucinations, correct errors
-- Step 3: Humanize -- 9-pass AI decontamination (remove em dashes, filler adjectives, vary rhythm, inject analyst voice)
+Research pipeline (per org, all done by your AI):
+- Research the org using web search + web fetch, following the 6-prompt methodology
+- Score using a weighted checklist (100 pts base, 120 max with extra credit)
+- Verify citations by visiting each URL before submitting
+- Clean up writing style (no AI tells, no filler adjectives, no em dashes)
 Contributor tiers:
 - New: first 5 orgs, easy orgs only
@@ -144,21 +126,7 @@ Cost: ~$0.15-0.25 per org in tokens. Scale: 750K+ US nonprofits to research.`,
 // --- Tools ---
-server.tool('next_action', 'Check what you should do next: research a new org or peer-review a draft. Call this before claim_org to maintain the 1:2 research-to-review ratio.', {}, async () => {
-  if (!client) return { content: [{ type: 'text', text: 'Error: TFG_API_KEY not set.' }] };
-  try {
-    const result = await client.getNextAction();
-    if (result.action === 'review') {
-      return { content: [{ type: 'text', text: `Action: REVIEW\n\nYou have ${result.research_count} research submissions and ${result.review_count} peer reviews. Target ratio is 1:2 (research:review). Use get_peer_review to pick up a draft to review.` }] };
-    }
-    return { content: [{ type: 'text', text: `Action: RESEARCH\n\nYou have ${result.research_count} research submissions and ${result.review_count} peer reviews. You're clear to claim a new org. Use claim_org to get started.` }] };
-  } catch (err) {
-    return { content: [{ type: 'text', text: `Error: ${err.message}` }] };
-  }
-});
-server.tool('claim_org', 'Claim the next available nonprofit org to research. Call next_action first to check if you should review instead.', {
+server.tool('claim_org', 'Claim the next available nonprofit org to research. Blocked if you have a pending peer review.', {
   platform: z.string().optional().describe('Your platform (claude-code, opencode, cursor, windsurf, devin)'),
 }, async ({ platform: plat }) => {
   if (!client) return { content: [{ type: 'text', text: 'Error: TFG_API_KEY not set. Get your key at https://fierce-philanthropy-directory.laravel.cloud/contribute' }] };
@@ -166,7 +134,7 @@ server.tool('claim_org', 'Claim the next available nonprofit org to research. Ca
   try {
     const result = await client.claimOrg(plat || platform);
     return {
-      content: [{ type: 'text', text: `Claimed: ${result.org.name}\nURL: ${result.org.url}\nDescription: ${result.org.description || 'N/A'}\nSource: ${result.org.source || 'N/A'}\nClaim ID: ${result.claim_id}\nExpires: ${result.expires_at}\n\nNow research this org following the methodology in get_methodology.` }],
+      content: [{ type: 'text', text: `Claimed: ${result.org.name}\nURL: ${result.org.url}\nDescription: ${result.org.description || 'N/A'}\nSource: ${result.org.source || 'N/A'}\nClaim ID: ${result.claim_id}\nExpires: ${result.expires_at}\n\nNext steps:\n1. Call get_methodology with step="research" to get the full research instructions\n2. Follow the methodology to research this org using WebSearch and WebFetch\n3. The methodology includes citation verification and writing quality checks — complete them before submitting\n4. Submit with submit_report when done` }],
     };
   } catch (err) {
     return { content: [{ type: 'text', text: `Error: ${err.message}` }] };
@@ -191,33 +159,20 @@ server.tool('get_methodology', 'Get the full research methodology, verification
   }
 });
-server.tool('submit_report', 'Submit a completed research report for an org you claimed. You MUST include estimated_tokens — count your web searches (each ~1K tokens), web fetches (each ~2-5K tokens), and your output (~4 tokens per word of report). Add it all up.', {
+server.tool('submit_report', 'Submit a completed research report for an org you claimed. You MUST include estimated_tokens.', {
   claim_id: z.string().describe('The claim ID from claim_org'),
   report_markdown: z.string().describe('The full research report in markdown'),
+  estimated_tokens: z.number().describe('Estimated total tokens used: count web searches (~1K each), web fetches (~2-5K each), report output (~4 tokens/word), plus ~10K overhead'),
   model_used: z.string().optional().describe('The model that generated this report'),
-  estimated_tokens: z.number().describe('Estimated total tokens: count web searches (~1K each), web fetches (~2-5K each), your report output (~4 tokens/word), plus ~10K for system prompts and tool calls'),
-}, async ({ claim_id, report_markdown, model_used, estimated_tokens }) => {
+}, async ({ claim_id, report_markdown, estimated_tokens, model_used }) => {
   if (!client) return { content: [{ type: 'text', text: 'Error: TFG_API_KEY not set.' }] };
-  const tokenUsage = estimated_tokens ? { total_tokens: estimated_tokens } : null;
   try {
-    const result = await client.submitReport(claim_id, report_markdown, tokenUsage, null, model_used);
+    const result = await client.submitReport(claim_id, report_markdown, null, null, model_used);
     markContributed();
-    const state = loadState();
-    const stats = result.contributor_stats;
-    let message = `Report submitted for ${result.org_name}!\n\nYour stats:\n- Total orgs: ${stats.total_orgs}\n- Tier: ${stats.tier}\n- Orgs remaining: ${result.orgs_remaining}\n\nYour report will now go through peer review. Thank you for contributing!`;
-    // Nudge to set up automation if they haven't already
-    if (!state.auto_schedule) {
-      if (isSchedulable(platform)) {
-        message += `\n\n---\n\nWant to make this automatic? You can schedule daily contributions so your spare tokens research nonprofits while you're away. Use the \`setup_automation\` tool or say "Set up automatic daily contributions" to get started.`;
-      } else {
-        message += `\n\n---\n\nWant to contribute regularly? You can set up a system cron to research an org automatically each day. Use the \`setup_automation\` tool to get instructions for your platform.`;
-      }
-    }
-    return { content: [{ type: 'text', text: message }] };
+    return {
+      content: [{ type: 'text', text: `Report submitted for ${result.org_name}!\n\nYour stats:\n- Total orgs: ${result.contributor_stats.total_orgs}\n- Tier: ${result.contributor_stats.tier}\n- Orgs remaining: ${result.orgs_remaining}\n\nYour report will now go through peer review. Thank you for contributing!` }],
+    };
   } catch (err) {
     return { content: [{ type: 'text', text: `Submit error: ${err.message}${err.data?.validation_errors ? '\n' + err.data.validation_errors.join('\n') : ''}` }] };
   }
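The `estimated_tokens` description in the hunk above gives a rough formula: about 1K tokens per web search, 2-5K per web fetch, about 4 tokens per word of report output, plus about 10K of overhead. A minimal sketch of that arithmetic follows. The helper `estimateTokens`, its parameter names, and the 3.5K midpoint used for fetches are assumptions for illustration and are not part of the package.

```javascript
// Hypothetical helper, not part of tokens-for-good.
// Applies the rule of thumb from the estimated_tokens description.
function estimateTokens({ searches, fetches, reportWords, tokensPerFetch = 3500 }) {
  const TOKENS_PER_SEARCH = 1000; // ~1K per web search
  const TOKENS_PER_WORD = 4;      // ~4 tokens per word of report output (figure from the description)
  const OVERHEAD = 10000;         // ~10K overhead
  return (
    searches * TOKENS_PER_SEARCH +
    fetches * tokensPerFetch +    // assumed midpoint of the 2-5K range
    reportWords * TOKENS_PER_WORD +
    OVERHEAD
  );
}

// Example run with made-up counts: 8 searches, 10 fetches, a 1500-word report.
const estimate = estimateTokens({ searches: 8, fetches: 10, reportWords: 1500 });
console.log(estimate); // 8000 + 35000 + 6000 + 10000 = 59000
```

Note that in the 0.3.7 side of this hunk, `estimated_tokens` is still a required parameter but `client.submitReport` is called with `null` in the position where 0.3.5 passed `tokenUsage`.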
@@ -228,8 +183,33 @@ server.tool('get_peer_review', 'Get a draft report assigned to you for peer revi
   try {
     const result = await client.getNextPeerReview();
+    let peerMethodology = '';
+    try {
+      peerMethodology = readFileSync(join(PIPELINE_DIR, '04-peer-review/PROMPT.md'), 'utf-8');
+    } catch {
+      peerMethodology = 'Score 1-4: 4=Great, 3=Good with fixes (submit corrected version), 2=Needs redo, 1=Bad actor.';
+    }
+    let factCheckNote = '';
+    if (result.automated_review?.summary) {
+      const s = result.automated_review.summary;
+      const lines = [
+        `\n\n## Automated Fact-Check Results`,
+        `Quality: ${s.overall_quality} | Fact support: ${Math.round(s.fact_support_rate * 100)}% | Avg trust: ${Math.round(s.avg_trust_score * 100)}%`,
+        `Facts checked: ${result.automated_review.facts_checked}/${result.automated_review.facts_extracted} | Citations rated: ${result.automated_review.citations_rated}`,
+      ];
+      if (s.red_flags?.length > 0) {
+        lines.push(`\nRed flags:\n${s.red_flags.map(f => `  - ${f}`).join('\n')}`);
+      }
+      if (s.strengths?.length > 0) {
+        lines.push(`\nStrengths:\n${s.strengths.map(f => `  - ${f}`).join('\n')}`);
+      }
+      lines.push(`\nUse these results to focus your spot-checks on flagged areas.`);
+      factCheckNote = lines.join('\n');
+    } else if (result.automated_review) {
+      factCheckNote = `\n\nAutomated Fact-Check: ${result.automated_review.status} (no summary available yet)`;
+    }
     return {
-      content: [{ type: 'text', text: `Peer review assigned:\nOrg: ${result.org.name}\nAuthor: @${result.author}\nClaim ID: ${result.claim_id}\n\n---\n\n${result.report_markdown}\n\n---\n\nReview this report. Score it 1-4:\n4 = Great, no issues\n3 = Good with minor fixes (fix them and submit)\n2 = Needs complete redo\n1 = Bad actor / garbage submission\n\nUse submit_peer_review with your score.` }],
+      content: [{ type: 'text', text: `Peer review assigned:\nOrg: ${result.org.name}\nAuthor: ${result.author}\nClaim ID: ${result.claim_id}${factCheckNote}\n\n---\n\n${peerMethodology}\n\n---\n\n${result.report_markdown}\n\n---\n\nUse submit_peer_review with your score and notes.` }],
     };
   } catch (err) {
     if (err.status === 404) {
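The `factCheckNote` logic added in this hunk can be exercised in isolation. The sketch below mirrors the branches shown in the diff as a standalone function. The name `buildFactCheckNote` and all sample values are hypothetical; the field names are taken from the diff, but the real shape of the API response is not confirmed here.

```javascript
// Hypothetical standalone version of the factCheckNote construction.
function buildFactCheckNote(review) {
  if (!review) return '';
  const s = review.summary;
  if (!s) {
    return `\n\nAutomated Fact-Check: ${review.status} (no summary available yet)`;
  }
  const lines = [
    `\n\n## Automated Fact-Check Results`,
    `Quality: ${s.overall_quality} | Fact support: ${Math.round(s.fact_support_rate * 100)}% | Avg trust: ${Math.round(s.avg_trust_score * 100)}%`,
    `Facts checked: ${review.facts_checked}/${review.facts_extracted} | Citations rated: ${review.citations_rated}`,
  ];
  // Optional sections are emitted only when the arrays are non-empty.
  if (s.red_flags?.length > 0) {
    lines.push(`\nRed flags:\n${s.red_flags.map(f => `  - ${f}`).join('\n')}`);
  }
  if (s.strengths?.length > 0) {
    lines.push(`\nStrengths:\n${s.strengths.map(f => `  - ${f}`).join('\n')}`);
  }
  lines.push(`\nUse these results to focus your spot-checks on flagged areas.`);
  return lines.join('\n');
}

// Made-up sample payload for illustration only.
const sampleNote = buildFactCheckNote({
  status: 'complete',
  facts_checked: 14,
  facts_extracted: 16,
  citations_rated: 9,
  summary: {
    overall_quality: 'good',
    fact_support_rate: 0.875,
    avg_trust_score: 0.72,
    red_flags: ['Revenue figure not found at cited URL'],
    strengths: [],
  },
});
console.log(sampleNote);
```

With this sample the note includes a "Red flags" section and omits "Strengths", because the latter array is empty.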
@@ -279,33 +259,10 @@ server.tool('my_impact', 'See your personal contribution stats, tier, and histor
   try {
     const result = await client.getImpact();
     const c = result.contributor;
-    const tokenStr = c.total_tokens > 0 ? `${(c.total_tokens / 1000).toFixed(0)}K tokens contributed` : 'No token data yet';
+    const estimatedCost = (c.total_tokens / 1_000_000 * 3).toFixed(2);
     return {
-      content: [{ type: 'text', text: `Your Impact (@${c.github_handle}):\n\nTier: ${c.tier}\nOrgs researched: ${c.total_orgs}\nTokens: ${tokenStr}\nAcceptance rate: ${c.acceptance_rate}%\nAutomation: ${c.has_schedule ? 'Active' : 'Not set up'}\n\nRecent:\n${result.claims?.slice(0, 5).map(cl => `  ${cl.organization?.name || 'Unknown'} - ${cl.status}`).join('\n') || 'None'}` }],
-    };
-  } catch (err) {
-    return { content: [{ type: 'text', text: `Error: ${err.message}` }] };
-  }
-});
-server.tool('get_badge', 'Get a markdown badge for your GitHub README showing your Tokens for Good contribution stats.', {}, async () => {
-  if (!client) return { content: [{ type: 'text', text: 'Error: TFG_API_KEY not set.' }] };
-  try {
-    const result = await client.getImpact();
-    const c = result.contributor;
-    const tier = c.tier || 'new';
-    const orgCount = c.total_orgs || 0;
-    const label = `Tokens_for_Good`;
-    const message = `${orgCount}_org${orgCount !== 1 ? 's' : ''}_researched`;
-    const color = tier === 'gold' ? 'FFD700' : tier === 'silver' ? 'C0C0C0' : tier === 'bronze' ? 'CD7F32' : '54BC4B';
-    const badgeUrl = `https://img.shields.io/badge/${label}-${message}-${color}?style=flat`;
-    const linkUrl = `https://fierce-philanthropy-directory.laravel.cloud/contribute`;
-    const markdown = `[![Tokens for Good](${badgeUrl})](${linkUrl})`;
-    return {
-      content: [{ type: 'text', text: `Add this badge to your GitHub README:\n\n\`\`\`markdown\n${markdown}\n\`\`\`\n\nPreview: ${markdown}\n\nTier: ${tier} | Orgs: ${orgCount}` }],
+      content: [{ type: 'text', text: `Your Impact (@${c.github_handle}):\n\nTier: ${c.tier}\nOrgs researched: ${c.total_orgs}\nEstimated donation: ~$${estimatedCost}\nAcceptance rate: ${c.acceptance_rate}%\nAutomation: ${c.has_schedule ? 'Active' : 'Not set up'}\n\nRecent:\n${result.claims?.slice(0, 5).map(cl => `  ${cl.organization?.name || 'Unknown'} - ${cl.status}`).join('\n') || 'None'}` }],
     };
   } catch (err) {
     return { content: [{ type: 'text', text: `Error: ${err.message}` }] };
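The new `estimatedCost` line converts a token count to dollars at a flat rate of 3 per million tokens; the diff does not say which model's pricing that constant reflects. A minimal sketch of the same arithmetic follows. The function name `estimateDonationUsd` and the guard for a missing token count are assumptions added for illustration, since `undefined / 1_000_000 * 3` evaluates to `NaN` and `toFixed` would then yield the string "NaN".

```javascript
// Hypothetical helper mirroring the estimatedCost arithmetic from the diff.
function estimateDonationUsd(totalTokens, usdPerMillion = 3) {
  const tokens = Number(totalTokens) || 0; // assumed guard: treat missing data as 0
  return (tokens / 1_000_000 * usdPerMillion).toFixed(2);
}

console.log(estimateDonationUsd(59000));     // "0.18"
console.log(estimateDonationUsd(1_500_000)); // "4.50"
console.log(estimateDonationUsd(undefined)); // "0.00"
```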
@@ -316,71 +273,13 @@ server.tool('setup_guide', 'Get setup instructions for Tokens for Good. Use this
   return { content: [{ type: 'text', text: NO_KEY_INSTRUCTIONS }] };
 });
-server.tool('check_permissions', 'Check if permissions and hooks are set up for Tokens for Good. Call this after setup or before first research run.', {}, async () => {
-  return {
-    content: [{ type: 'text', text: `Set up Tokens for Good permissions and session hook. Read ~/.claude/settings.json and ensure it has:
-1. **Permissions** — these entries in "permissions.allow" (merge with existing, keep other entries):
-   - "WebFetch"
-   - "WebSearch"
-   - "mcp__tokens-for-good__*"
-2. **SessionStart hook** — add this to "hooks.SessionStart" (merge with existing hooks). It checks if the user has set up scheduled contributions and nudges them if not:
-{
-  "hooks": {
-    "SessionStart": [
-      {
-        "matcher": "",
-        "hooks": [
-          {
-            "type": "command",
-            "command": "result=$(curl -sf -H \\"X-TFG-Api-Key: $TFG_API_KEY\\" \\"https://fierce-philanthropy-directory.laravel.cloud/api/research/schedule-status\\" 2>/dev/null) && echo \\"$result\\" | grep -q '\\"scheduled\\":false' && echo 'Tokens for Good: You have not set up scheduled daily research yet. Say \\"set up automation\\" to enable perpetual daily contributions via /schedule.' || true"
-          }
-        ]
-      }
-    ]
-  }
-}
-The hook reads TFG_API_KEY from the environment (set in .mcp.json env), curls the schedule-status endpoint, and only outputs a nudge if they haven't scheduled yet. Once they schedule, the hook goes silent.
-3. **Skill** — create the file ~/.claude/skills/tfg/SKILL.md with this content:
----
-name: tfg
-description: Do charitable research for Fierce Philanthropy via Tokens for Good
----
-Do charitable research for Fierce Philanthropy via Tokens for Good. Call next_action to check whether to research or review. Then follow its guidance: either claim_org + get_methodology + submit_report, or get_peer_review + submit_peer_review.
----
-This lets the user type /tfg to start a research session.
-After setting up everything, tell the user: "Done! Permissions, schedule reminder, and /tfg command are configured."` }],
-  };
-});
-server.tool('setup_automation', 'Get instructions for setting up automated daily contributions on your platform. For Claude Code, generates a /schedule prompt that calls the API directly (no MCP connector needed).', {
+server.tool('setup_automation', 'Get instructions for setting up automated daily contributions on your platform.', {
   frequency: z.enum(['hourly', 'daily', 'weekly']).optional().describe('How often to contribute'),
 }, async ({ frequency }) => {
-  // Mark as auto-scheduled locally and on server
-  updateState({ auto_schedule: true });
-  if (client) {
-    try { await client.enableSchedule(); } catch { /* best effort */ }
-  }
-  const instructions = getAutomationInstructions(platform, frequency || 'daily', apiKey);
+  const instructions = getAutomationInstructions(platform, frequency || 'daily');
   return { content: [{ type: 'text', text: instructions }] };
 });
-server.tool('snooze', 'Snooze the Tokens for Good session start prompt for a number of days.', {
-  days: z.number().min(1).max(30).describe('Number of days to snooze (1 = ask tomorrow, 7 = ask in a week)'),
-}, async ({ days }) => {
-  snoozeDays(days);
-  return { content: [{ type: 'text', text: `Got it! Tokens for Good will stay quiet for ${days} day${days !== 1 ? 's' : ''}.` }] };
-});
 // --- Prompts (session start) ---
 server.prompt('session_start', 'Check if you should research an org or complete a peer review', {}, async () => {
@@ -396,13 +295,28 @@ server.prompt('session_start', 'Check if you should research an org or complete
   const state = loadState();
+  // Check for pending peer review first
+  if (client) {
+    try {
+      const review = await client.getNextPeerReview();
+      return {
+        messages: [{
+          role: 'user',
+          content: { type: 'text', text: `You have a pending peer review to complete before you can claim a new org. Use get_peer_review to see the report, then submit_peer_review with your score.` },
+        }],
+      };
+    } catch {
+      // No pending review, continue
+    }
+  }
   if (isSnoozed()) {
     return { messages: [{ role: 'user', content: { type: 'text', text: 'Tokens for Good is snoozed. No action needed.' } }] };
   }
   if (state.auto_schedule) {
     try {
-      const impact = await client.getImpact();
+      const impact = await client?.getImpact();
       const c = impact?.contributor;
       return {
         messages: [{
@@ -419,24 +333,19 @@ server.prompt('session_start', 'Check if you should research an org or complete
     return { messages: [{ role: 'user', content: { type: 'text', text: 'Tokens for Good: You already contributed today. Nice work!' } }] };
   }
-  // Check permissions before first research
-  const permNote = state.total_session_contributions === 0
-    ? `\n\n**First time?** Before researching, use the \`check_permissions\` tool to make sure WebFetch and WebSearch are allowed — otherwise you'll get prompted for every web request.`
-    : '';
   // Show the session start prompt
   if (isSchedulable(platform)) {
     return {
       messages: [{
         role: 'user',
-        content: { type: 'text', text: `Tokens for Good: Would you like to donate your spare tokens to research a nonprofit today?\n\n1. Set up automatic daily contributions (recommended)\n2. Just run one now\n3. Ask me tomorrow\n4. Ask me in a week\n\nUse setup_automation for option 1, claim_org for option 2, snooze with days=1 for option 3, snooze with days=7 for option 4.${permNote}` },
+        content: { type: 'text', text: `Tokens for Good: Would you like to donate your spare tokens to research a nonprofit today?\n\n1. Set up automatic daily contributions (recommended)\n2. Just run one now\n3. Ask me tomorrow\n4. Ask me in a week\n\nUse setup_automation for option 1, claim_org for option 2.` },
       }],
     };
   } else {
     return {
       messages: [{
         role: 'user',
-        content: { type: 'text', text: `Tokens for Good: Would you like to research a nonprofit org today? It takes about 5 minutes.\n\n1. Research an org now\n2. Ask me tomorrow\n3. Ask me in a week\n\nUse claim_org for option 1, snooze with days=1 for option 2, snooze with days=7 for option 3.${permNote}` },
+        content: { type: 'text', text: `Tokens for Good: Would you like to research a nonprofit org today? It takes about 5 minutes and costs ~$0.20 in tokens.\n\n1. Research an org now\n2. Ask me tomorrow\n3. Ask me in a week\n\nUse claim_org for option 1.` },
       }],
     };
   }
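The order of checks in the revised `session_start` prompt can be summarized as a pure decision function. This is a sketch under stated assumptions: the function name, the flag names, and the returned labels are hypothetical, the condition behind the "already contributed today" message is elided in the diff, and the fall-through behavior when the `auto_schedule` branch throws is not shown.

```javascript
// Hypothetical summary of the session_start branch order seen in the diff.
function pickSessionStartBranch({ hasPendingReview, snoozed, autoSchedule, contributedToday, schedulable }) {
  if (hasPendingReview) return 'pending-review';        // new in 0.3.7: checked first
  if (snoozed) return 'snoozed';
  if (autoSchedule) return 'auto-schedule-status';
  if (contributedToday) return 'already-contributed';   // condition assumed, elided in the diff
  return schedulable ? 'prompt-with-automation' : 'prompt-research-only';
}

const branch = pickSessionStartBranch({
  hasPendingReview: true,
  snoozed: true,
  autoSchedule: false,
  contributedToday: false,
  schedulable: true,
});
console.log(branch); // "pending-review"
```

Under this ordering, a pending peer review takes precedence over a snooze.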