npm - india-reg-mcp - Versions diffs - 1.0.0 - Mend

india-reg-mcp 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

package/LICENSE +21 -0
package/README.md +292 -0
package/dist/db/queries.js +91 -0
package/dist/db/schema.js +56 -0
package/dist/index.js +169 -0
package/dist/scrapers/pdf.js +19 -0
package/dist/scrapers/rbi.js +168 -0
package/dist/scrapers/repair-bodies.js +92 -0
package/dist/scrapers/run-sync.js +31 -0
package/dist/scrapers/sebi.js +146 -0
package/dist/util/format.js +10 -0
package/dist/util/http.js +25 -0
package/package.json +61 -0

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Akhil Govind
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,292 @@
+# india-reg-mcp
+An MCP server that gives Claude (and any MCP client) searchable, cited access to **RBI and SEBI regulatory documents** — circulars, master directions, notifications, and regulations.
+No API keys. No subscriptions. Built entirely on free public government data.
+---
+## Why This Exists
+Regulatory rules in India are scattered across thousands of PDFs and HTML pages on rbi.org.in and sebi.gov.in. They reference and supersede each other constantly. There is no AI-accessible way to ask "what are the current rules on X" and get sourced answers.
+This MCP solves that. It scrapes, indexes, and exposes the full text of RBI and SEBI documents in a local SQLite database — so Claude can search and retrieve primary-source regulatory text with official citations, instantly.
+**This is not a legal advice tool.** It retrieves primary-source documents so you can reason over actual rules instead of hallucinated summaries.
+---
+## What It Does
+- **Full-text search** across all indexed RBI and SEBI documents
+- **Retrieve full document body** by ID, with official source link
+- **List recent documents** — useful for "what changed this month" questions
+- **List Master Directions and Master Circulars** — the consolidated, currently-in-force rules on each subject
+- **Browse by department** — SEBI's Investment Management, Market Regulation, etc.
+- **On-demand sync** — pull newly published documents from regulators' sites
+- **Topic search** — returns both consolidated master rules AND recent circulars on a subject in one call
+---
+## Tools (8 total)
+| Tool | Description |
+|---|---|
+| `search_regulations` | Full-text search with optional regulator/type filter |
+| `get_document` | Retrieve full text of a document by ID |
+| `get_recent` | Most recent documents, optionally filtered |
+| `list_master_directions` | All RBI Master Directions + SEBI Master Circulars |
+| `list_by_department` | SEBI documents by department name |
+| `sync_latest` | Incremental scrape of new documents from RBI + SEBI |
+| `sync_status` | Document count breakdown + last sync time |
+| `search_by_topic` | Combined master rules + recent circulars on a topic |
+---
+## Architecture
+```
+RBI / SEBI websites
+        │
+        ▼
+   Scrapers (TypeScript)
+   rbi.ts  ──── ASP.NET POST with viewstate tokens
+   sebi.ts ──── AJAX pagination via JSP endpoint
+        │
+        ▼
+   SQLite DB  (~/.india-reg-mcp/regdata.db)
+   ├── documents table (full text, metadata, source URLs)
+   ├── documents_fts (FTS5 full-text index, porter stemmer)
+   └── sync_meta (last sync timestamp)
+        │
+        ▼
+   MCP Server (stdio transport)
+   ├── 8 tools exposed to Claude
+   └── Every result includes official source URL + disclaimer
+```
+**Key design decision:** Tools never scrape live. The scrapers populate a local SQLite database once, and tools query that instantly. Only `sync_latest` hits the regulators' sites. This makes every tool call fast, keeps you off government servers during normal use, and works offline once indexed.
+---
+## Installation
+### Prerequisites
+- Node.js 22 LTS or higher
+- Claude Desktop or Claude Code CLI
+### Setup
+```bash
+git clone https://github.com/Akhilgovind02/india-regulatory-mcp.git
+cd india-regulatory-mcp
+npm install
+npm run build
+```
+### First-run sync (populates the database)
+```bash
+npm run sync
+```
+This scrapes RBI notifications for the last 36 months and SEBI circulars (~1000 recent documents). Takes **5–15 minutes** depending on your connection. The database is stored at `~/.india-reg-mcp/regdata.db` and survives rebuilds.
+Progress is printed to stderr:
+```
+[sync] RBI 2026-6: 12 docs found
+[sync] RBI 2026-5: 18 docs found
+...
+[sync] SEBI ssid=6 page 0: 25 docs
+...
+[sync] Sync complete.
+```
+---
+## Configuration
+### Claude Code (CLI)
+```bash
+claude mcp add -s user india-reg node /ABSOLUTE/PATH/TO/india-regulatory-mcp/dist/index.js
+```
+Or add to `~/.claude.json` manually:
+```json
+{
+  "mcpServers": {
+    "india-reg": {
+      "command": "node",
+      "args": ["/ABSOLUTE/PATH/TO/india-regulatory-mcp/dist/index.js"]
+    }
+  }
+}
+```
+### Claude Desktop
+Edit `%APPDATA%\Claude\claude_desktop_config.json` (Windows) or `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS):
+```json
+{
+  "mcpServers": {
+    "india-reg": {
+      "command": "node",
+      "args": ["C:\\ABSOLUTE\\PATH\\TO\\india-regulatory-mcp\\dist\\index.js"]
+    }
+  }
+}
+```
+Restart Claude Desktop fully after editing the config.
+---
+## Usage Examples
+Once connected, ask Claude naturally:
+```
+What are the current RBI rules on digital lending?
+```
+→ Claude calls `search_by_topic("digital lending")` — returns the Master Direction on Digital Lending plus recent amending circulars, each with official source links.
+```
+What new SEBI circulars came out this month?
+```
+→ Claude calls `get_recent(regulator="SEBI", limit=10)`.
+```
+Show me all RBI Master Directions
+```
+→ Claude calls `list_master_directions(regulator="RBI")`.
+```
+What are the KYC requirements for mutual funds?
+```
+→ Claude calls `search_regulations("KYC mutual fund", doc_type="master_circular")`.
+```
+Pull in the latest regulatory updates
+```
+→ Claude calls `sync_latest()` — incremental, only fetches new documents.
+---
+## Data Sources
+### RBI (Reserve Bank of India)
+- **Notifications + Circulars**: `rbi.org.in/Scripts/NotificationUser.aspx`
+- **Master Directions**: `rbi.org.in/Scripts/BS_ViewMasterDirections.aspx`
+- **Master Circulars**: `rbi.org.in/Scripts/BS_ViewMasterCirculardetails.aspx`
+- Document bodies: HTML page text (with PDF fallback for PDF-only docs)
+### SEBI (Securities and Exchange Board of India)
+- **Circulars** (`ssid=7`): ~2,775 documents as of June 2026
+- **Master Circulars** (`ssid=6`): consolidated current rules by topic
+- **Regulations** (`ssid=3`): statutory regulations
+- Document bodies: primarily PDF-embedded content extracted via `pdf-parse`
+---
+## Document Types
+| Type | Source | Description |
+|---|---|---|
+| `circular` | RBI + SEBI | Point-in-time regulatory guidance |
+| `master_direction` | RBI | Consolidated current rules on a subject (supersedes earlier circulars) |
+| `master_circular` | SEBI | Same as master direction, SEBI's term |
+| `notification` | RBI | Statutory notifications under various Acts |
+| `regulation` | SEBI | Formal regulations (e.g. SEBI (FPI) Regulations 2019) |
+**For compliance questions, start with `master_direction`/`master_circular`.** These represent the current state of rules, not a point-in-time snapshot.
+---
+## Keeping the Index Fresh
+The database is a point-in-time snapshot. RBI and SEBI publish new documents frequently (several per week).
+**Option 1 — Ask Claude:** "Pull in the latest regulatory updates" → Claude calls `sync_latest`.
+**Option 2 — CLI:**
+```bash
+npm run sync
+```
+**Option 3 — Scheduled (example cron, runs every Sunday at 2am):**
+```
+0 2 * * 0 cd /path/to/india-regulatory-mcp && npm run sync >> ~/.india-reg-mcp/sync.log 2>&1
+```
+`sync_latest` and `npm run sync` are both incremental — they skip documents already in the database.
+---
+## Project Structure
+```
+india-regulatory-mcp/
+├── src/
+│   ├── index.ts                  ← MCP server, all 8 tools
+│   ├── db/
+│   │   ├── schema.ts             ← SQLite schema + FTS5 setup
+│   │   └── queries.ts            ← DB read/write functions
+│   ├── scrapers/
+│   │   ├── rbi.ts                ← RBI scraper (ASP.NET POST + viewstate)
+│   │   ├── sebi.ts               ← SEBI scraper (AJAX pagination)
+│   │   ├── pdf.ts                ← PDF download + text extraction
+│   │   ├── run-sync.ts           ← CLI full sync runner
+│   │   └── repair-bodies.ts      ← Utility: backfill missing body text
+│   └── util/
+│       ├── http.ts               ← Polite fetch (UA + retry + delay)
+│       └── format.ts             ← MCP response helpers
+├── dist/                         ← Compiled output (not in repo)
+├── package.json
+└── tsconfig.json
+```
+---
+## Technical Notes
+### Scraping approach
+**RBI** uses ASP.NET with viewstate tokens. The scraper fetches the viewstate tokens once at the start of a sync, then POSTs month by month to get document listings. Content is extracted from the HTML circular page; if that yields fewer than 200 characters and the listing links a PDF, the scraper falls back to the PDF text.
+**SEBI** uses a JSP-based listing with AJAX pagination. Page 1 is a GET request (establishes JSESSIONID). Pages 2+ POST to `getnewslistinfo.jsp` with the session cookie. Circular content is typically a PDF embedded in an iframe — the scraper extracts the PDF URL and parses it with `pdf-parse`.
+### Polite scraping
+- Concurrency capped at 2 parallel requests per host
+- 300ms delay between document fetches
+- 500ms delay between listing pages
+- Exponential backoff on 429/5xx responses
+- Real User-Agent string identifying the tool
+### Database
+- Location: `~/.india-reg-mcp/regdata.db`
+- WAL mode for performance
+- FTS5 with Porter stemmer — handles morphological variants ("lending" matches "lend", "lender")
+- Full-text search terms are quoted per-word to handle hyphens and special characters safely
+---
+## Stack
+| Package | Version | Purpose |
+|---|---|---|
+| `@modelcontextprotocol/sdk` | 1.29.0 | MCP server framework |
+| `better-sqlite3` | 12.10.0 | SQLite with built-in FTS5 |
+| `cheerio` | 1.2.0 | HTML parsing |
+| `pdf-parse` | 2.4.5 | PDF text extraction |
+| `turndown` | 7.2.4 | HTML → Markdown |
+| `p-limit` | 7.3.0 | Concurrency control |
+| `zod` | 4.4.3 | Schema validation |
+---
+## Disclaimer
+This tool retrieves and surfaces primary-source regulatory documents from official RBI and SEBI publications. It does not provide legal advice. Always verify against the official linked document. The index is a point-in-time snapshot — use `sync_latest` to pull recent publications.
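
The "Polite scraping" notes in the README list exponential backoff on 429/5xx responses. `dist/util/http.js` is not shown at this point in the diff, so the following is only a minimal sketch of what such a delay schedule could look like. `shouldRetry`, `backoffDelayMs`, the 500 ms base, and the 8 s cap are all assumptions made for illustration, not the package's actual names or values.

```javascript
// Hypothetical helpers for illustration; not taken from the package.

// Retry only on rate limiting (429) and server errors (5xx).
function shouldRetry(status) {
  return status === 429 || status >= 500;
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Delay schedule for five consecutive retries.
const schedule = [0, 1, 2, 3, 4].map((n) => backoffDelayMs(n));
console.log(schedule.join(", ")); // prints: 500, 1000, 2000, 4000, 8000
```

For comparison, the retry loop in `dist/scrapers/rbi.js` waits `2000 * (attempt + 1)` ms between attempts, which grows linearly rather than exponentially.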

package/dist/db/queries.js ADDED

@@ -0,0 +1,91 @@
+import { db } from "./schema.js";
+// Lazy statement cache — prepared after initSchema() has created the tables
+const stmts = {};
+function stmt(key, sql) {
+    if (!stmts[key])
+        stmts[key] = db.prepare(sql);
+    return stmts[key];
+}
+export function upsertDoc(doc) {
+    stmt("upsert", `
+    INSERT INTO documents (id, regulator, doc_type, title, date, department, source_url, pdf_url, body, indexed_at)
+    VALUES (@id, @regulator, @doc_type, @title, @date, @department, @source_url, @pdf_url, @body, @indexed_at)
+    ON CONFLICT(id) DO UPDATE SET
+      title=@title, body=@body, department=@department, pdf_url=@pdf_url, indexed_at=@indexed_at
+  `).run(doc);
+}
+export function upsertMany(docs) {
+    const s = stmt("upsert", `
+    INSERT INTO documents (id, regulator, doc_type, title, date, department, source_url, pdf_url, body, indexed_at)
+    VALUES (@id, @regulator, @doc_type, @title, @date, @department, @source_url, @pdf_url, @body, @indexed_at)
+    ON CONFLICT(id) DO UPDATE SET
+      title=@title, body=@body, department=@department, pdf_url=@pdf_url, indexed_at=@indexed_at
+  `);
+    const tx = db.transaction((rows) => rows.forEach((r) => s.run(r)));
+    tx(docs);
+}
+export function docExists(id) {
+    return !!stmt("exists", "SELECT 1 FROM documents WHERE id = ?").get(id);
+}
+export function getDoc(id) {
+    return stmt("getDoc", "SELECT * FROM documents WHERE id = ?").get(id);
+}
+export function searchDocs(opts) {
+    if (!opts.query.trim())
+        return [];
+    const limit = opts.limit ?? 10;
+    let sql = `
+    SELECT d.*, snippet(documents_fts, 1, '<<', '>>', ' … ', 16) AS snippet
+    FROM documents_fts f
+    JOIN documents d ON d.rowid = f.rowid
+    WHERE documents_fts MATCH @q
+  `;
+    const params = { q: escapeFts(opts.query) };
+    if (opts.regulator) {
+        sql += " AND d.regulator = @regulator";
+        params.regulator = opts.regulator;
+    }
+    if (opts.docType) {
+        sql += " AND d.doc_type = @docType";
+        params.docType = opts.docType;
+    }
+    sql += " ORDER BY rank, d.date DESC LIMIT @limit";
+    params.limit = limit;
+    return db.prepare(sql).all(params);
+}
+export function recentDocs(opts) {
+    const limit = opts.limit ?? 15;
+    let sql = "SELECT * FROM documents WHERE 1=1";
+    const params = {};
+    if (opts.regulator) {
+        sql += " AND regulator = @regulator";
+        params.regulator = opts.regulator;
+    }
+    if (opts.docType) {
+        sql += " AND doc_type = @docType";
+        params.docType = opts.docType;
+    }
+    sql += " ORDER BY date DESC LIMIT @limit";
+    params.limit = limit;
+    return db.prepare(sql).all(params);
+}
+export function listByDepartment(dept, limit = 20) {
+    return db.prepare("SELECT * FROM documents WHERE department LIKE ? ORDER BY date DESC LIMIT ?").all(`%${dept}%`, limit);
+}
+export function docCount() {
+    return db.prepare("SELECT regulator, doc_type, COUNT(*) as n FROM documents GROUP BY regulator, doc_type").all();
+}
+export function getSyncMeta(key) {
+    const row = stmt("getMeta", "SELECT value FROM sync_meta WHERE key = ?").get(key);
+    return row?.value;
+}
+export function setSyncMeta(key, value) {
+    stmt("setMeta", "INSERT INTO sync_meta (key,value) VALUES (?,?) ON CONFLICT(key) DO UPDATE SET value=?")
+        .run(key, value, value);
+}
+function escapeFts(q) {
+    const trimmed = q.trim();
+    if (!trimmed)
+        return "";
+    return trimmed.split(/\s+/).map((w) => `"${w.replace(/"/g, '')}"`).join(" ");
+}
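
To make the quoting in `escapeFts` concrete, the snippet below repeats the same expression in a standalone function and prints what it produces for a query containing a hyphen and stray quotes. The name `quoteFtsTerms` is hypothetical, used here only because `escapeFts` is not exported.

```javascript
// Standalone copy of the escapeFts logic, renamed for illustration.
function quoteFtsTerms(q) {
  const trimmed = q.trim();
  if (!trimmed) return "";
  // Strip embedded double quotes, then wrap each whitespace-separated word
  // in double quotes so FTS5 reads it as a string, not as query syntax.
  return trimmed
    .split(/\s+/)
    .map((w) => `"${w.replace(/"/g, "")}"`)
    .join(" ");
}

console.log(quoteFtsTerms('  co-lending "KYC" norms '));
// prints: "co-lending" "KYC" "norms"
```

Because the quoted terms are separated by spaces, FTS5 combines them with an implicit AND, so a document has to contain every term to match.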

package/dist/db/schema.js ADDED

@@ -0,0 +1,56 @@
+import Database from "better-sqlite3";
+import { homedir } from "node:os";
+import { join } from "node:path";
+import { mkdirSync, existsSync } from "node:fs";
+const DB_DIR = join(homedir(), ".india-reg-mcp");
+if (!existsSync(DB_DIR))
+    mkdirSync(DB_DIR, { recursive: true });
+const DB_PATH = join(DB_DIR, "regdata.db");
+export const db = new Database(DB_PATH);
+db.pragma("journal_mode = WAL");
+export function initSchema() {
+    db.exec(`
+    CREATE TABLE IF NOT EXISTS documents (
+      id           TEXT PRIMARY KEY,
+      regulator    TEXT NOT NULL,
+      doc_type     TEXT NOT NULL,
+      title        TEXT NOT NULL,
+      date         TEXT NOT NULL,
+      department   TEXT,
+      source_url   TEXT NOT NULL,
+      pdf_url      TEXT,
+      body         TEXT,
+      indexed_at   TEXT NOT NULL
+    );
+    CREATE INDEX IF NOT EXISTS idx_regulator ON documents(regulator);
+    CREATE INDEX IF NOT EXISTS idx_doctype   ON documents(doc_type);
+    CREATE INDEX IF NOT EXISTS idx_date      ON documents(date);
+    CREATE VIRTUAL TABLE IF NOT EXISTS documents_fts USING fts5(
+      title, body,
+      content='documents',
+      content_rowid='rowid',
+      tokenize='porter unicode61'
+    );
+    DROP TRIGGER IF EXISTS documents_ai;
+    CREATE TRIGGER documents_ai AFTER INSERT ON documents BEGIN
+      INSERT INTO documents_fts(documents_fts, rowid, title, body) VALUES ('delete', new.rowid, new.title, new.body);
+      INSERT INTO documents_fts(rowid, title, body) VALUES (new.rowid, new.title, new.body);
+    END;
+    CREATE TRIGGER IF NOT EXISTS documents_ad AFTER DELETE ON documents BEGIN
+      INSERT INTO documents_fts(documents_fts, rowid, title, body) VALUES ('delete', old.rowid, old.title, old.body);
+    END;
+    CREATE TRIGGER IF NOT EXISTS documents_au AFTER UPDATE ON documents BEGIN
+      INSERT INTO documents_fts(documents_fts, rowid, title, body) VALUES ('delete', old.rowid, old.title, old.body);
+      INSERT INTO documents_fts(rowid, title, body) VALUES (new.rowid, new.title, new.body);
+    END;
+    CREATE TABLE IF NOT EXISTS sync_meta (
+      key   TEXT PRIMARY KEY,
+      value TEXT
+    );
+  `);
+}
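
`documents_fts` is declared with `content='documents'`, so the index holds tokens but no copy of the text of its own, which is why the triggers pass the old `title` and `body` when removing a row. The sketch below is a plain in-memory analogy of that delete-then-insert pattern, not FTS5 itself. `tokenize`, `addToIndex`, `removeFromIndex`, and `updateIndex` are hypothetical names, and the tokenizer is far simpler than `porter unicode61`.

```javascript
// In-memory analogy of an external-content index (not real FTS5).
// The index maps each token to the set of rowids that contain it.

function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

function addToIndex(index, rowid, text) {
  for (const tok of tokenize(text)) {
    if (!index.has(tok)) index.set(tok, new Set());
    index.get(tok).add(rowid);
  }
}

// Removal needs the OLD text: the index keeps no copy of the document,
// so the old tokens are the only way to know which entries to drop.
function removeFromIndex(index, rowid, oldText) {
  for (const tok of tokenize(oldText)) {
    const rows = index.get(tok);
    if (!rows) continue;
    rows.delete(rowid);
    if (rows.size === 0) index.delete(tok);
  }
}

// Mirrors the AFTER UPDATE trigger: delete old entries, then insert new ones.
function updateIndex(index, rowid, oldText, newText) {
  removeFromIndex(index, rowid, oldText);
  addToIndex(index, rowid, newText);
}

const index = new Map();
addToIndex(index, 1, "Digital lending guidelines");
updateIndex(index, 1, "Digital lending guidelines", "Digital lending directions");
console.log([...index.keys()].sort().join(" ")); // prints: digital directions lending
```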

package/dist/index.js ADDED

@@ -0,0 +1,169 @@
+#!/usr/bin/env node
+import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
+import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
+import { z } from "zod";
+import { initSchema } from "./db/schema.js";
+import * as q from "./db/queries.js";
+import { syncRbi } from "./scrapers/rbi.js";
+import { syncSebi } from "./scrapers/sebi.js";
+import { DISCLAIMER, ok, err, emptyDbMsg } from "./util/format.js";
+initSchema();
+const server = new McpServer({ name: "india-reg-mcp", version: "1.0.0" });
+server.tool("search_regulations", "Full-text search across indexed RBI and SEBI regulatory documents (circulars, master directions, notifications, regulations). " +
+    "Returns matching documents with title, date, regulator, a highlighted snippet, and the official source link. " +
+    "Use this to answer 'what are the rules on X' style questions with cited primary sources.", {
+    query: z.string().describe("Search terms e.g. 'digital lending', 'KYC periodic updation', 'FPI registration', 'mutual fund nomination'"),
+    regulator: z.enum(["RBI", "SEBI"]).optional().describe("Optionally limit to one regulator"),
+    doc_type: z.enum(["circular", "master_direction", "master_circular", "notification", "regulation"]).optional()
+        .describe("Optionally limit to one document type. master_direction/master_circular are consolidated current rules."),
+    limit: z.number().default(10).describe("Max results (1-25)"),
+}, async ({ query, regulator, doc_type, limit }) => {
+    try {
+        if (!q.docCount().length)
+            return emptyDbMsg();
+        const results = q.searchDocs({ query, regulator, docType: doc_type, limit: Math.min(limit, 25) });
+        return ok({
+            query,
+            resultCount: results.length,
+            results: results.map((r) => ({
+                id: r.id, regulator: r.regulator, type: r.doc_type,
+                title: r.title, date: r.date, snippet: r.snippet,
+                source: r.source_url, pdf: r.pdf_url,
+            })),
+            note: "Use get_document with an id to retrieve the full text.",
+            disclaimer: DISCLAIMER,
+        });
+    }
+    catch (e) {
+        return err(e instanceof Error ? e.message : String(e));
+    }
+});
+server.tool("get_document", "Retrieve the full text of a specific regulatory document by its id (from search results). Returns the complete body plus metadata and official link.", { id: z.string().describe("Document id from search results e.g. 'rbi:13344' or 'sebi:101703'") }, async ({ id }) => {
+    try {
+        const doc = q.getDoc(id);
+        if (!doc)
+            return err(`No document found with id ${id}. Use search_regulations to find valid ids.`);
+        const body = doc.body && doc.body.length > 12000
+            ? doc.body.slice(0, 12000) + "\n\n[... truncated. See full document at source link ...]"
+            : doc.body;
+        return ok({
+            id: doc.id, regulator: doc.regulator, type: doc.doc_type,
+            title: doc.title, date: doc.date, department: doc.department,
+            source: doc.source_url, pdf: doc.pdf_url, body,
+            disclaimer: DISCLAIMER,
+        });
+    }
+    catch (e) {
+        return err(e instanceof Error ? e.message : String(e));
+    }
+});
+server.tool("get_recent", "Get the most recent regulatory documents, optionally filtered by regulator or type. Useful for 'what changed recently' questions.", {
+    regulator: z.enum(["RBI", "SEBI"]).optional(),
+    doc_type: z.enum(["circular", "master_direction", "master_circular", "notification", "regulation"]).optional(),
+    limit: z.number().default(15).describe("Max results (1-30)"),
+}, async ({ regulator, doc_type, limit }) => {
+    try {
+        if (!q.docCount().length)
+            return emptyDbMsg();
+        const docs = q.recentDocs({ regulator, docType: doc_type, limit: Math.min(limit, 30) });
+        return ok({
+            results: docs.map((d) => ({ id: d.id, regulator: d.regulator, type: d.doc_type, title: d.title, date: d.date, source: d.source_url })),
+            disclaimer: DISCLAIMER,
+        });
+    }
+    catch (e) {
+        return err(e instanceof Error ? e.message : String(e));
+    }
+});
+server.tool("list_master_directions", "List RBI Master Directions and SEBI Master Circulars — the consolidated, currently-in-force rules on each subject. " +
+    "Best starting point for understanding the current state of regulation on a topic.", { regulator: z.enum(["RBI", "SEBI"]).optional() }, async ({ regulator }) => {
+    try {
+        if (!q.docCount().length)
+            return emptyDbMsg();
+        const md = q.recentDocs({ regulator, docType: "master_direction", limit: 50 });
+        const mc = q.recentDocs({ regulator, docType: "master_circular", limit: 50 });
+        const all = [...md, ...mc].sort((a, b) => b.date.localeCompare(a.date));
+        return ok({
+            count: all.length,
+            documents: all.map((d) => ({ id: d.id, regulator: d.regulator, type: d.doc_type, title: d.title, date: d.date, source: d.source_url })),
+            disclaimer: DISCLAIMER,
+        });
+    }
+    catch (e) {
+        return err(e instanceof Error ? e.message : String(e));
+    }
+});
+server.tool("list_by_department", "List SEBI documents from a specific department e.g. 'Investment Management', 'Market Regulation', 'Corporation Finance'.", { department: z.string().describe("Department name or partial e.g. 'Investment Management', 'Foreign Portfolio'") }, async ({ department }) => {
+    try {
+        if (!q.docCount().length)
+            return emptyDbMsg();
+        const docs = q.listByDepartment(department, 25);
+        return ok({
+            department, count: docs.length,
+            documents: docs.map((d) => ({ id: d.id, title: d.title, date: d.date, source: d.source_url })),
+            disclaimer: DISCLAIMER,
+        });
+    }
+    catch (e) {
+        return err(e instanceof Error ? e.message : String(e));
+    }
+});
+server.tool("sync_latest", "Refresh the regulatory index by scraping the latest documents from RBI and SEBI. " +
+    "Run this to pull in newly published circulars, master circulars, and regulations. Incremental — only fetches documents not already indexed. " +
+    "Note: takes 2-5 minutes as it politely scrapes the regulators' sites.", {
+    months_back: z.number().default(2).describe("How many months of RBI history to check (default 2 for incremental refresh)"),
+    sebi_pages: z.number().default(3).describe("How many SEBI listing pages to check per section (default 3 = ~75 recent docs per section)"),
+}, async ({ months_back, sebi_pages }) => {
+    try {
+        const log = [];
+        const rbiCount = await syncRbi(months_back, (m) => log.push(m));
+        const sebiCirc = await syncSebi(7, sebi_pages, (m) => log.push(m));
+        const sebiMaster = await syncSebi(6, sebi_pages, (m) => log.push(m));
+        const sebiReg = await syncSebi(3, sebi_pages, (m) => log.push(m));
+        const sebiCount = sebiCirc + sebiMaster + sebiReg;
+        q.setSyncMeta("last_sync", new Date().toISOString());
+        return ok({
+            message: "Sync complete.",
+            newRbiDocs: rbiCount, newSebiDocs: sebiCount,
+            log,
+            disclaimer: DISCLAIMER,
+        });
+    }
+    catch (e) {
+        return err(e instanceof Error ? e.message : String(e));
+    }
+});
+server.tool("sync_status", "Show how many documents are indexed, broken down by regulator and type, and when the index was last synced.", {}, async () => {
+    try {
+        const counts = q.docCount();
+        const lastSync = q.getSyncMeta("last_sync");
+        const total = counts.reduce((s, c) => s + c.n, 0);
+        return ok({ totalDocuments: total, breakdown: counts, lastSync: lastSync || "never" });
+    }
+    catch (e) {
+        return err(e instanceof Error ? e.message : String(e));
+    }
+});
+server.tool("search_by_topic", "Topic-focused search that returns BOTH the consolidated master rules AND recent circulars on a subject, " +
+    "so you get the current baseline plus any recent changes. Best tool for 'give me everything on X' questions.", { topic: z.string().describe("Regulatory topic e.g. 'digital lending', 'NBFC capital adequacy', 'FPI', 'algo trading', 'KYC'") }, async ({ topic }) => {
+    try {
+        if (!q.docCount().length)
+            return emptyDbMsg();
+        const masters = q.searchDocs({ query: topic, docType: "master_direction", limit: 3 })
+            .concat(q.searchDocs({ query: topic, docType: "master_circular", limit: 3 }));
+        const recent = q.searchDocs({ query: topic, limit: 10 });
+        return ok({
+            topic,
+            consolidatedRules: masters.map((r) => ({ id: r.id, regulator: r.regulator, title: r.title, date: r.date, source: r.source_url })),
+            relatedDocuments: recent.map((r) => ({ id: r.id, regulator: r.regulator, type: r.doc_type, title: r.title, date: r.date, snippet: r.snippet, source: r.source_url })),
+            guidance: "Start with consolidatedRules for the current baseline, then check relatedDocuments for recent amendments. Use get_document for full text.",
+            disclaimer: DISCLAIMER,
+        });
+    }
+    catch (e) {
+        return err(e instanceof Error ? e.message : String(e));
+    }
+});
+const transport = new StdioServerTransport();
+await server.connect(transport);
+console.error("india-reg-mcp running on stdio");
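
Two small conventions recur in the handlers above: result limits are capped with `Math.min`, and `get_document` cuts bodies longer than 12,000 characters. The sketch below isolates both. `clampLimit` is a hypothetical helper that also enforces the lower bound of 1 mentioned in the tool descriptions (the handlers themselves only apply the upper bound), and `truncateBody` is a hypothetical name wrapping the same slice logic as `get_document`.

```javascript
// Hypothetical helper: keep a requested limit inside [1, max].
function clampLimit(limit, max) {
  return Math.max(1, Math.min(Math.trunc(limit), max));
}

const TRUNCATION_NOTE = "\n\n[... truncated. See full document at source link ...]";

// Same slice logic as get_document, wrapped in a function for illustration.
function truncateBody(body, maxChars = 12000) {
  return body && body.length > maxChars
    ? body.slice(0, maxChars) + TRUNCATION_NOTE
    : body;
}

console.log(clampLimit(100, 25)); // prints: 25
console.log(clampLimit(-5, 25));  // prints: 1
console.log(truncateBody("short body")); // prints: short body
```

The lower bound matters because SQLite treats a negative `LIMIT` as "no upper bound", so a negative value passed straight through would return every matching row.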

package/dist/scrapers/pdf.js ADDED

@@ -0,0 +1,19 @@
+import { politeFetch } from "../util/http.js";
+import { PDFParse } from "pdf-parse";
+export async function extractPdfText(pdfUrl) {
+    try {
+        const res = await politeFetch(pdfUrl);
+        const buf = Buffer.from(await res.arrayBuffer());
+        const parser = new PDFParse({ data: buf });
+        const result = await parser.getText();
+        return cleanText(result.text);
+    }
+    catch (e) {
+        const msg = e instanceof Error ? e.message : String(e);
+        console.error(`PDF extract failed for ${pdfUrl}: ${msg}`);
+        return "";
+    }
+}
+function cleanText(t) {
+    return t.replace(/\r/g, "").replace(/\n{3,}/g, "\n\n").replace(/[ \t]{2,}/g, " ").trim();
+}
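
As a quick illustration of what `cleanText` does to extracted PDF text, the snippet below copies the function into a standalone form and runs it on a made-up sample string. The sample input is invented for this example.

```javascript
// Standalone copy of cleanText from pdf.js, with per-step comments.
function cleanText(t) {
  return t
    .replace(/\r/g, "")          // drop carriage returns
    .replace(/\n{3,}/g, "\n\n")  // collapse 3+ newlines into one blank line
    .replace(/[ \t]{2,}/g, " ")  // collapse runs of 2+ spaces/tabs into one space
    .trim();                     // strip leading and trailing whitespace
}

const raw = "Circular No. 12\r\n\r\n\r\n\r\nAll   regulated\t\tentities  ";
console.log(JSON.stringify(cleanText(raw)));
// prints: "Circular No. 12\n\nAll regulated entities"
```

Note that the third pattern only matches runs of two or more characters, so a single tab between words is left as it is.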

package/dist/scrapers/rbi.js ADDED

@@ -0,0 +1,168 @@
+import * as cheerio from "cheerio";
+import TurndownService from "turndown";
+import pLimit from "p-limit";
+import { politeFetch, sleep } from "../util/http.js";
+import { extractPdfText } from "./pdf.js";
+import { upsertMany, docExists } from "../db/queries.js";
+const td = new TurndownService();
+const limit = pLimit(2);
+const RBI_BASE = "https://rbi.org.in/Scripts/NotificationUser.aspx";
+async function fetchViewstateTokens() {
+    const res = await politeFetch(RBI_BASE);
+    const html = await res.text();
+    return {
+        vs: html.match(/id="__VIEWSTATE"\s+value="([^"]+)"/)?.[1] ?? "",
+        vsg: html.match(/id="__VIEWSTATEGENERATOR"\s+value="([^"]+)"/)?.[1] ?? "",
+        ev: html.match(/id="__EVENTVALIDATION"\s+value="([^"]+)"/)?.[1] ?? "",
+    };
+}
+// Scrape one month via POST (month: 1-12, or 0 = all)
+export async function scrapeRbiMonth(year, month, tokens) {
+    const t = tokens ?? await fetchViewstateTokens();
+    const body = new URLSearchParams({
+        __VIEWSTATE: t.vs,
+        __VIEWSTATEGENERATOR: t.vsg,
+        __EVENTVALIDATION: t.ev,
+        hdnYear: String(year),
+        hdnMonth: String(month),
+        "UsrFontCntr$btn": "",
+    });
+    let res;
+    for (let attempt = 0; attempt <= 2; attempt++) {
+        try {
+            res = await fetch(RBI_BASE, {
+                method: "POST",
+                headers: {
+                    "User-Agent": "india-reg-mcp/1.0 (open-source regulatory indexer)",
+                    "Content-Type": "application/x-www-form-urlencoded",
+                    "Referer": RBI_BASE,
+                    "Accept": "text/html,*/*",
+                },
+                body: body.toString(),
+                signal: AbortSignal.timeout(30_000),
+            });
+            if (res.ok)
+                break;
+            if (res.status === 429 || res.status >= 500) {
+                await sleep(2000 * (attempt + 1));
+                continue;
+            }
+            throw new Error(`RBI POST failed: HTTP ${res.status}`);
+        }
+        catch (e) {
+            if (attempt === 2)
+                throw e;
+            await sleep(2000 * (attempt + 1));
+        }
+    }
+    if (!res?.ok)
+        throw new Error(`RBI POST failed after retries`);
+    const html = await res.text();
+    if (html.includes("No Notification Found"))
+        return [];
+    const $ = cheerio.load(html);
+    const items = [];
+    let currentDate = "";
+    $("table tr").each((_, tr) => {
+        const $tr = $(tr);
+        const text = $tr.text().trim();
+        const dateMatch = text.match(/^([A-Z][a-z]{2}\s+\d{1,2},\s+\d{4})$/);
+        if (dateMatch) {
+            currentDate = toISO(dateMatch[1]);
+            return;
+        }
+        const titleLink = $tr.find('a[href*="NotificationUser.aspx?Id="]').first();
+        if (titleLink.length) {
+            const href = titleLink.attr("href") || "";
+            const idMatch = href.match(/Id=(\d+)/);
+            if (!idMatch)
+                return;
+            if (!currentDate)
+                return; // skip rows before the first date header
+            const pdfLink = $tr.find('a[href*=".PDF"], a[href*=".pdf"]').first();
+            items.push({
+                id: `rbi:${idMatch[1]}`,
+                title: titleLink.text().trim(),
+                date: currentDate,
+                htmlUrl: absolute(href, "https://rbi.org.in/Scripts/"),
+                pdfUrl: pdfLink.length ? (pdfLink.attr("href") || null) : null,
+            });
+        }
+    });
+    return items;
+}
+async function fetchRbiBody(item) {
+    try {
+        const res = await politeFetch(item.htmlUrl);
+        const $ = cheerio.load(await res.text());
+        // #pnlDetails is the main content container on RBI ASP.NET doc pages
+        const main = $("#pnlDetails, #example-min, table.tablebg").first();
+        const bodyHtml = main.length ? main.html() : $("body").html();
+        let markdown = bodyHtml ? td.turndown(bodyHtml) : "";
+        if (markdown.length < 200 && item.pdfUrl)
+            markdown = await extractPdfText(item.pdfUrl);
+        return markdown;
+    }
+    catch {
+        return item.pdfUrl ? await extractPdfText(item.pdfUrl) : "";
+    }
+}
+export async function syncRbi(monthsBack, onProgress) {
+    const tokens = await fetchViewstateTokens();
+    const now = new Date();
+    let total = 0;
+    for (let i = 0; i < monthsBack; i++) {
+        const d = new Date(now.getFullYear(), now.getMonth() - i, 1);
+        const year = d.getFullYear();
+        const month = d.getMonth() + 1; // 1-indexed to match GetYearMonth JS
+        let items;
+        try {
+            items = await scrapeRbiMonth(year, month, tokens);
+        }
+        catch (e) {
+            const msg = e instanceof Error ? e.message : String(e);
+            onProgress?.(`RBI ${year}-${month}: fetch failed (${msg}), skipping`);
+            await sleep(3000);
+            continue;
+        }
+        onProgress?.(`RBI ${year}-${month}: ${items.length} docs found`);
+        const newItems = items.filter((it) => !docExists(it.id));
+        const rows = await Promise.all(newItems.map((it) => limit(async () => {
+            const body = await fetchRbiBody(it);
+            await sleep(300);
+            return {
+                id: it.id, regulator: "RBI",
+                doc_type: classifyRbi(it.title),
+                title: it.title, date: it.date, department: null,
+                source_url: it.htmlUrl, pdf_url: it.pdfUrl, body,
+                indexed_at: new Date().toISOString(),
+            };
+        })));
+        if (rows.length)
+            upsertMany(rows);
+        total += rows.length;
+        await sleep(500);
+    }
+    return total;
+}
+function classifyRbi(title) {
+    const t = title.toLowerCase();
+    if (t.includes("master direction"))
+        return "master_direction";
+    if (t.includes("master circular"))
+        return "master_circular";
+    if (t.includes("regulations"))
+        return "regulation";
+    if (t.includes("circular"))
+        return "circular";
+    return "notification";
+}
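+function toISO(s) {
+    const d = new Date(s); // "Jan 5, 2024" is parsed as local midnight, so read it back with local getters
+    return isNaN(d.getTime()) ? "" : [d.getFullYear(), d.getMonth() + 1, d.getDate()].map((n) => String(n).padStart(2, "0")).join("-"); // toISOString() would shift the day in zones east of UTC (e.g. IST)
+}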
+function absolute(href, base) {
+    if (href.startsWith("http"))
+        return href;
+    return new URL(href, base).toString();
+}
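
The month loop in `syncRbi` relies on the `Date` constructor normalising an out-of-range month index, which is how it walks back across a year boundary. A minimal sketch of that behaviour, assuming a hypothetical helper `monthWindows` that is not part of the package:

```javascript
// Hypothetical helper (not in the package): list the [year, month] pairs
// that a loop shaped like syncRbi's would visit, newest first.
function monthWindows(now, monthsBack) {
    const out = [];
    for (let i = 0; i < monthsBack; i++) {
        // A month index below 0 rolls back into the previous year.
        const d = new Date(now.getFullYear(), now.getMonth() - i, 1);
        out.push([d.getFullYear(), d.getMonth() + 1]); // month is 1-indexed
    }
    return out;
}

console.log(monthWindows(new Date(2024, 1, 15), 3));
// [ [ 2024, 2 ], [ 2024, 1 ], [ 2023, 12 ] ]
```

Anchoring each step on day 1 avoids the overflow that would occur if the current day of the month does not exist in an earlier month.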

package/dist/scrapers/repair-bodies.js ADDED

@@ -0,0 +1,92 @@
+/**
+ * Backfill body text for docs that were indexed without body content.
+ * Run with: npx tsx src/scrapers/repair-bodies.ts
+ */
+import { initSchema } from "../db/schema.js";
+import { db } from "../db/schema.js";
+import { sleep } from "../util/http.js";
+import pLimit from "p-limit";
+// Scraping dependencies; the body fetchers are re-implemented below rather than imported from rbi.js / sebi.js
+import * as cheerio from "cheerio";
+import TurndownService from "turndown";
+import { politeFetch } from "../util/http.js";
+import { PDFParse } from "pdf-parse";
+initSchema();
+const td = new TurndownService();
+const limit = pLimit(2);
+const log = (m) => console.error(`[repair] ${m}`);
+async function fetchRbiBody(sourceUrl) {
+    try {
+        const res = await politeFetch(sourceUrl);
+        const $ = cheerio.load(await res.text());
+        const main = $("#pnlDetails, #example-min, table.tablebg").first();
+        const bodyHtml = main.length ? main.html() : $("body").html();
+        return bodyHtml ? td.turndown(bodyHtml) : "";
+    }
+    catch {
+        return "";
+    }
+}
+async function fetchSebiBody(sourceUrl) {
+    try {
+        const res = await politeFetch(sourceUrl);
+        const $ = cheerio.load(await res.text());
+        const iframeSrc = $("iframe[src*='sebi_data'], iframe[src*='?file=']").first().attr("src") || "";
+        const pdfUrlMatch = iframeSrc.match(/[?&]file=((?:https?:\/\/|\/)[^'"&\s]+\.pdf)/i);
+        const rawPdfPath = pdfUrlMatch ? pdfUrlMatch[1] : null;
+        const pdfUrl = rawPdfPath
+            ? rawPdfPath.startsWith("/") ? `https://www.sebi.gov.in${rawPdfPath}` : rawPdfPath
+            : null;
+        if (pdfUrl) {
+            const buf = Buffer.from(await (await politeFetch(pdfUrl)).arrayBuffer());
+            const parser = new PDFParse({ data: buf });
+            const result = await parser.getText();
+            const text = result.text.replace(/\r/g, "").replace(/\n{3,}/g, "\n\n").trim();
+            return { body: text, pdfUrl };
+        }
+        return { body: "", pdfUrl: null };
+    }
+    catch {
+        return { body: "", pdfUrl: null };
+    }
+}
+const updateStmt = db.prepare("UPDATE documents SET body=@body, pdf_url=@pdf_url, indexed_at=@indexed_at WHERE id=@id");
+const updateBodyOnly = db.prepare("UPDATE documents SET body=@body, indexed_at=@indexed_at WHERE id=@id");
+const docs = db.prepare("SELECT id, regulator, source_url FROM documents WHERE body IS NULL OR body = '' OR (regulator='SEBI' AND LENGTH(body) < 2000) ORDER BY date DESC").all();
+log(`Found ${docs.length} docs needing body repair`);
+let done = 0;
+let failed = 0;
+await Promise.all(docs.map((doc) => limit(async () => {
+    try {
+        let body = "";
+        let pdfUrl = null;
+        if (doc.regulator === "RBI") {
+            body = await fetchRbiBody(doc.source_url);
+        }
+        else {
+            const result = await fetchSebiBody(doc.source_url);
+            body = result.body;
+            pdfUrl = result.pdfUrl;
+        }
+        if (doc.regulator === "RBI") {
+            updateBodyOnly.run({ id: doc.id, body, indexed_at: new Date().toISOString() });
+        }
+        else {
+            updateStmt.run({ id: doc.id, body, pdf_url: pdfUrl, indexed_at: new Date().toISOString() });
+        }
+        done++;
+        if (done % 20 === 0)
+            log(`Progress: ${done}/${docs.length} done`);
+    }
+    catch (e) {
+        const msg = e instanceof Error ? e.message : String(e);
+        console.error(`Failed ${doc.id}: ${msg}`);
+        failed++;
+    }
+    await sleep(300);
+})));
+log(`Done. ${done} updated, ${failed} failed.`);
+// Show updated stats
+const stats = db.prepare("SELECT regulator, SUM(CASE WHEN body IS NULL OR body='' THEN 1 ELSE 0 END) as no_body, COUNT(*) as total FROM documents GROUP BY regulator").all();
+stats.forEach(s => log(`${s.regulator}: ${s.total - s.no_body}/${s.total} have body`));
+process.exit(0);
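
The `SELECT` in this script decides which rows are re-fetched: any document with no body, plus SEBI documents whose body is shorter than 2000 characters. A rough sketch of the same rule as a plain predicate, assuming a hypothetical function `needsRepair` (JavaScript string length counts UTF-16 code units, so it only approximates SQLite's `LENGTH`):

```javascript
// Hypothetical predicate mirroring the WHERE clause of the repair query.
function needsRepair(doc) {
    // body IS NULL OR body = ''
    if (doc.body == null || doc.body === "") return true;
    // regulator = 'SEBI' AND LENGTH(body) < 2000
    return doc.regulator === "SEBI" && doc.body.length < 2000;
}

const sample = [
    { id: "rbi:1", regulator: "RBI", body: "" },
    { id: "rbi:2", regulator: "RBI", body: "short but present" },
    { id: "sebi:3", regulator: "SEBI", body: "short but present" },
    { id: "sebi:4", regulator: "SEBI", body: "x".repeat(2000) },
];
console.log(sample.filter(needsRepair).map((d) => d.id));
// [ 'rbi:1', 'sebi:3' ]
```

One consequence of the length rule is that a SEBI document whose full text is under 2000 characters is selected again on every run.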

package/dist/scrapers/run-sync.js ADDED

@@ -0,0 +1,31 @@
+import { initSchema } from "../db/schema.js";
+import { syncRbi } from "./rbi.js";
+import { syncSebi } from "./sebi.js";
+import { setSyncMeta } from "../db/queries.js";
+async function main() {
+    initSchema();
+    const log = (m) => console.error(`[sync] ${m}`);
+    const args = process.argv.slice(2);
+    const quick = args.includes("--quick"); // quick mode: 6mo RBI + 5 pages SEBI
+    if (quick) {
+        log("Quick sync mode (6 months RBI, 5 pages of SEBI circulars)...");
+        const rbiCount = await syncRbi(6, log);
+        log(`RBI: ${rbiCount} new documents`);
+        const sebiCirc = await syncSebi(7, 5, log);
+        log(`SEBI circulars: ${sebiCirc} new documents`);
+    }
+    else {
+        log("Starting RBI sync (last 36 months)...");
+        const rbiCount = await syncRbi(36, log);
+        log(`RBI: ${rbiCount} new documents`);
+        log("Starting SEBI sync...");
+        const sebiMaster = await syncSebi(6, 5, log);
+        const sebiCirc = await syncSebi(7, 40, log);
+        const sebiReg = await syncSebi(3, 10, log);
+        log(`SEBI: ${sebiMaster + sebiCirc + sebiReg} new documents`);
+    }
+    setSyncMeta("last_sync", new Date().toISOString());
+    log("Sync complete.");
+    process.exit(0);
+}
+main().catch((e) => { console.error(e); process.exit(1); });
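
The two branches of `main` differ only in which sources they visit and how far back they go. As a sketch, the same schedule can be written as data; `syncPlan` is a hypothetical function, and the numbers are copied from the `syncRbi` and `syncSebi` calls above:

```javascript
// Hypothetical description of the sync schedule as plain data.
// ssid values follow the comment in sebi.js: 7=circulars, 6=master circulars, 3=regulations.
function syncPlan(quick) {
    if (quick) {
        return [
            { source: "RBI", monthsBack: 6 },
            { source: "SEBI", ssid: 7, maxPages: 5 },
        ];
    }
    return [
        { source: "RBI", monthsBack: 36 },
        { source: "SEBI", ssid: 6, maxPages: 5 },
        { source: "SEBI", ssid: 7, maxPages: 40 },
        { source: "SEBI", ssid: 3, maxPages: 10 },
    ];
}

console.log(syncPlan(true).length, syncPlan(false).length);
// 2 4
```

Written this way, it is easy to see that quick mode skips SEBI master circulars and regulations entirely.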

package/dist/scrapers/sebi.js ADDED

@@ -0,0 +1,146 @@
+import * as cheerio from "cheerio";
+import TurndownService from "turndown";
+import pLimit from "p-limit";
+import { politeFetch, sleep } from "../util/http.js";
+import { upsertMany, docExists } from "../db/queries.js";
+const td = new TurndownService();
+const limit = pLimit(2);
+const UA = "india-reg-mcp/1.0 (open-source regulatory indexer)";
+// ssid: 7=circulars, 6=master circulars, 3=regulations
+const SEBI_LIST_BASE = "https://www.sebi.gov.in/sebiweb/home/HomeAction.do?doListing=yes&sid=1&ssid=";
+const SEBI_AJAX = "https://www.sebi.gov.in/sebiweb/ajax/home/getnewslistinfo.jsp";
+function parseListItems($) {
+    const items = [];
+    $("table tr").each((_, tr) => {
+        const $tr = $(tr);
+        const link = $tr.find('a[href*="/legal/"]').first();
+        if (!link.length)
+            return;
+        const href = link.attr("href") || "";
+        const idMatch = href.match(/_(\d+)\.html/);
+        if (!idMatch)
+            return;
+        const dateCell = $tr.find("td").first().text().trim();
+        items.push({
+            id: `sebi:${idMatch[1]}`,
+            title: link.text().trim(),
+            date: toISO(dateCell),
+            url: absolute(href, "https://www.sebi.gov.in/"),
+        });
+    });
+    return items;
+}
+// Pages 1+: AJAX POST to getnewslistinfo.jsp (page 0 handled in syncSebi)
+async function getSebiPage(ssid, pageIndex, jsessionid) {
+    // doDirect selects the page; the JSESSIONID cookie comes from the page-0 GET in syncSebi
+    const body = new URLSearchParams({
+        nextValue: "1",
+        next: "n",
+        search: "", fromDate: "", toDate: "", fromYear: "", toYear: "",
+        deptId: "",
+        sid: "1", ssid: String(ssid), smid: "0", ssidhidden: String(ssid),
+        intmid: "-1",
+        sText: "Legal", ssText: ssid === 7 ? "Circulars" : ssid === 6 ? "Master Circulars" : "Regulations",
+        smText: "",
+        doDirect: String(pageIndex),
+    });
+    const res = await fetch(SEBI_AJAX, {
+        method: "POST",
+        headers: {
+            "User-Agent": UA,
+            "Content-Type": "application/x-www-form-urlencoded",
+            "Cookie": `JSESSIONID=${jsessionid}`,
+            "Referer": `${SEBI_LIST_BASE}${ssid}&smid=0&nextValue=0`,
+            "Accept": "*/*",
+            "X-Requested-With": "XMLHttpRequest",
+        },
+        body: body.toString(),
+        signal: AbortSignal.timeout(30_000),
+    });
+    if (!res.ok)
+        throw new Error(`SEBI AJAX failed: HTTP ${res.status}`);
+    return parseListItems(cheerio.load(await res.text()));
+}
+async function fetchSebiBody(item) {
+    try {
+        const res = await politeFetch(item.url);
+        const $ = cheerio.load(await res.text());
+        // SEBI pages embed content as PDF in an iframe — src may have absolute or relative PDF path
+        // e.g. ?file=https://www.sebi.gov.in/sebi_data/... or ?file=/sebi_data/...
+        const iframeSrc = $("iframe[src*='sebi_data'], iframe[src*='?file=']").first().attr("src") || "";
+        const pdfUrlMatch = iframeSrc.match(/[?&]file=((?:https?:\/\/|\/)[^'"&\s]+\.pdf)/i);
+        const rawPdfPath = pdfUrlMatch ? pdfUrlMatch[1] : null;
+        const pdfUrl = rawPdfPath
+            ? rawPdfPath.startsWith("/") ? `https://www.sebi.gov.in${rawPdfPath}` : rawPdfPath
+            : null;
+        if (pdfUrl) {
+            const { extractPdfText } = await import("./pdf.js");
+            const body = await extractPdfText(pdfUrl);
+            return { body, pdfUrl };
+        }
+        // Fallback: extract any visible text from the page
+        const main = $(".main_section, .news-detail-slider, #member-wrapper").first();
+        const bodyHtml = main.length ? main.html() : $("body").html();
+        return { body: bodyHtml ? td.turndown(bodyHtml) : "", pdfUrl: null };
+    }
+    catch {
+        return { body: "", pdfUrl: null };
+    }
+}
+const SSID_DOC_TYPE = { 6: "master_circular", 3: "regulation" };
+const SSID_DEPARTMENT = { 6: "Master Circulars", 3: "Regulations", 7: "Circulars" };
+export async function syncSebi(ssid, maxPages, onProgress) {
+    // Page 0: single GET that both establishes the session and returns first page listings
+    const url = `${SEBI_LIST_BASE}${ssid}&smid=0&nextValue=0`;
+    const page0Res = await fetch(url, {
+        headers: { "User-Agent": UA, "Accept": "text/html,*/*" },
+        signal: AbortSignal.timeout(30_000),
+    });
+    if (!page0Res.ok)
+        throw new Error(`SEBI GET failed: HTTP ${page0Res.status}`);
+    const cookie = page0Res.headers.get("set-cookie") || "";
+    const jsessionid = cookie.match(/JSESSIONID=([^;]+)/)?.[1] ?? "";
+    if (!jsessionid)
+        throw new Error("SEBI: failed to obtain JSESSIONID — session cookie absent from page-0 response");
+    const doc_type = SSID_DOC_TYPE[ssid] ?? "circular";
+    const department = SSID_DEPARTMENT[ssid] ?? "Circulars";
+    let total = 0;
+    const processPage = async (items) => {
+        if (!items.length)
+            return false;
+        const newItems = items.filter((it) => !docExists(it.id));
+        const rows = await Promise.all(newItems.map((it) => limit(async () => {
+            const { body, pdfUrl } = await fetchSebiBody(it);
+            await sleep(300);
+            return {
+                id: it.id, regulator: "SEBI",
+                doc_type, title: it.title, date: it.date, department,
+                source_url: it.url, pdf_url: pdfUrl, body,
+                indexed_at: new Date().toISOString(),
+            };
+        })));
+        if (rows.length)
+            upsertMany(rows);
+        total += rows.length;
+        await sleep(500);
+        return true;
+    };
+    const page0Items = parseListItems(cheerio.load(await page0Res.text()));
+    onProgress?.(`SEBI ssid=${ssid} page 0: ${page0Items.length} docs`);
+    await processPage(page0Items);
+    for (let page = 1; page < maxPages; page++) {
+        const items = await getSebiPage(ssid, page, jsessionid);
+        if (!items.length)
+            break;
+        onProgress?.(`SEBI ssid=${ssid} page ${page}: ${items.length} docs`);
+        await processPage(items);
+    }
+    return total;
+}
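+function toISO(s) {
+    const d = new Date(s.replace(/(\w{3})\s+(\d{1,2}),\s+(\d{4})/, "$1 $2 $3")); // parsed as local midnight
+    return isNaN(d.getTime()) ? "" : [d.getFullYear(), d.getMonth() + 1, d.getDate()].map((n) => String(n).padStart(2, "0")).join("-"); // local getters; toISOString() would shift the day in zones east of UTC (e.g. IST)
+}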
+function absolute(href, base) {
+    return href.startsWith("http") ? href : new URL(href, base).toString();
+}
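
`fetchSebiBody` recovers the PDF location from the `file=` query parameter of the viewer iframe and prefixes the SEBI origin when the path is relative. A small standalone sketch of that step, using the same regular expression; the function name `pdfUrlFromIframeSrc` and the sample `src` values are made up for illustration:

```javascript
const SEBI_ORIGIN = "https://www.sebi.gov.in";

// Hypothetical helper isolating the iframe-src parsing done in fetchSebiBody.
function pdfUrlFromIframeSrc(iframeSrc) {
    // Capture an absolute (http/https) or root-relative path ending in .pdf.
    const m = iframeSrc.match(/[?&]file=((?:https?:\/\/|\/)[^'"&\s]+\.pdf)/i);
    if (!m) return null;
    return m[1].startsWith("/") ? `${SEBI_ORIGIN}${m[1]}` : m[1];
}

// Illustrative inputs only; real iframe paths may differ.
console.log(pdfUrlFromIframeSrc("viewer.html?file=/sebi_data/example/doc.pdf"));
// https://www.sebi.gov.in/sebi_data/example/doc.pdf
console.log(pdfUrlFromIframeSrc("viewer.html?file=https://www.sebi.gov.in/sebi_data/example/doc.pdf"));
// https://www.sebi.gov.in/sebi_data/example/doc.pdf
console.log(pdfUrlFromIframeSrc("viewer.html"));
// null
```

Because the character class excludes `&`, a PDF path that itself contains an ampersand would not match, and the caller would fall back to scraping visible page text.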

package/dist/util/format.js ADDED

@@ -0,0 +1,10 @@
+export const DISCLAIMER = "Source: official RBI/SEBI publications. This is primary-source retrieval, not legal advice. Verify against the linked official document.";
+export function ok(data) {
+    return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
+}
+export function err(m) {
+    return { content: [{ type: "text", text: `Error: ${m}` }], isError: true };
+}
+export function emptyDbMsg() {
+    return ok({ message: "The regulatory index is empty. Run 'npm run sync' first, or call the sync_latest tool to populate it.", disclaimer: DISCLAIMER });
+}
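
Both helpers wrap their payload in a single text content item, so a caller sees JSON as a string rather than as an object. A short sketch of the round trip, with `ok` copied from the file above and `unwrap` a hypothetical reader-side helper:

```javascript
// Same shape as ok() in format.js: one text item holding pretty-printed JSON.
function ok(data) {
    return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
}

// Hypothetical client-side helper: recover the object from the first text item.
function unwrap(result) {
    const first = result.content.find((c) => c.type === "text");
    return first ? JSON.parse(first.text) : null;
}

const result = ok({ message: "hello", count: 2 });
console.log(unwrap(result).count);
// 2
```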

package/dist/util/http.js ADDED

@@ -0,0 +1,25 @@
+const UA = "india-reg-mcp/1.0 (open-source regulatory indexer; +https://github.com/Akhilgovind02/india-regulatory-mcp)";
+export async function politeFetch(url, retries = 2) {
+    for (let attempt = 0; attempt <= retries; attempt++) {
+        try {
+            const res = await fetch(url, {
+                headers: { "User-Agent": UA, "Accept": "text/html,application/pdf,*/*" },
+                signal: AbortSignal.timeout(30_000),
+            });
+            if (res.ok)
+                return res;
+            if (res.status === 429 || res.status >= 500) {
+                await sleep(1000 * (attempt + 1));
+                continue;
+            }
+            throw new Error(`HTTP ${res.status} for ${url}`);
+        }
+        catch (e) {
+            if (attempt === retries)
+                throw e;
+            await sleep(1000 * (attempt + 1));
+        }
+    }
+    throw new Error(`Failed after ${retries} retries: ${url}`);
+}
+export function sleep(ms) { return new Promise((r) => setTimeout(r, ms)); }
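
`politeFetch` waits `1000 * (attempt + 1)` milliseconds after a failed attempt, which is a linear rather than exponential backoff. A sketch of the resulting schedule between attempts, assuming a hypothetical helper `backoffDelays`:

```javascript
// Hypothetical helper: delays (in ms) between attempts for a linear backoff
// of the form baseMs * (attempt + 1), one delay per retry.
function backoffDelays(retries, baseMs = 1000) {
    return Array.from({ length: retries }, (_, attempt) => baseMs * (attempt + 1));
}

console.log(backoffDelays(2));
// [ 1000, 2000 ]
console.log(backoffDelays(2, 2000));
// [ 2000, 4000 ]
```

The second call uses the 2000 ms base seen in the RBI POST retry loop. With the default of two retries, a URL that keeps failing with a thrown error costs about three seconds of waiting in `politeFetch`, on top of up to 30 seconds of timeout per attempt.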

package/package.json ADDED

@@ -0,0 +1,61 @@
+{
+  "name": "india-reg-mcp",
+  "version": "1.0.0",
+  "description": "MCP server for Indian financial regulations — RBI & SEBI circulars, master directions, notifications. Searchable, cited, no API keys.",
+  "type": "module",
+  "main": "dist/index.js",
+  "bin": {
+    "india-reg-mcp": "./dist/index.js"
+  },
+  "files": [
+    "dist/**/*",
+    "README.md"
+  ],
+  "keywords": [
+    "mcp",
+    "model-context-protocol",
+    "india",
+    "rbi",
+    "sebi",
+    "regulations",
+    "circulars",
+    "compliance",
+    "fintech",
+    "claude"
+  ],
+  "license": "MIT",
+  "repository": {
+    "type": "git",
+    "url": "https://github.com/Akhilgovind02/india-regulatory-mcp.git"
+  },
+  "homepage": "https://github.com/Akhilgovind02/india-regulatory-mcp",
+  "engines": {
+    "node": ">=18.0.0"
+  },
+  "scripts": {
+    "build": "tsc",
+    "postbuild": "node scripts/add-shebang.mjs",
+    "dev": "tsx src/index.ts",
+    "sync": "tsx src/scrapers/run-sync.ts",
+    "inspect": "npx @modelcontextprotocol/inspector tsx src/index.ts",
+    "start": "node dist/index.js",
+    "prepublishOnly": "npm run build"
+  },
+  "dependencies": {
+    "@modelcontextprotocol/sdk": "1.29.0",
+    "better-sqlite3": "12.10.0",
+    "cheerio": "1.2.0",
+    "p-limit": "7.3.0",
+    "pdf-parse": "2.4.5",
+    "turndown": "7.2.4",
+    "zod": "4.4.3"
+  },
+  "devDependencies": {
+    "@modelcontextprotocol/inspector": "0.22.0",
+    "@types/better-sqlite3": "^7.6.13",
+    "@types/node": "25.9.3",
+    "@types/turndown": "^5.0.6",
+    "tsx": "4.22.4",
+    "typescript": "6.0.3"
+  }
+}