@roomi-fields/notebooklm-mcp 1.5.9 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,10 +1,10 @@
  <div align="center">

- # NotebookLM MCP + HTTP REST API
+ # NotebookLM REST API + MCP server

- **Google NotebookLM over MCP + a local HTTP REST API — Q&A with citations, audio podcasts, video generation, multi-account rotation. Works with Claude Code, Codex, Cursor, n8n, Zapier, Make.**
+ **Automate Google NotebookLM at scale. 33-endpoint HTTP REST API for n8n / Zapier / Make / curl, plus an MCP server for Claude Code / Cursor / Codex. Citation-backed Q&A, full Studio generation (audio · video · infographic · report · presentation · data table), multi-account rotation with auto-reauth.**

- > 🟢 **Actively maintained fork** of [PleasePrompto/notebooklm-mcp](https://github.com/PleasePrompto/notebooklm-mcp) (upstream last push: 2025-12-27). This fork ships v1.5.8 (2026-04-19) with 2026 NotebookLM UI selectors, HTTP REST API for n8n / Zapier / Make, multi-account rotation, and documented install on Windows / WSL / Docker.
+ > v1.7.0 is production-grade and batch-tested on overnight runs of 1 000+ questions. New: `batch_to_vault` is now a first-class MCP tool (no HTTP server required) on top of the existing `POST /batch-to-vault` endpoint. See [RTFM integration](./deployment/docs/14-RTFM-INTEGRATION.md) for the full pattern. [Compare with `PleasePrompto/notebooklm-mcp` v2.0.0](https://roomi-fields.github.io/notebooklm-mcp/compare) to see when this project is the right pick (REST API, full Studio, auto-reauth) and when the MCP-only upstream is the better fit.

  <!-- Badges -->

@@ -66,15 +66,35 @@ Generate multiple content types from your notebook sources:
  - **MCP Protocol** — Claude Code, Cursor, Codex, any MCP client
  - **HTTP REST API** — n8n, Zapier, Make.com, custom integrations
  - **Docker** — Isolated deployment with Docker or Docker Compose
+ - **[RTFM](https://github.com/roomi-fields/rtfm) retrieval layer** — `/batch-to-vault` writes citation-backed answers as markdown + JSON sidecars (`nblm-answer-v1` schema), indexable by [RTFM](https://github.com/roomi-fields/rtfm) (FTS5 + semantic) for unlimited offline queries. Ideal for academic / SOTA workflows. [Guide](./deployment/docs/14-RTFM-INTEGRATION.md).

  ---
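To make the sidecar idea above concrete, here is a minimal sketch of what one `/batch-to-vault` answer pair could carry. Field names are inferred from this README's `/ask` output (`question`, `answer`, `citations` of `{id, source, excerpt}`); the authoritative contract is the published `nblm-answer-v1` JSON Schema, not this sketch.

```python
import json

# Hypothetical sidecar for one cached answer. The markdown file holds the
# prose; the .json sidecar holds the structure RTFM indexes. Real field set
# is defined by the nblm-answer-v1 schema, not by this illustration.
sidecar = {
    "schema": "nblm-answer-v1",
    "question": "Summarize chapter 3",
    "answer": "Chapter 3 argues that ...",
    "citations": [
        {"id": 1, "source": "thesis.pdf", "excerpt": "..."},  # shape per /ask
    ],
}

serialized = json.dumps(sidecar, ensure_ascii=False)
```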

  ## Quick Start

- ### Option 1: MCP Mode (Claude Code, Cursor, Codex)
+ ### Option 1 — HTTP REST API (n8n, Zapier, Make, curl, any HTTP client)

  ```bash
- # Clone and build locally
+ git clone https://github.com/roomi-fields/notebooklm-mcp.git
+ cd notebooklm-mcp
+ npm install && npm run build
+ npm run setup-auth  # One-time Google login
+ npm run start:http  # Start REST API on port 3000
+ ```
+
+ ```bash
+ # Citation-backed Q&A, single curl, JSON response
+ curl -X POST http://localhost:3000/ask \
+   -H 'Content-Type: application/json' \
+   -d '{"question": "Summarize chapter 3", "notebook_id": "your-id", "source_format": "json"}'
+ ```
+
+ The full surface is **33 documented endpoints** — see the [REST API reference](https://roomi-fields.github.io/notebooklm-mcp/notebooklm-rest-api). For overnight batches of 1 000+ questions, see the [batch pattern](https://roomi-fields.github.io/notebooklm-mcp/batch-1000-questions).
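The same call from Python, for readers scripting outside the shell — a minimal stdlib-only sketch, assuming the request fields shown in the curl example above (the response shape is documented in the REST API reference):

```python
import json
from urllib import request

API = "http://localhost:3000"  # default port from `npm run start:http`

def ask_payload(question: str, notebook_id: str, source_format: str = "json") -> dict:
    """Request body for POST /ask, mirroring the curl example."""
    return {"question": question, "notebook_id": notebook_id,
            "source_format": source_format}

def ask(question: str, notebook_id: str) -> dict:
    """Send one question to the local REST API and return the parsed JSON."""
    body = json.dumps(ask_payload(question, notebook_id)).encode()
    req = request.Request(f"{API}/ask", data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req, timeout=180) as resp:
        return json.load(resp)
```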
+ ### Option 2 — MCP Mode (Claude Code, Cursor, Codex)
+
+ ```bash
+ # Build (same package, MCP transport)
  git clone https://github.com/roomi-fields/notebooklm-mcp.git
  cd notebooklm-mcp
  npm install && npm run build
@@ -82,7 +102,7 @@ npm install && npm run build
  # Claude Code
  claude mcp add notebooklm node /path/to/notebooklm-mcp/dist/index.js

- # Cursor - add to ~/.cursor/mcp.json
+ # Cursor — add to ~/.cursor/mcp.json
  {
    "mcpServers": {
      "notebooklm": {
@@ -95,24 +115,7 @@ claude mcp add notebooklm node /path/to/notebooklm-mcp/dist/index.js

  Then say: _"Log me in to NotebookLM"_ → Chrome opens → log in with Google.

- ### Option 2: HTTP REST API (n8n, Zapier, Make.com)
-
- ```bash
- git clone https://github.com/roomi-fields/notebooklm-mcp.git
- cd notebooklm-mcp
- npm install && npm run build
- npm run setup-auth  # One-time Google login
- npm run start:http  # Start server on port 3000
- ```
-
- ```bash
- # Query the API
- curl -X POST http://localhost:3000/ask \
-   -H "Content-Type: application/json" \
-   -d '{"question": "Explain X", "notebook_id": "my-notebook"}'
- ```
-
- ### Option 3: Docker (NAS, Server)
+ ### Option 3 — Docker (NAS, server, headless)

  ```bash
  # Build and run
@@ -131,19 +134,26 @@ See [Docker Guide](./deployment/docs/08-DOCKER.md) for NAS deployment (Synology,

  ## Documentation

- | Guide | Description |
- | ----- | ----------- |
- | [Installation](./deployment/docs/01-INSTALL.md) | Step-by-step setup for HTTP and MCP modes |
- | [Configuration](./deployment/docs/02-CONFIGURATION.md) | Environment variables and security |
- | [API Reference](./deployment/docs/03-API.md) | Complete HTTP endpoint documentation |
- | [n8n Integration](./deployment/docs/04-N8N-INTEGRATION.md) | Workflow automation setup |
- | [Troubleshooting](./deployment/docs/05-TROUBLESHOOTING.md) | Common issues and solutions |
- | [Notebook Library](./deployment/docs/06-NOTEBOOK-LIBRARY.md) | Multi-notebook management |
- | [Auto-Discovery](./deployment/docs/07-AUTO-DISCOVERY.md) | Autonomous metadata generation |
- | [Docker](./deployment/docs/08-DOCKER.md) | Docker and Docker Compose deployment |
- | [Multi-Interface](./deployment/docs/09-MULTI-INTERFACE.md) | Run Claude Desktop + HTTP simultaneously |
- | [Chrome Limitation](./docs/CHROME_PROFILE_LIMITATION.md) | Profile locking (solved in v1.3.6+) |
- | [Adding a Language](./docs/ADDING_A_LANGUAGE.md) | i18n system for multilingual UI support |
+ Full docs site: **<https://roomi-fields.github.io/notebooklm-mcp/>** · [OpenAPI 3.1 spec](./deployment/docs/openapi.yaml)
+
+ | Guide | Description |
+ | ----- | ----------- |
+ | [Installation](./deployment/docs/01-INSTALL.md) | Step-by-step setup for HTTP and MCP modes |
+ | [Configuration](./deployment/docs/02-CONFIGURATION.md) | Environment variables and security |
+ | [REST API reference](./deployment/docs/03-API.md) | Complete HTTP endpoint documentation (33 endpoints) |
+ | [Run 1 000 questions overnight](./deployment/docs/12-BATCH-1000.md) | Production batch pattern with auto-reauth and rotation |
+ | [**RTFM integration — cache as searchable vault**](./deployment/docs/14-RTFM-INTEGRATION.md) | Pipeline pattern: NotebookLM as one-shot ingestion, RTFM as retrieval layer. `/batch-to-vault` endpoint, `nblm-answer-v1` schema. |
+ | [n8n integration](./deployment/docs/04-N8N-INTEGRATION.md) | Workflow automation setup |
+ | [Troubleshooting](./deployment/docs/05-TROUBLESHOOTING.md) | Common issues and solutions |
+ | [Notebook library](./deployment/docs/06-NOTEBOOK-LIBRARY.md) | Multi-notebook management |
+ | [Auto-discovery](./deployment/docs/07-AUTO-DISCOVERY.md) | Autonomous metadata generation |
+ | [Content management](./deployment/docs/10-CONTENT-MANAGEMENT.md) | Audio, video, infographic, report, presentation |
+ | [Multi-account rotation](./deployment/docs/11-MULTI-ACCOUNT.md) | Multiple accounts with TOTP auto-reauth |
+ | [Docker](./deployment/docs/08-DOCKER.md) | Docker and Docker Compose deployment |
+ | [Multi-interface](./deployment/docs/09-MULTI-INTERFACE.md) | Run Claude Desktop + HTTP simultaneously |
+ | [**Compare with PleasePrompto v2.0.0**](./deployment/docs/13-COMPARE.md) | Feature matrix vs the upstream MCP-only server |
+ | [Chrome profile limitation](./docs/CHROME_PROFILE_LIMITATION.md) | Profile locking (solved in v1.3.6+) |
+ | [Adding a language](./docs/ADDING_A_LANGUAGE.md) | i18n system for multilingual UI support |

  ---

@@ -153,6 +163,14 @@ See [ROADMAP.md](./ROADMAP.md) for planned features and version history.

  **Latest releases:**

+ - **v1.7.0** — `batch_to_vault` exposed as a first-class MCP tool (parity with the HTTP endpoint, no localhost server required); shared `runBatchToVault` helper deduplicates the loop across both transports
+ - **v1.6.0** — `/batch-to-vault` endpoint + RTFM integration (`nblm-answer-v1` JSON Schema published at [schemas.roomi-fields.com/nblm-answer-v1.json](https://schemas.roomi-fields.com/nblm-answer-v1.json)) for caching NotebookLM answers as a searchable markdown vault
+ - **v1.5.9** — Restore `mcpName` field for MCP Registry npm-package ownership verification
+ - **v1.5.8** — NotebookLM 2026 UI adaptations (icon-label sanitization, Discussion-panel recovery, count-based source detection) — PR #5 by @KhizarJamshaidIqbal
+ - **v1.5.7** — Citation extraction selector fix (`.highlighted`) and Docker multi-stage build — PR #1 by @JulienCANTONI
+ - **v1.5.6** — Citation extraction major rewrite (97% success rate), browser-verified auth at startup, profile auto-sync
+ - **v1.5.5** — Multi-account state-path bug fix, Windows startup scripts, hidden-window MCP proxy
+ - **v1.5.4** — Mid-session auto-reauth with stored credentials, TOTP support
  - **v1.5.3** — Docker deployment with noVNC for visual authentication + NAS support (Synology, QNAP)
  - **v1.5.2** — Notebook scraping from NotebookLM + Bulk delete + Bug fixes
  - **v1.5.1** — Multilingual UI support (FR/EN) with i18n selector system + E2E tests (76 tests)
@@ -1,6 +1,15 @@
- # API Documentation - NotebookLM MCP HTTP Server
-
- > Complete reference for all REST endpoints
+ # NotebookLM REST API reference — 33 HTTP endpoints
+
+ > Complete reference for the NotebookLM HTTP REST API. Citation-backed Q&A,
+ > Studio content generation, notebook library, multi-account, sessions.
+ >
+ > **OpenAPI 3.1 spec:** [openapi.yaml](pathname:///openapi.yaml) (importable in
+ > Postman, Insomnia, Bruno, Swagger UI). Source in the repo at
+ > [`deployment/docs/openapi.yaml`](https://github.com/roomi-fields/notebooklm-mcp/blob/main/deployment/docs/openapi.yaml).
+ >
+ > **Companion guides:** [REST batch pattern (1 000 questions overnight)](/batch-1000-questions) ·
+ > [n8n integration](/notebooklm-n8n) · [Multi-account rotation](/notebooklm-multi-account) ·
+ > [Compare with PleasePrompto/notebooklm-mcp v2.0.0](/compare)

  ---

@@ -0,0 +1,165 @@
+ # Run 1 000 NotebookLM questions overnight
+
+ This is the pattern we use to run very long batches of citation-backed Q&A against Google NotebookLM — from PhD literature reviews to market intelligence pipelines. Single laptop, single account, eight hours, one thousand structured answers in a JSONL file.
+
+ The whole thing fits in **one shell loop**, because the project exposes a plain REST API on `http://localhost:3000`. There is no SDK to learn, no agent harness to configure, no MCP client to wire.
+
+ ## What you need
+
+ - This project running locally: `npm run setup-auth` (one-time Google login), then `npm run start:http`. [Install guide](/install).
+ - A list of questions in a text file, one per line.
+ - A notebook id. Either pick one from `GET /notebooks/scrape` or set a default with `PUT /notebooks/:id/activate`.
+ - Optionally: a second Google account for rotation. [Multi-account guide](/notebooklm-multi-account).
+
+ ## The minimum viable batch (10 lines of bash)
+
+ ```bash
+ NOTEBOOK_ID="paste-your-id-here"
+ INPUT="questions.txt"
+ OUTPUT="answers.jsonl"
+
+ while IFS= read -r question; do
+   curl -sS -X POST http://localhost:3000/ask \
+     -H 'Content-Type: application/json' \
+     -d "$(jq -n --arg q "$question" --arg n "$NOTEBOOK_ID" \
+       '{question: $q, notebook_id: $n, source_format: "json"}')" \
+     >> "$OUTPUT"
+   echo >> "$OUTPUT"
+ done < "$INPUT"
+ ```
+
+ That works. It does not survive a session expiry at hour 4, it does not throttle, it does not resume after a network blip, and it makes 1 000 sequential blocking calls. So for real batches we wrap it.
+
+ ## The production pattern
+
+ ```python
+ #!/usr/bin/env python3
+ """Run a batch of NotebookLM questions through the local REST API.
+
+ Resumes safely on restart, handles re-auth, rotates accounts, throttles to
+ respect rate limits, and writes one JSON line per answer with citations.
+ """
+
+ import json, time, sys
+ from pathlib import Path
+ import httpx
+
+ API = "http://localhost:3000"
+ NOTEBOOK_ID = "paste-your-id-here"
+ INPUT = Path("questions.txt")
+ OUTPUT = Path("answers.jsonl")
+ THROTTLE_SECONDS = 8  # average pace; tune to your account's quota
+ MAX_RETRIES = 3
+ ACCOUNTS = ["primary", "backup"]  # registered via `npm run accounts add`
+ account_idx = 0  # current position in ACCOUNTS; advances on rate limits
+
+ def already_done() -> set[str]:
+     """Resume support: skip questions already answered."""
+     if not OUTPUT.exists():
+         return set()
+     done = set()
+     for line in OUTPUT.read_text().splitlines():
+         try:
+             done.add(json.loads(line)["question"])
+         except (json.JSONDecodeError, KeyError):
+             continue
+     return done
+
+ def switch_account(name: str) -> None:
+     httpx.post(f"{API}/re-auth", json={"account": name}, timeout=120).raise_for_status()
+
+ def ask(question: str) -> dict:
+     global account_idx  # persist rotation state across questions
+     payload = {
+         "question": question,
+         "notebook_id": NOTEBOOK_ID,
+         "source_format": "json",
+     }
+     for attempt in range(1, MAX_RETRIES + 1):
+         try:
+             r = httpx.post(f"{API}/ask", json=payload, timeout=180)
+             r.raise_for_status()
+             data = r.json()
+             if data.get("success"):
+                 return data
+             # rate-limited or quota — try the next account
+             if "rate" in str(data.get("error", "")).lower():
+                 account_idx = (account_idx + 1) % len(ACCOUNTS)
+                 switch_account(ACCOUNTS[account_idx])
+                 continue
+         except httpx.HTTPError as e:
+             print(f"  attempt {attempt}: {e}", file=sys.stderr)
+             time.sleep(2 ** attempt)
+     raise RuntimeError(f"failed after {MAX_RETRIES} retries: {question}")
+
+ def main() -> None:
+     done = already_done()
+     questions = [q.strip() for q in INPUT.read_text().splitlines() if q.strip()]
+     todo = [q for q in questions if q not in done]
+     print(f"{len(done)} already answered · {len(todo)} to go")
+
+     with OUTPUT.open("a") as f:
+         for i, question in enumerate(todo, 1):
+             t0 = time.time()
+             answer = ask(question)
+             row = {
+                 "question": question,
+                 "answer": answer["answer"],
+                 "citations": answer.get("citations", []),
+                 "session_id": answer.get("session_id"),
+                 "elapsed_s": round(time.time() - t0, 1),
+             }
+             f.write(json.dumps(row, ensure_ascii=False) + "\n")
+             f.flush()
+             print(f"[{i}/{len(todo)}] {row['elapsed_s']}s · {len(row['citations'])} cites")
+             time.sleep(THROTTLE_SECONDS)
+
+ if __name__ == "__main__":
+     main()
+ ```
+
+ Save it as `batch.py`, drop your questions in `questions.txt`, run `python batch.py`. Kill it any time and re-run — it picks up where it left off.
+
+ ## Why this pattern works
+
+ ### One file in, one file out
+
+ Both ends are plain text. Your input is a `questions.txt` you can edit in any tool. Your output is `answers.jsonl` — JSON Lines, one answer per line, trivially loadable into pandas, jq, BigQuery, or another LLM for further processing:
+
+ ```bash
+ jq '.answer' answers.jsonl | wc -l
+ jq -c '{q: .question, n: (.citations | length)}' answers.jsonl
+ ```
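The same "load it anywhere" point, sketched in Python — reading the JSONL back in with row fields as written by the batch script above:

```python
import json
from pathlib import Path

def load_answers(path: str = "answers.jsonl") -> list[dict]:
    """One JSON object per line: question, answer, citations, session_id, elapsed_s."""
    return [json.loads(line)
            for line in Path(path).read_text().splitlines() if line.strip()]

def citation_counts(rows: list[dict]) -> dict[str, int]:
    """Map each question to how many citations its answer carried — a quick quality signal."""
    return {r["question"]: len(r.get("citations", [])) for r in rows}
```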
+
+ ### Resume on restart
+
+ Network drops, OS updates, batch scripts that get killed at 3am — they all happen. The first thing the script does is re-read the output file and skip questions whose text is already there. You lose the in-flight question and nothing else.
+
+ ### Auto-reauth across multi-hour runs
+
+ Google sessions don't survive the night. On every call the REST API checks the browser URL — the ground truth for session state — and logs back in automatically using credentials stored in the AES-256-GCM vault (`npm run setup-auth` puts them there). TOTP codes are computed on the fly, so 2FA-protected accounts work transparently. [Multi-account configuration](/notebooklm-multi-account).
+
+ ### Account rotation when one quota saturates
+
+ Free Google accounts hit a daily NotebookLM Q&A quota. The script flips to the next registered account on rate-limit errors via `POST /re-auth`. With two accounts you can typically push 1 500–2 000 questions in a 24-hour window without manual intervention.
+
+ ### Citations come back structured
+
+ `source_format: "json"` returns a `citations` array of `{id, source, excerpt}` objects directly attached to the answer. You can join citations back to your sources for downstream processing — fact-checking, page-number resolution, LaTeX `\cite{}` generation for a thesis bibliography.
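As a sketch of that downstream join — turning a `citations` array (shape `{id, source, excerpt}` per above) into a deduplicated `\cite{}` string. The key-derivation rule here is a hypothetical illustration, not part of the API:

```python
import re

def cite_key(source: str) -> str:
    """Derive a BibTeX-ish key from a citation's source name (illustrative rule)."""
    return re.sub(r"[^a-z0-9]+", "-", source.lower()).strip("-")

def latex_cites(citations: list[dict]) -> str:
    """Collapse an /ask citations array into one deduplicated \\cite{...} command."""
    keys: list[str] = []
    for c in citations:
        k = cite_key(c["source"])
        if k not in keys:
            keys.append(k)
    return "\\cite{" + ",".join(keys) + "}" if keys else ""
```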
+
+ ## Sizing your throttle
+
+ NotebookLM does not document a public rate limit, so we picked **8 seconds between calls** based on hundreds of overnight runs. That pacing alone caps you at ~450 questions an hour and ~3 600 in eight hours per account, before answer latency. If you see `rate limit` errors before that, raise the throttle to 12–15 seconds or add a third account. If you can sustain 5 seconds without hitting limits, go for it.
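To size a run before you start it, the arithmetic above can be wrapped in a helper. The default answer latency is an assumption — calibrate it from the `elapsed_s` column in your own `answers.jsonl`:

```python
def batch_hours(n_questions: int, throttle_s: float = 8,
                avg_answer_s: float = 20) -> float:
    """Wall-clock estimate: each question costs its answer latency plus the throttle.

    avg_answer_s=20 is an assumed figure; with the guide's 8 s throttle it puts
    1 000 questions at roughly eight hours, matching the overnight envelope.
    """
    return n_questions * (throttle_s + avg_answer_s) / 3600
```

For example, `batch_hours(1000)` comes out just under eight hours, and dropping the throttle to 5 seconds shaves under an hour off — useful for deciding whether a second account is worth registering.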
+
+ ## When to switch to MCP mode instead
+
+ If your driver is a coding agent (Claude Code, Cursor, Codex) rather than a script, the same operations are exposed as MCP tools. Use that surface when you want the agent to reason about which question to ask next; use the REST API when you have a flat list to grind through. [Both modes ship from the same package](/install).
+
+ ## What this gives you in practice
+
+ For one of our PhD use cases we run 100–200 questions per chapter across a thirty-chapter thesis library. That's 5 000+ structured answers with citations, computed overnight on a laptop, indexed back into the thesis as `\cite{}`-ready snippets. Total cost: zero (it uses your own NotebookLM account); total infrastructure: one Node process and one Python script.
+
+ ## Next steps
+
+ - [HTTP API reference](/notebooklm-rest-api) — every endpoint, every parameter.
+ - [n8n integration guide](/notebooklm-n8n) — same pattern but as a visual workflow.
+ - [Multi-account guide](/notebooklm-multi-account) — register a second Google account for rotation.
+ - [Compare with PleasePrompto](/compare) — when this project is the right pick over the upstream MCP-only server.
@@ -0,0 +1,12 @@
+ # Comparing NotebookLM MCP servers
+
+ There are a handful of community projects that automate Google NotebookLM. The two most active TypeScript implementations are **`PleasePrompto/notebooklm-mcp`** (the original, MCP-only) and **`@roomi-fields/notebooklm-mcp`** (this project — REST API + MCP, Studio-complete, batch-oriented).
+
+ This page is an honest, side-by-side comparison so you can pick the right one for your workflow. Both are MIT-licensed and actively maintained.
+
+ > **TL;DR**
+ >
+ > - You want a quick **MCP server** for Claude Code / Cursor / Codex with citations and audio overviews → either works; PleasePrompto v2 is a clean, MCP-spec-first build.
+ > - You need a **REST API** to call from n8n / Zapier / Make / curl / any non-MCP client → use this project. PleasePrompto v2 ships MCP-over-HTTP (Streamable HTTP transport), which is **not** a REST API.
+ > - You generate **video / infographic / presentation / data table** content from NotebookLM Studio → use this project. PleasePrompto v2 ships audio only; the rest is deferred to a follow-up.
+ > - You run **long batches** (PhD research, market reports, content pipelines) and need **auto-reauth with TOTP** → use this project. PleasePrompto v2 explicitly does not store credentials.