pi-research 1.4.1 → 1.5.0
- package/README.md +30 -86
- package/package.json +5 -3
package/README.md
CHANGED
|
@@ -1,100 +1,29 @@
|
|
|
1
|
-
#
|
|
1
|
+
# ⚠️ This package has moved
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
> **`pi-research` is deprecated and will no longer receive updates.**
|
|
4
4
|
|
|
5
|
-
[
|
|
6
|
-
[](https://github.com/endgegnerbert-tech/pi-research)
|
|
7
|
-
[](https://pi.ai)
|
|
5
|
+
## 👉 Please migrate to [`emet`](https://www.npmjs.com/package/emet)
|
|
8
6
|
|
|
9
|
-
**The Zero-Setup Research Engine for Autonomous AI Agents.**
|
|
10
|
-
|
|
11
|
-
`pi-research` is an advanced grounding tool designed specifically for AI coding agents. It prevents agents from hallucinating API endpoints, guessing library versions, or inventing CVE details by injecting real-time, highly authoritative, and conflict-resolved web research directly into their context window.
|
|
12
|
-
|
|
13
|
-

|
|
14
|
-
|
|
15
|
-
## 💡 Why `pi-research`?
|
|
16
|
-
|
|
17
|
-
The world does not need just another "AI Search Engine"—there are plenty of massive, standalone research tools out there.
|
|
18
|
-
|
|
19
|
-
Instead, `pi-research` was built specifically to solve a crucial problem in the **Agentic Workflow**: When an autonomous agent is deep in a coding loop, compiling errors, or debugging, it needs hard facts instantly without losing focus. Calling out to heavy external search services or trying to execute brittle Playwright scripts breaks the agent's flow, wastes context window tokens, and leads to hallucinations.
|
|
20
|
-
|
|
21
|
-
`pi-research` solves this by providing a lightweight, internal **cognitive research loop** directly into the agent harness:
|
|
22
|
-
1. **Agent-Centric Routing:** It knows exactly where developers look (GitHub, NPM, NIST, arXiv).
|
|
23
|
-
2. **Authority First:** It prioritizes official documentation over random SEO-optimized tutorials.
|
|
24
|
-
3. **Self-Awareness:** It extracts structured features to know when it lacks information, safely triggering follow-up questions *before* returning an answer to the agent.
|
|
25
|
-
|
|
26
|
-
Best of all? **Zero setup.** No external search API keys to configure, no heavy local LLMs to run, and no flaky browser automation scripts to maintain. It's built to run silently and reliably alongside your agent.
|
|
27
|
-
|
|
28
|
-
---
|
|
29
|
-
|
|
30
|
-
## ✨ Features
|
|
31
|
-
|
|
32
|
-
- 🚀 **Lightning Fast:** Powered by a Hybrid Tiny-Router Architecture (Model2Vec + SVC), routing queries in **< 0.6 milliseconds**.
|
|
33
|
-
- 🛡️ **Anti-Hallucination:** Built-in Veto-Power for high-risk queries. If a security question only finds blog posts, the system forces a follow-up to find authoritative NIST/CVE data.
|
|
34
|
-
- 🕸️ **Resilient Fetching:** Pre-emptively escalates blocked, JS-heavy, or thin pages through an integrated, robust Python `Scrapling` daemon (via IPC JSON-RPC 2.0).
|
|
35
|
-
- 🧩 **Domain Packs:** Built-in heuristics for `github`, `security`, `papers`, `package-registry`, and more.
|
|
36
|
-
- 📊 **Structured Outputs:** Returns citations, code blocks, missing aspects, confidence scores, and conflict summaries (e.g., "Source A contradicts Source B").
|
|
37
|
-
- 📂 **Local Context:** Ingests local files (`options.files`) to ground web research in your current repository context.
|
|
38
|
-
|
|
39
|
-
---
|
|
40
|
-
|
|
41
|
-
## 📦 Installation
|
|
42
|
-
|
|
43
|
-
### Pi Coding Agent (Extension)
|
|
44
|
-
If you are using the Pi Agent harness, install the extension directly:
|
|
45
|
-
```bash
|
|
46
|
-
pi install npm:pi-research
|
|
47
|
-
```
|
|
48
|
-
|
|
49
|
-
### Node.js / NPM (Standalone Server)
|
|
50
|
-
Install it globally to expose the MCP (Model Context Protocol) server for any compatible AI agent:
|
|
51
7
|
```bash
|
|
52
|
-
|
|
53
|
-
pi-research
|
|
54
|
-
```
|
|
55
|
-
*(The MCP server identifies itself as `unblind-mcp`, exposing the tool `pi-research`)*
|
|
8
|
+
# Uninstall old package
|
|
9
|
+
npm uninstall -g pi-research
|
|
56
10
|
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
## 🚀 Quick Start / Usage
|
|
60
|
-
|
|
61
|
-
Once installed, your agent has access to the `pi-research` tool. It accepts a `query`, a `mode`, and various `options`.
|
|
62
|
-
|
|
63
|
-
### Modes
|
|
64
|
-
| Mode | Best for |
|
|
65
|
-
| --- | --- |
|
|
66
|
-
| `fast` | Quick factual lookups (e.g., "What is the latest LTS version of Node.js?"). Stops fetching early if authoritative sources are found. |
|
|
67
|
-
| `deep` | Broader retrieval with automatic follow-up rounds. Perfect for comparisons, conflicts, or unclear architecture questions. |
|
|
68
|
-
| `code` | Docs, repositories, README-driven answers, and retrieving actual code snippets. |
|
|
69
|
-
| `academic` | Scholarly sources, DOI links, and paper-heavy topics. |
|
|
70
|
-
|
|
71
|
-
### Example Tool Calls (For Agents)
|
|
72
|
-
**Factual Lookup:**
|
|
73
|
-
```json
|
|
74
|
-
{
|
|
75
|
-
"query": "React 19 RC release notes",
|
|
76
|
-
"mode": "fast",
|
|
77
|
-
"options": { "requireAuthoritative": true }
|
|
78
|
-
}
|
|
79
|
-
```
|
|
11
|
+
# Install new package
|
|
12
|
+
npm install -g emet
|
|
80
13
|
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
{
|
|
84
|
-
"query": "Compare PostgreSQL and MySQL for multi-tenant SaaS",
|
|
85
|
-
"mode": "deep",
|
|
86
|
-
"options": { "preferRecent": true, "maxTurns": 2 }
|
|
87
|
-
}
|
|
14
|
+
# Pi Extension
|
|
15
|
+
pi install npm:emet
|
|
88
16
|
```
|
|
89
17
|
|
|
90
18
|
---
|
|
91
19
|
|
|
92
|
-
|
|
20
|
+
### What changed?
|
|
93
21
|
|
|
94
|
-
|
|
22
|
+
`pi-research` has been rebranded to **`emet`** (Hebrew for truth/fact) — a cleaner, standalone identity that better reflects what the tool does: it grounds AI agents in factual, real-time context.
|
|
95
23
|
|
|
24
|
+
<<<<<<< HEAD
|
|
96
25
|
- **Model2Vec & SVC:** Queries are classified via locally embedded features. Security and paper queries have a 0% downgrade rate.
|
|
97
|
-
- **Structured ML:** Instead of asking a heavy LLM "Is this enough data?", the system extracts deterministic features (`has_authority`, `conflict_state`) and uses an ultra-fast Logistic Regression model to evaluate sufficiency and follow-up actions
|
|
26
|
+
- **Structured ML:** Instead of asking a heavy LLM "Is this enough data?", the system extracts deterministic features (`has_authority`, `conflict_state`) and uses an ultra-fast Logistic Regression model to evaluate sufficiency and follow-up actions, which achieved 100% accuracy on the included `eval_unseen_hard.js` benchmark dataset (121 test cases).
|
|
98
27
|
- **Node.js-to-Python IPC:** Operates entirely locally using a highly optimized, line-delimited JSON-RPC daemon to manage Python dependencies (`Scrapling`, `Model2Vec`) without memory leaks.
|
|
99
28
|
|
|
100
29
|
---
|
|
@@ -102,7 +31,7 @@ With `1.4.0`, `pi-research` shifted from heavy, generative JSON-planners to a **
|
|
|
102
31
|
## 🛣️ Future Roadmap
|
|
103
32
|
|
|
104
33
|
We are actively working on scaling the reasoning capabilities:
|
|
105
|
-
- **LLM Data Augmentation (Weak Supervision):** Generating synthetic training data for underconfident domains to boost zero-shot accuracy
|
|
34
|
+
- **LLM Data Augmentation (Weak Supervision):** Generating synthetic training data for underconfident domains to boost zero-shot accuracy, targeting >95% without manual labeling.
|
|
106
35
|
- **Active Learning Telemetry Loop:** Clustering low-confidence predictions from cache logs into a weakly-supervised retraining pipeline to let the system "self-heal."
|
|
107
36
|
- **Cross-Encoder for Conflict Detection:** Transitioning to a fine-tuned Cross-Encoder (e.g., MiniLM + Natural Language Inference) to detect deep semantic contradiction across differing texts (e.g., recognizing that "Node 20 is stable" contradicts "Node 20 is broken").
|
|
108
37
|
|
|
@@ -111,4 +40,19 @@ We are actively working on scaling the reasoning capabilities:
|
|
|
111
40
|
## 📝 License & Notices
|
|
112
41
|
- **License:** MIT
|
|
113
42
|
- **Third-party notices:** See `THIRD_PARTY_NOTICES.md`
|
|
114
|
-
- **GitHub:** [https://github.com/endgegnerbert-tech/pi-research](https://github.com/endgegnerbert-tech/pi-research)
|
|
43
|
+
- **GitHub:** [https://github.com/endgegnerbert-tech/pi-research](https://github.com/endgegnerbert-tech/pi-research)
|
|
44
|
+
=======
|
|
45
|
+
| Old | New |
|
|
46
|
+
|---|---|
|
|
47
|
+
| `pi-research` | `emet` |
|
|
48
|
+
| `unblind-mcp` (MCP server) | `emet-mcp` |
|
|
49
|
+
| `PI_RESEARCH_*` env vars | `EMET_*` env vars |
|
|
50
|
+
| `pi-research` tool name | `emet` tool name |
|
|
51
|
+
|
|
52
|
+
All features, modes (`fast`, `deep`, `code`, `academic`), and the zero-setup
|
|
53
|
+
architecture carry over 100%. No functionality was removed.
|
|
54
|
+
|
|
55
|
+
---
|
|
56
|
+
|
|
57
|
+
**GitHub:** [tomsej/emet](https://github.com/tomsej/emet)
|
|
58
|
+
>>>>>>> 72ee46a (chore: deprecate package, move to emet)
|
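The rename table added to the README says that `PI_RESEARCH_*` environment variables become `EMET_*`. The diff does not list the individual variables, so the following is only a minimal sketch of how an existing environment could be carried over, assuming that only the prefix changes and the rest of each name stays the same. The helper `migrateEnv` and the variable `PI_RESEARCH_CACHE_DIR` are hypothetical and are not part of either package.

```javascript
// Sketch: copy PI_RESEARCH_* settings to their EMET_* equivalents.
// Assumption: only the prefix changes; the suffix of each name is kept as-is.
function migrateEnv(env) {
  const OLD_PREFIX = "PI_RESEARCH_";
  const NEW_PREFIX = "EMET_";
  // Start from a copy so the caller's object is not mutated.
  const migrated = { ...env };
  for (const [name, value] of Object.entries(env)) {
    if (!name.startsWith(OLD_PREFIX)) continue;
    const newName = NEW_PREFIX + name.slice(OLD_PREFIX.length);
    // Do not overwrite a value that is already set under the new name.
    if (!(newName in migrated)) migrated[newName] = value;
  }
  return migrated;
}

// Hypothetical variable name, used only for illustration.
const result = migrateEnv({ PI_RESEARCH_CACHE_DIR: "/tmp/cache", PATH: "/usr/bin" });
console.log(result.EMET_CACHE_DIR); // prints "/tmp/cache"
```

The old names are left in place next to the new ones, so the same environment works during the switch-over; they can be dropped once `pi-research` has been uninstalled.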
package/package.json
CHANGED
|
@@ -1,9 +1,9 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "pi-research",
|
|
3
|
-
"version": "1.
|
|
3
|
+
"version": "1.5.0",
|
|
4
4
|
"private": false,
|
|
5
5
|
"type": "module",
|
|
6
|
-
"description": "
|
|
6
|
+
"description": "⚠️ DEPRECATED: This package has moved to 'emet'. Run: npm install -g emet",
|
|
7
7
|
"license": "MIT",
|
|
8
8
|
"main": "./index.js",
|
|
9
9
|
"bin": {
|
|
@@ -33,7 +33,9 @@
|
|
|
33
33
|
"url": "https://github.com/endgegnerbert-tech/pi-research/issues"
|
|
34
34
|
},
|
|
35
35
|
"keywords": [
|
|
36
|
-
"
|
|
36
|
+
"deprecated",
|
|
37
|
+
"moved",
|
|
38
|
+
"use-emet"
|
|
37
39
|
],
|
|
38
40
|
"scripts": {
|
|
39
41
|
"test": "node --test",
|