@pi-unipi/cocoindex 2.0.0
- package/README.md +93 -0
- package/bridge.ts +774 -0
- package/commands.ts +175 -0
- package/index.ts +55 -0
- package/installer.ts +397 -0
- package/package.json +42 -0
- package/skills/cocoindex/SKILL.md +88 -0
- package/tools.ts +131 -0
package/README.md
ADDED
@@ -0,0 +1,93 @@

# @pi-unipi/cocoindex

CocoIndex integration for the Pi coding agent: AST-aware content indexing, semantic vector search, and incremental pipeline management.

## Overview

Replaces the compactor's FTS5-based content indexing with [CocoIndex](https://cocoindex.io/), providing:

- **AST-aware code chunking**: language-aware splitting for code files
- **Semantic vector search**: find content by meaning, not just keywords
- **Incremental indexing**: only reprocesses changed files (delta-only)
- **LanceDB storage**: zero-config, local file-based vector database
- **Shared embeddings**: reuses the memory package's OpenRouter API key and model
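
The delta-only behaviour listed above can be pictured with content hashing. The following is a minimal sketch under stated assumptions, not CocoIndex's actual mechanism: `plan_reindex`, `content_hash`, and the path-to-hash manifest are hypothetical names introduced here for illustration, and the real engine tracks change state internally.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a stable fingerprint of a file's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(previous: dict[str, str], current_files: dict[str, str]):
    """Decide which files need reprocessing.

    previous:      path -> hash recorded by the last run (hypothetical manifest)
    current_files: path -> current file content
    Returns (changed_paths, removed_paths, new_manifest).
    """
    new_manifest = {path: content_hash(text) for path, text in current_files.items()}
    # A file is "changed" if it is new or its hash differs from the last run.
    changed = sorted(p for p, h in new_manifest.items() if previous.get(p) != h)
    # A file is "removed" if it was indexed before but no longer exists.
    removed = sorted(p for p in previous if p not in new_manifest)
    return changed, removed, new_manifest
```

On a first run with an empty manifest every file is reported as changed; on later runs only edited or new files are, which is the property the "delta-only" bullet describes.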

## Prerequisites

1. **Python 3.10+**
2. **CocoIndex CLI**: `pip install cocoindex 'cocoindex[lancedb]'` (requires cocoindex >= 1.0)
3. **LanceDB SDK** (optional, for search): `npm install @lancedb/lancedb`
4. **Embedding API key**: configured via `/unipi:memory-settings`

## Quick Start

```
# 1. Initialize the pipeline (once per project)
/unipi:cocoindex-init

# 2. Index the project
/unipi:cocoindex-update

# 3. Search indexed content
cocoindex_search({ query: "how does authentication work?" })
```

## Architecture

```
Project files ──→ localfs.walk_dir (recursive)
                        │
                        ▼
                  chunk_text (@coco.fn, memoized)
                        │
                        ▼
                  LanceDB target (via ContextKey)
                        │
                        ▼
                  Vector search → ranked results
```

Uses the cocoindex v1.0+ App/fn/mount API with:
- `@coco.lifespan` for async environment setup (LanceDB connection)
- `@coco.fn` for memoized processing functions
- `coco.mount()` / `coco.mount_target()` for component management
- `localfs.walk_dir` for file enumeration
- `lancedb.TableTarget` for row-level target state management
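
To illustrate what a memoized processing function buys, here is a sketch that uses the standard library's `functools.lru_cache` as a stand-in for `@coco.fn` memoization. It is not the CocoIndex API: the signature of `chunk_text`, the `max_chars` parameter, and the `CALLS` counter are assumptions made for illustration, and the chunker is a plain paragraph packer, not an AST-aware one.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts real (non-cached) executions, for demonstration only


@lru_cache(maxsize=None)
def chunk_text(text: str, max_chars: int = 40) -> tuple[str, ...]:
    """Split on blank lines, then pack paragraphs into chunks of up to max_chars.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    CALLS["count"] += 1
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    # Return a tuple so the cached value cannot be mutated by callers.
    return tuple(chunks)
```

Calling `chunk_text` twice with identical arguments runs the body once; the second call is served from the cache. This is the same idea that lets unchanged files skip reprocessing.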

## Tools

| Tool | Description |
|------|-------------|
| `cocoindex_search` | Search indexed content (semantic vector when available, LanceDB FTS when available, lexical fallback for text-only indexes) |
| `cocoindex_status` | Check indexing status, freshness, doc count |
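
The fallback order described for `cocoindex_search` can be sketched as a small decision function. This is a sketch only: `TableInfo`, its fields, `pick_search_mode`, and the `embeddings_available` flag are hypothetical names, and the precedence (vector, then FTS, then lexical) is inferred from the order in the table, not taken from the package's TypeScript sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical summary of what an index table supports."""
    has_vector_column: bool
    has_fts_index: bool


def pick_search_mode(table: TableInfo, embeddings_available: bool) -> str:
    """Choose the richest search mode the table can serve."""
    # Vector search needs both stored vectors and a way to embed the query.
    if table.has_vector_column and embeddings_available:
        return "vector"
    # Otherwise use a full-text index if one exists.
    if table.has_fts_index:
        return "fts"
    # Last resort: scan the text column directly.
    return "lexical"
```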

## Commands

| Command | Description |
|---------|-------------|
| `/unipi:cocoindex-update` | Run incremental indexing |
| `/unipi:cocoindex-status` | Show pipeline status |
| `/unipi:cocoindex-init` | Scaffold default pipeline |
| `/unipi:cocoindex-settings` | View configuration |

## Configuration

- **Pipeline**: `.unipi/cocoindex/main.py` (auto-generated, fully customizable)
- **Data store**: `.unipi/cocoindex/.lancedb/`
- **Embeddings**: `~/.unipi/memory/config.json` (shared with memory package)
- **Search fallback**: Existing text-only LanceDB tables remain searchable through a lexical scan fallback when no vector column or FTS index exists
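
A lexical scan fallback of the kind mentioned in the last bullet can be as simple as scoring each stored row by how often the query terms occur in it. The sketch below assumes rows are dictionaries with a `text` field; the field name, the tokenizer, and the scoring rule are illustrative assumptions, not the package's implementation.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def lexical_scan(rows: list[dict], query: str, limit: int = 5) -> list[dict]:
    """Rank rows by total occurrences of query terms; drop rows with no match."""
    terms = set(tokenize(query))
    scored = []
    for row in rows:
        score = sum(1 for tok in tokenize(row["text"]) if tok in terms)
        if score > 0:
            scored.append((score, row))
    # The sort is stable, so rows with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored[:limit]]
```

A full scan like this is linear in the number of rows, which is acceptable as a fallback but is the reason a vector column or FTS index is preferred when available.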

## What Changed from FTS5

This package replaces compactor's content indexing subsystem:

| Feature | Before (FTS5) | After (CocoIndex) |
|---------|---------------|-------------------|
| Chunking | Heading/paragraph | AST-aware recursive |
| Search | BM25 + trigram | Vector + full-text |
| Incremental | No (full re-index) | Yes (delta-only) |
| Storage | SQLite FTS5 | LanceDB |

## Status

⚠️ **Experimental**: this feature lives on the `experiment/cocoindex` branch and is not yet merged to main.