agent-library 0.7.0__py3-none-any.whl

agent_library-0.7.0.dist-info/METADATA ADDED
@@ -0,0 +1,201 @@
+ Metadata-Version: 2.4
+ Name: agent-library
+ Version: 0.7.0
+ Summary: Markdown document management system with vector and full-text search
+ Author-email: "Arcade.dev" <dev@arcade.dev>
+ License-Expression: MIT
+ License-File: LICENSE
+ Keywords: ai,embeddings,markdown,mcp,search,vector
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Environment :: Console
+ Classifier: Intended Audience :: Developers
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
+ Classifier: Topic :: Text Processing :: Indexing
+ Requires-Python: >=3.10
+ Requires-Dist: arcade-mcp-server>=0.1.0
+ Requires-Dist: numpy>=1.24.0
+ Requires-Dist: python-frontmatter>=1.0.0
+ Requires-Dist: pyyaml>=6.0
+ Requires-Dist: rich>=13.0.0
+ Requires-Dist: sentence-transformers>=2.2.0
+ Requires-Dist: sqlite-vec>=0.1.0
+ Requires-Dist: typer[all]>=0.9.0
+ Provides-Extra: dev
+ Requires-Dist: mypy>=1.5.1; extra == 'dev'
+ Requires-Dist: pre-commit>=3.4.0; extra == 'dev'
+ Requires-Dist: pytest-asyncio>=0.25.0; extra == 'dev'
+ Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
+ Requires-Dist: pytest>=8.3.0; extra == 'dev'
+ Requires-Dist: ruff>=0.7.4; extra == 'dev'
+ Requires-Dist: types-pyyaml>=6.0.1; extra == 'dev'
+ Provides-Extra: evals
+ Requires-Dist: arcade-mcp[evals]>=1.0.0; extra == 'evals'
+ Description-Content-Type: text/markdown
+
+ # Librarian
+
+ A personal knowledge library for AI agents, built on [Arcade](https://arcade.dev) for the Model Context Protocol (MCP).
+
+ ## Overview
+
+ Librarian provides AI agents with persistent storage for text, documents, and knowledge. Agents can store information and retrieve it later through semantic and keyword search, maintaining context across conversations.
+
+ ```mermaid
+ graph LR
+     A[Agent Stores Info] --> B[Parser]
+     B --> C[Chunker]
+     C --> D[Embedder]
+     D --> E[(SQLite + vec)]
+     F[Agent Queries] --> G[Hybrid Search]
+     E --> G
+     G --> H[Relevant Context]
+ ```
+
+ ## Features
+
+ - Persistent knowledge storage for AI agents
+ - SQLite storage with `sqlite-vec` for vector search
+ - Full-text search using FTS5 with BM25 ranking
+ - Hybrid search combining semantic and keyword matching
+ - Max Marginal Relevance (MMR) for diverse results
+ - Configurable embedding models (local or OpenAI-compatible API)
+ - Header-aware text chunking with overlap
+ - Time-bounded search filters
+ - CLI and MCP server interfaces
+
+ ## Installation
+
+ ```bash
+ git clone https://github.com/ArcadeAI/librarian.git
+ cd librarian
+ ./setup.sh
+ ```
+
+ Or install manually:
+
+ ```bash
+ uv pip install -e ".[dev]"
+ ```
+
+ ## CLI Usage
+
+ ```bash
+ # Add files to the library
+ libr add ~/notes
+
+ # Search the library
+ libr search "machine learning concepts"
+
+ # List sources
+ libr list
+
+ # View library statistics
+ libr index
+
+ # Rebuild the index
+ libr index build
+ ```
+
+ ## MCP Server
+
+ Start the server for AI assistant integration:
+
+ ```bash
+ # stdio transport (Claude Desktop, CLI)
+ libr serve stdio
+
+ # HTTP transport (Cursor, VS Code)
+ libr serve http --port 8000
+ ```
+
+ See the [Arcade MCP documentation](https://docs.arcade.dev) for integration details.
+
+ ### Available Tools
+
+ | Tool | Description |
+ |------|-------------|
+ | `Librarian_SearchLibrary` | Search the library with hybrid vector + keyword search |
+ | `Librarian_SemanticSearchLibrary` | Find content by meaning (semantic similarity) |
+ | `Librarian_KeywordSearchLibrary` | Find content by exact keywords |
+ | `Librarian_SearchLibraryByDates` | Search within a date range |
+ | `Librarian_AddToLibrary` | Store new content in the library |
+ | `Librarian_UpdateLibraryDoc` | Update existing content |
+ | `Librarian_ReadFromLibrary` | Read full document content |
+ | `Librarian_RemoveFromLibrary` | Remove content from the library |
+ | `Librarian_ListLibraryContents` | List all stored content |
+ | `Librarian_IndexDirectoryToLibrary` | Bulk import files |
+ | `Librarian_GetLibrarySources` | List sources with document/chunk counts |
+ | `Librarian_GetLibraryStats` | Overall library statistics |
+
+ ## Configuration
+
+ Set via environment variables:
+
+ | Variable | Default | Description |
+ |----------|---------|-------------|
+ | `DOCUMENTS_PATH` | `./documents` | Root directory for files |
+ | `DATABASE_PATH` | `~/.librarian/index.db` | SQLite database location |
+ | `EMBEDDING_PROVIDER` | `openai` | `local` or `openai` |
+ | `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | Local model name |
+ | `OPENAI_API_BASE` | `http://localhost:7171/v1` | OpenAI-compatible API URL |
+ | `OPENAI_EMBEDDING_MODEL` | `qwen3-embedding-06b` | API model name |
+ | `CHUNK_SIZE` | `512` | Max characters per chunk |
+ | `CHUNK_OVERLAP` | `50` | Overlap between chunks |
+ | `SEARCH_LIMIT` | `10` | Default results limit |
+ | `MMR_LAMBDA` | `0.5` | MMR diversity (0=diverse, 1=relevant) |
+ | `HYBRID_ALPHA` | `0.7` | Vector vs keyword weight (1=vector only) |
+
+ ## Project Structure
+
+ ```
+ librarian/
+ ├── cli.py              # Command-line interface
+ ├── server.py           # MCP server and tool definitions
+ ├── config.py           # Configuration management
+ ├── indexing.py         # Document indexing service
+ ├── types.py            # Shared type definitions
+ ├── storage/
+ │   ├── database.py     # SQLite operations
+ │   ├── vector_store.py # sqlite-vec search
+ │   └── fts_store.py    # FTS5 search
+ ├── processing/
+ │   ├── embed/          # Embedding providers
+ │   ├── parsers/        # Document parsers
+ │   └── transform/      # Text chunking
+ ├── retrieval/
+ │   └── search.py       # Hybrid search + MMR
+ └── utils/
+     └── timeframe.py    # Time filter utilities
+ ```
+
+ ## Development
+
+ ```bash
+ make install    # Install dependencies
+ make test       # Run tests
+ make lint       # Run linter
+ make format     # Format code
+ make typecheck  # Type checking
+ make check      # All checks
+ make evals      # Run evaluations
+ ```
+
+ ## Resources
+
+ - [Arcade.dev](https://arcade.dev) - Build AI-native applications
+ - [Arcade Documentation](https://docs.arcade.dev) - Integration guides and API reference
+
+ ## License
+
+ MIT License - see [LICENSE](LICENSE) for details.
+
+ ## Contact
+
+ - Email: <contact@arcade.dev>
+ - Website: [arcade.dev](https://arcade.dev)
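Review note on `HYBRID_ALPHA` and `MMR_LAMBDA`: the README describes these settings only by their end points (1 = vector only; 0 = diverse, 1 = relevant). The sketch below shows one common way such parameters are applied: a linear blend of two normalized scores, followed by greedy Max Marginal Relevance selection. It is an illustration under those assumptions, not the code in `librarian/retrieval/search.py`, which this diff does not show. The names `hybrid_score`, `mmr_select`, and `_cosine` are hypothetical.

```python
import math


def _cosine(a, b):
    """Cosine similarity of two non-zero vectors (assumed non-zero here)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def hybrid_score(vector_score, keyword_score, alpha=0.7):
    """Blend two scores assumed to be normalized to [0, 1].

    alpha=1.0 uses the vector score only; alpha=0.0 uses the keyword score only.
    """
    return alpha * vector_score + (1.0 - alpha) * keyword_score


def mmr_select(query, docs, k, lam=0.5):
    """Greedily pick up to k document indices by Max Marginal Relevance.

    lam=1.0 ranks purely by relevance to the query; lower values penalize
    candidates that are similar to documents already selected.
    """
    relevance = [_cosine(query, d) for d in docs]
    selected = []
    remaining = list(range(len(docs)))
    while remaining and len(selected) < k:

        def mmr(i):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (_cosine(docs[i], docs[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    docs = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
    query = [1.0, 0.5]
    print(mmr_select(query, docs, k=2, lam=1.0))  # relevance only
    print(mmr_select(query, docs, k=2, lam=0.5))  # trades relevance for diversity
```

With the toy vectors above, documents 0 and 1 are near-duplicates, so lowering `lam` swaps the second pick to the dissimilar document 2.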
agent_library-0.7.0.dist-info/RECORD ADDED
@@ -0,0 +1,36 @@
+ librarian/.env.example,sha256=osYXUMrQ9obx7S8gf-XgwJ5YcvIkTr2suxHwOhIhDQ8,318
+ librarian/__init__.py,sha256=YNH8ZTOY-3NBfvvv4V5kImTzvLiA4F-gTgqr5sDkejg,1172
+ librarian/cli.py,sha256=xXpOoYGnsewqHRj765NKNEIat2l_LJ0oUALN_OlyrvM,40672
+ librarian/config.py,sha256=PvCx0XD4sE5nZQs40pqmfshSv1McT9hMJhbTagcEXj8,4219
+ librarian/indexing.py,sha256=Dt2t7YhCy4cJ0aIzCpMcXOykbhu_XOq29MmKOSK5028,4691
+ librarian/server.py,sha256=fJkzmPox6FGIa3q3jI22w5kHKKpe1o5sAwELnM-Y980,39255
+ librarian/types.py,sha256=_kVn7m9AdaL1aAus4aFLddrYsY6qDmCPFxuN4-pkV7Y,3958
+ librarian/processing/__init__.py,sha256=LhZ20oWS4RnbOZIxgt-qyKz7riO32mLsiiYovCQo55Y,519
+ librarian/processing/manager.py,sha256=WuQkkut-wUExXDG4Dukz9ucW1yfkpfv7tMlsluvmvHo,7854
+ librarian/processing/embed/__init__.py,sha256=YGKjFrRBMJ6QwfGZyyuu9RI3hpqljGR6P6g6lk0PqtU,5181
+ librarian/processing/embed/base.py,sha256=tB9gEFpSC4gKjV0OBWhSc2MqK_dSPMWvns_XDnjIG8s,2374
+ librarian/processing/embed/local.py,sha256=U2grCyLvUFF3AhpznZ1ab4K36Ez2j0ESQCjTWj9q8mg,5850
+ librarian/processing/embed/openai.py,sha256=8IAF7OTkGstp7hxN1HCqDOeP3Dy4euDEE03beZNXB40,8169
+ librarian/processing/parsers/__init__.py,sha256=lRAcqtFHqt5DUUzERIwqVryQr4HyqKB_ascaEAPGsYo,433
+ librarian/processing/parsers/base.py,sha256=w24W6L6Hnhf-5z0KQyiutKQTSaruPG6RPZF6Bk2NXX8,2062
+ librarian/processing/parsers/md.py,sha256=njUWb2AGxJC4bgY8KNQajN73lrkzAFKZW1oNGdWr1Yo,7240
+ librarian/processing/parsers/obsidian.py,sha256=cyZgN-eWJRLw8RVPYFLvEkXuD2jUytcRp6hXrMU3lDc,5400
+ librarian/processing/transform/__init__.py,sha256=iMuiKGNG7y3Fh_AhI5lwk_OzyQUs38HuzfQTuCLWZfo,217
+ librarian/processing/transform/chunker.py,sha256=2cRzuBiZmHL2dRLuXH6gtUPjBNlhjSEvcXUGvleri5s,12233
+ librarian/providers/__init__.py,sha256=TUkDXQzW-lvzbqtTHcT8L9JAEoyRRX4ySpAA6Pu_ToM,779
+ librarian/providers/anthropic.py,sha256=AlPD-wNLRihUowt6vUZIDMOdZMAQpMVMWi16Wc7gS9A,6709
+ librarian/providers/base.py,sha256=VgVQPPBtMld5fX4LBSD-PZEwGjOLb2C_XmeRInKZnQg,3232
+ librarian/providers/oai.py,sha256=iRbijYFvV8TyrE6LBZYtQ41KeeGUttGm2c5JbP15UHM,5770
+ librarian/retrieval/__init__.py,sha256=w9IOWBp11l-6e8EPsZTVCDkY0NybJ7ge6w_SveVhEiI,257
+ librarian/retrieval/search.py,sha256=AL4igvl9czDlzgX5GFK57ZnKfleAsWwvgKBElUnC8u0,11268
+ librarian/storage/__init__.py,sha256=b0n6Y1xynDOtQPORvKq0IIFHjr5NCYTuhdPiPYJuG_o,590
+ librarian/storage/database.py,sha256=N6TTUo0p-gjyQLeiRN2tmCx4c3QE5SYP_zZ8ao7S7sE,21159
+ librarian/storage/fts_store.py,sha256=jTHjkbbQeI9w_IQ3nnTuaoiasqKLgAW5OOwjGwxXKnk,6983
+ librarian/storage/vector_store.py,sha256=rzL2VhVRf7m2d6nzNiry1UBqbP1ZuMEE3fFGWTvHUEE,6524
+ librarian/utils/__init__.py,sha256=IsdXj8h6TRyhd9e1PFY87TrWC6CYJq6bj4piIup1wGs,274
+ librarian/utils/timeframe.py,sha256=GSVQWCdLo9YyTj2RNwvVJHikd9rkWbhUS-6EPwOTY78,5224
+ agent_library-0.7.0.dist-info/METADATA,sha256=c2HDX-Ys2INKSxziNOnymiiJZju81tNqnknh7MnLpjA,6369
+ agent_library-0.7.0.dist-info/WHEEL,sha256=WLgqFyCfm_KASv4WHyYy0P3pM_m7J5L9k2skdKLirC8,87
+ agent_library-0.7.0.dist-info/entry_points.txt,sha256=cX5fyMxssr_Bd_F_YLjAex6dC9-HfH93Wf5u3XiCNvY,73
+ agent_library-0.7.0.dist-info/licenses/LICENSE,sha256=_ehAoqrx6bKWEjcst2Q4SOkLBZyYDBIShk9xeCMRaeU,1064
+ agent_library-0.7.0.dist-info/RECORD,,
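Review note on the RECORD format: each row above is a CSV triple of path, hash, and size in bytes. In the wheel format the hash is written as `sha256=` followed by the URL-safe Base64 encoding of the digest with the trailing `=` padding removed, and the row for `RECORD` itself leaves hash and size empty. The following is a minimal sketch of checking one row against file contents; the helper names `record_digest`, `parse_record_row`, and `row_matches` are hypothetical.

```python
import base64
import csv
import hashlib


def record_digest(data: bytes) -> str:
    """Return the RECORD-style hash string for the given file contents."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=")
    return "sha256=" + encoded.decode("ascii")


def parse_record_row(line: str):
    """Split one RECORD row into (path, hash_spec, size or None)."""
    path, hash_spec, size = next(csv.reader([line]))
    return path, hash_spec, int(size) if size else None


def row_matches(line: str, data: bytes) -> bool:
    """Check file contents against a RECORD row.

    Rows without a hash (such as the RECORD file's own row) are accepted.
    """
    _path, hash_spec, size = parse_record_row(line)
    if not hash_spec:
        return True
    return size == len(data) and hash_spec == record_digest(data)


if __name__ == "__main__":
    contents = b"print('hello')\n"
    row = f"pkg/hello.py,{record_digest(contents)},{len(contents)}"
    print(row_matches(row, contents))          # matching contents
    print(row_matches(row, contents + b"#"))   # tampered contents
```

To audit this release, one would read each listed file out of the wheel archive and pass its bytes to `row_matches` with the corresponding row.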
agent_library-0.7.0.dist-info/WHEEL ADDED
@@ -0,0 +1,4 @@
+ Wheel-Version: 1.0
+ Generator: hatchling 1.28.0
+ Root-Is-Purelib: true
+ Tag: py3-none-any
agent_library-0.7.0.dist-info/entry_points.txt ADDED
@@ -0,0 +1,3 @@
+ [console_scripts]
+ libr = librarian.cli:app
+ librarian = librarian.cli:app
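Review note on the entry points: both console commands, `libr` and `librarian`, point at the same object, `app` in `librarian.cli`. Installers turn each `name = module:attribute` row into a launcher that imports the module and calls the attribute. The sketch below only parses the file text with the standard library to show that mapping; `parse_console_scripts` is a hypothetical helper and does not import the package.

```python
import configparser

# The entry_points.txt content from this hunk, reproduced as a string.
ENTRY_POINTS_TXT = """\
[console_scripts]
libr = librarian.cli:app
librarian = librarian.cli:app
"""


def parse_console_scripts(text: str) -> dict:
    """Map each console script name to its (module, attribute) target."""
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.optionxform = str  # keep script names case-sensitive
    parser.read_string(text)
    scripts = {}
    for name, target in parser.items("console_scripts"):
        module, _, attribute = target.partition(":")
        scripts[name] = (module.strip(), attribute.strip())
    return scripts


if __name__ == "__main__":
    for name, (module, attribute) in parse_console_scripts(ENTRY_POINTS_TXT).items():
        print(f"{name} -> from {module} import {attribute}")
```

For an installed distribution, `importlib.metadata.entry_points()` exposes the same information without reading the file by hand.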
agent_library-0.7.0.dist-info/licenses/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2024, spartee
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
librarian/.env.example ADDED
@@ -0,0 +1,11 @@
+ # Arcade Context Environment Configuration
+ # Copy this file to .env and update the values
+
+ # Path to your Obsidian vault directory
+ OBSIDIAN_VAULT_PATH="/path/to/your/obsidian/vault"
+
+ # Index configuration (optional)
+ INDEX_POLL_INTERVAL=60
+ INDEX_START_DELAY=5
+ INDEX_STORAGE_PATH="~/.arcade/obsidian"
+ INDEX_NAME="index"
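Review note on `.env.example`: the variable names in this file (`OBSIDIAN_VAULT_PATH`, `INDEX_*`) differ from those in the README's configuration table (`DOCUMENTS_PATH`, `DATABASE_PATH`, and so on), and this diff does not show which code, if any, reads them. As a minimal sketch of how a file in this `KEY=value` style can be read, the hypothetical `parse_env` helper below skips comments and blank lines and strips one pair of surrounding quotes. It is an assumption for illustration, not the package's loader.

```python
import os

# The .env.example content from this hunk, reproduced as a string.
SAMPLE_ENV = '''\
# Arcade Context Environment Configuration
# Copy this file to .env and update the values

# Path to your Obsidian vault directory
OBSIDIAN_VAULT_PATH="/path/to/your/obsidian/vault"

# Index configuration (optional)
INDEX_POLL_INTERVAL=60
INDEX_START_DELAY=5
INDEX_STORAGE_PATH="~/.arcade/obsidian"
INDEX_NAME="index"
'''


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines; values are returned as strings."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, separator, value = line.partition("=")
        if not separator:
            continue  # ignore lines without an assignment
        value = value.strip()
        # Strip one matching pair of surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


if __name__ == "__main__":
    settings = parse_env(SAMPLE_ENV)
    poll_interval = int(settings["INDEX_POLL_INTERVAL"])
    # "~" is not expanded by the file format; the reader has to do it.
    storage_path = os.path.expanduser(settings["INDEX_STORAGE_PATH"])
    print(poll_interval, storage_path)
```

Numeric values and the `~` in `INDEX_STORAGE_PATH` need explicit conversion after parsing, as the usage lines show.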
librarian/__init__.py ADDED
@@ -0,0 +1,52 @@
+ """
+ Librarian - Markdown document management system.
+
+ A complete system for maintaining, indexing, ingesting, and retrieving
+ markdown documents through the Model Context Protocol (MCP).
+
+ Features:
+ - SQLite-based storage with vector search (sqlite-vec)
+ - Full-text search (FTS5) with BM25 ranking
+ - Hybrid search combining vector and keyword search
+ - Max Marginal Relevance (MMR) for diverse results
+ - Configurable embedding providers (local or OpenAI)
+ - Intelligent text chunking with overlap
+ - Time-bounded search with timeframe filters
+
+ Usage:
+     # For MCP server
+     from librarian.server import app
+
+     # For processing
+     from librarian.processing import ProcessingManager
+
+     # For CLI
+     librarian --help
+ """
+
+ from librarian.config import (
+     CHUNK_OVERLAP,
+     CHUNK_SIZE,
+     DATABASE_PATH,
+     DOCUMENTS_PATH,
+     EMBEDDING_MODEL,
+     EMBEDDING_PROVIDER,
+     HYBRID_ALPHA,
+     MMR_LAMBDA,
+     SEARCH_LIMIT,
+ )
+
+ __version__ = "0.6.0"
+
+ __all__ = [
+     "CHUNK_OVERLAP",
+     "CHUNK_SIZE",
+     "DATABASE_PATH",
+     "DOCUMENTS_PATH",
+     "EMBEDDING_MODEL",
+     "EMBEDDING_PROVIDER",
+     "HYBRID_ALPHA",
+     "MMR_LAMBDA",
+     "SEARCH_LIMIT",
+     "__version__",
+ ]
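Review note on the re-exported settings: `librarian/__init__.py` re-exports nine names from `librarian.config`, and the README says they are set via environment variables with the defaults listed in its configuration table. The contents of `config.py` are not part of this excerpt, so the following is only a sketch of how such constants could be resolved. The `load_settings` function and its dictionary return shape are assumptions; the variable names and default values are taken from the README table.

```python
import os
from pathlib import Path


def load_settings(env=None) -> dict:
    """Resolve settings from an environment mapping, using README defaults.

    Pass a plain dict as ``env`` to test without touching os.environ.
    """
    env = os.environ if env is None else env
    return {
        # Paths: "~" is expanded so the default database path is usable.
        "DOCUMENTS_PATH": Path(env.get("DOCUMENTS_PATH", "./documents")),
        "DATABASE_PATH": Path(
            env.get("DATABASE_PATH", "~/.librarian/index.db")
        ).expanduser(),
        # Embedding selection.
        "EMBEDDING_PROVIDER": env.get("EMBEDDING_PROVIDER", "openai"),
        "EMBEDDING_MODEL": env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        # Chunking and search tuning, converted from strings.
        "CHUNK_SIZE": int(env.get("CHUNK_SIZE", "512")),
        "CHUNK_OVERLAP": int(env.get("CHUNK_OVERLAP", "50")),
        "SEARCH_LIMIT": int(env.get("SEARCH_LIMIT", "10")),
        "MMR_LAMBDA": float(env.get("MMR_LAMBDA", "0.5")),
        "HYBRID_ALPHA": float(env.get("HYBRID_ALPHA", "0.7")),
    }


if __name__ == "__main__":
    defaults = load_settings({})
    overridden = load_settings({"CHUNK_SIZE": "256", "HYBRID_ALPHA": "1.0"})
    print(defaults["CHUNK_SIZE"], overridden["CHUNK_SIZE"])
    print(defaults["HYBRID_ALPHA"], overridden["HYBRID_ALPHA"])
```

Taking the environment as a parameter keeps the lookup testable; module-level constants like the ones imported above would typically be computed once at import time instead.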