remdb 0.3.14__py3-none-any.whl → 0.3.157__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (112)
  1. rem/agentic/README.md +76 -0
  2. rem/agentic/__init__.py +15 -0
  3. rem/agentic/agents/__init__.py +32 -2
  4. rem/agentic/agents/agent_manager.py +310 -0
  5. rem/agentic/agents/sse_simulator.py +502 -0
  6. rem/agentic/context.py +51 -27
  7. rem/agentic/context_builder.py +5 -3
  8. rem/agentic/llm_provider_models.py +301 -0
  9. rem/agentic/mcp/tool_wrapper.py +155 -18
  10. rem/agentic/otel/setup.py +93 -4
  11. rem/agentic/providers/phoenix.py +371 -108
  12. rem/agentic/providers/pydantic_ai.py +280 -57
  13. rem/agentic/schema.py +361 -21
  14. rem/agentic/tools/rem_tools.py +3 -3
  15. rem/api/README.md +215 -1
  16. rem/api/deps.py +255 -0
  17. rem/api/main.py +132 -40
  18. rem/api/mcp_router/resources.py +1 -1
  19. rem/api/mcp_router/server.py +28 -5
  20. rem/api/mcp_router/tools.py +555 -7
  21. rem/api/routers/admin.py +494 -0
  22. rem/api/routers/auth.py +278 -4
  23. rem/api/routers/chat/completions.py +402 -20
  24. rem/api/routers/chat/models.py +88 -10
  25. rem/api/routers/chat/otel_utils.py +33 -0
  26. rem/api/routers/chat/sse_events.py +542 -0
  27. rem/api/routers/chat/streaming.py +697 -45
  28. rem/api/routers/dev.py +81 -0
  29. rem/api/routers/feedback.py +268 -0
  30. rem/api/routers/messages.py +473 -0
  31. rem/api/routers/models.py +78 -0
  32. rem/api/routers/query.py +360 -0
  33. rem/api/routers/shared_sessions.py +406 -0
  34. rem/auth/__init__.py +13 -3
  35. rem/auth/middleware.py +186 -22
  36. rem/auth/providers/__init__.py +4 -1
  37. rem/auth/providers/email.py +215 -0
  38. rem/cli/commands/README.md +237 -64
  39. rem/cli/commands/cluster.py +1808 -0
  40. rem/cli/commands/configure.py +4 -7
  41. rem/cli/commands/db.py +386 -143
  42. rem/cli/commands/experiments.py +468 -76
  43. rem/cli/commands/process.py +14 -8
  44. rem/cli/commands/schema.py +97 -50
  45. rem/cli/commands/session.py +336 -0
  46. rem/cli/dreaming.py +2 -2
  47. rem/cli/main.py +29 -6
  48. rem/config.py +10 -3
  49. rem/models/core/core_model.py +7 -1
  50. rem/models/core/experiment.py +58 -14
  51. rem/models/core/rem_query.py +5 -2
  52. rem/models/entities/__init__.py +25 -0
  53. rem/models/entities/domain_resource.py +38 -0
  54. rem/models/entities/feedback.py +123 -0
  55. rem/models/entities/message.py +30 -1
  56. rem/models/entities/ontology.py +1 -1
  57. rem/models/entities/ontology_config.py +1 -1
  58. rem/models/entities/session.py +83 -0
  59. rem/models/entities/shared_session.py +180 -0
  60. rem/models/entities/subscriber.py +175 -0
  61. rem/models/entities/user.py +1 -0
  62. rem/registry.py +10 -4
  63. rem/schemas/agents/core/agent-builder.yaml +134 -0
  64. rem/schemas/agents/examples/contract-analyzer.yaml +1 -1
  65. rem/schemas/agents/examples/contract-extractor.yaml +1 -1
  66. rem/schemas/agents/examples/cv-parser.yaml +1 -1
  67. rem/schemas/agents/rem.yaml +7 -3
  68. rem/services/__init__.py +3 -1
  69. rem/services/content/service.py +92 -19
  70. rem/services/email/__init__.py +10 -0
  71. rem/services/email/service.py +459 -0
  72. rem/services/email/templates.py +360 -0
  73. rem/services/embeddings/api.py +4 -4
  74. rem/services/embeddings/worker.py +16 -16
  75. rem/services/phoenix/client.py +154 -14
  76. rem/services/postgres/README.md +197 -15
  77. rem/services/postgres/__init__.py +2 -1
  78. rem/services/postgres/diff_service.py +547 -0
  79. rem/services/postgres/pydantic_to_sqlalchemy.py +470 -140
  80. rem/services/postgres/repository.py +132 -0
  81. rem/services/postgres/schema_generator.py +205 -4
  82. rem/services/postgres/service.py +6 -6
  83. rem/services/rem/parser.py +44 -9
  84. rem/services/rem/service.py +36 -2
  85. rem/services/session/compression.py +137 -51
  86. rem/services/session/reload.py +15 -8
  87. rem/settings.py +515 -27
  88. rem/sql/background_indexes.sql +21 -16
  89. rem/sql/migrations/001_install.sql +387 -54
  90. rem/sql/migrations/002_install_models.sql +2304 -377
  91. rem/sql/migrations/003_optional_extensions.sql +326 -0
  92. rem/sql/migrations/004_cache_system.sql +548 -0
  93. rem/sql/migrations/005_schema_update.sql +145 -0
  94. rem/utils/README.md +45 -0
  95. rem/utils/__init__.py +18 -0
  96. rem/utils/date_utils.py +2 -2
  97. rem/utils/files.py +157 -1
  98. rem/utils/model_helpers.py +156 -1
  99. rem/utils/schema_loader.py +220 -22
  100. rem/utils/sql_paths.py +146 -0
  101. rem/utils/sql_types.py +3 -1
  102. rem/utils/vision.py +1 -1
  103. rem/workers/__init__.py +3 -1
  104. rem/workers/db_listener.py +579 -0
  105. rem/workers/unlogged_maintainer.py +463 -0
  106. {remdb-0.3.14.dist-info → remdb-0.3.157.dist-info}/METADATA +340 -229
  107. {remdb-0.3.14.dist-info → remdb-0.3.157.dist-info}/RECORD +109 -80
  108. {remdb-0.3.14.dist-info → remdb-0.3.157.dist-info}/WHEEL +1 -1
  109. rem/sql/002_install_models.sql +0 -1068
  110. rem/sql/install_models.sql +0 -1051
  111. rem/sql/migrations/003_seed_default_user.sql +0 -48
  112. {remdb-0.3.14.dist-info → remdb-0.3.157.dist-info}/entry_points.txt +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: remdb
- Version: 0.3.14
+ Version: 0.3.157
  Summary: Resources Entities Moments - Bio-inspired memory system for agentic AI workloads
  Project-URL: Homepage, https://github.com/Percolation-Labs/reminiscent
  Project-URL: Documentation, https://github.com/Percolation-Labs/reminiscent/blob/main/README.md
@@ -12,9 +12,11 @@ Keywords: agents,ai,mcp,memory,postgresql,vector-search
  Classifier: Development Status :: 3 - Alpha
  Classifier: Intended Audience :: Developers
  Classifier: License :: OSI Approved :: MIT License
+ Classifier: Programming Language :: Python :: 3.11
  Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
  Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
- Requires-Python: <3.13,>=3.12
+ Requires-Python: <3.14,>=3.11
  Requires-Dist: aioboto3>=13.0.0
  Requires-Dist: arize-phoenix>=5.0.0
  Requires-Dist: asyncpg>=0.30.0
@@ -101,32 +103,30 @@ Cloud-native unified memory infrastructure for agentic AI systems built with Pyd
  - **Database Layer**: PostgreSQL 18 with pgvector for multi-index memory (KV + Vector + Graph)
  - **REM Query Dialect**: Custom query language with O(1) lookups, semantic search, graph traversal
  - **Ingestion & Dreaming**: Background workers for content extraction and progressive index enrichment (0% → 100% answerable)
- - **Observability & Evals**: OpenTelemetry tracing + Arize Phoenix + LLM-as-a-Judge evaluation framework
+ - **Observability & Evals**: OpenTelemetry tracing supporting LLM-as-a-Judge evaluation frameworks

  ## Features

  | Feature | Description | Benefits |
  |---------|-------------|----------|
  | **OpenAI-Compatible Chat API** | Drop-in replacement for OpenAI chat completions API with streaming support | Use with existing OpenAI clients, switch models across providers (OpenAI, Anthropic, etc.) |
- | **Built-in MCP Server** | FastMCP server with 4 tools + 3 resources for memory operations | Export memory to Claude Desktop, Cursor, or any MCP-compatible host |
+ | **Built-in MCP Server** | FastMCP server with 4 tools + 5 resources for memory operations | Export memory to Claude Desktop, Cursor, or any MCP-compatible host |
  | **REM Query Engine** | Multi-index query system (LOOKUP, FUZZY, SEARCH, SQL, TRAVERSE) with custom dialect | O(1) lookups, semantic search, graph traversal - all tenant-isolated |
  | **Dreaming Workers** | Background workers for entity extraction, moment generation, and affinity matching | Automatic knowledge graph construction from resources (0% → 100% query answerable) |
  | **PostgreSQL + pgvector** | CloudNativePG with PostgreSQL 18, pgvector extension, streaming replication | Production-ready vector search, no external vector DB needed |
  | **AWS EKS Recipe** | Complete infrastructure-as-code with Pulumi, Karpenter, ArgoCD | Deploy to production EKS in minutes with auto-scaling and GitOps |
  | **JSON Schema Agents** | Dynamic agent creation from YAML schemas via Pydantic AI factory | Define agents declaratively, version control schemas, load dynamically |
- | **Content Providers** | Audio transcription (Whisper), vision (GPT-4V, Claude), PDFs, DOCX, images | Multimodal ingestion out of the box with format detection |
- | **Configurable Embeddings** | Provider-agnostic embedding system (OpenAI, Cohere, Jina) | Switch embedding providers via env vars, no code changes |
+ | **Content Providers** | Audio transcription (Whisper), vision (OpenAI, Anthropic, Gemini), PDFs, DOCX, PPTX, XLSX, images | Multimodal ingestion out of the box with format detection |
+ | **Configurable Embeddings** | OpenAI embedding system (text-embedding-3-small) | Production-ready embeddings, additional providers planned |
  | **Multi-Tenancy** | Tenant isolation at database level with automatic scoping | SaaS-ready with complete data separation per tenant |
- | **Streaming Everything** | SSE for chat, background workers for embeddings, async throughout | Real-time responses, non-blocking operations, scalable |
  | **Zero Vendor Lock-in** | Raw HTTP clients (no OpenAI SDK), swappable providers, open standards | Not tied to any vendor, easy to migrate, full control |

  ## Quick Start

  Choose your path:

- - **Option 1: Package Users with Example Data** (Recommended for first-time users) - PyPI + example datasets
- - **Option 2: Package Users** (Recommended for non-developers) - PyPI package + dockerized database
- - **Option 3: Developers** - Clone repo, local development with uv
+ - **Option 1: Package Users with Example Data** (Recommended) - PyPI + example datasets
+ - **Option 2: Developers** - Clone repo, local development with uv

  ---

@@ -145,34 +145,26 @@ pip install "remdb[all]"
  git clone https://github.com/Percolation-Labs/remstack-lab.git
  cd remstack-lab

- # Start PostgreSQL with docker-compose
+ # Start services (PostgreSQL, Phoenix observability)
  curl -O https://gist.githubusercontent.com/percolating-sirsh/d117b673bc0edfdef1a5068ccd3cf3e5/raw/docker-compose.prebuilt.yml
- docker compose -f docker-compose.prebuilt.yml up -d postgres
+ docker compose -f docker-compose.prebuilt.yml up -d

  # Configure REM (creates ~/.rem/config.yaml and installs database schema)
  # Add --claude-desktop to register with Claude Desktop app
  rem configure --install --claude-desktop

- # Load quickstart dataset (uses default user)
+ # Load quickstart dataset
  rem db load datasets/quickstart/sample_data.yaml

- # Optional: Set default LLM provider via environment variable
- # export LLM__DEFAULT_MODEL="openai:gpt-4.1-nano"  # Fast and cheap
- # export LLM__DEFAULT_MODEL="anthropic:claude-sonnet-4-5-20250929"  # High quality (default)
-
  # Ask questions
  rem ask "What documents exist in the system?"
  rem ask "Show me meetings about API design"

- # Ingest files (PDF, DOCX, images, etc.) - note: requires remstack-lab
+ # Ingest files (PDF, DOCX, images, etc.)
  rem process ingest datasets/formats/files/bitcoin_whitepaper.pdf --category research --tags bitcoin,whitepaper

  # Query ingested content
  rem ask "What is the Bitcoin whitepaper about?"
-
- # Try other datasets (use --user-id for multi-tenant scenarios)
- rem db load datasets/domains/recruitment/scenarios/candidate_pipeline/data.yaml --user-id acme-corp
- rem ask --user-id acme-corp "Show me candidates with Python experience"
  ```

  **What you get:**
@@ -182,130 +174,39 @@ rem ask --user-id acme-corp "Show me candidates with Python experience"

  **Learn more**: [remstack-lab repository](https://github.com/Percolation-Labs/remstack-lab)

- ---
-
- ## Option 2: Package Users (No Example Data)
-
- **Best for**: Using REM as a service (API + CLI) without modifying code, bringing your own data.
+ ### Using the API

- ### Step 1: Start Database and API with Docker Compose
+ Once configured, you can also use the OpenAI-compatible chat completions API:

  ```bash
- # Create a project directory
- mkdir my-rem-project && cd my-rem-project
-
- # Download docker-compose file from public gist
- curl -O https://gist.githubusercontent.com/percolating-sirsh/d117b673bc0edfdef1a5068ccd3cf3e5/raw/docker-compose.prebuilt.yml
-
- # IMPORTANT: Export API keys BEFORE running docker compose
- # Docker Compose reads env vars at startup - exporting them after won't work!
-
- # Required: OpenAI for embeddings (text-embedding-3-small)
- export OPENAI_API_KEY="sk-..."
-
- # Recommended: At least one chat completion provider
- export ANTHROPIC_API_KEY="sk-ant-..."  # Claude Sonnet 4.5 (high quality)
- export CEREBRAS_API_KEY="csk-..."      # Cerebras (fast, cheap inference)
-
- # Start PostgreSQL + API
+ # Start all services (PostgreSQL, Phoenix, API)
  docker compose -f docker-compose.prebuilt.yml up -d

- # Verify services are running
- curl http://localhost:8000/health
- ```
-
- This starts:
- - **PostgreSQL** with pgvector on port **5051** (connection: `postgresql://rem:rem@localhost:5051/rem`)
- - **REM API** on port **8000** with OpenAI-compatible chat completions + MCP server
- - Uses pre-built Docker image from Docker Hub (no local build required)
-
- ### Step 2: Install and Configure CLI (REQUIRED)
-
- **This step is required** before you can use REM - it installs the database schema and configures your LLM API keys.
-
- ```bash
- # Install remdb package from PyPI
- pip install remdb[all]
-
- # Configure REM (defaults to port 5051 for package users)
- rem configure --install --claude-desktop
+ # Test the API
+ curl -X POST http://localhost:8000/api/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -H "X-Session-Id: a1b2c3d4-e5f6-7890-abcd-ef1234567890" \
+   -d '{
+     "model": "anthropic:claude-sonnet-4-5-20250929",
+     "messages": [{"role": "user", "content": "What documents did Sarah Chen author?"}],
+     "stream": false
+   }'
  ```

- The interactive wizard will:
- 1. **Configure PostgreSQL**: Defaults to `postgresql://rem:rem@localhost:5051/rem` (prebuilt docker-compose)
-    - Just press Enter to accept defaults
-    - Custom database: Enter your own host/port/credentials
- 2. **Configure LLM providers**: Enter your OpenAI/Anthropic API keys
- 3. **Install database tables**: Creates schema, functions, indexes (**required for CLI/API to work**)
- 4. **Register with Claude Desktop**: Adds REM MCP server to Claude
-
- Configuration saved to `~/.rem/config.yaml` (can edit with `rem configure --edit`)
-
  **Port Guide:**
  - **5051**: Package users with `docker-compose.prebuilt.yml` (pre-built image)
  - **5050**: Developers with `docker-compose.yml` (local build)
- - **Custom**: Your own PostgreSQL database

  **Next Steps:**
  - See [CLI Reference](#cli-reference) for all available commands
  - See [REM Query Dialect](#rem-query-dialect) for query examples
  - See [API Endpoints](#api-endpoints) for OpenAI-compatible API usage

- ### Step 3: Load Sample Data (Optional but Recommended)
-
- **Option A: Clone example datasets** (Recommended - works with all README examples)
-
- ```bash
- # Clone datasets repository
- git clone https://github.com/Percolation-Labs/remstack-lab.git
-
- # Load quickstart dataset (uses default user)
- rem db load --file remstack-lab/datasets/quickstart/sample_data.yaml
-
- # Test with sample queries
- rem ask "What documents exist in the system?"
- rem ask "Show me meetings about API design"
- rem ask "Who is Sarah Chen?"
-
- # Try domain-specific datasets (use --user-id for multi-tenant scenarios)
- rem db load --file remstack-lab/datasets/domains/recruitment/scenarios/candidate_pipeline/data.yaml --user-id acme-corp
- rem ask --user-id acme-corp "Show me candidates with Python experience"
- ```
-
- **Option B: Bring your own data**
-
- ```bash
- # Ingest your own files (uses default user)
- echo "REM is a bio-inspired memory system for agentic AI workloads." > test-doc.txt
- rem process ingest test-doc.txt --category documentation --tags rem,ai
-
- # Query your ingested data
- rem ask "What do you know about REM from my knowledge base?"
- ```
-
- ### Step 4: Test the API
-
- ```bash
- # Test the OpenAI-compatible chat completions API
- curl -X POST http://localhost:8000/api/v1/chat/completions \
-   -H "Content-Type: application/json" \
-   -H "X-User-Id: demo-user" \
-   -d '{
-     "model": "anthropic:claude-sonnet-4-5-20250929",
-     "messages": [{"role": "user", "content": "What documents did Sarah Chen author?"}],
-     "stream": false
-   }'
- ```
-
- **Available Commands:**
- - `rem ask` - Natural language queries to REM
- - `rem process ingest <file>` - Full ingestion pipeline (storage + parsing + embedding + database)
- - `rem process uri <file>` - READ-ONLY parsing (no database storage, useful for testing parsers)
- - `rem db load --file <yaml>` - Load structured datasets directly
+ ---

  ## Example Datasets

- 🎯 **Recommended**: Clone [remstack-lab](https://github.com/Percolation-Labs/remstack-lab) for curated datasets organized by domain and format.
+ Clone [remstack-lab](https://github.com/Percolation-Labs/remstack-lab) for curated datasets organized by domain and format.

  **What's included:**
  - **Quickstart**: Minimal dataset (3 users, 3 resources, 3 moments) - perfect for first-time users
@@ -317,14 +218,11 @@ curl -X POST http://localhost:8000/api/v1/chat/completions \
  ```bash
  cd remstack-lab

- # Load any dataset (uses default user)
+ # Load any dataset
  rem db load --file datasets/quickstart/sample_data.yaml

  # Explore formats
  rem db load --file datasets/formats/engrams/scenarios/team_meeting/team_standup_meeting.yaml
-
- # Try domain-specific examples (use --user-id for multi-tenant scenarios)
- rem db load --file datasets/domains/recruitment/scenarios/candidate_pipeline/data.yaml --user-id acme-corp
  ```

  ## See Also
@@ -435,7 +333,7 @@ rem ask research-assistant "Find documents about machine learning architecture"
  rem ask research-assistant "Summarize recent API design documents" --stream

  # With session continuity
- rem ask research-assistant "What did we discuss about ML?" --session-id abc-123
+ rem ask research-assistant "What did we discuss about ML?" --session-id c3d4e5f6-a7b8-9012-cdef-345678901234
  ```

  ### Agent Schema Structure
@@ -478,29 +376,16 @@ REM provides **4 built-in MCP tools** your agents can use:

  ### Multi-User Isolation

- Custom agents are **scoped by `user_id`**, ensuring complete data isolation:
+ For multi-tenant deployments, custom agents are **scoped by `user_id`**, ensuring complete data isolation. Use the `--user-id` flag when you need tenant separation:

  ```bash
- # User A creates a custom agent
- rem process ingest my-agent.yaml --user-id user-a --category agents
-
- # User B cannot see User A's agent
- rem ask my-agent "test" --user-id user-b
- # ❌ Error: Schema not found (LOOKUP returns no results for user-b)
+ # Create agent for specific tenant
+ rem process ingest my-agent.yaml --user-id tenant-a --category agents

- # User A can use their agent
- rem ask my-agent "test" --user-id user-a
- # ✅ Works - LOOKUP finds schema for user-a
+ # Query with tenant context
+ rem ask my-agent "test" --user-id tenant-a
  ```

- ### Advanced: Ontology Extractors
-
- Custom agents can also be used as **ontology extractors** to extract structured knowledge from files. See [CLAUDE.md](../CLAUDE.md#ontology-extraction-pattern) for details on:
- - Multi-provider testing (`provider_configs`)
- - Semantic search configuration (`embedding_fields`)
- - File matching rules (`OntologyConfig`)
- - Dreaming workflow integration
-
  ### Troubleshooting

  **Schema not found error:**
@@ -534,15 +419,15 @@ REM provides a custom query language designed for **LLM-driven iterated retrieva
  Unlike traditional single-shot SQL queries, the REM dialect is optimized for **multi-turn exploration** where LLMs participate in query planning:

  - **Iterated Queries**: Queries return partial results that LLMs use to refine subsequent queries
- - **Composable WITH Syntax**: Chain operations together (e.g., `TRAVERSE FROM ... WITH LOOKUP "..."`)
+ - **Composable WITH Syntax**: Chain operations together (e.g., `TRAVERSE edge_type WITH LOOKUP "..."`)
  - **Mixed Indexes**: Combines exact lookups (O(1)), semantic search (vector), and graph traversal
  - **Query Planner Participation**: Results include metadata for LLMs to decide next steps

  **Example Multi-Turn Flow**:
  ```
  Turn 1: LOOKUP "sarah-chen" → Returns entity + available edge types
- Turn 2: TRAVERSE FROM "sarah-chen" TYPE "authored_by" DEPTH 1 → Returns connected documents
- Turn 3: SEARCH "architecture decisions" WITH TRAVERSE FROM "sarah-chen" Combines semantic + graph
+ Turn 2: TRAVERSE authored_by WITH LOOKUP "sarah-chen" DEPTH 1 → Returns connected documents
+ Turn 3: SEARCH "architecture decisions" → Semantic search, then explore graph from results
  ```

  This enables LLMs to **progressively build context** rather than requiring perfect queries upfront.
@@ -595,8 +480,8 @@ SEARCH "contract disputes" FROM resources WHERE tags @> ARRAY['legal'] LIMIT 5
  Follow `graph_edges` relationships across the knowledge graph.

  ```sql
- TRAVERSE FROM "sarah-chen" TYPE "authored_by" DEPTH 2
- TRAVERSE FROM "api-design-v2" TYPE "references,depends_on" DEPTH 3
+ TRAVERSE authored_by WITH LOOKUP "sarah-chen" DEPTH 2
+ TRAVERSE references,depends_on WITH LOOKUP "api-design-v2" DEPTH 3
  ```

  **Features**:
@@ -689,7 +574,7 @@ SEARCH "API migration planning" FROM resources LIMIT 5
  LOOKUP "tidb-migration-spec" FROM resources

  # Query 3: Find related people
- TRAVERSE FROM "tidb-migration-spec" TYPE "authored_by,reviewed_by" DEPTH 1
+ TRAVERSE authored_by,reviewed_by WITH LOOKUP "tidb-migration-spec" DEPTH 1

  # Query 4: Recent activity
  SELECT * FROM moments WHERE
@@ -706,7 +591,7 @@ All queries automatically scoped by `user_id` for complete data isolation:
  SEARCH "contracts" FROM resources LIMIT 10

  -- No cross-user data leakage
- TRAVERSE FROM "project-x" TYPE "references" DEPTH 3
+ TRAVERSE references WITH LOOKUP "project-x" DEPTH 3
  ```

  ## API Endpoints
@@ -718,8 +603,8 @@ POST /api/v1/chat/completions
  ```

  **Headers**:
- - `X-Tenant-Id`: Tenant identifier (required for REM)
- - `X-User-Id`: User identifier
+ - `X-User-Id`: User identifier (scopes data isolation; falls back to the default user if omitted)
+ - `X-Tenant-Id`: Deprecated - use `X-User-Id` instead (kept for backwards compatibility)
  - `X-Session-Id`: Session/conversation identifier
  - `X-Agent-Schema`: Agent schema URI to use

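Because the endpoint is a drop-in replacement for the OpenAI chat completions API, these headers can be passed through any OpenAI client. A minimal sketch using the official `openai` Python package, assuming the API is served on localhost:8000 as in the quick start (the user id and session id are illustrative values):

```python
from openai import OpenAI

# Point the client at the REM API; the key is unused unless auth is enabled.
client = OpenAI(
    base_url="http://localhost:8000/api/v1",
    api_key="not-needed",
    default_headers={
        "X-User-Id": "demo-user",  # hypothetical user, for illustration only
        "X-Session-Id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    },
)

response = client.chat.completions.create(
    model="anthropic:claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "What documents did Sarah Chen author?"}],
)
print(response.choices[0].message.content)
```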
@@ -858,81 +743,144 @@ rem serve --log-level debug

  ### Database Management

- #### `rem db migrate` - Run Migrations
+ REM uses a **code-as-source-of-truth** approach for database schema management. Pydantic models define the schema, and the database is kept in sync via diff-based migrations.

- Apply database migrations (install.sql and install_models.sql).
+ #### Schema Management Philosophy
+
+ **Two migration files only:**
+ - `001_install.sql` - Core infrastructure (extensions, functions, KV store)
+ - `002_install_models.sql` - Entity tables (auto-generated from Pydantic models)
+
+ **No incremental migrations** (003, 004, etc.) - the models file is always regenerated to match code.
+
+ #### `rem db schema generate` - Regenerate Schema SQL
+
+ Generate `002_install_models.sql` from registered Pydantic models.

  ```bash
- # Apply all migrations
- rem db migrate
+ # Regenerate from model registry
+ rem db schema generate

- # Core infrastructure only (extensions, functions)
- rem db migrate --install
+ # Output: src/rem/sql/migrations/002_install_models.sql
+ ```

- # Entity tables only (Resource, Message, etc.)
- rem db migrate --models
+ This generates:
+ - CREATE TABLE statements for each registered entity
+ - Embeddings tables (`embeddings_<table>`)
+ - KV_STORE triggers for cache maintenance
+ - Foreground indexes (GIN for JSONB, B-tree for lookups)
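For orientation, a rough sketch of the shape of that generated DDL for a single registered entity. Everything here is hypothetical - table name, columns, and index names are illustrative, not exact generator output; only the `embeddings_<table>` convention and the index kinds come from the list above:

```sql
-- Hypothetical excerpt of a generated 002_install_models.sql
CREATE TABLE IF NOT EXISTS my_entity (
    id UUID PRIMARY KEY,
    name TEXT NOT NULL,
    user_id TEXT,                          -- tenant scoping
    metadata JSONB,
    created_at TIMESTAMPTZ DEFAULT now()
);

-- Companion vector table, following the embeddings_<table> convention
CREATE TABLE IF NOT EXISTS embeddings_my_entity (
    id UUID PRIMARY KEY,
    entity_id UUID REFERENCES my_entity(id),
    embedding vector(1536)                 -- pgvector, text-embedding-3-small dimension
);

-- Foreground indexes: GIN for JSONB, B-tree for lookups
CREATE INDEX IF NOT EXISTS idx_my_entity_metadata ON my_entity USING GIN (metadata);
CREATE INDEX IF NOT EXISTS idx_my_entity_name ON my_entity (name);
```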

- # Background indexes (HNSW for vectors)
- rem db migrate --background-indexes
+ #### `rem db diff` - Detect Schema Drift
+
+ Compare Pydantic models against the live database using Alembic autogenerate.
+
+ ```bash
+ # Show additive changes only (default, safe for production)
+ rem db diff

- # Custom connection string
- rem db migrate --connection "postgresql://user:pass@host:5432/db"
+ # Show all changes including drops
+ rem db diff --strategy full

- # Custom SQL directory
- rem db migrate --sql-dir /path/to/sql
+ # Show additive + safe type widenings
+ rem db diff --strategy safe
+
+ # CI mode: exit 1 if drift detected
+ rem db diff --check
+
+ # Generate migration SQL for changes
+ rem db diff --generate
  ```

- #### `rem db status` - Migration Status
+ **Migration Strategies:**
+ | Strategy | Description |
+ |----------|-------------|
+ | `additive` | Only ADD columns/tables/indexes (safe, no data loss) - **default** |
+ | `full` | All changes including DROPs (use with caution) |
+ | `safe` | Additive + safe column type widenings (e.g., VARCHAR(50) → VARCHAR(256)) |
+
+ **Output shows:**
+ - `+ ADD COLUMN` - Column in model but not in DB
+ - `- DROP COLUMN` - Column in DB but not in model (only with `--strategy full`)
+ - `~ ALTER COLUMN` - Column type or constraints differ
+ - `+ CREATE TABLE` / `- DROP TABLE` - Table additions/removals
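As a concrete illustration of the additive strategy: if a new optional field were added to a registered model (a hypothetical `priority` field here), `rem db diff --generate` would emit an ADD-only statement along these lines rather than any destructive change:

```sql
-- Hypothetical additive migration emitted by rem db diff --generate
ALTER TABLE my_entity ADD COLUMN IF NOT EXISTS priority INTEGER;
```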

- Show applied migrations and execution times.
+ #### `rem db apply` - Apply SQL Directly
+
+ Apply a SQL file directly to the database (bypasses migration tracking).

  ```bash
- rem db status
+ # Apply with audit logging (default)
+ rem db apply src/rem/sql/migrations/002_install_models.sql
+
+ # Preview without executing
+ rem db apply --dry-run src/rem/sql/migrations/002_install_models.sql
+
+ # Apply without audit logging
+ rem db apply --no-log src/rem/sql/migrations/002_install_models.sql
  ```

- #### `rem db rebuild-cache` - Rebuild KV Cache
+ #### `rem db migrate` - Initial Setup

- Rebuild KV_STORE cache from entity tables (after database restart or bulk imports).
+ Apply standard migrations (001 + 002). Use for initial setup only.

  ```bash
- rem db rebuild-cache
+ # Apply infrastructure + entity tables
+ rem db migrate
+
+ # Include background indexes (HNSW for vectors)
+ rem db migrate --background-indexes
  ```

- ### Schema Management
+ #### Database Workflows

- #### `rem db schema generate` - Generate SQL Schema
+ **Initial Setup (Local):**
+ ```bash
+ rem db schema generate   # Generate from models
+ rem db migrate           # Apply 001 + 002
+ rem db diff              # Verify no drift
+ ```

- Generate database schema from Pydantic models.
+ **Adding/Modifying Models:**
+ ```bash
+ # 1. Edit models in src/rem/models/entities/
+ # 2. Register new models in src/rem/registry.py
+ rem db schema generate   # Regenerate schema
+ rem db diff              # See what changed
+ rem db apply src/rem/sql/migrations/002_install_models.sql
+ ```

+ **CI/CD Pipeline:**
  ```bash
- # Generate install_models.sql from entity models
- rem db schema generate \
-   --models src/rem/models/entities \
-   --output rem/src/rem/sql/install_models.sql
+ rem db diff --check   # Fail build if drift detected
+ ```
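Wired into CI, the drift check is a single gate step. A sketch as a GitHub Actions job - the workflow file name, secret name, and Python version are placeholder assumptions:

```yaml
# .github/workflows/schema-drift.yml (hypothetical)
name: schema-drift
on: [pull_request]
jobs:
  check-drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install "remdb[all]"
      # Fails the build (exit 1) when models and database disagree
      - run: rem db diff --check
        env:
          POSTGRES__CONNECTION_STRING: ${{ secrets.STAGING_DATABASE_URL }}
```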
+
+ **Remote Database (Production/Staging):**
+ ```bash
+ # Port-forward to cluster database
+ kubectl port-forward -n <namespace> svc/rem-postgres-rw 5433:5432 &

- # Generate migration file
- rem db schema generate \
-   --models src/rem/models/entities \
-   --output rem/src/rem/sql/migrations/003_add_fields.sql
+ # Override connection for diff check
+ POSTGRES__CONNECTION_STRING="postgresql://rem:rem@localhost:5433/rem" rem db diff
+
+ # Apply changes if needed
+ POSTGRES__CONNECTION_STRING="postgresql://rem:rem@localhost:5433/rem" \
+   rem db apply src/rem/sql/migrations/002_install_models.sql
  ```

- #### `rem db schema indexes` - Generate Background Indexes
+ #### `rem db rebuild-cache` - Rebuild KV Cache

- Generate SQL for background index creation (HNSW for vectors).
+ Rebuild KV_STORE cache from entity tables (after database restart or bulk imports).

  ```bash
- # Generate background_indexes.sql
- rem db schema indexes \
-   --models src/rem/models/entities \
-   --output rem/src/rem/sql/background_indexes.sql
+ rem db rebuild-cache
  ```

  #### `rem db schema validate` - Validate Models

- Validate Pydantic models for schema generation.
+ Validate registered Pydantic models for schema generation.

  ```bash
- rem db schema validate --models src/rem/models/entities
+ rem db schema validate
  ```

  ### File Processing
@@ -1138,14 +1086,11 @@ Test Pydantic AI agent with natural language queries.
  # Ask a question
  rem ask "What documents did Sarah Chen author?"

- # With context headers
- rem ask "Find all resources about API design" \
-   --user-id user-123 \
-   --tenant-id acme-corp
-
  # Use specific agent schema
- rem ask "Analyze this contract" \
-   --agent-schema contract-analyzer-v1
+ rem ask contract-analyzer "Analyze this contract"
+
+ # Stream response
+ rem ask "Find all resources about API design" --stream
  ```

  ### Global Options
@@ -1193,7 +1138,7 @@ export API__RELOAD=true
  rem serve
  ```

- ## Development (For Contributors)
+ ## Option 2: Development (For Contributors)

  **Best for**: Contributing to REM or customizing the codebase.

@@ -1297,6 +1242,30 @@ S3__BUCKET_NAME=rem-storage
  S3__REGION=us-east-1
  ```

+ ### Building Docker Images
+
+ Each Docker image is pushed with three tags for traceability:
+ 1. `latest` - Always points to most recent build
+ 2. `<git-sha>` - Short commit hash for exact version tracing
+ 3. `<version>` - Semantic version from `pyproject.toml`
+
+ ```bash
+ # Build and push multi-platform image to Docker Hub
+ VERSION=$(grep '^version' pyproject.toml | cut -d'"' -f2) && \
+ docker buildx build --platform linux/amd64,linux/arm64 \
+   -t percolationlabs/rem:latest \
+   -t percolationlabs/rem:$(git rev-parse --short HEAD) \
+   -t percolationlabs/rem:$VERSION \
+   --push \
+   -f Dockerfile .
+
+ # Load locally for testing (single platform, no push)
+ docker buildx build --platform linux/arm64 \
+   -t percolationlabs/rem:latest \
+   --load \
+   -f Dockerfile .
+ ```
+
  ### Production Deployment (Optional)

  For production deployment to AWS EKS with Kubernetes, see the main repository README:
@@ -1465,45 +1434,156 @@ Successfully installed ... kreuzberg-4.0.0rc1 ... remdb-0.3.10

  REM wraps FastAPI - extend it exactly as you would any FastAPI app.

+ ### Recommended Project Structure
+
+ REM auto-detects `./agents/` and `./models/` folders - no configuration needed:
+
+ ```
+ my-rem-app/
+ ├── agents/               # Auto-detected for agent schemas
+ │   ├── my-agent.yaml     # Custom agent (rem ask my-agent "query")
+ │   └── another-agent.yaml
+ ├── models/               # Auto-detected if __init__.py exists
+ │   └── __init__.py       # Register models with @rem.register_model
+ ├── routers/              # Custom FastAPI routers
+ │   └── custom.py
+ ├── main.py               # Entry point
+ └── pyproject.toml
+ ```
+
+ ### Quick Start
+
  ```python
- import rem
+ # main.py
  from rem import create_app
- from rem.models.core import CoreModel
+ from fastapi import APIRouter

- # 1. Register models (for schema generation)
- rem.register_models(MyModel, AnotherModel)
+ # Create REM app (auto-detects ./agents/ and ./models/)
+ app = create_app()

- # 2. Register schema paths (for custom agents/evaluators)
- rem.register_schema_path("./schemas")
+ # Add custom router
+ router = APIRouter(prefix="/custom", tags=["custom"])

- # 3. Create app
- app = create_app()
+ @router.get("/hello")
+ async def hello():
+     return {"message": "Hello from custom router!"}

- # 4. Extend like normal FastAPI
- app.include_router(my_router)
+ app.include_router(router)

+ # Add custom MCP tool
  @app.mcp_server.tool()
  async def my_tool(query: str) -> dict:
-     """Custom MCP tool."""
+     """Custom MCP tool available to agents."""
      return {"result": query}
  ```

- ### Project Structure
+ ### Custom Models (Auto-Detected)
+
+ ```python
+ # models/__init__.py
+ import rem
+ from rem.models.core import CoreModel
+ from pydantic import Field

+ @rem.register_model
+ class MyEntity(CoreModel):
+     """Custom entity - auto-registered for schema generation."""
+     name: str = Field(description="Entity name")
+     status: str = Field(default="active")
  ```
- my-rem-app/
- ├── my_app/
- │   ├── main.py       # Entry point (create_app + extensions)
- │   ├── models.py     # Custom models (inherit CoreModel)
- │   └── routers/      # Custom FastAPI routers
- ├── schemas/
- │   ├── agents/       # Custom agent YAML schemas
- │   └── evaluators/   # Custom evaluator schemas
- ├── sql/migrations/   # Custom SQL migrations
- └── pyproject.toml
+
+ Run `rem db schema generate` to include your models in the database schema.
+
+ ### Custom Agents (Auto-Detected)
+
+ ```yaml
+ # agents/my-agent.yaml
+ type: object
+ description: |
+   You are a helpful assistant that...
+
+ properties:
+   answer:
+     type: string
+     description: Your response
+
+ required:
+   - answer
+
+ json_schema_extra:
+   kind: agent
+   name: my-agent
+   version: "1.0.0"
+   tools:
+     - search_rem
+ ```
+
+ Test with: `rem ask my-agent "Hello!"`
+
+ ### Example Custom Router
+
+ ```python
+ # routers/analytics.py
+ from fastapi import APIRouter, Depends
+ from rem.services.postgres import get_postgres_service
+
+ router = APIRouter(prefix="/analytics", tags=["analytics"])
+
+ @router.get("/stats")
+ async def get_stats():
+     """Get database statistics."""
+     db = get_postgres_service()
+     if not db:
+         return {"error": "Database not available"}
+
+     await db.connect()
+     try:
+         result = await db.execute(
+             "SELECT COUNT(*) as count FROM resources"
+         )
+         return {"resource_count": result[0]["count"]}
+     finally:
+         await db.disconnect()
+
+ @router.get("/recent")
+ async def get_recent(limit: int = 10):
+     """Get recent resources."""
+     db = get_postgres_service()
+     if not db:
+         return {"error": "Database not available"}
+
+     await db.connect()
+     try:
+         result = await db.execute(
+             f"SELECT label, category, created_at FROM resources ORDER BY created_at DESC LIMIT {limit}"
+         )
+         return {"resources": result}
+     finally:
+         await db.disconnect()
+ ```
+
+ Include in main.py:
+
+ ```python
+ from routers.analytics import router as analytics_router
+ app.include_router(analytics_router)
  ```

- Generate this structure with: `rem scaffold my-app` *(coming soon)*
+ ### Running the App
+
+ ```bash
+ # Development (auto-reload)
+ uv run uvicorn main:app --reload --port 8000
+
+ # Or use rem serve
+ uv run rem serve --reload
+
+ # Test agent
+ uv run rem ask my-agent "What can you help me with?"
+
+ # Test custom endpoint
+ curl http://localhost:8000/analytics/stats
+ ```

  ### Extension Points

@@ -1515,6 +1595,37 @@ Generate this structure with: `rem scaffold my-app` *(coming soon)*
  | **MCP Prompts** | `@app.mcp_server.prompt()` or `app.mcp_server.add_prompt(fn)` |
  | **Models** | `rem.register_models(Model)` then `rem db schema generate` |
  | **Agent Schemas** | `rem.register_schema_path("./schemas")` or `SCHEMA__PATHS` env var |
+ | **SQL Migrations** | Place in `sql/migrations/` (auto-detected) |
+
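A minimal sketch combining the model and schema-path hooks from the table - the explicit alternative to the auto-detected folders. Module, path, and model names are illustrative assumptions:

```python
# app_setup.py (hypothetical module)
import rem
from rem import create_app

from my_models import Invoice, Customer  # illustrative CoreModel subclasses

# Register models so `rem db schema generate` picks them up
rem.register_models(Invoice, Customer)

# Register a directory of custom agent/evaluator YAML schemas
rem.register_schema_path("./schemas")

app = create_app()
```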
+ ### Custom Migrations
+
+ REM automatically discovers migrations from two sources:
+
+ 1. **Package migrations** (001-099): Built-in migrations from the `remdb` package
+ 2. **User migrations** (100+): Your custom migrations in `./sql/migrations/`
+
+ **Convention**: Place custom SQL files in `sql/migrations/` relative to your project root (see the sketch after this section for example contents):
+
+ ```
+ my-rem-app/
+ ├── sql/
+ │   └── migrations/
+ │       ├── 100_custom_table.sql      # Runs after package migrations
+ │       ├── 101_add_indexes.sql
+ │       └── 102_custom_functions.sql
+ └── ...
+ ```
+
+ **Numbering**: Use 100+ for user migrations to ensure they run after package migrations (001-099). All migrations are sorted by filename, so proper numbering ensures correct execution order.
+
+ **Running migrations**:
+ ```bash
+ # Apply all migrations (package + user)
+ rem db migrate
+
+ # Apply with background indexes (for production)
+ rem db migrate --background-indexes
+ ```
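For reference, a sketch of what a user migration such as the `100_custom_table.sql` shown above might contain - the table itself is hypothetical; any DDL your application needs is fine:

```sql
-- sql/migrations/100_custom_table.sql (hypothetical contents)
-- Runs after the package migrations, so core tables already exist.
CREATE TABLE IF NOT EXISTS app_audit_log (
    id BIGSERIAL PRIMARY KEY,
    user_id TEXT NOT NULL,
    action TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS idx_app_audit_log_user
    ON app_audit_log (user_id, created_at DESC);
```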

  ## License