remdb 0.3.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.



Files changed (187)
  1. rem/__init__.py +2 -0
  2. rem/agentic/README.md +650 -0
  3. rem/agentic/__init__.py +39 -0
  4. rem/agentic/agents/README.md +155 -0
  5. rem/agentic/agents/__init__.py +8 -0
  6. rem/agentic/context.py +148 -0
  7. rem/agentic/context_builder.py +329 -0
  8. rem/agentic/mcp/__init__.py +0 -0
  9. rem/agentic/mcp/tool_wrapper.py +107 -0
  10. rem/agentic/otel/__init__.py +5 -0
  11. rem/agentic/otel/setup.py +151 -0
  12. rem/agentic/providers/phoenix.py +674 -0
  13. rem/agentic/providers/pydantic_ai.py +572 -0
  14. rem/agentic/query.py +117 -0
  15. rem/agentic/query_helper.py +89 -0
  16. rem/agentic/schema.py +396 -0
  17. rem/agentic/serialization.py +245 -0
  18. rem/agentic/tools/__init__.py +5 -0
  19. rem/agentic/tools/rem_tools.py +231 -0
  20. rem/api/README.md +420 -0
  21. rem/api/main.py +324 -0
  22. rem/api/mcp_router/prompts.py +182 -0
  23. rem/api/mcp_router/resources.py +536 -0
  24. rem/api/mcp_router/server.py +213 -0
  25. rem/api/mcp_router/tools.py +584 -0
  26. rem/api/routers/auth.py +229 -0
  27. rem/api/routers/chat/__init__.py +5 -0
  28. rem/api/routers/chat/completions.py +281 -0
  29. rem/api/routers/chat/json_utils.py +76 -0
  30. rem/api/routers/chat/models.py +124 -0
  31. rem/api/routers/chat/streaming.py +185 -0
  32. rem/auth/README.md +258 -0
  33. rem/auth/__init__.py +26 -0
  34. rem/auth/middleware.py +100 -0
  35. rem/auth/providers/__init__.py +13 -0
  36. rem/auth/providers/base.py +376 -0
  37. rem/auth/providers/google.py +163 -0
  38. rem/auth/providers/microsoft.py +237 -0
  39. rem/cli/README.md +455 -0
  40. rem/cli/__init__.py +8 -0
  41. rem/cli/commands/README.md +126 -0
  42. rem/cli/commands/__init__.py +3 -0
  43. rem/cli/commands/ask.py +566 -0
  44. rem/cli/commands/configure.py +497 -0
  45. rem/cli/commands/db.py +493 -0
  46. rem/cli/commands/dreaming.py +324 -0
  47. rem/cli/commands/experiments.py +1302 -0
  48. rem/cli/commands/mcp.py +66 -0
  49. rem/cli/commands/process.py +245 -0
  50. rem/cli/commands/schema.py +183 -0
  51. rem/cli/commands/serve.py +106 -0
  52. rem/cli/dreaming.py +363 -0
  53. rem/cli/main.py +96 -0
  54. rem/config.py +237 -0
  55. rem/mcp_server.py +41 -0
  56. rem/models/core/__init__.py +49 -0
  57. rem/models/core/core_model.py +64 -0
  58. rem/models/core/engram.py +333 -0
  59. rem/models/core/experiment.py +628 -0
  60. rem/models/core/inline_edge.py +132 -0
  61. rem/models/core/rem_query.py +243 -0
  62. rem/models/entities/__init__.py +43 -0
  63. rem/models/entities/file.py +57 -0
  64. rem/models/entities/image_resource.py +88 -0
  65. rem/models/entities/message.py +35 -0
  66. rem/models/entities/moment.py +123 -0
  67. rem/models/entities/ontology.py +191 -0
  68. rem/models/entities/ontology_config.py +131 -0
  69. rem/models/entities/resource.py +95 -0
  70. rem/models/entities/schema.py +87 -0
  71. rem/models/entities/user.py +85 -0
  72. rem/py.typed +0 -0
  73. rem/schemas/README.md +507 -0
  74. rem/schemas/__init__.py +6 -0
  75. rem/schemas/agents/README.md +92 -0
  76. rem/schemas/agents/core/moment-builder.yaml +178 -0
  77. rem/schemas/agents/core/rem-query-agent.yaml +226 -0
  78. rem/schemas/agents/core/resource-affinity-assessor.yaml +99 -0
  79. rem/schemas/agents/core/simple-assistant.yaml +19 -0
  80. rem/schemas/agents/core/user-profile-builder.yaml +163 -0
  81. rem/schemas/agents/examples/contract-analyzer.yaml +317 -0
  82. rem/schemas/agents/examples/contract-extractor.yaml +134 -0
  83. rem/schemas/agents/examples/cv-parser.yaml +263 -0
  84. rem/schemas/agents/examples/hello-world.yaml +37 -0
  85. rem/schemas/agents/examples/query.yaml +54 -0
  86. rem/schemas/agents/examples/simple.yaml +21 -0
  87. rem/schemas/agents/examples/test.yaml +29 -0
  88. rem/schemas/agents/rem.yaml +128 -0
  89. rem/schemas/evaluators/hello-world/default.yaml +77 -0
  90. rem/schemas/evaluators/rem/faithfulness.yaml +219 -0
  91. rem/schemas/evaluators/rem/lookup-correctness.yaml +182 -0
  92. rem/schemas/evaluators/rem/retrieval-precision.yaml +199 -0
  93. rem/schemas/evaluators/rem/retrieval-recall.yaml +211 -0
  94. rem/schemas/evaluators/rem/search-correctness.yaml +192 -0
  95. rem/services/__init__.py +16 -0
  96. rem/services/audio/INTEGRATION.md +308 -0
  97. rem/services/audio/README.md +376 -0
  98. rem/services/audio/__init__.py +15 -0
  99. rem/services/audio/chunker.py +354 -0
  100. rem/services/audio/transcriber.py +259 -0
  101. rem/services/content/README.md +1269 -0
  102. rem/services/content/__init__.py +5 -0
  103. rem/services/content/providers.py +806 -0
  104. rem/services/content/service.py +676 -0
  105. rem/services/dreaming/README.md +230 -0
  106. rem/services/dreaming/__init__.py +53 -0
  107. rem/services/dreaming/affinity_service.py +336 -0
  108. rem/services/dreaming/moment_service.py +264 -0
  109. rem/services/dreaming/ontology_service.py +54 -0
  110. rem/services/dreaming/user_model_service.py +297 -0
  111. rem/services/dreaming/utils.py +39 -0
  112. rem/services/embeddings/__init__.py +11 -0
  113. rem/services/embeddings/api.py +120 -0
  114. rem/services/embeddings/worker.py +421 -0
  115. rem/services/fs/README.md +662 -0
  116. rem/services/fs/__init__.py +62 -0
  117. rem/services/fs/examples.py +206 -0
  118. rem/services/fs/examples_paths.py +204 -0
  119. rem/services/fs/git_provider.py +935 -0
  120. rem/services/fs/local_provider.py +760 -0
  121. rem/services/fs/parsing-hooks-examples.md +172 -0
  122. rem/services/fs/paths.py +276 -0
  123. rem/services/fs/provider.py +460 -0
  124. rem/services/fs/s3_provider.py +1042 -0
  125. rem/services/fs/service.py +186 -0
  126. rem/services/git/README.md +1075 -0
  127. rem/services/git/__init__.py +17 -0
  128. rem/services/git/service.py +469 -0
  129. rem/services/phoenix/EXPERIMENT_DESIGN.md +1146 -0
  130. rem/services/phoenix/README.md +453 -0
  131. rem/services/phoenix/__init__.py +46 -0
  132. rem/services/phoenix/client.py +686 -0
  133. rem/services/phoenix/config.py +88 -0
  134. rem/services/phoenix/prompt_labels.py +477 -0
  135. rem/services/postgres/README.md +575 -0
  136. rem/services/postgres/__init__.py +23 -0
  137. rem/services/postgres/migration_service.py +427 -0
  138. rem/services/postgres/pydantic_to_sqlalchemy.py +232 -0
  139. rem/services/postgres/register_type.py +352 -0
  140. rem/services/postgres/repository.py +337 -0
  141. rem/services/postgres/schema_generator.py +379 -0
  142. rem/services/postgres/service.py +802 -0
  143. rem/services/postgres/sql_builder.py +354 -0
  144. rem/services/rem/README.md +304 -0
  145. rem/services/rem/__init__.py +23 -0
  146. rem/services/rem/exceptions.py +71 -0
  147. rem/services/rem/executor.py +293 -0
  148. rem/services/rem/parser.py +145 -0
  149. rem/services/rem/queries.py +196 -0
  150. rem/services/rem/query.py +371 -0
  151. rem/services/rem/service.py +527 -0
  152. rem/services/session/README.md +374 -0
  153. rem/services/session/__init__.py +6 -0
  154. rem/services/session/compression.py +360 -0
  155. rem/services/session/reload.py +77 -0
  156. rem/settings.py +1235 -0
  157. rem/sql/002_install_models.sql +1068 -0
  158. rem/sql/background_indexes.sql +42 -0
  159. rem/sql/install_models.sql +1038 -0
  160. rem/sql/migrations/001_install.sql +503 -0
  161. rem/sql/migrations/002_install_models.sql +1202 -0
  162. rem/utils/AGENTIC_CHUNKING.md +597 -0
  163. rem/utils/README.md +583 -0
  164. rem/utils/__init__.py +43 -0
  165. rem/utils/agentic_chunking.py +622 -0
  166. rem/utils/batch_ops.py +343 -0
  167. rem/utils/chunking.py +108 -0
  168. rem/utils/clip_embeddings.py +276 -0
  169. rem/utils/dict_utils.py +98 -0
  170. rem/utils/embeddings.py +423 -0
  171. rem/utils/examples/embeddings_example.py +305 -0
  172. rem/utils/examples/sql_types_example.py +202 -0
  173. rem/utils/markdown.py +16 -0
  174. rem/utils/model_helpers.py +236 -0
  175. rem/utils/schema_loader.py +336 -0
  176. rem/utils/sql_types.py +348 -0
  177. rem/utils/user_id.py +81 -0
  178. rem/utils/vision.py +330 -0
  179. rem/workers/README.md +506 -0
  180. rem/workers/__init__.py +5 -0
  181. rem/workers/dreaming.py +502 -0
  182. rem/workers/engram_processor.py +312 -0
  183. rem/workers/sqs_file_processor.py +193 -0
  184. remdb-0.3.0.dist-info/METADATA +1455 -0
  185. remdb-0.3.0.dist-info/RECORD +187 -0
  186. remdb-0.3.0.dist-info/WHEEL +4 -0
  187. remdb-0.3.0.dist-info/entry_points.txt +2 -0
rem/models/entities/user.py ADDED
@@ -0,0 +1,85 @@
1
+ """
2
+ User - User entity in REM.
3
+
4
+ Users represent people in the system, either as content creators,
5
+ participants in moments, or entities referenced in resources.
6
+
7
+ Users can be discovered through:
8
+ - Entity extraction from resources
9
+ - Moment present_persons lists
10
+ - Direct user registration
11
+ """
12
+
13
+ from datetime import datetime
14
+ from enum import Enum
15
+ from typing import Optional
16
+
17
+ from pydantic import Field
18
+
19
+ from ..core import CoreModel
20
+
21
+
22
+ class UserTier(str, Enum):
23
+ """User subscription tier for feature gating."""
24
+
25
+ FREE = "free"
26
+ SILVER = "silver"
27
+ GOLD = "gold"
28
+
29
+
30
+ class User(CoreModel):
31
+ """
32
+ User entity.
33
+
34
+ Represents people in the REM system, either as active users
35
+ or entities extracted from content. Tenant isolation is provided
36
+ via CoreModel.tenant_id field.
37
+
38
+ Enhanced by dreaming worker:
39
+ - summary: Generated from activity analysis
40
+ - interests: Extracted from resources and sessions
41
+ - activity_level: Computed from recent engagement
42
+ - preferred_topics: Extracted from moment/resource topics
43
+ """
44
+
45
+ name: str = Field(
46
+ ...,
47
+ description="User name (human-readable, used as graph label)",
48
+ json_schema_extra={"entity_key": True}, # Primary business key for KV lookups
49
+ )
50
+ email: Optional[str] = Field(
51
+ default=None,
52
+ description="User email address",
53
+ )
54
+ role: Optional[str] = Field(
55
+ default=None,
56
+ description="User role (employee, contractor, external, etc.)",
57
+ )
58
+ tier: UserTier = Field(
59
+ default=UserTier.FREE,
60
+ description="User subscription tier (free, silver, gold) for feature gating",
61
+ )
62
+ sec_policy: dict = Field(
63
+ default_factory=dict,
64
+ description="Security policy configuration (JSON, extensible for custom policies)",
65
+ )
66
+ summary: Optional[str] = Field(
67
+ default=None,
68
+ description="LLM-generated user profile summary (updated by dreaming worker)",
69
+ )
70
+ interests: list[str] = Field(
71
+ default_factory=list,
72
+ description="User interests extracted from activity",
73
+ )
74
+ preferred_topics: list[str] = Field(
75
+ default_factory=list,
76
+ description="Frequently discussed topics in kebab-case",
77
+ )
78
+ activity_level: Optional[str] = Field(
79
+ default=None,
80
+ description="Activity level: active, moderate, inactive",
81
+ )
82
+ last_active_at: Optional[datetime] = Field(
83
+ default=None,
84
+ description="Last activity timestamp",
85
+ )
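The `tier` field is described as existing "for feature gating", but the diff does not show how a gate is evaluated. The following is a minimal sketch of one way it could work, assuming the tiers are ordered FREE < SILVER < GOLD (an ordering the source does not state). `tier_allows` and `_TIER_RANK` are hypothetical names, not part of remdb.

```python
from enum import Enum


class UserTier(str, Enum):
    """Mirror of the UserTier enum shown in the diff above."""

    FREE = "free"
    SILVER = "silver"
    GOLD = "gold"


# Assumed ordering for illustration; the package does not define one.
_TIER_RANK = {UserTier.FREE: 0, UserTier.SILVER: 1, UserTier.GOLD: 2}


def tier_allows(user_tier: UserTier, required: UserTier) -> bool:
    """Hypothetical helper: True when user_tier is at least `required`."""
    return _TIER_RANK[user_tier] >= _TIER_RANK[required]


# UserTier subclasses str, so raw values from JSON or a database row
# convert directly with UserTier("gold").
print(tier_allows(UserTier("gold"), UserTier.SILVER))
print(tier_allows(UserTier.FREE, UserTier.SILVER))
```

Keeping the ranking in a separate mapping leaves the enum values themselves as plain strings, which is what the `str, Enum` base in the original appears to be chosen for.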
rem/py.typed ADDED
File without changes
rem/schemas/README.md ADDED
@@ -0,0 +1,507 @@
1
+ # REM Agent and Evaluator Schemas
2
+
3
+ This directory contains versioned agent and evaluator schemas for the REM system. Git is used for version control - schemas are tagged with semantic versions (v1.0.0, v2.0.0, etc.) and loaded via the GitProvider.
4
+
5
+ ## Directory Structure
6
+
7
+ ```
8
+ schemas/
9
+ ├── agents/ # Agent schemas (one current version per agent)
10
+ │ ├── cv-parser.yaml
11
+ │ ├── contract-analyzer.yaml
12
+ │ ├── hello-world.yaml
13
+ │ ├── query.yaml
14
+ │ ├── rem.yaml
15
+ │ └── simple.yaml
16
+
17
+ └── evaluators/ # Evaluator schemas organized by agent name
18
+ ├── hello-world/
19
+ │ └── default.yaml
20
+ └── rem/
21
+ ├── faithfulness.yaml
22
+ ├── lookup-correctness.yaml
23
+ ├── retrieval-precision.yaml
24
+ ├── retrieval-recall.yaml
25
+ └── search-correctness.yaml
26
+ ```
27
+
28
+ ## Naming Conventions
29
+
30
+ ### Agents (`agents/`)
31
+
32
+ - **Location**: `schemas/agents/{agent-name}.yaml`
33
+ - **Naming**: Use lowercase with hyphens (kebab-case)
34
+ - **No "agent" suffix**: Files should NOT include `-agent` suffix
35
+ - **No version in filename**: Git tags handle versioning (not `cv-parser-v1.yaml`)
36
+ - **One current version**: Only the latest version exists in the repo
37
+
38
+ **Examples**:
39
+ - ✅ `cv-parser.yaml`
40
+ - ✅ `contract-analyzer.yaml`
41
+ - ✅ `hello-world.yaml`
42
+ - ❌ `cv-parser-agent.yaml` (no `-agent` suffix)
43
+ - ❌ `cv-parser-v1.yaml` (no version in filename)
44
+
45
+ ### Evaluators (`evaluators/`)
46
+
47
+ - **Location**: `schemas/evaluators/{agent-name}/{evaluator-name}.yaml`
48
+ - **Organization**: Group by agent name (the agent being evaluated)
49
+ - **Default evaluator**: Use `default.yaml` for the primary evaluator
50
+ - **Multiple evaluators**: Use descriptive names for specialized evaluators
51
+ - **No "agent" suffix**: Directory names should NOT include `-agent` suffix
52
+
53
+ **Examples**:
54
+ - ✅ `evaluators/hello-world/default.yaml`
55
+ - ✅ `evaluators/rem/lookup-correctness.yaml`
56
+ - ✅ `evaluators/rem/faithfulness.yaml`
57
+ - ❌ `evaluators/hello-world-agent/default.yaml` (no `-agent` suffix)
58
+ - ❌ `evaluators/rem-lookup-correctness.yaml` (must be in subdirectory)
59
+
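The naming rules above for agents and evaluators are mechanical enough to check in code. The sketch below is one possible checker, not something shipped in remdb: `is_valid_schema_path` is a hypothetical helper, paths are taken relative to `schemas/`, and it covers only the flat layout this README describes.

```python
import re
from pathlib import PurePosixPath

# kebab-case: lowercase alphanumeric words joined by single hyphens
_KEBAB = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
# a trailing version marker such as "-v1" or "-v1.2"
_VERSION_SUFFIX = re.compile(r"-v\d+(\.\d+)*$")


def _bad_agent_name(name: str) -> bool:
    """True if an agent name breaks any of the README's naming rules."""
    return (
        _KEBAB.match(name) is None
        or name.endswith("-agent")
        or _VERSION_SUFFIX.search(name) is not None
    )


def is_valid_schema_path(path: str) -> bool:
    """Hypothetical check of a path relative to the schemas/ directory."""
    p = PurePosixPath(path)
    if p.suffix != ".yaml":
        return False
    parts = p.parts
    if len(parts) == 2 and parts[0] == "agents":
        return not _bad_agent_name(p.stem)
    if len(parts) == 3 and parts[0] == "evaluators":
        # parts[1] is the agent being evaluated, p.stem the evaluator name
        return not _bad_agent_name(parts[1]) and _KEBAB.match(p.stem) is not None
    return False


for candidate in ("agents/cv-parser.yaml", "agents/cv-parser-v1.yaml"):
    print(candidate, is_valid_schema_path(candidate))
```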
60
+ ## Git Versioning Workflow
61
+
62
+ ### Creating a New Schema
63
+
64
+ 1. Create the schema file following naming conventions
65
+ 2. Commit the schema with a descriptive message
66
+ 3. Tag the commit with semantic version
67
+
68
+ ```bash
69
+ # Create schema
70
+ vim schemas/agents/my-new-agent.yaml
71
+
72
+ # Commit and tag
73
+ git add schemas/agents/my-new-agent.yaml
74
+ git commit -m "feat: Add my-new-agent v1.0.0"
75
+ git tag -a v1.0.0 -m "my-new-agent v1.0.0: Initial release"
76
+ git push origin main --tags
77
+ ```
78
+
79
+ ### Updating an Existing Schema
80
+
81
+ 1. Modify the schema file
82
+ 2. Commit the changes
83
+ 3. Tag with incremented version
84
+
85
+ ```bash
86
+ # Modify schema
87
+ vim schemas/agents/my-new-agent.yaml
88
+
89
+ # Commit and tag
90
+ git add schemas/agents/my-new-agent.yaml
91
+ git commit -m "feat: Add confidence scoring to my-new-agent v2.0.0"
92
+ git tag -a v2.0.0 -m "my-new-agent v2.0.0: Add confidence scoring"
93
+ git push origin main --tags
94
+ ```
95
+
96
+ ### Semantic Versioning Rules
97
+
98
+ Follow [semver](https://semver.org/) conventions:
99
+
100
+ - **MAJOR** (v2.0.0): Breaking changes (removed fields, changed types, different behavior)
101
+ - **MINOR** (v1.1.0): New features (added fields, new optional properties)
102
+ - **PATCH** (v1.0.1): Bug fixes (typos, documentation, no schema changes)
103
+
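To make the MAJOR/MINOR/PATCH distinction concrete, here is a small sketch that parses `vX.Y.Z` tags and reports which component changed between two of them. `parse_tag` and `classify_bump` are illustrative names, not part of GitService, and the sketch assumes tags always have exactly three numeric components.

```python
import re

_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> tuple:
    """Turn 'v1.2.3' into (1, 2, 3); raise ValueError for anything else."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"not a vMAJOR.MINOR.PATCH tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


def classify_bump(old_tag: str, new_tag: str) -> str:
    """Return 'major', 'minor', or 'patch' for an upgrade from old to new."""
    old, new = parse_tag(old_tag), parse_tag(new_tag)
    if new <= old:
        raise ValueError(f"{new_tag} is not newer than {old_tag}")
    if new[0] != old[0]:
        return "major"
    if new[1] != old[1]:
        return "minor"
    return "patch"


print(classify_bump("v1.0.0", "v1.1.0"))
```

Comparing integer tuples rather than strings matters once a component reaches two digits: as strings, `"v1.10.0"` sorts before `"v1.9.0"`.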
104
+ ## Agent Schema Format
105
+
106
+ All agent schemas must follow this JSON Schema structure:
107
+
108
+ ```yaml
109
+ ---
110
+ type: object
111
+ description: |
112
+ System prompt describing agent behavior and instructions.
113
+
114
+ This is shown to the LLM as the system prompt.
115
+ Provide clear, detailed instructions.
116
+
117
+ properties:
118
+ answer:
119
+ type: string
120
+ description: The answer field
121
+
122
+ confidence:
123
+ type: number
124
+ minimum: 0
125
+ maximum: 1
126
+ description: Confidence score (0.0-1.0)
127
+
128
+ required:
129
+ - answer
130
+ - confidence
131
+
132
+ json_schema_extra:
133
+ fully_qualified_name: "rem.agents.MyAgent"
134
+ version: "1.0.0"
135
+ tags: [domain, category]
136
+
137
+ # Optional: MCP tool configurations
138
+ tools: []
139
+
140
+ # Optional: MCP resource configurations
141
+ resources: []
142
+
143
+ # Optional: Multi-provider testing
144
+ provider_configs:
145
+ - provider_name: anthropic
146
+ model_name: claude-sonnet-4-5-20250929
147
+ - provider_name: openai
148
+ model_name: gpt-4o
149
+
150
+ # Optional: Fields to embed for semantic search
151
+ embedding_fields:
152
+ - field1
153
+ - field2
154
+ - nested.field3
155
+ ```
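A schema in this format can be sanity-checked after it has been parsed into a dictionary. The sketch below is a hypothetical pre-flight check, not the validation remdb performs. It assumes that `fully_qualified_name` and `version` are mandatory inside `json_schema_extra`, which the README implies but does not state.

```python
def check_agent_schema(schema: dict) -> list:
    """Hypothetical structural check; returns a list of problems found."""
    problems = []
    if schema.get("type") != "object":
        problems.append("type must be 'object'")

    description = schema.get("description")
    if not isinstance(description, str) or not description.strip():
        problems.append("description (the system prompt) is missing or empty")

    properties = schema.get("properties")
    if not isinstance(properties, dict) or not properties:
        problems.append("properties must be a non-empty mapping")
        properties = {}

    # Every required field has to be declared under properties.
    for name in schema.get("required", []):
        if name not in properties:
            problems.append(f"required field {name!r} is not in properties")

    # Assumption: these two metadata keys are treated as mandatory.
    extra = schema.get("json_schema_extra") or {}
    for key in ("fully_qualified_name", "version"):
        if key not in extra:
            problems.append(f"json_schema_extra.{key} is missing")
    return problems


example = {
    "type": "object",
    "description": "System prompt describing agent behavior.",
    "properties": {"answer": {"type": "string"}},
    "required": ["answer", "confidence"],
    "json_schema_extra": {"fully_qualified_name": "rem.agents.MyAgent"},
}
for problem in check_agent_schema(example):
    print(problem)
```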
156
+
157
+ ## Evaluator Schema Format
158
+
159
+ Evaluators use the same JSON Schema structure as agents, but with evaluation-specific properties:
160
+
161
+ ```yaml
162
+ ---
163
+ type: object
164
+ description: |
165
+ You are THE JUDGE evaluating an agent's response.
166
+
167
+ Provide strict, objective evaluation without celebration.
168
+ Grade based on correctness, completeness, and accuracy.
169
+
170
+ properties:
171
+ correctness:
172
+ type: number
173
+ minimum: 0
174
+ maximum: 1
175
+ description: How correct is the response (0.0-1.0)
176
+
177
+ completeness:
178
+ type: number
179
+ minimum: 0
180
+ maximum: 1
181
+ description: How complete is the response (0.0-1.0)
182
+
183
+ explanation:
184
+ type: string
185
+ description: Detailed explanation of the evaluation
186
+
187
+ required:
188
+ - correctness
189
+ - completeness
190
+ - explanation
191
+
192
+ json_schema_extra:
193
+ fully_qualified_name: "rem.evaluators.MyEvaluator"
194
+ version: "1.0.0"
195
+ tags: [evaluation, correctness]
196
+ ```
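The `minimum` and `maximum` keywords in the evaluator schema are inclusive bounds in JSON Schema. As an illustration of what they constrain, the sketch below flags numeric fields in an evaluator's output that fall outside the declared range. `out_of_range` is a hypothetical helper, and the sample output values are invented for the example.

```python
def out_of_range(schema: dict, output: dict) -> list:
    """Return names of numeric fields whose value violates minimum/maximum."""
    violations = []
    for name, spec in schema.get("properties", {}).items():
        if spec.get("type") != "number" or name not in output:
            continue
        lower = spec.get("minimum", float("-inf"))
        upper = spec.get("maximum", float("inf"))
        # JSON Schema bounds are inclusive on both ends.
        if not (lower <= output[name] <= upper):
            violations.append(name)
    return violations


evaluator_schema = {
    "properties": {
        "correctness": {"type": "number", "minimum": 0, "maximum": 1},
        "completeness": {"type": "number", "minimum": 0, "maximum": 1},
        "explanation": {"type": "string"},
    }
}
# Invented sample output for illustration only.
sample = {"correctness": 0.8, "completeness": 1.2, "explanation": "Partly complete."}
print(out_of_range(evaluator_schema, sample))
```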
197
+
198
+ ## Loading Schemas with GitService
199
+
200
+ ### From Python Code
201
+
202
+ ```python
203
+ from rem.services.git import GitService
204
+
205
+ git_svc = GitService()
206
+
207
+ # Load latest version
208
+ schema = git_svc.load_schema("cv-parser")
209
+
210
+ # Load specific version
211
+ schema = git_svc.load_schema("cv-parser", version="v2.0.0")
212
+
213
+ # List all versions
214
+ versions = git_svc.list_schema_versions("cv-parser")
215
+ # [{"tag": "v2.0.0", "version": (2,0,0), "commit": "abc123", ...}, ...]
216
+
217
+ # Compare versions
218
+ diff = git_svc.compare_schemas("cv-parser", "v1.0.0", "v2.0.0")
219
+
220
+ # Check for breaking changes
221
+ has_breaking = git_svc.has_breaking_changes("cv-parser", "v1.0.0", "v2.0.0")
222
+ ```
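The comment in the example above suggests that `list_schema_versions` returns dictionaries carrying both a `tag` string and a `version` tuple. Assuming that shape, and without assuming any particular order from GitService, the newest release can be selected by sorting on the tuple. The records below are invented sample data, and `newest_first` is a hypothetical helper.

```python
def newest_first(versions: list) -> list:
    """Sort version records by their numeric tuple, newest first."""
    return sorted(versions, key=lambda record: record["version"], reverse=True)


# Invented sample records in the shape suggested by the README comment.
records = [
    {"tag": "v1.0.0", "version": (1, 0, 0), "commit": "aaa111"},
    {"tag": "v1.10.0", "version": (1, 10, 0), "commit": "bbb222"},
    {"tag": "v1.9.0", "version": (1, 9, 0), "commit": "ccc333"},
    {"tag": "v2.0.0", "version": (2, 0, 0), "commit": "abc123"},
]

ordered = newest_first(records)
print([record["tag"] for record in ordered])
```

Sorting on the tuple instead of the tag string keeps `v1.10.0` ahead of `v1.9.0`.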
223
+
224
+ ### From CLI
225
+
226
+ ```bash
227
+ # List schema versions
228
+ rem git schema list cv-parser
229
+
230
+ # Compare versions
231
+ rem git schema diff cv-parser v1.0.0 v2.0.0
232
+
233
+ # Load schema at version
234
+ rem git schema show cv-parser --version v2.0.0
235
+
236
+ # Sync repo (pull latest changes)
237
+ rem git sync
238
+ ```
239
+
240
+ ### From Kubernetes
241
+
242
+ Schemas are loaded from Git repositories using GitProvider. The example below authenticates over SSH, with the private key and `known_hosts` file mounted from a Kubernetes Secret:
243
+
244
+ ```yaml
245
+ apiVersion: v1
246
+ kind: Secret
247
+ metadata:
248
+ name: rem-git-secret
249
+ type: Opaque
250
+ stringData:
251
+ ssh: |
252
+ -----BEGIN OPENSSH PRIVATE KEY-----
253
+ ...
254
+ -----END OPENSSH PRIVATE KEY-----
255
+ known_hosts: |
256
+ github.com ssh-rsa AAAA...
257
+
258
+ ---
259
+ apiVersion: apps/v1
260
+ kind: Deployment
261
+ metadata:
262
+ name: rem-api
263
+ spec:
264
+ template:
265
+ spec:
266
+ containers:
267
+ - name: api
268
+ env:
269
+ - name: GIT__ENABLED
270
+ value: "true"
271
+ - name: GIT__DEFAULT_REPO_URL
272
+ value: "git@github.com:org/repo.git"
273
+ - name: GIT__SSH_KEY_PATH
274
+ value: "/etc/git-secret/ssh"
275
+ - name: GIT__KNOWN_HOSTS_PATH
276
+ value: "/etc/git-secret/known_hosts"
277
+ volumeMounts:
278
+ - name: git-secret
279
+ mountPath: /etc/git-secret
280
+ readOnly: true
281
+ volumes:
282
+ - name: git-secret
283
+ secret:
284
+ secretName: rem-git-secret
285
+ defaultMode: 0400
286
+ ```
287
+
288
+ ## Schema Types
289
+
290
+ ### Core Agents
291
+
292
+ - **hello-world**: Simple test agent for verification
293
+ - **simple**: Basic conversational agent
294
+ - **query**: REM query agent (LOOKUP, SEARCH, TRAVERSE)
295
+ - **rem**: REM system expert agent
296
+
297
+ ### Domain-Specific Agents
298
+
299
+ - **cv-parser**: Extract structured data from resumes/CVs
300
+ - **contract-analyzer**: Analyze legal contracts and agreements
301
+
302
+ ### Evaluators
303
+
304
+ - **default**: Primary evaluator for an agent
305
+ - **lookup-correctness**: Evaluate LOOKUP query correctness
306
+ - **search-correctness**: Evaluate SEARCH query correctness
307
+ - **faithfulness**: Evaluate response faithfulness to context
308
+ - **retrieval-precision**: Evaluate retrieval precision
309
+ - **retrieval-recall**: Evaluate retrieval recall
310
+
311
+ ## Adding New Agent Types
312
+
313
+ ### Ontology Extractors
314
+
315
+ Ontology extractors are domain-specific agents that extract structured knowledge from files. They follow the same conventions as regular agents but include additional metadata:
316
+
317
+ ```yaml
318
+ json_schema_extra:
319
+ fully_qualified_name: "rem.agents.MyExtractor"
320
+ version: "1.0.0"
321
+ tags: [domain, ontology-extractor] # Include 'ontology-extractor' tag
322
+
323
+ # Multi-provider testing
324
+ provider_configs:
325
+ - provider_name: anthropic
326
+ model_name: claude-sonnet-4-5-20250929
327
+ - provider_name: openai
328
+ model_name: gpt-4o
329
+
330
+ # Fields to embed for semantic search
331
+ embedding_fields:
332
+ - candidate_name
333
+ - skills
334
+ - experience
335
+ ```
336
+
337
+ **Extraction workflow**:
338
+ 1. Files are uploaded to S3
339
+ 2. The file processor extracts their content
340
+ 3. The dreaming worker finds a matching OntologyConfig
341
+ 4. The worker loads the agent schema from the database (or Git)
342
+ 5. The worker runs the extraction agent
343
+ 6. Results are stored in the Ontology table with embeddings
344
+
345
+ ## Testing
346
+
347
+ ### Unit Tests
348
+
349
+ Test individual schemas for validity:
350
+
351
+ ```python
352
+ from rem.agentic.factory import create_pydantic_ai_agent
353
+ from rem.services.git import GitService
354
+
355
+ git_svc = GitService()
356
+
357
+ # Load schema
358
+ schema = git_svc.load_schema("cv-parser", version="v1.0.0")
359
+
360
+ # Create agent
361
+ agent = create_pydantic_ai_agent(schema)
362
+
363
+ # Test execution (`await` is only valid inside an async function or async test)
364
+ result = await agent.run("Extract from this CV: ...")
365
+ assert result.output.candidate_name == "John Doe"
366
+ ```
367
+
368
+ ### Integration Tests
369
+
370
+ Test with Git provider:
371
+
372
+ ```python
373
+ from rem.services.git import GitService
374
+
375
+ git_svc = GitService()
376
+
377
+ # List versions
378
+ versions = git_svc.list_schema_versions("cv-parser")
379
+ assert len(versions) > 0
380
+ assert versions[0]["tag"] == "v2.0.0"
381
+
382
+ # Load and compare
383
+ v1_schema = git_svc.load_schema("cv-parser", version="v1.0.0")
384
+ v2_schema = git_svc.load_schema("cv-parser", version="v2.0.0")
385
+ diff = git_svc.compare_schemas("cv-parser", "v1.0.0", "v2.0.0")
386
+ assert "confidence_score" in diff # New field added in v2.0.0
387
+ ```
388
+
389
+ ## Migration Guide
390
+
391
+ ### Updating Tests After Refactor
392
+
393
+ After removing `-agent` suffix from filenames, update test files:
394
+
395
+ ```python
396
+ # Before
397
+ schema = git_svc.load_schema("cv-parser-agent")
398
+ agent = create_pydantic_ai_agent("cv-parser-agent.yaml")
399
+
400
+ # After
401
+ schema = git_svc.load_schema("cv-parser")
402
+ agent = create_pydantic_ai_agent("cv-parser.yaml")
403
+ ```
404
+
405
+ Search for references in tests:
406
+
407
+ ```bash
408
+ # Find test files referencing old names
409
+ grep -r "cv-parser-agent" tests/
410
+ grep -r "hello-world-agent" tests/
411
+ grep -r "rem-agent" tests/
412
+
413
+ # Update references
414
+ grep -rl "cv-parser-agent" tests/ | xargs -r sed -i 's/cv-parser-agent/cv-parser/g'
415
+ ```
416
+
417
+ ## Contributing
418
+
419
+ When adding new schemas:
420
+
421
+ 1. **Follow naming conventions** (no `-agent` suffix, no version in filename)
422
+ 2. **Write a comprehensive system prompt** in the `description` field
423
+ 3. **Add examples** in the system prompt
424
+ 4. **Tag appropriately** (domain, category, ontology-extractor, etc.)
425
+ 5. **Test thoroughly** before tagging
426
+ 6. **Document changes** in git commit messages
427
+ 7. **Use semantic versioning** for tags
428
+
429
+ ## Experiments
430
+
431
+ Experiments are stored alongside schemas in the repository using the **`.experiments/` directory convention**. See [../.experiments/README.md](../.experiments/README.md) for complete documentation.
432
+
433
+ ### Quick Start
434
+
435
+ ```bash
436
+ # Create experiment
437
+ rem experiments create my-experiment \
438
+ --agent cv-parser \
439
+ --evaluator default \
440
+ --description "Test CV parsing accuracy"
441
+
442
+ # Generated structure:
443
+ # .experiments/my-experiment/
444
+ # ├── experiment.yaml # Configuration (ExperimentConfig model)
445
+ # ├── README.md # Auto-generated docs
446
+ # └── datasets/ # Optional: small datasets
447
+
448
+ # Run experiment
449
+ # Note: REM typically runs on Kubernetes with Phoenix
450
+ # Production (on cluster):
451
+ export PHOENIX_BASE_URL=http://phoenix-svc.observability.svc.cluster.local:6006
452
+ export PHOENIX_API_KEY="<your-key>"
453
+ kubectl exec -it deployment/rem-api -- rem experiments run my-experiment
454
+
455
+ # Development (port-forward):
456
+ kubectl port-forward -n observability svc/phoenix-svc 6006:6006
457
+ export PHOENIX_API_KEY="<your-key>"
458
+ rem experiments run my-experiment
459
+
460
+ # Commit to Git
461
+ git add .experiments/my-experiment/
462
+ git commit -m "feat: Add my-experiment v1.0.0"
463
+ git tag -a experiments/my-experiment/v1.0.0 \
464
+ -m "my-experiment v1.0.0: Initial experiment"
465
+ ```
466
+
467
+ ### Storage Convention: Git + S3 Hybrid
468
+
469
+ | Type | Git (`.experiments/`) | S3 (`s3://bucket/experiments/`) |
470
+ |------|----------------------|----------------------------------|
471
+ | Configuration | ✅ `experiment.yaml` | ❌ |
472
+ | Documentation | ✅ `README.md` | ❌ |
473
+ | Small datasets (<1MB) | ✅ `datasets/*.csv` | ❌ |
474
+ | Large datasets (>1MB) | ❌ | ✅ `datasets/*.parquet` |
475
+ | Metrics summary | ✅ `results/metrics.json` | ❌ |
476
+ | Full traces | ❌ | ✅ `results/run-*/traces.jsonl` |
477
+
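The table above amounts to a routing rule: everything goes to Git except large datasets and full traces. A minimal sketch of that rule follows. `storage_target` and the artifact kind names are hypothetical, and because the table leaves the exact 1MB boundary unspecified, the sketch assumes 1MB means 1024 × 1024 bytes and sends a dataset of exactly that size to S3.

```python
ONE_MB = 1024 * 1024  # assumed interpretation of "1MB"

# Artifact kinds the table always keeps in Git (names are illustrative).
_GIT_KINDS = {"configuration", "documentation", "metrics_summary"}


def storage_target(kind: str, size_bytes: int = 0) -> str:
    """Return 'git' or 's3' for an experiment artifact."""
    if kind in _GIT_KINDS:
        return "git"
    if kind == "traces":
        return "s3"
    if kind == "dataset":
        # Small datasets stay in the repository, large ones go to S3.
        return "git" if size_bytes < ONE_MB else "s3"
    raise ValueError(f"unknown artifact kind: {kind!r}")


print(storage_target("dataset", 200_000))
print(storage_target("dataset", 5 * ONE_MB))
```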
478
+ ### Experiment Versioning
479
+
480
+ Experiments follow semantic versioning like schemas:
481
+
482
+ - **Tag Format**: `experiments/{experiment-name}/vMAJOR.MINOR.PATCH`
483
+ - **Example**: `experiments/cv-parser-accuracy/v1.0.0`
484
+ - **GitProvider**: Load versioned experiments via GitService
485
+
486
+ ```python
487
+ from rem.services.git import GitService
488
+ from rem.models.core.experiment import ExperimentConfig
489
+
490
+ git_svc = GitService()
491
+
492
+ # Load experiment at specific version
493
+ exp_yaml = git_svc.fs.read(
494
+ "git://rem/.experiments/my-experiment/experiment.yaml?ref=experiments/my-experiment/v1.0.0"
495
+ )
496
+ config = ExperimentConfig(**exp_yaml)
497
+ ```
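The experiment tag format can be split back into its name and version with a regular expression. The sketch below assumes experiment names are kebab-case like schema names, which the README does not state outright. `parse_experiment_tag` is a hypothetical helper, not a GitService method.

```python
import re

_EXPERIMENT_TAG = re.compile(
    r"^experiments/"
    r"(?P<name>[a-z0-9]+(?:-[a-z0-9]+)*)"  # assumed kebab-case name
    r"/v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$"
)


def parse_experiment_tag(tag: str) -> tuple:
    """Split 'experiments/<name>/vX.Y.Z' into (name, (X, Y, Z))."""
    match = _EXPERIMENT_TAG.match(tag)
    if match is None:
        raise ValueError(f"not an experiment tag: {tag!r}")
    version = (
        int(match.group("major")),
        int(match.group("minor")),
        int(match.group("patch")),
    )
    return match.group("name"), version


print(parse_experiment_tag("experiments/cv-parser-accuracy/v1.0.0"))
```

Prefixing tags with `experiments/<name>/` keeps each experiment's versions separate from the repository-wide `vX.Y.Z` schema tags.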
498
+
499
+ ## Resources
500
+
501
+ - [GitProvider Documentation](../src/rem/services/git/README.md)
502
+ - [Experiments Documentation](../.experiments/README.md)
503
+ - [ExperimentConfig Model](../src/rem/models/core/experiment.py)
504
+ - [Pydantic AI Documentation](https://ai.pydantic.dev/)
505
+ - [JSON Schema Reference](https://json-schema.org/)
506
+ - [Semantic Versioning](https://semver.org/)
507
+ - [REM Architecture](../CLAUDE.md)
rem/schemas/__init__.py ADDED
@@ -0,0 +1,6 @@
1
+ """REM schema definitions.
2
+
3
+ This package contains schema definitions for:
4
+ - evaluators: LLM-as-a-Judge evaluator schemas for agent evaluation
5
+ - agents: Agent schemas (if needed separate from agentic/schemas/)
6
+ """
rem/schemas/agents/README.md ADDED
@@ -0,0 +1,92 @@
1
+ # Agent Schemas
2
+
3
+ This directory contains YAML-based agent schemas for REM agents. Schemas are organized into folders for better maintainability.
4
+
5
+ ## Folder Structure
6
+
7
+ ```
8
+ agents/
9
+ ├── rem.yaml # Main REM agent (top-level)
10
+ ├── core/ # Core system agents
11
+ │ ├── moment-builder.yaml
12
+ │ ├── rem-query-agent.yaml
13
+ │ ├── resource-affinity-assessor.yaml
14
+ │ └── user-profile-builder.yaml
15
+ └── examples/ # Example and domain-specific agents
16
+ ├── contract-analyzer.yaml
17
+ ├── contract-extractor.yaml
18
+ ├── cv-parser.yaml
19
+ ├── hello-world.yaml
20
+ ├── query.yaml
21
+ ├── simple.yaml
22
+ └── test.yaml
23
+ ```
24
+
25
+ ## Schema Organization
26
+
27
+ ### Top-Level (`rem.yaml`)
28
+ The main REM agent that provides comprehensive memory querying capabilities.
29
+
30
+ ### Core Agents (`core/`)
31
+ System agents used by the dreaming worker and core REM functionality:
32
+ - **moment-builder.yaml** - Constructs temporal narratives from resources
33
+ - **rem-query-agent.yaml** - Translates natural language to REM queries
34
+ - **resource-affinity-assessor.yaml** - Calculates semantic affinity between resources
35
+ - **user-profile-builder.yaml** - Builds user profiles from activity data
36
+
37
+ ### Example Agents (`examples/`)
38
+ Domain-specific agents and examples for testing and demonstration:
39
+ - **contract-analyzer.yaml** - Legal contract analysis
40
+ - **contract-extractor.yaml** - Contract data extraction
41
+ - **cv-parser.yaml** - CV/resume parsing
42
+ - **hello-world.yaml** - Simple example agent
43
+ - **query.yaml** - Query example
44
+ - **simple.yaml** - Minimal example
45
+ - **test.yaml** - Testing agent
46
+
47
+ ## Usage
48
+
49
+ The schema loader automatically searches all subdirectories. You can reference schemas by:
50
+
51
+ ```bash
52
+ # Short name (searches all folders automatically)
53
+ rem ask moment-builder "Build moments for last week"
54
+ rem ask contract-analyzer -i contract.pdf
55
+
56
+ # With folder prefix (explicit)
57
+ rem ask core/moment-builder "Build moments"
58
+ rem ask examples/contract-analyzer -i contract.pdf
59
+
60
+ # Full path to the schema file
61
+ rem ask schemas/agents/core/moment-builder.yaml
62
+ ```
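The three reference styles above (short name, folder prefix, full path) suggest a lookup order. The sketch below shows one way such resolution could work. It is an assumption about behavior, not the code in `schema_loader.py`, and `resolve_schema` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def resolve_schema(name: str, agents_dir: Path) -> Optional[Path]:
    """Resolve a schema reference to a file path, or None if not found."""
    # 1. Full path: the reference already names a .yaml file.
    candidate = Path(name)
    if candidate.suffix == ".yaml":
        return candidate if candidate.is_file() else None

    # 2. Folder prefix (e.g. "core/moment-builder") or a top-level schema.
    prefixed = agents_dir / f"{name}.yaml"
    if prefixed.is_file():
        return prefixed

    # 3. Short name: search every subdirectory. Sorting makes the choice
    #    deterministic if the same name exists in more than one folder.
    matches = sorted(agents_dir.rglob(f"{name}.yaml"))
    return matches[0] if matches else None
```

Checking the explicit forms before the recursive search means a folder-prefixed reference never resolves to a same-named schema in a different folder.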
63
+
64
+ ## Creating New Agents
65
+
66
+ 1. **For system agents**: Add to `core/` folder
67
+ 2. **For domain-specific agents**: Add to `examples/` folder
68
+ 3. **For new categories**: Create a new folder and update `schema_loader.py`
69
+
70
+ Schema structure:
71
+ ```yaml
72
+ ---
73
+ type: object
74
+ description: |
75
+ System prompt with LLM instructions.
76
+
77
+ properties:
78
+ # Output schema fields
79
+ field_name:
80
+ type: string
81
+ description: Field description
82
+
83
+ required:
84
+ - required_fields
85
+
86
+ json_schema_extra:
87
+ fully_qualified_name: rem.agents.YourAgent
88
+ version: "1.0.0"
89
+ tags: [category, type]
90
+ ```
91
+
92
+ See [CLAUDE.md](../../../../../../CLAUDE.md) for complete documentation on agent schemas and the REM architecture.