remdb 0.3.226__py3-none-any.whl → 0.3.245__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.


This version of remdb might be problematic.

rem/agentic/README.md CHANGED
@@ -718,263 +718,37 @@ See `rem/api/README.md` for full SSE event protocol documentation.
 
  ## Multi-Agent Orchestration
 
- REM supports hierarchical agent orchestration where agents can delegate work to other agents via the `ask_agent` tool. This enables complex workflows with specialized agents.
-
- ### Architecture
-
- ```mermaid
- sequenceDiagram
-     participant User
-     participant API as Chat API
-     participant Orchestrator as Orchestrator Agent
-     participant EventSink as Event Sink (Queue)
-     participant Child as Child Agent
-     participant DB as PostgreSQL
-
-     User->>API: POST /chat/completions (stream=true)
-     API->>API: Create event sink (asyncio.Queue)
-     API->>Orchestrator: agent.iter(prompt)
-
-     loop Streaming Loop
-         Orchestrator->>API: PartDeltaEvent (text)
-         API->>User: SSE: data: {"delta": {"content": "..."}}
-     end
-
-     Orchestrator->>Orchestrator: Decides to call ask_agent
-     Orchestrator->>API: ToolCallPart (ask_agent)
-     API->>User: SSE: event: tool_call
-
-     API->>Child: ask_agent("child_name", input)
-     Child->>EventSink: push_event(child_content)
-     EventSink->>API: Consume child events
-     API->>User: SSE: data: {"delta": {"content": "..."}}
-
-     Child->>Child: Completes
-     Child-->>Orchestrator: Return result
-
-     Orchestrator->>API: Final response
-     API->>DB: Save tool calls
-     API->>DB: Save assistant message
-     API->>User: SSE: data: [DONE]
- ```
-
- ### Event Sink Pattern
-
- When an agent delegates to a child via `ask_agent`, the child's streaming events need to bubble up to the parent's stream. This is achieved through an **event sink** pattern using Python's `ContextVar`:
-
- ```python
- # context.py
- from contextvars import ContextVar
-
- _parent_event_sink: ContextVar["asyncio.Queue | None"] = ContextVar(
-     "parent_event_sink", default=None
- )
-
- async def push_event(event: Any) -> bool:
-     """Push event to parent's event sink if available."""
-     sink = _parent_event_sink.get()
-     if sink is not None:
-         await sink.put(event)
-         return True
-     return False
- ```
-
- The streaming controller sets up the event sink before agent execution:
-
- ```python
- # streaming.py
- child_event_sink: asyncio.Queue = asyncio.Queue()
- set_event_sink(child_event_sink)
-
- async for node in agent.iter(prompt):
-     # Process agent events...
-
-     # Consume any child events that arrived
-     while not child_event_sink.empty():
-         child_event = child_event_sink.get_nowait()
-         if child_event["type"] == "child_content":
-             yield format_sse_content_delta(child_event["content"])
- ```
-
- ### ask_agent Tool Implementation
-
- The `ask_agent` tool in `mcp_router/tools.py` uses Pydantic AI's streaming iteration:
-
- ```python
- async def ask_agent(agent_name: str, input_text: str, ...):
-     """Delegate work to another agent."""
-
-     # Load and create child agent
-     schema = await load_agent_schema_async(agent_name, user_id)
-     child_agent = await create_agent(context=context, agent_schema_override=schema)
-
-     # Stream child agent with event proxying
-     async with child_agent.iter(prompt) as agent_run:
-         async for node in agent_run:
-             if Agent.is_model_request_node(node):
-                 async with node.stream(agent_run.ctx) as request_stream:
-                     async for event in request_stream:
-                         if isinstance(event, PartDeltaEvent):
-                             # Push content to parent's event sink
-                             await push_event({
-                                 "type": "child_content",
-                                 "agent_name": agent_name,
-                                 "content": event.delta.content_delta,
-                             })
-
-     return agent_run.result
- ```
-
- ### Pydantic AI Features Used
+ Agents can delegate work to other agents via the `ask_agent` tool. This enables orchestrator patterns where a parent agent routes to specialists.
 
- #### 1. Streaming Iteration (`agent.iter()`)
+ ### How It Works
 
- Unlike `agent.run()` which blocks until completion, `agent.iter()` provides fine-grained control over the execution flow:
+ 1. **Parent agent** calls `ask_agent(agent_name, input_text)`
+ 2. **Child agent** executes and streams its response
+ 3. **Child events** bubble up to parent via an event sink (asyncio.Queue in ContextVar)
+ 4. **All tool calls** are saved to the database for the session
 
- ```python
- async with agent.iter(prompt) as agent_run:
-     async for node in agent_run:
-         if Agent.is_model_request_node(node):
-             # Model is generating - stream the response
-             async with node.stream(agent_run.ctx) as stream:
-                 async for event in stream:
-                     if isinstance(event, PartStartEvent):
-                         # Tool call starting
-                     elif isinstance(event, PartDeltaEvent):
-                         # Content chunk
-         elif Agent.is_call_tools_node(node):
-             # Tools are being executed
-             async with node.stream(agent_run.ctx) as stream:
-                 async for event in stream:
-                     if isinstance(event, FunctionToolResultEvent):
-                         # Tool completed
- ```
-
- #### 2. Node Types
-
- - **`ModelRequestNode`**: The model is generating a response (text or tool calls)
- - **`CallToolsNode`**: Tools are being executed
- - **`End`**: Agent execution complete
-
- #### 3. Event Types
+ ### Key Components
 
- - **`PartStartEvent`**: A new part (text or tool call) is starting
- - **`PartDeltaEvent`**: Content chunk for streaming text
- - **`FunctionToolResultEvent`**: Tool execution completed with result
- - **`ToolCallPart`**: Metadata about a tool call (name, arguments)
- - **`TextPart`**: Text content
+ | Component | Location | Purpose |
+ |-----------|----------|---------|
+ | `ask_agent` tool | `mcp_router/tools.py` | Loads child agent, runs with streaming, pushes events to sink |
+ | Event sink | `context.py` | ContextVar holding asyncio.Queue for child→parent event flow |
+ | Streaming controller | `streaming.py` | Drains event sink, emits SSE events, saves to DB |
 
- ### Message Persistence
-
- All messages are persisted to PostgreSQL for session continuity:
-
- ```python
- # streaming.py - after agent completes
- async def save_session_messages(...):
-     store = SessionMessageStore(user_id=user_id)
-
-     # Save each tool call as a tool message
-     for tool_call in tool_calls:
-         await store.save_message(
-             session_id=session_id,
-             role="tool",
-             content=tool_call.result,
-             tool_name=tool_call.name,
-             tool_call_id=tool_call.id,
-         )
-
-     # Save the final assistant response
-     await store.save_message(
-         session_id=session_id,
-         role="assistant",
-         content=accumulated_content,
-     )
- ```
+ ### Event Types
 
- Messages are stored with:
- - **Embeddings**: For semantic search across conversation history
- - **Compression**: Long conversations are summarized to manage context window
- - **Session isolation**: Each session maintains its own message history
+ Child agents emit events that the parent streams to the client:
 
- ### Testing Multi-Agent Systems
+ - **`child_tool_start`**: Child is calling a tool (logged, streamed, saved to DB)
+ - **`child_content`**: Child's text response (streamed as SSE content delta)
+ - **`child_tool_result`**: Tool completed with result (metadata extraction)
 
- #### Integration Tests
+ ### Testing
 
- Real end-to-end tests without mocking are in `tests/integration/test_ask_agent_streaming.py`:
-
- ```python
- class TestAskAgentStreaming:
-     async def test_ask_agent_streams_and_saves(self, session_id, user_id):
-         """Test delegation via ask_agent."""
-         # Uses test_orchestrator which always delegates to test_responder
-         agent = await create_agent(context=context, agent_schema_override=schema)
-
-         chunks = []
-         async for chunk in stream_openai_response_with_save(
-             agent=agent,
-             prompt="Hello, please delegate this",
-             ...
-         ):
-             chunks.append(chunk)
-
-         # Verify streaming worked
-         assert len(content_chunks) > 0
-
-         # Verify persistence
-         messages = await store.load_session_messages(session_id)
-         assert len([m for m in messages if m["role"] == "assistant"]) == 1
-         assert len([m for m in messages if m["tool_name"] == "ask_agent"]) >= 1
-
-     async def test_multi_turn_saves_all_assistant_messages(self, session_id, user_id):
-         """Test that each turn saves its own assistant message.
-
-         This catches scoping bugs like accumulated_content not being
-         properly scoped per-turn.
-         """
-         turn_prompts = [
-             "Hello, how are you?",
-             "Tell me something interesting",
-             "Thanks for chatting!",
-         ]
-
-         for prompt in turn_prompts:
-             async for chunk in stream_openai_response_with_save(...):
-                 pass
-
-         # Each turn should save an assistant message
-         messages = await store.load_session_messages(session_id)
-         assistant_msgs = [m for m in messages if m["role"] == "assistant"]
-         assert len(assistant_msgs) == 3
- ```
-
- #### Test Agent Schemas
-
- Test agents are defined in `tests/data/schemas/agents/`:
-
- - **`test_orchestrator.yaml`**: Always delegates via `ask_agent`
- - **`test_responder.yaml`**: Simple agent that responds directly
-
- ```yaml
- # test_orchestrator.yaml
- type: object
- description: |
-   You are a TEST ORCHESTRATOR that ALWAYS delegates to another agent.
-   Call ask_agent with agent_name="test_responder" on EVERY turn.
- json_schema_extra:
-   kind: agent
-   name: test_orchestrator
-   tools:
-     - name: ask_agent
-       mcp_server: rem
- ```
-
- #### Running Integration Tests
-
- ```bash
- # Run individually (recommended due to async isolation)
- POSTGRES__CONNECTION_STRING="postgresql://rem:rem@localhost:5050/rem" \
-   uv run pytest tests/integration/test_ask_agent_streaming.py::TestAskAgentStreaming::test_multi_turn_saves_all_assistant_messages -v -s
- ```
+ Integration tests in `tests/integration/test_ask_agent_streaming.py` verify:
+ - Child content streams correctly to client
+ - Tool calls are persisted to database
+ - Multi-turn conversations save all messages
 
  ## Future Work
 
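The trimmed README keeps only a summary of the event-sink pattern. As a minimal sketch of what it describes — child events pushed onto an `asyncio.Queue` held in a `ContextVar` and drained by the parent's streaming loop — using the helper names from the snippets removed above; treat the exact signatures as illustrative rather than the package's actual API:

```python
# Illustrative sketch of the child→parent event flow described above.
# Based on the snippets removed from the README; not the package's exact API.
import asyncio
from contextvars import ContextVar
from typing import Any

_parent_event_sink: ContextVar["asyncio.Queue | None"] = ContextVar(
    "parent_event_sink", default=None
)

async def push_event(event: Any) -> bool:
    """Push a child event to the parent's queue if one is registered."""
    sink = _parent_event_sink.get()
    if sink is not None:
        await sink.put(event)
        return True
    return False

async def drain_child_events(sink: asyncio.Queue):
    """Parent-side loop: turn queued child events into SSE-style content deltas."""
    while not sink.empty():
        event = sink.get_nowait()
        if event["type"] == "child_content":
            # child_tool_start / child_tool_result events would be logged,
            # streamed, and saved to the database here
            yield {"delta": {"content": event["content"]}}
```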
rem/agentic/context.py CHANGED
@@ -16,6 +16,7 @@ Headers Mapping:
  X-Agent-Schema → context.agent_schema_uri (default: "rem")
  X-Model-Name → context.default_model
  X-Is-Eval → context.is_eval (marks session as evaluation)
+ X-Client-Id → context.client_id (e.g., "web", "mobile", "cli")
 
  Key Design Pattern:
  - AgentContext is passed to agent factory, not stored in agents
@@ -222,6 +223,11 @@ class AgentContext(BaseModel):
  description="Whether this is an evaluation session (set via X-Is-Eval header)",
  )
 
+ client_id: str | None = Field(
+ default=None,
+ description="Client identifier (e.g., 'web', 'mobile', 'cli') set via X-Client-Id header",
+ )
+
  model_config = {"populate_by_name": True}
 
  def child_context(
@@ -232,7 +238,7 @@ class AgentContext(BaseModel):
  """
  Create a child context for nested agent calls.
 
- Inherits user_id, tenant_id, session_id, is_eval from parent.
+ Inherits user_id, tenant_id, session_id, is_eval, client_id from parent.
  Allows overriding agent_schema_uri and default_model for the child.
 
  Args:
@@ -256,6 +262,7 @@ class AgentContext(BaseModel):
  default_model=model_override or self.default_model,
  agent_schema_uri=agent_schema_uri or self.agent_schema_uri,
  is_eval=self.is_eval,
+ client_id=self.client_id,
  )
 
  @staticmethod
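For orientation, a hypothetical usage sketch of the propagation this hunk enables. Field and parameter names are taken from this diff; the module path is inferred from the file path, and the real `AgentContext` may require additional arguments.

```python
# Hypothetical sketch: names follow this diff, not a verified public API.
from rem.agentic.context import AgentContext

parent = AgentContext(
    user_id="user-123",
    tenant_id="acme-corp",
    session_id="sess-456",
    client_id="web",  # new in 0.3.245, set from the X-Client-Id header
)

# child_context() inherits user_id, tenant_id, session_id, is_eval and now
# client_id; the child may override the agent schema and model.
child = parent.child_context(agent_schema_uri="test_responder")
assert child.client_id == "web"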
@@ -374,6 +381,7 @@ class AgentContext(BaseModel):
  default_model=normalized.get("x-model-name") or settings.llm.default_model,
  agent_schema_uri=normalized.get("x-agent-schema"),
  is_eval=is_eval,
+ client_id=normalized.get("x-client-id"),
  )
 
  @classmethod
@@ -391,6 +399,7 @@ class AgentContext(BaseModel):
  - X-Model-Name: Model override
  - X-Agent-Schema: Agent schema URI
  - X-Is-Eval: Whether this is an evaluation session (true/false)
+ - X-Client-Id: Client identifier (e.g., "web", "mobile", "cli")
 
  Args:
  headers: Dictionary of HTTP headers (case-insensitive)
@@ -404,7 +413,8 @@ class AgentContext(BaseModel):
  "X-Tenant-Id": "acme-corp",
  "X-Session-Id": "sess-456",
  "X-Model-Name": "anthropic:claude-opus-4-20250514",
- "X-Is-Eval": "true"
+ "X-Is-Eval": "true",
+ "X-Client-Id": "web"
  }
  context = AgentContext.from_headers(headers)
  """
@@ -422,4 +432,5 @@ class AgentContext(BaseModel):
  default_model=normalized.get("x-model-name") or settings.llm.default_model,
  agent_schema_uri=normalized.get("x-agent-schema"),
  is_eval=is_eval,
+ client_id=normalized.get("x-client-id"),
  )
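Taken together with the docstring above, a client could opt in to the new header like this. Only the header names come from this diff; the host, port, endpoint path, and request body shape are assumptions based on the README's `/chat/completions` example.

```python
# Illustrative request only: host, port, and body shape are assumptions.
import httpx

headers = {
    "X-Tenant-Id": "acme-corp",
    "X-Session-Id": "sess-456",
    "X-Is-Eval": "true",
    "X-Client-Id": "web",  # new in 0.3.245 → AgentContext.client_id
}

response = httpx.post(
    "http://localhost:8000/chat/completions",  # hypothetical local REM API
    headers=headers,
    json={"messages": [{"role": "user", "content": "Hello"}], "stream": False},
)
print(response.status_code)
```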
@@ -4,15 +4,12 @@ Centralized context builder for agent execution.
  Session History (ALWAYS loaded with compression):
  - Each chat request is a single message, so session history MUST be recovered
  - Uses SessionMessageStore with compression to keep context efficient
- - Long assistant responses include REM LOOKUP hints: "... [REM LOOKUP session-{id}-msg-{index}] ..."
- - Agent can retrieve full content on-demand using REM LOOKUP
  - Prevents context window bloat while maintaining conversation continuity
 
  User Context (on-demand by default):
- - System message includes REM LOOKUP hint for user profile
- - Agent decides whether to load profile based on query
- - More efficient for queries that don't need personalization
- - Example: "User: sarah@example.com. To load user profile: Use REM LOOKUP \"sarah@example.com\""
+ - System message includes user email for context awareness
+ - Fails silently if user not found - agent proceeds without user context
+ - Example: "User: sarah@example.com"
 
  User Context (auto-inject when enabled):
  - Set CHAT__AUTO_INJECT_USER_CONTEXT=true
@@ -22,8 +19,8 @@ User Context (auto-inject when enabled):
  Design Pattern:
  1. Extract AgentContext from headers (user_id, tenant_id, session_id)
  2. If auto-inject enabled: Load User/Session from database
- 3. If auto-inject disabled: Provide REM LOOKUP hints in system message
- 4. Construct system message with date + context (injected or hints)
+ 3. If auto-inject disabled: Show user email for context (fail silently if not found)
+ 4. Construct system message with date + context
  5. Return complete context ready for agent execution
 
  Integration Points:
@@ -40,11 +37,10 @@ Usage (on-demand, default):
 
  # Messages list structure (on-demand):
  # [
- # {"role": "system", "content": "Today's date: 2025-11-22\nUser: sarah@example.com\nTo load user profile: Use REM LOOKUP \"sarah@example.com\"\nSession ID: sess-123\nTo load session history: Use REM LOOKUP messages?session_id=sess-123"},
+ # {"role": "system", "content": "Today's date: 2025-11-22\n\nUser: sarah@example.com"},
  # {"role": "user", "content": "What's next for the API migration?"}
  # ]
 
- # Agent receives hints and can decide to load context if needed
  agent = await create_agent(context=context, ...)
  prompt = "\n".join(msg.content for msg in messages)
  result = await agent.run(prompt)
@@ -52,7 +48,7 @@ Usage (on-demand, default):
  Usage (auto-inject, CHAT__AUTO_INJECT_USER_CONTEXT=true):
  # Messages list structure (auto-inject):
  # [
- # {"role": "system", "content": "Today's date: 2025-11-22\n\nUser Context (auto-injected):\nSummary: ...\nInterests: ...\n\nSession History (auto-injected, 5 messages):"},
+ # {"role": "system", "content": "Today's date: 2025-11-22\n\nUser Context (auto-injected):\nSummary: ...\nInterests: ..."},
  # {"role": "user", "content": "Previous message"},
  # {"role": "assistant", "content": "Previous response"},
  # {"role": "user", "content": "What's next for the API migration?"}
@@ -110,13 +106,11 @@ class ContextBuilder:
 
  Session History (ALWAYS loaded with compression):
  - If session_id provided, session history is ALWAYS loaded using SessionMessageStore
- - Compression keeps it efficient with REM LOOKUP hints for long messages
- - Example: "... [Message truncated - REM LOOKUP session-{id}-msg-{index}] ..."
- - Agent can retrieve full content on-demand using REM LOOKUP
+ - Compression keeps context efficient
 
  User Context (on-demand by default):
- - System message includes REM LOOKUP hint: "User: {email}. To load user profile: Use REM LOOKUP \"{email}\""
- - Agent decides whether to load profile based on query
+ - System message includes user email: "User: {email}"
+ - Fails silently if user not found - agent proceeds without user context
 
  User Context (auto-inject when enabled):
  - Set CHAT__AUTO_INJECT_USER_CONTEXT=true
@@ -137,9 +131,9 @@ class ContextBuilder:
 
  # messages structure:
  # [
- # {"role": "system", "content": "Today's date: 2025-11-22\nUser: sarah@example.com\nTo load user profile: Use REM LOOKUP \"sarah@example.com\""},
+ # {"role": "system", "content": "Today's date: 2025-11-22\n\nUser: sarah@example.com"},
  # {"role": "user", "content": "Previous message"},
- # {"role": "assistant", "content": "Start of long response... [REM LOOKUP session-123-msg-1] ...end"},
+ # {"role": "assistant", "content": "Previous response"},
  # {"role": "user", "content": "New message"}
  # ]
  """
@@ -158,6 +152,7 @@ class ContextBuilder:
  default_model=context.default_model,
  agent_schema_uri=context.agent_schema_uri,
  is_eval=context.is_eval,
+ client_id=context.client_id,
  )
 
  # Initialize DB if not provided and needed (for user context or session history)
@@ -177,6 +172,10 @@ class ContextBuilder:
  today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
  context_hint = f"Today's date: {today}."
 
+ # Add client identifier if present
+ if context.client_id:
+     context_hint += f"\nClient: {context.client_id}"
+
  # Add user context (auto-inject or on-demand hint)
  if settings.chat.auto_inject_user_context and context.user_id and db:
  # Auto-inject: Load and include user profile
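With this change, the system-message hint assembled by the builder would start roughly like the following. This is a sketch that mirrors the f-strings in this hunk with example values; the user and session sections that follow depend on configuration.

```python
# Sketch of the context_hint assembly shown in this hunk (values are examples).
from datetime import datetime, timezone

client_id = "web"
today = datetime.now(timezone.utc).strftime("%Y-%m-%d")

context_hint = f"Today's date: {today}."
if client_id:
    context_hint += f"\nClient: {client_id}"

# e.g. "Today's date: 2025-11-22.\nClient: web"
print(context_hint)
```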
@@ -189,18 +188,18 @@ class ContextBuilder:
  context_hint += f"\n\nUser Context (auto-injected):\n{user_context_content}"
  else:
  context_hint += "\n\nNo user context available (anonymous or new user)."
- elif context.user_id:
-     # On-demand: Provide hint to use REM LOOKUP
-     # user_id is UUID5 hash of email - load user to get email for display and LOOKUP
-     user_repo = Repository(User, "users", db=db)
-     user = await user_repo.get_by_id(context.user_id, context.tenant_id)
-     if user and user.email:
-         # Show email (more useful than UUID) and LOOKUP hint
-         context_hint += f"\n\nUser: {user.email}"
-         context_hint += f"\nTo load user profile: Use REM LOOKUP \"{user.email}\""
-     else:
-         context_hint += f"\n\nUser ID: {context.user_id}"
-         context_hint += "\nUser profile not available."
+ elif context.user_id and db:
+     # On-demand: Show user email for context (no REM LOOKUP - it requires exact user_id match)
+     # Fail silently if user lookup fails - just proceed without user context
+     try:
+         user_repo = Repository(User, "users", db=db)
+         user = await user_repo.get_by_id(context.user_id, context.tenant_id)
+         if user and user.email:
+             context_hint += f"\n\nUser: {user.email}"
+         # If user not found, just proceed without adding user context
+     except Exception as e:
+         # Fail silently - don't block agent execution if user lookup fails
+         logger.debug(f"Could not load user context: {e}")
 
  # Add system context hint
  messages.append(ContextMessage(role="system", content=context_hint))
@@ -318,6 +317,7 @@ class ContextBuilder:
  session_id: str | None = None,
  message: str = "Hello",
  model: str | None = None,
+ client_id: str | None = None,
  ) -> tuple[AgentContext, list[ContextMessage]]:
  """
  Build context for testing (no database lookup).
@@ -325,7 +325,7 @@ class ContextBuilder:
  Creates minimal context with:
  - Test user (test@rem.ai)
  - Test tenant
- - Context hint with date
+ - Context hint with date and client
  - Single user message
 
  Args:
@@ -334,6 +334,7 @@ class ContextBuilder:
  session_id: Optional session ID
  message: User message content
  model: Optional model override
+ client_id: Optional client identifier (e.g., "cli", "test")
 
  Returns:
  Tuple of (AgentContext, messages list)
@@ -341,7 +342,8 @@ class ContextBuilder:
  Example:
  context, messages = await ContextBuilder.build_from_test(
  user_id="test@rem.ai",
- message="What's the weather like?"
+ message="What's the weather like?",
+ client_id="cli"
  )
  """
  from ..settings import settings
@@ -352,11 +354,15 @@ class ContextBuilder:
  tenant_id=tenant_id,
  session_id=session_id,
  default_model=model or settings.llm.default_model,
+ client_id=client_id,
  )
 
  # Build minimal messages
  today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
- context_hint = f"Today's date: {today}.\n\nTest user context: {user_id} (test mode, no profile loaded)."
+ context_hint = f"Today's date: {today}."
+ if client_id:
+     context_hint += f"\nClient: {client_id}"
+ context_hint += f"\n\nTest user context: {user_id} (test mode, no profile loaded)."
 
  messages = [
  ContextMessage(role="system", content=context_hint),