remdb 0.3.200__py3-none-any.whl → 0.3.226__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.


This version of remdb might be problematic. Click here for more details.

rem/agentic/README.md CHANGED
@@ -716,11 +716,271 @@ curl -X POST http://localhost:8000/api/v1/chat/completions \
716
716
 
717
717
  See `rem/api/README.md` for full SSE event protocol documentation.
718
718
 
719
+ ## Multi-Agent Orchestration
720
+
721
+ REM supports hierarchical agent orchestration where agents can delegate work to other agents via the `ask_agent` tool. This enables complex workflows with specialized agents.
722
+
723
+ ### Architecture
724
+
725
+ ```mermaid
726
+ sequenceDiagram
727
+ participant User
728
+ participant API as Chat API
729
+ participant Orchestrator as Orchestrator Agent
730
+ participant EventSink as Event Sink (Queue)
731
+ participant Child as Child Agent
732
+ participant DB as PostgreSQL
733
+
734
+ User->>API: POST /chat/completions (stream=true)
735
+ API->>API: Create event sink (asyncio.Queue)
736
+ API->>Orchestrator: agent.iter(prompt)
737
+
738
+ loop Streaming Loop
739
+ Orchestrator->>API: PartDeltaEvent (text)
740
+ API->>User: SSE: data: {"delta": {"content": "..."}}
741
+ end
742
+
743
+ Orchestrator->>Orchestrator: Decides to call ask_agent
744
+ Orchestrator->>API: ToolCallPart (ask_agent)
745
+ API->>User: SSE: event: tool_call
746
+
747
+ API->>Child: ask_agent("child_name", input)
748
+ Child->>EventSink: push_event(child_content)
749
+ EventSink->>API: Consume child events
750
+ API->>User: SSE: data: {"delta": {"content": "..."}}
751
+
752
+ Child->>Child: Completes
753
+ Child-->>Orchestrator: Return result
754
+
755
+ Orchestrator->>API: Final response
756
+ API->>DB: Save tool calls
757
+ API->>DB: Save assistant message
758
+ API->>User: SSE: data: [DONE]
759
+ ```
760
+
761
+ ### Event Sink Pattern
762
+
763
+ When an agent delegates to a child via `ask_agent`, the child's streaming events need to bubble up to the parent's stream. This is achieved through an **event sink** pattern using Python's `ContextVar`:
764
+
765
+ ```python
766
+ # context.py
767
+ from contextvars import ContextVar
768
+
769
+ _parent_event_sink: ContextVar["asyncio.Queue | None"] = ContextVar(
770
+ "parent_event_sink", default=None
771
+ )
772
+
773
+ async def push_event(event: Any) -> bool:
774
+ """Push event to parent's event sink if available."""
775
+ sink = _parent_event_sink.get()
776
+ if sink is not None:
777
+ await sink.put(event)
778
+ return True
779
+ return False
780
+ ```
781
+
782
+ The streaming controller sets up the event sink before agent execution:
783
+
784
+ ```python
785
+ # streaming.py
786
+ child_event_sink: asyncio.Queue = asyncio.Queue()
787
+ set_event_sink(child_event_sink)
788
+
789
+ async for node in agent.iter(prompt):
790
+ # Process agent events...
791
+
792
+ # Consume any child events that arrived
793
+ while not child_event_sink.empty():
794
+ child_event = child_event_sink.get_nowait()
795
+ if child_event["type"] == "child_content":
796
+ yield format_sse_content_delta(child_event["content"])
797
+ ```
798
+
799
+ ### ask_agent Tool Implementation
800
+
801
+ The `ask_agent` tool in `mcp_router/tools.py` uses Pydantic AI's streaming iteration:
802
+
803
+ ```python
804
+ async def ask_agent(agent_name: str, input_text: str, ...):
805
+ """Delegate work to another agent."""
806
+
807
+ # Load and create child agent
808
+ schema = await load_agent_schema_async(agent_name, user_id)
809
+ child_agent = await create_agent(context=context, agent_schema_override=schema)
810
+
811
+ # Stream child agent with event proxying
812
+ async with child_agent.iter(prompt) as agent_run:
813
+ async for node in agent_run:
814
+ if Agent.is_model_request_node(node):
815
+ async with node.stream(agent_run.ctx) as request_stream:
816
+ async for event in request_stream:
817
+ if isinstance(event, PartDeltaEvent):
818
+ # Push content to parent's event sink
819
+ await push_event({
820
+ "type": "child_content",
821
+ "agent_name": agent_name,
822
+ "content": event.delta.content_delta,
823
+ })
824
+
825
+ return agent_run.result
826
+ ```
827
+
828
+ ### Pydantic AI Features Used
829
+
830
+ #### 1. Streaming Iteration (`agent.iter()`)
831
+
832
+ Unlike `agent.run()`, which awaits the entire run and returns only the final result, `agent.iter()` provides fine-grained control over the execution flow:
833
+
834
+ ```python
835
+ async with agent.iter(prompt) as agent_run:
836
+ async for node in agent_run:
837
+ if Agent.is_model_request_node(node):
838
+ # Model is generating - stream the response
839
+ async with node.stream(agent_run.ctx) as stream:
840
+ async for event in stream:
841
+ if isinstance(event, PartStartEvent):
842
+ # Tool call starting
843
+ elif isinstance(event, PartDeltaEvent):
844
+ # Content chunk
845
+ elif Agent.is_call_tools_node(node):
846
+ # Tools are being executed
847
+ async with node.stream(agent_run.ctx) as stream:
848
+ async for event in stream:
849
+ if isinstance(event, FunctionToolResultEvent):
850
+ # Tool completed
851
+ ```
852
+
853
+ #### 2. Node Types
854
+
855
+ - **`ModelRequestNode`**: The model is generating a response (text or tool calls)
856
+ - **`CallToolsNode`**: Tools are being executed
857
+ - **`End`**: Agent execution complete
858
+
859
+ #### 3. Event and Part Types
860
+
861
+ - **`PartStartEvent`**: A new part (text or tool call) is starting
862
+ - **`PartDeltaEvent`**: Content chunk for streaming text
863
+ - **`FunctionToolResultEvent`**: Tool execution completed with result
864
+ - **`ToolCallPart`**: Metadata about a tool call (name, arguments)
865
+ - **`TextPart`**: Text content
866
+
867
+ ### Message Persistence
868
+
869
+ All messages are persisted to PostgreSQL for session continuity:
870
+
871
+ ```python
872
+ # streaming.py - after agent completes
873
+ async def save_session_messages(...):
874
+ store = SessionMessageStore(user_id=user_id)
875
+
876
+ # Save each tool call as a tool message
877
+ for tool_call in tool_calls:
878
+ await store.save_message(
879
+ session_id=session_id,
880
+ role="tool",
881
+ content=tool_call.result,
882
+ tool_name=tool_call.name,
883
+ tool_call_id=tool_call.id,
884
+ )
885
+
886
+ # Save the final assistant response
887
+ await store.save_message(
888
+ session_id=session_id,
889
+ role="assistant",
890
+ content=accumulated_content,
891
+ )
892
+ ```
893
+
894
+ Messages are stored with:
895
+ - **Embeddings**: For semantic search across conversation history
896
+ - **Compression**: Long conversations are summarized to manage context window
897
+ - **Session isolation**: Each session maintains its own message history
898
+
899
+ ### Testing Multi-Agent Systems
900
+
901
+ #### Integration Tests
902
+
903
+ Real end-to-end tests without mocking are in `tests/integration/test_ask_agent_streaming.py`:
904
+
905
+ ```python
906
+ class TestAskAgentStreaming:
907
+ async def test_ask_agent_streams_and_saves(self, session_id, user_id):
908
+ """Test delegation via ask_agent."""
909
+ # Uses test_orchestrator which always delegates to test_responder
910
+ agent = await create_agent(context=context, agent_schema_override=schema)
911
+
912
+ chunks = []
913
+ async for chunk in stream_openai_response_with_save(
914
+ agent=agent,
915
+ prompt="Hello, please delegate this",
916
+ ...
917
+ ):
918
+ chunks.append(chunk)
919
+
920
+ # Verify streaming worked
921
+ assert len(content_chunks) > 0
922
+
923
+ # Verify persistence
924
+ messages = await store.load_session_messages(session_id)
925
+ assert len([m for m in messages if m["role"] == "assistant"]) == 1
926
+ assert len([m for m in messages if m["tool_name"] == "ask_agent"]) >= 1
927
+
928
+ async def test_multi_turn_saves_all_assistant_messages(self, session_id, user_id):
929
+ """Test that each turn saves its own assistant message.
930
+
931
+ This catches scoping bugs like accumulated_content not being
932
+ properly scoped per-turn.
933
+ """
934
+ turn_prompts = [
935
+ "Hello, how are you?",
936
+ "Tell me something interesting",
937
+ "Thanks for chatting!",
938
+ ]
939
+
940
+ for prompt in turn_prompts:
941
+ async for chunk in stream_openai_response_with_save(...):
942
+ pass
943
+
944
+ # Each turn should save an assistant message
945
+ messages = await store.load_session_messages(session_id)
946
+ assistant_msgs = [m for m in messages if m["role"] == "assistant"]
947
+ assert len(assistant_msgs) == 3
948
+ ```
949
+
950
+ #### Test Agent Schemas
951
+
952
+ Test agents are defined in `tests/data/schemas/agents/`:
953
+
954
+ - **`test_orchestrator.yaml`**: Always delegates via `ask_agent`
955
+ - **`test_responder.yaml`**: Simple agent that responds directly
956
+
957
+ ```yaml
958
+ # test_orchestrator.yaml
959
+ type: object
960
+ description: |
961
+ You are a TEST ORCHESTRATOR that ALWAYS delegates to another agent.
962
+ Call ask_agent with agent_name="test_responder" on EVERY turn.
963
+ json_schema_extra:
964
+ kind: agent
965
+ name: test_orchestrator
966
+ tools:
967
+ - name: ask_agent
968
+ mcp_server: rem
969
+ ```
970
+
971
+ #### Running Integration Tests
972
+
973
+ ```bash
974
+ # Run individually (recommended due to async isolation)
975
+ POSTGRES__CONNECTION_STRING="postgresql://rem:rem@localhost:5050/rem" \
976
+ uv run pytest tests/integration/test_ask_agent_streaming.py::TestAskAgentStreaming::test_multi_turn_saves_all_assistant_messages -v -s
977
+ ```
978
+
719
979
  ## Future Work
720
980
 
721
981
  - [ ] Phoenix evaluator integration
722
982
  - [ ] Agent schema registry (load schemas by URI)
723
983
  - [ ] Schema validation and versioning
724
- - [ ] Multi-turn conversation management
725
- - [ ] Agent composition (agents calling agents)
984
+ - [x] Multi-turn conversation management
985
+ - [x] Agent composition (agents calling agents)
726
986
  - [ ] Alternative provider implementations (if needed)
rem/agentic/context.py CHANGED
@@ -30,9 +30,10 @@ Multi-Agent Context Propagation:
30
30
  - Child agents inherit user_id, tenant_id, session_id, is_eval from parent
31
31
  """
32
32
 
33
+ import asyncio
33
34
  from contextlib import contextmanager
34
35
  from contextvars import ContextVar
35
- from typing import Generator
36
+ from typing import Any, Generator
36
37
 
37
38
  from loguru import logger
38
39
  from pydantic import BaseModel, Field
@@ -46,6 +47,13 @@ _current_agent_context: ContextVar["AgentContext | None"] = ContextVar(
46
47
  "current_agent_context", default=None
47
48
  )
48
49
 
50
+ # Event sink for streaming child agent events to parent
51
+ # When set, child agents (via ask_agent) should push their events here
52
+ # for the parent's streaming loop to proxy to the client
53
+ _parent_event_sink: ContextVar["asyncio.Queue | None"] = ContextVar(
54
+ "parent_event_sink", default=None
55
+ )
56
+
49
57
 
50
58
  def get_current_context() -> "AgentContext | None":
51
59
  """
@@ -97,6 +105,70 @@ def agent_context_scope(ctx: "AgentContext") -> Generator["AgentContext", None,
97
105
  _current_agent_context.set(previous)
98
106
 
99
107
 
108
+ # =============================================================================
109
+ # Event Sink for Streaming Multi-Agent Delegation
110
+ # =============================================================================
111
+
112
+
113
+ def get_event_sink() -> "asyncio.Queue | None":
114
+ """
115
+ Get the parent's event sink for streaming child events.
116
+
117
+ Used by ask_agent to push child agent events to the parent's stream.
118
+ Returns None if not in a streaming context.
119
+ """
120
+ return _parent_event_sink.get()
121
+
122
+
123
+ def set_event_sink(sink: "asyncio.Queue | None") -> None:
124
+ """Set the event sink for child agents to push events to."""
125
+ _parent_event_sink.set(sink)
126
+
127
+
128
+ @contextmanager
129
+ def event_sink_scope(sink: "asyncio.Queue") -> Generator["asyncio.Queue", None, None]:
130
+ """
131
+ Context manager for scoped event sink setting.
132
+
133
+ Used by streaming layer to set up event proxying before tool execution.
134
+ Child agents (via ask_agent) will push their events to this sink.
135
+
136
+ Example:
137
+ event_queue = asyncio.Queue()
138
+ with event_sink_scope(event_queue):
139
+ # ask_agent will push child events to event_queue
140
+ async for event in tools_stream:
141
+ ...
142
+ # Also consume from event_queue
143
+ """
144
+ previous = _parent_event_sink.get()
145
+ _parent_event_sink.set(sink)
146
+ try:
147
+ yield sink
148
+ finally:
149
+ _parent_event_sink.set(previous)
150
+
151
+
152
+ async def push_event(event: Any) -> bool:
153
+ """
154
+ Push an event to the parent's event sink (if available).
155
+
156
+ Used by ask_agent to proxy child agent events to the parent's stream.
157
+ Returns True if event was pushed, False if no sink available.
158
+
159
+ Args:
160
+ event: Any streaming event (ToolCallEvent, content chunk, etc.)
161
+
162
+ Returns:
163
+ True if event was pushed to sink, False otherwise
164
+ """
165
+ sink = _parent_event_sink.get()
166
+ if sink is not None:
167
+ await sink.put(event)
168
+ return True
169
+ return False
170
+
171
+
100
172
  class AgentContext(BaseModel):
101
173
  """
102
174
  Session and configuration context for agent execution.
@@ -116,7 +116,7 @@ def create_resource_tool(uri: str, usage: str = "", mcp_server: Any = None) -> T
116
116
  the artificial MCP distinction between tools and resources.
117
117
 
118
118
  Supports both:
119
- - Concrete URIs: "rem://schemas" -> tool with no parameters
119
+ - Concrete URIs: "rem://agents" -> tool with no parameters
120
120
  - Template URIs: "patient-profile://field/{field_key}" -> tool with field_key parameter
121
121
 
122
122
  Args:
@@ -131,7 +131,7 @@ def create_resource_tool(uri: str, usage: str = "", mcp_server: Any = None) -> T
131
131
 
132
132
  Example:
133
133
  # Concrete URI -> no-param tool
134
- tool = create_resource_tool("rem://schemas", "List all agent schemas")
134
+ tool = create_resource_tool("rem://agents", "List all agent schemas")
135
135
 
136
136
  # Template URI -> parameterized tool
137
137
  tool = create_resource_tool("patient-profile://field/{field_key}", "Get field definition", mcp_server=mcp)
@@ -732,7 +732,7 @@ async def create_agent(
732
732
  # the artificial MCP distinction between tools and resources
733
733
  #
734
734
  # Supports both concrete and template URIs:
735
- # - Concrete: "rem://schemas" -> no-param tool
735
+ # - Concrete: "rem://agents" -> no-param tool
736
736
  # - Template: "patient-profile://field/{field_key}" -> tool with field_key param
737
737
  from ..mcp.tool_wrapper import create_resource_tool
738
738
 
rem/agentic/schema.py CHANGED
@@ -79,7 +79,7 @@ class MCPResourceReference(BaseModel):
79
79
 
80
80
  Example (exact URI):
81
81
  {
82
- "uri": "rem://schemas",
82
+ "uri": "rem://agents",
83
83
  "name": "Agent Schemas",
84
84
  "description": "List all available agent schemas"
85
85
  }
@@ -96,7 +96,7 @@ class MCPResourceReference(BaseModel):
96
96
  default=None,
97
97
  description=(
98
98
  "Exact resource URI or URI with query parameters. "
99
- "Examples: 'rem://schemas', 'rem://resources?category=drug.*'"
99
+ "Examples: 'rem://agents', 'rem://resources?category=drug.*'"
100
100
  )
101
101
  )
102
102
 
@@ -594,15 +594,18 @@ async def read_resource(uri: str) -> dict[str, Any]:
594
594
  **Available Resources:**
595
595
 
596
596
  Agent Schemas:
597
- • rem://schemas - List all agent schemas
598
- • rem://schema/{name} - Get specific schema definition
599
- • rem://schema/{name}/{version} - Get specific version
597
+ • rem://agents - List all available agent schemas
598
+ • rem://agents/{agent_name} - Get specific agent schema
599
+
600
+ Documentation:
601
+ • rem://schema/entities - Entity schemas (Resource, Message, User, File, Moment)
602
+ • rem://schema/query-types - REM query type documentation
600
603
 
601
604
  System Status:
602
605
  • rem://status - System health and statistics
603
606
 
604
607
  Args:
605
- uri: Resource URI (e.g., "rem://schemas", "rem://schema/ask_rem")
608
+ uri: Resource URI (e.g., "rem://agents", "rem://agents/ask_rem")
606
609
 
607
610
  Returns:
608
611
  Dict with:
@@ -611,14 +614,11 @@ async def read_resource(uri: str) -> dict[str, Any]:
611
614
  - data: Resource data (format depends on resource type)
612
615
 
613
616
  Examples:
614
- # List all schemas
615
- read_resource(uri="rem://schemas")
616
-
617
- # Get specific schema
618
- read_resource(uri="rem://schema/ask_rem")
617
+ # List all agents
618
+ read_resource(uri="rem://agents")
619
619
 
620
- # Get schema version
621
- read_resource(uri="rem://schema/ask_rem/v1.0.0")
620
+ # Get specific agent
621
+ read_resource(uri="rem://agents/ask_rem")
622
622
 
623
623
  # Check system status
624
624
  read_resource(uri="rem://status")
@@ -1265,7 +1265,7 @@ async def ask_agent(
1265
1265
  """
1266
1266
  import asyncio
1267
1267
  from ...agentic import create_agent
1268
- from ...agentic.context import get_current_context, agent_context_scope
1268
+ from ...agentic.context import get_current_context, agent_context_scope, get_event_sink, push_event
1269
1269
  from ...agentic.agents.agent_manager import get_agent
1270
1270
  from ...utils.schema_loader import load_agent_schema
1271
1271
 
@@ -1342,16 +1342,146 @@ async def ask_agent(
1342
1342
  if input_data:
1343
1343
  prompt = f"{input_text}\n\nInput data: {json.dumps(input_data)}"
1344
1344
 
1345
+ # Load session history for the sub-agent (CRITICAL for multi-turn conversations)
1346
+ # Sub-agents need to see the full conversation context, not just the summary
1347
+ pydantic_message_history = None
1348
+ if child_context.session_id and settings.postgres.enabled:
1349
+ try:
1350
+ from ...services.session import SessionMessageStore, session_to_pydantic_messages
1351
+ from ...agentic.schema import get_system_prompt
1352
+
1353
+ store = SessionMessageStore(user_id=child_context.user_id or "default")
1354
+ raw_session_history = await store.load_session_messages(
1355
+ session_id=child_context.session_id,
1356
+ user_id=child_context.user_id,
1357
+ compress_on_load=False, # Need full data for reconstruction
1358
+ )
1359
+ if raw_session_history:
1360
+ # Extract agent's system prompt from schema
1361
+ agent_system_prompt = get_system_prompt(schema) if schema else None
1362
+ pydantic_message_history = session_to_pydantic_messages(
1363
+ raw_session_history,
1364
+ system_prompt=agent_system_prompt,
1365
+ )
1366
+ logger.debug(
1367
+ f"ask_agent '{agent_name}': loaded {len(raw_session_history)} session messages "
1368
+ f"-> {len(pydantic_message_history)} pydantic-ai messages"
1369
+ )
1370
+
1371
+ # Audit session history if enabled
1372
+ from ...services.session import audit_session_history
1373
+ audit_session_history(
1374
+ session_id=child_context.session_id,
1375
+ agent_name=agent_name,
1376
+ prompt=prompt,
1377
+ raw_session_history=raw_session_history,
1378
+ pydantic_messages_count=len(pydantic_message_history),
1379
+ )
1380
+ except Exception as e:
1381
+ logger.warning(f"ask_agent '{agent_name}': failed to load session history: {e}")
1382
+ # Fall back to running without history
1383
+
1345
1384
  # Run agent with timeout and context propagation
1346
1385
  logger.info(f"Invoking agent '{agent_name}' with prompt: {prompt[:100]}...")
1347
1386
 
1387
+ # Check if we have an event sink for streaming
1388
+ push_event = get_event_sink()
1389
+ use_streaming = push_event is not None
1390
+
1391
+ streamed_content = "" # Track if content was streamed
1392
+
1348
1393
  try:
1349
1394
  # Set child context for nested tool calls
1350
1395
  with agent_context_scope(child_context):
1351
- result = await asyncio.wait_for(
1352
- agent_runtime.run(prompt),
1353
- timeout=timeout_seconds
1354
- )
1396
+ if use_streaming:
1397
+ # STREAMING MODE: Use iter() and proxy events to parent
1398
+ logger.debug(f"ask_agent '{agent_name}': using streaming mode with event proxying")
1399
+
1400
+ async def run_with_streaming():
1401
+ from pydantic_ai.messages import (
1402
+ PartStartEvent, PartDeltaEvent, PartEndEvent,
1403
+ FunctionToolResultEvent, FunctionToolCallEvent,
1404
+ )
1405
+ from pydantic_ai.agent import Agent
1406
+
1407
+ accumulated_content = []
1408
+ child_tool_calls = []
1409
+
1410
+ # iter() returns an async context manager, not an awaitable
1411
+ iter_kwargs = {"message_history": pydantic_message_history} if pydantic_message_history else {}
1412
+ async with agent_runtime.iter(prompt, **iter_kwargs) as agent_run:
1413
+ async for node in agent_run:
1414
+ if Agent.is_model_request_node(node):
1415
+ async with node.stream(agent_run.ctx) as request_stream:
1416
+ async for event in request_stream:
1417
+ # Proxy part starts
1418
+ if isinstance(event, PartStartEvent):
1419
+ from pydantic_ai.messages import ToolCallPart, TextPart
1420
+ if isinstance(event.part, ToolCallPart):
1421
+ # Push tool start event to parent
1422
+ await push_event.put({
1423
+ "type": "child_tool_start",
1424
+ "agent_name": agent_name,
1425
+ "tool_name": event.part.tool_name,
1426
+ "arguments": event.part.args if hasattr(event.part, 'args') else None,
1427
+ })
1428
+ child_tool_calls.append({
1429
+ "tool_name": event.part.tool_name,
1430
+ "index": event.index,
1431
+ })
1432
+ elif isinstance(event.part, TextPart):
1433
+ # TextPart may have initial content
1434
+ if event.part.content:
1435
+ accumulated_content.append(event.part.content)
1436
+ await push_event.put({
1437
+ "type": "child_content",
1438
+ "agent_name": agent_name,
1439
+ "content": event.part.content,
1440
+ })
1441
+ # Proxy text content deltas to parent for real-time streaming
1442
+ elif isinstance(event, PartDeltaEvent):
1443
+ if hasattr(event, 'delta') and hasattr(event.delta, 'content_delta'):
1444
+ content = event.delta.content_delta
1445
+ if content:
1446
+ accumulated_content.append(content)
1447
+ # Push content chunk to parent for streaming
1448
+ await push_event.put({
1449
+ "type": "child_content",
1450
+ "agent_name": agent_name,
1451
+ "content": content,
1452
+ })
1453
+
1454
+ elif Agent.is_call_tools_node(node):
1455
+ async with node.stream(agent_run.ctx) as tools_stream:
1456
+ async for tool_event in tools_stream:
1457
+ if isinstance(tool_event, FunctionToolResultEvent):
1458
+ result_content = tool_event.result.content if hasattr(tool_event.result, 'content') else tool_event.result
1459
+ # Push tool result to parent
1460
+ await push_event.put({
1461
+ "type": "child_tool_result",
1462
+ "agent_name": agent_name,
1463
+ "result": result_content,
1464
+ })
1465
+
1466
+ # Get final result (inside context manager)
1467
+ return agent_run.result, "".join(accumulated_content), child_tool_calls
1468
+
1469
+ result, streamed_content, tool_calls = await asyncio.wait_for(
1470
+ run_with_streaming(),
1471
+ timeout=timeout_seconds
1472
+ )
1473
+ else:
1474
+ # NON-STREAMING MODE: Use run() for backwards compatibility
1475
+ if pydantic_message_history:
1476
+ result = await asyncio.wait_for(
1477
+ agent_runtime.run(prompt, message_history=pydantic_message_history),
1478
+ timeout=timeout_seconds
1479
+ )
1480
+ else:
1481
+ result = await asyncio.wait_for(
1482
+ agent_runtime.run(prompt),
1483
+ timeout=timeout_seconds
1484
+ )
1355
1485
  except asyncio.TimeoutError:
1356
1486
  return {
1357
1487
  "status": "error",
@@ -1365,14 +1495,20 @@ async def ask_agent(
1365
1495
 
1366
1496
  logger.info(f"Agent '{agent_name}' completed successfully")
1367
1497
 
1368
- return {
1498
+ response = {
1369
1499
  "status": "success",
1370
1500
  "output": output,
1371
- "text_response": str(result.output),
1372
1501
  "agent_schema": agent_name,
1373
1502
  "input_text": input_text,
1374
1503
  }
1375
1504
 
1505
+ # Only include text_response if content was NOT streamed
1506
+ # When streaming, child_content events already delivered the content
1507
+ if not use_streaming or not streamed_content:
1508
+ response["text_response"] = str(result.output)
1509
+
1510
+ return response
1511
+
1376
1512
 
1377
1513
  # =============================================================================
1378
1514
  # Test/Debug Tools (for development only)