PyPI - open-reflection-protocol - Versions diffs - 0.3.0__py3-none-any.whl - Mend

open-reflection-protocol 0.3.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (29)

open_reflection_protocol-0.3.0.dist-info/METADATA +262 -0
open_reflection_protocol-0.3.0.dist-info/RECORD +29 -0
open_reflection_protocol-0.3.0.dist-info/WHEEL +4 -0
open_reflection_protocol-0.3.0.dist-info/entry_points.txt +2 -0
orp/__init__.py +66 -0
orp/adapters/__init__.py +6 -0
orp/adapters/generic_json.py +24 -0
orp/adapters/langgraph.py +24 -0
orp/adapters/openai_agents.py +27 -0
orp/adapters/otel.py +52 -0
orp/capture.py +162 -0
orp/cli.py +366 -0
orp/compiler.py +124 -0
orp/conflicts.py +62 -0
orp/delivery.py +110 -0
orp/effects.py +112 -0
orp/evidence.py +92 -0
orp/examples/failing_coding_agent.py +38 -0
orp/experience.py +114 -0
orp/export.py +60 -0
orp/lessons.py +95 -0
orp/mcp_server.py +171 -0
orp/reflect.py +97 -0
orp/replay.py +108 -0
orp/rollback.py +82 -0
orp/schema.py +303 -0
orp/storage.py +459 -0
orp/training.py +94 -0
orp/viewer.py +104 -0

open_reflection_protocol-0.3.0.dist-info/METADATA ADDED

@@ -0,0 +1,262 @@
+Metadata-Version: 2.4
+Name: open-reflection-protocol
+Version: 0.3.0
+Summary: Turn agent failures into regression tests, reusable lessons, and measurable improvements
+Project-URL: Home, https://github.com/Fujo930/ORP
+Author: ORP Contributors
+License: MIT
+Keywords: agent,ai,observability,opentelemetry,reflection
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Requires-Python: >=3.10
+Requires-Dist: pydantic>=2.0
+Description-Content-Type: text/markdown
+# Open Reflection Protocol (ORP)
+> Turn agent failures into regression tests, reusable lessons, and measurable improvements.
+**Tracing tells you what your agent did. ORP turns what happened into a tested lesson.**
+---
+## Demo: 30 Seconds
+A coding agent fixes an auth bug but misses the anonymous user path. Tests fail at 34/35.
+```bash
+# 1. Wrap your agent with ORP
+orp wrap -- python my_agent.py
+# 2. ORP captures the failure, challenges unproven claims,
+#    and compiles a Lesson + regression Eval
+orp learn latest
+# 3. Same agent retrieves the Lesson via MCP, applies it
+#    -> All 35 tests pass this time
+orp mcp-server
+# 4. Before/after comparison
+orp diff exp_before exp_after
+```
+**Before:**
+```
+Task success:  FAILED   (34/35 tests)
+Claims:        1 unproven
+```
+**After:**
+```
+Task success:  PASSED   (35/35 tests)
+Claims:        0 unproven
+```
+That's the loop. One mistake, one lesson, one measurable improvement.
+---
+## What ORP Does
+ORP is an **open experience layer for AI agents**, built on OpenTelemetry. It converts agent traces into three executable artifacts:
+| Artifact | What | Example |
+|----------|------|---------|
+| **Lesson** | Retrievable experience limited to a scope | "Test anonymous, authenticated, and forbidden paths" |
+| **Eval** | Regression test reproducing the failure | `pytest tests/test_anonymous_access.py` |
+| **Guardrail** | Preventative rule | "Before modifying auth, run full test suite" |
+Each Lesson goes through a lifecycle:
+```
+candidate -> active -> under_review -> deprecated -> rejected
+               |
+         (only active lessons
+          are retrievable)
+```
+---
+## Key Concepts
+- **Evidence-first**: ORP distinguishes observed facts (tool output, test results) from agent claims (diagnoses, confidence statements). Claims are never automatically treated as ground truth.
+- **Executable experience**: Lessons compile to runnable evals and guardrails, not just text.
+- **Outcome-based value**: Lesson quality is determined by whether it actually improves results, measured through effect evaluation.
+- **Built on OpenTelemetry**: ORP extends existing trace infrastructure instead of replacing it.
+- **Private by default**: All data stays local and is de-identified by default; no prompt or tool output is uploaded.
+---
+## Install
+```bash
+pip install open-reflection-protocol
+```
+Requires Python 3.10+.
+---
+## Quick Start
+### 1. Wrap any agent command
+```bash
+orp wrap -- python my_agent.py --run-task
+```
+ORP automatically captures stdout, exit codes, test results, git diff, and OpenTelemetry spans.
+### 2. Learn from the run
+```bash
+orp learn latest
+```
+This generates:
+- A **diagnosis** of what went wrong
+- **Challenged claims** (unsupported agent statements)
+- A **Lesson** candidate
+- A **regression Eval**
+### 3. View results
+```bash
+orp inspect latest
+orp report --open          # HTML report
+orp diff exp_before exp_after
+```
+### 4. Deliver lessons to future runs
+```bash
+# Start the MCP Lesson server
+orp mcp-server --transport stdio
+# Compatible agents can now use these MCP tools:
+#   orp_retrieve_lessons(task, limit=3)
+#   orp_acknowledge_lesson(lesson_id)
+#   orp_report_outcome(lesson_id, outcome, evidence_refs)
+```
+---
+## Run the Demo
+```bash
+git clone https://github.com/Fujo930/ORP
+cd ORP
+uv run python demo/orp_demo.py
+```
+Output:
+```
+Run 1: Agent misses anonymous user path -> FAILED
+ORP analyzes the failure -> challenges 1 unproven claim
+ORP compiles Lesson + Eval
+MCP delivers Lesson to Agent
+Run 2: Agent applies Lesson -> PASSED
+Before: 34/35 tests, 1 unproven claim
+After:  35/35 tests, 0 unproven claims
+Estimated effect: 0.5
+```
+## Experimental Results
+**10 failure tasks, 5 trials each, 100 total runs.**
+| Metric | Control (no ORP) | +ORP | Improvement |
+|--------|:-:|:-:|:-:|
+| Task success rate | 14% | 100% | **+86 pp** |
+| Repeat failure rate | high | 0% | **100% reduction** |
+| Lesson application | — | 100% | — |
+| Eval validity | — | 85% | — |
+```
+Go/No-Go: >>> GO — 4/4 checks passed
+```
+Run yourself: `uv run python exps/runner.py`
+---
+## CLI Reference
+```text
+orp wrap -- python agent.py    Wrap an agent process with ORP
+orp inspect [id]               Inspect an experience (default: latest)
+orp learn [id]                 Generate lessons from an experience
+orp replay <id>                Counterfactual replay
+orp lessons list               List lessons
+orp lessons validate <id>      Validate lesson integrity
+orp lessons conflicts          Auto-detect conflicting lessons
+orp lessons rollback <id>      Rollback a lesson
+orp lessons deliver <id>       Deliver a lesson
+orp effects evaluate <id>      Evaluate lesson effect
+orp training candidates        List training candidates
+orp training export            Export approved training data
+orp mcp-server                 Start MCP lesson server
+orp report --open              Generate HTML report
+orp diff <id1> <id2>           Compare two experiences
+orp export [id]                Export as JSON
+```
+---
+## Architecture
+```text
+Agent / Existing Trace
+        |
+        v
+  Trace Adapters (OTel / OpenAI / LangGraph / Generic JSON)
+        |
+        v
+ Experience Builder -> Evidence Verifier
+                    -> Reflection Analyzer  (diagnosis + challenger)
+                    -> Counterfactual Replayer
+        |
+        v
+ Experience Compiler
+   +----+----+------+
+   |         |      |
+ Lesson    Eval   Guardrail
+   |         |      |
+   +---- Delivery Router (MCP Server / Prompt / Policy / Runtime Hook)
+             |
+             v
+    Effect Evaluator + Rollback
+```
+---
+## For Contributors
+Tests (58 total):
+```bash
+uv run pytest -q
+# 58 passed in 0.68s
+```
+Key design documents in this repo:
+| File | What |
+|------|------|
+| `ROADMAP.md` | Project roadmap and strategy |
+| `SPEC.md` | Protocol specification v0.3 |
+| `ARCHITECTURE.md` | Implementation architecture |
+| `demo/orp_demo.py` | Standalone demo |
+---
+## License
+MIT
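
Note on the README's lifecycle diagram: the chain `candidate -> active -> under_review -> deprecated -> rejected`, with only active lessons retrievable, can be read as a small state machine. The following is a minimal sketch under the assumption that transitions are strictly linear, which the diagram suggests but does not state. The names `Status`, `ALLOWED`, `advance`, and `retrievable` are hypothetical and are not taken from `orp/schema.py`, whose `LessonStatus` may be defined differently.

```python
from enum import Enum


class Status(Enum):
    # Status names copied from the README diagram; the enum itself is illustrative.
    CANDIDATE = "candidate"
    ACTIVE = "active"
    UNDER_REVIEW = "under_review"
    DEPRECATED = "deprecated"
    REJECTED = "rejected"


# Assumption: a lesson may only move to the next status shown in the diagram.
_ORDER = list(Status)
ALLOWED = {current: {nxt} for current, nxt in zip(_ORDER, _ORDER[1:])}
ALLOWED[Status.REJECTED] = set()  # terminal state


def advance(current: Status, target: Status) -> Status:
    """Return target if the transition is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


def retrievable(status: Status) -> bool:
    """Only active lessons are retrievable, per the README."""
    return status is Status.ACTIVE
```

If the real protocol allows skips (for example, rejecting a candidate directly), the `ALLOWED` table is the only part that would need to change.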

open_reflection_protocol-0.3.0.dist-info/RECORD ADDED

@@ -0,0 +1,29 @@
+orp/__init__.py,sha256=ElxT4yZk6CJLGCxfPGKAKF_bgsO3FQXBABL5AkSyZcw,2228
+orp/capture.py,sha256=cq_N52iNeio7X-8AQJjcqm1czvMZCuRJ1Ft3PfWM2fw,5255
+orp/cli.py,sha256=0HZm3KbsQ17ouT0TM1txjNmJZt2TjxUeKOvmhnd6w4E,12927
+orp/compiler.py,sha256=JNvcc_tISJmsVKNyt6HzUWuD-edYc6bMfVcJVyE86wc,4818
+orp/conflicts.py,sha256=k_VhbFccO2SgfeNOT3GzSJS6ONmlZn9H8y-DuRODj30,2471
+orp/delivery.py,sha256=H3pKf9MsCOqb5kbQO032pXawfkueMNqk9g9k-6IOLfw,4432
+orp/effects.py,sha256=i8BAQjSJPvi07qVDhEb909fe-tTOS0rRw4FfQwdTfII,4367
+orp/evidence.py,sha256=hjpfujrQ9lhIQhWFH6YGjBHMcefuYXRnrlc3plimVxY,3085
+orp/experience.py,sha256=Y0N3uYBQ5kGciM_GZ8V6l17RcutpFvlMeJJIKZHoi1s,4141
+orp/export.py,sha256=o5JN4fNC1rp2lsJKFdxwSwZfkV83bhlDItk_oojX54Q,1919
+orp/lessons.py,sha256=qWilxA48vDXctdDZwl-hyQZ6O26H2IUX5N0tHSqHbKE,3603
+orp/mcp_server.py,sha256=pNifTZOOV5XKTCTmrLktogQJSuqW9_jFt4ShMBPFUfU,7158
+orp/reflect.py,sha256=2peSjF-Dch2N_uV9Rd7uxDTbM_T7GPPnXvUWCDQRPR8,3862
+orp/replay.py,sha256=PBHX-TZ6UglRy0UP9u_gZk4CFvezBvdTU64IM2EIiRg,4210
+orp/rollback.py,sha256=mcVfsQmL51hJ5AVNJ4grdcFrAAA2ttlbed8JUfY5FtM,2926
+orp/schema.py,sha256=DouLTExAVUItRcY_t8hptGyIar0BxKvFnouuvOHWkAs,12698
+orp/storage.py,sha256=IiU3Oe5TIZyPwFaRXhXeZWlLK453ETh-Vd3g5in-va8,19310
+orp/training.py,sha256=jCcJ-qfOpDrLrKsaUMqiKZKuj9uvKhT1E71K7aC5Fn8,3815
+orp/viewer.py,sha256=OLpTNZandwVrSoT4FJkLg7rwfAu3_3Xtwi9iXOtJgrg,5062
+orp/adapters/__init__.py,sha256=k8GarXirpNGZS7b-pEWvRxCVzWS_OKDUgOyUNruD0I0,285
+orp/adapters/generic_json.py,sha256=p-v8mlOUMdeBqjU4Dw5-LrnwoSUd4snDjNSsoysajvQ,697
+orp/adapters/langgraph.py,sha256=J6Rxsw4u1v_EwdieMFrtNuZNhYn4yVJdMuj0CqnA42A,829
+orp/adapters/openai_agents.py,sha256=NmZ6nT4vwQ2UNLLVYwDsCW367IuT3ECP3kSBYJW_A5E,1044
+orp/adapters/otel.py,sha256=lIEwqG7Ve0CathNxLX_BASxUu2dIzZQsW03nec5C-xQ,1929
+orp/examples/failing_coding_agent.py,sha256=Cw6Wjewl-8nWskiXEvAbU28olkLeOk4Ax4Wq25i6FdA,1311
+open_reflection_protocol-0.3.0.dist-info/METADATA,sha256=91IDOlDccp-BFFS72dHBXOcpT-N5MvUvgwBXA04bp5E,6700
+open_reflection_protocol-0.3.0.dist-info/WHEEL,sha256=mffPy8wBnZQn2VnJUU5jE99KsxaSfiyMHV9Yt0aLVxs,87
+open_reflection_protocol-0.3.0.dist-info/entry_points.txt,sha256=lwGrp8bM18-BFu2L3vHWCWQrmodWoWASHdDppgvnbtw,37
+open_reflection_protocol-0.3.0.dist-info/RECORD,,
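
Note on reading RECORD: each line has the form `path,sha256=<digest>,size`, where the digest is the file's SHA-256 hash in URL-safe base64 with the trailing `=` padding removed. The sketch below shows how one such line could be checked against a file's bytes. The helper names `record_hash` and `check_record_line` are hypothetical, and the sketch does not handle quoted CSV paths.

```python
import base64
import hashlib


def record_hash(data: bytes) -> str:
    """Return the 'sha256=...' field in the form used by a wheel RECORD file."""
    digest = hashlib.sha256(data).digest()
    # RECORD uses URL-safe base64 with the trailing '=' padding stripped.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def check_record_line(line: str, data: bytes) -> bool:
    """Compare one 'path,hash,size' RECORD line against the file's bytes."""
    _path, expected_hash, size = line.rsplit(",", 2)
    return expected_hash == record_hash(data) and int(size) == len(data)
```

The entry for RECORD itself (`...dist-info/RECORD,,`) leaves hash and size empty, so it should be skipped rather than passed to `check_record_line`.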

open_reflection_protocol-0.3.0.dist-info/WHEEL ADDED

@@ -0,0 +1,4 @@
+Wheel-Version: 1.0
+Generator: hatchling 1.30.1
+Root-Is-Purelib: true
+Tag: py3-none-any

open_reflection_protocol-0.3.0.dist-info/entry_points.txt ADDED

@@ -0,0 +1,2 @@
+[console_scripts]
+orp = orp.cli:main
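
Note on the entry point: `orp = orp.cli:main` means that installing the wheel creates an `orp` command which calls the `main` attribute of the module `orp.cli`. The sketch below shows how such a file can be parsed with the standard library. `parse_console_scripts` is a hypothetical helper, and the text constant simply repeats the two lines above.

```python
import configparser

ENTRY_POINTS_TXT = """\
[console_scripts]
orp = orp.cli:main
"""


def parse_console_scripts(text: str) -> dict[str, tuple[str, str]]:
    """Map each script name to its (module, attribute) target."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep script names case-sensitive
    parser.read_string(text)
    scripts = {}
    for name, target in parser.items("console_scripts"):
        # Targets are written as "module.path:attribute".
        module, _, attr = target.partition(":")
        scripts[name] = (module.strip(), attr.strip())
    return scripts
```

For installed packages, `importlib.metadata.entry_points()` exposes the same information without manual parsing.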

orp/__init__.py ADDED

@@ -0,0 +1,66 @@
+"""ORP — Open Reflection Protocol
+Turn agent failures into regression tests, reusable lessons, and measurable improvements.
+Core API:
+    from orp import Experience
+    with Experience(goal="fix bug") as exp:
+        result = agent.run()
+        exp.set_outcome(result)
+    from orp import autolog
+    autolog()  # auto-capture all agent runs (experimental)
+"""
+from orp.schema import (
+    ExperienceRecord, Lesson, EvalArtifact, LessonStatus,
+    TimelineEvent, EventKind, LessonDelivery, LessonEvaluation,
+    LessonRollback, TrainingCandidate, Outcome,
+)
+from orp.storage import ORPStorage
+from orp.experience import ExperienceBuilder, Redactor, EvidenceLinker
+from orp.capture import capture_trace_context
+from orp.lessons import LessonStore
+from orp.compiler import ExperienceCompiler
+from orp.reflect import ReflectionAnalyzer, Challenger
+from orp.replay import CounterfactualReplayer
+from orp.delivery import DeliveryRouter
+from orp.conflicts import ConflictDefender
+from orp.effects import EffectEvaluator
+from orp.rollback import RollbackManager
+from orp.training import TrainingPipeline
+from orp.mcp_server import MCPServer
+from orp.export import ExportEngine
+from orp.viewer import HTMLReporter
+def autolog():
+    """Enable automatic capture (experimental)"""
+    import warnings
+    warnings.warn("autolog() is experimental — use `orp wrap -- python agent.py` instead")
+class Experience:
+    """Experience context manager — captures an agent run as an ORP experience"""
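+    def __init__(self, goal: str = ""):
+        self.goal = goal
+        self._manager = None  # context manager returned by capture_trace_context
+        self._ctx = None      # CaptureContext yielded on entry
+        self._record = None
+    def __enter__(self):
+        self._manager = capture_trace_context(self.goal)
+        self._ctx = self._manager.__enter__()
+        return self._ctx
+    def __exit__(self, *args):
+        if self._manager:
+            self._manager.__exit__(*args)
+            # get_events() lives on the yielded CaptureContext, not on the manager.
+            events = self._ctx.get_events()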
+            if events:
+                builder = ExperienceBuilder()
+                self._record = builder.from_events(events, self.goal)
+                storage = ORPStorage()
+                storage.save_experience(self._record)
+    @property
+    def experience_id(self) -> str:
+        return self._record.experience_id if self._record else ""

orp/adapters/__init__.py ADDED

@@ -0,0 +1,6 @@
+"""Trace adapters: convert heterogeneous trace formats into ExperienceRecord."""
+from orp.adapters.generic_json import GenericJSONAdapter
+from orp.adapters.otel import OTelAdapter
+from orp.adapters.openai_agents import OpenAIAgentsAdapter
+from orp.adapters.langgraph import LangGraphAdapter

orp/adapters/generic_json.py ADDED

@@ -0,0 +1,24 @@
+"""Generic JSON Adapter: import from an arbitrary JSON trace."""
+from typing import Any, Optional
+from orp.schema import ExperienceRecord
+from orp.experience import ExperienceBuilder
+class GenericJSONAdapter:
+    """Generic JSON trace adapter."""
+    def __init__(self):
+        self._builder = ExperienceBuilder()
+    def parse(self, data: dict[str, Any],
+              agent_id: str = "unknown",
+              goal: str = "") -> ExperienceRecord:
+        return self._builder.from_trace(data, agent_id=agent_id, goal=goal)
+    def parse_file(self, path: str) -> ExperienceRecord:
+        import json
+        with open(path) as f:
+            data = json.load(f)
+        return self.parse(data)

orp/adapters/langgraph.py ADDED

@@ -0,0 +1,24 @@
+"""LangGraph Adapter"""
+from typing import Any, Optional
+from orp.schema import ExperienceRecord, TimelineEvent
+from orp.experience import ExperienceBuilder
+class LangGraphAdapter:
+    """LangGraph trace adapter."""
+    def parse(self, state_snapshots: list[dict[str, Any]],
+              agent_id: str = "langgraph-agent",
+              goal: str = "") -> ExperienceRecord:
+        builder = ExperienceBuilder()
+        events = []
+        for i, snapshot in enumerate(state_snapshots):
+            node = snapshot.get("node", f"step_{i}")
+            events.append(TimelineEvent(
+                kind=snapshot.get("kind", "action"),
+                content=f"Node {node}: {snapshot.get('keys', '')}",
+                source="agent",
+            ))
+        return builder.from_events(events, goal=goal, agent_id=agent_id)
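
Note on the expected input: `LangGraphAdapter.parse` reads only the `node`, `kind`, and `keys` fields of each snapshot and falls back to `step_<i>`, `"action"`, and an empty string respectively. The sketch below reproduces that mapping without importing `orp`. The `Event` dataclass is a hypothetical stand-in for `orp.schema.TimelineEvent`, and the sample snapshots are invented for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Event:
    """Hypothetical stand-in for orp.schema.TimelineEvent."""
    kind: str
    content: str
    source: str = "agent"


def snapshots_to_events(snapshots: list[dict[str, Any]]) -> list[Event]:
    """Apply the same field defaults as the adapter's loop above."""
    events = []
    for i, snapshot in enumerate(snapshots):
        node = snapshot.get("node", f"step_{i}")
        events.append(Event(
            kind=snapshot.get("kind", "action"),
            content=f"Node {node}: {snapshot.get('keys', '')}",
        ))
    return events


# Illustrative input: one fully described snapshot and one empty one.
events = snapshots_to_events([{"node": "plan", "keys": ["messages"]}, {}])
```

An empty snapshot still produces an event, so callers that want to skip empty steps would need to filter before parsing.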

orp/adapters/openai_agents.py ADDED

@@ -0,0 +1,27 @@
+"""OpenAI Agents SDK Adapter"""
+from typing import Any, Optional
+from orp.schema import ExperienceRecord, TimelineEvent
+from orp.experience import ExperienceBuilder
+class OpenAIAgentsAdapter:
+    """OpenAI Agents SDK trace adapter."""
+    def parse(self, trace_data: dict[str, Any],
+              agent_id: str = "openai-agent",
+              goal: str = "") -> ExperienceRecord:
+        builder = ExperienceBuilder()
+        events = []
+        # OpenAI Agents SDK trace structure: trace -> runs -> steps
+        runs = trace_data.get("runs", [trace_data])
+        for run in runs:
+            for step in run.get("steps", []):
+                events.append(TimelineEvent(
+                    kind=step.get("type", "action"),
+                    content=str(step.get("output", step.get("input", str(step))))[:500],
+                    source="agent",
+                    evidence_refs=[f"otel:{step.get('span_id', '')}"] if step.get("span_id") else [],
+                ))
+        return builder.from_events(events, goal=goal, agent_id=agent_id)

orp/adapters/otel.py ADDED

@@ -0,0 +1,52 @@
+"""OpenTelemetry Adapter: import from an OTel GenAI trace."""
+from typing import Any, Optional
+from orp.schema import ExperienceRecord, TimelineEvent, EventKind
+from orp.experience import ExperienceBuilder
+class OTelAdapter:
+    """OpenTelemetry GenAI trace adapter.
+    Parses trace/span data that follows the OTel GenAI semantic conventions.
+    """
+    def parse(self, spans: list[dict[str, Any]],
+              agent_id: str = "unknown",
+              goal: str = "") -> ExperienceRecord:
+        builder = ExperienceBuilder()
+        events = []
+        for span in spans:
+            kind = self._map_kind(span)
+            events.append(TimelineEvent(
+                kind=kind,
+                content=span.get("name", span.get("attributes", {}).get("gen_ai.request.model", "")),
+                source=span.get("attributes", {}).get("gen_ai.system", "agent"),
+            ))
+        if not events:
+            events.append(TimelineEvent(kind="observation", content="Empty OTel trace"))
+        return builder.from_events(events, goal=goal, agent_id=agent_id)
+    def _map_kind(self, span: dict[str, Any]) -> str:
+        attrs = span.get("attributes", {})
+        kind = span.get("kind", "SPAN_KIND_INTERNAL")
+        if "gen_ai.request.model" in attrs:
+            return "action"
+        if "gen_ai.evaluation.result" in attrs:
+            return "feedback"
+        if "exception" in span or "error" in span:
+            return "observation"
+        return "action"
+    def from_otel_json(self, path: str) -> list[ExperienceRecord]:
+        import json
+        with open(path) as f:
+            data = json.load(f)
+        records = []
+        for resource_span in data.get("resourceSpans", []):
+            for scope_span in resource_span.get("scopeSpans", []):
+                spans = scope_span.get("spans", [])
+                if spans:
+                    records.append(self.parse(spans))
+        return records
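
Note on attribute encoding: the adapter reads `span["attributes"]` as a flat dict, whereas OTLP/JSON exports encode attributes as a list of `{"key": ..., "value": {"stringValue": ...}}` objects. The sketch below walks the same `resourceSpans -> scopeSpans -> spans` nesting as `from_otel_json` and flattens attributes along the way. `flatten_attributes`, `iter_spans`, and the sample payload are hypothetical and not part of the package.

```python
from typing import Any, Iterator


def flatten_attributes(attributes: Any) -> dict[str, Any]:
    """Turn OTLP/JSON key/value objects into a plain {key: value} dict."""
    if isinstance(attributes, dict):
        return attributes  # already flat
    flat: dict[str, Any] = {}
    for item in attributes or []:
        value = item.get("value", {})
        # Each value object carries one typed field (stringValue, intValue, ...);
        # take whichever one is present.
        flat[item["key"]] = next(iter(value.values()), None)
    return flat


def iter_spans(data: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield every span in an OTLP/JSON document with flattened attributes."""
    for resource_span in data.get("resourceSpans", []):
        for scope_span in resource_span.get("scopeSpans", []):
            for span in scope_span.get("spans", []):
                yield {**span, "attributes": flatten_attributes(span.get("attributes"))}


# Illustrative payload; the model name is a placeholder.
SAMPLE = {"resourceSpans": [{"scopeSpans": [{"spans": [
    {"name": "chat", "attributes": [
        {"key": "gen_ai.request.model", "value": {"stringValue": "example-model"}},
    ]},
]}]}]}
spans = list(iter_spans(SAMPLE))
```

Spans prepared this way match the dict-style lookups used in `_map_kind`.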

orp/capture.py ADDED

@@ -0,0 +1,162 @@
+"""Capture layer: collects process, tool, test, and OTel data."""
+import os
+import subprocess
+import sys
+import tempfile
+import time
+from contextlib import contextmanager
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Any, Optional
+from orp.schema import TimelineEvent, EventKind
+def _now_iso() -> str:
+    return datetime.now(timezone.utc).isoformat()
+def capture_command(
+    command: list[str],
+    workdir: Optional[str] = None,
+    timeout: int = 300,
+) -> dict[str, Any]:
+    """Run a command and capture its output, exit code, and duration."""
+    start = time.time()
+    try:
+        result = subprocess.run(
+            command,
+            capture_output=True,
+            text=True,
+            cwd=workdir or os.getcwd(),
+            timeout=timeout,
+        )
+        duration = time.time() - start
+        return {
+            "command": " ".join(command),
+            "exit_code": result.returncode,
+            "stdout": result.stdout,
+            "stderr": result.stderr,
+            "duration": round(duration, 2),
+            "success": result.returncode == 0,
+            "timed_out": False,
+        }
+    except subprocess.TimeoutExpired as e:
+        return {
+            "command": " ".join(command),
+            "exit_code": -1,
+            "stdout": e.stdout or "",
+            "stderr": e.stderr or "",
+            "duration": timeout,
+            "success": False,
+            "timed_out": True,
+        }
+def capture_git_diff(workdir: Optional[str] = None) -> str:
+    """Capture the git diff of the working directory."""
+    try:
+        cwd = workdir or os.getcwd()
+        result = subprocess.run(
+            ["git", "diff"],
+            capture_output=True, text=True, cwd=cwd, timeout=30,
+        )
+        return result.stdout
+    except (subprocess.TimeoutExpired, FileNotFoundError):
+        return ""
+def capture_git_status(workdir: Optional[str] = None) -> str:
+    """Capture the git status."""
+    try:
+        cwd = workdir or os.getcwd()
+        result = subprocess.run(
+            ["git", "status", "--short"],
+            capture_output=True, text=True, cwd=cwd, timeout=30,
+        )
+        return result.stdout
+    except (subprocess.TimeoutExpired, FileNotFoundError):
+        return ""
+def capture_pytest_result(workdir: Optional[str] = None) -> dict[str, Any]:
+    """Run pytest and capture the result."""
+    try:
+        cwd = workdir or os.getcwd()
+        result = subprocess.run(
+            [sys.executable, "-m", "pytest", "-q", "--tb=short"],
+            capture_output=True, text=True, cwd=cwd, timeout=120,
+        )
+        import re
+        output = result.stdout + result.stderr
+        # pytest's summary reads like "1 failed, 34 passed in 0.68s": each count
+        # is a separate token before its label, so match "<n> failed" and
+        # "<n> passed" and keep the last hit (the summary line).
+        failed_hits = re.findall(r"(\d+) failed", output)
+        passed_hits = re.findall(r"(\d+) passed", output)
+        failed_count = int(failed_hits[-1]) if failed_hits else 0
+        passed_count = int(passed_hits[-1]) if passed_hits else 0
+        # Exit code 0 means every collected test passed; a run with any
+        # failure must not be reported as passed.
+        passed = result.returncode == 0
+        return {
+            "exit_code": result.returncode,
+            "summary": result.stdout.strip().split("\n")[-1] if result.stdout else "",
+            "passed": passed,
+            "passed_count": passed_count,
+            "failed_count": failed_count,
+            "output": output,
+        }
+    except (subprocess.TimeoutExpired, FileNotFoundError):
+        return {"exit_code": -1, "passed": False, "error": "could not run pytest"}
+@contextmanager
+def capture_trace_context(goal: str):
+    """Context manager that creates a scope with a basic trace.
+    Usage:
+        with capture_trace_context("fix login error") as ctx:
+            result = agent.run()
+            ctx.set_outcome(result)
+    """
+    events: list[TimelineEvent] = []
+    outcome = {"status": "unknown"}
+    start = time.time()
+    class CaptureContext:
+        def add_event(self, kind: str, content: str, source: str = "agent",
+                      evidence_refs: Optional[list[str]] = None):
+            events.append(TimelineEvent(
+                kind=kind,
+                content=content,
+                source=source,
+                evidence_refs=evidence_refs or [],
+            ))
+        def set_outcome(self, status: str, signals: Optional[dict[str, Any]] = None):
+            nonlocal outcome
+            outcome = {"status": status, "objective_signals": [signals] if signals else []}
+        def get_events(self) -> list[TimelineEvent]:
+            return events.copy()
+        def get_duration(self) -> float:
+            return time.time() - start
+    ctx = CaptureContext()
+    try:
+        yield ctx
+        ctx.add_event("outcome", f"Completed in {time.time()-start:.1f}s", source="system")
+    except Exception as e:
+        ctx.set_outcome("failed", {"error": str(e)})
+        ctx.add_event("observation", f"Error: {e}", source="system")
+    finally:
+        pass
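
Note on `capture_trace_context`: because it is built with `@contextmanager` and catches `Exception` without re-raising, an error inside the `with` body is recorded as an event and then suppressed. The object bound by `as` is the value yielded by the generator, not the manager object returned by the call. The sketch below shows both behaviours with a stdlib-only stand-in. `recording` and its event strings are hypothetical simplifications of the function above.

```python
from contextlib import contextmanager


@contextmanager
def recording(goal: str):
    """Yield a list that collects events; log errors instead of raising them."""
    events: list[str] = [f"goal: {goal}"]
    try:
        yield events
        events.append("outcome: completed")
    except Exception as exc:  # mirrors the broad except in capture.py
        events.append(f"observation: Error: {exc}")


with recording("demo") as log:
    log.append("action: ran tests")
    raise RuntimeError("boom")

# Execution continues here: the exception was swallowed and logged.
manager = recording("other")            # the manager object, not yet entered
has_list_api = hasattr(manager, "append")  # False: only the yielded value is a list
```

Callers that need the failure to propagate would have to inspect the recorded events or outcome, since the `with` block itself will not raise.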