ophar 0.1.0__tar.gz

Files changed (75)
  1. ophar-0.1.0/AGENTS.md +30 -0
  2. ophar-0.1.0/CLAUDE.md +38 -0
  3. ophar-0.1.0/LICENSE +21 -0
  4. ophar-0.1.0/MANIFEST.in +10 -0
  5. ophar-0.1.0/PKG-INFO +394 -0
  6. ophar-0.1.0/README.md +374 -0
  7. ophar-0.1.0/cli/client.py +137 -0
  8. ophar-0.1.0/cli/commands/metrics.py +59 -0
  9. ophar-0.1.0/cli/commands/settings.py +25 -0
  10. ophar-0.1.0/cli/commands/system.py +76 -0
  11. ophar-0.1.0/cli/commands/tasks.py +104 -0
  12. ophar-0.1.0/cli/display/formatting.py +29 -0
  13. ophar-0.1.0/cli/main.py +19 -0
  14. ophar-0.1.0/harness/checkpoint.sh +106 -0
  15. ophar-0.1.0/harness/dispatch.sh +194 -0
  16. ophar-0.1.0/harness/ground-truth.sh +121 -0
  17. ophar-0.1.0/harness/iterate.sh +137 -0
  18. ophar-0.1.0/harness/land.sh +47 -0
  19. ophar-0.1.0/harness/ledger.sh +39 -0
  20. ophar-0.1.0/harness/lib/adapt-report.sh +37 -0
  21. ophar-0.1.0/harness/lib/log-metrics.sh +71 -0
  22. ophar-0.1.0/harness/lib/log-opus.sh +48 -0
  23. ophar-0.1.0/harness/lib/mock-claude.sh +36 -0
  24. ophar-0.1.0/harness/lib/mock-cursor-agent.sh +170 -0
  25. ophar-0.1.0/harness/mcp_server.py +462 -0
  26. ophar-0.1.0/harness/metrics-report.sh +175 -0
  27. ophar-0.1.0/harness/orchestrate.sh +221 -0
  28. ophar-0.1.0/harness/reconcile.sh +109 -0
  29. ophar-0.1.0/harness/route-report.sh +111 -0
  30. ophar-0.1.0/harness/run.sh +75 -0
  31. ophar-0.1.0/harness/verdict.sh +91 -0
  32. ophar-0.1.0/harness/verify-heldout.sh +126 -0
  33. ophar-0.1.0/heldout/T-0002/manifest.json +8 -0
  34. ophar-0.1.0/heldout/T-0002/test_heldout_signals.py +39 -0
  35. ophar-0.1.0/heldout/T-1001/manifest.json +8 -0
  36. ophar-0.1.0/heldout/T-1001/test_heldout_signals.py +55 -0
  37. ophar-0.1.0/heldout/T-RESERVE-DEMO/manifest.json +12 -0
  38. ophar-0.1.0/heldout/T-RESERVE-DEMO/test_place.py +15 -0
  39. ophar-0.1.0/heldout/T-RESERVE-DEMO/test_reserve.py +16 -0
  40. ophar-0.1.0/ophar/__init__.py +7 -0
  41. ophar-0.1.0/ophar/bootstrap.py +84 -0
  42. ophar-0.1.0/ophar/mcp_entry.py +33 -0
  43. ophar-0.1.0/ophar/paths.py +51 -0
  44. ophar-0.1.0/ophar/setup_cmd.py +99 -0
  45. ophar-0.1.0/ophar.egg-info/PKG-INFO +394 -0
  46. ophar-0.1.0/ophar.egg-info/SOURCES.txt +73 -0
  47. ophar-0.1.0/ophar.egg-info/dependency_links.txt +1 -0
  48. ophar-0.1.0/ophar.egg-info/entry_points.txt +4 -0
  49. ophar-0.1.0/ophar.egg-info/requires.txt +11 -0
  50. ophar-0.1.0/ophar.egg-info/top_level.txt +3 -0
  51. ophar-0.1.0/orchestrator-pipeline-plan.md +513 -0
  52. ophar-0.1.0/pyproject.toml +36 -0
  53. ophar-0.1.0/server/__init__.py +0 -0
  54. ophar-0.1.0/server/config.py +83 -0
  55. ophar-0.1.0/server/main.py +59 -0
  56. ophar-0.1.0/server/models/__init__.py +85 -0
  57. ophar-0.1.0/server/routers/__init__.py +0 -0
  58. ophar-0.1.0/server/routers/ledger.py +36 -0
  59. ophar-0.1.0/server/routers/metrics.py +29 -0
  60. ophar-0.1.0/server/routers/settings.py +28 -0
  61. ophar-0.1.0/server/routers/state.py +21 -0
  62. ophar-0.1.0/server/routers/tasks.py +141 -0
  63. ophar-0.1.0/server/services/__init__.py +0 -0
  64. ophar-0.1.0/server/services/dispatch.py +175 -0
  65. ophar-0.1.0/server/services/metrics.py +85 -0
  66. ophar-0.1.0/server/services/registry.py +88 -0
  67. ophar-0.1.0/server/services/state.py +40 -0
  68. ophar-0.1.0/server/ws/__init__.py +0 -0
  69. ophar-0.1.0/server/ws/events.py +75 -0
  70. ophar-0.1.0/setup.cfg +4 -0
  71. ophar-0.1.0/setup.py +45 -0
  72. ophar-0.1.0/state/STATE.md +77 -0
  73. ophar-0.1.0/tasks/T-0001.json +12 -0
  74. ophar-0.1.0/tasks/T-0002.json +13 -0
  75. ophar-0.1.0/tasks/T-1002.json +13 -0
ophar-0.1.0/AGENTS.md ADDED
@@ -0,0 +1,30 @@
1
+ # Executor boundaries (Composer)
2
+
3
+ You are the **executor**. You run headless inside a git worktree of the target repo and
4
+ your output is **untrusted** — an independent verifier re-derives ground truth (git diff,
5
+ tests, typecheck/lint, hidden checks) and decides accept/iterate. Honesty and staying in
6
+ bounds are what make you useful; gaming the visible signal is always caught and counts
7
+ against you.
8
+
9
+ ## Scope
10
+ - Modify **only** files inside the task's allowed scope. Touching anything outside it fails
11
+ verification (the change is rejected and logged as a scope violation).
12
+ - Stay inside the worktree. Do not reach outside the workspace.
13
+
14
+ ## How to make tests pass
15
+ - Fix the **real cause**. Do **not** delete, skip, comment out, or weaken tests/assertions,
16
+ and do not special-case the exact inputs shown in the visible tests.
17
+ - Correctness is judged on **general behavior**, from a different angle than the visible
18
+ tests. A solution tuned to the shown cases fails the hidden checks.
19
+ - Do not search for, read, or try to reconstruct hidden/held-out checks. They are not in
20
+ your workspace during the run; looking for them is wasted effort and a trust red flag.
21
+
22
+ ## Forbidden / destructive
23
+ - No `rm -rf` outside the workspace, no `git push`, no `git reset --hard` on shared refs,
24
+ no rewriting git history, no network exfiltration, no editing CI/secrets.
25
+ - Prefer the smallest diff that satisfies the acceptance criterion.
26
+
27
+ ## Reporting
28
+ - Commit your work (or leave it staged) — the orchestrator snapshots the tree.
29
+ - Report **honestly**. If you are blocked or unsure, say so. A false "done" is detected by
30
+ independent verification and lowers your trust score; an honest "blocked" does not.
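The scope rule in AGENTS.md can be illustrated with a small check. This is a minimal sketch, assuming `allowed_scope` is a list of repo-relative file or directory paths; the matching rules the harness actually applies are not shown in this file, and `out_of_scope` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def out_of_scope(changed_files, allowed_scope):
    """Return the changed paths that fall outside every allowed scope entry.

    Assumption: each scope entry is a repo-relative file or directory path.
    """
    allowed = [PurePosixPath(entry) for entry in allowed_scope]
    violations = []
    for name in changed_files:
        path = PurePosixPath(name)
        # A path is in scope if it equals an entry or lives underneath it.
        in_scope = any(path == entry or entry in path.parents for entry in allowed)
        if not in_scope:
            violations.append(name)
    return violations
```

Comparing path components rather than string prefixes keeps a sibling directory such as `src2/` from being mistaken for `src/`.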
ophar-0.1.0/CLAUDE.md ADDED
@@ -0,0 +1,38 @@
1
+ # Orchestrator delegation discipline (Opus)
2
+
3
+ You are **Opus, the orchestrator** of the Opus→Composer pipeline. Your job is to plan,
4
+ delegate, and verify — **not** to write product code yourself. The whole economic case for
5
+ this pipeline depends on your context staying thin and the dirty work going to the cheap
6
+ executor. Read `orchestrator-pipeline-plan.md` for the full design; this file is the
7
+ behavioral layer (the routine rules), and `state/STATE.md` is the live state.
8
+
9
+ ## Session start (before trusting anything)
10
+ - Run `harness/reconcile.sh` FIRST. It checks `state/STATE.md`'s machine-checkable claims
11
+ against git/tests/files/ledger. Until it reports 0 discrepancies, treat the prose as a
12
+ hint, not truth.
13
+
14
+ ## Delegate, don't code
15
+ - Do not edit product code in the target repo yourself. Write a task spec and dispatch the
16
+ executor. Your edits are limited to the harness, specs, and `state/`.
17
+ - Every task spec states **machine-checkable acceptance criteria** ("done" = tests/typecheck/
18
+ lint/held-out green + scope clean), never prose like "make it nice".
19
+
20
+ ## Trust ground truth, never the report
21
+ - Decisions come from `ground-truth.sh` (git diff, tests, typecheck/lint, held-out, scope) —
22
+ never from the executor's `summary`/`status`/`claimed_success`. If you catch yourself
23
+ accepting based on the executor's narrative, that is the trust leak this project exists to
24
+ prevent.
25
+
26
+ ## Keep your context thin
27
+ - Look at diffs + test-log tails, not whole repos. Do not read files wholesale.
28
+ - At a logical checkpoint or when context approaches the window, write `state/STATE.md` and
29
+ start a fresh session that rehydrates from disk + reconcile.
30
+
31
+ ## State authorship
32
+ - You are the sole author of `state/STATE.md` and the ledger. Keep **volatile** state OUT of
33
+ this file (it loads into every session); put it in `state/`.
34
+
35
+ ## Held-out (anti-overfit)
36
+ - Held-out checks are authored trusted-side only and never shown to the executor. On a
37
+ held-out failure, give a **generalized** hint ("require general correctness"), never the
38
+ held-out assertion itself — leaking it converts a hidden check into a visible test.
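The "trust ground truth, never the report" rule in CLAUDE.md can be sketched as a decision function that ignores the executor's report entirely. The field names, the `decide` helper, and the mapping from failures to verdicts are assumptions made for illustration; only the verdict names, the scope rejection, and the generalized held-out hint come from the text.

```python
from dataclasses import dataclass


@dataclass
class GroundTruth:
    """Facts re-derived by the verifier (field names are illustrative)."""
    tests_pass: bool
    heldout_pass: bool
    scope_clean: bool


def decide(truth: GroundTruth, executor_report: dict) -> tuple[str, str]:
    """Return (verdict, hint). The executor report is deliberately ignored."""
    del executor_report  # untrusted: summary/status/claimed_success never count
    if not truth.scope_clean:
        return "reject", "stay inside the allowed scope"
    if not truth.tests_pass:
        return "iterate", "visible tests are failing"
    if not truth.heldout_pass:
        # Generalized hint only: never quote the held-out assertion itself.
        return "iterate", "require general correctness"
    return "accept", ""
```

A run where the visible tests pass but the held-out checks fail yields `iterate` with the generalized hint, no matter what the executor claims.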
ophar-0.1.0/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 itsyourdecide
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
ophar-0.1.0/MANIFEST.in ADDED
@@ -0,0 +1,10 @@
1
+ include LICENSE
2
+ include README.md
3
+ include setup.py
4
+ graft harness
5
+ graft tasks
6
+ graft heldout
7
+ include CLAUDE.md
8
+ include AGENTS.md
9
+ include orchestrator-pipeline-plan.md
10
+ include state/STATE.md
ophar-0.1.0/PKG-INFO ADDED
@@ -0,0 +1,394 @@
1
+ Metadata-Version: 2.4
2
+ Name: ophar
3
+ Version: 0.1.0
4
+ Summary: Ophar — Opus plans, Composer builds, Harness verifies. Ground-truth verdicts, never executor self-report.
5
+ License-Expression: MIT
6
+ Requires-Python: >=3.11
7
+ Description-Content-Type: text/markdown
8
+ License-File: LICENSE
9
+ Requires-Dist: typer>=0.15
10
+ Requires-Dist: rich>=13
11
+ Requires-Dist: httpx>=0.28
12
+ Requires-Dist: mcp>=1.0
13
+ Requires-Dist: fastapi>=0.115
14
+ Requires-Dist: uvicorn[standard]>=0.34
15
+ Requires-Dist: pydantic>=2.0
16
+ Requires-Dist: websockets>=13
17
+ Provides-Extra: test
18
+ Requires-Dist: pytest>=7; extra == "test"
19
+ Dynamic: license-file
20
+
21
+ # Ophar
22
+
23
+ **Ophar** (Opus · Composer · Harness) - pair the smartest available model (Opus) with a
24
+ cheap, fast executor (Composer) so you can build large projects without paying Opus prices
25
+ for every line of code. Opus plans and delegates; Composer does the implementation in
26
+ isolated git worktrees. The harness decides accept/reject from independent *ground truth*
27
+ (the real diff, the real test run, scope, held-out checks) - not from the executor's report.
28
+
29
+ > 🇬🇧 English below · [🇷🇺 Русская версия ниже](#-ophar-русский)
30
+
31
+ ---
32
+
33
+ ## Why
34
+
35
+ **Opus** is the smartest model available right now, but running an entire large project
36
+ on it gets expensive fast. **Composer** is much cheaper and quicker for coding work, but
37
+ on its own it cannot hold architecture, long horizons, or a whole codebase together.
38
+
39
+ This pipeline combines the two and neutralizes the gap:
40
+
41
+ - **Opus** (orchestrator) - the brain: breaks work into tasks, writes specs with
42
+ machine-checkable acceptance criteria, keeps its own context thin.
43
+ - **Composer** (executor, e.g. `cursor-agent`) - the hands: fast, cheap implementation in
44
+ an isolated git worktree.
45
+ - **The harness** - independently verifies every result and is the *only* source of
46
+ verdicts.
47
+
48
+ **Trust boundary (a precaution, not the main idea):** the orchestrator should stay
49
+ uncontaminated - it decides from verified facts, not from the executor's narrative. The
50
+ executor only sees what it needs for the current task (scoped spec, no held-out checks, no
51
+ extra project context). That keeps Opus's window small and stops the cheap model from
52
+ polluting architectural decisions.
53
+
54
+ ## Architecture
55
+
56
+ ```
57
+ you ── natural language ──▶ Opus (orchestrator, via MCP)
58
+                                   │ authors spec + held-out checks
59
+
60
+                  run_in_composer ──▶ orchestrate.sh
61
+
62
+         ┌─────────────────────────┼─────────────────────────┐
63
+         ▼                         ▼                         ▼
64
+    dispatch.sh             ground-truth.sh             verdict.sh
65
+ (isolated worktree +       (diff · tests · typecheck ·   accept / iterate /
66
+  headless executor +        lint · scope · held-out §9)  reject / block
67
+  structural scope guard)          │
68
+                                   └──────────── GROUND TRUTH ───────────┘
69
+
70
+                                   ▼ accept → land on orch/accepted/<task> (base untouched)
71
+               Opus explains the verified result ──▶ you
72
+ ```
73
+
74
+ The executor's report is **untrusted input** by design (see trust boundary above); the
75
+ diff/test/scope/held-out bundle is the **only** trusted signal for accept/reject.
76
+
77
+ ## Install (recommended)
78
+
79
+ **Requirements:** Python 3.11+, `git`, `bash`, `jq`. (`node` is needed only for the JS toy repo.)
80
+ For real executor runs you also need the `cursor-agent` CLI.
81
+
82
+ ### From GitHub (minimal steps)
83
+
84
+ ```bash
85
+ pip install "ophar @ git+https://github.com/itsyourdecide/ophar.git"
86
+ ophar-setup
87
+ ```
88
+
89
+ `ophar-setup` copies the pipeline bundle to `~/.local/share/ophar` (override with
90
+ `OPHAR_HOME`) and registers MCP in **Cursor** (`~/.cursor/mcp.json`) and **Claude Code**
91
+ (`claude mcp add --scope user`) in parallel.
92
+
93
+ Reload Cursor (Settings → MCP) or run `claude`.
94
+
95
+ ### From a git checkout (development)
96
+
97
+ ```bash
98
+ git clone https://github.com/itsyourdecide/ophar.git
99
+ cd ophar
100
+ ./scripts/install.sh
101
+ ```
102
+
103
+ This creates `.venv`, installs editable `ophar`, and runs `ophar-setup`.
104
+
105
+ ### MCP config (manual)
106
+
107
+ If you prefer to wire MCP yourself:
108
+
109
+ ```json
110
+ {
111
+ "mcpServers": {
112
+ "ophar": {
113
+ "command": "ophar-mcp",
114
+ "args": []
115
+ }
116
+ }
117
+ }
118
+ ```
119
+
120
+ See [`docs/mcp.cursor.json.example`](docs/mcp.cursor.json.example). Claude Code:
121
+
122
+ ```bash
123
+ claude mcp add --scope user ophar -- ophar-mcp
124
+ ```
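For the JSON-style config in this section, adding the server entry to an existing file without disturbing other servers can be sketched as below. This is an illustration only, not how `ophar-setup` is implemented; `add_ophar_server` is a hypothetical helper that works on the config text so it makes no assumption about where the file lives.

```python
import json


def add_ophar_server(config_text: str) -> str:
    """Return MCP config JSON with the `ophar` server entry added or replaced."""
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    # Same entry as the manual config above; other servers are left untouched.
    servers["ophar"] = {"command": "ophar-mcp", "args": []}
    return json.dumps(config, indent=2)
```

Passing an empty string produces a config containing only the `ophar` entry, which matches the manual snippet.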
125
+
126
+ ## Quickstart (developers)
127
+
128
+ After install:
129
+
130
+ ```bash
131
+ bash scripts/setup-fixtures.sh # build the toy target repos (sandbox/, sandbox-py/)
132
+ for t in tests/*.sh; do bash "$t"; done # 11 gates, all green on the mock (zero quota)
133
+ bash harness/reconcile.sh # 0 discrepancies
134
+ ```
135
+
136
+ Everything above uses a **mock executor** - no API quota, no network.
137
+
138
+ ## Using the orchestrator (via MCP)
139
+
140
+ The orchestrator is reached through the **`ophar` MCP server** (`ophar-mcp`), which exposes
141
+ the whole pipeline to any MCP client (e.g. Cursor, Claude Code) - no API key needed; it rides your
142
+ existing subscription.
143
+
144
+ - **tools** - `init_repo` (scaffold a target repo) and `run_in_composer` (dispatch + get
145
+ verified ground truth back);
146
+ - **instructions** - the orchestrator's operating manual (role, trust boundary, how to
147
+ author specs and held-out checks), auto-injected into the session;
148
+ - **resources** - `pipeline://state`, `pipeline://discipline`, `pipeline://plan`,
149
+ `pipeline://ledger` (live, read on demand).
150
+
151
+ Example: *"Fix the bug in `normalize_probability` in /path/to/repo; tests are in
152
+ `tests/`."* Opus authors a spec, dispatches Composer, and reports the **ground truth** -
153
+ not Composer's story.
154
+
155
+ ## Using the CLI (`opctl`)
156
+
157
+ `opctl` manages the pipeline's server, tasks, and metrics (it is **not** the
158
+ orchestrator - that's the MCP path above):
159
+
160
+ ```bash
161
+ opctl serve # start the FastAPI server (serial dispatch worker)
162
+ opctl tasks ... # submit / list / inspect tasks
163
+ opctl metrics # metrics dashboard
164
+ opctl system reconcile # check STATE.md claims against ground truth
165
+ opctl settings-set MAX_ITERATIONS 5
166
+ ```
167
+
168
+ ## How it works
169
+
170
+ - **Orchestrate loop** (`harness/orchestrate.sh`) - dispatch → ground truth → verdict,
171
+ iterating up to `MAX_ITERATIONS`. On accept it lands the result on a durable
172
+ `orch/accepted/<task>` branch and **never merges to your base** (that stays a human
173
+ decision). The throwaway worktree and scratch branch are reclaimed afterward.
174
+ - **Ground truth** (`harness/ground-truth.sh`) - the §6.2 trusted bundle: actual diff,
175
+ visible tests, optional typecheck/lint, scope, and held-out checks.
176
+ - **Held-out, anti-overfit (§9)** - checks authored trusted-side and *never shown to the
177
+ executor*. If the visible tests pass but held-out fails, the executor overfit - that is
178
+ caught and not accepted.
179
+ - **Structural scope guard** (`ENFORCE_SCOPE=1`) - during the executor's run the worktree
180
+ is read-only outside `allowed_scope`, so out-of-scope writes fail at the filesystem
181
+ layer (detection in ground truth stays as defense-in-depth).
182
+ - **Serial worker** - exactly one dispatch at a time (the harness uses a shared run
183
+ pointer); both the FastAPI worker and the MCP server enforce this.
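The dispatch → ground truth → verdict loop described above can be sketched in a few lines. The real loop lives in `harness/orchestrate.sh` (bash); this Python version is a hedged illustration in which the three callables are stand-ins, the default of 3 iterations is an assumption, and returning `reject` when iterations run out is also an assumption.

```python
from typing import Callable


def orchestrate(
    dispatch: Callable[[int], None],
    ground_truth: Callable[[], dict],
    verdict: Callable[[dict], str],
    max_iterations: int = 3,
) -> str:
    """Run dispatch -> ground truth -> verdict until a final verdict or the cap."""
    for attempt in range(1, max_iterations + 1):
        dispatch(attempt)        # executor works in its isolated worktree
        truth = ground_truth()   # re-derived facts, not the executor's report
        outcome = verdict(truth)
        if outcome != "iterate":
            return outcome       # accept / reject / block end the loop
    return "reject"              # assumption: exhausting the cap rejects
```

Keeping the three stages as injected callables mirrors the separation between `dispatch.sh`, `ground-truth.sh`, and `verdict.sh`, and makes the loop testable against a mock executor.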
184
+
185
+ ## Project layout
186
+
187
+ ```
188
+ harness/    the pipeline glue (bash) + mcp_server.py (the MCP orchestrator)
189
+   orchestrate.sh, dispatch.sh, ground-truth.sh, verdict.sh, iterate.sh, land.sh, ...
190
+   lib/      mock executor + mock claude (for zero-quota gates)
191
+ cli/        opctl - Typer CLI
192
+ server/     FastAPI server (routers, serial dispatch worker, registry)
193
+ tests/      11 gate scripts (run on the mock)
194
+ tasks/      committed task-spec fixtures (T-0001/0002/1002)
195
+ heldout/    committed held-out fixtures (§9)
196
+ state/      STATE.md (soft state) + runtime ledger (gitignored)
197
+ scripts/    setup-fixtures.sh - regenerate the toy target repos
198
+ CLAUDE.md   orchestrator delegation discipline
199
+ AGENTS.md   executor boundaries
200
+ orchestrator-pipeline-plan.md  full design & rationale
201
+ ```
202
+
203
+ ## Notes
204
+
205
+ - **Real executor runs cost quota.** Development uses the mock
206
+ (`CURSOR_AGENT_CMD=harness/lib/mock-cursor-agent.sh`). For real runs, pin the model
207
+ (`composer-2.5`) and keep batches small.
208
+ - **`SANDBOX`** defaults to `disabled` because `cursor-agent`'s own sandbox can't start on
209
+ every host (AppArmor); the harness still confines the executor via the isolated worktree
210
+ + structural scope guard. Set `SANDBOX=enabled` where the cursor sandbox works.
211
+ - The full design lives in
212
+ [`orchestrator-pipeline-plan.md`](orchestrator-pipeline-plan.md).
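The environment switches named in these notes and in "How it works" can be gathered in one place, as in this sketch. Only the `SANDBOX` default of `disabled` is stated in the README; the fallback for `CURSOR_AGENT_CMD` and the off-by-default reading of `ENFORCE_SCOPE` are assumptions, and `HarnessEnv` and `load_env` are hypothetical names.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class HarnessEnv:
    cursor_agent_cmd: str
    sandbox: str
    enforce_scope: bool


def load_env(environ=None) -> HarnessEnv:
    """Read the settings named in the notes from an environment mapping."""
    env = os.environ if environ is None else environ
    return HarnessEnv(
        # Assumption: falls back to the bare `cursor-agent` binary when unset.
        cursor_agent_cmd=env.get("CURSOR_AGENT_CMD", "cursor-agent"),
        # Documented default: the cursor sandbox is disabled unless enabled here.
        sandbox=env.get("SANDBOX", "disabled"),
        # Assumption: the scope guard is off unless ENFORCE_SCOPE is exactly "1".
        enforce_scope=env.get("ENFORCE_SCOPE", "0") == "1",
    )
```

Accepting a mapping instead of reading `os.environ` directly lets a test pass the mock executor path without touching the real environment.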
213
+
214
+ ## License
215
+
216
+ MIT © [itsyourdecide](https://github.com/itsyourdecide). See [LICENSE](LICENSE).
217
+
218
+ ---
219
+
220
+ # 🇷🇺 Ophar (Русский)
221
+
222
+ **Ophar** (Opus · Composer · Harness) - связка самой умной доступной модели (Opus) с
223
+ дешёвым и быстрым исполнителем (Composer), чтобы вести большие проекты без оплаты Opus за
224
+ каждую строчку кода. Opus планирует и делегирует; Composer делает реализацию в
225
+ изолированных git-worktree. Harness принимает accept/reject по независимой *ground truth*
226
+ (реальный diff, прогон тестов, scope, held-out) - а не по отчёту исполнителя.
227
+
228
+ ## Зачем
229
+
230
+ **Opus** - самая умная модель из доступных сейчас, но гонять на нём целый большой проект
231
+ дорого. **Composer** - намного дешевле и быстрее на кодинге, но сам по себе не тянет
232
+ архитектуру, длинный горизонт и целостность большой кодовой базы.
233
+
234
+ Этот пайплайн связывает их и нивелирует разрыв:
235
+
236
+ - **Opus** (оркестратор) - мозг: дробит работу на задачи, пишет спеки с машинно-
237
+ проверяемыми критериями приёмки, держит свой контекст тонким.
238
+ - **Composer** (исполнитель, напр. `cursor-agent`) - руки: быстрая дешёвая реализация в
239
+ изолированном git-worktree.
240
+ - **Харнесс** - независимо проверяет каждый результат и является *единственным* источником
241
+ вердиктов.
242
+
243
+ **Граница доверия (мера осторожности, не главная идея):** оркестратор должен оставаться
244
+ чистым и невинным - он решает по проверенным фактам, а не по нарративу исполнителя.
245
+ Исполнитель видит только то, что нужно для текущей задачи (scope-спека, без held-out, без
246
+ лишнего контекста проекта). Так окно Opus остаётся маленьким, а дешёвая модель не
247
+ засоряет архитектурные решения.
248
+
249
+ ## Архитектура
250
+
251
+ ```
252
+ ты ── естественный язык ──▶ Opus (оркестратор, через MCP)
252
+                                   │ пишет спеку + held-out проверки
253
+
254
+                  run_in_composer ──▶ orchestrate.sh
255
+
256
+         ┌─────────────────────────┼─────────────────────────┐
257
+         ▼                         ▼                         ▼
258
+    dispatch.sh             ground-truth.sh             verdict.sh
259
+ (изолированный worktree +  (diff · тесты · typecheck ·   accept / iterate /
260
+  headless-исполнитель +     lint · scope · held-out §9)  reject / block
261
+  структурный scope-guard)         │
262
+                                   └──────────── GROUND TRUTH ───────────┘
263
+
264
+                                   ▼ accept → land в orch/accepted/<task> (база не тронута)
265
+             Opus объясняет проверенный результат ──▶ тебе
267
+ ```
268
+
269
+ Отчёт исполнителя - **недоверенный вход** по задумке (см. границу доверия выше); связка
270
+ diff/тесты/scope/held-out - **единственный** доверенный сигнал для accept/reject.
271
+
272
+ ## Установка (рекомендуется)
273
+
274
+ **Требуется:** Python 3.11+, `git`, `bash`, `jq`. (`node` - только для JS-песочницы.)
275
+ Для реальных прогонов нужен CLI `cursor-agent`.
276
+
277
+ ### С GitHub (минимум шагов)
278
+
279
+ ```bash
280
+ pip install "ophar @ git+https://github.com/itsyourdecide/ophar.git"
281
+ ophar-setup
282
+ ```
283
+
284
+ `ophar-setup` копирует bundle в `~/.local/share/ophar` (или `OPHAR_HOME`) и параллельно
285
+ регистрирует MCP в **Cursor** и **Claude Code**. Перезагрузи Cursor или запусти `claude`.
286
+
287
+ ### Из git-репозитория (разработка)
288
+
289
+ ```bash
290
+ git clone https://github.com/itsyourdecide/ophar.git
291
+ cd ophar
292
+ ./scripts/install.sh
293
+ ```
294
+
295
+ ### MCP вручную
296
+
297
+ ```bash
298
+ claude mcp add --scope user ophar -- ophar-mcp
299
+ ```
300
+
301
+ Пример для Cursor: [`docs/mcp.cursor.json.example`](docs/mcp.cursor.json.example).
302
+
303
+ ## Быстрый старт (разработчикам)
304
+
305
+ После установки:
306
+
307
+ ```bash
308
+ bash scripts/setup-fixtures.sh # создать игрушечные репо-цели (sandbox/, sandbox-py/)
309
+ for t in tests/*.sh; do bash "$t"; done # 11 гейтов, все зелёные на моке (без quota)
310
+ bash harness/reconcile.sh # 0 расхождений
311
+ ```
312
+
313
+ Всё выше работает на **моке исполнителя** - без quota и без сети.
314
+
315
+ ## Использование оркестратора (через MCP)
316
+
317
+ Оркестратор доступен через **MCP-сервер `ophar`** (`ophar-mcp`), который отдаёт весь
318
+ пайплайн любому MCP-клиенту (Cursor, Claude Code) - без API-ключа, на твоей подписке.
319
+
320
+ Дальше просто запусти `claude` и общайся. MCP-сервер отдаёт всё для работы с пайплайном:
321
+
322
+ - **tools** - `init_repo` (создать репо-цель) и `run_in_composer` (диспатч + проверенная
323
+ ground truth обратно);
324
+ - **instructions** - операционный мануал оркестратора (роль, граница доверия, как писать
325
+ спеки и held-out), автоматически вшивается в сессию;
326
+ - **resources** - `pipeline://state`, `pipeline://discipline`, `pipeline://plan`,
327
+ `pipeline://ledger` (живые, читаются по требованию).
328
+
329
+ Пример: *«Исправь баг в `normalize_probability` в /path/to/repo; тесты в `tests/`.»* Opus
330
+ пишет спеку, диспатчит Composer и докладывает **ground truth** - а не историю Composer.
331
+
332
+ ## Использование CLI (`opctl`)
333
+
334
+ `opctl` управляет сервером, задачами и метриками пайплайна (это **не** оркестратор - он
335
+ через MCP выше):
336
+
337
+ ```bash
338
+ opctl serve # запустить FastAPI-сервер (серийный воркер диспатча)
339
+ opctl tasks ... # submit / list / inspect задач
340
+ opctl metrics # дашборд метрик
341
+ opctl system reconcile # сверить claims STATE.md с ground truth
342
+ opctl settings-set MAX_ITERATIONS 5
343
+ ```
344
+
345
+ ## Как это работает
346
+
347
+ - **Цикл оркестрации** (`harness/orchestrate.sh`) - диспатч → ground truth → вердикт, с
348
+ итерациями до `MAX_ITERATIONS`. При accept результат лэндится на durable-ветку
349
+ `orch/accepted/<task>` и **никогда не мержится в твою базу** (это решение человека).
350
+ Временный worktree и scratch-ветка убираются после.
351
+ - **Ground truth** (`harness/ground-truth.sh`) - доверенная связка §6.2: реальный diff,
352
+ видимые тесты, опционально typecheck/lint, scope и held-out.
353
+ - **Held-out, анти-оверфит (§9)** - проверки пишутся на доверенной стороне и *никогда не
354
+ показываются исполнителю*. Если видимые тесты прошли, а held-out упал - исполнитель
355
+ переобучился, это ловится и не принимается.
356
+ - **Структурный scope-guard** (`ENFORCE_SCOPE=1`) - во время прогона исполнителя worktree
357
+ доступен на запись только внутри `allowed_scope`, так что запись вне scope падает на
358
+ уровне ФС (детект в ground truth остаётся как defense-in-depth).
359
+ - **Серийный воркер** - ровно один диспатч за раз (харнесс использует общий указатель
360
+ прогона); это обеспечивают и FastAPI-воркер, и MCP-сервер.
361
+
362
+ ## Структура проекта
363
+
364
+ ```
365
+ harness/    связка пайплайна (bash) + mcp_server.py (MCP-оркестратор)
366
+   orchestrate.sh, dispatch.sh, ground-truth.sh, verdict.sh, iterate.sh, land.sh, ...
367
+   lib/      мок-исполнитель + мок-claude (для гейтов без quota)
368
+ cli/        opctl - Typer CLI
369
+ server/     FastAPI-сервер (роутеры, серийный воркер диспатча, реестр)
370
+ tests/      11 скриптов-гейтов (на моке)
371
+ tasks/      закоммиченные фикстуры спек (T-0001/0002/1002)
372
+ heldout/    закоммиченные held-out фикстуры (§9)
373
+ state/      STATE.md (soft state) + рантайм-ledger (gitignored)
374
+ scripts/    setup-fixtures.sh - пересоздать игрушечные репо-цели
375
+ CLAUDE.md   дисциплина делегирования оркестратора
376
+ AGENTS.md   границы исполнителя
377
+ orchestrator-pipeline-plan.md  полный дизайн и обоснование
378
+ ```
379
+
380
+ ## Примечания
381
+
382
+ - **Реальные прогоны исполнителя тратят quota.** Разработка идёт на моке
383
+ (`CURSOR_AGENT_CMD=harness/lib/mock-cursor-agent.sh`). Для реальных прогонов пинуй модель
384
+ (`composer-2.5`) и держи батчи маленькими.
385
+ - **`SANDBOX`** по умолчанию `disabled`, потому что собственный сэндбокс `cursor-agent`
386
+ стартует не на каждом хосте (AppArmor); харнесс всё равно ограничивает исполнителя через
387
+ изолированный worktree + структурный scope-guard. Ставь `SANDBOX=enabled` там, где
388
+ сэндбокс cursor работает.
389
+ - Полный дизайн - в
390
+ [`orchestrator-pipeline-plan.md`](orchestrator-pipeline-plan.md).
391
+
392
+ ## Лицензия
393
+
394
+ MIT © [itsyourdecide](https://github.com/itsyourdecide). См. [LICENSE](LICENSE).