open-research-protocol 0.4.31 → 0.4.32

@@ -103,8 +103,10 @@ artifact paths (code/data/proofs/logs/papers).
  - Treat **failed paths** as assets: record dead ends as a `Failed Path Record` with the blocking reason/counterexample and a
  next hook.
  - Resolve disputes by **verification or downgrade**, not argument.
- - Run `orp hygiene --json` before long delegation, after material writeback, before API/remote/paid compute, and when dirty
- state grows unexpectedly.
+ - Run `orp hygiene --json` before long delegation, after material writeback, before remote side effects or unbudgeted paid
+ compute, and when dirty state grows unexpectedly.
+ - Do not hard-stop solely because an OpenAI research lane is paid; budgeted ORP research may run when `orp research` spend
+ preflight is within the configured daily cap.
  - Stop long-running expansion while hygiene reports `dirty_unclassified`; classify, refresh generated surfaces, canonicalize
  useful scratch, or write a blocker before continuing.
  - Hygiene is non-destructive: never reset, checkout, or delete files merely to hide dirty state.
package/CHANGELOG.md CHANGED
@@ -6,6 +6,19 @@ There was no prior in-repo changelog file, so the first formal entry starts
  with the currently shipped `v0.4.4` release and summarizes the full release
  delta reflected in this repo.
  
+ ## v0.4.32 - 2026-04-25
+
+ This release clarifies ORP's paid-work boundary so budgeted OpenAI research is
+ not treated as a hard stop solely because it uses paid API calls.
+
+ ### Changed
+
+ - Built-in OpenAI research lanes now require a local spend policy, then use
+ spend preflight as the approval boundary for budgeted provider calls.
+ - Generated project context, AGENTS guidance, handoffs, and research docs now
+ distinguish budgeted ORP research from unbudgeted paid compute, purchases,
+ and cap-exceeded provider calls.
+
  ## v0.4.31 - 2026-04-25
  
  This release refreshes ORP's OpenAI-backed research lanes and tightens
package/cli/orp.py CHANGED
@@ -146,6 +146,11 @@ OPENAI_DEEP_RESEARCH_MODEL = OPENAI_RESEARCH_MODEL
  SECRET_SPEND_POLICY_SCHEMA_VERSION = "1.0.0"
  RESEARCH_SPEND_LEDGER_SCHEMA_VERSION = "1.0.0"
  PROJECT_CONTEXT_SCHEMA_VERSION = "1.0.0"
+ HYGIENE_REMOTE_SPEND_MOMENT = "before remote side effects or unbudgeted paid compute"
+ BUDGETED_RESEARCH_SPEND_RULE = (
+ "Do not hard-stop solely because an OpenAI research lane is paid; budgeted ORP research may run when "
+ "`orp research` spend preflight is within the configured daily cap."
+ )
  HYGIENE_POLICY_SCHEMA_VERSION = "1.0.0"
  MAINTENANCE_STATE_SCHEMA_VERSION = "1.0.0"
  SCHEDULE_REGISTRY_SCHEMA_VERSION = "1.0.0"
@@ -6921,7 +6926,7 @@ def _default_hygiene_policy() -> dict[str, Any]:
  run_moments = [
  "before long delegation",
  "after material writeback",
- "before API/remote/paid compute",
+ HYGIENE_REMOTE_SPEND_MOMENT,
  "when dirty state grows unexpectedly",
  ]
  self_healing_policy = [
@@ -10692,6 +10697,11 @@ def _project_research_trigger_policy() -> dict[str, Any]:
  "the project must compare multiple papers, standards, providers, or public claims",
  "the output needs a citation-rich report rather than a short decision memo",
  ],
+ "spend_policy": {
+ "budgeted_provider_calls": "OpenAI research lanes are paid but allowed when executed through ORP with a configured local spend policy and a passing spend preflight.",
+ "hard_stop_boundary": "Stop for missing required spend policy, missing secret, cap-exceeded preflight, unbudgeted provider spend, purchases, or non-ORP paid compute.",
+ "local_enforcement": "keychain spend policy with local_preflight_reservation",
+ },
  }
  
  
@@ -10711,7 +10721,7 @@ def _project_evolution_policy() -> dict[str, Any]:
  "run_moments": [
  "before long delegation",
  "after material writeback",
- "before API/remote/paid compute",
+ HYGIENE_REMOTE_SPEND_MOMENT,
  "when dirty state grows unexpectedly",
  ],
  "stop_rule": (
@@ -10720,10 +10730,11 @@ def _project_evolution_policy() -> dict[str, Any]:
  "or write a blocker first."
  ),
  "self_healing_rule": "Non-destructive by default: never reset, checkout, or delete files merely to hide dirty state.",
+ "budgeted_research_spend_rule": BUDGETED_RESEARCH_SPEND_RULE,
  },
  "evolution_loop": [
  "scan authority surfaces",
- "run worktree hygiene before expansion or remote spend",
+ "run worktree hygiene before expansion, remote side effects, or unbudgeted spend",
  "classify dirty state as canonical, runtime, source/test, docs, scratch, or blocker",
  "classify what is local, public, executable, or human-gated",
  "choose whether reasoning, web synthesis, or deep research is justified",
@@ -10764,9 +10775,10 @@ def _project_context_payload(repo_root: Path, *, source: str) -> dict[str, Any]:
  "run_moments": [
  "before long delegation",
  "after material writeback",
- "before API/remote/paid compute",
+ HYGIENE_REMOTE_SPEND_MOMENT,
  "when dirty state grows unexpectedly",
  ],
+ "budgeted_research_spend_rule": BUDGETED_RESEARCH_SPEND_RULE,
  },
  "evolution_policy": _project_evolution_policy(),
  "next_actions": [
@@ -10780,6 +10792,7 @@ def _project_context_payload(repo_root: Path, *, source: str) -> dict[str, Any]:
  "This file is ORP process context for the local directory.",
  "It is refreshed as the project evolves and should not be cited as proof or canonical evidence.",
  "Provider research calls remain opt-in through `orp research ask --execute`.",
+ BUDGETED_RESEARCH_SPEND_RULE,
  ],
  }
  
@@ -10912,7 +10925,8 @@ def _init_handoff_template(repo_root: Path, *, default_branch: str, initialized_
  "## Agent Rules\n\n"
  f"- Do not do meaningful implementation work directly on `{default_branch}` unless explicitly allowed.\n"
  "- Create a work branch before substantial edits.\n"
- "- Run `orp hygiene --json` before long delegation, after material writeback, before API/remote/paid compute, and when dirty state grows unexpectedly.\n"
+ f"- Run `orp hygiene --json` before long delegation, after material writeback, {HYGIENE_REMOTE_SPEND_MOMENT}, and when dirty state grows unexpectedly.\n"
+ f"- {BUDGETED_RESEARCH_SPEND_RULE}\n"
  "- Stop long-running expansion while hygiene reports `dirty_unclassified`; classify, refresh generated surfaces, canonicalize useful scratch, or write a blocker.\n"
  "- Hygiene is non-destructive: never reset, checkout, or delete files merely to hide dirty state.\n"
  "- Create a checkpoint commit after each meaningful completed unit of work.\n"
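The hunks above replace a repeated string literal with two module-level constants that are interpolated into the generated guidance. A minimal sketch of how those f-strings render, assuming only the two constant values shown in this diff (the `render_agent_rules` helper is hypothetical and not part of `orp.py`):

```python
# Constant values copied from the diff of package/cli/orp.py.
HYGIENE_REMOTE_SPEND_MOMENT = "before remote side effects or unbudgeted paid compute"
BUDGETED_RESEARCH_SPEND_RULE = (
    "Do not hard-stop solely because an OpenAI research lane is paid; budgeted ORP research may run when "
    "`orp research` spend preflight is within the configured daily cap."
)


def render_agent_rules() -> list[str]:
    """Hypothetical helper: render the two guidance bullets that use the constants."""
    return [
        "- Run `orp hygiene --json` before long delegation, after material writeback, "
        f"{HYGIENE_REMOTE_SPEND_MOMENT}, and when dirty state grows unexpectedly.",
        f"- {BUDGETED_RESEARCH_SPEND_RULE}",
    ]


if __name__ == "__main__":
    print("\n".join(render_agent_rules()))
```

Keeping the wording in one place means the handoff template, the agent guide block, and the JSON policy payloads cannot drift apart the way separately edited literals could.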
@@ -11098,7 +11112,8 @@ def _render_agent_guide_block(
  [
  "- Preserve human notes outside ORP-managed blocks.",
  "- Use this local file for the project-specific current state, local constraints, and concrete next moves.",
- "- Run `orp hygiene --json` before long delegation, after material writeback, before API/remote/paid compute, and when dirty state grows unexpectedly.",
+ f"- Run `orp hygiene --json` before long delegation, after material writeback, {HYGIENE_REMOTE_SPEND_MOMENT}, and when dirty state grows unexpectedly.",
+ f"- {BUDGETED_RESEARCH_SPEND_RULE}",
  "- Stop long-running expansion while hygiene reports `dirty_unclassified`; classify, refresh generated surfaces, canonicalize useful scratch, or write a blocker.",
  "- Hygiene is non-destructive: never reset, checkout, or delete files merely to hide dirty state.",
  ]
@@ -11516,9 +11531,10 @@ def _agent_policy_payload(
  "run_moments": [
  "before long delegation",
  "after material writeback",
- "before API/remote/paid compute",
+ HYGIENE_REMOTE_SPEND_MOMENT,
  "when dirty state grows unexpectedly",
  ],
+ "budgeted_research_spend_rule": BUDGETED_RESEARCH_SPEND_RULE,
  "required_self_healing": [
  "classify dirty paths",
  "refresh generated surfaces",
@@ -17714,6 +17730,7 @@ def _research_staged_deep_think_profile(profile_id: str = "deep-think-web-think-
  ],
  "env_var": "OPENAI_API_KEY",
  "secret_alias": "openai-primary",
+ "spend_policy_required": True,
  "reasoning_effort": "xhigh",
  "reasoning_summary": "auto",
  "web_search": True,
@@ -17751,6 +17768,7 @@ def _research_staged_deep_think_profile(profile_id: str = "deep-think-web-think-
  ],
  "env_var": "OPENAI_API_KEY",
  "secret_alias": "openai-primary",
+ "spend_policy_required": True,
  "reasoning_effort": "high",
  "text_verbosity": "medium",
  "spend_reserve_usd": 0.5,
@@ -17784,6 +17802,7 @@ def _research_staged_deep_think_profile(profile_id: str = "deep-think-web-think-
  ],
  "env_var": "OPENAI_API_KEY",
  "secret_alias": "openai-primary",
+ "spend_policy_required": True,
  "reasoning_effort": "high",
  "text_verbosity": "medium",
  "web_search": True,
@@ -17821,6 +17840,7 @@ def _research_staged_deep_think_profile(profile_id: str = "deep-think-web-think-
  ],
  "env_var": "OPENAI_API_KEY",
  "secret_alias": "openai-primary",
+ "spend_policy_required": True,
  "reasoning_effort": "high",
  "text_verbosity": "medium",
  "spend_reserve_usd": 0.5,
@@ -17854,6 +17874,7 @@ def _research_staged_deep_think_profile(profile_id: str = "deep-think-web-think-
  ],
  "env_var": "OPENAI_API_KEY",
  "secret_alias": "openai-primary",
+ "spend_policy_required": True,
  "reasoning_effort": "xhigh",
  "reasoning_summary": "auto",
  "web_search": True,
@@ -17935,6 +17956,7 @@ def _research_default_profile(profile_id: str = "openai-council") -> dict[str, A
  "role": "Deliberate high-reasoning pass from the provided context. Think hard, critique assumptions, and produce a decision-oriented answer.",
  "env_var": "OPENAI_API_KEY",
  "secret_alias": "openai-primary",
+ "spend_policy_required": True,
  "reasoning_effort": "high",
  "text_verbosity": "medium",
  "spend_reserve_usd": 0.5,
@@ -17950,6 +17972,7 @@ def _research_default_profile(profile_id: str = "openai-council") -> dict[str, A
  "role": "Recency-aware synthesis using OpenAI Responses web search with citations.",
  "env_var": "OPENAI_API_KEY",
  "secret_alias": "openai-primary",
+ "spend_policy_required": True,
  "reasoning_effort": "high",
  "text_verbosity": "medium",
  "web_search": True,
@@ -17970,6 +17993,7 @@ def _research_default_profile(profile_id: str = "openai-council") -> dict[str, A
  "role": "Pro Research style long-form investigation. Produce a structured, citation-rich report grounded in public sources.",
  "env_var": "OPENAI_API_KEY",
  "secret_alias": "openai-primary",
+ "spend_policy_required": True,
  "reasoning_effort": "xhigh",
  "reasoning_summary": "auto",
  "web_search": True,
@@ -18485,6 +18509,7 @@ def _research_openai_spend_preflight(
  provider = str(lane.get("provider", "") or "").strip()
  secret_alias = str(lane.get("secret_alias", "") or "").strip()
  reserve_usd = _research_lane_spend_reserve_usd(lane)
+ spend_policy_required = bool(lane.get("spend_policy_required", False))
  entry, entry_issue = _research_spend_policy_entry_for_lane(lane)
  policy = _normalize_secret_spend_policy(entry.get("spend_policy", {}) if isinstance(entry, dict) else {})
  date_utc = dt.datetime.now(dt.timezone.utc).date().isoformat()
@@ -18498,11 +18523,15 @@ def _research_openai_spend_preflight(
  "ledger_path": str(_research_spend_ledger_path()),
  }
  if not policy:
+ reason = entry_issue or "no spend policy configured for this local keychain entry"
+ if spend_policy_required:
+ reason = f"required spend policy missing: {reason}"
  return {
  **base,
- "allowed": True,
+ "allowed": not spend_policy_required,
  "policy_source": "",
- "reason": entry_issue or "no spend policy configured for this local keychain entry",
+ "spend_policy_required": spend_policy_required,
+ "reason": reason,
  }
  
  reserved_today = _research_spend_ledger_today_total(
@@ -18519,6 +18548,7 @@ def _research_openai_spend_preflight(
  **base,
  "allowed": allowed,
  "policy_source": "keychain",
+ "spend_policy_required": spend_policy_required,
  "daily_cap_usd": round(daily_cap_usd, 6),
  "currency": str(policy.get("currency", "USD")).strip() or "USD",
  "reserved_today_usd": reserved_today,
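The preflight change above flips the missing-policy case from "always allowed" to "allowed only when the lane does not require a policy". The sketch below isolates that decision as a standalone function. It is an illustration under stated assumptions, not the code in `orp.py`: the function name, its parameters, the `daily_cap_usd` policy key, and the cap comparison (`reserved_today + reserve <= cap`) are all assumptions, because the diff does not show how `allowed` is computed once a policy exists.

```python
from typing import Any, Optional


def spend_preflight_sketch(
    policy: Optional[dict[str, Any]],
    *,
    spend_policy_required: bool,
    reserve_usd: float,
    reserved_today_usd: float,
    entry_issue: str = "",
) -> dict[str, Any]:
    """Illustrative stand-in for the decision made in the spend preflight."""
    if not policy:
        # Mirrors the diff: a missing policy blocks only lanes that require one.
        reason = entry_issue or "no spend policy configured for this local keychain entry"
        if spend_policy_required:
            reason = f"required spend policy missing: {reason}"
        return {
            "allowed": not spend_policy_required,
            "policy_source": "",
            "spend_policy_required": spend_policy_required,
            "reason": reason,
        }

    # ASSUMPTION: the key name and the comparison below are not visible in the diff.
    daily_cap_usd = float(policy.get("daily_cap_usd", 0.0))
    allowed = reserved_today_usd + reserve_usd <= daily_cap_usd
    return {
        "allowed": allowed,
        "policy_source": "keychain",
        "spend_policy_required": spend_policy_required,
        "daily_cap_usd": round(daily_cap_usd, 6),
        "reserved_today_usd": reserved_today_usd,
    }
```

Called with `policy=None` and `spend_policy_required=True`, the sketch returns `allowed: False` with a reason prefixed by `required spend policy missing:`, which is the new hard-stop case this release introduces for the built-in OpenAI lanes.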
@@ -13,10 +13,15 @@ read:
  - Read `llms.txt`.
  - Run `orp about --json`.
  - Run `orp hygiene --json` before long delegation, after material writeback,
- before API/remote/paid compute, and whenever dirty state grows unexpectedly.
+ before remote side effects or unbudgeted paid compute, and whenever dirty
+ state grows unexpectedly.
  If it reports `dirty_unclassified`, stop long-running expansion and classify
  the paths, refresh generated surfaces, canonicalize useful scratch, or write a
  blocker before continuing.
+ - Do not hard-stop solely because an OpenAI research lane is paid. Budgeted ORP
+ research may run when `orp research` spend preflight is within the configured
+ daily cap; stop for missing required spend policy, missing secret, cap
+ exhaustion, purchases, or non-ORP paid compute.
  - If the task benefits from fresh concepting, tasteful interface work, or
  exploratory reframing, run:
  - `orp mode nudge sleek-minimal-progressive --json`
@@ -91,7 +91,7 @@ A true gate is not "the agent feels uncertain."
  
  A true gate is a boundary like:
  
- - spend or purchase
+ - unbudgeted spend or purchase
  - outreach or counterparty contact
  - provider/vendor selection with real consequences
  - legal/oversight/compliance judgment
@@ -196,7 +196,7 @@ Bad candidates for automatic compilation:
  
  - vague strategic narratives with no runnable command
  - tasks that imply counterparty contact
- - tasks that imply money
+ - tasks that imply unbudgeted money
  - steps that promote support-only outputs into authority
  
  ## What ORP Should Emit
@@ -230,6 +230,7 @@ The controller benchmark experiment surfaced the exact shape:
  2. compile the remaining pre-outreach tasks
  3. keep drafts unsent
  4. stop only when the next step would actually contact a counterparty or spend
+ outside a configured budget/preflight policy
  5. emit a gate dossier
  6. resume only after the human opens that gate
  
@@ -102,6 +102,26 @@ printf '%s' '<openai-key>' | orp secrets keychain-add \
  --json
  ```
  
+ ## Spend Policy
+
+ The OpenAI research lanes are paid, but paid does not automatically mean human
+ hard stop. ORP treats them as budgeted provider calls when `openai-primary` has
+ a local spend policy and the lane passes spend preflight.
+
+ Set or update the local daily cap metadata like this:
+
+ ```bash
+ orp secrets keychain-spend-policy openai-primary \
+ --daily-spend-cap-usd 5 \
+ --dashboard-spend-cap-status unconfirmed \
+ --dashboard-url https://platform.openai.com/settings/organization/limits \
+ --json
+ ```
+
+ Live research should stop when the required spend policy is missing, the secret
+ is missing, the daily cap would be exceeded, or the work is unbudgeted provider
+ spend outside ORP research lanes.
+
  ## Fixtures
  
  Provider outputs can be attached without spending live calls:
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "open-research-protocol",
- "version": "0.4.31",
+ "version": "0.4.32",
  "description": "ORP CLI (Open Research Protocol): workspace ledgers, secrets, scheduling, governed execution, and agent-friendly research workflows.",
  "license": "MIT",
  "author": "Fractal Research Group <cody@frg.earth>",