rlhf-feedback-loop 0.5.0

Files changed (73)
  1. package/CHANGELOG.md +26 -0
  2. package/LICENSE +21 -0
  3. package/README.md +308 -0
  4. package/adapters/README.md +8 -0
  5. package/adapters/amp/skills/rlhf-feedback/SKILL.md +20 -0
  6. package/adapters/chatgpt/INSTALL.md +80 -0
  7. package/adapters/chatgpt/openapi.yaml +292 -0
  8. package/adapters/claude/.mcp.json +8 -0
  9. package/adapters/codex/config.toml +4 -0
  10. package/adapters/gemini/function-declarations.json +95 -0
  11. package/adapters/mcp/server-stdio.js +444 -0
  12. package/bin/cli.js +167 -0
  13. package/config/mcp-allowlists.json +29 -0
  14. package/config/policy-bundles/constrained-v1.json +53 -0
  15. package/config/policy-bundles/default-v1.json +80 -0
  16. package/config/rubrics/default-v1.json +52 -0
  17. package/config/subagent-profiles.json +32 -0
  18. package/openapi/openapi.yaml +292 -0
  19. package/package.json +91 -0
  20. package/plugins/amp-skill/INSTALL.md +52 -0
  21. package/plugins/amp-skill/SKILL.md +31 -0
  22. package/plugins/claude-skill/INSTALL.md +55 -0
  23. package/plugins/claude-skill/SKILL.md +46 -0
  24. package/plugins/codex-profile/AGENTS.md +20 -0
  25. package/plugins/codex-profile/INSTALL.md +57 -0
  26. package/plugins/gemini-extension/INSTALL.md +74 -0
  27. package/plugins/gemini-extension/gemini_prompt.txt +10 -0
  28. package/plugins/gemini-extension/tool_contract.json +28 -0
  29. package/scripts/billing.js +471 -0
  30. package/scripts/budget-guard.js +173 -0
  31. package/scripts/code-reasoning.js +307 -0
  32. package/scripts/context-engine.js +547 -0
  33. package/scripts/contextfs.js +513 -0
  34. package/scripts/contract-audit.js +198 -0
  35. package/scripts/dpo-optimizer.js +208 -0
  36. package/scripts/export-dpo-pairs.js +316 -0
  37. package/scripts/export-training.js +448 -0
  38. package/scripts/feedback-attribution.js +313 -0
  39. package/scripts/feedback-inbox-read.js +162 -0
  40. package/scripts/feedback-loop.js +838 -0
  41. package/scripts/feedback-schema.js +300 -0
  42. package/scripts/feedback-to-memory.js +165 -0
  43. package/scripts/feedback-to-rules.js +109 -0
  44. package/scripts/generate-paperbanana-diagrams.sh +99 -0
  45. package/scripts/hybrid-feedback-context.js +676 -0
  46. package/scripts/intent-router.js +164 -0
  47. package/scripts/mcp-policy.js +92 -0
  48. package/scripts/meta-policy.js +194 -0
  49. package/scripts/plan-gate.js +154 -0
  50. package/scripts/prove-adapters.js +364 -0
  51. package/scripts/prove-attribution.js +364 -0
  52. package/scripts/prove-automation.js +393 -0
  53. package/scripts/prove-data-quality.js +219 -0
  54. package/scripts/prove-intelligence.js +256 -0
  55. package/scripts/prove-lancedb.js +370 -0
  56. package/scripts/prove-loop-closure.js +255 -0
  57. package/scripts/prove-rlaif.js +404 -0
  58. package/scripts/prove-subway-upgrades.js +250 -0
  59. package/scripts/prove-training-export.js +324 -0
  60. package/scripts/prove-v2-milestone.js +273 -0
  61. package/scripts/prove-v3-milestone.js +381 -0
  62. package/scripts/rlaif-self-audit.js +123 -0
  63. package/scripts/rubric-engine.js +230 -0
  64. package/scripts/self-heal.js +127 -0
  65. package/scripts/self-healing-check.js +111 -0
  66. package/scripts/skill-quality-tracker.js +284 -0
  67. package/scripts/subagent-profiles.js +79 -0
  68. package/scripts/sync-gh-secrets-from-env.sh +29 -0
  69. package/scripts/thompson-sampling.js +331 -0
  70. package/scripts/train_from_feedback.py +914 -0
  71. package/scripts/validate-feedback.js +580 -0
  72. package/scripts/vector-store.js +100 -0
  73. package/src/api/server.js +497 -0
package/CHANGELOG.md ADDED
@@ -0,0 +1,26 @@
1
+ # Changelog
2
+
3
+ ## 0.5.0 - 2026-03-03
4
+
5
+ - Added autonomous GitOps workflows: agent auto-merge, Dependabot auto-merge, self-healing monitor, and merge-branch fallback.
6
+ - Enabled CI proof artifact uploads and strengthened CI concurrency/branch scoping.
7
+ - Added self-healing command layer (`scripts/self-healing-check.js`, `scripts/self-heal.js`) with unit tests.
8
+ - Added semantic cache for ContextFS context-pack construction with TTL + similarity gating and provenance events.
9
+ - Added secret-sync helper (`scripts/sync-gh-secrets-from-env.sh`) and docs for required repo settings/secrets.
10
+
11
+ ## 0.4.0 - 2026-03-03
12
+
13
+ - Added rubric-based RLHF scoring with configurable criteria and weighted evaluation.
14
+ - Added anti-reward-hacking safeguards: guardrail checks and multi-judge disagreement detection.
15
+ - Added rubric-aware memory promotion gates for positive feedback.
16
+ - Added rubric-aware context evaluation, prevention-rule dimensions, and DPO export metadata.
17
+ - Extended API/MCP/Gemini contracts for rubric scores and guardrails.
18
+ - Added automated proof harness for rubric + intent + API/MCP end-to-end validation (`proof/automation/*`).
19
+
20
+ ## 0.3.0 - 2026-03-03
21
+
22
+ - Added production API server with secure auth defaults and safe-path checks.
23
+ - Added local MCP server for Claude/Codex integrations.
24
+ - Added ChatGPT, Gemini, Codex, Claude, and Amp adapter bundles.
25
+ - Added budget guard and PaperBanana generation workflow.
26
+ - Added platform research, packaging plan, and verification artifacts.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Igor Ganapolsky
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,308 @@
1
+ # RLHF Feedback Loop
2
+
3
+ [![CI](https://github.com/IgorGanapolsky/rlhf-feedback-loop/actions/workflows/ci.yml/badge.svg)](https://github.com/IgorGanapolsky/rlhf-feedback-loop/actions/workflows/ci.yml)
4
+ [![Self-Healing](https://github.com/IgorGanapolsky/rlhf-feedback-loop/actions/workflows/self-healing-monitor.yml/badge.svg)](https://github.com/IgorGanapolsky/rlhf-feedback-loop/actions/workflows/self-healing-monitor.yml)
5
+ [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
6
+ [![MCP Ready](https://img.shields.io/badge/MCP-ready-black)](adapters/mcp/server-stdio.js)
7
+ [![DPO Ready](https://img.shields.io/badge/DPO-ready-blue)](scripts/export-dpo-pairs.js)
8
+
9
+ Production-grade RLHF operations for AI agents across ChatGPT, Claude, Gemini, Codex, and Amp.
10
+
11
+ ## Quick Install
12
+
13
+ Install on any platform with a single command and start capturing feedback in under 5 minutes.
14
+
15
+ ### Universal (any platform)
16
+
17
+ ```bash
18
+ npx rlhf-feedback-loop init
19
+ node .rlhf/capture-feedback.js --feedback=up --context="test"
20
+ ```
21
+
22
+ ### Claude Code
23
+
24
+ ```bash
25
+ cp plugins/claude-skill/SKILL.md .claude/skills/rlhf-feedback.md
26
+ ```
27
+
28
+ Full guide: [plugins/claude-skill/INSTALL.md](plugins/claude-skill/INSTALL.md)
29
+
30
+ ### Codex
31
+
32
+ ```bash
33
+ cat adapters/codex/config.toml >> ~/.codex/config.toml
34
+ ```
35
+
36
+ Full guide: [plugins/codex-profile/INSTALL.md](plugins/codex-profile/INSTALL.md)
37
+
38
+ ### Gemini
39
+
40
+ ```bash
41
+ cp adapters/gemini/function-declarations.json .gemini/rlhf-tools.json
42
+ ```
43
+
44
+ Full guide: [plugins/gemini-extension/INSTALL.md](plugins/gemini-extension/INSTALL.md)
45
+
46
+ ### Amp
47
+
48
+ ```bash
49
+ cp plugins/amp-skill/SKILL.md .amp/skills/rlhf-feedback.md
50
+ ```
51
+
52
+ Full guide: [plugins/amp-skill/INSTALL.md](plugins/amp-skill/INSTALL.md)
53
+
54
+ ### ChatGPT (GPT Actions)
55
+
56
+ Import `adapters/chatgpt/openapi.yaml` in the GPT Builder Actions editor.
57
+
58
+ Full guide: [adapters/chatgpt/INSTALL.md](adapters/chatgpt/INSTALL.md)
59
+
60
+ ---
61
+
62
+ ## Value Proposition
63
+
64
+ Most teams collect feedback but do not convert it into reliable behavior change.
65
+ This project gives you a working loop:
66
+
67
+ 1. Capture thumbs up/down with context.
68
+ 2. Score outcomes with weighted rubrics and objective guardrails.
69
+ 3. Promote only schema-valid, rubric-eligible memories.
70
+ 4. Generate prevention rules from repeated mistakes and failed rubric dimensions.
71
+ 5. Export DPO-ready preference pairs with rubric deltas.
72
+ 6. Construct bounded context packs (constructor/loader/evaluator).
73
+ 7. Reuse the same core through API + MCP wrappers.
74
+ 8. Route intents through policy bundles with human checkpoints on high-risk actions.
75
+
76
+ ## Pricing
77
+
78
+ | Plan | Price | What you get |
79
+ |------|-------|-------------|
80
+ | **Open Source** | $0 forever | Full source, self-hosted, MIT license, 314+ tests, 5-platform plugins |
81
+ | **Cloud Pro** | $49/mo | Hosted HTTPS API on Railway, provisioned API key on payment, usage metering, email support |
82
+
83
+ Get Cloud Pro: see the [landing page](docs/landing-page.html) or go straight to Stripe Checkout.
84
+
85
+ ---
86
+
87
+ ## Quick Start
88
+
89
+ ```bash
90
+ cp .env.example .env
91
+ npm test
92
+ npm run prove:adapters
93
+ npm run prove:automation
94
+ npm run start:api
95
+ ```
96
+
97
+ Set `RLHF_API_KEY` before running the API (or explicitly set `RLHF_ALLOW_INSECURE=true` for isolated local testing only).
98
+
99
+ Capture feedback:
100
+
101
+ ```bash
102
+ node .claude/scripts/feedback/capture-feedback.js \
103
+ --feedback=down \
104
+ --context="Claimed done without test evidence" \
105
+ --what-went-wrong="No proof attached" \
106
+ --what-to-change="Always run tests and include output" \
107
+ --tags="verification,testing"
108
+ ```
109
+
110
+ ## Integration Adapters
111
+
112
+ - ChatGPT Actions: `adapters/chatgpt/openapi.yaml`
113
+ - Claude MCP: `adapters/claude/.mcp.json`
114
+ - Codex MCP: `adapters/codex/config.toml`
115
+ - Gemini tools: `adapters/gemini/function-declarations.json`
116
+ - Amp skill: `adapters/amp/skills/rlhf-feedback/SKILL.md`
117
+
118
+ ## API Surface
119
+
120
+ - `POST /v1/feedback/capture`
121
+ - `GET /v1/feedback/stats`
122
+ - `GET /v1/intents/catalog`
123
+ - `POST /v1/intents/plan`
124
+ - `GET /v1/feedback/summary`
125
+ - `POST /v1/feedback/rules`
126
+ - `POST /v1/dpo/export`
127
+ - `POST /v1/context/construct`
128
+ - `POST /v1/context/evaluate`
129
+ - `GET /v1/context/provenance`
130
+
131
+ Spec: `openapi/openapi.yaml`
132
+
133
+ ## Versioning
134
+
135
+ - Package/runtime release version: `package.json`
136
+ - API contract version: `openapi/openapi.yaml`
137
+ - MCP server protocol version: `adapters/mcp/server-stdio.js` `serverInfo.version`
138
+
139
+ ## ContextFS
140
+
141
+ The repo includes a file-system context substrate for multi-agent memory orchestration:
142
+ - Constructor: relevance-ranked context pack assembly
143
+ - Loader: strict `maxItems` + `maxChars` budgeting
144
+ - Evaluator: outcome/provenance logging for improvement loops
145
+
146
+ Docs: [docs/CONTEXTFS.md](docs/CONTEXTFS.md)
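
The Loader's `maxItems` + `maxChars` budgeting can be illustrated with a minimal sketch. The function name `loadContextPack`, the item shape (`id`, `score`, `text`), and the choice to skip (rather than truncate) an item that would overflow the character budget are all assumptions made for illustration; they are not taken from `scripts/contextfs.js`.

```javascript
// Hypothetical sketch of a budgeted loader; not the package's implementation.
// Items are assumed to carry a relevance `score` and a `text` payload.
function loadContextPack(items, { maxItems, maxChars }) {
  // Rank by relevance, highest first, without mutating the caller's array.
  const ranked = [...items].sort((a, b) => b.score - a.score);
  const selected = [];
  let usedChars = 0;

  for (const item of ranked) {
    if (selected.length >= maxItems) break; // hard cap on item count
    // Assumption: an item that would exceed the character budget is skipped,
    // so a smaller lower-ranked item may still fit.
    if (usedChars + item.text.length > maxChars) continue;
    selected.push(item);
    usedChars += item.text.length;
  }
  return { items: selected, usedChars };
}
```

Skipping oversized items keeps both limits strict: the returned pack never exceeds either budget, at the cost of occasionally passing over a highly ranked but large item.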
147
+
148
+ ## MCP Policy Profiles
149
+
150
+ Use least-privilege MCP profiles based on runtime risk:
151
+
152
+ - `default`: full local toolset
153
+ - `readonly`: read-heavy operations
154
+ - `locked`: summary-only constrained mode
155
+
156
+ Config: [config/mcp-allowlists.json](config/mcp-allowlists.json)
157
+
158
+ ## Rubric Engine
159
+
160
+ Rubric config: `config/rubrics/default-v1.json`
161
+
162
+ - Weighted criteria scoring (`1-5`)
163
+ - Multi-judge disagreement detection
164
+ - Objective guardrail checks (`testsPassed`, `pathSafety`, `budgetCompliant`)
165
+ - Promotion gate blocks positive memory writes on unsafe/high-disagreement signals
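
As a rough illustration of how these pieces could fit together, the sketch below combines weighted `1-5` scoring, a judge-disagreement check, and the three guardrails named above into a single promotion decision. The criteria, weights, thresholds (`maxDisagreement`, `minScore`), and function names are hypothetical; the actual rubric lives in `config/rubrics/default-v1.json` and may differ.

```javascript
// Hypothetical rubric for illustration only; weights are relative, not normalized.
const CRITERIA = [
  { id: "correctness", weight: 5 },
  { id: "verification", weight: 3 },
  { id: "clarity", weight: 2 },
];

// Weighted mean of one judge's 1-5 scores.
function weightedScore(scores) {
  let total = 0;
  let weightSum = 0;
  for (const { id, weight } of CRITERIA) {
    total += scores[id] * weight;
    weightSum += weight;
  }
  return total / weightSum;
}

// Promotion gate: every guardrail must pass, judges must roughly agree,
// and the mean score must clear a minimum. Both thresholds are assumptions.
function promotionGate({ judgeScores, guardrails, maxDisagreement = 1.0, minScore = 3.5 }) {
  const failedGuardrails = Object.entries(guardrails)
    .filter(([, passed]) => passed !== true)
    .map(([name]) => name);

  const totals = judgeScores.map(weightedScore);
  const mean = totals.reduce((sum, t) => sum + t, 0) / totals.length;
  const disagreement = Math.max(...totals) - Math.min(...totals);

  const promote =
    failedGuardrails.length === 0 && disagreement <= maxDisagreement && mean >= minScore;
  return { promote, mean, disagreement, failedGuardrails };
}
```

Here disagreement is measured as the spread between the highest and lowest judge totals, which is one simple choice among several (variance or pairwise deltas would also work).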
166
+
167
+ ## Intent Router
168
+
169
+ Versioned orchestration bundles define intent-to-action plans and checkpoint policy:
170
+
171
+ - Bundle configs: `config/policy-bundles/*.json`
172
+ - CLI list: `npm run intents:list`
173
+ - CLI plan: `npm run intents:plan`
174
+
175
+ The router marks high-risk intents as `checkpoint_required` unless explicitly approved.
176
+ Details: [docs/INTENT_ROUTER.md](docs/INTENT_ROUTER.md)
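
A minimal sketch of the checkpoint behaviour described above follows. Only the `checkpoint_required` status comes from this README; the bundle shape, the intent names, the `risk` field, and the `ready` and `unknown_intent` statuses are assumptions for illustration and do not reflect the real files under `config/policy-bundles/`.

```javascript
// Hypothetical policy bundle; the real bundle schema may differ.
const exampleBundle = {
  intents: {
    summarize_feedback: { risk: "low", actions: ["read_feedback", "summarize"] },
    publish_training_export: { risk: "high", actions: ["export_dpo", "upload"] },
  },
};

// Plan an intent: high-risk intents need explicit approval before they run.
function planIntent(bundle, intentId, { approved = false } = {}) {
  const intent = bundle.intents[intentId];
  if (!intent) return { intentId, status: "unknown_intent", actions: [] };

  const needsCheckpoint = intent.risk === "high" && !approved;
  return {
    intentId,
    status: needsCheckpoint ? "checkpoint_required" : "ready",
    actions: intent.actions,
  };
}
```

Returning the planned actions even when a checkpoint is required lets a human reviewer see what would run before approving it.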
177
+
178
+ ## Autonomous GitOps
179
+
180
+ The repo now ships with PR-gated autonomous operations:
181
+
182
+ - `CI` (`.github/workflows/ci.yml`): required quality gate (`npm test`, adapter proof, automation proof)
183
+ - `Agent PR Auto-Merge` (`.github/workflows/agent-automerge.yml`): auto-merges eligible agent branches (`claude/*`, `codex/*`, `auto/*`, `agent/*`) after required checks pass
184
+ - `Dependabot Auto-Merge` (`.github/workflows/dependabot-automerge.yml`): auto-approves and merges safe dependency updates after required checks pass
185
+ - `Self-Healing Monitor` (`.github/workflows/self-healing-monitor.yml`): scheduled health checks, auto-created alert issue on failure, remediation PR generation when fixable
186
+ - `Self-Healing Auto-Fix` (`.github/workflows/self-healing-auto-fix.yml`): scheduled safe-fix attempts that open remediation PRs
187
+ - `Merge Branch to Main` (`.github/workflows/merge-branch.yml`): manual fallback that still uses PR flow and branch protections
188
+
189
+ Required repo settings:
190
+
191
+ - `main` protected + required check(s)
192
+ - auto-merge enabled
193
+ - branch deletion on merge enabled
194
+
195
+ Secrets:
196
+
197
+ - Required: `GH_PAT` (or rely on `GITHUB_TOKEN` where permitted)
198
+ - Optional: `SENTRY_AUTH_TOKEN`, `SENTRY_DSN`
199
+ - Optional (LLM router): `LLM_GATEWAY_BASE_URL`, `LLM_GATEWAY_API_KEY`, `TETRATE_API_KEY`
200
+
201
+ Sync helper:
202
+
203
+ ```bash
204
+ bash scripts/sync-gh-secrets-from-env.sh IgorGanapolsky/rlhf-feedback-loop
205
+ ```
206
+
207
+ ## Architecture
208
+
209
+ ### RLHF Feedback Loop
210
+
211
+ ```mermaid
212
+ flowchart TD
213
+ A["👍/👎 User Feedback"] --> B["Capture Layer\n(context + tags)"]
214
+ B --> C{"Action Resolver"}
215
+ C -->|store-learning| D["Schema Validator"]
216
+ C -->|store-mistake| D
217
+ C -->|no-action| X["Discard"]
218
+ D -->|valid| E["Memory Store\n(learning / error)"]
219
+ D -->|invalid| X
220
+ E --> F["Analytics\n(trends + recurrence)"]
221
+ F --> G["Prevention Rules Engine"]
222
+ F --> H["DPO Export\n(prompt/chosen/rejected)"]
223
+ E --> I["Rubric Engine\n(weighted scoring + guardrails)"]
224
+ I -->|promotion gate| E
225
+ ```
226
+
227
+ ### Plugin Topology
228
+
229
+ ```mermaid
230
+ flowchart LR
231
+ subgraph Adapters
232
+ GPT["ChatGPT\n(GPT Actions)"]
233
+ CL["Claude\n(MCP Server)"]
234
+ CX["Codex\n(MCP Config)"]
235
+ GEM["Gemini\n(Function Calling)"]
236
+ AMP["Amp\n(Skills Template)"]
237
+ end
238
+
239
+ subgraph Core["RLHF Feedback API"]
240
+ SV["Schema Validation"]
241
+ PR["Prevention Rules"]
242
+ DPO["DPO Export"]
243
+ BG["Budget Guard\n($10/mo cap)"]
244
+ end
245
+
246
+ GPT <--> Core
247
+ CL <--> Core
248
+ CX <--> Core
249
+ GEM <--> Core
250
+ AMP <--> Core
251
+ ```
252
+
253
+ ### PaperBanana (high-fidelity PNG)
254
+
255
+ Generate richer architecture visuals with a budget guard:
256
+
257
+ ```bash
258
+ npm run diagrams:paperbanana
259
+ npm run budget:status
260
+ ```
261
+
262
+ Docs: [docs/PAPERBANANA.md](docs/PAPERBANANA.md)
263
+ Verification evidence: [docs/VERIFICATION_EVIDENCE.md](docs/VERIFICATION_EVIDENCE.md)
264
+ Compatibility proof artifacts: [proof/compatibility/report.md](proof/compatibility/report.md), [proof/compatibility/report.json](proof/compatibility/report.json)
265
+ Automation proof artifacts: [proof/automation/report.md](proof/automation/report.md), [proof/automation/report.json](proof/automation/report.json)
266
+
267
+ ## Budget Guardrail
268
+
269
+ Default monthly cap is `$10` for paid external operations.
270
+ The local budget ledger blocks additional spend if the cap would be exceeded.
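
The ledger check can be sketched as below. This is an assumption-laden illustration, not the logic in `scripts/budget-guard.js`: the ledger is modelled as a plain object keyed by `YYYY-MM`, amounts are tracked in integer cents to avoid floating-point drift, and a charge that lands exactly on the cap is allowed.

```javascript
// Hypothetical in-memory ledger; the real guard persists its ledger locally.
const MONTHLY_CAP_CENTS = 1000; // $10 default cap, expressed in cents

// "2026-03-03T..." -> "2026-03" (UTC month bucket).
function monthKey(date) {
  return date.toISOString().slice(0, 7);
}

// Record the charge only if it keeps the month's total at or under the cap.
function tryCharge(ledger, amountCents, date) {
  const key = monthKey(date);
  const spent = ledger[key] || 0;
  if (spent + amountCents > MONTHLY_CAP_CENTS) {
    return { allowed: false, spentCents: spent, remainingCents: MONTHLY_CAP_CENTS - spent };
  }
  ledger[key] = spent + amountCents;
  return {
    allowed: true,
    spentCents: ledger[key],
    remainingCents: MONTHLY_CAP_CENTS - ledger[key],
  };
}
```

Checking before writing means a blocked charge leaves the ledger untouched, so a later, smaller charge in the same month can still succeed.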
271
+
272
+ ## Semantic Cache (Cost + Latency)
273
+
274
+ Context pack construction now supports semantic cache reuse for similar queries:
275
+
276
+ - token-overlap (Jaccard) similarity gate
277
+ - TTL-bound cache entries
278
+ - full provenance (`context_pack_cache_hit`)
279
+
280
+ Environment toggles:
281
+
282
+ - `RLHF_SEMANTIC_CACHE_ENABLED=true|false` (default `true`)
283
+ - `RLHF_SEMANTIC_CACHE_THRESHOLD=0.7`
284
+ - `RLHF_SEMANTIC_CACHE_TTL_SECONDS=86400`
285
+
286
+ This directly reduces repeated retrieval/LLM context assembly work and improves response latency under budget constraints.
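
To make the gating concrete, here is a minimal sketch of a token-overlap (Jaccard) lookup with a TTL check. The constants mirror the defaults listed above, but the tokenizer, the cache entry shape (`query`, `createdAtMs`, `pack`), and the function names are hypothetical and are not taken from the package source.

```javascript
// Hypothetical sketch; constants mirror the documented environment defaults.
const SIMILARITY_THRESHOLD = 0.7; // RLHF_SEMANTIC_CACHE_THRESHOLD
const TTL_SECONDS = 86400;        // RLHF_SEMANTIC_CACHE_TTL_SECONDS

// Assumed tokenizer: lowercase, split on non-alphanumerics, de-duplicate.
function tokenize(text) {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  for (const token of a) {
    if (b.has(token)) intersection += 1;
  }
  return intersection / (a.size + b.size - intersection);
}

// Return the best unexpired entry at or above the threshold, or null on a miss.
function lookupCachedPack(cache, query, nowMs) {
  const queryTokens = tokenize(query);
  let best = null;
  for (const entry of cache) {
    const ageSeconds = (nowMs - entry.createdAtMs) / 1000;
    if (ageSeconds > TTL_SECONDS) continue; // expired entries never match
    const score = jaccard(queryTokens, tokenize(entry.query));
    if (score >= SIMILARITY_THRESHOLD && (best === null || score > best.score)) {
      best = { entry, score };
    }
  }
  return best;
}
```

A hit would be the point at which a `context_pack_cache_hit` provenance event is recorded; a miss falls through to normal context-pack construction.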
287
+
288
+ ## Optional Tetrate Router
289
+
290
+ Not required for core local RLHF logic.
291
+ Recommended only when routing paid LLM calls (PaperBanana, external judges, hosted control-plane features):
292
+
293
+ - centralized provider routing
294
+ - price/fallback control
295
+ - unified usage observability
296
+
297
+ ## Commercialization
298
+
299
+ - OSS core for adoption
300
+ - Hosted control plane for teams
301
+ - Enterprise support and compliance features
302
+
303
+ See:
304
+
305
+ - [docs/PACKAGING_AND_SALES_PLAN.md](docs/PACKAGING_AND_SALES_PLAN.md)
306
+ - [docs/PLATFORM_RESEARCH_2026-03-03.md](docs/PLATFORM_RESEARCH_2026-03-03.md)
307
+ - [docs/PLUGIN_DISTRIBUTION.md](docs/PLUGIN_DISTRIBUTION.md)
308
+ - [docs/AUTONOMOUS_GITOPS.md](docs/AUTONOMOUS_GITOPS.md)
package/adapters/README.md ADDED
@@ -0,0 +1,8 @@
1
+ # Adapter Bundles
2
+
3
+ - `chatgpt/openapi.yaml`: import into GPT Actions.
4
+ - `gemini/function-declarations.json`: Gemini function-calling definitions.
5
+ - `mcp/server-stdio.js`: local MCP server for Claude/Codex.
6
+ - `claude/.mcp.json`: example Claude Code MCP config.
7
+ - `codex/config.toml`: example Codex MCP profile section.
8
+ - `amp/skills/rlhf-feedback/SKILL.md`: Amp skill template.
package/adapters/amp/skills/rlhf-feedback/SKILL.md ADDED
@@ -0,0 +1,20 @@
1
+ ---
2
+ name: rlhf-feedback
3
+ description: Capture thumbs feedback and apply prevention rules before coding
4
+ ---
5
+
6
+ # Amp RLHF Skill
7
+
8
+ On explicit user feedback:
9
+
10
+ ```bash
11
+ node .claude/scripts/feedback/capture-feedback.js --feedback=up --context="..." --tags="..."
12
+ node .claude/scripts/feedback/capture-feedback.js --feedback=down --context="..." --tags="..."
13
+ ```
14
+
15
+ Before major implementation:
16
+
17
+ ```bash
18
+ npm run feedback:summary
19
+ npm run feedback:rules
20
+ ```
package/adapters/chatgpt/INSTALL.md ADDED
@@ -0,0 +1,80 @@
1
+ # ChatGPT GPT Actions: RLHF Feedback Loop Install
2
+
3
+ Import the OpenAPI spec into a Custom GPT in under 5 minutes. No coding required.
4
+
5
+ ## Prerequisites
6
+
7
+ - A ChatGPT Plus or Team account (Custom GPTs require a paid plan)
8
+ - RLHF API running at a public HTTPS URL (see [Deployment docs](../../docs/deployment.md))
9
+
10
+ ## Step 1 — Open GPT Builder
11
+
12
+ 1. Go to [https://chat.openai.com/gpts/editor](https://chat.openai.com/gpts/editor)
13
+ 2. Click **Create a GPT**
14
+ 3. Switch to the **Configure** tab
15
+
16
+ ## Step 2 — Add Actions
17
+
18
+ 1. Scroll to the **Actions** section
19
+ 2. Click **Create new action**
20
+ 3. Click **Import from URL** — paste your hosted spec URL:
21
+ ```
22
+ https://<your-railway-domain>/openapi.yaml
23
+ ```
24
+ Or click **Upload file** and select:
25
+ ```
26
+ adapters/chatgpt/openapi.yaml
27
+ ```
28
+
29
+ ## Step 3 — Set Authentication
30
+
31
+ In the Actions panel:
32
+
33
+ 1. Select **Authentication type: API Key**
34
+ 2. **Auth type**: Bearer
35
+ 3. **API Key**: paste your `RLHF_API_KEY` value
36
+
37
+ ## Step 4 — Update the Server URL
38
+
39
+ In the imported spec, confirm the `servers.url` points to your deployed API:
40
+
41
+ ```yaml
42
+ servers:
43
+ - url: https://<your-railway-domain>
44
+ ```
45
+
46
+ If you uploaded the file, edit the server URL in the GPT Actions editor.
47
+
48
+ ## Step 5 — Verify
49
+
50
+ Click **Test** on the `captureFeedback` action:
51
+
52
+ ```json
53
+ {
54
+ "signal": "up",
55
+ "context": "GPT Actions install verified"
56
+ }
57
+ ```
58
+
59
+ Expected response: `200 OK` with `{ "id": "fb-...", "status": "captured" }`.
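
The same check can be run outside GPT Builder. The sketch below builds the request that the `captureFeedback` action is expected to send, using the path, Bearer scheme, and payload fields shown in this guide. The helper names are hypothetical, the `Content-Type: application/json` header is an assumption, and the base URL in the usage note is a placeholder for your own deployment.

```javascript
// Hypothetical helper: builds (but does not send) the capture request.
function buildCaptureRequest(baseUrl, apiKey, payload) {
  return {
    url: new URL("/v1/feedback/capture", baseUrl).toString(),
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json", // assumed; confirm against openapi.yaml
      },
      body: JSON.stringify(payload),
    },
  };
}

// Sends the request with the global fetch (Node 18+) and returns the parsed body.
async function captureFeedback(baseUrl, apiKey, payload) {
  const { url, init } = buildCaptureRequest(baseUrl, apiKey, payload);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`capture failed: HTTP ${response.status}`);
  }
  return response.json();
}
```

For example, `captureFeedback("https://<your-railway-domain>", process.env.RLHF_API_KEY, { signal: "up", context: "GPT Actions install verified" })` should resolve to the response body described above if the deployment and key are valid.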
60
+
61
+ ## Available Actions
62
+
63
+ | Action | Method | Path | Description |
64
+ |---|---|---|---|
65
+ | `captureFeedback` | POST | `/v1/feedback/capture` | Capture thumbs up/down signal |
66
+ | `getFeedbackStats` | GET | `/v1/feedback/stats` | Aggregated feedback statistics |
67
+ | `getFeedbackSummary` | GET | `/v1/feedback/summary` | Recent feedback summary |
68
+ | `generatePreventionRules` | POST | `/v1/feedback/rules` | Generate prevention rules |
69
+ | `exportDpoPairs` | POST | `/v1/dpo/export` | Export DPO preference pairs |
70
+ | `listIntentCatalog` | GET | `/v1/intents/catalog` | List available intents |
71
+ | `planIntent` | POST | `/v1/intents/plan` | Generate policy-scoped plan |
72
+ | `constructContextPack` | POST | `/v1/context/construct` | Build context pack |
73
+
74
+ Full spec: `adapters/chatgpt/openapi.yaml`
75
+
76
+ ## Troubleshooting
77
+
78
+ - **401 Unauthorized**: Verify `RLHF_API_KEY` is set and matches the Bearer token
79
+ - **Connection refused**: Confirm Railway deployment is live (`curl https://<domain>/health`)
80
+ - **Schema errors**: Ensure you are using the latest `openapi.yaml` (version 1.1.0+)