rlhf-feedback-loop 0.5.0 → 0.6.1

package/README.md CHANGED
@@ -1,308 +1,77 @@
1
1
  # RLHF Feedback Loop
2
2
 
3
3
  [![CI](https://github.com/IgorGanapolsky/rlhf-feedback-loop/actions/workflows/ci.yml/badge.svg)](https://github.com/IgorGanapolsky/rlhf-feedback-loop/actions/workflows/ci.yml)
4
- [![Self-Healing](https://github.com/IgorGanapolsky/rlhf-feedback-loop/actions/workflows/self-healing-monitor.yml/badge.svg)](https://github.com/IgorGanapolsky/rlhf-feedback-loop/actions/workflows/self-healing-monitor.yml)
4
+ [![npm](https://img.shields.io/npm/v/rlhf-feedback-loop)](https://www.npmjs.com/package/rlhf-feedback-loop)
5
5
  [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
6
6
  [![MCP Ready](https://img.shields.io/badge/MCP-ready-black)](adapters/mcp/server-stdio.js)
7
7
  [![DPO Ready](https://img.shields.io/badge/DPO-ready-blue)](scripts/export-dpo-pairs.js)
8
8
 
9
- Production-grade RLHF operations for AI agents across ChatGPT, Claude, Gemini, Codex, and Amp.
10
-
11
- ## Quick Install
12
-
13
- Install on any platform with a single command. Be capturing feedback in under 5 minutes.
14
-
15
- ### Universal (any platform)
16
-
17
- ```bash
18
- npx rlhf-feedback-loop init
19
- node .rlhf/capture-feedback.js --feedback=up --context="test"
20
- ```
21
-
22
- ### Claude Code
23
-
24
- ```bash
25
- cp plugins/claude-skill/SKILL.md .claude/skills/rlhf-feedback.md
26
- ```
27
-
28
- Full guide: [plugins/claude-skill/INSTALL.md](plugins/claude-skill/INSTALL.md)
29
-
30
- ### Codex
31
-
32
- ```bash
33
- cat adapters/codex/config.toml >> ~/.codex/config.toml
34
- ```
35
-
36
- Full guide: [plugins/codex-profile/INSTALL.md](plugins/codex-profile/INSTALL.md)
37
-
38
- ### Gemini
39
-
40
- ```bash
41
- cp adapters/gemini/function-declarations.json .gemini/rlhf-tools.json
42
- ```
43
-
44
- Full guide: [plugins/gemini-extension/INSTALL.md](plugins/gemini-extension/INSTALL.md)
45
-
46
- ### Amp
47
-
48
- ```bash
49
- cp plugins/amp-skill/SKILL.md .amp/skills/rlhf-feedback.md
50
- ```
51
-
52
- Full guide: [plugins/amp-skill/INSTALL.md](plugins/amp-skill/INSTALL.md)
53
-
54
- ### ChatGPT (GPT Actions)
55
-
56
- Import `adapters/chatgpt/openapi.yaml` in the GPT Builder Actions editor.
57
-
58
- Full guide: [adapters/chatgpt/INSTALL.md](adapters/chatgpt/INSTALL.md)
59
-
60
- ---
61
-
62
- ## Value Proposition
63
-
64
- Most teams collect feedback but do not convert it into reliable behavior change.
65
- This project gives you a working loop:
66
-
67
- 1. Capture thumbs up/down with context.
68
- 2. Score outcomes with weighted rubrics and objective guardrails.
69
- 3. Promote only schema-valid, rubric-eligible memories.
70
- 4. Generate prevention rules from repeated mistakes and failed rubric dimensions.
71
- 5. Export DPO-ready preference pairs with rubric deltas.
72
- 6. Construct bounded context packs (constructor/loader/evaluator).
73
- 7. Reuse the same core through API + MCP wrappers.
74
- 8. Route intents through policy bundles with human checkpoints on high-risk actions.
75
-
76
- ## Pricing
77
-
78
- | Plan | Price | What you get |
79
- |------|-------|-------------|
80
- | **Open Source** | $0 forever | Full source, self-hosted, MIT license, 314+ tests, 5-platform plugins |
81
- | **Cloud Pro** | $49/mo | Hosted HTTPS API on Railway, provisioned API key on payment, usage metering, email support |
82
-
83
- Get Cloud Pro: see the [landing page](docs/landing-page.html) or go straight to Stripe Checkout.
84
-
85
- ---
86
-
87
- ## Quick Start
88
-
89
- ```bash
90
- cp .env.example .env
91
- npm test
92
- npm run prove:adapters
93
- npm run prove:automation
94
- npm run start:api
95
- ```
96
-
97
- Set `RLHF_API_KEY` before running the API (or explicitly set `RLHF_ALLOW_INSECURE=true` for isolated local testing only).
98
-
99
- Capture feedback:
100
-
101
- ```bash
102
- node .claude/scripts/feedback/capture-feedback.js \
103
- --feedback=down \
104
- --context="Claimed done without test evidence" \
105
- --what-went-wrong="No proof attached" \
106
- --what-to-change="Always run tests and include output" \
107
- --tags="verification,testing"
108
- ```
109
-
110
- ## Integration Adapters
111
-
112
- - ChatGPT Actions: `adapters/chatgpt/openapi.yaml`
113
- - Claude MCP: `adapters/claude/.mcp.json`
114
- - Codex MCP: `adapters/codex/config.toml`
115
- - Gemini tools: `adapters/gemini/function-declarations.json`
116
- - Amp skill: `adapters/amp/skills/rlhf-feedback/SKILL.md`
117
-
118
- ## API Surface
119
-
120
- - `POST /v1/feedback/capture`
121
- - `GET /v1/feedback/stats`
122
- - `GET /v1/intents/catalog`
123
- - `POST /v1/intents/plan`
124
- - `GET /v1/feedback/summary`
125
- - `POST /v1/feedback/rules`
126
- - `POST /v1/dpo/export`
127
- - `POST /v1/context/construct`
128
- - `POST /v1/context/evaluate`
129
- - `GET /v1/context/provenance`
130
-
131
- Spec: `openapi/openapi.yaml`
132
-
133
- ## Versioning
134
-
135
- - Package/runtime release version: `package.json`
136
- - API contract version: `openapi/openapi.yaml`
137
- - MCP server protocol version: `adapters/mcp/server-stdio.js` `serverInfo.version`
138
-
139
- ## ContextFS
140
-
141
- The repo includes a file-system context substrate for multi-agent memory orchestration:
142
- - Constructor: relevance-ranked context pack assembly
143
- - Loader: strict `maxItems` + `maxChars` budgeting
144
- - Evaluator: outcome/provenance logging for improvement loops
145
-
146
- Docs: [docs/CONTEXTFS.md](docs/CONTEXTFS.md)
147
-
148
- ## MCP Policy Profiles
149
-
150
- Use least-privilege MCP profiles based on runtime risk:
151
-
152
- - `default`: full local toolset
153
- - `readonly`: read-heavy operations
154
- - `locked`: summary-only constrained mode
155
-
156
- Config: [config/mcp-allowlists.json](config/mcp-allowlists.json)
157
-
158
- ## Rubric Engine
159
-
160
- Rubric config: `config/rubrics/default-v1.json`
161
-
162
- - Weighted criteria scoring (`1-5`)
163
- - Multi-judge disagreement detection
164
- - Objective guardrail checks (`testsPassed`, `pathSafety`, `budgetCompliant`)
165
- - Promotion gate blocks positive memory writes on unsafe/high-disagreement signals
166
-
167
- ## Intent Router
168
-
169
- Versioned orchestration bundles define intent-to-action plans and checkpoint policy:
170
-
171
- - Bundle configs: `config/policy-bundles/*.json`
172
- - CLI list: `npm run intents:list`
173
- - CLI plan: `npm run intents:plan`
174
-
175
- The router marks high-risk intents as `checkpoint_required` unless explicitly approved.
176
- Details: [docs/INTENT_ROUTER.md](docs/INTENT_ROUTER.md)
177
-
178
- ## Autonomous GitOps
179
-
180
- The repo now ships with PR-gated autonomous operations:
181
-
182
- - `CI` (`.github/workflows/ci.yml`): required quality gate (`npm test`, adapter proof, automation proof)
183
- - `Agent PR Auto-Merge` (`.github/workflows/agent-automerge.yml`): auto-merges eligible agent branches (`claude/*`, `codex/*`, `auto/*`, `agent/*`) after required checks pass
184
- - `Dependabot Auto-Merge` (`.github/workflows/dependabot-automerge.yml`): auto-approves and merges safe dependency updates after required checks pass
185
- - `Self-Healing Monitor` (`.github/workflows/self-healing-monitor.yml`): scheduled health checks, auto-created alert issue on failure, remediation PR generation when fixable
186
- - `Self-Healing Auto-Fix` (`.github/workflows/self-healing-auto-fix.yml`): scheduled safe-fix attempts that open remediation PRs
187
- - `Merge Branch to Main` (`.github/workflows/merge-branch.yml`): manual fallback that still uses PR flow and branch protections
188
-
189
- Required repo settings:
190
-
191
- - `main` protected + required check(s)
192
- - auto-merge enabled
193
- - branch deletion on merge enabled
194
-
195
- Secrets:
196
-
197
- - Required: `GH_PAT` (or rely on `GITHUB_TOKEN` where permitted)
198
- - Optional: `SENTRY_AUTH_TOKEN`, `SENTRY_DSN`
199
- - Optional (LLM router): `LLM_GATEWAY_BASE_URL`, `LLM_GATEWAY_API_KEY`, `TETRATE_API_KEY`
200
-
201
- Sync helper:
202
-
203
- ```bash
204
- bash scripts/sync-gh-secrets-from-env.sh IgorGanapolsky/rlhf-feedback-loop
205
- ```
9
+ **Make your AI agent learn from mistakes.** Capture thumbs up/down feedback, block repeated failures, and export DPO training data — across ChatGPT, Claude, Codex, Gemini, and Amp.
206
10
 
207
11
  ## Architecture
208
12
 
209
- ### RLHF Feedback Loop
13
+ ![RLHF Architecture](docs/diagrams/rlhf-architecture-pb.png)
210
14
 
211
- ```mermaid
212
- flowchart TD
213
- A["👍/👎 User Feedback"] --> B["Capture Layer\n(context + tags)"]
214
- B --> C{"Action Resolver"}
215
- C -->|store-learning| D["Schema Validator"]
216
- C -->|store-mistake| D
217
- C -->|no-action| X["Discard"]
218
- D -->|valid| E["Memory Store\n(learning / error)"]
219
- D -->|invalid| X
220
- E --> F["Analytics\n(trends + recurrence)"]
221
- F --> G["Prevention Rules Engine"]
222
- F --> H["DPO Export\n(prompt/chosen/rejected)"]
223
- E --> I["Rubric Engine\n(weighted scoring + guardrails)"]
224
- I -->|promotion gate| E
225
- ```
15
+ ![Plugin Topology](docs/diagrams/plugin-topology-pb.png)
226
16
 
227
- ### Plugin Topology
17
+ ## Get Started
228
18
 
229
- ```mermaid
230
- flowchart LR
231
- subgraph Adapters
232
- GPT["ChatGPT\n(GPT Actions)"]
233
- CL["Claude\n(MCP Server)"]
234
- CX["Codex\n(MCP Config)"]
235
- GEM["Gemini\n(Function Calling)"]
236
- AMP["Amp\n(Skills Template)"]
237
- end
19
+ One command. Pick your platform:
238
20
 
239
- subgraph Core["RLHF Feedback API"]
240
- SV["Schema Validation"]
241
- PR["Prevention Rules"]
242
- DPO["DPO Export"]
243
- BG["Budget Guard\n($10/mo cap)"]
244
- end
245
-
246
- GPT <--> Core
247
- CL <--> Core
248
- CX <--> Core
249
- GEM <--> Core
250
- AMP <--> Core
251
- ```
21
+ | Platform | Install |
22
+ |----------|---------|
23
+ | **Claude** | `claude mcp add rlhf -- npx -y rlhf-feedback-loop serve` |
24
+ | **Codex** | `codex mcp add rlhf -- npx -y rlhf-feedback-loop serve` |
25
+ | **Gemini** | `gemini mcp add rlhf -- npx -y rlhf-feedback-loop serve` |
26
+ | **All at once** | `npx add-mcp rlhf-feedback-loop` |
252
27
 
253
- ### PaperBanana (high-fidelity PNG)
28
+ That's it. Your agent can now capture feedback, recall past learnings mid-conversation, and block repeated mistakes.
254
29
 
255
- Generate richer architecture visuals with a budget guard:
30
+ ## How It Works
256
31
 
257
- ```bash
258
- npm run diagrams:paperbanana
259
- npm run budget:status
32
+ ```
33
+         Thumbs up/down
34
+               |
35
+               v
36
+       Capture → JSONL log
37
+               |
38
+               v
39
+ Rubric engine (block false positives)
40
+               |
41
+           +---+---+
42
+           |       |
43
+         Good     Bad
44
+           |       |
45
+           v       v
46
+         Learn   Prevention rule
47
+           |       |
48
+           v       v
49
+        LanceDB  ShieldCortex
50
+        vectors  context packs
51
+           |
52
+           v
53
+ DPO export → fine-tune your model
260
54
  ```
261
55
 
262
- Docs: [docs/PAPERBANANA.md](docs/PAPERBANANA.md)
263
- Verification evidence: [docs/VERIFICATION_EVIDENCE.md](docs/VERIFICATION_EVIDENCE.md)
264
- Compatibility proof artifacts: [proof/compatibility/report.md](proof/compatibility/report.md), [proof/compatibility/report.json](proof/compatibility/report.json)
265
- Automation proof artifacts: [proof/automation/report.md](proof/automation/report.md), [proof/automation/report.json](proof/automation/report.json)
266
-
267
- ## Budget Guardrail
268
-
269
- Default monthly cap is `$10` for paid external operations.
270
- The local budget ledger blocks additional spend if cap would be exceeded.
271
-
272
- ## Semantic Cache (Cost + Latency)
273
-
274
- Context pack construction now supports semantic cache reuse for similar queries:
275
-
276
- - token-overlap (Jaccard) similarity gate
277
- - TTL-bound cache entries
278
- - full provenance (`context_pack_cache_hit`)
279
-
280
- Environment toggles:
281
-
282
- - `RLHF_SEMANTIC_CACHE_ENABLED=true|false` (default `true`)
283
- - `RLHF_SEMANTIC_CACHE_THRESHOLD=0.7`
284
- - `RLHF_SEMANTIC_CACHE_TTL_SECONDS=86400`
285
-
286
- This directly reduces repeated retrieval/LLM context assembly work and improves response latency under budget constraints.
287
-
288
- ## Optional Tetrate Router
56
+ All data is stored locally as **JSONL** files: fully transparent, fully portable, no vendor lock-in. **LanceDB** indexes memories as vector embeddings for semantic search. **ShieldCortex** assembles context packs so your agent starts each task informed.
289
57
 
290
- Not required for core local RLHF logic.
291
- Recommended only when routing paid LLM calls (PaperBanana, external judges, hosted control-plane features):
58
+ ## Why This Exists
292
59
 
293
- - centralized provider routing
294
- - price/fallback control
295
- - unified usage observability
60
+ | Problem | What this does |
61
+ |---------|---------------|
62
+ | Agent keeps making the same mistake | Prevention rules auto-generated from repeated failures |
63
+ | Agent claims "done" without proof | Rubric engine blocks positive feedback without test evidence |
64
+ | Feedback collected but never used | DPO pairs exported for actual model fine-tuning |
65
+ | Different tools, different formats | One MCP server works across 5 platforms |
66
+ | Agent starts every task blank | In-session recall injects past learnings into current conversation |
296
67
 
297
- ## Commercialization
68
+ ## Deep Dive
298
69
 
299
- - OSS core for adoption
300
- - Hosted control plane for teams
301
- - Enterprise support and compliance features
70
+ - [API Reference](openapi/openapi.yaml): full OpenAPI spec
71
+ - [Context Engine](docs/CONTEXTFS.md): multi-agent memory orchestration
72
+ - [Autonomous GitOps](docs/AUTONOMOUS_GITOPS.md): self-healing CI/CD
73
+ - [Contributing](CONTRIBUTING.md)
302
74
 
303
- See:
75
+ ## License
304
76
 
305
- - [docs/PACKAGING_AND_SALES_PLAN.md](docs/PACKAGING_AND_SALES_PLAN.md)
306
- - [docs/PLATFORM_RESEARCH_2026-03-03.md](docs/PLATFORM_RESEARCH_2026-03-03.md)
307
- - [docs/PLUGIN_DISTRIBUTION.md](docs/PLUGIN_DISTRIBUTION.md)
308
- - [docs/AUTONOMOUS_GITOPS.md](docs/AUTONOMOUS_GITOPS.md)
77
+ MIT. See [LICENSE](LICENSE).
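Review note: the new README states that all data is stored locally as JSONL files. A minimal sketch of that storage pattern follows, assuming one JSON object per line. The field names mirror the inline capture script that 0.5.0 generated (`id`, `signal`, `context`, `tags`, `timestamp`); `appendEntry` and `readEntries` are hypothetical helper names, not part of the package.

```javascript
'use strict';

const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: append one record as a single JSON line.
function appendEntry(logPath, entry) {
  fs.mkdirSync(path.dirname(logPath), { recursive: true });
  fs.appendFileSync(logPath, JSON.stringify(entry) + '\n');
}

// Hypothetical helper: read every record back, skipping blank lines.
function readEntries(logPath) {
  if (!fs.existsSync(logPath)) return [];
  return fs
    .readFileSync(logPath, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Demo against a throwaway directory so no real .rlhf/ data is touched.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'rlhf-jsonl-'));
const demoLog = path.join(demoDir, 'feedback-log.jsonl');

appendEntry(demoLog, {
  id: `fb-${Date.now()}`,
  signal: 'down',
  context: 'Claimed done without test evidence',
  tags: ['verification', 'testing'],
  timestamp: new Date().toISOString(),
});

const demoEntries = readEntries(demoLog);
console.log(demoEntries.length, demoEntries[0].signal);
```

Because each record is a self-contained line, the log can be appended to without rewriting the file and inspected with ordinary text tools.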
@@ -31,6 +31,9 @@ const {
31
31
  getAllowedTools,
32
32
  assertToolAllowed,
33
33
  } = require('../../scripts/mcp-policy');
34
+ const {
35
+ searchSimilar,
36
+ } = require('../../scripts/vector-store');
34
37
 
35
38
  const SERVER_INFO = {
36
39
  name: 'rlhf-feedback-loop-mcp',
@@ -212,6 +215,18 @@ const TOOLS = [
212
215
  },
213
216
  },
214
217
  },
218
+ {
219
+ name: 'recall',
220
+ description: 'Recall relevant past feedback, memories, and prevention rules for the current task. Call this at the start of any task to inject past learnings into the conversation.',
221
+ inputSchema: {
222
+ type: 'object',
223
+ required: ['query'],
224
+ properties: {
225
+ query: { type: 'string', description: 'Describe the current task or context to find relevant past feedback' },
226
+ limit: { type: 'number', description: 'Max memories to return (default 5)' },
227
+ },
228
+ },
229
+ },
215
230
  ];
216
231
 
217
232
  function toText(result) {
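Review note: the hunk above registers a `recall` tool whose schema requires `query` and accepts an optional numeric `limit`. As a rough illustration of how a client might call it, the sketch below builds a JSON-RPC 2.0 `tools/call` request, assuming the client follows the MCP convention of `params.name` plus `params.arguments`. `validateArgs` and `buildRecallRequest` are hypothetical helpers written for this note.

```javascript
'use strict';

// Input schema copied from the `recall` tool definition in this hunk
// (descriptions omitted).
const RECALL_SCHEMA = {
  type: 'object',
  required: ['query'],
  properties: {
    query: { type: 'string' },
    limit: { type: 'number' },
  },
};

// Hypothetical helper: minimal check of required keys and primitive types.
function validateArgs(schema, args) {
  const errors = [];
  for (const key of schema.required || []) {
    if (args[key] === undefined) errors.push(`missing required "${key}"`);
  }
  for (const [key, value] of Object.entries(args)) {
    const spec = schema.properties[key];
    if (!spec) continue; // unknown keys are ignored in this sketch
    if (typeof value !== spec.type) errors.push(`"${key}" must be a ${spec.type}`);
  }
  return errors;
}

// Hypothetical helper: build a JSON-RPC 2.0 `tools/call` request.
function buildRecallRequest(id, args) {
  const errors = validateArgs(RECALL_SCHEMA, args);
  if (errors.length > 0) throw new Error(errors.join('; '));
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'recall', arguments: args },
  };
}

const recallRequest = buildRecallRequest(1, {
  query: 'refactor the CLI init command',
  limit: 3,
});
console.log(JSON.stringify(recallRequest));
```

Validating on the client side is optional; it only surfaces a missing `query` before the request is sent.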
@@ -237,6 +252,56 @@ function parseOptionalObject(input, name) {
237
252
  async function callTool(name, args = {}) {
238
253
  assertToolAllowed(name, getActiveMcpProfile());
239
254
 
255
+ if (name === 'recall') {
256
+ const query = args.query || '';
257
+ const limit = Number(args.limit || 5);
258
+ const parts = [];
259
+
260
+ // 1. Vector search for similar past feedback
261
+ try {
262
+ const similar = await searchSimilar(query, limit);
263
+ if (similar.length > 0) {
264
+ parts.push('## Relevant Past Feedback\n');
265
+ for (const mem of similar) {
266
+ const signal = mem.signal === 'positive' ? 'GOOD' : 'BAD';
267
+ parts.push(`**[${signal}]** ${mem.context}`);
268
+ if (mem.tags) parts.push(` Tags: ${mem.tags}`);
269
+ parts.push('');
270
+ }
271
+ }
272
+ } catch (_) {
273
+ // Vector store may not be initialized yet — fall back to JSONL
274
+ }
275
+
276
+ // 2. Load prevention rules
277
+ try {
278
+ const rulesPath = path.join(SAFE_DATA_DIR, 'prevention-rules.md');
279
+ if (fs.existsSync(rulesPath)) {
280
+ const rules = fs.readFileSync(rulesPath, 'utf8').trim();
281
+ if (rules.length > 50) {
282
+ parts.push('## Active Prevention Rules\n');
283
+ parts.push(rules);
284
+ parts.push('');
285
+ }
286
+ }
287
+ } catch (_) {}
288
+
289
+ // 3. Recent feedback summary
290
+ try {
291
+ const summary = feedbackSummary(10);
292
+ if (summary) {
293
+ parts.push('## Recent Feedback Summary\n');
294
+ parts.push(summary);
295
+ }
296
+ } catch (_) {}
297
+
298
+ const text = parts.length > 0
299
+ ? parts.join('\n')
300
+ : 'No past feedback found. This appears to be a fresh start.';
301
+
302
+ return { content: [{ type: 'text', text }] };
303
+ }
304
+
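Review note: the `recall` branch above wraps each of its three sources (vector search, prevention rules, recent summary) in its own `try`/`catch`, so a failure in one does not suppress the others. The same best-effort pattern can be sketched with injected sources, which makes it easy to exercise without the real vector store. `buildRecall` and the stand-in sources are hypothetical and do not call the package.

```javascript
'use strict';

// Hypothetical helper: run each section builder in isolation so that one
// failing source (for example an uninitialized vector store) cannot
// prevent the others from contributing.
async function buildRecall(sections) {
  const parts = [];
  for (const { title, load } of sections) {
    try {
      const body = await load();
      if (body) parts.push(`## ${title}\n`, body, '');
    } catch (_) {
      // Swallow the error on purpose: recall is best-effort.
    }
  }
  return parts.length > 0
    ? parts.join('\n')
    : 'No past feedback found. This appears to be a fresh start.';
}

// Demo with stand-in sources; none of these touch real data.
const demoSections = [
  {
    title: 'Relevant Past Feedback',
    load: async () => { throw new Error('vector store not ready'); },
  },
  {
    title: 'Active Prevention Rules',
    load: () => 'Always attach test output before claiming done.',
  },
  { title: 'Recent Feedback Summary', load: () => '' },
];

buildRecall(demoSections).then((text) => console.log(text));
```

In the demo only the prevention-rules section survives: the first source throws and the third returns an empty string.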
240
305
  if (name === 'capture_feedback') {
241
306
  const result = captureFeedback({
242
307
  signal: args.signal,
@@ -249,7 +314,22 @@ async function callTool(name, args = {}) {
249
314
  tags: args.tags || [],
250
315
  skill: args.skill,
251
316
  });
252
- return { content: [{ type: 'text', text: toText(result) }] };
317
+
318
+ // Auto-recall: after capturing, return relevant context so the agent
319
+ // can immediately adjust behavior based on past learnings
320
+ let recallText = '';
321
+ try {
322
+ const similar = await searchSimilar(args.context || '', 3);
323
+ if (similar.length > 0) {
324
+ recallText = '\n\n---\n## Related Past Feedback (auto-recall)\n';
325
+ for (const mem of similar) {
326
+ const signal = mem.signal === 'positive' ? 'GOOD' : 'BAD';
327
+ recallText += `- **[${signal}]** ${mem.context}\n`;
328
+ }
329
+ }
330
+ } catch (_) {}
331
+
332
+ return { content: [{ type: 'text', text: toText(result) + recallText }] };
253
333
  }
254
334
 
255
335
  if (name === 'feedback_summary') {
package/bin/cli.js CHANGED
@@ -3,23 +3,131 @@
3
3
  * rlhf-feedback-loop CLI
4
4
  *
5
5
  * Usage:
6
- * npx rlhf-feedback-loop init
7
- *
8
- * Creates a .rlhf/ directory with config and capture script for local use.
6
+ * npx rlhf-feedback-loop init # scaffold .rlhf/ config + .mcp.json
7
+ * npx rlhf-feedback-loop capture # capture feedback
8
+ * npx rlhf-feedback-loop export-dpo # export DPO training pairs
9
+ * npx rlhf-feedback-loop stats # feedback analytics
10
+ * npx rlhf-feedback-loop rules # generate prevention rules
11
+ * npx rlhf-feedback-loop self-heal # run self-healing check + fix
12
+ * npx rlhf-feedback-loop prove # run proof harness
13
+ * npx rlhf-feedback-loop start-api # start HTTPS API server
9
14
  */
10
15
 
11
16
  'use strict';
12
17
 
13
18
  const fs = require('fs');
14
19
  const path = require('path');
20
+ const { execSync } = require('child_process');
15
21
 
16
22
  const COMMAND = process.argv[2];
17
23
  const CWD = process.cwd();
24
+ const PKG_ROOT = path.join(__dirname, '..');
25
+
26
+ function parseArgs(argv) {
27
+ const args = {};
28
+ argv.forEach((arg) => {
29
+ if (!arg.startsWith('--')) return;
30
+ const [key, ...rest] = arg.slice(2).split('=');
31
+ args[key] = rest.length ? rest.join('=') : true;
32
+ });
33
+ return args;
34
+ }
35
+
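Review note: `parseArgs` only understands the `--key=value` and bare `--flag` forms. The snippet below repeats the function from this hunk and runs it on a small sample to make the edge cases visible; the sample arguments are made up for illustration.

```javascript
'use strict';

// Same logic as the parseArgs helper added in this hunk.
function parseArgs(argv) {
  const args = {};
  argv.forEach((arg) => {
    if (!arg.startsWith('--')) return;
    const [key, ...rest] = arg.slice(2).split('=');
    args[key] = rest.length ? rest.join('=') : true;
  });
  return args;
}

const parsed = parseArgs([
  '--feedback=down',
  '--context=a=b',
  '--stats',
  '--tags',
  'testing',
]);
console.log(parsed);
```

A value may itself contain `=` (`context` becomes `a=b`), a bare flag becomes `true`, and a space-separated value is not supported: `--tags testing` yields `tags: true` and the word `testing` is dropped.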
36
+ function pkgVersion() {
37
+ const pkg = JSON.parse(fs.readFileSync(path.join(PKG_ROOT, 'package.json'), 'utf8'));
38
+ return pkg.version;
39
+ }
40
+
41
+ // --- Platform auto-detection helpers ---
42
+
43
+ const HOME = process.env.HOME || process.env.USERPROFILE || '';
44
+ const MCP_SERVER_ENTRY = {
45
+ command: 'node',
46
+ args: [path.relative(CWD, path.join(PKG_ROOT, 'adapters', 'mcp', 'server-stdio.js'))],
47
+ };
48
+
49
+ function mergeMcpJson(filePath, label) {
50
+ if (!fs.existsSync(filePath)) {
51
+ const dir = path.dirname(filePath);
52
+ if (!fs.existsSync(dir)) fs.mkdirSync(dir, { recursive: true });
53
+ fs.writeFileSync(filePath, JSON.stringify({ mcpServers: { 'rlhf-feedback-loop': MCP_SERVER_ENTRY } }, null, 2) + '\n');
54
+ console.log(` ${label}: wrote ${path.relative(CWD, filePath)}`);
55
+ return true;
56
+ }
57
+ const existing = JSON.parse(fs.readFileSync(filePath, 'utf8'));
58
+ if (existing.mcpServers && existing.mcpServers['rlhf-feedback-loop']) return false;
59
+ existing.mcpServers = existing.mcpServers || {};
60
+ existing.mcpServers['rlhf-feedback-loop'] = MCP_SERVER_ENTRY;
61
+ fs.writeFileSync(filePath, JSON.stringify(existing, null, 2) + '\n');
62
+ console.log(` ${label}: updated ${path.relative(CWD, filePath)}`);
63
+ return true;
64
+ }
65
+
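Review note: `mergeMcpJson` mixes file I/O with the merge decision. A pure variant of the merge step, sketched below, may be easier to unit test. `mergeServerEntry` is a hypothetical name, and the `args` path in the demo is illustrative (the CLI computes it with `path.relative`).

```javascript
'use strict';

// Hypothetical pure variant of the merge step: takes the parsed config
// (or null when the file does not exist) and returns the next config plus
// a flag saying whether anything changed.
function mergeServerEntry(existing, name, entry) {
  const config = existing ? { ...existing } : {};
  const servers = { ...(config.mcpServers || {}) };
  if (servers[name]) return { config: existing, changed: false };
  servers[name] = entry;
  config.mcpServers = servers;
  return { config, changed: true };
}

const serverEntry = { command: 'node', args: ['adapters/mcp/server-stdio.js'] };
const firstMerge = mergeServerEntry(
  { mcpServers: { other: { command: 'other-server' } } },
  'rlhf-feedback-loop',
  serverEntry
);
const secondMerge = mergeServerEntry(firstMerge.config, 'rlhf-feedback-loop', serverEntry);
console.log(firstMerge.changed, secondMerge.changed, Object.keys(firstMerge.config.mcpServers));
```

The second call is a no-op, which matches the idempotent behaviour of the hunk. Note that in the hunk `JSON.parse` throws if an existing file is not valid JSON, so a caller may want to catch that case.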
66
+ function detectPlatform(name, checks) {
67
+ for (const check of checks) {
68
+ try { if (check()) return true; } catch (_) {}
69
+ }
70
+ return false;
71
+ }
72
+
73
+ function whichExists(cmd) {
74
+ try { execSync(`which ${cmd}`, { stdio: 'pipe' }); return true; } catch (_) { return false; }
75
+ }
76
+
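Review note: `whichExists` shells out to `which`, which is not generally available on Windows, so detection there presumably falls through to the directory checks. One possible alternative is sketched below; it picks `where` on `win32` and passes the command name as an argument instead of interpolating it into a shell string. `lookupTool` and `commandExists` are hypothetical names.

```javascript
'use strict';

const { spawnSync } = require('child_process');

// Hypothetical helper: pick the lookup tool for the current OS.
// `where` is the Windows counterpart of `which`.
function lookupTool(platform) {
  return platform === 'win32' ? 'where' : 'which';
}

// Hypothetical variant of whichExists that avoids a shell by passing the
// command name as an argument rather than building a command string.
function commandExists(cmd, platform = process.platform) {
  const result = spawnSync(lookupTool(platform), [cmd], { stdio: 'ignore' });
  // status is null when the lookup tool itself could not be started.
  return result.status === 0;
}

console.log(commandExists('surely-not-an-installed-command-12345'));
```

If the lookup tool itself is missing, `spawnSync` reports a `null` status and the helper returns `false` rather than throwing.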
77
+ function setupClaude() {
78
+ return mergeMcpJson(path.join(CWD, '.mcp.json'), 'Claude Code');
79
+ }
80
+
81
+ function setupCodex() {
82
+ const configPath = path.join(HOME, '.codex', 'config.toml');
83
+ const block = `\n[mcp_servers.rlhf_feedback_loop]\ncommand = "node"\nargs = ["${MCP_SERVER_ENTRY.args[0]}"]\n`;
84
+ if (!fs.existsSync(configPath)) {
85
+ fs.mkdirSync(path.dirname(configPath), { recursive: true });
86
+ fs.writeFileSync(configPath, block);
87
+ console.log(' Codex: created ~/.codex/config.toml');
88
+ return true;
89
+ }
90
+ const content = fs.readFileSync(configPath, 'utf8');
91
+ if (content.includes('[mcp_servers.rlhf_feedback_loop]')) return false;
92
+ fs.appendFileSync(configPath, block);
93
+ console.log(' Codex: appended MCP server to ~/.codex/config.toml');
94
+ return true;
95
+ }
96
+
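Review note: `setupCodex` interpolates the server path directly into a TOML basic string. A path containing backslashes would then be read as escape sequences by a TOML parser. The sketch below quotes the path with `JSON.stringify`, on the assumption that the escapes it emits (such as `\\` and `\"`) are also valid in TOML basic strings; that assumption should be verified for unusual paths. `withCodexBlock` is a hypothetical name.

```javascript
'use strict';

const SECTION = '[mcp_servers.rlhf_feedback_loop]';

// Hypothetical pure variant of the append step: returns the content
// unchanged when the section already exists, otherwise appends the block
// with the path quoted via JSON.stringify.
function withCodexBlock(content, serverPath) {
  if (content.includes(SECTION)) return content;
  const block =
    `\n${SECTION}\ncommand = "node"\nargs = [${JSON.stringify(serverPath)}]\n`;
  return content + block;
}

const once = withCodexBlock('# existing Codex settings\n', 'C:\\tools\\server-stdio.js');
const twice = withCodexBlock(once, 'C:\\tools\\server-stdio.js');
console.log(once);
console.log(once === twice);
```

Running the helper twice leaves the content unchanged, mirroring the `content.includes(...)` guard in the hunk.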
97
+ function setupGemini() {
98
+ const settingsPath = path.join(HOME, '.gemini', 'settings.json');
99
+ if (fs.existsSync(settingsPath)) {
100
+ const settings = JSON.parse(fs.readFileSync(settingsPath, 'utf8'));
101
+ if (settings.mcpServers && settings.mcpServers['rlhf-feedback-loop']) return false;
102
+ settings.mcpServers = settings.mcpServers || {};
103
+ settings.mcpServers['rlhf-feedback-loop'] = MCP_SERVER_ENTRY;
104
+ fs.writeFileSync(settingsPath, JSON.stringify(settings, null, 2) + '\n');
105
+ console.log(' Gemini: updated ~/.gemini/settings.json');
106
+ return true;
107
+ }
108
+ // Fallback: project-level .gemini/settings.json
109
+ return mergeMcpJson(path.join(CWD, '.gemini', 'settings.json'), 'Gemini');
110
+ }
111
+
112
+ function setupAmp() {
113
+ const skillDir = path.join(CWD, '.amp', 'skills', 'rlhf-feedback');
114
+ const destPath = path.join(skillDir, 'SKILL.md');
115
+ if (fs.existsSync(destPath)) return false;
116
+ const srcPath = path.join(PKG_ROOT, 'plugins', 'amp-skill', 'SKILL.md');
117
+ if (!fs.existsSync(srcPath)) return false;
118
+ fs.mkdirSync(skillDir, { recursive: true });
119
+ fs.copyFileSync(srcPath, destPath);
120
+ console.log(' Amp: installed .amp/skills/rlhf-feedback/SKILL.md');
121
+ return true;
122
+ }
123
+
124
+ function setupCursor() {
125
+ return mergeMcpJson(path.join(CWD, '.cursor', 'mcp.json'), 'Cursor');
126
+ }
18
127
 
19
128
  function init() {
20
129
  const rlhfDir = path.join(CWD, '.rlhf');
21
130
 
22
- // Create directory
23
131
  if (!fs.existsSync(rlhfDir)) {
24
132
  fs.mkdirSync(rlhfDir, { recursive: true });
25
133
  console.log('Created .rlhf/');
@@ -27,130 +135,236 @@ function init() {
27
135
  console.log('.rlhf/ already exists — updating config');
28
136
  }
29
137
 
30
- // Write config.json
31
138
  const config = {
32
- version: '0.5.0',
139
+ version: pkgVersion(),
33
140
  apiUrl: process.env.RLHF_API_URL || 'http://localhost:3000',
34
141
  logPath: '.rlhf/feedback-log.jsonl',
35
142
  memoryPath: '.rlhf/memory-log.jsonl',
36
143
  createdAt: new Date().toISOString(),
37
144
  };
38
145
 
39
- const configPath = path.join(rlhfDir, 'config.json');
40
- fs.writeFileSync(configPath, JSON.stringify(config, null, 2) + '\n');
146
+ fs.writeFileSync(path.join(rlhfDir, 'config.json'), JSON.stringify(config, null, 2) + '\n');
41
147
  console.log('Wrote .rlhf/config.json');
42
148
 
43
- // Copy capture-feedback script (inline minimal version for standalone use)
44
- const captureScript = `#!/usr/bin/env node
45
- /**
46
- * Standalone feedback capture script — created by npx rlhf-feedback-loop init
47
- * Full version: https://github.com/IgorGanapolsky/rlhf-feedback-loop
48
- *
49
- * Usage:
50
- * node .rlhf/capture-feedback.js --feedback=up --context="that worked great" --tags="testing"
51
- * node .rlhf/capture-feedback.js --feedback=down --context="missed edge case" --what-went-wrong="..." --what-to-change="..."
52
- */
149
+ // Always create .mcp.json (project-level MCP config used by Claude, Codex, Cursor)
150
+ mergeMcpJson(path.join(CWD, '.mcp.json'), 'MCP');
53
151
 
54
- 'use strict';
152
+ // Auto-detect and configure platform-specific locations
153
+ console.log('');
154
+ console.log('Detecting platforms...');
155
+ let configured = 0;
55
156
 
56
- const fs = require('fs');
57
- const path = require('path');
58
- const os = require('os');
157
+ const platforms = [
158
+ { name: 'Codex', detect: [() => whichExists('codex'), () => fs.existsSync(path.join(HOME, '.codex'))], setup: setupCodex },
159
+ { name: 'Gemini', detect: [() => whichExists('gemini'), () => fs.existsSync(path.join(HOME, '.gemini'))], setup: setupGemini },
160
+ { name: 'Amp', detect: [() => whichExists('amp'), () => fs.existsSync(path.join(HOME, '.amp'))], setup: setupAmp },
161
+ { name: 'Cursor', detect: [() => fs.existsSync(path.join(HOME, '.cursor', 'mcp.json')), () => fs.existsSync(path.join(CWD, '.cursor'))], setup: setupCursor },
162
+ ];
59
163
 
60
- const CONFIG_PATH = path.join(__dirname, 'config.json');
61
- const config = fs.existsSync(CONFIG_PATH) ? JSON.parse(fs.readFileSync(CONFIG_PATH, 'utf8')) : {};
62
- const LOG_PATH = path.join(process.cwd(), config.logPath || '.rlhf/feedback-log.jsonl');
164
+ for (const p of platforms) {
165
+ if (detectPlatform(p.name, p.detect)) {
166
+ const didSetup = p.setup();
167
+ if (didSetup) configured++;
168
+ else console.log(` ${p.name}: already configured`);
169
+ }
170
+ }
63
171
 
64
- function parseArgs(argv) {
65
- const args = {};
66
- argv.forEach((arg) => {
67
- if (!arg.startsWith('--')) return;
68
- const [key, ...rest] = arg.slice(2).split('=');
69
- args[key] = rest.length ? rest.join('=') : true;
70
- });
71
- return args;
72
- }
172
+ // ChatGPT — cannot be automated
173
+ const chatgptSpec = path.join(PKG_ROOT, 'adapters', 'chatgpt', 'openapi.yaml');
174
+ if (fs.existsSync(chatgptSpec)) {
175
+ console.log(` ChatGPT: import ${path.relative(CWD, chatgptSpec)} in GPT Builder > Actions`);
176
+ }
73
177
 
74
- const args = parseArgs(process.argv.slice(2));
75
- const signal = args.feedback || args.signal;
178
+ if (configured === 0) console.log(' All detected platforms already configured.');
76
179
 
77
- if (!signal) {
78
- console.error('Error: --feedback=up or --feedback=down required');
79
- console.error('Usage: node .rlhf/capture-feedback.js --feedback=up --context="..."');
80
- process.exit(1);
180
+ // .gitignore
181
+ const gitignorePath = path.join(CWD, '.gitignore');
182
+ if (fs.existsSync(gitignorePath)) {
183
+ const gitignore = fs.readFileSync(gitignorePath, 'utf8');
184
+ const entries = ['.rlhf/feedback-log.jsonl', '.rlhf/memory-log.jsonl'];
185
+ const missing = entries.filter((e) => !gitignore.includes(e));
186
+ if (missing.length > 0) {
187
+ fs.appendFileSync(gitignorePath, '\n# RLHF local feedback data\n' + missing.join('\n') + '\n');
188
+ console.log('Updated .gitignore');
189
+ }
190
+ }
191
+
192
+ console.log('');
193
+ console.log(`rlhf-feedback-loop v${pkgVersion()} initialized.`);
194
+ console.log('Run: npx rlhf-feedback-loop help');
81
195
  }
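Review note: the `.gitignore` step in `init()` uses a substring test (`gitignore.includes(e)`), so a commented-out entry such as `# .rlhf/feedback-log.jsonl` counts as already present. A line-based check is one way to tighten this; the sketch below is a hypothetical pure helper, not code from the package.

```javascript
'use strict';

// Hypothetical pure helper: return the text to append so that every entry
// appears as its own line, or '' when nothing is missing.
function missingIgnoreBlock(gitignoreText, entries) {
  const present = new Set(
    gitignoreText.split(/\r?\n/).map((line) => line.trim())
  );
  const missing = entries.filter((entry) => !present.has(entry));
  if (missing.length === 0) return '';
  return '\n# RLHF local feedback data\n' + missing.join('\n') + '\n';
}

const ignoreEntries = ['.rlhf/feedback-log.jsonl', '.rlhf/memory-log.jsonl'];
const currentIgnore =
  'node_modules/\n# .rlhf/feedback-log.jsonl\n.rlhf/memory-log.jsonl\n';
const ignoreBlock = missingIgnoreBlock(currentIgnore, ignoreEntries);
console.log(JSON.stringify(ignoreBlock));
```

In the demo the commented-out line is not treated as a match, so only `.rlhf/feedback-log.jsonl` is appended.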
82
196
 
83
- const normalized = ['up', 'thumbs_up', 'positive'].includes(signal) ? 'up' : 'down';
197
+ function capture() {
198
+ const args = parseArgs(process.argv.slice(3));
84
199
 
85
- const entry = {
86
- id: \`fb-\${Date.now()}-\${Math.random().toString(36).slice(2, 7)}\`,
87
- signal: normalized,
88
- context: args.context || '',
89
- whatWentWrong: args['what-went-wrong'] || undefined,
90
- whatToChange: args['what-to-change'] || undefined,
91
- whatWorked: args['what-worked'] || undefined,
92
- tags: args.tags ? args.tags.split(',').map((t) => t.trim()) : [],
93
- timestamp: new Date().toISOString(),
94
- hostname: os.hostname(),
95
- };
200
+ // Delegate to the full engine
201
+ const { captureFeedback, analyzeFeedback, feedbackSummary, writePreventionRules } = require(path.join(PKG_ROOT, 'scripts', 'feedback-loop'));
202
+
203
+ if (args.stats) {
204
+ console.log(JSON.stringify(analyzeFeedback(), null, 2));
205
+ return;
206
+ }
96
207
 
97
- // Remove undefined fields
98
- Object.keys(entry).forEach((k) => entry[k] === undefined && delete entry[k]);
208
+ if (args.summary) {
209
+ console.log(feedbackSummary(Number(args.recent || 20)));
210
+ return;
211
+ }
99
212
 
100
- // Ensure log directory exists
101
- const logDir = path.dirname(LOG_PATH);
102
- if (!fs.existsSync(logDir)) fs.mkdirSync(logDir, { recursive: true });
213
+ // Normalize signal with fuzzy matching (uses the full engine's normalize)
214
+ const captureScript = require(path.join(PKG_ROOT, '.claude', 'scripts', 'feedback', 'capture-feedback.js'));
215
+ // The capture-feedback.js runs as main when required directly, so we call via subprocess
216
+ const scriptArgs = process.argv.slice(3).join(' ');
217
+ try {
218
+ const output = execSync(
219
+ `node "${path.join(PKG_ROOT, '.claude', 'scripts', 'feedback', 'capture-feedback.js')}" ${scriptArgs}`,
220
+ { encoding: 'utf8', stdio: 'pipe', cwd: CWD }
221
+ );
222
+ process.stdout.write(output);
223
+ } catch (err) {
224
+ process.stderr.write(err.stderr || err.stdout || err.message);
225
+ process.exit(err.status || 1);
226
+ }
227
+ }
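Review note: `capture()` joins `process.argv.slice(3)` with spaces and interpolates the result into a shell command. By that point the user's original quoting is gone, so a value such as `--context="two words"` appears likely to be re-split by the shell. Passing an argument array to `execFileSync` avoids the shell entirely. The sketch below demonstrates this with a throwaway echo script; `runNodeScript` is a hypothetical helper.

```javascript
'use strict';

const { execFileSync } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: run a Node script with an argument array. No shell
// is involved, so an argument containing spaces arrives intact.
function runNodeScript(scriptPath, args, cwd) {
  return execFileSync(process.execPath, [scriptPath, ...args], {
    encoding: 'utf8',
    stdio: 'pipe',
    cwd,
  });
}

// Demo: a throwaway script that echoes the arguments it received.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'rlhf-argv-'));
const echoScript = path.join(tmpDir, 'echo-args.js');
fs.writeFileSync(
  echoScript,
  'console.log(JSON.stringify(process.argv.slice(2)));\n'
);

const echoed = runNodeScript(
  echoScript,
  ['--feedback=down', '--context=two words'],
  tmpDir
);
console.log(echoed.trim());
```

The same approach would apply to the other `execSync` calls in this file that build a command string from paths.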
103
228
 
104
- fs.appendFileSync(LOG_PATH, JSON.stringify(entry) + '\\n');
105
- console.log(\`Feedback captured [\${normalized}]: \${entry.id}\`);
106
- console.log(\`Logged to: \${LOG_PATH}\`);
107
- `;
229
+ function stats() {
230
+ const { analyzeFeedback } = require(path.join(PKG_ROOT, 'scripts', 'feedback-loop'));
231
+ console.log(JSON.stringify(analyzeFeedback(), null, 2));
232
+ }
233
+
234
+ function summary() {
235
+ const args = parseArgs(process.argv.slice(3));
236
+ const { feedbackSummary } = require(path.join(PKG_ROOT, 'scripts', 'feedback-loop'));
237
+ console.log(feedbackSummary(Number(args.recent || 20)));
238
+ }
108
239
 
109
- const scriptPath = path.join(rlhfDir, 'capture-feedback.js');
110
- fs.writeFileSync(scriptPath, captureScript);
111
- // Make executable
240
+ function exportDpo() {
112
241
  try {
113
- fs.chmodSync(scriptPath, '755');
114
- } catch (_) {
115
- // chmod may not be available on all platforms — not fatal
242
+ const output = execSync(
243
+ `node "${path.join(PKG_ROOT, 'scripts', 'export-dpo-pairs.js')}"`,
244
+ { encoding: 'utf8', stdio: 'pipe', cwd: CWD }
245
+ );
246
+ process.stdout.write(output);
247
+ } catch (err) {
248
+ process.stderr.write(err.stderr || err.stdout || err.message);
249
+ process.exit(err.status || 1);
116
250
  }
117
- console.log('Wrote .rlhf/capture-feedback.js');
251
+ }
118
252
 
119
- // Add .rlhf/feedback-log.jsonl to .gitignore if it exists
120
- const gitignorePath = path.join(CWD, '.gitignore');
121
- if (fs.existsSync(gitignorePath)) {
122
- const gitignore = fs.readFileSync(gitignorePath, 'utf8');
123
- const entries = ['.rlhf/feedback-log.jsonl', '.rlhf/memory-log.jsonl'];
124
- const missing = entries.filter((e) => !gitignore.includes(e));
125
- if (missing.length > 0) {
126
- fs.appendFileSync(gitignorePath, '\n# RLHF local feedback data\n' + missing.join('\n') + '\n');
127
- console.log('Updated .gitignore with RLHF data paths');
128
- }
253
+ function rules() {
254
+ const args = parseArgs(process.argv.slice(3));
255
+ const { writePreventionRules } = require(path.join(PKG_ROOT, 'scripts', 'feedback-loop'));
256
+ const outPath = args.output || path.join(CWD, '.rlhf', 'prevention-rules.md');
257
+ const result = writePreventionRules(outPath, Number(args.min || 2));
258
+ console.log(`Wrote prevention rules to ${result.path}`);
259
+ }
260
+
261
+ function selfHeal() {
262
+ try {
263
+ const output = execSync(
264
+ `node "${path.join(PKG_ROOT, 'scripts', 'self-healing-check.js')}" && node "${path.join(PKG_ROOT, 'scripts', 'self-heal.js')}"`,
265
+ { encoding: 'utf8', stdio: 'inherit', cwd: CWD }
266
+ );
267
+ } catch (err) {
268
+ process.exit(err.status || 1);
129
269
  }
270
+ }
130
271
 
131
- console.log('');
132
- console.log('Setup complete! Run:');
133
- console.log(" node .rlhf/capture-feedback.js --feedback=up --context='test'");
134
- console.log('');
135
- console.log('Full docs: https://github.com/IgorGanapolsky/rlhf-feedback-loop');
272
+ function prove() {
273
+ const args = parseArgs(process.argv.slice(3));
274
+ const target = args.target || 'adapters';
275
+ const script = path.join(PKG_ROOT, 'scripts', `prove-${target}.js`);
276
+ if (!fs.existsSync(script)) {
277
+ console.error(`Unknown proof target: ${target}`);
278
+ console.error('Available: adapters, automation, attribution, lancedb, data-quality, intelligence, loop-closure, training-export');
279
+ process.exit(1);
280
+ }
281
+ try {
282
+ execSync(`node "${script}"`, { encoding: 'utf8', stdio: 'inherit', cwd: CWD });
283
+ } catch (err) {
284
+ process.exit(err.status || 1);
285
+ }
286
+ }
287
+
288
+ function serve() {
289
+ // Start MCP server over stdio — used by `claude mcp add`, `codex mcp add`, `gemini mcp add`
290
+ const mcpServer = path.join(PKG_ROOT, 'adapters', 'mcp', 'server-stdio.js');
291
+ require(mcpServer);
292
+ }
293
+
294
+ function startApi() {
295
+ const serverPath = path.join(PKG_ROOT, 'src', 'api', 'server.js');
296
+ try {
297
+ execSync(`node "${serverPath}"`, { stdio: 'inherit', cwd: CWD });
298
+ } catch (err) {
299
+ process.exit(err.status || 1);
300
+ }
136
301
  }
137
302
 
138
303
  function help() {
139
- console.log('rlhf-feedback-loop CLI');
304
+ const v = pkgVersion();
305
+ console.log(`rlhf-feedback-loop v${v}`);
140
306
  console.log('');
141
307
  console.log('Commands:');
142
- console.log(' init Scaffold .rlhf/ config and capture script in current directory');
143
- console.log(' help Show this help message');
308
+ console.log(' init Scaffold .rlhf/ config + MCP server in current project');
309
+ console.log(' serve Start MCP server (stdio) — for claude/codex/gemini mcp add');
310
+ console.log(' capture [flags] Capture feedback (--feedback=up|down --context="..." --tags="...")');
311
+ console.log(' stats Show feedback analytics');
312
+ console.log(' summary Human-readable feedback summary');
313
+ console.log(' export-dpo Export DPO training pairs (prompt/chosen/rejected JSONL)');
314
+ console.log(' rules Generate prevention rules from repeated failures');
315
+ console.log(' self-heal Run self-healing check and auto-fix');
316
+ console.log(' prove [--target=X] Run proof harness (adapters|automation|attribution|lancedb|...)');
317
+ console.log(' start-api Start the RLHF HTTPS API server');
318
+ console.log(' help Show this help message');
144
319
  console.log('');
145
320
  console.log('Examples:');
146
321
  console.log(' npx rlhf-feedback-loop init');
147
- console.log(' node .rlhf/capture-feedback.js --feedback=up --context="great result"');
322
+ console.log(' npx rlhf-feedback-loop capture --feedback=up --context="all tests pass"');
323
+ console.log(' npx rlhf-feedback-loop capture --feedback=down --context="broke prod" --what-went-wrong="no tests"');
324
+ console.log(' npx rlhf-feedback-loop export-dpo');
325
+ console.log(' npx rlhf-feedback-loop stats');
326
+ console.log('');
327
+ console.log('MCP install (one command per platform):');
328
+ console.log(' claude mcp add rlhf -- npx -y rlhf-feedback-loop serve');
329
+ console.log(' codex mcp add rlhf -- npx -y rlhf-feedback-loop serve');
330
+ console.log(' gemini mcp add rlhf -- npx -y rlhf-feedback-loop serve');
148
331
  }
149
332
 
150
333
  switch (COMMAND) {
151
334
  case 'init':
152
335
  init();
153
336
  break;
337
+ case 'serve':
338
+ case 'mcp':
339
+ serve();
340
+ break;
341
+ case 'capture':
342
+ case 'feedback':
343
+ capture();
344
+ break;
345
+ case 'stats':
346
+ stats();
347
+ break;
348
+ case 'summary':
349
+ summary();
350
+ break;
351
+ case 'export-dpo':
352
+ case 'dpo':
353
+ exportDpo();
354
+ break;
355
+ case 'rules':
356
+ rules();
357
+ break;
358
+ case 'self-heal':
359
+ selfHeal();
360
+ break;
361
+ case 'prove':
362
+ prove();
363
+ break;
364
+ case 'start-api':
365
+ case 'serve':
366
+ startApi();
367
+ break;
154
368
  case 'help':
155
369
  case '--help':
156
370
  case '-h':
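The switch above lists `case 'serve':` twice: once ahead of `serve()` and once next to `'start-api'`. The sketch below shows how JavaScript resolves a repeated label, assuming only standard `switch` semantics. The function name `route` and the returned strings are illustrative and do not come from the package. Clauses are tested top to bottom and the first match wins, so the second label is legal but never reached.

```javascript
// Minimal sketch: a switch with a repeated case label, mirroring the shape
// of the CLI dispatch. `route` and its return values are hypothetical names.
function route(command) {
  switch (command) {
    case 'serve':
    case 'mcp':
      return 'mcp-stdio';
    case 'start-api':
    case 'serve': // duplicate label: valid JavaScript, but unreachable
      return 'http-api';
    default:
      return 'help';
  }
}

console.log(route('serve'));     // mcp-stdio
console.log(route('start-api')); // http-api
```

Under that reading, `serve` would start the MCP stdio server, and only `start-api` would reach `startApi()`.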
@@ -2,6 +2,7 @@
2
2
  "version": 1,
3
3
  "profiles": {
4
4
  "default": [
5
+ "recall",
5
6
  "capture_feedback",
6
7
  "feedback_summary",
7
8
  "feedback_stats",
@@ -14,6 +15,7 @@
14
15
  "context_provenance"
15
16
  ],
16
17
  "readonly": [
18
+ "recall",
17
19
  "feedback_summary",
18
20
  "feedback_stats",
19
21
  "list_intents",
package/package.json CHANGED
@@ -1,7 +1,15 @@
1
1
  {
2
2
  "name": "rlhf-feedback-loop",
3
- "version": "0.5.0",
4
- "description": "Production-grade RLHF feedback operations for coding agents: capture thumbs signals, enforce schema quality, prevent repeated mistakes, and export DPO pairs.",
3
+ "version": "0.6.1",
4
+ "description": "Make your AI agent learn from mistakes. Capture thumbs up/down feedback, block repeated failures, export DPO training data. Works with ChatGPT, Claude, Codex, Gemini, Amp.",
5
+ "homepage": "https://github.com/IgorGanapolsky/rlhf-feedback-loop#readme",
6
+ "repository": {
7
+ "type": "git",
8
+ "url": "https://github.com/IgorGanapolsky/rlhf-feedback-loop.git"
9
+ },
10
+ "bugs": {
11
+ "url": "https://github.com/IgorGanapolsky/rlhf-feedback-loop/issues"
12
+ },
5
13
  "main": "scripts/feedback-loop.js",
6
14
  "bin": {
7
15
  "rlhf-feedback-loop": "./bin/cli.js"
@@ -25,8 +33,8 @@
25
33
  "test:schema": "node scripts/feedback-schema.js --test",
26
34
  "test:loop": "node scripts/feedback-loop.js --test",
27
35
  "test:dpo": "node scripts/export-dpo-pairs.js --test",
28
- "test:api": "node --test tests/api-server.test.js tests/api-auth-config.test.js tests/mcp-server.test.js tests/adapters.test.js tests/openapi-parity.test.js tests/budget-guard.test.js tests/contextfs.test.js tests/mcp-policy.test.js tests/subagent-profiles.test.js tests/intent-router.test.js tests/rubric-engine.test.js tests/self-healing-check.test.js tests/self-heal.test.js tests/feedback-schema.test.js tests/thompson-sampling.test.js tests/feedback-sequences.test.js tests/diversity-tracking.test.js tests/vector-store.test.js tests/feedback-attribution.test.js tests/hybrid-feedback-context.test.js tests/loop-closure.test.js tests/code-reasoning.test.js",
29
- "test:proof": "node --test tests/prove-adapters.test.js tests/prove-automation.test.js",
36
+ "test:api": "node --test tests/api-server.test.js tests/api-auth-config.test.js tests/mcp-server.test.js tests/adapters.test.js tests/openapi-parity.test.js tests/budget-guard.test.js tests/contextfs.test.js tests/mcp-policy.test.js tests/subagent-profiles.test.js tests/intent-router.test.js tests/rubric-engine.test.js tests/self-healing-check.test.js tests/self-heal.test.js tests/feedback-schema.test.js tests/thompson-sampling.test.js tests/feedback-sequences.test.js tests/diversity-tracking.test.js tests/vector-store.test.js tests/feedback-attribution.test.js tests/hybrid-feedback-context.test.js tests/loop-closure.test.js tests/code-reasoning.test.js tests/feedback-loop.test.js tests/feedback-inbox-read.test.js tests/feedback-to-memory.test.js",
37
+ "test:proof": "node --test tests/prove-adapters.test.js tests/prove-automation.test.js tests/prove-attribution.test.js tests/prove-lancedb.test.js tests/prove-data-quality.test.js tests/prove-intelligence.test.js tests/prove-loop-closure.test.js tests/prove-subway-upgrades.test.js tests/prove-training-export.test.js",
30
38
  "test:rlaif": "node --test tests/rlaif-self-audit.test.js tests/dpo-optimizer.test.js tests/meta-policy.test.js",
31
39
  "test:attribution": "node --test tests/feedback-attribution.test.js tests/hybrid-feedback-context.test.js",
32
40
  "test:quality": "node --test tests/validate-feedback.test.js",
@@ -79,13 +87,24 @@
79
87
  "claude",
80
88
  "codex",
81
89
  "gemini",
90
+ "chatgpt",
91
+ "amp",
92
+ "mcp",
93
+ "model-context-protocol",
82
94
  "agent-evaluation",
83
- "prompt-engineering"
95
+ "prompt-engineering",
96
+ "context-engineering",
97
+ "ai-safety",
98
+ "machine-learning",
99
+ "openapi",
100
+ "developer-tools"
84
101
  ],
85
102
  "license": "MIT",
86
103
  "dependencies": {
87
104
  "@huggingface/transformers": "^3.8.1",
88
105
  "@lancedb/lancedb": "^0.26.2",
89
- "apache-arrow": "^18.1.0"
90
- }
106
+ "apache-arrow": "^18.1.0",
107
+ "stripe": "^20.4.0"
108
+ },
109
+ "mcpName": "io.github.igorganapolsky/rlhf-feedback-loop"
91
110
  }
@@ -1,31 +1,64 @@
1
1
  ---
2
2
  name: rlhf-feedback
3
- description: Capture thumbs feedback and apply prevention rules before coding
3
+ description: Dual-write feedback to Amp MCP memory AND rlhf-feedback-loop for DPO export, analytics, and cross-platform portability
4
4
  ---
5
5
 
6
- # Amp RLHF Skill
6
+ # RLHF Feedback Skill (Dual-Write)
7
7
 
8
- On explicit user feedback:
8
+ This skill captures feedback in TWO places simultaneously:
9
+ 1. **Amp MCP memory** — for immediate in-session recall (native to Amp)
10
+ 2. **rlhf-feedback-loop** — for DPO export, LanceDB vectors, prevention rules, and cross-platform analytics
9
11
 
12
+ ## On thumbs up / positive feedback
13
+
14
+ First, store in Amp's native memory for in-session recall:
15
+ ```
16
+ Use mcp__memory__remember to store: "GOOD: {context}. What worked: {details}"
17
+ ```
18
+
19
+ Then, pipe to rlhf-feedback-loop for DPO export and analytics:
20
+ ```bash
21
+ npx -y rlhf-feedback-loop capture --feedback=up --context="..." --what-worked="..." --tags="..."
22
+ ```
23
+
24
+ ## On thumbs down / negative feedback
25
+
26
+ First, store in Amp's native memory so the mistake is recalled immediately:
27
+ ```
28
+ Use mcp__memory__remember to store: "BAD: {context}. What went wrong: {details}. Fix: {what to change}"
29
+ ```
30
+
31
+ Then, pipe to rlhf-feedback-loop for prevention rules and DPO pairs:
10
32
  ```bash
11
- node .rlhf/capture-feedback.js --feedback=up --context="..." --tags="..."
12
- node .rlhf/capture-feedback.js --feedback=down --context="..." --what-went-wrong="..." --what-to-change="..." --tags="..."
33
+ npx -y rlhf-feedback-loop capture --feedback=down --context="..." --what-went-wrong="..." --what-to-change="..." --tags="..."
13
34
  ```
14
35
 
15
- Before major implementation:
36
+ ## Before starting a new task
37
+
38
+ Recall from Amp's memory (instant, in-session):
39
+ ```
40
+ Use mcp__memory__recall with query describing the current task
41
+ ```
16
42
 
43
+ Also check rlhf-feedback-loop for cross-session prevention rules:
17
44
  ```bash
18
- node .rlhf/capture-feedback.js --feedback=up --context="session start" --tags="session" 2>/dev/null || true
45
+ npx -y rlhf-feedback-loop rules
19
46
  ```
20
47
 
21
48
  ## Triggers
22
49
 
23
- - "thumbs up" / "that worked" / "looks good"
24
- - "thumbs down" / "that failed" / "that was wrong"
50
+ - "thumbs up" / "that worked" / "looks good" / "nice" / "perfect"
51
+ - "thumbs down" / "that failed" / "that was wrong" / "no" / "fix this"
25
52
 
26
53
  ## Negative Triggers (do NOT activate for)
27
54
 
28
- - "generate code"
29
- - "search files"
30
- - "explain this"
31
- - "run tests"
55
+ - "generate code" / "search files" / "explain this" / "run tests"
56
+
57
+ ## Why dual-write?
58
+
59
+ Amp's MCP memory gives you instant in-session recall. rlhf-feedback-loop gives you:
60
+ - **DPO training pairs** for fine-tuning your model
61
+ - **Prevention rules** that block repeated mistakes
62
+ - **Cross-platform portability** — same feedback works in Claude, Codex, Gemini
63
+ - **LanceDB vector search** for semantic similarity across sessions
64
+ - **REST API** for team dashboards and analytics
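The `capture` invocations in the skill text above can also be assembled programmatically. This is a rough sketch, not the package's API: `buildCaptureArgs`, the `feedback` object shape, and the comma-joined tag format are all assumptions made for illustration. Only flags that appear in the skill text are emitted.

```javascript
// Hypothetical helper: turns a feedback object into an argv array for
// `npx -y rlhf-feedback-loop capture`. The object keys are illustrative.
const FLAG_NAMES = {
  context: '--context',
  whatWorked: '--what-worked',
  whatWentWrong: '--what-went-wrong',
  whatToChange: '--what-to-change',
  tags: '--tags',
};

function buildCaptureArgs(feedback) {
  if (feedback.signal !== 'up' && feedback.signal !== 'down') {
    throw new Error(`signal must be "up" or "down", got: ${feedback.signal}`);
  }
  const args = ['-y', 'rlhf-feedback-loop', 'capture', `--feedback=${feedback.signal}`];
  for (const [key, flag] of Object.entries(FLAG_NAMES)) {
    const value = feedback[key];
    if (value === undefined || value === '') continue; // skip unset fields
    // Assumption: multiple tags are passed as one comma-separated value.
    const text = Array.isArray(value) ? value.join(',') : String(value);
    args.push(`${flag}=${text}`);
  }
  return args;
}

console.log(buildCaptureArgs({
  signal: 'down',
  context: 'broke prod',
  whatWentWrong: 'no tests',
  tags: ['ci', 'deploy'],
}));
```

Passing the resulting array to `child_process.spawnSync('npx', args, { stdio: 'inherit' })` rather than building a shell string would sidestep quoting problems when the context contains spaces or quotes.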
@@ -305,3 +305,4 @@ module.exports = {
305
305
  aggregateTraces,
306
306
  DEFAULT_CONFIDENCE_THRESHOLD,
307
307
  };
308
+ // test coverage: 573 tests