rlhf-feedback-loop 0.6.0 → 0.6.2

package/README.md CHANGED
@@ -16,29 +16,17 @@
 
  ## Get Started
 
- One command. Works with any MCP-compatible agent:
+ One command. Pick your platform:
 
- ```bash
- claude mcp add rlhf -- npx -y rlhf-feedback-loop serve
- ```
-
- ```bash
- codex mcp add rlhf -- npx -y rlhf-feedback-loop serve
- ```
-
- ```bash
- gemini mcp add rlhf -- npx -y rlhf-feedback-loop serve
- ```
+ | Platform | Install |
+ |----------|---------|
+ | **Claude** | `claude mcp add rlhf -- npx -y rlhf-feedback-loop serve` |
+ | **Codex** | `codex mcp add rlhf -- npx -y rlhf-feedback-loop serve` |
+ | **Gemini** | `gemini mcp add rlhf -- npx -y rlhf-feedback-loop serve` |
+ | **All at once** | `npx add-mcp rlhf-feedback-loop` |
 
  That's it. Your agent can now capture feedback, recall past learnings mid-conversation, and block repeated mistakes.
 
- Or install via npm for CLI and programmatic use:
-
- ```bash
- npm install rlhf-feedback-loop
- npx rlhf-feedback-loop init
- ```
-
  ## How It Works
 
  ```
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "rlhf-feedback-loop",
- "version": "0.6.0",
+ "version": "0.6.2",
  "description": "Make your AI agent learn from mistakes. Capture thumbs up/down feedback, block repeated failures, export DPO training data. Works with ChatGPT, Claude, Codex, Gemini, Amp.",
  "homepage": "https://github.com/IgorGanapolsky/rlhf-feedback-loop#readme",
  "repository": {
@@ -105,5 +105,6 @@
  "@lancedb/lancedb": "^0.26.2",
  "apache-arrow": "^18.1.0",
  "stripe": "^20.4.0"
- }
+ },
+ "mcpName": "io.github.IgorGanapolsky/rlhf-feedback-loop"
  }
@@ -1,31 +1,64 @@
  ---
  name: rlhf-feedback
- description: Capture thumbs feedback and apply prevention rules before coding
+ description: Dual-write feedback to Amp MCP memory AND rlhf-feedback-loop for DPO export, analytics, and cross-platform portability
  ---
 
- # Amp RLHF Skill
+ # RLHF Feedback Skill (Dual-Write)
 
- On explicit user feedback:
+ This skill captures feedback in TWO places simultaneously:
+ 1. **Amp MCP memory** — for immediate in-session recall (native to Amp)
+ 2. **rlhf-feedback-loop** — for DPO export, LanceDB vectors, prevention rules, and cross-platform analytics
 
+ ## On thumbs up / positive feedback
+
+ First, store in Amp's native memory for in-session recall:
+ ```
+ Use mcp__memory__remember to store: "GOOD: {context}. What worked: {details}"
+ ```
+
+ Then, pipe to rlhf-feedback-loop for DPO export and analytics:
+ ```bash
+ npx -y rlhf-feedback-loop capture --feedback=up --context="..." --what-worked="..." --tags="..."
+ ```
+
+ ## On thumbs down / negative feedback
+
+ First, store in Amp's native memory so the mistake is recalled immediately:
+ ```
+ Use mcp__memory__remember to store: "BAD: {context}. What went wrong: {details}. Fix: {what to change}"
+ ```
+
+ Then, pipe to rlhf-feedback-loop for prevention rules and DPO pairs:
  ```bash
- node .rlhf/capture-feedback.js --feedback=up --context="..." --tags="..."
- node .rlhf/capture-feedback.js --feedback=down --context="..." --what-went-wrong="..." --what-to-change="..." --tags="..."
+ npx -y rlhf-feedback-loop capture --feedback=down --context="..." --what-went-wrong="..." --what-to-change="..." --tags="..."
  ```
 
- Before major implementation:
+ ## Before starting a new task
+
+ Recall from Amp's memory (instant, in-session):
+ ```
+ Use mcp__memory__recall with query describing the current task
+ ```
 
+ Also check rlhf-feedback-loop for cross-session prevention rules:
  ```bash
- node .rlhf/capture-feedback.js --feedback=up --context="session start" --tags="session" 2>/dev/null || true
+ npx -y rlhf-feedback-loop rules
  ```
 
  ## Triggers
 
- - "thumbs up" / "that worked" / "looks good"
- - "thumbs down" / "that failed" / "that was wrong"
+ - "thumbs up" / "that worked" / "looks good" / "nice" / "perfect"
+ - "thumbs down" / "that failed" / "that was wrong" / "no" / "fix this"
 
  ## Negative Triggers (do NOT activate for)
 
- - "generate code"
- - "search files"
- - "explain this"
- - "run tests"
+ - "generate code" / "search files" / "explain this" / "run tests"
+
+ ## Why dual-write?
+
+ Amp's MCP memory gives you instant in-session recall. rlhf-feedback-loop gives you:
+ - **DPO training pairs** for fine-tuning your model
+ - **Prevention rules** that block repeated mistakes
+ - **Cross-platform portability** — same feedback works in Claude, Codex, Gemini
+ - **LanceDB vector search** for semantic similarity across sessions
+ - **REST API** for team dashboards and analytics
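Reviewer note: the two `capture` invocations added in the skill file differ only in their detail flags (`--what-worked` for `up`, `--what-went-wrong` and `--what-to-change` for `down`). A minimal sketch of a wrapper that assembles either form is shown below. The function name `rlhf_capture`, its positional argument order, and the `DRY_RUN` variable are hypothetical and not part of the package; only the flags that appear in the diff are used, and how the CLI handles them has not been verified here.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with rlhf-feedback-loop): builds the
# capture command from the skill file using only the flags shown in the diff.
#
# Usage: rlhf_capture <up|down> <context> <tags> <detail1> [detail2]
#   up:   detail1 -> --what-worked
#   down: detail1 -> --what-went-wrong, detail2 -> --what-to-change
rlhf_capture() {
  local feedback="$1"
  local context="$2"
  local tags="$3"
  local detail1="${4:-}"
  local detail2="${5:-}"

  # Replace the positional parameters with the feedback-specific flags.
  case "$feedback" in
    up)
      set -- "--what-worked=${detail1}"
      ;;
    down)
      set -- "--what-went-wrong=${detail1}" "--what-to-change=${detail2}"
      ;;
    *)
      echo "feedback must be 'up' or 'down'" >&2
      return 2
      ;;
  esac

  # Assemble the full argument vector; quoting keeps spaces inside values.
  set -- npx -y rlhf-feedback-loop capture \
    "--feedback=${feedback}" "--context=${context}" "$@" "--tags=${tags}"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Assumption: DRY_RUN is an illustrative switch that prints one
    # argument per line instead of running npx.
    printf '%s\n' "$@"
  else
    "$@"
  fi
}
```

Running `DRY_RUN=1 rlhf_capture down "broke the build" "ci" "skipped lint" "run lint first"` prints the argument vector without touching the network, which makes it easy to check quoting before the real call.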