rlhf-feedback-loop 0.6.0 → 0.6.1 (npm package version diff, via Mend)

This diff shows the changes between two publicly released versions of the package, as they appear in the public npm registry. It is provided for informational purposes only.

Files changed (3)

package/README.md +7 -19
package/package.json +3 -2
package/plugins/amp-skill/SKILL.md +46 -13

package/README.md CHANGED

@@ -16,29 +16,17 @@
 ## Get Started
-One command. Works with any MCP-compatible agent:
+One command. Pick your platform:
-```bash
-claude mcp add rlhf -- npx -y rlhf-feedback-loop serve
-```
-```bash
-codex mcp add rlhf -- npx -y rlhf-feedback-loop serve
-```
-```bash
-gemini mcp add rlhf -- npx -y rlhf-feedback-loop serve
-```
+| Platform | Install |
+|----------|---------|
+| **Claude** | `claude mcp add rlhf -- npx -y rlhf-feedback-loop serve` |
+| **Codex** | `codex mcp add rlhf -- npx -y rlhf-feedback-loop serve` |
+| **Gemini** | `gemini mcp add rlhf -- npx -y rlhf-feedback-loop serve` |
+| **All at once** | `npx add-mcp rlhf-feedback-loop` |
 That's it. Your agent can now capture feedback, recall past learnings mid-conversation, and block repeated mistakes.
-Or install via npm for CLI and programmatic use:
-```bash
-npm install rlhf-feedback-loop
-npx rlhf-feedback-loop init
-```
 ## How It Works
 ```

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "rlhf-feedback-loop",
-  "version": "0.6.0",
+  "version": "0.6.1",
   "description": "Make your AI agent learn from mistakes. Capture thumbs up/down feedback, block repeated failures, export DPO training data. Works with ChatGPT, Claude, Codex, Gemini, Amp.",
   "homepage": "https://github.com/IgorGanapolsky/rlhf-feedback-loop#readme",
   "repository": {
@@ -105,5 +105,6 @@
     "@lancedb/lancedb": "^0.26.2",
     "apache-arrow": "^18.1.0",
     "stripe": "^20.4.0"
-  }
+  },
+  "mcpName": "io.github.igorganapolsky/rlhf-feedback-loop"
 }
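
Taken together, the two hunks above are a version bump plus one new top-level `mcpName` key. The sketch below is one way to confirm that programmatically, assuming both manifests have already been parsed into objects. `diffTopLevel` is a hypothetical helper, not part of the package, and the two objects are trimmed to fields visible in this diff.

```javascript
// Hypothetical helper (not part of rlhf-feedback-loop): report which top-level
// keys were added, removed, or changed between two parsed package.json objects.
function diffTopLevel(before, after) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
  const result = { added: [], removed: [], changed: [] };

  for (const key of [...keys].sort()) {
    if (!has(before, key)) {
      result.added.push(key);
    } else if (!has(after, key)) {
      result.removed.push(key);
    } else if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      // Serialized comparison is order-sensitive for nested objects;
      // good enough for a sketch, not for a general-purpose diff.
      result.changed.push(key);
    }
  }
  return result;
}

// Manifests trimmed to the fields shown in the hunks above.
const v060 = { name: "rlhf-feedback-loop", version: "0.6.0" };
const v061 = {
  name: "rlhf-feedback-loop",
  version: "0.6.1",
  mcpName: "io.github.igorganapolsky/rlhf-feedback-loop",
};

const summary = diffTopLevel(v060, v061);
console.log(summary);
// → { added: [ 'mcpName' ], removed: [], changed: [ 'version' ] }
```

The dependency list is unchanged in this release, so a top-level comparison is enough here. A release that edits `dependencies` would show up only as a single `changed` entry and would need a nested comparison.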

package/plugins/amp-skill/SKILL.md CHANGED

@@ -1,31 +1,64 @@
 ---
 name: rlhf-feedback
-description: Capture thumbs feedback and apply prevention rules before coding
+description: Dual-write feedback to Amp MCP memory AND rlhf-feedback-loop for DPO export, analytics, and cross-platform portability
 ---
-# Amp RLHF Skill
+# RLHF Feedback Skill (Dual-Write)
-On explicit user feedback:
+This skill captures feedback in TWO places simultaneously:
+1. **Amp MCP memory** — for immediate in-session recall (native to Amp)
+2. **rlhf-feedback-loop** — for DPO export, LanceDB vectors, prevention rules, and cross-platform analytics
+## On thumbs up / positive feedback
+First, store in Amp's native memory for in-session recall:
+```
+Use mcp__memory__remember to store: "GOOD: {context}. What worked: {details}"
+```
+Then, pipe to rlhf-feedback-loop for DPO export and analytics:
+```bash
+npx -y rlhf-feedback-loop capture --feedback=up --context="..." --what-worked="..." --tags="..."
+```
+## On thumbs down / negative feedback
+First, store in Amp's native memory so the mistake is recalled immediately:
+```
+Use mcp__memory__remember to store: "BAD: {context}. What went wrong: {details}. Fix: {what to change}"
+```
+Then, pipe to rlhf-feedback-loop for prevention rules and DPO pairs:
 ```bash
-node .rlhf/capture-feedback.js --feedback=up --context="..." --tags="..."
-node .rlhf/capture-feedback.js --feedback=down --context="..." --what-went-wrong="..." --what-to-change="..." --tags="..."
+npx -y rlhf-feedback-loop capture --feedback=down --context="..." --what-went-wrong="..." --what-to-change="..." --tags="..."
 ```
-Before major implementation:
+## Before starting a new task
+Recall from Amp's memory (instant, in-session):
+```
+Use mcp__memory__recall with query describing the current task
+```
+Also check rlhf-feedback-loop for cross-session prevention rules:
 ```bash
-node .rlhf/capture-feedback.js --feedback=up --context="session start" --tags="session" 2>/dev/null || true
+npx -y rlhf-feedback-loop rules
 ```
 ## Triggers
-- "thumbs up" / "that worked" / "looks good"
-- "thumbs down" / "that failed" / "that was wrong"
+- "thumbs up" / "that worked" / "looks good" / "nice" / "perfect"
+- "thumbs down" / "that failed" / "that was wrong" / "no" / "fix this"
 ## Negative Triggers (do NOT activate for)
-- "generate code"
-- "search files"
-- "explain this"
-- "run tests"
+- "generate code" / "search files" / "explain this" / "run tests"
+## Why dual-write?
+Amp's MCP memory gives you instant in-session recall. rlhf-feedback-loop gives you:
+- **DPO training pairs** for fine-tuning your model
+- **Prevention rules** that block repeated mistakes
+- **Cross-platform portability** — same feedback works in Claude, Codex, Gemini
+- **LanceDB vector search** for semantic similarity across sessions
+- **REST API** for team dashboards and analytics
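
The updated skill replaces the local `node .rlhf/capture-feedback.js` calls with `npx -y rlhf-feedback-loop capture`, using a different flag set for positive and negative feedback. The sketch below assembles that argument list from a feedback record. It uses only the flags that appear in the diff. `buildCaptureArgs` and the record's field names are hypothetical, the sample values are illustrative, and `tags` is passed through unchanged because the diff does not show its format.

```javascript
// Hypothetical helper: build the argument list for the `capture` command shown
// in the updated SKILL.md. Only flags that appear in the diff are emitted.
function buildCaptureArgs(entry) {
  if (entry.feedback !== "up" && entry.feedback !== "down") {
    throw new Error('feedback must be "up" or "down"');
  }

  const args = ["-y", "rlhf-feedback-loop", "capture", `--feedback=${entry.feedback}`];

  // Positive feedback records what worked; negative feedback records what
  // went wrong and what to change. Both carry context and tags.
  const flags =
    entry.feedback === "up"
      ? [["context", entry.context], ["what-worked", entry.whatWorked]]
      : [
          ["context", entry.context],
          ["what-went-wrong", entry.whatWentWrong],
          ["what-to-change", entry.whatToChange],
        ];
  flags.push(["tags", entry.tags]);

  for (const [name, value] of flags) {
    // Skip fields the caller did not supply.
    if (value !== undefined && value !== "") {
      args.push(`--${name}=${value}`);
    }
  }
  return args;
}

// Illustrative record; none of these values come from the package.
const downArgs = buildCaptureArgs({
  feedback: "down",
  context: "refactored auth module",
  whatWentWrong: "skipped the test run",
  whatToChange: "run tests before committing",
  tags: "testing",
});
console.log(downArgs);
```

Returning an array rather than a single command string means the values need no shell quoting if the list is later handed to something like Node's `child_process.spawn("npx", downArgs)`.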