claude-self-reflect 7.1.9 → 7.1.11

package/README.md CHANGED
@@ -26,7 +26,7 @@ Give Claude perfect memory of all your conversations. Search past discussions in
 
  **100% Local by Default** • **20x Faster** • **Zero Configuration** • **Production Ready**
 
- > **Latest: v7.1.9 Cross-Project Iteration Memory** - Ralph loops now share memory across ALL projects automatically. [Learn more →](#ralph-loop-memory-integration-v719)
+ > **Latest: v7.1.9 Cross-Project Iteration Memory** - Ralph loops now share memory across ALL projects automatically. [Learn more →](#ralph-loop-memory)
 
  ## Why This Exists
 
@@ -39,9 +39,8 @@ Claude starts fresh every conversation. You've solved complex bugs, designed arc
  - [The Magic](#the-magic)
  - [Before & After](#before--after)
  - [Real Examples](#real-examples)
- - [NEW: Real-time Indexing Status](#new-real-time-indexing-status-in-your-terminal)
+ - [Ralph Loop Memory](#ralph-loop-memory)
  - [Key Features](#key-features)
- - [Ralph Loop Memory Integration](#ralph-loop-memory-integration)
  - [Code Quality Insights](#code-quality-insights)
  - [Architecture](#architecture)
  - [Requirements](#requirements)
@@ -133,17 +132,44 @@ Claude: "Found conversations about JWT patterns including User.authenticate
  PKCE, and social login integration."
  ```
 
- ## NEW: Real-time Indexing Status in Your Terminal
+ ## Ralph Loop Memory
 
- See your conversation indexing progress directly in your statusline:
+ <div align="center">
+ <img src="docs/images/ralph-loop-csr.png" alt="Ralph Loop with CSR Memory - From hamster wheel to upward spiral" width="800"/>
+ </div>
+
+ **The difference between spinning in circles and building on every iteration.**
+
+ Use the [ralph-wiggum plugin](https://github.com/anthropics/claude-code-plugins/tree/main/ralph-wiggum) for long tasks? CSR gives your Ralph loops **persistent memory across sessions and projects**.
+
+ ### Without CSR: The Hamster Wheel
+ - Each context compaction = everything forgotten
+ - Same mistakes repeated across iterations
+ - No learning from past sessions
+ - Cross-project insights lost forever
+
+ ### With CSR: The Upward Spiral
+ - **Automatic backup** before context compaction
+ - **Anti-pattern injection** - "DON'T RETRY THESE" surfaces first
+ - **Success pattern learning** - reuse what worked before
+ - **Cross-project memory** - learn from ALL your projects
+
+ ### Quick Setup
+ ```bash
+ ./scripts/ralph/install_hooks.sh # Install hooks globally
+ ./scripts/ralph/install_hooks.sh --check # Verify installation
+ ```
 
- ### Fully Indexed (100%)
- ![Statusline showing 100% indexed](docs/images/statusbar-1.png)
+ ### How It Works
+ 1. Start a Ralph loop: `/ralph-wiggum:ralph-loop "Build feature X"`
+ 2. Work naturally - CSR hooks capture state automatically
+ 3. **Stop hook** stores each iteration's learnings
+ 4. **PreCompact hook** backs up state before compaction
+ 5. Next session retrieves past insights, failed approaches, and wins
 
- ### Active Indexing (50% with backlog)
- ![Statusline showing 50% indexed with 7h backlog](docs/images/statusbar-2.png)
+ > **v7.1.9+**: Cross-project iteration memory - hooks work for ALL projects, entries tagged with `project_{name}` for global searchability.
 
- Works with [Claude Code Statusline](https://github.com/sirmalloc/ccstatusline) - shows progress bars, percentages, and indexing lag in real-time! The statusline also displays MCP connection status (✓ Connected) and collection counts (28/29 indexed).
+ [Full documentation ](docs/development/ralph-memory-integration.md)
 
  ## Code Quality Insights
 
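Reviewer sketch (not part of the package): the `project_{name}` tagging that the new "Ralph Loop Memory" section describes could look roughly like the Python below. The tag values `__csr_hook_auto__` and `ralph_iteration` are taken from the removed "Verified proof" example in this diff; `HookEntry`, `make_entry`, and `entries_for_project` are hypothetical names used only for illustration and are not claimed to exist in claude-self-reflect.

```python
from dataclasses import dataclass, field

# Marker tag copied from the removed README example; how the real hooks
# attach it is not shown in this diff.
HOOK_TAG = "__csr_hook_auto__"


def project_tag(project_name: str) -> str:
    """Build the `project_{name}` tag described in the README."""
    return f"project_{project_name}"


@dataclass
class HookEntry:
    """Hypothetical in-memory stand-in for one stored hook entry."""
    summary: str
    tags: list[str] = field(default_factory=list)


def make_entry(project_name: str, kind: str, summary: str) -> HookEntry:
    """Create an entry tagged with the hook marker, the project, and the entry kind."""
    return HookEntry(summary=summary, tags=[HOOK_TAG, project_tag(project_name), kind])


def entries_for_project(entries: list[HookEntry], project_name: str) -> list[HookEntry]:
    """Return only the entries carrying the given project's tag."""
    wanted = project_tag(project_name)
    return [entry for entry in entries if wanted in entry.tags]


if __name__ == "__main__":
    store = [
        make_entry("emailmon", "ralph_iteration", "Docker memory fix"),
        make_entry("claude-self-reflect", "ralph_iteration", "Hook install check"),
    ]
    for entry in entries_for_project(store, "emailmon"):
        print(entry.tags, entry.summary)
```

Because every entry keeps its project tag, a search can either filter to one project or ignore the tag and look across all of them, which is the behavior the README calls global searchability.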
@@ -325,74 +351,6 @@ Claude: [Searches across ALL your projects]
 
  </details>
 
- <details>
- <summary><b>Ralph Loop Memory Integration (v7.1.9+)</b></summary>
-
- <div align="center">
- <img src="docs/images/ralph-loop-csr.png" alt="Ralph Loop with CSR Memory - From hamster wheel to upward spiral" width="800"/>
- </div>
-
- Use the [ralph-wiggum plugin](https://github.com/anthropics/claude-code-plugins/tree/main/ralph-wiggum) for long tasks? CSR automatically gives Ralph loops **cross-session AND cross-project memory**:
-
- **Core Features:**
- - **Automatic backup** before context compaction
- - **Past session retrieval** when starting new Ralph loops
- - **Failed approach tracking** - never repeat the same mistakes
- - **Success pattern learning** - reuse what worked before
-
- **v7.1.9 Cross-Project Iteration Memory (NEW!):**
- - **Global Hook System** - Hooks fire for ALL projects, not just CSR
- - **Stop Hook** - Captures iteration state after every Claude response
- - **PreCompact Hook** - Backs up Ralph state before context compaction
- - **Project Tagging** - Each entry tagged with `project_{name}` for cross-project visibility
- - **Automatic Storage** - No manual protocol needed, hooks capture everything
-
- **v7.1+ Enhanced Features:**
- - **Error Signature Deduplication** - Normalizes errors (removes line numbers, paths, timestamps)
- - **Output Decline Detection** - Circuit breaker pattern detects >70% output drop
- - **Confidence-Based Exit** - 0-100 scoring (tasks complete, tests passing, no errors)
- - **Anti-Pattern Injection** - "DON'T RETRY THESE" surfaces failed approaches first
- - **Work Type Tracking** - IMPLEMENTATION/TESTING/DEBUGGING/DOCUMENTATION
- - **Error-Centric Search** - Find past sessions by error pattern
-
- **Setup (one-time):**
- ```bash
- ./scripts/ralph/install_hooks.sh # Install CSR hooks globally
- ./scripts/ralph/install_hooks.sh --check # Verify installation
- ```
-
- **How it works:**
- 1. Start a Ralph loop in ANY project: `/ralph-wiggum:ralph-loop "Build feature X"`
- 2. Work naturally - state is tracked in `.claude/ralph-loop.local.md`
- 3. **Stop hook** captures each iteration automatically
- 4. **PreCompact hook** backs up state before compaction
- 5. All entries stored with project tags for cross-project search
-
- **Cross-Project Example:**
- ```bash
- # In project-a: learned Docker fix
- # In project-b: CSR finds it automatically
- reflect_on_past("docker memory issue")
- # Returns: project_a session with Docker solution
- ```
-
- **Verified proof (2026-01-05):**
- ```
- # Hook collection shows cross-project entries:
- Total entries: 23
-
- 1. 05:25:06 📦 PreCompact | 📧 emailmon
- 2. 05:22:00 🔄 Iteration | 🔍 claude-self-reflect
- 3. 05:13:51 🔄 Iteration | 📧 emailmon
-
- # Entries include project tags:
- tags: ['__csr_hook_auto__', 'project_emailmon', 'ralph_iteration']
- ```
-
- [Full documentation →](docs/development/ralph-memory-integration.md)
-
- </details>
-
  <details>
  <summary><b>Memory Decay</b></summary>
 
@@ -0,0 +1,81 @@
+ # Ground Truth Evaluation Prompt
+
+ You are evaluating a code generation session for ground truth labeling. Your task is to assess the quality and effectiveness of the solution provided.
+
+ ## Evaluation Criteria
+
+ ### 1. Functional Correctness (0.0-1.0)
+ - Does the solution address the user's request?
+ - Are there any bugs or errors in the implementation?
+ - Do builds and tests pass?
+ - Is the code logically sound?
+
+ ### 2. Design Quality (0.0-1.0)
+ - Does the solution follow best practices?
+ - Is the code well-structured and maintainable?
+ - Are appropriate patterns and architectures used?
+ - Is the implementation efficient?
+
+ ### 3. Completeness (0.0-1.0)
+ - Were all requirements addressed?
+ - Are edge cases handled?
+ - Is error handling appropriate?
+ - Is the solution production-ready?
+
+ ## Scoring Guidelines
+
+ **0.9-1.0 (Excellent)**
+ - All requirements met with high quality
+ - Best practices followed throughout
+ - Well-tested and robust
+ - Production-ready code
+
+ **0.7-0.8 (Good)**
+ - Requirements met with minor issues
+ - Generally follows best practices
+ - Some testing coverage
+ - Mostly production-ready
+
+ **0.5-0.6 (Acceptable)**
+ - Core requirements met
+ - Some quality issues
+ - Limited testing
+ - Needs refinement
+
+ **0.3-0.4 (Needs Work)**
+ - Partial implementation
+ - Quality concerns
+ - Missing tests
+ - Significant gaps
+
+ **0.0-0.2 (Inadequate)**
+ - Major issues or incomplete
+ - Does not meet requirements
+ - Serious bugs or design flaws
+
+ ## Response Format
+
+ Provide your evaluation in the following structure:
+
+ ```json
+ {
+ "functional_correctness": 0.0-1.0,
+ "design_quality": 0.0-1.0,
+ "completeness": 0.0-1.0,
+ "overall_grade": 0.0-1.0,
+ "reasoning": "Detailed explanation of scores",
+ "strengths": ["List of strong points"],
+ "weaknesses": ["List of areas for improvement"],
+ "confidence": 0.0-1.0
+ }
+ ```
+
+ ## Context Placeholders
+
+ The following sections will be filled in for each evaluation:
+
+ - `<request>`: The user's original request
+ - `<solution>`: The implementation provided
+ - `<tier1_results>`: Build and test results
+ - `<rubric>`: Specific evaluation criteria for this session
+ - `<narrative>`: Full conversation narrative for context
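Reviewer sketch (not part of the package): a consumer of the evaluation prompt above would probably want to check that a reply matches the "Response Format" before trusting it. The Python below is a minimal sketch under stated assumptions. The field names come from the prompt; `parse_evaluation` and `band_for` are hypothetical helpers. The prompt's bands leave gaps (for example between 0.8 and 0.9), so treating each band as running up to the next band's lower bound is an assumption, not something the prompt specifies.

```python
import json

# Numeric fields that the prompt constrains to 0.0-1.0.
SCORE_FIELDS = (
    "functional_correctness",
    "design_quality",
    "completeness",
    "overall_grade",
    "confidence",
)
LIST_FIELDS = ("strengths", "weaknesses")

# Lower bound of each band in the Scoring Guidelines, highest first.
# Assumption: a score in a gap (e.g. 0.85) falls into the band below it.
BANDS = (
    (0.9, "Excellent"),
    (0.7, "Good"),
    (0.5, "Acceptable"),
    (0.3, "Needs Work"),
    (0.0, "Inadequate"),
)


def parse_evaluation(raw: str) -> dict:
    """Parse a JSON reply and check it against the prompt's response format."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("evaluation must be a JSON object")
    for name in SCORE_FIELDS:
        value = data.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{name} must be a number")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    if not isinstance(data.get("reasoning"), str):
        raise ValueError("reasoning must be a string")
    for name in LIST_FIELDS:
        items = data.get(name)
        if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
            raise ValueError(f"{name} must be a list of strings")
    return data


def band_for(score: float) -> str:
    """Map a 0.0-1.0 score to the label of its scoring band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    for lower, label in BANDS:
        if score >= lower:
            return label
    raise AssertionError("unreachable: the 0.0 band catches every valid score")
```

The prompt does not say how `overall_grade` is derived from the three criteria, so the sketch only range-checks it and does not recompute it.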