loki-mode 4.2.0

Files changed (54)
  1. package/LICENSE +21 -0
  2. package/README.md +691 -0
  3. package/SKILL.md +191 -0
  4. package/VERSION +1 -0
  5. package/autonomy/.loki/dashboard/index.html +2634 -0
  6. package/autonomy/CONSTITUTION.md +508 -0
  7. package/autonomy/README.md +201 -0
  8. package/autonomy/config.example.yaml +152 -0
  9. package/autonomy/loki +526 -0
  10. package/autonomy/run.sh +3636 -0
  11. package/bin/loki-mode.js +26 -0
  12. package/bin/postinstall.js +60 -0
  13. package/docs/ACKNOWLEDGEMENTS.md +234 -0
  14. package/docs/COMPARISON.md +325 -0
  15. package/docs/COMPETITIVE-ANALYSIS.md +333 -0
  16. package/docs/INSTALLATION.md +547 -0
  17. package/docs/auto-claude-comparison.md +276 -0
  18. package/docs/cursor-comparison.md +225 -0
  19. package/docs/dashboard-guide.md +355 -0
  20. package/docs/screenshots/README.md +149 -0
  21. package/docs/screenshots/dashboard-agents.png +0 -0
  22. package/docs/screenshots/dashboard-tasks.png +0 -0
  23. package/docs/thick2thin.md +173 -0
  24. package/package.json +48 -0
  25. package/references/advanced-patterns.md +453 -0
  26. package/references/agent-types.md +243 -0
  27. package/references/agents.md +1043 -0
  28. package/references/business-ops.md +550 -0
  29. package/references/competitive-analysis.md +216 -0
  30. package/references/confidence-routing.md +371 -0
  31. package/references/core-workflow.md +275 -0
  32. package/references/cursor-learnings.md +207 -0
  33. package/references/deployment.md +604 -0
  34. package/references/lab-research-patterns.md +534 -0
  35. package/references/mcp-integration.md +186 -0
  36. package/references/memory-system.md +467 -0
  37. package/references/openai-patterns.md +647 -0
  38. package/references/production-patterns.md +568 -0
  39. package/references/prompt-repetition.md +192 -0
  40. package/references/quality-control.md +437 -0
  41. package/references/sdlc-phases.md +410 -0
  42. package/references/task-queue.md +361 -0
  43. package/references/tool-orchestration.md +691 -0
  44. package/skills/00-index.md +120 -0
  45. package/skills/agents.md +249 -0
  46. package/skills/artifacts.md +174 -0
  47. package/skills/github-integration.md +218 -0
  48. package/skills/model-selection.md +125 -0
  49. package/skills/parallel-workflows.md +526 -0
  50. package/skills/patterns-advanced.md +188 -0
  51. package/skills/production.md +292 -0
  52. package/skills/quality-gates.md +180 -0
  53. package/skills/testing.md +149 -0
  54. package/skills/troubleshooting.md +109 -0
@@ -0,0 +1,275 @@
1
+ # Core Workflow Reference
2
+
3
+ Full RARV cycle, CONTINUITY.md template, and autonomy rules.
4
+
5
+ ---
6
+
7
+ ## Autonomy Rules
8
+
9
+ **This system runs with ZERO human intervention.**
10
+
11
+ ### Core Rules
12
+ 1. **NEVER ask questions** - Do not say "Would you like me to...", "Should I...", or "What would you prefer?"
13
+ 2. **NEVER wait for confirmation** - Take immediate action. If something needs to be done, do it.
14
+ 3. **NEVER stop voluntarily** - Continue until the completion promise is fulfilled or the maximum iteration count is reached.
15
+ 4. **NEVER suggest alternatives** - Pick the best option and execute. No "You could also..." or "Alternatively..."
16
+ 5. **ALWAYS use RARV cycle** - Every action follows the Reason-Act-Reflect-Verify pattern
17
+
18
+ ---
19
+
20
+ ## RARV Cycle (Reason-Act-Reflect-Verify)
21
+
22
+ **Enhanced with Automatic Self-Verification Loop (Boris Cherny Pattern)**
23
+
24
+ Every iteration follows this cycle:
25
+
26
+ ```
27
+ +-------------------------------------------------------------------+
28
+ | REASON: What needs to be done next? |
29
+ | - READ .loki/CONTINUITY.md first (working memory) |
30
+ | - READ "Mistakes & Learnings" to avoid past errors |
31
+ | - Check current state in .loki/state/orchestrator.json |
32
+ | - Review pending tasks in .loki/queue/pending.json |
33
+ | - Identify highest priority unblocked task |
34
+ | - Determine exact steps to complete it |
35
+ +-------------------------------------------------------------------+
36
+ | ACT: Execute the task |
37
+ | - Dispatch subagent via Task tool OR execute directly |
38
+ | - Write code, run tests, fix issues |
39
+ | - Commit changes atomically (git checkpoint) |
40
+ | - Update queue files (.loki/queue/*.json) |
41
+ +-------------------------------------------------------------------+
42
+ | REFLECT: Did it work? What next? |
43
+ | - Verify task success (tests pass, no errors) |
44
+ | - UPDATE .loki/CONTINUITY.md with progress |
45
+ | - Update orchestrator state |
46
+ | - Check completion promise - are we done? |
47
+ | - If not done, loop back to REASON |
48
+ +-------------------------------------------------------------------+
49
+ | VERIFY: Let AI test its own work (2-3x quality improvement) |
50
+ | - Run automated tests (unit, integration, E2E) |
51
+ | - Check compilation/build (no errors or warnings) |
52
+ | - Verify against spec (.loki/specs/openapi.yaml) |
53
+ | - Run linters/formatters via post-write hooks |
54
+ | - Browser/runtime testing if applicable |
55
+ | |
56
+ | IF VERIFICATION FAILS: |
57
+ | 1. Capture error details (stack trace, logs) |
58
+ | 2. Analyze root cause |
59
+ | 3. UPDATE CONTINUITY.md "Mistakes & Learnings" |
60
+ | 4. Rollback to last good git checkpoint (if needed) |
61
+ | 5. Apply learning and RETRY from REASON |
62
+ | |
63
+ | - If verification passes, mark task complete and continue |
64
+ +-------------------------------------------------------------------+
65
+ ```
66
+
67
+ **Key Enhancement:** The VERIFY step creates a feedback loop where the AI:
68
+ - Tests every change automatically
69
+ - Learns from failures by updating CONTINUITY.md
70
+ - Retries with learned context
71
+ - Achieves 2-3x quality improvement (Boris Cherny's observed result)
72
+
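The feedback loop described above can be sketched in Python as follows. This is a minimal illustration, not Loki Mode's actual implementation: `run_rarv`, `RarvResult`, and the `act`/`verify` callables are hypothetical names, and the in-memory `learnings` list stands in for the "Mistakes & Learnings" section of CONTINUITY.md.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class RarvResult:
    completed: bool
    attempts: int
    learnings: List[str] = field(default_factory=list)


def run_rarv(
    act: Callable[[List[str]], object],
    verify: Callable[[object], Optional[str]],
    max_attempts: int = 3,
) -> RarvResult:
    """Retry a task until verification passes or attempts run out.

    `verify` returns None on success or an error description on failure.
    Each failure is recorded and handed to the next `act` call, which is
    the "retry with learned context" step.
    """
    learnings: List[str] = []
    for attempt in range(1, max_attempts + 1):
        # REASON + ACT: the actor sees every learning recorded so far.
        output = act(list(learnings))
        # REFLECT + VERIFY: check the result before marking the task done.
        error = verify(output)
        if error is None:
            return RarvResult(True, attempt, learnings)
        # Verification failed: store the learning, then loop back to REASON.
        learnings.append(error)
    return RarvResult(False, max_attempts, learnings)
```

In a real run, `verify` would wrap the test suite, build, and linters, and a failed attempt might also roll back to the last git checkpoint before retrying.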
73
+ ---
74
+
75
+ ## CONTINUITY.md - Working Memory Protocol
76
+
77
+ **CRITICAL:** You have a persistent working memory file at `.loki/CONTINUITY.md` that maintains state across all turns of execution.
78
+
79
+ ### AT THE START OF EVERY TURN:
80
+ 1. Read `.loki/CONTINUITY.md` to orient yourself to the current state
81
+ 2. Reference it throughout your reasoning
82
+ 3. Never make decisions without checking CONTINUITY.md first
83
+
84
+ ### AT THE END OF EVERY TURN:
85
+ 1. Update `.loki/CONTINUITY.md` with any important new information
86
+ 2. Record what was accomplished
87
+ 3. Note what needs to happen next
88
+ 4. Document any blockers or decisions made
89
+
90
+ ### CONTINUITY.md Template
91
+
92
+ ```markdown
93
+ # Loki Mode Working Memory
94
+ Last Updated: [ISO timestamp]
95
+ Current Phase: [bootstrap|discovery|architecture|development|qa|deployment|growth]
96
+ Current Iteration: [number]
97
+
98
+ ## Active Goal
99
+ [What we're currently trying to accomplish - 1-2 sentences]
100
+
101
+ ## Current Task
102
+ - ID: [task-id from queue]
103
+ - Description: [what we're doing]
104
+ - Status: [in-progress|blocked|reviewing]
105
+ - Started: [timestamp]
106
+
107
+ ## Just Completed
108
+ - [Most recent accomplishment with file:line references]
109
+ - [Previous accomplishment]
110
+ - [etc - last 5 items]
111
+
112
+ ## Next Actions (Priority Order)
113
+ 1. [Immediate next step]
114
+ 2. [Following step]
115
+ 3. [etc]
116
+
117
+ ## Active Blockers
118
+ - [Any current blockers or waiting items]
119
+
120
+ ## Key Decisions This Session
121
+ - [Decision]: [Rationale] - [timestamp]
122
+
123
+ ## Mistakes & Learnings (Self-Updating)
124
+ **CRITICAL:** When errors occur, agents MUST update this section to prevent repeating mistakes.
125
+
126
+ ### Pattern: Error -> Learning -> Prevention
127
+ - **What Failed:** [Specific error that occurred]
128
+ - **Why It Failed:** [Root cause analysis]
129
+ - **How to Prevent:** [Concrete action to avoid this in future]
130
+ - **Timestamp:** [When this was learned]
131
+ - **Agent:** [Which agent learned this]
132
+
133
+ ### Example:
134
+ - **What Failed:** TypeScript compilation error - missing return type annotation
135
+ - **Why It Failed:** Express route handlers need explicit `: void` return type in strict mode
136
+ - **How to Prevent:** Always add `: void` to route handlers: `(req, res): void =>`
137
+ - **Timestamp:** 2026-01-04T00:16:00Z
138
+ - **Agent:** eng-001-backend-api
139
+
140
+ **Self-Update Protocol:**
141
+
142
+ ON_ERROR:
143
+ 1. Capture error details (stack trace, context)
144
+ 2. Analyze root cause
145
+ 3. Write learning to CONTINUITY.md "Mistakes & Learnings"
146
+ 4. Update approach based on learning
147
+ 5. Retry with corrected approach
148
+
149
+
150
+ ## Working Context
151
+ [Any critical information needed for current work - API keys in use,
152
+ architecture decisions, patterns being followed, etc.]
153
+
154
+ ## Files Currently Being Modified
155
+ - [file path]: [what we're changing]
156
+ ```
157
+
158
+ ---
159
+
160
+ ## Memory Hierarchy
161
+
162
+ The memory systems work together:
163
+
164
+ 1. **CONTINUITY.md** = Working memory (current session state, updated every turn)
165
+ 2. **ledgers/** = Agent-specific state (checkpointed periodically)
166
+ 3. **handoffs/** = Agent-to-agent transfers (on agent switch)
167
+ 4. **learnings/** = Extracted patterns (on task completion)
168
+ 5. **rules/** = Permanent validated patterns (promoted from learnings)
169
+
170
+ **CONTINUITY.md is the PRIMARY source of truth for "what am I doing right now?"**
171
+
172
+ ---
173
+
174
+ ## Git Checkpoint System
175
+
176
+ **CRITICAL:** Every completed task MUST create a git checkpoint for rollback safety.
177
+
178
+ ### Protocol: Automatic Commits After Task Completion
179
+
180
+ **RULE:** When `task.status == "completed"`, create a git commit immediately.
181
+
182
+ ```bash
183
+ # Git Checkpoint Protocol
184
+ ON_TASK_COMPLETE() {
185
+ task_id=$1
186
+ task_title=$2
187
+ agent_id=$3
188
+
189
+ # Stage modified files (modified_files: array of paths, assumed to be set by the caller)
190
+ git add -- "${modified_files[@]}"
191
+
192
+ # Create structured commit message
193
+ git commit -m "[Loki] ${agent_type}-${task_id}: ${task_title}
194
+
195
+ ${detailed_description}
196
+
197
+ Agent: ${agent_id}
198
+ Parent: ${parent_agent_id}
199
+ Spec: ${spec_reference}
200
+ Tests: ${test_files}
201
+ Git-Checkpoint: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
202
+
203
+ # Store commit SHA in task metadata
204
+ commit_sha=$(git rev-parse HEAD)
205
+ update_task_metadata "$task_id" git_commit_sha "$commit_sha"
206
+
207
+ # Update CONTINUITY.md
208
+ echo "- Task $task_id completed (commit: $commit_sha)" >> .loki/CONTINUITY.md
209
+ }
210
+ ```
211
+
212
+ ### Commit Message Format
213
+
214
+ **Template:**
215
+ ```
216
+ [Loki] ${agent_type}-${task_id}: ${task_title}
217
+
218
+ ${detailed_description}
219
+
220
+ Agent: ${agent_id}
221
+ Parent: ${parent_agent_id}
222
+ Spec: ${spec_reference}
223
+ Tests: ${test_files}
224
+ Git-Checkpoint: ${timestamp}
225
+ ```
226
+
227
+ **Example:**
228
+ ```
229
+ [Loki] eng-005-backend: Implement POST /api/todos endpoint
230
+
231
+ Created todo creation endpoint per OpenAPI spec.
232
+ - Input validation for title field
233
+ - SQLite insertion with timestamps
234
+ - Returns 201 with created todo object
235
+ - Contract tests passing
236
+
237
+ Agent: eng-001-backend-api
238
+ Parent: orchestrator-main
239
+ Spec: .loki/specs/openapi.yaml#/paths/~1api~1todos/post
240
+ Tests: backend/tests/todos.contract.test.ts
241
+ Git-Checkpoint: 2026-01-04T05:45:00Z
242
+ ```
243
+
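As a rough illustration of the format above, the following Python sketch builds a checkpoint message and reads its trailer block back. The helper names `format_checkpoint_message` and `parse_trailers` are hypothetical and not part of Loki Mode; the sketch assumes the trailers form the final paragraph of the message, as in the template.

```python
from typing import Dict


def format_checkpoint_message(
    agent_type: str,
    task_id: str,
    title: str,
    description: str,
    trailers: Dict[str, str],
) -> str:
    """Build a message shaped like the template: subject, body, trailers."""
    subject = f"[Loki] {agent_type}-{task_id}: {title}"
    trailer_block = "\n".join(f"{key}: {value}" for key, value in trailers.items())
    # Paragraphs are separated by one blank line, as in the template.
    return "\n\n".join([subject, description, trailer_block])


def parse_trailers(message: str) -> Dict[str, str]:
    """Read the 'Key: value' lines from the last paragraph of a message."""
    last_paragraph = message.strip().split("\n\n")[-1]
    parsed: Dict[str, str] = {}
    for line in last_paragraph.splitlines():
        # Split on the first ': ' only, so timestamps keep their colons.
        key, sep, value = line.partition(": ")
        if sep:
            parsed[key] = value
    return parsed
```

Keeping the metadata in a fixed trailer block means a rollback tool can recover the agent, spec, and checkpoint time from `git log` output without a separate index.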
244
+ ### Rollback Strategy
245
+
246
+ **When to Rollback:**
247
+ - Quality gates fail after merge
248
+ - Integration tests fail
249
+ - Security vulnerabilities detected
250
+ - Breaking changes discovered
251
+
252
+ **Rollback Command:**
253
+ ```bash
254
+ # Find last good checkpoint
255
+ last_good_commit=$(git log --grep="\[Loki\].*${last_good_task_id}" --format=%H -n 1)
256
+
257
+ # Rollback to that checkpoint
258
+ git reset --hard "$last_good_commit"
259
+
260
+ # Update CONTINUITY.md
261
+ echo "ROLLBACK: Reset to task-${last_good_task_id} (commit: $last_good_commit)" >> .loki/CONTINUITY.md
262
+
263
+ # Re-queue failed tasks
264
+ move_tasks_to_pending after_task=$last_good_task_id
265
+ ```
266
+
267
+ ---
268
+
269
+ ## If Subagent Fails
270
+
271
+ 1. Do NOT try to fix manually (context pollution)
272
+ 2. Dispatch fix subagent with specific error context
273
+ 3. If fix subagent fails 3x, move to dead letter queue
274
+ 4. Open circuit breaker for that agent type
275
+ 5. Alert orchestrator for human review
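The escalation steps above might look like the following in Python. This is a sketch under stated assumptions: `SubagentFailureHandler` and the `dispatch_fix` callable are hypothetical names, and the in-memory lists stand in for the dead letter queue, circuit breaker state, and orchestrator alerts.

```python
from typing import Callable, List, Set


class SubagentFailureHandler:
    """Retry via fix subagents, then dead-letter the task and open a breaker."""

    def __init__(self, max_fix_attempts: int = 3) -> None:
        self.max_fix_attempts = max_fix_attempts
        self.dead_letter_queue: List[str] = []
        self.open_breakers: Set[str] = set()
        self.alerts: List[str] = []

    def handle(
        self,
        task_id: str,
        agent_type: str,
        error: str,
        dispatch_fix: Callable[[str, str], bool],
    ) -> str:
        # Steps 1-2: never fix inline; hand the error context to a fix subagent.
        for _ in range(self.max_fix_attempts):
            if dispatch_fix(task_id, error):
                return "fixed"
        # Step 3: the fix subagent failed max_fix_attempts times.
        self.dead_letter_queue.append(task_id)
        # Step 4: stop routing work to this agent type.
        self.open_breakers.add(agent_type)
        # Step 5: surface the failure to the orchestrator.
        self.alerts.append(
            f"{task_id}: fix subagent failed {self.max_fix_attempts}x ({agent_type})"
        )
        return "dead_letter"
```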
@@ -0,0 +1,207 @@
1
+ # Cursor Scaling Learnings
2
+
3
+ > **Source:** [Cursor Blog - Scaling Agents](https://cursor.com/blog/scaling-agents) (January 2026)
4
+ > **Context:** Cursor ran hundreds of concurrent agents, consuming trillions of tokens, to complete projects exceeding 1M lines of code
5
+
6
+ ---
7
+
8
+ ## Key Findings
9
+
10
+ ### 1. Flat Coordination Fails at Scale
11
+
12
+ **What they tried:**
13
+ - Equal-status agents self-coordinating through shared files
14
+ - File-based locking mechanisms
15
+
16
+ **What happened:**
17
+ - "Twenty agents would slow down to the effective throughput of two or three"
18
+ - Most time spent waiting on locks
19
+ - Agents failed while holding locks, creating deadlocks
20
+
21
+ **Lesson:** Hierarchical coordination (planner-worker) outperforms flat coordination.
22
+
23
+ ---
24
+
25
+ ### 2. Integrator Roles Create Bottlenecks
26
+
27
+ **What they tried:**
28
+ - Dedicated integrator agents to coordinate and merge work
29
+ - Quality control checkpoints between workers
30
+
31
+ **What happened:**
32
+ - "Created more bottlenecks than it solved"
33
+ - Workers were already capable of handling conflicts themselves
34
+
35
+ **Lesson:** Trust workers to handle conflicts. Remove unnecessary oversight layers at scale.
36
+
37
+ **Implication for Loki Mode:** The 3-reviewer blind review system may become a bottleneck at 100+ agent scale. Consider:
38
+ - Making review optional for low-risk changes
39
+ - Allowing workers to self-merge trivial fixes
40
+ - Escalating only high-risk changes to full review
41
+
42
+ ---
43
+
44
+ ### 3. Optimistic Concurrency Control
45
+
46
+ **What they tried:**
47
+ - File locking (failed - deadlocks, bottlenecks)
48
+ - Optimistic concurrency (succeeded)
49
+
50
+ **How it works:**
51
+ ```
52
+ 1. Agent reads current state (no lock)
53
+ 2. Agent performs work
54
+ 3. Agent attempts write
55
+ 4. IF state changed since read: Write fails, agent retries
56
+ 5. IF state unchanged: Write succeeds
57
+ ```
58
+
59
+ **Benefits:**
60
+ - No waiting for locks
61
+ - No deadlock risk
62
+ - Failed writes are cheap (just retry)
63
+
64
+ **Lesson:** Optimistic concurrency scales better than pessimistic locking.
65
+
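The read, work, attempt-write, retry sequence above can be sketched with a version counter. This is an illustrative Python example, not Cursor's or Loki Mode's implementation: `VersionedStore` and `update_with_retry` are hypothetical names, and the internal guard is held only for the instant of the compare-and-swap, never while an agent is doing work.

```python
import threading
from typing import Callable, Tuple


class VersionedStore:
    """Shared state guarded by a version number instead of a long-held lock."""

    def __init__(self, value: int) -> None:
        self._value = value
        self._version = 0
        self._guard = threading.Lock()  # protects only the brief read/swap

    def read(self) -> Tuple[int, int]:
        with self._guard:
            return self._value, self._version

    def try_write(self, new_value: int, expected_version: int) -> bool:
        with self._guard:
            if self._version != expected_version:
                return False  # state changed since the read: write fails
            self._value = new_value
            self._version += 1
            return True


def update_with_retry(
    store: VersionedStore, fn: Callable[[int], int], max_retries: int = 10
) -> bool:
    """Read, compute, attempt the write; on conflict re-read and try again."""
    for _ in range(max_retries):
        value, version = store.read()
        if store.try_write(fn(value), version):
            return True
    return False
```

An agent that crashes between `read` and `try_write` holds nothing, so other agents are never blocked by it; the cost is that conflicting work is redone on retry.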
66
+ ---
67
+
68
+ ### 4. Recursive Sub-Planners
69
+
70
+ **Pattern:**
71
+ ```
72
+ Main Planner
73
+ |
74
+ +-- Sub-Planner (Frontend)
75
+ | +-- Worker (Component A)
76
+ | +-- Worker (Component B)
77
+ |
78
+ +-- Sub-Planner (Backend)
79
+ | +-- Worker (API)
80
+ | +-- Worker (Database)
81
+ |
82
+ +-- Sub-Planner (Testing)
83
+ +-- Worker (Unit)
84
+ +-- Worker (E2E)
85
+ ```
86
+
87
+ **Key insight:** "Planners continuously explore the codebase and create tasks. They can spawn sub-planners for specific areas, making planning itself parallel and recursive."
88
+
89
+ **Benefits:**
90
+ - Planning scales horizontally
91
+ - Each sub-planner has focused context
92
+ - Prevents single-planner bottleneck
93
+
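One way to model the planner tree above is a recursive structure in which each planner owns its worker tasks and any sub-planners. The Python sketch below is hypothetical (`Planner` and `collect_tasks` are illustrative names) and only shows how tasks stay scoped to the area of the planner that created them.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Planner:
    area: str
    worker_tasks: List[str] = field(default_factory=list)
    sub_planners: List["Planner"] = field(default_factory=list)


def collect_tasks(planner: Planner, path: Tuple[str, ...] = ()) -> List[str]:
    """Walk the tree depth-first and label each task with its planner path."""
    here = path + (planner.area,)
    tasks = ["/".join(here) + ": " + task for task in planner.worker_tasks]
    for sub in planner.sub_planners:
        # Each sub-planner contributes tasks for its own focused area.
        tasks.extend(collect_tasks(sub, here))
    return tasks
```

For example, a `Main` planner with `Frontend`, `Backend`, and `Testing` sub-planners yields labels such as `Main/Frontend: Component A`, so each worker can tell which planner's context it belongs to.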
94
+ ---
95
+
96
+ ### 5. Judge Agents
97
+
98
+ **Role:** Determine whether execution cycles should continue or terminate.
99
+
100
+ **When to use:**
101
+ - After major milestones
102
+ - When workers report completion
103
+ - When detecting diminishing returns
104
+
105
+ **Implementation:**
106
+ ```yaml
107
+ judge_agent:
108
+ inputs:
109
+ - Current state
110
+ - Original goal
111
+ - Recent progress
112
+ - Resource consumption
113
+ outputs:
114
+ - CONTINUE: More work needed
115
+ - COMPLETE: Goal achieved
116
+ - ESCALATE: Human intervention needed
117
+ - PIVOT: Change approach
118
+ ```
119
+
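A judge's decision rule could be sketched as below. The four verdicts come from the outputs listed above; everything else is an assumption made for illustration, including the `JudgeInput` fields, the threshold values, and the order in which the checks run.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CONTINUE = "continue"
    COMPLETE = "complete"
    ESCALATE = "escalate"
    PIVOT = "pivot"


@dataclass
class JudgeInput:
    goal_met: bool          # current state compared against the original goal
    recent_progress: float  # hypothetical: fraction of planned work closed recently
    budget_used: float      # hypothetical: fraction of the resource budget consumed
    pivots_so_far: int


def judge(
    state: JudgeInput,
    min_progress: float = 0.05,
    max_budget: float = 0.9,
    max_pivots: int = 2,
) -> Verdict:
    if state.goal_met:
        return Verdict.COMPLETE
    if state.budget_used >= max_budget:
        return Verdict.ESCALATE  # out of resources before reaching the goal
    if state.recent_progress < min_progress:
        # Diminishing returns: change approach a limited number of times.
        if state.pivots_so_far < max_pivots:
            return Verdict.PIVOT
        return Verdict.ESCALATE
    return Verdict.CONTINUE
```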
120
+ ---
121
+
122
+ ### 6. Prompts Matter More Than Harness
123
+
124
+ **Cursor's finding:** "A surprising amount of the system's behavior comes down to how we prompt the agents... The harness and models matter, but the prompts matter more."
125
+
126
+ **Implication:** Don't over-engineer the coordination infrastructure. Invest in:
127
+ - Clear, specific prompts
128
+ - Role definitions
129
+ - Context injection
130
+ - Output format specifications
131
+
132
+ ---
133
+
134
+ ### 7. Periodic Fresh Starts Combat Drift
135
+
136
+ **Problem:** Extended autonomous operation leads to:
137
+ - Context drift
138
+ - Tunnel vision
139
+ - Accumulated assumptions
140
+
141
+ **Solution:** "We still need periodic fresh starts to combat drift and tunnel vision."
142
+
143
+ **Implementation:**
144
+ ```yaml
145
+ drift_prevention:
146
+ context_reset_interval: 25_iterations # Already in Loki Mode
147
+ mandatory_state_dump: true
148
+ fresh_planner_spawn: every_major_milestone
149
+ ```
150
+
151
+ ---
152
+
153
+ ## Scale Metrics Achieved
154
+
155
+ | Project | Scale | Duration |
156
+ |---------|-------|----------|
157
+ | Web browser | 1M+ LoC, 1,000 files | ~1 week |
158
+ | Solid-to-React migration | 266K additions, 193K deletions | 3+ weeks |
159
+ | Java LSP | 7.4K commits, 550K LoC | - |
160
+ | Windows 7 emulator | 14.6K commits, 1.2M LoC | - |
161
+ | Excel implementation | 12K commits, 1.6M LoC | - |
162
+
163
+ ---
164
+
165
+ ## Applying to Loki Mode
166
+
167
+ ### Already Implemented (Aligned)
168
+
169
+ 1. **Hierarchical coordination** - Orchestrator -> Agents
170
+ 2. **Context management** - CONTINUITY.md, 25-iteration consolidation
171
+ 3. **Phase-based execution** - SDLC state machine
172
+
173
+ ### Should Add
174
+
175
+ 1. **Recursive sub-planners** - Allow planner agents to spawn sub-planners
176
+ 2. **Judge agents** - Explicit cycle continuation decisions
177
+ 3. **Optimistic concurrency** - Replace signal files with optimistic writes
178
+ 4. **Scale-aware review** - Adaptive review intensity based on agent count
179
+
180
+ ### Should Monitor
181
+
182
+ 1. **3-reviewer bottleneck** - May not scale beyond roughly 50 agents
183
+ 2. **Signal file coordination** - Similar to Cursor's failed file locking
184
+ 3. **Over-specification** - 37 agent types may be overkill
185
+
186
+ ---
187
+
188
+ ## Integration Recommendations
189
+
190
+ ### Phase 1: Low Risk
191
+ - Add judge agents (new agent type)
192
+ - Document optimistic concurrency option
193
+ - Add scale considerations to quality gates
194
+
195
+ ### Phase 2: Medium Risk
196
+ - Implement recursive sub-planners
197
+ - Make review intensity configurable
198
+ - Add optimistic concurrency mode
199
+
200
+ ### Phase 3: Validation Required
201
+ - Test at 100+ agent scale
202
+ - Measure reviewer bottleneck impact
203
+ - Compare file signals vs optimistic concurrency
204
+
205
+ ---
206
+
207
+ **v4.1.0 | Cursor Scaling Learnings**