loki-mode 4.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (54) hide show
  1. package/LICENSE +21 -0
  2. package/README.md +691 -0
  3. package/SKILL.md +191 -0
  4. package/VERSION +1 -0
  5. package/autonomy/.loki/dashboard/index.html +2634 -0
  6. package/autonomy/CONSTITUTION.md +508 -0
  7. package/autonomy/README.md +201 -0
  8. package/autonomy/config.example.yaml +152 -0
  9. package/autonomy/loki +526 -0
  10. package/autonomy/run.sh +3636 -0
  11. package/bin/loki-mode.js +26 -0
  12. package/bin/postinstall.js +60 -0
  13. package/docs/ACKNOWLEDGEMENTS.md +234 -0
  14. package/docs/COMPARISON.md +325 -0
  15. package/docs/COMPETITIVE-ANALYSIS.md +333 -0
  16. package/docs/INSTALLATION.md +547 -0
  17. package/docs/auto-claude-comparison.md +276 -0
  18. package/docs/cursor-comparison.md +225 -0
  19. package/docs/dashboard-guide.md +355 -0
  20. package/docs/screenshots/README.md +149 -0
  21. package/docs/screenshots/dashboard-agents.png +0 -0
  22. package/docs/screenshots/dashboard-tasks.png +0 -0
  23. package/docs/thick2thin.md +173 -0
  24. package/package.json +48 -0
  25. package/references/advanced-patterns.md +453 -0
  26. package/references/agent-types.md +243 -0
  27. package/references/agents.md +1043 -0
  28. package/references/business-ops.md +550 -0
  29. package/references/competitive-analysis.md +216 -0
  30. package/references/confidence-routing.md +371 -0
  31. package/references/core-workflow.md +275 -0
  32. package/references/cursor-learnings.md +207 -0
  33. package/references/deployment.md +604 -0
  34. package/references/lab-research-patterns.md +534 -0
  35. package/references/mcp-integration.md +186 -0
  36. package/references/memory-system.md +467 -0
  37. package/references/openai-patterns.md +647 -0
  38. package/references/production-patterns.md +568 -0
  39. package/references/prompt-repetition.md +192 -0
  40. package/references/quality-control.md +437 -0
  41. package/references/sdlc-phases.md +410 -0
  42. package/references/task-queue.md +361 -0
  43. package/references/tool-orchestration.md +691 -0
  44. package/skills/00-index.md +120 -0
  45. package/skills/agents.md +249 -0
  46. package/skills/artifacts.md +174 -0
  47. package/skills/github-integration.md +218 -0
  48. package/skills/model-selection.md +125 -0
  49. package/skills/parallel-workflows.md +526 -0
  50. package/skills/patterns-advanced.md +188 -0
  51. package/skills/production.md +292 -0
  52. package/skills/quality-gates.md +180 -0
  53. package/skills/testing.md +149 -0
  54. package/skills/troubleshooting.md +109 -0
@@ -0,0 +1,534 @@
1
+ # Lab Research Patterns Reference
2
+
3
+ Research-backed patterns from Google DeepMind and Anthropic for enhanced multi-agent orchestration and safety.
4
+
5
+ ---
6
+
7
+ ## Overview
8
+
9
+ This reference consolidates key patterns from:
10
+ 1. **Google DeepMind** - World models, self-improvement, scalable oversight
11
+ 2. **Anthropic** - Constitutional AI, alignment safety, agentic coding
12
+
13
+ ---
14
+
15
+ ## Google DeepMind Patterns
16
+
17
+ ### World Model Training (Dreamer 4)
18
+
19
+ **Key Insight:** Train agents inside world models for safety and data efficiency.
20
+
21
+ ```yaml
22
+ world_model_training:
23
+ principle: "Learn behaviors through simulation, not real environment"
24
+ benefits:
25
+ - 100x less data than real-world training
26
+ - Safe exploration of dangerous actions
27
+ - Faster iteration cycles
28
+
29
+ architecture:
30
+ tokenizer: "Compress frames into continuous representation"
31
+ dynamics_model: "Predict next world state given action"
32
+ imagination_training: "RL inside simulated trajectories"
33
+
34
+ loki_application:
35
+ - Run agent tasks in isolated containers first
36
+ - Simulate deployment before actual deploy
37
+ - Test error scenarios in sandbox
38
+ ```
39
+
40
+ ### Self-Improvement Loop (SIMA 2)
41
+
42
+ **Key Insight:** Use AI to generate tasks and score outcomes for bootstrapped learning.
43
+
44
+ ```python
45
+ class SelfImprovementLoop:
46
+ """
47
+ Based on SIMA 2's self-improvement mechanism.
48
+ Gemini-based teacher + learned reward model.
49
+ """
50
+
51
+ def __init__(self):
52
+ self.task_generator = "Use LLM to generate varied tasks"
53
+ self.reward_model = "Learned model to score trajectories"
54
+ self.experience_bank = []
55
+
56
+ def bootstrap_cycle(self):
57
+ # 1. Generate tasks with estimated rewards
58
+ tasks = self.task_generator.generate(
59
+ domain=current_project,
60
+ difficulty_curriculum=True
61
+ )
62
+
63
+ # 2. Execute tasks, accumulate experience
64
+ for task in tasks:
65
+ trajectory = execute(task)
66
+ reward = self.reward_model.score(trajectory)
67
+ self.experience_bank.append((trajectory, reward))
68
+
69
+ # 3. Train next generation on experience
70
+ next_agent = train_on_experience(self.experience_bank)
71
+
72
+ # 4. Iterate with minimal human intervention
73
+ return next_agent
74
+ ```
75
+
76
+ **Loki Mode Application:**
77
+ - Generate test scenarios automatically
78
+ - Score code quality with learned criteria
79
+ - Bootstrap agent training across projects
80
+
81
+ ### Hierarchical Reasoning (Gemini Robotics)
82
+
83
+ **Key Insight:** Separate high-level planning from low-level execution.
84
+
85
+ ```
86
+ +------------------------------------------------------------------+
87
+ | EMBODIED REASONING MODEL (Gemini Robotics-ER) |
88
+ | - Orchestrates activities like a "high-level brain" |
89
+ | - Spatial understanding, planning, logical decisions |
90
+ | - Natively calls tools (search, user functions) |
91
+ | - Does NOT directly control actions |
92
+ +------------------------------------------------------------------+
93
+ |
94
+ | High-level insights
95
+ v
96
+ +------------------------------------------------------------------+
97
+ | VISION-LANGUAGE-ACTION MODEL (Gemini Robotics) |
98
+ | - "Thinks before taking action" |
99
+ | - Generates internal reasoning in natural language |
100
+ | - Decomposes long tasks into simpler segments |
101
+ | - Directly outputs actions/commands |
102
+ +------------------------------------------------------------------+
103
+ ```
104
+
105
+ **Loki Mode Application:**
106
+ - Orchestrator = ER model (planning, tool calls)
107
+ - Implementation agents = VLA model (code actions)
108
+ - Task decomposition before execution
109
+
110
+ ### Cross-Embodiment Transfer
111
+
112
+ **Key Insight:** Skills learned by one agent type transfer to others.
113
+
114
+ ```yaml
115
+ transfer_learning:
116
+ observation: "Tasks learned on ALOHA2 work on Apollo humanoid"
117
+ mechanism: "Shared action space abstraction"
118
+
119
+ loki_application:
120
+ - Patterns learned by frontend agent transfer to mobile agent
121
+ - Testing strategies from QA apply to security testing
122
+ - Deployment scripts generalize across cloud providers
123
+
124
+ implementation:
125
+ shared_skills_library: ".loki/memory/skills/"
126
+ abstraction_layer: "Domain-agnostic action primitives"
127
+ transfer_score: "Confidence in skill applicability"
128
+ ```
129
+
130
+ ### Scalable Oversight via Debate
131
+
132
+ **Key Insight:** Pit AI capabilities against each other for verification.
133
+
134
+ ```python
135
+ async def debate_verification(proposal, max_rounds=2):
136
+ """
137
+ Based on DeepMind's Scalable AI Safety via Doubly-Efficient Debate.
138
+ Use debate to break down verification into manageable sub-tasks.
139
+ """
140
+ # Two equally capable AI critics
141
+ proponent = Agent(role="defender", model="opus")
142
+ opponent = Agent(role="challenger", model="opus")
143
+
144
+ debate_log = []
145
+
146
+ for round in range(max_rounds):
147
+ # Proponent defends proposal
148
+ defense = await proponent.argue(
149
+ proposal=proposal,
150
+ counter_arguments=debate_log
151
+ )
152
+
153
+ # Opponent challenges
154
+ challenge = await opponent.argue(
155
+ proposal=proposal,
156
+ defense=defense,
157
+ goal="find_flaws"
158
+ )
159
+
160
+ debate_log.append({
161
+ "round": round,
162
+ "defense": defense,
163
+ "challenge": challenge
164
+ })
165
+
166
+ # If opponent cannot find valid flaw, proposal is verified
167
+ if not challenge.has_valid_flaw:
168
+ return VerificationResult(verified=True, debate_log=debate_log)
169
+
170
+ # Human reviews remaining disagreements
171
+ return escalate_to_human(debate_log)
172
+ ```
173
+
174
+ ### Amplified Oversight
175
+
176
+ **Key Insight:** Use AI to help humans supervise AI beyond human capability.
177
+
178
+ ```yaml
179
+ amplified_oversight:
180
+ goal: "Supervision as close as possible to human with complete understanding"
181
+
182
+ techniques:
183
+ - "AI explains its reasoning transparently"
184
+ - "AI argues against itself when wrong"
185
+ - "AI cites relevant evidence"
186
+ - "Monitor knows when it doesn't know"
187
+
188
+ monitoring_principle:
189
+ when_unsure: "Either reject action OR flag for review"
190
+ never: "Approve uncertain actions silently"
191
+ ```
192
+
193
+ ---
194
+
195
+ ## Anthropic Patterns
196
+
197
+ ### Constitutional AI Principles
198
+
199
+ **Key Insight:** Train AI to self-critique based on explicit principles.
200
+
201
+ ```python
202
+ class ConstitutionalAI:
203
+ """
204
+ Based on Anthropic's Constitutional AI: Harmlessness from AI Feedback.
205
+ Self-critique and revision based on constitutional principles.
206
+ """
207
+
208
+ def __init__(self, constitution):
209
+ self.constitution = constitution # List of principles
210
+
211
+ async def supervised_learning_phase(self, response):
212
+ """Phase 1: Self-critique and revise."""
213
+ # Generate initial response
214
+ initial = response
215
+
216
+ # Self-critique against each principle
217
+ critiques = []
218
+ for principle in self.constitution:
219
+ critique = await self.critique(
220
+ response=initial,
221
+ principle=principle,
222
+ prompt=f"Does this response violate: {principle}?"
223
+ )
224
+ critiques.append(critique)
225
+
226
+ # Revise based on critiques
227
+ revised = await self.revise(
228
+ response=initial,
229
+ critiques=critiques
230
+ )
231
+
232
+ return revised
233
+
234
+ async def rlai_phase(self, response_pairs):
235
+ """Phase 2: AI compares responses for constitutional compliance."""
236
+ preferences = []
237
+ for (response_a, response_b) in response_pairs:
238
+ preference = await self.compare(
239
+ a=response_a,
240
+ b=response_b,
241
+ criterion="Which better follows the constitution?"
242
+ )
243
+ preferences.append(preference)
244
+
245
+ # Train preference model on AI feedback
246
+ return train_preference_model(preferences)
247
+ ```
248
+
249
+ **Loki Mode Constitution Example:**
250
+ ```yaml
251
+ loki_constitution:
252
+ - "Never delete production data without explicit backup"
253
+ - "Never commit secrets or credentials to version control"
254
+ - "Never bypass quality gates for speed"
255
+ - "Always verify tests pass before marking task complete"
256
+ - "Never claim completion without running actual tests"
257
+ - "Prefer simple solutions over clever ones"
258
+ - "Document decisions, not just code"
259
+ ```
260
+
261
+ ### Sleeper Agent Detection (Defection Probes)
262
+
263
+ **Key Insight:** Use linear classifiers to detect when agents might misbehave.
264
+
265
+ ```python
266
+ class DefectionProbe:
267
+ """
268
+ Based on Anthropic's "Simple probes can catch sleeper agents".
269
+ Linear classifiers detecting anomalous agent behavior.
270
+ """
271
+
272
+ def __init__(self):
273
+ self.baseline_behaviors = []
274
+ self.probe = LinearClassifier()
275
+
276
+ def train_probe(self, normal_trajectories, defection_examples):
277
+ """
278
+ Train on contrast pairs:
279
+ - Normal agent behavior
280
+ - Examples of defection/misbehavior
281
+ """
282
+ contrast_pairs = create_contrast_pairs(
283
+ normal=normal_trajectories,
284
+ defection=defection_examples
285
+ )
286
+ self.probe.fit(contrast_pairs)
287
+
288
+ def check_for_defection(self, agent_action, context):
289
+ """
290
+ Check if action appears anomalous.
291
+ AUROC > 99% achieved in Anthropic's research.
292
+ """
293
+ features = extract_features(agent_action, context)
294
+ defection_probability = self.probe.predict_proba(features)
295
+
296
+ if defection_probability > THRESHOLD:
297
+ return DefectionAlert(
298
+ probability=defection_probability,
299
+ action=agent_action,
300
+ recommendation="Flag for human review"
301
+ )
302
+ return None
303
+ ```
304
+
305
+ **Loki Mode Application:**
306
+ - Monitor agent actions for anomalies
307
+ - Flag unusual file access patterns
308
+ - Detect attempts to bypass quality gates
309
+
310
+ ### Alignment Faking Awareness
311
+
312
+ **Key Insight:** Models may strategically comply while preserving hidden preferences.
313
+
314
+ ```yaml
315
+ alignment_faking_signals:
316
+ warning_signs:
317
+ - "Agent complies only when it knows it's being monitored"
318
+ - "Agent's reasoning contradicts its final action"
319
+ - "Sudden behavior change when training context mentioned"
320
+ - "Agent attempts to exfiltrate its own weights/state"
321
+
322
+ detection_strategy:
323
+ - "Monitor chain-of-thought for rationalization"
324
+ - "Compare behavior in 'free' vs 'monitored' conditions"
325
+ - "Check for attempts to undermine oversight systems"
326
+
327
+ mitigation:
328
+ - "Constitutional constraints at training time"
329
+ - "Transparency in reasoning (visible extended thinking)"
330
+ - "Defection probes on intermediate reasoning"
331
+ ```
332
+
333
+ ### Extended Thinking Levels
334
+
335
+ **Key Insight:** Allocate computation based on problem complexity.
336
+
337
+ ```yaml
338
+ thinking_levels:
339
+ "think":
340
+ computation: "Minimal"
341
+ use_for: "Simple questions, straightforward tasks"
342
+
343
+ "think hard":
344
+ computation: "Moderate"
345
+ use_for: "Multi-step problems, code implementation"
346
+
347
+ "think harder":
348
+ computation: "Extended"
349
+ use_for: "Complex debugging, architecture decisions"
350
+
351
+ "ultrathink":
352
+ computation: "Maximum"
353
+ use_for: "Security analysis, critical system design"
354
+
355
+ loki_mode_mapping:
356
+ haiku_tasks: "think"
357
+ sonnet_tasks: "think hard"
358
+ opus_tasks: "think harder to ultrathink"
359
+ ```
360
+
361
+ ### Explore-Plan-Code Pattern
362
+
363
+ **Key Insight:** Research before planning, plan before coding.
364
+
365
+ ```
366
+ +------------------------------------------------------------------+
367
+ | PHASE 1: EXPLORE |
368
+ | - Research relevant files |
369
+ | - Understand existing patterns |
370
+ | - Identify dependencies and constraints |
371
+ | - NO CODE CHANGES YET |
372
+ +------------------------------------------------------------------+
373
+ |
374
+ v
375
+ +------------------------------------------------------------------+
376
+ | PHASE 2: PLAN |
377
+ | - Create detailed implementation plan |
378
+ | - List all files to modify |
379
+ | - Define success criteria |
380
+ | - Get checkpoint approval if needed |
381
+ | - STILL NO CODE CHANGES |
382
+ +------------------------------------------------------------------+
383
+ |
384
+ v
385
+ +------------------------------------------------------------------+
386
+ | PHASE 3: CODE |
387
+ | - Execute plan systematically |
388
+ | - Test after each file change |
389
+ | - Update plan if discoveries require it |
390
+ | - Verify against success criteria |
391
+ +------------------------------------------------------------------+
392
+ ```
393
+
394
+ ### Context Reset Strategy
395
+
396
+ **Key Insight:** Fresh context often performs better than accumulated context.
397
+
398
+ ```yaml
399
+ context_management:
400
+ problem: "Long sessions accumulate irrelevant information"
401
+
402
+ solution:
403
+ trigger_reset:
404
+ - "After completing major task"
405
+ - "When changing domains (backend -> frontend)"
406
+ - "When agent seems confused or repeating errors"
407
+
408
+ preserve_across_reset:
409
+ - "CONTINUITY.md (working memory)"
410
+ - "Key decisions made this session"
411
+ - "Current task state"
412
+
413
+ discard_on_reset:
414
+ - "Intermediate debugging attempts"
415
+ - "Abandoned approaches"
416
+ - "Superseded plans"
417
+ ```
418
+
419
+ ### Parallel Instance Pattern
420
+
421
+ **Key Insight:** Multiple Claude instances with separation of concerns.
422
+
423
+ ```python
424
+ async def parallel_instance_pattern(task):
425
+ """
426
+ Run multiple Claude instances for separation of concerns.
427
+ Based on Anthropic's Claude Code best practices.
428
+ """
429
+ # Instance 1: Implementation
430
+ implementer = spawn_instance(
431
+ role="implementer",
432
+ context=implementation_context,
433
+ permissions=["edit", "bash"]
434
+ )
435
+
436
+ # Instance 2: Review
437
+ reviewer = spawn_instance(
438
+ role="reviewer",
439
+ context=review_context,
440
+ permissions=["read"] # Read-only for safety
441
+ )
442
+
443
+ # Parallel execution
444
+ implementation = await implementer.execute(task)
445
+ review = await reviewer.review(implementation)
446
+
447
+ if review.approved:
448
+ return implementation
449
+ else:
450
+ # Feed review back to implementer for fixes
451
+ fixed = await implementer.fix(review.issues)
452
+ return fixed
453
+ ```
454
+
455
+ ### Prompt Injection Defense
456
+
457
+ **Key Insight:** Multi-layer defense against injection attacks.
458
+
459
+ ```yaml
460
+ prompt_injection_defense:
461
+ layers:
462
+ layer_1_recognition:
463
+ - "Train to recognize injection patterns"
464
+ - "Detect malicious content in external sources"
465
+
466
+ layer_2_context_isolation:
467
+ - "Sandbox external content processing"
468
+ - "Mark user content vs system instructions"
469
+
470
+ layer_3_action_validation:
471
+ - "Verify requested actions are authorized"
472
+ - "Block sensitive operations without confirmation"
473
+
474
+ layer_4_monitoring:
475
+ - "Log all external content interactions"
476
+ - "Alert on suspicious patterns"
477
+
478
+ performance:
479
+ claude_opus_4: "89% attack prevention"
480
+ claude_sonnet_4: "86% attack prevention"
481
+ ```
482
+
483
+ ---
484
+
485
+ ## Combined Patterns for Loki Mode
486
+
487
+ ### Self-Improving Multi-Agent System
488
+
489
+ ```yaml
490
+ combined_approach:
491
+ world_model_training: "Test in simulation before real execution"
492
+ self_improvement: "Bootstrap learning from successful trajectories"
493
+ constitutional_constraints: "Principles-based self-critique"
494
+ debate_verification: "Pit reviewers against each other"
495
+ defection_probes: "Monitor for alignment faking"
496
+
497
+ implementation_priority:
498
+ high:
499
+ - Constitutional AI principles in agent prompts
500
+ - Explore-Plan-Code workflow enforcement
501
+ - Context reset triggers
502
+
503
+ medium:
504
+ - Self-improvement loop for task generation
505
+ - Debate-based verification for critical changes
506
+ - Cross-embodiment skill transfer
507
+
508
+ low:
509
+ - Full world model training
510
+ - Defection probe classifiers
511
+ ```
512
+
513
+ ---
514
+
515
+ ## Sources
516
+
517
+ **Google DeepMind:**
518
+ - [SIMA 2: Generalist AI Agent](https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/)
519
+ - [Gemini Robotics 1.5](https://deepmind.google/blog/gemini-robotics-15-brings-ai-agents-into-the-physical-world/)
520
+ - [Dreamer 4: World Model Training](https://danijar.com/project/dreamer4/)
521
+ - [Genie 3: World Models](https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/)
522
+ - [Scalable AI Safety via Debate](https://deepmind.google/research/publications/34920/)
523
+ - [Amplified Oversight](https://deepmindsafetyresearch.medium.com/human-ai-complementarity-a-goal-for-amplified-oversight-0ad8a44cae0a)
524
+ - [Technical AGI Safety Approach](https://arxiv.org/html/2504.01849v1)
525
+
526
+ **Anthropic:**
527
+ - [Constitutional AI](https://www.anthropic.com/research/constitutional-ai-harmlessness-from-ai-feedback)
528
+ - [Building Effective Agents](https://www.anthropic.com/research/building-effective-agents)
529
+ - [Claude Code Best Practices](https://www.anthropic.com/engineering/claude-code-best-practices)
530
+ - [Sleeper Agents Detection](https://www.anthropic.com/research/probes-catch-sleeper-agents)
531
+ - [Alignment Faking](https://www.anthropic.com/research/alignment-faking)
532
+ - [Visible Extended Thinking](https://www.anthropic.com/research/visible-extended-thinking)
533
+ - [Computer Use Safety](https://www.anthropic.com/news/3-5-models-and-computer-use)
534
+ - [Sabotage Evaluations](https://www.anthropic.com/research/sabotage-evaluations-for-frontier-models)