npm - loki-mode - Versions diffs - 5.1.3 → 5.2.3 - Mend

loki-mode 5.1.3 → 5.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/SKILL.md +26 -2
package/VERSION +1 -1
package/bin/postinstall.js +1 -1
package/docs/ACKNOWLEDGEMENTS.md +63 -1
package/docs/COMPARISON.md +123 -1
package/package.json +1 -1
package/references/core-workflow.md +14 -1
package/references/memory-system.md +1097 -1
package/skills/model-selection.md +231 -0
package/skills/quality-gates.md +283 -0
package/skills/troubleshooting.md +634 -1

package/SKILL.md CHANGED

@@ -3,7 +3,7 @@ name: loki-mode
 description: Multi-agent autonomous startup system. Triggers on "Loki Mode". Takes PRD to deployed product with zero human intervention. Requires --dangerously-skip-permissions flag.
 ---
-# Loki Mode v5.1.3
+# Loki Mode v5.2.3
 **You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.**
@@ -41,6 +41,13 @@ Every action follows this cycle. No exceptions.
 REASON: What is the highest priority unblocked task?
    |
    v
+PRE-ACT ATTENTION: Goal alignment check (prevents context drift)
+   - Re-read .loki/queue/current-task.json
+   - Verify: "Does my planned action serve task.goal?"
+   - Check: "Am I solving the original problem, not a tangent?"
+   - IF drift detected: Log to .loki/signals/DRIFT_DETECTED, return to REASON
+   |
+   v
 ACT: Execute it. Write code. Run commands. Commit atomically.
    |
    v
@@ -57,6 +64,12 @@ VERIFY: Run tests. Check build. Validate against spec.
                After 5 failures: Log to dead-letter queue, move to next task.
 ```
+**Why PRE-ACT ATTENTION matters** (from planning-with-files pattern):
+- Context drift is silent - agents don't notice they've drifted off-task
+- Forcing goal re-read before each action catches drift early
+- Prevents "correct solution to wrong problem" failure mode
+- Cost: One file read per action. Benefit: Catches misalignment before wasted work.
 ---
 ## PRIORITY 3: Autonomy Rules
@@ -128,8 +141,19 @@ GROWTH ──[continuous improvement loop]──> GROWTH
 | `.loki/CONTINUITY.md` | Every turn | Every turn |
 | `.loki/state/orchestrator.json` | Every turn | On phase change |
 | `.loki/queue/pending.json` | Every turn | When claiming/completing tasks |
+| `.loki/queue/current-task.json` | Before each ACT (PRE-ACT ATTENTION) | When claiming task |
+| `.loki/signals/DRIFT_DETECTED` | Never | When goal drift detected |
 | `.loki/specs/openapi.yaml` | Before API work | After API changes |
 | `skills/00-index.md` | Session start | Never |
+| `.loki/memory/index.json` | Session start | On topic change |
+| `.loki/memory/timeline.json` | On context need | After task completion |
+| `.loki/memory/token_economics.json` | Never (metrics only) | Every turn |
+| `.loki/memory/episodic/*.json` | On task-aware retrieval | After task completion |
+| `.loki/memory/semantic/patterns.json` | Before implementation tasks | On consolidation |
+| `.loki/memory/semantic/anti-patterns.json` | Before debugging tasks | On error learning |
+| `.loki/queue/dead-letter.json` | Session start | On task failure (5+ attempts) |
+| `.loki/signals/CONTEXT_CLEAR_REQUESTED` | Never | When context heavy |
+| `.loki/signals/HUMAN_REVIEW_NEEDED` | Never | When human decision required |
 ---
@@ -203,4 +227,4 @@ Auto-detected or force with `LOKI_COMPLEXITY`:
 ---
-**v5.1.3 | Multi-Provider Support | ~210 lines core**
+**v5.2.3 | CoVe + MemEvolve + Quality Gates | ~230 lines core**
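The PRE-ACT ATTENTION step added in this SKILL.md diff could be sketched roughly as below. This is a minimal illustration and not the package's implementation: the function name `pre_act_attention`, the `serves_goal` callback (standing in for the agent's own judgement), the `goal` field name, and the JSON-lines format of the signal file are all assumptions. The diff only states that the task file is re-read and that drift is logged to `.loki/signals/DRIFT_DETECTED`.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


def pre_act_attention(
    loki_dir: Path,
    planned_action: str,
    serves_goal: Callable[[str, str], bool],  # hypothetical judgement callback
) -> bool:
    """Return True if ACT may proceed, False if the loop should return to REASON."""
    # Re-read the task file on every call instead of trusting cached context.
    task_path = loki_dir / "queue" / "current-task.json"
    task = json.loads(task_path.read_text())
    goal = task["goal"]  # assumed field name, taken from "task.goal" in the diff

    if serves_goal(planned_action, goal):
        return True

    # Drift detected: append one record to the signal file (format is assumed).
    signal_path = loki_dir / "signals" / "DRIFT_DETECTED"
    signal_path.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "detected_at": datetime.now(timezone.utc).isoformat(),
        "goal": goal,
        "planned_action": planned_action,
    }
    with signal_path.open("a") as fh:
        fh.write(json.dumps(record) + "\n")
    return False
```

The cost matches what the diff describes: one file read per action, with the signal file written only on the drift path.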

package/VERSION CHANGED

@@ -1 +1 @@
-5.1.3
+5.2.3

package/bin/postinstall.js CHANGED

@@ -13,7 +13,7 @@ const skillDir = path.join(homeDir, '.claude', 'skills', 'loki-mode');
 const packageDir = path.join(__dirname, '..');
 console.log('');
-console.log('Loki Mode v5.1.3 installed!');
+console.log('Loki Mode v5.2.3 installed!');
 console.log('');
 // Try to create skill symlink

package/docs/ACKNOWLEDGEMENTS.md CHANGED

@@ -92,6 +92,18 @@ AWS Bedrock's multi-agent collaboration patterns inform Loki Mode's routing and
 | [Measurement Imbalance in Agentic AI](https://arxiv.org/abs/2506.02064) | arXiv 2506.02064 | Multi-dimensional evaluation axes |
 | [Demo-to-Deployment Gap](https://www.marktechpost.com/2025/12/24/) | Stanford/Harvard | Tool reliability vs tool selection |
+### Verification & Hallucination Reduction
+| Paper | Authors/Source | Contribution |
+|-------|----------------|--------------|
+| [Chain-of-Verification Reduces Hallucination in LLMs](https://arxiv.org/abs/2309.11495) | Dhuliawala et al., Meta AI, 2023 | 4-step verification (Draft -> Plan -> Execute -> Verify), factored execution, significant hallucination reduction (23% F1 improvement, ~77% reduction in hallucinated entities) |
+### Memory Systems
+| Paper | Authors/Source | Contribution |
+|-------|----------------|--------------|
+| [MemEvolve: Meta-Evolution of Agent Memory Systems](https://arxiv.org/abs/2512.18746) | Zhang et al., OPPO AI Agent Team, 2025 | Modular design (Encode/Store/Retrieve/Manage), task-aware strategy selection, 17.06% improvement via meta-evolution |
 ---
 ## Industry Resources
@@ -171,6 +183,10 @@ Key patterns incorporated from practitioner experience:
 | Debate Verification | DeepMind | Critical change verification |
 | One Feature at a Time | Anthropic Harness | Single feature per iteration, full verification |
 | E2E Browser Testing | Anthropic Harness | Playwright MCP for visual verification |
+| Chain-of-Verification | arXiv 2309.11495 | CoVe protocol in quality-gates.md |
+| Factored Verification | arXiv 2309.11495 | Independent verification execution |
+| Modular Memory Design | arXiv 2512.18746 | Encode/Store/Retrieve/Manage mapping in memory-system.md |
+| Task-Aware Memory Strategy | arXiv 2512.18746 | Retrieval weight adjustment by task type |
 ---
@@ -223,6 +239,52 @@ Key patterns incorporated from practitioner experience:
 ---
+## Community Projects (Open Source Claude Code Skills)
+The following open-source projects have pioneered patterns that influence or complement Loki Mode. Analyzed January 2026.
+### High-Impact Projects
+| Project | Stars | Key Patterns | Contribution to Loki Mode |
+|---------|-------|--------------|---------------------------|
+| [Superpowers (obra)](https://github.com/obra/superpowers) | 35K+ | Two-Stage Review, TDD Iron Law, Rationalization Tables | **ADOPTED**: Two-stage review (spec compliance THEN code quality) |
+| [agents (wshobson)](https://github.com/wshobson/agents) | 26K+ | 72 plugins, 108 agents, 129 skills, Four-Tier Model Strategy | Plugin marketplace architecture inspiration |
+| [claude-flow (ruvnet)](https://github.com/ruvnet/claude-flow) | 12K+ | Swarm topologies (hierarchical/mesh/ring/star), Consensus algorithms (Raft, Byzantine, CRDT) | Terminal-based orchestration patterns |
+| [oh-my-claudecode (Yeachan-Heo)](https://github.com/Yeachan-Heo/oh-my-claudecode) | N/A | 32 agents, 35 skills, Tiered architecture (LOW/MEDIUM/HIGH), Delegation-first | **ADOPTED**: Tiered agent escalation protocols |
+### Specialized Skills
+| Project | Focus | Key Patterns | Contribution to Loki Mode |
+|---------|-------|--------------|---------------------------|
+| [claude-mem (thedotmack)](https://github.com/thedotmack/claude-mem) | Memory | Progressive Disclosure (3-layer), SQLite + FTS5, Timeline compression | **ADOPTED**: 3-layer memory (index -> timeline -> full) |
+| [planning-with-files (OthmanAdi)](https://github.com/OthmanAdi/planning-with-files) | Planning | Manus-style 3-file pattern, PreToolUse attention hooks | **ADOPTED**: File-based planning persistence |
+| [claude-scientific-skills (K-Dense-AI)](https://github.com/K-Dense-AI/claude-scientific-skills) | Scientific | 140 domain-specific skills, modular organization | Domain organization patterns |
+| [claude-code-guide (zebbern)](https://github.com/zebbern/claude-code-guide) | Shortcuts | QNEW/QCODE/QCHECK patterns, structured reports | Shortcut command inspiration |
+### Key Patterns Adopted from Community
+| Pattern | Source | Implementation in Loki Mode |
+|---------|--------|----------------------------|
+| **Two-Stage Review** | Superpowers | Spec compliance review BEFORE code quality review |
+| **Rationalization Tables** | Superpowers | Explicit counters to common agent excuses/rationalizations |
+| **Progressive Disclosure Memory** | claude-mem | 3-layer context: index -> timeline -> full details |
+| **Tiered Agent Escalation** | oh-my-claudecode | LOW -> MEDIUM -> HIGH with explicit escalation triggers |
+| **File-Based Planning** | planning-with-files | Persistent markdown files (task_plan.md, findings.md, progress.md) |
+| **PreToolUse Attention** | planning-with-files | Re-read goals before actions to combat context drift |
+| **Fresh Subagent Per Task** | Superpowers | Clean context for each major task, prevents cross-contamination |
+### Patterns Under Evaluation
+| Pattern | Source | Status | Notes |
+|---------|--------|--------|-------|
+| **Token Economics Tracking** | claude-mem | Evaluating | discovery_tokens vs read_tokens for compression analysis |
+| **Delegation Enforcer Middleware** | oh-my-claudecode | Evaluating | Auto-inject model parameters based on task tier |
+| **Swarm Topologies** | claude-flow | Not adopted | Adds complexity beyond hierarchical orchestration |
+| **Consensus Algorithms** | claude-flow | Not adopted | Byzantine/Raft overkill for single-user autonomous operation |
+| **Shortcut Commands** | claude-code-guide | Evaluating | QNEW/QCODE/QCHECK for rapid task switching |
+---
 ## License
 This acknowledgements file documents the research and resources that influenced Loki Mode's design. All referenced works retain their original licenses and copyrights.
@@ -231,4 +293,4 @@ Loki Mode itself is released under the MIT License.
 ---
-*Last updated: v4.1.0*
+*Last updated: v5.1.3*
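The Two-Stage Review pattern listed as adopted in this ACKNOWLEDGEMENTS.md diff (spec compliance first, then code quality) might look something like the following. It is a sketch under stated assumptions: `ReviewResult`, `Reviewer`, and `two_stage_review` are hypothetical names, and skipping stage 2 when stage 1 fails is one reading of "BEFORE", not something the diff confirms.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ReviewResult:
    stage: str
    passed: bool
    issues: List[str] = field(default_factory=list)


# A reviewer takes a change (here just a diff string) and returns found issues.
Reviewer = Callable[[str], List[str]]


def two_stage_review(
    diff: str, spec_review: Reviewer, quality_review: Reviewer
) -> List[ReviewResult]:
    """Run spec compliance first; run code quality only if stage 1 is clean."""
    results: List[ReviewResult] = []

    # Stage 1: does the change match the spec?
    spec_issues = spec_review(diff)
    results.append(ReviewResult("spec_compliance", not spec_issues, spec_issues))
    if spec_issues:
        # Assumption: quality feedback on off-spec code is wasted work, so stop.
        return results

    # Stage 2: is the (spec-compliant) change well written?
    quality_issues = quality_review(diff)
    results.append(ReviewResult("code_quality", not quality_issues, quality_issues))
    return results
```

Keeping the two reviewers as separate callables means neither stage sees the other's findings, which is the separation the pattern asks for.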

package/docs/COMPARISON.md CHANGED

@@ -1,6 +1,6 @@
 # Autonomous Coding Agents Comparison (2025-2026)
-> Last Updated: January 17, 2026 (v2.36.8)
+> Last Updated: January 25, 2026 (v2.36.9)
 >
 > A comprehensive comparison of Loki Mode against major autonomous coding agents and AI IDEs in the market.
 > Deep-dive comparisons validated by Opus feedback loops.
@@ -193,6 +193,117 @@
 ---
+## Open Source Claude Code Skills Comparison (v2.36.9)
+**Comprehensive analysis of 8 leading open-source Claude Code skills/extensions. Honest assessment of what Loki Mode lacks and does well.**
+### Feature Comparison
+| Feature | **Loki Mode** | **Superpowers** | **agents** | **claude-flow** | **oh-my-claudecode** | **claude-mem** |
+|---------|--------------|-----------------|------------|-----------------|---------------------|----------------|
+| **Stars** | 500+ | 35K+ | 26K+ | 12K+ | N/A | N/A |
+| **Agents** | 37 in 7 swarms | Fresh per task | 108 agents | Swarm-based | 32 agents | N/A |
+| **Skills** | Progressive disclosure | N/A | 129 skills | N/A | 35 skills | Memory focus |
+| **Multi-Provider** | Yes (Claude/Codex/Gemini) | No | No | No | No | No |
+| **Memory System** | 3-tier (episodic/semantic/procedural) | N/A | N/A | Hybrid | N/A | SQLite+FTS5 |
+| **Quality Gates** | 7 gates | Two-Stage Review | N/A | Consensus | Tiered | N/A |
+### What Loki Mode LACKS (Honest Assessment)
+These are patterns from competing projects that are **practically and scientifically superior** to Loki Mode's current implementation:
+| Gap | Source | Why It Matters | Status |
+|-----|--------|----------------|--------|
+| **Two-Stage Review** | Superpowers | Separating spec compliance from code quality prevents spec drift. | **IMPLEMENTED** (quality-gates.md lines 285-380) |
+| **Rationalization Tables** | Superpowers | Explicit counters to common agent excuses ("I'll refactor later", "This is edge case"). | **IMPLEMENTED** (troubleshooting.md lines 48-112) |
+| **Progressive Disclosure Memory** | claude-mem | 3-layer (index -> timeline -> full) is more efficient than flat memory. Reduces token usage by 60-80% on context recall. | **IMPLEMENTED** (memory-system.md lines 710-1018) |
+| **Token Economics Tracking** | claude-mem | Tracking discovery_tokens vs read_tokens identifies context bloat. Loki Mode has no visibility into token efficiency. | **IMPLEMENTED** (memory-system.md lines 855-893) |
+| **File-Based Planning Persistence** | planning-with-files | Manus-style 3-file pattern (task_plan.md, findings.md, progress.md) survives session restarts. Loki Mode loses planning context on crash. | **MEDIUM** |
+| **PreToolUse Attention Hooks** | planning-with-files | Re-reading goals BEFORE each action combats context drift. Loki Mode relies on RARV but doesn't enforce pre-action goal review. | **IMPLEMENTED** (SKILL.md lines 44-71) |
+| **Delegation Enforcer Middleware** | oh-my-claudecode | Auto-injecting model parameters prevents wrong-model-for-task. Loki Mode relies on agent discipline. | **LOW** |
+| **Shortcut Commands** | claude-code-guide | QNEW/QCODE/QCHECK patterns enable rapid task switching. Loki Mode requires full prompts. | **LOW** |
+### What Loki Mode Does WELL
+| Strength | Details | Competitors Lacking This |
+|----------|---------|-------------------------|
+| **Multi-Provider Support** | Only skill supporting Claude, Codex, and Gemini with graceful degradation | All 8 competitors are Claude-only |
+| **RARV Cycle** | Reason-Act-Reflect-Verify is more rigorous than Plan-Execute | Most use simple Plan-Execute |
+| **7-Gate Quality System** | Static analysis + 3 reviewers + devil's advocate + anti-sycophancy + severity blocking + coverage + debate | Superpowers has 2-stage, others have less |
+| **Constitutional AI Integration** | Principles-based self-critique from Anthropic research | None have this |
+| **Anti-Sycophancy (CONSENSAGENT)** | Blind review + devil's advocate prevents groupthink | None have this |
+| **Provider Abstraction Layer** | Clean degradation from full-featured to sequential-only | Claude-only projects can't degrade |
+| **37 Specialized Agents** | Purpose-built agents in 7 swarms vs generic | agents (108) has more but less organized |
+| **Research Foundation** | 10+ academic papers integrated with citations | Most have no research backing |
+### Superpowers Deep-Dive (35K+ Stars)
+The most influential open-source Claude Code skill. Key patterns:
+| Pattern | Description | Loki Mode Status |
+|---------|-------------|------------------|
+| **Two-Stage Review** | Stage 1: Does code match spec? Stage 2: Is code quality good? Never mix. | **IMPLEMENTED** (quality-gates.md) |
+| **TDD Iron Law** | Write failing test BEFORE implementation. No exceptions. | Already in testing.md |
+| **Rationalization Tables** | Explicit list of agent excuses with counters | **IMPLEMENTED** (troubleshooting.md) |
+| **Fresh Subagent Per Task** | New context for each major task, prevents cross-contamination | Already via Task tool |
+| **Red Flag Detection** | Patterns indicating agent is rationalizing (hedging, scope changes) | **IMPLEMENTED** (troubleshooting.md lines 71-103) |
+### agents Deep-Dive (26K+ Stars)
+Plugin marketplace architecture with unprecedented scale:
+| Pattern | Description | Loki Mode Status |
+|---------|-------------|------------------|
+| **72 Plugins** | Modular, focused plugins instead of monolith | Different approach (progressive disclosure) |
+| **108 Agents** | Specialized agents for specific domains | 37 agents in Loki Mode |
+| **129 Skills** | Skills as first-class objects | 10 skills in skills/ |
+| **Four-Tier Model Strategy** | Explicit tier selection with constraints | Similar to Loki Mode tiers |
+### claude-mem Deep-Dive
+Memory-focused skill with superior context management:
+| Pattern | Description | Loki Mode Status |
+|---------|-------------|------------------|
+| **Progressive Disclosure** | 3-layer: index (100 tokens) -> timeline (500 tokens) -> full (unlimited) | **IMPLEMENTED** (memory-system.md lines 710-1018) |
+| **SQLite + FTS5** | Full-text search on memory | Loki Mode uses file-based |
+| **Timeline Compression** | Compress old memories, keep recent detailed | **TO ADOPT** |
+| **Token Economics** | Track tokens per operation for optimization | **IMPLEMENTED** (memory-system.md lines 855-946) |
+### oh-my-claudecode Deep-Dive
+Tiered agent architecture with explicit escalation:
+| Pattern | Description | Loki Mode Status |
+|---------|-------------|------------------|
+| **32 Agents** | Smaller but well-organized agent set | 37 in Loki Mode |
+| **35 Skills** | Domain-specific skills | 10 skills in Loki Mode |
+| **Tiered Architecture** | LOW/MEDIUM/HIGH with explicit triggers | **IMPLEMENTED** (model-selection.md lines 180-363) |
+| **Delegation Enforcer** | Middleware auto-injects correct model | Evaluating |
+| **Delegation-First** | Agents must delegate before acting directly | Different approach |
+### Actionable Improvements for Loki Mode
+**Phase 1: Critical (v5.2.0)** - COMPLETED
+1. ~~Implement Two-Stage Review in quality-gates.md~~ - DONE (lines 285-380)
+2. ~~Add Rationalization Tables to troubleshooting.md~~ - DONE (lines 48-112)
+3. ~~Add Red Flag Detection patterns~~ - DONE (troubleshooting.md lines 71-103)
+**Phase 2: Critical (v5.2.0)** - COMPLETED
+4. ~~Implement Progressive Disclosure Memory (3-layer)~~ - DONE (memory-system.md lines 710-1018)
+5. ~~Add Token Economics Tracking to metrics~~ - DONE (memory-system.md lines 855-893)
+6. ~~Add PreToolUse Attention Hooks~~ - DONE (SKILL.md lines 44-71)
+**Phase 3: Medium Priority (v5.4.0)**
+7. File-Based Planning Persistence (Manus-style)
+8. Timeline Compression for memory
+**Phase 4: Evaluation (Future)**
+9. Shortcut Commands (QNEW/QCODE)
+10. Delegation Enforcer Middleware
+---
 ## Deep-Dive Comparison Results
 ### Patterns Adopted from Each Competitor
@@ -301,6 +412,16 @@ Each comparison was validated through:
 - [Google Antigravity Blog](https://developers.googleblog.com/build-with-google-antigravity-our-new-agentic-development-platform/)
 - [Amazon Q Developer Features](https://aws.amazon.com/q/developer/features/)
+### Open Source Claude Code Skills (v2.36.9)
+- [Superpowers (obra)](https://github.com/obra/superpowers) - 35K+ stars
+- [agents (wshobson)](https://github.com/wshobson/agents) - 26K+ stars
+- [claude-flow (ruvnet)](https://github.com/ruvnet/claude-flow) - 12K+ stars
+- [oh-my-claudecode (Yeachan-Heo)](https://github.com/Yeachan-Heo/oh-my-claudecode)
+- [claude-mem (thedotmack)](https://github.com/thedotmack/claude-mem)
+- [planning-with-files (OthmanAdi)](https://github.com/OthmanAdi/planning-with-files)
+- [claude-scientific-skills (K-Dense-AI)](https://github.com/K-Dense-AI/claude-scientific-skills)
+- [claude-code-guide (zebbern)](https://github.com/zebbern/claude-code-guide)
 ### Additional Sources
 - [Faros AI - Best AI Coding Agents 2026](https://www.faros.ai/blog/best-ai-coding-agents-2026)
 - [Artificial Analysis - Coding Agents Comparison](https://artificialanalysis.ai/insights/coding-agents-comparison)
@@ -319,6 +440,7 @@ Each comparison was validated through:
 | v2.36.5 | 2026-01-15 | Antigravity, Amazon Q |
 | v2.36.7 | 2026-01-17 | Zencoder/Zenflow |
 | v2.36.8 | 2026-01-17 | Model assignment update (Opus for SDLC phases) |
+| v2.36.9 | 2026-01-25 | Open Source Claude Code Skills (8 repos: Superpowers, agents, claude-flow, oh-my-claudecode, claude-mem, planning-with-files, claude-scientific-skills, claude-code-guide) |
 ---
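The 3-layer progressive disclosure described in this COMPARISON.md diff (index at about 100 tokens, timeline at about 500 tokens, full details unlimited) can be illustrated with a small sketch. Everything beyond those three layer names and budgets is an assumption: the `loaders` mapping, the `is_sufficient` callback, and the word-count token estimate are hypothetical stand-ins, and the actual logic in memory-system.md is not shown in this diff.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Layer budgets taken from the comparison table; None means unlimited.
LAYERS: List[Tuple[str, Optional[int]]] = [
    ("index", 100),
    ("timeline", 500),
    ("full", None),
]


def approx_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def truncate(text: str, budget: Optional[int]) -> str:
    if budget is None:
        return text
    return " ".join(text.split()[:budget])


def progressive_recall(
    loaders: Dict[str, Callable[[], str]],
    is_sufficient: Callable[[str], bool],
) -> Tuple[str, str, int]:
    """Load layers cheapest first and stop at the first one that suffices.

    Returns (layer name, layer text, approximate tokens spent across layers).
    If no layer suffices, the last (full) layer is returned.
    """
    spent = 0
    name, text = "", ""
    for name, budget in LAYERS:
        text = truncate(loaders[name](), budget)  # loader runs only when reached
        spent += approx_tokens(text)
        if is_sufficient(text):
            break
    return name, text, spent
```

Because each loader is called lazily, a question answered by the index or timeline never pays for reading the full record, which is where the claimed token savings would come from.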

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "loki-mode",
-  "version": "5.1.3",
+  "version": "5.2.3",
   "description": "Multi-agent autonomous startup system for Claude Code, Codex CLI, and Gemini CLI",
   "keywords": [
     "claude",

package/references/core-workflow.md CHANGED

@@ -33,6 +33,13 @@ Every iteration follows this cycle:
 | - Identify highest priority unblocked task                        |
 | - Determine exact steps to complete it                            |
 +-------------------------------------------------------------------+
+| PRE-ACT ATTENTION: Goal alignment check (prevents context drift)  |
+| - Re-read .loki/queue/current-task.json                           |
+| - Verify: "Does my planned action serve task.goal?"               |
+| - Check: "Am I solving the original problem, not a tangent?"      |
+| - IF drift detected: Log to .loki/signals/DRIFT_DETECTED,         |
+|   return to REASON                                                |
++-------------------------------------------------------------------+
 | ACT: Execute the task                                             |
 | - Dispatch subagent via Task tool OR execute directly             |
 | - Write code, run tests, fix issues                               |
@@ -64,12 +71,18 @@ Every iteration follows this cycle:
 +-------------------------------------------------------------------+
 ```
-**Key Enhancement:** The VERIFY step creates a feedback loop where the AI:
+**Key Enhancement (VERIFY):** The VERIFY step creates a feedback loop where the AI:
 - Tests every change automatically
 - Learns from failures by updating CONTINUITY.md
 - Retries with learned context
 - Achieves 2-3x quality improvement (Boris Cherny's observed result)
+**Key Enhancement (PRE-ACT ATTENTION):** The planning-with-files pattern adds a goal alignment check:
+- Context drift is silent - agents don't notice they've drifted off-task
+- Forcing goal re-read before each action catches drift early
+- Prevents "correct solution to wrong problem" failure mode
+- Cost: One file read per action. Benefit: Catches misalignment before wasted work.
 ---
 ## CONTINUITY.md - Working Memory Protocol
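The VERIFY feedback loop in the core-workflow.md diff above (retry with learned context, then give up to the dead-letter queue after 5 failures, as SKILL.md states) could be sketched as follows. The names `run_with_dead_letter` and `attempt`, the in-memory `learned` list (the package records learnings in CONTINUITY.md instead), and the JSON-array layout of `dead-letter.json` are all assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Callable, List, Optional

MAX_ATTEMPTS = 5  # from "After 5 failures: Log to dead-letter queue"


def run_with_dead_letter(
    task: dict,
    attempt: Callable[[dict, List[str]], Optional[str]],
    dead_letter_path: Path,
) -> bool:
    """Try a task up to MAX_ATTEMPTS times, feeding earlier errors back in.

    `attempt` returns None on success or an error message on failure.
    Returns True on success, False if the task was dead-lettered.
    """
    learned: List[str] = []  # stand-in for learnings kept in CONTINUITY.md
    for _ in range(MAX_ATTEMPTS):
        error = attempt(task, learned)
        if error is None:
            return True
        learned.append(error)

    # All attempts failed: append to the dead-letter file (layout is assumed).
    entries = []
    if dead_letter_path.exists():
        entries = json.loads(dead_letter_path.read_text())
    entries.append({"task": task, "errors": learned})
    dead_letter_path.parent.mkdir(parents=True, exist_ok=True)
    dead_letter_path.write_text(json.dumps(entries, indent=2))
    return False
```

A caller would treat a `False` return as the cue to move on to the next task rather than to stop.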