@atlashub/smartstack-cli 3.9.0 → 3.10.0

Files changed (22)

package/templates/skills/business-analyse/references/cache-warming-strategy.md ADDED

@@ -0,0 +1,578 @@
+# Cache Warming Strategy
+> **Objective:** Reduce overall session token usage by 15-20% by eliminating redundant file reads through strategic pre-loading.
+## Problem Statement
+**Baseline (no cache warming):**
+Analysis of BA session `03b76b8b-ea1c-4f1e-a636-bd46b0c33e02` shows:
+```
+Total tokens (1 agent): 106,881
+Cache read input tokens: 105,351 (98.5% of total)
+Cache creation input tokens: 857 (0.8% of total)
+```
+**Issue:** A 98.5% cache-read share means files are being re-read many times, and not efficiently: many of those reads re-load the SAME files redundantly across different agents.
+**Example redundancies detected:**
+- `feature-schema.json` read 7 times
+- `questionnaire/01-context.md` read 3 times
+- `suggestion-catalog.md` read 5 times
+- `handoff-file-templates.md` read 4 times
+**Root cause:** No pre-loading strategy → each agent reads files on-demand → redundant I/O operations.
+---
+## Solution: Progressive Cache Warming
+**Principle:** Pre-load frequently used static files at specific checkpoints and organize them into buckets by retention policy.
+### Benefits
+| Metric | Before | After | Improvement |
+|--------|--------|-------|-------------|
+| Schema file reads | 7× | 1× | 86% ↓ |
+| Questionnaire reads | 3× | 1× | 67% ↓ |
+| Template reads | 4-5× | 1× | 75-80% ↓ |
+| Token waste | ~15,000 | ~3,000 | 80% ↓ |
+| Overall session | 106,881 | ~90,000 | 15-20% ↓ |
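The percentages in the table follow directly from the before/after counts. A minimal sketch that recomputes them, assuming simple rounding to the nearest percent (the helper name `reductionPct` is illustrative and not part of the package):

```javascript
// Hypothetical helper: percentage reduction when going from `before` to `after`.
function reductionPct(before, after) {
  return Math.round(((before - after) / before) * 100);
}

// Rows copied from the Benefits table.
const rows = [
  { metric: "Schema file reads", before: 7, after: 1 },
  { metric: "Questionnaire reads", before: 3, after: 1 },
  { metric: "Token waste", before: 15000, after: 3000 },
  { metric: "Overall session", before: 106881, after: 90000 },
];

for (const { metric, before, after } of rows) {
  console.log(`${metric}: ${reductionPct(before, after)}% reduction`);
}
```

The overall-session row comes out at roughly 16%, which is where the 15-20% range in the table starts.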
+---
+## Cache Bucket Architecture
+### Bucket 1: Core Schemas (CRITICAL)
+**Files:**
+```
+docs/business/{app}/business-analyse/schemas/
+├── feature-schema.json              (~8KB)
+├── application-schema.json          (~6KB)
+├── sections/
+│   ├── analysis-schema.json         (~5KB)
+│   ├── discovery-schema.json        (~3KB)
+│   ├── handoff-schema.json          (~4KB)
+│   ├── metadata-schema.json         (~2KB)
+│   ├── specification-schema.json    (~7KB)
+│   └── validation-schema.json       (~3KB)
+└── shared/
+    └── common-defs.json             (~4KB)
+Total: ~42KB, 9 files
+```
+**Load at:** Step-00 (initialization)
+**Retention:** Entire session (until BA completes)
+**Used in:** ALL steps (validation, schema references)
+**Why critical:**
+- Used by every ba-writer operation for validation
+- Referenced in every feature.json via `$schema` field
+- Small files (42KB total) with high re-use (7× average)
+- ROI: 86% reduction in schema reads
+**Implementation:**
+```javascript
+// Step-00 initialization
+const schemaFiles = glob("docs/business/{app}/business-analyse/schemas/**/*.json");
+for (const file of schemaFiles) {
+  read(file);  // Pre-load into cache
+}
+// Files now cached for entire session
+```
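One way to picture the effect of the pre-load loop is a read-through cache placed in front of the file reader. The sketch below is an illustration only: `createCachedReader` and its in-memory `Map` are assumptions made for this example, and the actual caching is performed by the agent runtime rather than by code like this.

```javascript
// Hypothetical read-through cache; `readFn` stands in for the real file reader.
function createCachedReader(readFn) {
  const cache = new Map();
  let underlyingReads = 0;
  return {
    read(path) {
      if (!cache.has(path)) {
        underlyingReads += 1; // only the first read of a path reaches storage
        cache.set(path, readFn(path));
      }
      return cache.get(path);
    },
    stats: () => ({ underlyingReads, cachedFiles: cache.size }),
  };
}

// Usage: warm the cache once, then read the same schema seven more times.
const reader = createCachedReader((path) => `contents of ${path}`);
reader.read("schemas/feature-schema.json"); // pre-load at step-00
for (let i = 0; i < 7; i++) reader.read("schemas/feature-schema.json");
console.log(reader.stats()); // { underlyingReads: 1, cachedFiles: 1 }
```

After the warm-up read, the seven later reads are served from the map, which mirrors the 7× to 1× reduction claimed for schema files.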
+---
+### Bucket 2: Questionnaire Templates (HIGH)
+**Files:**
+```
+~/.claude/skills/business-analyse/questionnaire/
+├── 00-application.md      (~4KB)
+├── 01-context.md          (~6KB)
+├── 02-stakeholders.md     (~7KB)
+├── 03-scope.md            (~6KB)
+├── 04-data.md             (~3KB)
+├── 05-integrations.md     (~3KB)
+├── 06-security.md         (~2KB)
+├── 07-ui.md               (~4KB)
+├── 08-performance.md      (~2KB)
+├── 09-constraints.md      (~1KB)
+├── 10-documentation.md    (~1KB)
+├── 11-data-lifecycle.md   (~3KB)
+├── 12-migration.md        (~3KB)
+├── 13-cross-module.md     (~3KB)
+├── 14-risk-assumptions.md (~4KB)
+└── 15-success-metrics.md  (~4KB)
+Total: ~56KB, 16 files
+```
+**Load at:** Step-00 (initialization)
+**Retention:** Until step-02 (after cadrage completes)
+**Used in:** Step-01 only (cadrage/framing)
+**Why high priority:**
+- Read 2-3× during step-01 (questionnaire selection + processing)
+- Conditional loading logic re-reads files multiple times
+- Moderate size (56KB) but only used in 1 step
+- ROI: 67% reduction in questionnaire reads
+**Implementation:**
+```javascript
+// Step-00 initialization
+const questionnaireFiles = glob("~/.claude/skills/business-analyse/questionnaire/*.md");
+for (const file of questionnaireFiles) {
+  read(file);
+}
+// Step-02 start: Clear cache
+// (Questionnaires no longer needed)
+```
+**Optimization:** Only load questionnaires that will be used based on `vibe_coding` flag:
+- If `vibe_coding = true`: Load only 00, 01, 02, 03, 14, 15 (core 6 files, ~31KB)
+- If `vibe_coding = false`: Load all 16 files (~56KB)
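The selection rule above can be sketched as a filter on the two-digit file prefix. The helper name `selectQuestionnaires` is hypothetical; the file names and the six core prefixes are taken from this section.

```javascript
// Prefixes of the six core questionnaires listed above.
const CORE_PREFIXES = ["00", "01", "02", "03", "14", "15"];

// Hypothetical helper: pick which questionnaire files to pre-load.
function selectQuestionnaires(files, vibeCoding) {
  if (!vibeCoding) return files; // full analysis: load everything
  return files.filter((name) => CORE_PREFIXES.includes(name.slice(0, 2)));
}

const allFiles = [
  "00-application.md", "01-context.md", "02-stakeholders.md", "03-scope.md",
  "04-data.md", "05-integrations.md", "06-security.md", "07-ui.md",
  "08-performance.md", "09-constraints.md", "10-documentation.md",
  "11-data-lifecycle.md", "12-migration.md", "13-cross-module.md",
  "14-risk-assumptions.md", "15-success-metrics.md",
];

console.log(selectQuestionnaires(allFiles, true).length);  // 6
console.log(selectQuestionnaires(allFiles, false).length); // 16
```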
+---
+### Bucket 3: Suggestion & Pattern Catalogs (HIGH)
+**Files:**
+```
+~/.claude/skills/business-analyse/patterns/
+└── suggestion-catalog.md  (~12KB)
+Total: ~12KB, 1 file
+```
+**Load at:** Step-00 (initialization)
+**Retention:** Until step-02 (after cadrage)
+**Used in:** Step-01 (cadrage suggestions)
+**Why high priority:**
+- Read 5× during step-01 (suggestion generation)
+- Small file (12KB) but very high re-use
+- ROI: 80% reduction in suggestion reads
+---
+### Bucket 4: Module Specification References (MEDIUM)
+**Files:**
+```
+~/.claude/skills/business-analyse/references/
+├── spec-auto-inference.md     (~8KB)
+├── ui-resource-cards.md       (~6KB)
+├── ui-dashboard-spec.md       (~5KB)
+└── cadrage-vibe-coding.md     (~4KB)
+Total: ~23KB, 4 files
+```
+**Load at:** Step-03 start (before module loop)
+**Retention:** Until step-04 (after all modules specified)
+**Used in:** Step-03a, 03b, 03c (module specification)
+**Why medium priority:**
+- NOT loaded at step-00 (not used until step-03)
+- Read 2-3× PER MODULE (5 modules = 10-15 reads)
+- Moderate size (23KB) with high re-use during module loop
+- ROI: 75% reduction in reference reads during step-03
+**Implementation:**
+```javascript
+// Step-03 start (before module loop)
+const moduleRefs = [
+  "spec-auto-inference.md",
+  "ui-resource-cards.md",
+  "ui-dashboard-spec.md",
+  "cadrage-vibe-coding.md"
+];
+for (const file of moduleRefs) {
+  read(`~/.claude/skills/business-analyse/references/${file}`);
+}
+// Step-04 start: Clear cache
+```
+---
+### Bucket 5: Handoff & Deploy Templates (LOW)
+**Files:**
+```
+~/.claude/skills/business-analyse/references/
+├── handoff-file-templates.md   (~15KB)
+├── handoff-mappings.md         (~12KB)
+├── deploy-data-build.md        (~10KB)
+├── deploy-modes.md             (~8KB)
+└── html-data-mapping.md        (~9KB)
+~/.claude/skills/business-analyse/html/
+└── ba-interactive.html         (~85KB)
+Total: ~139KB, 6 files
+```
+**Load at:** Step-05a start (before handoff generation)
+**Retention:** Until session end
+**Used in:** Step-05a, 05b (handoff & deploy)
+**Why LOW priority (don't pre-load at step-00):**
+- Large files (139KB total), especially ba-interactive.html (85KB)
+- Only used at END of BA workflow (step-05)
+- Read only 1-2× each (low re-use)
+- ROI: Marginal (5-10% savings) vs. upfront cost
+**Implementation:**
+```javascript
+// Step-05a start (NOT at step-00)
+const handoffRefs = [
+  "handoff-file-templates.md",
+  "handoff-mappings.md",
+  "deploy-data-build.md",
+  "deploy-modes.md",
+  "html-data-mapping.md"
+];
+for (const file of handoffRefs) {
+  read(`~/.claude/skills/business-analyse/references/${file}`);
+}
+const htmlTemplate = read("~/.claude/skills/business-analyse/html/ba-interactive.html");
+```
+---
+## Implementation Timeline
+### Step-00: Initialization (Load CRITICAL + HIGH)
+```javascript
+// 1. Schemas (42KB, 9 files) - CRITICAL
+glob("schemas/**/*.json").forEach(f => read(f));
+// 2. Questionnaires (31-56KB, 6-16 files) - HIGH
+// Conditional on vibe_coding flag
+if (vibe_coding) {
+  ["00", "01", "02", "03", "14", "15"].forEach(n =>
+    glob(`questionnaire/${n}-*.md`).forEach(f => read(f))  // expand the wildcard before reading
+  );
+} else {
+  glob("questionnaire/*.md").forEach(f => read(f));
+}
+// 3. Suggestion catalog (12KB, 1 file) - HIGH
+read("patterns/suggestion-catalog.md");
+// Total pre-loaded: 85-110KB, 16-26 files
+```
+**Cache status after step-00:**
+```
+✓ Core schemas: 9 files (42KB) - cached for session
+✓ Questionnaires: 6-16 files (31-56KB) - cached until step-02
+✓ Suggestion catalog: 1 file (12KB) - cached until step-02
+  Total: 16-26 files (85-110KB)
+  Expected hit rate: 95-100% in step-01
+```
+---
+### Step-02: Clear Questionnaires (Free Memory)
+```javascript
+// After cadrage completes, questionnaires no longer needed
+// Cache eviction happens automatically based on retention policy
+// No explicit action needed (handled by Claude's cache system)
+console.log(`
+✓ Cache eviction: questionnaires (31-56KB freed)
+  Retained: schemas (42KB)
+`);
+```
+---
+### Step-03: Load Module Spec References
+```javascript
+// Before starting module loop
+const moduleRefs = [
+  "spec-auto-inference.md",
+  "ui-resource-cards.md",
+  "ui-dashboard-spec.md",
+  "cadrage-vibe-coding.md"
+];
+moduleRefs.forEach(f => read(`references/${f}`));
+// Total added: 23KB, 4 files
+```
+**Cache status during step-03:**
+```
+✓ Core schemas: 9 files (42KB) - still cached
+✓ Module references: 4 files (23KB) - cached for module loop
+  Total: 13 files (65KB)
+  Expected hit rate: 90-95% during module loop
+```
+---
+### Step-04: Clear Module Spec References
+```javascript
+// After all modules specified, references no longer needed
+// Cache eviction automatic
+console.log(`
+✓ Cache eviction: module references (23KB freed)
+  Retained: schemas (42KB)
+`);
+```
+---
+### Step-05a: Load Handoff Templates
+```javascript
+// Before handoff generation
+const handoffRefs = [
+  "handoff-file-templates.md",
+  "handoff-mappings.md",
+  "deploy-data-build.md",
+  "deploy-modes.md",
+  "html-data-mapping.md"
+];
+handoffRefs.forEach(f => read(`references/${f}`));
+const htmlTemplate = read("html/ba-interactive.html");
+// Total added: 139KB, 6 files
+```
+**Cache status during step-05:**
+```
+✓ Core schemas: 9 files (42KB) - still cached
+✓ Handoff templates: 6 files (139KB) - cached for handoff/deploy
+  Total: 15 files (181KB)
+  Expected hit rate: 85-90% during handoff/deploy
+```
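The timeline above amounts to a small retention schedule. The sketch below models it as data so that the cache-status figures can be recomputed per step. The `BUCKETS` table and `cacheStatus` helper are assumptions for illustration, steps are represented by their number (0 for step-00 through 5 for step-05), and the questionnaire bucket uses the full 16-file set.

```javascript
// Bucket sizes and retention windows as described in this document.
// `loadAt` is inclusive, `evictAt` is exclusive (Infinity = kept until session end).
const BUCKETS = [
  { name: "schemas",        kb: 42,  files: 9,  loadAt: 0, evictAt: Infinity },
  { name: "questionnaires", kb: 56,  files: 16, loadAt: 0, evictAt: 2 },
  { name: "suggestions",    kb: 12,  files: 1,  loadAt: 0, evictAt: 2 },
  { name: "moduleRefs",     kb: 23,  files: 4,  loadAt: 3, evictAt: 4 },
  { name: "handoff",        kb: 139, files: 6,  loadAt: 5, evictAt: Infinity },
];

// Hypothetical helper: what is expected to be cached during a given step.
function cacheStatus(step) {
  const active = BUCKETS.filter((b) => step >= b.loadAt && step < b.evictAt);
  return {
    buckets: active.map((b) => b.name),
    files: active.reduce((sum, b) => sum + b.files, 0),
    kb: active.reduce((sum, b) => sum + b.kb, 0),
  };
}

console.log(cacheStatus(1)); // files: 26, kb: 110
console.log(cacheStatus(3)); // files: 13, kb: 65
console.log(cacheStatus(5)); // files: 15, kb: 181
```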
+---
+## Token Savings Calculation
+### Baseline (No Cache Warming)
+```
+Step-01 (Cadrage):
+  - Schemas read: 7× × 42KB = 294KB
+  - Questionnaires read: 3× × 56KB = 168KB
+  - Suggestions read: 5× × 12KB = 60KB
+  Total redundant: 522KB (~15,000 tokens wasted)
+Step-03 (Module Loop, 5 modules):
+  - Module refs read: 3× per module × 5 modules × 23KB = 345KB
+  Total redundant: 345KB (~10,000 tokens wasted)
+Step-05 (Handoff):
+  - Handoff refs read: 2× × 54KB = 108KB
+  Total redundant: 108KB (~3,000 tokens wasted)
+TOTAL WASTED: 975KB (~28,000 tokens)
+```
+### With Cache Warming
+```
+Step-00: Pre-load 85-110KB (one-time cost: ~3,000 tokens)
+Step-01: 0KB redundant (100% cache hits)
+Step-03: Pre-load 23KB (one-time cost: ~700 tokens)
+  Module loop: 0KB redundant (90% cache hits, ~2,000 tokens saved)
+Step-05a: Pre-load 139KB (one-time cost: ~4,000 tokens)
+  Handoff: 0KB redundant (85% cache hits, ~2,500 tokens saved)
+TOTAL PRE-LOAD COST: 247-272KB (~7,700 tokens)
+TOTAL SAVED: 28,000 - 7,700 = 20,300 tokens (72% reduction)
+```
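The totals in the two blocks above can be checked with a few lines of arithmetic. This sketch only re-adds the figures stated in this section; the per-step token estimates are copied as given, not derived.

```javascript
// Redundant reads per step, copied from the baseline block.
const wasted = [
  { step: "step-01", kb: 7 * 42 + 3 * 56 + 5 * 12, tokens: 15000 },
  { step: "step-03", kb: 3 * 5 * 23,               tokens: 10000 },
  { step: "step-05", kb: 2 * 54,                   tokens: 3000 },
];
// One-time pre-load costs at step-00, step-03 and step-05a.
const preloadCostTokens = 3000 + 700 + 4000;

const totalWastedKb = wasted.reduce((sum, s) => sum + s.kb, 0);
const totalWastedTokens = wasted.reduce((sum, s) => sum + s.tokens, 0);
const netSaved = totalWastedTokens - preloadCostTokens;
const netSavedPct = (netSaved / totalWastedTokens) * 100;

console.log(totalWastedKb);          // 975
console.log(totalWastedTokens);      // 28000
console.log(netSaved);               // 20300
console.log(netSavedPct.toFixed(1)); // "72.5"
```

Note that the 72% figure is relative to the wasted tokens only, not to the whole session.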
+### ROI Analysis
+| Session Type | Baseline Tokens | After Warming | Savings | % Improvement |
+|--------------|----------------|---------------|---------|---------------|
+| 5-module app | 106,881 | ~90,000 | 16,881 | 15.8% |
+| 3-module app | 85,000 | ~72,000 | 13,000 | 15.3% |
+| 1-module (simple) | 45,000 | ~40,000 | 5,000 | 11.1% |
+**Improvement across these scenarios: 11-16% token savings (about 14% on average), against the 15-20% objective.**
+---
+## Monitoring & Validation
+### Cache Hit Rate Monitoring
+After each major step, verify cache efficiency:
+```javascript
+// Pseudo-code for monitoring
+const cacheStats = getCacheStats();
+console.log(`
+Step-${currentStep} cache statistics:
+  - Total file reads: ${cacheStats.totalReads}
+  - Cache hits: ${cacheStats.cacheHits} (${cacheStats.hitRate}%)
+  - Cache misses: ${cacheStats.cacheMisses}
+  - Redundant reads: ${cacheStats.redundantReads}
+  - Status: ${cacheStats.hitRate > 90 ? "Optimal" : "Suboptimal"}
+`);
+```
+**Target metrics:**
+- Step-01: 95-100% hit rate (schemas + questionnaires pre-loaded)
+- Step-03: 90-95% hit rate (schemas + module refs pre-loaded)
+- Step-05: 85-90% hit rate (schemas + handoff refs pre-loaded)
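The monitoring pseudo-code relies on a `getCacheStats()` function that is not defined in this document. A minimal sketch of how such statistics could be collected is shown below, assuming a read counts as a hit whenever the same path has already been read in the session. `createReadTracker` is a hypothetical name.

```javascript
// Hypothetical tracker: a read is a hit if the path was seen before.
function createReadTracker() {
  const seen = new Set();
  let totalReads = 0;
  let cacheHits = 0;
  return {
    record(path) {
      totalReads += 1;
      if (seen.has(path)) cacheHits += 1;
      else seen.add(path);
    },
    getCacheStats() {
      const cacheMisses = totalReads - cacheHits;
      const hitRate = totalReads === 0 ? 0 : Math.round((cacheHits / totalReads) * 100);
      return { totalReads, cacheHits, cacheMisses, hitRate };
    },
  };
}

// Usage: pre-load two files, then read each of them 19 times during the step.
const tracker = createReadTracker();
["feature-schema.json", "01-context.md"].forEach((f) => tracker.record(f));
for (let i = 0; i < 19; i++) tracker.record("feature-schema.json");
for (let i = 0; i < 19; i++) tracker.record("01-context.md");

const stats = tracker.getCacheStats();
console.log(stats); // { totalReads: 40, cacheHits: 38, cacheMisses: 2, hitRate: 95 }
console.log(stats.hitRate > 90 ? "Optimal" : "Suboptimal"); // "Optimal"
```

Under this definition the two warm-up reads are the only misses, so the hit rate depends on how often each pre-loaded file is reused.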
+### Troubleshooting Poor Cache Efficiency
+| Symptom | Cause | Fix |
+|---------|-------|-----|
+| Step-01 hit rate < 90% | Questionnaires not pre-loaded | Verify step-00 glob pattern |
+| Step-03 hit rate < 80% | Module refs not loaded at step start | Add pre-load at step-03 start |
+| Cache eviction too early | Retention policy too aggressive | Extend retention period |
+| Memory issues | Too many files cached | Reduce bucket 2/3 sizes, prioritize bucket 1 |
+---
+## Advanced: Adaptive Cache Warming
+**Concept:** Adjust pre-loading based on detected workflow patterns.
+### Detection Logic
+```javascript
+// Step-00: Analyze feature_description to predict workflow
+const predictions = {
+  willNeedQuestionnaires: !vibe_coding,  // Full questionnaires if not vibe coding
+  willBeMultiModule: detectMultiModuleKeywords(feature_description),  // "modules", "applications"
+  estimatedModuleCount: estimateModuleCount(feature_description)  // 1-10
+};
+// Adaptive pre-loading
+if (predictions.willBeMultiModule && predictions.estimatedModuleCount > 3) {
+  // Large multi-module app → pre-load module refs at step-00 (not step-03)
+  loadModuleReferences();
+}
+if (!predictions.willNeedQuestionnaires) {
+  // Vibe coding → only load 6 core questionnaires
+  loadCoreQuestionnaires();
+} else {
+  // Full analysis → load all 16 questionnaires
+  loadAllQuestionnaires();
+}
+```
+**Benefits:**
+- Further 5-10% token savings via predictive loading
+- Reduced memory footprint for simple workflows
+**Risks:**
+- Prediction errors (over-loading or under-loading)
+- Complexity increase
+**Recommendation:** Start with STATIC strategy (current), add ADAPTIVE in Phase 3.
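The branching in the detection logic can be expressed as a pure function that returns the list of buckets to load at step-00, which makes the decision easy to test in isolation. `planStep00Loads` and the bucket labels are hypothetical; the `predictions` fields reuse the names from the block above.

```javascript
// Hypothetical planner mirroring the adaptive pre-loading rules.
function planStep00Loads(predictions) {
  const loads = ["schemas", "suggestion-catalog"];
  loads.push(
    predictions.willNeedQuestionnaires ? "questionnaires:all-16" : "questionnaires:core-6"
  );
  if (predictions.willBeMultiModule && predictions.estimatedModuleCount > 3) {
    loads.push("module-references"); // pulled forward from step-03
  }
  return loads;
}

// Usage: a large multi-module app analysed in vibe-coding mode.
const plan = planStep00Loads({
  willNeedQuestionnaires: false,
  willBeMultiModule: true,
  estimatedModuleCount: 5,
});
console.log(plan);
// [ 'schemas', 'suggestion-catalog', 'questionnaires:core-6', 'module-references' ]
```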
+---
+## Integration with Agent Pooling
+**Synergy:** Cache warming + Agent pooling = Maximum efficiency
+### Example: Step-03 Module Loop
+**Without cache warming or agent pooling:**
+```
+Spawn 24 agents (4-5 per module × 5 modules)
+Each agent re-reads schemas + module refs = 24 × 65KB = 1,560KB
+Tokens: ~45,000
+```
+**With cache warming ONLY:**
+```
+Spawn 24 agents
+Pre-load schemas + module refs once = 65KB
+Subsequent reads cached = 24 × 0KB redundant
+Tokens: ~15,000 (67% savings)
+```
+**With cache warming + agent pooling:**
+```
+Spawn 5 agents (1 per module)
+Pre-load schemas + module refs once = 65KB
+Each agent reads cached files = 5 × 0KB redundant
+Tokens: ~7,000 (84% savings)
+```
+**Combined effect:** 84% token reduction (vs. baseline)
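The savings percentages in the three scenarios are all measured against the 45,000-token baseline. A short sketch of that arithmetic, using only the figures quoted above (`savingsPct` is an illustrative helper):

```javascript
const BASELINE_TOKENS = 45000; // no warming, 24 agents
const perAgentKb = 42 + 23;    // schemas + module refs = 65KB

// Hypothetical helper: savings relative to the baseline, rounded to a whole percent.
function savingsPct(tokens) {
  return Math.round((1 - tokens / BASELINE_TOKENS) * 100);
}

console.log(24 * perAgentKb);   // 1560 (KB re-read without warming)
console.log(savingsPct(15000)); // 67 (cache warming only)
console.log(savingsPct(7000));  // 84 (cache warming + agent pooling)
```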
+---
+## Rollout Checklist
+### Phase 2.1: Implement Core Warming (Week 1)
+- [ ] Add cache warming to step-00-init.md (Buckets 1, 2, and 3)
+- [ ] Test cache hit rates in step-01
+- [ ] Verify 15-20% token savings
+- [ ] Document in MEMORY.md
+### Phase 2.2: Progressive Warming (Week 2)
+- [ ] Add module ref warming at step-03 start (Bucket 4)
+- [ ] Add handoff ref warming at step-05a start (Bucket 5)
+- [ ] Test cache eviction after step-02 and step-04
+- [ ] Measure token savings across full session
+### Phase 2.3: Monitoring & Validation (Week 3)
+- [ ] Add cache hit rate reporting after each step
+- [ ] Create troubleshooting guide for poor efficiency
+- [ ] Test with 1-module, 3-module, 5-module apps
+- [ ] Validate 15-20% savings across all scenarios
+### Phase 2.4: Optimization (Week 4)
+- [ ] Implement adaptive warming (optional)
+- [ ] Fine-tune retention policies
+- [ ] Integrate with agent pooling strategy
+- [ ] Document best practices
+---
+## Success Criteria
+**Minimum (Phase 2.1):**
+- ✓ Cache warming implemented in step-00
+- ✓ Cache hit rate > 90% in step-01
+- ✓ Token savings ≥ 10%
+**Target (Phase 2.2):**
+- ✓ Progressive warming across all steps
+- ✓ Cache hit rates: step-01 (95%), step-03 (90%), step-05 (85%)
+- ✓ Token savings ≥ 15%
+**Optimal (Phase 2.3+):**
+- ✓ Adaptive warming based on predictions
+- ✓ Cache hit rates > 90% across all steps
+- ✓ Token savings ≥ 20%
+- ✓ Integration with agent pooling (84% combined savings)
+---
+**Last Updated:** 2026-02-08
+**Version:** 1.0
+**Author:** SmartStack CLI Team
+**Based on:** Analysis of BA session 03b76b8b-ea1c-4f1e-a636-bd46b0c33e02