npm - aiox-core - Versions diffs - 5.0.3 → 5.0.4 - Mend

aiox-core 5.0.3 → 5.0.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (468) hide show

package/pro/squads/squad-creator-pro/test-cases/BATCH-PROGRESS.md CHANGED Viewed

@@ -1,268 +1,268 @@
-# Model Tier Qualification - Batch Progress
-**Session:** 2026-02-11
-**Orchestrator:** @pedro-valerio
----
-## Wave 1: Haiku High-Confidence Candidates
-| # | Task | Status | Opus | Haiku | Decision |
-|---|------|--------|------|-------|----------|
-| 1 | pv-axioma-assessment | ✅ DONE | 7.85 | 7.77 (95.5%) | **HAIKU QUALIFIED** |
-| 2 | pv-modernization-score | ✅ DONE | 2/12 | Script-only | **SCRIPT QUALIFIED** (100% deterministic) |
-| 3 | validate-squad | ✅ DONE | 7.89 PASS | 7.89 PASS (re-test) | **HAIKU QUALIFIED** (expert_override 91%) |
-| 4 | qa-after-creation | ✅ DONE | 8.32 APPROVED | 9.9 APPROVED (+19%) | **HAIKU QUALIFIED*** |
-| 5 | an-fidelity-score | ✅ DONE | 82.22% V2.0 | 81.67% V2.0 (99.3%) | **HAIKU QUALIFIED** |
-| 6 | an-clone-review | ✅ DONE | Trinity 93.3% SOLID | Trinity 93.3% v2.2.0 | **HAIKU QUALIFIED** (scope fix applied) |
-| 7 | an-diagnose-clone | ✅ DONE | 85% BOM | 30% CRITICO (v2.0.0 fix failed) | **OPUS REQUIRED** (Haiku can't read files accurately) |
-| 8 | an-validate-clone | ✅ DONE | 73.89% REVIEW | 90% AUTH v2.2.0 | **HAIKU QUALIFIED** (strict rules applied) |
-| 9 | an-assess-sources | ✅ DONE | 4.15 (3 CJ) | 4.35 (3 CJ) v2.2.2 | **HAIKU QUALIFIED** (100% tier, 4.2% variance, evolucao fix) |
----
-## Wave 2: Admin Tasks
-| # | Task | Status | Type | Decision |
-|---|------|--------|------|----------|
-| 1 | refresh-registry | ✅ PRE-QUALIFIED | Hybrid (Script 80%) | **HAIKU** - Script counts, LLM enriches |
-| 2 | squad-analytics | ✅ PRE-QUALIFIED | Hybrid (Script 80%) | **HAIKU** - Script metrics, LLM interprets |
-| 3 | migrate-workflows-to-yaml | ⏸️ DEPRECATED | - | One-time task (completed 2026-02-05) |
-| 4 | install-commands | ✅ PRE-QUALIFIED | Worker (100%) | **SCRIPT** - 0 tokens, file operations only |
-| 5 | sync-ide-command | ✅ PRE-QUALIFIED | Worker (100%) | **SCRIPT** - 0 tokens, file operations only |
----
-## Patterns Discovered
-### OPUS REQUIRED Pattern
-Tasks that require:
-- Multi-signal classification (type detection)
-- Interpretive judgment (spirit vs letter)
-- Cascading decisions (type → checks → scores)
-Examples: `pv-modernization-score`, `validate-squad`
-### HAIKU ELIGIBLE Pattern
-Tasks that are:
-- Formula-based scoring with defined weights
-- Checklist validation with binary criteria
-- No interpretation required
-Example: `pv-axioma-assessment`
----
-## Current Stats
-- **Tested:** 10 tasks (Wave 1 complete)
-- **Pre-Qualified:** 4 tasks (Wave 2 - script-first/worker)
-- **Haiku/Script Qualified:** 12 (86%)
-  - Wave 1: pv-axioma-assessment, qa-after-creation*, an-fidelity-score→script, an-assess-sources (v2.2.2)
-  - Wave 1 (fixed): pv-modernization-score→script, validate-squad, an-clone-review (v2.2.0), an-validate-clone (v2.2.0)
-  - Wave 2: refresh-registry, squad-analytics, install-commands, sync-ide-command
-- **Opus Required:** 1 (7%) - an-diagnose-clone (tested, fix attempted, OPUS CONFIRMED)
-- **Deprecated:** 1 (7%) - migrate-workflows-to-yaml
-- **Pending:** 0 tasks
-*\*an-diagnose-clone: Fix v2.0.0 attempted with CHECKPOINT CLARIFICATIONS but Haiku simulates file reads instead of analyzing accurately (30% vs Opus 85%)*
-*\*qa-after-creation: Haiku qualified with caveat (score inflation +19%, same pass/fail decision)*
----
-## KEY DISCOVERY #2: Script-First Architecture
-**fidelity-score.sh** provou que 95% do fidelity scoring pode ser SCRIPT puro:
-| Approach | Tokens | Time | Accuracy |
-|----------|--------|------|----------|
-| LLM (Opus) | ~5000 | ~60s | baseline |
-| LLM (Haiku) | ~2000 | ~20s | 99.3% |
-| **Script** | **0** | **<1s** | 100% (deterministic) |
-**Script faz:**
-- grep section exists
-- count items in YAML
-- calculate weighted score
-- classify by thresholds
-**LLM faz (só o necessário):**
-- Gap analysis (recommendations)
-- Hackability testing (interaction)
-- Type detection (multi-signal)
-**Localização:** `squads/squad-creator-pro/scripts/fidelity-score.sh`
----
-## KEY DISCOVERY #1: Binary Checkpoint Conversion
-**an-fidelity-score v2.0** provou que converter scoring subjetivo (1-5) para checkpoints binários (0-5 count) permite Haiku atingir **99.3%** da qualidade Opus!
-| Padrão | Antes | Depois |
-|--------|-------|--------|
-| Scoring | "Avaliar 1-5" | "Contar checkpoints passed" |
-| Haiku result | Sub/over-scoring | 99.3% match |
-| Root cause | Julgamento | Contagem |
-**Aplicar em:** an-clone-review, an-diagnose-clone, an-validate-clone, pv-audit
----
-## New Pattern: GATEKEEPER Tasks
-Tasks whose primary purpose is **binary pass/fail decision** (not precise scoring) are Haiku-eligible even with score differences, as long as the final decision matches.
-Examples: `qa-after-creation`, `security-scan`, `validate-extraction`
----
----
-## KEY DISCOVERY #4: Bidirectional Scoring Bias
-**an-validate-clone** revelou que Haiku pode ser MAIS GENEROSO que Opus:
-| Dimension | Opus | Haiku | Δ |
-|-----------|------|-------|---|
-| Fidelity | 73.89% | 88.3% | +14.4% (Haiku generous) |
-| Hackability | 4/4 | 4/4 | = |
-| Authenticity | 70% | 100% | +30% (Haiku generous) |
-| Verdict | REVIEW | PASS PREMIUM | ❌ |
-**Padrão bidireicional:**
-- an-clone-review: Haiku CONSERVADOR (73% vs Opus 93%)
-- an-validate-clone: Haiku GENEROSO (88% vs Opus 74%)
-**Root cause:** Scoring calibration não é consistente entre modelos.
-**Implicação:** Tasks com scoring subjetivo (mesmo com checkpoints) podem divergir em AMBAS direções.
----
-## KEY DISCOVERY #3: Scope Interpretation
-**an-clone-review** revelou que Haiku interpreta escopo LITERALMENTE:
-| Aspect | Opus | Haiku |
-|--------|------|-------|
-| Scope | Wrapper + delegated persona | Wrapper only |
-| Swipe File | 4/5 (found in persona) | 1/5 (wrapper empty) |
-| Verdict | SOLID | NEEDS_WORK |
-**Fix:** Tasks devem especificar escopo explicitamente:
-- "Review clone file AND referenced persona files"
-- "Include delegated content in scope"
----
----
-## KEY DISCOVERY #5: Iterative Task Refinement Works!
-**an-assess-sources** demonstrou que tasks podem ser refinadas iterativamente para Haiku:
-| Version | Fix Applied | Tier Match |
-|---------|-------------|------------|
-| v2.0 | Binary Checkpoints | 25% |
-| v2.1 | + Scope Definition | 75% |
-| **v2.2.1** | + No Override Rule | **100%** ✅ |
-**3 Fixes Necessários:**
-1. **Binary Checkpoints** - elimina variance de scoring subjetivo
-2. **Scope Definition** - define granularidade explícita (1 livro = 1 fonte)
-3. **No Override Rule** - tier é MECÂNICO, não interpretativo
-**Padrão Replicável:**
-```yaml
-task_haiku_readiness_checklist:
-  - "Scoring é binário (true/false)?"
-  - "Escopo está explicitamente definido?"
-  - "Classificação é mecânica (sem override subjetivo)?"
-```
-**Candidatos para mesmo tratamento:** ~~an-clone-review, an-validate-clone~~ DONE - v2.2.0 fixes applied and validated.
----
-## KEY DISCOVERY #6: Formal Evaluation Methodology
-**an-compare-outputs v1.0.0** estabeleceu metodologia formal de comparação:
-| Dimension | Weight | Threshold |
-|-----------|--------|-----------|
-| Tier Match | 40% | <90% = review, <75% = veto |
-| Score Variance | 30% | >15% = review, >25% = veto |
-| Checkpoint Match | 20% | - |
-| Recommendation Quality | 10% | Contradictory = veto |
-**Resultado an-assess-sources:**
-- Tier Match: 100% (40/40) ✅
-- Score Variance: 25% (0/30) ⚠️ - at threshold
-- Checkpoint Match: 99% (16/20) ✅
-- Recommendations: Similar (7/10) ✅
-- **Total: 63/100 → CONDITIONAL**
-**Root cause da variance:** Checkpoint "evolucao" ambíguo.
-**Fix aplicado (v2.2.2):**
-```yaml
-evolucao_rule:
-  first_work_rule: "SE é a primeira obra → evolucao = FALSE (sem baseline)"
-```
-**Re-teste v2.2.2:**
-- Checkpoint "evolucao" match: 100% (antes 75%)
-- Max score variance: 4.2% (antes 25%)
-- Veto triggered: NÃO (antes SIM)
-- **Status: HAIKU QUALIFIED**
-**Localização:** `squads/squad-creator-pro/tasks/an-compare-outputs.md`
----
-## KEY DISCOVERY #7: Iterative Fix Pattern
-**an-assess-sources** demonstrou o padrão completo de qualificação:
-| Version | Fix | Checkpoint Match | Score Variance | Status |
-|---------|-----|------------------|----------------|--------|
-| v2.0 | Binary Checkpoints | 25% | >30% | FAIL |
-| v2.1 | + Scope Definition | 75% | ~20% | CONDITIONAL |
-| v2.2.1 | + No Override Rule | 99% | 25% | CONDITIONAL |
-| **v2.2.2** | + Evolucao Clarified | **100%** | **4.2%** | **QUALIFIED** |
-**Padrão replicável para outros tasks:**
-1. Identificar checkpoint com variance
-2. Clarificar definição com regra explícita (SE/ENTÃO)
-3. Re-testar com Haiku
-4. Validar match rate
----
-## KEY DISCOVERY #8: Fix Pattern Replicates
-**4 tasks fixadas com o mesmo padrão:**
-| Task | Root Cause | Fix Applied | Result |
-|------|-----------|-------------|--------|
-| an-assess-sources | Checkpoint "evolucao" ambíguo | evolucao_rule clarified | 25%→4.2% variance |
-| an-clone-review | Scope interpretation | SCOPE DEFINITION added | 73%→93% Trinity |
-| an-validate-clone | Haiku generous | CHECKPOINT CLARIFICATIONS | Strict rules, no inference |
-| validate-squad | Type detection | Already had dominant signals | 91% voice_dna → expert |
-**Padrão de Fix:**
-1. Identificar checkpoint/regra com variance
-2. Adicionar seção EXPLÍCITA com regras SE/ENTÃO
-3. Especificar "NO INFERENCE" onde aplicável
-4. Re-testar com Haiku
-**Estimativa de economia atualizada:**
-- Before: 100% Opus
-- After: **86% Haiku/Script** + 7% Opus + 7% deprecated
-- Estimated savings: **60-70% token cost reduction**
----
-*Last updated: 2026-02-11 19:45*
+# Model Tier Qualification - Batch Progress
+**Session:** 2026-02-11
+**Orchestrator:** @pedro-valerio
+---
+## Wave 1: Haiku High-Confidence Candidates
+| # | Task | Status | Opus | Haiku | Decision |
+|---|------|--------|------|-------|----------|
+| 1 | pv-axioma-assessment | ✅ DONE | 7.85 | 7.77 (95.5%) | **HAIKU QUALIFIED** |
+| 2 | pv-modernization-score | ✅ DONE | 2/12 | Script-only | **SCRIPT QUALIFIED** (100% deterministic) |
+| 3 | validate-squad | ✅ DONE | 7.89 PASS | 7.89 PASS (re-test) | **HAIKU QUALIFIED** (expert_override 91%) |
+| 4 | qa-after-creation | ✅ DONE | 8.32 APPROVED | 9.9 APPROVED (+19%) | **HAIKU QUALIFIED*** |
+| 5 | an-fidelity-score | ✅ DONE | 82.22% V2.0 | 81.67% V2.0 (99.3%) | **HAIKU QUALIFIED** |
+| 6 | an-clone-review | ✅ DONE | Trinity 93.3% SOLID | Trinity 93.3% v2.2.0 | **HAIKU QUALIFIED** (scope fix applied) |
+| 7 | an-diagnose-clone | ✅ DONE | 85% BOM | 30% CRITICO (v2.0.0 fix failed) | **OPUS REQUIRED** (Haiku can't read files accurately) |
+| 8 | an-validate-clone | ✅ DONE | 73.89% REVIEW | 90% AUTH v2.2.0 | **HAIKU QUALIFIED** (strict rules applied) |
+| 9 | an-assess-sources | ✅ DONE | 4.15 (3 CJ) | 4.35 (3 CJ) v2.2.2 | **HAIKU QUALIFIED** (100% tier, 4.2% variance, evolucao fix) |
+---
+## Wave 2: Admin Tasks
+| # | Task | Status | Type | Decision |
+|---|------|--------|------|----------|
+| 1 | refresh-registry | ✅ PRE-QUALIFIED | Hybrid (Script 80%) | **HAIKU** - Script counts, LLM enriches |
+| 2 | squad-analytics | ✅ PRE-QUALIFIED | Hybrid (Script 80%) | **HAIKU** - Script metrics, LLM interprets |
+| 3 | migrate-workflows-to-yaml | ⏸️ DEPRECATED | - | One-time task (completed 2026-02-05) |
+| 4 | install-commands | ✅ PRE-QUALIFIED | Worker (100%) | **SCRIPT** - 0 tokens, file operations only |
+| 5 | sync-ide-command | ✅ PRE-QUALIFIED | Worker (100%) | **SCRIPT** - 0 tokens, file operations only |
+---
+## Patterns Discovered
+### OPUS REQUIRED Pattern
+Tasks that require:
+- Multi-signal classification (type detection)
+- Interpretive judgment (spirit vs letter)
+- Cascading decisions (type → checks → scores)
+Examples: `pv-modernization-score`, `validate-squad`
+### HAIKU ELIGIBLE Pattern
+Tasks that are:
+- Formula-based scoring with defined weights
+- Checklist validation with binary criteria
+- No interpretation required
+Example: `pv-axioma-assessment`
+---
+## Current Stats
+- **Tested:** 10 tasks (Wave 1 complete)
+- **Pre-Qualified:** 4 tasks (Wave 2 - script-first/worker)
+- **Haiku/Script Qualified:** 12 (86%)
+  - Wave 1: pv-axioma-assessment, qa-after-creation*, an-fidelity-score→script, an-assess-sources (v2.2.2)
+  - Wave 1 (fixed): pv-modernization-score→script, validate-squad, an-clone-review (v2.2.0), an-validate-clone (v2.2.0)
+  - Wave 2: refresh-registry, squad-analytics, install-commands, sync-ide-command
+- **Opus Required:** 1 (7%) - an-diagnose-clone (tested, fix attempted, OPUS CONFIRMED)
+- **Deprecated:** 1 (7%) - migrate-workflows-to-yaml
+- **Pending:** 0 tasks
+*\*an-diagnose-clone: Fix v2.0.0 attempted with CHECKPOINT CLARIFICATIONS but Haiku simulates file reads instead of analyzing accurately (30% vs Opus 85%)*
+*\*qa-after-creation: Haiku qualified with caveat (score inflation +19%, same pass/fail decision)*
+---
+## KEY DISCOVERY #2: Script-First Architecture
+**fidelity-score.sh** provou que 95% do fidelity scoring pode ser SCRIPT puro:
+| Approach | Tokens | Time | Accuracy |
+|----------|--------|------|----------|
+| LLM (Opus) | ~5000 | ~60s | baseline |
+| LLM (Haiku) | ~2000 | ~20s | 99.3% |
+| **Script** | **0** | **<1s** | 100% (deterministic) |
+**Script faz:**
+- grep section exists
+- count items in YAML
+- calculate weighted score
+- classify by thresholds
+**LLM faz (só o necessário):**
+- Gap analysis (recommendations)
+- Hackability testing (interaction)
+- Type detection (multi-signal)
+**Localização:** `squads/squad-creator-pro/scripts/fidelity-score.sh`
+---
+## KEY DISCOVERY #1: Binary Checkpoint Conversion
+**an-fidelity-score v2.0** provou que converter scoring subjetivo (1-5) para checkpoints binários (0-5 count) permite Haiku atingir **99.3%** da qualidade Opus!
+| Padrão | Antes | Depois |
+|--------|-------|--------|
+| Scoring | "Avaliar 1-5" | "Contar checkpoints passed" |
+| Haiku result | Sub/over-scoring | 99.3% match |
+| Root cause | Julgamento | Contagem |
+**Aplicar em:** an-clone-review, an-diagnose-clone, an-validate-clone, pv-audit
+---
+## New Pattern: GATEKEEPER Tasks
+Tasks whose primary purpose is **binary pass/fail decision** (not precise scoring) are Haiku-eligible even with score differences, as long as the final decision matches.
+Examples: `qa-after-creation`, `security-scan`, `validate-extraction`
+---
+---
+## KEY DISCOVERY #4: Bidirectional Scoring Bias
+**an-validate-clone** revelou que Haiku pode ser MAIS GENEROSO que Opus:
+| Dimension | Opus | Haiku | Δ |
+|-----------|------|-------|---|
+| Fidelity | 73.89% | 88.3% | +14.4% (Haiku generous) |
+| Hackability | 4/4 | 4/4 | = |
+| Authenticity | 70% | 100% | +30% (Haiku generous) |
+| Verdict | REVIEW | PASS PREMIUM | ❌ |
+**Padrão bidireicional:**
+- an-clone-review: Haiku CONSERVADOR (73% vs Opus 93%)
+- an-validate-clone: Haiku GENEROSO (88% vs Opus 74%)
+**Root cause:** Scoring calibration não é consistente entre modelos.
+**Implicação:** Tasks com scoring subjetivo (mesmo com checkpoints) podem divergir em AMBAS direções.
+---
+## KEY DISCOVERY #3: Scope Interpretation
+**an-clone-review** revelou que Haiku interpreta escopo LITERALMENTE:
+| Aspect | Opus | Haiku |
+|--------|------|-------|
+| Scope | Wrapper + delegated persona | Wrapper only |
+| Swipe File | 4/5 (found in persona) | 1/5 (wrapper empty) |
+| Verdict | SOLID | NEEDS_WORK |
+**Fix:** Tasks devem especificar escopo explicitamente:
+- "Review clone file AND referenced persona files"
+- "Include delegated content in scope"
+---
+---
+## KEY DISCOVERY #5: Iterative Task Refinement Works!
+**an-assess-sources** demonstrou que tasks podem ser refinadas iterativamente para Haiku:
+| Version | Fix Applied | Tier Match |
+|---------|-------------|------------|
+| v2.0 | Binary Checkpoints | 25% |
+| v2.1 | + Scope Definition | 75% |
+| **v2.2.1** | + No Override Rule | **100%** ✅ |
+**3 Fixes Necessários:**
+1. **Binary Checkpoints** - elimina variance de scoring subjetivo
+2. **Scope Definition** - define granularidade explícita (1 livro = 1 fonte)
+3. **No Override Rule** - tier é MECÂNICO, não interpretativo
+**Padrão Replicável:**
+```yaml
+task_haiku_readiness_checklist:
+  - "Scoring é binário (true/false)?"
+  - "Escopo está explicitamente definido?"
+  - "Classificação é mecânica (sem override subjetivo)?"
+```
+**Candidatos para mesmo tratamento:** ~~an-clone-review, an-validate-clone~~ DONE - v2.2.0 fixes applied and validated.
+---
+## KEY DISCOVERY #6: Formal Evaluation Methodology
+**an-compare-outputs v1.0.0** estabeleceu metodologia formal de comparação:
+| Dimension | Weight | Threshold |
+|-----------|--------|-----------|
+| Tier Match | 40% | <90% = review, <75% = veto |
+| Score Variance | 30% | >15% = review, >25% = veto |
+| Checkpoint Match | 20% | - |
+| Recommendation Quality | 10% | Contradictory = veto |
+**Resultado an-assess-sources:**
+- Tier Match: 100% (40/40) ✅
+- Score Variance: 25% (0/30) ⚠️ - at threshold
+- Checkpoint Match: 99% (16/20) ✅
+- Recommendations: Similar (7/10) ✅
+- **Total: 63/100 → CONDITIONAL**
+**Root cause da variance:** Checkpoint "evolucao" ambíguo.
+**Fix aplicado (v2.2.2):**
+```yaml
+evolucao_rule:
+  first_work_rule: "SE é a primeira obra → evolucao = FALSE (sem baseline)"
+```
+**Re-teste v2.2.2:**
+- Checkpoint "evolucao" match: 100% (antes 75%)
+- Max score variance: 4.2% (antes 25%)
+- Veto triggered: NÃO (antes SIM)
+- **Status: HAIKU QUALIFIED**
+**Localização:** `squads/squad-creator-pro/tasks/an-compare-outputs.md`
+---
+## KEY DISCOVERY #7: Iterative Fix Pattern
+**an-assess-sources** demonstrou o padrão completo de qualificação:
+| Version | Fix | Checkpoint Match | Score Variance | Status |
+|---------|-----|------------------|----------------|--------|
+| v2.0 | Binary Checkpoints | 25% | >30% | FAIL |
+| v2.1 | + Scope Definition | 75% | ~20% | CONDITIONAL |
+| v2.2.1 | + No Override Rule | 99% | 25% | CONDITIONAL |
+| **v2.2.2** | + Evolucao Clarified | **100%** | **4.2%** | **QUALIFIED** |
+**Padrão replicável para outros tasks:**
+1. Identificar checkpoint com variance
+2. Clarificar definição com regra explícita (SE/ENTÃO)
+3. Re-testar com Haiku
+4. Validar match rate
+---
+## KEY DISCOVERY #8: Fix Pattern Replicates
+**4 tasks fixadas com o mesmo padrão:**
+| Task | Root Cause | Fix Applied | Result |
+|------|-----------|-------------|--------|
+| an-assess-sources | Checkpoint "evolucao" ambíguo | evolucao_rule clarified | 25%→4.2% variance |
+| an-clone-review | Scope interpretation | SCOPE DEFINITION added | 73%→93% Trinity |
+| an-validate-clone | Haiku generous | CHECKPOINT CLARIFICATIONS | Strict rules, no inference |
+| validate-squad | Type detection | Already had dominant signals | 91% voice_dna → expert |
+**Padrão de Fix:**
+1. Identificar checkpoint/regra com variance
+2. Adicionar seção EXPLÍCITA com regras SE/ENTÃO
+3. Especificar "NO INFERENCE" onde aplicável
+4. Re-testar com Haiku
+**Estimativa de economia atualizada:**
+- Before: 100% Opus
+- After: **86% Haiku/Script** + 7% Opus + 7% deprecated
+- Estimated savings: **60-70% token cost reduction**
+---
+*Last updated: 2026-02-11 19:45*

package/pro/squads/squad-creator-pro/test-cases/QUALIFICATION-DASHBOARD.yaml CHANGED Viewed

@@ -1,13 +1,13 @@
-qualification_dashboard:
-  generated_at: null
-  workflow: "wf-model-tier-qualification"
-  status: "not_generated"
-  summary:
-    total_tested: 0
-    haiku_qualified: 0
-    sonnet_qualified: 0
-    opus_required: 0
-    total_savings_projected: "0%"
-  results: []
-  next_actions:
-    - "Run qualification workflow to populate this dashboard."
+qualification_dashboard:
+  generated_at: null
+  workflow: "wf-model-tier-qualification"
+  status: "not_generated"
+  summary:
+    total_tested: 0
+    haiku_qualified: 0
+    sonnet_qualified: 0
+    opus_required: 0
+    total_savings_projected: "0%"
+  results: []
+  next_actions:
+    - "Run qualification workflow to populate this dashboard."