mindforge-cc 10.0.1 → 10.0.3
- package/.mindforge/config.json +50 -2
- package/.mindforge/engine/autonomous/cross-iteration-bridge.md +96 -0
- package/.mindforge/engine/cost-tracking/budget-enforcer.md +68 -0
- package/.mindforge/engine/cost-tracking/router.md +58 -0
- package/.mindforge/engine/cost-tracking/token-ledger.md +77 -0
- package/.mindforge/engine/council/council-protocol.md +96 -0
- package/.mindforge/engine/council/council-templates.md +85 -0
- package/.mindforge/engine/council/synthesis-engine.md +71 -0
- package/.mindforge/engine/instincts/capture-engine.md +63 -0
- package/.mindforge/engine/instincts/instinct-schema.md +76 -0
- package/.mindforge/engine/instincts/promotion-engine.md +77 -0
- package/.mindforge/engine/skills/composition.md +83 -0
- package/.mindforge/engine/skills/loader.md +16 -0
- package/.mindforge/personas/cost-optimizer.md +71 -0
- package/.mindforge/personas/council-architect.md +66 -0
- package/.mindforge/personas/council-critic.md +67 -0
- package/.mindforge/personas/council-pragmatist.md +71 -0
- package/.mindforge/personas/council-skeptic.md +73 -0
- package/.mindforge/personas/doc-auditor.md +84 -0
- package/.mindforge/personas/instinct-curator.md +83 -0
- package/.mindforge/personas/multi-model-bridge.md +86 -0
- package/.mindforge/personas/swarm-templates.json +28 -1
- package/.mindforge/personas/threat-modeler.md +82 -0
- package/.mindforge/skills/agent-introspection-debugging/SKILL.md +88 -0
- package/.mindforge/skills/agent-loops/SKILL.md +84 -0
- package/.mindforge/skills/autonomous-loops/SKILL.md +105 -0
- package/.mindforge/skills/continuous-learning/SKILL.md +84 -0
- package/.mindforge/skills/cost-aware-routing/SKILL.md +83 -0
- package/.mindforge/skills/council/SKILL.md +68 -0
- package/.mindforge/skills/doc-health-audit/SKILL.md +102 -0
- package/.mindforge/skills/multi-llm-consult/SKILL.md +75 -0
- package/.mindforge/skills/threat-modeling/SKILL.md +109 -0
- package/.mindforge/skills/verification-loop/SKILL.md +85 -0
- package/CHANGELOG.md +22 -3
- package/MINDFORGE.md +4 -4
- package/README.md +2 -2
- package/RELEASENOTES.md +71 -5
- package/SECURITY.md +1 -1
- package/bin/installer-core.js +1 -1
- package/bin/wizard/theme.js +2 -2
- package/docs/commands-reference.md +18 -1
- package/docs/getting-started.md +1 -1
- package/docs/sdk-reference.md +1 -1
- package/docs/troubleshooting.md +3 -3
- package/docs/user-guide.md +3 -3
- package/examples/starter-project/MINDFORGE.md +2 -2
- package/package.json +1 -1
|
@@ -0,0 +1,84 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: continuous-learning
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.0.3
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: instinct, learned behavior, pattern detection, evolve, auto-learn, skill promotion, confidence, observation, habit, adaptation
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Skill — Continuous Learning
|
|
10
|
+
|
|
11
|
+
## When this skill activates
|
|
12
|
+
When discussing learned behaviors, instinct management, pattern capture,
|
|
13
|
+
or skill evolution. Also activates during session-end reviews where patterns
|
|
14
|
+
may be captured.
|
|
15
|
+
|
|
16
|
+
## Mandatory actions when this skill is active
|
|
17
|
+
|
|
18
|
+
### Understanding the Instinct System
|
|
19
|
+
MindForge automatically observes sessions and captures behavioral patterns as
|
|
20
|
+
"instincts" — lightweight learned behaviors that may evolve into full skills.
|
|
21
|
+
|
|
22
|
+
**Instinct lifecycle:**
|
|
23
|
+
```
|
|
24
|
+
Observation -> Instinct (confidence: 0.5)
|
|
25
|
+
| applied successfully
|
|
26
|
+
Confidence grows (0.5 -> 0.6 -> 0.7 -> ...)
|
|
27
|
+
| confidence >= 0.85 AND applied 5+ times
|
|
28
|
+
Promotion candidate
|
|
29
|
+
| user approves
|
|
30
|
+
Full SKILL.md created and registered
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
### Auto-Capture (Always Active)
|
|
34
|
+
The instinct engine runs in auto-capture mode, watching for:
|
|
35
|
+
|
|
36
|
+
1. **User corrections** -> instinct created at confidence 0.6
|
|
37
|
+
- "No, always use X instead of Y" -> captures the preference
|
|
38
|
+
2. **Repeated patterns (3+)** -> instinct created at confidence 0.4
|
|
39
|
+
- Same action pattern 3 times in a session -> probably intentional
|
|
40
|
+
3. **Successful outcomes** -> instinct created at confidence 0.5
|
|
41
|
+
- Non-obvious action followed by verification pass -> worth remembering
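The capture rules above amount to a lookup from trigger type to starting confidence. A minimal sketch, assuming hypothetical names (`INITIAL_CONFIDENCE`, `new_instinct`) and a record shape that is not taken from the MindForge instinct schema:

```python
# Starting confidence per capture source, as listed in the skill text.
INITIAL_CONFIDENCE = {
    "user_correction": 0.6,
    "repeated_pattern": 0.4,    # same action pattern 3+ times in a session
    "successful_outcome": 0.5,  # non-obvious action followed by a verification pass
}


def new_instinct(observation: str, source: str) -> dict:
    """Build an instinct record (hypothetical shape) for a captured pattern."""
    if source not in INITIAL_CONFIDENCE:
        raise ValueError(f"unknown capture source: {source}")
    return {
        "observation": observation,
        "source": source,
        "confidence": INITIAL_CONFIDENCE[source],
        "times_applied": 0,
    }
```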
|
|
42
|
+
|
|
43
|
+
### Managing Instincts
|
|
44
|
+
|
|
45
|
+
**View active instincts:**
|
|
46
|
+
- Check `.mindforge/engine/instincts/instinct-store.jsonl`
|
|
47
|
+
- Use `/mindforge:status` which includes instinct summary
|
|
48
|
+
|
|
49
|
+
**Manually capture:**
|
|
50
|
+
- Use `/mindforge:learn-instinct "observation" --behavior "what to do"`
|
|
51
|
+
- Manual instincts start at confidence 0.7 (user-stated = higher trust)
|
|
52
|
+
|
|
53
|
+
**Promote to skills:**
|
|
54
|
+
- Use `/mindforge:evolve-skills` to review mature instincts
|
|
55
|
+
- Candidates: confidence >= 0.85 AND times_applied >= 5
|
|
56
|
+
- Promotion creates a full SKILL.md and registers in MANIFEST.md
|
|
57
|
+
|
|
58
|
+
**Deprecate/Prune:**
|
|
59
|
+
- Instincts below 0.2 confidence after 10+ applications are auto-pruned
|
|
60
|
+
- Instincts inactive for 30 days are auto-pruned
|
|
61
|
+
- User can manually deprecate any instinct
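The promotion and pruning thresholds above can be read as two predicates. The sketch below is illustrative only: the function names are hypothetical, and treating "inactive for 30 days" as "last used 30 or more days ago" is an assumption about the boundary.

```python
from datetime import datetime, timedelta

PROMOTE_CONFIDENCE = 0.85
PROMOTE_MIN_APPLIED = 5
PRUNE_CONFIDENCE = 0.2
PRUNE_MIN_APPLIED = 10
INACTIVE_DAYS = 30


def is_promotion_candidate(confidence: float, times_applied: int) -> bool:
    """Candidate rule from the text: confidence >= 0.85 AND times_applied >= 5."""
    return confidence >= PROMOTE_CONFIDENCE and times_applied >= PROMOTE_MIN_APPLIED


def should_prune(confidence: float, times_applied: int,
                 last_used: datetime, now: datetime) -> bool:
    """Auto-prune rule: low confidence after 10+ applications, or 30 days inactive."""
    low_value = confidence < PRUNE_CONFIDENCE and times_applied >= PRUNE_MIN_APPLIED
    inactive = now - last_used >= timedelta(days=INACTIVE_DAYS)  # boundary is assumed
    return low_value or inactive
```

Promotion still requires user approval according to the lifecycle diagram, so `is_promotion_candidate` only flags an instinct for review.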
|
|
62
|
+
|
|
63
|
+
### During any task (passive observation)
|
|
64
|
+
- Note patterns that repeat across tasks
|
|
65
|
+
- When user corrects behavior: acknowledge and create instinct
|
|
66
|
+
- At session end: report any new instincts captured
|
|
67
|
+
- Never let instinct count exceed 100 per project (prune lowest confidence)
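The 100-instinct cap with lowest-confidence pruning could look like the following. This is a sketch under assumptions: `enforce_cap` is a hypothetical helper, instincts are plain dictionaries with a `confidence` key, and ties are broken by input order.

```python
MAX_INSTINCTS = 100


def enforce_cap(instincts: list[dict], limit: int = MAX_INSTINCTS) -> list[dict]:
    """Keep the `limit` highest-confidence instincts; everything else is pruned."""
    ranked = sorted(instincts, key=lambda i: i["confidence"], reverse=True)
    return ranked[:limit]
```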
|
|
68
|
+
|
|
69
|
+
### After session
|
|
70
|
+
Report instinct activity:
|
|
71
|
+
```
|
|
72
|
+
Instincts this session:
|
|
73
|
+
[NEW] "Pattern observed" (confidence: 0.5)
|
|
74
|
+
[REINFORCED] "Existing pattern" (0.6 -> 0.7)
|
|
75
|
+
[READY] "Mature pattern" -- eligible for promotion
|
|
76
|
+
|
|
77
|
+
Active: 47/100 | Promotion candidates: 3
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
## Self-check before task completion
|
|
81
|
+
- [ ] Did I report any new instincts captured this session?
|
|
82
|
+
- [ ] Did I verify instinct count remains under 100 for this project?
|
|
83
|
+
- [ ] Did I check for promotion candidates meeting the threshold?
|
|
84
|
+
- [ ] Did I confirm captured instincts are project-scoped (not leaking to other projects)?
|
|
@@ -0,0 +1,83 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: cost-aware-routing
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.0.3
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: token budget, model routing, cost optimization, model selection, Haiku, Flash, cheap model, expensive model, cost per token, budget, spend, token usage
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Skill — Cost-Aware Routing
|
|
10
|
+
|
|
11
|
+
## When this skill activates
|
|
12
|
+
When making model selection decisions, monitoring token spend, optimizing
|
|
13
|
+
for cost-performance tradeoffs, or when budget limits are approaching.
|
|
14
|
+
|
|
15
|
+
## Mandatory actions when this skill is active
|
|
16
|
+
|
|
17
|
+
### Model Tier Reference
|
|
18
|
+
|
|
19
|
+
| Tier | Model | Input/Output per 1M | Best For |
|
|
20
|
+
|------|-------|--------------------:|----------|
|
|
21
|
+
| simple | claude-haiku-4-5 | $0.25 / $1.25 | File reads, simple edits, formatting |
|
|
22
|
+
| standard | claude-sonnet-4-6 | $3 / $15 | Multi-file work, code review, daily tasks |
|
|
23
|
+
| complex | claude-opus-4-7 | $15 / $75 | Architecture, security, hard debugging |
|
|
24
|
+
| research | gemini-2.5-pro | $1.25 / $10 | Web research, long context, synthesis |
|
|
25
|
+
| consult | gpt-4o | $2.50 / $10 | Second opinions, alternative perspectives |
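To make the per-1M pricing concrete, here is a small cost estimator. The prices are copied from the table above and have not been checked against current provider price lists; `estimate_cost_usd` is a hypothetical helper, not part of MindForge.

```python
# (input, output) USD per 1M tokens, copied from the tier table.
PRICES_PER_MILLION = {
    "simple": (0.25, 1.25),
    "standard": (3.00, 15.00),
    "complex": (15.00, 75.00),
    "research": (1.25, 10.00),
    "consult": (2.50, 10.00),
}


def estimate_cost_usd(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one call from its token counts and tier prices."""
    price_in, price_out = PRICES_PER_MILLION[tier]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000
```

With these figures, a call using 10,000 input and 2,000 output tokens comes to about $0.06 on the standard tier and about $0.30 on the complex tier.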
|
|
26
|
+
|
|
27
|
+
### Routing Rules
|
|
28
|
+
|
|
29
|
+
**Match task complexity to model tier:**
|
|
30
|
+
- Difficulty 1-3 (single file, obvious change): **simple**
|
|
31
|
+
- Difficulty 3-6 (multi-file, some judgment): **standard**
|
|
32
|
+
- Difficulty 6-8 (cross-system, complex logic): **complex**
|
|
33
|
+
- Difficulty 8-10 (architectural, novel problems): **complex**
|
|
34
|
+
|
|
35
|
+
**Override rules (always apply):**
|
|
36
|
+
- Security tasks: minimum **standard** (never use simple for auth/payment)
|
|
37
|
+
- Architecture decisions: minimum **complex**
|
|
38
|
+
- File exploration/reading: always **simple**
|
|
39
|
+
- Research requiring web access: **research**
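The routing and override rules above can be sketched as a single function. Several choices here are assumptions rather than documented behavior: the overlapping difficulty ranges (1-3, 3-6, 6-8, 8-10) are resolved by sending boundary values to the cheaper tier, research and file-read overrides take precedence over the security and architecture floors, and all names are hypothetical.

```python
TIER_ORDER = ["simple", "standard", "complex"]


def base_tier(difficulty: int) -> str:
    """Map a 1-10 difficulty to a tier (boundary handling is an assumption)."""
    if not 1 <= difficulty <= 10:
        raise ValueError("difficulty must be between 1 and 10")
    if difficulty <= 3:
        return "simple"
    if difficulty <= 6:
        return "standard"
    return "complex"


def route(difficulty: int, *, security: bool = False, architecture: bool = False,
          file_read: bool = False, web_research: bool = False) -> str:
    """Apply the override rules on top of the difficulty-based tier."""
    if web_research:
        return "research"
    if file_read:
        return "simple"
    floor = "complex" if architecture else "standard" if security else "simple"
    # Pick whichever of the base tier and the floor is more capable.
    return max(base_tier(difficulty), floor, key=TIER_ORDER.index)
```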
|
|
40
|
+
|
|
41
|
+
### Cost Optimization Patterns
|
|
42
|
+
|
|
43
|
+
1. **Start cheap, escalate if needed:**
|
|
44
|
+
- Try simple tier first for uncertain-complexity tasks
|
|
45
|
+
- If result quality is insufficient: escalate to next tier
|
|
46
|
+
- Log escalation with reason
|
|
47
|
+
|
|
48
|
+
2. **Batch similar tasks:**
|
|
49
|
+
- 5 simple edits batched into one call cost less than 5 separate calls
|
|
50
|
+
- Group related file reads into single exploration
|
|
51
|
+
|
|
52
|
+
3. **Cache aggressively:**
|
|
53
|
+
- Prompt caching can cut the cost of cached input tokens by roughly 90% (provider-dependent)
|
|
54
|
+
- Structure prompts with static context first (cacheable)
|
|
55
|
+
- Dynamic content last (not cached, but small)
|
|
56
|
+
|
|
57
|
+
4. **Avoid token waste:**
|
|
58
|
+
- Don't re-read files already in context
|
|
59
|
+
- Don't repeat instructions the model already has
|
|
60
|
+
- Use compaction when context grows large
|
|
61
|
+
|
|
62
|
+
### Budget Monitoring
|
|
63
|
+
|
|
64
|
+
Check budget status regularly:
|
|
65
|
+
- Session budget remaining: from token-ledger.jsonl
|
|
66
|
+
- Warning threshold: `[COST_WARN_USD]` from config
|
|
67
|
+
- Hard limit: `[COST_HARD_LIMIT_USD]` from config
|
|
68
|
+
|
|
69
|
+
**When budget is tight:**
|
|
70
|
+
- Downgrade all tasks one tier (complex→standard, standard→simple)
|
|
71
|
+
- Warn user: "Budget at X% — operating in economy mode"
|
|
72
|
+
- Never exceed hard limit without explicit user approval
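One way to read the budget rules above is as a three-state mode plus a one-tier downgrade. The skill does not define when a budget counts as "tight"; this sketch assumes it means spend at or above the warning threshold. `budget_mode` and `apply_budget` are hypothetical names, and the threshold arguments stand in for the `[COST_WARN_USD]` and `[COST_HARD_LIMIT_USD]` config values.

```python
DOWNGRADE = {"complex": "standard", "standard": "simple"}


def budget_mode(spent_usd: float, warn_usd: float, hard_limit_usd: float) -> str:
    """Classify current spend against the warning and hard-limit thresholds."""
    if spent_usd >= hard_limit_usd:
        return "blocked"
    if spent_usd >= warn_usd:  # assumption: "tight" starts at the warning threshold
        return "economy"
    return "normal"


def apply_budget(tier: str, mode: str, user_approved: bool = False) -> str:
    """Downgrade one tier in economy mode; refuse at the hard limit without approval."""
    if mode == "blocked" and not user_approved:
        raise PermissionError("hard limit reached; explicit user approval required")
    if mode == "normal":
        return tier
    # Tiers without a cheaper neighbour (simple, research, consult) stay unchanged.
    return DOWNGRADE.get(tier, tier)
```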
|
|
73
|
+
|
|
74
|
+
### After any task
|
|
75
|
+
- Log actual model used + tokens consumed to token-ledger.jsonl
|
|
76
|
+
- Compare actual vs optimal tier (for future routing accuracy)
|
|
77
|
+
- Report cost in session summary
|
|
78
|
+
|
|
79
|
+
## Self-check before task completion
|
|
80
|
+
- [ ] Did I log the model routing decision with rationale?
|
|
81
|
+
- [ ] Did I record actual token usage in token-ledger.jsonl?
|
|
82
|
+
- [ ] Did I check remaining budget against session/project limits?
|
|
83
|
+
- [ ] Did I flag any tasks where a cheaper model could have been used?
|
|
@@ -0,0 +1,68 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: council
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.0.3
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: council, multi-voice, decision debate, architectural decision, trade-off, ambiguous, contentious, ADR, design decision, breaking change
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Skill — Council (Multi-Voice Decision)
|
|
10
|
+
|
|
11
|
+
## When this skill activates
|
|
12
|
+
Any ambiguous architectural decision where multiple valid approaches exist,
|
|
13
|
+
when resolving contentious technical choices, or when the Adversarial Decision
|
|
14
|
+
Loop (ADS) produces a split verdict.
|
|
15
|
+
|
|
16
|
+
## Mandatory actions when this skill is active
|
|
17
|
+
|
|
18
|
+
### Before invoking council
|
|
19
|
+
1. Verify the decision actually warrants multi-voice debate:
|
|
20
|
+
- Are there 2+ genuinely viable options? (If one is obviously best, skip council)
|
|
21
|
+
- Is the decision hard to reverse? (If easily reversible, just pick one and iterate)
|
|
22
|
+
- Does the decision affect multiple stakeholders or systems?
|
|
23
|
+
2. Frame the decision clearly with: context, options, constraints, and stakes.
|
|
24
|
+
|
|
25
|
+
### The Four Voices
|
|
26
|
+
|
|
27
|
+
| Voice | Asks | Bias (intentional) |
|
|
28
|
+
|-------|------|-------------------|
|
|
29
|
+
| **Architect** | "What scales? What's maintainable in 2 years?" | Elegant, extensible systems |
|
|
30
|
+
| **Skeptic** | "What breaks? What haven't we considered?" | Caution, hidden failure modes |
|
|
31
|
+
| **Pragmatist** | "What ships? What delivers value soonest?" | Delivery speed, incrementalism |
|
|
32
|
+
| **Critic** | "What's excellent? What meets our standards?" | Quality, craftsmanship |
|
|
33
|
+
|
|
34
|
+
### Council Protocol
|
|
35
|
+
1. **Frame** — Present decision with context, options, constraints
|
|
36
|
+
2. **Positions** — Each voice states recommendation + top 3 reasons + confidence (0-1)
|
|
37
|
+
3. **Challenge** — Each voice rebuts the strongest counterargument
|
|
38
|
+
4. **Synthesize** — Produce verdict with consensus score
|
|
39
|
+
5. **Output** — Write to `.planning/decisions/COUNCIL-[timestamp].md`
|
|
40
|
+
|
|
41
|
+
### Interpreting Results
|
|
42
|
+
|
|
43
|
+
| Consensus Score | Meaning | Action |
|
|
44
|
+
|----------------|---------|--------|
|
|
45
|
+
| >= 0.85 | Strong agreement | Proceed with confidence |
|
|
46
|
+
| 0.65 - 0.84 | Moderate agreement | Proceed but address dissent concerns |
|
|
47
|
+
| 0.50 - 0.64 | Weak agreement | Seek user input before proceeding |
|
|
48
|
+
| < 0.50 | No consensus | Present all options to user, defer decision |
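The consensus table translates directly into a threshold check. A minimal sketch, assuming the score lies in the range 0 to 1 and using a hypothetical function name; how the score itself is computed is not shown in this skill and is left out here.

```python
def council_action(consensus: float) -> str:
    """Map a consensus score to the action listed in the table."""
    if not 0.0 <= consensus <= 1.0:
        raise ValueError("consensus score must be between 0 and 1")
    if consensus >= 0.85:
        return "proceed with confidence"
    if consensus >= 0.65:
        return "proceed but address dissent concerns"
    if consensus >= 0.50:
        return "seek user input before proceeding"
    return "no consensus: present all options to user, defer decision"
```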
|
|
49
|
+
|
|
50
|
+
### Guardrails
|
|
51
|
+
- Council is ADVISORY — user always has final say
|
|
52
|
+
- Maximum 2 rounds (initial + challenge). No infinite debates.
|
|
53
|
+
- Each voice limited to 200 words per round
|
|
54
|
+
- If consensus < 0.5: do NOT auto-select. Report "No consensus" honestly.
|
|
55
|
+
- Always document dissent — suppressed minority opinions create tech debt
|
|
56
|
+
- Use council templates (see `council-templates.md`) for common decision types
|
|
57
|
+
|
|
58
|
+
### After council
|
|
59
|
+
- Write the decision record to `.planning/decisions/`
|
|
60
|
+
- If the decision creates an ADR: also write to `docs/adr/`
|
|
61
|
+
- Log council invocation and verdict in AUDIT
|
|
62
|
+
- Track the Skeptic's unmitigated risks as action items
|
|
63
|
+
|
|
64
|
+
## Self-check before task completion
|
|
65
|
+
- [ ] Did I verify the decision warranted a council (not a simple choice)?
|
|
66
|
+
- [ ] Did I document all dissenting opinions (never suppressed)?
|
|
67
|
+
- [ ] Did I write the council verdict to .planning/decisions/?
|
|
68
|
+
- [ ] Did I remind the user that council is advisory (user has final say)?
|
|
@@ -0,0 +1,102 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: doc-health-audit
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.0.3
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: documentation audit, doc health, stale docs, outdated documentation, README outdated, doc maintenance, documentation drift, claim validation, doc review, doc coverage, broken links
|
|
7
|
+
compose:
|
|
8
|
+
- documentation
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
# Skill — Documentation Health Audit
|
|
12
|
+
|
|
13
|
+
## When this skill activates
|
|
14
|
+
When reviewing documentation quality, checking for staleness, validating
|
|
15
|
+
code references in docs, or performing periodic maintenance on project documentation.
|
|
16
|
+
|
|
17
|
+
## Mandatory actions when this skill is active
|
|
18
|
+
|
|
19
|
+
### Before auditing
|
|
20
|
+
1. Identify all documentation files in the project (README, docs/, CHANGELOG, API docs, etc.)
|
|
21
|
+
2. Note the last modification date of each doc file
|
|
22
|
+
3. Note recent code changes that SHOULD have triggered doc updates
|
|
23
|
+
|
|
24
|
+
### The 5-Point Health Check
|
|
25
|
+
|
|
26
|
+
**1. Claim Validation**
|
|
27
|
+
For every factual claim in docs (code examples, API signatures, file paths):
|
|
28
|
+
- Verify the referenced code/file ACTUALLY EXISTS
|
|
29
|
+
- Verify code examples COMPILE and RUN correctly
|
|
30
|
+
- Verify API signatures match current implementation
|
|
31
|
+
- Flag any claim that cannot be verified as "UNVERIFIED"
|
|
32
|
+
|
|
33
|
+
**2. Staleness Detection**
|
|
34
|
+
A doc is STALE if:
|
|
35
|
+
- Code it references has changed since the doc was last updated
|
|
36
|
+
- More than 20 commits have touched referenced files without doc update
|
|
37
|
+
- It references deprecated APIs, removed features, or old patterns
|
|
38
|
+
- Version numbers or counts are outdated
|
|
39
|
+
|
|
40
|
+
**3. Coverage Gaps**
|
|
41
|
+
Check for undocumented areas:
|
|
42
|
+
- Public APIs without usage examples
|
|
43
|
+
- Features without user-facing documentation
|
|
44
|
+
- Error codes without explanation
|
|
45
|
+
- Configuration options without description
|
|
46
|
+
- New files/modules without architectural context
|
|
47
|
+
|
|
48
|
+
**4. Consistency Check**
|
|
49
|
+
Across all docs, verify:
|
|
50
|
+
- Version numbers are consistent (README vs package.json vs CHANGELOG)
|
|
51
|
+
- Naming is consistent (same feature called the same thing everywhere)
|
|
52
|
+
- Instructions don't contradict each other
|
|
53
|
+
- Links between docs are not broken (internal cross-references)
|
|
54
|
+
|
|
55
|
+
**5. Maintenance Scoring**
|
|
56
|
+
Score each doc file 0-10:
|
|
57
|
+
- 9-10: Current, accurate, comprehensive
|
|
58
|
+
- 6-8: Minor issues (outdated example, missing edge case)
|
|
59
|
+
- 3-5: Significant staleness (multiple outdated references)
|
|
60
|
+
- 0-2: Dangerously outdated (actively misleading)
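For report generation, the per-file scores need to be bucketed and averaged. The sketch below uses the three groupings from the report template in this skill (Critical 0-2, Stale 3-5, Healthy 6+); the function names and the example file names in the usage note are hypothetical.

```python
def health_band(score: int) -> str:
    """Bucket a 0-10 maintenance score into the report's three groups."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= 2:
        return "critical"
    if score <= 5:
        return "stale"
    return "healthy"


def overall_score(scores: dict[str, int]) -> float:
    """Average across all audited files, as used for the report's overall score."""
    if not scores:
        raise ValueError("no files were audited")
    return sum(scores.values()) / len(scores)
```

For example, `overall_score({"README.md": 9, "docs/setup.md": 4, "CHANGELOG.md": 2})` averages three files into a single figure, while `health_band` decides which section of the report each file is listed under.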
|
|
61
|
+
|
|
62
|
+
### Output Format
|
|
63
|
+
|
|
64
|
+
Write to `.planning/DOC-HEALTH-REPORT-[timestamp].md`:
|
|
65
|
+
```markdown
|
|
66
|
+
# Documentation Health Report
|
|
67
|
+
Date: [timestamp]
|
|
68
|
+
Files audited: [count]
|
|
69
|
+
Overall score: [average across all files]
|
|
70
|
+
|
|
71
|
+
## Critical Findings (score 0-2)
|
|
72
|
+
| File | Issue | Impact |
|
|
73
|
+
...
|
|
74
|
+
|
|
75
|
+
## Stale Docs (score 3-5)
|
|
76
|
+
| File | Last Updated | Code Changes Since | Top Issue |
|
|
77
|
+
...
|
|
78
|
+
|
|
79
|
+
## Healthy Docs (score 6+)
|
|
80
|
+
| File | Score | Minor Issues |
|
|
81
|
+
...
|
|
82
|
+
|
|
83
|
+
## Coverage Gaps
|
|
84
|
+
- [undocumented area 1]
|
|
85
|
+
- [undocumented area 2]
|
|
86
|
+
|
|
87
|
+
## Recommendations (prioritized)
|
|
88
|
+
1. [highest impact fix]
|
|
89
|
+
2. ...
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
### After audit
|
|
93
|
+
- Create action items for all Critical findings (score 0-2)
|
|
94
|
+
- Suggest specific fixes for Stale docs
|
|
95
|
+
- Log audit summary in AUDIT
|
|
96
|
+
- Consider: should any finding become an instinct? (e.g., "always update README when adding new commands")
|
|
97
|
+
|
|
98
|
+
## Self-check before task completion
|
|
99
|
+
- [ ] Did I verify code references in docs actually exist?
|
|
100
|
+
- [ ] Did I check code examples compile/run correctly?
|
|
101
|
+
- [ ] Did I produce a DOC-HEALTH-REPORT with per-file scores?
|
|
102
|
+
- [ ] Did I create action items for any Critical findings (score 0-2)?
|
|
@@ -0,0 +1,75 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: multi-llm-consult
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.0.3
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: second opinion, cross-model, multi-model, consult, gemini, codex, GPT, alternative model, consensus, multi-LLM, external model, validation
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Skill — Multi-LLM Consult
|
|
10
|
+
|
|
11
|
+
## When this skill activates
|
|
12
|
+
When seeking a second opinion from external models, validating a decision across
|
|
13
|
+
multiple AI providers, or when the user explicitly requests cross-model consultation.
|
|
14
|
+
|
|
15
|
+
## Mandatory actions when this skill is active
|
|
16
|
+
|
|
17
|
+
### Before consulting external models
|
|
18
|
+
1. **Sanitize the prompt.** NEVER send raw project context to external models.
|
|
19
|
+
- Remove: file paths, internal variable names, proprietary business logic
|
|
20
|
+
- Remove: API keys, secrets, credentials, internal URLs
|
|
21
|
+
- Remove: user PII, customer data, anything covered by data-privacy skill
|
|
22
|
+
- Keep: the abstract question, general patterns, public knowledge references
|
|
23
|
+
2. **Estimate cost.** Each external call costs tokens. Check budget via cost-tracking module.
|
|
24
|
+
3. **Define the question clearly.** Vague questions produce vague answers. Frame as:
|
|
25
|
+
- "Given [sanitized context], which approach is better: A or B? Why?"
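As an illustration of the sanitizing step, the following sketch redacts a few obvious patterns with regular expressions. It is a rough heuristic under stated assumptions, not a complete or safe sanitizer: the patterns, placeholder strings, and the `sanitize` name are all hypothetical, and proprietary business logic or PII in free text would still need separate review.

```python
import re

# Rough heuristics only; order matters (URLs are replaced before bare paths).
REDACTIONS = [
    (re.compile(r"https?://\S+"), "[URL]"),
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),
    (re.compile(r"(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*\S+"),
     r"\1=[REDACTED]"),
    (re.compile(r"(?:\.{0,2}/)?(?:[\w.-]+/)+[\w.-]+"), "[PATH]"),
]


def sanitize(prompt: str) -> str:
    """Replace URLs, emails, key=value secrets, and file paths with placeholders."""
    for pattern, replacement in REDACTIONS:
        prompt = pattern.sub(replacement, prompt)
    return prompt
```

A pattern-based pass like this is best treated as a first filter before a manual check, since it cannot recognize internal names or business context.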
|
|
26
|
+
|
|
27
|
+
### Configured Models
|
|
28
|
+
|
|
29
|
+
| Provider | Model | Best For | Cost Tier |
|
|
30
|
+
|----------|-------|----------|-----------|
|
|
31
|
+
| Anthropic | claude-opus-4-7 | Deep reasoning, architecture | complex |
|
|
32
|
+
| Google | gemini-2.5-pro | Research, long context, web grounding | research |
|
|
33
|
+
| OpenAI | gpt-4o | Alternative perspective, validation | consult |
|
|
34
|
+
|
|
35
|
+
### Consultation Protocol
|
|
36
|
+
|
|
37
|
+
**Single Consult (one external model):**
|
|
38
|
+
1. Sanitize prompt
|
|
39
|
+
2. Send to selected model
|
|
40
|
+
3. Present response with source attribution
|
|
41
|
+
4. Note areas of agreement/disagreement with primary analysis
|
|
42
|
+
|
|
43
|
+
**Consensus Consult (all 3 models):**
|
|
44
|
+
1. Sanitize prompt (same prompt to all)
|
|
45
|
+
2. Send to all configured models in parallel
|
|
46
|
+
3. Analyze responses for:
|
|
47
|
+
- **Agreement** (2+ models recommend same approach): high confidence signal
|
|
48
|
+
- **Divergence** (models disagree): flag for user decision, present all perspectives
|
|
49
|
+
- **Novel insight** (one model raises a point others missed): highlight specifically
|
|
50
|
+
4. Produce synthesis:
|
|
51
|
+
```
|
|
52
|
+
Consensus: [Yes/No/Partial]
|
|
53
|
+
Recommended: [approach]
|
|
54
|
+
Agreement: [which models agree]
|
|
55
|
+
Dissent: [which models disagree and why]
|
|
56
|
+
Novel: [unique insights from individual models]
|
|
57
|
+
```
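The agreement and divergence analysis can be sketched as a simple vote count. Assumptions: each model's answer has already been reduced to a short approach label, "Yes" means all responding models agree, "Partial" means at least two agree, and `synthesize` is a hypothetical helper. Detecting novel insights needs reading the full responses and is not attempted here.

```python
from collections import Counter


def synthesize(recommendations: dict[str, str]) -> dict:
    """Summarize agreement; `recommendations` maps model name -> recommended approach."""
    if not recommendations:
        raise ValueError("no model responses to synthesize")
    top, votes = Counter(recommendations.values()).most_common(1)[0]
    agreed = votes >= 2  # "2+ models recommend same approach"
    unanimous = agreed and votes == len(recommendations)
    return {
        "consensus": "Yes" if unanimous else "Partial" if agreed else "No",
        "recommended": top if agreed else None,
        "agreement": sorted(m for m, r in recommendations.items() if agreed and r == top),
        "dissent": sorted(m for m, r in recommendations.items() if not agreed or r != top),
    }
```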
|
|
58
|
+
|
|
59
|
+
### During consultation
|
|
60
|
+
- Log every external call in token-ledger.jsonl (model, tokens, cost)
|
|
61
|
+
- Never send more than 2000 tokens to external models per consultation
|
|
62
|
+
- If a model is unavailable: skip it, note in output, continue with available models
|
|
63
|
+
- Respect rate limits — max 3 consultations per session
|
|
64
|
+
|
|
65
|
+
### After consultation
|
|
66
|
+
- Present results to user with clear attribution
|
|
67
|
+
- Never auto-execute based on external model recommendations
|
|
68
|
+
- External opinions are ADVISORY — user sovereignty applies
|
|
69
|
+
- Log consultation summary in AUDIT
|
|
70
|
+
|
|
71
|
+
## Self-check before task completion
|
|
72
|
+
- [ ] Did I sanitize the prompt before sending to external models?
|
|
73
|
+
- [ ] Did I log every external call in token-ledger.jsonl?
|
|
74
|
+
- [ ] Did I attribute responses to their source model (no unattributed blending)?
|
|
75
|
+
- [ ] Did I remind the user that external opinions are advisory?
|
|
@@ -0,0 +1,109 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: threat-modeling
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.0.3
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: threat model, STRIDE, attack tree, DREAD, threat surface, trust boundary, attack vector, data flow diagram, DFD, adversary, threat assessment
|
|
7
|
+
compose:
|
|
8
|
+
- security-review
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
# Skill — Threat Modeling
|
|
12
|
+
|
|
13
|
+
## When this skill activates
|
|
14
|
+
When analyzing security threats to a system, component, or feature. When
|
|
15
|
+
constructing attack trees, identifying trust boundaries, or scoring risks
|
|
16
|
+
using structured methodologies.
|
|
17
|
+
|
|
18
|
+
## Mandatory actions when this skill is active
|
|
19
|
+
|
|
20
|
+
### Before threat modeling
|
|
21
|
+
1. Switch to `threat-modeler` persona.
|
|
22
|
+
2. Identify the **scope** — what system/component are we modeling?
|
|
23
|
+
3. Gather the system's **data flow** — how does data move through it?
|
|
24
|
+
4. Identify all **trust boundaries** — where does trust level change?
|
|
25
|
+
|
|
26
|
+
### STRIDE Methodology
|
|
27
|
+
|
|
28
|
+
Apply STRIDE to each trust boundary crossing:
|
|
29
|
+
|
|
30
|
+
| Threat | Question | Example |
|
|
31
|
+
|--------|----------|---------|
|
|
32
|
+
| **S**poofing | Can an attacker pretend to be someone else? | Forged auth tokens, session hijacking |
|
|
33
|
+
| **T**ampering | Can data be modified in transit or at rest? | Man-in-the-middle, DB manipulation |
|
|
34
|
+
| **R**epudiation | Can actions be denied without proof? | Missing audit logs, unsigned transactions |
|
|
35
|
+
| **I**nformation Disclosure | Can sensitive data leak? | Error messages with stack traces, verbose APIs |
|
|
36
|
+
| **D**enial of Service | Can the system be overwhelmed? | Unbounded queries, missing rate limits |
|
|
37
|
+
| **E**levation of Privilege | Can a user gain unauthorized access? | Missing authz checks, IDOR, path traversal |
|
|
38
|
+
|
|
39
|
+
### DREAD Scoring
|
|
40
|
+
|
|
41
|
+
Score each identified threat (1-10 for each dimension):
|
|
42
|
+
|
|
43
|
+
| Dimension | 1 (Low) | 5 (Medium) | 10 (High) |
|
|
44
|
+
|-----------|---------|------------|-----------|
|
|
45
|
+
| **D**amage | Minor inconvenience | Data loss for some users | Full system compromise |
|
|
46
|
+
| **R**eproducibility | Requires rare conditions | Requires specific setup | Anyone can reproduce |
|
|
47
|
+
| **E**xploitability | Requires deep expertise | Requires some skill | Script kiddie level |
|
|
48
|
+
| **A**ffected Users | Single user | Subset of users | All users |
|
|
49
|
+
| **D**iscoverability | Hidden, requires source | Findable with effort | Obvious, publicly known |
|
|
50
|
+
|
|
51
|
+
**Risk Score** = (D + R + E + A + D) / 5
|
|
52
|
+
- Score 1-3: Low risk (monitor)
|
|
53
|
+
- Score 4-6: Medium risk (mitigate within sprint)
|
|
54
|
+
- Score 7-10: High/Critical risk (mitigate IMMEDIATELY)
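The risk score formula and bands above can be expressed as follows. Because the average of five integers is often fractional (for example 3.6), this sketch assumes a score belongs to a band once it reaches that band's lower bound; the function names are hypothetical.

```python
def dread_score(damage: int, reproducibility: int, exploitability: int,
                affected_users: int, discoverability: int) -> float:
    """Risk Score = (D + R + E + A + D) / 5, with each dimension rated 1-10."""
    dims = (damage, reproducibility, exploitability, affected_users, discoverability)
    if any(not 1 <= d <= 10 for d in dims):
        raise ValueError("each DREAD dimension must be between 1 and 10")
    return sum(dims) / 5


def risk_band(score: float) -> str:
    """Band a score; handling of fractional values between bands is an assumption."""
    if score >= 7:
        return "high"    # mitigate immediately
    if score >= 4:
        return "medium"  # mitigate within sprint
    return "low"         # monitor
```

For instance, ratings of 8, 9, 6, 10, and 7 average to 8.0, which falls in the high band and would call for an attack tree.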
|
|
55
|
+
|
|
56
|
+
### Attack Tree Construction
|
|
57
|
+
|
|
58
|
+
For high-scoring threats, build an attack tree:
|
|
59
|
+
```
|
|
60
|
+
[Goal: Unauthorized admin access]
|
|
61
|
+
├── [OR] Steal admin credentials
|
|
62
|
+
│ ├── [AND] Phish admin user
|
|
63
|
+
│ │ ├── Send convincing email
|
|
64
|
+
│ │ └── Capture credentials on fake page
|
|
65
|
+
│ └── [AND] Exploit password reset
|
|
66
|
+
│ ├── Enumerate valid emails
|
|
67
|
+
│ └── Intercept reset token
|
|
68
|
+
└── [OR] Escalate from regular user
|
|
69
|
+
├── [AND] Exploit IDOR
|
|
70
|
+
│ ├── Find predictable resource IDs
|
|
71
|
+
│ └── Access admin endpoints directly
|
|
72
|
+
└── [AND] Exploit role assignment bug
|
|
73
|
+
```
|
|
74
|
+
|
|
75
|
+
### Output Format
|
|
76
|
+
|
|
77
|
+
Write to `.planning/THREAT-MODEL-[component]-[timestamp].md`:
|
|
78
|
+
```markdown
|
|
79
|
+
# Threat Model: [Component]
|
|
80
|
+
Date: [timestamp]
|
|
81
|
+
Scope: [what was analyzed]
|
|
82
|
+
|
|
83
|
+
## Data Flow Diagram
|
|
84
|
+
[ASCII or description of data flow with trust boundaries marked]
|
|
85
|
+
|
|
86
|
+
## Trust Boundaries
|
|
87
|
+
1. [boundary]: [what crosses it]
|
|
88
|
+
|
|
89
|
+
## Identified Threats
|
|
90
|
+
| # | Category | Threat | DREAD Score | Mitigation | Status |
|
|
91
|
+
|---|----------|--------|-------------|-----------|--------|
|
|
92
|
+
|
|
93
|
+
## Attack Trees (for High/Critical threats)
|
|
94
|
+
[Trees for any threat scoring 7+]
|
|
95
|
+
|
|
96
|
+
## Recommendations
|
|
97
|
+
[Prioritized list of mitigations]
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
### After threat modeling
|
|
101
|
+
- Review all High/Critical findings with security-reviewer persona
|
|
102
|
+
- Create action items for each unmitigated threat
|
|
103
|
+
- Log threat model in AUDIT with finding counts by severity
|
|
104
|
+
|
|
105
|
+
## Self-check before task completion
|
|
106
|
+
- [ ] Did I apply STRIDE to ALL trust boundary crossings (not just obvious ones)?
|
|
107
|
+
- [ ] Did I score every identified threat with DREAD?
|
|
108
|
+
- [ ] Did I build attack trees for all threats scoring 7+?
|
|
109
|
+
- [ ] Did I write the threat model to .planning/THREAT-MODEL-[component]-[timestamp].md?
|
|
@@ -0,0 +1,85 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: verification-loop
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.0.3
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: verification, quality gate, build check, type check, full verification, security scan, diff review, pre-merge, CI gate, green build, ship check, verify all
|
|
7
|
+
compose:
|
|
8
|
+
- security-review
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
# Skill — Verification Loop (6-Phase Quality Gate)
|
|
12
|
+
|
|
13
|
+
## When this skill activates
|
|
14
|
+
Before shipping, merging, or marking any implementation task as complete.
|
|
15
|
+
Provides a systematic 6-phase verification pipeline that catches issues
|
|
16
|
+
at the cheapest point to fix them.
|
|
17
|
+
|
|
18
|
+
## Mandatory actions when this skill is active
|
|
19
|
+
|
|
20
|
+
### The 6 Phases (sequential, fail-fast)
|
|
21
|
+
|
|
22
|
+
**Phase 1 — Build**
|
|
23
|
+
```bash
|
|
24
|
+
npm run build # or equivalent for project
|
|
25
|
+
```
|
|
26
|
+
- Must produce zero errors
|
|
27
|
+
- Warnings are acceptable but logged
|
|
28
|
+
- If build fails: fix before proceeding (no skipping)
|
|
29
|
+
|
|
30
|
+
**Phase 2 — Type Check**
|
|
31
|
+
```bash
|
|
32
|
+
npx tsc --noEmit # or equivalent
|
|
33
|
+
```
|
|
34
|
+
- All type errors must be resolved
|
|
35
|
+
- No `@ts-ignore` without documented reason
|
|
36
|
+
- Generic `any` types flagged as warnings
|
|
37
|
+
|
|
38
|
+
**Phase 3 — Lint**
|
|
39
|
+
```bash
|
|
40
|
+
npm run lint
|
|
41
|
+
```
|
|
42
|
+
- All lint errors must be resolved
|
|
43
|
+
- Auto-fixable issues: fix immediately (`--fix`)
|
|
44
|
+
- Non-auto-fixable: resolve manually before proceeding
|
|
45
|
+
|
|
46
|
+
**Phase 4 — Test**
|
|
47
|
+
```bash
|
|
48
|
+
npm test
|
|
49
|
+
```
|
|
50
|
+
- All tests must pass (zero failures)
|
|
51
|
+
- No skipped tests without documented reason
|
|
52
|
+
- If new code was added: verify test coverage exists
|
|
53
|
+
- Coverage threshold: per project config (default 80%)
|
|
54
|
+
|
|
55
|
+
**Phase 5 — Security Scan**
|
|
56
|
+
- Check for hardcoded secrets (grep for API keys, passwords, tokens)
|
|
57
|
+
- Check for new dependencies with known vulnerabilities
|
|
58
|
+
- Verify no dangerous code execution patterns (dynamic code evaluation, innerHTML, SQL concatenation) introduced
|
|
59
|
+
- Run security-review skill checklist on the diff
|
|
60
|
+
|
|
61
|
+
**Phase 6 — Diff Review**
|
|
62
|
+
```bash
|
|
63
|
+
git diff --staged # or git diff main...HEAD
|
|
64
|
+
```
|
|
65
|
+
- Review every changed line for:
|
|
66
|
+
- Accidental debug code (console.log, debugger, TODO)
|
|
67
|
+
- Unintended file changes (lock files, configs)
|
|
68
|
+
- Sensitive data exposure
|
|
69
|
+
- Logic errors visible in the diff
|
|
70
|
+
|
|
71
|
+
### Execution Rules
|
|
72
|
+
- Phases execute in ORDER (1→2→3→4→5→6)
|
|
73
|
+
- Each phase must PASS before the next begins (fail-fast)
|
|
74
|
+
- On failure: report which phase failed, what the error is, suggest fix
|
|
75
|
+
- After fixing: restart from the FAILED phase (not from Phase 1)
|
|
76
|
+
- All 6 phases passing = "green" verification = safe to ship
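The execution rules describe a sequential, fail-fast pipeline that resumes from the failed phase. A minimal sketch, assuming each phase is wrapped in a callable that returns True on success; the phase identifiers and `run_pipeline` are hypothetical names, and a real runner would invoke the project's own build, lint, and test commands.

```python
from typing import Callable, Optional

PHASES = ["build", "type-check", "lint", "test", "security-scan", "diff-review"]


def run_pipeline(checks: dict[str, Callable[[], bool]],
                 start: str = PHASES[0]) -> Optional[str]:
    """Run phases in order from `start`; return the first failing phase, or None if green."""
    for phase in PHASES[PHASES.index(start):]:
        if not checks[phase]():
            return phase  # fail-fast: later phases are not run
    return None
```

After a fix, calling `run_pipeline(checks, start=failed_phase)` restarts from the phase that failed rather than from Phase 1, matching the rule above.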
|
|
77
|
+
|
|
78
|
+
### Integration
|
|
79
|
+
- `/mindforge:verify-loop` invokes this full pipeline
|
|
80
|
+
- `/mindforge:ship` MUST run verify-loop before proceeding
|
|
81
|
+
- Autonomous mode runs verify-loop after every task (Phases 4, 5, and 6 at minimum)
|
|
82
|
+
|
|
83
|
+
### After verification passes
|
|
84
|
+
- Log verification result in AUDIT with per-phase timing
|
|
85
|
+
- Report: "All 6 verification gates passed. Safe to proceed."
|
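The execution rules in the verification-loop skill above (ordered phases, fail-fast, restart from the failed phase) can be sketched as a small runner. This is a minimal illustration, not the package's implementation: `run_verify_loop`, `make_phase`, and the placeholder names `phase-1` to `phase-3` are hypothetical, and each check is a stub standing in for the real command (for example `npm test` in Phase 4).

```python
from typing import Callable, Dict, List, Optional, Tuple

# A phase is a (name, check) pair; the check returns True when the phase passes.
Phase = Tuple[str, Callable[[], bool]]


def run_verify_loop(phases: List[Phase], start_at: int = 0) -> Optional[int]:
    """Run phases in order from start_at (fail-fast).

    Returns the index of the first failing phase, or None if every phase passes.
    """
    for index in range(start_at, len(phases)):
        name, check = phases[index]
        if not check():
            print(f"Phase {index + 1} ({name}) failed")
            return index
    print(f"All {len(phases)} verification gates passed. Safe to proceed.")
    return None


# Demo with stub checks; `calls` records which phases actually ran.
calls: List[str] = []
tests_fixed: Dict[str, bool] = {"value": False}


def make_phase(name: str, result: Callable[[], bool]) -> Phase:
    """Hypothetical helper: wrap a stub result so each run is recorded."""
    def check() -> bool:
        calls.append(name)
        return result()
    return (name, check)


phases = [
    make_phase("phase-1", lambda: True),  # placeholder name (assumption)
    make_phase("phase-2", lambda: True),  # placeholder name (assumption)
    make_phase("phase-3", lambda: True),  # placeholder name (assumption)
    make_phase("test", lambda: tests_fixed["value"]),
    make_phase("security-scan", lambda: True),
    make_phase("diff-review", lambda: True),
]

first_failure = run_verify_loop(phases)  # stops at the Test phase
tests_fixed["value"] = True              # simulate fixing the failing tests
second_run = run_verify_loop(phases, start_at=first_failure)  # resumes at Test
```

Passing the failed index back as `start_at` mirrors the "restart from the FAILED phase (not from Phase 1)" rule: earlier phases are not re-run after a fix.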
package/CHANGELOG.md
CHANGED
|
@@ -1,15 +1,34 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
-
## [10.0.
|
|
3
|
+
## [10.0.3] - 2026-05-25 — "Council Awakens"
|
|
4
|
+
|
|
5
|
+
### Added (v10.0.3)
|
|
6
|
+
|
|
7
|
+
- **10 new core skills** — agent-loops, multi-llm-consult, continuous-learning, council, verification-loop, threat-modeling, autonomous-loops, agent-introspection-debugging, cost-aware-routing, doc-health-audit.
|
|
8
|
+
- **8 new commands** — `/mindforge:council`, `/mindforge:consult`, `/mindforge:verify-loop`, `/mindforge:introspect`, `/mindforge:cost-report`, `/mindforge:threat-model`, `/mindforge:learn-instinct`, `/mindforge:evolve-skills`.
|
|
9
|
+
- **9 new personas** — cost-optimizer, threat-modeler, council-architect, council-skeptic, council-pragmatist, council-critic, instinct-curator, doc-auditor, multi-model-bridge.
|
|
10
|
+
- **3 new swarm templates** — CouncilSwarm (HITL decision gate), VerificationSwarm (autonomous quality gates), LearningSwarm (instinct management).
|
|
11
|
+
- **Skill Composition System** — Skills can now declare dependencies on other skills via `compose:` frontmatter field. Composed skills are injected as summaries (max 2-level depth, cycle detection).
|
|
12
|
+
- **Instinct Engine** — Auto-capture learned behaviors with confidence scoring. Instincts auto-promote to skills at 0.85 confidence after 5+ successful applications. Project-scoped, max 100 per project.
|
|
13
|
+
- **Cost Tracking Module** — Token budgeting, 5-tier model routing (Haiku/Sonnet/Opus/Gemini/GPT-4o), spend analytics via token-ledger.jsonl.
|
|
14
|
+
- **Council Framework** — 4-voice decision harness (Architect, Skeptic, Pragmatist, Critic) with weighted consensus scoring, dissent documentation, and 5 pre-built templates.
|
|
15
|
+
- **Cross-Iteration Bridge** — SHARED_TASK_NOTES.md for semantic context persistence across autonomous mode iterations (complements HANDOFF.json).
|
|
16
|
+
- **Swarm templates v6.0.0** — Bump from v5.0.0 with 3 new templates (total: 21 swarm templates).
|
|
17
|
+
- **claude-opus-4-7** added to market_registry in config.json.
|
|
18
|
+
- **Loader composition step** — New Step 4.1 in skill loader for resolving `compose:` dependencies.
|
|
4
19
|
|
|
5
|
-
|
|
20
|
+
---
|
|
21
|
+
|
|
22
|
+
## [10.0.2] - 2026-05-24 — "Persona Expansion"
|
|
23
|
+
|
|
24
|
+
### Added (v10.0.2)
|
|
6
25
|
|
|
7
26
|
- **47 new specialist personas** — Security & Compliance (6), Architecture & System Design (8), Frontend & UX (10), Performance & Reliability (5), Data & ML (4), DevOps & Infrastructure (6), Language Specialists (5), DX & Operations (8), Strategy & Review (9), Specialized Engineering (10).
|
|
8
27
|
- **6 new swarm templates** — ArchitectureSwarm, PerformanceSwarm, InfrastructureSwarm, AccessibilitySwarm, ReviewSwarm, MigrationSwarm.
|
|
9
28
|
- **Updated existing swarms** — UISwarm, BackendSwarm, SecuritySwarm, DeveloperExperienceSwarm, DataMeshSwarm, IncidentResponseSwarm, ComplianceSwarm, QualityAssuranceSwarm now include relevant new specialist personas.
|
|
10
29
|
- **Swarm templates v5.0.0** — Bump from 4.2.0 with expanded member rosters and new specialist orchestration patterns.
|
|
11
30
|
- **Persona registry documentation** — 10 new category tables in `docs/registry/PERSONAS.md`.
|
|
12
|
-
- **Persona reference guide** — Updated `docs/PERSONAS.md` with all 47 entries (total:
|
|
31
|
+
- **Persona reference guide** — Updated `docs/PERSONAS.md` with all 47 entries (total: 108 personas).
|
|
13
32
|
|
|
14
33
|
---
|
|
15
34
|
|
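The Skill Composition entry in the changelog above mentions a `compose:` field with a 2-level depth limit and cycle detection. One way such resolution could work is sketched below. Everything here is an assumption made for illustration: the function `resolve_compose`, the in-memory `registry` mapping, the dependency relations between skills, and the choice to skip (rather than reject) dependencies deeper than two levels. The actual loader behaviour is not shown in this diff.

```python
from typing import Dict, List

MAX_DEPTH = 2  # taken from the changelog wording: "max 2-level depth"


class CompositionCycleError(ValueError):
    """Raised when a skill composes itself, directly or transitively."""


def resolve_compose(skill: str, registry: Dict[str, List[str]]) -> List[str]:
    """Return skills composed by `skill` within MAX_DEPTH levels, in first-seen order."""
    resolved: List[str] = []

    def visit(name: str, depth: int, path: List[str]) -> None:
        if depth > MAX_DEPTH:
            return  # assumption: deeper levels are skipped, not treated as errors
        for dep in registry.get(name, []):
            if dep in path:
                raise CompositionCycleError(" -> ".join(path + [dep]))
            if dep not in resolved:
                resolved.append(dep)
            visit(dep, depth + 1, path + [dep])

    visit(skill, 1, [skill])
    return resolved


# Hypothetical compose relations; only the skill names come from the changelog.
registry = {
    "autonomous-loops": ["verification-loop", "cost-aware-routing"],
    "verification-loop": ["threat-modeling"],
    "threat-modeling": ["council"],  # third level: not reached from autonomous-loops
}

composed = resolve_compose("autonomous-loops", registry)
```

In this sketch `council` is left out because it sits three levels below `autonomous-loops`. A cycle that only closes beyond the depth limit would also go unnoticed, so a stricter loader might check for cycles before applying the limit.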
package/MINDFORGE.md
CHANGED
|
@@ -1,12 +1,12 @@
|
|
|
1
|
-
# MINDFORGE.md — Parameter Registry (v10.0.
|
|
1
|
+
# MINDFORGE.md — Parameter Registry (v10.0.3)
|
|
2
2
|
|
|
3
3
|
## 1. IDENTITY & VERSIONING
|
|
4
4
|
|
|
5
5
|
[NAME] = MindForge
|
|
6
|
-
[VERSION] = 10.0.
|
|
6
|
+
[VERSION] = 10.0.3-COUNCIL
|
|
7
7
|
[STABLE] = true
|
|
8
|
-
[MODE] = \"
|
|
9
|
-
[REQUIRED_CORE_VERSION] = 10.0.
|
|
8
|
+
[MODE] = \"Council Awakens\"
|
|
9
|
+
[REQUIRED_CORE_VERSION] = 10.0.3
|
|
10
10
|
[SOVEREIGN_IDENTITY] = true
|
|
11
11
|
[SRE_LAYER_ENABLED] = true
|
|
12
12
|
|
package/README.md
CHANGED
|
@@ -4,9 +4,9 @@
|
|
|
4
4
|
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
## v10.0.
|
|
7
|
+
## v10.0.3 — Council Awakens
|
|
8
8
|
|
|
9
|
-
MindForge
|
|
9
|
+
MindForge v10.0.3 "Council Awakens" introduces the Council decision framework, Instinct Engine, Cost-Aware Routing, 6-phase Verification Loop, and Multi-LLM Consult. This release adds 10 skills (20 core total), 8 commands (71 total), 9 personas (117 total), 3 swarm templates (21 total), and 5 engine subsystems — expanding MindForge's autonomous governance and multi-agent reasoning capabilities.
|
|
10
10
|
|
|
11
11
|
|
|
12
12
|
## Installation & Setup
|