mindforge-cc 10.0.2 → 10.7.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.mindforge/config.json +73 -2
- package/.mindforge/engine/autonomous/cross-iteration-bridge.md +96 -0
- package/.mindforge/engine/cost-tracking/budget-enforcer.md +68 -0
- package/.mindforge/engine/cost-tracking/router.md +58 -0
- package/.mindforge/engine/cost-tracking/token-ledger.md +77 -0
- package/.mindforge/engine/council/council-protocol.md +96 -0
- package/.mindforge/engine/council/council-templates.md +85 -0
- package/.mindforge/engine/council/synthesis-engine.md +71 -0
- package/.mindforge/engine/cross-model-eval.md +74 -0
- package/.mindforge/engine/instincts/capture-engine.md +63 -0
- package/.mindforge/engine/instincts/instinct-schema.md +76 -0
- package/.mindforge/engine/instincts/promotion-engine.md +77 -0
- package/.mindforge/engine/proactive/signal-detector.md +60 -0
- package/.mindforge/engine/proactive/suggestion-engine.md +100 -0
- package/.mindforge/engine/skills/composition.md +83 -0
- package/.mindforge/engine/skills/loader.md +16 -0
- package/.mindforge/personas/agent-architect.md +57 -0
- package/.mindforge/personas/agent-evaluator.md +162 -0
- package/.mindforge/personas/agent-memory-designer.md +157 -0
- package/.mindforge/personas/agent-ops-engineer.md +120 -0
- package/.mindforge/personas/agent-orchestrator.md +112 -0
- package/.mindforge/personas/ai-economist.md +57 -0
- package/.mindforge/personas/ai-safety-engineer.md +57 -0
- package/.mindforge/personas/analytics-engineer.md +57 -0
- package/.mindforge/personas/anti-pattern-hunter.md +61 -0
- package/.mindforge/personas/api-gateway-designer.md +132 -0
- package/.mindforge/personas/auth-engineer.md +112 -0
- package/.mindforge/personas/build-engineer.md +57 -0
- package/.mindforge/personas/business-analyst.md +56 -0
- package/.mindforge/personas/cache-architect.md +100 -0
- package/.mindforge/personas/causal-scientist.md +57 -0
- package/.mindforge/personas/cdn-architect.md +118 -0
- package/.mindforge/personas/change-agent.md +104 -0
- package/.mindforge/personas/code-narrator.md +52 -0
- package/.mindforge/personas/codegen-specialist.md +68 -0
- package/.mindforge/personas/communication-architect.md +102 -0
- package/.mindforge/personas/compliance-engineer.md +96 -0
- package/.mindforge/personas/consensus-engineer.md +116 -0
- package/.mindforge/personas/contract-tester.md +60 -192
- package/.mindforge/personas/cost-optimizer.md +71 -0
- package/.mindforge/personas/council-architect.md +66 -0
- package/.mindforge/personas/council-critic.md +67 -0
- package/.mindforge/personas/council-pragmatist.md +71 -0
- package/.mindforge/personas/council-skeptic.md +73 -0
- package/.mindforge/personas/data-architect.md +108 -0
- package/.mindforge/personas/data-mesh-architect.md +57 -0
- package/.mindforge/personas/data-pipeline-architect.md +120 -0
- package/.mindforge/personas/de-sloppifier.md +60 -0
- package/.mindforge/personas/debt-manager.md +66 -0
- package/.mindforge/personas/decision-architect.md +82 -51
- package/.mindforge/personas/deployment-captain.md +74 -0
- package/.mindforge/personas/design-system-lead.md +112 -0
- package/.mindforge/personas/dmux-orchestrator.md +75 -0
- package/.mindforge/personas/doc-auditor.md +84 -0
- package/.mindforge/personas/dx-engineer.md +96 -0
- package/.mindforge/personas/ecommerce-engineer.md +57 -0
- package/.mindforge/personas/edge-engineer.md +94 -0
- package/.mindforge/personas/edtech-architect.md +106 -0
- package/.mindforge/personas/embedding-architect.md +57 -0
- package/.mindforge/personas/environment-engineer.md +57 -0
- package/.mindforge/personas/eval-judge.md +55 -0
- package/.mindforge/personas/event-architect.md +102 -0
- package/.mindforge/personas/experiment-designer.md +138 -0
- package/.mindforge/personas/feature-store-engineer.md +57 -0
- package/.mindforge/personas/finops-analyst.md +66 -0
- package/.mindforge/personas/fintech-architect.md +57 -0
- package/.mindforge/personas/flutter-engineer.md +104 -0
- package/.mindforge/personas/gaming-engineer.md +57 -0
- package/.mindforge/personas/graphql-designer.md +73 -0
- package/.mindforge/personas/healthcare-engineer.md +57 -0
- package/.mindforge/personas/hiring-strategist.md +105 -0
- package/.mindforge/personas/hitl-architect.md +165 -0
- package/.mindforge/personas/i18n-architect.md +69 -0
- package/.mindforge/personas/instinct-curator.md +83 -0
- package/.mindforge/personas/iot-architect.md +105 -0
- package/.mindforge/personas/knowledge-curator.md +139 -0
- package/.mindforge/personas/knowledge-engineer.md +57 -0
- package/.mindforge/personas/lakehouse-architect.md +57 -0
- package/.mindforge/personas/llm-orchestrator.md +57 -0
- package/.mindforge/personas/logistics-architect.md +106 -0
- package/.mindforge/personas/market-analyst.md +53 -0
- package/.mindforge/personas/marketplace-engineer.md +105 -0
- package/.mindforge/personas/mcp-designer.md +54 -0
- package/.mindforge/personas/meeting-designer.md +104 -0
- package/.mindforge/personas/mentorship-lead.md +106 -0
- package/.mindforge/personas/migration-architect.md +57 -0
- package/.mindforge/personas/ml-ops-engineer.md +101 -0
- package/.mindforge/personas/mobile-architect.md +105 -0
- package/.mindforge/personas/mobile-security-engineer.md +106 -0
- package/.mindforge/personas/multi-model-bridge.md +86 -0
- package/.mindforge/personas/multi-tenancy-architect.md +71 -0
- package/.mindforge/personas/multimodal-engineer.md +57 -0
- package/.mindforge/personas/offline-specialist.md +105 -0
- package/.mindforge/personas/onboarding-navigator.md +63 -0
- package/.mindforge/personas/payments-engineer.md +135 -0
- package/.mindforge/personas/pipeline-engineer.md +115 -0
- package/.mindforge/personas/platform-engineer.md +97 -0
- package/.mindforge/personas/platform-lead.md +57 -0
- package/.mindforge/personas/privacy-engineer.md +57 -0
- package/.mindforge/personas/product-owner.md +56 -0
- package/.mindforge/personas/productivity-analyst.md +57 -0
- package/.mindforge/personas/prompt-architect.md +101 -0
- package/.mindforge/personas/proofreader.md +53 -0
- package/.mindforge/personas/pwa-architect.md +105 -0
- package/.mindforge/personas/quality-scorer.md +63 -0
- package/.mindforge/personas/react-native-engineer.md +106 -0
- package/.mindforge/personas/resilience-engineer.md +69 -0
- package/.mindforge/personas/rfc-architect.md +64 -0
- package/.mindforge/personas/saga-orchestrator.md +80 -0
- package/.mindforge/personas/secrets-engineer.md +57 -0
- package/.mindforge/personas/skill-smith.md +79 -0
- package/.mindforge/personas/sre-lead.md +107 -0
- package/.mindforge/personas/stream-engineer.md +57 -0
- package/.mindforge/personas/streaming-engineer.md +64 -0
- package/.mindforge/personas/swarm-templates.json +695 -38
- package/.mindforge/personas/system-designer.md +57 -0
- package/.mindforge/personas/team-coach.md +120 -0
- package/.mindforge/personas/tech-lead-coach.md +103 -0
- package/.mindforge/personas/technical-writer-lead.md +111 -0
- package/.mindforge/personas/threat-modeler.md +82 -0
- package/.mindforge/personas/vibe-checker.md +75 -0
- package/.mindforge/personas/worktree-manager.md +56 -0
- package/.mindforge/personas/zero-trust-engineer.md +113 -0
- package/.mindforge/skills/a11y-testing/SKILL.md +143 -0
- package/.mindforge/skills/agent-evaluation-framework/SKILL.md +227 -0
- package/.mindforge/skills/agent-introspection-debugging/SKILL.md +88 -0
- package/.mindforge/skills/agent-loops/SKILL.md +84 -0
- package/.mindforge/skills/agent-memory-design/SKILL.md +199 -0
- package/.mindforge/skills/agent-orchestration-patterns/SKILL.md +129 -0
- package/.mindforge/skills/agent-tool-selection/SKILL.md +204 -0
- package/.mindforge/skills/ai-agent-deployment/SKILL.md +176 -0
- package/.mindforge/skills/ai-cost-management/SKILL.md +57 -0
- package/.mindforge/skills/ai-safety-alignment/SKILL.md +53 -0
- package/.mindforge/skills/analytics-instrumentation/SKILL.md +172 -0
- package/.mindforge/skills/api-gateway-patterns/SKILL.md +177 -0
- package/.mindforge/skills/api-marketplace/SKILL.md +56 -0
- package/.mindforge/skills/api-versioning/SKILL.md +100 -0
- package/.mindforge/skills/app-store-deployment/SKILL.md +44 -0
- package/.mindforge/skills/architecture-tradeoff-analysis/SKILL.md +97 -0
- package/.mindforge/skills/audit-logging/SKILL.md +140 -0
- package/.mindforge/skills/auth-patterns/SKILL.md +148 -0
- package/.mindforge/skills/autonomous-agent-harness/SKILL.md +218 -0
- package/.mindforge/skills/autonomous-agents/SKILL.md +59 -0
- package/.mindforge/skills/autonomous-loops/SKILL.md +105 -0
- package/.mindforge/skills/build-system-optimization/SKILL.md +54 -0
- package/.mindforge/skills/build-vs-buy/SKILL.md +80 -0
- package/.mindforge/skills/bundle-optimization/SKILL.md +174 -0
- package/.mindforge/skills/business-analyst/SKILL.md +82 -0
- package/.mindforge/skills/caching-strategies/SKILL.md +132 -0
- package/.mindforge/skills/capacity-planning/SKILL.md +96 -0
- package/.mindforge/skills/causal-inference/SKILL.md +42 -0
- package/.mindforge/skills/cdn-optimization/SKILL.md +212 -0
- package/.mindforge/skills/change-management/SKILL.md +106 -0
- package/.mindforge/skills/chaos-engineering/SKILL.md +99 -0
- package/.mindforge/skills/ci-cd-pipeline/SKILL.md +118 -0
- package/.mindforge/skills/cli-design/SKILL.md +118 -0
- package/.mindforge/skills/code-generation-patterns/SKILL.md +92 -0
- package/.mindforge/skills/code-review-methodology/SKILL.md +180 -0
- package/.mindforge/skills/code-tour/SKILL.md +145 -0
- package/.mindforge/skills/codebase-onboarding/SKILL.md +95 -0
- package/.mindforge/skills/compliance-as-code/SKILL.md +195 -0
- package/.mindforge/skills/conflict-resolution/SKILL.md +87 -0
- package/.mindforge/skills/connection-pooling/SKILL.md +151 -0
- package/.mindforge/skills/container-security/SKILL.md +151 -0
- package/.mindforge/skills/context-engineering/SKILL.md +114 -0
- package/.mindforge/skills/continuous-learning/SKILL.md +84 -0
- package/.mindforge/skills/contract-testing/SKILL.md +85 -0
- package/.mindforge/skills/cost-aware-routing/SKILL.md +83 -0
- package/.mindforge/skills/cost-estimation/SKILL.md +82 -0
- package/.mindforge/skills/council/SKILL.md +68 -0
- package/.mindforge/skills/cqrs-event-sourcing/SKILL.md +95 -0
- package/.mindforge/skills/cross-platform-testing/SKILL.md +43 -0
- package/.mindforge/skills/data-governance/SKILL.md +42 -0
- package/.mindforge/skills/data-lakehouse/SKILL.md +42 -0
- package/.mindforge/skills/data-mesh/SKILL.md +42 -0
- package/.mindforge/skills/data-modeling/SKILL.md +107 -0
- package/.mindforge/skills/data-pipeline-design/SKILL.md +171 -0
- package/.mindforge/skills/data-privacy-engineering/SKILL.md +42 -0
- package/.mindforge/skills/database-performance/SKILL.md +174 -0
- package/.mindforge/skills/database-sharding-advanced/SKILL.md +206 -0
- package/.mindforge/skills/de-sloppify/SKILL.md +120 -0
- package/.mindforge/skills/defense-in-depth/SKILL.md +84 -0
- package/.mindforge/skills/delegation-patterns/SKILL.md +123 -0
- package/.mindforge/skills/dependency-management/SKILL.md +94 -0
- package/.mindforge/skills/deployment-workflow/SKILL.md +135 -0
- package/.mindforge/skills/design-system/SKILL.md +113 -0
- package/.mindforge/skills/developer-onboarding/SKILL.md +99 -0
- package/.mindforge/skills/developer-productivity-metrics/SKILL.md +59 -0
- package/.mindforge/skills/distributed-consensus/SKILL.md +141 -0
- package/.mindforge/skills/dmux-workflows/SKILL.md +141 -0
- package/.mindforge/skills/dns-architecture/SKILL.md +167 -0
- package/.mindforge/skills/doc-health-audit/SKILL.md +102 -0
- package/.mindforge/skills/ecommerce-architecture/SKILL.md +41 -0
- package/.mindforge/skills/edge-computing/SKILL.md +91 -0
- package/.mindforge/skills/edtech-platform/SKILL.md +41 -0
- package/.mindforge/skills/email-deliverability/SKILL.md +177 -0
- package/.mindforge/skills/embedding-systems/SKILL.md +55 -0
- package/.mindforge/skills/environment-management/SKILL.md +54 -0
- package/.mindforge/skills/error-handling-architecture/SKILL.md +118 -0
- package/.mindforge/skills/estimation-techniques/SKILL.md +113 -0
- package/.mindforge/skills/eval-harness/SKILL.md +180 -0
- package/.mindforge/skills/event-driven-architecture/SKILL.md +162 -0
- package/.mindforge/skills/experiment-design/SKILL.md +139 -0
- package/.mindforge/skills/experiment-platform/SKILL.md +43 -0
- package/.mindforge/skills/feature-engineering/SKILL.md +42 -0
- package/.mindforge/skills/feature-flag-management/SKILL.md +183 -0
- package/.mindforge/skills/fine-tuning-workflow/SKILL.md +189 -0
- package/.mindforge/skills/fintech-patterns/SKILL.md +41 -0
- package/.mindforge/skills/flutter-architecture/SKILL.md +42 -0
- package/.mindforge/skills/gaming-backend/SKILL.md +41 -0
- package/.mindforge/skills/git-workflow-design/SKILL.md +129 -0
- package/.mindforge/skills/graceful-degradation/SKILL.md +95 -0
- package/.mindforge/skills/graphql-patterns/SKILL.md +243 -0
- package/.mindforge/skills/guardrails-and-safety/SKILL.md +137 -0
- package/.mindforge/skills/healthcare-systems/SKILL.md +40 -0
- package/.mindforge/skills/hiring-engineering/SKILL.md +119 -0
- package/.mindforge/skills/human-in-the-loop-design/SKILL.md +234 -0
- package/.mindforge/skills/i18n-architecture/SKILL.md +147 -0
- package/.mindforge/skills/idempotency-patterns/SKILL.md +84 -0
- package/.mindforge/skills/incident-communication/SKILL.md +96 -0
- package/.mindforge/skills/incident-management/SKILL.md +97 -0
- package/.mindforge/skills/infrastructure-as-code/SKILL.md +98 -0
- package/.mindforge/skills/instinct-clustering/SKILL.md +190 -0
- package/.mindforge/skills/internal-developer-platform/SKILL.md +51 -0
- package/.mindforge/skills/iot-platform/SKILL.md +41 -0
- package/.mindforge/skills/k8s-deployment/SKILL.md +358 -0
- package/.mindforge/skills/knowledge-graphs/SKILL.md +56 -0
- package/.mindforge/skills/knowledge-sharing-systems/SKILL.md +112 -0
- package/.mindforge/skills/llm-cost-optimization/SKILL.md +198 -0
- package/.mindforge/skills/llm-orchestration/SKILL.md +56 -0
- package/.mindforge/skills/load-testing/SKILL.md +84 -0
- package/.mindforge/skills/logistics-optimization/SKILL.md +40 -0
- package/.mindforge/skills/market-researcher/SKILL.md +99 -0
- package/.mindforge/skills/marketplace-trust/SKILL.md +40 -0
- package/.mindforge/skills/mcp-server-patterns/SKILL.md +264 -0
- package/.mindforge/skills/media-streaming/SKILL.md +41 -0
- package/.mindforge/skills/meeting-architecture/SKILL.md +146 -0
- package/.mindforge/skills/mentoring-patterns/SKILL.md +77 -0
- package/.mindforge/skills/microservices-patterns/SKILL.md +83 -0
- package/.mindforge/skills/migration-platform/SKILL.md +61 -0
- package/.mindforge/skills/migration-strategies/SKILL.md +129 -0
- package/.mindforge/skills/ml-feature-store/SKILL.md +56 -0
- package/.mindforge/skills/ml-monitoring/SKILL.md +42 -0
- package/.mindforge/skills/mobile-performance/SKILL.md +44 -0
- package/.mindforge/skills/mobile-security/SKILL.md +45 -0
- package/.mindforge/skills/model-evaluation/SKILL.md +53 -0
- package/.mindforge/skills/monorepo-management/SKILL.md +100 -0
- package/.mindforge/skills/multi-llm-consult/SKILL.md +75 -0
- package/.mindforge/skills/multi-tenancy-patterns/SKILL.md +145 -0
- package/.mindforge/skills/multi-turn-conversation-design/SKILL.md +206 -0
- package/.mindforge/skills/multimodal-ai/SKILL.md +51 -0
- package/.mindforge/skills/mutation-testing/SKILL.md +97 -0
- package/.mindforge/skills/notification-system-design/SKILL.md +168 -0
- package/.mindforge/skills/observability-stack/SKILL.md +136 -0
- package/.mindforge/skills/offline-first-design/SKILL.md +43 -0
- package/.mindforge/skills/on-call-design/SKILL.md +111 -0
- package/.mindforge/skills/pagination-patterns/SKILL.md +230 -0
- package/.mindforge/skills/payment-integration/SKILL.md +176 -0
- package/.mindforge/skills/performance-reviews/SKILL.md +140 -0
- package/.mindforge/skills/platform-observability/SKILL.md +58 -0
- package/.mindforge/skills/platform-reliability/SKILL.md +52 -0
- package/.mindforge/skills/post-incident-learning/SKILL.md +96 -0
- package/.mindforge/skills/product-manager/SKILL.md +104 -0
- package/.mindforge/skills/progressive-web-app/SKILL.md +44 -0
- package/.mindforge/skills/prompt-engineering/SKILL.md +94 -0
- package/.mindforge/skills/proofreader/SKILL.md +158 -0
- package/.mindforge/skills/push-notification-architecture/SKILL.md +45 -0
- package/.mindforge/skills/python-performance/SKILL.md +183 -0
- package/.mindforge/skills/quality-audit/SKILL.md +171 -0
- package/.mindforge/skills/queue-design/SKILL.md +85 -0
- package/.mindforge/skills/rag-architecture/SKILL.md +176 -0
- package/.mindforge/skills/rate-limiting-design/SKILL.md +94 -0
- package/.mindforge/skills/react-native-patterns/SKILL.md +42 -0
- package/.mindforge/skills/react-performance/SKILL.md +229 -0
- package/.mindforge/skills/real-time-analytics/SKILL.md +42 -0
- package/.mindforge/skills/real-time-sync/SKILL.md +83 -0
- package/.mindforge/skills/responsive-native/SKILL.md +44 -0
- package/.mindforge/skills/responsive-patterns/SKILL.md +141 -0
- package/.mindforge/skills/rfc-pipeline/SKILL.md +114 -0
- package/.mindforge/skills/saas-multi-tenant/SKILL.md +41 -0
- package/.mindforge/skills/santa-method/SKILL.md +134 -0
- package/.mindforge/skills/search-implementation/SKILL.md +98 -0
- package/.mindforge/skills/secrets-platform/SKILL.md +56 -0
- package/.mindforge/skills/secrets-rotation/SKILL.md +173 -0
- package/.mindforge/skills/self-serve-infrastructure/SKILL.md +51 -0
- package/.mindforge/skills/serverless-patterns/SKILL.md +119 -0
- package/.mindforge/skills/skill-creator-meta/SKILL.md +146 -0
- package/.mindforge/skills/sprint-retrospective-facilitation/SKILL.md +112 -0
- package/.mindforge/skills/stakeholder-communication/SKILL.md +85 -0
- package/.mindforge/skills/state-management/SKILL.md +104 -0
- package/.mindforge/skills/stream-processing/SKILL.md +43 -0
- package/.mindforge/skills/streaming-architecture/SKILL.md +81 -0
- package/.mindforge/skills/supply-chain-security/SKILL.md +145 -0
- package/.mindforge/skills/synthetic-data-generation/SKILL.md +52 -0
- package/.mindforge/skills/system-design/SKILL.md +88 -0
- package/.mindforge/skills/team-topology-design/SKILL.md +107 -0
- package/.mindforge/skills/technical-debt-management/SKILL.md +86 -0
- package/.mindforge/skills/technical-interview-design/SKILL.md +98 -0
- package/.mindforge/skills/technical-leadership/SKILL.md +75 -0
- package/.mindforge/skills/technical-writing/SKILL.md +237 -0
- package/.mindforge/skills/technology-radar/SKILL.md +88 -0
- package/.mindforge/skills/testing-anti-patterns/SKILL.md +288 -0
- package/.mindforge/skills/threat-modeling/SKILL.md +109 -0
- package/.mindforge/skills/tool-design/SKILL.md +138 -0
- package/.mindforge/skills/typescript-advanced/SKILL.md +198 -0
- package/.mindforge/skills/using-git-worktrees/SKILL.md +139 -0
- package/.mindforge/skills/verification-loop/SKILL.md +97 -0
- package/.mindforge/skills/vibe-security/SKILL.md +165 -0
- package/.mindforge/skills/visual-regression-testing/SKILL.md +97 -0
- package/.mindforge/skills/websocket-patterns/SKILL.md +203 -0
- package/.mindforge/skills/writing-plans/SKILL.md +170 -0
- package/.mindforge/skills/writing-skills/SKILL.md +216 -0
- package/.mindforge/skills/zero-trust-architecture/SKILL.md +166 -0
- package/CHANGELOG.md +195 -0
- package/MINDFORGE.md +4 -4
- package/README.md +2 -2
- package/RELEASENOTES.md +66 -0
- package/bin/installer-core.js +1 -1
- package/bin/wizard/theme.js +2 -2
- package/docs/commands-reference.md +18 -1
- package/package.json +2 -2
- package/.mindforge/personas/data-privacy-engineer.md +0 -187
@@ -0,0 +1,198 @@
---
name: llm-cost-optimization
version: 1.0.0
min_mindforge_version: 0.1.0
status: stable
compose: cost-aware-routing
triggers: llm cost optimization, prompt compression, semantic caching, model cascading, batch api, token estimation, cost per query, model routing optimization, output token reduction, prompt deduplication, cache hit rate, cost monitoring
---

# Skill — LLM Cost Optimization

## When this skill activates
Any task involving LLM API cost reduction, prompt compression, semantic caching,
model cascading/routing, batch API usage, or token budget management.

## Mandatory actions when this skill is active

### Before writing any code
1. Establish baseline costs (cost per query, daily/monthly spend, cost by feature).
2. Identify the biggest cost drivers (which prompts, which models, which features).
3. Set cost reduction targets with quality guardrails.

### During implementation
- Implement semantic caching for repeated/similar queries.
- Use model cascading (start cheap, escalate only when needed).
- Add token estimation before API calls (pre-flight cost check).

### After implementation
- Monitor cost per query, cache hit rate, and cascade escalation rate.
- Set up cost anomaly alerts (spike detection).
- Document cost optimization decisions in ARCHITECTURE.md.

## Prompt Compression

### System Prompt Optimization
- Remove redundant instructions (LLMs don't need repetition like humans).
- Use abbreviations and compact formatting in system prompts.
- Reference items by ID rather than including full content.
- Cache static system prompts (most providers support this).

### Context Window Efficiency
- Include only relevant context (not entire documents).
- Summarize long documents before including in prompt.
- Use structured formats (JSON/YAML) over verbose prose for data.
- Remove examples from system prompt once model demonstrates understanding.

### Token Reduction Techniques
| Technique | Savings | Quality Impact |
|-----------|---------|---------------|
| Remove redundant instructions | 10-30% | None |
| Abbreviate system prompt | 15-25% | Minimal |
| Summarize context | 40-60% | Low-moderate |
| Reference by ID | 50-70% | None (if lookup available) |
| Fewer few-shot examples | 30-50% | Low (if model is capable) |

## Semantic Caching

### Concept
- Hash similar queries → return cached response if semantically equivalent.
- Not exact-match caching — uses embedding similarity.
- Threshold: if query embedding distance < 0.05, serve cached response.

### Implementation
```
1. Embed incoming query
2. Search cache for similar queries (cosine similarity > 0.95)
3. If hit: return cached response (cost = ~$0)
4. If miss: call LLM, store response in cache with query embedding
```
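
The four steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `SemanticCache`, `toy_embed`, and the `call_llm` callback are hypothetical names, and the bag-of-words vector stands in for a real embedding model. The linear scan would normally be replaced by a vector index.

```python
import math
from collections import Counter
from typing import Callable


def toy_embed(text: str) -> Counter:
    """Stand-in embedding (assumption): a lowercase bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, embed=toy_embed, threshold: float = 0.95):
        self.embed = embed
        self.threshold = threshold
        self.entries = []  # list of (query_embedding, response)

    def get_or_call(self, query: str, call_llm: Callable[[str], str]) -> tuple[str, bool]:
        """Return (response, cache_hit)."""
        q = self.embed(query)                      # 1. embed incoming query
        best_response, best_score = None, 0.0
        for emb, resp in self.entries:             # 2. search for the most similar query
            score = cosine_similarity(q, emb)
            if score > best_score:
                best_response, best_score = resp, score
        if best_response is not None and best_score > self.threshold:
            return best_response, True             # 3. hit: no LLM call
        response = call_llm(query)                 # 4. miss: call LLM and store
        self.entries.append((q, response))
        return response, False
```

Returning the hit flag alongside the response makes it straightforward to compute the cache hit rate that the skill asks to monitor.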

### Cache Invalidation
- TTL-based: expire after N hours (for time-sensitive data).
- Event-based: invalidate when underlying data changes.
- Version-based: invalidate when prompt/model version changes.
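
One possible way to combine the three invalidation strategies is sketched below. The `CacheEntry` fields and both helper functions are hypothetical and only show where each check could live; they are not taken from the package.

```python
import time
from dataclasses import dataclass, field


@dataclass
class CacheEntry:
    response: str
    version: str            # prompt/model version the response was produced with
    source_ids: frozenset   # IDs of the data records the response depends on
    created_at: float = field(default_factory=time.time)


def is_valid(entry: CacheEntry, *, current_version: str, ttl_seconds: float, now=None) -> bool:
    """Version-based and TTL-based checks, applied at read time."""
    now = time.time() if now is None else now
    if entry.version != current_version:
        return False
    return (now - entry.created_at) <= ttl_seconds


def invalidate_for(entries: dict, changed_id: str) -> int:
    """Event-based: drop every entry that depends on changed_id; returns the count removed."""
    stale = [key for key, entry in entries.items() if changed_id in entry.source_ids]
    for key in stale:
        del entries[key]
    return len(stale)
```

Checking version and TTL lazily on read avoids a background sweeper, while the event-based path has to run eagerly when the underlying data changes.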

### Expected Performance
- Cache hit rate: 20-60% for typical applications.
- Cost reduction: proportional to hit rate.
- Latency improvement: 10-100x faster on cache hits.

## Model Cascading

### Pattern
```
Query → Haiku/Small Model → Quality Check → Pass? → Return
                                          → Fail? → Sonnet/Large Model → Return
```

### Implementation Rules
- Start with cheapest model capable of the task.
- Define quality gate (confidence score, format validation, length check).
- Escalate to more expensive model only when quality gate fails.
- Track escalation rate (target: < 20% of queries escalate).
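
The cascade pattern and its rules might look like the sketch below. `Cascade` and `default_quality_gate` are hypothetical names, and the model callables are assumed to be thin wrappers around whatever client the application uses. The gate here is deliberately simplistic and would be replaced by a task-specific check.

```python
from typing import Callable, Sequence


def default_quality_gate(response: str) -> bool:
    """Hypothetical gate: non-empty and not an explicit 'I don't know'."""
    text = response.strip()
    return bool(text) and "i don't know" not in text.lower()


class Cascade:
    def __init__(self, tiers: Sequence[tuple[str, Callable[[str], str]]],
                 quality_gate: Callable[[str], bool] = default_quality_gate):
        if not tiers:
            raise ValueError("at least one tier is required")
        self.tiers = list(tiers)          # ordered cheapest first
        self.quality_gate = quality_gate
        self.total = 0
        self.escalated = 0

    def run(self, query: str) -> tuple[str, str]:
        """Return (model_name, response) from the first tier that passes the gate."""
        self.total += 1
        for i, (name, call) in enumerate(self.tiers):
            response = call(query)
            is_last = i == len(self.tiers) - 1
            # The last tier's answer is returned even if the gate fails.
            if self.quality_gate(response) or is_last:
                if i > 0:
                    self.escalated += 1
                return name, response
        raise AssertionError("unreachable: tiers is non-empty")

    @property
    def escalation_rate(self) -> float:
        return self.escalated / self.total if self.total else 0.0
```

Exposing `escalation_rate` on the object keeps the "< 20% of queries escalate" target measurable without extra logging code.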

### Model Tier Pricing (Approximate)
| Tier | Model Examples | Cost (per 1M tokens) | Use For |
|------|---------------|----------------------|---------|
| Cheap | Haiku, GPT-4o-mini | $0.25-1.00 | Simple tasks, classification, extraction |
| Medium | Sonnet, GPT-4o | $3.00-15.00 | Most generation, reasoning |
| Expensive | Opus, o1 | $15.00-75.00 | Complex reasoning, critical decisions |

### Routing Heuristics
- Classification/extraction → always use cheap model.
- Code generation → medium model (escalate if syntax errors).
- Complex reasoning → start medium, escalate if confidence low.
- Safety-critical → always use expensive model (no cascading).

## Batch API Usage

### When to Use
- Non-real-time workloads (background processing, ETL, reports).
- Large volume of similar requests.
- Typical discount: 50% cheaper than synchronous API.

### Batch-Eligible Workloads
- Document summarization pipelines.
- Nightly content generation.
- Bulk classification/tagging.
- Training data generation.
- Automated evaluations.

### Implementation
- Queue requests during the day.
- Submit batch job during off-peak (overnight).
- Process results next morning.
- Set up retry for failed items in batch.
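
A queue-then-flush workflow with per-item retry could be sketched as below. `BatchQueue` and the `submit_batch` callable are hypothetical. `submit_batch` is assumed to take a list of requests and return a mapping from request ID to result, with `None` marking a failed item, so it would need adapting to a real provider's batch API.

```python
from typing import Callable, Dict, List, Optional

Request = Dict[str, str]  # {"id": ..., "prompt": ...}
SubmitFn = Callable[[List[Request]], Dict[str, Optional[str]]]


class BatchQueue:
    def __init__(self, submit_batch: SubmitFn, max_attempts: int = 3):
        self.submit_batch = submit_batch
        self.max_attempts = max_attempts
        self.pending: List[Request] = []

    def enqueue(self, request_id: str, prompt: str) -> None:
        """Called during the day; nothing is sent yet."""
        self.pending.append({"id": request_id, "prompt": prompt})

    def flush(self):
        """Submit everything queued, resubmitting only failed items.

        Returns (results, failed_ids).
        """
        results: Dict[str, str] = {}
        remaining, self.pending = self.pending, []
        for _ in range(self.max_attempts):
            if not remaining:
                break
            outcome = self.submit_batch(remaining)
            results.update({rid: out for rid, out in outcome.items() if out is not None})
            remaining = [r for r in remaining if outcome.get(r["id"]) is None]
        return results, [r["id"] for r in remaining]
```

A scheduler (cron or similar) would call `flush()` during the off-peak window. Items still failing after `max_attempts` are returned so they can be re-queued or reported.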

## Token Estimation

### Pre-Flight Cost Check
```python
# count_tokens, compress_context, price_per_token and budget_threshold are
# placeholders supplied by the application.
estimated_tokens = count_tokens(system_prompt + context + query)
estimated_cost = estimated_tokens * price_per_token
if estimated_cost > budget_threshold:
    compress_context()  # or reject query
```

### Token Counting
- Use tiktoken (OpenAI) or provider-specific tokenizer.
- Count BEFORE sending to API (not after).
- Include expected output tokens in estimate.
- Set max_tokens to limit output cost.

### Budget Controls
- Per-query budget: reject or compress if estimated cost too high.
- Per-user budget: track cumulative cost, throttle when approaching limit.
- Per-feature budget: allocate cost budgets to product features.
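
The pre-flight check and the per-query and per-user controls can be combined in one small guard, sketched below. `BudgetGuard` and `rough_token_count` are hypothetical. The four-characters-per-token rule is only a crude approximation for English text and should be swapped for the provider's tokenizer. All prices and limits are caller-supplied.

```python
from dataclasses import dataclass, field


def rough_token_count(text: str) -> int:
    """Crude stand-in (about 4 characters per token); not a real tokenizer."""
    return max(1, len(text) // 4)


@dataclass
class BudgetGuard:
    input_price_per_token: float
    output_price_per_token: float
    per_query_limit: float
    per_user_limit: float
    spent_by_user: dict = field(default_factory=dict)

    def estimate(self, prompt: str, max_tokens: int) -> float:
        """Worst-case cost: counted input tokens plus the full max_tokens output."""
        return (rough_token_count(prompt) * self.input_price_per_token
                + max_tokens * self.output_price_per_token)

    def check(self, user: str, prompt: str, max_tokens: int) -> str:
        """Return 'allow', 'compress' (query too costly) or 'throttle' (user over budget)."""
        cost = self.estimate(prompt, max_tokens)
        if cost > self.per_query_limit:
            return "compress"
        if self.spent_by_user.get(user, 0.0) + cost > self.per_user_limit:
            return "throttle"
        return "allow"

    def record(self, user: str, actual_cost: float) -> None:
        """Call after the API response with the billed cost."""
        self.spent_by_user[user] = self.spent_by_user.get(user, 0.0) + actual_cost
```

Using `max_tokens` as the output estimate is pessimistic on purpose: it bounds the cost before the call, and `record` later replaces the estimate with the actual figure.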

## Output Token Reduction

### Techniques
- Set `max_tokens` to reasonable limit for the task.
- Instruct model to be concise: "Answer in 2-3 sentences."
- Use structured output (JSON) to prevent verbose prose.
- Ask for key information only, not explanations (when appropriate).

### Output Cost Impact
| Approach | Typical Output Reduction | Quality Impact |
|----------|------------------------|---------------|
| max_tokens cap | Varies | May truncate if too aggressive |
| Conciseness instruction | 30-50% | Usually none for factual tasks |
| JSON/structured output | 40-60% | None (often improves) |
| Enumerate, don't explain | 50-70% | Low for extraction tasks |

## Cost Monitoring

### Key Metrics
| Metric | Alert Threshold | Description |
|--------|----------------|-------------|
| Daily cost | > 2x rolling average | Anomaly detection |
| Cost per query | > budget ceiling | Individual query cost |
| Cache hit rate | < 30% (if caching enabled) | Cache effectiveness |
| Escalation rate | > 30% | Cascade efficiency |
| Token waste ratio | > 20% unused max_tokens | Over-allocated budgets |
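
Three of the thresholds in the table can be evaluated with a few lines of Python, as sketched below. The function name, its parameters, and the seven-day window are assumptions for illustration. The 2x and 30% thresholds are taken from the table.

```python
from statistics import mean


def cost_alerts(daily_costs, *, cache_hits, cache_lookups, escalations, queries,
                window: int = 7):
    """Return the names of metrics that breach their threshold.

    daily_costs is ordered oldest first with today's cost last (at least one value).
    """
    alerts = []
    history, today = daily_costs[:-1][-window:], daily_costs[-1]
    if history and today > 2 * mean(history):          # daily cost > 2x rolling average
        alerts.append("daily_cost")
    if cache_lookups and cache_hits / cache_lookups < 0.30:
        alerts.append("cache_hit_rate")
    if queries and escalations / queries > 0.30:
        alerts.append("escalation_rate")
    return alerts
```

For instance, `cost_alerts([10, 12, 11, 30], cache_hits=50, cache_lookups=100, escalations=10, queries=100)` flags only the daily cost, since 30 exceeds twice the average of the three preceding days (22).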

### Dashboard Requirements
- Cost breakdown by: feature, model, endpoint, user tier.
- Trend lines: daily, weekly, monthly.
- Forecast: projected monthly cost at current rate.
- Anomaly alerts: immediate notification on cost spikes.

### Optimization Feedback Loop
```
Monitor costs → Identify top cost drivers → Apply optimization →
Measure improvement → Adjust thresholds → Repeat monthly
```

## Self-check before task completion

Before marking a task done when this skill was active:

- [ ] Did I read the full SKILL.md before starting? (Not just the triggers)
- [ ] Is semantic caching implemented for repeated queries?
- [ ] Is model cascading configured (cheap first, escalate on failure)?
- [ ] Are token budgets estimated before API calls?
- [ ] Is cost monitoring in place with anomaly alerts?
- [ ] Are batch APIs used for non-real-time workloads?
- [ ] Is prompt compression applied to system prompts?
@@ -0,0 +1,56 @@
---
name: llm-orchestration
version: 1.0.0
min_mindforge_version: 10.5.0
status: stable
triggers: LLM orchestration system, chain-of-thought routing, model cascading strategy, LLM fallback design, LLM context window optimization, LLM router design, model selection pipeline, LLM chain composition, prompt chaining, LLM gateway architecture, multi-model routing, LLM request planning
compose:
- agent-orchestration-patterns
---

# LLM Orchestration & Routing

## When this skill activates

This skill activates when building systems that route requests across multiple LLMs, cascade from cheap to expensive models, implement fallback strategies, or manage context window limits. It applies to production AI systems where cost, latency, and reliability must be optimized across a fleet of models.

## Mandatory actions when this skill is active

### Before writing any code

1. **Map task to model capabilities** — Identify which tasks require which model capabilities: reasoning (Opus/GPT-4), speed (Haiku/GPT-3.5), cost-efficiency (Haiku/GPT-3.5), context length (Claude 200K, GPT-4 128K), multimodal (GPT-4V, Claude Sonnet with vision). Not all tasks need the most expensive model.
2. **Design cascading strategy** — Define fallback chain: start with fastest/cheapest model, escalate to more capable models if quality is insufficient. Example: Haiku → Sonnet → Opus. Measure escalation rate (% of requests that cascade) and optimize to minimize unnecessary escalations.
3. **Establish routing rules** — Define deterministic rules for model selection: route by task type (classification → Haiku, creative writing → Opus), by complexity (short prompts → small models, long prompts → large models), or by user tier (free users → Haiku, paid users → Sonnet). Document routing logic explicitly.
4. **Plan for context window management** — Measure typical prompt lengths and output lengths. If prompts exceed model limits, implement truncation (drop least important context), summarization (compress with smaller model), or chunking (split into multiple requests). Test that truncated prompts still produce useful outputs.
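
Deterministic routing rules of the kind described in step 3 might be written down as a small, ordered rule set. In this sketch the `Request` fields, the tier labels (`small`, `medium`, `large`), and the 8000-token cut-off are all illustrative assumptions, not values from the skill. Tier labels would be mapped to concrete model IDs in configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    task_type: str       # e.g. "classification", "creative_writing"
    prompt_tokens: int
    user_tier: str       # "free" or "paid"


# Hypothetical tier labels keyed by task type.
TASK_ROUTES = {"classification": "small", "extraction": "small", "creative_writing": "large"}
LONG_PROMPT_TOKENS = 8000  # illustrative cut-off


def route(req: Request) -> str:
    """Rules are checked in a fixed order, so the same request always gets the same tier."""
    if req.task_type in TASK_ROUTES:           # 1. by task type
        return TASK_ROUTES[req.task_type]
    if req.prompt_tokens > LONG_PROMPT_TOKENS:  # 2. by complexity (prompt length)
        return "large"
    return "medium" if req.user_tier == "paid" else "small"  # 3. by user tier
```

Keeping the rules as plain data plus one pure function makes the routing logic easy to document and to unit-test against labelled requests.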
|
|
25
|
+
|
|
26
|
+
### During implementation
|
|
27
|
+
|
|
28
|
+
- **Implement request routing layer** — Build a central router that examines each request (task type, complexity, user tier, priority) and selects the appropriate model. Router must be fast (<10ms overhead) and deterministic (same input always routes to same model for consistency).
|
|
29
|
+
- **Add quality-based cascading** — After receiving a response from a cheap model, run a lightweight quality check (confidence score, length validation, keyword presence). If quality is below threshold, retry with a more expensive model. Log all escalations for analysis.
|
|
30
|
+
- **Design stateful prompt chaining** — When chaining multiple LLM calls (research → draft → critique → revision), maintain conversation state explicitly. Use structured context: {task: ..., previous_output: ..., next_instruction: ...}. Avoid relying on model memory (models are stateless between API calls).
|
|
31
|
+
- **Implement prompt compression** — For long contexts, compress aggressively: remove filler words, use abbreviations, reference external documents by ID instead of embedding full text. Test that compressed prompts produce equivalent outputs. Measure compression ratio (original tokens / compressed tokens).
|
|
32
|
+
- **Handle rate limits gracefully** — Implement exponential backoff and retry logic for rate limit errors. If one model is rate-limited, fail over to an alternative model (even if more expensive). Never return errors to users due to transient rate limits.
|
|
33
|
+
- **Track cost and latency per model** — Log every request: model used, input tokens, output tokens, latency, cost. Aggregate metrics by model and task type. Identify opportunities to downgrade expensive models to cheaper alternatives without quality loss.
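
The quality-based cascading and escalation logging described above can be sketched as follows. The model callables and the `good_enough` check are placeholders supplied by the caller (assumptions, not a real provider API); the sketch only shows the control flow of trying models cheapest-first and recording how far each request escalated.

```python
from typing import Callable, Dict, List, Tuple

Model = Tuple[str, Callable[[str], str]]  # (name, function that returns a completion)


def cascade(prompt: str, models: List[Model],
            good_enough: Callable[[str], bool], log: List[Dict]) -> Tuple[str, str]:
    """Try models in order (cheapest first); stop at the first answer that passes the check."""
    if not models:
        raise ValueError("at least one model is required")
    name, answer, depth = "", "", 0
    for depth, (name, call) in enumerate(models):
        answer = call(prompt)
        if good_enough(answer):
            break
    # If every model failed the check, the last (most capable) answer is returned.
    log.append({"model": name, "escalations": depth})
    return name, answer


def escalation_rate(log: List[Dict]) -> float:
    """Fraction of logged requests that needed at least one escalation."""
    return sum(1 for e in log if e["escalations"] > 0) / len(log) if log else 0.0
```

The `log` list stands in for whatever metrics store is in use; `escalation_rate(log)` gives the number that the cascade-efficiency check compares against its target.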
|
|
34
|
+
|
|
35
|
+
### After implementation
|
|
36
|
+
|
|
37
|
+
- **Validate routing accuracy** — Test that requests route to the expected model based on your rules. Measure routing accuracy (% of requests routed correctly). Incorrect routing wastes cost (cheap task routed to expensive model) or produces poor quality (complex task routed to weak model).
|
|
38
|
+
- **Measure cascade efficiency** — Track escalation rate: % of requests that start with a cheap model but cascade to expensive models. Target: <10% escalation rate. Higher rates indicate the routing rules are too aggressive about sending requests to cheap models first; route the task types that escalate most often directly to a more capable model.
|
|
39
|
+
- **Benchmark end-to-end latency** — Measure latency including routing overhead, model inference, and cascading retries. Compare to single-model baseline. Orchestration should add <50ms overhead. If higher, optimize routing logic or reduce cascade depth.
|
|
40
|
+
- **Test fallback behavior** — Simulate model failures (rate limits, API downtime, timeouts) and validate that fallback logic activates. Ensure degraded mode still returns useful responses (even if lower quality or higher latency).
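
One way to exercise the fallback path is to inject a failing provider and a fake `sleep`, as in this sketch. `RateLimited` is a hypothetical exception standing in for a provider's rate-limit error, and the provider callables are stubs; none of these names come from a real SDK.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


def call_with_failover(prompt, providers, max_retries=3, base_delay=0.5, sleep=None):
    """Retry each provider with exponential backoff, then fail over to the next one."""
    sleep = sleep or time.sleep
    for name, call in providers:
        for attempt in range(max_retries):
            try:
                return name, call(prompt)
            except RateLimited:
                if attempt < max_retries - 1:
                    # Exponential backoff with a little jitter before retrying.
                    sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
    raise RuntimeError("all providers exhausted")
```

Passing `sleep=delays.append` in a test records the backoff delays without actually waiting, which keeps simulated-failure tests fast.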
|
|
41
|
+
|
|
42
|
+
## Self-check before task completion
|
|
43
|
+
|
|
44
|
+
- [ ] Task-to-model mapping is documented with capability requirements (reasoning, speed, cost, context)
|
|
45
|
+
- [ ] Cascading strategy is defined with explicit escalation rules and thresholds
|
|
46
|
+
- [ ] Routing rules are deterministic and documented (by task type, complexity, user tier)
|
|
47
|
+
- [ ] Context window limits are handled via truncation, summarization, or chunking strategies
|
|
48
|
+
- [ ] Request router is implemented with <10ms overhead and deterministic behavior
|
|
49
|
+
- [ ] Quality-based cascading retries with more expensive models if quality is below threshold
|
|
50
|
+
- [ ] Prompt chaining maintains stateful context explicitly across LLM calls
|
|
51
|
+
- [ ] Prompt compression is implemented and tested for equivalence with uncompressed prompts
|
|
52
|
+
- [ ] Rate limit handling implements exponential backoff and model failover
|
|
53
|
+
- [ ] Cost and latency per model are logged and aggregated for optimization analysis
|
|
54
|
+
- [ ] Routing accuracy is validated (requests route to expected models)
|
|
55
|
+
- [ ] Escalation rate is measured and optimized to <10% of requests
|
|
56
|
+
- [ ] Fallback behavior is tested under simulated model failures (rate limits, downtime)
|
|
@@ -0,0 +1,84 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: load-testing
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.0.8
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: load testing, k6 script, locust test, artillery config, load profile, ramp up test, spike test, soak test, stress test, SLA validation, throughput test, concurrent users
|
|
7
|
+
compose: performance
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
# Load Testing
|
|
11
|
+
|
|
12
|
+
## When this skill activates
|
|
13
|
+
|
|
14
|
+
This skill activates when designing, writing, or analyzing performance tests that simulate user load against a system. It applies to any scenario involving throughput measurement, latency profiling under load, capacity planning, or SLA validation.
|
|
15
|
+
|
|
16
|
+
## Mandatory actions when this skill is active
|
|
17
|
+
|
|
18
|
+
### Before
|
|
19
|
+
|
|
20
|
+
1. Define the SLA targets (p95 latency < X ms, error rate < Y%, throughput > Z rps).
|
|
21
|
+
2. Identify the critical user journeys to simulate (login flow, checkout, search).
|
|
22
|
+
3. Determine the load profile type needed (load, stress, spike, or soak).
|
|
23
|
+
4. Confirm test environment is production-like (same infra, realistic data volume).
|
|
24
|
+
5. Establish baseline metrics from current production (if available).
|
|
25
|
+
6. Ensure monitoring is active during tests (APM, infra metrics, logs).
|
|
26
|
+
|
|
27
|
+
### During
|
|
28
|
+
|
|
29
|
+
**Test Types and When to Use:**
|
|
30
|
+
- **Load test**: Expected traffic. Validates system meets SLA under normal conditions.
|
|
31
|
+
- **Stress test**: Beyond capacity. Finds the breaking point and degradation behavior.
|
|
32
|
+
- **Spike test**: Sudden burst (0 to max instantly). Tests auto-scaling and recovery.
|
|
33
|
+
- **Soak test**: Sustained load for hours. Detects memory leaks, connection pool exhaustion, log disk fill.
|
|
34
|
+
|
|
35
|
+
**Load Profile Design:**
|
|
36
|
+
- **Ramp**: Gradually increase (0 → target over 5 minutes). Standard for load tests.
|
|
37
|
+
- **Spike**: Instant jump to peak. Use for flash sale / viral event scenarios.
|
|
38
|
+
- **Step**: Increase in discrete steps (50 → 100 → 150). Good for finding thresholds.
|
|
39
|
+
- **Soak**: Steady state for 2-8 hours. Target = expected average load.
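
The profile shapes above can be described as a list of stages, independent of any tool. In this sketch a stage is a `(duration_s, target_users)` pair meaning "move linearly from the previous target to `target_users` over `duration_s` seconds"; that representation and the helper names are assumptions for illustration, and converting to a specific tool's config format is left out.

```python
def ramp(target, ramp_s=300, hold_s=600):
    """Gradual increase to target, then hold."""
    return [(ramp_s, target), (hold_s, target)]


def step(start, increment, steps, hold_s=120):
    """Discrete levels: jump to each level (0-second stage), then hold it."""
    stages = []
    for i in range(steps):
        level = start + i * increment
        stages += [(0, level), (hold_s, level)]
    return stages


def spike(peak, hold_s=60):
    """Instant jump to peak, hold, then drop back to zero."""
    return [(0, peak), (hold_s, peak), (0, 0)]


def users_at(stages, t, start=0):
    """Target user count at time t (seconds), by linear interpolation within a stage."""
    t = max(t, 0)
    prev, elapsed = start, 0
    for duration, target in stages:
        if t < elapsed + duration:
            return prev + (target - prev) * (t - elapsed) / duration
        prev, elapsed = target, elapsed + duration
    return prev
```

`users_at` is handy for plotting a profile before running it, to confirm the shape matches the scenario being simulated.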
|
|
40
|
+
|
|
41
|
+
**Tool Selection:**
|
|
42
|
+
- **k6** (Go-based): Developer-friendly, JS scripting, excellent CI integration, checks/thresholds built-in.
|
|
43
|
+
- **Locust** (Python): Python scripting, distributed mode, real-time web UI, event-driven.
|
|
44
|
+
- **Artillery** (Node): YAML config, scenario-based, good for quick setups, plugin ecosystem.
|
|
45
|
+
|
|
46
|
+
**Metrics to Capture:**
|
|
47
|
+
- **Latency**: p50, p95, p99 (p95 is your SLA target, p99 catches tail latency).
|
|
48
|
+
- **Throughput**: Requests per second (rps) sustained at target.
|
|
49
|
+
- **Error rate**: Percentage of non-2xx responses under load.
|
|
50
|
+
- **Saturation**: CPU, memory, connection pool usage, thread count.
|
|
51
|
+
- **Apdex**: Application Performance Index score.
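
As a rough illustration of how the latency, throughput, and error-rate metrics above might be computed from raw samples, the sketch below uses the nearest-rank percentile method. The function names and the input shape (a list of latencies, a list of HTTP status codes, and a test duration) are assumptions; load-testing tools usually report these figures themselves.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms, statuses, duration_s):
    """Aggregate one test run into the headline metrics."""
    errors = sum(1 for s in statuses if not 200 <= s < 300)
    return {
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
        "error_rate": errors / len(statuses),
        "rps": len(statuses) / duration_s,
    }


def verdict(summary, p95_ms, max_error_rate, min_rps):
    """PASS only if every SLA threshold is met."""
    ok = (summary["p95"] < p95_ms
          and summary["error_rate"] < max_error_rate
          and summary["rps"] > min_rps)
    return "PASS" if ok else "FAIL"
```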
|
|
52
|
+
|
|
53
|
+
**Environment Considerations:**
|
|
54
|
+
- NEVER run load tests against production without explicit approval and safeguards.
|
|
55
|
+
- Use dedicated load-test environment with production-equivalent resources.
|
|
56
|
+
- Populate with realistic data volume (empty DB gives false confidence).
|
|
57
|
+
- Disable rate limiters and WAF rules that would mask real limits (or test with them separately).
|
|
58
|
+
- Coordinate with infrastructure team to avoid impacting shared services.
|
|
59
|
+
|
|
60
|
+
**Result Analysis:**
|
|
61
|
+
- Compare against SLA targets: PASS if all thresholds met under target load.
|
|
62
|
+
- Identify the saturation point (where latency inflects upward).
|
|
63
|
+
- Check for error rate correlation with load increase.
|
|
64
|
+
- Look for resource bottlenecks (CPU-bound vs IO-bound vs network-bound).
|
|
65
|
+
- Document capacity ceiling: "System handles X concurrent users before degradation."
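
A crude way to locate the saturation point from a step test is to find the first load level where p95 latency exceeds some multiple of the lowest-load p95. The factor of 2 and the input shape below are assumptions chosen for illustration; real analysis would also look at error rate and resource metrics.

```python
def saturation_point(steps, factor=2.0):
    """steps: list of (concurrent_users, p95_ms), sorted by users.

    Returns the first load level whose p95 exceeds factor x the p95 at the
    lowest load level, or None if latency never inflects that sharply.
    """
    if not steps:
        return None
    baseline = steps[0][1]
    for users, p95 in steps:
        if p95 > factor * baseline:
            return users
    return None
```

With hypothetical results such as `[(50, 100), (100, 110), (150, 130), (200, 320)]`, the function returns `200`, which would support a statement like "system handles 150 concurrent users before degradation."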
|
|
66
|
+
|
|
67
|
+
### After
|
|
68
|
+
|
|
69
|
+
1. Record results with full context (date, environment, data volume, test config).
|
|
70
|
+
2. Compare against previous runs to detect regressions.
|
|
71
|
+
3. File tickets for any SLA violations with root cause analysis.
|
|
72
|
+
4. Update capacity planning docs with new findings.
|
|
73
|
+
5. Archive test scripts and results in version control.
|
|
74
|
+
|
|
75
|
+
## Self-check before task completion
|
|
76
|
+
|
|
77
|
+
- [ ] SLA targets are explicitly defined and measurable.
|
|
78
|
+
- [ ] Load profile matches a realistic usage pattern (not arbitrary numbers).
|
|
79
|
+
- [ ] Test runs long enough to surface time-dependent issues (minimum 5 minutes for load, 2+ hours for soak).
|
|
80
|
+
- [ ] Results include p50, p95, p99 latency plus error rate and throughput.
|
|
81
|
+
- [ ] Test environment is documented (instance types, replicas, data volume).
|
|
82
|
+
- [ ] Bottleneck identified and categorized (CPU / memory / IO / network / external dependency).
|
|
83
|
+
- [ ] Results compared against baseline or SLA with clear PASS/FAIL verdict.
|
|
84
|
+
- [ ] No tests were accidentally run against production without safeguards.
|
|
@@ -0,0 +1,40 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: logistics-optimization
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.2.0
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: logistics optimization, route planning algorithm, fleet management system, warehouse management, last-mile delivery, supply chain visibility, shipment tracking, delivery optimization, logistics platform, transportation management, inventory routing, fulfillment optimization
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Skill — Logistics Optimization
|
|
10
|
+
|
|
11
|
+
## When this skill activates
|
|
12
|
+
This skill activates when building route planning algorithms, fleet management systems, warehouse management platforms, last-mile delivery optimization, supply chain visibility tools, shipment tracking, transportation management systems, or fulfillment optimization engines.
|
|
13
|
+
|
|
14
|
+
## Mandatory actions when this skill is active
|
|
15
|
+
|
|
16
|
+
### Before writing any code
|
|
17
|
+
1. Design route optimization algorithm: model as Vehicle Routing Problem with Time Windows (VRPTW), constraints (vehicle capacity, driver shift hours, customer time windows, priority deliveries), objective function (minimize total distance, balance workload across drivers, maximize on-time delivery %), solution methods (heuristic: nearest neighbor with 2-opt improvement, metaheuristic: simulated annealing, genetic algorithm, or exact: mixed-integer programming for <100 stops)
|
|
18
|
+
2. Model warehouse operations workflow: receiving (unload truck, quality check, barcode scan, put-away to optimal bin location) → storage (zone by velocity: fast-movers near packing, slow-movers in bulk) → picking (batch pick orders, wave pick by zone, pick-to-light system) → packing (right-sized box selection, void fill, label printing) → shipping (load truck, scan manifests, dispatch)
|
|
19
|
+
3. Map shipment lifecycle with visibility: order placed → warehouse allocated (nearest to destination with stock) → picked → packed → labeled → handed to carrier → in-transit (tracking events: picked up, at hub, out for delivery) → delivered (signature capture, photo proof) → exceptions handled (delivery failed, address incorrect, damaged)
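
The heuristic named in step 1 (nearest neighbor with 2-opt improvement) can be sketched for a single vehicle as below. This is deliberately simplified: it uses straight-line distances between `(x, y)` points and ignores capacity, shift hours, and time windows, so it is a starting point for a VRPTW solver rather than one. All function names are illustrative.

```python
import math


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def route_length(points, order):
    return sum(dist(points[order[i]], points[order[i + 1]]) for i in range(len(order) - 1))


def nearest_neighbor(points, depot=0):
    """Greedy construction: always drive to the closest unvisited stop."""
    unvisited = set(range(len(points))) - {depot}
    order = [depot]
    while unvisited:
        last = points[order[-1]]
        nxt = min(unvisited, key=lambda i: (dist(last, points[i]), i))
        order.append(nxt)
        unvisited.remove(nxt)
    return order + [depot]  # return to the depot


def two_opt(points, order):
    """Reverse segments of the route while doing so shortens it (depot stays fixed)."""
    best = order[:]
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 2):
            for j in range(i + 1, len(best) - 1):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if route_length(points, candidate) < route_length(points, best) - 1e-9:
                    best, improved = candidate, True
    return best
```

Comparing `route_length` before and after `two_opt` on historical orders gives the "improvement vs greedy baseline" figure used when validating route quality.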
|
|
20
|
+
|
|
21
|
+
### During implementation
|
|
22
|
+
- Implement route planning with real-time constraints: integrate maps API (Google Maps, Mapbox) for distance matrix (travel time between all location pairs), consider traffic patterns (historical data + real-time updates), vehicle constraints (capacity in cubic meters + weight, refrigeration for perishables), driver constraints (shift hours, break requirements, certifications for hazmat), time windows (hard: must deliver 2-4pm, soft: preferred 2-4pm with penalty)
|
|
23
|
+
- Build fleet management with telematics: integrate GPS devices (track vehicle location every 30s), monitor vehicle health (fuel level, engine diagnostics, tire pressure), driver behavior scoring (harsh braking, speeding, idle time), geofencing (alert when vehicle enters/exits depot or customer site), maintenance scheduling (oil change every 5000 miles, tire rotation every 10K miles)
|
|
24
|
+
- Design warehouse management system (WMS): inventory stored by SKU with bin location (aisle-rack-shelf), implement slotting optimization (place high-velocity items near packing, co-locate frequently ordered-together items), track lot numbers and expiry dates (FIFO/FEFO picking), support cycle counting (daily counts of A-items, weekly B-items, monthly C-items), integrate barcode scanners for all transactions
|
|
25
|
+
- Implement last-mile delivery optimization: cluster orders by geographic proximity (k-means clustering), assign to delivery vehicles (bin packing problem: maximize utilization), sequence stops (traveling salesman problem: minimize route distance), provide driver mobile app (turn-by-turn navigation, delivery instructions, photo capture, signature collection, real-time status updates)
|
|
26
|
+
- Build shipment tracking with carrier integration: integrate with carrier APIs (FedEx, UPS, USPS for tracking updates), parse webhooks (shipment picked up, in transit, out for delivery, delivered, exception), store tracking history (timestamp, location, status, notes), expose customer tracking page (order number lookup, map visualization, estimated delivery time), send proactive notifications (SMS/email at key events)
|
|
27
|
+
|
|
28
|
+
### After implementation
|
|
29
|
+
- Validate route optimization quality: measure total distance vs greedy baseline (should be 10-20% improvement), on-time delivery rate (>95%), vehicle utilization (>80% capacity filled), driver satisfaction (balanced workload, no overtime unless necessary), test with historical order data (replay last month's orders with optimized routes)
|
|
30
|
+
- Test warehouse efficiency metrics: measure pick rate (items per hour, target: 100 for standard, 200 for pick-to-light), packing time (minutes per order, target: <3 min), accuracy rate (>99.5% correct items, correct quantities), inventory accuracy (cycle count variance <1%), space utilization (>80% of racking filled)
|
|
31
|
+
- Execute delivery performance analysis: track key metrics (on-time delivery rate >95%, first-attempt delivery rate >90%, customer satisfaction score >4.5/5), identify failure modes (address incorrect, customer not home, access issues), measure exception resolution time (redelivery scheduled within 24 hours)
|
|
32
|
+
|
|
33
|
+
## Self-check before task completion
|
|
34
|
+
- [ ] Route optimization functional: VRPTW solver with constraints (capacity, time windows, driver shifts), objective function (minimize distance, balance workload)
|
|
35
|
+
- [ ] Fleet management integrated: GPS tracking (30s updates), telematics (vehicle health, driver behavior), geofencing (depot/customer alerts), maintenance scheduling
|
|
36
|
+
- [ ] Warehouse operations optimized: inventory slotting (velocity-based placement), picking strategies (batch, wave, pick-to-light), barcode scanning at all touchpoints
|
|
37
|
+
- [ ] Last-mile delivery optimized: geographic clustering, bin packing assignment, TSP sequencing, driver mobile app (navigation, photo/signature, status updates)
|
|
38
|
+
- [ ] Shipment tracking real-time: carrier API integration (FedEx/UPS/USPS), webhook parsing, tracking history storage, customer tracking page, proactive notifications
|
|
39
|
+
- [ ] Performance metrics tracked: route distance improvement (10-20% vs baseline), on-time delivery >95%, pick rate 100+ items/hr, pick accuracy >99.5%, cycle count variance <1%
|
|
40
|
+
- [ ] Exception handling robust: delivery failures (address incorrect, not home), redelivery scheduling (within 24h), customer notifications (SMS/email)
|
|
@@ -0,0 +1,99 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: market-researcher
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.0.6
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: market research, competitor analysis, TAM SAM SOM, market sizing, opportunity scoring, positioning strategy, SWOT analysis, competitive landscape, market opportunity, Porter forces, market entry, competitive advantage
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Skill — Market Researcher
|
|
10
|
+
|
|
11
|
+
## When this skill activates
|
|
12
|
+
Any task involving market sizing, competitive analysis, positioning strategy, opportunity
|
|
13
|
+
assessment, market entry evaluation, or strategic intelligence gathering.
|
|
14
|
+
|
|
15
|
+
## Mandatory actions when this skill is active
|
|
16
|
+
|
|
17
|
+
### Before
|
|
18
|
+
|
|
19
|
+
1. **Define market boundary** — Geographic scope, customer segment, product category. Ambiguous boundaries produce useless analysis.
|
|
20
|
+
2. **State the decision** — Market research serves a decision (enter, price, position, prioritize). Name it.
|
|
21
|
+
3. **Identify data sources** — Primary (interviews, surveys) and secondary (reports, filings). Note confidence per source.
|
|
22
|
+
|
|
23
|
+
### During
|
|
24
|
+
|
|
25
|
+
#### TAM/SAM/SOM sizing
|
|
26
|
+
```
|
|
27
|
+
Top-down (industry reports):
|
|
28
|
+
TAM = total market revenue at 100% share (analyst reports, govt data)
|
|
29
|
+
SAM = TAM * geographic_filter * segment_filter * product_fit_filter
|
|
30
|
+
SOM = SAM * realistic_capture_rate (3-5yr, benchmarked)
|
|
31
|
+
|
|
32
|
+
Bottom-up (unit economics):
|
|
33
|
+
SOM = reachable_customers * win_rate * average_ACV
|
|
34
|
+
|
|
35
|
+
Cross-validate: top-down and bottom-up within 2x. If divergent, re-examine assumptions.
|
|
36
|
+
```
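
The sizing formulas above translate directly into a few functions. The sketch below is only a worked illustration: the function names are made up, and any numbers passed in are placeholders, not market data.

```python
def top_down(tam, geographic_filter, segment_filter, product_fit_filter, capture_rate):
    """Return (SAM, SOM) from a top-down estimate."""
    sam = tam * geographic_filter * segment_filter * product_fit_filter
    return sam, sam * capture_rate


def bottom_up(reachable_customers, win_rate, average_acv):
    """Return SOM from unit economics."""
    return reachable_customers * win_rate * average_acv


def within_factor(a, b, factor=2.0):
    """True if the two estimates agree to within the given factor."""
    lo, hi = sorted((a, b))
    return lo > 0 and hi / lo <= factor
```

For example, with a placeholder TAM of 2,000,000,000 and filters of 0.25, 0.4, and 0.5, SAM is 100,000,000; a 5% capture rate gives a top-down SOM of 5,000,000. A bottom-up estimate of 4,000 reachable customers at a 10% win rate and 20,000 ACV gives 8,000,000, a ratio of 1.6, so the two estimates pass the 2x cross-validation.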
|
|
37
|
+
|
|
38
|
+
#### Competitor analysis framework
|
|
39
|
+
Per competitor document: overview (founded, HQ, size, funding, revenue estimate), product analysis (features, pricing model, integrations, UX), strengths (with evidence), weaknesses (with evidence), positioning (tagline, claimed differentiator, market perception), signals to monitor (hiring, changelog velocity, pricing moves).
|
|
40
|
+
|
|
41
|
+
#### SWOT analysis
|
|
42
|
+
```
|
|
43
|
+
Strengths (internal, current) | Weaknesses (internal, current)
|
|
44
|
+
Opportunities (external, future) | Threats (external, future)
|
|
45
|
+
|
|
46
|
+
Action matrix:
|
|
47
|
+
S+O = INVEST (leverage strengths to capture opportunities)
|
|
48
|
+
S+T = DEFEND (use strengths to mitigate threats)
|
|
49
|
+
W+O = IMPROVE (fix weaknesses to unlock opportunities)
|
|
50
|
+
W+T = URGENT (weaknesses that amplify threats — fix first)
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
#### Porter's Five Forces
|
|
54
|
+
1. Threat of New Entrants — barriers: capital, network effects, switching costs, regulation
|
|
55
|
+
2. Supplier Power — concentration, switching costs, substitute inputs
|
|
56
|
+
3. Buyer Power — concentration, price sensitivity, switching costs
|
|
57
|
+
4. Threat of Substitutes — alternative solutions, price-performance ratio
|
|
58
|
+
5. Competitive Rivalry — number of players, growth rate, differentiation, exit barriers
|
|
59
|
+
|
|
60
|
+
Rate each High/Medium/Low with specific evidence. Conclude overall industry attractiveness.
|
|
61
|
+
|
|
62
|
+
#### Opportunity scoring matrix
|
|
63
|
+
```
|
|
64
|
+
Score = 0.3*MarketSize + 0.3*Fit + 0.2*(6-Effort) + 0.2*(6-Competition)
|
|
65
|
+
All dimensions rated 1-5. Show weights and math transparently.
|
|
66
|
+
Rank opportunities by score. Top 2-3 become strategic focus.
|
|
67
|
+
```
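
The scoring formula can be applied mechanically, which helps keep the math transparent. In this sketch the function and parameter names are illustrative; the weights and the `6 - x` inversion for Effort and Competition come from the formula above.

```python
def opportunity_score(market_size, fit, effort, competition):
    """Weighted score on a 1-5 scale; lower effort and competition score higher."""
    for value in (market_size, fit, effort, competition):
        if not 1 <= value <= 5:
            raise ValueError("all dimensions must be rated 1-5")
    return (0.3 * market_size + 0.3 * fit
            + 0.2 * (6 - effort) + 0.2 * (6 - competition))


def rank(opportunities):
    """opportunities: dict of name -> (market_size, fit, effort, competition)."""
    scored = {name: opportunity_score(*dims) for name, dims in opportunities.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)
```

Because the weights sum to 1 and Effort and Competition are inverted, the score always falls between 1 (worst) and 5 (best).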
|
|
68
|
+
|
|
69
|
+
#### Positioning strategy (2x2 maps)
|
|
70
|
+
- Axes: choose two dimensions that matter to buyers (price vs breadth, ease vs power)
|
|
71
|
+
- Plot competitors and self on the map
|
|
72
|
+
- Identify white space (underserved quadrants)
|
|
73
|
+
- Craft positioning statement: "For [segment] who [need], [Product] is the [category] that [differentiator], unlike [alternative] which [limitation]."
|
|
74
|
+
|
|
75
|
+
#### Market entry timing
|
|
76
|
+
```
|
|
77
|
+
Market readiness: budget exists, pain is acute, category awareness present
|
|
78
|
+
Competitive window: no dominant incumbent, slow innovators, tech shift creates opening
|
|
79
|
+
Internal readiness: domain expertise, MVP <6 months, GTM channel identified, unit economics work
|
|
80
|
+
|
|
81
|
+
All green = GO | 1-2 yellow = GO with mitigation | Red in market readiness = WAIT
|
|
82
|
+
```
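
The decision rule above can be encoded as a small function. The rule only covers three cases, so this sketch returns `"UNSPECIFIED"` for any combination the rule does not address (for example three yellows, or a red outside market readiness); that label and the precedence of the market-readiness check are assumptions.

```python
VALID = {"green", "yellow", "red"}


def entry_decision(market, competitive, internal):
    """Each argument is 'green', 'yellow', or 'red' for that readiness dimension."""
    ratings = (market, competitive, internal)
    if any(r not in VALID for r in ratings):
        raise ValueError("ratings must be green, yellow, or red")
    if market == "red":
        return "WAIT"
    if all(r == "green" for r in ratings):
        return "GO"
    if "red" not in ratings and 1 <= ratings.count("yellow") <= 2:
        return "GO with mitigation"
    return "UNSPECIFIED"  # not covered by the stated rule; needs a judgment call
```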
|
|
83
|
+
|
|
84
|
+
### After
|
|
85
|
+
|
|
86
|
+
1. **Cross-validate** — Triangulate from 3+ sources. Single-source = hypothesis, not finding.
|
|
87
|
+
2. **Label confidence** — High (3+ sources), Medium (2 sources), Low (single/inference).
|
|
88
|
+
3. **Connect to decision** — Every insight maps to the stated decision. Remove interesting-but-unactionable analysis.
|
|
89
|
+
4. **Set refresh cadence** — Competitors: quarterly. Market sizing: annually.
|
|
90
|
+
|
|
91
|
+
## Self-check before task completion
|
|
92
|
+
- [ ] Market boundary clearly defined (geography, segment, category)
|
|
93
|
+
- [ ] TAM/SAM/SOM calculated top-down AND bottom-up, cross-validated
|
|
94
|
+
- [ ] Competitor profiles cover product, pricing, strengths, weaknesses
|
|
95
|
+
- [ ] SWOT has action matrix (S+O, S+T, W+O, W+T strategies)
|
|
96
|
+
- [ ] Porter's Five Forces assessed with evidence per force
|
|
97
|
+
- [ ] Positioning map identifies white space and adjacent moves
|
|
98
|
+
- [ ] Opportunity scoring uses weighted criteria with transparent math
|
|
99
|
+
- [ ] Confidence level labeled on every finding
|
|
@@ -0,0 +1,40 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: marketplace-trust
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.2.0
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: marketplace trust system, reputation scoring, fraud detection marketplace, dispute resolution workflow, escrow payment pattern, trust and safety, content moderation marketplace, seller verification, buyer protection, platform integrity, transaction dispute, marketplace fraud prevention
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Skill — Marketplace Trust
|
|
10
|
+
|
|
11
|
+
## When this skill activates
|
|
12
|
+
This skill activates when building reputation systems, fraud detection, dispute resolution workflows, escrow payment patterns, trust and safety mechanisms, content moderation, seller verification, buyer protection, or platform integrity features for marketplaces.
|
|
13
|
+
|
|
14
|
+
## Mandatory actions when this skill is active
|
|
15
|
+
|
|
16
|
+
### Before writing any code
|
|
17
|
+
1. Design reputation scoring system: aggregate trust signals (completion rate, average rating, response time, dispute rate, tenure, transaction volume), weight recent activity higher (recency buckets that approximate decay: last 30 days = 50%, 30-90 days = 30%, 90+ days = 20%), normalize to 0-100 scale, segment by buyer/seller roles (separate scores)
|
|
18
|
+
2. Model dispute resolution workflow: buyer opens dispute (order not received, item not as described, damaged) → seller responds within 48 hours → platform mediator reviews evidence (messages, photos, tracking) → decision rendered (refund, partial refund, favor seller) → appeal window (7 days) → final resolution → reputation impact recorded
|
|
19
|
+
3. Map escrow payment flow: buyer pays platform (funds held) → seller notified (order confirmed) → seller ships item → buyer receives and inspects (3-7 day review period) → buyer approves (funds released to seller) OR buyer disputes (funds held until resolution) → platform fee deducted (10-20%) → seller payout (ACH, wire, PayPal)
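
The recency weighting in step 1 could look like the sketch below. It collapses all trust signals into a single per-transaction `quality` value in [0, 1], and it renormalizes the weights when a bucket has no transactions; both simplifications are assumptions made to keep the example short.

```python
# (min_age_days, max_age_days, weight) from the 50% / 30% / 20% split in step 1.
BUCKETS = [(0, 30, 0.5), (30, 90, 0.3), (90, float("inf"), 0.2)]


def reputation_score(transactions):
    """transactions: list of (age_days, quality) with quality in [0, 1].

    Returns a 0-100 score, or None if there is no history yet.
    """
    weighted, total_weight = 0.0, 0.0
    for lo, hi, weight in BUCKETS:
        qualities = [q for age, q in transactions if lo <= age < hi]
        if qualities:
            weighted += weight * sum(qualities) / len(qualities)
            total_weight += weight
    if total_weight == 0:
        return None
    # Renormalize so accounts with empty buckets are not penalized for missing history.
    return round(100 * weighted / total_weight, 1)
```

Buyer and seller scores would be computed by calling the function separately on each role's transaction history.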
|
|
20
|
+
|
|
21
|
+
### During implementation
|
|
22
|
+
- Implement fraud detection with multi-signal analysis: velocity checks (too many orders in 1 hour, new account with high-value purchase), device fingerprinting (IP geolocation mismatch with shipping address, VPN detection), behavioral anomalies (first order is expensive electronics), content analysis (listing description contains phishing keywords), ML scoring (ensemble model: random forest + gradient boosting, threshold tuning for precision/recall)
|
|
23
|
+
- Build reputation system with decay and recovery: store transaction outcomes (completed, disputed, refunded), calculate rolling metrics (30-day completion rate, 90-day avg rating), apply penalty for disputes (temporary score reduction, recovers over 6 months with good behavior), prevent review manipulation (verified purchase only, rate limiting 1 review per transaction, detect suspicious patterns like coordinated 5-star reviews from new accounts)
|
|
24
|
+
- Design dispute resolution dashboard: queue disputes by age (SLA: respond within 24 hours), display conversation history with timestamps, attach evidence (photos, tracking numbers, payment receipts), automated suggestions (refund amount calculator based on depreciation, precedent from similar disputes), escalation to human moderator for complex cases, decision audit trail (who decided, when, rationale)
|
|
25
|
+
- Implement content moderation pipeline: user-generated content (listings, reviews, messages) → automated filters (profanity, prohibited items like weapons/drugs, spam detection via regex and ML) → flagged content queued for human review → moderator approves/rejects with reason → user notification (content removed, account warning, suspension on repeat violations) → appeal process
|
|
26
|
+
- Build seller verification with identity proofing: phone verification (SMS OTP), email verification (link click), government ID upload (passport, driver's license → OCR extraction, face matching against a selfie with a liveness check), business registration documents (tax ID, incorporation certificate → third-party verification API), bank account linking (micro-deposit verification)
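
Of the fraud signals listed above, the velocity check is the simplest to illustrate. The sketch below keeps a sliding window of order timestamps per account in memory; the class name, default limits, and in-memory storage are assumptions, and a production system would likely use a shared store instead.

```python
from collections import deque


class VelocityCheck:
    """Flags an account that places more than max_orders within window_s seconds."""

    def __init__(self, max_orders=5, window_s=3600):
        self.max_orders = max_orders
        self.window_s = window_s
        self._events = {}  # account_id -> deque of order timestamps (seconds)

    def record(self, account_id, ts):
        """Record an order at time ts; return True if the account exceeds the limit."""
        events = self._events.setdefault(account_id, deque())
        events.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while events and events[0] <= ts - self.window_s:
            events.popleft()
        return len(events) > self.max_orders
```

The boolean result would be one input among several to the overall fraud score, not a blocking decision on its own.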
|
|
27
|
+
|
|
28
|
+
### After implementation
|
|
29
|
+
- Validate reputation accuracy: measure correlation with actual outcomes (high-rated sellers have <2% dispute rate, low-rated >10%), test decay mechanism (old bad reviews eventually expire), verify manipulation resistance (fake review rings detected and removed), ensure fairness (new sellers can build reputation within 30 days with 10+ successful transactions)
|
|
30
|
+
- Test fraud detection effectiveness: simulate known fraud patterns (account takeover, credit card fraud, fake listings), measure detection rate (>95% of known fraud blocked), false positive rate (<1% of legit transactions flagged), detection latency (<1 second at transaction time), monitor feedback loop (fraud analysts label misses, retrain model monthly)
|
|
31
|
+
- Execute dispute resolution fairness audit: measure resolution time (median <72 hours), appeal rate (<5% of decisions), appeal overturn rate (10-15% indicates healthy calibration), sentiment analysis of user feedback (post-resolution satisfaction), detect bias (outcomes should not correlate with user demographics)
|
|
32
|
+
|
|
33
|
+
## Self-check before task completion
|
|
34
|
+
- [ ] Reputation scoring functional: aggregates trust signals (completion rate, rating, disputes), weights recent activity higher (exponential decay), normalized 0-100 score
|
|
35
|
+
- [ ] Fraud detection multi-signal: velocity checks, device fingerprinting, behavioral anomalies, content analysis, ML scoring with tuned threshold
|
|
36
|
+
- [ ] Dispute resolution workflow: buyer opens, seller responds (48h SLA), evidence review, decision rendered, appeal window (7 days), reputation impact
|
|
37
|
+
- [ ] Escrow payment implemented: funds held by platform, released on buyer approval or dispute resolution, seller payout with fee deduction
|
|
38
|
+
- [ ] Content moderation active: automated filters (profanity, prohibited items), human review queue, moderator actions (approve/reject/suspend), appeal process
|
|
39
|
+
- [ ] Seller verification robust: phone/email verification, government ID with OCR + liveness, business documents, bank account linking
|
|
40
|
+
- [ ] Fairness metrics tracked: resolution time <72h, appeal rate <5%, no bias in outcomes, new sellers can build reputation within 30 days
|