omgkit 2.9.1 → 2.10.1

Files changed (60)
  1. package/README.md +48 -1
  2. package/package.json +1 -1
  3. package/plugin/commands/workflow/1000x-innovation.md +61 -0
  4. package/plugin/commands/workflow/100x-architecture.md +60 -0
  5. package/plugin/commands/workflow/10x-improvement.md +63 -0
  6. package/plugin/commands/workflow/agent-development.md +60 -0
  7. package/plugin/commands/workflow/api-design.md +61 -0
  8. package/plugin/commands/workflow/api-testing.md +61 -0
  9. package/plugin/commands/workflow/authentication.md +60 -0
  10. package/plugin/commands/workflow/best-practices.md +61 -0
  11. package/plugin/commands/workflow/bug-fix.md +61 -0
  12. package/plugin/commands/workflow/code-review.md +52 -0
  13. package/plugin/commands/workflow/feature.md +73 -0
  14. package/plugin/commands/workflow/fine-tuning.md +60 -0
  15. package/plugin/commands/workflow/full-feature.md +70 -0
  16. package/plugin/commands/workflow/marketing.md +53 -0
  17. package/plugin/commands/workflow/migration.md +60 -0
  18. package/plugin/commands/workflow/model-evaluation.md +59 -0
  19. package/plugin/commands/workflow/optimization.md +60 -0
  20. package/plugin/commands/workflow/penetration-testing.md +60 -0
  21. package/plugin/commands/workflow/performance-optimization.md +60 -0
  22. package/plugin/commands/workflow/prompt-engineering.md +51 -0
  23. package/plugin/commands/workflow/rag-development.md +79 -0
  24. package/plugin/commands/workflow/refactor.md +59 -0
  25. package/plugin/commands/workflow/schema-design.md +70 -0
  26. package/plugin/commands/workflow/security-audit.md +61 -0
  27. package/plugin/commands/workflow/sprint-execution.md +65 -0
  28. package/plugin/commands/workflow/sprint-retrospective.md +61 -0
  29. package/plugin/commands/workflow/sprint-setup.md +64 -0
  30. package/plugin/commands/workflow/technical-docs.md +52 -0
  31. package/plugin/commands/workflow/technology-research.md +61 -0
  32. package/plugin/workflows/ai-engineering/agent-development.md +240 -0
  33. package/plugin/workflows/ai-engineering/fine-tuning.md +212 -0
  34. package/plugin/workflows/ai-engineering/model-evaluation.md +203 -0
  35. package/plugin/workflows/ai-engineering/prompt-engineering.md +192 -0
  36. package/plugin/workflows/ai-engineering/rag-development.md +203 -0
  37. package/plugin/workflows/api/api-design.md +152 -0
  38. package/plugin/workflows/api/api-testing.md +152 -0
  39. package/plugin/workflows/content/marketing.md +118 -0
  40. package/plugin/workflows/content/technical-docs.md +146 -0
  41. package/plugin/workflows/database/migration.md +153 -0
  42. package/plugin/workflows/database/optimization.md +136 -0
  43. package/plugin/workflows/database/schema-design.md +148 -0
  44. package/plugin/workflows/development/bug-fix.md +159 -0
  45. package/plugin/workflows/development/code-review.md +119 -0
  46. package/plugin/workflows/development/feature.md +171 -0
  47. package/plugin/workflows/development/refactor.md +155 -0
  48. package/plugin/workflows/fullstack/authentication.md +153 -0
  49. package/plugin/workflows/fullstack/full-feature.md +217 -0
  50. package/plugin/workflows/omega/1000x-innovation.md +167 -0
  51. package/plugin/workflows/omega/100x-architecture.md +150 -0
  52. package/plugin/workflows/omega/10x-improvement.md +228 -0
  53. package/plugin/workflows/quality/performance-optimization.md +157 -0
  54. package/plugin/workflows/research/best-practices.md +140 -0
  55. package/plugin/workflows/research/technology-research.md +130 -0
  56. package/plugin/workflows/security/penetration-testing.md +150 -0
  57. package/plugin/workflows/security/security-audit.md +176 -0
  58. package/plugin/workflows/sprint/sprint-execution.md +168 -0
  59. package/plugin/workflows/sprint/sprint-retrospective.md +168 -0
  60. package/plugin/workflows/sprint/sprint-setup.md +153 -0
@@ -0,0 +1,73 @@
+ ---
+ description: Complete feature development from planning to deployment
+ allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
+ argument-hint: <feature description>
+ ---
+
+ # Feature Development Workflow
+
+ Build feature: **$ARGUMENTS**
+
+ ## Workflow Steps
+
+ ### Step 1: Planning
+ **Agent:** @planner
+ **Command:** `/planning:plan "$ARGUMENTS"`
+
+ - Analyze feature requirements
+ - Break down into implementable tasks
+ - Create detailed implementation plan
+ - Define acceptance criteria
+ - Identify dependencies and risks
+
+ ### Step 2: Implementation
+ **Agent:** @fullstack-developer
+ **Command:** `/dev:feature "$ARGUMENTS"`
+
+ - Follow the implementation plan
+ - Write code incrementally
+ - Add inline documentation
+ - Follow coding standards
+
+ ### Step 3: Testing
+ **Agent:** @tester
+ **Command:** `/dev:test`
+
+ - Write unit tests for new code
+ - Write integration tests
+ - Achieve coverage target (>80%)
+ - Run all tests and fix failures
+
+ ### Step 4: Code Review
+ **Agent:** @code-reviewer
+ **Command:** `/dev:review`
+
+ - Review code quality
+ - Check for security issues
+ - Verify best practices
+ - Suggest improvements
+
+ ### Step 5: Commit & PR
+ **Agent:** @git-manager
+ **Command:** `/git:pr`
+
+ - Create feature branch
+ - Stage and commit changes
+ - Write meaningful commit message
+ - Create pull request
+
+ ## Progress Tracking
+ - [ ] Step 1: Planning complete
+ - [ ] Step 2: Implementation complete
+ - [ ] Step 3: Tests passing (>80% coverage)
+ - [ ] Step 4: Code review approved
+ - [ ] Step 5: PR created
+
+ ## Quality Gates
+ - [ ] Implementation plan approved
+ - [ ] All code follows project standards
+ - [ ] Test coverage exceeds 80%
+ - [ ] No security vulnerabilities
+ - [ ] Code review passed
+
+ Execute each step sequentially. Show progress after each step.
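
The command file above opens with `---`-delimited front matter (`description`, `allowed-tools`, `argument-hint`) and uses `$ARGUMENTS` as a placeholder in the body. The diff does not show how omgkit or its host tool actually loads these files, so the following is only a rough sketch of one plausible reading. `render_command` is a hypothetical helper, and the flat `key: value` parsing is an assumption that happens to fit the fields shown here.

```python
def render_command(text: str, arguments: str) -> tuple[dict[str, str], str]:
    """Split '---' front matter from the body and fill in $ARGUMENTS.

    Hypothetical helper: assumes flat 'key: value' front matter only.
    """
    meta: dict[str, str] = {}
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    body = text
    if stripped and stripped[0] == "---":
        # Front matter ends at the next '---' line.
        end = stripped.index("---", 1)
        for line in lines[1:end]:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
        body = "\n".join(lines[end + 1:])
    return meta, body.replace("$ARGUMENTS", arguments)


sample = """---
description: Complete feature development from planning to deployment
allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
argument-hint: <feature description>
---

Build feature: **$ARGUMENTS**
"""

meta, body = render_command(sample, "user login")
# Assumption: 'allowed-tools' is a comma-separated list of tool names.
tools = [tool.strip() for tool in meta["allowed-tools"].split(",")]
```

Under these assumptions, `body` contains `Build feature: **user login**` and `tools` holds the seven tool names from the header.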
@@ -0,0 +1,60 @@
+ ---
+ description: Model fine-tuning workflow
+ allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
+ argument-hint: <fine-tuning objective>
+ ---
+
+ # Fine-Tuning Workflow
+
+ Fine-tune for: **$ARGUMENTS**
+
+ ## Workflow Steps
+
+ ### Step 1: Data Preparation
+ **Agent:** @fullstack-developer
+
+ - Collect training data
+ - Format for fine-tuning
+ - Quality validation
+ - Train/val split
+
+ ### Step 2: Configuration
+ **Agent:** @researcher
+
+ - Select base model
+ - Configure hyperparameters
+ - Choose PEFT method (LoRA, QLoRA)
+ - Set up training
+
+ ### Step 3: Training
+ **Agent:** @fullstack-developer
+
+ - Run fine-tuning
+ - Monitor metrics
+ - Handle checkpoints
+ - Early stopping
+
+ ### Step 4: Evaluation
+ **Agent:** @tester
+
+ - Evaluate on test set
+ - Compare to baseline
+ - Check for regressions
+ - Measure improvements
+
+ ### Step 5: Deployment
+ **Agent:** @fullstack-developer
+
+ - Merge weights (if LoRA)
+ - Optimize for inference
+ - Deploy model
+ - Monitor performance
+
+ ## Progress Tracking
+ - [ ] Data prepared
+ - [ ] Configuration complete
+ - [ ] Training finished
+ - [ ] Evaluation passed
+ - [ ] Deployed
+
+ Execute each step sequentially. Show progress after each step.
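
Step 1 of the fine-tuning workflow above lists a "Train/val split" without saying how to do it. One common approach is a seeded shuffle followed by a cut, sketched below. The function name, the 10% default, and the seed are illustrative assumptions and do not come from the package.

```python
import random


def train_val_split(records, val_fraction=0.1, seed=0):
    """Shuffle records reproducibly and hold out a validation slice.

    Illustrative defaults: the workflow does not specify a ratio or seed.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    shuffled = list(records)
    # A dedicated Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


train, val = train_val_split(range(100), val_fraction=0.1, seed=42)
```

Fixing the seed makes the split repeatable, so a later run that compares against the baseline (Step 4) evaluates on the same held-out records.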
@@ -0,0 +1,70 @@
+ ---
+ description: Complex full-stack feature development
+ allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
+ argument-hint: <full-stack feature>
+ ---
+
+ # Full-Stack Feature Workflow
+
+ Build: **$ARGUMENTS**
+
+ ## Workflow Steps
+
+ ### Step 1: Architecture Design
+ **Agent:** @architect
+
+ - Design system architecture
+ - Plan database schema
+ - Design API contracts
+ - Plan frontend components
+
+ ### Step 2: Backend Implementation
+ **Agent:** @fullstack-developer
+
+ - Implement database models
+ - Create API endpoints
+ - Add business logic
+ - Write backend tests
+
+ ### Step 3: Frontend Implementation
+ **Agent:** @fullstack-developer
+
+ - Create UI components
+ - Implement state management
+ - Connect to API
+ - Add styling
+
+ ### Step 4: Integration
+ **Agent:** @fullstack-developer
+
+ - End-to-end integration
+ - Error handling
+ - Loading states
+ - Edge cases
+
+ ### Step 5: Testing
+ **Agent:** @tester
+ **Command:** `/dev:test`
+
+ - Unit tests (frontend + backend)
+ - Integration tests
+ - E2E tests
+ - Performance tests
+
+ ### Step 6: Review & Deploy
+ **Agent:** @code-reviewer, @git-manager
+
+ - Code review
+ - Security review
+ - Create PR
+ - Deploy
+
+ ## Progress Tracking
+ - [ ] Architecture designed
+ - [ ] Backend complete
+ - [ ] Frontend complete
+ - [ ] Integration done
+ - [ ] Tests passing
+ - [ ] Deployed
+
+ Execute each layer. Full-stack excellence.
@@ -0,0 +1,53 @@
+ ---
+ description: Create marketing content and materials
+ allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
+ argument-hint: <marketing goal>
+ ---
+
+ # Marketing Content Workflow
+
+ Create: **$ARGUMENTS**
+
+ ## Workflow Steps
+
+ ### Step 1: Research
+ **Agent:** @researcher
+ **Command:** `/planning:research`
+
+ - Target audience analysis
+ - Competitor research
+ - Market positioning
+ - Key messages
+
+ ### Step 2: Ideation
+ **Agent:** @brainstormer
+ **Command:** `/planning:brainstorm`
+
+ - Message angles
+ - Headline options
+ - Value propositions
+ - Call-to-actions
+
+ ### Step 3: Content Creation
+ **Agent:** @copywriter
+
+ - Write headlines
+ - Create body copy
+ - Features/benefits
+ - CTAs
+
+ ### Step 4: Review
+ **Agent:** @copywriter
+
+ - Tone consistency
+ - Clarity
+ - Persuasiveness
+ - Grammar
+
+ ## Progress Tracking
+ - [ ] Research complete
+ - [ ] Ideas generated
+ - [ ] Content written
+ - [ ] Review complete
+
+ Execute for compelling content.
@@ -0,0 +1,60 @@
+ ---
+ description: Safe database migration workflow
+ allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
+ argument-hint: <migration description>
+ ---
+
+ # Database Migration Workflow
+
+ Migration: **$ARGUMENTS**
+
+ ## Workflow Steps
+
+ ### Step 1: Change Analysis
+ **Agent:** @database-admin
+
+ - Analyze required changes
+ - Assess impact
+ - Identify risks
+ - Plan approach
+
+ ### Step 2: Migration Design
+ **Agent:** @database-admin
+
+ - Design migration strategy
+ - Plan zero-downtime approach
+ - Create rollback plan
+ - Document steps
+
+ ### Step 3: Migration Creation
+ **Agent:** @database-admin
+
+ - Write migration scripts
+ - Include rollback
+ - Add data transformations
+ - Test locally
+
+ ### Step 4: Staging Test
+ **Agent:** @tester
+
+ - Run on staging
+ - Verify data integrity
+ - Test rollback
+ - Measure performance
+
+ ### Step 5: Production Deploy
+ **Agent:** @database-admin
+
+ - Execute migration
+ - Monitor closely
+ - Verify success
+ - Document completion
+
+ ## Progress Tracking
+ - [ ] Changes analyzed
+ - [ ] Migration designed
+ - [ ] Scripts created
+ - [ ] Staging tested
+ - [ ] Production deployed
+
+ Execute carefully. Data integrity is critical.
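
Steps 3 and 4 of the migration workflow above ask for scripts that include a rollback and a data transformation, and for the rollback to be tested. The sketch below shows that shape with Python's built-in `sqlite3` module. SQLite is only a stand-in, since the workflow names no database engine, and the `users` and `user_emails` tables and the `apply` helper are hypothetical.

```python
import sqlite3

# Hypothetical migration: move a normalized copy of users.contact
# into a new user_emails table, with a matching rollback.
UP = [
    "CREATE TABLE user_emails (user_id INTEGER PRIMARY KEY, email TEXT NOT NULL)",
    "INSERT INTO user_emails (user_id, email) SELECT id, lower(contact) FROM users",
]
DOWN = ["DROP TABLE user_emails"]


def apply(conn, statements):
    """Run all statements in one transaction; undo everything on failure."""
    conn.execute("BEGIN")
    try:
        for statement in statements:
            conn.execute(statement)
    except Exception:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")


def table_names(conn):
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {row[0] for row in rows}


# isolation_level=None leaves transaction control to the explicit BEGIN above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, contact TEXT)")
conn.execute("INSERT INTO users (contact) VALUES ('Ada@Example.com')")

apply(conn, UP)
migrated = conn.execute("SELECT email FROM user_emails").fetchall()
tables_after_up = table_names(conn)

apply(conn, DOWN)  # exercise the rollback path, as in Step 4
tables_after_down = table_names(conn)
```

Wrapping each direction in a single transaction means a failure midway leaves the schema as it was, which is one way to support the "data integrity is critical" requirement. Engines that cannot roll back DDL would need a different strategy.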
@@ -0,0 +1,59 @@
+ ---
+ description: AI model evaluation and benchmarking workflow
+ allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
+ argument-hint: <model or system to evaluate>
+ ---
+
+ # Model Evaluation Workflow
+
+ Evaluate: **$ARGUMENTS**
+
+ ## Workflow Steps
+
+ ### Step 1: Evaluation Planning
+ **Agent:** @planner
+
+ - Define evaluation objectives
+ - Identify key metrics
+ - Design test cases
+ - Plan benchmarks
+
+ ### Step 2: Dataset Preparation
+ **Agent:** @fullstack-developer
+
+ - Collect evaluation data
+ - Create test sets
+ - Label ground truth
+ - Validate data quality
+
+ ### Step 3: Metric Implementation
+ **Agent:** @fullstack-developer
+
+ - Implement evaluation metrics
+ - Set up scoring functions
+ - Create comparison framework
+
+ ### Step 4: Evaluation Execution
+ **Agent:** @tester
+
+ - Run evaluations
+ - Collect results
+ - Statistical analysis
+ - Compare baselines
+
+ ### Step 5: Reporting
+ **Agent:** @docs-manager
+
+ - Create evaluation report
+ - Visualize results
+ - Document findings
+ - Recommend improvements
+
+ ## Progress Tracking
+ - [ ] Evaluation planned
+ - [ ] Dataset prepared
+ - [ ] Metrics implemented
+ - [ ] Evaluation complete
+ - [ ] Report generated
+
+ Execute each step sequentially. Show progress after each step.
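
Step 4 of the evaluation workflow above calls for "Statistical analysis" and "Compare baselines" without naming a method. One option, assuming both models are scored on the same test cases, is a paired bootstrap confidence interval on the mean score difference. The function name, resample count, and scores below are illustrative, not part of the package.

```python
import random
from statistics import mean


def paired_bootstrap_ci(baseline, candidate, n_resamples=2000, alpha=0.05, seed=0):
    """Return (mean difference, lower, upper) for candidate minus baseline.

    Assumes baseline[i] and candidate[i] are scores on the same test case.
    """
    if len(baseline) != len(candidate) or not baseline:
        raise ValueError("score lists must be non-empty and the same length")
    diffs = [c - b for b, c in zip(baseline, candidate)]
    rng = random.Random(seed)
    n = len(diffs)
    # Resample the per-case differences with replacement and record each mean.
    means = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(diffs), lower, upper


# Made-up scores for eight shared test cases.
baseline_scores = [0.60, 0.55, 0.70, 0.65, 0.50, 0.62, 0.58, 0.66]
candidate_scores = [0.65, 0.63, 0.72, 0.71, 0.60, 0.66, 0.65, 0.69]
mean_diff, ci_low, ci_high = paired_bootstrap_ci(baseline_scores, candidate_scores)
```

If the interval excludes zero, the difference is unlikely to be resampling noise. With only a handful of test cases, the interval should be read with caution.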
@@ -0,0 +1,60 @@
+ ---
+ description: Database performance optimization workflow
+ allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
+ argument-hint: <optimization target>
+ ---
+
+ # Database Optimization Workflow
+
+ Optimize: **$ARGUMENTS**
+
+ ## Workflow Steps
+
+ ### Step 1: Performance Analysis
+ **Agent:** @database-admin
+
+ - Identify slow queries
+ - Analyze execution plans
+ - Review index usage
+ - Measure baselines
+
+ ### Step 2: Query Optimization
+ **Agent:** @database-admin
+
+ - Rewrite slow queries
+ - Add missing indexes
+ - Remove unused indexes
+ - Optimize joins
+
+ ### Step 3: Schema Optimization
+ **Agent:** @database-admin
+
+ - Review data types
+ - Consider partitioning
+ - Evaluate denormalization
+ - Optimize storage
+
+ ### Step 4: Configuration Tuning
+ **Agent:** @database-admin
+
+ - Review DB configuration
+ - Tune memory settings
+ - Optimize connections
+ - Configure caching
+
+ ### Step 5: Validation
+ **Agent:** @tester
+
+ - Benchmark improvements
+ - Compare to baseline
+ - Test under load
+ - Document gains
+
+ ## Progress Tracking
+ - [ ] Performance analyzed
+ - [ ] Queries optimized
+ - [ ] Schema improved
+ - [ ] Configuration tuned
+ - [ ] Improvements validated
+
+ Execute each step. Measure impact.
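
Steps 1 and 2 of the database workflow above ("Analyze execution plans", "Add missing indexes") can be illustrated by comparing a query plan before and after an index is created. The sketch uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module. SQLite is an assumption here, since the workflow names no engine, and the `orders` table and `query_plan` helper are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each row holds the human-readable plan step.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

sql = "SELECT * FROM orders WHERE customer_id = ?"
before = query_plan(conn, sql, (42,))  # a full table scan is expected here

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = query_plan(conn, sql, (42,))   # the plan should now use the index
```

The exact plan wording varies between SQLite versions, so the comparison is better made on whether the plan scans the table or searches an index than on the literal text. Other engines expose similar information through their own `EXPLAIN` variants.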
@@ -0,0 +1,60 @@
+ ---
+ description: Penetration testing workflow
+ allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
+ argument-hint: <target system>
+ ---
+
+ # Penetration Testing Workflow
+
+ Pentest: **$ARGUMENTS**
+
+ ## Workflow Steps
+
+ ### Step 1: Reconnaissance
+ **Agent:** @security-auditor
+
+ - Gather information
+ - Map infrastructure
+ - Identify entry points
+ - Document findings
+
+ ### Step 2: Vulnerability Analysis
+ **Agent:** @vulnerability-scanner
+
+ - Scan for vulnerabilities
+ - Identify misconfigurations
+ - Check for known CVEs
+ - Analyze attack vectors
+
+ ### Step 3: Exploitation (Authorized Only)
+ **Agent:** @security-auditor
+
+ - Attempt authorized exploits
+ - Document successful attacks
+ - Measure impact
+ - Preserve evidence
+
+ ### Step 4: Post-Exploitation
+ **Agent:** @security-auditor
+
+ - Assess access gained
+ - Identify lateral movement
+ - Document persistence risks
+ - Clean up artifacts
+
+ ### Step 5: Reporting
+ **Agent:** @docs-manager
+
+ - Create detailed report
+ - Include proof of concepts
+ - Provide remediation steps
+ - Executive summary
+
+ ## Progress Tracking
+ - [ ] Recon complete
+ - [ ] Vulnerabilities identified
+ - [ ] Testing complete
+ - [ ] Analysis done
+ - [ ] Report delivered
+
+ ⚠️ Only for authorized testing contexts.
@@ -0,0 +1,60 @@
+ ---
+ description: Application performance optimization workflow
+ allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
+ argument-hint: <optimization target>
+ ---
+
+ # Performance Optimization Workflow
+
+ Optimize: **$ARGUMENTS**
+
+ ## Workflow Steps
+
+ ### Step 1: Profiling
+ **Agent:** @fullstack-developer
+
+ - Profile application
+ - Identify bottlenecks
+ - Measure baselines
+ - Document hotspots
+
+ ### Step 2: Analysis
+ **Agent:** @architect
+
+ - Analyze root causes
+ - Evaluate solutions
+ - Assess trade-offs
+ - Prioritize fixes
+
+ ### Step 3: Optimization
+ **Agent:** @fullstack-developer
+ **Command:** `/quality:optimize`
+
+ - Implement optimizations
+ - Add caching
+ - Optimize queries
+ - Reduce bundle size
+
+ ### Step 4: Validation
+ **Agent:** @tester
+
+ - Benchmark improvements
+ - Load testing
+ - Compare to baseline
+ - Verify no regressions
+
+ ### Step 5: Documentation
+ **Agent:** @docs-manager
+
+ - Document changes
+ - Update runbooks
+ - Share learnings
+
+ ## Progress Tracking
+ - [ ] Profiling complete
+ - [ ] Analysis done
+ - [ ] Optimizations applied
+ - [ ] Improvements validated
+ - [ ] Documented
+
+ Execute each step. Measure everything.
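
The loop in the workflow above, "Measure baselines", then "Add caching", then "Compare to baseline" and "Verify no regressions", can be shown in miniature with the standard library. `slow_lookup` is a made-up stand-in for a hot path found during profiling, and memoizing it with `functools.lru_cache` is just one possible optimization.

```python
import time
from functools import lru_cache


def slow_lookup(n):
    """Made-up stand-in for an expensive, pure computation."""
    return sum(i * i for i in range(n))


def time_calls(fn, arg, repeats=50):
    """Return the wall-clock seconds taken by `repeats` calls of fn(arg)."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(arg)
    return time.perf_counter() - start


# Caching is only safe because slow_lookup depends on nothing but its argument.
cached_lookup = lru_cache(maxsize=128)(slow_lookup)

baseline_seconds = time_calls(slow_lookup, 20_000)
optimized_seconds = time_calls(cached_lookup, 20_000)

# Regression check: the optimized path must return the same result.
same_result = cached_lookup(20_000) == slow_lookup(20_000)
```

Timings depend on the machine, so the two durations are worth recording rather than asserting against a fixed threshold. The equality check is the part that guards against regressions.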
@@ -0,0 +1,51 @@
+ ---
+ description: Systematic prompt optimization workflow
+ allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
+ argument-hint: <prompt optimization goal>
+ ---
+
+ # Prompt Engineering Workflow
+
+ Optimize prompts for: **$ARGUMENTS**
+
+ ## Workflow Steps
+
+ ### Step 1: Baseline Analysis
+ **Agent:** @researcher
+
+ - Analyze current prompts
+ - Identify issues
+ - Document failure cases
+ - Establish baseline metrics
+
+ ### Step 2: Prompt Design
+ **Agent:** @oracle
+
+ - Apply prompt engineering techniques
+ - Create prompt variants
+ - Add few-shot examples
+ - Implement chain-of-thought
+
+ ### Step 3: A/B Testing
+ **Agent:** @tester
+
+ - Test prompt variants
+ - Measure quality metrics
+ - Statistical comparison
+ - Identify winners
+
+ ### Step 4: Optimization
+ **Agent:** @fullstack-developer
+
+ - Refine winning prompts
+ - Optimize token usage
+ - Add guardrails
+ - Document final prompts
+
+ ## Progress Tracking
+ - [ ] Baseline established
+ - [ ] Prompts designed
+ - [ ] A/B testing complete
+ - [ ] Optimization done
+
+ Execute each step sequentially. Show progress after each step.
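
Step 3 of the prompt workflow above asks for a "Statistical comparison" of variants without naming a test. If each variant is graded pass/fail on a set of test inputs, a two-proportion z-test is one simple choice. That grading scheme, the function name, and the counts below are assumptions made for illustration.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the pass rate of B versus A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled pass rate under the null hypothesis that both variants are equal.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: variant A passes 120 of 200 cases, variant B passes 150 of 200.
z, p_value = two_proportion_z_test(120, 200, 150, 200)
```

The normal approximation behind this test is reasonable for a few hundred graded cases per variant. For small samples or non-binary quality scores, a different test would fit better.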
@@ -0,0 +1,79 @@
+ ---
+ description: Build Retrieval-Augmented Generation systems
+ allowed-tools: Task, Read, Write, Edit, Bash, Grep, Glob
+ argument-hint: <RAG system description>
+ ---
+
+ # RAG Development Workflow
+
+ Build RAG system: **$ARGUMENTS**
+
+ ## Workflow Steps
+
+ ### Step 1: Research
+ **Agent:** @researcher
+ **Command:** `/planning:research "RAG best practices"`
+
+ - Study RAG architectures
+ - Evaluate chunking strategies
+ - Compare embedding models
+ - Review retrieval methods
+
+ ### Step 2: Architecture Design
+ **Agent:** @architect
+
+ - Define system architecture
+ - Select components (vector DB, embeddings, LLM)
+ - Plan data pipeline
+ - Design API interface
+
+ ### Step 3: Data Preparation
+ **Agent:** @fullstack-developer
+
+ - Implement document loaders
+ - Create chunking pipeline
+ - Handle different formats
+ - Clean and normalize text
+
+ ### Step 4: Embedding Pipeline
+ **Agent:** @fullstack-developer
+
+ - Integrate embedding model
+ - Create batch processing
+ - Store in vector database
+ - Optimize indexing
+
+ ### Step 5: Retrieval System
+ **Agent:** @fullstack-developer
+
+ - Implement semantic search
+ - Add hybrid retrieval (BM25 + vector)
+ - Build reranking pipeline
+ - Context assembly
+
+ ### Step 6: Generation Pipeline
+ **Agent:** @fullstack-developer
+
+ - Create prompt templates
+ - Context injection
+ - LLM integration
+ - Response formatting
+
+ ### Step 7: Evaluation
+ **Agent:** @tester
+
+ - Test retrieval accuracy (Recall@K, MRR)
+ - Measure answer quality
+ - Benchmark latency
+ - Analyze costs
+
+ ## Progress Tracking
+ - [ ] Research complete
+ - [ ] Architecture designed
+ - [ ] Data pipeline ready
+ - [ ] Embeddings working
+ - [ ] Retrieval system built
+ - [ ] Generation working
+ - [ ] Evaluation passed
+
+ Execute each step sequentially. Show progress after each step.
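
Step 7 of the RAG workflow above names two retrieval metrics, Recall@K and MRR, without defining them. The sketch below uses the standard definitions: Recall@K is the fraction of relevant documents found in the top K results, and MRR is the mean over queries of 1/rank of the first relevant result. The document IDs and query runs are made up.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("each query needs at least one relevant document")
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)


# Made-up results for three queries: ranked IDs and the ground-truth relevant set.
runs = [
    (["d3", "d1", "d7"], {"d1"}),        # first hit at rank 2
    (["d2", "d9", "d4"], {"d4", "d5"}),  # first hit at rank 3
    (["d8", "d6"], {"d0"}),              # no hit
]
mrr = mean_reciprocal_rank(runs)
recall_q2 = recall_at_k(["d2", "d9", "d4"], {"d4", "d5"}, k=3)
```

Both metrics need labeled ground truth per query. The workflow does not say where those labels come from, so they are assumed to exist here.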