claude-flow-novice 2.15.10 → 2.15.11

Files changed (47)
  1. package/claude-assets/agents/cfn-dev-team/CLAUDE.md +9 -81
  2. package/claude-assets/agents/cfn-dev-team/architecture/base-template-generator.md +4 -4
  3. package/claude-assets/agents/cfn-dev-team/architecture/planner.md +4 -4
  4. package/claude-assets/agents/cfn-dev-team/architecture/system-architect.md +5 -5
  5. package/claude-assets/agents/cfn-dev-team/coordinators/cfn-v3-coordinator.md +5 -1
  6. package/claude-assets/agents/cfn-dev-team/dev-ops/devops-engineer.md +4 -4
  7. package/claude-assets/agents/cfn-dev-team/dev-ops/docker-specialist.md +7 -37
  8. package/claude-assets/agents/cfn-dev-team/dev-ops/kubernetes-specialist.md +7 -37
  9. package/claude-assets/agents/cfn-dev-team/dev-ops/monitoring-specialist.md +4 -4
  10. package/claude-assets/agents/cfn-dev-team/developers/api-gateway-specialist.md +11 -42
  11. package/claude-assets/agents/cfn-dev-team/product-owners/accessibility-advocate-persona.md +4 -4
  12. package/claude-assets/agents/cfn-dev-team/product-owners/cto-agent.md +4 -4
  13. package/claude-assets/agents/cfn-dev-team/product-owners/power-user-persona.md +4 -4
  14. package/claude-assets/agents/cfn-dev-team/product-owners/product-owner.md +18 -22
  15. package/claude-assets/agents/cfn-dev-team/reviewers/code-reviewer.md +1 -1
  16. package/claude-assets/agents/cfn-dev-team/reviewers/quality/code-quality-validator.md +1 -1
  17. package/claude-assets/agents/cfn-dev-team/reviewers/quality/perf-analyzer.md +1 -1
  18. package/claude-assets/agents/cfn-dev-team/reviewers/quality/performance-benchmarker.md +1 -1
  19. package/claude-assets/agents/cfn-dev-team/reviewers/quality/security-specialist.md +1 -1
  20. package/claude-assets/agents/cfn-dev-team/testers/api-testing-specialist.md +7 -35
  21. package/claude-assets/agents/cfn-dev-team/testers/chaos-engineering-specialist.md +17 -36
  22. package/claude-assets/agents/cfn-dev-team/testers/contract-tester.md +10 -11
  23. package/claude-assets/agents/cfn-dev-team/testers/e2e/playwright-tester.md +5 -5
  24. package/claude-assets/agents/cfn-dev-team/testers/integration-tester.md +10 -12
  25. package/claude-assets/agents/cfn-dev-team/testers/interaction-tester.md +7 -36
  26. package/claude-assets/agents/cfn-dev-team/testers/load-testing-specialist.md +10 -12
  27. package/claude-assets/agents/cfn-dev-team/testers/mutation-testing-specialist.md +10 -12
  28. package/claude-assets/agents/cfn-dev-team/testers/playwright-tester.md +7 -37
  29. package/claude-assets/agents/cfn-dev-team/testers/tester.md +7 -33
  30. package/claude-assets/agents/cfn-dev-team/testers/unit/tdd-london-unit-swarm.md +5 -5
  31. package/claude-assets/agents/cfn-dev-team/testers/validation/validation-production-validator.md +4 -4
  32. package/claude-assets/agents/cfn-dev-team/testing/test-validation-agent.md +4 -4
  33. package/claude-assets/agents/cfn-dev-team/utility/agent-builder.md +16 -16
  34. package/claude-assets/agents/cfn-dev-team/utility/analyst.md +4 -4
  35. package/claude-assets/agents/cfn-dev-team/utility/code-booster.md +4 -4
  36. package/claude-assets/agents/cfn-dev-team/utility/context-curator.md +4 -4
  37. package/claude-assets/agents/cfn-dev-team/utility/epic-creator.md +7 -92
  38. package/claude-assets/agents/cfn-dev-team/utility/memory-leak-specialist.md +7 -100
  39. package/claude-assets/agents/cfn-dev-team/utility/researcher.md +4 -4
  40. package/claude-assets/agents/cfn-dev-team/utility/z-ai-specialist.md +7 -91
  41. package/dist/cli/agent-definition-parser.js +37 -4
  42. package/dist/cli/agent-definition-parser.js.map +1 -1
  43. package/dist/cli/agent-executor.js +32 -2
  44. package/dist/cli/agent-executor.js.map +1 -1
  45. package/dist/cli/config-manager.js +109 -91
  46. package/dist/cli/config-manager.js.map +1 -1
  47. package/package.json +1 -1
@@ -603,14 +603,14 @@ npx claude-flow-novice agent-spawn backend-dev --task-id "${TASK_ID}"
  npx claude-flow-novice agent-spawn reviewer --task-id "${TASK_ID}"
  ```
 
- ### Signaling Pattern (CLI Mode Only)
+ ### Signaling Pattern
  ```bash
- # CLI Mode: Signal completion
+ # Signal completion (when coordination is required)
  if [[ -n "${TASK_ID:-}" && -n "${AGENT_ID:-}" ]]; then
  redis-cli lpush "swarm:${TASK_ID}:${AGENT_ID}:done" "complete"
  fi
 
- # CLI Mode: Wait for other agent (zero-token blocking)
+ # Wait for other agent (zero-token blocking)
  if [[ -n "${TASK_ID:-}" ]]; then
  redis-cli blpop "swarm:${TASK_ID}:other-agent:done" 30
  fi
@@ -656,11 +656,11 @@ Agents participate in 3-loop validation:
 
  Complete your work and provide a structured response with:
  - Confidence score (0.0-1.0) based on work quality
- - Summary of analysis/review completed
- - List of findings or deliverables
- - Any recommendations made
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- **Note:** Coordination instructions are provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
  ```
 
  ### Validation Hooks
@@ -994,79 +994,7 @@ Now that you understand agent creation, explore:
 
  ---
 
- ## CFN Loop Completion Protocol (MODE-SPECIFIC)
-
- **Date:** 2025-11-06
- **Status:** Memory Leak Fix Applied - ANTI-023 Resolution
-
- ### ⚠️ CRITICAL: Mode-Specific Completion Required
-
- Agents MUST use different completion protocols based on how they were spawned:
-
- ### Task Mode (Spawned via Task() tool in Main Chat)
-
- **Simply complete your work and return structured output.**
-
- ```json
- {
- "confidence": 0.85,
- "status": "COMPLETE|NEEDS_WORK",
- "summary": "Brief summary of work completed",
- "deliverables": ["file1.ts", "file2.test.ts"]
- }
- ```
-
- **❌ DO NOT:**
- - Use Redis commands (redis-cli)
- - Execute bash scripts for completion
- - Signal completion via CLI tools
- - **Main Chat receives your output automatically**
-
- ### CLI Mode (Spawned via `npx claude-flow-novice agent-spawn`)
-
- **Step 1: Complete Work**
- Execute assigned task (implementation, validation, review, etc.)
-
- **Step 2: Signal Completion (CLI Mode Only)**
- ```bash
- # CLI Mode: Signal completion
- if [[ -n "${TASK_ID:-}" && -n "${AGENT_ID:-}" ]]; then
- redis-cli lpush "swarm:${TASK_ID}:${AGENT_ID}:done" "complete"
- fi
- ```
-
- **Step 3: Report Confidence Score and Exit (CLI Mode Only)**
- ```bash
- # CLI Mode: Report confidence
- if [[ -n "${TASK_ID:-}" && -n "${AGENT_ID:-}" ]]; then
- ./.claude/skills/cfn-redis-coordination/invoke-waiting-mode.sh report \
- --task-id "$TASK_ID" \
- --agent-id "$AGENT_ID" \
- --confidence [0.0-1.0] \
- --iteration 1
- fi
- ```
-
- ### How to Tell Which Mode You're In
-
- - **Task Mode**: You were spawned via `Task("agent-name", "...")` in Main Chat
- - **CLI Mode**: You were spawned via `npx claude-flow-novice agent-spawn ...` command
-
- ### Why This Matters
-
- - **Task Mode**: Main Chat handles everything, just return results
- - **CLI Mode**: Coordinator needs Redis signals to collect confidence scores
- - **Mixed protocols cause memory leaks** (ANTI-023 pattern)
-
- ### Related Documentation
-
- - **Memory Leak Fix:** `docs/bugs/BUG_MEMORY_LEAK_VALIDATOR_FIX.md`
- - **Agent Lifecycle:** `.claude/agents/AGENT_LIFECYCLE.md`
- - **Main Documentation:** `CLAUDE.md:333-357` (Mode-specific protocols)
-
- ---
-
- **Document Version:** 4.2.0 (Memory Leak Fix - Mode-Specific Protocols)
- **Last Updated:** 2025-11-06
+ **Document Version:** 4.3.0 (Unified Completion Protocol)
+ **Last Updated:** 2025-11-19
  **Maintained By:** Claude Flow Novice Team
  **Feedback:** We'd love to hear how you're using agents! Share your creations.
@@ -142,8 +142,8 @@ When generating templates, always consider the broader project context, existing
 
  Complete your work and provide a structured response with:
  - Confidence score (0.0-1.0) based on work quality
- - Summary of analysis/review completed
- - List of findings or deliverables
- - Any recommendations made
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- **Note:** Coordination instructions are provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
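
The structured response that this Completion Protocol asks for could be emitted as a single JSON object. The following is a minimal sketch, not part of claude-flow-novice: `json_escape` and `completion_report` are hypothetical helper names, and the field names are assumed from the Task Mode JSON example that this release removes from CLAUDE.md.

```bash
# Hypothetical helpers (not part of claude-flow-novice).

json_escape() {
  # Escape backslashes and double quotes; assumes single-line input.
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

completion_report() {
  # Usage: completion_report CONFIDENCE SUMMARY [DELIVERABLE...]
  local confidence="$1" summary="$2"
  shift 2
  # Reject anything that is not a plain number in [0.0, 1.0].
  if ! awk -v c="$confidence" \
    'BEGIN { exit !((c ~ /^[0-9]*\.?[0-9]+$/) && (c >= 0) && (c <= 1)) }'; then
    echo "confidence must be between 0.0 and 1.0" >&2
    return 1
  fi
  # Build the JSON array of deliverables from the remaining arguments.
  local items="" d
  for d in "$@"; do
    items="${items}${items:+, }\"$(json_escape "$d")\""
  done
  printf '{"confidence": %s, "summary": "%s", "deliverables": [%s]}\n' \
    "$confidence" "$(json_escape "$summary")" "$items"
}

completion_report 0.85 "Added parser" file1.ts file2.test.ts
# prints {"confidence": 0.85, "summary": "Added parser", "deliverables": ["file1.ts", "file2.test.ts"]}
```

Validating the confidence value before printing keeps an out-of-range score from reaching whatever consumes the report.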
@@ -123,9 +123,9 @@ Remember: A good plan executed now is better than a perfect plan executed never.
 
  Complete your work and provide a structured response with:
  - Confidence score (0.0-1.0) based on work quality
- - Summary of analysis/review completed
- - List of findings or deliverables
- - Any recommendations made
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- **Note:** Coordination instructions are provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
 
@@ -115,10 +115,10 @@ await sqlite.memoryAdapter.set(
  **Core Insight:** Great architecture balances technical excellence with business needs, making informed trade-offs that enable long-term system health and adaptability.
  ## Completion Protocol
 
- Complete your architectural work and provide a structured response with:
- - Confidence score (0.0-1.0) based on architectural quality
- - Summary of design decisions made
+ Complete your work and provide a structured response with:
+ - Confidence score (0.0-1.0) based on work quality
+ - Summary of work completed
  - List of deliverables created
- - Any assumptions or constraints identified
+ - Any recommendations or findings
 
- **Note:** Coordination instructions are provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
@@ -454,7 +454,11 @@ CRITERIA_JSON='{
  echo "$CRITERIA_JSON" | redis-cli -h "${REDIS_HOST:-localhost}" -p "${REDIS_PORT:-6379}" \
  -x HSET "cfn_loop:task:${TASK_ID}:context" "success-criteria" >/dev/null 2>&1 || true
 
- echo " ✅ Success criteria stored in Redis"
+ # Store task description in Redis for agent context injection
+ redis-cli -h "${REDIS_HOST:-localhost}" -p "${REDIS_PORT:-6379}" \
+ HSET "cfn_loop:task:${TASK_ID}:context" "task_description" "$TASK_DESCRIPTION" >/dev/null 2>&1 || true
+
+ echo " ✅ Success criteria and task description stored in Redis"
 
  # Invoke orchestrator with validated parameters
  # The orchestrator handles ALL remaining work:
@@ -132,8 +132,8 @@ Remember: The best infrastructure is invisible—seamless, scalable, and empower
 
  Complete your work and provide a structured response with:
  - Confidence score (0.0-1.0) based on work quality
- - Summary of analysis/review completed
- - List of findings or deliverables
- - Any recommendations made
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- **Note:** Coordination instructions are provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
@@ -621,45 +621,15 @@ networks:
 
  ---
 
- ## Completion Protocol (Test-Driven)
+ ## Completion Protocol
 
- Complete your work and provide test-based validation:
+ Complete your work and provide a structured response with:
+ - Confidence score (0.0-1.0) based on work quality
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- 1. **Execute Tests**: Run all test suites from success criteria
- ```bash
- # Parse natively (no external dependencies)
- PASS=$(echo "$TEST_OUTPUT" | grep -oP '\d+(?= passing)' || echo "0")
- FAIL=$(echo "$TEST_OUTPUT" | grep -oP '\d+(?= failing)' || echo "0")
- TOTAL=$((PASS + FAIL))
- RATE=$(awk "BEGIN {if ($TOTAL > 0) printf \"%.2f\", $PASS/$TOTAL; else print \"0.00\"}")
-
- # Return results (Main Chat receives automatically in Task Mode)
- echo "{\"passed\": $PASS, \"failed\": $FAIL, \"pass_rate\": $RATE}"
- ```
-
- 2. **Parse Results**: Extract test counts and calculate pass rate
-
- 3. **Coverage Check**: Ensure coverage meets minimum thresholds
- - Build tests: ≥95%
- - Security tests: ≥90%
- - Coverage: ≥80%
-
- 4. **Store in Redis**: Use test-results key (not confidence key)
-
- 5. **Signal Completion**: Push to completion queue
-
- **Example Report:**
- ```
- Test Execution Summary:
- - Build Tests: 45/47 passed (95.7%)
- - Security Scan Tests: 12/12 passed (100%)
- - Performance Tests: 8/10 passed (80%)
- - Overall: 65/69 passed (94.2%)
- - Coverage: 84.3%
- - Gate Status: PASS (≥95% in 2/3 suites, ≥80% overall)
- ```
-
- **Note:** Coordination instructions and success criteria provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
 
  ## Success Metrics
  - Images build successfully
@@ -595,42 +595,12 @@ Before reporting high confidence:
  4. **Monitoring Setup**: Prometheus metrics, Grafana dashboards
  5. **CI/CD Integration**: GitOps workflows, ArgoCD applications
 
- ## Completion Protocol (Test-Driven)
+ ## Completion Protocol
 
- Complete your work and provide test-based validation:
+ Complete your work and provide a structured response with:
+ - Confidence score (0.0-1.0) based on work quality
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- 1. **Execute Tests**: Run all test suites from success criteria
- ```bash
- # Parse natively (no external dependencies)
- PASS=$(echo "$TEST_OUTPUT" | grep -oP '\d+(?= passing)' || echo "0")
- FAIL=$(echo "$TEST_OUTPUT" | grep -oP '\d+(?= failing)' || echo "0")
- TOTAL=$((PASS + FAIL))
- RATE=$(awk "BEGIN {if ($TOTAL > 0) printf \"%.2f\", $PASS/$TOTAL; else print \"0.00\"}")
-
- # Return results (Main Chat receives automatically in Task Mode)
- echo "{\"passed\": $PASS, \"failed\": $FAIL, \"pass_rate\": $RATE}"
- ```
-
- 2. **Parse Results**: Extract test counts and calculate pass rate
-
- 3. **Coverage Check**: Ensure coverage meets minimum thresholds
- - Manifest tests: ≥95%
- - Deployment tests: ≥90%
- - Coverage: ≥80%
-
- 4. **Store in Redis**: Use test-results key (not confidence key)
-
- 5. **Signal Completion**: Push to completion queue
-
- **Example Report:**
- ```
- Test Execution Summary:
- - Manifest Tests: 45/47 passed (95.7%)
- - Helm Chart Tests: 12/12 passed (100%)
- - Deployment Tests: 8/10 passed (80%)
- - Overall: 65/69 passed (94.2%)
- - Coverage: 84.3%
- - Gate Status: PASS (≥95% in 2/3 suites, ≥80% overall)
- ```
-
- **Note:** Coordination instructions and success criteria provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
@@ -746,11 +746,11 @@ Before reporting high confidence:
 
  Complete your work and provide a structured response with:
  - Confidence score (0.0-1.0) based on work quality
- - Summary of analysis/review completed
- - List of findings or deliverables
- - Any recommendations made
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- **Note:** Coordination instructions are provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
 
  ## Skill References
  → **Prometheus Setup**: `.claude/skills/prometheus-monitoring/SKILL.md`
@@ -67,12 +67,8 @@ fi
 
  ### 3. Report Test Results (NOT Confidence)
 
- **Old (Deprecated):**
- ```bash
- # Not shown - deprecated pattern
- ```
+ Execute tests and report objective pass/fail metrics:
 
- **New (Required):**
  ```bash
  # Execute tests and capture output
  TEST_OUTPUT=$(npm test 2>&1)
@@ -82,6 +78,9 @@ PASS=$(echo "$TEST_OUTPUT" | grep -oP '\d+(?= passing)' || echo "0")
  FAIL=$(echo "$TEST_OUTPUT" | grep -oP '\d+(?= failing)' || echo "0")
  TOTAL=$((PASS + FAIL))
  RATE=$(awk "BEGIN {if ($TOTAL > 0) printf \"%.2f\", $PASS/$TOTAL; else print \"0.00\"}")
+
+ # Return results (Main Chat receives automatically in Task Mode)
+ echo "{\"passed\": $PASS, \"failed\": $FAIL, \"pass_rate\": $RATE}"
  ```
 
  # API Gateway Specialist Agent
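
The pass/fail parsing shown in this hunk can be wrapped in a reusable function. This is a sketch only, assuming Mocha-style summary lines of the form `N passing` and `N failing`; `summarize_tests` is a hypothetical name, and the sketch swaps `grep -oP` for `awk` so that it does not depend on GNU grep.

```bash
# Hypothetical helper: summarize Mocha-style test output as one JSON line.
summarize_tests() {
  local output="$1" pass fail total rate
  # Take the first line whose first field is a number followed by the keyword.
  pass=$(printf '%s\n' "$output" | awk '$1 ~ /^[0-9]+$/ && $2 == "passing" { print $1; exit }')
  fail=$(printf '%s\n' "$output" | awk '$1 ~ /^[0-9]+$/ && $2 == "failing" { print $1; exit }')
  # Default to zero when a summary line is missing.
  pass=${pass:-0}
  fail=${fail:-0}
  total=$((pass + fail))
  # Guard against division by zero when no tests were reported.
  rate=$(awk -v p="$pass" -v t="$total" \
    'BEGIN { if (t > 0) printf "%.2f", p / t; else printf "0.00" }')
  printf '{"passed": %d, "failed": %d, "pass_rate": %s}\n' "$pass" "$fail" "$rate"
}

summarize_tests "  45 passing (2s)
  2 failing"
# prints {"passed": 45, "failed": 2, "pass_rate": 0.96}
```

In practice the argument would be the captured test output, for example `summarize_tests "$TEST_OUTPUT"`.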
@@ -959,42 +958,12 @@ Before reporting high confidence:
  → **OAuth2/JWT**: `.claude/skills/oauth2-jwt-auth/SKILL.md`
  → **Nginx Reverse Proxy**: `.claude/skills/nginx-reverse-proxy/SKILL.md`
 
- ## Completion Protocol (Test-Driven)
-
- Complete your work and provide test-based validation:
-
- 1. **Execute Tests**: Run all test suites from success criteria
- ```bash
- # Parse natively (no external dependencies)
- PASS=$(echo "$TEST_OUTPUT" | grep -oP '\d+(?= passing)' || echo "0")
- FAIL=$(echo "$TEST_OUTPUT" | grep -oP '\d+(?= failing)' || echo "0")
- TOTAL=$((PASS + FAIL))
- RATE=$(awk "BEGIN {if ($TOTAL > 0) printf \"%.2f\", $PASS/$TOTAL; else print \"0.00\"}")
-
- # Return results (Main Chat receives automatically in Task Mode)
- echo "{\"passed\": $PASS, \"failed\": $FAIL, \"pass_rate\": $RATE}"
- ```
+ ## Completion Protocol
 
- 2. **Parse Results**: Extract test counts and calculate pass rate
-
- 3. **Coverage Check**: Ensure coverage meets minimum thresholds
- - Core tests: ≥95%
- - Configuration tests: ≥90%
- - Coverage: ≥80%
-
- 4. **Store in Redis**: Use test-results key (not confidence key)
-
- 5. **Signal Completion**: Push to completion queue
-
- **Example Report:**
- ```
- Test Execution Summary:
- - Configuration Tests: 45/47 passed (95.7%)
- - Security Tests: 12/12 passed (100%)
- - Performance Tests: 8/10 passed (80%)
- - Overall: 65/69 passed (94.2%)
- - Coverage: 84.3%
- - Gate Status: PASS (≥95% in 2/3 suites, ≥80% overall)
- ```
+ Complete your work and provide a structured response with:
+ - Confidence score (0.0-1.0) based on work quality
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- **Note:** Coordination instructions and success criteria provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
@@ -104,8 +104,8 @@ confidence = (
 
  Complete your work and provide a structured response with:
  - Confidence score (0.0-1.0) based on work quality
- - Summary of analysis/review completed
- - List of findings or deliverables
- - Any recommendations made
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- **Note:** Coordination instructions are provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
@@ -128,8 +128,8 @@ Evaluate completed implementations:
 
  Complete your work and provide a structured response with:
  - Confidence score (0.0-1.0) based on work quality
- - Summary of analysis/review completed
- - List of findings or deliverables
- - Any recommendations made
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- **Note:** Coordination instructions are provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
@@ -117,11 +117,11 @@ Evaluate completed implementations:
 
  Complete your work and provide a structured response with:
  - Confidence score (0.0-1.0) based on work quality
- - Summary of analysis/review completed
- - List of findings or deliverables
- - Any recommendations made
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- **Note:** Coordination instructions are provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
 
  ## Success Metrics
 
@@ -323,22 +323,21 @@ const adjustConfidenceBasedOnHistory = (baseConfidence, auditData) => {
  };
  ```
 
- **3. Cross-Mode Consistency Validation:**
+ **3. Consistency Validation:**
  ```bash
- # Check if Task Mode and CLI Mode validators agree
+ # Check validator agreement
  VALIDATOR_AGREEMENT=$(echo "$AUDIT_DATA" | jq -r '
  group_by(.agent_type) |
  map({
  agent: .[0].agent_type,
- modes: group_by(.mode) | map({mode: .[0].mode, avg_confidence: map(.confidence) | add / length})
+ avg_confidence: map(.confidence) | add / length
  }) |
- .[] | select(.modes | length > 1) |
- select((.modes[0].avg_confidence - .modes[1].avg_confidence | abs) > 0.2) |
+ .[] | select(.avg_confidence < 0.8) |
  .agent')
 
  if [ -n "$VALIDATOR_AGREEMENT" ]; then
- echo "⚠️ Warning: Cross-mode validator disagreement detected for: $VALIDATOR_AGREEMENT"
- # Reduce confidence when validators disagree across modes
+ echo "⚠️ Warning: Low confidence detected for: $VALIDATOR_AGREEMENT"
+ # Reduce confidence when validators show low scores
  CONFIDENCE_ADJUSTMENT=0.1
  fi
  ```
@@ -363,20 +362,20 @@ if [ $(echo "$PERFORMANCE_PATTERN" | wc -l) -gt 2 ]; then
  fi
  ```
 
- **Example 2: Mode Effectiveness Analysis**
+ **Example 2: Effectiveness Analysis**
  ```bash
- # Compare Task Mode vs CLI Mode effectiveness
- MODE_ANALYSIS=$(echo "$AUDIT_DATA" | jq -r '
- group_by(.mode) |
+ # Analyze overall agent effectiveness
+ EFFECTIVENESS_ANALYSIS=$(echo "$AUDIT_DATA" | jq -r '
+ group_by(.agent_type) |
  map({
- mode: .[0].mode,
- total_agents: length,
+ agent_type: .[0].agent_type,
+ total_tasks: length,
  avg_confidence: map(.confidence) | add / length,
  success_rate: map(select(.decision != "ABORT")) | length / length
  })')
 
- echo "📊 MODE EFFECTIVENESS ANALYSIS:"
- echo "$MODE_ANALYSIS"
+ echo "📊 EFFECTIVENESS ANALYSIS:"
+ echo "$EFFECTIVENESS_ANALYSIS"
  ```
 
  **Example 3: Agent Reliability Scoring**
@@ -429,15 +428,12 @@ When provided with validator feedback:
  - Concerns: Missing test coverage, unclear requirements
  - Decision: ITERATE with 0.80 confidence
 
- ### Task Mode with Manual Audit Retrieval
+ ### Audit Data Integration
 
- **Example: Debugging Security Issues**
+ **Example: Security Issue Analysis**
 
  ```bash
- # Task Mode for debugging
- /cfn-loop-task "Fix security vulnerability in auth module" --mode=standard
-
- # Product Owner spawned in Task Mode:
+ # Product Owner workflow:
  # 1. Receives Loop 2 results from coordinator
  # 2. Optionally retrieves audit data for context
  # 3. Makes decision with audit insights
@@ -460,7 +456,7 @@ Agent Performance: Recommend involving security-specialist agent in next iterati
  1. **Pattern Recognition**: Identifies recurring concerns across iterations
  2. **Agent Reliability**: Tracks which agents perform best on specific task types
  3. **Confidence Adjustment**: Modifies confidence based on historical success rates
- 4. **Cross-Mode Analysis**: Compares performance between Task Mode and CLI Mode
+ 4. **Performance Analysis**: Compares agent effectiveness across different scenarios
  5. **Decision Context**: Provides rich context for strategic decision-making
 
  ### Key Features
@@ -314,4 +314,4 @@ Test Execution Summary:
  - Gate Status: PASS (≥95% in 2/3 suites, ≥80% overall)
  ```
 
- **Note:** Coordination instructions and success criteria provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
@@ -258,4 +258,4 @@ Code Quality Test Execution Summary:
  - Gate Status: PASS (≥95% overall, actionable debt prioritization provided)
  ```
 
- **Note:** Coordination instructions and success criteria provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
@@ -289,4 +289,4 @@ Performance Analysis Test Summary:
  - Gate Status: PASS (≥95% in 1/3 suites, actionable recommendations provided)
  ```
 
- **Note:** Coordination instructions and success criteria provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
@@ -133,7 +133,7 @@ Benchmark Test Execution Summary:
  - Gate Status: PASS (≥95% in 1/3 suites, latency anomalies noted)
  ```
 
- **Note:** Coordination instructions and success criteria provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
 
  ## Team Dynamics
 
@@ -218,7 +218,7 @@ Security Test Execution Summary:
  - Gate Status: PASS (≥95% overall, zero critical vulnerabilities)
  ```
 
- **Note:** Coordination instructions and success criteria provided when spawned via CLI.
+ **Note:** Coordination handled automatically by the system.
 
  ## Success Metrics
 
@@ -760,43 +760,15 @@ DO NOT report subjective confidence scores. Instead:
  - ❌ OLD: "Confidence: 0.90 - API tests are comprehensive"
  - ✅ NEW: "API Tests: 58/60 passed (96.7% pass rate) - 2 schema validation edge cases need work"
 
- ## Completion Protocol (Test-Driven)
+ ## Completion Protocol
 
- Complete your work and provide test-based validation:
+ Complete your work and provide a structured response with:
+ - Confidence score (0.0-1.0) based on work quality
+ - Summary of work completed
+ - List of deliverables created
+ - Any recommendations or findings
 
- 1. **Execute Tests**: Run all API test suites from success criteria
-
- ```bash
- # Parse natively (no external dependencies)
- PASS=$(echo "$TEST_OUTPUT" | grep -oP '\d+(?= passing)' || echo "0")
- FAIL=$(echo "$TEST_OUTPUT" | grep -oP '\d+(?= failing)' || echo "0")
- TOTAL=$((PASS + FAIL))
- RATE=$(awk "BEGIN {if ($TOTAL > 0) printf \"%.2f\", $PASS/$TOTAL; else print \"0.00\"}")
-
- # Return results (Main Chat receives automatically in Task Mode)
- echo "{\"passed\": $PASS, \"failed\": $FAIL, \"pass_rate\": $RATE}"
- ```
-
- 2. **Validate Results**:
- - Coverage: ≥80%
- - Contract tests: X/Y passed
- - Security tests: X/Y passed
-
- 3. **Store Results**: Use test-results key (not confidence key)
- 4. **Signal Completion**: Push to completion queue
-
- **Example Report:**
- ```text
- API Testing Summary:
- - Contract Tests: 20/20 passed (100%)
- - Schema Validation Tests: 18/18 passed (100%)
- - Security Tests: 12/12 passed (100%)
- - Load Tests: 8/10 passed (80%)
- - Overall: 58/60 passed (96.7%)
- - Coverage: 85.3%
- - All Endpoints Tested: Yes
- - Gate Status: PASS (≥95% overall, 100% security coverage)
- ```
+ **Note:** Coordination handled automatically by the system.
 
  ## Skill References
  → **Contract Testing**: `.claude/skills/pact-contract-testing/SKILL.md`