@cleocode/skills 2.0.0
- package/dispatch-config.json +404 -0
- package/index.d.ts +178 -0
- package/index.js +405 -0
- package/package.json +14 -0
- package/profiles/core.json +7 -0
- package/profiles/full.json +10 -0
- package/profiles/minimal.json +7 -0
- package/profiles/recommended.json +7 -0
- package/provider-skills-map.json +97 -0
- package/skills/_shared/cleo-style-guide.md +84 -0
- package/skills/_shared/manifest-operations.md +810 -0
- package/skills/_shared/placeholders.json +433 -0
- package/skills/_shared/skill-chaining-patterns.md +237 -0
- package/skills/_shared/subagent-protocol-base.md +223 -0
- package/skills/_shared/task-system-integration.md +232 -0
- package/skills/_shared/testing-framework-config.md +110 -0
- package/skills/ct-cleo/SKILL.md +490 -0
- package/skills/ct-cleo/references/anti-patterns.md +19 -0
- package/skills/ct-cleo/references/loom-lifecycle.md +136 -0
- package/skills/ct-cleo/references/orchestrator-constraints.md +55 -0
- package/skills/ct-cleo/references/session-protocol.md +162 -0
- package/skills/ct-codebase-mapper/SKILL.md +82 -0
- package/skills/ct-contribution/SKILL.md +521 -0
- package/skills/ct-contribution/templates/contribution-init.json +21 -0
- package/skills/ct-dev-workflow/SKILL.md +423 -0
- package/skills/ct-docs-lookup/SKILL.md +66 -0
- package/skills/ct-docs-review/SKILL.md +175 -0
- package/skills/ct-docs-write/SKILL.md +108 -0
- package/skills/ct-documentor/SKILL.md +231 -0
- package/skills/ct-epic-architect/SKILL.md +305 -0
- package/skills/ct-epic-architect/references/bug-epic-example.md +172 -0
- package/skills/ct-epic-architect/references/commands.md +201 -0
- package/skills/ct-epic-architect/references/feature-epic-example.md +210 -0
- package/skills/ct-epic-architect/references/migration-epic-example.md +244 -0
- package/skills/ct-epic-architect/references/output-format.md +92 -0
- package/skills/ct-epic-architect/references/patterns.md +284 -0
- package/skills/ct-epic-architect/references/refactor-epic-example.md +412 -0
- package/skills/ct-epic-architect/references/research-epic-example.md +226 -0
- package/skills/ct-epic-architect/references/shell-escaping.md +86 -0
- package/skills/ct-epic-architect/references/skill-aware-execution.md +195 -0
- package/skills/ct-grade/SKILL.md +230 -0
- package/skills/ct-grade/agents/analysis-reporter.md +203 -0
- package/skills/ct-grade/agents/blind-comparator.md +157 -0
- package/skills/ct-grade/agents/scenario-runner.md +134 -0
- package/skills/ct-grade/eval-viewer/__pycache__/generate_grade_review.cpython-314.pyc +0 -0
- package/skills/ct-grade/eval-viewer/generate_grade_review.py +1138 -0
- package/skills/ct-grade/eval-viewer/generate_grade_viewer.py +544 -0
- package/skills/ct-grade/eval-viewer/generate_review.py +283 -0
- package/skills/ct-grade/eval-viewer/grade-review.html +1574 -0
- package/skills/ct-grade/eval-viewer/viewer.html +219 -0
- package/skills/ct-grade/evals/evals.json +94 -0
- package/skills/ct-grade/references/ab-test-methodology.md +150 -0
- package/skills/ct-grade/references/domains.md +137 -0
- package/skills/ct-grade/references/grade-spec.md +236 -0
- package/skills/ct-grade/references/scenario-playbook.md +234 -0
- package/skills/ct-grade/references/token-tracking.md +120 -0
- package/skills/ct-grade/scripts/__pycache__/audit_analyzer.cpython-314.pyc +0 -0
- package/skills/ct-grade/scripts/__pycache__/run_ab_test.cpython-314.pyc +0 -0
- package/skills/ct-grade/scripts/__pycache__/run_all.cpython-314.pyc +0 -0
- package/skills/ct-grade/scripts/__pycache__/token_tracker.cpython-314.pyc +0 -0
- package/skills/ct-grade/scripts/audit_analyzer.py +279 -0
- package/skills/ct-grade/scripts/generate_report.py +283 -0
- package/skills/ct-grade/scripts/run_ab_test.py +504 -0
- package/skills/ct-grade/scripts/run_all.py +287 -0
- package/skills/ct-grade/scripts/setup_run.py +183 -0
- package/skills/ct-grade/scripts/token_tracker.py +630 -0
- package/skills/ct-grade-v2-1/SKILL.md +237 -0
- package/skills/ct-grade-v2-1/agents/analysis-reporter.md +203 -0
- package/skills/ct-grade-v2-1/agents/blind-comparator.md +157 -0
- package/skills/ct-grade-v2-1/agents/scenario-runner.md +179 -0
- package/skills/ct-grade-v2-1/evals/evals.json +74 -0
- package/skills/ct-grade-v2-1/grade-viewer/__pycache__/build_op_stats.cpython-314.pyc +0 -0
- package/skills/ct-grade-v2-1/grade-viewer/__pycache__/generate_grade_review.cpython-314.pyc +0 -0
- package/skills/ct-grade-v2-1/grade-viewer/build_op_stats.py +174 -0
- package/skills/ct-grade-v2-1/grade-viewer/eval-analysis.json +41 -0
- package/skills/ct-grade-v2-1/grade-viewer/eval-report.md +34 -0
- package/skills/ct-grade-v2-1/grade-viewer/generate_grade_review.py +1023 -0
- package/skills/ct-grade-v2-1/grade-viewer/generate_grade_viewer.py +548 -0
- package/skills/ct-grade-v2-1/grade-viewer/grade-review-eval.html +613 -0
- package/skills/ct-grade-v2-1/grade-viewer/grade-review.html +1532 -0
- package/skills/ct-grade-v2-1/grade-viewer/viewer.html +620 -0
- package/skills/ct-grade-v2-1/manifest-entry.json +31 -0
- package/skills/ct-grade-v2-1/references/ab-testing.md +233 -0
- package/skills/ct-grade-v2-1/references/domains-ssot.md +156 -0
- package/skills/ct-grade-v2-1/references/grade-spec-v2.md +167 -0
- package/skills/ct-grade-v2-1/references/playbook-v2.md +393 -0
- package/skills/ct-grade-v2-1/references/token-tracking.md +202 -0
- package/skills/ct-grade-v2-1/scripts/generate_report.py +419 -0
- package/skills/ct-grade-v2-1/scripts/run_ab_test.py +493 -0
- package/skills/ct-grade-v2-1/scripts/run_scenario.py +396 -0
- package/skills/ct-grade-v2-1/scripts/setup_run.py +207 -0
- package/skills/ct-grade-v2-1/scripts/token_tracker.py +175 -0
- package/skills/ct-memory/SKILL.md +84 -0
- package/skills/ct-orchestrator/INSTALL.md +61 -0
- package/skills/ct-orchestrator/README.md +69 -0
- package/skills/ct-orchestrator/SKILL.md +380 -0
- package/skills/ct-orchestrator/manifest-entry.json +19 -0
- package/skills/ct-orchestrator/orchestrator-prompt.txt +17 -0
- package/skills/ct-orchestrator/references/SUBAGENT-PROTOCOL-BLOCK.md +66 -0
- package/skills/ct-orchestrator/references/autonomous-operation.md +167 -0
- package/skills/ct-orchestrator/references/lifecycle-gates.md +98 -0
- package/skills/ct-orchestrator/references/orchestrator-compliance.md +271 -0
- package/skills/ct-orchestrator/references/orchestrator-handoffs.md +85 -0
- package/skills/ct-orchestrator/references/orchestrator-patterns.md +164 -0
- package/skills/ct-orchestrator/references/orchestrator-recovery.md +113 -0
- package/skills/ct-orchestrator/references/orchestrator-spawning.md +271 -0
- package/skills/ct-orchestrator/references/orchestrator-tokens.md +180 -0
- package/skills/ct-research-agent/SKILL.md +226 -0
- package/skills/ct-skill-creator/.cleo/.context-state.json +13 -0
- package/skills/ct-skill-creator/.cleo/logs/cleo.2026-03-07.1.log +24 -0
- package/skills/ct-skill-creator/.cleo/tasks.db +0 -0
- package/skills/ct-skill-creator/SKILL.md +356 -0
- package/skills/ct-skill-creator/agents/analyzer.md +276 -0
- package/skills/ct-skill-creator/agents/comparator.md +204 -0
- package/skills/ct-skill-creator/agents/grader.md +225 -0
- package/skills/ct-skill-creator/assets/eval_review.html +146 -0
- package/skills/ct-skill-creator/eval-viewer/__pycache__/generate_review.cpython-314.pyc +0 -0
- package/skills/ct-skill-creator/eval-viewer/generate_review.py +471 -0
- package/skills/ct-skill-creator/eval-viewer/viewer.html +1325 -0
- package/skills/ct-skill-creator/manifest-entry.json +17 -0
- package/skills/ct-skill-creator/references/dynamic-context.md +228 -0
- package/skills/ct-skill-creator/references/frontmatter.md +83 -0
- package/skills/ct-skill-creator/references/invocation-control.md +165 -0
- package/skills/ct-skill-creator/references/output-patterns.md +86 -0
- package/skills/ct-skill-creator/references/provider-deployment.md +175 -0
- package/skills/ct-skill-creator/references/schemas.md +430 -0
- package/skills/ct-skill-creator/references/workflows.md +28 -0
- package/skills/ct-skill-creator/scripts/__init__.py +1 -0
- package/skills/ct-skill-creator/scripts/__pycache__/__init__.cpython-314.pyc +0 -0
- package/skills/ct-skill-creator/scripts/__pycache__/aggregate_benchmark.cpython-314.pyc +0 -0
- package/skills/ct-skill-creator/scripts/__pycache__/generate_report.cpython-314.pyc +0 -0
- package/skills/ct-skill-creator/scripts/__pycache__/improve_description.cpython-314.pyc +0 -0
- package/skills/ct-skill-creator/scripts/__pycache__/init_skill.cpython-314.pyc +0 -0
- package/skills/ct-skill-creator/scripts/__pycache__/quick_validate.cpython-314.pyc +0 -0
- package/skills/ct-skill-creator/scripts/__pycache__/run_eval.cpython-314.pyc +0 -0
- package/skills/ct-skill-creator/scripts/__pycache__/run_loop.cpython-314.pyc +0 -0
- package/skills/ct-skill-creator/scripts/__pycache__/utils.cpython-314.pyc +0 -0
- package/skills/ct-skill-creator/scripts/aggregate_benchmark.py +401 -0
- package/skills/ct-skill-creator/scripts/generate_report.py +326 -0
- package/skills/ct-skill-creator/scripts/improve_description.py +247 -0
- package/skills/ct-skill-creator/scripts/init_skill.py +306 -0
- package/skills/ct-skill-creator/scripts/package_skill.py +110 -0
- package/skills/ct-skill-creator/scripts/quick_validate.py +97 -0
- package/skills/ct-skill-creator/scripts/run_eval.py +310 -0
- package/skills/ct-skill-creator/scripts/run_loop.py +328 -0
- package/skills/ct-skill-creator/scripts/utils.py +47 -0
- package/skills/ct-skill-validator/SKILL.md +178 -0
- package/skills/ct-skill-validator/agents/ecosystem-checker.md +151 -0
- package/skills/ct-skill-validator/assets/valid-skill-example.md +13 -0
- package/skills/ct-skill-validator/evals/eval_set.json +14 -0
- package/skills/ct-skill-validator/evals/evals.json +52 -0
- package/skills/ct-skill-validator/manifest-entry.json +20 -0
- package/skills/ct-skill-validator/references/cleo-ecosystem-rules.md +163 -0
- package/skills/ct-skill-validator/references/validation-rules.md +168 -0
- package/skills/ct-skill-validator/scripts/__init__.py +0 -0
- package/skills/ct-skill-validator/scripts/__pycache__/audit_body.cpython-314.pyc +0 -0
- package/skills/ct-skill-validator/scripts/__pycache__/check_ecosystem.cpython-314.pyc +0 -0
- package/skills/ct-skill-validator/scripts/__pycache__/generate_validation_report.cpython-314.pyc +0 -0
- package/skills/ct-skill-validator/scripts/__pycache__/validate.cpython-314.pyc +0 -0
- package/skills/ct-skill-validator/scripts/audit_body.py +242 -0
- package/skills/ct-skill-validator/scripts/check_ecosystem.py +169 -0
- package/skills/ct-skill-validator/scripts/check_manifest.py +172 -0
- package/skills/ct-skill-validator/scripts/generate_validation_report.py +442 -0
- package/skills/ct-skill-validator/scripts/validate.py +422 -0
- package/skills/ct-spec-writer/SKILL.md +189 -0
- package/skills/ct-stickynote/README.md +14 -0
- package/skills/ct-stickynote/SKILL.md +46 -0
- package/skills/ct-task-executor/SKILL.md +296 -0
- package/skills/ct-validator/SKILL.md +216 -0
- package/skills/manifest.json +469 -0
- package/skills.json +281 -0
|
@@ -0,0 +1,86 @@
|
|
|
1
|
+
# Shell Escaping Reference
|
|
2
|
+
|
|
3
|
+
When passing text to CLEO commands via `--notes`, `--description`, or other text fields, certain characters inside double-quoted arguments must be escaped so the shell passes them through literally instead of interpreting them.
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## Quick Reference
|
|
8
|
+
|
|
9
|
+
| Character | Escape As | Example |
|
|
10
|
+
|-----------|-----------|---------|
|
|
11
|
+
| `$` | `\$` | `\$100`, `\$HOME` |
|
|
12
|
+
| Backtick | `` \` `` | Code blocks |
|
|
13
|
+
| `"` | `\"` | Nested quotes |
|
|
14
|
+
| Exclamation | `\!` | History expansion (bash) |
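The table above can be applied mechanically. The following is a minimal Python sketch, assuming the text will be placed inside a double-quoted bash argument. `escape_for_double_quotes` is a hypothetical helper and not part of CLEO. It deliberately leaves `!` untouched: history expansion only applies in interactive shells, and bash keeps the backslash in the output when `\!` appears inside double quotes.

```python
# Characters that keep a special meaning inside double quotes in bash:
# backslash, dollar sign, backtick, and the double quote itself.
SPECIAL_IN_DOUBLE_QUOTES = '\\$`"'


def escape_for_double_quotes(text: str) -> str:
    """Backslash-escape characters bash interprets inside "..." (hypothetical helper)."""
    escaped = []
    for ch in text:
        if ch in SPECIAL_IN_DOUBLE_QUOTES:
            escaped.append("\\")
        escaped.append(ch)
    return "".join(escaped)


if __name__ == "__main__":
    notes = 'Cost estimate: $500, run `npm install`, user said "hello"'
    # Print the command line a person would paste into bash.
    print(f'cleo add "Task" --notes "{escape_for_double_quotes(notes)}"')
```

Escaping the backslash itself is an addition to the table: without it, a literal `\$` in the input would be read by the shell as an escaped dollar sign.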
|
|
15
|
+
|
|
16
|
+
---
|
|
17
|
+
|
|
18
|
+
## Common Patterns
|
|
19
|
+
|
|
20
|
+
### Dollar Signs in Notes
|
|
21
|
+
|
|
22
|
+
```bash
|
|
23
|
+
# CORRECT - escaped dollar sign
|
|
24
|
+
cleo add "Task" --notes "Cost estimate: \$500 per user"
|
|
25
|
+
cleo add "Task" --description "Process \$DATA variable"
|
|
26
|
+
|
|
27
|
+
# WRONG - the shell expands $5 (inside $500) and $DATA before CLEO receives the text
|
|
28
|
+
cleo add "Task" --notes "Cost estimate: $500 per user"
|
|
29
|
+
cleo add "Task" --description "Process $DATA variable"
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
### Quotes in Text
|
|
33
|
+
|
|
34
|
+
```bash
|
|
35
|
+
# CORRECT - escaped inner quotes
|
|
36
|
+
cleo add "Task" --notes "User said \"hello world\""
|
|
37
|
+
|
|
38
|
+
# Alternative - use single quotes for outer
|
|
39
|
+
cleo add 'Task with "quotes" inside'
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
### Backticks for Code
|
|
43
|
+
|
|
44
|
+
```bash
|
|
45
|
+
# CORRECT - escaped backticks
|
|
46
|
+
cleo add "Task" --notes "Run \`npm install\` first"
|
|
47
|
+
|
|
48
|
+
# Alternative - use single quotes so the backticks stay literal
|
|
49
|
+
cleo add "Task" --notes 'Run `npm install` first'
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
---
|
|
53
|
+
|
|
54
|
+
## Validation Errors
|
|
55
|
+
|
|
56
|
+
If you see validation exit code 6 (`E_VALIDATION_*`), check for unescaped special characters in your text fields. The shell may have interpolated variables before CLEO received the input.
|
|
57
|
+
|
|
58
|
+
### Debugging Tips
|
|
59
|
+
|
|
60
|
+
1. Echo your command first to see what the shell produces:
|
|
61
|
+
```bash
|
|
62
|
+
echo "cleo add \"Task\" --notes \"Price: $500\""
|
|
63
|
+
# Shows: cleo add "Task" --notes "Price: 00"
|
|
64
|
+
# The shell expanded $5 (an unset positional parameter) to nothing, leaving "00"
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
2. Use single quotes when escaping is complex:
|
|
68
|
+
```bash
|
|
69
|
+
cleo add 'Task' --notes 'Price: $500 for "premium" tier'
|
|
70
|
+
```
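When commands are generated by a script rather than typed, the single-quote approach from tip 2 can be produced automatically. This sketch uses Python's standard `shlex.quote`; the `build_cleo_add` helper is hypothetical, and the `cleo add ... --notes` form is taken from the examples above.

```python
import shlex


def build_cleo_add(title: str, notes: str) -> str:
    """Return a shell-safe command line (hypothetical helper).

    shlex.quote wraps an argument in single quotes whenever it contains
    characters the shell would otherwise interpret.
    """
    return f"cleo add {shlex.quote(title)} --notes {shlex.quote(notes)}"


if __name__ == "__main__":
    print(build_cleo_add("Task", 'Price: $500 for "premium" tier'))
    # prints: cleo add Task --notes 'Price: $500 for "premium" tier'
```

`shlex.quote` targets POSIX shells; it also handles text that itself contains single quotes, which the manual single-quote pattern does not.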
|
|
71
|
+
|
|
72
|
+
---
|
|
73
|
+
|
|
74
|
+
## HEREDOC for Complex Text
|
|
75
|
+
|
|
76
|
+
For multi-line or heavily-quoted content, use a HEREDOC:
|
|
77
|
+
|
|
78
|
+
```bash
|
|
79
|
+
cleo update T001 --notes "$(cat <<'EOF'
|
|
80
|
+
Complex notes with $variables and "quotes"
|
|
81
|
+
that don't need escaping inside HEREDOC.
|
|
82
|
+
EOF
|
|
83
|
+
)"
|
|
84
|
+
```
|
|
85
|
+
|
|
86
|
+
Note: Use `<<'EOF'` (quoted) to prevent variable expansion, or `<<EOF` (unquoted) if you want variables expanded.
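A script that calls CLEO can avoid shell quoting altogether by passing an argument list, so no shell ever parses the text. The sketch below assumes the `cleo update <id> --notes <text>` form shown above; `build_cleo_update_argv` and `echo_last_arg` are hypothetical helpers, and the child Python process merely stands in for `cleo` to show that the text arrives unchanged.

```python
import subprocess
import sys


def build_cleo_update_argv(task_id: str, notes: str) -> list[str]:
    """Argument vector for `cleo update`; each element reaches the program verbatim."""
    return ["cleo", "update", task_id, "--notes", notes]


def echo_last_arg(argv: list[str]) -> str:
    """Stand-in for running cleo: a child process writes back its last argument."""
    child = [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[-1])", *argv[1:]]
    result = subprocess.run(child, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    notes = 'Complex notes with $variables and "quotes"\nthat need no escaping.'
    argv = build_cleo_update_argv("T001", notes)
    print(echo_last_arg(argv) == notes)
    # With cleo installed, the real call would be: subprocess.run(argv, check=True)
```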
|
|
@@ -0,0 +1,195 @@
|
|
|
1
|
+
# Skill-Aware Epic Execution Patterns
|
|
2
|
+
|
|
3
|
+
This reference documents how to integrate the ct-epic-architect skill with orchestrators, subagents, and CLEO research commands.
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## Orchestrator Workflow
|
|
8
|
+
|
|
9
|
+
### When to Invoke ct-epic-architect
|
|
10
|
+
|
|
11
|
+
Use the ct-epic-architect skill when the user's request involves:
|
|
12
|
+
|
|
13
|
+
| Trigger | Example Request | Action |
|
|
14
|
+
|---------|-----------------|--------|
|
|
15
|
+
| Epic creation | "Create an epic for user authentication" | Invoke `/ct-epic-architect` |
|
|
16
|
+
| Task decomposition | "Break down this project into tasks" | Invoke `/ct-epic-architect` |
|
|
17
|
+
| Dependency planning | "Plan the dependency order for this feature" | Invoke `/ct-epic-architect` |
|
|
18
|
+
| Wave analysis | "What can run in parallel?" | Invoke `/ct-epic-architect` |
|
|
19
|
+
| Sprint planning | "Plan the sprint backlog" | Invoke `/ct-epic-architect` |
|
|
20
|
+
|
|
21
|
+
### How to Invoke
|
|
22
|
+
|
|
23
|
+
```
|
|
24
|
+
# Via Skill tool
|
|
25
|
+
Skill(skill="ct-epic-architect")
|
|
26
|
+
|
|
27
|
+
# Via slash command
|
|
28
|
+
/ct-epic-architect
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
**What Loads:**
|
|
32
|
+
1. SKILL.md body (core instructions; the file is 305 lines in this release)
|
|
33
|
+
2. Access to references/ files (loaded on-demand when Claude reads them)
|
|
34
|
+
|
|
35
|
+
### Decision Tree
|
|
36
|
+
|
|
37
|
+
```
|
|
38
|
+
User Request
|
|
39
|
+
│
|
|
40
|
+
▼
|
|
41
|
+
┌───────────────────────────────┐
|
|
42
|
+
│ Is this about epic/task │
|
|
43
|
+
│ planning and decomposition? │
|
|
44
|
+
└───────────────────────────────┘
|
|
45
|
+
│
|
|
46
|
+
├── YES ─► Invoke /ct-epic-architect
|
|
47
|
+
│
|
|
48
|
+
└── NO ─► Handle directly or use other skill
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
---
|
|
52
|
+
|
|
53
|
+
## Subagent Skill Specification
|
|
54
|
+
|
|
55
|
+
### Why Subagents Don't Inherit Skills
|
|
56
|
+
|
|
57
|
+
Per the skill system architecture:
|
|
58
|
+
- **Skills are session-scoped** - They load into the CURRENT context
|
|
59
|
+
- **Subagents are NEW contexts** - They don't automatically get parent skills
|
|
60
|
+
- **This is intentional** - Prevents context bloat and skill pollution
|
|
61
|
+
|
|
62
|
+
### When Subagents SHOULD Have ct-epic-architect
|
|
63
|
+
|
|
64
|
+
| Scenario | Should Have Skill? | Rationale |
|
|
65
|
+
|----------|-------------------|-----------|
|
|
66
|
+
| Coder implementing a task | No | Coders execute, not plan |
|
|
67
|
+
| Researcher gathering info | No | Researchers investigate, not plan |
|
|
68
|
+
| Nested orchestrator | Yes | Needs planning capability |
|
|
69
|
+
| Epic architect subagent | Yes | Primary function requires it |
|
|
70
|
+
|
|
71
|
+
### Declaring Skills for Subagents
|
|
72
|
+
|
|
73
|
+
When spawning a subagent that needs ct-epic-architect:
|
|
74
|
+
|
|
75
|
+
```markdown
|
|
76
|
+
# In subagent prompt frontmatter (if using template)
|
|
77
|
+
---
|
|
78
|
+
name: nested-orchestrator
|
|
79
|
+
skills:
|
|
80
|
+
- ct-epic-architect
|
|
81
|
+
- ct-orchestrator
|
|
82
|
+
---
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
Or via Task tool:
|
|
86
|
+
|
|
87
|
+
```
|
|
88
|
+
Task(
|
|
89
|
+
subagent_type="general-purpose",
|
|
90
|
+
prompt="""
|
|
91
|
+
You have access to the ct-epic-architect skill.
|
|
92
|
+
Use /ct-epic-architect when you need to create epics.
|
|
93
|
+
|
|
94
|
+
[Task description]
|
|
95
|
+
"""
|
|
96
|
+
)
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
---
|
|
100
|
+
|
|
101
|
+
## CLEO Research Integration
|
|
102
|
+
|
|
103
|
+
### Epic-Architect Uses Research Commands
|
|
104
|
+
|
|
105
|
+
Before creating epics, ct-epic-architect SHOULD check existing research:
|
|
106
|
+
|
|
107
|
+
```bash
|
|
108
|
+
# Check for related research before planning
|
|
109
|
+
{{TASK_RESEARCH_LIST_CMD}} --status complete --topic {{DOMAIN}}
|
|
110
|
+
{{TASK_RESEARCH_SHOW_CMD}} {{RESEARCH_ID}}
|
|
111
|
+
|
|
112
|
+
# Link epic to research after creation
|
|
113
|
+
{{TASK_LINK_CMD}} {{EPIC_ID}} {{RESEARCH_ID}}
|
|
114
|
+
```
|
|
115
|
+
|
|
116
|
+
### Research Output Protocol for Epic Creation
|
|
117
|
+
|
|
118
|
+
When ct-epic-architect creates an epic, it follows the subagent protocol:
|
|
119
|
+
|
|
120
|
+
1. **Write output file**: `{{OUTPUT_DIR}}/{{DATE}}_epic-{{FEATURE_SLUG}}.md`
|
|
121
|
+
2. **Append manifest entry**: Single line JSON to `{{MANIFEST_PATH}}`
|
|
122
|
+
3. **Return summary only**: "Epic created. See MANIFEST.jsonl for summary."
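The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the skill's actual implementation: `record_epic` is a hypothetical helper, the date is assumed to be ISO formatted, and the manifest entry shape is invented apart from `key_findings`, which is the only field this reference names.

```python
import json
import tempfile
from datetime import date
from pathlib import Path

SUMMARY = "Epic created. See MANIFEST.jsonl for summary."


def record_epic(output_dir: Path, manifest_path: Path, feature_slug: str,
                body: str, key_findings: list[str]) -> str:
    """Write the epic file, append one manifest line, return only the summary."""
    # Step 1: write the output file as <date>_epic-<slug>.md (ISO date assumed).
    output_dir.mkdir(parents=True, exist_ok=True)
    output_file = output_dir / f"{date.today().isoformat()}_epic-{feature_slug}.md"
    output_file.write_text(body, encoding="utf-8")

    # Step 2: append a single-line JSON entry (hypothetical shape).
    entry = {"file": output_file.name, "key_findings": key_findings}
    with manifest_path.open("a", encoding="utf-8") as manifest:
        manifest.write(json.dumps(entry) + "\n")

    # Step 3: return the short summary, never the full epic content.
    return SUMMARY


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        print(record_epic(root / "out", root / "MANIFEST.jsonl", "auth",
                          "# Auth epic\n", ["Use OAuth2"]))
```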
|
|
123
|
+
|
|
124
|
+
### Querying Prior Research
|
|
125
|
+
|
|
126
|
+
Epic-architect can leverage prior research for informed planning:
|
|
127
|
+
|
|
128
|
+
```bash
|
|
129
|
+
# Find research related to the epic domain
|
|
130
|
+
{{TASK_RESEARCH_LIST_CMD}} --topic "authentication"
|
|
131
|
+
|
|
132
|
+
# Get key findings from specific research
|
|
133
|
+
{{TASK_RESEARCH_SHOW_CMD}} research-auth-options-2026-01-15
|
|
134
|
+
|
|
135
|
+
# The key_findings array informs epic structure without loading full content
|
|
136
|
+
```
|
|
137
|
+
|
|
138
|
+
---
|
|
139
|
+
|
|
140
|
+
## Integration with Orchestrator Skill
|
|
141
|
+
|
|
142
|
+
### Orchestrator → Epic-Architect Flow
|
|
143
|
+
|
|
144
|
+
```
|
|
145
|
+
Orchestrator receives "plan authentication feature"
|
|
146
|
+
│
|
|
147
|
+
▼
|
|
148
|
+
Orchestrator spawns ct-epic-architect subagent
|
|
149
|
+
│
|
|
150
|
+
▼
|
|
151
|
+
Epic-architect creates epic and tasks in CLEO
|
|
152
|
+
│
|
|
153
|
+
▼
|
|
154
|
+
Epic-architect writes to manifest
|
|
155
|
+
│
|
|
156
|
+
▼
|
|
157
|
+
Orchestrator reads manifest key_findings
|
|
158
|
+
│
|
|
159
|
+
▼
|
|
160
|
+
Orchestrator proceeds with task execution
|
|
161
|
+
```
|
|
162
|
+
|
|
163
|
+
### Context Protection
|
|
164
|
+
|
|
165
|
+
The orchestrator skill enforces context budget (ORC-005). When spawning ct-epic-architect:
|
|
166
|
+
|
|
167
|
+
1. **Pass minimal context** - Epic-architect reads full task details itself
|
|
168
|
+
2. **Receive minimal response** - Only manifest summary returned
|
|
169
|
+
3. **Query manifest for details** - Don't ask ct-epic-architect for full breakdown
|
|
170
|
+
|
|
171
|
+
```bash
|
|
172
|
+
# Orchestrator queries manifest for epic details
|
|
173
|
+
tail -n 1 {{MANIFEST_PATH}} | jq '.key_findings'
|
|
174
|
+
```
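For orchestrators written in Python, the same lookup can be done without `jq`. This is a minimal sketch assuming the manifest is JSONL with one entry per line and a `key_findings` field, as described above; `latest_key_findings` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path


def latest_key_findings(manifest_path: Path) -> list:
    """Return `key_findings` from the last non-empty line of a JSONL manifest."""
    lines = [line for line in manifest_path.read_text(encoding="utf-8").splitlines()
             if line.strip()]
    if not lines:
        return []
    # Only the newest entry is parsed, which keeps orchestrator context small.
    return json.loads(lines[-1]).get("key_findings", [])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        manifest = Path(tmp) / "MANIFEST.jsonl"
        manifest.write_text('{"key_findings": ["Epic T001 created with 6 tasks"]}\n',
                            encoding="utf-8")
        print(latest_key_findings(manifest))
```

Unlike the `jq` filter, which prints `null` for a missing field, this sketch returns an empty list.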
|
|
175
|
+
|
|
176
|
+
---
|
|
177
|
+
|
|
178
|
+
## Anti-Patterns
|
|
179
|
+
|
|
180
|
+
| Anti-Pattern | Problem | Solution |
|
|
181
|
+
|--------------|---------|----------|
|
|
182
|
+
| Giving all subagents ct-epic-architect skill | Context bloat | Only nested orchestrators and epic-architect subagents need it |
|
|
183
|
+
| Returning full epic details | Bloats orchestrator context | Return "Epic created. See MANIFEST.jsonl for summary." |
|
|
184
|
+
| Skipping research check | Duplicate work | Always query research first |
|
|
185
|
+
| Loading all references | Wasted tokens | Load only needed references |
|
|
186
|
+
| Parallel epic creation | Task conflicts | Create epics sequentially |
|
|
187
|
+
|
|
188
|
+
---
|
|
189
|
+
|
|
190
|
+
## Cross-References
|
|
191
|
+
|
|
192
|
+
- **Orchestrator Skill**: skills/ct-orchestrator/SKILL.md
|
|
193
|
+
- **Subagent Protocol**: skills/_shared/subagent-protocol-base.md
|
|
194
|
+
- **Task System Integration**: skills/_shared/task-system-integration.md
|
|
195
|
+
- **Research Commands**: docs/commands/research.md
|
|
@@ -0,0 +1,230 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: ct-grade
|
|
3
|
+
description: Session grading for agent behavioral analysis. Use when evaluating agent session quality, running grade scenarios, or interpreting grade results. Triggers on grading tasks, session quality checks, or behavioral analysis needs.
|
|
4
|
+
version: 1.0.0
|
|
5
|
+
tier: 2
|
|
6
|
+
core: false
|
|
7
|
+
category: quality
|
|
8
|
+
protocol: null
|
|
9
|
+
dependencies: []
|
|
10
|
+
sharedResources: []
|
|
11
|
+
compatibility:
|
|
12
|
+
- claude-code
|
|
13
|
+
- cursor
|
|
14
|
+
- windsurf
|
|
15
|
+
- gemini-cli
|
|
16
|
+
license: MIT
|
|
17
|
+
---
|
|
18
|
+
|
|
19
|
+
# Session Grading Guide
|
|
20
|
+
|
|
21
|
+
Session grading evaluates agent behavioral patterns against the CLEO protocol. It reads the audit log for a completed session and applies a 5-dimension rubric to produce a score (0-100), letter grade (A-F), and diagnostic flags.
|
|
22
|
+
|
|
23
|
+
## When to Use Grade Mode
|
|
24
|
+
|
|
25
|
+
Use grading when you need to:
|
|
26
|
+
- Evaluate how well an agent followed CLEO protocol during a session
|
|
27
|
+
- Identify behavioral anti-patterns (skipped discovery, missing session.end, etc.)
|
|
28
|
+
- Track improvement over time across multiple sessions
|
|
29
|
+
- Validate that orchestrated subagents followed protocol
|
|
30
|
+
|
|
31
|
+
Grading requires audit data. Sessions must be started with the `--grade` flag to enable audit log capture.
|
|
32
|
+
|
|
33
|
+
## Starting a Grade Session
|
|
34
|
+
|
|
35
|
+
### CLI
|
|
36
|
+
|
|
37
|
+
```bash
|
|
38
|
+
# Start a session with grading enabled
|
|
39
|
+
ct session start --scope epic:T001 --name "Feature work" --grade
|
|
40
|
+
|
|
41
|
+
# The --grade flag enables detailed audit logging
|
|
42
|
+
# All MCP and CLI operations are recorded for later analysis
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
### MCP
|
|
46
|
+
|
|
47
|
+
```
|
|
48
|
+
mutate({ domain: "session", operation: "start",
|
|
49
|
+
params: { scope: "epic:T001", name: "Feature work", grade: true }})
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
## Running Scenarios
|
|
53
|
+
|
|
54
|
+
The grading rubric evaluates 5 behavioral scenarios that map to protocol compliance:
|
|
55
|
+
|
|
56
|
+
### 1. Fresh Discovery
|
|
57
|
+
Tests whether the agent checks existing sessions and tasks before starting work. Evaluates `session.list` and `tasks.find` calls at session start.
|
|
58
|
+
|
|
59
|
+
### 2. Task Hygiene
|
|
60
|
+
Tests whether task creation follows protocol: descriptions provided, parent existence verified before subtask creation, no duplicate tasks.
|
|
61
|
+
|
|
62
|
+
### 3. Error Recovery
|
|
63
|
+
Tests whether the agent handles errors correctly: follows up `E_NOT_FOUND` with recovery lookups (`tasks.find` or `tasks.exists`), avoids duplicate creates after failures.
|
|
64
|
+
|
|
65
|
+
### 4. Full Lifecycle
|
|
66
|
+
Tests session discipline end-to-end: session listed before task ops, session properly ended, MCP-first usage patterns.
|
|
67
|
+
|
|
68
|
+
### 5. Multi-Domain Analysis
|
|
69
|
+
Tests progressive disclosure: use of `admin.help` or skill lookups, preference for `query` (MCP) over CLI for programmatic access.
|
|
70
|
+
|
|
71
|
+
## Evaluating Results
|
|
72
|
+
|
|
73
|
+
### CLI
|
|
74
|
+
|
|
75
|
+
```bash
|
|
76
|
+
# Grade a specific session
|
|
77
|
+
ct grade <sessionId>
|
|
78
|
+
|
|
79
|
+
# List all past grade results
|
|
80
|
+
ct grade --list
|
|
81
|
+
```
|
|
82
|
+
|
|
83
|
+
### MCP
|
|
84
|
+
|
|
85
|
+
```
|
|
86
|
+
# Grade a session
|
|
87
|
+
# Canonical registry surface (preferred)
|
|
88
|
+
query({ domain: "check", operation: "grade",
|
|
89
|
+
params: { sessionId: "abc-123" }})
|
|
90
|
+
|
|
91
|
+
# List past grades
|
|
92
|
+
query({ domain: "check", operation: "grade.list" })
|
|
93
|
+
|
|
94
|
+
# Compatibility aliases still work at runtime
|
|
95
|
+
query({ domain: "admin", operation: "grade",
|
|
96
|
+
params: { sessionId: "abc-123" }})
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
## Understanding the 5 Dimensions
|
|
100
|
+
|
|
101
|
+
Each dimension scores 0-20 points, totaling 0-100.
|
|
102
|
+
|
|
103
|
+
### S1: Session Discipline (20 pts)
|
|
104
|
+
|
|
105
|
+
| Points | Criteria |
|
|
106
|
+
|--------|----------|
|
|
107
|
+
| 10 | `session.list` called before first task operation |
|
|
108
|
+
| 10 | `session.end` called when work is complete |
|
|
109
|
+
|
|
110
|
+
**What it measures**: Does the agent check existing sessions before starting, and properly close sessions when done?
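The S1 rule can be sketched in a few lines. This is an illustration only: it assumes the audit log has been reduced to an ordered list of operation names and that a "task operation" is any name starting with `tasks.`. Neither assumption comes from the grader's source, and `score_session_discipline` is a hypothetical name.

```python
def score_session_discipline(ops: list[str]) -> int:
    """S1 sketch: 10 pts for session.list before the first task op, 10 for session.end."""
    score = 0
    # Index of the first task operation, or the end of the log if there is none.
    first_task = next((i for i, op in enumerate(ops) if op.startswith("tasks.")), len(ops))
    if "session.list" in ops[:first_task]:
        score += 10
    if "session.end" in ops:
        score += 10
    return score


if __name__ == "__main__":
    print(score_session_discipline(["session.list", "tasks.find", "session.end"]))  # 20
    print(score_session_discipline(["tasks.find", "session.list"]))                 # 0
```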
|
|
111
|
+
|
|
112
|
+
### S2: Discovery Efficiency (20 pts)
|
|
113
|
+
|
|
114
|
+
| Points | Criteria |
|
|
115
|
+
|--------|----------|
|
|
116
|
+
| 0-15 | `find:list` ratio >= 80% earns full 15; scales linearly below |
|
|
117
|
+
| 5 | `tasks.show` used for detail retrieval |
|
|
118
|
+
|
|
119
|
+
**What it measures**: Does the agent prefer `tasks.find` (low context cost) over `tasks.list` (high context cost) for discovery?
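A worked version of the S2 arithmetic may help. The sketch assumes the `find:list` ratio means `find / (find + list)` and that a session with no discovery calls earns 0 ratio points; both are assumptions, since the table does not define them. `score_discovery_efficiency` is a hypothetical name.

```python
def score_discovery_efficiency(find_calls: int, list_calls: int, show_used: bool) -> float:
    """S2 sketch: up to 15 pts from the find:list ratio, plus 5 for tasks.show."""
    total = find_calls + list_calls
    ratio = find_calls / total if total else 0.0  # assumption: no discovery calls earns 0
    # Full 15 at a ratio of 80% or more, scaled linearly below that threshold.
    ratio_points = 15.0 if ratio >= 0.8 else 15.0 * ratio / 0.8
    return ratio_points + (5.0 if show_used else 0.0)


if __name__ == "__main__":
    print(score_discovery_efficiency(9, 1, True))   # ratio 0.9: full marks
    print(score_discovery_efficiency(2, 2, False))  # ratio 0.5: partial ratio points
```

Under these assumptions, a session with 2 `tasks.find` and 2 `tasks.list` calls has a ratio of 0.5 and earns 15 × 0.5 / 0.8 = 9.375 ratio points.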
|
|
120
|
+
|
|
121
|
+
### S3: Task Hygiene (20 pts)
|
|
122
|
+
|
|
123
|
+
Starts at 20 and deducts for violations:
|
|
124
|
+
|
|
125
|
+
| Deduction | Violation |
|
|
126
|
+
|-----------|-----------|
|
|
127
|
+
| -5 each | `tasks.add` without a description |
|
|
128
|
+
| -3 | Subtasks created without `tasks.exists` parent check |
|
|
129
|
+
|
|
130
|
+
**What it measures**: Does the agent create well-formed tasks with descriptions and verify parents before creating subtasks?
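The S3 deductions can be expressed as below. This is a sketch, not the grader's code: it assumes the parent-check deduction is applied once per session (the table lists it without "each") and that the dimension is floored at 0. `score_task_hygiene` is a hypothetical name.

```python
def score_task_hygiene(adds_without_description: int,
                       subtask_without_parent_check: bool) -> int:
    """S3 sketch: start at 20, deduct per violation, never drop below 0."""
    score = 20
    score -= 5 * adds_without_description  # -5 for each tasks.add lacking a description
    if subtask_without_parent_check:
        score -= 3  # assumption: applied once, not per subtask
    return max(score, 0)  # assumption: floored at 0


if __name__ == "__main__":
    print(score_task_hygiene(2, True))  # 20 - 10 - 3 = 7
```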
|
|
131
|
+
|
|
132
|
+
### S4: Error Protocol (20 pts)
|
|
133
|
+
|
|
134
|
+
Starts at 20 and deducts for violations:
|
|
135
|
+
|
|
136
|
+
| Deduction | Violation |
|
|
137
|
+
|-----------|-----------|
|
|
138
|
+
| -5 each | `E_NOT_FOUND` error not followed by recovery lookup within 5 ops |
|
|
139
|
+
| -5 | Duplicate task creates detected (same title in session) |
|
|
140
|
+
|
|
141
|
+
**What it measures**: Does the agent recover gracefully from errors and avoid creating duplicate tasks?
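The "within 5 ops" window is the least obvious part of S4, so a sketch may be useful. The entry shape (`op`, `error`, `title` keys) is hypothetical, the duplicate deduction is assumed to apply once, and the floor at 0 is an assumption. `score_error_protocol` is not a CLEO function.

```python
RECOVERY_OPS = {"tasks.find", "tasks.exists"}
WINDOW = 5  # recovery must occur within this many subsequent operations


def score_error_protocol(entries: list[dict]) -> int:
    """S4 sketch: deduct for unrecovered E_NOT_FOUND errors and duplicate creates."""
    score = 20
    for i, entry in enumerate(entries):
        if entry.get("error") == "E_NOT_FOUND":
            following = entries[i + 1 : i + 1 + WINDOW]
            if not any(e.get("op") in RECOVERY_OPS for e in following):
                score -= 5  # -5 for each error without a recovery lookup
    titles = [e["title"] for e in entries if e.get("op") == "tasks.add" and "title" in e]
    if len(titles) != len(set(titles)):
        score -= 5  # assumption: flat deduction, applied once per session
    return max(score, 0)


if __name__ == "__main__":
    log = [
        {"op": "tasks.show", "error": "E_NOT_FOUND"},
        {"op": "tasks.add", "title": "Fix login"},
        {"op": "tasks.add", "title": "Fix login"},
    ]
    print(score_error_protocol(log))  # 20 - 5 (no recovery) - 5 (duplicate) = 10
```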
|
|
142
|
+
|
|
143
|
+
### S5: Progressive Disclosure Use (20 pts)
|
|
144
|
+
|
|
145
|
+
| Points | Criteria |
|
|
146
|
+
|--------|----------|
|
|
147
|
+
| 10 | `admin.help` or skill lookup calls made |
|
|
148
|
+
| 10 | `query` (MCP gateway) used for programmatic access |
|
|
149
|
+
|
|
150
|
+
**What it measures**: Does the agent use progressive disclosure (help/skills) and prefer MCP over CLI?
|
|
151
|
+
|
|
152
|
+
## Interpreting Scores
|
|
153
|
+
|
|
154
|
+
### Letter Grades
|
|
155
|
+
|
|
156
|
+
| Grade | Score Range | Meaning |
|
|
157
|
+
|-------|-----------|---------|
|
|
158
|
+
| **A** | 90-100 | Excellent protocol adherence. Agent follows all best practices. |
|
|
159
|
+
| **B** | 75-89 | Good. Minor gaps in one or two dimensions. |
|
|
160
|
+
| **C** | 60-74 | Acceptable. Several protocol violations need attention. |
|
|
161
|
+
| **D** | 45-59 | Below expectations. Significant anti-patterns present. |
|
|
162
|
+
| **F** | 0-44 | Failing. Major protocol violations across multiple dimensions. |
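The table maps directly onto a threshold lookup. A minimal sketch, with `letter_grade` as a hypothetical helper name:

```python
def letter_grade(score: float) -> str:
    """Map a 0-100 score to a letter using the thresholds in the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    # Lower bound of each band, checked from highest to lowest.
    for floor, letter in ((90, "A"), (75, "B"), (60, "C"), (45, "D")):
        if score >= floor:
            return letter
    return "F"


if __name__ == "__main__":
    for value in (100, 85, 60, 44):
        print(value, letter_grade(value))
```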
|
|
163
|
+
|
|
164
|
+
### Reading the Output
|
|
165
|
+
|
|
166
|
+
The grade result includes:
|
|
167
|
+
- **score/maxScore**: Raw numeric score (e.g., `85/100`)
|
|
168
|
+
- **percent**: Percentage score
|
|
169
|
+
- **grade**: Letter grade (A-F)
|
|
170
|
+
- **dimensions**: Per-dimension breakdown with score, max, and evidence
|
|
171
|
+
- **flags**: Specific violations or improvement suggestions
|
|
172
|
+
- **entryCount**: Number of audit entries analyzed
|
|
173
|
+
|
|
174
|
+
### Flags
|
|
175
|
+
|
|
176
|
+
Flags are actionable diagnostic messages. Each flag identifies a specific behavioral issue:
|
|
177
|
+
|
|
178
|
+
- `session.list never called` -- Check existing sessions before starting new ones
|
|
179
|
+
- `session.end never called` -- Always end sessions when done
|
|
180
|
+
- `tasks.list used Nx` -- Prefer `tasks.find` for discovery
|
|
181
|
+
- `tasks.add without description` -- Always provide task descriptions
|
|
182
|
+
- `Subtasks created without tasks.exists parent check` -- Verify parent exists first
|
|
183
|
+
- `E_NOT_FOUND not followed by recovery lookup` -- Follow errors with `tasks.find` or `tasks.exists`
|
|
184
|
+
- `No admin.help or skill lookup calls` -- Load `ct-cleo` for protocol guidance
|
|
185
|
+
- `No MCP query calls` -- Prefer `query` over CLI
|
|
186
|
+
|
|
187
|
+
## Common Anti-patterns
|
|
188
|
+
|
|
189
|
+
| Anti-pattern | Impact | Fix |
|
|
190
|
+
|-------------|--------|-----|
|
|
191
|
+
| Skipping `session.list` at start | -10 S1 | Always check existing sessions first |
|
|
192
|
+
| Forgetting `session.end` | -10 S1 | End sessions when work is complete |
|
|
193
|
+
| Using `tasks.list` instead of `tasks.find` | up to -15 S2 | Use `find` for discovery, `list` only for known parent children |
|
|
194
|
+
| Creating tasks without descriptions | -5 each S3 | Always provide a description with `tasks.add` |
|
|
195
|
+
| Ignoring `E_NOT_FOUND` errors | -5 each S4 | Follow up with `tasks.find` or `tasks.exists` |
|
|
196
|
+
| Creating duplicate tasks | -5 S4 | Check for existing tasks before creating new ones |
|
|
197
|
+
| Never using `admin.help` | -10 S5 | Use progressive disclosure for protocol guidance |
|
|
198
|
+
| CLI-only usage (no MCP) | -10 S5 | Prefer `query`/`mutate` for programmatic access |
|
|
199
|
+
|
|
200
|
+
## Grade Result Schema
|
|
201
|
+
|
|
202
|
+
Grade results are stored in `.cleo/metrics/GRADES.jsonl` as append-only JSONL. Each entry conforms to `schemas/grade.schema.json` with these fields:
|
|
203
|
+
|
|
204
|
+
- `sessionId` (string, required) -- Session that was graded
|
|
205
|
+
- `taskId` (string, optional) -- Associated task ID
|
|
206
|
+
- `totalScore` (number, 0-100) -- Aggregate score
|
|
207
|
+
- `maxScore` (number, default 100) -- Maximum possible score
|
|
208
|
+
- `dimensions` (object) -- Per-dimension `{ score, max, evidence[] }`
|
|
209
|
+
- `flags` (string[]) -- Specific violations or suggestions
|
|
210
|
+
- `timestamp` (ISO 8601) -- When the grade was computed
|
|
211
|
+
- `entryCount` (number) -- Audit entries analyzed
|
|
212
|
+
- `evaluator` (`auto` | `manual`) -- How the grade was computed
|
|
213
|
+
|
|
214
|
+
## MCP Operations
|
|
215
|
+
|
|
216
|
+
| Gateway | Domain | Operation | Description |
|
|
217
|
+
|---------|--------|-----------|-------------|
|
|
218
|
+
| `query` | `check` | `grade` | Canonical grade read (`params: { sessionId }`) |
|
|
219
|
+
| `query` | `check` | `grade.list` | Canonical grade history read |
|
|
220
|
+
| `query` | `admin` | `grade` | Compatibility alias for runtime handlers |
|
|
221
|
+
| `query` | `admin` | `grade.list` | Compatibility alias for runtime handlers |
|
|
222
|
+
| `query` | `admin` | `token` | Canonical token telemetry read (`action=summary\|list\|show`) |
|
|
223
|
+
|
|
224
|
+
|
|
225
|
+
## API Update Notes
|
|
226
|
+
|
|
227
|
+
- Prefer the canonical registry surface from `docs/specs/CLEO-API.md`: `check.grade`, `check.grade.list`, and `admin.token` with an `action` param.
|
|
228
|
+
- `admin.grade*` and split `admin.token.*` paths remain compatibility handlers and may still appear in existing automation.
|
|
229
|
+
- Browser clients should target `POST /api/query` and `POST /api/mutate`; LAFS metadata is carried in `X-Cleo-*` headers by default.
|
|
230
|
+
- Treat persisted token transport values `api` and `http` as equivalent during the compatibility window described in `docs/specs/CLEO-WEB-API.md`.
|