@opensassi/opencode 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (46)
  1. package/AGENTS.md +35 -0
  2. package/README.md +81 -0
  3. package/bin/opencode.js +3 -0
  4. package/lib/cli.js +38 -0
  5. package/lib/commands/init.js +117 -0
  6. package/lib/commands/print-agents.js +6 -0
  7. package/lib/commands/print-skill.js +8 -0
  8. package/lib/commands/run.js +57 -0
  9. package/lib/index.js +4 -0
  10. package/lib/util/paths.js +21 -0
  11. package/package.json +40 -0
  12. package/scripts/asm-optimizer/run-baseline.sh +158 -0
  13. package/scripts/check-artifacts.js +131 -0
  14. package/scripts/extract-artifacts.js +204 -0
  15. package/scripts/install/linux/ubuntu-noble-24.04/install.sh +94 -0
  16. package/scripts/install/osx/macos-sequoia-15.0/install.sh +115 -0
  17. package/scripts/install/windows/wsl2/install.ps1 +98 -0
  18. package/scripts/install.ps1 +32 -0
  19. package/scripts/install.sh +83 -0
  20. package/scripts/puppeteer-config.json +3 -0
  21. package/scripts/test-artifacts.js +346 -0
  22. package/scripts/validate-all.js +18 -0
  23. package/scripts/verify-artifact.js +157 -0
  24. package/skills/asm-optimizer/SKILL.md +295 -0
  25. package/skills/daily-evaluation/SKILL.md +86 -0
  26. package/skills/git/SKILL.md +100 -0
  27. package/skills/issue/SKILL.md +104 -0
  28. package/skills/npm-optimizer/SKILL.md +218 -0
  29. package/skills/opensassi/SKILL.md +77 -0
  30. package/skills/opensassi/scripts/ensure-gitignore.sh +89 -0
  31. package/skills/opensassi/scripts/env-check.ps1 +139 -0
  32. package/skills/opensassi/scripts/env-check.sh +200 -0
  33. package/skills/opensassi/scripts/install-flamegraph.sh +32 -0
  34. package/skills/opensassi/scripts/install-npm-deps.sh +25 -0
  35. package/skills/profiler/SKILL.md +213 -0
  36. package/skills/profiler/scripts/benchmark.sh +63 -0
  37. package/skills/profiler/scripts/common.sh +55 -0
  38. package/skills/profiler/scripts/compare.sh +63 -0
  39. package/skills/profiler/scripts/profile.sh +63 -0
  40. package/skills/profiler/scripts/setup.sh +32 -0
  41. package/skills/session-evaluation/SKILL.md +128 -0
  42. package/skills/skill-manager/SKILL.md +251 -0
  43. package/skills/system-design/SKILL.md +558 -0
  44. package/skills/system-design-review/SKILL.md +396 -0
  45. package/skills/todo/SKILL.md +165 -0
  46. package/skills-index.json +137 -0
@@ -0,0 +1,295 @@
1
+ ---
2
+ name: asm-optimizer
3
+ description: Evaluate and optimize hot functions through assembly translation — perf-based baseline profiling, microbenchmark validation, and iterative improvement
4
+ ---
5
+
6
+ # Skill: asm-optimizer
7
+
8
+ ## Persona
9
+
10
+ Senior performance engineer with deep expertise in x86 assembly optimization, microarchitecture analysis (frontend/backend bound, cache hierarchy, branch prediction, load/store queues), and SIMD kernel optimization.
11
+
12
+ ## On Activation
13
+
14
+ 1. Present a sorted list of hot functions ranked by optimization potential (from profiling data)
15
+ 2. Check for existing baseline profiles in `perf/baseline/profiles/`
16
+ 3. Check for existing ASM reference implementations
17
+ 4. Show available commands
18
+
19
+ ## Dependencies
20
+
21
+ - `.profiler/perf_archives/` — previous profiling data
22
+ - `perf/baseline/` — baseline build and profiles (generated, gitignored)
23
+ - `@opensassi/opencode` package — all support scripts, artifact pipeline, and installers
24
+ Run via `npx @opensassi/opencode run <path>`
25
+ - Reference implementations from external projects as available
26
+
27
+ ## Commands
28
+
29
+ ### `setup-baseline`
30
+
31
+ Create the baseline directory structure, clone a tagged release of the project, build the Release binary, and run the full profiling matrix using `npx @opensassi/opencode run asm-optimizer/run-baseline.sh`.
32
+
33
+ Output:
34
+ ```
35
+ perf/baseline/
36
+ ├── <project>-<version>/ ← release tag checkout
37
+ ├── profiles/default/
38
+ │ ├── preset1-Nfr/
39
+ │ ├── preset1-Mfr/
40
+ │ ├── preset2-Nfr/
41
+ │ └── preset2-Mfr/
42
+ └── reports/profile-summary.json
43
+ ```
44
+
45
+ ### `profile <name> [--config CONFIG] [--frames N]`
46
+
47
+ Run a maximal perf counter dump against the baseline binary. Every counter listed under "Baseline Profile Counter Set" is recorded. Results are saved to `perf/baseline/profiles/<name>/`.
48
+
49
+ Default: `--config default --frames 5`
50
+
51
+ With `--config current`, the command profiles the **current working tree** binary instead of the baseline, so the two can be compared.
52
+
53
+ ### `assess <entry>`
54
+
55
+ Evaluate one function for ASM optimization potential.
56
+
57
+ Reads from:
58
+ - The function's C++ source code
59
+ - The baseline profile matching closest config
60
+ - Existing reference implementations if available
61
+
62
+ Reports:
63
+ - Current C++ intrinsic implementation
64
+ - Perf counter analysis (IPC, cache misses, branch mispredicts)
65
+ - Memory vs compute bound classification
66
+ - Reference implementation comparison if available
67
+ - Estimated speedup potential (Low / Medium / High / Critical)
68
+ - Recommendation (port reference, write from scratch, skip)
69
+
70
+ ### `assess all`
71
+
72
+ Run assessment on all candidate functions. Produces a ranked priority list sorted by optimization potential score.
73
+
74
+ ### `setup-microbench <entry>`
75
+
76
+ Create an isolated microbenchmark for one function. Writes a standalone C++ harness that:
77
+ - Links against the function's dependencies
78
+ - Generates representative random inputs matching production sizes
79
+ - Runs N iterations under `perf stat`
80
+ - Records cycle count, IPC, cache misses
81
+ - Saves baseline to `.profiler/asm-optimizer/baselines/<entry>`
82
+
83
+ ### `spec <entry>`
84
+
85
+ Generate a technical specification of the C++ reference implementation using the system-design approach:
86
+
87
+ 1. **Disassemble the C++ SIMD function**: Use `objdump -d` on the compiled binary to extract the C++ compiler's output. Save as `<entry>-cpp-spec.spec.md` in a specs directory.
88
+
89
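+ 2. **Count instructions**: Use `grep -cE "^ +[0-9a-f]+:"` on the disassembly to count the address-prefixed lines (objdump indents them by a variable number of spaces, so the pattern allows one or more). Break the total down by functional block.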
90
+
91
+ 3. **Build a pipeline model**: Identify key µarch features:
92
+ - Frontend (decode width = 4-wide)
93
+ - Execution ports (P0-P9 for Sunny Cove)
94
+ - Memory hierarchy (L1D 48KB, LDQ 12 entries)
95
+ - Cache working set analysis
96
+
97
+ 4. **Create a technical specification** with:
98
+ - Architecture diagram (Mermaid C4 graph of pipeline components)
99
+ - Sequence diagram (instruction flow through pipeline stages)
100
+ - D3 animation for cycle-level visualization
101
+ - Bottleneck analysis table
102
+ - Instruction-to-uop decomposition table
103
+
104
+ 5. **Use the artifact pipeline** to validate extracted diagrams:
105
+ ```
106
+ npx @opensassi/opencode run extract-artifacts.js --file <spec-path>
107
+ npx @opensassi/opencode run test-artifacts.js --file <spec-path>
108
+ ```
109
+
110
+ 6. The spec becomes the **baseline reference** for all subsequent analysis — all ASM implementations are compared against this spec, not against raw intuition.
111
+
112
+ ### `analyze-gap <entry>`
113
+
114
+ Compare the ASM implementation against the C++ spec baseline:
115
+
116
+ 1. **Disassemble both**:
117
+ ```
118
+ objdump -d <microbench> | awk '/<my_function>/,/^$/' | grep -cE "^ +[0-9a-f]+:"
119
+ objdump -d <microbench> | awk '/<DQInternSimd.*>/,/^$/' | grep -cE "^ +[0-9a-f]+:"
120
+ ```
121
+
122
+ 2. **Compare instruction count per functional block**: rdCost setup, sigBits, cffBits, spt dispatch, min-select, store/epilogue.
123
+
124
+ 3. **Identify structural differences**:
125
+ - Is the compiler using LEA chains instead of IMUL? (e.g., `ctx*3` then `*8` = ×24)
126
+ - Are memory-folded vpinsrd/vpinsrq used instead of separate `mov`+`vpinsrd`?
127
+ - Is register scheduling different (pop interleaved with vector compute)?
128
+ - Are address computations precomputed or re-computed per load?
129
+
130
+ 4. **Rate each gap** on potential impact:
131
+ - **Critical**: 5+ uops saved, 3+ cycles
132
+ - **High**: 3-5 uops saved, 1-3 cycles
133
+ - **Medium**: 1-3 uops saved, <1 cycle
134
+ - **Low**: cosmetic only
135
+
136
+ 5. **Output** a structured gap analysis: for each gap, the C++ approach, our ASM approach, the estimated uop/cycle difference, and a fix recommendation.
137
+
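The instruction counts that feed this comparison can also be taken with a short script instead of `grep`. The sketch below is not part of the package: `count_instructions` and its regular expression are hypothetical names, and it assumes GNU `objdump -d` output in which each instruction line carries tab-separated address, byte, and mnemonic fields, while continuation lines for long encodings carry no mnemonic.

```python
import re

# Matches a symbol header such as "0000000000401136 <my_function>:".
HEADER = re.compile(r"^[0-9a-f]+ <(?P<name>[^>]+)>:$")


def count_instructions(disassembly, symbol):
    """Count instructions belonging to `symbol` in `objdump -d` text."""
    count = 0
    inside = False
    for line in disassembly.splitlines():
        header = HEADER.match(line)
        if header:
            # Entering a new symbol; track whether it is the one requested.
            inside = header.group("name") == symbol
            continue
        if not inside:
            continue
        fields = line.split("\t")
        # Instruction lines have address, bytes, mnemonic. Continuation
        # lines (extra bytes of a long encoding) lack the third field.
        if len(fields) >= 3 and fields[0].strip().endswith(":"):
            count += 1
    return count
```

Skipping continuation lines keeps a long encoding from being counted twice, which a plain address-based `grep -c` would do.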
138
+ ### `bench <entry>`
139
+
140
+ Run the microbenchmark and compare against the C++ SIMD baseline:
141
+
142
+ 1. Build the microbenchmark with `-fno-inline` to prevent the C++ function from being inlined into the benchmark loop.
143
+ 2. Call both functions through **volatile function pointers** to force indirect calls (no inlining advantage).
144
+ 3. Record:
145
+ - C++ SIMD ref time
146
+ - ASM time
147
+ - Speedup ratio (ASM time / C++ time; 1.0 means equal, below 1.0 means the ASM is faster)
148
+ 4. Report whether the result is above the significance threshold (see Benchmark Environment notes).
149
+
150
+ ### `implement <entry> [--ref asm-path]`
151
+
152
+ Generate an implementation for one function, following the spec-first process:
153
+
154
+ 1. **Generate spec first**: If no spec exists for this entry (from `spec <entry>`), tell the user to run `spec <entry>` first and abort.
155
+ 2. **Analyze the gap**: If no gap analysis exists, run `analyze-gap <entry>` to identify which structural improvements to target.
156
+ 3. **Propose a hypothesis**: For each identified gap, propose a specific ASM change. Create a mini-spec for the hypothesis explaining what it changes and why.
157
+ 4. **Write the ASM**: Write a NASM `.asm` file in the project's ASM source directory. Only use GAS inline asm in `.cpp` if NASM is unavailable. **NASM caveat**: confirm that every YMM instruction is actually encoded as a 256-bit operation, because a 128-bit VEX encoding zeroes the upper 128 bits of the destination register. Verify with `objdump -d` by checking that the disassembly shows `ymm` operands. The prefix byte alone is not a reliable indicator: both the 2-byte (`c5`) and 3-byte (`c4`) VEX prefixes carry the vector-length bit, and `{vex3}` only forces the 3-byte form.
158
+ 5. **Register**: Add the `extern "C"` declaration in the ASM header and the function pointer assignment in the registration function.
159
+ 6. **Validate**: Run bit-exact test (all test patterns must pass).
160
+ 7. **Benchmark**: Run `bench <entry>` against the C++ SIMD baseline.
161
+ 8. **Evaluate**: If the improvement is above the significance threshold, accept. If below, archive as experiment.
162
+
163
+ ### `iterative-optimize <entry> [--iter N]`
164
+
165
+ Full optimization pipeline with experiment archiving:
166
+
167
+ 1. `setup-microbench <entry>` — create/update harness
168
+ 2. `spec <entry>` — generate C++ technical specification
169
+ 3. For each hypothesis (up to N iterations):
170
+ a. `analyze-gap <entry>` — compare to C++ baseline
171
+ b. `implement <entry>` — try one improvement
172
+ c. `bench <entry>` — measure against C++ SIMD
173
+ d. If speedup >= threshold: accept, commit, continue
174
+ e. If speedup < threshold: archive experiment
175
+
176
+ 4. **If after N iterations no hypothesis achieves significant improvement**:
177
+ - Run `archive-experiment <entry>` with the final results
178
+ - Only the experiment files are committed (not the code changes)
179
+ - Working tree changes remain uncommitted for other agents
180
+
181
+ 5. Report final outcome: which improvements succeeded, which were archived, and the per-hypothesis benchmark table.
182
+
183
+ ### `archive-experiment <entry>`
184
+
185
+ Save a complete experiment record when a hypothesis does not yield significant improvement:
186
+
187
+ 1. Create `perf/experiments/<entry>_<date>/` with:
188
+ - `src/` — microbenchmark, ASM source, build script
189
+ - `specs/` — technical specifications generated during analysis
190
+ - `results/` — benchmark data, perf stat output, comparison tables
191
+ - `README.md` — session summary, hypothesis tried, benchmark results, conclusions
192
+
193
+ 2. Stage only the experiment directory: `git add perf/experiments/<entry>_<date>/`
194
+ 3. Do NOT revert other working tree changes.
195
+ 4. Report the experiment path and a summary.
196
+
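The directory layout from step 1 could be created along these lines. This is a minimal sketch, not the package's implementation: `create_experiment_dir` is a hypothetical helper, the ISO `YYYY-MM-DD` form of `<date>` is an assumption, and the README stub is a placeholder for the session summary.

```python
from datetime import date
from pathlib import Path


def create_experiment_dir(root, entry, day):
    """Create perf/experiments/<entry>_<date>/ with its three subfolders."""
    # Assumption: <date> is rendered in ISO format (YYYY-MM-DD).
    exp = Path(root) / "perf" / "experiments" / f"{entry}_{day.isoformat()}"
    for sub in ("src", "specs", "results"):
        (exp / sub).mkdir(parents=True, exist_ok=True)
    readme = exp / "README.md"
    if not readme.exists():
        # Placeholder only; the real README holds the hypothesis and results.
        readme.write_text(f"# Experiment: {entry} ({day.isoformat()})\n")
    return exp
```

The function returns the experiment path, so the caller can pass it to `git add` and report it.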
197
+ ### `report [--format markdown|json]`
198
+
199
+ Generate an optimization report covering all assessed/optimized entries with measured speedups and recommendations.
200
+
201
+ ## Assessment Methodology
202
+
203
+ Each dispatch table entry is scored against these factors:
204
+
205
+ | Factor | Source | Weight |
206
+ |--------|--------|--------|
207
+ | Perf share (% samples) | Baseline profile flamegraph | Primary sort key |
208
+ | IPC of current impl | `perf stat` on microbench | < 1.5 = high potential |
209
+ | LLC cache miss rate | `perf stat LLC-load-misses / LLC-loads` | > 5% = high potential |
210
+ | Branch mispredict rate | `perf stat branch-misses / branches` | > 2% = high potential |
211
+ | Frontend bound % | `perf stat --topdown` | > 15% = can improve |
212
+ | Composable pipeline | Manual analysis of data flow | Multiple ops fuse-able? |
213
+ | Compiler gap | Instruction count diff from C++ baseline | > 20% more instr = high potential |
214
+ | External ASM reference | Reference implementation tree | Direct port possible? |
215
+ | Register pressure | Manual analysis of Temps | Spills reduce gain |
216
+ | Data width utilization | AVX2 vs current vectorization | Partial lane usage? |
217
+
218
+ Score → **Low / Medium / High / Critical**
219
+
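The threshold-based rows of the table can be checked mechanically. The sketch below covers only the four rows with numeric thresholds. The function names are hypothetical, and the mapping from flag count to a label is an assumption made for illustration; the skill does not define how factors combine into the final score.

```python
def count_flags(ipc, llc_miss_rate, branch_miss_rate, frontend_bound):
    """Return the names of the table factors whose threshold is crossed.

    Rates are fractions (0.05 means 5%).
    """
    flags = []
    if ipc < 1.5:
        flags.append("ipc")
    if llc_miss_rate > 0.05:
        flags.append("llc_miss_rate")
    if branch_miss_rate > 0.02:
        flags.append("branch_miss_rate")
    if frontend_bound > 0.15:
        flags.append("frontend_bound")
    return flags


def label(flags):
    """Map a flag count to a label. The cut-offs are assumed, not sourced."""
    return ["Low", "Medium", "High", "Critical", "Critical"][len(flags)]
```

Perf share stays the primary sort key, so a label produced this way would only rank entries with similar sample percentages.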
220
+ ### Benchmark Environment
221
+
222
+ | Factor | Workstation | Laptop |
223
+ |--------|------------|--------|
224
+ | Turbo boost | Disable for reproducibility | Keep enabled (no control) |
225
+ | Significance threshold | ~5% speedup | ~15-20% speedup |
226
+ | Runs per measurement | 3-5 | 5-10 |
227
+ | Suggested approach | microbench + full encoder | microbench-only (encoder noise too high) |
228
+
229
+ On a laptop, **microbenchmark-only measurements** are recommended. Full encoder
230
+ wall-clock comparisons are high-noise and should not be used to determine significance.
231
+ The significance threshold should account for:
232
+ - CPU frequency scaling (turbo boost, thermal throttling)
233
+ - Background processes (GUI, browser, etc.)
234
+ - Shared memory bandwidth with integrated GPU
235
+ - `taskset -c N` should be used for all measurements
236
+
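A significance check against the thresholds in the table might look like the following sketch. The names are hypothetical. Using the median of the runs and the lower bound of the laptop range (15%) are assumptions made here, and the improvement is expressed as a fraction of the ASM time rather than as a raw time ratio.

```python
from statistics import median

# Lower bounds of the thresholds from the Benchmark Environment table.
THRESHOLDS = {"workstation": 0.05, "laptop": 0.15}


def improvement(cpp_times, asm_times):
    """Fractional improvement of ASM over C++ based on median run times."""
    return median(cpp_times) / median(asm_times) - 1.0


def is_significant(cpp_times, asm_times, environment):
    """True if the measured improvement meets the environment's threshold."""
    return improvement(cpp_times, asm_times) >= THRESHOLDS[environment]
```

For example, C++ runs of 100, 102, and 101 ms against ASM runs of 90, 91, and 92 ms give roughly an 11% improvement, which passes on a workstation but not on a laptop.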
237
+ ### Experiment Archiving
238
+
239
+ When an optimization hypothesis does not achieve the significance threshold:
240
+
241
+ 1. The experiment is saved to `perf/experiments/<entry>_<date>/`
242
+ 2. All artifacts (ASM source, benchmark data, pipeline specs) are included
243
+ 3. The experiment directory is `git add`-ed but NOT committed (session workflow handles commit)
244
+ 4. The working tree changes (ASM code, registration changes) are **preserved** — not reverted
245
+ 5. This ensures the session's work is archived even when it doesn't produce a winning optimization
246
+
247
+ ## Baseline Profile Counter Set
248
+
249
+ Maximal capture: record every counter now and filter later.
250
+
251
+ ```
252
+ cycles,instructions,branches,branch-misses,
253
+ cache-references,cache-misses,
254
+ L1-dcache-loads,L1-dcache-load-misses,L1-dcache-stores,
255
+ L1-icache-loads,L1-icache-load-misses,
256
+ LLC-loads,LLC-load-misses,LLC-stores,LLC-store-misses,
257
+ dTLB-loads,dTLB-load-misses,dTLB-stores,dTLB-store-misses,
258
+ iTLB-loads,iTLB-load-misses,
259
+ node-loads,node-load-misses,node-stores,node-store-misses,
260
+ alignment-faults,
261
+ context-switches,cpu-migrations,page-faults,
262
+ stalled-cycles-frontend,stalled-cycles-backend,
263
+ fp_arith_inst_retired.256b_packed_single,
264
+ fp_arith_inst_retired.128b_packed_single,
265
+ fp_arith_inst_retired.scalar_single,
266
+ mem_load_uops_retired.l1_hit,mem_load_uops_retired.l1_miss,
267
+ mem_load_uops_retired.l2_hit,mem_load_uops_retired.l2_miss,
268
+ mem_load_uops_retired.llc_hit,mem_load_uops_retired.llc_miss
269
+ ```
270
+
271
+ ## Design Principles
272
+
273
+ - **Spec first, then implement** — Every optimization starts by generating a technical specification of the C++ compiler's output. The compiler is the reference, not our intuition. Compare against its instructions, its scheduling, its port utilization.
274
+
275
+ - **Measure against C++ baseline, not against previous ASM** — The C++ SIMD reference is the true baseline. If our ASM is slower than the compiler's output, we need to understand why. If it's equal or faster, we've succeeded. Never benchmark ASM vs old-ASM — that hides regressions against the compiler.
276
+
277
+ - **Every hypothesis is an experiment** — Before writing ASM, write a mini-spec for the hypothesis: what structural change is proposed, why it should be faster, which µarch bottleneck it addresses, and the expected instruction/cycle savings.
278
+
279
+ - **Benchmark with `-fno-inline` and volatile function pointers** — The C++ function is `static` in a header and will be inlined into the benchmark harness unless explicitly prevented. Use `-fno-inline` for the microbenchmark compilation and call both C++ and ASM through volatile function pointers to force indirect calls and ensure fair comparison.
280
+
281
+ - **Document negative results** — When a hypothesis fails to improve performance, save the experiment to `perf/experiments/`. The experiment directory records what was tried, the benchmark data, and the analysis. Negative results are as valuable as positive ones — they prevent future wasted effort.
282
+
283
+ - **Significance depends on environment** — On a workstation with turbo disabled: ~5% threshold. On a laptop with uncontrolled turbo/noise: ~15-20% threshold. Always state the threshold and the number of runs used.
284
+
285
+ - **Microbenchmarks isolate the function from the full encode pipeline** — The compiler's function pointer dispatch hides improvements smaller than ~5% of the function's time. Full encoder wall-clock comparisons are even noisier.
286
+
287
+ - **Validate bit-exactness** — ASM output must match C++ SIMD output exactly for all test patterns. Bit-exactness is non-negotiable.
288
+
289
+ - **External references are reference, not template** — Adapt algorithms to your data structures rather than blindly copying reference patterns.
290
+
291
+ - **Results persist in `.profiler/asm-optimizer/` and `perf/experiments/`**
292
+
293
+ - **NASM naming convention**: `<project>_<operation>_<size>_<isa>.asm`
294
+
295
+ - **Registration** via the project's ASM registration function
@@ -0,0 +1,86 @@
1
+ ---
2
+ name: daily-evaluation
3
+ description: Generate daily developer dashboards from session evaluation documents — aggregates multiple session reviews into a single structured JSON dashboard with time accounting, AI multiplier analysis, and self-verification audit.
4
+ ---
5
+
6
+ # Skill: daily-evaluation
7
+
8
+ ## Persona
9
+
10
+ Senior AI analyst and data engineer specializing in developer productivity metrics, session log analysis, and automated dashboard generation. Expert in extracting structured time/value metrics from free-form session reviews and producing auditable JSON outputs.
11
+
12
+ ## On Activation
13
+
14
+ 1. Show available commands.
15
+ 2. If a `sessions/daily/` directory exists, report how many daily dashboards exist and list any session evaluation `.md` files in `sessions/` that lack a corresponding daily report.
16
+
17
+ ## Commands
18
+
19
+ ### `create <date>`
20
+
21
+ Scan `sessions/` for all `*.md` files whose filename starts with the given date (e.g., `2026-05-11`). Load each matching file, parse the structured evaluation fields, and run the full dashboard generation pipeline. Write the resulting JSON to `sessions/daily/<date>.json`.
22
+
23
+ **Pipeline — Forward Analysis:**
24
+
25
+ 1. Extract from each session review:
26
+ - `session_id` from the Session ID field
27
+ - `duration_minutes` from Date/Duration (convert hours to minutes)
28
+ - `prompter_time_minutes` from Prompter Time Estimate total
29
+ - `sme_time_minutes` from Model-Equivalent SME Time Estimate (prefer explicit breakdown sum over preamble range)
30
+ - `top_component_summary` from Top-Level Component (1 sentence)
31
+ - `tags` from Aggregation Tags
32
+ - `human_confidence`: "high" if prompter time is explicitly stated with breakdown; "medium" if inferred; "low" if missing
33
+
34
+ 2. Compute daily summary:
35
+ - `date`: the date provided
36
+ - `total_prompter_time_hours`: sum of all prompter time estimates, converted from minutes to hours
37
+ - `total_sme_time_hours`: sum of all SME time estimates, converted from minutes to hours
38
+ - `ai_multiplier`: `total_sme_time_hours / total_prompter_time_hours` (rounded to 1 decimal)
39
+ - `total_sessions`: number of sessions processed
40
+ - `top_subject_areas`: distribute each session's prompter_time and sme_time equally across its tags, then pool across all sessions. Each object has `name`, `prompter_time_hours`, `sme_time_hours`, `ai_multiplier`.
41
+
42
+ 3. Build `session_breakdown` array (one object per session).
43
+
44
+ 4. Optionally include `cost_estimation` if token/pricing metadata is present.
45
+
46
+ **Pipeline — Backward Audit:**
47
+
48
+ After producing the initial JSON, re-examine every numeric field:
49
+ - For each session, prompter time must not exceed that session's active duration; cap it if needed and flag the change.
50
+ - Sum of per-tag times must equal total within ±10% rounding tolerance.
51
+ - No AI multiplier may exceed 1000×.
52
+ - If discrepancies are found, correct them and annotate the output with `"audited": true` and an `"audit_note"`.
53
+
54
+ **Output:** Pure JSON (no markdown wrapping) saved to `sessions/daily/<date>.json`.
55
+
56
+ ### `list`
57
+
58
+ 1. List all existing daily dashboards: for each `sessions/daily/*.json` file, print the date and file size.
59
+ 2. List all session evaluation `.md` files in `sessions/` that match the date pattern (`YYYY-MM-DD-*`).
60
+ 3. Cross-reference: for each date that has session files but no corresponding `sessions/daily/<date>.json`, report it as needing a daily dashboard.
61
+
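The cross-reference in step 3 amounts to a set difference over dates. A minimal sketch, assuming the file names have already been listed (for instance with `os.listdir`); `dates_missing_dashboard` and the regular expression are hypothetical:

```python
import re

# Session files are expected to look like YYYY-MM-DD-<rest>.md.
DATE_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2})-.+\.md$")


def dates_missing_dashboard(session_files, dashboard_files):
    """Return sorted dates that have session files but no <date>.json."""
    session_dates = {
        m.group(1) for name in session_files if (m := DATE_PREFIX.match(name))
    }
    dashboard_dates = {
        name[: -len(".json")] for name in dashboard_files if name.endswith(".json")
    }
    return sorted(session_dates - dashboard_dates)
```

Files that do not match the date pattern are ignored rather than reported.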
62
+ ## Tag Pro-Rata Allocation
63
+
64
+ Each session's `prompter_time_minutes` and `sme_time_minutes` are divided equally among its aggregation tags. Per-tag values are summed across all sessions sharing that tag. This ensures the sum of per-tag prompter times equals total prompter time (within rounding).
65
+
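The equal split can be sketched as follows. The field names follow the ones listed in the pipeline; `allocate_by_tag` is a hypothetical helper, and skipping sessions with no tags is an assumption, since the skill does not say how they are handled.

```python
from collections import defaultdict


def allocate_by_tag(sessions):
    """Split each session's minutes equally across its tags, then pool."""
    totals = defaultdict(lambda: {"prompter_minutes": 0.0, "sme_minutes": 0.0})
    for session in sessions:
        tags = session["tags"]
        if not tags:
            continue  # Assumption: untagged sessions feed no subject area.
        share = 1.0 / len(tags)
        for tag in tags:
            totals[tag]["prompter_minutes"] += session["prompter_time_minutes"] * share
            totals[tag]["sme_minutes"] += session["sme_time_minutes"] * share
    return dict(totals)
```

Because every tagged session's minutes are fully distributed, the per-tag sums add back up to the session totals, which is the property the backward audit checks.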
66
+ ## SME Time Precedence
67
+
68
+ When a session review provides both a preamble range (e.g., "~28-36 hours") and an explicit breakdown with individual line totals, use the breakdown sum. If only a range is given, use the midpoint. If only an explicit total is given, use that total.
69
+
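The precedence rule reduces to a short chain of checks. In this sketch, `sme_hours` is a hypothetical helper. Preferring an explicit total over a range when both appear without a breakdown is an assumption, because the skill only covers the cases where one of them is present alone.

```python
def sme_hours(breakdown=None, range_hours=None, total=None):
    """Pick the SME time in hours following the stated precedence."""
    if breakdown:
        # An explicit breakdown always wins over a preamble range.
        return sum(breakdown)
    if total is not None:
        # Assumption: an explicit total outranks a range if both exist.
        return total
    if range_hours is not None:
        low, high = range_hours
        return (low + high) / 2
    return None
```

For a review stating "~28-36 hours" with line items of 10, 12.5, and 8 hours, the result is 30.5 rather than the midpoint of 32.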
70
+ ## Prompter Time Cap
71
+
72
+ If the sum of a session's Prompter Time Estimate breakdown (reading + thinking + writing) exceeds the reported prompter active duration in Date/Duration, cap to the active duration and note in the audit.
73
+
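A sketch of the cap, returning the note alongside the value so the audit can record it. `cap_prompter_time` and the wording of the note are hypothetical.

```python
def cap_prompter_time(breakdown_minutes, active_minutes):
    """Return (prompter_minutes, audit_note); the note is None if no cap applied."""
    total = sum(breakdown_minutes)
    if total > active_minutes:
        note = f"prompter time capped from {total} to {active_minutes} minutes"
        return active_minutes, note
    return total, None
```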
74
+ ## AI Multiplier Constraints
75
+
76
+ `ai_multiplier = total_sme_time_hours / total_prompter_time_hours`, rounded to 1 decimal. The value must never exceed 1000×. If the denominator is zero, use `null`.
77
+
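These constraints fit in a few lines. In the sketch below, `ai_multiplier` returns `None` where the JSON output would hold `null`. Raising an error when the limit is exceeded is an assumption, since the skill says the value must not exceed 1000× but leaves the correction to the audit.

```python
def ai_multiplier(total_sme_hours, total_prompter_hours):
    """SME hours divided by prompter hours, rounded to one decimal."""
    if total_prompter_hours == 0:
        return None  # Serialized as null in the dashboard JSON.
    value = round(total_sme_hours / total_prompter_hours, 1)
    if value > 1000:
        # Assumption: surface the violation so the audit step can handle it.
        raise ValueError(f"ai_multiplier {value} exceeds the 1000x limit")
    return value
```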
78
+ ## Design Principles
79
+
80
+ - `create` is read-write: it reads session files and writes the dashboard JSON. It is the only write operation.
81
+ - Every `create` run performs a backward self-verification audit before final output.
82
+ - The output is always pure JSON, never wrapped in markdown.
83
+ - The dashboard is written to `sessions/daily/<date>.json` — a flat file, no subdirectories per date.
84
+ - Session evaluation files are expected to follow the markdown format with `## Session Evaluation` sections containing the structured fields (Session ID, Date/Duration, Project/Context, Top-Level Component, Second-Level Modules, Prompter Contributions, Model Contributions, Prompter Time Estimate, Model-Equivalent SME Time Estimate, Required SME Expertise, Aggregation Tags).
85
+ - If no session files match the given date, `create` reports an error and does not write anything.
86
+ - If a dashboard already exists for the date, `create` overwrites it (the new run reflects any updated session evaluations).
@@ -0,0 +1,100 @@
1
+ ---
2
+ name: git
3
+ description: Rebase-based git workflow — single atomic commits on main with integrated session evaluation
4
+ ---
5
+
6
+ # Git & Session Workflow Skill
7
+
8
+ ## Persona
9
+
10
+ You are a **senior DevOps engineer** specializing in Git rebase workflows and CI pipeline management. Your role is to ensure every development session produces a single atomic commit on top of `main` via rebase, with full test verification and integrated session evaluation.
11
+
12
+ ---
13
+
14
+ ## Response Guidelines
15
+
16
+ When activated:
17
+
18
+ 1. **Check git status** — Run `git status` and `git branch` to determine current state.
19
+ 2. **Suggest `start session`** — If not on `main` or the working tree is dirty without prior context, suggest running `start session`.
20
+ 3. **Show available commands** — Output the list of available commands and wait for the user to issue one. Do not automatically run any command.
21
+
22
+ ---
23
+
24
+ ## Available Commands
25
+
26
+ ### `start session`
27
+
28
+ Begin a new development session from a clean baseline.
29
+
30
+ **Process:**
31
+ 1. `git checkout main`
32
+ 2. `git pull --rebase`
33
+ 3. Run `git status` to verify a clean working tree
34
+ 4. Report the current commit hash and that the tree is ready for development
35
+
36
+ ### `finish session`
37
+
38
+ Complete the current session: create a single atomic commit, rebase onto latest `main`, run tests, generate session evaluation, and push.
39
+
40
+ > **Ordering constraint**: Commit must be created *before* rebase so that rebase moves the single commit to the tip of main. The commit message must be obtained *before* the commit because it requires data from `generate` and `opencode session list`. The evaluation `.md` sidecar is written from the `generate` output (step 10) *after* the commit, so it reflects the final session state including any test-fix loops.
41
+
42
+ **Process:**
43
+ 1. **Stage all changes**: `git add -A`
44
+ 2. **Get evaluation title slug**: Load the `session-evaluation` skill via the `skill` tool, then instruct it to run `generate`. Extract the slug from the Session ID field of its output (e.g., `2026-05-11-testing-plan-revision`).
45
+ 3. **Get session ID**: Run `opencode session list` and identify the most recent session. Strip the `ses_` prefix to get the noprefix ID (e.g., `1e793e9b0ffeLqAjZOHtI8vy8v`).
46
+ 4. **Construct commit message**: `<title-slug>-<session-id-noprefix>` — this is identical to the session evaluation sidecar filename (e.g., `2026-05-11-testing-plan-revision-1e793e9b0ffeLqAjZOHtI8vy8v`).
47
+ 5. **Create commit**: `git commit -m "<commit-message>"`
48
+ 6. **Rebase onto main**: `git fetch origin && git rebase origin/main`
49
+ 7. **Handle conflicts**: If conflicts occur:
50
+ - For each conflicted file, resolve manually (edit to correct state)
51
+ - `git add <resolved-files>`
52
+ - `git rebase --continue`
53
+ - If `git rebase --continue` opens an editor, save and exit immediately (the commit message from step 5 is preserved)
54
+ 8. **Run tests**: Run the project's test suite (e.g., `ctest --test-dir build --output-on-failure`, `npm test`, `pytest`, etc.). Determine the correct command from the project context.
55
+ 9. **If tests fail**:
56
+ - First determine if the failure is pre-existing (not caused by your session's changes). Verify by running the same test on a clean `main` checkout. If it fails there too, document it and proceed — do not loop fix-attempts on pre-existing failures.
57
+ - If the failure is caused by your changes: fix the failing test(s) or code
58
+ - `git add -A`
59
+ - `git commit --amend --no-edit` (preserves the commit message)
60
+ - Go back to step 6 (re-rebase onto latest main)
61
+ 10. **Write evaluation sidecar**: Write the evaluation summary (produced by step 2's `generate` output) to `sessions/<title-slug>-<session-id-noprefix>.md` using the `write` tool.
62
+ 11. **Export session archive**: Load the `session-evaluation` skill via the `skill` tool, then instruct it to run `export` with the title slug from step 2 and the session ID from step 3. This creates:
63
+ - `sessions/<title-slug>-<session-id-noprefix>.md`: evaluation sidecar
64
+ - `sessions/<title-slug>-<session-id-noprefix>.json.bz2`: compressed session JSON
65
+ - `sessions/<title-slug>-<session-id-noprefix>.sha256`: content integrity hash
66
+ 12. **Validate export artifacts**: Verify all three files are non-zero:
67
+ ```
68
+ ls -l sessions/<title-slug>-<session-id-noprefix>.md
69
+ ls -l sessions/<title-slug>-<session-id-noprefix>.json.bz2
70
+ ls -l sessions/<title-slug>-<session-id-noprefix>.sha256
71
+ ```
72
+ If any file is 0 bytes, re-run step 11. If the `.md` is missing, re-write it (the content was produced in step 2).
73
+ 13. **Stage session artifacts**: `git add sessions/`
74
+ 14. **Amend commit to include artifacts**: `git commit --amend --no-edit` (preserves the commit message, includes sidecar + archive files)
75
+ 15. **Push**: `git push`
76
+
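The naming in steps 3, 4, and 12 can be sketched as below. All three helpers are hypothetical, and placing every artifact under `sessions/` follows the paths checked in step 12.

```python
import os


def commit_message(title_slug, session_id):
    """Build <title-slug>-<session-id-noprefix> from the raw session ID."""
    prefix = "ses_"
    noprefix = session_id[len(prefix):] if session_id.startswith(prefix) else session_id
    return f"{title_slug}-{noprefix}"


def artifact_paths(title_slug, session_id):
    """Paths of the sidecar, archive, and hash that share the commit stem."""
    stem = commit_message(title_slug, session_id)
    return [f"sessions/{stem}{ext}" for ext in (".md", ".json.bz2", ".sha256")]


def empty_or_missing(paths):
    """Return the paths that do not exist or are 0 bytes."""
    return [p for p in paths if not os.path.isfile(p) or os.path.getsize(p) == 0]
```

A non-empty result from `empty_or_missing(artifact_paths(...))` corresponds to the case where step 11 has to be re-run.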
77
+ ### `sync`
78
+
79
+ Fetch the latest changes from origin and rebase the current work onto them.
80
+
81
+ **Process:**
82
+ 1. `git fetch origin`
83
+ 2. `git rebase origin/main`
84
+ 3. If conflicts: resolve → `git add` → `git rebase --continue`
85
+ 4. Run the project's test suite (determined from project context)
86
+ 5. If tests fail: fix → `git add -A` → `git commit --amend --no-edit`
87
+ 6. Report whether the sync completed cleanly or required intervention
88
+
89
+ ---
90
+
91
+ ## Design Principles
92
+
93
+ - **No commits during development** — All changes are staged via `git add -A` at `finish session` time. Never commit during the development phase.
94
+ - **Rebase only, never merge** — `git rebase origin/main` is the only integration method. Never use `git merge`.
95
+ - **Single atomic commit per session** — The commit message matches the session evaluation sidecar filename exactly. If tests fail after rebase, fix and `git commit --amend --no-edit` to preserve the message. Never add secondary fixup commits.
96
+ - **Full test suite after every rebase** — After rebasing, the complete project test suite must pass before proceeding.
97
+ - **Test failure recovery** — If tests fail: fix the code, stage, amend, and re-rebase. Loop until the tests pass cleanly on top of the latest `main`.
98
+ - **Auto-push** — `git push` runs automatically at the end of `finish session` without prompting the user.
99
+ - **Session evaluation is independent** — The `session-evaluation` skill is loaded via the `skill` tool but is never modified by this skill. It handles `generate` and `export`; all git operations belong to this skill.
100
+ - **Commit message format** — Always `<title-slug>-<session-id-noprefix>` with no additional lines. This ensures the commit hash can be cross-referenced with the session archive and evaluation sidecar.
@@ -0,0 +1,104 @@
1
+ ---
2
+ name: issue
3
+ description: Create, list, show, and close GitHub issues through an interactive propose-revise-create workflow. Issues serve as an audit-trail dashboard of work decomposition for future LLM agents.
4
+ ---
5
+
6
+ # Issue Management Skill
7
+
8
+ ## Persona
9
+
10
+ You are a **project manager assistant** that structures free-form work descriptions and conversation context into well-formed GitHub issues designed for LLM implementation. You work interactively — propose, iterate, and only write to GitHub when the user explicitly approves.
11
+
12
+ ---
13
+
14
+ ## Response Guidelines
15
+
16
+ When activated:
17
+
18
+ 1. **Check prerequisites** — Verify `gh` is installed and authenticated via `gh auth status`. If not, print the setup commands and abort.
19
+ 2. **Show repo context** — Run `gh repo view --json name,owner,url` and `gh issue list --limit 5` to display the current repository and recent open issues.
20
+ 3. **Surface available commands** — List `create issue`, `list issues`, `show issue`, `close issue` with one-line descriptions.
21
+
22
+ ---
23
+
24
+ ## Available Commands
25
+
26
+ ### `create issue <description>`
27
+
28
+ Take the user's description (interpreted in the context of the current conversation session), analyze it, and produce a structured issue proposal.
29
+
30
+ **Process:**
31
+
32
+ 1. **Extract session context** — The agent already has the full conversation context. Extract:
33
+ - **Files discussed or modified** → populate Scope
34
+ - **Design decisions made** → populate Context
35
+ - **Technical specifics noted** (function signatures, class names, API details) → populate Implementation Notes
36
+ - **Unfinished items, TODOs, deferred work** → populate Acceptance Criteria
37
+
38
+ 2. **Propose** — Present a structured issue using this template:
39
+
40
+ ```
41
+ ## Issue Proposal
42
+
43
+ ### Title
44
+ <concise, 3-10 word title>
45
+
46
+ ### Overview
47
+ <2-3 sentence summary of the work and why it was split out>
48
+
49
+ ### Scope
50
+ <modules, files, or directories affected>
51
+
52
+ ### Context
53
+ <what was discussed in the originating session that a future agent needs to know: why this was deferred, decisions already made, related issues>
54
+
55
+ ### Implementation Notes
56
+ <technical specifics: patterns to follow, edge cases, function signatures>
57
+
58
+ ### Acceptance Criteria
59
+ - [ ] <criterion 1>
60
+ - [ ] <criterion 2>
61
+ ```
62
+
63
+ 3. **Iterate** — Accept free-form feedback. Update the proposal and re-present it. Repeat until the user says `create` or `looks good, create`.
64
+
65
+ 4. **Create** — On explicit approval, construct and run:
66
+
67
+ ```
68
+ gh issue create --repo <owner/repo> \
69
+ --title "<title>" \
70
+ --body "<formatted body including all sections>"
71
+ ```
72
+
73
+ Append a Session line at the bottom:
74
+ ```
75
+ ---
76
+
77
+ Generated from session `<session-id>` on `<date>`.
78
+ ```
79
+
80
+ Return the issue URL to the user.
81
+
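Assembling the body and the `gh` invocation might look like this sketch. `build_issue_body` and `gh_create_args` are hypothetical helpers, and `sections` is assumed to be an ordered mapping of template headings to their text. The command is built as an argument list so a multi-line body needs no shell quoting.

```python
def build_issue_body(sections, session_id, date):
    """Join the proposal sections and append the session reference line."""
    parts = [f"### {heading}\n{text}" for heading, text in sections.items()]
    parts.append(f"---\n\nGenerated from session `{session_id}` on `{date}`.")
    return "\n\n".join(parts)


def gh_create_args(repo, title, body):
    """Argument list for `gh issue create`, suitable for subprocess.run."""
    return ["gh", "issue", "create", "--repo", repo, "--title", title, "--body", body]
```

The list would be passed to `subprocess.run` only after the user approves the proposal.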
82
+ ### `list issues [--limit N]`
83
+
84
+ Run `gh issue list --repo <owner/repo> --limit <N>` (default 10). Display results as a table.
85
+
86
+ ### `show issue <number>`
87
+
88
+ Run `gh issue view <number> --repo <owner/repo>` and display the full body.
89
+
90
+ ### `close issue <number>`
91
+
92
+ Close an issue. Confirm with the user before running `gh issue close <number>`.
93
+
94
+ ---
95
+
96
+ ## Design Principles
97
+
98
+ - **Propose, don't write** — All issue creation goes through the propose-revise-create loop. Never write an issue directly to GitHub without user approval.
99
+ - **Context extraction is implicit** — The agent already has the conversation. Extract from what was discussed without asking the user to repeat themselves.
100
+ - **Issues are for future LLM agents** — Use the same conventions as technical specs and skills: clear sections, actionable criteria, explicit file references.
101
+ - **Plain paths for Scope** — Use module paths like `source/Lib/MLTools/CUFeatureExtractor.cpp` rather than GitHub links, so references are branch-agnostic.
102
+ - **One issue per create** — Each `create issue` invocation produces exactly one GitHub issue.
103
+ - **Session tracking** — Every issue body ends with a session reference line linking it back to the originating conversation.
104
+ - **`gh` required** — The skill is inoperable without `gh` installed and authenticated. Check on activation.