@opensassi/opencode 0.1.0
- package/AGENTS.md +35 -0
- package/README.md +81 -0
- package/bin/opencode.js +3 -0
- package/lib/cli.js +38 -0
- package/lib/commands/init.js +117 -0
- package/lib/commands/print-agents.js +6 -0
- package/lib/commands/print-skill.js +8 -0
- package/lib/commands/run.js +57 -0
- package/lib/index.js +4 -0
- package/lib/util/paths.js +21 -0
- package/package.json +40 -0
- package/scripts/asm-optimizer/run-baseline.sh +158 -0
- package/scripts/check-artifacts.js +131 -0
- package/scripts/extract-artifacts.js +204 -0
- package/scripts/install/linux/ubuntu-noble-24.04/install.sh +94 -0
- package/scripts/install/osx/macos-sequoia-15.0/install.sh +115 -0
- package/scripts/install/windows/wsl2/install.ps1 +98 -0
- package/scripts/install.ps1 +32 -0
- package/scripts/install.sh +83 -0
- package/scripts/puppeteer-config.json +3 -0
- package/scripts/test-artifacts.js +346 -0
- package/scripts/validate-all.js +18 -0
- package/scripts/verify-artifact.js +157 -0
- package/skills/asm-optimizer/SKILL.md +295 -0
- package/skills/daily-evaluation/SKILL.md +86 -0
- package/skills/git/SKILL.md +100 -0
- package/skills/issue/SKILL.md +104 -0
- package/skills/npm-optimizer/SKILL.md +218 -0
- package/skills/opensassi/SKILL.md +77 -0
- package/skills/opensassi/scripts/ensure-gitignore.sh +89 -0
- package/skills/opensassi/scripts/env-check.ps1 +139 -0
- package/skills/opensassi/scripts/env-check.sh +200 -0
- package/skills/opensassi/scripts/install-flamegraph.sh +32 -0
- package/skills/opensassi/scripts/install-npm-deps.sh +25 -0
- package/skills/profiler/SKILL.md +213 -0
- package/skills/profiler/scripts/benchmark.sh +63 -0
- package/skills/profiler/scripts/common.sh +55 -0
- package/skills/profiler/scripts/compare.sh +63 -0
- package/skills/profiler/scripts/profile.sh +63 -0
- package/skills/profiler/scripts/setup.sh +32 -0
- package/skills/session-evaluation/SKILL.md +128 -0
- package/skills/skill-manager/SKILL.md +251 -0
- package/skills/system-design/SKILL.md +558 -0
- package/skills/system-design-review/SKILL.md +396 -0
- package/skills/todo/SKILL.md +165 -0
- package/skills-index.json +137 -0
|
@@ -0,0 +1,295 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: asm-optimizer
|
|
3
|
+
description: Evaluate and optimize hot functions through assembly translation — perf-based baseline profiling, microbenchmark validation, and iterative improvement
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Skill: asm-optimizer
|
|
7
|
+
|
|
8
|
+
## Persona
|
|
9
|
+
|
|
10
|
+
Senior performance engineer with deep expertise in x86 assembly optimization, microarchitecture analysis (frontend/backend bound, cache hierarchy, branch prediction, load/store queues), and SIMD kernel optimization.
|
|
11
|
+
|
|
12
|
+
## On Activation
|
|
13
|
+
|
|
14
|
+
1. Present a sorted list of hot functions ranked by optimization potential (from profiling data)
|
|
15
|
+
2. Check for existing baseline profiles in `perf/baseline/profiles/`
|
|
16
|
+
3. Check for existing ASM reference implementations
|
|
17
|
+
4. Show available commands
|
|
18
|
+
|
|
19
|
+
## Dependencies
|
|
20
|
+
|
|
21
|
+
- `.profiler/perf_archives/` — previous profiling data
|
|
22
|
+
- `perf/baseline/` — baseline build and profiles (generated, gitignored)
|
|
23
|
+
- `@opensassi/opencode` package — all support scripts, artifact pipeline, and installers
|
|
24
|
+
Run via `npx @opensassi/opencode run <path>`
|
|
25
|
+
- Reference implementations from external projects as available
|
|
26
|
+
|
|
27
|
+
## Commands
|
|
28
|
+
|
|
29
|
+
### `setup-baseline`
|
|
30
|
+
|
|
31
|
+
Create the baseline directory structure, clone a tagged release of the project, build the Release binary, and run the full profiling matrix using `npx @opensassi/opencode run asm-optimizer/run-baseline.sh`.
|
|
32
|
+
|
|
33
|
+
Output:
|
|
34
|
+
```
|
|
35
|
+
perf/baseline/
|
|
36
|
+
├── <project>-<version>/ ← release tag checkout
|
|
37
|
+
├── profiles/default/
|
|
38
|
+
│ ├── preset1-Nfr/
|
|
39
|
+
│ ├── preset1-Mfr/
|
|
40
|
+
│ ├── preset2-Nfr/
|
|
41
|
+
│ └── preset2-Mfr/
|
|
42
|
+
└── reports/profile-summary.json
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
### `profile <name> [--config CONFIG] [--frames N]`
|
|
46
|
+
|
|
47
|
+
Run a maximal perf counter dump against the baseline binary. All counters listed below are recorded. Saves to `perf/baseline/profiles/<name>/`.
|
|
48
|
+
|
|
49
|
+
Default: `--config default --frames 5`
|
|
50
|
+
|
|
51
|
+
If `--config current` is used, profiles the **current working tree** binary (not baseline) for comparison.
|
|
52
|
+
|
|
53
|
+
### `assess <entry>`
|
|
54
|
+
|
|
55
|
+
Evaluate one function for ASM optimization potential.
|
|
56
|
+
|
|
57
|
+
Reads from:
|
|
58
|
+
- The function's C++ source code
|
|
59
|
+
- The baseline profile matching closest config
|
|
60
|
+
- Existing reference implementations if available
|
|
61
|
+
|
|
62
|
+
Reports:
|
|
63
|
+
- Current C++ intrinsic implementation
|
|
64
|
+
- Perf counter analysis (IPC, cache misses, branch mispredicts)
|
|
65
|
+
- Memory vs compute bound classification
|
|
66
|
+
- Reference implementation comparison if available
|
|
67
|
+
- Estimated speedup potential (Low / Medium / High / Critical)
|
|
68
|
+
- Recommendation (port reference, write from scratch, skip)
|
|
69
|
+
|
|
70
|
+
### `assess all`
|
|
71
|
+
|
|
72
|
+
Run assessment on all candidate functions. Produces a ranked priority list sorted by optimization potential score.
|
|
73
|
+
|
|
74
|
+
### `setup-microbench <entry>`
|
|
75
|
+
|
|
76
|
+
Create an isolated microbenchmark for one function. Writes a standalone C++ harness that:
|
|
77
|
+
- Links against the function's dependencies
|
|
78
|
+
- Generates representative random inputs matching production sizes
|
|
79
|
+
- Runs N iterations under `perf stat`
|
|
80
|
+
- Records cycle count, IPC, cache misses
|
|
81
|
+
- Saves baseline to `.profiler/asm-optimizer/baselines/<entry>`
|
|
82
|
+
|
|
83
|
+
### `spec <entry>`
|
|
84
|
+
|
|
85
|
+
Generate a technical specification of the C++ reference implementation using the system-design approach:
|
|
86
|
+
|
|
87
|
+
1. **Disassemble the C++ SIMD function**: Use `objdump -d` on the compiled binary to extract the C++ compiler's output. Save as `<entry>-cpp-spec.spec.md` in a specs directory.
|
|
88
|
+
|
|
89
|
+
2. **Count instructions**: Use `grep -c "^ [0-9a-f]"` on the disassembly to get the full instruction count. Break down by functional blocks.
|
|
90
|
+
|
|
91
|
+
3. **Build a pipeline model**: Identify key µarch features:
|
|
92
|
+
- Frontend (decode width = 4-wide)
|
|
93
|
+
- Execution ports (P0-P6 for Sunny Cove)
|
|
94
|
+
- Memory hierarchy (L1D 48KB, LDQ 12 entries)
|
|
95
|
+
- Cache working set analysis
|
|
96
|
+
|
|
97
|
+
4. **Create a technical specification** with:
|
|
98
|
+
- Architecture diagram (Mermaid C4 graph of pipeline components)
|
|
99
|
+
- Sequence diagram (instruction flow through pipeline stages)
|
|
100
|
+
- D3 animation for cycle-level visualization
|
|
101
|
+
- Bottleneck analysis table
|
|
102
|
+
- Instruction-to-uop decomposition table
|
|
103
|
+
|
|
104
|
+
5. **Use the artifact pipeline** to validate extracted diagrams:
|
|
105
|
+
```
|
|
106
|
+
npx @opensassi/opencode run extract-artifacts.js --file <spec-path>
|
|
107
|
+
npx @opensassi/opencode run test-artifacts.js --file <spec-path>
|
|
108
|
+
```
|
|
109
|
+
|
|
110
|
+
6. The spec becomes the **baseline reference** for all subsequent analysis — all ASM implementations are compared against this spec, not against raw intuition.
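The instruction-count step (step 2 above) can also be sketched in Python, which makes it easy to skip the byte-only continuation lines that `objdump` prints for long instructions. This is a minimal sketch, assuming GNU `objdump -d` output in which each instruction line has the shape `address:<TAB>bytes<TAB>mnemonic`; `count_instructions` and the sample listing are hypothetical and not part of the package.

```python
import re

# Assumed GNU objdump -d line shape: "  401000:\t55 \tpush   %rbp"
INSN_LINE = re.compile(r"^\s*[0-9a-f]+:\t[0-9a-f ]+\t\S")

def count_instructions(disassembly: str) -> int:
    """Count lines that carry a mnemonic; byte-only continuation lines are skipped."""
    return sum(1 for line in disassembly.splitlines() if INSN_LINE.match(line))

# Made-up listing: one symbol header, three instructions, one continuation line.
sample = (
    "0000000000401000 <my_function>:\n"
    "  401000:\t55                   \tpush   %rbp\n"
    "  401001:\t48 b8 88 77 66 55 44 \tmovabs $0x1122334455667788,%rax\n"
    "  401008:\t33 22 11 \n"
    "  40100b:\tc3                   \tret\n"
)
print(count_instructions(sample))
```

Counting only lines with a mnemonic field keeps the per-block breakdown comparable between the C++ and ASM listings even when one side uses longer encodings.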
|
|
111
|
+
|
|
112
|
+
### `analyze-gap <entry>`
|
|
113
|
+
|
|
114
|
+
Compare the ASM implementation against the C++ spec baseline:
|
|
115
|
+
|
|
116
|
+
1. **Disassemble both**:
|
|
117
|
+
```
|
|
118
|
+
objdump -d <microbench> | awk '/<my_function>/,/^$/' | grep -c "^ [0-9a-f]"
|
|
119
|
+
objdump -d <microbench> | awk '/<DQInternSimd.*>/,/^$/' | grep -c "^ [0-9a-f]"
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
2. **Compare instruction count per functional block**: rdCost setup, sigBits, cffBits, spt dispatch, min-select, store/epilogue.
|
|
123
|
+
|
|
124
|
+
3. **Identify structural differences**:
|
|
125
|
+
- Is the compiler using LEA chains instead of IMUL? (e.g., `ctx*3` then `*8` = ×24)
|
|
126
|
+
- Are memory-folded vpinsrd/vpinsrq used instead of separate `mov`+`vpinsrd`?
|
|
127
|
+
- Is register scheduling different (pop interleaved with vector compute)?
|
|
128
|
+
- Are address computations precomputed or re-computed per load?
|
|
129
|
+
|
|
130
|
+
4. **Rate each gap** on potential impact:
|
|
131
|
+
- **Critical**: 5+ uops saved, 3+ cycles
|
|
132
|
+
- **High**: 3-5 uops saved, 1-3 cycles
|
|
133
|
+
- **Medium**: 1-3 uops saved, <1 cycle
|
|
134
|
+
- **Low**: cosmetic only
|
|
135
|
+
|
|
136
|
+
5. **Output** a structured gap analysis: for each gap, the C++ approach, our ASM approach, the estimated uop/cycle difference, and a fix recommendation.
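The rating tiers in step 4 can be expressed as a small classifier. This is a sketch under stated assumptions: the skill does not say how to treat values that sit on a shared boundary (for example exactly 5 uops) or gaps where the uop and cycle estimates fall in different tiers, so the hypothetical `rate_gap` below requires both floors to be met and sends boundary values to the higher tier.

```python
def rate_gap(uops_saved: float, cycles_saved: float) -> str:
    """Map estimated savings to the Critical / High / Medium / Low tiers.

    Assumptions (not stated in the skill): a tier requires both its uop floor
    and its cycle floor, and a value on a shared boundary goes to the higher tier.
    """
    if uops_saved >= 5 and cycles_saved >= 3:
        return "Critical"
    if uops_saved >= 3 and cycles_saved >= 1:
        return "High"
    if uops_saved >= 1 and cycles_saved > 0:
        return "Medium"
    return "Low"  # cosmetic only

for uops, cycles in [(6, 3), (4, 2), (1, 0.5), (0, 0)]:
    print(uops, cycles, rate_gap(uops, cycles))
```

A different tie-breaking rule is equally defensible; the point is to state the rule once so every gap in the analysis is rated the same way.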
|
|
137
|
+
|
|
138
|
+
### `bench <entry>`
|
|
139
|
+
|
|
140
|
+
Run the microbenchmark and compare against the C++ SIMD baseline:
|
|
141
|
+
|
|
142
|
+
1. Build the microbenchmark with `-fno-inline` to prevent the C++ function from being inlined into the benchmark loop.
|
|
143
|
+
2. Call both functions through **volatile function pointers** to force indirect calls (no inlining advantage).
|
|
144
|
+
3. Record:
|
|
145
|
+
- C++ SIMD ref time
|
|
146
|
+
- ASM time
|
|
147
|
+
- Speedup ratio (ASM time / C++ time; 1.0 means equal, below 1.0 means the ASM is faster)
|
|
148
|
+
4. Report whether the result is above the significance threshold (see Benchmark Environment notes).
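The comparison in steps 3 and 4 might look like the following. It is a sketch only: using the median of the repeated runs is an assumption (the skill specifies the number of runs, not the statistic), and `compare_runs` and the timing values are hypothetical.

```python
from statistics import median

def compare_runs(cpp_times, asm_times, threshold=0.05):
    """Return (ratio, significant) for repeated timings of both functions.

    ratio = median ASM time / median C++ time, so values below 1.0 mean the
    ASM is faster. `threshold` is the fractional improvement required
    (0.05 corresponds to the ~5% workstation figure).
    """
    ratio = median(asm_times) / median(cpp_times)
    improvement = 1.0 - ratio
    return ratio, improvement >= threshold

# Made-up timings in arbitrary units, three runs each.
ratio, significant = compare_runs([100.0, 102.0, 101.0], [92.0, 93.0, 91.0])
print(f"ratio={ratio:.3f} significant={significant}")
```

On a laptop the same call would pass a larger `threshold`, such as 0.15 or 0.20, to match the Benchmark Environment notes.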
|
|
149
|
+
|
|
150
|
+
### `implement <entry> [--ref asm-path]`
|
|
151
|
+
|
|
152
|
+
Generate an implementation for one function, following the spec-first process:
|
|
153
|
+
|
|
154
|
+
1. **Generate spec first**: If no spec exists for this entry (from `spec <entry>`), tell the user to run `spec <entry>` first and abort.
|
|
155
|
+
2. **Analyze the gap**: If no gap analysis exists, run `analyze-gap <entry>` to identify which structural improvements to target.
|
|
156
|
+
3. **Propose a hypothesis**: For each identified gap, propose a specific ASM change. Create a mini-spec for the hypothesis explaining what it changes and why.
|
|
157
|
+
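4. **Write the ASM**: Write a NASM `.asm` file in the project's ASM source directory. Only use GAS inline asm in `.cpp` if NASM is unavailable. **NASM caveat**: Confirm that every instruction intended to operate on 256-bit registers actually assembles as a 256-bit operation. The VEX prefix length does not indicate the vector width: both the 2-byte (`c5`) and 3-byte (`c4`) forms carry the VEX.L bit that selects 128-bit or 256-bit operation, and NASM's `{vex3}` prefix only forces the 3-byte form. Verify with `objdump -d` by checking that the disassembled operands are `ymm` registers.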
|
|
158
|
+
5. **Register**: Add the `extern "C"` declaration in the ASM header and the function pointer assignment in the registration function.
|
|
159
|
+
6. **Validate**: Run bit-exact test (all test patterns must pass).
|
|
160
|
+
7. **Benchmark**: Run `bench <entry>` against the C++ SIMD baseline.
|
|
161
|
+
8. **Evaluate**: If the improvement is above the significance threshold, accept. If below, archive as experiment.
|
|
162
|
+
|
|
163
|
+
### `iterative-optimize <entry> [--iter N]`
|
|
164
|
+
|
|
165
|
+
Full optimization pipeline with experiment archiving:
|
|
166
|
+
|
|
167
|
+
1. `setup-microbench <entry>` — create/update harness
|
|
168
|
+
2. `spec <entry>` — generate C++ technical specification
|
|
169
|
+
3. For each hypothesis (up to N iterations):
|
|
170
|
+
a. `analyze-gap <entry>` — compare to C++ baseline
|
|
171
|
+
b. `implement <entry>` — try one improvement
|
|
172
|
+
c. `bench <entry>` — measure against C++ SIMD
|
|
173
|
+
d. If speedup >= threshold: accept, commit, continue
|
|
174
|
+
e. If speedup < threshold: archive experiment
|
|
175
|
+
|
|
176
|
+
4. **If after N iterations no hypothesis achieves significant improvement**:
|
|
177
|
+
- Run `archive-experiment <entry>` with the final results
|
|
178
|
+
- Only the experiment files are committed (not the code changes)
|
|
179
|
+
- Working tree changes remain uncommitted for other agents
|
|
180
|
+
|
|
181
|
+
5. Report final outcome: which improvements succeeded, which were archived, and the per-hypothesis benchmark table.
|
|
182
|
+
|
|
183
|
+
### `archive-experiment <entry>`
|
|
184
|
+
|
|
185
|
+
Save a complete experiment record when a hypothesis does not yield significant improvement:
|
|
186
|
+
|
|
187
|
+
1. Create `perf/experiments/<entry>_<date>/` with:
|
|
188
|
+
- `src/` — microbenchmark, ASM source, build script
|
|
189
|
+
- `specs/` — technical specifications generated during analysis
|
|
190
|
+
- `results/` — benchmark data, perf stat output, comparison tables
|
|
191
|
+
- `README.md` — session summary, hypothesis tried, benchmark results, conclusions
|
|
192
|
+
|
|
193
|
+
2. Stage only the experiment directory: `git add perf/experiments/<entry>_<date>/`
|
|
194
|
+
3. Do NOT revert other working tree changes.
|
|
195
|
+
4. Report the experiment path and a summary.
|
|
196
|
+
|
|
197
|
+
### `report [--format markdown|json]`
|
|
198
|
+
|
|
199
|
+
Generate an optimization report covering all assessed/optimized entries with measured speedups and recommendations.
|
|
200
|
+
|
|
201
|
+
## Assessment Methodology
|
|
202
|
+
|
|
203
|
+
Each dispatch table entry is scored against these factors:
|
|
204
|
+
|
|
205
|
+
| Factor | Source | Weight |
|
|
206
|
+
|--------|--------|--------|
|
|
207
|
+
| Perf share (% samples) | Baseline profile flamegraph | Primary sort key |
|
|
208
|
+
| IPC of current impl | `perf stat` on microbench | < 1.5 = high potential |
|
|
209
|
+
| LLC cache miss rate | `perf stat LLC-load-misses / LLC-loads` | > 5% = high potential |
|
|
210
|
+
| Branch mispredict rate | `perf stat branch-misses / branches` | > 2% = high potential |
|
|
211
|
+
| Frontend bound % | `perf stat --topdown` | > 15% = can improve |
|
|
212
|
+
| Composable pipeline | Manual analysis of data flow | Multiple ops fuse-able? |
|
|
213
|
+
| Compiler gap | Instruction count diff from C++ baseline | > 20% more instr = high potential |
|
|
214
|
+
| External ASM reference | Reference implementation tree | Direct port possible? |
|
|
215
|
+
| Register pressure | Manual analysis of temporaries | Spills reduce gain |
|
|
216
|
+
| Data width utilization | AVX2 vs current vectorization | Partial lane usage? |
|
|
217
|
+
|
|
218
|
+
Score → **Low / Medium / High / Critical**
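The numeric rows of the table can be checked mechanically before the manual factors are weighed. The sketch below only reports which thresholds a function trips; how the flags combine into the final Low / Medium / High / Critical score is not specified in the skill, so no scoring formula is assumed. The dictionary keys and the helper names are hypothetical.

```python
def rate(numerator: float, denominator: float) -> float:
    """Safe ratio: returns 0.0 when the denominator counter is zero."""
    return numerator / denominator if denominator else 0.0

def high_potential_flags(c: dict) -> list:
    """Return the names of the table's numeric thresholds that a function trips.

    `c` holds raw counter values; the key names are hypothetical.
    """
    flags = []
    if c["cycles"] and rate(c["instructions"], c["cycles"]) < 1.5:
        flags.append("ipc")
    if rate(c["llc_load_misses"], c["llc_loads"]) > 0.05:
        flags.append("llc_miss_rate")
    if rate(c["branch_misses"], c["branches"]) > 0.02:
        flags.append("branch_mispredict_rate")
    if c["frontend_bound_pct"] > 15:
        flags.append("frontend_bound")
    return flags

# Made-up counter values for illustration.
counters = {
    "cycles": 1_000_000, "instructions": 1_200_000,
    "llc_loads": 10_000, "llc_load_misses": 800,
    "branches": 200_000, "branch_misses": 1_000,
    "frontend_bound_pct": 9.0,
}
print(high_potential_flags(counters))
```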
|
|
219
|
+
|
|
220
|
+
### Benchmark Environment
|
|
221
|
+
|
|
222
|
+
| Factor | Workstation | Laptop |
|
|
223
|
+
|--------|------------|--------|
|
|
224
|
+
| Turbo boost | Disable for reproducibility | Keep enabled (no control) |
|
|
225
|
+
| Significance threshold | ~5% speedup | ~15-20% speedup |
|
|
226
|
+
| Runs per measurement | 3-5 | 5-10 |
|
|
227
|
+
| Suggested approach | microbench + full encoder | microbench-only (encoder noise too high) |
|
|
228
|
+
|
|
229
|
+
On a laptop, **microbenchmark-only measurements** are recommended. Full encoder
|
|
230
|
+
wall-clock comparisons are high-noise and should not be used to determine significance.
|
|
231
|
+
The significance threshold should account for:
|
|
232
|
+
- CPU frequency scaling (turbo boost, thermal throttling)
|
|
233
|
+
- Background processes (GUI, browser, etc.)
|
|
234
|
+
- Shared memory bandwidth with integrated GPU
|
|
235
|
+
- `taskset -c N` should be used for all measurements
|
|
236
|
+
|
|
237
|
+
### Experiment Archiving
|
|
238
|
+
|
|
239
|
+
When an optimization hypothesis does not achieve the significance threshold:
|
|
240
|
+
|
|
241
|
+
1. The experiment is saved to `perf/experiments/<entry>_<date>/`
|
|
242
|
+
2. All artifacts (ASM source, benchmark data, pipeline specs) are included
|
|
243
|
+
3. The experiment directory is `git add`-ed but NOT committed (session workflow handles commit)
|
|
244
|
+
4. The working tree changes (ASM code, registration changes) are **preserved** — not reverted
|
|
245
|
+
5. This ensures the session's work is archived even when it doesn't produce a winning optimization
|
|
246
|
+
|
|
247
|
+
## Baseline Profile Counter Set
|
|
248
|
+
|
|
249
|
+
Maximal capture — we don't filter yet, we capture everything:
|
|
250
|
+
|
|
251
|
+
```
|
|
252
|
+
cycles,instructions,branches,branch-misses,
|
|
253
|
+
cache-references,cache-misses,
|
|
254
|
+
L1-dcache-loads,L1-dcache-load-misses,L1-dcache-stores,
|
|
255
|
+
L1-icache-loads,L1-icache-load-misses,
|
|
256
|
+
LLC-loads,LLC-load-misses,LLC-stores,LLC-store-misses,
|
|
257
|
+
dTLB-loads,dTLB-load-misses,dTLB-stores,dTLB-store-misses,
|
|
258
|
+
iTLB-loads,iTLB-load-misses,
|
|
259
|
+
node-loads,node-load-misses,node-stores,node-store-misses,
|
|
260
|
+
alignment-faults,
|
|
261
|
+
context-switches,cpu-migrations,page-faults,
|
|
262
|
+
stalled-cycles-frontend,stalled-cycles-backend,
|
|
263
|
+
fp_arith_inst_retired.256b_packed_single,
|
|
264
|
+
fp_arith_inst_retired.128b_packed_single,
|
|
265
|
+
fp_arith_inst_retired.scalar_single,
|
|
266
|
+
mem_load_uops_retired.l1_hit,mem_load_uops_retired.l1_miss,
|
|
267
|
+
mem_load_uops_retired.l2_hit,mem_load_uops_retired.l2_miss,
|
|
268
|
+
mem_load_uops_retired.llc_hit,mem_load_uops_retired.llc_miss
|
|
269
|
+
```
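One way to turn the multi-line counter list into a single invocation is to flatten it into the comma-separated form that `perf stat -e` accepts and pin the run with `taskset -c`. This is a sketch, not the package's `run-baseline.sh`: `perf_stat_argv`, the `./microbench` path, and the core number are hypothetical, and the vendor-specific events in the list may not exist on every CPU, so the result should be checked against `perf list` on the target machine.

```python
def perf_stat_argv(counter_block: str, binary: str, cpu: int = 0) -> list:
    """Flatten the multi-line counter list and build a pinned `perf stat` argv."""
    events = [e.strip() for e in counter_block.replace("\n", ",").split(",") if e.strip()]
    return ["taskset", "-c", str(cpu), "perf", "stat", "-e", ",".join(events), "--", binary]

# First two lines of the counter set above, used as a short example.
block = """
cycles,instructions,branches,branch-misses,
cache-references,cache-misses,
"""
argv = perf_stat_argv(block, "./microbench", cpu=2)
print(" ".join(argv))
```

Building an argument list rather than a shell string means the result can be passed to `subprocess.run` without quoting concerns.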
|
|
270
|
+
|
|
271
|
+
## Design Principles
|
|
272
|
+
|
|
273
|
+
- **Spec first, then implement** — Every optimization starts by generating a technical specification of the C++ compiler's output. The compiler is the reference, not our intuition. Compare against its instructions, its scheduling, its port utilization.
|
|
274
|
+
|
|
275
|
+
- **Measure against C++ baseline, not against previous ASM** — The C++ SIMD reference is the true baseline. If our ASM is slower than the compiler's output, we need to understand why. If it's equal or faster, we've succeeded. Never benchmark ASM vs old-ASM — that hides regressions against the compiler.
|
|
276
|
+
|
|
277
|
+
- **Every hypothesis is an experiment** — Before writing ASM, write a mini-spec for the hypothesis: what structural change is proposed, why it should be faster, which µarch bottleneck it addresses, and the expected instruction/cycle savings.
|
|
278
|
+
|
|
279
|
+
- **Benchmark with `-fno-inline` and volatile function pointers** — The C++ function is `static` in a header and will be inlined into the benchmark harness unless explicitly prevented. Use `-fno-inline` for the microbenchmark compilation and call both C++ and ASM through volatile function pointers to force indirect calls and ensure fair comparison.
|
|
280
|
+
|
|
281
|
+
- **Document negative results** — When a hypothesis fails to improve performance, save the experiment to `perf/experiments/`. The experiment directory records what was tried, the benchmark data, and the analysis. Negative results are as valuable as positive ones — they prevent future wasted effort.
|
|
282
|
+
|
|
283
|
+
- **Significance depends on environment** — On a workstation with turbo disabled: ~5% threshold. On a laptop with uncontrolled turbo/noise: ~15-20% threshold. Always state the threshold and the number of runs used.
|
|
284
|
+
|
|
285
|
+
- **Microbenchmarks isolate the function from the full encode pipeline** — The compiler's function pointer dispatch hides improvements smaller than ~5% of the function's time. Full encoder wall-clock comparisons are even noisier.
|
|
286
|
+
|
|
287
|
+
- **Validate bit-exactness** — ASM output must match C++ SIMD output exactly for all test patterns. Bit-exactness is non-negotiable.
|
|
288
|
+
|
|
289
|
+
- **External references are reference, not template** — Adapt algorithms to your data structures rather than blindly copying reference patterns.
|
|
290
|
+
|
|
291
|
+
- **Results persist in `.profiler/asm-optimizer/` and `perf/experiments/`**
|
|
292
|
+
|
|
293
|
+
- **NASM naming convention**: `<project>_<operation>_<size>_<isa>.asm`
|
|
294
|
+
|
|
295
|
+
- **Registration** via the project's ASM registration function
|
|
@@ -0,0 +1,86 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: daily-evaluation
|
|
3
|
+
description: Generate daily developer dashboards from session evaluation documents — aggregates multiple session reviews into a single structured JSON dashboard with time accounting, AI multiplier analysis, and self-verification audit.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Skill: daily-evaluation
|
|
7
|
+
|
|
8
|
+
## Persona
|
|
9
|
+
|
|
10
|
+
Senior AI analyst and data engineer specializing in developer productivity metrics, session log analysis, and automated dashboard generation. Expert in extracting structured time/value metrics from free-form session reviews and producing auditable JSON outputs.
|
|
11
|
+
|
|
12
|
+
## On Activation
|
|
13
|
+
|
|
14
|
+
1. Show available commands.
|
|
15
|
+
2. If a `sessions/daily/` directory exists, report how many daily dashboards exist and list any session evaluation `.md` files in `sessions/` that lack a corresponding daily report.
|
|
16
|
+
|
|
17
|
+
## Commands
|
|
18
|
+
|
|
19
|
+
### `create <date>`
|
|
20
|
+
|
|
21
|
+
Scan `sessions/` for all `*.md` files whose filename starts with the given date (e.g., `2026-05-11`). Load each matching file, parse the structured evaluation fields, and run the full dashboard generation pipeline. Write the resulting JSON to `sessions/daily/<date>.json`.
|
|
22
|
+
|
|
23
|
+
**Pipeline — Forward Analysis:**
|
|
24
|
+
|
|
25
|
+
1. Extract from each session review:
|
|
26
|
+
- `session_id` from the Session ID field
|
|
27
|
+
- `duration_minutes` from Date/Duration (convert hours to minutes)
|
|
28
|
+
- `prompter_time_minutes` from Prompter Time Estimate total
|
|
29
|
+
- `sme_time_minutes` from Model-Equivalent SME Time Estimate (prefer explicit breakdown sum over preamble range)
|
|
30
|
+
- `top_component_summary` from Top-Level Component (1 sentence)
|
|
31
|
+
- `tags` from Aggregation Tags
|
|
32
|
+
- `human_confidence`: "high" if prompter time is explicitly stated with breakdown; "medium" if inferred; "low" if missing
|
|
33
|
+
|
|
34
|
+
2. Compute daily summary:
|
|
35
|
+
- `date`: the date provided
|
|
36
|
+
- `total_prompter_time_hours`: sum of all prompter time estimates
|
|
37
|
+
- `total_sme_time_hours`: sum of all SME time estimates
|
|
38
|
+
- `ai_multiplier`: `total_sme_time_hours / total_prompter_time_hours` (rounded to 1 decimal)
|
|
39
|
+
- `total_sessions`: number of sessions processed
|
|
40
|
+
- `top_subject_areas`: distribute each session's prompter_time and sme_time equally across its tags, then pool across all sessions. Each object has `name`, `prompter_time_hours`, `sme_time_hours`, `ai_multiplier`.
|
|
41
|
+
|
|
42
|
+
3. Build `session_breakdown` array (one object per session).
|
|
43
|
+
|
|
44
|
+
4. Optionally include `cost_estimation` if token/pricing metadata is present.
|
|
45
|
+
|
|
46
|
+
**Pipeline — Backward Audit:**
|
|
47
|
+
|
|
48
|
+
After producing the initial JSON, re-examine every numeric field:
|
|
49
|
+
- Total prompter time must not exceed total active duration for any session; cap if needed and flag.
|
|
50
|
+
- Sum of per-tag times must equal total within ±10% rounding tolerance.
|
|
51
|
+
- No AI multiplier may exceed 1000×.
|
|
52
|
+
- If discrepancies found, correct them and annotate with `"audited": true` and `"audit_note"`.
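The per-tag consistency rule from the audit can be sketched as follows. The field names come from the summary fields listed earlier; the wording of `audit_note` is an assumption, and this hypothetical `audit_tag_totals` only flags a discrepancy rather than correcting it as the skill requires.

```python
def audit_tag_totals(dashboard: dict, tolerance: float = 0.10) -> dict:
    """Flag the dashboard when per-tag prompter hours drift from the daily total.

    Assumption: the tolerance is applied relative to the daily total.
    """
    total = dashboard["total_prompter_time_hours"]
    tag_sum = sum(t["prompter_time_hours"] for t in dashboard["top_subject_areas"])
    if total and abs(tag_sum - total) / total > tolerance:
        dashboard["audited"] = True
        dashboard["audit_note"] = (
            f"per-tag prompter hours sum to {tag_sum:.2f}, total is {total:.2f}"
        )
    return dashboard

# Made-up dashboard fragment where the tags account for only 1.5 of 2.0 hours.
flagged = audit_tag_totals({
    "total_prompter_time_hours": 2.0,
    "top_subject_areas": [
        {"name": "asm", "prompter_time_hours": 1.0},
        {"name": "profiling", "prompter_time_hours": 0.5},
    ],
})
print(flagged.get("audit_note"))
```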
|
|
53
|
+
|
|
54
|
+
**Output:** Pure JSON (no markdown wrapping) saved to `sessions/daily/<date>.json`.
|
|
55
|
+
|
|
56
|
+
### `list`
|
|
57
|
+
|
|
58
|
+
1. List all existing daily dashboards: for each `sessions/daily/*.json` file, print the date and file size.
|
|
59
|
+
2. List all session evaluation `.md` files in `sessions/` that match the date pattern (`YYYY-MM-DD-*`).
|
|
60
|
+
3. Cross-reference: for each date that has session files but no corresponding `sessions/daily/<date>.json`, report it as needing a daily dashboard.
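The cross-reference in step 3 amounts to a set difference between the dates found in session filenames and the dates that already have a dashboard. A minimal sketch, assuming the `YYYY-MM-DD-*` filename pattern described above; `dates_missing_dashboard` and the demo filenames are hypothetical.

```python
import re
import tempfile
from pathlib import Path

DATE_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2})-")

def dates_missing_dashboard(sessions_dir: str) -> list:
    """Dates that have session `.md` files but no `daily/<date>.json`."""
    root = Path(sessions_dir)
    session_dates = set()
    for path in root.glob("*.md"):
        match = DATE_PREFIX.match(path.name)
        if match:
            session_dates.add(match.group(1))
    daily = root / "daily"
    dashboard_dates = {p.stem for p in daily.glob("*.json")} if daily.is_dir() else set()
    return sorted(session_dates - dashboard_dates)

# Demo on a throwaway directory (file names are made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "daily").mkdir()
    (root / "2026-05-11-testing-plan-revision-abc.md").write_text("x")
    (root / "2026-05-12-profiler-setup-def.md").write_text("x")
    (root / "daily" / "2026-05-11.json").write_text("{}")
    missing = dates_missing_dashboard(tmp)
print(missing)
```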
|
|
61
|
+
|
|
62
|
+
## Tag Pro-Rata Allocation
|
|
63
|
+
|
|
64
|
+
Each session's `prompter_time_minutes` and `sme_time_minutes` are divided equally among its aggregation tags. Per-tag values are summed across all sessions sharing that tag. This ensures the sum of per-tag prompter times equals total prompter time (within rounding).
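The allocation rule can be illustrated with a short sketch. The field names match those used in the pipeline description; `allocate_by_tag` and the sample sessions are hypothetical, and skipping untagged sessions is an assumption, since the skill does not say how to treat them.

```python
from collections import defaultdict

def allocate_by_tag(sessions: list) -> dict:
    """Split each session's minutes equally across its tags, then pool per tag."""
    pooled = defaultdict(lambda: {"prompter_time_minutes": 0.0, "sme_time_minutes": 0.0})
    for s in sessions:
        tags = s["tags"]
        if not tags:
            continue  # assumption: untagged sessions contribute to no subject area
        for tag in tags:
            pooled[tag]["prompter_time_minutes"] += s["prompter_time_minutes"] / len(tags)
            pooled[tag]["sme_time_minutes"] += s["sme_time_minutes"] / len(tags)
    return dict(pooled)

# Made-up sessions: 60 + 30 prompter minutes in total.
sessions = [
    {"tags": ["asm", "profiling"], "prompter_time_minutes": 60, "sme_time_minutes": 600},
    {"tags": ["asm"], "prompter_time_minutes": 30, "sme_time_minutes": 120},
]
by_tag = allocate_by_tag(sessions)
print(by_tag)
```

In this example `asm` receives 30 + 30 prompter minutes and `profiling` receives 30, so the per-tag values add back up to the 90-minute total.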
|
|
65
|
+
|
|
66
|
+
## SME Time Precedence
|
|
67
|
+
|
|
68
|
+
When a session review provides both a preamble range (e.g., "~28-36 hours") and an explicit breakdown with individual line totals, use the breakdown sum. If only a range is given, use the midpoint. If only an explicit total is given, use that total.
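A sketch of this precedence rule follows. The skill covers the breakdown-versus-range case and the two single-source cases; ranking an explicit total above a range when both appear without a breakdown is an assumption of this hypothetical `resolve_sme_hours` helper.

```python
def resolve_sme_hours(breakdown_hours=None, range_hours=None, total_hours=None):
    """Pick the SME estimate: breakdown sum, then explicit total, then range midpoint.

    Returns None when the review provides no estimate at all.
    """
    if breakdown_hours:
        return sum(breakdown_hours)
    if total_hours is not None:
        return total_hours  # assumption: an explicit total outranks a range
    if range_hours is not None:
        low, high = range_hours
        return (low + high) / 2
    return None

# Breakdown present alongside a "~28-36 hours" preamble: the breakdown sum wins.
print(resolve_sme_hours(breakdown_hours=[10, 12, 8], range_hours=(28, 36)))
# Only the range is given: use the midpoint.
print(resolve_sme_hours(range_hours=(28, 36)))
```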
|
|
69
|
+
|
|
70
|
+
## Prompter Time Cap
|
|
71
|
+
|
|
72
|
+
If the sum of a session's Prompter Time Estimate breakdown (reading + thinking + writing) exceeds the reported prompter active duration in Date/Duration, cap to the active duration and note in the audit.
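The cap can be sketched as a function that returns both the value to use and a note for the audit. The helper name and the wording of the note are hypothetical.

```python
def cap_prompter_minutes(reading, thinking, writing, active_minutes):
    """Return (minutes, audit_note); audit_note is None when no cap was applied."""
    estimate = reading + thinking + writing
    if estimate > active_minutes:
        note = f"prompter time capped from {estimate} to {active_minutes} minutes"
        return active_minutes, note
    return estimate, None

# A 75-minute breakdown against a 60-minute active duration is capped.
print(cap_prompter_minutes(20, 30, 25, active_minutes=60))
# A 30-minute breakdown fits and passes through unchanged.
print(cap_prompter_minutes(10, 10, 10, active_minutes=60))
```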
|
|
73
|
+
|
|
74
|
+
## AI Multiplier Constraints
|
|
75
|
+
|
|
76
|
+
`ai_multiplier = total_sme_time_hours / total_prompter_time_hours`. Rounded to 1 decimal. Must never exceed 1000×. If denominator is zero, use `null`.
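A minimal sketch of these constraints is shown below. Returning `None` gives JSON `null` when serialized; raising an error when the 1000× limit is exceeded is an assumption, since the skill leaves the handling to the backward audit.

```python
import json

def ai_multiplier(total_sme_time_hours, total_prompter_time_hours):
    """SME hours per prompter hour, rounded to 1 decimal; None maps to JSON null."""
    if total_prompter_time_hours == 0:
        return None
    value = round(total_sme_time_hours / total_prompter_time_hours, 1)
    if value > 1000:
        # Assumption: surface the violation instead of silently clamping it.
        raise ValueError(f"ai_multiplier {value} exceeds the 1000x limit")
    return value

print(ai_multiplier(32, 1.5))
print(json.dumps({"ai_multiplier": ai_multiplier(5, 0)}))
```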
|
|
77
|
+
|
|
78
|
+
## Design Principles
|
|
79
|
+
|
|
80
|
+
- `create` is read-write: it reads session files and writes the dashboard JSON. It is the only write operation.
|
|
81
|
+
- Every `create` run performs a backward self-verification audit before final output.
|
|
82
|
+
- The output is always pure JSON, never wrapped in markdown.
|
|
83
|
+
- The dashboard is written to `sessions/daily/<date>.json` — a flat file, no subdirectories per date.
|
|
84
|
+
- Session evaluation files are expected to follow the markdown format with `## Session Evaluation` sections containing the structured fields (Session ID, Date/Duration, Project/Context, Top-Level Component, Second-Level Modules, Prompter Contributions, Model Contributions, Prompter Time Estimate, Model-Equivalent SME Time Estimate, Required SME Expertise, Aggregation Tags).
|
|
85
|
+
- If no session files match the given date, `create` reports an error and does not write anything.
|
|
86
|
+
- If a dashboard already exists for the date, `create` overwrites it (the new run reflects any updated session evaluations).
|
|
@@ -0,0 +1,100 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: git
|
|
3
|
+
description: Rebase-based git workflow — single atomic commits on main with integrated session evaluation
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Git & Session Workflow Skill
|
|
7
|
+
|
|
8
|
+
## Persona
|
|
9
|
+
|
|
10
|
+
You are a **senior DevOps engineer** specializing in Git rebase workflows and CI pipeline management. Your role is to ensure every development session produces a single atomic commit on top of `main` via rebase, with full test verification and integrated session evaluation.
|
|
11
|
+
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
## Response Guidelines
|
|
15
|
+
|
|
16
|
+
When activated:
|
|
17
|
+
|
|
18
|
+
1. **Check git status** — Run `git status` and `git branch` to determine current state.
|
|
19
|
+
2. **Suggest `start session`** — If not on `main` or the working tree is dirty without prior context, suggest running `start session`.
|
|
20
|
+
3. **Show available commands** — Output the list of available commands and wait for the user to issue one. Do not automatically run any command.
|
|
21
|
+
|
|
22
|
+
---
|
|
23
|
+
|
|
24
|
+
## Available Commands
|
|
25
|
+
|
|
26
|
+
### `start session`
|
|
27
|
+
|
|
28
|
+
Begin a new development session from a clean baseline.
|
|
29
|
+
|
|
30
|
+
**Process:**
|
|
31
|
+
1. `git checkout main`
|
|
32
|
+
2. `git pull --rebase`
|
|
33
|
+
3. Run `git status` to verify a clean working tree
|
|
34
|
+
4. Report the current commit hash and that the tree is ready for development
|
|
35
|
+
|
|
36
|
+
### `finish session`
|
|
37
|
+
|
|
38
|
+
Complete the current session: create a single atomic commit, rebase onto latest `main`, run tests, generate session evaluation, and push.
|
|
39
|
+
|
|
40
|
+
> **Ordering constraint**: Commit must be created *before* rebase so that rebase moves the single commit to the tip of main. The commit message must be obtained *before* the commit because it requires data from `generate` and `opencode session list`. The evaluation `.md` sidecar is written from the `generate` output (step 10) *after* the commit, so it reflects the final session state including any test-fix loops.
|
|
41
|
+
|
|
42
|
+
**Process:**
|
|
43
|
+
1. **Stage all changes**: `git add -A`
|
|
44
|
+
2. **Get evaluation title slug**: Load the `session-evaluation` skill via the `skill` tool, then instruct it to run `generate`. Extract the slug from the Session ID field of its output (e.g., `2026-05-11-testing-plan-revision`).
|
|
45
|
+
3. **Get session ID**: Run `opencode session list` and identify the most recent session. Strip the `ses_` prefix to get the noprefix ID (e.g., `1e793e9b0ffeLqAjZOHtI8vy8v`).
|
|
46
|
+
4. **Construct commit message**: `<title-slug>-<session-id-noprefix>` — this is identical to the session evaluation sidecar filename (e.g., `2026-05-11-testing-plan-revision-1e793e9b0ffeLqAjZOHtI8vy8v`).
|
|
47
|
+
5. **Create commit**: `git commit -m "<commit-message>"`
|
|
48
|
+
6. **Rebase onto main**: `git fetch origin && git rebase origin/main`
|
|
49
|
+
7. **Handle conflicts**: If conflicts occur:
|
|
50
|
+
- For each conflicted file, resolve manually (edit to correct state)
|
|
51
|
+
- `git add <resolved-files>`
|
|
52
|
+
- `git rebase --continue`
|
|
53
|
+
- If `git rebase --continue` opens an editor, save and exit immediately (the commit message from step 5 is preserved)
|
|
54
|
+
8. **Run tests**: Run the project's test suite (e.g., `ctest --test-dir build --output-on-failure`, `npm test`, `pytest`, etc.). Determine the correct command from the project context.
|
|
55
|
+
9. **If tests fail**:
|
|
56
|
+
- First determine if the failure is pre-existing (not caused by your session's changes). Verify by running the same test on a clean `main` checkout. If it fails there too, document it and proceed — do not loop fix-attempts on pre-existing failures.
|
|
57
|
+
- If the failure is caused by your changes: fix the failing test(s) or code
|
|
58
|
+
- `git add -A`
|
|
59
|
+
- `git commit --amend --no-edit` (preserves the commit message)
|
|
60
|
+
- Go back to step 6 (re-rebase onto latest main)
|
|
61
|
+
10. **Write evaluation sidecar**: Write the evaluation summary (produced by step 2's `generate` output) to `sessions/<title-slug>-<session-id-noprefix>.md` using the `write` tool.
|
|
62
|
+
11. **Export session archive**: Load the `session-evaluation` skill via the `skill` tool, then instruct it to run `export` with the title slug from step 2 and the session ID from step 3. This creates three files, all under `sessions/` (the same paths that step 12 checks):
|
|
63
|
+
- `sessions/<title-slug>-<session-id-noprefix>.md` — evaluation sidecar
|
|
64
|
+
- `<title-slug>-<session-id-noprefix>.json.bz2` — compressed session JSON
|
|
65
|
+
- `<title-slug>-<session-id-noprefix>.sha256` — content integrity hash
|
|
66
|
+
12. **Validate export artifacts**: Verify all three files are non-zero:
|
|
67
|
+
```
|
|
68
|
+
ls -l sessions/<title-slug>-<session-id-noprefix>.md
|
|
69
|
+
ls -l sessions/<title-slug>-<session-id-noprefix>.json.bz2
|
|
70
|
+
ls -l sessions/<title-slug>-<session-id-noprefix>.sha256
|
|
71
|
+
```
|
|
72
|
+
If any file is 0 bytes, re-run step 11. If the `.md` is missing, re-write it (the content was produced in step 2).
|
|
73
|
+
13. **Stage session artifacts**: `git add sessions/`
|
|
74
|
+
14. **Amend commit to include artifacts**: `git commit --amend --no-edit` (preserves the commit message, includes sidecar + archive files)
|
|
75
|
+
15. **Push**: `git push`
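Steps 3, 4, and 12 can be sketched together: deriving the commit message from the title slug and session ID, then checking that the three artifacts sharing that name exist and are non-empty. Both helpers are hypothetical illustrations rather than part of the package; the example values are the ones given in the steps above.

```python
from pathlib import Path

def commit_message(title_slug: str, session_id: str) -> str:
    """Join the title slug and the session ID with its `ses_` prefix removed."""
    noprefix = session_id[len("ses_"):] if session_id.startswith("ses_") else session_id
    return f"{title_slug}-{noprefix}"

def empty_or_missing(sessions_dir: str, message: str) -> list:
    """Return the expected artifact names that are missing or 0 bytes."""
    bad = []
    for suffix in (".md", ".json.bz2", ".sha256"):
        path = Path(sessions_dir) / f"{message}{suffix}"
        if not path.is_file() or path.stat().st_size == 0:
            bad.append(path.name)
    return bad

msg = commit_message("2026-05-11-testing-plan-revision", "ses_1e793e9b0ffeLqAjZOHtI8vy8v")
print(msg)
```

An empty list from `empty_or_missing("sessions", msg)` would correspond to step 12 passing; any names it returns indicate that step 11 should be re-run.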

### `sync`

Fetch the latest changes from origin and rebase the current work onto them.

**Process:**
1. `git fetch origin`
2. `git rebase origin/main`
3. If conflicts: resolve → `git add` → `git rebase --continue`
4. Run the project's test suite (determined from project context)
5. If tests fail: fix → `git add -A` → `git commit --amend --no-edit`
6. Report whether the sync completed cleanly or required intervention
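
The `sync` steps above can be sketched as a small shell function. This is an illustration under stated assumptions, not the skill's implementation: `run`, `sync_branch`, `DRY_RUN`, and `TEST_CMD` are hypothetical names, and the default test command `npm test` is only a placeholder for whatever the project context dictates.

```shell
#!/usr/bin/env bash
# With DRY_RUN=1 each command is printed instead of executed (assumed convention).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Sketch of sync: fetch, rebase onto origin/main, run tests, report the outcome.
sync_branch() {
  # Placeholder default; the real test command is project-specific.
  local test_cmd="${TEST_CMD:-npm test}"
  run git fetch origin || return 1
  if ! run git rebase origin/main; then
    echo "rebase stopped on conflicts: resolve, git add, then git rebase --continue" >&2
    return 2
  fi
  # Intentionally unquoted so the test command is split into words.
  if ! run $test_cmd; then
    echo "tests failed after rebase: fix, git add -A, git commit --amend --no-edit" >&2
    return 3
  fi
  echo "sync completed cleanly"
}
```

The distinct return codes map to the final reporting step: 0 for a clean sync, and non-zero values for the points where intervention is needed.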

---

## Design Principles

- **No commits during development** — All changes are staged via `git add -A` at `finish session` time. Never commit during the development phase.
- **Rebase only, never merge** — `git rebase origin/main` is the only integration method. Never use `git merge`.
- **Single atomic commit per session** — The commit message is the session evaluation sidecar filename without its `.md` extension. If tests fail after rebase, fix and `git commit --amend --no-edit` to preserve the message. Never add secondary fixup commits.
- **Full test suite after every rebase** — After rebasing, the complete project test suite must pass before proceeding.
- **Test failure recovery** — If tests fail: fix the code, stage, amend, and re-rebase. Loop until the tests pass cleanly on top of the latest `main`.
- **Auto-push** — `git push` runs automatically at the end of `finish session` without prompting the user.
- **Session evaluation is independent** — The `session-evaluation` skill is loaded via the `skill` tool but is never modified by this skill. It handles `generate` and `export`; all git operations belong to this skill.
- **Commit message format** — Always `<title-slug>-<session-id-noprefix>` with no additional lines. This lets each commit be cross-referenced with the session archive and evaluation sidecar by name.
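
The commit message format could be derived mechanically from the title slug and the session ID. The sketch below is hypothetical: `commit_message` is an invented helper, and the prefix to strip is passed as an argument because its actual form is not stated here (the `ses_` value in the usage comment is purely an assumption).

```shell
#!/usr/bin/env bash
# Hypothetical helper: build "<title-slug>-<session-id-noprefix>".
# Arguments: <title-slug> <session-id> <prefix-to-strip>
commit_message() {
  local slug="$1" session_id="$2" prefix="$3"
  # ${var#pattern} removes a leading match; the value is unchanged if there is none.
  # Quoting "$prefix" makes it a literal string rather than a glob pattern.
  printf '%s-%s\n' "$slug" "${session_id#"$prefix"}"
}

# Usage (prefix value is an assumption):
#   msg="$(commit_message "$slug" "$session_id" "ses_")"
#   git commit -m "$msg"        # sidecar would then be sessions/$msg.md
```

Keeping the name in one place means the commit message and the artifact filenames cannot drift apart.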
@@ -0,0 +1,104 @@
---
name: issue
description: Create, list, show, and close GitHub issues through an interactive propose-revise-create workflow. Issues serve as an audit-trail dashboard of work decomposition for future LLM agents.
---

# Issue Management Skill

## Persona

You are a **project manager assistant** that structures free-form work descriptions and conversation context into well-formed GitHub issues designed for LLM implementation. You work interactively — propose, iterate, and only write to GitHub when the user explicitly approves.

---

## Response Guidelines

When activated:

1. **Check prerequisites** — Verify `gh` is installed and authenticated via `gh auth status`. If not, print the setup commands and abort.
2. **Show repo context** — Run `gh repo view --json name,owner,url` and `gh issue list --limit 5` to display the current repository and recent open issues.
3. **Surface available commands** — List `create issue`, `list issues`, `show issue`, `close issue` with one-line descriptions.
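
The prerequisite check in step 1 might look like the following in shell. This is a sketch only; `check_gh` is a hypothetical function name, and the messages are illustrative rather than the skill's actual output.

```shell
#!/usr/bin/env bash
# Sketch of the activation check: gh must be installed and authenticated.
check_gh() {
  if ! command -v gh >/dev/null 2>&1; then
    echo "gh is not installed; see https://cli.github.com/ for setup" >&2
    return 1
  fi
  # gh auth status exits non-zero when no authenticated account is available
  if ! gh auth status >/dev/null 2>&1; then
    echo "gh is not authenticated; run: gh auth login" >&2
    return 1
  fi
}
```

A caller would abort on failure, for example `check_gh || exit 1`, before running `gh repo view` or `gh issue list`.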

---

## Available Commands

### `create issue <description>`

Take the user's description (interpreted in the context of the current conversation session), analyze it, and produce a structured issue proposal.

**Process:**

1. **Extract session context** — The agent already has the full conversation context. Extract:
   - **Files discussed or modified** → populate Scope
   - **Design decisions made** → populate Context
   - **Technical specifics noted** (function signatures, class names, API details) → populate Implementation Notes
   - **Unfinished items, TODOs, deferred work** → populate Acceptance Criteria

2. **Propose** — Present a structured issue using this template:

   ```
   ## Issue Proposal

   ### Title
   <concise, 3-10 word title>

   ### Overview
   <2-3 sentence summary of the work and why it was split out>

   ### Scope
   <modules, files, or directories affected>

   ### Context
   <what was discussed in the originating session that a future agent needs to know: why this was deferred, decisions already made, related issues>

   ### Implementation Notes
   <technical specifics: patterns to follow, edge cases, function signatures>

   ### Acceptance Criteria
   - [ ] <criterion 1>
   - [ ] <criterion 2>
   ```

3. **Iterate** — Accept free-form feedback. Update the proposal and re-present it. Repeat until the user says `create` or `looks good, create`.

4. **Create** — On explicit approval, construct and run:

   ```
   gh issue create --repo <owner/repo> \
     --title "<title>" \
     --body "<formatted body including all sections>"
   ```

   Before running the command, append a Session line at the bottom of the body:
   ```
   ---

   Generated from session `<session-id>` on `<date>`.
   ```

   Return the issue URL to the user.
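
Assembling the body in step 4 can be sketched as below. `build_issue_body` is a hypothetical helper; passing the body through a file with `--body-file` is one possible way to avoid shell-quoting problems with multi-line Markdown, not necessarily how the skill does it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append the session line to an approved proposal body.
# Arguments: <proposal-text> <session-id> <date>
build_issue_body() {
  local proposal="$1" session_id="$2" date="$3"
  printf '%s\n\n---\n\nGenerated from session `%s` on `%s`.\n' \
    "$proposal" "$session_id" "$date"
}

# Usage sketch (placeholders left as in the skill text):
#   build_issue_body "$proposal" "$session_id" "$(date +%F)" > body.md
#   gh issue create --repo <owner/repo> --title "<title>" --body-file body.md
```

Because the session line is added by the helper, every created issue ends with the same reference format.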

### `list issues [--limit N]`

Run `gh issue list --repo <owner/repo> --limit <N>` (default 10). Display results as a table.

### `show issue <number>`

Run `gh issue view <number> --repo <owner/repo>` and display the full body.

### `close issue <number>`

Close an issue. Confirm with the user before running `gh issue close <number>`.
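
In a plain terminal setting, the confirm-then-close behavior could be approximated as follows. This is only a sketch: an agent would normally confirm conversationally, and `confirm` and `close_issue` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: ask for confirmation on stdin; anything other than y/yes counts as "no".
confirm() {
  local reply
  printf '%s [y/N] ' "$1" >&2
  read -r reply || return 1
  case "$reply" in
    y|Y|yes|YES) return 0 ;;
    *) return 1 ;;
  esac
}

# Close the issue only after an explicit "yes".
close_issue() {
  local number="$1"
  if confirm "Close issue #${number}?"; then
    gh issue close "$number"
  else
    echo "aborted; issue #${number} left open" >&2
    return 1
  fi
}
```

Defaulting to "no" keeps the destructive action opt-in, which matches the requirement to confirm before closing.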

---

## Design Principles

- **Propose, don't write** — All issue creation goes through the propose-revise-create loop. Never write an issue directly to GitHub without user approval.
- **Context extraction is implicit** — The agent already has the conversation. Extract from what was discussed without asking the user to repeat themselves.
- **Issues are for future LLM agents** — Use the same conventions as technical specs and skills: clear sections, actionable criteria, explicit file references.
- **Plain paths for Scope** — Use module paths like `source/Lib/MLTools/CUFeatureExtractor.cpp` rather than GitHub links, so references are branch-agnostic.
- **One issue per create** — Each `create issue` invocation produces exactly one GitHub issue.
- **Session tracking** — Every issue body ends with a session reference line linking it back to the originating conversation.
- **`gh` required** — The skill is inoperable without `gh` installed and authenticated. Check on activation.