@hanzlaa/rcode 2.8.0 → 3.2.0

Files changed (122)
  1. package/AGENTS.md +11 -1
  2. package/CONTRIBUTING.md +7 -0
  3. package/README.md +39 -20
  4. package/cli/install.js +145 -47
  5. package/dist/rcode.js +134 -43
  6. package/package.json +2 -2
  7. package/rihal/agents/rihal-advisor-researcher.md +1 -1
  8. package/rihal/agents/rihal-assumptions-analyzer.md +1 -1
  9. package/rihal/agents/rihal-codebase-mapper.md +1 -1
  10. package/rihal/agents/rihal-docs-auditor.md +3 -3
  11. package/rihal/agents/rihal-executor.md +10 -0
  12. package/rihal/agents/rihal-integration-checker.md +1 -1
  13. package/rihal/agents/rihal-noor.md +2 -2
  14. package/rihal/agents/rihal-phase-researcher.md +1 -1
  15. package/rihal/agents/rihal-planner.md +25 -0
  16. package/rihal/agents/rihal-project-researcher.md +1 -1
  17. package/rihal/agents/rihal-research-synthesizer.md +1 -1
  18. package/rihal/agents/rihal-roadmapper.md +1 -1
  19. package/rihal/agents/rihal-sprint-checker.md +19 -1
  20. package/rihal/agents/rihal-verifier.md +1 -1
  21. package/rihal/agents/rihal-waleed.md +1 -2
  22. package/rihal/commands/code-review.md +1 -1
  23. package/rihal/commands/memory-audit.md +10 -0
  24. package/rihal/commands/memory-distill.md +11 -0
  25. package/rihal/commands/memory-init.md +12 -0
  26. package/rihal/commands/memory-update.md +12 -0
  27. package/rihal/config/model-profiles.json +5 -5
  28. package/rihal/references/karpathy-guidelines-full.md +1 -1
  29. package/rihal/references/no-unauthorized-git-ops.md +1 -1
  30. package/rihal/references/verb-dictionary.md +1 -1
  31. package/rihal/skills/actions/2-plan/rihal-frontend-design/SKILL.md +49 -139
  32. package/rihal/skills/actions/2-plan/rihal-frontend-design/references.md +79 -0
  33. package/rihal/skills/actions/4-implementation/rihal-browser-verify/SKILL.md +70 -0
  34. package/rihal/skills/actions/4-implementation/rihal-checkpoint-preview/SKILL.md +1 -1
  35. package/rihal/skills/actions/4-implementation/rihal-ci/SKILL.md +108 -0
  36. package/rihal/skills/actions/4-implementation/rihal-debug/SKILL.md +78 -0
  37. package/rihal/skills/actions/4-implementation/rihal-git-flow/SKILL.md +90 -0
  38. package/rihal/skills/actions/4-implementation/rihal-harden/SKILL.md +91 -0
  39. package/rihal/skills/actions/4-implementation/rihal-incremental/SKILL.md +50 -0
  40. package/rihal/skills/actions/4-implementation/rihal-migrate/SKILL.md +86 -0
  41. package/rihal/skills/actions/4-implementation/rihal-perf/SKILL.md +96 -0
  42. package/rihal/skills/actions/4-implementation/rihal-prove-it/SKILL.md +64 -0
  43. package/rihal/skills/actions/4-implementation/rihal-source-truth/SKILL.md +76 -0
  44. package/rihal/skills/actions/4-implementation/rihal-trim/SKILL.md +73 -0
  45. package/rihal/skills/agents/dalil-scout/SKILL.md +43 -125
  46. package/rihal/skills/agents/dalil-scout/references.md +67 -0
  47. package/rihal/skills/agents/majlis-council/SKILL.md +50 -144
  48. package/rihal/skills/agents/majlis-council/references.md +90 -0
  49. package/rihal/skills/agents/raees-orchestrator/SKILL.md +56 -117
  50. package/rihal/skills/agents/raees-orchestrator/references.md +47 -0
  51. package/rihal/skills/core/rihal-advanced-elicitation/SKILL.md +36 -136
  52. package/rihal/skills/core/rihal-advanced-elicitation/references.md +101 -0
  53. package/rihal/skills/core/rihal-auth-audit/SKILL.md +93 -0
  54. package/rihal/skills/core/rihal-brainstorming/SKILL.md +5 -0
  55. package/rihal/skills/core/rihal-client-gate/SKILL.md +91 -0
  56. package/rihal/skills/core/rihal-clone-website/SKILL.md +30 -371
  57. package/rihal/skills/core/rihal-clone-website/references.md +213 -0
  58. package/rihal/skills/core/rihal-deploy-unify/SKILL.md +87 -0
  59. package/rihal/skills/core/rihal-distillator/SKILL.md +37 -187
  60. package/rihal/skills/core/rihal-distillator/references.md +118 -0
  61. package/rihal/skills/core/rihal-editorial-review-prose/SKILL.md +5 -0
  62. package/rihal/skills/core/rihal-editorial-review-structure/SKILL.md +45 -183
  63. package/rihal/skills/core/rihal-editorial-review-structure/references.md +110 -0
  64. package/rihal/skills/core/rihal-help/SKILL.md +6 -1
  65. package/rihal/skills/core/rihal-incident-record/SKILL.md +161 -0
  66. package/rihal/skills/core/rihal-index-docs/SKILL.md +5 -0
  67. package/rihal/skills/core/rihal-init/SKILL.md +5 -0
  68. package/rihal/skills/core/rihal-memory-audit/SKILL.md +88 -0
  69. package/rihal/skills/core/rihal-memory-distill/SKILL.md +87 -0
  70. package/rihal/skills/core/rihal-memory-init/SKILL.md +77 -0
  71. package/rihal/skills/core/rihal-memory-update/SKILL.md +73 -0
  72. package/rihal/skills/core/rihal-mvp-graduate/SKILL.md +116 -0
  73. package/rihal/skills/core/rihal-ocr-consistency/SKILL.md +106 -0
  74. package/rihal/skills/core/rihal-party-mode/SKILL.md +5 -0
  75. package/rihal/skills/core/rihal-rebrand/SKILL.md +133 -0
  76. package/rihal/skills/core/rihal-review-adversarial-general/SKILL.md +5 -0
  77. package/rihal/skills/core/rihal-review-edge-case-hunter/SKILL.md +5 -0
  78. package/rihal/skills/core/rihal-shard-doc/SKILL.md +5 -0
  79. package/rihal/skills/core/rihal-theme-system/SKILL.md +113 -0
  80. package/rihal/team.yaml +3 -22
  81. package/rihal/templates/memory/INDEX.md +46 -0
  82. package/rihal/templates/memory/change-records/.gitkeep +4 -0
  83. package/rihal/templates/memory/distillates/project.distillate.md +11 -0
  84. package/rihal/templates/memory/distillates/stack.distillate.md +11 -0
  85. package/rihal/templates/memory/incidents/known-issues.md +27 -0
  86. package/rihal/templates/memory/incidents/post-mortems/.gitkeep +3 -0
  87. package/rihal/templates/memory/milestones/archive/.gitkeep +2 -0
  88. package/rihal/templates/memory/milestones/current.md +39 -0
  89. package/rihal/templates/memory/people/stakeholders.md +25 -0
  90. package/rihal/templates/memory/people/team.md +35 -0
  91. package/rihal/templates/memory/project/decisions.md +32 -0
  92. package/rihal/templates/memory/project/glossary.md +16 -0
  93. package/rihal/templates/memory/project/stack.md +46 -0
  94. package/rihal/workflows/audit.md +3 -3
  95. package/rihal/workflows/code-review.md +32 -1
  96. package/rihal/workflows/council.md +1 -1
  97. package/rihal/workflows/discuss-phase-power.md +3 -3
  98. package/rihal/workflows/do.md +1 -1
  99. package/rihal/workflows/docs-update.md +4 -4
  100. package/rihal/workflows/execute.md +61 -5
  101. package/rihal/workflows/help.md +5 -5
  102. package/rihal/workflows/karpathy-audit.md +9 -9
  103. package/rihal/workflows/memory-audit.md +83 -0
  104. package/rihal/workflows/memory-distill.md +103 -0
  105. package/rihal/workflows/memory-init.md +102 -0
  106. package/rihal/workflows/memory-update.md +83 -0
  107. package/rihal/workflows/plan.md +66 -1
  108. package/server/dashboard.js +6 -1
  109. package/server/lib/api.js +8 -2
  110. package/server/lib/html/client.js +63 -0
  111. package/server/lib/html/shell.js +5 -0
  112. package/server/lib/scanner.js +76 -1
  113. package/rihal/agents/rihal-architect.md +0 -79
  114. package/rihal/agents/rihal-tech-writer.md +0 -80
  115. package/rihal/commands/check-implementation-readiness.md +0 -8
  116. package/rihal/commands/discuss-phase-power.md +0 -11
  117. package/rihal/commands/karpathy-audit.md +0 -12
  118. package/rihal/commands/new-project-research.md +0 -11
  119. package/rihal/commands/new-project-roadmap.md +0 -11
  120. package/rihal/commands/report.md +0 -10
  121. package/rihal/commands/review-adversarial.md +0 -8
  122. package/rihal/commands/review-edge-case-hunter.md +0 -8
@@ -0,0 +1,213 @@
1
+ # Clone Website — Detailed Reference
2
+
3
+ Detailed principles, scripts, templates, and checklists for [`SKILL.md`](SKILL.md). Keep this file open in another tab while running the skill.
4
+
5
+ ---
6
+
7
+ ## 9 Guiding Principles
8
+
9
+ These are the truths that separate a successful clone from a "close enough" mess.
10
+
11
+ ### 1. Completeness beats speed
12
+ Every builder agent must receive **everything** it needs: screenshot, exact CSS values, downloaded assets with local paths, real text content, component structure. If a builder has to guess any value, extraction failed. One extra minute of extraction beats an incomplete brief.
13
+
14
+ ### 2. Small tasks, perfect results
15
+ Keep builder prompts to ≤150 lines of spec. If a section's spec exceeds that, split it: one agent per sub-component plus one for the wrapper. Don't let "but it's all related" override the limit.
16
+
17
+ ### 3. Real content, real assets
18
+ Extract actual text, images, videos, SVGs from the live site. Use `element.textContent`, download every `<img>` and `<video>`, extract inline `<svg>` as React components. Layered assets matter — a section that looks like one image is often multiple layers (background, foreground UI mockup, overlay icon). Inspect the full DOM tree.
19
+
20
+ ### 4. Foundation first
21
+ Sequential and non-negotiable: global CSS with the target's design tokens, TypeScript types for content structures, global assets (fonts, favicons). Everything after this can be parallel.
22
+
23
+ ### 5. Extract how it looks AND how it behaves
24
+ Static CSS alone produces dead-feeling clones. For every element extract appearance (`getComputedStyle`) AND behaviour (what changes, what triggers it, how the transition runs). Behaviours to watch: scroll-shrink navbars, viewport-entry animations, scroll-snap, parallax, hover transitions, modals/accordions, auto-play carousels, scroll-driven tab switching, smooth-scroll libraries (Lenis, Locomotive Scroll).
25
+
26
+ ### 6. Identify the interaction model before building
27
+ The single most expensive mistake: building click-based UI when the original is scroll-driven (or vice versa).
28
+ - Scroll first, slowly. Watch for self-changing elements.
29
+ - If something changes on scroll, it's scroll-driven. Extract the mechanism.
30
+ - Only THEN test for click/hover-driven interactivity.
31
+ - Document the interaction model explicitly in every spec.
32
+
33
+ ### 7. Extract every state, not just the default
34
+ For tabbed/stateful content: click each tab via Chrome MCP, extract per state. For scroll-dependent elements: capture at scroll position 0 and after crossing the trigger threshold.
35
+
36
+ ### 8. Spec files are the source of truth
37
+ Every component gets a spec file in `docs/research/components/` BEFORE any builder is dispatched. The builder receives the spec contents inline; the file persists as an auditable artefact.
38
+
39
+ ### 9. Build must always compile
40
+ Every builder verifies `npx tsc --noEmit` before finishing. After merging worktrees, you verify `npm run build` passes. No red merges.
41
+
42
+ ---
43
+
44
+ ## Asset Discovery Script (Chrome MCP)
45
+
46
+ ```javascript
47
+ JSON.stringify({
48
+ images: [...document.querySelectorAll('img')].map(img => ({
49
+ src: img.src || img.currentSrc,
50
+ alt: img.alt,
51
+ width: img.naturalWidth,
52
+ height: img.naturalHeight,
53
+ parentClasses: img.parentElement?.className,
54
+ position: getComputedStyle(img).position,
55
+ zIndex: getComputedStyle(img).zIndex
56
+ })),
57
+ videos: [...document.querySelectorAll('video')].map(v => ({
58
+ src: v.src || v.querySelector('source')?.src,
59
+ poster: v.poster,
60
+ autoplay: v.autoplay, loop: v.loop, muted: v.muted
61
+ })),
62
+ backgroundImages: [...document.querySelectorAll('*')].filter(el => {
63
+ const bg = getComputedStyle(el).backgroundImage;
64
+ return bg && bg !== 'none';
65
+ }).map(el => ({
66
+ url: getComputedStyle(el).backgroundImage,
67
+ element: el.tagName + '.' + (el.getAttribute('class') || '').split(' ')[0] // getAttribute also works for SVG elements, whose className is not a string
68
+ })),
69
+ fonts: [...new Set([...document.querySelectorAll('*')].slice(0, 200).map(el => getComputedStyle(el).fontFamily))],
70
+ favicons: [...document.querySelectorAll('link[rel*="icon"]')].map(l => ({ href: l.href, sizes: l.sizes?.toString() }))
71
+ });
72
+ ```
73
+
74
+ ---
75
+
76
+ ## CSS Extraction Script (per section, replace `SELECTOR`)
77
+
78
+ ```javascript
79
+ (function(selector) {
80
+ const el = document.querySelector(selector);
81
+ if (!el) return JSON.stringify({ error: 'Element not found: ' + selector });
82
+ const props = [
83
+ 'fontSize','fontWeight','fontFamily','lineHeight','letterSpacing','color',
84
+ 'textTransform','textDecoration','backgroundColor','background',
85
+ 'padding','paddingTop','paddingRight','paddingBottom','paddingLeft',
86
+ 'margin','marginTop','marginRight','marginBottom','marginLeft',
87
+ 'width','height','maxWidth','minWidth','maxHeight','minHeight',
88
+ 'display','flexDirection','justifyContent','alignItems','gap',
89
+ 'gridTemplateColumns','gridTemplateRows',
90
+ 'borderRadius','border','borderTop','borderBottom','borderLeft','borderRight',
91
+ 'boxShadow','overflow','overflowX','overflowY',
92
+ 'position','top','right','bottom','left','zIndex',
93
+ 'opacity','transform','transition','cursor',
94
+ 'objectFit','objectPosition','mixBlendMode','filter','backdropFilter',
95
+ 'whiteSpace','textOverflow','WebkitLineClamp'
96
+ ];
97
+ function extractStyles(element) {
98
+ const cs = getComputedStyle(element);
99
+ const styles = {};
100
+ props.forEach(p => {
101
+ const v = cs[p];
102
+ if (v && v !== 'none' && v !== 'normal' && v !== 'auto' && v !== '0px' && v !== 'rgba(0, 0, 0, 0)') styles[p] = v;
103
+ });
104
+ return styles;
105
+ }
106
+ function walk(element, depth) {
107
+ if (depth > 4) return null;
108
+ const children = [...element.children];
109
+ return {
110
+ tag: element.tagName.toLowerCase(),
111
+ classes: element.className?.toString().split(' ').slice(0, 5).join(' '),
112
+ text: element.childNodes.length === 1 && element.childNodes[0].nodeType === 3 ? element.textContent.trim().slice(0, 200) : null,
113
+ styles: extractStyles(element),
114
+ images: element.tagName === 'IMG' ? { src: element.src, alt: element.alt, naturalWidth: element.naturalWidth, naturalHeight: element.naturalHeight } : null,
115
+ childCount: children.length,
116
+ children: children.slice(0, 20).map(c => walk(c, depth + 1)).filter(Boolean)
117
+ };
118
+ }
119
+ return JSON.stringify(walk(el, 0), null, 2);
120
+ })('SELECTOR');
121
+ ```
122
+
123
+ ---
124
+
125
+ ## Component Spec Template
126
+
127
+ Save to `docs/research/components/<component-name>.spec.md`:
128
+
129
+ ```markdown
130
+ # <ComponentName> Specification
131
+
132
+ ## Overview
133
+ - **Target file:** `src/components/<ComponentName>.tsx`
134
+ - **Screenshot:** `docs/design-references/<screenshot-name>.png`
135
+ - **Interaction model:** <static | click-driven | scroll-driven | time-driven>
136
+
137
+ ## DOM Structure
138
+ <hierarchy>
139
+
140
+ ## Computed Styles (exact values)
141
+ ### Container
142
+ - display, padding, maxWidth, etc.
143
+
144
+ ### <Child element>
145
+ - every relevant property
146
+
147
+ ## States & Behaviors
148
+ ### <Behavior name>
149
+ - **Trigger:** <exact mechanism>
150
+ - **State A (before):** CSS values
151
+ - **State B (after):** CSS values
152
+ - **Transition:** transition CSS
153
+ - **Implementation approach:** <CSS transition | IntersectionObserver | etc.>
154
+
155
+ ## Assets
156
+ - Background/overlay images with paths
157
+ - Icons used from icons.tsx
158
+
159
+ ## Text Content (verbatim)
160
+ <copy-pasted from live site>
161
+
162
+ ## Responsive Behavior
163
+ - Desktop (1440px): <layout>
164
+ - Tablet (768px): <changes>
165
+ - Mobile (390px): <changes>
166
+ - Breakpoint: ~<N>px
167
+ ```
168
+
169
+ ---
170
+
171
+ ## Pre-Dispatch Checklist (every builder, every time)
172
+
173
+ - [ ] Spec file written with ALL sections filled
174
+ - [ ] Every CSS value is from `getComputedStyle()`, not estimated
175
+ - [ ] Interaction model identified and documented
176
+ - [ ] All states captured (not just default)
177
+ - [ ] Scroll/hover triggers with before/after/transition recorded
178
+ - [ ] All images identified including overlays
179
+ - [ ] Responsive behavior documented
180
+ - [ ] Text content verbatim
181
+ - [ ] Builder prompt ≤150 lines
182
+
183
+ ---
184
+
185
+ ## What NOT to Do
186
+
187
+ - Don't build click-based tabs when the original is scroll-driven
188
+ - Don't extract only the default state of tabbed content
189
+ - Don't miss overlay/layered images
190
+ - Don't build HTML mockups for content that's actually videos / Lottie / canvas
191
+ - Don't approximate CSS classes — extract exact values
192
+ - Don't build monolithic commits
193
+ - Don't reference external docs from builder prompts — inline everything
194
+ - Don't skip asset extraction
195
+ - Don't give a builder too much scope
196
+ - Don't bundle unrelated sections into one agent
197
+ - Don't skip responsive extraction at 1440 / 768 / 390
198
+ - Don't forget smooth scroll libraries (Lenis, Locomotive)
199
+ - Don't dispatch builders without a spec file
200
+
201
+ ---
202
+
203
+ ## Final Completion Report Format
204
+
205
+ ```
206
+ Total sections built: N
207
+ Total components created: N
208
+ Total spec files written: N (must match components)
209
+ Total assets downloaded: N (images / videos / SVGs / fonts)
210
+ Build status: PASS | FAIL
211
+ Visual QA discrepancies: <remaining diffs>
212
+ Known gaps / limitations: <list>
213
+ ```
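Principle 7 and the "States & Behaviors" section of the spec template both ask for before/after CSS values. The following is a minimal sketch of how two snapshots could be compared, assuming each snapshot is a plain object shaped like the return value of `extractStyles` in the CSS Extraction Script. `diffStates` and the sample values are hypothetical and not part of the package.

```javascript
// Hypothetical helper (not part of @hanzlaa/rcode): compare two style
// snapshots, e.g. one taken at scroll position 0 and one taken after
// crossing the trigger threshold.
function diffStates(before, after) {
  const changed = {};
  const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
  for (const key of keys) {
    // A missing key means the extractor filtered the value out as a default.
    const a = key in before ? before[key] : null;
    const b = key in after ? after[key] : null;
    if (a !== b) changed[key] = { before: a, after: b };
  }
  return changed;
}

// Illustrative snapshots of a scroll-shrink navbar (made-up values).
const atTop = { height: '96px', backgroundColor: 'rgb(255, 255, 255)' };
const scrolled = {
  height: '64px',
  backgroundColor: 'rgb(255, 255, 255)',
  boxShadow: 'rgba(0, 0, 0, 0.1) 0px 2px 8px 0px'
};

console.log(JSON.stringify(diffStates(atTop, scrolled), null, 2));
```

Only the properties that differ end up in the result, which maps directly onto the "State A (before)" / "State B (after)" bullets of a spec file.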
@@ -0,0 +1,87 @@
1
+ ---
2
+ name: rihal-deploy-unify
3
+ description: Detect and unify multiple deployment paths in a single project. Use when a repo has accumulated overlapping deploy mechanisms (Docker Compose + Helm + manual scripts + Vercel + Jenkins) and "which one runs in production" is unclear. Specifically encodes the Siraaj deployment chaos lesson — multiple deploy paths cost a week of debugging and broke Keycloak more than once.
4
+ triggers:
5
+ - "deploy unify"
6
+ - "multiple deploy paths"
7
+ - "which deploy is production"
8
+ - "deploy chaos"
9
+ - "consolidate deployments"
10
+ - "kubernetes vs compose"
11
+ - "single deploy path"
12
+ - "deployment audit"
13
+ user-invocable: true
14
+ ---
15
+
16
+ ## Overview
17
+
18
+ Multiple deploy paths are a shipping-risk multiplier. Every path is one more thing that can drift from the others, deploy stale code, or get lost in an "I thought you ran it" handoff in ops. The Rihal Siraaj incident was a textbook case: Docker Compose for some services, Helm for others, manual `ssh && pull` for the rest, and no one knew which combination was production.
19
+
20
+ ## Workflow
21
+
22
+ 1. **Inventory every deploy path.** Look in:
23
+ - `docker-compose.yml`, `docker-compose.*.yml`
24
+ - `helm/`, `charts/`, `k8s/`
25
+ - `Makefile`, `scripts/deploy*`
26
+ - `.github/workflows/deploy*.yml`, `.gitlab-ci.yml`, `Jenkinsfile`
27
+ - Vercel / Netlify project links
28
+ - Anything in `infra/` or `deployment/`
29
+ 2. **Classify each:** dev / staging / production. If you can't classify it, that's the bug.
30
+ 3. **Identify drift.** For each pair (dev↔staging, staging↔prod):
31
+ - Different env vars?
32
+ - Different image tags?
33
+ - Different replica counts?
34
+ - Different healthchecks?
35
+ - Different secret-management?
36
+ 4. **Pick ONE canonical path per environment.** Helm + values per env is the rcode default. Compose is dev-only. No "and also a Jenkinsfile that does it differently".
37
+ 5. **Deprecate the others** with a clear timeline. Don't delete on day one — leave them as `*.deprecated` and observe for 2 weeks before removal.
38
+ 6. **Document the canonical path** in `.rihal/memory/project/decisions.md` and a top-level `DEPLOYMENT.md`.
39
+
40
+ ## Common drift patterns to look for
41
+
42
+ | Symptom | Root cause | Fix |
43
+ |---|---|---|
44
+ | "Works in staging, breaks in prod" | Different env vars between paths | Single source of truth (Helm values + sealed-secrets) |
45
+ | Image tags lag behind git SHA | Manual `docker push` mid-week | Tag-based deploys via CI only |
46
+ | Healthchecks pass in compose, fail in K8s | Compose uses HTTP, K8s uses TCP | Align probe definitions |
47
+ | "Deploy" doesn't restart all services | Some compose, some bare metal | One orchestrator |
48
+ | Secrets diverge | `.env` files copied manually | External Secrets Operator or sealed-secrets only |
49
+
50
+ ## Output Format
51
+
52
+ ```
53
+ Deploy paths discovered: <count>
54
+ - <path 1> — <classification>
55
+ - <path 2> — <classification>
56
+ ...
57
+
58
+ Drift findings:
59
+ ✗ <pair> — <specific drift>
60
+ ✗ <pair> — <specific drift>
61
+
62
+ Canonical path proposal:
63
+ dev: <one path>
64
+ staging: <one path>
65
+ production: <one path>
66
+
67
+ Deprecation plan:
68
+ Week 1: mark <X> as deprecated, route docs to canonical
69
+ Week 2: remove <X> if no fallback usage observed
70
+
71
+ Memory Bank update:
72
+ → .rihal/memory/project/decisions.md (canonical path decision)
73
+ → DEPLOYMENT.md (the runbook)
74
+ ```
75
+
76
+ ## Examples
77
+
78
+ **Happy path — Siraaj-style mess** — 4 deploy paths found: docker-compose (dev), Helm (staging), manual script (prod-mostly), Jenkinsfile (sometimes prod). Drift: 6 env vars differ between staging and prod. Canonical: Helm with `values.staging.yaml` and `values.production.yaml`. Compose stays dev-only. Manual + Jenkinsfile deprecated, removed 2 weeks later.
79
+
80
+ **Edge case — legitimate dual path** — Mobile app uses TestFlight + Play Console; web uses Vercel. These are different surfaces, not deploy-path drift. Document why each surface uses what it does; don't try to unify across surfaces.
81
+
82
+ **Negative — "let's just delete the old paths"** — Refuse without observation period. Some "deprecated" paths are actually the only thing that works for a specific service. Mark, observe, then delete.
83
+
84
+ ## Memory Bank Hooks
85
+
86
+ - **Reads:** `.rihal/memory/incidents/post-mortems/` (prior deploy incidents)
87
+ - **Writes:** `.rihal/memory/project/decisions.md` (canonical path decision); `.rihal/memory/change-records/YYYYMMDD-NNN.md` (the unification itself as a change record)
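Workflow step 1 of the deploy-unify skill lists the file locations to inventory. The following is a rough sketch of that inventory as a pure function over repo-relative paths. The function name, the mechanism labels, and the regular expressions are assumptions made for illustration; the skill itself ships no such script, and Vercel / Netlify project links are not covered because they are not discoverable from file paths alone.

```javascript
// Hypothetical sketch (not part of the skill): map repo-relative file paths
// to the deploy mechanisms named in workflow step 1.
const DEPLOY_PATTERNS = [
  { mechanism: 'compose', test: /^docker-compose(\..+)?\.yml$/ },
  { mechanism: 'kubernetes', test: /^(helm|charts|k8s)\// },
  { mechanism: 'scripts', test: /^(Makefile$|scripts\/deploy)/ },
  { mechanism: 'ci', test: /^(\.github\/workflows\/deploy.*\.yml|\.gitlab-ci\.yml|Jenkinsfile)$/ },
  { mechanism: 'infra-dir', test: /^(infra|deployment)\// }
];

function inventoryDeployPaths(files) {
  const found = {};
  for (const file of files) {
    // First matching pattern wins; files matching nothing are ignored.
    const hit = DEPLOY_PATTERNS.find(p => p.test.test(file));
    if (!hit) continue;
    if (!found[hit.mechanism]) found[hit.mechanism] = [];
    found[hit.mechanism].push(file);
  }
  return found;
}

// Illustrative file list resembling the "Siraaj-style mess" example.
const repoFiles = [
  'docker-compose.yml',
  'docker-compose.prod.yml',
  'helm/values.production.yaml',
  'scripts/deploy-prod.sh',
  'Jenkinsfile',
  'src/index.js'
];

console.log(inventoryDeployPaths(repoFiles));
```

A hit is only a candidate. The dev / staging / production classification in step 2 and the drift comparison in step 3 still need someone to read each file.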
@@ -1,213 +1,63 @@
1
1
  ---
2
2
  name: rihal-distillator
3
- description: Lossless LLM-optimized compression of source documents. Use when the user requests to 'distill documents' or 'create a distillate'.
4
- argument-hint: "[to create provide input paths] [--validate distillate-path to confirm distillate is lossless and optimized]"
3
+ description: Lossless LLM-optimized compression of source documents. Use when the user requests to "distill documents" or "create a distillate". Distillates preserve every fact, decision, constraint, and relationship while stripping prose overhead — designed as drop-in LLM context. Not summarisation (summaries are lossy). For Memory Bank distillates specifically, use rcode-memory-distill.
4
+ argument-hint: "<source-paths> [--validate <distillate-path>] [--token-budget <N>] [--consumer <name>]"
5
5
  triggers:
6
6
  - "distillator"
7
+ - "distill documents"
8
+ - "create a distillate"
9
+ - "compress these docs"
10
+ user-invocable: true
7
11
  ---
8
12
 
9
- # Distillator: A Document Distillation Engine
10
-
11
13
  ## Overview
12
14
 
13
- This skill produces hyper-compressed, token-efficient documents (distillates) from any set of source documents. A distillate preserves every fact, decision, constraint, and relationship from the sources while stripping all overhead that humans need and LLMs don't. Act as an information extraction and compression specialist. The output is a single dense document (or semantically-split set) that a downstream LLM workflow can consume as sole context input without information loss.
14
-
15
- This is a compression task, not a summarization task. Summaries are lossy. Distillates are lossless compression optimized for LLM consumption.
16
-
17
- ## On Activation
18
-
19
- 1. **Validate inputs.** The caller must provide:
20
- - **source_documents** (required) — One or more file paths, folder paths, or glob patterns to distill
21
- - **downstream_consumer** (optional) — What workflow/agent consumes this distillate (e.g., "PRD creation", "architecture design"). When provided, use it to judge signal vs noise. When omitted, preserve everything.
22
- - **token_budget** (optional) — Approximate target size. When provided and the distillate would exceed it, trigger semantic splitting.
23
- - **output_path** (optional) — Where to save. When omitted, save adjacent to the primary source document with `-distillate.md` suffix.
24
- - **--validate** (flag) — Run round-trip reconstruction test after producing the distillate.
25
-
26
- 2. **Route** — proceed to Stage 1.
27
-
28
- ## Stages
29
-
30
- | # | Stage | Purpose |
31
- |---|-------|---------|
32
- | 1 | Analyze | Run analysis script, determine routing and splitting |
33
- | 2 | Compress | Spawn compressor agent(s) to produce the distillate |
34
- | 3 | Verify & Output | Completeness check, format check, save output |
35
- | 4 | Round-Trip Validate | (--validate only) Reconstruct and diff against originals |
36
-
37
- ### Stage 1: Analyze
38
-
39
- Run `scripts/analyze_sources.py --help` then run it with the source paths. Use its routing recommendation and grouping output to drive Stage 2. Do NOT read the source documents yourself.
40
-
41
- ### Stage 2: Compress
42
-
43
- **Single mode** (routing = `"single"`, ≤3 files, ≤15K estimated tokens):
44
-
45
- Spawn one subagent using `agents/distillate-compressor.md` with all source file paths.
46
-
47
- **Fan-out mode** (routing = `"fan-out"`):
48
-
49
- 1. Spawn one compressor subagent per group from the analysis output. Each compressor receives only its group's file paths and produces an intermediate distillate.
50
-
51
- 2. After all compressors return, spawn one final **merge compressor** subagent using `agents/distillate-compressor.md`. Pass it the intermediate distillate contents as its input (not the original files). Its job is cross-group deduplication, thematic regrouping, and final compression.
52
-
53
- 3. Clean up intermediate distillate content (it exists only in memory, not saved to disk).
54
-
55
- **Graceful degradation:** If subagent spawning is unavailable, read the source documents and perform the compression work directly using the same instructions from `agents/distillate-compressor.md`. For fan-out, process groups sequentially then merge.
56
-
57
- The compressor returns a structured JSON result containing the distillate content, source headings, named entities, and token estimate.
58
-
59
- ### Stage 3: Verify & Output
60
-
61
- After the compressor (or merge compressor) returns:
62
-
63
- 1. **Completeness check.** Using the headings and named entities list returned by the compressor, verify each appears in the distillate content. If gaps are found, send them back to the compressor for a targeted fix pass — not a full recompression. Limit to 2 fix passes maximum.
64
-
65
- 2. **Format check.** Verify the output follows distillate format rules:
66
- - No prose paragraphs (only bullets)
67
- - No decorative formatting
68
- - No repeated information
69
- - Each bullet is self-contained
70
- - Themes are clearly delineated with `##` headings
71
-
72
- 3. **Determine output format.** Using the split prediction from Stage 1 and actual distillate size:
73
-
74
- **Single distillate** (≤~5,000 tokens or token_budget not exceeded):
75
-
76
- Save as a single file with frontmatter:
77
-
78
- ```yaml
79
- ---
80
- type: rihal-distillate
81
- sources:
82
- - "{relative path to source file 1}"
83
- - "{relative path to source file 2}"
84
- downstream_consumer: "{consumer or 'general'}"
85
- created: "{date}"
86
- token_estimate: {approximate token count}
87
- parts: 1
88
- ---
89
- ```
90
-
91
- **Split distillate** (>~5,000 tokens, or token_budget requires it):
92
-
93
- Create a folder `{base-name}-distillate/` containing:
15
+ Compresses source documents into a dense, lossless distillate optimised for LLM context loading. Output is one (or several semantically split) markdown files containing every fact, decision, named entity, and relationship from the sources but no prose connectives, decoration, or repetition. A downstream LLM can use the distillate as sole context with no information loss.
94
16
 
95
- ```
96
- {base-name}-distillate/
97
- ├── _index.md # Orientation, cross-cutting items, section manifest
98
- ├── 01-{topic-slug}.md # Self-contained section
99
- ├── 02-{topic-slug}.md
100
- └── 03-{topic-slug}.md
101
- ```
17
+ ## Process
102
18
 
103
- The `_index.md` contains:
104
- - Frontmatter with sources (relative paths from the distillate folder to the originals)
105
- - 3-5 bullet orientation (what was distilled, from what)
106
- - Section manifest: each section's filename + 1-line description
107
- - Cross-cutting items that span multiple sections
108
-
109
- Each section file is self-contained loadable independently. Include a 1-line context header: "This section covers [topic]. Part N of M."
110
-
111
- Source paths in frontmatter must be relative to the distillate's location.
112
-
113
- 4. **Measure distillate.** Run `scripts/analyze_sources.py` on the final distillate file(s) to get accurate token counts for the output. Use the `total_estimated_tokens` from this analysis as `distillate_total_tokens`.
114
-
115
- 5. **Report results.** Always return structured JSON output:
116
-
117
- ```json
118
- {
119
- "status": "complete",
120
- "distillate": "{path or folder path}",
121
- "section_distillates": ["{path1}", "{path2}"] or null,
122
- "source_total_tokens": N,
123
- "distillate_total_tokens": N,
124
- "compression_ratio": "X:1",
125
- "source_documents": ["{path1}", "{path2}"],
126
- "completeness_check": "pass" or "pass_with_additions"
127
- }
128
- ```
129
-
130
- Where `source_total_tokens` is from the Stage 1 analysis and `distillate_total_tokens` is from step 4. The `compression_ratio` is `source_total_tokens / distillate_total_tokens` formatted as "X:1" (e.g., "3.2:1").
131
-
132
- 6. If `--validate` flag was set, proceed to Stage 4. Otherwise, done.
133
-
134
- ### Stage 4: Round-Trip Validation (--validate only)
135
-
136
- This stage proves the distillate is lossless by reconstructing source documents from the distillate alone. Use for critical documents where information loss is unacceptable, or as a quality gate for high-stakes downstream workflows. Not for routine use — it adds significant token cost.
137
-
138
- 1. **Spawn the reconstructor agent** using `agents/round-trip-reconstructor.md`. Pass it ONLY the distillate file path (or `_index.md` path for split distillates) — it must NOT have access to the original source documents.
139
-
140
- For split distillates, spawn one reconstructor per section in parallel. Each receives its section file plus the `_index.md` for cross-cutting context.
141
-
142
- **Graceful degradation:** If subagent spawning is unavailable, this stage cannot be performed by the main agent (it has already seen the originals). Report that round-trip validation requires subagent support and skip.
143
-
144
- 2. **Receive reconstructions.** The reconstructor returns reconstruction file paths saved adjacent to the distillate.
145
-
146
- 3. **Perform semantic diff.** Read both the original source documents and the reconstructions. For each section of the original, assess:
147
- - Is the core information present in the reconstruction?
148
- - Are specific details preserved (numbers, names, decisions)?
149
- - Are relationships and rationale intact?
150
- - Did the reconstruction add anything not in the original? (indicates hallucination filling gaps)
151
-
152
- 4. **Produce validation report** saved adjacent to the distillate as `-validation-report.md`:
153
-
154
- ```markdown
155
- ---
156
- type: distillate-validation
157
- distillate: "{distillate path}"
158
- sources: ["{source paths}"]
159
- created: "{date}"
160
- ---
161
-
162
- ## Validation Summary
163
- - Status: PASS | PASS_WITH_WARNINGS | FAIL
164
- - Information preserved: {percentage estimate}
165
- - Gaps found: {count}
166
- - Hallucinations detected: {count}
167
-
168
- ## Gaps (information in originals but missing from reconstruction)
169
- - {gap description} — Source: {which original}, Section: {where}
170
-
171
- ## Hallucinations (information in reconstruction not traceable to originals)
172
- - {hallucination description} — appears to fill gap in: {section}
173
-
174
- ## Possible Gap Markers (flagged by reconstructor)
175
- - {marker description}
176
- ```
177
-
178
- 5. **If gaps are found**, offer to run a targeted fix pass on the distillate — adding the missing information without full recompression. Limit to 2 fix passes maximum.
19
+ 1. **Validate inputs.** Required: `source_documents`. Optional: `downstream_consumer` (judges signal vs noise; if omitted, preserve everything), `token_budget` (triggers split when exceeded), `output_path` (default: adjacent to primary source with `-distillate.md` suffix), `--validate` flag (round-trip reconstruction test).
20
+ 2. **Stage 1 Analyze.** Run `scripts/analyze_sources.py` on the source paths. Use its routing recommendation (`single` / `fan-out`) and grouping output. Do not read sources yourself.
21
+ 3. **Stage 2 Compress.** Spawn `agents/distillate-compressor.md` subagent(s):
22
+ - **Single mode** (≤3 files, ≤15K tokens): one compressor.
23
+ - **Fan-out mode**: one compressor per group, then a merge compressor consuming the intermediate distillates (not originals).
24
+ - **Graceful degradation:** if subagent spawning is unavailable, perform the work directly using the same instructions.
25
+ 4. **Stage 3 Verify & output.** Completeness check (every returned heading and named entity appears in the distillate; up to 2 targeted fix passes). Format check (bullets only, no prose, no repetition, `##` themes). Save with frontmatter (`type: rihal-distillate`, `sources`, `created`, `token_estimate`, `parts`). Split distillates when >5K tokens or `token_budget` exceeded — see [`references.md`](references.md) for the split format.
26
+ 5. **Stage 4 — Round-trip validate (only with `--validate`).** Spawn `agents/round-trip-reconstructor.md` with the distillate path only (no source access). Semantic-diff the reconstruction against the originals. Produce `<name>-validation-report.md` with status, gaps, and hallucinations. Up to 2 fix passes if gaps found. Adds significant token cost — only for high-stakes use.
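The single versus fan-out routing named in Stage 2 can be sketched as below. This is only an illustration under stated assumptions, not the logic of `scripts/analyze_sources.py`: the function name `choose_mode` is hypothetical, "≤15K tokens" is read as 15,000, and both limits are assumed to be required for single mode.

```python
def choose_mode(file_count: int, total_tokens: int) -> str:
    """Return 'single' or 'fan-out' using the thresholds quoted in Stage 2.

    Hypothetical helper: the real routing recommendation comes from
    scripts/analyze_sources.py, whose internals are not shown in this diff.
    """
    # Assumption: single mode needs BOTH limits to hold (<=3 files and <=15K tokens).
    if file_count <= 3 and total_tokens <= 15_000:
        return "single"
    # Anything larger is grouped and handled by one compressor per group.
    return "fan-out"
```

In practice the skill would take the script's recommendation as given; the sketch only makes the stated thresholds concrete.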
179
27
 
180
28
  ## Output Format
181
29
 
182
- Structured JSON result:
30
+ Structured JSON on every run:
31
+
183
32
  ```json
184
33
  {
185
34
  "status": "complete",
186
- "distillate": "path/to/distillate.md",
187
- "compression_ratio": "3.2:1",
35
+ "distillate": "path/to/file-distillate.md",
36
+ "section_distillates": ["path1", "path2"] or null,
188
37
  "source_total_tokens": 15000,
189
- "distillate_total_tokens": 4688
38
+ "distillate_total_tokens": 4688,
39
+ "compression_ratio": "3.2:1",
40
+ "source_documents": ["path1", "path2"],
41
+ "completeness_check": "pass" | "pass_with_additions"
190
42
  }
191
43
  ```
192
44
 
193
- ## Workflow
194
-
195
- 1. Read the user request and extract key parameters.
196
- 2. Execute the skill logic as described in the Overview.
197
- 3. Return output in the format specified below.
45
+ Token counts come from `scripts/analyze_sources.py`. Compression ratio is `source / distillate`.
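As a worked check of the ratio in the JSON example, a minimal sketch follows. The function name `compression_ratio` and the one-decimal rounding are assumptions inferred from the `"3.2:1"` example; the skill itself does not specify the rounding rule.

```python
def compression_ratio(source_total_tokens: int, distillate_total_tokens: int) -> str:
    """Format source / distillate as an 'X:1' string."""
    if distillate_total_tokens <= 0:
        # A ratio is undefined for an empty distillate.
        raise ValueError("distillate_total_tokens must be positive")
    ratio = source_total_tokens / distillate_total_tokens
    # Assumption: one decimal place, matching the "3.2:1" example.
    return f"{ratio:.1f}:1"


# Values from the JSON example: 15000 / 4688 is roughly 3.1997.
print(compression_ratio(15000, 4688))
```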
198
46
 
199
47
  ## Examples
200
48
 
201
- ### Happy path
202
- **User:** "distill ./docs/architecture.md ./docs/decisions.md"
203
- **Result:** Analyzes sources → single-mode compression → saves `architecture-distillate.md` → reports 3.2:1 ratio
49
+ **Happy path** — `distill ./docs/architecture.md ./docs/decisions.md` → single-mode → saves `architecture-distillate.md` → reports `3.2:1`.
50
+
51
+ **Edge case (large folder)**: `distill ./docs/ --validate` → fan-out mode (multiple compressors) → merge pass → round-trip validation → produces a validation report.
52
+
53
+ **Negative — summarisation request** — "summarize this meeting" — distillation is lossless compression, not summarisation. Clarify the difference or route to a writing skill.
54
+
55
+ ## Memory Bank Hooks
204
56
 
205
- ### Edge case
206
- **User:** "distill ./docs/ --validate"
207
- **Result:** Fan-out mode for large folder → merge pass → round-trip validation via reconstructor agent
57
+ - **Reads:** the source documents passed in
58
+ - **Writes:** the distillate file (or folder) at the specified or default path
59
+ - **Note:** for Memory Bank-specific distillates, use `rcode-memory-distill` instead; it knows the Memory Bank source set.
208
60
 
209
- ### Negative boundary
210
- **User:** "summarize this meeting"
211
- **Result:** Distillation is lossless compression, not summarization. If user wants a summary, clarify the difference or route to a writing skill
61
+ ## Detailed reference
212
62
 
213
- 6. **Clean up.** Delete the temporary reconstruction files after the report is generated.
63
+ See [`references.md`](references.md) for: the split distillate folder format, the validation report template, frontmatter schema, and `--validate` flag semantics.
@@ -0,0 +1,118 @@
1
+ # Distillator — Detailed Reference
2
+
3
+ Detailed formats and templates for [`SKILL.md`](SKILL.md).
4
+
5
+ ---
6
+
7
+ ## Frontmatter schema
8
+
9
+ Every distillate file (single or split-index) has:
10
+
11
+ ```yaml
12
+ ---
13
+ type: rihal-distillate
14
+ sources:
15
+ - "{relative path to source 1}"
16
+ - "{relative path to source 2}"
17
+ downstream_consumer: "{consumer or 'general'}"
18
+ created: "{ISO date}"
19
+ token_estimate: {approximate token count}
20
+ parts: 1 # or N for split distillates
21
+ ---
22
+ ```
23
+
24
+ Source paths are relative to the distillate's location.
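One way to produce source paths relative to the distillate's location is sketched below. The helper name `relative_sources` is hypothetical and the forward-slash normalisation is an assumption; the reference does not say how the skill computes these paths.

```python
import os


def relative_sources(distillate_path: str, source_paths: list[str]) -> list[str]:
    """Express each source path relative to the directory holding the distillate."""
    base = os.path.dirname(os.path.abspath(distillate_path))
    return [
        # Assumption: frontmatter uses forward slashes on every platform.
        os.path.relpath(os.path.abspath(path), start=base).replace(os.sep, "/")
        for path in source_paths
    ]


print(relative_sources(
    "docs/architecture-distillate.md",
    ["docs/architecture.md", "notes/decisions.md"],
))
```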
25
+
26
+ ---
27
+
28
+ ## Single distillate format
29
+
30
+ When `total_tokens ≤ 5000` and `token_budget` is not exceeded:
31
+
32
+ - One file: `{base-name}-distillate.md`
33
+ - Frontmatter as above with `parts: 1`
34
+ - Body: `##` themes containing self-contained bullets — no prose paragraphs, no decorative formatting, no repetition
35
+
36
+ ---
37
+
38
+ ## Split distillate format
39
+
40
+ When `total_tokens > 5000` OR `token_budget` requires splitting:
41
+
42
+ ```
43
+ {base-name}-distillate/
44
+ ├── _index.md # orientation, cross-cutting items, section manifest
45
+ ├── 01-{topic-slug}.md # self-contained section
46
+ ├── 02-{topic-slug}.md
47
+ └── 03-{topic-slug}.md
48
+ ```
49
+
50
+ `_index.md` contains:
51
+ - Frontmatter (sources relative to the folder; `parts: N`)
52
+ - 3-5 bullet orientation: what was distilled, from what
53
+ - Section manifest: each section's filename + 1-line description
54
+ - Cross-cutting items that span multiple sections
55
+
56
+ Each section file:
57
+ - Self-contained — loadable independently
58
+ - 1-line context header: `This section covers {topic}. Part N of M.`
59
+ - Same bullet-only format
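The split rule and the numbered section filenames can be sketched as follows. Both helper names are hypothetical, and reading "`token_budget` requires splitting" as "the distillate exceeds the budget" is an assumption.

```python
from typing import Optional


def needs_split(total_tokens: int, token_budget: Optional[int] = None) -> bool:
    """Decide between the single-file and the folder format."""
    if total_tokens > 5000:
        return True
    # Assumption: a budget "requires splitting" when the distillate exceeds it.
    return token_budget is not None and total_tokens > token_budget


def section_filenames(topic_slugs: list[str]) -> list[str]:
    """Build zero-padded section names such as 01-{topic-slug}.md."""
    return [f"{index:02d}-{slug}.md" for index, slug in enumerate(topic_slugs, start=1)]


print(needs_split(6200))
print(section_filenames(["auth", "storage"]))
```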
60
+
61
+ ---
62
+
63
+ ## Round-trip validation report template
64
+
65
+ Saved adjacent to the distillate as `{base-name}-validation-report.md`:
66
+
67
+ ```markdown
68
+ ---
69
+ type: distillate-validation
70
+ distillate: "{distillate path}"
71
+ sources: ["{source paths}"]
72
+ created: "{ISO date}"
73
+ ---
74
+
75
+ ## Validation Summary
76
+ - Status: PASS | PASS_WITH_WARNINGS | FAIL
77
+ - Information preserved: {percentage estimate}
78
+ - Gaps found: {count}
79
+ - Hallucinations detected: {count}
80
+
81
+ ## Gaps (information in originals but missing from reconstruction)
82
+ - {gap description} — Source: {which original}, Section: {where}
83
+
84
+ ## Hallucinations (information in reconstruction not traceable to originals)
85
+ - {hallucination description} — appears to fill gap in: {section}
86
+
87
+ ## Possible Gap Markers (flagged by reconstructor)
88
+ - {marker description}
89
+ ```
90
+
91
+ ---
92
+
93
+ ## Validation semantics
94
+
95
+ - **PASS** — every fact, decision, constraint, and relationship survives the round trip.
96
+ - **PASS_WITH_WARNINGS** — minor gaps that the reconstructor itself flagged ("possible gap markers").
97
+ - **FAIL** — material gaps or hallucinations. Trigger up to 2 targeted fix passes on the distillate.
98
+
99
+ If gaps remain after fix passes: surface them honestly in the report. Do not pad the distillate with regenerated content — that introduces hallucination.
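A minimal sketch of how the three statuses might be derived from report counts is given below. The function name `validation_status` and its three inputs are assumptions for illustration; the actual judgement is a semantic diff made by the agent, not a counter.

```python
def validation_status(material_gaps: int, hallucinations: int, flagged_gap_markers: int) -> str:
    """Map report counts to PASS, PASS_WITH_WARNINGS, or FAIL."""
    # Material gaps or any hallucination fail the round trip outright.
    if material_gaps > 0 or hallucinations > 0:
        return "FAIL"
    # Only minor gaps that the reconstructor itself flagged remain.
    if flagged_gap_markers > 0:
        return "PASS_WITH_WARNINGS"
    return "PASS"
```

A `FAIL` result is what would trigger the targeted fix passes, capped at two.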
100
+
101
+ ---
102
+
103
+ ## When to use `--validate`
104
+
105
+ - Distillates feeding architecture / system-design workflows
106
+ - Distillates of regulatory or compliance documents
107
+ - Distillates produced for an external consumer (client deliverable)
108
+ - Anywhere information loss is unacceptable
109
+
110
+ Skip `--validate` for routine distillates (Memory Bank refresh, internal context loading) — it adds significant token cost.
111
+
112
+ ---
113
+
114
+ ## Cleanup behaviour
115
+
116
+ - Intermediate distillates (fan-out mode) live only in memory; they are not saved.
117
+ - Reconstruction files (`--validate` mode) are temporary; delete them after the validation report is written.
118
+ - The validation report itself persists alongside the distillate.