oh-my-codex-cli 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (292)
  1. package/.agent/skills/agent-kb/HOW_TO_USE.md +428 -0
  2. package/.agent/skills/agent-kb/README.md +46 -0
  3. package/.agent/skills/agent-kb/SKILL.md +128 -0
  4. package/.agent/skills/agent-kb/references/intelligent-analysis-explained.md +333 -0
  5. package/.agent/skills/agent-kb/references/query-optimization.md +225 -0
  6. package/.agent/skills/aireview/SKILL.md +704 -0
  7. package/.agent/skills/analyze/SKILL.md +81 -0
  8. package/.agent/skills/architect-planner/HOW_TO_USE.md +238 -0
  9. package/.agent/skills/architect-planner/README.md +41 -0
  10. package/.agent/skills/architect-planner/SKILL.md +539 -0
  11. package/.agent/skills/auto-mbti/SKILL.md +291 -0
  12. package/.agent/skills/autopilot/SKILL.md +222 -0
  13. package/.agent/skills/backend-patterns/SKILL.md +602 -0
  14. package/.agent/skills/bdd-generator/README.md +78 -0
  15. package/.agent/skills/bdd-generator/SKILL.md +436 -0
  16. package/.agent/skills/brainstorming/HOW_TO_USE.md +289 -0
  17. package/.agent/skills/brainstorming/README.md +41 -0
  18. package/.agent/skills/brainstorming/SKILL.md +165 -0
  19. package/.agent/skills/build-fix/SKILL.md +190 -0
  20. package/.agent/skills/cancel/SKILL.md +658 -0
  21. package/.agent/skills/checkpoint/SKILL.md +94 -0
  22. package/.agent/skills/code-review/SKILL.md +273 -0
  23. package/.agent/skills/coding-standards/SKILL.md +535 -0
  24. package/.agent/skills/conductor/SKILL.md +128 -0
  25. package/.agent/skills/conductor/commands/conductor/implement.toml +358 -0
  26. package/.agent/skills/conductor/commands/conductor/newTrack.toml +142 -0
  27. package/.agent/skills/conductor/commands/conductor/revert.toml +123 -0
  28. package/.agent/skills/conductor/commands/conductor/setup.toml +429 -0
  29. package/.agent/skills/conductor/commands/conductor/status.toml +57 -0
  30. package/.agent/skills/conductor/scripts/install.sh +89 -0
  31. package/.agent/skills/conductor/templates/code_styleguides/csharp.md +115 -0
  32. package/.agent/skills/conductor/templates/code_styleguides/dart.md +238 -0
  33. package/.agent/skills/conductor/templates/code_styleguides/general.md +23 -0
  34. package/.agent/skills/conductor/templates/code_styleguides/go.md +48 -0
  35. package/.agent/skills/conductor/templates/code_styleguides/html-css.md +49 -0
  36. package/.agent/skills/conductor/templates/code_styleguides/javascript.md +51 -0
  37. package/.agent/skills/conductor/templates/code_styleguides/python.md +37 -0
  38. package/.agent/skills/conductor/templates/code_styleguides/typescript.md +43 -0
  39. package/.agent/skills/conductor/templates/rules/README.md +23 -0
  40. package/.agent/skills/conductor/templates/rules/agents.md +49 -0
  41. package/.agent/skills/conductor/templates/rules/coding-style.md +70 -0
  42. package/.agent/skills/conductor/templates/rules/dev.md +20 -0
  43. package/.agent/skills/conductor/templates/rules/git-workflow.md +45 -0
  44. package/.agent/skills/conductor/templates/rules/hooks.md +6 -0
  45. package/.agent/skills/conductor/templates/rules/patterns.md +55 -0
  46. package/.agent/skills/conductor/templates/rules/performance.md +47 -0
  47. package/.agent/skills/conductor/templates/rules/research.md +26 -0
  48. package/.agent/skills/conductor/templates/rules/review.md +22 -0
  49. package/.agent/skills/conductor/templates/rules/security.md +36 -0
  50. package/.agent/skills/conductor/templates/rules/testing.md +30 -0
  51. package/.agent/skills/conductor/templates/workflow.md +333 -0
  52. package/.agent/skills/consensus/HOW_TO_USE.md +191 -0
  53. package/.agent/skills/consensus/README.md +41 -0
  54. package/.agent/skills/consensus/SKILL.md +317 -0
  55. package/.agent/skills/content-research-writer/SKILL.md +537 -0
  56. package/.agent/skills/debug-analysis/SKILL.md +331 -0
  57. package/.agent/skills/deepinit/SKILL.md +347 -0
  58. package/.agent/skills/deepsearch/SKILL.md +56 -0
  59. package/.agent/skills/doctor/SKILL.md +158 -0
  60. package/.agent/skills/drawio/EXAMPLES.md +382 -0
  61. package/.agent/skills/drawio/QUICK_START.md +237 -0
  62. package/.agent/skills/drawio/README.md +315 -0
  63. package/.agent/skills/drawio/SETUP_GUIDE.md +254 -0
  64. package/.agent/skills/drawio/SKILL.md +1176 -0
  65. package/.agent/skills/e2e/SKILL.md +396 -0
  66. package/.agent/skills/ecomode/SKILL.md +160 -0
  67. package/.agent/skills/electron-driver/SKILL.md +144 -0
  68. package/.agent/skills/electron-driver/scripts/driver-template.js +71 -0
  69. package/.agent/skills/eval/SKILL.md +140 -0
  70. package/.agent/skills/eval-harness/SKILL.md +242 -0
  71. package/.agent/skills/evolve/SKILL.md +213 -0
  72. package/.agent/skills/frontend-design/SKILL.md +42 -0
  73. package/.agent/skills/frontend-patterns/SKILL.md +646 -0
  74. package/.agent/skills/frontend-ui-ux/SKILL.md +70 -0
  75. package/.agent/skills/git-master/SKILL.md +75 -0
  76. package/.agent/skills/help/SKILL.md +89 -0
  77. package/.agent/skills/iterative-retrieval/SKILL.md +217 -0
  78. package/.agent/skills/local-skills-setup/SKILL.md +483 -0
  79. package/.agent/skills/log-analyzer/SKILL.md +187 -0
  80. package/.agent/skills/mcp-setup/SKILL.md +226 -0
  81. package/.agent/skills/multi-model-research/HOW_TO_USE.md +614 -0
  82. package/.agent/skills/multi-model-research/README.md +233 -0
  83. package/.agent/skills/multi-model-research/SKILL.md +541 -0
  84. package/.agent/skills/multi-model-research/references/troubleshooting.md +415 -0
  85. package/.agent/skills/note/SKILL.md +80 -0
  86. package/.agent/skills/omc-setup/SKILL.md +219 -0
  87. package/.agent/skills/orchestrate/SKILL.md +620 -0
  88. package/.agent/skills/patent-workflow/IMPLEMENTATION_SUMMARY.md +500 -0
  89. package/.agent/skills/patent-workflow/README.md +455 -0
  90. package/.agent/skills/patent-workflow/SKILL.md +1036 -0
  91. package/.agent/skills/patent-workflow/tools/irr_checker.py +260 -0
  92. package/.agent/skills/patent-workflow/tools/sample_terminology.json +49 -0
  93. package/.agent/skills/patent-workflow/tools/term_checker.py +355 -0
  94. package/.agent/skills/pattern-recognition/SKILL.md +792 -0
  95. package/.agent/skills/pipeline/SKILL.md +448 -0
  96. package/.agent/skills/plan/SKILL.md +309 -0
  97. package/.agent/skills/planning-methodology/SKILL.md +370 -0
  98. package/.agent/skills/planning-with-files/SKILL.md +210 -0
  99. package/.agent/skills/planning-with-files/examples.md +202 -0
  100. package/.agent/skills/planning-with-files/reference.md +218 -0
  101. package/.agent/skills/planning-with-files/scripts/check-complete.ps1 +42 -0
  102. package/.agent/skills/planning-with-files/scripts/check-complete.sh +44 -0
  103. package/.agent/skills/planning-with-files/scripts/init-session.ps1 +120 -0
  104. package/.agent/skills/planning-with-files/scripts/init-session.sh +120 -0
  105. package/.agent/skills/planning-with-files/scripts/session-catchup.py +208 -0
  106. package/.agent/skills/planning-with-files/templates/findings.md +95 -0
  107. package/.agent/skills/planning-with-files/templates/progress.md +114 -0
  108. package/.agent/skills/planning-with-files/templates/task_plan.md +132 -0
  109. package/.agent/skills/project-analyze/CLAUDE.md +18 -0
  110. package/.agent/skills/project-analyze/HOW_TO_USE.md +145 -0
  111. package/.agent/skills/project-analyze/README.md +42 -0
  112. package/.agent/skills/project-analyze/SKILL.md +289 -0
  113. package/.agent/skills/project-analyze/SKILL.md.backup +287 -0
  114. package/.agent/skills/project-analyze/SKILL.md.backup_20260105_093646 +287 -0
  115. package/.agent/skills/project-analyze/assets/analysis-report-template.md +433 -0
  116. package/.agent/skills/project-analyze/references/analysis-patterns.md +422 -0
  117. package/.agent/skills/project-analyze/references/projectmind-explained.md +535 -0
  118. package/.agent/skills/project-session-manager/SKILL.md +428 -0
  119. package/.agent/skills/project-session-manager/lib/config.sh +86 -0
  120. package/.agent/skills/project-session-manager/lib/parse.sh +121 -0
  121. package/.agent/skills/project-session-manager/lib/session.sh +132 -0
  122. package/.agent/skills/project-session-manager/lib/tmux.sh +103 -0
  123. package/.agent/skills/project-session-manager/lib/worktree.sh +171 -0
  124. package/.agent/skills/project-session-manager/psm.sh +629 -0
  125. package/.agent/skills/project-session-manager/templates/feature.md +56 -0
  126. package/.agent/skills/project-session-manager/templates/issue-fix.md +57 -0
  127. package/.agent/skills/project-session-manager/templates/pr-review.md +65 -0
  128. package/.agent/skills/project-session-manager/templates/projects.json +19 -0
  129. package/.agent/skills/quality-check/HOW_TO_USE.md +171 -0
  130. package/.agent/skills/quality-check/README.md +50 -0
  131. package/.agent/skills/quality-check/SKILL.md +240 -0
  132. package/.agent/skills/quality-check/SKILL.md.backup +238 -0
  133. package/.agent/skills/quality-check/SKILL.md.backup_20260105_093646 +238 -0
  134. package/.agent/skills/quality-check/assets/quality-report-template.md +437 -0
  135. package/.agent/skills/quality-check/references/refactoring-patterns.md +550 -0
  136. package/.agent/skills/quality-check/references/scoring-criteria.md +454 -0
  137. package/.agent/skills/quality-validation/SKILL.md +519 -0
  138. package/.agent/skills/quality-validation/SKILL.md.backup +573 -0
  139. package/.agent/skills/quality-validation/SKILL.md.backup_20260105_093646 +573 -0
  140. package/.agent/skills/ralph/SKILL.md +236 -0
  141. package/.agent/skills/ralph-init/SKILL.md +78 -0
  142. package/.agent/skills/ralplan/SKILL.md +58 -0
  143. package/.agent/skills/refactor-clean/SKILL.md +49 -0
  144. package/.agent/skills/release/SKILL.md +84 -0
  145. package/.agent/skills/research/SKILL.md +526 -0
  146. package/.agent/skills/research-methodology/SKILL.md +268 -0
  147. package/.agent/skills/review/SKILL.md +53 -0
  148. package/.agent/skills/security-review/SKILL.md +509 -0
  149. package/.agent/skills/security-review/cloud-infrastructure-security.md +361 -0
  150. package/.agent/skills/setup-pm/SKILL.md +102 -0
  151. package/.agent/skills/skill/SKILL.md +424 -0
  152. package/.agent/skills/skill-create/SKILL.md +209 -0
  153. package/.agent/skills/skill-debugger/HOW_TO_USE.md +244 -0
  154. package/.agent/skills/skill-debugger/README.md +44 -0
  155. package/.agent/skills/skill-debugger/SKILL.md +326 -0
  156. package/.agent/skills/skill-debugger/diagnostic_checklist.md +115 -0
  157. package/.agent/skills/skill-development/SKILL.md +661 -0
  158. package/.agent/skills/skill-development/references/skill-creator-original.md +209 -0
  159. package/.agent/skills/skill-doc-generator/README.md +37 -0
  160. package/.agent/skills/skill-doc-generator/SKILL.md +331 -0
  161. package/.agent/skills/skill-quality-analyzer/HOW_TO_USE.md +243 -0
  162. package/.agent/skills/skill-quality-analyzer/README.md +61 -0
  163. package/.agent/skills/skill-quality-analyzer/SKILL.md +247 -0
  164. package/.agent/skills/skill-quality-analyzer/analyzer.py +209 -0
  165. package/.agent/skills/skill-quality-analyzer/expected_output.json +81 -0
  166. package/.agent/skills/skill-quality-analyzer/sample_input.json +9 -0
  167. package/.agent/skills/skill-tester/README.md +46 -0
  168. package/.agent/skills/skill-tester/SKILL.md +345 -0
  169. package/.agent/skills/start-dev/SKILL.md +701 -0
  170. package/.agent/skills/swarm/SKILL.md +691 -0
  171. package/.agent/skills/task-kb-lookup/SKILL.md +211 -0
  172. package/.agent/skills/task-kb-record/SKILL.md +417 -0
  173. package/.agent/skills/tdd/SKILL.md +446 -0
  174. package/.agent/skills/tdd-generator/DEMO.md +516 -0
  175. package/.agent/skills/tdd-generator/README.md +89 -0
  176. package/.agent/skills/tdd-generator/SKILL.md +278 -0
  177. package/.agent/skills/tdd-workflow/SKILL.md +424 -0
  178. package/.agent/skills/test-coverage/SKILL.md +48 -0
  179. package/.agent/skills/thinkdeep/HOW_TO_USE.md +183 -0
  180. package/.agent/skills/thinkdeep/README.md +41 -0
  181. package/.agent/skills/thinkdeep/SKILL.md +343 -0
  182. package/.agent/skills/ui-ux-pro-max/SKILL.md +228 -0
  183. package/.agent/skills/ui-ux-pro-max/data/charts.csv +26 -0
  184. package/.agent/skills/ui-ux-pro-max/data/colors.csv +97 -0
  185. package/.agent/skills/ui-ux-pro-max/data/landing.csv +31 -0
  186. package/.agent/skills/ui-ux-pro-max/data/products.csv +97 -0
  187. package/.agent/skills/ui-ux-pro-max/data/prompts.csv +24 -0
  188. package/.agent/skills/ui-ux-pro-max/data/stacks/flutter.csv +53 -0
  189. package/.agent/skills/ui-ux-pro-max/data/stacks/html-tailwind.csv +56 -0
  190. package/.agent/skills/ui-ux-pro-max/data/stacks/nextjs.csv +53 -0
  191. package/.agent/skills/ui-ux-pro-max/data/stacks/react-native.csv +52 -0
  192. package/.agent/skills/ui-ux-pro-max/data/stacks/react.csv +54 -0
  193. package/.agent/skills/ui-ux-pro-max/data/stacks/svelte.csv +54 -0
  194. package/.agent/skills/ui-ux-pro-max/data/stacks/swiftui.csv +51 -0
  195. package/.agent/skills/ui-ux-pro-max/data/stacks/vue.csv +50 -0
  196. package/.agent/skills/ui-ux-pro-max/data/styles.csv +59 -0
  197. package/.agent/skills/ui-ux-pro-max/data/typography.csv +58 -0
  198. package/.agent/skills/ui-ux-pro-max/data/ux-guidelines.csv +100 -0
  199. package/.agent/skills/ui-ux-pro-max/scripts/core.py +236 -0
  200. package/.agent/skills/ui-ux-pro-max/scripts/search.py +61 -0
  201. package/.agent/skills/ultrapilot/SKILL.md +647 -0
  202. package/.agent/skills/ultraqa/SKILL.md +152 -0
  203. package/.agent/skills/ultrawork/SKILL.md +123 -0
  204. package/.agent/skills/update-codemaps/SKILL.md +38 -0
  205. package/.agent/skills/update-docs/SKILL.md +52 -0
  206. package/.agent/skills/verification-loop/SKILL.md +140 -0
  207. package/.agent/skills/verify/SKILL.md +80 -0
  208. package/.agent/skills/writer-memory/SKILL.md +459 -0
  209. package/.agent/skills/writer-memory/lib/character-tracker.ts +338 -0
  210. package/.agent/skills/writer-memory/lib/memory-manager.ts +804 -0
  211. package/.agent/skills/writer-memory/lib/relationship-graph.ts +400 -0
  212. package/.agent/skills/writer-memory/lib/scene-organizer.ts +544 -0
  213. package/.agent/skills/writer-memory/lib/synopsis-builder.ts +339 -0
  214. package/.agent/skills/writer-memory/templates/synopsis-template.md +46 -0
  215. package/.governance/skill-lint.allowlist +4 -0
  216. package/.governance/skill-llm.allowlist +4 -0
  217. package/AGENTS.md +59 -0
  218. package/LICENSE +21 -0
  219. package/README.md +169 -0
  220. package/README.zh.md +145 -0
  221. package/bin/omcodex.js +8 -0
  222. package/commands/conductor/implement.toml +358 -0
  223. package/commands/conductor/newTrack.toml +142 -0
  224. package/commands/conductor/revert.toml +123 -0
  225. package/commands/conductor/setup.toml +429 -0
  226. package/commands/conductor/status.toml +57 -0
  227. package/docs/ALIGNMENT.md +40 -0
  228. package/docs/CODEX.md +133 -0
  229. package/docs/NOTIFY.md +81 -0
  230. package/docs/SKILL_GOVERNANCE.md +72 -0
  231. package/docs/SKILL_GOVERNANCE_FRAMEWORK.md +182 -0
  232. package/docs/SKILL_GOVERNANCE_FRAMEWORK.zh.md +170 -0
  233. package/package.json +50 -0
  234. package/prompts/architect.md +105 -0
  235. package/prompts/executor.md +134 -0
  236. package/prompts/planner.md +113 -0
  237. package/scripts/check-skill-governance.sh +84 -0
  238. package/scripts/check-skill-llm-governance.js +302 -0
  239. package/scripts/eval-skills.js +217 -0
  240. package/scripts/generate-catalog-docs.js +95 -0
  241. package/scripts/generate-codex-mcp-config.sh +22 -0
  242. package/scripts/install-codex-force.sh +5 -0
  243. package/scripts/install-codex-incremental.sh +5 -0
  244. package/scripts/install-codex.sh +79 -0
  245. package/scripts/notify-dispatch.js +15 -0
  246. package/scripts/setup-package-manager.js +137 -0
  247. package/src/catalog/generated/public-catalog.json +547 -0
  248. package/src/catalog/manifest.json +542 -0
  249. package/src/catalog/reader.js +43 -0
  250. package/src/catalog/schema.js +79 -0
  251. package/src/cli/doctor.js +62 -0
  252. package/src/cli/index.js +85 -0
  253. package/src/cli/notify.js +127 -0
  254. package/src/cli/route.js +43 -0
  255. package/src/cli/setup.js +155 -0
  256. package/src/cli/team.js +125 -0
  257. package/src/config/generator.js +119 -0
  258. package/src/mcp/memory-server.js +241 -0
  259. package/src/mcp/state-server.js +112 -0
  260. package/src/mcp/trace-server.js +168 -0
  261. package/src/notify/dispatch.js +74 -0
  262. package/src/notify/extensibility/dispatcher.js +113 -0
  263. package/src/notify/extensibility/events.js +15 -0
  264. package/src/notify/extensibility/loader.js +54 -0
  265. package/src/router/skill-router.js +90 -0
  266. package/src/team/auto-advance.js +72 -0
  267. package/src/team/orchestrator.js +82 -0
  268. package/src/team/state-store.js +33 -0
  269. package/src/utils/paths.js +33 -0
  270. package/templates/AGENTS.md +15 -0
  271. package/templates/catalog-manifest.json +542 -0
  272. package/templates/code_styleguides/csharp.md +115 -0
  273. package/templates/code_styleguides/dart.md +238 -0
  274. package/templates/code_styleguides/general.md +23 -0
  275. package/templates/code_styleguides/go.md +48 -0
  276. package/templates/code_styleguides/html-css.md +49 -0
  277. package/templates/code_styleguides/javascript.md +51 -0
  278. package/templates/code_styleguides/python.md +37 -0
  279. package/templates/code_styleguides/typescript.md +43 -0
  280. package/templates/rules/README.md +23 -0
  281. package/templates/rules/agents.md +49 -0
  282. package/templates/rules/coding-style.md +70 -0
  283. package/templates/rules/dev.md +20 -0
  284. package/templates/rules/git-workflow.md +45 -0
  285. package/templates/rules/notify.md +6 -0
  286. package/templates/rules/patterns.md +55 -0
  287. package/templates/rules/performance.md +47 -0
  288. package/templates/rules/research.md +26 -0
  289. package/templates/rules/review.md +22 -0
  290. package/templates/rules/security.md +36 -0
  291. package/templates/rules/testing.md +30 -0
  292. package/templates/workflow.md +333 -0
@@ -0,0 +1,71 @@
+ const { chromium } = require('playwright-core');
+
+ /**
+  * Electron Driver Template (Attach Mode)
+  *
+  * Usage:
+  * 1. Ensure Electron is running with --remote-debugging-port=<PORT>
+  * 2. Set CDP_PORT env var if not 9222.
+  * 3. Run with `node script.cjs`
+  */
+
+ const CDP_PORT = process.env.CDP_PORT || 9222;
+ const CDP_URL = `http://localhost:${CDP_PORT}`;
+
+ async function run() {
+   let browser;
+   try {
+     // 1. Connect to the running Electron app
+     // console.log(`Connecting to Electron at ${CDP_URL}...`);
+     browser = await chromium.connectOverCDP(CDP_URL);
+
+     // 2. Find the right context and page
+     // Electron apps typically have one default context.
+     const context = browser.contexts()[0];
+     if (!context) throw new Error('No browser context found.');
+
+     const pages = context.pages();
+
+     // FILTER: Ignore DevTools windows
+     const appPage = pages.find(p => !p.url().startsWith('devtools://'));
+
+     if (!appPage) {
+       throw new Error(`No app window found. Open pages: ${pages.map(p => p.url()).join(', ')}`);
+     }
+
+     // console.log(`Attached to page: "${await appPage.title()}"`);
+
+     // ---------------------------------------------------------
+     // INJECTED LOGIC START
+     // ---------------------------------------------------------
+
+     // Example Modern Usage:
+     // await appPage.getByRole('button', { name: 'Save' }).click();
+     // const status = await appPage.locator('.status-bar').textContent();
+
+     console.log(JSON.stringify({
+       status: 'success',
+       title: await appPage.title(),
+       url: appPage.url(),
+       message: 'Electron Driver connected successfully.'
+     }));
+
+     // ---------------------------------------------------------
+     // INJECTED LOGIC END
+     // ---------------------------------------------------------
+
+   } catch (error) {
+     console.error(JSON.stringify({
+       status: 'error',
+       message: error.message,
+       stack: error.stack
+     }));
+     process.exit(1);
+   } finally {
+     if (browser) {
+       await browser.close();
+     }
+   }
+ }
+
+ run();
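For illustration, the DevTools filter in the template above can be exercised without a running Electron app. This is a minimal sketch under stated assumptions: `pickAppPage` and the stand-in page objects are hypothetical names that do not appear in the package; only the `devtools://` prefix check and the error message are taken from the template.

```javascript
// Hypothetical helper mirroring the template's "FILTER: Ignore DevTools windows" step.
// It accepts any objects exposing a url() method, so it runs without playwright-core.
function pickAppPage(pages) {
  const appPage = pages.find((p) => !p.url().startsWith('devtools://'));
  if (!appPage) {
    const open = pages.map((p) => p.url()).join(', ');
    throw new Error(`No app window found. Open pages: ${open}`);
  }
  return appPage;
}

// Stand-in page objects with assumed URLs, for illustration only.
const fakePages = [
  { url: () => 'devtools://devtools/bundled/inspector.html' },
  { url: () => 'file:///app/index.html' },
];

console.log(pickAppPage(fakePages).url());
```

Keeping the selection logic in a pure function like this would let it be checked in isolation, while the attach step itself still needs a live CDP endpoint.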
@@ -0,0 +1,140 @@
+ ---
+ name: eval
+ description: Imported from everything-codex command eval
+ ---
+
+ # Eval Command
+
+
+ ## Native Subagent Protocol (Codex)
+
+ Codex supports native subagents. Delegate with `spawn_agent`, coordinate with `send_input`, collect via `wait`, and clean up with `close_agent`.
+
+ Execution preference:
+ 1. Use native subagents first for independent workstreams (parallel when possible).
+ 2. Merge results in main thread and run final verification.
+ 3. Fallback only when delegation is blocked: use the `[ANALYST]`/`[ARCHITECT]`/`[EXECUTOR]`/`[REVIEWER]` structure in a single response.
+
+ Minimal orchestration pattern:
+ ```text
+ spawn_agent -> send_input (optional) -> wait -> close_agent
+ ```
+
+ Manage eval-driven development workflow.
+
+ ## Usage
+
+ `$eval [define|check|report|list] [feature-name]`
+
+ ## Define Evals
+
+ `$eval define feature-name`
+
+ Create a new eval definition:
+
+ 1. Create `.codex/evals/feature-name.md` with template:
+
+ ```markdown
+ ## EVAL: feature-name
+ Created: $(date)
+
+ ### Capability Evals
+ - [ ] [Description of capability 1]
+ - [ ] [Description of capability 2]
+
+ ### Regression Evals
+ - [ ] [Existing behavior 1 still works]
+ - [ ] [Existing behavior 2 still works]
+
+ ### Success Criteria
+ - pass@3 > 90% for capability evals
+ - pass^3 = 100% for regression evals
+ ```
+
+ 2. Prompt user to fill in specific criteria
+
+ ## Check Evals
+
+ `$eval check feature-name`
+
+ Run evals for a feature:
+
+ 1. Read eval definition from `.codex/evals/feature-name.md`
+ 2. For each capability eval:
+    - Attempt to verify criterion
+    - Record PASS/FAIL
+    - Log attempt in `.codex/evals/feature-name.log`
+ 3. For each regression eval:
+    - Run relevant tests
+    - Compare against baseline
+    - Record PASS/FAIL
+ 4. Report current status:
+
+ ```
+ EVAL CHECK: feature-name
+ ========================
+ Capability: X/Y passing
+ Regression: X/Y passing
+ Status: IN PROGRESS / READY
+ ```
+
+ ## Report Evals
+
+ `$eval report feature-name`
+
+ Generate comprehensive eval report:
+
+ ```
+ EVAL REPORT: feature-name
+ =========================
+ Generated: $(date)
+
+ CAPABILITY EVALS
+ ----------------
+ [eval-1]: PASS (pass@1)
+ [eval-2]: PASS (pass@2) - required retry
+ [eval-3]: FAIL - see notes
+
+ REGRESSION EVALS
+ ----------------
+ [test-1]: PASS
+ [test-2]: PASS
+ [test-3]: PASS
+
+ METRICS
+ -------
+ Capability pass@1: 67%
+ Capability pass@3: 100%
+ Regression pass^3: 100%
+
+ NOTES
+ -----
+ [Any issues, edge cases, or observations]
+
+ RECOMMENDATION
+ --------------
+ [SHIP / NEEDS WORK / BLOCKED]
+ ```
+
+ ## List Evals
+
+ `$eval list`
+
+ Show all eval definitions:
+
+ ```
+ EVAL DEFINITIONS
+ ================
+ feature-auth [3/5 passing] IN PROGRESS
+ feature-search [5/5 passing] READY
+ feature-export [0/4 passing] NOT STARTED
+ ```
+
+ ## Arguments
+
+ $ARGUMENTS:
+ - `define <name>` - Create new eval definition
+ - `check <name>` - Run and check evals
+ - `report <name>` - Generate full report
+ - `list` - Show all evals
+ - `clean` - Remove old eval logs (keeps last 10 runs)
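The "Check Evals" steps in the skill above could be sketched roughly as follows. This is an illustrative sketch, not the package's implementation: `summarizeEval` and `renderCheck` are hypothetical names, and treating a ticked checkbox (`[x]`) as a passing eval is an assumption, since the skill only says that PASS/FAIL is recorded and logged.

```javascript
// Hypothetical parser: counts checkbox items under each "### ..." heading
// of an eval definition written in the template format shown in the skill.
function summarizeEval(markdown) {
  const counts = {};
  let section = null;
  for (const line of markdown.split('\n')) {
    const heading = line.match(/^###\s+(.*)$/);
    if (heading) {
      section = heading[1].trim();
      continue;
    }
    const item = line.match(/^\s*-\s+\[( |x|X)\]\s+/);
    if (item && section) {
      counts[section] = counts[section] || { passing: 0, total: 0 };
      counts[section].total += 1;
      // Assumption: a ticked box means the eval currently passes.
      if (item[1] !== ' ') counts[section].passing += 1;
    }
  }
  return counts;
}

// Renders the status block in the shape of the "EVAL CHECK" example.
function renderCheck(name, counts) {
  const cap = counts['Capability Evals'] || { passing: 0, total: 0 };
  const reg = counts['Regression Evals'] || { passing: 0, total: 0 };
  const anyEvals = cap.total + reg.total > 0;
  const ready = anyEvals && cap.passing === cap.total && reg.passing === reg.total;
  const title = `EVAL CHECK: ${name}`;
  return [
    title,
    '='.repeat(title.length),
    `Capability: ${cap.passing}/${cap.total} passing`,
    `Regression: ${reg.passing}/${reg.total} passing`,
    `Status: ${ready ? 'READY' : 'IN PROGRESS'}`,
  ].join('\n');
}

// Sample definition with made-up criteria, for illustration only.
const sample = [
  '## EVAL: feature-auth',
  '### Capability Evals',
  '- [x] User can log in',
  '- [ ] User can reset password',
  '### Regression Evals',
  '- [x] Public routes still accessible',
].join('\n');

console.log(renderCheck('feature-auth', summarizeEval(sample)));
```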
@@ -0,0 +1,242 @@
+ ---
+ name: eval-harness
+ description: Formal evaluation framework for Codex sessions implementing eval-driven development (EDD) principles
+ tools: Read, Write, Edit, Bash, Grep, Glob
+ ---
+
+ # Eval Harness Skill
+
+
+ ## Native Subagent Protocol (Codex)
+
+ Codex supports native subagents. Delegate with `spawn_agent`, coordinate with `send_input`, collect via `wait`, and clean up with `close_agent`.
+
+ Execution preference:
+ 1. Use native subagents first for independent workstreams (parallel when possible).
+ 2. Merge results in main thread and run final verification.
+ 3. Fallback only when delegation is blocked: use the `[ANALYST]`/`[ARCHITECT]`/`[EXECUTOR]`/`[REVIEWER]` structure in a single response.
+
+ Minimal orchestration pattern:
+ ```text
+ spawn_agent -> send_input (optional) -> wait -> close_agent
+ ```
+
+ A formal evaluation framework for Codex sessions, implementing eval-driven development (EDD) principles.
+
+ ## Philosophy
+
+ Eval-Driven Development treats evals as the "unit tests of AI development":
+ - Define expected behavior BEFORE implementation
+ - Run evals continuously during development
+ - Track regressions with each change
+ - Use pass@k metrics for reliability measurement
+
+ ## Eval Types
+
+ ### Capability Evals
+ Test if Codex can do something it couldn't before:
+ ```markdown
+ [CAPABILITY EVAL: feature-name]
+ Task: Description of what Codex should accomplish
+ Success Criteria:
+   - [ ] Criterion 1
+   - [ ] Criterion 2
+   - [ ] Criterion 3
+ Expected Output: Description of expected result
+ ```
+
+ ### Regression Evals
+ Ensure changes don't break existing functionality:
+ ```markdown
+ [REGRESSION EVAL: feature-name]
+ Baseline: SHA or checkpoint name
+ Tests:
+   - existing-test-1: PASS/FAIL
+   - existing-test-2: PASS/FAIL
+   - existing-test-3: PASS/FAIL
+ Result: X/Y passed (previously Y/Y)
+ ```
+
+ ## Grader Types
+
+ ### 1. Code-Based Grader
+ Deterministic checks using code:
+ ```bash
+ # Check if file contains expected pattern
+ grep -q "export function handleAuth" src/auth.ts && echo "PASS" || echo "FAIL"
+
+ # Check if tests pass
+ npm test -- --testPathPattern="auth" && echo "PASS" || echo "FAIL"
+
+ # Check if build succeeds
+ npm run build && echo "PASS" || echo "FAIL"
+ ```
+
+ ### 2. Model-Based Grader
+ Use Codex to evaluate open-ended outputs:
+ ```markdown
+ [MODEL GRADER PROMPT]
+ Evaluate the following code change:
+ 1. Does it solve the stated problem?
+ 2. Is it well-structured?
+ 3. Are edge cases handled?
+ 4. Is error handling appropriate?
+
+ Score: 1-5 (1=poor, 5=excellent)
+ Reasoning: [explanation]
+ ```
+
+ ### 3. Human Grader
+ Flag for manual review:
+ ```markdown
+ [HUMAN REVIEW REQUIRED]
+ Change: Description of what changed
+ Reason: Why human review is needed
+ Risk Level: LOW/MEDIUM/HIGH
+ ```
+
+ ## Metrics
+
+ ### pass@k
+ "At least one success in k attempts"
+ - pass@1: First attempt success rate
+ - pass@3: Success within 3 attempts
+ - Typical target: pass@3 > 90%
+
+ ### pass^k
+ "All k trials succeed"
+ - Higher bar for reliability
+ - pass^3: 3 consecutive successes
+ - Use for critical paths
+
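The two metrics defined above can be computed from recorded trial outcomes roughly as follows. This is a minimal sketch of the plain reading of those definitions (the fraction of evals that meet each condition), not a statistical estimator; `passAtK`, `passHatK`, and the trial data are hypothetical, and counting an eval with fewer than k recorded trials as failing pass^k is an assumption.

```javascript
// Each eval maps to an ordered list of trial outcomes (true = PASS).

// pass@k: share of evals with at least one success in their first k attempts.
function passAtK(trialsByEval, k) {
  const runs = Object.values(trialsByEval);
  if (runs.length === 0) return 0;
  const hits = runs.filter((t) => t.slice(0, k).some(Boolean)).length;
  return hits / runs.length;
}

// pass^k: share of evals whose first k trials all succeed.
// Assumption: an eval with fewer than k recorded trials does not qualify.
function passHatK(trialsByEval, k) {
  const runs = Object.values(trialsByEval);
  if (runs.length === 0) return 0;
  const hits = runs.filter((t) => t.length >= k && t.slice(0, k).every(Boolean)).length;
  return hits / runs.length;
}

// Illustrative data only: two evals pass first time, one needs a retry.
const capability = {
  'create-user': [true],
  'validate-email': [false, true],
  'hash-password': [true],
};
const regression = {
  'login-flow': [true, true, true],
  'logout-flow': [true, true, false],
};

console.log(`pass@1: ${Math.round(passAtK(capability, 1) * 100)}%`);
console.log(`pass@3: ${Math.round(passAtK(capability, 3) * 100)}%`);
console.log(`pass^3: ${Math.round(passHatK(regression, 3) * 100)}%`);
```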
+ ## Eval Workflow
+
+ ### 1. Define (Before Coding)
+ ```markdown
+ ## EVAL DEFINITION: feature-xyz
+
+ ### Capability Evals
+ 1. Can create new user account
+ 2. Can validate email format
+ 3. Can hash password securely
+
+ ### Regression Evals
+ 1. Existing login still works
+ 2. Session management unchanged
+ 3. Logout flow intact
+
+ ### Success Metrics
+ - pass@3 > 90% for capability evals
+ - pass^3 = 100% for regression evals
+ ```
+
+ ### 2. Implement
+ Write code to pass the defined evals.
+
+ ### 3. Evaluate
+ ```bash
+ # Run capability evals
+ [Run each capability eval, record PASS/FAIL]
+
+ # Run regression evals
+ npm test -- --testPathPattern="existing"
+
+ # Generate report
+ ```
+
+ ### 4. Report
+ ```markdown
+ EVAL REPORT: feature-xyz
+ ========================
+
+ Capability Evals:
+   create-user: PASS (pass@1)
+   validate-email: PASS (pass@2)
+   hash-password: PASS (pass@1)
+   Overall: 3/3 passed
+
+ Regression Evals:
+   login-flow: PASS
+   session-mgmt: PASS
+   logout-flow: PASS
+   Overall: 3/3 passed
+
+ Metrics:
+   pass@1: 67% (2/3)
+   pass@3: 100% (3/3)
+
+ Status: READY FOR REVIEW
+ ```
+
+ ## Integration Patterns
+
+ ### Pre-Implementation
+ ```
+ $eval define feature-name
+ ```
+ Creates eval definition file at `.codex/evals/feature-name.md`
+
+ ### During Implementation
+ ```
+ $eval check feature-name
+ ```
+ Runs current evals and reports status
+
+ ### Post-Implementation
+ ```
+ $eval report feature-name
+ ```
+ Generates full eval report
+
+ ## Eval Storage
+
+ Store evals in project:
+ ```
+ .codex/
+   evals/
+     feature-xyz.md # Eval definition
+     feature-xyz.log # Eval run history
+     baseline.json # Regression baselines
+ ```
+
+ ## Best Practices
+
+ 1. **Define evals BEFORE coding** - Forces clear thinking about success criteria
+ 2. **Run evals frequently** - Catch regressions early
+ 3. **Track pass@k over time** - Monitor reliability trends
+ 4. **Use code graders when possible** - Deterministic > probabilistic
+ 5. **Human review for security** - Never fully automate security checks
+ 6. **Keep evals fast** - Slow evals don't get run
+ 7. **Version evals with code** - Evals are first-class artifacts
+
+ ## Example: Adding Authentication
+
+ ```markdown
+ ## EVAL: add-authentication
+
+ ### Phase 1: Define (10 min)
+ Capability Evals:
+ - [ ] User can register with email/password
+ - [ ] User can login with valid credentials
+ - [ ] Invalid credentials rejected with proper error
+ - [ ] Sessions persist across page reloads
+ - [ ] Logout clears session
+
+ Regression Evals:
+ - [ ] Public routes still accessible
+ - [ ] API responses unchanged
+ - [ ] Database schema compatible
+
+ ### Phase 2: Implement (varies)
+ [Write code]
+
+ ### Phase 3: Evaluate
+ Run: $eval check add-authentication
+
+ ### Phase 4: Report
+ EVAL REPORT: add-authentication
+ ==============================
+ Capability: 5/5 passed (pass@3: 100%)
+ Regression: 3/3 passed (pass^3: 100%)
+ Status: SHIP IT
+ ```
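The "Code-Based Grader" idea from the skill above (deterministic checks that yield PASS or FAIL) might look like this in Node.js. It is a sketch under stated assumptions: `runCodeGraders`, the check names, and the in-memory `source` string are hypothetical stand-ins for the `grep` and `npm` commands shown in the skill.

```javascript
// Hypothetical runner: each check is a named function returning true or false.
// A check that throws is recorded as FAIL rather than aborting the run.
function runCodeGraders(checks) {
  const results = checks.map(({ name, check }) => {
    let passed = false;
    try {
      passed = Boolean(check());
    } catch (err) {
      passed = false;
    }
    return { name, result: passed ? 'PASS' : 'FAIL' };
  });
  const passedCount = results.filter((r) => r.result === 'PASS').length;
  return { results, summary: `${passedCount}/${results.length} passed` };
}

// In-memory stand-in for a source file, for illustration only.
const source = 'export function handleAuth(req) { return true; }';

const report = runCodeGraders([
  { name: 'exports-handleAuth', check: () => source.includes('export function handleAuth') },
  { name: 'exports-logout', check: () => source.includes('export function logout') },
]);

for (const { name, result } of report.results) {
  console.log(`${name}: ${result}`);
}
console.log(`Overall: ${report.summary}`);
```

Because every check is a plain function, the same runner could wrap file reads or test commands, which keeps the grading deterministic in the sense the skill recommends.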
@@ -0,0 +1,213 @@
1
+ ---
2
+ name: evolve
3
+ description: Imported from everything-codex command evolve
4
+ ---
5
+
6
+ ---
7
+ name: evolve
8
+ description: Cluster related instincts into skills, commands, or agents
9
+ command: true
10
+ ---
11
+
12
+ # Evolve Command
13
+
14
+
15
+ ## Native Subagent Protocol (Codex)
16
+
17
+ Codex supports native subagents. Delegate with `spawn_agent`, coordinate with `send_input`, collect via `wait`, and clean up with `close_agent`.
18
+
19
+ Execution preference:
20
+ 1. Use native subagents first for independent workstreams (parallel when possible).
21
+ 2. Merge results in main thread and run final verification.
22
+ 3. Fallback only when delegation is blocked: use the `[ANALYST]`/`[ARCHITECT]`/`[EXECUTOR]`/`[REVIEWER]` structure in a single response.
23
+
24
+ Minimal orchestration pattern:
25
+ ```text
26
+ spawn_agent -> send_input (optional) -> wait -> close_agent
27
+ ```
28
+
29
+ ## Implementation
30
+
31
+ Run the instinct CLI using the Codex home path:
32
+
33
+ ```bash
34
+ python3 "${CODEX_HOME}/skills/continuous-learning-v2/scripts/instinct-cli.py" evolve [--generate]
35
+ ```
36
+
37
+ Or if `CODEX_HOME` is not set (manual installation):
38
+
39
+ ```bash
40
+ python3 ~/.codex/skills/continuous-learning-v2/scripts/instinct-cli.py evolve [--generate]
41
+ ```
42
+
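The two invocations above differ only in where the Codex home directory comes from. A small sketch of that fallback, assuming an unset or empty `CODEX_HOME` should resolve to `~/.codex` (the helper name `instinct_cli_path` is hypothetical):

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def instinct_cli_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Locate instinct-cli.py under CODEX_HOME, falling back to ~/.codex."""
    env = os.environ if env is None else env
    # An unset or empty CODEX_HOME is treated as a manual installation.
    codex_home = env.get("CODEX_HOME") or str(Path.home() / ".codex")
    return (
        Path(codex_home)
        / "skills"
        / "continuous-learning-v2"
        / "scripts"
        / "instinct-cli.py"
    )
```

Passing the environment as a parameter keeps the lookup easy to test without modifying the real process environment.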
43
+ The command analyzes instincts and clusters related ones into higher-level structures:
44
+ - **Commands**: When instincts describe user-invoked actions
45
+ - **Skills**: When instincts describe auto-triggered behaviors
46
+ - **Agents**: When instincts describe complex, multi-step processes
47
+
48
+ ## Usage
49
+
50
+ ```
51
+ $evolve # Analyze all instincts and suggest evolutions
52
+ $evolve --domain testing # Only evolve instincts in testing domain
53
+ $evolve --dry-run # Show what would be created without creating
54
+ $evolve --threshold 5 # Require 5+ related instincts to cluster
55
+ ```
56
+
57
+ ## Evolution Rules
58
+
59
+ ### → Command (User-Invoked)
60
+ When instincts describe actions a user would explicitly request:
61
+ - Multiple instincts about "when user asks to..."
62
+ - Instincts with triggers like "when creating a new X"
63
+ - Instincts that follow a repeatable sequence
64
+
65
+ Example:
66
+ - `new-table-step1`: "when adding a database table, create migration"
67
+ - `new-table-step2`: "when adding a database table, update schema"
68
+ - `new-table-step3`: "when adding a database table, regenerate types"
69
+
70
+ → Creates: `$new-table` command
71
+
72
+ ### → Skill (Auto-Triggered)
73
+ When instincts describe behaviors that should happen automatically:
74
+ - Pattern-matching triggers
75
+ - Error handling responses
76
+ - Code style enforcement
77
+
78
+ Example:
79
+ - `prefer-functional`: "when writing functions, prefer functional style"
80
+ - `use-immutable`: "when modifying state, use immutable patterns"
81
+ - `avoid-classes`: "when designing modules, avoid class-based design"
82
+
83
+ → Creates: `functional-patterns` skill
84
+
85
+ ### → Agent (Needs Depth/Isolation)
86
+ When instincts describe complex, multi-step processes that benefit from isolation:
87
+ - Debugging workflows
88
+ - Refactoring sequences
89
+ - Research tasks
90
+
91
+ Example:
92
+ - `debug-step1`: "when debugging, first check logs"
93
+ - `debug-step2`: "when debugging, isolate the failing component"
94
+ - `debug-step3`: "when debugging, create minimal reproduction"
95
+ - `debug-step4`: "when debugging, verify fix with test"
96
+
97
+ → Creates: `debugger` agent
98
+
99
+ ## What to Do
100
+
101
+ 1. Read all instincts from `~/.codex/homunculus/instincts/`
102
+ 2. Group instincts by:
103
+ - Domain similarity
104
+ - Trigger pattern overlap
105
+ - Action sequence relationship
106
+ 3. For each cluster of 3+ related instincts:
107
+ - Determine evolution type (command/skill/agent)
108
+ - Generate the appropriate file
109
+ - Save to `~/.codex/homunculus/evolved/{commands,skills,agents}/`
110
+ 4. Link evolved structure back to source instincts
111
+
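The grouping and threshold steps above can be sketched as follows. This is a simplified illustration that groups by domain only; trigger overlap and action-sequence relationships are left out, and the `Instinct` fields are assumptions rather than the real instinct file schema:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Instinct:
    """Hypothetical in-memory form of one instinct file."""
    id: str
    domain: str
    trigger: str


def cluster_by_domain(
    instincts: List[Instinct], threshold: int = 3
) -> Dict[str, List[Instinct]]:
    """Group instincts by domain and keep groups with at least `threshold` members."""
    groups: Dict[str, List[Instinct]] = defaultdict(list)
    for instinct in instincts:
        groups[instinct.domain].append(instinct)
    # Groups below the threshold are not ready to evolve and are dropped.
    return {
        domain: members
        for domain, members in groups.items()
        if len(members) >= threshold
    }


# Hypothetical sample: three database instincts and two testing instincts.
sample = [
    Instinct("new-table-migration", "database", "when adding a database table"),
    Instinct("update-schema", "database", "when adding a database table"),
    Instinct("regenerate-types", "database", "when adding a database table"),
    Instinct("write-test-first", "testing", "when fixing a bug"),
    Instinct("name-tests-clearly", "testing", "when writing tests"),
]
```

With the default threshold of 3, only the `database` group survives; the `testing` group has too few members to form a cluster.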
112
+ ## Output Format
113
+
114
+ ```
115
+ 🧬 Evolve Analysis
116
+ ==================
117
+
118
+ Found 3 clusters ready for evolution:
119
+
120
+ ## Cluster 1: Database Migration Workflow
121
+ Instincts: new-table-migration, update-schema, regenerate-types
122
+ Type: Command
123
+ Confidence: 85% (based on 12 observations)
124
+
125
+ Would create: $new-table command
126
+ Files:
127
+ - ~/.codex/homunculus/evolved/commands/new-table.md
128
+
129
+ ## Cluster 2: Functional Code Style
130
+ Instincts: prefer-functional, use-immutable, avoid-classes, pure-functions
131
+ Type: Skill
132
+ Confidence: 78% (based on 8 observations)
133
+
134
+ Would create: functional-patterns skill
135
+ Files:
136
+ - ~/.codex/homunculus/evolved/skills/functional-patterns.md
137
+
138
+ ## Cluster 3: Debugging Process
139
+ Instincts: debug-check-logs, debug-isolate, debug-reproduce, debug-verify
140
+ Type: Agent
141
+ Confidence: 72% (based on 6 observations)
142
+
143
+ Would create: debugger agent
144
+ Files:
145
+ - ~/.codex/homunculus/evolved/agents/debugger.md
146
+
147
+ ---
148
+ Run `$evolve --execute` to create these files.
149
+ ```
150
+
151
+ ## Flags
152
+
153
+ - `--execute`: Actually create the evolved structures (default is preview)
154
+ - `--dry-run`: Preview without creating anything (same as the default behavior)
155
+ - `--domain <name>`: Only evolve instincts in specified domain
156
+ - `--threshold <n>`: Minimum instincts required to form cluster (default: 3)
157
+ - `--type <command|skill|agent>`: Only create specified type
158
+
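For illustration, the flags listed above could be modeled with `argparse` as shown below. This is a sketch of the documented interface only, not the actual parser in `instinct-cli.py`; treating `--execute` and `--dry-run` as mutually exclusive is an assumption:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented evolve flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="evolve")
    # Assumption: asking to both execute and preview is contradictory.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--execute", action="store_true",
                      help="create the evolved structures (default is preview)")
    mode.add_argument("--dry-run", action="store_true",
                      help="preview without creating")
    parser.add_argument("--domain", default=None,
                        help="only evolve instincts in this domain")
    parser.add_argument("--threshold", type=int, default=3,
                        help="minimum instincts required to form a cluster")
    parser.add_argument("--type", dest="kind", default=None,
                        choices=["command", "skill", "agent"],
                        help="only create the specified type")
    return parser
```

The `--type` value is stored as `kind` to avoid shadowing Python's built-in `type` when the parsed namespace is unpacked.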
159
+ ## Generated File Format
160
+
161
+ ### Command
162
+ ```markdown
163
+ ---
164
+ name: new-table
165
+ description: Create a new database table with migration, schema update, and type generation
166
+ command: $new-table
167
+ evolved_from:
168
+ - new-table-migration
169
+ - update-schema
170
+ - regenerate-types
171
+ ---
172
+
173
+ # New Table Command
174
+
175
+ [Generated content based on clustered instincts]
176
+
177
+ ## Steps
178
+ 1. ...
179
+ 2. ...
180
+ ```
181
+
182
+ ### Skill
183
+ ```markdown
184
+ ---
185
+ name: functional-patterns
186
+ description: Enforce functional programming patterns
187
+ evolved_from:
188
+ - prefer-functional
189
+ - use-immutable
190
+ - avoid-classes
191
+ ---
192
+
193
+ # Functional Patterns Skill
194
+
195
+ [Generated content based on clustered instincts]
196
+ ```
197
+
198
+ ### Agent
199
+ ```markdown
200
+ ---
201
+ name: debugger
202
+ description: Systematic debugging agent
203
+ model: sonnet
204
+ evolved_from:
205
+ - debug-check-logs
206
+ - debug-isolate
207
+ - debug-reproduce
208
+ ---
209
+
210
+ # Debugger Agent
211
+
212
+ [Generated content based on clustered instincts]
213
+ ```
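
The three templates above share one shape: YAML frontmatter with an `evolved_from` list, followed by a title and the generated body. A minimal sketch of a renderer for that shape, assuming two-space indentation for list items (the function name and parameters are hypothetical):

```python
from typing import List, Optional


def render_evolved_file(
    name: str,
    description: str,
    evolved_from: List[str],
    title: str,
    command: Optional[str] = None,
) -> str:
    """Render frontmatter plus a placeholder body for an evolved structure."""
    lines = ["---", f"name: {name}", f"description: {description}"]
    # Only command files carry a `command:` key in the templates above.
    if command is not None:
        lines.append(f"command: {command}")
    lines.append("evolved_from:")
    lines.extend(f"  - {source}" for source in evolved_from)
    lines += [
        "---",
        "",
        f"# {title}",
        "",
        "[Generated content based on clustered instincts]",
    ]
    return "\n".join(lines) + "\n"
```

Keeping `evolved_from` in the frontmatter is what links each generated file back to its source instincts.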
@@ -0,0 +1,42 @@
1
+ ---
2
+ name: frontend-design
3
+ description: Design and implement modern frontend interfaces. Covers React, Vue, state management, routing, and component architecture patterns.
4
+ license: Complete terms in LICENSE.txt
5
+ ---
6
+
7
+ This skill guides creation of distinctive, production-grade frontend interfaces that avoid generic "AI slop" aesthetics. Implement real working code with exceptional attention to aesthetic details and creative choices.
8
+
9
+ The user provides frontend requirements: a component, page, application, or interface to build. They may include context about the purpose, audience, or technical constraints.
10
+
11
+ ## Design Thinking
12
+
13
+ Before coding, understand the context and commit to a BOLD aesthetic direction:
14
+ - **Purpose**: What problem does this interface solve? Who uses it?
15
+ - **Tone**: Pick an extreme: brutally minimal, maximalist chaos, retro-futuristic, organic/natural, luxury/refined, playful/toy-like, editorial/magazine, brutalist/raw, art deco/geometric, soft/pastel, industrial/utilitarian, etc. These are only starting points: use them for inspiration, then design something that stays true to the direction you chose.
16
+ - **Constraints**: Technical requirements (framework, performance, accessibility).
17
+ - **Differentiation**: What makes this UNFORGETTABLE? What's the one thing someone will remember?
18
+
19
+ **CRITICAL**: Choose a clear conceptual direction and execute it with precision. Bold maximalism and refined minimalism both work - the key is intentionality, not intensity.
20
+
21
+ Then implement working code (HTML/CSS/JS, React, Vue, etc.) that is:
22
+ - Production-grade and functional
23
+ - Visually striking and memorable
24
+ - Cohesive with a clear aesthetic point-of-view
25
+ - Meticulously refined in every detail
26
+
27
+ ## Frontend Aesthetics Guidelines
28
+
29
+ Focus on:
30
+ - **Typography**: Choose fonts that are beautiful, unique, and interesting. Avoid generic fonts like Arial and Inter; opt instead for distinctive, unexpected, characterful choices that elevate the frontend's aesthetics. Pair a distinctive display font with a refined body font.
31
+ - **Color & Theme**: Commit to a cohesive aesthetic. Use CSS variables for consistency. Dominant colors with sharp accents outperform timid, evenly-distributed palettes.
32
+ - **Motion**: Use animations for effects and micro-interactions. Prioritize CSS-only solutions for HTML. Use the Motion library for React when available. Focus on high-impact moments: one well-orchestrated page load with staggered reveals (animation-delay) creates more delight than scattered micro-interactions. Use scroll-triggered effects and hover states that surprise.
33
+ - **Spatial Composition**: Unexpected layouts. Asymmetry. Overlap. Diagonal flow. Grid-breaking elements. Generous negative space OR controlled density.
34
+ - **Backgrounds & Visual Details**: Create atmosphere and depth rather than defaulting to solid colors. Add contextual effects and textures that match the overall aesthetic. Apply creative forms like gradient meshes, noise textures, geometric patterns, layered transparencies, dramatic shadows, decorative borders, custom cursors, and grain overlays.
35
+
36
+ NEVER use generic AI-generated aesthetics like overused font families (Inter, Roboto, Arial, system fonts), cliched color schemes (particularly purple gradients on white backgrounds), predictable layouts and component patterns, and cookie-cutter design that lacks context-specific character.
37
+
38
+ Interpret creatively and make unexpected choices that feel genuinely designed for the context. No design should be the same. Vary between light and dark themes, different fonts, different aesthetics. NEVER converge on common choices (Space Grotesk, for example) across generations.
39
+
40
+ **IMPORTANT**: Match implementation complexity to the aesthetic vision. Maximalist designs need elaborate code with extensive animations and effects. Minimalist or refined designs need restraint, precision, and careful attention to spacing, typography, and subtle details. Elegance comes from executing the vision well.
41
+
42
+ Remember: Claude is capable of extraordinary creative work. Don't hold back; show what can truly be created when thinking outside the box and committing fully to a distinctive vision.