oh-my-workflow 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Dongwook Kim
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,178 @@
1
+ # oh-my-workflow
2
+
3
+ > Run the coding-agent CLIs you already have — `claude -p`, `codex exec` — as
4
+ > nodes in a plain-JS workflow your host agent writes. omw is the thin glue: it
5
+ > runs the script, schema-gates each node's output, and journals every step so the
6
+ > agent can read its own failure and repair its own script. (What's
7
+ > "deterministic" is scoped honestly below — the engine and `--agent fake`, not
8
+ > your script.)
9
+
10
+ ## Try it now — free, no API key
11
+
12
+ ```sh
13
+ git clone https://github.com/domuk-k/oh-my-workflow && cd oh-my-workflow
14
+ bun install
15
+ bun src/cli/omw.ts run examples/deep-research --agent fake
16
+ ```
17
+
18
+ ```json
19
+ {"confirmed":[{"topic":"a","hits":3,"verified":true},{"topic":"c","hits":5,"verified":true}],"summary":{"summary":"done","count":2}}
20
+ ```
21
+
22
+ That's the whole spine in one pass — a `--pretty` tree shows it:
23
+
24
+ ```sh
25
+ bun src/cli/omw.ts run examples/deep-research --agent fake --pretty
26
+ ```
27
+
28
+ ```
29
+ run r-… (examples/deep-research)
30
+ ▸ Scope
31
+ • call#1 [fake]
32
+ ✓ call#1
33
+ ▸ Search
34
+ • search:a [fake]
35
+ • search:b [fake]
36
+ • search:c [fake]
37
+ ✗ timeout call#3
38
+ ✓ call#4
39
+ ✓ call#2
40
+ ▸ Verify
41
+ • call#5 [fake]
42
+ • call#6 [fake]
43
+ ✓ call#5
44
+ ✓ call#6
45
+ ▸ Synthesize
46
+ • call#7 [fake]
47
+ ✓ call#7
48
+ run_end ok=true · 6 ok / 1 failed
49
+ ```
50
+
51
+ `search:a` (call#2) returns schema-invalid JSON first and self-repairs to `✓`; `search:b`
52
+ (call#3) times out and is dropped by `filter(Boolean)` — the run still ends green.
53
+
54
+ `--agent fake` is a built-in deterministic adapter — no API key, no network.
55
+ Anyone can run the full fan-out + pipeline, including a scripted
56
+ schema-fail→self-repair and a scripted timeout→drop, and get the same result
57
+ JSON on every run. Swap in `--agent claude` (after `claude login`) to run it for real.
58
+
59
+ > Once on npm this is `bunx oh-my-workflow run examples/deep-research --agent fake`
60
+ > — the example ships inside the package and resolves from there, so it runs from
61
+ > any directory. omw runs under **bun**; `npx` (Node) won't execute the TS bin.
62
+
63
+ ## What it is
64
+
65
+ You write a plain-JS orchestration script. Its nodes are **whole coding agents**
66
+ (`claude -p`, `codex exec`) — not single LLM calls. The runtime hands your script
67
+ **five hooks** and nothing else:
68
+
69
+ ```ts
70
+ export default async function (rt, args) {
71
+ rt.phase("Search");
72
+ const hits = (await rt.parallel(
73
+ args.queries.map((q) => () => rt.agent(`SEARCH: ${q}`, { schema: HIT, label: q })),
74
+ )).filter(Boolean); // agent() returns null on failure, never throws
75
+ return { hits, count: hits.length };
76
+ }
77
+ ```
78
+
79
+ - `rt.agent(prompt, opts?)` — run one coding-agent CLI node. With a `schema`, omw
80
+ extracts JSON, validates it (ajv), and **re-prompts the node with the
81
+ validation errors** up to 2 times before giving up. Returns the validated
82
+ object, or `null`. **Never throws** — the load-bearing *null-contract*.
83
+ - `rt.parallel(thunks)` — concurrent, barrier; failures become `null`.
84
+ - `rt.pipeline(items, …stages)` — each item flows through all stages independently.
85
+ - `rt.phase(title)` / `rt.log(msg)` — journal / `--pretty` side-channel only.
86
+
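The schema-gate behavior described for `rt.agent` can be sketched as below. This is a minimal illustration, not omw's source: `gatedAgent`, `AskNode`, and `validate` are hypothetical names, `validate` is a tiny stand-in for ajv that only checks `required` and primitive `type`, and the choice to return `null` immediately when the node call itself fails (rather than re-prompting) is an assumption.

```typescript
// Hypothetical names throughout; a sketch of the idea, not omw's implementation.
type Schema = {
  type: "object";
  required: string[];
  properties: Record<string, { type: "string" | "number" | "boolean" | "array" }>;
};
type AskNode = (prompt: string) => Promise<string>;

// Stand-in for ajv: checks only `required` keys and primitive property types.
function validate(schema: Schema, value: unknown): string[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["must be object"];
  }
  const obj = value as Record<string, unknown>;
  const errors: string[] = [];
  for (const key of schema.required) {
    if (!(key in obj)) errors.push(`missing required property '${key}'`);
  }
  for (const [key, rule] of Object.entries(schema.properties)) {
    if (!(key in obj)) continue;
    const actual = Array.isArray(obj[key]) ? "array" : typeof obj[key];
    if (actual !== rule.type) errors.push(`'${key}' must be ${rule.type}`);
  }
  return errors;
}

// One initial attempt plus up to `maxRepairs` re-prompts; never throws.
async function gatedAgent(
  ask: AskNode,
  prompt: string,
  schema: Schema,
  maxRepairs = 2,
): Promise<Record<string, unknown> | null> {
  let nextPrompt = prompt;
  for (let attempt = 0; attempt <= maxRepairs; attempt++) {
    let raw: string;
    try {
      raw = await ask(nextPrompt);
    } catch {
      return null; // assumption: a failed node call (e.g. timeout) is not repaired
    }
    let errors: string[];
    try {
      const parsed: unknown = JSON.parse(raw);
      errors = validate(schema, parsed);
      if (errors.length === 0) return parsed as Record<string, unknown>;
    } catch {
      errors = ["output is not parseable JSON"];
    }
    // Feed the validation errors back to the node as the repair prompt.
    nextPrompt = `${prompt}\n\nYour previous answer was rejected:\n- ${errors.join("\n- ")}`;
  }
  return null; // null-contract: give up without throwing
}
```

Because the function resolves to either a validated object or `null`, the calling script can keep using `filter(Boolean)` and never needs a `try`/`catch` around a node.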
87
+ Concurrency is bounded at the `agent()` boundary (default 4, `--concurrency N`).
88
+ Every step is recorded to the journal file `.omw/<runId>.jsonl`, so when a node
89
+ fails you read the `kind` (`timeout` / `nonzero_exit` / `schema_violation` / …)
90
+ and fix your script. stdout is one result JSON; the `--pretty` tree and a
91
+ `journal: <path>` pointer go to stderr.
92
+
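A minimal sketch of how concurrency might be bounded at the `agent()` boundary while `parallel` keeps its failures-become-`null` barrier. `createLimiter` and this `parallel` are hypothetical helpers written for illustration; the README does not show how omw implements either.

```typescript
// Hypothetical helpers; not omw's source.
function createLimiter(limit: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  const acquire = (): Promise<void> => {
    if (active < limit) {
      active++;
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => waiting.push(resolve));
  };

  const release = (): void => {
    const next = waiting.shift();
    if (next) next(); // hand the slot straight to the next waiter
    else active--;
  };

  // Wrap one unit of work (e.g. a single agent call) in a slot.
  return async function run<T>(task: () => Promise<T>): Promise<T> {
    await acquire();
    try {
      return await task();
    } finally {
      release();
    }
  };
}

// Barrier over all thunks; a thunk that rejects contributes null instead of throwing.
async function parallel<T>(thunks: Array<() => Promise<T>>): Promise<Array<T | null>> {
  return Promise.all(
    thunks.map(async (thunk) => {
      try {
        return await thunk();
      } catch {
        return null;
      }
    }),
  );
}
```

Putting the limiter inside the agent call rather than inside `parallel` means nested fan-outs share one budget, which is one plausible reading of "bounded at the `agent()` boundary".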
93
+ **The full agent-facing guide is [`skill/SKILL.md`](skill/SKILL.md)** — patterns
94
+ (fan-out / verify-vote / pipeline / loop-until-dry), the debug loop, and the
95
+ conventions. That skill is the primary product; this README is the human intro.
96
+
97
+ ## Adapters
98
+
99
+ A node is a coding agent driven through its headless prompt→result CLI.
100
+
101
+ | adapter | status | notes |
102
+ |---|---|---|
103
+ | **fake** | built-in, free, deterministic | the no-key demo engine and test double |
104
+ | **claude** | **full** (live-verified, 2.1.177) | `claude -p --output-format json`; `--resume` powers in-session schema self-repair |
105
+ | **codex** | **experimental** (live-verified, 0.137.0) | `codex exec --json`; **no cost field**; tolerates malformed JSONL ([openai/codex#15451](https://github.com/openai/codex/issues/15451)) and fails *actionably* |
106
+ | **pi** | planned | not wired yet (`--agent pi` → exit 3 + install hint) |
107
+ | **kiro** | not a fit | its CLI is an IDE launcher (open files/diffs), no headless prompt→result interface |
108
+
109
+ A missing CLI exits `3` with an `install_hint` instead of failing mid-run. A node
110
+ that hits `internal_error` (e.g. an invalid JSON Schema) escalates the run to exit
111
+ `4` (result still on stdout) so an author bug doesn't hide behind the null-contract.
112
+ `omw validate <wf>` is a pre-flight load + fake-fixture lint that spawns no agents.
113
+
114
+ ## Honest scope (read before you judge the novelty)
115
+
116
+ omw externalizes a pattern Claude Code uses internally for dynamic workflows
117
+ ("the model authors a deterministic orchestration script on the fly"). It is a
118
+ **faithful reconstruction of that pattern as OSS** — not a decompiled copy, and
119
+ **no claim of first / best / moat**.
120
+
121
+ - **"deterministic"** means: the engine's guarantees (stable resume keys, JSONL
122
+ recording, schema-gate) **and** the `--agent fake` demo. Your *script's*
123
+ determinism is a **convention you keep** — there is **no sandbox**, so omw
124
+ can't stop a workflow from calling `Date.now()`.
125
+ - **resume**: the journal format and resume key `(callIndex, promptHash,
126
+ optsHash)` (journaled as `call`) are **frozen and proven byte-stable** (identical re-run = 100% key
127
+ hits; edit the last node = hits up to the first change, then a miss). **Live
128
+ resume has landed**: `omw run <wf> --resume <journal>` reuses any node whose
129
+ `(callIndex, promptHash, optsHash)` key hits (adapter not invoked,
130
+ `agent_end{cached:true}`) and re-runs the rest — verified end-to-end on
131
+ `--agent fake`. Resume is **per-node key match, not dependency-aware**: it
132
+ behaves as longest-unchanged-prefix only when upstream outputs flow into
133
+ downstream prompts (the usual data-flow shape). When nodes instead pass state
134
+ through the **filesystem** (the normal coding-agent idiom — node 1 writes files
135
+ that node 2 reads), an upstream edit re-runs node 1 but a cached node 2 serves a
136
+ **stale** result; re-run fresh, or thread a file digest into the downstream
137
+ prompt. Keeping the match per-node preserves the cache for parallel/pipeline siblings; a
138
+ `--strict-resume` prefix-truncation opt-in and dependency-aware cascade are v2.
139
+ Resume holds **only for deterministic workflows**: omw can't *enforce* determinism
140
+ (no sandbox), so that stays a convention you keep (enforcement is v2). `omw replay` remains a
141
+ read-only **fixture replay** (reconstructing a recorded run's view), a separate
142
+ command — not the resume path.
143
+ - an omw node is a **whole external coding-agent CLI**, heavier than a single
144
+ in-harness subagent.
145
+ - **not in v1** (the CC dynamic-workflow surface has these; omw doesn't yet):
146
+ `budget`, nested `workflow()`, a `meta`/`phases` block, custom `agentType`,
147
+ `run_in_background`, worktree isolation.
148
+
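The per-node resume behavior in the list above can be made concrete with a small sketch. It assumes a key built from `(callIndex, promptHash, optsHash)` as the README states, but everything else is hypothetical: `resumeKey` and `cacheHits` are illustrative names, call indices are assumed to start at 1, and `fnv1a` is a stand-in hash chosen only so the example needs no imports (the README does not say which hash omw uses).

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units: a stand-in, not omw's hash.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

type Call = { prompt: string; opts: Record<string, unknown> };

// Hypothetical key layout: callIndex, promptHash, optsHash.
// JSON.stringify is key-order sensitive; a real implementation would canonicalize opts.
function resumeKey(callIndex: number, call: Call): string {
  return `${callIndex}:${fnv1a(call.prompt)}:${fnv1a(JSON.stringify(call.opts))}`;
}

// For each call of the new run, would a journal of the previous run serve it from cache?
function cacheHits(previous: Call[], next: Call[]): boolean[] {
  const recorded = new Set(previous.map((call, i) => resumeKey(i + 1, call)));
  return next.map((call, i) => recorded.has(resumeKey(i + 1, call)));
}

// Data-flow shape: only the last node's prompt changed.
const run1: Call[] = [
  { prompt: "SCOPE the question", opts: {} },
  { prompt: "SEARCH a", opts: {} },
  { prompt: "SYNTH over 2 findings", opts: {} },
];
const run2: Call[] = [...run1.slice(0, 2), { prompt: "SYNTH over 3 findings", opts: {} }];
console.log(cacheHits(run1, run1)); // every call hits
console.log(cacheHits(run1, run2)); // hits for calls 1 and 2, then a miss

// Filesystem shape: node 1 is edited, node 2's prompt is byte-identical.
const fs1: Call[] = [
  { prompt: "WRITE report v1", opts: {} },
  { prompt: "REVIEW the report on disk", opts: {} },
];
const fs2: Call[] = [{ prompt: "WRITE report v2", opts: {} }, fs1[1]];
console.log(cacheHits(fs1, fs2)); // miss, then a hit that would serve a stale review
```

The last case is the stale-result hazard: nothing in node 2's key reflects what node 1 wrote, which is why threading a file digest into the downstream prompt turns the stale hit into a miss.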
149
+ The one genuinely novel piece of code is the **schema-gate self-repair loop** —
150
+ the part a "subprocess + for-loop" comparison misses. Everything else is honest
151
+ glue. The fuller positioning (4-way prior-art table, resemblance ledger) lives in
152
+ [`skill/SKILL.md`](skill/SKILL.md) and the
153
+ [launch strategy](https://github.com/domuk-k/oh-my-workflow/blob/main/docs/specs/2026-06-14-omw-launch-strategy.md).
154
+
155
+ ## Develop
156
+
157
+ ```sh
158
+ bun install
159
+ bun test # 136 pass / 2 skip (live adapters, OMW_LIVE=1) / 0 fail
160
+ bun test --coverage # ~99% lines on the pure core
161
+ bun run typecheck # tsc --noEmit, clean
162
+ ```
163
+
164
+ `test/spine.test.ts` is the gate: one full `scope → search → verify → synthesize`
165
+ pass against the fake adapter, including the scripted schema-fail → self-repair →
166
+ `filter(Boolean)` survival cycle. Live adapter tests run only under `OMW_LIVE=1`
167
+ (they spend real tokens) and are skipped by default.
168
+
169
+ ## Docs
170
+
171
+ - **Skill (primary product)**: [`skill/SKILL.md`](skill/SKILL.md)
172
+ - Product spec: [`docs/specs/2026-06-12-oh-my-workflow-design.md`](https://github.com/domuk-k/oh-my-workflow/blob/main/docs/specs/2026-06-12-oh-my-workflow-design.md)
173
+ - Launch strategy + scorecard: [`docs/specs/2026-06-14-omw-launch-strategy.md`](https://github.com/domuk-k/oh-my-workflow/blob/main/docs/specs/2026-06-14-omw-launch-strategy.md)
174
+ - Resume / determinism internals: [`docs/specs/2026-06-15-resume-internals-deepdive.md`](https://github.com/domuk-k/oh-my-workflow/blob/main/docs/specs/2026-06-15-resume-internals-deepdive.md)
175
+
176
+ ## License
177
+
178
+ MIT
@@ -0,0 +1,82 @@
1
+ // deep-research — the reference workflow and the free `--agent fake` hero demo.
2
+ //
3
+ // bunx oh-my-workflow run examples/deep-research --agent fake
4
+ //
5
+ // It exercises the whole spine in one pass: phase → fan-out search (parallel) →
6
+ // per-finding verify (pipeline) → synthesize. One search node is scripted to
7
+ // hard-fail and one to return schema-invalid JSON first, so the demo also shows
8
+ // the two load-bearing behaviors live:
9
+ // • null-contract — a failed node resolves to null and is dropped by
10
+ // filter(Boolean); the run still completes green.
11
+ // • self-repair — a schema-invalid node is re-prompted with the schema error
12
+ // and recovers, without the authoring script ever seeing the noise.
13
+ //
14
+ // `fake` fixtures are co-located so the demo is deterministic with no API key.
15
+ // Swap `--agent fake` for `--agent claude` (once configured) to run it for real.
16
+
17
+ import type { Runtime } from "../../src/runtime";
18
+ import type { FakeAdapterOptions } from "../../src/adapters/fake";
19
+
20
+ const scopeSchema = {
21
+ type: "object",
22
+ required: ["topics"],
23
+ properties: { topics: { type: "array" } },
24
+ };
25
+ const searchSchema = {
26
+ type: "object",
27
+ required: ["topic", "hits"],
28
+ properties: { topic: { type: "string" }, hits: { type: "number" } },
29
+ };
30
+ const verifySchema = {
31
+ type: "object",
32
+ required: ["verified"],
33
+ properties: { verified: { type: "boolean" } },
34
+ };
35
+ const synthSchema = {
36
+ type: "object",
37
+ required: ["summary", "count"],
38
+ properties: { summary: { type: "string" }, count: { type: "number" } },
39
+ };
40
+
41
+ export default async function deepResearch(rt: Runtime, _args: unknown) {
42
+ rt.phase("Scope");
43
+ const scope = (await rt.agent("SCOPE the research question into topics", {
44
+ schema: scopeSchema,
45
+ })) as { topics: string[] } | null;
46
+ if (!scope) return { error: "scoping failed", confirmed: [] };
47
+
48
+ rt.phase("Search");
49
+ const searched = await rt.parallel(
50
+ scope.topics.map((t) => () => rt.agent(`SEARCH ${t}`, { schema: searchSchema, label: `search:${t}` })),
51
+ );
52
+ const found = searched.filter(Boolean); // a failed/timed-out node is dropped here
53
+
54
+ rt.phase("Verify");
55
+ const verified = await rt.pipeline(found, async (f) => {
56
+ const v = await rt.agent(`VERIFY ${JSON.stringify(f)}`, { schema: verifySchema });
57
+ return v ? { ...(f as object), ...(v as object) } : null;
58
+ });
59
+ const confirmed = verified.filter(Boolean);
60
+
61
+ rt.phase("Synthesize");
62
+ const summary = await rt.agent(`SYNTH over ${confirmed.length} findings`, { schema: synthSchema });
63
+
64
+ return { confirmed, summary };
65
+ }
66
+
67
+ // Deterministic fixtures for `--agent fake`: topic `a` self-repairs (schema-invalid
68
+ // JSON + sessionId → repaired), topic `b` hard-fails (timeout → dropped),
69
+ // topic `c` succeeds. Net: 2 confirmed findings, one synthesized summary.
70
+ export const fake: FakeAdapterOptions = {
71
+ rules: [
72
+ { match: (p) => p.includes("SCOPE"), responses: [{ text: '{"topics":["a","b","c"]}' }] },
73
+ {
74
+ match: (p) => p.includes("SEARCH a"),
75
+ responses: [{ text: '{"oops":1}', sessionId: "sa" }, { text: '{"topic":"a","hits":3}' }],
76
+ },
77
+ { match: (p) => p.includes("SEARCH b"), responses: [{ fail: "timeout" }] },
78
+ { match: (p) => p.includes("SEARCH c"), responses: [{ text: '{"topic":"c","hits":5}' }] },
79
+ { match: (p) => p.includes("VERIFY"), responses: [{ text: '{"verified":true}' }] },
80
+ { match: (p) => p.includes("SYNTH"), responses: [{ text: '{"summary":"done","count":2}' }] },
81
+ ],
82
+ };
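One way to read the `fake` fixture table above is sketched here. The shapes mirror the `rules` array in the example, but the resolution logic is an assumption inferred from it: the first matching rule wins, responses are consumed in order, and the last response repeats (the `VERIFY` rule lists one response yet is called twice in the demo). `createFakeAdapter` and the `no_fixture` failure kind are hypothetical, and how the real adapter routes a repair re-prompt (the `sessionId` field suggests by session) is not shown in the source.

```typescript
// Hypothetical reading of the fixture format; not the package's fake adapter.
type FakeResponse = { text?: string; sessionId?: string; fail?: string };
type FakeRule = { match: (prompt: string) => boolean; responses: FakeResponse[] };

function createFakeAdapter(rules: FakeRule[]) {
  const used = new Map<FakeRule, number>(); // how many times each rule has answered

  return function respond(prompt: string): FakeResponse {
    const rule = rules.find((r) => r.match(prompt));
    if (!rule || rule.responses.length === 0) {
      return { fail: "no_fixture" }; // hypothetical kind for an unmatched prompt
    }
    const n = used.get(rule) ?? 0;
    used.set(rule, n + 1);
    // Assumption: once the scripted responses run out, the last one repeats.
    return rule.responses[Math.min(n, rule.responses.length - 1)];
  };
}

const respond = createFakeAdapter([
  {
    match: (p) => p.includes("SEARCH a"),
    responses: [{ text: '{"oops":1}', sessionId: "sa" }, { text: '{"topic":"a","hits":3}' }],
  },
  { match: (p) => p.includes("SEARCH b"), responses: [{ fail: "timeout" }] },
  { match: (p) => p.includes("VERIFY"), responses: [{ text: '{"verified":true}' }] },
]);

console.log(respond("SEARCH a")); // first scripted answer: fails the search schema
console.log(respond("SEARCH a")); // second scripted answer: the repaired object
console.log(respond("SEARCH b")); // scripted hard failure
```

Scripting the responses per rule is what makes the demo repeatable: the same prompts always walk the same sequence, so the result JSON does not vary between runs.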
package/package.json ADDED
@@ -0,0 +1,60 @@
1
+ {
2
+ "name": "oh-my-workflow",
3
+ "version": "0.1.0",
4
+ "description": "Run coding-agent CLIs (claude -p / codex exec) as nodes in a plain-JS workflow. The thin deterministic glue.",
5
+ "type": "module",
6
+ "author": "Dongwook Kim (https://github.com/domuk-k)",
7
+ "engines": {
8
+ "bun": ">=1.0.0"
9
+ },
10
+ "files": [
11
+ "src",
12
+ "examples",
13
+ "skill",
14
+ "README.md",
15
+ "LICENSE"
16
+ ],
17
+ "repository": {
18
+ "type": "git",
19
+ "url": "git+https://github.com/domuk-k/oh-my-workflow.git"
20
+ },
21
+ "homepage": "https://github.com/domuk-k/oh-my-workflow#readme",
22
+ "bugs": "https://github.com/domuk-k/oh-my-workflow/issues",
23
+ "exports": {
24
+ ".": "./src/runtime.ts",
25
+ "./adapters/fake": "./src/adapters/fake.ts",
26
+ "./package.json": "./package.json"
27
+ },
28
+ "types": "./src/runtime.ts",
29
+ "bin": {
30
+ "omw": "./src/cli/omw.ts"
31
+ },
32
+ "scripts": {
33
+ "test": "bun test",
34
+ "typecheck": "tsc --noEmit",
35
+ "prepublishOnly": "bun run typecheck && bun test"
36
+ },
37
+ "keywords": [
38
+ "workflow",
39
+ "orchestration",
40
+ "coding-agent",
41
+ "claude",
42
+ "codex",
43
+ "agent",
44
+ "cli",
45
+ "bun",
46
+ "llm",
47
+ "ai-agent"
48
+ ],
49
+ "publishConfig": {
50
+ "access": "public"
51
+ },
52
+ "dependencies": {
53
+ "ajv": "^8.17.1"
54
+ },
55
+ "devDependencies": {
56
+ "@types/bun": "latest",
57
+ "typescript": "^5.6.0"
58
+ },
59
+ "license": "MIT"
60
+ }