npm - oh-my-workflow - Versions diffs - 0.1.0 - Mend

oh-my-workflow 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17) hide show

package/LICENSE +21 -0
package/README.md +178 -0
package/examples/deep-research/workflow.ts +82 -0
package/package.json +60 -0
package/skill/SKILL.md +491 -0
package/src/adapters/claude.ts +146 -0
package/src/adapters/codex.ts +149 -0
package/src/adapters/fake.ts +70 -0
package/src/adapters/types.ts +43 -0
package/src/cli/omw.ts +37 -0
package/src/cli/replay.ts +98 -0
package/src/cli/run.ts +371 -0
package/src/cli/validate.ts +110 -0
package/src/journal.ts +138 -0
package/src/resume.ts +48 -0
package/src/runtime.ts +235 -0
package/src/schema-gate.ts +164 -0

package/LICENSE ADDED Viewed

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Dongwook Kim
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED Viewed

@@ -0,0 +1,178 @@
+# oh-my-workflow
+> Run the coding-agent CLIs you already have — `claude -p`, `codex exec` — as
+> nodes in a plain-JS workflow your host agent writes. omw is the thin glue: it
+> runs the script, schema-gates each node's output, and journals every step so the
+> agent can read its own failure and repair its own script. (What's
+> "deterministic" is scoped honestly below — the engine and `--agent fake`, not
+> your script.)
+## Try it now — free, no API key
+```sh
+git clone https://github.com/domuk-k/oh-my-workflow && cd oh-my-workflow
+bun install
+bun src/cli/omw.ts run examples/deep-research --agent fake
+```
+```json
+{"confirmed":[{"topic":"a","hits":3,"verified":true},{"topic":"c","hits":5,"verified":true}],"summary":{"summary":"done","count":2}}
+```
+That's the whole spine in one pass — a `--pretty` tree shows it:
+```sh
+bun src/cli/omw.ts run examples/deep-research --agent fake --pretty
+```
+```
+run r-… (examples/deep-research)
+  ▸ Scope
+    • call#1 [fake]
+      ✓ call#1
+  ▸ Search
+    • search:a [fake]
+    • search:b [fake]
+    • search:c [fake]
+      ✗ timeout call#3
+      ✓ call#4
+      ✓ call#2
+  ▸ Verify
+    • call#5 [fake]
+    • call#6 [fake]
+      ✓ call#5
+      ✓ call#6
+  ▸ Synthesize
+    • call#7 [fake]
+      ✓ call#7
+run_end ok=true · 6 ok / 1 failed
+```
+`search:a` (call#2) returns invalid JSON first and self-repairs to `✓`; `search:b`
+(call#3) times out and is dropped by `filter(Boolean)` — the run still ends green.
+`--agent fake` is a built-in deterministic adapter — no API key, no network. A
+stranger runs the full fan-out + pipeline + a scripted schema-fail→self-repair +
+a scripted timeout→drop, and gets a stable result JSON. Swap `--agent claude`
+(after `claude login`) to run it for real.
+> Once on npm this is `bunx oh-my-workflow run examples/deep-research --agent fake`
+> — the example ships inside the package and resolves from there, so it runs from
+> any directory. omw runs under **bun**; `npx` (Node) won't execute the TS bin.
+## What it is
+You write a plain-JS orchestration script. Its nodes are **whole coding agents**
+(`claude -p`, `codex exec`) — not single LLM calls. The runtime hands your script
+**five hooks** and nothing else:
+```ts
+export default async function (rt, args) {
+  rt.phase("Search");
+  const hits = (await rt.parallel(
+    args.queries.map((q) => () => rt.agent(`SEARCH: ${q}`, { schema: HIT, label: q })),
+  )).filter(Boolean);                 // agent() returns null on failure, never throws
+  return { hits, count: hits.length };
+}
+```
+- `rt.agent(prompt, opts?)` — run one coding-agent CLI node. With a `schema`, omw
+  extracts JSON, validates it (ajv), and **re-prompts the node with the
+  validation errors** up to 2 times before giving up. Returns the validated
+  object, or `null`. **Never throws** — the load-bearing *null-contract*.
+- `rt.parallel(thunks)` — concurrent, barrier; failures become `null`.
+- `rt.pipeline(items, …stages)` — each item flows through all stages independently.
+- `rt.phase(title)` / `rt.log(msg)` — journal / `--pretty` side-channel only.
+Concurrency is bounded at the `agent()` boundary (default 4, `--concurrency N`).
+Every step is recorded to the journal file `.omw/<runId>.jsonl`, so when a node
+fails you read the `kind` (`timeout` / `nonzero_exit` / `schema_violation` / …)
+and fix your script. stdout is one result JSON; the `--pretty` tree and a
+`journal: <path>` pointer go to stderr.
+**The full agent-facing guide is [`skill/SKILL.md`](skill/SKILL.md)** — patterns
+(fan-out / verify-vote / pipeline / loop-until-dry), the debug loop, and the
+conventions. That skill is the primary product; this README is the human intro.
+## Adapters
+A node is a coding agent driven through its headless prompt→result CLI.
+| adapter | status | notes |
+|---|---|---|
+| **fake** | built-in, free, deterministic | the no-key demo engine and test double |
+| **claude** | **full** (live-verified, 2.1.177) | `claude -p --output-format json`; `--resume` powers in-session schema self-repair |
+| **codex** | **experimental** (live-verified, 0.137.0) | `codex exec --json`; **no cost field**; tolerates malformed JSONL ([openai/codex#15451](https://github.com/openai/codex/issues/15451)) and fails *actionably* |
+| **pi** | planned | not wired yet (`--agent pi` → exit 3 + install hint) |
+| **kiro** | not a fit | its CLI is an IDE launcher (open files/diffs), no headless prompt→result interface |
+A missing CLI exits `3` with an `install_hint` instead of failing mid-run. A node
+that hits `internal_error` (e.g. an invalid JSON Schema) escalates the run to exit
+`4` (result still on stdout) so an author bug doesn't hide behind the null-contract.
+`omw validate <wf>` is a pre-flight load + fake-fixture lint that spawns no agents.
+## Honest scope (read before you judge the novelty)
+omw externalizes a pattern Claude Code uses internally for dynamic workflows
+("the model authors a deterministic orchestration script on the fly"). It is a
+**faithful reconstruction of that pattern as OSS** — not a decompiled copy, and
+**no claim of first / best / moat**.
+- **"deterministic"** means: the engine's guarantees (stable resume keys, JSONL
+  recording, schema-gate) **and** the `--agent fake` demo. Your *script's*
+  determinism is a **convention you keep** — there is **no sandbox**, so omw
+  can't stop a workflow from calling `Date.now()`.
+- **resume**: the journal format and resume key `(callIndex, promptHash,
+  optsHash)` (journaled as `call`) are **frozen and proven byte-stable** (identical re-run = 100% key
+  hits; edit the last node = hits up to the first change, then a miss). **Live
+  resume has landed**: `omw run <wf> --resume <journal>` reuses any node whose
+  `(callIndex, promptHash, optsHash)` key hits (adapter not invoked,
+  `agent_end{cached:true}`) and re-runs the rest — verified end-to-end on
+  `--agent fake`. Resume is **per-node key match, not dependency-aware**: it
+  behaves as longest-unchanged-prefix only when upstream outputs flow into
+  downstream prompts (the usual data-flow shape). When nodes instead pass state
+  through the **filesystem** (the normal coding-agent idiom — node 1 writes files
+  node 2 reads), an upstream edit re-runs node 1 but a cached node 2 serves a
+  **stale** result; re-run fresh, or thread a file digest into the downstream
+  prompt. Keeping per-node preserves parallel/pipeline sibling cache; a
+  `--strict-resume` prefix-truncation opt-in and dependency-aware cascade are v2.
+  It holds **only for deterministic workflows**: omw can't *enforce* determinism
+  (no sandbox), so that stays a convention you keep (enforcement is v2). `omw replay` remains a
+  read-only **fixture replay** (reconstructing a recorded run's view), a separate
+  command — not the resume path.
+- an omw node is a **whole external coding-agent CLI**, heavier than a single
+  in-harness subagent.
+- **not in v1** (the CC dynamic-workflow surface has these; omw doesn't yet):
+  `budget`, nested `workflow()`, a `meta`/`phases` block, custom `agentType`,
+  `run_in_background`, worktree isolation.
+The one genuinely novel piece of code is the **schema-gate self-repair loop** —
+the part a "subprocess + for-loop" comparison misses. Everything else is honest
+glue. The fuller positioning (4-way prior-art table, resemblance ledger) lives in
+[`skill/SKILL.md`](skill/SKILL.md) and the
+[launch strategy](https://github.com/domuk-k/oh-my-workflow/blob/main/docs/specs/2026-06-14-omw-launch-strategy.md).
+## Develop
+```sh
+bun install
+bun test            # 136 pass / 2 skip (live adapters, OMW_LIVE=1) / 0 fail
+bun test --coverage # ~99% lines on the pure core
+bun run typecheck   # tsc --noEmit, clean
+```
+`test/spine.test.ts` is the gate: one full `scope → search → verify → synthesize`
+pass against the fake adapter, including the scripted schema-fail → self-repair →
+`filter(Boolean)` survival cycle. Live adapter tests run only under `OMW_LIVE=1`
+(they spend real tokens) and are skipped by default.
+## Docs
+- **Skill (primary product)**: [`skill/SKILL.md`](skill/SKILL.md)
+- Product spec: [`docs/specs/2026-06-12-oh-my-workflow-design.md`](https://github.com/domuk-k/oh-my-workflow/blob/main/docs/specs/2026-06-12-oh-my-workflow-design.md)
+- Launch strategy + scorecard: [`docs/specs/2026-06-14-omw-launch-strategy.md`](https://github.com/domuk-k/oh-my-workflow/blob/main/docs/specs/2026-06-14-omw-launch-strategy.md)
+- Resume / determinism internals: [`docs/specs/2026-06-15-resume-internals-deepdive.md`](https://github.com/domuk-k/oh-my-workflow/blob/main/docs/specs/2026-06-15-resume-internals-deepdive.md)
+## License
+MIT

package/examples/deep-research/workflow.ts ADDED Viewed

@@ -0,0 +1,82 @@
+// deep-research — the reference workflow and the free `--agent fake` hero demo.
+//
+//   bunx oh-my-workflow run examples/deep-research --agent fake
+//
+// It exercises the whole spine in one pass: phase → fan-out search (parallel) →
+// per-finding verify (pipeline) → synthesize. One search node is scripted to
+// hard-fail and one to return schema-invalid JSON first, so the demo also shows
+// the two load-bearing behaviors live:
+//   • null-contract  — a failed node resolves to null and is dropped by
+//     filter(Boolean); the run still completes green.
+//   • self-repair    — an invalid-JSON node is re-prompted with the schema error
+//     and recovers, without the authoring script ever seeing the noise.
+//
+// `fake` fixtures are co-located so the demo is deterministic with no API key.
+// Swap `--agent fake` for `--agent claude` (once configured) to run it for real.
+import type { Runtime } from "../../src/runtime";
+import type { FakeAdapterOptions } from "../../src/adapters/fake";
+const scopeSchema = {
+  type: "object",
+  required: ["topics"],
+  properties: { topics: { type: "array" } },
+};
+const searchSchema = {
+  type: "object",
+  required: ["topic", "hits"],
+  properties: { topic: { type: "string" }, hits: { type: "number" } },
+};
+const verifySchema = {
+  type: "object",
+  required: ["verified"],
+  properties: { verified: { type: "boolean" } },
+};
+const synthSchema = {
+  type: "object",
+  required: ["summary", "count"],
+  properties: { summary: { type: "string" }, count: { type: "number" } },
+};
+export default async function deepResearch(rt: Runtime, _args: unknown) {
+  rt.phase("Scope");
+  const scope = (await rt.agent("SCOPE the research question into topics", {
+    schema: scopeSchema,
+  })) as { topics: string[] } | null;
+  if (!scope) return { error: "scoping failed", confirmed: 0 };
+  rt.phase("Search");
+  const searched = await rt.parallel(
+    scope.topics.map((t) => () => rt.agent(`SEARCH ${t}`, { schema: searchSchema, label: `search:${t}` })),
+  );
+  const found = searched.filter(Boolean); // a failed/timed-out node is dropped here
+  rt.phase("Verify");
+  const verified = await rt.pipeline(found, async (f) => {
+    const v = await rt.agent(`VERIFY ${JSON.stringify(f)}`, { schema: verifySchema });
+    return v ? { ...(f as object), ...(v as object) } : null;
+  });
+  const confirmed = verified.filter(Boolean);
+  rt.phase("Synthesize");
+  const summary = await rt.agent(`SYNTH over ${confirmed.length} findings`, { schema: synthSchema });
+  return { confirmed, summary };
+}
+// Deterministic fixtures for `--agent fake`: topic `a` self-repairs (invalid
+// JSON + sessionId → repaired), topic `b` hard-fails (timeout → dropped),
+// topic `c` succeeds. Net: 2 confirmed findings, one synthesized summary.
+export const fake: FakeAdapterOptions = {
+  rules: [
+    { match: (p) => p.includes("SCOPE"), responses: [{ text: '{"topics":["a","b","c"]}' }] },
+    {
+      match: (p) => p.includes("SEARCH a"),
+      responses: [{ text: '{"oops":1}', sessionId: "sa" }, { text: '{"topic":"a","hits":3}' }],
+    },
+    { match: (p) => p.includes("SEARCH b"), responses: [{ fail: "timeout" }] },
+    { match: (p) => p.includes("SEARCH c"), responses: [{ text: '{"topic":"c","hits":5}' }] },
+    { match: (p) => p.includes("VERIFY"), responses: [{ text: '{"verified":true}' }] },
+    { match: (p) => p.includes("SYNTH"), responses: [{ text: '{"summary":"done","count":2}' }] },
+  ],
+};

package/package.json ADDED Viewed

@@ -0,0 +1,60 @@
+{
+  "name": "oh-my-workflow",
+  "version": "0.1.0",
+  "description": "Run coding-agent CLIs (claude -p / codex exec) as nodes in a plain-JS workflow. The thin deterministic glue.",
+  "type": "module",
+  "author": "Dongwook Kim (https://github.com/domuk-k)",
+  "engines": {
+    "bun": ">=1.0.0"
+  },
+  "files": [
+    "src",
+    "examples",
+    "skill",
+    "README.md",
+    "LICENSE"
+  ],
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/domuk-k/oh-my-workflow.git"
+  },
+  "homepage": "https://github.com/domuk-k/oh-my-workflow#readme",
+  "bugs": "https://github.com/domuk-k/oh-my-workflow/issues",
+  "exports": {
+    ".": "./src/runtime.ts",
+    "./adapters/fake": "./src/adapters/fake.ts",
+    "./package.json": "./package.json"
+  },
+  "types": "./src/runtime.ts",
+  "bin": {
+    "omw": "./src/cli/omw.ts"
+  },
+  "scripts": {
+    "test": "bun test",
+    "typecheck": "tsc --noEmit",
+    "prepublishOnly": "bun run typecheck && bun test"
+  },
+  "keywords": [
+    "workflow",
+    "orchestration",
+    "coding-agent",
+    "claude",
+    "codex",
+    "agent",
+    "cli",
+    "bun",
+    "llm",
+    "ai-agent"
+  ],
+  "publishConfig": {
+    "access": "public"
+  },
+  "dependencies": {
+    "ajv": "^8.17.1"
+  },
+  "devDependencies": {
+    "@types/bun": "latest",
+    "typescript": "^5.6.0"
+  },
+  "license": "MIT"
+}