lilflow 0.1.1 → 0.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/.claude-plugin/plugin.json +7 -0
package/README.md +193 -51
package/package.json +6 -2
package/skills/lilflow-workflow-driver/SKILL.md +110 -0
package/src/agents/index.js +41 -1
package/src/agents/output-file.js +204 -0
package/src/cli.js +17 -9
package/src/run-workflow.js +202 -1
package/src/session-bridge.js +644 -0
package/src/session-prompt.js +59 -0
package/src/session-runner.js +152 -0

package/.claude-plugin/plugin.json ADDED

@@ -0,0 +1,7 @@
+{
+  "name": "lilflow",
+  "version": "0.2.1",
+  "description": "Drive lilflow workflows from inside a Claude Code session. Exposes the flow session-bridge CLI as a skill so the agent can request the next step, update workflow state, evaluate gates, and converse with the user.",
+  "homepage": "https://github.com/iVintik/lilflow",
+  "license": "MIT"
+}

package/README.md CHANGED

@@ -1,112 +1,254 @@
+<!-- codeharness:readme -->
 # lilflow
-Repo-native workflow engine CLI. Step-level resumable YAML workflows with state stored under `.flow/` in your repo.
+Repo-native workflow engine CLI. Define multi-step workflows in YAML, execute them with step-level persistence, and resume from failure -- all stored under `.flow/` in your repo. No external services, no databases, no daemon.
-## Install
+## What it does
+lilflow turns YAML workflow definitions into executable, resumable pipelines that live inside your repository. It persists every step's start, completion, and failure as append-only JSONL events, so you can resume a failed run from exactly where it broke. It has first-class support for LLM coding agents (Claude Code and OpenCode) as step types, with session management across steps and two execution modes: classic (one agent process per step) and session (the entire workflow runs inside a single long-lived agent session for prompt-cache efficiency). It's for developers who want CI-like orchestration without leaving the repo.
+## Key features
+- **Step-level resumability** -- append-only JSONL event logs let you resume from the exact point of failure
+- **LLM agent steps** -- run Claude Code or OpenCode as workflow steps with session continuation and cost tracking
+- **Session mode** -- run the whole workflow inside one agent session; agent calls `flow session-bridge` CLI to advance steps. Keeps the prompt cache warm for the entire run
+- **Conversational agent mode** -- `agent.mode: conversational` lets the agent iterate with the user until the step is complete, then advance
+- **Structured JSON output** -- `output_file` + `output_format: json` on agent steps extracts and validates JSON from the agent's stdout, no parsing scripts required
+- **Parallel execution** -- mark steps with `parallel: true` or group them with `parallel_group` for concurrent batches
+- **For-each loops** -- expand a step across an array of items with `{{item}}` templating
+- **Quality gates** -- conditional checkpoints that halt or warn based on expressions
+- **Subflow composition** -- call child workflows with parameter passing
+- **Wait triggers** -- block on file existence or external signals with `flow signal`
+- **Hierarchical config** -- global, project, workflow-local, env vars, and CLI flags merge in order
+## Tech stack
+- **Runtime:** Node.js >= 22
+- **Language:** JavaScript (ES modules)
+- **Dependencies:** js-yaml (single runtime dependency)
+- **Test runner:** node:test (built-in)
+- **Coverage:** c8
+## Getting started
+### Install
 ```bash
 npm install -g lilflow
 ```
-Binary is exposed as both `lilflow` and `flow`.
-## Quick Start
+### Run the simplest possible example
 ```bash
-flow init        # scaffold workflow.yaml + .flow/ state dir
-flow run         # execute the workflow
-flow status      # inspect run state
-flow resume      # resume a failed run from the last completed step
+flow init && flow run
 ```
-## Parallel Steps
+Expected output:
+```
+Initialized .flow/ and workflow.yaml
+Running workflow 'hello-world'...
+[setup] echo "Hello from flow"
+Hello from flow
+Workflow completed.
+```
+### Next step
+See [Usage](#usage) for common workflows or [docs/index.md](./docs/index.md) for full docs.
-The runner supports contiguous parallel batches inside `workflow.yaml`.
+## Usage
+### Define and run a multi-step workflow
 ```yaml
-name: ci
+name: build-and-test
 steps:
+  - name: install
+    run: npm ci
   - name: lint
     run: npm run lint
-    parallel: true
+  - name: test
+    run: npm test
+```
+```bash
+flow run
+```
+### Run steps in parallel
+```yaml
+steps:
+  - name: lint
+    run: npm run lint
+    parallel: true
   - name: test
     run: npm test
     parallel: true
   - name: build
     run: npm run build
 ```
-Parallel step output is streamed in real time with step-name prefixes such as `[lint] ...`, and `flow status <run-id>` shows each started step with its own `running`, `completed`, or `failed` state.
+Parallel step output streams in real time with `[lint] ...` / `[test] ...` prefixes.
+### Resume a failed run
+```bash
+flow status <run-id>    # see what failed
+flow resume <run-id>    # pick up from the failed step
+```
-Parallel batch concurrency is controlled by the resolved `parallelism` config value. The default is `4`, and you can override it in `.flow/config.yaml`, `~/.flow/config.yaml`, or with `FLOW_PARALLELISM`.
+### Use LLM agents as steps
 ```yaml
-parallelism: 2
+steps:
+  - name: implement
+    agent:
+      provider: claude-code
+      prompt: "Implement the dashboard component"
+  - name: review
+    agent:
+      provider: claude-code
+      prompt: "Review the implementation"
+      session: continue
 ```
-## Retry Logic
+### Agent interaction modes
-Steps can retry transient failures with exponential backoff.
+Agent steps take an optional `mode` field:
+- `autonomous` (default) -- agent works alone, runs headless, reports when done
+- `conversational` -- agent works with the user iteratively in a TTY, advances only after user approval
 ```yaml
-name: deploy
 steps:
-  - name: flaky-check
-    run: npm run flaky-check
-    retry: 3
-    retry_delay: 5s
+  - name: build
+    agent:
+      provider: claude-code
+      prompt: "Build the feature described in spec.md"
+      mode: conversational
 ```
-`retry` is the number of retries after the initial attempt, so `retry: 3` allows up to 4 total attempts. `retry_delay` is the base delay and doubles on each retry. The runner prints numbered attempt labels such as `[attempt 2/4]`, shows the failure reason for each retryable attempt, and prints wait lines such as `Waiting 10s before retry 2/3...`.
+### Extract JSON output from an agent step
+```yaml
+steps:
+  - name: analyze
+    agent:
+      provider: claude-code
+      prompt: ./analyze.md
+      output_file: out/analysis.json
+      output_format: json
+```
-## For-Each Loops
+lilflow strips markdown fences, finds the first balanced JSON object/array in stdout, validates it parses, and writes the cleaned JSON to the file. The step fails with a structured error code if the output isn't valid JSON.
-Steps can expand an inline array into concrete iterations with `for_each`.
+### Run the entire workflow in one agent session
 ```yaml
-name: deploy
+name: iterative-dev
+mode: session
+session:
+  provider: claude-code       # or: opencode (via oh-my-opencode)
+  model: claude-opus-4-6
+  allow_tools: [Bash, Read, Write, Edit, Grep, Glob]
+  plugins: [lilflow]
 steps:
-  - name: deploy-target
-    run: echo "Deploying to {{item}} ({{item_number}}/{{item_count}})"
-    for_each: [dev, staging, prod]
+  - name: scaffold
+    run: mkdir -p src/components
+  - name: implement
+    agent:
+      provider: claude-code
+      prompt: "Build dashboard components"
+      mode: conversational
+  - name: test
+    run: npm test
 ```
-Each iteration becomes a concrete runtime step with a stable label such as `deploy-target [item-prod]`. The templating variables `{{item}}`, `{{item_index}}`, `{{item_number}}`, `{{item_count}}`, `{{item_first}}`, and `{{item_last}}` are available in string step fields. If one iteration fails, the remaining iterations still run before the workflow stops.
+`flow run` spawns one `claude` (or `opencode`) process, injects a workflow-aware system prompt, and exposes the `flow session-bridge` CLI through the bundled `lilflow` skill. The agent drives the workflow (`flow session-bridge next` -> execute -> `flow session-bridge update`) without per-step process spawns, keeping the prompt cache warm for the entire run.
-## Workflow Templates
+OpenCode sessions work through [oh-my-opencode](https://github.com/opensoft/oh-my-opencode)'s Claude Code compatibility layer -- the same bundled plugin loads unmodified.
-`flow init` can scaffold `workflow.yaml` from reusable local templates stored under `.flow/templates/<name>/workflow.yaml`.
+### Iterate with for-each
-```bash
-flow init --list-templates
-flow init --template ci-pipeline
+```yaml
+steps:
+  - name: deploy
+    run: ./deploy.sh {{item}}
+    for_each: [dev, staging, prod]
 ```
-Template files can mix prompt-backed placeholders and config-backed placeholders:
+### Wait for external input
 ```yaml
-name: {{template.workflow_name|ci-pipeline}}
-parallelism: {{config.parallelism}}
 steps:
-  - name: test
-    run: npm test
+  - name: await-approval
+    wait:
+      trigger: signal
+      timeout: 1h
+```
+Then from another terminal: `flow signal <run-id> await-approval --data '{"approved": true}'`
+## CLI
 ```
+flow init [--template <name>]
+flow config
+flow run [<workflow>]
+flow resume <run-id>
+flow set-step <run-id> <step-index>
+flow signal <run-id> <step-name> [--data '{}']
+flow status <run-id>
+flow list
+flow logs <run-id> [--step <step>]
+flow session-bridge <subcommand>     # agent-facing bridge for session-mode workflows
+```
+## Claude Code plugin
-- `{{template.name}}` prompts for a value
-- `{{template.name|default}}` prompts and uses the default when you press enter
-- `{{config.parallelism}}` reads from the resolved lilflow config
+This repo is also a Claude Code plugin. `.claude-plugin/plugin.json` is the marker file; `skills/` holds the `lilflow-workflow-driver` skill used by session mode. The plugin version is kept in lock-step with the npm package version via the `npm version` lifecycle hook, so a single `v{version}` tag releases both.
-Any directory matching `.flow/templates/<name>/workflow.yaml` is a usable custom template.
+Install as a plugin from a Claude Code session:
+```bash
+claude plugin install github:iVintik/lilflow
+```
+OpenCode users install the same plugin via [oh-my-opencode](https://github.com/opensoft/oh-my-opencode)'s Claude Code compatibility layer.
+## Project structure
+```
+lilflow/
+├── src/                    # Application source (ES modules)
+│   ├── cli.js              # Entry point, command router
+│   ├── run-workflow.js     # Core workflow engine
+│   ├── config.js           # Hierarchical config system
+│   ├── init-project.js     # Project scaffolding
+│   ├── session-bridge.js   # Agent-facing CLI for session mode
+│   ├── session-runner.js   # Single-process execution path
+│   ├── session-prompt.js   # System prompt generator
+│   └── agents/             # LLM agent provider adapters + output extraction
+├── .claude-plugin/         # Claude Code plugin manifest
+│   └── plugin.json
+├── skills/                 # Claude Code skills bundled with the plugin
+│   └── lilflow-workflow-driver/SKILL.md
+├── scripts/                # Release helpers (sync-plugin-version.js)
+├── tests/                  # Test suite (node:test)
+├── docs/                   # Generated documentation + requirements specs
+└── .github/workflows/      # CI/CD (lint + test + npm publish)
+```
-## Configuration
+## Documentation
-- Project config: `.flow/config.yaml`
-- Global config: `~/.flow/config.yaml`
-- Env var override: `FLOW_*`
-- Override order: defaults → global → project → env → flags
+- [Documentation Index](./docs/index.md)
+- [Architecture](./docs/architecture.md)
+- [Component Inventory](./docs/component-inventory.md)
+- [Source Tree Analysis](./docs/source-tree-analysis.md)
+- [Development Guide](./docs/development-guide.md)
+- [Session Mode Spec](./docs/requirements/session-mode.md)
 ## License
 MIT
+<!-- /codeharness:readme -->
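
Review note on the `output_format: json` behaviour described in the README above (strip markdown fences, find the first balanced JSON object or array in stdout, check that it parses). The sketch below is one way that description could be implemented. It is not the code in `src/agents/output-file.js`, which is not shown in this part of the diff, and the names `extractFirstJson` and `findBalancedEnd` are hypothetical.

```javascript
// Hypothetical helper, NOT the code shipped in src/agents/output-file.js.
// Returns the first parsable JSON object or array found in `stdout`.
function extractFirstJson(stdout) {
  // Remove markdown fence lines (three backticks plus an optional language tag).
  const text = stdout.replace(/^[ \t]*`{3}[\w-]*[ \t]*$/gm, "");

  for (let start = 0; start < text.length; start += 1) {
    if (text[start] !== "{" && text[start] !== "[") continue;
    const end = findBalancedEnd(text, start);
    if (end === -1) continue;
    try {
      return JSON.parse(text.slice(start, end + 1));
    } catch {
      // Balanced but not valid JSON (for example prose in braces); keep scanning.
    }
  }
  throw new Error("no valid JSON object or array found in agent output");
}

// Returns the index of the bracket that closes the one at `start`, or -1.
// Brackets that appear inside JSON string literals are ignored.
function findBalancedEnd(text, start) {
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i += 1) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{" || ch === "[") depth += 1;
    else if (ch === "}" || ch === "]") {
      depth -= 1;
      if (depth === 0) return i;
    }
  }
  return -1;
}
```

Tracking string state matters here: an agent reply such as `{"s": "}"}` contains a closing brace inside a string, and a plain bracket counter would cut the payload short.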

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "lilflow",
-  "version": "0.1.1",
+  "version": "0.2.1",
   "description": "Repo-native workflow engine CLI",
   "type": "module",
   "bin": {
@@ -9,6 +9,8 @@
   },
   "files": [
     "src",
+    ".claude-plugin",
+    "skills",
     "README.md",
     "AGENTS.md"
   ],
@@ -33,7 +35,9 @@
   "scripts": {
     "test": "c8 --reporter=text --reporter=lcov --reporter=json-summary node --test",
     "coverage": "npm test",
-    "lint": "eslint src/"
+    "lint": "eslint src/",
+    "sync-plugin-version": "node scripts/sync-plugin-version.js",
+    "version": "npm run sync-plugin-version && git add .claude-plugin/plugin.json"
   },
   "engines": {
     "node": ">=22"

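Review note on the new `version` lifecycle script: it calls `scripts/sync-plugin-version.js`, which is not in the package's `files` list and so does not appear in this diff. The sketch below shows what such a script might do, assuming it copies `version` from `package.json` into `.claude-plugin/plugin.json`. The function name `syncPluginVersion` and the output formatting (two-space indent, trailing newline) are assumptions.

```javascript
// Hypothetical sketch; the real scripts/sync-plugin-version.js is not part of this diff.
// Takes the raw text of package.json and plugin.json and returns updated
// plugin.json text whose version matches the package version.
function syncPluginVersion(packageJsonText, pluginJsonText) {
  const { version } = JSON.parse(packageJsonText);
  if (typeof version !== "string" || version.length === 0) {
    throw new Error("package.json has no version field");
  }
  const plugin = JSON.parse(pluginJsonText);
  // Spreading first keeps the existing key order; only the version value changes.
  // Assumed formatting: two-space indent and a trailing newline.
  return JSON.stringify({ ...plugin, version }, null, 2) + "\n";
}
```

In a real script the two inputs would be read with `readFileSync` from `node:fs` and the result written back with `writeFileSync`. npm runs the `version` script after it bumps `package.json` and before it creates the version commit, which is why the hook in the diff ends with `git add .claude-plugin/plugin.json`.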
package/skills/lilflow-workflow-driver/SKILL.md ADDED

@@ -0,0 +1,110 @@
+---
+name: lilflow-workflow-driver
+description: Drive a lilflow session-mode workflow from inside this agent session (Claude Code natively, or OpenCode via the oh-my-opencode Claude Code compatibility layer). Use this skill whenever the FLOW_RUN_ID environment variable is set — it means lilflow has started a workflow and expects you to advance it via the `flow session-bridge` CLI. Call `flow session-bridge next` to get the next step, execute it, then call `flow session-bridge update <step> completed|failed` to advance. For conversational agent steps, work with the user iteratively before marking complete.
+---
+# lilflow Workflow Driver
+You are driving a **lilflow** workflow. Each step in the workflow must be fetched, executed, and reported back through the `flow session-bridge` CLI. Do not try to guess the workflow — always ask the bridge what's next.
+This skill works identically in Claude Code and in OpenCode (via [oh-my-opencode](https://github.com/opensoft/oh-my-opencode)'s Claude Code compatibility layer). The bridge CLI is the same in both environments.
+## When this skill activates
+You can tell this skill applies when:
+- The environment variable `FLOW_RUN_ID` is set (always check `echo $FLOW_RUN_ID` first).
+- The user's initial prompt references a workflow, a "run", or explicitly mentions lilflow.
+If `FLOW_RUN_ID` is not set, stop and tell the user — the workflow runner did not start this session.
+## Core loop
+Follow this loop until `flow session-bridge next` reports `workflow_status: "completed"`:
+1. **Get the next step:**
+   ```bash
+   flow session-bridge next
+   ```
+   Returns JSON: `{ step, steps, remaining, workflow_status }`. `step` is non-null for a single eligible step; `steps` is a list for a parallel batch.
+2. **Execute the step** based on its `type`:
+   - **`run`** — execute `step.command` via the `Bash` tool. Capture stdout/exit code.
+   - **`agent`** with `mode: autonomous` — execute `step.agent.prompt` yourself. You are the agent.
+   - **`agent`** with `mode: conversational` — see [Conversational steps](#conversational-steps) below.
+   - **`agent`** with a different provider (e.g. `opencode`) — mark the step `deferred`; the runner handles foreign providers.
+   - **`gate`** — call `flow session-bridge gate <name>`. It evaluates the condition and returns `{ passed, workflow_should_stop }`.
+   - **`interactive`** — call `flow session-bridge ask <prompt>` to get user input, then run the command with the input.
+   - **`wait`** — call `flow session-bridge wait <name>`. Blocks until the trigger fires.
+   - **`subflow`** — not yet supported in session mode. Mark the step `deferred`.
+3. **Report the result:**
+   ```bash
+   flow session-bridge update <step-name> completed --output "..." --exit-code 0
+   flow session-bridge update <step-name> failed --reason "..."
+   ```
+4. **On failure:** call `flow session-bridge update <name> failed` and stop the loop unless the user instructs otherwise.
+5. **Repeat** until `workflow_status === "completed"`.
+## Conversational steps
+When `step.agent.mode === "conversational"`:
+1. Present the step's prompt and your proposed approach to the user.
+2. Work iteratively — implement, show results, gather feedback, adjust.
+3. Do NOT call `flow session-bridge update` until the user has confirmed satisfaction OR you have completed the task to its written specification.
+4. Explicitly tell the user when you consider the step complete before advancing.
+5. If the user says "skip", "abort", or similar, call `flow session-bridge update <name> failed --reason "user aborted"`.
+Example:
+```
+User: (workflow starts)
+Agent: The workflow says: "Create React components for the dashboard.
+       Iterate with the user until they approve." Here's my plan: ...
+User: Add a sidebar too.
+Agent: [revises, implements, shows output]. Anything else?
+User: Looks good.
+Agent: Marking "implement" complete, advancing to the next step.
+  [runs: flow session-bridge update implement completed]
+```
+## Parallel batches
+If `flow session-bridge next` returns multiple `steps`, they are eligible in parallel. You may execute them sequentially, or use sub-agents (the `Agent` tool) to run them concurrently. Call `flow session-bridge update` once per step.
+## User-initiated questions
+If you need user input in the middle of an autonomous step (an ambiguity the workflow didn't cover), call `flow session-bridge ask "<question>"` — the bridge prompts the user and returns their response on stdout.
+## Context management
+After long workflows, your context may accumulate noise from completed steps. Call:
+```bash
+flow session-bridge compact
+```
+This returns a condensed summary of completed steps. Use the summary to anchor your memory and treat earlier tool results as droppable.
+## Reporting notes and decisions
+To record observations that don't change workflow state:
+```bash
+flow session-bridge log "observation or decision"
+```
+## Inspecting state
+To see the full workflow state (what's done, what's pending):
+```bash
+flow session-bridge status
+```
+## Rules
+1. **Always** call `flow session-bridge next` before executing anything. Do not infer the workflow.
+2. **Always** call `flow session-bridge update` after finishing a step, even on failure. The bridge is the source of truth.
+3. **Never** invent step names or pretend a step succeeded. The event log is authoritative.
+4. When a gate fails with `workflow_should_stop: true`, stop the loop and report the failure to the user.
+5. Respect `mode: conversational` — do not advance these steps without user interaction.
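
Review note on the core loop in SKILL.md above: the skill is written for an agent, but the same protocol can be sketched as a script, which helps when reasoning about the bridge contract. The subcommands (`next`, `update`), the flags, and the `step` / `steps` / `workflow_status` fields come from SKILL.md. The names `cliBridge` and `driveWorkflow`, the `executeStep` callback, its result shape, and the `step.name` field are assumptions made for illustration.

```javascript
// Default bridge: shells out to `flow session-bridge <args>` and returns raw stdout.
// Assumes the `flow` binary is on PATH.
async function cliBridge(args) {
  const { execFileSync } = await import("node:child_process");
  return execFileSync("flow", ["session-bridge", ...args], { encoding: "utf8" });
}

// Hypothetical driver. `executeStep(step)` is caller-supplied and must resolve to
// { ok: boolean, output?: string, exitCode?: number, reason?: string }.
async function driveWorkflow(executeStep, bridge = cliBridge) {
  for (;;) {
    const next = JSON.parse(await bridge(["next"]));
    if (next.workflow_status === "completed") return "completed";

    // `step` is set for a single eligible step, `steps` for a parallel batch.
    const batch = next.step ? [next.step] : (next.steps ?? []);
    // Nothing eligible and not completed: hand the status back instead of spinning.
    if (batch.length === 0) return next.workflow_status;

    // A batch may run concurrently; this sketch runs it sequentially.
    for (const step of batch) {
      const result = await executeStep(step);
      if (result.ok) {
        await bridge(["update", step.name, "completed",
          "--output", result.output ?? "",
          "--exit-code", String(result.exitCode ?? 0)]);
      } else {
        await bridge(["update", step.name, "failed",
          "--reason", result.reason ?? "unknown"]);
        return "failed"; // SKILL.md: stop the loop on failure
      }
    }
  }
}
```

Passing the bridge as a parameter keeps the loop testable without the CLI installed: a fake bridge can replay canned `next` responses and record the `update` calls.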

package/src/agents/index.js CHANGED

@@ -5,6 +5,7 @@ import { runOpencode } from "./opencode.js";
 import { runClaudeCode } from "./claude-code.js";
 import { resolvePromptInput } from "./prompt.js";
 import { getSessionId, loadSessionStore, saveSessionId } from "./session-store.js";
+import { AgentOutputError, writeAgentOutput } from "./output-file.js";
 const PROVIDERS = {
   opencode: {
@@ -101,6 +102,18 @@ export async function executeAgentStep(options) {
   const templatedAppendSystemPrompt = agentSpec.append_system_prompt === undefined
     ? undefined
     : templateFn(agentSpec.append_system_prompt);
+  if (agentSpec.interactive !== undefined && typeof warn === "function") {
+    warn(
+      `Agent step '${step.name}' uses deprecated agent.interactive; prefer agent.mode: ${agentSpec.interactive === true ? "conversational" : "autonomous"}.`
+    );
+  }
+  // mode: conversational maps to TTY passthrough. Legacy agent.interactive
+  // is respected for back-compat until removal.
+  const resolvedInteractive = agentSpec.mode === "conversational"
+    || agentSpec.interactive === true;
   const result = await provider.run({
     bin: binary,
     prompt: templatedPrompt,
@@ -109,7 +122,7 @@ export async function executeAgentStep(options) {
     timeoutMs,
     cwd,
     env,
-    interactive: agentSpec.interactive === true,
+    interactive: resolvedInteractive,
     plugins: agentSpec.plugins ?? [],
     allowTools: agentSpec.allow_tools ?? [],
     appendSystemPrompt: templatedAppendSystemPrompt,
@@ -128,6 +141,33 @@ export async function executeAgentStep(options) {
     }
   }
+  if (agentSpec.output_file !== undefined && result.exitCode === 0) {
+    try {
+      const writeResult = await writeAgentOutput({
+        outputFile: agentSpec.output_file,
+        format: agentSpec.output_format ?? "text",
+        stdout: result.stdout,
+        cwd,
+        stepName: step.name
+      });
+      if (typeof warn === "function") {
+        warn(`Agent step '${step.name}' wrote ${writeResult.bytesWritten} bytes to ${agentSpec.output_file}.`);
+      }
+    } catch (error) {
+      if (error instanceof AgentOutputError) {
+        // Surface as step failure without corrupting the agent result payload.
+        return {
+          ...result,
+          exitCode: result.exitCode === 0 ? 1 : result.exitCode,
+          stderr: `${result.stderr ?? ""}\n${error.message}`.trim(),
+          outputError: { code: error.code, message: error.message }
+        };
+      }
+      throw error;
+    }
+  }
   return result;
 }
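
Review note on the mode resolution added to `executeAgentStep`: pulled out as a pure function, the logic in the hunk above is easier to check against edge cases. The function name `resolveAgentMode` and the returned shape are hypothetical; the two conditions mirror the diff.

```javascript
// Hypothetical extraction of the logic in executeAgentStep, for illustration only.
function resolveAgentMode(agentSpec) {
  const warnings = [];
  // The legacy field triggers a deprecation warning whenever it is present,
  // including when it is explicitly false.
  if (agentSpec.interactive !== undefined) {
    const preferred = agentSpec.interactive === true ? "conversational" : "autonomous";
    warnings.push("agent.interactive is deprecated; prefer agent.mode: " + preferred);
  }
  // Same expression as resolvedInteractive in the diff.
  const interactive = agentSpec.mode === "conversational" || agentSpec.interactive === true;
  return { interactive, warnings };
}
```

One consequence of the `||`: a step that sets `mode: autonomous` together with the legacy `interactive: true` still resolves to TTY passthrough, because the legacy flag is honoured on its own.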