npm - lilflow - Versions diffs - 0.1.0 - Mend

lilflow 0.1.0

Files changed (15)

package/AGENTS.md +140 -0
package/README.md +112 -0
package/package.json +50 -0
package/src/AGENTS.md +27 -0
package/src/agents/claude-code.js +352 -0
package/src/agents/index.js +228 -0
package/src/agents/ndjson.js +67 -0
package/src/agents/opencode.js +290 -0
package/src/agents/prompt.js +91 -0
package/src/agents/session-store.js +91 -0
package/src/cli.js +204 -0
package/src/config/AGENTS.md +23 -0
package/src/config.js +776 -0
package/src/init-project.js +573 -0
package/src/run-workflow.js +6274 -0

package/AGENTS.md ADDED

@@ -0,0 +1,140 @@
+# lilflow
+## Stack
+- **Language/Runtime:** Unknown
+## Build & Test Commands
+```bash
+# No recognized stack — add build/test commands here
+```
+## Project Structure
+```
+lilflow/
+├── src/           # Source code
+├── tests/         # Test files
+├── docs/          # Documentation
+└── .claude/       # Codeharness state
+```
+## Conventions
+- All changes must pass tests before commit
+- Maintain test coverage targets
+- Follow existing code style and patterns
+## Harness Files
+- `AGENTS.md` is the primary repo-local instruction file for coding agents
+- `commands/` contains harness command playbooks the agent can read and execute directly
+- `skills/` contains focused harness skills and operating procedures
+- Install BMAD with `npx bmad-method install --yes --modules bmm --tools claude-code` for Claude Code
+<!-- CODEHARNESS-PATCH-START:dev-workflow-enforcement -->
+## Harness Enforcement (codeharness)
+During implementation, the agent MUST:
+### Observability
+- Query VictoriaLogs after running the application to check for errors
+- Verify OTLP instrumentation is active in new code paths
+- Use `curl localhost:9428/select/logsql/query?query=level:error` to check for runtime errors
+### Documentation
+- Create or update per-subsystem AGENTS.md for any new modules (max 100 lines)
+- Update exec-plan at `docs/exec-plans/active/{story-id}.md` with progress
+- Ensure inline code documentation for new public APIs
+### Testing
+- Write tests AFTER implementation, BEFORE verification
+- Achieve 100% project-wide test coverage
+- All tests must pass before proceeding to verification
+### Verification
+- Run `/harness-verify` to produce proof document
+- Do NOT mark story complete without passing verification
+<!-- CODEHARNESS-PATCH-END:dev-workflow-enforcement -->
+<!-- CODEHARNESS-PATCH-START:code-review-harness -->
+## Harness Review Checklist (codeharness)
+In addition to standard code review, verify:
+### Documentation
+- [ ] AGENTS.md files are fresh for all changed modules
+- [ ] Exec-plan updated with story progress
+- [ ] Inline documentation present for new public APIs
+### Testing
+- [ ] Tests exist for all new code
+- [ ] Project-wide test coverage is 100%
+- [ ] No skipped or disabled tests
+- [ ] Test coverage report is present
+### Verification
+- [ ] Proof document exists at `verification/{story-id}-proof.md`
+- [ ] Proof document covers all acceptance criteria
+- [ ] Evidence is reproducible
+<!-- CODEHARNESS-PATCH-END:code-review-harness -->
+<!-- CODEHARNESS-PATCH-START:sprint-planning-harness -->
+## Harness Pre-Sprint Checklist (codeharness)
+Before starting the sprint, verify:
+### Planning Docs
+- [ ] PRD is current and reflects latest decisions
+- [ ] Architecture doc is current (ARCHITECTURE.md)
+- [ ] Epics and stories are fully defined with Given/When/Then ACs
+### Test Infrastructure
+- [ ] Coverage tool configured for the stack (c8/istanbul, coverage.py)
+- [ ] Baseline coverage recorded in `.claude/codeharness.local.md`
+- [ ] Test runner configured and working
+### Harness Infrastructure
+- [ ] Harness initialized (`/harness-init` completed)
+- [ ] Docker stack healthy (VictoriaMetrics responding)
+- [ ] OTLP instrumentation installed
+- [ ] Hooks registered and active
+<!-- CODEHARNESS-PATCH-END:sprint-planning-harness -->
+<!-- CODEHARNESS-PATCH-START:retro-harness -->
+## Harness Analysis (codeharness)
+The retrospective MUST analyze the following in addition to standard retro topics:
+### Verification Effectiveness
+- Pass rates per story (how many stories verified on first attempt?)
+- Common failure patterns (what types of verification fail most?)
+- Iteration counts per story (how many implement→verify→fix loops?)
+- Average iterations vs target (<3)
+### Documentation Health
+- Stale doc count (AGENTS.md files not updated since code changed)
+- Quality grades per module (A/B/C/D/F)
+- Doc-gardener findings summary
+- Documentation debt trends (improving or degrading?)
+### Test Quality
+- Coverage trends per story (baseline → final, deltas)
+- Tests that caught real bugs vs tests that never failed
+- Flaky test detection (tests that pass/fail inconsistently)
+- Test suite execution time trends
+### Follow-up Items
+Convert each finding into an actionable item:
+- Code/test issues → new story for next sprint
+- Process improvements → BMAD workflow patch
+- Verification gaps → enforcement config update
+<!-- CODEHARNESS-PATCH-END:retro-harness -->

package/README.md ADDED

@@ -0,0 +1,112 @@
+# lilflow
+Repo-native workflow engine CLI. Step-level resumable YAML workflows with state stored under `.flow/` in your repo.
+## Install
+```bash
+npm install -g lilflow
+```
+Binary is exposed as both `lilflow` and `flow`.
+## Quick Start
+```bash
+flow init        # scaffold workflow.yaml + .flow/ state dir
+flow run         # execute the workflow
+flow status      # inspect run state
+flow resume      # resume a failed run from the last completed step
+```
+## Parallel Steps
+The runner supports contiguous parallel batches inside `workflow.yaml`.
+```yaml
+name: ci
+steps:
+  - name: lint
+    run: npm run lint
+    parallel: true
+  - name: test
+    run: npm test
+    parallel: true
+  - name: build
+    run: npm run build
+```
+Parallel step output is streamed in real time with step-name prefixes such as `[lint] ...`, and `flow status <run-id>` shows each started step with its own `running`, `completed`, or `failed` state.
+Parallel batch concurrency is controlled by the resolved `parallelism` config value. The default is `4`, and you can override it in `.flow/config.yaml`, `~/.flow/config.yaml`, or with `FLOW_PARALLELISM`.
+```yaml
+parallelism: 2
+```
+## Retry Logic
+Steps can retry transient failures with exponential backoff.
+```yaml
+name: deploy
+steps:
+  - name: flaky-check
+    run: npm run flaky-check
+    retry: 3
+    retry_delay: 5s
+```
+`retry` is the number of retries after the initial attempt, so `retry: 3` allows up to 4 total attempts. `retry_delay` is the base delay and doubles on each retry. The runner prints numbered attempt labels such as `[attempt 2/4]`, shows the failure reason for each retryable attempt, and prints wait lines such as `Waiting 10s before retry 2/3...`.
+## For-Each Loops
+Steps can expand an inline array into concrete iterations with `for_each`.
+```yaml
+name: deploy
+steps:
+  - name: deploy-target
+    run: echo "Deploying to {{item}} ({{item_number}}/{{item_count}})"
+    for_each: [dev, staging, prod]
+```
+Each iteration becomes a concrete runtime step with a stable label such as `deploy-target [item-prod]`. The templating variables `{{item}}`, `{{item_index}}`, `{{item_number}}`, `{{item_count}}`, `{{item_first}}`, and `{{item_last}}` are available in string step fields. If one iteration fails, the remaining iterations still run before the workflow stops.
+## Workflow Templates
+`flow init` can scaffold `workflow.yaml` from reusable local templates stored under `.flow/templates/<name>/workflow.yaml`.
+```bash
+flow init --list-templates
+flow init --template ci-pipeline
+```
+Template files can mix prompt-backed placeholders and config-backed placeholders:
+```yaml
+name: {{template.workflow_name|ci-pipeline}}
+parallelism: {{config.parallelism}}
+steps:
+  - name: test
+    run: npm test
+```
+- `{{template.name}}` prompts for a value
+- `{{template.name|default}}` prompts and uses the default when you press enter
+- `{{config.parallelism}}` reads from the resolved lilflow config
+Any directory matching `.flow/templates/<name>/workflow.yaml` is a usable custom template.
+## Configuration
+- Project config: `.flow/config.yaml`
+- Global config: `~/.flow/config.yaml`
+- Env var override: `FLOW_*`
+- Override order: defaults → global → project → env → flags
+## License
+MIT
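
The retry rules described in the README above (`retry: 3`, `retry_delay: 5s`, base delay doubling on each retry) can be sketched as a small schedule calculation. This is only an illustration under stated assumptions: `retrySchedule` is a hypothetical helper, not a function from the package, and it takes the base delay as a number of seconds instead of parsing a string such as `5s`.

```javascript
// Hypothetical helper (not part of lilflow): lists the wait before each retry,
// assuming the base delay doubles on every retry as the README describes.
function retrySchedule(retries, baseDelaySeconds) {
  const totalAttempts = retries + 1; // initial attempt plus the retries
  const waits = [];
  for (let retry = 1; retry <= retries; retry += 1) {
    waits.push({
      retry,
      // Label of the attempt that runs after this wait.
      attemptLabel: `[attempt ${retry + 1}/${totalAttempts}]`,
      waitSeconds: baseDelaySeconds * 2 ** (retry - 1)
    });
  }
  return { totalAttempts, waits };
}

const schedule = retrySchedule(3, 5);
console.log(schedule.totalAttempts); // 4
console.log(schedule.waits.map((w) => w.waitSeconds)); // [ 5, 10, 20 ]
```

With these inputs the second retry waits 10 seconds, which matches the README's sample line `Waiting 10s before retry 2/3...`.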

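The `for_each` expansion described in the README above can be illustrated with a short sketch. It is not the package's implementation: `expandForEach` is a hypothetical name, the sketch assumes `item_index` is zero-based and `item_number` is one-based, and it skips the shell escaping that `src/AGENTS.md` says is applied when loop items are interpolated into `run` commands.

```javascript
// Hypothetical sketch, not lilflow's implementation: expands a `for_each`
// step into one rendered command per item. Assumes `item_index` is zero-based
// and `item_number` is one-based; no shell escaping is performed here.
function expandForEach(step) {
  const items = step.for_each;
  return items.map((item, index) => {
    const vars = {
      item: String(item),
      item_index: String(index),
      item_number: String(index + 1),
      item_count: String(items.length),
      item_first: String(index === 0),
      item_last: String(index === items.length - 1)
    };
    return {
      label: `${step.name} [item-${item}]`,
      // Unknown placeholders are left untouched.
      run: step.run.replace(/\{\{(\w+)\}\}/g, (match, key) =>
        Object.hasOwn(vars, key) ? vars[key] : match)
    };
  });
}

const iterations = expandForEach({
  name: "deploy-target",
  run: 'echo "Deploying to {{item}} ({{item_number}}/{{item_count}})"',
  for_each: ["dev", "staging", "prod"]
});
console.log(iterations[2].label); // deploy-target [item-prod]
console.log(iterations[2].run);   // echo "Deploying to prod (3/3)"
```
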
package/package.json ADDED

@@ -0,0 +1,50 @@
+{
+  "name": "lilflow",
+  "version": "0.1.0",
+  "description": "Repo-native workflow engine CLI",
+  "type": "module",
+  "bin": {
+    "lilflow": "./src/cli.js",
+    "flow": "./src/cli.js"
+  },
+  "files": [
+    "src",
+    "README.md",
+    "AGENTS.md"
+  ],
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/iVintik/lilflow.git"
+  },
+  "bugs": {
+    "url": "https://github.com/iVintik/lilflow/issues"
+  },
+  "homepage": "https://github.com/iVintik/lilflow#readme",
+  "license": "MIT",
+  "keywords": [
+    "workflow",
+    "cli",
+    "automation",
+    "agents"
+  ],
+  "publishConfig": {
+    "access": "public"
+  },
+  "scripts": {
+    "test": "c8 --reporter=text --reporter=lcov --reporter=json-summary node --test",
+    "coverage": "npm test",
+    "lint": "eslint src/"
+  },
+  "engines": {
+    "node": ">=22"
+  },
+  "dependencies": {
+    "js-yaml": "^4.1.0"
+  },
+  "devDependencies": {
+    "@eslint/js": "^9.25.1",
+    "c8": "^10.1.3",
+    "eslint": "^9.25.1",
+    "globals": "^16.0.0"
+  }
+}

package/src/AGENTS.md ADDED

@@ -0,0 +1,27 @@
+# src
+## Purpose
+Contains the `flow` CLI implementation for this repo.
+## Modules
+- `cli.js`: process entrypoint and command dispatch for `init`, `config`, `run`, `resume`, `set-step`, `status`, `list`, and `logs`, including `flow init --template` / `--list-templates`
+- `config.js`: config defaults, YAML parsing, merge precedence across global/project/workflow/workflow-local/env/flag layers, workflow-local path discovery, and validation
+- `init-project.js`: filesystem logic for `flow init`, reusable template discovery under `.flow/templates/<name>/workflow.yaml`, strict template-name validation before CLI/path use, template placeholder resolution for `{{template.*}}` prompts and `{{config.*}}` config-backed values, and safe non-overwriting scaffold creation
+- `run-workflow.js`: workflow YAML parsing, workflow `parameters` definitions with string/number/boolean typing plus defaults and required flags, conditional `if` expressions with `env.*`, `steps.*.(stdout|stderr|exit_code)`, and `parameters.*` references, first-class `gate` steps with `gate_action: fail|warn` and optional messages, first-class `subflow` steps with child run IDs, parent linkage, `with` parameter passing, child `FLOW_PARAM_*` env injection, per-workflow hierarchical config resolution for top-level and nested workflows, persisted child `run_started.parameters` metadata, output aggregation, and failure propagation, `step_skipped` persistence plus deterministic resume for skipped steps, manual `step_set` overrides with interactive confirmation plus auditable resume-position changes for failed and completed runs, `for_each` loop expansion with strict loop-template validation plus `{{item}}`/iteration metadata templating, shell-escaped loop-item interpolation for `run` commands, ordered execution with optional parallel batches, loop batches that finish every iteration before surfacing failure, `depends_on` validation with cycle and same-batch dependency checks, append-only JSONL event persistence under `.flow/runs/`, optional structured harness-event emission to stdout when `HARNESS_SESSION_ID` is set, nested execution-path metadata for root and subflow runs, persisted run-status reconstruction for `flow status`, persisted run listing and status filtering for `flow list`, persisted log reconstruction and step filtering for `flow logs`, failed-run replay plus manual-reset replay for `flow resume`, run-ID validation for persisted status/log lookups, stable failure event schemas for timeouts, non-zero exits, gate failures, and subflow failures, persisted stdout/stderr for completed and failed steps, per-step timeout overrides, streamed output with per-step prefixes during parallel and loop execution, coordinated cancellation for parallel children on step failure or workflow signals, previous-step output env propagation, loop metadata env propagation, ANSI-colored run output, and failure diagnostics with recent output plus resume hints
+## Conventions
+- Keep source in ESM JavaScript under `src/`
+- Export small public functions with JSDoc
+- Prefer pure helpers around filesystem decisions where practical
+- Preserve idempotent CLI behavior; do not overwrite user files silently
+- Add tests in `tests/` for every new branch in CLI behavior
+## Verification
+- Run `npm test`
+- Run `npm run coverage`
+- Run `npx eslint src/ --fix`
+- Run `npx eslint src/`

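The README gives the config override order as defaults → global → project → env → flags, and the `config.js` entry above additionally lists workflow and workflow-local layers. A minimal sketch of that precedence, assuming a shallow merge in which later layers win, might look like the following. `resolveConfig` and its parameter names are hypothetical, the workflow layers are left out, and the real `config.js` may merge and validate differently.

```javascript
// Illustrative only: `resolveConfig` and the shallow merge are assumptions,
// not the package's config.js. Later layers override earlier ones.
function resolveConfig({
  defaults = {},
  globalConfig = {},
  projectConfig = {},
  envOverrides = {},
  flags = {}
}) {
  return { ...defaults, ...globalConfig, ...projectConfig, ...envOverrides, ...flags };
}

const resolved = resolveConfig({
  defaults: { parallelism: 4 },      // documented default
  globalConfig: { parallelism: 8 },  // e.g. from ~/.flow/config.yaml
  projectConfig: { parallelism: 2 }, // e.g. from .flow/config.yaml
  envOverrides: {}                   // e.g. FLOW_PARALLELISM not set
});
console.log(resolved.parallelism); // 2

const withEnv = resolveConfig({
  defaults: { parallelism: 4 },
  projectConfig: { parallelism: 2 },
  envOverrides: { parallelism: 6 }   // e.g. FLOW_PARALLELISM=6
});
console.log(withEnv.parallelism); // 6
```
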
package/src/agents/claude-code.js ADDED

@@ -0,0 +1,352 @@
+import { spawn } from "node:child_process";
+/**
+ * Run the `claude` CLI as a headless or interactive agent step.
+ *
+ * In headless mode, the CLI is invoked with `-p <prompt> --output-format json`.
+ * The full stdout is parsed as a single JSON object — we extract `total_cost_usd`
+ * (or `cost_usd`), `session_id`, and the final assistant text (`result` field).
+ *
+ * In interactive mode, stdio is inherited for full TTY passthrough.
+ *
+ * @param {object} options - Invocation options.
+ * @param {string} options.bin - Resolved binary path for the `claude` CLI.
+ * @param {string} options.prompt - Final prompt text.
+ * @param {string} [options.model] - Optional model identifier.
+ * @param {string | null} [options.sessionId] - Prior session ID to resume, or null for fresh.
+ * @param {number} options.timeoutMs - Step timeout in milliseconds.
+ * @param {string} options.cwd - Working directory.
+ * @param {object} options.env - Process environment.
+ * @param {boolean} [options.interactive=false] - TTY passthrough mode.
+ * @param {string[]} [options.allowTools] - Tool names for `--allowedTools`.
+ * @param {string} [options.appendSystemPrompt] - Extra system prompt injection.
+ * @param {boolean} [options.sourceAccess=true] - When true, pass `--dangerously-skip-permissions`.
+ * @param {typeof spawn} [options.spawnProcess=spawn] - Child process factory (for tests).
+ * @param {{isCancelled: () => boolean, getReason: () => string | null, trackChild: (child: import("node:child_process").ChildProcess) => () => void}} [options.cancellation] - Cancellation controller.
+ * @returns {Promise<{exitCode: number, stdout: string, stderr: string, combinedOutput: string, costUsd: number | null, sessionId: string | null}>} Agent run result.
+ */
+export function runClaudeCode(options) {
+  const {
+    bin,
+    prompt,
+    model,
+    sessionId = null,
+    timeoutMs,
+    cwd,
+    env,
+    interactive = false,
+    allowTools = [],
+    appendSystemPrompt,
+    sourceAccess = true,
+    spawnProcess = spawn,
+    cancellation = null
+  } = options;
+  if (interactive) {
+    // Surface the prompt to the user before launching the TUI. The claude REPL
+    // does not accept a seed prompt as a flag, so the user has to copy/paste
+    // or type it after launch.
+    if (typeof prompt === "string" && prompt.trim() !== "") {
+      process.stdout.write(`\n[agent prompt — paste into claude after launch]\n${prompt}\n\n`);
+    }
+    return spawnInteractive({
+      bin,
+      prompt,
+      model,
+      sessionId,
+      cwd,
+      env,
+      timeoutMs,
+      cancellation,
+      sourceAccess,
+      spawnProcess
+    });
+  }
+  const args = ["--output-format", "json"];
+  if (model) {
+    args.push("--model", model);
+  }
+  if (sessionId) {
+    args.push("--resume", sessionId);
+  }
+  if (allowTools.length > 0) {
+    args.push("--allowedTools", ...allowTools);
+  }
+  if (appendSystemPrompt) {
+    args.push("--append-system-prompt", appendSystemPrompt);
+  }
+  if (sourceAccess) {
+    args.push("--dangerously-skip-permissions");
+  }
+  // Place `-p` last with `--` separator so a prompt starting with `-` is not
+  // parsed as a flag by the claude CLI argv parser.
+  args.push("-p", "--", prompt);
+  return spawnHeadless({ bin, args, cwd, env, timeoutMs, cancellation, spawnProcess });
+}
+/**
+ * @param {object} options - Spawn options.
+ * @returns {Promise<{exitCode: number, stdout: string, stderr: string, combinedOutput: string, costUsd: number | null, sessionId: string | null}>} Result.
+ */
+function spawnHeadless(options) {
+  const { bin, args, cwd, env, timeoutMs, cancellation, spawnProcess } = options;
+  return new Promise((resolve, reject) => {
+    const child = spawnProcess(bin, args, {
+      cwd,
+      env,
+      stdio: ["ignore", "pipe", "pipe"]
+    });
+    const untrackChild = cancellation?.trackChild(child) ?? (() => {});
+    let stdoutCapture = "";
+    let stderrCapture = "";
+    let combinedOutput = "";
+    let settled = false;
+    let timedOut = false;
+    const timer = setTimeout(() => {
+      timedOut = true;
+      child.kill("SIGTERM");
+    }, timeoutMs);
+    child.stdout.on("data", (chunk) => {
+      const text = chunk.toString();
+      stdoutCapture += text;
+      combinedOutput += text;
+    });
+    child.stderr.on("data", (chunk) => {
+      const text = chunk.toString();
+      stderrCapture += text;
+      combinedOutput += text;
+    });
+    child.on("error", (error) => {
+      if (settled) return;
+      settled = true;
+      clearTimeout(timer);
+      untrackChild();
+      reject(new Error(`Failed to start claude agent: ${error.message}`));
+    });
+    child.on("close", (code, signal) => {
+      if (settled) return;
+      settled = true;
+      clearTimeout(timer);
+      untrackChild();
+      if (timedOut) {
+        reject(Object.assign(new Error("claude agent timed out"), { code: "STEP_TIMEOUT" }));
+        return;
+      }
+      if (signal) {
+        if (cancellation?.isCancelled()) {
+          const reason = cancellation.getReason();
+          reject(Object.assign(new Error(`claude agent cancelled (${reason})`), {
+            code: reason === "STEP_FAILURE" ? "STEP_CANCELLED" : "WORKFLOW_CANCELLED"
+          }));
+          return;
+        }
+        reject(new Error(`claude agent exited due to signal ${signal}`));
+        return;
+      }
+      const captured = parseClaudeResponse(stdoutCapture);
+      resolve({
+        exitCode: code ?? 1,
+        stdout: captured.text ?? stdoutCapture,
+        stderr: stderrCapture,
+        combinedOutput,
+        costUsd: captured.costUsd,
+        sessionId: captured.sessionId
+      });
+    });
+  });
+}
+/**
+ * @param {object} options - Interactive spawn options.
+ * @returns {Promise<{exitCode: number, stdout: string, stderr: string, combinedOutput: string, costUsd: null, sessionId: null}>} Result.
+ */
+function spawnInteractive(options) {
+  const {
+    bin, model, sessionId, cwd, env, timeoutMs, cancellation, sourceAccess, spawnProcess
+  } = options;
+  const args = [];
+  if (model) {
+    args.push("--model", model);
+  }
+  if (sessionId) {
+    args.push("--resume", sessionId);
+  }
+  if (sourceAccess) {
+    args.push("--dangerously-skip-permissions");
+  }
+  return new Promise((resolve, reject) => {
+    const child = spawnProcess(bin, args, {
+      cwd,
+      env,
+      stdio: "inherit"
+    });
+    const untrackChild = cancellation?.trackChild(child) ?? (() => {});
+    let settled = false;
+    let timedOut = false;
+    const timer = setTimeout(() => {
+      timedOut = true;
+      child.kill("SIGTERM");
+    }, timeoutMs);
+    child.on("error", (error) => {
+      if (settled) return;
+      settled = true;
+      clearTimeout(timer);
+      untrackChild();
+      reject(new Error(`Failed to start claude agent: ${error.message}`));
+    });
+    child.on("close", (code, signal) => {
+      if (settled) return;
+      settled = true;
+      clearTimeout(timer);
+      untrackChild();
+      if (timedOut) {
+        reject(Object.assign(new Error("claude agent timed out"), { code: "STEP_TIMEOUT" }));
+        return;
+      }
+      if (signal) {
+        reject(new Error(`claude agent exited due to signal ${signal}`));
+        return;
+      }
+      resolve({
+        exitCode: code ?? 1,
+        stdout: "",
+        stderr: "",
+        combinedOutput: "",
+        costUsd: null,
+        sessionId: null
+      });
+    });
+  });
+}
+/**
+ * Parse the single JSON response produced by `claude -p ... --output-format json`.
+ *
+ * @param {string} raw - Raw stdout from the claude CLI.
+ * @returns {{costUsd: number | null, sessionId: string | null, text: string | null}} Captured fields.
+ */
+function parseClaudeResponse(raw) {
+  const trimmed = raw.trim();
+  if (trimmed === "") {
+    return { costUsd: null, sessionId: null, text: null };
+  }
+  const parsed = extractJsonObject(trimmed);
+  if (parsed === null) {
+    return { costUsd: null, sessionId: null, text: null };
+  }
+  let costUsd = null;
+  if (typeof parsed.total_cost_usd === "number") {
+    costUsd = parsed.total_cost_usd;
+  } else if (typeof parsed.cost_usd === "number") {
+    costUsd = parsed.cost_usd;
+  }
+  const sessionId = typeof parsed.session_id === "string" && parsed.session_id !== ""
+    ? parsed.session_id
+    : null;
+  const text = typeof parsed.result === "string"
+    ? parsed.result
+    : typeof parsed.text === "string"
+      ? parsed.text
+      : null;
+  return { costUsd, sessionId, text };
+}
+/**
+ * Extract a single JSON object from raw stdout that may contain non-JSON
+ * preamble lines (warnings, deprecation notices). Tries the entire string
+ * first, then falls back to scanning each line, then to scanning for the
+ * largest brace-balanced substring.
+ *
+ * @param {string} raw - Raw stdout string (already trimmed).
+ * @returns {object | null} Parsed object or null when no JSON is found.
+ */
+function extractJsonObject(raw) {
+  try {
+    const value = JSON.parse(raw);
+    if (value && typeof value === "object" && !Array.isArray(value)) {
+      return value;
+    }
+  } catch {
+    // fall through to line scan
+  }
+  const lines = raw.split("\n");
+  for (let lineIndex = lines.length - 1; lineIndex >= 0; lineIndex -= 1) {
+    const candidate = lines[lineIndex].trim();
+    if (candidate === "" || !candidate.startsWith("{")) {
+      continue;
+    }
+    try {
+      const value = JSON.parse(candidate);
+      if (value && typeof value === "object" && !Array.isArray(value)) {
+        return value;
+      }
+    } catch {
+      continue;
+    }
+  }
+  const firstBrace = raw.indexOf("{");
+  const lastBrace = raw.lastIndexOf("}");
+  if (firstBrace !== -1 && lastBrace > firstBrace) {
+    try {
+      const value = JSON.parse(raw.slice(firstBrace, lastBrace + 1));
+      if (value && typeof value === "object" && !Array.isArray(value)) {
+        return value;
+      }
+    } catch {
+      // give up
+    }
+  }
+  return null;
+}
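
The fallback order in `extractJsonObject` above (whole string, then individual lines from the end, then the span from the first `{` to the last `}`) is easier to follow against concrete inputs. The block below is a condensed restatement written for illustration, not the file's code: `tryParseObject` and `extractJsonObjectSketch` are hypothetical names, and the sample stdout strings are invented.

```javascript
// Condensed sketch of the fallback order in extractJsonObject above.
// Helper names and sample inputs are invented for illustration.
function tryParseObject(text) {
  try {
    const value = JSON.parse(text);
    return value && typeof value === "object" && !Array.isArray(value) ? value : null;
  } catch {
    return null;
  }
}

function extractJsonObjectSketch(raw) {
  // 1. Try the whole string.
  const whole = tryParseObject(raw);
  if (whole) return whole;
  // 2. Try each line that starts with "{", last line first.
  const lines = raw.split("\n").map((line) => line.trim());
  for (let i = lines.length - 1; i >= 0; i -= 1) {
    if (lines[i].startsWith("{")) {
      const parsed = tryParseObject(lines[i]);
      if (parsed) return parsed;
    }
  }
  // 3. Try the span from the first "{" to the last "}".
  const first = raw.indexOf("{");
  const last = raw.lastIndexOf("}");
  return first !== -1 && last > first ? tryParseObject(raw.slice(first, last + 1)) : null;
}

const withPreamble =
  'warning: something deprecated\n{"result":"done","session_id":"abc","total_cost_usd":0.01}';
const multiLine = 'note\n{\n  "result": "ok"\n}';
console.log(extractJsonObjectSketch(withPreamble).session_id); // abc
console.log(extractJsonObjectSketch(multiLine).result);        // ok
console.log(extractJsonObjectSketch("no json here"));          // null
```

The first sample is resolved by the line scan, the second only by the brace-span fallback, and the third yields `null`, in which case `parseClaudeResponse` returns null fields and the caller falls back to the raw stdout.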