singleton-pipeline 0.4.0-beta.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Romain Lentz
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,200 @@
1
+ # Singleton Pipeline Builder (v0.4.0-beta.0)
2
+
3
+ Build multi-agent pipelines for your codebase, visually.
4
+
5
+ You probably already chain Claude, Codex, Copilot, or OpenCode agents by hand: a scout reads the repo, a generator writes code, a reviewer checks it. Singleton turns that workflow into a reusable pipeline you can edit in a graph, run in one command, and commit cleanly.
6
+
7
+ - agents are plain Markdown files
8
+ - pipelines are JSON, stored in your project under `.singleton/`
9
+ - runs are versioned, with a manifest of what was actually written
10
+ - nothing leaves your machine: Singleton drives local provider CLIs (`claude`, `codex`, `copilot`, `opencode`)
11
+
12
+ > Status: beta. Singleton is usable locally, but the pipeline format, provider adapters, and builder UX may still evolve.
13
+
14
+ ## Version 0.4.0 Beta
15
+
16
+ This beta unifies the four runners around a single security model and adds the deterministic post-run validation layer.
17
+
18
+ - All four runners (Claude Code, Codex, Copilot, OpenCode) accept a `security_profile` parameter and translate it to their native CLI flags through a dedicated, unit-tested builder (`buildClaudePermissionArgs`, `buildCodexSandboxArgs`, `buildCopilotPermissionArgs`, `buildOpenCodePermissionConfig`).
19
+ - Codex now honors the four profiles via `--sandbox` modes (`read-only`, `workspace-write`, `danger-full-access`) instead of a hardcoded `workspace-write`.
20
+ - Claude now maps profiles to `--permission-mode` and `--disallowedTools` instead of relying on the legacy `permission_mode`; the legacy escape hatch is preserved.
21
+ - Layer 3 (post-run snapshot diff) is now covered by a dedicated test suite that exercises out-of-bounds writes, blocked-path patterns, `../` traversal, and read-only enforcement without any LLM in the loop.
22
+ - `assertWriteAllowed` is unit-tested as the atomic security predicate (11 cases) in `security/policy.test.js`.
23
+ - Preflight now emits explicit info/warning messages for each provider × profile combination, calling out when Singleton relies on Layer 3 because a runner has no per-path filter.
24
+ - Shared runner helpers (`safeJsonParse`, `extractText`, `findUsage`, `findCostUsd`) are now factored into `runners/_shared.js`, removing ~120 lines of duplicated parsing logic.
25
+ - OpenCode auth fix: Singleton no longer redirects `XDG_DATA_HOME`, which used to strip provider credentials from the spawned process. Permission isolation now relies exclusively on `OPENCODE_CONFIG_CONTENT`.
26
+ - A `Security model` section in this README describes the three layers and their per-runner coverage.
27
+ - Test count grew from 52 to 96 over the course of this unification work.
28
+
29
+ ## Version 0.3.0 Beta
30
+
31
+ This beta focused on multi-provider execution, Copilot support, inspection, and safer local runs.
32
+
33
+ - Claude, Codex, Copilot, and experimental OpenCode can now run from the same pipeline model.
34
+ - Copilot support uses the local `copilot` CLI with optional `runner_agent`.
35
+ - Copilot tool permissions are generated from Singleton security profiles using `--allow-tool` and `--deny-tool`.
36
+ - Repo-level Copilot profiles in `.github/agents/*.agent.md` are optional; user-level and organization-level agents can also be used.
37
+ - OpenCode support uses the local `opencode` CLI with optional `runner_agent`; Singleton maps security profiles to OpenCode native permissions through runtime config.
38
+ - `Policy` is now visible during runs and in the final recap.
39
+ - Agents can run as `read-only`, `workspace-write`, `restricted-write`, or `dangerous`.
40
+ - Pipelines can restrict writers to exact files or folders with `allowed_paths`.
41
+ - `.singleton/security.json` defines project-wide defaults, blocked paths, and commit exclusions.
42
+ - Singleton validates writes before execution, at write-time, and after each step by checking real project changes.
43
+ - Security violations pause the REPL with `continue`, `stop`, and `diff` options.
44
+ - `--debug` pauses before each step with `continue`, `inspect`, `edit`, `skip`, and `abort`.
45
+ - Debug also pauses after each step to inspect parsed outputs, written files, detected changes, and diffs before continuing.
46
+ - Debug inspect shows the full prompt that will be sent to the provider.
47
+ - Debug edit lets you override resolved step inputs for the current run only.
48
+ - Debug replay can rerun a step with adjusted inputs; project file changes from the previous attempt are restored first.
49
+ - Debug replay stores repeated step artifacts under `attempt-1`, `attempt-2`, etc.; steps without replay keep artifacts at the step root.
50
+ - Debug replay is capped per step and only restores detected project file changes, not external side effects.
51
+ - Edited inputs are marked in prompt preview with `debug-edited="true"` to make prompt priority easier to inspect.
52
+ - Debug decisions are recorded in `run-manifest.json` as lightweight `debugEvents`.
53
+ - Debug runs are stored with a `DEBUG-` prefix in `.singleton/runs/`.
54
+ - Raw provider output can be inspected during debug and is saved as `raw-output.md` when structured output parsing fails or debug detects output warnings.
55
+ - Run manifests are written even when a pipeline fails, so partial runs remain inspectable.
56
+ - `/commit-last` previews files, applies security exclusions, and asks for confirmation.
57
+
58
+ ## Install
59
+
60
+ ```bash
61
+ npm install
62
+ npm link # optional, to use `singleton` globally
63
+ ```
64
+
65
+ Requirements: Node 20+, plus the provider CLIs you want to use (`claude`, `codex`, `copilot`, and/or `opencode`), each available on your `$PATH` with a working session.
66
+
67
+ ## Quickstart
68
+
69
+ Run the bundled mixed-provider example end-to-end (uses Claude for scouting/review and Codex for implementation):
70
+
71
+ ```bash
72
+ singleton run --pipeline examples/mixed-code-review/.singleton/pipelines/text-spec-to-code-review-mixed.json
73
+ ```
74
+
75
+ Add `--dry-run` to validate the pipeline without calling any LLM.
76
+
77
+ Use `--debug` to pause before each step, inspect the prompt, and adjust inputs for the current run:
78
+
79
+ ```bash
80
+ singleton run --pipeline examples/mixed-code-review/.singleton/pipelines/text-spec-to-code-review-mixed.json --debug
81
+ ```
82
+
83
+ Open the visual builder on your own project:
84
+
85
+ ```bash
86
+ singleton serve --root /path/to/your/project
87
+ cd packages/web && npm run dev
88
+ # → http://localhost:5173
89
+ ```
90
+
91
+ Or just drop into the REPL:
92
+
93
+ ```bash
94
+ singleton
95
+ ```
96
+
97
+ ## How it works
98
+
99
+ A pipeline is a graph of two node types:
100
+
101
+ - **Input**: a value or file path you provide at runtime
102
+ - **Agent**: a Markdown file (`## Config` + prompt) that gets executed
103
+
104
+ Steps are wired together through four reference types:
105
+
106
+ | Reference | Direction | Use for |
107
+ | ---------------------- | --------- | --------------------------------------------- |
108
+ | `$INPUT:<id>` | in | a value supplied at run time |
109
+ | `$FILE:<path>` | in / out | read or write a single file |
110
+ | `$PIPE:<agent>.<out>` | in | grab the output of a previous step |
111
+ | `$FILES:<dir>` | out | let an agent emit several files at once |
112
+
113
+ Execution is sequential, ordered by `$PIPE` dependencies. A preflight pass validates inputs, files, providers, references, and security policies before any LLM is called. Each run lands in `.singleton/runs/<id>/` with a manifest, even when the run fails; `/commit-last` stages only approved project deliverables (never `.singleton` itself).
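The "ordered by `$PIPE` dependencies" rule can be pictured as a depth-first topological sort. The sketch below is illustrative only: the step shape (`id`, `inputs`) and the function name `orderSteps` are assumptions made for this example, not Singleton's actual pipeline schema or executor code.

```javascript
// Hypothetical step shape: { id: string, inputs: { [name]: string } }.
// The real pipeline JSON format is documented in docs/reference.md.
const PIPE_REF = /^\$PIPE:([^.]+)\.(.+)$/;

function orderSteps(steps) {
  const byId = new Map(steps.map((step) => [step.id, step]));
  const done = new Set();
  const visiting = new Set();
  const ordered = [];

  function visit(step) {
    if (done.has(step.id)) return;
    if (visiting.has(step.id)) throw new Error(`cycle at step "${step.id}"`);
    visiting.add(step.id);

    // Every $PIPE:<agent>.<out> input must run before the current step.
    for (const value of Object.values(step.inputs ?? {})) {
      const match = PIPE_REF.exec(String(value));
      if (!match) continue; // $INPUT / $FILE references add no ordering constraint
      const upstream = byId.get(match[1]);
      if (!upstream) throw new Error(`unknown step "${match[1]}" in ${value}`);
      visit(upstream);
    }

    visiting.delete(step.id);
    done.add(step.id);
    ordered.push(step.id);
  }

  for (const step of steps) visit(step);
  return ordered;
}
```

Given a reviewer that reads `$PIPE:writer.code` and a writer that reads `$PIPE:scout.summary`, this returns the scout first and the reviewer last, whatever order the steps were declared in. Unknown references and cycles throw instead of being silently skipped, which is the kind of error a preflight pass would be expected to surface.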
114
+
115
+ Debug mode adds interactive checkpoints before and after each agent. It is designed for inspecting what the agent will receive, testing alternate specs, reviewing outputs, or replaying one step with adjusted inputs. Any edited input is temporary and does not mutate the pipeline JSON.
116
+
117
+ Full details (agent fields, provider resolution, preflight rules, CLI flags, `$FILES` format, run manifest schema) live in **[docs/reference.md](docs/reference.md)**.
118
+
119
+ ## Security model
120
+
121
+ Singleton enforces a `security_profile` (`read-only`, `restricted-write`, `workspace-write`, `dangerous`) at three layers, in order of trust:
122
+
123
+ 1. **Prompt-level policy** — Singleton injects a `<security_policy>` block in the user message describing `allowed_paths`/`blocked_paths`. Cooperative models honor it on their own. *Not load-bearing*: a jailbreak can bypass it.
124
+ 2. **Runner-native permissions** — Singleton translates the profile into each CLI's native flags before spawning. Best-effort, varies by runner (see matrix below).
125
+ 3. **Post-run snapshot diff** — Singleton snapshots the project filesystem before each step and diffs it after. Any change outside `allowed_paths` (or matching `blocked_paths`) fails the step, regardless of what the agent did. *This is the deterministic guarantee* — it does not depend on the LLM, the runner, or the prompt.
126
+
127
+ | Profile | Claude Code | Codex | Copilot | OpenCode |
128
+ | --- | --- | --- | --- | --- |
129
+ | `read-only` | native (`--disallowedTools Write,Edit,Bash,NotebookEdit`) | native (`--sandbox read-only`) | native (`--deny-tool=write --deny-tool=shell`) | native (`permission.edit=deny`) |
130
+ | `restricted-write` (per `allowed_paths`) | ⚠ no per-path filter → Layer 3 enforces | ⚠ no per-path filter → Layer 3 enforces | ✅ native (`--allow-tool=write(path)`) | ✅ native (`permission.edit` per pattern) |
131
+ | `workspace-write` | native (`--permission-mode acceptEdits`) | native (`--sandbox workspace-write`) | native (`--allow-tool=write`) | native (`permission.edit=allow`) |
132
+ | `dangerous` | bypass (`--permission-mode bypassPermissions`) | bypass (`--sandbox danger-full-access`) | bypass (`--allow-all-tools`) | bypass (`--dangerously-skip-permissions`) |
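As a rough illustration of the Codex column in the matrix above, a profile-to-flag translation could look like the sketch below. The name `sketchCodexSandboxArgs` is hypothetical and this is not the source of `buildCodexSandboxArgs`. The `restricted-write` entry is an assumption: the matrix only says Codex has no per-path filter for that profile, so this sketch falls back to `workspace-write` and leaves enforcement to Layer 3.

```javascript
// read-only, workspace-write and dangerous values are taken from the matrix.
const CODEX_SANDBOX_BY_PROFILE = {
  "read-only": "read-only",
  "restricted-write": "workspace-write", // ASSUMPTION: not stated in this README
  "workspace-write": "workspace-write",
  dangerous: "danger-full-access",
};

function sketchCodexSandboxArgs(profile) {
  // Object.hasOwn avoids matching inherited keys such as "toString".
  if (!Object.hasOwn(CODEX_SANDBOX_BY_PROFILE, profile)) {
    throw new Error(`unknown security_profile: ${profile}`);
  }
  return ["--sandbox", CODEX_SANDBOX_BY_PROFILE[profile]];
}
```

Keeping the mapping in a plain lookup table makes each provider × profile combination easy to unit-test without spawning a CLI.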
133
+
134
+ ⚠ Claude and Codex do not expose per-path write filters in their CLIs. For these runners, the agent *can* write anywhere it has permission to — Singleton's post-run snapshot diff is what fails the step when the write lands outside `allowed_paths`. Layer 3 covers both runners with a deterministic check that does not depend on the agent cooperating.
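The post-run check described above can be sketched as a comparison of two snapshots. This is a simplified model under stated assumptions: snapshots are represented as `Map<relativePath, contentHash>`, and `diffSnapshots` and `findViolations` are hypothetical names. How Singleton actually captures and stores snapshots is not shown in this README.

```javascript
// Compare two snapshots and list every file that was added, modified or deleted.
function diffSnapshots(before, after) {
  const changes = [];
  for (const [file, hash] of after) {
    if (!before.has(file)) changes.push({ file, kind: "added" });
    else if (before.get(file) !== hash) changes.push({ file, kind: "modified" });
  }
  for (const file of before.keys()) {
    if (!after.has(file)) changes.push({ file, kind: "deleted" });
  }
  return changes;
}

// Keep only the changes that fall outside every allowed file or folder.
// Paths are assumed to be normalized, project-relative, and "/"-separated.
function findViolations(changes, allowedPaths) {
  const inside = (file, entry) => file === entry || file.startsWith(entry + "/");
  return changes.filter(({ file }) => !allowedPaths.some((entry) => inside(file, entry)));
}
```

Because the check only looks at what changed on disk, it gives the same answer whichever runner or prompt produced the change. A non-empty result from `findViolations` would correspond to a failed step.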
135
+
136
+ Tests covering Layer 3: see [`packages/cli/src/security/policy.test.js`](packages/cli/src/security/policy.test.js) (`assertWriteAllowed` atomic predicate, including `../` traversal and blocked-path patterns) and [`packages/cli/src/executor.test.js`](packages/cli/src/executor.test.js) (`describe('Layer 3 — post-run snapshot diff …')` end-to-end without any LLM in the loop).
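For readers who want a feel for what such a predicate has to handle, here is a minimal sketch of a path check that rejects `../` traversal and blocked paths. It is not the real `assertWriteAllowed`: the name `isWriteAllowed`, its signature, and the treatment of `blockedPaths` as plain file or folder prefixes (rather than patterns) are all assumptions made for this example.

```javascript
const path = require("node:path");

// Hypothetical helper: returns true only when `target` stays inside the
// project, is not covered by a blocked entry, and is covered by an allowed one.
function isWriteAllowed(projectRoot, target, { allowedPaths = [], blockedPaths = [] } = {}) {
  const root = path.resolve(projectRoot);
  // Resolving first collapses any "../" segments before the comparison.
  const rel = path.relative(root, path.resolve(root, target));

  // Reject the root itself and anything that resolves outside of it.
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (rel === "" || escapes) return false;

  const covers = (entry) => {
    const base = path.normalize(entry).replace(/[\\/]+$/, "");
    return rel === base || rel.startsWith(base + path.sep);
  };

  // Blocked entries win over allowed ones.
  if (blockedPaths.some(covers)) return false;
  return allowedPaths.some(covers);
}
```

Comparing on the resolved, project-relative path is what makes `src/../.env` count as `.env`, so it cannot slip through an `allowed_paths` entry for `src`.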
137
+
138
+ ## Examples
139
+
140
+ The repository ships with runnable example projects:
141
+
142
+ | Example | Providers | Purpose |
143
+ | ------- | --------- | ------- |
144
+ | `examples/claude-code-review` | Claude | Text spec -> scout -> code writer -> reviewer |
145
+ | `examples/codex-code-review` | Codex | Same code workflow, using Codex only |
146
+ | `examples/mixed-code-review` | Claude + Codex | Claude scouts/reviews, Codex edits code |
147
+ | `examples/frontend-audit` | Claude | Read-only frontend audit pipeline |
148
+ | `examples/opencode-review` | OpenCode | Experimental read-only review pipeline |
149
+
150
+ The code-review examples are portable templates: they do not ship a toy source file. When you run them for real, provide a spec, a target file path, and the same file as readable context from your own project.
151
+
152
+ Validate any example without calling an LLM:
153
+
154
+ ```bash
155
+ singleton run --pipeline examples/mixed-code-review/.singleton/pipelines/text-spec-to-code-review-mixed.json --dry-run
156
+ ```
157
+
158
+ ## Project layout
159
+
160
+ Singleton is project-local. In your target repo:
161
+
162
+ ```txt
163
+ my-project/
164
+ .singleton/
165
+ agents/ # your Singleton agents
166
+ pipelines/ # saved pipelines
167
+ runs/ # versioned run artifacts
168
+ .github/
169
+ agents/ # optional repo-level Copilot profiles (*.agent.md)
170
+ ```
171
+
172
+ `.claude/agents/` is also scanned for Singleton-compatible agents. `.github/agents/*.agent.md` is not scanned as Singleton agents; it is only used by Copilot when a Singleton agent sets `runner_agent`.
173
+
174
+ Copilot does not require `.github/agents`. If `runner_agent` is omitted, Copilot uses its default agent. If `runner_agent` is set, Copilot can resolve it from a repo-level profile, a user-level profile, or an organization-level profile. Singleton warns when a repo-level profile is not found, but does not fail preflight.
175
+
176
+ `AGENTS.md` is forwarded to Codex as project context.
177
+
178
+ ## Screenshots
179
+
180
+ | Home | Pipeline run |
181
+ | ----------------------------------------------------- | --------------------------------------------------------- |
182
+ | ![Home](.github/assets/singleton_img_home.png) | ![Run](.github/assets/singleton_img_pipeline.png) |
183
+ | **Help** | **Run summary** |
184
+ | ![Help](.github/assets/singleton_img_help.png) | ![Summary](.github/assets/singleton_img_pipeline_finished.png) |
185
+
186
+ ## Repo
187
+
188
+ ```txt
189
+ packages/cli CLI, REPL, executor, runners
190
+ packages/server builder API
191
+ packages/web builder UI
192
+ docs/ reference documentation
193
+ examples/ official example projects
194
+ ```
195
+
196
+ ## Tests
197
+
198
+ ```bash
199
+ npm test
200
+ ```