PyPI - spec-runner - Versions diffs - 2.0.0__tar.gz → 2.2.0__tar.gz - Mend

spec-runner 2.0.0tar.gz → 2.2.0tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (93) hide show

spec_runner-2.2.0/PKG-INFO ADDED Viewed

@@ -0,0 +1,435 @@
+Metadata-Version: 2.4
+Name: spec-runner
+Version: 2.2.0
+Summary: Task automation from markdown specs via Claude CLI
+Author: Andrei
+License-Expression: MIT
+Project-URL: Homepage, https://github.com/andrei-shtanakov/spec-runner
+Project-URL: Repository, https://github.com/andrei-shtanakov/spec-runner
+Keywords: automation,tasks,claude,cli,specs,markdown
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Environment :: Console
+Classifier: Intended Audience :: Developers
+Classifier: Operating System :: POSIX :: Linux
+Classifier: Operating System :: MacOS
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: Software Development :: Build Tools
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: mcp>=1.26.0
+Requires-Dist: PyYAML>=6.0
+Requires-Dist: structlog>=24.0
+Requires-Dist: textual>=1.0
+Requires-Dist: ulid-py>=1.1.0
+Provides-Extra: dev
+Requires-Dist: pytest>=7.0; extra == "dev"
+Requires-Dist: pytest-cov>=4.0; extra == "dev"
+Requires-Dist: ruff>=0.1.0; extra == "dev"
+Requires-Dist: mypy>=1.0; extra == "dev"
+Requires-Dist: types-PyYAML; extra == "dev"
+Dynamic: license-file
+# spec-runner
+Task automation from markdown specs via Claude CLI. Execute tasks from a structured `tasks.md` file with automatic retries, 5-role code review, Git integration, compliance verification, traceability reporting, and live TUI dashboard.
+## Installation
+```bash
+uv add spec-runner
+```
+Or for development:
+```bash
+uv sync
+```
+Requirements:
+- Python 3.10+
+- Claude CLI (`claude` command available)
+- Git (for branch management)
+- `gh` CLI (optional, for GitHub Issues sync)
+## Quick Start
+```bash
+# Install Claude Code skills (creates .claude/skills in current project)
+spec-runner-init
+# Execute next ready task
+spec-runner run
+# Execute specific task
+spec-runner run --task=TASK-001
+# Execute all ready tasks
+spec-runner run --all
+# Execute with live TUI dashboard
+spec-runner run --all --tui
+# Create tasks interactively
+spec-runner plan "add user authentication"
+# Watch mode — continuously execute ready tasks
+spec-runner watch
+```
+## Features
+- **Task-based execution** — reads tasks from `spec/tasks.md` with priorities, checklists, and dependencies
+- **Specification traceability** — links tasks to requirements (REQ-XXX) and design (DESIGN-XXX)
+- **Automatic retries** — configurable retry policy with exponential backoff and error context forwarding
+- **Code review** — multi-agent review after task completion with enriched diff context
+- **Git integration** — automatic branch creation, commits, and merges
+- **TUI dashboard** — live Textual-based terminal UI with progress bars and log panel
+- **Cost tracking** — per-task token usage and cost breakdown
+- **Watch mode** — continuously poll and execute ready tasks
+- **Plugin system** — extend with custom hooks via `spec/plugins/*/plugin.yaml`
+- **MCP server** — Model Context Protocol server for Claude Code integration (read + write operations)
+- **GitHub Issues sync** — bidirectional sync between tasks.md and GitHub Issues
+- **Interactive planning** — generate specs (requirements + design + tasks) through dialogue with Claude
+- **Structured logging** — JSON/console output via structlog
+- **SQLite state** — persistent execution state with WAL mode, auto-migration from legacy JSON
+- **HITL review** — optional human-in-the-loop approval gate after code review
+- **Parallel review** — 5 specialized review agents (quality, implementation, testing, simplification, docs) running concurrently
+- **Agent personas** — role-specific prompt templates and model selection (architect, implementer, reviewer)
+- **Constitution guardrails** — inviolable project rules from `spec/constitution.md` injected into every prompt
+- **Telegram / webhook notifications** — alerts on task failure, run completion, and degraded-mode persistence failures (Telegram Bot API + generic webhook)
+- **Degraded-mode resilience** — SQLite write failures (disk-full, DB corruption) are caught, the run continues in memory, and operators are notified once
+- **Compliance audit trail** — opt-in JSON-Lines log of every task lifecycle event (started, attempt, completed/failed, state_degraded, run start/end) with operator + run-id attribution
+- **Pause/resume** — pause mid-run with Ctrl+\, edit tasks, resume; TUI keybinding `p`
+- **Streaming events** — live stdout streaming from Claude CLI to TUI via EventBus
+- **Session/idle timeouts** — automatic stop after configurable session or idle duration
+## Task File Format
+Tasks are defined in `spec/tasks.md`:
+```markdown
+## Milestone 1: MVP
+### TASK-001: Implement user login
+🔴 P0 | ⬜ TODO | Est: 2d
+**Checklist:**
+- [ ] Create login endpoint
+- [ ] Add JWT token generation
+- [ ] Write unit tests
+**Traces to:** [REQ-001], [DESIGN-001]
+**Depends on:** —
+**Blocks:** [TASK-002], [TASK-003]
+```
+## CLI Commands
+### spec-runner
+```bash
+# Execution
+spec-runner run                            # Execute next ready task
+spec-runner run --task=TASK-001            # Execute specific task
+spec-runner run --all                      # Execute all ready tasks
+spec-runner run --all --hitl-review        # Interactive HITL approval gate
+spec-runner run --force                    # Skip lock check (stale lock)
+spec-runner run --tui                      # Execute with live TUI dashboard
+spec-runner run --dry-run                  # Show what would execute (JSON)
+spec-runner run --json-result              # Structured JSON output (Maestro interop)
+spec-runner run --budget=10.0              # Set global budget in USD
+spec-runner run --log-level=DEBUG          # Set log verbosity
+spec-runner run --log-json                 # Output logs as JSON
+# Monitoring
+spec-runner status                         # Show execution status
+spec-runner status --json                  # JSON status output
+spec-runner costs                          # Cost breakdown per task
+spec-runner costs --json                   # JSON output for automation
+spec-runner costs --sort=cost              # Sort by cost descending
+spec-runner logs TASK-001                  # View task logs
+# Operations
+spec-runner retry TASK-001                 # Retry failed task
+spec-runner reset                          # Reset state
+spec-runner watch                          # Continuously execute ready tasks
+spec-runner watch --tui                    # Watch with live TUI dashboard
+spec-runner tui                            # Launch TUI status dashboard
+spec-runner validate                       # Validate config and tasks
+# Verification & Reporting (v2.0)
+spec-runner audit                          # Static pre-execution spec check
+spec-runner audit --strict                 # Fail on warnings (orphans, uncovered)
+spec-runner audit --json                   # JSON findings output (for CI)
+spec-runner audit --csv                    # CSV for spreadsheet review
+spec-runner verify                         # Verify post-execution compliance
+spec-runner verify --task=TASK-001         # Verify specific task
+spec-runner verify --json                  # JSON compliance output
+spec-runner verify --strict                # Fail on warnings too
+spec-runner report                         # Generate traceability matrix
+spec-runner report --milestone=mvp         # Filter by milestone
+spec-runner report --uncovered-only        # Show only uncovered requirements
+spec-runner report --json                  # JSON matrix output
+# Planning
+spec-runner plan "description"             # Interactive task planning
+spec-runner plan --full "description"      # Generate full spec (requirements + design + tasks)
+# Integration
+spec-runner mcp                            # Launch MCP server (stdio)
+```
+### Task Management (unified in v2.0)
+```bash
+# Task commands (use `spec-runner task` instead of deprecated `spec-task`)
+spec-runner task list                      # List all tasks
+spec-runner task list --status=todo        # Filter by status
+spec-runner task list --priority=p0        # Filter by priority
+spec-runner task list --milestone=mvp      # Filter by milestone
+spec-runner task show TASK-001             # Task details
+spec-runner task start TASK-001            # Mark as in_progress
+spec-runner task done TASK-001             # Mark as done
+spec-runner task block TASK-001            # Mark as blocked
+spec-runner task check TASK-001 2          # Mark checklist item
+spec-runner task stats                     # Statistics
+spec-runner task next                      # Show next ready tasks
+spec-runner task graph                     # ASCII dependency graph
+# GitHub Issues
+spec-runner task export-gh                 # Export to GitHub Issues format
+spec-runner task sync-to-gh                # Sync tasks -> GitHub Issues
+spec-runner task sync-to-gh --dry-run      # Preview without making changes
+spec-runner task sync-from-gh              # Sync GitHub Issues -> tasks.md
+```
+### spec-runner-init
+```bash
+spec-runner-init                           # Install skills to ./.claude/skills
+spec-runner-init --force                   # Overwrite existing skills
+spec-runner-init /path/to/project          # Install to specific project
+```
+### Multi-phase Options
+`--spec-prefix` namespaces tasks, state, logs, and history for phase-based workflows:
+```bash
+spec-runner run --spec-prefix=phase5-          # Uses spec/phase5-tasks.md
+spec-runner task list --spec-prefix=phase5-    # List phase 5 tasks
+```
+Phase-scoped paths: `spec/phase5-{tasks,requirements,design}.md`, `spec/.executor-phase5-state.db`, `spec/.executor-phase5-logs/`, `spec/.phase5-task-history.log`. Multiple phases coexist without state bleed.
+## Usage as Library
+```python
+from spec_runner import Task, ExecutorConfig, parse_tasks, get_next_tasks
+from pathlib import Path
+tasks = parse_tasks(Path("spec/tasks.md"))
+ready = get_next_tasks(tasks)
+for task in ready:
+    print(f"{task.id}: {task.name} ({task.priority})")
+```
+## MCP Server (Claude Code Integration)
+spec-runner includes an MCP server for querying status and executing tasks from Claude Code.
+Add to `.mcp.json`:
+```json
+{
+  "mcpServers": {
+    "spec-runner": {
+      "command": "spec-runner",
+      "args": ["mcp"]
+    }
+  }
+}
+```
+Available tools:
+| Tool | Kind | Effect |
+|---|---|---|
+| `spec_runner_status` | read | Returns aggregate status (completed/failed/running, cost, tokens) |
+| `spec_runner_tasks` | read | Lists tasks with id/name/priority/status/deps |
+| `spec_runner_next_tasks` | read | Lists ready-to-run tasks |
+| `spec_runner_task_detail` | read | Returns per-task checklist, attempts, last review, cost |
+| `spec_runner_costs` | read | Per-task cost/token breakdown |
+| `spec_runner_logs` | read | Tail of a task's execution log |
+| `spec_runner_run_task` | **write** | **Spawns a subprocess that runs Claude CLI** against the workspace. Can modify files, create git branches, run hooks (tests/lint/commit) |
+| `spec_runner_stop` | **write** | Writes a stop-file that asks a running executor to shut down gracefully |
+### Security model
+**Authentication.** The MCP server has **no built-in authentication**. It uses `stdio` transport and inherits the trust boundary of whatever started it (typically your terminal or Claude Code). Whoever can run the server can call any of its tools.
+**Safe deployment patterns:**
+- ✅ **Local stdio only** (default). Run via `spec-runner mcp` from `.mcp.json` on a single developer machine. Same trust boundary as your shell.
+- ✅ **Claude Code inside your own workspace.** The MCP server operates on the workspace it's invoked in; tools like `spec_runner_run_task` will modify files in that workspace.
+- ❌ **Do NOT expose over TCP, HTTP, or a shared socket** without adding authentication and audit logging — the write tools execute subprocesses that run Claude CLI with full filesystem access.
+- ❌ **Do NOT run under a shared service account** that multiple users or agents share. There is no per-caller identity, so audit logs cannot attribute actions.
+**Write-tool blast radius.** `spec_runner_run_task` does not sandbox execution: the spawned `spec-runner run --task TASK-XXX` can:
+- edit any file in the project root
+- create git branches and auto-commit (if `hooks.post_done.auto_commit: true`)
+- run tests, linters, and any configured hook command
+- spend budget (Claude API cost) up to `budget_usd` / `task_budget_usd`
+Treat the MCP server as equivalent to giving the caller shell access to the workspace.
+**Hardening options** (if you need tighter limits):
+- Run in a disposable container or Maestro worktree so writes are isolated
+- Set `budget_usd` low to cap accidental cost spend
+- Disable `hooks.post_done.auto_commit` if you want manual review before commits
+- Restrict `commands.test`/`commands.lint` to safe allow-listed shell commands — they run verbatim
+See also: `docs/state-schema.md` for the read contract, and `src/spec_runner/mcp_server.py` for tool implementations.
+## Configuration
+Configuration file: `spec-runner.config.yaml` (project root, v2.0)
+Legacy location `spec/executor.config.yaml` is still supported with a deprecation warning.
+v2.0 flat format (no `executor:` wrapper):
+```yaml
+max_retries: 3
+task_timeout_minutes: 30
+claude_command: "claude"
+claude_model: "sonnet"
+spec_prefix: ""                # e.g. "phase5-" for phase5-tasks.md
+budget_usd: 50.0               # Total budget cap (whole run)
+task_budget_usd: 10.0          # Per-task cap incl. first attempt
+max_retry_cost_usd: 2.0        # Cap on retry cost only (attempts 2+)
+# Telegram notifications (optional)
+telegram_bot_token: ""         # Bot token from @BotFather
+telegram_chat_id: ""           # Chat ID to send notifications to
+notify_on: [run_complete, task_failed, state_degraded]
+# Generic webhook (optional — works with Slack, Discord, ntfy.sh, etc.)
+webhook_url: ""                # Webhook URL (empty = disabled)
+webhook_template: '{"text": "{{event}}: {{message}}"}'
+# Compliance audit trail (optional — JSON Lines, opt-in)
+audit_log_path: ""             # e.g. "spec/.executor-audit.jsonl"; empty = disabled
+audit_log_operator: ""         # Override the auto-detected "user@host" tag
+# Agent personas (optional)
+personas:
+  implementer:
+    system_prompt: "You are a focused Python developer"
+    model: "sonnet"
+  reviewer:
+    system_prompt: "You are a senior code reviewer"
+    model: "haiku"
+hooks:
+  pre_start:
+    create_git_branch: true
+  post_done:
+    run_tests: true
+    run_lint: true
+    auto_commit: true
+    run_review: true
+    review_parallel: false     # Run 5 review agents in parallel
+    review_roles: [quality, implementation, testing]
+commands:
+  test: "uv run pytest tests/ -v"
+  lint: "uv run ruff check ."
+paths:
+  root: "."
+  logs: "spec/.executor-logs"
+```
+### Git Branch Workflow
+1. **Branch detection**: Auto-detects `main` or `master`, or use `main_branch` config
+2. **Task branches**: Creates `task/TASK-001-short-name` branches for each task
+3. **Auto-merge**: Merges task branch to main after completion
+### Supported CLIs
+| CLI | Auto-detected | Example template |
+|-----|--------------|------------------|
+| Claude | Yes | `{cmd} -p {prompt} --model {model}` |
+| Codex | Yes | `{cmd} -p {prompt} --model {model}` |
+| OpenCode ([sst/opencode](https://opencode.ai)) | Yes | `{cmd} run --model {model} {prompt}` |
+| Pi Agent ([pi.dev](https://pi.dev)) | Yes (basename match) | `{cmd} -p --model {model} {prompt}` |
+| Ollama | Yes | `{cmd} run {model} {prompt}` |
+| llama-cli | Yes | `{cmd} -m {model} -p {prompt} --no-display-prompt` |
+| Custom | Use template | `{cmd} --prompt {prompt}` |
+## Project Structure
+```
+project/
+├── pyproject.toml
+├── spec-runner.config.yaml      # v2.0 config location
+├── Makefile
+├── .pre-commit-config.yaml
+├── src/
+│   └── spec_runner/
+│       ├── __init__.py
+│       ├── executor.py          # Re-exports (backward compat)
+│       ├── cli.py               # Main CLI dispatcher, cmd_run, cmd_watch
+│       ├── cli_info.py          # Status, costs, logs, validate, verify, report, TUI, MCP
+│       ├── cli_plan.py          # Interactive planning command
+│       ├── execution.py         # Task execution + retry logic
+│       ├── config.py            # ExecutorConfig + YAML loading
+│       ├── state.py             # SQLite state persistence + degraded-mode fallback
+│       ├── prompt.py            # Prompt building + templates
+│       ├── hooks.py             # Pre/post hook orchestration
+│       ├── git_ops.py           # Git branch/commit/merge operations
+│       ├── review.py            # 5-role code review + HITL gate
+│       ├── runner.py            # Subprocess execution + event streaming
+│       ├── task.py              # Task parsing + dependency resolution
+│       ├── task_commands.py     # Task CLI commands (list, show, start, etc.)
+│       ├── github_sync.py       # GitHub Issues sync (to/from)
+│       ├── audit.py             # Pre-execution static audit (LABS-37)
+│       ├── audit_log.py         # JSON Lines compliance audit trail (LABS-40)
+│       ├── verify.py            # Post-execution compliance verification
+│       ├── report.py            # Traceability matrix generation
+│       ├── validate.py          # Config + task validation
+│       ├── plugins.py           # Plugin discovery + hooks
+│       ├── logging.py           # Structured logging (structlog)
+│       ├── events.py            # EventBus for streaming to TUI
+│       ├── notifications.py     # Telegram + webhook notifications
+│       ├── tui.py               # Textual TUI dashboard
+│       ├── mcp_server.py        # MCP server (FastMCP, stdio)
+│       ├── init_cmd.py          # Skill installer
+│       └── skills/
+│           └── spec-generator-skill/
+├── docs/
+│   └── state-schema.md          # Maestro interop contract (SQLite + --json-result)
+├── schemas/
+│   ├── executor-state.schema.json   # JSON Schema for .executor-state.db contents
+│   └── json-result.schema.json      # JSON Schema for `run --json-result` stdout
+├── tests/
+│   └── fixtures/maestro-interop/    # Golden fixtures copied by Maestro contract tests
+└── spec/
+    ├── tasks.md
+    ├── requirements.md
+    ├── design.md
+    ├── FORMAT.md                # Task format specification
+    └── plugins/                 # Optional: per-plugin subdirectories with plugin.yaml
+```
+## License
+MIT

spec-runner 2.0.0__tar.gz → 2.2.0__tar.gz

spec-runner 2.0.0tar.gz → 2.2.0tar.gz