arccrew 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (134) hide show
  1. arccrew-0.1.0/.claude/commands/add-agent.md +69 -0
  2. arccrew-0.1.0/.claude/commands/add-api-endpoint.md +11 -0
  3. arccrew-0.1.0/.claude/commands/add-mcp-pipeline.md +11 -0
  4. arccrew-0.1.0/.claude/commands/add-prompt.md +20 -0
  5. arccrew-0.1.0/.claude/commands/add-retry-loop.md +11 -0
  6. arccrew-0.1.0/.claude/commands/add-review-gate.md +11 -0
  7. arccrew-0.1.0/.claude/commands/add-state-field.md +11 -0
  8. arccrew-0.1.0/.claude/commands/add-supervisor.md +11 -0
  9. arccrew-0.1.0/.claude/commands/add-tool.md +26 -0
  10. arccrew-0.1.0/.claude/commands/analyze-release.md +63 -0
  11. arccrew-0.1.0/.claude/commands/build-agents.md +130 -0
  12. arccrew-0.1.0/.claude/commands/configure-claude.md +11 -0
  13. arccrew-0.1.0/.claude/commands/configure-gemini.md +11 -0
  14. arccrew-0.1.0/.claude/commands/configure-openai.md +11 -0
  15. arccrew-0.1.0/.claude/commands/debug-pipeline.md +11 -0
  16. arccrew-0.1.0/.claude/commands/enable-langsmith.md +11 -0
  17. arccrew-0.1.0/.claude/commands/enable-otel.md +70 -0
  18. arccrew-0.1.0/.claude/commands/release-pr.md +96 -0
  19. arccrew-0.1.0/.claude/commands/switch-provider.md +11 -0
  20. arccrew-0.1.0/.claude/commands/write-tests.md +32 -0
  21. arccrew-0.1.0/.env.example +63 -0
  22. arccrew-0.1.0/.github/workflows/publish.yml +36 -0
  23. arccrew-0.1.0/.github/workflows/test.yml +31 -0
  24. arccrew-0.1.0/.gitignore +36 -0
  25. arccrew-0.1.0/.pre-commit-config.yaml +6 -0
  26. arccrew-0.1.0/CHANGELOG.md +47 -0
  27. arccrew-0.1.0/CLAUDE.md +153 -0
  28. arccrew-0.1.0/LICENSE +21 -0
  29. arccrew-0.1.0/PKG-INFO +380 -0
  30. arccrew-0.1.0/README.md +334 -0
  31. arccrew-0.1.0/arccrew/__init__.py +41 -0
  32. arccrew-0.1.0/arccrew/agents/__init__.py +3 -0
  33. arccrew-0.1.0/arccrew/agents/base.py +371 -0
  34. arccrew-0.1.0/arccrew/api/__init__.py +0 -0
  35. arccrew-0.1.0/arccrew/api/app.py +94 -0
  36. arccrew-0.1.0/arccrew/api/deps.py +65 -0
  37. arccrew-0.1.0/arccrew/api/middleware/__init__.py +0 -0
  38. arccrew-0.1.0/arccrew/api/middleware/auth.py +70 -0
  39. arccrew-0.1.0/arccrew/api/middleware/rate_limit.py +61 -0
  40. arccrew-0.1.0/arccrew/api/routes/__init__.py +0 -0
  41. arccrew-0.1.0/arccrew/api/routes/auth.py +49 -0
  42. arccrew-0.1.0/arccrew/api/routes/pipeline.py +233 -0
  43. arccrew-0.1.0/arccrew/api/routes/ws.py +112 -0
  44. arccrew-0.1.0/arccrew/api/schemas.py +86 -0
  45. arccrew-0.1.0/arccrew/cli/__init__.py +0 -0
  46. arccrew-0.1.0/arccrew/cli/main.py +415 -0
  47. arccrew-0.1.0/arccrew/config.py +136 -0
  48. arccrew-0.1.0/arccrew/llm.py +77 -0
  49. arccrew-0.1.0/arccrew/mcp_server.py +247 -0
  50. arccrew-0.1.0/arccrew/models/__init__.py +3 -0
  51. arccrew-0.1.0/arccrew/models/base.py +63 -0
  52. arccrew-0.1.0/arccrew/observability.py +67 -0
  53. arccrew-0.1.0/arccrew/orchestrator/__init__.py +5 -0
  54. arccrew-0.1.0/arccrew/orchestrator/graph.py +195 -0
  55. arccrew-0.1.0/arccrew/orchestrator/prompt_manager.py +158 -0
  56. arccrew-0.1.0/arccrew/orchestrator/state.py +58 -0
  57. arccrew-0.1.0/arccrew/prompts/base.md +54 -0
  58. arccrew-0.1.0/arccrew/templates/project/.env.example +60 -0
  59. arccrew-0.1.0/arccrew/templates/project/.gitignore +20 -0
  60. arccrew-0.1.0/arccrew/templates/project/CLAUDE.md +139 -0
  61. arccrew-0.1.0/arccrew/templates/project/README.md +60 -0
  62. arccrew-0.1.0/arccrew/templates/project/agents/.gitkeep +0 -0
  63. arccrew-0.1.0/arccrew/templates/project/agents/__init__.py +0 -0
  64. arccrew-0.1.0/arccrew/templates/project/pipeline.py +48 -0
  65. arccrew-0.1.0/arccrew/templates/project/prompts/global.md +15 -0
  66. arccrew-0.1.0/arccrew/templates/project/requirements.txt +1 -0
  67. arccrew-0.1.0/arccrew/templates/project/tests/__init__.py +0 -0
  68. arccrew-0.1.0/arccrew/templates/project/tests/test_agents.py +9 -0
  69. arccrew-0.1.0/arccrew/templates/project/tools/__init__.py +0 -0
  70. arccrew-0.1.0/arccrew/templates/project/tools/my_tools.py +49 -0
  71. arccrew-0.1.0/arccrew/templates/skills/add-agent/SKILL.md +279 -0
  72. arccrew-0.1.0/arccrew/templates/skills/add-agent/command.md +69 -0
  73. arccrew-0.1.0/arccrew/templates/skills/add-api-endpoint/SKILL.md +221 -0
  74. arccrew-0.1.0/arccrew/templates/skills/add-api-endpoint/command.md +11 -0
  75. arccrew-0.1.0/arccrew/templates/skills/add-mcp-pipeline/SKILL.md +192 -0
  76. arccrew-0.1.0/arccrew/templates/skills/add-mcp-pipeline/command.md +11 -0
  77. arccrew-0.1.0/arccrew/templates/skills/add-prompt/SKILL.md +214 -0
  78. arccrew-0.1.0/arccrew/templates/skills/add-prompt/command.md +20 -0
  79. arccrew-0.1.0/arccrew/templates/skills/add-retry-loop/SKILL.md +224 -0
  80. arccrew-0.1.0/arccrew/templates/skills/add-retry-loop/command.md +11 -0
  81. arccrew-0.1.0/arccrew/templates/skills/add-review-gate/SKILL.md +232 -0
  82. arccrew-0.1.0/arccrew/templates/skills/add-review-gate/command.md +11 -0
  83. arccrew-0.1.0/arccrew/templates/skills/add-state-field/SKILL.md +257 -0
  84. arccrew-0.1.0/arccrew/templates/skills/add-state-field/command.md +11 -0
  85. arccrew-0.1.0/arccrew/templates/skills/add-supervisor/SKILL.md +196 -0
  86. arccrew-0.1.0/arccrew/templates/skills/add-supervisor/command.md +11 -0
  87. arccrew-0.1.0/arccrew/templates/skills/add-tool/SKILL.md +271 -0
  88. arccrew-0.1.0/arccrew/templates/skills/add-tool/command.md +26 -0
  89. arccrew-0.1.0/arccrew/templates/skills/build-agents/SKILL.md +283 -0
  90. arccrew-0.1.0/arccrew/templates/skills/build-agents/command.md +130 -0
  91. arccrew-0.1.0/arccrew/templates/skills/configure-claude/SKILL.md +175 -0
  92. arccrew-0.1.0/arccrew/templates/skills/configure-claude/command.md +11 -0
  93. arccrew-0.1.0/arccrew/templates/skills/configure-gemini/SKILL.md +129 -0
  94. arccrew-0.1.0/arccrew/templates/skills/configure-gemini/command.md +11 -0
  95. arccrew-0.1.0/arccrew/templates/skills/configure-openai/SKILL.md +170 -0
  96. arccrew-0.1.0/arccrew/templates/skills/configure-openai/command.md +11 -0
  97. arccrew-0.1.0/arccrew/templates/skills/debug-pipeline/SKILL.md +242 -0
  98. arccrew-0.1.0/arccrew/templates/skills/debug-pipeline/command.md +11 -0
  99. arccrew-0.1.0/arccrew/templates/skills/enable-langsmith/SKILL.md +148 -0
  100. arccrew-0.1.0/arccrew/templates/skills/enable-langsmith/command.md +11 -0
  101. arccrew-0.1.0/arccrew/templates/skills/enable-otel/SKILL.md +70 -0
  102. arccrew-0.1.0/arccrew/templates/skills/enable-otel/command.md +70 -0
  103. arccrew-0.1.0/arccrew/templates/skills/switch-provider/SKILL.md +182 -0
  104. arccrew-0.1.0/arccrew/templates/skills/switch-provider/command.md +11 -0
  105. arccrew-0.1.0/arccrew/templates/skills/write-tests/SKILL.md +295 -0
  106. arccrew-0.1.0/arccrew/templates/skills/write-tests/command.md +32 -0
  107. arccrew-0.1.0/arccrew/tools/__init__.py +4 -0
  108. arccrew-0.1.0/arccrew/tools/agent_tools.py +97 -0
  109. arccrew-0.1.0/arccrew/tools/workspace.py +121 -0
  110. arccrew-0.1.0/arccrew/utils/__init__.py +1 -0
  111. arccrew-0.1.0/arccrew/utils/helpers.py +123 -0
  112. arccrew-0.1.0/cli.py +13 -0
  113. arccrew-0.1.0/examples/researcher_writer/__init__.py +1 -0
  114. arccrew-0.1.0/examples/researcher_writer/agents/__init__.py +6 -0
  115. arccrew-0.1.0/examples/researcher_writer/agents/researcher.py +47 -0
  116. arccrew-0.1.0/examples/researcher_writer/agents/writer.py +55 -0
  117. arccrew-0.1.0/examples/researcher_writer/pipeline.py +244 -0
  118. arccrew-0.1.0/examples/researcher_writer/prompts/researcher.md +26 -0
  119. arccrew-0.1.0/examples/researcher_writer/prompts/writer.md +25 -0
  120. arccrew-0.1.0/mcp.json.example +14 -0
  121. arccrew-0.1.0/pyproject.toml +77 -0
  122. arccrew-0.1.0/pytest.ini +8 -0
  123. arccrew-0.1.0/requirements.txt +44 -0
  124. arccrew-0.1.0/ruff.toml +7 -0
  125. arccrew-0.1.0/tests/__init__.py +0 -0
  126. arccrew-0.1.0/tests/test_base_agent.py +302 -0
  127. arccrew-0.1.0/tests/test_cli.py +198 -0
  128. arccrew-0.1.0/tests/test_config.py +146 -0
  129. arccrew-0.1.0/tests/test_graph.py +72 -0
  130. arccrew-0.1.0/tests/test_helpers.py +112 -0
  131. arccrew-0.1.0/tests/test_observability.py +117 -0
  132. arccrew-0.1.0/tests/test_prompt_manager.py +118 -0
  133. arccrew-0.1.0/tests/test_state.py +105 -0
  134. arccrew-0.1.0/tests/test_tools.py +111 -0
@@ -0,0 +1,69 @@
1
+ # Add Agent
2
+
3
+ Add a single new agent to an existing pipeline.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` fully — it shows existing agents, their state contracts, and the current flow.
10
+ Read `skills/add-prompt/SKILL.md` before writing any prompt file.
11
+
12
+ ## Step 1 — Fill gaps before proposing anything
13
+
14
+ Identify what cannot be confidently inferred from the request and `CLAUDE.md`.
15
+ Ask in a single focused message if any of these are unclear:
16
+
17
+ - **Placement**: where in the flow does this agent go? (after which agent, before which?)
18
+ - **Input**: what does it read from state? (which field from `context`, or directly from `tasks`?)
19
+ - **Output**: what should it produce? (fields, format — this becomes the JSON schema)
20
+ - **Volume**: how many items will it process per call? (needed for round estimation)
21
+
22
+ If everything is clear from `CLAUDE.md` and the request, skip this step.
23
+
24
+ ## Step 2 — Show a mini-blueprint and confirm
25
+
26
+ Before writing any code, show:
27
+
28
+ ```
29
+ ## New Agent: {AgentName}
30
+
31
+ **Role:** one sentence
32
+ **Placement:** after {PreviousAgent} → {AgentName} → {NextAgent}
33
+
34
+ **Data contract:**
35
+ Reads: {state field} — e.g. context.research (dict with keys: ...)
36
+ Writes: {state field} — e.g. context.plan (dict with keys: ...)
37
+ (last agent writes to: results)
38
+
39
+ **Tools:** [tool names] or "none (single LLM call)"
40
+ **Est. rounds:** ~N (N items × M tool calls × 2)
41
+ **MAX_ROUNDS needed:** yes/no — AGENTNAME_MAX_ROUNDS=X in .env
42
+
43
+ **Output schema:**
44
+ {"field1": "...", "field2": [...]}
45
+ ```
46
+
47
+ Ask: **"Does this look right?"** Wait for confirmation before writing any file.
48
+
49
+ ## Step 3 — Generate files
50
+
51
+ 1. **`agents/{agent}.py`** — extends `arccrew.BaseAgent`, imports from `arccrew` only
52
+ 2. **`prompts/{agent}.md`** — follow `skills/add-prompt/SKILL.md` rules exactly:
53
+ - Every tool named explicitly with when-to-call instructions
54
+ - Round budget hint if processing a list
55
+ - JSON schema matching the data contract above
56
+ - Closes with: `Output ONLY the JSON object. No explanation, no markdown, no text before or after.`
57
+ 3. **`pipeline.py`** — add `async def` node wrapper, update flow
58
+ 4. **`CLAUDE.md`** — add to "Agents in this project":
59
+ ```
60
+ - `AgentName` — one-line description → prompts/{agent}.md
61
+ ```
62
+
63
+ After creating the agent, show:
64
+ ```
65
+ Prompts active for {agent_name}:
66
+ 1. base.md — arccrew universal rules (do not edit)
67
+ 2. global.md — your project-wide rules (edit prompts/global.md)
68
+ 3. {agent}.md — this agent's role and output schema
69
+ ```
@@ -0,0 +1,11 @@
1
+ # Add API Endpoint
2
+
3
+ Register a pipeline with the FastAPI server to expose it over HTTP.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/add-api-endpoint/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, register the pipeline in `pipeline.py` using `pipeline_registry.register("name", create_pipeline)`, then provide the complete curl commands for calling the pipeline via `POST /api/run` with the appropriate `tasks` structure. Do not edit `arccrew`'s internal API files.
@@ -0,0 +1,11 @@
1
+ # Add MCP Pipeline
2
+
3
+ Register a pipeline as an MCP tool so it can be called from Claude Code, Cursor, or Claude Desktop.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/add-mcp-pipeline/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, register the pipeline in `pipeline.py` using `register_pipeline("name", create_pipeline)` from `arccrew.mcp_server`, provide the MCP client configuration (local stdio or remote URL), and show how to invoke the pipeline using natural language from the MCP client.
@@ -0,0 +1,20 @@
1
+ # Add Prompt
2
+
3
+ Create or update an agent prompt file.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/add-prompt/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, create or update `prompts/{agent}.md` following these rules:
12
+
13
+ - **Role statement** — one sentence: what this agent does and why
14
+ - **Tools** — list every tool the agent has access to by name. For each tool, say exactly when to call it and what argument to pass. Tools not mentioned by name will be silently ignored by the LLM.
15
+ - If a tool must be called for every item in a list: "Call `tool_name` for EVERY item before adding it to your output."
16
+ - **Round budget** — if the agent processes N items, add: "You have ~X rounds. Process all items efficiently — one `tool_name` call per item is enough."
17
+ - **Numbered instructions** — step-by-step, specific, no ambiguity
18
+ - **JSON output schema** — show the exact structure with real key names and example values. Close with the literal line: `Output ONLY the JSON object. No explanation, no markdown, no text before or after.`
19
+
20
+ Do not edit `prompts/global.md` (project-wide rules) or the arccrew base prompt (bundled in the library).
@@ -0,0 +1,11 @@
1
+ # Add Retry Loop
2
+
3
+ Add a verification and retry loop to an existing pipeline.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/add-retry-loop/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, create the verifier agent class and its prompt, update `pipeline.py` to use the `create_pipeline()` factory with `verify_node` and `retry_from` parameters (or add manual conditional edges), and update the worker agent to read `retry_history` for self-correction on subsequent attempts.
@@ -0,0 +1,11 @@
1
+ # Add Review Gate
2
+
3
+ Add a reviewer approval gate that can reject and retry the upstream worker.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/add-review-gate/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, create the reviewer agent using `generate()` (no tools), write its prompt with clear approval criteria and JSON schema, update `pipeline.py` to use `create_pipeline()` with `review_node` and `retry_from` parameters, and ensure the worker reads `retry_history` to address reviewer feedback on retries.
@@ -0,0 +1,11 @@
1
+ # Add State Field
2
+
3
+ Add custom state fields to extend PipelineState for a pipeline.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/add-state-field/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, create or update `my_project/state.py` with the appropriate fields and reducers, update `pipeline.py` to use the new state class in `StateGraph(MyState)`, and update any agent `execute()` methods that need to read or write the new fields.
@@ -0,0 +1,11 @@
1
+ # Add Supervisor
2
+
3
+ Set up the supervisor orchestration pattern for a pipeline.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/add-supervisor/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, create sub-agents using `create_react_agent`, write the supervisor prompt that specifies each agent's role and the ordering rules, assemble the pipeline with `create_supervisor`, and add an API-compatible wrapper if the pipeline needs to be registered with the HTTP API.
@@ -0,0 +1,26 @@
1
+ # Add Tool
2
+
3
+ Add a new `@tool` function that agents can use during their ReAct loop.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/add-tool/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above:
12
+
13
+ 1. **Create the `@tool` function** in `tools/my_tools.py` (or a new file under `tools/`). Write a complete docstring that tells the LLM when and how to call it — the docstring IS the LLM's API contract. Return error strings (prefixed `"ERROR:"`) instead of raising exceptions.
14
+
15
+ 2. **Expose via a getter function** so agents import cleanly (`get_my_tools() -> list`).
16
+
17
+ 3. **Update the agent's `execute()`** to pass the tool to `run_react(tools=[...])`:
18
+ ```python
19
+ tools=get_research_tools() + create_workspace_tools(Path("workspace")) + get_my_tools()
20
+ ```
21
+
22
+ 4. **Update `prompts/{agent}.md`** — this step is mandatory and the most commonly missed. The LLM will silently ignore any tool that is not mentioned by name in the prompt. Add an explicit instruction like:
23
+ ```
24
+ Call `tool_name` to do X. Pass {argument} as the parameter. Use the result to Y.
25
+ ```
26
+ If the tool should be called for every item in a list, say so literally: "Call `tool_name` for EVERY item before adding it to your output."
@@ -0,0 +1,63 @@
1
+ # Analyze Release
2
+
3
+ Analyze commits since the last git tag and recommend the next version bump.
4
+
5
+ ---
6
+
7
+ ## Steps
8
+
9
+ 1. Run: `git describe --tags --abbrev=0` to find the last tag (e.g. `v0.1.0`).
10
+ If no tags exist, use `git log --oneline` from the beginning.
11
+
12
+ 2. Run: `git log {last_tag}..HEAD --oneline` to get all commits since the last tag.
13
+
14
+ 3. Categorize each commit using Conventional Commits:
15
+
16
+ | Prefix | Category | Version impact |
17
+ |---|---|---|
18
+ | `feat!:` or `BREAKING CHANGE:` | Breaking | MAJOR bump |
19
+ | `feat:` | New feature | MINOR bump |
20
+ | `fix:` | Bug fix | PATCH bump |
21
+ | `perf:` | Performance | PATCH bump |
22
+ | `refactor:` | Refactor | no bump |
23
+ | `docs:` | Documentation | no bump |
24
+ | `chore:` | Maintenance | no bump |
25
+ | `test:` | Tests | no bump |
26
+ | `ci:` | CI/CD | no bump |
27
+
28
+ 4. Apply SemVer rules:
29
+ - Any BREAKING CHANGE → MAJOR bump (`x.0.0`)
30
+ - No breaking change, but has `feat:` → MINOR bump (`0.x.0`)
31
+ - No feat, but has `fix:` or `perf:` → PATCH bump (`0.0.x`)
32
+ - Only docs/chore/refactor → no release needed
33
+
34
+ 5. Read `pyproject.toml` to find the current version.
35
+
36
+ 6. Output a clear analysis:
37
+
38
+ ```
39
+ Last tag: v0.1.0
40
+ Current version in pyproject.toml: 0.1.0
41
+
42
+ Commits since v0.1.0:
43
+
44
+ MINOR (feat):
45
+ - feat: add global.md 3-layer prompt support
46
+ - feat: arccrew init CLI command
47
+
48
+ PATCH (fix):
49
+ - fix: PromptManager cache not invalidated on reload
50
+
51
+ No bump (chore/docs/refactor):
52
+ - docs: update README
53
+ - chore: update dependencies
54
+
55
+ Recommendation: v0.2.0 (MINOR — new features, no breaking changes)
56
+
57
+ Reasoning: Two new features warrant a MINOR bump. No breaking API changes detected.
58
+
59
+ Ready to release? Run /release-pr to create the release branch and PR.
60
+ ```
61
+
62
+ 7. Ask the user to confirm or adjust the recommendation before proceeding.
63
+ If the user identifies a breaking change not captured in commits, recalculate.
@@ -0,0 +1,130 @@
1
+ # Build Agents — arccrew
2
+
3
+ Build a complete multi-agent pipeline based on the user's description.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read these files fully before writing any code:
10
+ - `CLAUDE.md` — project structure, existing agents, conventions
11
+ - `skills/build-agents/SKILL.md` — pipeline patterns, state, agent code examples
12
+ - `skills/add-tool/SKILL.md` — how to create `@tool` functions correctly
13
+ - `skills/add-prompt/SKILL.md` — how to write agent prompts that actually work
14
+
15
+ ## Step 1 — Fill gaps before writing the blueprint
16
+
17
+ Read the user's request and identify anything that cannot be confidently inferred.
18
+ Ask ONLY about blocking unknowns — one focused message, not an interview.
19
+
20
+ Common blocking unknowns (ask if unclear):
21
+ - **Input shape**: what does `tasks[0].description` look like? (free text, JSON, specific fields?)
22
+ - **Output shape**: what should the final result contain? (fields, format, who consumes it?)
23
+ - **Volume**: how many items will an agent process per call? (affects round estimation)
24
+ - **Custom tools**: does any step need external data or computation not covered by `web_search` or workspace tools?
25
+
26
+ If everything can be inferred from the description, skip this step and go straight to Step 2.
27
+ Never ask about things that have sensible defaults (model, max_rounds, state reducers).
28
+
29
+ ## Step 2 — Present the blueprint (REQUIRED before any code)
30
+
31
+ Present a **Pipeline Blueprint** for the user to confirm. Show the data contract for every
32
+ agent — what it reads from state and what it writes back. This makes silent data-flow bugs
33
+ visible before a single line of code is written.
34
+
35
+ Estimate ReAct rounds per agent:
36
+ - Each tool call = ~2 rounds (call + result read)
37
+ - Formula: N items × M tool calls per item × 2 = estimated rounds
38
+ - > 10 rounds: recommend `AGENTNAME_MAX_ROUNDS` in `.env`
39
+ - > 20 rounds: suggest splitting the agent
40
+
41
+ ```
42
+ ## Pipeline Blueprint: {Pipeline Name}
43
+
44
+ **What it does:** one sentence
45
+
46
+ **Input:** what tasks[0].description contains (free text / JSON fields / etc.)
47
+
48
+ **Agents ({n} total):**
49
+ | Agent | Role | Reads from state | Tools | Est. rounds | Writes to state |
50
+ |-------|------|-----------------|-------|-------------|-----------------|
51
+ | AgentA | ... | tasks[idx].description | web_search | ~6 | context.research |
52
+ | AgentB | ... | context.research | none (LLM only) | ~2 | context.plan |
53
+ | AgentC | ... | context.plan | validate_post | ~8 | results |
54
+
55
+ **Flow:** AgentA → AgentB → AgentC → __end__
56
+
57
+ **Workload notes:** (round budget warnings, MAX_ROUNDS recommendations)
58
+
59
+ **Custom tools needed:**
60
+ - `tool_name(param: type) -> str` — what it does · file: tools/x.py
61
+ - (none if not needed)
62
+
63
+ **Custom state fields needed:**
64
+ - `field_name: dict` — why · reducer: _take_last_dict
65
+ - (none if not needed)
66
+
67
+ **Expected output** (what results[] contains):
68
+ [{"task_index": 0, "field1": "...", "field2": [...]}]
69
+
70
+ **Example curl:**
71
+ curl -X POST http://localhost:8000/api/run \
72
+ -H "Content-Type: application/json" \
73
+ -d '{"pipeline": "name", "tasks": [{"description": "example matching input shape above"}]}'
74
+ ```
75
+
76
+ Ask: **"Does this look right? I'll generate all the files."**
77
+ Wait for confirmation before proceeding to Step 3.
78
+
79
+ ## Step 3 — Generate all files
80
+
81
+ Apply the rules from the SKILL.md files you read before Step 1:
82
+
83
+ | What to create | Rules from |
84
+ |----------------|------------|
85
+ | `tools/{custom}.py` | `skills/add-tool/SKILL.md` |
86
+ | `prompts/{agent}.md` | `skills/add-prompt/SKILL.md` |
87
+ | `agents/{agent}.py`, `pipeline.py`, `state.py` | `skills/build-agents/SKILL.md` |
88
+
89
+ Files to generate:
90
+ 1. `tools/{custom}.py` — only if custom tools are needed
91
+ 2. `prompts/{agent}.md` — one per agent
92
+ 3. `agents/{agent}.py` — one per agent
93
+ 4. `pipeline.py` — graph wiring + registration (API + MCP)
94
+ 5. `state.py` — only if custom state fields are needed
95
+
96
+ ## Non-negotiable rules
97
+
98
+ - Imports from `arccrew` only — never `src.*`
99
+ - `prompts/base.md` — never create or edit (library file)
100
+ - `prompts/global.md` — create if missing, never overwrite if exists
101
+ - Last agent MUST update `"results": [{"task_index": idx, **data}]` — this is what the API returns
102
+ - `async def` node wrappers in `pipeline.py` — never `lambda s: agent.execute(s)`
103
+ - Register pipeline in `pipeline.py`: both `pipeline_registry` and `register_pipeline`
104
+
105
+ ## After generating
106
+
107
+ Update `CLAUDE.md` "Agents in this project":
108
+ ```
109
+ - `AgentName` — one-line description → prompts/{agent}.md
110
+ ```
111
+
112
+ Show the user:
113
+ ```
114
+ Created:
115
+ tools/{custom}.py (if any)
116
+ agents/{agent1}.py
117
+ agents/{agent2}.py
118
+ prompts/{agent1}.md
119
+ prompts/{agent2}.md
120
+ pipeline.py
121
+
122
+ .env recommendations:
123
+ AGENTNAME_MAX_ROUNDS=25 (if agent has heavy workload)
124
+
125
+ Run:
126
+ arccrew serve
127
+ curl -X POST http://localhost:8000/api/run \
128
+ -H "Content-Type: application/json" \
129
+ -d '{"pipeline": "{name}", "tasks": [{"description": "your task"}]}'
130
+ ```
@@ -0,0 +1,11 @@
1
+ # Configure Claude
2
+
3
+ Configure Anthropic Claude as the LLM provider.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/configure-claude/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, set `ANTHROPIC_API_KEY` and the correct Claude model string in `.env`, configure any per-agent Haiku/Sonnet/Opus overrides in `.env` (e.g. `RESEARCHER_MODEL=anthropic/claude-sonnet-4-6`), and adjust `LLM_MAX_TOKENS` or `LLM_TEMPERATURE` if the request calls for it. No code changes needed — arccrew reads these from the environment automatically.
@@ -0,0 +1,11 @@
1
+ # Configure Gemini
2
+
3
+ Configure Google Gemini models as the LLM provider.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/configure-gemini/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, set `GOOGLE_API_KEY` and the correct Gemini model string in `.env` (e.g. `AGENT_MODEL=google_genai/gemini-2.0-flash`). No code changes needed — `langchain-google-genai` reads `GOOGLE_API_KEY` directly from the environment. Specify which agents benefit from Gemini's large context window.
@@ -0,0 +1,11 @@
1
+ # Configure OpenAI
2
+
3
+ Configure OpenAI models as the LLM provider.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/configure-openai/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, set `OPENAI_API_KEY` and the correct model string in `.env` (e.g. `AGENT_MODEL=openai/gpt-4o`), configure any per-agent overrides, and note that o-series reasoning models require `LLM_TEMPERATURE=1.0`. No code changes needed — arccrew reads these from the environment automatically.
@@ -0,0 +1,11 @@
1
+ # Debug Pipeline
2
+
3
+ Diagnose and fix a broken or misbehaving pipeline.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/debug-pipeline/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, work through the diagnostic steps in order: run `arccrew check`, then run the pipeline with `--debug` flag or add `LOG_LEVEL=DEBUG` to `.env`, identify the failure pattern from the common failures table in the SKILL.md, and apply the appropriate fix to the prompt, tool, graph wiring, or state field.
@@ -0,0 +1,11 @@
1
+ # Enable LangSmith
2
+
3
+ Enable LangSmith tracing for observability into pipeline runs.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/enable-langsmith/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, add `LANGSMITH_API_KEY`, `LANGSMITH_PROJECT`, and `LANGSMITH_TRACING=true` to `.env`. arccrew picks these up automatically on startup — no code changes needed. Show how to use LangSmith traces to compare model performance or diagnose agent failures.
@@ -0,0 +1,70 @@
1
+ # Enable OpenTelemetry — arccrew
2
+
3
+ Set up OpenTelemetry tracing to send pipeline spans to Grafana, Datadog, Jaeger, or any OTLP-compatible backend.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ ## When to use this instead of LangSmith
10
+
11
+ - Client environment cannot send data to external SaaS (compliance, PII, air-gapped)
12
+ - You already have Grafana, Datadog, or Jaeger running
13
+ - You want traces in the same system as your infra metrics
14
+
15
+ Use **LangSmith** (`/enable-langsmith`) when you want a managed UI with LLM-specific features (token counts, cost, prompt diffs).
16
+
17
+ ## Steps
18
+
19
+ 1. Install the OTel exporter:
20
+ ```bash
21
+ pip install arccrew[otel]
22
+ ```
23
+
24
+ 2. Add to `.env`:
25
+ ```bash
26
+ OTEL_ENABLED=true
27
+ OTEL_ENDPOINT=http://localhost:4317 # your OTLP/gRPC endpoint
28
+ OTEL_SERVICE_NAME=my-pipeline
29
+ ```
30
+
31
+ 3. Restart the server — tracing activates automatically:
32
+ ```bash
33
+ arccrew serve
34
+ ```
35
+
36
+ ## What gets traced
37
+
38
+ Every agent execution produces two span types:
39
+ - `agent.run_react` — attributes: `agent.name`, `max_rounds`
40
+ - `agent.reason` — attribute: `agent.name`
41
+
42
+ Spans are nested under whatever root span your OTLP backend provides.
43
+
44
+ ## Verify it works
45
+
46
+ Look for this in the startup logs:
47
+ ```
48
+ [OTel] Exporting traces to http://localhost:4317
49
+ [OTel] Tracing initialized — service: my-pipeline
50
+ ```
51
+
52
+ If you see `[OTel] opentelemetry-sdk not installed` instead, run `pip install arccrew[otel]`.
53
+
54
+ ## Common backends
55
+
56
+ | Backend | OTLP endpoint (default) |
57
+ |---|---|
58
+ | Grafana + Tempo | `http://localhost:4317` |
59
+ | Jaeger | `http://localhost:4317` |
60
+ | Datadog Agent | `http://localhost:4317` |
61
+ | Honeycomb | `https://api.honeycomb.io:443` (with API key header) |
62
+
63
+ For backends that require auth headers, configure them at the collector level — arccrew sends plain OTLP.
64
+
65
+ ## Disable tracing
66
+
67
+ ```bash
68
+ # .env
69
+ OTEL_ENABLED=false
70
+ ```
@@ -0,0 +1,96 @@
1
+ # Release PR
2
+
3
+ Create a release branch, bump the version, update CHANGELOG, and open a GitHub PR.
4
+
5
+ **Version to release:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ > Run `/analyze-release` first if you haven't determined the version yet.
10
+
11
+ ## Steps
12
+
13
+ ### 1. Validate input
14
+
15
+ If $ARGUMENTS is empty, ask: "What version should this release be? (e.g. 0.2.0)"
16
+ Strip any `v` prefix from the input — work with bare semver (e.g. `0.2.0`).
17
+
18
+ ### 2. Get commits for changelog
19
+
20
+ Run: `git describe --tags --abbrev=0` to find the last tag.
21
+ Run: `git log {last_tag}..HEAD --oneline` to get commits to include.
22
+
23
+ Categorize commits exactly as in `/analyze-release`.
24
+
25
+ ### 3. Create release branch
26
+
27
+ ```bash
28
+ git checkout -b release/v{version}
29
+ ```
30
+
31
+ ### 4. Bump version in pyproject.toml
32
+
33
+ Edit `pyproject.toml` — update `version = "..."` to the new version.
34
+
35
+ ### 5. Update CHANGELOG.md
36
+
37
+ If `CHANGELOG.md` doesn't exist, create it with this header:
38
+ ```markdown
39
+ # Changelog
40
+
41
+ All notable changes to arccrew are documented here.
42
+ Format follows [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
43
+ ```
44
+
45
+ Prepend the new release block (after the header):
46
+
47
+ ```markdown
48
+ ## [0.2.0] — 2026-03-25
49
+
50
+ ### Added
51
+ - feat: add global.md 3-layer prompt support
52
+ - feat: arccrew init CLI command
53
+
54
+ ### Fixed
55
+ - fix: PromptManager cache not invalidated on reload
56
+ ```
57
+
58
+ Only include categories that have commits. Omit empty sections.
59
+
60
+ ### 6. Commit the changes
61
+
62
+ ```bash
63
+ git add pyproject.toml CHANGELOG.md
64
+ git commit -m "chore: release v{version}"
65
+ ```
66
+
67
+ ### 7. Push and create GitHub PR
68
+
69
+ Push the branch and create a PR via the GitHub MCP tool with:
70
+ - **Title:** `Release v{version}`
71
+ - **Base branch:** `main`
72
+ - **Body:** the full changelog block for this release, plus:
73
+
74
+ ```
75
+ ## Release checklist
76
+ - [ ] Version bumped in pyproject.toml
77
+ - [ ] CHANGELOG updated
78
+ - [ ] All tests passing
79
+
80
+ ## After merging
81
+ The CI/CD workflow will automatically publish v{version} to PyPI.
82
+ ```
83
+
84
+ ### 8. Show the user
85
+
86
+ ```
87
+ Release PR created: {PR_URL}
88
+
89
+ After merging:
90
+ - GitHub Actions will tag the commit and publish to PyPI automatically
91
+ - Tag: v{version}
92
+
93
+ To publish manually instead:
94
+ git tag v{version}
95
+ git push origin v{version}
96
+ ```
@@ -0,0 +1,11 @@
1
+ # Switch Provider
2
+
3
+ Switch the LLM provider for this project.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/switch-provider/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, update `AGENT_MODEL` in `.env` with the correct model string for the target provider, install the required provider package if needed, specify any per-agent overrides, and verify the API key for that provider is set. No code changes needed — arccrew resolves the provider from the model string automatically.
@@ -0,0 +1,32 @@
1
+ # Write Tests
2
+
3
+ Write tests for agents, tools, or pipelines.
4
+
5
+ **User request:** $ARGUMENTS
6
+
7
+ ---
8
+
9
+ Read `CLAUDE.md` and `skills/write-tests/SKILL.md` fully before doing anything.
10
+
11
+ Based on the description above, create the appropriate test file(s) in `tests/` following the existing patterns in `test_base_agent.py` and `test_tools.py`.
12
+
13
+ **For `@tool` functions** — test directly without mocking (they are pure functions or async):
14
+ - Happy path with valid input
15
+ - Error path: what does it return when the input is bad or the external call fails? (must return `"ERROR: ..."` string, never raise)
16
+ - Edge cases: empty string, None, very long input, unexpected types
17
+
18
+ **For agent logic** — mock the LLM, test the agent's state handling:
19
+ - Does `execute()` read the right keys from state?
20
+ - Does it return a `Command` with the expected `goto` and `update` keys?
21
+ - Does `extract_json` fallback work when the LLM returns prose? Test with `{"raw": result}` fallback.
22
+ - Is `"results"` populated on the last agent? (`results: []` in the API response is a common failure)
23
+
24
+ **For the pipeline end-to-end** — use a real but cheap model (haiku) with a small input:
25
+ - Use N=1 or N=2 items, never the full production N — this catches round exhaustion early
26
+ - Assert that `response["results"]` is non-empty
27
+ - Assert that the JSON keys you expect actually exist in the result
28
+
29
+ **Specific cases to always cover:**
30
+ - `extract_json` on valid JSON, JSON inside markdown fences, and plain text (should return `{}`)
31
+ - Tool returning `"ERROR: ..."` string — does the agent handle it gracefully and not crash?
32
+ - State field missing from input — does the agent default safely or raise a clear error?