hatch3r 1.7.1 → 1.8.0
- package/README.md +38 -12
- package/agents/hatch3r-a11y-auditor.md +4 -0
- package/agents/hatch3r-architect.md +4 -0
- package/agents/hatch3r-ci-watcher.md +4 -0
- package/agents/hatch3r-context-rules.md +26 -6
- package/agents/hatch3r-creator.md +6 -1
- package/agents/hatch3r-dependency-auditor.md +4 -0
- package/agents/hatch3r-devops.md +4 -0
- package/agents/hatch3r-docs-writer.md +4 -0
- package/agents/hatch3r-fixer.md +4 -0
- package/agents/hatch3r-handoff-loader.md +243 -0
- package/agents/hatch3r-handoff-preparer.md +134 -0
- package/agents/hatch3r-implementer.md +12 -0
- package/agents/hatch3r-learnings-loader.md +5 -1
- package/agents/hatch3r-lint-fixer.md +4 -0
- package/agents/hatch3r-perf-profiler.md +8 -0
- package/agents/hatch3r-researcher.md +4 -0
- package/agents/hatch3r-reviewer.md +94 -0
- package/agents/hatch3r-security-auditor.md +24 -0
- package/agents/hatch3r-test-writer.md +4 -0
- package/agents/modes/requirements-elicitation.md +4 -1
- package/agents/modes/similar-implementation.md +6 -0
- package/agents/modes/user-flows.md +76 -0
- package/agents/shared/quality-charter.md +128 -0
- package/agents/shared/user-content-templates.md +31 -1
- package/commands/hatch3r-agent-customize.md +4 -0
- package/commands/hatch3r-api-spec.md +7 -0
- package/commands/hatch3r-benchmark.md +7 -0
- package/commands/hatch3r-board-fill.md +8 -0
- package/commands/hatch3r-board-groom.md +4 -0
- package/commands/hatch3r-board-init.md +51 -0
- package/commands/hatch3r-board-pickup.md +8 -0
- package/commands/hatch3r-board-refresh.md +4 -0
- package/commands/hatch3r-board-shared.md +6 -6
- package/commands/hatch3r-bug-plan.md +7 -0
- package/commands/hatch3r-codebase-map.md +8 -0
- package/commands/hatch3r-command-customize.md +4 -0
- package/commands/hatch3r-context-health.md +5 -0
- package/commands/hatch3r-create.md +59 -4
- package/commands/hatch3r-debug.md +7 -0
- package/commands/hatch3r-dep-audit.md +4 -0
- package/commands/hatch3r-feature-plan.md +7 -0
- package/commands/hatch3r-handoff.md +133 -0
- package/commands/hatch3r-healthcheck.md +4 -0
- package/commands/hatch3r-hooks.md +4 -0
- package/commands/hatch3r-learn.md +16 -0
- package/commands/hatch3r-migration-plan.md +7 -0
- package/commands/hatch3r-onboard.md +7 -0
- package/commands/hatch3r-pr-resolve.md +12 -1
- package/commands/hatch3r-project-spec.md +8 -0
- package/commands/hatch3r-quick-change.md +11 -2
- package/commands/hatch3r-recipe.md +4 -0
- package/commands/hatch3r-refactor-plan.md +7 -0
- package/commands/hatch3r-release.md +5 -0
- package/commands/hatch3r-revision.md +7 -0
- package/commands/hatch3r-roadmap.md +8 -0
- package/commands/hatch3r-rule-customize.md +4 -0
- package/commands/hatch3r-security-audit.md +4 -0
- package/commands/hatch3r-skill-customize.md +4 -0
- package/commands/hatch3r-test-plan.md +7 -0
- package/commands/hatch3r-workflow.md +11 -1
- package/dist/cli/index.js +4814 -1130
- package/dist/cli/index.js.map +1 -1
- package/package.json +10 -5
- package/rules/hatch3r-accessibility-standards.md +21 -0
- package/rules/hatch3r-accessibility-standards.mdc +21 -0
- package/rules/hatch3r-agent-orchestration-detail.md +3 -0
- package/rules/hatch3r-agent-orchestration-detail.mdc +3 -0
- package/rules/hatch3r-agent-orchestration.md +34 -3
- package/rules/hatch3r-agent-orchestration.mdc +34 -3
- package/rules/hatch3r-ai-evals.md +158 -0
- package/rules/hatch3r-ai-evals.mdc +154 -0
- package/rules/hatch3r-ai-ux-patterns.md +131 -0
- package/rules/hatch3r-ai-ux-patterns.mdc +127 -0
- package/rules/hatch3r-api-design.md +67 -9
- package/rules/hatch3r-api-design.mdc +67 -9
- package/rules/hatch3r-api-versioning.md +119 -0
- package/rules/hatch3r-api-versioning.mdc +115 -0
- package/rules/hatch3r-auth-patterns.md +170 -0
- package/rules/hatch3r-auth-patterns.mdc +166 -0
- package/rules/hatch3r-component-conventions.md +30 -0
- package/rules/hatch3r-component-conventions.mdc +30 -0
- package/rules/hatch3r-container-hardening.md +131 -0
- package/rules/hatch3r-container-hardening.mdc +127 -0
- package/rules/hatch3r-contract-testing.md +117 -0
- package/rules/hatch3r-contract-testing.mdc +113 -0
- package/rules/hatch3r-deep-context.md +2 -0
- package/rules/hatch3r-deep-context.mdc +2 -0
- package/rules/hatch3r-dependency-management.md +73 -1
- package/rules/hatch3r-dependency-management.mdc +72 -0
- package/rules/hatch3r-design-system-detection.md +142 -0
- package/rules/hatch3r-design-system-detection.mdc +138 -0
- package/rules/hatch3r-event-schema-evolution.md +90 -0
- package/rules/hatch3r-event-schema-evolution.mdc +86 -0
- package/rules/hatch3r-handoff-readiness.md +45 -0
- package/rules/hatch3r-handoff-readiness.mdc +40 -0
- package/rules/hatch3r-i18n.md +13 -0
- package/rules/hatch3r-i18n.mdc +13 -0
- package/rules/hatch3r-iteration-summary.md +2 -0
- package/rules/hatch3r-iteration-summary.mdc +2 -0
- package/rules/hatch3r-migrations.md +61 -16
- package/rules/hatch3r-migrations.mdc +61 -16
- package/rules/hatch3r-observability-logging.md +1 -1
- package/rules/hatch3r-observability-logging.mdc +1 -1
- package/rules/hatch3r-observability-metrics.md +1 -1
- package/rules/hatch3r-observability-metrics.mdc +1 -1
- package/rules/hatch3r-observability-tracing-detail.md +8 -149
- package/rules/hatch3r-observability-tracing-detail.mdc +7 -149
- package/rules/hatch3r-observability-tracing.md +154 -6
- package/rules/hatch3r-observability-tracing.mdc +154 -6
- package/rules/hatch3r-observability.md +1 -0
- package/rules/hatch3r-observability.mdc +1 -0
- package/rules/hatch3r-operability.md +149 -0
- package/rules/hatch3r-operability.mdc +145 -0
- package/rules/hatch3r-passkey-server.md +181 -0
- package/rules/hatch3r-passkey-server.mdc +177 -0
- package/rules/hatch3r-progressive-delivery.md +120 -0
- package/rules/hatch3r-progressive-delivery.mdc +116 -0
- package/rules/hatch3r-resilience-patterns.md +154 -0
- package/rules/hatch3r-resilience-patterns.mdc +150 -0
- package/rules/hatch3r-secrets-management.md +29 -0
- package/rules/hatch3r-secrets-management.mdc +29 -0
- package/rules/hatch3r-testing.md +139 -43
- package/rules/hatch3r-testing.mdc +139 -43
- package/rules/hatch3r-ux-states-and-flows.md +149 -0
- package/rules/hatch3r-ux-states-and-flows.mdc +145 -0
- package/skills/hatch3r-a11y-audit/SKILL.md +14 -0
- package/skills/hatch3r-agent-customize/SKILL.md +10 -0
- package/skills/hatch3r-ai-feature/SKILL.md +136 -0
- package/skills/hatch3r-api-spec/SKILL.md +73 -0
- package/skills/hatch3r-architecture-review/SKILL.md +14 -0
- package/skills/hatch3r-bug-fix/SKILL.md +5 -0
- package/skills/hatch3r-ci-pipeline/SKILL.md +14 -0
- package/skills/hatch3r-cli-aichat/SKILL.md +84 -0
- package/skills/hatch3r-cli-ast-grep/SKILL.md +85 -0
- package/skills/hatch3r-cli-az-devops/SKILL.md +89 -0
- package/skills/hatch3r-cli-bat/SKILL.md +85 -0
- package/skills/hatch3r-cli-comby/SKILL.md +85 -0
- package/skills/hatch3r-cli-csvkit/SKILL.md +84 -0
- package/skills/hatch3r-cli-delta/SKILL.md +86 -0
- package/skills/hatch3r-cli-difftastic/SKILL.md +84 -0
- package/skills/hatch3r-cli-docker/SKILL.md +89 -0
- package/skills/hatch3r-cli-duckdb/SKILL.md +84 -0
- package/skills/hatch3r-cli-fd/SKILL.md +85 -0
- package/skills/hatch3r-cli-fzf/SKILL.md +84 -0
- package/skills/hatch3r-cli-gh/SKILL.md +90 -0
- package/skills/hatch3r-cli-glab/SKILL.md +89 -0
- package/skills/hatch3r-cli-jq/SKILL.md +89 -0
- package/skills/hatch3r-cli-lazygit/SKILL.md +78 -0
- package/skills/hatch3r-cli-llm/SKILL.md +84 -0
- package/skills/hatch3r-cli-miller/SKILL.md +84 -0
- package/skills/hatch3r-cli-mods/SKILL.md +84 -0
- package/skills/hatch3r-cli-overview/SKILL.md +60 -0
- package/skills/hatch3r-cli-playwright/SKILL.md +89 -0
- package/skills/hatch3r-cli-podman/SKILL.md +84 -0
- package/skills/hatch3r-cli-qsv/SKILL.md +91 -0
- package/skills/hatch3r-cli-ripgrep/SKILL.md +85 -0
- package/skills/hatch3r-cli-rtk/SKILL.md +91 -0
- package/skills/hatch3r-cli-sd/SKILL.md +85 -0
- package/skills/hatch3r-cli-stagehand/SKILL.md +111 -0
- package/skills/hatch3r-cli-taplo/SKILL.md +84 -0
- package/skills/hatch3r-cli-yq/SKILL.md +85 -0
- package/skills/hatch3r-cli-zstd/SKILL.md +85 -0
- package/skills/hatch3r-command-customize/SKILL.md +10 -0
- package/skills/hatch3r-context-health/SKILL.md +14 -0
- package/skills/hatch3r-cost-tracking/SKILL.md +14 -0
- package/skills/hatch3r-customize/SKILL.md +17 -0
- package/skills/hatch3r-dep-audit/SKILL.md +14 -0
- package/skills/hatch3r-design-system-detect/SKILL.md +164 -0
- package/skills/hatch3r-feature/SKILL.md +2 -0
- package/skills/hatch3r-gh-agentic-workflows/SKILL.md +13 -0
- package/skills/hatch3r-handoff-prepare/SKILL.md +160 -0
- package/skills/hatch3r-handoff-resume/SKILL.md +171 -0
- package/skills/hatch3r-incident-response/SKILL.md +14 -0
- package/skills/hatch3r-issue-workflow/SKILL.md +5 -0
- package/skills/hatch3r-logical-refactor/SKILL.md +14 -0
- package/skills/hatch3r-migration/SKILL.md +14 -0
- package/skills/hatch3r-observability-verify/SKILL.md +134 -0
- package/skills/hatch3r-perf-audit/SKILL.md +14 -0
- package/skills/hatch3r-pr-creation/SKILL.md +14 -0
- package/skills/hatch3r-qa-validation/SKILL.md +18 -0
- package/skills/hatch3r-recipe/SKILL.md +14 -0
- package/skills/hatch3r-refactor/SKILL.md +14 -0
- package/skills/hatch3r-release/SKILL.md +14 -0
- package/skills/hatch3r-reliability-verify/SKILL.md +146 -0
- package/skills/hatch3r-rule-customize/SKILL.md +10 -0
- package/skills/hatch3r-skill-customize/SKILL.md +10 -0
- package/skills/hatch3r-ui-ux-verify/SKILL.md +138 -0
- package/skills/hatch3r-visual-refactor/SKILL.md +15 -1
|
@@ -5,9 +5,19 @@ tags: [customize]
|
|
|
5
5
|
quality_charter: agents/shared/quality-charter.md
|
|
6
6
|
efficiency_patterns: agents/shared/efficiency-patterns.md
|
|
7
7
|
cache_friendly: true
|
|
8
|
+
redirect_to: hatch3r-customize
|
|
8
9
|
---
|
|
9
10
|
# Agent Customization
|
|
10
11
|
|
|
11
12
|
> **This skill has been consolidated.** Use the `hatch3r-customize` skill with `type: agent`.
|
|
12
13
|
|
|
13
14
|
For agent-specific reference (model resolution, protected agents, YAML schema), see the `hatch3r-agent-customize` command.
|
|
15
|
+
|
|
16
|
+
## Rejected Merge Alternative (D16.3 add-vs-remove bias)
|
|
17
|
+
|
|
18
|
+
Per `governance/audit/domains/D16-compound-system.md` SA 16.3, the default recommendation on functional overlap is MERGE rather than removal. Full deletion of this redirect file was rejected for two reasons:
|
|
19
|
+
|
|
20
|
+
1. **Preserves UX entry points.** Users typed `/hatch3r-agent-customize` or referenced the id `hatch3r-agent-customize` (per CHANGELOG.md, `website/docs/reference/configuration.md:325`, `docs/model-selection.md:158`) before consolidation. Deleting the id breaks those entry points without a redirect target.
|
|
21
|
+
2. **Signals umbrella canonicality.** The `redirect_to: hatch3r-customize` frontmatter field marks `hatch3r-customize` as the single source of truth — tooling, audit scans, and adapters can resolve any redirect to the canonical without re-reading body prose.
|
|
22
|
+
|
|
23
|
+
The 13-LOC redirect cost is paid once per type; the umbrella body lives in `skills/hatch3r-customize/SKILL.md`.
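As a rough illustration of how tooling might resolve a redirect to its canonical skill from frontmatter alone, here is a minimal Python sketch. It assumes flat `key: value` frontmatter and an in-memory mapping of skill id to file text; `parse_frontmatter`, `resolve_canonical`, and the `SKILLS` sample are hypothetical names, not part of the hatch3r CLI.

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs between the leading `---` fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def resolve_canonical(skill_id: str, skills: dict[str, str]) -> str:
    """Follow `redirect_to` fields until a skill without one is reached."""
    seen: set[str] = set()
    current = skill_id
    while True:
        if current in seen:
            raise ValueError(f"redirect cycle at {current}")
        seen.add(current)
        target = parse_frontmatter(skills[current]).get("redirect_to")
        if not target:
            return current
        current = target


# Hypothetical sample data mirroring the frontmatter shown in this diff.
SKILLS = {
    "hatch3r-agent-customize": (
        "---\nid: hatch3r-agent-customize\n"
        "redirect_to: hatch3r-customize\n---\n# Agent Customization\n"
    ),
    "hatch3r-customize": "---\nid: hatch3r-customize\n---\n# Customize\n",
}

canonical = resolve_canonical("hatch3r-agent-customize", SKILLS)
```

The cycle check is a defensive choice of this sketch; the source does not say how real tooling handles redirect loops.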
|
|
@@ -0,0 +1,136 @@
|
|
|
1
|
+
---
|
|
2
|
+
id: hatch3r-ai-feature
|
|
3
|
+
type: skill
|
|
4
|
+
description: Eval-driven development workflow for shipping AI features — write eval before prompt, measure, iterate, ship with caching + cost telemetry + model fallback + hallucination SLI
|
|
5
|
+
tags: [implementation, ai]
|
|
6
|
+
quality_charter: agents/shared/quality-charter.md
|
|
7
|
+
efficiency_patterns: agents/shared/efficiency-patterns.md
|
|
8
|
+
cache_friendly: true
|
|
9
|
+
---
|
|
10
|
+
# AI Feature Workflow (Eval-Driven)
|
|
11
|
+
|
|
12
|
+
## Quick Start
|
|
13
|
+
|
|
14
|
+
Run this skill before shipping any LLM-driven feature. It defines the canonical eval-driven loop (write eval, write prompt, measure, iterate) and the production-readiness gates. Skipping any of the 9 steps = the feature is not done.
|
|
15
|
+
|
|
16
|
+
This skill is the implementation counterpart to `rules/hatch3r-ai-evals.md` (backend governance) and `rules/hatch3r-ai-ux-patterns.md` (UI governance). The rules define the bar; this skill defines the route to clearing the bar.
|
|
17
|
+
|
|
18
|
+
## Step 0 — Detect Ambiguity (P8 B1)
|
|
19
|
+
|
|
20
|
+
Before any work, scan the invocation for unresolved questions in scope, intent, acceptance criteria, target environment, or irreversibility. If any are found, ask the user via the platform-native question tool per `agents/shared/user-question-protocol.md`. Do not proceed under silent assumption. Default path, not an exception. Triggers for THIS skill: task class (classification vs open-ended vs RAG vs agentic), model pin (Sonnet vs Opus vs Haiku), eval threshold values, budget per request (cost cap), and fallback policy (graceful degrade vs hard fail).
|
|
21
|
+
|
|
22
|
+
## Fan-out Discipline (P8 B2)
|
|
23
|
+
|
|
24
|
+
This skill delegates per task size:
|
|
25
|
+
- Tier 1 (trivial single-file): inline execution acceptable.
|
|
26
|
+
- Tier 2 (multi-file or multi-concern): spawn parallel sub-agents per concern via the Task tool.
|
|
27
|
+
- Tier 3 (multi-module / high-risk): one fresh sub-agent per independent module or gate; orchestrator integrates only.
|
|
28
|
+
|
|
29
|
+
Never under-fan-out to save tokens. Token cost is dominated by quality and completeness gains. Emit `sub_agents_spawned: { count, rationale }` in your output.
|
|
30
|
+
|
|
31
|
+
## Step 1: Define the task and success criteria
|
|
32
|
+
|
|
33
|
+
- Write down what "right" looks like in one paragraph — the user input class, the expected output shape, the failure modes you want to catch.
|
|
34
|
+
- Hand-author 20+ golden examples in `evals/<feature>/golden.jsonl` with `input` + `expected_output` (or a graded rubric when the task is open-ended).
|
|
35
|
+
- Save the threshold per metric in `evals/<feature>/thresholds.json`. Without an explicit threshold, "passing the eval" is undefined.
|
|
36
|
+
- Cross-reference `rules/hatch3r-ai-evals.md` Golden Dataset Versioning for filename and refresh policy.
|
|
37
|
+
- Source diversity matters more than count beyond 20 — include adversarial inputs, edge cases from prior incidents, and at least 3 examples per known input class.
|
|
38
|
+
- Label every example with the input class so per-class accuracy is computable in Step 4.
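The Step 1 rules (20+ examples, a label on every example, at least 3 examples per input class) could be checked mechanically before any prompt is written. The sketch below is one possible validator in Python; the field names `input_class` and `rubric` are assumptions, since the source only names `input` and `expected_output`.

```python
import json
from collections import Counter

MIN_EXAMPLES = 20   # "20+ golden examples" from Step 1
MIN_PER_CLASS = 3   # "at least 3 examples per known input class"


def validate_golden(jsonl_text: str) -> list[str]:
    """Return a list of human-readable problems; empty means the set passes."""
    rows = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    problems: list[str] = []
    if len(rows) < MIN_EXAMPLES:
        problems.append(f"only {len(rows)} examples, need at least {MIN_EXAMPLES}")

    per_class: Counter[str] = Counter()
    for i, row in enumerate(rows, start=1):
        if "input" not in row:
            problems.append(f"row {i}: missing 'input'")
        # Assumed convention: open-ended tasks carry a 'rubric' instead.
        if "expected_output" not in row and "rubric" not in row:
            problems.append(f"row {i}: missing 'expected_output' or 'rubric'")
        label = row.get("input_class")  # assumed field name for the class label
        if label is None:
            problems.append(f"row {i}: missing 'input_class' label")
        else:
            per_class[label] += 1

    for label, count in sorted(per_class.items()):
        if count < MIN_PER_CLASS:
            problems.append(f"class {label!r}: {count} examples, need {MIN_PER_CLASS}")
    return problems
```

Running this against `evals/<feature>/golden.jsonl` before Step 2 would make the per-class accuracy in Step 4 computable by construction.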
|
|
39
|
+
|
|
40
|
+
## Step 2: Pick eval tool and metric
|
|
41
|
+
|
|
42
|
+
Match the task class to the tool:
|
|
43
|
+
|
|
44
|
+
- Classification → promptfoo with exact-match assertions.
|
|
45
|
+
- Open-ended generation → DeepEval or braintrust with LLM-as-judge + a 50-example human-labeled calibration set.
|
|
46
|
+
- Retrieval/RAG → RAGAS (context_precision, context_recall, faithfulness, answer_relevance).
|
|
47
|
+
- Tool-use / agentic → Inspect or BFCL-style harness.
|
|
48
|
+
- Safety/red-team → Garak or PyRIT scheduled weekly.
|
|
49
|
+
|
|
50
|
+
Pin the choice in `evals/README.md` so the next agent run picks the same tool.
|
|
51
|
+
|
|
52
|
+
## Step 3: Write the prompt
|
|
53
|
+
|
|
54
|
+
- Author the prompt at `prompts/<feature>/v1.md` with frontmatter `{ id, version: 1, model_pinned, eval_set }`.
|
|
55
|
+
- Commit; record SHA-256 hash in `evals/<feature>/thresholds.json`.
|
|
56
|
+
- If the system prompt + tool definitions + RAG context exceed 1024 tokens, apply Anthropic `cache_control` breakpoints (or rely on OpenAI's automatic prefix cache for ≥1024-token deterministic prefixes). Longest-TTL block first.
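Recording the prompt hash can be a few lines of standard-library Python. This is a sketch under the assumption that `thresholds.json` stores the hash under `prompt_hash` and the version under `prompt_version`; the source names the file but not its keys.

```python
import hashlib


def prompt_hash(prompt_text: str) -> str:
    """SHA-256 hex digest of the prompt file contents, encoded as UTF-8."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()


def record_prompt_hash(thresholds: dict, prompt_text: str, version: int) -> dict:
    """Return a copy of the thresholds mapping with the prompt identity added.

    The key names are assumptions made for illustration.
    """
    updated = dict(thresholds)
    updated["prompt_version"] = version
    updated["prompt_hash"] = prompt_hash(prompt_text)
    return updated


# Example: the thresholds for a hypothetical feature gain the hash of v1.md.
thresholds = record_prompt_hash({"accuracy": 0.9}, "You are a classifier.\n", 1)
```

Hashing the exact file bytes means any whitespace edit produces a new hash, which is usually the desired behavior for a regression gate.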
|
|
57
|
+
|
|
58
|
+
## Step 4: Run eval; iterate prompt
|
|
59
|
+
|
|
60
|
+
- Run `npx promptfoo eval` (or the chosen tool's CLI) against the golden set.
|
|
61
|
+
- Read the per-metric report. If below threshold, modify the prompt, bump to `v2.md`, re-hash, re-run.
|
|
62
|
+
- Treat each prompt revision like a code commit — small, named, testable.
|
|
63
|
+
- Stop iterating when every metric clears its threshold in `thresholds.json` and the pairwise win-rate vs the prior version is >=55%.
|
|
64
|
+
- Capture the eval report artifact in CI so the PR reviewer can read per-case pass/fail without re-running the suite locally.
|
|
65
|
+
- If iteration count exceeds 10 versions without convergence, escalate — the task may need decomposition (one sub-prompt per input class) or a retrieval-grounded approach.
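The stop condition in Step 4 combines two checks: every metric clears its threshold, and the pairwise win rate against the prior version is at least 55%. A minimal Python sketch follows. It assumes every metric is higher-is-better and that ties are already excluded from the comparison count; neither detail is specified in the source.

```python
def should_stop(
    metrics: dict[str, float],
    thresholds: dict[str, float],
    wins: int,
    comparisons: int,
    min_win_rate: float = 0.55,
) -> tuple[bool, dict[str, float], float]:
    """Return (stop, failing_metrics, win_rate) for one prompt revision."""
    # A metric missing from the report counts as failing its threshold.
    failing = {
        name: metrics.get(name, float("-inf"))
        for name, floor in thresholds.items()
        if metrics.get(name, float("-inf")) < floor
    }
    win_rate = wins / comparisons if comparisons else 0.0
    return (not failing and win_rate >= min_win_rate), failing, win_rate


# Hypothetical report for v2 compared against v1 on 20 golden cases.
stop, failing, win_rate = should_stop(
    metrics={"accuracy": 0.92, "faithfulness": 0.88},
    thresholds={"accuracy": 0.90, "faithfulness": 0.85},
    wins=12,
    comparisons=20,
)
```

Lower-is-better metrics such as a hallucination rate would need the comparison inverted, so a real gate would carry a direction per metric.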
|
|
66
|
+
|
|
67
|
+
## Step 5: Wire production telemetry
|
|
68
|
+
|
|
69
|
+
- Per-request log line emits `model`, `tokens_in`, `tokens_out`, `cache_hit`, `cached_tokens`, `cost_usd`, `latency_ms`, `prompt_version`, `prompt_hash`, `cost_center`.
|
|
70
|
+
- Per-request OpenTelemetry span follows the OTel GenAI semantic conventions (`gen_ai.*` attributes).
|
|
71
|
+
- Aggregate dashboards: cost-per-request, hallucination_rate, citation_precision, refusal_rate, cache_hit_ratio.
|
|
72
|
+
- Cross-reference `skills/hatch3r-observability-verify` for the per-feature dashboard checklist.
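To make the per-request log line concrete, the sketch below serializes the ten fields listed in Step 5 as one JSON object per request. The dataclass name, the field types, and the sample values are assumptions; only the field names come from the source.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LlmRequestLog:
    model: str
    tokens_in: int
    tokens_out: int
    cache_hit: bool
    cached_tokens: int
    cost_usd: float
    latency_ms: int
    prompt_version: int
    prompt_hash: str
    cost_center: str


def to_log_line(record: LlmRequestLog) -> str:
    """Serialize one request as a single-line JSON object with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


# Hypothetical values for illustration only.
line = to_log_line(
    LlmRequestLog(
        model="example-model",
        tokens_in=1200,
        tokens_out=150,
        cache_hit=True,
        cached_tokens=1024,
        cost_usd=0.0042,
        latency_ms=830,
        prompt_version=2,
        prompt_hash="0" * 64,
        cost_center="search",
    )
)
```

A frozen dataclass keeps the schema in one place, so a missing field fails at construction time rather than showing up as a gap in the dashboards.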
|
|
73
|
+
|
|
74
|
+
## Step 6: Wire fallback chain
|
|
75
|
+
|
|
76
|
+
- Primary model (e.g. Sonnet 4.7) → secondary (cheaper/faster, e.g. Haiku 4.5) → static fallback (cached or canned).
|
|
77
|
+
- Wrap in circuit-breaker + retry-with-decorrelated-jitter — cross-reference `rules/hatch3r-resilience-patterns.md` (Slice 8) for the primitives.
|
|
78
|
+
- Run the eval suite against the secondary path too — a silent quality cliff between primary and secondary is a regression.
|
|
79
|
+
- Static fallback text names the failure mode in user-readable language ("AI is briefly unavailable — retry in a minute") rather than dumping a stack trace into the UI.
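The fallback chain in Step 6 can be sketched as a loop over model callables with decorrelated-jitter retries and a static string as the last resort. This Python sketch omits the circuit breaker for brevity, treats any exception as retryable, and uses hypothetical names throughout (`call_with_fallback`, `STATIC_FALLBACK`); it is not the primitive defined in `rules/hatch3r-resilience-patterns.md`.

```python
import random
import time
from typing import Callable, Sequence

STATIC_FALLBACK = "AI is briefly unavailable; retry in a minute."


def decorrelated_jitter(base: float, cap: float, prev: float) -> float:
    """Next sleep: uniform between base and 3x the previous sleep, capped."""
    return min(cap, random.uniform(base, prev * 3))


def call_with_fallback(
    prompt: str,
    models: Sequence[Callable[[str], str]],
    static_fallback: str = STATIC_FALLBACK,
    attempts_per_model: int = 3,
    base: float = 0.1,
    cap: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each model in order (primary first); return static text if all fail."""
    for call_model in models:
        delay = base
        for attempt in range(attempts_per_model):
            try:
                return call_model(prompt)
            except Exception:
                # Back off before the next attempt on the same model;
                # after the last attempt, move on to the next model.
                if attempt + 1 < attempts_per_model:
                    delay = decorrelated_jitter(base, cap, delay)
                    sleep(delay)
    return static_fallback
```

Passing `sleep` as a parameter keeps the function testable without real delays, and the same eval suite can be pointed at each callable in `models` to catch a quality cliff on the secondary path.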
|
|
80
|
+
|
|
81
|
+
## Step 7: Add CI gate
|
|
82
|
+
|
|
83
|
+
- Eval runs on every PR that touches `**/prompts/**`, `**/rag/**`, `**/ai/**`, `**/llm/**`.
|
|
84
|
+
- PR blocks when any metric drops below the threshold in `evals/<feature>/thresholds.json`.
|
|
85
|
+
- Model-version upgrade (Sonnet to Opus, 4.6 to 4.7) triggers a full eval with a 5% accuracy budget; an accuracy drop of more than 5% requires a named-reviewer sign-off + 24-hour canary at 5% traffic.
|
|
86
|
+
|
|
87
|
+
## Step 8: Production verification
|
|
88
|
+
|
|
89
|
+
First 24 hours after deploy, monitor:
|
|
90
|
+
|
|
91
|
+
- `ai.hallucination_rate` — SLO <5% on golden set; alert if 7-day rolling rate >5%.
|
|
92
|
+
- `ai.refusal_rate` — track false-positive refusal rate separately.
|
|
93
|
+
- `ai.cost_per_request_usd` — p50/p95/p99 vs feature budget; alert at 50%/75%/90% of monthly budget.
|
|
94
|
+
- `ai.latency_ms` — first-token-latency p95 + total-response-latency p99.
|
|
95
|
+
- `ai.cache_hit_ratio` — should match the dev-environment baseline within 10%; a drop indicates prefix drift.
|
|
96
|
+
- `ai.tokens_per_request` — p95 should be within 20% of the eval-time distribution; a spike signals retrieval growth or prompt drift.
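Three of the checks above are simple comparisons against a baseline and lend themselves to automation. The Python sketch below assumes "within 10%" and "within 20%" mean relative deviation from the baseline value, and the dictionary keys are hypothetical; the source does not define either.

```python
def within_relative(observed: float, baseline: float, tolerance: float) -> bool:
    """True when observed deviates from baseline by at most `tolerance` (relative)."""
    if baseline == 0:
        return observed == 0
    return abs(observed - baseline) / abs(baseline) <= tolerance


def verify_production(observed: dict[str, float], baseline: dict[str, float]) -> list[str]:
    """Return findings for the first-24-hours check; empty means all clear."""
    findings: list[str] = []
    if observed["hallucination_rate"] >= 0.05:
        findings.append("hallucination_rate is at or above the 5% SLO")
    if not within_relative(observed["cache_hit_ratio"], baseline["cache_hit_ratio"], 0.10):
        findings.append("cache_hit_ratio drifted more than 10% from baseline (prefix drift?)")
    if not within_relative(observed["tokens_p95"], baseline["tokens_p95"], 0.20):
        findings.append("tokens_per_request p95 drifted more than 20% from eval-time value")
    return findings


# Hypothetical numbers: the cache hit ratio fell from 0.80 to 0.60.
findings = verify_production(
    observed={"hallucination_rate": 0.02, "cache_hit_ratio": 0.60, "tokens_p95": 1100},
    baseline={"cache_hit_ratio": 0.80, "tokens_p95": 1000},
)
```

Cost and latency percentiles are left out because they depend on a per-feature budget the source does not quantify.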
|
|
97
|
+
|
|
98
|
+
Cross-reference `skills/hatch3r-observability-verify`.
|
|
99
|
+
|
|
100
|
+
## Step 9: Feedback loop
|
|
101
|
+
|
|
102
|
+
- Wire user thumbs-down to a feedback queue per response.
|
|
103
|
+
- Monthly triage job promotes thumbs-down examples into regression fixtures in `evals/<feature>/edge.jsonl`.
|
|
104
|
+
- Promotion is a manual review step — raw user feedback contains noise and adversarial labels.
|
|
105
|
+
- Capture an optional free-text comment with each thumbs-down; the comment is the highest-signal feature for triage clustering.
|
|
106
|
+
- Track feedback volume per response surface — a sudden spike in thumbs-down rate signals an upstream prompt or retrieval regression and gates a rollback.
|
|
107
|
+
|
|
108
|
+
## Verdict
|
|
109
|
+
|
|
110
|
+
All 9 steps complete = the AI feature is "done". Anything less = not done. The orchestrator running this skill emits a single-line verdict per step (`STEP_N: PASS|FAIL <evidence-path>`) and aggregates them. One FAIL on any step blocks release.
|
|
111
|
+
|
|
112
|
+
Evidence paths point at concrete artifacts: the golden set (`evals/<feature>/golden.jsonl`), the prompt version (`prompts/<feature>/v<N>.md`), the eval report (`evals/<feature>/report-<run-id>.json`), and the dashboard URL for production SLI verification. Verdicts without evidence paths are not accepted by the gate.
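The aggregation rule can be sketched as a small parser over the `STEP_N: PASS|FAIL <evidence-path>` lines. This Python sketch assumes the nine gated steps are Steps 1 to 9 (Step 0 is treated as a precondition) and that an evidence path contains no whitespace; both are assumptions, as is the function name.

```python
import re

# A verdict line must carry an evidence path; lines without one do not match.
VERDICT_RE = re.compile(r"^STEP_(\d+): (PASS|FAIL) (\S+)$")


def aggregate_verdicts(lines: list[str], expected_steps: range = range(1, 10)) -> dict:
    """Aggregate per-step verdicts; any FAIL or missing step blocks release."""
    results: dict[int, str] = {}
    for line in lines:
        match = VERDICT_RE.match(line.strip())
        if match:
            results[int(match.group(1))] = match.group(2)
    missing = [step for step in expected_steps if step not in results]
    failed = sorted(
        step for step, verdict in results.items()
        if verdict == "FAIL" and step in expected_steps
    )
    return {
        "release_blocked": bool(missing or failed),
        "missing": missing,
        "failed": failed,
    }


# Hypothetical run: Step 4 fails and Step 8 is reported without evidence.
report = aggregate_verdicts(
    [f"STEP_{n}: PASS evals/search/report-123.json" for n in (1, 2, 3, 5, 6, 7, 9)]
    + ["STEP_4: FAIL evals/search/report-123.json", "STEP_8: PASS"]
)
```

Treating a verdict without an evidence path as missing, rather than as a failure, keeps the two causes distinguishable in the report.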
|
|
113
|
+
|
|
114
|
+
## When this skill runs
|
|
115
|
+
|
|
116
|
+
- After `hatch3r-implementer` finishes the surrounding non-AI feature code, before `hatch3r-qa-validation`.
|
|
117
|
+
- On every PR that introduces a new LLM call or modifies an existing prompt, model, or retrieval pipeline.
|
|
118
|
+
- Step 8 (production verification) executes against the post-deploy environment, not the PR branch.
|
|
119
|
+
|
|
120
|
+
## Cross-References
|
|
121
|
+
|
|
122
|
+
- `rules/hatch3r-ai-evals.md` — backend governance (eval, cost, caching, fallback, SLI).
|
|
123
|
+
- `rules/hatch3r-ai-ux-patterns.md` — frontend UX patterns (streaming, tool-call cards, citations).
|
|
124
|
+
- `skills/hatch3r-ui-ux-verify/SKILL.md` — UI verification gate for AI surfaces.
|
|
125
|
+
- `skills/hatch3r-observability-verify` — observability wiring checklist.
|
|
126
|
+
- `rules/hatch3r-resilience-patterns.md` (Slice 8) — circuit-breaker + retry primitives reused in the fallback chain.
|
|
127
|
+
|
|
128
|
+
## References
|
|
129
|
+
|
|
130
|
+
- promptfoo — `promptfoo.dev`
|
|
131
|
+
- DeepEval — `github.com/confident-ai/deepeval`
|
|
132
|
+
- RAGAS — `docs.ragas.io`
|
|
133
|
+
- Inspect (UK AISI) — `github.com/UKGovernmentBEIS/inspect_ai`
|
|
134
|
+
- Anthropic prompt caching guide — `docs.anthropic.com/en/docs/build-with-claude/prompt-caching`
|
|
135
|
+
- OpenTelemetry GenAI semantic conventions — `opentelemetry.io/docs/specs/semconv/gen-ai/`
|
|
136
|
+
- Berkeley Function Calling Leaderboard (BFCL v4) — `gorilla.cs.berkeley.edu/leaderboard.html`
|
|
@@ -14,13 +14,19 @@ cache_friendly: true
|
|
|
14
14
|
|
|
15
15
|
```
|
|
16
16
|
Task Progress:
|
|
17
|
+
- [ ] Step 0: Detect ambiguity (P8 B1)
|
|
17
18
|
- [ ] Step 1: Inventory existing endpoints
|
|
18
19
|
- [ ] Step 2: Generate OpenAPI spec
|
|
19
20
|
- [ ] Step 3: Validate schemas
|
|
20
21
|
- [ ] Step 4: Generate documentation
|
|
21
22
|
- [ ] Step 5: Verify spec accuracy
|
|
23
|
+
- [ ] Step 6: Wire oasdiff breaking-change CI gate
|
|
22
24
|
```
|
|
23
25
|
|
|
26
|
+
## Step 0 — Detect Ambiguity (P8 B1)
|
|
27
|
+
|
|
28
|
+
Before any work, scan the invocation for unresolved questions in scope, intent, acceptance criteria, target environment, or irreversibility. If any are found, ask the user via the platform-native question tool per `agents/shared/user-question-protocol.md`. Do not proceed under silent assumption. Default path, not an exception. Triggers for THIS skill: OpenAPI version (3.0 vs 3.1), spec output path, auth scheme (Bearer vs OAuth2 vs API key), breaking-change policy (block vs version vs document), and target consumers (SDK clients vs human docs vs both).
|
|
29
|
+
|
|
24
30
|
## Step 1: Inventory Existing Endpoints
|
|
25
31
|
|
|
26
32
|
- Scan route definitions across the codebase (controllers, handlers, route files).
|
|
@@ -61,6 +67,72 @@ Task Progress:
|
|
|
61
67
|
- Check that path parameters, query parameters, and headers are documented with accurate types, required flags, and example values.
|
|
62
68
|
- Validate against any existing API consumers (SDKs, frontend clients) for breaking changes.
|
|
63
69
|
|
|
70
|
+
## Step 6: Wire `oasdiff` Breaking-Change CI Gate
|
|
71
|
+
|
|
72
|
+
Breaking changes on stable endpoints must trip CI before merge. This step enforces the CONSTITUTION §2 P5 lean-thresholds row "API breaking-change events on stable endpoints = 0 per release" (governance/CONSTITUTION.md:80, verified by `oasdiff / buf breaking / graphql-inspector CI gate`).
|
|
73
|
+
|
|
74
|
+
### 6.1 Install `oasdiff`
|
|
75
|
+
|
|
76
|
+
Pick one of two install paths:
|
|
77
|
+
|
|
78
|
+
- npm global (CI runner with Node 22+): `npm i -g @tufin/oasdiff`
|
|
79
|
+
- Docker image (no Node dependency): `docker run --rm -t -v $(pwd):/specs tufin/oasdiff <subcommand>`
|
|
80
|
+
|
|
81
|
+
Pin the version in CI (e.g., `npm i -g @tufin/oasdiff@1.10.x` or `tufin/oasdiff:1.10`) so a new release of oasdiff does not change gate semantics mid-cycle.
|
|
82
|
+
|
|
83
|
+
### 6.2 Compare current spec vs previous merged version
|
|
84
|
+
|
|
85
|
+
The gate compares the spec on the feature branch against the spec at the merge base on the default branch. Fail CI on any breaking change to a stable endpoint; report non-breaking diffs as informational.
|
|
86
|
+
|
|
87
|
+
- Fetch the base ref's spec into a temp path (e.g., `git show origin/main:openapi.yaml > /tmp/openapi.base.yaml`).
|
|
88
|
+
- Run `oasdiff breaking /tmp/openapi.base.yaml ./openapi.yaml --fail-on ERR` — exit code 1 when one or more `ERR`-level breaking changes are detected.
|
|
89
|
+
- Scope the gate to stable endpoints by excluding paths tagged `x-stability: experimental` via `--match-path` or by maintaining an `oasdiff-ignore.yaml` rules file for documented breaking changes already coordinated with consumers.
|
|
90
|
+
|
|
91
|
+
### 6.3 Example GitHub Actions step
|
|
92
|
+
|
|
93
|
+
```yaml
name: API Breaking-Change Gate
on:
  pull_request:
    paths:
      - 'openapi.yaml'
      - 'openapi.json'
      - 'docs/api/**'

jobs:
  oasdiff:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: actions/setup-node@v4
        with:
          node-version: '22'
      - name: Install oasdiff
        run: npm i -g @tufin/oasdiff@1.10.x
      - name: Resolve base spec
        run: |
          git show origin/${{ github.base_ref }}:openapi.yaml > /tmp/openapi.base.yaml
      - name: Run breaking-change diff
        run: |
          oasdiff breaking /tmp/openapi.base.yaml ./openapi.yaml \
            --fail-on ERR \
            --format githubactions
```
|
|
123
|
+
|
|
124
|
+
The `--format githubactions` flag emits `::error::` annotations so each breaking change shows up inline on the PR diff.
|
|
125
|
+
|
|
126
|
+
### 6.4 Handling an intentional breaking change
|
|
127
|
+
|
|
128
|
+
When a breaking change is deliberate (versioned endpoint cut, deprecated field removed after the documented sunset window):
|
|
129
|
+
|
|
130
|
+
1. Add a row to `oasdiff-ignore.yaml` with the change ID, the affected operation, and a link to the consumer-coordination record.
|
|
131
|
+
2. Bump the spec `info.version` in line with the project's API versioning policy (semver-major for breaking changes on stable endpoints).
|
|
132
|
+
3. Document the change in CHANGELOG (or equivalent) with the migration path for downstream consumers.
|
|
133
|
+
|
|
134
|
+
The gate stays green only because the change is recorded — not because the breaking signal was silenced.
|
|
135
|
+
|
|
64
136
|
## Error Handling
|
|
65
137
|
|
|
66
138
|
- **Route definitions use dynamic or meta-programmed patterns**: If endpoints are generated at runtime or via decorators that resist static analysis, document the gap and manually enumerate the missing endpoints.
|
|
@@ -74,3 +146,4 @@ Task Progress:
|
|
|
74
146
|
- [ ] Spec passes linter validation
|
|
75
147
|
- [ ] Example requests/responses included
|
|
76
148
|
- [ ] No breaking changes to existing API consumers
|
|
149
|
+
- [ ] `oasdiff breaking` CI gate is wired and fails on any `ERR`-level breaking change on stable endpoints (CONSTITUTION §2 P5: 0 per release)
|
|
@@ -12,6 +12,7 @@ cache_friendly: true
|
|
|
12
12
|
|
|
13
13
|
```
|
|
14
14
|
Task Progress:
|
|
15
|
+
- [ ] Step 0: Detect ambiguity (P8 B1)
|
|
15
16
|
- [ ] Step 1: Read existing ADRs and the template
|
|
16
17
|
- [ ] Step 2: Define the decision context — problem, constraints, options
|
|
17
18
|
- [ ] Step 3: Evaluate options — pros/cons, prototype if needed, check ADR constraints
|
|
@@ -20,6 +21,19 @@ Task Progress:
|
|
|
20
21
|
- [ ] Step 6: Update affected specs or docs to reference the new ADR
|
|
21
22
|
```
|
|
22
23
|
|
|
24
|
+
## Step 0 — Detect Ambiguity (P8 B1)
|
|
25
|
+
|
|
26
|
+
Before any work, scan the invocation for unresolved questions in scope, intent, acceptance criteria, target environment, or irreversibility. If any are found, ask the user via the platform-native question tool per `agents/shared/user-question-protocol.md`. Do not proceed under silent assumption. Default path, not an exception. Triggers for THIS skill: problem framing (what decision needs to be made), constraint set (mandatory vs preferred), evaluation horizon (short-term vs long-term cost), supersedes which prior ADR, and ADR status target (PROPOSED for discussion vs ACCEPTED for binding decision).
|
|
27
|
+
|
|
28
|
+
## Fan-out Discipline (P8 B2)
|
|
29
|
+
|
|
30
|
+
This skill delegates per task size:
|
|
31
|
+
- Tier 1 (trivial single-file): inline execution acceptable.
|
|
32
|
+
- Tier 2 (multi-file or multi-concern): spawn parallel sub-agents per concern via the Task tool.
|
|
33
|
+
- Tier 3 (multi-module / high-risk): one fresh sub-agent per independent module or gate; orchestrator integrates only.
|
|
34
|
+
|
|
35
|
+
Never under-fan-out to save tokens. Token cost is dominated by quality and completeness gains. Emit `sub_agents_spawned: { count, rationale }` in your output.
|
|
36
|
+
|
|
23
37
|
## Step 1: Read Existing ADRs and Template
|
|
24
38
|
|
|
25
39
|
- Read all ADRs in project docs to understand current architecture and constraints.
|
|
@@ -14,6 +14,7 @@ cache_friendly: true
|
|
|
14
14
|
|
|
15
15
|
```
|
|
16
16
|
Task Progress:
|
|
17
|
+
- [ ] Step 0: Detect ambiguity (P8 B1)
|
|
17
18
|
- [ ] Step 1: Read the issue and relevant specs
|
|
18
19
|
- [ ] Step 2: Produce a diagnosis plan
|
|
19
20
|
- [ ] Step 2b: Browser reproduction (if UI bug)
|
|
@@ -25,6 +26,10 @@ Task Progress:
|
|
|
25
26
|
- [ ] Step 6: Open PR
|
|
26
27
|
```
|
|
27
28
|
|
|
29
|
+
## Step 0 — Detect Ambiguity (P8 B1)
|
|
30
|
+
|
|
31
|
+
Before any work, scan the invocation for unresolved questions in scope, intent, acceptance criteria, target environment, or irreversibility. If any are found, ask the user via the platform-native question tool per `agents/shared/user-question-protocol.md`. Do not proceed under silent assumption. Default path, not an exception. Triggers for THIS skill: reproduction steps incomplete, expected vs actual behavior unstated, severity unclear (P0/P1 vs P2/P3), affected environment unknown (staging vs prod), or fix may require schema/API change with downstream consumers.
|
|
32
|
+
|
|
28
33
|
## Step 1: Read Inputs
|
|
29
34
|
|
|
30
35
|
- Parse the issue body: problem description, reproduction steps, expected/actual behavior, severity, affected area.
|
|
@@ -14,6 +14,7 @@ cache_friendly: true
|
|
|
14
14
|
|
|
15
15
|
```
|
|
16
16
|
Task Progress:
|
|
17
|
+
- [ ] Step 0: Detect ambiguity (P8 B1)
|
|
17
18
|
- [ ] Step 1: Audit existing pipeline
|
|
18
19
|
- [ ] Step 2: Design stage structure
|
|
19
20
|
- [ ] Step 3: Optimize test parallelization
|
|
@@ -21,6 +22,19 @@ Task Progress:
|
|
|
21
22
|
- [ ] Step 5: Implement and validate
|
|
22
23
|
```
|
|
23
24
|
|
|
25
|
+
## Step 0 — Detect Ambiguity (P8 B1)
|
|
26
|
+
|
|
27
|
+
Before any work, scan the invocation for unresolved questions in scope, intent, acceptance criteria, target environment, or irreversibility. If any are found, ask the user via the platform-native question tool per `agents/shared/user-question-protocol.md`. Do not proceed under silent assumption. Default path, not an exception. Triggers for THIS skill: CI platform (GitHub Actions vs GitLab vs CircleCI vs Azure Pipelines), pipeline duration target, runner sizing budget, deploy gate (auto vs manual approval for prod), and artifact retention policy.
|
|
28
|
+
|
|
29
|
+
## Fan-out Discipline (P8 B2)
|
|
30
|
+
|
|
31
|
+
This skill delegates per task size:
|
|
32
|
+
- Tier 1 (trivial single-file): inline execution acceptable.
|
|
33
|
+
- Tier 2 (multi-file or multi-concern): spawn parallel sub-agents per concern via the Task tool.
|
|
34
|
+
- Tier 3 (multi-module / high-risk): one fresh sub-agent per independent module or gate; orchestrator integrates only.
|
|
35
|
+
|
|
36
|
+
Never under-fan-out to save tokens: the quality and completeness gains outweigh the token cost. Emit `sub_agents_spawned: { count, rationale }` in your output.
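The tier rules above can be read as a small decision function. The sketch below is illustrative only: the `pick_tier` and `report_fanout` names, the file-count and module-count inputs, and the `high-risk` flag are assumptions, not part of hatch3r. Only the three tiers and the `sub_agents_spawned: { count, rationale }` shape come from the text, and "multi-concern" is simplified here to "more than one file".

```shell
#!/bin/sh
# Hypothetical helper: map a task's size to the fan-out tier described above.
# Inputs (all assumptions for illustration):
#   $1 = number of files touched
#   $2 = number of independent modules touched
#   $3 = "high-risk" to force tier 3 (optional)
pick_tier() {
  files=$1
  modules=$2
  risk=${3:-normal}
  if [ "$modules" -gt 1 ] || [ "$risk" = "high-risk" ]; then
    echo 3   # multi-module or high-risk: one fresh sub-agent per module or gate
  elif [ "$files" -gt 1 ]; then
    echo 2   # multi-file: parallel sub-agents per concern
  else
    echo 1   # trivial single-file: inline execution is acceptable
  fi
}

# Hypothetical reporter: print the sub_agents_spawned record as one JSON line.
#   $1 = sub-agent count, $2 = short rationale (no double quotes)
report_fanout() {
  printf '{"sub_agents_spawned": {"count": %d, "rationale": "%s"}}\n' "$1" "$2"
}

report_fanout 3 "tier $(pick_tier 5 3): one sub-agent per module"
```

A task touching five files across three modules lands in tier 3, so the record reports one sub-agent per module.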
|
|
37
|
+
|
|
24
38
|
## Step 1: Audit Existing Pipeline
|
|
25
39
|
|
|
26
40
|
- Map the current pipeline stages, their dependencies, and execution times.
|
|
@@ -0,0 +1,84 @@
|
|
|
1
|
+
---
|
|
2
|
+
id: hatch3r-cli-aichat
|
|
3
|
+
description: "Multi-provider LLM chat CLI with RAG and session memory. Use when you need a RAG-enabled, multi-provider conversational shell with saved session history; invoke `aichat`. Streams tokens to stdout so downstream `grep`/`tee` consumers see partial results."
|
|
4
|
+
tags: ["cli-tools", "ai", "opt-in"]
|
|
5
|
+
quality_charter: agents/shared/quality-charter.md
|
|
6
|
+
efficiency_patterns: agents/shared/efficiency-patterns.md
|
|
7
|
+
cache_friendly: true
|
|
8
|
+
cli_tool:
|
|
9
|
+
id: aichat
|
|
10
|
+
bin: aichat
|
|
11
|
+
tier: 3
|
|
12
|
+
category: ai
|
|
13
|
+
homepage: https://github.com/sigoden/aichat
|
|
14
|
+
---
|
|
15
|
+
<!-- HATCH3R-CLI-SKILL-GENERATED v1 -->
|
|
16
|
+
# aichat
|
|
17
|
+
|
|
18
|
+
Multi-provider LLM chat CLI with RAG and session memory
|
|
19
|
+
|
|
20
|
+
## When to Use
|
|
21
|
+
|
|
22
|
+
Reach for `aichat` when the task is in the **ai** category and the agent would otherwise call an MCP tool or read large outputs into context.
|
|
23
|
+
|
|
24
|
+
## Token Cost
|
|
25
|
+
|
|
26
|
+
CLI tools return structured stdout that fits in <1KB for typical queries; equivalent MCP calls regularly exceed 10KB.
|
|
27
|
+
Reference: Anthropic engineering (Nov 4 2025) — code-execution-over-MCP yields 98.7% token reduction.
|
|
28
|
+
|
|
29
|
+
## Recipes
|
|
30
|
+
|
|
31
|
+
```bash
|
|
32
|
+
aichat 'explain this commit message' < commit.txt
|
|
33
|
+
```
|
|
34
|
+
One-shot prompt with stdin as the input payload.
|
|
35
|
+
|
|
36
|
+
```bash
|
|
37
|
+
aichat -r tech-writer 'rewrite as bullets' < draft.md
|
|
38
|
+
```
|
|
39
|
+
Apply a saved role (`~/.config/aichat/roles/tech-writer.md`) as the system prompt.
|
|
40
|
+
|
|
41
|
+
```bash
|
|
42
|
+
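aichat --model claude-3-5-sonnet -f README.md 'summarize'
|
|
43
|
+
```
|
|
44
|
+
Pin the model and attach a file with `-f`; the prompt runs as a one-shot, non-interactive request. (`-e` is aichat's shell-assistant mode, which turns the prompt into a shell command, so it is not the right flag here.)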
|
|
45
|
+
|
|
46
|
+
```bash
|
|
47
|
+
aichat --rag mydocs 'how do we configure auth?'
|
|
48
|
+
```
|
|
49
|
+
Query a pre-built RAG index over local documentation; the index is kept on local disk, so no remote indexer is needed (embeddings come from whichever embedding model is configured).
|
|
50
|
+
|
|
51
|
+
```bash
|
|
52
|
+
aichat --session refactor-plan
|
|
53
|
+
```
|
|
54
|
+
Resume a named session with persisted history — useful for multi-turn refinement loops.
|
|
55
|
+
|
|
56
|
+
## Wrong Choice When
|
|
57
|
+
|
|
58
|
+
- **Scripted Unix-style pipelines with a rich plugin ecosystem:** `hatch3r-cli-llm` (tier 2) has plugin support for templates, embeddings, and provider adapters not in aichat.
|
|
59
|
+
- **Offline-only / fully local inference:** aichat supports Ollama backends but adds an unneeded abstraction; talk to Ollama's HTTP API directly via `curl`.
|
|
60
|
+
- **CI batch tasks that benefit from `mods` pipe semantics:** `hatch3r-cli-mods` reads a single piped payload then exits — simpler for one-shot transforms.
|
|
61
|
+
|
|
62
|
+
## Alternatives
|
|
63
|
+
|
|
64
|
+
| Tool | When to prefer |
|
|
65
|
+
|------|----------------|
|
|
66
|
+
| `hatch3r-cli-llm` (tier 2) | Plugin ecosystem, templates, embeddings, structured CI use |
|
|
67
|
+
| `hatch3r-cli-mods` (tier 3) | Single-piped-payload transforms, Unix-pipe ergonomics |
|
|
68
|
+
| Raw `curl` against Ollama / provider HTTP API | Maximum control, no client-side caching or session state |
|
|
69
|
+
|
|
70
|
+
## Detection / Install
|
|
71
|
+
|
|
72
|
+
Verify with:
|
|
73
|
+
```bash
|
|
74
|
+
command -v aichat
|
|
75
|
+
```
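
The detection check above can be extended into a fallback chain. This is a minimal sketch, assuming the alternatives named in this skill are installed under the binary names `llm` and `mods`; the `first_available` helper is hypothetical and not part of hatch3r.

```shell
#!/bin/sh
# Print the first command from the argument list that exists on PATH.
# Returns non-zero (and prints nothing) when none of them is installed.
first_available() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      return 0
    fi
  done
  return 1
}

# Prefer aichat, then the alternatives named in this skill (binary names assumed).
if chat_cli=$(first_available aichat llm mods); then
  echo "using $chat_cli"
else
  echo "no LLM chat CLI found; install one first" >&2
fi
```

Keeping the lookup in one function lets an agent report which tool it picked instead of failing midway through a pipeline.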
|
|
76
|
+
|
|
77
|
+
Install (mac):
|
|
78
|
+
|
|
79
|
+
```bash
|
|
80
|
+
# brew
|
|
81
|
+
brew install aichat
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
Homepage: https://github.com/sigoden/aichat
|
|
@@ -0,0 +1,85 @@
|
|
|
1
|
+
---
|
|
2
|
+
id: hatch3r-cli-ast-grep
|
|
3
|
+
description: "Structural search and rewrite for code via AST patterns. Use for Tree-sitter AST pattern searches and rewrites scoped to a single grammar; invoke `sg`. Grammar-aware: queries are written in the same syntax as the language being edited."
|
|
4
|
+
tags: ["cli-tools", "search", "core"]
|
|
5
|
+
quality_charter: agents/shared/quality-charter.md
|
|
6
|
+
efficiency_patterns: agents/shared/efficiency-patterns.md
|
|
7
|
+
cache_friendly: true
|
|
8
|
+
cli_tool:
|
|
9
|
+
id: ast-grep
|
|
10
|
+
bin: sg
|
|
11
|
+
tier: 1
|
|
12
|
+
category: search
|
|
13
|
+
homepage: https://ast-grep.github.io/
|
|
14
|
+
---
|
|
15
|
+
<!-- HATCH3R-CLI-SKILL-GENERATED v1 -->
|
|
16
|
+
# ast-grep
|
|
17
|
+
|
|
18
|
+
Structural search and rewrite for code via AST patterns
|
|
19
|
+
|
|
20
|
+
## When to Use
|
|
21
|
+
|
|
22
|
+
Reach for `sg` when the task is in the **search** category and the agent would otherwise call an MCP tool or read large outputs into context.
|
|
23
|
+
|
|
24
|
+
## Token Cost
|
|
25
|
+
|
|
26
|
+
CLI tools return structured stdout that fits in <1KB for typical queries; equivalent MCP calls regularly exceed 10KB.
|
|
27
|
+
Reference: Anthropic engineering (Nov 4 2025) — code-execution-over-MCP yields 98.7% token reduction.
|
|
28
|
+
|
|
29
|
+
## Recipes
|
|
30
|
+
|
|
31
|
+
```bash
|
|
32
|
+
sg --pattern 'console.log($MSG)' --lang ts src/
|
|
33
|
+
```
|
|
34
|
+
Pattern with a meta-variable (`$MSG`): matches any single-argument `console.log` call regardless of whitespace or the argument's expression; use `$$$ARGS` to match any number of arguments.
|
|
35
|
+
|
|
36
|
+
```bash
|
|
37
|
+
sg run -p 'await $FN()' -r 'await ($FN()).catch(e => log(e))' --update-all src/
|
|
38
|
+
```
|
|
39
|
+
Structural rewrite: every bare `await $FN()` gains a `.catch` arm; `--update-all` writes in place.
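
Because `--update-all` writes to disk immediately, it can help to preview the rewrite first. The wrapper below is a sketch under assumptions: the `sg_rewrite` name and its `apply` argument are hypothetical, and it relies on `sg run` printing the proposed changes without modifying files when `--update-all` is omitted.

```shell
#!/bin/sh
# Hypothetical wrapper around `sg run`: preview by default, write only on request.
#   $1 = pattern, $2 = rewrite, $3 = path, $4 = "apply" to write changes (optional)
sg_rewrite() {
  if [ "${4:-preview}" = "apply" ]; then
    sg run -p "$1" -r "$2" --update-all "$3"   # rewrites files in place
  else
    sg run -p "$1" -r "$2" "$3"                # prints the proposed changes only
  fi
}

# Usage (not executed here):
#   sg_rewrite 'await $FN()' 'await ($FN()).catch(e => log(e))' src/
#   sg_rewrite 'await $FN()' 'await ($FN()).catch(e => log(e))' src/ apply
```

Running the preview form first gives a reviewable diff before the second call commits the change.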
|
|
40
|
+
|
|
41
|
+
```bash
|
|
42
|
+
sg scan --config sgconfig.yml
|
|
43
|
+
```
|
|
44
|
+
Runs the rule pack configured in `sgconfig.yml`: repo-pinned structural lints that keep matching through formatting changes that would break regex-based checks.
|
|
45
|
+
|
|
46
|
+
```bash
|
|
47
|
+
sg test --update-all
|
|
48
|
+
```
|
|
49
|
+
Snapshot-style tests for rules — keeps rule packs honest as the codebase shifts.
|
|
50
|
+
|
|
51
|
+
```bash
|
|
52
|
+
sg --pattern 'function $NAME($$$ARGS) { $$$BODY }' --lang ts --json src/
|
|
53
|
+
```
|
|
54
|
+
Triple-`$` captures the rest of an argument list or body — JSON output feeds `jq` for downstream filtering.
|
|
55
|
+
|
|
56
|
+
## Wrong Choice When
|
|
57
|
+
|
|
58
|
+
- Don't reach for `sg` when the target is plain literal text (a TODO marker, a string in CHANGELOG). Reach for `ripgrep` (`hatch3r-cli-ripgrep`) — orders of magnitude faster on raw matching.
|
|
59
|
+
- Don't use `sg` for cross-language SAST policy work (e.g., taint analysis). Reach for `semgrep`, which has rule packs, CI integrations, and a security-audit lineage.
|
|
60
|
+
- Don't reach for `sg` on languages it does not parse out of the box (Makefile, INI). The pattern compiler will reject the request; fall back to `ripgrep` + `sd`.
|
|
61
|
+
|
|
62
|
+
## Alternatives
|
|
63
|
+
|
|
64
|
+
| Tool | When to prefer |
|
|
65
|
+
|------|----------------|
|
|
66
|
+
| `ripgrep` (`hatch3r-cli-ripgrep`) | Literal regex over text — ast-grep is overkill if you do not need structural matching. |
|
|
67
|
+
| `semgrep` | Security/policy rule packs, multi-language SAST, central rule registry. |
|
|
68
|
+
| `comby` | Multi-language structural rewrites with template syntax and no per-language plugin. |
|
|
69
|
+
| Editor refactor / language server | Authoritative rename or extract-method with full type information. |
|
|
70
|
+
|
|
71
|
+
## Detection / Install
|
|
72
|
+
|
|
73
|
+
Verify with:
|
|
74
|
+
```bash
|
|
75
|
+
command -v sg
|
|
76
|
+
```
|
|
77
|
+
|
|
78
|
+
Install (mac):
|
|
79
|
+
|
|
80
|
+
```bash
|
|
81
|
+
# brew
|
|
82
|
+
brew install ast-grep
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
Homepage: https://ast-grep.github.io/
|
|
@@ -0,0 +1,89 @@
|
|
|
1
|
+
---
|
|
2
|
+
id: hatch3r-cli-az-devops
|
|
3
|
+
description: "Azure DevOps work items, repos, pipelines via az CLI extension. Use for Azure DevOps work-item edits, repo pushes, and pipeline runs; invoke `az`. Authenticates via the platform's native token mechanism (OAuth / PAT)."
|
|
4
|
+
tags: ["cli-tools", "forge"]
|
|
5
|
+
quality_charter: agents/shared/quality-charter.md
|
|
6
|
+
efficiency_patterns: agents/shared/efficiency-patterns.md
|
|
7
|
+
cache_friendly: true
|
|
8
|
+
cli_tool:
|
|
9
|
+
id: az-devops
|
|
10
|
+
bin: az
|
|
11
|
+
tier: 2
|
|
12
|
+
category: forge
|
|
13
|
+
homepage: https://learn.microsoft.com/en-us/cli/azure/azure-devops
|
|
14
|
+
---
|
|
15
|
+
<!-- HATCH3R-CLI-SKILL-GENERATED v1 -->
|
|
16
|
+
# az-devops
|
|
17
|
+
|
|
18
|
+
Azure DevOps work items, repos, pipelines via az CLI extension
|
|
19
|
+
|
|
20
|
+
## When to Use
|
|
21
|
+
|
|
22
|
+
Reach for `az` when the task is in the **forge** category and the agent would otherwise call an MCP tool or read large outputs into context.
|
|
23
|
+
|
|
24
|
+
## Token Cost
|
|
25
|
+
|
|
26
|
+
CLI tools return structured stdout that fits in <1KB for typical queries; equivalent MCP calls regularly exceed 10KB.
|
|
27
|
+
Reference: Anthropic engineering (Nov 4 2025) — code-execution-over-MCP yields 98.7% token reduction.
|
|
28
|
+
|
|
29
|
+
## Recipes
|
|
30
|
+
|
|
31
|
+
```bash
|
|
32
|
+
az repos pr list --status active --query '[].pullRequestId' --output tsv
|
|
33
|
+
```
|
|
34
|
+
Print active PR IDs as a newline-separated list; `--query` (JMESPath) trims the payload before stdout.
|
|
35
|
+
|
|
36
|
+
```bash
|
|
37
|
+
az repos pr show --id 42 --output json
|
|
38
|
+
```
|
|
39
|
+
Fetch a single PR's metadata as JSON for downstream `jq` filters.
|
|
40
|
+
|
|
41
|
+
```bash
|
|
42
|
+
az boards work-item show --id 4242 --output json
|
|
43
|
+
```
|
|
44
|
+
Pull a work item (bug, task, user story) by numeric ID; one round-trip, structured output.
|
|
45
|
+
|
|
46
|
+
```bash
|
|
47
|
+
az boards work-item create --type Bug --title 'flaky import test' --description 'Repro: ...'
|
|
48
|
+
```
|
|
49
|
+
Open a work item from CI or an agent; the created item is printed on stdout as JSON, including its new `id` (add `--query id --output tsv` to print only the ID).
|
|
50
|
+
|
|
51
|
+
```bash
|
|
52
|
+
az pipelines run --name CI --branch main
|
|
53
|
+
```
|
|
54
|
+
Queue a pipeline run on a named definition; the JSON response includes the run `id`, which can be used for polling.
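
Polling on the run ID could look like the sketch below. The `wait_for_run` helper, its default interval, and its attempt limit are assumptions for illustration; it relies on `az pipelines runs show` reporting a `status` of `completed` and a `result` field once the run finishes.

```shell
#!/bin/sh
# Hypothetical helper: poll a pipeline run until Azure DevOps reports it completed.
#   $1 = run (build) ID
#   $2 = seconds between polls (default 15)
#   $3 = maximum number of polls (default 40)
wait_for_run() {
  run_id=$1
  interval=${2:-15}
  attempts=${3:-40}
  while [ "$attempts" -gt 0 ]; do
    status=$(az pipelines runs show --id "$run_id" --query status --output tsv)
    if [ "$status" = "completed" ]; then
      # Print the final result (for example succeeded or failed) for the caller.
      az pipelines runs show --id "$run_id" --query result --output tsv
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  echo "run $run_id did not complete in time" >&2
  return 1
}

# Usage (not executed here):
#   run_id=$(az pipelines run --name CI --branch main --query id --output tsv)
#   wait_for_run "$run_id"
```

Bounding the number of polls keeps an agent from waiting indefinitely on a stuck run.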
|
|
55
|
+
|
|
56
|
+
```bash
|
|
57
|
+
az artifacts universal download --feed myfeed --name pkg --version 1.0.0 --path .
|
|
58
|
+
```
|
|
59
|
+
Fetch a Universal Package into the cwd — avoids the larger Azure Artifacts MCP equivalents.
|
|
60
|
+
|
|
61
|
+
## Wrong Choice When
|
|
62
|
+
|
|
63
|
+
- The repo is on GitHub — use `gh` (Tier 1); `az repos` will return 404s without a configured Azure project.
|
|
64
|
+
- The repo is on GitLab — use `glab` (Tier 2 sibling); same operations, native auth.
|
|
65
|
+
- You only need to download a public release asset — `curl` to the artifact URL is one hop.
|
|
66
|
+
|
|
67
|
+
## Alternatives
|
|
68
|
+
|
|
69
|
+
| Tool | When to prefer |
|
|
70
|
+
|------|----------------|
|
|
71
|
+
| `gh` | GitHub-hosted code or issues. |
|
|
72
|
+
| `glab` | GitLab-hosted code or issues. |
|
|
73
|
+
| `curl` + `AZURE_DEVOPS_PAT` | Endpoint not surfaced by `az devops`; need raw header control. |
|
|
74
|
+
|
|
75
|
+
## Detection / Install
|
|
76
|
+
|
|
77
|
+
Verify with:
|
|
78
|
+
```bash
|
|
79
|
+
command -v az
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
Install (mac):
|
|
83
|
+
|
|
84
|
+
```bash
|
|
85
|
+
# brew
|
|
86
|
+
brew install azure-cli && az extension add --name azure-devops
|
|
87
|
+
```
|
|
88
|
+
|
|
89
|
+
Homepage: https://learn.microsoft.com/en-us/cli/azure/azure-devops
|