npm - hatch3r - Versions diffs - 1.7.5 → 1.8.0 - Mend

hatch3r 1.7.5 → 1.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (75)

package/README.md +2 -2
package/agents/hatch3r-context-rules.md +22 -6
package/agents/hatch3r-creator.md +2 -1
package/agents/hatch3r-handoff-loader.md +1 -1
package/agents/hatch3r-implementer.md +8 -0
package/agents/hatch3r-learnings-loader.md +1 -1
package/agents/hatch3r-reviewer.md +2 -0
package/agents/shared/user-content-templates.md +31 -1
package/commands/hatch3r-agent-customize.md +4 -0
package/commands/hatch3r-api-spec.md +7 -0
package/commands/hatch3r-benchmark.md +7 -0
package/commands/hatch3r-board-fill.md +7 -0
package/commands/hatch3r-board-groom.md +4 -0
package/commands/hatch3r-board-init.md +51 -0
package/commands/hatch3r-board-pickup.md +8 -0
package/commands/hatch3r-board-refresh.md +4 -0
package/commands/hatch3r-board-shared.md +6 -6
package/commands/hatch3r-bug-plan.md +7 -0
package/commands/hatch3r-codebase-map.md +8 -0
package/commands/hatch3r-command-customize.md +4 -0
package/commands/hatch3r-context-health.md +5 -0
package/commands/hatch3r-create.md +57 -4
package/commands/hatch3r-debug.md +7 -0
package/commands/hatch3r-dep-audit.md +4 -0
package/commands/hatch3r-feature-plan.md +7 -0
package/commands/hatch3r-handoff.md +7 -0
package/commands/hatch3r-healthcheck.md +4 -0
package/commands/hatch3r-hooks.md +4 -0
package/commands/hatch3r-learn.md +16 -0
package/commands/hatch3r-migration-plan.md +7 -0
package/commands/hatch3r-onboard.md +7 -0
package/commands/hatch3r-pr-resolve.md +8 -1
package/commands/hatch3r-project-spec.md +8 -0
package/commands/hatch3r-quick-change.md +7 -0
package/commands/hatch3r-recipe.md +4 -0
package/commands/hatch3r-refactor-plan.md +7 -0
package/commands/hatch3r-release.md +5 -0
package/commands/hatch3r-revision.md +7 -0
package/commands/hatch3r-roadmap.md +8 -0
package/commands/hatch3r-rule-customize.md +4 -0
package/commands/hatch3r-security-audit.md +4 -0
package/commands/hatch3r-skill-customize.md +4 -0
package/commands/hatch3r-test-plan.md +7 -0
package/commands/hatch3r-workflow.md +9 -1
package/dist/cli/index.js +2600 -777
package/dist/cli/index.js.map +1 -1
package/package.json +8 -5
package/rules/hatch3r-agent-orchestration-detail.md +3 -0
package/rules/hatch3r-agent-orchestration-detail.mdc +3 -0
package/rules/hatch3r-agent-orchestration.md +25 -2
package/rules/hatch3r-agent-orchestration.mdc +25 -2
package/rules/hatch3r-iteration-summary.md +2 -0
package/rules/hatch3r-iteration-summary.mdc +2 -0
package/rules/hatch3r-observability-tracing-detail.md +7 -148
package/rules/hatch3r-observability-tracing-detail.mdc +6 -148
package/rules/hatch3r-observability-tracing.md +154 -6
package/rules/hatch3r-observability-tracing.mdc +154 -6
package/skills/hatch3r-agent-customize/SKILL.md +10 -0
package/skills/hatch3r-ai-feature/SKILL.md +2 -0
package/skills/hatch3r-api-spec/SKILL.md +68 -0
package/skills/hatch3r-cli-csvkit/SKILL.md +2 -2
package/skills/hatch3r-cli-duckdb/SKILL.md +3 -3
package/skills/hatch3r-cli-jq/SKILL.md +4 -0
package/skills/hatch3r-cli-miller/SKILL.md +2 -2
package/skills/hatch3r-cli-overview/SKILL.md +1 -1
package/skills/{hatch3r-cli-xsv → hatch3r-cli-qsv}/SKILL.md +20 -18
package/skills/hatch3r-cli-stagehand/SKILL.md +48 -16
package/skills/hatch3r-command-customize/SKILL.md +10 -0
package/skills/hatch3r-customize/SKILL.md +3 -0
package/skills/hatch3r-design-system-detect/SKILL.md +2 -0
package/skills/hatch3r-observability-verify/SKILL.md +4 -3
package/skills/hatch3r-reliability-verify/SKILL.md +2 -0
package/skills/hatch3r-rule-customize/SKILL.md +10 -0
package/skills/hatch3r-skill-customize/SKILL.md +10 -0
package/skills/hatch3r-ui-ux-verify/SKILL.md +2 -0

package/rules/hatch3r-observability-tracing.md CHANGED

@@ -1,16 +1,16 @@
 ---
 id: hatch3r-observability-tracing
 type: rule
-description: Distributed tracing and OpenTelemetry core conventions for the project
+description: Distributed tracing, OpenTelemetry conventions, and AI agent instrumentation for the project
 scope: conditional
-globs: "**/*trac*,**/*span*,**/*telemetry*,**/*otel*,**/observability/**,**/routes/**,**/handlers/**,**/services/**,**/api/**,**/middleware/**,**/controllers/**,**/lib/**"
+globs: "**/*trac*,**/*span*,**/*telemetry*,**/*otel*,**/*agent*,**/observability/**,**/routes/**,**/handlers/**,**/services/**,**/api/**,**/middleware/**,**/controllers/**,**/lib/**"
 tags: [devops]
 quality_charter: agents/shared/quality-charter.md
 cache_friendly: true
 ---
 # Observability -- Distributed Tracing & OpenTelemetry
-Core distributed tracing and OpenTelemetry conventions. For structured logging see `hatch3r-observability-logging`. For metrics, SLOs, alerting, and dashboards see `hatch3r-observability-metrics`. For AI agent instrumentation, tool call audit trails, and correlation ID patterns see `hatch3r-observability-tracing-detail`.
+Distributed tracing, OpenTelemetry semantic conventions, AI agent instrumentation, tool call audit trails, and correlation ID patterns. For structured logging see `hatch3r-observability-logging`. For metrics, SLOs, alerting, and dashboards see `hatch3r-observability-metrics`.
 ## Distributed Tracing
@@ -82,6 +82,154 @@ Every telemetry-producing service must declare resource attributes at startup:
 - Attribute values should be low-cardinality. Never use unbounded values (full URLs with query params, raw SQL) as attribute values.
 - Prefer semantic convention attributes over custom attributes. Prefix custom attributes with your project namespace (e.g., `myapp.feature.flag_key`).
-### AI Agent Semantic Conventions (Summary)
-Follow the [OpenTelemetry GenAI Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/) for AI/LLM agent instrumentation. Key attributes: `gen_ai.system`, `gen_ai.request.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`. For full attribute tables, code examples, tool call audit trails, and correlation ID patterns, see `hatch3r-observability-tracing-detail`.
+## AI Agent Instrumentation
+Follow the [OpenTelemetry GenAI Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/) for AI/LLM agent instrumentation.
+### GenAI Span Attributes
+Use these attributes on all spans representing interactions with generative AI models:
+| Attribute | Type | Description | Example |
+|-----------|------|-------------|---------|
+| `gen_ai.system` | string | GenAI provider system name | `openai`, `anthropic`, `azure_openai` |
+| `gen_ai.request.model` | string | Model name as specified in the request | `gpt-4o`, `claude-sonnet-4-20250514` |
+| `gen_ai.response.model` | string | Model name as returned in the response | `gpt-4o-2024-08-06` |
+| `gen_ai.request.max_tokens` | int | Maximum tokens requested for generation | `4096` |
+| `gen_ai.request.temperature` | float | Temperature parameter | `0.7` |
+| `gen_ai.response.finish_reasons` | string[] | Reasons the model stopped generating | `["stop"]`, `["length"]` |
+| `gen_ai.usage.input_tokens` | int | Tokens in the input/prompt | `1250` |
+| `gen_ai.usage.output_tokens` | int | Tokens in the generated output | `530` |
+- Always set `gen_ai.system` and `gen_ai.request.model` on every GenAI span.
+- Record `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` from the API response for cost dashboards.
+- Use `gen_ai.response.finish_reasons` to detect truncated outputs (`length`) and trigger re-prompting.
+### Agent Invocation Spans
+Instrument the full lifecycle of an agent invocation with a dedicated span. This span is the parent for all LLM calls, tool executions, and sub-agent delegations.
+- **Span name pattern:** `agent.{agent_name}.invoke`
+- **Required attributes:** `agent.id`, `agent.name`, `agent.parent_id`, `agent.task`, `agent.framework`
+- **Span events for state transitions:** `agent.planning`, `agent.tool_selection`, `agent.awaiting_human`, `agent.delegating`, `agent.completed`, `agent.error`
+```typescript
+const agentSpan = tracer.startSpan('agent.code_reviewer.invoke', {
+  attributes: {
+    'agent.id': invocationId,
+    'agent.name': 'code_reviewer',
+    'agent.parent_id': parentAgentId ?? '',
+    'agent.task': `review PR #${prNumber}`,
+    'agent.framework': 'custom',
+  },
+});
+agentSpan.addEvent('agent.planning');
+// ... agent reasoning and tool calls happen as child spans ...
+agentSpan.addEvent('agent.completed');
+agentSpan.end();
+```
+### Tool Call Spans
+Every tool invocation by an agent creates a child span of the agent invocation span.
+- **Span name pattern:** `tool.{tool_name}.execute`
+- **Required attributes:** `tool.name`, `tool.input_hash` (SHA-256), `tool.output_status`, `tool.duration_ms`, `tool.parameters_count`
+- Tool spans must be children of the invoking agent span. Set span status to `ERROR` when `tool.output_status` is `error` or `timeout`.
+- For tools performing I/O, create nested child spans using appropriate semantic conventions (`http.*`, `db.*`).
+```typescript
+const toolSpan = tracer.startSpan(
+  'tool.git_diff.execute',
+  { attributes: { 'tool.name': 'git_diff' } },
+  trace.setSpan(context.active(), agentSpan),
+);
+try {
+  const result = await tools.gitDiff(params);
+  toolSpan.setAttributes({
+    'tool.output_status': 'success',
+    'tool.duration_ms': performance.now() - startTime,
+    'tool.input_hash': hashInput(params),
+  });
+} catch (err) {
+  toolSpan.setAttributes({ 'tool.output_status': 'error' });
+  toolSpan.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
+  toolSpan.recordException(err);
+  throw err;
+} finally {
+  toolSpan.end();
+}
+```
+### LLM Request/Response Tracing
+- **Span name pattern:** `gen_ai.{operation}` (e.g., `gen_ai.chat`, `gen_ai.completion`)
+- **Token tracking:** Capture `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens`. Aggregate in metrics: Counter `gen_ai.tokens_total` with labels `{direction, model, agent_name}`, Histogram `gen_ai.request_duration_ms`.
+- **Model version tracking:** Record both `gen_ai.request.model` and `gen_ai.response.model` for drift detection.
+- **Retry spans:** Each retry attempt is a separate child span. Set `gen_ai.request.retries` on the final span. Record `http.response.status_code` on failed spans (429 vs 500+).
+- Never log raw prompt content or full model responses as span attributes. Use token counts for cost tracking and correlated logs for prompt debugging in non-production environments.
+- Sample GenAI spans at 50-100% in production (higher than general spans) because each call is expensive and low volume.
+### Tool Call Audit Trail
+Maintain a structured audit log for every tool invocation in agentic workflows, separate from tracing spans.
+| Field | Type | Description |
+|-------|------|-------------|
+| `tool.name` | string | Name of the tool invoked |
+| `tool.input_hash` | string | SHA-256 hash of tool input (never log raw input) |
+| `tool.output_status` | string | `success`, `error`, `timeout`, or `denied` |
+| `tool.duration_ms` | float | Execution time in milliseconds |
+| `agent.id` | string | ID of the invoking agent |
+| `agent.name` | string | Human-readable agent name |
+| `correlation.id` | string | Trace correlation ID |
+| `timestamp` | string | ISO 8601 timestamp |
+| `session.id` | string | Session identifier |
+- Log tool invocations at `info` level, failures at `error` level with `error.type` and `error.message`.
+- Aggregate tool call counts per agent per session for anomaly detection.
+- Retain audit logs for a minimum of 90 days.
+### Correlation IDs for Agent Workflows
+- Use UUIDv4 with workflow-type prefix: `{workflow-type}-{uuid}` (e.g., `agent-run-550e8400-...`).
+- Generate at the workflow entry point. Propagate to all sub-agents and tool calls.
+- Every log entry, span, and metric must include `correlation.id`.
+- Cross-process: propagate via `X-Correlation-ID` header alongside W3C Trace Context.
+- Use OpenTelemetry `SpanLink` for cross-workflow references (e.g., agent run triggered by CI event).
+```typescript
+import { randomUUID } from 'node:crypto';
+import { context, trace, SpanStatusCode } from '@opentelemetry/api';
+function generateCorrelationId(workflowType: string): string {
+  return `${workflowType}-${randomUUID()}`;
+}
+async function runAgentWorkflow(task: string): Promise<void> {
+  const correlationId = generateCorrelationId('agent-run');
+  const tracer = trace.getTracer('agent-orchestrator');
+  const rootSpan = tracer.startSpan('agent.orchestrator.invoke', {
+    attributes: {
+      'correlation.id': correlationId,
+      'agent.name': 'orchestrator',
+      'agent.task': task,
+    },
+  });
+  try {
+    await context.with(trace.setSpan(context.active(), rootSpan), async () => {
+      await delegateToSubAgent('code_reviewer', {
+        correlationId,
+        parentSpanId: rootSpan.spanContext().spanId,
+        task: 'review changes',
+      });
+    });
+  } catch (err) {
+    rootSpan.setStatus({ code: SpanStatusCode.ERROR, message: (err as Error).message });
+    rootSpan.recordException(err as Error);
+    throw err;
+  } finally {
+    rootSpan.end();
+  }
+}
+```
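
Note on the tool call span example in this hunk: it calls `hashInput(params)` to populate `tool.input_hash`, but the diff does not define that helper. The sketch below shows one way such a helper might look, assuming tool parameters are JSON-serializable. The `canonicalize` step and the choice to sort object keys are assumptions for illustration, not part of the hatch3r package.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: recursively sorts object keys so that logically equal
// inputs serialize to the same JSON string regardless of key order.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(canonicalize);
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = canonicalize(source[key]);
    }
    return sorted;
  }
  return value;
}

// Hypothetical implementation of the `hashInput` name used in the diff:
// returns a hex-encoded SHA-256 digest, so raw tool input never reaches a span.
function hashInput(params: unknown): string {
  const json = JSON.stringify(canonicalize(params)) ?? '';
  return createHash('sha256').update(json).digest('hex');
}
```

Under these assumptions, `hashInput({ a: 1, b: 2 })` and `hashInput({ b: 2, a: 1 })` produce the same 64-character hex string, which keeps `tool.input_hash` stable across call sites that build the same parameters in a different order.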

package/rules/hatch3r-observability-tracing.mdc CHANGED

@@ -1,11 +1,11 @@
 ---
-description: Distributed tracing and OpenTelemetry core conventions for the project
-globs: ["**/*trac*", "**/*span*", "**/*telemetry*", "**/*otel*", "**/observability/**", "**/routes/**", "**/handlers/**", "**/services/**", "**/api/**", "**/middleware/**", "**/controllers/**", "**/lib/**"]
+description: Distributed tracing, OpenTelemetry conventions, and AI agent instrumentation for the project
+globs: ["**/*trac*", "**/*span*", "**/*telemetry*", "**/*otel*", "**/*agent*", "**/observability/**", "**/routes/**", "**/handlers/**", "**/services/**", "**/api/**", "**/middleware/**", "**/controllers/**", "**/lib/**"]
 alwaysApply: false
 ---
 # Observability -- Distributed Tracing & OpenTelemetry
-Core distributed tracing and OpenTelemetry conventions. For structured logging see `hatch3r-observability-logging`. For metrics, SLOs, alerting, and dashboards see `hatch3r-observability-metrics`. For AI agent instrumentation, tool call audit trails, and correlation ID patterns see `hatch3r-observability-tracing-detail`.
+Distributed tracing, OpenTelemetry semantic conventions, AI agent instrumentation, tool call audit trails, and correlation ID patterns. For structured logging see `hatch3r-observability-logging`. For metrics, SLOs, alerting, and dashboards see `hatch3r-observability-metrics`.
 ## Distributed Tracing
@@ -77,6 +77,154 @@ Every telemetry-producing service must declare resource attributes at startup:
 - Attribute values should be low-cardinality. Never use unbounded values (full URLs with query params, raw SQL) as attribute values.
 - Prefer semantic convention attributes over custom attributes. Prefix custom attributes with your project namespace (e.g., `myapp.feature.flag_key`).
-### AI Agent Semantic Conventions (Summary)
-Follow the [OpenTelemetry GenAI Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/) for AI/LLM agent instrumentation. Key attributes: `gen_ai.system`, `gen_ai.request.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`. For full attribute tables, code examples, tool call audit trails, and correlation ID patterns, see `hatch3r-observability-tracing-detail`.
+## AI Agent Instrumentation
+Follow the [OpenTelemetry GenAI Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/) for AI/LLM agent instrumentation.
+### GenAI Span Attributes
+Use these attributes on all spans representing interactions with generative AI models:
+| Attribute | Type | Description | Example |
+|-----------|------|-------------|---------|
+| `gen_ai.system` | string | GenAI provider system name | `openai`, `anthropic`, `azure_openai` |
+| `gen_ai.request.model` | string | Model name as specified in the request | `gpt-4o`, `claude-sonnet-4-20250514` |
+| `gen_ai.response.model` | string | Model name as returned in the response | `gpt-4o-2024-08-06` |
+| `gen_ai.request.max_tokens` | int | Maximum tokens requested for generation | `4096` |
+| `gen_ai.request.temperature` | float | Temperature parameter | `0.7` |
+| `gen_ai.response.finish_reasons` | string[] | Reasons the model stopped generating | `["stop"]`, `["length"]` |
+| `gen_ai.usage.input_tokens` | int | Tokens in the input/prompt | `1250` |
+| `gen_ai.usage.output_tokens` | int | Tokens in the generated output | `530` |
+- Always set `gen_ai.system` and `gen_ai.request.model` on every GenAI span.
+- Record `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` from the API response for cost dashboards.
+- Use `gen_ai.response.finish_reasons` to detect truncated outputs (`length`) and trigger re-prompting.
+### Agent Invocation Spans
+Instrument the full lifecycle of an agent invocation with a dedicated span. This span is the parent for all LLM calls, tool executions, and sub-agent delegations.
+- **Span name pattern:** `agent.{agent_name}.invoke`
+- **Required attributes:** `agent.id`, `agent.name`, `agent.parent_id`, `agent.task`, `agent.framework`
+- **Span events for state transitions:** `agent.planning`, `agent.tool_selection`, `agent.awaiting_human`, `agent.delegating`, `agent.completed`, `agent.error`
+```typescript
+const agentSpan = tracer.startSpan('agent.code_reviewer.invoke', {
+  attributes: {
+    'agent.id': invocationId,
+    'agent.name': 'code_reviewer',
+    'agent.parent_id': parentAgentId ?? '',
+    'agent.task': `review PR #${prNumber}`,
+    'agent.framework': 'custom',
+  },
+});
+agentSpan.addEvent('agent.planning');
+// ... agent reasoning and tool calls happen as child spans ...
+agentSpan.addEvent('agent.completed');
+agentSpan.end();
+```
+### Tool Call Spans
+Every tool invocation by an agent creates a child span of the agent invocation span.
+- **Span name pattern:** `tool.{tool_name}.execute`
+- **Required attributes:** `tool.name`, `tool.input_hash` (SHA-256), `tool.output_status`, `tool.duration_ms`, `tool.parameters_count`
+- Tool spans must be children of the invoking agent span. Set span status to `ERROR` when `tool.output_status` is `error` or `timeout`.
+- For tools performing I/O, create nested child spans using appropriate semantic conventions (`http.*`, `db.*`).
+```typescript
+const toolSpan = tracer.startSpan(
+  'tool.git_diff.execute',
+  { attributes: { 'tool.name': 'git_diff' } },
+  trace.setSpan(context.active(), agentSpan),
+);
+try {
+  const result = await tools.gitDiff(params);
+  toolSpan.setAttributes({
+    'tool.output_status': 'success',
+    'tool.duration_ms': performance.now() - startTime,
+    'tool.input_hash': hashInput(params),
+  });
+} catch (err) {
+  toolSpan.setAttributes({ 'tool.output_status': 'error' });
+  toolSpan.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
+  toolSpan.recordException(err);
+  throw err;
+} finally {
+  toolSpan.end();
+}
+```
+### LLM Request/Response Tracing
+- **Span name pattern:** `gen_ai.{operation}` (e.g., `gen_ai.chat`, `gen_ai.completion`)
+- **Token tracking:** Capture `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens`. Aggregate in metrics: Counter `gen_ai.tokens_total` with labels `{direction, model, agent_name}`, Histogram `gen_ai.request_duration_ms`.
+- **Model version tracking:** Record both `gen_ai.request.model` and `gen_ai.response.model` for drift detection.
+- **Retry spans:** Each retry attempt is a separate child span. Set `gen_ai.request.retries` on the final span. Record `http.response.status_code` on failed spans (429 vs 500+).
+- Never log raw prompt content or full model responses as span attributes. Use token counts for cost tracking and correlated logs for prompt debugging in non-production environments.
+- Sample GenAI spans at 50-100% in production (higher than general spans) because each call is expensive and low volume.
+### Tool Call Audit Trail
+Maintain a structured audit log for every tool invocation in agentic workflows, separate from tracing spans.
+| Field | Type | Description |
+|-------|------|-------------|
+| `tool.name` | string | Name of the tool invoked |
+| `tool.input_hash` | string | SHA-256 hash of tool input (never log raw input) |
+| `tool.output_status` | string | `success`, `error`, `timeout`, or `denied` |
+| `tool.duration_ms` | float | Execution time in milliseconds |
+| `agent.id` | string | ID of the invoking agent |
+| `agent.name` | string | Human-readable agent name |
+| `correlation.id` | string | Trace correlation ID |
+| `timestamp` | string | ISO 8601 timestamp |
+| `session.id` | string | Session identifier |
+- Log tool invocations at `info` level, failures at `error` level with `error.type` and `error.message`.
+- Aggregate tool call counts per agent per session for anomaly detection.
+- Retain audit logs for a minimum of 90 days.
+### Correlation IDs for Agent Workflows
+- Use UUIDv4 with workflow-type prefix: `{workflow-type}-{uuid}` (e.g., `agent-run-550e8400-...`).
+- Generate at the workflow entry point. Propagate to all sub-agents and tool calls.
+- Every log entry, span, and metric must include `correlation.id`.
+- Cross-process: propagate via `X-Correlation-ID` header alongside W3C Trace Context.
+- Use OpenTelemetry `SpanLink` for cross-workflow references (e.g., agent run triggered by CI event).
+```typescript
+import { randomUUID } from 'node:crypto';
+import { context, trace, SpanStatusCode } from '@opentelemetry/api';
+function generateCorrelationId(workflowType: string): string {
+  return `${workflowType}-${randomUUID()}`;
+}
+async function runAgentWorkflow(task: string): Promise<void> {
+  const correlationId = generateCorrelationId('agent-run');
+  const tracer = trace.getTracer('agent-orchestrator');
+  const rootSpan = tracer.startSpan('agent.orchestrator.invoke', {
+    attributes: {
+      'correlation.id': correlationId,
+      'agent.name': 'orchestrator',
+      'agent.task': task,
+    },
+  });
+  try {
+    await context.with(trace.setSpan(context.active(), rootSpan), async () => {
+      await delegateToSubAgent('code_reviewer', {
+        correlationId,
+        parentSpanId: rootSpan.spanContext().spanId,
+        task: 'review changes',
+      });
+    });
+  } catch (err) {
+    rootSpan.setStatus({ code: SpanStatusCode.ERROR, message: (err as Error).message });
+    rootSpan.recordException(err as Error);
+    throw err;
+  } finally {
+    rootSpan.end();
+  }
+}
+```
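
Note on the "Tool Call Audit Trail" table in this hunk: the rule lists the audit fields but gives no record shape. The following is a minimal sketch of how those fields might be typed and assembled. The names `ToolAuditRecord`, `buildAuditRecord`, and `auditLogLevel` are hypothetical, and treating every non-`success` status as a failure for log-level purposes is an assumption, since the rule only says failures are logged at `error`.

```typescript
// Field names mirror the audit trail table; the type itself is illustrative.
type ToolOutputStatus = 'success' | 'error' | 'timeout' | 'denied';

interface ToolAuditRecord {
  'tool.name': string;
  'tool.input_hash': string; // SHA-256 hex digest, never the raw input
  'tool.output_status': ToolOutputStatus;
  'tool.duration_ms': number;
  'agent.id': string;
  'agent.name': string;
  'correlation.id': string;
  'session.id': string;
  timestamp: string; // ISO 8601
}

// Hypothetical builder: stamps the record with an ISO 8601 timestamp.
// `now` is injectable so the function stays deterministic in tests.
function buildAuditRecord(
  fields: Omit<ToolAuditRecord, 'timestamp'>,
  now: Date = new Date(),
): ToolAuditRecord {
  return { ...fields, timestamp: now.toISOString() };
}

// Assumption: anything other than `success` counts as a failure.
function auditLogLevel(status: ToolOutputStatus): 'info' | 'error' {
  return status === 'success' ? 'info' : 'error';
}
```

A caller would pass the resulting record to whatever structured logger the project uses, at the level returned by `auditLogLevel`.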

package/skills/hatch3r-agent-customize/SKILL.md CHANGED

@@ -5,9 +5,19 @@ tags: [customize]
 quality_charter: agents/shared/quality-charter.md
 efficiency_patterns: agents/shared/efficiency-patterns.md
 cache_friendly: true
+redirect_to: hatch3r-customize
 ---
 # Agent Customization
 > **This skill has been consolidated.** Use the `hatch3r-customize` skill with `type: agent`.
 For agent-specific reference (model resolution, protected agents, YAML schema), see the `hatch3r-agent-customize` command.
+## Rejected Merge Alternative (D16.3 add-vs-remove bias)
+Per `governance/audit/domains/D16-compound-system.md` SA 16.3, the default recommendation on functional overlap is MERGE rather than removal. Full deletion of this redirect file was rejected for two reasons:
+1. **Preserves UX entry points.** Users typed `/h4tcher-agent-customize` or referenced the id `hatch3r-agent-customize` (per CHANGELOG.md, `website/docs/reference/configuration.md:325`, `docs/model-selection.md:158`) before consolidation. Deleting the id breaks those entry points without a redirect target.
+2. **Signals umbrella canonicality.** The `redirect_to: hatch3r-customize` frontmatter field marks `hatch3r-customize` as the single source of truth — tooling, audit scans, and adapters can resolve any redirect to the canonical without re-reading body prose.
+The 13-LOC redirect cost is paid once per type; the umbrella body lives in `skills/hatch3r-customize/SKILL.md`.
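
Note on the `redirect_to` field added in this hunk: the text says tooling can resolve a redirect to the canonical skill from frontmatter alone. A rough sketch of what that resolution could look like follows, assuming frontmatter has already been parsed into objects keyed by skill id. `SkillFrontmatter` and `resolveCanonical` are hypothetical names, and the cycle check is an added safeguard rather than documented hatch3r behavior.

```typescript
// Illustrative subset of a skill's parsed frontmatter.
interface SkillFrontmatter {
  id: string;
  redirect_to?: string;
}

// Follows `redirect_to` links until it reaches a skill without one.
// Throws on unknown ids and on redirect cycles.
function resolveCanonical(
  id: string,
  skills: Map<string, SkillFrontmatter>,
): string {
  const seen = new Set<string>();
  let current = id;
  for (;;) {
    if (seen.has(current)) {
      throw new Error(`redirect cycle detected at ${current}`);
    }
    seen.add(current);
    const skill = skills.get(current);
    if (skill === undefined) {
      throw new Error(`unknown skill id: ${current}`);
    }
    if (skill.redirect_to === undefined) {
      return current;
    }
    current = skill.redirect_to;
  }
}
```

With a map containing `hatch3r-agent-customize` (redirecting to `hatch3r-customize`) and `hatch3r-customize` itself, resolving either id returns `hatch3r-customize`.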

package/skills/hatch3r-ai-feature/SKILL.md CHANGED

@@ -4,6 +4,8 @@ type: skill
 description: Eval-driven development workflow for shipping AI features — write eval before prompt, measure, iterate, ship with caching + cost telemetry + model fallback + hallucination SLI
 tags: [implementation, ai]
 quality_charter: agents/shared/quality-charter.md
+efficiency_patterns: agents/shared/efficiency-patterns.md
+cache_friendly: true
 ---
 # AI Feature Workflow (Eval-Driven)

package/skills/hatch3r-api-spec/SKILL.md CHANGED

@@ -20,6 +20,7 @@ Task Progress:
 - [ ] Step 3: Validate schemas
 - [ ] Step 4: Generate documentation
 - [ ] Step 5: Verify spec accuracy
+- [ ] Step 6: Wire oasdiff breaking-change CI gate
 ```
 ## Step 0 — Detect Ambiguity (P8 B1)
@@ -66,6 +67,72 @@ Before any work, scan the invocation for unresolved questions in scope, intent,
 - Check that path parameters, query parameters, and headers are documented with accurate types, required flags, and example values.
 - Validate against any existing API consumers (SDKs, frontend clients) for breaking changes.
+## Step 6: Wire `oasdiff` Breaking-Change CI Gate
+Breaking changes on stable endpoints must trip CI before merge. This step enforces the CONSTITUTION §2 P5 lean-thresholds row "API breaking-change events on stable endpoints = 0 per release" (governance/CONSTITUTION.md:80, verified by `oasdiff / buf breaking / graphql-inspector CI gate`).
+### 6.1 Install `oasdiff`
+Pick one of two install paths:
+- npm global (CI runner with Node 22+): `npm i -g @tufin/oasdiff`
+- Docker image (no Node dependency): `docker run --rm -t -v $(pwd):/specs tufin/oasdiff <subcommand>`
+Pin the version in CI (e.g., `npm i -g @tufin/oasdiff@1.10.x` or `tufin/oasdiff:1.10`) so a new release of oasdiff does not change gate semantics mid-cycle.
+### 6.2 Compare current spec vs previous merged version
+The gate compares the spec on the feature branch against the spec at the merge base on the default branch. Fail CI on any breaking change to a stable endpoint; report non-breaking diffs as informational.
+- Fetch the base ref's spec into a temp path (e.g., `git show origin/main:openapi.yaml > /tmp/openapi.base.yaml`).
+- Run `oasdiff breaking /tmp/openapi.base.yaml ./openapi.yaml --fail-on ERR` — exit code 1 when one or more `ERR`-level breaking changes are detected.
+- Scope the gate to stable endpoints by excluding paths tagged `x-stability: experimental` via `--match-path` or by maintaining an `oasdiff-ignore.yaml` rules file for documented breaking changes already coordinated with consumers.
+### 6.3 Example GitHub Actions step
+```yaml
+name: API Breaking-Change Gate
+on:
+  pull_request:
+    paths:
+      - 'openapi.yaml'
+      - 'openapi.json'
+      - 'docs/api/**'
+jobs:
+  oasdiff:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '22'
+      - name: Install oasdiff
+        run: npm i -g @tufin/oasdiff@1.10.x
+      - name: Resolve base spec
+        run: |
+          git show origin/${{ github.base_ref }}:openapi.yaml > /tmp/openapi.base.yaml
+      - name: Run breaking-change diff
+        run: |
+          oasdiff breaking /tmp/openapi.base.yaml ./openapi.yaml \
+            --fail-on ERR \
+            --format githubactions
+```
+The `--format githubactions` flag emits `::error::` annotations so each breaking change shows up inline on the PR diff.
+### 6.4 Handling an intentional breaking change
+When a breaking change is deliberate (versioned endpoint cut, deprecated field removed after the documented sunset window):
+1. Add a row to `oasdiff-ignore.yaml` with the change ID, the affected operation, and a link to the consumer-coordination record.
+2. Bump the spec `info.version` in line with the project's API versioning policy (semver-major for breaking changes on stable endpoints).
+3. Document the change in CHANGELOG (or equivalent) with the migration path for downstream consumers.
+The gate stays green only because the change is recorded — not because the breaking signal was silenced.
 ## Error Handling
 - **Route definitions use dynamic or meta-programmed patterns**: If endpoints are generated at runtime or via decorators that resist static analysis, document the gap and manually enumerate the missing endpoints.
@@ -79,3 +146,4 @@ Before any work, scan the invocation for unresolved questions in scope, intent,
 - [ ] Spec passes linter validation
 - [ ] Example requests/responses included
 - [ ] No breaking changes to existing API consumers
+- [ ] `oasdiff breaking` CI gate is wired and fails on any `ERR`-level breaking change on stable endpoints (CONSTITUTION §2 P5: 0 per release)
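
Note on step 6.2 in this hunk: the gate is scoped to stable endpoints by excluding paths tagged `x-stability: experimental`. As a sketch of how a project might enumerate the stable paths before configuring the gate, the function below filters an already-parsed OpenAPI document. It assumes the `x-stability` extension sits on the path item itself. `OpenApiDoc`, `PathItem`, and `stablePaths` are hypothetical names, and how the resulting list is passed to `oasdiff` is left to the project.

```typescript
// Minimal illustrative shapes; a real spec carries many more fields.
interface PathItem {
  'x-stability'?: string;
  [key: string]: unknown;
}

interface OpenApiDoc {
  paths?: Record<string, PathItem>;
}

// Returns the sorted list of paths that are not marked experimental.
// Paths with no `x-stability` extension are treated as stable (an assumption).
function stablePaths(doc: OpenApiDoc): string[] {
  const paths = doc.paths ?? {};
  return Object.keys(paths)
    .filter((path) => paths[path]['x-stability'] !== 'experimental')
    .sort();
}
```

Listing stable paths explicitly also gives reviewers a quick way to confirm which endpoints the "0 per release" threshold actually covers.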

package/skills/hatch3r-cli-csvkit/SKILL.md CHANGED

@@ -57,14 +57,14 @@ Run SQL directly against a CSV using an in-memory SQLite — no schema file requ
 - **Files larger than ~1M rows:** csvkit is Python-startup-heavy; `hatch3r-cli-duckdb` (tier 2) loads and queries the same file in a fraction of the time.
 - **Production SQL workloads:** csvsql is convenient but evaluates against in-memory SQLite — use a real database for anything served.
-- **Single-column slice or count under a few hundred MB:** `hatch3r-cli-xsv` (tier 2) is faster with lower memory pressure.
+- **Single-column slice or count under a few hundred MB:** `hatch3r-cli-qsv` (tier 2) is faster with lower memory pressure.
 ## Alternatives
 | Tool | When to prefer |
 |------|----------------|
 | `hatch3r-cli-duckdb` (tier 2) | Large files, analytical SQL, Parquet, multi-file joins |
-| `hatch3r-cli-xsv` (tier 2) | Fast column slicing, sampling, deduping |
+| `hatch3r-cli-qsv` (tier 2) | Fast column slicing, sampling, deduping |
 | `hatch3r-cli-miller` (tier 3) | Streaming put/filter DSL, format conversion |
 ## Detection / Install

package/skills/hatch3r-cli-duckdb/SKILL.md CHANGED

@@ -36,7 +36,7 @@ Count rows across a Parquet glob — no schema declaration, no import step.
 ```bash
 duckdb -c "COPY (SELECT * FROM 'in.csv' WHERE active) TO 'out.parquet' (FORMAT PARQUET)"
 ```
-Filter a CSV and emit columnar Parquet in one pass; ideal for downstream `xsv`/`jq` chains.
+Filter a CSV and emit columnar Parquet in one pass; ideal for downstream `qsv`/`jq` chains.
 ```bash
 duckdb -c "ATTACH 'app.sqlite' AS sqlite; SELECT * FROM sqlite.users LIMIT 10"
@@ -55,7 +55,7 @@ Aggregate over a CSV directory; DuckDB streams the read so memory stays bounded.
 ## Wrong Choice When
-- The CSV has <10k rows and you only need to slice/select columns — `xsv` (Tier 2 sibling) starts faster and has no install dependency in many environments.
+- The CSV has <10k rows and you only need to slice/select columns — `qsv` (Tier 2 sibling) starts faster and has no install dependency in many environments.
 - The workload is transactional (writes from multiple clients, ACID across rows) — use SQLite or Postgres; DuckDB is read-optimized OLAP.
 - A single `jq` filter would do the job (the data is already JSON, the operation is field extraction) — skip the SQL detour.
@@ -63,7 +63,7 @@ Aggregate over a CSV directory; DuckDB streams the read so memory stays bounded.
 | Tool | When to prefer |
 |------|----------------|
-| `xsv` | Single CSV file, <100MB, just need slice/select/sort. |
+| `qsv` | Single CSV file, <100MB, just need slice/select/sort. |
 | `sqlite3` | Need OLTP writes or row-level updates rather than analytics. |
 | `python -m pandas` | Already in a Python script and the data fits in memory. |

package/skills/hatch3r-cli-jq/SKILL.md CHANGED

@@ -68,6 +68,10 @@ Compact (`-c`) one-object-per-line projection — perfect input for `xargs -L1`
 | `dasel` | Single binary across JSON/YAML/TOML/XML with a path-query DSL — handy in CI where you do not want jq+yq. |
 | `fx` | Interactive JSON browsing in a TTY; jq is the right call in scripts. |
+## Known Issues
+- **CVE-2026-32316 (active, no tagged fix as of 2026-05-18):** jq 1.8.1 ships with a heap buffer overflow in expression evaluation. Six additional CVEs were disclosed 2026-04-15; patches are committed on `jqlang/jq` `main` but no superseding tagged release exists yet. Do not invoke `jq` on JSON sourced from an untrusted producer (third-party API webhook, user-supplied upload) until a tagged release past 1.8.1 lands. Reference: https://github.com/jqlang/jq/security/advisories.
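
One way to act on this note is to check the installed version before feeding `jq` untrusted input. The sketch below is illustrative only: `jq_needs_caution` is a hypothetical helper, and treating every version at or below 1.8.1 as unpatched is an assumption, because the affected range is not stated here.

```bash
# Hypothetical helper: exit 0 when the given version string (as printed by
# `jq --version`, e.g. "jq-1.8.1") is 1.8.1 or older, exit 1 when it is newer.
# Assumption: anything not newer than 1.8.1 is treated as unpatched.
jq_needs_caution() {
  ver="${1#jq-}"                       # "jq-1.8.1" -> "1.8.1"
  major="${ver%%.*}"
  rest="${ver#*.}"
  minor="${rest%%.*}"
  case "$rest" in
    *.*) patch="${rest#*.}" ;;
    *)   patch=0 ;;
  esac
  # Drop any non-numeric suffix (e.g. "1-dirty" -> "1") and default to 0.
  major="${major%%[!0-9]*}"; major="${major:-0}"
  minor="${minor%%[!0-9]*}"; minor="${minor:-0}"
  patch="${patch%%[!0-9]*}"; patch="${patch:-0}"

  if [ "$major" -lt 1 ]; then return 0; fi
  if [ "$major" -gt 1 ]; then return 1; fi
  if [ "$minor" -lt 8 ]; then return 0; fi
  if [ "$minor" -gt 8 ]; then return 1; fi
  [ "$patch" -le 1 ]
}

# Warn only when jq is installed and not newer than 1.8.1.
if command -v jq >/dev/null 2>&1 && jq_needs_caution "$(jq --version)"; then
  echo "jq is at or below 1.8.1: avoid untrusted JSON input" >&2
fi
```

The version is parsed with plain parameter expansion so the check carries no dependency beyond the shell itself.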
 ## Detection / Install
 Verify with:

package/skills/hatch3r-cli-miller/SKILL.md CHANGED

@@ -56,7 +56,7 @@ SQL-style join on `id` between two CSVs, streamed.
 ## Wrong Choice When
 - **Multi-gigabyte analytical queries with joins:** `hatch3r-cli-duckdb` (tier 2) has a query planner and parallel scan; mlr is streaming-single-thread.
-- **One-column slice or count:** `hatch3r-cli-xsv` (tier 2) is faster for trivial slicing.
+- **One-column slice or count:** `hatch3r-cli-qsv` (tier 2) is faster for trivial slicing.
 - **Production ETL with schema enforcement:** use a real database or dbt — mlr is a CLI-scratchpad tool.
 ## Alternatives
@@ -64,7 +64,7 @@ SQL-style join on `id` between two CSVs, streamed.
 | Tool | When to prefer |
 |------|----------------|
 | `hatch3r-cli-duckdb` (tier 2) | Multi-GB data, joins, analytical SQL, Parquet |
-| `hatch3r-cli-xsv` (tier 2) | Single-column slice, count, sample on plain CSV |
+| `hatch3r-cli-qsv` (tier 2) | Single-column slice, count, sample on plain CSV |
 | `hatch3r-cli-csvkit` (tier 3) | SQL-over-CSV with `csvsql`, Python integration |
 ## Detection / Install

package/skills/hatch3r-cli-overview/SKILL.md CHANGED

@@ -40,7 +40,7 @@ hatch3r recommends a small set of terminal-native CLI tools agents can call inst
 | `llm` | `hatch3r-cli-llm` | simonw/llm — invoke LLMs from the command line with prompt templates |
 | `playwright` | `hatch3r-cli-playwright` | Browser automation, web testing, and UI interaction |
 | `taplo` | `hatch3r-cli-taplo` | TOML toolkit (format, lint, query) for pyproject.toml / Cargo.toml |
-| `xsv` | `hatch3r-cli-xsv` | Fast CSV toolkit (slice, search, join, stats) |
+| `qsv` | `hatch3r-cli-qsv` | Fast CSV toolkit (slice, search, join, stats, 80+ commands) — actively-maintained xsv successor |
 ## Tier 3 — opt-in advanced

package/skills/{hatch3r-cli-xsv → hatch3r-cli-qsv}/SKILL.md RENAMED

@@ -1,25 +1,27 @@
 ---
-id: hatch3r-cli-xsv
-description: "Fast CSV toolkit (slice, search, join, stats). Use when slicing huge CSV documents by row range or column without materialising the dataset; invoke `xsv`. Streams records lazily; works on datasets that exceed available RAM."
+id: hatch3r-cli-qsv
+description: "Fast CSV toolkit (slice, search, join, stats, 80+ commands) — actively-maintained xsv successor. Use when slicing huge CSV documents by row range or column without materialising the dataset; invoke `qsv`. Streams records lazily; works on datasets that exceed available RAM."
 tags: ["cli-tools", "data"]
 quality_charter: agents/shared/quality-charter.md
 efficiency_patterns: agents/shared/efficiency-patterns.md
 cache_friendly: true
 cli_tool:
-  id: xsv
-  bin: xsv
+  id: qsv
+  bin: qsv
   tier: 2
   category: data
-  homepage: https://github.com/BurntSushi/xsv
+  homepage: https://github.com/jqnatividad/qsv
 ---
 <!-- HATCH3R-CLI-SKILL-GENERATED v1 -->
-# xsv
+# qsv
-Fast CSV toolkit (slice, search, join, stats)
+Fast CSV toolkit (slice, search, join, stats, 80+ commands) — actively-maintained xsv successor
 ## When to Use
-Reach for `xsv` when the task is in the **data** category and the agent would otherwise call an MCP tool or read large outputs into context.
+Reach for `qsv` when the task is in the **data** category and the agent would otherwise call an MCP tool or read large outputs into context.
+`qsv` is a drop-in superset of `xsv` — every `xsv` sub-command name and flag works under `qsv`, plus 50+ additional commands (`apply`, `fetch`, `validate`, `tojsonl`, `sqlp`, etc.). The upstream `BurntSushi/xsv` repository was archived on 2025-04-24; `jqnatividad/qsv` is the active fork with regular releases.
 ## Token Cost
@@ -29,39 +31,39 @@ Reference: Anthropic engineering (Nov 4 2025) — code-execution-over-MCP yields
 ## Recipes
 ```bash
-xsv stats huge.csv
+qsv stats huge.csv
 ```
 Per-column min/max/mean/stddev/cardinality — single streaming pass over the file.
 ```bash
-xsv select name,email,active records.csv
+qsv select name,email,active records.csv
 ```
 Project a subset of columns without rewriting; output stays CSV for downstream tools.
 ```bash
-xsv sort -s amount records.csv | xsv slice -e 100
+qsv sort -s amount records.csv | qsv slice -e 100
 ```
 Sort by `amount` then take the first 100 rows — composable pipe; both stages stream.
 ```bash
-xsv frequency -s status events.csv
+qsv frequency -s status events.csv
 ```
 Tabulate value counts for a column; output is itself CSV, parsable by the next step.
 ```bash
-xsv search -s email '@example\.com$' users.csv
+qsv search -s email '@example\.com$' users.csv
 ```
 Regex-filter a column — much cheaper than loading the whole file into a SQL engine.
 ```bash
-xsv join id orders.csv id customers.csv > joined.csv
+qsv join id orders.csv id customers.csv > joined.csv
 ```
 Hash join two CSVs on a common column without spinning up DuckDB.
 ## Wrong Choice When
 - The query needs aggregation across millions of rows or multiple files — DuckDB (Tier 2 sibling) is built for that scan plan.
-- You need a multi-way join with type coercion or window functions — `xsv join` is hash-only and untyped; use DuckDB.
+- You need a multi-way join with type coercion or window functions — `qsv join` is hash-only and untyped; use DuckDB.
 - The data is JSON or Parquet, not CSV — pipe through `jq`/DuckDB instead of CSV-converting first.
 ## Alternatives
@@ -76,14 +78,14 @@ Hash join two CSVs on a common column without spinning up DuckDB.
 Verify with:
 ```bash
-command -v xsv
+command -v qsv
 ```
 Install (mac):
 ```bash
 # brew
-brew install xsv
+brew install qsv
 ```
-Homepage: https://github.com/BurntSushi/xsv
+Homepage: https://github.com/jqnatividad/qsv
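
During the rename, a script may run in environments that still have only `xsv` installed. The detection step could be extended as in this minimal sketch; `first_available` is a hypothetical helper and not part of the skill, and the fallback to `xsv` is an assumption about how a caller might handle the transition.

```bash
# Hypothetical helper: print the first candidate binary found on PATH,
# or return non-zero when none of them is installed.
first_available() {
  for bin in "$@"; do
    if command -v "$bin" >/dev/null 2>&1; then
      echo "$bin"
      return 0
    fi
  done
  return 1
}

# Prefer qsv; fall back to xsv only if an older environment still has it.
if csv_tool=$(first_available qsv xsv); then
  echo "CSV toolkit: $csv_tool"
else
  echo "no CSV toolkit found; on macOS: brew install qsv" >&2
fi
```

Because the helper takes its candidates as arguments, the same function can probe for any of the other tools listed in the overview table.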