hatch3r 1.1.0 → 1.3.0
- package/README.md +109 -364
- package/agents/hatch3r-a11y-auditor.md +8 -8
- package/agents/hatch3r-architect.md +2 -4
- package/agents/hatch3r-ci-watcher.md +2 -4
- package/agents/hatch3r-context-rules.md +2 -4
- package/agents/hatch3r-dependency-auditor.md +5 -7
- package/agents/hatch3r-devops.md +2 -4
- package/agents/hatch3r-docs-writer.md +2 -4
- package/agents/hatch3r-fixer.md +2 -0
- package/agents/hatch3r-implementer.md +32 -0
- package/agents/hatch3r-learnings-loader.md +189 -13
- package/agents/hatch3r-lint-fixer.md +3 -14
- package/agents/hatch3r-perf-profiler.md +2 -4
- package/agents/hatch3r-researcher.md +247 -0
- package/agents/hatch3r-reviewer.md +76 -7
- package/agents/hatch3r-security-auditor.md +4 -7
- package/agents/hatch3r-test-writer.md +3 -11
- package/agents/modes/architecture.md +44 -0
- package/agents/modes/boundary-analysis.md +45 -0
- package/agents/modes/codebase-impact.md +81 -0
- package/agents/modes/complexity-risk.md +40 -0
- package/agents/modes/coverage-analysis.md +44 -0
- package/agents/modes/current-state.md +52 -0
- package/agents/modes/feature-design.md +39 -0
- package/agents/modes/impact-analysis.md +45 -0
- package/agents/modes/library-docs.md +31 -0
- package/agents/modes/migration-path.md +55 -0
- package/agents/modes/prior-art.md +31 -0
- package/agents/modes/refactoring-strategy.md +55 -0
- package/agents/modes/regression.md +45 -0
- package/agents/modes/requirements-elicitation.md +68 -0
- package/agents/modes/risk-assessment.md +41 -0
- package/agents/modes/risk-prioritization.md +43 -0
- package/agents/modes/root-cause.md +39 -0
- package/agents/modes/similar-implementation.md +70 -0
- package/agents/modes/symptom-trace.md +39 -0
- package/agents/modes/test-pattern.md +61 -0
- package/agents/shared/external-knowledge.md +11 -0
- package/commands/board/pickup-azure-devops.md +81 -0
- package/commands/board/pickup-delegation-multi.md +197 -0
- package/commands/board/pickup-delegation.md +100 -0
- package/commands/board/pickup-github.md +82 -0
- package/commands/board/pickup-gitlab.md +81 -0
- package/commands/board/pickup-modes.md +143 -0
- package/commands/board/pickup-post-impl.md +120 -0
- package/commands/board/shared-azure-devops.md +149 -0
- package/commands/board/shared-board-overview.md +215 -0
- package/commands/board/shared-github.md +169 -0
- package/commands/board/shared-gitlab.md +142 -0
- package/commands/hatch3r-agent-customize.md +3 -2
- package/commands/hatch3r-api-spec.md +1 -0
- package/commands/hatch3r-benchmark.md +1 -0
- package/commands/hatch3r-board-fill.md +15 -16
- package/commands/hatch3r-board-groom.md +50 -10
- package/commands/hatch3r-board-init.md +1 -0
- package/commands/hatch3r-board-pickup.md +44 -572
- package/commands/hatch3r-board-refresh.md +31 -10
- package/commands/hatch3r-board-shared.md +87 -439
- package/commands/hatch3r-bug-plan.md +1 -0
- package/commands/hatch3r-codebase-map.md +1 -0
- package/commands/hatch3r-command-customize.md +1 -0
- package/commands/hatch3r-context-health.md +23 -2
- package/commands/hatch3r-cost-tracking.md +15 -0
- package/commands/hatch3r-debug.md +1 -0
- package/commands/hatch3r-dep-audit.md +2 -1
- package/commands/hatch3r-feature-plan.md +1 -0
- package/commands/hatch3r-healthcheck.md +2 -1
- package/commands/hatch3r-hooks.md +1 -0
- package/commands/hatch3r-learn.md +69 -2
- package/commands/hatch3r-migration-plan.md +1 -0
- package/commands/hatch3r-onboard.md +1 -0
- package/commands/hatch3r-project-spec.md +1 -0
- package/commands/hatch3r-quick-change.md +1 -0
- package/commands/hatch3r-recipe.md +1 -0
- package/commands/hatch3r-refactor-plan.md +1 -0
- package/commands/hatch3r-release.md +2 -1
- package/commands/hatch3r-revision.md +1 -0
- package/commands/hatch3r-roadmap.md +8 -1
- package/commands/hatch3r-rule-customize.md +1 -0
- package/commands/hatch3r-security-audit.md +2 -1
- package/commands/hatch3r-skill-customize.md +1 -0
- package/commands/hatch3r-test-plan.md +532 -0
- package/commands/hatch3r-workflow.md +1 -0
- package/dist/cli/index.js +4735 -1426
- package/dist/cli/index.js.map +1 -1
- package/github-agents/hatch3r-docs-agent.md +1 -0
- package/github-agents/hatch3r-lint-agent.md +1 -0
- package/github-agents/hatch3r-security-agent.md +1 -0
- package/github-agents/hatch3r-test-agent.md +1 -0
- package/hooks/hatch3r-ci-failure.md +1 -0
- package/hooks/hatch3r-file-save.md +1 -0
- package/hooks/hatch3r-post-merge.md +1 -0
- package/hooks/hatch3r-pre-commit.md +1 -0
- package/hooks/hatch3r-pre-push.md +1 -0
- package/hooks/hatch3r-session-start.md +1 -0
- package/package.json +2 -2
- package/prompts/hatch3r-bug-triage.md +1 -0
- package/prompts/hatch3r-code-review.md +1 -0
- package/prompts/hatch3r-pr-description.md +1 -0
- package/rules/hatch3r-accessibility-standards.md +1 -0
- package/rules/hatch3r-agent-orchestration.md +289 -73
- package/rules/hatch3r-api-design.md +1 -0
- package/rules/hatch3r-browser-verification.md +1 -0
- package/rules/hatch3r-ci-cd.md +1 -0
- package/rules/hatch3r-code-standards.md +9 -0
- package/rules/hatch3r-component-conventions.md +1 -0
- package/rules/hatch3r-data-classification.md +1 -0
- package/rules/hatch3r-deep-context.md +1 -0
- package/rules/hatch3r-dependency-management.md +13 -0
- package/rules/hatch3r-feature-flags.md +1 -0
- package/rules/hatch3r-git-conventions.md +1 -0
- package/rules/hatch3r-i18n.md +1 -0
- package/rules/hatch3r-learning-consult.md +1 -0
- package/rules/hatch3r-migrations.md +12 -0
- package/rules/hatch3r-observability.md +290 -0
- package/rules/hatch3r-performance-budgets.md +1 -0
- package/rules/hatch3r-secrets-management.md +1 -0
- package/rules/hatch3r-security-patterns.md +12 -0
- package/rules/hatch3r-testing.md +1 -0
- package/rules/hatch3r-theming.md +1 -0
- package/rules/hatch3r-tooling-hierarchy.md +1 -0
- package/skills/hatch3r-a11y-audit/SKILL.md +1 -0
- package/skills/hatch3r-agent-customize/SKILL.md +1 -0
- package/skills/hatch3r-api-spec/SKILL.md +1 -0
- package/skills/hatch3r-architecture-review/SKILL.md +1 -0
- package/skills/hatch3r-bug-fix/SKILL.md +1 -0
- package/skills/hatch3r-ci-pipeline/SKILL.md +1 -0
- package/skills/hatch3r-command-customize/SKILL.md +1 -0
- package/skills/hatch3r-context-health/SKILL.md +1 -0
- package/skills/hatch3r-cost-tracking/SKILL.md +1 -0
- package/skills/hatch3r-dep-audit/SKILL.md +2 -1
- package/skills/hatch3r-feature/SKILL.md +1 -0
- package/skills/hatch3r-gh-agentic-workflows/SKILL.md +1 -0
- package/skills/hatch3r-incident-response/SKILL.md +1 -0
- package/skills/hatch3r-issue-workflow/SKILL.md +1 -0
- package/skills/hatch3r-logical-refactor/SKILL.md +1 -0
- package/skills/hatch3r-migration/SKILL.md +1 -0
- package/skills/hatch3r-perf-audit/SKILL.md +1 -0
- package/skills/hatch3r-pr-creation/SKILL.md +1 -0
- package/skills/hatch3r-qa-validation/SKILL.md +1 -0
- package/skills/hatch3r-recipe/SKILL.md +1 -0
- package/skills/hatch3r-refactor/SKILL.md +1 -0
- package/skills/hatch3r-release/SKILL.md +1 -0
- package/skills/hatch3r-rule-customize/SKILL.md +1 -0
- package/skills/hatch3r-skill-customize/SKILL.md +1 -0
- package/skills/hatch3r-visual-refactor/SKILL.md +1 -0
|
@@ -3,6 +3,7 @@ id: hatch3r-code-standards
|
|
|
3
3
|
type: rule
|
|
4
4
|
description: Code quality and file naming conventions for the project
|
|
5
5
|
scope: always
|
|
6
|
+
tags: [core]
|
|
6
7
|
---
|
|
7
8
|
# Code Standards
|
|
8
9
|
|
|
@@ -102,6 +103,14 @@ Enforce consistent import ordering via linter rules (e.g., `eslint-plugin-import
|
|
|
102
103
|
|
|
103
104
|
Separate each group with a blank line. Sort alphabetically within each group.
|
|
104
105
|
|
|
106
|
+
## Monorepo Conventions
|
|
107
|
+
|
|
108
|
+
When working in a monorepo (multiple packages or apps in a single repository):
|
|
109
|
+
|
|
110
|
+
- **Scope changes to a single package at a time.** A PR should touch one package unless the change requires a coordinated cross-package update (e.g., a shared type change and its consumers). Coordinated changes must be documented in the PR description.
|
|
111
|
+
- **Run tests only for affected packages.** Use the monorepo tool's filtering (e.g., `--filter`, `--scope`, `--since`) to run tests, lint, and builds only for packages affected by the current change.
|
|
112
|
+
- **Respect package boundaries — do not import across packages without explicit dependency.** If package A needs something from package B, B must be declared as a dependency in A's `package.json` (or equivalent manifest). Direct file-path imports across package boundaries are forbidden.
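The package-boundary rule above could be checked mechanically. The sketch below is only an illustration: it assumes a hypothetical layout in which every package lives under `packages/<name>/`, and the helper names `packageOf` and `crossesPackageBoundary` are invented for this example, not part of hatch3r.

```typescript
import * as path from 'node:path';

// Assumed layout (hypothetical): every package lives in packages/<name>/.
// Returns the package name a file belongs to, or null if it is outside packages/.
function packageOf(filePath: string): string | null {
  const parts = path.posix.normalize(filePath).split('/');
  const i = parts.indexOf('packages');
  return i >= 0 && i + 1 < parts.length - 1 ? parts[i + 1] : null;
}

// True when a relative import specifier resolves into a different package.
// Bare specifiers (e.g. "@scope/b") are ignored: they go through the manifest.
function crossesPackageBoundary(importerFile: string, specifier: string): boolean {
  if (!specifier.startsWith('.')) return false;
  const target = path.posix.join(path.posix.dirname(importerFile), specifier);
  return packageOf(target) !== packageOf(importerFile);
}
```

A lint step could call `crossesPackageBoundary` for each import in a changed file and fail the build on the first `true`.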
|
|
113
|
+
|
|
105
114
|
## Dead Code Prevention
|
|
106
115
|
|
|
107
116
|
- Remove unused imports, variables, functions, and type definitions immediately. Do not comment them out "for later."
|
|
@@ -3,6 +3,7 @@ id: hatch3r-deep-context
|
|
|
3
3
|
type: rule
|
|
4
4
|
description: Adaptive pre-implementation analysis — complexity scoring, requirements elicitation, similar implementation discovery, and transitive dependency tracing before coding
|
|
5
5
|
scope: always
|
|
6
|
+
tags: [core]
|
|
6
7
|
---
|
|
7
8
|
# Deep Context Analysis
|
|
8
9
|
|
|
@@ -3,6 +3,7 @@ id: hatch3r-dependency-management
|
|
|
3
3
|
type: rule
|
|
4
4
|
description: Rules for managing project dependencies
|
|
5
5
|
scope: always
|
|
6
|
+
tags: [maintenance]
|
|
6
7
|
---
|
|
7
8
|
# Dependency Management
|
|
8
9
|
|
|
@@ -15,3 +16,15 @@ scope: always
|
|
|
15
16
|
- Remove unused dependencies on every cleanup pass.
|
|
16
17
|
- Security patches (CVEs) are P0/P1 priority. Patch within 48h for critical.
|
|
17
18
|
- Check bundle size impact against budget. Reject deps that exceed it.
|
|
19
|
+
|
|
20
|
+
## Transitive Dependency Hygiene
|
|
21
|
+
|
|
22
|
+
- Audit transitive dependencies, not just direct ones. A direct dependency with a compromised transitive dep is still a vulnerability. Use `npm ls --all`, `pipdeptree`, or `cargo tree` to inspect the full dependency graph (`pip show` lists only a single package's direct requirements).
|
|
23
|
+
- When a transitive dependency has a known CVE, determine whether the vulnerable code path is reachable from your project. If reachable, override or patch the transitive dep. If unreachable, document the finding with justification for deferral.
|
|
24
|
+
- Avoid dependencies that pull in excessively large transitive trees for minimal functionality. If a package adds 50+ transitive deps for a single utility function, write the utility inline or find a lighter alternative.
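One way to apply the "50+ transitive deps" guideline is to count the unique packages below each direct dependency. The sketch below assumes a tree shaped loosely like the JSON printed by `npm ls --all --json` (only the nested `dependencies` map and `version` are relied on). The names `DepNode`, `collectTransitive`, and `heavyDependencies` are hypothetical.

```typescript
// Assumed shape: a nested `dependencies` map, as in `npm ls --all --json` output.
interface DepNode {
  version?: string;
  dependencies?: Record<string, DepNode>;
}

// Collects the unique name@version pairs reachable below a node.
function collectTransitive(node: DepNode, seen: Set<string> = new Set()): Set<string> {
  for (const [name, child] of Object.entries(node.dependencies ?? {})) {
    const key = `${name}@${child.version ?? 'unknown'}`;
    if (!seen.has(key)) {
      seen.add(key);
      collectTransitive(child, seen);
    }
  }
  return seen;
}

// Lists direct dependencies whose own subtree has at least `threshold` packages.
function heavyDependencies(root: DepNode, threshold = 50): string[] {
  return Object.entries(root.dependencies ?? {})
    .filter(([, child]) => collectTransitive(child).size >= threshold)
    .map(([name]) => name);
}
```

The threshold default mirrors the figure in the rule; a project would likely tune it to its own budget.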
|
|
25
|
+
|
|
26
|
+
## Version Upgrade Strategy
|
|
27
|
+
|
|
28
|
+
- Review changelogs and migration guides before upgrading major versions. Never blindly bump major versions and assume backward compatibility.
|
|
29
|
+
- Run the full test suite after any dependency upgrade, including integration tests. A passing unit test suite does not guarantee compatibility with upgraded peer dependencies.
|
|
30
|
+
- When upgrading a shared dependency used across multiple modules, upgrade all consumers in the same PR to avoid version skew within the monorepo or project.
|
|
@@ -3,6 +3,7 @@ id: hatch3r-migrations
|
|
|
3
3
|
type: rule
|
|
4
4
|
description: Database migration and schema change patterns for the project
|
|
5
5
|
scope: always
|
|
6
|
+
tags: [implementation, brownfield]
|
|
6
7
|
---
|
|
7
8
|
# Migrations
|
|
8
9
|
|
|
@@ -15,3 +16,14 @@ scope: always
|
|
|
15
16
|
- Document schema changes in project data model spec.
|
|
16
17
|
- Rollback plan required for every migration. Never run destructive migrations without backup verification.
|
|
17
18
|
- Hot documents must stay within size limits after migration.
|
|
19
|
+
|
|
20
|
+
## Data Validation During Migration
|
|
21
|
+
|
|
22
|
+
- Validate data integrity after each migration step, not just at the end. Check that migrated records match the expected schema, required fields are populated, and no data was silently dropped.
|
|
23
|
+
- Include count checks: the number of records processed should match the number of records in the source collection. Log discrepancies as errors, not warnings.
|
|
24
|
+
- For large datasets, migrate in batches with progress checkpoints. If a batch fails, resume from the last checkpoint rather than restarting the entire migration.
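The three validation rules above (per-step validation, count checks, resumable batches) can be combined in one loop. This is a minimal in-memory sketch, not a real migration runner: the `Checkpoint` shape, the function name, and the array-based source and sink are all assumptions made for illustration.

```typescript
// Hypothetical checkpoint: where to resume and how many records are done.
interface Checkpoint {
  nextIndex: number;
  migrated: number;
}

// Migrates `source` into `sink` in batches. The checkpoint is updated in place
// after every successful batch, so a caller can retry with the same object.
function migrateInBatches<S, T>(
  source: readonly S[],
  transform: (record: S) => T,
  isValid: (record: T) => boolean,
  sink: T[],
  checkpoint: Checkpoint,
  batchSize = 100,
): void {
  while (checkpoint.nextIndex < source.length) {
    const start = checkpoint.nextIndex;
    const batch = source.slice(start, start + batchSize).map(transform);
    // Validate after each step, not only at the end.
    const invalid = batch.filter((r) => !isValid(r)).length;
    if (invalid > 0) {
      throw new Error(`batch starting at ${start}: ${invalid} invalid record(s)`);
    }
    sink.push(...batch);
    checkpoint.nextIndex = start + batch.length;
    checkpoint.migrated += batch.length;
  }
  // Count check: a mismatch is an error, not a warning.
  if (checkpoint.migrated !== source.length) {
    throw new Error(`count mismatch: ${checkpoint.migrated} of ${source.length}`);
  }
}
```

In a real system the checkpoint would be persisted between runs; here it is only a mutable object held by the caller.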
|
|
25
|
+
|
|
26
|
+
## Migration Coordination in Multi-Service Environments
|
|
27
|
+
|
|
28
|
+
- When a migration affects shared data (e.g., a schema used by multiple services), coordinate the migration order across services. The consuming services must be deployed with backward-compatible readers before the migration runs.
|
|
29
|
+
- Never assume that all service instances will be running the same code version during a migration window. Design migrations to tolerate mixed-version reads and writes during the rollout period.
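A backward-compatible reader is the piece that lets consumers tolerate mixed-version data during the rollout window. The sketch below uses an invented schema change (a single `name` field split into `firstName` and `lastName`); the types and the `readUser` helper are hypothetical and only illustrate the pattern.

```typescript
// Hypothetical schemas: v1 stores `name`; v2 splits it and adds a version marker.
interface UserV1 {
  schemaVersion?: 1;
  name: string;
}
interface UserV2 {
  schemaVersion: 2;
  firstName: string;
  lastName: string;
}
type StoredUser = UserV1 | UserV2;

interface User {
  firstName: string;
  lastName: string;
}

// Accepts both stored shapes and returns one in-memory shape, so the service
// behaves the same whether or not a given record has been migrated yet.
function readUser(doc: StoredUser): User {
  if (doc.schemaVersion === 2) {
    return { firstName: doc.firstName, lastName: doc.lastName };
  }
  // v1 fallback: first token is the first name, the rest is the last name.
  const [firstName = '', ...rest] = doc.name.trim().split(/\s+/);
  return { firstName, lastName: rest.join(' ') };
}
```

Deploying a reader like this to every consumer before the migration runs is what makes the migration order safe.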
|
|
@@ -3,6 +3,7 @@ id: hatch3r-observability
|
|
|
3
3
|
type: rule
|
|
4
4
|
description: Logging, metrics, and tracing conventions for the project
|
|
5
5
|
scope: conditional
|
|
6
|
+
tags: [devops]
|
|
6
7
|
---
|
|
7
8
|
# Observability
|
|
8
9
|
|
|
@@ -165,3 +166,292 @@ Every telemetry-producing service must declare resource attributes at startup:
|
|
|
165
166
|
- Attribute values should be low-cardinality. Never use unbounded values (full URLs with query params, raw SQL, user-generated content) as attribute values.
|
|
166
167
|
- For high-cardinality identifiers (user IDs, request IDs), use span attributes sparingly and rely on correlated logs for detail.
|
|
167
168
|
- Prefer semantic convention attributes over custom attributes. When custom attributes are necessary, prefix them with your organization or project namespace (e.g., `myapp.feature.flag_key`).
|
|
169
|
+
|
|
170
|
+
### AI Agent Semantic Conventions
|
|
171
|
+
|
|
172
|
+
Follow the [OpenTelemetry GenAI Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/) (experimental, introduced 2024) for instrumenting AI/LLM agent systems. These conventions provide consistent attribute naming for generative AI operations, enabling interoperability across agent frameworks and observability backends.
|
|
173
|
+
|
|
174
|
+
#### `gen_ai.*` Span Attributes
|
|
175
|
+
|
|
176
|
+
Use these attributes on all spans that represent interactions with generative AI models:
|
|
177
|
+
|
|
178
|
+
| Attribute | Type | Description | Example |
|
|
179
|
+
|-----------|------|-------------|---------|
|
|
180
|
+
| `gen_ai.system` | string | The GenAI provider system name | `openai`, `anthropic`, `azure_openai` |
|
|
181
|
+
| `gen_ai.request.model` | string | Model name as specified in the request | `gpt-4o`, `claude-sonnet-4-20250514` |
|
|
182
|
+
| `gen_ai.response.model` | string | Model name as returned in the response (may differ from request) | `gpt-4o-2024-08-06` |
|
|
183
|
+
| `gen_ai.request.max_tokens` | int | Maximum number of tokens requested for generation | `4096` |
|
|
184
|
+
| `gen_ai.request.temperature` | float | Temperature parameter sent in the request | `0.7` |
|
|
185
|
+
| `gen_ai.request.top_p` | float | Top-p (nucleus sampling) parameter | `0.9` |
|
|
186
|
+
| `gen_ai.response.finish_reasons` | string[] | Reasons the model stopped generating | `["stop"]`, `["length"]`, `["tool_calls"]` |
|
|
187
|
+
| `gen_ai.usage.input_tokens` | int | Number of tokens in the input/prompt | `1250` |
|
|
188
|
+
| `gen_ai.usage.output_tokens` | int | Number of tokens in the generated output | `530` |
|
|
189
|
+
|
|
190
|
+
- Always set `gen_ai.system` and `gen_ai.request.model` on every GenAI span. These are required for meaningful filtering and cost attribution.
|
|
191
|
+
- Record `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` from the API response to enable token usage dashboards and cost tracking.
|
|
192
|
+
- Use `gen_ai.response.finish_reasons` to detect truncated outputs (`length`) and trigger re-prompting or alerting logic.
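The attributes recommended above are enough to drive two simple checks: truncation detection and a rough cost estimate. The following sketch works on a plain attribute object rather than a live span. The `GenAiAttributes` and `Price` types and both function names are assumptions, and the prices in any call are placeholders rather than real provider rates.

```typescript
// Subset of the gen_ai.* attributes from the table above, as a plain object.
interface GenAiAttributes {
  'gen_ai.request.model': string;
  'gen_ai.response.finish_reasons': string[];
  'gen_ai.usage.input_tokens': number;
  'gen_ai.usage.output_tokens': number;
}

// True when any choice stopped because it hit the token limit.
function wasTruncated(attrs: GenAiAttributes): boolean {
  return attrs['gen_ai.response.finish_reasons'].includes('length');
}

// Hypothetical price table entry, expressed per million tokens.
interface Price {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Estimates the cost of one call from its recorded token usage.
function estimateCostUsd(attrs: GenAiAttributes, price: Price): number {
  const input = attrs['gen_ai.usage.input_tokens'] * price.inputPerMTok;
  const output = attrs['gen_ai.usage.output_tokens'] * price.outputPerMTok;
  return (input + output) / 1e6;
}
```

A caller might re-prompt with a larger `max_tokens` when `wasTruncated` returns `true`, and feed `estimateCostUsd` into a per-model cost dashboard.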
|
|
193
|
+
|
|
194
|
+
#### Agent Invocation Spans
|
|
195
|
+
|
|
196
|
+
Instrument the full lifecycle of an agent invocation with a dedicated span. This span is the parent for all LLM calls, tool executions, and sub-agent delegations within a single agent run.
|
|
197
|
+
|
|
198
|
+
- **Span name pattern:** `agent.{agent_name}.invoke` (e.g., `agent.code_reviewer.invoke`, `agent.research_assistant.invoke`)
|
|
199
|
+
- **Required attributes:**
|
|
200
|
+
|
|
201
|
+
| Attribute | Type | Description | Example |
|
|
202
|
+
|-----------|------|-------------|---------|
|
|
203
|
+
| `agent.id` | string | Unique identifier for this agent invocation | `agent-run-a1b2c3d4` |
|
|
204
|
+
| `agent.name` | string | Logical name of the agent | `code_reviewer` |
|
|
205
|
+
| `agent.parent_id` | string | ID of the parent agent (for sub-agent delegation chains) | `agent-run-x9y8z7` |
|
|
206
|
+
| `agent.task` | string | High-level description of the agent's assigned task | `review PR #42` |
|
|
207
|
+
| `agent.framework` | string | Agent framework in use | `langchain`, `autogen`, `custom` |
|
|
208
|
+
|
|
209
|
+
- **Span events for state transitions:** Record span events to mark key lifecycle transitions within the agent invocation:
|
|
210
|
+
- `agent.planning` — Agent begins task decomposition or reasoning.
|
|
211
|
+
- `agent.tool_selection` — Agent selects a tool to invoke.
|
|
212
|
+
- `agent.awaiting_human` — Agent pauses for human-in-the-loop confirmation.
|
|
213
|
+
- `agent.delegating` — Agent spawns a sub-agent.
|
|
214
|
+
- `agent.completed` — Agent finishes its task and produces a final output.
|
|
215
|
+
- `agent.error` — Agent encounters a non-recoverable error. Include `exception.type` and `exception.message` attributes on the event.
|
|
216
|
+
|
|
217
|
+
```typescript
|
|
218
|
+
const agentSpan = tracer.startSpan('agent.code_reviewer.invoke', {
|
|
219
|
+
attributes: {
|
|
220
|
+
'agent.id': invocationId,
|
|
221
|
+
'agent.name': 'code_reviewer',
|
|
222
|
+
'agent.parent_id': parentAgentId ?? '',
|
|
223
|
+
'agent.task': `review PR #${prNumber}`,
|
|
224
|
+
'agent.framework': 'custom',
|
|
225
|
+
},
|
|
226
|
+
});
|
|
227
|
+
|
|
228
|
+
agentSpan.addEvent('agent.planning');
|
|
229
|
+
// ... agent reasoning and tool calls happen as child spans ...
|
|
230
|
+
agentSpan.addEvent('agent.completed');
|
|
231
|
+
agentSpan.end();
|
|
232
|
+
```
|
|
233
|
+
|
|
234
|
+
#### Tool Call Spans
|
|
235
|
+
|
|
236
|
+
Every tool invocation by an agent creates a child span of the agent invocation span. This enables tracing the full sequence of tool calls within an agent run, measuring tool latency, and detecting tool failures.
|
|
237
|
+
|
|
238
|
+
- **Span name pattern:** `tool.{tool_name}.execute` (e.g., `tool.file_read.execute`, `tool.web_search.execute`)
|
|
239
|
+
- **Required attributes:**
|
|
240
|
+
|
|
241
|
+
| Attribute | Type | Description | Example |
|
|
242
|
+
|-----------|------|-------------|---------|
|
|
243
|
+
| `tool.name` | string | Canonical name of the tool | `file_read`, `git_diff`, `web_search` |
|
|
244
|
+
| `tool.input_hash` | string | SHA-256 hash of the tool input (for deduplication, not logging raw input) | `sha256:3a7f...` |
|
|
245
|
+
| `tool.output_status` | string | Outcome of the tool execution | `success`, `error`, `timeout`, `rejected` |
|
|
246
|
+
| `tool.duration_ms` | float | Wall-clock execution time of the tool in milliseconds | `142.5` |
|
|
247
|
+
| `tool.parameters_count` | int | Number of parameters passed to the tool | `3` |
|
|
248
|
+
|
|
249
|
+
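- **Parent-child relationship:** Tool spans must be children of the invoking agent span. Either pass `trace.setSpan(context.active(), agentSpan)` as the third argument to `tracer.startSpan`, or run the tool inside `context.with(trace.setSpan(context.active(), agentSpan), fn)` so the agent span context propagates to tool execution.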
|
|
250
|
+
- Set span status to `ERROR` when `tool.output_status` is `error` or `timeout`. Attach exception details as a span event.
|
|
251
|
+
- For tools that perform I/O (HTTP requests, file system operations, database queries), create nested child spans using the appropriate semantic conventions (`http.*`, `db.*`) under the tool span.
|
|
252
|
+
|
|
253
|
+
```typescript
|
|
254
|
+
const toolSpan = tracer.startSpan(
|
|
255
|
+
'tool.git_diff.execute',
|
|
256
|
+
{ attributes: { 'tool.name': 'git_diff' } },
|
|
257
|
+
trace.setSpan(context.active(), agentSpan),
|
|
258
|
+
);
|
|
259
|
+
|
|
260
|
+
const startTime = performance.now();
|
|
261
|
+
try {
|
|
262
|
+
const result = await tools.gitDiff(params);
|
|
263
|
+
toolSpan.setAttributes({
|
|
264
|
+
'tool.output_status': 'success',
|
|
265
|
+
'tool.duration_ms': performance.now() - startTime,
|
|
266
|
+
'tool.input_hash': hashInput(params),
|
|
267
|
+
});
|
|
268
|
+
} catch (err) {
|
|
269
|
+
toolSpan.setAttributes({
|
|
270
|
+
'tool.output_status': 'error',
|
|
271
|
+
'tool.duration_ms': performance.now() - startTime,
|
|
272
|
+
});
|
|
273
|
+
toolSpan.setStatus({ code: SpanStatusCode.ERROR, message: (err as Error).message });
|
|
274
|
+
toolSpan.recordException(err as Error);
|
|
275
|
+
throw err;
|
|
276
|
+
} finally {
|
|
277
|
+
toolSpan.end();
|
|
278
|
+
}
|
|
279
|
+
```
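The snippet above calls a `hashInput` helper that the rule does not define. One possible sketch, assuming Node's built-in `node:crypto` module and the `sha256:` prefix shown in the attribute table, is given below. The key-sorting serializer is an added assumption so that equal inputs hash equally regardless of property order.

```typescript
import { createHash } from 'node:crypto';

// Serializes a value with object keys sorted, so that { a, b } and { b, a }
// produce the same string. Not a full canonical-JSON implementation.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((v) => stableStringify(v)).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
    return `{${body.join(',')}}`;
  }
  // JSON.stringify returns undefined for undefined/functions; map those to null.
  return JSON.stringify(value) ?? 'null';
}

// Hypothetical hashInput: SHA-256 over the stable serialization, hex-encoded.
function hashInput(input: unknown): string {
  const digest = createHash('sha256').update(stableStringify(input)).digest('hex');
  return `sha256:${digest}`;
}
```

Hashing the serialized input keeps raw parameters out of telemetry while still allowing repeated calls to be recognized.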
|
|
280
|
+
|
|
281
|
+
#### LLM Request/Response Tracing
|
|
282
|
+
|
|
283
|
+
Instrument every LLM API call with a dedicated span. These spans are typically children of an agent invocation span and capture model, token usage, and latency data for cost analysis and performance monitoring.
|
|
284
|
+
|
|
285
|
+
- **Span name pattern:** `gen_ai.{operation}` (e.g., `gen_ai.chat`, `gen_ai.completion`, `gen_ai.embeddings`)
|
|
286
|
+
- **Required attributes:** All applicable `gen_ai.*` attributes from the table above, plus:
|
|
287
|
+
|
|
288
|
+
| Attribute | Type | Description | Example |
|
|
289
|
+
|-----------|------|-------------|---------|
|
|
290
|
+
| `gen_ai.operation.name` | string | The specific API operation | `chat`, `completion`, `embeddings` |
|
|
291
|
+
| `gen_ai.request.stop_sequences` | string[] | Stop sequences sent in the request | `["\n\n", "END"]` |
|
|
292
|
+
| `server.address` | string | Hostname of the GenAI API endpoint | `api.openai.com` |
|
|
293
|
+
| `server.port` | int | Port of the GenAI API endpoint | `443` |
|
|
294
|
+
|
|
295
|
+
- **Input/output token tracking:** Always capture `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` from the API response. Aggregate these in metrics for cost dashboards:
|
|
296
|
+
- Counter: `gen_ai.tokens_total` with labels `{direction=input|output, model, agent_name}`
|
|
297
|
+
- Histogram: `gen_ai.request_duration_ms` with labels `{model, operation, agent_name}`
|
|
298
|
+
|
|
299
|
+
- **Model version tracking:** Record both `gen_ai.request.model` (what was requested) and `gen_ai.response.model` (what was actually used). API providers may silently route to different model versions; capturing both enables drift detection.
|
|
300
|
+
|
|
301
|
+
- **Error handling and retry spans:** When an LLM request fails and is retried, each attempt is a separate child span under the same parent. Record the error on the failed span and create a new span for the retry:
|
|
302
|
+
- Set `gen_ai.request.retries` (int) on the final successful span to indicate total retry count.
|
|
303
|
+
- Record `http.response.status_code` on failed spans to distinguish rate-limit errors (429) from server errors (500+).
|
|
304
|
+
- Use exponential backoff; the retry span's start time naturally captures the wait duration.
|
|
305
|
+
|
|
306
|
+
```typescript
|
|
307
|
+
const llmSpan = tracer.startSpan(
|
|
308
|
+
'gen_ai.chat',
|
|
309
|
+
{
|
|
310
|
+
attributes: {
|
|
311
|
+
'gen_ai.system': 'openai',
|
|
312
|
+
'gen_ai.operation.name': 'chat',
|
|
313
|
+
'gen_ai.request.model': 'gpt-4o',
|
|
314
|
+
'gen_ai.request.max_tokens': 4096,
|
|
315
|
+
'gen_ai.request.temperature': 0.2,
|
|
316
|
+
'server.address': 'api.openai.com',
|
|
317
|
+
},
|
|
318
|
+
},
|
|
319
|
+
trace.setSpan(context.active(), agentSpan),
|
|
320
|
+
);
|
|
321
|
+
|
|
322
|
+
try {
|
|
323
|
+
const response = await openai.chat.completions.create({ /* ... */ });
|
|
324
|
+
llmSpan.setAttributes({
|
|
325
|
+
'gen_ai.response.model': response.model,
|
|
326
|
+
'gen_ai.response.finish_reasons': response.choices.map(c => c.finish_reason),
|
|
327
|
+
'gen_ai.usage.input_tokens': response.usage.prompt_tokens,
|
|
328
|
+
'gen_ai.usage.output_tokens': response.usage.completion_tokens,
|
|
329
|
+
});
|
|
330
|
+
|
|
331
|
+
// Record token usage in metrics for cost tracking
|
|
332
|
+
tokenCounter.add(response.usage.prompt_tokens, {
|
|
333
|
+
direction: 'input', model: response.model, agent_name: agentName,
|
|
334
|
+
});
|
|
335
|
+
tokenCounter.add(response.usage.completion_tokens, {
|
|
336
|
+
direction: 'output', model: response.model, agent_name: agentName,
|
|
337
|
+
});
|
|
338
|
+
} catch (err) {
|
|
339
|
+
llmSpan.setStatus({ code: SpanStatusCode.ERROR, message: (err as Error).message });
|
|
340
|
+
llmSpan.recordException(err as Error);
|
|
341
|
+
throw err;
|
|
342
|
+
} finally {
|
|
343
|
+
llmSpan.end();
|
|
344
|
+
}
|
|
345
|
+
```
|
|
346
|
+
|
|
347
|
+
- Never log raw prompt content or full model responses as span attributes — these are high-cardinality and may contain sensitive data. Use `gen_ai.usage.*` token counts for cost tracking and correlated logs for prompt debugging in non-production environments.
|
|
348
|
+
- In production, sample GenAI spans at a higher rate than general spans (e.g., 50-100%) because each call is expensive and lower volume than typical HTTP traffic. Adjust sampling based on call volume and observability budget.
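The retry guidance earlier in this section (a separate span per attempt, status codes on failed spans, exponential backoff) leaves the backoff policy itself open. The sketch below shows one possible policy as pure functions. The 500 ms base, the 30 s cap, and the choice to retry only on 429 and 5xx responses are assumptions, not requirements stated by the rule.

```typescript
// Assumption: only rate limits (429) and server errors (500+) are worth retrying.
function isRetryable(statusCode: number): boolean {
  return statusCode === 429 || statusCode >= 500;
}

// Delay before retry number `attempt` (1-based): base * 2^(attempt - 1), capped.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt - 1));
}

// Full schedule of waits for up to `maxRetries` retries.
function backoffSchedule(maxRetries: number, baseMs = 500, capMs = 30000): number[] {
  return Array.from({ length: maxRetries }, (_, i) => backoffDelayMs(i + 1, baseMs, capMs));
}
```

Because each retry gets its own span, the gap between one span's end and the next span's start should match the corresponding entry in the schedule.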
|
|
349
|
+
|
|
350
|
+
### Tool Call Audit Trail
|
|
351
|
+
|
|
352
|
+
Maintain a structured audit log for every tool invocation in agentic workflows. This log is separate from tracing spans and serves as an immutable compliance and debugging record.
|
|
353
|
+
|
|
354
|
+
#### Schema Definition
|
|
355
|
+
|
|
356
|
+
Every tool call audit log entry must include the following fields:
|
|
357
|
+
|
|
358
|
+
| Field | Type | Description |
|
|
359
|
+
|-------|------|-------------|
|
|
360
|
+
| `tool.name` | string | Name of the tool invoked |
|
|
361
|
+
| `tool.input_hash` | string | SHA-256 hash of the tool input (for privacy, never log raw input) |
|
|
362
|
+
| `tool.output_status` | string | Outcome of the tool execution: `success`, `error`, `timeout`, or `denied` |
|
|
363
|
+
| `tool.duration_ms` | float | Execution time in milliseconds |
|
|
364
|
+
| `agent.id` | string | ID of the agent that invoked the tool |
|
|
365
|
+
| `agent.name` | string | Human-readable agent name |
|
|
366
|
+
| `correlation.id` | string | Trace correlation ID linking this entry to the broader workflow |
|
|
367
|
+
| `timestamp` | string | ISO 8601 timestamp of the invocation |
|
|
368
|
+
| `session.id` | string | Session identifier for grouping related tool calls |
|
|
369
|
+
|
|
370
|
+
#### Logging Requirements
|
|
371
|
+
|
|
372
|
+
- Log every tool invocation at `info` level with the full schema above.
|
|
373
|
+
- Log tool failures at `error` level with additional `error.type` and `error.message` fields describing the failure.
|
|
374
|
+
- Aggregate tool call counts per agent per session for anomaly detection (e.g., an agent invoking an unusual number of tools may indicate a loop or misconfiguration).
|
|
375
|
+
- Retain audit logs for a minimum of 90 days to support post-incident investigation and compliance review.
|
|
376
|
+
|
|
377
|
+
#### Example Log Entry
|
|
378
|
+
|
|
379
|
+
```json
|
|
380
|
+
{
|
|
381
|
+
"timestamp": "2026-02-15T14:32:07.891Z",
|
|
382
|
+
"level": "info",
|
|
383
|
+
"correlation.id": "agent-run-550e8400-e29b-41d4-a716-446655440000",
|
|
384
|
+
"session.id": "sess-8f14e45f-ceea-467f-a8f0-3b5c6d7e8f9a",
|
|
385
|
+
"agent.id": "agent-run-a1b2c3d4",
|
|
386
|
+
"agent.name": "code_reviewer",
|
|
387
|
+
"tool.name": "git_diff",
|
|
388
|
+
"tool.input_hash": "sha256:3a7f2c9e8b1d4f6a0e5c7b9d2f4a6e8c0b3d5f7a9e1c3b5d7f9a2c4e6b8d0f",
|
|
389
|
+
"tool.output_status": "success",
|
|
390
|
+
"tool.duration_ms": 142.5
|
|
391
|
+
}
|
|
392
|
+
```
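The per-agent, per-session aggregation mentioned under Logging Requirements can be sketched directly over entries of the shape shown above. The `ToolCallAuditEntry` type mirrors the schema table; the `countCalls` and `findAnomalies` helpers and the fixed call limit are hypothetical, and a real detector would more likely compare against a baseline.

```typescript
// Mirrors the audit log schema table above.
interface ToolCallAuditEntry {
  timestamp: string;
  'correlation.id': string;
  'session.id': string;
  'agent.id': string;
  'agent.name': string;
  'tool.name': string;
  'tool.input_hash': string;
  'tool.output_status': 'success' | 'error' | 'timeout' | 'denied';
  'tool.duration_ms': number;
}

// Counts tool calls per (session, agent) pair, keyed as "session|agent".
function countCalls(entries: ToolCallAuditEntry[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of entries) {
    const key = `${e['session.id']}|${e['agent.id']}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Returns the keys whose call count exceeds a (hypothetical) fixed limit.
function findAnomalies(entries: ToolCallAuditEntry[], maxCalls: number): string[] {
  return Array.from(countCalls(entries))
    .filter(([, n]) => n > maxCalls)
    .map(([key]) => key);
}
```

An agent stuck in a loop would show up as a single key with a count far above its peers.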
|
|
393
|
+
|
|
394
|
+
### Correlation IDs for Agent Workflows
|
|
395
|
+
|
|
396
|
+
Correlation IDs provide the connective thread linking all telemetry signals (logs, spans, metrics) across a multi-agent workflow. Every participant in the workflow uses the same correlation ID, enabling end-to-end traceability from the initial trigger through all agent delegations and tool calls.
|
|
397
|
+
|
|
398
|
+
#### ID Generation
|
|
399
|
+
|
|
400
|
+
- Use UUIDv4 for correlation IDs. Generate the ID at the workflow entry point (the first agent invocation or the orchestrator that initiates the run).
|
|
401
|
+
- Format: `{workflow-type}-{uuid}` (e.g., `agent-run-550e8400-e29b-41d4-a716-446655440000`, `review-flow-7c9e6679-7425-40de-944b-e07fc1f90ae7`).
|
|
402
|
+
- The workflow-type prefix provides human-readable context when scanning logs and makes it possible to filter by workflow category without parsing the full ID.
|
|
403
|
+
|
|
404
|
+
#### Propagation
|
|
405
|
+
|
|
406
|
+
- The correlation ID propagates from the parent agent to all sub-agents via context. Pass it explicitly when delegating to sub-agents or invoking tools.
|
|
407
|
+
- Every log entry, span, and metric produced during the workflow must include the `correlation.id` attribute.
|
|
408
|
+
- When crossing process boundaries (e.g., HTTP calls between services), propagate the correlation ID via a custom header (`X-Correlation-ID`) alongside standard W3C Trace Context headers.
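Propagation across process boundaries comes down to writing and reading one header, and the `{workflow-type}-{uuid}` format can be split back into its parts for filtering. The helpers below are a sketch over plain header objects; their names are hypothetical, and the case-insensitive lookup reflects how HTTP header names behave in general.

```typescript
const CORRELATION_HEADER = 'x-correlation-id';

// Returns a copy of the outgoing headers with the correlation ID added.
function injectCorrelationId(
  headers: Record<string, string>,
  correlationId: string,
): Record<string, string> {
  return { ...headers, [CORRELATION_HEADER]: correlationId };
}

// Reads the correlation ID from incoming headers, ignoring header-name case.
function extractCorrelationId(headers: Record<string, string>): string | undefined {
  const hit = Object.entries(headers).find(
    ([name]) => name.toLowerCase() === CORRELATION_HEADER,
  );
  return hit?.[1];
}

// Splits `{workflow-type}-{uuid}`; returns null when the suffix is not a UUID.
function parseCorrelationId(id: string): { workflowType: string; uuid: string } | null {
  const m = /^(.+)-([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$/i.exec(id);
  return m ? { workflowType: m[1], uuid: m[2] } : null;
}
```

A receiving service would call `extractCorrelationId` first and only generate a fresh ID when the header is absent.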
|
|
409
|
+
|
|
410
|
+
#### Parent-Child Span Linking
|
|
411
|
+
|
|
412
|
+
- The parent agent's span ID becomes the `parent_span_id` attribute on child agent spans, establishing a clear hierarchy in trace visualizations.
|
|
413
|
+
- For cross-workflow references (e.g., an agent run triggered by a CI pipeline event), use OpenTelemetry `SpanLink` to connect the agent workflow trace to the originating trace without creating a parent-child relationship.
|
|
414
|
+
- SpanLinks preserve the independence of each workflow trace while enabling navigation between related workflows in the observability backend.
|
|
415
|
+
|
|
416
|
+
#### Implementation Pattern
|
|
417
|
+
|
|
418
|
+
```typescript
|
|
419
|
+
import { randomUUID } from 'node:crypto';
|
|
420
|
+
import { context, trace, SpanStatusCode } from '@opentelemetry/api';
|
|
421
|
+
|
|
422
|
+
function generateCorrelationId(workflowType: string): string {
|
|
423
|
+
return `${workflowType}-${randomUUID()}`;
|
|
424
|
+
}
|
|
425
|
+
|
|
426
|
+
async function runAgentWorkflow(task: string): Promise<void> {
|
|
427
|
+
const correlationId = generateCorrelationId('agent-run');
|
|
428
|
+
const tracer = trace.getTracer('agent-orchestrator');
|
|
429
|
+
|
|
430
|
+
const rootSpan = tracer.startSpan('agent.orchestrator.invoke', {
|
|
431
|
+
attributes: {
|
|
432
|
+
'correlation.id': correlationId,
|
|
433
|
+
'agent.name': 'orchestrator',
|
|
434
|
+
'agent.task': task,
|
|
435
|
+
},
|
|
436
|
+
});
|
|
437
|
+
|
|
438
|
+
const ctx = trace.setSpan(context.active(), rootSpan);
|
|
439
|
+
|
|
440
|
+
try {
|
|
441
|
+
// Run the delegation inside the root span's context; the correlation ID is passed explicitly
|
|
442
|
+
await context.with(ctx, async () => {
|
|
443
|
+
await delegateToSubAgent('code_reviewer', {
|
|
444
|
+
correlationId,
|
|
445
|
+
parentSpanId: rootSpan.spanContext().spanId,
|
|
446
|
+
task: 'review changes',
|
|
447
|
+
});
|
|
448
|
+
});
|
|
449
|
+
} catch (err) {
|
|
450
|
+
rootSpan.setStatus({ code: SpanStatusCode.ERROR, message: (err as Error).message });
|
|
451
|
+
rootSpan.recordException(err as Error);
|
|
452
|
+
throw err;
|
|
453
|
+
} finally {
|
|
454
|
+
rootSpan.end();
|
|
455
|
+
}
|
|
456
|
+
}
|
|
457
|
+
```
@@ -3,6 +3,7 @@ id: hatch3r-security-patterns
 type: rule
 description: Security patterns including input validation, auth enforcement, and AI/agentic security for the project
 scope: always
+tags: [security]
 ---
 # Security Patterns
 
@@ -63,6 +64,11 @@ scope: always
 - Enforce parameter schemas on every tool call. Reject calls with unexpected, missing, or out-of-range arguments.
 - Rate-limit tool invocations per agent per time window. Alert on anomalous tool usage patterns.
 - Sandbox tool execution: restrict file system access, network egress, and subprocess spawning.
+- **MCP server filesystem scope:** MCP servers with filesystem access must be scoped to the minimum necessary directories:
+  - Restrict filesystem access to the project directory. MCP servers should never have access to the home directory, system directories, or unrelated project directories.
+  - Document which MCP servers have filesystem access and define their intended scope (read-only vs read-write, which directories).
+  - Configure `allowedDirectories` in MCP server configs where supported. If the server does not support directory restrictions, document this as a known risk and apply compensating controls (monitoring, read-only mode).
+  - Audit MCP server filesystem access on configuration changes. Verify that added servers do not expand the filesystem attack surface beyond the project boundary.
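The directory-scoping rule added above could be checked with something like the sketch below, for cases where a server has no built-in restriction and a wrapper has to enforce the boundary itself. The function names are hypothetical and not part of any MCP server API; only the `allowedDirectories` name is borrowed from the rule text. The check compares resolved paths rather than string prefixes, so `/work/project-evil` is not mistaken for a child of `/work/project`. It does not resolve symlinks, which a real guard would likely need to do.

```typescript
import * as path from 'node:path';

// True when `target` resolves to `root` itself or to a location inside it.
function isInsideDirectory(root: string, target: string): boolean {
  const resolvedRoot = path.resolve(root);
  const resolvedTarget = path.resolve(target);
  const relative = path.relative(resolvedRoot, resolvedTarget);
  if (relative === '') return true; // the root directory itself
  const escapesRoot = relative === '..' || relative.startsWith('..' + path.sep);
  // An absolute result means the two paths share no common root (e.g. another drive).
  return !escapesRoot && !path.isAbsolute(relative);
}

// True when `target` falls inside at least one allowed directory.
function isAllowedPath(allowedDirectories: string[], target: string): boolean {
  return allowedDirectories.some((dir) => isInsideDirectory(dir, target));
}
```

An empty `allowedDirectories` list denies everything, which matches the minimum-necessary-scope intent of the rule.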
 
 ### ASI03 — Identity & Privilege Abuse
 
@@ -77,6 +83,12 @@ scope: always
 - Verify package integrity (checksums, signatures) before loading tools or plugins.
 - Audit third-party prompt templates for injected instructions before use.
 - Maintain an allowlist of approved MCP servers and tool sources.
+- **`npx -y` safety:** The `-y` flag auto-confirms installation of unknown packages without prompts, creating a supply chain attack vector:
+  - Never use `npx -y` with untrusted, unknown, or typo-squattable package names.
+  - Always pin explicit versions when using npx: `npx package@1.2.3` instead of `npx package`.
+  - Prefer `npm exec --package=package@version -- command` for critical tooling — it provides explicit version control and avoids silent auto-install.
+  - In CI pipelines, install tools as explicit `devDependencies` with pinned versions rather than relying on `npx` at runtime.
+  - Verify the package name and publisher on the npm registry before first use. Typosquatting attacks exploit `npx -y` by registering names similar to popular packages.
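A lint-style check for the `npx` guidance above might look like the following sketch. `inspectNpxCommand` and `NpxFinding` are hypothetical names, and the parsing is deliberately simplified: it treats the first non-flag argument as the package spec, so flags that take a separate value (such as `--package <name>`) would be misread. It flags auto-confirm usage and reports whether the spec carries an exact version.

```typescript
interface NpxFinding {
  autoConfirm: boolean; // true when -y or --yes is present
  packageSpec: string | undefined; // first non-flag argument, if any
  pinned: boolean; // true only for an exact x.y.z version
}

// Exact semver only; ranges and dist-tags such as "latest" do not count as pinned.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

function inspectNpxCommand(command: string): NpxFinding | undefined {
  const tokens = command.trim().split(/\s+/);
  if (tokens[0] !== 'npx') return undefined; // not an npx invocation
  const args = tokens.slice(1);
  const autoConfirm = args.includes('-y') || args.includes('--yes');
  const packageSpec = args.find((arg) => !arg.startsWith('-'));
  // Scoped names start with '@', so the version separator must sit after index 0.
  const at = packageSpec !== undefined ? packageSpec.lastIndexOf('@') : -1;
  const version = packageSpec !== undefined && at > 0 ? packageSpec.slice(at + 1) : '';
  return { autoConfirm, packageSpec, pinned: EXACT_VERSION.test(version) };
}
```

A CI script could fail the build when `autoConfirm` is true or `pinned` is false, though installing the tool as a pinned `devDependency` remains the simpler control.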
 
 ### ASI05 — Unexpected Code Execution
 
package/rules/hatch3r-testing.md
CHANGED
package/rules/hatch3r-theming.md
CHANGED
@@ -1,6 +1,7 @@
 ---
 id: hatch3r-a11y-audit
 description: Comprehensive WCAG AA accessibility audit with findings and fixes. Use when auditing accessibility, verifying WCAG compliance, or improving a11y across the application.
+tags: [review, a11y]
 ---
 # Accessibility Audit Workflow
 
@@ -1,6 +1,7 @@
 ---
 id: hatch3r-agent-customize
 description: Create and manage per-agent customization files for model overrides, description changes, and project-specific markdown instructions. Use when tailoring agent behavior to project-specific needs.
+tags: [customize]
 ---
 # Agent Customization Management
 
@@ -1,6 +1,7 @@
 ---
 id: hatch3r-architecture-review
 description: Evaluate architectural decisions and produce ADRs following the project template. Use when making architectural decisions, evaluating trade-offs, or creating ADRs.
+tags: [review]
 ---
 # Architecture Review Workflow
 
@@ -1,6 +1,7 @@
 ---
 id: hatch3r-bug-fix
 description: Step-by-step bug fix workflow. Diagnose root cause, implement minimal fix, write regression test. Use when fixing bugs, working on bug report issues, or when the user mentions a bug.
+tags: [core, implementation]
 ---
 > **Note:** Commands below use `npm` as an example. Substitute with your project's package manager (`yarn`, `pnpm`, `bun`) or build tool as appropriate.
 
@@ -1,6 +1,7 @@
 ---
 id: hatch3r-command-customize
 description: Create and manage per-command customization files for description overrides, enable/disable control, and project-specific markdown instructions. Use when tailoring command behavior to project-specific needs.
+tags: [customize]
 ---
 # Command Customization Management
 
@@ -1,6 +1,7 @@
 ---
 id: hatch3r-context-health
 description: Monitor and maintain conversation context health during long sessions. Use when context may be degrading, after many turns, or when experiencing repeated errors.
+tags: [maintenance]
 ---
 # Context Health Monitoring
 
@@ -1,6 +1,7 @@
 ---
 id: hatch3r-dep-audit
 description: Audit and update npm dependencies for security, freshness, and bundle impact. Use when auditing dependencies, responding to CVEs, or upgrading packages.
+tags: [maintenance, security]
 ---
 > **Note:** Commands below use `npm` as an example. Substitute with your project's package manager (`yarn`, `pnpm`, `bun`) or build tool as appropriate.
 
@@ -34,7 +35,7 @@ For critical and high vulnerabilities:
 - **GitHub:** GitHub Security Advisories (`gh api /repos/{owner}/{repo}/security-advisories`)
 - **Azure DevOps:** Azure Artifacts security scanning and Azure Boards advisory tracking
 - **GitLab:** GitLab Dependency Scanning (Security & Compliance → Vulnerability Report)
-- Prioritize: critical first, then high.
+- Prioritize: critical first, then high. Medium/low can be batched.
 - Note any packages with no fix available — document mitigation or deferral rationale.
 
 ## Step 3: Plan Upgrades
@@ -1,6 +1,7 @@
 ---
 id: hatch3r-feature
 description: End-to-end feature implementation workflow. Covers data model, domain logic, API, and UI as a vertical slice. Use when implementing new features or working on feature request issues.
+tags: [core, implementation]
 ---
 > **Note:** Commands below use `npm` as an example. Substitute with your project's package manager (`yarn`, `pnpm`, `bun`) or build tool as appropriate.
 
@@ -1,6 +1,7 @@
 ---
 id: hatch3r-issue-workflow
 description: Guides the 8-step agentic development workflow for issues/work items. Covers parsing issues, loading skills, reading specs, planning, implementing, testing, opening PRs/MRs, and addressing review. Use when working on any issue/work item or when the user mentions an issue number.
+tags: [core, implementation]
 ---
 # Issue Workflow
 
|