hatch3r 1.4.0 → 1.5.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +10 -6
- package/agents/hatch3r-a11y-auditor.md +13 -2
- package/agents/hatch3r-architect.md +20 -1
- package/agents/hatch3r-ci-watcher.md +25 -1
- package/agents/hatch3r-context-rules.md +15 -3
- package/agents/hatch3r-dependency-auditor.md +23 -2
- package/agents/hatch3r-devops.md +11 -0
- package/agents/hatch3r-docs-writer.md +27 -2
- package/agents/hatch3r-fixer.md +46 -3
- package/agents/hatch3r-implementer.md +19 -1
- package/agents/hatch3r-learnings-loader.md +19 -0
- package/agents/hatch3r-lint-fixer.md +11 -0
- package/agents/hatch3r-perf-profiler.md +21 -1
- package/agents/hatch3r-researcher.md +51 -911
- package/agents/hatch3r-reviewer.md +24 -2
- package/agents/hatch3r-security-auditor.md +20 -0
- package/agents/hatch3r-test-writer.md +24 -0
- package/agents/modes/architecture.md +1 -0
- package/agents/modes/boundary-analysis.md +2 -1
- package/agents/modes/codebase-impact.md +1 -0
- package/agents/modes/complexity-risk.md +1 -0
- package/agents/modes/coverage-analysis.md +1 -0
- package/agents/modes/current-state.md +1 -0
- package/agents/modes/feature-design.md +1 -0
- package/agents/modes/impact-analysis.md +1 -0
- package/agents/modes/library-docs.md +2 -1
- package/agents/modes/migration-path.md +1 -0
- package/agents/modes/prior-art.md +1 -0
- package/agents/modes/refactoring-strategy.md +1 -0
- package/agents/modes/regression.md +1 -0
- package/agents/modes/requirements-elicitation.md +1 -0
- package/agents/modes/risk-assessment.md +1 -0
- package/agents/modes/risk-prioritization.md +1 -0
- package/agents/modes/root-cause.md +1 -0
- package/agents/modes/similar-implementation.md +2 -1
- package/agents/modes/symptom-trace.md +1 -0
- package/agents/modes/test-pattern.md +2 -1
- package/agents/shared/external-knowledge.md +10 -0
- package/agents/shared/quality-charter.md +18 -0
- package/checks/README.md +1 -0
- package/checks/accessibility.md +55 -0
- package/commands/board/pickup-azure-devops.md +1 -0
- package/commands/board/pickup-delegation-multi.md +6 -1
- package/commands/board/pickup-delegation.md +1 -0
- package/commands/board/pickup-github.md +1 -0
- package/commands/board/pickup-gitlab.md +1 -0
- package/commands/board/pickup-modes.md +1 -0
- package/commands/board/pickup-post-impl.md +2 -1
- package/commands/board/shared-azure-devops.md +1 -0
- package/commands/board/shared-board-overview.md +1 -0
- package/commands/board/shared-github.md +1 -0
- package/commands/board/shared-gitlab.md +1 -0
- package/commands/hatch3r-agent-customize.md +1 -0
- package/commands/hatch3r-api-spec.md +1 -0
- package/commands/hatch3r-benchmark.md +4 -3
- package/commands/hatch3r-board-fill.md +52 -9
- package/commands/hatch3r-board-groom.md +69 -5
- package/commands/hatch3r-board-init.md +2 -1
- package/commands/hatch3r-board-pickup.md +1 -0
- package/commands/hatch3r-board-refresh.md +1 -0
- package/commands/hatch3r-board-shared.md +34 -3
- package/commands/hatch3r-bug-plan.md +2 -1
- package/commands/hatch3r-codebase-map.md +4 -3
- package/commands/hatch3r-command-customize.md +2 -1
- package/commands/hatch3r-context-health.md +1 -0
- package/commands/hatch3r-cost-tracking.md +1 -0
- package/commands/hatch3r-debug.md +4 -3
- package/commands/hatch3r-dep-audit.md +3 -0
- package/commands/hatch3r-feature-plan.md +3 -2
- package/commands/hatch3r-healthcheck.md +1 -0
- package/commands/hatch3r-hooks.md +5 -0
- package/commands/hatch3r-learn.md +1 -0
- package/commands/hatch3r-migration-plan.md +3 -2
- package/commands/hatch3r-onboard.md +2 -1
- package/commands/hatch3r-project-spec.md +4 -3
- package/commands/hatch3r-quick-change.md +2 -0
- package/commands/hatch3r-recipe.md +1 -0
- package/commands/hatch3r-refactor-plan.md +2 -1
- package/commands/hatch3r-release.md +4 -1
- package/commands/hatch3r-revision.md +2 -1
- package/commands/hatch3r-roadmap.md +5 -4
- package/commands/hatch3r-rule-customize.md +1 -0
- package/commands/hatch3r-security-audit.md +1 -0
- package/commands/hatch3r-skill-customize.md +1 -0
- package/commands/hatch3r-test-plan.md +3 -2
- package/commands/hatch3r-workflow.md +5 -0
- package/dist/cli/index.js +7467 -4582
- package/dist/cli/index.js.map +1 -1
- package/hooks/hatch3r-ci-failure.md +1 -0
- package/hooks/hatch3r-file-save.md +1 -0
- package/hooks/hatch3r-post-merge.md +1 -0
- package/hooks/hatch3r-pre-commit.md +1 -0
- package/hooks/hatch3r-pre-push.md +1 -0
- package/hooks/hatch3r-session-start.md +1 -0
- package/package.json +19 -4
- package/rules/hatch3r-accessibility-standards.md +2 -1
- package/rules/hatch3r-accessibility-standards.mdc +1 -1
- package/rules/hatch3r-agent-orchestration-detail.md +49 -1
- package/rules/hatch3r-agent-orchestration-detail.mdc +47 -1
- package/rules/hatch3r-agent-orchestration.md +87 -5
- package/rules/hatch3r-agent-orchestration.mdc +85 -5
- package/rules/hatch3r-api-design.md +2 -1
- package/rules/hatch3r-api-design.mdc +1 -1
- package/rules/hatch3r-browser-verification.md +4 -2
- package/rules/hatch3r-browser-verification.mdc +1 -0
- package/rules/hatch3r-ci-cd.md +2 -1
- package/rules/hatch3r-ci-cd.mdc +1 -1
- package/rules/hatch3r-code-standards.md +15 -2
- package/rules/hatch3r-code-standards.mdc +12 -0
- package/rules/hatch3r-component-conventions.md +2 -1
- package/rules/hatch3r-component-conventions.mdc +1 -0
- package/rules/hatch3r-data-classification.md +2 -1
- package/rules/hatch3r-data-classification.mdc +1 -1
- package/rules/hatch3r-deep-context.md +26 -1
- package/rules/hatch3r-deep-context.mdc +25 -1
- package/rules/hatch3r-dependency-management.md +2 -1
- package/rules/hatch3r-dependency-management.mdc +1 -1
- package/rules/hatch3r-feature-flags.md +2 -0
- package/rules/hatch3r-feature-flags.mdc +1 -0
- package/rules/hatch3r-git-conventions.md +2 -1
- package/rules/hatch3r-git-conventions.mdc +2 -1
- package/rules/hatch3r-i18n.md +2 -1
- package/rules/hatch3r-i18n.mdc +1 -0
- package/rules/hatch3r-learning-consult.md +11 -1
- package/rules/hatch3r-learning-consult.mdc +11 -1
- package/rules/hatch3r-migrations.md +2 -1
- package/rules/hatch3r-migrations.mdc +1 -1
- package/rules/hatch3r-observability-logging.md +34 -0
- package/rules/hatch3r-observability-logging.mdc +30 -0
- package/rules/hatch3r-observability-metrics.md +74 -0
- package/rules/hatch3r-observability-metrics.mdc +70 -0
- package/rules/hatch3r-observability-tracing-detail.md +160 -0
- package/rules/hatch3r-observability-tracing-detail.mdc +63 -0
- package/rules/hatch3r-observability-tracing.md +86 -0
- package/rules/hatch3r-observability-tracing.mdc +77 -0
- package/rules/hatch3r-observability.md +9 -448
- package/rules/hatch3r-observability.mdc +7 -448
- package/rules/hatch3r-performance-budgets.md +2 -0
- package/rules/hatch3r-performance-budgets.mdc +1 -0
- package/rules/hatch3r-secrets-management.md +2 -1
- package/rules/hatch3r-secrets-management.mdc +1 -1
- package/rules/hatch3r-security-patterns.md +3 -2
- package/rules/hatch3r-security-patterns.mdc +1 -1
- package/rules/hatch3r-testing.md +12 -2
- package/rules/hatch3r-testing.mdc +10 -1
- package/rules/hatch3r-theming.md +3 -2
- package/rules/hatch3r-theming.mdc +1 -0
- package/rules/hatch3r-tooling-hierarchy.md +3 -2
- package/rules/hatch3r-tooling-hierarchy.mdc +1 -1
- package/skills/hatch3r-a11y-audit/SKILL.md +11 -4
- package/skills/hatch3r-agent-customize/SKILL.md +1 -0
- package/skills/hatch3r-api-spec/SKILL.md +9 -2
- package/skills/hatch3r-architecture-review/SKILL.md +7 -0
- package/skills/hatch3r-bug-fix/SKILL.md +16 -7
- package/skills/hatch3r-ci-pipeline/SKILL.md +8 -1
- package/skills/hatch3r-command-customize/SKILL.md +1 -0
- package/skills/hatch3r-context-health/SKILL.md +23 -2
- package/skills/hatch3r-cost-tracking/SKILL.md +16 -6
- package/skills/hatch3r-customize/SKILL.md +8 -1
- package/skills/hatch3r-dep-audit/SKILL.md +9 -2
- package/skills/hatch3r-feature/SKILL.md +12 -4
- package/skills/hatch3r-gh-agentic-workflows/SKILL.md +7 -0
- package/skills/hatch3r-incident-response/SKILL.md +7 -0
- package/skills/hatch3r-issue-workflow/SKILL.md +8 -1
- package/skills/hatch3r-logical-refactor/SKILL.md +8 -1
- package/skills/hatch3r-migration/SKILL.md +7 -0
- package/skills/hatch3r-perf-audit/SKILL.md +9 -2
- package/skills/hatch3r-pr-creation/SKILL.md +8 -1
- package/skills/hatch3r-qa-validation/SKILL.md +8 -1
- package/skills/hatch3r-recipe/SKILL.md +8 -1
- package/skills/hatch3r-refactor/SKILL.md +10 -2
- package/skills/hatch3r-release/SKILL.md +8 -1
- package/skills/hatch3r-rule-customize/SKILL.md +1 -0
- package/skills/hatch3r-skill-customize/SKILL.md +1 -0
- package/skills/hatch3r-visual-refactor/SKILL.md +12 -5
@@ -1,454 +1,13 @@
 ---
-description:
+description: "[Deprecated] Observability conventions -- split into hatch3r-observability-logging, hatch3r-observability-metrics, and hatch3r-observability-tracing"
 alwaysApply: false
 ---
-# Observability
+# Observability (Deprecated Redirect)
 
-
+This rule has been split into three focused rules for maintainability:
 
--
--
--
-- Never log secrets, PII, tokens, passwords, or sensitive content.
-- Instrument key operations with timing metrics. Serverless functions log execution time and outcome.
-- Client-side: log errors to a sink (e.g., error reporting service), not just `console.error`.
-- Prefer event-based metrics over polling. Trace user flows end-to-end with `correlationId`.
-- Respect performance budgets: logging must not add > 10ms latency to hot paths.
-- Include `service`, `environment`, and `version` fields in every log entry for filtering.
-- Use log sampling for high-volume debug logs in production (e.g., 1% sample rate).
+- **`hatch3r-observability-logging`** -- Structured logging and error reporting conventions
+- **`hatch3r-observability-metrics`** -- Metrics, SLO/SLI definitions, alerting, and dashboard standards
+- **`hatch3r-observability-tracing`** -- Distributed tracing, OpenTelemetry semantic conventions, AI agent instrumentation, and correlation IDs
 
-
-
-- Use OpenTelemetry SDK for all tracing instrumentation. Initialize the TracerProvider once at application startup before any instrumented libraries load.
-- Propagate trace context via W3C Trace Context headers (`traceparent`, `tracestate`) across all service boundaries, queues, and async workflows.
-- Span naming conventions:
-
-| Span Type   | Pattern                        | Example                     |
-| ----------- | ------------------------------ | --------------------------- |
-| HTTP server | `HTTP {method} {route}`        | `HTTP GET /api/users/:id`   |
-| HTTP client | `HTTP {method} {host}{path}`   | `HTTP POST api.stripe.com/` |
-| DB query    | `{db.system} {operation}`      | `firestore getDoc`          |
-| Queue       | `{queue} {operation}`          | `tasks-queue publish`       |
-| Internal    | `{module}.{function}`          | `auth.verifyToken`          |
-
-- Required span attributes: `service.name`, `service.version`, `deployment.environment`. Add domain-specific attributes (e.g., `user.id`, `tenant.id`) where relevant.
-- Parent-child span relationships: every outbound call (HTTP, DB, queue) creates a child span of the current context. Never create orphan spans.
-- Sampling strategies: use `ParentBased(TraceIdRatioBased(0.1))` in production (10% sample rate). Always sample errors and slow requests (> p95 latency) at 100%.
-- Use the OpenTelemetry Collector as a gateway between applications and backends to enable batching, retrying, and vendor-neutral export.
-- Keep span event count low (< 32 per span). For high-volume events, use correlated logs or `SpanLink` instead.
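The removed tracing rules rely on the W3C `traceparent` header to carry context across service boundaries. As a rough illustration of what that header contains, the sketch below parses and re-serializes the version `00` layout (version, 32-hex trace ID, 16-hex parent span ID, 2-hex flags). The `TraceParent` type and both function names are hypothetical and not part of hatch3r or the OpenTelemetry API; in practice an OpenTelemetry propagator would normally handle this.

```typescript
// Parsed fields of a W3C Trace Context `traceparent` header (hypothetical shape).
interface TraceParent {
  version: string;
  traceId: string;
  parentId: string;
  sampled: boolean;
}

// Version 00 layout: 2 hex version, 32 hex trace-id, 16 hex parent-id, 2 hex flags.
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceParent(header: string): TraceParent | null {
  const match = TRACEPARENT_RE.exec(header.trim());
  if (match === null) return null;
  const [, version, traceId, parentId, flags] = match;
  // Version "ff" and all-zero IDs are invalid according to the specification.
  if (version === "ff" || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  // Bit 0 of the flags byte is the "sampled" flag.
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

function formatTraceParent(tp: TraceParent): string {
  return `${tp.version}-${tp.traceId}-${tp.parentId}-${tp.sampled ? "01" : "00"}`;
}
```

A service that forwards a request would typically keep the trace ID, substitute its own span ID as the parent ID, and pass the sampled flag through unchanged.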
-
-## Metrics
-
-- Use OpenTelemetry Metrics SDK. Expose Prometheus-compatible `/metrics` endpoint for scraping where applicable.
-- Metric naming: `{service}.{domain}.{metric}_{unit}` in snake_case. Example: `api.auth.login_duration_ms`.
-- Instrument types and when to use:
-
-| Instrument    | Use Case                           | Example                          |
-| ------------- | ---------------------------------- | -------------------------------- |
-| Counter       | Monotonically increasing totals    | `http.requests_total`            |
-| Histogram     | Distributions (latency, size)      | `http.request_duration_ms`       |
-| Gauge         | Point-in-time values               | `db.connection_pool_active`      |
-| UpDownCounter | Values that increase and decrease  | `queue.messages_pending`         |
-
-- Histogram buckets for latency: `[5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000]` ms.
-- Cardinality management: never use unbounded values (user IDs, request paths with params) as metric labels. Cap label cardinality to < 100 unique values per metric.
-- Custom business metrics: track domain-significant events (sign-ups, purchases, feature usage) as counters with relevant dimensions.
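To make the naming and cardinality rules above concrete, here is a minimal sketch of two helpers: one that assembles a name in the `{service}.{domain}.{metric}_{unit}` pattern and rejects parts that are not snake_case, and one that caps the number of distinct label values by folding overflow into an `other` bucket. Both `metricName` and `LabelGuard` are hypothetical illustrations, not hatch3r or OpenTelemetry APIs, and the `other` fallback is an assumption about how a cap might be enforced.

```typescript
const SNAKE_CASE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

// Builds `{service}.{domain}.{metric}_{unit}` and validates each part.
function metricName(service: string, domain: string, metric: string, unit: string): string {
  for (const part of [service, domain, metric, unit]) {
    if (!SNAKE_CASE.test(part)) {
      throw new Error(`metric name part is not snake_case: "${part}"`);
    }
  }
  return `${service}.${domain}.${metric}_${unit}`;
}

// Keeps the number of distinct emitted label values at or below `limit`,
// counting the "other" overflow bucket as one of them.
class LabelGuard {
  private readonly seen = new Set<string>();
  private readonly limit: number;

  constructor(limit: number = 99) {
    this.limit = limit;
  }

  normalize(value: string): string {
    if (this.seen.has(value)) return value;
    if (this.seen.size >= this.limit - 1) return "other";
    this.seen.add(value);
    return value;
  }
}
```

With the default limit of 99, a label never takes more than 99 distinct values, which stays under the "< 100 unique values per metric" cap stated in the removed rule.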
-
-## SLO / SLI Definitions
-
-- Define SLIs as ratios of good events to total events, measured from the user's perspective.
-- Standard SLIs:
-
-| SLI              | Definition                                    | Measurement Source       |
-| ---------------- | --------------------------------------------- | ------------------------ |
-| Availability     | Requests returning non-5xx / total requests   | Load balancer logs       |
-| Latency          | Requests completing < threshold / total       | Tracing p99              |
-| Error rate       | Failed operations / total operations          | Application metrics      |
-| Freshness        | Data updated within SLA / total records       | Background job metrics   |
-
-- SLO targets: set per-service. Typical starting points: 99.9% availability (43 min/month budget), p99 latency < 500ms.
-- Error budgets: `budget = 1 - SLO_target`. Track remaining budget on a rolling 30-day window.
-- Burn rate alerts: use multi-window approach (short + long window). Fast-burn alert: 2% budget consumed in 1 hour. Slow-burn alert: 5% consumed in 6 hours. Alert only when both windows confirm.
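The arithmetic behind the error-budget and burn-rate bullets can be checked with a short sketch. It assumes a 30-day rolling window as stated above; the function names are hypothetical. The burn rate here is defined as the fraction of budget consumed divided by the fraction of the SLO window that has elapsed, so a value of 1 means the budget would last exactly the full window.

```typescript
const MINUTES_PER_DAY = 24 * 60;

// budget = 1 - SLO_target, expressed as allowed "bad" minutes in the window.
function errorBudgetMinutes(sloTarget: number, windowDays: number = 30): number {
  if (sloTarget <= 0 || sloTarget >= 1) {
    throw new RangeError("sloTarget must be strictly between 0 and 1");
  }
  return (1 - sloTarget) * windowDays * MINUTES_PER_DAY;
}

// Burn rate implied by consuming `budgetFraction` of the budget within `alertWindowHours`.
function burnRate(budgetFraction: number, alertWindowHours: number, sloWindowDays: number = 30): number {
  const windowFraction = alertWindowHours / (sloWindowDays * 24);
  return budgetFraction / windowFraction;
}

const monthlyBudget = errorBudgetMinutes(0.999); // about 43.2 minutes
const fastBurn = burnRate(0.02, 1);              // about 14.4x
const slowBurn = burnRate(0.05, 6);              // about 6x
```

The 99.9% target therefore yields roughly the 43 minutes per month quoted above, and the two alert thresholds correspond to burn rates of about 14.4 and 6.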
-
-## Alerting
-
-| Severity | Criteria                              | Response Time     | Notification         |
-| -------- | ------------------------------------- | ----------------- | -------------------- |
-| P1       | Service down, data loss risk          | 15 min            | Page on-call + Slack |
-| P2       | Degraded performance, SLO at risk     | 1 hour            | Page on-call         |
-| P3       | Non-critical issue, workaround exists | Next business day | Slack channel        |
-| P4       | Cosmetic / low-impact                 | Sprint backlog    | Ticket only          |
-
-- Every alert must link to a runbook with: symptoms, likely causes, diagnostic steps, remediation actions.
-- Alert fatigue prevention: tune thresholds to < 5 actionable alerts per on-call shift. Suppress duplicate alerts within a 10-minute dedup window.
-- Route alerts by service ownership. Use escalation policies: if P1/P2 unacknowledged in 15 min, escalate to secondary.
-- Review alert quality monthly: snooze/delete alerts with < 20% action rate.
-
-## Structured Error Reporting
-
-- Integrate Sentry (or equivalent) for automated error capture in both server and client environments.
-- Configure release tracking: tag errors with `release` (git SHA or semver) and upload source maps for readable stack traces.
-- Enable breadcrumbs: capture the last 50 user actions, network requests, and console messages leading to an error.
-- Error grouping: use custom fingerprints for domain-specific errors to prevent over-grouping. Default fingerprinting is acceptable for unhandled exceptions.
-- Enrich error context with `correlationId`, `userId`, environment, and relevant business state. Never attach PII or secrets.
-- Set sample rates: 100% for errors, 10% for transactions in production. Adjust based on volume and budget.
-
-## Dashboard Standards
-
-- Required dashboards per service:
-
-| Dashboard        | Contents                                                       |
-| ---------------- | -------------------------------------------------------------- |
-| Service Health   | Request rate, error rate, latency p50/p95/p99, saturation      |
-| Business Metrics | Key domain counters, conversion funnels, feature adoption      |
-| Dependencies     | Upstream/downstream latency, error rates, circuit breaker state |
-| Infrastructure   | CPU, memory, disk, connection pools, queue depth               |
-
-- Dashboard-as-code: define dashboards in version-controlled JSON/YAML (Grafana provisioning, Terraform, or equivalent). No manual dashboard creation in production.
-- Every dashboard panel includes: descriptive title, unit labels, threshold lines for SLO targets, and a link to the relevant runbook or alert.
-- Review dashboards quarterly: remove unused panels, update thresholds, verify data source accuracy.
-
-## OpenTelemetry Semantic Conventions
-
-Follow the [OpenTelemetry Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/) (v1.29+) for consistent attribute naming across all telemetry signals. Semantic conventions ensure interoperability between instrumentation libraries, collectors, and observability backends.
-
-### Standard Attribute Namespaces
-
-| Namespace | Scope | Key Attributes |
-|-----------|-------|----------------|
-| `http.*` | HTTP client and server spans | `http.request.method`, `http.response.status_code`, `http.route`, `url.full`, `url.scheme` |
-| `db.*` | Database client spans | `db.system` (e.g., `postgresql`, `mongodb`), `db.operation.name`, `db.collection.name`, `db.query.text` (sanitized) |
-| `rpc.*` | RPC client and server spans | `rpc.system` (e.g., `grpc`, `jsonrpc`), `rpc.service`, `rpc.method`, `rpc.grpc.status_code` |
-| `messaging.*` | Message queue spans | `messaging.system` (e.g., `kafka`, `rabbitmq`), `messaging.operation.type` (`publish`, `receive`, `process`), `messaging.destination.name` |
-| `faas.*` | Serverless/FaaS invocations | `faas.trigger` (`http`, `pubsub`, `timer`), `faas.invoked_name`, `faas.coldstart` |
-| `cloud.*` | Cloud provider context | `cloud.provider`, `cloud.region`, `cloud.availability_zone`, `cloud.account.id` |
-| `k8s.*` | Kubernetes context | `k8s.namespace.name`, `k8s.pod.name`, `k8s.deployment.name`, `k8s.container.name` |
-
-- Use the semantic convention attribute names exactly as specified. Do not invent custom alternatives for concepts already covered by the conventions.
-- When semantic conventions are marked "Experimental," prefer them over project-specific names to ease future migration to stable conventions.
-
-### Resource Semantic Conventions
-
-Every telemetry-producing service must declare resource attributes at startup:
-
-| Attribute | Stability | Requirement | Description |
-|-----------|-----------|-------------|-------------|
-| `service.name` | Stable | Required | Logical name of the service (e.g., `api-gateway`, `auth-service`) |
-| `service.version` | Stable | Recommended | Semantic version of the service (e.g., `1.4.2`) |
-| `deployment.environment.name` | Stable | Recommended | Deployment environment (e.g., `production`, `staging`, `development`) |
-| `service.instance.id` | Experimental | Recommended | Unique instance identifier (e.g., pod name, container ID) |
-| `service.namespace` | Experimental | Optional | Namespace for grouping related services |
-| `telemetry.sdk.name` | Stable | Auto | Set by the SDK (e.g., `opentelemetry`) |
-| `telemetry.sdk.language` | Stable | Auto | Set by the SDK (e.g., `nodejs`, `python`) |
-| `telemetry.sdk.version` | Stable | Auto | Set by the SDK |
-
-- Configure `service.name` and `service.version` via environment variables (`OTEL_SERVICE_NAME`, `OTEL_RESOURCE_ATTRIBUTES`) or programmatically at SDK initialization.
-- Do not use the default `unknown_service` value in any deployed environment. Every service must have an explicit name.
-
-### Span Status Codes
-
-| Code | When to Set |
-|------|-------------|
-| `UNSET` | Default. The span completed without the instrumentation indicating an error. |
-| `OK` | Explicitly set only when the application considers the operation successful and wants to override any lower-level error signal. Use sparingly. |
-| `ERROR` | The operation failed. Set when an exception is caught, an HTTP response is 5xx, or a business-logic error occurs that should be visible in error rate metrics. |
-
-- Set span status to `ERROR` for server-side errors (5xx) and unhandled exceptions. Do not set `ERROR` for client errors (4xx) on the server span — those are valid responses, not server failures.
-- Attach the exception to the span as a span event (`exception.type`, `exception.message`, `exception.stacktrace`) when setting status to `ERROR`.
-- Use `OK` only when you want to suppress error signals from child spans. In most cases, leaving status as `UNSET` is correct.
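The status rules above reduce to a small decision function for server spans. The sketch below is illustrative only: `SpanStatus` is a local string union standing in for OpenTelemetry's status codes, and `serverSpanStatus` is a hypothetical helper name.

```typescript
// Local stand-in for the three span status codes described in the table above.
type SpanStatus = "UNSET" | "OK" | "ERROR";

// Decides the status of a *server* span from the HTTP response code.
// 5xx responses and unhandled exceptions are errors; 4xx responses are
// valid answers to bad requests, so the status is left UNSET.
function serverSpanStatus(httpStatus: number, unhandledException: boolean = false): SpanStatus {
  if (unhandledException || httpStatus >= 500) return "ERROR";
  return "UNSET";
}
```

The function never returns `OK`, reflecting the guidance that `OK` is reserved for deliberately overriding lower-level error signals.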
-
-### Attribute Naming Guidelines
-
-- Use dot-separated namespaces: `http.request.method`, not `httpRequestMethod` or `http_request_method`.
-- Attribute values should be low-cardinality. Never use unbounded values (full URLs with query params, raw SQL, user-generated content) as attribute values.
-- For high-cardinality identifiers (user IDs, request IDs), use span attributes sparingly and rely on correlated logs for detail.
-- Prefer semantic convention attributes over custom attributes. When custom attributes are necessary, prefix them with your organization or project namespace (e.g., `myapp.feature.flag_key`).
-
-### AI Agent Semantic Conventions
-
-Follow the [OpenTelemetry GenAI Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/) (experimental, introduced 2024) for instrumenting AI/LLM agent systems. These conventions provide consistent attribute naming for generative AI operations, enabling interoperability across agent frameworks and observability backends.
-
-#### `gen_ai.*` Span Attributes
-
-Use these attributes on all spans that represent interactions with generative AI models:
-
-| Attribute | Type | Description | Example |
-|-----------|------|-------------|---------|
-| `gen_ai.system` | string | The GenAI provider system name | `openai`, `anthropic`, `azure_openai` |
-| `gen_ai.request.model` | string | Model name as specified in the request | `gpt-4o`, `claude-sonnet-4-20250514` |
-| `gen_ai.response.model` | string | Model name as returned in the response (may differ from request) | `gpt-4o-2024-08-06` |
-| `gen_ai.request.max_tokens` | int | Maximum number of tokens requested for generation | `4096` |
-| `gen_ai.request.temperature` | float | Temperature parameter sent in the request | `0.7` |
-| `gen_ai.request.top_p` | float | Top-p (nucleus sampling) parameter | `0.9` |
-| `gen_ai.response.finish_reasons` | string[] | Reasons the model stopped generating | `["stop"]`, `["length"]`, `["tool_calls"]` |
-| `gen_ai.usage.input_tokens` | int | Number of tokens in the input/prompt | `1250` |
-| `gen_ai.usage.output_tokens` | int | Number of tokens in the generated output | `530` |
-
-- Always set `gen_ai.system` and `gen_ai.request.model` on every GenAI span. These are required for meaningful filtering and cost attribution.
-- Record `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` from the API response to enable token usage dashboards and cost tracking.
-- Use `gen_ai.response.finish_reasons` to detect truncated outputs (`length`) and trigger re-prompting or alerting logic.
-
-#### Agent Invocation Spans
-
-Instrument the full lifecycle of an agent invocation with a dedicated span. This span is the parent for all LLM calls, tool executions, and sub-agent delegations within a single agent run.
-
-- **Span name pattern:** `agent.{agent_name}.invoke` (e.g., `agent.code_reviewer.invoke`, `agent.research_assistant.invoke`)
-- **Required attributes:**
-
-| Attribute | Type | Description | Example |
-|-----------|------|-------------|---------|
-| `agent.id` | string | Unique identifier for this agent invocation | `agent-run-a1b2c3d4` |
-| `agent.name` | string | Logical name of the agent | `code_reviewer` |
-| `agent.parent_id` | string | ID of the parent agent (for sub-agent delegation chains) | `agent-run-x9y8z7` |
-| `agent.task` | string | High-level description of the agent's assigned task | `review PR #42` |
-| `agent.framework` | string | Agent framework in use | `langchain`, `autogen`, `custom` |
-
-- **Span events for state transitions:** Record span events to mark key lifecycle transitions within the agent invocation:
-  - `agent.planning` — Agent begins task decomposition or reasoning.
-  - `agent.tool_selection` — Agent selects a tool to invoke.
-  - `agent.awaiting_human` — Agent pauses for human-in-the-loop confirmation.
-  - `agent.delegating` — Agent spawns a sub-agent.
-  - `agent.completed` — Agent finishes its task and produces a final output.
-  - `agent.error` — Agent encounters a non-recoverable error. Include `exception.type` and `exception.message` attributes on the event.
-
-```typescript
-const agentSpan = tracer.startSpan('agent.code_reviewer.invoke', {
-  attributes: {
-    'agent.id': invocationId,
-    'agent.name': 'code_reviewer',
-    'agent.parent_id': parentAgentId ?? '',
-    'agent.task': `review PR #${prNumber}`,
-    'agent.framework': 'custom',
-  },
-});
-
-agentSpan.addEvent('agent.planning');
-// ... agent reasoning and tool calls happen as child spans ...
-agentSpan.addEvent('agent.completed');
-agentSpan.end();
-```
-
-#### Tool Call Spans
-
-Every tool invocation by an agent creates a child span of the agent invocation span. This enables tracing the full sequence of tool calls within an agent run, measuring tool latency, and detecting tool failures.
-
-- **Span name pattern:** `tool.{tool_name}.execute` (e.g., `tool.file_read.execute`, `tool.web_search.execute`)
-- **Required attributes:**
-
-| Attribute | Type | Description | Example |
-|-----------|------|-------------|---------|
-| `tool.name` | string | Canonical name of the tool | `file_read`, `git_diff`, `web_search` |
-| `tool.input_hash` | string | SHA-256 hash of the tool input (for deduplication, not logging raw input) | `sha256:3a7f...` |
-| `tool.output_status` | string | Outcome of the tool execution | `success`, `error`, `timeout`, `rejected` |
-| `tool.duration_ms` | float | Wall-clock execution time of the tool in milliseconds | `142.5` |
-| `tool.parameters_count` | int | Number of parameters passed to the tool | `3` |
-
-- **Parent-child relationship:** Tool spans must be children of the invoking agent span. Use `context.with(trace.setSpan(context.active(), agentSpan))` to propagate the agent span context to tool execution.
-- Set span status to `ERROR` when `tool.output_status` is `error` or `timeout`. Attach exception details as a span event.
-- For tools that perform I/O (HTTP requests, file system operations, database queries), create nested child spans using the appropriate semantic conventions (`http.*`, `db.*`) under the tool span.
-
-```typescript
-const toolSpan = tracer.startSpan(
-  'tool.git_diff.execute',
-  { attributes: { 'tool.name': 'git_diff' } },
-  trace.setSpan(context.active(), agentSpan),
-);
-
-const startTime = performance.now();
-try {
-  const result = await tools.gitDiff(params);
-  toolSpan.setAttributes({
-    'tool.output_status': 'success',
-    'tool.duration_ms': performance.now() - startTime,
-    'tool.input_hash': hashInput(params),
-  });
-} catch (err) {
-  toolSpan.setAttributes({
-    'tool.output_status': 'error',
-    'tool.duration_ms': performance.now() - startTime,
-  });
-  toolSpan.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
-  toolSpan.recordException(err);
-  throw err;
-} finally {
-  toolSpan.end();
-}
-```
-
-#### LLM Request/Response Tracing
-
-Instrument every LLM API call with a dedicated span. These spans are typically children of an agent invocation span and capture model, token usage, and latency data for cost analysis and performance monitoring.
-
-- **Span name pattern:** `gen_ai.{operation}` (e.g., `gen_ai.chat`, `gen_ai.completion`, `gen_ai.embeddings`)
-- **Required attributes:** All applicable `gen_ai.*` attributes from the table above, plus:
-
-| Attribute | Type | Description | Example |
-|-----------|------|-------------|---------|
-| `gen_ai.operation.name` | string | The specific API operation | `chat`, `completion`, `embeddings` |
-| `gen_ai.request.stop_sequences` | string[] | Stop sequences sent in the request | `["\n\n", "END"]` |
-| `server.address` | string | Hostname of the GenAI API endpoint | `api.openai.com` |
-| `server.port` | int | Port of the GenAI API endpoint | `443` |
-
-- **Input/output token tracking:** Always capture `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` from the API response. Aggregate these in metrics for cost dashboards:
-  - Counter: `gen_ai.tokens_total` with labels `{direction=input|output, model, agent_name}`
-  - Histogram: `gen_ai.request_duration_ms` with labels `{model, operation, agent_name}`
-
-- **Model version tracking:** Record both `gen_ai.request.model` (what was requested) and `gen_ai.response.model` (what was actually used). API providers may silently route to different model versions; capturing both enables drift detection.
-
-- **Error handling and retry spans:** When an LLM request fails and is retried, each attempt is a separate child span under the same parent. Record the error on the failed span and create a new span for the retry:
-  - Set `gen_ai.request.retries` (int) on the final successful span to indicate total retry count.
-  - Record `http.response.status_code` on failed spans to distinguish rate-limit errors (429) from server errors (500+).
-  - Use exponential backoff; the retry span's start time naturally captures the wait duration.
|
|
302
|
-
|
|
303
|
-
```typescript
|
|
304
|
-
const llmSpan = tracer.startSpan(
|
|
305
|
-
'gen_ai.chat',
|
|
306
|
-
{
|
|
307
|
-
attributes: {
|
|
308
|
-
'gen_ai.system': 'openai',
|
|
309
|
-
'gen_ai.operation.name': 'chat',
|
|
310
|
-
'gen_ai.request.model': 'gpt-4o',
|
|
311
|
-
'gen_ai.request.max_tokens': 4096,
|
|
312
|
-
'gen_ai.request.temperature': 0.2,
|
|
313
|
-
'server.address': 'api.openai.com',
|
|
314
|
-
},
|
|
315
|
-
},
|
|
316
|
-
trace.setSpan(context.active(), agentSpan),
|
|
317
|
-
);
|
|
318
|
-
|
|
319
|
-
try {
|
|
320
|
-
const response = await openai.chat.completions.create({ /* ... */ });
|
|
321
|
-
llmSpan.setAttributes({
|
|
322
|
-
'gen_ai.response.model': response.model,
|
|
323
|
-
'gen_ai.response.finish_reasons': response.choices.map(c => c.finish_reason),
|
|
324
|
-
'gen_ai.usage.input_tokens': response.usage.prompt_tokens,
|
|
325
|
-
'gen_ai.usage.output_tokens': response.usage.completion_tokens,
|
|
326
|
-
});
|
|
327
|
-
|
|
328
|
-
// Record token usage in metrics for cost tracking
|
|
329
|
-
tokenCounter.add(response.usage.prompt_tokens, {
|
|
330
|
-
direction: 'input', model: response.model, agent_name: agentName,
|
|
331
|
-
});
|
|
332
|
-
tokenCounter.add(response.usage.completion_tokens, {
|
|
333
|
-
direction: 'output', model: response.model, agent_name: agentName,
|
|
334
|
-
});
|
|
335
|
-
} catch (err) {
|
|
336
|
-
llmSpan.setStatus({ code: SpanStatusCode.ERROR, message: (err as Error).message });
|
|
337
|
-
llmSpan.recordException(err as Error);
|
|
338
|
-
throw err;
|
|
339
|
-
} finally {
|
|
340
|
-
llmSpan.end();
|
|
341
|
-
}
|
|
342
|
-
```
|
|
343
|
-
|
|
344
|
-
- Never log raw prompt content or full model responses as span attributes — these are high-cardinality and may contain sensitive data. Use `gen_ai.usage.*` token counts for cost tracking and correlated logs for prompt debugging in non-production environments.
|
|
345
|
-
- In production, sample GenAI spans at a higher rate than general spans (e.g., 50-100%) because each call is expensive and lower volume than typical HTTP traffic. Adjust sampling based on call volume and observability budget.
|
|
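The retry guidance in this section (one span per attempt, status code on failed spans, retry count on the final span) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the package's implementation: `callWithRetrySpans`, `AttemptSpan`, and `RetryOptions` are hypothetical names, the span is reduced to the two methods the sketch needs (intended to be structurally compatible with an OpenTelemetry span), and treating 429 and 5xx as the only retryable statuses is an assumption.

```typescript
// Minimal span shape needed here (hypothetical; a real tracer's span would be adapted to it).
interface AttemptSpan {
  setAttribute(key: string, value: string | number): unknown;
  end(): void;
}

interface RetryOptions {
  spanName: string;                         // e.g. 'gen_ai.chat'
  maxAttempts: number;                      // total attempts, including the first
  baseDelayMs: number;                      // first backoff delay
  startSpan: (name: string) => AttemptSpan; // creates a child span under the agent span
  sleep?: (ms: number) => Promise<void>;    // injectable for tests
}

async function callWithRetrySpans<T>(
  operation: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 0; ; attempt++) {
    // Each attempt gets its own span, so failed attempts stay visible in the trace.
    const span = opts.startSpan(opts.spanName);
    try {
      const result = await operation();
      // Only the final successful span carries the total retry count.
      span.setAttribute('gen_ai.request.retries', attempt);
      return result;
    } catch (err) {
      const status = (err as { status?: number }).status;
      if (typeof status === 'number') {
        span.setAttribute('http.response.status_code', status);
      }
      // Assumption: only rate limits (429) and server errors (500+) are retried.
      const retryable = typeof status === 'number' && (status === 429 || status >= 500);
      if (!retryable || attempt + 1 >= opts.maxAttempts) throw err;
    } finally {
      span.end();
    }
    // Exponential backoff: the next span starts after this wait.
    await sleep(opts.baseDelayMs * 2 ** attempt);
  }
}
```

Because the span ends before the backoff sleep, the gap between one attempt's end and the next attempt's start shows the wait duration in the trace.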
346
|
-
|
|
347
|
-
### Tool Call Audit Trail
|
|
348
|
-
|
|
349
|
-
Maintain a structured audit log for every tool invocation in agentic workflows. This log is separate from tracing spans and serves as an immutable compliance and debugging record.
|
|
350
|
-
|
|
351
|
-
#### Schema Definition
|
|
352
|
-
|
|
353
|
-
Every tool call audit log entry must include the following fields:
|
|
354
|
-
|
|
355
|
-
| Field | Type | Description |
|
|
356
|
-
|-------|------|-------------|
|
|
357
|
-
| `tool.name` | string | Name of the tool invoked |
|
|
358
|
-
| `tool.input_hash` | string | SHA-256 hash of the tool input (for privacy, never log raw input) |
|
|
359
|
-
| `tool.output_status` | string | Outcome of the tool execution: `success`, `error`, `timeout`, or `denied` |
|
|
360
|
-
| `tool.duration_ms` | float | Execution time in milliseconds |
|
|
361
|
-
| `agent.id` | string | ID of the agent that invoked the tool |
|
|
362
|
-
| `agent.name` | string | Human-readable agent name |
|
|
363
|
-
| `correlation.id` | string | Trace correlation ID linking this entry to the broader workflow |
|
|
364
|
-
| `timestamp` | string | ISO 8601 timestamp of the invocation |
|
|
365
|
-
| `session.id` | string | Session identifier for grouping related tool calls |
|
|
366
|
-
|
|
367
|
-
#### Logging Requirements
|
|
368
|
-
|
|
369
|
-
- Log every tool invocation at `info` level with the full schema above.
|
|
370
|
-
- Log tool failures at `error` level with additional `error.type` and `error.message` fields describing the failure.
|
|
371
|
-
- Aggregate tool call counts per agent per session for anomaly detection (e.g., an agent invoking an unusual number of tools may indicate a loop or misconfiguration).
|
|
372
|
-
- Retain audit logs for a minimum of 90 days to support post-incident investigation and compliance review.
|
|
373
|
-
|
|
374
|
-
#### Example Log Entry
|
|
375
|
-
|
|
376
|
-
```json
|
|
377
|
-
{
|
|
378
|
-
"timestamp": "2026-02-15T14:32:07.891Z",
|
|
379
|
-
"level": "info",
|
|
380
|
-
"correlation.id": "agent-run-550e8400-e29b-41d4-a716-446655440000",
|
|
381
|
-
"session.id": "sess-8f14e45f-ceea-467f-a8f0-3b5c6d7e8f9a",
|
|
382
|
-
"agent.id": "agent-run-a1b2c3d4",
|
|
383
|
-
"agent.name": "code_reviewer",
|
|
384
|
-
"tool.name": "git_diff",
|
|
385
|
-
"tool.input_hash": "sha256:3a7f2c9e8b1d4f6a0e5c7b9d2f4a6e8c0b3d5f7a9e1c3b5d7f9a2c4e6b8d0f",
|
|
386
|
-
"tool.output_status": "success",
|
|
387
|
-
"tool.duration_ms": 142.5
|
|
388
|
-
}
|
|
389
|
-
```
|
|
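An entry like the one above could be assembled as in the following sketch. It is illustrative only: `hashToolInput`, `buildAuditEntry`, and `ToolCallRecord` are hypothetical names, serializing the input with `JSON.stringify` before hashing is an assumption, and so is treating every non-`success` status as an `error`-level entry.

```typescript
import { createHash } from 'node:crypto';

type ToolOutputStatus = 'success' | 'error' | 'timeout' | 'denied';

// Field names follow the schema table; error.* fields are only set on failures.
interface ToolAuditEntry {
  timestamp: string;
  level: 'info' | 'error';
  'correlation.id': string;
  'session.id': string;
  'agent.id': string;
  'agent.name': string;
  'tool.name': string;
  'tool.input_hash': string;
  'tool.output_status': ToolOutputStatus;
  'tool.duration_ms': number;
  'error.type'?: string;
  'error.message'?: string;
}

// Hypothetical in-memory description of a finished tool call.
interface ToolCallRecord {
  correlationId: string;
  sessionId: string;
  agentId: string;
  agentName: string;
  toolName: string;
  input: Record<string, unknown>;
  status: ToolOutputStatus;
  durationMs: number;
  error?: Error;
}

// Hash the serialized input so the raw value never reaches the log.
// Assumption: JSON.stringify is an acceptable serialization; it is sensitive
// to key order, so reordered but otherwise equal inputs hash differently.
function hashToolInput(input: Record<string, unknown>): string {
  const digest = createHash('sha256').update(JSON.stringify(input)).digest('hex');
  return `sha256:${digest}`;
}

function buildAuditEntry(call: ToolCallRecord, now: Date = new Date()): ToolAuditEntry {
  const entry: ToolAuditEntry = {
    timestamp: now.toISOString(),
    // Assumption: anything other than 'success' is logged at error level.
    level: call.status === 'success' ? 'info' : 'error',
    'correlation.id': call.correlationId,
    'session.id': call.sessionId,
    'agent.id': call.agentId,
    'agent.name': call.agentName,
    'tool.name': call.toolName,
    'tool.input_hash': hashToolInput(call.input),
    'tool.output_status': call.status,
    'tool.duration_ms': call.durationMs,
  };
  if (call.error) {
    entry['error.type'] = call.error.name;
    entry['error.message'] = call.error.message;
  }
  return entry;
}
```

Writing `JSON.stringify(buildAuditEntry(call))` to the log sink would produce one line per tool invocation in the shape shown above.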
390
|
-
|
|
391
|
-
### Correlation IDs for Agent Workflows
|
|
392
|
-
|
|
393
|
-
Correlation IDs provide the connective thread linking all telemetry signals (logs, spans, metrics) across a multi-agent workflow. Every participant in the workflow uses the same correlation ID, enabling end-to-end traceability from the initial trigger through all agent delegations and tool calls.
|
|
394
|
-
|
|
395
|
-
#### ID Generation
|
|
396
|
-
|
|
397
|
-
- Use UUIDv4 for correlation IDs. Generate the ID at the workflow entry point (the first agent invocation or the orchestrator that initiates the run).
|
|
398
|
-
- Format: `{workflow-type}-{uuid}` (e.g., `agent-run-550e8400-e29b-41d4-a716-446655440000`, `review-flow-7c9e6679-7425-40de-944b-e07fc1f90ae7`).
|
|
399
|
-
- The workflow-type prefix provides human-readable context when scanning logs and makes it possible to filter by workflow category without parsing the full ID.
|
|
400
|
-
|
|
401
|
-
#### Propagation
|
|
402
|
-
|
|
403
|
-
- The correlation ID propagates from the parent agent to all sub-agents via context. Pass it explicitly when delegating to sub-agents or invoking tools.
|
|
404
|
-
- Every log entry, span, and metric produced during the workflow must include the `correlation.id` attribute.
|
|
405
|
-
- When crossing process boundaries (e.g., HTTP calls between services), propagate the correlation ID via a custom header (`X-Correlation-ID`) alongside standard W3C Trace Context headers.
|
|
406
|
-
|
|
407
|
-
#### Parent-Child Span Linking
|
|
408
|
-
|
|
409
|
-
- The parent agent's span ID becomes the `parent_span_id` attribute on child agent spans, establishing a clear hierarchy in trace visualizations.
|
|
410
|
-
- For cross-workflow references (e.g., an agent run triggered by a CI pipeline event), use OpenTelemetry `SpanLink` to connect the agent workflow trace to the originating trace without creating a parent-child relationship.
|
|
411
|
-
- SpanLinks preserve the independence of each workflow trace while enabling navigation between related workflows in the observability backend.
|
|
412
|
-
|
|
413
|
-
#### Implementation Pattern
|
|
414
|
-
|
|
415
|
-
```typescript
|
|
416
|
-
import { randomUUID } from 'node:crypto';
|
|
417
|
-
import { context, trace, SpanStatusCode } from '@opentelemetry/api';
|
|
418
|
-
|
|
419
|
-
function generateCorrelationId(workflowType: string): string {
|
|
420
|
-
return `${workflowType}-${randomUUID()}`;
|
|
421
|
-
}
|
|
422
|
-
|
|
423
|
-
async function runAgentWorkflow(task: string): Promise<void> {
|
|
424
|
-
const correlationId = generateCorrelationId('agent-run');
|
|
425
|
-
const tracer = trace.getTracer('agent-orchestrator');
|
|
426
|
-
|
|
427
|
-
const rootSpan = tracer.startSpan('agent.orchestrator.invoke', {
|
|
428
|
-
attributes: {
|
|
429
|
-
'correlation.id': correlationId,
|
|
430
|
-
'agent.name': 'orchestrator',
|
|
431
|
-
'agent.task': task,
|
|
432
|
-
},
|
|
433
|
-
});
|
|
434
|
-
|
|
435
|
-
const ctx = trace.setSpan(context.active(), rootSpan);
|
|
436
|
-
|
|
437
|
-
try {
|
|
438
|
-
// Run the delegation inside ctx so child spans attach to rootSpan; the correlation ID is passed explicitly
|
|
439
|
-
await context.with(ctx, async () => {
|
|
440
|
-
await delegateToSubAgent('code_reviewer', {
|
|
441
|
-
correlationId,
|
|
442
|
-
parentSpanId: rootSpan.spanContext().spanId,
|
|
443
|
-
task: 'review changes',
|
|
444
|
-
});
|
|
445
|
-
});
|
|
446
|
-
} catch (err) {
|
|
447
|
-
rootSpan.setStatus({ code: SpanStatusCode.ERROR, message: (err as Error).message });
|
|
448
|
-
rootSpan.recordException(err as Error);
|
|
449
|
-
throw err;
|
|
450
|
-
} finally {
|
|
451
|
-
rootSpan.end();
|
|
452
|
-
}
|
|
453
|
-
}
|
|
454
|
-
```
|
|
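For the process-boundary case described under Propagation, the `X-Correlation-ID` header could be attached and read back roughly as sketched here. The helper names `withCorrelationHeader` and `resolveCorrelationId` are hypothetical, and generating a fresh ID when the header is missing is an assumption about the desired fallback. W3C Trace Context headers are left to the tracing library's propagator and are not shown.

```typescript
import { randomUUID } from 'node:crypto';

// HTTP header names are case-insensitive; this sketch uses the lowercase form.
const CORRELATION_HEADER = 'x-correlation-id';

type IncomingHeaders = Record<string, string | string[] | undefined>;

// Outbound: copy the headers and attach the current workflow's correlation ID.
function withCorrelationHeader(
  headers: Record<string, string>,
  correlationId: string,
): Record<string, string> {
  return { ...headers, [CORRELATION_HEADER]: correlationId };
}

// Inbound: reuse the caller's correlation ID when present.
// Assumption: a missing or blank header means this service is the workflow
// entry point, so a new `{workflow-type}-{uuid}` ID is generated.
function resolveCorrelationId(headers: IncomingHeaders, workflowType: string): string {
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() !== CORRELATION_HEADER) continue;
    const first = Array.isArray(value) ? value[0] : value;
    if (first && first.trim() !== '') return first.trim();
  }
  return `${workflowType}-${randomUUID()}`;
}
```

A receiving service would call `resolveCorrelationId(req.headers, 'agent-run')` once at its entry point and then attach the result as `correlation.id` to every log entry, span, and metric it emits.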
13
|
+
Load the specific rule that matches your task scope instead of this file.
|
|
@@ -3,7 +3,9 @@ id: hatch3r-performance-budgets
|
|
|
3
3
|
type: rule
|
|
4
4
|
description: Performance budgets and targets for the project
|
|
5
5
|
scope: conditional
|
|
6
|
+
globs: "**/*perf*,**/*benchmark*,**/*budget*,**/lighthouse*,**/*.perf.*"
|
|
6
7
|
tags: [performance]
|
|
8
|
+
quality_charter: agents/shared/quality-charter.md
|
|
7
9
|
---
|
|
8
10
|
# Performance Budgets
|
|
9
11
|
|
|
@@ -2,8 +2,9 @@
|
|
|
2
2
|
id: hatch3r-secrets-management
|
|
3
3
|
type: rule
|
|
4
4
|
description: Secret management, rotation, and secure handling patterns for the project
|
|
5
|
-
scope:
|
|
5
|
+
scope: "**/.env*,**/*secret*,**/*credential*,**/*token*,**/config/**,**/.gitignore,**/vault/**,**/*auth*.config*"
|
|
6
6
|
tags: [security]
|
|
7
|
+
quality_charter: agents/shared/quality-charter.md
|
|
7
8
|
---
|
|
8
9
|
# Secrets Management
|
|
9
10
|
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
description: Secret management, rotation, and secure handling patterns for the project
|
|
3
|
-
|
|
3
|
+
globs: ["**/.env*", "**/*secret*", "**/*credential*", "**/*token*", "**/config/**", "**/.gitignore", "**/vault/**", "**/*auth*.config*"]
|
|
4
4
|
---
|
|
5
5
|
# Secrets Management
|
|
6
6
|
|
|
@@ -2,8 +2,9 @@
|
|
|
2
2
|
id: hatch3r-security-patterns
|
|
3
3
|
type: rule
|
|
4
4
|
description: Security patterns including input validation, auth enforcement, and AI/agentic security for the project
|
|
5
|
-
scope:
|
|
5
|
+
scope: "**/auth/**,**/security/**,**/middleware/**,**/*auth*,**/*guard*,**/*policy*,**/*permission*,**/*sanitiz*,**/*validat*"
|
|
6
6
|
tags: [security]
|
|
7
|
+
quality_charter: agents/shared/quality-charter.md
|
|
7
8
|
---
|
|
8
9
|
# Security Patterns
|
|
9
10
|
|
|
@@ -202,7 +203,7 @@ tags: [security]
|
|
|
202
203
|
### A08 — Software and Data Integrity Failures
|
|
203
204
|
|
|
204
205
|
- Verify integrity of all software updates, dependencies, and CI/CD pipeline artifacts using digital signatures or checksums.
|
|
205
|
-
- Use lockfiles and verify their integrity. `npm ci` (not `npm install`) in CI
|
|
206
|
+
- Use lockfiles and verify their integrity. Run `npm ci` (not `npm install`) in CI for deterministic builds that fail on lockfile drift.
|
|
206
207
|
- CI/CD pipelines: require code review for all changes, enforce branch protection, sign commits where feasible.
|
|
207
208
|
- Never deserialize untrusted data without validation. Use schemas (zod, JSON Schema) to validate structure before processing.
|
|
208
209
|
- Protect CI/CD secrets and permissions: restrict who can modify pipeline configuration, require approval for deployment steps.
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
description: Security patterns including input validation, auth enforcement, and AI/agentic security for the project
|
|
3
|
-
|
|
3
|
+
globs: ["**/auth/**", "**/security/**", "**/middleware/**", "**/*auth*", "**/*guard*", "**/*policy*", "**/*permission*", "**/*sanitiz*", "**/*validat*"]
|
|
4
4
|
---
|
|
5
5
|
# Security Patterns
|
|
6
6
|
|
package/rules/hatch3r-testing.md
CHANGED
|
@@ -2,8 +2,9 @@
|
|
|
2
2
|
id: hatch3r-testing
|
|
3
3
|
type: rule
|
|
4
4
|
description: Test standards and conventions for the project
|
|
5
|
-
scope:
|
|
5
|
+
scope: "**/*.test.*,**/*.spec.*,**/__tests__/**,**/tests/**,**/test/**,**/*.cy.*,**/playwright/**,**/vitest.config.*,**/jest.config.*,**/cypress.config.*"
|
|
6
6
|
tags: [core]
|
|
7
|
+
quality_charter: agents/shared/quality-charter.md
|
|
7
8
|
---
|
|
8
9
|
# Testing Standards
|
|
9
10
|
|
|
@@ -80,11 +81,20 @@ tags: [core]
|
|
|
80
81
|
- **Fixture files** (JSON, YAML) are acceptable for large, complex, or externally-sourced test inputs (API response snapshots, configuration samples). Store in `tests/fixtures/`.
|
|
81
82
|
- **Database state:** Integration tests that require database state must set up and tear down within the test using helpers. Never depend on database state from a previous test.
|
|
82
83
|
|
|
84
|
+
## Error Path Coverage
|
|
85
|
+
|
|
86
|
+
Error handling code is often under-tested because developers focus on happy paths. Enforce minimum error coverage:
|
|
87
|
+
|
|
88
|
+
- **Every exported function that can fail** must have at least one test exercising the error path. "Can fail" includes: functions returning `Result<T, E>`, functions with `throw` statements, async functions calling external services, and functions with input validation.
|
|
89
|
+
- **Error message assertions.** Test that error messages, codes, and structured fields contain the expected values. Do not assert only that "an error was thrown" -- verify the error content.
|
|
90
|
+
- **Error propagation.** When a function wraps or transforms errors from a dependency, test that the original error context is preserved (cause chain, stack trace, original error code).
|
|
91
|
+
- **Boundary error tests.** For each architectural boundary (API handler, event handler, background processor), test that errors are caught, logged, and returned as safe responses without leaking internal details.
|
|
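The error-content and error-propagation rules above might look like this in practice. This is a sketch, not a prescribed pattern: `parsePort` and `loadServerConfig` are hypothetical functions under test, the error codes are made up for illustration, and `node:assert` stands in for whichever test runner the project uses.

```typescript
import { strict as assert } from 'node:assert';

// Hypothetical function under test: validates input and throws a coded error.
function parsePort(raw: string): number {
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw Object.assign(new Error(`Invalid port: ${raw}`), { code: 'E_INVALID_PORT' });
  }
  return port;
}

// Hypothetical wrapper: adds context while keeping the original error as `cause`.
function loadServerConfig(env: Record<string, string>): { port: number } {
  try {
    return { port: parsePort(env.PORT ?? '') };
  } catch (err) {
    throw Object.assign(new Error('Failed to load server config'), {
      code: 'E_CONFIG',
      cause: err,
    });
  }
}

// Error message assertion: check the code and message, not just that it threw.
assert.throws(
  () => parsePort('70000'),
  (err: unknown) => {
    const e = err as Error & { code?: string };
    assert.equal(e.code, 'E_INVALID_PORT');
    assert.match(e.message, /Invalid port: 70000/);
    return true;
  },
);

// Error propagation: the wrapper must keep the original error in the cause chain.
assert.throws(
  () => loadServerConfig({ PORT: 'abc' }),
  (err: unknown) => {
    const e = err as Error & { code?: string; cause?: unknown };
    assert.equal(e.code, 'E_CONFIG');
    const cause = e.cause as Error & { code?: string };
    assert.equal(cause.code, 'E_INVALID_PORT');
    return true;
  },
);
```

The same assertions translate directly to `expect(...).toThrow(...)` style matchers; the point is that each test inspects the error's fields rather than only its existence.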
92
|
+
|
|
83
93
|
## Snapshot Testing
|
|
84
94
|
|
|
85
95
|
- **Use sparingly.** Snapshots are appropriate for serialized output (JSON API responses, CLI output, rendered HTML structure) where the exact output matters and is stable.
|
|
86
96
|
- **Not appropriate for:** UI component visual appearance (use visual regression tests), objects with timestamps or random IDs (unstable), large objects (unreadable diffs).
|
|
87
|
-
- **Review discipline.** Snapshot updates (`--update-snapshots`) must be reviewed
|
|
97
|
+
- **Review discipline.** Snapshot updates (`--update-snapshots`) must be reviewed with the same rigor as code changes. Reviewers must verify the new snapshot is intentionally correct, not just "different."
|
|
88
98
|
- **Keep snapshots small.** Snapshot files > 100 lines suggest the test is asserting too broadly. Narrow the assertion to the relevant subset.
|
|
89
99
|
- **Inline snapshots** (where supported) are preferred over external `.snap` files for short outputs (< 20 lines) because they keep the assertion co-located with the test.
|
|
90
100
|
- **Name snapshot files** to match their test file: `auth.test.ts` → `auth.test.ts.snap`.
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
description: Test standards and conventions for the project
|
|
3
|
-
|
|
3
|
+
globs: ["**/*.test.*", "**/*.spec.*", "**/__tests__/**", "**/tests/**", "**/test/**", "**/*.cy.*", "**/playwright/**", "**/vitest.config.*", "**/jest.config.*", "**/cypress.config.*"]
|
|
4
4
|
---
|
|
5
5
|
# Testing Standards
|
|
6
6
|
|
|
@@ -77,6 +77,15 @@ alwaysApply: true
|
|
|
77
77
|
- **Fixture files** (JSON, YAML) are acceptable for large, complex, or externally-sourced test inputs (API response snapshots, configuration samples). Store in `tests/fixtures/`.
|
|
78
78
|
- **Database state:** Integration tests that require database state must set up and tear down within the test using helpers. Never depend on database state from a previous test.
|
|
79
79
|
|
|
80
|
+
## Error Path Coverage
|
|
81
|
+
|
|
82
|
+
Error handling code is often under-tested because developers focus on happy paths. Enforce minimum error coverage:
|
|
83
|
+
|
|
84
|
+
- **Every exported function that can fail** must have at least one test exercising the error path. "Can fail" includes: functions returning `Result<T, E>`, functions with `throw` statements, async functions calling external services, and functions with input validation.
|
|
85
|
+
- **Error message assertions.** Test that error messages, codes, and structured fields contain the expected values. Do not assert only that "an error was thrown" -- verify the error content.
|
|
86
|
+
- **Error propagation.** When a function wraps or transforms errors from a dependency, test that the original error context is preserved (cause chain, stack trace, original error code).
|
|
87
|
+
- **Boundary error tests.** For each architectural boundary (API handler, event handler, background processor), test that errors are caught, logged, and returned as safe responses without leaking internal details.
|
|
88
|
+
|
|
80
89
|
## Snapshot Testing
|
|
81
90
|
|
|
82
91
|
- **Use sparingly.** Snapshots are appropriate for serialized output (JSON API responses, CLI output, rendered HTML structure) where the exact output matters and is stable.
|
package/rules/hatch3r-theming.md
CHANGED
|
@@ -5,6 +5,7 @@ description: Theming, dark mode, and color system conventions for the project
|
|
|
5
5
|
scope: conditional
|
|
6
6
|
globs: src/**/*.vue, src/**/*.tsx, src/**/*.jsx, src/**/*.css, src/**/*.scss
|
|
7
7
|
tags: [implementation]
|
|
8
|
+
quality_charter: agents/shared/quality-charter.md
|
|
8
9
|
---
|
|
9
10
|
# Theming & Dark Mode
|
|
10
11
|
|
|
@@ -43,11 +44,11 @@ tags: [implementation]
|
|
|
43
44
|
- Provide a `high-contrast` token set with ≥ 7:1 contrast ratios for all text and ≥ 3:1 for non-text UI.
|
|
44
45
|
- Detect user preference with `@media (prefers-contrast: more)` and apply high-contrast tokens.
|
|
45
46
|
- Support `forced-colors` mode: use system color keywords (`Canvas`, `CanvasText`, `LinkText`, `ButtonFace`, `ButtonText`) and test that information is not conveyed by color alone.
|
|
46
|
-
-
|
|
47
|
+
- Verify focus indicators and borders remain visible under forced-colors by testing in Windows High Contrast Mode — use `Highlight` / `SelectedItem` keywords.
|
|
47
48
|
|
|
48
49
|
## Testing
|
|
49
50
|
|
|
50
|
-
- Verify theme toggle switches all tokens
|
|
51
|
+
- Verify theme toggle switches all tokens — no unstyled or hard-coded colors leak through. Inspect computed styles to confirm all color values come from design tokens.
|
|
51
52
|
- Validate contrast ratios per theme using automated tools (axe-core, Lighthouse) against WCAG AA (4.5:1 text, 3:1 non-text).
|
|
52
53
|
- Capture screenshots across light, dark, and high-contrast themes at key viewport sizes for visual regression comparison.
|
|
53
54
|
- Test `prefers-color-scheme` and `prefers-contrast` media query overrides using browser DevTools emulation or Playwright `emulateMedia`.
|
|
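When automated tools are not available for a quick token check, the WCAG AA thresholds quoted above (4.5:1 for text, 3:1 for non-text) can be verified by hand with the WCAG 2.x relative-luminance formula. The sketch below assumes 6-digit hex colors; `contrastRatio` and `meetsWcagAA` are hypothetical helper names, not part of axe-core or Lighthouse.

```typescript
// Convert one 8-bit sRGB channel to linear light, per the WCAG 2.x definition.
function channelToLinear(value8bit: number): number {
  const c = value8bit / 255;
  return c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4;
}

// Relative luminance of a 6-digit hex color such as '#1a2b3c'.
function relativeLuminance(hex: string): number {
  const match = /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!match) throw new Error(`Expected a 6-digit hex color, got: ${hex}`);
  const r = channelToLinear(parseInt(match[1], 16));
  const g = channelToLinear(parseInt(match[2], 16));
  const b = channelToLinear(parseInt(match[3], 16));
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio is (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(colorA: string, colorB: string): number {
  const a = relativeLuminance(colorA);
  const b = relativeLuminance(colorB);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// AA thresholds as stated in the rule: 4.5:1 for text, 3:1 for non-text UI.
function meetsWcagAA(foreground: string, background: string, kind: 'text' | 'non-text'): boolean {
  const required = kind === 'text' ? 4.5 : 3;
  return contrastRatio(foreground, background) >= required;
}
```

Running `meetsWcagAA` over each theme's foreground and background token pairs gives a cheap unit-level check that complements the tool-based audits.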
@@ -2,8 +2,9 @@
|
|
|
2
2
|
id: hatch3r-tooling-hierarchy
|
|
3
3
|
type: rule
|
|
4
4
|
description: Priority order for tools and knowledge sources
|
|
5
|
-
scope:
|
|
5
|
+
scope: "**/.agents/**,**/mcp/**,**/mcp.json,**/.cursor/**,**/.github/copilot*,**/.windsurf/**,**/hatch.json,**/.claude/**"
|
|
6
6
|
tags: [core]
|
|
7
|
+
quality_charter: agents/shared/quality-charter.md
|
|
7
8
|
---
|
|
8
9
|
# Tooling Hierarchy
|
|
9
10
|
|
|
@@ -91,7 +92,7 @@ If no web search MCP server is configured (e.g., `brave-search` is not in `mcp.s
|
|
|
91
92
|
Use browser automation MCP tools to visually verify UI changes after automated tests pass.
|
|
92
93
|
|
|
93
94
|
**When to use:**
|
|
94
|
-
- Verifying UI component changes render
|
|
95
|
+
- Verifying UI component changes render as specified in the design or acceptance criteria.
|
|
95
96
|
- Reproducing and confirming fixes for visually observable bugs.
|
|
96
97
|
- Accessibility auditing (keyboard nav, contrast, focus indicators).
|
|
97
98
|
- Frontend performance profiling (CPU, frame rate, memory).
|