xtrm-tools 2.4.1 → 2.4.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +15 -6
- package/cli/dist/index.cjs +738 -239
- package/cli/dist/index.cjs.map +1 -1
- package/cli/package.json +1 -1
- package/config/hooks.json +10 -0
- package/config/pi/extensions/core/adapter.ts +2 -14
- package/config/pi/extensions/core/guard-rules.ts +70 -0
- package/config/pi/extensions/core/session-state.ts +59 -0
- package/config/pi/extensions/main-guard.ts +10 -14
- package/config/pi/extensions/plan-mode/README.md +65 -0
- package/config/pi/extensions/plan-mode/index.ts +340 -0
- package/config/pi/extensions/plan-mode/utils.ts +168 -0
- package/config/pi/extensions/service-skills.ts +51 -7
- package/config/pi/extensions/session-flow.ts +117 -0
- package/hooks/beads-claim-sync.mjs +140 -14
- package/hooks/beads-compact-restore.mjs +41 -9
- package/hooks/beads-compact-save.mjs +36 -5
- package/hooks/beads-gate-messages.mjs +27 -1
- package/hooks/beads-memory-gate.mjs +24 -16
- package/hooks/beads-stop-gate.mjs +58 -8
- package/hooks/guard-rules.mjs +117 -0
- package/hooks/hooks.json +28 -18
- package/hooks/main-guard.mjs +22 -22
- package/hooks/quality-check.cjs +1286 -0
- package/hooks/quality-check.py +345 -0
- package/hooks/session-state.mjs +138 -0
- package/package.json +2 -1
- package/project-skills/quality-gates/.claude/settings.json +1 -24
- package/skills/creating-service-skills/SKILL.md +433 -0
- package/skills/creating-service-skills/references/script_quality_standards.md +425 -0
- package/skills/creating-service-skills/references/service_skill_system_guide.md +278 -0
- package/skills/creating-service-skills/scripts/bootstrap.py +326 -0
- package/skills/creating-service-skills/scripts/deep_dive.py +304 -0
- package/skills/creating-service-skills/scripts/scaffolder.py +482 -0
- package/skills/scoping-service-skills/SKILL.md +231 -0
- package/skills/scoping-service-skills/scripts/scope.py +74 -0
- package/skills/sync-docs/SKILL.md +235 -0
- package/skills/sync-docs/evals/evals.json +89 -0
- package/skills/sync-docs/references/doc-structure.md +104 -0
- package/skills/sync-docs/references/schema.md +103 -0
- package/skills/sync-docs/scripts/context_gatherer.py +246 -0
- package/skills/sync-docs/scripts/doc_structure_analyzer.py +495 -0
- package/skills/sync-docs/scripts/validate_doc.py +365 -0
- package/skills/sync-docs-workspace/iteration-1/benchmark.json +293 -0
- package/skills/sync-docs-workspace/iteration-1/benchmark.md +13 -0
- package/skills/sync-docs-workspace/iteration-1/eval-doc-audit/eval_metadata.json +27 -0
- package/skills/sync-docs-workspace/iteration-1/eval-doc-audit/with_skill/outputs/result.md +210 -0
- package/skills/sync-docs-workspace/iteration-1/eval-doc-audit/with_skill/run-1/grading.json +28 -0
- package/skills/sync-docs-workspace/iteration-1/eval-doc-audit/with_skill/run-1/timing.json +1 -0
- package/skills/sync-docs-workspace/iteration-1/eval-doc-audit/without_skill/outputs/result.md +101 -0
- package/skills/sync-docs-workspace/iteration-1/eval-doc-audit/without_skill/run-1/grading.json +28 -0
- package/skills/sync-docs-workspace/iteration-1/eval-doc-audit/without_skill/run-1/timing.json +5 -0
- package/skills/sync-docs-workspace/iteration-1/eval-doc-audit/without_skill/timing.json +5 -0
- package/skills/sync-docs-workspace/iteration-1/eval-fix-mode/eval_metadata.json +27 -0
- package/skills/sync-docs-workspace/iteration-1/eval-fix-mode/with_skill/outputs/result.md +198 -0
- package/skills/sync-docs-workspace/iteration-1/eval-fix-mode/with_skill/run-1/grading.json +28 -0
- package/skills/sync-docs-workspace/iteration-1/eval-fix-mode/with_skill/run-1/timing.json +1 -0
- package/skills/sync-docs-workspace/iteration-1/eval-fix-mode/without_skill/outputs/result.md +94 -0
- package/skills/sync-docs-workspace/iteration-1/eval-fix-mode/without_skill/run-1/grading.json +28 -0
- package/skills/sync-docs-workspace/iteration-1/eval-fix-mode/without_skill/run-1/timing.json +1 -0
- package/skills/sync-docs-workspace/iteration-1/eval-sprint-closeout/eval_metadata.json +27 -0
- package/skills/sync-docs-workspace/iteration-1/eval-sprint-closeout/with_skill/outputs/result.md +237 -0
- package/skills/sync-docs-workspace/iteration-1/eval-sprint-closeout/with_skill/run-1/grading.json +28 -0
- package/skills/sync-docs-workspace/iteration-1/eval-sprint-closeout/with_skill/run-1/timing.json +1 -0
- package/skills/sync-docs-workspace/iteration-1/eval-sprint-closeout/without_skill/outputs/result.md +134 -0
- package/skills/sync-docs-workspace/iteration-1/eval-sprint-closeout/without_skill/run-1/grading.json +28 -0
- package/skills/sync-docs-workspace/iteration-1/eval-sprint-closeout/without_skill/run-1/timing.json +1 -0
- package/skills/sync-docs-workspace/iteration-2/benchmark.json +297 -0
- package/skills/sync-docs-workspace/iteration-2/benchmark.md +13 -0
- package/skills/sync-docs-workspace/iteration-2/eval-doc-audit/eval_metadata.json +27 -0
- package/skills/sync-docs-workspace/iteration-2/eval-doc-audit/with_skill/outputs/result.md +137 -0
- package/skills/sync-docs-workspace/iteration-2/eval-doc-audit/with_skill/run-1/grading.json +92 -0
- package/skills/sync-docs-workspace/iteration-2/eval-doc-audit/with_skill/run-1/timing.json +1 -0
- package/skills/sync-docs-workspace/iteration-2/eval-doc-audit/without_skill/outputs/result.md +134 -0
- package/skills/sync-docs-workspace/iteration-2/eval-doc-audit/without_skill/run-1/grading.json +86 -0
- package/skills/sync-docs-workspace/iteration-2/eval-doc-audit/without_skill/run-1/timing.json +1 -0
- package/skills/sync-docs-workspace/iteration-2/eval-fix-mode/eval_metadata.json +27 -0
- package/skills/sync-docs-workspace/iteration-2/eval-fix-mode/with_skill/outputs/result.md +193 -0
- package/skills/sync-docs-workspace/iteration-2/eval-fix-mode/with_skill/run-1/grading.json +72 -0
- package/skills/sync-docs-workspace/iteration-2/eval-fix-mode/with_skill/run-1/timing.json +1 -0
- package/skills/sync-docs-workspace/iteration-2/eval-fix-mode/without_skill/outputs/result.md +211 -0
- package/skills/sync-docs-workspace/iteration-2/eval-fix-mode/without_skill/run-1/grading.json +91 -0
- package/skills/sync-docs-workspace/iteration-2/eval-fix-mode/without_skill/run-1/timing.json +5 -0
- package/skills/sync-docs-workspace/iteration-2/eval-sprint-closeout/eval_metadata.json +27 -0
- package/skills/sync-docs-workspace/iteration-2/eval-sprint-closeout/with_skill/outputs/result.md +182 -0
- package/skills/sync-docs-workspace/iteration-2/eval-sprint-closeout/with_skill/run-1/grading.json +95 -0
- package/skills/sync-docs-workspace/iteration-2/eval-sprint-closeout/with_skill/run-1/timing.json +1 -0
- package/skills/sync-docs-workspace/iteration-2/eval-sprint-closeout/without_skill/outputs/result.md +222 -0
- package/skills/sync-docs-workspace/iteration-2/eval-sprint-closeout/without_skill/run-1/grading.json +88 -0
- package/skills/sync-docs-workspace/iteration-2/eval-sprint-closeout/without_skill/run-1/timing.json +5 -0
- package/skills/sync-docs-workspace/iteration-3/benchmark.json +298 -0
- package/skills/sync-docs-workspace/iteration-3/benchmark.md +13 -0
- package/skills/sync-docs-workspace/iteration-3/eval-doc-audit/eval_metadata.json +27 -0
- package/skills/sync-docs-workspace/iteration-3/eval-doc-audit/with_skill/outputs/result.md +125 -0
- package/skills/sync-docs-workspace/iteration-3/eval-doc-audit/with_skill/run-1/grading.json +97 -0
- package/skills/sync-docs-workspace/iteration-3/eval-doc-audit/with_skill/run-1/timing.json +5 -0
- package/skills/sync-docs-workspace/iteration-3/eval-doc-audit/without_skill/outputs/result.md +144 -0
- package/skills/sync-docs-workspace/iteration-3/eval-doc-audit/without_skill/run-1/grading.json +78 -0
- package/skills/sync-docs-workspace/iteration-3/eval-doc-audit/without_skill/run-1/timing.json +5 -0
- package/skills/sync-docs-workspace/iteration-3/eval-fix-mode/eval_metadata.json +27 -0
- package/skills/sync-docs-workspace/iteration-3/eval-fix-mode/with_skill/outputs/result.md +104 -0
- package/skills/sync-docs-workspace/iteration-3/eval-fix-mode/with_skill/run-1/grading.json +91 -0
- package/skills/sync-docs-workspace/iteration-3/eval-fix-mode/with_skill/run-1/timing.json +5 -0
- package/skills/sync-docs-workspace/iteration-3/eval-fix-mode/without_skill/outputs/result.md +79 -0
- package/skills/sync-docs-workspace/iteration-3/eval-fix-mode/without_skill/run-1/grading.json +82 -0
- package/skills/sync-docs-workspace/iteration-3/eval-fix-mode/without_skill/run-1/timing.json +5 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/eval_metadata.json +27 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/with_skill/outputs/phase1_context.json +302 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/with_skill/outputs/phase2_drift.txt +33 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/with_skill/outputs/phase3_analysis.json +114 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/with_skill/outputs/phase4_fix.txt +118 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/with_skill/outputs/phase5_validate.txt +38 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/with_skill/outputs/result.md +158 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/with_skill/run-1/grading.json +95 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/with_skill/run-1/timing.json +5 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/without_skill/outputs/result.md +71 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/without_skill/run-1/grading.json +90 -0
- package/skills/sync-docs-workspace/iteration-3/eval-sprint-closeout/without_skill/run-1/timing.json +5 -0
- package/skills/updating-service-skills/SKILL.md +136 -0
- package/skills/updating-service-skills/scripts/drift_detector.py +222 -0
- package/skills/using-quality-gates/SKILL.md +254 -0
- package/skills/using-service-skills/SKILL.md +108 -0
- package/skills/using-service-skills/scripts/cataloger.py +74 -0
- package/skills/using-service-skills/scripts/skill_activator.py +152 -0
- package/skills/using-service-skills/scripts/test_skill_activator.py +58 -0
- package/skills/using-xtrm/SKILL.md +34 -38
@@ -0,0 +1,425 @@
# Script Quality Standards for Service Skills

> Distilled from the mercury-market-data implementation (Feb 2026).
> Updated with lessons from the processing-papers implementation (Feb 2026).
> Apply these standards to every script generated in Phase 2 of the service-skill-builder workflow.

---

## Table of Contents

- [Mandatory DB Connection Pattern](#mandatory-db-connection-pattern)
- [Schema Verification Before Writing Any SQL](#schema-verification-before-writing-any-sql)
- [Makefile Standard](#makefile-standard)
- [Design Principles](#design-principles)
- [health_probe.py Standards](#health_probepy-standards)
- [log_hunter.py Standards](#log_hunterpy-standards)
- [Specialist Script Standards](#specialist-script-standards)
- [Common Pitfalls](#common-pitfalls)

---

## Mandatory DB Connection Pattern

**Every script that touches the database MUST use this exact pattern.** No exceptions.

```python
#!/usr/bin/env python3
import sys
from pathlib import Path
from dotenv import load_dotenv

# Resolve project root (depth depends on script location within .claude/skills/)
project_root = Path(__file__).resolve().parent.parent.parent.parent.parent
env_file = project_root / ".env"
if env_file.exists():
    load_dotenv(str(env_file))

sys.path.insert(0, str(project_root))
from shared.db_pool_manager import execute_db_query
```

**Why:** System `python3` may lack `dotenv` and project deps. Always test with `venv/bin/python3`.
**Never:** Raw `psycopg2`, hardcoded DSN strings, or skipping the `load_dotenv` call.
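
The comment about depth in the pattern above is easy to get wrong, so here is a minimal sketch of how the five `.parent` hops resolve. It assumes a script stored at `.claude/skills/<service>/scripts/`; the helper name `project_root_for` and the `/repo` path are hypothetical and only illustrate the arithmetic.

```python
from pathlib import PurePosixPath


def project_root_for(script_path: str, depth: int = 5) -> PurePosixPath:
    """Return the ancestor `depth` levels above the script file."""
    # parents[0] is the containing directory, so `depth` hops
    # of .parent correspond to parents[depth - 1].
    return PurePosixPath(script_path).parents[depth - 1]


# Hypothetical layout: <root>/.claude/skills/<service>/scripts/<script>.py
example = "/repo/.claude/skills/my-service/scripts/health_probe.py"
print(project_root_for(example))

# A script one level shallower needs one hop fewer.
shallow = "/repo/.claude/skills/my-service/health_probe.py"
print(project_root_for(shallow, depth=4))
```

If a script is moved to a different depth, the hop count in the mandatory pattern has to change with it; a quick check like this catches the mismatch before `.env` silently fails to load.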

---

## Schema Verification Before Writing Any SQL

Run these queries against the live DB **before** writing any script SQL. Paste the results
into the delegation prompt so the agent never guesses column or table names.

```sql
-- Step 1: Confirm which tables actually exist
SELECT tablename FROM pg_tables WHERE schemaname = 'public' ORDER BY tablename;

-- Step 2: For each output table, get exact column names and types
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = '<your_table>'
ORDER BY ordinal_position;
```

**Critical rule:** If a table has no timestamp column, use `COUNT(*)` for freshness checks;
never guess a column name. Verify with `information_schema.columns` first.
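
The critical rule can be expressed as a small decision function. This is a sketch under assumptions: `freshness_query` is a hypothetical helper, and `columns` stands for the `(column_name, data_type)` rows returned by the Step 2 query. Table and column names are interpolated only because they come from the verified schema, never from user input.

```python
# data_type values that PostgreSQL reports for time-like columns
TIMESTAMP_TYPES = {
    "timestamp without time zone",
    "timestamp with time zone",
    "date",
}


def freshness_query(table: str, columns: list[tuple[str, str]]) -> str:
    """Pick a freshness query based on the verified column list."""
    for name, data_type in columns:
        if data_type in TIMESTAMP_TYPES:
            # A real timestamp column exists: use it for freshness.
            return f'SELECT MAX("{name}") AS latest FROM "{table}"'
    # No timestamp column: fall back to a row count instead of guessing.
    return f'SELECT COUNT(*) AS row_count FROM "{table}"'


print(freshness_query("candles_5m", [("symbol", "text"),
                                     ("timestamp", "timestamp with time zone")]))
print(freshness_query("lookup", [("id", "integer")]))
```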

---

## Makefile Standard

Every `scripts/` directory MUST contain a `Makefile` with these standard targets.
This is auto-generated by the scaffolder in Phase 1 and should be updated in Phase 2
to add any service-specific targets.

```makefile
# Skill diagnostic scripts for <service-id>
# Usage: make <target> (from this directory)
# Override python: make health PYTHON=../../venv/bin/python3

PYTHON := python3

.PHONY: health health-json data data-json logs errors db help

help:
	@echo "Available targets:"
	@echo " health - Run health probe (human readable)"
	@echo " health-json - Run health probe (JSON output)"
	@echo " data - Show latest DB records"
	@echo " data-json - Show latest DB records (JSON, limit 5)"
	@echo " logs - Tail and analyze recent logs"
	@echo " errors - Show errors/criticals only"
	@echo " db - Run DB helper example queries"

health:
	$(PYTHON) health_probe.py

health-json:
	$(PYTHON) health_probe.py --json

data:
	$(PYTHON) data_explorer.py

data-json:
	$(PYTHON) data_explorer.py --json --limit 5

logs:
	$(PYTHON) log_hunter.py --tail 50

errors:
	$(PYTHON) log_hunter.py --errors-only --tail 50

db:
	$(PYTHON) db_helper.py
```

---

## Design Principles

### 1. Service-Specific, Not Generic

The single most important rule. Generic scripts provide zero value.

**Wrong (generic stub output):**
```python
error_patterns = [
    r"(ERROR|CRITICAL|FATAL|EXCEPTION)",
    r"ConnectionError",
    r"SyntaxError",
    r"ImportError",
]
```

**Right (service-specific, sourced from actual codebase):**
```python
PATTERNS = [
    # From yfinance source: actual exception class names
    ("Rate limit", r"YFRateLimitError|429.*yahoo|Too Many Requests", "error"),
    ("Missing data", r"YFPricesMissingError|no timezone found|Period.*invalid", "warning"),
    # From DB layer: actual psycopg2 messages
    ("DB connect", r"could not connect|password authentication failed", "critical"),
    ("DB write", r"relation.*does not exist|column.*does not exist", "error"),
]
```

**How to find real patterns:** Read the service's entry point script, exception handlers, and log statements. Search for `logger.error`, `raise`, `except` blocks, and error message strings.
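
The search described above can be partly automated. The following is a rough sketch, not part of the skill tooling: `candidate_patterns` is a hypothetical helper, it assumes the service logs through an object named `logger`, and the sample source (including `fetch` and `RetryLater`) is invented for illustration.

```python
import re

# logger.error("...") / logger.critical('...') / logger.warning(f"...")
LOG_CALL = re.compile(r"""logger\.(error|critical|warning)\(\s*f?["'](.+?)["']""")
# raise SomeError(...)
RAISE = re.compile(r"raise\s+([A-Z]\w*)\s*\(")


def candidate_patterns(source: str) -> dict[str, list[str]]:
    """Collect logged message strings and raised exception class names."""
    messages = [m.group(2) for m in LOG_CALL.finditer(source)]
    exceptions = sorted({m.group(1) for m in RAISE.finditer(source)})
    return {"messages": messages, "exceptions": exceptions}


# Invented sample; in practice, read the service's entry point file.
sample = '''
try:
    fetch(symbol)
except YFRateLimitError:
    logger.error("Rate limit hit for %s", symbol)
    raise RetryLater("backoff")
'''
print(candidate_patterns(sample))
```

The output is only a list of candidates. Each one still has to be checked against real log lines before it goes into `PATTERNS`.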

---

### 2. Port Awareness: Host vs. Container

Scripts in `skills/` run on the host machine, not inside Docker. Port mappings matter.

| Context | Use This Port |
|---------|--------------|
| Host scripts (`skills/*.py`) | External mapped port (e.g., `5433` for TimescaleDB `5433:5432`) |
| Docker service env vars | Internal port (`5432`) |
| `docker exec` commands | N/A (resolves via container DNS) |

**Always use env vars with correct defaults:**
```python
import os

DB_HOST = os.getenv("DB_HOST", "localhost")
DB_PORT = int(os.getenv("DB_PORT", "5433"))  # External mapped port
```

---

### 3. Read-Before-Write Discipline

When a stub file already exists and you are replacing it, **always read it first**. Write tools fail with "File has not been read yet" otherwise. New files (no existing content) can be created directly.

---

### 4. Dual Output Mode

Every script must support both human-readable (default) and machine-readable (`--json`) output.

```python
parser.add_argument("--json", action="store_true", help="Output as JSON")

if args.json:
    print(json.dumps(result, indent=2, default=str))
    return
# ... human-readable output below
```

---

### 5. Actionable Remediation in Output

When a health probe or log hunter detects a critical problem, it must print the exact fix command, not a generic "check the logs."

```python
if by_sev["critical"]:
    print("\n ⚠ Critical issues detected.")
    if any("OAuth expired" in h["labels"] for h in by_sev["critical"]):
        print(f" Fix: docker exec -it {CONTAINER} python scripts/auth.py --refresh")
    if any("DB connect" in h["labels"] for h in by_sev["critical"]):
        print(f" Fix: docker compose restart timescaledb && docker compose restart {CONTAINER}")
```

---

## health_probe.py Standards

### Structure

```python
CONTAINER = "service-name"  # Exact Docker container name

def check_container() -> dict:
    """docker inspect for status. Always present."""
    ...

def check_<domain>() -> dict:
    """Service-specific check (DB freshness, file presence, HTTP endpoint, etc.)"""
    ...

def main():
    # 1. Collect all checks
    # 2. --json: dump report dict
    # 3. Human: print formatted table
    # 4. Print fix commands on failure
    ...
```

### For DB-writing services: Freshness Table

```python
# Define per-table stale thresholds based on service update frequency
FRESHNESS_CHECKS = [
    # (table_name, timestamp_col, stale_threshold_minutes)
    ("candles_5m", "timestamp", 30),                # 5m feed → stale if >30m old
    ("outright_snapshots", "snapshot_ts", 10),      # continuous → stale if >10m old
    ("volatility_snapshots", "snapshot_ts", 1500),  # daily job → stale if >25h old
]
```

Stale threshold logic: `update_interval × 3` is a reasonable default, but adjust for business criticality.
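
A minimal sketch of that default, assuming timezone-aware timestamps. The function names `stale_threshold_minutes` and `is_stale` are hypothetical and not part of any generated script.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def stale_threshold_minutes(update_interval_minutes: int, multiplier: int = 3) -> int:
    """Default threshold: update interval times a multiplier (3 by default)."""
    return update_interval_minutes * multiplier


def is_stale(latest: datetime, threshold_minutes: int,
             now: Optional[datetime] = None) -> bool:
    """True when the newest row is older than the threshold."""
    now = now or datetime.now(timezone.utc)
    return now - latest > timedelta(minutes=threshold_minutes)


now = datetime(2026, 2, 1, 12, 0, tzinfo=timezone.utc)
threshold = stale_threshold_minutes(5)  # 5-minute feed
print(threshold)
print(is_stale(now - timedelta(minutes=20), threshold, now))
print(is_stale(now - timedelta(minutes=10), threshold, now))
```

The table above uses 30 minutes for the 5-minute feed, which is a looser multiplier than the default. That is the kind of per-service adjustment the rule allows.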

### For HTTP API services: Endpoint Probing

Do not ping generic ports. Probe the actual API routes the service exposes:

```python
HEALTH_ENDPOINTS = [
    ("FastAPI health", "http://localhost:8000/api/system/health", 3),
    ("Background server", "http://localhost:5002/health", 2),
]
# Optional smoke tests against real data endpoints
SMOKE_ENDPOINTS = [
    ("Market snapshot", "http://localhost:8000/api/market/snapshot"),
    ("Volatility data", "http://localhost:8000/api/analytics/volatility"),
]
```

### For one-shot services (migrations, backfills): Exit Code

```python
import subprocess

# docker inspect returns status="exited" and exit_code="0" on success
result = subprocess.run(
    ["docker", "inspect", "--format",
     "{{.State.Status}} {{.State.ExitCode}} {{.State.FinishedAt}}",
     CONTAINER],
    capture_output=True, text=True
)
```

A one-shot service is healthy if `exit_code == "0"` and expected tables/schemas exist in the DB.
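
To make the container half of that rule concrete, here is a sketch that parses the three space-separated fields produced by the `--format` string above. `parse_inspect_state` and `one_shot_container_ok` are hypothetical names, and the sample output line is invented. The DB-side check for expected tables is a separate step and is not shown.

```python
def parse_inspect_state(stdout: str) -> dict:
    """Split '<status> <exit_code> <finished_at>' into named fields."""
    status, exit_code, finished_at = stdout.strip().split(" ", 2)
    return {"status": status, "exit_code": exit_code, "finished_at": finished_at}


def one_shot_container_ok(state: dict) -> bool:
    """Container-level check only: it ran to completion and exited cleanly."""
    # exit_code stays a string because it is parsed from text output.
    return state["status"] == "exited" and state["exit_code"] == "0"


sample_stdout = "exited 0 2026-02-01T10:00:00.123456789Z\n"  # invented sample
state = parse_inspect_state(sample_stdout)
print(state, one_shot_container_ok(state))
```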

### For file watcher services: Mount + State File

```python
import json
import subprocess

def check_scid_mount() -> dict:
    result = subprocess.run(
        ["docker", "exec", CONTAINER, "ls", "/data/scid"],
        capture_output=True, text=True, timeout=10
    )
    files = [f for f in result.stdout.splitlines() if f.endswith(".scid")]
    return {"accessible": result.returncode == 0, "file_count": len(files)}

def check_state_file() -> dict:
    result = subprocess.run(
        ["docker", "exec", CONTAINER, "cat", "/app/state/watcher_state.json"],
        capture_output=True, text=True
    )
    if result.returncode != 0:
        return {"present": False}
    return {"present": True, "state": json.loads(result.stdout)}
```

---

## log_hunter.py Standards

### Pattern Structure

```python
PATTERNS = [
    # (label, regex_pattern, severity)
    ("OAuth expired", r"invalid_grant|token.*expired", "critical"),
    ("PDF parse", r"PdfReadError|pdf.*format.*changed", "error"),
    ("No data", r"No new.*report|0 reports.*found", "warning"),
    ("Report saved", r"report.*ingested|saved.*DB", "info"),
]
```

**Severity levels:**
- `critical`: Service needs restart or manual intervention to recover
- `error`: Functionality is impaired, data may be incomplete
- `warning`: Degraded state, worth monitoring
- `info`: Normal operation confirmation

### Severity Ordering

Always use `sev_order` so that the highest severity "wins" when a line matches multiple patterns:

```python
sev_order = {"critical": 0, "error": 1, "warning": 2, "info": 3}
if matched_severity is None or sev_order[severity] < sev_order[matched_severity]:
    matched_severity = severity
```
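
Placed in context, the comparison sits inside a loop over the compiled patterns. The following is a sketch of one way to wire it up, reusing three of the sample patterns from above. `classify_line` and its return shape are assumptions, not the generated script's actual interface.

```python
import re

PATTERNS = [
    # (label, regex_pattern, severity)
    ("OAuth expired", r"invalid_grant|token.*expired", "critical"),
    ("PDF parse", r"PdfReadError|pdf.*format.*changed", "error"),
    ("Report saved", r"report.*ingested|saved.*DB", "info"),
]
SEV_ORDER = {"critical": 0, "error": 1, "warning": 2, "info": 3}
COMPILED = [(label, re.compile(rx, re.IGNORECASE), sev) for label, rx, sev in PATTERNS]


def classify_line(line: str):
    """Return (highest_severity, matching_labels), or None if nothing matches."""
    matched_severity = None
    labels = []
    for label, regex, severity in COMPILED:
        if regex.search(line):
            labels.append(label)
            # Lower rank means more severe, so keep the smallest rank seen.
            if matched_severity is None or SEV_ORDER[severity] < SEV_ORDER[matched_severity]:
                matched_severity = severity
    return (matched_severity, labels) if labels else None


print(classify_line("report ingested but token has expired"))
print(classify_line("heartbeat ok"))
```

Keeping every matching label, rather than only the winning one, lets the remediation step test for labels such as "OAuth expired" later.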

### Required CLI Flags

```python
parser.add_argument("--tail", type=int, default=200)
parser.add_argument("--since", type=str, default=None)  # Docker --since (e.g. "1h", "2026-01-01")
parser.add_argument("--errors-only", action="store_true")  # Skip info entries
parser.add_argument("--json", action="store_true")
```

### Pattern Design Rules

1. Test patterns against the **actual log format** of the service, not hypothetical messages.
2. Use `re.IGNORECASE`: log levels and messages vary in capitalization.
3. Prefer specific class names (`YFPricesMissingError`) over generic keywords (`Error`).
4. For Rust services, add: `r"thread '.*' panicked|panicked at '"` as a critical pattern.
5. Always include at least 2 `info` patterns for normal operation confirmation, so that the absence of info lines itself becomes a signal.

### Anti-patterns to avoid

| Anti-pattern | Why It Fails |
|---|---|
| `r"ERROR"` | Matches comment text, variable names, and dozens of false positives |
| `r"Exception"` | Too broad: any logged exception, including handled ones, matches |
| `r"ConnectionError"` | Only catches one subclass; misses `OperationalError`, `InterfaceError`, etc. |
| Single `error_patterns` list without severity | Provides no triage; everything looks equally bad |

---

## Specialist Script Standards

### data_explorer.py (for DB-writing services)

Purpose: Let an agent query the service's output tables interactively without writing SQL.

```python
# Always support:
parser.add_argument("--symbol", help="Filter to a specific symbol")
parser.add_argument("--history", action="store_true", help="Show time series, not just latest")
parser.add_argument("--limit", type=int, default=20)
parser.add_argument("--json", action="store_true")
```

Use `DISTINCT ON (symbol) ... ORDER BY symbol, timestamp DESC` for "latest per symbol" queries. Use parameterized queries: `WHERE symbol = %s`.

### endpoint_tester.py (for HTTP API services)

Test every real route in the API, not just `/health`. Measure response time and size:

```python
ENDPOINTS = [
    # (label, method, path, expected_status, timeout_s)
    ("Health check", "GET", "/api/system/health", 200, 3),
    ("Market overview", "GET", "/api/market/overview", 200, 5),
    ("Symbol detail", "GET", "/api/market/ES=F", 200, 5),
    # ... all actual routes
]
```

Report slow endpoints (>2s) separately from failed ones.
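
One way to keep the two categories apart is to bucket results after all requests finish. This is a sketch only: `triage` and the result dictionary shape are assumptions, and the sample results are invented rather than measured.

```python
SLOW_THRESHOLD_S = 2.0


def triage(results: list[dict]) -> dict[str, list[str]]:
    """Split results into failed, slow, and ok buckets by label.

    Each result is assumed to look like:
    {"label": str, "status": int or None, "expected": int, "elapsed_s": float}
    """
    buckets = {"failed": [], "slow": [], "ok": []}
    for r in results:
        if r["status"] != r["expected"]:
            # Wrong status or no response at all (status None).
            buckets["failed"].append(r["label"])
        elif r["elapsed_s"] > SLOW_THRESHOLD_S:
            # Correct response, but slower than the threshold.
            buckets["slow"].append(r["label"])
        else:
            buckets["ok"].append(r["label"])
    return buckets


sample_results = [  # invented values for illustration
    {"label": "Health check", "status": 200, "expected": 200, "elapsed_s": 0.05},
    {"label": "Market overview", "status": 200, "expected": 200, "elapsed_s": 3.1},
    {"label": "Symbol detail", "status": None, "expected": 200, "elapsed_s": 5.0},
]
print(triage(sample_results))
```

Checking for failure first means a timed-out request is reported as failed, not as slow.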

### state_inspector.py (for stateful file watchers)

Read the state file via `docker exec` and compute lag between current file size and processed byte offset:

```python
scid_size = get_file_size_in_container(container, filepath)
lag_bytes = scid_size - state["byte_offset"]
lag_flag = " ⚠" if lag_bytes > 1_000_000 else ""
```

### coverage_checker.py (for one-shot backfill services)

Report per-entity (spread, symbol, etc.) row counts, date ranges, and gaps:

```sql
SELECT entity_id, COUNT(*) AS rows,
       MIN(ts) AS earliest, MAX(ts) AS latest
FROM output_table
GROUP BY entity_id ORDER BY entity_id;
```

Also detect missing entities against a known expected list, and find time-series gaps using `LAG()`.
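
Both checks can be sketched in a few lines. To keep the example runnable on its own, it uses Python's built-in `sqlite3` module (window functions need SQLite 3.25 or newer) with integer epoch seconds in `ts`. Against PostgreSQL the query shape would be similar, but placeholders would be `%s`, `ts` would be a real timestamp, and the call would go through the project's DB helper. The table contents, the `find_gaps` helper, and the expected-entity set are invented for illustration.

```python
import sqlite3

GAP_QUERY = """
SELECT entity_id, prev_ts, ts, ts - prev_ts AS gap_s
FROM (
    SELECT entity_id, ts,
           LAG(ts) OVER (PARTITION BY entity_id ORDER BY ts) AS prev_ts
    FROM output_table
) AS t
WHERE prev_ts IS NOT NULL AND ts - prev_ts > ?
ORDER BY entity_id, ts;
"""


def find_gaps(conn: sqlite3.Connection, max_gap_s: int) -> list:
    """Rows where consecutive timestamps for one entity are too far apart."""
    return conn.execute(GAP_QUERY, (max_gap_s,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE output_table (entity_id TEXT, ts INTEGER)")
conn.executemany(
    "INSERT INTO output_table VALUES (?, ?)",
    [("ES", 0), ("ES", 300), ("ES", 1500), ("NQ", 0), ("NQ", 300)],
)

gaps = find_gaps(conn, max_gap_s=600)

# Missing entities: expected list minus what is actually present.
expected = {"ES", "NQ", "CL"}
present = {row[0] for row in conn.execute("SELECT DISTINCT entity_id FROM output_table")}
missing = sorted(expected - present)

print(gaps, missing)
```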

---

## Common Pitfalls

| Pitfall | Prevention |
|---------|-----------|
| Script uses port 5432 from host | Default to 5433 (external mapped port); document the discrepancy |
| Script uses HTTP port scanning instead of real routes | Read docker-compose to find actual port mappings; check the service's API routes |
| OAuth token path is wrong | `docker exec container ls /expected/path` to verify before hardcoding |
| **DB table name is guessed** | Run `SELECT tablename FROM pg_tables WHERE schemaname='public'` first; include output in delegation prompt |
| **DB column name is guessed** | Run `SELECT column_name, data_type FROM information_schema.columns WHERE table_name='X'` per table; include in delegation prompt |
| **Assumed timestamp column on every table** | Check `information_schema.columns`: if no timestamp exists, use `COUNT(*)` for freshness; never guess |
| `try` block with no matching `except` | Every DB call needs a complete `try/except`; a `try` with no `except` or `finally` is a `SyntaxError`, so the script fails before it runs |
| Function renamed but call sites not updated | After any rename, grep the scripts dir for the old name before finishing |
| Delegation with no `-y` flag (Qwen) | Qwen requires `-y` for non-interactive file writes; without it, research happens but no files are written |
| Using `ccs gemini` instead of `gemini -p` | Gemini: `gemini -p "..."` · Qwen: `qwen -y "..."` · GLM: `env -u CLAUDECODE ccs glm -p "..."` |
| Scripts tested with system python3 | Always test with `venv/bin/python3`; system python may lack dotenv and other deps |
| Log patterns too broad | Read the actual `logger.error()` calls in the source code |
| Missing `--since` flag | Log hunters without `--since` can't be used for incremental monitoring |
| `health_probe.py` doesn't print fix commands | Always add actionable remediation text after detecting critical states |
| No `scripts/Makefile` | Every skill must have a Makefile with standard targets; scaffolder generates it in Phase 1 |

@@ -0,0 +1,278 @@
|
|
|
1
|
+
# Service Skill System: Architecture & Operations Guide
|
|
2
|
+
|
|
3
|
+
> Distilled from real-world Docker microservices projects.
|
|
4
|
+
> This guide is project-agnostic — adapt all examples to your stack.
|
|
5
|
+
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## Table of Contents
|
|
9
|
+
|
|
10
|
+
- [1. System Overview](#1-system-overview)
|
|
11
|
+
- [2. System Architecture](#2-system-architecture)
|
|
12
|
+
- [3. Mandatory Two-Phase Workflow](#3-mandatory-two-phase-workflow)
|
|
13
|
+
- [4. Service Type Classification](#4-service-type-classification)
|
|
14
|
+
- [5. Directory Structure](#5-directory-structure)
|
|
15
|
+
- [6. Skill Lifecycle](#6-skill-lifecycle)
|
|
16
|
+
- [7. Quality Gates](#7-quality-gates)
|
|
17
|
+
- [8. Best Practices](#8-best-practices)
|
|
18
|
+
- [9. Anti-Patterns](#9-anti-patterns)
|
|
19
|
+
|
|
20
|
+
---
|
|
21
|
+
|
|
22
|
+
## 1. System Overview
|
|
23
|
+
|
|
24
|
+
The **Service Skill System** transforms an AI agent from a generic assistant into a service-aware operator. Each Docker service in your project gets a dedicated **skill package**: a structured combination of operational documentation and executable diagnostic scripts.
|
|
25
|
+
|
|
26
|
+
### What a Skill Provides
|
|
27
|
+
|
|
28
|
+
| Layer | Contents | Purpose |
|
|
29
|
+
|-------|----------|---------|
|
|
30
|
+
| `SKILL.md` | Operational manual | How the service works, how to debug it |
|
|
31
|
+
| `scripts/health_probe.py` | Container + data freshness checks | Is the service healthy right now? |
|
|
32
|
+
| `scripts/log_hunter.py` | Pattern-based log analysis | What is the service logging and why? |
|
|
33
|
+
| `scripts/<specialist>.py` | Service-specific inspector | What state does this service hold? |
|
|
34
|
+
|
|
35
|
+
Without scripts, a skill is documentation only. Without documentation, scripts have no context. Both are required.
|
|
36
|
+
|
|
37
|
+
---
|
|
38
|
+
|
|
39
|
+
## 2. System Architecture
|
|
40
|
+
|
|
41
|
+
### Three Components
|
|
42
|
+
|
|
43
|
+
**A. The Builder (`service-skill-builder`)**
|
|
44
|
+
The meta-skill that generates other skills.
|
|
45
|
+
- **Input**: `docker-compose*.yml`, Dockerfiles, entry-point source code
|
|
46
|
+
- **Engine**: `scripts/main.py` (Phase 1 skeleton generator)
|
|
47
|
+
- **Output**: `SKILL.md`, `REFINEMENT_BRIEF.md`, and stub scripts (the stubs are replaced in Phase 2)
|
|
48
|
+
|
|
49
|
+
**B. The Health Checker (`scripts/skill_health_check.py`)**
|
|
50
|
+
Detects drift between skills and the live codebase.
|
|
51
|
+
- Compares service modification timestamps vs. skill generation timestamps
|
|
52
|
+
- Identifies services with no skill (coverage gaps)
|
|
53
|
+
- Reports stale skills needing a re-dive
|
|
54
|
+
|
|
55
|
+
**C. The Generated Skills**
|
|
56
|
+
Individual packages per service (e.g., `.claude/skills/my-service/`).
|
|
57
|
+
|
|
58
|
+
---
|
|
59
|
+
|
|
60
|
+
## 3. Mandatory Two-Phase Workflow
|
|
61
|
+
|
|
62
|
+
**Phase 1 and Phase 2 are both required. The skeleton alone is never sufficient.**
|
|
63
|
+
|
|
64
|
+
### Phase 1: Automated Skeleton
|
|
65
|
+
|
|
66
|
+
Run the generator against your project root:
|
|
67
|
+
|
|
68
|
+
```bash
|
|
69
|
+
# Discover all Docker services
|
|
70
|
+
python3 .claude/skills/service-skill-builder/scripts/main.py --scan
|
|
71
|
+
|
|
72
|
+
# Generate skeleton for one service
|
|
73
|
+
python3 .claude/skills/service-skill-builder/scripts/main.py <service-name>
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
The skeleton provides:
|
|
77
|
+
- Structural facts: port mappings, env var names, image names, volumes
|
|
78
|
+
- `REFINEMENT_BRIEF.md` listing every open question
|
|
79
|
+
- Generic stub scripts (placeholder only — **must be replaced**)
|
|
80
|
+
|
|
81
|
+
**The skeleton cannot tell you:**
|
|
82
|
+
- What the service actually writes to the database (column names, stale thresholds)
|
|
83
|
+
- What real error messages look like in the logs
|
|
84
|
+
- What "healthy" vs. "degraded" vs. "failed" looks like
|
|
85
|
+
- What exact commands fix common failures
|
|
86
|
+
|
|
87
|
+
### Phase 2: Agentic Deep Dive
|
|
88
|
+
|
|
89
|
+
Read the source code. Answer every question in `REFINEMENT_BRIEF.md` using `Grep`, `Glob`, `Read`, and Serena LSP tools. Do not guess. Do not leave placeholders.
|
|
90
|
+
|
|
91
|
+
**Mandatory investigation areas:**
|
|
92
|
+
|
|
93
|
+
#### Container & Runtime
|
|
94
|
+
- What is the exact entry point? (Dockerfile CMD + docker-compose `command:`)
|
|
95
|
+
- Is this a long-running daemon, a cron job, or a one-shot? → determines health strategy
|
|
96
|
+
- Which env vars cause a crash if missing? Which are optional?
|
|
97
|
+
- What volumes does it read from? Write to?
|
|
98
|
+
- What is the restart policy and why?
|
|
99
|
+
|
|
100
|
+
#### Data Layer
|
|
101
|
+
- Which tables does it write? Which does it only read?
|
|
102
|
+
- What is the timestamp column for each output table (`created_at`, `snapshot_ts`, `asof_ts`, etc.)?
|
|
103
|
+
- What is a realistic "stale" threshold in minutes per table? (Rule of thumb: update_interval × 3)
|
|
104
|
+
- Does it use Redis, S3, local files, or other external state?
|
|
105
|
+
- Are queries parameterized? (Check `%s`, `%(name)s`, `?` patterns — never f-strings in SQL)
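The freshness and parameterization questions above can be sketched as follows. This is a minimal illustration, not the builder's implementation: it assumes an in-memory SQLite database so it runs anywhere, and the `snapshots` table, `source` column, and helper names are hypothetical. A real probe would query the service's actual output tables.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def stale_threshold_minutes(update_interval_minutes: float, factor: float = 3.0) -> float:
    """Rule of thumb from the guide: update_interval x 3."""
    return update_interval_minutes * factor


def is_stale(latest_ts: datetime, now: datetime, threshold_minutes: float) -> bool:
    """True when the newest row is older than the threshold."""
    return (now - latest_ts) > timedelta(minutes=threshold_minutes)


def latest_row_ts(conn: sqlite3.Connection, source: str) -> datetime:
    """Newest created_at for one source, using a bound parameter (never an f-string)."""
    row = conn.execute(
        "SELECT MAX(created_at) FROM snapshots WHERE source = ?", (source,)
    ).fetchone()
    return datetime.fromisoformat(row[0])


# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snapshots (source TEXT, created_at TEXT)")
conn.execute(
    "INSERT INTO snapshots VALUES (?, ?)", ("feed-a", "2026-01-15T11:40:00+00:00")
)

now = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
threshold = stale_threshold_minutes(5)  # writer runs every 5 minutes -> 15 minutes
print(is_stale(latest_row_ts(conn, "feed-a"), now, threshold))  # row is 20 minutes old
```

Passing `now` explicitly keeps the check deterministic and easy to test; a real probe would typically use the current UTC time.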
|
|
106
|
+
|
|
107
|
+
#### Failure Modes
|
|
108
|
+
Build this table with ≥5 rows from code comments, exception handlers, and READMEs:
|
|
109
|
+
|
|
110
|
+
| Symptom | Likely Cause | Resolution |
|
|
111
|
+
|---------|-------------|------------|
|
|
112
|
+
| (what you see in logs or alerts) | (root cause) | (exact docker/shell command to fix) |
|
|
113
|
+
|
|
114
|
+
#### Log Patterns
|
|
115
|
+
Search for `logger.error`, `logger.warning`, `raise`, `except`, and `panic!` in the source:
|
|
116
|
+
- What appears in logs during normal healthy operation? (→ `info` patterns)
|
|
117
|
+
- What appears during recoverable errors? (→ `warning` / `error` patterns)
|
|
118
|
+
- What appears during critical failures requiring restart? (→ `critical` patterns)
|
|
119
|
+
- For Rust services: what does a panic look like? (`thread '.*' panicked`)
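One possible way to turn the answers to these questions into a severity-ordered pattern table is sketched below. Only the Rust panic pattern comes from this guide; the other patterns and the `classify` helper are hypothetical placeholders that would be replaced with strings found in the service's own source.

```python
import re
from typing import Optional

# Hypothetical patterns; real ones must be sourced from the service's code.
PATTERNS = {
    "critical": [re.compile(r"thread '.*' panicked")],
    "error": [re.compile(r"connection refused", re.IGNORECASE)],
    "warning": [re.compile(r"retrying in \d+s")],
    "info": [re.compile(r"cycle complete")],
}

# Check the most severe category first so a line is reported once, at its worst level.
SEVERITY_ORDER = ["critical", "error", "warning", "info"]


def classify(line: str) -> Optional[str]:
    """Return the highest severity whose pattern matches, or None if nothing matches."""
    for severity in SEVERITY_ORDER:
        if any(pattern.search(line) for pattern in PATTERNS[severity]):
            return severity
    return None


print(classify("thread 'main' panicked at src/main.rs:10:5"))
print(classify("db: Connection refused, retrying in 5s"))
print(classify("unrelated line"))
```

The second sample line matches both an `error` and a `warning` pattern; checking in severity order means it is reported as `error`.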
|
|
120
|
+
|
|
121
|
+
---
|
|
122
|
+
|
|
123
|
+
## 4. Service Type Classification
|
|
124
|
+
|
|
125
|
+
Classify before writing scripts. The service type determines which scripts to write beyond the baseline `health_probe.py` and `log_hunter.py`.
|
|
126
|
+
|
|
127
|
+
| Service Type | Health Probe Strategy | Specialist Script |
|
|
128
|
+
|---|---|---|
|
|
129
|
+
| **Continuous DB writer** | Table freshness (age of most recent row per table) | `data_explorer.py` |
|
|
130
|
+
| **HTTP API server** | HTTP probe against real routes (not just port scan) | `endpoint_tester.py` |
|
|
131
|
+
| **One-shot / migration** | Container exit code + expected tables/schemas present | `coverage_checker.py` |
|
|
132
|
+
| **File watcher** | Mount path accessible + state file present + DB recency | `state_inspector.py` |
|
|
133
|
+
| **Email / API poller** | Container running + auth token file present | service-specific |
|
|
134
|
+
| **Scheduled backup** | Recent backup files in staging dir + daemon running | service-specific |
|
|
135
|
+
| **MCP stdio server** | Data source freshness in DB (no HTTP to probe) | service-specific |
|
|
136
|
+
|
|
137
|
+
---
|
|
138
|
+
|
|
139
|
+
## 5. Directory Structure
|
|
140
|
+
|
|
141
|
+
```
|
|
142
|
+
.claude/skills/
|
|
143
|
+
├── service-skill-builder/ # Meta-skill (system core)
|
|
144
|
+
│ ├── SKILL.md
|
|
145
|
+
│ ├── references/
|
|
146
|
+
│ │ ├── service_skill_system_guide.md # This file
|
|
147
|
+
│ │ └── script_quality_standards.md # Script design rules
|
|
148
|
+
│ └── scripts/
|
|
149
|
+
│ ├── main.py # Phase 1 skeleton generator
|
|
150
|
+
│ ├── skill_health_check.py # Drift detection
|
|
151
|
+
│ ├── discovery.py # Docker Compose parser
|
|
152
|
+
│ ├── analysis.py # AST/regex code analyzer
|
|
153
|
+
│ ├── devops_audit.py # CI/CD/observability audit
|
|
154
|
+
│ └── generator.py # Skill file generation logic
|
|
155
|
+
│
|
|
156
|
+
├── my-service-a/ # Generated skill (long-running daemon)
|
|
157
|
+
│ ├── SKILL.md
|
|
158
|
+
│ └── scripts/
|
|
159
|
+
│ ├── health_probe.py # Container + DB freshness checks
|
|
160
|
+
│ ├── log_hunter.py # Pattern-matched log analysis
|
|
161
|
+
│ └── data_explorer.py # Query output tables interactively
|
|
162
|
+
│
|
|
163
|
+
├── my-service-b/ # Generated skill (HTTP API)
|
|
164
|
+
│ ├── SKILL.md
|
|
165
|
+
│ └── scripts/
|
|
166
|
+
│ ├── health_probe.py
|
|
167
|
+
│ ├── log_hunter.py
|
|
168
|
+
│ └── endpoint_tester.py # Probe all real API routes
|
|
169
|
+
│
|
|
170
|
+
└── my-service-c/ # Generated skill (file watcher)
|
|
171
|
+
    ├── SKILL.md
|
|
172
|
+
    └── scripts/
|
|
173
|
+
        ├── health_probe.py
|
|
174
|
+
        ├── log_hunter.py
|
|
175
|
+
        └── state_inspector.py         # Read state file, compute lag
|
|
176
|
+
```
|
|
177
|
+
|
|
178
|
+
Agent mirrors — always sync after creating or updating skills:
|
|
179
|
+
|
|
180
|
+
```bash
|
|
181
|
+
for d in .claude/skills/my-*/; do
|
|
182
|
+
  svc=$(basename "$d")
|
|
183
|
+
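  # "${d}." copies the directory contents, so re-running does not nest a second copy
  mkdir -p ".agent/skills/$svc"
  cp -r "${d}." ".agent/skills/$svc/"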
|
|
184
|
+
  mkdir -p ".gemini/skills/$svc"
  cp -r "${d}." ".gemini/skills/$svc/"
|
|
185
|
+
done
|
|
186
|
+
```
|
|
187
|
+
|
|
188
|
+
---
|
|
189
|
+
|
|
190
|
+
## 6. Skill Lifecycle
|
|
191
|
+
|
|
192
|
+
### When to Generate a Skill
|
|
193
|
+
- A new Docker service is added to the project
|
|
194
|
+
- An existing service is significantly refactored
|
|
195
|
+
|
|
196
|
+
### When to Update a Skill
|
|
197
|
+
- The service's database schema changes
|
|
198
|
+
- New error conditions are added to the code
|
|
199
|
+
- The entry point or restart policy changes
|
|
200
|
+
- The health check script's stale thresholds no longer reflect reality
|
|
201
|
+
|
|
202
|
+
### Detecting Drift
|
|
203
|
+
|
|
204
|
+
```bash
|
|
205
|
+
# Check all skills for staleness
|
|
206
|
+
python3 .claude/skills/service-skill-builder/scripts/skill_health_check.py --all
|
|
207
|
+
```
|
|
208
|
+
|
|
209
|
+
Output example:
|
|
210
|
+
```
|
|
211
|
+
my-service-a: HEALTHY
|
|
212
|
+
my-service-b: STALE (service code modified 2026-01-15, skill generated 2025-11-01)
|
|
213
|
+
my-service-c: MISSING (no skill exists)
|
|
214
|
+
```
|
|
215
|
+
|
|
216
|
+
A skill is **STALE** when the service's source code or docker-compose definition has been modified more recently than the skill was generated. This is a signal to re-run Phase 2 for the affected service.
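The staleness rule can be sketched as a small function. This is not the code of `skill_health_check.py`; it assumes that file modification times stand in for both the "service modified" and "skill generated" timestamps, and the function names are hypothetical.

```python
from pathlib import Path
from typing import Iterable


def newest_mtime(paths: Iterable[Path]) -> float:
    """Most recent modification time among the paths that exist (0.0 if none do)."""
    return max((p.stat().st_mtime for p in paths if p.exists()), default=0.0)


def skill_status(service_files: Iterable[Path], skill_file: Path) -> str:
    """Classify a skill as MISSING, STALE, or HEALTHY from file timestamps."""
    if not skill_file.exists():
        return "MISSING"
    if newest_mtime(service_files) > skill_file.stat().st_mtime:
        return "STALE"
    return "HEALTHY"
```

A caller would pass the service's source files plus its docker-compose file as `service_files`, and the skill's `SKILL.md` as `skill_file`.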
|
|
217
|
+
|
|
218
|
+
---
|
|
219
|
+
|
|
220
|
+
## 7. Quality Gates
|
|
221
|
+
|
|
222
|
+
A skill is **complete** (not draft) when all of the following are true:
|
|
223
|
+
|
|
224
|
+
- [ ] No `[PENDING RESEARCH]` markers remain in SKILL.md
|
|
225
|
+
- [ ] All stub scripts have been replaced with service-specific implementations
|
|
226
|
+
- [ ] `health_probe.py` queries actual output tables with correct stale thresholds
|
|
227
|
+
- [ ] `log_hunter.py` patterns are sourced from the real codebase (not invented)
|
|
228
|
+
- [ ] At least one specialist script exists if the service has unique inspectable state
|
|
229
|
+
- [ ] The Troubleshooting table has ≥5 rows based on real failure modes
|
|
230
|
+
- [ ] All CLI commands in SKILL.md are verified against the actual docker-compose config
|
|
231
|
+
- [ ] Scripts have been synced to `.agent/skills/` and `.gemini/skills/` mirrors
|
|
232
|
+
|
|
233
|
+
---
|
|
234
|
+
|
|
235
|
+
## 8. Best Practices
|
|
236
|
+
|
|
237
|
+
### One Service, One Skill
|
|
238
|
+
Keep skills granular. A skill for `my-api` should not also document `my-worker`. Tightly coupled services (e.g., Redis master/replica) may share a skill if they are always operated together.
|
|
239
|
+
|
|
240
|
+
### Read Source, Not Docs
|
|
241
|
+
Internal README files go stale. The entry point script, exception handlers, and log statements are the ground truth. Always grep the source code for actual error messages before writing log patterns.
|
|
242
|
+
|
|
243
|
+
### Port Awareness
|
|
244
|
+
Scripts in `skills/` run on the **host machine**, not inside Docker. Always use the external mapped port:
|
|
245
|
+
|
|
246
|
+
```python
|
|
247
|
+
import os

# ✅ Host script (external mapped port)
|
|
248
|
+
DB_PORT = int(os.getenv("DB_PORT", "5433"))
|
|
249
|
+
|
|
250
|
+
# ❌ Wrong for a host script (container-internal port)
|
|
251
|
+
DB_PORT = int(os.getenv("DB_PORT", "5432"))
|
|
252
|
+
```
|
|
253
|
+
|
|
254
|
+
### Executable Knowledge
|
|
255
|
+
Prefer putting logic into `scripts/` (which run without being loaded into the agent's context) over text-only descriptions in SKILL.md. An agent that can run `health_probe.py` learns the truth about service health in one step. An agent reading stale prose may act on incorrect assumptions.
|
|
256
|
+
|
|
257
|
+
### Actionable Remediation
|
|
258
|
+
Every critical failure detected by a script must print the exact command to fix it — not "check the logs." For example:
|
|
259
|
+
|
|
260
|
+
```python
|
|
261
|
+
if not token_present:
|
|
262
|
+
    print(f"  Fix: docker exec -it {CONTAINER} python scripts/auth.py --refresh")
|
|
263
|
+
```
|
|
264
|
+
|
|
265
|
+
---
|
|
266
|
+
|
|
267
|
+
## 9. Anti-Patterns
|
|
268
|
+
|
|
269
|
+
| Anti-pattern | Why It Fails |
|
|
270
|
+
|---|---|
|
|
271
|
+
| Skip Phase 2 because Phase 1 looks complete | Skeleton has correct port numbers but wrong table names, wrong log patterns, wrong stale thresholds |
|
|
272
|
+
| Copy log patterns from another service's skill | Different services emit different errors; shared patterns produce false positives and miss real failures |
|
|
273
|
+
| Use port 5432 in host scripts | The container-internal port is usually not reachable from the host unless it is mapped 1:1; scripts fail to connect or hang |
|
|
274
|
+
| Write `health_probe.py` without fix commands | Agent sees a failure but has no recovery path |
|
|
275
|
+
| Leave `[PENDING RESEARCH]` markers | The skill is unusable — an agent acting on incomplete info may apply wrong fixes |
|
|
276
|
+
| Forget to sync to `.agent/` and `.gemini/` | Other agent runtimes use stale or missing skills |
|
|
277
|
+
| Use `r"ERROR"` as a log pattern | Matches variable names and comments, producing thousands of false positives |
|
|
278
|
+
| Hardcode table names without verifying | Guessed names may not exist in the real schema; run `SELECT tablename FROM pg_tables WHERE schemaname='public'` first |
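To illustrate the `r"ERROR"` row, the sketch below compares a bare substring pattern with one anchored to the log-level field. It assumes a hypothetical log format of `<timestamp> <LEVEL> <message>`; the sample lines are invented for the example, and a real pattern should follow the service's actual format.

```python
import re

NAIVE = re.compile(r"ERROR")
# Assumes the hypothetical format "<timestamp> <LEVEL> <message>".
ANCHORED = re.compile(r"^\S+ ERROR\b")

lines = [
    "2026-01-15T10:00:00Z INFO loaded ERROR_THRESHOLD=5 from env",
    "2026-01-15T10:00:01Z ERROR db write failed: timeout",
]

naive_hits = [line for line in lines if NAIVE.search(line)]
anchored_hits = [line for line in lines if ANCHORED.search(line)]

print(len(naive_hits), len(anchored_hits))  # 2 1
```

The naive pattern flags the `INFO` line because a variable name contains the substring; the anchored pattern only matches when `ERROR` is the level field.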
|