mindforge-cc 11.5.1 → 11.6.0
- package/.agent/mindforge/skill-tdd.md +53 -0
- package/.agent/mindforge/skills-index.md +118 -0
- package/.agent/mindforge/systematic-debug.md +60 -0
- package/.agent/skills/1password-skill/SKILL.md +156 -0
- package/.agent/skills/1password-skill/references/cli-examples.md +31 -0
- package/.agent/skills/1password-skill/references/get-started.md +21 -0
- package/.agent/skills/article-illustrator/SKILL.md +199 -0
- package/.agent/skills/article-illustrator/references/prompt-construction.md +426 -0
- package/.agent/skills/article-illustrator/references/style-presets.md +80 -0
- package/.agent/skills/article-illustrator/references/styles.md +224 -0
- package/.agent/skills/article-illustrator/references/usage.md +50 -0
- package/.agent/skills/article-illustrator/references/workflow.md +332 -0
- package/.agent/skills/arxiv/SKILL.md +275 -0
- package/.agent/skills/blogwatcher/SKILL.md +130 -0
- package/.agent/skills/code-wiki/SKILL.md +438 -0
- package/.agent/skills/code-wiki/templates/README.md +31 -0
- package/.agent/skills/code-wiki/templates/architecture.md +30 -0
- package/.agent/skills/code-wiki/templates/getting-started.md +47 -0
- package/.agent/skills/code-wiki/templates/module.md +38 -0
- package/.agent/skills/codebase-inspection/SKILL.md +109 -0
- package/.agent/skills/comic-creator/SKILL.md +240 -0
- package/.agent/skills/comic-creator/references/analysis-framework.md +176 -0
- package/.agent/skills/comic-creator/references/auto-selection.md +71 -0
- package/.agent/skills/comic-creator/references/base-prompt.md +98 -0
- package/.agent/skills/comic-creator/references/character-template.md +180 -0
- package/.agent/skills/comic-creator/references/ohmsha-guide.md +85 -0
- package/.agent/skills/comic-creator/references/partial-workflows.md +106 -0
- package/.agent/skills/comic-creator/references/storyboard-template.md +143 -0
- package/.agent/skills/comic-creator/references/workflow.md +401 -0
- package/.agent/skills/concept-diagrams/SKILL.md +355 -0
- package/.agent/skills/concept-diagrams/references/dashboard-patterns.md +43 -0
- package/.agent/skills/concept-diagrams/references/infrastructure-patterns.md +144 -0
- package/.agent/skills/concept-diagrams/references/physical-shape-cookbook.md +42 -0
- package/.agent/skills/creative-ideation/SKILL.md +144 -0
- package/.agent/skills/creative-ideation/references/full-prompt-library.md +110 -0
- package/.agent/skills/devops-cli/SKILL.md +149 -0
- package/.agent/skills/devops-cli/references/app-discovery.md +112 -0
- package/.agent/skills/devops-cli/references/authentication.md +59 -0
- package/.agent/skills/devops-cli/references/cli-reference.md +104 -0
- package/.agent/skills/devops-cli/references/running-apps.md +171 -0
- package/.agent/skills/devops-watchers/SKILL.md +103 -0
- package/.agent/skills/docker-management/SKILL.md +273 -0
- package/.agent/skills/domain-intel/SKILL.md +96 -0
- package/.agent/skills/duckduckgo-search/SKILL.md +230 -0
- package/.agent/skills/github-auth/SKILL.md +240 -0
- package/.agent/skills/github-code-review/SKILL.md +474 -0
- package/.agent/skills/github-code-review/references/review-output-template.md +74 -0
- package/.agent/skills/github-issues/SKILL.md +363 -0
- package/.agent/skills/github-issues/templates/bug-report.md +35 -0
- package/.agent/skills/github-issues/templates/feature-request.md +31 -0
- package/.agent/skills/github-pr-workflow/SKILL.md +360 -0
- package/.agent/skills/github-pr-workflow/references/ci-troubleshooting.md +183 -0
- package/.agent/skills/github-pr-workflow/references/conventional-commits.md +71 -0
- package/.agent/skills/github-pr-workflow/templates/pr-body-bugfix.md +35 -0
- package/.agent/skills/github-pr-workflow/templates/pr-body-feature.md +33 -0
- package/.agent/skills/github-repo-management/SKILL.md +509 -0
- package/.agent/skills/github-repo-management/references/github-api-cheatsheet.md +161 -0
- package/.agent/skills/godmode/SKILL.md +396 -0
- package/.agent/skills/godmode/references/jailbreak-templates.md +128 -0
- package/.agent/skills/godmode/references/refusal-detection.md +142 -0
- package/.agent/skills/hyperframes/SKILL.md +182 -0
- package/.agent/skills/hyperframes/references/cli.md +185 -0
- package/.agent/skills/hyperframes/references/composition.md +129 -0
- package/.agent/skills/hyperframes/references/features.md +289 -0
- package/.agent/skills/hyperframes/references/gsap.md +136 -0
- package/.agent/skills/hyperframes/references/troubleshooting.md +137 -0
- package/.agent/skills/hyperframes/references/website-to-video.md +145 -0
- package/.agent/skills/jupyter-live-kernel/SKILL.md +160 -0
- package/.agent/skills/kanban-orchestrator/SKILL.md +209 -0
- package/.agent/skills/kanban-worker/SKILL.md +188 -0
- package/.agent/skills/llm-wiki/SKILL.md +499 -0
- package/.agent/skills/meme-generation/SKILL.md +122 -0
- package/.agent/skills/node-inspect-debugger/SKILL.md +312 -0
- package/.agent/skills/obsidian/SKILL.md +60 -0
- package/.agent/skills/osint-investigation/SKILL.md +269 -0
- package/.agent/skills/osint-investigation/templates/source-template.md +59 -0
- package/.agent/skills/oss-forensics/SKILL.md +422 -0
- package/.agent/skills/oss-forensics/references/evidence-types.md +89 -0
- package/.agent/skills/oss-forensics/references/github-archive-guide.md +184 -0
- package/.agent/skills/oss-forensics/references/investigation-templates.md +131 -0
- package/.agent/skills/oss-forensics/references/recovery-techniques.md +164 -0
- package/.agent/skills/oss-forensics/templates/forensic-report.md +151 -0
- package/.agent/skills/oss-forensics/templates/malicious-package-report.md +43 -0
- package/.agent/skills/parallel-cli/SKILL.md +384 -0
- package/.agent/skills/pinggy-tunnel/SKILL.md +302 -0
- package/.agent/skills/pixel-art/SKILL.md +209 -0
- package/.agent/skills/pixel-art/references/palettes.md +49 -0
- package/.agent/skills/plan/SKILL.md +331 -0
- package/.agent/skills/polymarket/SKILL.md +75 -0
- package/.agent/skills/polymarket/references/api-endpoints.md +220 -0
- package/.agent/skills/python-debugpy/SKILL.md +368 -0
- package/.agent/skills/requesting-code-review/SKILL.md +273 -0
- package/.agent/skills/research-paper-writing/SKILL.md +2367 -0
- package/.agent/skills/research-paper-writing/references/autoreason-methodology.md +394 -0
- package/.agent/skills/research-paper-writing/references/checklists.md +434 -0
- package/.agent/skills/research-paper-writing/references/citation-workflow.md +563 -0
- package/.agent/skills/research-paper-writing/references/experiment-patterns.md +728 -0
- package/.agent/skills/research-paper-writing/references/human-evaluation.md +476 -0
- package/.agent/skills/research-paper-writing/references/paper-types.md +481 -0
- package/.agent/skills/research-paper-writing/references/reviewer-guidelines.md +433 -0
- package/.agent/skills/research-paper-writing/references/sources.md +191 -0
- package/.agent/skills/research-paper-writing/references/writing-guide.md +474 -0
- package/.agent/skills/research-paper-writing/templates/README.md +251 -0
- package/.agent/skills/rest-graphql-debug/SKILL.md +507 -0
- package/.agent/skills/s6-container-supervision/SKILL.md +171 -0
- package/.agent/skills/scrapling/SKILL.md +328 -0
- package/.agent/skills/sherlock/SKILL.md +186 -0
- package/.agent/skills/simplify-code/SKILL.md +168 -0
- package/.agent/skills/skill-authoring/SKILL.md +158 -0
- package/.agent/skills/spike/SKILL.md +190 -0
- package/.agent/skills/subagent-driven-development/SKILL.md +345 -0
- package/.agent/skills/subagent-driven-development/references/context-budget-discipline.md +53 -0
- package/.agent/skills/subagent-driven-development/references/gates-taxonomy.md +93 -0
- package/.agent/skills/systematic-debugging/SKILL.md +360 -0
- package/.agent/skills/test-driven-development/SKILL.md +336 -0
- package/.agent/skills/video-orchestrator/SKILL.md +194 -0
- package/.agent/skills/video-orchestrator/references/examples.md +227 -0
- package/.agent/skills/video-orchestrator/references/intake.md +166 -0
- package/.agent/skills/video-orchestrator/references/kanban-setup.md +278 -0
- package/.agent/skills/video-orchestrator/references/monitoring.md +180 -0
- package/.agent/skills/video-orchestrator/references/role-archetypes.md +298 -0
- package/.agent/skills/video-orchestrator/references/tool-matrix.md +317 -0
- package/.agent/skills/web-pentest/SKILL.md +332 -0
- package/.agent/skills/web-pentest/references/bypass-techniques.md +133 -0
- package/.agent/skills/web-pentest/references/exploitation-techniques.md +204 -0
- package/.agent/skills/web-pentest/references/scope-enforcement.md +110 -0
- package/.agent/skills/web-pentest/references/vuln-taxonomy.md +81 -0
- package/.agent/skills/web-pentest/templates/authorization.md +69 -0
- package/.agent/skills/web-pentest/templates/pentest-report.md +178 -0
- package/.claude/commands/mindforge/skill-tdd.md +53 -0
- package/.claude/commands/mindforge/skills-index.md +118 -0
- package/.claude/commands/mindforge/systematic-debug.md +60 -0
- package/.mindforge/config.json +2 -2
- package/.mindforge/memory/sync-manifest.json +1 -1
- package/.mindforge/skills/arxiv/SKILL.md +294 -0
- package/.mindforge/skills/blogwatcher/SKILL.md +147 -0
- package/.mindforge/skills/code-wiki/SKILL.md +457 -0
- package/.mindforge/skills/codebase-inspection/SKILL.md +126 -0
- package/.mindforge/skills/concept-diagrams/SKILL.md +373 -0
- package/.mindforge/skills/creative-ideation/SKILL.md +162 -0
- package/.mindforge/skills/domain-intel/SKILL.md +116 -0
- package/.mindforge/skills/duckduckgo-search/SKILL.md +249 -0
- package/.mindforge/skills/github-code-review/SKILL.md +493 -0
- package/.mindforge/skills/github-issues/SKILL.md +382 -0
- package/.mindforge/skills/github-pr-workflow/SKILL.md +379 -0
- package/.mindforge/skills/jupyter-live-kernel/SKILL.md +179 -0
- package/.mindforge/skills/kanban-orchestrator/SKILL.md +227 -0
- package/.mindforge/skills/kanban-worker/SKILL.md +206 -0
- package/.mindforge/skills/meme-generation/SKILL.md +141 -0
- package/.mindforge/skills/obsidian/SKILL.md +80 -0
- package/.mindforge/skills/osint-investigation/SKILL.md +288 -0
- package/.mindforge/skills/oss-forensics/SKILL.md +421 -0
- package/.mindforge/skills/pixel-art/SKILL.md +228 -0
- package/.mindforge/skills/plan/SKILL.md +350 -0
- package/.mindforge/skills/requesting-code-review/SKILL.md +292 -0
- package/.mindforge/skills/research-paper-writing/SKILL.md +2384 -0
- package/.mindforge/skills/scrapling/SKILL.md +345 -0
- package/.mindforge/skills/sherlock/SKILL.md +203 -0
- package/.mindforge/skills/simplify-code/SKILL.md +187 -0
- package/.mindforge/skills/spike/SKILL.md +209 -0
- package/.mindforge/skills/subagent-driven-development/SKILL.md +364 -0
- package/.mindforge/skills/systematic-debugging/SKILL.md +379 -0
- package/.mindforge/skills/test-driven-development/SKILL.md +355 -0
- package/.mindforge/skills/web-pentest/SKILL.md +327 -0
- package/CHANGELOG.md +43 -0
- package/MINDFORGE.md +2 -2
- package/README.md +39 -3
- package/RELEASENOTES.md +55 -0
- package/docs/getting-started.md +42 -5
- package/package.json +1 -1
|
@@ -0,0 +1,507 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: rest-graphql-debug
|
|
3
|
+
description: "Debug REST/GraphQL APIs: status codes, auth, schemas, repro."
|
|
4
|
+
version: 1.2.0
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# API Testing & Debugging
|
|
8
|
+
|
|
9
|
+
Drive REST and GraphQL diagnosis through tools — `terminal` for `curl`, `execute_code` for Python `requests`, `web_extract` for vendor docs. Isolate the failing layer before guessing at the fix.
|
|
10
|
+
|
|
11
|
+
## When to Use
|
|
12
|
+
|
|
13
|
+
- API returns unexpected status or body
|
|
14
|
+
- Auth fails (401/403 after token refresh, OAuth, API key)
|
|
15
|
+
- Works in Postman but fails in code
|
|
16
|
+
- Webhook / callback integration debugging
|
|
17
|
+
- Building or reviewing API integration tests
|
|
18
|
+
- Rate limiting or pagination issues
|
|
19
|
+
|
|
20
|
+
Skip for UI rendering, DB query tuning, or DNS/firewall infra (escalate).
|
|
21
|
+
|
|
22
|
+
## Core Principle
|
|
23
|
+
|
|
24
|
+
**Isolate the layer, then fix.** A 200 OK can hide broken data. A 500 can mask a one-character auth typo. Walk the chain in order; never skip a step.
|
|
25
|
+
|
|
26
|
+
```
|
|
27
|
+
1. Connectivity → can we reach the host at all?
|
|
28
|
+
1.5 Timeouts → connect-slow vs read-slow?
|
|
29
|
+
2. TLS/SSL → cert valid and trusted?
|
|
30
|
+
3. Auth → credentials correct and unexpired?
|
|
31
|
+
4. Request format → payload shape match server expectations?
|
|
32
|
+
5. Response parse → does our code accept what came back?
|
|
33
|
+
6. Semantics → does the data mean what we assume?
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
## 5-Minute Quickstart
|
|
37
|
+
|
|
38
|
+
### REST via terminal
|
|
39
|
+
|
|
40
|
+
```python
|
|
41
|
+
# Verbose request/response exchange
|
|
42
|
+
terminal('curl -v https://api.example.com/users/1')
|
|
43
|
+
|
|
44
|
+
# POST with JSON
|
|
45
|
+
terminal("""curl -X POST https://api.example.com/users \\
|
|
46
|
+
-H 'Content-Type: application/json' \\
|
|
47
|
+
-H "Authorization: Bearer $TOKEN" \\
|
|
48
|
+
-d '{"name":"test","email":"test@example.com"}'""")
|
|
49
|
+
|
|
50
|
+
# Headers only
|
|
51
|
+
terminal('curl -sI https://api.example.com/health')
|
|
52
|
+
|
|
53
|
+
# Pretty-print JSON
|
|
54
|
+
terminal('curl -s https://api.example.com/users | python3 -m json.tool')
|
|
55
|
+
```
|
|
56
|
+
|
|
57
|
+
### GraphQL via terminal
|
|
58
|
+
|
|
59
|
+
```python
|
|
60
|
+
terminal("""curl -X POST https://api.example.com/graphql \\
|
|
61
|
+
-H 'Content-Type: application/json' \\
|
|
62
|
+
-H "Authorization: Bearer $TOKEN" \\
|
|
63
|
+
-d '{"query":"{ user(id: 1) { name email } }"}'""")
|
|
64
|
+
```
|
|
65
|
+
|
|
66
|
+
**GraphQL gotcha:** servers often return HTTP 200 even when the query failed. Always inspect the `errors` field regardless of status code:
|
|
67
|
+
|
|
68
|
+
```python
execute_code('''
import os, requests
resp = requests.post(
    "https://api.example.com/graphql",
    json={"query": "{ user(id: 1) { name email } }"},
    headers={"Authorization": f"Bearer {os.environ['TOKEN']}"},
    timeout=10,
)
data = resp.json()
if data.get("errors"):
    for err in data["errors"]:
        print(f"GraphQL error: {err['message']} (path: {err.get('path')})")
print(data.get("data"))
''')
```
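
A GraphQL response can carry both `data` and `errors` at once (partial success), so it can help to separate the two before deciding whether the call failed. The sketch below is a minimal illustration; `unpack_graphql` and the sample body are hypothetical names and data, not part of any library.

```python
from typing import Optional


def unpack_graphql(body: dict) -> tuple[Optional[dict], list[str]]:
    """Return (data, error_messages) from a decoded GraphQL response body."""
    errors = body.get("errors") or []
    messages = [
        f"{err.get('message', '<no message>')} (path: {err.get('path')})"
        for err in errors
    ]
    return body.get("data"), messages


# Hypothetical partial-success body: HTTP 200, but one field failed to resolve.
sample = {
    "data": {"user": {"name": "test", "email": None}},
    "errors": [{"message": "email is restricted", "path": ["user", "email"]}],
}
data, problems = unpack_graphql(sample)
print(data)
print(problems)
```

Treating a non-empty message list as a failure, even when `data` is present, keeps partial results from being mistaken for a clean response.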
|
|
84
|
+
|
|
85
|
+
### Python (requests) via execute_code
|
|
86
|
+
|
|
87
|
+
```python
|
|
88
|
+
execute_code('''
|
|
89
|
+
import requests
|
|
90
|
+
resp = requests.get(
|
|
91
|
+
"https://api.example.com/users/1",
|
|
92
|
+
headers={"Authorization": "Bearer <TOKEN>"},
|
|
93
|
+
timeout=(3.05, 30), # (connect, read)
|
|
94
|
+
)
|
|
95
|
+
print(resp.status_code, dict(resp.headers))
|
|
96
|
+
print(resp.text[:500])
|
|
97
|
+
''')
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
## Layered Debug Flow
|
|
101
|
+
|
|
102
|
+
### Step 1 — Connectivity
|
|
103
|
+
|
|
104
|
+
```python
|
|
105
|
+
terminal('nslookup api.example.com')
|
|
106
|
+
terminal('curl -v --connect-timeout 5 https://api.example.com/health')
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
Failures: DNS not resolving, firewall, VPN required, proxy missing.
|
|
110
|
+
|
|
111
|
+
### Step 1.5 — Timeouts
|
|
112
|
+
|
|
113
|
+
Distinguish *can't reach* from *reaches but slow*:
|
|
114
|
+
|
|
115
|
+
```python
|
|
116
|
+
terminal('''curl -w "dns:%{time_namelookup}s connect:%{time_connect}s tls:%{time_appconnect}s ttfb:%{time_starttransfer}s total:%{time_total}s\\n" \\
|
|
117
|
+
-o /dev/null -s https://api.example.com/endpoint''')
|
|
118
|
+
```
|
|
119
|
+
|
|
120
|
+
In Python, always pass a `(connect, read)` timeout tuple. `requests` sets no timeout by default, so a stalled call can hang indefinitely:
|
|
121
|
+
|
|
122
|
+
```python
execute_code('''
import requests
from requests.exceptions import ConnectTimeout, ReadTimeout

url = "https://api.example.com/endpoint"  # placeholder endpoint
try:
    requests.get(url, timeout=(3.05, 30))  # (connect, read)
except ConnectTimeout:
    print("Connect timed out: firewall, VPN, or routing")
except ReadTimeout:
    print("Connected but server is slow")
except requests.exceptions.ConnectionError:
    # DNS failures and refused connections land here, not in ConnectTimeout
    print("Cannot reach host: DNS or connection refused")
''')
```
|
|
134
|
+
|
|
135
|
+
Diagnosis: high `time_connect` is network/firewall; high `time_starttransfer` with low `time_connect` is a slow server.
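
The diagnosis rule above can be sketched as a small classifier over the `curl -w` timings. This is only an illustration: `diagnose_timing` and the 1-second `slow` threshold are assumptions, and since curl reports each value as cumulative time from the start of the request, the sketch subtracts earlier phases first.

```python
def diagnose_timing(t: dict[str, float], slow: float = 1.0) -> str:
    """Classify which phase dominated, given cumulative curl -w timings."""
    dns = t["time_namelookup"]
    tcp = t["time_connect"] - t["time_namelookup"]
    # time_appconnect is 0 for plain HTTP, so fall back to time_connect.
    handshake_done = max(t["time_appconnect"], t["time_connect"])
    server = t["time_starttransfer"] - handshake_done
    if dns > slow:
        return "slow DNS lookup"
    if tcp > slow:
        return "slow TCP connect: network or firewall"
    if server > slow:
        return "slow server: connected quickly, long wait for first byte"
    return "no single phase above threshold"


# Hypothetical timings copied from a curl -w run.
timings = {
    "time_namelookup": 0.01,
    "time_connect": 0.05,
    "time_appconnect": 0.12,
    "time_starttransfer": 4.20,
}
print(diagnose_timing(timings))
```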
|
|
136
|
+
|
|
137
|
+
### Step 2 — TLS/SSL
|
|
138
|
+
|
|
139
|
+
```python
|
|
140
|
+
terminal('curl -vI https://api.example.com 2>&1 | grep -E "SSL|subject|expire|issuer"')
|
|
141
|
+
```
|
|
142
|
+
|
|
143
|
+
Failures: expired cert, self-signed, hostname mismatch, missing CA bundle. Use `-k` only for ad-hoc debug, never in code.
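
To turn the certificate dates into a number you can alert on, one option is the standard library's `ssl.cert_time_to_seconds`. The sketch below assumes the input is a `notAfter=...` line as printed by `openssl x509 -noout -dates`; `days_until_expiry` is a hypothetical helper name.

```python
import ssl
import time


def days_until_expiry(not_after_line: str, now: float) -> float:
    """Days remaining, given a 'notAfter=<date> GMT' line and a Unix timestamp."""
    cert_time = not_after_line.split("=", 1)[-1].strip()
    return (ssl.cert_time_to_seconds(cert_time) - now) / 86400


# Sample line in the format openssl prints; the date itself is illustrative.
line = "notAfter=Jan  5 09:34:43 2018 GMT"
remaining = days_until_expiry(line, time.time())
print("expired" if remaining < 0 else f"{remaining:.0f} days left")
```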
|
|
144
|
+
|
|
145
|
+
### Step 3 — Authentication
|
|
146
|
+
|
|
147
|
+
```python
|
|
148
|
+
# Token validity check
|
|
149
|
+
terminal('curl -s -o /dev/null -w "%{http_code}\\n" -H "Authorization: Bearer $TOKEN" https://api.example.com/me')
|
|
150
|
+
|
|
151
|
+
# Decode JWT exp claim — handles base64url padding correctly
|
|
152
|
+
execute_code('''
|
|
153
|
+
import json, base64, os
|
|
154
|
+
tok = os.environ["TOKEN"]
|
|
155
|
+
payload = tok.split(".")[1]
|
|
156
|
+
payload += "=" * (-len(payload) % 4)
|
|
157
|
+
print(json.dumps(json.loads(base64.urlsafe_b64decode(payload)), indent=2))
|
|
158
|
+
''')
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
Checklist:
|
|
162
|
+
- Token expired? (`exp` claim in JWT)
|
|
163
|
+
- Right scheme? Bearer vs Basic vs Token vs `X-Api-Key`
|
|
164
|
+
- Right environment? Staging key on prod is a classic
|
|
165
|
+
- API key in header vs query param (`?api_key=…`)?
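
The first checklist item can be automated by comparing the `exp` claim with the current time. The sketch below decodes the payload without verifying the signature, so it is suitable for debugging only; `jwt_seconds_left` and `make_unsigned_token` are hypothetical helpers, and the token is built locally rather than taken from a real issuer.

```python
import base64
import json
import time
from typing import Optional


def jwt_seconds_left(token: str, now: Optional[float] = None) -> Optional[float]:
    """Seconds until the JWT's exp claim, or None if the token has no exp."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore base64url padding
    claims = json.loads(base64.urlsafe_b64decode(payload))
    exp = claims.get("exp")
    if exp is None:
        return None
    return exp - (time.time() if now is None else now)


def make_unsigned_token(claims: dict) -> str:
    """Build a throwaway, unsigned JWT-shaped string for local testing."""
    def enc(obj: dict) -> str:
        raw = base64.urlsafe_b64encode(json.dumps(obj).encode())
        return raw.rstrip(b"=").decode()
    return f"{enc({'alg': 'none'})}.{enc(claims)}.sig"


token = make_unsigned_token({"sub": "user-1", "exp": int(time.time()) - 60})
left = jwt_seconds_left(token)
print("expired" if left is not None and left <= 0 else "still valid or no exp")
```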
|
|
166
|
+
|
|
167
|
+
### Step 4 — Request Format
|
|
168
|
+
|
|
169
|
+
```python
|
|
170
|
+
terminal("""curl -v -X POST https://api.example.com/endpoint \\
|
|
171
|
+
-H 'Content-Type: application/json' \\
|
|
172
|
+
-d '{"key":"value"}' 2>&1""")
|
|
173
|
+
```
|
|
174
|
+
|
|
175
|
+
**Content-Type / body mismatch — the silent 415/400:**
|
|
176
|
+
|
|
177
|
+
```python
# WRONG: data=<dict> is sent form-encoded, so the JSON header lies
requests.post(url, data={"k": "v"}, headers={"Content-Type": "application/json"})

# RIGHT: json= serializes the body AND sets the header
requests.post(url, json={"k": "v"})

# WRONG: Accept asks for XML, then the code calls .json()
requests.get(url, headers={"Accept": "text/xml"}).json()

# RIGHT: let requests build the multipart body and boundary
with open("doc.pdf", "rb") as fh:
    requests.post(url, files={"file": fh})
```
|
|
190
|
+
|
|
191
|
+
Common: form-encoded vs JSON, missing required fields, wrong HTTP method, unencoded query params.
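
For the unencoded query parameter case, a short standard-library sketch shows what correct encoding looks like. The `/search` path and parameter names are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode

# Values containing '=', '&', or '+' corrupt the query if pasted in raw.
params = {"q": "name=a&b", "tag": "c++", "page": 2}
query = urlencode(params)
print(f"https://api.example.com/search?{query}")

# Round trip: the server-side view of the encoded query.
print(parse_qs(query))
```

With `requests`, passing the same dict as `params=` performs this encoding for you, which is usually safer than building the URL by hand.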
|
|
192
|
+
|
|
193
|
+
### Step 5 — Response Parsing
|
|
194
|
+
|
|
195
|
+
Always inspect content-type before calling `.json()`:
|
|
196
|
+
|
|
197
|
+
```python
execute_code('''
import requests
resp = requests.post(url, json=payload, timeout=10)
print(f"status={resp.status_code}")
print(f"headers={dict(resp.headers)}")
ct = resp.headers.get("Content-Type", "")
if "application/json" in ct:
    print(resp.json())
else:
    print(f"unexpected content-type {ct!r}, body={resp.text[:500]!r}")
''')
```
|
|
210
|
+
|
|
211
|
+
Failures: HTML error page where JSON expected, empty body, wrong charset.
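
A plain substring check for `application/json` misses JSON-based media types such as `application/problem+json`, which some APIs use for error bodies. A slightly broader check might look like the sketch below; `is_json_content_type` is a hypothetical helper.

```python
def is_json_content_type(content_type: str) -> bool:
    """True for application/json and any '+json' structured-syntax type."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


for header in (
    "application/json; charset=utf-8",
    "application/problem+json",
    "text/html; charset=utf-8",
):
    print(header, "->", is_json_content_type(header))
```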
|
|
212
|
+
|
|
213
|
+
### Step 6 — Semantic Validation
|
|
214
|
+
|
|
215
|
+
Parsed cleanly — but is the data *correct*?
|
|
216
|
+
|
|
217
|
+
- Does `"status": "active"` mean what your code thinks?
|
|
218
|
+
- ID in response matches the one requested?
|
|
219
|
+
- Timestamps in expected timezone?
|
|
220
|
+
- Pagination returning all results, or just page 1?
|
|
221
|
+
|
|
222
|
+
## HTTP Status Playbook
|
|
223
|
+
|
|
224
|
+
### 401 Unauthorized — credentials missing or invalid
|
|
225
|
+
|
|
226
|
+
1. `Authorization` header actually present? (`curl -v` to confirm)
|
|
227
|
+
2. Token correct and unexpired?
|
|
228
|
+
3. Right auth scheme? (`Bearer` vs `Basic` vs `Token`)
|
|
229
|
+
4. Some APIs use query param (`?api_key=…`) instead of header.
|
|
230
|
+
|
|
231
|
+
### 403 Forbidden — authenticated but not authorized
|
|
232
|
+
|
|
233
|
+
1. Token has the required scopes/permissions?
|
|
234
|
+
2. Resource owned by a different account?
|
|
235
|
+
3. IP allowlist blocking you?
|
|
236
|
+
4. CORS in browser? (check `Access-Control-Allow-Origin`)
|
|
237
|
+
|
|
238
|
+
### 404 Not Found — resource doesn't exist or URL is wrong
|
|
239
|
+
|
|
240
|
+
1. Path correct? (trailing slash, typo, version prefix)
|
|
241
|
+
2. Resource ID exists?
|
|
242
|
+
3. Right API version (`/v1/` vs `/v2/`)?
|
|
243
|
+
4. Right base URL (staging vs prod)?
|
|
244
|
+
|
|
245
|
+
### 409 Conflict — state collision
|
|
246
|
+
|
|
247
|
+
1. Resource already exists (duplicate create)?
|
|
248
|
+
2. Stale `ETag` / `If-Match`?
|
|
249
|
+
3. Concurrent modification by another process?
|
|
250
|
+
|
|
251
|
+
### 422 Unprocessable Entity — valid JSON, invalid data
|
|
252
|
+
|
|
253
|
+
The error body usually names the bad fields. Check:
|
|
254
|
+
- Field types (string vs int, date format)
|
|
255
|
+
- Required vs optional
|
|
256
|
+
- Enum values inside the allowed set
|
|
257
|
+
|
|
258
|
+
### 429 Too Many Requests — rate limited
|
|
259
|
+
|
|
260
|
+
Check `Retry-After` and `X-RateLimit-*` headers. Exponential backoff:
|
|
261
|
+
|
|
262
|
+
```python
execute_code('''
import time, requests

def with_backoff(method, url, **kwargs):
    for attempt in range(5):
        resp = requests.request(method, url, **kwargs)
        if resp.status_code != 429:
            return resp
        retry_after = resp.headers.get("Retry-After", "")
        # Retry-After may be an HTTP-date instead of seconds; use the exponential delay then
        wait = int(retry_after) if retry_after.isdigit() else 2 ** attempt
        time.sleep(wait)
    return resp
''')
```
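
`Retry-After` may carry either a number of seconds or an HTTP-date, so a retry loop benefits from a parser that accepts both. The sketch below uses only the standard library; `retry_after_seconds` is a hypothetical helper, and the `default` fallback is whatever delay your loop would otherwise use.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: Optional[str], now: datetime, default: float) -> float:
    """Seconds to wait for a Retry-After value (delta-seconds or HTTP-date)."""
    if not value:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to the caller's delay
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - now).total_seconds())


now = datetime.now(timezone.utc)
print(retry_after_seconds("120", now, default=1.0))
print(retry_after_seconds("Wed, 21 Oct 2015 07:28:00 GMT", now, default=1.0))
```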
|
|
276
|
+
|
|
277
|
+
### 5xx — server-side, usually not your fault
|
|
278
|
+
|
|
279
|
+
- **500** — server bug. Capture correlation ID, file with provider.
|
|
280
|
+
- **502** — upstream down. Backoff + retry.
|
|
281
|
+
- **503** — overloaded / maintenance. Check status page.
|
|
282
|
+
- **504** — upstream timeout. Reduce payload or raise timeout.
|
|
283
|
+
|
|
284
|
+
For all 5xx: backoff with jitter, alert on persistence.
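
One common way to add jitter is the "full jitter" variant, where each delay is drawn uniformly between zero and the capped exponential value. The sketch below is an illustration under that assumption; `backoff_delays`, `base`, and `cap` are hypothetical names and defaults.

```python
import random
from typing import Optional


def backoff_delays(
    attempts: int,
    base: float = 0.5,
    cap: float = 30.0,
    rng: Optional[random.Random] = None,
) -> list[float]:
    """Full-jitter delays: uniform in [0, min(cap, base * 2**n)] per attempt."""
    if rng is None:
        rng = random.Random()
    return [rng.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]


for n, delay in enumerate(backoff_delays(5)):
    print(f"attempt {n}: sleep {delay:.2f}s")
```

Randomizing the whole interval spreads retries from many clients apart, which is the point of jitter when an upstream is already struggling.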
|
|
285
|
+
|
|
286
|
+
## Pagination & Idempotency
|
|
287
|
+
|
|
288
|
+
**Pagination.** Verify you're getting *all* results. Look for `next_cursor`, `next_page`, `total_count`. Two patterns:
|
|
289
|
+
- Offset (`?limit=100&offset=200`) — simple, can skip items if data shifts.
|
|
290
|
+
- Cursor (`?cursor=abc123`) — preferred for live or large datasets.
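
A cursor loop is easier to trust with a page cap, so a server that never stops returning a cursor cannot spin forever. The sketch below injects the page fetcher so it runs without a network; `collect_all`, `max_pages`, and the in-memory `PAGES` are hypothetical, and the `data` / `next_cursor` keys are assumed response fields.

```python
from typing import Callable, Optional


def collect_all(fetch_page: Callable[[Optional[str]], dict], max_pages: int = 1000) -> list:
    """Follow next_cursor until it is empty, or fail after max_pages."""
    items: list = []
    cursor: Optional[str] = None
    for _ in range(max_pages):
        page = fetch_page(cursor)
        items.extend(page["data"])
        cursor = page.get("next_cursor")
        if not cursor:
            return items
    raise RuntimeError(f"gave up after {max_pages} pages; cursor never ended")


# In-memory stand-in for the API: two pages, keyed by the cursor that requests them.
PAGES = {
    None: {"data": [1, 2], "next_cursor": "a"},
    "a": {"data": [3], "next_cursor": None},
}
print(collect_all(PAGES.__getitem__))
```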
|
|
291
|
+
|
|
292
|
+
**Idempotency.** For non-idempotent operations (POST), send `Idempotency-Key: <uuid>` so retries don't double-charge / double-create. Mandatory for payments and orders.
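
The detail that matters is generating the key once and reusing it on every retry of the same logical operation. The sketch below shows that shape with an injected `send` callable so it runs offline; `post_with_retries` and `flaky_send` are hypothetical, and whether a given API honors `Idempotency-Key` is something to confirm in its docs.

```python
import uuid
from typing import Callable


def post_with_retries(send: Callable[[dict, dict], int], payload: dict, attempts: int = 3) -> int:
    """Retry on 5xx, sending the same Idempotency-Key every time."""
    headers = {"Idempotency-Key": str(uuid.uuid4())}  # one key per logical operation
    status = 0
    for _ in range(attempts):
        status = send(payload, headers)
        if status < 500:
            break
    return status


seen_keys: list[str] = []


def flaky_send(payload: dict, headers: dict) -> int:
    """Stand-in transport: fails once with 503, then succeeds with 201."""
    seen_keys.append(headers["Idempotency-Key"])
    return 503 if len(seen_keys) == 1 else 201


print(post_with_retries(flaky_send, {"amount": 100}))
print(len(set(seen_keys)))  # number of distinct keys sent across attempts
```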
|
|
293
|
+
|
|
294
|
+
## Contract Validation
|
|
295
|
+
|
|
296
|
+
Catch schema drift before it hits production:
|
|
297
|
+
|
|
298
|
+
```python
execute_code('''
import os, requests

BASE = "https://api.example.com"
HEADERS = {"Authorization": f"Bearer {os.environ['API_TOKEN']}"}

def validate_user(data: dict) -> list[str]:
    errors = []
    required = {"id": int, "email": str, "created_at": str}
    for field, expected in required.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            errors.append(f"{field}: want {expected.__name__}, got {type(data[field]).__name__}")
    return errors

resp = requests.get(f"{BASE}/users/1", headers=HEADERS, timeout=10)
issues = validate_user(resp.json())
if issues:
    print(f"contract violations: {issues}")
''')
```
|
|
318
|
+
|
|
319
|
+
Run after API upgrades, when integrating new third parties, or in CI smoke tests.
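
One subtlety when checking types with `isinstance`: in Python, `bool` is a subclass of `int`, so a field that drifted from `42` to `true` would still pass an `int` check. A stricter comparison could look like the sketch below; `type_ok` is a hypothetical helper.

```python
def type_ok(value: object, expected: type) -> bool:
    """isinstance check that does not accept booleans where an int is expected."""
    if expected is int and isinstance(value, bool):
        return False
    return isinstance(value, expected)


print(isinstance(True, int))  # the pitfall: bool passes a plain int check
print(type_ok(True, int))
print(type_ok(42, int))
```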
|
|
320
|
+
|
|
321
|
+
## Correlation IDs
|
|
322
|
+
|
|
323
|
+
Always capture the provider's request ID — fastest path to vendor support:
|
|
324
|
+
|
|
325
|
+
```python
execute_code('''
import requests
resp = requests.post(url, json=payload, headers=headers, timeout=10)
request_id = (
    resp.headers.get("X-Request-Id")
    or resp.headers.get("X-Trace-Id")
    or resp.headers.get("CF-Ray")  # Cloudflare
)
if resp.status_code >= 400:
    print(f"failed status={resp.status_code} req_id={request_id} ts={resp.headers.get('Date')}")
''')
```
|
|
338
|
+
|
|
339
|
+
**Vendor bug-report template:**
|
|
340
|
+
|
|
341
|
+
```
|
|
342
|
+
Endpoint: POST /api/v1/orders
|
|
343
|
+
Request ID: req_abc123xyz
|
|
344
|
+
Timestamp: 2026-03-17T14:30:00Z
|
|
345
|
+
Status: 500
|
|
346
|
+
Expected: 201 with order object
|
|
347
|
+
Actual: 500 {"error":"internal server error"}
|
|
348
|
+
Repro: curl -X POST … (auth: <REDACTED>)
|
|
349
|
+
```
|
|
350
|
+
|
|
351
|
+
## Regression Test Template
|
|
352
|
+
|
|
353
|
+
Drop this into `tests/` and run via `terminal('pytest tests/test_api_smoke.py -v')`:
|
|
354
|
+
|
|
355
|
+
```python
import os, requests, pytest

BASE_URL = os.environ.get("API_BASE_URL", "https://api.example.com")
TOKEN = os.environ.get("API_TOKEN", "")
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

class TestAPISmoke:
    def test_health(self):
        resp = requests.get(f"{BASE_URL}/health", timeout=5)
        assert resp.status_code == 200

    def test_list_users_returns_array(self):
        resp = requests.get(f"{BASE_URL}/users", headers=HEADERS, timeout=10)
        assert resp.status_code == 200
        data = resp.json()
        # Accept either a bare list or a {"data": [...]} envelope
        items = data.get("data") if isinstance(data, dict) else data
        assert isinstance(items, list)

    def test_get_user_required_fields(self):
        resp = requests.get(f"{BASE_URL}/users/1", headers=HEADERS, timeout=10)
        assert resp.status_code in (200, 404)
        if resp.status_code == 200:
            user = resp.json()
            assert "id" in user and "email" in user

    def test_invalid_auth_returns_401(self):
        resp = requests.get(
            f"{BASE_URL}/users",
            headers={"Authorization": "Bearer invalid-token"},
            timeout=10,
        )
        assert resp.status_code == 401
```
|
|
388
|
+
|
|
389
|
+
## Security
|
|
390
|
+
|
|
391
|
+
### Token handling
|
|
392
|
+
- Never log full tokens. Redact: `Bearer <REDACTED>`.
|
|
393
|
+
- Never hardcode tokens in scripts. Read from env (`os.environ["API_TOKEN"]`) or `${HERMES_HOME:-~/.hermes}/.env`.
|
|
394
|
+
- Rotate immediately if a token surfaces in logs, error messages, or git history.
|
|
395
|
+
|
|
396
|
+
### Safe logging
|
|
397
|
+
|
|
398
|
+
```python
def redact_auth(headers: dict) -> dict:
    sensitive = {"authorization", "x-api-key", "cookie", "set-cookie"}
    return {k: ("<REDACTED>" if k.lower() in sensitive else v) for k, v in headers.items()}
```
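
Headers are not the only place secrets show up in logs: APIs that take the key as a query parameter put it in the URL. A companion sketch for URLs, using only `urllib.parse`, might look like this; `redact_url` and the `SENSITIVE_PARAMS` set are hypothetical and should be adjusted to the parameter names your API uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names; extend for the API you are debugging.
SENSITIVE_PARAMS = {"api_key", "apikey", "token", "access_token"}


def redact_url(url: str) -> str:
    """Replace the values of sensitive query parameters before logging a URL."""
    parts = urlsplit(url)
    query = [
        (key, "REDACTED" if key.lower() in SENSITIVE_PARAMS else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


print(redact_url("https://api.example.com/users?api_key=s3cret&limit=10"))
```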
|
|
403
|
+
|
|
404
|
+
### Leak checklist
|
|
405
|
+
|
|
406
|
+
- [ ] **Credentials in URLs.** API keys in query strings end up in server logs, browser history, referrer headers — use headers.
|
|
407
|
+
- [ ] **PII in error responses.** `404 on /users/123` shouldn't reveal whether the user exists (enumeration).
|
|
408
|
+
- [ ] **Stack traces in prod.** 500s shouldn't leak file paths, framework versions.
|
|
409
|
+
- [ ] **Internal hostnames/IPs.** `10.x.x.x`, `internal-api.corp.local` in error bodies.
|
|
410
|
+
- [ ] **Tokens echoed back.** Some APIs include the auth token in error details. Verify they don't.
|
|
411
|
+
- [ ] **Verbose `Server` / `X-Powered-By`.** Stack-info leaks. Note for security review.
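
Parts of this checklist can be screened mechanically before a human review. The sketch below scans an error body for three of the items (private IPs, stack traces, echoed tokens). The regular expressions are rough heuristics chosen for illustration, and `find_leaks` is a hypothetical helper, so expect both false positives and misses.

```python
import re

# RFC 1918 private ranges: 10/8, 172.16/12, 192.168/16.
PRIVATE_IP = re.compile(
    r"\b(?:10\.\d{1,3}\.\d{1,3}\.\d{1,3}"
    r"|192\.168\.\d{1,3}\.\d{1,3}"
    r"|172\.(?:1[6-9]|2\d|3[01])\.\d{1,3}\.\d{1,3})\b"
)
# Python traceback header, or a Java-style "at pkg.Class.method(File.java:42)" frame.
STACK_TRACE = re.compile(r"Traceback \(most recent call last\)|\bat [\w.$]+\([\w.]+:\d+\)")


def find_leaks(body: str, token: str = "") -> list[str]:
    """Return the names of leak indicators found in an error response body."""
    findings = []
    if PRIVATE_IP.search(body):
        findings.append("private IP address")
    if STACK_TRACE.search(body):
        findings.append("stack trace")
    if token and token in body:
        findings.append("token echoed back")
    return findings


print(find_leaks('{"error": "upstream 10.2.3.4 refused connection"}'))
```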
|
|
412
|
+
|
|
413
|
+
## Agent Tool Patterns
|
|
414
|
+
|
|
415
|
+
### terminal — for curl, dig, openssl
|
|
416
|
+
|
|
417
|
+
```python
|
|
418
|
+
terminal('curl -sI https://api.example.com')
|
|
419
|
+
terminal('openssl s_client -connect api.example.com:443 -servername api.example.com </dev/null 2>/dev/null | openssl x509 -noout -dates')
|
|
420
|
+
```
|
|
421
|
+
|
|
422
|
+
### execute_code — for multi-step Python flows
|
|
423
|
+
|
|
424
|
+
When debugging spans auth → fetch → paginate → validate, use `execute_code`. Variables persist for the script, results print to stdout, no risk of token spam in your context:
|
|
425
|
+
|
|
426
|
+
```python
execute_code('''
import os, requests

token = os.environ["API_TOKEN"]
base = "https://api.example.com"
H = {"Authorization": f"Bearer {token}"}

# 1. auth
me = requests.get(f"{base}/me", headers=H, timeout=10)
print(f"auth {me.status_code}")

# 2. paginate
all_users, cursor = [], None
while True:
    params = {"cursor": cursor} if cursor else {}
    r = requests.get(f"{base}/users", headers=H, params=params, timeout=10)
    body = r.json()
    all_users.extend(body["data"])
    cursor = body.get("next_cursor")
    if not cursor:
        break
print(f"users={len(all_users)}")
''')
```
|
|
451
|
+
|
|
452
|
+
### web_extract — for vendor API docs
|
|
453
|
+
|
|
454
|
+
Pull the spec for the endpoint you're debugging instead of guessing:
|
|
455
|
+
|
|
456
|
+
```python
|
|
457
|
+
web_extract(urls=["https://docs.example.com/api/v1/users"])
|
|
458
|
+
```
|
|
459
|
+
|
|
460
|
+
### delegate_task — for full CRUD test sweeps
|
|
461
|
+
|
|
462
|
+
```python
|
|
463
|
+
delegate_task(
|
|
464
|
+
goal="Test all CRUD endpoints for /api/v1/users",
|
|
465
|
+
context="""
|
|
466
|
+
Follow the rest-graphql-debug skill (optional-skills/software-development/rest-graphql-debug).
|
|
467
|
+
Base URL: https://api.example.com
|
|
468
|
+
Auth: Bearer token from API_TOKEN env var.
|
|
469
|
+
|
|
470
|
+
For each verb (POST, GET, PATCH, DELETE):
|
|
471
|
+
- happy path: assert status + response schema
|
|
472
|
+
- error cases: 400, 404, 422
|
|
473
|
+
- log a repro curl for any failure (redact tokens)
|
|
474
|
+
|
|
475
|
+
Output: pass/fail per endpoint + correlation IDs for failures.
|
|
476
|
+
""",
|
|
477
|
+
toolsets=["terminal", "file"],
|
|
478
|
+
)
|
|
479
|
+
```
|
|
480
|
+
|
|
481
|
+
## Output Format
|
|
482
|
+
|
|
483
|
+
When reporting findings:
|
|
484
|
+
|
|
485
|
+
```
|
|
486
|
+
## Finding
|
|
487
|
+
Endpoint: POST /api/v1/users
|
|
488
|
+
Status: 422 Unprocessable Entity
|
|
489
|
+
Req ID: req_abc123xyz
|
|
490
|
+
|
|
491
|
+
## Repro
|
|
492
|
+
curl -X POST https://api.example.com/api/v1/users \
|
|
493
|
+
-H 'Content-Type: application/json' \
|
|
494
|
+
-H 'Authorization: Bearer <REDACTED>' \
|
|
495
|
+
-d '{"name":"test"}'
|
|
496
|
+
|
|
497
|
+
## Root Cause
|
|
498
|
+
Missing required field `email`. Server validation rejects before processing.
|
|
499
|
+
|
|
500
|
+
## Fix
|
|
501
|
+
-d '{"name":"test","email":"test@example.com"}'
|
|
502
|
+
```

## Related

- `systematic-debugging` — once the failing API layer is isolated, root-cause your code
- `test-driven-development` — write the regression test before shipping the fix
@@ -0,0 +1,171 @@
---
name: hermes-s6-container-supervision
description: Modify, debug, or extend the s6-overlay supervision tree inside the
version: 1.0.0
environments: [s6]
---

# Hermes s6-overlay Container Supervision

## When to use this skill

Load this skill when you're working on:
- Adding or removing a static service in the Hermes Docker image (something that should be supervised at every container start, like the dashboard)
- Diagnosing why a per-profile gateway isn't starting, restarting, or surviving `docker restart`
- Understanding why the container's CMD is `/opt/hermes/docker/main-wrapper.sh` and how leading-dash args reach the user's program
- Modifying `cont-init.d` boot scripts (UID remap, volume seeding, profile reconciliation)
- Changing the rendered run-script for per-profile gateways (Phase 4)

If you're just running the

## Architecture at a glance

```
/init                                 ← PID 1 (s6-overlay v3.2.3.0)
├── cont-init.d                       ← oneshot setup, runs as root
│   ├── 01-hermes-setup               ← docker/stage2-hook.sh
│   │   ├── UID/GID remap
│   │   ├── chown /opt/data
│   │   ├── chown /opt/data/profiles (every boot)
│   │   ├── seed .env / config.yaml / SOUL.md
│   │   └── skills_sync.py
│   └── 02-reconcile-profiles         ← hermes_cli.container_boot
│       ├── chown /run/service (hermes-writable for runtime register)
│       └── walk $HERMES_HOME/profiles/<name>/gateway_state.json
│             → recreate /run/service/gateway-<name>/
│             → auto-start only those with prior_state == "running"
│
├── s6-rc.d (static services, in /etc/s6-overlay/s6-rc.d/)
│   ├── main-hermes/run               ← exec sleep infinity (no-op slot)
│   └── dashboard/run                 ← if HERMES_DASHBOARD=1, runs `hermes dashboard`
│
├── /run/service (s6-svscan watches; tmpfs)
│   ├── gateway-coder/                ← runtime-registered per-profile
│   │   ├── type ("longrun")
│   │   ├── run ("#!/command/with-contenv sh ... exec s6-setuidgid hermes hermes -p coder gateway run")
│   │   ├── down (marker — present means "registered but don't auto-start")
│   │   └── log/run (s6-log → $HERMES_HOME/logs/gateways/coder/current)
│   └── ...
│
└── CMD ("main program")              ← /opt/hermes/docker/main-wrapper.sh
    └── routes user args: bare exec | hermes subcommand | hermes (no args)
        — exec'd by /init with stdin/stdout/stderr inherited (TTY for --tui)
```
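
To make the `/run/service/gateway-<name>/` layout in the tree concrete, here is a minimal sketch that writes a slot with `type`, `run`, and the optional `down` marker into a temporary directory. `write_slot` and `render_run_script` are hypothetical names for illustration; the real logic lives in `S6ServiceManager` and may differ, and the `log/run` child is left out.

```python
from pathlib import Path
import tempfile


def render_run_script(profile: str) -> str:
    # Simplified: only the shebang and exec line quoted in the tree above.
    return (
        "#!/command/with-contenv sh\n"
        f"exec s6-setuidgid hermes hermes -p {profile} gateway run\n"
    )


def write_slot(scan_dir: Path, profile: str, auto_start: bool) -> Path:
    """Create gateway-<profile>/ with type, run, and an optional down marker."""
    slot = scan_dir / f"gateway-{profile}"
    slot.mkdir(parents=True, exist_ok=True)
    (slot / "type").write_text("longrun\n")
    run = slot / "run"
    run.write_text(render_run_script(profile))
    run.chmod(0o755)  # s6-supervise needs an executable run file
    down = slot / "down"
    if auto_start:
        down.unlink(missing_ok=True)  # no marker: supervisor starts the service
    else:
        down.touch()                  # marker: registered but not auto-started
    return slot


with tempfile.TemporaryDirectory() as tmp:
    slot = write_slot(Path(tmp), "writer", auto_start=False)
    print(sorted(p.name for p in slot.iterdir()))
```

Writing into a temporary directory keeps the sketch runnable outside a container; inside one, the scan directory would be `/run/service`.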

## Key files

| Path | Role |
|---|---|
| `Dockerfile` | s6-overlay install + cont-init.d wiring + `ENTRYPOINT ["/init", "/opt/hermes/docker/main-wrapper.sh"]` |
| `docker/stage2-hook.sh` | The "old entrypoint logic" — UID remap, chown, seed, skills sync. Runs as cont-init.d/01-hermes-setup. |
| `docker/cont-init.d/02-reconcile-profiles` | Calls `hermes_cli.container_boot` on every boot to restore profile gateway slots from the persistent volume. |
| `docker/main-wrapper.sh` | The container's CMD. Routes user args, drops to hermes via `s6-setuidgid`, exec's the chosen program. |
| `docker/s6-rc.d/main-hermes/run` | No-op `sleep infinity` — slot exists so the s6-rc user bundle is valid; main hermes runs as the CMD, not as a supervised service. |
| `docker/s6-rc.d/dashboard/run` | Conditional service — `exec sleep infinity` unless `HERMES_DASHBOARD` is truthy. |
| `docker/entrypoint.sh` | Back-compat shim that `exec`s the stage2 hook. External scripts that hard-coded the old entrypoint path still work. |
| `hermes_cli/service_manager.py` | `S6ServiceManager`: `register_profile_gateway`, `unregister_profile_gateway`, `start/stop/restart/is_running`, `list_profile_gateways`. |
| `hermes_cli/container_boot.py` | `reconcile_profile_gateways()` — walks persistent profiles, regenerates s6 slots, emits `container-boot.log`. |
| `hermes_cli/gateway.py::_dispatch_via_service_manager_if_s6` | Intercepts `hermes gateway start/stop/restart` and routes to s6 when running in a container. |

## Why Architecture B (CMD as main program, not s6-supervised)

The original plan (v1–v3) called for main hermes to run as a supervised s6-rc service. Two real s6-overlay v3 mechanics blocked that:

1. **cont-init.d scripts receive no CMD args** — so the stage2 hook can't parse `docker run <image> chat -q "hi"` to set `HERMES_ARGS` for a service `run` script to consume.
2. **`/run/s6/basedir/bin/halt` does NOT propagate the exit code** written to `/run/s6-linux-init-container-results/exitcode`. Containers always exit 143 (SIGTERM) regardless. Confirmed by skarnet (s6 author) in [issue #477](https://github.com/just-containers/s6-overlay/issues/477): _"if you want a container shutdown, you need to either have your CMD exit, or, if you have no CMD, write the container exit code you want then call halt"_.

So we use the s6-overlay-native CMD pattern: `ENTRYPOINT ["/init", "/opt/hermes/docker/main-wrapper.sh"]`. Because the wrapper is part of the ENTRYPOINT, Docker appends user args after it, so `docker run <image> --version` becomes `/init main-wrapper.sh --version`, and `--version` doesn't get intercepted by /init's POSIX shell. The wrapper drops to hermes via `s6-setuidgid`, then exec's the chosen program. The program's exit code becomes the container exit code, exactly matching the pre-s6 tini contract.

Trade-off: main hermes is unsupervised under s6. That exactly matches its behavior under tini (the pre-s6 image). Dashboard supervision is the only **new** guarantee — and per-profile gateways under `/run/service/` get full supervision.
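
The wrapper's three routes (bare exec, hermes subcommand, hermes with no args) can be sketched as a pure function. The decision rule below is an assumption made for illustration: a first argument that resolves on `PATH` and has no leading dash is treated as a bare exec, and everything else is passed to `hermes`. The actual `main-wrapper.sh` is a shell script and may decide differently.

```python
import shutil
from typing import Callable, List, Optional, Sequence


def route_args(
    args: Sequence[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the argv a wrapper like main-wrapper.sh might exec (assumed rule)."""
    if not args:
        return ["hermes"]                  # no args: plain hermes
    first = args[0]
    if not first.startswith("-") and which(first):
        return list(args)                  # bare exec, e.g. `bash -c ...`
    return ["hermes", *args]               # subcommand or leading-dash flag


# Stub the PATH lookup so the example does not depend on the host.
def fake_which(name: str) -> Optional[str]:
    return "/bin/bash" if name == "bash" else None


for argv in ([], ["--version"], ["chat", "-q", "hi"], ["bash", "-c", "id"]):
    print(argv, "->", route_args(argv, fake_which))
```

Injecting `which` keeps the routing rule testable without touching the real filesystem.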

## Quick recipes

### Verify s6 is PID 1 in a running container

```sh
docker exec <c> sh -c 'cat /proc/1/comm; readlink /proc/1/exe'
# Expect: s6-svscan or init / /package/admin/s6/.../s6-svscan
```

### Inspect a profile gateway service

```sh
# /command/ isn't on docker-exec PATH — use absolute path
docker exec <c> /command/s6-svstat /run/service/gateway-<name>
# "up (pid …) … seconds" → running
# "down (exitcode N) … seconds, normally up, want up, …" → s6 wants it up but the process keeps exiting (crash loop)
# "down … normally up, ready …" → user stopped it
```
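
The three status patterns in the comments above can be turned into a small classifier, for example when a script polls many gateways. This is a sketch under the assumption that the status text has the shapes quoted in those comments; `classify_svstat` is a hypothetical helper and the sample lines are made up to match those shapes.

```python
def classify_svstat(line: str) -> str:
    """Map one line of s6-svstat output to a coarse state label."""
    line = line.strip()
    if line.startswith("up "):
        return "running"
    if line.startswith("down"):
        # "want up" means the supervisor is still trying to start it.
        return "crash-loop" if "want up" in line else "stopped"
    return "unknown"


samples = [
    "up (pid 412) 37 seconds",
    "down (exitcode 1) 2 seconds, normally up, want up, ready 2 seconds",
    "down (signal SIGTERM) 90 seconds, normally up, ready 90 seconds",
]
for sample in samples:
    print(classify_svstat(sample), "<-", sample)
```

Anything that matches neither shape, such as an error message from `s6-svstat` itself, falls through to `unknown` rather than being guessed at.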

### Bring a service up/down manually

```sh
docker exec <c> /command/s6-svc -u /run/service/gateway-<name>  # up
docker exec <c> /command/s6-svc -d /run/service/gateway-<name>  # down
docker exec <c> /command/s6-svc -t /run/service/gateway-<name>  # SIGTERM (restart)
```

### Watch the cont-init reconciler log

```sh
docker exec <c> tail -n 50 /opt/data/logs/container-boot.log
# 2026-05-21T06:18:05+0000 profile=coder prior_state=running action=started
# 2026-05-21T06:18:05+0000 profile=writer prior_state=stopped action=registered
```
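
The log lines shown above are a timestamp followed by `key=value` pairs, which makes them easy to parse when you want to filter by profile or action. A minimal sketch, assuming every line follows that shape; `parse_boot_line` is a hypothetical helper, not part of `hermes_cli`.

```python
def parse_boot_line(line: str) -> dict:
    """Split a container-boot.log line into a timestamp plus key=value fields."""
    timestamp, *pairs = line.split()
    record = {"timestamp": timestamp}
    for pair in pairs:
        key, _, value = pair.partition("=")
        record[key] = value
    return record


sample = "2026-05-21T06:18:05+0000 profile=coder prior_state=running action=started"
print(parse_boot_line(sample))
```

An empty line raises `ValueError` at the unpacking step, so callers reading a whole file should skip blank lines first.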

### Add a new static service

1. Create `docker/s6-rc.d/<name>/type` with `longrun\n` and `docker/s6-rc.d/<name>/run` (use `#!/command/with-contenv sh` + `# shellcheck shell=sh`).
2. Drop to hermes via `s6-setuidgid hermes` at the top of run (unless you specifically need root).
3. Create empty `docker/s6-rc.d/<name>/dependencies.d/base` so it waits for the base bundle.
4. Create empty `docker/s6-rc.d/user/contents.d/<name>` so it joins the user bundle.
5. The `COPY docker/s6-rc.d/` in the Dockerfile picks it up automatically — no other changes.
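
Steps 1 through 4 above amount to creating four files, which can be sketched as a small scaffold script. `scaffold_static_service`, the `metrics` service name, and the `hermes metrics` command are hypothetical and chosen only for illustration; in the repository you would point `root` at `docker/s6-rc.d`.

```python
from pathlib import Path
import tempfile


def scaffold_static_service(root: Path, name: str, command: str) -> None:
    """Create the type, run, dependencies.d/base, and contents.d files for a service."""
    svc = root / name
    (svc / "dependencies.d").mkdir(parents=True, exist_ok=True)
    (svc / "type").write_text("longrun\n")                      # step 1
    run = svc / "run"
    run.write_text(
        "#!/command/with-contenv sh\n"
        "# shellcheck shell=sh\n"
        f"exec s6-setuidgid hermes {command}\n"                 # step 2: drop to hermes
    )
    run.chmod(0o755)
    (svc / "dependencies.d" / "base").touch()                   # step 3
    contents = root / "user" / "contents.d"
    contents.mkdir(parents=True, exist_ok=True)
    (contents / name).touch()                                   # step 4


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    scaffold_static_service(root, "metrics", "hermes metrics")  # hypothetical service
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        print(path.relative_to(root))
```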

### Change the per-profile gateway run command

Edit `S6ServiceManager._render_run_script` in `hermes_cli/service_manager.py`. The function is also called by `hermes_cli/container_boot.py::_register_service` during boot reconciliation, so it's the single source of truth. Update the corresponding assertion in `tests/hermes_cli/test_service_manager.py::test_s6_register_creates_service_dir_and_triggers_scan`.

### Run the docker test harness

```sh
docker build -t
HERMES_TEST_IMAGE=
# Expect 19 passed, 0 xfailed against the s6 image
```

The harness lives in `tests/docker/` and skips when Docker isn't available. The per-test timeout is bumped to 180s (see `tests/docker/conftest.py`).

## Common pitfalls

### "command not found" via `docker exec`

`/command/` (where s6-overlay puts its binaries) is on PATH only for processes spawned by the supervision tree — services, cont-init.d, main-wrapper.sh. `docker exec <c> s6-svstat …` will fail with "command not found"; always use the absolute path `/command/s6-svstat`. The `hermes` binary works because the Dockerfile adds `/opt/hermes/.venv/bin` to the runtime `ENV PATH`.
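
If you drive these commands from a script, one way to avoid the PATH pitfall is to build the `docker exec` argv in a single place that always uses the absolute `/command/` path. A minimal sketch; `s6_exec_argv` is a hypothetical helper, and the container name in the example is made up.

```python
from typing import List


def s6_exec_argv(container: str, tool: str, *args: str, user: str = "") -> List[str]:
    """Build a `docker exec` argv that calls an s6 tool by absolute path."""
    argv = ["docker", "exec"]
    if user:
        argv += ["--user", user]          # docker exec runs as root unless told otherwise
    argv += [container, f"/command/{tool}", *args]
    return argv


print(s6_exec_argv("hermes-1", "s6-svstat", "/run/service/gateway-coder"))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.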

### Profile directory ownership

The cont-init reconciler runs as hermes (`s6-setuidgid hermes` in `02-reconcile-profiles`). If a profile dir ends up root-owned (e.g. because `docker exec <c> hermes profile create …` ran as root by default), the reconciler can't read SOUL.md and fails with `PermissionError`. Mitigation: `stage2-hook.sh` chowns `$HERMES_HOME/profiles` to hermes on **every** boot, idempotently. Don't remove that block.

### Files written by `docker exec` are root-owned

`docker exec` defaults to root. Either pass `--user hermes` or rely on the stage2 chown sweep at the next boot. Don't write files under `$HERMES_HOME/profiles/<name>/` as root manually: the next boot will sweep them, but in-flight operations may hit permission errors in the meantime.

### Service slot exists but s6-svstat says "s6-supervise not running"

The service directory is on tmpfs and was wiped on container restart. Either the cont-init reconciler hasn't run yet (give it a moment after `docker restart`) or it failed. Check `docker logs <c> | grep '02-reconcile'`.

### Gateway starts then immediately exits (`down (exitcode 1)` in svstat)

Most likely the profile has no model or auth configured. The service slot is correct — the gateway itself is unconfigured. Run `hermes -p <profile> setup` first. The s6 supervisor will keep restarting it; that's the desired behavior (when you fix the config, the next attempt succeeds and stays up).

### Reconciler skipped a profile

The reconciler keys on the **presence of `SOUL.md`** as the "real profile" marker. `hermes profile create` always seeds it. If a profile dir is missing SOUL.md (stray directory, partial restore, backup-in-progress), the reconciler skips it intentionally. Add a `SOUL.md` (even empty) to opt back in.
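
The `SOUL.md` marker rule can be illustrated with a short discovery function. This is a sketch of the rule as described, not the code in `hermes_cli/container_boot.py`; `discover_profiles` is a hypothetical name.

```python
from pathlib import Path
import tempfile


def discover_profiles(profiles_dir: Path) -> list:
    """Return names of profile dirs that contain a SOUL.md marker file."""
    if not profiles_dir.is_dir():
        return []
    return sorted(
        entry.name
        for entry in profiles_dir.iterdir()
        if entry.is_dir() and (entry / "SOUL.md").is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    profiles = Path(tmp)
    (profiles / "coder").mkdir()
    (profiles / "coder" / "SOUL.md").touch()   # empty marker is enough
    (profiles / "stray").mkdir()               # no SOUL.md, so it is skipped
    print(discover_profiles(profiles))
```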

### "Help, the container exits 143!"

Check whether something is invoking `s6-svscanctl -t` or `/run/s6/basedir/bin/halt` — both cause /init to begin stage 3 shutdown but return 143 (SIGTERM) rather than the desired exit code. This was the Phase 2 architecture pivot from A to B. For container shutdown with a real exit code, you must let the CMD (main-wrapper.sh) exit normally; do **not** try to control exit from a finish script.
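
The number 143 follows the common convention that a process terminated by signal N is reported with exit status 128 + N, and SIGTERM is signal 15. A small helper for decoding such codes when triaging container exits might look like this; `describe_exit` is a hypothetical name.

```python
import signal


def describe_exit(code: int) -> str:
    """Translate a container exit code into a signal name where the 128+N rule applies."""
    if code > 128:
        try:
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            pass  # not a known signal number on this platform
    return f"exit code {code}"


print(describe_exit(143))  # → killed by SIGTERM
print(describe_exit(1))    # → exit code 1
```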

## Related skills

- `
- `hermes-tool-quirks`: Hermes-specific tool workarounds (sed/grep/etc.) — load when debugging the s6 stack's interaction with hermes built-in tools.