@jaguilar87/gaia-ops 4.0.0 → 4.4.0-beta.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (241)
  1. package/.claude-plugin/marketplace.json +32 -0
  2. package/.claude-plugin/plugin.json +17 -0
  3. package/ARCHITECTURE.md +320 -0
  4. package/CHANGELOG.md +15 -0
  5. package/CODE_OF_CONDUCT.md +11 -0
  6. package/CONTRIBUTING.md +146 -0
  7. package/INSTALL.md +33 -34
  8. package/README.md +30 -12
  9. package/SECURITY.md +47 -0
  10. package/agents/cloud-troubleshooter.md +1 -1
  11. package/agents/devops-developer.md +1 -2
  12. package/agents/gaia.md +1 -2
  13. package/agents/gitops-operator.md +1 -2
  14. package/agents/speckit-planner.md +29 -16
  15. package/agents/terraform-architect.md +1 -2
  16. package/bin/README.md +6 -6
  17. package/bin/gaia-cleanup.js +296 -2
  18. package/bin/gaia-doctor.js +12 -12
  19. package/bin/gaia-history.js +20 -24
  20. package/bin/gaia-metrics.js +494 -63
  21. package/bin/gaia-scan +14 -0
  22. package/bin/gaia-scan.py +640 -0
  23. package/bin/gaia-skills-diagnose.js +3 -3
  24. package/bin/gaia-status.js +82 -40
  25. package/bin/gaia-update.js +3 -3
  26. package/bin/pre-publish-validate.js +112 -10
  27. package/commands/README.md +11 -52
  28. package/commands/scan-project.md +67 -0
  29. package/commands/speckit.add-task.md +4 -4
  30. package/commands/speckit.analyze-task.md +3 -3
  31. package/commands/speckit.init.md +14 -14
  32. package/commands/speckit.plan.md +8 -34
  33. package/commands/speckit.tasks.md +14 -12
  34. package/config/README.md +4 -0
  35. package/config/context-contracts.aws.json +20 -9
  36. package/config/context-contracts.gcp.json +17 -11
  37. package/config/context-contracts.json +43 -26
  38. package/config/surface-routing.json +189 -0
  39. package/config/universal-rules.json +8 -32
  40. package/hooks/README.md +11 -7
  41. package/hooks/adapters/__init__.py +52 -0
  42. package/hooks/adapters/base.py +168 -0
  43. package/hooks/adapters/channel.py +42 -0
  44. package/hooks/adapters/claude_code.py +500 -0
  45. package/hooks/adapters/types.py +193 -0
  46. package/hooks/adapters/utils.py +25 -0
  47. package/hooks/hooks.json +52 -0
  48. package/hooks/modules/README.md +60 -25
  49. package/hooks/modules/agents/__init__.py +28 -4
  50. package/hooks/modules/agents/contract_validator.py +304 -0
  51. package/hooks/modules/agents/response_contract.py +457 -0
  52. package/hooks/modules/agents/task_info_builder.py +74 -0
  53. package/hooks/modules/agents/transcript_reader.py +165 -0
  54. package/hooks/modules/audit/logger.py +2 -26
  55. package/hooks/modules/audit/metrics.py +14 -8
  56. package/hooks/modules/audit/workflow_auditor.py +258 -0
  57. package/hooks/modules/audit/workflow_recorder.py +266 -0
  58. package/hooks/modules/context/__init__.py +3 -0
  59. package/hooks/modules/context/context_freshness.py +145 -0
  60. package/hooks/modules/context/context_injector.py +370 -0
  61. package/hooks/modules/context/context_writer.py +96 -6
  62. package/hooks/modules/context/contracts_loader.py +164 -0
  63. package/hooks/modules/core/__init__.py +9 -1
  64. package/hooks/modules/core/hook_entry.py +77 -0
  65. package/hooks/modules/core/state.py +10 -1
  66. package/hooks/modules/core/stdin.py +24 -0
  67. package/hooks/modules/memory/__init__.py +8 -0
  68. package/hooks/modules/memory/episode_writer.py +228 -0
  69. package/hooks/modules/scanning/__init__.py +8 -0
  70. package/hooks/modules/scanning/scan_trigger.py +96 -0
  71. package/hooks/modules/security/__init__.py +61 -15
  72. package/hooks/modules/security/approval_cleanup.py +57 -0
  73. package/hooks/modules/security/approval_constants.py +11 -14
  74. package/hooks/modules/security/approval_grants.py +712 -0
  75. package/hooks/modules/security/approval_messages.py +81 -0
  76. package/hooks/modules/security/approval_scopes.py +153 -0
  77. package/hooks/modules/security/blocked_commands.py +389 -95
  78. package/hooks/modules/security/command_semantics.py +134 -0
  79. package/hooks/modules/security/mutative_verbs.py +707 -0
  80. package/hooks/modules/security/prompt_validator.py +40 -0
  81. package/hooks/modules/security/tiers.py +57 -31
  82. package/hooks/modules/session/__init__.py +10 -0
  83. package/hooks/modules/session/session_context_writer.py +106 -0
  84. package/hooks/modules/session/session_event_injector.py +174 -0
  85. package/hooks/modules/session/session_manager.py +31 -0
  86. package/hooks/modules/tools/bash_validator.py +158 -103
  87. package/hooks/modules/tools/cloud_pipe_validator.py +16 -11
  88. package/hooks/modules/tools/hook_response.py +51 -0
  89. package/hooks/modules/tools/task_validator.py +119 -85
  90. package/hooks/post_tool_use.py +64 -205
  91. package/hooks/pre_tool_use.py +153 -554
  92. package/hooks/session_start.py +105 -0
  93. package/hooks/stop_hook.py +72 -0
  94. package/hooks/subagent_start.py +88 -0
  95. package/hooks/subagent_stop.py +202 -773
  96. package/hooks/task_completed.py +71 -0
  97. package/package.json +13 -5
  98. package/plugins/gaia-ops/.claude-plugin/plugin.json +8 -0
  99. package/plugins/gaia-security/.claude-plugin/plugin.json +6 -0
  100. package/pyproject.toml +29 -0
  101. package/skills/README.md +16 -15
  102. package/skills/agent-protocol/SKILL.md +178 -15
  103. package/skills/approval/SKILL.md +29 -18
  104. package/skills/approval/examples.md +2 -0
  105. package/skills/approval/reference.md +7 -0
  106. package/skills/command-execution/SKILL.md +3 -1
  107. package/skills/command-execution/reference.md +1 -0
  108. package/skills/context-updater/SKILL.md +28 -11
  109. package/skills/developer-patterns/SKILL.md +2 -1
  110. package/skills/execution/SKILL.md +13 -13
  111. package/skills/fast-queries/SKILL.md +2 -1
  112. package/skills/gaia-patterns/SKILL.md +14 -6
  113. package/skills/git-conventions/SKILL.md +3 -2
  114. package/skills/gitops-patterns/SKILL.md +2 -1
  115. package/skills/investigation/SKILL.md +61 -6
  116. package/skills/orchestrator-approval/SKILL.md +84 -0
  117. package/skills/output-format/SKILL.md +10 -1
  118. package/skills/security-tiers/SKILL.md +19 -3
  119. package/skills/security-tiers/destructive-commands-reference.md +623 -0
  120. package/skills/security-tiers/reference.md +4 -0
  121. package/skills/skill-creation/SKILL.md +2 -1
  122. package/skills/specification/SKILL.md +177 -0
  123. package/skills/speckit-workflow/SKILL.md +139 -49
  124. package/skills/speckit-workflow/reference.md +73 -57
  125. package/skills/terraform-patterns/SKILL.md +2 -1
  126. package/speckit/README.md +97 -132
  127. package/speckit/templates/plan-template.md +10 -16
  128. package/speckit/templates/tasks-template.md +241 -375
  129. package/templates/CLAUDE.template.md +48 -72
  130. package/templates/README.md +10 -10
  131. package/templates/governance.template.md +1 -1
  132. package/templates/settings.template.json +89 -638
  133. package/tools/context/README.md +46 -15
  134. package/tools/context/__init__.py +11 -0
  135. package/tools/context/_paths.py +20 -0
  136. package/tools/context/context_provider.py +164 -30
  137. package/tools/context/context_section_reader.py +34 -16
  138. package/tools/context/pending_updates.py +1 -1
  139. package/tools/context/surface_router.py +278 -0
  140. package/tools/memory/episodic.py +26 -4
  141. package/tools/replay/__init__.py +33 -0
  142. package/tools/replay/cli.py +355 -0
  143. package/tools/replay/extractor.py +457 -0
  144. package/tools/replay/reporter.py +258 -0
  145. package/tools/replay/routing_simulator.py +335 -0
  146. package/tools/replay/runner.py +506 -0
  147. package/tools/replay/skills_mapper.py +263 -0
  148. package/tools/scan/__init__.py +21 -0
  149. package/tools/scan/config.py +226 -0
  150. package/tools/scan/merge.py +212 -0
  151. package/tools/scan/orchestrator.py +392 -0
  152. package/tools/scan/registry.py +127 -0
  153. package/tools/scan/scanners/__init__.py +18 -0
  154. package/tools/scan/scanners/base.py +129 -0
  155. package/tools/scan/scanners/environment.py +324 -0
  156. package/tools/scan/scanners/git.py +453 -0
  157. package/tools/scan/scanners/infrastructure.py +414 -0
  158. package/tools/scan/scanners/orchestration.py +560 -0
  159. package/tools/scan/scanners/stack.py +786 -0
  160. package/tools/scan/scanners/tools.py +213 -0
  161. package/tools/scan/setup.py +703 -0
  162. package/tools/scan/tests/__init__.py +1 -0
  163. package/tools/scan/tests/conftest.py +796 -0
  164. package/tools/scan/tests/test_environment.py +323 -0
  165. package/tools/scan/tests/test_git.py +339 -0
  166. package/tools/scan/tests/test_infrastructure.py +336 -0
  167. package/tools/scan/tests/test_integration.py +920 -0
  168. package/tools/scan/tests/test_merge.py +269 -0
  169. package/tools/scan/tests/test_orchestration.py +304 -0
  170. package/tools/scan/tests/test_stack.py +518 -0
  171. package/tools/scan/tests/test_tools.py +409 -0
  172. package/tools/scan/ui.py +368 -0
  173. package/tools/scan/verify.py +284 -0
  174. package/tools/validation/README.md +6 -11
  175. package/bin/gaia-init.js +0 -1777
  176. package/commands/speckit.implement.md +0 -96
  177. package/commands/speckit.specify.md +0 -177
  178. package/hooks/modules/security/safe_commands.py +0 -391
  179. package/hooks/modules/workflow/__init__.py +0 -5
  180. package/tests/README.md +0 -94
  181. package/tests/conftest.py +0 -195
  182. package/tests/fixtures/project-context.aws.json +0 -53
  183. package/tests/fixtures/project-context.full.json +0 -165
  184. package/tests/fixtures/project-context.gcp.json +0 -53
  185. package/tests/hooks/__init__.py +0 -1
  186. package/tests/hooks/modules/context/__init__.py +0 -0
  187. package/tests/hooks/modules/context/test_context_writer.py +0 -594
  188. package/tests/hooks/modules/core/__init__.py +0 -0
  189. package/tests/hooks/modules/core/test_paths.py +0 -235
  190. package/tests/hooks/modules/core/test_state.py +0 -332
  191. package/tests/hooks/modules/security/__init__.py +0 -0
  192. package/tests/hooks/modules/security/test_blocked_commands.py +0 -290
  193. package/tests/hooks/modules/security/test_gitops_validator.py +0 -357
  194. package/tests/hooks/modules/security/test_safe_commands.py +0 -383
  195. package/tests/hooks/modules/security/test_tiers.py +0 -230
  196. package/tests/hooks/modules/tools/__init__.py +0 -0
  197. package/tests/hooks/modules/tools/test_bash_validator.py +0 -243
  198. package/tests/hooks/modules/tools/test_shell_parser.py +0 -290
  199. package/tests/hooks/modules/tools/test_task_validator.py +0 -363
  200. package/tests/hooks/test_subagent_stop_discovery.py +0 -124
  201. package/tests/integration/__init__.py +0 -0
  202. package/tests/integration/test_context_enrichment.py +0 -647
  203. package/tests/integration/test_subagent_lifecycle.py +0 -783
  204. package/tests/integration/test_subagent_stop_e2e.py +0 -639
  205. package/tests/layer1_prompt_regression/test_agent_frontmatter.py +0 -152
  206. package/tests/layer1_prompt_regression/test_agent_prompt_content.py +0 -170
  207. package/tests/layer1_prompt_regression/test_context_contracts.py +0 -139
  208. package/tests/layer1_prompt_regression/test_routing_table.py +0 -106
  209. package/tests/layer1_prompt_regression/test_security_tier_consistency.py +0 -117
  210. package/tests/layer1_prompt_regression/test_skill_content_rules.py +0 -148
  211. package/tests/layer1_prompt_regression/test_skills_cross_reference.py +0 -168
  212. package/tests/layer2_llm_evaluation/conftest.py +0 -6
  213. package/tests/layer2_llm_evaluation/helpers/promptfoo_runner.py +0 -132
  214. package/tests/layer2_llm_evaluation/test_agent_behavior.py +0 -198
  215. package/tests/layer3_e2e/conftest.py +0 -6
  216. package/tests/layer3_e2e/helpers/claude_headless.py +0 -169
  217. package/tests/layer3_e2e/test_hook_lifecycle.py +0 -160
  218. package/tests/layer3_e2e/test_installation_smoke.py +0 -117
  219. package/tests/performance/__init__.py +0 -1
  220. package/tests/performance/test_context_performance.py +0 -855
  221. package/tests/promptfoo.yaml +0 -126
  222. package/tests/system/__init__.py +0 -0
  223. package/tests/system/permissions_helpers.py +0 -318
  224. package/tests/system/test_agent_definitions.py +0 -179
  225. package/tests/system/test_configuration_files.py +0 -121
  226. package/tests/system/test_directory_structure.py +0 -221
  227. package/tests/system/test_permissions_system.py +0 -1059
  228. package/tests/system/test_schema_compatibility.py +0 -106
  229. package/tests/test_cross_layer_consistency.py +0 -459
  230. package/tests/tools/__init__.py +0 -0
  231. package/tests/tools/test_context_provider.py +0 -208
  232. package/tests/tools/test_deep_merge.py +0 -146
  233. package/tests/tools/test_episodic.py +0 -463
  234. package/tests/tools/test_pending_updates.py +0 -549
  235. package/tests/tools/test_review_engine.py +0 -203
  236. package/tools/context/benchmark_context.py +0 -389
  237. package/tools/context/context_compressor.py +0 -444
  238. package/tools/context/context_lazy_loader.py +0 -402
  239. package/tools/context/context_selector.py +0 -451
  240. package/tools/validation/skills_report.md +0 -162
  241. /package/{tests/hooks/modules/__init__.py → speckit/scripts/.gitkeep} +0 -0
@@ -0,0 +1,32 @@
1
+ {
2
+ "marketplace": {
3
+ "name": "gaia-ops-marketplace",
4
+ "description": "Security, governance, and multi-agent orchestration for AI coding",
5
+ "owner": {
6
+ "name": "jaguilar87",
7
+ "email": "jaguilar1897@gmail.com"
8
+ },
9
+ "plugins": [
10
+ {
11
+ "name": "gaia-security",
12
+ "description": "Security hooks, approval system, audit logging, metrics, and anomaly detection for Claude Code",
13
+ "version": "4.4.0-beta.2",
14
+ "source": "./plugins/gaia-security",
15
+ "category": "security",
16
+ "tags": ["security", "hooks", "audit", "metrics", "approval"],
17
+ "dependencies": [],
18
+ "includes": ["hooks/pre_tool_use.py", "hooks/post_tool_use.py", "hooks/subagent_stop.py", "hooks/stop_hook.py", "hooks/modules/security/", "hooks/modules/tools/", "hooks/adapters/", "config/"]
19
+ },
20
+ {
21
+ "name": "gaia-ops",
22
+ "description": "Complete DevOps orchestration system: agents, scanning, context injection, episodic memory, speckit planning, and CLI tools \u2014 includes gaia-security",
23
+ "version": "4.4.0-beta.2",
24
+ "source": "./plugins/gaia-ops",
25
+ "category": "devops",
26
+ "tags": ["devops", "agents", "scanning", "orchestration", "speckit", "ops"],
27
+ "dependencies": [],
28
+ "includes": ["."]
29
+ }
30
+ ]
31
+ }
32
+ }
@@ -0,0 +1,17 @@
1
+ {
2
+ "name": "gaia-ops",
3
+ "version": "4.4.0-beta.2",
4
+ "description": "Security-first orchestrator with specialized agents, hooks, and governance for AI coding",
5
+ "author": {
6
+ "name": "jaguilar87",
7
+ "url": "https://github.com/metraton/gaia-ops"
8
+ },
9
+ "repository": "https://github.com/metraton/gaia-ops",
10
+ "license": "MIT",
11
+ "keywords": ["security", "devops", "orchestrator", "governance", "terraform", "kubernetes", "gitops"],
12
+ "engines": { "claude-code": ">=2.1.0" },
13
+ "categories": ["devops", "security", "orchestration"],
14
+ "commands": "./commands/",
15
+ "agents": "./agents/",
16
+ "skills": "./skills/"
17
+ }
@@ -0,0 +1,320 @@
1
+ # Architecture
2
+
3
+ ## What is gaia-ops?
4
+
5
+ gaia-ops is an orchestration system for Claude Code agents. It turns a single Claude Code session into a coordinated multi-agent system with security enforcement, context injection, surface-based routing, episodic memory, and deterministic response contracts.
6
+
7
+ The package is published as `@jaguilar87/gaia-ops` on npm and installed into a project's `.claude/` directory via symlinks.
8
+
9
+ ## Core Concepts
10
+
11
+ | Concept | Definition |
12
+ |---------|-----------|
13
+ | **Agent** | A Markdown file in `agents/` defining identity, scope, skills, and delegation rules |
14
+ | **Skill** | Injected procedural knowledge (in `skills/`) -- the HOW for agents |
15
+ | **Hook** | Python scripts that intercept tool calls before and after execution |
16
+ | **Tool** | Python modules in `tools/` providing context assembly, memory, and validation |
17
+ | **Config** | JSON files in `config/` defining contracts, rules, surface routing, and security |
18
+ | **Orchestrator** | The root `CLAUDE.md` that routes user requests to the correct agent |
19
+
20
+ ## Runtime Flow
21
+
22
+ ```
23
+ User request
24
+ |
25
+ v
26
+ Orchestrator (CLAUDE.md)
27
+ | Routes by surface classification
28
+ v
29
+ pre_tool_use.py (PreToolUse hook)
30
+ | 1. Inject project-context into agent prompt
31
+ | 2. Inject session events
32
+ | 3. Validate Bash commands (security gate)
33
+ | 4. Validate Task/Agent invocations
34
+ v
35
+ Agent executes
36
+ | Uses tools, follows skills, emits AGENT_STATUS
37
+ v
38
+ subagent_stop.py (SubagentStop hook)
39
+ | 1. Read transcript, extract task description
40
+ | 2. Capture workflow metrics
41
+ | 3. Validate response contract
42
+ | 4. Detect anomalies
43
+ | 5. Store episodic memory
44
+ | 6. Process CONTEXT_UPDATE blocks
45
+ v
46
+ Orchestrator processes AGENT_STATUS
47
+ | COMPLETE -> summarize to user
48
+ | PENDING_APPROVAL -> get approval -> resume
49
+ | NEEDS_INPUT -> ask user -> resume
50
+ | BLOCKED -> report blocker
51
+ ```
52
+
53
+ ## Hook Pipeline: pre_tool_use.py
54
+
55
+ Entry point for all Bash and Task/Agent tool validation. With `Bash(*)` in the settings.json allow list, the hook is the sole security gate.
56
+
57
+ ### Bash Command Validation (BashValidator)
58
+
59
+ Order is short-circuit -- first match wins:
60
+
61
+ ```
62
+ 1. blocked_commands.py --> permanently denied patterns (exit 2)
63
+ 2. Claude footer strip --> auto-remove Co-Authored-By (transparent updatedInput)
64
+ 3. Commit message check --> conventional commits format validation
65
+ 4. cloud_pipe_validator --> block pipes/redirects/chains on cloud CLIs (exit 0, corrective)
66
+ 5. mutative_verbs.py --> scan tokens 1-5 for MUTATIVE verbs
67
+ | If mutative + no active grant -> generate nonce, block
68
+ | If mutative + active grant -> allow (T3)
69
+ | If not mutative -> safe by elimination (T0)
70
+ 6. gitops_validator --> GitOps policy for kubectl/helm/flux
71
+ ```
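The short-circuit ordering above can be sketched as a chain of checks in which the first one to match decides the outcome. This is a minimal illustration, not the package's `BashValidator`: `Verdict`, `check_blocked`, `check_mutative`, and `validate` are hypothetical names, the patterns are placeholders, and the footer-strip, commit-message, cloud-pipe, and GitOps steps are left out.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Verdict:
    """Outcome of one check (illustrative type, not from gaia-ops)."""
    allowed: bool
    reason: str


# A check returns a Verdict when it matches, or None to defer to the next check.
Check = Callable[[str], Optional[Verdict]]


def check_blocked(command: str) -> Optional[Verdict]:
    # Hypothetical stand-in for blocked_commands.py (placeholder pattern).
    if "rm -rf /" in command:
        return Verdict(False, "blocked pattern")
    return None


def check_mutative(command: str) -> Optional[Verdict]:
    # Hypothetical stand-in for mutative_verbs.py: scan the first five tokens.
    mutative = {"apply", "delete", "destroy"}
    if mutative & set(command.split()[:5]):
        return Verdict(False, "mutative verb without active grant")
    return None


def validate(command: str, checks: Sequence[Check]) -> Verdict:
    """Run checks in order; the first one that matches wins."""
    for check in checks:
        verdict = check(command)
        if verdict is not None:
            return verdict
    # Nothing matched: treated as safe by elimination.
    return Verdict(True, "safe by elimination")
```

Putting the blocked-pattern check first means a permanently denied command can never reach the approval path, even if it also contains a mutative verb.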
72
+
73
+ ### Task/Agent Validation
74
+
75
+ ```
76
+ 1. Response contract guard --> if pending repair exists, block new tasks until resolved
77
+ 2. Context injection --> context_provider.py assembles payload, injected into prompt
78
+ 3. Session events injection --> recent git commits, pushes, file mods added to prompt
79
+ 4. Resume validation --> validate agent ID format, detect approval nonces
80
+ 5. TaskValidator --> validate agent name, check available agents
81
+ ```
82
+
83
+ ## Agent Completion Pipeline: subagent_stop.py
84
+
85
+ Fires after every agent completes:
86
+
87
+ ```
88
+ 1. Consume approval file --> delete pending approval if matches agent
89
+ 2. Capture workflow metrics --> duration, exit code, plan status -> metrics.jsonl
90
+ 3. Validate response contract
91
+ | Parse AGENT_STATUS block (plan_status, agent_id, pending_steps, next_action)
92
+ | Parse EVIDENCE_REPORT block (7 required fields)
93
+ | Parse CONSOLIDATION_REPORT if multi-surface task
94
+ | If invalid -> save pending-repair.json for pre_tool_use guard
95
+ | If valid -> clear pending repair
96
+ 4. Detect anomalies --> execution failures, consecutive failures
97
+ | If anomalies found -> create needs_analysis.flag for Gaia
98
+ 5. Capture episodic memory --> store episode via tools/memory/episodic.py
99
+ 6. Process context updates --> apply CONTEXT_UPDATE blocks via context_writer.py
100
+ ```
101
+
102
+ ## Surface Routing: surface_router.py
103
+
104
+ Classifies user tasks into surfaces using signal matching against `config/surface-routing.json`.
105
+
106
+ | Surface | Primary Agent | Typical Signals |
107
+ |---------|--------------|-----------------|
108
+ | `live_runtime` | cloud-troubleshooter | pods, services, logs, kubectl, gcloud |
109
+ | `gitops_desired_state` | gitops-operator | manifests, Flux, Helm, Kustomize |
110
+ | `terraform_iac` | terraform-architect | Terraform, Terragrunt, IAM, modules |
111
+ | `app_ci_tooling` | devops-developer | CI/CD, Docker, package tooling |
112
+ | `planning_specs` | speckit-planner | specs, plans, task breakdowns |
113
+ | `gaia_system` | gaia | hooks, skills, agents/, CLAUDE.md |
114
+
115
+ **Classification algorithm:**
116
+ 1. Normalize task text
117
+ 2. Score each surface by keyword (1.0), command (1.5), and artifact (1.0) matches
118
+ 3. Keep surfaces with score >= 1.0 and >= 55% of top score
119
+ 4. If no match and current agent maps to a surface, use agent-fallback (score 0.2)
120
+ 5. If still no match, dispatch reconnaissance agent
121
+
122
+ **Investigation brief** is generated per agent from routing results. It contains role assignment (primary/cross_check/adjacent), required evidence fields, stop conditions, and whether a CONSOLIDATION_REPORT is required.
123
+
124
+ ## Context Injection: context_provider.py
125
+
126
+ Assembles the context payload injected into agent prompts by pre_tool_use.py.
127
+
128
+ ```
129
+ context_provider.py <agent_name> <user_task>
130
+ |
131
+ +--> Load project-context.json
132
+ +--> Detect cloud provider (GCP/AWS)
133
+ +--> Load base contracts (config/context-contracts.json)
134
+ +--> Merge cloud overrides (config/cloud/{provider}.json)
135
+ +--> Extract contracted sections for this agent (read permissions)
136
+ +--> Load universal rules (config/universal-rules.json)
137
+ +--> Load relevant episodic memory (similarity match)
138
+ +--> Classify surfaces (surface_router.py)
139
+ +--> Build investigation brief (surface_router.py)
140
+ |
141
+ v
142
+ JSON payload:
143
+ contract: {sections the agent may read}
144
+ context_update_contract: {readable/writable section lists}
145
+ rules: {universal + agent-specific rules}
146
+ surface_routing: {active surfaces, dispatch mode, confidence}
147
+ investigation_brief: {role, required checks, stop conditions}
148
+ historical_context: {relevant episodes if any}
149
+ metadata: {provider, version, counts}
150
+ ```
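Two of the assembly steps above, merging cloud overrides into the base contracts and extracting the sections an agent may read, can be sketched like this. The contract layout (`{agent: {"read": [...]}}`), the section names, and the choice to replace lists wholesale are assumptions for illustration; the real merge in `context_provider.py` may behave differently.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge override into a copy of base; non-dict values are replaced."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def readable_sections(
    contracts: Dict[str, Any], agent: str, project_context: Dict[str, Any]
) -> Dict[str, Any]:
    """Return only the project-context sections the agent's contract lists as readable."""
    wanted = contracts.get(agent, {}).get("read", [])
    return {name: project_context[name] for name in wanted if name in project_context}
```

Filtering by contract before building the payload keeps sections an agent has no read permission for out of its prompt.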
151
+
152
+ ## Approval Flow
153
+
154
+ Nonce-based T3 approval lifecycle:
155
+
156
+ ```
157
+ 1. Agent attempts dangerous command (e.g., terraform apply)
158
+ 2. mutative_verbs.py detects MUTATIVE verb
159
+ 3. BashValidator generates 128-bit nonce via generate_nonce()
160
+ 4. write_pending_approval() saves pending-{nonce}.json to .claude/cache/approvals/
161
+ 5. Hook returns corrective deny (exit 0) with NONCE:{hex} in message
162
+ 6. Agent includes NONCE:{hex} in PENDING_APPROVAL status to orchestrator
163
+ 7. Orchestrator presents plan to user, asks for approval
164
+ 8. User approves -> orchestrator resumes agent with "APPROVE:{nonce}"
165
+ 9. pre_tool_use.py detects APPROVE: prefix, calls activate_pending_approval()
166
+ 10. Pending grant converted to active grant (TTL 10 min, verb-matched)
167
+ 11. Agent retries command -> check_approval_grant() finds active grant -> allowed
168
+ ```
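The pending-to-active grant lifecycle in steps 3 to 11 can be sketched with an in-memory store. The 128-bit nonce, the 10-minute TTL, and the verb matching come from the flow above; `ApprovalStore`, `Grant`, and their methods are hypothetical stand-ins for `approval_grants.py`, which persists `pending-{nonce}.json` files rather than holding state in memory.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional

GRANT_TTL_SECONDS = 10 * 60  # TTL from step 10


@dataclass
class Grant:
    verb: str
    expires_at: Optional[float] = None  # None while the grant is still pending


class ApprovalStore:
    """In-memory illustration of the nonce grant lifecycle."""

    def __init__(self) -> None:
        self._grants: Dict[str, Grant] = {}

    def request(self, verb: str) -> str:
        """Create a pending grant and return its nonce."""
        nonce = secrets.token_hex(16)  # 16 random bytes = 128 bits
        self._grants[nonce] = Grant(verb)
        return nonce

    def approve(self, nonce: str, now: Optional[float] = None) -> bool:
        """Activate a pending grant; unknown or already-active nonces are rejected."""
        grant = self._grants.get(nonce)
        if grant is None or grant.expires_at is not None:
            return False
        grant.expires_at = (time.time() if now is None else now) + GRANT_TTL_SECONDS
        return True

    def is_allowed(self, verb: str, now: Optional[float] = None) -> bool:
        """True if an active, unexpired grant exists for this exact verb."""
        current = time.time() if now is None else now
        return any(
            g.verb == verb and g.expires_at is not None and current < g.expires_at
            for g in self._grants.values()
        )
```

Because the grant is matched on the verb and expires, approving one `apply` does not authorize a later `destroy`, and a stale approval cannot be reused after the TTL.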
169
+
170
+ ## Response Contract Validation
171
+
172
+ Every agent response must end with an AGENT_STATUS block. The contract validator (`hooks/modules/agents/response_contract.py`) enforces:
173
+
174
+ - **AGENT_STATUS**: PLAN_STATUS (from 8 valid states), PENDING_STEPS, NEXT_ACTION, AGENT_ID
175
+ - **EVIDENCE_REPORT**: required for all states except APPROVED_EXECUTING. Seven fields: PATTERNS_CHECKED, FILES_CHECKED, COMMANDS_RUN, KEY_OUTPUTS, VERBATIM_OUTPUTS, CROSS_LAYER_IMPACTS, OPEN_GAPS
176
+ - **CONSOLIDATION_REPORT**: required when multi-surface or cross-check. Fields: OWNERSHIP_ASSESSMENT (enum), CONFIRMED_FINDINGS, SUSPECTED_FINDINGS, CONFLICTS, OPEN_GAPS, NEXT_BEST_AGENT
177
+
178
+ Invalid responses trigger a repair loop: save pending-repair.json, pre_tool_use guard blocks new tasks, orchestrator must resume the same agent for repair (max 2 attempts before escalation).
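The field checks and the repair-loop decision can be sketched as below. The field names, the APPROVED_EXECUTING exemption, and the limit of two repair attempts are taken from the text above; the functions, the assumption that blocks arrive already parsed into dictionaries, and the treatment of the limit as "escalate once two attempts have been made" are illustrative and do not describe `response_contract.py`.

```python
from typing import Dict, List

STATUS_FIELDS = ("PLAN_STATUS", "PENDING_STEPS", "NEXT_ACTION", "AGENT_ID")
EVIDENCE_FIELDS = (
    "PATTERNS_CHECKED", "FILES_CHECKED", "COMMANDS_RUN", "KEY_OUTPUTS",
    "VERBATIM_OUTPUTS", "CROSS_LAYER_IMPACTS", "OPEN_GAPS",
)
MAX_REPAIR_ATTEMPTS = 2


def contract_errors(status: Dict[str, str], evidence: Dict[str, str]) -> List[str]:
    """List missing fields; evidence is not required while APPROVED_EXECUTING."""
    errors = [f"AGENT_STATUS missing {name}" for name in STATUS_FIELDS if name not in status]
    if status.get("PLAN_STATUS") != "APPROVED_EXECUTING":
        errors += [
            f"EVIDENCE_REPORT missing {name}" for name in EVIDENCE_FIELDS if name not in evidence
        ]
    return errors


def next_step(errors: List[str], attempts_so_far: int) -> str:
    """Accept a valid response, otherwise repair until the attempt limit is reached."""
    if not errors:
        return "accept"
    return "repair" if attempts_so_far < MAX_REPAIR_ATTEMPTS else "escalate"
```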
179
+
180
+ ## Adapter Layer
181
+
182
+ The adapter layer decouples business logic from CLI-specific protocols. Located at `hooks/adapters/`.
183
+
184
+ ### Components
185
+ - `types.py` -- Normalized dataclasses (HookEvent, ValidationRequest, ValidationResult, etc.)
186
+ - `base.py` -- Abstract HookAdapter interface
187
+ - `claude_code.py` -- Claude Code adapter (stdin JSON <-> normalized types)
188
+ - `channel.py` -- Distribution channel detection (plugin vs npm)
189
+
190
+ ### Flow
191
+ ```
192
+ Claude Code stdin JSON -> ClaudeCodeAdapter.parse_event() -> normalized HookEvent
193
+ -> Business logic (unchanged) ->
194
+ ClaudeCodeAdapter.format_validation_response() -> Claude Code stdout JSON
195
+ ```
196
+
197
+ ### Plugin Distribution
198
+ gaia-ops is distributable as a Claude Code plugin via `.claude-plugin/plugin.json`.
199
+ The plugin is auto-discovered by Claude Code -- agents, skills, commands, and hooks
200
+ are loaded from their respective directories.
201
+
202
+ See `.claude-plugin/marketplace.json` for the self-hosted marketplace with sub-plugins.
203
+
204
+ ## Adapter Coupling Points
205
+
206
+ The adapter layer connects Claude Code's hook protocol to gaia-ops business logic through 5 coupling points. Each coupling point is a thin entry point that delegates to the adapter for JSON parsing/formatting and to business logic modules for decisions.
207
+
208
+ ### CP-1: `hooks/pre_tool_use.py` -- Command Validation Entry Point
209
+
210
+ | Attribute | Value |
211
+ |-----------|-------|
212
+ | **File** | `hooks/pre_tool_use.py` |
213
+ | **Hook event** | PreToolUse |
214
+ | **What it does** | Security gate for all Bash, Task, and Agent tool invocations. Validates commands (blocked patterns, mutative verbs, nonce-based approval), injects project-context into agent prompts, guards pending contract repairs. |
215
+ | **Adapter methods called** | `ClaudeCodeAdapter.parse_event()`, `ClaudeCodeAdapter.parse_pre_tool_use()`, `ClaudeCodeAdapter.format_validation_response()` |
216
+ | **Business logic modules** | `security/blocked_commands.py`, `security/mutative_verbs.py`, `security/approval_grants.py`, `tools/bash_validator.py`, `tools/task_validator.py`, `agents/response_contract.py`, `context/context_provider.py` |
217
+
218
+ ### CP-2: `hooks/post_tool_use.py` -- Audit Logging Entry Point
219
+
220
+ | Attribute | Value |
221
+ |-----------|-------|
222
+ | **File** | `hooks/post_tool_use.py` |
223
+ | **Hook event** | PostToolUse |
224
+ | **What it does** | Records execution audit logs, detects critical events (git commits, pushes, file modifications), updates active session context. Reads pre-hook state for timing and tier classification. |
225
+ | **Adapter methods called** | `ClaudeCodeAdapter.parse_event()`, `ClaudeCodeAdapter.parse_post_tool_use()` |
226
+ | **Business logic modules** | `audit/logger.py` (`log_execution`), `audit/event_detector.py` (`detect_critical_event`), `core/state.py` (`get_hook_state`, `clear_hook_state`) |
227
+
228
+ ### CP-3: `hooks/subagent_stop.py` -- Contract Validation + Memory Entry Point
229
+
230
+ | Attribute | Value |
231
+ |-----------|-------|
232
+ | **File** | `hooks/subagent_stop.py` |
233
+ | **Hook event** | SubagentStop |
234
+ | **What it does** | Fires after every agent completes. Consumes approval files, captures workflow metrics, validates the response contract (AGENT_STATUS, EVIDENCE_REPORT, CONSOLIDATION_REPORT), detects anomalies, stores episodic memory, and processes CONTEXT_UPDATE blocks. |
235
+ | **Adapter methods called** | `ClaudeCodeAdapter.parse_event()`, `ClaudeCodeAdapter.parse_agent_completion()` |
236
+ | **Business logic modules** | `agents/response_contract.py` (`validate_response_contract`, `save_pending_repair`, `clear_pending_repair`), `tools/memory/episodic.py` (`EpisodicMemory.store_episode`), `context/context_writer.py` (`process_agent_output`) |
237
+
238
+ ### CP-4: `hooks/modules/tools/hook_response.py` -- Response Formatting
239
+
240
+ | Attribute | Value |
241
+ |-----------|-------|
242
+ | **File** | `hooks/modules/tools/hook_response.py` |
243
+ | **Hook event** | (shared utility, used by PreToolUse callers) |
244
+ | **What it does** | Provides `build_hook_permission_response()` -- a shared builder for hookSpecificOutput JSON. Delegates to the adapter's `format_validation_response()` so all permission responses share a single code path. |
245
+ | **Adapter methods called** | `ClaudeCodeAdapter.format_validation_response()` |
246
+ | **Business logic modules** | None (pure formatting bridge) |
247
+
248
+ ### CP-5: `templates/settings.template.json` / `hooks/hooks.json` -- Hook Configuration
249
+
250
+ | Attribute | Value |
251
+ |-----------|-------|
252
+ | **File (npm channel)** | `templates/settings.template.json` -- paths use `.claude/hooks/` prefix |
253
+ | **File (plugin channel)** | `hooks/hooks.json` -- paths use `${CLAUDE_PLUGIN_ROOT}/hooks/` prefix |
254
+ | **What it does** | Maps Claude Code hook events to handler scripts. Defines which events fire which entry points, the tool matchers (Bash, Task, Agent, `*`), and permissions (allow/deny lists). |
255
+ | **Events configured** | PreToolUse, PostToolUse, SubagentStop, SessionStart, Stop, TaskCompleted, SubagentStart (UserPromptSubmit is a static echo in settings.json only) |
256
+
257
+ ### HookAdapter ABC Contract
258
+
259
+ The abstract interface in `hooks/adapters/base.py` defines the adapter contract. Each CLI backend provides a concrete implementation.
260
+
261
+ | Method | Signature | Description |
262
+ |--------|-----------|-------------|
263
+ | `parse_event` | `(stdin_data: str) -> HookEvent` | Parse raw stdin JSON into a normalized, CLI-agnostic event |
264
+ | `format_validation_response` | `(result: ValidationResult) -> HookResponse` | Format a validation result for the CLI's permission protocol |
265
+ | `format_completion_response` | `(result: CompletionResult) -> HookResponse` | Format a completion result for SubagentStop |
266
+ | `format_context_response` | `(result: ContextResult) -> HookResponse` | Format a context injection result |
267
+ | `detect_channel` | `() -> DistributionChannel` | Detect whether gaia-ops is running as NPM or PLUGIN |
268
+
269
+ Additional abstract methods for P1/P2 events: `adapt_session_start`, `format_bootstrap_response`, `adapt_stop`, `adapt_task_completed`, `adapt_subagent_start`, `format_quality_response`, `format_verification_response`.
270
+
271
+ **Invariants:**
272
+ 1. Business logic modules NEVER see `HookResponse`. They produce `ValidationResult`, `CompletionResult`, etc.
273
+ 2. The adapter NEVER modifies business logic results -- it only translates format.
274
+ 3. Adding a new hook event requires ONLY a new adapter method. Zero changes to business logic modules.
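The invariants can be illustrated with a reduced version of the contract. Only `parse_event` and `format_validation_response` are shown, with the signatures from the table above; the dataclass fields, the `ExampleCliAdapter` backend, its JSON shape, and the `validate_command` rule are all invented for this sketch and do not mirror `hooks/adapters/types.py` or the Claude Code protocol.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class HookEvent:
    event_type: str
    payload: Dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class ValidationResult:
    allowed: bool
    reason: str = ""


@dataclass(frozen=True)
class HookResponse:
    output: Dict[str, Any]
    exit_code: int = 0


class HookAdapter(ABC):
    """Reduced adapter contract: translate formats, never make decisions."""

    @abstractmethod
    def parse_event(self, stdin_data: str) -> HookEvent: ...

    @abstractmethod
    def format_validation_response(self, result: ValidationResult) -> HookResponse: ...


class ExampleCliAdapter(HookAdapter):
    """Hypothetical backend with an invented JSON shape."""

    def parse_event(self, stdin_data: str) -> HookEvent:
        raw = json.loads(stdin_data)
        return HookEvent(event_type=raw["event"], payload=raw.get("data", {}))

    def format_validation_response(self, result: ValidationResult) -> HookResponse:
        decision = "allow" if result.allowed else "deny"
        return HookResponse(output={"decision": decision, "reason": result.reason})


def validate_command(event: HookEvent) -> ValidationResult:
    # Business logic sees only normalized types and never builds a HookResponse.
    command = event.payload.get("command", "")
    return ValidationResult(allowed="destroy" not in command, reason="demo rule")
```

Supporting another CLI in this sketch would mean writing a second `HookAdapter` subclass; `validate_command` would stay unchanged because it never touches the wire format.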
275
+
276
+ ### Adding a New Hook Event
277
+
278
+ To add support for a new Claude Code hook event (e.g., a future `PreCompact` event):
279
+
280
+ 1. **Add enum value** to `HookEventType` in `hooks/adapters/types.py` (already present for all 19 known events).
+ 2. **Add adapter method** to `ClaudeCodeAdapter` in `hooks/adapters/claude_code.py` -- implement `adapt_<event_name>(raw: dict) -> <ResultType>` and the corresponding `format_<result>_response()` if a new result type is needed.
+ 3. **Add extract/format methods** for the event type -- the extract method pulls typed data from the raw payload, the format method builds the CLI response JSON.
+ 4. **Create hook script entry point** -- a new `hooks/<event_name>.py` file that reads stdin, calls `adapter.parse_event()`, delegates to business logic, and writes the response to stdout.
+ 5. **Add entry to `hooks/hooks.json`** (plugin channel) and `templates/settings.template.json` (npm channel) mapping the event name to the new script.
285
+
286
+ **Zero changes to business logic modules required.** The adapter is the only layer that touches CLI-specific JSON.
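Step 4 above (the hook script entry point) might look roughly like the following. This is a sketch under stated assumptions: `StubAdapter` stands in for the real `ClaudeCodeAdapter`, `handle_event` is a placeholder for business logic, and the `hook_event_name` and `additionalContext` keys are assumed payload fields rather than a confirmed protocol.

```python
import json
from dataclasses import dataclass, field


@dataclass
class HookEvent:
    """Normalized event (fields are assumed for this sketch)."""
    event_type: str
    payload: dict = field(default_factory=dict)


@dataclass
class ContextResult:
    """Normalized business-logic result (field name is assumed)."""
    additional_context: str


class StubAdapter:
    """Hypothetical stand-in for ClaudeCodeAdapter."""

    def parse_event(self, stdin_data: str) -> HookEvent:
        raw = json.loads(stdin_data)
        return HookEvent(event_type=raw.get("hook_event_name", "unknown"), payload=raw)

    def format_context_response(self, result: ContextResult) -> dict:
        return {"additionalContext": result.additional_context}


def handle_event(event: HookEvent) -> ContextResult:
    """Business logic placeholder: it only sees normalized types."""
    return ContextResult(additional_context=f"handled {event.event_type}")


def run_hook(stdin_data: str, adapter: StubAdapter) -> str:
    """Parse, delegate, and format: the three steps the entry point performs."""
    event = adapter.parse_event(stdin_data)
    result = handle_event(event)
    return json.dumps(adapter.format_context_response(result))


# A real script would pass sys.stdin.read() and write the result to sys.stdout.
print(run_hook('{"hook_event_name": "PreCompact"}', StubAdapter()))
```

Keeping `run_hook` as a pure function of its input string makes the entry point testable without piping real stdin.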
287
+
288
+ ### Adding a New CLI Backend
289
+
290
+ To support a CLI other than Claude Code (e.g., a hypothetical Cursor or Windsurf integration):
291
+
292
+ 1. **Subclass `HookAdapter`** from `hooks/adapters/base.py`.
+ 2. **Implement `parse_event()`** and all `format_*()` methods to translate between the new CLI's JSON protocol and the normalized types in `hooks/adapters/types.py`.
+ 3. **No changes to business logic or adapter interface.** The same `ValidationResult`, `CompletionResult`, `ContextResult`, etc. flow through unchanged.
295
+
296
+ **Business logic modules remain untouched.** They consume and produce normalized types; only the adapter layer changes.
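As a rough illustration of the subclassing steps above, the sketch below defines a reduced `HookAdapter` base with only two abstract methods and an adapter for an imaginary CLI. The `ExampleCliAdapter` name, the `kind`/`tool`/`args` input keys, and the `ok`/`message` output keys are invented for this example; the real base class declares more methods than shown here.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class HookEvent:
    """Normalized event (fields are assumed for this sketch)."""
    event_type: str
    tool_name: str
    tool_input: dict


@dataclass
class ValidationResult:
    """Normalized validation result (fields are assumed)."""
    allowed: bool
    reason: str


class HookAdapter(ABC):
    """Reduced stand-in for the abstract base in hooks/adapters/base.py."""

    @abstractmethod
    def parse_event(self, stdin_data: str) -> HookEvent:
        ...

    @abstractmethod
    def format_validation_response(self, result: ValidationResult) -> dict:
        ...


class ExampleCliAdapter(HookAdapter):
    """Adapter for an imaginary CLI whose JSON uses 'kind', 'tool', and 'args'."""

    def parse_event(self, stdin_data: str) -> HookEvent:
        raw = json.loads(stdin_data)
        return HookEvent(
            event_type=raw["kind"],
            tool_name=raw["tool"],
            tool_input=raw.get("args", {}),
        )

    def format_validation_response(self, result: ValidationResult) -> dict:
        # Only the wire format differs; the decision comes from business logic.
        return {"ok": result.allowed, "message": result.reason}
```

Because the base class is abstract, a backend that forgets to implement one of the methods fails at instantiation time rather than in the middle of a hook run.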
297
+
298
+ ## Key Files Reference
299
+
300
+ | File | Purpose |
+ |------|---------|
+ | `CLAUDE.md` | Orchestrator identity, routing table, tool restrictions |
+ | `hooks/pre_tool_use.py` | PreToolUse hook entry point |
+ | `hooks/subagent_stop.py` | SubagentStop hook entry point |
+ | `hooks/modules/tools/bash_validator.py` | Bash command security gate |
+ | `hooks/modules/tools/task_validator.py` | Task/Agent invocation validator |
+ | `hooks/modules/security/blocked_commands.py` | Permanently denied command patterns |
+ | `hooks/modules/security/mutative_verbs.py` | CLI-agnostic mutative verb detector |
+ | `hooks/modules/security/approval_grants.py` | Nonce grant lifecycle management |
+ | `hooks/modules/agents/response_contract.py` | Agent response contract validator |
+ | `hooks/modules/context/context_writer.py` | Progressive context enrichment |
+ | `tools/context/context_provider.py` | Context payload assembly |
+ | `tools/context/surface_router.py` | Surface classification and investigation briefs |
+ | `tools/memory/episodic.py` | Episodic memory storage |
+ | `config/context-contracts.json` | Agent read/write section permissions |
+ | `config/universal-rules.json` | Universal and agent-specific rules |
+ | `config/surface-routing.json` | Surface signals and routing config |
+ | `agents/*.md` | Agent identity definitions |
+ | `skills/*/SKILL.md` | Injected procedural knowledge |
320
+ | `bin/` | CLI tools (gaia-scan, gaia-doctor, gaia-status, etc.) |
package/CHANGELOG.md CHANGED
@@ -5,6 +5,21 @@ All notable changes to the CLAUDE.md orchestrator instructions are documented in
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [Unreleased]
9
+
10
+ ### Added
11
+ - Plugin distribution: `.claude-plugin/plugin.json` manifest for Claude Code native plugin system
12
+ - Self-hosted marketplace: `.claude-plugin/marketplace.json` with 2 sub-plugin tiers (gaia-security, gaia-ops)
13
+ - Adapter layer: `hooks/adapters/` with normalized types, abstract base, and Claude Code adapter
14
+ - `hooks/hooks.json` for plugin-channel hook configuration
15
+ - Distribution channel detection (`hooks/adapters/channel.py`)
16
+ - Integration tests for adapter -> business logic -> response flow
17
+ - Plugin manifest validation tests
18
+
19
+ ### Changed
20
+ - Hook entry points (pre_tool_use.py, post_tool_use.py, subagent_stop.py) now use adapter layer for stdin/stdout
21
+ - hook_response.py delegates to ClaudeCodeAdapter internally
22
+
8
23
  ## [4.0.0] - 2026-03-03
9
24
 
10
25
  ### Breaking: Contracts as Single Source of Truth
@@ -0,0 +1,11 @@
1
+ # Code of Conduct
2
+
3
+ This project follows the [Contributor Covenant Code of Conduct v2.1](https://www.contributor-covenant.org/version/2/1/code_of_conduct/).
4
+
5
+ Please read the full text at the link above. All contributors, maintainers, and participants are expected to uphold these standards.
6
+
7
+ ## Reporting
8
+
9
+ Report unacceptable behavior to jaguilar1897@gmail.com.
10
+
11
+ Reports will be reviewed and investigated promptly and fairly.
@@ -0,0 +1,146 @@
1
+ # Contributing to gaia-ops
2
+
3
+ Thank you for your interest in contributing to gaia-ops. This guide covers how to set up your development environment, run tests, and submit changes.
4
+
5
+ ## Development Setup
6
+
7
+ ### Prerequisites
8
+
9
+ - **Node.js** >= 18.0.0
10
+ - **Python** >= 3.9
11
+ - **Git** >= 2.30
12
+ - **Claude Code** (latest version, for end-to-end testing)
13
+
14
+ ### Clone and Install
15
+
16
+ ```bash
+ git clone https://github.com/metraton/gaia-ops.git
+ cd gaia-ops
+ npm install
+ ```
21
+
22
+ Python test dependencies:
23
+
24
+ ```bash
+ pip install pytest
+ ```
27
+
28
+ ## Running Tests
29
+
30
+ The test suite is organized in layers:
31
+
32
+ ```bash
+ # Layer 1 (fast, deterministic) - run these before every PR
+ npm test
+
+ # Equivalent:
+ npm run test:layer1
+
+ # Layer 2 (LLM evaluation) - requires Claude Code access
+ npm run test:layer2
+
+ # Layer 3 (end-to-end)
+ npm run test:layer3
+
+ # All layers
+ npm run test:all
+
+ # Run pytest directly with stop-on-first-failure
+ python -m pytest tests/ -x
+
+ # Linting
+ npm run lint
+ ```
54
+
55
+ Always ensure Layer 1 tests pass before submitting a PR.
56
+
57
+ ## Project Structure
58
+
59
+ See [README.md](./README.md) for the full directory tree. Key areas for contributors:
60
+
61
+ | Directory | What it contains |
+ |-----------|-----------------|
+ | `agents/` | Agent definition files (`.md`) - identity, scope, routing |
+ | `skills/` | Skill modules (`SKILL.md` files) - injected procedural knowledge |
+ | `hooks/` | Runtime validators (`pre_tool_use.py`, `post_tool_use.py`, `subagent_stop.py`) |
+ | `hooks/modules/` | Modular hook components (blocked commands, safe commands, dangerous verbs) |
+ | `tools/` | Orchestration tools (context provider, memory, validation) |
+ | `config/` | Configuration files (contracts, git standards, rules) |
+ | `tests/` | Test suite organized by layer |
+ | `bin/` | CLI utilities (`gaia-scan`, `gaia-doctor`, etc.) |
71
+
72
+ ## Coding Standards
73
+
74
+ ### Python
75
+
76
+ - Follow the existing code style in the repository.
77
+ - Use [ruff](https://github.com/astral-sh/ruff) for linting and formatting.
78
+ - Type hints are encouraged but not strictly required.
79
+ - Keep functions focused and testable.
80
+
81
+ ### JavaScript / Node.js
82
+
83
+ - ES modules (`import`/`export`), not CommonJS.
84
+ - Follow the existing patterns in `bin/` and `index.js`.
85
+
86
+ ### Commit Messages
87
+
88
+ All commits must follow [Conventional Commits](https://www.conventionalcommits.org/):
89
+
90
+ ```
+ type(scope): short description
+ ```
93
+
94
+ Allowed types: `feat`, `fix`, `refactor`, `docs`, `test`, `chore`, `ci`, `perf`, `style`, `build`
95
+
96
+ Examples:
97
+ - `feat(hooks): add timeout protection to bash validator`
98
+ - `fix(skills): correct token budget in agent-protocol`
99
+ - `docs(readme): update installation instructions`
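The commit format and allowed types above could be checked locally with a small script along these lines. This is a sketch, not a tool shipped with the project: `is_valid_subject` is a hypothetical helper, and it assumes the scope is mandatory because the format shown includes one (the Conventional Commits specification itself treats the scope as optional).

```python
import re

# Allowed types as listed in the contributing guide.
ALLOWED_TYPES = (
    "feat", "fix", "refactor", "docs", "test",
    "chore", "ci", "perf", "style", "build",
)

# type(scope): short description
# Assumption: the scope is required and contains no spaces or parentheses.
COMMIT_RE = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"\((?P<scope>[^()\s]+)\): "
    r"(?P<description>\S.*)$"
)


def is_valid_subject(subject: str) -> bool:
    """Return True when the commit subject line matches the expected format."""
    return COMMIT_RE.match(subject) is not None


for subject in (
    "feat(hooks): add timeout protection to bash validator",
    "update stuff",
):
    print(subject, "->", is_valid_subject(subject))
```

To accept scope-less subjects such as `docs: fix typo`, the scope group would need to be wrapped in an optional non-capturing group.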
100
+
101
+ ## PR Process
102
+
103
+ 1. **Fork** the repository and create a feature branch from `main`.
+ 2. **Make your changes** following the coding standards above.
+ 3. **Write tests** for new functionality. Changes to `hooks/` always need tests.
+ 4. **Run the test suite**: `npm test` must pass.
+ 5. **Commit** using Conventional Commits format.
+ 6. **Open a PR** against `main` with a clear description of what changed and why.
109
+
110
+ PRs are reviewed for correctness, test coverage, and consistency with existing patterns.
111
+
112
+ ## Hooks Development
113
+
114
+ The `hooks/` directory contains runtime validators that enforce security and workflow policies in Claude Code. These are critical-path code.
115
+
116
+ - `pre_tool_use.py` - Main entry point; validates every tool call before execution.
117
+ - `post_tool_use.py` - Audit and metrics after tool execution.
118
+ - `hooks/modules/` - Individual validation modules (e.g., `blocked_commands.py`, `mutative_verbs.py`).
119
+
120
+ **Key rules for hook changes:**
121
+ - Every change to a hook module must have a corresponding test in `tests/`.
122
+ - Hook modules must be deterministic -- no network calls, no randomness.
123
+ - Test both the allow and deny paths for any new validation rule.
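A test that follows the rules above might be shaped like this. The `is_blocked` function and its patterns are hypothetical stand-ins written for this example, not the contents of `blocked_commands.py`; the point is the pairing of a deny-path test with an allow-path test, with no network access or randomness involved.

```python
import re

# Hypothetical deny patterns; the real list lives in the project's hook modules.
BLOCKED_PATTERNS = [
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),
    re.compile(r"\bterraform\s+destroy\b"),
]


def is_blocked(command: str) -> bool:
    """Deterministic check: pure string matching, no I/O."""
    return any(pattern.search(command) for pattern in BLOCKED_PATTERNS)


def test_deny_path():
    assert is_blocked("rm -rf /")
    assert is_blocked("terraform destroy -auto-approve")


def test_allow_path():
    assert not is_blocked("ls -la")
    assert not is_blocked("terraform plan")
```

Saved under `tests/`, a file like this would be collected by `python -m pytest tests/ -x` along with the rest of the suite.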
124
+
125
+ ## Skills Development
126
+
127
+ Skills live in `skills/` as directories, each containing a `SKILL.md` file:
128
+
129
+ ```
+ skills/
+   skill-name/
+     SKILL.md          # Main content (injected into agents)
+     reference.md      # Heavy reference material (read on-demand)
+     examples.md       # Concrete examples (optional)
+     scripts/          # Executable tools (optional)
+ ```
137
+
138
+ - `SKILL.md` must stay under 100 lines (it is injected on every agent call).
139
+ - Heavy content goes in `reference.md` (loaded on-demand).
140
+ - Skills define process; agents define identity. Do not duplicate between them.
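The 100-line limit on `SKILL.md` lends itself to an automated check. The helper below is a sketch under assumptions: `oversized_skills` is a hypothetical name, and it treats "under 100 lines" as a strict bound, so a file with exactly 100 lines is reported.

```python
from pathlib import Path

MAX_SKILL_LINES = 100  # limit stated in the contributing guide


def oversized_skills(skills_root: Path, limit: int = MAX_SKILL_LINES) -> list[str]:
    """List skills whose SKILL.md has `limit` lines or more."""
    offenders = []
    for skill_file in sorted(skills_root.glob("*/SKILL.md")):
        line_count = len(skill_file.read_text(encoding="utf-8").splitlines())
        if line_count >= limit:
            offenders.append(f"{skill_file.parent.name}: {line_count} lines")
    return offenders


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for offender in oversized_skills(Path("skills")):
        print(offender)
```

Run from the repository root, it prints one line per offending skill and nothing when every `SKILL.md` is within the limit.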
141
+
142
+ For detailed guidance, see `skills/skill-creation/SKILL.md`.
143
+
144
+ ## Questions?
145
+
146
+ Open an issue on [GitHub](https://github.com/metraton/gaia-ops/issues) or contact the maintainer at jaguilar1897@gmail.com.