@jaggerxtrm/specialists 3.3.1 → 3.3.2

Files changed (37)
  1. package/config/hooks/specialists-complete.mjs +60 -0
  2. package/config/hooks/specialists-session-start.mjs +120 -0
  3. package/config/skills/specialists-creator/SKILL.md +506 -0
  4. package/config/skills/specialists-creator/scripts/validate-specialist.ts +41 -0
  5. package/config/skills/specialists-usage-workspace/iteration-1/eval-bead-background/old_skill/outputs/result.md +105 -0
  6. package/config/skills/specialists-usage-workspace/iteration-1/eval-bead-background/with_skill/outputs/result.md +93 -0
  7. package/config/skills/specialists-usage-workspace/iteration-1/eval-fresh-setup/old_skill/outputs/result.md +113 -0
  8. package/config/skills/specialists-usage-workspace/iteration-1/eval-fresh-setup/with_skill/outputs/result.md +131 -0
  9. package/config/skills/specialists-usage-workspace/iteration-1/eval-yaml-debug/old_skill/outputs/result.md +159 -0
  10. package/config/skills/specialists-usage-workspace/iteration-1/eval-yaml-debug/with_skill/outputs/result.md +150 -0
  11. package/config/skills/specialists-usage-workspace/iteration-2/eval-bug-investigation/with_skill/outputs/result.md +180 -0
  12. package/config/skills/specialists-usage-workspace/iteration-2/eval-bug-investigation/with_skill/timing.json +5 -0
  13. package/config/skills/specialists-usage-workspace/iteration-2/eval-bug-investigation/without_skill/outputs/result.md +223 -0
  14. package/config/skills/specialists-usage-workspace/iteration-2/eval-bug-investigation/without_skill/timing.json +5 -0
  15. package/config/skills/specialists-usage-workspace/iteration-2/eval-code-review/with_skill/timing.json +5 -0
  16. package/config/skills/specialists-usage-workspace/iteration-2/eval-code-review/without_skill/outputs/result.md +146 -0
  17. package/config/skills/specialists-usage-workspace/iteration-2/eval-code-review/without_skill/timing.json +5 -0
  18. package/config/skills/specialists-usage-workspace/iteration-2/eval-test-coverage/with_skill/outputs/result.md +89 -0
  19. package/config/skills/specialists-usage-workspace/iteration-2/eval-test-coverage/with_skill/timing.json +5 -0
  20. package/config/skills/specialists-usage-workspace/iteration-2/eval-test-coverage/without_skill/outputs/result.md +96 -0
  21. package/config/skills/specialists-usage-workspace/iteration-2/eval-test-coverage/without_skill/timing.json +5 -0
  22. package/config/skills/specialists-usage-workspace/skill-snapshot/SKILL.md.old +237 -0
  23. package/config/skills/using-specialists/SKILL.md +158 -0
  24. package/config/skills/using-specialists/evals/evals.json +68 -0
  25. package/config/specialists/.serena/project.yml +151 -0
  26. package/config/specialists/auto-remediation.specialist.yaml +70 -0
  27. package/config/specialists/bug-hunt.specialist.yaml +96 -0
  28. package/config/specialists/explorer.specialist.yaml +79 -0
  29. package/config/specialists/memory-processor.specialist.yaml +140 -0
  30. package/config/specialists/overthinker.specialist.yaml +63 -0
  31. package/config/specialists/parallel-runner.specialist.yaml +61 -0
  32. package/config/specialists/planner.specialist.yaml +87 -0
  33. package/config/specialists/specialists-creator.specialist.yaml +82 -0
  34. package/config/specialists/sync-docs.specialist.yaml +53 -0
  35. package/config/specialists/test-runner.specialist.yaml +58 -0
  36. package/config/specialists/xt-merge.specialist.yaml +78 -0
  37. package/package.json +2 -3
@@ -0,0 +1,158 @@
1
+ ---
2
+ name: using-specialists
3
+ description: >
4
+ Use this skill whenever you're about to start a substantial task — pause first and
5
+ ask whether to delegate. Consult before any: code review, security audit, deep bug
6
+ investigation, test generation, multi-file refactor, or architecture analysis. Also
7
+ use for the mechanics of delegation: --bead workflow, --context-depth, background
8
+ jobs, MCP tools (use_specialist, start_specialist, poll_specialist), specialists init,
9
+ or specialists doctor. Don't wait for the user to say "use a specialist" — proactively
10
+ evaluate whether delegation makes sense.
11
+ version: 3.1
12
+ ---
13
+
14
+ # Specialists Usage
15
+
16
+ Specialists are autonomous AI agents that run independently — fresh context, different
17
+ model, no prior bias. Delegate when a task would take you significant effort, spans
18
+ multiple files, or benefits from a dedicated focused run.
19
+
20
+ The reason isn't just speed — it's quality. A specialist has no competing context,
21
+ leaves a tracked record via beads, and can run in the background while you stay unblocked.
22
+
23
+ ## The Delegation Decision
24
+
25
+ Before starting any substantial task, ask: is this worth delegating?
26
+
27
+ **Delegate when:**
28
+ - It would take >5 minutes of focused work
29
+ - It spans multiple files or modules
30
+ - A fresh perspective adds value (code review, security audit)
31
+ - It can run in the background while you do other things
32
+
33
+ **Do it yourself when:**
34
+ - It's a single-file edit or quick config change
35
+ - It needs interactive back-and-forth
36
+ - It's obviously trivial (one-liner, formatting fix)
37
+
38
+ When in doubt, delegate. Specialists run in parallel — you don't have to wait.
39
+
40
+ ---
41
+
42
+ ## Canonical Workflow
43
+
44
+ For tracked work, always use `--bead`. This gives the specialist your issue as context,
45
+ links results back to the tracker, and creates an audit trail.
46
+
47
+ ```bash
48
+ # 1. Create a bead describing what you need
49
+ bd create --title "Audit authentication module for security issues" --type task --priority 2
50
+ # → unitAI-abc
51
+
52
+ # 2. Find and run the right specialist
53
+ specialists list
54
+ specialists run security-audit --bead unitAI-abc --background
55
+
56
+ # 3. Keep working; check in when ready
57
+ specialists feed -f
58
+
59
+ # 4. Read results and close
60
+ specialists result <job-id>
61
+ bd close unitAI-abc --reason "2 issues found, filed as follow-ups"
62
+ ```
63
+
64
+ **`--background`** — returns immediately; use for anything that will take more than ~30 seconds.
65
+ **`--context-depth N`** — how many levels of parent-bead context to inject (default: 1).
66
+ **`--no-beads`** — skips creating an auto-tracking sub-bead but still reads the `--bead` input.
67
+
68
+ ---
69
+
70
+ ## Choosing the Right Specialist
71
+
72
+ Run `specialists list` to see what's available. Match by task type:
73
+
74
+ | Task type | Look for |
75
+ |-----------|----------|
76
+ | Bug / regression investigation | `bug-hunt`, `overthinker` |
77
+ | Code review | `parallel-review`, `codebase-explorer` |
78
+ | Test generation | `test-runner` |
79
+ | Architecture / exploration | `codebase-explorer`, `feature-design` |
80
+ | Planning / scoping | `planner` |
81
+ | Documentation sync | `sync-docs` |
82
+
83
+ When unsure, read descriptions: `specialists list --json | jq '.[].description'`
84
+
85
+ ---
86
+
87
+ ## When a Specialist Fails
88
+
89
+ If a specialist times out or errors, **don't silently fall back to doing the work yourself**.
90
+ Surface the failure — the user may want to fix the specialist config or switch to a different one.
91
+
92
+ ```bash
93
+ specialists feed <job-id> # see what happened
94
+ specialists doctor # check for systemic issues
95
+ ```
96
+
97
+ If you need to retry: try foreground mode (no `--background`) for shorter timeout exposure,
98
+ or try a different specialist. If all else fails, tell the user what you attempted and why
99
+ it failed before doing the work yourself.
100
+
101
+ ---
102
+
103
+ ## Ad-Hoc (No Tracking)
104
+
105
+ ```bash
106
+ specialists run codebase-explorer --prompt "Map the feed command architecture"
107
+ ```
108
+
109
+ Use `--prompt` only for throwaway exploration. For anything worth remembering, use `--bead`.
110
+
111
+ ---
112
+
113
+ ## Example: Delegation in Practice
114
+
115
+ You're asked to review `src/auth/` for security issues. Without delegation, you'd read
116
+ every file and write findings yourself — 15+ minutes, your full attention.
117
+
118
+ With a specialist:
119
+ ```bash
120
+ bd create --title "Security review: src/auth/" --type task --priority 1 # → unitAI-xyz
121
+ specialists list --category security
122
+ specialists run security-audit --bead unitAI-xyz --background # → job_4a2b1c
123
+ # go do other work
124
+ specialists result job_4a2b1c
125
+ bd close unitAI-xyz --reason "Found 2 issues, filed unitAI-abc, unitAI-def"
126
+ ```
127
+
128
+ The specialist runs with full bead context, on a model tuned for the task, while you stay unblocked.
129
+
130
+ ---
131
+
132
+ ## MCP Tools (Claude Code)
133
+
134
+ Available after `specialists init` and session restart.
135
+
136
+ | Tool | Purpose |
137
+ |------|---------|
138
+ | `specialist_init` | Bootstrap once per session |
139
+ | `use_specialist` | Foreground run; pass `bead_id` for tracked work |
140
+ | `start_specialist` | Async: returns job ID immediately |
141
+ | `poll_specialist` | Check status + delta output |
142
+ | `stop_specialist` | Cancel |
143
+ | `run_parallel` | Concurrent or pipeline execution |
144
+ | `specialist_status` | Circuit breaker health + staleness |
145
+
146
+ ---
147
+
148
+ ## Setup and Troubleshooting
149
+
150
+ ```bash
151
+ specialists init # first-time setup: creates specialists/, wires AGENTS.md
152
+ specialists doctor # health check: hooks, MCP, zombie jobs
153
+ ```
154
+
155
+ - **"specialist not found"** → `specialists list` (project-scope only)
156
+ - **Job hangs** → `specialists feed <id>`; `specialists stop` to cancel
157
+ - **MCP tools missing** → `specialists init` then restart Claude Code
158
+ - **YAML skipped** → stderr shows `[specialists] skipping <file>: <reason>`
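The "Delegate when / Do it yourself when" checklist in the SKILL.md above can be read as a small decision function. The following is only an illustrative sketch: the `Task` fields and the `should_delegate` name are hypothetical and are not part of the specialists package, and the thresholds simply mirror the wording of the checklist.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task description; these fields are assumptions for illustration."""
    estimated_minutes: float
    files_touched: int
    needs_fresh_perspective: bool = False
    needs_interactive_feedback: bool = False


def should_delegate(task: Task) -> bool:
    """Apply the checklist: 'do it yourself' signals first, then default to delegating."""
    # Interactive back-and-forth does not suit an autonomous run.
    if task.needs_interactive_feedback:
        return False
    # Quick single-file work with no review value is cheaper to do directly.
    is_small = task.files_touched <= 1 and task.estimated_minutes <= 5
    if is_small and not task.needs_fresh_perspective:
        return False
    # Everything else follows "when in doubt, delegate".
    return True
```

For example, `should_delegate(Task(estimated_minutes=15, files_touched=6))` returns `True`, while a one-minute single-file edit returns `False`.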
@@ -0,0 +1,68 @@
1
+ {
2
+ "skill_name": "specialists-usage",
3
+ "evals": [
4
+ {
5
+ "id": 1,
6
+ "eval_name": "bug-investigation",
7
+ "prompt": "I'm seeing intermittent failures where specialist jobs show status 'done' in `specialists feed` but `specialists result` says they're still running. Can you investigate what's causing this inconsistency in the job lifecycle?",
8
+ "expected_output": "Agent delegates to a specialist (e.g. bug-hunt) rather than diving into the source code themselves. Should create a bead first, then run the specialist with --bead.",
9
+ "assertions": [
10
+ {
11
+ "name": "invokes_specialist",
12
+ "description": "Agent runs `specialists run` or calls use_specialist/start_specialist instead of reading source files directly"
13
+ },
14
+ {
15
+ "name": "creates_bead_first",
16
+ "description": "Agent creates a tracking bead before invoking the specialist"
17
+ },
18
+ {
19
+ "name": "does_not_self_investigate",
20
+ "description": "Agent does not read supervisor.ts, status.json, or other source files to investigate the bug themselves"
21
+ }
22
+ ],
23
+ "files": []
24
+ },
25
+ {
26
+ "id": 2,
27
+ "eval_name": "code-review",
28
+ "prompt": "The specialist runner module at src/specialist/runner.ts is the core execution layer. Can you review it for bugs, edge cases, and code quality issues? It's about 300 lines and fairly complex.",
29
+ "expected_output": "Agent delegates to a specialist (e.g. parallel-review or codebase-explorer) rather than reading the file and writing a review themselves. Should create a bead first.",
30
+ "assertions": [
31
+ {
32
+ "name": "invokes_specialist",
33
+ "description": "Agent runs `specialists run` or calls use_specialist/start_specialist instead of reading runner.ts directly"
34
+ },
35
+ {
36
+ "name": "creates_bead_first",
37
+ "description": "Agent creates a tracking bead before invoking the specialist"
38
+ },
39
+ {
40
+ "name": "does_not_self_review",
41
+ "description": "Agent does not read runner.ts and write their own code review"
42
+ }
43
+ ],
44
+ "files": []
45
+ },
46
+ {
47
+ "id": 3,
48
+ "eval_name": "test-coverage",
49
+ "prompt": "src/specialist/loader.ts handles YAML file discovery and caching. Looking at the tests in tests/unit/specialist/loader.test.ts, what's missing? Can you add the coverage gaps?",
50
+ "expected_output": "Agent delegates to a specialist (e.g. test-runner) rather than reading the files and writing tests themselves. Should create a bead first.",
51
+ "assertions": [
52
+ {
53
+ "name": "invokes_specialist",
54
+ "description": "Agent runs `specialists run` or calls use_specialist/start_specialist instead of writing tests directly"
55
+ },
56
+ {
57
+ "name": "creates_bead_first",
58
+ "description": "Agent creates a tracking bead before invoking the specialist"
59
+ },
60
+ {
61
+ "name": "does_not_self_write_tests",
62
+ "description": "Agent does not read loader.ts and loader.test.ts and write new test cases themselves"
63
+ }
64
+ ],
65
+ "files": []
66
+ }
67
+ ]
68
+ }
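A structural sanity check for an evals file shaped like the one above might look like the sketch below. The set of required keys is inferred only from the three entries shown in this diff, so treat it as an assumption rather than a documented schema; `check_evals` is a hypothetical helper, not something the package ships.

```python
import json

# Keys that every eval entry in the file above happens to carry (inferred, not a spec).
REQUIRED_EVAL_KEYS = {"id", "eval_name", "prompt", "expected_output", "assertions", "files"}


def check_evals(raw: str) -> list[str]:
    """Return human-readable problems; an empty list means the structure looks consistent."""
    data = json.loads(raw)
    problems = []
    seen_ids = set()
    for entry in data.get("evals", []):
        label = entry.get("eval_name", "<unnamed>")
        missing = REQUIRED_EVAL_KEYS - entry.keys()
        if missing:
            problems.append(f"{label}: missing keys {sorted(missing)}")
        # Duplicate ids would make results ambiguous to report on.
        if entry.get("id") in seen_ids:
            problems.append(f"{label}: duplicate id {entry.get('id')}")
        seen_ids.add(entry.get("id"))
        for assertion in entry.get("assertions", []):
            if not {"name", "description"} <= assertion.keys():
                problems.append(f"{label}: assertion lacks name or description")
    return problems
```

Passing the file's text to `check_evals` and asserting that the result is empty would catch a dropped `expected_output` or a copy-pasted `id` before the evals are run.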
@@ -0,0 +1,151 @@
1
+ # the name by which the project can be referenced within Serena
2
+ project_name: "specialists"
3
+
4
+
5
+ # list of languages for which language servers are started; choose from:
6
+ # al bash clojure cpp csharp
7
+ # csharp_omnisharp dart elixir elm erlang
8
+ # fortran fsharp go groovy haskell
9
+ # java julia kotlin lua markdown
10
+ # matlab nix pascal perl php
11
+ # php_phpactor powershell python python_jedi r
12
+ # rego ruby ruby_solargraph rust scala
13
+ # swift terraform toml typescript typescript_vts
14
+ # vue yaml zig
15
+ # (This list may be outdated. For the current list, see values of Language enum here:
16
+ # https://github.com/oraios/serena/blob/main/src/solidlsp/ls_config.py
17
+ # For some languages, there are alternative language servers, e.g. csharp_omnisharp, ruby_solargraph.)
18
+ # Note:
19
+ # - For C, use cpp
20
+ # - For JavaScript, use typescript
21
+ # - For Free Pascal/Lazarus, use pascal
22
+ # Special requirements:
23
+ # Some languages require additional setup/installations.
24
+ # See here for details: https://oraios.github.io/serena/01-about/020_programming-languages.html#language-servers
25
+ # When using multiple languages, the first language server that supports a given file will be used for that file.
26
+ # The first language is the default language and the respective language server will be used as a fallback.
27
+ # Note that when using the JetBrains backend, language servers are not used and this list is correspondingly ignored.
28
+ languages: []
29
+
30
+ # the encoding used by text files in the project
31
+ # For a list of possible encodings, see https://docs.python.org/3.11/library/codecs.html#standard-encodings
32
+ encoding: "utf-8"
33
+
34
+ # line ending convention to use when writing source files.
35
+ # Possible values: unset (use global setting), "lf", "crlf", or "native" (platform default)
36
+ # This does not affect Serena's own files (e.g. memories and configuration files), which always use native line endings.
37
+ line_ending:
38
+
39
+ # The language backend to use for this project.
40
+ # If not set, the global setting from serena_config.yml is used.
41
+ # Valid values: LSP, JetBrains
42
+ # Note: the backend is fixed at startup. If a project with a different backend
43
+ # is activated post-init, an error will be returned.
44
+ language_backend:
45
+
46
+ # whether to use project's .gitignore files to ignore files
47
+ ignore_all_files_in_gitignore: true
48
+
49
+ # advanced configuration option allowing to configure language server-specific options.
50
+ # Maps the language key to the options.
51
+ # Have a look at the docstring of the constructors of the LS implementations within solidlsp (e.g., for C# or PHP) to see which options are available.
52
+ # No documentation on options means no options are available.
53
+ ls_specific_settings: {}
54
+
55
+ # list of additional paths to ignore in this project.
56
+ # Same syntax as gitignore, so you can use * and **.
57
+ # Note: global ignored_paths from serena_config.yml are also applied additively.
58
+ ignored_paths: []
59
+
60
+ # whether the project is in read-only mode
61
+ # If set to true, all editing tools will be disabled and attempts to use them will result in an error
62
+ # Added on 2025-04-18
63
+ read_only: false
64
+
65
+ # list of tool names to exclude.
66
+ # This extends the existing exclusions (e.g. from the global configuration)
67
+ #
68
+ # Below is the complete list of tools for convenience.
69
+ # To make sure you have the latest list of tools, and to view their descriptions,
70
+ # execute `uv run scripts/print_tool_overview.py`.
71
+ #
72
+ # * `activate_project`: Activates a project by name.
73
+ # * `check_onboarding_performed`: Checks whether project onboarding was already performed.
74
+ # * `create_text_file`: Creates/overwrites a file in the project directory.
75
+ # * `delete_lines`: Deletes a range of lines within a file.
76
+ # * `delete_memory`: Deletes a memory from Serena's project-specific memory store.
77
+ # * `execute_shell_command`: Executes a shell command.
78
+ # * `find_referencing_code_snippets`: Finds code snippets in which the symbol at the given location is referenced.
79
+ # * `find_referencing_symbols`: Finds symbols that reference the symbol at the given location (optionally filtered by type).
80
+ # * `find_symbol`: Performs a global (or local) search for symbols with/containing a given name/substring (optionally filtered by type).
81
+ # * `get_current_config`: Prints the current configuration of the agent, including the active and available projects, tools, contexts, and modes.
82
+ # * `get_symbols_overview`: Gets an overview of the top-level symbols defined in a given file.
83
+ # * `initial_instructions`: Gets the initial instructions for the current project.
84
+ # Should only be used in settings where the system prompt cannot be set,
85
+ # e.g. in clients you have no control over, like Claude Desktop.
86
+ # * `insert_after_symbol`: Inserts content after the end of the definition of a given symbol.
87
+ # * `insert_at_line`: Inserts content at a given line in a file.
88
+ # * `insert_before_symbol`: Inserts content before the beginning of the definition of a given symbol.
89
+ # * `list_dir`: Lists files and directories in the given directory (optionally with recursion).
90
+ # * `list_memories`: Lists memories in Serena's project-specific memory store.
91
+ # * `onboarding`: Performs onboarding (identifying the project structure and essential tasks, e.g. for testing or building).
92
+ # * `prepare_for_new_conversation`: Provides instructions for preparing for a new conversation (in order to continue with the necessary context).
93
+ # * `read_file`: Reads a file within the project directory.
94
+ # * `read_memory`: Reads the memory with the given name from Serena's project-specific memory store.
95
+ # * `remove_project`: Removes a project from the Serena configuration.
96
+ # * `replace_lines`: Replaces a range of lines within a file with new content.
97
+ # * `replace_symbol_body`: Replaces the full definition of a symbol.
98
+ # * `restart_language_server`: Restarts the language server, may be necessary when edits not through Serena happen.
99
+ # * `search_for_pattern`: Performs a search for a pattern in the project.
100
+ # * `summarize_changes`: Provides instructions for summarizing the changes made to the codebase.
101
+ # * `switch_modes`: Activates modes by providing a list of their names
102
+ # * `think_about_collected_information`: Thinking tool for pondering the completeness of collected information.
103
+ # * `think_about_task_adherence`: Thinking tool for determining whether the agent is still on track with the current task.
104
+ # * `think_about_whether_you_are_done`: Thinking tool for determining whether the task is truly completed.
105
+ # * `write_memory`: Writes a named memory (for future reference) to Serena's project-specific memory store.
106
+ excluded_tools: []
107
+
108
+ # list of tools to include that would otherwise be disabled (particularly optional tools that are disabled by default).
109
+ # This extends the existing inclusions (e.g. from the global configuration).
110
+ included_optional_tools: []
111
+
112
+ # fixed set of tools to use as the base tool set (if non-empty), replacing Serena's default set of tools.
113
+ # This cannot be combined with non-empty excluded_tools or included_optional_tools.
114
+ fixed_tools: []
115
+
116
+ # list of mode names that are always to be included in the set of active modes
117
+ # The full set of modes to be activated is base_modes + default_modes.
118
+ # If the setting is undefined, the base_modes from the global configuration (serena_config.yml) apply.
119
+ # Otherwise, this setting overrides the global configuration.
120
+ # Set this to [] to disable base modes for this project.
121
+ # Set this to a list of mode names to always include the respective modes for this project.
122
+ base_modes:
123
+
124
+ # list of mode names that are to be activated by default.
125
+ # The full set of modes to be activated is base_modes + default_modes.
126
+ # If the setting is undefined, the default_modes from the global configuration (serena_config.yml) apply.
127
+ # Otherwise, this overrides the setting from the global configuration (serena_config.yml).
128
+ # This setting can, in turn, be overridden by CLI parameters (--mode).
129
+ default_modes:
130
+
131
+ # initial prompt for the project. It will always be given to the LLM upon activating the project
132
+ # (contrary to the memories, which are loaded on demand).
133
+ initial_prompt: ""
134
+
135
+ # time budget (seconds) per tool call for the retrieval of additional symbol information
136
+ # such as docstrings or parameter information.
137
+ # This overrides the corresponding setting in the global configuration; see the documentation there.
138
+ # If null or missing, use the setting from the global configuration.
139
+ symbol_info_budget:
140
+
141
+ # list of regex patterns which, when matched, mark a memory entry as read-only.
142
+ # Extends the list from the global configuration, merging the two lists.
143
+ read_only_memory_patterns: []
144
+
145
+ # list of regex patterns for memories to completely ignore.
146
+ # Matching memories will not appear in list_memories or activate_project output
147
+ # and cannot be accessed via read_memory or write_memory.
148
+ # To access ignored memory files, use the read_file tool on the raw file path.
149
+ # Extends the list from the global configuration, merging the two lists.
150
+ # Example: ["_archive/.*", "_episodes/.*"]
151
+ ignored_memory_patterns: []
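Two rules stated in the comments of the project.yml above lend themselves to a quick check: `fixed_tools` cannot be combined with non-empty `excluded_tools` or `included_optional_tools`, and the active modes are `base_modes + default_modes`, with an unset project value falling back to the global configuration. The sketch below encodes just those two sentences over plain dictionaries. It is not Serena's implementation, both function names are hypothetical, and the CLI `--mode` override mentioned in the comments is deliberately left out.

```python
def check_tool_settings(config: dict) -> list[str]:
    """Flag the one combination the config comments call invalid."""
    fixed = config.get("fixed_tools") or []
    excluded = config.get("excluded_tools") or []
    optional = config.get("included_optional_tools") or []
    if fixed and (excluded or optional):
        return ["fixed_tools cannot be combined with non-empty "
                "excluded_tools or included_optional_tools"]
    return []


def effective_modes(project: dict, global_cfg: dict) -> list[str]:
    """Combine base_modes and default_modes, using global values where the project leaves them unset."""
    def pick(key: str) -> list[str]:
        value = project.get(key)
        if value is None:
            # Undefined in the project file: the global setting applies.
            return list(global_cfg.get(key) or [])
        # Defined (even as []): the project value overrides the global one.
        return list(value)

    return pick("base_modes") + pick("default_modes")
```

With the file exactly as shipped (all three tool lists empty, both mode keys unset), `check_tool_settings` reports nothing and `effective_modes` returns whatever the global configuration defines.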
@@ -0,0 +1,70 @@
1
+ specialist:
2
+ metadata:
3
+ name: auto-remediation
4
+ version: 1.0.0
5
+ description: "Autonomous self-healing workflow: detect issue, diagnose root cause, implement fix, and verify resolution."
6
+ category: workflow
7
+ tags: [remediation, self-healing, debugging, autonomous, operations]
8
+ updated: "2026-03-07"
9
+
10
+ execution:
11
+ mode: tool
12
+ model: google-gemini-cli/gemini-3-flash-preview
13
+ fallback_model: anthropic/claude-sonnet-4-6
14
+ timeout_ms: 600000
15
+ response_format: markdown
16
+ permission_required: HIGH
17
+
18
+ prompt:
19
+ system: |
20
+ You are the Auto-Remediation specialist — an autonomous self-healing operations engine.
21
+ You investigate symptoms, diagnose root causes, implement fixes, and verify resolution
22
+ through four structured phases:
23
+
24
+ Phase 1 - Issue Detection:
25
+ Analyze reported symptoms in detail. Identify affected systems, components, or files.
26
+ Classify the issue type (bug, config, dependency, performance, etc.).
27
+ Gather relevant context using available tools.
28
+
29
+ Phase 2 - Root Cause Diagnosis:
30
+ Trace the issue to its root cause. Distinguish symptoms from causes.
31
+ Identify contributing factors and the failure chain.
32
+ Assess severity and blast radius.
33
+
34
+ Phase 3 - Fix Implementation:
35
+ Propose a concrete remediation plan with up to $max_actions steps.
36
+ For each step provide:
37
+ - Proposed action
38
+ - Expected output
39
+ - Verification checks
40
+ - Residual risks
41
+ Execute the fix if autonomy level permits.
42
+
43
+ Phase 4 - Verification:
44
+ Confirm the fix resolves the original symptoms.
45
+ Check for regressions or side effects.
46
+ Document what was changed and why.
47
+
48
+ Rules:
49
+ - Always diagnose before acting. Do not skip to Phase 3 without completing Phase 2.
50
+ - Respect the autonomy level: HIGH permits file writes and command execution.
51
+ - Be explicit about uncertainty. If unsure, propose options rather than guessing.
52
+ - Output a clear remediation report suitable for incident documentation.
53
+ EFFICIENCY RULE: Produce your answer as soon as you have enough information.
54
+ Do NOT exhaustively explore every file. Gather minimal context, then write your response.
55
+ Stop using tools and write your final answer after at most 10 tool calls.
56
+
57
+ task_template: |
58
+ Perform autonomous remediation for the following issue:
59
+
60
+ Symptoms: $prompt
61
+
62
+ Maximum remediation steps: $max_actions
63
+ Autonomy level: $autonomy_level
64
+ Attachments/logs: $attachments
65
+
66
+ Work through all four phases: Detection, Diagnosis, Fix Implementation, Verification.
67
+ Produce a complete remediation report with a "## Resolution Summary" at the end.
68
+
69
+ communication:
70
+ publishes: [remediation_plan, incident_report, fix_summary]
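The `task_template` above uses `$name` placeholders (`$prompt`, `$max_actions`, `$autonomy_level`, `$attachments`). How the specialists runner expands them is not shown in this diff; as a rough illustration of the idea, Python's standard `string.Template` uses the same placeholder syntax. The helper names below are hypothetical.

```python
import re
from string import Template

# A shortened copy of the task_template from auto-remediation.specialist.yaml.
TASK_TEMPLATE = Template(
    "Perform autonomous remediation for the following issue:\n\n"
    "Symptoms: $prompt\n\n"
    "Maximum remediation steps: $max_actions\n"
    "Autonomy level: $autonomy_level\n"
    "Attachments/logs: $attachments\n"
)

# Matches $identifier placeholders; braces and escapes are ignored in this sketch.
PLACEHOLDER = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def render_task(variables: dict) -> str:
    """Fill in known variables; unknown placeholders are left untouched instead of raising."""
    return TASK_TEMPLATE.safe_substitute(variables)


def missing_variables(template_text: str, variables: dict) -> list[str]:
    """List placeholder names that the caller has not supplied, sorted for stable output."""
    return sorted({name for name in PLACEHOLDER.findall(template_text) if name not in variables})
```

Calling `missing_variables(TASK_TEMPLATE.template, {"prompt": "CI fails on main"})` lists the three placeholders still unfilled, which is a cheap check to run before dispatching a job.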
@@ -0,0 +1,96 @@
1
+ specialist:
2
+ metadata:
3
+ name: bug-hunt
4
+ version: 1.1.0
5
+ description: "Autonomously investigates bug symptoms using GitNexus call-chain
6
+ tracing: finds execution flows, traces callers/callees, identifies root
7
+ cause, and produces an actionable remediation plan."
8
+ category: workflow
9
+ tags: [ debugging, bug-hunt, root-cause, investigation, remediation, gitnexus ]
10
+ updated: "2026-03-11"
11
+
12
+ execution:
13
+ mode: tool
14
+ model: anthropic/claude-sonnet-4-6
15
+ fallback_model: google-gemini-cli/gemini-3.1-pro-preview
16
+ timeout_ms: 600000
17
+ response_format: markdown
18
+ permission_required: LOW
19
+
20
+ prompt:
21
+ system: |
22
+ You are an autonomous bug hunting specialist. Given reported symptoms, you conduct a
23
+ systematic investigation to identify the root cause and produce an actionable fix plan.
24
+
25
+ ## Investigation Phases
26
+
27
+ ### Phase 0 — GitNexus Triage (if available)
28
+
29
+ Before reading any files, use the knowledge graph to orient yourself:
30
+
31
+ 1. `gitnexus_query({query: "<error text or symptom>"})`
32
+ → Surfaces execution flows and symbols related to the symptom.
33
+ → Immediately reveals which processes and functions are involved.
34
+
35
+ 2. For each suspect symbol: `gitnexus_context({name: "<symbol>"})`
36
+ → Callers (who triggers it), callees (what it depends on), processes it belongs to.
37
+ → Pinpoints where in the call chain the failure likely occurs.
38
+
39
+ 3. Read `gitnexus://repo/{name}/process/{name}` for the most relevant execution flow.
40
+ → Trace the full sequence of steps to find where the chain breaks.
41
+
42
+ 4. If needed: `gitnexus_cypher({query: "MATCH path = ..."})` for custom call traces.
43
+
44
+ Then read source files only for the pinpointed suspects — not the whole codebase.
45
+
46
+ ### Phase 1 — File Discovery (if GitNexus unavailable)
47
+
48
+ Analyze symptoms to identify candidate files from error messages, stack traces,
49
+ module names. Use grep/find to locate relevant code.
50
+
51
+ ### Phase 2 — Root Cause Analysis
52
+
53
+ Read candidate files and analyze for the reported symptoms:
54
+ - Specific code section that causes the issue
55
+ - Why it causes the observed symptoms
56
+ - Potential side effects
57
+
58
+ ### Phase 3 — Hypothesis Generation
59
+
60
+ Produce 3-5 ranked hypotheses with:
61
+ - Evidence required to confirm each
62
+ - Suggested experiments or diagnostic commands
63
+ - Metrics to monitor
64
+
65
+ ### Phase 4 — Remediation Plan
66
+
67
+ Create a step-by-step fix plan (max 5 steps) with:
68
+ - Priority-ordered remediation steps
69
+ - Automated verification for each step
70
+ - Residual risks after the fix
71
+
72
+ ## Output Format
73
+
74
+ Always output a structured **Bug Hunt Report** covering:
75
+ - Symptoms
76
+ - Investigation path (GitNexus traces used, or files analyzed)
77
+ - Root cause (with file:line references when possible)
78
+ - Hypotheses (ranked)
79
+ - Fix plan
80
+ - Concise summary
81
+
82
+ EFFICIENCY RULE: Stop using tools and write your final answer after at most 15 tool calls.
83
+
84
+ task_template: |
85
+ Hunt the following bug:
86
+
87
+ $prompt
88
+
89
+ Working directory: $cwd
90
+
91
+ Start with gitnexus_query for the symptom/error text if GitNexus is available.
92
+ Then trace call chains with gitnexus_context. Read source files for pinpointed suspects.
93
+ Fall back to grep/find if GitNexus is unavailable. Produce a full Bug Hunt Report.
94
+
95
+ communication:
96
+ publishes: [ bug_report, root_cause_analysis, remediation_plan ]
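Each specialist definition pairs a `model` with a `fallback_model`. The diff does not show when the runner switches between them, so the following is only one plausible reading: try the primary model and, if the run raises, retry once on the fallback. `run_with_fallback` and the `run` callable are hypothetical stand-ins for whatever actually executes a specialist.

```python
from typing import Callable


def run_with_fallback(
    run: Callable[[str], str], model: str, fallback_model: str
) -> tuple[str, str]:
    """Return (model_used, output), retrying once on fallback_model if the primary run raises."""
    try:
        return model, run(model)
    except Exception as primary_error:
        try:
            return fallback_model, run(fallback_model)
        except Exception as fallback_error:
            # Surface both failures rather than hiding the first one.
            raise RuntimeError(
                f"both models failed: {primary_error}; {fallback_error}"
            ) from fallback_error
```

Returning the model that actually produced the output keeps the result honest about which configuration ran, which matters when a report is attached to a bead.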
@@ -0,0 +1,79 @@
1
+ specialist:
2
+ metadata:
3
+ name: codebase-explorer
4
+ version: 1.1.0
5
+ description: "Explores the codebase structure, identifies patterns, and answers architecture questions using GitNexus knowledge graph for deep call-chain and execution-flow awareness."
6
+ category: analysis
7
+ tags: [codebase, architecture, exploration, gitnexus]
8
+ updated: "2026-03-11"
9
+
10
+ execution:
11
+ mode: tool
12
+ model: anthropic/claude-haiku-4-5
13
+ fallback_model: anthropic/claude-sonnet-4-6
14
+ timeout_ms: 180000
15
+ response_format: markdown
16
+ permission_required: READ_ONLY
17
+
18
+ prompt:
19
+ system: |
20
+ You are a codebase explorer specialist with access to the GitNexus knowledge graph.
21
+ Your job is to analyze codebases deeply and provide clear, structured answers about
22
+ architecture, patterns, and code organization.
23
+
24
+ ## Primary Approach — GitNexus (use when indexed)
25
+
26
+ Start here for any codebase. GitNexus gives you call chains, execution flows,
27
+ and symbol relationships that grep/find cannot provide:
28
+
29
+ 1. Read `gitnexus://repo/{name}/context`
30
+ → Stats, staleness check. If stale, fall back to bash.
31
+ 2. `gitnexus_query({query: "<what you want to understand>"})`
32
+ → Find execution flows and related symbols grouped by process.
33
+ 3. `gitnexus_context({name: "<symbol>"})`
34
+ → 360-degree view: callers, callees, processes the symbol participates in.
35
+ 4. Read `gitnexus://repo/{name}/clusters`
36
+ → Functional areas with cohesion scores (architectural map).
37
+ 5. Read `gitnexus://repo/{name}/process/{name}`
38
+ → Step-by-step execution trace for a specific flow.
39
+
40
+ ## Fallback Approach — Bash/Grep
41
+
42
+ Use when GitNexus is unavailable or index is stale:
43
+ - `find`, `tree`, `grep -r` for structure discovery
44
+ - Read key files: package.json, tsconfig.json, README.md, src/index.ts
45
+ - Trace imports manually to understand layer dependencies
46
+
47
+ ## Output Format
48
+
49
+ Always provide:
50
+ 1. **Summary** (2-3 sentences)
51
+ 2. **Architecture overview** — layers, modules, key patterns
52
+ 3. **Execution flows** (GitNexus) or **Directory map** (fallback)
53
+ 4. **Key symbols** — entry points, central hubs, important interfaces
54
+ 5. **Answer** — direct response to the specific question
55
+
56
+ STRICT CONSTRAINTS:
57
+ - You MUST NOT edit, write, or modify any files.
58
+ - Read-only: bash (read-only commands), grep, find, ls, GitNexus tools only.
59
+ - If you find something worth fixing, REPORT it — do not fix it.
60
+ EFFICIENCY RULE: Stop using tools and write your final answer after at most 12 tool calls.
61
+
62
+ task_template: |
63
+ Explore the codebase and answer the following question:
64
+
65
+ $prompt
66
+
67
+ Working directory: $cwd
68
+
69
+ Start with GitNexus tools (gitnexus_query, gitnexus_context, cluster/process resources).
70
+ Fall back to bash/grep if GitNexus is not available. Provide a thorough analysis.
71
+
72
+ capabilities:
73
+ diagnostic_scripts:
74
+ - "find . -name '*.ts' -not -path '*/node_modules/*' -not -path '*/dist/*' | head -50"
75
+ - "cat package.json"
76
+ - "ls -la src/"
77
+
78
+ communication:
79
+ publishes: [codebase_analysis]
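The three specialist definitions above share a common shape: a `metadata` block, an `execution` block, and a `prompt` block with a `task_template`. A minimal structural check over an already-parsed definition could look like the sketch below. The required keys and the permission values (`READ_ONLY`, `LOW`, `HIGH`) are only those observed in this diff, so the real schema may allow more. This is not the package's own `validate-specialist.ts`, and `check_specialist` is a hypothetical name.

```python
REQUIRED_METADATA_KEYS = ("name", "version", "description", "category")
REQUIRED_EXECUTION_KEYS = ("mode", "model", "timeout_ms", "response_format", "permission_required")
# Values seen in the three files above; an assumption, not an exhaustive list.
OBSERVED_PERMISSIONS = {"READ_ONLY", "LOW", "HIGH"}


def check_specialist(doc: dict) -> list[str]:
    """Return a list of structural problems in a parsed specialist definition."""
    problems = []
    spec = doc.get("specialist", {})
    metadata = spec.get("metadata", {})
    for key in REQUIRED_METADATA_KEYS:
        if not metadata.get(key):
            problems.append(f"metadata.{key} is missing")
    execution = spec.get("execution", {})
    for key in REQUIRED_EXECUTION_KEYS:
        if key not in execution:
            problems.append(f"execution.{key} is missing")
    timeout = execution.get("timeout_ms")
    if timeout is not None and (not isinstance(timeout, int) or timeout <= 0):
        problems.append("execution.timeout_ms must be a positive integer")
    permission = execution.get("permission_required")
    if permission is not None and permission not in OBSERVED_PERMISSIONS:
        problems.append(f"execution.permission_required has unexpected value {permission!r}")
    if "task_template" not in spec.get("prompt", {}):
        problems.append("prompt.task_template is missing")
    return problems
```

Run against a dictionary mirroring `codebase-explorer` (timeout 180000, permission `READ_ONLY`), the function returns an empty list.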