agentic-advisor 0.7.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (48)
  1. agentic_advisor-0.7.1/.agent/workflows/advisor-security.md +51 -0
  2. agentic_advisor-0.7.1/.agent/workflows/advisor-session.md +59 -0
  3. agentic_advisor-0.7.1/.agent/workflows/advisor-setup.md +18 -0
  4. agentic_advisor-0.7.1/.gitignore +13 -0
  5. agentic_advisor-0.7.1/.mcp.json +9 -0
  6. agentic_advisor-0.7.1/.skills/agentic-advisor.md +46 -0
  7. agentic_advisor-0.7.1/AGENTS.md +58 -0
  8. agentic_advisor-0.7.1/CHANGELOG.md +195 -0
  9. agentic_advisor-0.7.1/CLAUDE.md +98 -0
  10. agentic_advisor-0.7.1/DEEP_THINK_PROMPT.md +109 -0
  11. agentic_advisor-0.7.1/PKG-INFO +249 -0
  12. agentic_advisor-0.7.1/README.md +232 -0
  13. agentic_advisor-0.7.1/evals/conftest.py +10 -0
  14. agentic_advisor-0.7.1/evals/eval_runner.py +129 -0
  15. agentic_advisor-0.7.1/evals/test_retriever_evals.py +67 -0
  16. agentic_advisor-0.7.1/evals/test_router_evals.py +122 -0
  17. agentic_advisor-0.7.1/evals/test_scanner_evals.py +142 -0
  18. agentic_advisor-0.7.1/pyproject.toml +36 -0
  19. agentic_advisor-0.7.1/src/agentic_advisor/__init__.py +7 -0
  20. agentic_advisor-0.7.1/src/agentic_advisor/errors.py +152 -0
  21. agentic_advisor-0.7.1/src/agentic_advisor/knowledge/__init__.py +0 -0
  22. agentic_advisor-0.7.1/src/agentic_advisor/knowledge/loader.py +384 -0
  23. agentic_advisor-0.7.1/src/agentic_advisor/knowledge/retriever.py +73 -0
  24. agentic_advisor-0.7.1/src/agentic_advisor/knowledge/updater.py +169 -0
  25. agentic_advisor-0.7.1/src/agentic_advisor/knowledge/vector_store.py +60 -0
  26. agentic_advisor-0.7.1/src/agentic_advisor/proactive/__init__.py +0 -0
  27. agentic_advisor-0.7.1/src/agentic_advisor/proactive/aibom.py +222 -0
  28. agentic_advisor-0.7.1/src/agentic_advisor/proactive/approval.py +270 -0
  29. agentic_advisor-0.7.1/src/agentic_advisor/proactive/architecture.py +54 -0
  30. agentic_advisor-0.7.1/src/agentic_advisor/proactive/briefing.py +194 -0
  31. agentic_advisor-0.7.1/src/agentic_advisor/proactive/checkpointing.py +146 -0
  32. agentic_advisor-0.7.1/src/agentic_advisor/proactive/circuit_breaker.py +245 -0
  33. agentic_advisor-0.7.1/src/agentic_advisor/proactive/hooks.py +404 -0
  34. agentic_advisor-0.7.1/src/agentic_advisor/proactive/memory.py +71 -0
  35. agentic_advisor-0.7.1/src/agentic_advisor/proactive/multiplexer.py +191 -0
  36. agentic_advisor-0.7.1/src/agentic_advisor/proactive/notifications.py +136 -0
  37. agentic_advisor-0.7.1/src/agentic_advisor/proactive/scanner.py +451 -0
  38. agentic_advisor-0.7.1/src/agentic_advisor/proactive/summarizer.py +177 -0
  39. agentic_advisor-0.7.1/src/agentic_advisor/proactive/telemetry.py +252 -0
  40. agentic_advisor-0.7.1/src/agentic_advisor/routing/__init__.py +0 -0
  41. agentic_advisor-0.7.1/src/agentic_advisor/routing/router.py +351 -0
  42. agentic_advisor-0.7.1/src/agentic_advisor/server.py +1193 -0
  43. agentic_advisor-0.7.1/src/agentic_advisor/setup/__init__.py +0 -0
  44. agentic_advisor-0.7.1/src/agentic_advisor/setup/a2a.py +44 -0
  45. agentic_advisor-0.7.1/src/agentic_advisor/setup/detector.py +206 -0
  46. agentic_advisor-0.7.1/src/agentic_advisor/setup/generator.py +286 -0
  47. agentic_advisor-0.7.1/src/agentic_advisor/setup/templates.py +529 -0
  48. agentic_advisor-0.7.1/tests/test_advisor.py +483 -0
@@ -0,0 +1,51 @@
1
+ ---
2
+ description: Pre-commit security review — scan diffs, validate deps, check circuit breaker
3
+ ---
4
+
5
+ # Advisor Security Workflow
6
+
7
+ Run this workflow before every commit to catch secrets, risky patterns, and hallucinated packages.
8
+
9
+ ## Steps
10
+
11
+ ### 1. Scan the Diff
12
+ // turbo
13
+ Run `scan_diff(diff_text)` with the output of `git diff --cached`.
14
+
15
+ The scanner checks:
16
+ - **17 regex patterns** — Anthropic, OpenAI, AWS, Google, GitHub, Stripe, Twilio, Slack keys
17
+ - **Shannon entropy** — High-randomness strings (>4.5 bits/char) in hex/base64 charset
18
+ - **10 risky patterns** — `eval()`, `exec()`, `shell=True`, SQL injection, `pickle.loads()`, disabled SSL
19
+
20
+ If `is_clean` is `false`, fix the findings before committing.
21
+
22
+ ### 2. Validate Dependencies
23
+ For each new dependency added in this commit, run:
24
+ ```
25
+ validate_dependency(package_name)
26
+ ```
27
+
28
+ Verdicts:
29
+ - `confirmed_real` — Exists on PyPI or npm (live verified)
30
+ - `likely_real` — Matches known-real prefix list
31
+ - `suspicious` — Heuristic flags (long name, AI SDK combo, gibberish)
32
+ - `not_found` — **Does NOT exist on any registry. Do NOT install.**
33
+
34
+ ### 3. Check Circuit Breaker
35
+ // turbo
36
+ Call `get_circuit_status()` to verify you haven't been stuck in a death loop.
37
+
38
+ If `tripped` is `true`:
39
+ 1. Do NOT commit
40
+ 2. Write a summary of the failure to `DECISIONS.md`
41
+ 3. Call `reset_circuit()` only after the human has reviewed and provided guidance
42
+
43
+ ### 4. Commit
44
+ If all checks pass, proceed with the commit. The `post_tool_use` hook will automatically log this action to `.claude/audit.log`.
45
+
46
+ ### 5. Revert if Needed
47
+ If the commit breaks something critical, call:
48
+ ```
49
+ revert_to_checkpoint(directory)
50
+ ```
51
+ This hard-resets to the last advisor checkpoint commit.
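The Shannon entropy check named in step 1 of the workflow above can be sketched as follows. This is a minimal illustration, not the package's scanner: the function names and the `min_length` cutoff are assumptions, and only the 4.5 bits/char threshold comes from the workflow text.

```python
import math
from collections import Counter

# Threshold quoted in the workflow text (bits per character).
ENTROPY_THRESHOLD = 4.5


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    length = len(text)
    return -sum((n / length) * math.log2(n / length) for n in counts.values())


def looks_like_secret(token: str, min_length: int = 20) -> bool:
    """Flag long, high-randomness tokens.

    `min_length` is a hypothetical cutoff added for this sketch so that
    short words are never flagged; it is not taken from the source.
    """
    return len(token) >= min_length and shannon_entropy(token) > ENTROPY_THRESHOLD
```

A string drawn only from the 16 hex digits cannot exceed 4 bits per character, so a 4.5 threshold can only fire on wider alphabets such as base64.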
@@ -0,0 +1,59 @@
1
+ ---
2
+ description: Full agentic-advisor session lifecycle — from briefing to ROI dashboard
3
+ ---
4
+
5
+ # Advisor Session Workflow
6
+
7
+ The canonical execution loop for any AI coding session using the agentic-advisor MCP.
8
+
9
+ ## Steps
10
+
11
+ ### 1. Session Briefing
12
+ // turbo
13
+ Call `get_session_briefing()` to check project health, verify MCP connections, and get session tips.
14
+
15
+ ### 2. Check Circuit Breaker
16
+ // turbo
17
+ Call `get_circuit_status()` to verify you are not in a tripped state from a previous session.
18
+
19
+ ### 3. Get Next Task
20
+ // turbo
21
+ Call `whats_next()` to read the next unchecked task from `tasks.md`. This returns:
22
+ - `next_task` — the task description
23
+ - `phase` — which phase of the plan you're in
24
+ - `progress_pct` — completion percentage
25
+ - `completed` / `total` — task counts
26
+
27
+ If `status` is `"all_done"`, skip to step 7.
28
+
29
+ ### 4. Implement the Task
30
+ Use your native IDE tools (file edits, terminal, browser) to implement the task described in step 3. Follow the `requirements.md` and `design.md` specs.
31
+
32
+ If a test fails after modifying a file, call:
33
+ ```
34
+ record_loop_event(file_path, test_command, error_output)
35
+ ```
36
+ Check the response — if `tripped` is `true`, STOP immediately and write a failure summary to `DECISIONS.md`.
37
+
38
+ ### 5. Pre-Commit Security Review
39
+ Before committing, run:
40
+ ```
41
+ scan_diff(diff_text)  # diff_text is the output of `git diff --cached`
42
+ ```
43
+ If any secrets or HIGH-severity patterns are found, fix them before proceeding.
44
+
45
+ For any new dependencies, validate them:
46
+ ```
47
+ validate_dependency(package_name)
48
+ ```
49
+
50
+ ### 6. Mark Task Complete
51
+ Call `mark_done()` to check off the task in `tasks.md`. Then loop back to step 3.
52
+
53
+ ### 7. Session Analytics
54
+ // turbo
55
+ At session end, call `get_session_analytics()` to review:
56
+ - Loop velocity (time per task)
57
+ - Circuit breaker trips
58
+ - Knowledge gaps (low-score RAG queries)
59
+ - Estimated hours saved
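To make step 3 of the session workflow concrete, here is a rough sketch of how a `tasks.md` checklist could be reduced to the fields the workflow lists (`next_task`, `phase`, `progress_pct`, `completed`, `total`). It assumes GitHub-style `- [ ]` / `- [x]` checkboxes grouped under markdown headings. The function name, the regexes, and the `"in_progress"` status value are illustrative assumptions; only `"all_done"` appears in the source.

```python
import re
from typing import Any, Optional

# Assumed checklist syntax: "- [ ] task" or "- [x] task" under "## Phase" headings.
TASK_RE = re.compile(r"^\s*[-*]\s+\[([ xX])\]\s+(.*)$")
HEADING_RE = re.compile(r"^#{1,6}\s+(.*)$")


def whats_next_sketch(tasks_md: str) -> dict[str, Any]:
    """Return the first unchecked task plus simple progress counters."""
    current_phase: Optional[str] = None
    next_task: Optional[str] = None
    next_phase: Optional[str] = None
    completed = 0
    total = 0

    for line in tasks_md.splitlines():
        heading = HEADING_RE.match(line)
        if heading:
            current_phase = heading.group(1).strip()
            continue
        task = TASK_RE.match(line)
        if task is None:
            continue
        total += 1
        if task.group(1).lower() == "x":
            completed += 1
        elif next_task is None:
            # Remember only the first unchecked item and the phase it sits in.
            next_task = task.group(2).strip()
            next_phase = current_phase

    return {
        # "in_progress" is a placeholder status invented for this sketch.
        "status": "all_done" if next_task is None else "in_progress",
        "next_task": next_task,
        "phase": next_phase,
        "completed": completed,
        "total": total,
        "progress_pct": round(100 * completed / total) if total else 0,
    }
```

An agent loop would call this after every `mark_done()` and stop iterating once `status` equals `"all_done"`.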
@@ -0,0 +1,18 @@
1
+ ---
2
+ description: Set up an optimal agentic coding environment for the current project
3
+ ---
4
+
5
+ # Setup Agentic Environment
6
+
7
+ 1. Call the `assess_project` tool on the agentic-advisor MCP with the current directory
8
+ 2. Review the recommended configuration (MCP stack, CLAUDE.md / AGENTS.md)
9
+ 3. Call `setup_project` to write the configuration files (CLAUDE.md, AGENTS.md, Skills, Workflows)
10
+ 4. Install the recommended MCP servers from the install commands provided
11
+ // turbo
12
+ 5. Run `git add CLAUDE.md AGENTS.md .skills/ .agent/` and commit "chore: add agentic advisor config"
13
+
14
+ ## Optional: Spec-Driven Setup for New Features
15
+ After setup, if starting a new feature:
16
+ 6. Call `create_spec(feature_name)` to generate requirements.md, design.md, tasks.md
17
+ 7. Review and fill in the spec files before asking an agent to implement
18
+ 8. Run `git add requirements.md design.md tasks.md` and commit "docs: add spec for {feature}"
@@ -0,0 +1,13 @@
1
+ __pycache__/
2
+ *.py[cod]
3
+ *.egg-info/
4
+ .eggs/
5
+ dist/
6
+ build/
7
+ *.egg
8
+ .venv/
9
+ venv/
10
+ .env
11
+ *.db
12
+ *.sqlite3
13
+ .chroma/
@@ -0,0 +1,9 @@
1
+ {
2
+ "mcpServers": {
3
+ "agentic-advisor": {
4
+ "command": "/Users/kenthall/Developer/agentic-advisor/.venv/bin/python",
5
+ "args": ["-m", "agentic_advisor.server"],
6
+ "cwd": "/Users/kenthall/Developer/agentic-advisor/src"
7
+ }
8
+ }
9
+ }
@@ -0,0 +1,46 @@
1
+ ---
2
+ name: agentic-advisor
3
+ description: >
4
+ Consult the agentic-advisor MCP to get routing recommendations, best-practice guidance,
5
+ spec-driven development support, and project setup help. Use this when you need to know
6
+ which tool to use, how to structure a workflow, want a session briefing, or need to
7
+ generate spec files before starting implementation.
8
+ ---
9
+
10
+ # Agentic Advisor Skill
11
+
12
+ When this skill is active, consult the agentic-advisor MCP server at key decision points:
13
+
14
+ ## At Session Start
15
+ 1. Read the `advisor://briefing` resource for a project health summary
16
+ 2. Or call `get_session_briefing()` with the current project directory to receive:
17
+ - Project type and recommended MCP stack
18
+ - Missing configuration warnings
19
+ - Recommended approach for today's work
20
+
21
+ ## Before Starting a New Feature
22
+ Call `create_spec(feature_name)` to generate:
23
+ - `requirements.md` — what to build (user stories, acceptance criteria)
24
+ - `design.md` — how to build it (architecture, key decisions)
25
+ - `tasks.md` — ordered implementation checklist
26
+
27
+ ## When Choosing a Tool
28
+ Call `route_task` with a description of what you need to do.
29
+ The advisor will tell you exactly which MCP and tool to use, including new categories:
30
+ - Memory/persistence → mcp-memory-service
31
+ - Multi-agent coordination → Agent-MCP / git worktrees
32
+ - Task tracking → linear-mcp
33
+ - Code health → codescene-mcp
34
+
35
+ ## When Unsure About Best Practice
36
+ Call `ask_advisor` with your question to get knowledge-base-grounded guidance.
37
+
38
+ ## Trigger Phrases
39
+ This skill activates when the user says:
40
+ - "set up this project"
41
+ - "what's the best way to..."
42
+ - "which MCP should I use"
43
+ - "ask the advisor"
44
+ - "get a session briefing"
45
+ - "create a spec for..."
46
+ - "generate requirements for..."
@@ -0,0 +1,58 @@
1
+ # AGENTS.md — python
2
+
3
+ This file configures AI agents (Antigravity, OpenAI Codex, GitHub Copilot agent mode) for this project.
4
+ It is the authoritative contract between humans and agents — read it before every task.
5
+
6
+ ## Role & Goal
7
+ You are an expert Python developer. Your goal is to implement the requested task
8
+ with correctness, security, and minimal scope. Implement what is asked; do not add unrequested features.
9
+
10
+ ## Capabilities
11
+ - Read, write, and refactor python code
12
+ - Run `pytest` to validate changes
13
+ - Use the MCP servers listed below to perform specialist tasks
14
+ - Generate and follow spec files (`requirements.md`, `design.md`, `tasks.md`)
15
+
16
+ ## Active MCPs
17
+ - context7
18
+
19
+ For task routing decisions, call `route_task()` on `agentic-advisor` MCP.
20
+ For best-practice questions, call `ask_advisor()` on `agentic-advisor` MCP.
21
+
22
+ ## Spec-Driven Workflow
23
+ If `requirements.md` and `tasks.md` exist in the project root:
24
+ 1. Read them before writing any code
25
+ 2. Work through `tasks.md` items in order, checking off each when complete
26
+ 3. Do not deviate from the spec without surfacing the conflict to the user
27
+
28
+ ## Boundaries
29
+ - Only modify files in the directories specified in each task
30
+ - Do not install new packages without asking first
31
+ - Do not run destructive shell commands (`rm -rf`, `DROP TABLE`, etc.) without explicit confirmation
32
+ - Do not commit or push to `main`/`master` directly — create a branch and open a PR
33
+ - Do not modify CI/CD configs, deployment manifests, or `.env` files without explicit instruction
34
+
35
+ ## Human-in-the-Loop
36
+ Surface to the user and wait for confirmation before:
37
+ - Deleting or renaming files
38
+ - Making schema migrations
39
+ - Changing authentication or authorization logic
40
+ - Adding new external dependencies
41
+
42
+ ## Security
43
+ - Never write hardcoded secrets, tokens, or passwords — not even as `TODO` placeholders
44
+ - All user input must be validated and sanitized before use
45
+ - Use environment variables for all configuration values
46
+ - When installing packages, verify the exact name on the registry (typosquat prevention)
47
+
48
+ ## Output Format
49
+ After completing a task:
50
+ 1. List files changed and the nature of each change
51
+ 2. Confirm tests pass: `pytest`
52
+ 3. Summarize what behavior changed and why
53
+
54
+ ## Code Style
55
+ - Language: python
56
+ - Tests required for all new functions
57
+ - Comments in English only
58
+ - Keep commits atomic: one logical change per commit
@@ -0,0 +1,195 @@
1
+ # Changelog — agentic-advisor
2
+
3
+ All notable changes to this project will be documented here.
4
+ Format based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
5
+
6
+ ---
7
+
8
+ ## [0.7.1] — 2026-02-26
9
+
10
+ ### Added
11
+
12
+ - **`summarize_memory(directory, max_entries)`** — Compacts NOTES.md by summarizing older entries into a single section. Keeps the N newest entries verbatim (default: 10).
13
+ - **`knowledge/vector_store.py`** — Formal `VectorStoreAdapter` ABC for knowledge base backends. Makes adding ChromaDB, FAISS, or Pinecone trivial by inheriting from the interface.
14
+ - **`errors.py`** — Structured error recovery with 8 error type classifiers (`file_not_found`, `network_error`, `permission_denied`, `parse_error`, `validation_error`, `dependency_missing`, `data_structure_error`, `path_type_error`). Returns actionable recovery hints to help LLMs self-correct.
15
+
16
+ ### Improved
17
+
18
+ - **All 13 tool error handlers** now return structured `advisor_error()` responses instead of bare `{"error": str(e)}`.
19
+ - **Session briefing** warns when NOTES.md exceeds 50 entries, suggesting `summarize_memory()`.
20
+ - `SemanticKnowledgeBase` and `TFIDFKnowledgeBase` now document their adherence to the `VectorStoreAdapter` interface.
21
+
22
+ ### Infrastructure
23
+
24
+ - Version bumped to `0.7.1`. Total: **27 tools, 4 prompts, 9 resources**.
25
+ - Unit tests expanded with `TestSummarizer`, `TestVectorStoreAdapter`, and `TestErrorRecovery` (13 new tests).
26
+
27
+ ---
28
+
29
+ ### Added — 5 new tools
30
+
31
+ - **`request_approval(action, risk_level, details)`** — Submit high-stakes actions for human approval. Risk levels: `low` (auto-approve), `medium`, `high` (blocks), `critical` (blocks + notification).
32
+ - **`check_approval(approval_id)`** — Check the status of a pending approval request.
33
+ - **`list_pending_approvals()`** — List all pending approval requests awaiting human review.
34
+ - **`grant_approval(approval_id, note)`** — Grant approval for a pending request (human-facing).
35
+ - **`deny_approval(approval_id, note)`** — Deny a pending approval request (human-facing).
36
+
37
+ ### Added — 5 new MCP Resources
38
+
39
+ - **`advisor://memory`** — Current NOTES.md contents served as a resource (avoids tool call overhead).
40
+ - **`advisor://circuit-status`** — Circuit breaker state as a readable resource.
41
+ - **`advisor://aibom`** — Last generated AIBOM compliance artifact.
42
+ - **`advisor://alerts`** — Proactive alerts from scanner and circuit breaker (poll-based notifications).
43
+ - **`advisor://pending-approvals`** — Pending human approval requests.
44
+
45
+ ### Added — 2 new modules
46
+
47
+ - `proactive/notifications.py` — Notification queue + manager for proactive alerts. Integrated into circuit_breaker and scanner for auto-queuing.
48
+ - `proactive/approval.py` — Human-in-the-loop approval gate with risk-based auto-approve logic.
49
+
50
+ ### Added — Evaluation Framework
51
+
52
+ - `evals/` directory with 45+ evaluation cases:
53
+ - `test_router_evals.py` — 20+ routing accuracy cases across 8 task categories
54
+ - `test_scanner_evals.py` — 15+ scanner detection cases (secrets, entropy, diffs, dependencies)
55
+ - `test_retriever_evals.py` — 10+ retrieval quality and relevance cases
56
+ - `eval_runner.py` — CLI tool for running all evals with quality report
57
+
58
+ ### Improved
59
+
60
+ - **Circuit Breaker** — Now auto-queues notifications on both test-failure and semantic loop trips.
61
+ - **Scanner** — Now auto-queues critical alerts when secrets are detected.
62
+ - Unit tests expanded with `TestNotifications` and `TestApprovalGate` classes (12 new tests).
63
+
64
+ ### Infrastructure
65
+
66
+ - Version bumped to `0.7.0`. Total: **26 tools, 4 prompts, 9 resources**.
67
+ - `pyproject.toml` updated: evals added to testpaths.
68
+
69
+ ---
70
+
71
+ ## [0.6.0] — 2026-02-26
72
+
73
+ ### Added — 3 new tools & Protocols
74
+
75
+ - **A2A Protocol Support** — `agent-card.json` generation and Orchestrator role scaffolding for multi-agent workflows.
76
+ - **`read_agentic_memory()`** and **`write_agentic_memory(topic, content)`** — Persistent `NOTES.md` storage to manage long-horizon context.
77
+ - **`record_semantic_event(reasoning, tool_calls)`** — New semantic layer for the circuit breaker tracking repeating thoughts/actions. Trips after 4 identical cycles into a `DEGRADED` state.
78
+
79
+ ### Improved
80
+
81
+ - **Architecture Enforcement** — Integrated SOLID/Clean Architecture scanning directly into `scan_diff` (detects Arrow Anti-patterns, SRP violations).
82
+ - Prompts updated to incorporate Agentic Memory checks and semantic decision logging.
83
+
84
+ ---
85
+
86
+ ## [0.5.0] — 2026-02-24
87
+
88
+ ### Added — 6 new tools
89
+
90
+ - **`record_loop_event(file, command, error)`** — Circuit breaker telemetry. Tracks (file, error) pairs. Trips after 3 identical failures.
91
+ - **`get_circuit_status()`** — Check if the death-loop breaker has tripped.
92
+ - **`reset_circuit()`** — Clear the breaker after human intervention.
93
+ - **`revert_to_checkpoint(directory)`** — Hard-reset to last advisor git checkpoint.
94
+ - **`get_session_analytics()`** — Session ROI: loop velocity, tool usage, knowledge gaps, hours saved.
95
+ - **`generate_aibom(directory)`** — AI Bill of Materials compliance artifact (Commit → Task → Scans).
96
+
97
+ ### Added — 4 MCP Prompts
98
+
99
+ - **`start-session`** — Full bootstrap: briefing → circuit check → task loop → analytics.
100
+ - **`pre-commit`** — Security: scan_diff → validate_dependency → circuit check.
101
+ - **`plan-feature`** — Spec-driven: assess → create_spec → hooks → whats_next.
102
+ - **`debug-loop`** — Recovery: circuit → stop → DECISIONS.md → revert → ask human.
103
+
104
+ ### Added — 3 Agent Workflows + 4 New Modules
105
+
106
+ - `.agent/workflows/advisor-session.md`, `advisor-setup.md`, `advisor-security.md`
107
+ - `proactive/circuit_breaker.py` — Death loop detection (3-strike, 5-min window)
108
+ - `proactive/checkpointing.py` — Git snapshots + revert
109
+ - `proactive/telemetry.py` — SQLite analytics at `.claude/telemetry.db`
110
+ - `proactive/aibom.py` — AIBOM with traceability matrix
111
+
112
+ ### Improved
113
+
114
+ - **`scan_for_secrets`** — Two-pass: 17 regex + Shannon entropy (4.5 bit threshold).
115
+ - **`pre_tool_use` hook** — Allow-list model: shlex parsing, 45+ safe binaries, directory sandboxing.
116
+ - **Routing** — Tier 2 now uses sentence-transformer embeddings (reuses MiniLM model).
117
+ - **Chunking** — Structure-aware: splits on headers, never breaks code blocks.
118
+ - **`retriever.py`** — Annotates results with active backend info.
119
+
120
+ ### Infrastructure
121
+
122
+ - Version bumped to `0.5.0`. Total: **18 tools, 4 prompts, 4 resources**.
123
+ - `pyproject.toml` has `[semantic]` extras group (`sentence-transformers`, `numpy`).
124
+
125
+ ---
126
+
127
+ ## [0.3.0] — 2026-02-24
128
+
129
+ ### Added — 5 new tools
130
+
131
+ - **`whats_next(directory)`** — Returns the next unchecked task from `tasks.md` as a
132
+ structured dict with `phase`, `completed`, `total`, and `progress_pct`. Previously
133
+ returned a bare string.
134
+
135
+ - **`mark_done(task_text, directory)`** — Marks the first matching unchecked task in
136
+ `tasks.md` as complete (`[x]`). Accepts partial, case-insensitive text match.
137
+ Closes the spec-driven execution loop: `whats_next → implement → mark_done → repeat`.
138
+
139
+ - **`scan_for_secrets(text)`** — Detects hardcoded secrets in any text using 17 regex
140
+ patterns covering Anthropic, OpenAI, AWS, Google, GitHub, Stripe, Twilio, Slack,
141
+ RSA private keys, generic assignments, and Bearer tokens.
142
+
143
+ - **`scan_diff(diff_text)`** — Full git diff review scanning only added lines (`+`) for
144
+ secrets (17 patterns) and risky code patterns: `eval()`, `exec()`, `shell=True`,
145
+ SQL injection via string formatting, `pickle.loads()`, disabled SSL, and more.
146
+
147
+ - **`generate_hook_script(hook_type, directory, dry_run)`** — Writes production-ready
148
+ Claude Code lifecycle hook scripts to `.claude/hooks/`. Supports:
149
+ - `pre_tool_use` — Blocks dangerous bash commands and secrets in file writes (exit 2)
150
+ - `post_tool_use` — Audit logging + auto-format Python files with ruff
151
+ - `stop` — Self-verification: runs pytest/npm test before agent can stop
152
+ - `session_start` — Injects tasks.md status and DECISIONS.md alerts at session start
153
+ - `all` — Installs all four hooks at once
154
+
155
+ ### Improved
156
+
157
+ - **`validate_dependency`** now runs a **live PyPI + npm registry lookup** (2s timeout,
158
+ stdlib only, no new dependencies). Returns `confirmed_real` (live verified),
159
+ `not_found` (hallucinated package), or heuristic fallbacks when offline.
160
+
161
+ - **`route_task` / `detect_task_type`** now has a **Tier 2 semantic fallback** using
162
+ cosine similarity against routing map descriptions. Queries like *"run tests in a
163
+ real browser"* now correctly route to `browser_testing` even without hitting keywords.
164
+ Routing decisions include a `reasoning` field describing which tier matched.
165
+
166
+ ### Infrastructure
167
+
168
+ - Version bumped to `0.3.0` in `server.py` (FastMCP) and `pyproject.toml`.
169
+ - Knowledge base expanded from 14 to 30 documents (docs 14-29 added).
170
+ - All new docs optimized for TF-IDF section chunking with `##`/`###` headers.
171
+ - `00-master-index.md` updated with full TOC, FAQ table, and cross-reference guide
172
+ covering all 30 documents.
173
+
174
+ ---
175
+
176
+ ## [0.2.0] — 2026-02-23
177
+
178
+ ### Added
179
+
180
+ - Core MCP server with 6 tools: `ask_advisor`, `assess_project`, `setup_project`,
181
+ `create_spec`, `route_task`, `get_session_briefing`
182
+ - TF-IDF knowledge base with 14 seed documents
183
+ - 4 resources: `advisor://briefing`, `advisor://routing-guide`,
184
+ `advisor://spec-templates`, `advisor://patterns-guide`
185
+ - Project detector, CLAUDE.md/AGENTS.md generator, spec file templates
186
+ - Session briefing with project health checks and session tips
187
+ - 17-category routing map with keyword classification
188
+
189
+ ---
190
+
191
+ ## [0.1.0] — 2026-02-22 (Initial Release)
192
+
193
+ - Initial FastMCP server scaffold
194
+ - Basic RAG over knowledge base using TF-IDF
195
+ - `ask_advisor` tool with knowledge base retrieval
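The 0.5.0 entries above describe the death-loop breaker as tracking (file, error) pairs and tripping after 3 identical failures, with a 5-minute window. A minimal sketch of that idea follows. The class name, method names, and the sliding-window pruning strategy are assumptions made for illustration and are not taken from `proactive/circuit_breaker.py`.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class LoopBreakerSketch:
    """Trip when the same (file, error) pair repeats within a time window."""

    def __init__(self, max_failures: int = 3, window_seconds: float = 300.0):
        self.max_failures = max_failures      # "3-strike" from the changelog
        self.window_seconds = window_seconds  # "5-min window" from the changelog
        self._events = defaultdict(deque)     # (file, error) -> timestamps
        self.tripped = False

    def record(self, file_path: str, error_output: str,
               now: Optional[float] = None) -> bool:
        """Record one failure and return whether the breaker is tripped."""
        now = time.monotonic() if now is None else now
        timestamps = self._events[(file_path, error_output.strip())]
        timestamps.append(now)
        # Drop failures that have aged out of the window.
        while timestamps and now - timestamps[0] > self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_failures:
            self.tripped = True
        return self.tripped

    def reset(self) -> None:
        """Clear all state, mirroring a reset after human review."""
        self._events.clear()
        self.tripped = False
```

The `now` parameter exists only so the window logic can be exercised deterministically; a caller would normally omit it.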
@@ -0,0 +1,98 @@
1
+ # CLAUDE.md — python
2
+
3
+ This file configures Claude Code for this project. Read it at the start of every session.
4
+
5
+ ## Project Overview
6
+ - **Type**: python + python
7
+ - **Language**: python
8
+ - **Test framework**: pytest
9
+ - **Database**: unknown
10
+
11
+ ## Build & Run Commands
12
+ <!-- Add your build, dev, test, and lint commands here -->
13
+
14
+ ## Active MCP Servers
15
+ - context7
16
+ > Tip: Use MCP lazy loading (`defer_loading: true` in .mcp.json) to reduce context overhead.
17
+ > With 5+ MCP servers active, upfront tool definitions can consume 50K+ tokens before any work begins.
18
+
19
+ ## Core Rules
20
+
21
+ ## Security Rules (Always Active)
22
+ - NEVER hardcode API keys, secrets, tokens, or passwords in any file
23
+ - NEVER use `eval()`, `exec()`, or `Function()` with user-provided data
24
+ - ALWAYS use parameterized queries — never interpolate user input into SQL strings
25
+ - NEVER execute shell commands that include unvalidated user input
26
+ - When adding a new npm/pip package, confirm the exact package name on the registry before installing
27
+ - If you're unsure whether an action is safe, ask before proceeding
28
+
29
+ ## Context Management
30
+ - Use `/clear` between major tasks to keep context focused
31
+ - When the task list is complete, run a wrap-up: summarize what changed and commit
32
+ - Always read existing code before modifying it — don't assume
33
+ - If you've tried the same approach 3 times and it's failing, stop and ask for guidance
34
+ - MCP tool definitions consume context — only connect MCPs you'll actually use in this session
35
+ - Use `advisor://routing-guide` resource once per session instead of calling route_task() repeatedly
36
+
37
+ ## Spec-Driven Development
38
+ Before writing any significant new feature or module:
39
+ 1. Create `requirements.md` — what the feature must do (user stories, acceptance criteria)
40
+ 2. Create `design.md` — how it will be built (architecture, key decisions, constraints)
41
+ 3. Create `tasks.md` — ordered checklist of implementation steps
42
+ These files are the source of truth. The agent implements against them, not against vague prompts.
43
+ Run: `create_spec(feature_name)` → the advisor will generate these scaffolds for you.
44
+
45
+ ## Persistent Memory
46
+ - If `mcp-memory-service` or `mcp-knowledge-graph` is connected, store key architectural decisions
47
+ after each session: `store_memory("We chose X over Y because Z")`
48
+ - Important decisions should also be appended to `DECISIONS.md` in the project root
49
+ - At session start, search memories for relevant context: `search_memories("project architecture")`
50
+
51
+
52
+
53
+ ## Workflow
54
+
55
+ ### Starting a session
56
+ 1. Read `advisor://briefing` resource (or call `get_session_briefing()`) for project health
57
+ 2. If spec files exist (`requirements.md`, `design.md`, `tasks.md`), read them before touching code
58
+ 3. Search persistent memory for relevant context: `search_memories("this project")`
59
+
60
+ ### Starting a task
61
+ 1. Read the relevant source files before making changes
62
+ 2. State your plan in 3 bullet points before writing any code
63
+ 3. Check for existing utilities before adding new dependencies
64
+ 4. For significant new features, generate spec files first with `create_spec()`
65
+
66
+ ### During a task
67
+ - Make small, atomic commits after each working increment
68
+ - Run tests after every significant change: `pytest`
69
+ - Never modify files outside the agreed scope without asking
70
+ - Claude Code auto-saves checkpoints before each change — use `/rewind` to undo if needed
71
+
72
+ ### Ending a session
73
+ 1. Run the full test suite
74
+ 2. Summarize what changed (what files, what behavior)
75
+ 3. Commit with a descriptive message
76
+ 4. Store key decisions in memory: `store_memory("Decision: ...")`
77
+
78
+ ## Human-in-the-Loop Triggers
79
+ Stop and ask the user before proceeding when:
80
+ - About to delete files, drop database tables, or remove more than 10 lines from a critical module
81
+ - Installing a new package not already in the project
82
+ - Making changes to CI/CD pipelines, deployment configs, or environment variables
83
+ - Unsure whether a destructive operation is reversible
84
+
85
+ ## Parallel Agents & Git Worktrees
86
+ To run multiple agents simultaneously on independent tasks:
87
+ ```bash
88
+ git worktree add ../feature-branch -b feature/your-feature-name
89
+ ```
90
+ Each worktree is an isolated working copy, so agents cannot overwrite each other's uncommitted changes; conflicts can still surface at merge time.
91
+ Use Agent-MCP or claude-flow to coordinate agents via shared context.
92
+ After parallel work, merge back: `git merge --no-ff feature/your-feature-name`
93
+
94
+ ## Custom Slash Commands
95
+ - `/review` — Review recent changes for security and correctness
96
+ - `/route [task]` — Ask the agentic-advisor which MCP to use for a task
97
+ - `/ask-advisor [question]` — Query the agentic coding knowledge base
98
+ - `/spec [feature]` — Generate requirements.md + design.md + tasks.md for a new feature
@@ -0,0 +1,109 @@
1
+ # Deep Analysis Request — agentic-advisor MCP Server (v0.3.0)
2
+
3
+ ## Context
4
+
5
+ I have built an MCP (Model Context Protocol) server called `agentic-advisor` — a proactive AI coding advisor that acts as an orchestration layer for agentic development workflows. It's built with FastMCP (Python) and is designed to be the "resident expert" that sits alongside any AI coding tool (Claude Code, Cursor, Copilot, Windsurf, etc.) and provides guardrails, best practices, and workflow automation.
6
+
7
+ The server runs fully local via STDIO transport. No cloud calls except optional live PyPI/npm registry lookups (2s timeout, stdlib urllib only).
8
+
9
+ ---
10
+
11
+ ## Architecture Overview
12
+
13
+ ### Knowledge Layer (RAG)
14
+ - **30 markdown documents** covering agentic coding best practices (foundations, tools, MCP ecosystem, security, workflows, multi-agent orchestration, failure modes, etc.)
15
+ - **Dual-backend search engine** — auto-selects at import time:
16
+ - **Semantic**: `sentence-transformers/all-MiniLM-L6-v2` (25MB local model, cosine similarity on 384-dim embeddings)
17
+ - **TF-IDF fallback**: pure Python, zero dependencies, keyword matching with IDF weighting + section/doc-name bonuses
18
+ - Documents are chunked into 300-word overlapping windows with section header metadata
19
+ - Singleton pattern for the knowledge base instance
20
+
21
+ ### Routing Layer
22
+ - **17-category task router** that maps natural language task descriptions to the best MCP/tool
23
+ - **3-tier classification**: Tier 1 keyword matching (deterministic) → Tier 2 cosine similarity on routing descriptions (semantic fallback) → Tier 3 default to `knowledge_question`
24
+ - Returns structured `RoutingDecision` with confidence level, reasoning, install commands, and doc references
25
+
26
+ ### Setup Engine
27
+ - **Project detector**: scans directory for `package.json`, `pyproject.toml`, `Cargo.toml`, etc. to auto-detect project type, language, framework, and test runner
28
+ - **Config generator**: writes `CLAUDE.md`, `AGENTS.md`, `.skills/`, `.agent/workflows/` based on detected project profile
29
+ - **Spec-driven development**: `create_spec()` generates `requirements.md`, `design.md`, `tasks.md` scaffolds
30
+
31
+ ### Execution Loop (Spec-Driven)
32
+ - `whats_next(directory)` → reads `tasks.md`, returns structured dict: `{next_task, phase, completed, total, progress_pct}`
33
+ - `mark_done(task_text, directory)` → fuzzy-matches and checks off the first matching `[ ]` item in `tasks.md`
34
+ - This closes the autonomous execution loop: `whats_next → implement → mark_done → whats_next → repeat`
35
+
36
+ ### Security Scanner
37
+ - `scan_for_secrets(text)` — 17 regex patterns (Anthropic, OpenAI, AWS, Google, GitHub, Stripe, Twilio, Slack, RSA, generic assignments, Bearer tokens)
38
+ - `scan_diff(diff_text)` — scans only added (`+`) lines, checking for secrets and for 10 risky code patterns (`eval`, `exec`, `shell=True`, SQL injection, `pickle.loads`, disabled SSL, etc.)
39
+ - `validate_dependency(package)` — two-stage: offline heuristics (80+ known-real prefixes, 5 suspicion patterns) + live PyPI/npm 404 check (2s timeout). Verdicts: `confirmed_real`, `likely_real`, `suspicious`, `not_found`
40
+
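The added-lines-only behavior of `scan_diff` might look roughly like the sketch below. The three patterns are illustrative stand-ins, not the package's 17 secret regexes or 10 risky-code patterns, and `scan_added_lines` is a hypothetical name.

```python
import re

# Illustrative patterns only; the real scanner's pattern set is not shown in the text.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "eval_call": re.compile(r"\beval\s*\("),
    "shell_true": re.compile(r"shell\s*=\s*True"),
}

def scan_added_lines(diff_text: str) -> list[tuple[int, str]]:
    """Return (diff line number, pattern name) for each hit on an added line."""
    findings = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        # Keep added lines, but skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        for name, pattern in PATTERNS.items():
            if pattern.search(added):
                findings.append((lineno, name))
    return findings
```

Ignoring removed and context lines keeps the scan focused on what the commit introduces, at the cost of missing secrets that already exist in the file.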
41
+ ### Hook Generator
42
+ - `generate_hook_script(hook_type)` — writes production-ready Python scripts to `.claude/hooks/`:
43
+ - `pre_tool_use`: blocks dangerous bash commands + secrets in file writes (exit code 2)
44
+ - `post_tool_use`: audit logging to `.claude/audit.log` + auto-format Python with ruff
45
+ - `stop`: runs `pytest` or `npm test` before agent can stop — blocks if tests fail
46
+ - `session_start`: injects `tasks.md` status + `DECISIONS.md` alerts at session start
47
+
48
+ ### Resources (4 static endpoints)
49
+ - `advisor://briefing` — session health summary
50
+ - `advisor://routing-guide` — full 17-category routing reference
51
+ - `advisor://spec-templates` — spec-driven development template reference
52
+ - `advisor://patterns-guide` — multi-agent patterns, context engineering, MCP security
53
+
54
+ ### Tool Count: 12
55
+ `ask_advisor`, `assess_project`, `setup_project`, `create_spec`, `whats_next`, `mark_done`, `scan_for_secrets`, `scan_diff`, `validate_dependency`, `generate_hook_script`, `route_task`, `get_session_briefing`
56
+
57
+ ---
58
+
59
+ ## Deep Probing Questions
60
+
61
+ ### 1. Architectural Critique
62
+ The server currently operates as a monolith — RAG, routing, security scanning, hook generation, and the spec execution loop are all in one FastMCP process. At what scale (number of tools, knowledge base size, concurrent agent sessions) does this architecture start to show cracks? Would you recommend decomposing into multiple coordinated MCP servers, and if so, what's the natural boundary for splitting? How would that affect the `instructions` prompt that introduces the server's capabilities to agents?
63
+
64
+ ### 2. RAG Quality at Scale
65
+ We have 30 documents (~200 chunks at 300 words each). The semantic backend uses `all-MiniLM-L6-v2` with a flat cosine similarity search (no index, just `np.dot`). At what chunk count does this become a latency problem? Should we switch to FAISS, Annoy, or HNSW at some point? More importantly — is 300 words the right chunk size for code-heavy documents? Would a hybrid chunking strategy (e.g., section-level for prose, function-level for code examples) improve retrieval quality?
66
+
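For reference, the flat search in question amounts to a dot product of the query against every stored vector. The sketch below is a pure-Python stand-in for the vectorized `np.dot` version, with hypothetical names; it assumes the stored vectors are already unit-normalized, so the dot product equals cosine similarity.

```python
import heapq
import math

def normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)

def top_k(query: list[float], matrix: list[list[float]], k: int = 3) -> list[tuple[int, float]]:
    """Brute-force search: score every row, keep the k best.

    Cost grows linearly with the number of rows, which is the
    scaling concern raised in the question above.
    """
    q = normalize(query)
    scores = (
        (sum(a * b for a, b in zip(q, row)), index)
        for index, row in enumerate(matrix)
    )
    return [(index, score) for score, index in heapq.nlargest(k, scores)]
```

At roughly 200 chunks of 384 dimensions, this is about 77,000 multiply-adds per query, so the linear scan only becomes a concern at much larger chunk counts.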
67
+ ### 3. Routing Robustness
68
+ The Tier 2 semantic fallback in `detect_task_type` computes a token-overlap cosine similarity against the `ROUTING_MAP` description strings (one sentence each). This is a very thin semantic surface. Would generating synthetic paraphrases for each routing category (e.g., 5-10 per category) and matching against those improve accuracy? Or would it be better to embed the routing descriptions using the same sentence-transformer model and do proper vector similarity?
69
+
70
+ ### 4. Security Scanner Completeness
71
+ The secret scanner uses 17 regexes. Real-world secret scanners (TruffleHog, Gitleaks) use 500+ patterns and entropy-based detection. Is our regex-only approach a false sense of security? Should we integrate entropy scoring for high-randomness strings? What about multi-line secrets (e.g., PEM keys split across lines, YAML blocks with base64-encoded secrets)?
72
+
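To make the entropy question concrete, here is a minimal sketch of Shannon-entropy scoring over candidate tokens. The 20-character minimum, the character class, and the 4.0 bits-per-character threshold are assumptions chosen for illustration, not tuned values from any existing scanner.

```python
import math
import re
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Entropy in bits per character of the string's character distribution."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Candidate tokens: long runs of characters typical of keys and base64 blobs.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")

def high_entropy_tokens(text: str, threshold: float = 4.0) -> list[str]:
    """Return long tokens whose entropy meets the (assumed) threshold."""
    return [t for t in TOKEN_RE.findall(text) if shannon_entropy(t) >= threshold]
```

Entropy scoring tends to flag hashes, UUIDs, and minified assets as well, so it is usually paired with allowlists rather than used alone.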
73
+ ### 5. Spec-Driven Loop Integrity
74
+ The `whats_next` / `mark_done` loop assumes `tasks.md` is the single source of truth. But what happens when:
75
+ - Two agents are running in parallel via `git worktree` and both read the same `tasks.md`?
76
+ - An agent marks a task done but the implementation is actually wrong (false completion)?
77
+ - The user manually edits `tasks.md` mid-session, reordering or removing tasks?
78
+ Should we add file locking, checksums, or a lightweight state machine to make this more robust?
79
+
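One of the options raised above, a checksum guard, could be sketched like this. The helper names and the `StaleTasksError` exception are hypothetical. The check and the write are not atomic here, so this detects stale reads but does not replace real file locking.

```python
import hashlib
from pathlib import Path

class StaleTasksError(RuntimeError):
    """Raised when tasks.md changed between read and write."""

def read_with_digest(path: str) -> tuple[str, str]:
    """Return the file text plus a SHA-256 digest of its bytes."""
    data = Path(path).read_bytes()
    return data.decode("utf-8"), hashlib.sha256(data).hexdigest()

def write_if_unchanged(path: str, new_text: str, expected_digest: str) -> None:
    """Write only if the file still matches the digest seen at read time."""
    current = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    if current != expected_digest:
        raise StaleTasksError(f"{path} was modified since it was read")
    # Caveat: another writer could still slip in between the check and this write.
    Path(path).write_text(new_text, encoding="utf-8")
```

An agent that hits `StaleTasksError` would re-read the file and recompute its next task instead of overwriting another agent's update.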
80
+ ### 6. Hook Security Model
81
+ The `pre_tool_use` hook uses string matching (`if pattern.lower() in command.lower()`) to block dangerous commands. An adversarial agent could trivially bypass this with encoding tricks (`echo cm0gLXJmIC8= | base64 -d | bash`), variable expansion, or multi-step command chaining. Is there a more robust approach? Should we parse the command AST instead, or is defense-in-depth (multiple weak layers > one strong layer) the right philosophy for agentic hooks?
82
+
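The bypass is easy to reproduce. The sketch below pairs a substring check like the one quoted above with one additional weak layer: tokenizing the command with `shlex` and flagging pipes into a shell interpreter. The blocklist entry, the interpreter set, and both function names are illustrative assumptions. The second check still misses variable expansion, `eval` strings, and many other tricks.

```python
import shlex

BLOCKED_SUBSTRINGS = ["rm -rf /"]          # illustrative blocklist entry
INTERPRETERS = {"bash", "sh", "zsh"}       # assumed set of shells to watch for

def naive_is_blocked(command: str) -> bool:
    """Substring matching, as in the hook described above."""
    return any(p.lower() in command.lower() for p in BLOCKED_SUBSTRINGS)

def pipes_into_interpreter(command: str) -> bool:
    """Flag commands whose output is piped straight into a shell."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    lexer.commenters = ""
    tokens = list(lexer)
    return any(
        tok == "|" and nxt in INTERPRETERS
        for tok, nxt in zip(tokens, tokens[1:])
    )
```

Here the encoded command from the example passes the substring check but is caught by the pipe check, which illustrates the layered approach: each check is weak alone, but they fail in different ways.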
83
+ ### 7. Token Economy
84
+ Every tool call costs tokens. With 12 tools, the tool descriptions alone consume ~2,500 tokens of context window in every agent session. The `instructions` string adds another ~200. For a model with a 128k context window this is fine, but for 8k-context models it's significant. Should we implement lazy tool loading (only expose tools relevant to the detected project type)? Or does the added complexity of a dynamically changing tool surface outweigh the token savings?
85
+
86
+ ### 8. Knowledge Base Maintenance
87
+ The 30 documents are static markdown files. AI coding best practices are evolving weekly in 2026 — new MCP servers ship, tool capabilities change, workflow patterns emerge. What's the best approach for keeping this knowledge base current? Should we add a `refresh_knowledge_base()` tool that fetches updates from a curated source? Or is manual curation the only way to maintain quality in a RAG system?
88
+
89
+ ### 9. Observability Gap
90
+ We have `post_tool_use` audit logging and `scan_diff` for pre-commit checks, but we have no way to measure:
91
+ - How often agents actually call `whats_next` vs. ignoring it
92
+ - Which `ask_advisor` queries return low-relevance results (indicating knowledge gaps)
93
+ - Whether `route_task` is sending agents to the wrong MCP
94
+ What telemetry or feedback loops would you add to close this observability gap, while respecting the local-only, privacy-first design?
95
+
96
+ ---
97
+
98
+ ## Open-Ended: What Are We Missing?
99
+
100
+ Given the full architecture above, what capabilities, failure modes, or architectural patterns are we NOT thinking about that would make this MCP server significantly more valuable?
101
+
102
+ Think about:
103
+ - What would make this the default MCP that every agentic IDE ships with?
104
+ - What would make an enterprise team adopt this over building their own?
105
+ - What failure modes could cause a team to REMOVE this MCP from their stack?
106
+ - Are there interaction patterns between the 12 tools that we should be encoding as higher-level workflows rather than leaving up to the agent to figure out?
107
+ - What would a "v1.0" of this server need to have that v0.3.0 doesn't?
108
+
109
+ Please be specific and opinionated. Give concrete recommendations, not generic advice.