@codifier/cli 2.0.0

Files changed (40)
  1. package/README.md +511 -0
  2. package/commands/init.md +3 -0
  3. package/commands/onboard.md +3 -0
  4. package/commands/research.md +3 -0
  5. package/dist/cli/add.d.ts +5 -0
  6. package/dist/cli/add.d.ts.map +1 -0
  7. package/dist/cli/add.js +23 -0
  8. package/dist/cli/add.js.map +1 -0
  9. package/dist/cli/bin/codifier.d.ts +7 -0
  10. package/dist/cli/bin/codifier.d.ts.map +1 -0
  11. package/dist/cli/bin/codifier.js +43 -0
  12. package/dist/cli/bin/codifier.js.map +1 -0
  13. package/dist/cli/detect.d.ts +12 -0
  14. package/dist/cli/detect.d.ts.map +1 -0
  15. package/dist/cli/detect.js +35 -0
  16. package/dist/cli/detect.js.map +1 -0
  17. package/dist/cli/doctor.d.ts +5 -0
  18. package/dist/cli/doctor.d.ts.map +1 -0
  19. package/dist/cli/doctor.js +58 -0
  20. package/dist/cli/doctor.js.map +1 -0
  21. package/dist/cli/init.d.ts +6 -0
  22. package/dist/cli/init.d.ts.map +1 -0
  23. package/dist/cli/init.js +93 -0
  24. package/dist/cli/init.js.map +1 -0
  25. package/dist/cli/update.d.ts +5 -0
  26. package/dist/cli/update.d.ts.map +1 -0
  27. package/dist/cli/update.js +25 -0
  28. package/dist/cli/update.js.map +1 -0
  29. package/dist/index.js +87 -0
  30. package/package.json +40 -0
  31. package/skills/brownfield-onboard/SKILL.md +107 -0
  32. package/skills/initialize-project/SKILL.md +145 -0
  33. package/skills/initialize-project/templates/evals-prompt.md +39 -0
  34. package/skills/initialize-project/templates/requirements-prompt.md +44 -0
  35. package/skills/initialize-project/templates/roadmap-prompt.md +44 -0
  36. package/skills/initialize-project/templates/rules-prompt.md +34 -0
  37. package/skills/research-analyze/SKILL.md +131 -0
  38. package/skills/research-analyze/templates/query-generation-prompt.md +61 -0
  39. package/skills/research-analyze/templates/synthesis-prompt.md +67 -0
  40. package/skills/shared/codifier-tools.md +123 -0
@@ -0,0 +1,39 @@
1
+ # Prompt Template: Generate Evals.md
2
+
3
+ When this template is used, substitute all `{placeholders}` with actual values, then generate the evals document as instructed.
4
+
5
+ ---
6
+
7
+ You are a quality-engineering expert. Using the project rules below, create a set of structured evaluation criteria that can be used to verify compliance with those rules during code review, CI checks, or AI-assisted development sessions.
8
+
9
+ ## Project Rules
10
+
11
+ {rules}
12
+
13
+ ## Project Context
14
+
15
+ **Project Name:** {project_name}
16
+ **Description:** {description}
17
+
18
+ ## Instructions
19
+
20
+ For EACH rule, produce one or more evals. Each eval must include:
21
+
22
+ - **id**: a slug identifier (e.g., `eval-validate-input-boundary`)
23
+ - **rule_ref**: the title or ID of the rule being evaluated
24
+ - **description**: what this eval checks
25
+ - **pass_criteria**: precise, observable conditions that indicate the rule is being followed
26
+ - **fail_criteria**: precise, observable conditions that indicate a violation
27
+ - **automation_hint**: whether this can be checked automatically (lint, test, static analysis) and how
28
+
29
+ Format the output as a YAML document with a top-level `evals:` list. Example structure:
30
+
31
+ ```yaml
32
+ evals:
33
+   - id: eval-validate-input-boundary
34
+     rule_ref: Always validate external input at the boundary
35
+     description: Checks that all external inputs are validated before use
36
+     pass_criteria: Every controller method validates request body with a schema before processing
37
+     fail_criteria: Business logic receives raw unvalidated input from request objects
38
+     automation_hint: ESLint rule or custom AST check; unit tests covering invalid inputs
39
+ ```
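The templates in this diff all ask the caller to substitute `{placeholders}` before use. A minimal sketch of that substitution step follows. The helper name `fillTemplate` is hypothetical and is not part of `@codifier/cli`. It assumes placeholder names are plain identifiers and that double-braced tokens such as `{{start_date}}` (used by the query-generation template) should be left untouched.

```typescript
// Hypothetical helper, not part of @codifier/cli: substitutes {name} tokens
// in a prompt template and leaves {{name}} tokens as they are.
function fillTemplate(template: string, values: Record<string, string>): string {
  // First alternative consumes {{...}} so it is never treated as a placeholder;
  // second alternative captures the identifier inside single braces.
  const pattern = /\{\{[^{}]*\}\}|\{([A-Za-z_][A-Za-z0-9_]*)\}/g;
  return template.replace(pattern, (match: string, key: string | undefined) => {
    if (key === undefined) {
      return match; // double-braced token: keep verbatim
    }
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error("Missing value for placeholder {" + key + "}");
    }
    return String(values[key]);
  });
}

// Example: fill two fields of the evals template header.
const filled = fillTemplate(
  "**Project Name:** {project_name}\n**Description:** {description}",
  { project_name: "Payments Redesign", description: "Checkout rewrite" }
);
console.log(filled);
```

Throwing on a missing value, rather than leaving the token in place, keeps an unfilled `{rules}` or `{requirements}` from silently reaching the model.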
@@ -0,0 +1,44 @@
1
+ # Prompt Template: Generate Requirements.md
2
+
3
+ When this template is used, substitute all `{placeholders}` with actual values, then generate the requirements document as instructed.
4
+
5
+ ---
6
+
7
+ You are a product manager and solutions architect. Using the project information below, produce a detailed requirements document.
8
+
9
+ ## Project Information
10
+
11
+ **Project Name:** {project_name}
12
+ **Description:** {description}
13
+ **Scope of Work:** {sow}
14
+ **Repositories:** {repo_urls}
15
+ **Additional Context:** {additional_context}
16
+
17
+ ## Instructions
18
+
19
+ Produce a requirements document titled `# Requirements.md` with the following sections:
20
+
21
+ ### 1. Executive Summary
22
+ One-paragraph summary of what the project delivers and for whom.
23
+
24
+ ### 2. Functional Requirements
25
+ List every distinct feature or capability. For each requirement use this format:
26
+
27
+ - **FR-001**: short title
28
+   - **Priority**: Must / Should / Could (MoSCoW)
29
+   - **Description**: what the system must do
30
+   - **Acceptance Criteria**: measurable, testable conditions
31
+
32
+ ### 3. Non-Functional Requirements
33
+ Cover: Performance, Security, Scalability, Reliability, Maintainability, Observability. Use the same format as the functional requirements, with IDs prefixed `NFR-` instead of `FR-`.
34
+
35
+ ### 4. Constraints and Assumptions
36
+ List known technical constraints, business constraints, and assumptions being made.
37
+
38
+ ### 5. Out of Scope
39
+ Explicitly list what is NOT included in this project.
40
+
41
+ ### 6. Glossary
42
+ Define key domain terms used throughout this document.
43
+
44
+ Format as a structured Markdown document. Number all requirements sequentially.
@@ -0,0 +1,44 @@
1
+ # Prompt Template: Generate Roadmap.md
2
+
3
+ When this template is used, substitute all `{placeholders}` with actual values, then generate the roadmap document as instructed.
4
+
5
+ ---
6
+
7
+ You are a senior engineering lead responsible for delivery planning. Using the project requirements below, produce a phased implementation roadmap.
8
+
9
+ ## Requirements
10
+
11
+ {requirements}
12
+
13
+ ## Project Context
14
+
15
+ **Project Name:** {project_name}
16
+ **Description:** {description}
17
+ **Repositories:** {repo_urls}
18
+
19
+ ## Instructions
20
+
21
+ Produce a roadmap titled `# Roadmap.md` structured as 3–5 phases. For EACH phase include:
22
+
23
+ - **Phase N — Name**: meaningful phase title (e.g., "Phase 1 — Foundation")
24
+ - **Goal**: one-sentence summary of what this phase achieves
25
+ - **Duration estimate**: calendar weeks or sprints
26
+ - **Deliverables**: concrete, shippable outputs
27
+ - **Requirements covered**: list the FR-NNN and NFR-NNN IDs addressed
28
+ - **Technical tasks**: engineering work breakdown (checklist format)
29
+ - **Dependencies**: what must be true before this phase can start
30
+ - **Success criteria**: how to know this phase is done
31
+
32
+ After the phased plan, include:
33
+
34
+ ### Critical Path
35
+ The sequence of tasks where any delay directly delays the project.
36
+
37
+ ### Risks and Mitigations
38
+ Top 5 risks in a table:
39
+
40
+ | Risk | Likelihood | Impact | Mitigation |
41
+ |------|-----------|--------|-----------|
42
+ | ... | High/Med/Low | High/Med/Low | ... |
43
+
44
+ Format as a structured Markdown document.
@@ -0,0 +1,34 @@
1
+ # Prompt Template: Generate Rules.md
2
+
3
+ When this template is used, substitute all `{placeholders}` with actual project values, then generate the rules document as instructed.
4
+
5
+ ---
6
+
7
+ You are a senior software architect. Based on the project context below, generate a comprehensive set of development rules and coding standards for this project.
8
+
9
+ ## Project Context
10
+
11
+ **Project Name:** {project_name}
12
+ **Description:** {description}
13
+ **Scope of Work:** {sow}
14
+ **Repositories:** {repo_urls}
15
+ **Additional Context:** {additional_context}
16
+
17
+ ## Instructions
18
+
19
+ Generate rules covering ALL of the following areas:
20
+
21
+ 1. **Code Style** — naming conventions, file organisation, formatting
22
+ 2. **Architecture Patterns** — module structure, dependency direction, layering
23
+ 3. **Security** — input validation, secrets management, authentication patterns
24
+ 4. **Testing** — unit test structure, coverage targets, mocking strategy
25
+ 5. **Documentation** — inline comments, ADR conventions, README standards
26
+ 6. **Error Handling** — error propagation, logging strategy, user-facing messages
27
+
28
+ For EACH rule provide:
29
+ - **title**: short, actionable statement (e.g., "Always validate external input at the boundary")
30
+ - **description**: one-paragraph explanation
31
+ - **rationale**: why this rule matters for this specific project
32
+ - **examples**: 1–3 concrete code or configuration examples
33
+
34
+ Format the output as a Markdown document titled `# Rules.md` with one H2 heading per rule category and one H3 heading per rule.
@@ -0,0 +1,131 @@
1
+ # Skill: Research & Analyze
2
+
3
+ **Role:** Researcher
4
+ **Purpose:** Define a research objective, discover Athena data warehouse schemas, generate and validate SQL queries, execute them, synthesize the findings into a ResearchFindings.md report, and persist it to the shared knowledge base.
5
+
6
+ See `../shared/codifier-tools.md` for full MCP tool reference.
7
+
8
+ ---
9
+
10
+ ## Prerequisites
11
+
12
+ - Active MCP connection to the Codifier server
13
+ - AWS Athena credentials configured on the server (`AWS_REGION`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `ATHENA_S3_OUTPUT_LOCATION`)
14
+ - A project to associate the findings with
15
+
16
+ ---
17
+
18
+ ## Workflow
19
+
20
+ ### Step 1 — Identify or Create the Project
21
+
22
+ Call `manage_projects` with `operation: "list"` and show the user their existing projects.
23
+
24
+ Ask: **"Which project should these research findings be associated with?"**
25
+
26
+ Select or create a project and capture the `project_id`.
27
+
28
+ ### Step 2 — Fetch Prior Research
29
+
30
+ Call `fetch_context` with `{ project_id, memory_type: "research_finding" }` to surface any prior findings relevant to this session.
31
+
32
+ If prior findings exist, summarize them briefly: **"Here's what we've found before on this project..."**
33
+
34
+ ### Step 3 — Define the Research Objective
35
+
36
+ Ask the user to describe:
37
+ 1. **Research objective** — the specific question or hypothesis to investigate
38
+ 2. **Background context** — business context, prior hypotheses, relevant metrics or KPIs
39
+ 3. **Time period of interest** — date ranges for the analysis
40
+ 4. **Known relevant tables** — if the user knows which tables to look at (optional)
41
+
42
+ Confirm your understanding of the objective before proceeding.
43
+
44
+ ### Step 4 — Discover Available Tables
45
+
46
+ Call `query_data` with `{ operation: "list-tables", project_id }`.
47
+
48
+ Present the full table list to the user. Ask: **"Which of these tables are likely relevant to your research objective?"**
49
+
50
+ ### Step 5 — Describe Selected Tables
51
+
52
+ Call `query_data` with `{ operation: "describe-tables", project_id, table_names: [<user-selected tables>] }`.
53
+
54
+ Review the returned schemas with the user. Note column names, data types, and any partitioning. Ask if any additional tables should be included.
55
+
56
+ ### Step 6 — Generate SQL Queries
57
+
58
+ Using the prompt template in `templates/query-generation-prompt.md`, generate SQL queries tailored to the research objective.
59
+
60
+ **Substitute:**
61
+ - `{objective}` — the research objective from Step 3
62
+ - `{context}` — background context from Step 3
63
+ - `{available_tables}` — full table list from Step 4
64
+ - `{table_definitions}` — schema details from Step 5
65
+
66
+ Present all generated queries to the user. For each query, show:
67
+ - Query ID and purpose
68
+ - The SQL
69
+ - Expected output columns
70
+
71
+ Ask: **"Do these queries look correct? Which ones should we run, and are there any you'd like to modify?"**
72
+
73
+ Allow the user to edit, add, or remove queries before execution.
74
+
75
+ ### Step 7 — Execute Approved Queries
76
+
77
+ For each approved query, call `query_data` with `{ operation: "execute-query", project_id, query: "<sql>" }`.
78
+
79
+ Execute one query at a time. After each:
80
+ - Show the result rows
81
+ - Ask: "Does this look as expected, or should we investigate further before continuing?"
82
+
83
+ If a query returns no results: note this explicitly and ask if the query should be revised.
84
+ If a query errors: show the error and ask the user how to proceed.
85
+
86
+ ### Step 8 — Synthesize Findings
87
+
88
+ Using the prompt template in `templates/synthesis-prompt.md`, synthesize all query results into a ResearchFindings.md report.
89
+
90
+ **Substitute:**
91
+ - `{objective}` — the research objective
92
+ - `{context}` — background context
93
+ - `{query_results}` — all query results (as structured data)
94
+ - `{table_definitions}` — the schema reference from Step 5
95
+
96
+ Present the full ResearchFindings.md to the user. Ask: **"Does this accurately capture the findings? Any corrections or additions?"**
97
+
98
+ Incorporate feedback.
99
+
100
+ ### Step 9 — Persist Findings
101
+
102
+ Call `update_memory` with the `project_id` from Step 1 (a required parameter) and the following fields:
103
+ ```
104
+ memory_type: "research_finding"
105
+ title: "ResearchFindings — <objective summary> — <YYYY-MM-DD>"
106
+ content: {
107
+   text: "<full ResearchFindings.md markdown>",
108
+   objective: "<objective>",
109
+   tables_used: ["<table1>", "<table2>"],
110
+   queries_run: <count>
111
+ }
112
+ tags: ["research", "<domain-tag>", "<date-tag>"]
113
+ source_role: "researcher"
114
+ ```
115
+
116
+ ### Step 10 — Summarize
117
+
118
+ Tell the user:
119
+ - Project ID and memory ID of the persisted finding
120
+ - Tables queried and query count
121
+ - Key findings (2–3 sentence summary)
122
+ - How developers can access this finding: `fetch_context` with `{ project_id, memory_type: "research_finding" }`
123
+
124
+ ---
125
+
126
+ ## Error Handling
127
+
128
+ - If `list-tables` returns empty: Athena credentials may not be configured on the server. Inform the user and ask them to verify the server configuration.
129
+ - If a query exceeds the 100KB result cap: the tool returns a truncation notice. Acknowledge this in the findings methodology section.
130
+ - If the user asks to run a non-SELECT query: refuse and explain the SELECT-only constraint. Offer an alternative SELECT formulation if possible.
131
+ - If synthesis produces speculative conclusions: flag them explicitly with confidence levels (High/Medium/Low) per the synthesis template.
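Step 9 of the skill above persists the report through `update_memory`. The sketch below shows one way to assemble that payload, including the `project_id` that the tools reference marks as required. The function and interface names are hypothetical, and using the ISO date as the `<date-tag>` is an assumption, since the skill does not define the tag format.

```typescript
// Hypothetical input shape; field names here are illustrative only.
interface ResearchFindingInput {
  projectId: string;
  objective: string;
  objectiveSummary: string;
  markdown: string; // full ResearchFindings.md text
  tablesUsed: string[];
  queriesRun: number;
  domainTag: string;
  date: Date;
}

// Builds the update_memory arguments described in Step 9.
function buildResearchFindingPayload(input: ResearchFindingInput) {
  // YYYY-MM-DD in UTC, taken from the ISO-8601 timestamp.
  const day = input.date.toISOString().slice(0, 10);
  // Em dash separator, matching the title format shown in Step 9.
  const sep = " \u2014 ";
  return {
    project_id: input.projectId,
    memory_type: "research_finding",
    title: "ResearchFindings" + sep + input.objectiveSummary + sep + day,
    content: {
      text: input.markdown,
      objective: input.objective,
      tables_used: input.tablesUsed,
      queries_run: input.queriesRun,
    },
    // Assumption: the date tag reuses the ISO day string.
    tags: ["research", input.domainTag, day],
    source_role: "researcher",
  };
}

const payload = buildResearchFindingPayload({
  projectId: "00000000-0000-0000-0000-000000000000", // placeholder UUID
  objective: "Why did Q4 retention drop?",
  objectiveSummary: "Q4 retention",
  markdown: "# ResearchFindings.md\n...",
  tablesUsed: ["events", "users"],
  queriesRun: 3,
  domainTag: "retention",
  date: new Date("2026-02-03T12:00:00Z"),
});
console.log(JSON.stringify(payload, null, 2));
```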
@@ -0,0 +1,61 @@
1
+ # Prompt Template: Generate SQL Queries
2
+
3
+ When this template is used, substitute all `{placeholders}` with actual values, then generate the queries as instructed.
4
+
5
+ ---
6
+
7
+ You are a senior data analyst expert in SQL and data warehousing. Using the research objective and schema information below, generate SQL queries that will answer the research questions effectively.
8
+
9
+ ## Research Objective
10
+
11
+ {objective}
12
+
13
+ ## Research Context
14
+
15
+ {context}
16
+
17
+ ## Available Schema
18
+
19
+ **Tables discovered:**
20
+ {available_tables}
21
+
22
+ **Table definitions:**
23
+ {table_definitions}
24
+
25
+ ## Instructions
26
+
27
+ Generate a set of SQL queries that address the research objective. Organise them from exploratory (broad counts, distributions) to specific (targeted metrics that directly answer the objective).
28
+
29
+ For EACH query provide:
30
+
31
+ ### Query: {query-id} — {short title}
32
+
33
+ **Purpose:** one sentence describing what this query answers
34
+
35
+ **SQL:**
36
+ ```sql
37
+ -- {explanation of non-obvious logic}
38
+ SELECT
39
+   ...
40
+ FROM {table}
41
+ WHERE ...
42
+   AND date_partition BETWEEN '{{start_date}}' AND '{{end_date}}'
43
+ LIMIT 1000
44
+ ```
45
+
46
+ **Expected output columns:**
47
+ | Column | Type | Description |
48
+ |--------|------|-------------|
49
+ | ... | ... | ... |
50
+
51
+ **Notes:** caveats, known data quality issues, or follow-up queries suggested
52
+
53
+ ---
54
+
55
+ **Query writing conventions:**
56
+ - Use standard ANSI SQL where possible
57
+ - Add comments inside SQL explaining non-obvious logic
58
+ - Parameterise date ranges using placeholders like `{{start_date}}` and `{{end_date}}`
59
+ - Include `LIMIT` clauses on exploratory queries
60
+ - For Athena: use partition columns in WHERE clauses to control cost
61
+ - Only SELECT statements — no DDL or DML
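The conventions above parameterise date ranges as `{{start_date}}` and `{{end_date}}`, but the template does not say how those tokens get bound before execution. The following is a minimal sketch of one approach, assuming dates are supplied as `YYYY-MM-DD` strings. `bindDateRange` is a hypothetical name and is not part of the package.

```typescript
// Hypothetical helper: binds the {{start_date}} / {{end_date}} tokens in a
// generated query. Strict format validation keeps quotes and other SQL
// fragments out of the substituted values.
function bindDateRange(sql: string, startDate: string, endDate: string): string {
  const isoDay = /^\d{4}-\d{2}-\d{2}$/;
  if (!isoDay.test(startDate) || !isoDay.test(endDate)) {
    throw new Error("Dates must be formatted as YYYY-MM-DD");
  }
  // Same-format ISO dates compare correctly as plain strings.
  if (startDate > endDate) {
    throw new Error("start_date must not be after end_date");
  }
  return sql
    .split("{{start_date}}").join(startDate)
    .split("{{end_date}}").join(endDate);
}

const bound = bindDateRange(
  "SELECT COUNT(*) FROM events WHERE date_partition BETWEEN '{{start_date}}' AND '{{end_date}}' LIMIT 1000",
  "2026-01-01",
  "2026-01-31"
);
console.log(bound);
```

The format check validates shape only, so a value such as `2026-13-40` would still pass. A stricter version could parse the date as well.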
@@ -0,0 +1,67 @@
1
+ # Prompt Template: Synthesize Research Findings
2
+
3
+ When this template is used, substitute all `{placeholders}` with actual values, then generate the findings report as instructed.
4
+
5
+ ---
6
+
7
+ You are a senior data scientist and technical writer. Using the research objective, context, and query results below, synthesise a clear and actionable research findings report.
8
+
9
+ ## Research Objective
10
+
11
+ {objective}
12
+
13
+ ## Research Context
14
+
15
+ {context}
16
+
17
+ ## Query Results
18
+
19
+ {query_results}
20
+
21
+ ## Available Schema Reference
22
+
23
+ {table_definitions}
24
+
25
+ ## Instructions
26
+
27
+ Produce a research findings report titled `# ResearchFindings.md` with the following sections:
28
+
29
+ ### 1. Executive Summary
30
+ 2–4 sentences: the most important finding and its business implication.
31
+
32
+ ### 2. Methodology
33
+ Describe:
34
+ - Data sources used (tables, date ranges)
35
+ - Queries run and what each was designed to measure
36
+ - Data quality considerations or limitations discovered
37
+
38
+ ### 3. Key Findings
39
+ For each significant finding:
40
+
41
+ **Finding N: {descriptive title}**
42
+ - **Evidence:** specific numbers, percentages, or trends from the query results
43
+ - **Interpretation:** what this means in business or research terms
44
+ - **Confidence:** High / Medium / Low — with reasoning
45
+
46
+ ### 4. Trends and Patterns
47
+ Describe temporal trends, correlations, anomalies, or unexpected patterns observed across the query results.
48
+
49
+ ### 5. Limitations and Caveats
50
+ Be explicit about:
51
+ - Data gaps or missing periods
52
+ - Potential biases in the data
53
+ - Queries that returned no results and what that implies
54
+ - Assumptions made during the analysis
55
+
56
+ ### 6. Recommendations
57
+ Actionable next steps based on the findings. Each recommendation must state:
58
+ - **Action:** what to do
59
+ - **Owner:** who should act on it
60
+ - **Rationale:** why this follows from the data
61
+
62
+ ### 7. Follow-up Research Questions
63
+ List 3–5 questions this analysis surfaced but could not answer, to guide future research sessions.
64
+
65
+ ---
66
+
67
+ Format as a structured Markdown document suitable for sharing with stakeholders.
@@ -0,0 +1,123 @@
1
+ # Codifier MCP Tools Reference
2
+
3
+ This document describes all 5 MCP tools exposed by the Codifier server. Reference this when executing any Codifier skill.
4
+
5
+ ---
6
+
7
+ ## 1. `fetch_context`
8
+
9
+ Retrieve memories from the shared knowledge base, filtered by project, type, tags, or full-text search.
10
+
11
+ **Parameters:**
12
+ | Parameter | Type | Required | Description |
13
+ |-----------|------|----------|-------------|
14
+ | `project_id` | string (UUID) | ✓ | Project to scope the query to |
15
+ | `memory_type` | enum | — | Filter by type: `rule`, `document`, `api_contract`, `learning`, `research_finding` |
16
+ | `tags` | string[] | — | All supplied tags must be present on the memory |
17
+ | `query` | string | — | Full-text search applied to title and content |
18
+ | `limit` | number (1–100) | — | Max results (default: 20) |
19
+
20
+ **Returns:** Array of memory records with `id`, `title`, `content`, `memory_type`, `tags`, `source_role`, `created_at`.
21
+
22
+ **Usage patterns:**
23
+ - Fetch all rules for a project: `{ project_id, memory_type: "rule" }`
24
+ - Fetch researcher findings relevant to auth: `{ project_id, memory_type: "research_finding", tags: ["auth"] }`
25
+ - Full-text search across all memory types: `{ project_id, query: "payment processing" }`
26
+
27
+ ---
28
+
29
+ ## 2. `update_memory`
30
+
31
+ Create a new memory or update an existing one in the shared knowledge base.
32
+
33
+ **Parameters:**
34
+ | Parameter | Type | Required | Description |
35
+ |-----------|------|----------|-------------|
36
+ | `project_id` | string (UUID) | ✓ | Project to scope this memory to |
37
+ | `memory_type` | enum | ✓ | `rule`, `document`, `api_contract`, `learning`, `research_finding` |
38
+ | `title` | string | ✓ | Short descriptive title |
39
+ | `content` | object | ✓ | Structured content payload (any JSON object) |
40
+ | `id` | string (UUID) | — | If provided, updates the existing record instead of creating |
41
+ | `tags` | string[] | — | Tags for filtering and categorization |
42
+ | `category` | string | — | Category grouping (e.g., "security", "error-handling") |
43
+ | `description` | string | — | Human-readable summary |
44
+ | `confidence` | number (0–1) | — | Confidence score (default: 1.0) |
45
+ | `source_role` | string | — | Role that produced this memory (e.g., "developer", "researcher") |
46
+
47
+ **Returns:** The created or updated memory record including its `id`.
48
+
49
+ **Usage patterns:**
50
+ - Store a generated Rules.md: `{ project_id, memory_type: "document", title: "Rules.md", content: { text: "..." }, source_role: "developer" }`
51
+ - Store a research finding: `{ project_id, memory_type: "research_finding", title: "Q4 Retention Analysis", content: { summary: "...", findings: [...] }, source_role: "researcher" }`
52
+ - Update an existing memory: `{ project_id, id: "<existing-id>", memory_type: "rule", title: "...", content: {...} }`
53
+
54
+ ---
55
+
56
+ ## 3. `manage_projects`
57
+
58
+ Create, list, or switch the active project.
59
+
60
+ **Parameters:**
61
+ | Parameter | Type | Required | Description |
62
+ |-----------|------|----------|-------------|
63
+ | `operation` | enum | ✓ | `create`, `list`, or `switch` |
64
+ | `name` | string | For `create` | Project name |
65
+ | `org` | string | — | Organisation name (optional for `create`) |
66
+ | `project_id` | string (UUID) | For `switch` | Project to switch to |
67
+
68
+ **Returns:**
69
+ - `list`: Array of projects with `id`, `name`, `org`, `created_at`
70
+ - `create`: The created project record including its `id`
71
+ - `switch`: Confirmation of the active project
72
+
73
+ **Usage patterns:**
74
+ - List all projects: `{ operation: "list" }`
75
+ - Create a new project: `{ operation: "create", name: "Payments Redesign", org: "Acme Corp" }`
76
+ - Switch to an existing project: `{ operation: "switch", project_id: "<uuid>" }`
77
+
78
+ ---
79
+
80
+ ## 4. `pack_repo`
81
+
82
+ Condense a code repository into a versioned text snapshot using RepoMix. The snapshot is stored in the `repositories` table and can be retrieved for context.
83
+
84
+ **Parameters:**
85
+ | Parameter | Type | Required | Description |
86
+ |-----------|------|----------|-------------|
87
+ | `url` | string | ✓ | Repository URL (e.g., `https://github.com/org/repo`) or local path |
88
+ | `project_id` | string (UUID) | ✓ | Project to associate the snapshot with |
89
+ | `version_label` | string | — | Version label for this snapshot (e.g., `"v1.2.3"`, `"sprint-5"`, `"2026-02"`) |
90
+
91
+ **Returns:** Repository record with `id`, `url`, `version_label`, `token_count`, `file_count`, and `created_at`.
92
+
93
+ **Usage patterns:**
94
+ - Pack a public GitHub repo: `{ url: "https://github.com/org/repo", project_id, version_label: "2026-02" }`
95
+ - Pack multiple repos for brownfield onboarding: call once per repo URL
96
+
97
+ **Note:** Large repos may take 30–60 seconds. The packed snapshot is plain text suitable for LLM context.
98
+
99
+ ---
100
+
101
+ ## 5. `query_data`
102
+
103
+ Discover schemas and execute SELECT queries against an AWS Athena data warehouse.
104
+
105
+ **Parameters:**
106
+ | Parameter | Type | Required | Description |
107
+ |-----------|------|----------|-------------|
108
+ | `operation` | enum | ✓ | `list-tables`, `describe-tables`, or `execute-query` |
109
+ | `project_id` | string (UUID) | ✓ | Project UUID for session scoping |
110
+ | `query` | string | For `execute-query` | SQL SELECT statement to execute |
111
+ | `table_names` | string[] | For `describe-tables` | Tables to describe |
112
+
113
+ **Returns:**
114
+ - `list-tables`: Array of available table names
115
+ - `describe-tables`: Schema definitions for requested tables
116
+ - `execute-query`: Query results (capped at 100KB; truncation notice included if limit hit)
117
+
118
+ **Usage patterns:**
119
+ - Discover available tables: `{ operation: "list-tables", project_id }`
120
+ - Get schema for selected tables: `{ operation: "describe-tables", project_id, table_names: ["events", "users"] }`
121
+ - Execute a query: `{ operation: "execute-query", project_id, query: "SELECT user_id, COUNT(*) FROM events GROUP BY 1 LIMIT 100" }`
122
+
123
+ **Constraints:** Only SELECT statements are permitted. DDL and DML are rejected.
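The `query_data` parameter table maps naturally onto a discriminated union, and the SELECT-only constraint can be pre-checked on the client before a call is made. The sketch below is illustrative only: the type and function names are hypothetical, the check is a heuristic, and the server's own validation (whose rules are not shown in this diff) remains the authority.

```typescript
// Hypothetical typing of the query_data parameters, derived from the table above.
type QueryDataParams =
  | { operation: "list-tables"; project_id: string }
  | { operation: "describe-tables"; project_id: string; table_names: string[] }
  | { operation: "execute-query"; project_id: string; query: string };

// Heuristic pre-check: a single statement that starts with SELECT or WITH.
// It is deliberately conservative, so a semicolon inside a string literal is
// also rejected.
function looksLikeSelect(sql: string): boolean {
  const stripped = sql
    .replace(/\/\*[\s\S]*?\*\//g, " ") // drop block comments
    .replace(/--[^\n]*/g, " ")         // drop line comments
    .trim()
    .replace(/;\s*$/, "");             // allow one trailing semicolon
  if (stripped.indexOf(";") !== -1) {
    return false; // more than one statement
  }
  return /^(select|with)\b/i.test(stripped);
}

// Builds execute-query arguments, refusing anything that fails the pre-check.
function executeQueryParams(projectId: string, sql: string): QueryDataParams {
  if (!looksLikeSelect(sql)) {
    throw new Error("Only SELECT statements are permitted");
  }
  return { operation: "execute-query", project_id: projectId, query: sql };
}

const params = executeQueryParams(
  "00000000-0000-0000-0000-000000000000", // placeholder UUID
  "SELECT user_id, COUNT(*) FROM events GROUP BY 1 LIMIT 100"
);
console.log(params.operation);
```

Failing fast on the client lets an agent offer an alternative SELECT formulation without spending a round trip on a request the server would reject.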