@memberjunction/db-auto-doc 2.116.0 → 2.118.0
- package/README.md +652 -165
- package/bin/run.js +7 -0
- package/dist/api/DBAutoDocAPI.d.ts +252 -0
- package/dist/api/DBAutoDocAPI.d.ts.map +1 -0
- package/dist/api/DBAutoDocAPI.js +530 -0
- package/dist/api/DBAutoDocAPI.js.map +1 -0
- package/dist/api/index.d.ts +7 -0
- package/dist/api/index.d.ts.map +1 -0
- package/dist/api/index.js +10 -0
- package/dist/api/index.js.map +1 -0
- package/dist/commands/analyze.d.ts +6 -4
- package/dist/commands/analyze.d.ts.map +1 -1
- package/dist/commands/analyze.js +58 -71
- package/dist/commands/analyze.js.map +1 -1
- package/dist/commands/export.d.ts +14 -4
- package/dist/commands/export.d.ts.map +1 -1
- package/dist/commands/export.js +156 -61
- package/dist/commands/export.js.map +1 -1
- package/dist/commands/init.d.ts +3 -4
- package/dist/commands/init.d.ts.map +1 -1
- package/dist/commands/init.js +155 -146
- package/dist/commands/init.js.map +1 -1
- package/dist/commands/reset.d.ts +4 -1
- package/dist/commands/reset.d.ts.map +1 -1
- package/dist/commands/reset.js +33 -19
- package/dist/commands/reset.js.map +1 -1
- package/dist/commands/status.d.ts +10 -0
- package/dist/commands/status.d.ts.map +1 -0
- package/dist/commands/status.js +66 -0
- package/dist/commands/status.js.map +1 -0
- package/dist/core/AnalysisEngine.d.ts +108 -0
- package/dist/core/AnalysisEngine.d.ts.map +1 -0
- package/dist/core/AnalysisEngine.js +716 -0
- package/dist/core/AnalysisEngine.js.map +1 -0
- package/dist/core/AnalysisOrchestrator.d.ts +37 -0
- package/dist/core/AnalysisOrchestrator.d.ts.map +1 -0
- package/dist/core/AnalysisOrchestrator.js +294 -0
- package/dist/core/AnalysisOrchestrator.js.map +1 -0
- package/dist/core/BackpropagationEngine.d.ts +32 -0
- package/dist/core/BackpropagationEngine.d.ts.map +1 -0
- package/dist/core/BackpropagationEngine.js +121 -0
- package/dist/core/BackpropagationEngine.js.map +1 -0
- package/dist/core/ConvergenceDetector.d.ts +27 -0
- package/dist/core/ConvergenceDetector.d.ts.map +1 -0
- package/dist/core/ConvergenceDetector.js +92 -0
- package/dist/core/ConvergenceDetector.js.map +1 -0
- package/dist/core/GuardrailsManager.d.ts +78 -0
- package/dist/core/GuardrailsManager.d.ts.map +1 -0
- package/dist/core/GuardrailsManager.js +367 -0
- package/dist/core/GuardrailsManager.js.map +1 -0
- package/dist/core/index.d.ts +7 -0
- package/dist/core/index.d.ts.map +1 -0
- package/dist/core/index.js +13 -0
- package/dist/core/index.js.map +1 -0
- package/dist/database/Database.d.ts +56 -0
- package/dist/database/Database.d.ts.map +1 -0
- package/dist/database/Database.js +172 -0
- package/dist/database/Database.js.map +1 -0
- package/dist/database/TopologicalSorter.d.ts +25 -0
- package/dist/database/TopologicalSorter.d.ts.map +1 -0
- package/dist/database/TopologicalSorter.js +150 -0
- package/dist/database/TopologicalSorter.js.map +1 -0
- package/dist/database/index.d.ts +6 -0
- package/dist/database/index.d.ts.map +1 -0
- package/dist/database/index.js +14 -0
- package/dist/database/index.js.map +1 -0
- package/dist/discovery/ColumnStatsCache.d.ts +91 -0
- package/dist/discovery/ColumnStatsCache.d.ts.map +1 -0
- package/dist/discovery/ColumnStatsCache.js +231 -0
- package/dist/discovery/ColumnStatsCache.js.map +1 -0
- package/dist/discovery/DiscoveryEngine.d.ts +100 -0
- package/dist/discovery/DiscoveryEngine.d.ts.map +1 -0
- package/dist/discovery/DiscoveryEngine.js +726 -0
- package/dist/discovery/DiscoveryEngine.js.map +1 -0
- package/dist/discovery/DiscoveryTriggerAnalyzer.d.ts +57 -0
- package/dist/discovery/DiscoveryTriggerAnalyzer.d.ts.map +1 -0
- package/dist/discovery/DiscoveryTriggerAnalyzer.js +186 -0
- package/dist/discovery/DiscoveryTriggerAnalyzer.js.map +1 -0
- package/dist/discovery/FKDetector.d.ts +47 -0
- package/dist/discovery/FKDetector.d.ts.map +1 -0
- package/dist/discovery/FKDetector.js +317 -0
- package/dist/discovery/FKDetector.js.map +1 -0
- package/dist/discovery/LLMDiscoveryValidator.d.ts +64 -0
- package/dist/discovery/LLMDiscoveryValidator.d.ts.map +1 -0
- package/dist/discovery/LLMDiscoveryValidator.js +431 -0
- package/dist/discovery/LLMDiscoveryValidator.js.map +1 -0
- package/dist/discovery/LLMSanityChecker.d.ts +38 -0
- package/dist/discovery/LLMSanityChecker.d.ts.map +1 -0
- package/dist/discovery/LLMSanityChecker.js +156 -0
- package/dist/discovery/LLMSanityChecker.js.map +1 -0
- package/dist/discovery/PKDetector.d.ts +62 -0
- package/dist/discovery/PKDetector.d.ts.map +1 -0
- package/dist/discovery/PKDetector.js +436 -0
- package/dist/discovery/PKDetector.js.map +1 -0
- package/dist/discovery/index.d.ts +9 -0
- package/dist/discovery/index.d.ts.map +1 -0
- package/dist/discovery/index.js +25 -0
- package/dist/discovery/index.js.map +1 -0
- package/dist/drivers/BaseAutoDocDriver.d.ts +132 -0
- package/dist/drivers/BaseAutoDocDriver.d.ts.map +1 -0
- package/dist/drivers/BaseAutoDocDriver.js +121 -0
- package/dist/drivers/BaseAutoDocDriver.js.map +1 -0
- package/dist/drivers/MySQLDriver.d.ts +61 -0
- package/dist/drivers/MySQLDriver.d.ts.map +1 -0
- package/dist/drivers/MySQLDriver.js +668 -0
- package/dist/drivers/MySQLDriver.js.map +1 -0
- package/dist/drivers/PostgreSQLDriver.d.ts +65 -0
- package/dist/drivers/PostgreSQLDriver.d.ts.map +1 -0
- package/dist/drivers/PostgreSQLDriver.js +704 -0
- package/dist/drivers/PostgreSQLDriver.js.map +1 -0
- package/dist/drivers/SQLServerDriver.d.ts +61 -0
- package/dist/drivers/SQLServerDriver.d.ts.map +1 -0
- package/dist/drivers/SQLServerDriver.js +667 -0
- package/dist/drivers/SQLServerDriver.js.map +1 -0
- package/dist/generators/CSVGenerator.d.ts +35 -0
- package/dist/generators/CSVGenerator.d.ts.map +1 -0
- package/dist/generators/CSVGenerator.js +154 -0
- package/dist/generators/CSVGenerator.js.map +1 -0
- package/dist/generators/HTMLGenerator.d.ts +29 -0
- package/dist/generators/HTMLGenerator.d.ts.map +1 -0
- package/dist/generators/HTMLGenerator.js +710 -0
- package/dist/generators/HTMLGenerator.js.map +1 -0
- package/dist/generators/MarkdownGenerator.d.ts +27 -0
- package/dist/generators/MarkdownGenerator.d.ts.map +1 -0
- package/dist/generators/MarkdownGenerator.js +361 -0
- package/dist/generators/MarkdownGenerator.js.map +1 -0
- package/dist/generators/MermaidGenerator.d.ts +35 -0
- package/dist/generators/MermaidGenerator.d.ts.map +1 -0
- package/dist/generators/MermaidGenerator.js +321 -0
- package/dist/generators/MermaidGenerator.js.map +1 -0
- package/dist/generators/ReportGenerator.d.ts +22 -0
- package/dist/generators/ReportGenerator.d.ts.map +1 -0
- package/dist/generators/ReportGenerator.js +176 -0
- package/dist/generators/ReportGenerator.js.map +1 -0
- package/dist/generators/SQLGenerator.d.ts +31 -0
- package/dist/generators/SQLGenerator.d.ts.map +1 -0
- package/dist/generators/SQLGenerator.js +168 -0
- package/dist/generators/SQLGenerator.js.map +1 -0
- package/dist/generators/index.d.ts +10 -0
- package/dist/generators/index.d.ts.map +1 -0
- package/dist/generators/index.js +19 -0
- package/dist/generators/index.js.map +1 -0
- package/dist/index.d.ts +11 -20
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +19 -20
- package/dist/index.js.map +1 -1
- package/dist/prompts/PromptEngine.d.ts +65 -0
- package/dist/prompts/PromptEngine.d.ts.map +1 -0
- package/dist/prompts/PromptEngine.js +282 -0
- package/dist/prompts/PromptEngine.js.map +1 -0
- package/dist/prompts/PromptFileLoader.d.ts +21 -0
- package/dist/prompts/PromptFileLoader.d.ts.map +1 -0
- package/dist/prompts/PromptFileLoader.js +74 -0
- package/dist/prompts/PromptFileLoader.js.map +1 -0
- package/dist/prompts/index.d.ts +6 -0
- package/dist/prompts/index.d.ts.map +1 -0
- package/dist/prompts/index.js +11 -0
- package/dist/prompts/index.js.map +1 -0
- package/dist/state/IterationTracker.d.ts +64 -0
- package/dist/state/IterationTracker.d.ts.map +1 -0
- package/dist/state/IterationTracker.js +136 -0
- package/dist/state/IterationTracker.js.map +1 -0
- package/dist/state/StateManager.d.ts +79 -0
- package/dist/state/StateManager.d.ts.map +1 -0
- package/dist/state/StateManager.js +348 -0
- package/dist/state/StateManager.js.map +1 -0
- package/dist/state/StateValidator.d.ts +24 -0
- package/dist/state/StateValidator.d.ts.map +1 -0
- package/dist/state/StateValidator.js +147 -0
- package/dist/state/StateValidator.js.map +1 -0
- package/dist/state/index.d.ts +7 -0
- package/dist/state/index.d.ts.map +1 -0
- package/dist/state/index.js +13 -0
- package/dist/state/index.js.map +1 -0
- package/dist/types/analysis.d.ts +76 -0
- package/dist/types/analysis.d.ts.map +1 -0
- package/dist/types/analysis.js +6 -0
- package/dist/types/analysis.js.map +1 -0
- package/dist/types/config.d.ts +132 -0
- package/dist/types/config.d.ts.map +1 -0
- package/dist/types/config.js +7 -0
- package/dist/types/config.js.map +1 -0
- package/dist/types/discovery.d.ts +277 -0
- package/dist/types/discovery.d.ts.map +1 -0
- package/dist/types/discovery.js +7 -0
- package/dist/types/discovery.js.map +1 -0
- package/dist/types/driver.d.ts +148 -0
- package/dist/types/driver.d.ts.map +1 -0
- package/dist/types/driver.js +7 -0
- package/dist/types/driver.js.map +1 -0
- package/dist/types/index.d.ts +8 -0
- package/dist/types/index.d.ts.map +1 -0
- package/dist/types/index.js +24 -0
- package/dist/types/index.js.map +1 -0
- package/dist/types/prompts.d.ts +158 -0
- package/dist/types/prompts.d.ts.map +1 -0
- package/dist/types/prompts.js +6 -0
- package/dist/types/prompts.js.map +1 -0
- package/dist/types/state.d.ts +278 -0
- package/dist/types/state.d.ts.map +1 -0
- package/dist/types/state.js +7 -0
- package/dist/types/state.js.map +1 -0
- package/dist/utils/config-loader.d.ts +29 -0
- package/dist/utils/config-loader.d.ts.map +1 -0
- package/dist/utils/config-loader.js +163 -0
- package/dist/utils/config-loader.js.map +1 -0
- package/dist/utils/index.d.ts +5 -0
- package/dist/utils/index.d.ts.map +1 -0
- package/dist/utils/index.js +9 -0
- package/dist/utils/index.js.map +1 -0
- package/package.json +24 -3
- package/dist/ai/simple-ai-client.d.ts +0 -70
- package/dist/ai/simple-ai-client.d.ts.map +0 -1
- package/dist/ai/simple-ai-client.js +0 -181
- package/dist/ai/simple-ai-client.js.map +0 -1
- package/dist/analyzers/analyzer.d.ts +0 -23
- package/dist/analyzers/analyzer.d.ts.map +0 -1
- package/dist/analyzers/analyzer.js +0 -127
- package/dist/analyzers/analyzer.js.map +0 -1
- package/dist/cli-old/cli.d.ts +0 -3
- package/dist/cli-old/cli.d.ts.map +0 -1
- package/dist/cli-old/cli.js +0 -388
- package/dist/cli-old/cli.js.map +0 -1
- package/dist/commands/review.d.ts +0 -11
- package/dist/commands/review.d.ts.map +0 -1
- package/dist/commands/review.js +0 -82
- package/dist/commands/review.js.map +0 -1
- package/dist/database/connection.d.ts +0 -40
- package/dist/database/connection.d.ts.map +0 -1
- package/dist/database/connection.js +0 -136
- package/dist/database/connection.js.map +0 -1
- package/dist/database/introspection.d.ts +0 -59
- package/dist/database/introspection.d.ts.map +0 -1
- package/dist/database/introspection.js +0 -124
- package/dist/database/introspection.js.map +0 -1
- package/dist/generators/markdown-generator.d.ts +0 -8
- package/dist/generators/markdown-generator.d.ts.map +0 -1
- package/dist/generators/markdown-generator.js +0 -106
- package/dist/generators/markdown-generator.js.map +0 -1
- package/dist/generators/sql-generator.d.ts +0 -20
- package/dist/generators/sql-generator.d.ts.map +0 -1
- package/dist/generators/sql-generator.js +0 -83
- package/dist/generators/sql-generator.js.map +0 -1
- package/dist/state/state-manager.d.ts +0 -95
- package/dist/state/state-manager.d.ts.map +0 -1
- package/dist/state/state-manager.js +0 -236
- package/dist/state/state-manager.js.map +0 -1
- package/dist/types/state-file.d.ts +0 -124
- package/dist/types/state-file.d.ts.map +0 -1
- package/dist/types/state-file.js +0 -79
- package/dist/types/state-file.js.map +0 -1
package/README.md
CHANGED
|
@@ -1,244 +1,731 @@
|
|
|
1
|
-
# Database
|
|
1
|
+
# DBAutoDoc - AI-Powered Database Documentation Generator
|
|
2
2
|
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
## 🚀 **Standalone Tool - No MemberJunction Runtime Required**
|
|
6
|
-
|
|
7
|
-
This tool works with **ANY** SQL Server database. You don't need MemberJunction installed or running.
|
|
3
|
+
Automatically generate comprehensive documentation for SQL Server, MySQL, and PostgreSQL databases using AI. DBAutoDoc analyzes your database structure, uses Large Language Models to understand the purpose of tables and columns, and saves descriptions as database metadata (extended properties for SQL Server, comments for MySQL/PostgreSQL).
|
|
8
4
|
|
|
9
5
|
## Features
|
|
10
6
|
|
|
11
|
-
|
|
12
|
-
-
|
|
13
|
-
-
|
|
14
|
-
-
|
|
15
|
-
-
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
7
|
+
### Core Capabilities
|
|
8
|
+
- **🤖 AI-Powered Analysis** - Uses OpenAI, Anthropic, Google, or Groq to generate intelligent descriptions
|
|
9
|
+
- **🔄 Iterative Refinement** - Multi-pass analysis with backpropagation for accuracy
|
|
10
|
+
- **📊 Topological Processing** - Analyzes tables in dependency order for better context
|
|
11
|
+
- **📈 Data-Driven** - Leverages cardinality, statistics, and sample data for insights
|
|
12
|
+
- **🎯 Convergence Detection** - Automatically knows when analysis is complete
|
|
13
|
+
- **💾 State Tracking** - Full audit trail of all iterations and reasoning
|
|
14
|
+
- **🔌 Standalone** - Works with any supported database (SQL Server, PostgreSQL, or MySQL); no MemberJunction runtime required
|
|
15
|
+
|
|
16
|
+
### Multi-Database Support
|
|
17
|
+
- **SQL Server** - Full support with extended properties
|
|
18
|
+
- **PostgreSQL** - Complete implementation with COMMENT syntax
|
|
19
|
+
- **MySQL** - Full support with column/table comments
|
|
20
|
+
- **Unified Interface** - Single configuration approach across all databases
|
|
21
|
+
|
|
22
|
+
### Advanced Features
|
|
23
|
+
- **🔍 Relationship Discovery** - Automatically detect missing primary and foreign keys using statistical analysis and LLM validation
|
|
24
|
+
- **🛡️ Granular Guardrails** - Multi-level resource controls (run, phase, iteration limits)
|
|
25
|
+
- **⏸️ Resume Capability** - Pause and resume analysis from checkpoint state files
|
|
26
|
+
- **📦 Programmatic API** - Use as a library in your own applications
|
|
27
|
+
- **🔧 Extensible** - Custom database drivers and analysis plugins
|
|
28
|
+
|
|
29
|
+
### Output Formats
|
|
30
|
+
- **SQL Scripts** - Database-specific metadata scripts (extended properties, comments)
|
|
31
|
+
- **Markdown Documentation** - Human-readable docs with ERD diagrams
|
|
32
|
+
- **HTML Documentation** - Interactive, searchable documentation with embedded CSS/JS
|
|
33
|
+
- **CSV Exports** - Spreadsheet-ready table and column data
|
|
34
|
+
- **Mermaid Diagrams** - Standalone ERD files (.mmd and .html)
|
|
35
|
+
- **Analysis Reports** - Detailed metrics and quality assessments
|
|
24
36
|
|
|
25
37
|
## Installation
|
|
26
38
|
|
|
39
|
+
### Global Installation (Recommended for DBAs)
|
|
40
|
+
|
|
27
41
|
```bash
|
|
28
|
-
# Install globally (for standalone use)
|
|
29
42
|
npm install -g @memberjunction/db-auto-doc
|
|
43
|
+
```
|
|
30
44
|
|
|
31
|
-
|
|
32
|
-
npx @memberjunction/db-auto-doc
|
|
45
|
+
### Within MemberJunction Project
|
|
33
46
|
|
|
34
|
-
|
|
35
|
-
|
|
47
|
+
```bash
|
|
48
|
+
npm install @memberjunction/db-auto-doc
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
### As a Library Dependency
|
|
52
|
+
|
|
53
|
+
```bash
|
|
54
|
+
npm install @memberjunction/db-auto-doc --save
|
|
36
55
|
```
|
|
37
56
|
|
|
38
57
|
## Quick Start
|
|
39
58
|
|
|
40
|
-
###
|
|
59
|
+
### 1. Initialize
|
|
41
60
|
|
|
42
61
|
```bash
|
|
43
|
-
# 1. Initialize project
|
|
44
62
|
db-auto-doc init
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
This interactive wizard will:
|
|
66
|
+
- Configure database connection
|
|
67
|
+
- Set up AI provider (OpenAI, Anthropic, Google, or Groq)
|
|
68
|
+
- Configure guardrails and resource limits
|
|
69
|
+
- Optionally add seed context for better analysis
|
|
70
|
+
- Create `config.json`
|
|
45
71
|
|
|
46
|
-
|
|
47
|
-
# AI_API_KEY=sk-your-key-here
|
|
72
|
+
### 2. Analyze
|
|
48
73
|
|
|
49
|
-
|
|
74
|
+
```bash
|
|
50
75
|
db-auto-doc analyze
|
|
76
|
+
```
|
|
77
|
+
|
|
78
|
+
This will:
|
|
79
|
+
- Introspect your database structure
|
|
80
|
+
- Analyze data (cardinality, statistics, patterns)
|
|
81
|
+
- Optionally discover missing primary and foreign keys
|
|
82
|
+
- Build dependency graph
|
|
83
|
+
- Run iterative AI analysis with backpropagation
|
|
84
|
+
- Perform sanity checks
|
|
85
|
+
- Save state to `db-doc-state.json`
|
|
51
86
|
|
|
52
|
-
|
|
53
|
-
db-auto-doc review
|
|
87
|
+
### 3. Export
|
|
54
88
|
|
|
55
|
-
|
|
56
|
-
db-auto-doc export --
|
|
89
|
+
```bash
|
|
90
|
+
db-auto-doc export --sql --markdown --html --csv --mermaid
|
|
57
91
|
```
|
|
58
92
|
|
|
59
|
-
|
|
93
|
+
This generates:
|
|
94
|
+
- **SQL Script**: Database-specific metadata statements
|
|
95
|
+
- **Markdown Documentation**: Human-readable docs with ERD links
|
|
96
|
+
- **HTML Documentation**: Interactive searchable documentation
|
|
97
|
+
- **CSV Files**: tables.csv and columns.csv for spreadsheet analysis
|
|
98
|
+
- **Mermaid Diagrams**: erd.mmd and erd.html for visualization
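The "database-specific metadata statements" differ by engine. The following is a minimal sketch of what such statements might look like for a table description, not the package's actual generator. The function names, the `MS_Description` property name, and the quoting rules are assumptions made for illustration.

```typescript
type Provider = 'sqlserver' | 'postgresql' | 'mysql';

// Doubles single quotes so the text can sit inside a SQL string literal.
// Assumption: backslashes and identifier quoting are not handled in this sketch.
function quoteLiteral(text: string): string {
  return text.replace(/'/g, "''");
}

// Builds one statement that attaches a description to a table.
function tableDescriptionSql(
  provider: Provider,
  schema: string,
  table: string,
  description: string
): string {
  const value = quoteLiteral(description);
  switch (provider) {
    case 'sqlserver':
      // Extended property; MS_Description is the conventional name (assumed here).
      return (
        `EXEC sys.sp_addextendedproperty @name = N'MS_Description', @value = N'${value}', ` +
        `@level0type = N'SCHEMA', @level0name = N'${quoteLiteral(schema)}', ` +
        `@level1type = N'TABLE', @level1name = N'${quoteLiteral(table)}';`
      );
    case 'postgresql':
      return `COMMENT ON TABLE "${schema}"."${table}" IS '${value}';`;
    case 'mysql':
      return `ALTER TABLE \`${schema}\`.\`${table}\` COMMENT = '${value}';`;
    default:
      throw new Error(`Unsupported provider: ${String(provider)}`);
  }
}

console.log(tableDescriptionSql('postgresql', 'public', 'users', "User's accounts"));
```

Note that `sp_addextendedproperty` fails if the property already exists, so a real script would likely need an update path as well.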
|
|
99
|
+
|
|
100
|
+
Optionally, apply the generated SQL directly to the database:
|
|
60
101
|
|
|
61
102
|
```bash
|
|
62
|
-
|
|
63
|
-
mj dbdoc init
|
|
64
|
-
mj dbdoc analyze --schemas=dbo
|
|
65
|
-
mj dbdoc review --unapproved-only
|
|
66
|
-
mj dbdoc export --approved-only --execute
|
|
103
|
+
db-auto-doc export --sql --apply
|
|
67
104
|
```
|
|
68
105
|
|
|
69
|
-
###
|
|
106
|
+
### 4. Check Status
|
|
70
107
|
|
|
71
|
-
```
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
StateManager,
|
|
75
|
-
DatabaseAnalyzer,
|
|
76
|
-
SimpleAIClient,
|
|
77
|
-
} from '@memberjunction/db-auto-doc';
|
|
108
|
+
```bash
|
|
109
|
+
db-auto-doc status
|
|
110
|
+
```
|
|
78
111
|
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
112
|
+
Shows:
|
|
113
|
+
- Analysis progress and phase completion
|
|
114
|
+
- Convergence status
|
|
115
|
+
- Low-confidence tables and columns
|
|
116
|
+
- Token usage, cost, and duration
|
|
117
|
+
- Guardrail status and warnings
|
|
84
118
|
|
|
85
|
-
|
|
86
|
-
await analyzer.analyze({ schemas: ['dbo'] });
|
|
119
|
+
### 5. Resume Analysis
|
|
87
120
|
|
|
88
|
-
|
|
89
|
-
|
|
90
|
-
const state = stateManager.getState();
|
|
91
|
-
const sqlGen = new SQLGenerator();
|
|
92
|
-
const sql = sqlGen.generate(state);
|
|
121
|
+
```bash
|
|
122
|
+
db-auto-doc analyze --resume ./db-doc-state.json
|
|
93
123
|
```
|
|
94
124
|
|
|
95
|
-
|
|
125
|
+
Resume a previous analysis from a checkpoint state file, useful for:
|
|
126
|
+
- Continuing after hitting guardrail limits
|
|
127
|
+
- Recovering from interruptions
|
|
128
|
+
- Incremental database updates
|
|
129
|
+
|
|
130
|
+
## How It Works
|
|
131
|
+
|
|
132
|
+
### Topological Analysis
|
|
96
133
|
|
|
97
|
-
|
|
98
|
-
Initialize new documentation project
|
|
99
|
-
- Prompts for database connection
|
|
100
|
-
- Creates `.env` file
|
|
101
|
-
- Creates `db-doc-state.json`
|
|
102
|
-
- Optionally asks seed questions
|
|
134
|
+
DBAutoDoc processes tables in dependency order:
|
|
103
135
|
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
|
|
109
|
-
|
|
136
|
+
```
|
|
137
|
+
Level 0: Users, Products, Categories (no dependencies)
|
|
138
|
+
↓
|
|
139
|
+
Level 1: Orders (depends on Users), ProductCategories (Products + Categories)
|
|
140
|
+
↓
|
|
141
|
+
Level 2: OrderItems (depends on Orders + Products)
|
|
142
|
+
↓
|
|
143
|
+
Level 3: Shipments (depends on OrderItems)
|
|
144
|
+
```
|
|
110
145
|
|
|
111
|
-
|
|
112
|
-
Review and approve AI-generated documentation
|
|
113
|
-
- `--schema <schema>` - Review specific schema
|
|
114
|
-
- `--unapproved-only` - Only show unapproved items
|
|
146
|
+
Processing in this order allows child tables to benefit from parent table context.
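The level assignment above can be sketched as follows. This is an illustration of dependency-level ordering in general, not the package's `TopologicalSorter`; the `DependencyMap` shape, the `computeLevels` name, and the way cycles are broken are all assumptions.

```typescript
// Hypothetical input shape: table name -> tables it references through foreign keys.
type DependencyMap = Record<string, string[]>;

// Level 0 = no dependencies; otherwise 1 + the highest level among referenced tables.
function computeLevels(deps: DependencyMap): Map<string, number> {
  const levels = new Map<string, number>();
  const visiting = new Set<string>();

  const visit = (table: string): number => {
    const known = levels.get(table);
    if (known !== undefined) return known;
    // Assumption: a table reached again while still being visited is part of a
    // cycle, so it is treated as level 0 at this point to break the loop.
    if (visiting.has(table)) return 0;
    visiting.add(table);
    let level = 0;
    for (const parent of deps[table] ?? []) {
      if (parent === table) continue; // ignore self-references
      level = Math.max(level, visit(parent) + 1);
    }
    visiting.delete(table);
    levels.set(table, level);
    return level;
  };

  for (const table of Object.keys(deps)) visit(table);
  return levels;
}

const exampleLevels = computeLevels({
  Users: [],
  Products: [],
  Categories: [],
  Orders: ['Users'],
  ProductCategories: ['Products', 'Categories'],
  OrderItems: ['Orders', 'Products'],
  Shipments: ['OrderItems'],
});
console.log(exampleLevels.get('Shipments')); // 3
```

Tables can then be processed level by level, so every parent has a description before its children are analyzed.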
|
|
115
147
|
|
|
116
|
-
###
|
|
117
|
-
Generate output files
|
|
118
|
-
- `--format <format>` - sql|markdown|all (default: all)
|
|
119
|
-
- `--output <path>` - Output directory
|
|
120
|
-
- `--execute` - Execute SQL script (apply to database)
|
|
121
|
-
- `--approved-only` - Only export approved items
|
|
148
|
+
### Relationship Discovery
|
|
122
149
|
|
|
123
|
-
|
|
124
|
-
|
|
125
|
-
-
|
|
150
|
+
For legacy databases missing primary/foreign key constraints, DBAutoDoc can:
|
|
151
|
+
- **Detect Primary Keys** using statistical analysis (uniqueness, nullability, cardinality)
|
|
152
|
+
- **Find Foreign Keys** using value overlap analysis and naming patterns
|
|
153
|
+
- **LLM Validation** to verify discovered relationships make business sense
|
|
154
|
+
- **Backpropagation** to refine parent table analysis based on child relationships
|
|
126
155
|
|
|
127
|
-
|
|
156
|
+
Relationship discovery is triggered automatically when:
|
|
157
|
+
- Tables lack primary key constraints
|
|
158
|
+
- Too few foreign key relationships are detected (below the configured threshold)
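The two statistical checks named above (uniqueness and nullability for primary keys, value overlap for foreign keys) can be illustrated in memory. This is a rough sketch under assumptions: the function names are hypothetical, and the real tool presumably samples rows through SQL and combines several signals into a confidence score.

```typescript
type Cell = string | number | null;

// A column is a primary key candidate when it has no NULLs and no duplicates.
function isPrimaryKeyCandidate(values: Cell[]): boolean {
  if (values.length === 0) return false;
  if (values.some((v) => v === null)) return false;
  return new Set(values).size === values.length;
}

// Fraction of distinct non-null child values that also appear in the parent key.
// A ratio close to 1 suggests the child column may reference the parent.
function valueOverlap(childValues: Cell[], parentKeys: Cell[]): number {
  const parents = new Set(parentKeys);
  const child = Array.from(new Set(childValues.filter((v) => v !== null)));
  if (child.length === 0) return 0;
  const hits = child.filter((v) => parents.has(v)).length;
  return hits / child.length;
}

console.log(isPrimaryKeyCandidate([1, 2, 3]));     // true
console.log(valueOverlap([1, 2, 2, 9], [1, 2, 3])); // 2 of 3 distinct values match
```

A candidate found this way would still need validation, which is where naming patterns and the LLM check described above come in.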
|
|
128
159
|
|
|
129
|
-
|
|
160
|
+
### Backpropagation
|
|
130
161
|
|
|
131
|
-
|
|
132
|
-
# Database Connection
|
|
133
|
-
DB_SERVER=localhost
|
|
134
|
-
DB_DATABASE=YourDatabase
|
|
135
|
-
DB_USER=sa
|
|
136
|
-
DB_PASSWORD=YourPassword
|
|
137
|
-
DB_ENCRYPT=true
|
|
138
|
-
DB_TRUST_SERVER_CERTIFICATE=true
|
|
162
|
+
After analyzing child tables, DBAutoDoc can detect insights about parent tables and trigger re-analysis:
|
|
139
163
|
|
|
140
|
-
|
|
141
|
-
|
|
142
|
-
|
|
143
|
-
|
|
164
|
+
```
|
|
165
|
+
Level 0: "Persons" → Initially thinks: "General contact information"
|
|
166
|
+
↓
|
|
167
|
+
Level 1: "Students" table reveals Persons.Type has values: Student, Teacher, Staff
|
|
168
|
+
↓
|
|
169
|
+
BACKPROPAGATE to Level 0: "Persons" → Revise to: "School personnel with role-based typing"
|
|
144
170
|
```
|
|
145
171
|
|
|
146
|
-
|
|
172
|
+
### Convergence
|
|
173
|
+
|
|
174
|
+
Analysis stops when:
|
|
175
|
+
1. **No changes** in last N iterations (stability window)
|
|
176
|
+
2. **All tables** meet confidence threshold
|
|
177
|
+
3. **Max iterations** reached
|
|
178
|
+
4. **Guardrail limits** exceeded (tokens, cost, duration)
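The four stop conditions can be expressed as a single check. The sketch below assumes each condition is sufficient on its own and evaluates them in a fixed priority order; both choices, along with the `stopReason` name and its inputs, are assumptions rather than the package's `ConvergenceDetector`.

```typescript
interface ConvergenceConfig {
  maxIterations: number;
  stabilityWindow: number;
  confidenceThreshold: number;
}

type StopReason = 'guardrail' | 'max-iterations' | 'stable' | 'confident';

// changesPerIteration[i] = number of descriptions that changed in iteration i.
function stopReason(
  changesPerIteration: number[],
  tableConfidences: number[],
  cfg: ConvergenceConfig,
  guardrailExceeded = false
): StopReason | null {
  if (guardrailExceeded) return 'guardrail';
  if (changesPerIteration.length >= cfg.maxIterations) return 'max-iterations';

  // Stable: the last `stabilityWindow` iterations all produced zero changes.
  const recent = cfg.stabilityWindow > 0 ? changesPerIteration.slice(-cfg.stabilityWindow) : [];
  if (recent.length > 0 && recent.length === cfg.stabilityWindow && recent.every((c) => c === 0)) {
    return 'stable';
  }

  // Confident: every table meets the confidence threshold.
  if (tableConfidences.length > 0 && tableConfidences.every((c) => c >= cfg.confidenceThreshold)) {
    return 'confident';
  }
  return null; // keep iterating
}

const exampleConfig: ConvergenceConfig = { maxIterations: 50, stabilityWindow: 2, confidenceThreshold: 0.85 };
console.log(stopReason([3, 0, 0], [0.5], exampleConfig)); // "stable"
```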
|
|
179
|
+
|
|
180
|
+
### Granular Guardrails
|
|
181
|
+
|
|
182
|
+
Multi-level resource controls ensure analysis stays within bounds:
|
|
147
183
|
|
|
148
|
-
|
|
184
|
+
**Run-Level Limits**:
|
|
185
|
+
- `maxTokensPerRun`: Total token budget for entire analysis
|
|
186
|
+
- `maxDurationSeconds`: Maximum wall-clock time
|
|
187
|
+
- `maxCostDollars`: Maximum AI cost
|
|
188
|
+
|
|
189
|
+
**Phase-Level Limits**:
|
|
190
|
+
- `maxTokensPerPhase.discovery`: Budget for relationship discovery
|
|
191
|
+
- `maxTokensPerPhase.analysis`: Budget for description generation
|
|
192
|
+
- `maxTokensPerPhase.sanityChecks`: Budget for validation
|
|
193
|
+
|
|
194
|
+
**Iteration-Level Limits**:
|
|
195
|
+
- `maxTokensPerIteration`: Per-iteration token cap
|
|
196
|
+
- `maxIterationDurationSeconds`: Per-iteration time limit
|
|
197
|
+
|
|
198
|
+
**Warning Thresholds**:
|
|
199
|
+
- Configurable percentage-based warnings (default 80-85%)
|
|
200
|
+
- Early notification before hitting hard limits
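A percentage-based warning in front of a hard limit can be sketched like this. The `checkLimit` and `worstStatus` helpers are hypothetical, and treating usage equal to the limit as "exceeded" is an assumption; only the limit and threshold names come from the configuration.

```typescript
type GuardrailStatus = 'ok' | 'warn' | 'exceeded';

// Compares usage against a hard limit and a warning percentage of that limit.
function checkLimit(used: number, limit: number | undefined, warnPercentage: number): GuardrailStatus {
  if (limit === undefined) return 'ok'; // no limit configured
  if (used >= limit) return 'exceeded';
  // Integer comparison avoids floating-point rounding at the threshold.
  if (used * 100 >= limit * warnPercentage) return 'warn';
  return 'ok';
}

const severity: Record<GuardrailStatus, number> = { ok: 0, warn: 1, exceeded: 2 };

// Combines run-, phase-, and iteration-level checks into the most severe status.
function worstStatus(statuses: GuardrailStatus[]): GuardrailStatus {
  return statuses.reduce<GuardrailStatus>(
    (worst, s) => (severity[s] > severity[worst] ? s : worst),
    'ok'
  );
}

const overall = worstStatus([
  checkLimit(210000, 250000, 80), // run level: maxTokensPerRun, tokenPercentage
  checkLimit(40000, 150000, 85),  // phase level: maxTokensPerPhase.analysis
  checkLimit(12000, 50000, 85),   // iteration level: maxTokensPerIteration
]);
console.log(overall); // "warn"
```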
|
|
201
|
+
|
|
202
|
+
### Data Analysis
|
|
203
|
+
|
|
204
|
+
For each column, DBAutoDoc collects:
|
|
205
|
+
- **Cardinality**: Distinct value counts
|
|
206
|
+
- **Statistics**: Min, max, average, standard deviation
|
|
207
|
+
- **Patterns**: Common prefixes, format detection
|
|
208
|
+
- **Value Distribution**: Actual enum values if low cardinality
|
|
209
|
+
- **Sample Data**: Stratified sampling across value ranges
|
|
210
|
+
|
|
211
|
+
This rich context enables AI to make accurate inferences.
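As a rough illustration of the per-column profile, the sketch below computes cardinality, basic statistics, and the distinct value list for low-cardinality columns from an in-memory sample. The `ColumnProfile` shape and `profileColumn` function are assumptions; the tool itself presumably gathers these figures with SQL queries against the database.

```typescript
interface ColumnProfile {
  rowCount: number;
  nullCount: number;
  distinctCount: number;
  distinctValues?: Array<string | number>; // kept only when cardinality is low
  min?: number;
  max?: number;
  mean?: number;
  stdDev?: number; // population standard deviation
}

function profileColumn(
  values: Array<string | number | null>,
  cardinalityThreshold = 20
): ColumnProfile {
  const present = values.filter((v): v is string | number => v !== null);
  const distinct = Array.from(new Set(present));
  const profile: ColumnProfile = {
    rowCount: values.length,
    nullCount: values.length - present.length,
    distinctCount: distinct.length,
  };
  // Low-cardinality columns behave like enums, so the actual values are kept.
  if (distinct.length <= cardinalityThreshold) profile.distinctValues = distinct;

  // Numeric statistics only apply when every non-null value is a number.
  const numbers = present.filter((v): v is number => typeof v === 'number');
  if (numbers.length > 0 && numbers.length === present.length) {
    const mean = numbers.reduce((sum, n) => sum + n, 0) / numbers.length;
    const variance = numbers.reduce((sum, n) => sum + (n - mean) * (n - mean), 0) / numbers.length;
    profile.min = Math.min(...numbers);
    profile.max = Math.max(...numbers);
    profile.mean = mean;
    profile.stdDev = Math.sqrt(variance);
  }
  return profile;
}

console.log(profileColumn(['Student', 'Teacher', 'Student', null]));
```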
|
|
212
|
+
|
|
213
|
+
## Configuration
|
|
214
|
+
|
|
215
|
+
### SQL Server Configuration
|
|
149
216
|
|
|
150
217
|
```json
|
|
151
218
|
{
|
|
152
|
-
"version": "1.0",
|
|
219
|
+
"version": "1.0.0",
|
|
153
220
|
"database": {
|
|
154
|
-
"
|
|
155
|
-
"
|
|
221
|
+
"provider": "sqlserver",
|
|
222
|
+
"host": "localhost",
|
|
223
|
+
"database": "MyDatabase",
|
|
224
|
+
"user": "sa",
|
|
225
|
+
"password": "YourPassword",
|
|
226
|
+
"encrypt": true,
|
|
227
|
+
"trustServerCertificate": false
|
|
156
228
|
},
|
|
157
|
-
"
|
|
158
|
-
"
|
|
159
|
-
"
|
|
229
|
+
"ai": {
|
|
230
|
+
"provider": "openai",
|
|
231
|
+
"model": "gpt-4-turbo-preview",
|
|
232
|
+
"apiKey": "sk-...",
|
|
233
|
+
"temperature": 0.1,
|
|
234
|
+
"maxTokens": 8000,
|
|
235
|
+
"effortLevel": 50
|
|
160
236
|
},
|
|
161
|
-
"
|
|
162
|
-
"
|
|
163
|
-
|
|
164
|
-
|
|
165
|
-
|
|
166
|
-
|
|
167
|
-
|
|
168
|
-
|
|
169
|
-
|
|
170
|
-
|
|
171
|
-
|
|
172
|
-
|
|
173
|
-
|
|
237
|
+
"analysis": {
|
|
238
|
+
"cardinalityThreshold": 20,
|
|
239
|
+
"sampleSize": 10,
|
|
240
|
+
"includeStatistics": true,
|
|
241
|
+
"includePatternAnalysis": true,
|
|
242
|
+
"convergence": {
|
|
243
|
+
"maxIterations": 50,
|
|
244
|
+
"stabilityWindow": 2,
|
|
245
|
+
"confidenceThreshold": 0.85
|
|
246
|
+
},
|
|
247
|
+
"backpropagation": {
|
|
248
|
+
"enabled": true,
|
|
249
|
+
"maxDepth": 3
|
|
250
|
+
},
|
|
251
|
+
"sanityChecks": {
|
|
252
|
+
"dependencyLevel": true,
|
|
253
|
+
"schemaLevel": true,
|
|
254
|
+
"crossSchema": true
|
|
255
|
+
},
|
|
256
|
+
"guardrails": {
|
|
257
|
+
"enabled": true,
|
|
258
|
+
"stopOnExceeded": true,
|
|
259
|
+
"maxTokensPerRun": 250000,
|
|
260
|
+
"maxDurationSeconds": 3600,
|
|
261
|
+
"maxCostDollars": 50,
|
|
262
|
+
"maxTokensPerPhase": {
|
|
263
|
+
"discovery": 100000,
|
|
264
|
+
"analysis": 150000,
|
|
265
|
+
"sanityChecks": 50000
|
|
266
|
+
},
|
|
267
|
+
"maxTokensPerIteration": 50000,
|
|
268
|
+
"maxIterationDurationSeconds": 600,
|
|
269
|
+
"warnThresholds": {
|
|
270
|
+
"tokenPercentage": 80,
|
|
271
|
+
"durationPercentage": 80,
|
|
272
|
+
"costPercentage": 80,
|
|
273
|
+
"iterationTokenPercentage": 85,
|
|
274
|
+
"phaseTokenPercentage": 85
|
|
275
|
+
}
|
|
276
|
+
},
|
|
277
|
+
"relationshipDiscovery": {
|
|
278
|
+
"enabled": true,
|
|
279
|
+
"triggers": {
|
|
280
|
+
"runOnMissingPKs": true,
|
|
281
|
+
"runOnInsufficientFKs": true,
|
|
282
|
+
"fkDeficitThreshold": 0.4
|
|
283
|
+
},
|
|
284
|
+
"tokenBudget": {
|
|
285
|
+
"ratioOfTotal": 0.4
|
|
286
|
+
},
|
|
287
|
+
"confidence": {
|
|
288
|
+
"primaryKeyMinimum": 0.7,
|
|
289
|
+
"foreignKeyMinimum": 0.6,
|
|
290
|
+
"llmValidationThreshold": 0.8
|
|
291
|
+
},
|
|
292
|
+
"sampling": {
|
|
293
|
+
"maxRowsPerTable": 1000,
|
|
294
|
+
"valueOverlapSampleSize": 100,
|
|
295
|
+
"statisticalSignificance": 100,
|
|
296
|
+
"compositeKeyMaxColumns": 3
|
|
297
|
+
},
|
|
298
|
+
"patterns": {
|
|
299
|
+
"primaryKeyNames": ["^id$", ".*_id$", "^pk_.*", ".*_key$"],
|
|
300
|
+
"foreignKeyNames": [".*_id$", ".*_fk$", "^fk_.*"]
|
|
301
|
+
},
|
|
302
|
+
"llmValidation": {
|
|
303
|
+
"enabled": true,
|
|
304
|
+
"batchSize": 10
|
|
305
|
+
},
|
|
306
|
+
"backpropagation": {
|
|
307
|
+
"enabled": true,
|
|
308
|
+
"maxIterations": 5
|
|
174
309
|
}
|
|
175
310
|
}
|
|
176
311
|
},
|
|
177
|
-
"
|
|
312
|
+
"output": {
|
|
313
|
+
"stateFile": "./db-doc-state.json",
|
|
314
|
+
"outputDir": "./output",
|
|
315
|
+
"sqlFile": "./output/add-descriptions.sql",
|
|
316
|
+
"markdownFile": "./output/database-documentation.md"
|
|
317
|
+
},
|
|
318
|
+
"schemas": {
|
|
319
|
+
"exclude": ["sys", "INFORMATION_SCHEMA"]
|
|
320
|
+
},
|
|
321
|
+
"tables": {
|
|
322
|
+
"exclude": ["sysdiagrams", "__MigrationHistory"]
|
|
323
|
+
}
|
|
178
324
|
}
|
|
179
325
|
```
|
|
180
326
|
|
|
181
|
-
|
|
327
|
+
### PostgreSQL Configuration
|
|
182
328
|
|
|
183
|
-
```
|
|
184
|
-
|
|
185
|
-
|
|
186
|
-
|
|
187
|
-
|
|
188
|
-
|
|
329
|
+
```json
|
|
330
|
+
{
|
|
331
|
+
"version": "1.0.0",
|
|
332
|
+
"database": {
|
|
333
|
+
"provider": "postgresql",
|
|
334
|
+
"host": "localhost",
|
|
335
|
+
"port": 5432,
|
|
336
|
+
"database": "mydatabase",
|
|
337
|
+
"user": "postgres",
|
|
338
|
+
"password": "YourPassword",
|
|
339
|
+
"ssl": false
|
|
340
|
+
},
|
|
341
|
+
"ai": {
|
|
342
|
+
"provider": "openai",
|
|
343
|
+
"model": "gpt-4-turbo-preview",
|
|
344
|
+
"apiKey": "sk-...",
|
|
345
|
+
"temperature": 0.1,
|
|
346
|
+
"maxTokens": 8000
|
|
347
|
+
},
|
|
348
|
+
"analysis": {
|
|
349
|
+
"cardinalityThreshold": 20,
|
|
350
|
+
"sampleSize": 10,
|
|
351
|
+
"includeStatistics": true,
|
|
352
|
+
"guardrails": {
|
|
353
|
+
"enabled": true,
|
|
354
|
+
"maxTokensPerRun": 250000
|
|
355
|
+
}
|
|
356
|
+
},
|
|
357
|
+
"output": {
|
|
358
|
+
"stateFile": "./db-doc-state.json",
|
|
359
|
+
"outputDir": "./output",
|
|
360
|
+
"sqlFile": "./output/add-descriptions.sql",
|
|
361
|
+
"markdownFile": "./output/database-documentation.md"
|
|
362
|
+
},
|
|
363
|
+
"schemas": {
|
|
364
|
+
"exclude": ["pg_catalog", "information_schema"]
|
|
365
|
+
}
|
|
366
|
+
}
|
|
367
|
+
```
|
|
368
|
+
|
|
369
|
+
### MySQL Configuration
|
|
370
|
+
|
|
371
|
+
```json
|
|
372
|
+
{
|
|
373
|
+
"version": "1.0.0",
|
|
374
|
+
"database": {
|
|
375
|
+
"provider": "mysql",
|
|
376
|
+
"host": "localhost",
|
|
377
|
+
"port": 3306,
|
|
378
|
+
"database": "mydatabase",
|
|
379
|
+
"user": "root",
|
|
380
|
+
"password": "YourPassword"
|
|
381
|
+
},
|
|
382
|
+
"ai": {
|
|
383
|
+
"provider": "openai",
|
|
384
|
+
"model": "gpt-4-turbo-preview",
|
|
385
|
+
"apiKey": "sk-...",
|
|
386
|
+
"temperature": 0.1,
|
|
387
|
+
"maxTokens": 8000
|
|
388
|
+
},
|
|
389
|
+
"analysis": {
|
|
390
|
+
"cardinalityThreshold": 20,
|
|
391
|
+
"sampleSize": 10,
|
|
392
|
+
"includeStatistics": true,
|
|
393
|
+
"guardrails": {
|
|
394
|
+
"enabled": true,
|
|
395
|
+
"maxTokensPerRun": 250000
|
|
396
|
+
}
|
|
397
|
+
},
|
|
398
|
+
"output": {
|
|
399
|
+
"stateFile": "./db-doc-state.json",
|
|
400
|
+
"outputDir": "./output",
|
|
401
|
+
"sqlFile": "./output/add-descriptions.sql",
|
|
402
|
+
"markdownFile": "./output/database-documentation.md"
|
|
403
|
+
},
|
|
404
|
+
"schemas": {
|
|
405
|
+
"exclude": ["mysql", "information_schema", "performance_schema", "sys"]
|
|
406
|
+
}
|
|
407
|
+
}
|
|
408
|
+
```
|
|
189
409
|
|
|
190
|
-
|
|
191
|
-
|
|
192
|
-
|
|
193
|
-
|
|
410
|
+
## Supported AI Providers
|
|
411
|
+
|
|
412
|
+
DBAutoDoc integrates with MemberJunction's AI provider system, supporting:
|
|
413
|
+
|
|
414
|
+
### OpenAI
|
|
415
|
+
```json
|
|
416
|
+
{
|
|
417
|
+
"provider": "OpenAILLM",
|
|
418
|
+
"model": "gpt-4-turbo-preview",
|
|
419
|
+
"apiKey": "sk-..."
|
|
420
|
+
}
|
|
421
|
+
```
|
|
194
422
|
|
|
195
|
-
|
|
196
|
-
|
|
423
|
+
### Anthropic
|
|
424
|
+
```json
|
|
425
|
+
{
|
|
426
|
+
"provider": "AnthropicLLM",
|
|
427
|
+
"model": "claude-3-5-sonnet-20241022",
|
|
428
|
+
"apiKey": "sk-ant-..."
|
|
429
|
+
}
|
|
197
430
|
```
|
|
198
431
|
|
|
199
|
-
|
|
432
|
+
### Google
|
|
433
|
+
```json
|
|
434
|
+
{
|
|
435
|
+
"provider": "GoogleLLM",
|
|
436
|
+
"model": "gemini-1.5-pro",
|
|
437
|
+
"apiKey": "..."
|
|
438
|
+
}
|
|
439
|
+
```
|
|
440
|
+
|
|
441
|
+
### Groq
|
|
442
|
+
```json
|
|
443
|
+
{
|
|
444
|
+
"provider": "GroqLLM",
|
|
445
|
+
"model": "llama-3.3-70b-versatile",
|
|
446
|
+
"apiKey": "gsk_..."
|
|
447
|
+
}
|
|
448
|
+
```
|
|
449
|
+
|
|
450
|
+
### Other Providers
|
|
451
|
+
Any BaseLLM-compatible provider registered with MemberJunction can be used.
|
|
452
|
+
|
|
453
|
+
## State File
|
|
454
|
+
|
|
455
|
+
The `db-doc-state.json` file tracks:
|
|
456
|
+
- All schemas, tables, and columns
|
|
457
|
+
- **Description iterations** with reasoning and confidence
|
|
458
|
+
- **Analysis runs** with metrics (tokens, cost, duration)
|
|
459
|
+
- **Processing logs** for debugging
|
|
460
|
+
- **Relationship discovery results** (primary keys, foreign keys)
|
|
461
|
+
- **Guardrail metrics** (phase and iteration budgets)
|
|
462
|
+
|
|
463
|
+
### Iteration Tracking
|
|
464
|
+
|
|
465
|
+
Each description has an iteration history:
|
|
466
|
+
|
|
467
|
+
```json
|
|
468
|
+
{
|
|
469
|
+
"descriptionIterations": [
|
|
470
|
+
{
|
|
471
|
+
"description": "Initial hypothesis...",
|
|
472
|
+
"reasoning": "Based on column names...",
|
|
473
|
+
"generatedAt": "2024-01-15T10:00:00Z",
|
|
474
|
+
"modelUsed": "gpt-4",
|
|
475
|
+
"confidence": 0.75,
|
|
476
|
+
"triggeredBy": "initial"
|
|
477
|
+
},
|
|
478
|
+
{
|
|
479
|
+
"description": "Revised hypothesis...",
|
|
480
|
+
"reasoning": "Child table analysis revealed...",
|
|
481
|
+
"generatedAt": "2024-01-15T10:05:00Z",
|
|
482
|
+
"modelUsed": "gpt-4",
|
|
483
|
+
"confidence": 0.92,
|
|
484
|
+
"triggeredBy": "backpropagation",
|
|
485
|
+
"changedFrom": "Initial hypothesis..."
|
|
486
|
+
}
|
|
487
|
+
]
|
|
488
|
+
}
|
|
489
|
+
```
|
|
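The iteration history can be read back with a few lines of TypeScript. The sketch below assumes the entry shape shown in the JSON example; the `DescriptionIteration` interface and the two helper functions are illustrative names, not types exported by the package.

```typescript
// Entry shape inferred from the JSON example above (an assumption, not an exported type).
interface DescriptionIteration {
  description: string;
  reasoning: string;
  generatedAt: string; // ISO 8601 timestamp
  modelUsed: string;
  confidence: number;  // 0..1
  triggeredBy: string; // e.g. "initial" or "backpropagation"
  changedFrom?: string;
}

// Hypothetical helper: sort a copy of the history by timestamp, oldest first.
function sortByTime(history: DescriptionIteration[]): DescriptionIteration[] {
  return [...history].sort(
    (a, b) => Date.parse(a.generatedAt) - Date.parse(b.generatedAt)
  );
}

// Hypothetical helper: the most recent iteration, or undefined for an empty history.
function latestIteration(history: DescriptionIteration[]): DescriptionIteration | undefined {
  const sorted = sortByTime(history);
  return sorted[sorted.length - 1];
}

// Hypothetical helper: how much confidence changed between the first and last iteration.
function confidenceGain(history: DescriptionIteration[]): number {
  const sorted = sortByTime(history);
  if (sorted.length === 0) return 0;
  return sorted[sorted.length - 1].confidence - sorted[0].confidence;
}

const history: DescriptionIteration[] = [
  {
    description: "Initial hypothesis...",
    reasoning: "Based on column names...",
    generatedAt: "2024-01-15T10:00:00Z",
    modelUsed: "gpt-4",
    confidence: 0.75,
    triggeredBy: "initial"
  },
  {
    description: "Revised hypothesis...",
    reasoning: "Child table analysis revealed...",
    generatedAt: "2024-01-15T10:05:00Z",
    modelUsed: "gpt-4",
    confidence: 0.92,
    triggeredBy: "backpropagation",
    changedFrom: "Initial hypothesis..."
  }
];

const latest = latestIteration(history);
const gain = confidenceGain(history);
```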
200
490
|
|
|
201
|
-
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
|
|
206
|
-
|
|
491
|
+
## Programmatic Usage
|
|
492
|
+
|
|
493
|
+
DBAutoDoc can be used as a library with a comprehensive programmatic API:
|
|
494
|
+
|
|
495
|
+
### Simple API (Recommended)
|
|
496
|
+
|
|
497
|
+
```typescript
|
|
498
|
+
import { DBAutoDocAPI } from '@memberjunction/db-auto-doc';
|
|
499
|
+
|
|
500
|
+
const api = new DBAutoDocAPI();
|
|
501
|
+
|
|
502
|
+
// Analyze database
|
|
503
|
+
const result = await api.analyze({
|
|
504
|
+
database: {
|
|
505
|
+
provider: 'sqlserver',
|
|
506
|
+
host: 'localhost',
|
|
507
|
+
database: 'MyDB',
|
|
508
|
+
user: 'sa',
|
|
509
|
+
password: 'password'
|
|
510
|
+
},
|
|
511
|
+
ai: {
|
|
512
|
+
provider: 'OpenAILLM',
|
|
513
|
+
model: 'gpt-4-turbo-preview',
|
|
514
|
+
apiKey: 'sk-...'
|
|
515
|
+
},
|
|
516
|
+
analysis: {
|
|
517
|
+
convergence: { maxIterations: 10 },
|
|
518
|
+
guardrails: { maxTokensPerRun: 100000 }
|
|
519
|
+
},
|
|
520
|
+
output: {
|
|
521
|
+
outputDir: './output'
|
|
522
|
+
},
|
|
523
|
+
onProgress: (message, data) => {
|
|
524
|
+
console.log(message, data);
|
|
525
|
+
}
|
|
526
|
+
});
|
|
527
|
+
|
|
528
|
+
// Resume from state file
|
|
529
|
+
const resumed = await api.resume('./db-doc-state.json', {
|
|
530
|
+
analysis: {
|
|
531
|
+
convergence: { maxIterations: 20 }
|
|
532
|
+
}
|
|
533
|
+
});
|
|
534
|
+
|
|
535
|
+
// Export documentation
|
|
536
|
+
const exported = await api.export('./db-doc-state.json', {
|
|
537
|
+
formats: ['sql', 'markdown', 'html', 'csv', 'mermaid'],
|
|
538
|
+
outputDir: './docs',
|
|
539
|
+
applyToDatabase: true
|
|
540
|
+
});
|
|
541
|
+
|
|
542
|
+
// Get analysis status
|
|
543
|
+
const status = await api.getStatus('./db-doc-state.json');
|
|
544
|
+
console.log('Progress:', status.progress);
|
|
545
|
+
console.log('Tokens used:', status.metrics.totalTokens);
|
|
546
|
+
console.log('Estimated cost:', status.metrics.estimatedCost);
|
|
547
|
+
```
|
|
548
|
+
|
|
549
|
+
### Advanced API (Full Control)
|
|
550
|
+
|
|
551
|
+
```typescript
|
|
552
|
+
import {
|
|
553
|
+
ConfigLoader,
|
|
554
|
+
DatabaseConnection,
|
|
555
|
+
Introspector,
|
|
556
|
+
TopologicalSorter,
|
|
557
|
+
StateManager,
|
|
558
|
+
PromptEngine,
|
|
559
|
+
AnalysisEngine,
|
|
560
|
+
GuardrailsManager,
IterationTracker, // used below; assumed to be exported by the package
|
|
561
|
+
SQLGenerator,
|
|
562
|
+
MarkdownGenerator,
|
|
563
|
+
HTMLGenerator,
|
|
564
|
+
CSVGenerator,
|
|
565
|
+
MermaidGenerator
|
|
566
|
+
} from '@memberjunction/db-auto-doc';
|
|
567
|
+
|
|
568
|
+
// Load config
|
|
569
|
+
const config = await ConfigLoader.load('./config.json');
|
|
570
|
+
|
|
571
|
+
// Connect to database
|
|
572
|
+
const db = new DatabaseConnection(config.database);
|
|
573
|
+
await db.connect();
|
|
574
|
+
|
|
575
|
+
// Introspect
|
|
576
|
+
const driver = db.getDriver();
|
|
577
|
+
const introspector = new Introspector(driver);
|
|
578
|
+
const schemas = await introspector.getSchemas(config.schemas, config.tables);
|
|
579
|
+
|
|
580
|
+
// Initialize analysis components
|
|
581
|
+
const promptEngine = new PromptEngine(config.ai, './prompts');
|
|
582
|
+
await promptEngine.initialize();
|
|
583
|
+
|
|
584
|
+
const stateManager = new StateManager(config.output.stateFile);
|
|
585
|
+
const state = stateManager.createInitialState(config.database.database, config.database.server);
|
|
586
|
+
state.schemas = schemas;
|
|
587
|
+
|
|
588
|
+
const guardrails = new GuardrailsManager(config.analysis.guardrails);
|
|
589
|
+
const iterationTracker = new IterationTracker();
|
|
590
|
+
|
|
591
|
+
// Run analysis
|
|
592
|
+
const analysisEngine = new AnalysisEngine(config, promptEngine, stateManager, iterationTracker);
|
|
593
|
+
// ... custom analysis workflow
|
|
594
|
+
|
|
595
|
+
// Generate outputs
|
|
596
|
+
const sqlGen = new SQLGenerator();
|
|
597
|
+
const sql = sqlGen.generate(state, { approvedOnly: false });
|
|
598
|
+
|
|
599
|
+
const mdGen = new MarkdownGenerator();
|
|
600
|
+
const markdown = mdGen.generate(state);
|
|
601
|
+
|
|
602
|
+
const htmlGen = new HTMLGenerator();
|
|
603
|
+
const html = htmlGen.generate(state, { confidenceThreshold: 0.7 });
|
|
604
|
+
|
|
605
|
+
const csvGen = new CSVGenerator();
|
|
606
|
+
const { tables, columns } = csvGen.generate(state);
|
|
607
|
+
|
|
608
|
+
const mermaidGen = new MermaidGenerator();
|
|
609
|
+
const erdDiagram = mermaidGen.generate(state);
|
|
610
|
+
const erdHtml = mermaidGen.generateHtml(state);
|
|
611
|
+
```
|
|
612
|
+
|
|
613
|
+
## Cost Estimation
|
|
614
|
+
|
|
615
|
+
Typical costs (will vary by database size and complexity):
|
|
616
|
+
|
|
617
|
+
| Database Size | Tables | Iterations | Tokens | Cost (GPT-4) | Cost (Groq) |
|
|
618
|
+
|---------------|--------|------------|--------|--------------|-------------|
|
|
619
|
+
| Small | 10-20 | 2-3 | ~50K | $0.50 | $0.02 |
|
|
620
|
+
| Medium | 50-100 | 3-5 | ~200K | $2.00 | $0.08 |
|
|
621
|
+
| Large | 200+ | 5-8 | ~500K | $5.00 | $0.20 |
|
|
622
|
+
| Enterprise | 500+ | 8-15 | ~1.5M | $15.00 | $0.60 |
|
|
623
|
+
|
|
624
|
+
**With Relationship Discovery**: Add 25-40% to token/cost estimates for databases with missing constraints.
|
|
625
|
+
|
|
626
|
+
**Guardrails** help control costs by setting hard limits on token usage and runtime.
|
|
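The table implies a flat blended rate of about $10 per million tokens for the GPT-4 column and $0.40 per million for the Groq column. The sketch below back-calculates from those figures; the rates, the `estimateCost` function, and its parameters are assumptions made for illustration, and real provider pricing distinguishes input from output tokens.

```typescript
// Blended USD rates per million tokens, back-calculated from the table above.
// Assumption: a flat rate; actual provider pricing will differ.
const RATE_PER_MILLION_USD = { gpt4: 10, groq: 0.4 } as const;

type Model = keyof typeof RATE_PER_MILLION_USD;

interface CostRange {
  low: number;
  high: number;
}

// Hypothetical estimator: applies the 25-40% relationship-discovery overhead when requested.
function estimateCost(tokens: number, model: Model, relationshipDiscovery = false): CostRange {
  const base = (tokens * RATE_PER_MILLION_USD[model]) / 1_000_000;
  if (!relationshipDiscovery) {
    return { low: base, high: base };
  }
  return { low: base * 1.25, high: base * 1.4 };
}

// "Medium" row of the table: ~200K tokens on GPT-4.
const medium = estimateCost(200_000, "gpt4");
// Same run with relationship discovery enabled: roughly $2.50 to $2.80.
const mediumWithDiscovery = estimateCost(200_000, "gpt4", true);
```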
627
|
+
|
|
628
|
+
## Best Practices
|
|
629
|
+
|
|
630
|
+
1. **Start with guardrails** - Set reasonable token/cost limits to avoid surprises
|
|
631
|
+
2. **Add seed context** - Helps AI understand database purpose and domain
|
|
632
|
+
3. **Review low-confidence items** - Focus manual effort where AI is uncertain
|
|
633
|
+
4. **Use backpropagation** - Improves accuracy significantly
|
|
634
|
+
5. **Enable relationship discovery** - For legacy databases missing constraints
|
|
635
|
+
6. **Filter exports** - Use `--confidence-threshold` to only apply high-confidence descriptions
|
|
636
|
+
7. **Iterate** - Run the analysis again if the first pass isn't satisfactory
|
|
637
|
+
8. **Resume from checkpoints** - Save costs by continuing previous runs
|
|
638
|
+
9. **Use appropriate models** - Balance cost vs. quality (GPT-4 vs. Groq)
|
|
639
|
+
10. **Export multiple formats** - HTML for browsing, CSV for analysis, SQL for database
|
|
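Practices 3 and 6 both come down to partitioning generated descriptions by confidence. A minimal sketch, assuming each documented item carries a `confidence` score between 0 and 1; the `DocumentedItem` shape, the `splitByConfidence` helper, and the table names are hypothetical:

```typescript
// Hypothetical item shape; the real state file nests these under schemas and tables.
interface DocumentedItem {
  name: string;
  description: string;
  confidence: number; // 0..1
}

interface ReviewSplit {
  apply: DocumentedItem[];  // at or above the threshold: safe to export
  review: DocumentedItem[]; // below the threshold: needs a human look
}

// Partition items so manual effort goes where the model was least certain.
function splitByConfidence(items: DocumentedItem[], threshold: number): ReviewSplit {
  return {
    apply: items.filter((item) => item.confidence >= threshold),
    review: items.filter((item) => item.confidence < threshold)
  };
}

const items: DocumentedItem[] = [
  { name: "dbo.Customers", description: "Customer master records", confidence: 0.92 },
  { name: "dbo.Orders", description: "Sales orders", confidence: 0.7 },
  { name: "dbo.Tmp01", description: "Possibly a staging table", confidence: 0.41 }
];

const split = splitByConfidence(items, 0.7);
const namesToReview = split.review.map((item) => item.name);
```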
640
|
+
|
|
641
|
+
## Troubleshooting
|
|
642
|
+
|
|
643
|
+
### "Connection failed"
|
|
644
|
+
- Check the server, database, user, and password values in the config
|
|
645
|
+
- Verify database server is running and accessible
|
|
646
|
+
- Check firewall rules and network connectivity
|
|
647
|
+
- For PostgreSQL: verify SSL settings
|
|
648
|
+
- For MySQL: check port and authentication method
|
|
649
|
+
|
|
650
|
+
### "Analysis not converging"
|
|
651
|
+
- Increase `maxIterations` in config
|
|
652
|
+
- Lower `confidenceThreshold`
|
|
653
|
+
- Add more seed context
|
|
654
|
+
- Check warnings in state file for specific issues
|
|
655
|
+
- Review guardrail limits (may be hitting token budget)
|
|
656
|
+
|
|
657
|
+
### "High token usage"
|
|
658
|
+
- Enable guardrails with appropriate limits
|
|
659
|
+
- Reduce `maxTokens` per prompt
|
|
660
|
+
- Filter schemas/tables to focus on a subset
|
|
661
|
+
- Use a cheaper model (Groq instead of GPT-4)
|
|
662
|
+
- Disable relationship discovery if not needed
|
|
663
|
+
|
|
664
|
+
### "Guardrail limits exceeded"
|
|
665
|
+
- Review metrics in state file
|
|
666
|
+
- Adjust limits upward if budget allows
|
|
667
|
+
- Use `--resume` to continue from the last checkpoint
|
|
668
|
+
- Focus on specific schemas/tables
|
|
669
|
+
- Reduce iteration count
|
|
670
|
+
|
|
671
|
+
### "Relationship discovery not finding keys"
|
|
672
|
+
- Check confidence thresholds (may be too high)
|
|
673
|
+
- Review statistical significance settings
|
|
674
|
+
- Enable LLM validation for better accuracy
|
|
675
|
+
- Check naming patterns configuration
|
|
676
|
+
- Verify sample size is adequate
|
|
677
|
+
|
|
678
|
+
## Documentation
|
|
679
|
+
|
|
680
|
+
Comprehensive documentation is available in the `docs/` folder:
|
|
681
|
+
|
|
682
|
+
- **[USER_GUIDE.md](./docs/USER_GUIDE.md)** - Complete user documentation
|
|
683
|
+
- **[ARCHITECTURE.md](./docs/ARCHITECTURE.md)** - Technical architecture and design
|
|
684
|
+
- **[API_USAGE.md](./docs/API_USAGE.md)** - Programmatic API examples
|
|
685
|
+
- **[GUARDRAILS.md](./docs/GUARDRAILS.md)** - Guardrails system documentation
|
|
686
|
+
- **[CHANGES.md](./docs/CHANGES.md)** - Recent changes and enhancements
|
|
207
687
|
|
|
208
688
|
## Architecture
|
|
209
689
|
|
|
210
|
-
|
|
211
|
-
- **oclif-based commands** in `src/commands/` (init, analyze, review, export, reset)
|
|
212
|
-
- **Standalone package** with zero MJ runtime dependencies
|
|
213
|
-
- **MJCLI integration** via thin delegation commands in `packages/MJCLI/src/commands/dbdoc/`
|
|
214
|
-
- **Reusable services** exported for programmatic use
|
|
215
|
-
- **State file architecture** enables incremental refinement across runs
|
|
690
|
+
DBAutoDoc uses a four-phase architecture:
|
|
216
691
|
|
|
217
|
-
|
|
692
|
+
1. **Discovery Phase** - Introspection and optional relationship discovery
|
|
693
|
+
2. **Analysis Phase** - Iterative LLM-based description generation
|
|
694
|
+
3. **Sanity Check Phase** - Validation and quality assurance
|
|
695
|
+
4. **Export Phase** - Multi-format documentation generation
|
|
218
696
|
|
|
219
|
-
|
|
220
|
-
-
|
|
221
|
-
-
|
|
222
|
-
-
|
|
223
|
-
-
|
|
224
|
-
- CI/CD integration examples
|
|
225
|
-
- Schema diff detection for automatic re-documentation
|
|
697
|
+
See [ARCHITECTURE.md](./docs/ARCHITECTURE.md) for comprehensive architecture documentation, including:
|
|
698
|
+
- Phase flow diagrams
|
|
699
|
+
- Extension points for customization
|
|
700
|
+
- Database driver development guide
|
|
701
|
+
- LLM intelligence strategy
|
|
226
702
|
|
|
227
|
-
##
|
|
703
|
+
## Contributing
|
|
228
704
|
|
|
229
|
-
|
|
230
|
-
- SQL Server database access
|
|
231
|
-
- OpenAI or Anthropic API key
|
|
705
|
+
DBAutoDoc is part of the MemberJunction project. Contributions are welcome!
|
|
232
706
|
|
|
233
707
|
## License
|
|
234
708
|
|
|
235
|
-
MIT
|
|
709
|
+
MIT
|
|
710
|
+
|
|
711
|
+
## Demo Databases
|
|
712
|
+
|
|
713
|
+
### LousyDB - Legacy Database Demo
|
|
714
|
+
|
|
715
|
+
Located in `/Demos/LousyDB/`, this demo showcases **Relationship Discovery** capabilities on a realistic legacy database:
|
|
716
|
+
|
|
717
|
+
- ❌ **Zero metadata** - No PK or FK constraints defined
|
|
718
|
+
- 🔤 **Cryptic naming** - Short abbreviations (`cst`, `ord`, `pmt`)
|
|
719
|
+
- 🔡 **Single-char codes** - Undocumented status values (`A`, `T`, `P`)
|
|
720
|
+
- 💔 **Data quality issues** - Orphaned records, NULLs, duplicates
|
|
721
|
+
- 📊 **20 tables** across 2 schemas with 1000+ rows
|
|
236
722
|
|
|
237
|
-
|
|
723
|
+
It is well suited to testing DBAutoDoc's ability to **reverse-engineer** poorly documented databases.
|
|
238
724
|
|
|
239
|
-
|
|
240
|
-
- Documentation: https://docs.memberjunction.org
|
|
725
|
+
See `/Demos/LousyDB/README.md` for details and testing instructions.
|
|
241
726
|
|
|
242
|
-
##
|
|
727
|
+
## Links
|
|
243
728
|
|
|
244
|
-
|
|
729
|
+
- **GitHub**: https://github.com/MemberJunction/MJ
|
|
730
|
+
- **Documentation**: https://docs.memberjunction.org
|
|
731
|
+
- **Support**: https://github.com/MemberJunction/MJ/issues
|