codegraph-gen 0.2.0__tar.gz
- codegraph_gen-0.2.0/PKG-INFO +169 -0
- codegraph_gen-0.2.0/README.md +134 -0
- codegraph_gen-0.2.0/pyproject.toml +70 -0
- codegraph_gen-0.2.0/src/codegraph_gen/__init__.py +0 -0
- codegraph_gen-0.2.0/src/codegraph_gen/__main__.py +311 -0
- codegraph_gen-0.2.0/src/codegraph_gen/ai.py +77 -0
- codegraph_gen-0.2.0/src/codegraph_gen/analyzer.py +100 -0
- codegraph_gen-0.2.0/src/codegraph_gen/builder.py +747 -0
- codegraph_gen-0.2.0/src/codegraph_gen/cluster.py +116 -0
- codegraph_gen-0.2.0/src/codegraph_gen/config.py +76 -0
- codegraph_gen-0.2.0/src/codegraph_gen/detect.py +59 -0
- codegraph_gen-0.2.0/src/codegraph_gen/engine.py +367 -0
- codegraph_gen-0.2.0/src/codegraph_gen/parser/__init__.py +27 -0
- codegraph_gen-0.2.0/src/codegraph_gen/parser/base.py +38 -0
- codegraph_gen-0.2.0/src/codegraph_gen/parser/cpp.py +349 -0
- codegraph_gen-0.2.0/src/codegraph_gen/parser/go.py +268 -0
- codegraph_gen-0.2.0/src/codegraph_gen/parser/javascript.py +370 -0
- codegraph_gen-0.2.0/src/codegraph_gen/parser/kotlin.py +387 -0
- codegraph_gen-0.2.0/src/codegraph_gen/parser/python.py +415 -0
- codegraph_gen-0.2.0/src/codegraph_gen/parser/rust.py +497 -0
- codegraph_gen-0.2.0/src/codegraph_gen/parser/swift.py +327 -0
- codegraph_gen-0.2.0/src/codegraph_gen/py.typed +0 -0
- codegraph_gen-0.2.0/src/codegraph_gen/renderer.py +498 -0
- codegraph_gen-0.2.0/src/codegraph_gen/writer.py +97 -0
codegraph_gen-0.2.0/PKG-INFO
@@ -0,0 +1,169 @@
+Metadata-Version: 2.3
+Name: codegraph-gen
+Version: 0.2.0
+Summary: AST-based codebase knowledge graph generator in Markdown
+Keywords: knowledge-graph,ast,codebase,markdown,tree-sitter,visualization,static-analysis,ai-agent,obsidian
+Author: twn39
+Author-email: twn39 <twn39@163.com>
+License: MIT
+Classifier: Development Status :: 4 - Beta
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Topic :: Software Development :: Code Generators
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Requires-Dist: networkx>=3.0
+Requires-Dist: tree-sitter>=0.23.0
+Requires-Dist: tree-sitter-python
+Requires-Dist: tree-sitter-javascript
+Requires-Dist: tree-sitter-typescript
+Requires-Dist: tree-sitter-go
+Requires-Dist: tree-sitter-rust
+Requires-Dist: tree-sitter-swift
+Requires-Dist: click>=8.0.0
+Requires-Dist: rich>=13.0.0
+Requires-Dist: pydantic>=2.0.0
+Requires-Dist: tree-sitter-c>=0.24.2
+Requires-Dist: tree-sitter-cpp>=0.23.4
+Requires-Dist: tree-sitter-kotlin>=1.1.0
+Requires-Python: >=3.12
+Project-URL: Homepage, https://github.com/twn39/codegraph
+Project-URL: Repository, https://github.com/twn39/codegraph
+Project-URL: Issues, https://github.com/twn39/codegraph/issues
+Description-Content-Type: text/markdown
+
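+# codegraph
+
+`codegraph` is a static code knowledge graph generator for AI agents (such as Antigravity, Codex, and Claude Code). It statically parses multi-language codebases, clusters components automatically with a community detection algorithm, and exports the result as a linked vault of standard Markdown files (an Obsidian-like vault), helping AI agents understand architecture, navigate logic, and perform in-depth analysis locally.
+
+Unlike knowledge graphs rendered on a graphical canvas, `codegraph` stores everything in a flat, all-Markdown structure. It is designed specifically for LLMs and avoids costly, complex database dependencies, so an AI agent can traverse the whole codebase through ordinary file reads and relative links.
+
+---
+
+## 🚀 Core Features
+
+- **Multi-language AST parsing**: built on `tree-sitter`, with native support for **Python, JavaScript, TypeScript, Go, Rust, Swift**.
+- **Semantic edge resolution and binding**: statically resolves cross-file function/method calls (`calls`), type inheritance and interface implementation (`inherits`/`implements`), and file import relationships (`imports`).
+- **Automatic logical component clustering**: uses a greedy modularity community detection algorithm (Louvain Modularity Clustering) to cluster tightly coupled files and symbols into **Components (logical components)**, naming each one after its core nodes.
+- **Architectural fragility analysis**: automatically identifies **God Nodes (the core abstractions with the highest degree)** and statically detects file-level **circular imports**.
+- **Agent-friendly interaction protocol**: generates the offline agent prompt file `AGENT_PROMPT.md` and the rules file `AGENTS.md`, enabling agent-driven architectural insight with zero API cost.
+
+---
+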
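The fragility checks in the feature list can be illustrated with a small standard-library sketch. It assumes the graph is available as `(source, target, kind)` edge tuples; that edge shape, the function names, and the `top_n` parameter are hypothetical and are not taken from codegraph's `analyzer.py`, which this part of the diff does not show. The cycle search reports one cycle per back edge found by a depth-first search, which is enough to flag circular imports but does not enumerate every simple cycle.

```python
from collections import Counter

Edge = tuple[str, str, str]  # (source, target, kind); hypothetical edge shape


def god_nodes(edges: list[Edge], top_n: int = 3) -> list[tuple[str, int]]:
    """Rank nodes by total degree (incoming plus outgoing edges)."""
    degree: Counter[str] = Counter()
    for src, dst, _kind in edges:
        degree[src] += 1
        degree[dst] += 1
    return degree.most_common(top_n)


def circular_imports(edges: list[Edge]) -> list[list[str]]:
    """Report one cycle per back edge found while walking `imports` edges."""
    graph: dict[str, list[str]] = {}
    for src, dst, kind in edges:
        if kind == "imports":
            graph.setdefault(src, []).append(dst)
            graph.setdefault(dst, [])

    cycles: list[list[str]] = []
    state: dict[str, str] = {}  # node -> "visiting" or "done"
    stack: list[str] = []

    def visit(node: str) -> None:
        state[node] = "visiting"
        stack.append(node)
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                # Back edge: the cycle is the stack slice from nxt to here.
                cycles.append(stack[stack.index(nxt):] + [nxt])
            elif nxt not in state:
                visit(nxt)
        stack.pop()
        state[node] = "done"

    for node in graph:
        if node not in state:
            visit(node)
    return cycles


edges: list[Edge] = [
    ("a.py", "b.py", "imports"),
    ("b.py", "c.py", "imports"),
    ("c.py", "a.py", "imports"),
    ("a.py", "util.py", "imports"),
    ("b.py", "util.py", "imports"),
    ("c.py", "util.py", "imports"),
    ("main.py", "a.py", "imports"),
]
print(god_nodes(edges, top_n=1))  # [('a.py', 4)]
print(circular_imports(edges))    # [['a.py', 'b.py', 'c.py', 'a.py']]
```

Ranking by raw degree treats every edge kind equally; a weighted variant could count `calls` and `imports` differently.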
+## 📦 Architecture Overview
+
+```mermaid
+graph TD
+    A[Workspace source code] --> B[detect: language detection and filtering]
+    B --> C[parser: Tree-Sitter AST symbol extraction]
+    C --> D[builder: NetworkX semantic graph assembly and binding]
+    D --> E[cluster: modularity-based community clustering and naming]
+    E --> F[analyze: god node and circular import analysis]
+    F --> G[export: write to .codegraph/]
+    G --> H[AGENT_PROMPT.md / AGENTS.md / README.md / nodes / components]
+```
+
+---
+
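The flow above can be read as a linear pipeline in which each stage consumes the previous stage's output. Below is a minimal sketch of that shape using only the standard library. The stage names follow the diagram, but the `Stage` type, the shared `state` dictionary, and the toy stage bodies are hypothetical and are not taken from codegraph's `engine.py`. The `builder`, `cluster`, and `analyze` stages are left out to keep the example short.

```python
from collections.abc import Callable

# A stage takes the shared state and returns the updated state (hypothetical shape).
Stage = Callable[[dict], dict]


def detect(state: dict) -> dict:
    # Keep only files whose extension maps to a known language (toy table).
    known = {".py": "python", ".go": "go", ".rs": "rust"}
    state["sources"] = {
        path: known[ext]
        for path in state["workspace"]
        for ext in known
        if path.endswith(ext)
    }
    return state


def parse(state: dict) -> dict:
    # Stand-in for AST extraction: record one "symbol" per source file.
    state["symbols"] = [f"{path}::module" for path in state["sources"]]
    return state


def export(state: dict) -> dict:
    # Stand-in for the Markdown writer: list the files it would produce.
    state["written"] = ["README.md", "AGENT_PROMPT.md", "AGENTS.md"]
    return state


def run_pipeline(workspace: list[str], stages: list[tuple[str, Stage]]) -> dict:
    """Run the stages in order, recording which ones completed."""
    state: dict = {"workspace": workspace, "completed": []}
    for name, stage in stages:
        state = stage(state)
        state["completed"].append(name)
    return state


result = run_pipeline(
    ["src/app.py", "docs/guide.md", "cmd/main.go"],
    [("detect", detect), ("parser", parse), ("export", export)],
)
print(result["completed"])  # ['detect', 'parser', 'export']
print(result["sources"])    # {'src/app.py': 'python', 'cmd/main.go': 'go'}
```

Keeping each stage a plain function makes it possible to test stages in isolation and to swap one out without touching the others.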
+## 🛠️ Installation Guide
+
+We recommend [uv](https://github.com/astral-sh/uv) for managing project dependencies and the virtual environment:
+
+```bash
+# Clone the repository
+git clone <repository-url>
+cd codegraph
+
+# Sync dependencies and activate the virtual environment
+uv sync
+source .venv/bin/activate
+
+# Install globally (recommended)
+uv tool install --force --no-cache .
+```
+
+### 2. Install the AI Agent Slash Command Integration
+
+`codegraph` can register the custom `/codegraph` slash command in the global configuration of your AI agent (such as Codex or Antigravity) with a single command:
+
+```bash
+# Install the global /codegraph slash command for Codex / Antigravity
+codegraph install --platform codex
+```
+
+Once registered, typing `/codegraph` in the corresponding agent terminal runs the whole graph extraction, analysis, and write-back workflow automatically.
+
+---
+
+## 📖 Usage
+
+### 1. Build the Code Graph
+
+Run `codegraph build` in the project root. By default it scans the current folder and writes the graph to the `.codegraph/` directory:
+
+```bash
+# Scan the current directory and generate the graph
+codegraph build .
+
+# Specify a custom output directory
+codegraph build . --output my_vault/
+
+# Exclude specific folders
+codegraph build . --exclude extra_folder/ --exclude docs/
+```
+
+### 2. Exported Vault Directory Structure
+
+The generated `.codegraph/` directory is a self-contained Markdown knowledge graph database with the following structure:
+
+```
+.codegraph/
+├── README.md          # Main graph index: statistics, Mermaid component dependency graph, god nodes, circular dependencies, and AI architectural insights
+├── AGENT_PROMPT.md    # Architecture analysis prompt template for external AI agents to read
+├── AGENTS.md          # Collaboration rules and navigation guide for external AI agents
+├── components/        # Details of the clustered logical components (e.g. Component_3_BaseParser_.md)
+└── nodes/             # Details of every physical file and symbol (definition signatures, bidirectional call chains, and Mermaid topology)
+```
+
+---
+
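A script or agent that consumes the vault may want to confirm the layout before reading it. The sketch below checks for the five top-level entries listed in the tree above. The helper name `missing_vault_entries` is hypothetical, and codegraph itself is not known to ship such a check.

```python
import tempfile
from pathlib import Path

# Top-level entries named in the vault layout above.
EXPECTED_FILES = ("README.md", "AGENT_PROMPT.md", "AGENTS.md")
EXPECTED_DIRS = ("components", "nodes")


def missing_vault_entries(vault: Path) -> list[str]:
    """Return the expected files and directories that are absent from `vault`."""
    missing = [name for name in EXPECTED_FILES if not (vault / name).is_file()]
    missing += [f"{name}/" for name in EXPECTED_DIRS if not (vault / name).is_dir()]
    return missing


# Demonstration on a throwaway, partially populated vault.
with tempfile.TemporaryDirectory() as tmp:
    vault = Path(tmp) / ".codegraph"
    (vault / "components").mkdir(parents=True)
    (vault / "README.md").write_text("# index\n", encoding="utf-8")
    print(missing_vault_entries(vault))
    # ['AGENT_PROMPT.md', 'AGENTS.md', 'nodes/']
```

An empty list means the vault has the expected shape, although it says nothing about whether the contents are current.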
+## 🤖 Collaborative Analysis with AI Agents (Codex / Antigravity / Claude Code)
+
+The core design idea of `codegraph` is **build offline, analyze with the agent**. This avoids hard-coding LLM API calls into the CLI, lowers usage cost, and makes full use of the more capable, context-aware external agent in your current conversation.
+
+### Step 1: Generate the Local Graph
+
+Run the following command to generate the static graph foundation for your codebase:
+```bash
+codegraph build .
+```
+
+### Step 2: Trigger the Analysis in the Agent with One Instruction
+
+Open your AI agent (such as Codex, Antigravity, or a Claude Code terminal) and send it the following instruction:
+
+> 💡 **Analysis instruction**:
+> `Read and follow the prompt requirements in .codegraph/AGENT_PROMPT.md, perform an in-depth architectural analysis of this project, then write the analysis report, in Chinese, into the “AI 架构深度洞察” (AI Architectural Insights) section of .codegraph/README.md.`
+
+The agent then uses its file read/write capabilities to:
+1. Read `.codegraph/AGENT_PROMPT.md` to obtain the metadata and Mermaid relationships.
+2. Reason over them and generate a professional architecture report in Chinese under the `## AI 架构深度洞察` section of `.codegraph/README.md`.
+
+### Step 3: Everyday Q&A and Logic Navigation
+
+An AI agent can use the graph to navigate a large codebase effectively. When you have a question about code structure, tell it to use the graph:
+- *“Which component depends on BaseParser?”* -> the agent reads the component files under `components/`.
+- *“Where are the entry points that call this function?”* -> the agent checks `Incoming Calls` in the corresponding node file under `nodes/`.
+
+---
+
+## 📜 Rule Setup (Automatic Awareness)
+
+To have an AI agent use `.codegraph` automatically when it enters the repository, create a `CLAUDE.md` in the root directory, or reference the rules from `.codegraph/AGENTS.md` in `README.md`.
+For example, write the following in `CLAUDE.md`:
+```markdown
+- Before answering architecture or codebase questions, read .codegraph/README.md for god nodes and community structure.
+- Navigate .codegraph/components/ and .codegraph/nodes/ instead of reading raw code files directly.
+- After modifying code files, run `codegraph build .` to keep the graph current.
+```
codegraph_gen-0.2.0/README.md
@@ -0,0 +1,134 @@
+# codegraph
+
+`codegraph` is a static code knowledge graph generator for AI agents (such as Antigravity, Codex, and Claude Code). It statically parses multi-language codebases, clusters components automatically with a community detection algorithm, and exports the result as a linked vault of standard Markdown files (an Obsidian-like vault), helping AI agents understand architecture, navigate logic, and perform in-depth analysis locally.
+
+Unlike knowledge graphs rendered on a graphical canvas, `codegraph` stores everything in a flat, all-Markdown structure. It is designed specifically for LLMs and avoids costly, complex database dependencies, so an AI agent can traverse the whole codebase through ordinary file reads and relative links.
+
+---
+
+## 🚀 Core Features
+
+- **Multi-language AST parsing**: built on `tree-sitter`, with native support for **Python, JavaScript, TypeScript, Go, Rust, Swift**.
+- **Semantic edge resolution and binding**: statically resolves cross-file function/method calls (`calls`), type inheritance and interface implementation (`inherits`/`implements`), and file import relationships (`imports`).
+- **Automatic logical component clustering**: uses a greedy modularity community detection algorithm (Louvain Modularity Clustering) to cluster tightly coupled files and symbols into **Components (logical components)**, naming each one after its core nodes.
+- **Architectural fragility analysis**: automatically identifies **God Nodes (the core abstractions with the highest degree)** and statically detects file-level **circular imports**.
+- **Agent-friendly interaction protocol**: generates the offline agent prompt file `AGENT_PROMPT.md` and the rules file `AGENTS.md`, enabling agent-driven architectural insight with zero API cost.
+
+---
+
+## 📦 Architecture Overview
+
+```mermaid
+graph TD
+    A[Workspace source code] --> B[detect: language detection and filtering]
+    B --> C[parser: Tree-Sitter AST symbol extraction]
+    C --> D[builder: NetworkX semantic graph assembly and binding]
+    D --> E[cluster: modularity-based community clustering and naming]
+    E --> F[analyze: god node and circular import analysis]
+    F --> G[export: write to .codegraph/]
+    G --> H[AGENT_PROMPT.md / AGENTS.md / README.md / nodes / components]
+```
+
+---
+
+## 🛠️ Installation Guide
+
+We recommend [uv](https://github.com/astral-sh/uv) for managing project dependencies and the virtual environment:
+
+```bash
+# Clone the repository
+git clone <repository-url>
+cd codegraph
+
+# Sync dependencies and activate the virtual environment
+uv sync
+source .venv/bin/activate
+
+# Install globally (recommended)
+uv tool install --force --no-cache .
+```
+
+### 2. Install the AI Agent Slash Command Integration
+
+`codegraph` can register the custom `/codegraph` slash command in the global configuration of your AI agent (such as Codex or Antigravity) with a single command:
+
+```bash
+# Install the global /codegraph slash command for Codex / Antigravity
+codegraph install --platform codex
+```
+
+Once registered, typing `/codegraph` in the corresponding agent terminal runs the whole graph extraction, analysis, and write-back workflow automatically.
+
+---
+
+## 📖 Usage
+
+### 1. Build the Code Graph
+
+Run `codegraph build` in the project root. By default it scans the current folder and writes the graph to the `.codegraph/` directory:
+
+```bash
+# Scan the current directory and generate the graph
+codegraph build .
+
+# Specify a custom output directory
+codegraph build . --output my_vault/
+
+# Exclude specific folders
+codegraph build . --exclude extra_folder/ --exclude docs/
+```
+
+### 2. Exported Vault Directory Structure
+
+The generated `.codegraph/` directory is a self-contained Markdown knowledge graph database with the following structure:
+
+```
+.codegraph/
+├── README.md          # Main graph index: statistics, Mermaid component dependency graph, god nodes, circular dependencies, and AI architectural insights
+├── AGENT_PROMPT.md    # Architecture analysis prompt template for external AI agents to read
+├── AGENTS.md          # Collaboration rules and navigation guide for external AI agents
+├── components/        # Details of the clustered logical components (e.g. Component_3_BaseParser_.md)
+└── nodes/             # Details of every physical file and symbol (definition signatures, bidirectional call chains, and Mermaid topology)
+```
+
+---
+
+## 🤖 Collaborative Analysis with AI Agents (Codex / Antigravity / Claude Code)
+
+The core design idea of `codegraph` is **build offline, analyze with the agent**. This avoids hard-coding LLM API calls into the CLI, lowers usage cost, and makes full use of the more capable, context-aware external agent in your current conversation.
+
+### Step 1: Generate the Local Graph
+
+Run the following command to generate the static graph foundation for your codebase:
+```bash
+codegraph build .
+```
+
+### Step 2: Trigger the Analysis in the Agent with One Instruction
+
+Open your AI agent (such as Codex, Antigravity, or a Claude Code terminal) and send it the following instruction:
+
+> 💡 **Analysis instruction**:
+> `Read and follow the prompt requirements in .codegraph/AGENT_PROMPT.md, perform an in-depth architectural analysis of this project, then write the analysis report, in Chinese, into the “AI 架构深度洞察” (AI Architectural Insights) section of .codegraph/README.md.`
+
+The agent then uses its file read/write capabilities to:
+1. Read `.codegraph/AGENT_PROMPT.md` to obtain the metadata and Mermaid relationships.
+2. Reason over them and generate a professional architecture report in Chinese under the `## AI 架构深度洞察` section of `.codegraph/README.md`.
+
+### Step 3: Everyday Q&A and Logic Navigation
+
+An AI agent can use the graph to navigate a large codebase effectively. When you have a question about code structure, tell it to use the graph:
+- *“Which component depends on BaseParser?”* -> the agent reads the component files under `components/`.
+- *“Where are the entry points that call this function?”* -> the agent checks `Incoming Calls` in the corresponding node file under `nodes/`.
+
+---
+
+## 📜 Rule Setup (Automatic Awareness)
+
+To have an AI agent use `.codegraph` automatically when it enters the repository, create a `CLAUDE.md` in the root directory, or reference the rules from `.codegraph/AGENTS.md` in `README.md`.
+For example, write the following in `CLAUDE.md`:
+```markdown
+- Before answering architecture or codebase questions, read .codegraph/README.md for god nodes and community structure.
+- Navigate .codegraph/components/ and .codegraph/nodes/ instead of reading raw code files directly.
+- After modifying code files, run `codegraph build .` to keep the graph current.
+```
codegraph_gen-0.2.0/pyproject.toml
@@ -0,0 +1,70 @@
+[project]
+name = "codegraph-gen"
+version = "0.2.0"
+description = "AST-based codebase knowledge graph generator in Markdown"
+readme = "README.md"
+authors = [
+    { name = "twn39", email = "twn39@163.com" }
+]
+license = { text = "MIT" }
+keywords = [
+    "knowledge-graph",
+    "ast",
+    "codebase",
+    "markdown",
+    "tree-sitter",
+    "visualization",
+    "static-analysis",
+    "ai-agent",
+    "obsidian",
+]
+classifiers = [
+    "Development Status :: 4 - Beta",
+    "Intended Audience :: Developers",
+    "License :: OSI Approved :: MIT License",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3.12",
+    "Topic :: Software Development :: Code Generators",
+    "Topic :: Software Development :: Libraries :: Python Modules",
+]
+requires-python = ">=3.12"
+dependencies = [
+    "networkx>=3.0",
+    "tree-sitter>=0.23.0",
+    "tree-sitter-python",
+    "tree-sitter-javascript",
+    "tree-sitter-typescript",
+    "tree-sitter-go",
+    "tree-sitter-rust",
+    "tree-sitter-swift",
+    "click>=8.0.0",
+    "rich>=13.0.0",
+    "pydantic>=2.0.0",
+    "tree-sitter-c>=0.24.2",
+    "tree-sitter-cpp>=0.23.4",
+    "tree-sitter-kotlin>=1.1.0",
+]
+
+[dependency-groups]
+dev = [
+    "pytest>=8.0.0",
+    "ruff>=0.15.16",
+    "ty>=0.0.44",
+]
+
+[project.urls]
+Homepage = "https://github.com/twn39/codegraph"
+Repository = "https://github.com/twn39/codegraph"
+Issues = "https://github.com/twn39/codegraph/issues"
+
+[project.scripts]
+codegraph = "codegraph_gen.__main__:main"
+codegraph-gen = "codegraph_gen.__main__:main"
+
+[build-system]
+requires = ["uv_build>=0.9.26,<0.10.0"]
+build-backend = "uv_build"
+
+[tool.pytest.ini_options]
+testpaths = ["tests"]
+
codegraph_gen-0.2.0/src/codegraph_gen/__init__.py
File without changes
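The `[project.scripts]` table maps both `codegraph` and `codegraph-gen` to the same `codegraph_gen.__main__:main` reference. An installer turns each such `module:attribute` string into a small launcher that imports the module and calls the attribute. The sketch below shows that resolution step in isolation. The helper name `load_entry_point` is hypothetical, and the demonstration resolves the standard-library reference `json:dumps` so that it runs without the package installed.

```python
from importlib import import_module


def load_entry_point(spec: str):
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj = import_module(module_name)
    # The attribute part may be dotted, e.g. "Class.method".
    for part in attr_path.split("."):
        if part:
            obj = getattr(obj, part)
    return obj


# With the package installed, "codegraph_gen.__main__:main" would resolve the
# CLI entry function; a standard-library reference is used here instead.
dumps = load_entry_point("json:dumps")
print(dumps({"ok": True}))  # {"ok": true}
```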
codegraph_gen-0.2.0/src/codegraph_gen/__main__.py
@@ -0,0 +1,311 @@
+from pathlib import Path
+import click
+from rich.console import Console
+from rich.table import Table
+from rich.progress import (
+    Progress,
+    SpinnerColumn,
+    TextColumn,
+    BarColumn,
+    MofNCompleteColumn,
+)
+
+from codegraph_gen.config import CodegraphConfig, DEFAULT_EXCLUSIONS
+
+console = Console()
+
+
+@click.group()
+def cli():
+    """codegraph - Build a Markdown knowledge graph of your codebase for AI analysis."""
+    pass
+
+
+@cli.command()
+@click.argument(
+    "src_dir",
+    type=click.Path(exists=True, file_okay=False, path_type=Path),
+    default=".",
+)
+@click.option(
+    "--output",
+    "-o",
+    type=click.Path(path_type=Path),
+    default=Path(".codegraph"),
+    help="Directory where the Markdown vault will be written.",
+)
+@click.option(
+    "--exclude",
+    "-e",
+    multiple=True,
+    type=str,
+    help="Additional folder names/patterns to exclude from scanning.",
+)
+@click.option(
+    "--parallel/--no-parallel",
+    default=True,
+    help="Enable/disable parallel parsing (using multiprocessing).",
+)
+@click.option(
+    "--workers",
+    "-w",
+    type=int,
+    default=None,
+    help="Number of worker processes to use for parallel parsing.",
+)
+@click.option(
+    "--cache/--no-cache",
+    default=True,
+    help="Enable/disable incremental parsing cache.",
+)
+def build(
+    src_dir: Path,
+    output: Path,
+    exclude: list[str],
+    parallel: bool,
+    workers: int | None,
+    cache: bool,
+):
+    """Parses the codebase in SRC_DIR and exports the Markdown graph vault."""
+    console.print("[bold blue]Starting codegraph analysis...[/bold blue]")
+
+    # 1. Prepare configuration
+    exclusions = set(DEFAULT_EXCLUSIONS)
+    if exclude:
+        exclusions.update(exclude)
+
+    import os
+
+    if not parallel:
+        max_workers = 1
+    elif workers is not None:
+        max_workers = workers
+    else:
+        max_workers = os.cpu_count() or 4
+
+    config = CodegraphConfig(
+        workspace_dir=src_dir.resolve(),
+        output_dir=output.resolve(),
+        exclusions=exclusions,
+        max_workers=max_workers,
+        use_cache=cache,
+    )
+
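The configuration step above can be restated as two pure functions, which makes the precedence easy to test without invoking click: `--no-parallel` wins over `--workers`, an explicit worker count is passed through unchanged, and the CPU count (or 4 when it cannot be determined) is the fallback. The function names `resolve_max_workers` and `merge_exclusions` are hypothetical and do not appear in the package.

```python
import os
from collections.abc import Iterable
from typing import Optional


def resolve_max_workers(parallel: bool, workers: Optional[int]) -> int:
    """Mirror the precedence used by `build`: flag, explicit count, CPU count."""
    if not parallel:
        return 1
    if workers is not None:
        # Passed through as given; no range validation, as in the command above.
        return workers
    return os.cpu_count() or 4


def merge_exclusions(defaults: Iterable[str], extra: Iterable[str]) -> set[str]:
    """Combine built-in exclusions with the values given via --exclude."""
    merged = set(defaults)
    merged.update(extra)
    return merged


print(resolve_max_workers(parallel=False, workers=8))  # 1
print(resolve_max_workers(parallel=True, workers=3))   # 3
print(sorted(merge_exclusions({".git", "node_modules"}, ("docs/",))))
# ['.git', 'docs/', 'node_modules']
```

Because an explicit count is not validated, a value such as `0` reaches the configuration as is. A stricter variant could declare the option with `type=click.IntRange(min=1)`.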
+    from codegraph_gen.engine import CodegraphEngine, PipelineStage
+
+    engine = CodegraphEngine(config)
+
+    # Run pipeline with click progress bar
+    with Progress(
+        SpinnerColumn(),
+        TextColumn("[progress.description]{task.description}"),
+        BarColumn(),
+        MofNCompleteColumn(),
+        console=console,
+    ) as progress:
+        task = progress.add_task("Initializing...", total=None)
+
+        def progress_callback(stage: PipelineStage, current_item, idx, total):
+            if stage == PipelineStage.DISCOVERING:
+                progress.update(task, description="Discovering source files...")
+            elif stage == PipelineStage.PARSING:
+                if total > 0:
+                    progress.update(task, total=total)
+                progress.update(
+                    task,
+                    description=f"Parsing {current_item.name if current_item else ''}",
+                    completed=idx,
+                )
+            elif stage == PipelineStage.BUILDING:
+                progress.update(task, description="Building reference graph...")
+            elif stage == PipelineStage.CLUSTERING:
+                progress.update(task, description="Clustering components...")
+            elif stage == PipelineStage.ANALYZING:
+                progress.update(task, description="Analyzing graph metrics...")
+            elif stage == PipelineStage.RENDERING:
+                progress.update(task, description="Rendering Markdown vault...")
+            elif stage == PipelineStage.WRITING:
+                progress.update(task, description="Writing files to disk...")
+            elif stage == PipelineStage.COMPLETED:
+                progress.update(task, description="Done!")
+
+        result = engine.run_pipeline(progress_callback=progress_callback)
+
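The stage-to-description chain in the callback can also be expressed as a lookup table, with `PARSING` as the one stage that needs the current item. The sketch below is a stand-alone illustration: the real `PipelineStage` lives in `codegraph_gen.engine`, which this part of the diff does not show, so the enum defined here is a stand-in that reuses only the member names seen in the callback, and the `describe` helper is hypothetical.

```python
from enum import Enum, auto
from pathlib import Path
from typing import Optional


class PipelineStage(Enum):
    """Stand-in enum; member names copied from the callback, values assumed."""
    DISCOVERING = auto()
    PARSING = auto()
    BUILDING = auto()
    CLUSTERING = auto()
    ANALYZING = auto()
    RENDERING = auto()
    WRITING = auto()
    COMPLETED = auto()


# Fixed descriptions, taken from the progress messages in the callback.
STAGE_DESCRIPTIONS = {
    PipelineStage.DISCOVERING: "Discovering source files...",
    PipelineStage.BUILDING: "Building reference graph...",
    PipelineStage.CLUSTERING: "Clustering components...",
    PipelineStage.ANALYZING: "Analyzing graph metrics...",
    PipelineStage.RENDERING: "Rendering Markdown vault...",
    PipelineStage.WRITING: "Writing files to disk...",
    PipelineStage.COMPLETED: "Done!",
}


def describe(stage: PipelineStage, current_item: Optional[Path] = None) -> str:
    """Return the progress text for a stage; PARSING includes the file name."""
    if stage is PipelineStage.PARSING:
        name = current_item.name if current_item else ""
        return f"Parsing {name}"
    return STAGE_DESCRIPTIONS[stage]


print(describe(PipelineStage.PARSING, Path("src/app.py")))  # Parsing app.py
print(describe(PipelineStage.COMPLETED))                    # Done!
```

A table keeps the messages in one place, and an unknown stage raises `KeyError` instead of being silently ignored.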
+    G = result.graph
+    if G.number_of_nodes() == 0:
+        console.print("[bold yellow]Completed build, but graph is empty.[/bold yellow]")
+        return
+
+    files_count = len(result.files)
+    symbols_count = G.number_of_nodes() - files_count
+
+    console.print(f"Found [green]{files_count}[/green] supported files to analyze.")
+    console.print(
+        f"Assembled graph with [green]{G.number_of_nodes()}[/green] nodes and [green]{G.number_of_edges()}[/green] edges."
+    )
+    console.print(f" - Files: {files_count}")
+    console.print(f" - Symbols (Classes/Functions/Methods): {symbols_count}")
+
+    console.print(
+        "[bold green]Success! Codebase knowledge graph built successfully.[/bold green]"
+    )
+
+    table = Table(title="Logical Components Summary")
+    table.add_column("Component Name", style="cyan", no_wrap=True)
+    table.add_column("Cohesion (Density)", style="magenta")
+    table.add_column("Size (Nodes)", style="green")
+
+    for cid, members in result.components.items():
+        table.add_row(
+            result.component_names[cid],
+            str(result.cohesion_scores[cid]),
+            str(len(members)),
+        )
+
+    console.print(table)
+    console.print(
+        f"\nView the main graph entrypoint at: [bold underline]{config.absolute_output_dir}/README.md[/bold underline]"
+    )
+    console.print(
+        f"💡 [bold yellow]AI Insight Tip:[/bold yellow] Ask your AI Agent (e.g. Antigravity, Claude Code, Codex) to read [bold]{config.absolute_output_dir}/AGENT_PROMPT.md[/bold] and write the architectural report directly to [bold]{config.absolute_output_dir}/README.md[/bold].\n"
+    )
+
+
+@cli.command()
+@click.option(
+    "--platform",
+    "-p",
+    default="codex",
+    type=click.Choice(["codex", "antigravity"]),
+    help="The AI agent platform to integrate with.",
+)
+def install(platform: str):
+    """Installs the codegraph slash command into your AI Agent's global config."""
+    console.print(
+        f"[bold blue]Installing codegraph integration for {platform}...[/bold blue]"
+    )
+
+    # 1. Resolve skills directory based on target platform
+    if platform == "codex":
+        skills_dir = Path.home() / ".codex" / "skills" / "codegraph"
+    elif platform == "antigravity":
+        skills_dir = Path.home() / ".gemini" / "config" / "skills" / "codegraph"
+    else:
+        skills_dir = Path.home() / ".codex" / "skills" / "codegraph"
+
+    # 2. Skill file content
+    skill_content = """---
+name: codegraph
+description: "Build a Markdown codebase knowledge graph using codegraph, perform logical component clustering, analyze god nodes/circular dependencies, and write deep architectural insights to .codegraph/README.md."
+trigger: /codegraph
+---
+
+# /codegraph
+
+Build a codebase knowledge graph using `codegraph` for any folder, cluster symbols into logical components, detect god nodes and cycles, and perform a deep architectural analysis to write insights directly to the `.codegraph/README.md` vault.
+
+## Usage
+
+```
+/codegraph                        # Run the full build & AI analysis pipeline on the current directory
+/codegraph <path>                 # Run the pipeline on a specific subfolder/path
+/codegraph --exclude <pattern>    # Build and exclude specific folders/patterns
+```
+
+## What You Must Do When Invoked
+
+If the user invoked `/codegraph` with no path, do not ask the user for a path. Instead of scanning the entire project root directory `.` (which may include non-essential scripts, docs, or huge subfolders), you MUST prioritize targeting the primary source directory (e.g. `src/`, `lib/`, `app/`) and test directory (e.g. `tests/`, `test/`).
+- If specific source or test folders are found, run the build targeting those folders, or build the root `.` but exclude other non-code/non-test directories (e.g., `docs/`, `scripts/`, `examples/`) using the `--exclude` flag to keep the graph focused on code and tests.
+- Otherwise, default to `.` (current directory).
+
+Follow these steps in order. Do not skip any steps.
+
+### Step 1 - Ensure codegraph is installed
+
+Check and locate the `codegraph` executable. To support virtual environments, resolve the binary in the following priority order:
+1. Local virtual environment: `.venv/bin/codegraph` or `venv/bin/codegraph`
+2. Global command: `codegraph` (installed globally or via uv tool)
+
+You can use this shell logic to resolve the executable:
+```bash
+if [ -f ".venv/bin/codegraph" ]; then
+    CODEGRAPH_BIN=".venv/bin/codegraph"
+elif [ -f "venv/bin/codegraph" ]; then
+    CODEGRAPH_BIN="venv/bin/codegraph"
+else
+    if ! command -v codegraph >/dev/null 2>&1; then
+        uv tool install codegraph
+    fi
+    CODEGRAPH_BIN="codegraph"
+fi
+echo "Using codegraph binary: $CODEGRAPH_BIN"
+```
+
+### Step 2 - Build the Knowledge Graph
+
+Run the resolved `$CODEGRAPH_BIN` on the specified directory:
+```bash
+$CODEGRAPH_BIN build INPUT_PATH
+# Or with additional exclude arguments if provided by the user
+```
+*(Replace `INPUT_PATH` with the resolved target path, e.g. `.`)*
+
+If the command fails or errors out, capture the terminal stderr/logs, display them to the user with a helpful explanation, and ask them if they want to exclude specific directories or fix the errors. Do not fail silently.
+
+### Step 3 - Perform Deep Architectural Analysis
+
+Once the graph is built successfully:
+1. Read the newly generated `<path>/.codegraph/AGENT_PROMPT.md` file using your file reading tools.
+2. Read the project statistics, communities, god nodes, and cycle warnings from it.
+3. Perform a deep, professional architectural review of the codebase (using **English** as the report language), combined with deep insight analysis of the code implementation of existing features.
+4. Focus your review on:
+   - **System Architecture Evaluation**: Explain the design patterns, modularity level, and alignment between physical directories and logical components in the codebase.
+   - **Core Abstractions & Boundary Evaluation**: Deeply analyze God Nodes to determine which ones are core support and which ones have excessive responsibilities (God Object / Fat Class) that may lead to high risk.
+   - **Potential Bottlenecks & Architectural Refactoring Recommendations**: Point out high-coupling risk points and negative impacts of circular dependencies, and provide specific, actionable refactoring optimization plans (e.g., decoupling, extracting interfaces, dependency inversion).
+5. Read the existing `<path>/.codegraph/README.md` first. If there's an existing `## AI Architectural Insights` section, merge your new findings with it rather than silently overwriting and discarding previous edits.
+6. Write the completed report into `<path>/.codegraph/README.md` under the `## AI Architectural Insights` section, replacing any placeholder instructions.
+
+### Step 4 - Present Summary to the User
+
+Finally, reply to the user in English, summarizing:
+- The graph statistics (number of files, symbols, edges).
+- The logical component summary (with sizes and cohesion scores).
+- A brief bulleted summary of your key architectural findings and recommendations.
+- Clickable markdown links pointing to:
+  - The main entrypoint: `[README.md](file:///<absolute_path_to_vault>/README.md)`
+  - The agent guidelines: `[AGENTS.md](file:///<absolute_path_to_vault>/AGENTS.md)`
+  - The detailed components folder: `[components/](file:///<absolute_path_to_vault>/components/)`
+"""
+
+    try:
+        skills_dir.mkdir(parents=True, exist_ok=True)
+        skill_file = skills_dir / "SKILL.md"
+        skill_file.write_text(skill_content, encoding="utf-8")
+        console.print(
+            f"[bold green]Successfully installed /codegraph slash command to: [underline]{skill_file}[/underline][/bold green]"
+        )
+    except Exception as e:
+        console.print(f"[bold red]Failed to write skill configuration: {e}[/bold red]")
+
+
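Two lookups in the `install` command lend themselves to small, testable helpers: the per-platform skills directory and the executable resolution order that the skill file describes in shell. The sketch below restates both in Python. The names `SKILL_ROOTS`, `skills_dir_for`, and `resolve_codegraph_bin` are hypothetical. Unlike the shell snippet in the skill file, the resolver returns `None` when nothing is found and does not attempt an installation.

```python
import shutil
from pathlib import Path
from typing import Optional

# Path segments below the home directory, as used by the install command.
SKILL_ROOTS = {
    "codex": (".codex", "skills", "codegraph"),
    "antigravity": (".gemini", "config", "skills", "codegraph"),
}


def skills_dir_for(platform: str, home: Path) -> Path:
    """Return the skills directory for a platform, defaulting to the codex layout."""
    parts = SKILL_ROOTS.get(platform, SKILL_ROOTS["codex"])
    return home.joinpath(*parts)


def resolve_codegraph_bin(project_root: Path) -> Optional[str]:
    """Prefer a project-local virtual environment, then fall back to PATH."""
    for candidate in (".venv/bin/codegraph", "venv/bin/codegraph"):
        path = project_root / candidate
        if path.is_file():
            return str(path)
    return shutil.which("codegraph")  # None when the command is not on PATH


print(skills_dir_for("antigravity", Path("/home/dev")))
```

Passing `home` as an argument instead of calling `Path.home()` inside the helper keeps the function free of environment lookups, so tests can point it at a temporary directory.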
+@cli.command()
+def info():
+    """Prints tool info and supported languages."""
+    try:
+        from importlib.metadata import version
+
+        ver = version("codegraph")
+    except Exception:
+        ver = "0.2.0"
+    console.print(f"[bold]codegraph v{ver}[/bold]")
+    console.print(
+        "Supported languages: Python, JavaScript, TypeScript, Go, Rust, Swift"
+    )
+
+
+def main():
+    cli()
+
+
+if __name__ == "__main__":
+    main()
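`importlib.metadata.version` looks a package up by its distribution name, which is the `name` field in `pyproject.toml` (here `codegraph-gen`), not by the import package or the console script name. A minimal sketch of a lookup keyed on that name follows. The helper name `installed_version` is hypothetical, and it catches only `PackageNotFoundError` so that unrelated failures are not hidden behind the fallback.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str, fallback: str) -> str:
    """Return the installed version of a distribution, or `fallback` if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return fallback


# Prints the installed version when the distribution is present,
# otherwise the fallback string.
print(installed_version("codegraph-gen", "0.2.0"))
```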