@dtt_siye/atool 1.4.0 → 1.5.0
- package/README.md +97 -214
- package/README.md.atool-backup.20260410_114701 +299 -0
- package/VERSION +1 -1
- package/bin/atool.js +55 -9
- package/install.sh +14 -4
- package/lib/install-cursor.sh +22 -0
- package/lib/install-kiro.sh +26 -2
- package/lib/pre-scan.sh +3 -1
- package/lib/project-init.sh +28 -9
- package/package.json +1 -1
- package/skills/ai-project-architecture/SKILL.md +33 -534
- package/skills/ai-project-architecture/rules/architecture-validation.md +200 -0
- package/skills/ai-project-architecture/rules/compliance-check.md +83 -0
- package/skills/ai-project-architecture/rules/iron-laws.md +188 -0
- package/skills/ai-project-architecture/rules/migration.md +94 -0
- package/skills/ai-project-architecture/rules/refactoring.md +91 -0
- package/skills/ai-project-architecture/rules/testing.md +249 -0
- package/skills/ai-project-architecture/rules/verification.md +111 -0
- package/skills/atool-init/SKILL.md +24 -4
- package/skills/project-analyze/SKILL.md +29 -8
- package/skills/project-analyze/phases/phase1-setup.md +61 -4
- package/skills/project-analyze/phases/phase2-understand.md +129 -27
- package/skills/project-analyze/phases/phase3-graph.md +32 -4
- package/skills/project-analyze/prompts/understand-agent.md +156 -298
- package/skills/project-analyze/rules/java.md +69 -1
- package/skills/project-query/SKILL.md +64 -734
- package/skills/project-query/rules/aggregate-stats.md +301 -0
- package/skills/project-query/rules/data-lineage.md +228 -0
- package/skills/project-query/rules/impact-analysis.md +218 -0
- package/skills/project-query/rules/neighborhood.md +234 -0
- package/skills/project-query/rules/node-lookup.md +97 -0
- package/skills/project-query/rules/path-query.md +135 -0
- package/skills/software-architecture/SKILL.md +39 -501
- package/skills/software-architecture/rules/concurrency-ha.md +346 -0
- package/skills/software-architecture/rules/ddd.md +450 -0
- package/skills/software-architecture/rules/decision-workflow.md +155 -0
- package/skills/software-architecture/rules/deployment.md +508 -0
- package/skills/software-architecture/rules/styles.md +232 -0
|
@@ -0,0 +1,301 @@
# Aggregate Query

## Supported Metrics

### Complexity Metrics
| Metric | Description | High-risk threshold | Data source |
|--------|-------------|---------------------|-------------|
| `cyclomatic_complexity` | Cyclomatic complexity | > 10 | Static Analysis |
| `halstead_volume` | Code volume | > 8 | Static Analysis |
| `lines_of_code` | Lines of code per function | > 200 | Static Analysis |

### Coupling Metrics
| Metric | Description | High-risk threshold | Data source |
|--------|-------------|---------------------|-------------|
| `coupling (Ca)` | Afferent coupling (how many nodes depend on this one) | > 0.7 | Dependency Graph |
| `coupling (Ce)` | Efferent coupling (how many nodes this one depends on) | > 0.7 | Dependency Graph |
| `instability (I)` | Instability, Ce/(Ca+Ce) | > 0.8 | Dependency Graph |
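
The instability formula in the table, Ce/(Ca+Ce), can be illustrated with a minimal Python sketch. The function name and the zero-coupling fallback are assumptions made for illustration; the source does not say how a node with no couplings should be scored.

```python
def instability(ca: int, ce: int) -> float:
    """Return I = Ce / (Ca + Ce); 0.0 when the node has no couplings (assumed fallback)."""
    total = ca + ce
    return ce / total if total else 0.0


# Hypothetical module: 2 incoming dependencies, 8 outgoing dependencies.
print(instability(ca=2, ce=8))  # 0.8
```

A value of 0.8 sits exactly on the table's threshold, so it would not be flagged by the `> 0.8` rule.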

### Importance Metrics
| Metric | Description | High-risk threshold | Data source |
|--------|-------------|---------------------|-------------|
| `importance` | Composite importance score | > 0.8 | Weighted Metrics |
| `betweenness_centrality` | Betweenness centrality (key routing nodes) | > 0.5 | Network Analysis |
| `in_degree` | In-degree (number of dependents) | > 20 | Dependency Graph |
| `out_degree` | Out-degree (number of dependencies) | > 20 | Dependency Graph |

### Quality Metrics
| Metric | Description | High-risk threshold | Data source |
|--------|-------------|---------------------|-------------|
| `maintainability_index` | Maintainability index | < 70 | Static Analysis |
| `technical_debt_ratio` | Technical debt ratio | > 0.2 | Code Analysis |
| `code_duplication_ratio` | Duplicated code ratio | > 0.1 | Duplication Analysis |

### Data Flow Metrics
| Metric | Description | High-risk threshold | Data source |
|--------|-------------|---------------------|-------------|
| `flow_rate` | Data flow volume | > 100 | Data Flow Analysis |
| `schema_coupling` | Data (schema) coupling | > 0.5 | Schema Analysis |

## Execution Steps

### 1. Parse the Query
```bash
/top 10 most complex functions
# metric=cyclomatic_complexity, direction=desc, limit=10, filter=

/modules ranked by importance
# metric=importance, direction=desc, limit=*, filter=type=module

/which module has highest coupling
# metric=coupling (Ce), direction=desc, limit=1, filter=type=module

/functions with cyclomatic_complexity > 10
# metric=cyclomatic_complexity, direction=desc, limit=*, filter=>10

/show all nodes with criticality CRITICAL
# metric=criticality, direction=desc, limit=*, filter=CRITICAL
```
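
As a rough illustration of this parsing step, the sketch below maps two of the structured command shapes (`top N <type>s by <metric>` and `<type>s with <metric> <op> <value>`) to the fields shown in the comments above. The function name `parse_query`, the `node_type` key, and the regular expressions are all assumptions; free-form phrasings such as "most complex functions" are out of scope for this sketch.

```python
import re


def parse_query(text: str) -> dict:
    """Map a slash command to metric / direction / limit / filter fields (hypothetical helper)."""
    query = {"metric": None, "direction": "desc", "limit": "*", "filter": "", "node_type": None}
    text = text.lstrip("/").strip()

    # Shape 1: "top 20 functions by cyclomatic_complexity"
    m = re.match(r"top (\d+) (\w+?)s? by (\w+)$", text)
    if m:
        query.update(limit=int(m.group(1)), node_type=m.group(2), metric=m.group(3))
        return query

    # Shape 2: "functions with cyclomatic_complexity > 10"
    m = re.match(r"(\w+?)s? with (\w+) ([<>]=?) ([\d.]+)$", text)
    if m:
        query.update(node_type=m.group(1), metric=m.group(2), filter=f"{m.group(3)}{m.group(4)}")
        return query

    raise ValueError(f"unsupported query: {text!r}")


print(parse_query("/functions with cyclomatic_complexity > 10"))
```

Raising on unrecognized input keeps the sketch honest about its coverage; a real implementation would need a fallback for natural-language queries.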

### 2. Load All Nodes from knowledge-graph.json
```json
{
  "nodes": [
    {
      "id": "func:UserService.createUser",
      "label": "createUser",
      "type": "function",
      "importance": 0.85,
      "cyclomatic_complexity": 8,
      "lines_of_code": 42,
      "maintainability_index": 75,
      ...
    }
  ]
}
```

### 3. Apply Filters
- Numeric filter: `cyclomatic_complexity > 10`
- Type filter: `type=module`
- Combined filter: `importance > 0.7 AND type=function`
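
The three filter forms above can be sketched in Python as a list of `(field, op, value)` conditions joined by AND. The helper name `matches`, the condition tuple shape, and the sample nodes are assumptions for illustration only.

```python
import operator

# Comparison operators supported by this sketch.
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def matches(node: dict, conditions: list) -> bool:
    """Return True when the node satisfies every (field, op, value) condition."""
    for field, op, value in conditions:
        actual = node.get(field)
        if op == "=":
            if actual != value:
                return False
        elif actual is None or not OPS[op](actual, value):
            # A missing metric never satisfies a numeric condition.
            return False
    return True


nodes = [
    {"label": "createUser", "type": "function", "importance": 0.85},
    {"label": "billing", "type": "module", "importance": 0.9},
    {"label": "helper", "type": "function", "importance": 0.2},
]
# importance > 0.7 AND type=function
selected = [n for n in nodes if matches(n, [("importance", ">", 0.7), ("type", "=", "function")])]
print([n["label"] for n in selected])  # ['createUser']
```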

### 4. Sort by the Requested Metric
- Descending: `cyclomatic_complexity DESC`
- Ascending: `lines_of_code ASC`
- Multi-level sort: `importance DESC, cyclomatic_complexity ASC`
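
A multi-level sort with mixed directions can be sketched in Python by relying on the stability of the built-in sort: apply the lowest-priority key first, then the higher-priority ones. The function name `sort_nodes` and the sample data are assumptions.

```python
def sort_nodes(nodes, order):
    """Sort by several (metric, direction) pairs; the first pair has the highest priority."""
    result = list(nodes)
    # Python's sort is stable, so sorting by the lowest-priority key first
    # preserves that order among ties on the higher-priority keys.
    for metric, direction in reversed(order):
        result.sort(key=lambda n: n[metric], reverse=(direction == "desc"))
    return result


sample = [
    {"label": "a", "importance": 0.9, "cyclomatic_complexity": 12},
    {"label": "b", "importance": 0.9, "cyclomatic_complexity": 5},
    {"label": "c", "importance": 0.5, "cyclomatic_complexity": 20},
]
ranked = sort_nodes(sample, [("importance", "desc"), ("cyclomatic_complexity", "asc")])
print([n["label"] for n in ranked])  # ['b', 'a', 'c']
```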

### 5. Take the Top N (default 10, maximum 50)
```
top 10: the first 10 results
top 50: the first 50 results
*: all results (use with caution)
```
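
The limit rule can be sketched as a small clamp in Python. The names `take_top`, `DEFAULT_LIMIT`, and `MAX_LIMIT` are hypothetical, and treating `*` as bypassing the maximum of 50 is an assumption drawn from the "all results" wording above.

```python
DEFAULT_LIMIT = 10
MAX_LIMIT = 50


def take_top(results, limit=None):
    """Return the first `limit` results; "*" returns everything (assumed to bypass the cap)."""
    items = list(results)
    if limit == "*":
        return items
    n = DEFAULT_LIMIT if limit is None else int(limit)
    # Clamp the requested size into the range [0, MAX_LIMIT].
    return items[: max(0, min(n, MAX_LIMIT))]


print(len(take_top(range(100))))       # 10
print(len(take_top(range(100), 80)))   # 50
```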

### 6. Generate the Bar Chart and Ranked Table
- ASCII bar chart: terminal-friendly
- Mermaid bar chart: for visual presentation
- Detailed table: full data
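
For the ASCII bar chart option, a minimal Python sketch might scale each value against the largest one and pad labels to a common width. The function name `ascii_chart`, the `width` parameter, and the exact line layout are assumptions, not the tool's actual output format.

```python
def ascii_chart(rows, width=40):
    """Render (label, value) pairs as left-aligned horizontal bars."""
    if not rows:
        return []
    max_value = max(value for _, value in rows)
    label_width = max(len(label) for label, _ in rows)
    lines = []
    for label, value in rows:
        # Scale the bar so the largest value fills `width` columns.
        bar_length = int(value * width / max_value) if max_value > 0 else 0
        lines.append(f"{label.ljust(label_width)} |{'█' * bar_length} {value}")
    return lines


for line in ascii_chart([("processOrder", 23), ("createUser", 9)], width=23):
    print(line)
```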

## jq Implementation

Template 6 (aggregate):
```jq
# Apply filters: keep nodes of $node_type whose $metric is at least $threshold.
# Pass "" as $metric to filter by type only.
def apply_filters($nodes; $metric; $threshold; $node_type):
  $nodes | map(select(
    .type == $node_type and
    ($metric == "" or
      (.[$metric] != null and (.[$metric] | tonumber) >= $threshold))
  ));

# Sort by metric. jq has no default arguments, so pass "desc" or "asc" explicitly.
def sort_by_metric($metric; $direction):
  if $direction == "desc" then
    sort_by(.[$metric] | tonumber) | reverse
  else
    sort_by(.[$metric] | tonumber)
  end;

# Generate an ASCII bar chart, one line per node, scaled to $max_length columns.
def generate_ascii_chart($nodes; $metric; $max_length):
  ($nodes | map(.[$metric] | tonumber) | max) as $max_value
  | $nodes | map(
      (.label | sub(".+:"; "")) as $name
      # Truncate long labels
      | (if ($name | length) > 15 then ($name | .[:15]) + "..." else $name end) as $label
      | (.[$metric] | tonumber) as $value
      | (if $max_value > 0 then ($value * $max_length / $max_value | floor) else 0 end) as $bar_length
      | ([range(0; $bar_length) | "█"] | join("")) as $bar
      | ([range($bar_length; $max_length) | " "] | join("")) as $pad
      | "[\($label)] \($bar)\($pad) \($value)"
    );

# Map a metric value to a risk level.
# Defined before generate_result_table because jq requires definitions before use.
def get_risk_level($value; $metric):
  if $metric == "cyclomatic_complexity" then
    if $value > 10 then "🔴 CRITICAL"
    elif $value > 7 then "🟠 HIGH"
    elif $value > 3 then "🟢 MEDIUM"
    else "🔵 LOW" end
  elif $metric == "importance" then
    if $value > 0.8 then "🔴 CRITICAL"
    elif $value > 0.6 then "🟠 HIGH"
    elif $value > 0.3 then "🟢 MEDIUM"
    else "🔵 LOW" end
  else "🔵 LOW"
  end;

# Generate the result table as an array of rows; the first row is the header.
# $title is accepted but currently unused.
def generate_result_table($nodes; $metric; $title):
  if ($nodes | length) == 0 then
    "No results found."
  else
    ["Rank", "Label", "Type", ($metric | ascii_upcase), "Risk Level"] as $header
    | [range(0; $nodes | length) as $i | $nodes[$i] as $node | [
        $i + 1,
        $node.label,
        $node.type,
        ($node[$metric] | tostring),
        get_risk_level($node[$metric]; $metric)
      ]] as $rows
    | [$header] + $rows
  end;
```

## User Command Examples

### Complexity
```bash
# Find the most complex functions
/top 20 functions by cyclomatic_complexity

# Find functions with a large code volume
/top 10 functions by halstead_volume

# Find functions with too many lines
/functions with lines_of_code > 200
```

### Coupling
```bash
# Find the modules with the highest efferent coupling
/modules ranked by coupling (Ce)

# Find unstable modules
/modules with instability > 0.8

# Find modules with high afferent coupling
/modules with coupling (Ca) > 0.7
```

### Importance
```bash
# Find the most important functions
/top 15 functions by importance
# Find hub nodes
/top 10 nodes by betweenness_centrality
# Find the most depended-upon functions
/functions with in_degree > 10
```

### Quality
```bash
# Find nodes with a high technical debt ratio
/nodes with technical_debt_ratio > 0.3

# Find functions with poor maintainability
/functions with maintainability_index < 60

# Find modules with heavy code duplication
/modules with code_duplication_ratio > 0.15
```

## Example Output

```
## Query Result: Aggregate Metrics

Query: Top 10 Functions by Cyclomatic Complexity

### Results (Sorted by cyclomatic_complexity DESC)

| Rank | Function | CC | Lines | Risk Level |
|------|----------|----|-------|------------|
| 1 | OrderService.processOrder | 23 | 342 | 🔴 CRITICAL |
| 2 | PaymentService.validatePayment | 18 | 267 | 🔴 CRITICAL |
| 3 | AuthService.authenticate | 14 | 198 | 🔴 CRITICAL |
| 4 | UserService.createUser | 9 | 156 | 🟠 HIGH |
| 5 | ShippingService.estimateDelivery | 8 | 124 | 🟠 HIGH |

### ASCII Chart

cyclomatic_complexity distribution:
OrderService.processOrder        |███████████████████████ 23
PaymentService.validatePayment   |██████████████████ 18
AuthService.authenticate         |██████████████ 14
UserService.createUser           |█████████ 9
ShippingService.estimateDelivery |████████ 8

### Statistics
- Mean CC: 8.2
- Median CC: 7
- Max CC: 23 (OrderService.processOrder)
- Functions with CC > 10: 3 (HIGH RISK)

### Recommendations
- 🔴 CRITICAL: Refactor OrderService.processOrder (consider splitting into 3-4 functions)
- 🔴 CRITICAL: Add comprehensive tests for PaymentService.validatePayment
- 🔴 CRITICAL: Monitor AuthService.authenticate for future complexity growth
```

## Aggregate Query Best Practices

### 1. Layered Analysis
```bash
# System level: top-level modules
/top 10 modules by importance (G1)

# Module level: internal structure
/top 20 functions by cyclomatic_complexity (G3)

# Implementation level: concrete code
/functions with technical_debt_ratio > 0.2 (G5)
```

### 2. Combined Queries
```bash
# High complexity + high coupling
/functions with cyclomatic_complexity > 10 AND coupling (Ce) > 0.7

# High importance + low quality
/functions with importance > 0.8 AND maintainability_index < 70

# Global quality overview
/show quality metrics summary
```

### 3. Trend Analysis
```bash
# Compare two versions
/compare v1 vs v2 (metric=cyclomatic_complexity)

# Show the historical trend
/show trend cyclomatic_complexity (last=5)
```

## Risk Level Color Coding

| Color | Risk level | Severity | Recommended action |
|-------|------------|----------|--------------------|
| 🔴 | CRITICAL | Most severe | Refactor immediately; highest priority |
| 🟠 | HIGH | Severe | Address soon; schedule a refactor |
| 🟢 | MEDIUM | Moderate | Optimize as planned; keep monitoring |
| 🔵 | LOW | Minor | Routine maintenance; no special attention needed |
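
## Troubleshooting

1. **Metric not found**: some metrics require analysis depth L2 or deeper; re-run the analysis at a greater depth.
2. **Empty results**: check whether the filter is too strict and relax its parameters.
3. **Performance problems**: for large datasets, use limit to cap the number of results.
4. **Wrong sort order**: confirm that the metric name and sort direction are correct.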
@@ -0,0 +1,228 @@
# Data Lineage

## Execution Steps

### 1. Locate the Data Entity Node D
Find nodes of type `data_entity`:
- `Order entity` → `data_entity:Order`
- `User data` → `data_entity:User`
- `Payment status` → `data_entity:PaymentStatus`

### 2. Trace Backward (Upstream Sources)
Follow `writes_to` and `writes_state` edges in reverse:

```mermaid
graph TD
    UI["User_Input"] -->|creates| CO["OrderService.createOrder"]
    UI -->|creates| CU["UserService.createUser"]
    CO -->|writes_to| EO["data_entity:Order"]
    CU -->|writes_to| EU["data_entity:User"]
```

Recursion rules:
- Starting from the data entity, walk backward to the functions that write to it
- Continue to the input sources of those functions
- Stop at boundary nodes: API, user_input, scheduled_task, config
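
The recursion rules above amount to a breadth-first walk against edge direction. The Python sketch below assumes edges are dicts with `source`, `target`, and `type` keys and that boundary nodes can be recognized by an ID prefix such as `api:`; both the edge shape and the prefix convention are assumptions based on the IDs used in this document. The visited set also keeps the walk from looping on circular references.

```python
from collections import deque

WRITE_EDGES = {"writes_to", "writes_state"}
# Assumed ID prefixes for the boundary node types listed above.
BOUNDARY_PREFIXES = ("api:", "user_input:", "scheduled_task:", "config:")


def trace_upstream(edges, entity_id):
    """Walk edges backward from a data entity until boundary nodes are reached."""
    visited, order = {entity_id}, []
    queue = deque([entity_id])
    while queue:
        current = queue.popleft()
        for edge in edges:
            if edge["target"] != current or edge["source"] in visited:
                continue
            # Directly above the entity only write edges count;
            # above a function, any incoming edge is treated as an input.
            if current == entity_id and edge["type"] not in WRITE_EDGES:
                continue
            visited.add(edge["source"])
            order.append(edge["source"])
            if not edge["source"].startswith(BOUNDARY_PREFIXES):
                queue.append(edge["source"])
    return order


edges = [
    {"source": "api:POST/orders", "target": "func:OrderService.createOrder", "type": "creates"},
    {"source": "func:OrderService.createOrder", "target": "data_entity:Order", "type": "writes_to"},
    {"source": "data_entity:Order", "target": "func:OrderService.getOrders", "type": "reads_from"},
]
print(trace_upstream(edges, "data_entity:Order"))
```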

### 3. Trace Forward (Downstream Destinations)
Follow `reads_from`, `reads_state`, and `transforms` edges forward:

```mermaid
graph TD
    EO["data_entity:Order"] -->|reads_from| GO["OrderService.getOrders"]
    EO -->|reads_from| PO["OrderService.processOrder"]
    EO -->|persists_to| DB["database:PostgreSQL"]
```

Recursion rules:
- Starting from the data entity, find the functions that read it
- Continue to where the output of those functions flows
- Stop at sinks: database, external_api, log, output

### 4. Compute the Data Flow Rate
```
flow_rate = count(writers) × count(readers) × avg_transformation_steps
```

Rating:
- `flow_rate > 100` → 🔴 HIGH ACTIVITY
- `10 < flow_rate ≤ 100` → 🟠 MEDIUM ACTIVITY
- `flow_rate ≤ 10` → 🟢 LOW ACTIVITY
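
The formula and rating bands can be checked with a short Python sketch. The function names are hypothetical; the sample numbers (2 writers, 5 readers, 3 steps) are the ones used in this document's example output.

```python
def flow_rate(writers: int, readers: int, avg_transformation_steps: float) -> float:
    """flow_rate = count(writers) x count(readers) x avg_transformation_steps."""
    return writers * readers * avg_transformation_steps


def activity_level(rate: float) -> str:
    """Map a flow rate to the rating bands listed above."""
    if rate > 100:
        return "HIGH ACTIVITY"
    if rate > 10:
        return "MEDIUM ACTIVITY"
    return "LOW ACTIVITY"


rate = flow_rate(writers=2, readers=5, avg_transformation_steps=3)
print(rate, activity_level(rate))  # 30 MEDIUM ACTIVITY
```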

### 5. Generate the Mermaid `graph TD` Data Flow Diagram
Use the standard styles:
- Data nodes (blue): `data_entity:Order`
- Processing functions (green): `func:OrderService.createOrder`
- External systems (gray): `api:POST/orders`
- Label every edge with its operation type
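
One way to emit such a diagram is sketched below in Python. It assigns each node a short generated ID and puts the real name in a quoted label, since names like `data_entity:Order` or `api:GET/orders/{id}` contain characters that are awkward in bare Mermaid node IDs. The function name `to_mermaid` and the `(source, label, target)` edge tuples are assumptions; styling via `classDef` is left out of this sketch.

```python
def to_mermaid(edges):
    """Render (source, label, target) edges as a Mermaid `graph TD` definition."""
    ids = {}  # node name -> short Mermaid-safe id

    def node(name):
        # Emit the quoted label only the first time a node appears.
        if name not in ids:
            ids[name] = f"n{len(ids)}"
            return f'{ids[name]}["{name}"]'
        return ids[name]

    lines = ["graph TD"]
    for source, label, target in edges:
        lines.append(f"    {node(source)} -->|{label}| {node(target)}")
    return "\n".join(lines)


print(to_mermaid([
    ("api:POST/orders", "creates", "func:OrderService.createOrder"),
    ("func:OrderService.createOrder", "writes_to", "data_entity:Order"),
]))
```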

## jq Implementation

Template 4 (data-lineage):
```jq
# Find writers (upstream): sources of write edges that target the entity.
def find_upstream_writers($data_entity_id):
  [.edges[] | select(
    .target == $data_entity_id and
    (.type == "writes_to" or .type == "writes_state")
  ) | .source];

# Find readers (downstream): targets of read edges that start at the entity.
def find_downstream_readers($data_entity_id):
  [.edges[] | select(
    .source == $data_entity_id and
    (.type == "reads_from" or .type == "reads_state")
  ) | .target];

# Generate the data flow graph around one entity.
def generate_data_flow_graph($data_entity_id):
  find_upstream_writers($data_entity_id) as $writers
  | find_downstream_readers($data_entity_id) as $readers
  | ($writers | map({
      id: .,
      type: (if startswith("api:") or startswith("user_input:")
             then "source"
             else "function" end),
      direction: "to",
      source: .,
      target: $data_entity_id
    })) as $upstream
  # IDs in this document carry a type prefix (e.g. database:PostgreSQL),
  # so sinks are matched with startswith.
  | ($readers | map({
      id: .,
      type: (if startswith("database:") or startswith("external_api:")
             then "sink"
             else "function" end),
      direction: "from",
      source: $data_entity_id,
      target: .
    })) as $downstream
  | (($upstream + $downstream | map(.id)) + [$data_entity_id] | unique) as $all_nodes
  | {
      nodes: ($all_nodes | map({
        id: .,
        label: .,
        class: (if startswith("data_entity:") then "dataEntity"
                elif startswith("api:") then "api"
                elif startswith("database:") then "database"
                elif startswith("user_input:") then "userInput"
                else "function" end)
      })),
      # label/class reuse the endpoint type (source, function, sink);
      # the finders above do not carry the edge type through.
      edges: ($upstream + $downstream | map({
        source: .source,
        target: .target,
        label: .type,
        class: .type
      }))
    };

# Compute the flow rate.
def calculate_flow_rate($data_entity_id):
  (find_upstream_writers($data_entity_id) | length) as $writers
  | (find_downstream_readers($data_entity_id) | length) as $readers
  | (if $writers + $readers > 0
     then 3  # average transformation steps (fixed placeholder value)
     else 0 end) as $avg_steps
  | $writers * $readers * $avg_steps;
```

## User Command Examples

```bash
# Trace an entity's lineage
/trace data lineage of Order entity
/trace data lineage of User
/trace data lineage of Payment status

# Show where data comes from
/where does Order data flow from
/where does User data come from
/trace data flow for Order creation
```

## Example Output

```
## Query Result: Data Lineage

Target: data_entity:Order

### Upstream (Data Sources)
api:POST/orders
  --creates--> func:OrderService.createOrder
    --writes_to--> data_entity:Order

### Processing (Transformations)
func:OrderService.createOrder
  --transforms--> func:PaymentService.calculateTotal
  --transforms--> func:ShippingService.estimateDelivery

### Downstream (Data Sinks)
data_entity:Order
  --reads_from--> func:OrderService.getOrders
    --returns--> api:GET/orders/{id}

  --persists_to--> database:PostgreSQL

### Data Flow Metrics
- Writers: 2 functions
- Readers: 5 functions
- Transformations: 3 steps
- Flow Rate: 30 (🟠 MEDIUM ACTIVITY)
```

## Mermaid Diagram Example

```mermaid
graph TD
    classDef dataEntity fill:#87CEEB
    classDef function fill:#98FB98
    classDef api fill:#D3D3D3
    classDef database fill:#D3D3D3

    U["api:POST/orders"] -->|creates| F1["func:OrderService.createOrder"]
    F1 -->|writes_to| O["data_entity:Order"]

    O -->|reads_from| F2["func:OrderService.getOrders"]
    F2 -->|returns| API["api:GET/orders/{id}"]

    O -->|persists_to| DB["database:PostgreSQL"]
    F1 -->|transforms| F3["func:PaymentService.calculateTotal"]
    F1 -->|transforms| F4["func:ShippingService.estimateDelivery"]

    class U api
    class O dataEntity
    class F1,F2,F3,F4 function
    class DB database
    class API api
```

## Data Flow Types

| Operation type | Edge label | Meaning | Color |
|----------------|------------|---------|-------|
| `creates` | creates | Creates new data | #90EE90 |
| `writes_to` | writes_to | Writes data | #FFB6C1 |
| `reads_from` | reads_from | Reads data | #87CEEB |
| `transforms` | transforms | Transforms data | #FFD700 |
| `persists_to` | persists_to | Persists to the database | #DDA0DD |
| `returns` | returns | Returns a response | #FFA07A |
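
## Troubleshooting

1. **No data flow**: check that the entity name is correct; use `/find` to confirm.
2. **Incomplete flow**: check that the analysis depth is sufficient (L2+ recommended).
3. **Too many transformation steps**: filter using granularity scaling (G3-G4).
4. **Circular references**: these may indicate a design problem and should be reviewed.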

## Best Practices

1. **Key entities first**: focus on core business entities.
2. **Monitor high activity**: closely monitor entities with flow_rate > 50.
3. **Assess change impact**: analyze the impact before modifying a key entity.
4. **Performance optimization**: high-traffic entities may be performance bottlenecks.