@josephyan/qingflow-app-user-mcp 0.2.0-beta.17 → 0.2.0-beta.19

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -3,13 +3,13 @@
  Install:

  ```bash
- npm install @josephyan/qingflow-app-user-mcp@0.2.0-beta.17
+ npm install @josephyan/qingflow-app-user-mcp@0.2.0-beta.19
  ```

  Run:

  ```bash
- npx -y -p @josephyan/qingflow-app-user-mcp@0.2.0-beta.17 qingflow-app-user-mcp
+ npx -y -p @josephyan/qingflow-app-user-mcp@0.2.0-beta.19 qingflow-app-user-mcp
  ```

  Environment:
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@josephyan/qingflow-app-user-mcp",
- "version": "0.2.0-beta.17",
+ "version": "0.2.0-beta.19",
  "description": "Operational end-user MCP for Qingflow records, tasks, comments, and directory workflows.",
  "license": "MIT",
  "type": "module",
package/pyproject.toml CHANGED
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

  [project]
  name = "qingflow-mcp"
- version = "0.2.0b17"
+ version = "0.2.0b19"
  description = "User-authenticated MCP server for Qingflow"
  readme = "README.md"
  license = "MIT"
@@ -21,8 +21,8 @@ When the task is in `prod`, browser parity matters, or the user says "the page h
  Primary record and data tools:

  - `record_query`
+ - `record_schema_get`
  - `record_write_plan`
- - `record_field_resolve`
  - `record_create`
  - `record_get`
  - `record_update`
@@ -90,10 +90,10 @@ Do not use builder-side tools here:
  - Use `task_statistics` before `task_list` when the user only needs counts
  - Use `task_list_grouped` when worksheet or group buckets matter
  - Use `task_urge` only when the user clearly wants a reminder sent for a pending task
- - Use `record_field_resolve` when field selectors are ambiguous; if the task then turns into analysis, switch to `$qingflow-record-analysis`
+ - Use `record_schema_get` when field selectors are ambiguous; if the task then turns into analysis, switch to `$qingflow-record-analysis`
  - For precise record lookup, use `record_get` when `apply_id` is known
- - Use `record_field_resolve` when the user gives field titles and you are not fully sure about the exact schema; do not guess ambiguous fields silently
- - If the task has already shifted into analysis and `record_field_resolve` still leaves multiple plausible fields, stop and ask the user to confirm the intended field instead of continuing to try read tools in a loop
+ - Use `record_schema_get` when the user gives field titles and you are not fully sure about the exact schema; do not guess ambiguous fields silently
+ - If the task has already shifted into analysis and `record_schema_get` still leaves multiple plausible fields, stop and ask the user to confirm the intended field instead of continuing to try read tools in a loop
  - Treat field selectors as schema-first and platform-generic. Prefer exact field titles, then neutral aliases such as `创建时间`, `新增时间`, `负责人`, `部门`, `时间`, or `阶段` only when the tool resolves them clearly. Do not assume CRM shorthand like `销售`, `商机阶段`, `客户全称`, or similar domain shortcuts apply across arbitrary Qingflow apps
  - For updates, inspect current data first unless the user already provided the exact target and patch
  - For deletes, confirm the exact record scope and report the deleted ids
@@ -125,9 +125,9 @@ When the user asks for demo data, seed, smoke data, or mock data:

  ## Response Interpretation

- - `record_query(summary)` and `record_aggregate` expose `completeness`; do not treat partial scans as final conclusions
- - `record_query(summary)` and `record_aggregate` now also expose `analysis_status`, `safe_for_final_conclusion`, and `analysis_counts`; if `status=partial_success` or `safe_for_final_conclusion=false`, do not present the result as final
- - For aggregate or summary answers, report both `backend_total_count` and `scanned_count` when coverage matters; for full analysis framing, switch to `$qingflow-record-analysis`
+ - `record_query(query_mode="list")` is browse/sample output, not a final analysis result
+ - If `record_query(query_mode="list")` reports `row_cap_hit`, `sample_only`, or capped rows, do not present it as full data
+ - For grouped distributions, trends, or final statistical conclusions, switch to `$qingflow-record-analysis` and use `record_schema_get -> record_analyze`
  - `record_write_plan` is static preflight, not a guarantee that submit will pass runtime linkage or visibility checks
  - `record_create` now returns integer `apply_id`; you can pass that id directly into `record_get`, `record_update`, or `record_delete`
  - `verify_write=true` means the tool read the record back and compared the written fields; if it returns `status=verification_failed` or `ok=false`, do not report the create or update as successful
@@ -5,14 +5,13 @@ For final statistics, grouped distributions, or insight-style analysis, use [$qi
  ## Counts

  - Prefer `effective_count`
- - For `record_query(summary)` and `record_aggregate`, inspect `completeness`, `analysis_status`, and `safe_for_final_conclusion` before concluding
- - If `status=partial_success`, treat the result as exploratory unless the user explicitly asked for a partial sample
+ - For final analysis, inspect `record_analyze.data.completeness` and `safe_for_final_conclusion` before concluding
+ - If `record_analyze.status!=success`, treat the result as exploratory unless the user explicitly asked for a partial sample
  - `record_query(list)` is for browsing and sample inspection. If it reports `row_cap_hit`, `sample_only`, or capped `returned_items`, do not present it as full data
  - When coverage matters, surface:
    - `backend_total_count`
    - `scanned_count`
-   - `unscanned_count`
- - Reuse `suggested_next_call` or `estimate.recommended_arguments` instead of inventing bigger scan settings by hand
+ - Use narrower views, filters, or smaller analysis questions instead of inventing manual scan settings by hand
  - If the browser and MCP disagree, compare `request_route.base_url` and `request_route.qf_version` first
  - Do not mix a full aggregate total with sample-only list detail in one sentence like “基于全部数据分析”; split the answer into `全量结论` and `样本观察`

@@ -25,8 +24,8 @@ For final statistics, grouped distributions, or insight-style analysis, use [$qi

  - `record_write_plan` is static preflight only; linked visibility and runtime required rules can still reject writes
  - `record_write_plan` now exposes `write_format.support_level`; check `full / restricted / unsupported` before attempting non-trivial writes
- - Use `record_field_resolve` when field titles are uncertain instead of guessing ids
- - For analysis tasks, use the fixed preflight order `record_field_resolve -> record_query_plan -> summary/aggregate`; do not switch tools blindly after `FIELD_NOT_FOUND` or ambiguity
+ - Use `record_schema_get` when field titles are uncertain instead of guessing ids
+ - For analysis tasks, use the fixed path `record_schema_get -> record_analyze`; do not switch tools blindly after `FIELD_NOT_FOUND` or ambiguity
  - Prefer `strict_full=true` for final statistics or business conclusions
  - `record_create` and `record_update` can do post-write verification with `verify_write=true`; use that for complex, subtable, or production writes
  - `apply_id` is normalized to an integer; pass it directly into later record tools
@@ -9,9 +9,9 @@ Use `record_query` first when:
  - the user only gives a title or business key
  - the target record id is unknown
  - updates or deletes need confirmation
- - summary analysis or final counts are needed
+ - ordinary list browsing or spot checks are needed

- Use `record_query_plan` first when:
+ Use [$qingflow-record-analysis](/Users/yanqidong/Documents/qingflow-next/.codex/skills/qingflow-record-analysis/SKILL.md) when:

  - field titles may be ambiguous
  - filters are still in natural-language shape
@@ -22,14 +22,13 @@ Use `record_query_plan` first when:

  ## Final analysis pattern

- 1. Run `record_query_plan`
- 2. If the plan exposes `estimate.recommended_arguments` or `suggested_next_call`, prefer those arguments directly
- 3. Run `record_query(query_mode="summary", strict_full=true, auto_expand_pages=true)` to confirm the total scope
- 4. Run `record_aggregate(strict_full=true, auto_expand_pages=true)` for grouped results
- 5. Run `record_query(query_mode="list")` only if you still need sample rows or examples
- 6. Report `backend_total_count`, `scanned_count`, and whether the result is safe for a final conclusion
- 7. If `status=partial_success` or `safe_for_final_conclusion=false`, stop at “partial result” instead of presenting a final business conclusion
- 8. If list rows are sample-only, separate the answer into:
+ 1. Run `record_schema_get`
+ 2. Generate one or more field_id-based DSLs
+ 3. Run `record_analyze(strict_full=true)` for summary/distribution/trend/cross analysis
+ 4. Run `record_query(query_mode="list")` only if you still need sample rows or examples
+ 5. Report `backend_total_count`, `scanned_count`, and whether the result is safe for a final conclusion
+ 6. If `status=error` or `safe_for_final_conclusion=false`, stop at “partial result” instead of presenting a final business conclusion
+ 7. If list rows are sample-only, separate the answer into:
  - `全量可信结论`
  - `样本观察(不作为最终结论)`
  - optional `待验证假设`
@@ -42,12 +41,12 @@ Do not do this:
  2. Get `200` rows back
  3. Report 平均值、占比、地域分布 as if they were based on all records

- This is not acceptable because the list endpoint can be capped. Use `record_query_plan -> summary -> aggregate` first, then treat list rows as sample-only evidence.
+ This is not acceptable because the list endpoint can be capped. Use `record_schema_get -> record_analyze` first, then treat list rows as sample-only evidence.

  ## Create pattern

  1. Confirm target app
- 2. Resolve fields with `record_field_resolve` if needed. Prefer exact schema titles first; only rely on platform-neutral aliases such as `创建时间`, `负责人`, or `部门` when they resolve cleanly, and do not assume business-domain shorthand like `销售` is portable across apps
+ 2. Resolve fields with `record_schema_get` if needed. Prefer exact schema titles first; only rely on platform-neutral aliases such as `创建时间`, `负责人`, or `部门` when they resolve cleanly, and do not assume business-domain shorthand like `销售` is portable across apps
  3. Run `record_write_plan` for non-trivial payloads or any `fields`-based write
  4. For relation fields, query the target app first and resolve the referenced record `apply_id`
  5. For attachments, call `file_upload_local` first and reuse the returned `attachment_value`
@@ -2,7 +2,7 @@
  name: qingflow-record-analysis
  description: Analyze Qingflow record data safely after the MCP is already connected and authenticated. Use when the user wants grouped distributions, ratios, averages, rankings, trends, insights, or any final statistical conclusion across an existing app's data. Do not use this skill for schema changes, app design, or ordinary record CRUD unless they are strictly supporting an analysis flow.
  metadata:
- short-description: Analyze Qingflow record data with plan-first, sample-safe reporting
+ short-description: Analyze Qingflow record data with schema-first DSL execution
  ---

  # Qingflow Record Analysis
@@ -19,50 +19,228 @@ Before running analysis in `prod`, confirm the intended environment and compare

  Use these tools as the core analysis surface:

- - `record_field_resolve`
- - `record_query_plan`
- - `record_query`
- - `record_aggregate`
+ - `record_schema_get`
+ - `record_analyze`

- Use `record_get` or ordinary `record_query(list)` only when you need sample rows or a specific supporting example after the main analysis path.
+ Use `record_query(query_mode="list")` or `record_get` only when you need sample rows or a specific supporting example after the main analysis path.

  ## Hard Rules

- - Analysis tasks must start with `record_query_plan`
- - If fields are uncertain, the fixed order is:
-   1. `record_field_resolve`
-   2. `record_query_plan`
-   3. `record_query(summary)` and/or `record_aggregate`
- - If `record_field_resolve` returns multiple plausible fields and you still cannot identify the exact analysis field confidently, stop and ask the user to confirm from a short candidate list instead of continuing to guess
- - Do not loop between `record_field_resolve`, `record_query`, and `record_aggregate` trying field-name variants repeatedly; once ambiguity remains after one focused refinement pass, pause and confirm the field with the user
+ - Analysis tasks must start with `record_schema_get`
+ - Build one or more small DSLs, then run `record_analyze` separately for each question
+ - DSL field references must use `field_id` only
+ - Normalize relative time phrases into explicit legal date ranges before building the DSL
+ - If the user asks for `最近一个完整自然月 / 上个月 / 最近30天 / 本季度 / 去年同期`, first convert that phrase into concrete dates, then verify the dates are legal before calling MCP
+ - Never send impossible dates such as `2026-02-29`; if the intended month is February 2026, the legal upper bound is `2026-02-28`
+ - If the schema still leaves multiple plausible fields, stop and ask the user to confirm from a short candidate list instead of guessing
+ - Do not keep retrying different guessed field names in a loop
  - `record_query(list)` is never the basis for a final statistical conclusion
  - If `record_query(list)` reports `row_cap_hit`, `sample_only`, capped `returned_items`, or compact output, treat it as sample-only evidence
- - Do not mix full totals from `summary` or `aggregate` with sample-only list observations as one combined “全量结论”
- - Reuse `record_query_plan.estimate.recommended_arguments` or `suggested_next_call` instead of guessing scan parameters
- - For final conclusions, prefer `strict_full=true` and `auto_expand_pages=true`
+ - Do not mix full totals from `record_analyze` with sample-only list observations as one combined `全量结论`
+ - Do not manually tune paging or scan-budget parameters for analysis; `record_analyze` hides them
+ - For final conclusions, prefer `strict_full=true`
+ - Before choosing a DSL shape, first decide whether the question needs `count`, `sum`, `avg`, `distinct_count`, `ratio`, or `ranking`
+ - Do not guess a metric just because the user said `数量`, `单量`, `人数`, or `金额`
+ - If one business question depends on multiple metrics, split it into smaller structured questions and build multiple focused DSLs
+ - Penetration, conversion, and share-of-total conclusions (`渗透率 / 转化率 / 占比`) must define the numerator and the denominator first
+ - Do not claim a metric you did not query
+ - Derived ratios must be computed outside the DSL after trusted numerator and denominator queries complete; do not invent `div`, `formula`, or expression metrics inside `record_analyze`
+ - If the requested business question requires unsupported derived math, split it into multiple DSLs and compute the final ratio only in the reasoning layer after the source metrics are confirmed
+ - If the user asks for multiple conclusions and only part of them is completed reliably, explicitly disclose which parts are complete and which parts remain unresolved

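As a concrete sketch of the date-legality rule above, a `between` filter item for February 2026 would carry an explicit, legal range (the `field_id` value is a hypothetical date field taken from `record_schema_get`):

```json
{ "field_id": 3, "op": "between", "value": ["2026-02-01", "2026-02-28"] }
```

2026 is not a leap year, so the legal upper bound is `2026-02-28`; sending `2026-02-29` would be an impossible date.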
  ## Standard Operating Order

  For analysis:

  1. Confirm target app and environment
- 2. If field names are uncertain, run `record_field_resolve`
- 3. If the resolver still leaves more than one plausible candidate, present the shortlist and ask the user which field they mean
- 4. Run `record_query_plan`
- 5. Reuse MCP-recommended arguments from `suggested_next_call` or `estimate.recommended_arguments`
- 6. Run `record_query(query_mode="summary", strict_full=true, auto_expand_pages=true)` to establish full scope
- 7. Run `record_aggregate(strict_full=true, auto_expand_pages=true)` for grouped distributions, ranking, ratios, averages, or trends
- 8. Run `record_query(query_mode="list")` only if you still need sample rows, examples, or manual inspection
- 9. Before answering, separate:
+ 2. Run `record_schema_get`
+ 3. Inspect fields, aliases, suggested dimensions, suggested metrics, and suggested time fields
+ 4. Generate one or more field_id-based DSLs
+ 5. Run `record_analyze` once per DSL
+ 6. Run `record_query(query_mode="list")` only if you still need sample rows, examples, or manual inspection
+ 7. Before answering, separate:
  - `全量可信结论`
  - `样本观察`
  - `待验证假设`

+ ## Semantic Guardrails
+
+ - If the user asks for penetration, conversion, share-of-total, win rate, non-standard ratio, or any `%` metric, first write down:
+   - numerator definition
+   - denominator definition
+   - whether each side needs its own DSL
+ - If you cannot name the denominator from real schema fields and filters, do not use words like `渗透率`, `转化率`, `占比`, `比例`, or `%`
+ - If a field is still ambiguous after `record_schema_get`, do not guess; either select one unique `field_id` from the schema or ask the user to confirm from a short candidate list
+ - If a statement depends on `count`, query `count`
+ - If a statement depends on total amount, query `sum`
+ - If a statement depends on average level, query `avg` or derive it from trusted `sum + count`
+ - If a statement depends on trend, query a time dimension with `bucket`
+ - If a statement depends on a ratio that the DSL cannot express directly, run the numerator and denominator separately, then compute the ratio outside MCP only after both sides are complete and compatible
+ - Rankings must come from structured sorted results, not from loose natural-language restatement
+ - When grouped rows are truncated, describe them as `已返回分组中` or `主要分组`
+ - If `presentation.rows_truncated=true` or `presentation.statement_scope=returned_groups_only`, do not use words like `各部门`, `所有分组`, `完整名单`, or `全部渠道`
+ - If grouped rows are truncated, explicitly downgrade the wording to `前 N 个分组` or `主要分组`, never `全部`
+ - Complex answers should default to structure first, interpretation second (`先结构、后解读`): present the table, metrics, and ordering first, then add concise interpretation
+ - Final wording should stay as close as possible to schema titles, dimension aliases, and metric aliases; do not rename the business object or field title unless the user asked for a rewrite
+
+ ## DSL Contract
+
+ Use `record_schema_get` as the source of truth for every DSL field reference:
+
+ - Use `fields[].field_id` in `dimensions[].field_id`, `metrics[].field_id`, and `filters[].field_id`
+ - Treat `suggested_dimensions`, `suggested_metrics`, and `suggested_time_fields` as hints, not as executable DSL by themselves
+ - Do not pass field titles, aliases, or guessed ids where `field_id` is required
+
+ The `record_analyze` call should be built from this argument shape:
+
+ ```json
+ {
+   "app_key": "APP_1",
+   "dimensions": [],
+   "metrics": [],
+   "filters": [],
+   "sort": [],
+   "limit": 50,
+   "strict_full": true,
+   "view_key": null,
+   "view_name": null,
+   "output_profile": "normal"
+ }
+ ```
+
+ Top-level argument rules:
+
+ - `app_key`: required. The target Qingflow app.
+ - `dimensions`: required list. Use `[]` for whole-table summary. Use one item per grouping dimension for grouped analysis.
+ - `metrics`: optional list. If omitted or empty, `record_analyze` defaults to a single `count` metric.
+ - `filters`: optional list. Filters restrict the analyzed dataset before results are interpreted.
+ - `sort`: optional list. Sorting applies to result rows, not raw source rows.
+ - `limit`: positive integer. It only limits returned result rows; it does not reduce the internal scan scope.
+ - `strict_full`: boolean. Prefer `true` for final conclusions. If `true`, incomplete scans return an error; if `false`, incomplete scans return partial results.
+ - `view_key` / `view_name`: optional. Use a view to narrow scope before analysis. Prefer `view_key` when both are available.
+ - `output_profile`: `normal` or `verbose`. Prefer `normal` unless you are debugging completeness or route issues.
+
+ Item contracts:
+
+ - `dimensions` item:
+   - shape: `{ "field_id": 2, "alias": "状态", "bucket": null }`
+   - `field_id`: required integer from `record_schema_get`
+   - `alias`: optional but recommended; if omitted, the field title becomes the alias
+   - `bucket`: optional; allowed values are `day`, `week`, `month`, `quarter`, `year`, or omitted / `null`
+   - `bucket` may only be used on fields from `suggested_time_fields`
+ - `metrics` item:
+   - shape: `{ "op": "sum", "field_id": 7, "alias": "总金额" }`
+   - `op`: one of `count`, `sum`, `avg`, `min`, `max`, `distinct_count`
+   - `field_id`: required for `sum`, `avg`, `min`, `max`, `distinct_count`; do not pass it for `count`
+   - `alias`: optional but strongly recommended because `sort.by` must reference aliases
+ - `filters` item:
+   - shape: `{ "field_id": 2, "op": "eq", "value": "进行中" }`
+   - `field_id`: required integer from `record_schema_get`
+   - `op`: optional; defaults to `eq`
+   - supported ops: `eq`, `neq`, `in`, `not_in`, `gt`, `gte`, `lt`, `lte`, `between`, `contains`, `is_null`, `not_null`
+   - value rules:
+     - `eq`, `neq`, `gt`, `gte`, `lt`, `lte`, `contains`: pass a single scalar value
+     - `in`, `not_in`: pass an array
+     - `between`: pass a two-item array like `[min, max]`
+     - `is_null`, `not_null`: omit `value`
+ - `sort` item:
+   - shape: `{ "by": "记录数", "order": "desc" }`
+   - `by`: required and must reference an alias already defined in `dimensions` or `metrics`
+   - `order`: optional; use `asc` or `desc`; default is `asc`
+   - do not sort by raw field title or `field_id`
+
+ Practical rules:
+
+ - Keep one DSL focused on one question. Prefer multiple small DSLs over one overloaded request.
+ - Always set explicit aliases for metrics you may sort by, compare, or quote in the final answer.
+ - For trend analysis, use one time dimension with `bucket`, then sort by that time alias ascending.
+ - For cross analysis, use multiple `dimensions` and a small set of metrics.
+ - Do not attempt formulas, joins, having clauses, cohort analysis, or manual paging controls in this DSL.
+ - Do not pass unsupported keys such as `formula`, `expr`, `numerator`, `denominator`, `left`, `right`, or `operator` inside metric items.
+
+ ## Minimal DSL Templates
+
+ Summary:
+
+ ```json
+ {
+   "dimensions": [],
+   "metrics": [
+     { "op": "count", "alias": "记录数" }
+   ],
+   "filters": [],
+   "sort": [],
+   "limit": 1,
+   "strict_full": true
+ }
+ ```
+
+ Single-dimension distribution:
+
+ ```json
+ {
+   "dimensions": [
+     { "field_id": 2, "alias": "状态" }
+   ],
+   "metrics": [
+     { "op": "count", "alias": "记录数" }
+   ],
+   "filters": [],
+   "sort": [
+     { "by": "记录数", "order": "desc" }
+   ],
+   "limit": 50,
+   "strict_full": true
+ }
+ ```
+
+ Time trend:
+
+ ```json
+ {
+   "dimensions": [
+     { "field_id": 3, "alias": "月份", "bucket": "month" }
+   ],
+   "metrics": [
+     { "op": "count", "alias": "记录数" }
+   ],
+   "filters": [],
+   "sort": [
+     { "by": "月份", "order": "asc" }
+   ],
+   "limit": 24,
+   "strict_full": true
+ }
+ ```
+
+ Two-dimensional cross analysis:
+
+ ```json
+ {
+   "dimensions": [
+     { "field_id": 2, "alias": "状态" },
+     { "field_id": 5, "alias": "负责人" }
+   ],
+   "metrics": [
+     { "op": "count", "alias": "记录数" },
+     { "op": "sum", "field_id": 7, "alias": "总金额" }
+   ],
+   "filters": [],
+   "sort": [
+     { "by": "记录数", "order": "desc" }
+   ],
+   "limit": 100,
+   "strict_full": true
+ }
+ ```
+
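The templates in this diff can also be combined with filters; the following sketch (hypothetical `field_id` values, following the filter contract stated earlier) restricts the distribution to two statuses and one explicit month before grouping:

```json
{
  "app_key": "APP_1",
  "dimensions": [
    { "field_id": 5, "alias": "负责人" }
  ],
  "metrics": [
    { "op": "count", "alias": "记录数" }
  ],
  "filters": [
    { "field_id": 2, "op": "in", "value": ["进行中", "已完成"] },
    { "field_id": 3, "op": "between", "value": ["2026-02-01", "2026-02-28"] }
  ],
  "sort": [
    { "by": "记录数", "order": "desc" }
  ],
  "limit": 50,
  "strict_full": true
}
```

Note how `in` takes an array while `between` takes a two-item `[min, max]` array, matching the value rules above.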
  ## Output Gate

- - If `record_query_plan` was not used, downgrade the answer to `初步观察`
- - If `safe_for_final_conclusion=false`, do not present the result as a final conclusion
- - If aggregate/summary is full but list evidence is sample-only, split the answer into:
+ - Only write `全量可信结论` when the supporting `record_analyze` calls report `completeness.status=complete` and `safe_for_final_conclusion=true`
+ - If any key analysis call is incomplete, downgrade the answer to `初步观察` or `部分结果`
+ - Treat `safe_for_final_conclusion=true` as necessary but not sufficient when the metric definition is incomplete or grouped rows are truncated
+ - If `presentation.statement_scope=returned_groups_only`, you may still give full-population conclusions about totals or ratios, but not a full grouped enumeration claim
+ - If aggregate-style output is full but list evidence is sample-only, split the answer into:
  - `全量可信结论`
  - `样本观察(不作为最终结论)`
  - optional `待验证假设`
@@ -1,4 +1,4 @@
  interface:
  display_name: "Qingflow Record Analysis"
- short_description: "Analyze Qingflow record data with plan-first, sample-safe reporting"
- default_prompt: "Use $qingflow-record-analysis for grouped distributions, ratios, rankings, trends, and final statistical conclusions in Qingflow apps. Start with record_query_plan, treat record_query(list) as sample-only when capped, and separate full conclusions from sample observations."
+ short_description: "Analyze Qingflow record data with schema-first DSL execution"
+ default_prompt: "Use $qingflow-record-analysis for grouped distributions, ratios, rankings, trends, and final statistical conclusions in Qingflow apps. Start with record_schema_get, build one or more field_id-based DSLs, then run record_analyze. Treat record_query(query_mode=\"list\") as sample-only when capped, and separate full conclusions from sample observations."
@@ -1,18 +1,29 @@
  # Analysis Gotchas

- ## Do not skip plan
+ ## Do not skip schema

- If the task is analysis-style and you jump straight to `record_query(list)` or `record_aggregate`, you are already off the stable path.
+ If the task is analysis-style and you jump straight to `record_query(query_mode="list")` or `record_analyze`, you are already off the stable path.

  Correct recovery:

- 1. `record_field_resolve` if fields are uncertain
- 2. `record_query_plan`
- 3. follow the recommended chain
+ 1. `record_schema_get`
+ 2. inspect the schema and choose fields
+ 3. build one or more small DSLs
+ 4. run `record_analyze`
+
+ ## Normalize relative time phrases before building the DSL
+
+ Examples:
+
+ - `最近一个完整自然月` -> convert to an explicit full-month date range
+ - `上个月` -> convert to a concrete month range
+ - `最近30天` -> convert to exact start/end dates
+
+ Do not pass vague time phrases or impossible dates into MCP.

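As a worked instance of that conversion, assume the user says `最近30天` and today is 2026-03-10; the phrase becomes an explicit, verified start/end pair before any MCP call (the `field_id` is a hypothetical date field, and the dates are illustrative):

```json
{ "field_id": 3, "op": "between", "value": ["2026-02-08", "2026-03-10"] }
```

Both bounds are concrete calendar dates, so there is no way to accidentally send a vague phrase or an impossible day-of-month.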
  ## Do not treat 200-row list output as full data

- `record_query(list)` can hit:
+ `record_query(query_mode="list")` can hit:

  - `row_cap=200`
  - `row_cap_hit=true`
@@ -29,15 +40,28 @@ It is not acceptable to use that result alone for:
  - 地域分布
  - “基于全部数据”的 business insight

- ## Do not mix full aggregate totals with sample rows
+ ## Do not mix full analyze totals with sample rows

- If summary or aggregate gives full-population coverage, but list rows are capped, do not merge them into one final statement.
+ If `record_analyze` gives full-population coverage, but list rows are capped, do not merge them into one final statement.

  Split them into:

  - `全量可信结论`
  - `样本观察`

+ ## Do not present truncated grouped rows as a full grouped list
+
+ If `presentation.rows_truncated=true` or `presentation.statement_scope=returned_groups_only`:
+
+ - do not say `各部门`
+ - do not say `所有分组`
+ - do not say `完整名单`
+
+ Correct recovery:
+
+ - do not describe the answer as complete grouped coverage
+ - keep the wording inside the returned group scope
+
  ## Do not guess fields under ambiguity

  If the field is uncertain:
@@ -45,24 +69,73 @@ If the field is uncertain:
  - do not bounce across tools
  - do not guess ids
  - do not switch from one read tool to another by trial and error
- - do not keep retrying different field-name variants after ambiguity is already clear
+ - do not keep retrying different guessed field names in a loop

  Correct recovery:

- 1. `record_field_resolve`
+ 1. `record_schema_get`
  2. if several plausible candidates remain, ask the user to confirm from a short list
- 3. `record_query_plan`
+ 3. build the DSL only after the field is clear

  Examples of the right recovery question:

  - “我找到两个可能的字段:`线索来源`、`来源渠道`。你要按哪个字段统计?”
  - “目前最像‘来源’的字段有这三个:`来源`、`来源渠道`、`获客来源`。请确认你要按哪个字段分析。”

- ## Do not override MCP scan recommendations casually
+ ## Do not try to control paging manually
+
+ `record_analyze` hides paging and scan budget on purpose.
+
+ - Do not invent `page_size`
+ - Do not invent `requested_pages`
+ - Do not invent `scan_max_pages`
+ - Do not invent `auto_expand_pages`
+
+ When the result is incomplete:
+
+ 1. narrow the scope with views or filters
+ 2. reduce the analysis problem into smaller DSLs
+ 3. keep the answer at `初步观察` or `部分结果` if completeness is still not enough
+
+ ## Do not guess metric semantics from loose business wording
+
+ Before building the DSL, first decide whether the question needs:
+
+ - `count`
+ - `sum`
+ - `avg`
+ - `distinct_count`
+ - a ratio with numerator + denominator
+ - a sorted ranking result
+
+ Do not jump straight from words like `数量`, `人数`, `单量`, or `金额` to one assumed metric.
+
+ ## Do not hide partial completion
+
+ If the user asked for several outputs and only part of them is stable:
+
+ - say which parts are complete
+ - say which parts are still unresolved
+ - do not present the answer as fully finished
+
+ ## Do not send unsupported formula or div-style metrics into `record_analyze`
+
+ Examples to avoid:
+
+ - `{"op":"div", ...}`
+ - metric items with `formula`, `expr`, `numerator`, or `denominator`
+
+ Correct recovery:
+
+ 1. query the source metrics with separate DSLs
+ 2. confirm both sides are complete and compatible
+ 3. compute the derived ratio outside MCP in the reasoning layer

- If `record_query_plan` returns:
-
- - `estimate.recommended_arguments`
- - `suggested_next_call`
-
- reuse them first. These are safer than hand-tuning `page_size`, `requested_pages`, and `scan_max_pages` from memory.
+ ## Do not call something a ratio without the denominator
+
+ If the user asks for penetration / conversion / 占比:
+
+ 1. define numerator
+ 2. define denominator
+ 3. query both sides explicitly
+ 4. only then compute and report the ratio
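
The numerator/denominator split above can be sketched as two focused DSL bodies (hypothetical `field_id` values and filter values); the ratio itself is computed in the reasoning layer only after both results come back complete:

```json
[
  {
    "dimensions": [],
    "metrics": [ { "op": "count", "alias": "成交数" } ],
    "filters": [ { "field_id": 2, "op": "eq", "value": "已成交" } ],
    "strict_full": true
  },
  {
    "dimensions": [],
    "metrics": [ { "op": "count", "alias": "线索总数" } ],
    "filters": [],
    "strict_full": true
  }
]
```

Here the conversion rate would be 成交数 / 线索总数, derived outside MCP, never as a `div` or `formula` metric inside `record_analyze`.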