npm - @qingflow-tech/qingflow-app-user-mcp - Versions diffs - 1.0.4 → 1.0.6 - Mend

@qingflow-tech/qingflow-app-user-mcp 1.0.4 → 1.0.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/README.md +2 -2
package/package.json +1 -1
package/pyproject.toml +2 -1
package/skills/qingflow-app-user/SKILL.md +2 -1
package/skills/qingflow-app-user/references/data-gotchas.md +8 -4
package/skills/qingflow-app-user/references/public-surface-sync.md +6 -4
package/skills/qingflow-app-user/references/record-patterns.md +14 -4
package/skills/qingflow-record-analysis/SKILL.md +103 -166
package/skills/qingflow-record-analysis/agents/openai.yaml +2 -2
package/skills/qingflow-record-analysis/references/analysis-gotchas.md +56 -110
package/skills/qingflow-record-analysis/references/analysis-patterns.md +106 -119
package/skills/qingflow-record-analysis/references/business-context.md +74 -0
package/skills/qingflow-record-analysis/references/confidence-reporting.md +49 -72
package/skills/qingflow-record-analysis/references/data-access-playbook.md +106 -0
package/skills/qingflow-record-analysis/references/pandas-recipes.md +172 -0
package/skills/qingflow-record-analysis/references/report-format.md +76 -0
package/skills/qingflow-record-insert/SKILL.md +28 -7
package/skills/qingflow-record-update/SKILL.md +1 -1
package/src/qingflow_mcp/backend_client.py +55 -1
package/src/qingflow_mcp/cli/commands/record.py +63 -6
package/src/qingflow_mcp/cli/formatters.py +101 -1
package/src/qingflow_mcp/public_surface.py +2 -1
package/src/qingflow_mcp/response_trim.py +235 -10
package/src/qingflow_mcp/server.py +19 -12
package/src/qingflow_mcp/server_app_user.py +30 -13
package/src/qingflow_mcp/tools/record_tools.py +13425 -8817
package/skills/qingflow-record-analysis/references/dsl-templates.md +0 -93

package/skills/qingflow-record-analysis/references/analysis-gotchas.md CHANGED

@@ -1,145 +1,91 @@
 # Analysis Gotchas
-## Do not skip schema
+## Do Not Skip Schema
-If the task is analysis-style and you jump straight to `record_list` or `record_analyze`, you are already off the stable path.
+Correct path:
-Correct recovery:
+1. `app_get`
+2. `record_browse_schema_get`
+3. `record_access`
+4. Python
-1. `record_browse_schema_get`
-2. inspect the schema and choose fields
-3. build one or more small DSLs
-4. run `record_analyze`
+`record_browse_schema_get` returns readable fields for the selected view. Missing fields are permission or view-scope boundaries, not invitations to guess hidden ids.
-The schema here is applicant-node visible-only. If a field is absent, treat it as not available to the current user rather than switching to guessed ids or builder-side memory.
+## Do Not Use Export For Analysis
-## Normalize relative time phrases before building the DSL.
+Export tools are for user-requested files. Analysis uses `record_access` because it returns structured completeness and compact field metadata.
-Examples:
+## Do Not Treat `record_list` As Full Data
-- `最近一个完整自然月` -> convert to an explicit full-month date range
-- `上个月` -> convert to a concrete month range
-- `最近30天` -> convert to exact start/end dates
+`record_list` is sample/browse only. It can be capped and should not justify:
-Do not pass vague time phrases or impossible dates into MCP.
+- average
+- share
+- ranking
+- trend
+- regional distribution
+- "all data" insights
-## Do not treat 200-row list output as full data
+## Do Not Control Paging
-`record_list` can hit:
+`record_access` owns paging internally.
-- `row_cap=200`
-- `row_cap_hit=true`
-- `sample_only=true`
+Do not invent:
-When this happens, it is sample-only evidence.
+- `page`
+- `page_size`
+- `limit`
+- `requested_pages`
+- `scan_max_pages`
+- `max_rows`
+- `timeout`
-It is not acceptable to use that result alone for:
+## Do Not Print Raw CSV
-- 平均值
-- 占比
-- 排名
-- 趋势
-- 地域分布
-- “基于全部数据”的 business insight
+Read CSV files with pandas. Summarize computed results, not raw rows.
-## Do not mix full analyze totals with sample rows
+## Do Not Rename Source Files
-If `record_analyze` gives full-population coverage, but list rows are capped, do not merge them into one final statement.
+CSV columns are directly readable and field-id anchored: `record_id`, `<字段标题>__field_<id>`. Use those columns directly in pandas.
-Split them into:
+## Do Not Trust Sparse Dimensions
-- `全量可信结论`
-- `样本观察`
+Before final grouping, run a field-quality profile. If the selected field is mostly blank, say so and downgrade the claim.
-## Do not present truncated grouped rows as a full grouped list
+Rules of thumb:
-If `completeness.rows_truncated=true` or `completeness.statement_scope=returned_groups_only`:
+- Overall blank rate above 40%: not a primary conclusion dimension.
+- Any compared period blank rate above 80%: do not use that field for period comparison.
+- A sparse field can support only `已填写样本观察`.
-- do not say `各部门`
-- do not say `所有分组`
-- do not say `完整名单`
+If the user asks for a semantic field such as `板块`, test nearby candidates like product, platform, module, stage, source, owner, or department before concluding.
-Correct recovery:
+## Do Not Hide Incomplete Access
-- do not describe the answer as complete grouped coverage
-- keep the wording inside the returned group scope
+If `needs_scope`, no CSV exists. Ask for a time/business scope.
-## Do not guess fields under ambiguity
+If `partial`, use only subset wording and avoid full-population claims.
-If the field is uncertain:
+If field meaning is ambiguous, ask the user to confirm from a short list.
-- do not bounce across tools
-- do not guess ids
-- do not switch from one read tool to another by trial and error
-- do not keep retrying different guessed field names in a loop
+## Do Not Guess Metrics
-Correct recovery:
+Before fetching data, decide whether the request needs count, sum, average, distinct count, ratio, ranking, trend, or comparison.
-1. `record_browse_schema_get`
-2. if several plausible candidates remain, ask the user to confirm from a short list
-3. build the DSL only after the field is clear
+## Do Not Call A Ratio Without Denominator
-If the intended field is absent from the schema altogether, stop and explain that it is not visible in the current applicant-node permission scope.
+For penetration, conversion, or share:
-Examples of the right recovery question:
-- “我找到两个可能的字段：`线索来源`、`来源渠道`。你要按哪个字段统计？”
-- “目前最像‘来源’的字段有这三个：`来源`、`来源渠道`、`获客来源`。请确认你要按哪个字段分析。”
-## Do not try to control paging manually
-`record_analyze` hides paging and scan budget on purpose.
-- Do not invent `page_size`
-- Do not invent `requested_pages`
-- Do not invent `scan_max_pages`
-- Do not invent `auto_expand_pages`
-When the result is incomplete:
-1. narrow the scope with views or filters
-2. reduce the analysis problem into smaller DSLs
-3. keep the answer at `初步观察` or `部分结果` if completeness is still not enough
-## Do not guess metric semantics from loose business wording
-Before building the DSL, first decide whether the question needs:
-- `count`
-- `sum`
-- `avg`
-- `distinct_count`
-- a ratio with numerator + denominator
-- a sorted ranking result
-Do not jump straight from words like `数量`, `人数`, `单量`, or `金额` to one assumed metric.
-## Do not hide partial completion
-If the user asked for several outputs and only part of them is stable:
-- say which parts are complete
-- say which parts are still unresolved
-- do not present the answer as fully finished
-## Do not send unsupported formula or div-style metrics into `record_analyze`.
-Examples to avoid:
-- `{"op":"div", ...}`
-- metric items with `formula`, `expr`, `numerator`, or `denominator`
-Correct recovery:
-1. query the source metrics with separate DSLs
-2. confirm both sides are complete and compatible
-3. compute the derived ratio outside MCP in the reasoning layer
+1. define numerator
+2. define denominator
+3. query compatible source data
+4. compute in Python
+5. report numerator and denominator
-## Do not call something a ratio without the denominator
+## Normalize Relative Dates
-If the user asks for penetration / conversion / 占比:
+Convert relative phrases into exact ranges before `record_access`.
-1. define numerator
-2. define denominator
-3. query both sides explicitly
-4. only then compute and report the ratio
+- `今年5月` -> exact May 1 to May 31 in the current year
+- `去年同期` -> same date range in previous year
+- `最近30天` -> exact rolling start/end dates
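
Illustrative sketch (not part of the package diff): the "Normalize Relative Dates" rule above could be implemented with the standard library roughly as follows. The helper names, the fixed "today" value, and the choice to count today as day 30 of the rolling window are assumptions made for this example; the package does not specify them.

```python
from datetime import date, timedelta
import calendar


def month_range(year: int, month: int) -> tuple[date, date]:
    """Return the first and last day of a calendar month."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


def last_n_days(today: date, n: int) -> tuple[date, date]:
    """Rolling window of n days ending on `today`, inclusive on both ends."""
    return today - timedelta(days=n - 1), today


def same_period_last_year(start: date, end: date) -> tuple[date, date]:
    """Shift a range back one year; Feb 29 is clamped to Feb 28."""
    def shift(d: date) -> date:
        try:
            return d.replace(year=d.year - 1)
        except ValueError:  # Feb 29 has no counterpart in a non-leap year
            return d.replace(year=d.year - 1, day=28)
    return shift(start), shift(end)


today = date(2026, 6, 15)  # hypothetical current date for the example
may = month_range(today.year, 5)        # 今年5月
print(may)
print(same_period_last_year(*may))      # 去年同期
print(last_n_days(today, 30))           # 最近30天
```

Resolving the phrase to explicit start/end dates before the call keeps the date range auditable in the final report.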

package/skills/qingflow-record-analysis/references/analysis-patterns.md CHANGED

@@ -1,125 +1,112 @@
 # Analysis Patterns
-## When to use this skill
-Use this skill when the user asks for:
-- 分布
-- 占比
-- 平均值
-- 排名 / top-N
-- 趋势
-- 洞察
-- 最终统计结论
-- 全量范围内的 business summary
-## Canonical analysis sequence
-1. `record_browse_schema_get`
-2. decide whether the question needs `count`, `sum`, `avg`, `distinct_count`, `ratio`, or `ranking`
-3. build one or more field_id-based DSLs
-4. `record_analyze`
-5. `record_list` only for sample inspection
-Result reading order:
-1. `result.rows`
-2. `result.totals.metric_totals`
-3. `ranking`
-4. `ratios`
-5. `completeness`
-6. `presentation`
-Treat `record_browse_schema_get` as the browse-schema source of truth. Missing fields are permission boundaries, not invitations to guess hidden ids.
-## Distribution / ratio pattern
-1. Run `record_browse_schema_get`
-2. Inspect candidate fields and aliases
-3. If several plausible candidates remain, stop and ask the user to confirm the field from a short list
-4. Build a DSL with:
-   - one dimension
-   - `count`
-   - sort by the count alias
-5. Run `record_analyze`
-6. Report:
-   - `result.totals.metric_totals`
-   - `safe_for_final_conclusion`
-   - `completeness.statement_scope`
-   - `completeness.warnings`
-7. If grouped rows are truncated, describe the answer as `主要分组` or `已返回分组中`, not `各部门` or `全部`
-## penetration / conversion / share-of-total pattern
-1. Run `record_browse_schema_get`
-2. Write down the business definition in plain language:
-   - numerator
-   - denominator
-   - grouping dimension, if any
-3. Build separate DSLs when numerator and denominator are not the same filtered population
-4. Query the numerator first
-5. Query the denominator second
-6. Only compute the ratio outside MCP after both source results are complete and use compatible scopes
-7. If the denominator is missing, do not call the output `渗透率`, `转化率`, `占比`, or `%`
-## Average / ranking pattern
-1. Run `record_browse_schema_get`
-2. Choose one dimension field and one numeric metric field
-3. Build a DSL with:
-   - `dimensions=[...]`
-   - `metrics=[count,sum]` or `metrics=[count,avg,min,max]`
-4. Run `record_analyze`
-5. If the answer uses ranking language, make the ranking come from structured sorted results
-6. Prefer the structured `ranking` block when it exists instead of inferring order from loose text
-7. Use list mode only to inspect examples after the aggregate result is understood
-## Trend pattern
-1. Run `record_browse_schema_get`
-2. Choose a date/time field from `suggested_time_fields`
-3. Build a DSL with `bucket=day|week|month|quarter|year`
-4. Run `record_analyze`
-5. Treat the result as final only if `safe_for_final_conclusion=true`
-6. If the user asked for a relative time phrase such as `最近一个完整自然月`, translate it into an explicit legal date range before building the DSL
-## Sample inspection pattern
-Only use `record_list` after schema/analyze when you need:
-- example rows
-- spot checks
-- representative samples
-- manual inspection of records behind an aggregate bucket
-Never use list mode alone to justify final averages, shares, rankings, or regional distribution claims.
-## Statement-to-query discipline
-- If you want to say `单量低` or `volume low`, query `count`
-- If you want to say `金额高`, query `sum`
-- If you want to say `客单价高`, query `avg` or trusted `sum + count`
-- If you want to say `增长` or `下降`, query a time bucket
-- If you want to say `渗透率` or `占比`, query both numerator and denominator
-- If you want to say `各部门` / `全部渠道` / `完整名单`, make sure `completeness.statement_scope=full_population` and `completeness.rows_truncated=false`
-- If you want to say `Top N` or `排名`, make sure the result is explicitly sorted and the conclusion follows that returned order
-- If the task is complex, default to `先结构、后解读`
-## Ambiguous field recovery
-If the user asks for something like “来源分布” or “类型占比” and the exact field is unclear:
-1. run `record_browse_schema_get`
-2. inspect titles, aliases, and suggested fields
-3. if one candidate is clearly dominant, proceed
-4. if multiple candidates are still plausible, ask the user to confirm which field they want
+## Canonical Sequence
+1. `app_get`
+2. `record_browse_schema_get`
+3. decide metric intent
+4. choose `record_access.columns / where / order_by`
+5. `record_access`
+6. Python over every returned CSV shard
+7. optional `record_list` or `record_get` only for sample/detail verification
+Metric intent must be one of:
+- `count`
+- `sum`
+- `avg`
+- `distinct_count`
+- ratio with numerator and denominator
+- sorted ranking
+- time trend
+- period comparison
+## Distribution
+1. Fetch grouping field and filter fields.
+2. Run the field-quality profile for the grouping field.
+3. If the field passes quality gates, group by the readable field-id anchored column such as `项目状态__field_343283094`.
+4. Count rows and calculate share from the sum of counts.
+5. Report top groups plus total row count.
+If the grouping field is ambiguous, ask the user to choose from a short candidate list.
+## Dimension Selection
+When the user asks for a semantic bucket such as `板块`, `模块`, `业务线`, or `来源`, inspect candidate fields and choose the most reliable one:
+1. Match schema titles to the user's wording.
+2. Fetch candidate fields together if they are cheap.
+3. Profile `blank_rate`, period coverage, and distinct count.
+4. Prefer the candidate with clear semantics and usable coverage.
+5. If the literal field is sparse, downgrade it to `已填写样本观察` and use the nearest reliable fallback for the main conclusion.
+Example: if `缺陷所属模块` is mostly empty but `缺陷所属平台` and `所属产品` are complete, use platform/product for the main conclusion and state that module-level analysis is limited.
+Quality gates:
+- Overall `blank_rate > 0.4`: not a primary conclusion dimension.
+- Any compared period `blank_rate > 0.8`: not valid for period comparison.
+- High-cardinality description/id fields are not dimensions unless the user explicitly asks for record-level ranking.
+## Ratio / Conversion / Penetration
+1. Define numerator and denominator in plain language.
+2. Fetch both populations with compatible scope.
+3. Compute ratio in Python.
+4. Report `numerator / denominator = percentage`.
-Do not keep retrying different guessed field names in a loop.
+If denominator is missing or scope differs, do not call the result a rate.
-## Partial completion discipline
+## Average / Sum
-If the user asked for several conclusions and only some of them are fully supported:
+1. Fetch grouping field and numeric metric field.
+2. Convert the metric column with `pd.to_numeric(errors="coerce")`.
+3. Report count, sum, and average together when useful.
+4. State how blanks/non-numeric values were handled if material.
+## Ranking
+1. Build the metric in Python.
+2. Sort explicitly.
+3. Report Top N with metric values.
+4. Do not infer ranking from unsorted sample rows.
+## Trend
+1. Choose a date/time field from `suggested_time_fields`.
+2. Convert relative phrases into exact date ranges.
+3. Fetch the date field and metrics.
+4. Bucket in pandas by day/week/month/quarter/year.
+5. Report both absolute values and changes.
+## Same-Period Comparison
+For `今年5月 vs 去年5月`:
+1. Use the same date field for both periods.
+2. Fetch the full combined date range or two separate compatible ranges.
+3. Apply identical business filters.
+4. Compute absolute delta and percentage delta.
+5. State both periods explicitly.
+## Sample Inspection
+Use `record_list` only after the aggregate result is complete, and only for:
+- representative examples
+- checking surprising categories
+- manually inspecting records behind a bucket
+Never use `record_list` alone for final averages, shares, rankings, trends, or distributions.
+## Ambiguous Field Recovery
+If the exact field is unclear:
+1. inspect `record_browse_schema_get.fields`
+2. use titles and suggested fields
+3. if one candidate is clearly dominant, proceed
+4. otherwise ask the user to confirm
-1. state which parts are complete
-2. state which parts are still unresolved
-3. do not present the answer as fully complete
+Do not retry tools with guessed field names.
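
Illustrative sketch (not part of the package diff): the field-quality gates and the Distribution pattern described above might look like this in pandas. The helper names and the sample rows are invented for the example, and treating whitespace-only strings as blank is an assumption; the 0.4 and 0.8 thresholds and the column name `项目状态__field_343283094` come from the text.

```python
from typing import Optional

import pandas as pd


def blank_rate(series: pd.Series) -> float:
    """Share of rows that are missing or whitespace-only."""
    as_text = series.astype("string").str.strip()
    return float(as_text.fillna("").eq("").mean())


def passes_quality_gates(
    df: pd.DataFrame, column: str, period_column: Optional[str] = None
) -> bool:
    """Apply the overall (> 0.4) and per-period (> 0.8) blank-rate gates."""
    if blank_rate(df[column]) > 0.4:
        return False
    if period_column is not None:
        per_period = df.groupby(period_column)[column].apply(blank_rate)
        if (per_period > 0.8).any():
            return False
    return True


def distribution(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Count rows per group and derive share from the sum of counts."""
    out = df[column].value_counts().rename("count").to_frame()  # NaN dropped
    out["share"] = out["count"] / out["count"].sum()
    return out


# Hypothetical rows standing in for a CSV shard.
col = "项目状态__field_343283094"
df = pd.DataFrame({
    "record_id": [1, 2, 3, 4, 5],
    col: ["进行中", "进行中", "已完成", None, "已完成"],
})

if passes_quality_gates(df, col):
    print(distribution(df, col))
    print("total rows:", len(df))
```

Running the gate before grouping means a sparse field is caught before it shapes the main conclusion.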

package/skills/qingflow-record-analysis/references/business-context.md ADDED

@@ -0,0 +1,74 @@
+# Business Context
+Use this when analysis depends on organization, aliases, ownership, stage semantics, or user-provided business definitions.
+## When To Check Context
+Check for business context when the request mentions:
+- department / team / group / region
+- owner / assignee / sales rep / partner
+- stage / status / funnel / conversion
+- product line / business line
+- same-period comparison
+- "北斗部门", "SMB", "伙伴", or any named internal scope
+## Mapping Rules
+Use explicit mappings in this order:
+1. the user's message in the current thread
+2. attached or local business context files
+3. schema-visible fields and sample records
+4. short clarification to the user
+Do not infer hidden org hierarchy from memory. If the mapping changes the denominator or grouping, state it in the final answer.
+Example:
+```python
+dept_map = {
+    "烈焰组": "北斗部门",
+    "飓风组": "北斗部门",
+}
+df["部门口径"] = df["field_40"].replace(dept_map)
+```
+Final wording:
+```text
+部门口径：将「烈焰组」「飓风组」合并计入「北斗部门」。
+```
+## Ratio Definitions
+Before computing rates, define:
+- numerator
+- denominator
+- time range
+- grouping dimension
+- exclusions
+If any part is ambiguous, ask. Do not rename a count as a rate.
+## Time Scope
+Normalize relative dates to exact dates before calling `record_access`.
+Examples:
+- `今年5月` -> `2026-05-01` to `2026-05-31` when current year is 2026
+- `去年同期` -> same month range in the previous year
+- `最近一个完整自然月` -> previous calendar month, not the last 30 days
+## Cross-App Reconciliation
+If the analysis needs multiple apps:
+1. run the standard sequence per app
+2. keep each dataset's scope and completeness separately
+3. join in Python only on explicit ids or trusted business keys
+4. disclose join keys and unmatched records
+If no reliable key exists, report the gap instead of forcing a join.
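
Illustrative sketch (not part of the package diff): the Cross-App Reconciliation steps above could be approximated with `pandas.DataFrame.merge`, using `indicator=True` to surface unmatched records. The two frames, the `customer_id` join key, and the other column names are hypothetical stand-ins for data fetched from two apps.

```python
import pandas as pd

# Hypothetical datasets, one per app, each fetched with its own scope.
orders = pd.DataFrame({
    "customer_id": ["C1", "C2", "C3"],
    "amount": [100, 250, 80],
})
customers = pd.DataFrame({
    "customer_id": ["C1", "C2", "C4"],
    "region": ["East", "West", "North"],
})

# Outer join on an explicit key; validate raises if the customer key
# is duplicated on the right side, which would silently inflate totals.
joined = orders.merge(
    customers,
    on="customer_id",
    how="outer",
    indicator=True,
    validate="many_to_one",
)

matched = joined[joined["_merge"] == "both"]
unmatched = joined[joined["_merge"] != "both"]

print("join key: customer_id")
print("matched rows:", len(matched))
print("unmatched rows:", len(unmatched))
print(unmatched[["customer_id", "_merge"]])
```

Keeping the `_merge` column around makes it straightforward to disclose the join key and the unmatched records in the final answer.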

package/skills/qingflow-record-analysis/references/confidence-reporting.md CHANGED

@@ -1,92 +1,69 @@
 # Confidence Reporting
-## Required output structure
+## Full Conclusion Gate
-When analysis is intended as a final answer, use this order:
-1. `全量可信结论`
-2. `样本观察`
-3. `待验证假设`
-## Full conclusion gate
-Only write `全量可信结论` when:
+Use `全量可信结论` only when:
 - `record_browse_schema_get` was used
-- the analysis path used one or more `record_analyze` calls
-- every key analysis result has `safe_for_final_conclusion=true`
-- `safe_for_final_conclusion=true is necessary but not sufficient`
-- no key result depends on an invalid time phrase, an undefined denominator, or an unsupported derived metric
-- the result is not just a capped list sample
-## Sample observation gate
-Put evidence into `样本观察` when:
-- it came from `record_list`
-- the tool reports `row_cap_hit`
-- the tool reports `sample_only`
-- the result is compact/capped and not complete
-## Downgrade rule
-If `record_browse_schema_get` was not used for an analysis task, downgrade the overall framing to `初步观察` instead of `洞察` or `结论`.
-## Anti-mixing rule
-Do not combine:
-- full totals from `record_analyze`
-- sample-only details from `record_list`
-into one sentence like “基于全部数据分析...”.
-Instead:
-- full totals and distributions go into `全量可信结论`
-- illustrative rows go into `样本观察`
-## Semantic gate
+- data came from `record_access`
+- all returned CSV shards were read in Python
+- `record_access.complete=true`
+- `record_access.truncated=false`
+- `record_access.safe_for_final_conclusion=true`
+- metric definitions are complete
+- denominator exists for every ratio
+- time fields and date ranges are explicit
+- primary grouping dimensions pass field-quality gates
-Even when `safe_for_final_conclusion=true`, do not overstate the answer if:
+## Initial Observation Gate
-- the metric definition is incomplete
-- the denominator was not queried
-- the conclusion mentions trend but no time bucket was queried
-- the conclusion mentions单量/volume but no `count` metric was queried
-- the conclusion depends on a derived metric the DSL cannot natively express
-- `completeness.statement_scope=returned_groups_only`
-- `completeness.rows_truncated=true`
+Use `初步观察` when:
-## Grouped enumeration gate
+- `record_access.status=needs_scope`
+- `record_access.status=partial`
+- `record_access.complete=false`
+- `record_access.truncated=true`
+- `record_access.safe_for_final_conclusion=false`
+- evidence came from `record_list`
+- scope or saved view filter is unverified
-If grouped rows were truncated:
+## Anti-Mixing Rule
-- do not call the grouped output `各部门`, `全部渠道`, `完整名单`, or `所有分组`
-- use `已返回分组中`, `主要分组`, or `前 N 个分组`
-- keep full-population statements only for metrics that still cover the full analyzed population
+Do not combine full CSV-derived totals and sample-only rows in one sentence.
-## Partial completion disclosure
+Correct split:
-If the user asked for multiple conclusions but only some are complete:
+- full totals/distributions: `全量可信结论`
+- illustrative examples: `样本观察`
-- explicitly disclose which parts are complete
-- explicitly disclose which parts are not yet complete
-- do not collapse the answer into one all-clear conclusion
+## Semantic Gate
-## Example skeleton
+Even with `safe_for_final_conclusion=true`, downgrade if:
-### 全量可信结论
+- metric definition is incomplete
+- denominator was not queried
+- conclusion mentions trend but no time field was used
+- conclusion mentions volume but no count was computed
+- grouping depends on unconfirmed business aliases
+- custom view scope is not verified
+- primary grouping field has high missingness or poor period coverage
-- `result.totals.metric_totals=...`
-- `safe_for_final_conclusion=true`
-- 这里写最终业务结论
+## Partial Disclosure
-### 样本观察
+If only part of the user request is complete:
-- 以下来自样本明细浏览，不作为最终统计结论
-- 这里写代表性样本现象
+- say which parts are complete
+- say which parts are unresolved
+- do not collapse into one all-clear conclusion
-### 待验证假设
+## Compact Disclosure Template
-- 这里写还需要进一步验证的推测
+```text
+可信度：全量可信 / 初步观察
+数据完整性：complete=..., truncated=..., safe_for_final_conclusion=...
+字段质量：primary dimension blank_rate=..., period coverage=...
+取数字段：...
+时间范围：...
+业务口径：...
+限制：...
+```
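
Illustrative sketch (not part of the package diff): the Full Conclusion Gate and Initial Observation Gate above can be read as a single predicate. The flag names `status`, `complete`, `truncated`, and `safe_for_final_conclusion` come from the text, but representing the `record_access` result as a flat dict, the function name, and the two keyword arguments are assumptions made for this example.

```python
def confidence_label(
    access: dict,
    *,
    ratios_have_denominator: bool = True,
    dimensions_pass_quality: bool = True,
) -> str:
    """Return the report label implied by the access flags and semantic checks."""
    full = (
        access.get("status") not in {"needs_scope", "partial"}
        and access.get("complete") is True
        and access.get("truncated") is False
        and access.get("safe_for_final_conclusion") is True
        and ratios_have_denominator
        and dimensions_pass_quality
    )
    return "全量可信结论" if full else "初步观察"


# Hypothetical access results.
complete_access = {
    "status": "ok",  # assumed value; only needs_scope and partial are named in the text
    "complete": True,
    "truncated": False,
    "safe_for_final_conclusion": True,
}
partial_access = {**complete_access, "status": "partial", "complete": False}

print(confidence_label(complete_access))
print(confidence_label(partial_access))
print(confidence_label(complete_access, dimensions_pass_quality=False))
```

A missing flag is treated as a failed gate, so an unexpected response shape downgrades the label rather than upgrading it.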