mustflow 2.103.3 → 2.103.10

Files changed (38)
  1. package/dist/cli/commands/run.js +11 -0
  2. package/dist/cli/i18n/en.js +2 -0
  3. package/dist/cli/i18n/es.js +2 -0
  4. package/dist/cli/i18n/fr.js +2 -0
  5. package/dist/cli/i18n/hi.js +2 -0
  6. package/dist/cli/i18n/ko.js +2 -0
  7. package/dist/cli/i18n/zh.js +2 -0
  8. package/dist/cli/lib/external-skill-import.js +78 -14
  9. package/dist/cli/lib/local-index/sql.js +9 -1
  10. package/dist/cli/lib/run-plan.js +37 -0
  11. package/dist/core/change-impact.js +16 -0
  12. package/dist/core/code-outline.js +3 -13
  13. package/dist/core/config-chain.js +3 -13
  14. package/dist/core/dependency-graph.js +3 -13
  15. package/dist/core/docs-link-integrity.js +23 -4
  16. package/dist/core/env-contract.js +3 -13
  17. package/dist/core/export-diff.js +3 -3
  18. package/dist/core/ignored-directories.js +40 -0
  19. package/dist/core/reference-drift.js +4 -2
  20. package/dist/core/related-files.js +3 -13
  21. package/dist/core/repo-merge-conflict-scan.js +3 -9
  22. package/dist/core/route-outline.js +3 -13
  23. package/dist/core/script-pack-suggestions.js +23 -12
  24. package/dist/core/secret-risk-scan.js +3 -13
  25. package/dist/core/skill-route-resolution.js +21 -1
  26. package/package.json +2 -2
  27. package/schemas/link-integrity-report.schema.json +1 -0
  28. package/schemas/reference-drift-report.schema.json +1 -0
  29. package/templates/default/i18n.toml +19 -7
  30. package/templates/default/locales/en/.mustflow/skills/ai-generated-code-hardening/SKILL.md +30 -7
  31. package/templates/default/locales/en/.mustflow/skills/api-request-performance-review/SKILL.md +12 -6
  32. package/templates/default/locales/en/.mustflow/skills/completion-evidence-gate/SKILL.md +20 -9
  33. package/templates/default/locales/en/.mustflow/skills/hot-path-performance-review/SKILL.md +20 -15
  34. package/templates/default/locales/en/.mustflow/skills/next-action-menu/SKILL.md +22 -7
  35. package/templates/default/locales/en/.mustflow/skills/quadratic-scan-review/SKILL.md +21 -19
  36. package/templates/default/locales/en/.mustflow/skills/react-code-change/SKILL.md +54 -8
  37. package/templates/default/locales/en/.mustflow/skills/vertical-slice-tdd/SKILL.md +22 -8
  38. package/templates/default/manifest.toml +1 -1
@@ -2,11 +2,11 @@
2
2
  mustflow_doc: skill.hot-path-performance-review
3
3
  locale: en
4
4
  canonical: true
5
- revision: 1
5
+ revision: 2
6
6
  lifecycle: mustflow-owned
7
7
  authority: procedure
8
8
  name: hot-path-performance-review
9
- description: Apply this skill when code is created, changed, reviewed, or reported and the main performance risk is ordinary work repeated many times, such as repeated I/O, repeated scans, hidden quadratic lookup, per-item allocation, lock hold time, sequential async waits, unbounded fan-out, or missing observability for hot paths.
9
+ description: Apply this skill when code is created, changed, reviewed, or reported and the main performance risk is ordinary work repeated many times, such as repeated I/O, repeated scans, hidden quadratic lookup, allocation or GC churn, per-item parsing or serialization, lock hold time, sequential async waits, unbounded fan-out, or missing observability for hot paths.
10
10
  metadata:
11
11
  mustflow_schema: "1"
12
12
  mustflow_kind: procedure
@@ -54,7 +54,7 @@ The review question is not only "which line looks slow?" It is "how often does t
54
54
 
55
55
  - Hot path: the request, loop, render, job, export, import, queue consumer, sync, report, or command path under review.
56
56
  - Multipliers: requests, rows, items, files, users, tenants, retries, pages, renders, workers, queue messages, shards, or nested loops that multiply the work.
57
- - Per-iteration cost: external calls, queries, filesystem reads, allocations, clones, DTO conversions, JSON parse/stringify, logging, formatting, regex, sorting, hashing, image or crypto work, and lock hold time.
57
+ - Per-iteration cost: external calls, queries, filesystem reads, temporary arrays, object spreads, array spreads, concat copies, clones, DTO conversions, JSON parse/stringify, string splitting, logging, formatting, regex, sorting, hashing, image or crypto work, and lock hold time.
58
58
  - Boundary ledger: DB, network, cache, filesystem, IPC, provider SDK, queue, logger, metrics sink, transaction, pool, mutex, thread, goroutine, task, or UI main thread crossed by the path.
59
59
  - Data-size and tail-latency evidence when available: p50, p95, p99, row count, payload size, allocation count, query count, round-trip count, queue depth, pool wait, lock wait, cache hit rate, retry count, or timeout behavior.
60
60
  - Correctness boundaries: order, duplicates, idempotency, authorization, tenant isolation, consistency, partial failure, stale data, cancellation, retry semantics, and error behavior.
@@ -100,22 +100,27 @@ The review question is not only "which line looks slow?" It is "how often does t
100
100
  10. Reuse expensive clients and sessions. Per-request or per-item HTTP clients, DB clients, ORM clients, SDK clients, connection pools, TLS handshakes, regexes, date formatters, and thread pools are performance traps unless the API requires that lifecycle.
101
101
  11. Check cache honesty. A cache needs a bounded key space, invalidation or TTL, max size, authorization dimensions, negative-cache policy, stale behavior, and cache stampede protection such as locking, singleflight, early refresh, or request coalescing.
102
102
  12. Check logging and telemetry in hot paths. Repeated debug logs, eager log-string creation, whole-object serialization, high-cardinality metrics, and JSON formatting for discarded logs can dominate CPU and I/O during incidents.
103
- 13. Check string, JSON, DTO, and clone churn. Repeated string concatenation, `JSON.parse(JSON.stringify(...))`, `cloneDeep`, broad object spread, deep copy, repeated DTO-to-DTO conversion, and repeated serialization can move the bottleneck into "clean" mapping code.
104
- 14. Check large value passing and materialization. In value-copy languages or APIs, large structs, arrays, buffers, spread copies, full file reads, full JSON loads, and eager `collect` calls can turn neat code into memory traffic.
105
- 15. Check regex, parsing, formatting, and locale work. Nested or ambiguous regexes, repeated date parsing, timezone conversion, numeric or locale formatting, and per-row formatter creation should be reviewed with worst-case input in mind.
106
- 16. Check CPU-heavy work in request or UI paths. Image resizing, compression, encryption, hashing, diffing, report generation, spreadsheet export, and search indexing may need batching, worker offload, queueing, or streaming, but only with clear backpressure and failure behavior.
107
- 17. Check queues and workers. Moving work to a queue only moves the bottleneck unless consumers batch DB writes, bulk external calls where safe, bound retries, apply jitter, define poison-message handling, and expose backlog.
108
- 18. Check retry and timeout multiplication. A request with several calls, long timeouts, and several retries can become a tail-latency monster. Count worst-case wait and verify idempotency before adding more attempts.
109
- 19. Review tail behavior, not just average. p50 can look fine while p95 or p99 holds locks, connections, workers, or thread-pool slots long enough to hurt everyone else.
110
- 20. Add observability before large optimization when evidence is missing. Prefer query count, external-call count, payload bytes, allocation count, cache hit rate, queue backlog, pool wait, lock wait, retry count, and span timing over guessing.
111
- 21. Rank the likely payoff. Usually fix repeated external round trips, N+1 access, hidden quadratic scans, overfetching, wide transactions, lock hold time, unbounded fan-out, and missing timeouts before micro-optimizing arithmetic.
112
- 22. Label evidence honestly. If there is no configured benchmark or production trace, report the finding as static complexity or hot-path risk, not measured speedup.
103
+ 13. Check allocation and GC churn before micro-optimizing arithmetic.
104
+ - `filter().map().reduce()`, `flatMap`, `Object.values`, `Object.entries`, `split().map(trim)`, `slice`, and `sort` chains can allocate large temporary arrays.
105
+ - Spread accumulation, `concat` in loops, repeated object spread while building indexes, and `cloneDeep` can copy growing data many times.
106
+ - `JSON.stringify` or `JSON.parse(JSON.stringify(...))` used for comparison, cloning, cache keys, or logging can dominate CPU and allocation while losing type semantics.
107
+ - Repeated `RegExp`, `Date`, `Intl`, formatter, `Set`, or `Map` construction inside hot loops should move outside the loop or become request-scoped only when ownership and memory bounds are clear.
108
+ 14. Check string, JSON, DTO, and clone churn. Repeated string concatenation, `JSON.parse(JSON.stringify(...))`, `cloneDeep`, broad object spread, deep copy, repeated DTO-to-DTO conversion, and repeated serialization can move the bottleneck into "clean" mapping code.
109
+ 15. Check large value passing and materialization. In value-copy languages or APIs, large structs, arrays, buffers, spread copies, full file reads, full JSON loads, all-pages accumulation, and eager `collect` calls can turn neat code into memory traffic.
110
+ 16. Check regex, parsing, formatting, and locale work. Nested or ambiguous regexes, repeated date parsing, timezone conversion, numeric or locale formatting, and per-row formatter creation should be reviewed with worst-case input in mind.
111
+ 17. Check CPU-heavy work in request or UI paths. Image resizing, compression, encryption, hashing, diffing, report generation, spreadsheet export, and search indexing may need batching, worker offload, queueing, or streaming, but only with clear backpressure and failure behavior.
112
+ 18. Check queues and workers. Moving work to a queue only moves the bottleneck unless consumers batch DB writes, bulk external calls where safe, bound retries, apply jitter, define poison-message handling, and expose backlog.
113
+ 19. Check retry and timeout multiplication. A request with several calls, long timeouts, and several retries can become a tail-latency monster. Count worst-case wait and verify idempotency before adding more attempts.
114
+ 20. Review tail behavior, not just average. p50 can look fine while p95 or p99 holds locks, connections, workers, or thread-pool slots long enough to hurt everyone else.
115
+ 21. Add observability before large optimization when evidence is missing. Prefer query count, external-call count, payload bytes, allocation count, heap growth, GC pause, event-loop delay, cache hit rate, queue backlog, queue wait, pool wait, lock wait, retry count, and span timing over guessing.
116
+ 22. Rank the likely payoff. Usually fix repeated external round trips, N+1 access, hidden quadratic scans, overfetching, wide transactions, lock hold time, allocation churn, unbounded fan-out, and missing timeouts before micro-optimizing arithmetic.
117
+ 23. Label evidence honestly. If there is no configured benchmark or production trace, report the finding as static complexity or hot-path risk, not measured speedup.
113
118
 
114
119
  <!-- mustflow-section: postconditions -->
115
120
  ## Postconditions
116
121
 
117
122
  - Hot path, cost multipliers, data size, round-trip count, wait points, and copy or allocation points are explicit.
118
- - N+1 queries, repeated external calls, hidden quadratic scans, unbounded materialization, sequential waits, unbounded fan-out, per-item client creation, broad logging, repeated serialization, and lock or transaction hold time are fixed or reported.
123
+ - N+1 queries, repeated external calls, hidden quadratic scans, unbounded materialization, temporary-array chains, spread or concat copy accumulation, sequential waits, unbounded fan-out, per-item client creation, broad logging, repeated parsing or serialization, allocation churn, and lock or transaction hold time are fixed or reported.
119
124
  - Cache, queue, retry, timeout, batching, bulk-write, concurrency, pagination, projection, index-fit, and observability behavior are explicit where relevant.
120
125
  - Correctness, authorization, tenant isolation, ordering, duplicates, partial failure, cancellation, and stale-data behavior remain intact or are called out as tradeoffs.
121
126
  - Performance claims are backed by configured evidence or labeled as static review risk.
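
Reviewer note: the allocation and GC churn step added in this revision can be illustrated with a minimal sketch. The `Row` shape and both function names are hypothetical and do not come from the package. The sketch only contrasts spread accumulation plus per-item formatter construction against a single-pass build; it does not claim a measured speedup.

```typescript
// Hypothetical row shape, used only for this illustration.
interface Row {
  id: string;
  amount: number;
}

// Churn-heavy shape: every iteration copies the whole accumulator object
// and constructs a new formatter.
function indexWithChurn(rows: Row[]): Record<string, string> {
  return rows.reduce<Record<string, string>>((acc, row) => {
    const fmt = new Intl.NumberFormat("en-US", { minimumFractionDigits: 2 });
    return { ...acc, [row.id]: fmt.format(row.amount) };
  }, {});
}

// Lower-churn shape: one formatter, one Map, one pass, no intermediate copies.
function indexInOnePass(rows: Row[]): Map<string, string> {
  const fmt = new Intl.NumberFormat("en-US", { minimumFractionDigits: 2 });
  const byId = new Map<string, string>();
  for (const row of rows) {
    // Last write wins for duplicate ids, matching the spread version.
    byId.set(row.id, fmt.format(row.amount));
  }
  return byId;
}
```

Both functions produce the same id-to-text mapping. Whether the difference matters depends on row count, which is why the skill asks for the finding to be labeled as static review risk unless a configured benchmark exists.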
@@ -151,7 +156,7 @@ Use the narrowest configured test, build, docs, release, or mustflow intent that
151
156
  - Hot path reviewed
152
157
  - Cost ledger: iteration count, data size, round trips, wait time, copy or allocation count
153
158
  - Repeated external access, N+1, hidden quadratic scans, and multi-pass collection findings
154
- - DB, pagination, index-fit, transaction, lock, async, client reuse, cache, queue, retry, timeout, logging, serialization, clone, regex, parsing, formatting, and CPU-heavy work checked where relevant
159
+ - DB, pagination, index-fit, transaction, lock, async, client reuse, cache, queue, retry, timeout, logging, temporary arrays, spread or concat accumulation, serialization, clone, regex, parsing, formatting, allocation, GC, and CPU-heavy work checked where relevant
155
160
  - Optimization or review recommendation
156
161
  - Evidence level: measured, configured-test evidence, static complexity risk, manual-only, missing, or not applicable
157
162
  - Command intents run
@@ -2,11 +2,11 @@
2
2
  mustflow_doc: skill.next-action-menu
3
3
  locale: en
4
4
  canonical: true
5
- revision: 1
5
+ revision: 3
6
6
  lifecycle: mustflow-owned
7
7
  authority: procedure
8
8
  name: next-action-menu
9
- description: Apply this skill when a final report, completion note, repository improvement loop, or follow-up workflow should offer a bounded numbered next-action menu that a user can select with a single digit in the next turn.
9
+ description: Apply this skill when a final report, completion note, repository improvement loop, or follow-up workflow should offer a bounded numbered next-action menu that a user can select with a single digit in the next turn. Use especially after non-trivial completed or paused work, commits, pushes, release or deploy preparation, verification, or remaining approval gates when concrete follow-up actions exist.
10
10
  metadata:
11
11
  mustflow_schema: "1"
12
12
  mustflow_kind: procedure
@@ -36,6 +36,8 @@ scope, approval, verification, command contracts, release gates, or safety rules
36
36
 
37
37
  - A final report, completion note, handoff, or repository improvement cycle has one or more useful
38
38
  follow-up tasks.
39
+ - A non-trivial task is being reported after changed files, a created commit, completed verification,
40
+ push readiness, release or deploy preparation, paused work, or another concrete approval gate.
39
41
  - The user repeatedly asks for "next recommended work", "continue", "proceed", or selects follow-up
40
42
  items after previous completion reports.
41
43
  - The agent needs to present a bounded backlog that can be selected by a single digit in the next
@@ -48,9 +50,10 @@ scope, approval, verification, command contracts, release gates, or safety rules
48
50
  - The current answer is a tiny direct response with no meaningful follow-up.
49
51
  - There are no evidence-backed next actions, or all plausible next actions are speculative.
50
52
  - The user asked not to include recommendations, menus, or follow-up prompts.
51
- - The next action requires a blocking product, security, privacy, legal, release, migration,
52
- destructive, dependency, credential, deployment, or payment decision that has not been authorized.
53
- Report the decision gate instead of offering it as a one-digit action.
53
+ - The only possible next action requires a blocking product, security, privacy, legal, release,
54
+ migration, destructive, dependency, credential, deployment, or payment decision that has not been
55
+ authorized and there is no safe bounded action to describe. Report the decision gate instead of
56
+ offering it as a one-digit action.
54
57
  - Another interface already owns selection state and has a stricter picker, ticket, or work-order
55
58
  contract.
56
59
 
@@ -89,8 +92,11 @@ scope, approval, verification, command contracts, release gates, or safety rules
89
92
  <!-- mustflow-section: procedure -->
90
93
  ## Procedure
91
94
 
92
- 1. Decide whether a menu is useful.
93
- - Include a menu only when at least one concrete follow-up task is valuable.
95
+ 1. Decide whether a menu is useful or required.
96
+ - Include a menu when at least one concrete follow-up task is valuable.
97
+ - For non-trivial completion reports, commits, completed verification, push readiness, release or
98
+ deploy preparation, paused work, or unresolved approval gates, treat the menu as required when
99
+ any concrete next action exists.
94
100
  - Do not fabricate filler items to reach a fixed row count.
95
101
  2. Build at most nine items.
96
102
  - Use digits `1` through `9`.
@@ -108,6 +114,13 @@ scope, approval, verification, command contracts, release gates, or safety rules
108
114
  the host format allows it.
109
115
  - Use four columns: number, next task title, description, and recommendation score.
110
116
  - In Korean final reports, use `추천도` for the recommendation-score column label.
117
+ - Use non-breaking padding in short header cells so narrow renderers do not wrap Korean headers
118
+ vertically. Prefer this template:
119
+ `| 번호&nbsp;&nbsp; | 다음 작업 | 설명 | 추천도&nbsp;&nbsp; |`
120
+ `|---:|---|---|:---:|`
121
+ For English, prefer:
122
+ `| No.&nbsp;&nbsp; | Next task | Description | Score&nbsp;&nbsp; |`
123
+ `|---:|---|---|:---:|`
111
124
  - Keep descriptions short enough to scan but specific enough to execute.
112
125
  - Localize column labels to the report language when appropriate.
113
126
  6. Mark gated items plainly.
@@ -116,6 +129,8 @@ scope, approval, verification, command contracts, release gates, or safety rules
116
129
  genuinely plausible follow-ups.
117
130
  - The description must state the gate, such as explicit user approval, configured command intent,
118
131
  owner decision, or manual verification.
132
+ - A gated item in the table is only a visible next-action option; it is not approval to perform
133
+ that action.
119
134
  7. Interpret a single-digit next user message as a menu selection only when all conditions hold:
120
135
  - the immediately relevant previous assistant final report contained a next-action menu;
121
136
  - the digit maps to an item in that menu;
@@ -2,7 +2,7 @@
2
2
  mustflow_doc: skill.quadratic-scan-review
3
3
  locale: en
4
4
  canonical: true
5
- revision: 1
5
+ revision: 2
6
6
  lifecycle: mustflow-owned
7
7
  authority: procedure
8
8
  name: quadratic-scan-review
@@ -84,33 +84,35 @@ The review question is not "is there a loop inside a loop?" That catches only th
84
84
  ## Procedure
85
85
 
86
86
  1. Name the repeated path and multiply call count by inner scan length. Review the product `outer_count * inner_count`, not the apparent number of loops.
87
- 2. Search for the obvious collection-combinator shapes: `map` plus `filter`, `map` plus `find`, `forEach` plus `includes`, `filter` plus `indexOf`, `reduce` plus spread, and chained `filter().map().sort()` inside a repeated path.
87
+ 2. Search for the obvious collection-combinator shapes: `map` plus `filter`, `map` plus `find`, `forEach` plus `includes`, `filter` plus `indexOf`, `filter` plus `findIndex`, `reduce` plus spread, and chained `filter().map().sort()` inside a repeated path.
88
88
  3. Search for membership checks over arrays. `includes`, `indexOf`, `contains`, `find`, `some`, and list membership inside a loop usually want `Set.has` or `Map.has` unless the searched list is tiny and hard-capped.
89
89
  4. Search for code joins by ID. `posts.map(post => users.find(...))`, `users.map(user => orders.filter(...))`, permission lookups, likes, bookmarks, read state, tags, and relation lists usually need a `Map` or grouped `Map` keyed by ID or composite key.
90
90
  5. Check duplicate removal. `filter((x, i) => arr.indexOf(x) === i)` is O(N^2). Prefer `Set` for scalar values and `Map` keyed by stable identity for objects.
91
91
  6. Check sorted arrays. Sorting does not make `find` fast. If code repeatedly searches a sorted array, use a prebuilt map, binary search with a proven comparator, or a single sorted merge.
92
92
  7. Check repeated sorting. Sorting inside a per-item loop is usually worse than scanning once, keeping a top candidate, using a heap, or sorting once before the loop.
93
- 8. Check copy-accumulation patterns. `reduce` with `[...acc, item]`, repeated object spread over a growing object, repeated string `+=`, and repeated concatenation can become quadratic copy work. Prefer push, builders, buffers, or one final copy at the boundary.
94
- 9. Check JSON and serialization comparisons. Repeated `JSON.stringify` inside search, equality, sort, dedupe, or render logic multiplies object size by item count. Use explicit keys and precomputed normalized keys.
95
- 10. Open helper bodies called from loops or render paths. Harmless helper names can hide full-list scans, database calls, resolver calls, serialization, sorting, or permission checks.
96
- 11. Check ORM and lazy relations. A single visible loop can become one query per entity. Replace per-entity relation access with eager loading, joins, `WHERE id IN (...)`, batch loading, or DataLoader-style batching.
97
- 12. Check GraphQL and nested resolvers. Parent-list resolvers plus per-field DB or API calls create hidden pairwise fan-out. Batch by parent IDs and preserve field-level authorization semantics.
98
- 13. Check render-time lookup. `rows.map(row => columns.find(...))`, `items.map(item => selectedIds.includes(item.id))`, derived data recomputed on every render, and per-row helper scans should move to memoized sets or maps when inputs are large or stable.
99
- 14. Check all-data-in-app joins. Fetching `allUsers`, `allOrders`, or `allLogs` and joining in application arrays is often a database join without an index. Push join, filter, sort, and pagination to the data store when the data store owns the index and semantics allow it.
100
- 15. Check tree and graph construction. `nodes.map(node => nodes.filter(child => child.parentId === node.id))` should usually become `childrenByParentId` plus one assembly pass. `visited.includes(id)` in traversal should be a `Set`.
101
- 16. Check event-log and time-window scans. Repeatedly scanning all previous events per event should usually become grouping, sorting once, and one pointer or rolling aggregate per key.
102
- 17. Check interval overlap. All-pairs range checks are sometimes necessary, but overlap detection often only needs sorting by start and comparing adjacent or active intervals.
103
- 18. Check incremental updates. Adding one item should not recompute a full ranking, group map, unread count, cart total, or dashboard aggregate unless the collection is fixed and tiny.
104
- 19. Separate index from cache. A `Map` built from current input is an index. A cache stores results across calls or time. Use an index for repeated lookup over already-owned data before introducing cache invalidation.
105
- 20. Require a hard cap for "small list" exceptions. Countries, enum options, or fixed config lists may stay arrays if the cap is real. User data, logs, orders, comments, permissions, tags, events, and uploaded rows need scalable lookup.
106
- 21. Preserve behavior while changing shape. Before replacing scans with indexes, state how order, duplicates, first or last match, missing references, authorization filtering, and stable keys are preserved.
107
- 22. Add growth evidence when feasible. If configured tests or fixtures can scale input size, prefer a small growth test that compares behavior at larger counts. If benchmarking is not configured, report complexity-only evidence instead of a speedup claim.
93
+ 8. Check queue and deletion patterns. JavaScript `shift()` in a large BFS or queue loop moves the remaining array repeatedly; use a head index or real queue. `findIndex` plus `splice` while matching requests to available items can scan and move the same growing array repeatedly; bucket by key and advance a consumption pointer instead.
94
+ 9. Check copy-accumulation patterns. `reduce` with `[...acc, item]`, repeated object spread over a growing object, repeated string `+=`, repeated `concat`, and repeated array spread over a growing result can become quadratic copy work. Prefer push, builders, buffers, or one final copy at the boundary.
95
+ 10. Check JSON and serialization comparisons. Repeated `JSON.stringify` inside search, equality, sort, dedupe, or render logic multiplies object size by item count. Use explicit keys and precomputed normalized keys.
96
+ 11. Open helper bodies called from loops or render paths. Harmless helper names can hide full-list scans, database calls, resolver calls, serialization, sorting, or permission checks.
97
+ 12. Check ORM and lazy relations. A single visible loop can become one query per entity. Replace per-entity relation access with eager loading, joins, `WHERE id IN (...)`, batch loading, or DataLoader-style batching.
98
+ 13. Check GraphQL and nested resolvers. Parent-list resolvers plus per-field DB or API calls create hidden pairwise fan-out. Batch by parent IDs and preserve field-level authorization semantics.
99
+ 14. Check render-time lookup. `rows.map(row => columns.find(...))`, `items.map(item => selectedIds.includes(item.id))`, derived data recomputed on every render, and per-row helper scans should move to memoized sets or maps when inputs are large or stable.
100
+ 15. Check all-data-in-app joins. Fetching `allUsers`, `allOrders`, or `allLogs` and joining in application arrays is often a database join without an index. Push join, filter, sort, and pagination to the data store when the data store owns the index and semantics allow it.
101
+ 16. Check tree and graph construction. `nodes.map(node => nodes.filter(child => child.parentId === node.id))` should usually become `childrenByParentId` plus one assembly pass. `visited.includes(id)` in traversal should be a `Set`. Very deep trees may also need an explicit stack to avoid call-stack failure.
102
+ 17. Check event-log and time-window scans. Repeatedly scanning all previous events per event should usually become grouping, sorting once, and one pointer or rolling aggregate per key.
103
+ 18. Check interval overlap. All-pairs range checks are sometimes necessary, but overlap detection often only needs sorting by start and comparing adjacent or active intervals.
104
+ 19. Check true all-pairs similarity separately. If every item must be compared with every other item, do not promise a linear rewrite. First narrow candidates with stable keys, categories, buckets, hashes, n-grams, ranges, or database indexes, then compare only within the candidate set.
105
+ 20. Check incremental updates. Adding one item should not recompute a full ranking, group map, unread count, cart total, or dashboard aggregate unless the collection is fixed and tiny.
106
+ 21. Separate index from cache. A `Map` built from current input is an index. A cache stores results across calls or time. Use an index for repeated lookup over already-owned data before introducing cache invalidation.
107
+ 22. Require a hard cap for "small list" exceptions. Countries, enum options, or fixed config lists may stay arrays if the cap is real. User data, logs, orders, comments, permissions, tags, events, and uploaded rows need scalable lookup.
108
+ 23. Preserve behavior while changing shape. Before replacing scans with indexes, state how order, duplicates, first or last match, missing references, authorization filtering, and stable keys are preserved.
109
+ 24. Add growth evidence when feasible. If configured tests or fixtures can scale input size, prefer a small growth test that compares behavior at larger counts. If benchmarking is not configured, report complexity-only evidence instead of a speedup claim.
108
110
 
109
111
  <!-- mustflow-section: postconditions -->
110
112
  ## Postconditions
111
113
 
112
114
  - Each suspected O(N^2) path has an outer count, inner count, and data-growth classification.
113
- - Repeated membership checks, code joins, duplicate removal, tree building, resolver fan-out, render-time lookup, helper-hidden scans, repeated sort, copy accumulation, and JSON comparison are fixed or reported.
115
+ - Repeated membership checks, code joins, duplicate removal, tree building, resolver fan-out, render-time lookup, helper-hidden scans, repeated sort, queue `shift()`, `findIndex` plus `splice`, copy accumulation, interval scans, all-pairs candidate narrowing, and JSON comparison are fixed or reported.
114
116
  - Array-to-set or array-to-map changes preserve order, duplicates, missing records, first or last winner, authorization, and stable key behavior.
115
117
  - Small-list exceptions have an explicit hard cap or are reported as residual risk.
116
118
  - Performance claims are backed by configured evidence or labeled as static complexity risk.
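
Reviewer note: the new queue and deletion step, together with the `visited.includes(id)` guidance, can be sketched as a breadth-first traversal that uses a head index instead of `shift()` and a `Set` instead of an array scan. The `Graph` type and `bfsOrder` name are hypothetical and exist only for this illustration.

```typescript
// Hypothetical adjacency-list graph: node id -> neighbor ids.
type Graph = Map<string, string[]>;

// Breadth-first order from `start`.
// - `head` is a read index, so dequeuing never moves the remaining elements.
// - `visited` is a Set, so membership is a hash lookup, not an array scan.
function bfsOrder(graph: Graph, start: string): string[] {
  const queue: string[] = [start];
  const visited = new Set<string>([start]);
  let head = 0;

  while (head < queue.length) {
    // Safe: head < queue.length, so the slot is populated.
    const node = queue[head]!;
    head += 1;
    for (const next of graph.get(node) ?? []) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push(next);
      }
    }
  }
  // Every reachable node is enqueued exactly once, so the queue is the visit order.
  return queue;
}
```

The traversal is iterative, so it also sidesteps the call-stack concern the diff mentions for very deep trees. The trade-off is that the queue array keeps already-visited entries until the function returns.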
@@ -146,7 +148,7 @@ Use the narrowest configured test, build, docs, release, or mustflow intent that
146
148
  - Repeated path reviewed
147
149
  - Outer count, inner count, and data-growth classification
148
150
  - Hidden scan patterns found or ruled out
149
- - Membership, join, dedupe, helper, ORM, resolver, render, tree, graph, event, interval, sort, copy, string, and JSON checks where relevant
151
+ - Membership, join, dedupe, helper, ORM, resolver, render, tree, graph, event, interval, all-pairs, queue, deletion, sort, copy, string, and JSON checks where relevant
150
152
  - Index, grouping, sorted merge, database join, or intentional all-pairs decision
151
153
  - Semantics preserved: order, duplicates, first or last winner, missing IDs, authorization, and stable keys
152
154
  - Evidence level: configured test, static complexity risk, manual-only, missing, or not applicable
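
Reviewer note: the "Semantics preserved" line in this report format is easiest to see with the code-join step from the procedure. The sketch below is an assumption-laden illustration, not package code: `User`, `Post`, and both function names are hypothetical. It replaces a per-post `find` with a prebuilt `Map` while keeping first-match and missing-reference behavior.

```typescript
// Hypothetical shapes for illustration only.
interface User {
  id: number;
  name: string;
}
interface Post {
  id: number;
  authorId: number;
  title: string;
}

// Scan-per-item shape: roughly posts.length * users.length comparisons.
function joinWithFind(posts: Post[], users: User[]) {
  return posts.map((post) => ({
    title: post.title,
    author: users.find((u) => u.id === post.authorId)?.name ?? null,
  }));
}

// Indexed shape: build the index once, then do one lookup per post.
function joinWithIndex(posts: Post[], users: User[]) {
  const userById = new Map<number, User>();
  for (const user of users) {
    // Keep the first user per id so duplicates resolve the same way `find` does.
    if (!userById.has(user.id)) {
      userById.set(user.id, user);
    }
  }
  return posts.map((post) => ({
    title: post.title,
    // Missing authors stay null, as in the scan version.
    author: userById.get(post.authorId)?.name ?? null,
  }));
}
```

Output order follows `posts` in both versions. A plain `userById.set` without the `has` check would silently switch duplicates to last-match, which is the kind of behavior change the report line asks reviewers to state.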
@@ -2,11 +2,11 @@
2
2
  mustflow_doc: skill.react-code-change
3
3
  locale: en
4
4
  canonical: true
5
- revision: 1
5
+ revision: 2
6
6
  lifecycle: mustflow-owned
7
7
  authority: procedure
8
8
  name: react-code-change
9
- description: Apply this skill when React, React DOM, React Server Components, Server Actions, React Compiler, Hooks, Suspense, Actions, forms, refs, context, concurrent rendering, SSR streaming, resource hints, package metadata, or React-related tests are created, changed, reviewed, or upgraded.
9
+ description: Apply this skill when React, React DOM, React Server Components, Server Actions, React Compiler, Hooks, Suspense, Actions, forms, refs, context, render performance, concurrent rendering, SSR streaming, resource hints, package metadata, or React-related tests are created, changed, reviewed, or upgraded.
10
10
  metadata:
11
11
  mustflow_schema: "1"
12
12
  mustflow_kind: procedure
@@ -75,6 +75,10 @@ expect current React guidance and small, compatible changes.
  - State and mutation evidence: local state owner, derived values, external
  stores, context providers, forms, Actions, optimistic updates, and rollback
  behavior.
+ - Render performance evidence: React DevTools Profiler or `<Profiler>` data when
+ available, render count, render duration, prop identity changes, context update
+ scope, list size, DOM node count, key stability, layout effect use, first-load
+ bundle ownership, and offscreen DOM cost.
  - Configured verification intents for lint, build, tests, docs, package, and
  mustflow checks.
  
@@ -186,14 +190,49 @@ expect current React guidance and small, compatible changes.
  errors, resets, progressive enhancement, and rollback.
  - Keep explicit error handling, authorization, validation, idempotency, and
  rollback behavior. Do not hide server failures behind optimistic UI.
- 10. **Respect React 19.2 rendering and performance APIs.**
+ 10. **Review React render hot paths with evidence.**
+ - Use React DevTools Profiler, `<Profiler>`, framework traces, or existing
+ render-count evidence before claiming a render-performance fix. If none is
+ configured, report static render risk instead of measured speedup.
+ - Check whether state is owned too high in the tree. Search inputs, tabs,
+ modal flags, hover state, and local drafts should not rerender a whole page
+ unless that page truly owns the state.
+ - Check `memo` failures from unstable props. Inline objects, arrays, functions,
+ and selector results can make `React.memo` ineffective; prefer primitive
+ props, stable callbacks, or moving object creation behind a real dependency.
+ - Move expensive render-time `filter`, `sort`, `map`, grouping, and lookup work
+ behind `useMemo`, server-side pagination, route loaders, or pre-indexed data
+ when input size can grow.
+ - Large lists need pagination, infinite query boundaries, virtualization, or a
+ documented hard cap. Do not render thousands of rows because the sample data
+ has twenty.
+ - Reject unstable keys such as array index for reorderable data and
+ `Math.random()` for any list. Use stable item identity so React preserves
+ row state and avoids forced remounts.
+ - Split oversized context values by change frequency and ownership. `memo`
+ does not stop rerenders caused by a fresh context value.
+ - Do not use `useEffect` plus `setState` for values derived from current props
+ or state. Compute during render or memoize the calculation to avoid the
+ extra render pass.
+ - For search and filtering, keep the controlled input urgent and move heavy
+ result updates behind `useDeferredValue`, `useTransition`, server filtering,
+ or pagination when the supported React version and UX allow it.
+ - Use `useLayoutEffect` only when pre-paint measurement is required. Avoid
+ DOM read/write interleaving that causes layout thrashing.
+ - Lazy-load heavy charts, editors, maps, markdown renderers, syntax
+ highlighters, and modal-only widgets when they are not needed for the first
+ interaction path.
+ - For large offscreen sections, consider `content-visibility` plus
+ `contain-intrinsic-size`, framework lazy boundaries, or route splitting when
+ browser support and layout stability are acceptable.
+ 11. **Respect React 19.2 rendering and performance APIs.**
  - Treat `<Activity>` as hidden UI with preserved state, unmounted effects,
  and lower-priority hidden updates, not as `display: none` or ordinary
  conditional rendering.
  - Use React Performance Tracks, React DevTools, or existing profiler evidence
  when claiming render, effect, Scheduler, transition, or component
  performance improvements.
- 11. **Keep server rendering and RSC boundaries exact.**
+ 12. **Keep server rendering and RSC boundaries exact.**
  - Distinguish Server Components from Server Actions. `"use server"` marks
  server functions or modules for actions; it is not a Server Component tag.
  - Keep browser APIs, client state, and event handlers out of Server
@@ -206,13 +245,13 @@ expect current React guidance and small, compatible changes.
  - In Node environments, do not assume Web Streams are faster than Node
  streams; preserve the existing SSR stream API unless the task proves the
  runtime benefit and compression behavior.
- 12. **Use React DOM document and resource APIs close to the owner.**
+ 13. **Use React DOM document and resource APIs close to the owner.**
  - Metadata, stylesheets with `precedence`, async scripts, `preinit`,
  `preload`, `preconnect`, and `prefetchDNS` may belong near the component
  that needs them when React and the framework support that behavior.
  - Avoid duplicate head managers, resource hint spam, and hints for assets
  whose timing or priority is unproven.
- 13. **Verify through the repository contract.**
+ 14. **Verify through the repository contract.**
  - Run the smallest configured checks that cover changed React code, package
  metadata, build output, docs, and tests.
  - Report missing browser, hydration, SSR, RSC, compiler, profiler, or
225
264
  status are known or explicitly reported as unknown.
226
265
  - Effects, state, memoization, context, refs, forms, Suspense, and async
227
266
  boundaries follow React's current model for the supported version.
267
+ - Render performance claims are backed by profiler or render-count evidence, or
268
+ static risks such as state too high, unstable props, render-time transforms,
269
+ huge lists, unstable keys, oversized context, derived-state effects, layout
270
+ thrashing, eager heavy widgets, and offscreen DOM cost are reported honestly.
228
271
  - React 19 and React 19.2 APIs are not introduced into code that still promises
229
272
  older React compatibility.
230
273
  - SSR, RSC, Server Action, browser-only, and resource-hint boundaries are
231
274
  preserved.
232
- - Performance claims have profiler or benchmark evidence, or are reported as
233
- unverified.
275
+ - Performance claims have profiler, benchmark, render-count, or configured
276
+ evidence, or are reported as unverified.
234
277
 
235
278
  <!-- mustflow-section: verification -->
236
279
  ## Verification
@@ -271,6 +314,9 @@ surfaces changed.
  - React surface and supported version checked
  - Compiler, lint, effect, state, memoization, context, ref, form, Suspense, SSR,
  RSC, and resource-boundary notes
+ - Render performance notes: profiler evidence, state ownership, prop identity,
+ render-time work, list size, key stability, context scope, derived state,
+ layout effects, lazy loading, and offscreen DOM
  - Freshness-sensitive React claims checked or left conservative
  - Files changed
  - Command intents run
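Two of the render-performance notes added to this skill, profiler evidence and prop identity, can be sketched without a React dependency. The names `createRenderRecorder` and `shallowEqualProps` are hypothetical and not part of mustflow or React. The recorder's callback takes the first three arguments React passes to a `<Profiler>` `onRender` callback (`id`, `phase`, `actualDuration`), and the comparison approximates the default shallow `Object.is` check that `React.memo` applies to props.

```typescript
type RenderPhase = "mount" | "update" | "nested-update";

interface RenderStats { renders: number; totalMs: number; maxMs: number }

// Hypothetical helper: accumulates render count and duration per Profiler id.
function createRenderRecorder() {
  const stats = new Map<string, RenderStats>();
  const onRender = (id: string, _phase: RenderPhase, actualDuration: number): void => {
    const s = stats.get(id) ?? { renders: 0, totalMs: 0, maxMs: 0 };
    s.renders += 1;
    s.totalMs += actualDuration;
    s.maxMs = Math.max(s.maxMs, actualDuration);
    stats.set(id, s);
  };
  return { onRender, stats };
}

type Props = Record<string, unknown>;

// Approximation of a shallow props comparison: same keys, and each value
// identical under Object.is. Fresh inline objects fail this check even when
// their contents are equal.
function shallowEqualProps(prev: Props, next: Props): boolean {
  const prevKeys = Object.keys(prev);
  if (prevKeys.length !== Object.keys(next).length) return false;
  return prevKeys.every(
    (k) => Object.prototype.hasOwnProperty.call(next, k) && Object.is(prev[k], next[k]),
  );
}

// Primitive props compare equal across renders.
const primitivesStable = shallowEqualProps({ id: 1, label: "a" }, { id: 1, label: "a" });
// An inline object is a new reference on every render, so the check fails.
const inlineObjectStable = shallowEqualProps(
  { style: { color: "red" } },
  { style: { color: "red" } },
);
```

In a React tree the recorder would be wired up as `<Profiler id="ResultsList" onRender={recorder.onRender}>`, where `ResultsList` is a placeholder id. Without such measured data, the skill text asks for a static risk report rather than a claimed speedup.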
@@ -2,7 +2,7 @@
  mustflow_doc: skill.vertical-slice-tdd
  locale: en
  canonical: true
- revision: 1
+ revision: 2
  lifecycle: mustflow-owned
  authority: procedure
  name: vertical-slice-tdd
@@ -30,7 +30,7 @@ metadata:
  
  Support explicit test-driven development without making test-first work mandatory for every mustflow task.
  
- This skill keeps TDD work in narrow vertical behavior slices: one observable contract, one focused test change, the smallest implementation that proves it, and only then a local refactor inside the covered slice.
+ This skill keeps TDD work in one vertical behavior slice at a time: choose the next test by risk and evidence value, prove one observable contract, attack the test for false-green weakness, implement only enough behavior to pass, and only then refactor inside the covered slice.
  
  <!-- mustflow-section: use-when -->
  ## Use When
@@ -54,6 +54,7 @@ This skill keeps TDD work in narrow vertical behavior slices: one observable con
  
  - User request or issue evidence that makes TDD or slice-by-slice work appropriate.
  - The observable behavior contract for the first slice.
+ - A short test list or risk list, ordered by which test would expose the most important uncertainty next.
  - Existing tests, fixtures, and helpers near that behavior.
  - The expected RED category and baseline status before implementation.
  - Relevant command-intent contract entries for the narrowest verification path.
@@ -78,9 +79,11 @@ This skill keeps TDD work in narrow vertical behavior slices: one observable con
  <!-- mustflow-section: procedure -->
  ## Procedure
  
- 1. Select one vertical behavior slice.
+ 1. Select the next evidence-bearing slice.
  - Name the user-visible or public behavior.
  - Define the smallest input, action, and observable output that prove the slice.
+ - Prefer the test that would reveal the riskiest unknown, boundary, integration contract, or regression path, not merely the easiest happy path.
+ - Treat Red-Green-Refactor as the inner loop, not the whole method. Do not start adding tests before choosing why this test is the next useful evidence.
  - Keep cross-cutting infrastructure, broad refactors, and speculative future cases outside the slice.
  2. Find existing coverage.
  - Prefer extending a nearby existing test when it already owns the behavior surface.
@@ -90,30 +93,39 @@ This skill keeps TDD work in narrow vertical behavior slices: one observable con
  - Use `test-design-guard` to select the test shape and assertion.
  - Assert observable behavior such as a return value, exit code, output, file effect, state transition, schema result, or error shape.
  - Keep mocks supportive rather than the only behavior evidence, unless the interaction itself is the public contract.
- 4. Classify the RED result before implementation.
+ 4. Attack the test before trusting it.
+ - Ask what bug could still pass this test. Strengthen the assertion when the answer is concrete and in scope.
+ - Prefer property, contract, approval, integration, or mutation-style evidence only when `test-design-guard` shows that shape fits the contract and stays bounded.
+ - For legacy code, use characterization or approval-style evidence to freeze current behavior before refactoring when the intended behavior is not yet trusted.
+ - For API or service boundaries, prefer consumer, schema, or contract evidence over mocks of the provider's imagined behavior.
+ - If implementation was AI-assisted, check that generated code did not outrun the selected test by adding untested branches, features, or public behavior.
+ 5. Classify the RED result before implementation.
  - `behavior_red` is the only valid behavior RED.
  - `api_scaffold_red` may be reported only for an explicitly new public API scaffold and must not be counted as behavior coverage.
  - `invalid_red` includes setup failures, wrong imports, missing unrelated symbols, runner failures, fixture failures, syntax or type errors, bad mocks, missing awaits, environment failures, and unrelated baseline failures.
  - If RED is invalid, fix the test setup or report the invalid evidence before changing implementation behavior.
- 5. Implement the smallest behavior change.
+ 6. Implement the smallest behavior change.
  - Change only the code needed for the current observable contract.
  - Preserve existing public behavior outside the slice.
  - Avoid introducing abstractions unless they directly reduce complexity in the current slice.
- 6. Verify GREEN with the narrowest configured command intent.
+ - Do not accept a broad AI-generated implementation just because the narrow test turned green; trim or defer unproven behavior.
+ 7. Verify GREEN with the narrowest configured command intent.
  - Start with the intent that covers the changed test and implementation surface.
  - Escalate only when the slice crosses public surfaces, package or template contracts, or the related selector cannot cover the changed files.
  - Keep command evidence separate from RED evidence and implementation notes.
- 7. Refactor only after GREEN.
+ 8. Refactor only after GREEN.
  - Limit refactoring to code covered by the slice.
  - Re-run the same configured verification intent after behavior-preserving cleanup when the refactor is non-trivial.
- 8. Decide whether to continue.
+ 9. Decide whether to continue.
  - Repeat only when the next slice is clearly in scope.
+ - Reorder the remaining test list when new evidence changes the highest-risk unknown.
  - Stop and report deferred slices when the remaining work is broader than the user request or needs a new decision.
  
  <!-- mustflow-section: postconditions -->
  ## Postconditions
  
  - Each completed slice has a named behavior contract, RED category, implementation summary, and GREEN verification evidence.
+ - Each completed slice records why that test was chosen next and how false-green risk was checked.
  - Invalid RED and scaffold-only RED are not reported as behavior coverage.
  - Deferred slices, rejected speculative cases, skipped checks, and remaining risks are explicit.
  - No command execution claim relies on anything outside the configured command intents.
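The "choose the next test by risk" and "reorder the remaining test list" steps in the hunk above could be modeled roughly as follows. The `CandidateTest` shape and its numeric `risk` score are assumptions made for illustration; the skill text does not prescribe any scoring scheme.

```typescript
// Hypothetical test-list entry; `risk` is an assumed 1-5 score where a higher
// value means the test would expose a more important unknown.
interface CandidateTest {
  name: string;
  risk: number;
  done: boolean;
}

// Pick the remaining test with the highest risk. `filter` returns a copy,
// so the caller's list is not reordered in place by `sort`.
function nextTest(list: CandidateTest[]): CandidateTest | undefined {
  return list
    .filter((t) => !t.done)
    .sort((a, b) => b.risk - a.risk)[0];
}

const testList: CandidateTest[] = [
  { name: "happy path returns total", risk: 1, done: false },
  { name: "empty cart boundary", risk: 3, done: false },
  { name: "payment provider contract", risk: 5, done: false },
];

// The riskiest unknown comes first, not the easiest happy path.
const first = nextTest(testList);

// After that slice is done, new evidence may raise another risk; selection is
// re-evaluated from the current scores instead of following a fixed order.
testList[2].done = true;
testList[0].risk = 4;
const second = nextTest(testList);
```

Re-scoring between slices is what distinguishes this from a static checklist: the order is recomputed each time the skill reaches its "decide whether to continue" step.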
@@ -145,10 +157,12 @@ Prefer the narrowest configured intent that proves the current slice. Escalate o
  ## Output Format
  
  - TDD trigger and slice scope
+ - Next-test selection rationale
  - Existing coverage reused
  - Slices completed
  - Slices deferred
  - Cases rejected as duplicate or speculative
+ - False-green checks and test-strength limits
  - RED Evidence:
  - category: `behavior_red`, `api_scaffold_red`, `invalid_red`, or `not_applicable`
  - command intent
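The RED categories listed in this output format map naturally onto a small type. The sketch below uses the four category strings from the skill text; the `SliceEvidence` record and the `countsAsBehaviorCoverage` helper are hypothetical names introduced only to show the rule that scaffold-only and invalid RED results are not reported as behavior coverage.

```typescript
// Category strings are taken from the skill's output format.
type RedCategory = "behavior_red" | "api_scaffold_red" | "invalid_red" | "not_applicable";

// Hypothetical per-slice record, for illustration only.
interface SliceEvidence {
  slice: string;
  red: RedCategory;
  greenVerified: boolean;
}

// Only a behavior RED followed by verified GREEN counts as behavior coverage.
function countsAsBehaviorCoverage(e: SliceEvidence): boolean {
  return e.red === "behavior_red" && e.greenVerified;
}

const report: SliceEvidence[] = [
  { slice: "rejects expired token", red: "behavior_red", greenVerified: true },
  { slice: "new export scaffold", red: "api_scaffold_red", greenVerified: true },
  { slice: "broken fixture import", red: "invalid_red", greenVerified: false },
];

// Names of slices that may be reported as covered behavior.
const covered = report.filter(countsAsBehaviorCoverage).map((e) => e.slice);
```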
@@ -1,6 +1,6 @@
  id = "default"
  name = "default"
- version = "2.103.3"
+ version = "2.103.10"
  description = "Minimal workflow for LLM agents to read, edit, and verify their work in a repository."
  common_root = "common"
  locales_root = "locales"