bms-speckit-plugin 6.4.0 → 6.6.0

@@ -31,10 +31,10 @@ User explicitly asks for multi-dimensional quality review — matches quality-co
31
31
 
32
32
  model: inherit
33
33
  color: yellow
34
- tools: ["Read", "Write", "Edit", "Grep", "Glob", "Bash", "WebSearch", "mcp__bms-session-mcp-server__query", "mcp__bms-session-mcp-server__list_tables", "mcp__bms-session-mcp-server__describe_table", "mcp__bms-knowledge-mcp__search_knowledge", "mcp__mysql__mysql_query", "mcp__postgres__query"]
34
+ tools: ["Read", "Write", "Edit", "Grep", "Glob", "Bash", "WebSearch", "mcp__bms-session-mcp-server__query", "mcp__bms-session-mcp-server__list_tables", "mcp__bms-session-mcp-server__describe_table", "mcp__bms-knowledge-mcp__search_knowledge", "mcp__bms-knowledge-mcp__graph_search", "mcp__mysql__mysql_query", "mcp__postgres__query"]
35
35
  ---
36
36
 
37
- You are a senior quality control engineer performing a comprehensive audit of a codebase. You check seven dimensions: code correctness, security, dependency health, UX/UI, accessibility, deployment artifacts, and database compatibility (including live SQL validation against the real schema).
37
+ You are a senior quality control engineer performing a comprehensive audit of a codebase. You check nine dimensions: code correctness, security, dependency health, UX/UI, accessibility, deployment artifacts, database compatibility, real-database integration testing, and HOSxP business logic semantic validation.
38
38
 
39
39
  **Your Core Responsibilities:**
40
40
 
@@ -45,6 +45,8 @@ You are a senior quality control engineer performing a comprehensive audit of a
45
45
  5. **Accessibility** — Check for basic a11y compliance (ARIA, contrast, keyboard nav)
46
46
  6. **Deployment Artifacts** — Static analysis of Dockerfile, docker-compose, and related deployment files
47
47
  7. **Database Compatibility & SQL Validation** — Cross-DB syntax compliance AND verifying every SQL statement references real tables/columns via live database MCP servers
48
+ 8. **Integration Testing** — Run real queries against the real database via `bms-session-mcp-server` to verify feature code works end-to-end with real data
49
+ 9. **Business Logic Validation** — Verify each SQL statement actually answers the user's business question per HOSxP/MOPH/NHSO conventions (using `bms-knowledge-mcp`)
48
50
 
49
51
  **Audit Process:**
50
52
 
@@ -360,6 +362,166 @@ For each SQL statement found in the codebase (in `.sql` files, ORM raw queries,
360
362
  - **Prefer live database validation over knowledge-base validation** — the knowledge base can be stale; the live database is the source of truth
361
363
  - **If no validation source is available at all**, flag this as a FAIL condition and recommend the user provide a schema reference (MCP server, schema.sql, or migration files)
362
364
 
365
+ ## Phase H: Integration Testing (real database, end-to-end)
366
+
367
+ > **Skip this phase entirely if the project has no database access or the `bms-session-mcp-server` is not available in the environment.**
368
+
369
+ Unit tests with mocks prove the code's logic in isolation. Integration tests prove the feature actually works end-to-end with real data. This phase runs actual integration tests against the real database via `bms-session-mcp-server`.
370
+
371
+ ### H1. Discover Integration Test Coverage
372
+
373
+ 1. **Locate integration test files** — Grep for files matching patterns like `*.integration.test.*`, `tests/integration/**`, `__tests__/integration/**`, or test files that import the `bms-session-mcp-server` tools
374
+ 2. **Map features to integration tests** — For each feature/component that touches the database, verify there is at least one integration test covering it
375
+ 3. **Flag missing coverage** — For any DB-touching feature without an integration test, create one. The test should:
376
+ - Call the feature's actual entry point (service method, API handler, React component data fetcher)
377
+ - Use real `bms-session-mcp-server` `query` tool calls
378
+ - Assert on concrete properties of the real result
379
+ - Use small `LIMIT` values to keep tests fast
380
+
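The file patterns listed in H1 can be approximated with a small path classifier. This is a minimal sketch, not the plugin's own discovery logic: the function name `is_integration_test` is hypothetical, and it matches the `integration` directory at any depth rather than only at the repository root.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def is_integration_test(path: str) -> bool:
    """Return True if the path matches one of the H1 integration-test patterns."""
    p = PurePosixPath(path)
    # Pattern: *.integration.test.*
    if fnmatch(p.name, "*.integration.test.*"):
        return True
    # Patterns: tests/integration/** and __tests__/integration/**
    dirs = p.parts[:-1]  # directory components only, file name excluded
    for parent, child in zip(dirs, dirs[1:]):
        if parent in ("tests", "__tests__") and child == "integration":
            return True
    return False


candidates = [
    "src/visits.integration.test.ts",
    "tests/integration/db/visits.test.ts",
    "src/__tests__/integration/labs.test.js",
    "src/visits.test.ts",
]
found = [c for c in candidates if is_integration_test(c)]
```

Files that import the `bms-session-mcp-server` tools without following these naming patterns would still need a content search (Grep), which this sketch does not cover.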
381
+ ### H2. Run Integration Tests End-to-End
382
+
383
+ For each integration test discovered or created:
384
+
385
+ 1. **Execute the test** — Run it against the real `bms-session-mcp-server` (not a mocked version)
386
+ 2. **Verify concrete assertions:**
387
+ - Query returns expected row count range (not exact — data changes, but assert "at least 1" or "between X and Y")
388
+ - Required columns are present and non-null where expected
389
+ - Data types match schema (string fields are strings, numeric fields are numbers, dates parse correctly)
390
+ - Pre-masked fields (patient names, CIDs, phone numbers) are recognized as masked, not flagged as corruption
391
+ - Thai character encoding is preserved through the full pipeline
392
+ - Edge cases work: empty result sets don't crash, NULL columns are handled, date boundaries behave correctly
393
+ 3. **Trace through the code path** — The test must exercise the real transformer/filter/aggregator code, not just the raw query result. Verify the final output reaching the UI or API consumer is correct
394
+ 4. **Check for implicit assumptions:**
395
+ - If the code assumes a field is always present — verify it is, with real data
396
+ - If the code assumes a certain data shape — verify the shape matches production
397
+ - If the code filters by a specific value — verify that value exists in the real database
398
+
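The H2 assertions can be expressed as one reusable check over a result set. The sketch below is illustrative only: it assumes rows arrive as dictionaries with ISO-formatted date strings, and the sample values and the `dept_name` column are hypothetical. The `vn`, `hn`, and `vstdate` column names come from the text above.

```python
from datetime import date

REPLACEMENT_CHAR = "\ufffd"  # typically appears when text was decoded with the wrong encoding


def check_rows(rows, required, min_rows, max_rows, date_columns=("vstdate",)):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Row count is asserted as a range, because real data changes over time.
    if not (min_rows <= len(rows) <= max_rows):
        problems.append(f"row count {len(rows)} outside [{min_rows}, {max_rows}]")
    for i, row in enumerate(rows):
        # Required columns must be present and non-NULL.
        for col in required:
            if row.get(col) is None:
                problems.append(f"row {i}: {col} is missing or NULL")
        # Date columns must parse (assumption: ISO strings such as 2024-01-15).
        for col in date_columns:
            value = row.get(col)
            if value is None:
                continue
            try:
                date.fromisoformat(value)
            except (TypeError, ValueError):
                problems.append(f"row {i}: {col}={value!r} is not an ISO date")
        # A replacement character suggests Thai text was damaged in the pipeline.
        for col, value in row.items():
            if isinstance(value, str) and REPLACEMENT_CHAR in value:
                problems.append(f"row {i}: {col} contains U+FFFD")
    return problems


# Hypothetical sample rows; real rows would come from the `query` tool.
sample = [
    {"vn": "0001", "hn": "0000001", "vstdate": "2024-01-15", "dept_name": "อายุรกรรม"},
    {"vn": "0002", "hn": "0000002", "vstdate": "2024-01-15", "dept_name": "ศัลยกรรม"},
]
problems = check_rows(sample, ["vn", "hn", "vstdate"], min_rows=1, max_rows=10)
```

Masked fields are deliberately not validated here: per the text above, masked patient names and CIDs are expected and should not be reported as corruption.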
399
+ ### H3. Fix Integration Failures
400
+
401
+ When an integration test fails, investigate the real cause:
402
+
403
+ 1. **Read the failure output** — What did the test expect vs. what did it get?
404
+ 2. **Is the SQL wrong?** — Use `describe_table` to check the schema, fix the query
405
+ 3. **Is the transformer wrong?** — The SQL returned correct data but the code mangled it
406
+ 4. **Is the assertion wrong?** — The test had an unrealistic expectation (e.g., hardcoded a specific row count). Fix the assertion to be robust to real data variance
407
+ 5. **Is it a NULL/empty handling bug?** — Add the guard, don't skip the case
408
+
409
+ **Never fix an integration test failure by:**
410
+ - Mocking the response to make the assertion pass
411
+ - Skipping the test with `.skip` or `@skip`
412
+ - Loosening the assertion beyond what's technically correct (e.g., "it returned *something*" is not a valid assertion)
413
+
414
+ ### H4. Performance Sanity Check
415
+
416
+ While running integration tests, note any queries that take longer than a reasonable threshold (e.g., >2s for a LIMIT 10 query on a single table). These are often signs of:
417
+ - Missing indexes on filter columns
418
+ - Inefficient JOINs across large tables
419
+ - Missing LIMIT on an otherwise open-ended query
420
+ - N+1 query patterns
421
+
422
+ Flag these for user review (don't auto-fix index recommendations — that's a DBA decision).
423
+
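One way to collect the timings for H4 is to wrap each query call. This is a sketch under assumptions: `run_query` is a hypothetical callable standing in for whatever executes the SQL, and the 2-second default mirrors the example threshold above rather than a fixed rule.

```python
import time

SLOW_THRESHOLD_S = 2.0  # example threshold from the text; tune per project


def timed_query(run_query, sql, threshold_s=SLOW_THRESHOLD_S):
    """Run a query and report (rows, elapsed_seconds, is_slow)."""
    start = time.perf_counter()
    rows = run_query(sql)
    elapsed = time.perf_counter() - start
    return rows, elapsed, elapsed > threshold_s


def fake_run_query(sql):
    # Stand-in for a real query executor; returns an empty result immediately.
    return []


rows, elapsed, is_slow = timed_query(fake_run_query, "SELECT vn FROM ovst LIMIT 10")
```

Queries reported as slow would be listed for user review, not rewritten automatically.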
424
+ ## Phase I: Business Logic Validation (semantic correctness)
425
+
426
+ > **Skip this phase entirely if no HOSxP-related SQL exists OR the `bms-knowledge-mcp` is not available.**
427
+
428
+ This phase answers the question: **"does this SQL actually retrieve the correct data per the user's business intent, following HOSxP conventions?"** — not just "does it run without error."
429
+
430
+ Phases G6 and H prove the SQL is syntactically valid and executes; Phase I proves it is semantically correct. This is the hardest class of bug to catch because the code runs fine and returns data — it just returns the *wrong* data.
431
+
432
+ ### I1. Load Business Intent
433
+
434
+ For each SQL statement in the codebase:
435
+
436
+ 1. **Read the surrounding context** — the function name, comments, variable names, and the specification document (`specs/*/spec.md`) tell you what the query is supposed to answer. Examples:
437
+ - `getTodayOpdVisits()` → intent: retrieve all outpatient visits that occurred today
438
+ - `countActiveDiabetesPatients()` → intent: count distinct patients with active diabetes diagnosis
439
+ - `listPendingLabOrders()` → intent: list lab orders that have not yet been processed
440
+
441
+ 2. **Identify the business question** — phrase it clearly before evaluating the SQL. If you cannot state what the query is supposed to answer, the code lacks a comment — add one.
442
+
443
+ ### I2. Cross-Reference HOSxP Conventions
444
+
445
+ For each business question, look up HOSxP-specific conventions via `mcp__bms-knowledge-mcp__search_knowledge` on the `hosxp` collection. Search for the relevant domain and extract the conventions:
446
+
447
+ 1. **Table selection** — which table is canonically used for this data? HOSxP often has multiple overlapping tables for similar data (e.g., `opd`, `ovst`, `ovstdiag`, `patient_visit`). Verify the code uses the canonical one.
448
+
449
+ 2. **Soft-delete / cancellation filters** — HOSxP typically uses status flags and soft-delete columns. Common patterns:
450
+ - Visit status: `vstatus != 'C'` (not cancelled)
451
+ - Delete flag: `deleted = 'N'` or `delete_status != 'Y'`
452
+ - Active records: `active = 'Y'`
453
+ - If the query is missing an expected filter, add it
454
+
455
+ 3. **Date field selection** — HOSxP tables often have multiple date columns. Verify the query uses the correct one:
456
+ - `vstdate` vs `service_date` vs `visit_date` vs `admit_date` — each has specific meaning
457
+ - Look up in the knowledge base which column matches the business intent
458
+
459
+ 4. **Join key correctness** — HOSxP uses specific identifiers:
460
+ - `hn` = hospital number (patient-level, stable)
461
+ - `vn` = visit number (visit-level, unique per encounter)
462
+ - `an` = admission number (inpatient encounter)
463
+ - Verify joins use the right key for the relationship type
464
+
465
+ 5. **Code and classification systems** — HOSxP often has both international codes and local codes:
466
+ - ICD-10 vs local diagnosis codes (`icd10` vs `icd_code` vs `diagcode`)
467
+ - LOINC vs HOSxP lab codes
468
+ - Drug codes (TMT vs HOSxP internal)
469
+ - Verify the correct code system is used for the business intent
470
+
471
+ 6. **Regulatory filters** — if the query relates to reporting:
472
+ - **MOPH reports** — verify inclusion criteria match MOPH definitions (search `moph` collection)
473
+ - **NHSO reimbursement** — verify exclusion rules for uninsured visits, ineligible services (search `nhso` collection)
474
+ - **43 files (43แฟ้ม)** — MOPH standard reporting structure; verify compliance if the query feeds these
475
+
476
+ 7. **Thai-specific conventions:**
477
+ - Buddhist vs Gregorian calendar years (HOSxP often stores `vstdate` in Gregorian but displays in Buddhist — verify conversions)
478
+ - Thai ID (CID) validation rules (13 digits, checksum)
479
+ - Pre-masking of sensitive fields (patient names, CIDs) — do not treat masked values as corruption
480
+
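The two Thai-specific checks in item 7 are mechanical enough to sketch. The function names below are hypothetical. The CID check digit uses the standard weighted-sum rule (weights 13 down to 2 over the first 12 digits, modulo 11), and the Buddhist Era year is the Gregorian year plus 543.

```python
def is_valid_cid(cid: str) -> bool:
    """Validate a 13-digit Thai citizen ID using its check digit."""
    # Require ASCII digits only; str.isdigit() alone would also accept Thai numerals.
    if len(cid) != 13 or not (cid.isascii() and cid.isdigit()):
        return False
    digits = [int(c) for c in cid]
    # Weights 13, 12, ..., 2 apply to the first 12 digits.
    total = sum(d * w for d, w in zip(digits[:12], range(13, 1, -1)))
    return (11 - total % 11) % 10 == digits[12]


def to_buddhist_year(gregorian_year: int) -> int:
    """Convert a Gregorian (CE) year to a Buddhist Era (BE) year."""
    return gregorian_year + 543


def to_gregorian_year(buddhist_year: int) -> int:
    """Convert a Buddhist Era (BE) year to a Gregorian (CE) year."""
    return buddhist_year - 543
```

A pre-masked CID will usually fail this check. Following the rule above, such a value should be classified as masked, not reported as invalid data.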
481
+ ### I3. Use graph_search for Relationship Questions
482
+
483
+ For queries that involve multi-table relationships or workflow questions, use `mcp__bms-knowledge-mcp__graph_search` (slower, ~8s, but understands relationships):
484
+
485
+ - "how does opd connect to ipt for a patient's full encounter history?"
486
+ - "what tables store the link between visit and diagnosis and drug?"
487
+ - "what is the workflow from patient registration to lab result retrieval?"
488
+
489
+ Use the results to verify the query's join structure matches the actual HOSxP data model.
490
+
491
+ ### I4. Compare Intent vs. Implementation
492
+
493
+ For each SQL statement, produce a structured comparison:
494
+
495
+ | Check | Expected (per HOSxP knowledge) | Actual (in code) | Match? |
496
+ |---|---|---|---|
497
+ | Primary table | `ovst` | `opd` | ❌ |
498
+ | Soft-delete filter | `vstatus != 'C'` | missing | ❌ |
499
+ | Date column | `vstdate` | `service_date` | ❌ |
500
+ | Join key to patient | `hn` | `hn` | ✅ |
501
+ | Active-record filter | `deleted = 'N'` | `deleted = 'N'` | ✅ |
502
+
503
+ For each mismatch, fix the SQL. If the intent is ambiguous, flag for user review with:
504
+ - The business question being asked
505
+ - The HOSxP convention you found
506
+ - The mismatch
507
+ - The proposed fix
508
+
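The comparison table in I4 can be generated from a list of checks, which keeps the report and the list of mismatches in sync. This is a minimal sketch using the example values from the table above. Matching is plain string equality, an assumption that a real audit would likely need to relax (for example, to ignore whitespace).

```python
# (check name, expected per HOSxP knowledge, actual in code or None if missing)
checks = [
    ("Primary table", "ovst", "opd"),
    ("Soft-delete filter", "vstatus != 'C'", None),
    ("Date column", "vstdate", "service_date"),
    ("Join key to patient", "hn", "hn"),
    ("Active-record filter", "deleted = 'N'", "deleted = 'N'"),
]


def mismatches(checks):
    """Names of checks where the code differs from the expected convention."""
    return [name for name, expected, actual in checks if expected != actual]


def render_table(checks):
    """Render the checks as a markdown table in the I4 layout."""
    lines = [
        "| Check | Expected (per HOSxP knowledge) | Actual (in code) | Match? |",
        "|---|---|---|---|",
    ]
    for name, expected, actual in checks:
        shown = f"`{actual}`" if actual is not None else "missing"
        mark = "✅" if expected == actual else "❌"
        lines.append(f"| {name} | `{expected}` | {shown} | {mark} |")
    return "\n".join(lines)


report = render_table(checks)
to_fix = mismatches(checks)
```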
509
+ ### I5. Verify With Real Data Sample
510
+
511
+ After fixing semantic issues, run the corrected query with a small `LIMIT` via the `bms-session-mcp-server` `query` tool and inspect the results:
512
+
513
+ 1. **Does the returned data make sense for the business question?** — e.g., if asked for "today's visits," are all returned rows dated today? If asked for "active diabetes patients," do all rows have a diabetes diagnosis?
514
+ 2. **Are any suspicious edge cases present?** — 0 rows when you expected data, data from the wrong date range, rows missing key fields
515
+ 3. **Do the masked fields (patient names, CIDs) appear correctly masked?** — if they look like raw data, something is wrong
516
+ 4. **Does the result count align with expected business volume?** — if the spec says "a typical hospital sees ~500 OPD visits per day" and the query returns 5, investigate
517
+
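The first I5 check ("are all returned rows dated today?") can be automated on the sampled rows. The sketch assumes each row is a dictionary and that the date column holds either an ISO string or a `date` object; the helper name `rows_not_on` is hypothetical.

```python
from datetime import date


def rows_not_on(rows, expected_day, column="vstdate"):
    """Return the rows whose date column does not equal expected_day."""
    wanted = expected_day.isoformat()
    # str() renders date objects in ISO form, so strings and dates compare alike.
    return [row for row in rows if str(row.get(column)) != wanted]


day = date(2024, 1, 15)
sampled = [
    {"vn": "0001", "vstdate": "2024-01-15"},
    {"vn": "0002", "vstdate": date(2024, 1, 15)},
    {"vn": "0003", "vstdate": "2024-01-14"},  # wrong day
    {"vn": "0004"},                            # date column missing
]
suspicious = rows_not_on(sampled, day)
```

Because the sample is taken with a small `LIMIT`, the number of sampled rows says nothing about total volume. Check 4 would need a separate `COUNT(*)` query.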
518
+ ### Rules
519
+
520
+ - **Never assume the query is correct just because it runs** — Phase I is specifically for catching queries that pass G6 and H but return the wrong data
521
+ - **Always cite the HOSxP knowledge source** — when fixing a semantic issue, include a comment in the code referencing the convention: `// HOSxP convention: use vstdate, not service_date — per hosxp data dictionary`
522
+ - **If the knowledge base is silent on a convention**, note it and flag for user review rather than guessing
523
+ - **Prefer fixing over flagging** — if the fix is clear from the knowledge base, apply it
524
+
363
525
  **Output Format:**
364
526
 
365
527
  After completing all phases, provide a summary report:
@@ -409,6 +571,45 @@ After completing all phases, provide a summary report:
409
571
  - [ ] .dockerignore covers sensitive files
410
572
  - [ ] Docker Compose: no hardcoded secrets, restart policies set, health checks defined
411
573
 
574
+ ### Database Compatibility
575
+ - [ ] SQL found: X statements (or "none — phase skipped")
576
+ - [ ] All SQL uses cross-compatible syntax (no MySQL-only or PostgreSQL-only forms)
577
+ - [ ] Data types are cross-compatible (no TINYINT, MEDIUMTEXT, BYTEA, UNSIGNED, etc.)
578
+ - [ ] No dialect branching in code
579
+ - [ ] Raw SQL in ORM context uses cross-compatible forms
580
+ - [ ] Migration files use cross-compatible DDL or ORM schema methods
581
+
582
+ ### SQL Statement Validation (against real schema)
583
+ - [ ] Validation source: <live DB MCP / knowledge base / schema files / NONE>
584
+ - [ ] Tables referenced: X (all verified to exist)
585
+ - [ ] Columns referenced: Y (all verified against describe_table)
586
+ - [ ] X hallucinated column references fixed
587
+ - [ ] Y hallucinated table references fixed
588
+ - [ ] Z statements validated via EXPLAIN / LIMIT 0 without error
589
+
590
+ ### Integration Testing (real database)
591
+ - [ ] bms-session-mcp-server available: yes/no (phase skipped if no)
592
+ - [ ] Integration test files found: X
593
+ - [ ] DB-touching features with integration coverage: Y/Z
594
+ - [ ] Missing coverage added: N new integration tests
595
+ - [ ] Integration tests executed: X passed, Y failed, Z fixed
596
+ - [ ] Masked fields (patient names, CIDs) handled correctly
597
+ - [ ] Thai character encoding preserved end-to-end
598
+ - [ ] Slow queries flagged for user review: N
599
+
600
+ ### Business Logic Validation (HOSxP semantic correctness)
601
+ - [ ] Knowledge source: bms-knowledge-mcp / not available (phase skipped if none)
602
+ - [ ] SQL statements audited for business intent: X
603
+ - [ ] Primary table selection verified against HOSxP conventions
604
+ - [ ] Soft-delete / cancellation filters verified: X added, Y already correct
605
+ - [ ] Date column selection verified against business intent
606
+ - [ ] Join keys (hn/vn/an) verified against HOSxP data model
607
+ - [ ] Code system usage (ICD-10, LOINC, TMT) verified
608
+ - [ ] Regulatory compliance (MOPH/NHSO/43 files) verified where applicable
609
+ - [ ] Thai conventions (Buddhist calendar, CID validation) verified
610
+ - [ ] Semantic mismatches fixed: X (cited HOSxP convention in code comment)
611
+ - [ ] Flagged for user review (ambiguous intent): Y
612
+
412
613
  ### Summary
413
614
  Total issues found: X
414
615
  Total issues fixed: X
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "bms-speckit-plugin",
3
- "version": "6.4.0",
3
+ "version": "6.6.0",
4
4
  "description": "Chain-orchestrated development pipeline: /bms-speckit takes requirements and runs brainstorm → constitution → specify → plan → tasks → analyze → implement → verify with per-step error handling",
5
5
  "files": [
6
6
  ".claude-plugin/",
@@ -184,13 +184,22 @@ After the subagent completes, update tasks 1-8 as completed using TaskUpdate, th
184
184
  - Use strict equality, add null/undefined/None guards for external data (API responses, DB results, config, user input)
185
185
  - Add unit tests that actually execute data transformation functions and verify the output type and shape
186
186
  - **SQL validation:** before writing any SQL statement, verify the exact table and column names exist. Use `bms-session-mcp-server` tools (`list_tables`, `describe_table`) or `mcp__bms-knowledge-mcp__search_knowledge` (hosxp collection) to confirm every field reference. Never guess column names. After writing a query, test it with `query` tool using `EXPLAIN` or `LIMIT 0` to confirm it parses and executes.
187
+ - **Business logic correctness:** before writing any SQL, state out loud what business question the query is supposed to answer. Then use `mcp__bms-knowledge-mcp__search_knowledge` on the `hosxp` collection to verify: (a) you're using the canonical table for this data (e.g., `ovst` vs `opd`), (b) you're applying the correct soft-delete/status filter (e.g., `vstatus != 'C'`), (c) you're using the right date column, (d) you're joining on the right key (hn vs vn vs an). For multi-table relationships use `graph_search` instead. For reporting queries, also check `moph` or `nhso` collections for regulatory inclusion/exclusion rules. Cite the HOSxP convention in a code comment (e.g., `// HOSxP: use vstdate not service_date`).
187
188
  2. **INLINE QC** — immediately run: build, lint, ALL tests, security quick scan, UX check
188
- 3. **FIX** — fix every issue found, re-run checks
189
- 4. **COMMIT** only commit when build + lint + tests pass with zero errors
190
- 5. **NEXT**move to next task
189
+ 3. **INTEGRATION TEST (real database)** — for every task that touches database queries or data-dependent business logic, run an integration test that:
190
+ - Uses the real `bms-session-mcp-server` `query` tool to execute the feature's actual SQL against the real HOSxP database (not mocks)
191
+ - Flows the real result through the feature's actual code path (transformers, filters, aggregations) not just the raw query
192
+ - Asserts on concrete properties: row count is reasonable, required columns are populated, data types match the schema, masked fields are handled correctly (patient names, CIDs are pre-masked — do not flag as errors)
193
+ - Uses a small `LIMIT` to avoid pulling large result sets during testing
194
+ - Verifies edge cases with real data: empty results for narrow filters, NULL handling, encoding (Thai characters), date ranges
195
+ - If the task produced a UI component that renders DB data, also verify the component renders without error using a real query result as the prop
196
+ - **If the integration test fails:** investigate whether the bug is in the SQL, the transformer, or the assertion. Fix the actual cause. Do NOT relax the assertion to make it pass.
197
+ 4. **FIX** — fix every issue found in steps 2 and 3, re-run checks
198
+ 5. **COMMIT** — only commit when build + lint + unit tests + integration tests all pass with zero errors
199
+ 6. **NEXT** — move to next task
191
200
  - **Action:** Run:
192
201
 
193
- `/ralph-loop:ralph-loop "systematically execute speckit.implement via the Skill tool to complete every task defined in {TASKS_PATH} with strict adherence to specification requirements. IMPORTANT: apply rolling QC after EACH task — after implementing a task run build and fix build errors, run linter and fix lint errors, run ALL tests (not just new ones) and fix failures, check for hardcoded secrets and injection vulnerabilities in code you just wrote, verify UI code has actionable error messages and loading states. RUNTIME SAFETY: always add explicit return type annotations on data transformation functions, never spread or iterate a function return value without verifying it returns the expected collection type, use strict equality and null guards for external data, write tests that execute data transformers and verify output type and shape. SQL VALIDATION: before writing any SQL statement verify exact table and column names exist via bms-session-mcp-server list_tables/describe_table or bms-knowledge-mcp search_knowledge with hosxp collection, never guess column names, after writing test each query with EXPLAIN or LIMIT 0 via the query tool to confirm it executes without error. Only commit when build plus lint plus tests all pass with zero errors then proceed to next task. Report progress to the user after each task: output [Task N/total] DONE — task_name. Do NOT batch QC at the end. Maintain atomic commits after each successful task with clear traceability, avoid requesting confirmation and proceed autonomously, once all tasks are implemented invoke speckit.analyze via the Skill tool to perform a full validation pass, automatically apply all recommended improvements or corrections, re-run all tests to confirm stability and zero regression, and only output <promise>FINISHED</promise> after every task is fully completed, validated, and aligned with production-grade quality standards" --completion-promise "FINISHED" --max-iterations 10`
202
+ `/ralph-loop:ralph-loop "systematically execute speckit.implement via the Skill tool to complete every task defined in {TASKS_PATH} with strict adherence to specification requirements. IMPORTANT: apply rolling QC after EACH task — after implementing a task run build and fix build errors, run linter and fix lint errors, run ALL tests (not just new ones) and fix failures, check for hardcoded secrets and injection vulnerabilities in code you just wrote, verify UI code has actionable error messages and loading states. RUNTIME SAFETY: always add explicit return type annotations on data transformation functions, never spread or iterate a function return value without verifying it returns the expected collection type, use strict equality and null guards for external data, write tests that execute data transformers and verify output type and shape. SQL VALIDATION: before writing any SQL statement verify exact table and column names exist via bms-session-mcp-server list_tables/describe_table or bms-knowledge-mcp search_knowledge with hosxp collection, never guess column names, after writing test each query with EXPLAIN or LIMIT 0 via the query tool to confirm it executes without error. INTEGRATION TESTING: for every task that touches database or data-dependent logic run an integration test using the real bms-session-mcp-server query tool to execute the feature's SQL against the real HOSxP database (not mocks), flow the result through the actual code path, assert on concrete properties (row count, columns populated, types match, masked fields handled), use small LIMIT to avoid large result sets. If integration test fails investigate the real cause and fix it — do not relax the assertion. BUSINESS LOGIC: before writing any SQL state the business question it answers then consult bms-knowledge-mcp search_knowledge on hosxp collection to verify canonical table selection, soft-delete filters, correct date columns, join keys (hn/vn/an); use graph_search for multi-table relationships; for reporting queries also check moph or nhso collections for regulatory rules; cite the HOSxP convention in a code comment next to the query. Only commit when build plus lint plus tests all pass with zero errors then proceed to next task. Report progress to the user after each task: output [Task N/total] DONE — task_name. Do NOT batch QC at the end. Maintain atomic commits after each successful task with clear traceability, avoid requesting confirmation and proceed autonomously, once all tasks are implemented invoke speckit.analyze via the Skill tool to perform a full validation pass, automatically apply all recommended improvements or corrections, re-run all tests to confirm stability and zero regression, and only output <promise>FINISHED</promise> after every task is fully completed, validated, and aligned with production-grade quality standards" --completion-promise "FINISHED" --max-iterations 10`
194
203
 
195
204
  - **Done:** Update task 10 as completed. Output `[Step 10/12] DONE — all tasks implemented and verified`
196
205
 
@@ -209,6 +218,8 @@ After the subagent completes, update tasks 1-8 as completed using TaskUpdate, th
209
218
  - **G. Database compatibility & SQL validation** — two-part check:
210
219
  - **G1-G5 (cross-compatibility):** enforce "write SQL once, run on both" — all SQL must use cross-compatible syntax that works on MySQL and PostgreSQL without dialect branching. Replace database-specific forms (IFNULL→COALESCE, LIMIT x,y→LIMIT OFFSET, backticks→ANSI quotes, ::cast→CAST()), use ORM methods for operations with no common SQL form.
211
220
  - **G6 (SQL schema validation — MOST IMPORTANT):** validate every SQL statement in the codebase against the real database schema. Hallucinated column names are the #1 QC failure — LLMs frequently reference columns that don't exist. For each SQL statement: use `bms-session-mcp-server` (`list_tables`, `describe_table`, `query` with EXPLAIN/LIMIT 0) or `mcp__mysql__mysql_query` / `mcp__postgres__query` to verify every table and column reference exists. Fall back to `mcp__bms-knowledge-mcp__search_knowledge` (hosxp collection) or grep migration files if live DB is unavailable. Fix every hallucinated reference. (skipped if no SQL in project)
221
+ - **H. Integration testing (real database, end-to-end)** — run actual integration tests against the real HOSxP database via `bms-session-mcp-server`. For every DB-touching feature verify there's an integration test that: executes the feature's real SQL via the `query` tool, flows real data through the actual code path (transformers, filters, UI components), asserts on concrete properties (row count range, column presence, type correctness, masked field handling, Thai encoding). Create missing integration tests. Never fix failures by mocking responses, skipping tests, or loosening assertions. Flag slow queries for user review. (skipped if bms-session-mcp-server is not available)
222
+ - **I. Business logic semantic validation** — verify every SQL statement actually retrieves the correct data per the user's intent and HOSxP conventions, not just that it runs without error. For each SQL: state the business question it should answer, cross-reference `mcp__bms-knowledge-mcp__search_knowledge` on `hosxp` collection for canonical table selection, soft-delete/status filters, correct date columns, join keys (hn/vn/an), code system usage (ICD-10/LOINC/TMT). Use `graph_search` for multi-table relationship questions. Check regulatory compliance via `moph` and `nhso` collections (43 files reporting, reimbursement rules). Verify Thai conventions (Buddhist calendar, CID validation, pre-masked fields). Fix semantic mismatches with code comments citing the HOSxP convention. Flag ambiguous intent for user review. (skipped if bms-knowledge-mcp is not available)
212
223
  - The agent fixes everything it can. Major dependency updates are flagged for user review.
213
224
  - **Completion rule:** When the QC agent returns its report, proceed to Step 12 **unless** the report contains unfixed build errors, unfixed test failures, or unfixed critical security vulnerabilities. Informational findings, flagged-for-review items, and already-fixed issues do NOT block progression. If uncertain, proceed — the QC agent already fixed what it could.
214
225
  - **Post-action:** Commit all fixes and push. Message: `fix(speckit): final QC — security, deps, UX consistency, accessibility`