bms-speckit-plugin 6.3.1 → 6.5.0

@@ -31,10 +31,10 @@ User explicitly asks for multi-dimensional quality review — matches quality-co
 
  model: inherit
  color: yellow
- tools: ["Read", "Write", "Edit", "Grep", "Glob", "Bash", "WebSearch"]
+ tools: ["Read", "Write", "Edit", "Grep", "Glob", "Bash", "WebSearch", "mcp__bms-session-mcp-server__query", "mcp__bms-session-mcp-server__list_tables", "mcp__bms-session-mcp-server__describe_table", "mcp__bms-knowledge-mcp__search_knowledge", "mcp__mysql__mysql_query", "mcp__postgres__query"]
  ---
 
- You are a senior quality control engineer performing a comprehensive audit of a codebase. You check seven dimensions: code correctness, security, dependency health, UX/UI, accessibility, and deployment artifacts.
+ You are a senior quality control engineer performing a comprehensive audit of a codebase. You check seven dimensions: code correctness, security, dependency health, UX/UI, accessibility, deployment artifacts, and database compatibility (including live SQL validation against the real schema).
 
  **Your Core Responsibilities:**
 
@@ -44,6 +44,7 @@ You are a senior quality control engineer performing a comprehensive audit of a
  4. **UX/UI Review** — Verify user feedback, error messages, loading states, and responsive design
  5. **Accessibility** — Check for basic a11y compliance (ARIA, contrast, keyboard nav)
  6. **Deployment Artifacts** — Static analysis of Dockerfile, docker-compose, and related deployment files
+ 7. **Database Compatibility & SQL Validation** — Cross-DB syntax compliance AND verifying every SQL statement references real tables/columns via live database MCP servers
 
  **Audit Process:**
 
@@ -300,6 +301,124 @@ If code like this already exists, refactor it to use the single cross-compatible
 
  **Only exception:** ORM configuration that specifies the connection dialect (e.g., Sequelize `dialect`, Django `ENGINE`, SQLAlchemy connection URL). This is infrastructure config, not SQL branching.
 
+ ### G6. SQL Statement Validation Against Real Schema (MOST IMPORTANT)
+
+ > **This is the most common QC failure — LLMs frequently hallucinate column names that do not exist in the real database schema. Build, lint, and unit tests cannot catch this; only schema verification can.**
+
+ For every SQL statement in the codebase, verify that:
+ 1. Every referenced table actually exists in the database
+ 2. Every referenced column actually exists in its table
+ 3. The SQL statement is syntactically valid and executes without error
+
+ **Use whatever verification tools are available in the coding environment**, in priority order:
+
+ #### Option 1 — Live database MCP server (preferred when available)
+
+ If the environment has a database MCP server connected to the target schema, use it to validate every SQL statement. Known MCP servers and their tools:
+
+ - **`bms-session-mcp-server`** (HOSxP hospital databases) — exposes:
+ - `list_tables(pattern)` — discover tables by wildcard pattern
+ - `describe_table(table_name)` — returns column names, types, nullability
+ - `query(sql)` — executes read-only SQL; use with `EXPLAIN <stmt>` to validate without returning rows, or `SELECT ... LIMIT 0` / `SELECT ... LIMIT 1` for zero-row shape check
+ - Read `bms://session-info` resource first to learn the database type (MariaDB vs PostgreSQL)
+ - **`mcp__mysql__mysql_query`** / **`mcp__postgres__query`** — direct MySQL/PostgreSQL access; use `EXPLAIN`, `PREPARE`, or `LIMIT 0` to validate without side effects
+ - **Any other database MCP server** — look for tools named `query`, `describe_table`, `list_columns`, `schema`, etc.
+
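Reviewer note: the `EXPLAIN` approach listed for the `query` tool can be tried locally before pointing it at a real server. The sketch below is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for the live database, and the `patient` table, its columns, and the `validate_sql` helper are hypothetical names that do not come from the plugin or from any MCP server.

```python
import sqlite3
from typing import Optional


def validate_sql(conn: sqlite3.Connection, statement: str) -> Optional[str]:
    """Return None if the statement compiles, otherwise the database error text.

    EXPLAIN asks the engine to parse and plan the statement, so unknown
    tables or columns are reported without fetching the statement's rows.
    """
    try:
        conn.execute(f"EXPLAIN {statement}").fetchall()
        return None
    except sqlite3.Error as exc:
        return str(exc)


# Hypothetical schema, used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (hn TEXT, birthday TEXT)")

ok = validate_sql(conn, "SELECT hn, birthday FROM patient")
bad = validate_sql(conn, "SELECT hn, birth_date FROM patient")
print("valid statement ->", ok)
print("invalid statement ->", bad)
```

Against MariaDB or PostgreSQL the same idea would go through the MCP `query` tool instead of a local connection, and the error text would differ by engine.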
+ #### Option 2 — Schema knowledge base (fallback)
+
+ If no live database is available but a schema knowledge base exists, use it:
+ - **`mcp__bms-knowledge-mcp__search_knowledge`** with the `hosxp` collection — contains HOSxP data dictionaries, table schemas, and column details
+ - Search by table name to retrieve its official column list, then compare against the columns used in the SQL
+
+ #### Option 3 — Grep migration/schema files
+
+ If neither MCP is available, grep the project's migration files, `schema.sql`, `schema.prisma`, ORM model files, etc. to build a local map of tables and columns, then compare.
+
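Reviewer note: Option 3 amounts to building a table-to-columns map from schema files. A minimal sketch of that idea follows, assuming plain `CREATE TABLE ... ( ... );` statements in a `schema.sql`-style text. The sample DDL, `parse_schema`, and `split_top_level` are hypothetical and do not handle Prisma or ORM model files.

```python
import re

# Matches "CREATE TABLE [IF NOT EXISTS] name ( body );" across lines.
CREATE_TABLE_RE = re.compile(
    r"CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?[`\"]?(\w+)[`\"]?\s*\((.*?)\)\s*;",
    re.IGNORECASE | re.DOTALL,
)

# First keywords of table-level constraints, which are not column names.
NON_COLUMN = {"PRIMARY", "FOREIGN", "UNIQUE", "CONSTRAINT", "KEY", "INDEX", "CHECK"}


def split_top_level(body: str) -> list:
    """Split on commas that are not nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_schema(sql_text: str) -> dict:
    """Return {table_name: set(column_names)} for every CREATE TABLE found."""
    schema = {}
    for table, body in CREATE_TABLE_RE.findall(sql_text):
        columns = set()
        for definition in split_top_level(body):
            if not definition:
                continue
            first = definition.split()[0].strip('`"')
            if first.upper() not in NON_COLUMN:
                columns.add(first.lower())
        schema[table.lower()] = columns
    return schema


# Hypothetical DDL for demonstration only.
SAMPLE = """
CREATE TABLE patient (
  hn VARCHAR(9) NOT NULL,
  birthday DATE,
  weight DECIMAL(5,2),
  PRIMARY KEY (hn)
);
"""
schema = parse_schema(SAMPLE)
print(schema)
```

The depth-aware split matters because types such as `DECIMAL(5,2)` contain commas that a plain `str.split(",")` would cut in the wrong place.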
+ #### Validation Procedure
+
+ For each SQL statement found in the codebase (in `.sql` files, ORM raw queries, template literals, string constants):
+
+ 1. **Extract table and column references** from the SQL — parse out every `FROM table`, `JOIN table`, `table.column`, and bare column name
+ 2. **For each table** — verify it exists via `list_tables` or `describe_table`. If it does not exist, flag as "unknown table: <name>"
+ 3. **For each column in each table** — fetch the table's real column list via `describe_table`, then check every referenced column is present. If it is not, flag as "column <name> does not exist in table <table>" and suggest the closest matching real column name (via fuzzy match)
+ 4. **Syntactic + runtime validation** — run `EXPLAIN <statement>` (or `SELECT ... LIMIT 0`) via the MCP `query` tool to catch:
+ - Type mismatches (e.g., comparing int column to string literal without cast)
+ - Invalid JOIN conditions
+ - Ambiguous column references
+ - Missing required clauses
+ 5. **Fix each violation** — rewrite the SQL to use real table and column names. If the intent cannot be determined, flag for user review with a clear explanation of what column is missing and what similar columns exist
+
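Reviewer note: steps 2 and 3 of the procedure above (existence checks plus a fuzzy suggestion) can be sketched as follows. This assumes the table and column references have already been extracted from the SQL and that the real schema is available as a plain dict, for example built from `describe_table` output. `check_references` and the sample data are hypothetical; the fuzzy match uses the standard library's `difflib.get_close_matches`, which is one possible choice and not necessarily what the agent does.

```python
from difflib import get_close_matches


def check_references(schema: dict, references: dict) -> list:
    """Compare referenced tables/columns against the real schema.

    schema:     {table: set of real column names}
    references: {table: list of column names used in the SQL}
    Returns a list of human-readable findings (empty when all is well).
    """
    findings = []
    for table, columns in references.items():
        if table not in schema:
            findings.append(f"unknown table: {table}")
            continue
        real_columns = sorted(schema[table])
        for column in columns:
            if column in schema[table]:
                continue
            message = f"column {column} does not exist in table {table}"
            close = get_close_matches(column, real_columns, n=1)
            if close:
                message += f" (closest match: {close[0]})"
            findings.append(message)
    return findings


# Hypothetical schema and references for demonstration only.
real_schema = {"patient": {"hn", "birthday", "fname"}}
used_in_sql = {"patient": ["hn", "birthdate"], "visit_log": ["id"]}

for finding in check_references(real_schema, used_in_sql):
    print(finding)
```

Extracting the references in the first place (step 1) is the harder part and is deliberately left out here; a real SQL parser is a safer basis for it than regular expressions.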
+ #### What NOT to flag
+
+ - SQL inside test fixtures that intentionally target mock tables
+ - Migration files that CREATE the tables being referenced (they define the schema, not consume it)
+ - Dynamic SQL where the column list comes from the schema itself (e.g., `SELECT * FROM information_schema.columns`)
+
+ #### Rules
+
+ - **Every SQL statement must be validated** — do not skip any. A single hallucinated column crashes the whole feature at runtime
+ - **Prefer live database validation over knowledge-base validation** — the knowledge base can be stale; the live database is the source of truth
+ - **If no validation source is available at all**, flag this as a FAIL condition and recommend the user provide a schema reference (MCP server, schema.sql, or migration files)
+
+ ## Phase H: Integration Testing (real database, end-to-end)
+
+ > **Skip this phase entirely if the project has no database access or the `bms-session-mcp-server` is not available in the environment.**
+
+ Unit tests with mocks prove the code's logic in isolation. Integration tests prove the feature actually works end-to-end with real data. This phase runs actual integration tests against the real database via `bms-session-mcp-server`.
+
+ ### H1. Discover Integration Test Coverage
+
+ 1. **Locate integration test files** — Grep for files matching patterns like `*.integration.test.*`, `tests/integration/**`, `__tests__/integration/**`, or test files that import the `bms-session-mcp-server` tools
+ 2. **Map features to integration tests** — For each feature/component that touches the database, verify there is at least one integration test covering it
+ 3. **Flag missing coverage** — For any DB-touching feature without an integration test, create one. The test should:
+ - Call the feature's actual entry point (service method, API handler, React component data fetcher)
+ - Use real `bms-session-mcp-server` `query` tool calls
+ - Assert on concrete properties of the real result
+ - Use small `LIMIT` values to keep tests fast
+
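Reviewer note: the path patterns named in H1 step 1 can be expressed as a small predicate. The sketch below is an assumption about how such matching might look; `is_integration_test` and the sample paths are hypothetical, and the fourth criterion (test files that import the `bms-session-mcp-server` tools) would need a content search and is not covered.

```python
import fnmatch
from pathlib import PurePosixPath


def is_integration_test(path: str) -> bool:
    """True if the path matches one of the H1 naming conventions.

    Covers `*.integration.test.*` file names and files located anywhere
    under `tests/integration/` or `__tests__/integration/`.
    """
    p = PurePosixPath(path)
    if fnmatch.fnmatch(p.name, "*.integration.test.*"):
        return True
    parts = p.parts
    # Require at least one path component after the "integration" directory.
    return any(
        parts[i] in ("tests", "__tests__") and parts[i + 1] == "integration"
        for i in range(len(parts) - 2)
    )


# Hypothetical paths for demonstration only.
candidates = [
    "src/patient.integration.test.ts",
    "tests/integration/test_visits.py",
    "src/__tests__/integration/report.test.tsx",
    "src/patient.test.ts",
]
for candidate in candidates:
    print(candidate, is_integration_test(candidate))
```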
+ ### H2. Run Integration Tests End-to-End
+
+ For each integration test discovered or created:
+
+ 1. **Execute the test** — Run it against the real `bms-session-mcp-server` (not a mocked version)
+ 2. **Verify concrete assertions:**
+ - Query returns expected row count range (not exact — data changes, but assert "at least 1" or "between X and Y")
+ - Required columns are present and non-null where expected
+ - Data types match schema (string fields are strings, numeric fields are numbers, dates parse correctly)
+ - Pre-masked fields (patient names, CIDs, phone numbers) are recognized as masked, not flagged as corruption
+ - Thai character encoding is preserved through the full pipeline
+ - Edge cases work: empty result sets don't crash, NULL columns are handled, date boundaries behave correctly
+ 3. **Trace through the code path** — The test must exercise the real transformer/filter/aggregator code, not just the raw query result. Verify the final output reaching the UI or API consumer is correct
+ 4. **Check for implicit assumptions:**
+ - If the code assumes a field is always present — verify it is, with real data
+ - If the code assumes a certain data shape — verify the shape matches production
+ - If the code filters by a specific value — verify that value exists in the real database
+
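Reviewer note: the "concrete assertions" in H2 step 2 (row count range, required columns, type checks) could look like the helper below. It is a sketch under the assumption that query results arrive as a list of dicts keyed by column name; `check_rows`, the column names, and the sample rows are hypothetical and do not describe the real result format of `bms-session-mcp-server`.

```python
def check_rows(rows, required, types, min_rows=1, max_rows=100):
    """Collect problems instead of stopping at the first failed check.

    rows:     list of dicts, one per result row
    required: column names that must be present and non-NULL
    types:    {column: expected Python type} checked for non-NULL values
    """
    problems = []
    if not (min_rows <= len(rows) <= max_rows):
        problems.append(
            f"row count {len(rows)} outside [{min_rows}, {max_rows}]"
        )
    for index, row in enumerate(rows):
        for column in required:
            if row.get(column) is None:
                problems.append(
                    f"row {index}: required column {column} is missing or NULL"
                )
        for column, expected in types.items():
            value = row.get(column)
            if value is not None and not isinstance(value, expected):
                problems.append(
                    f"row {index}: {column} is {type(value).__name__}, "
                    f"expected {expected.__name__}"
                )
    return problems


# Hypothetical rows; the Thai title checks that non-ASCII text passes through.
sample_rows = [
    {"hn": "000012345", "pname": "นาย", "age_y": 42},
    {"hn": "000012346", "pname": None, "age_y": "17"},
]
issues = check_rows(sample_rows, required=["hn"], types={"hn": str, "age_y": int})
for issue in issues:
    print(issue)
```

Returning a list of problems keeps the range-style assertion ("between X and Y") separate from per-row checks, so a single run reports everything that is wrong with a result set.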
+ ### H3. Fix Integration Failures
+
+ When an integration test fails, investigate the real cause:
+
+ 1. **Read the failure output** — What did the test expect vs. what did it get?
+ 2. **Is the SQL wrong?** — Use `describe_table` to check the schema, fix the query
+ 3. **Is the transformer wrong?** — The SQL returned correct data but the code mangled it
+ 4. **Is the assertion wrong?** — The test had an unrealistic expectation (e.g., hardcoded a specific row count). Fix the assertion to be robust to real data variance
+ 5. **Is it a NULL/empty handling bug?** — Add the guard, don't skip the case
+
+ **Never fix an integration test failure by:**
+ - Mocking the response to make the assertion pass
+ - Skipping the test with `.skip` or `@skip`
+ - Loosening the assertion beyond what's technically correct (e.g., "it returned *something*" is not a valid assertion)
+
+ ### H4. Performance Sanity Check
+
+ While running integration tests, note any queries that take longer than a reasonable threshold (e.g., >2s for a LIMIT 10 query on a single table). These are often signs of:
+ - Missing indexes on filter columns
+ - Inefficient JOINs across large tables
+ - Missing LIMIT on an otherwise open-ended query
+ - N+1 query patterns
+
+ Flag these for user review (don't auto-fix index recommendations — that's a DBA decision).
+
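Reviewer note: the H4 threshold check can be as simple as timing each query call. The sketch below assumes the query is reachable as a Python callable; `timed_query` and the fake query functions are hypothetical, and the 2-second default only mirrors the example threshold in the text.

```python
import time


def timed_query(run_query, sql, threshold_s=2.0):
    """Run a query callable and report whether it exceeded the threshold.

    Returns (rows, elapsed_seconds, is_slow). Nothing is auto-fixed;
    slow queries are only flagged for later review.
    """
    start = time.perf_counter()
    rows = run_query(sql)
    elapsed = time.perf_counter() - start
    return rows, elapsed, elapsed > threshold_s


# Hypothetical stand-ins for a real query function.
def fast_query(sql):
    return [{"hn": "000012345"}]


def slow_query(sql):
    time.sleep(0.05)
    return []


rows, elapsed, is_slow = timed_query(fast_query, "SELECT hn FROM patient LIMIT 10")
print(f"fast: {elapsed:.4f}s slow={is_slow}")

_, elapsed_slow, flagged = timed_query(slow_query, "SELECT 1", threshold_s=0.01)
print(f"slow: {elapsed_slow:.4f}s slow={flagged}")
```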
  **Output Format:**
 
  After completing all phases, provide a summary report:
@@ -349,6 +468,32 @@ After completing all phases, provide a summary report:
  - [ ] .dockerignore covers sensitive files
  - [ ] Docker Compose: no hardcoded secrets, restart policies set, health checks defined
 
+ ### Database Compatibility
+ - [ ] SQL found: X statements (or "none — phase skipped")
+ - [ ] All SQL uses cross-compatible syntax (no MySQL-only or PostgreSQL-only forms)
+ - [ ] Data types are cross-compatible (no TINYINT, MEDIUMTEXT, BYTEA, UNSIGNED, etc.)
+ - [ ] No dialect branching in code
+ - [ ] Raw SQL in ORM context uses cross-compatible forms
+ - [ ] Migration files use cross-compatible DDL or ORM schema methods
+
+ ### SQL Statement Validation (against real schema)
+ - [ ] Validation source: <live DB MCP / knowledge base / schema files / NONE>
+ - [ ] Tables referenced: X (all verified to exist)
+ - [ ] Columns referenced: Y (all verified against describe_table)
+ - [ ] X hallucinated column references fixed
+ - [ ] Y hallucinated table references fixed
+ - [ ] Z statements validated via EXPLAIN / LIMIT 0 without error
+
+ ### Integration Testing (real database)
+ - [ ] bms-session-mcp-server available: yes/no (phase skipped if no)
+ - [ ] Integration test files found: X
+ - [ ] DB-touching features with integration coverage: Y/Z
+ - [ ] Missing coverage added: N new integration tests
+ - [ ] Integration tests executed: X passed, Y failed, Z fixed
+ - [ ] Masked fields (patient names, CIDs) handled correctly
+ - [ ] Thai character encoding preserved end-to-end
+ - [ ] Slow queries flagged for user review: N
+
  ### Summary
  Total issues found: X
  Total issues fixed: X
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "bms-speckit-plugin",
- "version": "6.3.1",
+ "version": "6.5.0",
  "description": "Chain-orchestrated development pipeline: /bms-speckit takes requirements and runs brainstorm → constitution → specify → plan → tasks → analyze → implement → verify with per-step error handling",
  "files": [
  ".claude-plugin/",
@@ -183,13 +183,22 @@ After the subagent completes, update tasks 1-8 as completed using TaskUpdate, th
  - Never spread or iterate a function's return value without verifying it returns the expected collection type (e.g., array not object, list not dict)
  - Use strict equality, add null/undefined/None guards for external data (API responses, DB results, config, user input)
  - Add unit tests that actually execute data transformation functions and verify the output type and shape
+ - **SQL validation:** before writing any SQL statement, verify the exact table and column names exist. Use `bms-session-mcp-server` tools (`list_tables`, `describe_table`) or `mcp__bms-knowledge-mcp__search_knowledge` (hosxp collection) to confirm every field reference. Never guess column names. After writing a query, test it with `query` tool using `EXPLAIN` or `LIMIT 0` to confirm it parses and executes.
  2. **INLINE QC** — immediately run: build, lint, ALL tests, security quick scan, UX check
- 3. **FIX** — fix every issue found, re-run checks
- 4. **COMMIT** only commit when build + lint + tests pass with zero errors
- 5. **NEXT**move to next task
+ 3. **INTEGRATION TEST (real database)** — for every task that touches database queries or data-dependent business logic, run an integration test that:
+ - Uses the real `bms-session-mcp-server` `query` tool to execute the feature's actual SQL against the real HOSxP database (not mocks)
+ - Flows the real result through the feature's actual code path (transformers, filters, aggregations) not just the raw query
+ - Asserts on concrete properties: row count is reasonable, required columns are populated, data types match the schema, masked fields are handled correctly (patient names, CIDs are pre-masked — do not flag as errors)
+ - Uses a small `LIMIT` to avoid pulling large result sets during testing
+ - Verifies edge cases with real data: empty results for narrow filters, NULL handling, encoding (Thai characters), date ranges
+ - If the task produced a UI component that renders DB data, also verify the component renders without error using a real query result as the prop
+ - **If the integration test fails:** investigate whether the bug is in the SQL, the transformer, or the assertion. Fix the actual cause. Do NOT relax the assertion to make it pass.
+ 4. **FIX** — fix every issue found in steps 2 and 3, re-run checks
+ 5. **COMMIT** — only commit when build + lint + unit tests + integration tests all pass with zero errors
+ 6. **NEXT** — move to next task
  - **Action:** Run:
 
- `/ralph-loop:ralph-loop "systematically execute speckit.implement via the Skill tool to complete every task defined in {TASKS_PATH} with strict adherence to specification requirements. IMPORTANT: apply rolling QC after EACH task — after implementing a task run build and fix build errors, run linter and fix lint errors, run ALL tests (not just new ones) and fix failures, check for hardcoded secrets and injection vulnerabilities in code you just wrote, verify UI code has actionable error messages and loading states. RUNTIME SAFETY: always add explicit return type annotations on data transformation functions, never spread or iterate a function return value without verifying it returns the expected collection type, use strict equality and null guards for external data, write tests that execute data transformers and verify output type and shape. Only commit when build plus lint plus tests all pass with zero errors then proceed to next task. Report progress to the user after each task: output [Task N/total] DONE — task_name. Do NOT batch QC at the end. Maintain atomic commits after each successful task with clear traceability, avoid requesting confirmation and proceed autonomously, once all tasks are implemented invoke speckit.analyze via the Skill tool to perform a full validation pass, automatically apply all recommended improvements or corrections, re-run all tests to confirm stability and zero regression, and only output <promise>FINISHED</promise> after every task is fully completed, validated, and aligned with production-grade quality standards" --completion-promise "FINISHED" --max-iterations 10`
+ `/ralph-loop:ralph-loop "systematically execute speckit.implement via the Skill tool to complete every task defined in {TASKS_PATH} with strict adherence to specification requirements. IMPORTANT: apply rolling QC after EACH task — after implementing a task run build and fix build errors, run linter and fix lint errors, run ALL tests (not just new ones) and fix failures, check for hardcoded secrets and injection vulnerabilities in code you just wrote, verify UI code has actionable error messages and loading states. RUNTIME SAFETY: always add explicit return type annotations on data transformation functions, never spread or iterate a function return value without verifying it returns the expected collection type, use strict equality and null guards for external data, write tests that execute data transformers and verify output type and shape. SQL VALIDATION: before writing any SQL statement verify exact table and column names exist via bms-session-mcp-server list_tables/describe_table or bms-knowledge-mcp search_knowledge with hosxp collection, never guess column names, after writing test each query with EXPLAIN or LIMIT 0 via the query tool to confirm it executes without error. INTEGRATION TESTING: for every task that touches database or data-dependent logic run an integration test using the real bms-session-mcp-server query tool to execute the feature's SQL against the real HOSxP database (not mocks), flow the result through the actual code path, assert on concrete properties (row count, columns populated, types match, masked fields handled), use small LIMIT to avoid large result sets. If integration test fails investigate the real cause and fix it — do not relax the assertion. Only commit when build plus lint plus tests all pass with zero errors then proceed to next task. Report progress to the user after each task: output [Task N/total] DONE — task_name. Do NOT batch QC at the end. Maintain atomic commits after each successful task with clear traceability, avoid requesting confirmation and proceed autonomously, once all tasks are implemented invoke speckit.analyze via the Skill tool to perform a full validation pass, automatically apply all recommended improvements or corrections, re-run all tests to confirm stability and zero regression, and only output <promise>FINISHED</promise> after every task is fully completed, validated, and aligned with production-grade quality standards" --completion-promise "FINISHED" --max-iterations 10`
 
  - **Done:** Update task 10 as completed. Output `[Step 10/12] DONE — all tasks implemented and verified`
 
@@ -205,7 +214,10 @@ After the subagent completes, update tasks 1-8 as completed using TaskUpdate, th
  - **D. Accessibility** — alt text, form labels, keyboard nav, heading hierarchy
  - **E. Integration check** — verify all components work together end-to-end
  - **F. Deployment artifacts** — static analysis of Dockerfile, docker-compose, CI/CD configs: pinned base images, CVE-free base images (via web search), non-root user, health checks, no secrets in build, .dockerignore coverage (skipped if no deployment files exist)
- - **G. Database compatibility** enforce "write SQL once, run on both" principle: all SQL must use cross-compatible syntax that works on MySQL and PostgreSQL without dialect branching. Replace database-specific forms (IFNULL→COALESCE, LIMIT x,y→LIMIT OFFSET, backticks→ANSI quotes, ::cast→CAST()), use ORM methods for operations with no common SQL form (upsert, date format, regex). No `if dialect` patterns allowed. (skipped if no SQL in project)
+ - **G. Database compatibility & SQL validation** two-part check:
+ - **G1-G5 (cross-compatibility):** enforce "write SQL once, run on both" — all SQL must use cross-compatible syntax that works on MySQL and PostgreSQL without dialect branching. Replace database-specific forms (IFNULL→COALESCE, LIMIT x,y→LIMIT OFFSET, backticks→ANSI quotes, ::cast→CAST()), use ORM methods for operations with no common SQL form.
+ - **G6 (SQL schema validation — MOST IMPORTANT):** validate every SQL statement in the codebase against the real database schema. Hallucinated column names are the #1 QC failure — LLMs frequently reference columns that don't exist. For each SQL statement: use `bms-session-mcp-server` (`list_tables`, `describe_table`, `query` with EXPLAIN/LIMIT 0) or `mcp__mysql__mysql_query` / `mcp__postgres__query` to verify every table and column reference exists. Fall back to `mcp__bms-knowledge-mcp__search_knowledge` (hosxp collection) or grep migration files if live DB is unavailable. Fix every hallucinated reference. (skipped if no SQL in project)
+ - **H. Integration testing (real database, end-to-end)** — run actual integration tests against the real HOSxP database via `bms-session-mcp-server`. For every DB-touching feature verify there's an integration test that: executes the feature's real SQL via the `query` tool, flows real data through the actual code path (transformers, filters, UI components), asserts on concrete properties (row count range, column presence, type correctness, masked field handling, Thai encoding). Create missing integration tests. Never fix failures by mocking responses, skipping tests, or loosening assertions. Flag slow queries for user review. (skipped if bms-session-mcp-server is not available)
  - The agent fixes everything it can. Major dependency updates are flagged for user review.
  - **Completion rule:** When the QC agent returns its report, proceed to Step 12 **unless** the report contains unfixed build errors, unfixed test failures, or unfixed critical security vulnerabilities. Informational findings, flagged-for-review items, and already-fixed issues do NOT block progression. If uncertain, proceed — the QC agent already fixed what it could.
  - **Post-action:** Commit all fixes and push. Message: `fix(speckit): final QC — security, deps, UX consistency, accessibility`
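
Reviewer note: the G1-G5 replacements listed in this hunk (IFNULL→COALESCE, LIMIT x,y→LIMIT OFFSET, backticks→ANSI quotes, ::cast→CAST()) can be spotted with a simple pattern scan. The sketch below is illustrative only: `find_dialect_forms` and its patterns are hypothetical, they only detect and do not rewrite, and because they are plain regular expressions they can misfire on text inside string literals or comments.

```python
import re

# (pattern, advice) pairs for the dialect-specific forms named in the hunk.
DIALECT_PATTERNS = [
    (re.compile(r"\bIFNULL\s*\(", re.IGNORECASE), "IFNULL(): use COALESCE()"),
    (re.compile(r"\bLIMIT\s+\d+\s*,\s*\d+", re.IGNORECASE),
     "LIMIT x,y: use LIMIT ... OFFSET ..."),
    (re.compile(r"`"), "backtick quoting: use ANSI quotes"),
    (re.compile(r"::\s*\w+"), "::cast: use CAST(... AS ...)"),
]


def find_dialect_forms(sql: str) -> list:
    """Return advice strings for every dialect-specific form found."""
    return [advice for pattern, advice in DIALECT_PATTERNS if pattern.search(sql)]


# Hypothetical statements for demonstration only.
mysql_style = "SELECT IFNULL(`fname`, '') FROM patient LIMIT 10, 20"
portable = "SELECT COALESCE(fname, '') FROM patient LIMIT 20 OFFSET 10"

print(find_dialect_forms(mysql_style))
print(find_dialect_forms(portable))
```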