sequant 2.1.1 → 2.1.2

@@ -758,21 +758,21 @@ echo "Size gate: $total_changes lines changed (threshold: $threshold), pkg_chang
 
  Run these checks directly (no sub-agents needed):
 
- ```bash
- # Type safety: check for 'any' additions
- any_count=$(git diff origin/main...HEAD | grep '^\+' | grep -v '^\+\+\+' | grep -cw 'any' || true)
+ **IMPORTANT:** Use the Grep tool (not bash `grep`) for pattern matching — bash grep uses BSD regex on macOS which is incompatible with some patterns below. The Grep tool uses ripgrep which works cross-platform.
 
+ ```bash
  # Deleted tests check
  deleted_tests=$(git diff origin/main...HEAD --name-only --diff-filter=D | grep -cE '\.(test|spec)\.' || true)
 
  # Scope: files changed count
  files_changed=$(git diff origin/main...HEAD --name-only | wc -l | tr -d ' ')
+ ```
 
- # Security scan (lightweight just check for obvious patterns in added lines)
- security_issues=$(git diff origin/main...HEAD | grep '^\+' | grep -v '^\+\+\+' | grep -ciE 'eval\(|innerHTML|dangerouslySetInnerHTML|exec\(|password.*=.*["']|secret.*=.*["']|api.?key.*=.*["']' || true)
+ For type safety and security scans, use the Grep tool instead of bash:
+ - **Type safety:** `Grep(pattern=":\\s*any[,;)\\]]|as any", path="<changed-files>")` on added lines
+ - **Security scan:** `Grep(pattern="eval\\(|innerHTML|dangerouslySetInnerHTML|password.*=.*[\"']|secret.*=.*[\"']", path="<changed-files>")` on added lines
 
- echo "Inline checks: any=$any_count, deleted_tests=$deleted_tests, files=$files_changed, security_issues=$security_issues"
- ```
+ Count results from the Grep tool output to get `any_count` and `security_issues`.
 
  **After inline checks, skip to the output template** (the sub-agent section below is not executed).
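
For readers without the Grep tool, the added-lines counts described in this hunk can be approximated in plain shell. The following is only a sketch: `count_added_matches`, `TYPE_SAFETY_PATTERN`, and `SECURITY_PATTERN` are hypothetical names that do not appear in sequant, and the patterns are rewritten in POSIX ERE form as an assumed equivalent of the ripgrep patterns above.

```bash
# Hypothetical helper (not part of sequant): reads a unified diff on stdin and
# prints how many added lines match the extended regular expression in $1.
count_added_matches() {
  # Keep '+' lines, drop the '+++ b/<file>' headers, then count matches.
  # 'grep -c' exits non-zero on a zero count, so '|| true' keeps 'set -e' scripts alive.
  grep '^+' | grep -v '^+++' | grep -cE -- "$1" || true
}

# Assumed POSIX-portable spellings of the two patterns:
# '[[:space:]]' replaces '\s', and ']' is listed first inside the bracket
# expression because POSIX grep treats a backslash there as a literal character.
TYPE_SAFETY_PATTERN=':[[:space:]]*any[],;)]|as any'
SECURITY_PATTERN="eval\\(|innerHTML|dangerouslySetInnerHTML|password.*=.*[\"']|secret.*=.*[\"']"
```

A possible invocation would be `git diff origin/main...HEAD | count_added_matches "$TYPE_SAFETY_PATTERN"`, which prints a number that could stand in for `any_count`. Like the removed bash version, this also skips any added line whose content itself begins with `++`.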
 
@@ -1359,39 +1359,20 @@ Before any READY_FOR_MERGE verdict, complete the adversarial thinking checklist:
 
  See [testing-requirements.md](references/testing-requirements.md) for edge case checklists.
 
- ### 5. Adversarial Self-Evaluation (REQUIRED)
+ ### 5. Risk Assessment (REQUIRED unless SMALL_DIFF)
 
- **Before issuing your verdict**, you MUST complete this adversarial self-evaluation to catch issues that automated quality checks miss.
-
- **Why this matters:** QA automation catches type issues, deleted tests, and scope creep - but misses:
- - Features that don't actually work as expected
- - Tests that pass but don't test the right things
- - Edge cases only apparent when actually using the feature
-
- **Answer these questions honestly:**
- 1. "Did the implementation actually work when I reviewed it, or am I assuming it works?"
- 2. "Do the tests actually test the feature's primary purpose, or just pass?"
- 3. "What's the most likely way this feature could break in production?"
- 4. "Am I giving a positive verdict because the code looks clean, or because I verified it works?"
- 5. "Are there 'design choices' I'm excusing that are actually bad practices?" (e.g., no version pinning, leaking secrets to unnecessary env vars, non-portable shell in example code, no input validation). Would I accept this in a code review from a junior developer?
+ **Before issuing your verdict**, state the implementation risks in 2-3 sentences.
 
  **Include this section in your output:**
 
  ```markdown
- ### Self-Evaluation
+ ### Risk Assessment
 
- - **Verified working:** [Yes/No - did you actually verify the feature works, or assume it does?]
- - **Test efficacy:** [High/Medium/Low - do tests catch the feature breaking?]
- - **Likely failure mode:** [What would most likely break this in production?]
- - **Verdict confidence:** [High/Medium/Low - explain any uncertainty]
+ - **Likely failure mode:** [How would this break in production? Be specific.]
+ - **Not tested:** [What gaps exist in test coverage for these changes?]
  ```
 
- **If any answer reveals concerns:**
- - Factor the concerns into your verdict
- - If significant, change verdict to `AC_NOT_MET` or `AC_MET_BUT_NOT_A_PLUS`
- - Document the concerns in the QA comment
-
- **Do NOT skip this self-evaluation.** Honest reflection catches issues that code review cannot.
+ **If either field reveals significant concerns**, factor them into your verdict. A serious failure mode with no test coverage should downgrade to `AC_MET_BUT_NOT_A_PLUS` or `AC_NOT_MET`.
 
  #### Skill Change Review (Conditional)
 
@@ -1402,7 +1383,7 @@ See [testing-requirements.md](references/testing-requirements.md) for edge case
  skills_changed=$(git diff main...HEAD --name-only | grep -E "^\.claude/skills/.*\.md$" | wc -l | xargs || true)
  ```
 
- **If skills_changed > 0, add these adversarial prompts:**
+ **If skills_changed > 0, add these verification prompts:**
 
  | Prompt | Why It Matters |
  |--------|----------------|
@@ -1985,14 +1966,14 @@ When the size gate determined `SMALL_DIFF=true`, use the **simplified output tem
  - [ ] **Code Review Findings** - Strengths, issues, suggestions
  - [ ] **Test Coverage Analysis** - Changed files with/without tests, critical paths flagged
  - [ ] **Anti-Pattern Detection** - Code patterns check (lightweight)
- - [ ] **Self-Evaluation Completed** - Adversarial self-evaluation section included
+ - [ ] **Risk Assessment** - Likely failure mode and coverage gaps stated
  - [ ] **Verdict** - One of: READY_FOR_MERGE, AC_MET_BUT_NOT_A_PLUS, NEEDS_VERIFICATION, AC_NOT_MET
  - [ ] **Documentation Check** - README/docs updated if feature adds new functionality
  - [ ] **Next Steps** - Clear, actionable recommendations
 
  ### Standard QA (Implementation Exists, `SMALL_DIFF=false`)
 
- - [ ] **Self-Evaluation Completed** - Adversarial self-evaluation section included in output
+ - [ ] **Risk Assessment** - Likely failure mode and coverage gaps stated in output
  - [ ] **AC Coverage** - Each AC item marked as MET, PARTIALLY_MET, NOT_MET, PENDING, or N/A
  - [ ] **Quality Plan Verification** - Included if quality plan exists (or marked N/A if no quality plan)
  - [ ] **CI Status** - Included if PR exists (or marked "No PR" / "No CI configured")
@@ -2008,7 +1989,7 @@ When the size gate determined `SMALL_DIFF=true`, use the **simplified output tem
  - [ ] **Execution Evidence** - Included if scripts/CLI modified (or marked N/A)
  - [ ] **Script Verification Override** - Included if scripts/CLI modified AND /verify was skipped (with justification and risk assessment)
  - [ ] **Skill Command Verification** - Included if `.claude/skills/**/*.md` modified (or marked N/A)
- - [ ] **Skill Change Review** - Skill-specific adversarial prompts included if skills changed
+ - [ ] **Skill Change Review** - Skill-specific verification prompts included if skills changed
  - [ ] **Smoke Test** - Included if workflow-affecting changes (skills, scripts, CLI), or marked "Not Required"
  - [ ] **CHANGELOG Verification** - User-facing changes have `[Unreleased]` entry (or marked N/A)
  - [ ] **Documentation Check** - README/docs updated if feature adds new functionality
@@ -2097,12 +2078,10 @@ When the size gate triggers simple fix mode, use this shorter template:
 
  ---
 
- ### Self-Evaluation
+ ### Risk Assessment
 
- - **Verified working:** [Yes/No]
- - **Test efficacy:** [High/Medium/Low]
- - **Likely failure mode:** [description]
- - **Verdict confidence:** [High/Medium/Low]
+ - **Likely failure mode:** [How would this break in production?]
+ - **Not tested:** [What gaps exist in test coverage?]
 
  ---
 
@@ -2387,12 +2366,10 @@ You MUST include these sections:
 
  ---
 
- ### Self-Evaluation
+ ### Risk Assessment
 
- - **Verified working:** [Yes/No - did you actually verify the feature works?]
- - **Test efficacy:** [High/Medium/Low - do tests catch the feature breaking?]
- - **Likely failure mode:** [What would most likely break this in production?]
- - **Verdict confidence:** [High/Medium/Low - explain any uncertainty]
+ - **Likely failure mode:** [How would this break in production? Be specific.]
+ - **Not tested:** [What gaps exist in test coverage for these changes?]
 
  ---