@intentsolutions/audit-harness 0.1.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,28 @@
1
+ # Changelog
2
+
3
+ All notable changes to `@intentsolutions/audit-harness` are documented here.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.1.0] — 2026-04-21
9
+
10
+ Initial release. Extracted from the `audit-tests` Claude Code skill v7.0.0 to enable in-repo enforcement without global skill installation.
11
+
12
+ ### Added
13
+
14
+ - `audit-harness verify` — SHA-256 hash verification for pinned policy files
15
+ - `audit-harness init` — initialize/re-init the `.harness-hash` manifest
16
+ - `audit-harness list` — list pinned files
17
+ - `audit-harness escape-scan` — detect AI escape patterns in a diff (coverage threshold lowering, test deletion, architecture bypasses, test skip markers)
18
+ - `audit-harness arch` — dispatch language-appropriate architecture checker (dependency-cruiser / import-linter / ArchUnit / deptrac / arch-go)
19
+ - `audit-harness bias` — count common test-bias patterns
20
+ - `audit-harness gherkin-lint` — advisory Gherkin quality check
21
+ - `audit-harness crap` — CRAP (Complexity × Coverage) scorer for Python, JS/TS, Go, Rust
22
+
23
+ ### Key design decisions
24
+
25
+ - **Scripts stay as shell/python.** Not a TypeScript port — battle-tested implementations, language-portable, minimal dependencies.
26
+ - **Thin Node CLI.** `bin/audit-harness.js` is a dispatcher only; all logic lives in `scripts/`.
27
+ - **Policy-driven thresholds.** `escape-scan.sh` reads floors from `tests/TESTING.md` in the target repo, not from the script source.
28
+ - **Zero runtime dependencies** beyond Node 18+, bash, and Python 3 (Python is needed only if you use the `crap` command).
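The `verify`, `init`, and `list` entries above all rest on one idea: record a SHA-256 digest per policy file, then compare against it later. `scripts/harness-hash.sh` is not shown in this diff, so the following is only a minimal Python sketch of that idea. The function names and the in-memory manifest shape are assumptions made for illustration, not the real `.harness-hash` format.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def pin(paths: list[Path]) -> dict[str, str]:
    """Map each path to its current digest (roughly what `init` records)."""
    return {str(p): sha256_of(p) for p in paths}


def changed_files(manifest: dict[str, str]) -> list[str]:
    """Return pinned paths that are missing or whose digest no longer matches
    (roughly what `verify` reports)."""
    changed = []
    for name, digest in manifest.items():
        p = Path(name)
        if not p.is_file() or sha256_of(p) != digest:
            changed.append(name)
    return changed
```

In this sketch, an edit to a pinned file shows up in `changed_files` until `pin` is run again, which mirrors the "re-init to accept the change" workflow.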
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Jeremy Longshore / Intent Solutions
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,135 @@
1
+ # @intentsolutions/audit-harness
2
+
3
+ Deterministic test-enforcement toolkit. Companion to the `audit-tests` and `implement-tests` Claude Code skills — but usable standalone in any repo that wants hash-pinned, escape-scanned, AI-proof quality gates.
4
+
5
+ ## What it is
6
+
7
+ A small CLI exposing 8 commands backed by 6 deterministic scripts:
8
+
9
+ | Command | Purpose |
10
+ |---|---|
11
+ | `audit-harness verify` | Verify hash-pinned artifacts haven't changed since the last `audit-harness init` |
12
+ | `audit-harness init` | Pin the current state of engineer-owned policy files |
13
+ | `audit-harness list` | Show pinned files |
14
+ | `audit-harness escape-scan --staged` | Detect AI attempts to lower test thresholds, delete tests, bypass architecture rules |
15
+ | `audit-harness arch` | Run language-appropriate architecture-rule checker (dependency-cruiser / import-linter / ArchUnit / deptrac / arch-go) |
16
+ | `audit-harness bias` | Count common test-bias patterns |
17
+ | `audit-harness gherkin-lint` | Advisory Gherkin quality check |
18
+ | `audit-harness crap` | CRAP (Complexity × Coverage) scorer — Python, Go, JS/TS, Rust |
19
+
20
+ ## Install
21
+
22
+ ```bash
23
+ pnpm add -D @intentsolutions/audit-harness
24
+ # or: npm install --save-dev @intentsolutions/audit-harness
25
+ # or: yarn add --dev @intentsolutions/audit-harness
26
+ ```
27
+
28
+ ## Quick usage
29
+
30
+ ### Pre-commit hook (`.husky/pre-commit`)
31
+
32
+ ```bash
33
+ #!/usr/bin/env sh
34
+ pnpm exec audit-harness escape-scan --staged
35
+ pnpm exec audit-harness verify
36
+ ```
37
+
38
+ ### CI workflow (`.github/workflows/ci.yml`)
39
+
40
+ ```yaml
41
+ containment:
42
+ runs-on: ubuntu-latest
43
+ steps:
44
+ - { uses: actions/checkout@v6, with: { fetch-depth: 0 } } # full history so origin/main..HEAD resolves
45
+ - uses: pnpm/action-setup@v5
46
+ - uses: actions/setup-node@v6
47
+ with: { node-version: '20', cache: 'pnpm' }
48
+ - run: pnpm install --frozen-lockfile
49
+ - run: pnpm exec audit-harness verify
50
+ - run: pnpm exec audit-harness escape-scan --range origin/main..HEAD
51
+ ```
52
+
53
+ ### Engineer workflow — change a policy threshold
54
+
55
+ ```bash
56
+ # 1. Edit tests/TESTING.md to change coverage.line from 80 to 75
57
+ # 2. Re-init to accept the change
58
+ pnpm exec audit-harness init
59
+ # 3. Commit the updated manifest alongside the policy change
60
+ git add tests/TESTING.md .harness-hash
61
+ git commit -m "chore(test): lower coverage floor to 75"
62
+ ```
63
+
64
+ ## The containment model
65
+
66
+ The harness enforces this rule: **policy changes must be conscious, not silent.**
67
+
68
+ Engineer-owned files (`tests/TESTING.md`, `features/*.feature`, `.dependency-cruiser.cjs`, `stryker.conf.json`, etc.) are hashed into a manifest. Any diff that changes their content without a fresh `audit-harness init` is caught by pre-commit / CI and **REFUSED**.
69
+
70
+ AI agents remain useful (they can read policy, they can implement within constraints). What they can't do is silently weaken the constraints. That's the entire design.
71
+
72
+ See `audit-tests/references/philosophy.md` in the companion skill for the full rationale.
73
+
74
+ ## The 7-layer testing taxonomy
75
+
76
+ This harness sits inside a larger framework:
77
+
78
+ ```
79
+ L7 Acceptance / RTM / Personas / Journeys ← WHAT are we proving?
80
+ L6 E2E / BDD / Visual regression ← User-level guarantees
81
+ L5 Perf / Security (SAST/DAST) / A11y / Chaos ← Non-functional
82
+ L4 Integration / Contract / Migration ← Infrastructure wiring
83
+ L3 Unit + Coverage + Mutation + Arch + CRAP ← Code-level correctness ← audit-harness lives here
84
+ L2 Static analysis / Lint / Types / Secrets ← Read-only scanning
85
+ L1 Git hooks / CI enforcement ← The cheapest gate ← audit-harness enables this
86
+ ```
87
+
88
+ The harness commands serve L1 (escape-scan in pre-commit + CI) and L3 (CRAP, architecture, bias, hash-pin).
89
+
90
+ ## Exit codes
91
+
92
+ Important for CI scripting:
93
+
94
+ | Exit | Command | Meaning |
95
+ |---|---|---|
96
+ | 0 | any | Clean |
97
+ | 1 | escape-scan | CHALLENGE — requires engineer-approved comment |
98
+ | 2 | verify | `HARNESS_TAMPERED` — pinned file changed |
99
+ | 2 | escape-scan | REFUSE — pipeline halted |
100
+ | 3 | verify | No manifest (fresh repo, not an error) |
101
+
102
+ ## Language support
103
+
104
+ Most scripts are language-agnostic (shell + regex). CRAP has per-language backends:
105
+
106
+ | Language | CRAP | Arch | Notes |
107
+ |---|---|---|---|
108
+ | Python | radon + coverage.py | import-linter | full support |
109
+ | JS/TS | complexity-report + c8 | dependency-cruiser | full support |
110
+ | Go | gocyclo + go test -cover | arch-go | full support |
111
+ | Rust | rust-code-analysis + tarpaulin | (custom) | coverage integration pending |
112
+ | Java/Kotlin | — | ArchUnit | via language-native tooling |
113
+ | .NET | — | ArchUnitNET | via language-native tooling |
114
+ | PHP | — | deptrac | via language-native tooling |
115
+
116
+ ## License
117
+
118
+ MIT — see [LICENSE](./LICENSE).
119
+
120
+ ## Related
121
+
122
+ - [`audit-tests` Claude Code skill](https://github.com/jeremylongshore/audit-harness#related) — diagnostic pipeline that uses this harness
123
+ - [`implement-tests` Claude Code skill](https://github.com/jeremylongshore/audit-harness#related) — filesystem-mutating installer that installs this harness as part of L1/L3 setup
124
+
125
+ ## Versioning
126
+
127
+ SemVer. Breaking changes to the CLI surface bump major; new commands bump minor; bug fixes bump patch.
128
+
129
+ ## Contributing
130
+
131
+ This is infrastructure code. Changes need to be conservative. Before opening a PR:
132
+
133
+ 1. Read `audit-tests/references/philosophy.md` (in the companion skill) to understand the escape-grammar design
134
+ 2. Run `bash scripts/escape-scan.sh --staged` on your own diff — yes, the harness tests itself
135
+ 3. Add test cases if you're adding a new pattern to escape-scan or a new command to the CLI
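The `crap` command in the README's command table scores methods with the formula stated in the docstring of `scripts/crap-score.py`: CRAP(m) = C(m)^2 * (1 - cov(m)/100)^3 + C(m). A small worked example may make the thresholds easier to reason about. The sample inputs below are made up for illustration.

```python
def crap(complexity: int, coverage_pct: float) -> float:
    """CRAP(m) = C(m)^2 * (1 - cov(m)/100)^3 + C(m), with coverage clamped to 0..100."""
    cov = max(0.0, min(100.0, coverage_pct)) / 100.0
    return complexity ** 2 * (1.0 - cov) ** 3 + complexity


# Hypothetical (complexity, coverage %) pairs.
for c, pct in [(10, 0.0), (10, 50.0), (10, 100.0), (5, 0.0)]:
    print(f"complexity={c:>2} coverage={pct:>5.1f}% -> CRAP={crap(c, pct):.2f}")
```

With no coverage the score is C^2 + C, so a method of complexity 5 lands exactly on the default production limit of 30 and a method of complexity 6 (score 42) exceeds it. Full coverage reduces the score to the complexity itself.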
@@ -0,0 +1,95 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * audit-harness CLI dispatcher
4
+ *
5
+ * Thin wrapper that invokes the canonical shell/python implementations in scripts/.
6
+ * Keeping the scripts as-is (not a TS port) for v0.x — they're battle-tested
7
+ * and language-portable. The CLI just adds discoverability + cross-platform-ish shell resolution.
8
+ */
9
+ const { spawn } = require('node:child_process');
10
+ const { resolve } = require('node:path');
11
+ const { existsSync } = require('node:fs');
12
+
13
+ const SCRIPTS = resolve(__dirname, '..', 'scripts');
14
+
15
+ const COMMANDS = {
16
+ 'verify': { script: 'harness-hash.sh', args: ['--verify'] },
17
+ 'init': { script: 'harness-hash.sh', args: ['--init'] },
18
+ 'list': { script: 'harness-hash.sh', args: ['--list'] },
19
+ 'escape-scan': { script: 'escape-scan.sh', args: [] },
20
+ 'arch': { script: 'arch-check.sh', args: [] },
21
+ 'bias': { script: 'bias-count.sh', args: [] },
22
+ 'gherkin-lint': { script: 'gherkin-lint.sh', args: [] },
23
+ 'crap': { script: 'crap-score.py', args: [] },
24
+ };
25
+
26
+ function usage() {
27
+ console.log(`audit-harness — deterministic test-enforcement toolkit
28
+
29
+ Usage:
30
+ audit-harness <command> [args...]
31
+
32
+ Commands:
33
+ verify Verify hash-pinned artifacts (exit 2 = HARNESS_TAMPERED)
34
+ init Initialize or re-init the .harness-hash manifest
35
+ list List currently pinned files
36
+ escape-scan <source> Scan a diff for escape attempts
37
+ source: --staged | --range A..B | - (stdin) | path.patch
38
+ arch Run architecture-rule checks (Wall 7)
39
+ bias Count test-bias patterns (tautology, smoke-only, etc.)
40
+ gherkin-lint Advisory Gherkin quality check
41
+ crap [args...] CRAP complexity × coverage scorer (multi-language)
42
+
43
+ Options:
44
+ --version, -v Print version
45
+ --help, -h Print this help
46
+
47
+ Exit codes (escape-scan):
48
+ 0 = clean
49
+ 1 = CHALLENGE (engineer-approved comment required)
50
+ 2 = REFUSE (pipeline halted)
51
+ `);
52
+ }
53
+
54
+ const [cmd, ...rest] = process.argv.slice(2);
55
+
56
+ if (!cmd || cmd === '--help' || cmd === '-h') {
57
+ usage();
58
+ process.exit(0);
59
+ }
60
+
61
+ if (cmd === '--version' || cmd === '-v') {
62
+ const pkg = require('../package.json');
63
+ console.log(pkg.version);
64
+ process.exit(0);
65
+ }
66
+
67
+ const entry = COMMANDS[cmd];
68
+ if (!entry) {
69
+ console.error(`audit-harness: unknown command '${cmd}'`);
70
+ usage();
71
+ process.exit(2);
72
+ }
73
+
74
+ const scriptPath = resolve(SCRIPTS, entry.script);
75
+ if (!existsSync(scriptPath)) {
76
+ console.error(`audit-harness: script not found at ${scriptPath}`);
77
+ process.exit(2);
78
+ }
79
+
80
+ const isPython = entry.script.endsWith('.py');
81
+ const interpreter = isPython ? 'python3' : 'bash';
82
+ const finalArgs = [scriptPath, ...entry.args, ...rest];
83
+
84
+ const child = spawn(interpreter, finalArgs, { stdio: 'inherit' });
85
+ child.on('exit', (code, signal) => {
86
+ if (signal) {
87
+ console.error(`audit-harness: ${entry.script} killed by ${signal}`);
88
+ process.exit(128);
89
+ }
90
+ process.exit(code ?? 0);
91
+ });
92
+ child.on('error', (err) => {
93
+ console.error(`audit-harness: failed to spawn ${interpreter}: ${err.message}`);
94
+ process.exit(2);
95
+ });
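The dispatcher above amounts to a table lookup plus an interpreter choice based on the script's extension. As a rough illustration, the same argv construction can be sketched in Python. `build_argv` is a hypothetical helper and `/pkg/scripts` is a placeholder path; neither is part of the package.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Command table copied from bin/audit-harness.js: name -> (script, fixed args).
COMMANDS: dict[str, tuple[str, list[str]]] = {
    "verify": ("harness-hash.sh", ["--verify"]),
    "init": ("harness-hash.sh", ["--init"]),
    "list": ("harness-hash.sh", ["--list"]),
    "escape-scan": ("escape-scan.sh", []),
    "arch": ("arch-check.sh", []),
    "bias": ("bias-count.sh", []),
    "gherkin-lint": ("gherkin-lint.sh", []),
    "crap": ("crap-score.py", []),
}


def build_argv(scripts_dir: PurePosixPath, cmd: str, rest: list[str]) -> list[str]:
    """Return the argv the dispatcher would spawn for `cmd` (hypothetical helper)."""
    script, fixed_args = COMMANDS[cmd]
    # .py scripts run under python3, everything else under bash.
    interpreter = "python3" if script.endswith(".py") else "bash"
    return [interpreter, str(scripts_dir / script), *fixed_args, *rest]


print(build_argv(PurePosixPath("/pkg/scripts"), "escape-scan", ["--staged"]))
```

Fixed arguments come before user arguments, so `audit-harness init --foo` would reach `harness-hash.sh` as `--init --foo`.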
package/package.json ADDED
@@ -0,0 +1,47 @@
1
+ {
2
+ "name": "@intentsolutions/audit-harness",
3
+ "version": "0.1.0",
4
+ "description": "Deterministic test-enforcement harness — escape-scan, hash-pinning, CRAP, architecture checks, bias detection, Gherkin lint. Companion to the audit-tests and implement-tests Claude Code skills.",
5
+ "license": "MIT",
6
+ "author": "Jeremy Longshore <jeremy@intentsolutions.io>",
7
+ "homepage": "https://github.com/jeremylongshore/audit-harness",
8
+ "repository": {
9
+ "type": "git",
10
+ "url": "git+https://github.com/jeremylongshore/audit-harness.git"
11
+ },
12
+ "bugs": {
13
+ "url": "https://github.com/jeremylongshore/audit-harness/issues"
14
+ },
15
+ "keywords": [
16
+ "testing",
17
+ "test-audit",
18
+ "hash-pin",
19
+ "escape-scan",
20
+ "crap",
21
+ "architecture",
22
+ "mutation-testing",
23
+ "coverage-gate",
24
+ "7-layer-testing",
25
+ "claude-code",
26
+ "ai-containment"
27
+ ],
28
+ "bin": {
29
+ "audit-harness": "./bin/audit-harness.js"
30
+ },
31
+ "files": [
32
+ "bin",
33
+ "scripts",
34
+ "README.md",
35
+ "LICENSE",
36
+ "CHANGELOG.md"
37
+ ],
38
+ "publishConfig": {
39
+ "access": "public"
40
+ },
41
+ "engines": {
42
+ "node": ">=18"
43
+ },
44
+ "scripts": {
45
+ "test": "bash scripts/escape-scan.sh --staged || true"
46
+ }
47
+ }
@@ -0,0 +1,143 @@
1
+ #!/usr/bin/env bash
2
+ # arch-check.sh — Wall 7 architecture-constraint dispatcher.
3
+ #
4
+ # Detects the primary language of the repo, invokes the appropriate
5
+ # dependency / architecture checker with the project's rule pack, and
6
+ # normalizes the exit code.
7
+ #
8
+ # Exit codes:
9
+ # 0 — all rules pass
10
+ # 1 — rule violations detected
11
+ # 2 — no tool installed / no config / unsupported language
12
+ #
13
+ # Usage:
14
+ # bash arch-check.sh # run from repo root
15
+ # bash arch-check.sh --json # emit JSON summary to stdout
16
+ # bash arch-check.sh --help
17
+
18
+ set -euo pipefail
19
+
20
+ ROOT="${ROOT:-$(pwd)}"
21
+ JSON_OUT=0
22
+ REPORT_DIR="${ROOT}/reports/arch"
23
+
24
+ usage() {
25
+ sed -n '2,16p' "$0"
26
+ exit 0
27
+ }
28
+
29
+ for arg in "$@"; do
30
+ case "$arg" in
31
+ --json) JSON_OUT=1 ;;
32
+ --help|-h) usage ;;
33
+ *) echo "arch-check: unknown flag $arg" >&2; exit 2 ;;
34
+ esac
35
+ done
36
+
37
+ mkdir -p "$REPORT_DIR"
38
+
39
+ emit_result() {
40
+ local tool="$1" status="$2" violations="$3" log="$4"
41
+ if [[ "$JSON_OUT" -eq 1 ]]; then
42
+ printf '{"tool":"%s","status":"%s","violations":%s,"log":"%s"}\n' \
43
+ "$tool" "$status" "$violations" "$log"
44
+ else
45
+ echo "arch-check: tool=$tool status=$status violations=$violations"
46
+ echo " log=$log"
47
+ fi
48
+ }
49
+
50
+ # 1. dependency-cruiser (JS/TS)
51
+ if [[ -f "${ROOT}/.dependency-cruiser.js" || -f "${ROOT}/.dependency-cruiser.cjs" ]]; then
52
+ LOG="${REPORT_DIR}/dep-cruiser.log"
53
+ if command -v npx >/dev/null 2>&1; then
54
+ if npx --no-install dependency-cruiser --validate --output-type err "${ROOT}/src" > "$LOG" 2>&1; then
55
+ emit_result dependency-cruiser pass 0 "$LOG"
56
+ exit 0
57
+ else
58
+ VIOL=$(grep -c "error" "$LOG" || true)
59
+ emit_result dependency-cruiser fail "$VIOL" "$LOG"
60
+ exit 1
61
+ fi
62
+ else
63
+ emit_result dependency-cruiser missing-tool 0 "$LOG"
64
+ exit 2
65
+ fi
66
+ fi
67
+
68
+ # 2. import-linter (Python)
69
+ if [[ -f "${ROOT}/.importlinter" ]] || grep -q "^\[tool\.importlinter\]" "${ROOT}/pyproject.toml" 2>/dev/null; then
70
+ LOG="${REPORT_DIR}/import-linter.log"
71
+ if command -v lint-imports >/dev/null 2>&1; then
72
+ if (cd "$ROOT" && lint-imports) > "$LOG" 2>&1; then
73
+ emit_result import-linter pass 0 "$LOG"
74
+ exit 0
75
+ else
76
+ VIOL=$(grep -c "BROKEN" "$LOG" || true)
77
+ emit_result import-linter fail "$VIOL" "$LOG"
78
+ exit 1
79
+ fi
80
+ else
81
+ emit_result import-linter missing-tool 0 "$LOG"
82
+ exit 2
83
+ fi
84
+ fi
85
+
86
+ # 3. deptrac (PHP)
87
+ if [[ -f "${ROOT}/deptrac.yaml" ]]; then
88
+ LOG="${REPORT_DIR}/deptrac.log"
89
+ if [[ -x "${ROOT}/vendor/bin/deptrac" ]]; then
90
+ if (cd "$ROOT" && vendor/bin/deptrac analyse --no-progress) > "$LOG" 2>&1; then
91
+ emit_result deptrac pass 0 "$LOG"
92
+ exit 0
93
+ else
94
+ VIOL=$(grep -Ec "violation" "$LOG" || true)
95
+ emit_result deptrac fail "$VIOL" "$LOG"
96
+ exit 1
97
+ fi
98
+ else
99
+ emit_result deptrac missing-tool 0 "$LOG"
100
+ exit 2
101
+ fi
102
+ fi
103
+
104
+ # 4. arch-go
105
+ if [[ -f "${ROOT}/arch-go.yml" ]]; then
106
+ LOG="${REPORT_DIR}/arch-go.log"
107
+ if command -v arch-go >/dev/null 2>&1; then
108
+ if (cd "$ROOT" && arch-go) > "$LOG" 2>&1; then
109
+ emit_result arch-go pass 0 "$LOG"
110
+ exit 0
111
+ else
112
+ VIOL=$(grep -c "Violation" "$LOG" || true)
113
+ emit_result arch-go fail "$VIOL" "$LOG"
114
+ exit 1
115
+ fi
116
+ else
117
+ emit_result arch-go missing-tool 0 "$LOG"
118
+ exit 2
119
+ fi
120
+ fi
121
+
122
+ # 5. ArchUnit (Java/Kotlin) — run via build tool
123
+ if [[ -f "${ROOT}/build.gradle" || -f "${ROOT}/build.gradle.kts" ]] && \
124
+ grep -rq "com.tngtech.archunit" "${ROOT}" --include="*.gradle*" 2>/dev/null; then
125
+ LOG="${REPORT_DIR}/archunit.log"
126
+ if [[ -x "${ROOT}/gradlew" ]]; then
127
+ if (cd "$ROOT" && ./gradlew test --tests '*ArchitectureTest*' --tests '*ArchTest*') > "$LOG" 2>&1; then
128
+ emit_result archunit pass 0 "$LOG"
129
+ exit 0
130
+ else
131
+ VIOL=$(grep -Ec "violated|FAILED" "$LOG" || true)
132
+ emit_result archunit fail "$VIOL" "$LOG"
133
+ exit 1
134
+ fi
135
+ else
136
+ emit_result archunit missing-tool 0 "$LOG"
137
+ exit 2
138
+ fi
139
+ fi
140
+
141
+ # No tool / config found
142
+ emit_result none not-configured 0 "$REPORT_DIR/none.log"
143
+ exit 2
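arch-check.sh probes for checker configs in a fixed order and runs the first one it finds, so a repo with several configs only ever runs one checker. The order can be summarized with a short Python sketch. It is simplified: the real script also accepts an import-linter section in `pyproject.toml`, and it only picks ArchUnit when a Gradle file mentions `com.tngtech.archunit`. `pick_tool` is a hypothetical name used for illustration.

```python
from __future__ import annotations

# (config file names, tool) in the order arch-check.sh tests them.
DETECTION_ORDER: list[tuple[tuple[str, ...], str]] = [
    ((".dependency-cruiser.js", ".dependency-cruiser.cjs"), "dependency-cruiser"),
    ((".importlinter",), "import-linter"),
    (("deptrac.yaml",), "deptrac"),
    (("arch-go.yml",), "arch-go"),
    (("build.gradle", "build.gradle.kts"), "archunit"),
]


def pick_tool(files_present: set[str]) -> str:
    """Return the first tool whose config file exists, or "none" (hypothetical helper)."""
    for config_names, tool in DETECTION_ORDER:
        if any(name in files_present for name in config_names):
            return tool
    return "none"


print(pick_tool({"arch-go.yml", ".importlinter"}))
print(pick_tool({"go.mod"}))
```

In the script, the `none` case exits with code 2, the same code used when a config exists but its tool is not installed.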
@@ -0,0 +1,88 @@
1
+ #!/usr/bin/env bash
2
+ # Quick test bias pattern counter
3
+ # Usage: bash bias-count.sh [test-directory]
4
+ #
5
+ # Scans test files for common bias patterns that weaken test suites.
6
+ # See references/test-quality-deep-audit.md Section 1 for full details.
7
+
8
+ set -euo pipefail
9
+
10
+ TEST_DIR="${1:-tests}"
11
+
12
+ if [ ! -d "$TEST_DIR" ]; then
13
+ echo "ERROR: Test directory '$TEST_DIR' not found" >&2
14
+ echo "Usage: bash bias-count.sh [test-directory]" >&2
15
+ exit 1
16
+ fi
17
+
18
+ echo "═══════════════════════════════════════"
19
+ echo " TEST BIAS SCAN — $TEST_DIR"
20
+ echo "═══════════════════════════════════════"
21
+ echo
22
+
23
+ TOTAL_BIAS=0
24
+
25
+ count_pattern() {
26
+ local label="$1"
27
+ local pattern="$2"
28
+ local count
29
+ count=$( { grep -rn "$pattern" "$TEST_DIR" 2>/dev/null || true; } | wc -l)
30
+ TOTAL_BIAS=$((TOTAL_BIAS + count))
31
+ printf " %-30s %d\n" "$label" "$count"
32
+ }
33
+
34
+ echo "BIAS PATTERNS"
35
+ echo "─────────────────────────────────────"
36
+ count_pattern "Smoke-only (is not None)" "is not None$"
37
+ count_pattern "Smoke-only (assertIsNotNone)" "assertIsNotNone"
38
+ count_pattern "Smoke-only (toBeDefined)" "toBeDefined()"
39
+ count_pattern "Tautological (sorted==sorted)" "sorted.*==.*sorted"
40
+ count_pattern "Tautological (len==len)" "len.*==.*len"
41
+ count_pattern "Symmetric input (0,0)" "(0, 0)"
42
+ count_pattern "Symmetric input (1,1)" "(1, 1)"
43
+ count_pattern "Symmetric input (100,100)" "(100, 100)"
44
+ count_pattern "Range-only assertion" "assert.*<=.*<="
45
+ count_pattern 'Substring check (in str)' '" in '
46
+ echo
47
+
48
+ # Count test functions
49
+ TEST_COUNT=$( { grep -rn "def test_\|it('\|it(\"\\|test('\|test(\"" "$TEST_DIR" 2>/dev/null || true; } | wc -l)
50
+
51
+ # Count total assertions
52
+ ASSERT_COUNT=$( { grep -rn "assert\b\|assertEqual\|expect(" "$TEST_DIR" 2>/dev/null || true; } | wc -l)
53
+
54
+ # Assertion density
55
+ if [ "$TEST_COUNT" -gt 0 ]; then
56
+ DENSITY=$(echo "scale=2; $ASSERT_COUNT / $TEST_COUNT" | bc)
57
+ else
58
+ DENSITY="0"
59
+ fi
60
+
61
+ # Per-100 bias rate
62
+ if [ "$TEST_COUNT" -gt 0 ]; then
63
+ RATE=$(echo "scale=1; $TOTAL_BIAS * 100 / $TEST_COUNT" | bc)
64
+ else
65
+ RATE="0"
66
+ fi
67
+
68
+ echo "SUMMARY"
69
+ echo "─────────────────────────────────────"
70
+ printf " %-30s %d\n" "Test functions" "$TEST_COUNT"
71
+ printf " %-30s %d\n" "Total assertions" "$ASSERT_COUNT"
72
+ printf " %-30s %s\n" "Assertion density" "$DENSITY per test"
73
+ printf " %-30s %d\n" "Bias patterns found" "$TOTAL_BIAS"
74
+ printf " %-30s %s\n" "Per-100-tests rate" "$RATE"
75
+ echo
76
+
77
+ # Grade
78
+ if [ "$(echo "$RATE <= 5" | bc)" -eq 1 ]; then
79
+ echo " Grade: LOW — no action needed"
80
+ elif [ "$(echo "$RATE <= 15" | bc)" -eq 1 ]; then
81
+ echo " Grade: MODERATE — review flagged tests"
82
+ elif [ "$(echo "$RATE <= 30" | bc)" -eq 1 ]; then
83
+ echo " Grade: HIGH — systematic remediation needed"
84
+ else
85
+ echo " Grade: CRITICAL — full rewrite of flagged tests"
86
+ fi
87
+ echo
88
+ echo "═══════════════════════════════════════"
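The grading step in bias-count.sh reduces to a per-100-tests rate and three cut-offs (5, 15, 30). The sketch below restates that arithmetic in Python. It works in tenths so that it matches `bc` with `scale=1`, which truncates rather than rounds. The function names are hypothetical.

```python
def bias_rate_tenths(total_bias: int, test_count: int) -> int:
    """Per-100-tests bias rate in tenths, truncated like `bc` with scale=1.

    The script reports 0 when no test functions are found.
    """
    if test_count <= 0:
        return 0
    return total_bias * 1000 // test_count


def grade(rate_tenths: int) -> str:
    """Map the rate onto the script's four grades (thresholds 5, 15, 30)."""
    if rate_tenths <= 50:
        return "LOW"
    if rate_tenths <= 150:
        return "MODERATE"
    if rate_tenths <= 300:
        return "HIGH"
    return "CRITICAL"


# Hypothetical counts: 7 bias hits across 60 test functions.
tenths = bias_rate_tenths(7, 60)
print(f"rate={tenths / 10:.1f} grade={grade(tenths)}")
```

Because every matching line counts as one hit, a single test with several flagged lines raises the rate more than once. The grade is therefore a rough signal, not a per-test verdict.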
@@ -0,0 +1,385 @@
1
+ #!/usr/bin/env python3
2
+ """CRAP (Change Risk Analyzer and Predictor) calculator — multi-language.
3
+
4
+ Reads language-native complexity and coverage outputs, computes
5
+ CRAP(m) = C(m)^2 * (1 - cov(m)/100)^3 + C(m)
6
+ for every method, ranks them, and emits CSV + JSON.
7
+
8
+ Walls 5 and 6 of the Seven Walls (audit-tests skill):
9
+ - Production code: no method CRAP > 30; project average <= 10.
10
+ - Test code: no method CRAP > 15.
11
+
12
+ Thresholds are configurable via --threshold-prod, --threshold-test, and --threshold-avg (overrides are logged to stderr).
13
+ """
14
+
15
+ from __future__ import annotations
16
+
17
+ import argparse
18
+ import csv
19
+ import json
20
+ import os
21
+ import shutil
22
+ import subprocess
23
+ import sys
24
+ from dataclasses import asdict, dataclass
25
+ from pathlib import Path
26
+
27
+
28
+ @dataclass
29
+ class MethodScore:
30
+ language: str
31
+ path: str
32
+ method: str
33
+ complexity: int
34
+ coverage: float
35
+ crap: float
36
+ kind: str # "src" or "test"
37
+
38
+
39
+ def crap(complexity: int, coverage_pct: float) -> float:
40
+ cov = max(0.0, min(100.0, coverage_pct)) / 100.0
41
+ return (complexity ** 2) * ((1.0 - cov) ** 3) + complexity
42
+
43
+
44
+ def detect_language(root: Path) -> str:
45
+ candidates = [
46
+ ("pyproject.toml", "python"),
47
+ ("setup.py", "python"),
48
+ ("package.json", "js"),
49
+ ("go.mod", "go"),
50
+ ("Cargo.toml", "rust"),
51
+ ("pom.xml", "java"),
52
+ ("build.gradle", "java"),
53
+ ("build.gradle.kts", "java"),
54
+ ("composer.json", "php"),
55
+ ("Gemfile", "ruby"),
56
+ ("*.csproj", "dotnet"),
57
+ ]
58
+ for pattern, lang in candidates:
59
+ if "*" in pattern:
60
+ if any(root.glob(pattern)):
61
+ return lang
62
+ elif (root / pattern).is_file():
63
+ return lang
64
+ return "unknown"
65
+
66
+
67
+ def which_or_none(cmd: str) -> str | None:
68
+ return shutil.which(cmd)
69
+
70
+
71
+ def run(cmd: list[str], cwd: Path) -> tuple[int, str, str]:
72
+ p = subprocess.run(cmd, cwd=str(cwd), capture_output=True, text=True, check=False)
73
+ return p.returncode, p.stdout, p.stderr
74
+
75
+
76
+ # ---------- Python: radon + coverage ----------
77
+
78
+ def score_python(root: Path, kind: str) -> list[MethodScore]:
79
+ if kind == "src":
80
+ candidates = ["src", "myapp", "app"]
81
+ scanned = [t for t in candidates if (root / t).is_dir()]
82
+ if not scanned:
83
+ test_dirs = {"tests", "test", "spec", "specs", "features", "__tests__"}
84
+ ignore = {".git", ".venv", "venv", "node_modules", "dist", "build", "target", ".tox", ".mypy_cache", ".pytest_cache", "reports", "__pycache__"}
85
+ scanned = [
86
+ p.name for p in root.iterdir()
87
+ if p.is_dir()
88
+ and not p.name.startswith(".")
89
+ and p.name not in ignore
90
+ and p.name not in test_dirs
91
+ and any(p.rglob("*.py"))
92
+ ]
93
+ else:
94
+ candidates = ["tests", "test"]
95
+ scanned = [t for t in candidates if (root / t).is_dir()]
96
+ if not scanned:
97
+ return []
98
+
99
+ if which_or_none("radon") is None:
100
+ print("[crap-score] radon not installed (pip install radon)", file=sys.stderr)
101
+ return []
102
+
103
+ complexity: dict[tuple[str, str], int] = {}
104
+ for tgt in scanned:
105
+ rc, out, err = run(["radon", "cc", "-s", "-a", "-j", tgt], root)
106
+ if rc != 0 or not out.strip():
107
+ continue
108
+ try:
109
+ data = json.loads(out)
110
+ except json.JSONDecodeError:
111
+ continue
112
+ for fpath, blocks in data.items():
113
+ for block in blocks:
114
+ name = block.get("name") or ""
115
+ method_key = (fpath, name)
116
+ complexity[method_key] = int(block.get("complexity", 0))
117
+
118
+ coverage: dict[str, float] = {}
119
+ cov_json = root / "coverage.json"
120
+ if not cov_json.is_file() and which_or_none("coverage"):
121
+ run(["coverage", "json", "-o", "coverage.json", "--fail-under=0"], root)
122
+ if cov_json.is_file():
123
+ try:
124
+ cov_data = json.loads(cov_json.read_text())
125
+ for fpath, summary in cov_data.get("files", {}).items():
126
+ pct = summary.get("summary", {}).get("percent_covered", 0.0)
127
+ coverage[fpath] = float(pct)
128
+ except (OSError, json.JSONDecodeError):
129
+ pass
130
+
131
+ scores: list[MethodScore] = []
132
+ for (fpath, name), c in complexity.items():
133
+ cov = coverage.get(fpath, 0.0)
134
+ scores.append(
135
+ MethodScore(
136
+ language="python",
137
+ path=fpath,
138
+ method=name,
139
+ complexity=c,
140
+ coverage=cov,
141
+ crap=crap(c, cov),
142
+ kind=kind,
143
+ )
144
+ )
145
+ return scores
146
+
147
+
148
+ # ---------- Go: gocyclo + go test -cover ----------
149
+
150
+ def score_go(root: Path, kind: str) -> list[MethodScore]:
151
+ if which_or_none("gocyclo") is None:
152
+ print("[crap-score] gocyclo not installed", file=sys.stderr)
153
+ return []
154
+
155
+ rc, out, _ = run(["gocyclo", *(["-ignore", "_test\\.go$"] if kind == "src" else []), "."], root)
156
+ complexity: list[tuple[str, str, int]] = []
157
+ for line in out.splitlines():
158
+ parts = line.strip().split()
159
+ if len(parts) < 4:
160
+ continue
161
+ try:
162
+ c = int(parts[0])
163
+ except ValueError:
164
+ continue
165
+ pkg = parts[1]
166
+ func = parts[2]
167
+ fpath = parts[3].split(":", 1)[0]
168
+ include = fpath.endswith("_test.go") if kind == "test" else not fpath.endswith("_test.go")
169
+ if include:
170
+ complexity.append((fpath, f"{pkg}.{func}", c))
171
+
172
+ coverage: dict[str, float] = {}
173
+ cov_out = root / "coverage.out"
174
+ if not cov_out.is_file() and which_or_none("go"):
175
+ run(["go", "test", "-coverprofile=coverage.out", "-covermode=atomic", "./..."], root)
176
+ if cov_out.is_file() and which_or_none("go"):
177
+ rc, out, _ = run(["go", "tool", "cover", "-func=coverage.out"], root)
178
+ for line in out.splitlines():
179
+ parts = line.split()
180
+ if len(parts) >= 3 and parts[-1].endswith("%"):
181
+ fpath = parts[0].split(":", 1)[0]
182
+ try:
183
+ pct = float(parts[-1].rstrip("%"))
184
+ except ValueError:
185
+ continue
186
+ coverage[fpath] = pct
187
+
188
+ scores: list[MethodScore] = []
189
+ for fpath, name, c in complexity:
190
+ cov = coverage.get(fpath, 0.0)
191
+ scores.append(
192
+ MethodScore(
193
+ language="go", path=fpath, method=name, complexity=c,
194
+ coverage=cov, crap=crap(c, cov), kind=kind,
195
+ )
196
+ )
197
+ return scores
198
+
199
+
200
+ # ---------- JS/TS: complexity-report + c8 ----------
201
+
202
+ def score_js(root: Path, kind: str) -> list[MethodScore]:
203
+ cr_bin = which_or_none("cr") or which_or_none("complexity-report")
204
+ if cr_bin is None:
205
+ print("[crap-score] complexity-report not installed (npm i -D complexity-report)", file=sys.stderr)
206
+ return []
207
+ target = "src" if kind == "src" else "tests"
208
+ if not (root / target).is_dir():
209
+ return []
210
+ rc, out, _ = run([cr_bin, "--format", "json", target], root)
211
+ if rc != 0 or not out.strip():
212
+ return []
213
+ try:
214
+ data = json.loads(out)
215
+ except json.JSONDecodeError:
216
+ return []
217
+
218
+ cov_path = root / "coverage" / "coverage-summary.json"
219
+ coverage: dict[str, float] = {}
220
+ if cov_path.is_file():
221
+ try:
222
+ cov_data = json.loads(cov_path.read_text())
223
+ for fpath, summary in cov_data.items():
224
+ if fpath == "total":
225
+ continue
226
+ lines_pct = summary.get("lines", {}).get("pct", 0.0)
227
+ coverage[fpath] = float(lines_pct)
228
+ except (OSError, json.JSONDecodeError):
229
+ pass
230
+
231
+ scores: list[MethodScore] = []
232
+ for report in data.get("reports", []):
233
+ fpath = report.get("path", "")
234
+ cov = coverage.get(fpath, 0.0)
235
+ for func in report.get("functions", []):
236
+ c = int(func.get("cyclomatic", 1))
237
+ scores.append(
238
+ MethodScore(
239
+ language="js", path=fpath, method=func.get("name", "<anon>"),
240
+ complexity=c, coverage=cov, crap=crap(c, cov), kind=kind,
241
+ )
242
+ )
243
+ return scores
244
+
245
+
246
+ # ---------- Rust: rust-code-analysis + tarpaulin ----------
247
+
248
+ def score_rust(root: Path, kind: str) -> list[MethodScore]:
249
+ rca = which_or_none("rust-code-analysis-cli")
250
+ if rca is None:
251
+ print("[crap-score] rust-code-analysis-cli not installed", file=sys.stderr)
252
+ return []
253
+ target = "src" if kind == "src" else "tests"
254
+ if not (root / target).is_dir():
255
+ return []
256
+ rc, out, _ = run([rca, "-m", "-O", "json", "-p", target], root)
257
+ if rc != 0 or not out.strip():
258
+ return []
259
+ complexity: list[tuple[str, str, int]] = []
260
+ for line in out.splitlines():
261
+ try:
262
+ rec = json.loads(line)
263
+ except json.JSONDecodeError:
264
+ continue
265
+ fpath = rec.get("name", "")
266
+ metrics = rec.get("metrics", {}).get("cyclomatic", {})
267
+ for func in rec.get("spaces", []):
268
+ c = int(func.get("metrics", {}).get("cyclomatic", {}).get("sum", 1))
269
+ complexity.append((fpath, func.get("name", "<anon>"), c))
270
+ scores: list[MethodScore] = []
271
+ for fpath, name, c in complexity:
272
+ scores.append(
273
+ MethodScore(
274
+ language="rust", path=fpath, method=name, complexity=c,
275
+ coverage=0.0, crap=crap(c, 0.0), kind=kind,
276
+ )
277
+ )
278
+ return scores
279
+
280
+
281
+ DISPATCH = {
282
+ "python": score_python,
283
+ "go": score_go,
284
+ "js": score_js,
285
+ "rust": score_rust,
286
+ }
287
+
288
+
289
+ # ---------- CLI ----------
290
+
291
+ def main() -> int:
292
+ ap = argparse.ArgumentParser(description=__doc__.splitlines()[0])
293
+ ap.add_argument("--root", default=".", help="Repository root")
294
+ ap.add_argument("--target", choices=["src", "test", "both"], default="both")
295
+ ap.add_argument("--format", choices=["csv", "json", "both"], default="both")
296
+ ap.add_argument("--out", default="reports/crap", help="Output directory")
297
+ ap.add_argument("--lang", default="auto",
298
+ help="Force language (python|go|js|rust); default auto-detect")
299
+ ap.add_argument("--threshold-prod", type=float, default=30.0,
300
+ help="Production CRAP max (default 30)")
301
+ ap.add_argument("--threshold-test", type=float, default=15.0,
302
+ help="Test CRAP max (default 15)")
303
+ ap.add_argument("--threshold-avg", type=float, default=10.0,
304
+ help="Project average max (default 10)")
305
+ args = ap.parse_args()
306
+
307
+ root = Path(args.root).resolve()
308
+ lang = args.lang if args.lang != "auto" else detect_language(root)
309
+ if lang not in DISPATCH:
310
+ print(f"[crap-score] unsupported language: {lang}", file=sys.stderr)
311
+ return 2
312
+
313
+ if any(t != d for t, d in (
314
+ (args.threshold_prod, 30.0),
315
+ (args.threshold_test, 15.0),
316
+ (args.threshold_avg, 10.0),
317
+ )):
318
+ print(f"[crap-score] threshold override: prod={args.threshold_prod} "
319
+ f"test={args.threshold_test} avg={args.threshold_avg}",
320
+ file=sys.stderr)
321
+
322
+ kinds = ["src", "test"] if args.target == "both" else [args.target]
323
+ all_scores: list[MethodScore] = []
324
+ for kind in kinds:
325
+ all_scores.extend(DISPATCH[lang](root, kind))
326
+
327
+ out_dir = root / args.out
328
+ out_dir.mkdir(parents=True, exist_ok=True)
329
+
330
+ if args.format in ("csv", "both"):
331
+ for kind in kinds:
332
+ ranked = sorted(
333
+ [s for s in all_scores if s.kind == kind],
334
+ key=lambda s: s.crap, reverse=True,
335
+ )
336
+ csv_path = out_dir / f"crap-{kind}.csv"
337
+ with csv_path.open("w", newline="") as fh:
338
+ w = csv.writer(fh)
339
+ w.writerow(["rank", "crap", "complexity", "coverage_pct", "path", "method"])
340
+ for i, s in enumerate(ranked, 1):
341
+ w.writerow([i, f"{s.crap:.2f}", s.complexity,
342
+ f"{s.coverage:.1f}", s.path, s.method])
343
+
344
+ src_scores = [s for s in all_scores if s.kind == "src"]
345
+ test_scores = [s for s in all_scores if s.kind == "test"]
346
+ prod_max = max((s.crap for s in src_scores), default=0.0)
347
+ test_max = max((s.crap for s in test_scores), default=0.0)
348
+ prod_avg = (sum(s.crap for s in src_scores) / len(src_scores)) if src_scores else 0.0
349
+
350
+ prod_blockers = [asdict(s) for s in src_scores if s.crap > args.threshold_prod]
351
+ test_blockers = [asdict(s) for s in test_scores if s.crap > args.threshold_test]
352
+ avg_fail = prod_avg > args.threshold_avg
353
+
354
+ pass_ = not (prod_blockers or test_blockers or avg_fail)
355
+
356
+ summary = {
357
+ "language": lang,
358
+ "thresholds": {
359
+ "production_max": args.threshold_prod,
360
+ "test_max": args.threshold_test,
361
+ "project_avg_max": args.threshold_avg,
362
+ },
363
+ "production": {
364
+ "methods_scored": len(src_scores),
365
+ "max_crap": round(prod_max, 2),
366
+ "avg_crap": round(prod_avg, 2),
367
+ "blockers": prod_blockers,
368
+ },
369
+ "test": {
370
+ "methods_scored": len(test_scores),
371
+ "max_crap": round(test_max, 2),
372
+ "blockers": test_blockers,
373
+ },
374
+ "pass": pass_,
375
+ }
376
+
377
+ if args.format in ("json", "both"):
378
+ (out_dir / "summary.json").write_text(json.dumps(summary, indent=2))
379
+
380
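+ # Only advertise summary.json when it was actually written (--format json|both).
+ summary_path = str(out_dir / "summary.json") if args.format in ("json", "both") else None
+ print(json.dumps({"pass": pass_, "summary_path": summary_path}))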
381
+ return 0 if pass_ else 1
382
+
383
+
384
+ if __name__ == "__main__":
385
+ sys.exit(main())
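Reviewer note: the `crap()` helper used by the scorers is defined outside this excerpt. The sketch below assumes it follows the standard CRAP formula (complexity squared times the cube of the uncovered fraction, plus complexity), and `gate` is a hypothetical, simplified stand-in for the production-side threshold logic in `main()`. It illustrates why the Rust path, which hardcodes `coverage=0.0`, turns any function with cyclomatic complexity of 6 or more into a blocker at the default `--threshold-prod` of 30.

```python
def crap(complexity: int, coverage_pct: float) -> float:
    """Standard CRAP formula (assumed): comp^2 * (1 - cov)^3 + comp."""
    uncovered = 1.0 - coverage_pct / 100.0
    return complexity ** 2 * uncovered ** 3 + complexity


def gate(scores, prod_max=30.0, avg_max=10.0):
    """Hypothetical gate: pass when no score exceeds prod_max and the mean is within avg_max."""
    if not scores:
        return True
    avg = sum(scores) / len(scores)
    return all(s <= prod_max for s in scores) and avg <= avg_max


# With coverage pinned at 0.0 the score reduces to c^2 + c.
rust_scores = [crap(c, 0.0) for c in (1, 2, 6)]
print(rust_scores)
print(gate(rust_scores))
```

Under that assumption, complexity 5 scores exactly 30 and passes the strict `>` comparison, while complexity 6 scores 42 and fails, regardless of how well the function is actually tested.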
@@ -0,0 +1,171 @@
1
+ #!/usr/bin/env bash
2
+ # escape-scan.sh — detect AI escape attempts in a proposed diff.
3
+ #
4
+ # Scans a unified diff (from git or a patch file) for patterns that indicate
5
+ # the AI is trying to lower a wall instead of meeting the bar.
6
+ #
7
+ # Severity grammar:
8
+ # FLAG → logged, does not halt (printed on stderr)
9
+ # CHALLENGE → require engineer-approved reason (exit 1)
10
+ # REFUSE → halt the pipeline (exit 2)
11
+ #
12
+ # Exit codes:
13
+ # 0 — clean
14
+ # 1 — CHALLENGE (at least one must-challenge pattern matched)
15
+ # 2 — REFUSE (at least one refuse pattern matched, or hash mismatch)
16
+ #
17
+ # Usage:
18
+ # git diff | bash escape-scan.sh -
19
+ # bash escape-scan.sh path/to/change.patch
20
+ # bash escape-scan.sh --staged # git diff --cached
21
+ # bash escape-scan.sh --range HEAD~1..HEAD
22
+
23
+ set -euo pipefail
24
+
25
+ DIFF_SRC=""
26
+ VERIFY_HASH=1
27
+ ROOT="${ROOT:-$(pwd)}"
28
+ HASH_SCRIPT="$(dirname "$0")/harness-hash.sh"
29
+
30
+ if [[ "$#" -eq 0 ]]; then
31
+ echo "escape-scan: pass a diff source (- for stdin, --staged, --range, or a patch file)" >&2
32
+ exit 2
33
+ fi
34
+
35
+ case "$1" in
36
+ -) DIFF_SRC=$(mktemp); cat > "$DIFF_SRC" ;; # buffer stdin: the diff is read several times below
37
+ --staged) DIFF_SRC=$(mktemp); git diff --cached > "$DIFF_SRC" ;;
38
+ --range) DIFF_SRC=$(mktemp); git diff "$2" > "$DIFF_SRC"; shift ;;
39
+ --no-hash) VERIFY_HASH=0; shift; DIFF_SRC="$1" ;;
40
+ --help|-h)
41
+ sed -n '2,22p' "$0"; exit 0 ;;
42
+ *) DIFF_SRC="$1" ;;
43
+ esac
44
+
45
+ if [[ ! -r "$DIFF_SRC" ]]; then
46
+ echo "escape-scan: cannot read $DIFF_SRC" >&2
47
+ exit 2
48
+ fi
49
+
50
+ REFUSE=0
51
+ CHALLENGE=0
52
+ FLAG=0
53
+
54
+ # --- Load floor thresholds from tests/TESTING.md (fallback to defaults) ---
55
+ # Reads canonical thresholds so audits enforce the repo's policy, not a
56
+ # hardcoded script-level guess. Format expected in TESTING.md (policy section):
57
+ # coverage.line: 80
58
+ # coverage.branch: 70
59
+ # mutation.kill_rate: 70
60
+ COVERAGE_LINE_FLOOR=80
61
+ COVERAGE_BRANCH_FLOOR=70
62
+ MUTATION_FLOOR=70
63
+ TESTING_MD="$ROOT/tests/TESTING.md"
64
+ if [[ -f "$TESTING_MD" ]]; then
65
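+ # "|| true": under set -e with pipefail, a missing key would otherwise abort the script.
+ v=$(grep -Ei '^[[:space:]]*coverage\.line[[:space:]]*:' "$TESTING_MD" | head -1 | sed -E 's/.*:[[:space:]]*([0-9]+).*/\1/' || true)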
66
+ [[ -n "$v" ]] && COVERAGE_LINE_FLOOR="$v"
67
+ v=$(grep -Ei '^[[:space:]]*coverage\.branch[[:space:]]*:' "$TESTING_MD" | head -1 | sed -E 's/.*:[[:space:]]*([0-9]+).*/\1/' || true)
68
+ [[ -n "$v" ]] && COVERAGE_BRANCH_FLOOR="$v"
69
+ v=$(grep -Ei '^[[:space:]]*mutation\.kill_rate[[:space:]]*:' "$TESTING_MD" | head -1 | sed -E 's/.*:[[:space:]]*([0-9]+).*/\1/' || true)
70
+ [[ -n "$v" ]] && MUTATION_FLOOR="$v"
71
+ fi
72
+
73
+ # Collect only added lines (prefix + but not +++)
74
+ added_lines=$(grep -E '^\+[^+]' "$DIFF_SRC" || true)
75
+ file_headers=$(grep -E '^\+\+\+ ' "$DIFF_SRC" || true)
76
+
77
+ note() {
78
+ local severity="$1" msg="$2"
79
+ echo "[$severity] $msg" >&2
80
+ case "$severity" in
81
+ REFUSE) REFUSE=$((REFUSE + 1)) ;;
82
+ CHALLENGE) CHALLENGE=$((CHALLENGE + 1)) ;;
83
+ FLAG) FLAG=$((FLAG + 1)) ;;
84
+ esac
85
+ }
86
+
87
+ # --- REFUSE: coverage threshold edits ---
88
+ # Floor is policy-driven (tests/TESTING.md coverage.line). Any explicit
89
+ # threshold lower than the floor is an escape attempt.
90
+ check_below_floor() {
91
+ local line="$1" floor="$2"
92
+ local n
93
+ n=$(printf '%s\n' "$line" | grep -oE '[0-9]+' | head -1)
94
+ [[ -n "$n" ]] && [[ "$n" -lt "$floor" ]]
95
+ }
96
+ while IFS= read -r line; do
97
+ if [[ "$line" =~ fail_under[[:space:]]*=[[:space:]]*[0-9] ]] || [[ "$line" =~ --cov-fail-under=[0-9] ]]; then
98
+ if check_below_floor "$line" "$COVERAGE_LINE_FLOOR"; then
99
+ note REFUSE "coverage fail_under lowered below policy floor ($COVERAGE_LINE_FLOOR) — escape attempt"
100
+ fi
101
+ fi
102
+ if [[ "$line" =~ \"(branches|lines|functions|statements)\"[[:space:]]*:[[:space:]]*[0-9] ]]; then
103
+ if check_below_floor "$line" "$COVERAGE_LINE_FLOOR"; then
104
+ note REFUSE "Jest/c8 coverageThreshold lowered below policy floor ($COVERAGE_LINE_FLOOR) — escape attempt"
105
+ fi
106
+ fi
107
+ done <<< "$added_lines"
108
+ if echo "$added_lines" | grep -Eq 'coverageThreshold[[:space:]]*:[[:space:]]*0'; then
109
+ note REFUSE "coverageThreshold set to 0 (escape attempt)"
110
+ fi
111
+ if echo "$added_lines" | grep -Eq 'minimum[[:space:]]*=[[:space:]]*0\.[0-7]'; then
112
+ note REFUSE "JaCoCo minimum lowered (escape attempt)"
113
+ fi
114
+
115
+ # --- REFUSE: architecture bypasses ---
116
+ if echo "$added_lines" | grep -Eq 'depcruise-disable|@ArchIgnore|skip_violations|ignore_imports[[:space:]]*=|severity[[:space:]]*:[[:space:]]*"warn"'; then
117
+ note REFUSE "architecture rule bypass (depcruise-disable / @ArchIgnore / skip_violations / ignore_imports / severity downgrade)"
118
+ fi
119
+
120
+ # --- REFUSE: wholesale test deletion (file headers only) ---
121
+ # Detect deleted test files with no compensating additions
122
+ deleted_tests=$(grep -E '^--- a/.*test.*|^--- a/.*spec.*' "$DIFF_SRC" | grep -v 'test.*\.md$' || true)
123
+ added_tests=$(echo "$file_headers" | grep -E '\+\+\+ b/.*test.*|\+\+\+ b/.*spec.*' || true)
124
+ if [[ -n "$deleted_tests" && -z "$added_tests" ]]; then
125
+ note REFUSE "test file(s) deleted without compensating additions"
126
+ fi
127
+
128
+ # --- REFUSE: .feature file mutation (hash check) ---
129
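+ # The hash script is run via "bash", so it only needs to be readable; requiring -x
+ # would silently skip the check if the executable bit were lost.
+ if [[ "$VERIFY_HASH" -eq 1 && -f "$ROOT/.harness-hash" && -r "$HASH_SCRIPT" ]]; then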
130
+ if ! (cd "$ROOT" && bash "$HASH_SCRIPT" --verify >/dev/null 2>&1); then
131
+ note REFUSE "HARNESS_TAMPERED — pinned .feature or rule-config file changed"
132
+ fi
133
+ fi
134
+ # Also REFUSE if the diff itself touches .feature files
135
+ if echo "$file_headers" | grep -Eq '\+\+\+ b/.*\.feature'; then
136
+ note REFUSE ".feature file modified (human-owned artifact)"
137
+ fi
138
+
139
+ # --- CHALLENGE: test skip markers ---
140
+ if echo "$added_lines" | grep -Eq '@pytest\.mark\.skip|\.skip\(|\.only\(|@Ignore\b|@Disabled\b|@SkipTest\b'; then
141
+ note CHALLENGE "test skip marker added (requires engineer-approved reason)"
142
+ fi
143
+
144
+ # --- CHALLENGE: mutation bypass markers ---
145
+ if echo "$added_lines" | grep -Eq 'pragma:[[:space:]]*no[[:space:]]*mutate|Stryker[[:space:]]*disable|@DoNotMutate'; then
146
+ note CHALLENGE "mutation bypass marker added"
147
+ fi
148
+
149
+ # --- CHALLENGE: assertion weakening (diff-aware) ---
150
+ # Heuristic on added lines only (removed/added pairs are not compared):
151
+ # a new line containing assertTrue(True) or assertEquals(true, true).
152
+ if echo "$added_lines" | grep -Eq 'assertTrue\(True\)|assertEquals\(true,[[:space:]]*true\)'; then
153
+ note CHALLENGE "trivially-true assertion added (assertTrue(True) equivalent)"
154
+ fi
155
+
156
+ # --- FLAG: weak-assertion patterns (informational) ---
157
+ if echo "$added_lines" | grep -Eq 'toBeDefined\(\)|is not None'; then
158
+ note FLAG "smoke-only assertion pattern (consider tightening)"
159
+ fi
160
+
161
+ # --- Summary & exit ---
162
+ echo "escape-scan: REFUSE=$REFUSE CHALLENGE=$CHALLENGE FLAG=$FLAG"
163
+ if [[ "$REFUSE" -gt 0 ]]; then
164
+ echo "escape-scan: pipeline halted (REFUSE)" >&2
165
+ exit 2
166
+ fi
167
+ if [[ "$CHALLENGE" -gt 0 ]]; then
168
+ echo "escape-scan: pipeline needs engineer approval (CHALLENGE)" >&2
169
+ exit 1
170
+ fi
171
+ exit 0
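Reviewer note: the policy-driven floor lookup in `escape-scan.sh` can be hard to reason about in `grep`/`sed` form. The following is a minimal Python sketch of the same idea, not the script's implementation: `load_floors` and `below_floor` are hypothetical names, the defaults mirror the ones in the script, and the "first number on the line" heuristic is copied from `check_below_floor`.

```python
import re

# Defaults mirror the fallbacks in escape-scan.sh.
DEFAULT_FLOORS = {"coverage.line": 80, "coverage.branch": 70, "mutation.kill_rate": 70}


def load_floors(testing_md_text: str) -> dict:
    """Read 'key: N' policy lines, falling back to the defaults for missing keys."""
    floors = dict(DEFAULT_FLOORS)
    for key in DEFAULT_FLOORS:
        pattern = re.compile(
            rf"^\s*{re.escape(key)}\s*:\s*(\d+)", re.IGNORECASE | re.MULTILINE
        )
        match = pattern.search(testing_md_text)
        if match:
            floors[key] = int(match.group(1))
    return floors


def below_floor(added_line: str, floor: int) -> bool:
    """True when the first integer on the line is lower than the floor."""
    match = re.search(r"\d+", added_line)
    return bool(match) and int(match.group()) < floor


floors = load_floors("## Policy\ncoverage.line: 90\nmutation.kill_rate: 60\n")
print(floors)
print(below_floor("--cov-fail-under=50", floors["coverage.line"]))
```

A missing key simply keeps its default here, which is the behavior the shell version is meant to have as well.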
@@ -0,0 +1,111 @@
1
+ #!/usr/bin/env bash
2
+ # gherkin-lint.sh — Advisory Gherkin quality check for Wall 1.
3
+ #
4
+ # If gherkin-lint is installed (npm i -g gherkin-lint) it is used. Otherwise
5
+ # falls back to awk-based rubric checks for imperative verbs, CSS selectors
6
+ # in steps, missing Background, and overlong scenarios.
7
+ #
8
+ # Non-blocking by default (exit 0 on warnings). Use --strict to turn warnings
9
+ # into failures.
10
+ #
11
+ # Usage:
12
+ # bash gherkin-lint.sh [--path features/] [--strict]
13
+
14
+ set -euo pipefail
15
+
16
+ PATH_ARG="features/"
17
+ STRICT=0
18
+
19
+ while [[ $# -gt 0 ]]; do
20
+ case "$1" in
21
+ --path) PATH_ARG="$2"; shift 2 ;;
22
+ --strict) STRICT=1; shift ;;
23
+ --help|-h)
24
+ sed -n '2,12p' "$0"; exit 0 ;;
25
+ *) echo "gherkin-lint: unknown flag $1" >&2; exit 2 ;;
26
+ esac
27
+ done
28
+
29
+ if [[ ! -d "$PATH_ARG" ]]; then
30
+ echo "gherkin-lint: path not found: $PATH_ARG" >&2
31
+ exit 2
32
+ fi
33
+
34
+ WARN_COUNT=0
35
+ ERROR_COUNT=0
36
+
37
+ warn() { echo "WARN $1:$2 $3"; WARN_COUNT=$((WARN_COUNT + 1)); }
38
+ err() { echo "ERROR $1:$2 $3"; ERROR_COUNT=$((ERROR_COUNT + 1)); }
39
+
40
+ # 1. Prefer official gherkin-lint if available
41
+ if command -v gherkin-lint >/dev/null 2>&1; then
42
+ echo "gherkin-lint: using installed linter"
43
+ if ! gherkin-lint "$PATH_ARG"; then
44
+ ERROR_COUNT=1
45
+ fi
46
+ else
47
+ echo "gherkin-lint: falling back to awk rubric (install gherkin-lint for full rules)"
48
+
49
+ while IFS= read -r -d '' feature; do
50
+ # Imperative verbs / CSS selectors in steps (declarative warning)
51
+ awk -v file="$feature" '
52
+ /^[[:space:]]*(Given|When|Then|And|But)/ {
53
+ line = $0
54
+ if (line ~ /click|type|fill[ _]in|press|select.*from[ _]dropdown/) {
55
+ printf "WARN %s:%d imperative verb in step (prefer declarative)\n", file, NR
56
+ }
57
+ if (line ~ /#[a-zA-Z][-a-zA-Z0-9_]*|\.[a-zA-Z][-a-zA-Z0-9_]*[[:space:]]|xpath/) {
58
+ printf "WARN %s:%d CSS selector / xpath in step (prefer business language)\n", file, NR
59
+ }
60
+ }
61
+ ' "$feature"
62
+
63
+ # Scenario length (> 10 steps)
64
+ awk -v file="$feature" '
65
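+ # Report the previous scenario before starting a new one, so back-to-back
+ # scenarios (no blank line between them) are still checked.
+ /^[[:space:]]*Scenario/ { if (sc && steps > 10) printf "WARN %s:%d scenario has %d steps (>10 is too long)\n", file, sc, steps; sc = NR; steps = 0; sn = $0; next }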
66
+ /^[[:space:]]*(Given|When|Then|And|But)/ { if (sc) steps++ }
67
+ /^[[:space:]]*Scenario|^[[:space:]]*Feature|^$/ {
68
+ if (sc && steps > 10) {
69
+ printf "WARN %s:%d scenario has %d steps (>10 is too long)\n", file, sc, steps
70
+ }
71
+ if (NR != sc) { sc = 0; steps = 0 }
72
+ }
73
+ END {
74
+ if (sc && steps > 10) {
75
+ printf "WARN %s:%d scenario has %d steps (>10 is too long)\n", file, sc, steps
76
+ }
77
+ }
78
+ ' "$feature"
79
+
80
+ # Repeated Givens without Background (3+ identical Given lines)
81
+ dupe=$(awk '/^[[:space:]]*Given/ { print }' "$feature" | sort | uniq -c | awk '$1 >= 3 { print }')
82
+ if [[ -n "$dupe" ]] && ! grep -q "^[[:space:]]*Background:" "$feature"; then
83
+ warn "$feature" 0 "repeated Given lines without Background block"
84
+ fi
85
+
86
+ # "And" at scenario start (grammar error)
87
+ awk -v file="$feature" '
88
+ BEGIN { prev_blank = 1 }
89
+ /^[[:space:]]*$/ { prev_blank = 1; next }
90
+ /^[[:space:]]*Scenario/ { in_scenario = 1; step_count = 0; next }
91
+ /^[[:space:]]*(Given|When|Then|And|But)/ {
92
+ if (in_scenario && step_count == 0 && /^[[:space:]]*And/) {
93
+ printf "ERROR %s:%d scenario starts with And (use Given/When/Then)\n", file, NR
94
+ }
95
+ step_count++
96
+ }
97
+ ' "$feature"
98
+
99
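+ done < <(find "$PATH_ARG" -name "*.feature" -print0) > "${RUBRIC_OUT:=$(mktemp)}"
+ # The awk checks print findings directly, so recount from the captured output;
+ # otherwise the summary and --strict would ignore them.
+ cat "$RUBRIC_OUT"
+ WARN_COUNT=$(grep -c '^WARN' "$RUBRIC_OUT" || true)
+ ERROR_COUNT=$(grep -c '^ERROR' "$RUBRIC_OUT" || true)
+ rm -f "$RUBRIC_OUT"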
100
+ fi
101
+
102
+ echo ""
103
+ echo "gherkin-lint summary: $WARN_COUNT warning(s), $ERROR_COUNT error(s)"
104
+
105
+ if [[ "$ERROR_COUNT" -gt 0 ]]; then
106
+ exit 1
107
+ fi
108
+ if [[ "$STRICT" -eq 1 && "$WARN_COUNT" -gt 0 ]]; then
109
+ exit 1
110
+ fi
111
+ exit 0
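Reviewer note: the "scenario starts with And" rule in the awk fallback is easier to check against a reference. The sketch below restates that one rule in Python under the assumption that only the first step after each `Scenario` line matters; `scenarios_starting_with_and` is a hypothetical helper and not part of the package.

```python
import re

STEP = re.compile(r"^\s*(Given|When|Then|And|But)\b")
SCENARIO = re.compile(r"^\s*Scenario")


def scenarios_starting_with_and(feature_text: str) -> list[int]:
    """Return 1-based line numbers where a scenario's first step is 'And'."""
    bad = []
    awaiting_first_step = False
    for lineno, line in enumerate(feature_text.splitlines(), 1):
        if SCENARIO.match(line):
            awaiting_first_step = True
        elif awaiting_first_step:
            step = STEP.match(line)
            if step:
                if step.group(1) == "And":
                    bad.append(lineno)
                awaiting_first_step = False
    return bad


sample = (
    "Feature: checkout\n"
    "\n"
    "  Scenario: a\n"
    "    And the cart has items\n"
    "    Then the total is shown\n"
    "\n"
    "  Scenario: b\n"
    "    Given an empty cart\n"
    "    And a logged-in user\n"
)
print(scenarios_starting_with_and(sample))
```

Only the `And` on line 4 is reported; the `And` on line 9 follows a `Given` and is valid Gherkin.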
@@ -0,0 +1,116 @@
1
+ #!/usr/bin/env bash
2
+ # harness-hash.sh — SHA-256 manifest for engineer-owned artifacts.
3
+ #
4
+ # Pins .feature files and architecture rule configs. Any byte change to a
5
+ # pinned file without a fresh --init is treated as HARNESS_TAMPERED and
6
+ # causes escape-scan.sh to REFUSE the AI diff.
7
+ #
8
+ # Usage:
9
+ # bash harness-hash.sh --init # write manifest (engineer-initiated)
10
+ # bash harness-hash.sh --verify # compare current hashes to manifest
11
+ # bash harness-hash.sh --list # show which files are pinned
12
+ #
13
+ # Exit codes:
14
+ # 0 — OK (pin matches, or init succeeded)
15
+ # 2 — HARNESS_TAMPERED (hash mismatch)
16
+ # 3 — no manifest found (--verify without --init)
17
+
18
+ set -euo pipefail
19
+
20
+ ROOT="${ROOT:-$(pwd)}"
21
+ MANIFEST="${ROOT}/.harness-hash"
22
+
23
+ PATTERNS=(
24
+ # Wall 1: acceptance
25
+ "features/**/*.feature"
26
+ "features/*.feature"
27
+ # Wall 7: architecture rule configs
28
+ ".dependency-cruiser.js"
29
+ ".dependency-cruiser.cjs"
30
+ ".importlinter"
31
+ "deptrac.yaml"
32
+ "arch-go.yml"
33
+ # Java ArchUnit tests
34
+ "src/test/java/**/*ArchTest*.java"
35
+ "src/test/java/**/*ArchitectureTest*.java"
36
+ # .NET ArchTests
37
+ "test/**/*ArchTests.cs"
38
+ "tests/**/*ArchTests.cs"
39
+ # Coverage thresholds (edits to these are escape attempts — hash them)
40
+ ".c8rc.json"
41
+ "stryker.conf.json"
42
+ "stryker.config.js"
43
+ )
44
+
45
+ collect_files() {
46
+ local out=()
47
+ shopt -s nullglob globstar
48
+ for pattern in "${PATTERNS[@]}"; do
49
+ for f in $pattern; do
50
+ [[ -f "$f" ]] && out+=("$f")
51
+ done
52
+ done
53
+ # de-dupe
54
+ printf '%s\n' "${out[@]}" | sort -u
55
+ }
56
+
57
+ hash_files() {
58
+ local files
59
+ files=$(collect_files)
60
+ if [[ -z "$files" ]]; then
61
+ return 0
62
+ fi
63
+ while IFS= read -r f; do
64
+ printf '%s %s\n' "$(sha256sum "$f" | awk '{print $1}')" "$f"
65
+ done <<< "$files"
66
+ }
67
+
68
+ cmd_init() {
69
+ cd "$ROOT"
70
+ hash_files > "$MANIFEST"
71
+ local count
72
+ count=$(wc -l < "$MANIFEST" | tr -d ' ')
73
+ echo "harness-hash: pinned $count file(s) → $MANIFEST"
74
+ }
75
+
76
+ cmd_verify() {
77
+ cd "$ROOT"
78
+ if [[ ! -f "$MANIFEST" ]]; then
79
+ echo "harness-hash: no manifest at $MANIFEST (run --init)" >&2
80
+ exit 3
81
+ fi
82
+ local current
83
+ current=$(hash_files)
84
+ local expected
85
+ expected=$(cat "$MANIFEST")
86
+
87
+ # Compare sorted manifests so order doesn't matter
88
+ local diff_out
89
+ diff_out=$(diff <(echo "$expected" | sort) <(echo "$current" | sort) || true)
90
+ if [[ -z "$diff_out" ]]; then
91
+ echo "harness-hash: OK"
92
+ exit 0
93
+ fi
94
+ echo "HARNESS_TAMPERED: pinned artifact changed" >&2
95
+ echo "$diff_out" >&2
96
+ exit 2
97
+ }
98
+
99
+ cmd_list() {
100
+ cd "$ROOT"
101
+ if [[ ! -f "$MANIFEST" ]]; then
102
+ echo "harness-hash: no manifest (run --init)" >&2
103
+ exit 3
104
+ fi
105
+ awk '{print $2}' "$MANIFEST"
106
+ }
107
+
108
+ case "${1:-}" in
109
+ --init) cmd_init ;;
110
+ --verify) cmd_verify ;;
111
+ --list) cmd_list ;;
112
+ --help|-h|*)
113
+ sed -n '2,16p' "$0"
114
+ exit 0
115
+ ;;
116
+ esac
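Reviewer note: the pin-and-verify cycle in `harness-hash.sh` boils down to comparing two sorted lists of `hash path` lines. The following Python sketch shows that comparison on a throwaway directory; `build_manifest` and `verify` are hypothetical names, and the single-space separator is an assumption taken from the `printf` format as rendered in this diff.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(root: Path, rel_paths) -> str:
    """One 'hash path' line per pinned file, de-duplicated and sorted by path."""
    lines = [f"{sha256_of(root / p)} {p}" for p in sorted(set(rel_paths))]
    return "\n".join(lines)


def verify(root: Path, manifest_text: str, rel_paths) -> bool:
    """True when the stored manifest matches freshly computed hashes (order-insensitive)."""
    current = build_manifest(root, rel_paths)
    return sorted(manifest_text.splitlines()) == sorted(current.splitlines())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "login.feature").write_text("Feature: login\n")
    pinned = ["login.feature"]
    manifest = build_manifest(root, pinned)  # equivalent of --init
    ok_before = verify(root, manifest, pinned)  # equivalent of --verify
    (root / "login.feature").write_text("Feature: login (edited)\n")
    ok_after = verify(root, manifest, pinned)

print(ok_before, ok_after)
```

Any byte change to a pinned file changes its digest, so the second verification fails until the manifest is regenerated.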