code-review-forge 2.0.0a1__py3-none-any.whl
- code_forge/__init__.py +14 -0
- code_forge/__main__.py +8 -0
- code_forge/autofix.py +78 -0
- code_forge/baseline.py +216 -0
- code_forge/cli.py +983 -0
- code_forge/delta.py +65 -0
- code_forge/diagnose.py +109 -0
- code_forge/diff.py +82 -0
- code_forge/disposition.py +32 -0
- code_forge/e2e_check.py +641 -0
- code_forge/env_resolver.py +91 -0
- code_forge/errors.py +34 -0
- code_forge/exit_codes.py +37 -0
- code_forge/factories.py +191 -0
- code_forge/falsify.py +85 -0
- code_forge/gate_check.py +466 -0
- code_forge/git.py +351 -0
- code_forge/hold.py +126 -0
- code_forge/install_hooks.py +331 -0
- code_forge/lock.py +162 -0
- code_forge/machine.py +792 -0
- code_forge/mode_resolver.py +60 -0
- code_forge/mutation.py +380 -0
- code_forge/parsers/__init__.py +56 -0
- code_forge/parsers/_sarif.py +77 -0
- code_forge/parsers/base.py +65 -0
- code_forge/parsers/checkpatch.py +66 -0
- code_forge/parsers/clippy.py +85 -0
- code_forge/parsers/non_ascii.py +47 -0
- code_forge/parsers/ruff.py +18 -0
- code_forge/parsers/semgrep.py +18 -0
- code_forge/parsers/shellcheck.py +56 -0
- code_forge/registry.py +153 -0
- code_forge/reporter.py +133 -0
- code_forge/runner.py +205 -0
- code_forge/sarif.py +226 -0
- code_forge/skills/adversarial-qe/SKILL.md +272 -0
- code_forge/skills/code-forge/SKILL.md +1193 -0
- code_forge/skills/code-review-expert/SKILL.md +162 -0
- code_forge/skills/code-review-expert/references/code-quality-checklist.md +130 -0
- code_forge/skills/code-review-expert/references/removal-plan.md +52 -0
- code_forge/skills/code-review-expert/references/security-checklist.md +118 -0
- code_forge/skills/code-review-expert/references/solid-checklist.md +65 -0
- code_forge/skills/kernel-fp-verify/SKILL.md +101 -0
- code_forge/skills/qodo-review/SKILL.md +135 -0
- code_forge/skills/smoke-test/SKILL.md +253 -0
- code_forge/skills/smoke-test/references/boundary-cases.md +114 -0
- code_forge/skills/smoke-test/references/concurrency-patterns.md +306 -0
- code_forge/skills/smoke-test/references/injection-payloads.md +124 -0
- code_forge/skills/smoke-test/test-library/shell/README.md +271 -0
- code_forge/skills/smoke-test/test-library/shell/primitives.sh +352 -0
- code_forge/skills/smoke-test/test-library/shell/primitives_test.sh +324 -0
- code_forge/snapshot.py +196 -0
- code_forge/source.py +64 -0
- code_forge/state.py +246 -0
- code_forge/verdict.py +43 -0
- code_review_forge-2.0.0a1.dist-info/METADATA +237 -0
- code_review_forge-2.0.0a1.dist-info/RECORD +62 -0
- code_review_forge-2.0.0a1.dist-info/WHEEL +5 -0
- code_review_forge-2.0.0a1.dist-info/entry_points.txt +2 -0
- code_review_forge-2.0.0a1.dist-info/licenses/LICENSE +179 -0
- code_review_forge-2.0.0a1.dist-info/top_level.txt +1 -0
@@ -0,0 +1,135 @@
---
name: qodo-review
description: Change-aware pre-review with feature-grouped walkthrough and structured suggestions. Inspired by Qodo's review prompt, runs locally with no Qodo dependency. Replaces code-reviewer as pass 1 in three-cycle review.
---

# When to Use

- **Pass 1** in the three-cycle static review (before `/code-review-expert` and `/adversarial-qe`)
- Quick pre-review of uncommitted local changes before diving into deeper architectural review
- When you want a structured walkthrough grouped by feature/behavior, not just file-by-file

# When NOT to Use

- Not a substitute for `/code-review-expert` (architecture, SOLID, P0-P3 severity)
- Not a substitute for `/adversarial-qe` (adversarial testing gaps, edge case hunting)
- Not for reviewing committed branch changes in the three-cycle flow (use the full cycle for that)

# Arguments

- No argument: review uncommitted changes (staged + unstaged)
- `committed`: review current branch vs merge-base
- `<path>`: review a specific patch file (argument contains `/` or `.`)
- Other text: passed as a context hint to focus the review
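
The dispatch rules above can be sketched as a small shell function. This is only an illustration under stated assumptions: `classify_review_arg` and the mode names it prints are hypothetical and are not defined anywhere in the skill.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map the skill argument to a review mode.
classify_review_arg() {
  local arg="${1-}"
  case "$arg" in
    "")        echo "uncommitted" ;;  # no argument: staged + unstaged changes
    committed) echo "committed" ;;    # current branch vs merge-base
    */*|*.*)   echo "patch" ;;        # contains "/" or ".": treat as a patch file path
    *)         echo "hint" ;;         # anything else: context hint for the review
  esac
}

classify_review_arg ""                   # uncommitted
classify_review_arg committed            # committed
classify_review_arg fixes/vlan.patch     # patch
classify_review_arg "focus on locking"   # hint
```

One consequence of the `/` or `.` rule is that a context hint containing a period (for example a version number) would be classified as a patch path, so hints are best kept free of those characters.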

# How It Works

## Step 1: Gather Changes

Determine the diff source based on arguments:

**Default (uncommitted):**
```bash
# Diff against HEAD so that both staged and unstaged changes are included
git diff HEAD --name-only | wc -l   # file count
git diff HEAD | wc -l               # line count
git diff HEAD --no-binary           # actual diff
```

**committed mode:**
```bash
BASE=$(git merge-base HEAD $(git rev-parse --abbrev-ref @{upstream} 2>/dev/null || git remote show origin | grep 'HEAD branch' | awk '{print $NF}'))
git diff --no-binary $BASE...HEAD
```

**Patch file mode:**
```bash
cat <path>
```

### Edge Cases

- **Empty diff**: if `git diff HEAD --stat` produces no output, respond "No changes to review" and stop
- **Large diff**: if file count >10 OR line count >500, split into batches of <=5 files. Review each batch serially. Output each batch's results before starting the next. Append `-- <file1> <file2> ...` to the mode's diff command for each batch (for example `git diff HEAD --no-binary -- <file1> <file2>`).
- **Binary files**: always use `--no-binary` to exclude
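
The large-diff rule can be sketched in shell as follows. The helper names `needs_batching` and `batch_files` are hypothetical, and the sketch assumes file names without embedded whitespace; it only illustrates the thresholds and the batch size stated above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed when the diff is "large" per the rule above.
# $1 = file count, $2 = line count
needs_batching() {
  [ "$1" -gt 10 ] || [ "$2" -gt 500 ]
}

# Hypothetical helper: read file names from stdin (one per line) and print
# them in batches of at most $1 names per output line (default 5).
batch_files() {
  local size="${1:-5}" count=0 line="" f
  while IFS= read -r f; do
    [ -n "$f" ] || continue
    line="${line:+$line }$f"
    count=$((count + 1))
    if [ "$count" -eq "$size" ]; then
      printf '%s\n' "$line"
      line=""; count=0
    fi
  done
  # Flush the final, possibly shorter batch.
  if [ -n "$line" ]; then printf '%s\n' "$line"; fi
}

# Example with 7 names: the first batch has 5 files, the second has 2.
printf '%s\n' a.c b.c c.c d.c e.c f.c g.c | batch_files 5
```

In a repository, the list of names would come from the `--name-only` form of the diff command, and each output line would supply the `<file1> <file2> ...` arguments for one batch.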

## Step 2: Output the Review

Follow this exact structure. Do not write any text between sections -- only headings and structured entries. Do not include internal reasoning or thinking steps.

### Output Format

```
# Changes Summary
<1-3 sentence summary: what was changed, which components affected, the intent>

# Files Walkthrough

#### <Feature or Behavior Group 1>

**<file_path>** (`Changes` | `New file` | `Removed file`) -- <theme>
<1 sentence: why this file changed>

\`\`\`diff
- <relevant code before>
+ <relevant code after>
\`\`\`
+<linesAdded> / -<linesRemoved>

**<file_path_2>** (...)
...

#### <Feature or Behavior Group 2>
...

# Code Suggestions

## 🔴 Security Vulnerabilities (<N> issues)

### [🔴 High] <title>

**File:** <filePath>
**Description:** <description of the issue and why it matters>

Suggested fix:
\`\`\`<lang>
<code suggestion>
\`\`\`

### [🔴 High] <title>
...

## 🔴 Potential Bugs (<N> issues)

### [🔴 High | 🟡 Medium] <title>
...

## 🟡 Best Practice Violations (<N> issues)

### [🟡 Medium] <title>
...

## 🟢 Minor Issues (<N> issues)

### [🟢 Low] <title>
...
```

### Anti-hallucination gate (mandatory per finding)

Before reporting any finding, you MUST:
1. Re-read the actual file at the cited line (use Read tool, not memory).
2. Confirm the code you are analyzing matches what is actually in the file.
3. If the finding references a function, variable, or constant by name, grep to confirm it exists.

Findings that fail this gate are false positives. Do not report them.
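
For readers who want to see what the gate amounts to mechanically, here is a minimal shell sketch. The helper names `line_matches` and `symbol_exists` are hypothetical and only mirror steps 1 to 3; the skill itself relies on the agent's Read and grep tools rather than on these functions.

```bash
#!/usr/bin/env bash
# Hypothetical helpers mirroring the gate; names are illustrative only.

# Succeeds if line $2 of file $1 contains the fixed string $3.
line_matches() {
  local file="$1" lineno="$2" expected="$3"
  sed -n "${lineno}p" "$file" | grep -qF -- "$expected"
}

# Succeeds if identifier $2 appears as a whole word anywhere in file $1.
symbol_exists() {
  grep -qwF -- "$2" "$1"
}

# Demo on a throwaway file.
demo="$(mktemp)"
printf '%s\n' 'int main(void) {' '    return parse_flags(0);' '}' > "$demo"
line_matches "$demo" 2 'parse_flags(0)' && echo "citation ok"
symbol_exists "$demo" parse_flags       && echo "symbol ok"
symbol_exists "$demo" parse_flag        || echo "symbol missing: drop the finding"
rm -f "$demo"
```

The whole-word match matters here: a finding that cites `parse_flag` when the file only defines `parse_flags` fails the check and is dropped.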

### Rules

- Only include a category if issues were found in that category. If zero issues in a category, omit the entire section.
- Severity markers: 🔴 High (must fix before merge), 🟡 Medium (problematic but non-critical), 🟢 Low (minor/stylistic)
- Group files into features/behaviors in the walkthrough (e.g., "VLAN pop/push test rework", "Parser extension for push_vlan")
- Use `diff` code blocks for before/after in walkthrough
- Use language-specific code blocks for suggestions
- Every suggestion must include file path and a concrete code fix
- Categories in order: Security -> Bugs -> Best Practices -> Debug/Leftover Code -> Linting -> Other
- This is a **read-only review** -- do not modify any files, only output suggestions
@@ -0,0 +1,253 @@
---
name: smoke-test
description: Post-review smoke testing -- verify code works by assembling assertions from a reusable shell primitive library
allowed-tools:
- Shell
- ReadFile
- WriteFile
---

# Smoke Test (Step 4 of Review Pipeline)

## Purpose

Execute AFTER the three-cycle static review (coder -> code-review-expert -> adversarial-qe) completes with zero findings.
Verify code actually works at runtime using a reusable assertion primitive library.

Static analysis catches design issues; smoke tests catch runtime issues.

## When to Use

- **Mandatory**: After three-cycle review shows zero findings and before final commit
- **Optional**: When the user asks "Are you sure there are no problems?" (original prompt: "确定没问题么?") or a similar verification question
- **Skip**: Documentation-only changes (pure markdown/comments with no executable code)

## Workflow (4 Steps)

### Step A: Analyze the Change

Read the code diff. Identify:
- What changed (new CLI param? error handling? file I/O? concurrency? security fix?)
- The primary execution path (what command to run)
- Expected outputs, side effects, and edge cases

### Step B: Select Primitives

Source the library and consult the decision table:

```bash
source ~/.claude/skills/smoke-test/test-library/shell/primitives.sh
```

| Change Type | Required Primitives | Optional |
|------------|-------------------|----------|
| New CLI parameter | `assert_success`, `assert_output_contains` | `assert_stderr_empty` |
| Error handling | `assert_failure`, `assert_stderr_contains` | `assert_exit_code` |
| File operations | `assert_file_exists`, `assert_file_contains` | `assert_file_not_exists` |
| Concurrency | `assert_success` (xN), `assert_no_zombie` | `assert_temp_clean`, `assert_file_contains` (race check) |
| Security patch | `assert_no_command_exec`, `assert_no_path_traversal` | `assert_output_not_contains` |
| API response | `assert_json_valid`, `assert_output_contains` | -- |
| Config parsing | `assert_success`, `assert_file_contains` | -- |
| Log output | `assert_output_contains` | `assert_stderr_empty` |
| Cleanup logic | `assert_file_not_exists` | -- |

Full primitive reference: `test-library/shell/README.md` -- includes signatures, safety usage guide, combination rules, and zombie race constraints.

### Step C: Assemble Test Script

Act before Assert. One assertion per line. Values through parameters.

**Standard pattern**:

```bash
run_and_capture ./script.sh --flag value
output=$(cat "$SMOKE_LAST_STDOUT")
assert_success "$SMOKE_LAST_STATUS" "script.sh --flag"
assert_output_contains "$output" "expected text" "flag output"
```

**Error handling pattern**:

```bash
run_and_capture ./script.sh --bad-flag
assert_failure "$SMOKE_LAST_STATUS" "script.sh --bad-flag"
assert_stderr_contains "$(cat "$SMOKE_LAST_STDERR")" "error:" "bad flag error"
```

**Concurrency pattern (zombie detection before reap)**:

```bash
run_concurrent 5 ./script.sh
sleep 0.5                              # let children exit so any zombies become visible
assert_no_zombie $$ "concurrent run"   # detect BEFORE reap
concurrent_wait; cw_status=$?
assert_success "$cw_status" "concurrent execution"
# Per-instance diagnostics on failure
if [[ $cw_status -ne 0 ]]; then
  for ((i=0; i<5; i++)); do
    s=$(cat "$CONCURRENT_RESULT_DIR/$i.status" 2>/dev/null || echo unknown)
    # Compare as strings so a missing status file ("unknown") is reported too
    [[ $s != 0 ]] && echo "  instance $i FAIL: exit=$s"
  done
fi
```

**Security pattern**:

```bash
# Test injection payloads
for payload in '$(id)' '`whoami`' '; rm -rf /' '../../etc/passwd'; do
  assert_no_command_exec "$payload" "payload: $payload"
  assert_no_path_traversal "$payload" "payload: $payload"
done

# Test actual script input handling
run_and_capture ./script.sh '$(id)'
output=$(cat "$SMOKE_LAST_STDOUT")
assert_output_not_contains "$output" "uid=" "injection not executed"
```

### Step D: Execute and Record

Run the assembled test script. Count results:

```bash
# grep -c exits 1 when the count is 0; "|| true" keeps the script going under `set -e`
PASS_COUNT=$(grep -c '^PASS: ' test_output.txt || true)
FAIL_COUNT=$(grep -c '^FAIL: ' test_output.txt || true)
```

**Exit criteria**:
- `FAIL_COUNT == 0`, all test categories covered (normal + boundary + security + concurrency if applicable)
- Any FAIL -> fix the code, restart from Step 0 (syntax pre-check) of the review pipeline

Record results in `REVIEW.md` Section 5 (Smoke Test Results):
- The assembled test script
- PASS/FAIL counts
- Any FAIL details with debug context
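
The counting step and the exit criterion can be combined into one small function. This is a sketch under assumptions: `summarize_results` is a hypothetical name, and the only thing taken from the text above is the `PASS: ` / `FAIL: ` line prefix used by the `grep` patterns.

```bash
#!/usr/bin/env bash
# Hypothetical helper: summarize a test log whose lines start with
# "PASS: " or "FAIL: ". Returns non-zero if any FAIL line is present.
summarize_results() {
  local log="$1" pass fail
  # grep -c prints 0 but exits 1 when nothing matches, so tolerate that status.
  pass=$(grep -c '^PASS: ' "$log" || true)
  fail=$(grep -c '^FAIL: ' "$log" || true)
  echo "PASS: $pass FAIL: $fail"
  [ "$fail" -eq 0 ]
}

# Demo on a throwaway log with two passing assertions.
log="$(mktemp)"
printf '%s\n' 'PASS: script.sh --flag' 'PASS: flag output' > "$log"
summarize_results "$log" && echo "exit criteria met"
rm -f "$log"
```

The function only covers the `FAIL_COUNT == 0` half of the exit criteria; whether every required test category was exercised still has to be checked against the decision table.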

## Assembly Rules

1. **One assertion per expected behavior** -- don't combine checks into one `[[ ... ]]`
2. **Act before Assert** -- run the command first (via `run_and_capture`), then assert on the captured state
3. **Values through parameters** -- captured stdout/stderr/status go directly into assertion parameters, not global regex
4. **Every test starts with `source primitives.sh`** -- one line makes every assertion primitive and helper function available
5. **No gaps in coverage** -- decision table required primitives are non-negotiable

## Prohibited Patterns

- Do NOT modify tested code -- smoke test is read-only verification
- Do NOT depend on network -- tests must pass offline
- Do NOT include syntax checks (`bash -n`, `shellcheck`, `py_compile`, `go vet`) -- these belong in Step 0 (syntax pre-check), before review cycles begin

## Non-Shell File Strategy

Shell primitives exist because bash has no standard test framework. Python, Go, and C already have mature frameworks -- use them.

| Language | Smoke Test Tool | Pattern |
|----------|----------------|---------|
| Shell | `source primitives.sh` | Decision table (Step B), standard patterns (Step C) |
| Python | `pytest` | `subprocess.run(cmd, capture_output=True, text=True)` + `assert r.returncode == 0` |
| Go | `go test` / `testing.T` | `exec.Command(...).CombinedOutput()` + `if err != nil { t.Fatal(...) }` |
| Kernel C | `primitives.sh` (build) + `runtest -n` (job XML) | See "Kernel C Changes" below |

Rationale: Python's `pytest` and Go's `testing` are industry standards that AI already knows. Replacing them with custom assertion functions provides no benefit and adds learning cost. Shell is the exception -- it has no native test framework, so `primitives.sh` fills that gap. Kernel C is shell-adjacent: the verification targets (kbuild, runtest) are shell processes, so `primitives.sh` applies even though the source language is C.

### Kernel C Changes (CentOS Stream / RHEL)

Kernel smoke tests cannot achieve full functional verification within the 5-minute pre-commit window (Beaker runs take 30 min - 2 hr). Step 4 validates what's reachable locally; Step 5 handles Beaker.

For kernel C commits, the CLAUDE.md commit marker rule has an explicit exception: `# post-review-c3` in kernel context means Step 4 complete (build passes + test plan exists). Step 5 (Beaker) is a separate pre-merge gate, not a pre-commit requirement.

**Step 4 (pre-commit)**:

```bash
source ~/.claude/skills/smoke-test/test-library/shell/primitives.sh

# 1. Build affected subsystem
# In-tree (from kernel source dir):
run_and_capture make M=net/openvswitch -j$(nproc)
assert_success "$SMOKE_LAST_STATUS" "kbuild net/openvswitch"
# Out-of-tree (external module):
# run_and_capture make -C /lib/modules/$(uname -r)/build M=$PWD -j$(nproc)
# assert_success "$SMOKE_LAST_STATUS" "kbuild external module"

# 2. Identify corresponding kernel-qe test
ls code/kernel/networking/openvswitch/ovs_kernel_socketmap/
# If no existing test: note in REVIEW.md, plan follow-up

# 3. Verify Beaker job XML can be generated (dry-run, do NOT submit)
run_and_capture runtest -n --task=code/kernel/networking/openvswitch/ovs_kernel_socketmap
assert_success "$SMOKE_LAST_STATUS" "runtest dry-run -- job XML generated"
```

**Step 5 (pre-merge, separate from Step 4)**:

```bash
runtest -B <build-id> --task=code/kernel/networking/openvswitch/ovs_kernel_socketmap
# Results on Beaker job page.
```

**Bug reproducer rules** (per CLAUDE.md):
- Reproducer MUST trigger on buggy kernels and NOT trigger on fixed kernels
- Manual reproduction insufficient -> escalate to kprobe/kretprobe kernel module
- IBT constraint (`CONFIG_X86_KERNEL_IBT=y`): function-pointer calls to static symbols fail CET -- trigger through kernel's own code paths

## Output Format

Smoke test results go into `REVIEW.md` Section 5:

```
## 5. Smoke Test Results

**Date**: YYYY-MM-DD HH:MM
**Files tested**: <list>
**Primitives sourced**: ~/.claude/skills/smoke-test/test-library/shell/primitives.sh

### Test Script
\`\`\`bash
source ~/.claude/skills/smoke-test/test-library/shell/primitives.sh
run_and_capture ./script.sh --flag
output=$(cat "$SMOKE_LAST_STDOUT")
assert_success "$SMOKE_LAST_STATUS" "script.sh --flag"
assert_output_contains "$output" "expected" "output check"
\`\`\`

### Results
PASS: N FAIL: 0

### Categories
- [x] Normal Path: PASS
- [x] Boundary: PASS
- [x] Security: PASS
- [x] Concurrency: PASS (or N/A)
```

## Integration with Three-Cycle Review

```
Code Change -> Step 0: Syntax Pre-Check -> Cycle 1 (coder/expert/adversarial)
  -> Any finding? -> Fix -> Cycle 1 restart
  -> Zero findings? -> Cycle 2 (repeat 3 passes)
    -> Any finding? -> Fix -> Cycle 1 restart
    -> Zero findings? -> Cycle 3 (repeat 3 passes)
      -> Any finding? -> Fix -> Cycle 1 restart
      -> Zero findings? -> Step 4: Smoke Test (THIS SKILL)
        -> Any FAIL? -> Fix -> Step 0 restart
        -> All PASS? -> Commit with # post-review-c3
```

## Common Pitfalls

1. **Skipping Act and going straight to Assert**: primitives need captured state -- always `run_and_capture` first
2. **Checking zombie AFTER wait**: `assert_no_zombie` must run before `concurrent_wait` reaps processes
3. **Using `$?` instead of `$SMOKE_LAST_STATUS`**: `$?` changes on every command; `$SMOKE_LAST_STATUS` is stable
4. **Not sourcing primitives.sh**: every test script must start with `source primitives.sh`
5. **Writing custom assertion libraries for Python/Go/C**: use the language's standard test framework (pytest, go test). Only shell, and shell-driven verification such as kernel builds, needs `primitives.sh`
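
Pitfall 3 is easy to reproduce in plain bash without the primitive library. In the sketch below, `failing_step` is a hypothetical stand-in for the command under test, and `saved_status` plays the role that `$SMOKE_LAST_STATUS` plays in the library.

```bash
#!/usr/bin/env bash
# $? is overwritten by the very next command; a saved copy stays stable.
failing_step() { return 3; }   # hypothetical stand-in for the command under test

failing_step
saved_status=$?        # capture immediately after the command
echo "some logging"    # this succeeds, so $? is now 0
clobbered_status=$?

echo "saved=$saved_status clobbered=$clobbered_status"   # saved=3 clobbered=0
```

Any command placed between the act and the assert, even an `echo`, is enough to lose the original status, which is why the captured value should be the one passed to the assertion.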

## References

- `test-library/shell/primitives.sh` -- 15 assertion primitives + 3 helper functions
- `test-library/shell/primitives_test.sh` -- self-tests for every primitive
- `test-library/shell/README.md` -- decision table, safety guide, combination rules, dependencies
- `references/injection-payloads.md` -- common injection attack vectors
- `references/boundary-cases.md` -- edge cases by data type
- `references/concurrency-patterns.md` -- race condition test scenarios
@@ -0,0 +1,114 @@
# Boundary Cases by Data Type

Edge cases that commonly cause bugs in production.

## Strings

| Case | Value | Common Bug |
|------|-------|------------|
| Empty | `""` | Unhandled empty check, division by zero on length |
| Whitespace only | `" "` | Treated as non-empty, fails validation |
| Single char | `"a"` | Off-by-one in substring operations |
| Very long | 10MB+ string | Memory exhaustion, buffer overflow |
| Unicode | `"你好🎉"` | Byte vs character length mismatch |
| Newlines | `"line1\nline2"` | Breaks line-based parsing |
| Null bytes | `"foo\x00bar"` | String truncation in C-style APIs |
| Control chars | `"\r\n\t\b"` | Terminal escape injection |
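
The Unicode row is the easiest one to check by hand. The sketch below assumes the script file is saved as UTF-8; `byte_length` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Byte length of a string, independent of the current locale.
byte_length() {
  printf '%s' "$1" | wc -c | tr -d ' '
}

s='你好🎉'
echo "bytes: $(byte_length "$s")"        # 10: 3 + 3 + 4 bytes in UTF-8
echo "chars (locale-dependent): ${#s}"   # 3 only when bash runs in a UTF-8 locale
```

Code that sizes a buffer or truncates a field using the character count while the storage layer counts bytes, or the other way around, hits exactly this mismatch.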

## Numbers

| Case | Value | Common Bug |
|------|-------|------------|
| Zero | `0` | Division by zero, empty array index |
| Negative | `-1`, `-999` | Wraparound when cast to unsigned, invalid array index |
| Max int | `2147483647` (32-bit) | Overflow on +1 |
| Min int | `-2147483648` (32-bit) | Overflow on negation |
| Float precision | `0.1 + 0.2` | Not equal to `0.3` |
| Infinity | `float('inf')` | Breaks comparisons |
| NaN | `float('nan')` | Never equal to itself, so equality checks fail |
| Scientific notation | `1e10` | Rejected by integer parsers (e.g. `int("1e10")` raises in Python) |
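
For shell scripts that hand numbers to 32-bit consumers, the Max int and Min int rows can be guarded with a range check. This is a sketch: `fits_int32` is a hypothetical helper, and the remark about 64-bit behavior assumes a typical bash build where shell arithmetic is 64 bits wide.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if $1 is an integer within the signed 32-bit range.
fits_int32() {
  local n="$1"
  [ "$n" -ge -2147483648 ] 2>/dev/null && [ "$n" -le 2147483647 ] 2>/dev/null
}

max32=2147483647
echo "$(( max32 + 1 ))"   # 2147483648: bash arithmetic is wider than 32 bits

for n in 0 2147483647 2147483648 -2147483648 -2147483649 1e10; do
  if fits_int32 "$n"; then echo "$n: fits"; else echo "$n: out of range or not an integer"; fi
done
```

Because bash itself does not overflow at the 32-bit boundary, the bug only surfaces in the program that receives the value, so the check belongs before the hand-off.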

## Collections

| Case | Value | Common Bug |
|------|-------|------------|
| Empty array | `[]` | Index out of bounds on `arr[0]` |
| Single element | `[1]` | Loop assumes multiple elements |
| Duplicates | `[1, 1, 1]` | Set conversion loses data |
| Very large | 1M+ elements | Memory exhaustion, O(n^2) algorithms |
| Nested empty | `[[]]` | Depth check fails |
| Mixed types | `[1, "two", None]` | Type assumptions break |

## Files

| Case | Value | Common Bug |
|------|-------|------------|
| Non-existent | `/no/such/file` | Unhandled FileNotFoundError |
| Empty file | 0 bytes | Read returns empty, breaks parsing |
| Permission denied | `chmod 000 file` | Unhandled PermissionError |
| Directory not file | `/tmp/` | IsADirectoryError |
| Symlink | `ln -s target link` | Follows link unexpectedly |
| Very large | 10GB+ | Memory exhaustion on read() |
| Binary data | Non-UTF8 bytes | UnicodeDecodeError |
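
Several of the file cases can be built as throwaway fixtures in a temporary directory. The sketch below is illustrative only; `classify_path` is a hypothetical helper and is not part of `primitives.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: name the boundary case a path falls into.
classify_path() {
  local p="$1"
  if [ -L "$p" ]; then echo "symlink"       # check first: -d and -e follow links
  elif [ -d "$p" ]; then echo "directory"
  elif [ ! -e "$p" ]; then echo "missing"
  elif [ ! -s "$p" ]; then echo "empty"
  else echo "regular"
  fi
}

tmp="$(mktemp -d)"
: > "$tmp/empty"                     # 0-byte file
printf 'data\n' > "$tmp/regular"
ln -s "$tmp/regular" "$tmp/link"     # symlink to a regular file

for p in "$tmp/empty" "$tmp/regular" "$tmp/link" "$tmp" "$tmp/no-such-file"; do
  printf '%-10s %s\n' "$(classify_path "$p")" "${p#"$tmp"}"
done
rm -rf "$tmp"
```

Each fixture path can then be passed to the script under test in place of its normal input file.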

## Network

| Case | Scenario | Common Bug |
|------|----------|------------|
| Timeout | Server doesn't respond | Hangs forever without timeout |
| Connection refused | Port closed | Unhandled ConnectionRefusedError |
| DNS failure | Invalid hostname | Unhandled socket.gaierror |
| Partial response | Server closes mid-stream | Incomplete data treated as valid |
| Malformed JSON | `{"key": "value"` | json.JSONDecodeError |
| HTTP 429 | Rate limit | Retry loop without backoff |
| HTTP 500 | Server error | Treated as success |

## Dates/Times

| Case | Value | Common Bug |
|------|-------|------------|
| Epoch | `1970-01-01 00:00:00` | Treated as null/unset |
| Leap year | `2024-02-29` | Adding a year yields an invalid date (`2025-02-29`) |
| DST transition | `2024-03-10 02:30` | Non-existent local time (US spring-forward) |
| Timezone edge | `UTC+14` vs `UTC-12` | Date changes across zones |
| Far future | `9999-12-31` | Overflow in timestamp conversion |

## Concurrency

| Case | Scenario | Common Bug |
|------|----------|------------|
| Race condition | Two threads write same file | Corrupted data |
| Deadlock | A waits for B, B waits for A | Infinite hang |
| Lock timeout | Lock held too long | Fallback not implemented |
| Signal interruption | SIGINT during I/O | Partial write |

## Testing Strategy

1. **Identify data types**: What inputs does the code accept?
2. **Select relevant cases**: Match boundary cases to data types
3. **Execute with boundary input**: Replace normal input with edge case
4. **Verify graceful handling**: No crashes, clear error messages
5. **Check side effects**: Files/logs/state not corrupted

## Example Test Script

```bash
#!/bin/bash
# Boundary test for kimi-balance.sh

echo "=== Empty key ==="
KIMI_API_KEY="" ./kimi-balance.sh
echo "Exit code: $?"

echo "=== Whitespace key ==="
KIMI_API_KEY=" " ./kimi-balance.sh
echo "Exit code: $?"

echo "=== Invalid JSON response ==="
response='{"invalid json' ./kimi-balance.sh
echo "Exit code: $?"

echo "=== Very long key (10KB) ==="
KIMI_API_KEY=$(python3 -c "print('x' * 10000)") ./kimi-balance.sh
echo "Exit code: $?"
```