evilution 0.21.0 → 0.22.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (52)
  1. checksums.yaml +4 -4
  2. data/.beads/.gitignore +4 -0
  3. data/.beads/interactions.jsonl +16 -0
  4. data/.beads/issues.jsonl +9 -6
  5. data/.claude/settings.json +5 -0
  6. data/CHANGELOG.md +35 -0
  7. data/README.md +28 -13
  8. data/comparison_results/baseline_2026-04-09.md +35 -0
  9. data/comparison_results/operator_classification.md +79 -0
  10. data/comparison_results/operator_prioritization.md +68 -0
  11. data/docs/mutation_density_benchmark.md +91 -0
  12. data/lib/evilution/ast/parser.rb +2 -1
  13. data/lib/evilution/baseline.rb +14 -11
  14. data/lib/evilution/cli.rb +2 -1
  15. data/lib/evilution/config.rb +15 -3
  16. data/lib/evilution/disable_comment.rb +2 -1
  17. data/lib/evilution/integration/base.rb +124 -1
  18. data/lib/evilution/integration/minitest.rb +145 -0
  19. data/lib/evilution/integration/minitest_crash_detector.rb +55 -0
  20. data/lib/evilution/integration/rspec.rb +33 -100
  21. data/lib/evilution/isolation/fork.rb +11 -3
  22. data/lib/evilution/isolation/in_process.rb +12 -3
  23. data/lib/evilution/mcp/mutate_tool.rb +6 -6
  24. data/lib/evilution/mutator/base.rb +4 -0
  25. data/lib/evilution/mutator/operator/bitwise_complement.rb +1 -1
  26. data/lib/evilution/mutator/operator/block_pass_removal.rb +30 -0
  27. data/lib/evilution/mutator/operator/ensure_removal.rb +1 -1
  28. data/lib/evilution/mutator/operator/index_to_at.rb +30 -0
  29. data/lib/evilution/mutator/operator/index_to_dig.rb +2 -2
  30. data/lib/evilution/mutator/operator/index_to_fetch.rb +2 -2
  31. data/lib/evilution/mutator/operator/keyword_argument.rb +1 -1
  32. data/lib/evilution/mutator/operator/regex_simplification.rb +169 -0
  33. data/lib/evilution/mutator/operator/rescue_body_replacement.rb +1 -1
  34. data/lib/evilution/mutator/operator/rescue_removal.rb +1 -1
  35. data/lib/evilution/mutator/operator/symbol_literal.rb +9 -0
  36. data/lib/evilution/mutator/registry.rb +3 -0
  37. data/lib/evilution/reporter/cli.rb +19 -0
  38. data/lib/evilution/reporter/html.rb +12 -3
  39. data/lib/evilution/reporter/json.rb +14 -3
  40. data/lib/evilution/reporter/suggestion.rb +659 -2
  41. data/lib/evilution/result/mutation_result.rb +9 -2
  42. data/lib/evilution/runner.rb +56 -17
  43. data/lib/evilution/spec_resolver.rb +24 -16
  44. data/lib/evilution/version.rb +1 -1
  45. data/lib/evilution.rb +4 -0
  46. data/script/memory_check +5 -5
  47. data/scripts/benchmark_density +261 -0
  48. data/scripts/benchmark_density.yml +19 -0
  49. data/scripts/compare_mutations +404 -0
  50. data/scripts/compare_mutations.yml +24 -0
  51. data/scripts/mutant_json_adapter +224 -0
  52. metadata +17 -2
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: c8f4aa7633e70e4a54aded76fcdfeb152cb4e4ad76d587b5aa0c93bda96246e3
- data.tar.gz: b8e65e5d0837b6873c31e6cae9621160a2a6fe75b3949d08a39f91b3df7db60b
+ metadata.gz: 41de783fc5459691618f0638cbbe4d47b4f41173b38ffaa357ad00f51831724d
+ data.tar.gz: 5db9acf814e90669a1d2596b29b71f72e405328b28354f82121bcc23f0ad1a90
  SHA512:
- metadata.gz: 923d8fa302a830d1b070e27b2494c5ec3227f6c6188b26f1701a0344e54ac230626a75bd4f0ec70d6e4f10af04e5e62d04b05d93c067ee160ce3c653e86faaf6
- data.tar.gz: f63389c729c4d121cb24a38a2bd9bd4d386707f81c16dbfd106868e952d72200278cd84fc0936ad60e666e53b5c51efce7d40a615c093d5fc327e1e910a39fd4
+ metadata.gz: 4ed3a8f8c3ce59bcb13e18c115c7a1013e2aaa53aab8672e486ff0ef1abe6857b3771628f0f4d8f2c7925c46610c4be315f9c2b13bc60d124e44cf69773b1100
+ data.tar.gz: e77b777bbb7030ce89904e504a118adbffe7e7a991f8bc5f9504097b7becc311f3b93676bb0914b5ab5b4155d061448ca387025e86a785b1cbb31ed4e69f1f73
data/.beads/.gitignore CHANGED
@@ -42,8 +42,12 @@ export-state/
 
  # Dolt database (managed by Dolt remotes, not git)
  dolt/
+ embeddeddolt/
  dolt-access.lock
 
+ # Local backup data
+ backup/
+
  # NOTE: Do NOT add negation patterns (e.g., !issues.jsonl) here.
  # They would override fork protection in .git/info/exclude, allowing
  # contributors to accidentally commit upstream issue databases.
data/.beads/interactions.jsonl ADDED
@@ -0,0 +1,16 @@
+ {"id":"int-4aac1fda","kind":"field_change","created_at":"2026-04-09T05:02:19.518389279Z","actor":"Denis Kiselev","issue_id":"EV-230","extra":{"field":"status","new_value":"in_progress","old_value":"open"}}
+ {"id":"int-ff9e26c4","kind":"field_change","created_at":"2026-04-09T05:38:57.721818299Z","actor":"Denis Kiselev","issue_id":"EV-230","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Merged PR #622. RegexSimplification operator with quantifier removal, anchor removal, and character class range removal."}}
+ {"id":"int-c318b162","kind":"field_change","created_at":"2026-04-09T06:16:18.67347839Z","actor":"Denis Kiselev","issue_id":"EV-78","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-e2f93b8b","kind":"field_change","created_at":"2026-04-09T06:33:11.389955517Z","actor":"Denis Kiselev","issue_id":"EV-82","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-275fe056","kind":"field_change","created_at":"2026-04-09T06:49:54.009377447Z","actor":"Denis Kiselev","issue_id":"EV-80","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-f1854a23","kind":"field_change","created_at":"2026-04-09T07:52:38.321185191Z","actor":"Denis Kiselev","issue_id":"EV-79","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-d2e4b659","kind":"field_change","created_at":"2026-04-09T08:28:47.700848334Z","actor":"Denis Kiselev","issue_id":"EV-81","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-f2a8e3fa","kind":"field_change","created_at":"2026-04-09T11:39:20.111111433Z","actor":"Denis Kiselev","issue_id":"EV-83","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-41c232c2","kind":"field_change","created_at":"2026-04-09T12:51:12.16692591Z","actor":"Denis Kiselev","issue_id":"EV-84","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-0f073191","kind":"field_change","created_at":"2026-04-09T13:03:11.468115004Z","actor":"Denis Kiselev","issue_id":"EV-85","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-91a9616c","kind":"field_change","created_at":"2026-04-09T13:04:05.459458165Z","actor":"Denis Kiselev","issue_id":"EV-69","extra":{"field":"status","new_value":"closed","old_value":"open","reason":"Closed"}}
+ {"id":"int-b4fe2b7b","kind":"field_change","created_at":"2026-04-09T13:42:59.996305852Z","actor":"Denis Kiselev","issue_id":"EV-277","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-6671aef1","kind":"field_change","created_at":"2026-04-10T09:05:06.955840839Z","actor":"Denis Kiselev","issue_id":"EV-z42m","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Merged via PR #655"}}
+ {"id":"int-3a6ffc92","kind":"field_change","created_at":"2026-04-10T10:26:29.483210031Z","actor":"Denis Kiselev","issue_id":"EV-sgsb","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Merged via PR #656"}}
+ {"id":"int-1df69312","kind":"field_change","created_at":"2026-04-10T10:40:18.391377715Z","actor":"Denis Kiselev","issue_id":"EV-1l8w","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Merged"}}
+ {"id":"int-427bdc14","kind":"field_change","created_at":"2026-04-10T12:40:12.974701482Z","actor":"Denis Kiselev","issue_id":"EV-k6cz","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Merged as PR #659 — capture error_class and error_backtrace in MutationResult, thread through isolators + runner + JSON reporter, log in --verbose"}}
data/.beads/issues.jsonl CHANGED
@@ -170,7 +170,7 @@
  {"id":"EV-228","title":"Equivalent detection: .count → .length as always-equivalent","description":".count → .length mutations are universally unkillable (both return identical integer results). Evilution should detect this pattern and classify as equivalent automatically. Reported in feedback for HomeController stats block.","notes":"GH #509","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:43.587875972+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T08:52:50.669215278+07:00","closed_at":"2026-04-08T08:52:50.669215278+07:00","close_reason":"Closed"}
  {"id":"EV-229","title":"Equivalent detection: .each → .map in void context","description":"When .each is called in void context (return value not assigned or passed), replacing with .map or .reverse_each produces equivalent behavior. Evilution should detect void-context method calls and mark these swaps as likely-equivalent. Reported for Avo reset_password action.","notes":"GH #511","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:46.8330208+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T09:31:47.874981801+07:00","closed_at":"2026-04-08T09:31:47.874981801+07:00","close_reason":"Closed"}
  {"id":"EV-23","title":"Per-mutation spec targeting","description":"Instead of running the full spec suite for every mutation, map each mutated source file to its relevant spec file(s) using convention-based resolution (e.g. lib/foo/bar.rb -> spec/foo/bar_spec.rb) and only run those. This dramatically reduces per-mutation test time. Depends on convention-based spec file resolution being implemented first.","status":"closed","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-10T06:17:28.98620973+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-16T14:49:13.616876819+07:00","closed_at":"2026-03-16T14:49:13.616876819+07:00","close_reason":"Fixed and merged","dependencies":[{"issue_id":"EV-23","depends_on_id":"EV-34","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
- {"id":"EV-230","title":"Regex simplification operators","description":"Add mutation operators for regex patterns: /\\s+/ → /\\s/ (remove quantifier), remove anchors (^, $, \\A, \\z), simplify character classes. Mutant's regex mutations caught real test gaps (case sensitivity, whitespace handling) that evilution missed. Reported in 2 sessions on Telegram::NewsScorer.","notes":"GH #514","status":"open","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:48.930372762+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:22:57.239710634+07:00","dependencies":[{"issue_id":"EV-230","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
+ {"id":"EV-230","title":"Regex simplification operators","description":"Add mutation operators for regex patterns: /\\s+/ → /\\s/ (remove quantifier), remove anchors (^, $, \\A, \\z), simplify character classes. Mutant's regex mutations caught real test gaps (case sensitivity, whitespace handling) that evilution missed. Reported in 2 sessions on Telegram::NewsScorer.","notes":"GH #514","status":"open","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:48.930372762+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:22:57.239710634+07:00"}
  {"id":"EV-231","title":".downcase removal operator","description":"Add mutation that removes .downcase calls. This caught real case-sensitivity test gaps in mutant that evilution missed. Useful for testing that code handles mixed-case input. Reported in 2 sessions on NewsScorer.","notes":"GH #516","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:50.682791593+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T09:42:19.793202916+07:00","closed_at":"2026-04-08T09:42:19.793202916+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-231","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-232","title":"Method chain permutation operators (strip → lstrip/rstrip)","description":"Add mutations that replace string cleaning methods with partial variants: strip → lstrip/rstrip, chomp → chop, etc. Mutant's chain permutations caught a real test gap in NewsScorer keyword processing. Reported in 1 session.","notes":"GH #518","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:52.982750285+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T11:15:28.398677068+07:00","closed_at":"2026-04-08T11:15:28.398677068+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-232","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-233","title":"Related specs heuristic (run association specs when .includes() mutated)","description":"When mutations remove .includes() eager loading calls, the matching unit spec may not catch N+1 regressions. Consider a heuristic that also runs specs for the included associations or integration specs. Would complement the spec auto-detection feature. Reported in 1 session for NewsController.","notes":"GH #519","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:56.302076763+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T12:16:09.453069408+07:00","closed_at":"2026-04-08T12:16:09.453069408+07:00","close_reason":"Closed"}
@@ -178,22 +178,22 @@
  {"id":"EV-235","title":"Bug: Non-deterministic mutation count on same file","description":"Running evilution twice on the same file (HelpController, 9 LOC) produced different mutation counts (18 vs 15). Unclear cause — possibly non-deterministic operator selection or file state difference. Low priority but should be investigated. Reported once in v0.12.0.","notes":"GitHub: #512","status":"closed","priority":4,"issue_type":"bug","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:01.930671333+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T13:20:37.135440498+07:00","closed_at":"2026-04-04T13:20:37.135440498+07:00","close_reason":"Not a bug. The 18 vs 15 difference was exactly the 3 timed-out mutations — counted in run 1 total but excluded in run 2 summary. Mutation generation is fully deterministic (no rand/shuffle/sample in codepath). Reported once in v0.12.0, never reproduced. Reporting has been significantly improved since then."}
  {"id":"EV-236","title":"--spec-dir flag for directory-level spec inclusion","description":"Add a --spec-dir flag that auto-includes all specs in a directory, reducing the chance of missing coverage from adjacent spec files. Useful when a controller has tests split across spec/requests/, spec/controllers/, and spec/features/. Reported once.","notes":"GitHub: #513","status":"closed","priority":4,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:04.618160285+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-07T09:55:50.99913701+07:00","closed_at":"2026-04-07T09:55:50.99913701+07:00","close_reason":"--spec-dir CLI flag implemented, composes with --spec, validates directory existence. 3 unit tests passing. Merged via GH #513.","dependencies":[{"issue_id":"EV-236","depends_on_id":"EV-227","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-237","title":"Temp-file based mutation (don't modify original source)","description":"Evilution mutates source files in-place on the filesystem, which triggers file watchers, linters, and IDE notifications during runs. Even with the ensure-based restore (fixed earlier), race conditions exist if the process is killed. Write mutated source to a tempfile and point the test runner at it via load path manipulation. Never modify the original source file. Reported in 2 sessions (v0.16.1).","notes":"GH issue: #520 (https://github.com/marinazzio/evilution/issues/520)","status":"closed","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:06.770551806+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:52:59.896531797+07:00","closed_at":"2026-04-08T15:52:59.896531797+07:00","close_reason":"Closed","external_ref":"gh-520","labels":["reliability"]}
- {"id":"EV-238","title":"Epic: Research mutation density gap with mutant","description":"Evilution consistently generates 1.8-2.6x fewer mutations than mutant across 25 feedback sessions. While some of mutant's extras are equivalent ([]→fetch, to_i→Integer()), many catch real edge cases. This epic covers: (1) Audit mutant's operator list systematically against evilution's 54 operators, (2) Identify which missing operators catch real bugs vs produce noise, (3) Prioritize operator additions by signal-to-noise ratio, (4) Target closing the gap to <1.5x. Related: existing gap analysis created 15 new operator issues (EV-214 through EV-224, #491-#505).","notes":"GitHub: #515","status":"open","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:11.095318733+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:22:55.790159971+07:00"}
+ {"id":"EV-238","title":"Epic: Research mutation density gap with mutant","description":"Evilution consistently generates 1.8-2.6x fewer mutations than mutant across 25 feedback sessions. While some of mutant's extras are equivalent ([]→fetch, to_i→Integer()), many catch real edge cases. This epic covers: (1) Audit mutant's operator list systematically against evilution's 54 operators, (2) Identify which missing operators catch real bugs vs produce noise, (3) Prioritize operator additions by signal-to-noise ratio, (4) Target closing the gap to <1.5x. Related: existing gap analysis created 15 new operator issues (EV-214 through EV-224, #491-#505).","notes":"GitHub: #515","status":"closed","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:11.095318733+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T10:41:04.310878722+07:00","closed_at":"2026-04-09T10:41:04.310878722+07:00","close_reason":"Closed"}
  {"id":"EV-239","title":"Epic: Research and fix high memory baseline","description":"Evilution's memory baseline is 718+ MB even for tiny files in v0.18.0 sessions, and grows across consecutive runs in the same session (718→763→795→800 MB). Previous fixes addressed AST node retention and StringIO leaks but baseline remains high. Mutant peaks at ~200 MB for comparable workloads. This epic covers: (1) Profile memory allocation during boot/setup phase, (2) Identify what's consuming the 718 MB baseline, (3) Investigate cross-run memory growth (session-level leak?), (4) Target bringing baseline under 300 MB. Related: existing rake memory:check infrastructure exists.","notes":"GitHub: #517","status":"closed","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:16.602817355+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T00:16:38.665740136+07:00","closed_at":"2026-04-06T00:16:38.665740136+07:00","close_reason":"Premise invalid: 718 MB baseline was the MCP host process, not evilution. Standalone evilution baseline is ~30 MB (confirmed via EV-242/245/246 profiling). No memory fix needed. Sub-issues resolved: EV-242 (30 MB boot baseline), EV-243 (single mutation profiled), EV-245 (no cross-run growth), EV-246 (fork has zero parent-side cost), EV-247 (RSS tracking added)."}
  {"id":"EV-24","title":"Epic: JSON Output Improvements","description":"Make JSON output fully machine-parseable in all scenarios, including errors. Add diagnostic fields that help agents debug failures.","status":"closed","priority":1,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-10T06:17:37.450686472+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-16T11:15:24.900944562+07:00","closed_at":"2026-03-16T11:15:24.900944562+07:00","close_reason":"All children complete: structured errors, test_command in JSON, noise suppression","dependencies":[{"issue_id":"EV-24","depends_on_id":"EV-25","type":"blocks","created_at":"0001-01-01T00:00:00Z"},{"issue_id":"EV-24","depends_on_id":"EV-26","type":"blocks","created_at":"0001-01-01T00:00:00Z"},{"issue_id":"EV-24","depends_on_id":"EV-40","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-240","title":"Neutral mutation category (test errors vs test failures)","description":"Add a 'neutral' classification for mutations where tests error (crash/exception) rather than fail (assertion). This helps users distinguish test infrastructure problems from real coverage gaps. Mutant has this distinction and it's valuable — e.g., killfork 403 errors in vanilla-mafia are infrastructure noise, not coverage gaps. Currently evilution has neutral detection implemented but feedback suggests it doesn't always classify correctly. Reported in 3 sessions.","notes":"GH issue: #521 (https://github.com/marinazzio/evilution/issues/521)","status":"in_progress","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:18.548454128+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T12:21:16.372929584+07:00","external_ref":"gh-521","labels":["reliability"]}
  {"id":"EV-241","title":"Heredoc-aware string literal mutations","description":"Evilution's string_literal operator generates many false survived mutants on heredoc templates by mutating whitespace and literal text around interpolations. MigrationGenerator scored 63.1% due to template whitespace mutations that don't affect generated code behavior. Should either skip literal text in heredocs, only mutate interpolated expressions, or add a --skip-heredoc-literals flag. Reported in 1 session but caused significant score distortion.","notes":"GH issue: #522 (https://github.com/marinazzio/evilution/issues/522)","status":"open","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:20.534650035+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:23:19.83846784+07:00","external_ref":"gh-522","labels":["reliability","false-positives"]}
  {"id":"EV-242","title":"Profile memory allocation during boot/setup phase","description":"Use memory_profiler gem or ObjectSpace to identify what objects are allocated during Evilution boot before any mutations run. Identify the top 10 memory consumers. This will help understand where the 718 MB baseline comes from.","notes":"GitHub: #527","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:39.583808805+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T23:36:18.3122267+07:00","closed_at":"2026-04-05T23:36:18.3122267+07:00","close_reason":"Boot footprint is ~30 MB. The 718 MB baseline was MCP host process, not evilution.","dependencies":[{"issue_id":"EV-242","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-243","title":"Profile memory allocation during single mutation cycle","description":"Measure memory before and after a single mutation+test cycle. Identify what is allocated and not released within one cycle. Compare with mutant's behavior for the same file.","notes":"GitHub: #528","status":"in_progress","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:46.085727121+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T23:37:08.936438689+07:00","dependencies":[{"issue_id":"EV-243","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
- {"id":"EV-244","title":"Run head-to-head comparison on 10 diverse files, catalog every mutation mutant generates that evilution doesn't","description":"Pick 10 files of varying complexity (controller, model, service, validator, lib, migration, Avo resource, helper, concern, formatter). Run both mutant and evilution on each file. Catalog every mutation mutant produces that evilution misses. Classify each as: real signal, likely equivalent, or noise.","notes":"GitHub: #523","status":"open","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:52.935963449+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T15:36:16.088018831+07:00","dependencies":[{"issue_id":"EV-244","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
+ {"id":"EV-244","title":"Run head-to-head comparison on 10 diverse files, catalog every mutation mutant generates that evilution doesn't","description":"Pick 10 files of varying complexity (controller, model, service, validator, lib, migration, Avo resource, helper, concern, formatter). Run both mutant and evilution on each file. Catalog every mutation mutant produces that evilution misses. Classify each as: real signal, likely equivalent, or noise.","notes":"GitHub: #523","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:52.935963449+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T00:06:40.444637431+07:00","closed_at":"2026-04-09T00:06:40.444637431+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-244","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-245","title":"Investigate cross-run session memory growth","description":"Run multiple evilution invocations via MCP in the same session. Measure RSS between invocations. Determine if the growth (718->763->795->800 MB) is in the MCP server process, worker pool, or accumulated state.","notes":"GitHub: #529","status":"in_progress","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:54.234490102+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T23:40:31.266000331+07:00","dependencies":[{"issue_id":"EV-245","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-246","title":"Investigate parent vs child process memory split","description":"Measure RSS of the parent (coordinator) process separately from forked child (worker) processes. Determine where the 718 MB baseline lives — is it the parent before forking, or do children inherit and grow independently?","notes":"GitHub: #531","status":"in_progress","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:59.300347033+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T23:45:11.030185257+07:00","dependencies":[{"issue_id":"EV-246","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-247","title":"Add RSS tracking per mutation to JSON output","description":"Include parent_rss_kb and child_rss_kb fields in each mutation result. child_rss_kb partially exists (seen in feedback log) — verify it is accurate and add parent_rss_kb tracking. This provides ongoing observability for memory usage.","notes":"GitHub: #532","status":"in_progress","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:08.569266807+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T23:49:30.25391422+07:00","dependencies":[{"issue_id":"EV-247","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-248","title":"Implement memory budget CI gate","description":"Add a CI check that runs rake memory:check and fails if peak RSS exceeds a threshold (e.g., 400 MB for a reference fixture). This prevents memory regressions from being merged.","notes":"GitHub: #533","status":"open","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:16.177682161+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T11:45:18.765568988+07:00","close_reason":"Growth-based leak detection in CI (EV-274/PR #571) is sufficient. Absolute peak RSS budget not needed — standalone baseline is ~30 MB (EV-239 premise was invalid). Per-mutation growth check catches regressions effectively.","dependencies":[{"issue_id":"EV-248","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-249","title":"Audit current SourceSurgeon mutation-and-restore flow","description":"Document the current code path: where the file is read, mutated, written, and restored. Identify all callers and the ensure-based restore mechanism. Map the failure modes (SIGKILL, OOM, etc.).","notes":"## Audit Findings\n\n### Read Points\n1. **AST::Parser#call** (parser.rb:11) — File.read to parse with Prism\n2. **Mutator::Base#call** (base.rb:18) — File.read to store as original_source in Mutation\n3. **Integration::RSpec#apply_mutation** (rspec.rb:68) — reads before overwrite (direct-overwrite fallback only)\n4. **Isolation::Fork#restore_original_source** (fork.rb:48) — reads to verify if restore needed (defense-in-depth)\n\n### Mutation (In-Memory)\nSourceSurgeon.apply (source_surgeon.rb:6-10) is pure in-memory byte surgery. Never touches filesystem. Called from Mutator::Base#add_mutation.\n\n### Two Write Paths in Integration::RSpec#apply_mutation\n- **Path A (LOAD_PATH shadow, preferred):** Target under $LOAD_PATH → Dir.mktmpdir, write mutated source to mirrored subpath, prepend to $LOAD_PATH. Original file never touched.\n- **Path B (Direct overwrite, fallback):** Not under $LOAD_PATH → acquires exclusive flock, overwrites original file.\n\n### Restore — Two Layers\n- **Layer 1:** Integration::RSpec#restore_original via ensure in #call (rspec.rb:33-35). Path A: removes temp dir from $LOAD_PATH, purges $LOADED_FEATURES, deletes temp dir. Path B: writes @original_content back, releases flock.\n- **Layer 2:** Isolation::Fork#restore_original_source (fork.rb:47-53) — parent-process defense-in-depth. Only in sequential (Fork isolation) path. NOT in parallel path.\n\n### Execution Paths\n- **Sequential (jobs=1):** Runner → Isolation::Fork → fork child → Integration::RSpec#call → ensure restore (child) + ensure restore (parent)\n- **Parallel (jobs>1):** Runner → Parallel::Pool → WorkQueue forks workers → Isolation::InProcess → Integration::RSpec#call → ensure restore only. No parent-side defense-in-depth.\n\n### Failure Modes\n- Normal flow / exception: Safe on both paths\n- SIGKILL child (sequential): Safe (parent restores on direct-overwrite; file untouched on temp-dir)\n- **SIGKILL worker (parallel) + direct-overwrite: FILE CORRUPTED — no recovery**\n- **OOM parallel worker + direct-overwrite: FILE CORRUPTED**\n- **SIGINT/SIGTERM to parent + direct-overwrite: File may be corrupted (zero signal handlers in lib/)**\n- Disk full during restore: File stays corrupted\n\n### Key Findings\n- Zero trap/at_exit/Signal.trap calls in entire lib/ directory\n- Biggest risk: direct-overwrite fallback in parallel mode (no parent-side restore)\n- Epic EV-237 should eliminate direct-overwrite path entirely, making LOAD_PATH shadow the only path","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:17.689070042+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.183398494+07:00","closed_at":"2026-04-08T15:53:00.183398494+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-249","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-25","title":"Structured error responses in JSON mode","description":"When --format json is used and exit code is 2 (error), output a JSON object with error details instead of unstructured stderr text. Schema: { \"error\": { \"type\": \"config_error|parse_error|runtime_error\", \"message\": \"...\", \"file\": \"...\" } }. Agents currently have to regex-parse stderr which is fragile.","status":"closed","priority":1,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-10T06:17:38.283715502+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-15T22:41:54.370789377+07:00","closed_at":"2026-03-15T22:41:54.370789377+07:00","close_reason":"Merged PR #74 — structured JSON error output in CLI"}
- {"id":"EV-250","title":"Classify mutant's extra mutations by operator category","description":"From the head-to-head data (EV-244), group mutant's extra mutations by category (e.g., receiver mutations, argument permutations, method name substitutions, literal boundary values). Count how many are signal vs noise per category. Produce a table of categories with signal/noise breakdown.","notes":"GitHub: #526","status":"open","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:21.320269648+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:29:32.613448913+07:00","dependencies":[{"issue_id":"EV-250","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
- {"id":"EV-251","title":"Prioritize operator additions by signal-to-noise ratio","description":"Rank the missing operator categories by: (a) frequency of real signal catches, (b) implementation complexity, (c) expected equivalent mutant rate. Produce a prioritized implementation order for new operators.","notes":"GitHub: #534","status":"open","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:38.370181376+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:29:53.283203728+07:00","dependencies":[{"issue_id":"EV-251","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
+ {"id":"EV-250","title":"Classify mutant's extra mutations by operator category","description":"From the head-to-head data (EV-244), group mutant's extra mutations by category (e.g., receiver mutations, argument permutations, method name substitutions, literal boundary values). Count how many are signal vs noise per category. Produce a table of categories with signal/noise breakdown.","notes":"GitHub: #526","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:21.320269648+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T09:11:20.047999158+07:00","closed_at":"2026-04-09T09:11:20.047999158+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-250","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
+ {"id":"EV-251","title":"Prioritize operator additions by signal-to-noise ratio","description":"Rank the missing operator categories by: (a) frequency of real signal catches, (b) implementation complexity, (c) expected equivalent mutant rate. Produce a prioritized implementation order for new operators.","notes":"GitHub: #534","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:38.370181376+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T10:41:04.207816996+07:00","closed_at":"2026-04-09T10:41:04.207816996+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-251","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-252","title":"Reproduce and measure per-mutation RSS growth on reference fixture","description":"Create a reproducible benchmark: run evilution on a known fixture file, record RSS after each mutation. Confirm the ~3-8 MB/mutation growth rate. Baseline for measuring fix effectiveness.","notes":"GitHub: #539","status":"in_progress","priority":0,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:39.421709567+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T15:08:27.260195719+07:00","dependencies":[{"issue_id":"EV-252","depends_on_id":"EV-226","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-253","title":"Profile object allocation delta per mutation cycle","description":"Use ObjectSpace.count_objects or memory_profiler to capture what new objects are allocated during one mutation cycle and not released. Identify the top retained object types.","notes":"GitHub: #540 — Profiling complete. Root cause: RSpec ExampleGroup subclass ivars create reference cycles preventing GC (+3380 slots/mutation). Secondary: World#@sources_by_path cache. Fix proven: clearing EG ivars + sources cache after Runner.run = 0 growth.","status":"closed","priority":0,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:41.259785614+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T15:50:12.517441649+07:00","closed_at":"2026-04-05T15:50:12.517441649+07:00","close_reason":"Profiling complete. Root cause identified: RSpec ExampleGroup reference cycles + World source cache. Findings documented on GH #540.","dependencies":[{"issue_id":"EV-253","depends_on_id":"EV-226","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-254","title":"Design temp-file mutation architecture","description":"Design how the temp file is created, where it lives (tmpdir vs .evilution/tmp), how the test process is redirected to load it (Ruby $LOAD_PATH manipulation vs file-level bootsnap override vs ENV-based), and how Rails autoloader interacts with it.","notes":"## Architecture Design: Temp-File Mutation\n\n### Problem Statement\nTwo write paths exist in Integration::RSpec#apply_mutation:\n- **Path A (LOAD_PATH shadow):** Works when file is under $LOAD_PATH — writes to temp dir, prepends to $LOAD_PATH. Original file never touched. Already safe.\n- **Path B (Direct overwrite):** Fallback when file is NOT under $LOAD_PATH — overwrites original, restores via ensure. Vulnerable to SIGKILL/OOM (especially in parallel mode where no parent-side defense-in-depth exists).\n\n**Goal:** Eliminate Path B entirely. Never modify the original source file.\n\n---\n\n### Design Decisions\n\n#### 1. Temp file location: Dir.mktmpdir (system tmpdir)\n- Use `Dir.mktmpdir('evilution')` (same as current Path A), NOT .evilution/tmp\n- Rationale: system tmpdir is auto-cleaned on reboot; no risk of polluting the project directory; avoids .gitignore concerns; already proven in current Path A\n\n#### 2. Redirection mechanism: LOAD_PATH shadow + explicit load\n**For files under $LOAD_PATH** (most lib/ files): Keep current approach — mirror subpath in temp dir, prepend to $LOAD_PATH. This handles any `require` calls during the test run.\n\n**For files NOT under $LOAD_PATH** (the current fallback case): \n- Write mutated source to temp dir mirroring the relative path from project root\n- Explicitly `load` the temp file in the forked child to redefine the class/module\n- This replaces the direct-overwrite approach entirely\n- The `load` approach works because it always executes the file (unlike `require` which checks $LOADED_FEATURES)\n\n#### 3. 
SourceSurgeon: No changes needed\nSourceSurgeon.apply is already pure in-memory byte surgery. It returns a mutated string without touching the filesystem. No changes required for this epic.\n\n#### 4. Where file I/O moves\n**Integration::RSpec#apply_mutation** remains the owner of temp-file writes, but the two-path logic changes:\n- Path A (under $LOAD_PATH): unchanged — temp dir + $LOAD_PATH prepend\n- Path B (not under $LOAD_PATH): temp dir + explicit `load` (replaces direct overwrite)\n- Both paths use temp files. Original is never touched.\n\n#### 5. Restore/cleanup strategy (three layers)\n1. **ensure in Integration::RSpec#call** (existing): Remove temp dir from $LOAD_PATH, purge $LOADED_FEATURES entries, FileUtils.rm_rf temp dir. Works for both paths now.\n2. **ensure in Isolation::Fork#call** (existing): Simplify — no longer needs to check/restore original file content. Instead, just verify temp dir cleanup. Keep as defense-in-depth for temp dir leaks.\n3. **at_exit hook** (new): Register a cleanup for the temp base dir pattern (evilution*) in case of unhandled exit. Safety net for leaked temp dirs.\n4. **Signal traps** (new): Trap SIGTERM/SIGINT in the parent process to ensure temp dir cleanup before exit.\n\n#### 6. Isolation::Fork#restore_original_source\n- Remove the file-content comparison and rewrite logic\n- Replace with temp-dir cleanup verification (check if any evilution temp dirs remain, clean them)\n- This is now truly defense-in-depth rather than a critical restore path\n\n#### 7. Parallel mode (InProcess isolation)\n- No special handling needed — each worker is a forked process with its own $LOAD_PATH\n- Temp dirs are per-mutation, isolated across workers\n- The biggest current risk (corrupted original file on worker SIGKILL) is eliminated because the original file is never modified\n\n#### 8. 
Zeitwerk (Rails autoloader) compatibility\n- Zeitwerk maps file paths to constant names using autoload_paths (which are $LOAD_PATH entries in Rails)\n- For files under Zeitwerk-managed paths: LOAD_PATH shadow works — Zeitwerk will find the temp version first\n- For files NOT under Zeitwerk paths: the explicit `load` approach bypasses Zeitwerk entirely, which is correct since Zeitwerk wouldn't manage those files anyway\n- Edge case: Zeitwerk caches file-to-constant mappings. In a forked child, the cache is inherited. Since we `load` after fork, the class is redefined in-place — Zeitwerk's cache remains valid (same constant, new definition)\n- Need integration test to verify (EV-268)\n\n---\n\n### Implementation Order\n1. **EV-263** (SourceSurgeon temp-file write): Modify apply_mutation to always use temp files. Add explicit `load` for non-LOAD_PATH files. Remove direct-overwrite fallback.\n2. **EV-265** (Load-path redirection): Refine the LOAD_PATH prepend logic. Handle edge cases (multiple LOAD_PATH matches, nested paths).\n3. **EV-267** (Cleanup): Add at_exit hook and signal traps. Simplify Isolation::Fork defense-in-depth.\n4. **EV-266** (Zeitwerk): Test and handle Zeitwerk edge cases.\n5. 
**EV-268** (Integration tests): Verify original file never modified, cleanup on normal/exceptional/signal exit, Zeitwerk compat.\n\n### Files to modify\n- `lib/evilution/integration/rspec.rb` — primary changes (apply_mutation, restore_original)\n- `lib/evilution/isolation/fork.rb` — simplify restore_original_source\n- `lib/evilution/isolation/in_process.rb` — no changes expected\n- `lib/evilution/ast/source_surgeon.rb` — no changes\n- `lib/evilution/runner.rb` — possibly add at_exit/signal trap registration","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:42.91131604+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.18355494+07:00","closed_at":"2026-04-08T15:53:00.18355494+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-254","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
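The LOAD_PATH shadow described in the EV-254 notes can be sketched in a few lines. This is a minimal illustration, not evilution's implementation: the feature name `evilution_shadow_demo/widget`, the `ShadowDemo` module, and the `write_feature` helper are hypothetical, and the real tool runs this inside a forked child rather than in the current process.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical feature name used only for this sketch.
FEATURE = "evilution_shadow_demo/widget"

# Write a Ruby file for FEATURE under the given load-path root.
def write_feature(root, body)
  path = File.join(root, "#{FEATURE}.rb")
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, body)
  path
end

original_root = Dir.mktmpdir("original")
# realpath so the prefix matches what Ruby records in $LOADED_FEATURES.
shadow_root = File.realpath(Dir.mktmpdir("evilution"))

original_path = write_feature(original_root, "module ShadowDemo; ANSWER = 42; end\n")
write_feature(shadow_root, "module ShadowDemo; ANSWER = 43; end\n") # the "mutated" copy

$LOAD_PATH.unshift(original_root)
$LOAD_PATH.unshift(shadow_root) # earlier entries win, so the shadow is found first

begin
  require FEATURE
  loaded_answer = ShadowDemo::ANSWER
ensure
  # Undo the redirection and remove the temp copy; the original is never written.
  $LOAD_PATH.delete(shadow_root)
  $LOAD_PATH.delete(original_root)
  $LOADED_FEATURES.reject! { |feature| feature.start_with?(shadow_root) }
  FileUtils.rm_rf(shadow_root)
end

original_untouched = File.read(original_path).include?("42")
FileUtils.rm_rf(original_root)
```

Because the mutated source only ever lives in the temp directory, a killed process can at worst leak that directory; the file under test stays as it was.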
@@ -207,7 +207,7 @@
  {"id":"EV-261","title":"Add --skip-heredoc-literals CLI flag","description":"Add a flag to completely skip string literal mutations inside heredocs. For users who prefer zero heredoc mutations.","notes":"GitHub: #548","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:56.515382643+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T19:20:01.937747302+07:00","closed_at":"2026-04-08T19:20:01.937747302+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-261","depends_on_id":"EV-241","type":"blocks","created_at":"0001-01-01T00:00:00Z"},{"issue_id":"EV-261","depends_on_id":"EV-260","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-262","title":"Add tests for heredoc mutation behavior","description":"Test: heredoc with no interpolation (skipped or mutated to empty), heredoc with interpolation (only expressions mutated), squiggly heredoc, nested heredoc.","notes":"GitHub: #549","status":"in_progress","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:56.333773283+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T19:20:02.042056553+07:00","dependencies":[{"issue_id":"EV-262","depends_on_id":"EV-241","type":"blocks","created_at":"0001-01-01T00:00:00Z"},{"issue_id":"EV-262","depends_on_id":"EV-261","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-263","title":"Implement temp-file write in SourceSurgeon","description":"Modify SourceSurgeon.apply to write mutated source to a temp file instead of overwriting the original. Return the temp file path. Original file is never touched.","notes":"GitHub: #537","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:59.265360981+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.183566714+07:00","closed_at":"2026-04-08T15:53:00.183566714+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-263","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
- {"id":"EV-264","title":"Define target metric and measurement methodology for mutation density gap","description":"Define what 'closing the gap' means: target ratio (e.g., <1.5x), measurement protocol (which files, which mutant config), and a benchmark script that can be re-run to track progress over time.","notes":"GitHub: #541","status":"open","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:30:08.545241632+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:30:26.634822515+07:00","dependencies":[{"issue_id":"EV-264","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
+ {"id":"EV-264","title":"Define target metric and measurement methodology for mutation density gap","description":"Define what 'closing the gap' means: target ratio (e.g., <1.5x), measurement protocol (which files, which mutant config), and a benchmark script that can be re-run to track progress over time.","notes":"GitHub: #541","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:30:08.545241632+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T23:19:42.57055849+07:00","closed_at":"2026-04-08T23:19:42.57055849+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-264","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-265","title":"Implement load-path redirection for forked test process","description":"In the fork isolation, prepend the temp directory to $LOAD_PATH (or use a more targeted mechanism) so that require and load pick up the mutated file instead of the original.","notes":"GitHub: #550","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:30:17.522950262+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.18357647+07:00","closed_at":"2026-04-08T15:53:00.18357647+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-265","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-266","title":"Handle Rails autoloader (Zeitwerk) compatibility","description":"Zeitwerk uses absolute paths. Test that the temp-file approach works with Zeitwerk's file-to-constant mapping. May need to use Zeitwerk's on_load callbacks or file override mechanism.","notes":"GitHub: #551","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:30:40.575881302+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.183584148+07:00","closed_at":"2026-04-08T15:53:00.183584148+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-266","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-267","title":"Add cleanup of temp files after mutation run","description":"Ensure temp files are cleaned up on normal exit, exception, and signal (SIGTERM/SIGINT). Use at_exit hooks and signal traps.","notes":"GitHub: #552","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:30:57.733824316+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.18359655+07:00","closed_at":"2026-04-08T15:53:00.18359655+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-267","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
@@ -221,6 +221,9 @@
  {"id":"EV-274","title":"Add rake memory:check to CI pipeline","description":"Add the memory leak regression check (rake memory:check) as a CI step. This catches regressions in isolation/integration code by detecting per-mutation RSS growth spikes. Requires Linux runner (/proc/self/status). Consider running after the main spec suite to avoid blocking fast feedback.","notes":"GitHub: #566","status":"closed","priority":3,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-05T22:53:27.447410371+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T11:52:58.110045281+07:00","closed_at":"2026-04-06T11:52:58.110045281+07:00","close_reason":"Implemented in PR #571. Added memory_check job to CI workflow with pinned runner (ubuntu-24.04), SHA-pinned actions, and explicit threshold env vars."}
  {"id":"EV-275","title":"Use project's own complex classes as memory check fixture","description":"Replace the simple_class.rb fixture in script/memory_check with more complex classes from the evilution codebase itself (e.g. Runner, Config, AST::Parser). This provides realistic per-mutation load: more ExampleGroup subclasses, deeper spec nesting, and heavier metadata — closer to what users see in real projects. Affects check #5 (RSpec integration per-mutation) primarily.","notes":"GitHub: #567","status":"in_progress","priority":3,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-05T22:53:30.214655275+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T12:36:50.959507734+07:00"}
  {"id":"EV-276","title":"InProcess suppress_output closes /dev/null handles causing 'closed stream' on reuse with clear_examples","description":"InProcess#suppress_output uses File.open with blocks that auto-close /dev/null handles after each call. When the RSpec integration uses clear_examples (which reuses Configuration), the formatter retains a reference to $stdout from the first run. On subsequent calls, the formatter writes to the closed handle, causing 'closed stream' errors. Fix: use persistent /dev/null handles or StringIO.","notes":"GitHub: #569","status":"in_progress","priority":2,"issue_type":"bug","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-05T23:06:51.502713099+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T12:02:05.127419951+07:00"}
+ {"id":"EV-277","title":"Multi-byte character offset bug in AST::Parser subject extraction","description":"When parsing files containing multi-byte characters (e.g. Cyrillic), AST::Parser uses @source[loc.start_offset...loc.end_offset] on the string directly instead of on the binary representation. Prism byte offsets are byte-based, but Ruby string slicing is character-based for encoded strings, causing extracted method bodies to be garbled. Fix: use @source.b[offset...end].force_encoding(@source.encoding) as done in the mutant_json_adapter workaround.","notes":"GitHub: #615","status":"open","priority":1,"issue_type":"bug","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-09T00:01:43.021380194+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T00:01:58.491188786+07:00"}
+ {"id":"EV-278","title":"IndexToAt operator: [] → .at() substitution","description":"Add mutation operator that replaces array/hash [] access with .at() method. .at() returns nil on out-of-bounds instead of raising, exposing missing bounds checks. Identified in EV-251 prioritization as the only uncovered signal category (60 mutations in benchmark corpus). Low implementation complexity — match CallNode with name [] on collection receivers.","notes":"GitHub: #618","status":"in_progress","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-09T09:12:09.052329944+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T11:26:17.496127389+07:00"}
+ {"id":"EV-279","title":"BlockPassRemoval operator: remove &:method block pass","description":"Add mutation operator that removes &:symbol block pass arguments from method calls (e.g. map(&:to_s) → map). Low priority — only 5 mutations in benchmark corpus, but trivial to implement. Identified in EV-251 prioritization.","notes":"GitHub: #619","status":"in_progress","priority":4,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-09T09:12:14.145760441+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T10:43:28.718393237+07:00"}
  {"id":"EV-28","title":"MCP server for direct tool invocation","description":"Implement a Model Context Protocol (MCP) server that exposes evilution as a tool. Agents could call evilution directly instead of shelling out and parsing output. The server should expose a 'mutate' tool that accepts target files, options, and returns structured results.","status":"closed","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-10T06:17:45.29866593+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-16T22:58:51.734461132+07:00","closed_at":"2026-03-16T22:58:51.734461132+07:00","close_reason":"PR #103 merged — MCP server with evilution-mutate tool via stdio transport"}
  {"id":"EV-29","title":"Add --stdin flag to accept file list from stdin","description":"Add a --stdin flag that reads target file paths (one per line) from stdin. Enables workflows like: git diff --name-only | evilution run --stdin --format json. Each line can include line-range syntax (e.g. lib/foo.rb:15-30).","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-10T06:17:46.306306092+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-16T18:03:28.998559073+07:00","closed_at":"2026-03-16T18:03:28.998559073+07:00","close_reason":"PR #92 merged — --stdin flag for piped file list workflows"}
  {"id":"EV-3","title":"Phase 2: Mutation Operators & CLI","description":"Implement remaining 17 mutation operators, build CLI with OptionParser, exe/evilution executable, human-readable reporter. Milestone: bundle exec evilution run lib/user.rb --format json","status":"closed","priority":2,"issue_type":"epic","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-02T00:05:00.492971295+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-02T11:21:32.168384165+07:00","closed_at":"2026-03-02T11:21:32.168384165+07:00","close_reason":"Phase 2 complete: all 18 operators, CLI, Reporter::CLI, Registry registration, executable","dependencies":[{"issue_id":"EV-3","depends_on_id":"EV-2","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
@@ -0,0 +1,5 @@
+ {
+ "enabledPlugins": {
+ "superpowers@claude-plugins-official": true
+ }
+ }
data/CHANGELOG.md CHANGED
@@ -1,5 +1,40 @@
  # Changelog
 
+ ## [0.22.1] - 2026-04-10
+
+ ### Added
+
+ - **Error class and backtrace capture** — `MutationResult` now stores `error_class` and `error_backtrace` alongside `error_message`; the backtrace array is duplicated and frozen to keep results immutable; both fields are threaded through `Isolation::Fork` (Marshal-safe across the IPC pipe), `Isolation::InProcess`, and the runner's `compact_result` / `rebuild_results` path (#648, PR #659)
+ - **Verbose error diagnostics** — `--verbose` now logs error class, message, and the first 5 backtrace lines for errored mutations (previously `--verbose` only showed memory/GC stats, leaving errors invisible) (#648, PR #659)
+ - **Error details in JSON reports** — JSON reporter output includes `error_class` and `error_backtrace` fields under `errors[]` entries when present, so downstream tools (CI, MCP consumers) can surface failure causes without re-running (#648, PR #659)
+
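The capture-and-freeze idea in the entries above can be sketched as follows. This is an illustration under stated assumptions, not the gem's code: `ErrorInfo` and `capture_error` are hypothetical names standing in for `MutationResult` and whatever builds it.

```ruby
# Hypothetical stand-in for a mutation result; not evilution's actual MutationResult.
ErrorInfo = Struct.new(:error_message, :error_class, :error_backtrace, keyword_init: true)

# Copy the details out of the exception so the result holds only plain,
# immutable data: a String class name and a frozen Array of Strings.
def capture_error(error)
  ErrorInfo.new(
    error_message: error.message,
    error_class: error.class.name,
    error_backtrace: Array(error.backtrace).map(&:dup).freeze
  ).freeze
end

info = begin
  nil.upcase
rescue StandardError => e
  capture_error(e)
end

# Simulate the trip across the fork pipe: dump in the child, load in the parent.
restored = Marshal.load(Marshal.dump(info))
puts "#{restored.error_class}: #{restored.error_message}"
```

Storing strings rather than the exception object keeps the payload small and avoids depending on the exception's own state surviving serialization.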
+ ### Fixed
+
+ - **Silent load-time crashes in `Isolation::Fork`** — mutations that raised non-`SyntaxError` script errors at load time (e.g. `NoMethodError: super called outside of method`) escaped `Integration::Base`'s narrow rescue and either surfaced cryptically or went silent under fork isolation; both isolators now rescue `ScriptError, StandardError` as a safety net and report them as `:error` status with full class and backtrace (#646, PR #656)
+ - **`symbol_literal` operator breaking keyword arguments** — mutating symbols in label form (`foo:` inside hash literals or keyword arguments) produced invalid Ruby source; the operator now detects label-form symbols via Prism's `closing_loc` and skips them, only mutating standalone symbol literals (`:foo`) (#647, PR #657)
+ - **Syntax errors in mutated source crashing in-process runs** — `Integration::Base#apply_mutation` now captures `SyntaxError` during `require`/`load` and returns a structured error result instead of propagating the exception up through `call`; error results include the error class and backtrace for diagnosis (#644, #645, PR #653, PR #655)
+
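The rescue list matters here because `SyntaxError` and `LoadError` descend from `ScriptError`, not `StandardError`, so a bare `rescue` or `rescue StandardError` lets them through. A small sketch of the wider rescue, using a hypothetical `try_load` helper that evaluates a string in place of loading a mutated file:

```ruby
# Hypothetical helper, not evilution's API: evaluates a source string and
# converts load-time failures into a result hash instead of raising.
def try_load(source)
  eval(source)
  { status: :ok }
rescue ScriptError, StandardError => e
  { status: :error, error_class: e.class.name, error_message: e.message }
end

syntax  = try_load("def broken(")                                # SyntaxError (a ScriptError)
missing = try_load('require "evilution_demo_missing_feature"')   # LoadError (a ScriptError)
runtime = try_load("nil.upcase")                                 # NoMethodError (a StandardError)
fine    = try_load("1 + 1")

[syntax, missing, runtime, fine].each { |result| puts result.inspect }
```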
+ ### Changed
+
+ - **Integration::Base refactor** — `apply_mutation` split into `apply_via_require` and `apply_via_load` helpers; rescue scope moved from `#call` to `#apply_mutation` so load-time errors return a result hash while abstract-method `NotImplementedError`s still propagate as intended
+
+ ## [0.22.0] - 2026-04-09
+
+ ### Added
+
+ - **Minitest integration** — full Minitest support as an alternative to RSpec; abstract `Integration::Base` framework with template method pattern; `Integration::Minitest` with programmatic `Minitest.__run` execution, `MinitestCrashDetector` reporter for distinguishing assertion failures from crashes; `--integration minitest` CLI flag and `integration: minitest` config option; `SpecResolver` parameterized for Minitest file discovery (`test/`, `_test.rb`); plugin-based runner dispatch via `INTEGRATIONS` registry; baseline runner abstracted from RSpec with injectable runner callable; Minitest concrete suggestion templates using `def test_`/`assert_equal` style (#223, #224, #225, #226, #227, #228, #229, #230)
+ - **New mutation operators (3)** — `index_to_at` replaces `arr[0]` with `arr.at(0)` for array index access (#618); `regex_simplification` simplifies regex character classes and quantifiers (#514); `block_pass_removal` removes block arguments (`&...`) in method calls (#619)
+ - **Mutation density benchmarking** — comparison tools and methodology for measuring mutation density against reference tool; baseline results and operator classification documents (#523, #526, #541)
+
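To see what a test would have to notice for each of the three new operators, the snippet below compares the original and mutated expressions in plain Ruby. It only exercises core-library behavior; it does not call into evilution, and the variable names are illustrative.

```ruby
letters = %w[a b c]

# index_to_at: for a single integer index the two calls agree,
# including out-of-range indexes, where both return nil.
same_in_range = letters[0] == letters.at(0)
same_out_of_range = letters[10].nil? && letters.at(10).nil?

# They diverge when the receiver is a Hash (Hash does not define #at)
# or when more than one argument is passed.
hash_has_at = { a: 1 }.respond_to?(:at)
two_arg_error = begin
  letters.at(0, 2)
rescue ArgumentError => e
  e.class
end

# block_pass_removal: without the block, map returns an Enumerator.
with_block = [1, 2].map(&:to_s)
without_block = [1, 2].map

# regex_simplification: dropping + or the range dash narrows the match.
digits_plus = "123"[/\d+/]
digits_one = "123"[/\d/]
range_match = "m".match?(/[a-z]/)
literal_match = "m".match?(/[az]/)
```

The cases where the two forms agree are the ones most likely to survive as equivalent mutants; the divergent cases are where a test can kill the mutation.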
+ ### Fixed
+
+ - **Multi-byte character offset bug** — Prism byte offsets were used with character-based `String#[]`, causing garbled source extraction for files with multi-byte characters (UTF-8 Cyrillic, Thai, CJK, etc.); fixed `AST::Parser`, `DisableComment`, and 7 mutation operators to use `byteslice`/`getbyte`; added `byteslice_source` helper to `Mutator::Base` (#615)
+
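The mismatch is easy to reproduce without a parser. In this sketch the byte offsets are computed by hand to stand in for the byte-based offsets Prism reports; the sample source and variable names are assumptions made for illustration.

```ruby
# "\u0438\u043C\u044F" is a three-character Cyrillic identifier (6 bytes in UTF-8).
source = "\u0438\u043C\u044F = 1\ndef foo; end\n"
snippet = "def foo; end"

# Byte offsets of the kind a byte-oriented parser reports (computed here by hand).
start_offset = source.b.index(snippet)
end_offset = start_offset + snippet.bytesize

# Character-based slicing with byte offsets lands three characters too far right.
wrong = source[start_offset...end_offset]

# Byte-based slicing returns the intended span and keeps the UTF-8 encoding.
right = source.byteslice(start_offset, end_offset - start_offset)

# Equivalent workaround: slice the binary copy, then restore the encoding.
via_binary = source.b[start_offset...end_offset].force_encoding(source.encoding)

puts wrong.inspect
puts right.inspect
```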
+ ### Changed
+
+ - **Operator count** — 72 operators (up from 69), with new index-to-at, regex simplification, and block pass removal operators
+ - **Test framework support** — RSpec and Minitest both supported; documentation updated throughout CLI help, MCP tool descriptions, and README
+
  ## [0.21.0] - 2026-04-08
 
  ### Added
data/README.md CHANGED
@@ -7,7 +7,7 @@
  * **License**: MIT (free, no commercial restrictions)
  * **Language**: Ruby >= 3.3
  * **Parser**: Prism (Ruby's official AST parser, ships with Ruby 3.3+)
- * **Test framework**: RSpec (currently the only supported integration)
+ * **Test frameworks**: RSpec and Minitest
 
  ## Installation
 
@@ -56,10 +56,11 @@ evilution [command] [options] [files...]
  | `-j`, `--jobs N` | Integer | 1 | Number of parallel workers. Uses demand-driven work distribution with pipe-based IPC. |
  | `--no-baseline` | Boolean | _(enabled)_ | Skip baseline test suite check. By default, a baseline run detects pre-existing failures and marks those mutations as `neutral`. |
  | `--fail-fast [N]` | Integer | _(none)_ | Stop after N surviving mutants (default 1 if no value given). |
- | `-v`, `--verbose` | Boolean | false | Verbose output with RSS memory and GC stats per phase and per mutation. |
- | `--suggest-tests` | Boolean | false | Generate concrete RSpec test code in suggestions instead of static descriptions. |
+ | `-v`, `--verbose` | Boolean | false | Verbose output with RSS memory and GC stats per phase and per mutation; also prints error class, message, and first 5 backtrace lines for errored mutations. |
+ | `--suggest-tests` | Boolean | false | Generate concrete test code in suggestions (RSpec or Minitest, based on `--integration`). |
  | `-q`, `--quiet` | Boolean | false | Suppress output. |
  | `--stdin` | Boolean | false | Read target file paths from stdin (one per line). |
+ | `--integration NAME` | String | `rspec` | Test framework integration: `rspec` or `minitest`. |
  | `--incremental` | Boolean | false | Cache killed/timeout results; skip unchanged mutations on re-runs. |
  | `--save-session` | Boolean | false | Persist results as timestamped JSON under `.evilution/results/`. |
  | `--no-progress` | Boolean | _(enabled)_ | Disable the TTY progress bar. |
@@ -86,8 +87,8 @@ Creates `.evilution.yml`:
  # timeout: 30 # seconds per mutation
  # format: text # text | json | html
  # min_score: 0.0 # 0.0–1.0
- # integration: rspec # test framework
- # suggest_tests: false # concrete RSpec test code in suggestions
+ # integration: rspec # test framework: rspec, minitest
+ # suggest_tests: false # concrete test code in suggestions (matches integration)
  # save_session: false # persist results under .evilution/results/
  # skip_heredoc_literals: false # skip all string literal mutations inside heredocs
  # show_disabled: false # report mutations skipped by disable comments
@@ -162,13 +163,20 @@ Use `--format json` for machine-readable output. Schema:
  ],
  "killed": ["... same shape as survived entries ..."],
  "timed_out": ["... same shape as survived entries ..."],
- "errors": ["... same shape as survived entries ..."]
+ "errors": [
+ {
+ "... same shape as survived entries, plus: ...": "",
+ "error_message": "string (optional) — error message from the failing mutation",
+ "error_class": "string (optional) — exception class name (e.g. 'SyntaxError', 'NoMethodError')",
+ "error_backtrace": ["string (optional) — first 5 backtrace lines from the exception"]
+ }
+ ]
  }
  ```
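A consumer of the report might read the new optional fields like this. The report fragment below is a made-up sample shaped after the schema above, not real tool output, and the fallback label `UnknownError` is an assumption of this sketch.

```ruby
require "json"

# Hypothetical report fragment; real reports carry more keys per entry.
report = JSON.parse(<<~JSON)
  {
    "errors": [
      {
        "operator": "symbol_literal",
        "error_message": "unexpected end-of-input",
        "error_class": "SyntaxError",
        "error_backtrace": ["lib/example.rb:3"]
      },
      { "operator": "index_to_at" }
    ]
  }
JSON

# All three error fields are optional, so fall back gracefully when absent.
summaries = report.fetch("errors").map do |entry|
  klass = entry.fetch("error_class", "UnknownError")
  first_frame = Array(entry["error_backtrace"]).first
  [entry["operator"], klass, first_frame].compact.join(" | ")
end

puts summaries
```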
 
  **Key metric**: `summary.score` — the mutation score. Higher is better. 1.0 means all mutations were caught.
 
- ## Mutation Operators (69 total)
+ ## Mutation Operators (72 total)
 
  Each operator name is stable and appears in JSON output under `survived[].operator`.
 
@@ -201,8 +209,10 @@ Each operator name is stable and appears in JSON output under `survived[].operat
  | `keyword_argument` | Remove keyword defaults/params | `def foo(bar: 42)` -> `def foo(bar:)` |
  | `multiple_assignment` | Remove targets or swap order | `a, b = 1, 2` -> `b, a = 1, 2` |
  | `block_removal` | Remove blocks from method calls | `items.map { \|x\| x * 2 }` -> `items.map` |
+ | `block_pass_removal` | Remove block arguments passed with `&` | `items.map(&:to_s)` -> `items.map` |
  | `range_replacement` | Swap inclusive/exclusive ranges | `1..10` -> `1...10` |
  | `regexp_mutation` | Replace regexp with always/never matching | `/pat/` -> `/a\A/` |
+ | `regex_simplification` | Simplify regex quantifiers, anchors, ranges | `/\d+/` -> `/\d/`, `/[a-z]/` -> `/[az]/` |
  | `receiver_replacement` | Drop explicit `self` receiver | `self.foo` -> `foo` |
  | `send_mutation` | Swap semantically related methods | `detect` -> `find`, `map` -> `flat_map` |
  | `compound_assignment` | Swap compound assignment operators | `+=` -> `-=`, `&&=` -> `\|\|=` |
@@ -224,6 +234,7 @@ Each operator name is stable and appears in JSON output under `survived[].operat
  | `bitwise_complement` | Remove or swap `~` | `~x` -> `x`, `~x` -> `-x` |
  | `zsuper_removal` | Replace implicit `super` with `nil` | `super` -> `nil` |
  | `explicit_super_mutation` | Mutate explicit super arguments | `super(a, b)` -> `super` |
+ | `index_to_at` | Replace `[]` with `.at()` for arrays | `arr[0]` -> `arr.at(0)` |
  | `index_to_fetch` | Replace `[]` with `.fetch()` | `h[k]` -> `h.fetch(k)` |
  | `index_to_dig` | Replace `[]` chains with `.dig()` | `h[a][b]` -> `h.dig(a, b)` |
  | `index_assignment_removal` | Remove `[]=` assignments | `h[k] = v` -> removed |
@@ -290,9 +301,9 @@ Use `minimal` when context window budget is tight and you only need to see what

  ### Concrete Test Suggestions

- The MCP tool accepts a `suggest_tests` boolean parameter (default: `false`). When enabled, survived mutation suggestions contain concrete RSpec `it` blocks that an agent can drop into a spec file, instead of static description text.
+ The MCP tool accepts a `suggest_tests` boolean parameter (default: `false`). When enabled, survived mutation suggestions contain concrete test code that an agent can drop into a test file, instead of static description text. The MCP tool currently generates RSpec-style suggestions (`it`/`expect` blocks).

- Pass `suggest_tests: true` in the MCP tool call, or use `--suggest-tests` on the CLI, to activate this mode.
+ Pass `suggest_tests: true` in the MCP tool call to activate this mode. The CLI also supports `--suggest-tests`; when using the CLI, generated suggestions match the `--integration` setting (RSpec `it`/`expect` blocks or Minitest `def test_`/`assert_equal` methods).

  > **Note**: `.mcp.json` is gitignored by default since it is a local editor/agent configuration file.

@@ -356,11 +367,15 @@ Use when you know which file was modified and want to verify its test coverage.
  For each entry in `survived[]`:
  1. Read `file` at `line` to understand the code context
  2. Read `operator` to understand what was changed
- 3. Read `suggestion` for a hint on what test to write (use `--suggest-tests` for concrete RSpec code)
+ 3. Read `suggestion` for a hint on what test to write (use `--suggest-tests` for concrete test code)
  4. Write a test that would fail if the mutation were applied
  5. Re-run evilution on just that file to verify the mutant is now killed
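
The steps above can be turned into a per-file worklist with a few lines of Ruby. This is a sketch only: the `file`, `line`, `operator`, and `suggestion` keys come from the steps above, while the sample values and the `triage` helper are invented for illustration.

```ruby
require "json"

# Illustrative data: field names follow the steps above, values are made up.
report_json = <<~JSON
  {
    "survived": [
      { "file": "lib/example.rb", "line": 30, "operator": "block_removal", "suggestion": "Assert on the mapped values" },
      { "file": "lib/example.rb", "line": 12, "operator": "index_to_fetch", "suggestion": "Assert on the missing-key case" },
      { "file": "lib/other.rb", "line": 7, "operator": "range_replacement", "suggestion": "Test the upper boundary" }
    ]
  }
JSON

# Hypothetical helper: group survivors by file and order them by line number.
def triage(report)
  report.fetch("survived", [])
        .group_by { |mutation| mutation["file"] }
        .transform_values { |mutations| mutations.sort_by { |mutation| mutation["line"] } }
end

worklist = triage(JSON.parse(report_json))
worklist.each do |file, mutations|
  puts file
  mutations.each do |mutation|
    puts "  line #{mutation['line']}  #{mutation['operator']}: #{mutation['suggestion']}"
  end
end
```

Grouping by file keeps the re-run in step 5 cheap, since each file only needs to be re-checked once after its tests are added.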
 
- ### 7. CI gate
+ ### 7. Diagnosing errored mutations
+
+ Entries in the JSON `errors[]` array represent mutations that raised an exception (syntax error, load failure, or runtime crash) rather than producing a test outcome. Each entry includes `error_class`, `error_message`, and the first 5 `error_backtrace` lines. Use these fields to decide whether the error is a bug in the mutation operator (file an issue), a load-time problem in the mutated source (often `NoMethodError: super called outside of method` or constant-redefinition issues), or a genuine crash that the original tests should have caught. Run with `--verbose` to stream the same error details to stderr during the run.
+
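
One way to start that diagnosis is to group `errors[]` by exception class, so that operator bugs (for example a cluster of `SyntaxError`) stand out from load-time problems. The sketch below assumes the report has already been saved as JSON; the sample entries and the `errors_by_class` helper are hypothetical, and the three `error_*` fields are treated as optional, as the schema states.

```ruby
require "json"

# Illustrative entries; field names follow the schema, values are made up.
report_json = <<~JSON
  {
    "errors": [
      { "file": "lib/a.rb", "line": 3, "operator": "zsuper_removal",
        "error_class": "NoMethodError", "error_message": "super called outside of method",
        "error_backtrace": ["lib/a.rb:3", "lib/a.rb:1"] },
      { "file": "lib/b.rb", "line": 9, "operator": "regex_simplification",
        "error_class": "SyntaxError", "error_message": "unexpected end-of-input" },
      { "file": "lib/c.rb", "line": 1, "operator": "index_to_at",
        "error_class": "NoMethodError", "error_message": "undefined method" }
    ]
  }
JSON

# Hypothetical helper: bucket errored mutations by exception class.
# The error_* fields are optional, so a missing class falls into "(unknown)".
def errors_by_class(report)
  report.fetch("errors", []).group_by { |entry| entry["error_class"] || "(unknown)" }
end

grouped = errors_by_class(JSON.parse(report_json))
grouped.each do |klass, entries|
  puts "#{klass} (#{entries.size})"
  entries.each do |entry|
    puts "  #{entry['file']}:#{entry['line']} #{entry['operator']}: #{entry['error_message']}"
    Array(entry["error_backtrace"]).each { |frame| puts "    #{frame}" }
  end
end
```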
+ ### 8. CI gate

  ```bash
  bundle exec evilution run lib/ --format json --min-score 0.8 --quiet
@@ -389,9 +404,9 @@ Tests 4 paths (InProcess isolation, Fork isolation, mutation generation + stripp
  1. **Parse** — Prism parses Ruby files into ASTs with exact byte offsets
  2. **Extract** — Methods are identified as mutation subjects
  3. **Filter** — Disable comments, Sorbet `sig` blocks, and AST ignore patterns exclude mutations before execution
- 4. **Mutate** — 69 operators produce text replacements at precise byte offsets (source-level surgery, no AST unparsing); heredoc literal text is skipped by default
+ 4. **Mutate** — 72 operators produce text replacements at precise byte offsets (source-level surgery, no AST unparsing); heredoc literal text is skipped by default
  5. **Isolate** — Mutations are applied to temporary file copies (never modifying originals); load-path redirection ensures `require` resolves the mutated copy. Default isolation is in-process; `--isolation fork` uses forked child processes. Parallel mode (`--jobs N`) always uses in-process isolation inside pool workers to avoid double forking
- 6. **Test** — RSpec executes against the mutated source
+ 6. **Test** — The configured test framework (RSpec or Minitest) executes against the mutated source
  7. **Collect** — Source strings and AST nodes are released after use to minimize memory retention
  8. **Report** — Results aggregated into text, JSON, or HTML, including efficiency metrics and peak memory usage

@@ -0,0 +1,35 @@
+ # Head-to-Head Mutation Comparison Baseline
+
+ Date: 2026-04-09
+ Evilution: v0.21.0
+ Reference tool: mutant 0.16.0 (via mutant_json_adapter)
+ Target project: private Rails app (Ruby 4.0.1)
+
+ ## Results
+
+ | File | Evilution | Reference | Ratio |
+ |------|-----------|-----------|-------|
+ | app/controllers/admin/news_controller.rb | 363 | 524 | 1.44x |
+ | app/models/player_claim.rb | 237 | 281 | 1.19x |
+ | app/services/telegram/news_scorer.rb | 353 | 496 | 1.41x |
+ | app/validators/password_strength_validator.rb | 157 | 179 | 1.14x |
+ | app/policies/application_policy.rb | 79 | 86 | 1.09x |
+ | app/services/telegram/entities_formatter.rb | 454 | 647 | 1.43x |
+ | app/services/autosave_game_protocol_service.rb | 315 | 418 | 1.33x |
+ | lib/scraper/game_scraper.rb | 449 | 625 | 1.39x |
+ | app/jobs/process_telegram_webhook_job.rb | 196 | 255 | 1.30x |
+ | lib/telegram/export_parser.rb | 300 | 390 | 1.30x |
+ | **TOTAL** | **2,903** | **3,901** | **1.34x** |
+
+ ## Summary
+
+ - Density ratio: **1.34x** (target: < 1.5x) — **PASS**
+ - Highest gap: news_controller.rb (1.44x), entities_formatter.rb (1.43x)
+ - Lowest gap: application_policy.rb (1.09x)
+
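
The Ratio column is consistent with reference count divided by evilution count, rounded to two decimals. The sketch below recomputes it from the counts in the table; the `ratio` helper and the `COUNTS` constant are illustrative names, not part of either tool.

```ruby
# [evilution, reference] mutation counts per file, copied from the Results table.
COUNTS = {
  "app/controllers/admin/news_controller.rb"       => [363, 524],
  "app/models/player_claim.rb"                     => [237, 281],
  "app/services/telegram/news_scorer.rb"           => [353, 496],
  "app/validators/password_strength_validator.rb"  => [157, 179],
  "app/policies/application_policy.rb"             => [79, 86],
  "app/services/telegram/entities_formatter.rb"    => [454, 647],
  "app/services/autosave_game_protocol_service.rb" => [315, 418],
  "lib/scraper/game_scraper.rb"                    => [449, 625],
  "app/jobs/process_telegram_webhook_job.rb"       => [196, 255],
  "lib/telegram/export_parser.rb"                  => [300, 390]
}.freeze

# Assumed definition: reference mutations per evilution mutation, two decimals.
def ratio(evilution, reference)
  (reference.to_f / evilution).round(2)
end

COUNTS.each do |file, (evilution, reference)|
  puts format("%-50s %.2fx", file, ratio(evilution, reference))
end

total_evilution = COUNTS.values.sum(&:first)
total_reference = COUNTS.values.sum(&:last)
puts format("%-50s %.2fx", "TOTAL", ratio(total_evilution, total_reference))
```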
+ ## Notes
+
+ - Reference tool cannot parse Ruby 4.0 files natively; used mutant_json_adapter
+ (Prism-based method extraction + mutant util mutation -e per method)
+ - This means reference tool mutations lack class-level context (constants,
+ inheritance, instance variables) — actual gap may be slightly different
@@ -0,0 +1,79 @@
+ # Reference Tool Mutation Classification by Operator Category
+
+ Date: 2026-04-09
+ Corpus: 10 files from a private Rails app (2,903 evilution / 3,901 reference mutations)
+
+ ## Category Breakdown
+
+ Classification of all 3,901 reference tool mutations by semantic category,
+ with signal/noise assessment and evilution coverage status.
+
+ | Category | Count | % | Signal | Evilution Coverage |
+ |----------|------:|--:|--------|-------------------|
+ | complex_mutation | 854 | 21.9% | Mixed | Partial — multi-statement changes, compound mutations spanning multiple lines |
+ | argument_nil | 585 | 15.0% | Signal | Covered — ArgumentNilSubstitution, ArgumentRemoval |
+ | guard_clause_restructure | 570 | 14.6% | Noise | Not applicable — reformats `return if X` to `unless X; return; end` without semantic change |
+ | receiver_mutation | 416 | 10.7% | Signal | Covered — ReceiverReplacement, MethodCallRemoval |
+ | receiver_self_swap | 281 | 7.2% | Mixed | Partial — ReceiverReplacement covers some; `self.` insertion is often equivalent |
+ | string_mutation | 224 | 5.7% | Signal | Covered — StringLiteral, StringInterpolation |
+ | arithmetic_mutation | 224 | 5.7% | Signal | Covered — ArithmeticReplacement |
+ | symbol_mutation | 129 | 3.3% | Signal | Covered — SymbolLiteral (mutant uses `__mutant__` suffix) |
+ | hash_mutation | 119 | 3.1% | Mixed | Partial — HashLiteral covers structure; key reordering is noise |
+ | method_body_removal | 94 | 2.4% | Signal | Covered — MethodBodyReplacement (empty body variant) |
+ | method_body_raise | 79 | 2.0% | Signal | Covered — MethodBodyReplacement (raise variant) |
+ | method_body_super | 79 | 2.0% | Signal | Covered — MethodBodyReplacement (super variant) |
+ | method_substitution_at | 60 | 1.5% | Mixed | Not covered — `[]` → `.at()` catches missing bounds checks |
+ | method_substitution_fetch | 58 | 1.5% | Signal | Covered — IndexToFetch |
+ | condition_nil_false | 43 | 1.1% | Signal | Covered — ConditionalBranch, NilReplacement |
+ | method_body_nil | 27 | 0.7% | Signal | Covered — MethodBodyReplacement (nil variant) |
+ | regex_mutation | 27 | 0.7% | Signal | Covered — RegexpMutation, RegexCapture |
+ | boolean_literal | 16 | 0.4% | Signal | Covered — BooleanLiteralReplacement |
+ | equality_mutation | 10 | 0.3% | Signal | Covered — EqualityToIdentity |
+ | block_pass_mutation | 5 | 0.1% | Signal | Partial — BlockRemoval covers block removal; `&:method` removal not specific |
+ | integer_boundary | 1 | 0.0% | Signal | Covered — IntegerLiteral |
+
+ ## Signal vs Noise Summary
+
+ | Assessment | Count | % |
+ |------------|------:|--:|
+ | Signal (catches real bugs) | 2,176 | 55.8% |
+ | Mixed (some signal, some equivalent) | 1,155 | 29.6% |
+ | Noise (equivalent or reformatting) | 570 | 14.6% |
+
+ ## Key Findings
+
+ ### 1. Guard clause restructuring is pure noise (14.6%)
+
+ The reference tool rewrites `return X if condition` to `unless condition; return X; end`.
+ This is a syntactic reformatting, not a semantic mutation. It inflates the mutation count
+ without testing anything. Evilution correctly does not produce these.
+
+ ### 2. Most categories are already covered by evilution
+
+ Of 21 categories, evilution has operators covering 15 fully and 4 partially.
+ Only 2 categories are not covered:
+ - **guard_clause_restructure** — noise, should not be added
+ - **method_substitution_at** — `[]` → `.at()`, marginal signal
+
+ ### 3. The "complex_mutation" bucket needs further analysis
+
+ 854 mutations (21.9%) are multi-statement compound changes that don't fit a single
+ category. Many combine receiver replacement + argument modification + formatting
+ changes in one diff. Some contain real signal (e.g., removing a hash key from a
+ method call), others are largely equivalent reformattings.
+
+ ### 4. The 1.34x gap is largely explained by:
+
+ - Guard clause restructuring: 570 mutations (noise)
+ - Receiver self-swap equivalents: ~140 mutations (noise portion of 281)
+ - Complex compound mutations: ~288 mutations (noise portion of 854)
+
+ **Removing noise**, the effective gap drops to approximately **1.0-1.1x** —
+ near parity for signal-bearing mutations.
+
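
The arithmetic behind the noise-adjusted figure can be sketched as follows. It assumes the effective gap is computed as (reference total minus estimated noise) divided by the evilution total; that formula is an assumption, and two of the three noise counts are approximations in the list above.

```ruby
EVILUTION_TOTAL = 2_903
REFERENCE_TOTAL = 3_901

# Noise estimates from the list above; the last two are approximate.
NOISE = {
  guard_clause_restructure: 570,
  receiver_self_swap: 140,
  complex_mutation: 288
}.freeze

noise_total      = NOISE.values.sum
signal_reference = REFERENCE_TOTAL - noise_total

raw_ratio       = (REFERENCE_TOTAL.to_f / EVILUTION_TOTAL).round(2)
effective_ratio = (signal_reference.to_f / EVILUTION_TOTAL).round(2)

puts "noise mutations:   #{noise_total}"
puts "raw ratio:         #{raw_ratio}x"
puts "noise-adjusted:    #{effective_ratio}x"
```

With these particular estimates the adjusted reference count comes out equal to the evilution total, which sits at the low end of the stated 1.0-1.1x range; smaller noise estimates would move it toward the upper end.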
+ ## Recommendations for EV-251 (Prioritization)
+
+ 1. **Do not add** guard clause restructuring — pure noise
+ 2. **Consider adding** `[]` → `.at()` substitution (60 mutations, real signal for bounds checking)
+ 3. **Investigate** the complex_mutation bucket further to extract any discrete operator patterns
+ 4. **Current density target (< 1.5x) is already met** at 1.34x overall
@@ -0,0 +1,68 @@
+ # Operator Addition Prioritization
+
+ Date: 2026-04-09
+ Based on: EV-250 classification of 3,901 reference tool mutations across 10 files
+
+ ## Current Status
+
+ - Density ratio: **1.34x** (target: < 1.5x) — **already passing**
+ - Evilution: 68 operators covering 15/21 reference categories fully, 4 partially
+ - Effective signal gap after removing noise: ~1.0-1.1x (near parity)
+
+ ## Prioritized Operator Additions
+
+ Ranked by: (a) signal frequency, (b) implementation complexity, (c) equivalent mutant rate.
+
+ ### Priority 1: High signal, low complexity
+
+ | # | Operator | Signal Count | Complexity | Equiv. Rate | Notes |
+ |---|----------|-------------|-----------|-------------|-------|
+ | 1 | `[]` → `.at()` substitution | 60 | Low | Low (~10%) | Catches unchecked array/hash access. Single AST node transform. New operator needed. |
+
+ **Rationale:** Only uncovered category with real signal. `.at()` returns nil
+ instead of raising on out-of-bounds, exposing missing bounds checks. Simple to
+ implement — match `CallNode` with name `[]` on collection receivers, emit `.at()`
+ variant.
+
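
For reference when writing tests against this mutant, the sketch below compares `[]` and `.at()` using core Ruby only. It is not evilution's implementation, and the sample values are arbitrary; it just records where the two calls agree and where they diverge.

```ruby
arr = [10, 20, 30]

# Single integer index: both forms agree, in bounds and out of bounds.
same_in_bounds     = arr[1] == arr.at(1)
same_out_of_bounds = arr[5].nil? && arr.at(5).nil?

# Range index: `[]` slices, while `at` only accepts an integer.
slice = arr[0..1]
range_error =
  begin
    arr.at(0..1)
    nil
  rescue TypeError => e
    e.class
  end

# Hash receiver: `[]` looks up a key, but core Hash does not define `at`.
hash = { a: 1 }
hash_error =
  begin
    hash.at(:a)
    nil
  rescue NoMethodError => e
    e.class
  end

puts "integer index agrees: #{same_in_bounds && same_out_of_bounds}"
puts "range slice: #{slice.inspect}, at(range) raises #{range_error}"
puts "hash[:a] = #{hash[:a]}, hash.at(:a) raises #{hash_error}"
```

In practice this means a test that indexes with a range, or that passes a hash where the code uses `[]`, distinguishes the mutant, whereas a test that only reads a single integer position from an array does not.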
+ ### Priority 2: Improve existing coverage (partial gaps)
+
+ | # | Operator | Gap Area | Complexity | Equiv. Rate | Notes |
+ |---|----------|----------|-----------|-------------|-------|
+ | 2 | Regex simplification (EV-230, #514) | 27 | Medium | Low (~15%) | Quantifier removal, anchor removal, character class simplification. Already scoped. |
+ | 3 | Block pass removal (`&...`) | 5 | Low | Medium (~30%) | Remove `&` block pass arguments (`&:symbol`, `&method(:name)`, etc). Marginal count but trivial to add. |
+
+ **Rationale:** EV-230 is already scoped with a GH issue. Block pass removal is
+ minimal effort for minimal gain — include only if doing a batch of small operators.
+
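
To make the two Priority 2 mutations concrete, the sketch below applies them by hand to small inputs, using the README's example patterns (`/\d+/` -> `/\d/`, `/[a-z]/` -> `/[az]/`, `items.map(&:to_s)` -> `items.map`). The inputs are arbitrary, and the code only illustrates which assertions would tell original and mutant apart.

```ruby
text = "order 123"

# Quantifier removal: the extracted match changes, the boolean match does not.
full_match   = text[/\d+/]   # original pattern
short_match  = text[/\d/]    # mutated pattern
same_boolean = text.match?(/\d+/) == text.match?(/\d/)

# Character class simplification: the range a-z becomes the two literals a and z.
range_hit   = "m".match?(/[a-z]/)
literal_hit = "m".match?(/[az]/)

# Block pass removal: without the block, `map` returns an Enumerator.
items         = [1, 2, 3]
with_block    = items.map(&:to_s)
without_block = items.map

puts "match: #{full_match.inspect} vs #{short_match.inspect} (boolean equal: #{same_boolean})"
puts "[a-z] matches 'm': #{range_hit}, [az] matches 'm': #{literal_hit}"
puts "map(&:to_s): #{with_block.inspect}, map: #{without_block.class}"
```

A test that only checks whether the pattern matches would let the quantifier mutant survive, which is one reason such mutants carry signal: they point at assertions that never inspect the captured value.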
+ ### Priority 3: Do not implement
+
+ | # | Category | Count | Reason |
+ |---|----------|------:|--------|
+ | — | Guard clause restructuring | 570 | Pure noise — syntactic reformatting, not semantic mutation |
+ | — | Receiver self-swap (remaining) | ~140 | Mostly equivalent — `self.method` vs `method` rarely matters |
+ | — | Complex compound mutations | ~288 | Noise portion of multi-statement changes; not decomposable into discrete operators |
+
+ ## Implementation Order
+
+ 1. **EV-230** (#514) — Regex simplification operators (already scoped, medium complexity, 27 signal mutations)
+ 2. **New: `IndexToAt`** — `[]` → `.at()` substitution (60 signal mutations, low complexity)
+ 3. **New: `BlockPassRemoval`** — `&:method` removal (5 signal mutations, trivial)
+
+ ## Impact Assessment
+
+ | Scenario | Estimated Ratio | Delta |
+ |----------|----------------|-------|
+ | Current | 1.34x | — |
+ | After adding all Priority 1+2 | ~1.31x | -0.03x |
+
+ The density gap is already within target. These additions improve **signal
+ coverage** (catching real bugs that reference tool catches and evilution misses)
+ rather than closing the headline ratio, which is already healthy.
+
+ ## Recommendation
+
+ The density gap research (EV-238) can be considered **successful** — the 1.5x
+ target is met at 1.34x. Remaining work should focus on signal quality (regex
+ mutations, bounds checking) rather than chasing the ratio lower. The reference
+ tool's ~15% noise inflation means its raw count is not a meaningful target for
+ exact parity.