evilution 0.21.0 → 0.22.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45)
  1. checksums.yaml +4 -4
  2. data/.beads/.gitignore +4 -0
  3. data/.beads/interactions.jsonl +12 -0
  4. data/.beads/issues.jsonl +9 -6
  5. data/CHANGELOG.md +17 -0
  6. data/README.md +14 -10
  7. data/comparison_results/baseline_2026-04-09.md +35 -0
  8. data/comparison_results/operator_classification.md +79 -0
  9. data/comparison_results/operator_prioritization.md +68 -0
  10. data/docs/mutation_density_benchmark.md +91 -0
  11. data/lib/evilution/ast/parser.rb +2 -1
  12. data/lib/evilution/baseline.rb +14 -11
  13. data/lib/evilution/cli.rb +2 -1
  14. data/lib/evilution/config.rb +15 -3
  15. data/lib/evilution/disable_comment.rb +2 -1
  16. data/lib/evilution/integration/base.rb +98 -1
  17. data/lib/evilution/integration/minitest.rb +145 -0
  18. data/lib/evilution/integration/minitest_crash_detector.rb +55 -0
  19. data/lib/evilution/integration/rspec.rb +33 -100
  20. data/lib/evilution/mcp/mutate_tool.rb +6 -6
  21. data/lib/evilution/mutator/base.rb +4 -0
  22. data/lib/evilution/mutator/operator/bitwise_complement.rb +1 -1
  23. data/lib/evilution/mutator/operator/block_pass_removal.rb +30 -0
  24. data/lib/evilution/mutator/operator/ensure_removal.rb +1 -1
  25. data/lib/evilution/mutator/operator/index_to_at.rb +30 -0
  26. data/lib/evilution/mutator/operator/index_to_dig.rb +2 -2
  27. data/lib/evilution/mutator/operator/index_to_fetch.rb +2 -2
  28. data/lib/evilution/mutator/operator/keyword_argument.rb +1 -1
  29. data/lib/evilution/mutator/operator/regex_simplification.rb +169 -0
  30. data/lib/evilution/mutator/operator/rescue_body_replacement.rb +1 -1
  31. data/lib/evilution/mutator/operator/rescue_removal.rb +1 -1
  32. data/lib/evilution/mutator/registry.rb +3 -0
  33. data/lib/evilution/reporter/html.rb +2 -2
  34. data/lib/evilution/reporter/json.rb +2 -2
  35. data/lib/evilution/reporter/suggestion.rb +659 -2
  36. data/lib/evilution/runner.rb +31 -12
  37. data/lib/evilution/spec_resolver.rb +24 -16
  38. data/lib/evilution/version.rb +1 -1
  39. data/lib/evilution.rb +4 -0
  40. data/scripts/benchmark_density +261 -0
  41. data/scripts/benchmark_density.yml +19 -0
  42. data/scripts/compare_mutations +404 -0
  43. data/scripts/compare_mutations.yml +24 -0
  44. data/scripts/mutant_json_adapter +224 -0
  45. metadata +16 -2
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: c8f4aa7633e70e4a54aded76fcdfeb152cb4e4ad76d587b5aa0c93bda96246e3
- data.tar.gz: b8e65e5d0837b6873c31e6cae9621160a2a6fe75b3949d08a39f91b3df7db60b
+ metadata.gz: cc78bc7bc68c4d25a6b260b62a83a304d617905187e0b06ca6d0bc050be86403
+ data.tar.gz: 0a98e06dfc6ee9c4f0830b5a85346a95bf8de07d6aa9f3c49326077f7e2a8d44
  SHA512:
- metadata.gz: 923d8fa302a830d1b070e27b2494c5ec3227f6c6188b26f1701a0344e54ac230626a75bd4f0ec70d6e4f10af04e5e62d04b05d93c067ee160ce3c653e86faaf6
- data.tar.gz: f63389c729c4d121cb24a38a2bd9bd4d386707f81c16dbfd106868e952d72200278cd84fc0936ad60e666e53b5c51efce7d40a615c093d5fc327e1e910a39fd4
+ metadata.gz: a8d67fc09591e0bee7395498d41c347aa907bd061831ef2ff83765bb0f143f577b8070b217193ffcfe23bff2a221b85f23b644cb15f3732f9aaf0069c6e369b1
+ data.tar.gz: c13839cac96e37075a5f2fa4cd82d2d2ca8116ef058d3ad8cbca6b4716bb0239d4e60553100d7c1d36fb05205deb0415b5f91a855319cd55a979fcad49736ec4
data/.beads/.gitignore CHANGED
@@ -42,8 +42,12 @@ export-state/
 
  # Dolt database (managed by Dolt remotes, not git)
  dolt/
+ embeddeddolt/
  dolt-access.lock
 
+ # Local backup data
+ backup/
+
  # NOTE: Do NOT add negation patterns (e.g., !issues.jsonl) here.
  # They would override fork protection in .git/info/exclude, allowing
  # contributors to accidentally commit upstream issue databases.
data/.beads/interactions.jsonl ADDED
@@ -0,0 +1,12 @@
+ {"id":"int-4aac1fda","kind":"field_change","created_at":"2026-04-09T05:02:19.518389279Z","actor":"Denis Kiselev","issue_id":"EV-230","extra":{"field":"status","new_value":"in_progress","old_value":"open"}}
+ {"id":"int-ff9e26c4","kind":"field_change","created_at":"2026-04-09T05:38:57.721818299Z","actor":"Denis Kiselev","issue_id":"EV-230","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Merged PR #622. RegexSimplification operator with quantifier removal, anchor removal, and character class range removal."}}
+ {"id":"int-c318b162","kind":"field_change","created_at":"2026-04-09T06:16:18.67347839Z","actor":"Denis Kiselev","issue_id":"EV-78","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-e2f93b8b","kind":"field_change","created_at":"2026-04-09T06:33:11.389955517Z","actor":"Denis Kiselev","issue_id":"EV-82","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-275fe056","kind":"field_change","created_at":"2026-04-09T06:49:54.009377447Z","actor":"Denis Kiselev","issue_id":"EV-80","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-f1854a23","kind":"field_change","created_at":"2026-04-09T07:52:38.321185191Z","actor":"Denis Kiselev","issue_id":"EV-79","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-d2e4b659","kind":"field_change","created_at":"2026-04-09T08:28:47.700848334Z","actor":"Denis Kiselev","issue_id":"EV-81","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-f2a8e3fa","kind":"field_change","created_at":"2026-04-09T11:39:20.111111433Z","actor":"Denis Kiselev","issue_id":"EV-83","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-41c232c2","kind":"field_change","created_at":"2026-04-09T12:51:12.16692591Z","actor":"Denis Kiselev","issue_id":"EV-84","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-0f073191","kind":"field_change","created_at":"2026-04-09T13:03:11.468115004Z","actor":"Denis Kiselev","issue_id":"EV-85","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
+ {"id":"int-91a9616c","kind":"field_change","created_at":"2026-04-09T13:04:05.459458165Z","actor":"Denis Kiselev","issue_id":"EV-69","extra":{"field":"status","new_value":"closed","old_value":"open","reason":"Closed"}}
+ {"id":"int-b4fe2b7b","kind":"field_change","created_at":"2026-04-09T13:42:59.996305852Z","actor":"Denis Kiselev","issue_id":"EV-277","extra":{"field":"status","new_value":"closed","old_value":"in_progress","reason":"Closed"}}
data/.beads/issues.jsonl CHANGED
@@ -170,7 +170,7 @@
  {"id":"EV-228","title":"Equivalent detection: .count → .length as always-equivalent","description":".count → .length mutations are universally unkillable (both return identical integer results). Evilution should detect this pattern and classify as equivalent automatically. Reported in feedback for HomeController stats block.","notes":"GH #509","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:43.587875972+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T08:52:50.669215278+07:00","closed_at":"2026-04-08T08:52:50.669215278+07:00","close_reason":"Closed"}
  {"id":"EV-229","title":"Equivalent detection: .each → .map in void context","description":"When .each is called in void context (return value not assigned or passed), replacing with .map or .reverse_each produces equivalent behavior. Evilution should detect void-context method calls and mark these swaps as likely-equivalent. Reported for Avo reset_password action.","notes":"GH #511","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:46.8330208+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T09:31:47.874981801+07:00","closed_at":"2026-04-08T09:31:47.874981801+07:00","close_reason":"Closed"}
  {"id":"EV-23","title":"Per-mutation spec targeting","description":"Instead of running the full spec suite for every mutation, map each mutated source file to its relevant spec file(s) using convention-based resolution (e.g. lib/foo/bar.rb -> spec/foo/bar_spec.rb) and only run those. This dramatically reduces per-mutation test time. Depends on convention-based spec file resolution being implemented first.","status":"closed","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-10T06:17:28.98620973+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-16T14:49:13.616876819+07:00","closed_at":"2026-03-16T14:49:13.616876819+07:00","close_reason":"Fixed and merged","dependencies":[{"issue_id":"EV-23","depends_on_id":"EV-34","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
- {"id":"EV-230","title":"Regex simplification operators","description":"Add mutation operators for regex patterns: /\\s+/ → /\\s/ (remove quantifier), remove anchors (^, $, \\A, \\z), simplify character classes. Mutant's regex mutations caught real test gaps (case sensitivity, whitespace handling) that evilution missed. Reported in 2 sessions on Telegram::NewsScorer.","notes":"GH #514","status":"open","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:48.930372762+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:22:57.239710634+07:00","dependencies":[{"issue_id":"EV-230","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
+ {"id":"EV-230","title":"Regex simplification operators","description":"Add mutation operators for regex patterns: /\\s+/ → /\\s/ (remove quantifier), remove anchors (^, $, \\A, \\z), simplify character classes. Mutant's regex mutations caught real test gaps (case sensitivity, whitespace handling) that evilution missed. Reported in 2 sessions on Telegram::NewsScorer.","notes":"GH #514","status":"open","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:48.930372762+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:22:57.239710634+07:00"}
  {"id":"EV-231","title":".downcase removal operator","description":"Add mutation that removes .downcase calls. This caught real case-sensitivity test gaps in mutant that evilution missed. Useful for testing that code handles mixed-case input. Reported in 2 sessions on NewsScorer.","notes":"GH #516","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:50.682791593+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T09:42:19.793202916+07:00","closed_at":"2026-04-08T09:42:19.793202916+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-231","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-232","title":"Method chain permutation operators (strip → lstrip/rstrip)","description":"Add mutations that replace string cleaning methods with partial variants: strip → lstrip/rstrip, chomp → chop, etc. Mutant's chain permutations caught a real test gap in NewsScorer keyword processing. Reported in 1 session.","notes":"GH #518","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:52.982750285+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T11:15:28.398677068+07:00","closed_at":"2026-04-08T11:15:28.398677068+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-232","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-233","title":"Related specs heuristic (run association specs when .includes() mutated)","description":"When mutations remove .includes() eager loading calls, the matching unit spec may not catch N+1 regressions. Consider a heuristic that also runs specs for the included associations or integration specs. Would complement the spec auto-detection feature. Reported in 1 session for NewsController.","notes":"GH #519","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:21:56.302076763+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T12:16:09.453069408+07:00","closed_at":"2026-04-08T12:16:09.453069408+07:00","close_reason":"Closed"}
@@ -178,22 +178,22 @@
  {"id":"EV-235","title":"Bug: Non-deterministic mutation count on same file","description":"Running evilution twice on the same file (HelpController, 9 LOC) produced different mutation counts (18 vs 15). Unclear cause — possibly non-deterministic operator selection or file state difference. Low priority but should be investigated. Reported once in v0.12.0.","notes":"GitHub: #512","status":"closed","priority":4,"issue_type":"bug","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:01.930671333+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T13:20:37.135440498+07:00","closed_at":"2026-04-04T13:20:37.135440498+07:00","close_reason":"Not a bug. The 18 vs 15 difference was exactly the 3 timed-out mutations — counted in run 1 total but excluded in run 2 summary. Mutation generation is fully deterministic (no rand/shuffle/sample in codepath). Reported once in v0.12.0, never reproduced. Reporting has been significantly improved since then."}
  {"id":"EV-236","title":"--spec-dir flag for directory-level spec inclusion","description":"Add a --spec-dir flag that auto-includes all specs in a directory, reducing the chance of missing coverage from adjacent spec files. Useful when a controller has tests split across spec/requests/, spec/controllers/, and spec/features/. Reported once.","notes":"GitHub: #513","status":"closed","priority":4,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:04.618160285+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-07T09:55:50.99913701+07:00","closed_at":"2026-04-07T09:55:50.99913701+07:00","close_reason":"--spec-dir CLI flag implemented, composes with --spec, validates directory existence. 3 unit tests passing. Merged via GH #513.","dependencies":[{"issue_id":"EV-236","depends_on_id":"EV-227","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-237","title":"Temp-file based mutation (don't modify original source)","description":"Evilution mutates source files in-place on the filesystem, which triggers file watchers, linters, and IDE notifications during runs. Even with the ensure-based restore (fixed earlier), race conditions exist if the process is killed. Write mutated source to a tempfile and point the test runner at it via load path manipulation. Never modify the original source file. Reported in 2 sessions (v0.16.1).","notes":"GH issue: #520 (https://github.com/marinazzio/evilution/issues/520)","status":"closed","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:06.770551806+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:52:59.896531797+07:00","closed_at":"2026-04-08T15:52:59.896531797+07:00","close_reason":"Closed","external_ref":"gh-520","labels":["reliability"]}
- {"id":"EV-238","title":"Epic: Research mutation density gap with mutant","description":"Evilution consistently generates 1.8-2.6x fewer mutations than mutant across 25 feedback sessions. While some of mutant's extras are equivalent ([]→fetch, to_i→Integer()), many catch real edge cases. This epic covers: (1) Audit mutant's operator list systematically against evilution's 54 operators, (2) Identify which missing operators catch real bugs vs produce noise, (3) Prioritize operator additions by signal-to-noise ratio, (4) Target closing the gap to <1.5x. Related: existing gap analysis created 15 new operator issues (EV-214 through EV-224, #491-#505).","notes":"GitHub: #515","status":"open","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:11.095318733+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:22:55.790159971+07:00"}
+ {"id":"EV-238","title":"Epic: Research mutation density gap with mutant","description":"Evilution consistently generates 1.8-2.6x fewer mutations than mutant across 25 feedback sessions. While some of mutant's extras are equivalent ([]→fetch, to_i→Integer()), many catch real edge cases. This epic covers: (1) Audit mutant's operator list systematically against evilution's 54 operators, (2) Identify which missing operators catch real bugs vs produce noise, (3) Prioritize operator additions by signal-to-noise ratio, (4) Target closing the gap to <1.5x. Related: existing gap analysis created 15 new operator issues (EV-214 through EV-224, #491-#505).","notes":"GitHub: #515","status":"closed","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:11.095318733+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T10:41:04.310878722+07:00","closed_at":"2026-04-09T10:41:04.310878722+07:00","close_reason":"Closed"}
  {"id":"EV-239","title":"Epic: Research and fix high memory baseline","description":"Evilution's memory baseline is 718+ MB even for tiny files in v0.18.0 sessions, and grows across consecutive runs in the same session (718→763→795→800 MB). Previous fixes addressed AST node retention and StringIO leaks but baseline remains high. Mutant peaks at ~200 MB for comparable workloads. This epic covers: (1) Profile memory allocation during boot/setup phase, (2) Identify what's consuming the 718 MB baseline, (3) Investigate cross-run memory growth (session-level leak?), (4) Target bringing baseline under 300 MB. Related: existing rake memory:check infrastructure exists.","notes":"GitHub: #517","status":"closed","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:16.602817355+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T00:16:38.665740136+07:00","closed_at":"2026-04-06T00:16:38.665740136+07:00","close_reason":"Premise invalid: 718 MB baseline was the MCP host process, not evilution. Standalone evilution baseline is ~30 MB (confirmed via EV-242/245/246 profiling). No memory fix needed. Sub-issues resolved: EV-242 (30 MB boot baseline), EV-243 (single mutation profiled), EV-245 (no cross-run growth), EV-246 (fork has zero parent-side cost), EV-247 (RSS tracking added)."}
  {"id":"EV-24","title":"Epic: JSON Output Improvements","description":"Make JSON output fully machine-parseable in all scenarios, including errors. Add diagnostic fields that help agents debug failures.","status":"closed","priority":1,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-10T06:17:37.450686472+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-16T11:15:24.900944562+07:00","closed_at":"2026-03-16T11:15:24.900944562+07:00","close_reason":"All children complete: structured errors, test_command in JSON, noise suppression","dependencies":[{"issue_id":"EV-24","depends_on_id":"EV-25","type":"blocks","created_at":"0001-01-01T00:00:00Z"},{"issue_id":"EV-24","depends_on_id":"EV-26","type":"blocks","created_at":"0001-01-01T00:00:00Z"},{"issue_id":"EV-24","depends_on_id":"EV-40","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-240","title":"Neutral mutation category (test errors vs test failures)","description":"Add a 'neutral' classification for mutations where tests error (crash/exception) rather than fail (assertion). This helps users distinguish test infrastructure problems from real coverage gaps. Mutant has this distinction and it's valuable — e.g., killfork 403 errors in vanilla-mafia are infrastructure noise, not coverage gaps. Currently evilution has neutral detection implemented but feedback suggests it doesn't always classify correctly. Reported in 3 sessions.","notes":"GH issue: #521 (https://github.com/marinazzio/evilution/issues/521)","status":"in_progress","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:18.548454128+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T12:21:16.372929584+07:00","external_ref":"gh-521","labels":["reliability"]}
  {"id":"EV-241","title":"Heredoc-aware string literal mutations","description":"Evilution's string_literal operator generates many false survived mutants on heredoc templates by mutating whitespace and literal text around interpolations. MigrationGenerator scored 63.1% due to template whitespace mutations that don't affect generated code behavior. Should either skip literal text in heredocs, only mutate interpolated expressions, or add a --skip-heredoc-literals flag. Reported in 1 session but caused significant score distortion.","notes":"GH issue: #522 (https://github.com/marinazzio/evilution/issues/522)","status":"open","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:22:20.534650035+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:23:19.83846784+07:00","external_ref":"gh-522","labels":["reliability","false-positives"]}
  {"id":"EV-242","title":"Profile memory allocation during boot/setup phase","description":"Use memory_profiler gem or ObjectSpace to identify what objects are allocated during Evilution boot before any mutations run. Identify the top 10 memory consumers. This will help understand where the 718 MB baseline comes from.","notes":"GitHub: #527","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:39.583808805+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T23:36:18.3122267+07:00","closed_at":"2026-04-05T23:36:18.3122267+07:00","close_reason":"Boot footprint is ~30 MB. The 718 MB baseline was MCP host process, not evilution.","dependencies":[{"issue_id":"EV-242","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-243","title":"Profile memory allocation during single mutation cycle","description":"Measure memory before and after a single mutation+test cycle. Identify what is allocated and not released within one cycle. Compare with mutant's behavior for the same file.","notes":"GitHub: #528","status":"in_progress","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:46.085727121+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T23:37:08.936438689+07:00","dependencies":[{"issue_id":"EV-243","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
- {"id":"EV-244","title":"Run head-to-head comparison on 10 diverse files, catalog every mutation mutant generates that evilution doesn't","description":"Pick 10 files of varying complexity (controller, model, service, validator, lib, migration, Avo resource, helper, concern, formatter). Run both mutant and evilution on each file. Catalog every mutation mutant produces that evilution misses. Classify each as: real signal, likely equivalent, or noise.","notes":"GitHub: #523","status":"open","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:52.935963449+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T15:36:16.088018831+07:00","dependencies":[{"issue_id":"EV-244","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
+ {"id":"EV-244","title":"Run head-to-head comparison on 10 diverse files, catalog every mutation mutant generates that evilution doesn't","description":"Pick 10 files of varying complexity (controller, model, service, validator, lib, migration, Avo resource, helper, concern, formatter). Run both mutant and evilution on each file. Catalog every mutation mutant produces that evilution misses. Classify each as: real signal, likely equivalent, or noise.","notes":"GitHub: #523","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:52.935963449+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T00:06:40.444637431+07:00","closed_at":"2026-04-09T00:06:40.444637431+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-244","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-245","title":"Investigate cross-run session memory growth","description":"Run multiple evilution invocations via MCP in the same session. Measure RSS between invocations. Determine if the growth (718->763->795->800 MB) is in the MCP server process, worker pool, or accumulated state.","notes":"GitHub: #529","status":"in_progress","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:54.234490102+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T23:40:31.266000331+07:00","dependencies":[{"issue_id":"EV-245","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-246","title":"Investigate parent vs child process memory split","description":"Measure RSS of the parent (coordinator) process separately from forked child (worker) processes. Determine where the 718 MB baseline lives — is it the parent before forking, or do children inherit and grow independently?","notes":"GitHub: #531","status":"in_progress","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:28:59.300347033+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T23:45:11.030185257+07:00","dependencies":[{"issue_id":"EV-246","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-247","title":"Add RSS tracking per mutation to JSON output","description":"Include parent_rss_kb and child_rss_kb fields in each mutation result. child_rss_kb partially exists (seen in feedback log) — verify it is accurate and add parent_rss_kb tracking. This provides ongoing observability for memory usage.","notes":"GitHub: #532","status":"in_progress","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:08.569266807+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T23:49:30.25391422+07:00","dependencies":[{"issue_id":"EV-247","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-248","title":"Implement memory budget CI gate","description":"Add a CI check that runs rake memory:check and fails if peak RSS exceeds a threshold (e.g., 400 MB for a reference fixture). This prevents memory regressions from being merged.","notes":"GitHub: #533","status":"open","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:16.177682161+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T11:45:18.765568988+07:00","close_reason":"Growth-based leak detection in CI (EV-274/PR #571) is sufficient. Absolute peak RSS budget not needed — standalone baseline is ~30 MB (EV-239 premise was invalid). Per-mutation growth check catches regressions effectively.","dependencies":[{"issue_id":"EV-248","depends_on_id":"EV-239","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-249","title":"Audit current SourceSurgeon mutation-and-restore flow","description":"Document the current code path: where the file is read, mutated, written, and restored. Identify all callers and the ensure-based restore mechanism. Map the failure modes (SIGKILL, OOM, etc.).","notes":"## Audit Findings\n\n### Read Points\n1. **AST::Parser#call** (parser.rb:11) — File.read to parse with Prism\n2. **Mutator::Base#call** (base.rb:18) — File.read to store as original_source in Mutation\n3. **Integration::RSpec#apply_mutation** (rspec.rb:68) — reads before overwrite (direct-overwrite fallback only)\n4. **Isolation::Fork#restore_original_source** (fork.rb:48) — reads to verify if restore needed (defense-in-depth)\n\n### Mutation (In-Memory)\nSourceSurgeon.apply (source_surgeon.rb:6-10) is pure in-memory byte surgery. Never touches filesystem. Called from Mutator::Base#add_mutation.\n\n### Two Write Paths in Integration::RSpec#apply_mutation\n- **Path A (LOAD_PATH shadow, preferred):** Target under $LOAD_PATH → Dir.mktmpdir, write mutated source to mirrored subpath, prepend to $LOAD_PATH. Original file never touched.\n- **Path B (Direct overwrite, fallback):** Not under $LOAD_PATH → acquires exclusive flock, overwrites original file.\n\n### Restore — Two Layers\n- **Layer 1:** Integration::RSpec#restore_original via ensure in #call (rspec.rb:33-35). Path A: removes temp dir from $LOAD_PATH, purges $LOADED_FEATURES, deletes temp dir. Path B: writes @original_content back, releases flock.\n- **Layer 2:** Isolation::Fork#restore_original_source (fork.rb:47-53) — parent-process defense-in-depth. Only in sequential (Fork isolation) path. NOT in parallel path.\n\n### Execution Paths\n- **Sequential (jobs=1):** Runner → Isolation::Fork → fork child → Integration::RSpec#call → ensure restore (child) + ensure restore (parent)\n- **Parallel (jobs>1):** Runner → Parallel::Pool → WorkQueue forks workers → Isolation::InProcess → Integration::RSpec#call → ensure restore only. No parent-side defense-in-depth.\n\n### Failure Modes\n- Normal flow / exception: Safe on both paths\n- SIGKILL child (sequential): Safe (parent restores on direct-overwrite; file untouched on temp-dir)\n- **SIGKILL worker (parallel) + direct-overwrite: FILE CORRUPTED — no recovery**\n- **OOM parallel worker + direct-overwrite: FILE CORRUPTED**\n- **SIGINT/SIGTERM to parent + direct-overwrite: File may be corrupted (zero signal handlers in lib/)**\n- Disk full during restore: File stays corrupted\n\n### Key Findings\n- Zero trap/at_exit/Signal.trap calls in entire lib/ directory\n- Biggest risk: direct-overwrite fallback in parallel mode (no parent-side restore)\n- Epic EV-237 should eliminate direct-overwrite path entirely, making LOAD_PATH shadow the only path","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:17.689070042+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.183398494+07:00","closed_at":"2026-04-08T15:53:00.183398494+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-249","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
  {"id":"EV-25","title":"Structured error responses in JSON mode","description":"When --format json is used and exit code is 2 (error), output a JSON object with error details instead of unstructured stderr text. Schema: { \"error\": { \"type\": \"config_error|parse_error|runtime_error\", \"message\": \"...\", \"file\": \"...\" } }. Agents currently have to regex-parse stderr which is fragile.","status":"closed","priority":1,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-10T06:17:38.283715502+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-15T22:41:54.370789377+07:00","closed_at":"2026-03-15T22:41:54.370789377+07:00","close_reason":"Merged PR #74 — structured JSON error output in CLI"}
- {"id":"EV-250","title":"Classify mutant's extra mutations by operator category","description":"From the head-to-head data (EV-244), group mutant's extra mutations by category (e.g., receiver mutations, argument permutations, method name substitutions, literal boundary values). Count how many are signal vs noise per category. Produce a table of categories with signal/noise breakdown.","notes":"GitHub: #526","status":"open","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:21.320269648+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:29:32.613448913+07:00","dependencies":[{"issue_id":"EV-250","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
196
- {"id":"EV-251","title":"Prioritize operator additions by signal-to-noise ratio","description":"Rank the missing operator categories by: (a) frequency of real signal catches, (b) implementation complexity, (c) expected equivalent mutant rate. Produce a prioritized implementation order for new operators.","notes":"GitHub: #534","status":"open","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:38.370181376+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:29:53.283203728+07:00","dependencies":[{"issue_id":"EV-251","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
195
+ {"id":"EV-250","title":"Classify mutant's extra mutations by operator category","description":"From the head-to-head data (EV-244), group mutant's extra mutations by category (e.g., receiver mutations, argument permutations, method name substitutions, literal boundary values). Count how many are signal vs noise per category. Produce a table of categories with signal/noise breakdown.","notes":"GitHub: #526","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:21.320269648+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T09:11:20.047999158+07:00","closed_at":"2026-04-09T09:11:20.047999158+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-250","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
196
+ {"id":"EV-251","title":"Prioritize operator additions by signal-to-noise ratio","description":"Rank the missing operator categories by: (a) frequency of real signal catches, (b) implementation complexity, (c) expected equivalent mutant rate. Produce a prioritized implementation order for new operators.","notes":"GitHub: #534","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:38.370181376+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T10:41:04.207816996+07:00","closed_at":"2026-04-09T10:41:04.207816996+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-251","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
197
197
  {"id":"EV-252","title":"Reproduce and measure per-mutation RSS growth on reference fixture","description":"Create a reproducible benchmark: run evilution on a known fixture file, record RSS after each mutation. Confirm the ~3-8 MB/mutation growth rate. Baseline for measuring fix effectiveness.","notes":"GitHub: #539","status":"in_progress","priority":0,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:39.421709567+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T15:08:27.260195719+07:00","dependencies":[{"issue_id":"EV-252","depends_on_id":"EV-226","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
198
198
  {"id":"EV-253","title":"Profile object allocation delta per mutation cycle","description":"Use ObjectSpace.count_objects or memory_profiler to capture what new objects are allocated during one mutation cycle and not released. Identify the top retained object types.","notes":"GitHub: #540 — Profiling complete. Root cause: RSpec ExampleGroup subclass ivars create reference cycles preventing GC (+3380 slots/mutation). Secondary: World#@sources_by_path cache. Fix proven: clearing EG ivars + sources cache after Runner.run = 0 growth.","status":"closed","priority":0,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:41.259785614+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-05T15:50:12.517441649+07:00","closed_at":"2026-04-05T15:50:12.517441649+07:00","close_reason":"Profiling complete. Root cause identified: RSpec ExampleGroup reference cycles + World source cache. Findings documented on GH #540.","dependencies":[{"issue_id":"EV-253","depends_on_id":"EV-226","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
199
199
  {"id":"EV-254","title":"Design temp-file mutation architecture","description":"Design how the temp file is created, where it lives (tmpdir vs .evilution/tmp), how the test process is redirected to load it (Ruby $LOAD_PATH manipulation vs file-level bootsnap override vs ENV-based), and how Rails autoloader interacts with it.","notes":"## Architecture Design: Temp-File Mutation\n\n### Problem Statement\nTwo write paths exist in Integration::RSpec#apply_mutation:\n- **Path A (LOAD_PATH shadow):** Works when file is under $LOAD_PATH — writes to temp dir, prepends to $LOAD_PATH. Original file never touched. Already safe.\n- **Path B (Direct overwrite):** Fallback when file is NOT under $LOAD_PATH — overwrites original, restores via ensure. Vulnerable to SIGKILL/OOM (especially in parallel mode where no parent-side defense-in-depth exists).\n\n**Goal:** Eliminate Path B entirely. Never modify the original source file.\n\n---\n\n### Design Decisions\n\n#### 1. Temp file location: Dir.mktmpdir (system tmpdir)\n- Use `Dir.mktmpdir('evilution')` (same as current Path A), NOT .evilution/tmp\n- Rationale: system tmpdir is auto-cleaned on reboot; no risk of polluting the project directory; avoids .gitignore concerns; already proven in current Path A\n\n#### 2. Redirection mechanism: LOAD_PATH shadow + explicit load\n**For files under $LOAD_PATH** (most lib/ files): Keep current approach — mirror subpath in temp dir, prepend to $LOAD_PATH. This handles any `require` calls during the test run.\n\n**For files NOT under $LOAD_PATH** (the current fallback case): \n- Write mutated source to temp dir mirroring the relative path from project root\n- Explicitly `load` the temp file in the forked child to redefine the class/module\n- This replaces the direct-overwrite approach entirely\n- The `load` approach works because it always executes the file (unlike `require` which checks $LOADED_FEATURES)\n\n#### 3. 
SourceSurgeon: No changes needed\nSourceSurgeon.apply is already pure in-memory byte surgery. It returns a mutated string without touching the filesystem. No changes required for this epic.\n\n#### 4. Where file I/O moves\n**Integration::RSpec#apply_mutation** remains the owner of temp-file writes, but the two-path logic changes:\n- Path A (under $LOAD_PATH): unchanged — temp dir + $LOAD_PATH prepend\n- Path B (not under $LOAD_PATH): temp dir + explicit `load` (replaces direct overwrite)\n- Both paths use temp files. Original is never touched.\n\n#### 5. Restore/cleanup strategy (three layers)\n1. **ensure in Integration::RSpec#call** (existing): Remove temp dir from $LOAD_PATH, purge $LOADED_FEATURES entries, FileUtils.rm_rf temp dir. Works for both paths now.\n2. **ensure in Isolation::Fork#call** (existing): Simplify — no longer needs to check/restore original file content. Instead, just verify temp dir cleanup. Keep as defense-in-depth for temp dir leaks.\n3. **at_exit hook** (new): Register a cleanup for the temp base dir pattern (evilution*) in case of unhandled exit. Safety net for leaked temp dirs.\n4. **Signal traps** (new): Trap SIGTERM/SIGINT in the parent process to ensure temp dir cleanup before exit.\n\n#### 6. Isolation::Fork#restore_original_source\n- Remove the file-content comparison and rewrite logic\n- Replace with temp-dir cleanup verification (check if any evilution temp dirs remain, clean them)\n- This is now truly defense-in-depth rather than a critical restore path\n\n#### 7. Parallel mode (InProcess isolation)\n- No special handling needed — each worker is a forked process with its own $LOAD_PATH\n- Temp dirs are per-mutation, isolated across workers\n- The biggest current risk (corrupted original file on worker SIGKILL) is eliminated because the original file is never modified\n\n#### 8. 
Zeitwerk (Rails autoloader) compatibility\n- Zeitwerk maps file paths to constant names using autoload_paths (which are $LOAD_PATH entries in Rails)\n- For files under Zeitwerk-managed paths: LOAD_PATH shadow works — Zeitwerk will find the temp version first\n- For files NOT under Zeitwerk paths: the explicit `load` approach bypasses Zeitwerk entirely, which is correct since Zeitwerk wouldn't manage those files anyway\n- Edge case: Zeitwerk caches file-to-constant mappings. In a forked child, the cache is inherited. Since we `load` after fork, the class is redefined in-place — Zeitwerk's cache remains valid (same constant, new definition)\n- Need integration test to verify (EV-268)\n\n---\n\n### Implementation Order\n1. **EV-263** (SourceSurgeon temp-file write): Modify apply_mutation to always use temp files. Add explicit `load` for non-LOAD_PATH files. Remove direct-overwrite fallback.\n2. **EV-265** (Load-path redirection): Refine the LOAD_PATH prepend logic. Handle edge cases (multiple LOAD_PATH matches, nested paths).\n3. **EV-267** (Cleanup): Add at_exit hook and signal traps. Simplify Isolation::Fork defense-in-depth.\n4. **EV-266** (Zeitwerk): Test and handle Zeitwerk edge cases.\n5. 
**EV-268** (Integration tests): Verify original file never modified, cleanup on normal/exceptional/signal exit, Zeitwerk compat.\n\n### Files to modify\n- `lib/evilution/integration/rspec.rb` — primary changes (apply_mutation, restore_original)\n- `lib/evilution/isolation/fork.rb` — simplify restore_original_source\n- `lib/evilution/isolation/in_process.rb` — no changes expected\n- `lib/evilution/ast/source_surgeon.rb` — no changes\n- `lib/evilution/runner.rb` — possibly add at_exit/signal trap registration","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:42.91131604+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.18355494+07:00","closed_at":"2026-04-08T15:53:00.18355494+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-254","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
@@ -207,7 +207,7 @@
207
207
  {"id":"EV-261","title":"Add --skip-heredoc-literals CLI flag","description":"Add a flag to completely skip string literal mutations inside heredocs. For users who prefer zero heredoc mutations.","notes":"GitHub: #548","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:56.515382643+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T19:20:01.937747302+07:00","closed_at":"2026-04-08T19:20:01.937747302+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-261","depends_on_id":"EV-241","type":"blocks","created_at":"0001-01-01T00:00:00Z"},{"issue_id":"EV-261","depends_on_id":"EV-260","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
208
208
  {"id":"EV-262","title":"Add tests for heredoc mutation behavior","description":"Test: heredoc with no interpolation (skipped or mutated to empty), heredoc with interpolation (only expressions mutated), squiggly heredoc, nested heredoc.","notes":"GitHub: #549","status":"in_progress","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:56.333773283+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T19:20:02.042056553+07:00","dependencies":[{"issue_id":"EV-262","depends_on_id":"EV-241","type":"blocks","created_at":"0001-01-01T00:00:00Z"},{"issue_id":"EV-262","depends_on_id":"EV-261","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
209
209
  {"id":"EV-263","title":"Implement temp-file write in SourceSurgeon","description":"Modify SourceSurgeon.apply to write mutated source to a temp file instead of overwriting the original. Return the temp file path. Original file is never touched.","notes":"GitHub: #537","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:29:59.265360981+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.183566714+07:00","closed_at":"2026-04-08T15:53:00.183566714+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-263","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
210
- {"id":"EV-264","title":"Define target metric and measurement methodology for mutation density gap","description":"Define what 'closing the gap' means: target ratio (e.g., <1.5x), measurement protocol (which files, which mutant config), and a benchmark script that can be re-run to track progress over time.","notes":"GitHub: #541","status":"open","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:30:08.545241632+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-04T11:30:26.634822515+07:00","dependencies":[{"issue_id":"EV-264","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
210
+ {"id":"EV-264","title":"Define target metric and measurement methodology for mutation density gap","description":"Define what 'closing the gap' means: target ratio (e.g., <1.5x), measurement protocol (which files, which mutant config), and a benchmark script that can be re-run to track progress over time.","notes":"GitHub: #541","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:30:08.545241632+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T23:19:42.57055849+07:00","closed_at":"2026-04-08T23:19:42.57055849+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-264","depends_on_id":"EV-238","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
211
211
  {"id":"EV-265","title":"Implement load-path redirection for forked test process","description":"In the fork isolation, prepend the temp directory to $LOAD_PATH (or use a more targeted mechanism) so that require and load pick up the mutated file instead of the original.","notes":"GitHub: #550","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:30:17.522950262+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.18357647+07:00","closed_at":"2026-04-08T15:53:00.18357647+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-265","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
212
212
  {"id":"EV-266","title":"Handle Rails autoloader (Zeitwerk) compatibility","description":"Zeitwerk uses absolute paths. Test that the temp-file approach works with Zeitwerk's file-to-constant mapping. May need to use Zeitwerk's on_load callbacks or file override mechanism.","notes":"GitHub: #551","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:30:40.575881302+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.183584148+07:00","closed_at":"2026-04-08T15:53:00.183584148+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-266","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
213
213
  {"id":"EV-267","title":"Add cleanup of temp files after mutation run","description":"Ensure temp files are cleaned up on normal exit, exception, and signal (SIGTERM/SIGINT). Use at_exit hooks and signal traps.","notes":"GitHub: #552","status":"closed","priority":2,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-04T11:30:57.733824316+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-08T15:53:00.18359655+07:00","closed_at":"2026-04-08T15:53:00.18359655+07:00","close_reason":"Closed","dependencies":[{"issue_id":"EV-267","depends_on_id":"EV-237","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
@@ -221,6 +221,9 @@
221
221
  {"id":"EV-274","title":"Add rake memory:check to CI pipeline","description":"Add the memory leak regression check (rake memory:check) as a CI step. This catches regressions in isolation/integration code by detecting per-mutation RSS growth spikes. Requires Linux runner (/proc/self/status). Consider running after the main spec suite to avoid blocking fast feedback.","notes":"GitHub: #566","status":"closed","priority":3,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-05T22:53:27.447410371+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T11:52:58.110045281+07:00","closed_at":"2026-04-06T11:52:58.110045281+07:00","close_reason":"Implemented in PR #571. Added memory_check job to CI workflow with pinned runner (ubuntu-24.04), SHA-pinned actions, and explicit threshold env vars."}
222
222
  {"id":"EV-275","title":"Use project's own complex classes as memory check fixture","description":"Replace the simple_class.rb fixture in script/memory_check with more complex classes from the evilution codebase itself (e.g. Runner, Config, AST::Parser). This provides realistic per-mutation load: more ExampleGroup subclasses, deeper spec nesting, and heavier metadata — closer to what users see in real projects. Affects check #5 (RSpec integration per-mutation) primarily.","notes":"GitHub: #567","status":"in_progress","priority":3,"issue_type":"task","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-05T22:53:30.214655275+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T12:36:50.959507734+07:00"}
223
223
  {"id":"EV-276","title":"InProcess suppress_output closes /dev/null handles causing 'closed stream' on reuse with clear_examples","description":"InProcess#suppress_output uses File.open with blocks that auto-close /dev/null handles after each call. When the RSpec integration uses clear_examples (which reuses Configuration), the formatter retains a reference to $stdout from the first run. On subsequent calls, the formatter writes to the closed handle, causing 'closed stream' errors. Fix: use persistent /dev/null handles or StringIO.","notes":"GitHub: #569","status":"in_progress","priority":2,"issue_type":"bug","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-05T23:06:51.502713099+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-06T12:02:05.127419951+07:00"}
224
+ {"id":"EV-277","title":"Multi-byte character offset bug in AST::Parser subject extraction","description":"When parsing files containing multi-byte characters (e.g. Cyrillic), AST::Parser uses @source[loc.start_offset...loc.end_offset] on the string directly instead of on the binary representation. Prism byte offsets are byte-based, but Ruby string slicing is character-based for encoded strings, causing extracted method bodies to be garbled. Fix: use @source.b[offset...end].force_encoding(@source.encoding) as done in the mutant_json_adapter workaround.","notes":"GitHub: #615","status":"open","priority":1,"issue_type":"bug","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-09T00:01:43.021380194+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T00:01:58.491188786+07:00"}
225
+ {"id":"EV-278","title":"IndexToAt operator: [] → .at() substitution","description":"Add mutation operator that replaces array/hash [] access with .at() method. .at() returns nil on out-of-bounds instead of raising, exposing missing bounds checks. Identified in EV-251 prioritization as the only uncovered signal category (60 mutations in benchmark corpus). Low implementation complexity — match CallNode with name [] on collection receivers.","notes":"GitHub: #618","status":"in_progress","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-09T09:12:09.052329944+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T11:26:17.496127389+07:00"}
226
+ {"id":"EV-279","title":"BlockPassRemoval operator: remove &:method block pass","description":"Add mutation operator that removes &:symbol block pass arguments from method calls (e.g. map(&:to_s) → map). Low priority — only 5 mutations in benchmark corpus, but trivial to implement. Identified in EV-251 prioritization.","notes":"GitHub: #619","status":"in_progress","priority":4,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-04-09T09:12:14.145760441+07:00","created_by":"Denis Kiselev","updated_at":"2026-04-09T10:43:28.718393237+07:00"}
224
227
  {"id":"EV-28","title":"MCP server for direct tool invocation","description":"Implement a Model Context Protocol (MCP) server that exposes evilution as a tool. Agents could call evilution directly instead of shelling out and parsing output. The server should expose a 'mutate' tool that accepts target files, options, and returns structured results.","status":"closed","priority":2,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-10T06:17:45.29866593+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-16T22:58:51.734461132+07:00","closed_at":"2026-03-16T22:58:51.734461132+07:00","close_reason":"PR #103 merged — MCP server with evilution-mutate tool via stdio transport"}
225
228
  {"id":"EV-29","title":"Add --stdin flag to accept file list from stdin","description":"Add a --stdin flag that reads target file paths (one per line) from stdin. Enables workflows like: git diff --name-only | evilution run --stdin --format json. Each line can include line-range syntax (e.g. lib/foo.rb:15-30).","status":"closed","priority":3,"issue_type":"feature","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-10T06:17:46.306306092+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-16T18:03:28.998559073+07:00","closed_at":"2026-03-16T18:03:28.998559073+07:00","close_reason":"PR #92 merged — --stdin flag for piped file list workflows"}
226
229
  {"id":"EV-3","title":"Phase 2: Mutation Operators & CLI","description":"Implement remaining 17 mutation operators, build CLI with OptionParser, exe/evilution executable, human-readable reporter. Milestone: bundle exec evilution run lib/user.rb --format json","status":"closed","priority":2,"issue_type":"epic","owner":"denis.kiselyov@gmail.com","created_at":"2026-03-02T00:05:00.492971295+07:00","created_by":"Denis Kiselev","updated_at":"2026-03-02T11:21:32.168384165+07:00","closed_at":"2026-03-02T11:21:32.168384165+07:00","close_reason":"Phase 2 complete: all 18 operators, CLI, Reporter::CLI, Registry registration, executable","dependencies":[{"issue_id":"EV-3","depends_on_id":"EV-2","type":"blocks","created_at":"0001-01-01T00:00:00Z"}]}
data/CHANGELOG.md CHANGED
@@ -1,5 +1,22 @@
1
1
  # Changelog
2
2
 
3
+ ## [0.22.0] - 2026-04-09
4
+
5
+ ### Added
6
+
7
+ - **Minitest integration** — full Minitest support as an alternative to RSpec; abstract `Integration::Base` framework with template method pattern; `Integration::Minitest` with programmatic `Minitest.__run` execution, `MinitestCrashDetector` reporter for distinguishing assertion failures from crashes; `--integration minitest` CLI flag and `integration: minitest` config option; `SpecResolver` parameterized for Minitest file discovery (`test/`, `_test.rb`); plugin-based runner dispatch via `INTEGRATIONS` registry; baseline runner abstracted from RSpec with injectable runner callable; Minitest concrete suggestion templates using `def test_`/`assert_equal` style (#223, #224, #225, #226, #227, #228, #229, #230)
8
+ - **New mutation operators (3)** — `index_to_at` replaces `arr[0]` with `arr.at(0)` for array index access (#618); `regex_simplification` simplifies regex character classes and quantifiers (#514); `block_pass_removal` removes block arguments (`&...`) in method calls (#619)
9
+ - **Mutation density benchmarking** — comparison tools and methodology for measuring mutation density against reference tool; baseline results and operator classification documents (#523, #526, #541)
10
+
11
+ ### Fixed
12
+
13
+ - **Multi-byte character offset bug** — Prism byte offsets were used with character-based `String#[]`, causing garbled source extraction for files with multi-byte characters (UTF-8 Cyrillic, Thai, CJK, etc.); fixed `AST::Parser`, `DisableComment`, and 7 mutation operators to use `byteslice`/`getbyte`; added `byteslice_source` helper to `Mutator::Base` (#615)
14
+
15
+ ### Changed
16
+
17
+ - **Operator count** — 72 operators (up from 69), with new index-to-at, regex simplification, and block pass removal operators
18
+ - **Test framework support** — RSpec and Minitest both supported; documentation updated throughout CLI help, MCP tool descriptions, and README
19
+
3
20
  ## [0.21.0] - 2026-04-08
4
21
 
5
22
  ### Added
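The multi-byte offset fix listed under 0.22.0 can be illustrated with a minimal sketch. It assumes a parser that reports byte offsets (as the changelog says Prism does); the sample source string and the variable names are hypothetical and are not taken from the gem.

```ruby
# Hypothetical sample source; the comment holds 2-byte Cyrillic characters.
source = "# Привет\ndef foo; end\n"

# Byte offsets, as a byte-oriented parser would report them.
start_offset = source.b.index("def")
end_offset   = start_offset + "def foo; end".bytesize

# Character-based slicing misreads the byte offsets as character indexes.
garbled = source[start_offset...end_offset]

# Byte-based slicing honours the offsets and keeps the source encoding.
correct = source.byteslice(start_offset, end_offset - start_offset)

puts garbled.inspect
puts correct.inspect
```

For ASCII-only files the two slices coincide, which is likely why this kind of bug only shows up once a file contains multi-byte characters ahead of the extracted span.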
data/README.md CHANGED
@@ -7,7 +7,7 @@
7
7
  * **License**: MIT (free, no commercial restrictions)
8
8
  * **Language**: Ruby >= 3.3
9
9
  * **Parser**: Prism (Ruby's official AST parser, ships with Ruby 3.3+)
10
- * **Test framework**: RSpec (currently the only supported integration)
10
+ * **Test frameworks**: RSpec and Minitest
11
11
 
12
12
  ## Installation
13
13
 
@@ -57,9 +57,10 @@ evilution [command] [options] [files...]
57
57
  | `--no-baseline` | Boolean | _(enabled)_ | Skip baseline test suite check. By default, a baseline run detects pre-existing failures and marks those mutations as `neutral`. |
58
58
  | `--fail-fast [N]` | Integer | _(none)_ | Stop after N surviving mutants (default 1 if no value given). |
59
59
  | `-v`, `--verbose` | Boolean | false | Verbose output with RSS memory and GC stats per phase and per mutation. |
60
- | `--suggest-tests` | Boolean | false | Generate concrete RSpec test code in suggestions instead of static descriptions. |
60
+ | `--suggest-tests` | Boolean | false | Generate concrete test code in suggestions (RSpec or Minitest, based on `--integration`). |
61
61
  | `-q`, `--quiet` | Boolean | false | Suppress output. |
62
62
  | `--stdin` | Boolean | false | Read target file paths from stdin (one per line). |
63
+ | `--integration NAME` | String | `rspec` | Test framework integration: `rspec` or `minitest`. |
63
64
  | `--incremental` | Boolean | false | Cache killed/timeout results; skip unchanged mutations on re-runs. |
64
65
  | `--save-session` | Boolean | false | Persist results as timestamped JSON under `.evilution/results/`. |
65
66
  | `--no-progress` | Boolean | _(enabled)_ | Disable the TTY progress bar. |
@@ -86,8 +87,8 @@ Creates `.evilution.yml`:
86
87
  # timeout: 30 # seconds per mutation
87
88
  # format: text # text | json | html
88
89
  # min_score: 0.0 # 0.0–1.0
89
- # integration: rspec # test framework
90
- # suggest_tests: false # concrete RSpec test code in suggestions
90
+ # integration: rspec # test framework: rspec, minitest
91
+ # suggest_tests: false # concrete test code in suggestions (matches integration)
91
92
  # save_session: false # persist results under .evilution/results/
92
93
  # skip_heredoc_literals: false # skip all string literal mutations inside heredocs
93
94
  # show_disabled: false # report mutations skipped by disable comments
@@ -168,7 +169,7 @@ Use `--format json` for machine-readable output. Schema:
168
169
 
169
170
  **Key metric**: `summary.score` — the mutation score. Higher is better. 1.0 means all mutations were caught.
170
171
 
171
- ## Mutation Operators (69 total)
172
+ ## Mutation Operators (72 total)
172
173
 
173
174
  Each operator name is stable and appears in JSON output under `survived[].operator`.
174
175
 
@@ -201,8 +202,10 @@ Each operator name is stable and appears in JSON output under `survived[].operat
201
202
  | `keyword_argument` | Remove keyword defaults/params | `def foo(bar: 42)` -> `def foo(bar:)` |
202
203
  | `multiple_assignment` | Remove targets or swap order | `a, b = 1, 2` -> `b, a = 1, 2` |
203
204
  | `block_removal` | Remove blocks from method calls | `items.map { \|x\| x * 2 }` -> `items.map` |
205
+ | `block_pass_removal` | Remove block arguments passed with `&` | `items.map(&:to_s)` -> `items.map` |
204
206
  | `range_replacement` | Swap inclusive/exclusive ranges | `1..10` -> `1...10` |
205
207
  | `regexp_mutation` | Replace regexp with always/never matching | `/pat/` -> `/a\A/` |
208
+ | `regex_simplification` | Simplify regex quantifiers, anchors, ranges | `/\d+/` -> `/\d/`, `/[a-z]/` -> `/[az]/` |
206
209
  | `receiver_replacement` | Drop explicit `self` receiver | `self.foo` -> `foo` |
207
210
  | `send_mutation` | Swap semantically related methods | `detect` -> `find`, `map` -> `flat_map` |
208
211
  | `compound_assignment` | Swap compound assignment operators | `+=` -> `-=`, `&&=` -> `\|\|=` |
@@ -224,6 +227,7 @@ Each operator name is stable and appears in JSON output under `survived[].operat
224
227
  | `bitwise_complement` | Remove or swap `~` | `~x` -> `x`, `~x` -> `-x` |
225
228
  | `zsuper_removal` | Replace implicit `super` with `nil` | `super` -> `nil` |
226
229
  | `explicit_super_mutation` | Mutate explicit super arguments | `super(a, b)` -> `super` |
230
+ | `index_to_at` | Replace `[]` with `.at()` for arrays | `arr[0]` -> `arr.at(0)` |
227
231
  | `index_to_fetch` | Replace `[]` with `.fetch()` | `h[k]` -> `h.fetch(k)` |
228
232
  | `index_to_dig` | Replace `[]` chains with `.dig()` | `h[a][b]` -> `h.dig(a, b)` |
229
233
  | `index_assignment_removal` | Remove `[]=` assignments | `h[k] = v` -> removed |
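As a rough illustration of what an `index_to_at` mutant changes at runtime, the sketch below compares `[]` with `.at()` using plain Ruby core classes. The sample values are hypothetical, and how the operator selects receivers is not shown in this diff.

```ruby
arr = [10, 20, 30]

# For a single integer index the two calls agree, including out of bounds.
same_for_integers = arr[0] == arr.at(0) && arr[5].nil? && arr.at(5).nil?

# Array#at accepts only an integer, so a range index raises.
range_error = begin
  arr.at(0..1)
  nil
rescue TypeError => e
  e.class
end

# Hash has no #at, so the same rewrite on a hash receiver raises.
hash_error = begin
  { a: 1 }.at(:a)
  nil
rescue NoMethodError => e
  e.class
end

puts same_for_integers
puts range_error
puts hash_error
```

Under these assumptions, a test suite would kill the mutant only when the indexed receiver is not an array or the index is not a single integer.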
@@ -290,9 +294,9 @@ Use `minimal` when context window budget is tight and you only need to see what
290
294
 
291
295
  ### Concrete Test Suggestions
292
296
 
293
- The MCP tool accepts a `suggest_tests` boolean parameter (default: `false`). When enabled, survived mutation suggestions contain concrete RSpec `it` blocks that an agent can drop into a spec file, instead of static description text.
297
+ The MCP tool accepts a `suggest_tests` boolean parameter (default: `false`). When enabled, survived mutation suggestions contain concrete test code that an agent can drop into a test file, instead of static description text. The MCP tool currently generates RSpec-style suggestions (`it`/`expect` blocks).
294
298
 
295
- Pass `suggest_tests: true` in the MCP tool call, or use `--suggest-tests` on the CLI, to activate this mode.
299
+ Pass `suggest_tests: true` in the MCP tool call to activate this mode. The CLI also supports `--suggest-tests`; when using the CLI, generated suggestions match the `--integration` setting (RSpec `it`/`expect` blocks or Minitest `def test_`/`assert_equal` methods).
296
300
 
297
301
  > **Note**: `.mcp.json` is gitignored by default since it is a local editor/agent configuration file.
298
302
 
@@ -356,7 +360,7 @@ Use when you know which file was modified and want to verify its test coverage.
356
360
  For each entry in `survived[]`:
357
361
  1. Read `file` at `line` to understand the code context
358
362
  2. Read `operator` to understand what was changed
359
- 3. Read `suggestion` for a hint on what test to write (use `--suggest-tests` for concrete RSpec code)
363
+ 3. Read `suggestion` for a hint on what test to write (use `--suggest-tests` for concrete test code)
360
364
  4. Write a test that would fail if the mutation were applied
361
365
  5. Re-run evilution on just that file to verify the mutant is now killed
362
366
 
@@ -389,9 +393,9 @@ Tests 4 paths (InProcess isolation, Fork isolation, mutation generation + stripp
389
393
  1. **Parse** — Prism parses Ruby files into ASTs with exact byte offsets
390
394
  2. **Extract** — Methods are identified as mutation subjects
391
395
  3. **Filter** — Disable comments, Sorbet `sig` blocks, and AST ignore patterns exclude mutations before execution
392
- 4. **Mutate** — 69 operators produce text replacements at precise byte offsets (source-level surgery, no AST unparsing); heredoc literal text is skipped by default
396
+ 4. **Mutate** — 72 operators produce text replacements at precise byte offsets (source-level surgery, no AST unparsing); heredoc literal text is skipped by default
393
397
  5. **Isolate** — Mutations are applied to temporary file copies (never modifying originals); load-path redirection ensures `require` resolves the mutated copy. Default isolation is in-process; `--isolation fork` uses forked child processes. Parallel mode (`--jobs N`) always uses in-process isolation inside pool workers to avoid double forking
394
- 6. **Test** — RSpec executes against the mutated source
398
+ 6. **Test** — The configured test framework (RSpec or Minitest) executes against the mutated source
395
399
  7. **Collect** — Source strings and AST nodes are released after use to minimize memory retention
396
400
  8. **Report** — Results aggregated into text, JSON, or HTML, including efficiency metrics and peak memory usage
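The "text replacement at a byte offset" idea from step 4 can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not evilution's implementation: `apply_replacement` is a hypothetical helper, and the offset is found by searching the raw bytes, whereas the pipeline above takes offsets from the Prism AST.

```ruby
# Replace `length` bytes starting at byte offset `start` with `replacement`.
# Hypothetical helper, not evilution's actual API.
def apply_replacement(source, start, length, replacement)
  prefix = source.byteslice(0, start)
  suffix = source.byteslice((start + length)..)
  prefix + replacement + suffix
end

source = "def adult?(age)\n  age >= 18\nend\n"

# A real run would take offsets from the parsed AST; here the operator is
# located by searching a binary copy of the source, which yields a byte index.
start = source.b.index(">=")

mutated = apply_replacement(source, start, ">=".bytesize, ">")
puts mutated
```

Because the helper returns a new string, the original source stays untouched, which mirrors the "never modifying originals" rule in step 5.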
397
401
 
@@ -0,0 +1,35 @@
1
+ # Head-to-Head Mutation Comparison Baseline
2
+
3
+ Date: 2026-04-09
4
+ Evilution: v0.21.0
5
+ Reference tool: mutant 0.16.0 (via mutant_json_adapter)
6
+ Target project: private Rails app (Ruby 4.0.1)
7
+
8
+ ## Results
9
+
10
+ | File | Evilution | Reference | Ratio |
11
+ |------|-----------|-----------|-------|
12
+ | app/controllers/admin/news_controller.rb | 363 | 524 | 1.44x |
13
+ | app/models/player_claim.rb | 237 | 281 | 1.19x |
14
+ | app/services/telegram/news_scorer.rb | 353 | 496 | 1.41x |
15
+ | app/validators/password_strength_validator.rb | 157 | 179 | 1.14x |
16
+ | app/policies/application_policy.rb | 79 | 86 | 1.09x |
17
+ | app/services/telegram/entities_formatter.rb | 454 | 647 | 1.43x |
18
+ | app/services/autosave_game_protocol_service.rb | 315 | 418 | 1.33x |
19
+ | lib/scraper/game_scraper.rb | 449 | 625 | 1.39x |
20
+ | app/jobs/process_telegram_webhook_job.rb | 196 | 255 | 1.30x |
21
+ | lib/telegram/export_parser.rb | 300 | 390 | 1.30x |
22
+ | **TOTAL** | **2,903** | **3,901** | **1.34x** |
23
+
24
+ ## Summary
25
+
26
+ - Density ratio: **1.34x** (target: < 1.5x) — **PASS**
27
+ - Highest gap: news_controller.rb (1.44x), entities_formatter.rb (1.43x)
28
+ - Lowest gap: application_policy.rb (1.09x)
29
+
30
+ ## Notes
31
+
32
+ - Reference tool cannot parse Ruby 4.0 files natively; used mutant_json_adapter
33
+ (Prism-based method extraction + mutant util mutation -e per method)
34
+ - This means reference tool mutations lack class-level context (constants,
35
+ inheritance, instance variables), so the true gap may differ slightly from these figures
@@ -0,0 +1,79 @@
1
+ # Reference Tool Mutation Classification by Operator Category
2
+
3
+ Date: 2026-04-09
4
+ Corpus: 10 files from a private Rails app (2,903 evilution / 3,901 reference mutations)
5
+
6
+ ## Category Breakdown
7
+
8
+ Classification of all 3,901 reference tool mutations by semantic category,
9
+ with signal/noise assessment and evilution coverage status.
10
+
11
+ | Category | Count | % | Signal | Evilution Coverage |
12
+ |----------|------:|--:|--------|-------------------|
13
+ | complex_mutation | 854 | 21.9% | Mixed | Partial — multi-statement changes, compound mutations spanning multiple lines |
14
+ | argument_nil | 585 | 15.0% | Signal | Covered — ArgumentNilSubstitution, ArgumentRemoval |
15
+ | guard_clause_restructure | 570 | 14.6% | Noise | Not applicable — reformats `return if X` to `unless X; return; end` without semantic change |
16
+ | receiver_mutation | 416 | 10.7% | Signal | Covered — ReceiverReplacement, MethodCallRemoval |
17
+ | receiver_self_swap | 281 | 7.2% | Mixed | Partial — ReceiverReplacement covers some; `self.` insertion is often equivalent |
18
+ | string_mutation | 224 | 5.7% | Signal | Covered — StringLiteral, StringInterpolation |
19
+ | arithmetic_mutation | 224 | 5.7% | Signal | Covered — ArithmeticReplacement |
20
+ | symbol_mutation | 129 | 3.3% | Signal | Covered — SymbolLiteral (mutant uses `__mutant__` suffix) |
21
+ | hash_mutation | 119 | 3.1% | Mixed | Partial — HashLiteral covers structure; key reordering is noise |
22
+ | method_body_removal | 94 | 2.4% | Signal | Covered — MethodBodyReplacement (empty body variant) |
23
+ | method_body_raise | 79 | 2.0% | Signal | Covered — MethodBodyReplacement (raise variant) |
24
+ | method_body_super | 79 | 2.0% | Signal | Covered — MethodBodyReplacement (super variant) |
25
+ | method_substitution_at | 60 | 1.5% | Mixed | Not covered — `[]` → `.at()` catches missing bounds checks |
26
+ | method_substitution_fetch | 58 | 1.5% | Signal | Covered — IndexToFetch |
27
+ | condition_nil_false | 43 | 1.1% | Signal | Covered — ConditionalBranch, NilReplacement |
28
+ | method_body_nil | 27 | 0.7% | Signal | Covered — MethodBodyReplacement (nil variant) |
29
+ | regex_mutation | 27 | 0.7% | Signal | Covered — RegexpMutation, RegexCapture |
30
+ | boolean_literal | 16 | 0.4% | Signal | Covered — BooleanLiteralReplacement |
31
+ | equality_mutation | 10 | 0.3% | Signal | Covered — EqualityToIdentity |
32
+ | block_pass_mutation | 5 | 0.1% | Signal | Partial — BlockRemoval covers block removal; `&:method` removal not specific |
33
+ | integer_boundary | 1 | 0.0% | Signal | Covered — IntegerLiteral |
34
+
35
+ ## Signal vs Noise Summary
36
+
37
+ | Assessment | Count | % |
38
+ |------------|------:|--:|
39
+ | Signal (catches real bugs) | 2,176 | 55.8% |
40
+ | Mixed (some signal, some equivalent) | 1,155 | 29.6% |
41
+ | Noise (equivalent or reformatting) | 570 | 14.6% |
42
+
43
+ ## Key Findings
44
+
45
+ ### 1. Guard clause restructuring is pure noise (14.6%)
46
+
47
+ The reference tool rewrites `return X if condition` to `unless condition; return X; end`.
48
+ This is a syntactic reformatting, not a semantic mutation. It inflates the mutation count
49
+ without testing anything. Evilution correctly does not produce these.
50
+
51
+ ### 2. Most categories are already covered by evilution
52
+
53
+ Of the 21 categories in the table above, evilution has operators covering 15 fully and 4 partially.
54
+ Only 2 categories are not covered:
55
+ - **guard_clause_restructure** — noise, should not be added
56
+ - **method_substitution_at** — `[]` → `.at()`, marginal signal
57
+
58
+ ### 3. The "complex_mutation" bucket needs further analysis
59
+
60
+ 854 mutations (21.9%) are multi-statement compound changes that don't fit a single
61
+ category. Many combine receiver replacement + argument modification + formatting
62
+ changes in one diff. Some contain real signal (e.g., removing a hash key from a
63
+ method call), others are largely equivalent reformattings.
64
+
65
+ ### 4. The 1.34x gap is largely explained by:
66
+
67
+ - Guard clause restructuring: 570 mutations (noise)
68
+ - Receiver self-swap equivalents: ~140 mutations (noise portion of 281)
69
+ - Complex compound mutations: ~288 mutations (noise portion of 854)
70
+
71
+ **Removing noise**, the effective gap drops to approximately **1.0-1.1x** —
72
+ near parity for signal-bearing mutations.
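The arithmetic behind the noise-adjusted figure can be checked with a few lines of Ruby. The totals and the three noise counts are the ones quoted in this document; the hash keys are illustrative labels, and the last two counts are the document's own approximations.

```ruby
reference_total = 3_901
evilution_total = 2_903

# Noise estimates from the list above; the last two are approximate.
noise = {
  guard_clause_restructure: 570,
  receiver_self_swap_equivalents: 140,
  complex_compound_noise: 288
}

raw_ratio = reference_total.to_f / evilution_total
effective_ratio = (reference_total - noise.values.sum).to_f / evilution_total

puts format("raw: %.2fx, noise-adjusted: %.2fx", raw_ratio, effective_ratio)
```

With these inputs the three noise buckets add up to 998 mutations, which leaves 2,903 reference mutations, the same as evilution's total.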
73
+
74
+ ## Recommendations for EV-251 (Prioritization)
75
+
76
+ 1. **Do not add** guard clause restructuring — pure noise
77
+ 2. **Consider adding** `[]` → `.at()` substitution (60 mutations, real signal for bounds checking)
78
+ 3. **Investigate** the complex_mutation bucket further to extract any discrete operator patterns
79
+ 4. **Current density target (< 1.5x) is already met** at 1.34x overall
@@ -0,0 +1,68 @@
1
+ # Operator Addition Prioritization
2
+
3
+ Date: 2026-04-09
4
+ Based on: EV-250 classification of 3,901 reference tool mutations across 10 files
5
+
6
+ ## Current Status
7
+
8
+ - Density ratio: **1.34x** (target: < 1.5x) — **already passing**
9
+ - Evilution: 68 operators covering 15/21 reference categories fully, 4 partially
10
+ - Effective signal gap after removing noise: ~1.0-1.1x (near parity)
11
+
12
+ ## Prioritized Operator Additions
13
+
14
+ Ranked by: (a) signal frequency, (b) implementation complexity, (c) equivalent mutant rate.
15
+
16
+ ### Priority 1: High signal, low complexity
17
+
18
+ | # | Operator | Signal Count | Complexity | Equiv. Rate | Notes |
19
+ |---|----------|-------------|-----------|-------------|-------|
20
+ | 1 | `[]` → `.at()` substitution | 60 | Low | Low (~10%) | Catches unchecked array/hash access. Single AST node transform. New operator needed. |
21
+
22
+ **Rationale:** Only uncovered category with real signal. `Array#at` takes a single
23
+ integer index and, like `[]`, returns nil out of bounds; it differs for range or
24
+ start/length indexing and on receivers without `at` (e.g. Hash). Simple to implement:
25
+ match `CallNode` with name `[]` on collection receivers, emit the `.at()` variant.
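For reference, a short sketch of how `[]`, `at`, and `fetch` behave in plain Ruby (no ActiveSupport loaded). It uses only core methods; the variable names are illustrative. It suggests where a `[]` → `.at()` mutant can actually be told apart from the original.

```ruby
list = [10, 20, 30]

in_bounds     = [list[1], list.at(1)]    # both return the element
out_of_bounds = [list[10], list.at(10)]  # both return nil, neither raises

# `fetch` is the variant that raises on a missing index.
fetch_error = begin
  list.fetch(10)
rescue IndexError => e
  e.class
end

# `at` accepts only a single integer, so range indexing stops working.
range_error = begin
  list.at(0..1)
rescue TypeError => e
  e.class
end

# Hash does not define `at`, so a mutated hash lookup raises NoMethodError.
hash_has_at = {}.respond_to?(:at)
```

For a plain integer index on an Array the mutant is equivalent, so the tests most likely to kill it are those that exercise range indexing or non-Array receivers.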
26
+
27
+ ### Priority 2: Improve existing coverage (partial gaps)
28
+
29
+ | # | Operator | Gap Area | Complexity | Equiv. Rate | Notes |
30
+ |---|----------|----------|-----------|-------------|-------|
31
+ | 2 | Regex simplification (EV-230, #514) | 27 | Medium | Low (~15%) | Quantifier removal, anchor removal, character class simplification. Already scoped. |
32
+ | 3 | Block pass removal (`&...`) | 5 | Low | Medium (~30%) | Remove `&` block pass arguments (`&:symbol`, `&method(:name)`, etc). Marginal count but trivial to add. |
33
+
34
+ **Rationale:** EV-230 is already scoped with a GH issue. Block pass removal is
35
+ minimal effort for minimal gain — include only if doing a batch of small operators.
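To make the two Priority 2 candidates concrete, the sketch below shows hand-written before/after pairs. These are illustrations of the mutation kinds named in the table, not output from evilution's operators, and the sample pattern and inputs are assumptions.

```ruby
# Block pass removal: `map(&:to_s)` becomes `map`.
original_map = [1, 2].map(&:to_s)
mutated_map  = [1, 2].map            # no block: returns an Enumerator

# Regex simplification, hand-written variants of one pattern.
original_re        = /\A\d+\z/
quantifier_removed = /\A\d\z/        # `+` dropped
anchors_removed    = /\d+/           # `\A` and `\z` dropped

checks = {
  map_changed: original_map != mutated_map,
  quantifier_killed_by_42: original_re.match?("42") && !quantifier_removed.match?("42"),
  anchors_killed_by_a42b: !original_re.match?("a42b") && anchors_removed.match?("a42b")
}
puts checks
```

Each check names an input on which the original and the mutant disagree, which is the kind of test case that would kill the mutant.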
36
+
37
+ ### Priority 3: Do not implement
38
+
39
+ | # | Category | Count | Reason |
40
+ |---|----------|------:|--------|
41
+ | — | Guard clause restructuring | 570 | Pure noise — syntactic reformatting, not semantic mutation |
42
+ | — | Receiver self-swap (remaining) | ~140 | Mostly equivalent — `self.method` vs `method` rarely matters |
43
+ | — | Complex compound mutations | ~288 | Noise portion of multi-statement changes; not decomposable into discrete operators |
44
+
45
+ ## Implementation Order
46
+
47
+ 1. **EV-230** (#514) — Regex simplification operators (already scoped, medium complexity, 27 signal mutations)
48
+ 2. **New: `IndexToAt`** — `[]` → `.at()` substitution (60 signal mutations, low complexity)
49
+ 3. **New: `BlockPassRemoval`** — `&:method` removal (5 signal mutations, trivial)
50
+
51
+ ## Impact Assessment
52
+
53
+ | Scenario | Estimated Ratio | Delta |
54
+ |----------|----------------|-------|
55
+ | Current | 1.34x | — |
56
+ | After adding all Priority 1+2 | ~1.31x | -0.03x |
57
+
58
+ The density gap is already within target. These additions improve **signal
59
+ coverage** (catching real bugs that the reference tool catches and evilution misses)
60
+ rather than closing the headline ratio, which is already healthy.
61
+
62
+ ## Recommendation
63
+
64
+ The density gap research (EV-238) can be considered **successful** — the 1.5x
65
+ target is met at 1.34x. Remaining work should focus on signal quality (regex
66
+ mutations, bounds checking) rather than chasing the ratio lower. The reference
67
+ tool's ~15% noise inflation means its raw count is not a meaningful target for
68
+ exact parity.
@@ -0,0 +1,91 @@
1
+ # Mutation Density Benchmark Methodology
2
+
3
+ ## Goal
4
+
5
+ Track and close the mutation density gap between evilution and a reference
6
+ mutation testing tool.
7
+ Current gap: **1.8-2.6x** (evilution generates fewer mutations).
8
+ Target: **< 1.5x** across the benchmark corpus.
9
+
10
+ ## Metric
11
+
12
+ **Density ratio** = `reference_mutations / evilution_mutations` per file.
13
+
14
+ A ratio of 1.0 means parity. Values above 1.0 mean the reference tool generates
15
+ more. The aggregate ratio is computed from total mutations across all benchmark
16
+ files (not an average of per-file ratios, which would over-weight small files).
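The difference between the two aggregation methods can be seen on the per-file counts from `comparison_results/baseline_2026-04-09.md`. This is a small sketch for illustration; the variable names are not taken from the benchmark script.

```ruby
# [evilution, reference] counts per file, copied from the 2026-04-09 baseline.
counts = [
  [363, 524], [237, 281], [353, 496], [157, 179], [79, 86],
  [454, 647], [315, 418], [449, 625], [196, 255], [300, 390]
]

evilution_total = counts.sum { |evilution, _| evilution }
reference_total = counts.sum { |_, reference| reference }

# Ratio of totals: the aggregate defined above.
aggregate = reference_total.to_f / evilution_total

# Mean of per-file ratios: gives every file equal weight regardless of size.
mean_of_ratios = counts.sum { |evilution, reference| reference.to_f / evilution } / counts.size

puts format("aggregate: %.2fx", aggregate)
puts format("mean of per-file ratios: %.2fx", mean_of_ratios)
```

On this corpus the mean of per-file ratios comes out lower than the aggregate because the smallest files (such as `application_policy.rb`) have the lowest ratios and count as much as the largest ones.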
17
+
18
+ ## Measurement Protocol
19
+
20
+ ### Benchmark corpus
21
+
22
+ Select **10 files** from a real-world Rails project covering diverse patterns:
23
+
24
+ | Slot | Category | Example |
25
+ |------|----------------------|----------------------------------|
26
+ | 1 | Controller | `app/controllers/*_controller.rb`|
27
+ | 2 | Model (ActiveRecord) | `app/models/*.rb` |
28
+ | 3 | Service object | `app/services/*.rb` |
29
+ | 4 | Validator | `app/validators/*.rb` |
30
+ | 5 | Concern / mixin | `app/models/concerns/*.rb` |
31
+ | 6 | Helper | `app/helpers/*.rb` |
32
+ | 7 | Formatter / presenter| `app/presenters/*.rb` |
33
+ | 8 | Lib utility | `lib/*.rb` |
34
+ | 9 | Job / worker | `app/jobs/*.rb` |
35
+ | 10 | Configuration / DSL | `config/initializers/*.rb` |
36
+
37
+ Files should be **50-300 LOC** (enough mutations to be meaningful, small enough
38
+ to run quickly). The exact file list is stored in the benchmark config file
39
+ (`scripts/benchmark_density.yml`).
40
+
41
+ ### Tool configuration
42
+
43
+ Both tools must run with equivalent settings:
44
+
45
+ - **evilution**: default operators, no `--skip-heredoc-literals`, no ignore patterns
46
+ - **reference tool**: default operator set, no timeout (mutations are only counted, not run)
47
+
48
+ The benchmark counts **generated mutations**, not killed/survived. This isolates
49
+ operator coverage from test quality.
50
+
51
+ ### Running the benchmark
52
+
53
+ ```bash
54
+ # Count-only mode (fast, no test execution):
55
+ scripts/benchmark_density scripts/benchmark_density.yml
56
+
57
+ # Full output with per-file breakdown:
58
+ scripts/benchmark_density scripts/benchmark_density.yml --verbose
59
+ ```
60
+
61
+ ### Output
62
+
63
+ The script produces a table:
64
+
65
+ ```
66
+ File Evilution Reference Ratio
67
+ app/models/user.rb 42 78 1.86x
68
+ app/services/payment.rb 31 52 1.68x
69
+ ...
70
+ TOTAL 312 534 1.71x
71
+ ```
72
+
73
+ And a summary: `Density ratio: 1.71x (target: < 1.50x)`.
74
+
75
+ ## When to Run
76
+
77
+ - **Before each release** that adds new operators
78
+ - **After closing operator issues** from the gap analysis epic (GH #515)
79
+ - **On demand** when evaluating whether a proposed operator is worth adding
80
+
81
+ ## Interpreting Results
82
+
83
+ - **Ratio < 1.5x**: target met
84
+ - **Ratio 1.5-2.0x**: progress, but more operators needed
85
+ - **Ratio > 2.0x**: significant gap remains
86
+ - **Per-file outliers**: files with ratio > 3.0x likely expose a missing operator category
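The bands above translate into a small lookup. The sketch below is one possible reading: `interpret` and `outlier?` are hypothetical helpers, and treating exactly 1.5 and exactly 2.0 as "progress" is an assumption, since the list does not say which band owns the boundaries.

```ruby
# Map an aggregate density ratio to the bands listed above. Hypothetical helper.
def interpret(ratio)
  if ratio < 1.5
    :target_met
  elsif ratio <= 2.0
    :progress
  else
    :significant_gap
  end
end

# Per-file outliers are flagged separately from the aggregate band.
def outlier?(file_ratio)
  file_ratio > 3.0
end

puts interpret(1.34)
puts interpret(1.71)
```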
87
+
88
+ Not all extra mutations from the reference tool are valuable. Some produce
89
+ equivalent mutants (semantically identical code). The head-to-head comparison
90
+ (GH #523) classifies each extra mutation as signal vs noise. The density ratio
91
+ is a **coarse progress metric**, not a quality score.