rmcontrols 0.1.0__tar.gz

Files changed (35)
  1. rmcontrols-0.1.0/LICENSE +19 -0
  2. rmcontrols-0.1.0/PKG-INFO +551 -0
  3. rmcontrols-0.1.0/README.md +509 -0
  4. rmcontrols-0.1.0/pyproject.toml +67 -0
  5. rmcontrols-0.1.0/rmcontrols/__init__.py +46 -0
  6. rmcontrols-0.1.0/rmcontrols/_blobs.py +110 -0
  7. rmcontrols-0.1.0/rmcontrols/_build.py +41 -0
  8. rmcontrols-0.1.0/rmcontrols/_cli_extract.py +130 -0
  9. rmcontrols-0.1.0/rmcontrols/_cli_validate.py +287 -0
  10. rmcontrols-0.1.0/rmcontrols/_extract.py +217 -0
  11. rmcontrols-0.1.0/rmcontrols/_features.py +172 -0
  12. rmcontrols-0.1.0/rmcontrols/_hooks.py +99 -0
  13. rmcontrols-0.1.0/rmcontrols/_region.py +40 -0
  14. rmcontrols-0.1.0/rmcontrols/_s3.py +353 -0
  15. rmcontrols-0.1.0/rmcontrols/_segmentation.py +101 -0
  16. rmcontrols-0.1.0/rmcontrols/_types.py +154 -0
  17. rmcontrols-0.1.0/rmcontrols/_validation.py +435 -0
  18. rmcontrols-0.1.0/rmcontrols/cli.py +171 -0
  19. rmcontrols-0.1.0/rmcontrols/detector.py +438 -0
  20. rmcontrols-0.1.0/rmcontrols/py.typed +0 -0
  21. rmcontrols-0.1.0/rmcontrols/viz.py +282 -0
  22. rmcontrols-0.1.0/rmcontrols.egg-info/PKG-INFO +551 -0
  23. rmcontrols-0.1.0/rmcontrols.egg-info/SOURCES.txt +33 -0
  24. rmcontrols-0.1.0/rmcontrols.egg-info/dependency_links.txt +1 -0
  25. rmcontrols-0.1.0/rmcontrols.egg-info/entry_points.txt +5 -0
  26. rmcontrols-0.1.0/rmcontrols.egg-info/requires.txt +22 -0
  27. rmcontrols-0.1.0/rmcontrols.egg-info/top_level.txt +1 -0
  28. rmcontrols-0.1.0/setup.cfg +4 -0
  29. rmcontrols-0.1.0/tests/test_blobs.py +159 -0
  30. rmcontrols-0.1.0/tests/test_build.py +108 -0
  31. rmcontrols-0.1.0/tests/test_cli.py +240 -0
  32. rmcontrols-0.1.0/tests/test_detector.py +222 -0
  33. rmcontrols-0.1.0/tests/test_features.py +191 -0
  34. rmcontrols-0.1.0/tests/test_segmentation.py +115 -0
  35. rmcontrols-0.1.0/tests/test_viz.py +156 -0
@@ -0,0 +1,19 @@
1
+ Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
2
+ (CC BY-NC-ND 4.0)
3
+
4
+ Copyright (c) 2026 afilt
5
+
6
+ You are free to:
7
+ Share — copy and redistribute the material in any medium or format
8
+
9
+ Under the following terms:
10
+ Attribution — You must give appropriate credit, provide a link to the
11
+ license, and indicate if changes were made.
12
+ NonCommercial — You may not use the material for commercial purposes.
13
+ NoDerivatives — If you remix, transform, or build upon the material, you
14
+ may not distribute the modified material.
15
+
16
+ No additional restrictions — You may not apply legal terms or technological
17
+ measures that legally restrict others from doing anything the license permits.
18
+
19
+ Full license text: https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode
@@ -0,0 +1,551 @@
1
+ Metadata-Version: 2.4
2
+ Name: rmcontrols
3
+ Version: 0.1.0
4
+ Summary: Flag control tissues from immunochemistry whole slide images
5
+ Author: afilt
6
+ License-Expression: CC-BY-NC-ND-4.0
7
+ Project-URL: Homepage, https://github.com/afilt/rmcontrols
8
+ Project-URL: Repository, https://github.com/afilt/rmcontrols
9
+ Project-URL: Issues, https://github.com/afilt/rmcontrols/issues
10
+ Keywords: IHC,histology,whole slide image,control tissue,pathology
11
+ Classifier: Programming Language :: Python :: 3
12
+ Classifier: Programming Language :: Python :: 3.9
13
+ Classifier: Programming Language :: Python :: 3.10
14
+ Classifier: Programming Language :: Python :: 3.11
15
+ Classifier: Programming Language :: Python :: 3.12
16
+ Classifier: Operating System :: OS Independent
17
+ Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
18
+ Classifier: Topic :: Scientific/Engineering :: Image Processing
19
+ Requires-Python: >=3.9
20
+ Description-Content-Type: text/markdown
21
+ License-File: LICENSE
22
+ Requires-Dist: numpy>=1.23
23
+ Requires-Dist: Pillow
24
+ Requires-Dist: scipy
25
+ Requires-Dist: matplotlib
26
+ Requires-Dist: scikit-image
27
+ Requires-Dist: tqdm
28
+ Provides-Extra: wsi
29
+ Requires-Dist: openslide-python; extra == "wsi"
30
+ Provides-Extra: s3
31
+ Requires-Dist: boto3; extra == "s3"
32
+ Provides-Extra: dev
33
+ Requires-Dist: pytest; extra == "dev"
34
+ Requires-Dist: ruff; extra == "dev"
35
+ Requires-Dist: mypy; extra == "dev"
36
+ Requires-Dist: pytest-cov; extra == "dev"
37
+ Requires-Dist: pre-commit; extra == "dev"
38
+ Requires-Dist: nbstripout; extra == "dev"
39
+ Requires-Dist: ipykernel; extra == "dev"
40
+ Requires-Dist: pip; extra == "dev"
41
+ Dynamic: license-file
42
+
43
+ # rmcontrols
44
+
45
+ Detect and flag control tissues in immunohistochemistry (IHC) whole-slide image thumbnails.
46
+
47
+ ---
48
+
49
+ ## Installation
50
+
51
+ ```bash
52
+ pip install rmcontrols
53
+ ```
54
+
55
+ For WSI support (OpenSlide):
56
+
57
+ ```bash
58
+ pip install "rmcontrols[wsi]"
59
+ ```
60
+
61
+ For S3 support (boto3):
62
+
63
+ ```bash
64
+ pip install "rmcontrols[s3]"
65
+ ```
66
+
67
+ ---
68
+
69
+ ## Check control detection
70
+
71
+ Interactively validate the detected split line for a collection of thumbnails
72
+ or whole-slide images directly from the command line. Results are written to
73
+ a JSON file; the output directory is created automatically.
74
+
75
+ ### Thumbnails
76
+
77
+ ```bash
78
+ # Validate all PNGs in assets/ — results go to ./outputs/validate_thumbnails.json
79
+ rmcontrols-validate-thumbnails "assets/*.png" --side left
80
+
81
+ # Custom output file
82
+ rmcontrols-validate-thumbnails "assets/*.png" --side left \
83
+ --output my_results.json
84
+
85
+ # Replace an existing output file
86
+ rmcontrols-validate-thumbnails "assets/*.png" --overwrite
87
+
88
+ # Use the full 5-panel debug grid instead of the simple split-line view
89
+ rmcontrols-validate-thumbnails "assets/*.png" --full-debug
90
+ ```
91
+
92
+ ### Slides (local or S3)
93
+
94
+ Requires `uv sync --extra wsi` (OpenSlide). For S3 slides, also run
95
+ `uv sync --extra s3` (boto3).
96
+
97
+ ```bash
98
+ # Local .mrxs slides — results go to ./outputs/validate_slides.json
99
+ rmcontrols-validate-slides "slides/*.mrxs" --side left
100
+
101
+ # Mix of formats (svs, ndpi, scn …)
102
+ rmcontrols-validate-slides "slides/*.svs" --side left \
103
+ --thumbnail-size 800 --output svs_results.json
104
+
105
+ # Single S3 slide (pass the URI directly instead of a glob)
106
+ rmcontrols-validate-slides "s3://my-bucket/slides/case_001.mrxs" \
107
+ --side left --output outputs/case_001.json
108
+
109
+ # Replace an existing output file
110
+ rmcontrols-validate-slides "slides/*.mrxs" --overwrite
111
+ ```
112
+
113
+ ### Output format
114
+
115
+ Both commands produce a JSON array where each entry corresponds to one
116
+ thumbnail or slide:
117
+
118
+ ```json
119
+ [
120
+ {
121
+ "path": "assets/thumbnail_1.png",
122
+ "control_split_x": 142,
123
+ "thumbnail_width": 512,
124
+ "pct": "27.7%"
125
+ },
126
+ {
127
+ "path": "assets/thumbnail_2.png",
128
+ "control_split_x": null,
129
+ "thumbnail_width": 512,
130
+ "pct": "N/A"
131
+ }
132
+ ]
133
+ ```
134
+
135
+ `control_split_x` is `null` when the user entered `0` to mark "no controls".
136
+
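A minimal sketch of how the JSON output above could be consumed downstream. It relies only on the field names shown in the example (`path`, `control_split_x`, `thumbnail_width`); the `split_fraction` helper and the inline sample string are hypothetical and not part of rmcontrols.

```python
import json

# Sample copied from the output format above; in practice, read the file
# written by rmcontrols-validate-thumbnails or rmcontrols-validate-slides.
SAMPLE = """
[
  {"path": "assets/thumbnail_1.png", "control_split_x": 142,
   "thumbnail_width": 512, "pct": "27.7%"},
  {"path": "assets/thumbnail_2.png", "control_split_x": null,
   "thumbnail_width": 512, "pct": "N/A"}
]
"""


def split_fraction(entry):
    """Return the split position as a fraction of the width, or None."""
    split_x = entry["control_split_x"]
    if split_x is None:  # JSON null: the user marked "no controls"
        return None
    return split_x / entry["thumbnail_width"]


entries = json.loads(SAMPLE)
fractions = {e["path"]: split_fraction(e) for e in entries}
for path, frac in fractions.items():
    print(path, "no controls" if frac is None else f"{frac:.1%}")
```

Checking for `None` before dividing matters because entries marked "no controls" carry a JSON `null`.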
137
+ ---
138
+
139
+ ## Quick start
140
+
141
+ ### Python API
142
+
143
+ ```python
144
+ from rmcontrols import detect_controls, visualize
145
+
146
+ thumbnail, regions, control_split_x = detect_controls(
147
+ "assets/thumbnail_1.png",
148
+ side="left",
149
+ )
150
+
151
+ for r in regions:
152
+     print(r.label, r.bbox)
153
+
154
+ img = visualize(thumbnail, regions, control_split_x=control_split_x)
155
+ img.save("result.png")
156
+ ```
157
+
158
+ ### CLI
159
+
160
+ ```bash
161
+ uv run rmcontrols assets/thumbnail_1.png --side left
162
+ uv run rmcontrols assets/thumbnail_1.png --side right --output results.json
163
+ uv run rmcontrols assets/thumbnail_1.png --visualize annotated.png
164
+ ```
165
+
166
+ | Option | Default | Description |
167
+ |---|---|---|
168
+ | `--side` | `left` | Side where controls are placed (`left` or `right`) |
169
+ | `--strip-width` | `0.30` | Strip width as fraction of image width (max 0.40) |
170
+ | `--threshold` | `2.0` | Dissimilarity Z-score threshold |
171
+ | `--min-area` | `500` | Minimum blob area in pixels |
172
+ | `--max-aspect-ratio` | `5.0` | Reject blobs with aspect ratio above this |
173
+ | `--split-margin` | `50` | Extra pixels added beyond outermost control edge |
174
+ | `--proximity` | `50` | Proximity rescue radius in pixels |
175
+ | `--visualize` | — | Save annotated side-by-side PNG |
176
+ | `--output`, `-o` | `outputs/<stem>.json` | Write JSON results to file |
177
+ | `--overwrite` | off | Replace the output file if it already exists |
178
+
179
+ ---
180
+
181
+ ## Debug visualisation
182
+
183
+ ```python
184
+ from rmcontrols import detect_controls_debug, visualize_debug
185
+ import matplotlib.pyplot as plt
186
+
187
+ thumbnail, regions, control_split_x, debug_info = detect_controls_debug(
188
+ "assets/thumbnail_1.png", side="left",
189
+ )
190
+ fig = visualize_debug(thumbnail, debug_info)
191
+ plt.show()
192
+ ```
193
+
194
+ The figure shows five panels:
195
+
196
+ 1. **Original thumbnail** — with the split line overlaid
197
+ 2. **Tissue mask** — binary output of the Otsu step
198
+ 3. **Blob roles** — colour-coded bounding boxes (blue=main, green=control,
199
+ purple=proximity-rescued, orange=rejected)
200
+ 4. **Dissimilarity scores** — bar chart of strip-blob scores vs. threshold
201
+ 5. **Shape features** — grouped bar chart of geometric features per blob
202
+
203
+ ---
204
+
205
+ ## Interactive validation
206
+
207
+ ```python
208
+ from rmcontrols import validate_control_split_x
209
+
210
+ control_split_x, width = validate_control_split_x(
211
+ "assets/thumbnail_1.png",
212
+ side="left",
213
+ strip_width_frac=0.40,
214
+ dissimilarity_threshold=0.05,
215
+ )
216
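+ if control_split_x is None:  # the user entered 0: no controls
+     print(f"no controls (thumbnail width {width}px)")
+ else:
+     print(f"split={control_split_x} ({control_split_x / width * 100:.1f}% of {width}px)")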
217
+ ```
218
+
219
+ At the prompt:
220
+
221
+ | Input | Effect |
222
+ |---|---|
223
+ | *(Enter)* | Accept current value |
224
+ | `<integer>` | Override and redisplay |
225
+ | `0` | No controls → `control_split_x = None` |
226
+ | `debug` | Toggle full 5-panel grid on/off |
227
+ | `break` | Accept and stop (also stops batch loops) |
228
+
229
+ ### Batch validation
230
+
231
+ ```python
232
+ from rmcontrols import validate_control_split_x_batch
233
+ from pathlib import Path
234
+
235
+ results = validate_control_split_x_batch(
236
+ sorted(Path("assets/").glob("*.png")),
237
+ side="left",
238
+ )
239
+ # results: {"path/to/img.png": (control_split_x, width), ...}
240
+ ```
241
+
242
+ ### WSI batch (requires OpenSlide)
243
+
244
+ ```python
245
+ from pathlib import Path
+ from rmcontrols import validate_control_split_x_wsi
246
+
247
+ results = validate_control_split_x_wsi(
248
+ sorted(Path("slides/").glob("*.svs")),
249
+ side="left",
250
+ thumbnail_size=1000,
251
+ )
252
+ # results: {"slide.svs": (control_split_x, width), ...}
253
+ ```
254
+
255
+ ---
256
+
257
+ ## Hooks
258
+
259
+ ```python
260
+ from rmcontrols import detect_controls, DetectionHooks
261
+
262
+ def log_mask(mask, otsu, scale):
263
+     fg = mask.mean() * 100
264
+     print(f" mask: otsu={otsu}, scale={scale:.2f}, fg={fg:.1f}%")
265
+
266
+ def log_score(blob, score):
267
+     print(f" blob {blob['blob_id']}: score={score:.3f}")
268
+
269
+ hooks = DetectionHooks(
270
+ on_mask_ready=log_mask,
271
+ on_blob_scored=log_score,
272
+ )
273
+ thumbnail, regions, cx = detect_controls("thumbnail.png", hooks=hooks)
274
+ ```
275
+
276
+ Available hooks:
277
+
278
+ | Hook | Signature | When called |
279
+ |---|---|---|
280
+ | `on_mask_ready` | `(mask, otsu, scale) → None` | After tissue segmentation |
281
+ | `on_blobs_extracted` | `(blobs) → None` | After connected-component extraction |
282
+ | `on_blob_scored` | `(blob, score) → None` | Once per strip blob, after scoring |
283
+ | `on_detection_complete` | `(regions, debug_info) → None` | At pipeline end |
284
+
285
+ ---
286
+
287
+ ## Tuning guide
288
+
289
+ | Symptom | Parameter | Direction |
290
+ |---|---|---|
291
+ | Controls not detected | `dissimilarity_threshold` | ↓ lower |
292
+ | Main tissue wrongly flagged as control | `dissimilarity_threshold` | ↑ raise |
293
+ | Split line cuts into main tissue | `control_split_x_margin` | ↑ raise |
294
+ | Small dust/artifact blobs detected | `min_tissue_area_px` | ↑ raise |
295
+ | Long thin stain lines detected | `max_aspect_ratio` | ↓ lower |
296
+ | Faint tissue not segmented | `threshold_scale` | ↑ raise (or set explicitly) |
297
+ | Background over-segmented | `threshold_scale` | ↓ lower (or set explicitly) |
298
+ | Two physically-connected blobs split apart | `control_proximity_px` | ↑ raise |
299
+
300
+ ---
301
+
302
+ ## Development
303
+
304
+ ### Setup
305
+
306
+ ```bash
307
+ # 1. Install uv (if not already installed)
308
+ curl -LsSf https://astral.sh/uv/install.sh | sh
309
+
310
+ # 2. Clone and install with dev dependencies
311
+ git clone git@github.com:afilt/rmcontrols.git
312
+ cd rmcontrols
313
+ uv sync --extra dev
314
+
315
+ # 3. Install pre-commit hooks
316
+ uv run pre-commit install
317
+ ```
318
+
319
+ ```bash
320
+ uv run pytest
321
+ uv run ruff check rmcontrols/
322
+ uv run mypy rmcontrols/
323
+ ```
324
+
325
+ ### Pre-commit hooks
326
+
327
+ The repository ships a `.pre-commit-config.yaml` that runs on every
328
+ `git commit`:
329
+
330
+ | Hook | What it does |
331
+ |---|---|
332
+ | **nbstripout** | Strips all outputs and metadata from `*.ipynb` before committing |
333
+ | **ruff** | Lints Python files and auto-fixes safe issues (imports, style) |
334
+ | **ruff-format** | Formats Python code |
335
+ | **trailing-whitespace** | Removes trailing spaces |
336
+ | **end-of-file-fixer** | Ensures files end with a newline |
337
+ | **check-yaml / check-toml** | Validates `*.yaml` and `*.toml` syntax |
338
+ | **check-merge-conflict** | Aborts if merge-conflict markers are present |
339
+ | **debug-statements** | Rejects accidental `breakpoint()` / `pdb` calls |
340
+
341
+ **First-time setup** (requires a git repository):
342
+
343
+ ```bash
344
+ git init # if the repo does not exist yet
345
+ uv run pre-commit install
346
+ ```
347
+
348
+ **Run all hooks manually** (without committing):
349
+
350
+ ```bash
351
+ uv run pre-commit run --all-files
352
+ ```
353
+
354
+ ---
355
+
356
+
357
+ ## How it works
358
+
359
+ The detector runs an 11-step pipeline on the thumbnail:
360
+
361
+ ### Step 1 — Grayscale conversion
362
+ The RGB thumbnail is converted to grayscale using BT.601 luminance weights
363
+ (`0.299 R + 0.587 G + 0.114 B`) rather than a simple channel mean, preserving
364
+ perceptual brightness.
365
+
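To illustrate why the weighted sum differs from a plain channel mean, here is a small sketch using the BT.601 weights quoted above. The function names are hypothetical; rmcontrols presumably applies the same weights to whole arrays rather than to single pixels.

```python
def bt601_luma(r, g, b):
    """BT.601 luminance of one RGB pixel (channels in 0-255)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def channel_mean(r, g, b):
    """Plain channel average, shown only for comparison."""
    return (r + g + b) / 3


# Pure green looks much brighter to the eye than pure blue; the weighted
# sum reflects that, while the plain mean gives both the same value.
green, blue = (0, 255, 0), (0, 0, 255)
luma_green, luma_blue = bt601_luma(*green), bt601_luma(*blue)
print("luma:", luma_green, luma_blue)
print("mean:", channel_mean(*green), channel_mean(*blue))
```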
366
+ ### Step 2 — Adaptive Otsu thresholding
367
+ A scale factor is applied to the standard Otsu threshold to handle slides where
368
+ faintly-stained tissue would otherwise be missed.
369
+
370
+ In **auto mode** (`threshold_scale=None`, the default) the scale is chosen by
371
+ sweeping from 1.0 to 2.0 in 0.02 steps and observing the foreground fraction
372
+ (proportion of pixels below the threshold). The scale just before the
373
+ foreground fraction explodes — the *elbow* of the curve — is selected and
374
+ blended towards 1.0: `scale = 0.3 × 1.0 + 0.7 × elbow_scale`. This avoids
375
+ both over-segmentation (too much background included) and under-segmentation
376
+ (faint tissue missed).
377
+
378
+ You can override this by passing an explicit `threshold_scale` value.
379
+
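The auto mode described above can be sketched as follows. This is not the package's implementation: the elbow rule used here (the scale just before the largest single-step jump in foreground fraction) is an assumption, the Otsu threshold is passed in rather than computed, and all function names are hypothetical.

```python
def foreground_fraction(pixels, threshold):
    """Fraction of grayscale pixels darker than the threshold."""
    return sum(p < threshold for p in pixels) / len(pixels)


def auto_threshold_scale(pixels, otsu, blend=0.3):
    """Sweep scales 1.0..2.0 in 0.02 steps and blend the elbow towards 1.0.

    Assumption: the "elbow" is the scale just before the largest
    single-step jump in foreground fraction.
    Returns (blended_scale, elbow_scale).
    """
    scales = [1.0 + 0.02 * i for i in range(51)]
    fractions = [foreground_fraction(pixels, otsu * s) for s in scales]
    jumps = [fractions[i + 1] - fractions[i] for i in range(len(scales) - 1)]
    elbow_scale = scales[jumps.index(max(jumps))]
    return blend * 1.0 + (1.0 - blend) * elbow_scale, elbow_scale


# Toy "image": dark tissue (60), faint tissue (115), bright background (155).
pixels = [60] * 20 + [115] * 10 + [155] * 70
scale, elbow = auto_threshold_scale(pixels, otsu=100)
print(f"elbow={elbow:.2f} blended scale={scale:.3f}")
```

In this toy case the faint tissue is picked up at a scale of about 1.16, well before the background floods in past 1.54, so the blended scale still captures it.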
380
+ ### Step 3 — Morphological closing + hole filling
381
+ `scipy.ndimage.binary_closing` (3 × 3 structuring element, 5 iterations) bridges small
382
+ intra-tissue gaps introduced by lightly-stained regions. `binary_fill_holes`
383
+ recovers tissue enclosed by stained borders.
384
+
385
+ ### Step 4 — Border-margin zeroing
386
+ A 5 % margin is zeroed on the **top**, **bottom**, and the side **opposite** to
387
+ the controls. This removes scan-border artifacts (edge staining, slide labels)
388
+ without discarding controls that extend to the near edge.
389
+
390
+ > **Edge case**: if `side="left"`, the left border is kept open; the right 5 %
391
+ > is zeroed. Vice-versa for `side="right"`.
392
+
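A pure-Python sketch of the margin zeroing, using nested lists in place of an image array. The rounding of the margin width and the function name are assumptions made for illustration.

```python
def zero_border_margins(mask, side="left", margin_frac=0.05):
    """Return a copy of a 2-D boolean mask with border margins cleared.

    Assumption: the margin is round(margin_frac * dimension) pixels wide.
    """
    height, width = len(mask), len(mask[0])
    my, mx = round(margin_frac * height), round(margin_frac * width)
    out = [row[:] for row in mask]
    for y in range(height):
        for x in range(width):
            in_top_or_bottom = y < my or y >= height - my
            # Only the side opposite to the controls is cleared.
            in_far_side = x >= width - mx if side == "left" else x < mx
            if in_top_or_bottom or in_far_side:
                out[y][x] = False
    return out


mask = [[True] * 20 for _ in range(20)]
cleared = zero_border_margins(mask, side="left")
print(sum(map(sum, cleared)), "of 400 pixels kept")
```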
393
+ ### Step 5 — Connected-component extraction
394
+ `scipy.ndimage.label` + `find_objects` extracts connected components in a single
395
+ label-array pass (O(H×W + Σ areas) rather than O(H×W × n_blobs)). Blobs
396
+ smaller than `min_tissue_area_px` are discarded.
397
+
398
+ ### Step 6 — Strip / main partition
399
+ Blobs whose centroid x-coordinate falls within the control strip
400
+ (`strip_width_frac × W` pixels from the control side) are designated *strip
401
+ blobs*; the remainder are *main tissue blobs*.
402
+
403
+ > **Edge case**: if no strip blobs survive, detection stops and returns an
404
+ > empty result.
405
+
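The partition rule can be sketched as below. The `centroid_x` key and the inclusive treatment of a centroid lying exactly on the strip boundary are assumptions; only `blob_id` appears elsewhere in this README.

```python
def partition_blobs(blobs, image_width, side="left", strip_width_frac=0.30):
    """Split blobs into (strip, main) by centroid x-coordinate.

    Assumption: each blob is a dict with a "centroid_x" key, and a
    centroid exactly on the strip boundary counts as a strip blob.
    """
    strip_px = strip_width_frac * image_width
    strip, main = [], []
    for blob in blobs:
        cx = blob["centroid_x"]
        if side == "left":
            in_strip = cx <= strip_px
        else:
            in_strip = cx >= image_width - strip_px
        (strip if in_strip else main).append(blob)
    return strip, main


blobs = [
    {"blob_id": 0, "centroid_x": 60},
    {"blob_id": 1, "centroid_x": 300},
    {"blob_id": 2, "centroid_x": 120},
]
strip, main = partition_blobs(blobs, image_width=512, side="left")
print([b["blob_id"] for b in strip], [b["blob_id"] for b in main])
```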
406
+ ### Step 7 — Aspect-ratio filter (line-artifact rejection)
407
+ Blobs with a bounding-box aspect ratio `max(w,h)/min(w,h) > max_aspect_ratio`
408
+ are rejected as line-like scan artifacts. Applied to **both** strip and main
409
+ blobs so that long thin artifacts do not pollute the reference distribution
410
+ used in the next step.
411
+
412
+ > **Edge case**: if all strip blobs are rejected by this filter, detection
413
+ > returns an empty result.
414
+
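As an illustration of the filter, the sketch below applies the stated ratio to bounding-box sizes. The `w` and `h` keys are hypothetical field names.

```python
def aspect_ratio(bbox_w, bbox_h):
    """Bounding-box aspect ratio, always >= 1."""
    return max(bbox_w, bbox_h) / min(bbox_w, bbox_h)


def reject_line_artifacts(blobs, max_aspect_ratio=5.0):
    """Return (kept, rejected); a blob is rejected only if ratio > limit."""
    kept, rejected = [], []
    for blob in blobs:
        too_thin = aspect_ratio(blob["w"], blob["h"]) > max_aspect_ratio
        (rejected if too_thin else kept).append(blob)
    return kept, rejected


blobs = [
    {"blob_id": 0, "w": 40, "h": 35},   # compact tissue
    {"blob_id": 1, "w": 200, "h": 8},   # long thin scan line
    {"blob_id": 2, "w": 10, "h": 50},   # ratio exactly 5.0, kept
]
kept, rejected = reject_line_artifacts(blobs)
print([b["blob_id"] for b in kept], [b["blob_id"] for b in rejected])
```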
415
+ ### Step 8 — Morphological dissimilarity scoring
416
+
417
+ This is the **core decision step**: each strip blob is compared to the
418
+ main-tissue population and assigned a scalar *dissimilarity score*. A blob
419
+ whose score is at least `dissimilarity_threshold` is accepted as a control tissue.
420
+
421
+ #### Feature vector
422
+
423
+ Eight descriptors are computed for every blob:
424
+
425
+ | Feature | Formula | Typical range | What it captures |
426
+ |---|---|---|---|
427
+ | `extent` | area / (bbox_w × bbox_h) | 0 – 1 | How densely the blob fills its bounding box |
428
+ | `aspect_ratio` | max(w,h) / min(w,h) | ≥ 1 | Overall elongation of the bounding box |
429
+ | `solidity` | area / convex-hull area | 0 – 1 | Degree of convexity; deeply notched blobs score low |
430
+ | `convexity` | hull perimeter / blob perimeter | 0 – 1 | Smoothness of the contour |
431
+ | `isoperimetric_ratio` | perim² / (4π × area) | ≥ 1 (circle = 1) | Compactness; complex, ragged blobs score high |
432
+ | `elongation` | λ_max / λ_min of inertia tensor | ≥ 1 | Principal-axis elongation independent of bounding box |
433
+ | `mean_intensity` | mean BT.601 grayscale within blob | 0 – 255 | Average staining darkness |
434
+ | `std_intensity` | std of BT.601 grayscale within blob | 0 – 128 | Staining heterogeneity |
435
+
436
+ The perimeter is computed once via binary erosion and reused for both
437
+ `convexity` and `isoperimetric_ratio`, avoiding redundant morphological passes.
438
+
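Three of the formulas in the table can be checked on idealised shapes. The sketch below takes area, bounding-box size and perimeter as given inputs (it does not measure them from a mask), and the function name is hypothetical.

```python
import math


def shape_features(area, bbox_w, bbox_h, perimeter):
    """Subset of the descriptors above, computed from scalar measurements."""
    return {
        "extent": area / (bbox_w * bbox_h),
        "aspect_ratio": max(bbox_w, bbox_h) / min(bbox_w, bbox_h),
        "isoperimetric_ratio": perimeter ** 2 / (4 * math.pi * area),
    }


r = 10.0
disc = shape_features(math.pi * r ** 2, 2 * r, 2 * r, 2 * math.pi * r)
square = shape_features(100.0, 10, 10, 40.0)
print("disc  :", disc)
print("square:", square)
```

An ideal disc gives an isoperimetric ratio of 1, the lower bound quoted in the table, while the square comes out at 4/π.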
439
+ > **Why these features?** IHC control tissues are typically small, compact,
440
+ > and uniformly stained punch-outs placed at the slide edge. In contrast, the
441
+ > main tissue fragment is large, irregularly shaped, and may have heterogeneous
442
+ > staining. Features such as `isoperimetric_ratio`, `solidity`, and
443
+ > `mean_intensity` tend to be the most discriminative because they capture
444
+ > compactness and staining level simultaneously.
445
+
446
+ #### Z-score formulation (normal path)
447
+
448
+ When the reference population contains **two or more** main-tissue blobs, the
449
+ dissimilarity score is the **maximum absolute Z-score** across all features
450
+ shared by the candidate and every reference blob:
451
+
452
+ ```
453
+ score = max_k |f_k(candidate) - mean_k(ref)| / std_k(ref)
454
+ ```
455
+
456
+ where `k` indexes the feature keys, `mean_k` and `std_k` are computed over all
457
+ main-tissue blobs, and the denominator is clamped to a minimum of 1 × 10⁻⁶ to avoid
458
+ division by zero when all reference blobs agree exactly on a feature.
459
+
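A minimal sketch of this score using only the standard library. It assumes blobs are represented as plain feature dicts and that the population standard deviation is used; neither detail is confirmed by this README, and the function name is hypothetical.

```python
import statistics


def zscore_dissimilarity(candidate, references, eps=1e-6):
    """Maximum absolute Z-score of the candidate over shared features.

    Assumption: population standard deviation over the reference blobs,
    clamped to a minimum of eps.
    """
    shared = set(candidate).intersection(*(set(r) for r in references))
    score = 0.0
    for key in shared:
        values = [r[key] for r in references]
        mean = statistics.fmean(values)
        std = max(statistics.pstdev(values), eps)  # avoid division by zero
        score = max(score, abs(candidate[key] - mean) / std)
    return score


references = [
    {"extent": 0.5, "mean_intensity": 100.0},
    {"extent": 0.6, "mean_intensity": 110.0},
    {"extent": 0.7, "mean_intensity": 120.0},
]
candidate = {"extent": 0.6, "mean_intensity": 160.0}
score = zscore_dissimilarity(candidate, references)
print(f"score={score:.2f}")
```

Here `extent` matches the reference mean almost exactly, so the score is driven entirely by `mean_intensity`.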
460
+ Taking the **maximum** (rather than an average or Euclidean norm) means the
461
+ score is driven by the single most deviant feature. This is intentional:
462
+ a control that is identical to main tissue in seven features but wildly
463
+ different in staining intensity should still be flagged.
464
+
465
+ **Worked example** — suppose the reference has three main-tissue blobs with
466
+ `isoperimetric_ratio` values `{1.05, 1.08, 1.07}`:
467
+
468
+ ```
469
+ mean = 1.0667
470
+ std = 0.01247 (population standard deviation)
471
+ candidate isoperimetric_ratio = 1.45
472
+ Z = |1.45 - 1.0667| / 0.01247 ≈ 30.7
473
+ ```
474
+
475
+ If all other Z-scores are below the threshold, this single feature drives the
476
+ score to about 30.7, which exceeds any sensible threshold, so the blob is accepted
477
+ as a control.
478
+
479
+ The default threshold of `2.0` corresponds to roughly two standard deviations:
480
+ blobs that are statistically typical of main tissue are rejected as controls,
481
+ while genuine controls (which are morphologically distinct in at least one
482
+ dimension) receive scores well above 2.0.
483
+
484
+ #### Single-reference fallback
485
+
486
+ When there is exactly **one** main-tissue blob, the sample standard deviation
487
+ is undefined. In this case the score falls back to the maximum *normalised
488
+ absolute difference*:
489
+
490
+ ```
491
+ score = max_k |f_k(candidate) - f_k(ref)| / range_k
492
+ ```
493
+
494
+ where `range_k` is a hand-tuned plausible range for each feature:
495
+
496
+ | Feature | `range_k` |
497
+ |---|---|
498
+ | `extent` | 1.0 |
499
+ | `aspect_ratio` | 9.0 |
500
+ | `solidity` | 1.0 |
501
+ | `convexity` | 1.0 |
502
+ | `isoperimetric_ratio` | 10.0 |
503
+ | `elongation` | 20.0 |
504
+ | `mean_intensity` | 255.0 |
505
+ | `std_intensity` | 128.0 |
506
+
507
+ This makes the fallback score dimensionless and comparable to the Z-score
508
+ regime so that the same `dissimilarity_threshold` remains meaningful.
509
+
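The fallback can be sketched with the ranges from the table above. Representing blobs as plain feature dicts is an assumption, and the function name is hypothetical.

```python
# Plausible ranges copied from the table above.
FEATURE_RANGES = {
    "extent": 1.0,
    "aspect_ratio": 9.0,
    "solidity": 1.0,
    "convexity": 1.0,
    "isoperimetric_ratio": 10.0,
    "elongation": 20.0,
    "mean_intensity": 255.0,
    "std_intensity": 128.0,
}


def single_reference_dissimilarity(candidate, reference):
    """Maximum range-normalised absolute difference over shared features."""
    shared = candidate.keys() & reference.keys() & FEATURE_RANGES.keys()
    return max(
        (abs(candidate[k] - reference[k]) / FEATURE_RANGES[k] for k in shared),
        default=0.0,  # no shared features: nothing to compare
    )


candidate = {"solidity": 0.95, "mean_intensity": 90.0, "aspect_ratio": 1.2}
reference = {"solidity": 0.60, "mean_intensity": 140.0, "aspect_ratio": 3.0}
score = single_reference_dissimilarity(candidate, reference)
print(f"score={score:.3f}")
```

Each difference is divided by its range before taking the maximum, so a 50-level intensity gap (about 0.196) counts for less than a 0.35 gap in `solidity`.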
510
+ #### No-reference edge case
511
+
512
+ When there is **no** main tissue at all (e.g. the entire image is background),
513
+ every strip blob is unconditionally accepted as a control and assigned an
514
+ infinite score (`float('inf')`). In practice this is rare but can occur with
515
+ very sparse or over-segmented images.
516
+
517
+ #### Tuning `dissimilarity_threshold`
518
+
519
+ | Threshold | Behaviour |
520
+ |---|---|
521
+ | Low (e.g. 0.5) | Accepts blobs that differ only modestly from main tissue; increases sensitivity but may cause false positives |
522
+ | Default (2.0) | ~2 σ deviation required; robust for typical IHC slides |
523
+ | High (e.g. 5.0) | Only accepts blobs that are extreme outliers; reduces false positives but may miss subtle controls |
524
+
525
+ Use `detect_controls_debug()` and the `visualize_debug()` panel **"4. Dissimilarity scores"**
526
+ to inspect per-blob scores and choose an appropriate threshold for your dataset.
527
+
528
+ A strip blob is accepted as a control if `score >= dissimilarity_threshold`.
529
+
530
+ ### Step 9 — Proximity rescue
531
+ A strip blob initially rejected by dissimilarity is **reinstated** if its
532
+ x-centroid lies within `control_proximity_px` pixels of an already-accepted
533
+ control. This handles cases where a single physical control tissue is split
534
+ into two blobs by a thin staining gap and only one part passes the dissimilarity
535
+ test.
536
+
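One way the rescue rule could look in code. The sketch assumes distance is measured between x-centroids and that a single pass is made (rescued blobs do not rescue further blobs); the `centroid_x` key and the function name are hypothetical.

```python
def proximity_rescue(accepted, rejected, control_proximity_px=50):
    """Reinstate rejected strip blobs that lie near an accepted control.

    Assumptions: x-centroid to x-centroid distance, single pass.
    Returns (accepted_after_rescue, still_rejected).
    """
    anchors = [b["centroid_x"] for b in accepted]
    rescued, still_rejected = [], []
    for blob in rejected:
        near = any(
            abs(blob["centroid_x"] - ax) <= control_proximity_px
            for ax in anchors
        )
        (rescued if near else still_rejected).append(blob)
    return accepted + rescued, still_rejected


accepted = [{"blob_id": 0, "centroid_x": 60}]
rejected = [
    {"blob_id": 1, "centroid_x": 95},   # 35 px from blob 0: rescued
    {"blob_id": 2, "centroid_x": 140},  # 80 px from blob 0: stays rejected
]
controls, leftover = proximity_rescue(accepted, rejected)
print([b["blob_id"] for b in controls], [b["blob_id"] for b in leftover])
```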
537
+ ### Step 10 — Spatial constraint
538
+ Accepted controls whose bounding box extends **into** the main-tissue bounding
539
+ box are rejected after all. This prevents a large tissue blob that straddles the
540
+ strip boundary from being misclassified as a control.
541
+
542
+ > **Edge case**: if all candidates are removed by this constraint, detection
543
+ > returns an empty result.
544
+
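A sketch of the constraint under stated assumptions: bounding boxes are `(x_min, y_min, x_max, y_max)` tuples under a hypothetical `bbox` key, the main-tissue bounding box is the union of all main blobs, and only horizontal intrusion is tested. The package may define the overlap differently.

```python
def apply_spatial_constraint(controls, main_blobs, side="left"):
    """Drop controls whose bbox reaches into the main-tissue bbox.

    Assumption: only the horizontal extent is compared.
    """
    if not main_blobs:
        return list(controls)
    main_x_min = min(b["bbox"][0] for b in main_blobs)
    main_x_max = max(b["bbox"][2] for b in main_blobs)

    def intrudes(blob):
        x_min, _, x_max, _ = blob["bbox"]
        if side == "left":
            return x_max > main_x_min
        return x_min < main_x_max

    return [b for b in controls if not intrudes(b)]


main_blobs = [{"blob_id": 9, "bbox": (220, 50, 480, 400)}]
controls = [
    {"blob_id": 0, "bbox": (20, 100, 90, 170)},    # clear of main tissue
    {"blob_id": 1, "bbox": (100, 200, 260, 330)},  # straddles the boundary
]
kept = apply_spatial_constraint(controls, main_blobs, side="left")
print([b["blob_id"] for b in kept])
```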
545
+ ### Step 11 — control_split_x
546
+ The split coordinate is placed at:
547
+ - `side="left"`: `min(max_control_right_edge + control_split_x_margin, W)`
548
+ - `side="right"`: `max(min_control_left_edge − control_split_x_margin, 0)`
549
+
550
+ This is the x-coordinate boundary that separates control tissue from main
551
+ tissue in downstream analysis.
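
The two formulas above translate directly into a small helper. The `bbox` layout `(x_min, y_min, x_max, y_max)`, the function name, and returning `None` when no controls remain are assumptions made for illustration.

```python
def compute_control_split_x(controls, image_width, side="left",
                            control_split_x_margin=50):
    """Place the split line beyond the outermost control edge, clamped
    to the image bounds. Returns None if there are no controls."""
    if not controls:
        return None
    if side == "left":
        right_edge = max(b["bbox"][2] for b in controls)
        return min(right_edge + control_split_x_margin, image_width)
    left_edge = min(b["bbox"][0] for b in controls)
    return max(left_edge - control_split_x_margin, 0)


controls = [
    {"blob_id": 0, "bbox": (20, 100, 92, 170)},
    {"blob_id": 1, "bbox": (15, 220, 70, 290)},
]
split_left = compute_control_split_x(controls, image_width=512, side="left")
print("split_x:", split_left)
```

With the outermost right edge at 92 px and the default margin of 50 px, the split lands at 142 px on a 512 px thumbnail.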