ars-sigma 1.0.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,940 @@
1
+ Metadata-Version: 2.4
2
+ Name: ars-sigma
3
+ Version: 1.0.0
4
+ Summary: Conditional inference trees for Python
5
+ Author: ArsChitectura SAS
6
+ License-Expression: LicenseRef-ArsChitectura-Sigma
7
+ Project-URL: Repository, https://github.com/arschitectura/sigma
8
+ Project-URL: Documentation, https://arschitectura.com/products/sigma/
9
+ Project-URL: Contact, https://arschitectura.com/contact/
10
+ Requires-Python: >=3.10
11
+ Description-Content-Type: text/markdown
12
+ License-File: LICENSE.txt
13
+ License-File: NOTICE.txt
14
+ Requires-Dist: numpy>=1.24
15
+ Requires-Dist: scipy>=1.11
16
+ Requires-Dist: scikit-learn>=1.3
17
+ Requires-Dist: typing_extensions>=4.0
18
+ Provides-Extra: viz
19
+ Requires-Dist: graphviz>=0.20; extra == "viz"
20
+ Requires-Dist: matplotlib>=3.8; extra == "viz"
21
+ Dynamic: license-file
22
+
23
+ # Sigma
24
+
25
+ <img src="https://arschitectura.com/medias/sigma_small.webp" alt="Sigma" width="200" height="200" align="right">
26
+
27
+ **Conditional inference trees for Python.**
28
+
29
+ Provides classification (`ClassificationTree`), regression
30
+ (`RegressionTree`), survival-analysis (`SurvivalTree`), and ranking
31
+ (`RankingTree`) estimators, compatible with scikit-learn.
32
+
33
+ - **Unbiased splits** - permutation-based p-values decouple variable selection from split search, avoiding CART's bias toward variables with many possible splits
34
+ - **Interpretable by construction** - each split is a statistical hypothesis test with a reported p-value, and fitted trees render to PNG/SVG via `to_image`
35
+ - **scikit-learn compatible** - `ClassificationTree`, `RegressionTree`, `SurvivalTree`, and `RankingTree` drop into any sklearn pipeline
36
+
37
+ Every statistical method in Sigma comes from a [peer-reviewed paper](#references).
38
+
39
+ ## 1. License
40
+
41
+ Governed by the [**Sigma License**](./LICENSE.txt). This is a
42
+ source-available license, not OSI-approved open source. Commercial use
43
+ is permitted with attribution. ArsChitectura SAS retains an at-will right to
44
+ revoke the license, at any time, for any reason. Licensee shall consult
45
+ [Licensor's organization website](https://arschitectura.com/products/sigma/) and
46
+ the [Software's project homepage](https://github.com/arschitectura/sigma) at
47
+ least once every ninety (90) days.
48
+
49
+ **AI, ML, and other automated ingestion of this library, its
50
+ documentation, or any derivative work is prohibited**, except to
51
+ generate your own client code that calls Sigma's public API.
52
+
53
+ Modification of Sigma is permitted only as preparation for a Contribution
54
+ to the Canonical Repository; see [`CONTRIBUTING.md`](./CONTRIBUTING.md)
55
+ for the lifecycle.
56
+
57
+ External contributors must sign the CLA in [`CONTRIBUTING.md`](./CONTRIBUTING.md)
58
+ before a pull request can be accepted.
59
+
60
+ If your needs exceed this, a paid, non-revocable commercial license is available on [request](https://arschitectura.com/contact/).
61
+
62
+ ## 2. Support
63
+
64
+ Read the [documentation](https://arschitectura.com/products/sigma/).
65
+
66
+ Have questions, feedback, or need help getting started? I would love to hear from you - [get in touch](https://arschitectura.com/contact/).
67
+
68
+ <div align="center">
69
+ <a href="https://arschitectura.com/contact/">
70
+ <img src="https://arschitectura.com/medias/card.webp" alt="Card" width="500" height="311">
71
+ </a>
72
+ </div>
73
+
74
+ ## 3. Installation
75
+
76
+ ```bash
77
+ pip install ars-sigma
78
+ ```
79
+
80
+ ## 4. Sample Trees
81
+
82
+ Four trees fitted on classic datasets. Each subsection shows the
83
+ fit code, the `to_text` rendering, and the rendered tree and response
84
+ images. Click an image to view it at full size.
85
+
86
+ ### 4.1. Titanic (classification)
87
+
88
+ Predicting survival probability with a Jeffreys 95% confidence
89
+ interval at each node - surfaces passenger class, sex, and age.
90
+
91
+ ```python
92
+ tree = sigma.ClassificationTree(random_state=123)
93
+ tree.fit(X, y)
94
+ print(tree.to_text(precision=1))
95
+ tree.to_image("png", "titanic.png", precision=1)
96
+ tree.to_image("png", "titanic_response.png", kind="response")
97
+ ```
98
+
99
+ ```
100
+ Died proba. Survived proba. Obs. count Obs. share Split p-value Leaf index
101
+ ---------------------- ---------------------- ---------- ---------- ------------- ----------
102
+ All records 59.6% (55.9% to 63.1%) 40.4% (36.9% to 44.1%) 712 100.0% 0.02%
103
+ ├── Passenger class is "1st" or "2nd" 43.1% (38.1% to 48.3%) 56.9% (51.7% to 61.9%) 357 50.1% 0.02%
104
+ │ ├── Sex is "female" 5.7% (2.9% to 10.2%) 94.3% (89.8% to 97.1%) 157 22.1% 6
105
+ │ └── Sex is "male" 72.5% (66.0% to 78.3%) 27.5% (21.7% to 34.0%) 200 28.1% 0.08%
106
+ │ ├── Passenger class is "1st" 60.4% (50.7% to 69.5%) 39.6% (30.5% to 49.3%) 101 14.2% 1.02%
107
+ │ │ ├── Age <= 53.0 53.2% (42.2% to 63.9%) 46.8% (36.1% to 57.8%) 79 11.1% 5
108
+ │ │ └── Age > 53.0 86.4% (67.9% to 96.0%) 13.6% (4.0% to 32.1%) 22 3.1% 2
109
+ │ └── Passenger class is "2nd" 84.8% (76.8% to 90.9%) 15.2% (9.1% to 23.2%) 99 13.9% 0.62%
110
+ │ ├── Age <= 12.0 0% (0% to 23.8%) 100% (76.2% to 100%) 9 1.3% 7
111
+ │ └── Age > 12.0 93.3% (86.8% to 97.2%) 6.7% (2.8% to 13.2%) 90 12.6% 1
112
+ └── Passenger class is "3rd" 76.1% (71.4% to 80.3%) 23.9% (19.7% to 28.6%) 355 49.9% 0.02%
113
+ ├── Sex is "female" 53.9% (44.3% to 63.4%) 46.1% (36.6% to 55.7%) 102 14.3% 4
114
+ └── Sex is "male" 85.0% (80.2% to 89.0%) 15.0% (11.0% to 19.8%) 253 35.5% 3
115
+ ```
116
+
117
+ <table>
118
+ <tr>
119
+ <td><a href="https://arschitectura.com/medias/sigma_titanic.png"><img src="https://arschitectura.com/medias/sigma_titanic.png" alt="Tree fitted on the Titanic dataset"></a></td>
120
+ </tr>
121
+ <tr>
122
+ <td><a href="https://arschitectura.com/medias/sigma_titanic_response.png"><img src="https://arschitectura.com/medias/sigma_titanic_response.png" alt="Response plot for the Titanic dataset"></a></td>
123
+ </tr>
124
+ </table>
125
+
126
+ ### 4.2. Diabetes (regression)
127
+
128
+ Predicting one-year disease progression with a Bayesian-bootstrap 95%
129
+ confidence interval at each node - surfaces BMI, triglycerides, and blood
130
+ pressure. BMI splits recursively, so the example calls `compact()` (see
131
+ section 5.5) to fold that chain into one multi-way node.
132
+
133
+ ```python
134
+ tree = sigma.RegressionTree(
135
+ test_type="monte_carlo",
136
+ resamples=2000,
137
+ random_state=123,
138
+ reverse_order=True,
139
+ )
140
+ tree.fit(X, y)
141
+ compact_tree = tree.compact()
142
+ print(compact_tree.to_text(precision=1))
143
+ compact_tree.to_image(
144
+ "png", "diabetes.png", orientation="left-to-right", precision=1
145
+ )
146
+ compact_tree.to_image("png", "diabetes_response.png", kind="response")
147
+ ```
148
+
149
+ ```
150
+ Disease progression mean Obs. count Obs. share Split p-value Leaf index
151
+ ------------------------ ---------- ---------- ------------- ----------
152
+ All records 152.1 (145.1 to 159.4) 442 100.0%
153
+ ├── BMI <= 24.4 105.0 (97.6 to 112.9) 165 37.3% 0.05%
154
+ │ ├── Triglycerides (log) <= 4.6 93.9 (86.7 to 101.4) 123 27.8% 8
155
+ │ └── Triglycerides (log) > 4.6 137.7 (121.6 to 153.6) 42 9.5% 6
156
+ ├── 24.4 < BMI <= 27.2 144.0 (131.5 to 157.1) 112 25.3% 0.05%
157
+ │ ├── Total-to-HDL ratio <= 4.8 118.4 (104.3 to 133.9) 65 14.7% 0.05%
158
+ │ │ ├── Triglycerides (log) <= 4.8 102.2 (89.5 to 116.4) 53 12.0% 7
159
+ │ │ └── Triglycerides (log) > 4.8 190.0 (162.1 to 217.8) 12 2.7% 3
160
+ │ └── Total-to-HDL ratio > 4.8 179.3 (160.8 to 197.9) 47 10.6% 4
161
+ └── BMI > 27.2 204.8 (193.8 to 215.4) 165 37.3% 0.05%
162
+ ├── Blood pressure <= 111.8 189.3 (176.8 to 201.6) 124 28.1% 0.05%
163
+ │ ├── Triglycerides (log) <= 5.1 171.1 (156.4 to 186.8) 80 18.1% 5
164
+ │ └── Triglycerides (log) > 5.1 222.5 (205.0 to 238.2) 44 10.0% 2
165
+ └── Blood pressure > 111.8 251.4 (235.1 to 265.8) 41 9.3% 1
166
+ ```
167
+
168
+ <table>
169
+ <tr>
170
+ <td><a href="https://arschitectura.com/medias/sigma_diabetes.png"><img src="https://arschitectura.com/medias/sigma_diabetes.png" alt="Tree fitted on the Diabetes dataset"></a></td>
171
+ </tr>
172
+ <tr>
173
+ <td><a href="https://arschitectura.com/medias/sigma_diabetes_response.png"><img src="https://arschitectura.com/medias/sigma_diabetes_response.png" alt="Response plot for the Diabetes dataset"></a></td>
174
+ </tr>
175
+ </table>
176
+
177
+ ### 4.3. GBSG-2 breast cancer (survival)
178
+
179
+ Predicting recurrence-free years with a Brookmeyer-Crowley 95%
180
+ confidence interval at each node - splits on tumor size, positive lymph
180
+ nodes, age, hormone therapy, and progesterone receptor level.
182
+
183
+ ```python
184
+ tree = sigma.SurvivalTree(
185
+ random_state=123,
186
+ metrics=("median", ("survival", 5.0, "years")),
187
+ )
188
+ tree.fit(X, y)
189
+ print(tree.to_text(precision=1))
190
+ tree.to_image("png", "breast_cancer.png", precision=1)
191
+ tree.to_image("png", "breast_cancer_response.png", kind="response")
192
+ ```
193
+
194
+ ```
195
+ Median Recurrence-free years Survival at 5 years Obs. count Obs. share Split p-value Leaf index
196
+ ---------------------------- ---------------------- ---------- ---------- ------------- ----------
197
+ All records 4.9 (4.2 to 5.5) 49.2% (44.6% to 53.6%) 686 100.0% 0.02%
198
+ ├── Tumor size <= 19 unknown (5.4 to unknown) 65.6% (55.4% to 73.9%) 135 19.7% 0.04%
199
+ │ ├── Positive lymph nodes <= 2 unknown (unknown bounds) 80.7% (68.2% to 88.7%) 77 11.2% 7
200
+ │ └── Positive lymph nodes > 2 4.7 (2.6 to 5.5) 44.7% (29.2% to 59.1%) 58 8.5% 3.52%
201
+ │ ├── Age > 41 5.4 (3.5 to unknown) 54.6% (36.2% to 69.8%) 49 7.1% 5
202
+ │ └── Age <= 41 1.3 (0.7 to 3.2) 0% (0% to 0%) 9 1.3% 1
203
+ └── Tumor size > 19 4.3 (3.7 to 5.0) 44.9% (39.7% to 49.9%) 551 80.3% 0.02%
204
+ ├── Positive lymph nodes <= 4 5.7 (4.8 to unknown) 54.7% (47.7% to 61.0%) 332 48.4% 1.10%
205
+ │ ├── Hormone therapy is true unknown (5.6 to unknown) 67.7% (56.3% to 76.7%) 114 16.6% 6
206
+ │ └── Hormone therapy is false 4.8 (3.9 to unknown) 47.1% (38.3% to 55.4%) 218 31.8% 4
207
+ └── Positive lymph nodes > 4 2.4 (2.0 to 3.1) 29.9% (22.6% to 37.5%) 219 31.9% 0.10%
208
+ ├── Progesterone receptor level > 24 4.1 (2.7 to 5.5) 43.8% (31.4% to 55.4%) 107 15.6% 3
209
+ └── Progesterone receptor level <= 24 1.7 (1.4 to 2.2) 17.3% (10.0% to 26.3%) 112 16.3% 2
210
+ ```
211
+
212
+ <table>
213
+ <tr>
214
+ <td><a href="https://arschitectura.com/medias/sigma_breast_cancer.png"><img src="https://arschitectura.com/medias/sigma_breast_cancer.png" alt="Tree fitted on the GBSG-2 breast cancer dataset"></a></td>
215
+ </tr>
216
+ <tr>
217
+ <td><a href="https://arschitectura.com/medias/sigma_breast_cancer_response.png"><img src="https://arschitectura.com/medias/sigma_breast_cancer_response.png" alt="Response plot for the GBSG-2 breast cancer dataset"></a></td>
218
+ </tr>
219
+ </table>
220
+
221
+ ### 4.4. Sushi (ranking)
222
+
223
+ Predicting per-item Plackett-Luce expected rank with a Bayesian-bootstrap
224
+ 95% confidence interval at each node - surfaces sex and age group as the
225
+ strongest demographic drivers of sushi preference among 5000 Japanese
226
+ respondents who ranked ten classic sushi items. Age group splits recursively on the
227
+ male side, so the example calls `compact()` (see section 5.5) to fold that
228
+ chain into one multi-way node.
229
+
230
+ ```python
231
+ tree = sigma.RankingTree(
232
+ pca_components=10,
233
+ random_state=123,
234
+ max_depth=3,
235
+ )
236
+ tree.fit(X, rankings)
237
+ compact_tree = tree.compact()
238
+ print(compact_tree.to_text(precision=2))
239
+ compact_tree.to_image(
240
+ "png", "sushi.png", orientation="left-to-right", precision=2
241
+ )
242
+ compact_tree.to_image("png", "sushi_response.png", kind="response")
243
+ ```
244
+
245
+ ```
246
+ Ebi rank Anago rank Maguro rank Uni rank Tamago rank Obs. count Obs. share Split p-value Leaf index
247
+ ------------------- ------------------- ------------------- ------------------- ------------------- ---------- ---------- ------------- ----------
248
+ All records 4.94 (4.86 to 5.01) 5.39 (5.32 to 5.46) 4.37 (4.31 to 4.43) 6.07 (5.98 to 6.16) 3.24 (3.15 to 3.31) 5000 100.0% <1e-300
249
+ ├── Gender is "male" 5.16 (5.05 to 5.25) 5.15 (5.06 to 5.28) 4.22 (4.11 to 4.33) 5.66 (5.50 to 5.77) 2.90 (2.82 to 3.01) 2373 47.5%
250
+ │ ├── Age group is "30-39" 5.27 (5.14 to 5.43) 5.04 (4.83 to 5.23) 4.31 (4.18 to 4.45) 5.53 (5.30 to 5.75) 2.83 (2.71 to 2.99) 830 16.6% 6
251
+ │ ├── Age group is "40-49", "50-59", or "60+" 5.21 (5.04 to 5.39) 5.28 (5.12 to 5.45) 4.29 (4.15 to 4.44) 5.03 (4.80 to 5.23) 2.96 (2.81 to 3.13) 884 17.7% 0.60%
252
+ │ │ ├── Childhood region is "Tohoku", "Hokuriku", "Kanto+Shizuoka", "Nagoya", "Kinki", "Chugoku", or "Okinawa" 5.31 (5.13 to 5.50) 5.17 (4.97 to 5.34) 4.29 (4.14 to 4.43) 5.10 (4.88 to 5.33) 2.94 (2.76 to 3.15) 735 14.7% 7
253
+ │ │ └── Childhood region is "Hokkaido", "Shikoku", "Kyushu", or "abroad" 4.74 (4.36 to 5.13) 5.81 (5.37 to 6.23) 4.32 (3.98 to 4.66) 4.67 (4.23 to 5.19) 3.03 (2.66 to 3.35) 149 3.0% 3
254
+ │ └── Age group is "15-19" or "20-29" 4.96 (4.78 to 5.14) 5.14 (4.95 to 5.34) 4.01 (3.84 to 4.17) 6.54 (6.30 to 6.73) 2.93 (2.73 to 3.15) 659 13.2% 5
255
+ └── Gender is "female" 4.73 (4.64 to 4.82) 5.62 (5.53 to 5.72) 4.52 (4.43 to 4.59) 6.44 (6.32 to 6.55) 3.54 (3.43 to 3.66) 2627 52.5% <1e-300
256
+ ├── Age group is "20-29", "30-39", "40-49", "50-59", or "60+" 4.73 (4.63 to 4.82) 5.55 (5.44 to 5.66) 4.56 (4.48 to 4.65) 6.30 (6.13 to 6.41) 3.57 (3.46 to 3.68) 2429 48.6% <1e-300
257
+ │ ├── Childhood region is "Hokuriku", "Kanto+Shizuoka", "Kinki", "Chugoku", "Kyushu", or "abroad" 4.90 (4.78 to 5.01) 5.43 (5.28 to 5.57) 4.54 (4.46 to 4.63) 6.44 (6.31 to 6.59) 3.50 (3.38 to 3.62) 1833 36.7% 4
258
+ │ └── Childhood region is "Hokkaido", "Tohoku", "Nagoya", "Shikoku", or "Okinawa" 4.18 (4.02 to 4.38) 5.90 (5.70 to 6.12) 4.64 (4.47 to 4.82) 5.84 (5.55 to 6.13) 3.78 (3.61 to 3.99) 596 11.9% 1
259
+ └── Age group is "15-19" 4.73 (4.37 to 5.11) 6.46 (6.08 to 6.79) 3.96 (3.68 to 4.30) 7.90 (7.60 to 8.25) 3.25 (2.88 to 3.64) 198 4.0% 2
260
+ ```
261
+
262
+ <table>
263
+ <tr>
264
+ <td><a href="https://arschitectura.com/medias/sigma_sushi.png"><img src="https://arschitectura.com/medias/sigma_sushi.png" alt="Tree fitted on the Sushi preference dataset"></a></td>
265
+ </tr>
266
+ <tr>
267
+ <td><a href="https://arschitectura.com/medias/sigma_sushi_response.png"><img src="https://arschitectura.com/medias/sigma_sushi_response.png" alt="Response plot for the Sushi preference dataset"></a></td>
268
+ </tr>
269
+ </table>
270
+
271
+ ## 5. Advanced usage
272
+
273
+ ### 5.1. Controlling tree depth and node size
274
+
275
+ `alpha` is the principal knob: it sets the significance threshold for
276
+ every split test, so lowering it produces a terser, more statistically
277
+ conservative tree, and raising it produces a richer, more exploratory
278
+ one. `min_splits`, `min_buckets`, and `max_depth` are secondary safety
279
+ bounds, shared between `RegressionTree` and `ClassificationTree`.
280
+
281
+ ```python
282
+ tree = ClassificationTree(
283
+ max_depth=4, # maximum tree depth (None = unlimited)
284
+ )
285
+ ```
286
+
287
+ ### 5.2. Fitting with sample weights
288
+
289
+ Sample weights let you model **variable exposures** - per-row
290
+ time-at-risk, insurance policy-years, or frequency weights for
291
+ pre-aggregated rows. A weight of `k` is equivalent to observing the
292
+ sample `k` times.
293
+
294
+ ```python
295
+ import numpy
296
+ from sigma import RegressionTree
297
+
298
+ n = 200
299
+ X = numpy.random.randn(n, 2)
300
+ claim_amount = numpy.where(X[:, 0] > 0, 1500.0, 300.0) + 100 * numpy.random.randn(n)
301
+ exposure_years = numpy.random.uniform(0.1, 2.0, size=n)
302
+
303
+ tree = RegressionTree()
304
+ tree.fit(X, claim_amount, sample_weight=exposure_years)
305
+ predictions = tree.predict(X)
306
+ ```
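The frequency-weight reading above can be checked without Sigma. The following is a minimal standard-library sketch; `weighted_mean` is a hypothetical helper, not part of Sigma's API. It shows that a row with integer weight `k` contributes to a weighted mean exactly as `k` repeated copies of that row would.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean; a hypothetical helper, not Sigma's API."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Pre-aggregated rows: each value was observed `weight` times.
values = [300.0, 1500.0, 900.0]
weights = [3, 1, 2]

# Expand the aggregated rows back into one row per observation.
expanded = [v for v, w in zip(values, weights) for _ in range(w)]

aggregated_mean = weighted_mean(values, weights)
expanded_mean = sum(expanded) / len(expanded)
```

Both means equal `700.0`, which is the equivalence the paragraph describes. Fractional weights such as exposure years extend the same formula, although they no longer correspond to literal row repetition.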
307
+
308
+ ### 5.3. Visualizing the tree
309
+
310
+ Install the optional visualization extra and the Graphviz system
311
+ binary (`brew install graphviz` on macOS):
312
+
313
+ ```bash
314
+ pip install "ars-sigma[viz]"
315
+ ```
316
+
317
+ Then render to PNG, PDF, SVG, or GIF:
318
+
319
+ ```python
320
+ tree.to_image("png", "tree.png", feature_names=["feature_1", "feature_2"], response_name="y")
321
+ ```
322
+
323
+ PNG and PDF additionally require `cairosvg`; SVG needs only the
324
+ Graphviz binary. See `to_image` and `export_graphviz` for the full set
325
+ of display options.
326
+
327
+ ### 5.4. Exporting the tree as a SQL CASE expression
328
+
329
+ `to_sql` (and the module-level `sigma.export_sql`) emits a single SQL
330
+ `CASE` expression that reproduces `tree.predict` row-by-row in any
331
+ SQL-92/SQL-99 engine, with no extra dependencies:
332
+
333
+ ```python
334
+ sql_expression = tree.to_sql()
335
+ print(sql_expression)
336
+ # SELECT id, (<sql_expression>) AS prediction FROM points;
337
+ ```
338
+
339
+ ```sql
340
+ CASE
341
+ WHEN "Passenger class" IN ('1st', '2nd') THEN
342
+ CASE
343
+ WHEN "Sex" = 'female' THEN
344
+ 0.9426751592356688 -- Leaf 6
345
+ WHEN "Sex" = 'male' THEN
346
+ CASE
347
+ WHEN "Passenger class" = '1st' THEN
348
+ CASE
349
+ WHEN "Age" <= 53.0 THEN
350
+ 0.46835443037974683 -- Leaf 5
351
+ WHEN "Age" > 53.0 THEN
352
+ 0.13636363636363635 -- Leaf 2
353
+ ELSE NULL
354
+ END
355
+ WHEN "Passenger class" = '2nd' THEN
356
+ CASE
357
+ WHEN "Age" <= 12.0 THEN
358
+ 1.0 -- Leaf 7
359
+ WHEN "Age" > 12.0 THEN
360
+ 0.06666666666666667 -- Leaf 1
361
+ ELSE NULL
362
+ END
363
+ ELSE 0.275
364
+ END
365
+ ELSE 0.5686274509803921
366
+ END
367
+ WHEN "Passenger class" = '3rd' THEN
368
+ CASE
369
+ WHEN "Sex" = 'female' THEN
370
+ 0.46078431372549017 -- Leaf 4
371
+ WHEN "Sex" = 'male' THEN
372
+ 0.15019762845849802 -- Leaf 3
373
+ ELSE 0.23943661971830985
374
+ END
375
+ ELSE 0.4044943820224719
376
+ END
377
+ ```
378
+
379
+ For `ClassificationTree`, pass `target_class=` to pick which class
380
+ probability the expression should emit. Categorical values not seen at
381
+ fit time evaluate to the holding node's prediction, mirroring
382
+ `tree.predict`. `NULL` numerical or boolean inputs fall through to
383
+ `ELSE NULL`; wrap in `COALESCE(..., default)` to substitute a fallback
384
+ value.
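The `NULL` behavior can be reproduced with the standard-library `sqlite3` module. This is a sketch under assumptions: the table name `passengers`, the fallback value `0.15`, and the hand-copied fragment of the `CASE` expression are illustrative only, and SQLite stands in for whichever engine you target.

```python
import sqlite3

# Hand-copied fragment of the expression above (the 2nd-class male subtree).
case_expression = """
CASE
    WHEN "Age" <= 12.0 THEN 1.0
    WHEN "Age" > 12.0 THEN 0.06666666666666667
    ELSE NULL
END
"""

connection = sqlite3.connect(":memory:")
connection.execute('CREATE TABLE passengers (id INTEGER, "Age" REAL)')
connection.executemany(
    "INSERT INTO passengers VALUES (?, ?)",
    [(1, 8.0), (2, 40.0), (3, None)],  # row 3 has a missing age
)

# Without COALESCE, the NULL age fails both comparisons and hits ELSE NULL.
raw = connection.execute(
    f"SELECT id, ({case_expression}) FROM passengers ORDER BY id"
).fetchall()

# With COALESCE, the NULL prediction is replaced by a chosen fallback.
filled = connection.execute(
    f"SELECT id, COALESCE(({case_expression}), 0.15) FROM passengers ORDER BY id"
).fetchall()
connection.close()
```

Row 3 comes back as `None` in `raw` and as `0.15` in `filled`; the other rows are unchanged by the `COALESCE` wrapper.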
385
+
386
+ ### 5.5. Collapsing recursive splits with `compact()`
387
+
388
+ When the same feature is split over several consecutive levels, a binary
389
+ tree repeats that feature down a chain. `compact()` returns a new tree in
390
+ which each such chain is collapsed into a single multi-way node, with one
391
+ branch per resulting interval (numeric features) or category subset
392
+ (categorical features):
393
+
394
+ ```python
395
+ compact_tree = tree.compact()
396
+ print(compact_tree.to_text())
397
+ ```
398
+
399
+ The compacted tree predicts identically to the original and renders
400
+ through the same `to_text`, `to_image`, and `to_sql` methods. A merged
401
+ node spans several original splits, so it reports no single split
402
+ p-value. For example, a chain of consecutive splits on `Age`
403
+
404
+ ```
405
+ ├── Age <= 30
406
+ └── Age > 30
407
+ ├── Age <= 50
408
+ └── Age > 50
409
+ ```
410
+
411
+ collapses into one node carrying three interval branches:
412
+
413
+ ```
414
+ ├── Age <= 30
415
+ ├── 30 < Age <= 50
416
+ └── Age > 50
417
+ ```
418
+
419
+ The original tree is left unchanged; `compact()` produces an independent
420
+ copy whose node ids are renumbered to match its smaller shape.
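For a numeric feature, the collapse amounts to turning the chain's thresholds into consecutive intervals. The sketch below covers only that labeling step and assumes the thresholds have already been collected from the chain; `interval_labels` is a hypothetical helper, not Sigma's implementation.

```python
def interval_labels(feature, thresholds):
    """Label the intervals cut by `thresholds` (hypothetical helper)."""
    cuts = sorted(set(thresholds))
    if not cuts:
        return [feature]  # no split: a single unconstrained branch
    labels = [f"{feature} <= {cuts[0]}"]
    for lower, upper in zip(cuts, cuts[1:]):
        labels.append(f"{lower} < {feature} <= {upper}")
    labels.append(f"{feature} > {cuts[-1]}")
    return labels


labels = interval_labels("Age", [30, 50])
```

Here `labels` is `["Age <= 30", "30 < Age <= 50", "Age > 50"]`, matching the three branches shown above. In general a chain with `n` distinct thresholds yields `n + 1` branches.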
421
+
422
+ ## 6. Parameters
423
+
424
+ The table below is a quick reference; each parameter has a dedicated
425
+ subsection further down with defaults, alternatives, and guidance on
426
+ when to choose each option.
427
+
428
+ | Parameter | Description |
429
+ | :-------------------------------- | :---------------------------------------------------------------------------- |
430
+ | `correlation` | Rank-transform inputs (robust) or use raw values (classical) |
431
+ | `test_stat` | How the multivariate score is aggregated into a scalar test statistic |
432
+ | `test_type` | Multiplicity adjustment applied across covariates |
433
+ | `alpha` | Significance level for the stopping rule |
434
+ | `min_splits` | Minimum sum of weights required to attempt a split |
435
+ | `min_buckets` | Minimum sum of weights in each child node |
436
+ | `max_depth` | Maximum tree depth |
437
+ | `categorical_features` | Which feature columns are categorical |
438
+ | `ci_method` (classification tree) | Confidence interval method for per-class proportions |
439
+ | `ci_method` (regression tree) | Confidence interval method for node mean predictions |
440
+ | `ci_method` (ranking tree) | Confidence interval method for per-item leaf PL-MLE expected-rank predictions |
441
+ | `npseudo` | Turner ghost-item pseudo-comparison weight for the per-node PL fit |
442
+ | `pl_max_iter` | Maximum Hunter MM iterations per node's Plackett-Luce fit |
443
+ | `pl_tolerance` | Convergence tolerance on log-worth for the Hunter MM iteration |
444
+ | `ci_coverage` | Coverage level for node-prediction confidence intervals |
445
+ | `transmuter` | Per-node data transform with post-hoc split validation |
446
+ | `resamples` | Number of permutations for `test_type="monte_carlo"` |
447
+ | `decorator` | Per-node decoration callable rendered by `to_text` / `to_image` |
448
+ | `random_state` | RNG seed for permutation resampling, bootstrap CI methods, and plot jitter |
449
+
450
+ ### 6.1. `correlation`
451
+
452
+ **Default**: `"rank"`.
453
+
454
+ Score function for the test statistic.
455
+
456
+ - `"normal"` uses raw values, recovering the original Pearson-like
457
+ behavior from Hothorn et al. (2006). Choose this when the response is
458
+ well-behaved (approximately Gaussian, no heavy outliers) and you want
459
+ the slight power gain on truly linear associations.
460
+ - `"rank"` (default) rank-transforms continuous covariates and, for
461
+ regression, the response before computing the statistic, yielding a
462
+ Spearman-like nonparametric test. Robust to outliers and heavy tails.
463
+ The safe choice for arbitrary real-world data.
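The practical difference between the two score functions can be illustrated with plain correlations. This is a simplified, standard-library sketch: it compares a Pearson correlation on raw values with the same correlation on ranks (Spearman), which is the flavor of each option rather than Sigma's actual test statistic. The helper names are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            result[order[position]] = average_rank
        start = end + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return covariance / (spread_x * spread_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 600.0]  # monotone, with one heavy outlier

raw_correlation = pearson(x, y)                 # "normal"-style score
rank_correlation = pearson(ranks(x), ranks(y))  # "rank"-style score
```

The single outlier drags the raw correlation down to about 0.66, while the rank correlation stays at 1 because the relationship is still perfectly monotone.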
464
+
465
+ ### 6.2. `test_stat`
466
+
467
+ **Default**: `"quadratic"`.
468
+
469
+ How the multivariate score is aggregated into a scalar test statistic.
470
+
471
+ - `"maximum"` is a maximum-type statistic that concentrates power on
472
+ alternatives where one component dominates. Choose this when you
473
+ expect a single-direction effect (e.g., a binary classification where
474
+ only one class differs from the rest).
475
+ - `"quadratic"` (default) is an omnibus chi-squared form with good
476
+ power across general alternatives. Choose this when you have no prior
477
+ on the direction of association, or when the response is multivariate
478
+ (multi-class classification with many classes).
479
+
480
+ ### 6.3. `test_type`
481
+
482
+ **Default**: `"sidak"`.
483
+
484
+ Multiplicity adjustment applied across the $m$ candidate covariates
485
+ (indexed by $j$), transforming each raw p-value $P_j$ before the
486
+ stopping rule fires.
487
+
488
+ - `"bonferroni"` is the closed-form $\min(m P_j, 1)$. The simplest and
489
+ best-known correction; its adjusted p-value is never smaller than
489
+ Sidak's, so prefer `"sidak"` unless matching an external reference.
491
+ - `"monte_carlo"` is the Westfall-Young min-P resampling procedure.
492
+ More powerful than Sidak when covariates are correlated, at the cost
493
+ of $B \cdot m$ extra statistic evaluations per node, where $B$ is the
494
+ number of response permutations (controlled by `resamples`). Choose
495
+ when covariates are highly collinear and you can afford the
496
+ resampling budget; requires a positive `resamples`.
497
+ - `"sidak"` (default) is the closed-form $1 - (1 - P_j)^m$. Valid
498
+ under independence or positive dependence of test statistics. The
499
+ recommended default.
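The two closed-form adjustments are short enough to write out. The sketch below implements the formulas exactly as stated above and then applies the `alpha` stopping rule; the raw p-values are made up for illustration, and the helper names are hypothetical rather than Sigma's API.

```python
def bonferroni(p_value, m):
    """Bonferroni adjustment: min(m * P_j, 1)."""
    return min(m * p_value, 1.0)


def sidak(p_value, m):
    """Sidak adjustment: 1 - (1 - P_j)^m."""
    return 1.0 - (1.0 - p_value) ** m


raw_p_values = [0.004, 0.03, 0.20, 0.75]  # one raw P_j per candidate covariate
m = len(raw_p_values)

bonferroni_adjusted = [bonferroni(p, m) for p in raw_p_values]
sidak_adjusted = [sidak(p, m) for p in raw_p_values]

# Stopping rule: recursion stops when the smallest adjusted p-value
# exceeds alpha.
alpha = 0.05
stop = min(sidak_adjusted) > alpha
```

With these four hypothetical p-values, the smallest Sidak-adjusted value is about 0.0159, below `alpha = 0.05`, so the node would be split. Each Bonferroni-adjusted value is at least as large as its Sidak counterpart.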
500
+
501
+ ### 6.4. `alpha`
502
+
503
+ **Default**: `0.05`.
504
+
505
+ Significance level for the stopping rule. Recursion stops at a node
506
+ when $\min_j(\text{adjusted } P_j) > \alpha$.
507
+
508
+ The default `0.05` is a good choice for simple, exploratory analysis.
509
+ For trees fitted on very large datasets, or on correlated records
510
+ where the independence assumption is partially broken, tighten
511
+ `alpha` by one or two orders of magnitude (`0.005` or `0.0005`) to
512
+ keep the tree compact. For models aiming at higher predictive
513
+ accuracy (closer to a full-fledged machine learning model), loosen
514
+ `alpha` to between `0.10` and `0.25`. Tune in concert with
515
+ `max_depth`, `min_splits`, and `min_buckets`.
516
+
517
+ ### 6.5. `min_splits`
518
+
519
+ **Default**: `20`.
520
+
521
+ Minimum sum of weights required to attempt a split. Nodes whose weight
522
+ sum falls below this become leaves regardless of p-values. Increase to
523
+ enforce statistical reliability of node-level estimates on smaller
524
+ subsets, decrease to allow finer partitioning.
525
+
526
+ ### 6.6. `min_buckets`
527
+
528
+ **Default**: `7`.
529
+
530
+ Minimum sum of weights in each child node. Splits that would produce a
531
+ child smaller than this are rejected. Together with `min_splits`,
532
+ controls the smallest leaf permitted; raise both for noisier data.
533
+
534
+ ### 6.7. `max_depth`
535
+
536
+ **Default**: `None` (no limit).
537
+
538
+ Maximum tree depth. Set to a small integer for shallow, easily interpreted
539
+ trees. Leave `None` to let the p-value stopping rule fully control depth.
540
+
541
+ ### 6.8. `categorical_features`
542
+
543
+ **Default**: `None` (all numeric).
544
+
545
+ List of feature columns to treat as categorical. Entries may be
546
+ column-name strings (resolved against the DataFrame columns at fit
547
+ time, i.e. `feature_names_in_`) or integer column indices; mixing the
548
+ two forms is allowed. Letting $K$ denote the number of levels in a
549
+ categorical feature, Sigma uses exhaustive split enumeration for
550
+ $K \le 10$ and an ordered-merge heuristic for $K > 10$ (see the
551
+ Algorithm section).
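A categorical feature with $K$ levels admits $2^{K-1} - 1$ distinct binary splits, so exhaustive enumeration doubles in cost with every extra level. The sketch below enumerates those splits with the standard library to make the count concrete; it illustrates the combinatorics only and is not Sigma's split search.

```python
from itertools import combinations


def binary_partitions(levels):
    """Yield every split of `levels` into two non-empty groups, once each."""
    levels = list(levels)
    anchor, rest = levels[0], levels[1:]
    # Fixing `anchor` on the left avoids emitting each split twice.
    for size in range(len(rest) + 1):
        for chosen in combinations(rest, size):
            left = (anchor, *chosen)
            right = tuple(level for level in rest if level not in chosen)
            if right:  # skip the trivial split with an empty right side
                yield left, right


classes = ["1st", "2nd", "3rd"]
splits = list(binary_partitions(classes))
count_for_ten_levels = sum(1 for _ in binary_partitions(range(10)))
```

Three levels give 3 candidate splits and ten levels give 511; an eleventh level would already require 1023.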
552
+
553
+ ### 6.9. `ci_method` (`RegressionTree` only)
554
+
555
+ **Default**: `"bayesian_bootstrap"`.
556
+
557
+ Method for the confidence interval on each node's mean prediction. In
558
+ the descriptions below, $y$ denotes the per-row response, $n$ the
559
+ sample size at the node, and $n_{\text{eff}}$ the Kish effective
560
+ sample size at the node.
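For reference, the Kish effective sample size is $n_{\text{eff}} = (\sum_i w_i)^2 / \sum_i w_i^2$. The sketch below computes it with a hypothetical helper (not Sigma's API) to show how uneven weights shrink the effective sample.

```python
def kish_effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


equal = kish_effective_sample_size([1.0] * 50)            # equal weights
uneven = kish_effective_sample_size([10.0] + [1.0] * 49)  # one dominant row
```

With equal weights $n_{\text{eff}}$ equals the row count (50). When one row carries ten times the weight of the others, the same 50 rows are worth only about 23.4 effective observations.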
561
+
562
+ - `"bayesian_bootstrap"` (default) uses Dirichlet resampling of the
563
+ weighted mean. Nonparametric: makes no assumption on the response
564
+ distribution. The safe choice for arbitrary regression targets, but
565
+ less powerful than a method tailored to the response's actual
566
+ distribution.
567
+ - `"bca"` is the bias-corrected and accelerated bootstrap interval
568
+ (Efron, 1987): resample $10{,}000$ times from the empirical
569
+ distribution, then read percentiles corrected for median
570
+ bias ($z_0$) and skewness ($a$, computed via jackknife).
571
+ Nonparametric and second-order accurate ($O(1/n)$ coverage error);
572
+ transformation-respecting. Choose for the frequentist counterpart of
573
+ `"bayesian_bootstrap"` when an external benchmark specifies a
574
+ frequentist confidence interval. Non-deterministic across calls.
575
+ - `"beta"` is a Clopper-Pearson-style Beta interval for proportional
576
+ responses in $[0, 1]$. Choose when $y$ is naturally a rate or
577
+ proportion (conversion rate, click-through rate).
578
+ - `"exponential"` is the exact chi-squared interval for an Exponential
579
+ mean (Gamma with shape $= 1$); requires $y \ge 0$. Choose when
580
+ responses are non-negative waiting times or lifetimes that follow an
581
+ exponential distribution.
582
+ - `"gamma"` is the exact chi-squared interval for a Gamma mean using
583
+ a method-of-moments shape estimate; requires $y \ge 0$. Choose for
584
+ non-negative right-skewed responses (insurance claims, incomes,
585
+ durations).
586
+ - `"log_normal"` is Cox's interval for the arithmetic mean of a
587
+ log-normal response; requires $y > 0$. Centered on the log-normal
588
+ MLE of the mean, not the sample mean. Choose when $\log y$ is
589
+ approximately normal (financial returns, biological measurements).
590
+ - `"log_normal_gci"` is the generalized confidence interval
591
+ (Krishnamoorthy & Mathew, 2003) for the arithmetic mean of a
592
+ log-normal response; requires $y > 0$. Like `"log_normal"` but built
593
+ via Monte Carlo from a generalized pivot, giving asymmetric bounds.
594
+ Choose when $n_{\text{eff}}$ is very small with large $\log y$ variance, where
595
+ Cox's symmetric Wald form begins to lose calibration.
596
+ Non-deterministic across calls.
597
+ - `"normal"` is a Wald-style interval $\bar{Y} \pm z \cdot \text{SE}$
598
+ ($\bar{Y}$ the node weighted mean of $y$, $z$ a standard normal
599
+ quantile, $\text{SE}$ the standard error) with the Kish effective
600
+ sample size. Tight and cheap. Choose when the central limit theorem
601
+ applies comfortably ($n_{\text{eff}}$ well above 30, finite response
602
+ variance).
603
+ - `"poisson"` is the exact Garwood chi-squared interval for a Poisson
604
+ mean rate; requires $y \ge 0$. The conservative choice with
605
+ guaranteed coverage (Patil & Kulkarni, 2012). Choose for count
606
+ responses generated by an approximately Poisson process when
607
+ guaranteed coverage matters more than tightness.
608
+ - `"poisson_jeffreys"` is the equal-tailed Jeffreys interval for a
609
+ Poisson mean rate; requires $y \ge 0$. Shorter than `"poisson"` at
610
+ moderate rates (Patil & Kulkarni, 2012). Choose for count
611
+ responses at moderate rates when you do not require Garwood's
612
+ guaranteed coverage.
613
+ - `"student_t"` has the same form as `"normal"` but uses a Student-t
614
+ quantile with $n_{\text{eff}} - 1$ degrees of freedom. Wider than
615
+ `"normal"` for small effective sample sizes. Choose when
616
+ $n_{\text{eff}}$ is borderline and small-sample coverage matters.
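To make the contrast between the nonparametric default and the Wald form concrete, here is a standard-library sketch of both on an unweighted, right-skewed sample. It rests on several assumptions: the data are invented, the draw count and seed are arbitrary, the Wald version ignores sample weights (so $n_{\text{eff}} = n$), and neither function is Sigma's implementation.

```python
import random
from statistics import NormalDist


def bayesian_bootstrap_interval(values, coverage=0.95, draws=2000, seed=123):
    """Percentile interval of the mean under Dirichlet(1, ..., 1) weights."""
    rng = random.Random(seed)
    means = []
    for _ in range(draws):
        # Normalized Exponential(1) draws are Dirichlet(1, ..., 1) weights.
        gaps = [rng.expovariate(1.0) for _ in values]
        total = sum(gaps)
        means.append(sum(g * v for g, v in zip(gaps, values)) / total)
    means.sort()
    tail = (1.0 - coverage) / 2.0
    lower = means[int(tail * (draws - 1))]
    upper = means[int((1.0 - tail) * (draws - 1))]
    return lower, upper


def wald_interval(values, coverage=0.95):
    """Unweighted Wald interval: mean +/- z * SE."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    half_width = z * (variance / n) ** 0.5
    return mean - half_width, mean + half_width


# Invented, right-skewed claim amounts with one large value.
claims = [120.0, 95.0, 310.0, 80.0, 150.0, 2400.0,
          60.0, 200.0, 110.0, 175.0, 90.0, 130.0]
sample_mean = sum(claims) / len(claims)
bootstrap_low, bootstrap_high = bayesian_bootstrap_interval(claims)
wald_low, wald_high = wald_interval(claims)
```

On this sample the Wald lower bound is negative even though every claim is positive, whereas each bootstrap replicate is a weighted average of the data and so cannot leave the observed range. That is one reason to prefer the nonparametric default, or a distribution-specific method such as `"gamma"`, for skewed non-negative responses.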
617
+
618
+ ### 6.10. `ci_method` (`ClassificationTree` only)
619
+
620
+ **Default**: `"jeffreys"`.
621
+
622
+ Method for the per-class confidence intervals on node class
623
+ proportions. In the descriptions below, $n$ denotes the node sample
624
+ size and $z$ a standard normal quantile.
625
+
626
+ - `"agresti_coull"` is the adjusted Wald interval: Wald applied
627
+ after adding $z^2/2$ pseudo-successes and $z^2/2$ pseudo-failures.
628
+ Slightly wider and more conservative than `"wilson"` at small
629
+ sample sizes; comparable in coverage to `"wilson"` and
630
+ `"jeffreys"` for $n > 40$ per Brown-Cai-DasGupta (2001). Choose
631
+ when matching an external reference that specifies Agresti-Coull.
632
+ - `"clopper_pearson"` is the exact Beta interval. Has the absolute
633
+ coverage *guarantee* ($\ge$ `ci_coverage` for every true proportion),
634
+ but is conservative: intervals are wider than they need to be on
635
+ average. Choose when guaranteed coverage matters more than tightness
636
+ (regulatory or safety contexts).
637
+ - `"jeffreys"` (default) is a Bayesian interval from the Beta
638
+ posterior with the Jeffreys non-informative prior $\mathrm{Beta}(0.5, 0.5)$.
639
+ Neither systematically conservative nor systematically aggressive on
640
+ average. Recommended for general use.
641
+ - `"mid_p_exact"` is the mid-p variant of Clopper-Pearson. Strictly
642
+ narrower than `"clopper_pearson"` while keeping an exact-tail
643
+ rationale, with average coverage close to nominal. Choose when
644
+ Clopper-Pearson's conservatism feels too wasteful but an exact-tail
645
+ method is still desired.
646
+ - `"wilson"` is the closed-form Wilson score interval, clipped to
647
+ $[0, 1]$. Cheapest to compute and accurate at moderate sample sizes;
648
+ coverage degrades near 0 and 1. Choose when you need vectorized
649
+ speed and class proportions are not extreme.
650
+ - `"wilson_cc"` is the Wilson score interval with Newcombe's
651
+ continuity correction. Slightly wider than `"wilson"`, restoring
652
+ lower-tail coverage at small sample sizes. Choose when the node
653
+ total weight $w_{\text{total}}$ is small and plain Wilson
654
+ under-covers.
655
+
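As a rough illustration of how four of these intervals relate to one another, the sketch below computes them with SciPy for plain integer counts. The helper name `binomial_intervals` and its arguments are hypothetical and not part of Sigma's API; Sigma works on weighted class counts, so its own implementation may differ in detail.

```python
import numpy as np
from scipy import stats


def binomial_intervals(k, n, coverage=0.95):
    """Return {method: (lower, upper)} for k successes out of n trials."""
    alpha = 1.0 - coverage
    z = stats.norm.ppf(1.0 - alpha / 2.0)
    p_hat = k / n

    # Wilson score interval, clipped to [0, 1].
    centre = (k + z**2 / 2.0) / (n + z**2)
    half = z * np.sqrt(n * p_hat * (1.0 - p_hat) + z**2 / 4.0) / (n + z**2)
    wilson = (max(0.0, centre - half), min(1.0, centre + half))

    # Agresti-Coull: Wald interval after adding z^2/2 pseudo-successes
    # and z^2/2 pseudo-failures.
    n_adj = n + z**2
    p_adj = (k + z**2 / 2.0) / n_adj
    half = z * np.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    agresti_coull = (max(0.0, p_adj - half), min(1.0, p_adj + half))

    # Jeffreys: equal-tailed quantiles of the Beta(k + 0.5, n - k + 0.5)
    # posterior; the bound is pinned at 0 or 1 when k is 0 or n.
    jeffreys = (
        0.0 if k == 0 else float(stats.beta.ppf(alpha / 2.0, k + 0.5, n - k + 0.5)),
        1.0 if k == n else float(stats.beta.ppf(1.0 - alpha / 2.0, k + 0.5, n - k + 0.5)),
    )

    # Clopper-Pearson: exact interval expressed through Beta quantiles.
    clopper_pearson = (
        0.0 if k == 0 else float(stats.beta.ppf(alpha / 2.0, k, n - k + 1)),
        1.0 if k == n else float(stats.beta.ppf(1.0 - alpha / 2.0, k + 1, n - k)),
    )

    return {
        "wilson": wilson,
        "agresti_coull": agresti_coull,
        "jeffreys": jeffreys,
        "clopper_pearson": clopper_pearson,
    }
```

For example, `binomial_intervals(3, 20)` lets you check on a small node that the Agresti-Coull interval contains the Wilson interval and that the Clopper-Pearson interval contains the Jeffreys interval.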
656
+ ### 6.11. `ci_method` (`RankingTree` only)
657
+
658
+ **Default**: `"bayesian_bootstrap"`.
659
+
660
+ Method for the per-item confidence intervals on each node's Plackett-Luce
661
+ expected-rank vector. Both supported methods refit the PL MLE on
662
+ resampled active rows and aggregate the resulting expected-rank vectors
663
+ marginally per item; scalar-mean CI methods (`"normal"`, `"student_t"`)
664
+ and the seven distribution-specific methods of `RegressionTree`
665
+ (`"beta"`, `"exponential"`, `"gamma"`, `"log_normal"`,
666
+ `"log_normal_gci"`, `"poisson"`, `"poisson_jeffreys"`) are rejected at
667
+ construction time because PL expected rank is a non-linear functional of
668
+ a joint MLE rather than a scalar sample mean.
669
+
670
+ - `"bayesian_bootstrap"` (default) draws Dirichlet weights for the
671
+ active rows and refits the PL MLE on each replicate. Nonparametric;
672
+ the safe choice.
673
+ - `"bca"` is the bias-corrected and accelerated bootstrap (Efron, 1987)
674
+ applied to row-resampled PL refits, with the acceleration term
675
+ computed from a leave-one-out jackknife of PL refits. Slower than
676
+ `"bayesian_bootstrap"`; non-deterministic across calls unless `random_state` is set.
677
+
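The Dirichlet reweighting behind `"bayesian_bootstrap"` can be sketched as follows. To keep the example short, the statistic recomputed on each replicate is the weighted mean rank per item, a deliberate stand-in for the Plackett-Luce refit that Sigma performs; the function name, the replicate count, and the percentile aggregation are all assumptions of this sketch.

```python
import numpy as np


def bayesian_bootstrap_ci(ranks, coverage=0.95, n_replicates=2000, seed=0):
    """Per-item percentile interval for the weighted mean rank.

    ranks: array of shape (n_rows, n_items). The weighted mean rank stands
    in for the Plackett-Luce refit, which is not reproduced here.
    """
    ranks = np.asarray(ranks, dtype=float)
    rng = np.random.default_rng(seed)
    n_rows = ranks.shape[0]
    # One Dirichlet(1, ..., 1) weight vector per replicate (Rubin, 1981).
    weights = rng.dirichlet(np.ones(n_rows), size=n_replicates)
    # Re-evaluate the statistic under each weight vector: shape (B, n_items).
    replicates = weights @ ranks
    # Aggregate marginally per item with equal-tailed percentiles.
    tail = (1.0 - coverage) / 2.0
    lower, upper = np.quantile(replicates, [tail, 1.0 - tail], axis=0)
    return lower, upper
```

Swapping the weighted mean for a weighted model refit gives the general pattern: draw weights, refit, collect the per-item quantity, then take quantiles item by item.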
678
+ ### 6.12. `ci_coverage`
679
+
680
+ **Default**: `0.95`.
681
+
682
+ Coverage level for node-prediction confidence intervals. Set to `None`
683
+ to skip CI computation entirely (the proper way to fully avoid the
684
+ per-node `ci_method` cost). Common alternatives: `0.90` (narrower
685
+ intervals), `0.99` (wider intervals). For survival trees, also
686
+ controls the confidence band drawn behind each Kaplan-Meier curve in
687
+ the response plot; for ranking trees, it sets the per-item whisker
688
+ width in the expected-rank response plot.
689
+
690
+ ### 6.13. `transmuter`
691
+
692
+ **Default**: `None`.
693
+
694
+ Optional callable that transforms node-level data before predictions
695
+ and confidence intervals are computed, with post-hoc split validation.
696
+ Signature: `(X, y, sample_weight) -> (y', sample_weight')`, or
697
+ `(X, y, sample_weight, side_data) -> (y', sample_weight')` when
698
+ `side_data` is passed to `fit`. After each candidate split, both
699
+ child subsets are independently transmuted and a significance test is
700
+ run on the transmuted data; if the p-value exceeds `alpha` the split
701
+ is rejected and the node becomes a leaf. Use cases: survival outcomes
702
+ (Kaplan-Meier-style transformation), rate normalization (impressions
703
+ to click-through rate), de-noising heavy-tailed responses.
704
+
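A minimal sketch of a rate-normalization transmuter using the four-argument signature is shown below. It assumes that `y` holds click counts, that `side_data` is an array of impression counts aligned with the rows, and that `sample_weight` arrives as an array; none of these conventions is confirmed by the documentation above, and the function name is hypothetical.

```python
import numpy as np


def ctr_transmuter(X, y, sample_weight, side_data):
    """Hypothetical transmuter: click counts -> click-through rates.

    Assumes y holds clicks and side_data holds impressions for the same
    rows (both are assumptions of this sketch).
    """
    clicks = np.asarray(y, dtype=float)
    impressions = np.asarray(side_data, dtype=float)
    weights = np.asarray(sample_weight, dtype=float)
    # Rows without impressions carry no rate information: give them a
    # rate of 0 and (below) a weight of 0 so they drop out.
    has_data = impressions > 0
    rate = np.divide(clicks, impressions, out=np.zeros_like(clicks), where=has_data)
    # Weight each rate by its exposure so that the weighted mean of the
    # rates equals total clicks divided by total impressions.
    return rate, weights * impressions
```

Weighting by exposure is one design choice among several; an unweighted variant would return `sample_weight` unchanged and treat every row's rate as equally informative.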
705
+ ### 6.14. `resamples`
706
+
707
+ **Default**: `None`.
708
+
709
+ Number of permutations $B$ for `test_type="monte_carlo"`. Required and
710
+ must be a positive integer when `"monte_carlo"` is selected; ignored
711
+ otherwise. Typical choices: `1000` for day-to-day production, `10000`
712
+ for paper-grade reproducible adjusted p-values.
713
+
714
+ ### 6.15. `decorator`
715
+
716
+ **Default**: `None`.
717
+
718
+ Optional callable invoked once per node after the tree is built.
719
+ Signature: `(X_active, y_active, w_active, side_data_active) ->
720
+ decoration` where `decoration` is any object (or `None`). The returned
721
+ object is stored on the node as `node.decoration` and rendered by
722
+ `to_text` and `to_image`. Use cases: per-node metrics (RMSE,
723
+ classification accuracy), business labels (segment names), diagnostic
724
+ statistics.
725
+
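As an illustration of the signature, here is a sketch of a decorator that attaches a weighted RMSE to each node. The function name and the choice to return a dictionary are hypothetical, and how `side_data_active` is populated when no `side_data` was passed to `fit` is not specified above, so the sketch simply ignores it.

```python
import numpy as np


def rmse_decorator(X_active, y_active, w_active, side_data_active):
    """Hypothetical decorator: weighted RMSE around the node's weighted mean."""
    y = np.asarray(y_active, dtype=float)
    w = np.asarray(w_active, dtype=float)
    mean = np.average(y, weights=w)
    rmse = np.sqrt(np.average((y - mean) ** 2, weights=w))
    # Any object may be returned; a small dict keeps the value labelled.
    return {"rmse": float(rmse)}
```

Such a callable would be supplied through the `decorator` parameter of the estimator, after which the returned object is available as `node.decoration`.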
726
+ ### 6.16. `random_state`
727
+
728
+ **Default**: `None`.
729
+
730
+ Seed for all stochastic operations in the estimator. Pass an integer
731
+ for reproducibility; `None` uses an unpredictable seed. Controls:
732
+
733
+ - min-P permutation resampling under `test_type="monte_carlo"`;
734
+ - the bootstrap-family CI methods of `RegressionTree`
735
+ (`bayesian_bootstrap`, `bca`, `log_normal_gci`);
736
+ - the bootstrap-family CI methods of `RankingTree`
737
+ (`bayesian_bootstrap`, `bca`), applied $K$ times per node, once per
738
+ item, where $K$ is the number of items;
739
+ - the jitter of `to_image(kind="response")` raincloud plots
740
+ (`RegressionTree` only; combined with the leaf index so each leaf
741
+ receives a distinct pattern).
742
+
743
+ ## 7. Algorithm
744
+
745
+ The algorithm builds a decision tree using statistical hypothesis
746
+ testing for unbiased variable selection. Unlike CART, which selects
747
+ variables by maximizing an impurity criterion (and is therefore biased
748
+ toward variables with many possible splits), conditional inference trees
749
+ use permutation-based p-values to decouple variable selection from split
750
+ search.
751
+
752
+ The framework is generic: the only difference between the four task
753
+ families is the influence function $h$ applied to the response $Y_i$
754
+ of observation $i$. For classification with $J$ classes,
755
+ $h(Y_i) = e_J(Y_i)$ (one-hot encoding of the class label). For
756
+ regression, $h(Y_i) = Y_i$ (identity). For survival, $h(Y_i)$ is the
757
+ log-rank score (a scalar centred Savage score). For ranking, the
758
+ ranks-in-cell $Y_i$ is imputed at unranked items with the per-row
759
+ tail mean, log-transformed via $\log(1 + Y_i)$, column-centered, and
760
+ projected onto the top-$R$ right singular vectors of the resulting
761
+ matrix: $h(Y_i) = (\log(1 + Y_i) - \bar{m}) V$, where $\bar{m}$ is
762
+ the global column mean and $V$ is the loading matrix. The
763
+ log-transformation of power-law-distributed rank data before factor
764
+ analysis follows Leydesdorff (2006). All test statistics, p-value
765
+ computations, and splitting criteria use the same formulas.
766
+
767
+ ### 7.1. Step 1: Variable selection and stopping
768
+
769
+ Given $n$ observations with response values $Y_i$, covariate values
770
+ $X_{ji}$ (the value of the $j$-th covariate $X_j$ for observation $i$),
771
+ and case weights $w_i$, define $g_j$ as the score function for
772
+ covariate $X_j$ (identity for numeric covariates, dummy encoding for
773
+ categorical ones). When
774
+ `correlation="rank"` (the default), continuous covariates and regression
775
+ responses are rank-transformed within each node before computing the
776
+ test statistics, yielding Spearman-like nonparametric tests that are
777
+ robust to outliers and non-normality. When `correlation="normal"`, raw
778
+ values are used (Pearson-like, as in Hothorn, Hornik & Zeileis, 2006). For each
779
+ covariate $X_j$, the algorithm computes the linear statistic
780
+
781
+ $$T_j = \text{vec}\!\left(\sum_{i=1}^{n} w_i \cdot g_j(X_{ji}) \cdot h(Y_i)^\top\right)$$
782
+
783
+ and derives its conditional expectation $\mu_j$ and covariance
784
+ $\Sigma_j$ under the null hypothesis of independence between $X_j$ and
785
+ the response $Y$. A test statistic (quadratic-form or maximum-type) is
786
+ computed and converted to a p-value $P_j$. A multiplicity adjustment is
787
+ applied across all $m$ covariates, and recursion stops when
788
+ $\min_j(\text{adjusted } P_j) > \alpha$. Otherwise the covariate with
789
+ the smallest adjusted p-value is selected.
790
+
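For the simplest case, a numeric covariate and a scalar influence function, the statistic and its null moments can be sketched as below. The moment formulas are the scalar specialization of those in Hothorn, Hornik & Zeileis (2006); the function name is hypothetical, the rank transform is omitted, and whether Sigma organizes the computation this way internally is an assumption.

```python
import numpy as np
from scipy import stats


def linear_statistic_test(g, h, w=None):
    """Scalar g(X_j) and h(Y): return (T, mu, sigma2, p_value)."""
    g = np.asarray(g, dtype=float)
    h = np.asarray(h, dtype=float)
    w = np.ones_like(g) if w is None else np.asarray(w, dtype=float)
    n_w = w.sum()

    # Linear statistic T_j = sum_i w_i * g(X_ji) * h(Y_i).
    T = np.sum(w * g * h)

    # Weighted mean and variance of the influence function.
    h_mean = np.sum(w * h) / n_w
    h_var = np.sum(w * (h - h_mean) ** 2) / n_w

    # Conditional expectation and variance under independence.
    sum_g = np.sum(w * g)
    sum_g2 = np.sum(w * g**2)
    mu = sum_g * h_mean
    sigma2 = (n_w * h_var * sum_g2 - h_var * sum_g**2) / (n_w - 1.0)

    # Quadratic-form statistic, asymptotically chi-squared with 1 df.
    c = (T - mu) ** 2 / sigma2
    return T, mu, sigma2, stats.chi2.sf(c, df=1)
```

With unit weights the quadratic form reduces to $(n - 1) r^2$, where $r$ is the Pearson correlation between $g(X_j)$ and $h(Y)$, which is a convenient sanity check.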
791
+ The default adjustment is the Sidak correction
792
+ ($\text{adjusted } P_j = 1 - (1 - P_j)^m$), which is valid under the
793
+ mild assumption that the test statistics across covariates are
794
+ independent or positively dependent. A simpler closed-form alternative,
795
+ `test_type="bonferroni"`, uses
796
+ $\text{adjusted } P_j = \min(m P_j, 1)$; it is strictly more
797
+ conservative than Sidak. The third alternative,
798
+ `test_type="monte_carlo"`, uses the Westfall-Young (1993) min-P
799
+ resampling procedure. For each of $B$ permutations of the response, all
800
+ $m$ p-values are recomputed and the minimum recorded. The adjusted
801
+ p-value for covariate $j$ is the proportion of permutations where this
802
+ minimum did not exceed the observed $P_j$. This method is more powerful
803
+ than Sidak when covariates are correlated, at the cost of
804
+ $O(B \cdot m)$ additional statistic evaluations. Set `resamples` (e.g.,
805
+ 1000 or 10000) and optionally `random_state` for reproducibility. All
806
+ three methods are available via the `test_type` parameter.
807
+
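The three adjustments can be compared side by side with a few lines of NumPy. This is a sketch of the formulas stated above, not Sigma's code: the function names are hypothetical, and `min_p` takes the permuted p-values as a precomputed `(B, m)` array rather than generating the permutations itself.

```python
import numpy as np


def sidak(p, m):
    """Sidak adjustment: 1 - (1 - P_j)^m."""
    return 1.0 - (1.0 - np.asarray(p, dtype=float)) ** m


def bonferroni(p, m):
    """Textbook Bonferroni adjustment: min(m * P_j, 1)."""
    return np.minimum(m * np.asarray(p, dtype=float), 1.0)


def min_p(p_observed, p_permuted):
    """Min-P adjustment from a (B, m) array of permuted p-values."""
    p_observed = np.asarray(p_observed, dtype=float)
    # Smallest p-value across the m covariates for each permutation.
    min_per_permutation = np.min(np.asarray(p_permuted, dtype=float), axis=1)
    # Proportion of permutations whose minimum did not exceed each observed P_j.
    return np.mean(min_per_permutation[:, None] <= p_observed[None, :], axis=0)
```

For instance, with $m = 2$ and $P_j = 0.2$, Sidak gives $1 - 0.8^2 = 0.36$ while Bonferroni gives $0.4$.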
808
+ ### 7.2. Step 2: Binary splitting
809
+
810
+ For the selected covariate, the algorithm searches for the binary
811
+ partition $A^*$ that maximizes the two-sample test statistic. Numeric
812
+ covariates are split at midpoints between consecutive unique values.
813
+ Categorical covariates with $K \le 10$ levels use exhaustive enumeration
814
+ of all $2^{K-1} - 1$ partitions; for $K > 10$, categories are ordered
815
+ by weighted mean of the first influence function column and only $K - 1$
816
+ contiguous splits are evaluated (provably optimal for regression,
817
+ heuristic for classification).
818
+
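The candidate sets described above can be enumerated as follows. This sketch covers only the generation of candidates (midpoints for numeric covariates, exhaustive partitions for categorical ones), not the evaluation of the two-sample statistic; the function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def numeric_thresholds(x):
    """Midpoints between consecutive unique values of a numeric covariate."""
    values = np.unique(x)  # sorted unique values
    return (values[:-1] + values[1:]) / 2.0


def categorical_partitions(levels):
    """Yield each of the 2^(K-1) - 1 binary partitions once, as (left, right)."""
    levels = list(levels)
    first, rest = levels[0], levels[1:]
    # Fixing the first level on the left avoids counting mirror images twice;
    # stopping before len(rest) keeps the right side non-empty.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            left = {first, *extra}
            yield left, set(levels) - left
```

With $K = 4$ levels the generator yields $2^3 - 1 = 7$ partitions, and the count doubles with each extra level, which is why a cutoff such as $K \le 10$ is needed for exhaustive search.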
819
+ ### 7.3. Step 3: Recursion and prediction
820
+
821
+ Case weights are updated to reflect node membership and steps 1-2 are
822
+ repeated recursively on each child node. Terminal nodes predict:
823
+
824
+ - **Regression**: the weighted mean of the response.
825
+ - **Classification**: the majority class, with class probabilities
826
+ given by the normalized weighted class counts.
827
+
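The two terminal-node predictions listed above amount to the following weighted computations. The function names and the explicit `classes` argument are hypothetical conveniences of this sketch.

```python
import numpy as np


def regression_leaf(y, w):
    """Weighted mean of the response in a leaf."""
    return float(np.average(np.asarray(y, dtype=float), weights=w))


def classification_leaf(y, w, classes):
    """Majority class and normalized weighted class counts in a leaf."""
    y = np.asarray(y)
    w = np.asarray(w, dtype=float)
    # Total case weight observed for each class, in the order of `classes`.
    counts = np.array([w[y == c].sum() for c in classes])
    proba = counts / counts.sum()
    return classes[int(np.argmax(proba))], proba
```

Note that the majority class is determined by weight rather than by row count, so a single heavily weighted observation can outvote several lightly weighted ones.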
828
+ ## 8. Partykit compatibility
829
+
830
+ Sigma is a pure-Python reimplementation of R's `partykit::ctree` with
831
+ various improvements. Tree shape, split variables, split thresholds,
832
+ and per-leaf predictions are empirically verified to match
833
+ `partykit::ctree` on three reference datasets, one each for regression, classification, and survival:
834
+
835
+ - **Regression**: the `airquality` dataset (Ozone on Wind/Temp/Month/Day,
836
+ n=116 after dropping the rows with no Ozone observation). Crosscheck at
837
+ `tests/test_partykit_equivalence.py:26`.
838
+ - **Classification**: the `GlaucomaM` dataset from R's `TH.data` package
839
+ (Class on 62 morphology covariates, n=196). Crosscheck at
840
+ `tests/test_partykit_equivalence.py:75`.
841
+ - **Survival**: the `GBSG2` dataset from `lifelines`
842
+ (`Surv(time, cens) ~ horTh + age + menostat + tsize + tgrade + pnodes +
843
+ progrec + estrec`, n=686). Crosscheck at
844
+ `tests/test_tree_survival.py:661`.
845
+
846
+ Three deliberate deviations from partykit are worth knowing about:
847
+
848
+ 1. **`test_type="sidak"` is the default**, matching partykit's effective
849
+ behavior. Partykit's `testtype="Bonferroni"` is a naming error on their
850
+ part: the adjustment it computes is mathematically the Sidak formula
851
+ $1 - (1 - P_j)^m$, not the textbook Bonferroni $\min(m P_j, 1)$.
852
+ Sigma exposes both options under their correct names; pass
853
+ `test_type="bonferroni"` for the textbook Bonferroni formula, or `test_type="sidak"`
854
+ (the default) to match partykit's "Bonferroni" output exactly.
855
+ 2. **`correlation="rank"` is the default**, where partykit uses raw
856
+ values. Rank-transforming both response and continuous covariates
857
+ gives a Spearman-style test that is robust to outliers and skew, at
858
+ the cost of a small loss of power against linear alternatives. Pass
859
+ `correlation="normal"` to match partykit exactly.
860
+ 3. **Leaves are reordered for display**: `leaves_` iterates in a
861
+ task-appropriate canonical order, and `to_text` / `to_image` swap
862
+ left and right children of each inner node to match. Sort keys are
863
+ descending majority class share (`ClassificationTree`), ascending
864
+ predicted response (`RegressionTree`), worst prognosis first
865
+ (`SurvivalTree`), and ascending lexicographic per-item PL expected-rank
866
+ vector (`RankingTree`). Partykit prints leaves in tree-traversal
867
+ order. The underlying tree is identical; only the iteration order
868
+ of `leaves_` and the visual left-vs-right placement of children in
869
+ exported renderings differ.
870
+
871
+ ## 9. References
872
+
873
+ - Hothorn, T., & Zeileis, A. (2015). *partykit: A Modular Toolkit for
874
+ Recursive Partytioning in R.* *Journal of Machine Learning
875
+ Research*, 16, 3905-3909.
876
+ [jmlr.org/papers/v16/hothorn15a](https://jmlr.org/papers/v16/hothorn15a.html)
877
+ - Turner, H., van Etten, J., Firth, D., & Kosmidis, I. (2020).
878
+ *Modelling Rankings in R: The PlackettLuce Package.* *Computational
879
+ Statistics*, 35(3), 1027-1057.
880
+ [doi:10.1007/s00180-020-00959-3](https://doi.org/10.1007/s00180-020-00959-3)
881
+ - Patil, V. V., & Kulkarni, H. V. (2012). *Comparison of Confidence
882
+ Intervals for the Poisson Mean: Some New Aspects.* *REVSTAT -
883
+ Statistical Journal*, 10(2), 211-227.
884
+ [doi:10.57805/revstat.v10i2.117](https://doi.org/10.57805/revstat.v10i2.117)
885
+ - Hothorn, T., Hornik, K., & Zeileis, A. (2006). *Unbiased Recursive
886
+ Partitioning: A Conditional Inference Framework.* *Journal of
887
+ Computational and Graphical Statistics*, 15(3), 651-674.
888
+ [doi:10.1198/106186006X133933](https://doi.org/10.1198/106186006X133933)
889
+ - Hothorn, T., Hornik, K., van de Wiel, M. A., & Zeileis, A. (2006).
890
+ *A Lego System for Conditional Inference.* *The American
891
+ Statistician*, 60(3), 257-263.
892
+ [doi:10.1198/000313006X118430](https://doi.org/10.1198/000313006X118430)
893
+ - Leydesdorff, L. (2006). *Classification and Powerlaws: The
894
+ Logarithmic Transformation.* *Journal of the American Society for
895
+ Information Science and Technology*, 57(11), 1470-1486.
896
+ [doi:10.1002/asi.20467](https://doi.org/10.1002/asi.20467)
897
+ - Olsson, U. (2005). *Confidence Intervals for the Mean of a
898
+ Log-Normal Distribution.* *Journal of Statistics Education*, 13(1).
899
+ [doi:10.1080/10691898.2005.11910638](https://doi.org/10.1080/10691898.2005.11910638)
900
+ - Hunter, D. R. (2004). *MM Algorithms for Generalized Bradley-Terry
901
+ Models.* *Annals of Statistics*, 32(1), 384-406.
902
+ [doi:10.1214/aos/1079120141](https://doi.org/10.1214/aos/1079120141)
903
+ - Krishnamoorthy, K., & Mathew, T. (2003). *Inferences on the Means of
904
+ Lognormal Distributions Using Generalized p-Values and Generalized
905
+ Confidence Intervals.* *Journal of Statistical Planning and
906
+ Inference*, 115(1), 103-121.
907
+ [doi:10.1016/S0378-3758(02)00153-2](https://doi.org/10.1016/S0378-3758\(02\)00153-2)
908
+ - Hothorn, T., & Lausen, B. (2003). *On the Exact Distribution of
909
+ Maximally Selected Rank Statistics.* *Computational Statistics &
910
+ Data Analysis*, 43(2), 121-137.
911
+ [doi:10.1016/S0167-9473(02)00225-6](https://doi.org/10.1016/S0167-9473\(02\)00225-6)
912
+ - Brown, L. D., Cai, T. T., & DasGupta, A. (2001). *Interval
913
+ Estimation for a Binomial Proportion.* *Statistical Science*, 16(2),
914
+ 101-133.
915
+ [doi:10.1214/ss/1009213286](https://doi.org/10.1214/ss/1009213286)
916
+ - Agresti, A., & Coull, B. A. (1998). *Approximate is Better than
917
+ "Exact" for Interval Estimation of Binomial Proportions.* *The
918
+ American Statistician*, 52(2), 119-126.
919
+ [doi:10.1080/00031305.1998.10480550](https://doi.org/10.1080/00031305.1998.10480550)
920
+ - Newcombe, R. G. (1998). *Two-Sided Confidence Intervals for the
921
+ Single Proportion: Comparison of Seven Methods.* *Statistics in
922
+ Medicine*, 17(8), 857-872.
923
+ [doi:10.1002/sim.777](https://doi.org/10.1002/\(SICI\)1097-0258\(19980430\)17:8%3C857::AID-SIM777%3E3.0.CO;2-E)
924
+ - Efron, B. (1987). *Better Bootstrap Confidence Intervals.* *Journal
925
+ of the American Statistical Association*, 82(397), 171-185.
926
+ [doi:10.1080/01621459.1987.10478410](https://doi.org/10.1080/01621459.1987.10478410)
927
+ - Rubin, D. B. (1981). *The Bayesian Bootstrap.* *Annals of
928
+ Statistics*, 9(1), 130-134.
929
+ [doi:10.1214/aos/1176345338](https://doi.org/10.1214/aos/1176345338)
930
+ - Efron, B. (1977). *The Efficiency of Cox's Likelihood Function for
931
+ Censored Data.* *Journal of the American Statistical Association*,
932
+ 72(359), 557-565.
933
+ [doi:10.1080/01621459.1977.10480613](https://doi.org/10.1080/01621459.1977.10480613)
934
+ - Breslow, N. E. (1974). *Covariance Analysis of Censored Survival
935
+ Data.* *Biometrics*, 30(1), 89-99.
936
+ [doi:10.2307/2529620](https://doi.org/10.2307/2529620)
937
+ - Wilson, E. B. (1927). *Probable Inference, the Law of Succession,
938
+ and Statistical Inference.* *Journal of the American Statistical
939
+ Association*, 22(158), 209-212.
940
+ [doi:10.1080/01621459.1927.10502953](https://doi.org/10.1080/01621459.1927.10502953)