rigortype 0.2.1 → 0.2.2

Files changed (105)
  1. checksums.yaml +4 -4
  2. data/README.md +41 -14
  3. data/docs/handbook/01-getting-started.md +311 -0
  4. data/docs/handbook/02-everyday-types.md +337 -0
  5. data/docs/handbook/03-narrowing.md +359 -0
  6. data/docs/handbook/04-tuples-and-shapes.md +321 -0
  7. data/docs/handbook/05-methods-and-blocks.md +339 -0
  8. data/docs/handbook/06-classes.md +305 -0
  9. data/docs/handbook/07-rbs-and-extended.md +427 -0
  10. data/docs/handbook/08-understanding-errors.md +373 -0
  11. data/docs/handbook/09-plugins.md +241 -0
  12. data/docs/handbook/10-sorbet.md +347 -0
  13. data/docs/handbook/11-sig-gen.md +312 -0
  14. data/docs/handbook/12-lightweight-hkt.md +333 -0
  15. data/docs/handbook/README.md +275 -0
  16. data/docs/handbook/appendix-elixir.md +370 -0
  17. data/docs/handbook/appendix-go.md +399 -0
  18. data/docs/handbook/appendix-java-csharp.md +470 -0
  19. data/docs/handbook/appendix-liskov.md +580 -0
  20. data/docs/handbook/appendix-mypy.md +370 -0
  21. data/docs/handbook/appendix-phpstan.md +338 -0
  22. data/docs/handbook/appendix-protocols-and-structural-typing.md +292 -0
  23. data/docs/handbook/appendix-rust.md +446 -0
  24. data/docs/handbook/appendix-steep.md +336 -0
  25. data/docs/handbook/appendix-type-theory.md +1662 -0
  26. data/docs/handbook/appendix-typeprof.md +416 -0
  27. data/docs/handbook/appendix-typescript.md +332 -0
  28. data/docs/install.md +189 -0
  29. data/docs/llms.txt +72 -0
  30. data/docs/manual/01-installation.md +342 -0
  31. data/docs/manual/02-cli-reference.md +557 -0
  32. data/docs/manual/03-configuration.md +152 -0
  33. data/docs/manual/04-diagnostics.md +206 -0
  34. data/docs/manual/05-inspecting-types.md +109 -0
  35. data/docs/manual/06-baseline.md +104 -0
  36. data/docs/manual/07-plugins.md +92 -0
  37. data/docs/manual/08-skills.md +143 -0
  38. data/docs/manual/09-editor-integration.md +245 -0
  39. data/docs/manual/10-mcp-server.md +532 -0
  40. data/docs/manual/11-ci.md +274 -0
  41. data/docs/manual/12-caching.md +116 -0
  42. data/docs/manual/13-troubleshooting.md +120 -0
  43. data/docs/manual/14-rails-quickstart.md +332 -0
  44. data/docs/manual/15-type-protection-coverage.md +204 -0
  45. data/docs/manual/16-rbs-extended-annotations.md +190 -0
  46. data/docs/manual/17-driving-improvement.md +160 -0
  47. data/docs/manual/README.md +87 -0
  48. data/docs/manual/ci-templates/README.md +58 -0
  49. data/docs/manual/plugins/README.md +86 -0
  50. data/docs/manual/plugins/rigor-actioncable.md +78 -0
  51. data/docs/manual/plugins/rigor-actionmailer.md +74 -0
  52. data/docs/manual/plugins/rigor-actionpack.md +80 -0
  53. data/docs/manual/plugins/rigor-activejob.md +58 -0
  54. data/docs/manual/plugins/rigor-activerecord.md +102 -0
  55. data/docs/manual/plugins/rigor-activestorage.md +74 -0
  56. data/docs/manual/plugins/rigor-activesupport-core-ext.md +86 -0
  57. data/docs/manual/plugins/rigor-devise.md +70 -0
  58. data/docs/manual/plugins/rigor-dry-schema.md +56 -0
  59. data/docs/manual/plugins/rigor-dry-struct.md +60 -0
  60. data/docs/manual/plugins/rigor-dry-types.md +59 -0
  61. data/docs/manual/plugins/rigor-dry-validation.md +62 -0
  62. data/docs/manual/plugins/rigor-factorybot.md +76 -0
  63. data/docs/manual/plugins/rigor-graphql.md +89 -0
  64. data/docs/manual/plugins/rigor-hanami.md +83 -0
  65. data/docs/manual/plugins/rigor-mangrove.md +73 -0
  66. data/docs/manual/plugins/rigor-minitest.md +86 -0
  67. data/docs/manual/plugins/rigor-pundit.md +72 -0
  68. data/docs/manual/plugins/rigor-rails-i18n.md +92 -0
  69. data/docs/manual/plugins/rigor-rails-routes.md +94 -0
  70. data/docs/manual/plugins/rigor-rails.md +44 -0
  71. data/docs/manual/plugins/rigor-rbs-inline.md +83 -0
  72. data/docs/manual/plugins/rigor-rspec-rails.md +72 -0
  73. data/docs/manual/plugins/rigor-rspec.md +86 -0
  74. data/docs/manual/plugins/rigor-shoulda-matchers.md +78 -0
  75. data/docs/manual/plugins/rigor-sidekiq.md +78 -0
  76. data/docs/manual/plugins/rigor-sinatra.md +61 -0
  77. data/docs/manual/plugins/rigor-sorbet.md +63 -0
  78. data/docs/manual/plugins/rigor-statesman.md +75 -0
  79. data/docs/manual/plugins/rigor-typescript-utility-types.md +71 -0
  80. data/exe/rigor +1 -1
  81. data/lib/rigor/analysis/incremental_session.rb +4 -2
  82. data/lib/rigor/analysis/run_stats.rb +13 -1
  83. data/lib/rigor/analysis/runner.rb +54 -12
  84. data/lib/rigor/cli/check_command.rb +1 -1
  85. data/lib/rigor/cli/docs_command.rb +248 -0
  86. data/lib/rigor/cli/skill_command.rb +103 -41
  87. data/lib/rigor/cli/skill_describe.rb +346 -0
  88. data/lib/rigor/cli.rb +25 -3
  89. data/lib/rigor/inference/method_dispatcher/constant_folding.rb +124 -32
  90. data/lib/rigor/inference/method_dispatcher/shape_dispatch.rb +37 -6
  91. data/lib/rigor/inference/scope_indexer.rb +87 -89
  92. data/lib/rigor/plugin/isolation.rb +5 -5
  93. data/lib/rigor/plugin/loader.rb +4 -2
  94. data/lib/rigor/version.rb +1 -1
  95. data/skills/rigor-ask/SKILL.md +172 -0
  96. data/skills/rigor-doctor/SKILL.md +87 -0
  97. data/skills/rigor-editor-setup/SKILL.md +114 -0
  98. data/skills/rigor-mcp-setup/SKILL.md +117 -0
  99. data/skills/rigor-monkeypatch-resolve/SKILL.md +79 -0
  100. data/skills/rigor-next-steps/SKILL.md +113 -0
  101. data/skills/rigor-plugin-tune/SKILL.md +79 -0
  102. data/skills/rigor-protection-uplift/SKILL.md +133 -0
  103. data/skills/rigor-rbs-setup/SKILL.md +128 -0
  104. data/skills/rigor-upgrade/SKILL.md +79 -0
  105. metadata +90 -1
@@ -0,0 +1,1662 @@
1
+ # Appendix — Connections to Type Theory
2
+
3
+ A short bridge between Rigor's vocabulary and the formal
4
+ type-theoretic concepts you may have seen in a programming-languages
5
+ textbook or in another type checker's documentation. The handbook
6
+ proper is deliberately short on theory; this appendix names the
7
+ underlying ideas so that if you already know one of them, you can
8
+ recognise the corresponding Rigor surface immediately.
9
+
10
+ This page is descriptive, not normative. When the formal language
11
+ here disagrees with the [type
12
+ specification](../type-specification/README.md), the spec binds.
13
+
14
+ ## Five-second pitch
15
+
16
+ | Question | Type-theory term | Rigor surface |
17
+ | --- | --- | --- |
18
+ | What is the universe of types ordered by? | Subtyping (`<:`), a partial order forming a lattice | The carrier zoo with `Top` / `Bot`, `\|` (join), `&` (meet) |
19
+ | What about types that may or may not match? | Gradual consistency (`~`) | The `Dynamic[T]` carrier and the trinary certainty `yes / no / maybe` |
20
+ | How are user types identified? | Nominal vs structural | **Nominal-first hybrid** — classes by name, plus structural facets (`interface`, `HashShape`, capability roles) |
21
+ | How are generics expressed? | Parametric polymorphism (System F-style, but predicative) | RBS generics `class Array[Elem]`, method generics `def map: [U] () { (Elem) -> U } -> Array[U]` |
22
+ | How is "x is a non-empty string" expressed? | Refinement / predicate subtyping | First-class refinement carriers (`non-empty-string`, `int<min, max>`, …) |
23
+ | How does `if x.is_a?(String)` change `x`'s type? | Occurrence typing / flow-sensitive narrowing | Edge-aware narrowing with trinary certainty |
24
+ | What about side effects? | Effect systems | The engine's effect model (mutation, exception, escape) — internal, not user-visible |
25
+ | Soundness or completeness? | Pick one (or neither) | **Neither in full** — Rigor optimises for no-false-positives, with a robustness-principle bias |
26
+ | Why do some features force annotations everywhere? | Decidability of inference — certain combinations (Rank-3+, polymorphic recursion, subtyping + intersection) are undecidable | **The trinary `maybe`** — when inference cannot decide, Rigor stays silent rather than guessing or pestering the user for an annotation |
27
+
28
+ Rigor's design pulls liberally from this catalogue but avoids the
29
+ parts that would force a Ruby author to write annotations they would
30
+ not otherwise have written.
31
+
32
+ ## The type lattice
33
+
34
+ Rigor's types form a (bounded) lattice under the subtyping
35
+ relation `<:`. The standard textbook picture applies almost
36
+ verbatim:
37
+
38
+ - **`Top`** is the greatest element — every value has type `Top`.
39
+ - **`Bot`** is the least element — no value has type `Bot`. Useful
40
+ for unreachable branches and "this method always raises."
41
+ - **Join `T \| U`** (union) is the least upper bound.
42
+ - **Meet `T & U`** (intersection) is the greatest lower bound.
43
+
44
+ ```ruby
45
+ # Top — every value inhabits it
46
+ x = something_we_know_nothing_about
47
+ assert_type("Dynamic[top]", x) # Top widened with the Dynamic marker
48
+
49
+ # Bot — no value inhabits it; raised-only methods return Bot
50
+ def boom!
51
+   raise "no"
52
+ end
53
+ assert_type("Dynamic[top]", method(:boom!).call) # Method indirection; direct boom! call returns Bot
54
+
55
+ # Join — Union of two non-overlapping types
56
+ n = rand < 0.5 ? 1 : "a"
57
+ assert_type("\"a\" | 1", n)
58
+
59
+ # Meet — Intersection (rarely needed at the surface level)
60
+ # Mostly arises during refinement combinations
61
+ ```
62
+
63
+ Spec: [`docs/type-specification/value-lattice.md`](../type-specification/value-lattice.md),
64
+ [`docs/type-specification/special-types.md`](../type-specification/special-types.md).
65
+
66
+ ## Set-theoretic foundations of the lattice
67
+
68
+ The previous section described `Top` / `Bot` / `|` (join) / `&`
69
+ (meet) as "the standard textbook picture." The semantic
70
+ foundation that makes that picture *work* — where union and
71
+ intersection on types behave the way the user expects — is the
72
+ **set-theoretic** view of types.
73
+
74
+ In the set-theoretic view, a type `T` is interpreted as the *set
75
+ of values inhabiting it*, and the type operators correspond to
76
+ set operations:
77
+
78
+ | Type operator | Set operation |
79
+ | --- | --- |
80
+ | `T \| U` | `T ∪ U` (union of inhabitants) |
81
+ | `T & U` | `T ∩ U` (intersection of inhabitants) |
82
+ | `T - U` | `T ∖ U` (set-theoretic difference; sometimes written `T ¬ U`) |
83
+ | `Top` | the universal set |
84
+ | `Bot` | the empty set |
85
+
86
+ This is the semantics behind **semantic subtyping**: `T <: U` iff
87
+ every inhabitant of `T` is also an inhabitant of `U`. Subtyping
88
+ becomes set inclusion.
89
+
90
+ ### Academic root
91
+
92
+ **Frisch, Castagna & Benzaken** developed semantic subtyping in
93
+ the early 2000s as the type-theoretic foundation of CDuce — a
94
+ language for processing XML where union, intersection, and
95
+ negation types are first-class. The framework was consolidated
96
+ in **Castagna's 2024 textbook *Programming with Union,
97
+ Intersection, and Negation Types***, the current canonical
98
+ reference for the area.
99
+
100
+ Under semantic subtyping, the lattice operators are *exactly*
101
+ the set-theoretic ones, and a decidable subtyping algorithm
102
+ exists for the resulting fragment — much richer than HM but
103
+ still tractable.
104
+
105
+ ### Industrial uptake
106
+
107
+ - **TypeScript** and **Flow** use union / intersection
108
+ arithmetic informally — the spec talks about "subtype-of-T-or-U"
109
+ rather than committing to a semantic model, but the resulting
110
+ behaviour matches the set-theoretic reading in most cases.
111
+ - **Elixir's set-theoretic type system** (José Valim + Giuseppe
112
+ Castagna collaboration, 2023–) is the first major industrial
113
+ language to *deliberately* adopt the formal framework. The
114
+ decision to ground Elixir's typed surface in semantic subtyping
115
+ rather than a syntactic ad-hoc subtyping relation is the most
116
+ consequential design choice the project has made.
117
+ - **Scala 3** ships union types and intersection types as
118
+ first-class constructs; the semantics are loosely
119
+ set-theoretic.
120
+ - **Ceylon** (now retired) was an early industrial experiment
121
+ with explicit union / intersection types at the surface.
122
+
123
+ ### Rigor's position
124
+
125
+ Rigor's lattice **is** set-theoretic in spirit:
126
+
127
+ - `T | U`, `T & U`, `T - U` are present at the surface
128
+ ([`docs/type-specification/type-operators.md`](../type-specification/type-operators.md)).
129
+ - Top / Bot behave as the universal / empty sets.
130
+ - The difference operator `T - U` is what lets occurrence typing
131
+ express "in the `else` branch of `if x.is_a?(String)`, the type
132
+ of `x` is `Top - String`" precisely, rather than as an
133
+ approximation.
134
+
135
+ Rigor does NOT formalise its lattice as a semantic-subtyping
136
+ system in the Castagna sense. Three reasons:
137
+
138
+ 1. **The trinary certainty already absorbs the hard cases.**
139
+ Semantic subtyping's value is a decision procedure for full
140
+ union / intersection / negation. Rigor's `maybe` arm handles
141
+ "cannot decide" without needing the decision procedure to
142
+ terminate.
143
+ 2. **Nominal-first conflicts with pure semantic subtyping.**
144
+ A semantic-subtyping reading would collapse two distinct
145
+ nominal classes with identical method sets (§ "Nominal vs
146
+ structural typing"), which Ruby authors do not want.
147
+ 3. **Implementation cost.** Castagna's decision procedure is
148
+ theoretically elegant but operationally heavy for an analyser
149
+ that walks an AST per file under a per-file budget
150
+ ([`inference-budgets.md`](../type-specification/inference-budgets.md)).
151
+
152
+ Rigor's `T | U` / `T & U` / `T - U` operators are best read
153
+ with the set-theoretic interpretation in mind, even though
154
+ Rigor's algorithms do not formally lean on it. A reader coming
155
+ from Elixir's set-theoretic types or from Castagna's textbook
156
+ will find the surface familiar; the gap is in the formalisation
157
+ depth, not the surface design.
158
+
159
+ ## Subtyping and gradual consistency
160
+
161
+ Static type theory uses one relation: **subtyping (`<:`)**.
162
+ `Integer <: Numeric` means every `Integer` is a `Numeric`.
163
+
164
+ Gradual typing adds a second relation: **consistency (`~`)**.
165
+ `Dynamic[T] ~ U` means "I do not statically know whether the
166
+ runtime value will satisfy `U`, but it is permitted to."
167
+ Consistency is reflexive and symmetric but **not transitive**.
168
+ This is the key technical move that distinguishes gradual typing
169
+ from "just add an `Any` type to the lattice."
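
The failure of transitivity can be seen in a deliberately tiny model. This is a sketch under stated assumptions, not Rigor's relation: `DYNAMIC` and `consistent?` are hypothetical names, and the toy ignores subtyping entirely so that only the consistency rule is visible.

```ruby
# Hypothetical marker standing in for a dynamically typed value.
DYNAMIC = :dynamic

# Consistency (~) in the toy model: the dynamic marker is consistent with
# every type; two static types are consistent only if they are identical.
def consistent?(t, u)
  t == DYNAMIC || u == DYNAMIC || t == u
end

left  = consistent?(Integer, DYNAMIC) # Integer ~ Dynamic
right = consistent?(DYNAMIC, String)  # Dynamic ~ String
chain = consistent?(Integer, String)  # would follow if ~ were transitive
```

Both `left` and `right` are true while `chain` is false. If the dynamic type were instead an ordinary lattice element related by a transitive relation, `Integer` and `String` would end up related through it.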
170
+
171
+ Rigor exposes both relations through a **trinary certainty**:
172
+
173
+ | Certainty | Reads as | Use site |
174
+ | --- | --- | --- |
175
+ | `yes` | `T <: U` provably holds | The call is safe; no diagnostic. |
176
+ | `no` | `T <: U` provably fails | A diagnostic fires. |
177
+ | `maybe` | Cannot prove either way | No diagnostic — Rigor stays silent (robustness principle). |
178
+
179
+ ```ruby
180
+ # yes: provably Integer <: Numeric
181
+ def add_one(n) = n + 1
182
+ add_one(42) # certainty: yes
183
+
184
+ # no: Constant<"a"> <: Integer is provably false
185
+ add_one("a") # certainty: no — call.argument-type-mismatch fires
186
+
187
+ # maybe: Dynamic[top] ~ Integer holds; <: cannot be decided
188
+ add_one(JSON.parse(input)) # certainty: maybe — silent
189
+ ```
190
+
191
+ Spec: [`docs/type-specification/relations-and-certainty.md`](../type-specification/relations-and-certainty.md).
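
The three-way answer in the table above can be sketched with plain Ruby classes. This is an illustrative model only: `DYNAMIC_TOP`, `certainty`, and `diagnostic?` are hypothetical names, and Ruby's class ancestry (`Module#<=`) stands in for the subtyping relation.

```ruby
# Hypothetical marker standing in for a value of unknown (dynamic) type.
DYNAMIC_TOP = :dynamic_top

# Returns :yes, :no, or :maybe for "actual <: expected".
def certainty(actual, expected)
  return :maybe if actual == DYNAMIC_TOP

  # Module#<= is true for a subclass (or the same class), and false or
  # nil otherwise; both non-true answers count as a provable "no" here.
  actual <= expected ? :yes : :no
end

# Only a provable failure produces a diagnostic; :maybe stays silent.
def diagnostic?(actual, expected)
  certainty(actual, expected) == :no
end
```

With this model, `certainty(Integer, Numeric)` is `:yes`, `certainty(String, Integer)` is `:no`, and `certainty(DYNAMIC_TOP, Integer)` is `:maybe`, so only the second case reports anything.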
192
+
193
+ ## Nominal vs structural typing
194
+
195
+ Java is nominal: `class Foo {}` and `class Bar {}` with identical
196
+ member sets are distinct types. TypeScript is structural: two
197
+ type aliases with identical members are interchangeable.
198
+
199
+ Rigor is **nominal-first with structural facets**:
200
+
201
+ 1. **Nominal** is the default. `Nominal[User]` and
202
+ `Nominal[Admin]` are distinct even with identical methods.
203
+ 2. **Structural via `interface`**. RBS `interface _Comparable`
204
+ defines a shape — anything implementing the named methods
205
+ satisfies it, regardless of class.
206
+ 3. **Structural via `HashShape` and `Tuple`**. Ruby literals
207
+ `{name: "x", age: 30}` and `[1, "a"]` get per-key / per-index
208
+ structural types automatically.
209
+ 4. **Capability roles** are a Rigor-specific structural facet —
210
+ named structural interfaces with hidden carriers
211
+ (`_ReadableStream`, `_RewindableStream`, …). These let the
212
+ robustness principle widen user-method parameter types to "any
213
+ value that supports the capability we actually use" without
214
+ forcing the user to write the `interface`.
215
+
216
+ ```ruby
217
+ # Nominal — User and Admin are distinct
218
+ class User; end
219
+ class Admin; end
220
+ u = User.new
221
+ def takes_user(u) end
222
+ takes_user(Admin.new) # call.argument-type-mismatch
223
+
224
+ # Structural via HashShape — literals get per-key types
225
+ person = {name: "Alice", age: 30}
226
+ assert_type("{ name: \"Alice\", age: 30 }", person)
227
+
228
+ # Structural via interface
229
+ def shout(thing)
230
+   thing.upcase
231
+ end
232
+ # Rigor infers the parameter as "anything with #upcase: () -> String"
233
+ ```
234
+
235
+ Spec:
236
+ [`docs/type-specification/structural-interfaces-and-object-shapes.md`](../type-specification/structural-interfaces-and-object-shapes.md).
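
The difference between the nominal and structural questions can be mimicked at runtime. The sketch below is an analogy rather than a description of how Rigor decides either question statically; `Duck`, `Robot`, `nominal_match?`, and `structural_match?` are hypothetical names.

```ruby
class Duck
  def speak = "quack"
end

class Robot
  def speak = "beep"
end

# Nominal check: the value's class must be the named class or a subclass.
def nominal_match?(value, klass)
  value.is_a?(klass)
end

# Structural check: the value only has to respond to the listed methods.
def structural_match?(value, required_methods)
  required_methods.all? { |name| value.respond_to?(name) }
end
```

A `Robot` fails `nominal_match?(Robot.new, Duck)` even though its method set is identical to `Duck`'s, yet it passes `structural_match?(Robot.new, [:speak])`. That is the distinction the numbered list draws between the nominal default and the `interface` facet.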
237
+
238
+ ## Polymorphism
239
+
240
+ The Cardelli/Wegner taxonomy of polymorphism maps cleanly onto Rigor:
241
+
242
+ | Polymorphism family | Rigor surface | Notes |
243
+ | --- | --- | --- |
244
+ | **Parametric** (System F-style, predicative) | RBS generics `class Foo[T]`, method generics `def m: [U] (U) -> U` | No higher-rank or higher-kinded quantification at the user surface. |
245
+ | **Subtype** | `<:` over the lattice | Standard; method calls dispatch by inferred receiver type. |
246
+ | **Ad-hoc** (overloading) | RBS method overloads (`def m: (Integer) -> Integer \| (String) -> String`) | Resolution picks the most specific arm. |
247
+ | **Coercion** | Rigor's Ruby-coercion model (`Integer#coerce`, etc.) | Inferred per the runtime semantics; not a user-visible operator. |
248
+ | **Row polymorphism** | (not exposed at the user surface) | `HashShape` carries closed-vs-open key sets internally; not a quantifiable axis. See § "Object shapes" for the lineage. |
249
+
250
+ ```ruby
251
+ # Parametric — method generics in RBS
252
+ # sig: def first: [E] (Array[E]) -> E?
253
+ def first(arr) = arr[0]
254
+
255
+ # Subtype — Integer <: Numeric flows through method calls
256
+ def total(ns) = ns.sum
257
+ total([1, 2, 3]) # ns: Array[Integer]
258
+ total([1, 2.0, 3]) # ns: Array[Numeric]
259
+
260
+ # Ad-hoc — RBS overload picks per call site
261
+ "abc" * 3 # String overload
262
+ [1, 2] * 3 # Array overload
263
+ ```
264
+
265
+ Spec: [`docs/type-specification/rbs-compatible-types.md`](../type-specification/rbs-compatible-types.md).
266
+
267
+ ## F-bounded polymorphism and self types
268
+
269
+ A recurring concrete need in object-oriented languages: a method
270
+ that "returns an instance of the actual receiver's class." Ruby's
271
+ `Object#tap` is the canonical example — `arr.tap { |x| ... }`
272
+ returns the same `arr`, with the same type, not the widened
273
+ `Object` ancestor.
274
+
275
+ Expressing "I return *my own* class" needs a mechanism beyond
276
+ plain parametric polymorphism, because the return-type parameter
277
+ must track the *runtime* class of `self`, not just a static type
278
+ variable bound at the call site. Two related mechanisms in the
279
+ literature:
280
+
281
+ ### Self types
282
+
283
+ A reserved keyword (`self`, `Self`, `this`) inside a class
284
+ declaration meaning "the type of the actual receiver." A method
285
+ declared `def m: () -> self` in class `Foo` returns `Foo` in
286
+ `Foo` but returns `Bar` in `class Bar < Foo`. RBS uses `self`
287
+ exactly this way:
288
+
289
+ ```ruby
290
+ # RBS for Object#tap (excerpt)
291
+ class Object
292
+   def tap: { (self) -> void } -> self
293
+ end
294
+ ```
295
+
296
+ When called on an `Array[Integer]`, the block sees the receiver
297
+ typed as `Array[Integer]` and the call returns `Array[Integer]` —
298
+ not `Object`.
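
The runtime behaviour that a `self` return type describes can be observed directly. In this sketch `Collection`, `SortedCollection`, and `checked` are hypothetical names; an RBS signature for `checked` would plausibly read `def checked: () -> self`.

```ruby
class Collection
  # Returns the receiver itself, so the result's class follows the receiver.
  def checked
    self
  end
end

class SortedCollection < Collection; end

base    = Collection.new.checked
derived = SortedCollection.new.checked
tapped  = [1, 2, 3].tap { |a| a.size }
```

Here `derived.class` is `SortedCollection`, not `Collection`, and `tapped` is the original array. A signature that returned the declaring class would lose exactly that information.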
299
+
300
+ ### F-bounded polymorphism
301
+
302
+ A more general mechanism — a type parameter constrained to
303
+ extend its own parameterisation: `T <: Comparable[T]`. The
304
+ classical reference is **Canning, Cook, Hill, Olthoff &
305
+ Mitchell 1989** (*F-Bounded Polymorphism for Object-Oriented
306
+ Programming*). The motivating problem is the same — "a method on
307
+ `Comparable` should return the comparing class itself, not the
308
+ abstract `Comparable` interface" — but the encoding is via a
309
+ constrained type parameter rather than a `self` keyword.
310
+
311
+ ### Industrial uptake
312
+
313
+ | Language | Self-type form | F-bounded form |
314
+ | --- | --- | --- |
315
+ | Java | (none direct) | `<T extends Comparable<T>>` |
316
+ | Scala | `this: T =>` self-type | `[T <: Comparable[T]]` |
317
+ | TypeScript | `this` type in method signatures | `<T extends C<T>>` |
318
+ | Sorbet | `T.self_type`, `T.attached_class` | Limited via generic class |
319
+ | RBS | `self` keyword | `[T < Comparable[T]]` (syntactically supported, limited) |
320
+
321
+ ### Rigor's position
322
+
323
+ Rigor honours the RBS `self` keyword in method signatures. The
324
+ walker substitutes the actual receiver type for `self` when
325
+ synthesising the return type — so `Array[Integer]#dup` (declared
326
+ as `def dup: () -> self`) returns `Array[Integer]`, not the
327
+ ancestor's `Object`. This is the small mechanism that removes a
328
+ major source of unwanted widening in OO Ruby code:
329
+
330
+ ```ruby
331
+ arr = [1, 2, 3]
332
+ copy = arr.dup
333
+ # Without self-type: copy : Object (useless)
334
+ # With self-type: copy : Array[Integer] (load-bearing)
335
+ ```
336
+
337
+ F-bounded polymorphism in its full generality is harder. The
338
+ inference machinery has to solve a constraint that mentions the
339
+ type variable on both sides of `<:`. Rigor's RBS surface accepts
340
+ the constrained form `[T < C[T]]` but the walker treats
341
+ unresolved F-bounded constraints conservatively (`Dynamic[top]`
342
+ fallback when the bound cannot be solved locally). This matches
343
+ the no-false-positives stance: an over-precise F-bounded
344
+ inference would spread `T`-mention errors through the codebase,
345
+ and the practical Ruby idiom (declare `self` rather than
346
+ quantify over `T <: C[T]`) sidesteps the harder cases anyway.
347
+
348
+ The type-theoretic family extends further (G-bounded polymorphism;
349
+ parameterised self-types; ML-style first-class modules with
350
+ `with type self = ...`) but those forms are not exposed at the
351
+ Rigor surface.
352
+
353
+ ## Object shapes — row polymorphism, Hack, and HashShape's lineage
354
+
355
+ The `HashShape{...}` carrier and the closely related `Tuple[...]`
356
+ appeared first in § "Nominal vs structural typing" and again in
357
+ the precision table of § "Beyond pure inference," where they turn
358
+ an otherwise-`Hash[Symbol, A | B | C]`-shaped join into something
359
+ a downstream caller can use. They sit in a family of *structural
360
+ shape* designs with both an academic root and an industrial
361
+ lineage. Tracing those threads is the easiest way to explain why
362
+ `HashShape` looks the way it does.
363
+
364
+ ### The academic root: row polymorphism
365
+
366
+ **Row polymorphism** (Wand, 1987; Rémy, 1989; Cardelli & Mitchell,
367
+ 1991) is the formal mechanism for typing "records that may carry
368
+ additional fields beyond the ones I named." A *row variable* `ρ`
369
+ quantifies over the trailing fields of a record type:
370
+
371
+ > `{ name: String; age: Integer | ρ }` — "any record with at
372
+ > least these two fields; ρ is the rest."
373
+
374
+ Garrigue (1990s) extended the framework with **kinds**, letting
375
+ OCaml's polymorphic-record system distinguish "the class of
376
+ records carrying `name: String`" from "the class of records
377
+ carrying `length: Int`." OCaml's open object types
378
+ (`< get_name : string; .. >`) sit on this foundation.
379
+
380
+ **Matsumoto & Minamide (2008)** applied the Garrigue-kinded
381
+ framework directly to Ruby — 多相レコード型に基づくRuby
382
+ プログラムの型推論. The paper demonstrated that Ruby's
383
+ "duck typing" surface admits a row-polymorphic reading: a method
384
+ `def shout(x); x.upcase; end` infers as roughly
385
+ `∀α, ρ. {upcase: () -> α | ρ} -> α`. The inference algorithm
386
+ works, but the inferred types (together with the kind constraints
387
+ they drag along) are dense for everyday Ruby code, where users
388
+ overwhelmingly reason in nominal classes rather than structural
389
+ rows.
390
+
391
+ The [Rigor-perspective review of the
392
+ paper](../notes/20260518-matsumoto-2008-poly-records-rigor-review.md)
393
+ records the retrospective: the experiment *worked* but it
394
+ **retroactively justified Rigor's nominal-first design** rather
395
+ than recommending row variables as the primary modeling tool.
396
+ Rigor's carrier zoo treats nominal classes as the unit of
397
+ modelling, with structural shapes as inference-precision fallbacks.
398
+
399
+ ### The industrial lineage: Hack → Psalm/PHPStan
400
+
401
+ In parallel with the academic line, the practical "typed
402
+ dictionary" trajectory went a different way. Facebook's
403
+ [**Hack `shape(...)`**](https://docs.hhvm.com/hack/built-in-types/shape/)
404
+ introduced first-class shape types as part of the migration story
405
+ from dynamic PHP arrays to a typed surface:
406
+
407
+ - Per-key typing — `shape('name' => string, 'age' => int)`.
408
+ - Optional keys via `?'key' => T`.
409
+ - Closed by default; `...` opens the shape to additional keys.
410
+
411
+ **Psalm** and **PHPStan** adopted the same idea under the PHPDoc
412
+ syntax `array{name: string, age: int}`, with one important
413
+ emphasis flipped: the shape is *inferred* from the literal at the
414
+ use site rather than declared up front. TypeScript's object-type
415
+ literal `{ name: string; age: number }` is the same idea under
416
+ different syntax, with structural subtyping turned on by default.
417
+
418
+ The industrial design **deliberately avoids row variables**.
419
+ There is no `array{name: string, ...ρ}` quantified over the
420
+ trailing keys; every shape is closed (or fully open) with no
421
+ quantifiable in-between. The price is loss of full
422
+ row-polymorphic expressiveness; the benefit is tractable inference
423
+ and *readable* inferred types.
424
+
425
+ ### HashShape's position
426
+
427
+ Rigor's `HashShape{...}` sits squarely in the Hack / Psalm
428
+ lineage rather than the row-polymorphic one:
429
+
430
+ | Property | Row polymorphism | Hack `shape(...)` | Psalm `array{...}` | Rigor `HashShape{...}` |
431
+ | --- | --- | --- | --- | --- |
432
+ | Per-key typing | Yes | Yes | Yes | Yes |
433
+ | Optional keys | Yes (via row constraints) | Yes (`?'k'`) | Yes (`?k:`) | Yes (open/closed flag internally) |
434
+ | Row variables quantifiable by the user | **Yes** | No | No | **No** |
435
+ | Inferred from literals | (Inference is global) | No — user-declared | Yes (per call site) | **Yes** — built-in for hash literals |
436
+ | Primary modelling vehicle for users | Yes (in ML-family languages that adopt it) | Yes (idiomatic Hack) | Sometimes (alongside classes) | **No** — nominal classes are primary; HashShape is the inference-precision fallback |
437
+
438
+ Two specific choices stand out:
439
+
440
+ 1. **No row variables at the user surface.** Like Hack and Psalm,
441
+ Rigor does not let the user write
442
+ `HashShape{name: String, *rest}` quantified over the trailing
443
+ keys. Internally `HashShape` carries an open/closed flag, so
444
+ the analyser can still answer "is this set of keys finite?",
445
+ but the type language has no `ρ`. This is the same trade Hack
446
+ made: more readable inferred types and tractable inference, at
447
+ the cost of full row-polymorphic expressivity.
448
+ 2. **Inferred, not declared.** Where Hack expects the user to
449
+ write `shape(...)` explicitly, Rigor produces `HashShape`
450
+ automatically from hash literals. The common Ruby-author
451
+ experience is "I wrote `{a: 1, b: 'x'}` and Rigor reported
452
+ `HashShape{a: Constant<1>, b: Constant<"x">}`," not "I
453
+ declared a shape type and Rigor checked my literal against
454
+ it." This matches the Psalm / PHPStan emphasis more closely
455
+ than Hack's declaration-first design.
456
+
457
+ The combination (inferred-from-literal + Hack/Psalm-shaped
458
+ surface + nominal-first ecosystem) makes `HashShape` a
459
+ **precision carrier** (§ "Beyond pure inference") rather than a
460
+ *modelling primitive*. It exists to sharpen the type of a hash
461
+ literal that would otherwise widen to
462
+ `Hash[Symbol, A | B | C]`. It is not the unit a Rigor user
463
+ reaches for to describe a domain object; that role belongs to
464
+ `class User; end` plus the surrounding RBS, exactly as the
465
+ Matsumoto retrospective recommends.
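
What a per-key shape preserves, compared with a widened `Hash[K, V]`, can be approximated at runtime using classes in place of types. This is only an analogy to the inference described above; `per_key_types` and `widened_types` are hypothetical helpers.

```ruby
# Per-key view: keep one "type" (here, a class) per key, as a shape would.
def per_key_types(hash)
  hash.transform_values(&:class)
end

# Widened view: one joined key type and one joined value type for the hash.
def widened_types(hash)
  {
    keys: hash.keys.map(&:class).uniq,
    values: hash.values.map(&:class).uniq
  }
end

person = { name: "Alice", age: 30 }
shape  = per_key_types(person)
joined = widened_types(person)
```

The `shape` result still records that `:age` maps to an `Integer`, whereas `joined` only knows that values are `String` or `Integer`. That lost per-key information is what a downstream caller needs.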
466
+
467
+ ### `Tuple` and the same lineage
468
+
469
+ `Tuple[A, B, C]` is the array analogue, and the same lineage
470
+ applies — TypeScript's `[A, B, C]`, Hack's `tuple(A, B, C)`,
471
+ Psalm/PHPStan's `array{0: A, 1: B, 2: C}` shorthand. The
472
+ motivation is identical: a literal `[1, "a", :sym]` carries
473
+ per-index information that the `Array[Integer | String | Symbol]`
474
+ join discards.
475
+
476
+ ### Why not full row polymorphism in Rigor?
477
+
478
+ The temptation to surface row variables for users who want them
479
+ is real, and the question is open at the ADR level. The reasons
480
+ it has not landed at the user surface in v0.1.x:
481
+
482
+ - **Inference cost.** Garrigue-kinded inference is decidable but
483
+ more expensive than Rigor's local walker; the analyser's
484
+ per-file budget (see
485
+ [`inference-budgets.md`](../type-specification/inference-budgets.md))
486
+ would have to accommodate global row-constraint solving.
487
+ - **Readability.** The Matsumoto experiment found that inferred
488
+ row-polymorphic types for everyday Ruby code are dense and hard
489
+ to skim. Rigor's no-false-positives stance amplifies the
490
+ problem, since it makes the inferred type a thing the user
491
+ reads in `rigor annotate`.
492
+ - **Empirical demand.** Hash literals in real Ruby code are
493
+ typically per-call ad-hoc dictionaries, not polymorphic-record
494
+ values flowing through multiple operations. The closed-or-open
495
+ per-call structural type matches the observed use; the row
496
+ quantification rarely earns its complexity.
497
+
498
+ If row variables ever become needed (for a typed
499
+ `merge` / `transform_keys` / `slice` story that benefits from
500
+ quantifying over rows), the question opens through an ADR rather
501
+ than at the user surface by default.
502
+
503
+ ## Variance
504
+
505
+ RBS (and therefore Rigor) inherits the standard variance
506
+ vocabulary for generic parameters:
507
+
508
+ - **Covariant (`out T`)** — `Foo[Sub] <: Foo[Sup]` when
509
+ `Sub <: Sup`. Producer position.
510
+ - **Contravariant (`in T`)** — `Foo[Sup] <: Foo[Sub]` when
511
+ `Sub <: Sup`. Consumer position.
512
+ - **Invariant (default)** — neither.
513
+
514
+ Soundness calls for mutable containers (`Array`, `Hash`, `Set`) to be
515
+ invariant in their element type; the cautionary tale of Java's
516
+ covariant arrays applies. RBS's core signatures nevertheless declare
517
+ these parameters `unchecked out`; Rigor honours those declarations.
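
The covariant-arrays cautionary tale can be replayed in plain Ruby. This is a minimal sketch with a hypothetical helper, `append_half`, imagined as accepting an `Array[Numeric]`.

```ruby
ints = [1, 2, 3] # intended element type: Integer

# Imagine this parameter is declared Array[Numeric]. Under covariance an
# Array[Integer] is accepted, and pushing a Float is legal for the callee.
def append_half(nums)
  nums << 0.5
end

append_half(ints)

# The caller's "array of integers" now holds a Float.
all_integers = ints.all?(Integer)
```

After the call `all_integers` is false: the write through the wider view broke the narrower view's invariant, which is why a mutable container is only sound when its element parameter is invariant.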
518
+
519
+ ## Refinement types and predicate subtyping
520
+
521
+ A **refinement type** restricts a base type by a predicate: in
522
+ Liquid Types / SMT-driven systems this is written as
523
+ `{x: Int | x > 0}`. Rigor exposes a curated catalogue of
524
+ refinements with reserved names:
525
+
526
+ | Refinement | Predicate (informally) | Carrier |
527
+ | --- | --- | --- |
528
+ | `non-empty-string` | `s : String, s.size >= 1` | refinement on `String` |
529
+ | `numeric-string` | `s : String, s =~ /\A[+-]?\d+(\.\d+)?\z/` | refinement on `String` |
530
+ | `literal-string` | "provably built from literals" | refinement on `String` |
531
+ | `int<min, max>` | `n : Integer, min <= n <= max` | range carrier |
532
+ | `non-zero-int` | `n : Integer, n != 0` | refinement on `Integer` |
533
+ | `positive-int` | `n : Integer, n > 0` | refinement on `Integer` |
534
+ | `non-empty-array[T]` | `arr : Array[T], arr.size >= 1` | refinement on `Array[T]` |
535
+ | `non-empty-hash[K, V]` | `h : Hash[K, V], h.size >= 1` | refinement on `Hash[K, V]` |
536
+
537
+ The refinements compose with subtyping the way you would expect:
538
+ `positive-int <: non-zero-int <: Integer <: Numeric`.
539
+ **Rigor narrows into refinement carriers automatically** when the
540
+ control-flow analysis proves the predicate:
541
+
542
+ ```ruby
543
+ def length_of(s)
544
+ return 0 if s.empty?
545
+ s.size # at this program point: s : non-empty-string
546
+ end
547
+ ```
548
+
549
+ This is the practical payoff of refinement subtyping without
550
+ asking the user to author the refinement.
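
As a rough runtime reading of the table, the informal predicates can be written as ordinary Ruby methods. The method names below are hypothetical helpers for illustration, not part of Rigor; the regular expression is the one quoted for `numeric-string`:

```ruby
# Predicate for `numeric-string`, copied from the table above.
NUMERIC_STRING = /\A[+-]?\d+(\.\d+)?\z/

def numeric_string?(value)
  value.is_a?(String) && value.match?(NUMERIC_STRING)
end

def non_zero_int?(value)
  value.is_a?(Integer) && value != 0
end

def positive_int?(value)
  value.is_a?(Integer) && value > 0
end
```

Every value accepted by `positive_int?` is also accepted by `non_zero_int?`, which is the runtime shadow of `positive-int <: non-zero-int`.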
551
+
552
+ Spec: [`docs/type-specification/imported-built-in-types.md`](../type-specification/imported-built-in-types.md),
553
+ [`docs/type-specification/rigor-extensions.md`](../type-specification/rigor-extensions.md).
554
+
555
+ ## Occurrence typing (flow-sensitive narrowing)
556
+
557
+ The technical term for "`if x.is_a?(String)` makes `x : String`
558
+ inside the branch" is **occurrence typing** (Tobin-Hochstadt &
559
+ Felleisen, 2008). TypeScript calls it *narrowing*; mypy calls it
560
+ *type guards*. The underlying mechanism is the same: the type
561
+ checker walks the control-flow graph and refines each variable
562
+ along the edges where a predicate must have held.
563
+
564
+ Rigor implements occurrence typing as **edge-aware narrowing** with
565
+ a few extensions specific to Ruby:
566
+
567
+ - Standard predicates: `is_a?`, `kind_of?`, `instance_of?`,
568
+ `respond_to?`, `nil?`, `==`, `===`, `frozen?`, `empty?`,
569
+ comparison operators.
570
+ - Pattern matching: `case x; in pattern` narrows along the
571
+ matched branch.
572
+ - Equality semantics are split into structural and reference
573
+ equality where Ruby distinguishes them.
574
+ - Mutation effects on a narrowed variable invalidate the
575
+ narrowing at the next read — *fact stability*.
576
+ - User-extended predicates via the `predicate-if-true` /
577
+ `predicate-if-false` directives (the analogue of TypeScript's
578
+ `x is Foo` type guards).
579
+
580
+ ```ruby
581
+ def describe(x)
582
+ if x.is_a?(String)
583
+ # x : String here
584
+ x.upcase
585
+ elsif x.nil?
586
+ "(nil)"
587
+ else
588
+ # x : Top - String - nil here (everything else narrowed out)
589
+ x.inspect
590
+ end
591
+ end
592
+ ```
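
The *fact stability* bullet can be illustrated with a small sketch. The method is hypothetical, and the exact diagnostic Rigor would report (if any) is not stated in this text; the example only shows why a narrowing must not survive a mutation:

```ruby
def first_word_upcased(words)
  return "" if words.empty?
  # At this point a checker may treat `words` as a non-empty array.
  words.clear          # mutation: the "non-empty" fact no longer holds
  words.first.upcase   # `first` is now nil, so this raises NoMethodError
end
```

If the narrowing were kept after `words.clear`, the last line would look safe even though it fails for every non-empty input.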
593
+
594
+ Spec: [`docs/type-specification/control-flow-analysis.md`](../type-specification/control-flow-analysis.md),
595
+ [`docs/type-specification/rbs-extended.md`](../type-specification/rbs-extended.md).
596
+
597
+ ## Pattern matching and exhaustiveness
598
+
599
+ The previous section noted that Rigor narrows along
600
+ `case x; in pattern` branches the same way it narrows along
601
+ `if x.is_a?(...)`. There is a related but distinct property that
602
+ pattern matching enables in type systems built around **algebraic
603
+ data types** (ADTs) or **tagged unions**: **exhaustiveness
604
+ checking**.
605
+
606
+ A `case` that does not cover every value the scrutinee can take
607
+ *should*, under exhaustiveness, be a type error rather than a
608
+ runtime fallthrough. The compiler verifies that for every
609
+ possible shape of the scrutinee, some arm matches.
610
+
611
+ ### Academic root
612
+
613
+ **Maranget 2007** (*Warnings for Pattern Matching*) gave the
614
+ algorithm OCaml uses to compute pattern-matching warnings —
615
+ non-exhaustive matches and redundant arms. The broader topic
616
+ sits in the ML / Haskell lineage where exhaustiveness has been
617
+ load-bearing since the late 1970s.
618
+
619
+ ### Industrial uptake
620
+
621
+ - **OCaml**: emits warnings for non-exhaustive matches; can be
622
+ turned into hard errors via `-warn-error +8` or per-pattern
623
+ attributes.
624
+ - **Rust**: requires exhaustiveness on every `match` (`enum`
625
+ scrutinees included); non-exhaustive matches are *compile* errors.
626
+ - **Scala**: warns on non-exhaustive `match`; raises
627
+ `MatchError` at runtime if unmatched.
628
+ - **TypeScript**: simulates exhaustiveness via the "exhaustive
629
+ `never` check" idiom — assigning the scrutinee to a `never`-
630
+ typed variable in the default branch fails to type-check if
631
+ any case was missed.
632
+ - **Sorbet**: `T.absurd(x)` checks that every case of a union
633
+ has been narrowed away by the point of the call.
634
+
635
+ ### Ruby and Rigor's position
636
+
637
+ Ruby does not check exhaustiveness statically: an unmatched
637
+ scrutinee falls through a `case/when` without an `else` (the
638
+ expression returns `nil`) and raises `NoMatchingPatternError`
639
+ from a `case/in` without an `else`. Rigor inherits the
641
+ language behaviour:
642
+
643
+ - Occurrence-typing narrowing through `case/in` is implemented
644
+ and load-bearing for downstream precision.
645
+ - Exhaustiveness checking is **NOT** implemented in v0.1.x.
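
For reference, the runtime behaviour of the two `case` forms can be observed in plain Ruby (3.0 or later). The method names are illustrative only:

```ruby
# `case/when` without `else`: an unmatched value yields nil.
def classify_when(shape)
  case shape
  when :circle then "round"
  when :square then "boxy"
  end
end

# `case/in` without `else`: an unmatched value raises NoMatchingPatternError.
def classify_in(shape)
  case shape
  in :circle then "round"
  in :square then "boxy"
  end
end
```

`classify_when(:triangle)` returns `nil`, while `classify_in(:triangle)` raises; neither is reported before the program runs.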
646
+
647
+ The choice is consistent with Rigor's no-false-positives stance.
648
+ A `pattern.non-exhaustive` diagnostic would fire on:
649
+
650
+ 1. A scrutinee inferred as a union whose arms were not all
651
+ matched — but the inferred union itself may be approximate
652
+ (`Dynamic[T]` widening, capability-role narrowing, plugin
653
+ contributions), so a reported "missing arm" could be an artefact
654
+ of the approximation rather than a real gap.
655
+ 2. A scrutinee whose "exhaustive set" is open-ended at runtime —
656
+ open class hierarchies, user-defined `===`, monkey-patched
657
+ `kind_of?`. These shapes are common in Ruby and rarely typed
658
+ precisely enough for an exhaustiveness check to be reliable.
659
+ 3. A developer deliberately relying on the fall-through return
660
+ of `nil`. Idiomatic in some Ruby styles.
661
+
662
+ The false-positive surface is uncomfortably large for a language
663
+ without ADTs. A `pattern.non-exhaustive` diagnostic is a future
664
+ direction (no committed milestone). Users wanting exhaustiveness
665
+ *today* can replicate the TypeScript / Sorbet idiom — call a
666
+ method declared to take `Bot` in the default branch — and
667
+ Rigor's narrowing will surface the missed arm via the
668
+ `call.argument-type-mismatch` diagnostic that already exists.
669
+
670
+ ```ruby
671
+ # Self-rolled exhaustiveness via the Bot-parameter idiom
672
+ def unreachable(x)
673
+ raise "unreachable: #{x.inspect}"
674
+ end
675
+ # RBS: def unreachable: (bot) -> bot
676
+
677
+ case shape
678
+ in :circle then ...
679
+ in :square then ...
680
+ # missing :triangle
681
+ else unreachable(shape)
682
+ # If `shape` could still be :triangle here, the
683
+ # `unreachable` call's argument type mismatches `bot`
684
+ # and call.argument-type-mismatch fires.
685
+ end
686
+ ```
687
+
688
+ This is not as ergonomic as a first-class
689
+ `pattern.non-exhaustive`, but it is sound under Rigor's
690
+ no-false-positives discipline and works today.
691
+
692
+ ## Gradual typing
693
+
694
+ Gradual typing (Siek & Taha, 2006; Garcia, Clark & Tanter, 2016)
695
+ is the discipline of letting statically-typed and
696
+ dynamically-typed code coexist in one program. The technical
697
+ machinery is:
698
+
699
+ 1. A distinguished "dynamic" type (`?` in the original paper).
700
+ 2. A *consistency* relation `~` that admits the dynamic type
701
+ anywhere a concrete type is expected (and vice versa) but
702
+ refuses to bridge two unrelated concrete types.
703
+ 3. Optional run-time casts at the static/dynamic boundary.
704
+
705
+ Rigor maps onto this as:
706
+
707
+ | Gradual concept | Rigor surface |
708
+ | --- | --- |
709
+ | Dynamic type `?` | **`Dynamic[T]`** — a carrier that *wraps* a "best-guess" type `T` while marking the value as not-statically-verified. `Dynamic[top]` is the maximally-dynamic form. |
710
+ | Consistency `~` | The `maybe` arm of the trinary certainty — `Dynamic[T] ~ U` holds whenever `T ~ U` does. |
711
+ | Static/dynamic boundary | Per-method, per-file, per-plugin contribution — Rigor records *why* a value became `Dynamic[T]` in its dynamic-origin algebra. |
712
+ | Casts | No in-source cast operator. The opt-in [`rigor-sorbet`](../../plugins/rigor-sorbet/) plugin reads `T.let` / `T.cast` / `T.must` as cast forms; `RBS::Extended` `assert_type` directives serve the same role from `.rbs`. |
713
+
714
+ Two Rigor-specific extensions matter:
715
+
716
+ 1. **`Dynamic[T]` is parameterised.** The original gradual-typing
717
+ paper has a single `?`; Rigor carries the "what we would
718
+ *guess* the type is if asked to commit" alongside the
719
+ uncertainty marker, so refactoring tools can offer better
720
+ suggestions.
721
+ 2. **The robustness principle (Postel's law for types)** —
722
+ parameters are accepted leniently (closer to `Dynamic[T]`),
723
+ returns are reported strictly. See
724
+ [ADR-5](../adr/5-robustness-principle.md).
725
+
726
+ Spec: [`docs/type-specification/special-types.md`](../type-specification/special-types.md),
727
+ [`docs/type-specification/value-lattice.md`](../type-specification/value-lattice.md).
728
+
729
+ ## Blame, the gradual guarantee, and trust boundaries
730
+
731
+ The previous section described `Dynamic[T]` and the consistency
732
+ relation `~` but stopped at the static side. The full
733
+ gradual-typing literature has a substantial run-time-and-policy
734
+ theory built on those static foundations. Rigor inherits part of
735
+ it; the rest is deliberately out of scope.
736
+
737
+ ### Blame
738
+
739
+ **Findler & Felleisen 2002** (*Contracts for Higher-Order
740
+ Functions*) introduced the **blame** principle: when a value
741
+ flows across a static / dynamic boundary and a contract violation
742
+ is detected at runtime, *whose code is at fault*? The answer must
743
+ be unambiguous and inferable from the boundary topology — a value
744
+ crossing from typed code into untyped code carries a positive
745
+ contract obligation; from untyped to typed, a negative one.
746
+
747
+ **Wadler & Findler 2009** (*Well-Typed Programs Can't Be Blamed*)
748
+ gave the slogan: a typed module that follows its declared
749
+ interface is never the cause of a blame error — only untyped code
750
+ (or a static / dynamic interface mismatch) can be.
751
+
752
+ ### The gradual guarantee
753
+
754
+ **Siek, Vitousek, Cimini & Boyland 2015** (*Refined Criteria for
755
+ Gradual Typing*) formalised the property a gradual type system is
756
+ most commonly *expected* to satisfy:
757
+
758
+ > Removing type annotations from a well-typed program does not
759
+ > introduce new errors. Adding annotations leaves behaviour
760
+ > unchanged unless a new annotation exposes a type error.
761
+
762
+ This is the **gradual guarantee**. It is the property that makes
763
+ gradual adoption psychologically viable: a developer adding an
764
+ RBS annotation to a working method should never break a
765
+ previously-passing call site, and removing an annotation should
766
+ never fire a new diagnostic.
767
+
768
+ ### Rigor's position
769
+
770
+ Rigor does not insert runtime contracts. Blame in the
771
+ Findler-Felleisen sense has no direct operational analogue —
772
+ Rigor is static-only, and a `Dynamic[T]`-to-concrete-`T` flow is
773
+ a static decision, not a runtime check that could "blame" anyone.
774
+
775
+ The **gradual guarantee** *is* a property Rigor can be measured
776
+ against:
777
+
778
+ - **In spirit**, the no-false-positives stance
779
+ ([ADR-5](../adr/5-robustness-principle.md)) is strictly
780
+ stronger than the gradual guarantee. If Rigor was silent on a
781
+ call site before an annotation was added, it remains silent
782
+ after — unless the annotation provably contradicts the runtime
783
+ behaviour, in which case the diagnostic fires on the annotation
784
+ rather than on the call. The asymmetry "strict on returns,
785
+ lenient on parameters" is calibrated to satisfy this property
786
+ by construction.
787
+ - **In practice**, the gradual guarantee in Rigor reads as: a
788
+ project's baseline of "passes without annotation" should never
789
+ regress when an RBS file is added. This is exactly the property
790
+ the [PHPStan-shaped baseline mechanism](../adr/22-baseline-and-project-onboarding.md)
791
+ enforces — adding annotations shrinks the baseline; it never
792
+ grows it on un-annotated code.
793
+
794
+ ### What Rigor explicitly does NOT do
795
+
796
+ - **Runtime contract insertion at the static / dynamic boundary.**
797
+ The opt-in [`rigor-sorbet`](../../plugins/rigor-sorbet/) plugin
798
+ reads Sorbet's `T.let` / `T.cast` / `T.must` as cast forms, but
799
+ the contract *enforcement* is `sorbet-runtime`'s job, not
800
+ Rigor's. Rigor's static analysis uses the cast as a hint, not
801
+ as a check.
802
+ - **Blame-tracking algebra.** Rigor's dynamic-origin tracking
803
+ records *why* a value became `Dynamic[T]` (which plugin / which
804
+ file / which boundary) and is consulted by refactoring tools,
805
+ but does not assign run-time fault. There is no positive /
806
+ negative contract obligation in Rigor's algebra.
807
+ - **Trust polarity at boundaries.** The "typed code is trusted;
808
+ untyped code is suspect" framing that the
809
+ Wadler-Findler-Greenberg lineage builds on is replaced in
810
+ Rigor by the simpler "we report only diagnostics we can
811
+ prove" framing — which removes the question of who is to
812
+ blame by removing the runtime decision point.
813
+
814
+ The gradual-typing trinity for Rigor: **consistency `~`** (the
815
+ static side, § "Gradual typing"); the **gradual guarantee** (the
816
+ migration story, this section); and **no runtime cost** (the
817
+ engineering stance — Rigor is a compile-time tool, not a
818
+ contract system).
819
+
820
+ ## Effect systems
821
+
822
+ A textbook **effect system** annotates each expression with two
823
+ things: a type *and* a set of effects (Lucassen & Gifford, 1988).
824
+ Effects include I/O, mutation, exceptions, divergence, allocation.
825
+
826
+ Rigor has an effect model but it lives **inside the engine**, not
827
+ at the user surface:
828
+
829
+ | Engine-internal effect | What it tracks | User-visible consequence |
830
+ | --- | --- | --- |
831
+ | Mutation | `arr << x`, `h[k] = v`, ivar writes | Narrowed facts about the mutated value are invalidated at its next read. |
832
+ | Exception / non-local exit | `raise`, `throw`, `return`, `break` | The branch contributes nothing to the join; methods that always raise return `Bot`. |
833
+ | Closure escape | A block stored or yielded outside its lexical scope | Narrowings inside the block are not exported to the outer scope. |
834
+
835
+ These effects are not part of an authored signature. They are
836
+ inferred from the AST walk and consulted by the narrowing logic.
837
+ Future plugin / annotation extensions to surface effects at the
838
+ user level are tracked in the spec corpus but not part of v0.1.x.
839
+
840
+ Spec: [`docs/type-specification/control-flow-analysis.md`](../type-specification/control-flow-analysis.md)
841
+ ("Mutation effects" subsection).
842
+
843
+ ## Soundness, completeness, and the no-false-positives stance
844
+
845
+ A static type system is:
846
+
847
+ - **Sound** when every program it accepts is free of the runtime
848
+ errors the type system is supposed to catch ("no false
849
+ negatives at runtime").
850
+ - **Complete** when every program free of those runtime errors is
851
+ accepted by the type system ("no false positives at
852
+ type-check time").
853
+
854
+ Rice's theorem implies you cannot have both in full generality.
855
+ Mainstream static type systems choose **sound but incomplete**
856
+ (Java, Haskell, Rust modulo unsafe). Rigor takes the opposite
857
+ default:
858
+
859
+ > Rigor only fires a diagnostic when it can **prove** the
860
+ > error. Cases it cannot decide are silent.
861
+
862
+ This is a deliberate design choice grounded in the project's
863
+ audience: Ruby programmers who would otherwise not run a type
864
+ checker at all. A noisy false-positive on the first day kills
865
+ adoption faster than a missed bug on day 30. The robustness
866
+ principle ([ADR-5](../adr/5-robustness-principle.md)) is the
867
+ formal expression of this stance: lenient on parameters
868
+ ("anyone could call this with anything"), strict on returns
869
+ ("we will commit to what we actually return").
870
+
871
+ The trade-offs to be aware of:
872
+
873
+ - **Rigor will miss bugs that a sound checker would catch.**
874
+ This is by design; the alternative adds more friction than
875
+ catching those bugs is worth.
876
+ - **The trinary certainty (`yes` / `no` / `maybe`)** is the
877
+ formal acknowledgement of incompleteness. Most checkers
878
+ collapse to binary; Rigor preserves the third arm because
879
+ it's the arm that earns silence.
880
+ - **`Dynamic[T]` is not a failure mode** in Rigor's model. It is
881
+ a first-class carrier with full algebraic identity.
882
+
883
+ ## Decidability of inference
884
+
885
+ A type system's *expressive power* and the *decidability of
886
+ inferring its types* pull in opposite directions. Adding the wrong
887
+ combination of features can push inference into undecidable
888
+ territory — equivalent in difficulty to the halting problem.
889
+ Language designers therefore pick a fragment that is decidable for
890
+ inference and require annotations for anything beyond.
891
+
892
+ The most accessible survey of this landscape, written in
893
+ Japanese, is 水野雅之「計算機に推論できる型、できない型」("Types a computer can infer, and types it cannot";
894
+ Wantedly Advent Calendar, 2021; see the reading list below). The
895
+ key results it walks through, in terms of where Rigor sits:
896
+
897
+ | Feature | Inference status | Rigor stance |
898
+ | --- | --- | --- |
899
+ | Let-polymorphism (Hindley–Milner) | Decidable; ~linear in practice | Not Rigor's strategy. Rigor is gradual, not HM-based — RBS generics resolve by walking call sites and consulting signatures, not by global unification. |
900
+ | Higher-rank polymorphism, Rank-2 | Decidable with annotations (Kfoury & Wells, 1994) | Not exposed at the user surface. RBS generics are predicative. |
901
+ | Higher-rank polymorphism, Rank-3+ | **Undecidable** (Wells, 1999) | Not exposed. Would force annotations wherever a polymorphic value flows. |
902
+ | Polymorphic recursion | **Undecidable** (Henglein, 1993) | Not exposed. A generic method body sees its type parameter as fixed at the call site — recursive calls do not re-instantiate it. |
903
+ | Recursive types as inference targets | Decidable for equi/iso-recursive forms, but most languages exclude them from inference | RBS type aliases are nominal — recursive shapes (a tree, a JSON value) live behind a name. Rigor does not synthesise an anonymous fixed-point type during inference. The OCaml cautionary example `let f g x = x x` is well-typed under unrestricted recursive types — exactly the kind of "accepted but unwanted" judgment that motivates the exclusion. |
904
+ | Subtyping + intersection types (full) | **Undecidable in general** | Rigor exposes both `<:` and `&` (meet). Instead of restricting the language to recover decidability, it trades completeness for the trinary certainty — the `maybe` arm is what closes the gap. |
905
+
906
+ ### Rigor's pragmatic response: the third arm
907
+
908
+ A textbook sound type checker has two ways to react when inference
909
+ cannot decide:
910
+
911
+ 1. **Restrict the language** — give up the offending feature (HM
912
+ gives up rank-N polymorphism to keep inference total).
913
+ 2. **Demand annotations** — push the burden onto the author
914
+ (System F makes the user write `Λα.` themselves).
915
+
916
+ Rigor's no-false-positives stance enables a third route, available
917
+ only in the gradual setting:
918
+
919
+ > When inference cannot decide, return `maybe` and stay silent.
920
+
921
+ The `maybe` arm of the trinary certainty is therefore not only an
922
+ acknowledgement of *runtime* uncertainty (the gradual concern of
923
+ § "Gradual typing"); it is also the formal acknowledgement that
924
+ the static system is *deliberately incomplete in the inferability
925
+ sense*. The two incompletenesses share one representation in
926
+ Rigor's algebra because the practical answer in both cases is the
927
+ same: do not fire a diagnostic the system cannot justify.
928
+
929
+ ```ruby
930
+ # A call where deciding the subtyping-with-intersection constraint
931
+ # would require global, undecidable inference. Rigor returns
932
+ # `maybe` and emits no diagnostic.
933
+ def consume(x)
934
+ x.frobnicate if x.respond_to?(:frobnicate)
935
+ end
936
+ consume(some_value_from_a_dynamic_source) # certainty: maybe — silent
937
+ ```
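
The policy "only a proven `no` produces a diagnostic" can be modelled in a few lines. This is a toy sketch under stated assumptions: `responds?` and `diagnostics_for_call` are hypothetical names, not Rigor's API, and `known_methods` stands in for whatever the analyser knows about the receiver (`nil` meaning the type is unknown):

```ruby
# Three-valued answer to "does the receiver respond to `name`?"
def responds?(known_methods, name)
  return :maybe if known_methods.nil?            # receiver type is unknown
  known_methods.include?(name) ? :yes : :no
end

# Only a proven :no yields a diagnostic; :yes and :maybe stay silent.
def diagnostics_for_call(known_methods, name)
  return [] unless responds?(known_methods, name) == :no
  ["undefined method #{name}"]
end
```

A binary checker would have to fold `:maybe` into one of the other two answers; keeping it separate is what lets the undecided case stay silent.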
938
+
939
+ This stance also explains a recurring shape in Rigor's design:
940
+ when a feature would only be addable at the cost of global,
941
+ inference-time blow-up (closed row variables, first-class
942
+ higher-rank polymorphism, full GADT-style constructor-driven
943
+ narrowing), Rigor either ships a nominal substitute (capability
944
+ roles for row polymorphism, `interface` for existentials) or
945
+ defers the feature behind an ADR rather than degrade to a noisy
946
+ approximation.
947
+
948
+ ## Hindley–Milner, principal types, and Rigor's inference architecture
949
+
950
+ The previous two sections discussed **soundness** (does the
951
+ system accept only programs that really cannot crash?) and
952
+ **decidability** (does the system always give an answer in finite
953
+ time?). Type-theory textbooks bundle these with a third property
954
+ that the appendix has not named so far:
955
+
956
+ - **Principal type property** — every well-typed expression has a
957
+ *most general* type, of which every other valid typing is a
958
+ substitution-instance. In a system with the principal type
959
+ property, "the type of `e`" is a canonical, unambiguous answer
960
+ — not a guess among many.
961
+
962
+ These three properties interact in a way worth understanding,
963
+ because **Hindley–Milner (HM)** — the type system underlying ML,
964
+ OCaml, and Haskell — is the canonical example of having all three
965
+ at once.
966
+
967
+ ### What HM achieves and what it gives up
968
+
969
+ The classical Damas–Milner theorem (1982) is roughly:
970
+
971
+ > Every term typable in HM has a unique principal type, computable
972
+ > by unification (Algorithm W). The system is sound, decidable,
973
+ > and inference is "free" — the user writes no type annotations.
974
+
975
+ The cost is structural. HM accepts only a language without:
976
+
977
+ - rank-N polymorphism beyond let-bound generalisation;
978
+ - subtyping;
979
+ - intersection types;
980
+ - unrestricted recursive types at the user surface;
981
+ - polymorphic recursion.
982
+
983
+ Each excluded feature is exactly the kind that breaks one of the
984
+ three properties when added back:
985
+
986
+ | Added feature | Property that breaks first |
987
+ | --- | --- |
988
+ | Rank-3+ polymorphism | Decidability (Wells, 1999) |
989
+ | Polymorphic recursion | Decidability (Henglein, 1993) |
990
+ | Subtyping in general | Principal types (a value can satisfy several incomparable interfaces; "most general" stops being unique) |
991
+ | Subtyping + intersection (full) | Decidability |
992
+ | Unrestricted recursive types | The "principle of least surprise" — terms like `λx. x x` become well-typed |
993
+
994
+ ### Why Rigor cannot be HM
995
+
996
+ Rigor's surface **already contains the features HM excludes**.
997
+ Subtyping is the foundation of the lattice (`<:`); intersection
998
+ (`&`) is in the algebra; refinements add predicate subtyping;
999
+ generics + occurrence typing + capability roles cover the
1000
+ polymorphism uses Ruby programmers have. An HM-style
1001
+ "infer a principal type for every expression by global
1002
+ unification" architecture is therefore not available to Rigor in
1003
+ principle. This is a structural consequence of the type language
1004
+ Ruby authors expect, not a missing feature.
1005
+
1006
+ Rigor's inference is instead **local and walker-driven**:
1007
+
1008
+ - The walker descends the AST once.
1009
+ - At each expression site it consults RBS signatures, narrowing
1010
+ facts, mutation effects, and plugin contributions.
1011
+ - It returns *the* type of the expression *at that point in the
1012
+ control flow* — the most specific type the local walk can
1013
+ justify, not a canonically most-general one.
1014
+
1015
+ The same expression appearing at two program points may yield two
1016
+ different types (narrowing, flow merges, mutation, plugin
1017
+ contributions can all enter). This is closer in spirit to
1018
+ TypeScript's contextual / flow-sensitive typing than to HM's
1019
+ unification, and it matches how Ruby authors reason about
1020
+ their code: `arr` after `arr.compact!` is not "the same type" as
1021
+ `arr` before it.
1022
+
1023
+ ### Property ledger
1024
+
1025
+ The three properties laid against Rigor and HM:
1026
+
1027
+ | Property | Hindley–Milner | Rigor |
1028
+ | --- | --- | --- |
1029
+ | **Soundness** | Yes | **No, by design** — `maybe` cases stay silent (§ "Soundness, completeness, and the no-false-positives stance"). |
1030
+ | **Decidability** | Yes (DEXPTIME worst-case, near-linear in practice) | Decidable per local walk; whatever the walker cannot decide, it returns `maybe` (§ "Decidability of inference"). |
1031
+ | **Principal type property** | Yes | **No** — subtyping + intersection break it. Rigor reports a *per-occurrence* type, not a canonical most-general one. |
1032
+
1033
+ HM trades expressiveness for the
1034
+ trinity (soundness + decidability + principal types). Rigor
1035
+ trades the trinity for expressiveness, and recovers what it can
1036
+ through the trinary certainty and the no-false-positives stance.
1037
+
1038
+ ### A note on bidirectional / local type inference
1039
+
1040
+ Once subtyping enters the picture, the textbook fallback for
1041
+ HM-style global unification is **bidirectional** or **local type
1042
+ inference** (Pierce & Turner, 2000): split typing rules into a
1043
+ *synthesis* mode (compute the type of `e` from `e`) and a
1044
+ *checking* mode (verify that `e` has an expected type). Steep is
1045
+ in this lineage. Rigor's walker is bidirectional in this informal
1046
+ sense (call sites synthesise; RBS signatures check the synthesised
1047
+ argument types against declared parameter types), but Rigor does not
1048
+ formalise the bidirectional rules because the gradual setting and
1049
+ the trinary certainty make the "could not decide" case explicit
1050
+ rather than a typing-rule failure.
1051
+
1052
+ The next section makes this informal claim concrete.
1053
+
1054
+ ## Bidirectional type checking
1055
+
1056
+ The HM section noted in passing that Rigor's walker is
1057
+ bidirectional in an informal sense only. That is the
1058
+ short version of the relationship between Rigor and a substantial
1059
+ contemporary line of type-system work, and worth its own
1060
+ treatment.
1061
+
1062
+ ### The synthesis / checking split
1063
+
1064
+ **Pierce & Turner 2000** (*Local Type Inference*) and the modern
1065
+ canonical reformulation by **Dunfield & Krishnaswami 2013**
1066
+ (*Complete and Easy Bidirectional Typechecking for Higher-Rank
1067
+ Polymorphism*) split typing judgments into two modes:
1068
+
1069
+ - **Synthesis** (`Γ ⊢ e ⇒ T`): given an expression `e`, compute
1070
+ its type `T`. The type flows out.
1071
+ - **Checking** (`Γ ⊢ e ⇐ T`): given `e` and an expected type
1072
+ `T`, verify that `e` has type `T`. The type flows in.
1073
+
1074
+ Every typing rule is one or the other. The two modes alternate
1075
+ top-down (checking propagates expected types) and bottom-up
1076
+ (synthesis returns types) through the AST, replacing the global
1077
+ unification of HM with *local* constraint discharge. The same
1078
+ machine handles subtyping, intersections, and higher-rank
1079
+ polymorphism without needing a global solver.
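
The two modes can be sketched as a toy checker over a tiny expression language. Everything below is an illustrative assumption (the array-based expression encoding, the type symbols, the method names); it is not Rigor's walker or Steep's implementation, and it needs Ruby 3.0+ for pattern matching:

```ruby
# Expressions: 42, "hi", [:list, e1, e2, ...], [:ann, expr, type]
# Types:       :Integer, :String, [:Array, elem_type]

# Synthesis (e => T): the type flows out of the expression.
def synth(expr)
  case expr
  in Integer then :Integer
  in String  then :String
  in [:ann, inner, type]
    # An annotation switches to checking mode for the inner expression.
    raise TypeError, "annotation #{type.inspect} rejected" unless check(inner, type)
    type
  else
    raise ArgumentError, "cannot synthesise a type for #{expr.inspect}"
  end
end

# Checking (e <= T): the expected type flows into the expression.
def check(expr, expected)
  case [expr, expected]
  in [[:list, *elems], [:Array, elem_type]]
    # The expected element type is pushed down into every element.
    elems.all? { |elem| check(elem, elem_type) }
  else
    # Fallback: synthesise, then compare with the expectation.
    synth(expr) == expected
  end
end
```

In this sketch an empty list has no synthesis rule, yet `check([:list], [:Array, :String])` succeeds, because the expected type supplies the information the expression lacks.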
1080
+
1081
+ ### Industrial uptake
1082
+
1083
+ - **Steep** is explicitly bidirectional and the closest Ruby
1084
+ analogue — its rules are authored in `⇒` / `⇐` form.
1085
+ - **TypeScript**'s "contextual typing" is bidirectional in
1086
+ everything but the name.
1087
+ - **Scala 3**'s match types and contextual typing.
1088
+ - **OCaml** uses local type inference (Pierce & Turner directly)
1089
+ for higher-rank-polymorphic positions where HM cannot decide.
1090
+ - **Roc**, **ReScript**, **Idris 2** — all bidirectional in the
1091
+ modern Dunfield-Krishnaswami style.
1092
+
1093
+ ### Rigor's bidirectional behaviour, informal
1094
+
1095
+ Rigor's walker performs the two modes without naming them:
1096
+
1097
+ | Walker behaviour | Bidirectional mode |
1098
+ | --- | --- |
1099
+ | Computing the type of an argument expression at a call site | Synthesis |
1100
+ | Verifying the argument type against the RBS parameter type | Checking |
1101
+ | Inferring `HashShape` for a hash literal | Synthesis |
1102
+ | Inferring `Tuple` for an array literal | Synthesis |
1103
+ | Walking the `then` / `else` branches of an `if` under a narrowed environment | Synthesis under each branch + join (no expected-type checking) |
1104
+ | Verifying a plugin protocol contract ([ADR-28](../adr/28-path-scoped-protocol-contracts.md)) — method exists + return-type matches | Checking |
1105
+ | Honouring an `RBS::Extended` `assert_type` directive | Checking |
1106
+
1107
+ What Rigor does NOT do that a fully formalised bidirectional
1108
+ system would:
1109
+
1110
+ - **Constraint propagation across non-adjacent expressions.**
1111
+ Each expression's type is decided when the walker reaches it;
1112
+ later uses see the decision as a fixed fact, not as a
1113
+ constraint to be solved together with theirs.
1114
+ - **Local generalisation.** HM's `let`-binding generalisation
1115
+ step does not exist in Rigor; the walker does not introduce
1116
+ fresh type variables to be solved later.
1117
+ - **Formal mode discipline.** Rigor's rules are not authored as
1118
+ `⇒` / `⇐` judgments; the walker's behaviour matches a
1119
+ bidirectional reading but the spec does not enforce it.
1120
+
1121
+ The practical consequence: Rigor's inference is faster than a
1122
+ constraint-based bidirectional system (no global solving) and
1123
+ gives a definite "what type is this expression?" answer at every
1124
+ point — at the cost of not being able to defer typing decisions,
1125
+ which a more constraint-based system would allow when context
1126
+ arrives later in the walk.
1127
+
1128
+ For a reader who has internalised the bidirectional literature,
1129
+ the mental model is: **synthesis everywhere except at the RBS /
1130
+ plugin-contribution boundary, where the declared type is the
1131
+ check side.** That is the entire bidirectional discipline Rigor
1132
+ needs.
1133
+
1134
+ ## Beyond pure inference: reach and precision
1135
+
1136
+ The previous sections framed "what cannot be statically inferred"
1137
+ in terms of theoretical decidability — `maybe` as the response
1138
+ when no proof is available. That covers one half of the design
1139
+ space. There is a second half that the reading-order of this
1140
+ appendix has implied but not yet named: phenomena that are *not*
1141
+ theoretically undecidable, where the information a pure
1142
+ AST-walking inference needs either (a) is not present in the
1143
+ AST at all, or (b) is present but yields a type too wide
1144
+ to be useful.
1145
+
1146
+ Both halves are addressed by the same substrate — `RBS::Extended`
1147
+ directives, plugin contributions, the specialised carrier zoo —
1148
+ but for different reasons. It is worth giving them separate
1149
+ names.
1150
+
1151
+ ### Reach: the AST does not describe the program
1152
+
1153
+ The walker reads the AST. For a Ruby program, the AST is not a
1154
+ complete description of the program's runtime behaviour:
1155
+
1156
+ - `define_method` synthesises methods whose names are computed at
1157
+ evaluation time.
1158
+ - `attr_accessor :name` defines `#name` / `#name=` whose existence
1159
+ the walker recognises by pattern, not by general reasoning.
1160
+ - `class_eval` / `instance_eval` over a block injects code under
1161
+ a different `self`.
1162
+ - DSL forms like `has_many :posts` or
1163
+ `attribute :name, Types::String` declare *both* a method and a
1164
+ type contract through a single helper call.
1165
+ - `eval(string)` with an arbitrary string is genuinely outside
1166
+ the AST.
1167
+
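The first three bullets can be seen in a few lines of plain Ruby. This is only a sketch: the class `Report` and its methods are invented for illustration, and the point is what the source text does and does not contain, not how Rigor handles it.

```ruby
class Report
  # Recognisable by pattern: defines #title and #title=.
  attr_accessor :title

  # The method names are computed while the class body runs, so
  # "draft?" and "published?" never appear as `def` nodes in the AST.
  %w[draft published].each do |state|
    define_method("#{state}?") { @state == state }
  end

  def initialize(title, state)
    @title = title
    @state = state
  end
end

# Code injected under a different `self` after the class body has closed.
Report.class_eval do
  def summary
    "#{title} (#{@state})"
  end
end

report = Report.new("Q3", "draft")
report.draft?      # true
report.published?  # false
report.summary     # "Q3 (draft)"
```

All three calls at the bottom succeed at runtime, yet a tool that only looks for `def draft?` in the class body would find none of them.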
1168
+ None of these is "undecidable" in the sense of the previous two
1169
+ sections. The semantics are well-defined; the walker
1170
+ cannot *read* them from the AST. This is the **reach**
1171
+ problem, distinct from the decidability problem:
1172
+
1173
+ | Problem class | Example | Rigor's response |
1174
+ | --- | --- | --- |
1175
+ | Theoretical undecidability of inference | Rank-3 polymorphism; subtyping + intersection | The trinary `maybe` |
1176
+ | Reach — the AST does not contain the semantics | `define_method`, Rails DSL, `attr_*` | Plugin contributions + `RBS::Extended` + the [ADR-16](../adr/16-macro-expansion.md) macro substrate |
1177
+ | Genuine runtime opacity | `eval(user_input)` | `Dynamic[top]`, then `maybe` at use sites |
1178
+
1179
+ Plugins are written in Ruby because the reach problem cannot be
1180
+ solved in the type language alone — it needs a Ruby-side
1181
+ recogniser that walks the AST, decides "this `has_many :posts`
1182
+ declares an accessor returning `Relation[Post]`," and contributes
1183
+ that fact to the walker's worldview.
1184
+ [ADR-2](../adr/2-extension-api.md),
1185
+ [ADR-16](../adr/16-macro-expansion.md),
1186
+ [ADR-25](../adr/25-plugin-contributed-rbs.md), and
1187
+ [ADR-28](../adr/28-path-scoped-protocol-contracts.md) define the
1188
+ structured extension points where this knowledge enters.
1189
+
1190
+ ### Precision: naive inference produces useless joins
1191
+
1192
+ The second motivation is subtler but at least as important. The
1193
+ simplest "correct" inference rules for compound expressions
1194
+ produce types so wide they tell the user nothing useful:
1195
+
1196
+ | Expression | Naive join | More useful type | Mechanism in Rigor |
1197
+ | --- | --- | --- | --- |
1198
+ | `{user: u, count: 3, msg: "ok"}` | `Hash[Symbol, User \| Integer \| String]` | `HashShape{user: User, count: Integer, msg: String}` | `HashShape` carrier (built-in for hash literals) |
1199
+ | `[1, "a", :sym]` | `Array[Integer \| String \| Symbol]` | `Tuple[Integer, String, Symbol]` | `Tuple` carrier (built-in for array literals) |
1200
+ | A provably-constant value (e.g. `42`, `"ok"`) | `Integer`, `String` | `Constant<42>`, `Constant<"ok">` | `Constant<T>` carrier |
1201
+ | `JSON.parse(input)` | `Hash[String, untyped] \| Array[untyped] \| String \| Integer \| Float \| true \| false \| nil` | `App[json::value, K]` per option `K` | [ADR-20](../adr/20-lightweight-hkt.md) Lightweight HKT + `METHOD_RETURN_OVERRIDES` |
1202
+ | A method whose return depends on its arguments | A wide union of every observed exit | A per-call-site discriminated return | `RBS::Extended` `return_override` directive |
1203
+ | A DSL-managed accessor (`has_many`, `attribute`) | `Dynamic[top]` | `Relation[Model]`, a model-specific shape | Plugin `dynamic_return` + macro substrate |
1204
+
1205
+ These are not undecidability cases — the inference can decide a
1206
+ type, it decides a *useless* one. A type like
1207
+ `Hash[Symbol, Foo | Bar | Buz]` or
1208
+ `true | false | String | Integer | Float` is technically the
1209
+ correct join of observed values, but its consumer cannot do
1210
+ anything with it without narrowing first; the union has erased
1211
+ exactly the information the type system existed to carry.
1212
+
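The difference between the two views can be mimicked at runtime with ordinary reflection. The sketch below is an analogy in plain Ruby, not Rigor's carrier implementation; the variable names and the sample hash (with a `String` standing in for the `User` of the table) are assumptions.

```ruby
row = { user: "alice", count: 3, msg: "ok" }

# Naive join: one key type and one union of every value type.
naive_join = {
  key: row.keys.map(&:class).uniq,      # [Symbol]
  value: row.values.map(&:class).uniq   # [String, Integer]
}

# Shape view: each key keeps its own value type.
shape = row.transform_values(&:class)
# {user: String, count: Integer, msg: String}

# Under the naive join a consumer has to narrow before doing anything:
count = row[:count]
doubled = count.is_a?(Integer) ? count * 2 : nil
```

With the shape view, the `is_a?` guard on `count` carries no new information, because the key `:count` already determines the value type.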
1213
+ The shared design principle is **strictness on returns** — the
1214
+ robustness principle ([ADR-5](../adr/5-robustness-principle.md))
1215
+ treats "the most specific type the analysis can justify" as the
1216
+ goal, not "the smallest type that covers every observed exit."
1217
+ Naive join-widening fails that test in nearly every case where
1218
+ the inputs are heterogeneous.
1219
+
1220
+ This is also why `HashShape` and `Tuple` are **foundational
1221
+ carriers** rather than exotic refinements: without them every
1222
+ hash or array literal would degrade to a `Hash`- or `Array`-with-union and the
1223
+ inferred type language would describe almost nothing useful in
1224
+ practice.
1225
+
1226
+ ### One substrate, two problems
1227
+
1228
+ The plugin contract and the `RBS::Extended` directive family
1229
+ therefore serve two complementary roles. They extend *where*
1230
+ Rigor can produce a type at all (reach), and they raise *how
1231
+ specific* that type is when produced (precision). The two roles
1232
+ share a substrate but answer different limitations (one of
1233
+ static-analysis scope, one of useful-type design), and neither
1234
+ is the same as the decidability question that the trinary
1235
+ `maybe` answers.
1236
+
1237
+ ## The expression problem and Rigor's plugin contract
1238
+
1239
+ A theoretical framing for one of Rigor's central design choices
1240
+ (the plugin contract) comes from a paper that gave the framing
1241
+ its name.
1242
+
1243
+ ### The problem
1244
+
1245
+ **Wadler 1998** (*The Expression Problem*, informal note) posed
1246
+ the challenge: in a typed language, can you simultaneously
1247
+ support
1248
+
1249
+ 1. **Adding new types** (new data variants) without modifying
1250
+ existing operations, *and*
1251
+ 2. **Adding new operations** (new functions over existing data)
1252
+ without modifying existing types?
1253
+
1254
+ Most type-system paradigms handle one or the other:
1255
+
1256
+ | Paradigm | Easy to add | Hard to add |
1257
+ | --- | --- | --- |
1258
+ | OO (subtyping + dispatch) | New types (subclass) | New operations (must touch every class) |
1259
+ | Functional ADTs + pattern matching | New operations (new function) | New types (must touch every operation) |
1260
+ | Haskell type classes | Either, with care | The other requires `OverlappingInstances` etc. |
1261
+ | Scala traits + pattern matching | Either, with elaboration support | Boilerplate on the unsupported side |
1262
+ | Clojure / Elixir protocols | Either (protocol dispatch) | (Solved by design) |
1263
+ | Ruby open classes | Both! (reopen + monkey-patch) | (Solved by design — sometimes too directly) |
1264
+
1265
+ Ruby sits in the "open classes" row — a non-typed language where
1266
+ the expression problem is solved by `module Foo; def bar; …;
1267
+ end; end; class String; include Foo; end`. The language solution
1268
+ trades safety for flexibility.
1269
+
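A slightly fuller version of that one-liner shows both axes at once. The module `Describable` and the class `Temperature` are hypothetical names chosen for this sketch.

```ruby
# Axis 2: a new operation (#describe) over existing types, added by
# reopening the classes instead of editing their original source.
module Describable
  def describe
    "#{self.class.name.downcase}: #{inspect}"
  end
end

class String
  include Describable
end

class Integer
  include Describable
end

# Axis 1: a new type that takes part in the existing operation
# without touching Describable or the classes above.
class Temperature
  include Describable

  def initialize(degrees)
    @degrees = degrees
  end

  def inspect
    "#{@degrees} degrees"
  end
end
```

Here `"hi".describe`, `3.describe`, and `Temperature.new(21).describe` all work, and no existing definition had to be edited. The cost is the one the text names: nothing stops a second library from reopening `String` with a conflicting `#describe`.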
1270
+ ### Rigor's plugin contract as the tool-side answer
1271
+
1272
+ Rigor's plugin substrate ([ADR-2](../adr/2-extension-api.md),
1273
+ [ADR-16](../adr/16-macro-expansion.md),
1274
+ [ADR-25](../adr/25-plugin-contributed-rbs.md),
1275
+ [ADR-28](../adr/28-path-scoped-protocol-contracts.md), …) solves
1276
+ the **tool-level** version of the same problem:
1277
+
1278
+ - **Adding new type knowledge** the engine can act on — new RBS
1279
+ bundles, new structural shapes via `signature_paths:`, new
1280
+ TypeNode resolvers ([ADR-13](../adr/13-typenode-resolver-plugin.md))
1281
+ — **without modifying the engine**.
1282
+ - **Adding new analyses / operations** over existing types — new
1283
+ diagnostic rules, new flow contributions, new protocol
1284
+ contracts ([ADR-28](../adr/28-path-scoped-protocol-contracts.md))
1285
+ — **without modifying the type language**.
1286
+
1287
+ The plugin contract is therefore the **expression problem solved
1288
+ at the analyser's extension boundary**, where the language-level
1289
+ solution (open classes) is too coarse for static analysis.
1290
+
1291
+ A worked example:
1292
+
1293
+ - `rigor-activesupport-core-ext` adds a *new fact* about existing
1294
+ classes (`Numeric#hours`, `String#blank?`, `Hash#stringify_keys`)
1295
+ — type-extension axis.
1296
+ - `rigor-web` adds a *new analysis* over existing classes
1297
+ (every class under `lib/controller/` must define
1298
+ `#get(Rack::Request) -> Rack::Response`) — operation-extension
1299
+ axis ([ADR-28](../adr/28-path-scoped-protocol-contracts.md)).
1300
+
1301
+ Neither plugin requires modifying the Rigor engine, and they
1302
+ *compose* — a single plugin can do both axes ([ADR-12 dry-rb
1303
+ packaging](../adr/12-dry-rb-packaging.md) discusses the
1304
+ production examples).
1305
+
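To make the operation-extension axis concrete, here is a runtime analogue of the `rigor-web` contract written with plain reflection. It is not Rigor's plugin API: `contract_violations`, the controller classes, and the `Request` / `Response` structs (stand-ins for `Rack::Request` / `Rack::Response`) are all hypothetical, and the check covers only method existence and arity, not the return type.

```ruby
# Stand-ins for Rack::Request / Rack::Response so the sketch needs no gems.
Request  = Struct.new(:path)
Response = Struct.new(:status, :body)

class HomeController
  def get(request)
    Response.new(200, "home at #{request.path}")
  end
end

class BrokenController
  # No #get: violates the hypothetical contract.
end

# Hypothetical checker: returns the classes that lack a one-argument #get.
def contract_violations(classes)
  classes.reject do |klass|
    klass.method_defined?(:get) && klass.instance_method(:get).arity == 1
  end
end

contract_violations([HomeController, BrokenController])
```

The controllers never opt in to anything; the contract is imposed from outside, which is the property the operation-extension axis describes.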
1306
+ ### Connection to earlier appendix sections
1307
+
1308
+ This framing also retroactively explains several design choices:
1309
+
1310
+ - **Nominal-first** (§ "Nominal vs structural typing"):
1311
+ nominal class names are the stable attachment point for
1312
+ plugin-contributed facts. Structural shapes are inferred
1313
+ per-call; a plugin would have no name to bind its knowledge
1314
+ to. The expression problem framing prefers explicit `class`
1315
+ declarations precisely because the name is the extension
1316
+ handle.
1317
+ - **The macro substrate** ([ADR-16](../adr/16-macro-expansion.md)):
1318
+ each tier (A: block-as-method, B: trait-inlining, C: heredoc
1319
+ template, D: external-file inclusion) is a different way to
1320
+ add knowledge about a class's behaviour without modifying the
1321
+ class — the type-extension axis made plural.
1322
+ - **Path-scoped protocol contracts**
1323
+ ([ADR-28](../adr/28-path-scoped-protocol-contracts.md)): a
1324
+ plugin can declare a behavioural contract for an entire
1325
+ directory of user-authored classes without the classes
1326
+ opting in — the operation-extension axis made tool-side.
1327
+
1328
+ The plugin contract is therefore **the expression problem solved
1329
+ at the analyser layer rather than the language layer**, not an
1330
+ ad-hoc Rigor design choice. The same theoretical pressure
1331
+ that drove Haskell to type classes and Clojure to protocols
1332
+ drives Rigor to a structured plugin substrate.
1333
+
1334
+ ## Smaller connections, in brief
1335
+
1336
+ A grab-bag of further type-theoretic / programming-languages
1337
+ connections. Each is summarised in a paragraph rather than a
1338
+ section because the topic either maps to mechanisms already
1339
+ covered or to a deliberate non-feature
1340
+ ([§ What Rigor does NOT model](#what-rigor-does-not-model)), but
1341
+ a reader hunting for "does Rigor have a story for X?" should be
1342
+ able to find one here.
1343
+
1344
+ ### Type erasure vs reification
1345
+
1346
+ A language **erases** types at runtime (Java generics, Haskell,
1347
+ OCaml) or **reifies** them (C#, .NET, Ruby's `.class`). Ruby is
1348
+ fully reified — `arr.class` returns `Array` at runtime, and
1349
+ `is_a?` queries are first-class. Rigor leans on this: occurrence
1350
+ typing's predicate set (`is_a?` / `kind_of?` / `instance_of?` /
1351
+ `respond_to?`) all use Ruby's reified class objects, and the
1352
+ narrowing rules are sound *because* the run-time check matches
1353
+ the type-theoretic class membership. The gradual-typing
1354
+ literature on "type-erased" vs "reified" gradual systems
1355
+ (Wrigstad et al. 2010, *Integrating Typed and Untyped Code in a
1356
+ Scripting Language*) classifies Rigor's setting as fully reified
1357
+ on the dynamic side — which is what makes `Dynamic[T]` narrow
1358
+ back to a concrete type safely whenever the runtime predicate
1359
+ fires.
1360
+
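The predicates named above are ordinary runtime calls, which is what lets a checker trust them. A minimal sketch follows, with a hypothetical method name `normalise` and comments describing the narrowing a checker could justify in each branch.

```ruby
def normalise(value)
  if value.is_a?(Integer)
    value + 1        # value is known to be an Integer in this branch
  elsif value.respond_to?(:upcase)
    value.upcase     # value is known to answer #upcase in this branch
  else
    value.inspect
  end
end
```

Because the class object consulted by `is_a?` is the same one the type names, the branch-local fact and the runtime behaviour cannot drift apart.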
1361
+ ### Algebraic effects vs monadic effects
1362
+
1363
+ The textbook alternative to monadic effects (Haskell `IO`, Scala
1364
+ `cats-effect`) is **algebraic effects with handlers** (Plotkin &
1365
+ Pretnar 2009; Koka / Eff / OCaml 5's effect handlers). Algebraic
1366
+ effects let an "effectful" computation be paused at the effect
1367
+ site and resumed by a handler — closer to delimited
1368
+ continuations than to monad bind. Rigor's effect model
1369
+ (§ "Effect systems") is neither monadic nor algebraic; it is
1370
+ *inferred* and *engine-internal*. Surfacing effects to the user
1371
+ (annotation grammar; pure-function marker; algebraic-effect
1372
+ signatures) is a future direction tracked in the spec corpus;
1373
+ the relevant prior art is the Koka community's surface design.
1374
+
1375
+ ### Single vs multiple dispatch
1376
+
1377
+ Ruby is single-dispatch — method selection depends on the
1378
+ receiver's class only. Languages with **multiple dispatch**
1379
+ (CLOS, Julia, Dylan) select methods based on the runtime types
1380
+ of every argument. RBS overloads — `def m: (Integer) -> Integer
1381
+ | (String) -> String` — simulate a static-side analogue of
1382
+ multiple dispatch by picking an arm by argument type at the call
1383
+ site. Rigor honours the "most-specific arm wins" resolution that
1384
+ multiple-dispatch type systems require, but the runtime dispatch
1385
+ remains single-dispatch; the overload arm is selected at
1386
+ type-check time, not at call time.
1387
+
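The contrast shows up in how the Ruby body behind such an overload has to be written. In this sketch the class name `Doubler` is an assumption; the signature in the comment is the one quoted in the text.

```ruby
# RBS overload from the text, shown as a comment:
#   def m: (Integer) -> Integer | (String) -> String
class Doubler
  # One runtime method; Ruby dispatches on the receiver only, so the
  # body branches on the argument's class itself.
  def m(value)
    case value
    when Integer then value * 2
    when String  then value * 2
    else raise ArgumentError, "unsupported argument: #{value.inspect}"
    end
  end
end
```

A multiple-dispatch language would let the two arms be separate method definitions; in Ruby the `case` is the dispatch, and the overload exists only on the signature side.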
1388
+ ### Phantom types and brand types
1389
+
1390
+ A **phantom type** carries a type parameter that does not appear
1391
+ in any field — e.g., `class Length<U>` where `U` is `Metres` or
1392
+ `Feet`. The type carries an invariant the runtime does not
1393
+ enforce. **Brand types** wrap a base type in a unique nominal
1394
+ type (`class ValidatedEmail < String`) so that only
1395
+ verified-via-constructor values inhabit it. Rigor's refinement
1396
+ carriers (§ "Refinement types") cover the brand-type use case
1397
+ for the common refinements (`non-empty-string`, `positive-int`,
1398
+ …); user-extensible brand types via `class X < Y` are typed
1399
+ nominally by Rigor — `ValidatedEmail` is distinct from `String`
1400
+ at the type level. The phantom-type-via-unused-parameter pattern
1401
+ is also typeable but not widely used in Ruby; the equivalent
1402
+ expressiveness usually arrives via refinements or nominal
1403
+ wrapping.
1404
+
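A brand type in the `class X < Y` style might look like the following. The validation pattern and the `parse` constructor are assumptions for this sketch (the pattern is deliberately simplistic, not a real e-mail grammar).

```ruby
class ValidatedEmail < String
  PATTERN = /\A[^@\s]+@[^@\s]+\z/

  # The only sanctioned way to obtain a ValidatedEmail.
  def self.parse(raw)
    raise ArgumentError, "not an email: #{raw.inspect}" unless PATTERN.match?(raw)

    new(raw).freeze
  end

  # Close the unvalidated path so the brand carries its invariant.
  private_class_method :new
end
```

`ValidatedEmail.parse("a@example.com")` is still a `String` for every consumer that wants one, but a parameter typed `ValidatedEmail` cannot be satisfied by a bare `String`.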
1405
+ ### Open-world vs closed-world assumption
1406
+
1407
+ RBS treats unknown methods on a `Dynamic[T]` receiver under the
1408
+ **open-world** assumption — "we do not know the full method set;
1409
+ an unknown method might exist at runtime." Rigor inherits this
1410
+ on dynamic receivers, which is why a `respond_to?`-narrowed call
1411
+ on a `Dynamic[T]` value does not fire `call.undefined-method`.
1412
+ On a concretely-typed receiver (e.g. `String`), Rigor uses the
1413
+ **closed-world** assumption — the RBS signature is taken as the
1414
+ authoritative method set, and an unknown method fires
1415
+ `call.undefined-method` (subject to the
1416
+ [ADR-26 `open_receivers:`](../adr/26-activerecord-relation-typing.md)
1417
+ exemption for receivers whose method set is provably open at
1418
+ runtime, like `ActiveRecord::Relation`).
1419
+
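The kind of receiver whose method set is open at runtime can be sketched with `method_missing`. The class `Settings` is hypothetical and stands in for larger cases such as `ActiveRecord::Relation`; the sketch shows only the runtime behaviour, not Rigor's diagnostics.

```ruby
class Settings
  def initialize(values)
    @values = values
  end

  # The method set depends on runtime data, so it is open by construction.
  def method_missing(name, *args)
    return @values[name] if args.empty? && @values.key?(name)

    super
  end

  def respond_to_missing?(name, include_private = false)
    @values.key?(name) || super
  end
end

settings = Settings.new(port: 8080)
settings.port if settings.respond_to?(:port)   # guarded call on an open receiver
"text".respond_to?(:port)                      # false: String has no #port
```

No signature file could list every method `Settings` answers, which is why a closed-world reading would be wrong for it and right for `String`.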
1420
+ ### Equirecursive vs isorecursive types
1421
+
1422
+ Two formalisms for recursive types: **equirecursive** treats
1423
+ `T = μX. F(X)` as judgmentally equal to `F(μX. F(X))` (any
1424
+ recursive type can be silently unfolded); **isorecursive**
1425
+ requires explicit `fold` / `unfold` conversions at every use.
1426
+ RBS type aliases are nominally recursive — `type tree = [Symbol,
1427
+ tree?]` is the type *by name*, and Rigor compares them by name,
1428
+ not by structural unfolding. This is the conservative path that
1429
+ sidesteps the equirecursive-vs-isorecursive debate entirely:
1430
+ naming a recursive type is the only way to write it.
1431
+
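A value of the `tree` alias above, and a function that walks it, look like this in Ruby. The helper name `labels` is an assumption for the sketch.

```ruby
# RBS alias from the text, shown as a comment:
#   type tree = [Symbol, tree?]
def labels(tree)
  label, rest = tree
  rest.nil? ? [label] : [label] + labels(rest)
end

labels([:a, [:b, [:c, nil]]])  # => [:a, :b, :c]
```

The code never folds or unfolds anything explicitly; the recursion in the data is matched by the recursion in the alias name.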
1432
+ ### Polarity and variance positions
1433
+
1434
+ Variance (§ "Variance") is derived from **positions** in a type
1435
+ expression: a type variable appearing in **positive** position
1436
+ (return type, output of a producer) is covariant; in **negative**
1437
+ position (parameter type, input to a consumer) is contravariant;
1438
+ in both (mutable storage) is invariant. The polarity reading is
1439
+ the standard textbook derivation (Pierce, *Types and Programming
1440
+ Languages*, ch. 15). RBS's `out T` / `in T` annotations express
1441
+ the result directly without forcing the user to read variance
1442
+ off a type expression's polarity structure; Rigor honours the
1443
+ declared variance without re-deriving it.
1444
+
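The "both positions means invariant" rule can be motivated with a three-line experiment. The type readings in the comments are assumptions about how a checker would view the variables; Ruby itself enforces none of them.

```ruby
ints = [1, 2, 3]   # read as Array[Integer]
nums = ints        # same object, now read as Array[Numeric]
nums << 1.5        # a legal write for Array[Numeric]

ints.all?(Integer) # => false: the Integer-only view has been broken
```

Reading from the array is the positive position and writing to it is the negative one, so treating a mutable `Array` as covariant lets the write through and invalidates the narrower view.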
1445
+ ### Subtype-with-bounded-existentials
1446
+
1447
+ The standard encoding of "a value with a hidden type that
1448
+ satisfies an interface" is **bounded existential types** — `∃X
1449
+ <: I. T(X)`. Most industrial languages encode this via
1450
+ *structural interfaces* instead: a parameter typed `I` accepts
1451
+ any value with `I`'s method set, hiding the receiver's concrete
1452
+ type. Rigor follows the industrial convention — `interface
1453
+ _Comparable` (§ "Nominal vs structural typing") is the
1454
+ existential-bound mechanism users actually reach for. The pack /
1455
+ unpack syntax of ML-style existential types is not exposed.
1456
+
1457
+ ### Refinement-as-types vs refinement-as-predicates
1458
+
1459
+ The § "Refinement types" section covers Rigor's curated
1460
+ refinement catalogue (`non-empty-string`, `positive-int`, …),
1461
+ which sits in the **refinements-as-types** tradition. The
1462
+ alternative — **refinements-as-predicates** (SMT-backed Liquid
1463
+ Types; F\*'s subset types) — keeps the predicate as a first-class
1464
+ formula attached to the base type, and discharges it via SMT at
1465
+ each constraint site. Rigor uses a much weaker, decidable
1466
+ fragment: the predicate is a named refinement carrier (not an
1467
+ arbitrary formula), and the "narrow into the refinement" rule is
1468
+ a deterministic engine step (not an SMT call). The
1469
+ Rondon-Kawaguchi-Jhala Liquid Types reference in the reading list is
1470
+ the technical seed Rigor's catalogue draws from at a distance.
1471
+
1472
+ ## What Rigor does NOT model
1473
+
1474
+ For completeness, a short list of type-theoretic features Rigor
1475
+ *does not* currently expose at the user surface — naming them
1476
+ here so you can stop looking:
1477
+
1478
+ - **Higher-kinded types (HKT).** `Functor[F[_]]` style
1479
+ abstraction. Tracked as a "future direction" but not in any
1480
+ shipped slice. (General HKT inference is undecidable; ADR-20
1481
+ sketches a defunctionalised, annotation-driven approach that
1482
+ sidesteps this.)
1483
+ - **Higher-rank polymorphism (System F).** All RBS generics
1484
+ are predicative; type variables cannot be instantiated with
1485
+ polymorphic types. (Rank-3 inference is undecidable per Wells,
1486
+ 1999; the predicative restriction keeps Rigor's surface
1487
+ inferable without per-call annotations.)
1488
+ - **Polymorphic recursion.** A generic method body re-applied
1489
+ inside itself at a *different* instantiation. Inference is
1490
+ undecidable (Henglein, 1993); RBS does not offer the syntax
1491
+ and Rigor does not synthesise it.
1492
+ - **Full dependent types.** No `Vec[n, T]` with `n : Integer`.
1493
+ Type-checking is decidable but inference is not; integer-range
1494
+ refinements (`int<min, max>`) cover the most common practical
1495
+ need without crossing the line.
1496
+ - **Row polymorphism as a user-quantifiable axis.** `HashShape`
1497
+ carries open-vs-closed semantics internally but does not
1498
+ expose row variables. See § "Object shapes — row polymorphism,
1499
+ Hack, and HashShape's lineage" for the design rationale.
1500
+ - **Existential types.** No `pack` / `unpack`. Closest analogue
1501
+ is structural `interface`.
1502
+ - **GADTs.** No type-refinement-by-constructor; pattern
1503
+ matching narrows via the standard occurrence-typing path, not
1504
+ via type-index propagation.
1505
+ - **Linear / affine types.** No move-checking or use-once
1506
+ enforcement.
1507
+ - **Session types, capabilities-as-types.** Out of scope.
1508
+ - **Mechanised soundness proof.** Deliberately deferred; see the
1509
+ [Matsumoto & Minamide 2010 review](../notes/20260518-matsumoto-2010-cfa-rigor-review.md)
1510
+ for the upstream "prove soundness on a tiny core" approach
1511
+ Rigor has not yet adopted.
1512
+
1513
+ If a topic on this list later becomes important to the user
1514
+ base, it will be discussed in an ADR before any implementation
1515
+ slice. Until then, the absence is a feature.
1516
+
1517
+ ## A short reading list
1518
+
1519
+ Papers and books behind the choices above, in roughly the order
1520
+ they map to the sections of this appendix:
1521
+
1522
+ - B.C. Pierce. *Types and Programming Languages.* MIT Press,
1523
+ 2002. Standard reference for everything in the first half of
1524
+ this appendix.
1525
+ - Cardelli & Wegner. "On Understanding Types, Data
1526
+ Abstraction, and Polymorphism." *ACM Computing Surveys*,
1527
+ 1985. Origin of the polymorphism taxonomy.
1528
+ - Canning, Cook, Hill, Olthoff & Mitchell. "F-Bounded
1529
+ Polymorphism for Object-Oriented Programming." *FPCA 1989.*
1530
+ The classical reference for F-bounded polymorphism — the
1531
+ type-theoretic root of RBS's `self` keyword and Sorbet's
1532
+ `T.attached_class`.
1533
+ - Wadler, P. "The Expression Problem." Informal note posted to
1534
+ the `java-genericity` mailing list, 1998. The challenge that
1535
+ motivates Rigor's plugin substrate as the analyser-layer
1536
+ answer.
1537
+ - Wand, M. "Complete Type Inference for Simple Objects."
1538
+ *LICS*, 1987. The seed of row polymorphism — first
1539
+ formulation of "infer object types with extra fields."
1540
+ - Rémy, D. "Type Checking Records and Variants in a Natural
1541
+ Extension of ML." *POPL 1989.* The row-variable mechanism in
1542
+ the form it is most commonly cited.
1543
+ - Cardelli, L. & Mitchell, J.C. "Operations on Records."
1544
+ *Mathematical Structures in Computer Science*, 1991. The
1545
+ algebraic treatment of record operations under row
1546
+ polymorphism — the substrate Garrigue and then Matsumoto &
1547
+ Minamide build on.
1548
+ - Siek & Taha. "Gradual Typing for Functional Languages."
1549
+ *Scheme Workshop*, 2006. The original gradual-typing paper.
1550
+ - Garcia, Clark & Tanter. "Abstracting Gradual Typing."
1551
+ *POPL 2016.* The modern reformulation of gradual typing in
1552
+ terms of abstract interpretation.
1553
+ - Findler & Felleisen. "Contracts for Higher-Order Functions."
1554
+ *ICFP 2002.* The origin of blame as a formal principle —
1555
+ background for § "Blame, the gradual guarantee, and trust
1556
+ boundaries."
1557
+ - Wadler & Findler. "Well-Typed Programs Can't Be Blamed."
1558
+ *ESOP 2009.* The headline result on the asymmetry between
1559
+ typed and untyped code at the boundary.
1560
+ - Siek, Vitousek, Cimini & Boyland. "Refined Criteria for
1561
+ Gradual Typing." *SNAPL 2015.* The original statement of the
1562
+ gradual guarantee.
1563
+ - Frisch, Castagna & Benzaken. "Semantic Subtyping: Dealing
1564
+ Set-Theoretically with Function, Union, Intersection, and
1565
+ Negation Types." *Journal of the ACM*, 2008. The foundational
1566
+ treatment of set-theoretic types — background for §
1567
+ "Set-theoretic foundations of the lattice."
1568
+ - Castagna, G. *Programming with Union, Intersection, and
1569
+ Negation Types.* 2024. The current canonical reference for
1570
+ semantic-subtyping-based type systems; the framework behind
1571
+ Elixir's set-theoretic types.
1572
+ - Dunfield & Krishnaswami. "Complete and Easy Bidirectional
1573
+ Typechecking for Higher-Rank Polymorphism." *ICFP 2013.* The
1574
+ modern canonical reference for bidirectional type checking —
1575
+ background for § "Bidirectional type checking."
1576
+ - Tobin-Hochstadt & Felleisen. "The Design and Implementation
1577
+ of Typed Scheme." *POPL 2008.* Origin of occurrence typing.
1578
+ - Maranget, L. "Warnings for Pattern Matching." *Journal of
1579
+ Functional Programming*, 2007. The algorithm OCaml uses for
1580
+ pattern-match exhaustiveness — background for § "Pattern
1581
+ matching and exhaustiveness."
1582
+ - Rondon, Kawaguchi & Jhala. "Liquid Types." *PLDI 2008.* The
1583
+ refinement-types-with-SMT framework that informs the
1584
+ `int<min, max>` carrier (Rigor uses a much weaker, decidable
1585
+ fragment).
1586
+ - Lucassen & Gifford. "Polymorphic Effect Systems."
1587
+ *POPL 1988.* Origin of effect systems.
1588
+ - Plotkin & Pretnar. "Handlers of Algebraic Effects."
1589
+ *ESOP 2009.* The algebraic-effect-handler design that Koka,
1590
+ Eff, and OCaml 5's effect system descend from — background
1591
+ for the § "Smaller connections" note on algebraic vs monadic
1592
+ effects.
1593
+ - Wrigstad, Nardelli, Lebresne, Östlund & Vitek. "Integrating
1594
+ Typed and Untyped Code in a Scripting Language."
1595
+ *POPL 2010.* The taxonomy of type-erased vs reified gradual
1596
+ systems — background for § "Smaller connections" on Ruby's
1597
+ fully reified dynamic side.
1598
+ - Milner, R. "A Theory of Type Polymorphism in Programming."
1599
+ *JCSS*, 1978. The Hindley–Milner system in its original form
1600
+ — the canonical type system that achieves soundness,
1601
+ decidability, and the principal type property simultaneously
1602
+ by restricting the language.
1603
+ - Damas, L. & Milner, R. "Principal Type-Schemes for Functional
1604
+ Programs." *POPL 1982.* The principal-type theorem and
1605
+ Algorithm W. The reference for what Rigor consciously does
1606
+ *not* attempt.
1607
+ - Pierce, B.C. & Turner, D.N. "Local Type Inference." *ACM
1608
+ TOPLAS*, 2000. The bidirectional / local-inference design that
1609
+ the Ruby static-typing landscape (Steep especially) draws from
1610
+ once subtyping is in the picture — the practical successor to
1611
+ HM under those conditions, and the closest textbook analogue
1612
+ to Rigor's walker.
1613
+ - Wells, J.B. "Typability and Type Checking in System F are
1614
+ Equivalent and Undecidable." *Annals of Pure and Applied
1615
+ Logic*, 1999. The proof that Rank-3 (and higher) type
1616
+ inference is undecidable — the reason RBS generics stay
1617
+ predicative.
1618
+ - Henglein, F. "Type Inference with Polymorphic Recursion."
1619
+ *ACM TOPLAS*, 1993. Establishes that inferring polymorphic
1620
+ recursion is undecidable.
1621
+ - 水野雅之.「計算機に推論できる型、できない型」.
1622
+ *Wantedly Advent Calendar*, 2021.
1623
+ <https://www.wantedly.com/companies/wantedly/post_articles/349494>.
1624
+ A friendly Japanese-language tour of the decidability boundary
1625
+ — Let多相, Rank-N, 多相再帰, 再帰型, サブタイピング+交差型 —
1626
+ and the most accessible companion to the
1627
+ "Decidability of inference" section above.
1628
+ - Matsumoto & Minamide. "Rubyプログラムの制御フロー解析と
1629
+ その健全性の証明." *IPSJ TPRO Vol.3 No.2*, 2010. The
1630
+ upstream Ruby-CFA soundness proof; Rigor-perspective review
1631
+ at
1632
+ [`docs/notes/20260518-matsumoto-2010-cfa-rigor-review.md`](../notes/20260518-matsumoto-2010-cfa-rigor-review.md).
1633
+ - Matsumoto & Minamide. "多相レコード型に基づくRubyプログラム
1634
+ の型推論." *IPSJ TPRO Vol.49 No.SIG 3*, 2008. The
1635
+ Garrigue-kinded polymorphic-record experiment that
1636
+ retroactively justifies Rigor's nominal-first carrier choice;
1637
+ Rigor-perspective review at
1638
+ [`docs/notes/20260518-matsumoto-2008-poly-records-rigor-review.md`](../notes/20260518-matsumoto-2008-poly-records-rigor-review.md).
1639
+
1640
+ ## What's next
1641
+
1642
+ If you came in from a "show me where Rigor stands in the
1643
+ type-theory landscape" question, the rest of the handbook is the
1644
+ practical companion:
1645
+
1646
+ - [Chapter 2 — Everyday types](02-everyday-types.md) for the
1647
+ carrier zoo at the surface level.
1648
+ - [Chapter 3 — Narrowing](03-narrowing.md) for occurrence typing
1649
+ in practice.
1650
+ - [Chapter 7 — RBS and `RBS::Extended`](07-rbs-and-extended.md)
1651
+ for the directive grammar that lets you teach Rigor about a
1652
+ custom predicate.
1653
+ - [Chapter 8 — Understanding errors](08-understanding-errors.md)
1654
+ for the rule catalogue (the user-visible end of the trinary
1655
+ certainty).
1656
+
1657
+ If you want to compare against another *tool* rather than the
1658
+ *theory*, the sibling appendices cover
1659
+ [TypeScript](appendix-typescript.md),
1660
+ [PHPStan](appendix-phpstan.md),
1661
+ [mypy / Pyright](appendix-mypy.md),
1662
+ and [Steep](appendix-steep.md).