rigortype 0.2.0 → 0.2.2
- checksums.yaml +4 -4
- data/README.md +82 -20
- data/data/core_overlay/numeric.rbs +33 -0
- data/data/core_overlay/pathname.rbs +25 -0
- data/data/core_overlay/string_scanner.rbs +28 -0
- data/data/gem_overlay/activesupport/core_ext.rbs +473 -0
- data/data/vendored_gem_sigs/ast/ast.rbs +130 -0
- data/data/vendored_gem_sigs/bcrypt/bcrypt.rbs +47 -0
- data/data/vendored_gem_sigs/bundler/bundler.rbs +238 -0
- data/data/vendored_gem_sigs/cgi/cgi_extras.rbs +34 -0
- data/data/vendored_gem_sigs/did_you_mean/did_you_mean_extras.rbs +34 -0
- data/data/vendored_gem_sigs/idn-ruby/idn.rbs +54 -0
- data/data/vendored_gem_sigs/mysql2/client.rbs +55 -0
- data/data/vendored_gem_sigs/mysql2/error.rbs +5 -0
- data/data/vendored_gem_sigs/mysql2/result.rbs +31 -0
- data/data/vendored_gem_sigs/mysql2/statement.rbs +5 -0
- data/data/vendored_gem_sigs/nokogiri/nokogiri.rbs +2332 -0
- data/data/vendored_gem_sigs/nokogiri/nokogiri_html5.rbs +47 -0
- data/data/vendored_gem_sigs/pg/pg.rbs +212 -0
- data/data/vendored_gem_sigs/prism/prism_supplement.rbs +44 -0
- data/data/vendored_gem_sigs/redis/errors.rbs +50 -0
- data/data/vendored_gem_sigs/redis/future.rbs +5 -0
- data/data/vendored_gem_sigs/redis/redis.rbs +348 -0
- data/data/vendored_gem_sigs/redis/redis_extras.rbs +130 -0
- data/data/vendored_gem_sigs/rubygems/rubygems_extras.rbs +226 -0
- data/docs/handbook/01-getting-started.md +311 -0
- data/docs/handbook/02-everyday-types.md +337 -0
- data/docs/handbook/03-narrowing.md +359 -0
- data/docs/handbook/04-tuples-and-shapes.md +321 -0
- data/docs/handbook/05-methods-and-blocks.md +339 -0
- data/docs/handbook/06-classes.md +305 -0
- data/docs/handbook/07-rbs-and-extended.md +427 -0
- data/docs/handbook/08-understanding-errors.md +373 -0
- data/docs/handbook/09-plugins.md +241 -0
- data/docs/handbook/10-sorbet.md +347 -0
- data/docs/handbook/11-sig-gen.md +312 -0
- data/docs/handbook/12-lightweight-hkt.md +333 -0
- data/docs/handbook/README.md +275 -0
- data/docs/handbook/appendix-elixir.md +370 -0
- data/docs/handbook/appendix-go.md +399 -0
- data/docs/handbook/appendix-java-csharp.md +470 -0
- data/docs/handbook/appendix-liskov.md +580 -0
- data/docs/handbook/appendix-mypy.md +370 -0
- data/docs/handbook/appendix-phpstan.md +338 -0
- data/docs/handbook/appendix-protocols-and-structural-typing.md +292 -0
- data/docs/handbook/appendix-rust.md +446 -0
- data/docs/handbook/appendix-steep.md +336 -0
- data/docs/handbook/appendix-type-theory.md +1662 -0
- data/docs/handbook/appendix-typeprof.md +416 -0
- data/docs/handbook/appendix-typescript.md +332 -0
- data/docs/install.md +189 -0
- data/docs/llms.txt +72 -0
- data/docs/manual/01-installation.md +342 -0
- data/docs/manual/02-cli-reference.md +557 -0
- data/docs/manual/03-configuration.md +152 -0
- data/docs/manual/04-diagnostics.md +206 -0
- data/docs/manual/05-inspecting-types.md +109 -0
- data/docs/manual/06-baseline.md +104 -0
- data/docs/manual/07-plugins.md +92 -0
- data/docs/manual/08-skills.md +143 -0
- data/docs/manual/09-editor-integration.md +245 -0
- data/docs/manual/10-mcp-server.md +532 -0
- data/docs/manual/11-ci.md +274 -0
- data/docs/manual/12-caching.md +116 -0
- data/docs/manual/13-troubleshooting.md +120 -0
- data/docs/manual/14-rails-quickstart.md +332 -0
- data/docs/manual/15-type-protection-coverage.md +204 -0
- data/docs/manual/16-rbs-extended-annotations.md +190 -0
- data/docs/manual/17-driving-improvement.md +160 -0
- data/docs/manual/README.md +87 -0
- data/docs/manual/ci-templates/README.md +58 -0
- data/docs/manual/plugins/README.md +86 -0
- data/docs/manual/plugins/rigor-actioncable.md +78 -0
- data/docs/manual/plugins/rigor-actionmailer.md +74 -0
- data/docs/manual/plugins/rigor-actionpack.md +80 -0
- data/docs/manual/plugins/rigor-activejob.md +58 -0
- data/docs/manual/plugins/rigor-activerecord.md +102 -0
- data/docs/manual/plugins/rigor-activestorage.md +74 -0
- data/docs/manual/plugins/rigor-activesupport-core-ext.md +86 -0
- data/docs/manual/plugins/rigor-devise.md +70 -0
- data/docs/manual/plugins/rigor-dry-schema.md +56 -0
- data/docs/manual/plugins/rigor-dry-struct.md +60 -0
- data/docs/manual/plugins/rigor-dry-types.md +59 -0
- data/docs/manual/plugins/rigor-dry-validation.md +62 -0
- data/docs/manual/plugins/rigor-factorybot.md +76 -0
- data/docs/manual/plugins/rigor-graphql.md +89 -0
- data/docs/manual/plugins/rigor-hanami.md +83 -0
- data/docs/manual/plugins/rigor-mangrove.md +73 -0
- data/docs/manual/plugins/rigor-minitest.md +86 -0
- data/docs/manual/plugins/rigor-pundit.md +72 -0
- data/docs/manual/plugins/rigor-rails-i18n.md +92 -0
- data/docs/manual/plugins/rigor-rails-routes.md +94 -0
- data/docs/manual/plugins/rigor-rails.md +44 -0
- data/docs/manual/plugins/rigor-rbs-inline.md +83 -0
- data/docs/manual/plugins/rigor-rspec-rails.md +72 -0
- data/docs/manual/plugins/rigor-rspec.md +86 -0
- data/docs/manual/plugins/rigor-shoulda-matchers.md +78 -0
- data/docs/manual/plugins/rigor-sidekiq.md +78 -0
- data/docs/manual/plugins/rigor-sinatra.md +61 -0
- data/docs/manual/plugins/rigor-sorbet.md +63 -0
- data/docs/manual/plugins/rigor-statesman.md +75 -0
- data/docs/manual/plugins/rigor-typescript-utility-types.md +71 -0
- data/exe/rigor +1 -1
- data/lib/rigor/analysis/incremental_session.rb +4 -2
- data/lib/rigor/analysis/run_stats.rb +13 -1
- data/lib/rigor/analysis/runner.rb +54 -12
- data/lib/rigor/cli/check_command.rb +26 -3
- data/lib/rigor/cli/coverage_command.rb +67 -92
- data/lib/rigor/cli/coverage_mutation.rb +149 -0
- data/lib/rigor/cli/docs_command.rb +248 -0
- data/lib/rigor/cli/fused_protection_renderer.rb +67 -0
- data/lib/rigor/cli/fused_protection_report.rb +76 -0
- data/lib/rigor/cli/skill_command.rb +103 -41
- data/lib/rigor/cli/skill_describe.rb +346 -0
- data/lib/rigor/cli.rb +25 -3
- data/lib/rigor/config_audit.rb +152 -0
- data/lib/rigor/configuration.rb +12 -0
- data/lib/rigor/environment/rbs_loader.rb +27 -0
- data/lib/rigor/environment.rb +49 -1
- data/lib/rigor/inference/method_dispatcher/constant_folding.rb +140 -38
- data/lib/rigor/inference/method_dispatcher/shape_dispatch.rb +37 -6
- data/lib/rigor/inference/scope_indexer.rb +87 -89
- data/lib/rigor/inference/statement_evaluator.rb +27 -0
- data/lib/rigor/plugin/isolation.rb +5 -5
- data/lib/rigor/plugin/loader.rb +4 -2
- data/lib/rigor/protection/diagnostic_oracle.rb +51 -0
- data/lib/rigor/protection/mutation_scanner.rb +98 -38
- data/lib/rigor/protection/mutator.rb +21 -0
- data/lib/rigor/protection/test_suite_oracle.rb +68 -0
- data/lib/rigor/signature_path_audit.rb +92 -0
- data/lib/rigor/version.rb +1 -1
- data/skills/rigor-ask/SKILL.md +172 -0
- data/skills/rigor-doctor/SKILL.md +87 -0
- data/skills/rigor-editor-setup/SKILL.md +114 -0
- data/skills/rigor-mcp-setup/SKILL.md +117 -0
- data/skills/rigor-monkeypatch-resolve/SKILL.md +79 -0
- data/skills/rigor-next-steps/SKILL.md +113 -0
- data/skills/rigor-plugin-tune/SKILL.md +79 -0
- data/skills/rigor-protection-uplift/SKILL.md +133 -0
- data/skills/rigor-rbs-setup/SKILL.md +128 -0
- data/skills/rigor-upgrade/SKILL.md +79 -0
- metadata +120 -1
|
@@ -0,0 +1,1662 @@
|
|
|
1
|
+
# Appendix — Connections to Type Theory
|
|
2
|
+
|
|
3
|
+
A short bridge between Rigor's vocabulary and the formal
|
|
4
|
+
type-theoretic concepts you may have seen in a programming-languages
|
|
5
|
+
textbook or in another type checker's documentation. The handbook
|
|
6
|
+
proper is deliberately short on theory; this appendix names the
|
|
7
|
+
underlying ideas so that if you already know one of them, you can
|
|
8
|
+
recognise the corresponding Rigor surface immediately.
|
|
9
|
+
|
|
10
|
+
This page is descriptive, not normative. When the formal language
|
|
11
|
+
here disagrees with the [type
|
|
12
|
+
specification](../type-specification/README.md), the spec binds.
|
|
13
|
+
|
|
14
|
+
## Five-second pitch
|
|
15
|
+
|
|
16
|
+
| Question | Type-theory term | Rigor surface |
|
|
17
|
+
| --- | --- | --- |
|
|
18
|
+
| What is the universe of types ordered by? | Subtyping (`<:`), a partial order forming a lattice | The carrier zoo with `Top` / `Bot`, `\|` (join), `&` (meet) |
|
|
19
|
+
| What about types that may or may not match? | Gradual consistency (`~`) | The `Dynamic[T]` carrier and the trinary certainty `yes / no / maybe` |
|
|
20
|
+
| How are user types identified? | Nominal vs structural | **Nominal-first hybrid** — classes by name, plus structural facets (`interface`, `HashShape`, capability roles) |
|
|
21
|
+
| How are generics expressed? | Parametric polymorphism (System F-style, but predicative) | RBS generics `class Array[Elem]`, method generics `def map: [U] () { (Elem) -> U } -> Array[U]` |
|
|
22
|
+
| How is "x is a non-empty string" expressed? | Refinement / predicate subtyping | First-class refinement carriers (`non-empty-string`, `int<min, max>`, …) |
|
|
23
|
+
| How does `if x.is_a?(String)` change `x`'s type? | Occurrence typing / flow-sensitive narrowing | Edge-aware narrowing with trinary certainty |
|
|
24
|
+
| What about side effects? | Effect systems | The engine's effect model (mutation, exception, escape) — internal, not user-visible |
|
|
25
|
+
| Soundness or completeness? | Pick one (or neither) | **Neither in full** — Rigor optimises for no-false-positives, with a robustness-principle bias |
|
|
26
|
+
| Why do some features force annotations everywhere? | Decidability of inference — certain combinations (Rank-3+, polymorphic recursion, subtyping + intersection) are undecidable | **The trinary `maybe`** — when inference cannot decide, Rigor stays silent rather than guessing or pestering the user for an annotation |
|
|
27
|
+
|
|
28
|
+
Rigor's design pulls liberally from this catalogue but avoids the
|
|
29
|
+
parts that would force a Ruby author to write annotations for code they did
|
|
30
|
+
not author themselves.
|
|
31
|
+
|
|
32
|
+
## The type lattice
|
|
33
|
+
|
|
34
|
+
Rigor's types form a (bounded) lattice under the subtyping
|
|
35
|
+
relation `<:`. The standard textbook picture applies almost
|
|
36
|
+
verbatim:
|
|
37
|
+
|
|
38
|
+
- **`Top`** is the greatest element — every value has type `Top`.
|
|
39
|
+
- **`Bot`** is the least element — no value has type `Bot`. Useful
|
|
40
|
+
for unreachable branches and "this method always raises."
|
|
41
|
+
- **Join `T \| U`** (union) is the least upper bound.
|
|
42
|
+
- **Meet `T & U`** (intersection) is the greatest lower bound.
|
|
43
|
+
|
|
44
|
+
```ruby
|
|
45
|
+
# Top — every value inhabits it
|
|
46
|
+
x = something_we_know_nothing_about
|
|
47
|
+
assert_type("Dynamic[top]", x) # Top widened with the Dynamic marker
|
|
48
|
+
|
|
49
|
+
# Bot — no value inhabits it; raise-only methods return Bot
|
|
50
|
+
def boom!
|
|
51
|
+
raise "no"
|
|
52
|
+
end
|
|
53
|
+
assert_type("Dynamic[top]", method(:boom!).call) # Method indirection; direct boom! call returns Bot
|
|
54
|
+
|
|
55
|
+
# Join — Union of two non-overlapping types
|
|
56
|
+
n = rand < 0.5 ? 1 : "a"
|
|
57
|
+
assert_type("\"a\" | 1", n)
|
|
58
|
+
|
|
59
|
+
# Meet — Intersection (rarely needed at the surface level)
|
|
60
|
+
# Mostly arises during refinement combinations
|
|
61
|
+
```
|
|
62
|
+
|
|
63
|
+
Spec: [`docs/type-specification/value-lattice.md`](../type-specification/value-lattice.md),
|
|
64
|
+
[`docs/type-specification/special-types.md`](../type-specification/special-types.md).
|
|
65
|
+
|
|
66
|
+
## Set-theoretic foundations of the lattice
|
|
67
|
+
|
|
68
|
+
The previous section described `Top` / `Bot` / `|` (join) / `&`
|
|
69
|
+
(meet) as "the standard textbook picture." The semantic
|
|
70
|
+
foundation that makes that picture *work* — where union and
|
|
71
|
+
intersection on types behave the way the user expects — is the
|
|
72
|
+
**set-theoretic** view of types.
|
|
73
|
+
|
|
74
|
+
In the set-theoretic view, a type `T` is interpreted as the *set
|
|
75
|
+
of values inhabiting it*, and the type operators correspond to
|
|
76
|
+
set operations:
|
|
77
|
+
|
|
78
|
+
| Type operator | Set operation |
|
|
79
|
+
| --- | --- |
|
|
80
|
+
| `T \| U` | `T ∪ U` (union of inhabitants) |
|
|
81
|
+
| `T & U` | `T ∩ U` (intersection of inhabitants) |
|
|
82
|
+
| `T - U` | `T ∖ U` (set-theoretic difference; equivalently `T & ¬U`) |
|
|
83
|
+
| `Top` | the universal set |
|
|
84
|
+
| `Bot` | the empty set |
|
|
85
|
+
|
|
86
|
+
This is the semantics behind **semantic subtyping**: `T <: U` iff
|
|
87
|
+
every inhabitant of `T` is also an inhabitant of `U`. Subtyping
|
|
88
|
+
becomes set inclusion.
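
The correspondence can be made concrete with a small model. The following is a minimal sketch, not Rigor's implementation: it assumes a tiny finite universe of values and represents each type as a Ruby `Set` of its inhabitants. The names `top`, `bot`, `integers`, `strings` and `subtype?` are hypothetical and exist only for this illustration.

```ruby
require "set"

# A deliberately tiny universe so that every "type" is a finite set of values.
top      = Set[1, 2, "a", "b", nil]                   # Top: the universal set
bot      = Set[]                                      # Bot: the empty set
integers = top.select { |v| v.is_a?(Integer) }.to_set
strings  = top.select { |v| v.is_a?(String) }.to_set

# Semantic subtyping: T <: U iff every inhabitant of T also inhabits U.
def subtype?(t, u) = t.subset?(u)

union        = integers | strings   # T | U
intersection = integers & strings   # T & U
difference   = top - strings        # T - U
```

In this model `subtype?(integers, union)` holds, `intersection` equals `bot` because the two types share no inhabitants, and `difference` contains exactly the non-string values.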
|
|
89
|
+
|
|
90
|
+
### Academic root
|
|
91
|
+
|
|
92
|
+
**Frisch, Castagna & Benzaken** developed semantic subtyping in
|
|
93
|
+
the early 2000s as the type-theoretic foundation of CDuce — a
|
|
94
|
+
language for processing XML where union, intersection, and
|
|
95
|
+
negation types are first-class. The framework was consolidated
|
|
96
|
+
in **Castagna's 2024 textbook *Programming with Union,
|
|
97
|
+
Intersection, and Negation Types***, the current canonical
|
|
98
|
+
reference for the area.
|
|
99
|
+
|
|
100
|
+
Under semantic subtyping, the lattice operators are *exactly*
|
|
101
|
+
the set-theoretic ones, and a decidable subtyping algorithm
|
|
102
|
+
exists for the resulting fragment — much richer than Hindley–Milner (HM) but
|
|
103
|
+
still tractable.
|
|
104
|
+
|
|
105
|
+
### Industrial uptake
|
|
106
|
+
|
|
107
|
+
- **TypeScript** and **Flow** use union / intersection
|
|
108
|
+
arithmetic informally — the spec talks about "subtype-of-T-or-U"
|
|
109
|
+
rather than committing to a semantic model, but the resulting
|
|
110
|
+
behaviour matches the set-theoretic reading in most cases.
|
|
111
|
+
- **Elixir's set-theoretic type system** (José Valim + Giuseppe
|
|
112
|
+
Castagna collaboration, 2023–) makes Elixir the first major industrial
|
|
113
|
+
language to *deliberately* adopt the formal framework. The
|
|
114
|
+
decision to ground Elixir's typed surface in semantic subtyping
|
|
115
|
+
rather than a syntactic ad-hoc subtyping relation is the most
|
|
116
|
+
consequential design choice the project has made.
|
|
117
|
+
- **Scala 3** ships union types and intersection types as
|
|
118
|
+
first-class constructs; the semantics are loosely
|
|
119
|
+
set-theoretic.
|
|
120
|
+
- **Ceylon** (now retired) was an early industrial experiment
|
|
121
|
+
with explicit union / intersection types at the surface.
|
|
122
|
+
|
|
123
|
+
### Rigor's position
|
|
124
|
+
|
|
125
|
+
Rigor's lattice **is** set-theoretic in spirit:
|
|
126
|
+
|
|
127
|
+
- `T | U`, `T & U`, `T - U` are present at the surface
|
|
128
|
+
([`docs/type-specification/type-operators.md`](../type-specification/type-operators.md)).
|
|
129
|
+
- Top / Bot behave as the universal / empty sets.
|
|
130
|
+
- The difference operator `T - U` is what lets occurrence typing
|
|
131
|
+
express "in the `else` branch of `if x.is_a?(String)`, the type
|
|
132
|
+
of `x` is `Top - String`" precisely, rather than as an
|
|
133
|
+
approximation.
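
A minimal sketch of that reading, assuming a union type is modelled as a plain array of Ruby classes rather than Rigor's real carriers. `narrow` is a hypothetical helper, not part of Rigor's API, and it assumes every member of the union is either contained in the tested class or disjoint from it.

```ruby
# Split a union into the types seen by the two branches of
#   if x.is_a?(klass) ... else ... end
def narrow(union, klass)
  # Class#<= is true for klass itself and its subclasses, nil or false otherwise.
  then_type = union.select { |member| member <= klass }  # union & klass
  else_type = union - then_type                          # union - klass
  [then_type, else_type]
end

x_type = [Integer, String, NilClass]
then_type, else_type = narrow(x_type, String)
```

With this input the `then` branch sees `[String]` and the `else` branch sees `[Integer, NilClass]`, which is the difference `x_type - String` computed exactly.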
|
|
134
|
+
|
|
135
|
+
Rigor does NOT formalise its lattice as a semantic-subtyping
|
|
136
|
+
system in the Castagna sense. Three reasons:
|
|
137
|
+
|
|
138
|
+
1. **The trinary certainty already absorbs the hard cases.**
|
|
139
|
+
Semantic subtyping's value is a decision procedure for full
|
|
140
|
+
union / intersection / negation. Rigor's `maybe` arm handles
|
|
141
|
+
"cannot decide" without needing the decision procedure to
|
|
142
|
+
terminate.
|
|
143
|
+
2. **Nominal-first conflicts with pure semantic subtyping.**
|
|
144
|
+
A semantic-subtyping reading would collapse two distinct
|
|
145
|
+
nominal classes with identical method sets (§ "Nominal vs
|
|
146
|
+
structural typing"), which Ruby authors do not want.
|
|
147
|
+
3. **Implementation cost.** Castagna's decision procedure is
|
|
148
|
+
theoretically elegant but operationally heavy for an analyser
|
|
149
|
+
that walks an AST per file under a per-file budget
|
|
150
|
+
([`inference-budgets.md`](../type-specification/inference-budgets.md)).
|
|
151
|
+
|
|
152
|
+
Rigor's `T | U` / `T & U` / `T - U` operators are best read
|
|
153
|
+
with the set-theoretic interpretation in mind, even though
|
|
154
|
+
Rigor's algorithms do not formally lean on it. A reader coming
|
|
155
|
+
from Elixir's set-theoretic types or from Castagna's textbook
|
|
156
|
+
will find the surface familiar; the gap is in the formalisation
|
|
157
|
+
depth, not the surface design.
|
|
158
|
+
|
|
159
|
+
## Subtyping and gradual consistency
|
|
160
|
+
|
|
161
|
+
Static type theory uses one relation: **subtyping (`<:`)**.
|
|
162
|
+
`Integer <: Numeric` means every `Integer` is a `Numeric`.
|
|
163
|
+
|
|
164
|
+
Gradual typing adds a second relation: **consistency (`~`)**.
|
|
165
|
+
`Dynamic[T] ~ U` means "I do not statically know whether the
|
|
166
|
+
runtime value will satisfy `U`, but it is permitted to."
|
|
167
|
+
Consistency is reflexive and symmetric but **not transitive**.
|
|
168
|
+
This is the key technical move that distinguishes gradual typing
|
|
169
|
+
from "just add an `Any` type to the lattice."
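
The failure of transitivity is easy to see in a toy model. This sketch assumes the simplest textbook definition of consistency over atomic types, with a hypothetical `:dynamic` marker standing in for the unknown type; Rigor's relation over `Dynamic[T]` is richer than this.

```ruby
# Two atomic types are consistent when they are equal or when either
# side is the unknown type.
def consistent?(t, u)
  t == :dynamic || u == :dynamic || t == u
end

left  = consistent?(Integer, :dynamic)  # Integer ~ dynamic
right = consistent?(:dynamic, String)   # dynamic ~ String
both  = consistent?(Integer, String)    # Integer ~ String does not follow
```

If consistency were transitive, the first two facts would force the third. Treating the unknown type as an ordinary top or bottom element of the lattice would make exactly that mistake.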
|
|
170
|
+
|
|
171
|
+
Rigor exposes both relations through a **trinary certainty**:
|
|
172
|
+
|
|
173
|
+
| Certainty | Reads as | Use site |
|
|
174
|
+
| --- | --- | --- |
|
|
175
|
+
| `yes` | `T <: U` provably holds | The call is safe; no diagnostic. |
|
|
176
|
+
| `no` | `T <: U` provably fails | A diagnostic fires. |
|
|
177
|
+
| `maybe` | Cannot prove either way | No diagnostic — Rigor stays silent (robustness principle). |
|
|
178
|
+
|
|
179
|
+
```ruby
|
|
180
|
+
# yes: provably Integer <: Numeric
|
|
181
|
+
def add_one(n) = n + 1
|
|
182
|
+
add_one(42) # certainty: yes
|
|
183
|
+
|
|
184
|
+
# no: Constant<"a"> <: Integer is provably false
|
|
185
|
+
add_one("a") # certainty: no — call.argument-type-mismatch fires
|
|
186
|
+
|
|
187
|
+
# maybe: Dynamic[top] ~ Integer holds; <: cannot be decided
|
|
188
|
+
add_one(JSON.parse(input)) # certainty: maybe — silent
|
|
189
|
+
```
|
|
190
|
+
|
|
191
|
+
Spec: [`docs/type-specification/relations-and-certainty.md`](../type-specification/relations-and-certainty.md).
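
The table can be read as a three-valued function. The sketch below is an illustration under simplifying assumptions, not Rigor's algorithm: types are plain Ruby classes, the hypothetical marker `:dynamic` stands for a value of unknown type, and `certainty` and `diagnostic?` are names invented for this example.

```ruby
# Answers "may a value of type `actual` flow into a slot of type `expected`?"
def certainty(actual, expected)
  # Unknown on either side: neither provable nor refutable.
  return :maybe if actual == :dynamic || expected == :dynamic

  # Class#<= is true for the class itself and its subclasses.
  actual <= expected ? :yes : :no
end

# Only a provable failure produces a diagnostic; :maybe stays silent.
def diagnostic?(actual, expected) = certainty(actual, expected) == :no
```

The design choice worth noticing is in `diagnostic?`: `:maybe` is grouped with `:yes`, which is what keeps the checker quiet on code it cannot fully analyse.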
|
|
192
|
+
|
|
193
|
+
## Nominal vs structural typing
|
|
194
|
+
|
|
195
|
+
Java is nominal: `class Foo {}` and `class Bar {}` with identical
|
|
196
|
+
member sets are distinct types. TypeScript is structural: two
|
|
197
|
+
type aliases with identical members are interchangeable.
|
|
198
|
+
|
|
199
|
+
Rigor is **nominal-first with structural facets**:
|
|
200
|
+
|
|
201
|
+
1. **Nominal** is the default. `Nominal[User]` and
|
|
202
|
+
`Nominal[Admin]` are distinct even with identical methods.
|
|
203
|
+
2. **Structural via `interface`**. RBS `interface _Comparable`
|
|
204
|
+
defines a shape — anything implementing the named methods
|
|
205
|
+
satisfies it, regardless of class.
|
|
206
|
+
3. **Structural via `HashShape` and `Tuple`**. Ruby literals
|
|
207
|
+
`{name: "x", age: 30}` and `[1, "a"]` get per-key / per-index
|
|
208
|
+
structural types automatically.
|
|
209
|
+
4. **Capability roles** are a Rigor-specific structural facet —
|
|
210
|
+
named structural interfaces with hidden carriers
|
|
211
|
+
(`_ReadableStream`, `_RewindableStream`, …). These let the
|
|
212
|
+
robustness principle widen user-method parameter types to "any
|
|
213
|
+
value that supports the capability we actually use" without
|
|
214
|
+
forcing the user to write the `interface`.
|
|
215
|
+
|
|
216
|
+
```ruby
|
|
217
|
+
# Nominal — User and Admin are distinct
|
|
218
|
+
class User; end
|
|
219
|
+
class Admin; end
|
|
220
|
+
u = User.new
|
|
221
|
+
def takes_user(u) end  # RBS: def takes_user: (User) -> void
|
|
222
|
+
takes_user(Admin.new) # call.argument-type-mismatch
|
|
223
|
+
|
|
224
|
+
# Structural via HashShape — literals get per-key types
|
|
225
|
+
person = {name: "Alice", age: 30}
|
|
226
|
+
assert_type("{ name: \"Alice\", age: 30 }", person)
|
|
227
|
+
|
|
228
|
+
# Structural via interface
|
|
229
|
+
def shout(thing)
|
|
230
|
+
thing.upcase
|
|
231
|
+
end
|
|
232
|
+
# Rigor infers the parameter as "anything with #upcase: () -> String"
|
|
233
|
+
```
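
The two notions of membership also have direct runtime counterparts in Ruby, which may help when reading the list above. This is a sketch with hypothetical classes `Duck` and `Robot` and hypothetical helpers `nominal_match?` and `structural_match?`; Rigor performs the equivalent checks statically.

```ruby
class Duck;  def quack = "Quack"; end
class Robot; def quack = "Beep";  end

# Nominal: membership is decided by the class name in the ancestry.
def nominal_match?(value, klass) = value.is_a?(klass)

# Structural: membership is decided by the methods the value responds to.
def structural_match?(value, method_names)
  method_names.all? { |name| value.respond_to?(name) }
end
```

A `Robot` fails `nominal_match?(Robot.new, Duck)` but passes `structural_match?(Robot.new, [:quack])`, which is the distinction between a `Nominal[...]` carrier and an `interface`.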
|
|
234
|
+
|
|
235
|
+
Spec:
|
|
236
|
+
[`docs/type-specification/structural-interfaces-and-object-shapes.md`](../type-specification/structural-interfaces-and-object-shapes.md).
|
|
237
|
+
|
|
238
|
+
## Polymorphism
|
|
239
|
+
|
|
240
|
+
The Cardelli/Wegner taxonomy of polymorphism maps cleanly onto Rigor:
|
|
241
|
+
|
|
242
|
+
| Polymorphism family | Rigor surface | Notes |
|
|
243
|
+
| --- | --- | --- |
|
|
244
|
+
| **Parametric** (System F-style, predicative) | RBS generics `class Foo[T]`, method generics `def m: [U] (U) -> U` | No higher-rank or higher-kinded quantification at the user surface. |
|
|
245
|
+
| **Subtype** | `<:` over the lattice | Standard; method calls dispatch by inferred receiver type. |
|
|
246
|
+
| **Ad-hoc** (overloading) | RBS method overloads (`def m: (Integer) -> Integer \| (String) -> String`) | Resolution picks the most specific arm. |
|
|
247
|
+
| **Coercion** | Rigor's Ruby-coercion model (`Integer#coerce`, etc.) | Inferred per the runtime semantics; not a user-visible operator. |
|
|
248
|
+
| **Row polymorphism** | (not exposed at the user surface) | `HashShape` carries closed-vs-open key sets internally; not a quantifiable axis. See § "Object shapes" for the lineage. |
|
|
249
|
+
|
|
250
|
+
```ruby
|
|
251
|
+
# Parametric — method generics in RBS
|
|
252
|
+
# sig: def first: [E] (Array[E]) -> E?
|
|
253
|
+
def first(arr) = arr[0]
|
|
254
|
+
|
|
255
|
+
# Subtype — Integer <: Numeric flows through method calls
|
|
256
|
+
def total(ns) = ns.sum
|
|
257
|
+
total([1, 2, 3]) # ns: Array[Integer]
|
|
258
|
+
total([1, 2.0, 3]) # ns: Array[Numeric]
|
|
259
|
+
|
|
260
|
+
# Ad-hoc — RBS overload picks per call site
|
|
261
|
+
"abc" * 3 # String overload
|
|
262
|
+
[1, 2] * 3 # Array overload
|
|
263
|
+
```
|
|
264
|
+
|
|
265
|
+
Spec: [`docs/type-specification/rbs-compatible-types.md`](../type-specification/rbs-compatible-types.md).
|
|
266
|
+
|
|
267
|
+
## F-bounded polymorphism and self types
|
|
268
|
+
|
|
269
|
+
A recurring concrete need in object-oriented languages: a method
|
|
270
|
+
that "returns an instance of the actual receiver's class." Ruby's
|
|
271
|
+
`Object#tap` is the canonical example — `arr.tap { |x| ... }`
|
|
272
|
+
returns the same `arr`, with the same type, not the widened
|
|
273
|
+
`Object` ancestor.
|
|
274
|
+
|
|
275
|
+
Expressing "I return *my own* class" needs a mechanism beyond
|
|
276
|
+
plain parametric polymorphism, because the return-type parameter
|
|
277
|
+
must track the *runtime* class of `self`, not just a static type
|
|
278
|
+
variable bound at the call site. Two related mechanisms in the
|
|
279
|
+
literature:
|
|
280
|
+
|
|
281
|
+
### Self types
|
|
282
|
+
|
|
283
|
+
A reserved keyword (`self`, `Self`, `this`) inside a class
|
|
284
|
+
declaration meaning "the type of the actual receiver." A method
|
|
285
|
+
declared `def m: () -> self` in class `Foo` returns `Foo` in
|
|
286
|
+
`Foo` itself but `Bar` when called on an instance of `class Bar < Foo`. RBS uses `self`
|
|
287
|
+
exactly this way:
|
|
288
|
+
|
|
289
|
+
```ruby
|
|
290
|
+
# RBS for Object#tap (excerpt)
|
|
291
|
+
class Object
|
|
292
|
+
def tap: { (self) -> void } -> self
|
|
293
|
+
end
|
|
294
|
+
```
|
|
295
|
+
|
|
296
|
+
When called on an `Array[Integer]`, the block sees the receiver
|
|
297
|
+
typed as `Array[Integer]` and the call returns `Array[Integer]` —
|
|
298
|
+
not `Object`.
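
The same behaviour shows up in user code whenever a method returns `self`. The sketch below uses hypothetical classes `Query` and `UserQuery`; the RBS line in the comment is an assumed signature written for this example, not taken from any library.

```ruby
class Query
  # Assumed RBS for this sketch: def where: (Symbol) -> self
  def where(field)
    @fields = (@fields || []) + [field]
    self
  end
end

class UserQuery < Query
  # Only defined on the subclass.
  def admins = where(:admin)
end

# The chain works only because where returns the receiver itself,
# so the result is still a UserQuery and not merely a Query.
chained = UserQuery.new.where(:name).admins
```

If `where` were declared as returning `Query`, a checker would have to reject the call to `admins` even though it succeeds at runtime; declaring `self` keeps the static type aligned with the runtime class.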
|
|
299
|
+
|
|
300
|
+
### F-bounded polymorphism
|
|
301
|
+
|
|
302
|
+
A more general mechanism — a type parameter constrained to
|
|
303
|
+
extend its own parameterisation: `T <: Comparable[T]`. The
|
|
304
|
+
classical reference is **Canning, Cook, Hill, Olthoff &
|
|
305
|
+
Mitchell 1989** (*F-Bounded Polymorphism for Object-Oriented
|
|
306
|
+
Programming*). The motivating problem is the same — "a method on
|
|
307
|
+
`Comparable` should return the comparing class itself, not the
|
|
308
|
+
abstract `Comparable` interface" — but the encoding is via a
|
|
309
|
+
constrained type parameter rather than a `self` keyword.
|
|
310
|
+
|
|
311
|
+
### Industrial uptake
|
|
312
|
+
|
|
313
|
+
| Language | Self-type form | F-bounded form |
|
|
314
|
+
| --- | --- | --- |
|
|
315
|
+
| Java | (none direct) | `<T extends Comparable<T>>` |
|
|
316
|
+
| Scala | `this.type` return type (`this: T =>` is a self-type annotation) | `[T <: Comparable[T]]` |
|
|
317
|
+
| TypeScript | `this` type in method signatures | `<T extends C<T>>` |
|
|
318
|
+
| Sorbet | `T.self_type`, `T.attached_class` | Limited via generic class |
|
|
319
|
+
| RBS | `self` keyword | `[T < Comparable[T]]` (syntactically supported, limited) |
|
|
320
|
+
|
|
321
|
+
### Rigor's position
|
|
322
|
+
|
|
323
|
+
Rigor honours the RBS `self` keyword in method signatures. The
|
|
324
|
+
walker substitutes the actual receiver type for `self` when
|
|
325
|
+
synthesising the return type — so `Array[Integer]#dup` (declared
|
|
326
|
+
as `def dup: () -> self`) returns `Array[Integer]`, not the
|
|
327
|
+
ancestor's `Object`. This is the small mechanism that removes a
|
|
328
|
+
major source of unwanted widening in OO Ruby code:
|
|
329
|
+
|
|
330
|
+
```ruby
|
|
331
|
+
arr = [1, 2, 3]
|
|
332
|
+
copy = arr.dup
|
|
333
|
+
# Without self-type: copy : Object (useless)
|
|
334
|
+
# With self-type: copy : Array[Integer] (load-bearing)
|
|
335
|
+
```
|
|
336
|
+
|
|
337
|
+
F-bounded polymorphism in its full generality is harder. The
|
|
338
|
+
inference machinery has to solve a constraint that mentions the
|
|
339
|
+
type variable on both sides of `<:`. Rigor's RBS surface accepts
|
|
340
|
+
the constrained form `[T < C[T]]` but the walker treats
|
|
341
|
+
unresolved F-bounded constraints conservatively (`Dynamic[top]`
|
|
342
|
+
fallback when the bound cannot be solved locally). This matches
|
|
343
|
+
the no-false-positives stance: an over-precise F-bounded
|
|
344
|
+
inference would spread `T`-mention errors through the codebase,
|
|
345
|
+
and the practical Ruby idiom (declare `self` rather than
|
|
346
|
+
quantify over `T <: C[T]`) sidesteps the harder cases anyway.
|
|
347
|
+
|
|
348
|
+
The type-theoretic family extends further (G-bounded polymorphism;
|
|
349
|
+
parameterised self-types; ML-style first-class modules with
|
|
350
|
+
`with type self = ...`) but those forms are not exposed at the
|
|
351
|
+
Rigor surface.
|
|
352
|
+
|
|
353
|
+
## Object shapes — row polymorphism, Hack, and HashShape's lineage
|
|
354
|
+
|
|
355
|
+
The `HashShape{...}` carrier and the closely related `Tuple[...]`
|
|
356
|
+
appeared first in § "Nominal vs structural typing" and again in
|
|
357
|
+
the precision table of § "Beyond pure inference," where they turn
|
|
358
|
+
an otherwise-`Hash[Symbol, A | B | C]`-shaped join into something
|
|
359
|
+
a downstream caller can use. They sit in a family of *structural
|
|
360
|
+
shape* designs with both an academic root and an industrial
|
|
361
|
+
lineage. Tracing those threads is the easiest way to explain why
|
|
362
|
+
`HashShape` looks the way it does.
|
|
363
|
+
|
|
364
|
+
### The academic root: row polymorphism
|
|
365
|
+
|
|
366
|
+
**Row polymorphism** (Wand, 1987; Rémy, 1989; Cardelli & Mitchell,
|
|
367
|
+
1991) is the formal mechanism for typing "records that may carry
|
|
368
|
+
additional fields beyond the ones I named." A *row variable* `ρ`
|
|
369
|
+
quantifies over the trailing fields of a record type:
|
|
370
|
+
|
|
371
|
+
> `{ name: String; age: Integer | ρ }` — "any record with at
|
|
372
|
+
> least these two fields; ρ is the rest."
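
Ruby's hash pattern matching gives a runtime analogue of this record type, which may make the role of `ρ` easier to picture. This is only a sketch: it checks values dynamically rather than typing them statically, and `split_person` is a hypothetical helper.

```ruby
# Accepts any hash with at least :name (String) and :age (Integer).
# The **rest pattern plays the role of the row variable: it captures
# whatever other fields the record happens to carry.
def split_person(record)
  case record
  in {name: String => name, age: Integer => age, **rest}
    [name, age, rest]
  else
    nil
  end
end
```

Calling it with `{name: "Alice", age: 30, email: "a@example.com"}` yields the two named fields plus `{email: "a@example.com"}` as the rest, while a hash without `:age` falls through to `nil`.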
|
|
373
|
+
|
|
374
|
+
Garrigue (1990s) extended the framework with **kinds**, letting
|
|
375
|
+
OCaml's polymorphic-record system distinguish "the class of
|
|
376
|
+
records carrying `name: String`" from "the class of records
|
|
377
|
+
carrying `length: Int`." OCaml's open object types
|
|
378
|
+
(`< get_name : string; .. >`) sit on this foundation.
|
|
379
|
+
|
|
380
|
+
**Matsumoto & Minamide (2008)** applied the Garrigue-kinded
|
|
381
|
+
framework directly to Ruby — 多相レコード型に基づくRuby
|
|
382
|
+
プログラムの型推論 ("Type inference for Ruby programs based on polymorphic record types"). The paper demonstrated that Ruby's
|
|
383
|
+
"duck typing" surface admits a row-polymorphic reading: a method
|
|
384
|
+
`def shout(x); x.upcase; end` infers as roughly
|
|
385
|
+
`∀α, ρ. {upcase: () -> α | ρ} -> α`. The inference algorithm
|
|
386
|
+
works, but the inferred types (together with the kind constraints
|
|
387
|
+
they drag along) are dense for everyday Ruby code, where users
|
|
388
|
+
overwhelmingly reason in nominal classes rather than structural
|
|
389
|
+
rows.
|
|
390
|
+
|
|
391
|
+
The [Rigor-perspective review of the
|
|
392
|
+
paper](../notes/20260518-matsumoto-2008-poly-records-rigor-review.md)
|
|
393
|
+
records the retrospective: the experiment *worked* but it
|
|
394
|
+
**retroactively justified Rigor's nominal-first design** rather
|
|
395
|
+
than recommending row variables as the primary modeling tool.
|
|
396
|
+
Rigor's carrier zoo treats nominal classes as the unit of
|
|
397
|
+
modelling, with structural shapes as inference-precision fallbacks.
|
|
398
|
+
|
|
399
|
+
### The industrial lineage: Hack → Psalm/PHPStan
|
|
400
|
+
|
|
401
|
+
In parallel with the academic line, the practical "typed
|
|
402
|
+
dictionary" trajectory went a different way. Facebook's
|
|
403
|
+
[**Hack `shape(...)`**](https://docs.hhvm.com/hack/built-in-types/shape/)
|
|
404
|
+
introduced first-class shape types as part of the migration story
|
|
405
|
+
from dynamic PHP arrays to a typed surface:
|
|
406
|
+
|
|
407
|
+
- Per-key typing — `shape('name' => string, 'age' => int)`.
|
|
408
|
+
- Optional keys via `?'key' => T`.
|
|
409
|
+
- Closed by default; `...` opens the shape to additional keys.
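
The three rules can be sketched as a runtime conformance check in Ruby. This is an illustration of the shape semantics described above, not Hack's or Rigor's implementation; `conforms?` and its keyword parameters are hypothetical, with `open: true` standing in for Hack's trailing `...`.

```ruby
# required and optional map each key to the class its value must have.
def conforms?(hash, required:, optional: {}, open: false)
  # Every required key must be present with a value of the right class.
  return false unless required.all? { |k, type| hash.key?(k) && hash[k].is_a?(type) }
  # Optional keys may be absent, but must have the right class when present.
  return false unless optional.all? { |k, type| !hash.key?(k) || hash[k].is_a?(type) }

  # A closed shape rejects keys it does not name; an open one accepts them.
  open || (hash.keys - required.keys - optional.keys).empty?
end
```

For example, with `required: {name: String, age: Integer}` a hash carrying an extra `:extra` key is rejected by default and accepted once `open: true` is passed.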
|
|
410
|
+
|
|
411
|
+
**Psalm** and **PHPStan** adopted the same idea under the PHPDoc
|
|
412
|
+
syntax `array{name: string, age: int}`, with one important
|
|
413
|
+
emphasis flipped: the shape is *inferred* from the literal at the
|
|
414
|
+
use site rather than declared up front. TypeScript's object-type
|
|
415
|
+
literal `{ name: string; age: number }` is the same idea under
|
|
416
|
+
different syntax, with structural subtyping turned on by default.
|
|
417
|
+
|
|
418
|
+
The industrial design **deliberately avoids row variables**.
|
|
419
|
+
There is no `array{name: string, ...ρ}` quantified over the
|
|
420
|
+
trailing keys; every shape is closed (or fully open) with no
|
|
421
|
+
quantifiable in-between. The price is loss of full
|
|
422
|
+
row-polymorphic expressiveness; the benefit is tractable inference
|
|
423
|
+
and *readable* inferred types.
|
|
424
|
+
|
|
425
|
+
### HashShape's position
|
|
426
|
+
|
|
427
|
+
Rigor's `HashShape{...}` sits squarely in the Hack / Psalm
|
|
428
|
+
lineage rather than the row-polymorphic one:
|
|
429
|
+
|
|
430
|
+
| Property | Row polymorphism | Hack `shape(...)` | Psalm `array{...}` | Rigor `HashShape{...}` |
|
|
431
|
+
| --- | --- | --- | --- | --- |
|
|
432
|
+
| Per-key typing | Yes | Yes | Yes | Yes |
|
|
433
|
+
| Optional keys | Yes (via row constraints) | Yes (`?'k'`) | Yes (`k?:`) | Yes (open/closed flag internally) |
|
|
434
|
+
| Row variables quantifiable by the user | **Yes** | No | No | **No** |
|
|
435
|
+
| Inferred from literals | (Inference is global) | No — user-declared | Yes (per call site) | **Yes** — built-in for hash literals |
|
|
436
|
+
| Primary modelling vehicle for users | Yes (in ML-family languages that adopt it) | Yes (idiomatic Hack) | Sometimes (alongside classes) | **No** — nominal classes are primary; HashShape is the inference-precision fallback |
|
|
437
|
+
|
|
438
|
+
Two specific choices stand out:
|
|
439
|
+
|
|
440
|
+
1. **No row variables at the user surface.** Like Hack and Psalm,
|
|
441
|
+
Rigor does not let the user write
|
|
442
|
+
`HashShape{name: String, *rest}` quantified over the trailing
|
|
443
|
+
keys. Internally `HashShape` carries an open/closed flag, so
|
|
444
|
+
the analyser can still answer "is this set of keys finite?",
|
|
445
|
+
but the type language has no `ρ`. This is the same trade Hack
|
|
446
|
+
made: more readable inferred types and tractable inference, at
|
|
447
|
+
the cost of full row-polymorphic expressivity.
|
|
448
|
+
2. **Inferred, not declared.** Where Hack expects the user to
|
|
449
|
+
write `shape(...)` explicitly, Rigor produces `HashShape`
|
|
450
|
+
automatically from hash literals. The common Ruby-author
|
|
451
|
+
experience is "I wrote `{a: 1, b: 'x'}` and Rigor reported
|
|
452
|
+
`HashShape{a: Constant<1>, b: Constant<"x">}`," not "I
|
|
453
|
+
declared a shape type and Rigor checked my literal against
|
|
454
|
+
it." This matches the Psalm / PHPStan emphasis more closely
|
|
455
|
+
than Hack's declaration-first design.
|
|
456
|
+
|
|
457
|
+
The combination (inferred-from-literal + Hack/Psalm-shaped
|
|
458
|
+
surface + nominal-first ecosystem) makes `HashShape` a
|
|
459
|
+
**precision carrier** (§ "Beyond pure inference") rather than a
|
|
460
|
+
*modelling primitive*. It exists to sharpen the type of a hash
|
|
461
|
+
literal that would otherwise widen to
|
|
462
|
+
`Hash[Symbol, A | B | C]`. It is not the unit a Rigor user
|
|
463
|
+
reaches for to describe a domain object; that role belongs to
|
|
464
|
+
`class User; end` plus the surrounding RBS, exactly as the
|
|
465
|
+
Matsumoto retrospective recommends.
|
|
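As a rough illustration of the split described above, the sketch below puts an ad-hoc hash literal next to a nominal domain class. The inferred types in the comments are taken from the text rather than from verified tool output, and `User` is a hypothetical example class.

```ruby
# A per-call, ad-hoc dictionary: the case HashShape is meant to sharpen.
options = { a: 1, b: "x" }
# Per the text, Rigor would report HashShape{a: Constant<1>, b: Constant<"x">}
# here instead of widening to Hash[Symbol, Integer | String].

# A domain object: modelled nominally, with a class plus its RBS signature.
# `User` is a hypothetical example class, not part of Rigor.
class User
  attr_reader :name, :age

  def initialize(name:, age:)
    @name = name
    @age = age
  end
end

user = User.new(name: "Ada", age: 36)
```

The hash keeps its per-key precision only as long as it stays a literal-shaped value; the class is what a signature file would describe.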
466
|
+
|
|
467
|
+
### `Tuple` and the same lineage
|
|
468
|
+
|
|
469
|
+
`Tuple[A, B, C]` is the array analogue, and the same lineage
|
|
470
|
+
applies — TypeScript's `[A, B, C]`, Hack's `tuple(A, B, C)`,
|
|
471
|
+
Psalm/PHPStan's `array{0: A, 1: B, 2: C}` shorthand. The
|
|
472
|
+
motivation is identical: a literal `[1, "a", :sym]` carries
|
|
473
|
+
per-index information that the `Array[Integer | String | Symbol]`
|
|
474
|
+
join discards.
|
|
475
|
+
|
|
476
|
+
### Why not full row polymorphism in Rigor?
|
|
477
|
+
|
|
478
|
+
The temptation to surface row variables for users who want them
|
|
479
|
+
is real, and the question is open at the ADR level. The reasons
|
|
480
|
+
it has not landed at the user surface in v0.1.x:
|
|
481
|
+
|
|
482
|
+
- **Inference cost.** Garrigue-kinded inference is decidable but
|
|
483
|
+
more expensive than Rigor's local walker; the analyser's
|
|
484
|
+
per-file budget (see
|
|
485
|
+
[`inference-budgets.md`](../type-specification/inference-budgets.md))
|
|
486
|
+
would have to accommodate global row-constraint solving.
|
|
487
|
+
- **Readability.** The Matsumoto experiment found that inferred
|
|
488
|
+
row-polymorphic types for everyday Ruby code are dense and hard
|
|
489
|
+
to skim. Rigor's no-false-positives stance amplifies the
|
|
490
|
+
problem, since it makes the inferred type a thing the user
|
|
491
|
+
reads in `rigor annotate`.
|
|
492
|
+
- **Empirical demand.** Hash literals in real Ruby code are
|
|
493
|
+
typically per-call ad-hoc dictionaries, not polymorphic-record
|
|
494
|
+
values flowing through multiple operations. The closed-or-open
|
|
495
|
+
per-call structural type matches the observed use; the row
|
|
496
|
+
quantification rarely earns its complexity.
|
|
497
|
+
|
|
498
|
+
If row variables ever become needed (for a typed
|
|
499
|
+
`merge` / `transform_keys` / `slice` story that benefits from
|
|
500
|
+
quantifying over rows), the question opens through an ADR rather
|
|
501
|
+
than at the user surface by default.
|
|
502
|
+
|
|
503
|
+
## Variance
|
|
504
|
+
|
|
505
|
+
RBS (and therefore Rigor) inherits the standard variance
|
|
506
|
+
vocabulary for generic parameters:
|
|
507
|
+
|
|
508
|
+
- **Covariant (`out T`)** — `Foo[Sub] <: Foo[Sup]` when
|
|
509
|
+
`Sub <: Sup`. Producer position.
|
|
510
|
+
- **Contravariant (`in T`)** — `Foo[Sup] <: Foo[Sub]` when
|
|
511
|
+
`Sub <: Sup`. Consumer position.
|
|
512
|
+
- **Invariant (default)** — neither.
|
|
513
|
+
|
|
514
|
+
Ruby's mutable containers (`Array`, `Hash`, `Set`) are invariant
|
|
515
|
+
in their element type for soundness — the standard Java-arrays-
|
|
516
|
+
are-covariant cautionary tale applies. RBS declares them as such;
|
|
517
|
+
Rigor honours those declarations.
|
|
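The cautionary tale can be replayed in plain Ruby. This is a minimal sketch: Ruby performs no static checks, so the code runs, and the type annotations in the comments are hypothetical. The point is what a checker would have to accept if `Array` were covariant in its element type.

```ruby
integers = [1, 2, 3]          # statically: Array[Integer]

# Hypothetical helper that only asks for "an array of Numeric".
def append_half(numbers)      # imagined signature: (Array[Numeric]) -> void
  numbers << 0.5              # fine for an Array[Numeric]
end

# Accepted only if Array[Integer] <: Array[Numeric], i.e. covariance.
append_half(integers)

# The caller still believes every element is an Integer.
integers.last.class           # Float
```

With an invariant element type, the call to `append_half(integers)` is the line a checker would flag.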
518
|
+
|
|
519
|
+
## Refinement types and predicate subtyping
|
|
520
|
+
|
|
521
|
+
A **refinement type** restricts a base type by a predicate: in
|
|
522
|
+
Liquid Types / SMT-driven systems this is written as
|
|
523
|
+
`{x: Int | x > 0}`. Rigor exposes a curated catalogue of
|
|
524
|
+
refinements with reserved names:
|
|
525
|
+
|
|
526
|
+
| Refinement | Predicate (informally) | Carrier |
|
|
527
|
+
| --- | --- | --- |
|
|
528
|
+
| `non-empty-string` | `s : String, s.size >= 1` | refinement on `String` |
|
|
529
|
+
| `numeric-string` | `s : String, s =~ /\A[+-]?\d+(\.\d+)?\z/` | refinement on `String` |
|
|
530
|
+
| `literal-string` | "provably built from literals" | refinement on `String` |
|
|
531
|
+
| `int<min, max>` | `n : Integer, min <= n <= max` | range carrier |
|
|
532
|
+
| `non-zero-int` | `n : Integer, n != 0` | refinement on `Integer` |
|
|
533
|
+
| `positive-int` | `n : Integer, n > 0` | refinement on `Integer` |
|
|
534
|
+
| `non-empty-array[T]` | `arr : Array[T], arr.size >= 1` | refinement on `Array[T]` |
|
|
535
|
+
| `non-empty-hash[K, V]` | `h : Hash[K, V], h.size >= 1` | refinement on `Hash[K, V]` |
|
|
536
|
+
|
|
537
|
+
The refinements compose with subtyping the way you would expect:
|
|
538
|
+
`positive-int <: non-zero-int <: Integer <: Numeric`.
|
|
539
|
+
**Rigor narrows into refinement carriers automatically** when the
|
|
540
|
+
control-flow analysis proves the predicate:
|
|
541
|
+
|
|
542
|
+
```ruby
|
|
543
|
+
def length_of(s)
|
|
544
|
+
  return 0 if s.empty?
|
|
545
|
+
  s.size # at this program point: s : non-empty-string
|
|
546
|
+
end
|
|
547
|
+
```
|
|
548
|
+
|
|
549
|
+
This is the practical payoff of refinement subtyping without
|
|
550
|
+
asking the user to author the refinement.
|
|
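Two more guards in the same spirit are sketched below. The helper names are hypothetical, and the refinements named in the comments are what the catalogue above suggests for Integer and Array inputs, not verified Rigor output.

```ruby
def safe_divide(total, count)
  return nil if count == 0
  # Past the guard, `count` can no longer be 0
  # (non-zero-int, assuming it was declared as Integer).
  total / count
end

def average(values)
  return 0 if values.empty?
  # values : non-empty-array[Integer] here, so values.size >= 1
  values.sum / values.size
end
```

In both methods the early return is the only place the predicate is written; the refined type after it is a consequence of control flow rather than of an annotation.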
551
|
+
|
|
552
|
+
Spec: [`docs/type-specification/imported-built-in-types.md`](../type-specification/imported-built-in-types.md),
|
|
553
|
+
[`docs/type-specification/rigor-extensions.md`](../type-specification/rigor-extensions.md).
|
|
554
|
+
|
|
555
|
+
## Occurrence typing (flow-sensitive narrowing)
|
|
556
|
+
|
|
557
|
+
The technical term for "`if x.is_a?(String)` makes `x : String`
|
|
558
|
+
inside the branch" is **occurrence typing** (Tobin-Hochstadt &
|
|
559
|
+
Felleisen, 2008). TypeScript calls it *narrowing*; mypy documents it as
|
|
560
|
+
*type narrowing*. The underlying mechanism is the same: the type
|
|
561
|
+
checker walks the control-flow graph and refines each variable
|
|
562
|
+
along the edges where a predicate must have held.
|
|
563
|
+
|
|
564
|
+
Rigor implements occurrence typing as **edge-aware narrowing** with
|
|
565
|
+
a few extensions specific to Ruby:
|
|
566
|
+
|
|
567
|
+
- Standard predicates: `is_a?`, `kind_of?`, `instance_of?`,
|
|
568
|
+
`respond_to?`, `nil?`, `==`, `===`, `frozen?`, `empty?`,
|
|
569
|
+
comparison operators.
|
|
570
|
+
- Pattern matching: `case x; in pattern` narrows along the
|
|
571
|
+
matched branch.
|
|
572
|
+
- Equality semantics are split into structural and reference
|
|
573
|
+
equality where Ruby distinguishes them.
|
|
574
|
+
- Mutation effects on a narrowed variable invalidate the
|
|
575
|
+
narrowing at the next read — *fact stability*.
|
|
576
|
+
- User-extended predicates via the `predicate-if-true` /
|
|
577
|
+
`predicate-if-false` directives (the analogue of TypeScript's
|
|
578
|
+
`x is Foo` type guards).
|
|
579
|
+
|
|
580
|
+
```ruby
|
|
581
|
+
def describe(x)
|
|
582
|
+
  if x.is_a?(String)
|
|
583
|
+
    # x : String here
|
|
584
|
+
    x.upcase
|
|
585
|
+
  elsif x.nil?
|
|
586
|
+
    "(nil)"
|
|
587
|
+
  else
|
|
588
|
+
    # x : Top - String - nil here (everything else narrowed out)
|
|
589
|
+
    x.inspect
|
|
590
|
+
  end
|
|
591
|
+
end
|
|
592
|
+
```
|
|
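The *fact stability* rule from the list above can be illustrated with a mutation between two reads. This is a sketch with a hypothetical helper; the comments describe the behaviour the text states, not verified tool output.

```ruby
def first_or_default(items)
  return :none if items.empty?
  # Narrowed here: `items` is known to be non-empty.
  head = items.first

  items.clear
  # The mutation invalidates that fact. At the next read, `items` is an
  # ordinary (possibly empty) array again, so `items.first` may be nil.
  [head, items.first]
end
```

Calling `first_or_default([1, 2])` returns `[1, nil]`: the second read really does observe an empty array, which is why the narrowing must not survive the mutation.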
593
|
+
|
|
594
|
+
Spec: [`docs/type-specification/control-flow-analysis.md`](../type-specification/control-flow-analysis.md),
|
|
595
|
+
[`docs/type-specification/rbs-extended.md`](../type-specification/rbs-extended.md).
|
|
596
|
+
|
|
597
|
+
## Pattern matching and exhaustiveness
|
|
598
|
+
|
|
599
|
+
The previous section noted that Rigor narrows along
|
|
600
|
+
`case x; in pattern` branches the same way it narrows along
|
|
601
|
+
`if x.is_a?(...)`. There is a related but distinct property that
|
|
602
|
+
pattern matching enables in type systems built around **algebraic
|
|
603
|
+
data types** (ADTs) or **tagged unions**: **exhaustiveness
|
|
604
|
+
checking**.
|
|
605
|
+
|
|
606
|
+
A `case` that does not cover every value the scrutinee can take
|
|
607
|
+
*should*, under exhaustiveness, be a type error rather than a
|
|
608
|
+
runtime fallthrough. The compiler verifies that for every
|
|
609
|
+
possible shape of the scrutinee, some arm matches.
|
|
610
|
+
|
|
611
|
+
### Academic root
|
|
612
|
+
|
|
613
|
+
**Maranget 2007** (*Warnings for Pattern Matching*) gave the
|
|
614
|
+
algorithm OCaml uses to compute pattern-matching warnings —
|
|
615
|
+
non-exhaustive matches and redundant arms. The broader topic
|
|
616
|
+
sits in the ML / Haskell lineage where exhaustiveness has been
|
|
617
|
+
load-bearing since the late 1970s.
|
|
618
|
+
|
|
619
|
+
### Industrial uptake
|
|
620
|
+
|
|
621
|
+
- **OCaml**: emits warnings for non-exhaustive matches; can be
|
|
622
|
+
turned into hard errors via `-warn-error +8` or per-pattern
|
|
623
|
+
attributes.
|
|
624
|
+
- **Rust**: requires every `match` expression to be exhaustive,
|
|
625
|
+
whatever the scrutinee type; a non-exhaustive `match` is a *compile* error.
|
|
626
|
+
- **Scala**: warns on non-exhaustive `match`; raises
|
|
627
|
+
`MatchError` at runtime if unmatched.
|
|
628
|
+
- **TypeScript**: simulates exhaustiveness via the "exhaustive
|
|
629
|
+
`never` check" idiom — assigning the scrutinee to a `never`-
|
|
630
|
+
typed variable in the default branch fails to type-check if
|
|
631
|
+
any case was missed.
|
|
632
|
+
- **Sorbet**: `T.absurd(x)` checks that every case of a union
|
|
633
|
+
has been narrowed away by the point of the call.
|
|
634
|
+
|
|
635
|
+
### Ruby and Rigor's position
|
|
636
|
+
|
|
637
|
+
Ruby itself performs no static exhaustiveness check. An unmatched
|
|
638
|
+
`case/when` silently evaluates to `nil`, whereas an unmatched
|
|
639
|
+
`case/in` without an `else` raises `NoMatchingPatternError`
|
|
640
|
+
at runtime. Rigor inherits the
|
|
641
|
+
language behaviour:
|
|
642
|
+
|
|
643
|
+
- Occurrence-typing narrowing through `case/in` is implemented
|
|
644
|
+
and load-bearing for downstream precision.
|
|
645
|
+
- Exhaustiveness checking is **NOT** implemented in v0.1.x.
|
|
646
|
+
|
|
647
|
+
The choice is consistent with Rigor's no-false-positives stance.
|
|
648
|
+
A `pattern.non-exhaustive` diagnostic would fire on:
|
|
649
|
+
|
|
650
|
+
1. A scrutinee inferred as a union whose arms were not all
|
|
651
|
+
matched — but the inferred union itself may be approximate
|
|
652
|
+
(`Dynamic[T]` widening, capability-role narrowing, plugin
|
|
653
|
+
contributions), so a reported "missing arm" could be an
|
|
654
|
+
artifact of the approximation rather than a real gap.
|
|
655
|
+
2. A scrutinee whose "exhaustive set" is open-ended at runtime —
|
|
656
|
+
open class hierarchies, user-defined `===`, monkey-patched
|
|
657
|
+
`kind_of?`. These shapes are common in Ruby and rarely typed
|
|
658
|
+
precisely enough for an exhaustiveness check to be reliable.
|
|
659
|
+
3. A developer deliberately relying on the fall-through return
|
|
660
|
+
of `nil`. Idiomatic in some Ruby styles.
|
|
661
|
+
|
|
662
|
+
The false-positive surface is uncomfortably large for a language
|
|
663
|
+
without ADTs. A `pattern.non-exhaustive` diagnostic is a future
|
|
664
|
+
direction (no committed milestone). Users wanting exhaustiveness
|
|
665
|
+
*today* can replicate the TypeScript / Sorbet idiom — call a
|
|
666
|
+
method declared to take `Bot` in the default branch — and
|
|
667
|
+
Rigor's narrowing will surface the missed arm via the
|
|
668
|
+
`call.argument-type-mismatch` diagnostic that already exists.
|
|
669
|
+
|
|
670
|
+
```ruby
|
|
671
|
+
# Self-rolled exhaustiveness via the Bot-parameter idiom
|
|
672
|
+
def unreachable(x)
|
|
673
|
+
  raise "unreachable: #{x.inspect}"
|
|
674
|
+
end
|
|
675
|
+
# RBS: def unreachable: (bot) -> bot
|
|
676
|
+
|
|
677
|
+
case shape
|
|
678
|
+
in :circle then ...
|
|
679
|
+
in :square then ...
|
|
680
|
+
# missing :triangle
|
|
681
|
+
else unreachable(shape)
|
|
682
|
+
# If `shape` could still be :triangle here, the
|
|
683
|
+
# `unreachable` call's argument type mismatches `bot`
|
|
684
|
+
# and call.argument-type-mismatch fires.
|
|
685
|
+
end
|
|
686
|
+
```
|
|
687
|
+
|
|
688
|
+
This is not as ergonomic as a first-class
|
|
689
|
+
`pattern.non-exhaustive`, but it is sound under Rigor's
|
|
690
|
+
no-false-positives discipline and works today.
|
|
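The runtime half of this section can be checked directly in plain Ruby. The sketch below assumes Ruby 3.0 or later (where `case/in` is no longer experimental); the method names are hypothetical.

```ruby
def sides_when(shape)
  case shape
  when :triangle then 3
  when :square then 4
  end
end

def sides_in(shape)
  case shape
  in :triangle then 3
  in :square then 4
  end
end

sides_when(:circle)   # evaluates to nil, no error raised

begin
  sides_in(:circle)   # no arm matches and there is no `else`
rescue NoMatchingPatternError => e
  warn "unmatched pattern: #{e.message}"
end
```

Neither failure is visible before the code runs, which is the gap a static exhaustiveness diagnostic would close.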
691
|
+
|
|
692
|
+
## Gradual typing
|
|
693
|
+
|
|
694
|
+
Gradual typing (Siek & Taha, 2006; Garcia, Clark & Tanter, 2016)
|
|
695
|
+
is the discipline of letting statically-typed and
|
|
696
|
+
dynamically-typed code coexist in one program. The technical
|
|
697
|
+
machinery is:
|
|
698
|
+
|
|
699
|
+
1. A distinguished "dynamic" type (`?` in the original paper).
|
|
700
|
+
2. A *consistency* relation `~` that admits the dynamic type
|
|
701
|
+
anywhere a concrete type is expected (and vice versa) but
|
|
702
|
+
refuses to bridge two unrelated concrete types.
|
|
703
|
+
3. Optional run-time casts at the static/dynamic boundary.
|
|
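The consistency relation in item 2 can be sketched over a toy type language. Everything here is an assumption made for illustration: `DYNAMIC` stands for `?`, base types are plain Symbols, and function types are written `[:fn, param, result]`. It is not Rigor's implementation.

```ruby
DYNAMIC = :"?"

# consistent?(a, b) models `a ~ b` for the toy type language above.
def consistent?(a, b)
  # The dynamic type is consistent with everything, in either position.
  return true if a == DYNAMIC || b == DYNAMIC

  # Function types are consistent when their parts are, pairwise.
  if a.is_a?(Array) && b.is_a?(Array)
    _, a_param, a_result = a
    _, b_param, b_result = b
    return consistent?(a_param, b_param) && consistent?(a_result, b_result)
  end

  # Two concrete types must be identical; `~` never bridges unrelated ones.
  a == b
end

consistent?(:Integer, DYNAMIC)                                  # true
consistent?([:fn, :Integer, DYNAMIC], [:fn, DYNAMIC, :String])  # true
consistent?(:Integer, :String)                                  # false
```

Note that the relation is not transitive: `:Integer ~ ?` and `? ~ :String` both hold, yet `:Integer ~ :String` does not. That is what separates consistency from an equivalence or a subtyping order.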
704
|
+
|
|
705
|
+
Rigor maps onto this as:
|
|
706
|
+
|
|
707
|
+
| Gradual concept | Rigor surface |
|
|
708
|
+
| --- | --- |
|
|
709
|
+
| Dynamic type `?` | **`Dynamic[T]`** — a carrier that *wraps* a "best-guess" type `T` while marking the value as not-statically-verified. `Dynamic[top]` is the maximally-dynamic form. |
|
|
710
|
+
| Consistency `~` | The `maybe` arm of the trinary certainty — `Dynamic[T] ~ U` holds whenever `T ~ U` does. |
|
|
711
|
+
| Static/dynamic boundary | Per-method, per-file, per-plugin contribution — Rigor records *why* a value became `Dynamic[T]` in its dynamic-origin algebra. |
|
|
712
|
+
| Casts | No in-source cast operator. The opt-in [`rigor-sorbet`](../../plugins/rigor-sorbet/) plugin reads `T.let` / `T.cast` / `T.must` as cast forms; `RBS::Extended` `assert_type` directives serve the same role from `.rbs`. |
|
|
713
|
+
|
|
714
|
+
Two Rigor-specific extensions matter:
|
|
715
|
+
|
|
716
|
+
1. **`Dynamic[T]` is parameterised.** The original gradual-typing
|
|
717
|
+
paper has a single `?`; Rigor carries the "what we would
|
|
718
|
+
*guess* the type is if asked to commit" alongside the
|
|
719
|
+
uncertainty marker, so refactoring tools can offer better
|
|
720
|
+
suggestions.
|
|
721
|
+
2. **The robustness principle (Postel's law for types)** —
|
|
722
|
+
parameters are accepted leniently (closer to `Dynamic[T]`),
|
|
723
|
+
returns are reported strictly. See
|
|
724
|
+
[ADR-5](../adr/5-robustness-principle.md).
|
|
725
|
+
|
|
726
|
+
Spec: [`docs/type-specification/special-types.md`](../type-specification/special-types.md),
|
|
727
|
+
[`docs/type-specification/value-lattice.md`](../type-specification/value-lattice.md).
|
|
728
|
+
|
|
729
|
+
## Blame, the gradual guarantee, and trust boundaries
|
|
730
|
+
|
|
731
|
+
The previous section described `Dynamic[T]` and the consistency
|
|
732
|
+
relation `~` but stopped at the static side. The full
|
|
733
|
+
gradual-typing literature has a substantial run-time-and-policy
|
|
734
|
+
theory built on those static foundations. Rigor inherits part of
|
|
735
|
+
it; the rest is deliberately out of scope.
|
|
736
|
+
|
|
737
|
+
### Blame
|
|
738
|
+
|
|
739
|
+
**Findler & Felleisen 2002** (*Contracts for Higher-Order
|
|
740
|
+
Functions*) introduced the **blame** principle: when a value
|
|
741
|
+
flows across a static / dynamic boundary and a contract violation
|
|
742
|
+
is detected at runtime, *whose code is at fault*? The answer must
|
|
743
|
+
be unambiguous and inferable from the boundary topology — a value
|
|
744
|
+
crossing from typed code into untyped code carries a positive
|
|
745
|
+
contract obligation; from untyped to typed, a negative one.
|
|
746
|
+
|
|
747
|
+
**Wadler & Findler 2009** (*Well-Typed Programs Can't Be Blamed*)
|
|
748
|
+
gave the slogan: a typed module that follows its declared
|
|
749
|
+
interface is never the cause of a blame error — only untyped code
|
|
750
|
+
(or a static / dynamic interface mismatch) can be.
|
|
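A first-order version of blame assignment can be sketched as a wrapper around a lambda. The names (`ContractViolation`, `with_int_contract`, the party labels) are hypothetical, and the sketch covers only a flat `(Integer) -> Integer` contract. The higher-order case from the paper, where blame parties swap for function-typed arguments, is not modelled.

```ruby
class ContractViolation < StandardError
  attr_reader :blamed

  def initialize(blamed, message)
    @blamed = blamed
    super("#{blamed} broke the contract: #{message}")
  end
end

# Wraps `fn` with an (Integer) -> Integer contract.
# A bad argument blames the caller (the negative party);
# a bad result blames the wrapped function (the positive party).
def with_int_contract(fn, positive:, negative:)
  lambda do |arg|
    unless arg.is_a?(Integer)
      raise ContractViolation.new(negative, "argument #{arg.inspect} is not an Integer")
    end

    result = fn.call(arg)
    unless result.is_a?(Integer)
      raise ContractViolation.new(positive, "result #{result.inspect} is not an Integer")
    end

    result
  end
end

halve = with_int_contract(->(n) { n / 2.0 }, positive: "library", negative: "client")

begin
  halve.call(4)
rescue ContractViolation => e
  e.blamed   # "library": the wrapped function returned a Float
end
```

Passing `"4"` instead of `4` would blame `"client"`, because the violation is detected on the way in rather than on the way out.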
751
|
+
|
|
752
|
+
### The gradual guarantee
|
|
753
|
+
|
|
754
|
+
**Siek, Vitousek, Cimini & Boyland 2015** (*Refined Criteria for
|
|
755
|
+
Gradual Typing*) formalised the property a gradual type system is
|
|
756
|
+
most commonly *expected* to satisfy:
|
|
757
|
+
|
|
758
|
+
> Removing type annotations from a well-typed program does not
|
|
759
|
+
> introduce new errors. Adding annotations may introduce errors,
|
|
760
|
+
> but only where a new annotation conflicts with the program.
|
|
761
|
+
|
|
762
|
+
This is the **gradual guarantee**. It is the property that makes
|
|
763
|
+
gradual adoption psychologically viable: a developer adding an
|
|
764
|
+
RBS annotation to a working method should never break a
|
|
765
|
+
previously-passing call site, and removing an annotation should
|
|
766
|
+
never fire a new diagnostic.
|
|
767
|
+
|
|
768
|
+
### Rigor's position
|
|
769
|
+
|
|
770
|
+
Rigor does not insert runtime contracts. Blame in the
|
|
771
|
+
Findler-Felleisen sense has no direct operational analogue —
|
|
772
|
+
Rigor is static-only, and a `Dynamic[T]`-to-concrete-`T` flow is
|
|
773
|
+
a static decision, not a runtime check that could "blame" anyone.
|
|
774
|
+
|
|
775
|
+
The **gradual guarantee** *is* a property Rigor can be measured
|
|
776
|
+
against:
|
|
777
|
+
|
|
778
|
+
- **In spirit**, the no-false-positives stance
|
|
779
|
+
([ADR-5](../adr/5-robustness-principle.md)) is strictly
|
|
780
|
+
stronger than the gradual guarantee. If Rigor was silent on a
|
|
781
|
+
call site before an annotation was added, it remains silent
|
|
782
|
+
after — unless the annotation provably contradicts the runtime
|
|
783
|
+
behaviour, in which case the diagnostic fires on the annotation
|
|
784
|
+
rather than on the call. The asymmetry "strict on returns,
|
|
785
|
+
lenient on parameters" is calibrated to satisfy this property
|
|
786
|
+
by construction.
|
|
787
|
+
- **In practice**, the gradual guarantee in Rigor reads as: a
|
|
788
|
+
project's baseline of "passes without annotation" should never
|
|
789
|
+
regress when an RBS file is added. This is exactly the property
|
|
790
|
+
the [PHPStan-shaped baseline mechanism](../adr/22-baseline-and-project-onboarding.md)
|
|
791
|
+
enforces — adding annotations shrinks the baseline; it never
|
|
792
|
+
grows it on un-annotated code.
|
|
793
|
+
|
|
794
|
+
### What Rigor explicitly does NOT do
|
|
795
|
+
|
|
796
|
+
- **Runtime contract insertion at the static / dynamic boundary.**
|
|
797
|
+
The opt-in [`rigor-sorbet`](../../plugins/rigor-sorbet/) plugin
|
|
798
|
+
reads Sorbet's `T.let` / `T.cast` / `T.must` as cast forms, but
|
|
799
|
+
the contract *enforcement* is `sorbet-runtime`'s job, not
|
|
800
|
+
Rigor's. Rigor's static analysis uses the cast as a hint, not
|
|
801
|
+
as a check.
|
|
802
|
+
- **Blame-tracking algebra.** Rigor's dynamic-origin tracking
|
|
803
|
+
records *why* a value became `Dynamic[T]` (which plugin / which
|
|
804
|
+
file / which boundary) and is consulted by refactoring tools,
|
|
805
|
+
but does not assign run-time fault. There is no positive /
|
|
806
|
+
negative contract obligation in Rigor's algebra.
|
|
807
|
+
- **Trust polarity at boundaries.** The "typed code is trusted;
|
|
808
|
+
untyped code is suspect" framing that the
|
|
809
|
+
Wadler-Findler-Greenberg lineage builds on is replaced in
|
|
810
|
+
Rigor by the simpler "we report only diagnostics we can
|
|
811
|
+
prove" framing — which removes the question of who is to
|
|
812
|
+
blame by removing the runtime decision point.
|
|
813
|
+
|
|
814
|
+
The gradual-typing trinity for Rigor: **consistency `~`** (the
|
|
815
|
+
static side, § "Gradual typing"); the **gradual guarantee** (the
|
|
816
|
+
migration story, this section); and **no runtime cost** (the
|
|
817
|
+
engineering stance — Rigor is a compile-time tool, not a
|
|
818
|
+
contract system).
|
|
819
|
+
|
|
820
|
+
## Effect systems
|
|
821
|
+
|
|
822
|
+
A textbook **effect system** annotates each expression with two
|
|
823
|
+
things: a type *and* a set of effects (Lucassen & Gifford, 1988).
|
|
824
|
+
Effects include I/O, mutation, exceptions, divergence, allocation.
|
|
825
|
+
|
|
826
|
+
Rigor has an effect model but it lives **inside the engine**, not
|
|
827
|
+
at the user surface:
|
|
828
|
+
|
|
829
|
+
| Engine-internal effect | What it tracks | User-visible consequence |
|
|
830
|
+
| --- | --- | --- |
|
|
831
|
+
| Mutation | `arr << x`, `h[k] = v`, ivar writes | A mutation invalidates narrowed facts about the mutated value; the narrowing no longer holds at the next read. |
|
|
832
|
+
| Exception / non-local exit | `raise`, `throw`, `return`, `break` | The branch contributes nothing to the join; methods that always raise return `Bot`. |
|
|
833
|
+
| Closure escape | A block stored or yielded outside its lexical scope | Narrowings inside the block are not exported to the outer scope. |
|
|
834
|
+
|
|
835
|
+
These effects are not part of an authored signature. They are
|
|
836
|
+
inferred from the AST walk and consulted by the narrowing logic.
|
|
837
|
+
Future plugin / annotation extensions to surface effects at the
|
|
838
|
+
user level are tracked in the spec corpus but not part of v0.1.x.
|
|
839
|
+
|
|
840
|
+
Spec: [`docs/type-specification/control-flow-analysis.md`](../type-specification/control-flow-analysis.md)
|
|
841
|
+
("Mutation effects" subsection).
|
|
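The "exception / non-local exit" row can be made concrete with a helper that never returns. This is a sketch under stated assumptions: `invalid!` and `parse_port` are hypothetical names, and the comments describe what the table above implies rather than verified tool output.

```ruby
# Always raises, so per the table its return type would be Bot.
def invalid!(message)
  raise ArgumentError, message
end

def parse_port(text)
  port = Integer(text, exception: false)   # Integer, or nil when unparsable
  invalid!("not a number: #{text.inspect}") if port.nil?
  # The nil branch ended in a non-local exit and contributes nothing to
  # the join, so only the Integer case reaches this line.
  port
end
```

No effect annotation appears anywhere in the source; the fact that `invalid!` never returns is something the engine would have to derive from its body.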
842
|
+
|
|
843
|
+
## Soundness, completeness, and the no-false-positives stance
|
|
844
|
+
|
|
845
|
+
A static type system is:
|
|
846
|
+
|
|
847
|
+
- **Sound** when every program it accepts is free of the runtime
|
|
848
|
+
errors the type system is supposed to catch ("no false
|
|
849
|
+
negatives at runtime").
|
|
850
|
+
- **Complete** when every program free of those runtime errors is
|
|
851
|
+
accepted by the type system ("no false positives at
|
|
852
|
+
type-check time").
|
|
853
|
+
|
|
854
|
+
Rice's theorem implies you cannot have both in full generality.
|
|
855
|
+
Mainstream static type systems choose **sound but incomplete**
|
|
856
|
+
(Java, Haskell, Rust modulo unsafe). Rigor takes the opposite
|
|
857
|
+
default:
|
|
858
|
+
|
|
859
|
+
> Rigor only fires a diagnostic when it can **prove** the
|
|
860
|
+
> unsoundness. Cases it cannot decide are silent.
|
|
861
|
+
|
|
862
|
+
This is a deliberate design choice grounded in the project's
|
|
863
|
+
audience: Ruby programmers who would otherwise not run a type
|
|
864
|
+
checker at all. A noisy false-positive on the first day kills
|
|
865
|
+
adoption faster than a missed bug on day 30. The robustness
|
|
866
|
+
principle ([ADR-5](../adr/5-robustness-principle.md)) is the
|
|
867
|
+
formal expression of this stance: lenient on parameters
|
|
868
|
+
("anyone could call this with anything"), strict on returns
|
|
869
|
+
("we will commit to what we actually return").
|
|
870
|
+
|
|
871
|
+
The trade-offs to be aware of:
|
|
872
|
+
|
|
873
|
+
- **Rigor will miss bugs that a sound checker would catch.**
|
|
874
|
+
This is by design; the alternative is more friction than the
|
|
875
|
+
bug it would catch.
|
|
876
|
+
- **The trinary certainty (`yes` / `no` / `maybe`)** is the
|
|
877
|
+
formal acknowledgement of incompleteness. Most checkers
|
|
878
|
+
collapse to binary; Rigor preserves the third arm because
|
|
879
|
+
it's the arm that earns silence.
|
|
880
|
+
- **`Dynamic[T]` is not a failure mode** in Rigor's model. It is
|
|
881
|
+
a first-class carrier with full algebraic identity.
|
|
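The role of the third arm can be sketched with a toy three-valued check. This is an illustrative model only: types are plain Ruby classes, `:dynamic` stands in for `Dynamic[top]`, and the method names are hypothetical rather than Rigor's API.

```ruby
# Does a value of `value_type` fit a parameter declared as `expected`?
def accepts?(expected, value_type)
  return :maybe if value_type == :dynamic
  return :yes if value_type <= expected      # same class or a subclass
  return :maybe if expected <= value_type    # might be a fitting subclass at runtime
  :no                                        # provably unrelated
end

# Only a proven mismatch produces a diagnostic; `maybe` stays silent.
def diagnostics(expected, value_type)
  accepts?(expected, value_type) == :no ? ["argument type mismatch"] : []
end

accepts?(Numeric, Integer)   # :yes
accepts?(Integer, Numeric)   # :maybe
accepts?(Integer, String)    # :no
```

A binary checker would have to fold `:maybe` into one of the other two answers, which means either reporting unproven errors or claiming unproven safety.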
882
|
+
|
|
883
|
+
## Decidability of inference
|
|
884
|
+
|
|
885
|
+
A type system's *expressive power* and the *decidability of
|
|
886
|
+
inferring its types* pull in opposite directions. Adding the wrong
|
|
887
|
+
combination of features can push inference into undecidable
|
|
888
|
+
territory — equivalent in difficulty to the halting problem.
|
|
889
|
+
Language designers therefore pick a fragment that is decidable for
|
|
890
|
+
inference and require annotations for anything beyond.
|
|
891
|
+
|
|
892
|
+
The most accessible introductory survey of this landscape in
|
|
893
|
+
Japanese is 水野雅之「計算機に推論できる型、できない型」
|
|
894
|
+
(Wantedly Advent Calendar, 2021; see the reading list below). The
|
|
895
|
+
key results it walks through, in terms of where Rigor sits:
|
|
896
|
+
|
|
897
|
+
| Feature | Inference status | Rigor stance |
|
|
898
|
+
| --- | --- | --- |
|
|
899
|
+
| Let-polymorphism (Hindley–Milner) | Decidable; ~linear in practice | Not Rigor's strategy. Rigor is gradual, not HM-based — RBS generics resolve by walking call sites and consulting signatures, not by global unification. |
|
|
900
|
+
| Higher-rank polymorphism, Rank-2 | Decidable with annotations (Kfoury & Wells, 1994) | Not exposed at the user surface. RBS generics are predicative. |
|
|
901
|
+
| Higher-rank polymorphism, Rank-3+ | **Undecidable** (Wells, 1999) | Not exposed. Would force annotations wherever a polymorphic value flows. |
|
|
902
|
+
| Polymorphic recursion | **Undecidable** (Henglein, 1993) | Not exposed. A generic method body sees its type parameter as fixed at the call site — recursive calls do not re-instantiate it. |
|
|
903
|
+
| Recursive types as inference targets | Decidable for equi/iso-recursive forms, but most languages exclude them from inference | RBS type aliases are nominal — recursive shapes (a tree, a JSON value) live behind a name. Rigor does not synthesise an anonymous fixed-point type during inference. The OCaml cautionary example `let f g x = x x` is well-typed under unrestricted recursive types — exactly the kind of "accepted but unwanted" judgment that motivates the exclusion. |
|
|
904
|
+
| Subtyping + intersection types (full) | **Undecidable in general** | Rigor exposes both `<:` and `&` (meet). Instead of restricting the language to recover decidability, it trades completeness for the trinary certainty — the `maybe` arm is what closes the gap. |
|
|
905
|
+
|
|
906
|
+
### Rigor's pragmatic response: the third arm
|
|
907
|
+
|
|
908
|
+
A textbook sound type checker has two ways to react when inference
|
|
909
|
+
cannot decide:
|
|
910
|
+
|
|
911
|
+
1. **Restrict the language** — give up the offending feature (HM
|
|
912
|
+
gives up rank-N polymorphism to keep inference total).
|
|
913
|
+
2. **Demand annotations** — push the burden onto the author
|
|
914
|
+
(System F makes the user write `Λα.` themselves).
|
|
915
|
+
|
|
916
|
+
Rigor's no-false-positives stance enables a third route, available
|
|
917
|
+
only in the gradual setting:
|
|
918
|
+
|
|
919
|
+
> When inference cannot decide, return `maybe` and stay silent.
|
|
920
|
+
|
|
921
|
+
The `maybe` arm of the trinary certainty is therefore not only an
|
|
922
|
+
acknowledgement of *runtime* uncertainty (the gradual concern from
|
|
923
|
+
§ "Gradual typing"); it is also the formal acknowledgement that
|
|
924
|
+
the static system is *deliberately incomplete in the inferability
|
|
925
|
+
sense*. The two incompletenesses share one representation in
|
|
926
|
+
Rigor's algebra because the practical answer in both cases is the
|
|
927
|
+
same: do not fire a diagnostic the system cannot justify.
|
|
928
|
+
|
|
929
|
+
```ruby
|
|
930
|
+
# A call where deciding the subtyping-with-intersection constraint
|
|
931
|
+
# would require global, undecidable inference. Rigor returns
|
|
932
|
+
# `maybe` and emits no diagnostic.
|
|
933
|
+
def consume(x)
|
|
934
|
+
  x.frobnicate if x.respond_to?(:frobnicate)
|
|
935
|
+
end
|
|
936
|
+
consume(some_value_from_a_dynamic_source) # certainty: maybe — silent
|
|
937
|
+
```
|
|
938
|
+
|
|
939
|
+
This stance also explains a recurring shape in Rigor's design:
|
|
940
|
+
when a feature would only be addable at the cost of global,
|
|
941
|
+
inference-time blow-up (closed row variables, first-class
|
|
942
|
+
higher-rank polymorphism, full GADT-style constructor-driven
|
|
943
|
+
narrowing), Rigor either ships a nominal substitute (capability
|
|
944
|
+
roles for row polymorphism, `interface` for existentials) or
|
|
945
|
+
defers the feature behind an ADR rather than degrade to a noisy
|
|
946
|
+
approximation.
|
|
947
|
+
|
|
948
|
+
## Hindley–Milner, principal types, and Rigor's inference architecture
|
|
949
|
+
|
|
950
|
+
The previous two sections discussed **soundness** (does the
|
|
951
|
+
system accept only programs that cannot crash?) and
|
|
952
|
+
**decidability** (does the system always give an answer in finite
|
|
953
|
+
time?). Type-theory textbooks bundle these with a third property
|
|
954
|
+
that the appendix has not named so far:
|
|
955
|
+
|
|
956
|
+
- **Principal type property** — every well-typed expression has a
|
|
957
|
+
*most general* type, of which every other valid typing is a
|
|
958
|
+
substitution-instance. In a system with the principal type
|
|
959
|
+
property, "the type of `e`" is a canonical, unambiguous answer
|
|
960
|
+
— not a guess among many.
|
|
961
|
+
|
|
962
|
+
These three properties interact in a way worth understanding,
|
|
963
|
+
because **Hindley–Milner (HM)** — the type system underlying ML,
|
|
964
|
+
OCaml, and Haskell — is the canonical example of having all three
|
|
965
|
+
at once.
|
|
966
|
+
|
|
967
|
+
### What HM achieves and what it gives up
|
|
968
|
+
|
|
969
|
+
The classical Damas–Milner theorem (1982) is roughly:
|
|
970
|
+
|
|
971
|
+
> Every term typable in HM has a unique principal type, computable
|
|
972
|
+
> by unification (Algorithm W). The system is sound, decidable,
|
|
973
|
+
> and inference is "free" — the user writes no type annotations.
|
|
974
|
+
|
|
975
|
+
The cost is structural. HM accepts only a language without:
|
|
976
|
+
|
|
977
|
+
- rank-N polymorphism beyond let-bound generalisation;
|
|
978
|
+
- subtyping;
|
|
979
|
+
- intersection types;
|
|
980
|
+
- unrestricted recursive types at the user surface;
|
|
981
|
+
- polymorphic recursion.
|
|
982
|
+
|
|
983
|
+
Each excluded feature is exactly the kind that breaks one of the
|
|
984
|
+
three properties when added back:
|
|
985
|
+
|
|
986
|
+
| Added feature | Property that breaks first |
|
|
987
|
+
| --- | --- |
|
|
988
|
+
| Rank-3+ polymorphism | Decidability (Wells, 1999) |
|
|
989
|
+
| Polymorphic recursion | Decidability (Henglein, 1993) |
|
|
990
|
+
| Subtyping in general | Principal types (a value can satisfy several incomparable interfaces; "most general" stops being unique) |
|
|
991
|
+
| Subtyping + intersection (full) | Decidability |
|
|
992
|
+
| Unrestricted recursive types | The "principle of least surprise" — terms like `λx. x x` become well-typed |
|
|
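The unification step at the heart of Algorithm W can be sketched in a few lines. This is a minimal sketch and not a full HM implementation: there is no `let`-generalisation and no term language. The encoding is an assumption made for illustration: lowercase Symbols are type variables, capitalised Symbols are base types, and `[:Fn, param, result]` is a function type.

```ruby
def type_var?(t)
  t.is_a?(Symbol) && t.to_s.match?(/\A[a-z]/)
end

# Replace variables in `t` using the substitution `subst` (a Hash).
def apply(subst, t)
  if type_var?(t)
    subst.key?(t) ? apply(subst, subst[t]) : t
  elsif t.is_a?(Array)
    [:Fn, apply(subst, t[1]), apply(subst, t[2])]
  else
    t
  end
end

# Occurs check: does `var` appear inside `t`?
def occurs?(var, t, subst)
  t = apply(subst, t)
  return t == var if type_var?(t)

  t.is_a?(Array) && (occurs?(var, t[1], subst) || occurs?(var, t[2], subst))
end

def bind(var, t, subst)
  raise TypeError, "infinite type: #{var} occurs in #{t.inspect}" if occurs?(var, t, subst)

  subst.merge(var => t)
end

# Returns a most general substitution making `a` and `b` equal,
# or raises TypeError when none exists.
def unify(a, b, subst = {})
  a = apply(subst, a)
  b = apply(subst, b)
  return subst if a == b
  return bind(a, b, subst) if type_var?(a)
  return bind(b, a, subst) if type_var?(b)

  if a.is_a?(Array) && b.is_a?(Array)
    subst = unify(a[1], b[1], subst)
    return unify(a[2], b[2], subst)
  end

  raise TypeError, "cannot unify #{a.inspect} with #{b.inspect}"
end

unify([:Fn, :a, :Int], [:Fn, :Bool, :b])   # {a: :Bool, b: :Int}
```

The occurs check is where the last table row shows up: typing `λx. x x` requires solving `a = a -> b`, which `unify(:a, [:Fn, :a, :b])` rejects as an infinite type. Admitting unrestricted recursive types amounts to dropping that check.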
993
|
+
|
|
994
|
+
### Why Rigor cannot be HM
|
|
995
|
+
|
|
996
|
+
Rigor's surface **already contains the features HM excludes**.
|
|
997
|
+
Subtyping is the foundation of the lattice (`<:`); intersection
|
|
998
|
+
(`&`) is in the algebra; refinements add predicate subtyping;
|
|
999
|
+
generics + occurrence typing + capability roles cover the
|
|
1000
|
+
polymorphism uses Ruby programmers have. An HM-style
|
|
1001
|
+
"infer a principal type for every expression by global
|
|
1002
|
+
unification" architecture is therefore not available to Rigor in
|
|
1003
|
+
principle. This is a structural consequence of the type language
|
|
1004
|
+
Ruby authors expect, not a missing feature.
|
|
1005
|
+
|
|
1006
|
+
Rigor's inference is instead **local and walker-driven**:
|
|
1007
|
+
|
|
1008
|
+
- The walker descends the AST once.
|
|
1009
|
+
- At each expression site it consults RBS signatures, narrowing
|
|
1010
|
+
facts, mutation effects, and plugin contributions.
|
|
1011
|
+
- It returns *the* type of the expression *at that point in the
|
|
1012
|
+
control flow* — the most specific type the local walk can
|
|
1013
|
+
justify, not a canonically most-general one.
|
|
1014
|
+
|
|
1015
|
+
The same expression appearing at two program points may yield two
|
|
1016
|
+
different types (narrowing, flow merges, mutation, plugin
|
|
1017
|
+
contributions can all change the result). This is closer in spirit to
|
|
1018
|
+
TypeScript's contextual / flow-sensitive typing than to HM's
|
|
1019
|
+
unification, and it matches how Ruby authors reason about
|
|
1020
|
+
their code: `arr` after `arr.compact!` is not "the same type" as
|
|
1021
|
+
`arr` before it.
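The `compact!` case can be made concrete with plain Ruby. The type annotations in the comments are an informal reading of what a flow-sensitive checker could assign at each point; they are assumptions for illustration, not verified Rigor output.

```ruby
arr = [1, nil, 3]                      # informally: Array[Integer | nil]
before = arr.map(&:class).uniq         # element classes seen before the mutation

arr.compact!                           # in-place mutation drops every nil

after = arr.map(&:class).uniq          # informally: Array[Integer] from here on
```

The variable name is the same at both points; only the program point differs, which is why a single canonical type per expression does not fit.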
|
|
1022
|
+
|
|
1023
|
+
### Property ledger
|
|
1024
|
+
|
|
1025
|
+
The three properties laid against Rigor and HM:
|
|
1026
|
+
|
|
1027
|
+
| Property | Hindley–Milner | Rigor |
|
|
1028
|
+
| --- | --- | --- |
|
|
1029
|
+
| **Soundness** | Yes | **No, by design** — `maybe` cases stay silent (§ "Soundness, completeness, and the no-false-positives stance"). |
|
|
1030
|
+
| **Decidability** | Yes (DEXPTIME worst-case, near-linear in practice) | Decidable per local walk; whatever the walker cannot decide, it returns `maybe` (§ "Decidability of inference"). |
|
|
1031
|
+
| **Principal type property** | Yes | **No** — subtyping + intersection break it. Rigor reports a *per-occurrence* type, not a canonical most-general one. |
|
|
1032
|
+
|
|
1033
|
+
HM trades expressiveness for the
|
|
1034
|
+
trinity (soundness + decidability + principal types). Rigor
|
|
1035
|
+
trades the trinity for expressiveness, and recovers what it can
|
|
1036
|
+
through the trinary certainty and the no-false-positives stance.
|
|
1037
|
+
|
|
1038
|
+
### A note on bidirectional / local type inference
|
|
1039
|
+
|
|
1040
|
+
Once subtyping enters the picture, the textbook fallback for
|
|
1041
|
+
HM-style global unification is **bidirectional** or **local type
|
|
1042
|
+
inference** (Pierce & Turner, 2000): split typing rules into a
|
|
1043
|
+
*synthesis* mode (compute the type of `e` from `e`) and a
|
|
1044
|
+
*checking* mode (verify that `e` has an expected type). Steep is
|
|
1045
|
+
in this lineage. Rigor's walker is bidirectional in this informal
|
|
1046
|
+
sense — call sites synthesise; RBS signatures check parameters
|
|
1047
|
+
against the synthesised argument types — but Rigor does not
|
|
1048
|
+
formalise the bidirectional rules because the gradual setting and
|
|
1049
|
+
the trinary certainty make the "could not decide" case explicit
|
|
1050
|
+
rather than a typing-rule failure.
|
|
1051
|
+
|
|
1052
|
+
The next section makes this informal claim concrete.
|
|
1053
|
+
|
|
1054
|
+
## Bidirectional type checking
|
|
1055
|
+
|
|
1056
|
+
The HM section noted in passing that Rigor's walker is
|
|
1057
|
+
bidirectional in an informal sense, with no formalised rules. That is the
|
|
1058
|
+
short version of the relationship between Rigor and a substantial
|
|
1059
|
+
contemporary line of type-system work, and worth its own
|
|
1060
|
+
treatment.
|
|
1061
|
+
|
|
1062
|
+
### The synthesis / checking split
|
|
1063
|
+
|
|
1064
|
+
**Pierce & Turner 2000** (*Local Type Inference*) and the modern
|
|
1065
|
+
canonical reformulation by **Dunfield & Krishnaswami 2013**
|
|
1066
|
+
(*Complete and Easy Bidirectional Typechecking for Higher-Rank
|
|
1067
|
+
Polymorphism*) split typing judgments into two modes:
|
|
1068
|
+
|
|
1069
|
+
- **Synthesis** (`Γ ⊢ e ⇒ T`): given an expression `e`, compute
|
|
1070
|
+
its type `T`. The type flows out.
|
|
1071
|
+
- **Checking** (`Γ ⊢ e ⇐ T`): given `e` and an expected type
|
|
1072
|
+
`T`, verify that `e` has type `T`. The type flows in.
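The two judgments can be sketched as a pair of mutually recursive methods over a toy expression language. This is a teaching sketch under stated assumptions: the tagged-array AST (`[:var, ...]`, `[:lam, ...]`, `[:app, ...]`, `[:ann, ...]`), the `FunT` struct, and the method names `synth` / `check` are all hypothetical and are not Rigor's or Steep's implementation.

```ruby
# Function type: param -> result. Base types are plain symbols.
FunT = Struct.new(:param, :result)

# Synthesis (Γ ⊢ e ⇒ T): the type flows out of the expression.
def synth(env, expr)
  case expr
  in Integer then :Integer
  in String then :String
  in [:var, name] then env.fetch(name)
  in [:ann, inner, type]
    check(env, inner, type)   # an annotation switches to checking mode
    type
  in [:app, fn, arg]
    fn_type = synth(env, fn)
    raise TypeError, "not a function: #{fn_type.inspect}" unless fn_type.is_a?(FunT)
    check(env, arg, fn_type.param)
    fn_type.result
  else
    raise TypeError, "cannot synthesise #{expr.inspect}; annotate it"
  end
end

# Checking (Γ ⊢ e ⇐ T): the expected type flows into the expression.
def check(env, expr, type)
  case [expr, type]
  in [[:lam, param, body], FunT => fn_type]
    # The expected function type supplies the parameter's type.
    check(env.merge(param => fn_type.param), body, fn_type.result)
  else
    actual = synth(env, expr)
    raise TypeError, "expected #{type.inspect}, got #{actual.inspect}" unless actual == type
  end
  true
end

id_int = [:lam, :x, [:var, :x]]
int_to_int = FunT.new(:Integer, :Integer)
applied = synth({}, [:app, [:ann, id_int, int_to_int], 42])
```

An unannotated lambda has no synthesis rule in this sketch, so `synth({}, id_int)` raises; the lambda only becomes typable once an expected type flows in through `check`.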
|
|
1073
|
+
|
|
1074
|
+
Every typing rule is one or the other. The two modes alternate
|
|
1075
|
+
top-down (checking propagates expected types) and bottom-up
|
|
1076
|
+
(synthesis returns types) through the AST, replacing the global
|
|
1077
|
+
unification of HM with *local* constraint discharge. The same
|
|
1078
|
+
machine handles subtyping, intersections, and higher-rank
|
|
1079
|
+
polymorphism without needing a global solver.
|
|
1080
|
+
|
|
1081
|
+
### Industrial uptake
|
|
1082
|
+
|
|
1083
|
+
- **Steep** is explicitly bidirectional and the closest Ruby
|
|
1084
|
+
analogue — its rules are authored in `⇒` / `⇐` form.
|
|
1085
|
+
- **TypeScript**'s "contextual typing" is bidirectional in
|
|
1086
|
+
everything but the name.
|
|
1087
|
+
- **Scala 3** follows the same pattern in its match types and contextual typing.
|
|
1088
|
+
- **OCaml** uses local type inference (Pierce & Turner directly)
|
|
1089
|
+
for higher-rank-polymorphic positions where HM cannot decide.
|
|
1090
|
+
- **Roc**, **ReScript**, **Idris 2** — all bidirectional in the
|
|
1091
|
+
modern Dunfield-Krishnaswami style.
|
|
1092
|
+
|
|
1093
|
+
### Rigor's bidirectional behaviour, informal
|
|
1094
|
+
|
|
1095
|
+
Rigor's walker performs the two modes without naming them:
|
|
1096
|
+
|
|
1097
|
+
| Walker behaviour | Bidirectional mode |
|
|
1098
|
+
| --- | --- |
|
|
1099
|
+
| Computing the type of an argument expression at a call site | Synthesis |
|
|
1100
|
+
| Verifying the argument type against the RBS parameter type | Checking |
|
|
1101
|
+
| Inferring `HashShape` for a hash literal | Synthesis |
|
|
1102
|
+
| Inferring `Tuple` for an array literal | Synthesis |
|
|
1103
|
+
| Walking the `then` / `else` branches of an `if` under a narrowed environment | Synthesis under each branch + join (no expected-type checking) |
|
|
1104
|
+
| Verifying a plugin protocol contract ([ADR-28](../adr/28-path-scoped-protocol-contracts.md)) — method exists + return-type matches | Checking |
|
|
1105
|
+
| Honouring an `RBS::Extended` `assert_type` directive | Checking |
|
|
1106
|
+
|
|
1107
|
+
What Rigor does NOT do that a fully formalised bidirectional
|
|
1108
|
+
system would:
|
|
1109
|
+
|
|
1110
|
+
- **Constraint propagation across non-adjacent expressions.**
|
|
1111
|
+
Each expression's type is decided when the walker reaches it;
|
|
1112
|
+
later uses see the decision as a fixed fact, not as a
|
|
1113
|
+
constraint to be solved together with theirs.
|
|
1114
|
+
- **Local generalisation.** HM's `let`-binding generalisation
|
|
1115
|
+
step does not exist in Rigor; the walker does not introduce
|
|
1116
|
+
fresh type variables to be solved later.
|
|
1117
|
+
- **Formal mode discipline.** Rigor's rules are not authored as
|
|
1118
|
+
`⇒` / `⇐` judgments; the walker's behaviour matches a
|
|
1119
|
+
bidirectional reading but the spec does not enforce it.
|
|
1120
|
+
|
|
1121
|
+
The practical consequence: Rigor's inference is faster than a
|
|
1122
|
+
constraint-based bidirectional system (no global solving) and
|
|
1123
|
+
gives a definite "what type is this expression?" answer at every
|
|
1124
|
+
point — at the cost of not being able to defer typing decisions,
|
|
1125
|
+
which a more constraint-based system would allow when context
|
|
1126
|
+
arrives later in the walk.
|
|
1127
|
+
|
|
1128
|
+
For a reader who has internalised the bidirectional literature,
|
|
1129
|
+
the mental model is: **synthesis everywhere except at the RBS /
|
|
1130
|
+
plugin-contribution boundary, where the declared type is the
|
|
1131
|
+
check side.** That is the entire bidirectional discipline Rigor
|
|
1132
|
+
needs.
|
|
1133
|
+
|
|
1134
|
+
## Beyond pure inference: reach and precision
|
|
1135
|
+
|
|
1136
|
+
The previous sections framed "what cannot be statically inferred"
|
|
1137
|
+
in terms of theoretical decidability — `maybe` as the response
|
|
1138
|
+
when no proof is available. That covers one half of the design
|
|
1139
|
+
space. There is a second half that the reading-order of this
|
|
1140
|
+
appendix has implied but not yet named: phenomena that are *not*
|
|
1141
|
+
theoretically undecidable, where a pure AST-walking inference
|
|
1142
|
+
*could* return a type but the type it would return either (a)
|
|
1143
|
+
does not exist in the AST at all, or (b) exists but is too wide
|
|
1144
|
+
to be useful.
|
|
1145
|
+
|
|
1146
|
+
Both halves are addressed by the same substrate — `RBS::Extended`
|
|
1147
|
+
directives, plugin contributions, the specialised carrier zoo —
|
|
1148
|
+
but for different reasons. It is worth giving them separate
|
|
1149
|
+
names.
|
|
1150
|
+
|
|
1151
|
+
### Reach: the AST does not describe the program
|
|
1152
|
+
|
|
1153
|
+
The walker reads the AST. For a Ruby program, the AST is not a
|
|
1154
|
+
complete description of the program's runtime behaviour:
|
|
1155
|
+
|
|
1156
|
+
- `define_method` synthesises methods whose names are computed at
|
|
1157
|
+
evaluation time.
|
|
1158
|
+
- `attr_accessor :name` defines `#name` / `#name=` whose existence
|
|
1159
|
+
the walker recognises by pattern, not by general reasoning.
|
|
1160
|
+
- `class_eval` / `instance_eval` over a block injects code under
|
|
1161
|
+
a different `self`.
|
|
1162
|
+
- DSL forms like `has_many :posts` or
|
|
1163
|
+
`attribute :name, Types::String` declare *both* a method and a
|
|
1164
|
+
type contract through a single helper call.
|
|
1165
|
+
- `eval(string)` with an arbitrary string is genuinely outside
|
|
1166
|
+
the AST.
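A short runnable illustration of the first two bullets, using a hypothetical `Report` class: the source text contains no `def daily_total`, yet the method exists once the class body has run.

```ruby
class Report
  # Method names are computed while the class body executes.
  %w[daily weekly].each do |period|
    define_method("#{period}_total") { "total for #{period}" }
  end

  # Defines #name and #name= by convention rather than by an explicit `def`.
  attr_accessor :name
end

report = Report.new
report.name = "Q1"

# Every method here was created without a `def` node in the source.
defined = Report.instance_methods(false).sort
```

A walker that only looks for `def` nodes would see an empty class here, which is the gap that pattern recognition and plugin contributions are described as filling.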
|
|
1167
|
+
|
|
1168
|
+
None of these is "undecidable" in the sense of the previous two
|
|
1169
|
+
sections. The semantics are well-defined; the walker
|
|
1170
|
+
cannot *read* them from the AST. This is the **reach**
|
|
1171
|
+
problem, distinct from the decidability problem:
|
|
1172
|
+
|
|
1173
|
+
| Problem class | Example | Rigor's response |
|
|
1174
|
+
| --- | --- | --- |
|
|
1175
|
+
| Theoretical undecidability of inference | Rank-3 polymorphism; subtyping + intersection | The trinary `maybe` |
|
|
1176
|
+
| Reach — the AST does not contain the semantics | `define_method`, Rails DSL, `attr_*` | Plugin contributions + `RBS::Extended` + the [ADR-16](../adr/16-macro-expansion.md) macro substrate |
|
|
1177
|
+
| Genuine runtime opacity | `eval(user_input)` | `Dynamic[top]`, then `maybe` at use sites |
|
|
1178
|
+
|
|
1179
|
+
Plugins are written in Ruby because the reach problem cannot be
|
|
1180
|
+
solved in the type language alone — it needs a Ruby-side
|
|
1181
|
+
recogniser that walks the AST, decides "this `has_many :posts`
|
|
1182
|
+
declares an accessor returning `Relation[Post]`," and contributes
|
|
1183
|
+
that fact to the walker's worldview.
|
|
1184
|
+
[ADR-2](../adr/2-extension-api.md),
|
|
1185
|
+
[ADR-16](../adr/16-macro-expansion.md),
|
|
1186
|
+
[ADR-25](../adr/25-plugin-contributed-rbs.md), and
|
|
1187
|
+
[ADR-28](../adr/28-path-scoped-protocol-contracts.md) define the
|
|
1188
|
+
structured extension points where this knowledge enters.
|
|
1189
|
+
|
|
1190
|
+
### Precision: naive inference produces useless joins
|
|
1191
|
+
|
|
1192
|
+
The second motivation is subtler but at least as important. The
|
|
1193
|
+
simplest "correct" inference rules for compound expressions
|
|
1194
|
+
produce types so wide they tell the user nothing useful:
|
|
1195
|
+
|
|
1196
|
+
| Expression | Naive join | More useful type | Mechanism in Rigor |
|
|
1197
|
+
| --- | --- | --- | --- |
|
|
1198
|
+
| `{user: u, count: 3, msg: "ok"}` | `Hash[Symbol, User \| Integer \| String]` | `HashShape{user: User, count: Integer, msg: String}` | `HashShape` carrier (built-in for hash literals) |
|
|
1199
|
+
| `[1, "a", :sym]` | `Array[Integer \| String \| Symbol]` | `Tuple[Integer, String, Symbol]` | `Tuple` carrier (built-in for array literals) |
|
|
1200
|
+
| A provably-constant value (e.g. `42`, `"ok"`) | `Integer`, `String` | `Constant<42>`, `Constant<"ok">` | `Constant<T>` carrier |
|
|
1201
|
+
| `JSON.parse(input)` | `Hash[String, untyped] \| Array[untyped] \| String \| Integer \| Float \| true \| false \| nil` | `App[json::value, K]` per option `K` | [ADR-20](../adr/20-lightweight-hkt.md) Lightweight HKT + `METHOD_RETURN_OVERRIDES` |
|
|
1202
|
+
| A method whose return depends on its arguments | A wide union of every observed exit | A per-call-site discriminated return | `RBS::Extended` `return_override` directive |
|
|
1203
|
+
| A DSL-managed accessor (`has_many`, `attribute`) | `Dynamic[top]` | `Relation[Model]`, a model-specific shape | Plugin `dynamic_return` + macro substrate |
|
|
1204
|
+
|
|
1205
|
+
These are not undecidability cases — the inference can decide a
|
|
1206
|
+
type, it decides a *useless* one. A type like
|
|
1207
|
+
`Hash[Symbol, Foo | Bar | Buz]` or
|
|
1208
|
+
`true | false | String | Integer | Float` is technically the
|
|
1209
|
+
correct join of observed values, but its consumer cannot do
|
|
1210
|
+
anything with it without narrowing first; the union has erased
|
|
1211
|
+
exactly the information the type system existed to carry.
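The loss of information is visible at runtime too. The sketch below contrasts the value-level "join" of a hash literal with a per-key view; the per-key hash is only a runtime analogue of what the text calls a `HashShape`, not the carrier itself.

```ruby
row = { user: "ann", count: 3, msg: "ok" }

# Naive join: one set of value classes for the whole hash.
value_classes = row.values.map(&:class).uniq

# Shape-like view: each key keeps its own class.
per_key = row.transform_values(&:class)
```

With only `value_classes`, a consumer reading `row[:count]` must still test for `String`; with `per_key`, it already knows the answer.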
|
|
1212
|
+
|
|
1213
|
+
The shared design principle is **strictness on returns** — the
|
|
1214
|
+
robustness principle ([ADR-5](../adr/5-robustness-principle.md))
|
|
1215
|
+
treats "the most specific type the analysis can justify" as the
|
|
1216
|
+
goal, not "the smallest type that covers every observed exit."
|
|
1217
|
+
Naive join-widening fails that test in nearly every case where
|
|
1218
|
+
the inputs are heterogeneous.
|
|
1219
|
+
|
|
1220
|
+
This is also why `HashShape` and `Tuple` are **foundational
|
|
1221
|
+
carriers** rather than exotic refinements: without them every
|
|
1222
|
+
hash literal would degrade to a `Hash`-with-union and the
|
|
1223
|
+
inferred type language would describe almost nothing useful in
|
|
1224
|
+
practice.
|
|
1225
|
+
|
|
1226
|
+
### One substrate, two problems
|
|
1227
|
+
|
|
1228
|
+
The plugin contract and the `RBS::Extended` directive family
|
|
1229
|
+
therefore serve two complementary roles. They extend *where*
|
|
1230
|
+
Rigor can produce a type at all (reach), and they raise *how
|
|
1231
|
+
specific* that type is when produced (precision). The two roles
|
|
1232
|
+
share a substrate but answer different limitations (one of
|
|
1233
|
+
static-analysis scope, one of useful-type design), and neither
|
|
1234
|
+
is the same as the decidability question that the trinary
|
|
1235
|
+
`maybe` answers.
|
|
1236
|
+
|
|
1237
|
+
## The expression problem and Rigor's plugin contract
|
|
1238
|
+
|
|
1239
|
+
A theoretical framing for one of Rigor's central design choices
|
|
1240
|
+
(the plugin contract) comes from the informal note that gave the problem
|
|
1241
|
+
its name.
|
|
1242
|
+
|
|
1243
|
+
### The problem
|
|
1244
|
+
|
|
1245
|
+
**Wadler 1998** (*The Expression Problem*, informal note) posed
|
|
1246
|
+
the challenge: in a typed language, can you simultaneously
|
|
1247
|
+
support
|
|
1248
|
+
|
|
1249
|
+
1. **Adding new types** (new data variants) without modifying
|
|
1250
|
+
existing operations, *and*
|
|
1251
|
+
2. **Adding new operations** (new functions over existing data)
|
|
1252
|
+
without modifying existing types?
|
|
1253
|
+
|
|
1254
|
+
Most type-system paradigms handle one or the other:
|
|
1255
|
+
|
|
1256
|
+
| Paradigm | Easy to add | Hard to add |
|
|
1257
|
+
| --- | --- | --- |
|
|
1258
|
+
| OO (subtyping + dispatch) | New types (subclass) | New operations (must touch every class) |
|
|
1259
|
+
| Functional ADTs + pattern matching | New operations (new function) | New types (must touch every operation) |
|
|
1260
|
+
| Haskell type classes | Either, with care | The other requires `OverlappingInstances` etc. |
|
|
1261
|
+
| Scala traits + pattern matching | Either, with elaboration support | Boilerplate on the unsupported side |
|
|
1262
|
+
| Clojure / Elixir protocols | Either (protocol dispatch) | (Solved by design) |
|
|
1263
|
+
| Ruby open classes | Both! (reopen + monkey-patch) | (Solved by design — sometimes too directly) |
|
|
1264
|
+
|
|
1265
|
+
Ruby sits in the "open classes" row — a non-typed language where
|
|
1266
|
+
the expression problem is solved by `module Foo; def bar; …;
|
|
1267
|
+
end; end; class String; include Foo; end`. The language solution
|
|
1268
|
+
trades safety for flexibility.
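Both axes can be shown in a few lines of plain Ruby. The expression classes (`Num`, `Add`, `Neg`) and the `show` operation are hypothetical names for this sketch.

```ruby
# Existing code: two variants and one operation.
class Num
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def evaluate
    value
  end
end

class Add
  attr_reader :left, :right

  def initialize(left, right)
    @left = left
    @right = right
  end

  def evaluate
    left.evaluate + right.evaluate
  end
end

# Axis 1: a new variant, added without touching Num or Add.
class Neg
  attr_reader :inner

  def initialize(inner)
    @inner = inner
  end

  def evaluate
    -inner.evaluate
  end
end

# Axis 2: a new operation, added by reopening each class.
class Num
  def show
    value.to_s
  end
end

class Add
  def show
    "(#{left.show} + #{right.show})"
  end
end

class Neg
  def show
    "-#{inner.show}"
  end
end

expr = Add.new(Num.new(1), Neg.new(Num.new(2)))
```

Nothing checks that every variant received a `show` method; forgetting `Neg#show` would only surface as a `NoMethodError` at runtime, which is the safety the language-level solution gives up.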
|
|
1269
|
+
|
|
1270
|
+
### Rigor's plugin contract as the tool-side answer
|
|
1271
|
+
|
|
1272
|
+
Rigor's plugin substrate ([ADR-2](../adr/2-extension-api.md),
|
|
1273
|
+
[ADR-16](../adr/16-macro-expansion.md),
|
|
1274
|
+
[ADR-25](../adr/25-plugin-contributed-rbs.md),
|
|
1275
|
+
[ADR-28](../adr/28-path-scoped-protocol-contracts.md), …) solves
|
|
1276
|
+
the **tool-level** version of the same problem:
|
|
1277
|
+
|
|
1278
|
+
- **Adding new type knowledge** the engine can act on — new RBS
|
|
1279
|
+
bundles, new structural shapes via `signature_paths:`, new
|
|
1280
|
+
TypeNode resolvers ([ADR-13](../adr/13-typenode-resolver-plugin.md))
|
|
1281
|
+
— **without modifying the engine**.
|
|
1282
|
+
- **Adding new analyses / operations** over existing types — new
|
|
1283
|
+
diagnostic rules, new flow contributions, new protocol
|
|
1284
|
+
contracts ([ADR-28](../adr/28-path-scoped-protocol-contracts.md))
|
|
1285
|
+
— **without modifying the type language**.
|
|
1286
|
+
|
|
1287
|
+
The plugin contract is therefore the **expression problem solved
|
|
1288
|
+
at the analyser's extension boundary**, where the language-level
|
|
1289
|
+
solution (open classes) is too coarse for static analysis.
|
|
1290
|
+
|
|
1291
|
+
A worked example:
|
|
1292
|
+
|
|
1293
|
+
- `rigor-activesupport-core-ext` adds a *new fact* about existing
|
|
1294
|
+
classes (`Numeric#hours`, `String#blank?`, `Hash#stringify_keys`)
|
|
1295
|
+
— type-extension axis.
|
|
1296
|
+
- `rigor-web` adds a *new analysis* over existing classes
|
|
1297
|
+
(every class under `lib/controller/` must define
|
|
1298
|
+
`#get(Rack::Request) -> Rack::Response`) — operation-extension
|
|
1299
|
+
axis ([ADR-28](../adr/28-path-scoped-protocol-contracts.md)).
|
|
1300
|
+
|
|
1301
|
+
Neither plugin requires modifying the Rigor engine, and they
|
|
1302
|
+
*compose* — a single plugin can do both axes ([ADR-12 dry-rb
|
|
1303
|
+
packaging](../adr/12-dry-rb-packaging.md) discusses the
|
|
1304
|
+
production examples).
|
|
1305
|
+
|
|
1306
|
+
### Connection to earlier appendix sections
|
|
1307
|
+
|
|
1308
|
+
This framing also retroactively explains several design choices:
|
|
1309
|
+
|
|
1310
|
+
- **Nominal-first** (§ "Nominal vs structural typing"):
|
|
1311
|
+
nominal class names are the stable attachment point for
|
|
1312
|
+
plugin-contributed facts. Structural shapes are inferred
|
|
1313
|
+
per-call; a plugin would have no name to bind its knowledge
|
|
1314
|
+
to. The expression problem framing prefers explicit `class`
|
|
1315
|
+
declarations precisely because the name is the extension
|
|
1316
|
+
handle.
|
|
1317
|
+
- **The macro substrate** ([ADR-16](../adr/16-macro-expansion.md)):
|
|
1318
|
+
each tier (A: block-as-method, B: trait-inlining, C: heredoc
|
|
1319
|
+
template, D: external-file inclusion) is a different way to
|
|
1320
|
+
add knowledge about a class's behaviour without modifying the
|
|
1321
|
+
class — the type-extension axis made plural.
|
|
1322
|
+
- **Path-scoped protocol contracts**
|
|
1323
|
+
([ADR-28](../adr/28-path-scoped-protocol-contracts.md)): a
|
|
1324
|
+
plugin can declare a behavioural contract for an entire
|
|
1325
|
+
directory of user-authored classes without the classes
|
|
1326
|
+
opting in — the operation-extension axis made tool-side.
|
|
1327
|
+
|
|
1328
|
+
The plugin contract is therefore **the expression problem solved
|
|
1329
|
+
at the analyser layer rather than the language layer**, not an
|
|
1330
|
+
ad-hoc Rigor design choice. The same theoretical pressure
|
|
1331
|
+
that drove Haskell to type classes and Clojure to protocols
|
|
1332
|
+
drives Rigor to a structured plugin substrate.
|
|
1333
|
+
|
|
1334
|
+
## Smaller connections, in brief
|
|
1335
|
+
|
|
1336
|
+
A grab-bag of further type-theoretic / programming-languages
|
|
1337
|
+
connections. Each is summarised in a paragraph rather than a
|
|
1338
|
+
section because the topic either maps to mechanisms already
|
|
1339
|
+
covered or to a deliberate non-feature
|
|
1340
|
+
([§ What Rigor does NOT model](#what-rigor-does-not-model)), but
|
|
1341
|
+
a reader hunting for "does Rigor have a story for X?" should be
|
|
1342
|
+
able to find one here.
|
|
1343
|
+
|
|
1344
|
+
### Type erasure vs reification
|
|
1345
|
+
|
|
1346
|
+
A language **erases** types at runtime (Java generics, Haskell,
|
|
1347
|
+
OCaml) or **reifies** them (C# / .NET generics, Ruby's `.class`). Ruby is
|
|
1348
|
+
fully reified — `arr.class` returns `Array` at runtime, and
|
|
1349
|
+
`is_a?` queries are first-class. Rigor leans on this: occurrence
|
|
1350
|
+
typing's predicate set (`is_a?` / `kind_of?` / `instance_of?` /
|
|
1351
|
+
`respond_to?`) all use Ruby's reified class objects, and the
|
|
1352
|
+
narrowing rules are sound *because* the run-time check matches
|
|
1353
|
+
the type-theoretic class membership. The gradual-typing
|
|
1354
|
+
literature on "type-erased" vs "reified" gradual systems
|
|
1355
|
+
(Wrigstad et al. 2010, *Integrating Typed and Untyped Code in a
|
|
1356
|
+
Scripting Language*) classifies Rigor's setting as fully reified
|
|
1357
|
+
on the dynamic side — which is what makes `Dynamic[T]` narrow
|
|
1358
|
+
back to a concrete type safely whenever the runtime predicate
|
|
1359
|
+
fires.
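A small example of the runtime predicates the paragraph refers to. The method name `describe` is hypothetical; the comments state the narrowing a checker could justify in each branch, as an informal reading.

```ruby
def describe(value)
  if value.is_a?(Integer)
    value + 1            # value is known to be an Integer in this branch
  elsif value.respond_to?(:upcase)
    value.upcase         # value is known to respond to #upcase here
  else
    value.inspect        # no narrowing applies; #inspect exists on every Object
  end
end
```

Because `is_a?` consults the real class object, the static narrowing and the runtime branch agree by construction.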
|
|
1360
|
+
|
|
1361
|
+
### Algebraic effects vs monadic effects
|
|
1362
|
+
|
|
1363
|
+
The textbook alternative to monadic effects (Haskell `IO`, Scala
|
|
1364
|
+
`cats-effect`) is **algebraic effects with handlers** (Plotkin &
|
|
1365
|
+
Pretnar 2009; Koka / Eff / OCaml 5's effect handlers). Algebraic
|
|
1366
|
+
effects let an "effectful" computation be paused at the effect
|
|
1367
|
+
site and resumed by a handler — closer to delimited
|
|
1368
|
+
continuations than to monad bind. Rigor's effect model
|
|
1369
|
+
(§ "Effect systems") is neither monadic nor algebraic; it is
|
|
1370
|
+
*inferred* and *engine-internal*. Surfacing effects to the user
|
|
1371
|
+
(annotation grammar; pure-function marker; algebraic-effect
|
|
1372
|
+
signatures) is a future direction tracked in the spec corpus;
|
|
1373
|
+
the relevant prior art is the Koka community's surface design.
|
|
1374
|
+
|
|
1375
|
+
### Single vs multiple dispatch
|
|
1376
|
+
|
|
1377
|
+
Ruby is single-dispatch — method selection depends on the
|
|
1378
|
+
receiver's class only. Languages with **multiple dispatch**
|
|
1379
|
+
(CLOS, Julia, Dylan) select methods based on the runtime types
|
|
1380
|
+
of every argument. RBS overloads — `def m: (Integer) -> Integer
|
|
1381
|
+
| (String) -> String` — simulate a static-side analogue of
|
|
1382
|
+
multiple dispatch by picking an arm by argument type at the call
|
|
1383
|
+
site. Rigor honours the "most-specific arm wins" resolution that
|
|
1384
|
+
multiple-dispatch type systems require, but the runtime dispatch
|
|
1385
|
+
remains single-dispatch; the overload arm is selected at
|
|
1386
|
+
type-check time, not at call time.
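To make the split concrete: the overloaded signature from the text lives in RBS, while the Ruby method is a single body that branches on its argument. The body below is a hypothetical implementation of `m`, written only to match that signature.

```ruby
# RBS (from the text): def m: (Integer) -> Integer | (String) -> String
#
# At runtime there is exactly one method `m`; Ruby dispatches on the
# receiver only, and the argument's class is inspected by hand.
def m(arg)
  case arg
  when Integer then arg + 1
  when String then arg + "!"
  else raise ArgumentError, "unsupported argument: #{arg.inspect}"
  end
end
```

A checker that picks the `(Integer) -> Integer` arm for `m(1)` is choosing among signatures, not among method bodies.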
|
|
1387
|
+
|
|
1388
|
+
### Phantom types and brand types
|
|
1389
|
+
|
|
1390
|
+
A **phantom type** carries a type parameter that does not appear
|
|
1391
|
+
in any field — e.g., `class Length<U>` where `U` is `Metres` or
|
|
1392
|
+
`Feet`. The type carries an invariant the runtime does not
|
|
1393
|
+
enforce. **Brand types** wrap a base type in a unique nominal
|
|
1394
|
+
type (`class ValidatedEmail < String`) so that only
|
|
1395
|
+
verified-via-constructor values inhabit it. Rigor's refinement
|
|
1396
|
+
carriers (§ "Refinement types") cover the brand-type use case
|
|
1397
|
+
for the common refinements (`non-empty-string`, `positive-int`,
|
|
1398
|
+
…); user-extensible brand types via `class X < Y` are typed
|
|
1399
|
+
nominally by Rigor — `ValidatedEmail` is distinct from `String`
|
|
1400
|
+
at the type level. The phantom-type-via-unused-parameter pattern
|
|
1401
|
+
is also typeable but not widely used in Ruby; the equivalent
|
|
1402
|
+
expressiveness usually arrives via refinements or nominal
|
|
1403
|
+
wrapping.
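A minimal sketch of the nominal brand pattern in plain Ruby. The `parse` constructor and the deliberately simplistic email pattern are assumptions made for this example, not a recommended validator.

```ruby
class ValidatedEmail < String
  # The only public way to obtain an instance: validate, then wrap.
  def self.parse(raw)
    unless raw.match?(/\A[^@\s]+@[^@\s]+\z/)
      raise ArgumentError, "invalid email: #{raw.inspect}"
    end

    new(raw)
  end

  # Hide the raw constructor so unvalidated strings cannot be branded.
  private_class_method :new
end

email = ValidatedEmail.parse("ann@example.com")
```

The value still behaves as a `String`, but a plain `String` is not a `ValidatedEmail`, so a parameter typed with the subclass only accepts values that went through `parse`.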
|
|
1404
|
+
|
|
1405
|
+
### Open-world vs closed-world assumption
|
|
1406
|
+
|
|
1407
|
+
RBS treats unknown methods on a `Dynamic[T]` receiver under the
|
|
1408
|
+
**open-world** assumption — "we do not know the full method set;
|
|
1409
|
+
an unknown method might exist at runtime." Rigor inherits this
|
|
1410
|
+
on dynamic receivers, which is why a `respond_to?`-narrowed call
|
|
1411
|
+
on a `Dynamic[T]` value does not fire `call.undefined-method`.
|
|
1412
|
+
On a concretely-typed receiver (e.g. `String`), Rigor uses the
|
|
1413
|
+
**closed-world** assumption — the RBS signature is taken as the
|
|
1414
|
+
authoritative method set, and an unknown method fires
|
|
1415
|
+
`call.undefined-method` (subject to the
|
|
1416
|
+
[ADR-26 `open_receivers:`](../adr/26-activerecord-relation-typing.md)
|
|
1417
|
+
exemption for receivers whose method set is provably open at
|
|
1418
|
+
runtime, like `ActiveRecord::Relation`).
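The kind of receiver that motivates such an exemption can be imitated with `method_missing`. `OpenRelation` and its `by_` prefix convention are hypothetical; the sketch only shows why a declared method list cannot be authoritative for a class like this.

```ruby
class OpenRelation
  # Any method starting with "by_" is answered dynamically.
  def method_missing(name, *args)
    if name.to_s.start_with?("by_")
      "scope:#{name}"
    else
      super
    end
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("by_") || super
  end
end

relation = OpenRelation.new
scoped = relation.by_name
```

`by_name` never appears in the class's method table, so a closed-world check against declared methods would report a call that succeeds at runtime.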
|
|
1419
|
+
|
|
1420
|
+
### Equirecursive vs isorecursive types
|
|
1421
|
+
|
|
1422
|
+
Two formalisms for recursive types: **equirecursive** treats
|
|
1423
|
+
`T = μX. F(X)` as judgmentally equal to `F(μX. F(X))` (any
|
|
1424
|
+
recursive type can be silently unfolded); **isorecursive**
|
|
1425
|
+
requires explicit `fold` / `unfold` conversions at every use.
|
|
1426
|
+
RBS type aliases are nominally recursive — `type tree = [Symbol,
|
|
1427
|
+
tree?]` is the type *by name*, and Rigor compares them by name,
|
|
1428
|
+
not by structural unfolding. This is the conservative path that
|
|
1429
|
+
sidesteps the equirecursive-vs-isorecursive debate entirely:
|
|
1430
|
+
naming a recursive type is the only way to write it.
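For reference, values of the `tree` alias quoted above are nested two-element arrays. The `depth` helper below is a hypothetical function written for this example; it walks such a value the way a consumer of the named type would.

```ruby
# RBS (from the text): type tree = [Symbol, tree?]
def depth(tree)
  _label, child = tree
  child.nil? ? 1 : 1 + depth(child)
end

chain = [:a, [:b, [:c, nil]]]
levels = depth(chain)
```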
|
|
1431
|
+
|
|
1432
|
+
### Polarity and variance positions
|
|
1433
|
+
|
|
1434
|
+
Variance (§ "Variance") is derived from **positions** in a type
|
|
1435
|
+
expression: a type variable appearing in **positive** position
|
|
1436
|
+
(return type, output of a producer) is covariant; in **negative**
|
|
1437
|
+
position (parameter type, input to a consumer) is contravariant;
|
|
1438
|
+
in both (mutable storage) is invariant. The polarity reading is
|
|
1439
|
+
the standard textbook derivation (Pierce, *Types and Programming
|
|
1440
|
+
Languages*, ch. 15). RBS's `out T` / `in T` annotations express
|
|
1441
|
+
the result directly without forcing the user to read variance
|
|
1442
|
+
off a type expression's polarity structure; Rigor honours the
|
|
1443
|
+
declared variance without re-deriving it.
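The "both positions means invariant" rule can be motivated with a plain Ruby array. The type names in the comments are informal annotations for the sake of the argument, not checker output.

```ruby
ints = [1, 2, 3]     # think of this as Array[Integer]
nums = ints          # if Array were covariant, this alias could be Array[Numeric]

nums << 1.5          # a legal write through the Array[Numeric] view

# The Integer-only view of the same object no longer holds.
still_all_integers = ints.all?(Integer)
```

Reading elements out is safe under the wider view; writing is what breaks it, which is why mutable storage uses its type parameter in both polarities.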
|
|
1444
|
+
|
|
1445
|
+
### Subtype-with-bounded-existentials
|
|
1446
|
+
|
|
1447
|
+
The standard encoding of "a value with a hidden type that
|
|
1448
|
+
satisfies an interface" is **bounded existential types** — `∃X
|
|
1449
|
+
<: I. T(X)`. Most industrial languages encode this via
|
|
1450
|
+
*structural interfaces* instead: a parameter typed `I` accepts
|
|
1451
|
+
any value with `I`'s method set, hiding the receiver's concrete
|
|
1452
|
+
type. Rigor follows the industrial convention — `interface
|
|
1453
|
+
_Comparable` (§ "Nominal vs structural typing") is the
|
|
1454
|
+
existential-bound mechanism users actually reach for. The pack /
|
|
1455
|
+
unpack syntax of ML-style existential types is not exposed.
|
|
1456
|
+
|
|
1457
|
+
### Refinement-as-types vs refinement-as-predicates
|
|
1458
|
+
|
|
1459
|
+
The § "Refinement types" section covers Rigor's curated
|
|
1460
|
+
refinement catalogue (`non-empty-string`, `positive-int`, …),
|
|
1461
|
+
which sits in the **refinements-as-types** tradition. The
|
|
1462
|
+
alternative — **refinements-as-predicates** (SMT-backed Liquid
|
|
1463
|
+
Types; F\*'s subset types) — keeps the predicate as a first-class
|
|
1464
|
+
formula attached to the base type, and discharges it via SMT at
|
|
1465
|
+
each constraint site. Rigor uses a much weaker, decidable
|
|
1466
|
+
fragment: the predicate is a named refinement carrier (not an
|
|
1467
|
+
arbitrary formula), and the "narrow into the refinement" rule is
|
|
1468
|
+
a deterministic engine step (not an SMT call). The Rondon-
|
|
1469
|
+
Kawaguchi-Jhala Liquid Types reference in the reading list is
|
|
1470
|
+
the technical seed Rigor's catalogue draws from at a distance.
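The difference from an SMT-backed system can be pictured as a fixed lookup table of named predicates. This is a runtime analogue under stated assumptions: the `REFINEMENTS` table and `inhabits?` helper are hypothetical and do not reflect how Rigor's engine stores or applies its catalogue.

```ruby
# Hypothetical catalogue: refinement name => [base class, predicate].
REFINEMENTS = {
  "non-empty-string" => [String, ->(s) { !s.empty? }],
  "positive-int" => [Integer, ->(n) { n.positive? }]
}.freeze

# Deterministic membership test: a table lookup plus one predicate call,
# with no formula solving involved.
def inhabits?(value, refinement)
  base, predicate = REFINEMENTS.fetch(refinement)
  value.is_a?(base) && predicate.call(value)
end
```

Because the set of names is closed, every question of the form "is this value in refinement R?" terminates after one lookup, which is the decidable fragment the paragraph describes.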
|
|
1471
|
+
|
|
1472
|
+
## What Rigor does NOT model
|
|
1473
|
+
|
|
1474
|
+
For completeness, a short list of type-theoretic features Rigor
|
|
1475
|
+
*does not* currently expose at the user surface — naming them
|
|
1476
|
+
here so you can stop looking:
|
|
1477
|
+
|
|
1478
|
+
- **Higher-kinded types (HKT).** `Functor[F[_]]` style
|
|
1479
|
+
abstraction. Tracked as a "future direction" but not in any
|
|
1480
|
+
shipped slice. (General HKT inference is undecidable; ADR-20
|
|
1481
|
+
sketches a defunctionalised, annotation-driven approach that
|
|
1482
|
+
sidesteps this.)
|
|
1483
|
+
- **Higher-rank polymorphism (System F⊤).** All RBS generics
|
|
1484
|
+
are predicative; type variables cannot quantify over
|
|
1485
|
+
polymorphic types. (Rank-3 inference is undecidable per Wells,
|
|
1486
|
+
1999; the predicative restriction keeps Rigor's surface
|
|
1487
|
+
inferable without per-call annotations.)
|
|
1488
|
+
- **Polymorphic recursion.** A generic method body re-applied
|
|
1489
|
+
inside itself at a *different* instantiation. Inference is
|
|
1490
|
+
undecidable (Henglein, 1993); RBS does not offer the syntax
|
|
1491
|
+
and Rigor does not synthesise it.
|
|
1492
|
+
- **Full dependent types.** No `Vec[n, T]` with `n : Integer`.
|
|
1493
|
+
Type-checking is decidable but inference is not; integer-range
|
|
1494
|
+
refinements (`int<min, max>`) cover the most common practical
|
|
1495
|
+
need without crossing the line.
|
|
1496
|
+
- **Row polymorphism as a user-quantifiable axis.** `HashShape`
|
|
1497
|
+
carries open-vs-closed semantics internally but does not
|
|
1498
|
+
expose row variables. See § "Object shapes — row polymorphism,
|
|
1499
|
+
Hack, and HashShape's lineage" for the design rationale.
|
|
1500
|
+
- **Existential types.** No `pack` / `unpack`. Closest analogue
|
|
1501
|
+
is structural `interface`.
|
|
1502
|
+
- **GADTs.** No type-refinement-by-constructor; pattern
|
|
1503
|
+
matching narrows via the standard occurrence-typing path, not
|
|
1504
|
+
via type-index propagation.
|
|
1505
|
+
- **Linear / affine types.** No move-checking or use-once
|
|
1506
|
+
enforcement.
|
|
1507
|
+
- **Session types, capabilities-as-types.** Out of scope.
|
|
1508
|
+
- **Mechanised soundness proof.** Deliberately deferred; see the
|
|
1509
|
+
[Matsumoto & Minamide 2010 review](../notes/20260518-matsumoto-2010-cfa-rigor-review.md)
|
|
1510
|
+
for the upstream "prove soundness on a tiny core" approach
|
|
1511
|
+
Rigor has not yet adopted.
|
|
1512
|
+
|
|
1513
|
+
If a topic on this list later becomes important to the user
|
|
1514
|
+
base, it will be discussed in an ADR before any implementation
|
|
1515
|
+
slice. Until then, the absence is a feature.
|
|
1516
|
+
|
|
1517
|
+
## A short reading list
|
|
1518
|
+
|
|
1519
|
+
Papers and books behind the choices above, in roughly the order
|
|
1520
|
+
they map to the sections of this appendix:
|
|
1521
|
+
|
|
1522
|
+
- B.C. Pierce. *Types and Programming Languages.* MIT Press,
|
|
1523
|
+
2002. Standard reference for everything in the first half of
|
|
1524
|
+
this appendix.
|
|
1525
|
+
- Cardelli & Wegner. "On Understanding Types, Data
|
|
1526
|
+
Abstraction, and Polymorphism." *ACM Computing Surveys*,
|
|
1527
|
+
1985. Origin of the polymorphism taxonomy.
|
|
1528
|
+
- Canning, Cook, Hill, Olthoff & Mitchell. "F-Bounded
|
|
1529
|
+
Polymorphism for Object-Oriented Programming." *FPCA 1989.*
|
|
1530
|
+
The classical reference for F-bounded polymorphism — the
|
|
1531
|
+
type-theoretic root of RBS's `self` keyword and Sorbet's
|
|
1532
|
+
`T.attached_class`.
|
|
1533
|
+
- Wadler, P. "The Expression Problem." Informal note posted to
|
|
1534
|
+
the `java-genericity` mailing list, 1998. The challenge that
|
|
1535
|
+
motivates Rigor's plugin substrate as the analyser-layer
|
|
1536
|
+
answer.
|
|
1537
|
+
- Wand, M. "Complete Type Inference for Simple Objects."
|
|
1538
|
+
*LICS*, 1987. The seed of row polymorphism — first
|
|
1539
|
+
formulation of "infer object types with extra fields."
|
|
1540
|
+
- Rémy, D. "Type Checking Records and Variants in a Natural
|
|
1541
|
+
Extension of ML." *POPL 1989.* The row-variable mechanism in
|
|
1542
|
+
the form in which it is most commonly cited.
|
|
1543
|
+
- Cardelli, L. & Mitchell, J.C. "Operations on Records."
|
|
1544
|
+
*Mathematical Structures in Computer Science*, 1991. The
|
|
1545
|
+
algebraic treatment of record operations under row
|
|
1546
|
+
polymorphism — the substrate Garrigue and then Matsumoto &
|
|
1547
|
+
Minamide build on.
|
|
1548
|
+
- Siek & Taha. "Gradual Typing for Functional Languages."
|
|
1549
|
+
*Scheme Workshop*, 2006. The original gradual-typing paper.
|
|
1550
|
+
- Garcia, Clark & Tanter. "Abstracting Gradual Typing."
|
|
1551
|
+
*POPL 2016.* The modern reformulation of gradual typing in
|
|
1552
|
+
terms of abstract interpretation.
|
|
1553
|
+
- Findler & Felleisen. "Contracts for Higher-Order Functions."
|
|
1554
|
+
*ICFP 2002.* The origin of blame as a formal principle —
|
|
1555
|
+
background for § "Blame, the gradual guarantee, and trust
|
|
1556
|
+
boundaries."
|
|
1557
|
+
- Wadler & Findler. "Well-Typed Programs Can't Be Blamed."
|
|
1558
|
+
*ESOP 2009.* The headline result on the asymmetry between
|
|
1559
|
+
typed and untyped code at the boundary.
|
|
1560
|
+
- Siek, Vitousek, Cimini & Boyland. "Refined Criteria for
|
|
1561
|
+
Gradual Typing." *SNAPL 2015.* The original statement of the
|
|
1562
|
+
gradual guarantee.
|
|
1563
|
+
- Frisch, Castagna & Benzaken. "Semantic Subtyping: Dealing
|
|
1564
|
+
Set-Theoretically with Function, Union, Intersection, and
|
|
1565
|
+
Negation Types." *Journal of the ACM*, 2008. The foundational
|
|
1566
|
+
treatment of set-theoretic types; background for
|
|
1567
|
+
§ "Set-theoretic foundations of the lattice."
|
|
1568
|
+
- Castagna, G. *Programming with Union, Intersection, and
|
|
1569
|
+
Negation Types.* 2024. The current canonical reference for
|
|
1570
|
+
semantic-subtyping-based type systems; the framework behind
|
|
1571
|
+
Elixir's set-theoretic types.
|
|
1572
|
+
- Dunfield & Krishnaswami. "Complete and Easy Bidirectional
|
|
1573
|
+
Typechecking for Higher-Rank Polymorphism." *ICFP 2013.* The
|
|
1574
|
+
modern canonical reference for bidirectional type checking —
|
|
1575
|
+
background for § "Bidirectional type checking."
|
|
1576
|
+
- Tobin-Hochstadt & Felleisen. "The Design and Implementation
|
|
1577
|
+
of Typed Scheme." *POPL 2008.* Origin of occurrence typing.
|
|
1578
|
+
- Maranget, L. "Warnings for Pattern Matching." *Journal of
|
|
1579
|
+
Functional Programming*, 2007. The algorithm OCaml uses for
|
|
1580
|
+
pattern-match exhaustiveness — background for § "Pattern
|
|
1581
|
+
matching and exhaustiveness."
|
|
1582
|
+
- Rondon, Kawaguchi & Jhala. "Liquid Types." *PLDI 2008.* The
|
|
1583
|
+
refinement-types-with-SMT framework that informs the
|
|
1584
|
+
`int<min, max>` carrier (Rigor uses a much weaker, decidable
|
|
1585
|
+
fragment).
|
|
1586
|
+
- Lucassen & Gifford. "Polymorphic Effect Systems."
|
|
1587
|
+
*POPL 1988.* Origin of effect systems.
|
|
1588
|
+
- Plotkin & Pretnar. "Handlers of Algebraic Effects."
|
|
1589
|
+
*ESOP 2009.* The algebraic-effect-handler design that Koka,
|
|
1590
|
+
Eff, and OCaml 5's effect system descend from — background
|
|
1591
|
+
for the § "Smaller connections" note on algebraic vs monadic
|
|
1592
|
+
effects.
|
|
1593
|
+
- Wrigstad, Nardelli, Lebresne, Östlund & Vitek. "Integrating
|
|
1594
|
+
Typed and Untyped Code in a Scripting Language."
|
|
1595
|
+
*POPL 2010.* The taxonomy of type-erased vs reified gradual
|
|
1596
|
+
systems — background for § "Smaller connections" on Ruby's
|
|
1597
|
+
fully reified dynamic side.
|
|
1598
|
+
- Milner, R. "A Theory of Type Polymorphism in Programming."
|
|
1599
|
+
*JCSS*, 1978. The Hindley–Milner system in its original form
|
|
1600
|
+
— the canonical type system that achieves soundness,
|
|
1601
|
+
decidability, and the principal type property simultaneously
|
|
1602
|
+
by restricting the language.
|
|
1603
|
+
- Damas, L. & Milner, R. "Principal Type-Schemes for Functional
|
|
1604
|
+
Programs." *POPL 1982.* The principal-type theorem and
|
|
1605
|
+
Algorithm W. The reference for what Rigor consciously does
|
|
1606
|
+
*not* attempt.
|
|
1607
|
+
- Pierce, B.C. & Turner, D.N. "Local Type Inference." *ACM
|
|
1608
|
+
TOPLAS*, 2000. The bidirectional / local-inference design that
|
|
1609
|
+
the Ruby static-typing landscape (Steep especially) draws from
|
|
1610
|
+
once subtyping is in the picture — the practical successor to
|
|
1611
|
+
HM under those conditions, and the closest textbook analogue
|
|
1612
|
+
to Rigor's walker.
|
|
1613
|
+
- Wells, J.B. "Typability and Type Checking in System F are
|
|
1614
|
+
Equivalent and Undecidable." *Annals of Pure and Applied
|
|
1615
|
+
Logic*, 1999. The proof that full System F type
|
|
1616
|
+
inference is undecidable — the reason RBS generics stay
|
|
1617
|
+
predicative.
|
|
1618
|
+
- Henglein, F. "Type Inference with Polymorphic Recursion."
|
|
1619
|
+
*ACM TOPLAS*, 1993. Establishes that inferring polymorphic
|
|
1620
|
+
recursion is undecidable.
|
|
1621
|
+
- 水野雅之.「計算機に推論できる型、できない型」.
|
|
1622
|
+
*Wantedly Advent Calendar*, 2021.
|
|
1623
|
+
<https://www.wantedly.com/companies/wantedly/post_articles/349494>.
|
|
1624
|
+
A friendly Japanese-language tour of the decidability boundary
|
|
1625
|
+
(let-polymorphism, Rank-N, polymorphic recursion, recursive types, subtyping + intersection types)
|
|
1626
|
+
and the most accessible companion to the
|
|
1627
|
+
"Decidability of inference" section above.
|
|
1628
|
+
- Matsumoto & Minamide. "Rubyプログラムの制御フロー解析と
|
|
1629
|
+
その健全性の証明." *IPSJ TPRO Vol.3 No.2*, 2010. The
|
|
1630
|
+
upstream Ruby-CFA soundness proof; Rigor-perspective review
|
|
1631
|
+
at
|
|
1632
|
+
[`docs/notes/20260518-matsumoto-2010-cfa-rigor-review.md`](../notes/20260518-matsumoto-2010-cfa-rigor-review.md).
|
|
1633
|
+
- Matsumoto & Minamide. "多相レコード型に基づくRubyプログラム
|
|
1634
|
+
の型推論." *IPSJ TPRO Vol.49 No.SIG 3*, 2008. The
|
|
1635
|
+
Garrigue-kinded polymorphic-record experiment that
|
|
1636
|
+
retroactively justifies Rigor's nominal-first carrier choice;
|
|
1637
|
+
Rigor-perspective review at
|
|
1638
|
+
[`docs/notes/20260518-matsumoto-2008-poly-records-rigor-review.md`](../notes/20260518-matsumoto-2008-poly-records-rigor-review.md).
|
|
1639
|
+
|
|
1640
|
+
## What's next
|
|
1641
|
+
|
|
1642
|
+
If you came in from a "show me where Rigor stands in the
|
|
1643
|
+
type-theory landscape" question, the rest of the handbook is the
|
|
1644
|
+
practical companion:
|
|
1645
|
+
|
|
1646
|
+
- [Chapter 2 — Everyday types](02-everyday-types.md) for the
|
|
1647
|
+
carrier zoo at the surface level.
|
|
1648
|
+
- [Chapter 3 — Narrowing](03-narrowing.md) for occurrence typing
|
|
1649
|
+
in practice.
|
|
1650
|
+
- [Chapter 7 — RBS and `RBS::Extended`](07-rbs-and-extended.md)
|
|
1651
|
+
for the directive grammar that lets you teach Rigor about a
|
|
1652
|
+
custom predicate.
|
|
1653
|
+
- [Chapter 8 — Understanding errors](08-understanding-errors.md)
|
|
1654
|
+
for the rule catalogue (the user-visible end of the trinary
|
|
1655
|
+
certainty).
|
|
1656
|
+
|
|
1657
|
+
If you want to compare against another *tool* rather than the
|
|
1658
|
+
*theory*, the sibling appendices cover
|
|
1659
|
+
[TypeScript](appendix-typescript.md),
|
|
1660
|
+
[PHPStan](appendix-phpstan.md),
|
|
1661
|
+
[mypy / Pyright](appendix-mypy.md),
|
|
1662
|
+
and [Steep](appendix-steep.md).
|