typisch 0.1.5

data/README.md ADDED
@@ -0,0 +1,55 @@
+ # Typisch
+
+ A schema language / type system / validation framework, for semi-structured data and for data in dynamic languages.
+
+ The initial implementation is in Ruby, but the aim is that it be capable of describing data equally well in a range of dynamic languages
+ and semi-structured data serialization languages -- Ruby, Python, JavaScript, YAML, JSON, ...
+
+ *N.B.* still a work in progress at this stage, but enough works that it may be interesting to poke around!
+
+ ## What distinguishes it?
+
+ It aims to be more than just an ad-hoc data validation library; rather, a proper type system with formal foundations.
+
+ This means that it's able to do more than just validate data; it can settle questions phrased in terms of the
+ schemas themselves -- like "is this schema a subtype of this other schema?", or "compute the intersection of
+ these two schemas" -- in a well-defined and sound way.
+
+ This is nice, since if you're going to bother specifying a schema for your data at all (as opposed to just
+ writing some validation code), you're probably doing so because you want to be able to reason statically
+ about the structure of the data. Having solid foundations makes that easier and more pleasant.
+
+ ## As a type system, it features
+
+ - Record types with structural subtyping
+ - Nominal subtyping based on a hierarchy of type tags (which can be based on the subclassing graph of the host language)
+ - Tagged union types (arbitrary untagged unions may be computed too, but some type information may be lost where the tags overlap)
+ - Equi-recursive types, e.g. "Person {parent: Person}", which can be used to type-check cyclic object graphs if required
+ - Parameterised polymorphic types for Sequences, Tuples and other collection types
+ - A numeric tower with subtyping for the primitive numeric types
+ - Refinement types like "Integer greater than 0", "String of at most 10 characters", "Float from the following set of allowed values" etc.
+ - Decidable subtyping for all the above
+ - Ability to compute unions and intersections of types
+
+ If that sounds surprisingly powerful, bear in mind there's one very common type system feature which it *lacks*: function types, or typing
+ of code. Typisch only cares about typing data, which makes its life significantly easier.
+
+ Usually type systems for data are called 'schema languages', their types 'schemas', and type-checking 'validation'.
+ Forgive me if I use these terms somewhat interchangeably here.
+
+ ## Semi-structured data and subtyping
+
+ One way to characterise semi-structured data would be: data whose datatype admits structural subtyping.
+
+ Structural subtyping allows extra fields, beyond those specifically required, to be present on an object without cause for concern
+ -- as may frequently be the case with "duck-typed" data in dynamic languages, and data serialised in extensible schemas in formats
+ like JSON.
+
+ Sometimes you only care to validate, to serialize or to process a *subset* of a large structured object graph.
+ Structural typing provides a rather nice way to describe these subsets, as *supertypes* of the more complete datatype.
+
+ So, a good notion of subtyping seems useful for a type system designed to cope well with semi-structured data.
+
+ ## Why Typisch?
+
+ Well, it combines Type and Schema. It's also German for "typical", as in "typical, another bloody schema language".
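To make the README's point about structural subtyping concrete, here is a minimal plain-Ruby sketch. It does not use Typisch's API: the `record_subtype?` helper, and the convention of modelling a record type as a Hash of field name to Ruby class, are assumptions made purely for illustration.

```ruby
# Hypothetical illustration only; this is not how Typisch represents types.
# A record type is modelled here as a Hash of field name => Ruby class.

# `sub` is a structural subtype of `sup` when it has every field that `sup`
# requires, each with a class that is the same as or a descendant of the
# required one. Extra fields on `sub` are simply ignored.
def record_subtype?(sub, sup)
  sup.all? do |field, required_class|
    sub.key?(field) && sub[field] <= required_class
  end
end

FULL_PERSON = { name: String, age: Integer, email: String }
NAME_ONLY   = { name: String }

puts record_subtype?(FULL_PERSON, NAME_ONLY) # extra fields are fine
puts record_subtype?(NAME_ONLY, FULL_PERSON) # required fields are missing
```

In this sketch `NAME_ONLY` plays the role of the "subset" view the README mentions: it is a supertype of the more complete `FULL_PERSON` record.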
@@ -0,0 +1,13 @@
+ module Typisch
+   class Type::Boolean < Type::Constructor::Singleton
+     def self.tag
+       "Boolean"
+     end
+
+     def shallow_check_type(instance)
+       instance == true || instance == false
+     end
+
+     Registry.register_global_type(:boolean, top_type)
+   end
+ end
@@ -0,0 +1,112 @@
+ module Typisch
+   # All types except Top, Bottom and Unions (which may necessarily involve more than one constructor type) have a
+   # tag associated with them which is used at runtime to distinguish their instances somewhat from instances
+   # of other types.
+   #
+   # This is an abstract superclass; each subclass of Type::Constructor is assumed to implement its own, distinct
+   # lattice of types. For simple atomic types like Boolean, there will only be one tag, "Boolean", in that lattice.
+   #
+   # While the type lattices of different subclasses of Type::Constructor are non-overlapping, within a subclass
+   # (such as Type::Object or Type::Numeric) there may be a non-trivial type lattice, e.g. for Numeric,
+   # Int < Float, and for Object, the type lattice is based on a nominal tag inheritance hierarchy in
+   # the host language together with structural subtyping rules for object properties.
+   #
+   # A list of 'reserved tags' is maintained globally, and any Type::Constructor subtype which allows custom
+   # user-specified tags to be used should ensure that they don't match any reserved tags.
+   class Type::Constructor < Type
+     CONSTRUCTOR_TYPE_SUBCLASSES = []
+
+     class << self
+       def inherited(subclass)
+         # we add any non-abstract subclasses to this list, which is used to
+         # construct the Any type. For now Constructor::Singleton is the only
+         # other abstract Type::Constructor subclass:
+         unless subclass == Type::Constructor::Singleton
+           CONSTRUCTOR_TYPE_SUBCLASSES << subclass
+         end
+         super
+       end
+
+       # This should be the top in the type lattice for this class of tagged types.
+       # Its tag should be the top_tag above.
+       # You are passed the overall_top, i.e. the top type of the overall type lattice,
+       # to use; this is needed by parameterised types which want to parameterise their
+       # top type by the overall top, e.g. Top = Foo | Bar | Sequence[Top] | ...
+       def top_type(overall_top)
+         raise NotImplementedError
+       end
+
+       # This gets called by the subtyper on a Type::Constructor subclass, with two instances of
+       # that subclass.
+       # It should return true or false; if it needs to check some subgoals,
+       # say on child types of the ones passed in, it should use the supplied
+       # 'recursively_check_subtype' block rather than calling itself recursively
+       # directly. This hides away the details of the corecursive subtyping algorithm
+       # for you.
+       def check_subtype(x, y, &recursively_check_subtype)
+         raise NotImplementedError
+       end
+     end
+
+     # the distinct type lattice within which this type lives.
+     # the type system as a whole can be seen as a set of non-overlapping type lattices, together
+     # with tagged unions drawn from them.
+     #
+     # the interface for a type lattice is just that it responds to 'check_subtype'; by default
+     # the class of a type implements this interface
+     def type_lattice
+       self.class
+     end
+
+     def check_type(instance)
+       shallow_check_type(instance)
+     end
+
+     # the tag of this particular type
+     def tag
+       raise NotImplementedError
+     end
+
+     # these are here so as to implement a common interface with Type::Union
+     def alternative_types
+       [self]
+     end
+
+     # A class of constructor type of which there is only one type, and
+     # hence only one tag.
+     #
+     # Will have no supertype besides Any, and no subtype besides
+     # Nothing.
+     #
+     # (abstract superclass; see Boolean or Null for example subclasses).
+     class Singleton < Type::Constructor
+       class << self
+         private :new
+
+         def tag
+           raise NotImplementedError
+         end
+
+         def top_type(*)
+           @top_type ||= new
+         end
+
+         def check_subtype(x, y)
+           true
+         end
+       end
+
+       def subexpression_types
+         []
+       end
+
+       def tag
+         self.class.tag
+       end
+
+       def to_s(*)
+         @name.inspect
+       end
+     end
+   end
+ end
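The `inherited` hook above is what keeps the list of concrete constructor classes up to date. The following standalone sketch shows the same pattern in isolation; `Shape`, `Shape::Abstract`, `Circle` and `Square` are hypothetical names chosen for illustration, and only the `inherited` callback itself is standard Ruby.

```ruby
# Hypothetical classes, for illustration; they are not part of Typisch.
class Shape
  CONCRETE_SUBCLASSES = []

  class << self
    # Ruby calls this hook whenever a class is declared with Shape (or one of
    # its descendants) as its superclass. With the `class Name < Super`
    # syntax the new constant is already assigned when the hook runs, which
    # is what allows the comparison against Shape::Abstract below.
    def inherited(subclass)
      CONCRETE_SUBCLASSES << subclass unless subclass == Shape::Abstract
      super
    end
  end

  # An abstract intermediate class that should not be registered itself.
  class Abstract < Shape; end
end

class Circle < Shape; end
class Square < Shape::Abstract; end

puts Shape::CONCRETE_SUBCLASSES.inspect
```

Because singleton methods are inherited along with the class, `Square < Shape::Abstract` still triggers the hook defined on `Shape`, so subclasses of the abstract class are registered while the abstract class is skipped.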
@@ -0,0 +1,41 @@
+ require 'date'
+ require 'time'
+
+ module Typisch
+   class Type::Date < Type::Constructor::Singleton
+     def self.tag; "Date"; end
+     Registry.register_global_type(:date, top_type)
+
+     # You could add your own to the list, if you have some alternative ruby implementation
+     # with a Date-like interface which you want to typecheck.
+     VALID_IMPLEMENTATION_CLASSES = [::Date]
+
+     def shallow_check_type(instance)
+       case instance when *VALID_IMPLEMENTATION_CLASSES then true else false end
+     end
+   end
+
+   class Type::Time < Type::Constructor::Singleton
+     def self.tag; "Time"; end
+     Registry.register_global_type(:time, top_type)
+     Registry.register_global_type(:datetime, top_type)
+
+     # You could add your own to the list, if you have some alternative ruby implementation
+     # with a Time-like interface which you want to typecheck here.
+     #
+     # Maybe allow DateTime too under ruby? its interface is slightly different though.
+     # Typisch types are aiming to not be overly coupled to the structure of implementation
+     # classes in Ruby though - eg when serializing to JSON we don't really care about
+     # ruby's DateTime vs Time quirks. So, not sure whether to add DateTime here and pretend
+     # its interface is similar enough to that of Time, or just pretend it doesn't exist.
+     #
+     # Or maybe we could just define this as 'anything which respond_to?(:to_time)' or similar.
+     # Although annoyingly DateTime and Time don't even have to_time / to_datetime methods to
+     # convert between them. Poor stdlib design :(
+     VALID_IMPLEMENTATION_CLASSES = [::Time]
+
+     def shallow_check_type(instance)
+       case instance when *VALID_IMPLEMENTATION_CLASSES then true else false end
+     end
+   end
+ end
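Both `shallow_check_type` methods above rely on splatting an array of classes into a `when` clause. A small standalone sketch of that idiom follows; the `DATE_LIKE_CLASSES` constant and `date_like?` helper are hypothetical names used only for this example.

```ruby
require 'date'

# Hypothetical list of accepted classes, mirroring the idea of a
# VALID_IMPLEMENTATION_CLASSES constant.
DATE_LIKE_CLASSES = [::Date]

# `when *array` expands the array, so the instance is tested against each
# class in turn with `===`, i.e. an is-a check that also accepts subclasses.
def date_like?(instance)
  case instance
  when *DATE_LIKE_CLASSES then true
  else false
  end
end

puts date_like?(Date.new(2020, 1, 1))
puts date_like?(DateTime.new(2020, 1, 1)) # DateTime is a subclass of Date
puts date_like?(Time.at(0))
puts date_like?("2020-01-01")
```

One consequence worth noting: because the check is an is-a test, a list containing only `::Date` also accepts `DateTime` instances, whereas a list containing only `::Time` does not.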
@@ -0,0 +1,242 @@
+ module Typisch
+
+   # Apologies this is a bit messy. Should probably do a bit of a tidy-up
+   # once the dust has settled around the DSL syntax.
+   #
+   # It's a layer on top of the core type model though - could be worse, it
+   # could be horribly intertwined with the model itself :)
+
+   module DSL
+     def registry
+       raise NotImplementedError
+     end
+
+     def register(name, *type_args, &type_block_arg)
+       result = registry[name] = type(*type_args, &type_block_arg)
+       if @pending_annotations
+         result.annotations.merge!(@pending_annotations)
+         @pending_annotations = nil
+       end
+       result
+     end
+
+     def register_type_for_class(klass, *object_type_args, &object_type_block_arg)
+       name = :"#{klass}"
+       type = register(name, :object, klass, *object_type_args, &object_type_block_arg)
+       registry.register_type_for_class(klass, type)
+       type
+     end
+
+     def register_version_type_for_class(klass, version, *object_type_args, &object_type_block_arg)
+       name = :"#{klass}__#{version}"
+       type = register(name, :object, klass, *object_type_args, &object_type_block_arg)
+       registry.register_version_type_for_class(klass, version, type)
+       type
+     end
+
+     # annotations apply to the next register'd type.
+     #
+     # annotate "Some description", :some_other => 'annotations'
+     # annotate :description => "Some description", :some_other => 'annotations'
+     def annotate(description_or_options, options=nil)
+       if description_or_options.is_a?(::String)
+         options ||= {}; options[:description] = description_or_options
+       else
+         options = description_or_options
+       end
+       @pending_annotations ||= {}
+       @pending_annotations.merge!(options)
+     end
+
+     def type(arg, *more_args, &block_arg)
+       case arg
+       when Type
+         arg
+       when ::Symbol
+         if more_args.empty? && !block_arg
+           registry[arg]
+         elsif more_args.last.is_a?(::Hash) && !block_arg && (version = more_args.last[:version])
+           registry[:"#{arg}__#{version}"]
+         else
+           send(arg, *more_args, &block_arg)
+         end
+       when ::Module
+         type(:"#{arg}", *more_args, &block_arg)
+       else
+         raise ArgumentError, "expected Type or type name or class, but was given #{arg.class}"
+       end
+     end
+
+     def string(refinements=nil)
+       refinements ? Type::String.new(refinements) : registry[:string]
+     end
+
+     def sequence(*args)
+       seq_options = {}
+       if (opts = args.last and opts.is_a?(::Hash))
+         seq_options[:slice] = opts.delete(:slice) if opts.has_key?(:slice)
+         seq_options[:total_length] = opts.delete(:total_length) if opts.has_key?(:total_length)
+         args.pop if opts.empty?
+       end
+       Type::Sequence.new(type(*args), seq_options)
+     end
+
+     def tuple(*types)
+       Type::Tuple.new(*types.map {|t| type(t)})
+     end
+
+     def object(*args, &block)
+       klass, properties = _normalize_object_args(::Object, *args)
+       _object(klass, properties, &block)
+     end
+
+     def _normalize_object_args(default_class, klass_or_properties=nil, properties=nil)
+       case klass_or_properties
+       when ::Hash then [default_class, klass_or_properties]
+       when ::NilClass then [default_class, {}]
+       when ::Module then [klass_or_properties, properties || {}]
+       end
+     end
+
+     # back-end for object, which takes args in a normalized format
+     def _object(klass, properties, derive_from=nil, &block)
+       if block
+         object_context = if derive_from
+           DerivedObjectContext.new(self, derive_from)
+         else
+           ObjectContext.new(self)
+         end
+         object_context.instance_eval(&block)
+         properties.merge!(object_context.properties)
+       end
+       properties.keys.each do |k|
+         type_args, type_block_arg = properties[k]
+         properties[k] = type(*type_args, &type_block_arg)
+       end
+       type = Type::Object.new(klass.to_s, properties)
+       if block && (prop_annot = object_context.property_annotations)
+         type.annotations[:properties] = prop_annot
+       end
+       type
+     end
+
+     def union(*types)
+       Type::Union.new(*types.map {|t| type(t)})
+     end
+
+     def nullable(t)
+       union(type(t), :null)
+     end
+
+     def derived_from(original_type, *args, &block_arg)
+       if args.empty? && !block_arg
+         original_type
+       elsif args.last.is_a?(::Hash) && (version = args.last[:version]) && original_type.name
+         # slightly messy; we rely on the convention that 'versions' of named types are registered
+         # as :"#{name}__#{version}", this allows you to specify or override the version when
+         # deriving from a type, even if it is still just a named placeholder, by manipulating the
+         # name.
+         original_name = original_type.name.to_s.sub(/__.*$/, '')
+         :"#{original_name}__#{version}"
+       else
+         original_type = type(original_type).target
+         case original_type
+         when Type::Object
+           klass, properties = _normalize_object_args(original_type.class_or_module, *args)
+           _object(klass, properties, original_type, &block_arg)
+         when Type::Sequence
+           slice_overrides = (args.last.is_a?(::Hash) && args.last[:slice]) ? args.pop : {}
+           Type::Sequence.new(derived_from(original_type.type, *args, &block_arg), slice_overrides)
+         when Type::Union
+           non_null = original_type.excluding_null
+           raise "DSL doesn't support deriving from union types (except simple unions with null)" if Type::Union === non_null
+           nullable(derived_from(non_null, *args, &block_arg))
+         else
+           raise "DSL doesn't support deriving from #{original_type.class} types" if args.length > 0 || block_arg
+           original_type
+         end
+       end
+     end
+
+     class ObjectContext
+       attr_reader :properties, :property_annotations
+
+       def initialize(parent_context)
+         @parent_context = parent_context
+         @properties = {}
+       end
+
+       # property annotations apply to the next declared property.
+       #
+       # annotate "Some description", :some_other => 'annotations'
+       # annotate :description => "Some description", :some_other => 'annotations'
+       def annotate(description_or_options, options=nil)
+         if description_or_options.is_a?(::String)
+           options ||= {}; options[:description] = description_or_options
+         else
+           options = description_or_options
+         end
+         @pending_annotations ||= {}
+         @pending_annotations.merge!(options)
+       end
+
+       def property(name, *type_args, &type_block_arg)
+         raise Error, "property #{name.inspect} declared twice" if @properties[name]
+         @properties[name] = [type_args, type_block_arg]
+         if @pending_annotations
+           @property_annotations ||= {}
+           @property_annotations[name] = @pending_annotations
+           @pending_annotations = nil
+         end
+       end
+
+       def method_missing(name, *args, &block)
+         @parent_context.respond_to?(name) ? @parent_context.send(name, *args, &block) : super
+       end
+     end
+
+     class DerivedObjectContext < ObjectContext
+       def initialize(c, original_object_type)
+         super(c)
+         @original_object_type = original_object_type
+       end
+
+       def derive_properties(*names)
+         names.each {|n| derive_property(n)}
+       end
+
+       # use a property from the original object type being derived from
+       def derive_property(name, *derive_args, &derive_block)
+         type = @original_object_type[name] or raise "derive_property: no such property #{name.inspect} on the original type"
+         derived_type = derived_from(type, *derive_args, &derive_block)
+         property name, derived_type
+       end
+
+       # derives all properties from the original type which haven't already been derived.
+       # this is done for you by Typisch::Typed::ClassMethods#register_subtype.
+       def derive_all_properties
+         @original_object_type.property_names_to_types.each do |name, type|
+           property(name, type) unless @properties[name]
+         end
+       end
+
+       def derive_all_properties_except(*props)
+         @original_object_type.property_names_to_types.each do |name, type|
+           next if props.include?(name)
+           property(name, type) unless @properties[name]
+         end
+       end
+     end
+   end
+
+   class DSLContext
+     include DSL
+
+     attr_reader :registry
+
+     def initialize(registry)
+       @registry = registry
+     end
+   end
+
+ end
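The `ObjectContext` class above combines two idioms: evaluating a block with `instance_eval` so that bare `property` calls land on a collector object, and holding annotations "pending" until the next declaration. The sketch below isolates both in a few lines; `PropertyCollector` and `collect_properties` are hypothetical names and deliberately much simpler than the DSL in the diff.

```ruby
# Hypothetical, stripped-down builder; not part of Typisch.
class PropertyCollector
  attr_reader :properties, :property_annotations

  def initialize
    @properties = {}
    @property_annotations = {}
  end

  # Remember an annotation until the next property is declared.
  def annotate(description)
    @pending = { description: description }
  end

  def property(name, type_name)
    raise ArgumentError, "property #{name.inspect} declared twice" if @properties.key?(name)
    @properties[name] = type_name
    if @pending
      # Attach the pending annotation to this property, then clear it so
      # that it does not leak onto later declarations.
      @property_annotations[name] = @pending
      @pending = nil
    end
  end
end

def collect_properties(&block)
  collector = PropertyCollector.new
  # instance_eval runs the block with the collector as self, so bare
  # `property` and `annotate` calls inside the block reach the collector.
  collector.instance_eval(&block)
  collector
end

result = collect_properties do
  annotate "Display name"
  property :name, :string
  property :age, :integer
end

puts result.properties.inspect
puts result.property_annotations.inspect
```

The trade-off of `instance_eval` is that `self` changes inside the block, which is why the original `ObjectContext` forwards unknown methods back to its parent context through `method_missing`.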
@@ -0,0 +1,11 @@
+ module Typisch
+   class Error < StandardError; end
+   class TypeDeclarationError < Error; end
+   class NameResolutionError < TypeDeclarationError
+     def initialize(type_name)
+       super("Problem resolving named placeholder type: cannot find type with name #{type_name} in registry")
+     end
+   end
+   class SerializationError < Error; end
+   class CyclicSerialization < SerializationError; end
+ end
@@ -0,0 +1,57 @@
+ module Typisch
+   META_TYPES = Registry.new
+   META_TYPES.register do
+
+     register(:"Typisch::Type", :union,
+       :"Typisch::Type::String",
+       :"Typisch::Type::Numeric",
+       :"Typisch::Type::Boolean",
+       :"Typisch::Type::Null",
+       :"Typisch::Type::Date",
+       :"Typisch::Type::Time",
+       :"Typisch::Type::Object",
+       :"Typisch::Type::Sequence",
+       :"Typisch::Type::Tuple",
+       :"Typisch::Type::Union"
+     )
+
+     register_type_for_class(Type::Boolean)
+     register_type_for_class(Type::Null)
+     register_type_for_class(Type::Date)
+     register_type_for_class(Type::Time)
+
+     register_type_for_class(Type::Numeric) do
+       property :tag, :string
+     end
+
+     register_type_for_class(Type::String) do
+       property :values, nullable(sequence(:string))
+       property :max_length, nullable(:integer)
+     end
+
+     register_type_for_class(::Range) do
+       property :begin, :integer
+       property :end, :integer
+     end
+
+     register_type_for_class(Type::Sequence) do
+       property :type, :"Typisch::Type"
+       property :slice, nullable(:Range)
+       property :total_length, nullable(:boolean)
+     end
+
+     register_type_for_class(Type::Tuple) do
+       property :types, sequence(:"Typisch::Type")
+     end
+
+     register_type_for_class(Type::Object) do
+       property :property_names_to_types, sequence(tuple(:string, :"Typisch::Type"))
+       property :tag, :string
+     end
+
+     register_type_for_class(Type::Union) do
+       property :alternative_types, sequence(:"Typisch::Type")
+     end
+   end
+
+ end
@@ -0,0 +1,67 @@
+ module Typisch
+   # This is a proxy wrapper for a type, which we can use as a placeholder for a named
+   # type which hasn't yet been declared. Helps when it comes to cyclic references etc.
+   #
+   # (You can view this as a free variable, where the scope of all free variables is
+   # implicitly closed over at the top level by the 'registry'. We don't keep variables
+   # lying around as symbolic things in a syntax tree though, we're just using them as
+   # temporary placeholders on the way to rewriting it as a syntax *graph*).
+   #
+   # Once a register_types block has finished, the registry ensures that all references
+   # in the type graph to NamedPlaceholders are replaced with references to their targets.
+   class Type::NamedPlaceholder < Type
+     def initialize(name, registry)
+       @registry = registry
+       @name = name
+     end
+
+     def target
+       return @target if @target
+       # Resolve into a local first: a failed lookup must not be cached, and the
+       # method must return the resolved type rather than the nil value of the
+       # case expression.
+       resolved = @registry[@name]
+       case resolved
+       when NilClass, Type::NamedPlaceholder
+         raise NameResolutionError.new(@name.inspect)
+       end
+       @target = resolved
+     end
+
+     attr_writer :target
+
+     def target=(target)
+       @target = target.target
+     end
+     private :target=
+
+     # this is slightly naughty - we actually pretend to be of the class
+     # of our target object.
+     #
+     # note that TargetClass === placeholder will still return false.
+
+     def class
+       target.class
+     end
+
+     def is_a?(klass)
+       target.is_a?(klass)
+     end
+     alias :kind_of? :is_a?
+
+     def instance_of?(klass)
+       target.instance_of?(klass)
+     end
+
+     def to_s(*)
+       @name.inspect
+     end
+
+     # let us proxy these through
+     undef :alternative_types, :check_type, :shallow_check_type, :subexpression_types
+     undef :excluding_null, :annotations, :canonicalize!
+
+     def method_missing(name, *args, &block)
+       target.respond_to?(name) ? target.send(name, *args, &block) : super
+     end
+
+     def respond_to?(name, include_private=false)
+       super || target.respond_to?(name, include_private)
+     end
+   end
+ end
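The placeholder above is a lazily resolved forwarding proxy. As a rough standalone sketch of that pattern, the example below uses a plain Hash as the registry; `LazyRef`, the Hash registry and the use of `KeyError` are assumptions for illustration and do not reflect Typisch's actual registry behaviour.

```ruby
# Hypothetical stand-in for a named placeholder; not a Typisch class.
class LazyRef
  def initialize(name, registry)
    @name = name
    @registry = registry
  end

  # Look the name up on first use, cache only a successful result, and
  # always return the resolved object.
  def target
    return @target if @target
    resolved = @registry[@name]
    raise KeyError, "cannot resolve #{@name.inspect}" if resolved.nil? || LazyRef === resolved
    @target = resolved
  end

  # Forward anything the target understands.
  def method_missing(name, *args, &block)
    target.respond_to?(name) ? target.public_send(name, *args, &block) : super
  end

  # Keep respond_to? consistent with what method_missing will accept.
  def respond_to_missing?(name, include_private = false)
    target.respond_to?(name, include_private) || super
  end
end

registry = {}
ref = LazyRef.new(:greeting, registry) # created before the name is registered
registry[:greeting] = "hello"

puts ref.upcase
puts ref.respond_to?(:upcase)
```

This sketch uses `respond_to_missing?`, the hook Ruby provides for pairing with `method_missing`, whereas the class in the diff overrides `respond_to?` directly; both make the proxy report the target's methods.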
@@ -0,0 +1,17 @@
+ module Typisch
+   class Type::Null < Type::Constructor::Singleton
+     def self.tag
+       "Null"
+     end
+
+     def shallow_check_type(instance)
+       instance.nil?
+     end
+
+     def excluding_null
+       Type::Nothing::INSTANCE
+     end
+
+     Registry.register_global_type(:null, top_type)
+   end
+ end