stamina-core 0.5.0

Files changed (36)
  1. data/CHANGELOG.md +78 -0
  2. data/LICENCE.md +22 -0
  3. data/lib/stamina-core/stamina-core.rb +1 -0
  4. data/lib/stamina-core/stamina/adl.rb +298 -0
  5. data/lib/stamina-core/stamina/automaton.rb +1300 -0
  6. data/lib/stamina-core/stamina/automaton/complement.rb +26 -0
  7. data/lib/stamina-core/stamina/automaton/complete.rb +36 -0
  8. data/lib/stamina-core/stamina/automaton/compose.rb +111 -0
  9. data/lib/stamina-core/stamina/automaton/determinize.rb +104 -0
  10. data/lib/stamina-core/stamina/automaton/equivalence.rb +57 -0
  11. data/lib/stamina-core/stamina/automaton/hide.rb +41 -0
  12. data/lib/stamina-core/stamina/automaton/metrics.rb +77 -0
  13. data/lib/stamina-core/stamina/automaton/minimize.rb +23 -0
  14. data/lib/stamina-core/stamina/automaton/minimize/hopcroft.rb +118 -0
  15. data/lib/stamina-core/stamina/automaton/minimize/pitchies.rb +130 -0
  16. data/lib/stamina-core/stamina/automaton/strip.rb +16 -0
  17. data/lib/stamina-core/stamina/automaton/walking.rb +361 -0
  18. data/lib/stamina-core/stamina/command.rb +38 -0
  19. data/lib/stamina-core/stamina/command/adl2dot.rb +82 -0
  20. data/lib/stamina-core/stamina/command/help.rb +23 -0
  21. data/lib/stamina-core/stamina/command/robustness.rb +21 -0
  22. data/lib/stamina-core/stamina/command/run.rb +84 -0
  23. data/lib/stamina-core/stamina/core.rb +11 -0
  24. data/lib/stamina-core/stamina/dsl.rb +6 -0
  25. data/lib/stamina-core/stamina/dsl/automata.rb +23 -0
  26. data/lib/stamina-core/stamina/dsl/core.rb +14 -0
  27. data/lib/stamina-core/stamina/engine.rb +32 -0
  28. data/lib/stamina-core/stamina/engine/context.rb +35 -0
  29. data/lib/stamina-core/stamina/errors.rb +26 -0
  30. data/lib/stamina-core/stamina/ext/math.rb +19 -0
  31. data/lib/stamina-core/stamina/loader.rb +3 -0
  32. data/lib/stamina-core/stamina/markable.rb +42 -0
  33. data/lib/stamina-core/stamina/utils.rb +1 -0
  34. data/lib/stamina-core/stamina/utils/decorate.rb +81 -0
  35. data/lib/stamina-core/stamina/version.rb +14 -0
  36. metadata +93 -0
@@ -0,0 +1,78 @@
1
+ # 0.5.0 / FIX ME
2
+
3
+ * Breaking features.
4
+
5
+ * Support for ruby 1.8.7 has been definitively removed.
6
+
7
+ * Major enhancements
8
+
9
+ * The project has been split into different sub gems (core, induction and gui). This
10
+ implies a lot of internal changes, but the public API has not been affected. A main
11
+ 'stamina' gem automatically includes all sub gems so previous behavior is guaranteed.
12
+
13
+ * Minor enhancements
14
+ * Fixed a bug with bundler usage in main stamina binary
15
+ * adl2dot command now supports samples as input in addition to automata. In that case,
16
+ the dot result models a PTA (prefix tree acceptor)
17
+ * Added --png to 'stamina adl2dot'
18
+
19
+ # 0.4.0 / 2011-05-01
20
+
21
+ * Major Enhancements
22
+
23
+ * Added Automaton#to_adl as a shortcut for Stamina::ADL::print_automaton(...)
24
+ * Added Sample#to_pta taken from Induction::Commons
25
+ * Added Automaton completion (all strings parsable) under Automaton#complete[!?]
26
+ * Added Automaton stripping (removal of unreachable states) under Automaton#strip[!]
27
+ * Added Automaton minimization (Hopcroft + Pitchies) under Automaton#minimize
28
+ * Added Abbadingo generators under Abbadingo::RandomDFA and Abbadingo::RandomSample
29
+ * Added a main 'stamina' command relying on Quickl. classify/adl2dot commands become
30
+ subcommands of stamina itself (see stamina --help for a list of available commands).
31
+ Induction commands (rpni and redblue) are now handled by 'stamina infer' with
32
+ options.
33
+ * Error states are now correctly handled in ADL::parse and ADL::flush
34
+ * RedBlue has been renamed as BlueFringe everywhere (red_?blue -> blue_fringe)
35
+
36
+ * Minor Enhancements
37
+ * Added a few optimizations here and there
38
+
39
+ * Bug fixes
40
+
41
+ * Fixed a bug in Automaton#depth when some states are unreachable
42
+
43
+ # 0.3.1 / 2011-03-24
44
+
45
+ * Major Enhancements
46
+
47
+ * Implemented the decoration algorithm of Damas10, allowing to decorate states
48
+ with information propagated from states to states until a fixpoint is reached.
49
+ * Added Automaton::Metrics module, automatically included, with useful metrics
50
+ like automaton depth, accepting ratio and so on.
51
+ * Added Scoring module and Classifier#classification_scoring(sample) method
52
+ with common measures from information retrieval.
53
+
54
+ * On the devel side
55
+
56
+ * Moved specific automaton tests under test/stamina/automaton/...
57
+
58
+ # 0.3.0 / 2011-03-24
59
+
60
+ * On the devel side
61
+
62
+ * The project structure is now handled by Noe
63
+ * Ensures that tests are correctly executed under ruby 1.9.2
64
+
65
+
66
+ # 0.2.2 / 2010-10-22
67
+
68
+ * Major Enhancements
69
+
70
+ * Sample#<< does not detect inconsistencies anymore, to ensure a linear method instead of a quadratic one.
71
+
72
+ * On the devel side
73
+
74
+ * Fixes a bug in Rakefile that led to test failures under ruby 1.8.7
75
+
76
+ # 0.2.1 / 2010-05-01
77
+
78
+ * Main public version for the official competition, extracted from private SVN.
@@ -0,0 +1,22 @@
1
+ The MIT License
2
+
3
+ Copyright (c) 2008-2009 University of Louvain
4
+ (Universite catholique de Louvain-la-Neuve, Belgium)
5
+
6
+ Permission is hereby granted, free of charge, to any person obtaining a copy
7
+ of this software and associated documentation files (the "Software"), to deal
8
+ in the Software without restriction, including without limitation the rights
9
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
10
+ copies of the Software, and to permit persons to whom the Software is
11
+ furnished to do so, subject to the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be included in
14
+ all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
17
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
18
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
19
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
20
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
21
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
22
+ THE SOFTWARE.
@@ -0,0 +1 @@
1
+ require_relative 'stamina/core'
@@ -0,0 +1,298 @@
1
+ module Stamina
2
+ #
3
+ # Automaton Description Language module. This module provides parsing and
4
+ # printing methods for automata and samples. Documentation of the file format
5
+ # used for an automaton is given in parse_automaton; file format for samples is
6
+ # documented in parse_sample.
7
+ #
8
+ # Methods of this module are not intended to be included by a class but invoked
9
+ # on the module instead:
10
+ #
11
+ # begin
12
+ # dfa = Stamina::ADL.parse_automaton_file("my_automaton.adl")
13
+ # rescue Stamina::ADL::ParseError => ex
14
+ # puts "Oops, the ADL automaton file seems corrupted..."
15
+ # end
16
+ #
17
+ # == Detailed API
18
+ module ADL
19
+
20
+ #################################################################################
21
+ # Automaton Section #
22
+ #################################################################################
23
+
24
+ #
25
+ # Parses a given automaton description and returns an Automaton instance.
26
+ #
27
+ # Raises:
28
+ # - ArgumentError unless _descr_ is an IO object or a String.
29
+ # - ADL::ParseError if the ADL automaton format is not respected.
30
+ #
31
+ # ADL provides a really simple grammar to describe automata. Here is a succinct
32
+ # example (full documentation of the ADL automaton grammar can be found in
33
+ # the self-documenting example/adl/automaton.adl file).
34
+ #
35
+ # # Some header comments: tool which has generated this automaton,
36
+ # # maybe a date or other tool options ...
37
+ # # here: 'this automaton accepts the a(ba)* regular language'
38
+ # 2 2
39
+ # 0 true false
40
+ # 1 false true
41
+ # 0 1 a
42
+ # 1 0 b
43
+ #
44
+ def self.parse_automaton(descr)
45
+ automaton = nil
46
+ ADL::to_io(descr) do |io|
47
+ state_count, edge_count = nil, nil
48
+ state_read, edge_read = 0, 0
49
+ states = {}
50
+ mode = :header
51
+
52
+ automaton = Automaton.new do |fa|
53
+ # parse each description line
54
+ line_number = 1
55
+ io.each_line do |l|
56
+ index = l.index('#')
57
+ l = l[0,index] if index
58
+ l = l.strip
59
+ (line_number += 1; next) if l.empty? # keep line_number accurate across skipped lines
60
+
61
+ case mode
62
+ when :header
63
+ # looking for |state_count edge_count|
64
+ raise(ADL::ParseError,
65
+ "Parse error line #{line_number}: 'state_count edge_count' expected, "\
66
+ "'#{l}' found.") unless /^(\d+)\s+(\d+)$/ =~ l
67
+ state_count, edge_count = $1.to_i, $2.to_i
68
+ mode = :states
69
+
70
+ when :states
71
+ # looking for |number initial accepting|
72
+ raise(ADL::ParseError,
73
+ "Parse error line #{line_number}: state definition expected, "\
74
+ "'#{l}' found.") unless /^(\S+)\s+(true|false)\s+(true|false)(\s+(true|false))?$/ =~ l
75
+ id, initial, accepting, error = $1, $2, $3, $5
76
+ initial, accepting, error = ("true"==initial), ("true"==accepting), ("true"==error)
77
+
78
+ state = fa.add_state(:initial => initial, :accepting => accepting, :error => error)
79
+ state[:name]=id.to_s
80
+ states[id] = state
81
+
82
+ state_read += 1
83
+ mode = (edge_count==0 ? :end : :edges) if state_read==state_count
84
+
85
+ when :edges
86
+ # looking for |source target symbol|
87
+ raise(ADL::ParseError,
88
+ "Parse error line #{line_number}: edge definition expected, "\
89
+ "'#{l}' found.") unless /^(\S+)\s+(\S+)\s+(\S+)$/ =~ l
90
+ source, target, symbol = $1, $2, $3
91
+ raise(ADL::ParseError,
92
+ "Parse error line #{line_number}: no such state #{source}") \
93
+ unless states[source]
94
+ raise(ADL::ParseError,
95
+ "Parse error line #{line_number}: no such state #{target}") \
96
+ unless states[target]
97
+
98
+ fa.connect(states[source], states[target], {:symbol => symbol})
99
+
100
+ edge_read += 1
101
+ mode = :end if edge_read==edge_count
102
+
103
+ when :end
104
+ raise(ADL::ParseError,
105
+ "Parse error line #{line_number}: trailing data found '#{l}'")
106
+
107
+ end # case mode
108
+
109
+ line_number += 1
110
+ end
111
+
112
+ raise(ADL::ParseError, "Parse error: #{state_count} states announced, "\
113
+ "#{state_read} found.") if state_count != state_read
114
+ raise(ADL::ParseError, "Parse error: #{edge_count} edges announced, "\
115
+ "#{edge_read} found.") if edge_count != edge_read
116
+
117
+ end # Automaton.new
118
+ end
119
+ return automaton
120
+ end # def self.parse_automaton
121
+
122
+ #
123
+ # Parses an automaton file _f_.
124
+ #
125
+ # Shortcut for:
126
+ # File.open(f, 'r') do |io|
127
+ # Stamina::ADL.parse_automaton(io)
128
+ # end
129
+ #
130
+ def self.parse_automaton_file(f)
131
+ automaton = nil
132
+ File.open(f) do |file|
133
+ automaton = ADL::parse_automaton(file)
134
+ end
135
+ automaton
136
+ end
137
+
138
+ #
139
+ # Prints an automaton to a buffer (responding to <code><<</code>) in ADL
140
+ # format. Returns the buffer itself.
141
+ #
142
+ def self.print_automaton(fa, buffer="")
143
+ buffer << "#{fa.state_count.to_s} #{fa.edge_count.to_s}" << "\n"
144
+ fa.states.each do |s|
145
+ buffer << "#{s.index.to_s} #{s.initial?} #{s.accepting?}" << (s.error? ? " true" : "") << "\n"
146
+ end
147
+ fa.edges.each do |e|
148
+ buffer << "#{e.source.index.to_s} #{e.target.index.to_s} #{e.symbol.to_s}" << "\n"
149
+ end
150
+ buffer
151
+ end
152
+
153
+ #
154
+ # Prints an automaton to a file whose path is provided.
155
+ #
156
+ # Shortcut for:
157
+ # File.open(file, 'w') do |io|
158
+ # print_automaton(fa, io)
159
+ # end
160
+ #
161
+ def self.print_automaton_to_file(fa, file)
162
+ File.open(file, 'w') do |io|
163
+ print_automaton(fa, io)
164
+ end
165
+ end
166
+
167
+ #################################################################################
168
+ # String and Sample Section #
169
+ #################################################################################
170
+
171
+ #
172
+ # Parses an input string _str_ and returns an InputString instance. Format of
173
+ # input strings is documented in parse_sample. _str_ is required to be a ruby
174
+ # String.
175
+ #
176
+ # Raises:
177
+ # - ADL::ParseError if the ADL string format is not respected.
178
+ #
179
+ def self.parse_string(str)
180
+ symbols = str.split(' ')
181
+ case symbols[0]
182
+ when '+'
183
+ symbols.shift
184
+ InputString.new symbols, true, false
185
+ when '-'
186
+ symbols.shift
187
+ InputString.new symbols, false, false
188
+ when '?'
189
+ symbols.shift
190
+ InputString.new symbols, nil, false
191
+ else
192
+ raise ADL::ParseError, "Invalid string format #{str}", caller
193
+ end
194
+ end
195
+
196
+ #
197
+ # Parses the sample provided by _descr_. When a block is provided, yields it with
198
+ # InputString instances and ignores the sample argument. Otherwise, fills the sample
199
+ # (any object responding to <code><<</code>) with strings, creating a fresh
200
+ # one (as a Sample instance) if sample is nil.
201
+ #
202
+ # ADL provides a really simple grammar to describe samples (here is a succinct
203
+ # example, the full documentation of the sample grammar can be found in the
204
+ # self-documenting example/adl/sample.adl file):
205
+ #
206
+ # #
207
+ # # Some header comments: tool which has generated this sample,
208
+ # # maybe a date or other tool options ...
209
+ # # here: 'this sample is characteristic for the a(ba)* regular language'
210
+ # #
211
+ # # Positive, Negative, Unlabeled strings begin with +, -, ?, respectively
212
+ # # Empty lines and lines beginning with # are simply ignored.
213
+ # #
214
+ # -
215
+ # + a
216
+ # - a b
217
+ # + a b a
218
+ #
219
+ # Raises:
220
+ # - ArgumentError unless _descr_ argument is an IO object or a String.
221
+ # - ADL::ParseError if the ADL sample format is not respected.
222
+ # - InconsistencyError if the sample is not consistent (see Sample)
223
+ #
224
+ def self.parse_sample(descr, sample=nil)
225
+ sample = Sample.new if (sample.nil? and not block_given?)
226
+ ADL::to_io(descr) do |io|
227
+ io.each_line do |l|
228
+ l = l.strip
229
+ next if l.empty? or l[0,1]=='#'
230
+ if sample.nil? and block_given?
231
+ yield parse_string(l)
232
+ else
233
+ sample << parse_string(l)
234
+ end
235
+ end
236
+ end
237
+ sample
238
+ end
239
+
240
+ #
241
+ # Parses a sample file _f_.
242
+ #
243
+ # Shortcut for:
244
+ # File.open(f) do |file|
245
+ # sample = ADL::parse_sample(file, sample)
246
+ # end
247
+ #
248
+ def self.parse_sample_file(f, sample=nil)
249
+ File.open(f) do |file|
250
+ sample = ADL::parse_sample(file, sample)
251
+ end
252
+ sample
253
+ end
254
+
255
+ #
256
+ # Prints a sample in ADL format on a buffer. Sample argument is expected to be
257
+ # an object responding to each, yielding InputString instances. Buffer is expected
258
+ # to be an object responding to <code><<</code>.
259
+ #
260
+ def self.print_sample(sample, buffer="")
261
+ sample.each do |str|
262
+ buffer << str.to_s << "\n"
263
+ end
264
+ end
265
+
266
+ #
267
+ # Prints a sample in a file.
268
+ #
269
+ # Shortcut for:
270
+ # File.open(file, 'w') do |io|
271
+ # print_sample(sample, io)
272
+ # end
273
+ #
274
+ def self.print_sample_in_file(sample, file)
275
+ File.open(file, 'w') do |f|
276
+ print_sample(sample, f)
277
+ end
278
+ end
279
+
280
+ ### private section ##########################################################
281
+ private
282
+
283
+ #
284
+ # Converts a parsable argument to an IO object or raises an ArgumentError.
285
+ #
286
+ def self.to_io(descr)
287
+ case descr
288
+ when IO
289
+ yield descr
290
+ when String
291
+ yield StringIO.new(descr)
292
+ else
293
+ raise ArgumentError, "IO instance expected, #{descr.class} received", caller
294
+ end
295
+ end
296
+
297
+ end # module ADL
298
+ end # module Stamina
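
The ADL automaton grammar documented in adl.rb above (a 'state_count edge_count' header, then state lines, then edge lines) can be illustrated with a standalone sketch. This is not the gem's code: parse_adl_sketch is a hypothetical helper that returns plain hashes instead of a Stamina::Automaton, raises ArgumentError rather than ADL::ParseError, and validates less strictly than the real parser.

```ruby
# Hypothetical helper, NOT part of stamina-core: reads ADL automaton text
# in the documented order (header, then states, then edges).
def parse_adl_sketch(text)
  states = {}
  edges = []
  state_count = edge_count = nil

  text.each_line.with_index(1) do |raw, lineno|
    line = raw.sub(/#.*/, '').strip   # drop trailing comments, then whitespace
    next if line.empty?
    fields = line.split(/\s+/)

    if state_count.nil?
      # header: "state_count edge_count"
      unless line =~ /\A\d+\s+\d+\z/
        raise ArgumentError, "line #{lineno}: 'state_count edge_count' expected"
      end
      state_count, edge_count = fields.map(&:to_i)
    elsif states.size < state_count
      # state: "id initial accepting [error]"
      id, initial, accepting, error = fields
      states[id] = {
        initial: initial == 'true',
        accepting: accepting == 'true',
        error: error == 'true'
      }
    elsif edges.size < edge_count
      # edge: "source target symbol"
      source, target, symbol = fields
      unless states.key?(source) && states.key?(target)
        raise ArgumentError, "line #{lineno}: no such state"
      end
      edges << { source: source, target: target, symbol: symbol }
    else
      raise ArgumentError, "line #{lineno}: trailing data found '#{line}'"
    end
  end

  { states: states, edges: edges }
end

ADL_EXAMPLE = <<~ADL
  # this automaton accepts the a(ba)* regular language
  2 2
  0 true false
  1 false true
  0 1 a
  1 0 b
ADL

result = parse_adl_sketch(ADL_EXAMPLE)
puts result[:states].size                            # prints 2
puts result[:edges].map { |e| e[:symbol] }.join(',') # prints a,b
```

Counting lines with with_index(1) keeps reported line numbers aligned with the input even when comment or blank lines are skipped.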
@@ -0,0 +1,1300 @@
1
+ module Stamina
2
+
3
+ #
4
+ # Automaton data-structure.
5
+ #
6
+ # == Examples
7
+ # The following example uses a lot of useful DRY shortcuts; if it does not
8
+ # fit your needs, read on:
9
+ #
10
+ # # Building an automaton for the regular language a(ba)*
11
+ # fa = Automaton.new do
12
+ # add_state(:initial => true)
13
+ # add_state(:accepting => true)
14
+ # connect(0,1,'a')
15
+ # connect(1,0,'b')
16
+ # end
17
+ #
18
+ # # It accepts 'a b a b a', rejects 'a b' as well as ''
19
+ # puts fa.accepts?('? a b a b a') # prints true
20
+ # puts fa.accepts?('? a b') # prints false
21
+ # puts fa.rejects?('?') # prints true
22
+ #
23
+ # == Four things you need to know
24
+ # 1. Automaton, State and Edge classes implement a Markable design pattern, that
25
+ # is, you can read and write any key/value pair you want on them using the []
26
+ # and []= operators. Note that the following keys are used by Stamina itself,
27
+ # with the obvious semantics (for automata and transducers):
28
+ # - <tt>:initial</tt>, <tt>:accepting</tt>, <tt>:error</tt> on State;
29
+ # expected to be _true_ or _false_ (_nil_ and omitted are considered as false).
30
+ # Shortcuts for querying and setting these attributes are provided by State.
31
+ # - <tt>:symbol</tt> on Edge, with shortcuts as well on Edge.
32
+ # The convention is to use _nil_ for the epsilon symbol (aka non observable)
33
+ # on non deterministic automata.
34
+ # The following keys are reserved for future extensions:
35
+ # - <tt>:output</tt> on State and Edge.
36
+ # - <tt>:short_prefix</tt> on State.
37
+ # See also the "About states and edges" subsection of the design choices.
38
+ # 2. Why use the State methods State#step and State#delta? The Automaton class includes
39
+ # the Walking module by default, which is much more powerful !
40
+ # 3. The constructor of this class executes the argument block (between <tt>do</tt>
41
+ # and <tt>end</tt>) with instance_eval by default. You won't be able to invoke
42
+ # the methods defined in the scope of your block in such a case. See new
43
+ # for details.
44
+ # 4. This class has not been designed with efficiency in mind. If you experience
45
+ # performance problems, read the "About Automaton modifications" sub section
46
+ # of the design choices.
47
+ #
48
+ # == Design choices
49
+ # This section fully details the design choices that have been made for the
50
+ # implementation of the Automaton data structure used by Stamina. It is provided
51
+ # because Automaton is one of the core classes of Stamina, that probably all
52
+ # users (and contributors) will use. Automaton usage is really user-friendly,
53
+ # so <b>you are normally not required</b> to read this section in the first
54
+ # place! Read it only if it is of interest to you, or if you experience unexpected
55
+ # results.
56
+ #
57
+ # === One Automaton class only
58
+ # One class only implements all kinds of automata: deterministic, non-deterministic,
59
+ # transducers, prefix-tree-acceptors, etc. The Markable design pattern on states and
60
+ # edges should allow you to make anything you could find useful with this class.
61
+ #
62
+ # === Adjacency-list graph
63
+ # This class implements an automaton using an adjacency-list graph structure.
64
+ # The automaton has state and edge array lists and exposes them through the
65
+ # _states_ and _edges_ accessors. In order to let users enjoy the enumerability
66
+ # of Ruby's arrays while allowing automata to be modified, these arrays are
67
+ # externally modifiable. However, <b>users are not expected to modify them!</b>
68
+ # and future versions of Stamina will certainly remove this ability.
69
+ #
70
+ # === Indices exposed
71
+ # State and Edge indices in these arrays are exposed by this class. Unless stated
72
+ # explicitly, all methods taking state or edge arguments support indices as well.
73
+ # Moreover, ith_state, ith_states, ith_edge and ith_edges methods provide powerful
74
+ # access to states and edges by indices. All these methods are robust to invalid
75
+ # indices (and raise an IndexError if incorrectly invoked) but do not allow
76
+ # negative indexing (unlike ruby arrays).
77
+ #
78
+ # States and edges know their index in the corresponding array and expose them
79
+ # through the (read-only) _index_ accessor. These indices are always valid;
80
+ # without deletion of states or edges in the automaton, they are guaranteed not
81
+ # to change. Indices saved in your own variables must be considered deprecated
82
+ # each time you perform a deletion ! That's the only rule to respect if you plan
83
+ # to use indices.
84
+ #
85
+ # Exposing indices may seem a strange choice and could be interpreted as
86
+ # breaking OOP's best practice. You are not required to use them but, as will
87
+ # quickly appear, using them is really powerful and leads to beautiful code!
88
+ # If you don't remove any state or edge, this class guarantees that indices
89
+ # are assigned in the same order as invocations of add_state and add_edge (as
90
+ # well as their plural forms and aliases).
91
+ #
92
+ # === About states and edges
93
+ # Edges know their source and target states, which are exposed through the
94
+ # _source_ and _target_ (read-only) accessors (also aliased as _from_ and _to_).
95
+ # States keep their incoming and outgoing edges in arrays, which are accessible
96
+ # (in fact, a copy) using State#in_edges and State#out_edges. If you use them
97
+ # for walking the automaton in a somewhat standard way, consider using the Walking
98
+ # module instead!
99
+ #
100
+ # Common attributes of states and edges are installed using the Markable pattern
101
+ # itself:
102
+ # - <tt>:initial</tt>, <tt>:accepting</tt> and <tt>:error</tt> on states. These
103
+ # attributes are expected to be _true_ or _false_ (_nil_ and omitted are also
104
+ # supported and both considered as false).
105
+ # - <tt>:symbol</tt> on edges. Any object you want as long as it responds to the
106
+ # <tt><=></tt> operator. Also, the convention is to use _nil_ for the epsilon
107
+ # symbol (aka non observable) on non deterministic automata.
108
+ #
109
+ # In addition, useful shortcuts are available:
110
+ # - <tt>s.initial?</tt> is a shortcut for <tt>s[:initial]</tt> if _s_ is a State
111
+ # - <tt>s.initial!</tt> is a shortcut for <tt>s[:initial]=true</tt> if _s_ is a State
112
+ # - Similar shortcuts are available for :accepting and :error
113
+ # - <tt>e.symbol</tt> is a shortcut for <tt>e[:symbol]</tt> if _e_ is an Edge
114
+ # - <tt>e.symbol='a'</tt> is a shortcut for <tt>e[:symbol]='a'</tt> if _e_ is an Edge
115
+ #
116
+ # Following keys should be considered reserved by Stamina for future extensions:
117
+ # - <tt>:output</tt> on State and Edge.
118
+ # - <tt>:short_prefix</tt> on State.
119
+ #
120
+ # === About Automaton modifications
121
+ # This class has not been implemented with efficiency in mind. In particular, we expect
122
+ # the vast majority of Stamina core algorithms to consider automata as immutable values.
123
+ # For this reason, the Automaton class does not handle modifications really efficiently.
124
+ #
125
+ # So, if you experience performance problems, consider what follows:
126
+ # 1. Why update an automaton? Building a fresh one is much cleaner and more efficient!
127
+ # This is particularly true for removals.
128
+ # 2. If you can create multiple states or edges at once, consider the plural form
129
+ # of the modification methods: add_n_states and drop_states. Those methods are
130
+ # optimized for multiple updates.
131
+ #
132
+ # == Detailed API
133
+ class Automaton
134
+ include Stamina::Markable
135
+
136
+ #
137
+ # Automaton state.
138
+ #
139
+ class State
140
+ include Stamina::Markable
141
+ attr_reader :automaton, :index
142
+
143
+ #
144
+ # Creates a state.
145
+ #
146
+ # Arguments:
147
+ # - automaton: parent automaton of the state.
148
+ # - index: index of the state in the state list.
149
+ # - data: user data attached to this state.
150
+ #
151
+ def initialize(automaton, index, data)
152
+ @automaton = automaton
153
+ @index = index
154
+ @data = data.dup
155
+ @out_edges = []
156
+ @in_edges = []
157
+ @epsilon_closure = nil
158
+ end
159
+
160
+ ### public read-only section ###############################################
161
+ public
162
+
163
+ # Returns true if this state is an initial state, false otherwise.
164
+ def initial?
165
+ !!@data[:initial]
166
+ end
167
+
168
+ # Sets this state as an initial state.
169
+ def initial!
170
+ @data[:initial] = true
171
+ end
172
+
173
+ # Returns true if this state is an accepting state, false otherwise.
174
+ def accepting?
175
+ !!@data[:accepting]
176
+ end
177
+
178
+ # Sets this state as an accepting state.
179
+ def accepting!
180
+ @data[:accepting] = true
181
+ end
182
+
183
+ # Returns true if this state is an error state, false otherwise.
184
+ def error?
185
+ !!@data[:error]
186
+ end
187
+
188
+ # Sets this state as an error state.
189
+ def error!
190
+ @data[:error] = true
191
+ end
192
+
193
+ # Returns true if this state is deterministic, false otherwise.
194
+ def deterministic?
195
+ outs = out_symbols
196
+ (outs.size==@out_edges.size) and not(outs.include?(nil))
197
+ end
198
+
199
+ # Checks if this state is a sink state or not. Sink states are defined as
200
+ # non accepting states having no outgoing transition or only loop
201
+ # transitions.
202
+ def sink?
203
+ !accepting? && out_edges.all?{|e| e.target==self}
204
+ end
205
+
206
+ # Returns an array containing all incoming edges of the state. Edges are
207
+ # sorted if _sorted_ is set to true. If two incoming edges have same symbol
208
+ # no order is guaranteed between them.
209
+ #
210
+ # Returned array may be modified.
211
+ def in_edges(sorted=false)
212
+ sorted ? @in_edges.sort : @in_edges.dup
213
+ end
214
+
215
+ # Returns an array containing all outgoing edges of the state. Edges are
216
+ # sorted if _sorted_ is set to true. If two outgoing edges have same symbol
217
+ # no order is guaranteed between them.
218
+ #
219
+ # Returned array may be modified.
220
+ def out_edges(sorted=false)
221
+ sorted ? @out_edges.sort : @out_edges.dup
222
+ end
223
+
224
+ # Returns an array with the different symbols appearing on incoming edges.
225
+ # Returned array does not contain duplicates. Symbols are sorted in the
226
+ # array if _sorted_ is set to true.
227
+ #
228
+ # Returned array may be modified.
229
+ def in_symbols(sorted=false)
230
+ symbols = @in_edges.collect{|e| e.symbol}.uniq
231
+ return sorted ? (symbols.sort &automaton.symbols_comparator) : symbols
232
+ end
233
+
234
+ # Returns an array with the different symbols appearing on outgoing edges.
235
+ # Returned array does not contain duplicates. Symbols are sorted in the
236
+ # array if _sorted_ is set to true.
237
+ #
238
+ # Returned array may be modified.
239
+ def out_symbols(sorted=false)
240
+ symbols = @out_edges.collect{|e| e.symbol}.uniq
241
+ return sorted ? (symbols.sort &automaton.symbols_comparator) : symbols
242
+ end
243
+
244
+ # Returns an array with adjacent states (in or out edge).
245
+ #
246
+ # Returned array may be modified.
247
+ def adjacent_states()
248
+ (in_adjacent_states+out_adjacent_states).uniq
249
+ end
250
+
251
+ # Returns an array with adjacent states along an incoming edge (without
252
+ # duplicates).
253
+ #
254
+ # Returned array may be modified.
255
+ def in_adjacent_states()
256
+ (@in_edges.collect {|e| e.source}).uniq
257
+ end
258
+
259
+ # Returns an array with adjacent states along an outgoing edge (without
260
+ # duplicates).
261
+ #
262
+ # Returned array may be modified.
263
+ def out_adjacent_states()
264
+ (@out_edges.collect {|e| e.target}).uniq
265
+ end
266
+
267
+ # Returns reachable states from this one with an input _symbol_. Returned
268
+ # array does not contain duplicates and may be modified. This method is not
269
+ # epsilon symbol aware.
270
+ def step(symbol)
271
+ @out_edges.select{|e| e.symbol==symbol}.collect{|e| e.target}
272
+ end
273
+
274
+ # Returns the state reached from this one with an input _symbol_, or nil if
275
+ # no such state. This method is not epsilon symbol aware. Moreover it is
276
+ # expected to be used on deterministic states only. If the state is not
277
+ # deterministic, the method returns one reachable state if such a state
278
+ # exists; which one is returned must be considered non deterministic.
279
+ def dfa_step(symbol)
280
+ edge = @out_edges.find{|e| e.symbol==symbol}
281
+ edge ? edge.target : nil
282
+ end
283
+
284
+ # Computes the epsilon closure of this state. Epsilon closure is the set of
285
+ # all states reached from this one with a <tt>eps*</tt> input (sequence of
286
+ # zero or more epsilon symbols). The current state is always contained in
287
+ # the epsilon closure. Returns an unsorted array without duplicates; this
288
+ # array may not be modified.
289
+ def epsilon_closure()
290
+ @epsilon_closure ||= compute_epsilon_closure(Set.new).to_a.freeze
291
+ end
292
+
293
+ # Internal implementation of epsilon_closure. _result_ is expected to be
294
+ # a Set instance, is modified and is the returned value.
295
+ def compute_epsilon_closure(result)
296
+ result << self
297
+ step(nil).each do |t|
298
+ t.compute_epsilon_closure(result) unless result.include?(t)
299
+ end
300
+ raise if result.nil?
301
+ return result
302
+ end
303
+
304
+ # Computes an array representing the set of states that can be reached from
305
+ # this state with a given input _symbol_. Returned array does not contain
306
+ # duplicates and may be modified. No particular ordering of states in the
307
+ # array is guaranteed.
308
+ #
309
+ # This method is epsilon symbol aware (represented with nil) on non
310
+ # deterministic automata, meaning that it actually computes the set of
311
+ # reachable states through strings respecting the <tt>eps* symbol eps*</tt>
312
+ # regular expression, where eps is the epsilon symbol.
313
+ def delta(symbol)
314
+ if automaton.deterministic?
315
+ target = dfa_delta(symbol)
316
+ target.nil? ? [] : [target]
317
+ else
318
+ # 1) first compute epsilon closure of self
319
+ at_epsilon = epsilon_closure
320
+
321
+ # 2) now, look where we can go from there
322
+ at_espilon_then_symbol = at_epsilon.collect do |s|
323
+ s.step(symbol)
324
+ end.flatten.uniq
325
+
326
+ # 3) look where we can go from there using epsilon
327
+ result = at_espilon_then_symbol.collect do |s|
328
+ s.epsilon_closure
329
+ end.flatten.uniq
330
+
331
+ # return result as an array
332
+ result
333
+ end
334
+ end
335
+
336
+ # Returns the target state that can be reached from this state with _symbol_
337
+ # input. Returns nil if no such state exists.
338
+ #
339
+ # This method is expected to be used on deterministic automata. Unlike delta,
340
+ # it returns a State instance (or nil), not an array of states. When used on
341
+ # non deterministic automata, it returns a state immediately reachable from
342
+ # this state with _symbol_ input, or nil if no such state exists. This
343
+ # method is not epsilon aware.
344
+ def dfa_delta(symbol)
345
+ return nil if symbol.nil?
346
+ edge = @out_edges.find{|e| e.symbol==symbol}
347
+ edge.nil? ? nil : edge.target
348
+ end
349
+
350
+ # Provides comparator of states, based on the index in the automaton state
351
+ # list. This method returns nil unless _o_ is a State from the same
352
+ # automaton as self.
353
+ def <=>(o)
354
+ return nil unless State===o
355
+ return nil unless automaton===o.automaton
356
+ return index <=> o.index
357
+ end
358
+
359
+ # Returns a string representation
360
+ def inspect
361
+ 's' << @index.to_s
362
+ end
363
+
364
+ # Returns a string representation
365
+ def to_s
366
+ 's' << @index.to_s
367
+ end
368
+
369
+ ### protected write section ################################################
370
+ protected
371
+
372
+ # Changes the index of this state in the state list. This method is only
373
+ # expected to be used by the automaton itself.
374
+ def index=(i) @index=i end
375
+
376
+ #
377
+ # Fired by Loaded when a user data is changed. The message is forwarded to
378
+ # the automaton.
379
+ #
380
+ def state_changed(what, description)
381
+ @epsilon_closure = nil
382
+ @automaton.send(:state_changed, what, description)
383
+ end
384
+
385
+ # Adds an incoming edge to the state.
386
+ def add_incoming_edge(edge)
387
+ @epsilon_closure = nil
388
+ @in_edges << edge
389
+ end
390
+
391
+ # Adds an outgoing edge to the state.
392
+ def add_outgoing_edge(edge)
393
+ @epsilon_closure = nil
394
+ @out_edges << edge
395
+ end
396
+
397
+ # Drops an incoming edge from the state.
398
+ def drop_incoming_edge(edge)
399
+ @epsilon_closure = nil
400
+ @in_edges.delete(edge)
401
+ end
402
+
403
+ # Drops an outgoing edge from the state.
404
+ def drop_outgoing_edge(edge)
405
+ @epsilon_closure = nil
406
+ @out_edges.delete(edge)
407
+ end
408
+
409
+ protected :compute_epsilon_closure
410
+ end
411
+
412
+ #
413
+ # Automaton edge.
414
+ #
415
+ class Edge
416
+ include Stamina::Markable
417
+ attr_reader :automaton, :index, :from, :to
418
+
419
+ #
420
+ # Creates an edge.
421
+ #
422
+ # Arguments:
423
+ # - automaton: parent automaton of the edge.
424
+ # - index: index of the edge in the edge list.
425
+ # - data: user data attached to this edge.
426
+ # - from: source state of the edge.
427
+ # - to: target state of the edge.
428
+ #
429
+ def initialize(automaton, index, data, from, to)
430
+ @automaton, @index = automaton, index
431
+ @data = data
432
+ @from, @to = from, to
433
+ end
434
+
435
+ # Returns edge symbol.
436
+ def symbol()
437
+ @data[:symbol]
438
+ end
439
+
440
+ # Sets edge symbol.
441
+ def symbol=(symbol)
442
+ @data[:symbol] = symbol
443
+ end
444
+
445
+ alias :source :from
446
+ alias :target :to
447
+
448
+ #
449
+ # Provides comparator of edges, based on the index in the automaton edge
450
+ # list. This method returns nil unless _o_ is an Edge from the same
451
+ # automaton as self.
452
+ # Once again, this method has nothing to do with equality; it only looks at
453
+ # the owning automaton and the index.
454
+ #
455
+ def <=>(o)
456
+ return nil unless Edge===o
457
+ return nil unless automaton===o.automaton
458
+ return index <=> o.index
459
+ end
460
+
461
+ # Returns a string representation
462
+ def inspect
463
+ 'e' << @index.to_s
464
+ end
465
+
466
+ # Returns a string representation
467
+ def to_s
468
+ 'e' << @index.to_s
469
+ end
470
+
471
+ ### protected write section ################################################
472
+ protected
473
+
474
+ # Changes the index of this edge in the edge list. This method is only
475
+ # expected to be used by the automaton itself.
476
+ def index=(i) @index=i end
477
+
478
+ #
479
+ # Fired by Loaded when a user data is changed. The message is forwarded to
480
+ # the automaton.
481
+ #
482
+ def state_changed(what, infos)
483
+ @automaton.send(:state_changed, what, infos)
484
+ end
485
+
486
+ end
487
+
488
+ ### Automaton class ##########################################################
489
+ public
490
+
491
+ # State list and edge list of the automaton
492
+ attr_reader :states, :edges
493
+
494
+ #
495
+ # Creates an empty automaton and executes the block passed as argument. The _onself_
496
+ # argument dictates the way _block_ is executed:
497
+ # - when set to false, the block is executed traditionally (i.e. using yield).
498
+ # In this case, method invocations must be performed on the automaton object
499
+ # passed as block argument.
500
+ # - when set to _true_ (by default) the block is executed in the context of the
501
+ # automaton itself (i.e. with instance_eval), allowing call of its methods
502
+ # without prefixing them by the automaton variable. The automaton still
503
+ # passes itself as first block argument. Note that in this case, you won't be
504
+ # able to invoke a method defined in the scope of your block.
505
+ #
506
+ # Example:
507
+ # # The DRY way to do:
508
+ # Automaton.new do |automaton| # automaton will not be used here, but it is passed
509
+ # add_state(:initial => true)
510
+ # add_state(:accepting => true)
511
+ # connect(0, 1, 'a')
512
+ # connect(1, 0, 'b')
513
+ #
514
+ # # method_in_caller_scope() # commented because not allowed here !!
515
+ # end
516
+ #
517
+ # # The other way:
518
+ # Automaton.new(false) do |automaton| # automaton MUST be used here
519
+ # automaton.add_state(:initial => true)
520
+ # automaton.add_state(:accepting => true)
521
+ # automaton.connect(0, 1, 'a')
522
+ # automaton.connect(1, 0, 'b')
523
+ #
524
+ # method_in_caller_scope() # allowed in this variant !!
525
+ # end
526
+ #
527
+ def initialize(onself=true, &block) # :yields: automaton
528
+ @states = []
529
+ @edges = []
530
+ @initials = nil
531
+ @alphabet = nil
532
+ @deterministic = nil
533
+
534
+ # if there's a block, execute it now!
535
+ if block_given?
536
+ if onself
537
+ if RUBY_VERSION >= "1.9.0"
538
+ instance_exec(self, &block)
539
+ else
540
+ instance_eval(&block)
541
+ end
542
+ else
543
+ block.call(self)
544
+ end
545
+ end
546
+ end
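The two block-execution modes described above can be seen in isolation with a small stand-in class. This is a sketch under stated assumptions: `TinyBuilder` is a hypothetical class invented for the example and only mimics the `onself` switch; it does not reproduce the gem's `Automaton`.

```ruby
# Hypothetical builder (not the gem's Automaton): it only records state names.
class TinyBuilder
  attr_reader :states

  def initialize(onself = true, &block)
    @states = []
    return unless block
    if onself
      # The block runs with self set to the builder, so its methods can be
      # called without a receiver. The builder is still passed as argument.
      instance_exec(self, &block)
    else
      # The block runs in the caller's context and must use its argument.
      block.call(self)
    end
  end

  def add_state(name)
    @states << name
    self
  end
end

implicit = TinyBuilder.new { add_state('s0'); add_state('s1') }
explicit = TinyBuilder.new(false) { |b| b.add_state('s0') }

p implicit.states
p explicit.states
```

The trade-off is the one the comment above points out: in the implicit form, bare method calls resolve against the builder, so methods defined in the caller's scope are out of reach.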
547
+
548
+ # Coerces `arg` to an automaton
549
+ def self.coerce(arg)
550
+ if arg.respond_to?(:to_fa)
551
+ arg.to_fa
552
+ elsif arg.is_a?(String)
553
+ parse(arg)
554
+ else
555
+ raise ArgumentError, "Invalid argument #{arg} for `Automaton`"
556
+ end
557
+ end
558
+
559
+ # Parses an automaton using ADL
560
+ def self.parse(str)
561
+ ADL::parse_automaton(str)
562
+ end
563
+
564
+ ### public read-only section #################################################
565
+ public
566
+
567
+ # Returns a symbols comparator taking epsilon symbols into account. Comparator
568
+ # is provided as a Proc instance (a lambda).
569
+ def symbols_comparator
570
+ @symbols_comparator ||= Kernel.lambda do |a,b|
571
+ if a==b then 0
572
+ elsif a.nil? then -1
573
+ elsif b.nil? then 1
574
+ else a <=> b
575
+ end
576
+ end
577
+ end
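To see what this ordering does in practice, the same comparison rules can be applied to a plain array. The snippet below is a standalone sketch: `comparator` is a local copy of the rules shown above, not a call into the gem.

```ruby
# Same rules as the lambda above: equal symbols compare as 0, nil (epsilon)
# sorts before any other symbol, everything else falls back to <=>.
comparator = lambda do |a, b|
  if a == b then 0
  elsif a.nil? then -1
  elsif b.nil? then 1
  else a <=> b
  end
end

symbols = ['b', nil, 'a', 'c']
sorted = symbols.sort(&comparator)

p sorted
```

A plain `symbols.sort` would raise here because `nil` cannot be compared with a String, which is the case the comparator is meant to handle.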
578
+
579
+ # Returns the number of states
580
+ def state_count() @states.size end
581
+
582
+ # Returns the number of edges
583
+ def edge_count() @edges.size end
584
+
585
+ #
586
+ # Returns the i-th state of the state list.
587
+ #
588
+ # Raises:
589
+ # - ArgumentError unless i is an Integer
590
+ # - ArgumentError if i is not in [0..state_count)
591
+ #
592
+ def ith_state(i)
593
+ raise(ArgumentError, "Integer expected, #{i} found.", caller)\
594
+ unless Integer === i
595
+ raise(ArgumentError, "Invalid state index #{i}", caller)\
596
+ unless i>=0 and i<state_count
597
+ @states[i]
598
+ end
599
+
600
+ #
601
+ # Returns state associated with the supplied state name, throws an exception if no such state can be found.
602
+ #
603
+ def get_state(name)
604
+ raise(ArgumentError, "String expected, #{name} found.", caller)\
605
+ unless String === name
606
+ result = states.find do |s|
607
+ name == s[:name]
608
+ end
609
+ raise(ArgumentError, "State #{name} was not found", caller)\
610
+ if result.nil?
611
+ result
612
+ end
613
+
614
+ #
615
+ # Returns the i-th states of the state list.
616
+ #
617
+ # Raises:
618
+ # - ArgumentError unless all _i_ are integers
619
+ # - ArgumentError unless all _i_ are in [0..state_count)
620
+ #
621
+ def ith_states(*i)
622
+ i.collect{|j| ith_state(j)}
623
+ end
624
+
625
+ #
626
+ # Returns the i-th edge of the edge list.
627
+ #
628
+ # Raises:
629
+ # - ArgumentError unless i is an Integer
630
+ # - ArgumentError if i is not in [0..edge_count)
631
+ #
632
+ def ith_edge(i)
633
+ raise(ArgumentError, "Integer expected, #{i} found.", caller)\
634
+ unless Integer === i
635
+ raise(ArgumentError, "Invalid edge index #{i}", caller)\
636
+ unless i>=0 and i<edge_count
637
+ @edges[i]
638
+ end
639
+
640
+ #
641
+ # Returns the i-th edges of the edge list.
642
+ #
643
+ # Raises:
644
+ # - ArgumentError unless all _i_ are integers
645
+ # - ArgumentError unless all _i_ are in [0..edge_count)
646
+ #
647
+ def ith_edges(*i)
648
+ i.collect{|j| ith_edge(j)}
649
+ end
650
+
651
+ #
652
+ # Calls block for each state of the automaton state list. States are
653
+ # enumerated in index order.
654
+ #
655
+ def each_state() @states.each {|s| yield s if block_given?} end
656
+
657
+ #
658
+ # Calls block for each edge of the automaton edge list. Edges are
659
+ # enumerated in index order.
660
+ #
661
+ def each_edge() @edges.each {|e| yield e if block_given?} end
662
+
663
+ #
664
+ # Returns an array with incoming edges of _state_. Edges are sorted by symbols
665
+ # if _sorted_ is set to true. If two incoming edges have same symbol, no
666
+ # order is guaranteed between them. Returned array may be modified.
667
+ #
668
+ # If _state_ is an Integer, this method returns the incoming edges of the
669
+ # state'th state in the state list.
670
+ #
671
+ # Raises:
672
+ # - ArgumentError if state is an Integer and state<0 or state>=state_count.
673
+ # - ArgumentError if _state_ is not a valid state for this automaton.
674
+ #
675
+ def in_edges(state, sorted=false) to_state(state).in_edges(sorted) end
676
+
677
+ #
678
+ # Returns an array with outgoing edges of _state_. Edges are sorted by symbols
679
+ # if _sorted_ is set to true. If two outgoing edges have the same symbol, no
680
+ # order is guaranteed between them. Returned array may be modified.
681
+ #
682
+ # If _state_ is an Integer, this method returns the outgoing edges of the
683
+ # state'th state in the state list.
684
+ #
685
+ # Raises:
686
+ # - ArgumentError if state is an Integer and state<0 or state>=state_count.
687
+ # - ArgumentError if state is not a valid state (not a state or not from this
688
+ # automaton)
689
+ #
690
+ def out_edges(state, sorted=false) to_state(state).out_edges(sorted) end
691
+
692
+ #
693
+ # Returns an array with the different symbols appearing on incoming edges of
694
+ # _state_. Returned array does not contain duplicates and may be modified;
695
+ # it is sorted if _sorted_ is set to true.
696
+ #
697
+ # If _state_ is an Integer, this method returns the incoming symbols of the
698
+ # state'th state in the state list.
699
+ #
700
+ # Raises:
701
+ # - ArgumentError if state is an Integer and state<0 or state>=state_count.
702
+ # - ArgumentError if _state_ is not a valid state for this automaton.
703
+ #
704
+ def in_symbols(state, sorted=false) to_state(state).in_symbols(sorted) end
705
+
706
+ #
707
+ # Returns an array with the different symbols appearing on outgoing edges of
708
+ # _state_. Returned array does not contain duplicates and may be modified;
709
+ # it is sorted if _sorted_ is set to true.
710
+ #
711
+ # If _state_ is an Integer, this method returns the outgoing symbols of the
712
+ # state'th state in the state list.
713
+ #
714
+ # Raises:
715
+ # - ArgumentError if state is an Integer and state<0 or state>=state_count.
716
+ # - ArgumentError if state is not a valid state (not a state or not from this
717
+ # automaton)
718
+ #
719
+ def out_symbols(state, sorted=false) to_state(state).out_symbols(sorted) end
720
+
721
+ #
722
+ # Returns an array with adjacent states (along incoming and outgoing edges)
723
+ # of _state_. Returned array does not contain duplicates; it may be modified.
724
+ #
725
+ # If _state_ is an Integer, this method returns the adjacent states of the
726
+ # state'th state in the state list.
727
+ #
728
+ # Raises:
729
+ # - ArgumentError if state is an Integer and state<0 or state>=state_count.
730
+ # - ArgumentError if state is not a valid state (not a state or not from this
731
+ # automaton)
732
+ #
733
+ def adjacent_states(state) to_state(state).adjacent_states() end
734
+
735
+ #
736
+ # Returns an array with adjacent states (along incoming edges) of _state_.
737
+ # Returned array does not contain duplicates; it may be modified.
738
+ #
739
+ # If _state_ is an Integer, this method returns the incoming adjacent states
740
+ # of the state'th state in the state list.
741
+ #
742
+ # Raises:
743
+ # - ArgumentError if state is an Integer and state<0 or state>=state_count.
744
+ # - ArgumentError if state is not a valid state (not a state or not from this
745
+ # automaton)
746
+ #
747
+ def in_adjacent_states(state) to_state(state).in_adjacent_states() end
748
+
749
+ #
750
+ # Returns an array with adjacent states (along outgoing edges) of _state_.
751
+ # Returned array does not contain duplicates; it may be modified.
752
+ #
753
+ # If _state_ is an Integer, this method returns the outgoing adjacent states
754
+ # of the state'th state in the state list.
755
+ #
756
+ # Raises:
757
+ # - ArgumentError if state is an Integer and state<0 or state>=state_count.
758
+ # - ArgumentError if state is not a valid state (not a state or not from this
759
+ # automaton)
760
+ #
761
+ def out_adjacent_states(state) to_state(state).out_adjacent_states() end
762
+
763
+ #
764
+ # Collects all initial states of this Automaton and returns it. Returned array
765
+ # does not contain duplicates and may be modified.
766
+ #
767
+ # This method is epsilon symbol aware (represented with nil) on
768
+ # non-deterministic automata, meaning that it actually computes the set of
769
+ # reachable states from an initial state through strings respecting the
770
+ # <tt>eps*</tt> regular expression, where eps is the epsilon symbol.
771
+ #
772
+ def initial_states
773
+ if @initials.nil? or @initials.empty?
774
+ @initials = compute_initial_states
775
+ end
776
+ @initials
777
+ end
778
+
779
+ #
780
+ # Returns the initial state of the automaton. This method is expected to be used
781
+ # on deterministic automata only. Unlike initial_states, it returns one State
782
+ # instance instead of an Array.
783
+ #
784
+ # When used with a non deterministic automaton, it returns one of the states
785
+ # tagged as initial. Which one is returned must be considered a non
786
+ # deterministic choice. This method is not epsilon symbol aware.
787
+ #
788
+ def initial_state
789
+ initial_states[0]
790
+ end
791
+
792
+ # Internal implementation of initial_states.
793
+ def compute_initial_states()
794
+ initials = @states.select {|s| s.initial?}
795
+ initials.collect{|s| s.epsilon_closure}.flatten.uniq
796
+ end
797
+
798
+ ### public write section #####################################################
799
+ public
800
+
801
+ #
802
+ # Adds a new state.
803
+ #
804
+ # Arguments:
805
+ # - data: user-data to attach to the state (see Automaton documentation).
806
+ #
807
+ # Raises:
808
+ # - ArgumentError if _data_ is not a valid state data.
809
+ #
810
+ def add_state(data={})
811
+ data = to_valid_state_data(data)
812
+
813
+ # create new state, add it to state-list
814
+ state = State.new(self, state_count, data)
815
+ @states << state
816
+
817
+ # let the automaton know that something has changed
818
+ state_changed(:state_added, state)
819
+
820
+ # return created state
821
+ state
822
+ end
823
+ alias :create_state :add_state
824
+
825
+ #
826
+ # Adds _n_ new states in the automaton. Created states are returned as an
827
+ # ordered array (order of states according to their index in state list).
828
+ #
829
+ # _data_ is duplicated for each created state.
830
+ #
831
+ def add_n_states(n, data={})
832
+ created = []
833
+ n.times do |i|
834
+ created << add_state(block_given? ? data.merge(yield(i)) : data.dup)
835
+ end
836
+ created
837
+ end
838
+ alias :create_n_states :add_n_states
839
+
840
+ #
841
+ # Adds a new edge, connecting _from_ and _to_ states of the automaton.
842
+ #
843
+ # Arguments:
844
+ # - from: either a State or a valid state index (Integer).
845
+ # - to: either a State or a valid state index (Integer).
846
+ # - data: user data to attach to the created edge (see Automaton documentation).
847
+ #
848
+ # Raises:
849
+ # - ArgumentError if _from_ is an Integer but not in [0..state_count)
849
+ # - ArgumentError if _to_ is an Integer but not in [0..state_count)
851
+ # - ArgumentError if _from_ is not a valid state for this automaton.
852
+ # - ArgumentError if _to_ is not a valid state for this automaton.
853
+ # - ArgumentError if _data_ is not a valid edge data.
854
+ #
855
+ def add_edge(from, to, data)
856
+ from, to, data = to_state(from), to_state(to), to_valid_edge_data(data)
857
+
858
+ # create edge, install it, add it to edge-list
859
+ edge = Edge.new(self, edge_count, data, from, to)
860
+ @edges << edge
861
+ from.send(:add_outgoing_edge, edge)
862
+ to.send(:add_incoming_edge, edge)
863
+
864
+ # let automaton know that something has changed
865
+ state_changed(:edge_added, edge)
866
+
867
+ # return created edge
868
+ edge
869
+ end
870
+ alias :create_edge :add_edge
871
+ alias :connect :add_edge
872
+
873
+ #
874
+ # Adds all states and transitions (as copies) from a different automaton.
875
+ # None of the added states are made initial. Returns the (associated state
876
+ # of the) initial state of the added part.
877
+ #
878
+ # This method is deprecated and should not be used anymore. Use dup instead.
879
+ #
880
+ # In order to ensure that names of the new states do not clash with names of
881
+ # existing states, state names may have to be removed from added states;
882
+ # this is the case if _clear_names_ is set to true.
883
+ #
884
+ def add_automaton(fa, clear_names = true)
885
+ initial = nil
886
+ fa.dup(self){|source,target|
887
+ initial = target if target.initial?
888
+ target[:initial] = false
889
+ target[:name] = nil if clear_names
890
+ }
891
+ initial
892
+ end
893
+
894
+ #
895
+ # Constructs and returns a copy of this automaton.
896
+ #
897
+ # This copy can be modified in whatever way without affecting the original
898
+ # automaton.
899
+ #
900
+ def dup(fa = Automaton.new)
901
+ added = states.collect do |source|
902
+ target = fa.add_state(source.data.dup)
903
+ yield(source, target) if block_given?
904
+ target
905
+ end
906
+ edges.each do |edge|
907
+ from, to = added[edge.from.index], added[edge.to.index]
908
+ fa.connect(from, to, edge.data.dup)
909
+ end
910
+ fa
911
+ end
912
+
913
+ #
914
+ # Drops a state of the automaton, as well as all connected edges to that state.
915
+ # If _state_ is an integer, the state-th state of the state list is removed.
916
+ # This method returns the automaton itself.
917
+ #
918
+ # Raises:
919
+ # - ArgumentError if _state_ is an Integer but not in [0..state_count)
919
+ # - ArgumentError if _state_ is not a valid state for this automaton.
921
+ #
922
+ def drop_state(state)
923
+ state = to_state(state)
924
+ # remove edges first: drop_edges ensures that edge list is coherent
925
+ drop_edges(*(state.in_edges + state.out_edges).uniq)
926
+
927
+ # remove state now and renumber
928
+ @states.delete_at(state.index)
929
+ state.index.upto(state_count-1) do |i|
930
+ @states[i].send(:index=, i)
931
+ end
932
+ state.send(:index=, -1)
933
+
934
+ state_changed(:state_dropped, state)
935
+ self
936
+ end
937
+ alias :delete_state :drop_state
938
+
939
+ #
940
+ # Drops all states passed as parameter as well as all their connected edges.
941
+ # Arguments may be state instances, as well as valid state indices. Duplicates
942
+ # are supported. If some state argument is not valid, this method raises an
943
+ # error and has no effect on the automaton.
944
+ #
945
+ # Raises:
946
+ # - ArgumentError if one state in _states_ is not a valid state of this
947
+ # automaton.
948
+ #
949
+ def drop_states(*states)
950
+ # check states first
951
+ states = states.collect{|s| to_state(s)}.uniq.sort
952
+ edges = states.collect{|s| (s.in_edges + s.out_edges).uniq}.flatten.uniq.sort
953
+
954
+ # Remove all edges, we do not use drop_edges to avoid spending too much
955
+ # time reindexing edges. Moreover, we can do it that way because we take
956
+ # edges in reverse indexing order (has been sorted previously)
957
+ until edges.empty?
958
+ edge = edges.pop
959
+ edge.source.send(:drop_outgoing_edge,edge)
960
+ edge.target.send(:drop_incoming_edge,edge)
961
+ @edges.delete_at(edge.index)
962
+ edge.send(:index=, -1)
963
+ state_changed(:edge_dropped, edge)
964
+ end
965
+
966
+ # Remove all states, same kind of hack is used
967
+ until states.empty?
968
+ state = states.pop
969
+ @states.delete_at(state.index)
970
+ state.send(:index=, -1)
971
+ state_changed(:state_dropped, state)
972
+ end
973
+
974
+ # sanitize state and edge lists
975
+ @states.each_with_index {|s,i| s.send(:index=,i)}
976
+ @edges.each_with_index {|e,i| e.send(:index=,i)}
977
+
978
+ self
979
+ end
980
+
981
+ #
982
+ # Drops an edge in the automaton. If _edge_ is an integer, the edge-th edge
983
+ # of the edge list is removed. This method returns the automaton itself.
984
+ #
985
+ # Raises:
986
+ # - ArgumentError if _edge_ is an Integer but not in [0..edge_count)
987
+ # - ArgumentError if _edge_ is not a valid edge for this automaton.
988
+ #
989
+ def drop_edge(edge)
990
+ edge = to_edge(edge)
991
+ @edges.delete_at(edge.index)
992
+ edge.from.send(:drop_outgoing_edge,edge)
993
+ edge.to.send(:drop_incoming_edge,edge)
994
+ edge.index.upto(edge_count-1) do |i|
995
+ @edges[i].send(:index=, i)
996
+ end
997
+ edge.send(:index=,-1)
998
+ state_changed(:edge_dropped, edge)
999
+ self
1000
+ end
1001
+ alias :delete_edge :drop_edge
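The renumbering step in `drop_edge` is easier to follow on a plain array. The sketch below is standalone and uses a hypothetical `Item` class (an assumption made for this example) that only carries a label and its position in the list; it does not touch the gem's `Edge`.

```ruby
# Hypothetical stand-in for an edge: a label plus its index in the list.
class Item
  attr_reader :label
  attr_accessor :index

  def initialize(label, index)
    @label = label
    @index = index
  end
end

items = %w[e0 e1 e2 e3].each_with_index.map { |label, i| Item.new(label, i) }

# Remove the item at position 1. Every later item has shifted left by one,
# so only positions from the dropped index onwards need a new number.
dropped = items.delete_at(1)
dropped.index.upto(items.size - 1) { |i| items[i].index = i }

# Mark the removed item as detached, as drop_edge does with -1.
dropped.index = -1

p items.map { |item| [item.label, item.index] }
```

Renumbering only the tail keeps a single drop cheap. When many elements go at once, `drop_edges` and `drop_states` above instead renumber the whole list in one final pass.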
1002
+
1003
+ #
1004
+ # Drops all edges passed as parameters. Arguments may be edge objects,
1005
+ # as well as valid edge indices. Duplicates are supported. If some edge
1006
+ # argument is not valid, this method raises an error and has no effect on
1007
+ # the automaton.
1008
+ #
1009
+ # Raises:
1010
+ # - ArgumentError if one edge in _edges_ is not a valid edge of this automaton.
1011
+ #
1012
+ def drop_edges(*edges)
1013
+ # check edges first
1014
+ edges = edges.collect{|e| to_edge(e)}.uniq
1015
+
1016
+ # remove all edges
1017
+ edges.each do |e|
1018
+ @edges.delete(e)
1019
+ e.from.send(:drop_outgoing_edge,e)
1020
+ e.to.send(:drop_incoming_edge,e)
1021
+ e.send(:index=, -1)
1022
+ state_changed(:edge_dropped, e)
1023
+ end
1024
+ @edges.each_with_index do |e,i|
1025
+ e.send(:index=,i)
1026
+ end
1027
+
1028
+ self
1029
+ end
1030
+ alias :delete_edges :drop_edges
1031
+
1032
+ ### protected section ########################################################
1033
+ protected
1034
+
1035
+ #
1036
+ # Converts a _state_ argument to a valid State of this automaton.
1037
+ # There are three ways to refer to a state, by position in the internal
1038
+ # collection of states, using an instance of State and using a name of a
1039
+ # state (represented with a String).
1040
+ #
1041
+ # Raises:
1042
+ # - ArgumentError if state is an Integer and state<0 or state>=state_count.
1043
+ # - ArgumentError if state is not a valid state (not a state or not from this
1044
+ # automaton)
1045
+ #
1046
+ def to_state(state)
1047
+ case state
1048
+ when State
1049
+ return state if state.automaton==self and state==@states[state.index]
1050
+ raise ArgumentError, "Not a state of this automaton", caller
1051
+ when Integer
1052
+ return ith_state(state)
1053
+ when String
1054
+ result = get_state(state)
1055
+ return result unless result.nil?
1056
+ end
1057
+ raise ArgumentError, "Invalid state argument #{state}", caller
1058
+ end
1059
+
1060
+ #
1061
+ # Converts an _edge_ argument to a valid Edge of this automaton.
1062
+ #
1063
+ # Raises:
1064
+ # - ArgumentError if _edge_ is an Integer but not in [0..edge_count)
1065
+ # - ArgumentError if _edge_ is not a valid edge (not an edge or not from this
1066
+ # automaton)
1067
+ #
1068
+ def to_edge(edge)
1069
+ case edge
1070
+ when Edge
1071
+ return edge if edge.automaton==self and edge==@edges[edge.index]
1072
+ raise ArgumentError, "Not an edge of this automaton", caller
1073
+ when Integer
1074
+ return ith_edge(edge)
1075
+ end
1076
+ raise ArgumentError, "Invalid edge argument #{edge}", caller
1077
+ end
1078
+
1079
+ #
1080
+ # Checks if a given user-data contains enough information to be attached to
1081
+ # a given state. Returns the data if ok.
1082
+ #
1083
+ # Raises:
1084
+ # - ArgumentError if data is not considered a valid state data.
1085
+ #
1086
+ def to_valid_state_data(data)
1087
+ raise(ArgumentError,
1088
+ "User data should be an Hash", caller) unless Hash===data
1089
+ data
1090
+ end
1091
+
1092
+ #
1093
+ # Checks if a given user-data contains enough information to be attached to
1094
+ # a given edge. Returns the data if ok.
1095
+ #
1096
+ # Raises:
1097
+ # - ArgumentError if data is not considered a valid edge data.
1098
+ #
1099
+ def to_valid_edge_data(data)
1100
+ return {:symbol => data} if data.nil? or data.is_a?(String)
1101
+ raise(ArgumentError,
1102
+ "User data should be an Hash", caller) unless Hash===data
1103
+ raise(ArgumentError,
1104
+ "User data should contain a :symbol attribute.",
1105
+ caller) unless data.has_key?(:symbol)
1106
+ raise(ArgumentError,
1107
+ "Edge :symbol attribute cannot be an array.",
1108
+ caller) if Array===data[:symbol]
1109
+ data
1110
+ end
1111
+
1112
+ ### public sections with useful utilities ####################################
1113
+ public
1114
+
1115
+ # Returns true if the automaton is deterministic, false otherwise
1116
+ def deterministic?
1117
+ if @deterministic.nil?
1118
+ @deterministic = @states.all?{|s| s.deterministic?}
1119
+ end
1120
+ @deterministic
1121
+ end
1122
+
1123
+ ### public & protected sections about alphabet ###############################
1124
+ protected
1125
+
1126
+ # Deduces the alphabet from the automaton edges.
1127
+ def deduce_alphabet
1128
+ edges.collect{|e| e.symbol}.uniq.compact.sort
1129
+ end
1130
+
1131
+ public
1132
+
1133
+ # Returns the alphabet of the automaton.
1134
+ def alphabet
1135
+ @alphabet || deduce_alphabet
1136
+ end
1137
+
1138
+ # Sets the alphabet of the automaton. _alph_ is expected to be an array without
1139
+ # nil or duplicates. This method raises an ArgumentError otherwise. Such an
1140
+ # error is also raised if a symbol used on the automaton edges is not included
1141
+ # in _alph_.
1142
+ def alphabet=(alph)
1143
+ raise ArgumentError, "Invalid alphabet" unless alph.uniq.compact.size==alph.size
1144
+ raise ArgumentError, "Invalid alphabet" unless deduce_alphabet.reject{|s| alph.include?(s)}.empty?
1145
+ @alphabet = alph.sort
1146
+ end
1147
+
1148
+ ### public section about coercions ###########################################
1149
+ public
1150
+
1151
+ # Returns a finite automaton
1152
+ def to_fa
1153
+ self
1154
+ end
1155
+
1156
+ # Returns a deterministic finite automaton
1157
+ def to_dfa
1158
+ self.deterministic? ? self : self.determinize
1159
+ end
1160
+
1161
+ # Returns a canonical deterministic finite automaton
1162
+ def to_cdfa
1163
+ cdfa = self
1164
+ cdfa = cdfa.determinize unless self.deterministic?
1165
+ cdfa = cdfa.complete unless self.complete?
1166
+ cdfa = cdfa.minimize
1167
+ cdfa
1168
+ end
1169
+
1170
+ # Returns a regular language
1171
+ def to_reglang
1172
+ RegLang.new(self)
1173
+ end
1174
+
1175
+ ### public section about dot utilities #######################################
1176
+ protected
1177
+
1178
+ #
1179
+ # Converts a hash of attributes (typically automaton, state or edge attributes)
1180
+ # to the body of a <code>[...]</code> dot string. The brackets themselves are added by the caller.
1181
+ #
1182
+ def attributes2dot(attrs)
1183
+ buffer = ""
1184
+ attrs.keys.sort{|k1,k2| k1.to_s <=> k2.to_s}.each do |key|
1185
+ buffer << " " unless buffer.empty?
1186
+ value = attrs[key].to_s.gsub('"','\"')
1187
+ buffer << "#{key}=\"#{value}\""
1188
+ end
1189
+ buffer
1190
+ end
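The attribute formatting can be checked on its own. This is a standalone sketch, not the gem's method: `attributes_to_dot` is a hypothetical helper that follows the same steps (keys sorted by their string form, double quotes escaped, pairs joined by spaces), and it uses the block form of `gsub` so that the replacement text is taken literally.

```ruby
# Hypothetical helper mirroring the steps above: sort keys by their string
# form, escape double quotes in values, and join key="value" pairs.
def attributes_to_dot(attrs)
  attrs.keys.sort_by(&:to_s).map do |key|
    # Block form of gsub: the returned string (a backslash followed by a
    # double quote) is inserted as-is, with no backreference handling.
    value = attrs[key].to_s.gsub('"') { '\\"' }
    "#{key}=\"#{value}\""
  end.join(' ')
end

line = attributes_to_dot(:shape => 'circle', :label => 'say "a"', :width => 0.6)
puts "0 [#{line}];"
```

The surrounding brackets are written by the caller, in the same way `to_dot` wraps the result in `[...]` when it emits a node or an edge.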
1191
+
1192
+ public
1193
+
1194
+ #
1195
+ # Generates a dot output from an automaton. The rewriter block takes
1196
+ # two arguments: the first one is a Markable instance (graph, state or
1197
+ # edge), the second one indicates which kind of element is passed (through
1198
+ # :automaton, :state or :edge symbol). The rewriter is expected to return a
1199
+ # hash-like object providing dot attributes for the element.
1200
+ #
1201
+ # When no rewriter is provided, a default one is used, providing
1202
+ # the following behavior:
1203
+ # - on :automaton
1204
+ #
1205
+ # {:rankdir => "LR"}
1206
+ #
1207
+ # - on :state
1208
+ #
1209
+ # {:shape => "doublecircle/circle" (following accepting?),
1210
+ # :style => "filled",
1211
+ # :fillcolor => "green/red/white" (if initial?/error?/else, respectively)}
1212
+ #
1213
+ # - on :edge
1214
+ #
1215
+ # {:label => "#{edge.symbol}"}
1216
+ #
1217
+ def to_dot(&rewriter)
1218
+ unless rewriter
1219
+ to_dot do |elm, kind|
1220
+ case kind
1221
+ when :automaton
1222
+ {:pack => true, :rankdir => "LR", :ranksep => 0, :margin => 0}
1223
+ when :state
1224
+ {:shape => (elm.accepting? ? "doublecircle" : "circle"),
1225
+ :style => "filled",
1226
+ :color => "black",
1227
+ :fillcolor => (elm.initial? ? "green" : (elm.error? ? "red" : "white")),
1228
+ :width => 0.6, :height => 0.6, :fixedsize => true
1229
+ }
1230
+ when :edge
1231
+ {:label => elm.symbol.nil? ? '' : elm.symbol.to_s,
1232
+ :arrowsize => 0.7}
1233
+ end
1234
+ end
1235
+ else
1236
+ buffer = "digraph G {\n"
1237
+ attrs = attributes2dot(rewriter.call(self, :automaton))
1238
+ buffer << " graph [#{attrs}];\n"
1239
+ self.depth
1240
+ states.sort{|s1,s2| s1[:depth] <=> s2[:depth]}.each do |s|
1241
+ s.remove_mark(:depth)
1242
+ attrs = attributes2dot(rewriter.call(s, :state))
1243
+ buffer << " #{s.index} [#{attrs}];\n"
1244
+ end
1245
+ edges.each do |e|
1246
+ attrs = attributes2dot(rewriter.call(e, :edge))
1247
+ buffer << " #{e.source.index} -> #{e.target.index} [#{attrs}];\n"
1248
+ end
1249
+ buffer << "}\n"
1250
+ end
1251
+ end
1252
+
1253
+ ### public section about adl utilities #######################################
1254
+ public
1255
+
1256
+ # Prints this automaton in ADL format
1257
+ def to_adl(buffer = "")
1258
+ Stamina::ADL.print_automaton(self, buffer)
1259
+ end
1260
+
1261
+ ### public section about reordering ##########################################
1262
+ public
1263
+
1264
+ # Uses a comparator block to reorder the state list.
1265
+ def order_states(&block)
1266
+ raise ArgumentError, "A comparator block must be given" unless block_given?
1267
+ raise ArgumentError, "A comparator block of arity 2 must be given" unless block.arity==2
1268
+ @states.sort!(&block)
1269
+ @states.each_with_index{|s,i| s.send(:index=, i)}
1270
+ self
1271
+ end
1272
+
1273
+ ### protected section about changes ##########################################
1274
+ protected
1275
+
1276
+ #
1277
+ # Fired by write methods when the automaton changes.
1278
+ #
1279
+ def state_changed(what, infos)
1280
+ @initials = nil
1281
+ @deterministic = nil
1282
+ end
1283
+
1284
+ protected :compute_initial_states
1285
+
1286
+ DUM = Automaton.new{ add_state(:initial => true, :accepting => false) }
1287
+ DEE = Automaton.new{ add_state(:initial => true, :accepting => true) }
1288
+ end # class Automaton
1289
+
1290
+ end # module Stamina
1291
+ require_relative 'automaton/walking'
1292
+ require_relative 'automaton/complete'
1293
+ require_relative 'automaton/complement'
1294
+ require_relative 'automaton/strip'
1295
+ require_relative 'automaton/equivalence'
1296
+ require_relative 'automaton/determinize'
1297
+ require_relative 'automaton/minimize'
1298
+ require_relative 'automaton/metrics'
1299
+ require_relative 'automaton/compose'
1300
+ require_relative 'automaton/hide'