stamina-core 0.5.0

Files changed (36)
  1. data/CHANGELOG.md +78 -0
  2. data/LICENCE.md +22 -0
  3. data/lib/stamina-core/stamina-core.rb +1 -0
  4. data/lib/stamina-core/stamina/adl.rb +298 -0
  5. data/lib/stamina-core/stamina/automaton.rb +1300 -0
  6. data/lib/stamina-core/stamina/automaton/complement.rb +26 -0
  7. data/lib/stamina-core/stamina/automaton/complete.rb +36 -0
  8. data/lib/stamina-core/stamina/automaton/compose.rb +111 -0
  9. data/lib/stamina-core/stamina/automaton/determinize.rb +104 -0
  10. data/lib/stamina-core/stamina/automaton/equivalence.rb +57 -0
  11. data/lib/stamina-core/stamina/automaton/hide.rb +41 -0
  12. data/lib/stamina-core/stamina/automaton/metrics.rb +77 -0
  13. data/lib/stamina-core/stamina/automaton/minimize.rb +23 -0
  14. data/lib/stamina-core/stamina/automaton/minimize/hopcroft.rb +118 -0
  15. data/lib/stamina-core/stamina/automaton/minimize/pitchies.rb +130 -0
  16. data/lib/stamina-core/stamina/automaton/strip.rb +16 -0
  17. data/lib/stamina-core/stamina/automaton/walking.rb +361 -0
  18. data/lib/stamina-core/stamina/command.rb +38 -0
  19. data/lib/stamina-core/stamina/command/adl2dot.rb +82 -0
  20. data/lib/stamina-core/stamina/command/help.rb +23 -0
  21. data/lib/stamina-core/stamina/command/robustness.rb +21 -0
  22. data/lib/stamina-core/stamina/command/run.rb +84 -0
  23. data/lib/stamina-core/stamina/core.rb +11 -0
  24. data/lib/stamina-core/stamina/dsl.rb +6 -0
  25. data/lib/stamina-core/stamina/dsl/automata.rb +23 -0
  26. data/lib/stamina-core/stamina/dsl/core.rb +14 -0
  27. data/lib/stamina-core/stamina/engine.rb +32 -0
  28. data/lib/stamina-core/stamina/engine/context.rb +35 -0
  29. data/lib/stamina-core/stamina/errors.rb +26 -0
  30. data/lib/stamina-core/stamina/ext/math.rb +19 -0
  31. data/lib/stamina-core/stamina/loader.rb +3 -0
  32. data/lib/stamina-core/stamina/markable.rb +42 -0
  33. data/lib/stamina-core/stamina/utils.rb +1 -0
  34. data/lib/stamina-core/stamina/utils/decorate.rb +81 -0
  35. data/lib/stamina-core/stamina/version.rb +14 -0
  36. metadata +93 -0
@@ -0,0 +1,78 @@
1
+ # 0.5.0 / FIX ME
2
+
3
+ * Breaking features.
4
+
5
+ * Support for ruby 1.8.7 has been definitively removed.
6
+
7
+ * Major enhancements
8
+
9
+ * The project has been split into different sub gems (core, induction and gui). This
10
+ implies a lot of internal changes, but the public API has not been affected. A main
11
+ 'stamina' gem automatically includes all sub gems so previous behavior is guaranteed.
12
+
13
+ * Minor enhancements
14
+ * Fixed a bug with bundler usage in main stamina binary
15
+ * adl2dot command now supports samples as input in addition to automata. In that case,
16
+ the dot result models a PTA (prefix tree acceptor)
17
+ * Added --png to 'stamina adl2dot'
18
+
19
+ # 0.4.0 / 2011-05-01
20
+
21
+ * Major Enhancements
22
+
23
+ * Added Automaton#to_adl as a shortcut for Stamina::ADL::print_automaton(...)
24
+ * Added Sample#to_pta taken from Induction::Commons
25
+ * Added Automaton completion (all strings parsable) under Automaton#complete[!?]
26
+ * Added Automaton stripping (removal of unreachable states) under Automaton#strip[!]
27
+ * Added Automaton minimization (Hopcroft + Pitchies) under Automaton#minimize
28
+ * Added Abbadingo generators under Abbadingo::RandomDFA and Abbadingo::RandomSample
29
+ * Added a main 'stamina' command relying on Quickl. classify/adl2dot commands become
30
+ subcommands of stamina itself (see stamina --help for a list of available commands).
31
+ Induction commands (rpni and redblue) are now handled by a 'stamina infer' with
32
+ options.
33
+ * Error states are now correctly handled in ADL::parse and ADL::flush
34
+ * RedBlue has been renamed as BlueFringe everywhere (red_?blue -> blue_fringe)
35
+
36
+ * Minor Enhancements
37
+ * Added a few optimizations here and there
38
+
39
+ * Bug fixes
40
+
41
+ * Fixed a bug in Automaton#depth when some states are unreachable
42
+
43
+ # 0.3.1 / 2011-03-24
44
+
45
+ * Major Enhancements
46
+
47
+ * Implemented the decoration algorithm of Damas10, allowing to decorate states
48
+ with information propagated from states to states until a fixpoint is reached.
49
+ * Added Automaton::Metrics module, automatically included, with useful metrics
50
+ like automaton depth, accepting ratio and so on.
51
+ * Added Scoring module and Classifier#classification_scoring(sample) method
52
+ with common measures from information retrieval.
53
+
54
+ * On the devel side
55
+
56
+ * Moved specific automaton tests under test/stamina/automaton/...
57
+
58
+ # 0.3.0 / 2011-03-24
59
+
60
+ * On the devel side
61
+
62
+ * The project structure is now handled by Noe
63
+ * Ensures that tests are correctly executed under ruby 1.9.2
64
+
65
+
66
+ # 0.2.2 / 2010-10-22
67
+
68
+ * Major Enhancements
69
+
70
+ * Sample#<< does not detect inconsistencies anymore, to ensure a linear method instead of a quadratic one.
71
+
72
+ * On the devel side
73
+
74
+ * Fixes a bug in Rakefile that led to test failures under ruby 1.8.7
75
+
76
+ # 0.2.1 / 2010-05-01
77
+
78
+ * Main public version for the official competition, extracted from private SVN.
@@ -0,0 +1,22 @@
1
+ The MIT License
2
+
3
+ Copyright (c) 2008-2009 University of Louvain
4
+ (Universite catholique de Louvain-la-Neuve, Belgium)
5
+
6
+ Permission is hereby granted, free of charge, to any person obtaining a copy
7
+ of this software and associated documentation files (the "Software"), to deal
8
+ in the Software without restriction, including without limitation the rights
9
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
10
+ copies of the Software, and to permit persons to whom the Software is
11
+ furnished to do so, subject to the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be included in
14
+ all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
17
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
18
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
19
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
20
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
21
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
22
+ THE SOFTWARE.
@@ -0,0 +1 @@
1
+ require_relative 'stamina/core'
@@ -0,0 +1,298 @@
1
+ module Stamina
2
+ #
3
+ # Automaton Description Language module. This module provides parsing and
4
+ # printing methods for automata and samples. Documentation of the file format
5
+ # used for an automaton is given in parse_automaton; file format for samples is
6
+ # documented in parse_sample.
7
+ #
8
+ # Methods of this module are not intended to be included by a class but invoked
9
+ # on the module instead:
10
+ #
11
+ # begin
12
+ # dfa = Stamina::ADL.parse_automaton_file("my_automaton.adl")
13
+ # rescue Stamina::ADL::ParseError => ex
14
+ # puts "Oops, the ADL automaton file seems corrupted..."
15
+ # end
16
+ #
17
+ # == Detailed API
18
+ module ADL
19
+
20
+ #################################################################################
21
+ # Automaton Section #
22
+ #################################################################################
23
+
24
+ #
25
+ # Parses a given automaton description and returns an Automaton instance.
26
+ #
27
+ # Raises:
28
+ # - ArgumentError unless _descr_ is an IO object or a String.
29
+ # - ADL::ParseError if the ADL automaton format is not respected.
30
+ #
31
+ # ADL provides a really simple grammar to describe automata. Here is a succinct
32
+ # example (full documentation of the ADL automaton grammar can be found in
33
+ # the self-documenting example/adl/automaton.adl file).
34
+ #
35
+ # # Some header comments: tool which has generated this automaton,
36
+ # # maybe a date or other tool options ...
37
+ # # here: 'this automaton accepts the a(ba)* regular language'
38
+ # 2 2
39
+ # 0 true false
40
+ # 1 false true
41
+ # 0 1 a
42
+ # 1 0 b
43
+ #
44
+ def self.parse_automaton(descr)
45
+ automaton = nil
46
+ ADL::to_io(descr) do |io|
47
+ state_count, edge_count = nil, nil
48
+ state_read, edge_read = 0, 0
49
+ states = {}
50
+ mode = :header
51
+
52
+ automaton = Automaton.new do |fa|
53
+ # parse each description line
54
+ line_number = 1
55
+ io.each_line do |l|
56
+ index = l.index('#')
57
+ l = l[0,index] if index
58
+ l = l.strip
59
+ next if l.empty? or l[0,1]=='#'
60
+
61
+ case mode
62
+ when :header
63
+ # looking for |state_count edge_count|
64
+ raise(ADL::ParseError,
65
+ "Parse error line #{line_number}: 'state_count edge_count' expected, "\
66
+ "'#{l}' found.") unless /^(\d+)\s+(\d+)$/ =~ l
67
+ state_count, edge_count = $1.to_i, $2.to_i
68
+ mode = :states
69
+
70
+ when :states
71
+ # looking for |number initial accepting|
72
+ raise(ADL::ParseError,
73
+ "Parse error line #{line_number}: state definition expected, "\
74
+ "'#{l}' found.") unless /^(\S+)\s+(true|false)\s+(true|false)(\s+(true|false))?$/ =~ l
75
+ id, initial, accepting, error = $1, $2, $3, $5
76
+ initial, accepting, error = ("true"==initial), ("true"==accepting), ("true"==error)
77
+
78
+ state = fa.add_state(:initial => initial, :accepting => accepting, :error => error)
79
+ state[:name]=id.to_s
80
+ states[id] = state
81
+
82
+ state_read += 1
83
+ mode = (edge_count==0 ? :end : :edges) if state_read==state_count
84
+
85
+ when :edges
86
+ # looking for |source target symbol|
87
+ raise(ADL::ParseError,
88
+ "Parse error line #{line_number}: edge definition expected, "\
89
+ "'#{l}' found.") unless /^(\S+)\s+(\S+)\s+(\S+)$/ =~ l
90
+ source, target, symbol = $1, $2, $3
91
+ raise(ADL::ParseError,
92
+ "Parse error line #{line_number}: no such state #{source}") \
93
+ unless states[source]
94
+ raise(ADL::ParseError,
95
+ "Parse error line #{line_number}: no such state #{target}") \
96
+ unless states[target]
97
+
98
+ fa.connect(states[source], states[target], {:symbol => symbol})
99
+
100
+ edge_read += 1
101
+ mode = :end if edge_read==edge_count
102
+
103
+ when :end
104
+ raise(ADL::ParseError,
105
+ "Parse error line #{line_number}: trailing data found '#{l}'")
106
+
107
+ end # case mode
108
+
109
+ line_number += 1
110
+ end
111
+
112
+ raise(ADL::ParseError, "Parse error: #{state_count} states announced, "\
113
+ "#{state_read} found.") if state_count != state_read
114
+ raise(ADL::ParseError, "Parse error: #{edge_count} edges announced, "\
115
+ "#{edge_read} found.") if edge_count != edge_read
116
+
117
+ end # Automaton.new
118
+ end
119
+ return automaton
120
+ end # def self.parse_automaton
121
+
122
+ #
123
+ # Parses an automaton file _f_.
124
+ #
125
+ # Shortcut for:
126
+ # File.open(f, 'r') do |io|
127
+ # Stamina::ADL.parse_automaton(io)
128
+ # end
129
+ #
130
+ def self.parse_automaton_file(f)
131
+ automaton = nil
132
+ File.open(f) do |file|
133
+ automaton = ADL::parse_automaton(file)
134
+ end
135
+ automaton
136
+ end
137
+
138
+ #
139
+ # Prints an automaton to a buffer (responding to <code>:&lt;&lt;</code>) in ADL
140
+ # format. Returns the buffer itself.
141
+ #
142
+ def self.print_automaton(fa, buffer="")
143
+ buffer << "#{fa.state_count.to_s} #{fa.edge_count.to_s}" << "\n"
144
+ fa.states.each do |s|
145
+ buffer << "#{s.index.to_s} #{s.initial?} #{s.accepting?}" << (s.error? ? " true" : "") << "\n"
146
+ end
147
+ fa.edges.each do |e|
148
+ buffer << "#{e.source.index.to_s} #{e.target.index.to_s} #{e.symbol.to_s}" << "\n"
149
+ end
150
+ buffer
151
+ end
152
+
153
+ #
154
+ # Prints an automaton to a file whose path is provided.
155
+ #
156
+ # Shortcut for:
157
+ # File.open(file, 'w') do |io|
158
+ # print_automaton(fa, io)
159
+ # end
160
+ #
161
+ def self.print_automaton_to_file(fa, file)
162
+ File.open(file, 'w') do |io|
163
+ print_automaton(fa, io)
164
+ end
165
+ end
166
+
167
+ #################################################################################
168
+ # String and Sample Section #
169
+ #################################################################################
170
+
171
+ #
172
+ # Parses an input string _str_ and returns an InputString instance. Format of
173
+ # input strings is documented in parse_sample. _str_ is required to be a ruby
174
+ # String.
175
+ #
176
+ # Raises:
177
+ # - ADL::ParseError if the ADL string format is not respected.
178
+ #
179
+ def self.parse_string(str)
180
+ symbols = str.split(' ')
181
+ case symbols[0]
182
+ when '+'
183
+ symbols.shift
184
+ InputString.new symbols, true, false
185
+ when '-'
186
+ symbols.shift
187
+ InputString.new symbols, false, false
188
+ when '?'
189
+ symbols.shift
190
+ InputString.new symbols, nil, false
191
+ else
192
+ raise ADL::ParseError, "Invalid string format #{str}", caller
193
+ end
194
+ end
195
+
196
+ #
197
+ # Parses the sample provided by _descr_. When a block is provided, yields it with
198
+ # InputString instances and ignores the sample argument. Otherwise, fills the sample
199
+ # (any object responding to <code><<</code>) with string, creating a fresh new
200
+ # one (as a Sample instance) if sample is nil.
201
+ #
202
+ # ADL provides a really simple grammar to describe samples (here is a succinct
203
+ # example, the full documentation of the sample grammar can be found in the
204
+ # self-documenting example/adl/sample.adl file):
205
+ #
206
+ # #
207
+ # # Some header comments: tool which has generated this sample,
208
+ # # maybe a date or other tool options ...
209
+ # # here: 'this sample is characteristic for the a(ba)* regular language'
210
+ # #
211
+ # # Positive, Negative, Unlabeled strings begin with +, -, ?, respectively
212
+ # # Empty lines and lines beginning with # are simply ignored.
213
+ # #
214
+ # -
215
+ # + a
216
+ # - a b
217
+ # + a b a
218
+ #
219
+ # Raises:
220
+ # - ArgumentError unless _descr_ argument is an IO object or a String.
221
+ # - ADL::ParseError if the ADL sample format is not respected.
222
+ # - InconsistencyError if the sample is not consistent (see Sample)
223
+ #
224
+ def self.parse_sample(descr, sample=nil)
225
+ sample = Sample.new if (sample.nil? and not block_given?)
226
+ ADL::to_io(descr) do |io|
227
+ io.each_line do |l|
228
+ l = l.strip
229
+ next if l.empty? or l[0,1]=='#'
230
+ if sample.nil? and block_given?
231
+ yield parse_string(l)
232
+ else
233
+ sample << parse_string(l)
234
+ end
235
+ end
236
+ end
237
+ sample
238
+ end
239
+
240
+ #
241
+ # Parses a sample file _f_.
242
+ #
243
+ # Shortcut for:
244
+ # File.open(f) do |file|
245
+ # sample = ADL::parse_sample(file, sample)
246
+ # end
247
+ #
248
+ def self.parse_sample_file(f, sample=nil)
249
+ File.open(f) do |file|
250
+ sample = ADL::parse_sample(file, sample)
251
+ end
252
+ sample
253
+ end
254
+
255
+ #
256
+ # Prints a sample in ADL format on a buffer. Sample argument is expected to be
257
+ # an object responding to each, yielding InputString instances. Buffer is expected
258
+ # to be an object responding to <code><<</code>.
259
+ #
260
+ def self.print_sample(sample, buffer="")
261
+ sample.each do |str|
262
+ buffer << str.to_s << "\n"
263
+ end
264
+ end
265
+
266
+ #
267
+ # Prints a sample in a file.
268
+ #
269
+ # Shortcut for:
270
+ # File.open(file, 'w') do |io|
271
+ # print_sample(sample, io)
272
+ # end
273
+ #
274
+ def self.print_sample_in_file(sample, file)
275
+ File.open(file, 'w') do |f|
276
+ print_sample(sample, f)
277
+ end
278
+ end
279
+
280
+ ### private section ##########################################################
281
+ private
282
+
283
+ #
284
+ # Converts a parsable argument to an IO object or raises an ArgumentError.
285
+ #
286
+ def self.to_io(descr)
287
+ case descr
288
+ when IO
289
+ yield descr
290
+ when String
291
+ yield StringIO.new(descr)
292
+ else
293
+ raise ArgumentError, "IO instance expected, #{descr.class} received", caller
294
+ end
295
+ end
296
+
297
+ end # module ADL
298
+ end # module Stamina
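The ADL automaton grammar documented in adl.rb above (a 'state_count edge_count' header, then one line per state, then one line per edge) can be illustrated with a small stand-alone reader. This is only a sketch under stated assumptions: `read_adl_automaton` is a hypothetical helper that is not part of stamina-core, it returns plain hashes instead of a Stamina::Automaton, and it raises ArgumentError where the real parser raises ADL::ParseError.

```ruby
# Hypothetical stand-alone reader for the ADL automaton format. It is NOT part
# of stamina-core: it returns plain hashes rather than a Stamina::Automaton.
def read_adl_automaton(text)
  # Drop '#' comments and blank lines, as the documented format allows.
  lines = text.each_line.map { |l| l.sub(/#.*/, '').strip }.reject(&:empty?)

  # First significant line: 'state_count edge_count'.
  header = lines.shift.to_s
  match = /\A(\d+)\s+(\d+)\z/.match(header)
  raise ArgumentError, "'state_count edge_count' expected, '#{header}' found" unless match
  state_count, edge_count = match[1].to_i, match[2].to_i

  unless lines.size == state_count + edge_count
    raise ArgumentError, "#{state_count} states and #{edge_count} edges announced, " \
                         "#{lines.size} lines found"
  end

  # State lines: 'id initial accepting [error]'; the error flag is optional.
  states = {}
  lines.first(state_count).each do |l|
    id, initial, accepting, error = l.split
    states[id] = {
      :initial   => (initial == 'true'),
      :accepting => (accepting == 'true'),
      :error     => (error == 'true')
    }
  end

  # Edge lines: 'source target symbol'; both ends must be declared states.
  edges = lines.drop(state_count).map do |l|
    source, target, symbol = l.split
    raise ArgumentError, "no such state in '#{l}'" unless states[source] && states[target]
    { :source => source, :target => target, :symbol => symbol }
  end

  { :states => states, :edges => edges }
end

# The a(ba)* automaton used as the example in the parse_automaton documentation.
fa = read_adl_automaton(<<-ADL)
  # this automaton accepts the a(ba)* regular language
  2 2
  0 true false
  1 false true
  0 1 a
  1 0 b
ADL
puts fa[:edges].size   # prints 2
```

Unlike the mode-driven loop in parse_automaton, this sketch slices the significant lines by the counts announced in the header, which keeps it short but gives less precise error messages (no line numbers).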
@@ -0,0 +1,1300 @@
1
+ module Stamina
2
+
3
+ #
4
+ # Automaton data-structure.
5
+ #
6
+ # == Examples
7
+ # The following example uses a lot of useful DRY shortcuts, so if it does not
8
+ # fit your needs, read on:
9
+ #
10
+ # # Building an automaton for the regular language a(ba)*
11
+ # fa = Automaton.new do
12
+ # add_state(:initial => true)
13
+ # add_state(:accepting => true)
14
+ # connect(0,1,'a')
15
+ # connect(1,0,'b')
16
+ # end
17
+ #
18
+ # # It accepts 'a b a b a', rejects 'a b' as well as ''
19
+ # puts fa.accepts?('? a b a b a') # prints true
20
+ # puts fa.accepts?('? a b') # prints false
21
+ # puts fa.rejects?('?') # prints true
22
+ #
23
+ # == Four things you need to know
24
+ # 1. Automaton, State and Edge classes implement a Markable design pattern, that
25
+ # is, you can read and write any key/value pair you want on them using the []
26
+ # and []= operators. Note that the following keys are used by Stamina itself,
27
+ # with the obvious semantics (for automata and transducers):
28
+ # - <tt>:initial</tt>, <tt>:accepting</tt>, <tt>:error</tt> on State;
29
+ # expected to be _true_ or _false_ (_nil_ and omitted are considered as false).
30
+ # Shortcuts for querying and setting these attributes are provided by State.
31
+ # - <tt>:symbol</tt> on Edge, with shortcuts as well on Edge.
32
+ # The convention is to use _nil_ for the epsilon symbol (aka non observable)
33
+ # on non deterministic automata.
34
+ # The following keys are reserved for future extensions:
35
+ # - <tt>:output</tt> on State and Edge.
36
+ # - <tt>:short_prefix</tt> on State.
37
+ # See also the "About states and edges" subsection of the design choices.
38
+ # 2. Why use the State methods State#step and State#delta ? The Automaton class includes
39
+ # the Walking module by default, which is much more powerful !
40
+ # 3. The constructor of this class executes the argument block (between <tt>do</tt>
41
+ # and <tt>end</tt>) with instance_eval by default. You won't be able to invoke
42
+ # the methods defined in the scope of your block in such a case. See new
43
+ # for details.
44
+ # 4. This class has not been designed with efficiency in mind. If you experience
45
+ # performance problems, read the "About Automaton modifications" sub section
46
+ # of the design choices.
47
+ #
48
+ # == Design choices
49
+ # This section fully details the design choices that have been made for the
50
+ # implementation of the Automaton data structure used by Stamina. It is provided
51
+ # because Automaton is one of the core classes of Stamina, that probably all
52
+ # users (and contributors) will use. Automaton usage is really user-friendly,
53
+ # so <b>you are normally not required</b> to read this section in the first
54
+ # place ! Read it only if it is of interest to you, or if you experience unexpected
55
+ # results.
56
+ #
57
+ # === One Automaton class only
58
+ # One class only implements all kinds of automata: deterministic, non-deterministic,
59
+ # transducers, prefix-tree-acceptors, etc. The Markable design pattern on states and
60
+ # edges should allow you to make anything you could find useful with this class.
61
+ #
62
+ # === Adjacency-list graph
63
+ # This class implements an automaton using an adjacency-list graph structure.
64
+ # The automaton has state and edge array lists and exposes them through the
65
+ # _states_ and _edges_ accessors. In order to let users enjoy the enumerability
66
+ # of Ruby's arrays while allowing automata to be modified, these arrays are
67
+ # externally modifiable. However, <b>users are not expected to modify them!</b>
68
+ # and future versions of Stamina will certainly remove this ability.
69
+ #
70
+ # === Indices exposed
71
+ # State and Edge indices in these arrays are exposed by this class. Unless stated
72
+ # explicitly, all methods taking state or edge arguments support indices as well.
73
+ # Moreover, ith_state, ith_states, ith_edge and ith_edges methods provide powerful
74
+ # access to states and edges by indices. All these methods are robust to invalid
75
+ # indices (and raise an IndexError if incorrectly invoked) but do not allow
76
+ # negative indexing (unlike ruby arrays).
77
+ #
78
+ # States and edges know their index in the corresponding array and expose them
79
+ # through the (read-only) _index_ accessor. These indices are always valid;
80
+ # without deletion of states or edges in the automaton, they are guaranteed not
81
+ # to change. Indices saved in your own variables must be considered deprecated
82
+ # each time you perform a deletion ! That's the only rule to respect if you plan
83
+ # to use indices.
84
+ #
85
+ # Indices exposition may seem a strange choice and could be interpreted as
86
+ # breaking OOP's best practice. You are not required to use them but, as will
87
+ # quickly appear, using them is really powerful and leads to beautiful code!
88
+ # If you don't remove any state or edge, this class guarantees that indices
89
+ # are assigned in the same order as invocations of add_state and add_edge (as
90
+ # well as their plural forms and aliases).
91
+ #
92
+ # === About states and edges
93
+ # Edges know their source and target states, which are exposed through the
94
+ # _source_ and _target_ (read-only) accessors (also aliased as _from_ and _to_).
95
+ # States keep their incoming and outgoing edges in arrays, which are accessible
96
+ # (in fact, a copy) using State#in_edges and State#out_edges. If you use them
97
+ # for walking the automaton in a somewhat standard way, consider using the Walking
98
+ # module instead!
99
+ #
100
+ # Common attributes of states and edges are installed using the Markable pattern
101
+ # itself:
102
+ # - <tt>:initial</tt>, <tt>:accepting</tt> and <tt>:error</tt> on states. These
103
+ # attributes are expected to be _true_ or _false_ (_nil_ and omitted are also
104
+ # supported and both considered as false).
105
+ # - <tt>:symbol</tt> on edges. Any object you want as long as it responds to the
106
+ # <tt><=></tt> operator. Also, the convention is to use _nil_ for the epsilon
107
+ # symbol (aka non observable) on non deterministic automata.
108
+ #
109
+ # In addition, useful shortcuts are available:
110
+ # - <tt>s.initial?</tt> is a shortcut for <tt>s[:initial]</tt> if _s_ is a State
111
+ # - <tt>s.initial!</tt> is a shortcut for <tt>s[:initial]=true</tt> if _s_ is a State
112
+ # - Similar shortcuts are available for :accepting and :error
113
+ # - <tt>e.symbol</tt> is a shortcut for <tt>e[:symbol]</tt> if _e_ is an Edge
114
+ # - <tt>e.symbol='a'</tt> is a shortcut for <tt>e[:symbol]='a'</tt> if _e_ is an Edge
115
+ #
116
+ # Following keys should be considered reserved by Stamina for future extensions:
117
+ # - <tt>:output</tt> on State and Edge.
118
+ # - <tt>:short_prefix</tt> on State.
119
+ #
120
+ # === About Automaton modifications
121
+ # This class has not been implemented with efficiency in mind. In particular, we expect
122
+ # the vast majority of Stamina core algorithms considering automata as immutable values.
123
+ # For this reason, the Automaton class does not handle modifications really efficiently.
124
+ #
125
+ # So, if you experience performance problems, consider what follows:
126
+ # 1. Why update an automaton ? Building a fresh one is much cleaner and more efficient !
127
+ # This is particularly true for removals.
128
+ # 2. If you can create multiple states or edges at once, consider the plural form
129
+ # of the modification methods: add_n_states and drop_states. Those methods are
130
+ # optimized for multiple updates.
131
+ #
132
+ # == Detailed API
133
+ class Automaton
134
+ include Stamina::Markable
135
+
136
+ #
137
+ # Automaton state.
138
+ #
139
+ class State
140
+ include Stamina::Markable
141
+ attr_reader :automaton, :index
142
+
143
+ #
144
+ # Creates a state.
145
+ #
146
+ # Arguments:
147
+ # - automaton: parent automaton of the state.
148
+ # - index: index of the state in the state list.
149
+ # - data: user data attached to this state.
150
+ #
151
+ def initialize(automaton, index, data)
152
+ @automaton = automaton
153
+ @index = index
154
+ @data = data.dup
155
+ @out_edges = []
156
+ @in_edges = []
157
+ @epsilon_closure = nil
158
+ end
159
+
160
+ ### public read-only section ###############################################
161
+ public
162
+
163
+ # Returns true if this state is an initial state, false otherwise.
164
+ def initial?
165
+ !!@data[:initial]
166
+ end
167
+
168
+ # Sets this state as an initial state.
169
+ def initial!
170
+ @data[:initial] = true
171
+ end
172
+
173
+ # Returns true if this state is an accepting state, false otherwise.
174
+ def accepting?
175
+ !!@data[:accepting]
176
+ end
177
+
178
+ # Sets this state as an accepting state.
179
+ def accepting!
180
+ @data[:accepting] = true
181
+ end
182
+
183
+ # Returns true if this state is an error state, false otherwise.
184
+ def error?
185
+ !!@data[:error]
186
+ end
187
+
188
+ # Sets this state as an error state.
189
+ def error!
190
+ @data[:error] = true
191
+ end
192
+
193
+ # Returns true if this state is deterministic, false otherwise.
194
+ def deterministic?
195
+ outs = out_symbols
196
+ (outs.size==@out_edges.size) and not(outs.include?(nil))
197
+ end
198
+
199
+ # Checks if this state is a sink state or not. Sink states are defined as
200
+ # non accepting states having no outgoing transition or only loop
201
+ # transitions.
202
+ def sink?
203
+ !accepting? && out_edges.all?{|e| e.target==self}
204
+ end
205
+
206
+ # Returns an array containing all incoming edges of the state. Edges are
207
+ # sorted if _sorted_ is set to true. If two incoming edges have same symbol
208
+ # no order is guaranteed between them.
209
+ #
210
+ # Returned array may be modified.
211
+ def in_edges(sorted=false)
212
+ sorted ? @in_edges.sort : @in_edges.dup
213
+ end
214
+
215
+ # Returns an array containing all outgoing edges of the state. Edges are
216
+ # sorted if _sorted_ is set to true. If two outgoing edges have same symbol
217
+ # no order is guaranteed between them.
218
+ #
219
+ # Returned array may be modified.
220
+ def out_edges(sorted=false)
221
+ sorted ? @out_edges.sort : @out_edges.dup
222
+ end
223
+
224
+ # Returns an array with the different symbols appearing on incoming edges.
225
+ # Returned array does not contain duplicates. Symbols are sorted in the
226
+ # array if _sorted_ is set to true.
227
+ #
228
+ # Returned array may be modified.
229
+ def in_symbols(sorted=false)
230
+ symbols = @in_edges.collect{|e| e.symbol}.uniq
231
+ return sorted ? (symbols.sort &automaton.symbols_comparator) : symbols
232
+ end
233
+
234
+ # Returns an array with the different symbols appearing on outgoing edges.
235
+ # Returned array does not contain duplicates. Symbols are sorted in the
236
+ # array if _sorted_ is set to true.
237
+ #
238
+ # Returned array may be modified.
239
+ def out_symbols(sorted=false)
240
+ symbols = @out_edges.collect{|e| e.symbol}.uniq
241
+ return sorted ? (symbols.sort &automaton.symbols_comparator) : symbols
242
+ end
243
+
244
+ # Returns an array with adjacent states (in or out edge).
245
+ #
246
+ # Returned array may be modified.
247
+ def adjacent_states()
248
+ (in_adjacent_states+out_adjacent_states).uniq
249
+ end
250
+
251
+ # Returns an array with adjacent states along an incoming edge (without
252
+ # duplicates).
253
+ #
254
+ # Returned array may be modified.
255
+ def in_adjacent_states()
256
+ (@in_edges.collect {|e| e.source}).uniq
257
+ end
258
+
259
+ # Returns an array with adjacent states along an outgoing edge (without
260
+ # duplicates).
261
+ #
262
+ # Returned array may be modified.
263
+ def out_adjacent_states()
264
+ (@out_edges.collect {|e| e.target}).uniq
265
+ end
266
+
267
+ # Returns reachable states from this one with an input _symbol_. Returned
268
+ # array does not contain duplicates and may be modified. This method is not
269
+ # epsilon symbol aware.
270
+ def step(symbol)
271
+ @out_edges.select{|e| e.symbol==symbol}.collect{|e| e.target}
272
+ end
273
+
274
+ # Returns the state reached from this one with an input _symbol_, or nil if
275
+ # no such state. This method is not epsilon symbol aware. Moreover it is
276
+ # expected to be used on deterministic states only. If the state is not
277
+ # deterministic, the method returns one reachable state if such a state
278
+ # exists; which one is returned must be considered non deterministic.
279
+ def dfa_step(symbol)
280
+ edge = @out_edges.find{|e| e.symbol==symbol}
281
+ edge ? edge.target : nil
282
+ end
283
+
284
+ # Computes the epsilon closure of this state. Epsilon closure is the set of
285
+ # all states reached from this one with a <tt>eps*</tt> input (sequence of
286
+ # zero or more epsilon symbols). The current state is always contained in
287
+ # the epsilon closure. Returns an unsorted array without duplicates; this
288
+ # array may not be modified.
289
+ def epsilon_closure()
290
+ @epsilon_closure ||= compute_epsilon_closure(Set.new).to_a.freeze
291
+ end
292
+
293
+ # Internal implementation of epsilon_closure. _result_ is expected to be
294
+ # a Set instance, is modified and is the returned value.
295
+ def compute_epsilon_closure(result)
296
+ result << self
297
+ step(nil).each do |t|
298
+ t.compute_epsilon_closure(result) unless result.include?(t)
299
+ end
300
+ raise if result.nil?
301
+ return result
302
+ end
303
+
304
+ # Computes an array representing the set of states that can be reached from
305
+ # this state with a given input _symbol_. Returned array does not contain
306
+ # duplicates and may be modified. No particular ordering of states in the
307
+ # array is guaranteed.
308
+ #
309
+ # This method is epsilon symbol aware (represented with nil) on non
310
+ # deterministic automata, meaning that it actually computes the set of
311
+ # reachable states through strings respecting the <tt>eps* symbol eps*</tt>
312
+ # regular expression, where eps is the epsilon symbol.
313
+ def delta(symbol)
314
+ if automaton.deterministic?
315
+ target = dfa_delta(symbol)
316
+ target.nil? ? [] : [target]
317
+ else
318
+ # 1) first compute epsilon closure of self
319
+ at_epsilon = epsilon_closure
320
+
321
+ # 2) now, look where we can go from there
322
+ at_espilon_then_symbol = at_epsilon.collect do |s|
323
+ s.step(symbol)
324
+ end.flatten.uniq
325
+
326
+ # 3) look where we can go from there using epsilon
327
+ result = at_espilon_then_symbol.collect do |s|
328
+ s.epsilon_closure
329
+ end.flatten.uniq
330
+
331
+ # return result as an array
332
+ result
333
+ end
334
+ end
335
+
336
+ # Returns the target state that can be reached from this state with _symbol_
337
+ # input. Returns nil if no such state exists.
338
+ #
339
+ # This method is expected to be used on deterministic automata. Unlike delta,
340
+ # it returns a State instance (or nil), not an array of states. When used on
341
+ # non deterministic automata, it returns a state immediately reachable from
342
+ # this state with _symbol_ input, or nil if no such state exists. This
343
+ # method is not epsilon aware.
344
+ def dfa_delta(symbol)
345
+ return nil if symbol.nil?
346
+ edge = @out_edges.find{|e| e.symbol==symbol}
347
+ edge.nil? ? nil : edge.target
348
+ end
349
+
350
+ # Provides comparator of states, based on the index in the automaton state
351
+ # list. This method returns nil unless _o_ is a State from the same
352
+ # automaton as self.
353
+ def <=>(o)
354
+ return nil unless State===o
355
+ return nil unless automaton===o.automaton
356
+ return index <=> o.index
357
+ end
358
+
359
+ # Returns a string representation
360
+ def inspect
361
+ 's' << @index.to_s
362
+ end
363
+
364
+ # Returns a string representation
365
+ def to_s
366
+ 's' << @index.to_s
367
+ end
368
+
369
+ ### protected write section ################################################
370
+ protected
371
+
372
+ # Changes the index of this state in the state list. This method is only
373
+ # expected to be used by the automaton itself.
374
+ def index=(i) @index=i end
375
+
376
+ #
377
+ # Fired by Loaded when a user data is changed. The message is forwarded to
378
+ # the automaton.
379
+ #
380
+ def state_changed(what, description)
381
+ @epsilon_closure = nil
382
+ @automaton.send(:state_changed, what, description)
383
+ end
384
+
385
+ # Adds an incoming edge to the state.
386
+ def add_incoming_edge(edge)
387
+ @epsilon_closure = nil
388
+ @in_edges << edge
389
+ end
390
+
391
+ # Adds an outgoing edge to the state.
392
+ def add_outgoing_edge(edge)
393
+ @epsilon_closure = nil
394
+ @out_edges << edge
395
+ end
396
+
397
+ # Removes an incoming edge from the state.
398
+ def drop_incoming_edge(edge)
399
+ @epsilon_closure = nil
400
+ @in_edges.delete(edge)
401
+ end
402
+
403
+ # Removes an outgoing edge from the state.
404
+ def drop_outgoing_edge(edge)
405
+ @epsilon_closure = nil
406
+ @out_edges.delete(edge)
407
+ end
408
+
409
+ protected :compute_epsilon_closure
410
+ end
411
+
412
+ #
413
+ # Automaton edge.
414
+ #
415
+ class Edge
416
+ include Stamina::Markable
417
+ attr_reader :automaton, :index, :from, :to
418
+
419
+ #
420
+ # Creates an edge.
421
+ #
422
+ # Arguments:
423
+ # - automaton: parent automaton of the edge.
424
+ # - index: index of the edge in the edge list.
425
+ # - data: user data attached to this edge.
426
+ # - from: source state of the edge.
427
+ # - to: target state of the edge.
428
+ #
429
+ def initialize(automaton, index, data, from, to)
430
+ @automaton, @index = automaton, index
431
+ @data = data
432
+ @from, @to = from, to
433
+ end
434
+
435
+ # Returns edge symbol.
436
+ def symbol()
437
+ @data[:symbol]
438
+ end
439
+
440
+ # Sets edge symbol.
441
+ def symbol=(symbol)
442
+ @data[:symbol] = symbol
443
+ end
444
+
445
+ alias :source :from
446
+ alias :target :to
447
+
448
+ #
449
+ # Provides comparator of edges, based on the index in the automaton edge
450
+ # list. This method returns nil unless _o_ is an Edge from the same
451
+ # automaton as self.
452
+ # Once again, this method has nothing to do with equality, it looks at an
453
+ # index and ID only.
454
+ #
455
+ def <=>(o)
456
+ return nil unless Edge===o
457
+ return nil unless automaton===o.automaton
458
+ return index <=> o.index
459
+ end
460
+
461
+ # Returns a string representation
462
+ def inspect
463
+ 'e' << @index.to_s
464
+ end
465
+
466
+ # Returns a string representation
467
+ def to_s
468
+ 'e' << @index.to_s
469
+ end
470
+
471
+ ### protected write section ################################################
472
+ protected
473
+
474
+ # Changes the index of this edge in the edge list. This method is only
475
+ # expected to be used by the automaton itself.
476
+ def index=(i) @index=i end
477
+
478
+ #
479
+ # Fired by Loaded when a user data is changed. The message is forwarded to
480
+ # the automaton.
481
+ #
482
+ def state_changed(what, infos)
483
+ @automaton.send(:state_changed, what, infos)
484
+ end
485
+
486
+ end
487
+
488
+ ### Automaton class ##########################################################
489
+ public
490
+
491
+ # State list and edge list of the automaton
492
+ attr_reader :states, :edges
493
+
494
+ #
495
+ # Creates an empty automaton and executes the block passed as argument. The _onself_
496
+ # argument dictates the way _block_ is executed:
497
+ # - when set to false, the block is executed traditionally (i.e. using yield).
498
+ # In this case, method invocations must be performed on the automaton object
499
+ # passed as block argument.
500
+ # - when set to _true_ (by default) the block is executed in the context of the
501
+ # automaton itself (i.e. with instance_eval), allowing call of its methods
502
+ # without prefixing them by the automaton variable. The automaton still
503
+ # passes itself as first block argument. Note that in this case, you won't be
504
+ # able to invoke a method defined in the scope of your block.
505
+ #
506
+ # Example:
507
+ # # The DRY way to do:
508
+ # Automaton.new do |automaton| # automaton will not be used here, but it is passed
509
+ # add_state(:initial => true)
510
+ # add_state(:accepting => true)
511
+ # connect(0, 1, 'a')
512
+ # connect(1, 0, 'b')
513
+ #
514
+ # # method_in_caller_scope() # commented because not allowed here !!
515
+ # end
516
+ #
517
+ # # The other way:
518
+ # Automaton.new(false) do |automaton| # automaton MUST be used here
519
+ # automaton.add_state(:initial => true)
520
+ # automaton.add_state(:accepting => true)
521
+ # automaton.connect(0, 1, 'a')
522
+ # automaton.connect(1, 0, 'b')
523
+ #
524
+ # method_in_caller_scope() # allowed in this variant !!
525
+ # end
526
+ #
527
+ def initialize(onself=true, &block) # :yields: automaton
528
+ @states = []
529
+ @edges = []
530
+ @initials = nil
531
+ @alphabet = nil
532
+ @deterministic = nil
533
+
534
+ # if there's a block, execute it now!
535
+ if block_given?
536
+ if onself
537
+ if RUBY_VERSION >= "1.9.0"
538
+ instance_exec(self, &block)
539
+ else
540
+ instance_eval(&block)
541
+ end
542
+ else
543
+ block.call(self)
544
+ end
545
+ end
546
+ end
547
+
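The two block styles accepted by `Automaton.new` can be shown with a small stand-alone class. `Builder` and `add` are hypothetical names used only for illustration, and the sketch assumes Ruby 1.9 or later, where `instance_exec` is available.

```ruby
# Hypothetical builder mirroring the onself switch described above.
class Builder
  attr_reader :items

  def initialize(onself = true, &block)
    @items = []
    return unless block
    if onself
      # The block runs with self set to the builder; the builder is also
      # passed as the first block argument.
      instance_exec(self, &block)
    else
      # The block keeps the caller's self and receives the builder explicitly.
      block.call(self)
    end
  end

  def add(item)
    @items << item
    self
  end
end

implicit = Builder.new { add(:x) }
explicit = Builder.new(false) { |builder| builder.add(:y) }
p implicit.items
p explicit.items
```

The implicit style is shorter, but methods defined in the caller's scope are not reachable inside the block because `self` has changed. The explicit style keeps the caller's scope at the cost of a prefix on every call.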
548
+ # Coerces `arg` to an automaton
549
+ def self.coerce(arg)
550
+ if arg.respond_to?(:to_fa)
551
+ arg.to_fa
552
+ elsif arg.is_a?(String)
553
+ parse(arg)
554
+ else
555
+ raise ArgumentError, "Invalid argument #{arg} for `Automaton`"
556
+ end
557
+ end
558
+
559
+ # Parses an automaton using ADL
560
+ def self.parse(str)
561
+ ADL::parse_automaton(str)
562
+ end
563
+
564
+ ### public read-only section #################################################
565
+ public
566
+
567
+ # Returns a symbols comparator taking epsilon symbols into account. Comparator
568
+ # is provided as Proc instance which is a lambda function.
569
+ def symbols_comparator
570
+ @symbols_comparator ||= Kernel.lambda do |a,b|
571
+ if a==b then 0
572
+ elsif a.nil? then -1
573
+ elsif b.nil? then 1
574
+ else a <=> b
575
+ end
576
+ end
577
+ end
578
+
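Plain `Array#sort` cannot order a list that mixes `nil` with strings, which is why a dedicated comparator is useful once epsilon is represented by `nil`. The sketch below restates the same rule as stand-alone methods; `compare_symbols` and `sort_symbols` are hypothetical names, not part of the library.

```ruby
# Same rule as the lambda above: equal symbols tie, nil (epsilon) sorts first,
# everything else falls back to the natural order.
def compare_symbols(a, b)
  if a == b then 0
  elsif a.nil? then -1
  elsif b.nil? then 1
  else a <=> b
  end
end

# Sorts a list of edge symbols that may contain nil.
def sort_symbols(symbols)
  symbols.sort { |a, b| compare_symbols(a, b) }
end

p sort_symbols(['b', nil, 'a'])
```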
579
+ # Returns the number of states
580
+ def state_count() @states.size end
581
+
582
+ # Returns the number of edges
583
+ def edge_count() @edges.size end
584
+
585
+ #
586
+ # Returns the i-th state of the state list.
587
+ #
588
+ # Raises:
589
+ # - ArgumentError unless i is an Integer
590
+ # - IndexError if i is not in [0..state_count)
591
+ #
592
+ def ith_state(i)
593
+ raise(ArgumentError, "Integer expected, #{i} found.", caller)\
594
+ unless Integer === i
595
+ raise(IndexError, "Invalid state index #{i}", caller)\
596
+ unless i>=0 and i<state_count
597
+ @states[i]
598
+ end
599
+
600
+ #
601
+ # Returns the state associated with the supplied state name; raises an ArgumentError if no such state can be found.
602
+ #
603
+ def get_state(name)
604
+ raise(ArgumentError, "String expected, #{name} found.", caller)\
605
+ unless String === name
606
+ result = states.find do |s|
607
+ name == s[:name]
608
+ end
609
+ raise(ArgumentError, "State #{name} was not found", caller)\
610
+ if result.nil?
611
+ result
612
+ end
613
+
614
+ #
615
+ # Returns the i-th states of the state list.
616
+ #
617
+ # Raises:
618
+ # - ArgumentError unless all _i_ are integers
619
+ # - IndexError unless all _i_ are in [0..state_count)
620
+ #
621
+ def ith_states(*i)
622
+ i.collect{|j| ith_state(j)}
623
+ end
624
+
625
+ #
626
+ # Returns the i-th edge of the edge list.
627
+ #
628
+ # Raises:
629
+ # - ArgumentError unless i is an Integer
630
+ # - IndexError if i is not in [0..edge_count)
631
+ #
632
+ def ith_edge(i)
633
+ raise(ArgumentError, "Integer expected, #{i} found.", caller)\
634
+ unless Integer === i
635
+ raise(IndexError, "Invalid edge index #{i}", caller)\
636
+ unless i>=0 and i<edge_count
637
+ @edges[i]
638
+ end
639
+
640
+ #
641
+ # Returns the i-th edges of the edge list.
642
+ #
643
+ # Raises:
644
+ # - ArgumentError unless all _i_ are integers
645
+ # - IndexError unless all _i_ are in [0..edge_count)
646
+ #
647
+ def ith_edges(*i)
648
+ i.collect{|j| ith_edge(j)}
649
+ end
650
+
651
+ #
652
+ # Calls block for each state of the automaton state list. States are
653
+ # enumerated in index order.
654
+ #
655
+ def each_state() @states.each {|s| yield s if block_given?} end
656
+
657
+ #
658
+ # Calls block for each edge of the automaton edge list. Edges are
659
+ # enumerated in index order.
660
+ #
661
+ def each_edge() @edges.each {|e| yield e if block_given?} end
662
+
663
+ #
664
+ # Returns an array with incoming edges of _state_. Edges are sorted by symbols
665
+ # if _sorted_ is set to true. If two incoming edges have same symbol, no
666
+ # order is guaranteed between them. Returned array may be modified.
667
+ #
668
+ # If _state_ is an Integer, this method returns the incoming edges of the
669
+ # state'th state in the state list.
670
+ #
671
+ # Raises:
672
+ # - IndexError if state is an Integer and state<0 or state>=state_count.
673
+ # - ArgumentError if _state_ is not a valid state for this automaton.
674
+ #
675
+ def in_edges(state, sorted=false) to_state(state).in_edges(sorted) end
676
+
677
+ #
678
+ # Returns an array with outgoing edges of _state_. Edges are sorted by symbols
679
+ # if _sorted_ is set to true. If two outgoing edges have the same symbol, no
680
+ # order is guaranteed between them. Returned array may be modified.
681
+ #
682
+ # If _state_ is an Integer, this method returns the outgoing edges of the
683
+ # state'th state in the state list.
684
+ #
685
+ # Raises:
686
+ # - IndexError if state is an Integer and state<0 or state>=state_count.
687
+ # - ArgumentError if state is not a valid state (not a state or not from this
688
+ # automaton)
689
+ #
690
+ def out_edges(state, sorted=false) to_state(state).out_edges(sorted) end
691
+
692
+ #
693
+ # Returns an array with the different symbols appearing on incoming edges of
694
+ # _state_. Returned array does not contain duplicates and may be modified;
695
+ # it is sorted if _sorted_ is set to true.
696
+ #
697
+ # If _state_ is an Integer, this method returns the incoming symbols of the
698
+ # state'th state in the state list.
699
+ #
700
+ # Raises:
701
+ # - IndexError if state is an Integer and state<0 or state>=state_count.
702
+ # - ArgumentError if _state_ is not a valid state for this automaton.
703
+ #
704
+ def in_symbols(state, sorted=false) to_state(state).in_symbols(sorted) end
705
+
706
+ #
707
+ # Returns an array with the different symbols appearing on outgoing edges of
708
+ # _state_. Returned array does not contain duplicates and may be modified;
709
+ # it is sorted if _sorted_ is set to true.
710
+ #
711
+ # If _state_ is an Integer, this method returns the outgoing symbols of the
712
+ # state'th state in the state list.
713
+ #
714
+ # Raises:
715
+ # - IndexError if state is an Integer and state<0 or state>=state_count.
716
+ # - ArgumentError if state is not a valid state (not a state or not from this
717
+ # automaton)
718
+ #
719
+ def out_symbols(state, sorted=false) to_state(state).out_symbols(sorted) end
720
+
721
+ #
722
+ # Returns an array with adjacent states (along incoming and outgoing edges)
723
+ # of _state_. Returned array does not contain duplicates; it may be modified.
724
+ #
725
+ # If _state_ is an Integer, this method returns the adjacent states of the
726
+ # state'th state in the state list.
727
+ #
728
+ # Raises:
729
+ # - IndexError if state is an Integer and state<0 or state>=state_count.
730
+ # - ArgumentError if state is not a valid state (not a state or not from this
731
+ # automaton)
732
+ #
733
+ def adjacent_states(state) to_state(state).adjacent_states() end
734
+
735
+ #
736
+ # Returns an array with adjacent states (along incoming edges) of _state_.
737
+ # Returned array does not contain duplicates; it may be modified.
738
+ #
739
+ # If _state_ is an Integer, this method returns the incoming adjacent states
740
+ # of the state'th state in the state list.
741
+ #
742
+ # Raises:
743
+ # - IndexError if state is an Integer and state<0 or state>=state_count.
744
+ # - ArgumentError if state is not a valid state (not a state or not from this
745
+ # automaton)
746
+ #
747
+ def in_adjacent_states(state) to_state(state).in_adjacent_states() end
748
+
749
+ #
750
+ # Returns an array with adjacent states (along outgoing edges) of _state_.
751
+ # Returned array does not contain duplicates; it may be modified.
752
+ #
753
+ # If _state_ is an Integer, this method returns the outgoing adjacent states
754
+ # of the state'th state in the state list.
755
+ #
756
+ # Raises:
757
+ # - IndexError if state is an Integer and state<0 or state>=state_count.
758
+ # - ArgumentError if state is not a valid state (not a state or not from this
759
+ # automaton)
760
+ #
761
+ def out_adjacent_states(state) to_state(state).out_adjacent_states() end
762
+
763
+ #
764
+ # Collects all initial states of this Automaton and returns them. Returned array
765
+ # does not contain duplicates and may be modified.
766
+ #
767
+ # This method is epsilon symbol aware (represented with nil) on
768
+ # non-deterministic automata, meaning that it actually computes the set of
769
+ # reachable states from an initial state through strings respecting the
770
+ # <tt>eps*</tt> regular expression, where eps is the epsilon symbol.
771
+ #
772
+ def initial_states
773
+ if @initials.nil? or @initials.empty?
774
+ @initials = compute_initial_states
775
+ end
776
+ @initials
777
+ end
778
+
779
+ #
780
+ # Returns the initial state of the automaton. This method is expected to be used
781
+ # on deterministic automata only. Unlike initial_states, it returns one State
782
+ # instance instead of an Array.
783
+ #
784
+ # When used with a non deterministic automaton, it returns one of the states
785
+ # tagged as initial. Which one is returned must be considered a non
786
+ # deterministic choice. This method is not epsilon symbol aware.
787
+ #
788
+ def initial_state
789
+ initial_states[0]
790
+ end
791
+
792
+ # Internal implementation of initial_states.
793
+ def compute_initial_states()
794
+ initials = @states.select {|s| s.initial?}
795
+ initials.collect{|s| s.epsilon_closure}.flatten.uniq
796
+ end
797
+
798
+ ### public write section #####################################################
799
+ public
800
+
801
+ #
802
+ # Adds a new state.
803
+ #
804
+ # Arguments:
805
+ # - data: user-data to attach to the state (see Automaton documentation).
806
+ #
807
+ # Raises:
808
+ # - ArgumentError if _data_ is not a valid state data.
809
+ #
810
+ def add_state(data={})
811
+ data = to_valid_state_data(data)
812
+
813
+ # create new state, add it to state-list
814
+ state = State.new(self, state_count, data)
815
+ @states << state
816
+
817
+ # let the automaton know that something has changed
818
+ state_changed(:state_added, state)
819
+
820
+ # return created state
821
+ state
822
+ end
823
+ alias :create_state :add_state
824
+
825
+ #
826
+ # Adds _n_ new states in the automaton. Created states are returned as an
827
+ # ordered array (order of states according to their index in state list).
828
+ #
829
+ # _data_ is duplicated for each created state.
830
+ #
831
+ def add_n_states(n, data={})
832
+ created = []
833
+ n.times do |i|
834
+ created << add_state(block_given? ? data.merge(yield(i)) : data.dup)
835
+ end
836
+ created
837
+ end
838
+ alias :create_n_states :add_n_states
839
+
840
+ #
841
+ # Adds a new edge, connecting _from_ and _to_ states of the automaton.
842
+ #
843
+ # Arguments:
844
+ # - from: either a State or a valid state index (Integer).
845
+ # - to: either a State or a valid state index (Integer).
846
+ # - data: user data to attach to the created edge (see Automaton documentation).
847
+ #
848
+ # Raises:
849
+ # - IndexError if _from_ is an Integer but not in [0..state_count)
850
+ # - IndexError if _to_ is an Integer but not in [0..state_count)
851
+ # - ArgumentError if _from_ is not a valid state for this automaton.
852
+ # - ArgumentError if _to_ is not a valid state for this automaton.
853
+ # - ArgumentError if _data_ is not a valid edge data.
854
+ #
855
+ def add_edge(from, to, data)
856
+ from, to, data = to_state(from), to_state(to), to_valid_edge_data(data)
857
+
858
+ # create edge, install it, add it to edge-list
859
+ edge = Edge.new(self, edge_count, data, from, to)
860
+ @edges << edge
861
+ from.send(:add_outgoing_edge, edge)
862
+ to.send(:add_incoming_edge, edge)
863
+
864
+ # let automaton know that something has changed
865
+ state_changed(:edge_added, edge)
866
+
867
+ # return created edge
868
+ edge
869
+ end
870
+ alias :create_edge :add_edge
871
+ alias :connect :add_edge
872
+
873
+ #
874
+ # Adds all states and transitions (as copies) from a different automaton.
875
+ # None of the added states are made initial. Returns the (associated state
876
+ # of the) initial state of the added part.
877
+ #
878
+ # This method is deprecated and should not be used anymore. Use dup instead.
879
+ #
880
+ # In order to ensure that names of the new states do not clash with names of
881
+ # existing states, state names may have to be removed from added states;
882
+ # this is the case if _clear_names_ is set to true.
883
+ #
884
+ def add_automaton(fa, clear_names = true)
885
+ initial = nil
886
+ fa.dup(self){|source,target|
887
+ initial = target if target.initial?
888
+ target[:initial] = false
889
+ target[:name] = nil if clear_names
890
+ }
891
+ initial
892
+ end
893
+
894
+ #
895
+ # Constructs a replica of this automaton and returns a copy.
896
+ #
897
+ # This copy can be modified in whatever way without affecting the original
898
+ # automaton.
899
+ #
900
+ def dup(fa = Automaton.new)
901
+ added = states.collect do |source|
902
+ target = fa.add_state(source.data.dup)
903
+ yield(source, target) if block_given?
904
+ target
905
+ end
906
+ edges.each do |edge|
907
+ from, to = added[edge.from.index], added[edge.to.index]
908
+ fa.connect(from, to, edge.data.dup)
909
+ end
910
+ fa
911
+ end
912
+
913
+ #
914
+ # Drops a state of the automaton, as well as all connected edges to that state.
915
+ # If _state_ is an integer, the state-th state of the state list is removed.
916
+ # This method returns the automaton itself.
917
+ #
918
+ # Raises:
919
+ # - IndexError if _state_ is an Integer but not in [0..state_count)
919
+ # - ArgumentError if _state_ is not a valid state for this automaton.
921
+ #
922
+ def drop_state(state)
923
+ state = to_state(state)
924
+ # remove edges first: drop_edges ensures that edge list is coherent
925
+ drop_edges(*(state.in_edges + state.out_edges).uniq)
926
+
927
+ # remove state now and renumber
928
+ @states.delete_at(state.index)
929
+ state.index.upto(state_count-1) do |i|
930
+ @states[i].send(:index=, i)
931
+ end
932
+ state.send(:index=, -1)
933
+
934
+ state_changed(:state_dropped, state)
935
+ self
936
+ end
937
+ alias :delete_state :drop_state
938
+
939
+ #
940
+ # Drops all states passed as parameter as well as all their connected edges.
941
+ # Arguments may be state instances, as well as valid state indices. Duplicates
942
+ # are supported. If some state argument is not valid, this method raises an
942
+ # error and leaves the automaton unchanged.
944
+ #
945
+ # Raises:
946
+ # - ArgumentError if one state in _states_ is not a valid state of this
947
+ # automaton.
948
+ #
949
+ def drop_states(*states)
950
+ # check states first
951
+ states = states.collect{|s| to_state(s)}.uniq.sort
952
+ edges = states.collect{|s| (s.in_edges + s.out_edges).uniq}.flatten.uniq.sort
953
+
954
+ # Remove all edges, we do not use drop_edges to avoid spending too much
955
+ # time reindexing edges. Moreover, we can do it that way because we take
956
+ # edges in reverse indexing order (has been sorted previously)
957
+ until edges.empty?
958
+ edge = edges.pop
959
+ edge.source.send(:drop_outgoing_edge,edge)
960
+ edge.target.send(:drop_incoming_edge,edge)
961
+ @edges.delete_at(edge.index)
962
+ edge.send(:index=, -1)
963
+ state_changed(:edge_dropped, edge)
964
+ end
965
+
966
+ # Remove all states, same kind of hack is used
967
+ until states.empty?
968
+ state = states.pop
969
+ @states.delete_at(state.index)
970
+ state.send(:index=, -1)
971
+ state_changed(:state_dropped, state)
972
+ end
973
+
974
+ # sanitize state and edge lists
975
+ @states.each_with_index {|s,i| s.send(:index=,i)}
976
+ @edges.each_with_index {|e,i| e.send(:index=,i)}
977
+
978
+ self
979
+ end
980
+
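The reverse-order trick used in `drop_states` can be isolated as follows. This is a sketch under the assumption that elements are identified by their position in a plain array; `delete_at_indices` is a hypothetical helper, not a library method.

```ruby
# Deletes the elements at +indices+ from +list+, highest index first, so that
# removing one element never shifts the position of another one still pending.
def delete_at_indices(list, indices)
  indices.uniq.sort.reverse_each { |i| list.delete_at(i) }
  list
end

p delete_at_indices(%w[s0 s1 s2 s3 s4], [1, 3, 1])
```

Deleting in ascending order would require adjusting every later index after each removal. Going from the highest index down avoids that, and a single renumbering pass at the end restores consistent indices.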
981
+ #
982
+ # Drops an edge in the automaton. If _edge_ is an integer, the edge-th edge
983
+ # of the edge list is removed. This method returns the automaton itself.
984
+ #
985
+ # Raises:
986
+ # - IndexError if _edge_ is an Integer but not in [0..edge_count)
987
+ # - ArgumentError if _edge_ is not a valid edge for this automaton.
988
+ #
989
+ def drop_edge(edge)
990
+ edge = to_edge(edge)
991
+ @edges.delete_at(edge.index)
992
+ edge.from.send(:drop_outgoing_edge,edge)
993
+ edge.to.send(:drop_incoming_edge,edge)
994
+ edge.index.upto(edge_count-1) do |i|
995
+ @edges[i].send(:index=, i)
996
+ end
997
+ edge.send(:index=,-1)
998
+ state_changed(:edge_dropped, edge)
999
+ self
1000
+ end
1001
+ alias :delete_edge :drop_edge
1002
+
1003
+ #
1004
+ # Drops all edges passed as parameters. Arguments may be edge objects,
1005
+ # as well as valid edge indices. Duplicates are supported. If some edge
1005
+ # argument is not valid, this method raises an error and leaves the
1006
+ # automaton unchanged.
1008
+ #
1009
+ # Raises:
1010
+ # - ArgumentError if one edge in _edges_ is not a valid edge of this automaton.
1011
+ #
1012
+ def drop_edges(*edges)
1013
+ # check edges first
1014
+ edges = edges.collect{|e| to_edge(e)}.uniq
1015
+
1016
+ # remove all edges
1017
+ edges.each do |e|
1018
+ @edges.delete(e)
1019
+ e.from.send(:drop_outgoing_edge,e)
1020
+ e.to.send(:drop_incoming_edge,e)
1021
+ e.send(:index=, -1)
1022
+ state_changed(:edge_dropped, e)
1023
+ end
1024
+ @edges.each_with_index do |e,i|
1025
+ e.send(:index=,i)
1026
+ end
1027
+
1028
+ self
1029
+ end
1030
+ alias :delete_edges :drop_edges
1031
+
1032
+ ### protected section ########################################################
1033
+ protected
1034
+
1035
+ #
1036
+ # Converts a _state_ argument to a valid State of this automaton.
1037
+ # There are three ways to refer to a state, by position in the internal
1038
+ # collection of states, using an instance of State and using a name of a
1039
+ # state (represented with a String).
1040
+ #
1041
+ # Raises:
1042
+ # - IndexError if state is an Integer and state<0 or state>=state_count.
1043
+ # - ArgumentError if state is not a valid state (not a state or not from this
1044
+ # automaton)
1045
+ #
1046
+ def to_state(state)
1047
+ case state
1048
+ when State
1049
+ return state if state.automaton==self and state==@states[state.index]
1050
+ raise ArgumentError, "Not a state of this automaton", caller
1051
+ when Integer
1052
+ return ith_state(state)
1053
+ when String
1054
+ result = get_state(state)
1055
+ return result unless result.nil?
1056
+ end
1057
+ raise ArgumentError, "Invalid state argument #{state}", caller
1058
+ end
1059
+
1060
+ #
1061
+ # Converts an _edge_ argument to a valid Edge of this automaton.
1062
+ #
1063
+ # Raises:
1064
+ # - IndexError if _edge_ is an Integer but not in [0..edge_count)
1065
+ # - ArgumentError if _edge_ is not a valid edge (not a edge or not from this
1066
+ # automaton)
1067
+ #
1068
+ def to_edge(edge)
1069
+ case edge
1070
+ when Edge
1071
+ return edge if edge.automaton==self and edge==@edges[edge.index]
1072
+ raise ArgumentError, "Not an edge of this automaton", caller
1073
+ when Integer
1074
+ return ith_edge(edge)
1075
+ end
1076
+ raise ArgumentError, "Invalid edge argument #{edge}", caller
1077
+ end
1078
+
1079
+ #
1080
+ # Checks if a given user-data contains enough information to be attached to
1081
+ # a given state. Returns the data if ok.
1082
+ #
1083
+ # Raises:
1084
+ # - ArgumentError if data is not considered a valid state data.
1085
+ #
1086
+ def to_valid_state_data(data)
1087
+ raise(ArgumentError,
1088
+ "User data should be an Hash", caller) unless Hash===data
1089
+ data
1090
+ end
1091
+
1092
+ #
1093
+ # Checks if a given user-data contains enough information to be attached to
1094
+ # a given edge. Returns the data if ok.
1095
+ #
1096
+ # Raises:
1097
+ # - ArgumentError if data is not considered a valid edge data.
1098
+ #
1099
+ def to_valid_edge_data(data)
1100
+ return {:symbol => data} if data.nil? or data.is_a?(String)
1101
+ raise(ArgumentError,
1102
+ "User data should be an Hash", caller) unless Hash===data
1103
+ raise(ArgumentError,
1104
+ "User data should contain a :symbol attribute.",
1105
+ caller) unless data.has_key?(:symbol)
1106
+ raise(ArgumentError,
1107
+ "Edge :symbol attribute cannot be an array.",
1108
+ caller) if Array===data[:symbol]
1109
+ data
1110
+ end
1111
+
1112
+ ### public sections with useful utilities ####################################
1113
+ public
1114
+
1115
+ # Returns true if the automaton is deterministic, false otherwise
1116
+ def deterministic?
1117
+ if @deterministic.nil?
1118
+ @deterministic = @states.all?{|s| s.deterministic?}
1119
+ end
1120
+ @deterministic
1121
+ end
1122
+
1123
+ ### public & protected sections about alphabet ###############################
1124
+ protected
1125
+
1126
+ # Deduces the alphabet from the automaton edges.
1127
+ def deduce_alphabet
1128
+ edges.collect{|e| e.symbol}.uniq.compact.sort
1129
+ end
1130
+
1131
+ public
1132
+
1133
+ # Returns the alphabet of the automaton.
1134
+ def alphabet
1135
+ @alphabet || deduce_alphabet
1136
+ end
1137
+
1138
+ # Sets the alphabet of the automaton. _alph_ is expected to be an array containing
1138
+ # neither nil nor duplicates. This method raises an ArgumentError otherwise. Such an
1140
+ # error is also raised if a symbol used on the automaton edges is not included
1141
+ # in _alph_.
1142
+ def alphabet=(alph)
1143
+ raise ArgumentError, "Invalid alphabet" unless alph.uniq.compact.size==alph.size
1144
+ raise ArgumentError, "Invalid alphabet" unless deduce_alphabet.reject{|s| alph.include?(s)}.empty?
1145
+ @alphabet = alph.sort
1146
+ end
1147
+
1148
+ ### public section about coercions ###########################################
1149
+ public
1150
+
1151
+ # Returns a finite automaton
1152
+ def to_fa
1153
+ self
1154
+ end
1155
+
1156
+ # Returns a deterministic finite automaton
1157
+ def to_dfa
1158
+ self.deterministic? ? self : self.determinize
1159
+ end
1160
+
1161
+ # Returns a canonical deterministic finite automaton
1162
+ def to_cdfa
1163
+ cdfa = self
1164
+ cdfa = cdfa.determinize unless self.deterministic?
1165
+ cdfa = cdfa.complete unless self.complete?
1166
+ cdfa = cdfa.minimize
1167
+ cdfa
1168
+ end
1169
+
1170
+ # Returns a regular language
1171
+ def to_reglang
1172
+ RegLang.new(self)
1173
+ end
1174
+
1175
+ ### public section about dot utilities #######################################
1176
+ protected
1177
+
1178
+ #
1179
+ # Converts a hash of attributes (typically automaton, state or edge attributes)
1180
+ # to the content of a <code>[...]</code> dot string. Brackets are not part of the output.
1181
+ #
1182
+ def attributes2dot(attrs)
1183
+ buffer = ""
1184
+ attrs.keys.sort{|k1,k2| k1.to_s <=> k2.to_s}.each do |key|
1185
+ buffer << " " unless buffer.empty?
1186
+ value = attrs[key].to_s.gsub('"','\"')
1187
+ buffer << "#{key}=\"#{value}\""
1188
+ end
1189
+ buffer
1190
+ end
1191
+
1192
+ public
1193
+
1194
+ #
1195
+ # Generates a dot output from an automaton. The rewriter block takes
1196
+ # two arguments: the first one is a Markable instance (graph, state or
1197
+ # edge), the second one indicates which kind of element is passed (through
1198
+ # :automaton, :state or :edge symbol). The rewriter is expected to return a
1199
+ # hash-like object providing dot attributes for the element.
1200
+ #
1201
+ # When no rewriter is provided, a default one is used, providing
1202
+ # the following behavior:
1203
+ # - on :automaton
1204
+ #
1205
+ # {:rankdir => "LR"}
1206
+ #
1207
+ # - on :state
1208
+ #
1209
+ # {:shape => "doublecircle/circle" (following accepting?),
1210
+ # :style => "filled",
1211
+ # :fillcolor => "green/red/white" (if initial?/error?/else, respectively)}
1212
+ #
1213
+ # - on :edge
1214
+ #
1215
+ # {:label => "#{edge.symbol}"}
1216
+ #
1217
+ def to_dot(&rewriter)
1218
+ unless rewriter
1219
+ to_dot do |elm, kind|
1220
+ case kind
1221
+ when :automaton
1222
+ {:pack => true, :rankdir => "LR", :ranksep => 0, :margin => 0}
1223
+ when :state
1224
+ {:shape => (elm.accepting? ? "doublecircle" : "circle"),
1225
+ :style => "filled",
1226
+ :color => "black",
1227
+ :fillcolor => (elm.initial? ? "green" : (elm.error? ? "red" : "white")),
1228
+ :width => 0.6, :height => 0.6, :fixedsize => true
1229
+ }
1230
+ when :edge
1231
+ {:label => elm.symbol.nil? ? '' : elm.symbol.to_s,
1232
+ :arrowsize => 0.7}
1233
+ end
1234
+ end
1235
+ else
1236
+ buffer = "digraph G {\n"
1237
+ attrs = attributes2dot(rewriter.call(self, :automaton))
1238
+ buffer << " graph [#{attrs}];\n"
1239
+ self.depth
1240
+ states.sort{|s1,s2| s1[:depth] <=> s2[:depth]}.each do |s|
1241
+ s.remove_mark(:depth)
1242
+ attrs = attributes2dot(rewriter.call(s, :state))
1243
+ buffer << " #{s.index} [#{attrs}];\n"
1244
+ end
1245
+ edges.each do |e|
1246
+ attrs = attributes2dot(rewriter.call(e, :edge))
1247
+ buffer << " #{e.source.index} -> #{e.target.index} [#{attrs}];\n"
1248
+ end
1249
+ buffer << "}\n"
1250
+ end
1251
+ end
1252
+
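The attribute serialization that `to_dot` relies on can be sketched in a few lines. This is a stand-alone approximation, not the library's method: `attributes_to_dot` is a hypothetical name, and the block form of `gsub` is used here only to make the quote escaping explicit.

```ruby
# Serializes a hash of dot attributes as key="value" pairs separated by spaces.
# Keys are ordered by their string form so that the output is stable.
def attributes_to_dot(attrs)
  attrs.keys.sort_by { |k| k.to_s }.map do |key|
    value = attrs[key].to_s.gsub('"') { '\"' }  # escape embedded double quotes
    "#{key}=\"#{value}\""
  end.join(' ')
end

# The caller wraps the result in brackets when emitting a node or an edge.
edge_attrs = attributes_to_dot(:label => 'a', :arrowsize => 0.7)
puts "  0 -> 1 [#{edge_attrs}];"
```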
1253
+ ### public section about adl utilities #######################################
1254
+ public
1255
+
1256
+ # Prints this automaton in ADL format
1257
+ def to_adl(buffer = "")
1258
+ Stamina::ADL.print_automaton(self, buffer)
1259
+ end
1260
+
1261
+ ### public section about reordering ##########################################
1262
+ public
1263
+
1264
+ # Uses a comparator block to reorder the state list.
1265
+ def order_states(&block)
1266
+ raise ArgumentError, "A comparator block must be given" unless block_given?
1267
+ raise ArgumentError, "A comparator block of arity 2 must be given" unless block.arity==2
1268
+ @states.sort!(&block)
1269
+ @states.each_with_index{|s,i| s.send(:index=, i)}
1270
+ self
1271
+ end
1272
+
1273
+ ### protected section about changes ##########################################
1274
+ protected
1275
+
1276
+ #
1277
+ # Fired by write methods when an automaton change occurs.
1278
+ #
1279
+ def state_changed(what, infos)
1280
+ @initials = nil
1281
+ @deterministic = nil
1282
+ end
1283
+
1284
+ protected :compute_initial_states
1285
+
1286
+ DUM = Automaton.new{ add_state(:initial => true, :accepting => false) }
1287
+ DEE = Automaton.new{ add_state(:initial => true, :accepting => true) }
1288
+ end # class Automaton
1289
+
1290
+ end # module Stamina
1291
+ require_relative 'automaton/walking'
1292
+ require_relative 'automaton/complete'
1293
+ require_relative 'automaton/complement'
1294
+ require_relative 'automaton/strip'
1295
+ require_relative 'automaton/equivalence'
1296
+ require_relative 'automaton/determinize'
1297
+ require_relative 'automaton/minimize'
1298
+ require_relative 'automaton/metrics'
1299
+ require_relative 'automaton/compose'
1300
+ require_relative 'automaton/hide'