rubylabs 0.9.0 → 0.9.1

data/lib/elizalab.rb CHANGED
@@ -1,20 +1,22 @@
1
- #! /usr/bin/env ruby
1
+ module RubyLabs
2
2
 
3
3
  =begin rdoc
4
4
 
5
- == ELIZA
6
-
7
- Joseph Weizenbaum's ELIZA program in Ruby. Students can play with the Doctor script,
5
+ == ElizaLab
6
+
7
+ The ElizaLab module has definitions of classes and methods used in the projects for Chapter 10
8
+ of <em>Explorations in Computing</em>. The methods and classes in this module are a Ruby
9
+ implementation of Joseph Weizenbaum's ELIZA program. Users can "chat" with the Doctor script,
8
10
  which mimics a Rogerian psychiatrist, and experiment by adding new rules to Doctor or
9
11
  writing their own scripts.
10
12
 
11
- =end
13
+ Most methods used to install a script or carry on a conversation are in a class named Eliza.
14
+ To interact with Eliza, call one of the class methods, e.g. to load the "doctor" script that
15
+ comes with the ElizaLab module call <tt>Eliza.load(:doctor)</tt> and to start a conversation
16
+ call <tt>Eliza.run</tt>. See the documentation for the Eliza class for a complete list of
17
+ these top level methods.
12
18
 
13
- =begin
14
- TODO don't add rule to queue more than once (e.g. word repeated in input sentence)
15
19
  =end
16
-
17
- module RubyLabs
18
20
 
19
21
  module ElizaLab
20
22
 
@@ -23,269 +25,352 @@ module ElizaLab
23
25
 
24
26
  =begin rdoc
25
27
 
28
+ == Rule
29
+
26
30
  A transformation rule is associated with a key word, and is triggered
27
31
  when that word is found in an input sentence. Rules have integer
28
- priorities, and if more than one rule is enabled ELIZA applies the one
29
- with the highest priority. Each rule has an ordered list of patterns,
30
- and each pattern has a list of reassembly rules.
32
+ priorities, and if more than one rule is enabled Eliza applies the one
33
+ with the highest priority.
31
34
 
32
- To apply a rule, scan the patterns, and for the first pattern that matches
33
- a sentence, build the output using the current reassembly rule.
35
+ Each rule has an ordered list of patterns, which control how Eliza will
36
+ respond to sentences containing the key word (see the Pattern class).
34
37
 
35
38
  =end
36
39
 
37
40
  class Rule
38
41
 
39
- attr_accessor :key, :priority, :patterns
40
-
41
- =begin rdoc
42
- Specify the key word for a rule when the rule is created.
43
- =end
44
-
45
- def initialize(key, priority = 1)
46
- @key = key
47
- @priority = priority
48
- @patterns = Array.new
49
- end
42
+ attr_accessor :key, :priority, :patterns
43
+
44
+ # Create a new Rule object for sentences containing the word +key+. An
45
+ # optional second argument specifies the rule's priority (the default is 1,
46
+ # which is the lowest priority). The list of patterns is initially empty.
47
+
48
+ def initialize(key, priority = 1)
49
+ @key = key
50
+ @priority = priority
51
+ @patterns = Array.new
52
+ end
50
53
 
51
- =begin rdoc
52
- Compare rule priorities. r1 precedes r2 in the queue if r1 has a higher
53
- priority than r2. The >= is important, in order to make sure the default
54
- rule stays at the end of the queue (i.e. new rules will be inserted at the
55
- front).
56
- =end
57
-
54
+ # Compare this Rule with another Rule object +x+ based on their priority attributes. The rule comparison operator
55
+ # is used when a Rule is added to a priority queue.
56
+ #--
57
+ # The >= operator in the method body is important, in order to make sure the default
58
+ # rule stays at the end of the queue (i.e. new rules will be inserted at the
59
+ # front).
60
+
58
61
  def <(x)
59
62
  @priority >= x.priority
60
63
  end
61
-
62
- def addPattern(expr)
63
- if expr.class == Pattern
64
- @patterns << expr
65
- else
66
- if expr.class == String
67
- expr = Regexp.new(expr.slice(1..-2))
68
- end
69
- @patterns << Pattern.new(expr)
70
- end
71
- end
72
-
73
- def [](n)
74
- @patterns[n]
75
- end
76
-
77
- def addReassembly(line, n = -1)
78
- @patterns[n].add_response(line)
79
- end
80
-
81
-
82
- =begin rdoc
83
- Rule application -- try the patterns in order. When the line matches a pattern,
84
- return the next reassembly for that pattern. Apply variable substitutions to both
85
- the patterns and the reassemblies if they contain variables.
86
- =end
87
-
88
- def apply(s, opt)
89
- @patterns.each do |p|
90
- if @@verbose
91
- print "trying pattern "
92
- p p.regexp
93
- end
94
- res = p.apply(s, opt)
95
- return res if ! res.nil?
96
- end
97
- return nil
98
- end
99
-
100
- def to_s
101
- s = @key + " / " + @priority.to_s + "\n"
102
- @patterns.each { |r| s += " " + r.to_s + "\n" }
103
- return s
104
- end
105
-
106
- def inspect
64
+
65
+ # Add a new sentence pattern (represented by a Pattern object) to the list of patterns
66
+ # for this rule. +expr+ can either be a reference to an existing Pattern object, or
67
+ # a string, in which case a new Pattern is created.
68
+
69
+ def addPattern(expr)
70
+ if expr.class == Pattern
71
+ @patterns << expr
72
+ else
73
+ if expr.class == String
74
+ expr = Regexp.new(expr.slice(1..-2))
75
+ end
76
+ @patterns << Pattern.new(expr)
77
+ end
78
+ end
79
+
80
+ # Return a reference to sentence pattern +n+ associated with this rule.
81
+
82
+ def [](n)
83
+ @patterns[n]
84
+ end
85
+
86
+ # Helper method called by methods that read scripts from a file -- add a response
87
+ # string to sentence pattern +n+.
88
+
89
+ def addReassembly(line, n = -1)
90
+ @patterns[n].add_response(line)
91
+ end
92
+
93
+ # Apply this rule to a sentence +s+. Try the patterns in order, to see if any of them match +s+.
94
+ # When +s+ matches a pattern, return the next reassembly for that pattern. Apply variable substitutions to both
95
+ # the patterns and the reassemblies if they contain variables. Return +nil+ if no patterns apply to +s+.
96
+ #
97
+ # The second argument, +opt+, is a symbol that is passed to Pattern#apply to control whether or not
98
+ # it should do preprocessing. Possible values are <tt>:preprocess</tt> or <tt>:no_preprocess</tt>.
99
+
100
+ def apply(s, opt)
101
+ @patterns.each do |p|
102
+ if @@verbose
103
+ print "trying pattern "
104
+ p p.regexp
105
+ end
106
+ res = p.apply(s, opt)
107
+ return res if ! res.nil?
108
+ end
109
+ return nil
110
+ end
111
+
112
+ # Create a string that contains the rule name and priority.
113
+
114
+ def to_s
115
+ s = @key + " / " + @priority.to_s + "\n"
116
+ @patterns.each { |r| s += " " + r.to_s + "\n" }
117
+ return s
118
+ end
119
+
120
+ # Create a string that describes the attributes of this Rule object.
121
+
122
+ def inspect
107
123
  # s = @key.inspect
108
124
  s = ""
109
- s += " [#{@priority}]" if @priority > 1
110
- s += " --> [\n" + @patterns.join("\n") + "]"
111
- return s
112
- end
125
+ s += " [#{@priority}]" if @priority > 1
126
+ s += " --> [\n" + @patterns.join("\n") + "]"
127
+ return s
128
+ end
113
129
 
114
130
  end # class Rule
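
The comment on Rule#< says the >= keeps the default rule at the end of the queue. The following is a minimal sketch of that effect, not the RubyLabs implementation: SketchRule and insert_by_priority are hypothetical stand-ins, and the real PriorityQueue class is not shown in this diff, so its insertion strategy is assumed here.

```ruby
# Simplified stand-in for the Rule class in the diff: only key and priority.
SketchRule = Struct.new(:key, :priority) do
  # Same comparison as Rule#<: because of the >=, ties count as "less than".
  def <(other)
    priority >= other.priority
  end
end

# Hypothetical insertion routine (assumed, not taken from RubyLabs):
# place the new item before the first queued item it compares "less than".
def insert_by_priority(queue, item)
  i = queue.index { |queued| item < queued } || queue.length
  queue.insert(i, item)
end

queue = []
insert_by_priority(queue, SketchRule.new(:default, 1))
insert_by_priority(queue, SketchRule.new("duck", 1))
insert_by_priority(queue, SketchRule.new("mother", 3))
insert_by_priority(queue, SketchRule.new("bird", 1))

ORDER = queue.map(&:key)
p ORDER   # => ["mother", "bird", "duck", :default]
```

With a strict > in place of >=, a new rule of priority 1 would land after the default rule instead of in front of it, under the same assumed insertion routine.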
115
131
 
116
132
  =begin rdoc
117
133
 
134
+ == Pattern
135
+
118
136
  A Pattern represents one way to transform an input sentence into a
119
137
  response. A Pattern instance has a regular expression and a list of
120
138
  one or more reassembly strings that can refer to groups in the expression.
121
139
  There is also an index to record the last reassembly string used, so
122
140
  the application can cycle through the strings.
123
141
 
124
- For convenience the constructor inserts word break anchors and attaches
125
- a /i to the expression as needed. NOTE: the inspect method removes these
126
- automatic items so the printed string is cleaner; to see the real Regexp
127
- call the regexp accessor. Example:
128
-
129
- >> p = Pattern.new(/hi/,"hello")
130
- => /hi/ -> ["hello"] [0]
131
- >> p.regexp
132
- => /\bhi\b/i
133
-
134
- Another convenience: add group delimiters around wildcards (.*), groups of
135
- words (a|b|c), and variable names ($x) if they aren't already there.
136
-
137
142
  =end
138
143
 
139
- # Pattern.new called internally only from Rule#addPattern, which is called
140
- # to add /.*/ for default rule, or when reading /.../ line from script.
141
-
142
- # In interactive experiments, users can call Pattern.new(s) or Pattern.new(s,a)
143
- # where s is a string or regexp, and a is an array of response strings.
144
-
145
144
  class Pattern
146
- attr_accessor :regexp, :list, :index, :md
147
-
148
- def initialize(expr, list = [])
145
+ attr_accessor :regexp, :list, :index, :md
146
+
147
+ # Create a new sentence pattern that will apply to input sentences that
148
+ # match +expr+. The argument can be either a string or a regular expression.
149
+ # If the argument is a string, it is converted to a regular expression that
150
+ # matches exactly that string, e.g. "duck" is converted to /duck/.
151
+ #
152
+ # To make it easier for users to create patterns without knowing too many details
153
+ # of regular expressions the constructor modifies the regular expression:
154
+ # word breaks:: Insert word break anchors before the first word and after the last word in the expression
155
+ # case insensitive:: Add a /i modifier to the regular expression
156
+ # wildcards:: Insert parentheses around ".*"
157
+ # variables:: Insert parentheses around variable names of the form "$n"
158
+ # alternatives:: Insert parentheses around groups of words, e.g. "a|b|c"
159
+ #
160
+ # To see the real final regular expression stored with a rule call the
161
+ # +regexp+ accessor method.
162
+ #
163
+ # Example:
164
+ # >> p = Pattern.new("duck")
165
+ # => duck: []
166
+ # >> p.regexp
167
+ # => /\bduck\b/i
168
+ #
169
+ # >> p = Pattern.new("plane|train|automobile")
170
+ # => (plane|train|automobile): []
171
+ # >> p.regexp
172
+ # => /(plane|train|automobile)/i
173
+ #
174
+ # >> p = Pattern.new("I don't like .*")
175
+ # => I don't like (.*): []
176
+ # >> p.regexp
177
+ # => /\bI don't like (.*)/i
178
+ #--
179
+ # Pattern.new called internally only from Rule#addPattern, which is called
180
+ # to add /.*/ for default rule, or when reading /.../ line from script.
181
+ #
182
+ # In interactive experiments, users can call Pattern.new(s) or Pattern.new(s,a)
183
+ # where s is a string or regexp, and a is an array of response strings.
184
+
185
+ def initialize(expr, list = [])
149
186
  raise "Pattern#initialize: expr must be String or Regexp" unless (expr.class == String || expr.class == Regexp)
150
- re = (expr.class == String) ? expr : expr.source
187
+ re = (expr.class == String) ? expr : expr.source
151
188
  add_parens(re, /\(?\.\*\)?/ )
152
189
  add_parens(re, /\(?[\w' ]+(\|[\w' ]+)+\)?/ )
153
190
  add_parens(re, /\(?\$\w+\)?/ )
154
- re.insert(0,'\b') if re =~ /^\w/
155
- re.insert(-1,'\b') if re =~ /\w$/
156
- @regexp = Regexp.new(re, :IGNORECASE)
157
- @list = list.nil? ? Array.new : list
158
- @index = 0
159
- end
160
-
161
- def reset
162
- @index = 0
163
- end
164
-
165
- # s is a source string, r is a pattern with optional parens -- add parens if they're not there
166
-
167
- def add_parens(s, r)
168
- s.gsub!(r) { |m| ( m[0] == ?( ) ? m : "(" + m + ")" }
169
- end
170
-
171
- def add_response(sentence)
172
- @list << sentence
173
- end
174
-
175
- def apply(s, opt = :preprocess)
176
- Eliza.preprocess(s) if opt == :preprocess
177
- @md = s.match(@regexp)
178
- return nil if @list.empty? || @md == nil
179
- res = @list[inc()].clone
180
- return res if res[0] == ?@
181
- puts "reassembling '#{res}'" if @@verbose
182
- res.gsub!(/\$\d+/) do |ns|
183
- n = ns.slice(1..-1).to_i # strip leading $, convert to int
184
- if n && @md[n]
185
- puts "postprocess #{@md[n]}" if @@verbose
186
- @md[n].gsub(/[a-z\-$']+/i) do |w|
187
- (@@post.has_key?(w) && @@post[w][0] != ?$) ? @@post[w] : w
188
- end
189
- else
190
- warn "Pattern.apply: no match for #{ns} in '#{res}'"
191
- ""
192
- end
193
- end
194
- return res
195
- end
196
-
197
- def match(s)
198
- @md = s.match(@regexp)
199
- return @md != nil
200
- end
201
-
202
- def parts
203
- return @md.nil? ? nil : @md.captures
204
- end
205
-
206
- def to_s
207
- s = " /" + cleanRegexp + "/\n"
208
- @list.each { |x| s += " \"" + x + "\"\n" }
209
- return s
210
- end
211
-
212
- def inspect
213
- return cleanRegexp + ": " + @list.inspect
214
- end
215
-
216
- def cleanRegexp
217
- res = @regexp.source
218
- res.gsub!(/\\b/,"")
219
- return res
220
- end
221
-
222
- private
223
-
224
- def inc
225
- n = @index
226
- @index = (@index + 1) % @list.length
227
- return n
228
- end
229
-
191
+ re.insert(0,'\b') if re =~ /^\w/
192
+ re.insert(-1,'\b') if re =~ /\w$/
193
+ @regexp = Regexp.new(re, :IGNORECASE)
194
+ @list = list.nil? ? Array.new : list
195
+ @index = 0
196
+ end
197
+
198
+ # Reset the internal counter in this pattern, so that the next response comes from
199
+ # the first response string.
200
+
201
+ def reset
202
+ @index = 0
203
+ end
204
+
205
+ # Helper method called by the constructor -- add parentheses around every occurrence
206
+ # of the string +r+ in sentence pattern +s+. Checks to make sure there aren't already
207
+ # parentheses there.
208
+
209
+ def add_parens(s, r)
210
+ s.gsub!(r) { |m| ( m[0] == ?( ) ? m : "(" + m + ")" }
211
+ end
212
+
213
+ # Add sentence +s+ to the set of response strings for this pattern.
214
+
215
+ def add_response(s)
216
+ @list << s
217
+ end
218
+
219
+ # Try to apply this pattern to input sentence +s+. If +s+ matches the regular
220
+ # expression for this rule, extract the parts that match groups, insert them
221
+ # into the next response string, and return the result. If +s+ does not match
222
+ # the regular expression return +nil+.
223
+ #
224
+ # The second argument should be a symbol that controls whether or not the method
225
+ # applies preprocessing rules. The default is to apply preprocessing, which is the
226
+ # typical case when users call the method from an IRB session. But when Eliza is
227
+ # running, preprocessing is done already, so this argument is set to :no_preprocess.
228
+
229
+ def apply(s, opt = :preprocess)
230
+ Eliza.preprocess(s) if opt == :preprocess
231
+ @md = s.match(@regexp)
232
+ return nil if @list.empty? || @md == nil
233
+ res = @list[inc()].clone
234
+ return res if res[0] == ?@
235
+ puts "reassembling '#{res}'" if @@verbose
236
+ res.gsub!(/\$\d+/) do |ns|
237
+ n = ns.slice(1..-1).to_i # strip leading $, convert to int
238
+ if n && @md[n]
239
+ puts "postprocess #{@md[n]}" if @@verbose
240
+ @md[n].gsub(/[a-z\-$']+/i) do |w|
241
+ (@@post.has_key?(w) && @@post[w][0] != ?$) ? @@post[w] : w
242
+ end
243
+ else
244
+ warn "Pattern.apply: no match for #{ns} in '#{res}'"
245
+ ""
246
+ end
247
+ end
248
+ return res
249
+ end
250
+
251
+ # Helper method -- return +true+ if sentence +s+ matches the regular expression
252
+ # for this pattern.
253
+
254
+ def match(s)
255
+ @md = s.match(@regexp)
256
+ return @md != nil
257
+ end
258
+
259
+ # Helper method -- return an array of parts of the input sentence captured when
260
+ # the input was compared to the regular expression and that matched any wild cards
261
+ # or groups in the regular expression.
262
+
263
+ def parts
264
+ return @md.nil? ? nil : @md.captures
265
+ end
266
+
267
+ # Create a string that summarizes the attributes of this pattern.
268
+
269
+ def to_s
270
+ s = " /" + cleanRegexp + "/\n"
271
+ @list.each { |x| s += " \"" + x + "\"\n" }
272
+ return s
273
+ end
274
+
275
+ # Create a more detailed string summarizing the pattern and its possible responses.
276
+
277
+ def inspect
278
+ return cleanRegexp + ": " + @list.inspect
279
+ end
280
+
281
+ # Helper method called by inspect and to_s -- remove the word boundary anchors from
282
+ # the regular expression so it is easier to read.
283
+
284
+ def cleanRegexp
285
+ res = @regexp.source
286
+ res.gsub!(/\\b/,"")
287
+ return res
288
+ end
289
+
290
+ private
291
+
292
+ def inc
293
+ n = @index
294
+ @index = (@index + 1) % @list.length
295
+ return n
296
+ end
297
+
230
298
  end # class Pattern
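
The rewriting steps listed in the Pattern constructor comment can be sketched as a standalone function. This is an illustration under assumptions: normalize_pattern is a hypothetical helper name, and it passes Regexp::IGNORECASE rather than the :IGNORECASE symbol used in the diff, since newer Ruby versions warn about non-integer option arguments.

```ruby
# Wrap every match of r in parentheses unless it already starts with one
# (same idea as Pattern#add_parens in the diff).
def add_parens(s, r)
  s.gsub!(r) { |m| m.start_with?("(") ? m : "(" + m + ")" }
end

# Hypothetical helper that mirrors the steps in Pattern#initialize.
def normalize_pattern(expr)
  re = expr.is_a?(Regexp) ? expr.source.dup : expr.dup
  add_parens(re, /\(?\.\*\)?/)                  # wildcards: .*
  add_parens(re, /\(?[\w' ]+(\|[\w' ]+)+\)?/)   # alternatives: a|b|c
  add_parens(re, /\(?\$\w+\)?/)                 # variables: $n
  re.insert(0, '\b')  if re =~ /^\w/            # leading word break
  re.insert(-1, '\b') if re =~ /\w$/            # trailing word break
  Regexp.new(re, Regexp::IGNORECASE)
end

p normalize_pattern("duck")                     # => /\bduck\b/i
p normalize_pattern("plane|train|automobile")   # => /(plane|train|automobile)/i
p normalize_pattern("I don't like .*").match("I don't like rain")[1]   # => "rain"
```

The word break is only added when the expression starts or ends with a word character, which is why the second and third examples have no trailing \b.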
231
299
 
232
300
 
233
301
  =begin rdoc
234
- A Dictionary is basically a Hash, but it overrides [] and []= to be case-insensitive
302
+
303
+ == Dictionary
304
+
305
+ A Dictionary object is basically a Hash, but it overrides [] and []= to be case-insensitive.
306
+
235
307
  =end
236
308
 
237
309
  class Dictionary < Hash
238
310
 
311
+ # Create a new empty dictionary.
312
+
239
313
  def initialize
240
314
  super
241
315
  @lc_keys = Hash.new
242
316
  end
243
317
 
318
+ # Look up word +x+ in the dictionary, after converting all the letters in +x+ to lower case.
319
+
244
320
  def [](x)
245
321
  @lc_keys[x.downcase]
246
322
  end
247
323
 
324
+ # Convert all letters in +x+ to lower case, then save item +y+ with the converted key.
325
+
248
326
  def []=(x,y)
249
327
  super
250
328
  @lc_keys[x.downcase] = y
251
329
  end
252
330
 
331
+ # Convert +x+ to lower case, then see if there is an entry for the converted key in the dictionary.
332
+
253
333
  def has_key?(x)
254
334
  return @lc_keys.has_key?(x.downcase)
255
335
  end
256
336
 
257
337
  end # class Dictionary
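
A short usage sketch of the case-insensitive lookup follows. SketchDictionary is a hypothetical copy of the class above, repeated only so the example runs on its own; the sample entry is made up for illustration.

```ruby
# Hypothetical standalone copy of the Dictionary class shown in the diff.
class SketchDictionary < Hash
  def initialize
    super
    @lc_keys = Hash.new        # lower-case key => value
  end

  def [](x)
    @lc_keys[x.downcase]
  end

  def []=(x, y)
    super                      # keep the original-case key in the Hash itself
    @lc_keys[x.downcase] = y
  end

  def has_key?(x)
    @lc_keys.has_key?(x.downcase)
  end
end

d = SketchDictionary.new
d["I'm"] = "I am"
p d["i'm"]            # => "I am"
p d.has_key?("I'M")   # => true
p d.keys              # => ["I'm"]
p d.key?("i'm")       # => false
```

Only the three overridden methods are case-insensitive. Inherited methods such as key? or fetch still see the original-case keys, as the last line shows.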
258
338
 
339
+ =begin rdoc
340
+
341
+ == Eliza
342
+
343
+ This top-level class of the ElizaLab module defines a singleton object that has
344
+ methods for managing a chat with Eliza.
345
+
346
+ =end
347
+
259
348
  class Eliza
260
-
261
- # These class variables define the "application" processed by ELIZA -- the rule
262
- # sets used to transform inputs to outputs. When ELIZA is initialized or reset it
263
- # gets a default rule that just echoes the user input.
264
-
265
- # Note: I haven't figured out how to have this method called when the module is first
266
- # loaded. As a workaround, any method that refers to a class variable (run, info, etc)
267
- # checks to see if they have been defined yet, and if not, call the reset method.
268
-
269
- def Eliza.clear
270
- @@script = nil
271
- @@aliases = Hash.new
272
- @@vars = Hash.new
273
- @@starts = Array.new
274
- @@stops = Array.new
275
- @@queue = PriorityQueue.new
276
-
277
- @@verbose = false
278
- @@pre.clear
279
- @@post.clear
280
- @@rules.clear
281
-
282
- @@default = Rule.new(:default)
283
- @@default.addPattern(/(.*)/)
284
- @@default.addReassembly("$1")
285
-
286
- return true
287
- end
288
-
349
+
350
+ # Initialize (or reinitialize) the module -- clear out any rules that have been
351
+ # loaded from a script, and install the default script that simply echoes the
352
+ # user input.
353
+
354
+ def Eliza.clear
355
+ @@script = nil
356
+ @@aliases = Hash.new
357
+ @@vars = Hash.new
358
+ @@starts = Array.new
359
+ @@stops = Array.new
360
+ @@queue = PriorityQueue.new
361
+
362
+ @@verbose = false
363
+ @@pre.clear
364
+ @@post.clear
365
+ @@rules.clear
366
+
367
+ @@default = Rule.new(:default)
368
+ @@default.addPattern(/(.*)/)
369
+ @@default.addReassembly("$1")
370
+
371
+ return true
372
+ end
373
+
289
374
  #
290
375
  # def Eliza.queue
291
376
  # return @@queue
@@ -299,31 +384,37 @@ words (a|b|c), and variable names ($x) if they aren't already there.
299
384
  # return @@vars
300
385
  # end
301
386
  #
387
+
388
+ # These methods are useful for debugging Eliza, but not for end users...
302
389
 
303
- def Eliza.pre
390
+ def Eliza.pre # :nodoc:
304
391
  return @@pre
305
392
  end
306
393
 
307
- def Eliza.post
394
+ def Eliza.post # :nodoc:
308
395
  return @@post
309
396
  end
310
397
 
311
- def Eliza.rules
398
+ def Eliza.rules # :nodoc:
312
399
  return @@rules
313
400
  end
314
-
315
- def Eliza.verbose
316
- @@verbose = true
317
- end
318
-
319
- def Eliza.quiet
320
- @@verbose = false
321
- end
322
-
323
- =begin rdoc
324
- Save a copy of a script that is distributed with RubyLabs; if no output file name specified
325
- make a file name from the program name.
326
- =end
401
+
402
+ # Turn on "verbose mode" to see a detailed trace of which rules and sentence
403
+ # patterns are being applied as Eliza responds to an input sentence. Call
404
+ # Eliza.quiet to return to normal mode.
405
+
406
+ def Eliza.verbose
407
+ @@verbose = true
408
+ end
409
+
410
+ # Turn off "verbose mode" to return to normal processing. See Eliza.verbose.
411
+
412
+ def Eliza.quiet
413
+ @@verbose = false
414
+ end
415
+
416
+ # Save a copy of a script that is distributed with RubyLabs; if no output file name is specified,
417
+ # make a file name from the script name.
327
418
 
328
419
  def Eliza.checkout(script, filename = nil)
329
420
  scriptfilename = script.to_s + ".txt"
@@ -334,284 +425,334 @@ words (a|b|c), and variable names ($x) if they aren't already there.
334
425
  end
335
426
  outfilename = filename.nil? ? (script.to_s + ".txt") : filename
336
427
  dest = File.open(outfilename, "w")
337
- File.open(scriptfilename).each do |line|
338
- dest.puts line.chomp
339
- end
428
+ File.open(scriptfilename).each do |line|
429
+ dest.puts line.chomp
430
+ end
340
431
  dest.close
341
432
  puts "Copy of #{script} saved in #{outfilename}"
342
433
  end
343
434
 
344
- # Utility procedure to get the rule for a word -- can be called interactively or
345
- # when processing a script
346
-
347
- def Eliza.rule_for(w)
348
- @@rules[w] || ((x = @@aliases[w]) && (r = @@rules[x]))
349
- end
350
-
351
- # Preprocessing -- turn string into single line, words separated by single space,
352
- # apply pre-processing substitutions
353
-
354
- def Eliza.preprocess(s)
435
+ # See if Eliza has a rule associated with keyword +w+. If so, return a reference
436
+ # to that Rule object, otherwise return +nil+.
437
+
438
+ def Eliza.rule_for(w)
439
+ @@rules[w] || ((x = @@aliases[w]) && (r = @@rules[x]))
440
+ end
441
+
442
+ # Apply preprocessing rules to an input +s+. Makes sure the entire input is a single
443
+ # line and words are separated by a single space, then applies pre-processing substitution
444
+ # rules. The string is modified in place, so after this call the string +s+ has all
445
+ # of the preprocessing substitutions.
446
+
447
+ def Eliza.preprocess(s)
355
448
  s.gsub!( /\s+/, " " )
356
- s.gsub!(@@word) { |w| @@pre.has_key?(w) ? @@pre[w] : w }
357
- puts "preprocess: line = '#{s}'" if @@verbose
358
- end
359
-
360
- # First pass over the input -- scan each word, apply preprocessing substitutions,
361
- # add rule names to the priority queue. NOTE: this method does a destructive
362
- # update to the input line....
363
-
364
- def Eliza.scan(line, queue)
365
- Eliza.preprocess(line)
366
- line.scan(@@word) do |w|
449
+ s.gsub!(@@word) { |w| @@pre.has_key?(w) ? @@pre[w] : w }
450
+ puts "preprocess: line = '#{s}'" if @@verbose
451
+ end
452
+
453
+ # The scan method implements the first step in the "Eliza algorithm" to determine the response to an input sentence.
454
+ # Apply preprocessing substitutions, then break the line into individual words, and
455
+ # for each word that is associated with a Rule object, add the rule to the priority
456
+ # queue.
457
+ #--
458
+ # NOTE: this method does a destructive update to the input line....
459
+
460
+ def Eliza.scan(line, queue)
461
+ Eliza.preprocess(line)
462
+ line.scan(@@word) do |w|
367
463
  w.downcase!
368
- if r = Eliza.rule_for(w)
369
- queue << r
370
- puts "add rule for '#{w}' to queue" if @@verbose
371
- end
372
- end
373
- end
374
-
375
- def Eliza.apply(line, rule)
376
- puts "applying rule: key = '#{rule.key}'" if @@verbose
377
- if res = rule.apply(line, :no_preprocess)
378
- if res[0] == ?@
379
- rulename = res.slice(1..-1)
380
- if @@rules[rulename]
381
- return Eliza.apply( line, @@rules[rulename] )
382
- else
383
- warn "Eliza.apply: no rule for #{rulename}"
384
- return nil
385
- end
386
- else
387
- return res
388
- end
389
- else
390
- return nil
391
- end
392
- end
393
-
394
- # The heart of the program -- apply transformation rules to an input sentence.
395
-
396
- def Eliza.transform(s)
397
- s.sub!(/[\n\.\?!\-]*$/,"") # strip trailing punctuation
464
+ if r = Eliza.rule_for(w)
465
+ queue << r
466
+ puts "add rule for '#{w}' to queue" if @@verbose
467
+ end
468
+ end
469
+ end
470
+
471
+ # The apply method implements the second step in the "Eliza algorithm" to determine the response to an input sentence.
472
+ # It is called from the top level method (Eliza.transform) to see if a rule applies to an
473
+ # input sentence. If so, return the string generated by the rule object, otherwise
474
+ # return +nil+.
475
+ #
476
+ # This is the method that handles indirection in scripts. If a rule body has a line
477
+ # of the form "@x" it means sentences containing the key word for this rule should be
478
+ # handled by the rule for +x+. For example, suppose a script has this rule:
479
+ # duck
480
+ # /football/
481
+ # "I love my Ducks"
482
+ # /.*/
483
+ # @bird
484
+ # If an input sentence contains the word "duck", this rule will be added to the queue.
485
+ # If Eliza applies the rule (after first trying higher priority rules) it will
486
+ # see if the sentence matches the pattern /football/, i.e. if the word "football" appears
487
+ # anywhere else in the sentence, and if so respond with the string "I love my Ducks". If not, the
488
+ # next pattern succeeds (every input matches .*) and the response is generated by the
489
+ # rules for "bird".
490
+
491
+ def Eliza.apply(line, rule)
492
+ puts "applying rule: key = '#{rule.key}'" if @@verbose
493
+ if res = rule.apply(line, :no_preprocess)
494
+ if res[0] == ?@
495
+ rulename = res.slice(1..-1)
496
+ if @@rules[rulename]
497
+ return Eliza.apply( line, @@rules[rulename] )
498
+ else
499
+ warn "Eliza.apply: no rule for #{rulename}"
500
+ return nil
501
+ end
502
+ else
503
+ return res
504
+ end
505
+ else
506
+ return nil
507
+ end
508
+ end
509
+
510
+ # The transform method is called by the top level Eliza.run method to process
511
+ # each sentence typed by the user. Initialize a priority queue, apply
512
+ # preprocessing transformations, and add rules for each word to the queue. Then apply
513
+ # the rules, in order, until a call to <tt>r.apply</tt> for some rule +r+ returns a
514
+ # non-nil response. Note that the default rule should apply to any input string, so
515
+ # it should never be the case that the queue empties out before some rule can apply.
516
+
517
+ def Eliza.transform(s)
518
+ s.sub!(/[\n\.\?!\-]*$/,"") # strip trailing punctuation
398
519
  # s.downcase!
399
520
 
400
- @@queue = PriorityQueue.new
401
- @@queue << @@default # initialize queue with default rule
402
-
403
- Eliza.scan(s, @@queue) # add rules for recognized key words
404
-
405
- while @@queue.length > 0 # apply rules in order of priority
406
- if @@verbose
407
- print "queue: "
408
- p @@queue.collect { |r| r.key }
409
- end
410
- rule = @@queue.shift
411
- if result = Eliza.apply(s, rule)
412
- return result
413
- end
414
- end
415
-
416
- warn "No rules applied" if @@queue.empty?
417
- return nil
418
- end
419
-
420
- # The parser calls this method to deal with directives (lines where the first
421
- # word begins with a colon)
422
-
423
- def Eliza.parseDirective(line)
424
- word = Eliza.detachWord(line)
425
- case word
426
- when "alias"
427
- if line.empty? || line[0] != ?$
428
- warn "symbol after :alias must be a variable name; ignoring '#{word} #{line}'"
429
- return
430
- else
431
- sym = Eliza.detachWord(line)
432
- @@vars[sym] = Array.new
433
- line.split.each do |s|
434
- @@aliases[s] = sym
435
- @@vars[sym] << s
436
- end
437
- end
438
- when "start"
439
- @@starts << line.unquote
440
- when "stop"
441
- @@stops << line.unquote
442
- when "pre"
443
- sym = Eliza.detachWord(line)
444
- @@pre[sym] = line.unquote
445
- when "post"
446
- sym = Eliza.detachWord(line)
447
- @@post[sym] = line.unquote
448
- when "default"
449
- @@default = line[@@word]
450
- else
451
- warn "unknown directive: :#{word} (ignored)"
452
- end
453
- end
454
-
455
- # Remove a word from the front of a line
456
-
457
- def Eliza.detachWord(line)
458
- word = line[@@word] # pattern matches the first word
459
- if line.index(" ")
460
- line.slice!(0..line.index(" ")) # delete up to end of the word
461
- line.lstrip! # in case there are extra spaces after word
462
- else
463
- line.slice!(0..-1) # line just had the one word
464
- end
465
- return word
466
- end
467
-
468
- # Check each pattern's regular expression and replace var names by alternation
469
- # constructs. If the script specified a default rule name look up that
470
- # rule and save it as the default.
471
-
472
- def Eliza.compileRules
473
- @@rules.each do |key,val|
474
- a = val.patterns()
475
- a.each do |p|
476
- expr = p.regexp.inspect
477
- expr.gsub!(/\$\w+/) { |x| @@vars[x].join("|") }
478
- p.regexp = eval(expr)
479
- end
480
- end
481
- if @@default.class == String
482
- @@default = @@rules[@@default]
483
- end
484
- end
485
-
486
- # Parse rules in file f, store them in global arrays. Strategy: use a local
487
- # var named 'rule', initially set to nil. New rules start with a single word
488
- # at the start of a line. When such a line is found in the input file, create a
489
- # new Rule object and store it in 'rule'. Subsequent lines that are part of the
490
- # current rule (lines that contain regular expressions or strings) are added to
491
- # current Rule object. Directives indicate the end of a rule, so 'rule' is reset
492
- # to nil when a directive is seen.
493
-
494
- def Eliza.load(filename)
495
- begin
496
- Eliza.clear
497
- rule = nil
498
- if filename.class == Symbol
499
- filename = File.join(@@elizaDirectory, filename.to_s + ".txt")
521
+ @@queue = PriorityQueue.new
522
+ @@queue << @@default # initialize queue with default rule
523
+
524
+ Eliza.scan(s, @@queue) # add rules for recognized key words
525
+
526
+ while @@queue.length > 0 # apply rules in order of priority
527
+ if @@verbose
528
+ print "queue: "
529
+ p @@queue.collect { |r| r.key }
530
+ end
531
+ rule = @@queue.shift
532
+ if result = Eliza.apply(s, rule)
533
+ return result
534
+ end
535
+ end
536
+
537
+ warn "No rules applied" if @@queue.empty?
538
+ return nil
539
+ end
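
The scan, apply, and transform steps documented above, including "@x" indirection, can be sketched in a few lines. This is a simplified illustration, not the RubyLabs code: the names SKETCH_RULES, SKETCH_DEFAULT, sketch_apply, and sketch_transform are hypothetical, the "bird" response is invented, the word regexp is assumed, and the default rule is given priority 0 so a plain sort keeps it last (the real code relies on queue insertion order instead).

```ruby
# Each sketch rule has a priority and an ordered list of [regexp, response] pairs.
# A response of the form "@name" redirects to the rule called name.
SKETCH_RULES = {
  "duck" => { priority: 2, patterns: [[/\bfootball\b/i, "I love my Ducks"], [/.*/, "@bird"]] },
  "bird" => { priority: 1, patterns: [[/.*/, "Tell me more about birds"]] },
}
SKETCH_DEFAULT = { priority: 0, patterns: [[/(.*)/, "$1"]] }   # echoes the input

# Try the rule's patterns in order; follow "@name" indirection if present.
def sketch_apply(line, rule)
  rule[:patterns].each do |regexp, response|
    md = line.match(regexp)
    next if md.nil?
    if response.start_with?("@")
      target = SKETCH_RULES[response[1..-1]]
      return target && sketch_apply(line, target)
    end
    # Replace $1, $2, ... with the matching captured groups.
    return response.gsub(/\$(\d+)/) { md[$1.to_i].to_s }
  end
  nil
end

# Queue the rules whose key words occur in the sentence, then apply them
# from highest to lowest priority until one produces a response.
def sketch_transform(sentence)
  line = sentence.sub(/[\n.?!\-]*\z/, "")            # strip trailing punctuation
  rules = line.downcase.scan(/[a-z']+/).map { |w| SKETCH_RULES[w] }.compact
  queue = (rules + [SKETCH_DEFAULT]).sort_by { |r| -r[:priority] }
  queue.each do |rule|
    result = sketch_apply(line, rule)
    return result if result
  end
  nil
end

puts sketch_transform("My duck plays football!")   # prints: I love my Ducks
puts sketch_transform("I saw a duck.")             # prints: Tell me more about birds
puts sketch_transform("Hello there")               # prints: Hello there
```

The sketch omits preprocessing and postprocessing substitutions and the cycling through multiple reassembly strings, which the real Pattern#apply handles.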
540
+
541
+ # Helper method -- Eliza.load calls this method to deal with directives (lines where the first
542
+ # word begins with a colon)
543
+
544
+ def Eliza.parseDirective(line) # :nodoc:
545
+ word = Eliza.detachWord(line)
546
+ case word
547
+ when "alias"
548
+ if line.empty? || line[0] != ?$
549
+ warn "symbol after :alias must be a variable name; ignoring '#{word} #{line}'"
550
+ return
551
+ else
552
+ sym = Eliza.detachWord(line)
553
+ @@vars[sym] = Array.new
554
+ line.split.each do |s|
555
+ @@aliases[s] = sym
556
+ @@vars[sym] << s
557
+ end
558
+ end
559
+ when "start"
560
+ @@starts << line.unquote
561
+ when "stop"
562
+ @@stops << line.unquote
563
+ when "pre"
564
+ sym = Eliza.detachWord(line)
565
+ @@pre[sym] = line.unquote
566
+ when "post"
567
+ sym = Eliza.detachWord(line)
568
+ @@post[sym] = line.unquote
569
+ when "default"
570
+ @@default = line[@@word]
571
+ else
572
+ warn "unknown directive: :#{word} (ignored)"
573
+ end
574
+ end
575
+
576
+ # Helper method called by methods that read scripts -- remove a word from the front of a line
577
+
578
+ def Eliza.detachWord(line)
579
+ word = line[@@word] # pattern matches the first word
580
+ if line.index(" ")
581
+ line.slice!(0..line.index(" ")) # delete up to end of the word
582
+ line.lstrip! # in case there are extra spaces after word
583
+ else
584
+ line.slice!(0..-1) # line just had the one word
585
+ end
586
+ return word
587
+ end
588
+
589
+ # Helper method called by Eliza.load.
590
+ # Check each pattern's regular expression and replace var names by alternation
591
+ # constructs. If the script specified a default rule name look up that
592
+ # rule and save it as the default.
593
+
594
+ def Eliza.compileRules
595
+ @@rules.each do |key,val|
596
+ a = val.patterns()
597
+ a.each do |p|
598
+ expr = p.regexp.inspect
599
+ expr.gsub!(/\$\w+/) { |x| @@vars[x].join("|") }
600
+ p.regexp = eval(expr)
601
+ end
602
+ end
603
+ if @@default.class == String
604
+ @@default = @@rules[@@default]
605
+ end
606
+ end
607
+
608
+ # Parse the rules in +filename+ and store them in module-level class variables (<tt>@@rules</tt> and related tables). If +filename+ is a symbol it
609
+ # refers to a script file in the ElizaLab data directory; if it's a string it should
610
+ # be the name of a file in the current directory.
611
+ #--
612
+ # Strategy: use a local var named 'rule', initially set to nil. New rules start with a single word
613
+ # at the start of a line. When such a line is found in the input file, create a
614
+ # new Rule object and store it in 'rule'. Subsequent lines that are part of the
615
+ # current rule (lines that contain regular expressions or strings) are added to
616
+ # the current Rule object. Directives indicate the end of a rule, so 'rule' is reset
617
+ # to nil when a directive is seen.
618
+
619
+ def Eliza.load(filename)
620
+ begin
621
+ Eliza.clear
622
+ rule = nil
623
+ if filename.class == Symbol
624
+ filename = File.join(@@elizaDirectory, filename.to_s + ".txt")
625
+ end
626
+ File.open(filename).each do |line|
627
+ line.strip!
628
+ next if line.empty? || line[0] == ?#
629
+ if line[0] == ?:
630
+ Eliza.parseDirective(line)
631
+ rule = nil
632
+ else
633
+ if line =~ @@iword
634
+ rulename, priority = line.split
635
+ rule = priority ? Rule.new(rulename, priority.to_i) : Rule.new(rulename)
636
+ @@rules[rule.key] = rule
637
+ elsif rule.nil?
638
+ warn "missing rule name? unexpected input '#{line}'"
639
+ elsif line[0] == ?/
640
+ if line[-1] == ?/
641
+ rule.addPattern(line)
642
+ else
643
+ warn "badly formed expression (missing /): '#{line}'"
644
+ end
645
+ elsif line[0] == ?"
646
+ if line[-1] == ?"
647
+ rule.addReassembly(line.unquote)
648
+ else
649
+ warn "badly formed string (missing \"): '#{line}'"
650
+ end
651
+ elsif line[0] == ?@
652
+ rule.addReassembly(line)
653
+ else
654
+ warn "unexpected line in rule for #{rule.key}: '#{line}'"
655
+ end
656
+ end
500
657
  end
501
- File.open(filename).each do |line|
502
- line.strip!
503
- next if line.empty? || line[0] == ?#
504
- if line[0] == ?:
505
- Eliza.parseDirective(line)
506
- rule = nil
507
- else
508
- if line =~ @@iword
509
- rulename, priority = line.split
510
- rule = priority ? Rule.new(rulename, priority.to_i) : Rule.new(rulename)
511
- @@rules[rule.key] = rule
512
- elsif rule.nil?
513
- warn "missing rule name? unexpected input '#{line}'"
514
- elsif line[0] == ?/
515
- if line[-1] == ?/
516
- rule.addPattern(line)
517
- else
518
- warn "badly formed expression (missing /): '#{line}'"
519
- end
520
- elsif line[0] == ?"
521
- if line[-1] == ?"
522
- rule.addReassembly(line.unquote)
523
- else
524
- warn "badly formed string (missing \"): '#{line}'"
525
- end
526
- elsif line[0] == ?@
527
- rule.addReassembly(line)
528
- else
529
- warn "unexpected line in rule for #{rulename}: '#{line}'"
530
- end
531
- end
532
- end
533
- Eliza.compileRules
534
- @@script = filename
535
- rescue
536
- puts "Eliza: Error processing #{filename}: #{$!}"
537
- return false
538
- end
539
- return true
540
- end
541
-
542
- def Eliza.dump
543
- Eliza.clear unless defined? @@default
544
- puts "Script: #{@@script}"
545
- print "Starts:\n "; p @@starts
546
- print "Stops:\n "; p @@stops
547
- print "Vars:\n "; p @@vars
548
- print "Aliases:\n "; p @@aliases
549
- print "Pre:\n "; p @@pre
550
- print "Post:\n "; p @@post
551
- print "Default:\n "; p @@default
552
- print "Queue:\n "; p @@queue.collect { |r| r.key }
553
- puts
554
- @@rules.each { |key,val| puts val }
555
- return nil
556
- end
557
-
558
- def Eliza.info
559
- Eliza.clear unless defined? @@default
560
-
561
- words = Hash.new
562
- npatterns = 0
563
-
564
- @@rules.each do |k,r|
565
- words[k] = 1 unless k[0] == ?$
566
- r.patterns.each do |p|
567
- npatterns += 1
568
- p.cleanRegexp.split.each do |w|
658
+ Eliza.compileRules
659
+ @@script = filename
660
+ rescue
661
+ puts "Eliza: Error processing #{filename}: #{$!}"
662
+ return false
663
+ end
664
+ return true
665
+ end
666
+
667
+ # Print a complete description of all the rules from the current script.
668
+
669
+ def Eliza.dump
670
+ Eliza.clear unless defined? @@default
671
+ puts "Script: #{@@script}"
672
+ print "Starts:\n "; p @@starts
673
+ print "Stops:\n "; p @@stops
674
+ print "Vars:\n "; p @@vars
675
+ print "Aliases:\n "; p @@aliases
676
+ print "Pre:\n "; p @@pre
677
+ print "Post:\n "; p @@post
678
+ print "Default:\n "; p @@default
679
+ print "Queue:\n "; p @@queue.collect { |r| r.key }
680
+ puts
681
+ @@rules.each { |key,val| puts val }
682
+ return nil
683
+ end
684
+
685
+ # Print a summary description of the current script, with the number of rules
686
+ # and sentence patterns and a list of key words from all the rules.
687
+
688
+ def Eliza.info
689
+ Eliza.clear unless defined? @@default
690
+
691
+ words = Hash.new
692
+ npatterns = 0
693
+
694
+ @@rules.each do |k,r|
695
+ words[k] = 1 unless k[0] == ?$
696
+ r.patterns.each do |p|
697
+ npatterns += 1
698
+ p.cleanRegexp.split.each do |w|
569
699
  Eliza.saveWords(w, words)
570
- end
571
- end
572
- end
573
-
574
- @@aliases.keys.each do |k|
575
- Eliza.saveWords(k, words)
576
- end
577
-
578
- puts "Script: #{@@script}"
579
- puts " #{@@rules.size} rules with #{npatterns} sentence patterns"
580
- puts " #{words.length} key words: #{words.keys.sort.join(', ')}"
581
- end
582
-
583
- def Eliza.reset
584
- @@rules.each do |k, r|
585
- r.patterns.each { |p| p.reset }
586
- end
587
- return true
588
- end
589
-
590
- def Eliza.saveWords(s, hash)
700
+ end
701
+ end
702
+ end
703
+
704
+ @@aliases.keys.each do |k|
705
+ Eliza.saveWords(k, words)
706
+ end
707
+
708
+ puts "Script: #{@@script}"
709
+ puts " #{@@rules.size} rules with #{npatterns} sentence patterns"
710
+ puts " #{words.length} key words: #{words.keys.sort.join(', ')}"
711
+ end
712
+
713
+ # Helper method called by Eliza.info -- don't include common words like "the" or "a"
714
+ # in list of key words, and clean up regular expression symbols. Put the remaining
715
+ # items in the hash.
716
+
717
+ def Eliza.saveWords(s, hash) # :nodoc:
591
718
  return if ["a","an","in","of","the"].include?(s)
592
719
  s.gsub! "(", ""
593
720
  s.gsub! ")", ""
594
721
  s.gsub! ".*", ""
595
722
  s.gsub! "?", ""
596
723
  return if s.length == 0
597
- s.split(/\|/).each { |w| hash[w.downcase] = 1 }
598
- end
599
-
600
- def Eliza.run
601
- Eliza.clear unless defined? @@default
602
- puts @@starts[rand(@@starts.length)] if ! @@starts.empty?
603
- loop do
724
+ s.split(/\|/).each { |w| hash[w.downcase] = 1 }
725
+ end
726
+
727
+ # Call +reset+ on every pattern of every rule in the current script. The script itself stays loaded.
728
+
729
+ def Eliza.reset
730
+ @@rules.each do |k, r|
731
+ r.patterns.each { |p| p.reset }
732
+ end
733
+ return true
734
+ end
735
+
736
+ # Top level method to carry on a conversation. Starts a read-eval-print loop,
737
+ # stopping when the user types "bye" or "quit". For each sentence, call
738
+ # Eliza.transform to find a rule that applies to the sentence and print the
739
+ # response.
740
+
741
+ def Eliza.run
742
+ Eliza.clear unless defined? @@default
743
+ puts @@starts[rand(@@starts.length)] if ! @@starts.empty?
744
+ loop do
604
745
  s = readline(" H: ", true)
605
- return if s.nil?
606
- s.chomp!
607
- next if s.empty?
608
- if s == "bye" || s == "quit"
609
- puts @@stops[rand(@@stops.length)] if ! @@stops.empty?
610
- return
611
- end
612
- puts " C: " + Eliza.transform(s)
613
- end
614
- end
746
+ return if s.nil?
747
+ s.chomp!
748
+ next if s.empty?
749
+ if s == "bye" || s == "quit"
750
+ puts @@stops[rand(@@stops.length)] if ! @@stops.empty?
751
+ return
752
+ end
753
+ puts " C: " + Eliza.transform(s)
754
+ end
755
+ end
615
756
 
616
757
  end # class Eliza
617
758
 
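The variable substitution done by Eliza.compileRules can be tried in isolation. The block below is a minimal sketch, not the RubyLabs code: the +vars+ hash stands in for <tt>@@vars</tt>, +expand_vars+ is a hypothetical helper name, the sample pattern and the parentheses around <tt>$family</tt> are assumptions about how a script might be written, and it builds the expression with <tt>Regexp.new</tt> where the method in this diff uses +inspect+ and +eval+.

```ruby
# Hypothetical stand-in for Eliza's @@vars table: variable name => list of words.
vars = { "$family" => %w[mother father sister brother] }

# Replace every $name in a pattern source with its words joined by "|".
# Raises if the script never defined the variable with an :alias directive.
def expand_vars(source, vars)
  source.gsub(/\$\w+/) do |name|
    words = vars.fetch(name) { raise ArgumentError, "undefined variable #{name}" }
    words.join("|")
  end
end

# Single quotes keep Ruby from treating "$family" as anything special.
source = 'my ($family) (.*)'
regexp = Regexp.new(expand_vars(source, vars), Regexp::IGNORECASE)

m = regexp.match("My mother hates me")
puts m[1]   # the family word that matched
puts m[2]   # the rest of the sentence
```

Building the expression from its source string avoids evaluating script text as Ruby code, at the cost of having to pass flags such as <tt>Regexp::IGNORECASE</tt> explicitly.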
@@ -619,31 +760,43 @@ words (a|b|c), and variable names ($x) if they aren't already there.
619
760
  # the ElizaLab module
620
761
 
621
762
  @@verbose = false
622
- @@elizaDirectory = File.join(File.dirname(__FILE__), '..', 'data', 'eliza')
623
- @@pre = Dictionary.new
624
- @@post = Dictionary.new
625
- @@rules = Dictionary.new
626
- @@word = /[a-z\-$']+/i # pattern for a "word" in the input language
627
- @@iword = /^[a-z\-$']+/i # same, but must be the first item on the line
628
- @@var = /\$\d+/ # variable name in reassembly string
629
-
763
+ @@elizaDirectory = File.join(File.dirname(__FILE__), '..', 'data', 'eliza')
764
+ @@pre = Dictionary.new
765
+ @@post = Dictionary.new
766
+ @@rules = Dictionary.new
767
+ @@word = /[a-z\-$']+/i # pattern for a "word" in the input language
768
+ @@iword = /^[a-z\-$']+/i # same, but must be the first item on the line
769
+ @@var = /\$\d+/ # variable name in reassembly string
770
+
630
771
  end # module ElizaLab
631
772
 
632
773
  end # module RubyLabs
633
774
 
634
- class String
635
-
636
775
  =begin rdoc
637
- A useful operation on strings -- call +s.unquote+ to remove double quotes from
638
- the beginning and end of string +s+.
776
+
777
+ == String
778
+
779
+ The code for the ELIZA lab (elizalab.rb) has the definition of a new method for strings
780
+ that removes double quotes from the beginning and end of a string.
639
781
  =end
640
782
 
641
- def unquote
642
- if self[0] == ?" && self[-1] == ?"
643
- return self.slice(1..-2)
644
- else
645
- return self
646
- end
647
- end
783
+ class String
784
+
785
+ # Call +s.unquote+ to return a copy of string +s+ with double quotes removed from
786
+ # the beginning and end.
787
+ #
788
+ # Example:
789
+ # >> s = '"Is it raining?"'
790
+ # => "\"Is it raining?\""
791
+ # >> s.unquote
792
+ # => "Is it raining?"
793
+
794
+ def unquote
795
+ if self[0] == ?" && self[-1] == ?"
796
+ return self.slice(1..-2)
797
+ else
798
+ return self
799
+ end
800
+ end
648
801
 
649
802
  end
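To see how +unquote+ treats strings that are not fully quoted, the method can be run on its own. This is a minimal sketch with the same logic as the method in the diff, rewritten with <tt>start_with?</tt> and <tt>end_with?</tt> (an assumption made for readability; the diff compares against the <tt>?"</tt> character literal). The sample strings are made up for illustration.

```ruby
class String
  # Strip one pair of surrounding double quotes; leave any other string alone.
  def unquote
    if start_with?('"') && end_with?('"')
      slice(1..-2)
    else
      self
    end
  end
end

puts '"Is it raining?"'.unquote   # both quotes present, so both are removed
puts '"unterminated'.unquote      # only one quote, so the string is unchanged
puts 'plain'.unquote              # no quotes, so the string is unchanged
```

When the quotes are missing or unbalanced, the receiver itself is returned, not a copy, so callers that go on to modify the result in place would also modify the original string.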