rltk 2.0.0 → 2.1.0

data/README.md CHANGED
@@ -170,6 +170,44 @@ The default starting symbol of the grammar is the left-hand side of the first pr
170
170
 
171
171
  **Make sure you call `finalize` at the end of your parser definition, and only call it once.**
172
172
 
173
+ ### Shortcuts
174
+
175
+ RLTK provides several shortcuts for common grammar constructs. Right now these shortcuts include the {RLTK::Parser.empty_list} and {RLTK::Parser.nonempty_list} methods. An empty list is a list that may contain 0, 1, or more elements, with a given token separating each element. A non-empty list contains **at least** 1 element.
176
+
177
+ This example shows how these shortcuts may be used to define a list of integers separated by a `:COMMA` token:
178
+
179
+ class ListParser < RLTK::Parser
180
+ nonempty_list(:int_list, :INT, :COMMA)
181
+
182
+ finalize
183
+ end
184
+
185
+ If you wanted to define a list of floats or integers you could define your parser like this:
186
+
187
+ class ListParser < RLTK::Parser
188
+ nonempty_list(:mixed_list, [:INT, :FLOAT], :COMMA)
189
+
190
+ finalize
191
+ end
192
+
193
+ A list may also contain multiple tokens between the separator:
194
+
195
+ class ListParser < RLTK::Parser
196
+ nonempty_list(:foo_bar_list, 'FOO BAR', :COMMA)
197
+
198
+ finalize
199
+ end
200
+
201
+ Lastly, you can mix all of these features together:
202
+
203
+ class ListParser < RLTK::Parser
204
+ nonempty_list(:foo_list, ['FOO BAR', 'FOO BAZ+'], :COMMA)
205
+
206
+ finalize
207
+ end
208
+
209
+ The productions generated by these shortcuts will always evaluate to an array. In the first two examples above the productions will produce a 1-D array containing the values of the `INT` or `FLOAT` tokens. In the last two examples the productions `foo_bar_list` and `foo_list` will produce 2-D arrays where the top-level array is composed of tuples corresponding to the values of `FOO`, and `BAR` or one or more `BAZ`s.
210
+
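To make the array shapes described above concrete, here is a plain-Ruby sketch that does not use RLTK at all. The helper `split_list` is hypothetical and is not part of the library; it only mimics the shape of the values the list productions are described as producing, assuming each token has already been reduced to its value.

```ruby
# Hypothetical helper, not part of RLTK: groups token values between
# separators to mimic the shape of a list production's result.
def split_list(values, separator)
  groups = [[]]

  values.each do |v|
    if v == separator
      groups << []      # A separator starts a new list element.
    else
      groups.last << v  # Any other value belongs to the current element.
    end
  end

  # Single-token elements give a 1-D array; multi-token elements stay
  # as tuples, giving a 2-D array.
  groups.all? { |g| g.length == 1 } ? groups.map(&:first) : groups
end

# Shape of something like :int_list (one token per element).
p split_list([1, :COMMA, 2, :COMMA, 3], :COMMA)

# Shape of something like :foo_bar_list (two tokens per element).
p split_list([:foo1, :bar1, :COMMA, :foo2, :bar2], :COMMA)
```

The first call yields a flat array and the second an array of pairs, which matches the 1-D versus 2-D distinction drawn in the README text.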
173
211
  ### Precedence and Associativity
174
212
 
175
213
  To help you remove ambiguity from your grammars RLTK lets you assign precedence and associativity information to terminal symbols. Productions then get assigned precedence and associativity based on either the last terminal symbol on the right-hand side of the production, or an optional parameter to the {RLTK::Parser.production} or {RLTK::Parser.clause} methods. When an {RLTK::Parser} encounters a shift/reduce error it will attempt to resolve it using the following rules:
@@ -255,6 +293,12 @@ The {RLTK::Parser.parse} and {RLTK::Parser#parse} methods also have several opti
255
293
 
256
294
  * **verbose** - Value should be `true`, `false`, an `IO` object, or a file name. Default value is `false`. If any value other than `false` or `nil` is specified, a detailed description of the parser's actions is printed to $stdout, the provided `IO` object, or the specified file as it parses the input.
257
295
 
296
+ ### Parse Trees
297
+
298
+ The above section briefly mentions the *parse_tree* option. So that this neat feature doesn't get lost in the rest of the documentation, here is the tree generated by the Kazoo parser from Chapter 7 of the tutorial when it parses the line `def fib(a) if a < 2 then 1 else fib(a-1) + fib(a-2);`:
299
+
300
+ ![Kazoo parse tree.](https://github.com/chriswailes/RLTK/raw/master/resources/simple_tree.png)
301
+
258
302
  ### Parsing Exceptions
259
303
 
260
304
  Calls to {RLTK::Parser.parse} may raise one of four exceptions:
@@ -268,7 +312,7 @@ Calls to {RLTK::Parser.parse} may raise one of four exceptions:
268
312
 
269
313
  **Warning: this is the least tested feature of RLTK. If you encounter any problems while using it, please let me know so I can fix any bugs as soon as possible.**
270
314
 
271
- When an RLTK parser encounters a token for which there are no more valid tokens (and it is on the last parse stack / possible parse-tree path) it will enter error handling mode. In this mode the parser pops states and input off of the parse stack (the parser is a pushdown automaton after all) until it finds a state that has a shift action for the `ERROR` terminal. A dummy `ERROR` terminal is then placed onto the parse stack and the shift action is taken. This error token will have the position information of the token that caused the parser to enter error handling mode.
315
+ When an RLTK parser encounters a token for which there are no more valid actions (and it is on the last parse stack / possible parse-tree path) it will enter error handling mode. In this mode the parser pops states and input off of the parse stack (the parser is a pushdown automaton after all) until it finds a state that has a shift action for the `ERROR` terminal. A dummy `ERROR` terminal is then placed onto the parse stack and the shift action is taken. This error token will have the position information of the token that caused the parser to enter error handling mode.
272
316
 
273
317
  If the input (including the `ERROR` token) can be reduced immediately the associated error handling proc is evaluated and we continue parsing. If the parser can't immediately reduce it will begin shifting tokens onto the input stack. This may cause the parser to enter a state in which it again has no valid actions for an input. When this happens it enters error handling mode again and pops states and input off of the stack until it reaches an error state again. In this way it searches for the first substring after the error occurred for which it can resume parsing.
274
318
 
@@ -349,7 +393,7 @@ Why did I fork ruby-llvm, and why might you want to use the RLTK bindings over r
349
393
 
350
394
  * **Cleaner Codebase** - The RLTK bindings present a cleaner interface to the LLVM library by conforming to more standard Ruby programming practices, providing better abstractions and cleaner inheritance hierarchies, overloading constructors and other methods properly, and performing type checking on objects to better aid in debugging.
351
395
  * **Documentation** - RLTK's bindings provide slightly better documentation.
352
- * **Completeness** - The RLTK bindings provide several features that are missing from the ruby-llvm project. These include the ability to initialize LLVM for architectures besides x86 (RLTK supports all architectures supported by LLVM) and the presence of all of LLVM's optimization passes.
396
+ * **Completeness** - The RLTK bindings provide several features that are missing from the ruby-llvm project. These include the ability to initialize LLVM for architectures besides x86 (RLTK supports all architectures supported by LLVM), the presence of all of LLVM's optimization passes, and the ability to print the LLVM IR representation of modules and values to files.
353
397
  * **Ease of Use** - Several features have been added to make generating code easier.
354
398
  * **Speed** - The RLTK bindings are ever so slightly faster due to avoiding unnecessary FFI calls.
355
399
 
data/Rakefile CHANGED
@@ -137,7 +137,16 @@ task :gen_bindings do
137
137
  headers = [
138
138
  'llvm-ecb.h',
139
139
 
140
- 'llvm-ecb/support.h'
140
+ 'llvm-ecb/asm.h',
141
+ 'llvm-ecb/module.h',
142
+ 'llvm-ecb/support.h',
143
+
144
+ 'llvm-ecb/value.h',
145
+ 'llvm-ecb/target.h'
146
+
147
+ # This causes value.h to not be included.
148
+ #'llvm-ecb/target.h',
149
+ #'llvm-ecb/value.h'
141
150
  ]
142
151
 
143
152
  begin
data/lib/rltk/ast.rb CHANGED
@@ -272,11 +272,52 @@ module RLTK # :nodoc:
272
272
  @notes.delete(key)
273
273
  end
274
274
 
275
- # An iterator over the node's children.
275
+ # This method is a simple wrapper around Marshal.dump, and is used
276
+ # to serialize an AST. You can use Marshal.load to reconstruct a
277
+ # serialized AST.
278
+ #
279
+ # @param [nil, IO, String] dest Where the serialized version of the AST will end up. If nil, this method will return the AST as a string.
280
+ # @param [Fixnum] limit Recursion depth. If -1 is specified there is no limit on the recursion depth.
281
+ #
282
+ # @return [void, String] String if *dest* is nil, void otherwise.
283
+ def dump(dest = nil, limit = -1)
284
+ case dest
285
+ when nil then Marshal.dump(self, limit)
286
+ when String then File.open(dest, 'wb') { |f| Marshal.dump(self, f, limit) }
287
+ when IO then Marshal.dump(self, dest, limit)
288
+ else raise TypeError, "AST#dump expects nil, a String, or an IO object for the dest parameter."
289
+ end
290
+ end
291
+
292
+ # An iterator over the node's children. The AST may be traversed in
293
+ # the following orders:
294
+ #
295
+ # * Pre-order (:pre)
296
+ # * Post-order (:post)
297
+ # * Level-order (:level)
276
298
  #
277
299
  # @return [void]
278
- def each
279
- self.children.each { |c| yield c }
300
+ def each(order = :pre, &block)
301
+ case order
302
+ when :pre
303
+ yield self
304
+
305
+ self.children.compact.each { |c| c.each(:pre, &block) }
306
+
307
+ when :post
308
+ self.children.compact.each { |c| c.each(:post, &block) }
309
+
310
+ yield self
311
+
312
+ when :level
313
+ level_queue = [self]
314
+
315
+ while node = level_queue.shift
316
+ yield node
317
+
318
+ level_queue += node.children.compact
319
+ end
320
+ end
280
321
  end
281
322
 
282
323
  # Tests to see if a note named *key* is present at this node.
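The three traversal orders accepted by the new `each` can be tried out on a small stand-in tree. This is a minimal sketch: `ToyNode`, `visit_order`, and `sample_tree` are hypothetical names used only for illustration, and the `each` body simply mirrors the logic shown in the hunk above rather than calling RLTK's real `ASTNode`.

```ruby
# Stand-in node type for illustration; RLTK's ASTNode is not used here.
class ToyNode
  attr_reader :name, :children

  def initialize(name, *children)
    @name     = name
    @children = children
  end

  # Mirrors the traversal logic from the diff: nil children are skipped.
  def each(order = :pre, &block)
    case order
    when :pre
      yield self
      children.compact.each { |c| c.each(:pre, &block) }

    when :post
      children.compact.each { |c| c.each(:post, &block) }
      yield self

    when :level
      queue = [self]

      while (node = queue.shift)
        yield node
        queue += node.children.compact
      end
    end
  end
end

# Collects node names in the order they are visited.
def visit_order(root, order)
  names = []
  root.each(order) { |n| names << n.name }
  names
end

# a has children b and c; b has one child d and one nil slot.
def sample_tree
  ToyNode.new(:a, ToyNode.new(:b, ToyNode.new(:d), nil), ToyNode.new(:c))
end

p visit_order(sample_tree, :pre)   # root before children
p visit_order(sample_tree, :post)  # children before root
p visit_order(sample_tree, :level) # breadth-first
```

Note that the `nil` child is silently skipped in all three orders because of the `compact` call, which matters once constructors pad missing arguments with `nil`.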
@@ -300,6 +341,10 @@ module RLTK # :nodoc:
300
341
  @notes = Hash.new()
301
342
  @parent = nil
302
343
 
344
+ # Pad out the objects array with nil values.
345
+ max_args = self.class.value_names.length + self.class.child_names.length
346
+ objects.fill(nil, objects.length...max_args)
347
+
303
348
  pivot = self.class.value_names.length
304
349
 
305
350
  self.values = objects[0...pivot]
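The padding step added to the constructor relies on `Array#fill` with a range, which extends the array when the range runs past its end and does nothing when the range is empty. The sketch below isolates that behaviour. `pad_objects` is a hypothetical helper, and the way it slices out the children is an assumption, since the hunk only shows the `values` slice.

```ruby
# Hypothetical helper mirroring the padding step from the diff.
# value_count and child_count stand in for value_names.length and
# child_names.length.
def pad_objects(objects, value_count, child_count)
  max_args = value_count + child_count

  # Work on a copy so the caller's array is left untouched.
  padded = objects.dup
  padded.fill(nil, padded.length...max_args)

  values   = padded[0...value_count]
  children = padded[value_count..-1] # Assumed split; not shown in the hunk.

  [values, children]
end

# One value supplied, two children missing: both children become nil.
p pad_objects([1], 1, 2)

# All arguments supplied: the range is empty, so nothing is padded.
p pad_objects([1, :left, :right], 1, 2)
```

Padding up front means a node can be constructed with trailing arguments omitted, and the missing slots read back as `nil` instead of being absent.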
data/lib/rltk/cfg.rb CHANGED
@@ -96,11 +96,13 @@ module RLTK # :nodoc:
96
96
  # @return [void]
97
97
  def add_production(production)
98
98
  @productions_sym[production.lhs] << (@productions_id[production.id] = production)
99
+
100
+ production
99
101
  end
100
102
 
101
103
  # Sets the EBNF callback to *callback*.
102
104
  #
103
- # @param [Proc] callback A Proc object to be called when EBNF operators are expanded.
105
+ # @param [Proc] callback A Proc object to be called when EBNF operators are expanded and list productions are added.
104
106
  #
105
107
  # @return [void]
106
108
  def callback(&callback)
@@ -116,9 +118,7 @@ module RLTK # :nodoc:
116
118
  #
117
119
  # @return [Production]
118
120
  def clause(expression)
119
- if not @curr_lhs
120
- raise GrammarError, 'CFG.clause called outside of CFG.production block.'
121
- end
121
+ raise GrammarError, 'CFG#clause called outside of CFG#production block.' if not @curr_lhs
122
122
 
123
123
  lhs = @curr_lhs.to_sym
124
124
  rhs = Array.new
@@ -144,17 +144,10 @@ module RLTK # :nodoc:
144
144
 
145
145
  rhs <<
146
146
  case ttype1
147
- when :'?'
148
- self.get_question(tvalue0)
149
-
150
- when :*
151
- self.get_star(tvalue0)
152
-
153
- when :+
154
- self.get_plus(tvalue0)
155
-
156
- else
157
- tvalue0
147
+ when :'?' then self.get_question(tvalue0)
148
+ when :* then self.get_star(tvalue0)
149
+ when :+ then self.get_plus(tvalue0)
150
+ else tvalue0
158
151
  end
159
152
  else
160
153
  rhs << tvalue0
@@ -174,6 +167,34 @@ module RLTK # :nodoc:
174
167
  return production
175
168
  end
176
169
 
170
+ # This method adds the necessary productions for empty lists to the
171
+ # grammar. These productions are named `symbol`, `symbol + '_prime'`
172
+ # and `symbol + '_elements'`
173
+ #
174
+ # @param [Symbol] symbol The name of the production to add.
175
+ # @param [Array<String>] list_elements An array of expressions that may appear in the list.
176
+ # @param [Symbol] separator The list separator symbol.
177
+ #
178
+ # @return [void]
179
+ def empty_list_production(symbol, list_elements, separator)
180
+ # Add the items for the following productions:
181
+ #
182
+ # symbol: | symbol_prime
183
+
184
+ prime = symbol.to_s + '_prime'
185
+
186
+ # 1st Production
187
+ production = self.production(symbol, '').first
188
+ @callback.call(production, :elp, :first)
189
+
190
+ # 2nd Production
191
+ production = self.production(symbol, prime.to_s).first
192
+ @callback.call(production, :elp, :second)
193
+
194
+ self.nonempty_list(prime, list_elements, separator)
195
+ end
196
+ alias :empty_list :empty_list_production
197
+
177
198
  # @param [Symbol, Array<Symbol>] sentence Sentence to find the *first set* for.
178
199
  #
179
200
  # @return [Array<Symbol>] The *first set* for the given sentence.
@@ -310,11 +331,11 @@ module RLTK # :nodoc:
310
331
  # token_plus: token | token token_plus
311
332
 
312
333
  # 1st production
313
- self.add_production(production = Production.new(self.next_id, new_symbol, [symbol]))
334
+ production = self.add_production(Production.new(self.next_id, new_symbol, [symbol]))
314
335
  @callback.call(production, :+, :first)
315
336
 
316
337
  # 2nd production
317
- self.add_production(production = Production.new(self.next_id, new_symbol, [symbol, new_symbol]))
338
+ production = self.add_production(Production.new(self.next_id, new_symbol, [new_symbol, symbol]))
318
339
  @callback.call(production, :+, :second)
319
340
 
320
341
  # Add the new symbol to the list of nonterminals.
@@ -338,11 +359,11 @@ module RLTK # :nodoc:
338
359
  # nonterm_question: | nonterm
339
360
 
340
361
  # 1st (empty) production.
341
- self.add_production(production = Production.new(self.next_id, new_symbol, []))
362
+ production = self.add_production(Production.new(self.next_id, new_symbol, []))
342
363
  @callback.call(production, :'?', :first)
343
364
 
344
365
  # 2nd production
345
- self.add_production(production = Production.new(self.next_id, new_symbol, [symbol]))
366
+ production = self.add_production(Production.new(self.next_id, new_symbol, [symbol]))
346
367
  @callback.call(production, :'?', :second)
347
368
 
348
369
  # Add the new symbol to the list of nonterminals.
@@ -366,11 +387,11 @@ module RLTK # :nodoc:
366
387
  # token_star: | token token_star
367
388
 
368
389
  # 1st (empty) production
369
- self.add_production(production = Production.new(self.next_id, new_symbol, []))
390
+ production = self.add_production(Production.new(self.next_id, new_symbol, []))
370
391
  @callback.call(production, :*, :first)
371
392
 
372
393
  # 2nd production
373
- self.add_production(production = Production.new(self.next_id, new_symbol, [symbol, new_symbol]))
394
+ production = self.add_production(Production.new(self.next_id, new_symbol, [new_symbol, symbol]))
374
395
  @callback.call(production, :*, :second)
375
396
 
376
397
  # Add the new symbol to the list of nonterminals.
@@ -385,6 +406,54 @@ module RLTK # :nodoc:
385
406
  @production_counter += 1
386
407
  end
387
408
 
409
+ # This method adds the necessary productions for non-empty lists to
410
+ # the grammar. These productions are named `symbol` and
411
+ # `symbol + '_elements'`
412
+ #
413
+ # @param [Symbol] symbol The name of the production to add.
414
+ # @param [String, Symbol, Array<String>] list_elements Expression(s) that may appear in the list.
415
+ # @param [Symbol] separator The list separator symbol.
416
+ #
417
+ # @return [void]
418
+ def nonempty_list_production(symbol, list_elements, separator)
419
+ # Add the items for the following productions:
420
+ #
421
+ # symbol_elements: list_elements.join('|')
422
+ #
423
+ # symbol: symbol_elements | symbol separator symbol_elements
424
+
425
+ if list_elements.is_a?(String) or list_elements.is_a?(Symbol)
426
+ list_elements = [list_elements.to_s]
427
+
428
+ elsif list_elements.is_a?(Array)
429
+ if list_elements.empty?
430
+ raise ArgumentError, 'Parameter list_elements must not be empty.'
431
+ else
432
+ list_elements.map! { |el| el.to_s }
433
+ end
434
+
435
+ else
436
+ raise ArgumentError, 'Parameter list_elements must be a String, Symbol, or array of Strings and Symbols.'
437
+ end
438
+
439
+ el_symbol = (symbol.to_s + '_elements').to_sym
440
+
441
+ # 1st Production
442
+ production = self.production(symbol, el_symbol.to_s).first
443
+ @callback.call(production, :nelp, :first)
444
+
445
+ # 2nd Production
446
+ production = self.production(symbol, "#{symbol} #{separator} #{el_symbol}").first
447
+ @callback.call(production, :nelp, :second)
448
+
449
+ # 3rd Productions
450
+ list_elements.each do |el|
451
+ production = self.production(el_symbol, el).first
452
+ @callback.call(production, :nelp, :third)
453
+ end
454
+ end
455
+ alias :nonempty_list :nonempty_list_production
456
+
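The productions these two list helpers add can be listed without building a real grammar. The sketch below is a plain-Ruby approximation: `nonempty_list_rules` and `empty_list_rules` are hypothetical names that return `[lhs, rhs]` string pairs in the same order as the code above, instead of calling `production` or invoking the callback.

```ruby
# Hypothetical helper: lists the productions for a non-empty list as
# [lhs, rhs] pairs rather than adding them to a grammar.
def nonempty_list_rules(symbol, list_elements, separator)
  elements = Array(list_elements).map(&:to_s)
  raise ArgumentError, 'list_elements must not be empty.' if elements.empty?

  el_symbol = "#{symbol}_elements"

  rules = [
    [symbol.to_s, el_symbol],                             # symbol: symbol_elements
    [symbol.to_s, "#{symbol} #{separator} #{el_symbol}"]  # symbol: symbol SEP symbol_elements
  ]

  # One production per alternative list element.
  rules + elements.map { |el| [el_symbol, el] }
end

# Hypothetical helper: an empty list is either nothing or a non-empty
# list named symbol_prime.
def empty_list_rules(symbol, list_elements, separator)
  prime = "#{symbol}_prime"

  [[symbol.to_s, ''], [symbol.to_s, prime]] +
    nonempty_list_rules(prime, list_elements, separator)
end

nonempty_list_rules(:mixed_list, [:INT, :FLOAT], :COMMA).each { |lhs, rhs| puts "#{lhs}: #{rhs}" }
empty_list_rules(:args, :INT, :COMMA).each { |lhs, rhs| puts "#{lhs}: #{rhs}" }
```

Because the empty-list helper delegates to the non-empty one with the `_prime` name, the element productions in this sketch end up under `args_prime_elements`.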
388
457
  # @return [Array<Symbol>] All nonterminal symbols used in the grammar's definition.
389
458
  def nonterms
390
459
  @nonterms.keys
@@ -74,23 +74,47 @@ module RLTK::CG # :nodoc:
74
74
 
75
75
  # List of architectures supported by LLVM.
76
76
  ARCHS = [
77
- :ARM,
78
77
  :Alpha,
78
+ :ARM,
79
79
  :Blackfin,
80
80
  :CBackend,
81
81
  :CellSPU,
82
82
  :CppBackend,
83
83
  :MBlaze,
84
- :MSP430,
85
84
  :Mips,
86
- :PTX,
85
+ :MSP430,
87
86
  :PowerPC,
87
+ :PTX,
88
88
  :Sparc,
89
89
  :SystemZ,
90
- :XCore,
90
+ :X86,
91
+ :XCore
92
+ ]
93
+
94
+ # List of assembly parsers.
95
+ ASM_PARSERS = [
96
+ :ARM,
97
+ :MBlaze,
91
98
  :X86
92
99
  ]
93
-
100
+
101
+ # List of assembly printers.
102
+ ASM_PRINTERS = [
103
+ :Alpha,
104
+ :ARM,
105
+ :Blackfin,
106
+ :CellSPU,
107
+ :MBlaze,
108
+ :Mips,
109
+ :MSP430,
110
+ :PowerPC,
111
+ :PTX,
112
+ :Sparc,
113
+ :SystemZ,
114
+ :X86,
115
+ :XCore
116
+ ]
117
+
94
118
  ###########
95
119
  # Methods #
96
120
  ###########
@@ -131,6 +155,14 @@ module RLTK::CG # :nodoc:
131
155
  add_binding("LLVMInitialize#{arch}TargetMC", [], :void)
132
156
  end
133
157
 
158
+ ASM_PARSERS.each do |asm|
159
+ add_binding("LLVMInitialize#{asm}AsmParser", [], :void)
160
+ end
161
+
162
+ ASM_PRINTERS.each do |asm|
163
+ add_binding("LLVMInitialize#{asm}AsmPrinter", [], :void)
164
+ end
165
+
134
166
  add_binding(:LLVMDisposeMessage, [:pointer], :void)
135
167
  end
136
168
  end
@@ -12,6 +12,7 @@
12
12
  require 'rltk/util/abstract_class'
13
13
  require 'rltk/cg/bindings'
14
14
  require 'rltk/cg/pass_manager'
15
+ require 'rltk/cg/target'
15
16
 
16
17
  #######################
17
18
  # Classes and Modules #
@@ -38,6 +39,8 @@ module RLTK::CG # :nodoc:
38
39
  #
39
40
  # @raise [RuntimeError] An error is raised if something went horribly wrong inside LLVM during the creation of this engine.
40
41
  def initialize(mod, &block)
42
+ check_type(mod, Module, 'mod')
43
+
41
44
  block = Proc.new { |ptr, error| Bindings.create_execution_engine_for_module(ptr, mod, error) } if block == nil
42
45
 
43
46
  ptr = FFI::MemoryPointer.new(:pointer)
@@ -47,6 +50,9 @@ module RLTK::CG # :nodoc:
47
50
  if status.zero?
48
51
  @ptr = ptr.read_pointer
49
52
  @module = mod
53
+
54
+ # Associate this engine with the provided module.
55
+ @module.engine = self
50
56
 
51
57
  else
52
58
  errorp = error.read_pointer
@@ -71,18 +77,6 @@ module RLTK::CG # :nodoc:
71
77
  end
72
78
  end
73
79
 
74
- # @return [FunctionPassManager] Function pass manager for this engine.
75
- def function_pass_manager
76
- @function_pass_manager ||= FunctionPassManager.new(self, @module)
77
- end
78
- alias :fpm :function_pass_manager
79
-
80
- # @return [PassManager] Pass manager for this engine.
81
- def pass_manager
82
- @pass_manager ||= PassManager.new(self, @module)
83
- end
84
- alias :pm :pass_manager
85
-
86
80
  # Builds a pointer to a global value.
87
81
  #
88
82
  # @param [GlobalValue] global Value you want a pointer to.
@@ -141,6 +135,11 @@ module RLTK::CG # :nodoc:
141
135
  GenericValue.new(Bindings.run_function_as_main(@ptr, fun, args.length, argv, env))
142
136
  end
143
137
  alias :run_main :run_function_as_main
138
+
139
+ # @return [TargetData] Information about the target architecture for this execution engine.
140
+ def target_data
141
+ TargetData.new(Bindings.get_execution_engine_target_data(@ptr))
142
+ end
144
143
  end
145
144
 
146
145
  # An execution engine that interprets the given code.
@@ -5102,16 +5102,10 @@ module RLTK::CG::Bindings
5102
5102
  # function returns the number of bytes in the instruction or zero if there was
5103
5103
  # no valid instruction.
5104
5104
  #
5105
- # @method disasm_instruction(dc, bytes, bytes_size, pc, out_string, out_string_size)
5106
- # @param [FFI::Pointer(DisasmContextRef)] dc
5107
- # @param [FFI::Pointer(*Uint8T)] bytes
5108
- # @param [Integer] bytes_size
5109
- # @param [Integer] pc
5110
- # @param [String] out_string
5111
- # @param [Integer] out_string_size
5105
+ # @method disasm_instruction()
5112
5106
  # @return [Integer]
5113
5107
  # @scope class
5114
- attach_function :disasm_instruction, :LLVMDisasmInstruction, [:pointer, :pointer, :ulong, :ulong, :string, :ulong], :ulong
5108
+ attach_function :disasm_instruction, :LLVMDisasmInstruction, [], :int
5115
5109
 
5116
5110
  # (Not documented)
5117
5111
  #