rubynode 0.1.1

data/Changelog ADDED
@@ -0,0 +1,15 @@
1
+ Changelog
2
+
3
+ -- 0.1.1
4
+
5
+ * RubyNode is available as gem now
6
+ * Added the necessary Ruby source files for out of the box 1.8.5 support
7
+ * Added support for u1.cfunc / nd_cfnc
8
+ * Removed unused attributes for NODE_CASE and NODE_ZSUPER
9
+ * Documentation is included as HTML now
10
+ * Stricter version check for the included Ruby source
11
+ * Some minor fixes and improvements
12
+
13
+ -- 0.1.0
14
+
15
+ * Initial release
data/README ADDED
@@ -0,0 +1,84 @@
1
+
2
+ RubyNode
3
+ ========
4
+
5
+ RubyNode is a library that allows read-only access to Ruby's internal NODE
6
+ structure. It can retrieve the node trees of methods and procs and it can use
7
+ Ruby's parser to parse Ruby source code strings to node trees.
8
+
9
+ It provides the class RubyNode, which wraps an internal NODE. Trees of RubyNode
10
+ instances can also be transformed into trees of arrays and hashes (similar to
11
+ s-expressions), which are easy to manipulate and work with.
12
+
13
+
14
+ Requirements
15
+ ------------
16
+
17
+ RubyNode is tested with Ruby 1.8.4 and 1.8.5, but it should also work with
18
+ other 1.8 versions and also with 1.9 (but there might be some node types that
19
+ are not fully supported).
20
+
21
+
22
+ Installation
23
+ ------------
24
+
25
+ RubyNode generates some of its C source code from Ruby's source code, because
26
+ the node types and other details differ between Ruby versions.
27
+
28
+ For the official releases of Ruby 1.8.4 and 1.8.5 the needed source files are
29
+ included in the package. To compile RubyNode for any other Ruby version, you
30
+ will need that version's source tarball extracted somewhere.
31
+
32
+ So, for Ruby 1.8.4 and 1.8.5 just run (as root):
33
+
34
+ gem install rubynode
35
+
36
+ Or if you do not use the gem:
37
+
38
+ ruby setup.rb
39
+
40
+ That command will compile the C extension and install all files to their
41
+ default location (to customize the non-gem installation, please see "ruby
42
+ setup.rb --help").
43
+
44
+ For other Ruby versions you need to provide the (absolute) path to the source
45
+ directory for that Ruby version in the environment variable RUBY_SOURCE_DIR.
46
+ Example (as root):
47
+
48
+ RUBY_SOURCE_DIR="/path/to/ruby_source" gem install rubynode
49
+
50
+ Or if you do not use the gem:
51
+
52
+ RUBY_SOURCE_DIR="/path/to/ruby_source" ruby setup.rb
53
+
54
+
55
+ Usage
56
+ -----
57
+
58
+ See doc/index.html and doc/api.html.
59
+
60
+
61
+ Feedback
62
+ --------
63
+
64
+ If you find a bug, think that something doesn't work as it should or have other
65
+ suggestions, then please don't hesitate to contact me (dbatml@gmx.de) and tell
66
+ me about it.
67
+
68
+
69
+ Thanks
70
+ ------
71
+
72
+ I would like to thank Paul Brannan for writing Nodewrap, which inspired me to
73
+ write RubyNode and also gave me some ideas and code.
74
+
75
+
76
+ License
77
+ -------
78
+
79
+ Copyright 2006 Dominik Bathon <dbatml@gmx.de>
80
+
81
+ RubyNode is licensed under the same terms as Ruby.
82
+
83
+ The distribution contains some (extracts from) files from Ruby which are
84
+ Copyright Yukihiro Matsumoto (see those files).
data/doc/api.html ADDED
@@ -0,0 +1,434 @@
1
+ <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
2
+ <html>
3
+ <head>
4
+ <title>RubyNode API</title>
5
+ <link href="style.css" media="all" rel="Stylesheet" type="text/css">
6
+ </head>
7
+ <body>
8
+ <h1>RubyNode <span class="caps">API</span></h1>
9
+
10
+
11
+ <p>RubyNode consists of two parts: a C extension, which provides the core
12
+ functionality, and a Ruby library, which adds higher-level functionality.</p>
13
+
14
+
15
+ <p>To get only the C extension, <code>require "rubynode_ext"</code>; to get the full
16
+ functionality (recommended), just <code>require "rubynode"</code>.</p>
17
+
18
+
19
+ <p>Sections: <a href="#section1">Introduction to Ruby NODEs</a>, <a href="#section2">Accessing the fields using RubyNode</a>, <a href="#section3">Aliases for better readability</a>, <a href="#section4">Getting RubyNodes</a>, <a href="#section5">Higher level methods</a>, <a href="#section6">More examples</a>.</p>
20
+
21
+
22
+ <h2 id="section1">Introduction to Ruby NODEs</h2>
23
+
24
+
25
+ <p>Ruby mainly uses the NODEs to store the <span class="caps">AST</span> (abstract syntax tree) of parsed
26
+ Ruby files, but they are also used for some other things. On the C side, <span class="caps">NODE</span>
27
+ is a struct:</p>
28
+
29
+
30
+ <pre><code>typedef struct RNode {
31
+ unsigned long flags;
32
+ char *nd_file;
33
+ union {
34
+ struct RNode *node;
35
+ ID id;
36
+ VALUE value;
37
+ VALUE (*cfunc)(ANYARGS);
38
+ ID *tbl;
39
+ } u1;
40
+ union {
41
+ struct RNode *node;
42
+ ID id;
43
+ long argc;
44
+ VALUE value;
45
+ } u2;
46
+ union {
47
+ struct RNode *node;
48
+ ID id;
49
+ long state;
50
+ struct global_entry *entry;
51
+ long cnt;
52
+ VALUE value;
53
+ } u3;
54
+ } NODE;
55
+ </code></pre>
56
+
57
+ <p>The important parts are the 3 unions <code>u1</code>, <code>u2</code> and <code>u3</code>, which can store
58
+ values of different types. There is also a <code>nd_file</code> field, which contains the
59
+ name of the file from which this node was parsed and there is a <code>flags</code> field,
60
+ which contains some flags, the node type and the line number (all ORed
61
+ together).</p>
62
+
63
+
64
+ <p>The node types are defined as an enum in C:</p>
65
+
66
+
67
+ <pre><code>enum node_type {
68
+ NODE_METHOD,
69
+ NODE_FBODY,
70
+ NODE_CFUNC,
71
+ NODE_SCOPE,
72
+ NODE_BLOCK,
73
+ ...
74
+ </code></pre>
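A minimal sketch, in plain Ruby, of how a node type and a line number can be ORed into one flags integer and extracted again. The shift and mask constants are assumptions chosen for illustration; the authoritative values are the macros in the node.h of the Ruby version in use. The helper names pack_flags, node_type and node_line are hypothetical and not part of RubyNode.

```ruby
# Illustrative bit layout only; the real shifts and masks come from Ruby's
# node.h and may differ between Ruby versions.
TYPE_SHIFT = 11               # assumed position of the node type bits
TYPE_MASK  = 0xff             # assumed width of the node type (8 bits)
LINE_SHIFT = TYPE_SHIFT + 8   # line number stored above the type bits

# Combine low flag bits, a numeric node type and a line number into one integer.
def pack_flags(low_flags, type, line)
  low_flags | ((type & TYPE_MASK) << TYPE_SHIFT) | (line << LINE_SHIFT)
end

# Recover the numeric node type from a packed flags value.
def node_type(flags)
  (flags >> TYPE_SHIFT) & TYPE_MASK
end

# Recover the line number from a packed flags value.
def node_line(flags)
  flags >> LINE_SHIFT
end

# Type 3 is the position of NODE_SCOPE in the enum excerpt (counting from 0).
flags = pack_flags(0x5, 3, 42)
puts node_type(flags)
puts node_line(flags)
```

With a real RubyNode instance the same information is available directly through the type and line methods, so no manual bit handling is needed.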
75
+
76
+ <h2 id="section2">Accessing the fields using RubyNode</h2>
77
+
78
+
79
+ <p>If you have an instance of RubyNode, you can access all these fields using the
80
+ following methods:</p>
81
+
82
+
83
+ <table>
84
+ <tr>
85
+ <td><code>file</code> </td>
86
+ <td> the contents of <code>nd_file</code> as String</td>
87
+ </tr>
88
+ <tr>
89
+ <td><code>flags</code> </td>
90
+ <td> the raw <code>flags</code> field as Integer</td>
91
+ </tr>
92
+ <tr>
93
+ <td><code>line</code> </td>
94
+ <td> the line number (extracted from <code>flags</code>) as Integer</td>
95
+ </tr>
96
+ <tr>
97
+ <td><code>type</code> </td>
98
+ <td> the node type as Symbol (e.g. <code>NODE_SCOPE</code> =&gt; <code>:scope</code>)</td>
99
+ </tr>
100
+ </table>
101
+
102
+
103
+
104
+
105
+ <p>The unions can be accessed using the following methods:</p>
106
+
107
+
108
+ <pre><code>u1_as_long
109
+ u1_cfunc
110
+ u1_id
111
+ u1_node
112
+ u1_tbl
113
+ u1_value
114
+ u2_argc
115
+ u2_id
116
+ u2_node
117
+ u2_value
118
+ u3_cnt
119
+ u3_id
120
+ u3_node
121
+ u3_state
122
+ u3_value
123
+ </code></pre>
124
+
125
+ <p>The <code>*_id</code> methods return a Symbol if the value of the union is a valid ID.
126
+ If the value is <code>0</code> or <code>1</code>, they return the corresponding Fixnum (this is
127
+ because some node types store <code>0</code> and <code>1</code> instead of an ID for special cases).
128
+ Otherwise <code>nil</code> is returned.</p>
129
+
130
+
131
+ <p>The <code>*_value</code> and <code>*_node</code> methods return the object or the node (wrapped as
132
+ RubyNode) that is stored in the union. If the union holds neither an object nor a node,
133
+ <code>nil</code> is returned. The <code>*_value</code> and <code>*_node</code> methods can be used
134
+ interchangeably; they are just aliases.</p>
135
+
136
+
137
+ <p><code>u1_as_long</code>, <code>u2_argc</code>, <code>u3_cnt</code> and <code>u3_state</code> always return the raw &#8220;long&#8221;
138
+ stored in the union, as an Integer.</p>
139
+
140
+
141
+ <p><code>u1_cfunc</code> returns the unsigned value of the function pointer stored
142
+ in the union, as an Integer.</p>
143
+
144
+
145
+ <p><code>u1_tbl</code> returns the local variable table for a scope node as an Array; for
146
+ all other node types it returns <code>nil</code>.</p>
147
+
148
+
149
+ <p>None of these methods raises an exception; they simply return <code>nil</code> if the
150
+ value stored in the union does not have the requested type.</p>
151
+
152
+
153
+ <p><code>u3.entry</code> is not supported (for obvious reasons).</p>
154
+
155
+
156
+ <h2 id="section3">Aliases for better readability</h2>
157
+
158
+
159
+ <p>It would not be very readable to always access the unions directly, so Ruby
160
+ uses defines on the C side to remedy this:</p>
161
+
162
+
163
+ <pre><code>#define nd_head u1.node
164
+ #define nd_alen u2.argc
165
+ #define nd_next u3.node
166
+
167
+ #define nd_cond u1.node
168
+ #define nd_body u2.node
169
+ #define nd_else u3.node
170
+ ...
171
+ </code></pre>
172
+
173
+ <p>RubyNode also makes these aliases available. To get a full list of the
174
+ available aliases for your Ruby version you can use something like the
175
+ following in irb:</p>
176
+
177
+
178
+ <pre><code>&gt;&gt; puts RubyNode.instance_methods.grep(/^nd_/).sort
179
+ nd_1st
180
+ nd_2nd
181
+ nd_aid
182
+ nd_alen
183
+ nd_argc
184
+ nd_args
185
+ nd_beg
186
+ nd_body
187
+ ...
188
+ </code></pre>
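The relationship between the union accessors and the nd_* aliases can be pictured with a small plain-Ruby sketch. AliasSketchNode is a hypothetical stand-in (real RubyNode instances come from the C extension and cannot be created from Ruby), and the alias table contains only the six defines quoted in this section, not the full list for any Ruby version.

```ruby
# Hypothetical stand-in for a node with a few union accessors.
class AliasSketchNode
  attr_reader :u1_node, :u2_node, :u3_node, :u2_argc

  def initialize(u1_node: nil, u2_node: nil, u3_node: nil, u2_argc: nil)
    @u1_node = u1_node
    @u2_node = u2_node
    @u3_node = u3_node
    @u2_argc = u2_argc
  end

  # Alias table mirroring the #define excerpt: readable name => union accessor.
  ALIASES = {
    nd_head: :u1_node, nd_alen: :u2_argc, nd_next: :u3_node,
    nd_cond: :u1_node, nd_body: :u2_node, nd_else: :u3_node
  }.freeze

  # Define every readable name as an alias of its union accessor.
  ALIASES.each { |name, target| alias_method name, target }
end

# An if-like node: u1 holds the condition, u2 the body, u3 the else branch.
node = AliasSketchNode.new(u1_node: :condition, u2_node: :then_part, u3_node: :else_part)
p node.nd_cond
p node.nd_body
p node.nd_else
```

Note that several aliases share one union slot (nd_head and nd_cond both read u1.node), which is why the node type determines which alias is meaningful.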
189
+
190
+ <h2 id="section4">Getting RubyNodes</h2>
191
+
192
+
193
+ <p>Now that we know how to work with RubyNode instances, it would be nice to have
194
+ some to try it out ;-). There is no way to instantiate RubyNode instances
195
+ yourself; you can only get them through one of the following methods.</p>
196
+
197
+
198
+ <h3>(Unbound)Method</h3>
199
+
200
+
201
+ <p>You can access the body node of methods by using the <code>body_node</code> method.</p>
202
+
203
+
204
+ <h3>Proc</h3>
205
+
206
+
207
+ <p>Procs have three different nodes: the body node, which contains the actual
208
+ code; the var node, which describes the arguments of the proc; and the cref
209
+ node, which is the lexical class/module nesting in which this proc was defined.
210
+ Those nodes can be accessed using the methods <code>body_node</code>, <code>var_node</code> and
211
+ <code>cref_node</code>.</p>
212
+
213
+
214
+ <h3>String</h3>
215
+
216
+
217
+ <p>RubyNode adds a <code>parse_to_nodes</code> method to String, which will parse the given
218
+ string using Ruby&#8217;s parser. The parsing will be done in the current
219
+ context/binding/scope, so it basically returns the <span class="caps">AST</span> that <code>eval</code> would see
220
+ for this string.</p>
221
+
222
+
223
+ <p>For Ruby 1.8.x there is also the method <code>parse_begin_to_nodes</code>, which returns
224
+ the <span class="caps">AST</span> for all <code>BEGIN</code> blocks in the string. Those <code>BEGIN</code> blocks won&#8217;t be in
225
+ the <span class="caps">AST</span> returned by <code>parse_to_nodes</code>. In Ruby 1.9 <code>parse_to_nodes</code> returns a
226
+ combined <span class="caps">AST</span>. <code>parse_to_nodes</code> and <code>parse_begin_to_nodes</code> also accept two
227
+ optional arguments: file and line. The arguments work similarly to those of
228
+ Ruby&#8217;s <code>eval</code> and default to <code>"(string)"</code> and <code>1</code>.</p>
229
+
230
+
231
+ <p>For more details please see the examples below.</p>
232
+
233
+
234
+ <h2 id="section5">Higher level methods</h2>
235
+
236
+
237
+ <p>All the functionality described above is provided by the C extension. The
238
+ following methods are only available if you <code>require "rubynode"</code> (instead of
239
+ just <code>"rubynode_ext"</code>).</p>
240
+
241
+
242
+ <p>It would be a bit tedious to only work with the above methods to access the
243
+ attributes of nodes. For example, you would have to know which attributes each
244
+ node type has. Fortunately, RubyNode provides a nicer way: the method
245
+ <code>attribs_hash</code>. This method returns a hash that contains all attributes of the
246
+ node. Example:</p>
247
+
248
+
249
+ <pre><code>&gt;&gt; n = "1 + 2".parse_to_nodes.nd_next
250
+ =&gt; #&lt;RubyNode :call&gt;
251
+ &gt;&gt; n.attribs_hash
252
+ =&gt; {:mid=&gt;:+, :recv=&gt;#&lt;RubyNode :lit&gt;, :args=&gt;#&lt;RubyNode :array&gt;}
253
+ </code></pre>
254
+
255
+ <p>This is nice but it is still a bit tedious, because you would then probably do
256
+ something like the following:</p>
257
+
258
+
259
+ <pre><code>&gt;&gt; n.attribs_hash[:recv].attribs_hash
260
+ =&gt; {:lit=&gt;1}
261
+ &gt;&gt; n.attribs_hash[:args].attribs_hash
262
+ =&gt; {:next=&gt;false, :head=&gt;#&lt;RubyNode :lit&gt;, :alen=&gt;1}
263
+ &gt;&gt; n.attribs_hash[:args].attribs_hash[:head].attribs_hash
264
+ =&gt; {:lit=&gt;2}
265
+ </code></pre>
266
+
267
+ <p>So, there is an even nicer way: the method <code>transform</code>. It is basically a
268
+ recursive version of <code>attribs_hash</code>: it transforms a node tree into a tree of
269
+ arrays and hashes. Example:</p>
270
+
271
+
272
+ <pre><code>&gt;&gt; n.transform
273
+ =&gt; [:call, {:mid=&gt;:+, :recv=&gt;[:lit, {:lit=&gt;1}], :args=&gt;[:array, [[:lit, {:lit=&gt;2}]]]}]
274
+ </code></pre>
275
+
276
+ <p>So the <code>#&lt;RubyNode :call&gt;</code> became <code>[:call, "attribs_hash applied recursively"]</code>.
277
+ You might have noticed that the <code>:array</code> node doesn&#8217;t have a hash as the second
278
+ element in its array. This is some special magic to make it easier to work
279
+ with <code>:array</code> nodes. If you really want to see the node tree of <code>:array</code>
280
+ nodes, you can get that, too:</p>
281
+
282
+
283
+ <pre><code>&gt;&gt; n.nd_args.transform
284
+ =&gt; [:array, [[:lit, {:lit=&gt;2}]]]
285
+ &gt;&gt; n.nd_args.transform(:keep_array_nodes =&gt; true)
286
+ =&gt; [:array, {:next=&gt;false, :head=&gt;[:lit, {:lit=&gt;2}], :alen=&gt;1}]
287
+ </code></pre>
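The idea of applying attribs_hash recursively can be sketched without the C extension. TreeSketchNode and transform_sketch are hypothetical names used only for illustration; this is not RubyNode's implementation, and it deliberately skips the special handling of :array, :block and :newline nodes, so its result corresponds to the :keep_array_nodes form.

```ruby
# Hypothetical stand-in for RubyNode: a type plus a flat attribute hash whose
# values are either plain Ruby objects or further nodes.
TreeSketchNode = Struct.new(:type, :attribs_hash)

# Turn a node into [type, hash], recursing into attribute values that are nodes.
def transform_sketch(node)
  attribs = node.attribs_hash.each_with_object({}) do |(key, value), result|
    result[key] = value.is_a?(TreeSketchNode) ? transform_sketch(value) : value
  end
  [node.type, attribs]
end

# Hand-built tree for "1 + 2", shaped like the attribs_hash output above.
one  = TreeSketchNode.new(:lit, { lit: 1 })
two  = TreeSketchNode.new(:lit, { lit: 2 })
args = TreeSketchNode.new(:array, { next: false, head: two, alen: 1 })
call = TreeSketchNode.new(:call, { mid: :+, recv: one, args: args })

p transform_sketch(call)
```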
288
+
289
+ <p>The same magic is also done for <code>:block</code> nodes:</p>
290
+
291
+
292
+ <pre><code>&gt;&gt; bl = "foo; bar".parse_to_nodes
293
+ =&gt; #&lt;RubyNode :block&gt;
294
+ &gt;&gt; bl.transform
295
+ =&gt; [:block, [[:vcall, {:mid=&gt;:foo}], [:vcall, {:mid=&gt;:bar}]]]
296
+ &gt;&gt; pp bl.transform(:keep_block_nodes =&gt; true)
297
+ [:block,
298
+ {:next=&gt;[:block, {:next=&gt;false, :head=&gt;[:vcall, {:mid=&gt;:bar}]}],
299
+ :head=&gt;[:vcall, {:mid=&gt;:foo}]}]
300
+ </code></pre>
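The flattening of :block and :array chains can be illustrated on plain data. The sketch below takes the nested :keep_block_nodes form from the example above and collapses it into the flat default form. flatten_chain is a hypothetical helper, not a RubyNode method.

```ruby
# Collapse a chain of {:head, :next} hashes (the :keep_block_nodes or
# :keep_array_nodes form) into [type, [item, item, ...]].
def flatten_chain(tree)
  type, attribs = tree
  items = []
  while attribs
    items << attribs[:head]
    following = attribs[:next]
    # :next is false at the end of the chain, otherwise another [type, {...}] pair.
    attribs = following ? following[1] : nil
  end
  [type, items]
end

nested = [:block,
          { next: [:block, { next: false, head: [:vcall, { mid: :bar }] }],
            head: [:vcall, { mid: :foo }] }]

p flatten_chain(nested)
```

The flat form is easier to iterate over, at the cost of dropping per-link details such as :alen.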
301
+
302
+ <p><code>transform</code> also strips <code>:newline</code> nodes (only relevant for Ruby 1.8; 1.9
303
+ doesn&#8217;t have <code>:newline</code> nodes), but if you really want those, you can get
304
+ them:</p>
305
+
306
+
307
+ <pre><code>&gt;&gt; pp bl.transform
308
+ [:block, [[:vcall, {:mid=&gt;:foo}], [:vcall, {:mid=&gt;:bar}]]]
309
+ =&gt; nil
310
+ &gt;&gt; pp bl.transform(:keep_newline_nodes =&gt; true)
311
+ [:block,
312
+ [[:newline, {:next=&gt;[:vcall, {:mid=&gt;:foo}]}],
313
+ [:newline, {:next=&gt;[:vcall, {:mid=&gt;:bar}]}]]]
314
+ </code></pre>
315
+
316
+ <p>And finally <code>transform</code> can also include the original RubyNode instance in the
317
+ hash, if you later need access to the filename, line number or the flags:</p>
318
+
319
+
320
+ <pre><code>&gt;&gt; pp bl.transform(:include_node =&gt; true)
321
+ [:block,
322
+ [[:vcall, {:mid=&gt;:foo, :node=&gt;#&lt;RubyNode :vcall&gt;}],
323
+ [:vcall, {:mid=&gt;:bar, :node=&gt;#&lt;RubyNode :vcall&gt;}]]]
324
+ =&gt; nil
325
+ &gt;&gt; pp n.transform(:include_node =&gt; true)
326
+ [:call,
327
+ {:mid=&gt;:+,
328
+ :recv=&gt;[:lit, {:lit=&gt;1, :node=&gt;#&lt;RubyNode :lit&gt;}],
329
+ :args=&gt;[:array, [[:lit, {:lit=&gt;2, :node=&gt;#&lt;RubyNode :lit&gt;}]]],
330
+ :node=&gt;#&lt;RubyNode :call&gt;}]
331
+ </code></pre>
332
+
333
+ <p>The options <code>:keep_array_nodes</code>, <code>:keep_block_nodes</code>, <code>:keep_newline_nodes</code>
334
+ and <code>:include_node</code> can also be combined.</p>
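As an illustration of how easy the transformed trees are to work with, here is a tiny evaluator for trees in the default transform format. It is only a sketch: eval_tree is a hypothetical helper that understands just :lit and :call nodes, and the input tree is typed in by hand to match the transform output shown for "1 + 2".

```ruby
# Evaluate a transformed tree that consists only of :lit and :call nodes.
def eval_tree(tree)
  type, attribs = tree
  case type
  when :lit
    attribs[:lit]
  when :call
    receiver = eval_tree(attribs[:recv])
    # In the default transform output, :args is [:array, [arg_tree, ...]].
    args = attribs[:args] ? attribs[:args][1].map { |arg| eval_tree(arg) } : []
    receiver.public_send(attribs[:mid], *args)
  else
    raise ArgumentError, "unsupported node type: #{type.inspect}"
  end
end

tree = [:call, { mid: :+, recv: [:lit, { lit: 1 }], args: [:array, [[:lit, { lit: 2 }]]] }]
puts eval_tree(tree)  # prints 3
```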
335
+
336
+
337
+ <h2 id="section6">More examples</h2>
338
+
339
+
340
+ <h3>(Unbound)Method</h3>
341
+
342
+
343
+ <pre><code>&gt;&gt; class A
344
+ &gt;&gt; def foo(x)
345
+ &gt;&gt; @bar + x
346
+ &gt;&gt; end
347
+ &gt;&gt; end
348
+ =&gt; nil
349
+ &gt;&gt; pp A.instance_method(:foo).body_node.transform
350
+ [:scope,
351
+ {:next=&gt;
352
+ [:block,
353
+ [[:args, {:rest=&gt;-1, :cnt=&gt;1, :opt=&gt;false}],
354
+ [:call,
355
+ {:mid=&gt;:+,
356
+ :recv=&gt;[:ivar, {:vid=&gt;:@bar}],
357
+ :args=&gt;[:array, [[:lvar, {:cnt=&gt;2, :vid=&gt;:x}]]]}]]],
358
+ :rval=&gt;[:cref, {:next=&gt;[:cref, {:next=&gt;false, :clss=&gt;Object}], :clss=&gt;A}],
359
+ :tbl=&gt;[:x]}]
360
+ =&gt; nil
361
+ &gt;&gt; pp A.new.method(:foo).body_node.transform
362
+ [:scope,
363
+ {:next=&gt;
364
+ [:block,
365
+ [[:args, {:rest=&gt;-1, :cnt=&gt;1, :opt=&gt;false}],
366
+ [:call,
367
+ {:mid=&gt;:+,
368
+ :recv=&gt;[:ivar, {:vid=&gt;:@bar}],
369
+ :args=&gt;[:array, [[:lvar, {:cnt=&gt;2, :vid=&gt;:x}]]]}]]],
370
+ :rval=&gt;[:cref, {:next=&gt;[:cref, {:next=&gt;false, :clss=&gt;Object}], :clss=&gt;A}],
371
+ :tbl=&gt;[:x]}]
372
+ </code></pre>
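Because the transformed tree is plain arrays and hashes, simple queries need only a short recursive walk. The sketch below searches a hand-copied version of the tree above (with the :rval cref entry omitted for brevity) for nodes of a given type. find_nodes is a hypothetical helper, and it treats any two-element array of the form [Symbol, Hash] as a node, which is a heuristic that fits the output shown here.

```ruby
# Collect every [type, attribs] pair whose type equals wanted.
def find_nodes(tree, wanted, found = [])
  case tree
  when Array
    found << tree if tree.size == 2 && tree[0] == wanted && tree[1].is_a?(Hash)
    tree.each { |child| find_nodes(child, wanted, found) }
  when Hash
    tree.each_value { |child| find_nodes(child, wanted, found) }
  end
  found
end

# Transformed body of A#foo, copied from the example (without :rval).
tree = [:scope,
        { next: [:block,
                 [[:args, { rest: -1, cnt: 1, opt: false }],
                  [:call, { mid: :+,
                            recv: [:ivar, { vid: :@bar }],
                            args: [:array, [[:lvar, { cnt: 2, vid: :x }]]] }]]],
          tbl: [:x] }]

# Names of all instance variables and local variables read in the method body.
p find_nodes(tree, :ivar).map { |_type, attribs| attribs[:vid] }
p find_nodes(tree, :lvar).map { |_type, attribs| attribs[:vid] }
```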
373
+
374
+ <h3>Proc</h3>
375
+
376
+
377
+ <pre><code>&gt;&gt; add_23 = proc { |x| x + 23 }
378
+ =&gt; #&lt;Proc:0xb7edafd8@(irb):9&gt;
379
+ &gt;&gt; add_23.body_node.transform
380
+ =&gt; [:call, {:mid=&gt;:+, :recv=&gt;[:dvar, {:vid=&gt;:x}], :args=&gt;[:array, [[:lit, {:lit=&gt;23}]]]}]
381
+ &gt;&gt; add_23.var_node.transform
382
+ =&gt; [:dasgn_curr, {:value=&gt;false, :vid=&gt;:x}]
383
+ &gt;&gt; add_23.cref_node.transform
384
+ =&gt; [:cref, {:next=&gt;false, :clss=&gt;Object}]
385
+ </code></pre>
386
+
387
+ <h3>Parsing strings</h3>
388
+
389
+
390
+ <p>As mentioned above, the parsing is done in the current context, so the result
391
+ can differ depending on local variables:</p>
392
+
393
+
394
+ <pre><code>&gt;&gt; defined? z
395
+ =&gt; nil
396
+ &gt;&gt; "z".parse_to_nodes.transform
397
+ =&gt; [:vcall, {:mid=&gt;:z}]
398
+ &gt;&gt; z = 42
399
+ =&gt; 42
400
+ &gt;&gt; defined? z
401
+ =&gt; "local-variable"
402
+ &gt;&gt; "z".parse_to_nodes.transform
403
+ =&gt; [:lvar, {:cnt=&gt;4, :vid=&gt;:z}]
404
+ </code></pre>
405
+
406
+ <h3><code>BEGIN</code> blocks</h3>
407
+
408
+
409
+ <p>Ruby 1.8:</p>
410
+
411
+
412
+ <pre><code>&gt;&gt; "BEGIN { p 1 }; p 2".parse_to_nodes.transform
413
+ =&gt; [:fcall, {:mid=&gt;:p, :args=&gt;[:array, [[:lit, {:lit=&gt;2}]]]}]
414
+ &gt;&gt; pp "BEGIN { p 1 }; p 2".parse_begin_to_nodes.transform
415
+ [:scope,
416
+ {:next=&gt;[:fcall, {:mid=&gt;:p, :args=&gt;[:array, [[:lit, {:lit=&gt;1}]]]}],
417
+ :rval=&gt;false,
418
+ :tbl=&gt;nil}]
419
+ </code></pre>
420
+
421
+ <p>Ruby 1.9:</p>
422
+
423
+
424
+ <pre><code>&gt;&gt; pp "BEGIN { p 1 }; p 2".parse_to_nodes.transform
425
+ [:prelude,
426
+ {:head=&gt;
427
+ [:scope,
428
+ {:rval=&gt;false,
429
+ :tbl=&gt;nil,
430
+ :next=&gt;[:fcall, {:args=&gt;[:array, [[:lit, {:lit=&gt;1}]]], :mid=&gt;:p}]}],
431
+ :body=&gt;[:fcall, {:args=&gt;[:array, [[:lit, {:lit=&gt;2}]]], :mid=&gt;:p}]}]
432
+ </code></pre>
433
+ </body>
434
+ </html>