rubynode 0.1.1

data/Changelog ADDED
@@ -0,0 +1,15 @@
+ Changelog
+
+ -- 0.1.1
+
+ * RubyNode is now available as a gem
+ * Added the necessary Ruby source files for out-of-the-box 1.8.5 support
+ * Added support for u1.cfunc / nd_cfnc
+ * Removed unused attributes for NODE_CASE and NODE_ZSUPER
+ * Documentation is now included as HTML
+ * Stricter version check for the included Ruby source
+ * Some minor fixes and improvements
+
+ -- 0.1.0
+
+ * Initial release
data/README ADDED
@@ -0,0 +1,84 @@
+
+ RubyNode
+ ========
+
+ RubyNode is a library that allows read-only access to Ruby's internal NODE
+ structure. It can retrieve the node trees of methods and procs, and it can use
+ Ruby's parser to parse Ruby source code strings into node trees.
+
+ It provides the class RubyNode, which wraps an internal NODE. Trees of RubyNode
+ instances can also be transformed into trees of arrays and hashes (similar to
+ s-expressions), which are easy to manipulate and work with.
+
+
+ Requirements
+ ------------
+
+ RubyNode is tested with Ruby 1.8.4 and 1.8.5, but it should also work with
+ other 1.8 versions and with 1.9 (although some node types might not be fully
+ supported there).
+
+
+ Installation
+ ------------
+
+ RubyNode generates some of its C source code from Ruby's source code, because
+ the node types and other details differ between Ruby versions.
+
+ For the official releases of Ruby 1.8.4 and 1.8.5, the needed source files are
+ included in the package. To compile RubyNode for any other Ruby version, you
+ will need that version's source tarball extracted somewhere.
+
+ So, for Ruby 1.8.4 and 1.8.5 just run (as root):
+
+   gem install rubynode
+
+ Or, if you do not use the gem:
+
+   ruby setup.rb
+
+ That command will compile the C extension and install all files to their
+ default location (to customize the non-gem installation, please see "ruby
+ setup.rb --help").
+
+ For other Ruby versions, you need to provide the (absolute) path to the source
+ directory for that Ruby version in the environment variable RUBY_SOURCE_DIR.
+ Example (as root):
+
+   RUBY_SOURCE_DIR="/path/to/ruby_source" gem install rubynode
+
+ Or, if you do not use the gem:
+
+   RUBY_SOURCE_DIR="/path/to/ruby_source" ruby setup.rb
+
+
+ Usage
+ -----
+
+ See doc/index.html and doc/api.html.
+
+
+ Feedback
+ --------
+
+ If you find a bug, think that something doesn't work as it should or have other
+ suggestions, then please don't hesitate to contact me (dbatml@gmx.de) and tell
+ me about it.
+
+
+ Thanks
+ ------
+
+ I would like to thank Paul Brannan for writing Nodewrap, which inspired me to
+ write RubyNode and also gave me some ideas and code.
+
+
+ License
+ -------
+
+ Copyright 2006 Dominik Bathon <dbatml@gmx.de>
+
+ RubyNode is licensed under the same terms as Ruby.
+
+ The distribution contains some (extracts from) files from Ruby which are
+ Copyright Yukihiro Matsumoto (see those files).
data/doc/api.html ADDED
@@ -0,0 +1,434 @@
+ <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
+ <html>
+ <head>
+ <title>RubyNode API</title>
+ <link href="style.css" media="all" rel="Stylesheet" type="text/css">
+ </head>
+ <body>
+ <h1>RubyNode <span class="caps">API</span></h1>
+
+
+ <p>RubyNode consists of two parts: a C extension, which provides the core
+ functionality, and a Ruby library, which adds additional functionality.</p>
+
+
+ <p>To load only the C extension, <code>require "rubynode_ext"</code>; to get the full
+ functionality (recommended), just <code>require "rubynode"</code>.</p>
+
+
+ <p>Sections: <a href="#section1">Introduction to Ruby NODEs</a>, <a href="#section2">Accessing the fields using RubyNode</a>, <a href="#section3">Aliases for better readability</a>, <a href="#section4">Getting RubyNodes</a>, <a href="#section5">Higher level methods</a>, <a href="#section6">More examples</a>.</p>
+
+
+ <h2 id="section1">Introduction to Ruby NODEs</h2>
+
+
+ <p>Ruby mainly uses NODEs to store the <span class="caps">AST</span> (abstract syntax tree) of parsed
+ Ruby files, but they are also used for some other things. On the C side, <span class="caps">NODE</span>
+ is a struct:</p>
+
+
+ <pre><code>typedef struct RNode {
+     unsigned long flags;
+     char *nd_file;
+     union {
+         struct RNode *node;
+         ID id;
+         VALUE value;
+         VALUE (*cfunc)(ANYARGS);
+         ID *tbl;
+     } u1;
+     union {
+         struct RNode *node;
+         ID id;
+         long argc;
+         VALUE value;
+     } u2;
+     union {
+         struct RNode *node;
+         ID id;
+         long state;
+         struct global_entry *entry;
+         long cnt;
+         VALUE value;
+     } u3;
+ } NODE;
+ </code></pre>
+
+ <p>The important parts are the three unions <code>u1</code>, <code>u2</code> and <code>u3</code>, which can store
+ values of different types. There is also an <code>nd_file</code> field, which contains the
+ name of the file from which this node was parsed, and a <code>flags</code> field,
+ which contains some flags, the node type and the line number (all ORed
+ together).</p>
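
To make the "all ORed together" idea concrete, here is a minimal pure-Ruby sketch that packs a node type and a line number into one integer and extracts them again. The shift and mask constants are assumptions modelled on Ruby 1.8's headers and may differ in other Ruby versions; the helper names are hypothetical and are not part of RubyNode, whose `type` and `line` methods do the extraction for you.

```ruby
# Assumed bit layout, for illustration only (not taken from RubyNode itself).
TYPE_SHIFT = 11              # assumed position of the node type bits
TYPE_MASK  = 0xff            # assumed width of the node type (8 bits)
LINE_SHIFT = TYPE_SHIFT + 8  # assumed position of the line number bits

# Hypothetical helper: OR the three parts into a single flags value.
def pack_flags(type_id, line, low_flags = 0)
  low_flags | (type_id << TYPE_SHIFT) | (line << LINE_SHIFT)
end

# Hypothetical helper: extract the node type bits again.
def unpack_type(flags)
  (flags >> TYPE_SHIFT) & TYPE_MASK
end

# Hypothetical helper: extract the line number (everything above the type).
def unpack_line(flags)
  flags >> LINE_SHIFT
end

flags = pack_flags(3, 42, 0b101)
puts unpack_type(flags)  # prints 3
puts unpack_line(flags)  # prints 42
```

The low bits below `TYPE_SHIFT` stay untouched by both extractions, which is why unrelated flags can share the same field.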
+
+
+ <p>The node types are defined as an enum in C:</p>
+
+
+ <pre><code>enum node_type {
+     NODE_METHOD,
+     NODE_FBODY,
+     NODE_CFUNC,
+     NODE_SCOPE,
+     NODE_BLOCK,
+     ...
+ </code></pre>
+
+ <h2 id="section2">Accessing the fields using RubyNode</h2>
+
+
+ <p>If you have an instance of RubyNode, you can access all these fields using the
+ following methods:</p>
+
+
+ <table>
+ <tr>
+ <td><code>file</code> </td>
+ <td> the contents of <code>nd_file</code> as String</td>
+ </tr>
+ <tr>
+ <td><code>flags</code> </td>
+ <td> the raw <code>flags</code> field as Integer</td>
+ </tr>
+ <tr>
+ <td><code>line</code> </td>
+ <td> the line number (extracted from <code>flags</code>) as Integer</td>
+ </tr>
+ <tr>
+ <td><code>type</code> </td>
+ <td> the node type as Symbol (e.g. <code>NODE_SCOPE</code> =&gt; <code>:scope</code>)</td>
+ </tr>
+ </table>
+
+
+
+
+ <p>The unions can be accessed using the following methods:</p>
+
+
+ <pre><code>u1_as_long
+ u1_cfunc
+ u1_id
+ u1_node
+ u1_tbl
+ u1_value
+ u2_argc
+ u2_id
+ u2_node
+ u2_value
+ u3_cnt
+ u3_id
+ u3_node
+ u3_state
+ u3_value
+ </code></pre>
+
+ <p>The <code>*_id</code> methods return a Symbol if the value of this union is a valid ID.
+ If the value is <code>0</code> or <code>1</code>, they return the corresponding Fixnum (this is
+ because some node types store <code>0</code> and <code>1</code> instead of an ID for special cases).
+ Otherwise <code>nil</code> is returned.</p>
+
+
+ <p>The <code>*_value</code> and <code>*_node</code> methods return the object or the node (wrapped as
+ RubyNode) that is stored in the union. If the union does not hold an object or node,
+ <code>nil</code> is returned. The <code>*_value</code> and <code>*_node</code> methods can be used
+ interchangeably; they are just aliases.</p>
+
+
+ <p><code>u1_as_long</code>, <code>u2_argc</code>, <code>u3_cnt</code> and <code>u3_state</code> always return the raw &#8220;long&#8221;
+ that is stored in the union as an Integer.</p>
+
+
+ <p><code>u1_cfunc</code> returns the unsigned value of the function pointer that is stored
+ in the union as an Integer.</p>
+
+
+ <p><code>u1_tbl</code> returns the local variable table for a scope node as an Array; for
+ all other node types it returns <code>nil</code>.</p>
+
+
+ <p>None of these methods raise exceptions; they just return <code>nil</code> if the
+ value that is stored in the union does not have the requested type.</p>
+
+
+ <p><code>u3.entry</code> is not supported (for obvious reasons).</p>
+
+
+ <h2 id="section3">Aliases for better readability</h2>
+
+
+ <p>It would not be very readable to always access the unions directly, so Ruby
+ uses defines on the C side to remedy this:</p>
+
+
+ <pre><code>#define nd_head u1.node
+ #define nd_alen u2.argc
+ #define nd_next u3.node
+
+ #define nd_cond u1.node
+ #define nd_body u2.node
+ #define nd_else u3.node
+ ...
+ </code></pre>
+
+ <p>RubyNode also makes these aliases available. To get a full list of the
+ available aliases for your Ruby version you can use something like the
+ following in irb:</p>
+
+
+ <pre><code>&gt;&gt; puts RubyNode.instance_methods.grep(/^nd_/).sort
+ nd_1st
+ nd_2nd
+ nd_aid
+ nd_alen
+ nd_argc
+ nd_args
+ nd_beg
+ nd_body
+ ...
+ </code></pre>
+
+ <h2 id="section4">Getting RubyNodes</h2>
+
+
+ <p>Now that we know how to work with RubyNode instances, it would be nice to have
+ some to try it out ;-). There is no way to instantiate RubyNode instances
+ yourself; you can only get them through one of the following methods.</p>
+
+
+ <h3>(Unbound)Method</h3>
+
+
+ <p>You can access the body node of methods by using the <code>body_node</code> method.</p>
+
+
+ <h3>Proc</h3>
+
+
+ <p>Procs have three different nodes: the body node, which contains the actual
+ code, the var node, which describes the arguments of the proc, and the cref
+ node, which is the lexical class/module nesting in which this proc was defined.
+ Those nodes can be accessed using the methods <code>body_node</code>, <code>var_node</code> and
+ <code>cref_node</code>.</p>
+
+
+ <h3>String</h3>
+
+
+ <p>RubyNode adds a <code>parse_to_nodes</code> method to String, which will parse the given
+ string using Ruby&#8217;s parser. The parsing will be done in the current
+ context/binding/scope, so it basically returns the <span class="caps">AST</span> that eval would see
+ for this string.</p>
+
+
+ <p>For Ruby 1.8.x there is also the method <code>parse_begin_to_nodes</code>, which returns
+ the <span class="caps">AST</span> for all <code>BEGIN</code> blocks in the string. Those <code>BEGIN</code> blocks won&#8217;t be in
+ the <span class="caps">AST</span> returned by <code>parse_to_nodes</code>. In Ruby 1.9 <code>parse_to_nodes</code> returns a
+ combined <span class="caps">AST</span>. <code>parse_to_nodes</code> and <code>parse_begin_to_nodes</code> also accept two
+ optional arguments: file and line. The arguments work similarly to those of
+ Ruby&#8217;s <code>eval</code> and default to <code>"(string)"</code> and <code>1</code>.</p>
+
+
+ <p>For more details please see the examples below.</p>
+
+
+ <h2 id="section5">Higher level methods</h2>
+
+
+ <p>All the functionality described above is provided by the C extension. The
+ following methods are only available if you <code>require "rubynode"</code> (instead of
+ just <code>"rubynode_ext"</code>).</p>
+
+
+ <p>It would be a bit tedious to only work with the above methods to access the
+ attributes of nodes. You would have to know, for example, which node type has
+ which attributes. Fortunately RubyNode provides a nicer way: the method
+ <code>attribs_hash</code>. This method returns a hash that contains all attributes of the
+ node. Example:</p>
+
+
+ <pre><code>&gt;&gt; n = "1 + 2".parse_to_nodes.nd_next
+ =&gt; #&lt;RubyNode :call&gt;
+ &gt;&gt; n.attribs_hash
+ =&gt; {:mid=&gt;:+, :recv=&gt;#&lt;RubyNode :lit&gt;, :args=&gt;#&lt;RubyNode :array&gt;}
+ </code></pre>
+
+ <p>This is nice but it is still a bit tedious, because you would then probably do
+ something like the following:</p>
+
+
+ <pre><code>&gt;&gt; n.attribs_hash[:recv].attribs_hash
+ =&gt; {:lit=&gt;1}
+ &gt;&gt; n.attribs_hash[:args].attribs_hash
+ =&gt; {:next=&gt;false, :head=&gt;#&lt;RubyNode :lit&gt;, :alen=&gt;1}
+ &gt;&gt; n.attribs_hash[:args].attribs_hash[:head].attribs_hash
+ =&gt; {:lit=&gt;2}
+ </code></pre>
+
+ <p>So, there is an even nicer way: the method <code>transform</code>. It is basically a
+ recursive version of <code>attribs_hash</code>: it transforms a node tree into a tree of
+ arrays and hashes. Example:</p>
+
+
+ <pre><code>&gt;&gt; n.transform
+ =&gt; [:call, {:mid=&gt;:+, :recv=&gt;[:lit, {:lit=&gt;1}], :args=&gt;[:array, [[:lit, {:lit=&gt;2}]]]}]
+ </code></pre>
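
Because the result of `transform` is plain arrays and hashes, it can be walked without RubyNode loaded. The following is a minimal sketch that collects the node types found in the tree shown above, written here as a literal. The helpers `node?` and `node_types` are hypothetical names, and the check for "looks like a node" is a heuristic assumption, not something RubyNode provides.

```ruby
# The transformed tree from the example above, written as a literal.
tree = [:call, {:mid  => :+,
                :recv => [:lit, {:lit => 1}],
                :args => [:array, [[:lit, {:lit => 2}]]]}]

# Hypothetical heuristic: a transformed node is a two-element array whose
# first element is a Symbol and whose second element is a Hash or an Array.
def node?(obj)
  obj.is_a?(Array) && obj.size == 2 && obj[0].is_a?(Symbol) &&
    (obj[1].is_a?(Hash) || obj[1].is_a?(Array))
end

# Hypothetical helper: depth-first collection of all node types in the tree.
def node_types(tree, acc = [])
  return acc unless node?(tree)
  type, body = tree
  acc << type
  # Regular nodes carry a hash of attributes; :array style nodes carry
  # a plain list of child nodes instead.
  children = body.is_a?(Hash) ? body.values : body
  children.each { |child| node_types(child, acc) }
  acc
end

# On Ruby 1.9 and later (insertion-ordered hashes) this prints
# [:call, :lit, :array, :lit]; on 1.8 the order may differ.
p node_types(tree)
```

Non-node attribute values such as `:+` or `1` are skipped by the `node?` check, so only the nested nodes are visited.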
+
+ <p>So the <code>#&lt;RubyNode :call&gt;</code> became <code>[:call, "attribs_hash applied recursively"]</code>.
+ You might have noticed that the <code>:array</code> node doesn&#8217;t have a hash as the second
+ element in its array. This is some special magic to make it easier to work
+ with <code>:array</code> nodes. If you really want to see the node tree of <code>:array</code>
+ nodes, you can get that, too:</p>
+
+
+ <pre><code>&gt;&gt; n.nd_args.transform
+ =&gt; [:array, [[:lit, {:lit=&gt;2}]]]
+ &gt;&gt; n.nd_args.transform(:keep_array_nodes =&gt; true)
+ =&gt; [:array, {:next=&gt;false, :head=&gt;[:lit, {:lit=&gt;2}], :alen=&gt;1}]
+ </code></pre>
+
+ <p>The same magic is also done for <code>:block</code> nodes:</p>
+
+
+ <pre><code>&gt;&gt; bl = "foo; bar".parse_to_nodes
+ =&gt; #&lt;RubyNode :block&gt;
+ &gt;&gt; bl.transform
+ =&gt; [:block, [[:vcall, {:mid=&gt;:foo}], [:vcall, {:mid=&gt;:bar}]]]
+ &gt;&gt; pp bl.transform(:keep_block_nodes =&gt; true)
+ [:block,
+  {:next=&gt;[:block, {:next=&gt;false, :head=&gt;[:vcall, {:mid=&gt;:bar}]}],
+   :head=&gt;[:vcall, {:mid=&gt;:foo}]}]
+ </code></pre>
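
The relationship between the two forms can be sketched in plain Ruby: the "kept" form is a linked list chained through `:next`, and the default form is the list of its `:head` values. The helper `flatten_list_node` below is a hypothetical illustration of that idea using the output shown above as a literal; it is not RubyNode's actual implementation.

```ruby
# Literal copy of the :keep_block_nodes output shown above.
linked = [:block,
          {:next => [:block, {:next => false, :head => [:vcall, {:mid => :bar}]}],
           :head => [:vcall, {:mid => :foo}]}]

# Hypothetical helper: follow the :next links and collect each :head.
def flatten_list_node(node)
  type = node[0]
  heads = []
  while node
    attrs = node[1]
    heads << attrs[:head]
    node = attrs[:next]  # false at the end of the list, which stops the loop
  end
  [type, heads]
end

p flatten_list_node(linked)
# prints [:block, [[:vcall, {:mid=>:foo}], [:vcall, {:mid=>:bar}]]]
# (newer Ruby versions may format the inner hashes slightly differently)
```

The same walk applies to the `:array` form, whose extra `:alen` attribute is simply ignored by this sketch.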
+
+ <p><code>transform</code> also strips <code>:newline</code> nodes (only useful for Ruby 1.8, 1.9
+ doesn&#8217;t have <code>:newline</code> nodes), but if you really want those, you can get
+ them:</p>
+
+
+ <pre><code>&gt;&gt; pp bl.transform
+ [:block, [[:vcall, {:mid=&gt;:foo}], [:vcall, {:mid=&gt;:bar}]]]
+ =&gt; nil
+ &gt;&gt; pp bl.transform(:keep_newline_nodes =&gt; true)
+ [:block,
+  [[:newline, {:next=&gt;[:vcall, {:mid=&gt;:foo}]}],
+   [:newline, {:next=&gt;[:vcall, {:mid=&gt;:bar}]}]]]
+ </code></pre>
+
+ <p>And finally <code>transform</code> can also include the original RubyNode instance in the
+ hash, if you later need access to the filename, line number or the flags:</p>
+
+
+ <pre><code>&gt;&gt; pp bl.transform(:include_node =&gt; true)
+ [:block,
+  [[:vcall, {:mid=&gt;:foo, :node=&gt;#&lt;RubyNode :vcall&gt;}],
+   [:vcall, {:mid=&gt;:bar, :node=&gt;#&lt;RubyNode :vcall&gt;}]]]
+ =&gt; nil
+ &gt;&gt; pp n.transform(:include_node =&gt; true)
+ [:call,
+  {:mid=&gt;:+,
+   :recv=&gt;[:lit, {:lit=&gt;1, :node=&gt;#&lt;RubyNode :lit&gt;}],
+   :args=&gt;[:array, [[:lit, {:lit=&gt;2, :node=&gt;#&lt;RubyNode :lit&gt;}]]],
+   :node=&gt;#&lt;RubyNode :call&gt;}]
+ </code></pre>
+
+ <p>The options <code>:keep_array_nodes</code>, <code>:keep_block_nodes</code>, <code>:keep_newline_nodes</code>
+ and <code>:include_node</code> can also be combined.</p>
+
+
+ <h2 id="section6">More examples</h2>
+
+
+ <h3>(Unbound)Method</h3>
+
+
+ <pre><code>&gt;&gt; class A
+ &gt;&gt;   def foo(x)
+ &gt;&gt;     @bar + x
+ &gt;&gt;   end
+ &gt;&gt; end
+ =&gt; nil
+ &gt;&gt; pp A.instance_method(:foo).body_node.transform
+ [:scope,
+  {:next=&gt;
+    [:block,
+     [[:args, {:rest=&gt;-1, :cnt=&gt;1, :opt=&gt;false}],
+      [:call,
+       {:mid=&gt;:+,
+        :recv=&gt;[:ivar, {:vid=&gt;:@bar}],
+        :args=&gt;[:array, [[:lvar, {:cnt=&gt;2, :vid=&gt;:x}]]]}]]],
+   :rval=&gt;[:cref, {:next=&gt;[:cref, {:next=&gt;false, :clss=&gt;Object}], :clss=&gt;A}],
+   :tbl=&gt;[:x]}]
+ =&gt; nil
+ &gt;&gt; pp A.new.method(:foo).body_node.transform
+ [:scope,
+  {:next=&gt;
+    [:block,
+     [[:args, {:rest=&gt;-1, :cnt=&gt;1, :opt=&gt;false}],
+      [:call,
+       {:mid=&gt;:+,
+        :recv=&gt;[:ivar, {:vid=&gt;:@bar}],
+        :args=&gt;[:array, [[:lvar, {:cnt=&gt;2, :vid=&gt;:x}]]]}]]],
+   :rval=&gt;[:cref, {:next=&gt;[:cref, {:next=&gt;false, :clss=&gt;Object}], :clss=&gt;A}],
+   :tbl=&gt;[:x]}]
+ </code></pre>
+
+ <h3>Proc</h3>
+
+
+ <pre><code>&gt;&gt; add_23 = proc { |x| x + 23 }
+ =&gt; #&lt;Proc:0xb7edafd8@(irb):9&gt;
+ &gt;&gt; add_23.body_node.transform
+ =&gt; [:call, {:mid=&gt;:+, :recv=&gt;[:dvar, {:vid=&gt;:x}], :args=&gt;[:array, [[:lit, {:lit=&gt;23}]]]}]
+ &gt;&gt; add_23.var_node.transform
+ =&gt; [:dasgn_curr, {:value=&gt;false, :vid=&gt;:x}]
+ &gt;&gt; add_23.cref_node.transform
+ =&gt; [:cref, {:next=&gt;false, :clss=&gt;Object}]
+ </code></pre>
+
+ <h3>Parsing strings</h3>
+
+
+ <p>As mentioned above, the parsing is done in the current context, so the result
+ can differ depending on local variables:</p>
+
+
+ <pre><code>&gt;&gt; defined? z
+ =&gt; nil
+ &gt;&gt; "z".parse_to_nodes.transform
+ =&gt; [:vcall, {:mid=&gt;:z}]
+ &gt;&gt; z = 42
+ =&gt; 42
+ &gt;&gt; defined? z
+ =&gt; "local-variable"
+ &gt;&gt; "z".parse_to_nodes.transform
+ =&gt; [:lvar, {:cnt=&gt;4, :vid=&gt;:z}]
+ </code></pre>
+
+ <h3><code>BEGIN</code> blocks</h3>
+
+
+ <p>Ruby 1.8:</p>
+
+
+ <pre><code>&gt;&gt; "BEGIN { p 1 }; p 2".parse_to_nodes.transform
+ =&gt; [:fcall, {:mid=&gt;:p, :args=&gt;[:array, [[:lit, {:lit=&gt;2}]]]}]
+ &gt;&gt; pp "BEGIN { p 1 }; p 2".parse_begin_to_nodes.transform
+ [:scope,
+  {:next=&gt;[:fcall, {:mid=&gt;:p, :args=&gt;[:array, [[:lit, {:lit=&gt;1}]]]}],
+   :rval=&gt;false,
+   :tbl=&gt;nil}]
+ </code></pre>
+
+ <p>Ruby 1.9:</p>
+
+
+ <pre><code>&gt;&gt; pp "BEGIN { p 1 }; p 2".parse_to_nodes.transform
+ [:prelude,
+  {:head=&gt;
+    [:scope,
+     {:rval=&gt;false,
+      :tbl=&gt;nil,
+      :next=&gt;[:fcall, {:args=&gt;[:array, [[:lit, {:lit=&gt;1}]]], :mid=&gt;:p}]}],
+   :body=&gt;[:fcall, {:args=&gt;[:array, [[:lit, {:lit=&gt;2}]]], :mid=&gt;:p}]}]
+ </code></pre>
+ </body>
+ </html>