rawk 0.1.1 → 0.1.2

Files changed (4)
  1. data/.gitignore +1 -0
  2. data/README.rdoc +195 -0
  3. data/rawk.gemspec +20 -0
  4. metadata +17 -6
data/.gitignore ADDED
@@ -0,0 +1 @@
1
+ *.gem
data/README.rdoc ADDED
@@ -0,0 +1,195 @@
1
+ = Rawk
2
+
3
+ An awk-inspired ruby DSL
4
+
5
+ Last week, after years of ignoring awk, I ran into a shell script problem where it was the only viable solution (we didn't have ruby on the server) and I was forced to learn a bit more about it.
6
+
7
+ Once I had awk figured out, I thought it would be fun to write an awk DSL using ruby. It's turned out to be quite an interesting little project for the daily train ride to work and back.
8
+
9
+ Obviously, you can use ruby -e and {other magic}[http://code.joejag.com/2009/using-ruby-as-an-awk-replacement/] to execute snippets of ruby, but I like the way awk provides a bit more structure and a richer environment for more complex command line mangling.
10
+
11
+ == Install
12
+
13
+ Clone the git repo or download the file as a .tar or .zip. The bin directory contains the rawk executable. It does not require any gems (except rspec, if you want to run the tests).
14
+
15
+ I will package rawk as a gem when I find a spare moment.
16
+
17
+ == Example
18
+
19
+ A simple awk program
20
+
21
+ $ ls -ltr | awk '
22
+ BEGIN {print "Starting..."}
23
+ {print $9, $1}
24
+ END {print "done"} '
25
+
26
+ Creates the following output
27
+
28
+ Starting...
29
+ total
30
+ spec drwxr-xr-x
31
+ lib drwxr-xr-x
32
+ bin drwxr-xr-x
33
+ README -rw-r--r--
34
+ done
35
+
36
+ This can be written using rawk as
37
+
38
+ $ ls -ltr | bin/rawk '
39
+ start {puts "Starting..."}
40
+ every {|record| puts "#{record.cols[8]} #{record.cols[0]}"}
41
+ finish {puts "done"} '
42
+
43
+ And it also creates the same output
44
+
45
+ Starting...
46
+ total
47
+ spec drwxr-xr-x
48
+ lib drwxr-xr-x
49
+ bin drwxr-xr-x
50
+ README -rw-r--r--
51
+ done
52
+
53
+ Notice that the structure and semantics of an awk program are preserved, and you use normal ruby code to process the input stream. I've had to bend the knee to the ruby interpreter and change the syntax slightly, but I think it actually makes rawk programs a bit clearer than awk.
54
+
55
+ Detailed descriptions are shown below. I'm assuming you have a working knowledge of awk. Wikipedia provides an {easy primer}[http://en.wikipedia.org/wiki/AWK] if you need to brush up.
56
+
57
+ == Conditions and blocks
58
+
59
+ rawk provides 3 built-in conditions.
60
+
61
+ start {<code>}
62
+
63
+ Runs before any lines are read from the input stream. Equivalent to a BEGIN condition in awk
64
+
65
+ every {|record| <code>}
66
+
67
+ Runs once for each line of input data. Yields an object of type Line (see below)
68
+ Equivalent to an anonymous block such as awk '{print $1}'
69
+
70
+ finish {<code>}
71
+
72
+ Runs after the end of the input stream
73
+ Equivalent to an END condition in awk
74
+
75
+ You can provide multiple blocks of code for each condition.
76
+
77
+ ls -ltr | head -2 | bin/rawk '
78
+ every {|record| puts 1}
79
+ every {|record| puts 2} '
80
+
81
+ prints
82
+
83
+ 1
84
+ 2
85
+ 1
86
+ 2
87
+
88
+ === Not supported (yet)
89
+
90
+ * Conditional blocks
91
+
92
+ == Lines
93
+
94
+ every yields an object of type Line, which is a subclass of String that adds a cols method to access columns. The cols method returns an array of column values.
95
+
96
+ echo "hello world" | bin/rawk 'every do |record|
97
+ puts "#{record.cols.length} columns: #{record.cols.join(",")}"
98
+ end'
99
+
100
+ -> 2 columns: hello,world
101
+
102
+ Note that cols is aliased to c for convenience
103
+
104
+ echo "hello world" | bin/rawk 'every do |record|
105
+ puts record.c[0]
106
+ end'
107
+
108
+ -> hello
109
+
110
+ == Functions, classes and other ruby stuff
111
+
112
+ You can use ruby as normal before specifying any blocks. For example...
113
+
114
+ Functions
115
+
116
+ echo hello world | bin/rawk '
117
+ def print_first_column(record)
118
+ puts record.cols.first
119
+ end
120
+ every {|record| print_first_column(record)}'
121
+
122
+ Classes
123
+
124
+ echo hello world | bin/rawk '
125
+ class Printer
126
+ def self.print_first(record)
127
+ puts record.cols.first
128
+ end
129
+ end
130
+ every {|record| Printer.print_first(record)} '
131
+
132
+ Requires and gems
133
+
134
+ require works as you would expect although rubygems is not required by default.
135
+
136
+ echo "ruby" | bin/rawk '
137
+ require "rubygems"
138
+ require "active_support/all"
139
+ every {|record| puts record.cols.first.pluralize} '
140
+
141
+ -> rubies
142
+
143
+ == Builtins
144
+
145
+ rawk provides builtins as member variables. You can change them as you see fit.
146
+
147
+ @nr holds the current record number
148
+ ls -ltr | head -2 | bin/rawk 'every {puts @nr}'
149
+
150
+ @fs specifies the field separator applied to each record
151
+
152
+ echo "foo.bar" | bin/rawk '
153
+ start {@fs="."}
154
+ every {|record| puts "1: #{record.cols[0]} 2: #{record.cols[1]}"} '
155
+
156
+ -> 1: foo 2: bar
157
+
158
+ === Not supported (yet)
159
+
160
+ I'm working on support for the following awk built-ins
161
+
162
+ FILENAME:
163
+ Contains the name of the current input-file.
164
+ * Reading input data from files is not supported yet
165
+ * When I add it, I'll add @filename as a member
166
+
167
+ RS:
168
+ Stores the current "record separator" character. Since, by default, an input line is the input record, the default record separator character is a "newline".
169
+ * Will be @rs
170
+ * Currently, records are delimited by newline
171
+
172
+
173
+ === Redundant
174
+
175
+ The following awk built-ins are redundant in ruby
176
+
177
+ NF:
178
+ Keeps a count of the number of fields in an input record. The last field in the input record can be designated by $NF.
179
+ * NF can be coded as 'every {|record| record.cols.size}'
180
+ * $NF can be coded as 'every {|record| record.cols.last}'
181
+
182
+ OFS:
183
+ Stores the "output field separator", which separates the fields when Awk prints them. The default is a "space" character.
184
+ * Ruby's string handling is far superior to awk's so there is no point in implementing a print routine
185
+
186
+ ORS:
187
+ Stores the "output record separator", which separates the output records when Awk prints them. The default is a "newline" character.
188
+ * You already have complete control of the output stream. If you don't want newlines, use print or printf instead of puts
189
+
190
+ OFMT: Stores the format for numeric output. The default format is "%.6g".
191
+ * Ruby's string and number handling gives you much better control over this sort of thing
192
+
193
+ == Using rawk inside a ruby program
194
+
195
+ Technically, you can require rawk and use its classes directly. However, it will be messy until I add gem packaging, so I'll wait until you can install rawk as a gem before I go ahead and document it.
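The behaviour the README describes (start, every and finish blocks, a Line with cols and its c alias, and the @fs and @nr builtins) can be approximated in a few lines of plain Ruby. This is only a sketch and not rawk's implementation: SketchLine and SketchProgram are hypothetical names, and treating a single-space @fs as "split on runs of whitespace" is an assumption borrowed from awk.

```ruby
require "stringio"

# Hypothetical stand-in for rawk's Line: a String that can split itself into columns.
class SketchLine < String
  def initialize(text, fs = " ")
    super(text)
    @fs = fs
  end

  # Assumption: a single space means "split on whitespace", as in awk;
  # any other separator string is matched literally.
  def cols
    @fs == " " ? split : split(@fs)
  end
  alias c cols
end

# Hypothetical stand-in for the DSL: collects blocks, then replays them over the input.
class SketchProgram
  def initialize
    @fs = " "   # field separator, changeable from a start block
    @nr = 0     # current record number
    @blocks = { start: [], every: [], finish: [] }
  end

  def start(&block)
    @blocks[:start] << block
  end

  def every(&block)
    @blocks[:every] << block
  end

  def finish(&block)
    @blocks[:finish] << block
  end

  # Evaluates the program text against this object, so @fs and @nr inside
  # the blocks refer to this object's instance variables.
  def run(source, input, out = $stdout)
    @out = out
    instance_eval(source)
    @blocks[:start].each(&:call)
    input.each_line do |raw|
      @nr += 1
      line = SketchLine.new(raw.chomp, @fs)
      # Multiple blocks for one condition run in the order they were declared.
      @blocks[:every].each { |block| block.call(line) }
    end
    @blocks[:finish].each(&:call)
  end

  private

  # Routes the program's puts to the chosen output stream.
  def puts(*args)
    @out.puts(*args)
  end
end

program = <<~'RAWK'
  start  { @fs = "." }
  every  { |record| puts "#{@nr}: #{record.cols[0]} #{record.c.last}" }
  finish { puts "done" }
RAWK

SketchProgram.new.run(program, StringIO.new("foo.bar\nbaz.qux\n"))
# prints:
# 1: foo bar
# 2: baz qux
# done
```

Evaluating the program text with instance_eval is one way to make instance variables such as @fs visible inside the blocks; the real gem may wire this up differently.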
data/rawk.gemspec ADDED
@@ -0,0 +1,20 @@
1
+ require 'rubygems'
2
+
3
+ SPEC = Gem::Specification.new do |s|
4
+ s.name = "rawk"
5
+ s.version = '0.1.2'
6
+ s.author = "Adrian Mowat"
7
+ s.homepage = "https://github.com/mowat27/rawk"
8
+ s.summary = "An awk-inspired ruby DSL"
9
+ s.description = <<EOS
10
+ An awk-inspired ruby DSL
11
+ Provides an awk-like ruby interface for stream procesing
12
+ EOS
13
+ s.has_rdoc = false
14
+
15
+ s.files = `git ls-files`.split("\n")
16
+ s.test_files = `git ls-files -- {spec,features}/*`.split("\n")
17
+ s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
18
+
19
+ s.require_paths = ["lib/rawk"]
20
+ end
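The gemspec above builds its file lists by shelling out to git. As a small illustration of what the s.executables expression computes, the sketch below applies the same split and File.basename steps to a hard-coded string. The string is an assumed stand-in for the output of `git ls-files -- bin/*`, and executable_names is a hypothetical helper name.

```ruby
# Hypothetical helper mirroring the gemspec expression:
#   `git ls-files -- bin/*`.split("\n").map { |f| File.basename(f) }
def executable_names(git_output)
  git_output.split("\n").map { |path| File.basename(path) }
end

# Assumed git output: one tracked path per line.
p executable_names("bin/rawk\n")
# prints ["rawk"]
```

The result lines up with the executables entry (rawk) that appears in the 0.1.2 metadata.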
metadata CHANGED
@@ -1,24 +1,32 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: rawk
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.1
4
+ version: 0.1.2
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
8
8
  - Adrian Mowat
9
- autorequire: rawk/rawk
9
+ autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
12
  date: 2011-08-14 00:00:00.000000000Z
13
13
  dependencies: []
14
- description:
14
+ description: ! 'An awk-inspired ruby DSL
15
+
16
+ Provides an awk-like ruby interface for stream procesing
17
+
18
+ '
15
19
  email:
16
- executables: []
20
+ executables:
21
+ - rawk
17
22
  extensions: []
18
23
  extra_rdoc_files: []
19
24
  files:
25
+ - .gitignore
26
+ - README.rdoc
20
27
  - bin/rawk
21
28
  - lib/rawk/rawk.rb
29
+ - rawk.gemspec
22
30
  - spec/rawk/line_spec.rb
23
31
  - spec/rawk/rawk_spec.rb
24
32
  - spec/spec_helper.rb
@@ -27,7 +35,7 @@ licenses: []
27
35
  post_install_message:
28
36
  rdoc_options: []
29
37
  require_paths:
30
- - - lib
38
+ - lib/rawk
31
39
  required_ruby_version: !ruby/object:Gem::Requirement
32
40
  none: false
33
41
  requirements:
@@ -46,4 +54,7 @@ rubygems_version: 1.8.8
46
54
  signing_key:
47
55
  specification_version: 3
48
56
  summary: An awk-inspired ruby DSL
49
- test_files: []
57
+ test_files:
58
+ - spec/rawk/line_spec.rb
59
+ - spec/rawk/rawk_spec.rb
60
+ - spec/spec_helper.rb
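The metadata diff changes require_paths from lib to lib/rawk and clears autorequire (previously rawk/rawk). RubyGems adds each require path of an activated gem to the load path, so the change affects which name reaches lib/rawk/rawk.rb. The sketch below imitates that lookup with a plain file-existence check in a throwaway directory; resolvable? is a hypothetical helper, and the check ignores everything else require does (other file extensions, already-loaded features).

```ruby
require "tmpdir"
require "fileutils"

# Builds a temporary tree containing lib/rawk/rawk.rb, then reports whether
# `feature` would be found as a .rb file under the given require path.
def resolvable?(require_path, feature)
  Dir.mktmpdir do |root|
    FileUtils.mkdir_p(File.join(root, "lib", "rawk"))
    File.write(File.join(root, "lib", "rawk", "rawk.rb"), "")
    File.exist?(File.join(root, require_path, "#{feature}.rb"))
  end
end

puts resolvable?("lib", "rawk/rawk")  # 0.1.1 layout: require "rawk/rawk"
puts resolvable?("lib/rawk", "rawk")  # 0.1.2 layout: require "rawk"
puts resolvable?("lib", "rawk")       # the short name under the old path
# prints:
# true
# true
# false
```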