rawk 0.1.1 → 0.1.2
- data/.gitignore +1 -0
- data/README.rdoc +195 -0
- data/rawk.gemspec +20 -0
- metadata +17 -6
data/.gitignore
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
*.gem
|
data/README.rdoc
ADDED
@@ -0,0 +1,195 @@
|
|
1
|
+
= Rawk
|
2
|
+
|
3
|
+
An awk-inspired ruby DSL
|
4
|
+
|
5
|
+
Last week, after years of ignoring awk, I ran into a shell script problem where it was the only viable solution (we didn't have ruby on the server) and I was forced to learn a bit more about it.
|
6
|
+
|
7
|
+
Once I had awk figured out, I thought it would be fun to write an awk DSL using ruby. It's turned out to be quite an interesting little project for the daily train ride to work and back.
|
8
|
+
|
9
|
+
Obviously, you can use ruby -e and {other magic}[http://code.joejag.com/2009/using-ruby-as-an-awk-replacement/] to execute snippets of ruby, but I like the way awk provides a bit more structure and a richer environment for more complex command line mangling.
|
10
|
+
|
11
|
+
== Install
|
12
|
+
|
13
|
+
Clone the git repo or download the file as a .tar or .zip. The bin directory contains the rawk executable. rawk does not require any gems (except rspec, if you want to run the tests).
|
14
|
+
|
15
|
+
I will package rawk as a gem when I find a spare moment.
|
16
|
+
|
17
|
+
== Example
|
18
|
+
|
19
|
+
A simple awk program
|
20
|
+
|
21
|
+
$ ls -ltr | awk '
|
22
|
+
BEGIN {print "Starting..."}
|
23
|
+
{print $9, $1}
|
24
|
+
END {print "done"} '
|
25
|
+
|
26
|
+
Creates the following output
|
27
|
+
|
28
|
+
Starting...
|
29
|
+
total
|
30
|
+
spec drwxr-xr-x
|
31
|
+
lib drwxr-xr-x
|
32
|
+
bin drwxr-xr-x
|
33
|
+
README -rw-r--r--
|
34
|
+
done
|
35
|
+
|
36
|
+
This can be written using rawk as
|
37
|
+
|
38
|
+
$ ls -ltr | bin/rawk '
|
39
|
+
start {puts "Starting..."}
|
40
|
+
every {|record| puts "#{record.cols[8]} #{record.cols[0]}"}
|
41
|
+
finish {puts "done"} '
|
42
|
+
|
43
|
+
And it also creates the same output
|
44
|
+
|
45
|
+
Starting...
|
46
|
+
total
|
47
|
+
spec drwxr-xr-x
|
48
|
+
lib drwxr-xr-x
|
49
|
+
bin drwxr-xr-x
|
50
|
+
README -rw-r--r--
|
51
|
+
done
|
52
|
+
|
53
|
+
Notice that the structure and semantics of an awk program are preserved and you use normal ruby code to process the input stream. I've had to bend the knee to the ruby interpreter and change the syntax slightly, but I think it actually makes rawk programs a bit clearer than awk.
|
54
|
+
|
55
|
+
Detailed descriptions are shown below. I'm assuming you have a working knowledge of awk. Wikipedia provides an {easy primer}[http://en.wikipedia.org/wiki/AWK] if you need to brush up.
|
56
|
+
|
57
|
+
== Conditions and blocks
|
58
|
+
|
59
|
+
rawk provides 3 built-in conditions.
|
60
|
+
|
61
|
+
start {<code>}
|
62
|
+
|
63
|
+
Runs before any lines are read from the input stream. Equivalent to a BEGIN condition in awk
|
64
|
+
|
65
|
+
every {|record| <code>}
|
66
|
+
|
67
|
+
Runs once for each line of input data. Yields an object of type Line (see below)
|
68
|
+
Equivalent to an anonymous block such as awk '{print $1}'
|
69
|
+
|
70
|
+
finish {<code>}
|
71
|
+
|
72
|
+
Runs after the end of the input stream
|
73
|
+
Equivalent to an END condition in awk
|
74
|
+
|
75
|
+
You can provide multiple blocks of code for each condition.
|
76
|
+
|
77
|
+
ls -ltr | head -2 | bin/rawk '
|
78
|
+
every {|record| puts 1}
|
79
|
+
every {|record| puts 2} '
|
80
|
+
|
81
|
+
prints
|
82
|
+
|
83
|
+
1
|
84
|
+
2
|
85
|
+
1
|
86
|
+
2
|
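If you are curious how a start/every/finish DSL like this could be put together, here is a minimal sketch. It is not rawk's actual implementation: the MiniRawk class and its run method are hypothetical names made up for this illustration. It assumes the program text is evaluated against a single object, so that blocks registered for the same condition run in the order they were written.

```ruby
require "stringio"

# Hypothetical illustration only: MiniRawk is not part of rawk.
class MiniRawk
  def initialize
    @start_blocks  = []
    @every_blocks  = []
    @finish_blocks = []
  end

  # Each condition can be given several times; blocks are kept in order.
  def start(&block)
    @start_blocks << block
  end

  def every(&block)
    @every_blocks << block
  end

  def finish(&block)
    @finish_blocks << block
  end

  # Evaluate the program text to register the blocks, then drive them
  # over the input stream one line at a time.
  def run(program, input)
    instance_eval(program)
    @start_blocks.each { |b| instance_exec(&b) }
    input.each_line do |line|
      @every_blocks.each { |b| instance_exec(line.chomp, &b) }
    end
    @finish_blocks.each { |b| instance_exec(&b) }
  end
end

program = <<~RUBY
  start  { puts "Starting..." }
  every  { |record| puts 1 }
  every  { |record| puts 2 }
  finish { puts "done" }
RUBY

MiniRawk.new.run(program, StringIO.new("a\nb\n"))
```

With two lines of input, this prints Starting..., then 1 and 2 for each line, then done.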
87
|
+
|
88
|
+
=== Not supported (yet)
|
89
|
+
|
90
|
+
* Conditional blocks
|
91
|
+
|
92
|
+
== Lines
|
93
|
+
|
94
|
+
every yields an object of type Line, which is a subclass of String that adds a cols method to access columns. The cols method returns an array of column values.
|
95
|
+
|
96
|
+
echo "hello world" | bin/rawk 'every do |record|
|
97
|
+
puts "#{record.cols.length} columns: #{record.cols.join(",")}"
|
98
|
+
end'
|
99
|
+
|
100
|
+
-> 2 columns: hello,world
|
101
|
+
|
102
|
+
Note that cols is aliased to c for convenience
|
103
|
+
|
104
|
+
echo "hello world" | bin/rawk 'every do |record|
|
105
|
+
puts record.c[0]
|
106
|
+
end'
|
107
|
+
|
108
|
+
-> hello
|
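To make the idea of a String subclass with a cols method concrete, here is a rough sketch. It is only an assumption about how such a class might look: SketchLine and its fs argument are hypothetical and are not rawk's Line class.

```ruby
# Hypothetical sketch: SketchLine stands in for rawk's Line class.
class SketchLine < String
  def initialize(text, fs = " ")
    super(text)
    @fs = fs
  end

  # Split the record into columns on the field separator.
  # A single space splits on runs of whitespace, much like awk's default.
  def cols
    split(@fs)
  end
  alias c cols
end

line = SketchLine.new("hello world")
puts "#{line.cols.length} columns: #{line.cols.join(",")}"
puts line.c[0]
```

Because the object is still a String, every normal String method remains available on the record alongside cols.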
109
|
+
|
110
|
+
== Functions, classes and other ruby stuff
|
111
|
+
|
112
|
+
You can use ruby as normal before specifying any blocks. For example...
|
113
|
+
|
114
|
+
Functions
|
115
|
+
|
116
|
+
echo hello world | bin/rawk '
|
117
|
+
def print_first_column(record)
|
118
|
+
puts record.cols.first
|
119
|
+
end
|
120
|
+
every {|record| print_first_column(record)}'
|
121
|
+
|
122
|
+
Classes
|
123
|
+
|
124
|
+
echo hello world | bin/rawk '
|
125
|
+
class Printer
|
126
|
+
def self.print_first(record)
|
127
|
+
puts record.cols.first
|
128
|
+
end
|
129
|
+
end
|
130
|
+
every {|record| Printer.print_first(record)} '
|
131
|
+
|
132
|
+
Requires and gems
|
133
|
+
|
134
|
+
require works as you would expect although rubygems is not required by default.
|
135
|
+
|
136
|
+
echo "ruby" | bin/rawk '
|
137
|
+
require "rubygems"
|
138
|
+
require "active_support/all"
|
139
|
+
every {|record| puts record.cols.first.pluralize} '
|
140
|
+
|
141
|
+
-> rubies
|
142
|
+
|
143
|
+
== Builtins
|
144
|
+
|
145
|
+
rawk provides builtins as member variables. You can change them as you see fit.
|
146
|
+
|
147
|
+
@nr holds the current record number
|
148
|
+
ls -ltr | head -2 | bin/rawk 'every {puts @nr}'
|
149
|
+
|
150
|
+
@fs specifies the field separator applied to each record
|
151
|
+
|
152
|
+
echo "foo.bar" | bin/rawk '
|
153
|
+
start {@fs="."}
|
154
|
+
every {|record| puts "1: #{record.cols[0]} 2: #{record.cols[1]}"} '
|
155
|
+
|
156
|
+
-> 1: foo 2: bar
|
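For comparison, the bookkeeping that @nr and @fs do for you can be written out by hand in plain ruby. This is just a sketch: number_and_split is a hypothetical helper, not something rawk provides.

```ruby
require "stringio"

# Hypothetical helper, not part of rawk: numbers each record (like @nr)
# and splits it on a field separator (like @fs).
def number_and_split(input, fs)
  input.each_line.with_index(1).map do |line, nr|
    [nr, line.chomp.split(fs)]
  end
end

number_and_split(StringIO.new("foo.bar\nbaz.qux\n"), ".").each do |nr, cols|
  puts "#{nr}: #{cols[0]} #{cols[1]}"
end
```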
157
|
+
|
158
|
+
=== Not supported (yet)
|
159
|
+
|
160
|
+
I'm working on support for the following awk built-ins
|
161
|
+
|
162
|
+
FILENAME:
|
163
|
+
Contains the name of the current input-file.
|
164
|
+
* Reading input data is not supported yet
|
165
|
+
* When I add it, I'll add @filename as a member
|
166
|
+
|
167
|
+
RS:
|
168
|
+
Stores the current "record separator" character. Since, by default, an input line is the input record, the default record separator character is a "newline".
|
169
|
+
* Will be @rs
|
170
|
+
* Currently, records are delimited by newline
|
171
|
+
|
172
|
+
|
173
|
+
=== Redundant
|
174
|
+
|
175
|
+
The following awk built-ins are redundant in ruby
|
176
|
+
|
177
|
+
NF:
|
178
|
+
Keeps a count of the number of fields in an input record. The last field in the input record can be designated by $NF.
|
179
|
+
* NF can be coded as 'every {|record| record.cols.size}'
|
180
|
+
* $NF can be coded as 'every {|record| record.cols.last}'
|
181
|
+
|
182
|
+
OFS:
|
183
|
+
Stores the "output field separator", which separates the fields when Awk prints them. The default is a "space" character.
|
184
|
+
* Ruby's string handling is far superior to awk's so there is no point in implementing a print routine
|
185
|
+
|
186
|
+
ORS:
|
187
|
+
Stores the "output record separator", which separates the output records when Awk prints them. The default is a "newline" character.
|
188
|
+
* You already have complete control of the output stream. If you don't want newlines, use print or printf instead of puts
|
189
|
+
|
190
|
+
OFMT: Stores the format for numeric output. The default format is "%.6g".
|
191
|
+
* Ruby's string and number handling gives you much better control over this sort of thing
|
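As a rough illustration of why these built-ins are redundant, here are plain ruby stand-ins for each of them. The awk_equivalents helper and its ofs argument are hypothetical and exist only for this example.

```ruby
# Hypothetical helper, not part of rawk: plain ruby stand-ins for awk's
# NF, $NF, OFS and OFMT.
def awk_equivalents(record, ofs = "-")
  cols = record.split(" ")
  {
    nf:     cols.size,                  # awk: NF
    last:   cols.last,                  # awk: $NF
    joined: cols.join(ofs),             # awk: fields printed with OFS between them
    number: format("%.6g", 3.14159265)  # awk: default OFMT applied to a number
  }
end

result = awk_equivalents("alpha beta gamma")
puts result[:joined]
# print adds no newline, so you choose the record separator yourself (awk: ORS)
print result[:nf], ";"
```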
192
|
+
|
193
|
+
== Using rawk inside a ruby program
|
194
|
+
|
195
|
+
Technically, you can require rawk and use its classes directly. However, it will be messy until I add gem packaging, so I'll wait until you can install rawk as a gem before I go ahead and document it.
|
data/rawk.gemspec
ADDED
@@ -0,0 +1,20 @@
|
|
1
|
+
require 'rubygems'
|
2
|
+
|
3
|
+
SPEC = Gem::Specification.new do |s|
|
4
|
+
s.name = "rawk"
|
5
|
+
s.version = '0.1.2'
|
6
|
+
s.author = "Adrian Mowat"
|
7
|
+
s.homepage = "https://github.com/mowat27/rawk"
|
8
|
+
s.summary = "An awk-inspired ruby DSL"
|
9
|
+
s.description = <<EOS
|
10
|
+
An awk-inspired ruby DSL
|
11
|
+
Provides an awk-like ruby interface for stream processing
|
12
|
+
EOS
|
13
|
+
s.has_rdoc = false
|
14
|
+
|
15
|
+
s.files = `git ls-files`.split("\n")
|
16
|
+
s.test_files = `git ls-files -- {spec,features}/*`.split("\n")
|
17
|
+
s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
|
18
|
+
|
19
|
+
s.require_paths = ["lib/rawk"]
|
20
|
+
end
|
metadata
CHANGED
@@ -1,24 +1,32 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: rawk
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.1.1
|
4
|
+
version: 0.1.2
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
8
8
|
- Adrian Mowat
|
9
|
-
autorequire:
|
9
|
+
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
12
|
date: 2011-08-14 00:00:00.000000000Z
|
13
13
|
dependencies: []
|
14
|
-
description:
|
14
|
+
description: ! 'An awk-inspired ruby DSL
|
15
|
+
|
16
|
+
Provides an awk-like ruby interface for stream processing
|
17
|
+
|
18
|
+
'
|
15
19
|
email:
|
16
|
-
executables:
|
20
|
+
executables:
|
21
|
+
- rawk
|
17
22
|
extensions: []
|
18
23
|
extra_rdoc_files: []
|
19
24
|
files:
|
25
|
+
- .gitignore
|
26
|
+
- README.rdoc
|
20
27
|
- bin/rawk
|
21
28
|
- lib/rawk/rawk.rb
|
29
|
+
- rawk.gemspec
|
22
30
|
- spec/rawk/line_spec.rb
|
23
31
|
- spec/rawk/rawk_spec.rb
|
24
32
|
- spec/spec_helper.rb
|
@@ -27,7 +35,7 @@ licenses: []
|
|
27
35
|
post_install_message:
|
28
36
|
rdoc_options: []
|
29
37
|
require_paths:
|
30
|
-
-
|
38
|
+
- lib/rawk
|
31
39
|
required_ruby_version: !ruby/object:Gem::Requirement
|
32
40
|
none: false
|
33
41
|
requirements:
|
@@ -46,4 +54,7 @@ rubygems_version: 1.8.8
|
|
46
54
|
signing_key:
|
47
55
|
specification_version: 3
|
48
56
|
summary: An awk-inspired ruby DSL
|
49
|
-
test_files:
|
57
|
+
test_files:
|
58
|
+
- spec/rawk/line_spec.rb
|
59
|
+
- spec/rawk/rawk_spec.rb
|
60
|
+
- spec/spec_helper.rb
|