parsely 0.1.3 → 0.1.4
- data/.travis.yml +38 -0
- data/LICENSE +12 -0
- data/README.mkd +45 -0
- data/Rakefile +5 -0
- data/TODO +5 -1
- data/lib/parsely.rb +65 -61
- data/parsely.gemspec +1 -0
- data/test/basic/017-number-lines/command +1 -0
- data/test/basic/017-number-lines/lines.txt +3 -0
- data/test/basic/017-number-lines/output +3 -0
- data/test/basic/018-increment-lineno/command +1 -0
- data/test/basic/018-increment-lineno/lines.txt +3 -0
- data/test/basic/018-increment-lineno/output +3 -0
- data/test/basic/019-increment-lineno-and-values/command +1 -0
- data/test/basic/019-increment-lineno-and-values/lines.txt +3 -0
- data/test/basic/019-increment-lineno-and-values/output +3 -0
- metadata +30 -7
data/.travis.yml
ADDED
@@ -0,0 +1,38 @@
+# Passes arguments to bundle install (http://gembundler.com/man/bundle-install.1.html)
+#bundler_args: --binstubs
+
+# Specify which ruby versions you wish to run your tests on, each version will be used
+rvm:
+- 1.9.2
+- ruby-head
+
+# Define how to run your tests (defaults to `bundle exec rake` or `rake` depending on whether you have a `Gemfile`)
+#script: "bundle exec rake db:drop db:create db:migrate test"
+
+# Define tasks to be completed before and after tests run . Will allow folding of content on frontend
+#before_script:
+# - command_1
+# - command_2
+
+#after_script:
+# - command_1
+# - command_2
+
+# Specify an ENV variable to run before: 'bundle install' and 'rake' (or your defined 'script')
+#env: "RAILS_ENV='test' "
+
+# Specify the recipients for email notification
+#notifications:
+# recipients:
+# - email-address-1
+# - email-address-2
+# disabled: true # Disable email notifications
+
+# Specify branches to build
+# You can either specify only or except. If you specify both, except will be ignored.
+#branches:
+# only:
+# - master
+# except:
+# - legacy
+
data/LICENSE
ADDED
@@ -0,0 +1,12 @@
+THE DRINK-WARE LICENSE (forked from BEER-WARE r42)
+
+Gabriele Renzi (http://www.riffraff.info)
+wrote this code, whereas not specified otherwise.
+
+As long as you retain this notice you can do whatever you want with this stuff.
+If we meet some day, and you think this stuff is worth it, you can buy me
+a beer, coffee, palinka shot or any other drink in return.
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the Artistic License version 2.0.
+
data/README.mkd
ADDED
@@ -0,0 +1,45 @@
+# DISCLAIMER
+
+Do Not Use This.
+At the moment it is mostly a code dump, and a bad one at
+that, while I slowly add the stuff I need. One day I will refactor the
+code and add useful options and make it reliable and it may become a
+useful project for other people. As it stands, it is not, just move on.
+
+
+## Summary
+parsely is a tool to extract and manipulate text files.
+Basically, it allows you to run ruby one liners (think `-n/-p`) with some additional
+shortcuts.
+
+parsely is intended as a replacement for all those single-use-and-discard scripts
+in sed/awk/perl/ruby that I constantly end up rewriting, such as counting frequencies,
+summing fields, selecting (c,t)sv rows by field values etc
+
+It does nothing you can't do with a few pipes, sed, awk, grep, ack, perl,
+ruby, sort, uniq, bc, ministats and comm.
+
+It is useful for me because
+* I am very bad at remembering options for command line tools, and get
+confused when BSD and GNU tools don't match
+* I always get confused escaping stuff in the shell
+* I have written or googled a freq.awk a dozen times
+
+This is most likely useless to you.
+
+## INSTALLATION
+
+Running
+
+gem install parsely
+
+should be enough to install.
+I use ruby (YARV) 1.9.2 and have not tested this anywhere else.
+
+## SUPPORT
+
+Open a ticket at http://github.com/riffraff/parsely/issues if you want
+something in parsely, but I don't think you should use this tool, at
+least for the next couple of years.
+Or you can write me an email at rff.rff+parsely@gmail.com if you want.
+
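The README above names counting frequencies as one of the throwaway scripts parsely is meant to replace. As a rough illustration of that task in plain Ruby (this is not parsely's own code; the `frequencies` helper and the sample input are invented for the sketch), a counting hash plus the count-descending, key-ascending ordering that `lib/parsely.rb` uses for its frequency op is enough:

```ruby
# Hypothetical helper, not part of parsely: count how often each
# whitespace-separated first field occurs in the given lines.
def frequencies(lines)
  counts = Hash.new(0)
  lines.each do |line|
    field = line.split.first
    next if field.nil? # skip blank lines
    counts[field] += 1
  end
  # Most frequent first; ties are broken alphabetically by key.
  counts.sort_by { |key, count| [-count, key] }
end

sample = ["GET /a", "POST /b", "GET /c", "", "HEAD /d"]
frequencies(sample).each { |key, count| puts "#{count} #{key}" }
```

The helper returns an array of `[key, count]` pairs, so the caller decides how to print them.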
data/Rakefile
CHANGED
data/TODO
CHANGED
@@ -3,7 +3,7 @@
 - _2 in dump by _1 in other
 + _1 if /www/
 + _1 if _2 =~ x
-
++ .parselyrc
 - split by: if a X or not, or if _1 in other_file (subsides comm)
 + in "foo 1\nbar 2" "sum(_2) if _2 < 3" to see how much I can delete
 - _1._1 for submatch (e.g. user agent)
@@ -18,3 +18,7 @@ skip jsonp bit
 summaryse is a cool extension from which to steal stuff. Maybe provide basic functionalities through that
 - given [/m/09_pbpl /type/object/key /soft/isbn/9780491035453,..] and [/m/077601h /book/isbn/book_editions /m/09_pbpl] get [/m/09_pbpl /soft/isbn/9780491035453]
 - use merge sorted and merge unsorted probably
+- find a simpler way to do aggregates
+- top k
+selext top/max N or N% ditto for minus
+select outliers
data/lib/parsely.rb
CHANGED
@@ -1,13 +1,22 @@
 require 'set'
 require 'English'
 $OUTPUT_FIELD_SEPARATOR = ' '
-
+module Kernel
 def p args
 STDERR.puts(args.inspect) #if $DEBUG
 end
+end
 
 class PseudoBinding
 class PerlVar < String
+# this is not defined in terms of <=>
+def == other
+if other.is_a? Numeric
+to_f == other
+else
+super
+end
+end
 def <=> other
 if other.is_a? Numeric
 to_f <=> other
@@ -18,11 +27,24 @@ class PseudoBinding
 def inspect
 "PerlVar(#{super})"
 end
+#unneed as of now
+#def coerce something
+# [something, to_f]
+#end
+def + other
+case other
+when Numeric
+PerlVar.new((to_i + other).to_s)
+when String
+PerlVar.new((to_s + other).to_s)
+end
+end
 end
 PerlNil = PerlVar.new ''
 attr :line
+attr :vals
 def initialize lineno, vals
-@line, @vals = lineno, vals.map {|x| PerlVar.new(x)}
+@line, @vals = PerlVar.new(lineno.to_s), vals.map {|x| PerlVar.new(x)}
 end
 def method_missing name, *args
 if args.empty?
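The two `PerlVar` additions in this file (`==` and `+`) can be tried in isolation. The following is a minimal standalone sketch that copies only those two methods from the diff; the variable names and sample values are illustrative. It shows why wrapping the line number in a `PerlVar` lets an expression such as `line + 1 + ": " + _0` mix arithmetic and string concatenation:

```ruby
# Standalone copy of the PerlVar idea from the diff, for experimentation.
class PerlVar < String
  # Compare numerically against numbers, as a plain string otherwise.
  def ==(other)
    other.is_a?(Numeric) ? to_f == other : super
  end

  # Numbers are added arithmetically, strings are concatenated.
  # Any other operand falls through the case and yields nil.
  def +(other)
    case other
    when Numeric then PerlVar.new((to_i + other).to_s)
    when String  then PerlVar.new(to_s + other)
    end
  end
end

lineno = PerlVar.new("0")
puts lineno == 0               # numeric comparison against an Integer
puts lineno + 1 + ": " + "foo" # arithmetic first, then concatenation
```

Because the numeric branch goes through `to_i`, fractional parts are dropped: `PerlVar.new("1.5") + 1` gives `"2"`, not `"2.5"`.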
@@ -36,6 +58,11 @@ class PseudoBinding
 end
 end
 end
+class Array
+def value
+self
+end
+end
 class String
 def value
 to_s
@@ -48,33 +75,18 @@ class Proc
 end
 end
 class Parsely
-VERSION = "0.1.
+VERSION = "0.1.4"
 def self.cmd(&block)
-
-klass.class_eval do
-def process(items)
-value.assign(items)
-_process(value)
-end
-end
-klass
+Struct.new :value, &block
 end
 RGX= /"(.*?)"|\[(.*?)\]|([^\s]+)/
-
-
-
-
-def process(items)
-items[index]
-end
-def to_i
-value.to_i
-end
-def to_f
-value.to_f
+
+Expression = Struct.new :code, :items do
+def process(pb)
+result = pb.instance_eval(code)
 end
 def to_s
-
+code.to_s
 end
 end
 Ops = {
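After this change `cmd` is a one-liner: it returns an anonymous `Struct` subclass whose body is the block passed in. A minimal sketch of that pattern outside the `Parsely` class follows. The top-level `cmd` and the `Sum` constant are stand-ins for illustration, and the op is simplified to return the running total directly rather than a proc as the ops in the diff do:

```ruby
# Minimal stand-in for Parsely.cmd: an anonymous Struct subclass whose
# body (the block) defines the op's methods.
def cmd(&block)
  Struct.new(:value, &block)
end

# A running-sum op in the style of the ops in the diff (simplified).
Sum = cmd do
  def initialize(value)
    super          # let Struct store the :value member
    @total = 0
  end

  def process(item)
    @total += item.to_i
  end
end

op = Sum.new(nil)
%w[1 2 3].each { |item| op.process(item) }
puts op.process("4") # 10
```

`Struct.new` evaluates the block in the context of the new class, which is why the plain `def` statements inside it become instance methods of the op.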
@@ -85,27 +97,27 @@ class Parsely
 @result = proc { @running_value }
 @result.single = true
 end
-def
+def process(value)
 if value.to_f < @running_value
 @running_value = value.to_f
 end
 @result
 end
 end,
-
+:max => cmd do
 def initialize index
 super
 @running_value = Float::MIN #-Inf would be better
 @result = proc { @running_value }
 @result.single = true
 end
-def
+def process(value)
 if value.to_f > @running_value
 @running_value = value.to_f
 end
 @result
 end
-
+end,
 :sum => cmd do
 def initialize index
 super
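The inline comment on the `:max` op says `-Inf` would be a better seed than `Float::MIN`. In Ruby, `Float::MIN` is the smallest positive normalized float, not the most negative one, so a running maximum seeded with it can never report a negative result. A small sketch of the difference, using a hypothetical `running_max` helper that is not part of parsely:

```ruby
# Running maximum over string values, seeded with a configurable start value.
def running_max(values, seed)
  values.reduce(seed) { |max, v| v.to_f > max ? v.to_f : max }
end

negatives = %w[-3 -1 -7]
puts running_max(negatives, Float::MIN)       # the seed wins: a tiny positive number
puts running_max(negatives, -Float::INFINITY) # -1.0
```

With `-Float::INFINITY` as the seed, any real input replaces it on the first comparison.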
@@ -113,7 +125,7 @@ class Parsely
 @result = proc { @running_value }
 @result.single = true
 end
-def
+def process(value)
 @running_value += value.to_i
 @result
 end
@@ -126,7 +138,7 @@ class Parsely
 @result = proc { @running_value/@running_count.to_f }
 @result.single = true
 end
-def
+def process(value)
 @running_value += value.to_i
 @running_count += 1
 @result
@@ -137,7 +149,7 @@ class Parsely
 super
 @running_freqs = Hash.new(0)
 @running_count = 0
-
+as_ary=nil
 @result = proc do
 if as_ary.nil?
 as_ary=@running_freqs.sort_by do |k,v| [-v,k] end.each
@@ -146,7 +158,7 @@ class Parsely
 [v, k]
 end
 end
-def
+def process(value)
 @running_freqs[value.to_s]+=1
 @running_count += 1
 @result
@@ -180,39 +192,32 @@ class Parsely
 cached.next
 end
 end
-def
+def process(value)
 @running_values << value.to_i
 @result
 end
 end,
 }
+PseudoBinding.class_eval do
+Ops.each do |k,v|
+#instantiating the object is expensive and we are not using 99% of them
+obj = nil
+define_method k do |values|
+obj ||= v.new(nil)
+obj.process(values)
+end
+end
+end
 
 def parse(expr)
 val, cond = expr.split(/ if /)
 # p [ val, cond]
-
-
-
-
-opname = $1.to_sym
-klass=Ops[opname]
-if klass.nil?
-=begin
-if respond_to? opname
-klass = cmd do
-def _process(value)
-send opname, value
-end
-end
-else
-=end
-abort "unknown op '#$1'"
-end
-klass.new(Value.new($2.to_i))
-when /\_(\d+)/
-Value.new($1.to_i)
-end
+
+val = '['+val+']'
+if val =~ /([\(\)\w])( ([\(\)\w]))+/
+val = val.split(" ").join(",")
 end
+r = Expression.new(val)
 [r, parse_cond(cond)]
 end
 
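The new `parse` no longer builds op objects itself. It rewrites the value part of the expression into Ruby array-literal source, which `Expression#process` later evaluates with `instance_eval`. The rewriting step can be sketched on its own; `rewrite_value` is a hypothetical name, and the body copies the three added lines from the hunk above:

```ruby
# Hypothetical standalone version of the rewriting step in Parsely#parse.
# It turns the value part of an expression into Ruby array-literal source.
def rewrite_value(val)
  val = "[" + val + "]"
  # A word character or parenthesis, a space, then another such character:
  # treat the spaces as column separators and replace them with commas.
  if val =~ /([\(\)\w])( ([\(\)\w]))+/
    val = val.split(" ").join(",")
  end
  val
end

puts rewrite_value("_1 _2")            # [_1,_2]
puts rewrite_value("min(_1) max(_1)")  # [min(_1),max(_1)]
puts rewrite_value('line + ": " + _0') # [line + ": " + _0]
```

One consequence of matching on the raw text is that a string literal containing two words separated by a space also triggers the rewrite: `_1 + "foo bar"` becomes `[_1,+,"foo,bar"]`, which is no longer the expression that was written.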
@@ -223,7 +228,7 @@ class Parsely
 when nil, ''
 proc { |bnd| true }
 else
-
+proc { |bnd| bnd.instance_eval(str) }
 end
 end
 
@@ -250,7 +255,7 @@ class Parsely
 end
 
 def main_loop(expr,lines)
-
+expression, cond =parse(expr)
 result = []
 result = lines.map.with_index do |line, lineno|
 line.chomp!
@@ -263,15 +268,14 @@ class Parsely
 #XXX ugly
 next unless items
 b = PseudoBinding.new(lineno, items)
-
-a.process(items) if cond[b]
-end
+expression.process(b) if cond[b]
 end
 last = []
 result.each do |cols|
+next if cols.nil? #when test fails
 result_line = cols.map do |col|
 next if col.nil?
-col.value
+col.value
 end.join.strip
 same_results = cols.zip(last).map do |a,b|
 a.respond_to?(:single) && a.single && a.object_id == b.object_id && !a.is_a?(Numeric)
data/parsely.gemspec
CHANGED
@@ -10,6 +10,7 @@ Gem::Specification.new do |s|
 s.homepage = "http://github.com/riffraff/parsely"
 s.summary = %q{a simple tool for text file wrangling}
 s.description = %q{parsely is a simple tool for managing text files.
+DO NOT USE IT.
 Mostly to replace a lot of awk/sed/ruby/perl one-off scripts.
 This is an internal release, guaranteed to break and ruin your life.}
 
data/test/basic/017-number-lines/command
ADDED
@@ -0,0 +1 @@
+parsely 'line + ": " + _0' lines.txt
data/test/basic/018-increment-lineno/command
ADDED
@@ -0,0 +1 @@
+parsely 'line + 1 + ": " + _0' lines.txt
data/test/basic/019-increment-lineno-and-values/command
ADDED
@@ -0,0 +1 @@
+parsely 'line + ": " + (_0 + 1)' lines.txt
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: parsely
 version: !ruby/object:Gem::Version
-version: 0.1.
+version: 0.1.4
 prerelease:
 platform: ruby
 authors:
@@ -9,11 +9,11 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2011-
+date: 2011-12-10 00:00:00.000000000Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: rake
-requirement: &
+requirement: &2152641300 !ruby/object:Gem::Requirement
 none: false
 requirements:
 - - ! '>='
@@ -21,10 +21,11 @@ dependencies:
 version: '0'
 type: :development
 prerelease: false
-version_requirements: *
-description: ! "parsely is a simple tool for managing text files.\n
-to replace a lot of awk/sed/ruby/perl one-off
-is an internal release, guaranteed to break
+version_requirements: *2152641300
+description: ! "parsely is a simple tool for managing text files.\n DO
+NOT USE IT.\n Mostly to replace a lot of awk/sed/ruby/perl one-off
+scripts.\n This is an internal release, guaranteed to break
+and ruin your life."
 email:
 - rff.rff+parsely@gmail.com
 executables:
@@ -33,7 +34,10 @@ extensions: []
 extra_rdoc_files: []
 files:
 - .gitignore
+- .travis.yml
 - Gemfile
+- LICENSE
+- README.mkd
 - Rakefile
 - TODO
 - bin/parsely
@@ -89,6 +93,15 @@ files:
 - test/basic/016-rcfile-home/counts.txt
 - test/basic/016-rcfile-home/myhome/.parselyrc
 - test/basic/016-rcfile-home/output
+- test/basic/017-number-lines/command
+- test/basic/017-number-lines/lines.txt
+- test/basic/017-number-lines/output
+- test/basic/018-increment-lineno/command
+- test/basic/018-increment-lineno/lines.txt
+- test/basic/018-increment-lineno/output
+- test/basic/019-increment-lineno-and-values/command
+- test/basic/019-increment-lineno-and-values/lines.txt
+- test/basic/019-increment-lineno-and-values/output
 - test/cli-runner.rb
 homepage: http://github.com/riffraff/parsely
 licenses: []
@@ -165,4 +178,14 @@ test_files:
 - test/basic/016-rcfile-home/counts.txt
 - test/basic/016-rcfile-home/myhome/.parselyrc
 - test/basic/016-rcfile-home/output
+- test/basic/017-number-lines/command
+- test/basic/017-number-lines/lines.txt
+- test/basic/017-number-lines/output
+- test/basic/018-increment-lineno/command
+- test/basic/018-increment-lineno/lines.txt
+- test/basic/018-increment-lineno/output
+- test/basic/019-increment-lineno-and-values/command
+- test/basic/019-increment-lineno-and-values/lines.txt
+- test/basic/019-increment-lineno-and-values/output
 - test/cli-runner.rb
+has_rdoc: