parsely 0.1.3 → 0.1.4
- data/.travis.yml +38 -0
- data/LICENSE +12 -0
- data/README.mkd +45 -0
- data/Rakefile +5 -0
- data/TODO +5 -1
- data/lib/parsely.rb +65 -61
- data/parsely.gemspec +1 -0
- data/test/basic/017-number-lines/command +1 -0
- data/test/basic/017-number-lines/lines.txt +3 -0
- data/test/basic/017-number-lines/output +3 -0
- data/test/basic/018-increment-lineno/command +1 -0
- data/test/basic/018-increment-lineno/lines.txt +3 -0
- data/test/basic/018-increment-lineno/output +3 -0
- data/test/basic/019-increment-lineno-and-values/command +1 -0
- data/test/basic/019-increment-lineno-and-values/lines.txt +3 -0
- data/test/basic/019-increment-lineno-and-values/output +3 -0
- metadata +30 -7
data/.travis.yml
ADDED
@@ -0,0 +1,38 @@
+# Passes arguments to bundle install (http://gembundler.com/man/bundle-install.1.html)
+#bundler_args: --binstubs
+
+# Specify which ruby versions you wish to run your tests on, each version will be used
+rvm:
+- 1.9.2
+- ruby-head
+
+# Define how to run your tests (defaults to `bundle exec rake` or `rake` depending on whether you have a `Gemfile`)
+#script: "bundle exec rake db:drop db:create db:migrate test"
+
+# Define tasks to be completed before and after tests run . Will allow folding of content on frontend
+#before_script:
+# - command_1
+# - command_2
+
+#after_script:
+# - command_1
+# - command_2
+
+# Specify an ENV variable to run before: 'bundle install' and 'rake' (or your defined 'script')
+#env: "RAILS_ENV='test' "
+
+# Specify the recipients for email notification
+#notifications:
+# recipients:
+# - email-address-1
+# - email-address-2
+# disabled: true # Disable email notifications
+
+# Specify branches to build
+# You can either specify only or except. If you specify both, except will be ignored.
+#branches:
+# only:
+# - master
+# except:
+# - legacy
+
data/LICENSE
ADDED
@@ -0,0 +1,12 @@
+THE DRINK-WARE LICENSE (forked from BEER-WARE r42)
+
+Gabriele Renzi (http://www.riffraff.info)
+wrote this code, whereas not specified otherwise.
+
+As long as you retain this notice you can do whatever you want with this stuff.
+If we meet some day, and you think this stuff is worth it, you can buy me
+a beer, coffee, palinka shot or any other drink in return.
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the Artistic License version 2.0.
+
data/README.mkd
ADDED
@@ -0,0 +1,45 @@
+# DISCLAIMER
+
+Do Not Use This.
+At the moment it is mostly a code dump, and a bad one at
+that, while I slowly add the stuff I need. One day I will refactor the
+code and add useful options and make it reliable and it may become a
+useful project for other people. As it stands, it is not, just move on.
+
+
+## Summary
+parsely is a tool to extract and manipulate text files.
+Basically, it allows you to run ruby one liners (think `-n/-p`) with some additional
+shortcuts.
+
+parsely is intended as a replacement for all those single-use-and-discard scripts
+in sed/awk/perl/ruby that I constantly end up rewriting, such as counting frequencies,
+summing fields, selecting (c,t)sv rows by field values etc
+
+It does nothing you can't do with a few pipes, sed, awk, grep, ack, perl,
+ruby, sort, uniq, bc, ministats and comm.
+
+It is useful for me because
+* I am very bad at remembering options for command line tools, and get
+confused when BSD and GNU tools don't match
+* I always get confused escaping stuff in the shell
+* I have written or googled a freq.awk a dozen times
+
+This is most likely useless to you.
+
+## INSTALLATION
+
+Running
+
+gem install parsely
+
+should be enough to install.
+I use ruby (YARV) 1.9.2 and have not tested this anywhere else.
+
+## SUPPORT
+
+Open a ticket at http://github.com/riffraff/parsely/issues if you want
+something in parsely, but I don't think you should use this tool, at
+least for the next couple of years.
+Or you can write me an email at rff.rff+parsely@gmail.com if you want.
+
data/Rakefile
CHANGED
data/TODO
CHANGED
@@ -3,7 +3,7 @@
 - _2 in dump by _1 in other
 + _1 if /www/
 + _1 if _2 =~ x
-
++ .parselyrc
 - split by: if a X or not, or if _1 in other_file (subsides comm)
 + in "foo 1\nbar 2" "sum(_2) if _2 < 3" to see how much I can delete
 - _1._1 for submatch (e.g. user agent)
@@ -18,3 +18,7 @@ skip jsonp bit
 summaryse is a cool extension from which to steal stuff. Maybe provide basic functionalities through that
 - given [/m/09_pbpl /type/object/key /soft/isbn/9780491035453,..] and [/m/077601h /book/isbn/book_editions /m/09_pbpl] get [/m/09_pbpl /soft/isbn/9780491035453]
 - use merge sorted and merge unsorted probably
+- find a simpler way to do aggregates
+- top k
+selext top/max N or N% ditto for minus
+select outliers
data/lib/parsely.rb
CHANGED
@@ -1,13 +1,22 @@
 require 'set'
 require 'English'
 $OUTPUT_FIELD_SEPARATOR = ' '
-
+module Kernel
 def p args
 STDERR.puts(args.inspect) #if $DEBUG
 end
+end
 
 class PseudoBinding
 class PerlVar < String
+# this is not defined in terms of <=>
+def == other
+if other.is_a? Numeric
+to_f == other
+else
+super
+end
+end
 def <=> other
 if other.is_a? Numeric
 to_f <=> other
@@ -18,11 +27,24 @@ class PseudoBinding
 def inspect
 "PerlVar(#{super})"
 end
+#unneed as of now
+#def coerce something
+# [something, to_f]
+#end
+def + other
+case other
+when Numeric
+PerlVar.new((to_i + other).to_s)
+when String
+PerlVar.new((to_s + other).to_s)
+end
+end
 end
 PerlNil = PerlVar.new ''
 attr :line
+attr :vals
 def initialize lineno, vals
-@line, @vals = lineno, vals.map {|x| PerlVar.new(x)}
+@line, @vals = PerlVar.new(lineno.to_s), vals.map {|x| PerlVar.new(x)}
 end
 def method_missing name, *args
 if args.empty?
@@ -36,6 +58,11 @@ class PseudoBinding
 end
 end
 end
+class Array
+def value
+self
+end
+end
 class String
 def value
 to_s
@@ -48,33 +75,18 @@ class Proc
 end
 end
 class Parsely
-VERSION = "0.1.
+VERSION = "0.1.4"
 def self.cmd(&block)
-
-klass.class_eval do
-def process(items)
-value.assign(items)
-_process(value)
-end
-end
-klass
+Struct.new :value, &block
 end
 RGX= /"(.*?)"|\[(.*?)\]|([^\s]+)/
-
-
-
-
-def process(items)
-items[index]
-end
-def to_i
-value.to_i
-end
-def to_f
-value.to_f
+
+Expression = Struct.new :code, :items do
+def process(pb)
+result = pb.instance_eval(code)
 end
 def to_s
-
+code.to_s
 end
 end
 Ops = {
@@ -85,27 +97,27 @@ class Parsely
 @result = proc { @running_value }
 @result.single = true
 end
-def
+def process(value)
 if value.to_f < @running_value
 @running_value = value.to_f
 end
 @result
 end
 end,
-
+:max => cmd do
 def initialize index
 super
 @running_value = Float::MIN #-Inf would be better
 @result = proc { @running_value }
 @result.single = true
 end
-def
+def process(value)
 if value.to_f > @running_value
 @running_value = value.to_f
 end
 @result
 end
-
+end,
 :sum => cmd do
 def initialize index
 super
@@ -113,7 +125,7 @@ class Parsely
 @result = proc { @running_value }
 @result.single = true
 end
-def
+def process(value)
 @running_value += value.to_i
 @result
 end
@@ -126,7 +138,7 @@ class Parsely
 @result = proc { @running_value/@running_count.to_f }
 @result.single = true
 end
-def
+def process(value)
 @running_value += value.to_i
 @running_count += 1
 @result
@@ -137,7 +149,7 @@ class Parsely
 super
 @running_freqs = Hash.new(0)
 @running_count = 0
-
+as_ary=nil
 @result = proc do
 if as_ary.nil?
 as_ary=@running_freqs.sort_by do |k,v| [-v,k] end.each
@@ -146,7 +158,7 @@ class Parsely
 [v, k]
 end
 end
-def
+def process(value)
 @running_freqs[value.to_s]+=1
 @running_count += 1
 @result
@@ -180,39 +192,32 @@ class Parsely
 cached.next
 end
 end
-def
+def process(value)
 @running_values << value.to_i
 @result
 end
 end,
 }
+PseudoBinding.class_eval do
+Ops.each do |k,v|
+#instantiating the object is expensive and we are not using 99% of them
+obj = nil
+define_method k do |values|
+obj ||= v.new(nil)
+obj.process(values)
+end
+end
+end
 
 def parse(expr)
 val, cond = expr.split(/ if /)
 # p [ val, cond]
-
-
-
-
-opname = $1.to_sym
-klass=Ops[opname]
-if klass.nil?
-=begin
-if respond_to? opname
-klass = cmd do
-def _process(value)
-send opname, value
-end
-end
-else
-=end
-abort "unknown op '#$1'"
-end
-klass.new(Value.new($2.to_i))
-when /\_(\d+)/
-Value.new($1.to_i)
-end
+
+val = '['+val+']'
+if val =~ /([\(\)\w])( ([\(\)\w]))+/
+val = val.split(" ").join(",")
 end
+r = Expression.new(val)
 [r, parse_cond(cond)]
 end
 
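Judging from the added lines in this hunk, `parse` no longer looks up an op class itself: it wraps the value expression in brackets, turns a space-separated field list into a comma-separated one, and leaves evaluation to `instance_eval`. Below is a minimal sketch of that string rewriting in isolation. `to_array_literal` and `FIELD_LIST` are hypothetical names introduced here for illustration; the gem does this inline in `parse`.

```ruby
# Hypothetical helper isolating the string rewriting added to Parsely#parse.
FIELD_LIST = /([\(\)\w])( ([\(\)\w]))+/ # same pattern as in the diff

def to_array_literal(val)
  val = '[' + val + ']'
  # "_1 _2" style lists become "_1,_2". Note that split(" ") breaks on every
  # space, including spaces inside quoted strings.
  val = val.split(" ").join(",") if val =~ FIELD_LIST
  val
end

puts to_array_literal('_1 _2')            # [_1,_2]
puts to_array_literal('sum(_1) _2')       # [sum(_1),_2]
puts to_array_literal('line + ": " + _0') # [line + ": " + _0]
```

The last call is left untouched because no word character or parenthesis is directly followed by a space and another one, so the pattern does not match.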
@@ -223,7 +228,7 @@ class Parsely
 when nil, ''
 proc { |bnd| true }
 else
-
+proc { |bnd| bnd.instance_eval(str) }
 end
 end
 
@@ -250,7 +255,7 @@ class Parsely
 end
 
 def main_loop(expr,lines)
-
+expression, cond =parse(expr)
 result = []
 result = lines.map.with_index do |line, lineno|
 line.chomp!
@@ -263,15 +268,14 @@ class Parsely
 #XXX ugly
 next unless items
 b = PseudoBinding.new(lineno, items)
-
-a.process(items) if cond[b]
-end
+expression.process(b) if cond[b]
 end
 last = []
 result.each do |cols|
+next if cols.nil? #when test fails
 result_line = cols.map do |col|
 next if col.nil?
-col.value
+col.value
 end.join.strip
 same_results = cols.zip(last).map do |a,b|
 a.respond_to?(:single) && a.single && a.object_id == b.object_id && !a.is_a?(Numeric)
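The `PseudoBinding.class_eval` block added in this file defines one method per entry in `Ops` and creates the op object only on first use. Because `obj` is a local captured by the `define_method` block, it appears to be shared by every `PseudoBinding` instance, which would let an aggregate keep its running state even though `main_loop` builds a new binding for each line. Below is a minimal sketch of that pattern. `Ctx`, `OPS` and the simplified `:sum` op are stand-ins introduced here, not the gem's own classes.

```ruby
# Stand-in for one entry of Parsely::Ops: a Struct with a running total.
OPS = {
  :sum => Struct.new(:value) do
    def initialize(index)
      super
      @running_value = 0
    end

    def process(value)
      @running_value += value.to_i
    end
  end
}

# Stand-in for PseudoBinding: one method per op, built lazily.
class Ctx
  OPS.each do |name, klass|
    obj = nil # captured by the block below, so shared across Ctx instances
    define_method(name) do |value|
      obj ||= klass.new(nil) # instantiate on first call only
      obj.process(value)
    end
  end
end

Ctx.new.sum("2")
puts Ctx.new.sum("3") # 5
```

The second call runs on a fresh `Ctx`, yet it still sees the total from the first one, since both go through the same captured `obj`.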
data/parsely.gemspec
CHANGED
@@ -10,6 +10,7 @@ Gem::Specification.new do |s|
 s.homepage = "http://github.com/riffraff/parsely"
 s.summary = %q{a simple tool for text file wrangling}
 s.description = %q{parsely is a simple tool for managing text files.
+DO NOT USE IT.
 Mostly to replace a lot of awk/sed/ruby/perl one-off scripts.
 This is an internal release, guaranteed to break and ruin your life.}
 
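data/test/basic/017-number-lines/command
ADDED
@@ -0,0 +1 @@
+parsely 'line + ": " + _0' lines.txt
data/test/basic/018-increment-lineno/command
ADDED
@@ -0,0 +1 @@
+parsely 'line + 1 + ": " + _0' lines.txt
data/test/basic/019-increment-lineno-and-values/command
ADDED
@@ -0,0 +1 @@
+parsely 'line + ": " + (_0 + 1)' lines.txt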
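The three new test commands mix line numbers, integers and strings in one expression, which relies on the `PerlVar#+` and `PerlVar#==` methods added in `lib/parsely.rb`. Below is a standalone sketch of that behaviour, following the added lines but defined at the top level rather than inside `PseudoBinding`. The sample values are made up and do not reproduce the test fixtures.

```ruby
class PerlVar < String
  # Compare numerically against numbers, as a plain string otherwise.
  def ==(other)
    if other.is_a? Numeric
      to_f == other
    else
      super
    end
  end

  # Numeric operands add to the integer value; strings are concatenated.
  def +(other)
    case other
    when Numeric
      PerlVar.new((to_i + other).to_s)
    when String
      PerlVar.new(to_s + other)
    end
  end
end

line = PerlVar.new(0.to_s)   # line numbers are wrapped the same way in the diff
puts line + 1 + ": " + "foo" # 1: foo
puts PerlVar.new("41") + 1   # 42
puts PerlVar.new("3") == 3   # true
```

Each `+` returns a new `PerlVar`, so a chain such as `line + 1 + ": "` can switch from integer addition to string concatenation partway through.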
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: parsely
 version: !ruby/object:Gem::Version
-version: 0.1.
+version: 0.1.4
 prerelease:
 platform: ruby
 authors:
@@ -9,11 +9,11 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2011-
+date: 2011-12-10 00:00:00.000000000Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: rake
-requirement: &
+requirement: &2152641300 !ruby/object:Gem::Requirement
 none: false
 requirements:
 - - ! '>='
@@ -21,10 +21,11 @@ dependencies:
 version: '0'
 type: :development
 prerelease: false
-version_requirements: *
-description: ! "parsely is a simple tool for managing text files.\n
-to replace a lot of awk/sed/ruby/perl one-off
-is an internal release, guaranteed to break
+version_requirements: *2152641300
+description: ! "parsely is a simple tool for managing text files.\n DO
+NOT USE IT.\n Mostly to replace a lot of awk/sed/ruby/perl one-off
+scripts.\n This is an internal release, guaranteed to break
+and ruin your life."
 email:
 - rff.rff+parsely@gmail.com
 executables:
@@ -33,7 +34,10 @@ extensions: []
 extra_rdoc_files: []
 files:
 - .gitignore
+- .travis.yml
 - Gemfile
+- LICENSE
+- README.mkd
 - Rakefile
 - TODO
 - bin/parsely
@@ -89,6 +93,15 @@ files:
 - test/basic/016-rcfile-home/counts.txt
 - test/basic/016-rcfile-home/myhome/.parselyrc
 - test/basic/016-rcfile-home/output
+- test/basic/017-number-lines/command
+- test/basic/017-number-lines/lines.txt
+- test/basic/017-number-lines/output
+- test/basic/018-increment-lineno/command
+- test/basic/018-increment-lineno/lines.txt
+- test/basic/018-increment-lineno/output
+- test/basic/019-increment-lineno-and-values/command
+- test/basic/019-increment-lineno-and-values/lines.txt
+- test/basic/019-increment-lineno-and-values/output
 - test/cli-runner.rb
 homepage: http://github.com/riffraff/parsely
 licenses: []
@@ -165,4 +178,14 @@ test_files:
 - test/basic/016-rcfile-home/counts.txt
 - test/basic/016-rcfile-home/myhome/.parselyrc
 - test/basic/016-rcfile-home/output
+- test/basic/017-number-lines/command
+- test/basic/017-number-lines/lines.txt
+- test/basic/017-number-lines/output
+- test/basic/018-increment-lineno/command
+- test/basic/018-increment-lineno/lines.txt
+- test/basic/018-increment-lineno/output
+- test/basic/019-increment-lineno-and-values/command
+- test/basic/019-increment-lineno-and-values/lines.txt
+- test/basic/019-increment-lineno-and-values/output
 - test/cli-runner.rb
+has_rdoc: