buftok 0.1 → 0.2.0
- data/CONTRIBUTING.md +49 -0
- data/Gemfile +6 -0
- data/LICENSE.md +56 -0
- data/README.md +48 -0
- data/Rakefile +59 -24
- data/buftok.gemspec +17 -0
- data/lib/buftok.rb +31 -48
- data/test/test_buftok.rb +27 -0
- metadata +67 -41
data/CONTRIBUTING.md
ADDED
@@ -0,0 +1,49 @@
+## Contributing
+In the spirit of [free software][free-sw], **everyone** is encouraged to help
+improve this project. Here are some ways *you* can contribute:
+
+[free-sw]: http://www.fsf.org/licensing/essays/free-sw.html
+
+* Use alpha, beta, and pre-release versions.
+* Report bugs.
+* Suggest new features.
+* Write or edit documentation.
+* Write specifications.
+* Write code (**no patch is too small**: fix typos, add comments, clean up
+  inconsistent whitespace).
+* Refactor code.
+* Fix [issues][].
+* Review patches.
+
+[issues]: https://github.com/sferik/buftok/issues
+
+## Submitting an Issue
+We use the [GitHub issue tracker][issues] to track bugs and features. Before
+submitting a bug report or feature request, check to make sure it hasn't
+already been submitted. When submitting a bug report, please include a [Gist][]
+that includes a stack trace and any details that may be necessary to reproduce
+the bug, including your gem version, Ruby version, and operating system.
+Ideally, a bug report should include a pull request with failing specs.
+
+[gist]: https://gist.github.com/
+
+## Submitting a Pull Request
+1. [Fork the repository.][fork]
+2. [Create a topic branch.][branch]
+3. Add specs for your unimplemented feature or bug fix.
+4. Run `bundle exec rake spec`. If your specs pass, return to step 3.
+5. Implement your feature or bug fix.
+6. Run `bundle exec rake spec`. If your specs fail, return to step 5.
+7. Run `open coverage/index.html`. If your changes are not completely covered
+   by your tests, return to step 3.
+8. Run `RUBYOPT=W2 bundle exec rake spec 2>&1 | grep buftok`. If your changes
+   produce any warnings, return to step 5.
+9. Add documentation for your feature or bug fix.
+10. Run `bundle exec rake yard`. If your changes are not 100% documented, go
+    back to step 9.
+11. Commit and push your changes.
+12. [Submit a pull request.][pr]
+
+[fork]: http://help.github.com/fork-a-repo/
+[branch]: http://learn.github.com/p/branching.html
+[pr]: http://help.github.com/send-pull-requests/
data/Gemfile
ADDED
data/LICENSE.md
ADDED
@@ -0,0 +1,56 @@
+Ruby is copyrighted free software by Yukihiro Matsumoto <matz@netlab.jp>.
+You can redistribute it and/or modify it under either the terms of the
+2-clause BSDL (see the file BSDL), or the conditions below:
+
+  1. You may make and give away verbatim copies of the source form of the
+     software without restriction, provided that you duplicate all of the
+     original copyright notices and associated disclaimers.
+
+  2. You may modify your copy of the software in any way, provided that
+     you do at least ONE of the following:
+
+       a) place your modifications in the Public Domain or otherwise
+          make them Freely Available, such as by posting said
+          modifications to Usenet or an equivalent medium, or by allowing
+          the author to include your modifications in the software.
+
+       b) use the modified software only within your corporation or
+          organization.
+
+       c) give non-standard binaries non-standard names, with
+          instructions on where to get the original software distribution.
+
+       d) make other distribution arrangements with the author.
+
+  3. You may distribute the software in object code or binary form,
+     provided that you do at least ONE of the following:
+
+       a) distribute the binaries and library files of the software,
+          together with instructions (in the manual page or equivalent)
+          on where to get the original distribution.
+
+       b) accompany the distribution with the machine-readable source of
+          the software.
+
+       c) give non-standard binaries non-standard names, with
+          instructions on where to get the original software distribution.
+
+       d) make other distribution arrangements with the author.
+
+  4. You may modify and include the part of the software into any other
+     software (possibly commercial). But some files in the distribution
+     are not written by the author, so that they are not under these terms.
+
+     For the list of those files and their copying conditions, see the
+     file LEGAL.
+
+  5. The scripts and library files supplied as input to or produced as
+     output from the software do not automatically fall under the
+     copyright of the software, but belong to whomever generated them,
+     and may be sold commercially, and may be aggregated with this
+     software.
+
+  6. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
+     IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
+     WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+     PURPOSE.
data/README.md
ADDED
@@ -0,0 +1,48 @@
+# BufferedTokenizer
+
+[![Gem Version](https://badge.fury.io/rb/buftok.png)][gem]
+[![Build Status](https://travis-ci.org/sferik/buftok.png?branch=master)][travis]
+[![Dependency Status](https://gemnasium.com/sferik/buftok.png?travis)][gemnasium]
+[![Code Climate](https://codeclimate.com/github/sferik/buftok.png)][codeclimate]
+
+[gem]: https://rubygems.org/gems/buftok
+[travis]: https://travis-ci.org/sferik/buftok
+[gemnasium]: https://gemnasium.com/sferik/buftok
+[codeclimate]: https://codeclimate.com/github/sferik/buftok
+
+###### Statefully split input data by a specifiable token
+
+BufferedTokenizer takes a delimiter upon instantiation, or acts line-based by
+default. It allows input to be spoon-fed from some outside source which
+receives arbitrary length datagrams which may-or-may-not contain the token by
+which entities are delimited. In this respect it's ideally paired with
+something like [EventMachine][].
+
+[EventMachine]: http://rubyeventmachine.com/
+
+## Supported Ruby Versions
+This library aims to support and is [tested against][travis] the following Ruby
+implementations:
+
+* Ruby 1.8.7
+* Ruby 1.9.2
+* Ruby 1.9.3
+* Ruby 2.0.0
+
+If something doesn't work on one of these interpreters, it's a bug.
+
+This library may inadvertently work (or seem to work) on other Ruby
+implementations, however support will only be provided for the versions listed
+above.
+
+If you would like this library to support another Ruby version, you may
+volunteer to be a maintainer. Being a maintainer entails making sure all tests
+run and pass on that implementation. When something breaks on your
+implementation, you will be responsible for providing patches in a timely
+fashion. If critical issues for a particular implementation exist at the time
+of a major release, support for that Ruby version may be dropped.
+
+## Copyright
+Copyright (c) 2006-2013 Tony Arcieri, Martin Emde, Erik Michaels-Ober.
+Distributed under the [Ruby license][license].
+[license]: http://www.ruby-lang.org/en/LICENSE.txt
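The README's "spoon-fed" description can be made concrete with a small sketch. The class below is a hypothetical stand-in (`ChunkSplitter` and `feed` are names invented here, not part of buftok), and it uses a plain string buffer rather than the gem's array-based buffering. It only illustrates the idea of stateful splitting: chunks arrive at arbitrary boundaries, and an entity is returned only once its delimiter has been seen.

```ruby
# Hypothetical stand-in for illustration; not the gem's implementation.
class ChunkSplitter
  def initialize(delimiter = "\n")
    @delimiter = delimiter
    @buffer = ''.dup # mutable buffer holding data not yet terminated
  end

  # Append a chunk and return every entity completed by it.
  def feed(chunk)
    @buffer << chunk
    entities = []
    while (index = @buffer.index(@delimiter))
      entities << @buffer.slice!(0, index)  # text before the delimiter
      @buffer.slice!(0, @delimiter.length)  # drop the delimiter itself
    end
    entities
  end
end

splitter = ChunkSplitter.new
received = ["GET / HT", "TP/1.1\nHost: exa", "mple.org\n"]
lines = received.flat_map { |chunk| splitter.feed(chunk) }
p lines # => ["GET / HTTP/1.1", "Host: example.org"]
```

With the gem itself, the equivalent call would presumably be `BufferedTokenizer#extract`, as shown in the library diff further down.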
data/Rakefile
CHANGED
@@ -1,31 +1,66 @@
-require '
-require '
-require 'rake/
-require 'spec/rake/spectask'
+require 'bundler'
+require 'rdoc/task'
+require 'rake/testtask'
 
-
-
-
+task :default => :test
+
+Bundler::GemHelper.install_tasks
 
-
-
-
-
+RDoc::Task.new do |task|
+  task.rdoc_dir = 'doc'
+  task.title    = 'BufferedTokenizer'
+  task.rdoc_files.include('lib/**/*.rb')
 end
 
-
-
-
-  s.date = %q{2006-12-18}
-  s.summary = %q{BufferedTokenizer extracts token delimited entities from a sequence of arbitrary inputs}
-  s.email = %q{tony@clickcaster.com}
-  s.homepage = %q{http://buftok.rubyforge.org}
-  s.rubyforge_project = %q{buftok}
-  s.has_rdoc = true
-  s.authors = ["Tony Arcieri","Martin Emde"]
-  s.files = ["Rakefile", "lib", "lib/buftok.rb"]
+Rake::TestTask.new :test do |t|
+  t.libs << 'lib'
+  t.test_files = FileList['test/**/*.rb']
 end
 
-
-
+desc "Benchmark the current implementation"
+task :bench do
+  require 'benchmark'
+  require File.expand_path('lib/buftok', File.dirname(__FILE__))
+
+  n = 50000
+  delimiter = "\n\n"
+
+  frequency1 = 1000
+  puts "generating #{n} strings, with #{delimiter.inspect} every #{frequency1} strings..."
+  data1 = (0...n).map do |i|
+    (((i % frequency1 == 1) ? "\n" : "") +
+      ("s" * i) +
+      ((i % frequency1 == 0) ? "\n" : "")).freeze
+  end
+
+  frequency2 = 10
+  puts "generating #{n} strings, with #{delimiter.inspect} every #{frequency2} strings..."
+  data2 = (0...n).map do |i|
+    (((i % frequency2 == 1) ? "\n" : "") +
+      ("s" * i) +
+      ((i % frequency2 == 0) ? "\n" : "")).freeze
+  end
+
+  Benchmark.bmbm do |x|
+    x.report("1 char, freq: #{frequency1}") do
+      bt1 = BufferedTokenizer.new
+      n.times { |i| bt1.extract(data1[i]) }
+    end
+
+    x.report("2 char, freq: #{frequency1}") do
+      bt2 = BufferedTokenizer.new(delimiter)
+      n.times { |i| bt2.extract(data1[i]) }
+    end
+
+    x.report("1 char, freq: #{frequency2}") do
+      bt3 = BufferedTokenizer.new
+      n.times { |i| bt3.extract(data2[i]) }
+    end
+
+    x.report("2 char, freq: #{frequency2}") do
+      bt4 = BufferedTokenizer.new(delimiter)
+      n.times { |i| bt4.extract(data2[i]) }
+    end
+
+  end
 end
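The data generator in the `:bench` task is easier to follow at a small scale. The sketch below reuses the task's expression with hypothetical small values (`bench_chunks`, `n = 7` and `frequency = 3` are chosen here for illustration). It shows that no single chunk contains the two-character delimiter `"\n\n"`: the delimiter only appears once adjacent chunks are concatenated, so the "2 char" benchmark runs exercise the case where a delimiter straddles a chunk boundary.

```ruby
# Same string-building expression as the :bench task, wrapped in a
# hypothetical helper so it can be run with small inputs.
def bench_chunks(n, frequency)
  (0...n).map do |i|
    (((i % frequency == 1) ? "\n" : "") +
      ("s" * i) +
      ((i % frequency == 0) ? "\n" : "")).freeze
  end
end

chunks = bench_chunks(7, 3)
p chunks
# => ["\n", "\ns", "ss", "sss\n", "\nssss", "sssss", "ssssss\n"]

# No individual chunk holds a complete "\n\n" ...
p chunks.count { |chunk| chunk.include?("\n\n") } # => 0
# ... but the concatenated stream does, at each chunk boundary i -> i + 1
# where i is a multiple of the frequency.
p chunks.join.scan("\n\n").size                   # => 2
```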
data/buftok.gemspec
ADDED
@@ -0,0 +1,17 @@
+Gem::Specification.new do |spec|
+  spec.add_development_dependency 'bundler', '~> 1.0'
+  spec.authors = ["Tony Arcieri", "Martin Emde", "Erik Michaels-Ober"]
+  spec.description = %q{BufferedTokenizer extracts token delimited entities from a sequence of arbitrary inputs}
+  spec.email = "sferik@gmail.com"
+  spec.files = %w(CONTRIBUTING.md Gemfile LICENSE.md README.md Rakefile buftok.gemspec)
+  spec.files += Dir.glob("lib/**/*.rb")
+  spec.files += Dir.glob("test/**/*.rb")
+  spec.test_files = spec.files.grep(%r{^test/})
+  spec.homepage = "https://github.com/sferik/buftok"
+  spec.licenses = ['MIT']
+  spec.name = "buftok"
+  spec.require_paths = ["lib"]
+  spec.required_rubygems_version = '>= 1.3.5'
+  spec.summary = spec.description
+  spec.version = "0.2.0"
+end
data/lib/buftok.rb
CHANGED
@@ -1,26 +1,22 @@
-# BufferedTokenizer - Statefully split input data by a specifiable token
-# (C)2006 Tony Arcieri, Martin Emde
-# Distributed under the Ruby license (http://www.ruby-lang.org/en/LICENSE.txt)
-
 # BufferedTokenizer takes a delimiter upon instantiation, or acts line-based
 # by default. It allows input to be spoon-fed from some outside source which
 # receives arbitrary length datagrams which may-or-may-not contain the token
 # by which entities are delimited. In this respect it's ideally paired with
-# something like EventMachine (http://
+# something like EventMachine (http://rubyeventmachine.com/).
 class BufferedTokenizer
-  # New BufferedTokenizers will operate on lines delimited by
-  #
-  #
-
-
+  # New BufferedTokenizers will operate on lines delimited by a delimiter,
+  # which is by default the global input delimiter $/ ("\n").
+  #
+  # The input buffer is stored as an array. This is by far the most efficient
+  # approach given language constraints (in C a linked list would be a more
+  # appropriate data structure). Segments of input data are stored in a list
+  # which is only joined when a token is reached, substantially reducing the
+  # number of objects required for the operation.
+  def initialize(delimiter = $/)
     @delimiter = delimiter
-
-    # The input buffer is stored as an array. This is by far the most efficient
-    # approach given language constraints (in C a linked list would be a more
-    # appropriate data structure). Segments of input data are stored in a list
-    # which is only joined when a token is reached, substantially reducing the
-    # number of objects required for the operation.
     @input = []
+    @tail = ''
+    @trim = @delimiter.length - 1
   end
 
   # Extract takes an arbitrary string of input data and returns an array of
@@ -28,49 +24,36 @@ class BufferedTokenizer
   # makes for easy processing of datagrams using a pattern like:
   #
   #   tokenizer.extract(data).map { |entity| Decode(entity) }.each do ...
+  #
+  # Using -1 makes split to return "" if the token is at the end of
+  # the string, meaning the last element is the start of the next chunk.
   def extract(data)
-
-
-
-
-    # return "" in this case, meaning that the last entry in the list represents a
-    # new segment of data where the token has not been encountered
-    entities = data.split @delimiter, -1
+    if @trim > 0
+      tail_end = @tail.slice!(-@trim, @trim) # returns nil if string is too short
+      data = tail_end + data if tail_end
+    end
 
-
-
-    @
+    @input << @tail
+    entities = data.split(@delimiter, -1)
+    @tail = entities.shift
 
-
-
-
-
+    unless entities.empty?
+      @input << @tail
+      entities.unshift @input.join
+      @input.clear
+      @tail = entities.pop
+    end
 
-    # At this point, we've hit a token, or potentially multiple tokens. Now we can bring
-    # together all the data we've buffered from earlier calls without hitting a token,
-    # and add it to our list of discovered entities.
-    entities.unshift @input.join
-
-    # Now that we've hit a token, joined the input buffer and added it to the entities
-    # list, we can go ahead and clear the input buffer. All of the segments that were
-    # stored before the join can now be garbage collected.
-    @input.clear
-
-    # The last entity in the list is not token delimited, however, thanks to the -1
-    # passed to split. It represents the beginning of a new list of as-yet-untokenized
-    # data, so we add it to the start of the list.
-    @input << entities.pop
-
-    # Now we're left with the list of extracted token-delimited entities we wanted
-    # in the first place. Hooray!
     entities
   end
-
+
   # Flush the contents of the input buffer, i.e. return the input buffer even though
   # a token has not yet been encountered
   def flush
+    @input << @tail
     buffer = @input.join
     @input.clear
+    @tail = "" # @tail.clear is slightly faster, but not supported on 1.8.7
     buffer
   end
 end
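The added lines in this hunk can be read as one self-contained class. The sketch below transcribes the 0.2.0 side of the diff under a hypothetical name (`SketchTokenizer`) so that it runs without the gem. It makes two deliberate changes: the default delimiter is the literal `"\n"` rather than `$/`, and an empty-input guard is added that the diff does not contain. The key idea is that the last `delimiter.length - 1` characters of the buffered tail are moved back onto the front of the next chunk, so a delimiter split across two chunks is still seen by `split`.

```ruby
# Transcription of the 0.2.0 side of the diff, for illustration only.
# SketchTokenizer is a hypothetical name; the gem's class is BufferedTokenizer.
class SketchTokenizer
  def initialize(delimiter = "\n") # the gem defaults to $/ instead
    @delimiter = delimiter
    @input = []                    # segments buffered since the last delimiter
    @tail = ''.dup                 # newest unterminated segment (mutable)
    @trim = @delimiter.length - 1  # chars that may be half of a delimiter
  end

  def extract(data)
    if @trim > 0
      # Move a possible partial delimiter from the tail onto the new data.
      tail_end = @tail.slice!(-@trim, @trim) # nil if the tail is too short
      data = tail_end + data if tail_end
    end

    @input << @tail
    # The -1 limit keeps a trailing "" when data ends with the delimiter:
    # "a\nb\n".split("\n", -1) gives ["a", "b", ""], not ["a", "b"].
    entities = data.split(@delimiter, -1)
    # Guard added for this sketch only: "".split returns [], so shift is nil.
    @tail = entities.shift || ''.dup

    unless entities.empty?
      @input << @tail
      entities.unshift(@input.join) # first entity includes earlier segments
      @input.clear
      @tail = entities.pop          # start of the next, unfinished entity
    end

    entities
  end

  def flush
    @input << @tail
    buffer = @input.join
    @input.clear
    @tail = ''.dup
    buffer
  end
end

tokenizer = SketchTokenizer.new('<>')
p tokenizer.extract('foo<')       # => []
p tokenizer.extract('>bar<')      # => ["foo"]
p tokenizer.extract('baz<>qux<>') # => ["bar<baz", "qux"]
p tokenizer.flush                 # => ""
```

The usage lines mirror `test_split_delimiter` from the test file added in this release.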
data/test/test_buftok.rb
ADDED
@@ -0,0 +1,27 @@
+require 'test/unit'
+require 'buftok'
+
+class TestBuftok < Test::Unit::TestCase
+  def test_buftok
+    tokenizer = BufferedTokenizer.new
+    assert_equal %w[foo], tokenizer.extract("foo\nbar".freeze)
+    assert_equal %w[barbaz qux], tokenizer.extract("baz\nqux\nquu".freeze)
+    assert_equal 'quu', tokenizer.flush
+    assert_equal '', tokenizer.flush
+  end
+
+  def test_delimiter
+    tokenizer = BufferedTokenizer.new('<>')
+    assert_equal ['', "foo\n"], tokenizer.extract("<>foo\n<>".freeze)
+    assert_equal %w[bar], tokenizer.extract('bar<>baz'.freeze)
+    assert_equal 'baz', tokenizer.flush
+  end
+
+  def test_split_delimiter
+    tokenizer = BufferedTokenizer.new('<>'.freeze)
+    assert_equal [], tokenizer.extract('foo<'.freeze)
+    assert_equal %w[foo], tokenizer.extract('>bar<'.freeze)
+    assert_equal %w[bar<baz qux], tokenizer.extract('baz<>qux<>'.freeze)
+    assert_equal '', tokenizer.flush
+  end
+end
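`test_split_delimiter` covers a delimiter that arrives in two pieces. To show why that case needs special handling, the sketch below uses a naive splitter that only calls `split` on each chunk, loosely in the spirit of the lines removed from `lib/buftok.rb`. The removed 0.1 code is partly elided in this diff, so `NaiveTokenizer` is a hypothetical reconstruction for illustration and not the actual 0.1 source.

```ruby
# Hypothetical naive splitter: each chunk is split on its own, so a
# delimiter that straddles two chunks is never matched.
class NaiveTokenizer
  def initialize(delimiter = "\n")
    @delimiter = delimiter
    @input = []
  end

  def extract(data)
    entities = data.split(@delimiter, -1)
    @input << entities.shift       # continuation of the pending entity
    return [] if entities.empty?   # no delimiter found in this chunk

    entities.unshift(@input.join)
    @input.clear
    @input << entities.pop         # start of the next entity
    entities
  end
end

naive = NaiveTokenizer.new('<>')
p naive.extract('foo<')   # => []
p naive.extract('>bar<>') # => ["foo<>bar"]
```

The second call returns one entity that still contains the delimiter, where a boundary-aware tokenizer would be expected to return `["foo", "bar"]`.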
metadata
CHANGED
@@ -1,49 +1,75 @@
---- !ruby/object:Gem::Specification
-rubygems_version: 0.9.0
-specification_version: 1
+--- !ruby/object:Gem::Specification
 name: buftok
-version: !ruby/object:Gem::Version
-  version:
-
-summary: BufferedTokenizer extracts token delimited entities from a sequence of arbitrary inputs
-require_paths:
-- lib
-email: tony@clickcaster.com
-homepage: http://buftok.rubyforge.org
-rubyforge_project: buftok
-description:
-autorequire:
-default_executable:
-bindir: bin
-has_rdoc: true
-required_ruby_version: !ruby/object:Gem::Version::Requirement
-  requirements:
-  - - ">"
-    - !ruby/object:Gem::Version
-      version: 0.0.0
-  version:
+version: !ruby/object:Gem::Version
+  version: 0.2.0
+  prerelease:
 platform: ruby
-
-cert_chain:
-post_install_message:
-authors:
+authors:
 - Tony Arcieri
 - Martin Emde
-
+- Erik Michaels-Ober
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2013-11-22 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '1.0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '1.0'
+description: BufferedTokenizer extracts token delimited entities from a sequence of
+  arbitrary inputs
+email: sferik@gmail.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- CONTRIBUTING.md
+- Gemfile
+- LICENSE.md
+- README.md
 - Rakefile
--
+- buftok.gemspec
 - lib/buftok.rb
-
-
+- test/test_buftok.rb
+homepage: https://github.com/sferik/buftok
+licenses:
+- MIT
+post_install_message:
 rdoc_options: []
-
-
-
-
-
-
-
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: 1.3.5
 requirements: []
-
-
-
+rubyforge_project:
+rubygems_version: 1.8.23
+signing_key:
+specification_version: 3
+summary: BufferedTokenizer extracts token delimited entities from a sequence of arbitrary
+  inputs
+test_files:
+- test/test_buftok.rb
+has_rdoc: