file_char_licker 0.5.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/.gitignore +14 -0
- data/.rspec +2 -0
- data/.travis.yml +3 -0
- data/Gemfile +4 -0
- data/LICENSE.txt +22 -0
- data/README.ja.md +119 -0
- data/README.md +120 -0
- data/Rakefile +7 -0
- data/file_char_licker.gemspec +30 -0
- data/lib/file_char_licker/attach_licker.rb +64 -0
- data/lib/file_char_licker/licker/licker.rb +246 -0
- data/lib/file_char_licker/licker/mb_licker.rb +137 -0
- data/lib/file_char_licker/version.rb +3 -0
- data/lib/file_char_licker.rb +5 -0
- data/spec/file_line_seeker_spec.rb +11 -0
- data/spec/spec_helper.rb +2 -0
- metadata +109 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
|
|
1
|
+
---
|
2
|
+
SHA1:
|
3
|
+
metadata.gz: c1341be913919ca0f0d23414b762dba9fe6ef6c2
|
4
|
+
data.tar.gz: 673f0740cb88c46adbe9bd430d867279211b12bf
|
5
|
+
SHA512:
|
6
|
+
metadata.gz: c414e58867f395b33fc14b052822a5a0097e7ccb5d3802d1bb1217c1379c1da16df7c5bdebb5b52a8a56779823c64612ce096269b3c09db363c4a6338e9f4295
|
7
|
+
data.tar.gz: 572affe684c3cb4361ded31a7bf949079106239feb1552ecfeb400fcd392b6d710c22b8f7806f0d0a0ca3531db66632eeeb7500717d1380b8abe5f223c28e243
|
data/.gitignore
ADDED
data/.rspec
ADDED
data/.travis.yml
ADDED
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,22 @@
|
|
1
|
+
Copyright (c) 2014 TODO: Write your name
|
2
|
+
|
3
|
+
MIT License
|
4
|
+
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining
|
6
|
+
a copy of this software and associated documentation files (the
|
7
|
+
"Software"), to deal in the Software without restriction, including
|
8
|
+
without limitation the rights to use, copy, modify, merge, publish,
|
9
|
+
distribute, sublicense, and/or sell copies of the Software, and to
|
10
|
+
permit persons to whom the Software is furnished to do so, subject to
|
11
|
+
the following conditions:
|
12
|
+
|
13
|
+
The above copyright notice and this permission notice shall be
|
14
|
+
included in all copies or substantial portions of the Software.
|
15
|
+
|
16
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
17
|
+
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
18
|
+
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
19
|
+
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
|
20
|
+
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
21
|
+
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
22
|
+
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
data/README.ja.md
ADDED
@@ -0,0 +1,119 @@
|
|
1
|
+
FileCharLicker
|
2
|
+
====
|
3
|
+
|
4
|
+
A library for Ruby.
|
5
|
+
|
6
|
+
Its main features are as follows.
|
7
|
+
|
8
|
+
- Move the file pointer position on a per-character basis
|
9
|
+
- Get the strings before and after the pointer
|
10
|
+
- Multi-byte character support
|
11
|
+
|
12
|
+
## Installation
|
13
|
+
|
14
|
+
```ruby
|
15
|
+
gem install file_char_licker
|
16
|
+
```
|
17
|
+
|
18
|
+
## Usage
|
19
|
+
|
20
|
+
### attach (setup)
|
21
|
+
|
22
|
+
To use the features of FileCharLicker, you must first set up a File instance.
|
23
|
+
|
24
|
+
```ruby
|
25
|
+
file = open(path)
|
26
|
+
FileCharLicker.attach(file, encoding)
|
27
|
+
```
|
28
|
+
|
29
|
+
The _file_ argument is the File instance to set up.
|
30
|
+
|
31
|
+
The _encoding_ argument sets the character encoding. Pass utf8, eucjp, jis, sjis, or a prefix of one of these. If omitted, only ASCII characters are supported.
|
32
|
+
|
33
|
+
After setup, the following instance methods become available.
|
34
|
+
|
35
|
+
### around_lines
|
36
|
+
|
37
|
+
```ruby
|
38
|
+
file.around_lines(needle)
|
39
|
+
```
|
40
|
+
|
41
|
+
Searches backward and forward from the file pointer position at call time for lines containing _needle_ and returns them as a string. The _needle_ argument can be a String or a Regexp.
|
42
|
+
|
43
|
+
If there are no contiguous matching lines before or after the pointer, an empty string is returned.
|
44
|
+
|
45
|
+
The file pointer is assumed to be at the start of a line when this method is called.
|
46
|
+
|
47
|
+
### backward_char
|
48
|
+
|
49
|
+
```ruby
|
50
|
+
file.backward_char
|
51
|
+
```
|
52
|
+
|
53
|
+
Returns the character immediately before the file pointer position at call time. Mainly used for handling multi-byte characters.
|
54
|
+
|
55
|
+
If there is no preceding character, nil is returned.
|
56
|
+
|
57
|
+
### backward_lines
|
58
|
+
|
59
|
+
```ruby
|
60
|
+
file.backward_lines(size)
|
61
|
+
```
|
62
|
+
|
63
|
+
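Returns the string covering the number of lines given by _size_ before the file pointer position at call time. If the beginning of the file is reached before the specified number of lines, the string up to that point is returned.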
|
64
|
+
|
65
|
+
The file pointer is assumed to be at the start of a line when this method is called.
|
66
|
+
|
67
|
+
### current_line
|
68
|
+
|
69
|
+
```ruby
|
70
|
+
file.current_line
|
71
|
+
```
|
72
|
+
|
73
|
+
Returns the string of the line that contains the file pointer position at call time.
|
74
|
+
|
75
|
+
### forward_lines
|
76
|
+
|
77
|
+
```ruby
|
78
|
+
file.forward_lines(size)
|
79
|
+
```
|
80
|
+
|
81
|
+
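Returns the string covering the number of lines given by _size_ after the file pointer position at call time. If the end of the file is reached before the specified number of lines, the string up to that point is returned.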
|
82
|
+
|
83
|
+
The file pointer is assumed to be at the start of a line when this method is called.
|
84
|
+
|
85
|
+
### seek_contiguous_min
|
86
|
+
|
87
|
+
```ruby
|
88
|
+
file.seek_contiguous_min(needle)
|
89
|
+
```
|
90
|
+
|
91
|
+
Searches the lines preceding the file pointer position at call time for lines containing _needle_, and moves the pointer to the start of the first such line. The _needle_ argument can be a String or a Regexp.
|
92
|
+
|
93
|
+
On success, returns the pointer position as an Integer. If the move fails (no matching line is found before the pointer), nil is returned.
|
94
|
+
|
95
|
+
The file pointer is assumed to be at the start of a line when this method is called.
|
96
|
+
|
97
|
+
### seek_contiguous_max
|
98
|
+
|
99
|
+
```ruby
|
100
|
+
file.seek_contiguous_max(needle)
|
101
|
+
```
|
102
|
+
|
103
|
+
Searches the lines following the file pointer position at call time for lines containing _needle_, and moves the pointer to the end of the last such line. The _needle_ argument can be a String or a Regexp.
|
104
|
+
|
105
|
+
On success, returns the pointer position as an Integer. If the move fails (no matching line is found after the pointer), nil is returned.
|
106
|
+
|
107
|
+
The file pointer is assumed to be at the start of a line when this method is called.
|
108
|
+
|
109
|
+
### seek_line_head
|
110
|
+
|
111
|
+
```ruby
|
112
|
+
file.seek_line_head
|
113
|
+
```
|
114
|
+
|
115
|
+
Moves the pointer to the start of the line, based on the file pointer position at call time.
|
116
|
+
|
117
|
+
## Author
|
118
|
+
|
119
|
+
[indeep-xyz](http://indeep.xyz/)
|
data/README.md
ADDED
@@ -0,0 +1,120 @@
|
|
1
|
+
FileCharLicker
|
2
|
+
====
|
3
|
+
|
4
|
+
A library for Ruby.
|
5
|
+
|
6
|
+
It provides the following features:
|
7
|
+
|
8
|
+
- Move the file pointer on a per-character basis.
|
9
|
+
- Get the string around the file pointer.
|
10
|
+
- Support for multi-byte characters.
|
11
|
+
|
12
|
+
|
13
|
+
## Installation
|
14
|
+
|
15
|
+
```ruby
|
16
|
+
gem install file_char_licker
|
17
|
+
```
|
18
|
+
|
19
|
+
## Usage
|
20
|
+
|
21
|
+
### attach (setup)
|
22
|
+
|
23
|
+
First, you must set up an instance of File before using the features of _FileCharLicker_.
|
24
|
+
|
25
|
+
```ruby
|
26
|
+
file = open(path)
|
27
|
+
FileCharLicker.attach(file, encoding)
|
28
|
+
```
|
29
|
+
|
30
|
+
The _file_ argument is the File instance on which you want to use the features of _FileCharLicker_.
|
31
|
+
|
32
|
+
The _encoding_ argument is a String: 'utf8', 'eucjp', 'jis', 'sjis', or a prefix of one of these. If you pass _nil_ or omit it, the instance is set up for ASCII characters only.
|
33
|
+
|
34
|
+
After setup, you can use the following instance methods.
|
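As a rough illustration of what this kind of setup does, the sketch below adds a delegating singleton method to a single IO object, similar in spirit to `attach`. The names `LineHelper`, `attach_helper`, and `rest_of_line` are hypothetical and exist only for this example; they are not part of FileCharLicker.

```ruby
require "stringio"

# Hypothetical helper class, standing in for the object that does the work.
class LineHelper
  def initialize(io)
    @io = io
  end

  # Return the text from the current position to the end of the line,
  # then restore the original position.
  def rest_of_line
    pos = @io.pos
    line = @io.gets
    @io.seek(pos)
    line
  end
end

# Hypothetical setup function: stores the helper on the object and
# defines a singleton method that forwards to it.
def attach_helper(io)
  helper = LineHelper.new(io)
  io.instance_variable_set(:@line_helper, helper)
  io.define_singleton_method(:rest_of_line) do |*args|
    @line_helper.rest_of_line(*args)
  end
  helper
end

io = StringIO.new("one\ntwo\n")
attach_helper(io)
puts io.rest_of_line
```

Only the object passed to the setup function gains the new method; other IO objects are left untouched.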
35
|
+
|
36
|
+
### around_lines
|
37
|
+
|
38
|
+
```ruby
|
39
|
+
file.around_lines(needle)
|
40
|
+
```
|
41
|
+
|
42
|
+
Returns the contiguous lines around the file pointer that match the _needle_ argument, as a String. _needle_ can be a String or a Regexp.
|
43
|
+
|
44
|
+
If no matching lines exist around the file pointer, returns an empty string.
|
45
|
+
|
46
|
+
The file pointer is assumed to be at the start of a line when this method is called.
|
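The idea of a contiguous run of matching lines can be sketched in memory, without any file handling. This is only one reading of the description above, not the gem's implementation: `contiguous_block` is a hypothetical helper, and returning an empty array when the starting line does not match is an assumption of this sketch.

```ruby
# Hypothetical helper: given an array of lines and a starting index, return
# the unbroken run of lines around that index that contain `needle`
# (a String or a Regexp).
def contiguous_block(lines, index, needle)
  hit = ->(line) { !line[needle].nil? }
  return [] unless index.between?(0, lines.size - 1) && hit.call(lines[index])

  first = index
  first -= 1 while first > 0 && hit.call(lines[first - 1])

  last = index
  last += 1 while last < lines.size - 1 && hit.call(lines[last + 1])

  lines[first..last]
end

lines = ["x 1", "log a", "log b", "log c", "y"]
p contiguous_block(lines, 2, "log")
```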
47
|
+
|
48
|
+
### backward_char
|
49
|
+
|
50
|
+
```ruby
|
51
|
+
file.backward_char
|
52
|
+
```
|
53
|
+
|
54
|
+
Returns the character immediately before the file pointer. This is mainly useful for multi-byte characters.
|
55
|
+
|
56
|
+
If there is no preceding character, returns _nil_.
|
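To show why multi-byte text needs special care here, the following sketch reads the previous character from a UTF-8 stream by widening a byte window until it forms a valid character. It is an illustration under stated assumptions (UTF-8 input, characters of at most 4 bytes), not the gem's own method; `previous_char` is a hypothetical name.

```ruby
require "stringio"

# Hypothetical helper: return the UTF-8 character that ends just before
# io.pos, or nil at the beginning of the stream. The position is restored.
def previous_char(io, max_bytes = 4)
  end_pos = io.pos
  return nil if end_pos.zero?

  1.upto([max_bytes, end_pos].min) do |width|
    io.seek(end_pos - width)
    char = io.read(width).force_encoding(Encoding::UTF_8)
    if char.valid_encoding?
      io.seek(end_pos)
      return char
    end
  end

  io.seek(end_pos)
  nil
end

io = StringIO.new("aあ")
io.seek(0, IO::SEEK_END)
puts previous_char(io)
```

Stepping back a single byte would land in the middle of a multi-byte character, which is why the window grows until the bytes decode cleanly.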
57
|
+
|
58
|
+
### backward_lines
|
59
|
+
|
60
|
+
```ruby
|
61
|
+
file.backward_lines(size)
|
62
|
+
```
|
63
|
+
|
64
|
+
Returns the lines before the file pointer as a String. The _size_ argument is the number of lines. If the beginning of the file (BOF) is reached before _size_ lines are collected, returns the string up to BOF.
|
65
|
+
|
66
|
+
The file pointer is assumed to be at the start of a line when this method is called.
|
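A backward line read can be sketched by scanning bytes toward the beginning of the stream and counting newlines. This is a simplified illustration, assuming `\n` line endings and a pointer at the start of a line; `lines_before` is a hypothetical helper and its exact output may differ from the gem's `backward_lines`.

```ruby
require "stringio"

# Hypothetical helper: return up to `count` lines that precede io.pos and
# leave the pointer at the start of the returned text.
def lines_before(io, count)
  end_pos = io.pos
  start = end_pos
  found = 0

  while start > 0
    io.seek(start - 1)
    byte = io.getbyte
    # The newline directly before end_pos terminates the previous line,
    # so it does not count as a boundary.
    if byte == 0x0A && start != end_pos
      found += 1
      break if found == count
    end
    start -= 1
  end

  io.seek(start)
  text = io.read(end_pos - start).to_s
  io.seek(start)
  text
end

io = StringIO.new("a\nb\nc\n")
io.seek(4) # start of the line "c"
print lines_before(io, 1)
```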
67
|
+
|
68
|
+
### current_line
|
69
|
+
|
70
|
+
```ruby
|
71
|
+
file.current_line
|
72
|
+
```
|
73
|
+
|
74
|
+
Returns the line that contains the file pointer, as a String.
|
75
|
+
|
76
|
+
### forward_lines
|
77
|
+
|
78
|
+
```ruby
|
79
|
+
file.forward_lines(size)
|
80
|
+
```
|
81
|
+
|
82
|
+
Returns the lines after the file pointer as a String. The _size_ argument is the number of lines. If the end of the file (EOF) is reached before _size_ lines are collected, returns the string up to EOF.
|
83
|
+
|
84
|
+
The file pointer is assumed to be at the start of a line when this method is called.
|
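The forward case is simpler, since `IO#gets` already reads one line at a time. The sketch below shows the stop-at-EOF behavior described above; `read_lines` is a hypothetical helper, not the gem's `forward_lines`.

```ruby
require "stringio"

# Hypothetical helper: read up to `count` lines from the current position
# and return them as one String. Stops early at end of file.
def read_lines(io, count)
  result = +""
  count.times do
    line = io.gets
    break if line.nil? # EOF reached before `count` lines
    result << line
  end
  result
end

io = StringIO.new("a\nb\nc")
print read_lines(io, 2)
```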
85
|
+
|
86
|
+
### seek_contiguous_min
|
87
|
+
|
88
|
+
```ruby
|
89
|
+
file.seek_contiguous_min(needle)
|
90
|
+
```
|
91
|
+
|
92
|
+
Moves the file pointer to the start of the first line in the contiguous run of lines that match _needle_, searching backward from the file pointer. _needle_ can be a String or a Regexp.
|
93
|
+
|
94
|
+
Returns the new file pointer position as an Integer if the move succeeds; otherwise returns nil.
|
95
|
+
|
96
|
+
The file pointer is assumed to be at the start of a line when this method is called.
|
97
|
+
|
98
|
+
### seek_contiguous_max
|
99
|
+
|
100
|
+
```ruby
|
101
|
+
file.seek_contiguous_max(needle)
|
102
|
+
```
|
103
|
+
|
104
|
+
Moves the file pointer to the end of the last line in the contiguous run of lines that match _needle_, searching forward from the file pointer. _needle_ can be a String or a Regexp.
|
105
|
+
|
106
|
+
Returns the new file pointer position as an Integer if the move succeeds; otherwise returns nil.
|
107
|
+
|
108
|
+
The file pointer is assumed to be at the start of a line when this method is called.
|
109
|
+
|
110
|
+
### seek_line_head
|
111
|
+
|
112
|
+
```ruby
|
113
|
+
file.seek_line_head
|
114
|
+
```
|
115
|
+
|
116
|
+
Moves the file pointer to the head of the current line.
|
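Moving to the head of a line can be sketched as a byte-wise backward scan for a newline. This assumes `\n` line endings and an ASCII-compatible encoding such as UTF-8, where the byte 0x0A never appears inside a multi-byte character. `seek_to_line_start` is a hypothetical helper, not the gem's method.

```ruby
require "stringio"

# Hypothetical helper: move io.pos back to the first byte of the current
# line and return the new position.
def seek_to_line_start(io)
  while io.pos > 0
    io.seek(-1, IO::SEEK_CUR)
    byte = io.getbyte               # reading advances the pointer again
    break if byte == 0x0A           # pointer now sits just after the newline
    io.seek(-1, IO::SEEK_CUR)       # step back over the byte just read
  end
  io.pos
end

io = StringIO.new("one\ntwo\n")
io.seek(5) # inside "two"
p seek_to_line_start(io)
```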
117
|
+
|
118
|
+
## Author
|
119
|
+
|
120
|
+
[indeep-xyz](http://indeep.xyz/)
|
data/Rakefile
ADDED
data/file_char_licker.gemspec
ADDED
@@ -0,0 +1,30 @@
|
|
1
|
+
# coding: utf-8
|
2
|
+
lib = File.expand_path('../lib', __FILE__)
|
3
|
+
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
|
4
|
+
require 'file_char_licker/version'
|
5
|
+
|
6
|
+
Gem::Specification.new do |spec|
|
7
|
+
spec.name = "file_char_licker"
|
8
|
+
spec.version = FileCharLicker::VERSION
|
9
|
+
spec.authors = ["indeep-xyz"]
|
10
|
+
spec.email = ["indeep.xyz@gmail.com"]
|
11
|
+
spec.summary = %q{Move the file pointer on a per-character basis and get the surrounding string.}
|
12
|
+
spec.description = <<EOT
|
13
|
+
it has the following functions.
|
14
|
+
|
15
|
+
- move the position of file pointer in character-based.
|
16
|
+
- get string that is around the file pointer.
|
17
|
+
- support for multi-byte character.
|
18
|
+
EOT
|
19
|
+
spec.homepage = ""
|
20
|
+
spec.license = "MIT"
|
21
|
+
|
22
|
+
spec.files = `git ls-files -z`.split("\x0")
|
23
|
+
spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
|
24
|
+
spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
|
25
|
+
spec.require_paths = ["lib"]
|
26
|
+
|
27
|
+
spec.add_development_dependency "bundler", "~> 1.7"
|
28
|
+
spec.add_development_dependency "rake", "~> 10.0"
|
29
|
+
spec.add_development_dependency "rspec"
|
30
|
+
end
|
data/lib/file_char_licker/attach_licker.rb
ADDED
@@ -0,0 +1,64 @@
|
|
1
|
+
require "file_char_licker/licker/licker"
|
2
|
+
require "file_char_licker/licker/mb_licker"
|
3
|
+
|
4
|
+
module FileCharLicker
|
5
|
+
class << self
|
6
|
+
|
7
|
+
def attach(file, encoding = nil)
|
8
|
+
|
9
|
+
seeker = encoding.nil? \
|
10
|
+
? Licker.new(file) \
|
11
|
+
: MbLicker.new(file, encoding)
|
12
|
+
|
13
|
+
# attach variables/methods to instance
|
14
|
+
file.instance_variable_set(:@file_char_licker, seeker)
|
15
|
+
instance_methods_set(file)
|
16
|
+
|
17
|
+
seeker
|
18
|
+
end
|
19
|
+
|
20
|
+
private
|
21
|
+
|
22
|
+
def instance_methods_set(file)
|
23
|
+
|
24
|
+
file.instance_eval do
|
25
|
+
class << self
|
26
|
+
|
27
|
+
def around_lines(*args)
|
28
|
+
@file_char_licker.around_lines(*args)
|
29
|
+
end
|
30
|
+
|
31
|
+
def backward_char(*args)
|
32
|
+
@file_char_licker.backward_char(*args)
|
33
|
+
end
|
34
|
+
|
35
|
+
def backward_lines(*args)
|
36
|
+
@file_char_licker.backward_lines(*args)
|
37
|
+
end
|
38
|
+
|
39
|
+
def current_line(*args)
|
40
|
+
@file_char_licker.current_line(*args)
|
41
|
+
end
|
42
|
+
|
43
|
+
def forward_lines(*args)
|
44
|
+
@file_char_licker.forward_lines(*args)
|
45
|
+
end
|
46
|
+
|
47
|
+
def seek_contiguous_min(*args)
|
48
|
+
@file_char_licker.seek_contiguous_min(*args)
|
49
|
+
end
|
50
|
+
|
51
|
+
def seek_contiguous_max(*args)
|
52
|
+
@file_char_licker.seek_contiguous_max(*args)
|
53
|
+
end
|
54
|
+
|
55
|
+
def seek_line_head(*args)
|
56
|
+
@file_char_licker.seek_line_head(*args)
|
57
|
+
end
|
58
|
+
end
|
59
|
+
end
|
60
|
+
|
61
|
+
end
|
62
|
+
|
63
|
+
end
|
64
|
+
end
|
data/lib/file_char_licker/licker/licker.rb
ADDED
@@ -0,0 +1,246 @@
|
|
1
|
+
require "kconv"
|
2
|
+
|
3
|
+
module FileCharLicker
|
4
|
+
class Licker
|
5
|
+
|
6
|
+
def initialize(file)
|
7
|
+
|
8
|
+
@file = file
|
9
|
+
end
|
10
|
+
|
11
|
+
# get lines around for passed file#pos
|
12
|
+
#
|
13
|
+
# args
|
14
|
+
# pos ... starting point for file#pos
|
15
|
+
# require to be within contiguous range
|
16
|
+
# needle ... RegExp object for contiguous check
|
17
|
+
#
|
18
|
+
# returner
|
19
|
+
# String object as lines
|
20
|
+
def around_lines(needle)
|
21
|
+
|
22
|
+
file = @file
|
23
|
+
pos = file.pos
|
24
|
+
result = ""
|
25
|
+
|
26
|
+
# scan min
|
27
|
+
file.seek(pos)
|
28
|
+
min = seek_contiguous_min(needle) || pos
|
29
|
+
|
30
|
+
# scan max
|
31
|
+
file.seek(pos)
|
32
|
+
max = seek_contiguous_max(needle) || pos
|
33
|
+
|
34
|
+
# read
|
35
|
+
# - require succeed scan processes
|
36
|
+
if max > min
|
37
|
+
file.seek(min)
|
38
|
+
result = file.read(max - min).chomp
|
39
|
+
end
|
40
|
+
|
41
|
+
result
|
42
|
+
end
|
43
|
+
|
44
|
+
# get a backward character from file#pos
|
45
|
+
#
|
46
|
+
# returner
|
47
|
+
# String object ... exists
|
48
|
+
# nil ... not exists
|
49
|
+
def backward_char
|
50
|
+
|
51
|
+
file = @file
|
52
|
+
result = nil
|
53
|
+
|
54
|
+
if file.pos > 0
|
55
|
+
file.seek(-1, IO::SEEK_CUR)
|
56
|
+
result = file.getc
|
57
|
+
end
|
58
|
+
|
59
|
+
result
|
60
|
+
end
|
61
|
+
|
62
|
+
# get backward lines from file#pos
|
63
|
+
# #pos value should be at SOL (Start Of Line)
|
64
|
+
#
|
65
|
+
# args
|
66
|
+
# size ... number of lines
|
67
|
+
#
|
68
|
+
# returner
|
69
|
+
# String object as lines
|
70
|
+
def backward_lines(size = 10)
|
71
|
+
|
72
|
+
file = @file
|
73
|
+
reg = Regexp.new('\r\n|\r|\n')
|
74
|
+
result = ""
|
75
|
+
|
76
|
+
while file.pos > 0
|
77
|
+
|
78
|
+
char = backward_char
|
79
|
+
|
80
|
+
break if char.nil?
|
81
|
+
|
82
|
+
# backward pos as bytesize of char
|
83
|
+
file.seek(-(char.bytesize), IO::SEEK_CUR)
|
84
|
+
|
85
|
+
result.insert(0, char)
|
86
|
+
break if char.match(reg) && result.scan(reg).size > size
|
87
|
+
end
|
88
|
+
|
89
|
+
result
|
90
|
+
end
|
91
|
+
|
92
|
+
# get a line string at current position
|
93
|
+
def current_line
|
94
|
+
|
95
|
+
seek_line_head
|
96
|
+
@file.gets
|
97
|
+
end
|
98
|
+
|
99
|
+
# get forward lines
|
100
|
+
#
|
101
|
+
# args
|
102
|
+
# size ... number of lines
|
103
|
+
#
|
104
|
+
# returner
|
105
|
+
# String object as lines
|
106
|
+
def forward_lines(size = 10)
|
107
|
+
|
108
|
+
file = @file
|
109
|
+
result = ""
|
110
|
+
|
111
|
+
while result.scan(/\r\n|\r|\n/).size < size && !file.eof?
|
112
|
+
|
113
|
+
result += file.gets
|
114
|
+
end
|
115
|
+
|
116
|
+
result
|
117
|
+
end
|
118
|
+
|
119
|
+
# scan max file#pos of contiguous.
|
120
|
+
# before set to be within contiguous range.
|
121
|
+
#
|
122
|
+
# args
|
123
|
+
# needle ... RegExp or String object for contiguous check
|
124
|
+
# step_lines ... number of lines for #forward_lines
|
125
|
+
#
|
126
|
+
# returner
|
127
|
+
# Integer object for file#pos
|
128
|
+
# EOL of matched line
|
129
|
+
def seek_contiguous_max(needle, step_lines = 10)
|
130
|
+
|
131
|
+
file = @file
|
132
|
+
max = nil
|
133
|
+
|
134
|
+
loop do
|
135
|
+
|
136
|
+
# file#pos before #forward_lines
|
137
|
+
pos_old = file.pos
|
138
|
+
|
139
|
+
lines = forward_lines(step_lines)
|
140
|
+
lines_pos = lines.rindex(needle)
|
141
|
+
|
142
|
+
# for debug
|
143
|
+
# p [
|
144
|
+
# lines: lines,
|
145
|
+
# lines_pos: lines_pos,
|
146
|
+
# file_pos: file.pos
|
147
|
+
# ].to_s
|
148
|
+
# sleep 0.05
|
149
|
+
|
150
|
+
# if did not match needle
|
151
|
+
# - returner is last set value to 'max'
|
152
|
+
break if lines_pos.nil?
|
153
|
+
|
154
|
+
lines_end_pos = str_byte_index(lines, /(\r\n|\r|\n)+?/, lines_pos)
|
155
|
+
|
156
|
+
if file.eof?
|
157
|
+
max = (lines_end_pos.nil?) ? file.size : pos_old + lines_end_pos
|
158
|
+
break
|
159
|
+
else
|
160
|
+
max = pos_old + lines_end_pos
|
161
|
+
|
162
|
+
break if lines_end_pos < lines.bytesize - 1
|
163
|
+
end
|
164
|
+
|
165
|
+
end
|
166
|
+
|
167
|
+
max
|
168
|
+
end
|
169
|
+
|
170
|
+
# scan min file#pos of contiguous.
|
171
|
+
# before set to be within contiguous range.
|
172
|
+
#
|
173
|
+
# args
|
174
|
+
# needle ... RegExp or String object for contiguous check
|
175
|
+
# step_lines ... number of lines for #backward_lines
|
176
|
+
#
|
177
|
+
# returner
|
178
|
+
# Integer object for file#pos
|
179
|
+
# EOS of matched line
|
180
|
+
def seek_contiguous_min(needle, step_lines = 10)
|
181
|
+
|
182
|
+
file = @file
|
183
|
+
min = nil
|
184
|
+
|
185
|
+
loop do
|
186
|
+
|
187
|
+
lines = backward_lines(step_lines)
|
188
|
+
lines_pos = str_byte_index(lines, needle)
|
189
|
+
file_pos = file.pos
|
190
|
+
|
191
|
+
# for debug
|
192
|
+
# p [
|
193
|
+
# lines: lines,
|
194
|
+
# lines_pos: lines_pos,
|
195
|
+
# file_pos: file_pos
|
196
|
+
# ].to_s
|
197
|
+
# sleep 0.05
|
198
|
+
|
199
|
+
if lines_pos.nil?
|
200
|
+
break
|
201
|
+
else
|
202
|
+
|
203
|
+
min = file_pos + lines_pos
|
204
|
+
break if lines_pos > 0 || file_pos < 1
|
205
|
+
end
|
206
|
+
end
|
207
|
+
|
208
|
+
min
|
209
|
+
end
|
210
|
+
|
211
|
+
def seek_line_head
|
212
|
+
|
213
|
+
file = @file
|
214
|
+
|
215
|
+
if file.pos > 0
|
216
|
+
|
217
|
+
# move pointer to before character
|
218
|
+
file.seek(-1, IO::SEEK_CUR)
|
219
|
+
|
220
|
+
# loop
|
221
|
+
# - move pointer until reach to EOL of before line.
|
222
|
+
until file.getc.match(/[\r\n]/)
|
223
|
+
|
224
|
+
# move pointer to before character
|
225
|
+
if file.pos > 1
|
226
|
+
file.seek(-2, IO::SEEK_CUR)
|
227
|
+
else
|
228
|
+
|
229
|
+
# if EOS, break
|
230
|
+
file.rewind
|
231
|
+
break
|
232
|
+
end
|
233
|
+
end
|
234
|
+
end
|
235
|
+
|
236
|
+
file.pos
|
237
|
+
end
|
238
|
+
|
239
|
+
protected
|
240
|
+
|
241
|
+
# String#index (for method of child class)
|
242
|
+
def str_byte_index(haystack, needle, offset = 0)
|
243
|
+
haystack.index(needle, offset)
|
244
|
+
end
|
245
|
+
end
|
246
|
+
end
|
data/lib/file_char_licker/licker/mb_licker.rb
ADDED
@@ -0,0 +1,137 @@
|
|
1
|
+
module FileCharLicker
|
2
|
+
class MbLicker < FileCharLicker::Licker
|
3
|
+
|
4
|
+
attr_accessor :mb_bytesize_max
|
5
|
+
attr_reader :encoding, :kconv_checker, :nkf_option
|
6
|
+
|
7
|
+
def initialize(file, encoding = 'utf-8')
|
8
|
+
|
9
|
+
super(file)
|
10
|
+
init_encoding_variables(encoding)
|
11
|
+
|
12
|
+
@mb_bytesize_max = 6
|
13
|
+
end
|
14
|
+
|
15
|
+
# get lines around for passed file#pos
|
16
|
+
def around_lines(*args)
|
17
|
+
|
18
|
+
lines = super(*args)
|
19
|
+
NKF.nkf(@nkf_option, lines)
|
20
|
+
end
|
21
|
+
|
22
|
+
# get a backward character from file#pos
|
23
|
+
#
|
24
|
+
# instance variables
|
25
|
+
# mb_bytesize_max ... max bytesize for multibyte character
|
26
|
+
#
|
27
|
+
# args
|
28
|
+
# update_flag ... if true, update file#pos for backward character's head
|
29
|
+
#
|
30
|
+
# returner
|
31
|
+
# String object ... exists
|
32
|
+
# nil ... not exists
|
33
|
+
def backward_char
|
34
|
+
|
35
|
+
file = @file
|
36
|
+
pos_max = file.pos - 1
|
37
|
+
pos_min = pos_max - @mb_bytesize_max
|
38
|
+
|
39
|
+
pos_max.downto(pos_min) do |pos|
|
40
|
+
|
41
|
+
break if pos < 0
|
42
|
+
|
43
|
+
file.seek(pos)
|
44
|
+
char = file.getc
|
45
|
+
|
46
|
+
# return file#getc character
|
47
|
+
# - when that is regular for multibyte character
|
48
|
+
return char if check_mb(char)
|
49
|
+
end
|
50
|
+
|
51
|
+
nil
|
52
|
+
end
|
53
|
+
|
54
|
+
def seek_line_head
|
55
|
+
|
56
|
+
file = @file
|
57
|
+
|
58
|
+
# loop
|
59
|
+
# - move pointer until reach to EOL of before line.
|
60
|
+
while file.pos > 0
|
61
|
+
|
62
|
+
char = backward_char
|
63
|
+
|
64
|
+
# break
|
65
|
+
# - if did not get backward character
|
66
|
+
# - if got character is a line break character
|
67
|
+
break if char.nil? || char.match(/[\r\n]/)
|
68
|
+
|
69
|
+
# backward pos as bytesize of char
|
70
|
+
file.seek(-(char.bytesize), IO::SEEK_CUR)
|
71
|
+
end
|
72
|
+
|
73
|
+
file.pos
|
74
|
+
end
|
75
|
+
|
76
|
+
protected
|
77
|
+
|
78
|
+
# String#index for multibyte
|
79
|
+
def str_byte_index(haystack, needle, offset = 0)
|
80
|
+
|
81
|
+
result = nil
|
82
|
+
mb_idx = haystack.index(needle, offset)
|
83
|
+
|
84
|
+
unless mb_idx.nil?
|
85
|
+
result = haystack.slice(0..mb_idx).bytesize - 1
|
86
|
+
end
|
87
|
+
|
88
|
+
result
|
89
|
+
end
|
90
|
+
|
91
|
+
private
|
92
|
+
|
93
|
+
# check multibyte
|
94
|
+
def check_mb(char)
|
95
|
+
char.__send__(@kconv_checker)
|
96
|
+
end
|
97
|
+
|
98
|
+
def init_encoding_variables(enc_source)
|
99
|
+
|
100
|
+
@encoding = parse_encoding(enc_source)
|
101
|
+
|
102
|
+
case @encoding
|
103
|
+
when 'eucjp'
|
104
|
+
@kconv_checker = 'iseuc'
|
105
|
+
@nkf_option = '-exm0'
|
106
|
+
|
107
|
+
when 'jis'
|
108
|
+
@kconv_checker = 'isjis'
|
109
|
+
@nkf_option = '-jxm0'
|
110
|
+
|
111
|
+
when 'sjis'
|
112
|
+
@kconv_checker = 'issjis'
|
113
|
+
@nkf_option = '-sxm0'
|
114
|
+
|
115
|
+
when 'utf8'
|
116
|
+
@kconv_checker = 'isutf8'
|
117
|
+
@nkf_option = '-wxm0'
|
118
|
+
end
|
119
|
+
end
|
120
|
+
|
121
|
+
def parse_encoding(enc)
|
122
|
+
|
123
|
+
result = nil
|
124
|
+
|
125
|
+
unless enc.nil?
|
126
|
+
case
|
127
|
+
when enc.match(/^e/) then result = 'eucjp'
|
128
|
+
when enc.match(/^j/) then result = 'jis'
|
129
|
+
when enc.match(/^s/) then result = 'sjis'
|
130
|
+
when enc.match(/^u.*8?/) then result = 'utf8'
|
131
|
+
end
|
132
|
+
end
|
133
|
+
|
134
|
+
result
|
135
|
+
end
|
136
|
+
end
|
137
|
+
end
|
data/spec/spec_helper.rb
ADDED
metadata
ADDED
@@ -0,0 +1,109 @@
|
|
1
|
+
--- !ruby/object:Gem::Specification
|
2
|
+
name: file_char_licker
|
3
|
+
version: !ruby/object:Gem::Version
|
4
|
+
version: 0.5.0
|
5
|
+
platform: ruby
|
6
|
+
authors:
|
7
|
+
- indeep-xyz
|
8
|
+
autorequire:
|
9
|
+
bindir: bin
|
10
|
+
cert_chain: []
|
11
|
+
date: 2014-10-08 00:00:00.000000000 Z
|
12
|
+
dependencies:
|
13
|
+
- !ruby/object:Gem::Dependency
|
14
|
+
name: bundler
|
15
|
+
requirement: !ruby/object:Gem::Requirement
|
16
|
+
requirements:
|
17
|
+
- - "~>"
|
18
|
+
- !ruby/object:Gem::Version
|
19
|
+
version: '1.7'
|
20
|
+
type: :development
|
21
|
+
prerelease: false
|
22
|
+
version_requirements: !ruby/object:Gem::Requirement
|
23
|
+
requirements:
|
24
|
+
- - "~>"
|
25
|
+
- !ruby/object:Gem::Version
|
26
|
+
version: '1.7'
|
27
|
+
- !ruby/object:Gem::Dependency
|
28
|
+
name: rake
|
29
|
+
requirement: !ruby/object:Gem::Requirement
|
30
|
+
requirements:
|
31
|
+
- - "~>"
|
32
|
+
- !ruby/object:Gem::Version
|
33
|
+
version: '10.0'
|
34
|
+
type: :development
|
35
|
+
prerelease: false
|
36
|
+
version_requirements: !ruby/object:Gem::Requirement
|
37
|
+
requirements:
|
38
|
+
- - "~>"
|
39
|
+
- !ruby/object:Gem::Version
|
40
|
+
version: '10.0'
|
41
|
+
- !ruby/object:Gem::Dependency
|
42
|
+
name: rspec
|
43
|
+
requirement: !ruby/object:Gem::Requirement
|
44
|
+
requirements:
|
45
|
+
- - ">="
|
46
|
+
- !ruby/object:Gem::Version
|
47
|
+
version: '0'
|
48
|
+
type: :development
|
49
|
+
prerelease: false
|
50
|
+
version_requirements: !ruby/object:Gem::Requirement
|
51
|
+
requirements:
|
52
|
+
- - ">="
|
53
|
+
- !ruby/object:Gem::Version
|
54
|
+
version: '0'
|
55
|
+
description: |
|
56
|
+
it has the following functions.
|
57
|
+
|
58
|
+
- move the position of file pointer in character-based.
|
59
|
+
- get string that is around the file pointer.
|
60
|
+
- support for multi-byte character.
|
61
|
+
email:
|
62
|
+
- indeep.xyz@gmail.com
|
63
|
+
executables: []
|
64
|
+
extensions: []
|
65
|
+
extra_rdoc_files: []
|
66
|
+
files:
|
67
|
+
- ".gitignore"
|
68
|
+
- ".rspec"
|
69
|
+
- ".travis.yml"
|
70
|
+
- Gemfile
|
71
|
+
- LICENSE.txt
|
72
|
+
- README.ja.md
|
73
|
+
- README.md
|
74
|
+
- Rakefile
|
75
|
+
- file_char_licker.gemspec
|
76
|
+
- lib/file_char_licker.rb
|
77
|
+
- lib/file_char_licker/attach_licker.rb
|
78
|
+
- lib/file_char_licker/licker/licker.rb
|
79
|
+
- lib/file_char_licker/licker/mb_licker.rb
|
80
|
+
- lib/file_char_licker/version.rb
|
81
|
+
- spec/file_line_seeker_spec.rb
|
82
|
+
- spec/spec_helper.rb
|
83
|
+
homepage: ''
|
84
|
+
licenses:
|
85
|
+
- MIT
|
86
|
+
metadata: {}
|
87
|
+
post_install_message:
|
88
|
+
rdoc_options: []
|
89
|
+
require_paths:
|
90
|
+
- lib
|
91
|
+
required_ruby_version: !ruby/object:Gem::Requirement
|
92
|
+
requirements:
|
93
|
+
- - ">="
|
94
|
+
- !ruby/object:Gem::Version
|
95
|
+
version: '0'
|
96
|
+
required_rubygems_version: !ruby/object:Gem::Requirement
|
97
|
+
requirements:
|
98
|
+
- - ">="
|
99
|
+
- !ruby/object:Gem::Version
|
100
|
+
version: '0'
|
101
|
+
requirements: []
|
102
|
+
rubyforge_project:
|
103
|
+
rubygems_version: 2.2.2
|
104
|
+
signing_key:
|
105
|
+
specification_version: 4
|
106
|
+
summary: Move the file pointer on a per-character basis and get the surrounding string.
|
107
|
+
test_files:
|
108
|
+
- spec/file_line_seeker_spec.rb
|
109
|
+
- spec/spec_helper.rb
|