searchyj 0.6.0 → 0.6.1
- checksums.yaml +4 -4
- data/README.ja.md +18 -23
- data/README.md +30 -37
- data/TODO.md +3 -7
- data/lib/searchyj.rb +6 -6
- data/lib/searchyj/cli.rb +35 -23
- data/lib/searchyj/main.rb +8 -9
- data/lib/searchyj/searcher.rb +21 -7
- data/lib/searchyj/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a23769233d6904d18accdf9c7f560d750f6fd276
+  data.tar.gz: 65ce5afcec37ed5d17ba548c53c63b51c3b2a85b
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 76169acf83749d560ec56b4ab5043d6888aea93c6e88ab238a279972c505c1c7eb8b3a25c293524aa92139cd621f33c2e10e534327a2997b9e30ea8181535086
+  data.tar.gz: e0315aadfb6c7954b04ab5ac08dc2a9484718cca25435db21b1ae0b58253d5e660c7186d091503f37ee13163e9b496557662f861a582725af354c5ec701dbfda
data/README.ja.md
CHANGED
@@ -1,5 +1,4 @@
-SearchYJ
-====
+# SearchYJ
 
 Search on Yahoo Japan.
 
@@ -7,9 +6,9 @@ Search on Yahoo Japan.
 
 Add the following line to your Gemfile.
 
-
+~~~ruby
 gem 'searchyj'
-
+~~~
 
 Then run the following.
 
@@ -24,19 +23,12 @@ gem 'searchyj'
 The output is basically in JSON format and has the following parameters.
 
 - uri
-- The URI of the website.
-
-
 - title
--
-- The title is used as it appears in the search results, so it may be abbreviated.
-
-
+- The title is collected as it appears in the search results, so it may be truncated.
 - rank
 - The search rank.
 - SearchYJ collects records while excluding the advertisements mixed into the search results. Because this filtering is crude, the value may be slightly off.
 
-
 ### list
 
 Outputs the specified number of search results.
@@ -51,12 +43,11 @@ gem 'searchyj'
 
 If a single search yields fewer records than this value, the search is repeated until the value is reached. If the search results run out before the value is reached, the records collected so far are returned.
 
-
+The default value is 10.
 
 #### --from, -f
 
-
-
+Starts searching from the specified rank.
 
 ### detect
 
@@ -64,19 +55,23 @@ gem 'searchyj'
 
 $ searchyj detect [options] <SearchTerm>
 
-If the search matches nothing, the string
+If the search matches nothing, the string _null_ is output.
 
 #### --regexp, -r
 
-
+The string for the regular expression to match. Which parameter is matched depends on the _--key_ option.
 
 Required.
 
 #### --key, -k
 
-The name of the key to compare.
+The name of the key to compare. You can pass _title_ or _uri_.
+
+The default value is _title_.
 
-
+#### --from, -f
+
+Starts searching from the specified rank.
 
 ### rank
 
@@ -84,17 +79,17 @@ gem 'searchyj'
 
 $ searchyj rank [options] <SearchTerm>
 
-If the search matches nothing, the string
+If the search matches nothing, the string _null_ is output.
 
 #### --rank, -r
 
-
+The rank of the record to output.
 
-
+The default value is 1.
 
 ## Usage (Programming)
 
-
+Please read _lib/searchyj.rb_ and the other files.
 
 ## Author
 
data/README.md
CHANGED
@@ -1,15 +1,15 @@
-SearchYJ
-====
+# SearchYJ
 
 Search on Yahoo Japan.
 
-
+Installation
+----
 
 Add this line to your application's Gemfile:
 
-
+~~~ruby
 gem 'searchyj'
-
+~~~
 
 And then execute:
 
@@ -21,85 +21,78 @@ Or install it yourself as:
 
 ## Usage (CLI)
 
-The
-
-It has parameters the followings.
+The format of the search result is JSON which includes parameters the followings.
 
 - uri
-- The URI of the web site。
-
-
 - title
-- The title of the
-- The title might be abbreviation.
-
-
+- The title might be abbreviated because its source string is from the list of the search result.
 - rank
-- The rank order in the search
-- This number might have a slight error.
-
+- The number of the rank order in the search result.
+- This number might have a slight error. The records collected by SearchYJ is excepted some advertisements from the search result. The feature of exception is rough yet.
 
 ### list
 
-Print the search
+Print the search result.
 
 $ searchyj list [options] <SearchTerm>
 
-If the
+If the number of the result is none, SearchYJ print an empty array in the form of string.
 
 #### --size, -s
 
-
+This number is the size of the result records.
 
-
+SearchYJ continues searching and to collecting records until the sum of the records reach the size value.
 
-If
+If searching is over or become unable to get new records, before reach the size value, SearchYJ print the collected records at that time.
 
 The default value is 10.
 
-
 #### --from, -f
 
-
-
+The searching process starts from this number of the search ranking.
 
 ### detect
 
-
+Get the record matched first with the option value.
 
 $ searchyj detect [options] <SearchTerm>
 
-If the
+If the matching record is none, SearchYJ print a _null_ in the form of string.
 
 #### --regexp, -r
 
-
+This value is a regular expression to extract from the search result. The matching target in the parameters is depended on the value of _--key_ option.
 
-The
+The value is required.
 
 #### --key, -k
 
-
+This value is the name of the matching target which a key name of parameters in records. This option receives any of _title_ or _uri_.
+
+The default value is _title_.
+
+#### --from, -f
 
-The
+The searching process starts from this number of the search ranking.
 
 ### rank
 
-Print a record
+Print a particular record extracted from the search result by the number of rank order.
 
 $ searchyj rank [options] <SearchTerm>
 
-If the
+If the matching record is none, SearchYJ print a _null_ in the form of string.
 
 #### --rank, -r
 
-
+This value is the number of rank order in the search result.
 
-The
+The default value is _1_.
 
 ## Usage (Programming)
 
-Please read
+Please read _lib/searchyj.rb_ and others.
 
 ## Author
 
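The README diff above describes the CLI output as JSON records with the keys `uri`, `title`, and `rank`. As a rough sketch of consuming such output from Ruby, assuming a made-up sample payload (the URIs and titles below are placeholders, not real search results, and only the key names come from the README):

```ruby
require 'json'

# Hypothetical sample of what `searchyj list` might print. The values are
# placeholders; only the key names (uri, title, rank) come from the README.
sample_output = <<~JSON
  [
    {"uri": "https://example.com/a", "title": "Example A", "rank": 1},
    {"uri": "https://example.com/b", "title": "Example B", "rank": 2}
  ]
JSON

records = JSON.parse(sample_output, symbolize_names: true)

# Pick the first record whose title matches a pattern, similar in spirit
# to what the `detect` subcommand is documented to do.
matched = records.find { |record| record[:title] =~ /B\z/ }

records.each do |record|
  puts "#{record[:rank]}: #{record[:title]} <#{record[:uri]}>"
end
puts "first match: #{matched[:uri]}" if matched
```

In practice the JSON string would come from the command's standard output rather than a heredoc.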
data/TODO.md
CHANGED
data/lib/searchyj.rb
CHANGED
@@ -5,18 +5,18 @@ require "searchyj/cli"
 module SearchYJ
   module_function
 
-  def list(term, size = 10,
+  def list(term, size = 10, **args)
     mgr = SearchYJ::Main.new
-    mgr.list(term, size,
+    mgr.list(term, size, args)
   end
 
-  def rank(term, rank)
+  def rank(term, rank, **args)
     mgr = SearchYJ::Main.new
-    mgr.rank(term, rank)
+    mgr.rank(term, rank, args)
   end
 
-  def detect(term, regexp, key = :title)
+  def detect(term, regexp, key = :title, **args)
     mgr = SearchYJ::Main.new
-    mgr.detect(term, regexp, key)
+    mgr.detect(term, regexp, key, args)
   end
 end
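The change above collects extra options in `**args` and hands the resulting Hash on as a trailing positional argument. That relied on Ruby 2.x implicitly treating a trailing Hash as keyword arguments; on Ruby 3.0 and later a call shaped like `mgr.list(term, size, args)` would raise `ArgumentError` against a `**args` signature. A minimal sketch of forwarding with an explicit double splat, using hypothetical stand-in names (`FakeMain`, `FakeSearchYJ`) rather than the gem's real classes:

```ruby
# Hypothetical stand-in for SearchYJ::Main; it only echoes what it receives.
class FakeMain
  def list(term, size = 10, **args)
    { term: term, size: size, args: args }
  end
end

# Hypothetical stand-in for the SearchYJ module-level helpers.
module FakeSearchYJ
  module_function

  # Forward the extra keyword options with `**args` so the call behaves
  # the same on Ruby 2.x and Ruby 3.x.
  def list(term, size = 10, **args)
    FakeMain.new.list(term, size, **args)
  end
end

result = FakeSearchYJ.list('ruby', 5, from: 11)
p result
```

The only difference from the diff is the `**` at the call site; the method signatures are unchanged.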
data/lib/searchyj/cli.rb
CHANGED
@@ -4,66 +4,78 @@ require 'json'
 
 module SearchYJ
   class CLI < Thor
+    class_option \
+      :from,
+      type: :numeric,
+      default: 1,
+      aliases: '-f',
+      desc: \
+        'The searching process starts ' \
+        'from this number of the search ranking.'
+
     desc 'detect',
-      "
-      'meet the conditions of a regexp and a key.'
+      "Get the record matched first with the option value."
     option :regexp,
       type: :string,
       required: true,
       aliases: '-r',
-      desc: 'Regexp
+      desc: 'Regexp to extract from the search result.'
     option :key,
       type: :string,
       default: 'title',
       aliases: '-k',
       desc: \
-
-
+        "The key name of the matching target\n" \
+        "This option receives any of 'title' or 'uri'."
     def detect(term)
-
-
+      opt = symbolized_options
+      key = opt.delete(:key)
+      regexp = Regexp.new(opt.delete(:regexp))
 
       puts JSON.dump(
-        SearchYJ.detect(term, regexp, key)
+        SearchYJ.detect(term, regexp, key, opt)
       )
     end
 
     desc 'list',
-      '
+      'Print the search result.'
     option :size,
       type: :numeric,
       default: 10,
       aliases: '-s',
       desc: 'The size of the returner'
-    option :from,
-      type: :numeric,
-      default: 1,
-      aliases: '-f',
-      desc: \
-        'Start to search from this number of the search ranking'
     def list(term)
-
-
+      opt = symbolized_options
+      size = opt.delete(:size)
 
       puts JSON.dump(
-        SearchYJ.list(term, size,
+        SearchYJ.list(term, size, opt)
       )
     end
 
     desc 'rank',
-      "
-
+      "Print a particular record extracted " \
+      "from the search result by the number of rank order."
     option :rank,
       type: :numeric,
-
+      default: 1,
       aliases: '-r',
       desc: 'The rank order in the search ranking'
     def rank(term)
-
+      opt = symbolized_options
+      rank = opt.delete(:rank)
 
       puts JSON.dump(
-        SearchYJ.rank(term, rank)
+        SearchYJ.rank(term, rank, opt)
       )
     end
+
+    private
+
+    def symbolized_options
+      options.map do |key, value|
+        [key.to_sym, value]
+      end.to_h
+    end
   end
 end
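Each command in the new CLI code converts the parsed options to symbol keys, removes the options it handles itself with `delete`, and passes whatever remains (such as `from`) down as extra options. A small sketch of that split, assuming the parsed options behave like a Hash with string keys (the `options` value below is a hand-written stand-in, not a real Thor object):

```ruby
# Hand-written stand-in for the parsed CLI options (assumed string keys).
options = { 'regexp' => 'ruby', 'key' => 'title', 'from' => 11 }

# Same conversion as `symbolized_options` in the diff.
opt = options.map { |key, value| [key.to_sym, value] }.to_h

# Pull out the options this command consumes directly...
key = opt.delete(:key)
regexp = Regexp.new(opt.delete(:regexp))

# ...and whatever is left is forwarded as extra keyword options.
puts "key=#{key.inspect} regexp=#{regexp.inspect} rest=#{opt.inspect}"
```

On Ruby 2.5 and later, `options.transform_keys(&:to_sym)` performs the same key conversion in one call.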
data/lib/searchyj/main.rb
CHANGED
@@ -11,10 +11,10 @@ module SearchYJ
     # @return [Hash]
     # A result record if matched the arguments
     # Else nil
-    def detect(term, regexp, key = :title)
+    def detect(term, regexp, key = :title, **args)
       key = key.to_sym unless key.is_a?(Symbol)
 
-      searcher = Searcher.new
+      searcher = Searcher.new(args)
       searcher.uri.search_term = term
       searcher.pager.size = 100
 
@@ -30,15 +30,13 @@ module SearchYJ
     # Get records of the search result.
     # @param term [String] Search term
     # @param size [Integer] The size of the returner
-    # @param
-    # Start to search from this number of the search ranking
+    # @param args [Hash]
     #
     # @return [Array]
     # Includes the result records
-    def list(term, size = 10,
-      searcher = Searcher.new
+    def list(term, size = 10, **args)
+      searcher = Searcher.new(args)
       searcher.uri.search_term = term
-      searcher.uri.index = start_index
       searcher.pager.size = size
       list = []
 
@@ -58,8 +56,9 @@ module SearchYJ
     # @return [Hash]
     # A result record if matched the arguments
     # Else nil
-    def rank(term, rank)
-
+    def rank(term, rank, **args)
+      args[:from] = rank
+      result = list(term, 1, args)
       (result.size > 0) ? result[0] : nil
     end
   end
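In the new `rank`, the requested rank becomes the `from` option and `list` is asked for a single record, so a rank lookup is a one-item listing that starts at that position. A self-contained sketch of the same idea over an in-memory array (the `RESULTS` data and these top-level `list`/`rank` helpers are hypothetical stand-ins; nothing here performs a real search):

```ruby
# Hypothetical in-memory "search results", ranked 1..30.
RESULTS = (1..30).map { |i| { rank: i, title: "Result #{i}" } }.freeze

# Return up to `size` records, starting at the 1-based rank `from`.
def list(size = 10, from: 1)
  RESULTS.drop(from - 1).first(size)
end

# Look up one record by rank: a one-item list starting at that rank.
def rank(rank, **args)
  args[:from] = rank
  result = list(1, **args)
  result.empty? ? nil : result[0]
end

third = rank(3)
missing = rank(31)
puts third.inspect
puts missing.inspect
```

Returning `nil` when the list is empty matches the documented behaviour of printing _null_ when no record is found.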
data/lib/searchyj/searcher.rb
CHANGED
@@ -19,8 +19,6 @@ module SearchYJ
       :limit_loop, :user_agent, :sleep_time
 
     ENCODING = 'UTF-8'
-    LIMIT_LOOP = 50
-    SLEEP_TIME = 1
     USER_AGENT = \
       'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0)' \
       'Gecko/20100101 Firefox/38.0'
@@ -28,13 +26,29 @@ module SearchYJ
     OpenUriError = Class.new(StandardError)
 
     # Initialize myself.
-
+    # @param encoding [String]
+    # The character encoding that is used to parse HTML
+    # @param from [Integer]
+    # Start to search from this number of the search ranking
+    # @param sleep_time [Integer]
+    # The time of sleep after fetching from internet
+    # @param limit_loop [Integer]
+    # The number of limit that is connectable in one process
+    # @param user_agent [String]
+    # Specify the user agent when open uri
+    def initialize(
+      encoding: ENCODING,
+      from: 1,
+      sleep_time: 1,
+      limit_loop: 50,
+      user_agent: USER_AGENT)
       @pager = PageSizeAdjuster.new
       @uri = UriManager.new
-      @
-      @
-      @
-      @
+      @uri.index = from
+      @encoding = encoding
+      @limit_loop = limit_loop
+      @sleep_time = sleep_time
+      @user_agent = user_agent
     end
 
     def run(&block)
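The `Searcher` change replaces the `LIMIT_LOOP` and `SLEEP_TIME` constants with keyword arguments that carry the same defaults, so callers can override individual settings per instance. A minimal sketch of that pattern with a hypothetical `FakeSearcher` class (the user-agent string is a placeholder, and no network access is involved):

```ruby
# Hypothetical class illustrating keyword-argument defaults; it is not the
# gem's Searcher and does not fetch anything.
class FakeSearcher
  DEFAULT_USER_AGENT = 'ExampleAgent/1.0' # placeholder value

  attr_reader :from, :sleep_time, :limit_loop, :user_agent

  def initialize(from: 1, sleep_time: 1, limit_loop: 50,
                 user_agent: DEFAULT_USER_AGENT)
    @from = from
    @sleep_time = sleep_time
    @limit_loop = limit_loop
    @user_agent = user_agent
  end
end

# Override only what differs from the defaults.
searcher = FakeSearcher.new(from: 11, sleep_time: 2)

# An options Hash can be expanded into keywords with a double splat.
options = { limit_loop: 5 }
throttled = FakeSearcher.new(**options)

# A misspelled keyword is rejected instead of being silently ignored.
begin
  FakeSearcher.new(form: 11)
rescue ArgumentError => e
  typo_error = e.message
end
puts typo_error
```

One consequence of this design is that every key forwarded from the CLI must be a keyword the initializer knows, which is why each command removes its own options before passing the rest along.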
data/lib/searchyj/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: searchyj
 version: !ruby/object:Gem::Version
-  version: 0.6.
+  version: 0.6.1
 platform: ruby
 authors:
 - indeep-xyz
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2015-
+date: 2015-08-16 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: nokogiri