searchyj 0.6.0 → 0.6.1
- checksums.yaml +4 -4
- data/README.ja.md +18 -23
- data/README.md +30 -37
- data/TODO.md +3 -7
- data/lib/searchyj.rb +6 -6
- data/lib/searchyj/cli.rb +35 -23
- data/lib/searchyj/main.rb +8 -9
- data/lib/searchyj/searcher.rb +21 -7
- data/lib/searchyj/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a23769233d6904d18accdf9c7f560d750f6fd276
+  data.tar.gz: 65ce5afcec37ed5d17ba548c53c63b51c3b2a85b
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 76169acf83749d560ec56b4ab5043d6888aea93c6e88ab238a279972c505c1c7eb8b3a25c293524aa92139cd621f33c2e10e534327a2997b9e30ea8181535086
+  data.tar.gz: e0315aadfb6c7954b04ab5ac08dc2a9484718cca25435db21b1ae0b58253d5e660c7186d091503f37ee13163e9b496557662f861a582725af354c5ec701dbfda
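The checksums.yaml entries above are hex digests of the two archives packed inside the .gem file. The following is a minimal sketch of how such digests could be compared, assuming the archive bytes are already in memory. The helper name `digest_matches?` is hypothetical, and the payload is a stand-in string rather than the real `metadata.gz` or `data.tar.gz`.

```ruby
require 'digest/sha1'
require 'digest/sha2'

# Hypothetical helper: compare the SHA-1 and SHA-512 hex digests of
# +bytes+ against expected values such as those in checksums.yaml.
def digest_matches?(bytes, sha1:, sha512:)
  Digest::SHA1.hexdigest(bytes) == sha1 &&
    Digest::SHA512.hexdigest(bytes) == sha512
end

# Stand-in payload; a real check would read the archive from disk.
payload = 'example archive bytes'
expected = {
  sha1: Digest::SHA1.hexdigest(payload),
  sha512: Digest::SHA512.hexdigest(payload)
}

puts digest_matches?(payload, **expected)              # unchanged payload
puts digest_matches?(payload + 'tampered', **expected) # modified payload
```

A SHA-1 hex digest is 40 characters long and a SHA-512 hex digest is 128, which matches the lengths of the values in the hunk.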
data/README.ja.md
CHANGED
@@ -1,5 +1,4 @@
-SearchYJ
-====
+# SearchYJ

 Search on Yahoo Japan.

@@ -7,9 +6,9 @@ Search on Yahoo Japan.

 Gemfile に次の一行を書き加えてください。

-
+~~~ruby
 gem 'searchyj'
-
+~~~

 そして以下を実行。

@@ -24,19 +23,12 @@ gem 'searchyj'
 出力結果は基本的に JSON 形式で、以下のパラメータを持ちます。

 - uri
-  - ウェブサイトの URI。
-
-
 - title
-  -
-  - 検索結果のタイトルをそのまま使っているため、省略形の場合があります。
-
-
+  - 検索結果のタイトルをそのまま収集するため、省略されている場合があります。
 - rank
   - 検索順位。
   - SearchYJ では検索に混じる広告を排除しながらレコードを拾っています。この機能が貧弱なため、多少値がずれることがあります。

-
 ### list

 検索結果を指定の個数揃えて出力します。
@@ -51,12 +43,11 @@ gem 'searchyj'

 一度の検索でこの値に満たなかった場合、この値に届くまで検索を繰り返します。値に到達するより先に検索が最後まで到達した場合には、それまでに集めた検索結果を返します。

-
+省略時の値は 10 です。

 #### --from, -f

-
-
+指定の順位から検索を開始します。

 ### detect

@@ -64,19 +55,23 @@ gem 'searchyj'

     $ searchyj detect [options] <SearchTerm>

-検索にひとつも引っかからない場合は文字列
+検索にひとつも引っかからない場合は文字列 _null_ が出力されます。

 #### --regexp, -r

-
+マッチさせたい正規表現用の文字列です。マッチさせるパラメータの種類は _--key_ オプションに依存しています。

 必須です。

 #### --key, -k

-比較対象のキー名です。
+比較対象のキー名です。 _title_ か _uri_ を渡すことができます。
+
+省略時の値は _title_ です。

-
+#### --from, -f
+
+指定の順位から検索を開始します。

 ### rank

@@ -84,17 +79,17 @@ gem 'searchyj'

     $ searchyj rank [options] <SearchTerm>

-検索にひとつも引っかからない場合は文字列
+検索にひとつも引っかからない場合は文字列 _null_ が出力されます。

 #### --rank, -r

-
+出力したいレコードの順位です。

-
+省略時の値は 1 です。

 ## Usage (Programming)

-
+_lib/searchyj.rb_ やその他を読んでください。

 ## Author

data/README.md
CHANGED
@@ -1,15 +1,15 @@
-SearchYJ
-====
+# SearchYJ

 Search on Yahoo Japan.

-
+Installation
+----

 Add this line to your application's Gemfile:

-
+~~~ruby
 gem 'searchyj'
-
+~~~

 And then execute:

@@ -21,85 +21,78 @@ Or install it yourself as:

 ## Usage (CLI)

-The
-
-It has parameters the followings.
+The format of the search result is JSON which includes parameters the followings.

 - uri
-  - The URI of the web site。
-
-
 - title
-  - The title of the
-  - The title might be abbreviation.
-
-
+  - The title might be abbreviated because its source string is from the list of the search result.
 - rank
-  - The rank order in the search
-  - This number might have a slight error.
-
+  - The number of the rank order in the search result.
+  - This number might have a slight error. The records collected by SearchYJ is excepted some advertisements from the search result. The feature of exception is rough yet.

 ### list

-Print the search
+Print the search result.

     $ searchyj list [options] <SearchTerm>

-If the
+If the number of the result is none, SearchYJ print an empty array in the form of string.

 #### --size, -s

-
+This number is the size of the result records.

-
+SearchYJ continues searching and to collecting records until the sum of the records reach the size value.

-If
+If searching is over or become unable to get new records, before reach the size value, SearchYJ print the collected records at that time.

 The default value is 10.

-
 #### --from, -f

-
-
+The searching process starts from this number of the search ranking.

 ### detect

-
+Get the record matched first with the option value.

     $ searchyj detect [options] <SearchTerm>

-If the
+If the matching record is none, SearchYJ print a _null_ in the form of string.

 #### --regexp, -r

-
+This value is a regular expression to extract from the search result. The matching target in the parameters is depended on the value of _--key_ option.

-The
+The value is required.

 #### --key, -k

-
+This value is the name of the matching target which a key name of parameters in records. This option receives any of _title_ or _uri_.
+
+The default value is _title_.
+
+#### --from, -f

-The
+The searching process starts from this number of the search ranking.

 ### rank

-Print a record
+Print a particular record extracted from the search result by the number of rank order.

     $ searchyj rank [options] <SearchTerm>

-If the
+If the matching record is none, SearchYJ print a _null_ in the form of string.

 #### --rank, -r

-
+This value is the number of rank order in the search result.

-The
+The default value is _1_.

 ## Usage (Programming)

-Please read
+Please read _lib/searchyj.rb_ and others.

 ## Author

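The README text in this hunk describes three output shapes: a record with `uri`, `title` and `rank`, the string `null` when nothing matches, and an empty array when `list` finds nothing. A small sketch with Ruby's standard `json` library shows how those shapes serialize. The record values are made up for illustration.

```ruby
require 'json'

# Made-up record carrying the three keys the README lists.
record = { uri: 'http://example.com/', title: 'Example', rank: 1 }

puts JSON.dump(record) # one record, the shape `detect` and `rank` print
puts JSON.dump(nil)    # no matching record
puts JSON.dump([])     # `list` with no results
```

Because the output is plain JSON, a caller can presumably pipe it into any JSON parser and test for `null` or an empty array before reading the keys.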
data/TODO.md
CHANGED
data/lib/searchyj.rb
CHANGED
@@ -5,18 +5,18 @@ require "searchyj/cli"
 module SearchYJ
   module_function

-  def list(term, size = 10,
+  def list(term, size = 10, **args)
     mgr = SearchYJ::Main.new
-    mgr.list(term, size,
+    mgr.list(term, size, args)
   end

-  def rank(term, rank)
+  def rank(term, rank, **args)
     mgr = SearchYJ::Main.new
-    mgr.rank(term, rank)
+    mgr.rank(term, rank, args)
   end

-  def detect(term, regexp, key = :title)
+  def detect(term, regexp, key = :title, **args)
     mgr = SearchYJ::Main.new
-    mgr.detect(term, regexp, key)
+    mgr.detect(term, regexp, key, args)
   end
 end
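This hunk adds a `**args` parameter to each module function and passes the collected hash on as a trailing positional argument. On the Ruby 2.x releases current in 2015, a trailing hash was converted to keywords implicitly. Since Ruby 3.0 that conversion no longer happens, so the same call raises `ArgumentError` and forwarding needs an explicit double splat. The sketch below illustrates the explicit form; `Inner` and `Outer` are hypothetical stand-ins, not SearchYJ classes.

```ruby
# Hypothetical stand-in for the class that receives the options.
class Inner
  # Accepts keyword options in the same style as the 0.6.1 methods.
  def list(term, size = 10, **args)
    { term: term, size: size, from: args.fetch(:from, 1) }
  end
end

# Hypothetical stand-in for the module-level entry point.
module Outer
  module_function

  # Forward the collected keywords with an explicit double splat,
  # which behaves the same on Ruby 2.x and Ruby 3.x.
  def list(term, size = 10, **args)
    Inner.new.list(term, size, **args)
  end
end

p Outer.list('ruby', 5, from: 11)
p Outer.list('ruby')
```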
data/lib/searchyj/cli.rb
CHANGED
@@ -4,66 +4,78 @@ require 'json'

 module SearchYJ
   class CLI < Thor
+    class_option \
+      :from,
+      type: :numeric,
+      default: 1,
+      aliases: '-f',
+      desc: \
+        'The searching process starts ' \
+        'from this number of the search ranking.'
+
     desc 'detect',
-      "
-      'meet the conditions of a regexp and a key.'
+      "Get the record matched first with the option value."
     option :regexp,
       type: :string,
       required: true,
       aliases: '-r',
-      desc: 'Regexp
+      desc: 'Regexp to extract from the search result.'
     option :key,
       type: :string,
       default: 'title',
       aliases: '-k',
       desc: \
-
-
+        "The key name of the matching target\n" \
+        "This option receives any of 'title' or 'uri'."
     def detect(term)
-
-
+      opt = symbolized_options
+      key = opt.delete(:key)
+      regexp = Regexp.new(opt.delete(:regexp))

       puts JSON.dump(
-        SearchYJ.detect(term, regexp, key)
+        SearchYJ.detect(term, regexp, key, opt)
       )
     end

     desc 'list',
-      '
+      'Print the search result.'
     option :size,
       type: :numeric,
       default: 10,
       aliases: '-s',
       desc: 'The size of the returner'
-    option :from,
-      type: :numeric,
-      default: 1,
-      aliases: '-f',
-      desc: \
-        'Start to search from this number of the search ranking'
     def list(term)
-
-
+      opt = symbolized_options
+      size = opt.delete(:size)

       puts JSON.dump(
-        SearchYJ.list(term, size,
+        SearchYJ.list(term, size, opt)
       )
     end

     desc 'rank',
-      "
-
+      "Print a particular record extracted " \
+      "from the search result by the number of rank order."
     option :rank,
       type: :numeric,
-
+      default: 1,
       aliases: '-r',
       desc: 'The rank order in the search ranking'
     def rank(term)
-
+      opt = symbolized_options
+      rank = opt.delete(:rank)

       puts JSON.dump(
-        SearchYJ.rank(term, rank)
+        SearchYJ.rank(term, rank, opt)
       )
     end
+
+    private
+
+    def symbolized_options
+      options.map do |key, value|
+        [key.to_sym, value]
+      end.to_h
+    end
   end
 end
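The new `symbolized_options` helper converts the option keys to symbols, and each command then deletes its own options before forwarding the rest. A minimal sketch of that flow follows, with a plain Hash standing in for Thor's options object (which stores its keys as strings); the sample values are made up.

```ruby
# A plain Hash stands in for the options object Thor provides.
options = { 'from' => 1, 'key' => 'title', 'regexp' => '^Ruby' }

# Same conversion as symbolized_options in the hunk.
opt = options.map { |key, value| [key.to_sym, value] }.to_h

# Pull out the command-specific values; whatever remains is forwarded.
key = opt.delete(:key)
regexp = Regexp.new(opt.delete(:regexp))

p key    # the comparison key
p regexp # the compiled pattern
p opt    # the remaining options, here only :from
```

Deleting the command-specific keys first presumably matters because the remaining hash ends up as keyword arguments further down, where an unexpected key would be rejected.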
data/lib/searchyj/main.rb
CHANGED
@@ -11,10 +11,10 @@ module SearchYJ
     # @return [Hash]
     # A result record if matched the arguments
     # Else nil
-    def detect(term, regexp, key = :title)
+    def detect(term, regexp, key = :title, **args)
       key = key.to_sym unless key.is_a?(Symbol)

-      searcher = Searcher.new
+      searcher = Searcher.new(args)
       searcher.uri.search_term = term
       searcher.pager.size = 100

@@ -30,15 +30,13 @@ module SearchYJ
     # Get records of the search result.
     # @param term [String] Search term
     # @param size [Integer] The size of the returner
-    # @param
-    # Start to search from this number of the search ranking
+    # @param args [Hash]
     #
     # @return [Array]
     # Includes the result records
-    def list(term, size = 10,
-      searcher = Searcher.new
+    def list(term, size = 10, **args)
+      searcher = Searcher.new(args)
       searcher.uri.search_term = term
-      searcher.uri.index = start_index
       searcher.pager.size = size
       list = []

@@ -58,8 +56,9 @@ module SearchYJ
     # @return [Hash]
     # A result record if matched the arguments
     # Else nil
-    def rank(term, rank)
-
+    def rank(term, rank, **args)
+      args[:from] = rank
+      result = list(term, 1, args)
       (result.size > 0) ? result[0] : nil
     end
   end
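In this version `rank` is expressed through `list`: it sets `from` to the requested rank, asks for a single record, and returns that record or `nil`. The sketch below reproduces that shape against an in-memory array instead of live search results. `RESULTS` and the top-level `list` are hypothetical stand-ins, and the forwarding uses an explicit double splat so that it also runs on Ruby 3.

```ruby
# Made-up ranking data standing in for live search results.
RESULTS = %w[alpha beta gamma].each_with_index.map { |title, i| { title: title, rank: i + 1 } }

# Return up to +size+ records starting at the 1-based rank +from+.
def list(size = 10, from: 1)
  RESULTS.drop(from - 1).first(size)
end

# Same shape as Main#rank in the hunk: request one record at +rank+.
def rank(rank, **args)
  args[:from] = rank
  result = list(1, **args)
  result.empty? ? nil : result[0]
end

p rank(2) # the record ranked second
p rank(9) # beyond the end of the data, so nil
```

Note that any `from` value supplied by the caller is overwritten by the rank, which is also what the assignment in the hunk does.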
data/lib/searchyj/searcher.rb
CHANGED
@@ -19,8 +19,6 @@ module SearchYJ
       :limit_loop, :user_agent, :sleep_time

     ENCODING = 'UTF-8'
-    LIMIT_LOOP = 50
-    SLEEP_TIME = 1
     USER_AGENT = \
       'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0)' \
       'Gecko/20100101 Firefox/38.0'
@@ -28,13 +26,29 @@ module SearchYJ
     OpenUriError = Class.new(StandardError)

     # Initialize myself.
-
+    # @param encoding [String]
+    # The character encoding that is used to parse HTML
+    # @param from [Integer]
+    # Start to search from this number of the search ranking
+    # @param sleep_time [Integer]
+    # The time of sleep after fetching from internet
+    # @param limit_loop [Integer]
+    # The number of limit that is connectable in one process
+    # @param user_agent [String]
+    # Specify the user agent when open uri
+    def initialize(
+      encoding: ENCODING,
+      from: 1,
+      sleep_time: 1,
+      limit_loop: 50,
+      user_agent: USER_AGENT)
       @pager = PageSizeAdjuster.new
       @uri = UriManager.new
-      @
-      @
-      @
-      @
+      @uri.index = from
+      @encoding = encoding
+      @limit_loop = limit_loop
+      @sleep_time = sleep_time
+      @user_agent = user_agent
     end

     def run(&block)
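The initializer in this hunk replaces the former `LIMIT_LOOP` and `SLEEP_TIME` constants with keyword arguments that carry the same defaults. A minimal sketch of that pattern follows. `SketchSearcher` and `DEFAULT_AGENT` are hypothetical names, and the class keeps only the settings, not the pager or URI objects.

```ruby
# Hypothetical stand-in for a class configured through keyword arguments.
class SketchSearcher
  DEFAULT_AGENT = 'ExampleAgent/1.0' # placeholder value

  attr_reader :from, :sleep_time, :limit_loop, :user_agent

  def initialize(from: 1, sleep_time: 1, limit_loop: 50,
                 user_agent: DEFAULT_AGENT)
    @from = from
    @sleep_time = sleep_time
    @limit_loop = limit_loop
    @user_agent = user_agent
  end
end

default = SketchSearcher.new
custom = SketchSearcher.new(from: 21, sleep_time: 0)

p [default.from, default.limit_loop]
p [custom.from, custom.sleep_time]

# Unknown keywords are rejected rather than silently ignored.
begin
  SketchSearcher.new(size: 10)
rescue ArgumentError => e
  puts e.message
end
```

Since unknown keywords raise `ArgumentError`, a caller forwarding an options hash into such an initializer has to strip unrelated keys first.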
data/lib/searchyj/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: searchyj
 version: !ruby/object:Gem::Version
-  version: 0.6.
+  version: 0.6.1
 platform: ruby
 authors:
 - indeep-xyz
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2015-
+date: 2015-08-16 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: nokogiri