bookmeter_scraper 0.1.0
- checksums.yaml +7 -0
- data/.gitignore +10 -0
- data/.rspec +2 -0
- data/.travis.yml +10 -0
- data/Gemfile +4 -0
- data/LICENSE.txt +21 -0
- data/README.ja.md +157 -0
- data/README.md +163 -0
- data/Rakefile +5 -0
- data/bin/console +14 -0
- data/bin/setup +7 -0
- data/bookmeter_scraper.gemspec +28 -0
- data/exe/bookmeter_scraper +3 -0
- data/lib/bookmeter_scraper.rb +2 -0
- data/lib/bookmeter_scraper/bookmeter.rb +414 -0
- data/lib/bookmeter_scraper/version.rb +3 -0
- metadata +144 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: 6c328c66bbd91ea36ee0471c01f4e16a69ffd347
+  data.tar.gz: d1d5fde5a9d223c0aada00c670d1feed0b5e9c3a
+SHA512:
+  metadata.gz: 9a2c3a6149faa92850aca03c455bd5ccd95f8eddaee005e4befd4daa328c340e171df5018803f9feb0ebb0f7fdf630f91d9b9259c52fb358f5b3ccf5570595e7
+  data.tar.gz: dcbf2db1efa63928b1c00a00d35af13ffba46345578c84b40b26a9b320ba69acd899c66c9319af5e15894a3d4e5ccc5fadbf64136fbbcbea804a0097f0729251
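The digests in `checksums.yaml` can be compared against the files inside an unpacked package. The following is only a minimal sketch: it assumes `metadata.gz` or `data.tar.gz` has already been extracted to disk, and the helper name `digest_matches?` is hypothetical, not part of RubyGems.

```ruby
require 'digest'

# Hypothetical helper (not part of RubyGems): returns true when the
# SHA-512 hex digest of the file at `path` equals `expected`.
def digest_matches?(path, expected)
  Digest::SHA512.file(path).hexdigest == expected
end
```

A call such as `digest_matches?('data.tar.gz', expected_hex)` would then be expected to return `true` for an untampered archive, with `expected_hex` taken from the SHA512 entry of the checksum file.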
data/.gitignore
ADDED
data/.rspec
ADDED
data/.travis.yml
ADDED
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
+The MIT License (MIT)
+
+Copyright (c) 2016 Kohei Yamamoto
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
data/README.ja.md
ADDED
@@ -0,0 +1,157 @@
+# Bookmeter Scraper [![Build Status](https://travis-ci.org/kymmt90/bookmeter_scraper.svg?branch=master)](https://travis-ci.org/kymmt90/bookmeter_scraper)
+
+A gem that scrapes [Bookmeter](http://bookmeter.com) (読書メーター) so that its information can be used from Ruby.
+
+- Book information
+- Read books
+- Books being read
+- Tsundoku (stockpiled books)
+- Wish list
+- Followings / followers
+- User profiles
+
+All of the items above can be retrieved.
+
+## Caution
+
+Keep the scraping frequency within reasonable limits. Deliberately placing a significant load on Bookmeter's servers is prohibited by Article 9 of the terms of service.
+
+- [Terms of Service - Bookmeter](http://bookmeter.com/terms.php)
+
+## Usage
+
+Add the following line before using this gem:
+
+```ruby
+require 'bookmeter_scraper'
+```
+
+### Log in
+
+To retrieve book information and following / follower information, you need to log in first with `Bookmeter.log_in`.
+
+```ruby
+bookmeter = BookmeterScraper::Bookmeter.log_in('example@example.com', 'password')
+bookmeter.logged_in? # true
+```
+
+You can also log in with `Bookmeter#log_in`.
+
+```ruby
+bookmeter = BookmeterScraper::Bookmeter.new
+bookmeter.log_in('example@example.com', 'password')
+```
+
+### Getting book information
+
+The following book information is available:
+
+- Read books
+- Books being read
+- Tsundoku (stockpiled books)
+- Wish list
+
+You need to log in beforehand to retrieve it.
+
+#### Read books
+
+`Bookmeter#read_books` returns the read books.
+
+```ruby
+books = bookmeter.read_books # get the read books of the logged-in user
+bookmeter.read_books('01010101') # get the read books of another user, specified by ID
+```
+
+Book information is returned as an array of `Struct`s, each with the title `name` and an array `read_dates` of finish dates (both the first finish date and any reread dates) as attributes.
+
+```ruby
+books[0].name
+books[0].read_dates
+```
+
+In addition, `Bookmeter#read_books_in` returns the books read in a specific year and month.
+
+```ruby
+books = bookmeter.read_books_in(2016, 1) # get the books the logged-in user read in January 2016
+books = bookmeter.read_books_in(2016, 1, '01010101') # get the books another user, specified by ID, read in January 2016
+```
+
+#### Reading books / Tsundoku / Wish list
+
+Book information other than read books, namely
+
+- Books being read
+- Tsundoku (stockpiled books)
+- Wish list
+
+can be retrieved with the following methods, respectively:
+
+- `Bookmeter#reading_books`
+- `Bookmeter#tsundoku`
+- `Bookmeter#wish_list`
+
+For example:
+
+```ruby
+books = bookmeter.reading_books # get the books the logged-in user is reading
+books[0].name
+books[0].read_dates # the array of finish dates is empty
+
+bookmeter.tsundoku # get the tsundoku of the logged-in user
+bookmeter.wish_list # get the wish list of the logged-in user
+```
+
+### Getting following / follower information
+
+`Bookmeter#followings` and `Bookmeter#followers` return information on the followings / followers that the logged-in user can view. You need to log in beforehand.
+
+```ruby
+following_users = bookmeter.followings # get information on the followings
+followers = bookmeter.followers # get information on the followers
+```
+
+User information is returned as an array of `Struct`s, each with the user name `name` and the user ID `id`.
+
+```ruby
+following_users[0].name
+following_users[0].id
+followers[0].name
+followers[0].id
+```
+
+#### Notice
+
+**Paginated followings / followers pages are not supported yet.**
+
+### Getting a user profile
+
+`Bookmeter#profile` returns a user's profile. Profiles can be viewed without logging in, so no login is required.
+
+```ruby
+bookmeter = BookmeterScraper::Bookmeter.new
+user_id = '000000'
+profile = bookmeter.profile(user_id) # a profile can be fetched for any user ID
+```
+
+Profile information is returned as a `Struct` with the following attributes. Attributes that are not set in the profile are `nil`.
+
+```ruby
+profile.name # user name
+profile.gender # gender
+profile.age # age
+profile.blood_type # blood type
+profile.job # occupation
+profile.address # current address
+profile.url # URL / blog
+profile.description # self-introduction
+profile.first_day # first recorded day
+profile.elapsed_days # elapsed days
+profile.read_books_count # number of books read
+profile.read_pages_count # number of pages read
+profile.reviews_count # number of reviews
+profile.bookshelfs_count # number of bookshelves
+```
+
+## License
+
+[MIT License](http://opensource.org/licenses/MIT)
data/README.md
ADDED
@@ -0,0 +1,163 @@
+# Bookmeter Scraper [![Build Status](https://travis-ci.org/kymmt90/bookmeter_scraper.svg?branch=master)](https://travis-ci.org/kymmt90/bookmeter_scraper)
+
+A library for scraping [Bookmeter](http://bookmeter.com).
+
+The Japanese README is [here](https://github.com/kymmt90/bookmeter_scraper/blob/master/README.ja.md).
+
+
+## Installation
+
+Add this line to your application's Gemfile:
+
+```ruby
+gem 'bookmeter_scraper'
+```
+
+And then execute:
+
+    $ bundle
+
+Or install it yourself as:
+
+    $ gem install bookmeter_scraper
+
+
+## Usage
+
+Add this line to your code before using this library:
+
+```ruby
+require 'bookmeter_scraper'
+```
+
+### Log in
+
+You need to log in to Bookmeter with `Bookmeter.log_in` before getting book and following / follower information:
+
+```ruby
+bookmeter = BookmeterScraper::Bookmeter.log_in('example@example.com', 'password')
+bookmeter.logged_in? # true
+```
+
+`Bookmeter#log_in` is also available:
+
+```ruby
+bookmeter = BookmeterScraper::Bookmeter.new
+bookmeter.log_in('example@example.com', 'password')
+```
+
+### Get book information
+
+You can get the following book information:
+
+- read books
+- reading books
+- tsundoku (stockpile)
+- wish list
+
+You need to log in to Bookmeter in advance to get this information.
+
+#### Read books
+
+You can get read-book information with `Bookmeter#read_books`:
+
+```ruby
+books = bookmeter.read_books # get the read books of the logged-in user
+bookmeter.read_books('01010101') # get the read books of a user specified by ID
+```
+
+Book information is an array of `Struct`s that have `name` and `read_dates` as attributes.
+`read_dates` is an array of the dates on which the book was finished (the first finish date and any reread dates):
+
+```ruby
+books[0].name
+books[0].read_dates
+```
+
+To get the books read in a specific year and month, use `Bookmeter#read_books_in`:
+
+```ruby
+books = bookmeter.read_books_in(2016, 1) # get the books the logged-in user read in 2016-01
+books = bookmeter.read_books_in(2016, 1, '01010101') # get the books a user specified by ID read in 2016-01
+```
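Because the result is a plain array of `Struct`s, ordinary `Enumerable` methods apply to it. The sketch below runs without network access: `Book` is a local stand-in with the `name` and `read_dates` attributes described above, the sample titles and dates are made up, and `read_dates` is assumed to hold `Time` objects with the first finish date first.

```ruby
# Local stand-in for the Struct returned by Bookmeter#read_books
# (assumed shape: name plus an array of Time objects).
Book = Struct.new(:name, :read_dates)

# Made-up sample data for illustration only.
books = [
  Book.new('Book A', [Time.local(2016, 1, 5)]),
  Book.new('Book B', [Time.local(2015, 12, 30), Time.local(2016, 1, 20)]),
  Book.new('Book C', [])
]

# A book with more than one entry in read_dates has been reread.
reread = books.select { |book| book.read_dates.size > 1 }

# Group finished books by the year and month of their first finish date.
by_month = books.reject { |book| book.read_dates.empty? }
                .group_by { |book| book.read_dates.first.strftime('%Y-%m') }
```

With real data, `books` would come from `bookmeter.read_books` instead of the hand-built array.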
+
+#### Reading books / Tsundoku / Wish list
+
+You can get the other book lists with:
+
+- `Bookmeter#reading_books`
+- `Bookmeter#tsundoku`
+- `Bookmeter#wish_list`
+
+```ruby
+books = bookmeter.reading_books
+books[0].name
+books[0].read_dates # this array is empty
+
+bookmeter.tsundoku
+bookmeter.wish_list
+```
+
+### Get following / follower information
+
+You can get information on following users (followings) and followers with `Bookmeter#followings` and `Bookmeter#followers`:
+
+```ruby
+following_users = bookmeter.followings
+followers = bookmeter.followers
+```
+
+You need to log in to Bookmeter in advance to get this information.
+
+User information is an array of `Struct`s that have `name` and `id` as attributes:
+
+```ruby
+following_users[0].name
+following_users[0].id
+followers[0].name
+followers[0].id
+```
+
+#### Notice
+
+**`Bookmeter#followings` and `Bookmeter#followers` do not support paginated followings / followers pages yet.**
+
+### Get user profile
+
+You can get a user profile with `Bookmeter#profile`:
+
+```ruby
+bookmeter = BookmeterScraper::Bookmeter.new
+user_id = '000000'
+profile = bookmeter.profile(user_id) # you can specify an arbitrary user ID
+```
+
+You do not need to log in to get user profiles.
+Profile information is a `Struct` that has these attributes:
+
+```ruby
+profile.name
+profile.gender
+profile.age
+profile.blood_type
+profile.job
+profile.address
+profile.url
+profile.description
+profile.first_day
+profile.elapsed_days
+profile.read_books_count
+profile.read_pages_count
+profile.reviews_count
+profile.bookshelfs_count
+```
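The Japanese README adds that attributes not set in a profile are `nil`, so callers may want to drop the empty ones. A minimal sketch under stated assumptions: `Profile` here is a shortened local stand-in with only four of the attributes listed above, and the sample values (including their being strings) are made up for illustration.

```ruby
# Shortened local stand-in for the profile Struct (assumed subset of attributes).
Profile = Struct.new(:name, :gender, :age, :job)

# Made-up sample: gender and job are not set, so they are nil.
profile = Profile.new('example_user', nil, '30', nil)

# Keep only the attributes that were actually filled in.
filled_in = profile.to_h.reject { |_attribute, value| value.nil? }
```

With a real profile, the same `to_h.reject` call should work unchanged, since `Struct#to_h` is standard Ruby.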
+
+
+## Contributing
+
+Bug reports and pull requests are welcome on GitHub at https://github.com/kymmt90/bookmeter_scraper.
+
+
+## License
+
+The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
data/Rakefile
ADDED
data/bin/console
ADDED
@@ -0,0 +1,14 @@
+#!/usr/bin/env ruby
+
+require "bundler/setup"
+require "bookmeter_scraper"
+
+# You can add fixtures and/or initialization code here to make experimenting
+# with your gem easier. You can also use a different console, if you like.
+
+# (If you use this, don't forget to add pry to your Gemfile!)
+# require "pry"
+# Pry.start
+
+require "irb"
+IRB.start
data/bin/setup
ADDED
data/bookmeter_scraper.gemspec
ADDED
@@ -0,0 +1,28 @@
+lib = File.expand_path('../lib', __FILE__)
+$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+require 'bookmeter_scraper/version'
+
+Gem::Specification.new do |spec|
+  spec.name = "bookmeter_scraper"
+  spec.version = BookmeterScraper::VERSION
+  spec.authors = ["Kohei Yamamoto"]
+  spec.email = ["kymmt90@gmail.com"]
+
+  spec.summary = %q{Bookmeter scraping library}
+  spec.description = %q{Bookmeter scraping library}
+  spec.homepage = "https://github.com/kymmt90/bookmeter_scraper"
+  spec.license = "MIT"
+
+  spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
+  spec.bindir = "exe"
+  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
+  spec.require_paths = ["lib"]
+
+  spec.add_development_dependency "bundler", "~> 1.10"
+  spec.add_development_dependency "rake", "~> 10.0"
+  spec.add_development_dependency "rspec", "~> 3.4"
+  spec.add_development_dependency "webmock", "~> 1.22"
+
+  spec.add_dependency "yasuri", "~> 0.0"
+  spec.add_dependency "mechanize", "~> 2.7"
+end
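The `~>` operator in the dependency lines is RubyGems' pessimistic version constraint: with a two-segment version it allows later releases that keep the first segment. A small sketch that checks the two runtime constraints with `Gem::Requirement`; the concrete version numbers tested are arbitrary examples, not releases known to exist.

```ruby
require 'rubygems'

# Constraints copied from the gemspec's runtime dependencies.
mechanize = Gem::Requirement.new('~> 2.7') # means >= 2.7 and < 3.0
yasuri    = Gem::Requirement.new('~> 0.0') # means >= 0.0 and < 1.0

# Arbitrary example versions, used only to show where the ranges end.
mechanize_patch_ok = mechanize.satisfied_by?(Gem::Version.new('2.7.4'))
mechanize_major_ok = mechanize.satisfied_by?(Gem::Version.new('3.0.0'))
yasuri_minor_ok    = yasuri.satisfied_by?(Gem::Version.new('0.0.5'))
yasuri_major_ok    = yasuri.satisfied_by?(Gem::Version.new('1.0.0'))
```

Under these constraints a new major release of either dependency would not be picked up until the gemspec is updated.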
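data/exe/bookmeter_scraper
ADDED
data/lib/bookmeter_scraper.rb
ADDED
data/lib/bookmeter_scraper/bookmeter.rb
ADDED
@@ -0,0 +1,414 @@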
+require 'mechanize'
+require 'yasuri'
+
+module BookmeterScraper
+  class Bookmeter
+    ROOT_URI = 'http://bookmeter.com'.freeze
+    LOGIN_URI = "#{ROOT_URI}/login".freeze
+
+    PROFILE_ATTRIBUTES = %i(name gender age blood_type job address url description first_day elapsed_days read_books_count read_pages_count reviews_count bookshelfs_count)
+    Profile = Struct.new(*PROFILE_ATTRIBUTES)
+
+    BOOK_ATTRIBUTES = %i(name read_dates)
+    Book = Struct.new(*BOOK_ATTRIBUTES)
+
+    USER_ATTRIBUTES = %i(name id)
+    User = Struct.new(*USER_ATTRIBUTES)
+
+    JP_ATTRIBUTE_NAMES = {
+      gender: '性別',
+      age: '年齢',
+      blood_type: '血液型',
+      job: '職業',
+      address: '現住所',
+      url: 'URL / ブログ',
+      description: '自己紹介',
+      first_day: '記録初日',
+      elapsed_days: '経過日数',
+      read_books_count: '読んだ本',
+      read_pages_count: '読んだページ',
+      reviews_count: '感想/レビュー',
+      bookshelfs_count: '本棚',
+    }
+
+    NUM_BOOKS_PER_PAGE = 40
+    NUM_USERS_PER_PAGE = 20
+
+    attr_reader :log_in_user_id
+
+    def self.mypage_uri(user_id)
+      raise ArgumentError unless user_id =~ /^\d+$/
+      "#{ROOT_URI}/u/#{user_id}"
+    end
+
+    def self.read_books_uri(user_id)
+      raise ArgumentError unless user_id =~ /^\d+$/
+      "#{ROOT_URI}/u/#{user_id}/booklist"
+    end
+
+    def self.reading_books_uri(user_id)
+      raise ArgumentError unless user_id =~ /^\d+$/
+      "#{ROOT_URI}/u/#{user_id}/booklistnow"
+    end
+
+    def self.tsundoku_uri(user_id)
+      raise ArgumentError unless user_id =~ /^\d+$/
+      "#{ROOT_URI}/u/#{user_id}/booklisttun"
+    end
+
+    def self.wish_list_uri(user_id)
+      raise ArgumentError unless user_id =~ /^\d+$/
+      "#{ROOT_URI}/u/#{user_id}/booklistpre"
+    end
+
+    def self.followings_uri(user_id)
+      raise ArgumentError unless user_id =~ /^\d+$/
+      "#{ROOT_URI}/u/#{user_id}/favorite_user"
+    end
+
+    def self.followers_uri(user_id)
+      raise ArgumentError unless user_id =~ /^\d+$/
+      "#{ROOT_URI}/u/#{user_id}/favorited_user"
+    end
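Each URI helper guards `user_id` with `/^\d+$/`. In Ruby, `^` and `$` match at line boundaries rather than at the ends of the string, so a multi-line string that contains one all-digit line passes this check. The sketch below only illustrates the difference; `LOOSE`, `STRICT`, and the sample input are introduced here for the example and are not part of the gem.

```ruby
LOOSE  = /^\d+$/   # the pattern used by the URI helpers
STRICT = /\A\d+\z/ # anchors to the start and end of the whole string

# Made-up input: a digit-only first line followed by other text.
input = "123\n/../admin"

loose_accepts  = LOOSE.match?(input)       # one digit-only line is enough
strict_accepts = STRICT.match?(input)      # the whole string must be digits
plain_id_ok    = STRICT.match?('01010101') # an ordinary ID still passes
```

Whether the stricter anchors matter in practice depends on where `user_id` values come from; for IDs typed in by the caller the two patterns usually agree.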
+
+    def self.log_in(mail, password)
+      Bookmeter.new.tap do |bookmeter|
+        bookmeter.log_in(mail, password)
+      end
+    end
+
+
+    def initialize(agent = nil)
+      @agent = agent.nil? ? Bookmeter.new_agent : agent
+      @logged_in = false
+    end
+
+    def log_in(mail, password)
+      raise BookmeterError if @agent.nil?
+
+      next_page = nil
+      page = @agent.get(LOGIN_URI) do |page|
+        next_page = page.form_with(action: '/login') do |form|
+          form.field_with(name: 'mail').value = mail
+          form.field_with(name: 'password').value = password
+        end.submit
+      end
+      @logged_in = next_page.uri.to_s == ROOT_URI + '/'
+      return unless logged_in?
+
+      mypage = next_page.link_with(text: 'マイページ').click
+      @log_in_user_id = extract_user_id(mypage)
+    end
+
+    def logged_in?
+      @logged_in
+    end
+
+    def profile(user_id)
+      raise ArgumentError unless user_id =~ /^\d+$/
+
+      mypage = @agent.get(Bookmeter.mypage_uri(user_id))
+
+      profile_dl_tags = mypage.search('#side_left > div.inner > div.profile > dl')
+      jp_attribute_names = profile_dl_tags.map { |i| i.children[0].children.text }
+      attribute_values = profile_dl_tags.map { |i| i.children[1].children.text }
+      jp_attributes = Hash[jp_attribute_names.zip(attribute_values)]
+      attributes = PROFILE_ATTRIBUTES.map do |attribute|
+        jp_attributes[JP_ATTRIBUTE_NAMES[attribute]]
+      end
+      attributes[0] = mypage.at_css('#side_left > div.inner > h3').text
+
+      Profile.new(*attributes)
+    end
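`profile` pairs the Japanese labels scraped from the page with their values, then looks each attribute up through `JP_ATTRIBUTE_NAMES`; a label that is missing from the page yields `nil`. The mapping step can be tried offline with the sketch below, which assumes a shortened attribute list, a two-entry label table (`LABELS`), and made-up scraped values.

```ruby
# Shortened stand-ins for PROFILE_ATTRIBUTES / JP_ATTRIBUTE_NAMES.
Profile = Struct.new(:name, :gender, :age)
LABELS  = { gender: '性別', age: '年齢' }.freeze

# Pretend only the age row was present on the page (made-up value).
scraped_labels = ['年齢']
scraped_values = ['30']
by_label = Hash[scraped_labels.zip(scraped_values)]

# Look up every attribute by its Japanese label; missing labels give nil.
values = Profile.members.map { |attribute| by_label[LABELS[attribute]] }

# The name has no label and is filled in separately, as in the method above.
values[0] = 'example_user'

profile = Profile.new(*values)
```

The order of `values` follows `Profile.members`, which is why overwriting index 0 sets `name`.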
+
+    def read_books(user_id = @log_in_user_id)
+      books = get_books(user_id, :read_books_uri)
+      books.each { |b| yield b } if block_given?
+      books
+    end
+
+    def read_books_in(year, month, user_id = @log_in_user_id)
+      date = Time.local(year, month)
+      books = get_read_books(user_id, date)
+      books.each { |b| yield b } if block_given?
+      books
+    end
+
+    def reading_books(user_id = @log_in_user_id)
+      books = get_books(user_id, :reading_books_uri)
+      books.each { |b| yield b } if block_given?
+      books
+    end
+
+    def tsundoku(user_id = @log_in_user_id)
+      books = get_books(user_id, :tsundoku_uri)
+      books.each { |b| yield b } if block_given?
+      books
+    end
+
+    def wish_list(user_id = @log_in_user_id)
+      books = get_books(user_id, :wish_list_uri)
+      books.each { |b| yield b } if block_given?
+      books
+    end
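Each of these book-list methods yields every book to an optional block and still returns the full array. The pattern can be seen in isolation in the sketch below; `Shelf` and `each_book` are hypothetical names used only for illustration, with strings standing in for `Book` structs.

```ruby
# Hypothetical class reproducing the "yield if a block is given,
# always return the array" pattern used by the methods above.
class Shelf
  def initialize(books)
    @books = books
  end

  def each_book
    @books.each { |book| yield book } if block_given?
    @books
  end
end

shelf = Shelf.new(%w[first second])

seen = []
returned = shelf.each_book { |book| seen << book.upcase } # block form
plain    = shelf.each_book                                # no block
```

By analogy, `bookmeter.read_books { |book| puts book.name }` should print each title while still returning the array.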
+
+    def followings(user_id = @log_in_user_id)
+      users = get_followings(user_id)
+    end
+
+    def followers(user_id = @log_in_user_id)
+      users = get_followers(user_id)
+    end
+
+    private
+
+    def self.new_agent
+      agent = Mechanize.new do |a|
+        a.user_agent_alias = 'Mac Safari'
+      end
+    end
+
+    def extract_user_id(page)
+      page.uri.to_s.match(/\/u\/(\d+)$/)[1]
+    end
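`extract_user_id` calls `[1]` on the result of `match`, which raises `NoMethodError` when the URI does not end in `/u/<digits>`. One possible variation that returns `nil` instead is sketched below; `user_id_from` is a hypothetical helper, not a method of the gem, and it takes the URI as a string.

```ruby
# Hypothetical nil-safe variant: String#[] with a regexp and a capture
# index returns the captured group, or nil when nothing matches.
def user_id_from(uri)
  uri[%r{/u/(\d+)\z}, 1]
end

found   = user_id_from('http://bookmeter.com/u/123456')
missing = user_id_from('http://bookmeter.com/')
```

Returning `nil` moves the decision about a missing ID to the caller, which may or may not be preferable to failing early.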
+
+    def get_books(user_id, uri_method)
+      books = []
+      scraped_pages = scrape_book_pages(user_id, uri_method)
+      scraped_pages.each do |page|
+        books << get_book_structs(page)
+        books.flatten!
+      end
+      books
+    end
+
+    def get_read_books(user_id, target_ym)
+      result = []
+      scrape_book_pages(user_id, :read_books_uri).each do |page|
+        first_book_date = get_read_date(page['book_1_link'])
+        last_book_date = get_last_book_date(page)
+
+        first_book_ym = Time.local(first_book_date['year'].to_i, first_book_date['month'].to_i)
+        last_book_ym = Time.local(last_book_date['year'].to_i, last_book_date['month'].to_i)
+
+        if target_ym < last_book_ym
+          next
+        elsif target_ym == first_book_ym && target_ym > last_book_ym
+          result.concat(get_target_books(target_ym, page))
+          break
+        elsif target_ym < first_book_ym && target_ym > last_book_ym
+          result.concat(get_target_books(target_ym, page))
+          break
+        elsif target_ym <= first_book_ym && target_ym >= last_book_ym
+          result.concat(get_target_books(target_ym, page))
+        elsif target_ym > first_book_ym
+          break
+        end
+      end
+      result
+    end
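The branch chain in `get_read_books` decides, for each page of the read-books list, whether to skip it, collect from it, or stop. It compares the target month with the months of the first and last book on the page, and appears to assume the list is ordered newest first. The sketch below restates that decision as a standalone function; `page_action` and its symbols are hypothetical names, and the function is meant to be equivalent to the branches above rather than a replacement taken from the gem.

```ruby
# Hypothetical restatement of the per-page decision in get_read_books.
# first: month of the first book on the page; last: month of the last one.
def page_action(target, first, last)
  if target < last
    :skip                 # everything on this page is newer than the target
  elsif target > first
    :stop                 # everything from here on is older than the target
  elsif target > last
    :collect_and_stop     # the target month ends somewhere on this page
  else
    :collect_and_continue # target == last: the month may spill onto the next page
  end
end

jan = Time.local(2016, 1)
feb = Time.local(2016, 2)
mar = Time.local(2016, 3)
apr = Time.local(2016, 4)

actions = [
  page_action(jan, mar, feb),
  page_action(feb, mar, jan),
  page_action(jan, mar, jan),
  page_action(apr, mar, feb)
]
```

Seen this way, the second and third `elsif` branches of the original share one outcome and differ only in whether the target month equals the month of the page's first book.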
+
+    def get_last_book_date(page)
+      NUM_BOOKS_PER_PAGE.downto(1) do |i|
+        link = page["book_#{i}_link"]
+        next if link.empty?
+        return get_read_date(link)
+      end
+    end
+
+    def get_target_books(target_ym, page)
+      target_books = []
+
+      1.upto(NUM_BOOKS_PER_PAGE) do |i|
+        next if page["book_#{i}_link"].empty?
+
+        read_yms = []
+        read_date = get_read_date(page["book_#{i}_link"])
+        read_dates = [Time.local(read_date['year'], read_date['month'], read_date['day'])]
+        read_yms << Time.local(read_date['year'], read_date['month'])
+
+        reread_dates = []
+        reread_dates << get_reread_date(page["book_#{i}_link"])
+        reread_dates.flatten!
+
+        unless reread_dates.empty?
+          reread_dates.each do |date|
+            read_yms << Time.local(date['reread_year'], date['reread_month'])
+          end
+        end
+
+        next unless read_yms.include?(target_ym)
+
+        unless reread_dates.empty?
+          reread_dates.each do |date|
+            read_dates << Time.local(date['reread_year'], date['reread_month'], date['reread_day'])
+          end
+        end
+        book_name = get_book_name(page["book_#{i}_link"])
+        book = Book.new(book_name, read_dates)
+        target_books << book
+      end
+
+      target_books
+    end
+
+    def scrape_book_pages(user_id, uri_method)
+      raise ArgumentError unless user_id =~ /^\d+$/
+      raise ArgumentError unless Bookmeter.methods.include?(uri_method)
+      return [] unless logged_in?
+
+      books_page = @agent.get(Bookmeter.method(uri_method).call(user_id))
+
+      # if books are not found at all
+      return [] if books_page.search('#main_left > div > center > a').empty?
+
+      if books_page.search('span.now_page').empty?
+        books_root = Yasuri.struct_books '//*[@id="main_left"]/div' do
+          1.upto(NUM_BOOKS_PER_PAGE) do |i|
+            send("text_book_#{i}_name", "//*[@id=\"main_left\"]/div/div[#{i + 1}]/div[2]/a")
+            send("text_book_#{i}_link", "//*[@id=\"main_left\"]/div/div[#{i + 1}]/div[2]/a/@href")
+          end
+        end
+        return [books_root.inject(@agent, books_page)]
+      end
+
+      books_root = Yasuri.pages_root '//span[@class="now_page"]/following-sibling::span[1]/a' do
+        text_page_index '//span[@class="now_page"]/a'
+        1.upto(NUM_BOOKS_PER_PAGE) do |i|
+          send("text_book_#{i}_name", "//*[@id=\"main_left\"]/div/div[#{i + 1}]/div[2]/a")
+          send("text_book_#{i}_link", "//*[@id=\"main_left\"]/div/div[#{i + 1}]/div[2]/a/@href")
+        end
+      end
+      books_root.inject(@agent, books_page)
+    end
+
+    def get_book_name(book_link)
+      @agent.get(ROOT_URI + book_link).search('#title').text
+    end
+
+    def get_read_date(book_link)
+      book_page = @agent.get(ROOT_URI + book_link)
+      book_date = Yasuri.struct_date '//*[@id="book_edit_area"]/form[1]/div[2]' do
+        text_year '//*[@id="read_date_y"]/option[1]', truncate: /\d+/, proc: :to_i
+        text_month '//*[@id="read_date_m"]/option[1]', truncate: /\d+/, proc: :to_i
+        text_day '//*[@id="read_date_d"]/option[1]', truncate: /\d+/, proc: :to_i
+      end
+      book_date.inject(@agent, book_page)
+    end
+
+    def get_reread_date(book_link)
+      book_page = @agent.get(ROOT_URI + book_link)
+      book_reread_date = Yasuri.struct_reread_date '//*[@id="book_edit_area"]/div/form[1]/div[2]' do
+        text_reread_year '//div[@class="reread_box"]/form[1]/div[2]/select[1]/option[1]', truncate: /\d+/, proc: :to_i
+        text_reread_month '//div[@class="reread_box"]/form[1]/div[2]/select[2]/option[1]', truncate: /\d+/, proc: :to_i
+        text_reread_day '//div[@class="reread_box"]/form[1]/div[2]/select[3]/option[1]', truncate: /\d+/, proc: :to_i
+      end
+      book_reread_date.inject(@agent, book_page)
+    end
+
+    def get_book_structs(page)
+      books = []
+
+      1.upto(NUM_BOOKS_PER_PAGE) do |i|
+        break if page["book_#{i}_link"].empty?
+
+        read_dates = []
+        read_date = get_read_date(page["book_#{i}_link"])
+        unless read_date.empty?
+          read_dates << Time.local(read_date['year'], read_date['month'], read_date['day'])
+        end
+
+        reread_dates = []
+        reread_dates << get_reread_date(page["book_#{i}_link"])
+        reread_dates.flatten!
+
+        unless reread_dates.empty?
+          reread_dates.each do |date|
+            read_dates << Time.local(date['reread_year'], date['reread_month'], date['reread_day'])
+          end
+        end
+
+        book_name = get_book_name(page["book_#{i}_link"])
+        book = Book.new(book_name, read_dates)
+        books << book
+      end
+
+      books
+    end
+
+    def get_followings(user_id)
+      users = []
+      scraped_pages = user_id == @log_in_user_id ? scrape_followings_page(user_id)
+                                                 : scrape_others_followings_page(user_id)
+      scraped_pages.each do |page|
+        users << get_user_structs(page)
+        users.flatten!
+      end
+      users
+    end
+
+    def get_followers(user_id)
+      users = []
+      scraped_pages = scrape_followers_page(user_id)
+      scraped_pages.each do |page|
+        users << get_user_structs(page)
+        users.flatten!
+      end
+      users
+    end
+
+    def get_user_structs(page)
+      users = []
+
+      1.upto(NUM_USERS_PER_PAGE) do |i|
+        break if page["user_#{i}_name"].empty?
+
+        user_name = page["user_#{i}_name"]
+        user_id = page["user_#{i}_link"].match(/\/u\/(\d+)$/)[1]
+        user = User.new(user_name, user_id)
+        users << user
+      end
+
+      users
+    end
+
+    def scrape_followings_page(user_id)
+      raise ArgumentError unless user_id =~ /^\d+$/
+      return [] unless logged_in?
+
+      followings_page = @agent.get(Bookmeter.followings_uri(user_id))
+      followings_root = Yasuri.struct_books '//*[@id="main_left"]/div' do
+        1.upto(NUM_USERS_PER_PAGE) do |i|
+          send("text_user_#{i}_name", "//*[@id=\"main_left\"]/div/div[#{i}]/a/@title")
+          send("text_user_#{i}_link", "//*[@id=\"main_left\"]/div/div[#{i}]/a/@href")
+        end
+      end
+      [followings_root.inject(@agent, followings_page)]
+    end
+
+    def scrape_others_followings_page(user_id)
+      scrape_users_listing_page(user_id, :followings_uri)
+    end
+
+    def scrape_followers_page(user_id)
+      scrape_users_listing_page(user_id, :followers_uri)
+    end
+
+    def scrape_users_listing_page(user_id, uri_method)
+      raise ArgumentError unless user_id =~ /^\d+$/
+      raise ArgumentError unless Bookmeter.methods.include?(uri_method)
+      return [] unless logged_in?
+
+      page = @agent.get(Bookmeter.method(uri_method).call(user_id))
+      root = Yasuri.struct_users '//*[@id="main_left"]/div' do
+        1.upto(NUM_USERS_PER_PAGE) do |i|
+          send("text_user_#{i}_name", "//*[@id=\"main_left\"]/div/div[#{i}]/div/div[2]/a/@title")
+          send("text_user_#{i}_link", "//*[@id=\"main_left\"]/div/div[#{i}]/div/div[2]/a/@href")
+        end
+      end
+      [root.inject(@agent, page)]
+    end
+  end
+
+  class BookmeterError < StandardError; end
+end
data/lib/bookmeter_scraper/version.rb
ADDED
metadata
ADDED
@@ -0,0 +1,144 @@
+--- !ruby/object:Gem::Specification
+name: bookmeter_scraper
+version: !ruby/object:Gem::Version
+  version: 0.1.0
+platform: ruby
+authors:
+- Kohei Yamamoto
+autorequire:
+bindir: exe
+cert_chain: []
+date: 2016-02-26 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.10'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.10'
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '10.0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '10.0'
+- !ruby/object:Gem::Dependency
+  name: rspec
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.4'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.4'
+- !ruby/object:Gem::Dependency
+  name: webmock
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.22'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.22'
+- !ruby/object:Gem::Dependency
+  name: yasuri
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '0.0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '0.0'
+- !ruby/object:Gem::Dependency
+  name: mechanize
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '2.7'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '2.7'
+description: Bookmeter scraping library
+email:
+- kymmt90@gmail.com
+executables:
+- bookmeter_scraper
+extensions: []
+extra_rdoc_files: []
+files:
+- ".gitignore"
+- ".rspec"
+- ".travis.yml"
+- Gemfile
+- LICENSE.txt
+- README.ja.md
+- README.md
+- Rakefile
+- bin/console
+- bin/setup
+- bookmeter_scraper.gemspec
+- exe/bookmeter_scraper
+- lib/bookmeter_scraper.rb
+- lib/bookmeter_scraper/bookmeter.rb
+- lib/bookmeter_scraper/version.rb
+homepage: https://github.com/kymmt90/bookmeter_scraper
+licenses:
+- MIT
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubyforge_project:
+rubygems_version: 2.5.1
+signing_key:
+specification_version: 4
+summary: Bookmeter scraping library
+test_files: []