bookmeter_scraper 0.1.0
- checksums.yaml +7 -0
- data/.gitignore +10 -0
- data/.rspec +2 -0
- data/.travis.yml +10 -0
- data/Gemfile +4 -0
- data/LICENSE.txt +21 -0
- data/README.ja.md +157 -0
- data/README.md +163 -0
- data/Rakefile +5 -0
- data/bin/console +14 -0
- data/bin/setup +7 -0
- data/bookmeter_scraper.gemspec +28 -0
- data/exe/bookmeter_scraper +3 -0
- data/lib/bookmeter_scraper.rb +2 -0
- data/lib/bookmeter_scraper/bookmeter.rb +414 -0
- data/lib/bookmeter_scraper/version.rb +3 -0
- metadata +144 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
|
|
1
|
+
---
|
2
|
+
SHA1:
|
3
|
+
metadata.gz: 6c328c66bbd91ea36ee0471c01f4e16a69ffd347
|
4
|
+
data.tar.gz: d1d5fde5a9d223c0aada00c670d1feed0b5e9c3a
|
5
|
+
SHA512:
|
6
|
+
metadata.gz: 9a2c3a6149faa92850aca03c455bd5ccd95f8eddaee005e4befd4daa328c340e171df5018803f9feb0ebb0f7fdf630f91d9b9259c52fb358f5b3ccf5570595e7
|
7
|
+
data.tar.gz: dcbf2db1efa63928b1c00a00d35af13ffba46345578c84b40b26a9b320ba69acd899c66c9319af5e15894a3d4e5ccc5fadbf64136fbbcbea804a0097f0729251
|
data/.gitignore
ADDED
data/.rspec
ADDED
data/.travis.yml
ADDED
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
|
|
1
|
+
The MIT License (MIT)
|
2
|
+
|
3
|
+
Copyright (c) 2016 Kohei Yamamoto
|
4
|
+
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
7
|
+
in the Software without restriction, including without limitation the rights
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
10
|
+
furnished to do so, subject to the following conditions:
|
11
|
+
|
12
|
+
The above copyright notice and this permission notice shall be included in
|
13
|
+
all copies or substantial portions of the Software.
|
14
|
+
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
21
|
+
THE SOFTWARE.
|
data/README.ja.md
ADDED
@@ -0,0 +1,157 @@
|
|
1
|
+
# Bookmeter Scraper [](https://travis-ci.org/kymmt90/bookmeter_scraper)
|
2
|
+
|
3
|
+
[読書メーター](http://bookmeter.com)の情報をスクレイピングして Ruby で扱えるようにするための gem です。
|
4
|
+
|
5
|
+
- 書籍情報
|
6
|
+
- 読んだ本
|
7
|
+
- 読んでる本
|
8
|
+
- 積読本
|
9
|
+
- 読みたい本
|
10
|
+
- お気に入り / お気に入られユーザ
|
11
|
+
- ユーザプロフィール
|
12
|
+
|
13
|
+
を取得可能です。
|
14
|
+
|
15
|
+
## 注意
|
16
|
+
|
17
|
+
スクレイピングの頻度は常識の範囲内にとどめてください。読書メーターのサーバーへ故意に著しい負荷をかける行為は、利用規約の第 9 条で禁止されています。
|
18
|
+
|
19
|
+
- [利用規約 - 読書メーター](http://bookmeter.com/terms.php)
|
20
|
+
|
21
|
+
## 使いかた
|
22
|
+
|
23
|
+
この gem を使うときは以下のコードが必要です。
|
24
|
+
|
25
|
+
```ruby
|
26
|
+
require 'bookmeter_scraper'
|
27
|
+
```
|
28
|
+
|
29
|
+
### ログイン
|
30
|
+
|
31
|
+
書籍情報、お気に入り / お気に入られユーザ情報を取得するには、`Bookmeter.log_in` でログインしておく必要があります。
|
32
|
+
|
33
|
+
```ruby
|
34
|
+
bookmeter = BookmeterScraper::Bookmeter.log_in('example@example.com', 'password')
|
35
|
+
bookmeter.logged_in? # true
|
36
|
+
```
|
37
|
+
|
38
|
+
`Bookmeter#log_in` でもログイン可能です。
|
39
|
+
|
40
|
+
```ruby
|
41
|
+
bookmeter = BookmeterScraper::Bookmeter.new
|
42
|
+
bookmeter.log_in('example@example.com', 'password')
|
43
|
+
```
|
44
|
+
|
45
|
+
### 書籍情報の取得
|
46
|
+
|
47
|
+
以下の書籍情報
|
48
|
+
|
49
|
+
- 読んだ本
|
50
|
+
- 読んでる本
|
51
|
+
- 積読本
|
52
|
+
- 読みたい本
|
53
|
+
|
54
|
+
を取得できます。取得には事前のログインが必要です。
|
55
|
+
|
56
|
+
#### 読んだ本
|
57
|
+
|
58
|
+
`Bookmeter#read_books` で「読んだ本」情報が取得できます。
|
59
|
+
|
60
|
+
```ruby
|
61
|
+
books = bookmeter.read_books # ログインユーザの「読んだ本」を取得
|
62
|
+
bookmeter.read_books('01010101') # 他のユーザの ID を指定して、そのユーザの「読んだ本」を取得
|
63
|
+
```
|
64
|
+
|
65
|
+
書籍情報は書名 `name` と読了日(初読了日と再読日の両方)の配列 `read_dates` を属性として持つ `Struct` の配列として取得できます。
|
66
|
+
|
67
|
+
```ruby
|
68
|
+
books[0].name
|
69
|
+
books[0].read_dates
|
70
|
+
```
|
71
|
+
|
72
|
+
さらに、`Bookmeter#read_books_in` で特定年月の「読んだ本」情報が取得できます。
|
73
|
+
|
74
|
+
```ruby
|
75
|
+
books = bookmeter.read_books_in(2016, 1) # ログインユーザが 2016 年 1 月に「読んだ本」を取得
|
76
|
+
books = bookmeter.read_books_in(2016, 1, '01010101') # ID で指定した他のユーザが 2016 年 1 月に「読んだ本」を取得
|
77
|
+
```
|
78
|
+
|
79
|
+
#### 読んでる本 / 積読本 / 読みたい本
|
80
|
+
|
81
|
+
「読んだ本」以外の書籍情報
|
82
|
+
|
83
|
+
- 読んでる本
|
84
|
+
- 積読本
|
85
|
+
- 読みたい本
|
86
|
+
|
87
|
+
も、それぞれ
|
88
|
+
|
89
|
+
- `Bookmeter#reading_books`
|
90
|
+
- `Bookmeter#tsundoku`
|
91
|
+
- `Bookmeter#wish_list`
|
92
|
+
|
93
|
+
で取得できます。
|
94
|
+
|
95
|
+
```ruby
|
96
|
+
books = bookmeter.reading_books # ログインユーザの「読んでる本」を取得
|
97
|
+
books[0].name
|
98
|
+
books[0].read_dates # 読了日の Array は空
|
99
|
+
|
100
|
+
bookmeter.tsundoku # ログインユーザの「積読本」を取得
|
101
|
+
bookmeter.wish_list # ログインユーザの「読みたい本」を取得
|
102
|
+
```
|
103
|
+
|
104
|
+
### お気に入り / お気に入られユーザ情報の取得
|
105
|
+
|
106
|
+
`Bookmeter#followings` と `Bookmeter#followers` でログインユーザが参照できるお気に入り / お気に入られユーザの情報を取得できます。取得には事前のログインが必要です。
|
107
|
+
|
108
|
+
```ruby
|
109
|
+
following_users = bookmeter.followings # 「お気に入り」ユーザの情報を取得
|
110
|
+
followers = bookmeter.followers # 「お気に入られ」ユーザの情報を取得
|
111
|
+
```
|
112
|
+
|
113
|
+
ユーザ情報はユーザ名 `name` とユーザ ID `id` を持つ `Struct` の配列として取得できます。
|
114
|
+
|
115
|
+
```ruby
|
116
|
+
following_users[0].name
|
117
|
+
following_users[0].id
|
118
|
+
followers[0].name
|
119
|
+
followers[0].id
|
120
|
+
```
|
121
|
+
|
122
|
+
#### 注意
|
123
|
+
|
124
|
+
**お気に入り / お気に入られのページにページネーションが存在する場合には未対応です。**
|
125
|
+
|
126
|
+
### ユーザのプロフィールの取得
|
127
|
+
|
128
|
+
`Bookmeter#profile` でユーザのプロフィールを取得できます。プロフィールはログインなしで閲覧できるため、ログインは不要です。
|
129
|
+
|
130
|
+
```ruby
|
131
|
+
bookmeter = BookmeterScraper::Bookmeter.new
|
132
|
+
user_id = '000000'
|
133
|
+
profile = bookmeter.profile(user_id) # 任意ユーザの ID を指定してプロフィールを取得可能
|
134
|
+
```
|
135
|
+
|
136
|
+
プロフィール情報は以下の属性を持つ `Struct` として取得できます。プロフィールで設定されていない属性は `nil` となります。
|
137
|
+
|
138
|
+
```ruby
|
139
|
+
profile.name # ユーザ名
|
140
|
+
profile.gender # 性別
|
141
|
+
profile.age # 年齢
|
142
|
+
profile.blood_type # 血液型
|
143
|
+
profile.job # 職業
|
144
|
+
profile.address # 現住所
|
145
|
+
profile.url # URL / ブログ
|
146
|
+
profile.description # 自己紹介
|
147
|
+
profile.first_day # 記録初日
|
148
|
+
profile.elapsed_days # 経過日数
|
149
|
+
profile.read_books_count # 読んだ本の数
|
150
|
+
profile.read_pages_count # 読んだページの数
|
151
|
+
profile.reviews_count # 感想/レビューの数
|
152
|
+
profile.bookshelfs_count # 本棚の数
|
153
|
+
```
|
154
|
+
|
155
|
+
## ライセンス
|
156
|
+
|
157
|
+
[MIT License](http://opensource.org/licenses/MIT)
|
data/README.md
ADDED
@@ -0,0 +1,163 @@
|
|
1
|
+
# Bookmeter Scraper [](https://travis-ci.org/kymmt90/bookmeter_scraper)
|
2
|
+
|
3
|
+
A library for scraping [Bookmeter](http://bookmeter.com).
|
4
|
+
|
5
|
+
The Japanese README is [here](https://github.com/kymmt90/bookmeter_scraper/blob/master/README.ja.md).
|
6
|
+
|
7
|
+
|
8
|
+
## Installation
|
9
|
+
|
10
|
+
Add this line to your application's Gemfile:
|
11
|
+
|
12
|
+
```ruby
|
13
|
+
gem 'bookmeter_scraper'
|
14
|
+
```
|
15
|
+
|
16
|
+
And then execute:
|
17
|
+
|
18
|
+
$ bundle
|
19
|
+
|
20
|
+
Or install it yourself as:
|
21
|
+
|
22
|
+
$ gem install bookmeter_scraper
|
23
|
+
|
24
|
+
|
25
|
+
## Usage
|
26
|
+
|
27
|
+
Add this line to your code before using this library:
|
28
|
+
|
29
|
+
```ruby
|
30
|
+
require 'bookmeter_scraper'
|
31
|
+
```
|
32
|
+
|
33
|
+
### Log in
|
34
|
+
|
35
|
+
To get book and followings / followers information, first log in to Bookmeter with `Bookmeter.log_in`:
|
36
|
+
|
37
|
+
```ruby
|
38
|
+
bookmeter = BookmeterScraper::Bookmeter.log_in('example@example.com', 'password')
|
39
|
+
bookmeter.logged_in? # true
|
40
|
+
```
|
41
|
+
|
42
|
+
`Bookmeter#log_in` is also available:
|
43
|
+
|
44
|
+
```ruby
|
45
|
+
bookmeter = BookmeterScraper::Bookmeter.new
|
46
|
+
bookmeter.log_in('example@example.com', 'password')
|
47
|
+
```
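
Judging from the 0.1.0 source, rejected credentials do not raise an exception: `log_in` only records the outcome, which `logged_in?` then reports. The sketch below wraps that check so a script can fail early. `log_in_or_raise` and `LoginFailed` are hypothetical names, not part of the gem, and the helper assumes only that the client class responds to `log_in` and the returned object to `logged_in?`.

```ruby
# Hypothetical helper and error class, not part of bookmeter_scraper.
class LoginFailed < StandardError; end

# client_class is assumed to respond to .log_in(mail, password) and to
# return an object that responds to #logged_in?, as
# BookmeterScraper::Bookmeter does.
def log_in_or_raise(client_class, mail, password)
  client = client_class.log_in(mail, password)
  raise LoginFailed, 'could not log in to Bookmeter' unless client.logged_in?
  client
end
```

With the gem loaded, this would be called as `log_in_or_raise(BookmeterScraper::Bookmeter, 'example@example.com', 'password')`.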
|
48
|
+
|
49
|
+
### Get books information
|
50
|
+
|
51
|
+
You can get the following book information:
|
52
|
+
|
53
|
+
- read books
|
54
|
+
- reading books
|
55
|
+
- tsundoku (stockpile)
|
56
|
+
- wish list
|
57
|
+
|
58
|
+
You need to log in to Bookmeter in advance to get this information.
|
59
|
+
|
60
|
+
#### Read books
|
61
|
+
|
62
|
+
You can get information about read books with `Bookmeter#read_books`:
|
63
|
+
|
64
|
+
```ruby
|
65
|
+
books = bookmeter.read_books # get read books of the logged in user
|
66
|
+
bookmeter.read_books('01010101') # get read books of a user specified by ID
|
67
|
+
```
|
68
|
+
|
69
|
+
Book information is returned as an array of `Struct` objects, each with `name` and `read_dates` attributes.
|
70
|
+
`read_dates` is an array of the dates on which the book was finished (the first finished date and any reread dates):
|
71
|
+
|
72
|
+
```ruby
|
73
|
+
books[0].name
|
74
|
+
books[0].read_dates
|
75
|
+
```
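
Because `read_dates` holds every finish date, rereads can be detected from the array length alone. The following is a minimal sketch under that assumption. `reread_books` and `last_read_on` are hypothetical helpers, and the `Book` struct and sample data only mirror the shape described above so the snippet runs without the gem.

```ruby
# Mirrors the gem's book Struct (name, read_dates) so the sketch runs on its own.
Book = Struct.new(:name, :read_dates)

# Books finished more than once, i.e. with at least one reread date.
def reread_books(books)
  books.select { |book| book.read_dates.size > 1 }
end

# The most recent date on which a book was finished, or nil if it has none.
def last_read_on(book)
  book.read_dates.max
end

# Sample data standing in for the result of bookmeter.read_books.
books = [
  Book.new('Book A', [Time.local(2016, 1, 10)]),
  Book.new('Book B', [Time.local(2015, 12, 1), Time.local(2016, 2, 3)])
]

reread_books(books).map(&:name) # => ["Book B"]
last_read_on(books[1])          # the Time for 2016-02-03
```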
|
76
|
+
|
77
|
+
To get the books read in a specific year and month, use `Bookmeter#read_books_in`:
|
78
|
+
|
79
|
+
```ruby
|
80
|
+
books = bookmeter.read_books_in(2016, 1) # get read books of the logged in user in 2016-01
|
81
|
+
books = bookmeter.read_books_in(2016, 1, '01010101') # get read books of a user in 2016-01
|
82
|
+
```
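
Going by the 0.1.0 source, `read_books_in` matches on year and month only, and a book qualifies if its first read or any reread falls in that month. The same test can be applied to a list that has already been fetched, which avoids scraping again. This is a sketch: `read_in?` and `books_read_in` are hypothetical helpers, and the struct below only mirrors the `name` / `read_dates` shape.

```ruby
# True if any finish date of the book falls in the given year and month.
def read_in?(book, year, month)
  book.read_dates.any? { |date| date.year == year && date.month == month }
end

# Filters an already fetched list of books by year and month.
def books_read_in(books, year, month)
  books.select { |book| read_in?(book, year, month) }
end

# Sample data standing in for the result of bookmeter.read_books.
book_struct = Struct.new(:name, :read_dates)
fetched = [
  book_struct.new('Book A', [Time.local(2016, 1, 10)]),
  book_struct.new('Book B', [Time.local(2015, 12, 1), Time.local(2016, 1, 20)]),
  book_struct.new('Book C', [Time.local(2016, 2, 5)])
]

books_read_in(fetched, 2016, 1).map(&:name) # => ["Book A", "Book B"]
```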
|
83
|
+
|
84
|
+
#### Reading books / Tsundoku / Wish list
|
85
|
+
|
86
|
+
You can get the other book lists with the following methods:
|
87
|
+
|
88
|
+
- `Bookmeter#reading_books`
|
89
|
+
- `Bookmeter#tsundoku`
|
90
|
+
- `Bookmeter#wish_list`
|
91
|
+
|
92
|
+
```ruby
|
93
|
+
books = bookmeter.reading_books
|
94
|
+
books[0].name
|
95
|
+
books[0].read_dates # this array is empty
|
96
|
+
|
97
|
+
bookmeter.tsundoku
|
98
|
+
bookmeter.wish_list
|
99
|
+
```
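
All three methods return the same kind of array, so they can be handled uniformly. As a sketch, the hypothetical helper below (not part of the gem) tallies the three lists for any client object that responds to these methods. The stand-in client exists only so the snippet runs without a Bookmeter session.

```ruby
# Counts the books on each list for a client that responds to
# #reading_books, #tsundoku and #wish_list.
def shelf_sizes(client)
  {
    reading: client.reading_books.size,
    tsundoku: client.tsundoku.size,
    wish_list: client.wish_list.size
  }
end

# Stand-in for a logged-in Bookmeter object, for illustration only.
fake_client = Struct.new(:reading_books, :tsundoku, :wish_list)
                    .new(['Book A'], ['Book B', 'Book C'], [])

shelf_sizes(fake_client) # => {:reading=>1, :tsundoku=>2, :wish_list=>0}
```

With the real client, each of the three calls scrapes Bookmeter again, so it is worth keeping the result instead of calling the helper repeatedly.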
|
100
|
+
|
101
|
+
### Get followings users / followers information
|
102
|
+
|
103
|
+
You can get information about following users (followings) and followers with `Bookmeter#followings` and `Bookmeter#followers`:
|
104
|
+
|
105
|
+
```ruby
|
106
|
+
following_users = bookmeter.followings
|
107
|
+
followers = bookmeter.followers
|
108
|
+
```
|
109
|
+
|
110
|
+
You need to log in to Bookmeter in advance to get this information.
|
111
|
+
|
112
|
+
User information is returned as an array of `Struct` objects, each with `name` and `id` attributes.
|
113
|
+
|
114
|
+
```ruby
|
115
|
+
following_users[0].name
|
116
|
+
following_users[0].id
|
117
|
+
followers[0].name
|
118
|
+
followers[0].id
|
119
|
+
```
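
Since every user carries an `id`, the two lists can be compared by ID, for example to find users who follow each other. The sketch below assumes IDs are comparable strings. `mutual_follows` is a hypothetical helper, and the `User` struct only mirrors the `name` / `id` shape so the snippet runs without the gem.

```ruby
# Mirrors the gem's user Struct (name, id) so the sketch runs on its own.
User = Struct.new(:name, :id)

# Users that appear in both lists, matched by ID rather than by name.
def mutual_follows(followings, followers)
  follower_ids = followers.map(&:id)
  followings.select { |user| follower_ids.include?(user.id) }
end

# Sample data standing in for bookmeter.followings and bookmeter.followers.
sample_followings = [User.new('alice', '100'), User.new('bob', '200')]
sample_followers  = [User.new('bob', '200'), User.new('carol', '300')]

mutual_follows(sample_followings, sample_followers).map(&:name) # => ["bob"]
```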
|
120
|
+
|
121
|
+
#### Notice
|
122
|
+
|
123
|
+
**`Bookmeter#followings` and `Bookmeter#followers` do not support paginated followings / followers pages yet.**
|
124
|
+
|
125
|
+
### Get user profile
|
126
|
+
|
127
|
+
You can get a user profile with `Bookmeter#profile`:
|
128
|
+
|
129
|
+
```ruby
|
130
|
+
bookmeter = BookmeterScraper::Bookmeter.new
|
131
|
+
user_id = '000000'
|
132
|
+
profile = bookmeter.profile(user_id) # you can specify any user ID
|
133
|
+
```
|
134
|
+
|
135
|
+
You do not need to log in to get user profiles.
|
136
|
+
Profile information is a `Struct` with the following attributes:
|
137
|
+
|
138
|
+
```ruby
|
139
|
+
profile.name
|
140
|
+
profile.gender
|
141
|
+
profile.age
|
142
|
+
profile.blood_type
|
143
|
+
profile.job
|
144
|
+
profile.address
|
145
|
+
profile.url
|
146
|
+
profile.description
|
147
|
+
profile.first_day
|
148
|
+
profile.elapsed_days
|
149
|
+
profile.read_books_count
|
150
|
+
profile.read_pages_count
|
151
|
+
profile.reviews_count
|
152
|
+
profile.bookshelfs_count
|
153
|
+
```
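
Attributes that a user has not set on their profile are `nil` (the Japanese README states this explicitly), so callers usually want to skip them. A minimal sketch: `filled_attributes` is a hypothetical helper, the `Profile` struct only mirrors the attribute list above, and the values are made-up samples.

```ruby
# Mirrors the attribute list of the gem's profile Struct so the sketch
# runs on its own.
Profile = Struct.new(
  :name, :gender, :age, :blood_type, :job, :address, :url, :description,
  :first_day, :elapsed_days, :read_books_count, :read_pages_count,
  :reviews_count, :bookshelfs_count
)

# Returns only the attributes that are set, as a Hash.
def filled_attributes(profile)
  profile.to_h.reject { |_attribute, value| value.nil? }
end

# Sample profile with two attributes set; the rest default to nil.
sample_profile = Profile.new('example_user')
sample_profile.read_books_count = '120'

filled_attributes(sample_profile).keys # => [:name, :read_books_count]
```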
|
154
|
+
|
155
|
+
|
156
|
+
## Contributing
|
157
|
+
|
158
|
+
Bug reports and pull requests are welcome on GitHub at https://github.com/kymmt90/bookmeter_scraper.
|
159
|
+
|
160
|
+
|
161
|
+
## License
|
162
|
+
|
163
|
+
The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
|
data/Rakefile
ADDED
data/bin/console
ADDED
@@ -0,0 +1,14 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
|
3
|
+
require "bundler/setup"
|
4
|
+
require "bookmeter_scraper"
|
5
|
+
|
6
|
+
# You can add fixtures and/or initialization code here to make experimenting
|
7
|
+
# with your gem easier. You can also use a different console, if you like.
|
8
|
+
|
9
|
+
# (If you use this, don't forget to add pry to your Gemfile!)
|
10
|
+
# require "pry"
|
11
|
+
# Pry.start
|
12
|
+
|
13
|
+
require "irb"
|
14
|
+
IRB.start
|
data/bin/setup
ADDED
data/bookmeter_scraper.gemspec
ADDED
@@ -0,0 +1,28 @@
|
|
1
|
+
lib = File.expand_path('../lib', __FILE__)
|
2
|
+
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
|
3
|
+
require 'bookmeter_scraper/version'
|
4
|
+
|
5
|
+
Gem::Specification.new do |spec|
|
6
|
+
spec.name = "bookmeter_scraper"
|
7
|
+
spec.version = BookmeterScraper::VERSION
|
8
|
+
spec.authors = ["Kohei Yamamoto"]
|
9
|
+
spec.email = ["kymmt90@gmail.com"]
|
10
|
+
|
11
|
+
spec.summary = %q{Bookmeter scraping library}
|
12
|
+
spec.description = %q{Bookmeter scraping library}
|
13
|
+
spec.homepage = "https://github.com/kymmt90/bookmeter_scraper"
|
14
|
+
spec.license = "MIT"
|
15
|
+
|
16
|
+
spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
|
17
|
+
spec.bindir = "exe"
|
18
|
+
spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
|
19
|
+
spec.require_paths = ["lib"]
|
20
|
+
|
21
|
+
spec.add_development_dependency "bundler", "~> 1.10"
|
22
|
+
spec.add_development_dependency "rake", "~> 10.0"
|
23
|
+
spec.add_development_dependency "rspec", "~> 3.4"
|
24
|
+
spec.add_development_dependency "webmock", "~> 1.22"
|
25
|
+
|
26
|
+
spec.add_dependency "yasuri", "~> 0.0"
|
27
|
+
spec.add_dependency "mechanize", "~> 2.7"
|
28
|
+
end
|
data/exe/bookmeter_scraper
ADDED
data/lib/bookmeter_scraper.rb
ADDED
data/lib/bookmeter_scraper/bookmeter.rb
ADDED
@@ -0,0 +1,414 @@
|
|
1
|
+
require 'mechanize'
|
2
|
+
require 'yasuri'
|
3
|
+
|
4
|
+
module BookmeterScraper
|
5
|
+
class Bookmeter
|
6
|
+
ROOT_URI = 'http://bookmeter.com'.freeze
|
7
|
+
LOGIN_URI = "#{ROOT_URI}/login".freeze
|
8
|
+
|
9
|
+
PROFILE_ATTRIBUTES = %i(name gender age blood_type job address url description first_day elapsed_days read_books_count read_pages_count reviews_count bookshelfs_count)
|
10
|
+
Profile = Struct.new(*PROFILE_ATTRIBUTES)
|
11
|
+
|
12
|
+
BOOK_ATTRIBUTES = %i(name read_dates)
|
13
|
+
Book = Struct.new(*BOOK_ATTRIBUTES)
|
14
|
+
|
15
|
+
USER_ATTRIBUTES = %i(name id)
|
16
|
+
User = Struct.new(*USER_ATTRIBUTES)
|
17
|
+
|
18
|
+
JP_ATTRIBUTE_NAMES = {
|
19
|
+
gender: '性別',
|
20
|
+
age: '年齢',
|
21
|
+
blood_type: '血液型',
|
22
|
+
job: '職業',
|
23
|
+
address: '現住所',
|
24
|
+
url: 'URL / ブログ',
|
25
|
+
description: '自己紹介',
|
26
|
+
first_day: '記録初日',
|
27
|
+
elapsed_days: '経過日数',
|
28
|
+
read_books_count: '読んだ本',
|
29
|
+
read_pages_count: '読んだページ',
|
30
|
+
reviews_count: '感想/レビュー',
|
31
|
+
bookshelfs_count: '本棚',
|
32
|
+
}
|
33
|
+
|
34
|
+
NUM_BOOKS_PER_PAGE = 40
|
35
|
+
NUM_USERS_PER_PAGE = 20
|
36
|
+
|
37
|
+
attr_reader :log_in_user_id
|
38
|
+
|
39
|
+
def self.mypage_uri(user_id)
|
40
|
+
raise ArgumentError unless user_id =~ /\A\d+\z/
|
41
|
+
"#{ROOT_URI}/u/#{user_id}"
|
42
|
+
end
|
43
|
+
|
44
|
+
def self.read_books_uri(user_id)
|
45
|
+
raise ArgumentError unless user_id =~ /\A\d+\z/
|
46
|
+
"#{ROOT_URI}/u/#{user_id}/booklist"
|
47
|
+
end
|
48
|
+
|
49
|
+
def self.reading_books_uri(user_id)
|
50
|
+
raise ArgumentError unless user_id =~ /\A\d+\z/
|
51
|
+
"#{ROOT_URI}/u/#{user_id}/booklistnow"
|
52
|
+
end
|
53
|
+
|
54
|
+
def self.tsundoku_uri(user_id)
|
55
|
+
raise ArgumentError unless user_id =~ /\A\d+\z/
|
56
|
+
"#{ROOT_URI}/u/#{user_id}/booklisttun"
|
57
|
+
end
|
58
|
+
|
59
|
+
def self.wish_list_uri(user_id)
|
60
|
+
raise ArgumentError unless user_id =~ /\A\d+\z/
|
61
|
+
"#{ROOT_URI}/u/#{user_id}/booklistpre"
|
62
|
+
end
|
63
|
+
|
64
|
+
def self.followings_uri(user_id)
|
65
|
+
raise ArgumentError unless user_id =~ /\A\d+\z/
|
66
|
+
"#{ROOT_URI}/u/#{user_id}/favorite_user"
|
67
|
+
end
|
68
|
+
|
69
|
+
def self.followers_uri(user_id)
|
70
|
+
raise ArgumentError unless user_id =~ /\A\d+\z/
|
71
|
+
"#{ROOT_URI}/u/#{user_id}/favorited_user"
|
72
|
+
end
|
73
|
+
|
74
|
+
def self.log_in(mail, password)
|
75
|
+
Bookmeter.new.tap do |bookmeter|
|
76
|
+
bookmeter.log_in(mail, password)
|
77
|
+
end
|
78
|
+
end
|
79
|
+
|
80
|
+
|
81
|
+
def initialize(agent = nil)
|
82
|
+
@agent = agent.nil? ? Bookmeter.new_agent : agent
|
83
|
+
@logged_in = false
|
84
|
+
end
|
85
|
+
|
86
|
+
def log_in(mail, password)
|
87
|
+
raise BookmeterError if @agent.nil?
|
88
|
+
|
89
|
+
next_page = nil
|
90
|
+
@agent.get(LOGIN_URI) do |page|
|
91
|
+
next_page = page.form_with(action: '/login') do |form|
|
92
|
+
form.field_with(name: 'mail').value = mail
|
93
|
+
form.field_with(name: 'password').value = password
|
94
|
+
end.submit
|
95
|
+
end
|
96
|
+
@logged_in = next_page.uri.to_s == ROOT_URI + '/'
|
97
|
+
return unless logged_in?
|
98
|
+
|
99
|
+
mypage = next_page.link_with(text: 'マイページ').click
|
100
|
+
@log_in_user_id = extract_user_id(mypage)
|
101
|
+
end
|
102
|
+
|
103
|
+
def logged_in?
|
104
|
+
@logged_in
|
105
|
+
end
|
106
|
+
|
107
|
+
def profile(user_id)
|
108
|
+
raise ArgumentError unless user_id =~ /\A\d+\z/
|
109
|
+
|
110
|
+
mypage = @agent.get(Bookmeter.mypage_uri(user_id))
|
111
|
+
|
112
|
+
profile_dl_tags = mypage.search('#side_left > div.inner > div.profile > dl')
|
113
|
+
jp_attribute_names = profile_dl_tags.map { |i| i.children[0].children.text }
|
114
|
+
attribute_values = profile_dl_tags.map { |i| i.children[1].children.text }
|
115
|
+
jp_attributes = Hash[jp_attribute_names.zip(attribute_values)]
|
116
|
+
attributes = PROFILE_ATTRIBUTES.map do |attribute|
|
117
|
+
jp_attributes[JP_ATTRIBUTE_NAMES[attribute]]
|
118
|
+
end
|
119
|
+
attributes[0] = mypage.at_css('#side_left > div.inner > h3').text
|
120
|
+
|
121
|
+
Profile.new(*attributes)
|
122
|
+
end
|
123
|
+
|
124
|
+
def read_books(user_id = @log_in_user_id)
|
125
|
+
books = get_books(user_id, :read_books_uri)
|
126
|
+
books.each { |b| yield b } if block_given?
|
127
|
+
books
|
128
|
+
end
|
129
|
+
|
130
|
+
def read_books_in(year, month, user_id = @log_in_user_id)
|
131
|
+
date = Time.local(year, month)
|
132
|
+
books = get_read_books(user_id, date)
|
133
|
+
books.each { |b| yield b } if block_given?
|
134
|
+
books
|
135
|
+
end
|
136
|
+
|
137
|
+
def reading_books(user_id = @log_in_user_id)
|
138
|
+
books = get_books(user_id, :reading_books_uri)
|
139
|
+
books.each { |b| yield b } if block_given?
|
140
|
+
books
|
141
|
+
end
|
142
|
+
|
143
|
+
def tsundoku(user_id = @log_in_user_id)
|
144
|
+
books = get_books(user_id, :tsundoku_uri)
|
145
|
+
books.each { |b| yield b } if block_given?
|
146
|
+
books
|
147
|
+
end
|
148
|
+
|
149
|
+
def wish_list(user_id = @log_in_user_id)
|
150
|
+
books = get_books(user_id, :wish_list_uri)
|
151
|
+
books.each { |b| yield b } if block_given?
|
152
|
+
books
|
153
|
+
end
|
154
|
+
|
155
|
+
def followings(user_id = @log_in_user_id)
|
156
|
+
get_followings(user_id)
|
157
|
+
end
|
158
|
+
|
159
|
+
def followers(user_id = @log_in_user_id)
|
160
|
+
get_followers(user_id)
|
161
|
+
end
|
162
|
+
|
163
|
+
private
|
164
|
+
|
165
|
+
def self.new_agent
|
166
|
+
Mechanize.new do |a|
|
167
|
+
a.user_agent_alias = 'Mac Safari'
|
168
|
+
end
|
169
|
+
end
|
170
|
+
|
171
|
+
def extract_user_id(page)
|
172
|
+
page.uri.to_s.match(/\/u\/(\d+)$/)[1]
|
173
|
+
end
|
174
|
+
|
175
|
+
def get_books(user_id, uri_method)
|
176
|
+
books = []
|
177
|
+
scraped_pages = scrape_book_pages(user_id, uri_method)
|
178
|
+
scraped_pages.each do |page|
|
179
|
+
books << get_book_structs(page)
|
180
|
+
books.flatten!
|
181
|
+
end
|
182
|
+
books
|
183
|
+
end
|
184
|
+
|
185
|
+
def get_read_books(user_id, target_ym)
|
186
|
+
result = []
|
187
|
+
scrape_book_pages(user_id, :read_books_uri).each do |page|
|
188
|
+
first_book_date = get_read_date(page['book_1_link'])
|
189
|
+
last_book_date = get_last_book_date(page)
|
190
|
+
|
191
|
+
first_book_ym = Time.local(first_book_date['year'].to_i, first_book_date['month'].to_i)
|
192
|
+
last_book_ym = Time.local(last_book_date['year'].to_i, last_book_date['month'].to_i)
|
193
|
+
|
194
|
+
if target_ym < last_book_ym
|
195
|
+
next
|
196
|
+
elsif target_ym == first_book_ym && target_ym > last_book_ym
|
197
|
+
result.concat(get_target_books(target_ym, page))
|
198
|
+
break
|
199
|
+
elsif target_ym < first_book_ym && target_ym > last_book_ym
|
200
|
+
result.concat(get_target_books(target_ym, page))
|
201
|
+
break
|
202
|
+
elsif target_ym <= first_book_ym && target_ym >= last_book_ym
|
203
|
+
result.concat(get_target_books(target_ym, page))
|
204
|
+
elsif target_ym > first_book_ym
|
205
|
+
break
|
206
|
+
end
|
207
|
+
end
|
208
|
+
result
|
209
|
+
end
|
210
|
+
|
211
|
+
def get_last_book_date(page)
|
212
|
+
NUM_BOOKS_PER_PAGE.downto(1) do |i|
|
213
|
+
link = page["book_#{i}_link"]
|
214
|
+
next if link.empty?
|
215
|
+
return get_read_date(link)
|
216
|
+
end
|
217
|
+
end
|
218
|
+
|
219
|
+
def get_target_books(target_ym, page)
|
220
|
+
target_books = []
|
221
|
+
|
222
|
+
1.upto(NUM_BOOKS_PER_PAGE) do |i|
|
223
|
+
next if page["book_#{i}_link"].empty?
|
224
|
+
|
225
|
+
read_yms = []
|
226
|
+
read_date = get_read_date(page["book_#{i}_link"])
|
227
|
+
read_dates = [Time.local(read_date['year'], read_date['month'], read_date['day'])]
|
228
|
+
read_yms << Time.local(read_date['year'], read_date['month'])
|
229
|
+
|
230
|
+
reread_dates = []
|
231
|
+
reread_dates << get_reread_date(page["book_#{i}_link"])
|
232
|
+
reread_dates.flatten!
|
233
|
+
|
234
|
+
unless reread_dates.empty?
|
235
|
+
reread_dates.each do |date|
|
236
|
+
read_yms << Time.local(date['reread_year'], date['reread_month'])
|
237
|
+
end
|
238
|
+
end
|
239
|
+
|
240
|
+
next unless read_yms.include?(target_ym)
|
241
|
+
|
242
|
+
unless reread_dates.empty?
|
243
|
+
reread_dates.each do |date|
|
244
|
+
read_dates << Time.local(date['reread_year'], date['reread_month'], date['reread_day'])
|
245
|
+
end
|
246
|
+
end
|
247
|
+
book_name = get_book_name(page["book_#{i}_link"])
|
248
|
+
book = Book.new(book_name, read_dates)
|
249
|
+
target_books << book
|
250
|
+
end
|
251
|
+
|
252
|
+
target_books
|
253
|
+
end
|
254
|
+
|
255
|
+
def scrape_book_pages(user_id, uri_method)
|
256
|
+
raise ArgumentError unless user_id =~ /\A\d+\z/
|
257
|
+
raise ArgumentError unless Bookmeter.methods.include?(uri_method)
|
258
|
+
return [] unless logged_in?
|
259
|
+
|
260
|
+
books_page = @agent.get(Bookmeter.method(uri_method).call(user_id))
|
261
|
+
|
262
|
+
# if books are not found at all
|
263
|
+
return [] if books_page.search('#main_left > div > center > a').empty?
|
264
|
+
|
265
|
+
if books_page.search('span.now_page').empty?
|
266
|
+
books_root = Yasuri.struct_books '//*[@id="main_left"]/div' do
|
267
|
+
1.upto(NUM_BOOKS_PER_PAGE) do |i|
|
268
|
+
send("text_book_#{i}_name", "//*[@id=\"main_left\"]/div/div[#{i + 1}]/div[2]/a")
|
269
|
+
send("text_book_#{i}_link", "//*[@id=\"main_left\"]/div/div[#{i + 1}]/div[2]/a/@href")
|
270
|
+
end
|
271
|
+
end
|
272
|
+
return [books_root.inject(@agent, books_page)]
|
273
|
+
end
|
274
|
+
|
275
|
+
books_root = Yasuri.pages_root '//span[@class="now_page"]/following-sibling::span[1]/a' do
|
276
|
+
text_page_index '//span[@class="now_page"]/a'
|
277
|
+
1.upto(NUM_BOOKS_PER_PAGE) do |i|
|
278
|
+
send("text_book_#{i}_name", "//*[@id=\"main_left\"]/div/div[#{i + 1}]/div[2]/a")
|
279
|
+
send("text_book_#{i}_link", "//*[@id=\"main_left\"]/div/div[#{i + 1}]/div[2]/a/@href")
|
280
|
+
end
|
281
|
+
end
|
282
|
+
books_root.inject(@agent, books_page)
|
283
|
+
end
|
284
|
+
|
285
|
+
def get_book_name(book_link)
|
286
|
+
@agent.get(ROOT_URI + book_link).search('#title').text
|
287
|
+
end
|
288
|
+
|
289
|
+
def get_read_date(book_link)
|
290
|
+
book_page = @agent.get(ROOT_URI + book_link)
|
291
|
+
book_date = Yasuri.struct_date '//*[@id="book_edit_area"]/form[1]/div[2]' do
|
292
|
+
text_year '//*[@id="read_date_y"]/option[1]', truncate: /\d+/, proc: :to_i
|
293
|
+
text_month '//*[@id="read_date_m"]/option[1]', truncate: /\d+/, proc: :to_i
|
294
|
+
text_day '//*[@id="read_date_d"]/option[1]', truncate: /\d+/, proc: :to_i
|
295
|
+
end
|
296
|
+
book_date.inject(@agent, book_page)
|
297
|
+
end
|
298
|
+
|
299
|
+
def get_reread_date(book_link)
|
300
|
+
book_page = @agent.get(ROOT_URI + book_link)
|
301
|
+
book_reread_date = Yasuri.struct_reread_date '//*[@id="book_edit_area"]/div/form[1]/div[2]' do
|
302
|
+
text_reread_year '//div[@class="reread_box"]/form[1]/div[2]/select[1]/option[1]', truncate: /\d+/, proc: :to_i
|
303
|
+
text_reread_month '//div[@class="reread_box"]/form[1]/div[2]/select[2]/option[1]', truncate: /\d+/, proc: :to_i
|
304
|
+
text_reread_day '//div[@class="reread_box"]/form[1]/div[2]/select[3]/option[1]', truncate: /\d+/, proc: :to_i
|
305
|
+
end
|
306
|
+
book_reread_date.inject(@agent, book_page)
|
307
|
+
end
|
308
|
+
|
309
|
+
def get_book_structs(page)
|
310
|
+
books = []
|
311
|
+
|
312
|
+
1.upto(NUM_BOOKS_PER_PAGE) do |i|
|
313
|
+
break if page["book_#{i}_link"].empty?
|
314
|
+
|
315
|
+
read_dates = []
|
316
|
+
read_date = get_read_date(page["book_#{i}_link"])
|
317
|
+
unless read_date.empty?
|
318
|
+
read_dates << Time.local(read_date['year'], read_date['month'], read_date['day'])
|
319
|
+
end
|
320
|
+
|
321
|
+
reread_dates = []
|
322
|
+
reread_dates << get_reread_date(page["book_#{i}_link"])
|
323
|
+
reread_dates.flatten!
|
324
|
+
|
325
|
+
unless reread_dates.empty?
|
326
|
+
reread_dates.each do |date|
|
327
|
+
read_dates << Time.local(date['reread_year'], date['reread_month'], date['reread_day'])
|
328
|
+
end
|
329
|
+
end
|
330
|
+
|
331
|
+
book_name = get_book_name(page["book_#{i}_link"])
|
332
|
+
book = Book.new(book_name, read_dates)
|
333
|
+
books << book
|
334
|
+
end
|
335
|
+
|
336
|
+
books
|
337
|
+
end
|
338
|
+
|
339
|
+
def get_followings(user_id)
|
340
|
+
users = []
|
341
|
+
scraped_pages = user_id == @log_in_user_id ? scrape_followings_page(user_id)
|
342
|
+
: scrape_others_followings_page(user_id)
|
343
|
+
scraped_pages.each do |page|
|
344
|
+
users << get_user_structs(page)
|
345
|
+
users.flatten!
|
346
|
+
end
|
347
|
+
users
|
348
|
+
end
|
349
|
+
|
350
|
+
def get_followers(user_id)
|
351
|
+
users = []
|
352
|
+
scraped_pages = scrape_followers_page(user_id)
|
353
|
+
scraped_pages.each do |page|
|
354
|
+
users << get_user_structs(page)
|
355
|
+
users.flatten!
|
356
|
+
end
|
357
|
+
users
|
358
|
+
end
|
359
|
+
|
360
|
+
def get_user_structs(page)
|
361
|
+
users = []
|
362
|
+
|
363
|
+
1.upto(NUM_USERS_PER_PAGE) do |i|
|
364
|
+
break if page["user_#{i}_name"].empty?
|
365
|
+
|
366
|
+
user_name = page["user_#{i}_name"]
|
367
|
+
user_id = page["user_#{i}_link"].match(/\/u\/(\d+)$/)[1]
|
368
|
+
user = User.new(user_name, user_id)
|
369
|
+
users << user
|
370
|
+
end
|
371
|
+
|
372
|
+
users
|
373
|
+
end
|
374
|
+
|
375
|
+
def scrape_followings_page(user_id)
|
376
|
+
raise ArgumentError unless user_id =~ /\A\d+\z/
|
377
|
+
return [] unless logged_in?
|
378
|
+
|
379
|
+
followings_page = @agent.get(Bookmeter.followings_uri(user_id))
|
380
|
+
followings_root = Yasuri.struct_books '//*[@id="main_left"]/div' do
|
381
|
+
1.upto(NUM_USERS_PER_PAGE) do |i|
|
382
|
+
send("text_user_#{i}_name", "//*[@id=\"main_left\"]/div/div[#{i}]/a/@title")
|
383
|
+
send("text_user_#{i}_link", "//*[@id=\"main_left\"]/div/div[#{i}]/a/@href")
|
384
|
+
end
|
385
|
+
end
|
386
|
+
[followings_root.inject(@agent, followings_page)]
|
387
|
+
end
|
388
|
+
|
389
|
+
def scrape_others_followings_page(user_id)
|
390
|
+
scrape_users_listing_page(user_id, :followings_uri)
|
391
|
+
end
|
392
|
+
|
393
|
+
def scrape_followers_page(user_id)
|
394
|
+
scrape_users_listing_page(user_id, :followers_uri)
|
395
|
+
end
|
396
|
+
|
397
|
+
def scrape_users_listing_page(user_id, uri_method)
|
398
|
+
raise ArgumentError unless user_id =~ /\A\d+\z/
|
399
|
+
raise ArgumentError unless Bookmeter.methods.include?(uri_method)
|
400
|
+
return [] unless logged_in?
|
401
|
+
|
402
|
+
page = @agent.get(Bookmeter.method(uri_method).call(user_id))
|
403
|
+
root = Yasuri.struct_users '//*[@id="main_left"]/div' do
|
404
|
+
1.upto(NUM_USERS_PER_PAGE) do |i|
|
405
|
+
send("text_user_#{i}_name", "//*[@id=\"main_left\"]/div/div[#{i}]/div/div[2]/a/@title")
|
406
|
+
send("text_user_#{i}_link", "//*[@id=\"main_left\"]/div/div[#{i}]/div/div[2]/a/@href")
|
407
|
+
end
|
408
|
+
end
|
409
|
+
[root.inject(@agent, page)]
|
410
|
+
end
|
411
|
+
end
|
412
|
+
|
413
|
+
class BookmeterError < StandardError; end
|
414
|
+
end
|
data/lib/bookmeter_scraper/version.rb
ADDED
metadata
ADDED
@@ -0,0 +1,144 @@
|
|
1
|
+
--- !ruby/object:Gem::Specification
|
2
|
+
name: bookmeter_scraper
|
3
|
+
version: !ruby/object:Gem::Version
|
4
|
+
version: 0.1.0
|
5
|
+
platform: ruby
|
6
|
+
authors:
|
7
|
+
- Kohei Yamamoto
|
8
|
+
autorequire:
|
9
|
+
bindir: exe
|
10
|
+
cert_chain: []
|
11
|
+
date: 2016-02-26 00:00:00.000000000 Z
|
12
|
+
dependencies:
|
13
|
+
- !ruby/object:Gem::Dependency
|
14
|
+
name: bundler
|
15
|
+
requirement: !ruby/object:Gem::Requirement
|
16
|
+
requirements:
|
17
|
+
- - "~>"
|
18
|
+
- !ruby/object:Gem::Version
|
19
|
+
version: '1.10'
|
20
|
+
type: :development
|
21
|
+
prerelease: false
|
22
|
+
version_requirements: !ruby/object:Gem::Requirement
|
23
|
+
requirements:
|
24
|
+
- - "~>"
|
25
|
+
- !ruby/object:Gem::Version
|
26
|
+
version: '1.10'
|
27
|
+
- !ruby/object:Gem::Dependency
|
28
|
+
name: rake
|
29
|
+
requirement: !ruby/object:Gem::Requirement
|
30
|
+
requirements:
|
31
|
+
- - "~>"
|
32
|
+
- !ruby/object:Gem::Version
|
33
|
+
version: '10.0'
|
34
|
+
type: :development
|
35
|
+
prerelease: false
|
36
|
+
version_requirements: !ruby/object:Gem::Requirement
|
37
|
+
requirements:
|
38
|
+
- - "~>"
|
39
|
+
- !ruby/object:Gem::Version
|
40
|
+
version: '10.0'
|
41
|
+
- !ruby/object:Gem::Dependency
|
42
|
+
name: rspec
|
43
|
+
requirement: !ruby/object:Gem::Requirement
|
44
|
+
requirements:
|
45
|
+
- - "~>"
|
46
|
+
- !ruby/object:Gem::Version
|
47
|
+
version: '3.4'
|
48
|
+
type: :development
|
49
|
+
prerelease: false
|
50
|
+
version_requirements: !ruby/object:Gem::Requirement
|
51
|
+
requirements:
|
52
|
+
- - "~>"
|
53
|
+
- !ruby/object:Gem::Version
|
54
|
+
version: '3.4'
|
55
|
+
- !ruby/object:Gem::Dependency
|
56
|
+
name: webmock
|
57
|
+
requirement: !ruby/object:Gem::Requirement
|
58
|
+
requirements:
|
59
|
+
- - "~>"
|
60
|
+
- !ruby/object:Gem::Version
|
61
|
+
version: '1.22'
|
62
|
+
type: :development
|
63
|
+
prerelease: false
|
64
|
+
version_requirements: !ruby/object:Gem::Requirement
|
65
|
+
requirements:
|
66
|
+
- - "~>"
|
67
|
+
- !ruby/object:Gem::Version
|
68
|
+
version: '1.22'
|
69
|
+
- !ruby/object:Gem::Dependency
|
70
|
+
name: yasuri
|
71
|
+
requirement: !ruby/object:Gem::Requirement
|
72
|
+
requirements:
|
73
|
+
- - "~>"
|
74
|
+
- !ruby/object:Gem::Version
|
75
|
+
version: '0.0'
|
76
|
+
type: :runtime
|
77
|
+
prerelease: false
|
78
|
+
version_requirements: !ruby/object:Gem::Requirement
|
79
|
+
requirements:
|
80
|
+
- - "~>"
|
81
|
+
- !ruby/object:Gem::Version
|
82
|
+
version: '0.0'
|
83
|
+
- !ruby/object:Gem::Dependency
|
84
|
+
name: mechanize
|
85
|
+
requirement: !ruby/object:Gem::Requirement
|
86
|
+
requirements:
|
87
|
+
- - "~>"
|
88
|
+
- !ruby/object:Gem::Version
|
89
|
+
version: '2.7'
|
90
|
+
type: :runtime
|
91
|
+
prerelease: false
|
92
|
+
version_requirements: !ruby/object:Gem::Requirement
|
93
|
+
requirements:
|
94
|
+
- - "~>"
|
95
|
+
- !ruby/object:Gem::Version
|
96
|
+
version: '2.7'
|
97
|
+
description: Bookmeter scraping library
|
98
|
+
email:
|
99
|
+
- kymmt90@gmail.com
|
100
|
+
executables:
|
101
|
+
- bookmeter_scraper
|
102
|
+
extensions: []
|
103
|
+
extra_rdoc_files: []
|
104
|
+
files:
|
105
|
+
- ".gitignore"
|
106
|
+
- ".rspec"
|
107
|
+
- ".travis.yml"
|
108
|
+
- Gemfile
|
109
|
+
- LICENSE.txt
|
110
|
+
- README.ja.md
|
111
|
+
- README.md
|
112
|
+
- Rakefile
|
113
|
+
- bin/console
|
114
|
+
- bin/setup
|
115
|
+
- bookmeter_scraper.gemspec
|
116
|
+
- exe/bookmeter_scraper
|
117
|
+
- lib/bookmeter_scraper.rb
|
118
|
+
- lib/bookmeter_scraper/bookmeter.rb
|
119
|
+
- lib/bookmeter_scraper/version.rb
|
120
|
+
homepage: https://github.com/kymmt90/bookmeter_scraper
|
121
|
+
licenses:
|
122
|
+
- MIT
|
123
|
+
metadata: {}
|
124
|
+
post_install_message:
|
125
|
+
rdoc_options: []
|
126
|
+
require_paths:
|
127
|
+
- lib
|
128
|
+
required_ruby_version: !ruby/object:Gem::Requirement
|
129
|
+
requirements:
|
130
|
+
- - ">="
|
131
|
+
- !ruby/object:Gem::Version
|
132
|
+
version: '0'
|
133
|
+
required_rubygems_version: !ruby/object:Gem::Requirement
|
134
|
+
requirements:
|
135
|
+
- - ">="
|
136
|
+
- !ruby/object:Gem::Version
|
137
|
+
version: '0'
|
138
|
+
requirements: []
|
139
|
+
rubyforge_project:
|
140
|
+
rubygems_version: 2.5.1
|
141
|
+
signing_key:
|
142
|
+
specification_version: 4
|
143
|
+
summary: Bookmeter scraping library
|
144
|
+
test_files: []
|