niwa_textream 0.1.0

@@ -0,0 +1,3 @@
1
+ language: ruby
2
+ rvm:
3
+ - 2.3.1
@@ -0,0 +1,49 @@
1
+ # Contributor Code of Conduct
2
+
3
+ As contributors and maintainers of this project, and in the interest of
4
+ fostering an open and welcoming community, we pledge to respect all people who
5
+ contribute through reporting issues, posting feature requests, updating
6
+ documentation, submitting pull requests or patches, and other activities.
7
+
8
+ We are committed to making participation in this project a harassment-free
9
+ experience for everyone, regardless of level of experience, gender, gender
10
+ identity and expression, sexual orientation, disability, personal appearance,
11
+ body size, race, ethnicity, age, religion, or nationality.
12
+
13
+ Examples of unacceptable behavior by participants include:
14
+
15
+ * The use of sexualized language or imagery
16
+ * Personal attacks
17
+ * Trolling or insulting/derogatory comments
18
+ * Public or private harassment
19
+ * Publishing others' private information, such as physical or electronic
20
+ addresses, without explicit permission
21
+ * Other unethical or unprofessional conduct
22
+
23
+ Project maintainers have the right and responsibility to remove, edit, or
24
+ reject comments, commits, code, wiki edits, issues, and other contributions
25
+ that are not aligned to this Code of Conduct, or to ban temporarily or
26
+ permanently any contributor for other behaviors that they deem inappropriate,
27
+ threatening, offensive, or harmful.
28
+
29
+ By adopting this Code of Conduct, project maintainers commit themselves to
30
+ fairly and consistently applying these principles to every aspect of managing
31
+ this project. Project maintainers who do not follow or enforce the Code of
32
+ Conduct may be permanently removed from the project team.
33
+
34
+ This code of conduct applies both within project spaces and in public spaces
35
+ when an individual is representing the project or its community.
36
+
37
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
38
+ reported by contacting a project maintainer at niwatolli3@gmail.com. All
39
+ complaints will be reviewed and investigated and will result in a response that
40
+ is deemed necessary and appropriate to the circumstances. Maintainers are
41
+ obligated to maintain confidentiality with regard to the reporter of an
42
+ incident.
43
+
44
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage],
45
+ version 1.3.0, available at
46
+ [http://contributor-covenant.org/version/1/3/0/][version]
47
+
48
+ [homepage]: http://contributor-covenant.org
49
+ [version]: http://contributor-covenant.org/version/1/3/0/
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in niwa_textream.gemspec
4
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2016 niwatolli3
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,52 @@
1
+ # NiwaTextream
2
+
3
+ [![Build Status](https://travis-ci.org/niwatolli3/niwa_textream.svg?branch=master)](https://travis-ci.org/niwatolli3/niwa_textream)
4
+ [![Code Climate](https://codeclimate.com/github/niwatolli3/niwa_textream/badges/gpa.svg)](https://codeclimate.com/github/niwatolli3/niwa_textream)
5
+ [![Test Coverage](https://codeclimate.com/github/niwatolli3/niwa_textream/badges/coverage.svg)](https://codeclimate.com/github/niwatolli3/niwa_textream/coverage)
6
+ [![Issue Count](https://codeclimate.com/github/niwatolli3/niwa_textream/badges/issue_count.svg)](https://codeclimate.com/github/niwatolli3/niwa_textream)
7
+
8
+ NiwaTextream is a scraping library for Yahoo! Textream.
9
+
10
+ ## Installation
11
+
12
+ Add this line to your application's Gemfile:
13
+
14
+ ```ruby
15
+ gem 'niwa_textream'
16
+ ```
17
+
18
+ And then execute:
19
+
20
+ $ bundle
21
+
22
+ Or install it yourself as:
23
+
24
+ $ gem install niwa_textream
25
+
26
+ ## Usage
27
+
28
+ This gem follows the Page Object pattern: each Textream page is wrapped in a class that exposes its content and navigation.
29
+
30
+ Get the category list displayed on the top page:
31
+ ```ruby
32
+ mecha = Mechanize.new
33
+ NiwaTextream::TopPage.goTo(mecha)
34
+ @topPage = NiwaTextream::TopPage.new(mecha)
35
+ print(@topPage.categories)
36
+ ```
37
+
38
+ ## Development
39
+
40
+ After checking out the repo, run `bin/setup` to install dependencies. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
41
+
42
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
43
+
44
+ ## Contributing
45
+
46
+ Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/niwa_textream. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
47
+
48
+
49
+ ## License
50
+
51
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
52
+
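The Page Object pattern mentioned in the README can be sketched without network access. This is a minimal illustration only: `FakePage`, `FakeAgent`, `TopPageSketch`, and the `example.invalid` URL are hypothetical stand-ins, not part of the gem or of Mechanize.

```ruby
# Hypothetical stand-in for a fetched page: just a list of link labels.
FakePage = Struct.new(:labels)

# Hypothetical stand-in for a Mechanize agent: remembers the current page.
class FakeAgent
  attr_reader :page

  def initialize(pages)
    @pages = pages
    @page = nil
  end

  # Look up a canned page instead of performing an HTTP request.
  def get(url)
    @page = @pages.fetch(url)
  end
end

# A page object: callers ask for categories, not for XPath queries.
class TopPageSketch
  URL = 'http://example.invalid/top'

  # Navigate the agent to this page and wrap the result.
  def self.go_to(agent)
    agent.get(URL)
    new(agent)
  end

  def initialize(agent)
    @agent = agent
  end

  # Return the cleaned-up category labels found on the current page.
  def categories
    @agent.page.labels.map(&:strip)
  end
end

agent = FakeAgent.new(TopPageSketch::URL => FakePage.new([' News ', 'Sports']))
top = TopPageSketch.go_to(agent)
p top.categories
```

The point of the pattern is that scraping details stay inside the page class, so callers are unaffected when the markup changes.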
@@ -0,0 +1,10 @@
1
+ require "bundler/gem_tasks"
2
+ task :default => :spec
3
+
4
+ begin
5
+ require 'rspec/core/rake_task'
6
+ RSpec::Core::RakeTask.new(:spec) do |spec|
7
+ spec.pattern = 'spec/**/*_spec.rb'
8
+ end
9
+ rescue LoadError # rspec is not installed, so the :spec task is not defined
10
+ end
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "niwa_textream"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
@@ -0,0 +1,13 @@
1
+ require "niwa_textream/version"
2
+ require "mechanize"
3
+
4
+ %w[ models/category pages/top/top_page pages/category/category_page ].each do |file|
5
+ require "niwa_textream/#{ file }"
6
+ end
7
+
8
+ module NiwaTextream
9
+ def self.test
10
+ TopPage.new(Mechanize.new)
11
+ TopPage.url
12
+ end
13
+ end
@@ -0,0 +1,10 @@
1
+ module NiwaTextream
2
+ # Base class for scraped models; holds the page element the model was built from.
3
+ class BaseModel
4
+ @@elem = nil
5
+
6
+ # element of this model
7
+ attr_accessor :elem
8
+
9
+ end
10
+ end
@@ -0,0 +1,18 @@
1
+ require 'niwa_textream/models/base_model'
2
+
3
+ module NiwaTextream
4
+ # Contains Category object.
5
+ class Category < BaseModel
6
+ # category name, parent category, and number of threads
7
+ attr_accessor :name, :parent, :num_thread
8
+
9
+ @name = nil
10
+ # parent category object; nil for a top-level category
11
+ @parent = nil
12
+
13
+ @num_thread = -1
14
+
15
+ def initialize
16
+ end
17
+ end
18
+ end
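One thing that may be worth checking in the model above: assignments such as `@num_thread = -1` in a class body set instance variables on the class object itself, so new instances do not inherit them as defaults. The sketch below illustrates the difference; `CategorySketch` and `CategoryWithDefaults` are hypothetical names used only for this example.

```ruby
# Mirrors the style used above: the assignment runs in the class body.
class CategorySketch
  attr_accessor :name, :parent, :num_thread

  @num_thread = -1 # belongs to the class object, not to instances

  # Expose the class-level value so the difference is visible.
  def self.class_level_default
    @num_thread
  end
end

# One possible alternative: set per-instance defaults in initialize.
class CategoryWithDefaults
  attr_accessor :name, :parent, :num_thread

  def initialize(name: nil, parent: nil, num_thread: -1)
    @name = name
    @parent = parent
    @num_thread = num_thread
  end
end

p CategorySketch.new.num_thread        # nil
p CategorySketch.class_level_default   # -1
p CategoryWithDefaults.new.num_thread  # -1
```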
@@ -0,0 +1,16 @@
1
+ require 'niwa_textream/models/base_model'
2
+
3
+ module NiwaTextream
4
+ class Message < BaseModel
5
+ attr_accessor :message_number, :body, :posted_at
6
+ @message_number
7
+ @body
8
+ @posted_at
9
+
10
+ def initialize(elem, body, posted_at)
11
+ @elem = elem
12
+ @body = body
13
+ @posted_at = posted_at
14
+ end
15
+ end
16
+ end
@@ -0,0 +1,17 @@
1
+ require 'niwa_textream/models/base_model'
2
+
3
+ module NiwaTextream
4
+ class Thread < BaseModel
5
+ attr_accessor :title, :num_comment, :last_updated
6
+ @title
7
+ @num_comment
8
+ @last_updated
9
+
10
+ def initialize(elem, title, num_comment, last_updated)
11
+ @elem = elem
12
+ @title = title
13
+ @num_comment = num_comment
14
+ @last_updated = last_updated
15
+ end
16
+ end
17
+ end
@@ -0,0 +1,37 @@
1
+ require 'niwa_textream/pages/main/main_page'
2
+ require 'niwa_textream/pages/thread/thread_page'
3
+
4
+ module NiwaTextream
5
+ class CategoryPage < MainPage
6
+ @@url = "http://textream.yahoo.co.jp/category/%{category_id}"
7
+ @categories = nil
8
+ attr_accessor :categories
9
+
10
+ def initialize(mechanize)
11
+ super(mechanize)
12
+ setCategory
13
+ return self
14
+ end
15
+
16
+ # Build the name => Category map from the current page (parent categories are not set)
17
+ def setCategory
18
+ @categories = {}
19
+ @mechanize.page.search("//a[@class='cf']").each do |cat|
20
+ num_thread_with_bracket = cat.search('.//span')[0].inner_text
21
+ num_thread = num_thread_with_bracket.match(/\((\d+)\)/)[1].to_i
22
+ catObj = NiwaTextream::Category.new
23
+ catObj.elem = cat
24
+ catObj.name = cat.inner_text.match('(.+?)\((.+?)\)')[1]
25
+ catObj.num_thread = num_thread
26
+ @categories[catObj.name] = catObj
27
+ puts("--#CategoryPage#--")
28
+ puts(catObj.name)
29
+ end
30
+ end
31
+
32
+ def clickCategory(name)
33
+ @mechanize.click(@categories[name].elem)
34
+ return NiwaTextream::ThreadPage.new(@mechanize)
35
+ end
36
+ end
37
+ end
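`setCategory` above splits link text of the form `name(count)` with two separate regular expressions. A single anchored pattern can do both and return `nil` on unexpected input instead of raising. This is a sketch under the assumption that labels use ASCII parentheses, as the original patterns imply; `CATEGORY_LABEL` and `parse_category_label` are hypothetical names.

```ruby
# Name, optional whitespace, then a parenthesized integer at the end of the text.
CATEGORY_LABEL = /\A\s*(?<name>.+?)\s*\((?<count>\d+)\)\s*\z/m

# Returns [name, thread_count] or nil when the text does not match.
def parse_category_label(text)
  m = CATEGORY_LABEL.match(text)
  return nil unless m

  [m[:name], m[:count].to_i]
end

p parse_category_label('Stocks (1234)')  # ["Stocks", 1234]
p parse_category_label('A (b) (12)')     # ["A (b)", 12]
p parse_category_label('no count here')  # nil
```

Returning `nil` lets the caller skip a malformed link rather than abort the whole page.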
@@ -0,0 +1,28 @@
1
+ module NiwaTextream
2
+ class MainPage
3
+ attr_accessor :url
4
+
5
+ @@url = nil
6
+ # mechanize object
7
+ @mechanize = nil
8
+ # header object
9
+ @header = nil
10
+ # body object
11
+ @body = nil
12
+
13
+ @page = nil
14
+
15
+ def initialize(mechanize)
16
+ @mechanize = mechanize
17
+ end
18
+
19
+ # Navigate the given Mechanize agent to this page's URL
20
+ def self.goTo(mechanize)
21
+ @page = mechanize.get(@@url)
22
+ end
23
+
24
+ def self.url
25
+ @@url
26
+ end
27
+ end
28
+ end
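A note on `@@url`: Ruby class variables are shared across an inheritance hierarchy, so when `MainPage` declares `@@url` and subclasses such as `CategoryPage` and `ThreadPage` assign it, they all write to the same variable and the last file loaded wins. The sketch below shows the effect and one possible alternative using class-level instance variables. All class names, the `page_url` macro, and the `example.invalid` URLs are hypothetical.

```ruby
# Pitfall: a class variable declared in the base class is shared by every subclass.
class SharedBase
  @@url = nil

  def self.url
    @@url
  end
end

class SharedCategoryPage < SharedBase
  @@url = 'http://example.invalid/category'
end

class SharedThreadPage < SharedBase
  @@url = 'http://example.invalid/thread' # overwrites the value set above
end

# Alternative: keep the URL in an instance variable of each class object.
class PerClassBase
  class << self
    attr_reader :url

    # Hypothetical macro: stores the URL on the class that calls it.
    def page_url(value)
      @url = value
    end
  end
end

class PerClassCategoryPage < PerClassBase
  page_url 'http://example.invalid/category'
end

class PerClassThreadPage < PerClassBase
  page_url 'http://example.invalid/thread'
end

p SharedCategoryPage.url    # "http://example.invalid/thread"
p PerClassCategoryPage.url  # "http://example.invalid/category"
p PerClassThreadPage.url    # "http://example.invalid/thread"
```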
@@ -0,0 +1,45 @@
1
+ require 'niwa_textream/pages/main/main_page'
2
+ require 'niwa_textream/pages/message/message_header'
3
+ require 'niwa_textream/models/message'
4
+
5
+ module NiwaTextream
6
+ class MessageHeader < MainPage
7
+ # @@url = "http://textream.yahoo.co.jp/category/%{category_id}"
8
+ attr_accessor :bar, :prevBtn, :nextBtn
9
+ @bar = nil
10
+ @prevBtn = nil
11
+ @nextBtn = nil
12
+
13
+ def initialize(mechanize)
14
+ super(mechanize)
15
+ setTopBg
16
+ return self
17
+ end
18
+
19
+ def setTopBg
20
+ @messages = []
21
+ @bar = @mechanize.page.search("//div[@id='toppg']")
22
+ @prevBtn = @bar.search(".//li[@class='prev']/a")[0]
23
+ @nextBtn = @bar.search(".//li[@class='next']/a")[0]
24
+ end
25
+
26
+ def prevPageAvail?
27
+ !@prevBtn.nil?
28
+ end
29
+
30
+ def nextPageAvail?
31
+ !@nextBtn.nil?
32
+ end
33
+
34
+ def clickPrevButton
35
+ @mechanize.click(@prevBtn)
36
+ return MessagePage.new(@mechanize)
37
+ end
38
+
39
+ def clickNextButton
40
+ @mechanize.click(@nextBtn)
41
+ return MessagePage.new(@mechanize)
42
+ end
43
+ end
44
+ end
45
+
@@ -0,0 +1,34 @@
1
+ require 'niwa_textream/pages/main/main_page'
2
+ require 'niwa_textream/pages/thread/thread_page'
3
+ require 'niwa_textream/utils/TimeUtil'
4
+ require 'niwa_textream/models/message'
5
+ require 'niwa_textream/pages/message/message_header'
6
+
7
+ module NiwaTextream
8
+ class MessagePage < MainPage
9
+ # @@url = "http://textream.yahoo.co.jp/category/%{category_id}"
10
+ @messages = nil
11
+ @message_header = nil
12
+ attr_accessor :messages, :message_header
13
+
14
+ def initialize(mechanize)
15
+ super(mechanize)
16
+ @message_header = MessageHeader.new(mechanize)
17
+ setMessages
18
+ return self
19
+ end
20
+
21
+ def setMessages
22
+ @messages = []
23
+ @mechanize.page.search("//ul[@class='commentList']//div[@class='comment']").each do |message|
24
+ message_id = message['data-comment']
25
+ body = message.search(".//p[@class='comText']")[0].inner_text()
26
+ posted_at_str = message.search(".//p[@class='comWriter']/span/a").inner_text()
27
+ posted_at = NiwaTextream::TimeUtil.getDateTime(posted_at_str)
28
+ messageObj = NiwaTextream::Message.new(message, body, posted_at)
29
+ @messages.push(messageObj)
30
+ puts("#{body}, #{posted_at}")
31
+ end
32
+ end
33
+ end
34
+ end
@@ -0,0 +1,64 @@
1
+ require 'niwa_textream/pages/main/main_page'
2
+ require 'niwa_textream/models/thread'
3
+ require 'niwa_textream/pages/message/message_page'
4
+
5
+ module NiwaTextream
6
+ # Page object for a category's thread list
7
+ class ThreadPage < MainPage
8
+ @@url = "http://textream.yahoo.co.jp/thread/%{category_id}"
9
+ @threads = nil
10
+ @next_page_elem
11
+ @prev_page_elem
12
+ attr_accessor :threads, :prev_page_elem, :next_page_elem
13
+
14
+ def initialize(mechanize)
15
+ super(mechanize)
16
+ setThreads
17
+ setNextPageElem
18
+ setPrevPageElem
19
+ return self
20
+ end
21
+
22
+ def setThreads
23
+ @threads = []
24
+ @mechanize.page.search("//*[@id='trdlst']//dl[@class='cf']").each do |thread|
25
+ thread_title_elem = thread.search(".//a[@data-sec='trdlst']")[0]
26
+ last_updated = DateTime.parse(thread.search(".//li[@class='time bold']")[0].inner_text())
27
+ num_comment = thread.search(".//*[@class='commentCount']").inner_text().to_i
28
+ @threads.push(NiwaTextream::Thread.new(thread_title_elem, thread_title_elem.inner_text(), num_comment, last_updated))
29
+ puts("#{thread_title_elem.inner_text()}, #{num_comment}, #{last_updated}")
30
+ end
31
+ end
32
+
33
+ def setNextPageElem
34
+ @next_page_elem = @mechanize.page.search("//*[@class='btnNext']/a")[0]
35
+ end
36
+
37
+ def setPrevPageElem
38
+ @prev_page_elem = @mechanize.page.search("//*[@class='btnPrev']/a")[0]
39
+ end
40
+
41
+ def prevPageAvail?
42
+ !@prev_page_elem.nil?
43
+ end
44
+
45
+ def nextPageAvail?
46
+ !@next_page_elem.nil?
47
+ end
48
+
49
+ def clickNextPage
50
+ @mechanize.click(@next_page_elem)
51
+ return NiwaTextream::ThreadPage.new(@mechanize)
52
+ end
53
+
54
+ def clickPrevPage
55
+ @mechanize.click(@prev_page_elem)
56
+ return NiwaTextream::ThreadPage.new(@mechanize)
57
+ end
58
+
59
+ def clickThread(thread)
60
+ @mechanize.click(thread.elem)
61
+ return MessagePage.new(@mechanize)
62
+ end
63
+ end
64
+ end
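The `nextPageAvail?` and `clickNextPage` methods above suggest a simple way to walk every page of a thread list. The following is a minimal sketch of that loop, exercised against a stub rather than a live site. `StubThreadPage`, `collect_thread_pages`, and the `max_pages` cap are assumptions made for this example and are not part of the gem.

```ruby
# Hypothetical stand-in exposing the same navigation methods as ThreadPage.
class StubThreadPage
  attr_reader :threads

  def initialize(pages, index = 0)
    @pages = pages
    @index = index
    @threads = pages[index]
  end

  def nextPageAvail?
    @index < @pages.length - 1
  end

  def clickNextPage
    StubThreadPage.new(@pages, @index + 1)
  end
end

# Follow "next" links from first_page, stopping at max_pages as a safety cap.
def collect_thread_pages(first_page, max_pages: 50)
  pages = [first_page]
  while pages.last.nextPageAvail? && pages.length < max_pages
    pages << pages.last.clickNextPage
  end
  pages
end

first = StubThreadPage.new([%w[a b], %w[c], %w[d e]])
titles = collect_thread_pages(first).flat_map(&:threads)
p titles  # ["a", "b", "c", "d", "e"]
```

Checking `nextPageAvail?` before each click matters because the click methods above pass a `nil` element to Mechanize when no link is present.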