zhSieve 0.2.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 09ed25603348589c1d6143e3b2abbd385d14868c
4
+ data.tar.gz: c34c4681fc7c62385e4216cb4122e16d9ada77eb
5
+ SHA512:
6
+ metadata.gz: 597563b7b3c9b360ed099bf48ac54e36146afeeea869286847abf71d98de2b5f5e6366f1bd0be18707cae9e40d9531911374782c398ec4282dc9ffb6d77334c1
7
+ data.tar.gz: c87182ee1d852b11301c7117f7cfe24891ecd8de7ffa9cb7f318162fa12220b8d8029ea345473f28466b25cd37819ccc43c90574304f33c84c671ccedb9def73
data/.gitignore ADDED
@@ -0,0 +1,14 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
10
+
11
+ # ignore local testing files
12
+ localbackup
13
+ cookies.txt
14
+ /testfile
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --format documentation
2
+ --color
data/.travis.yml ADDED
@@ -0,0 +1,5 @@
1
+ sudo: false
2
+ language: ruby
3
+ rvm:
4
+ - 2.1.2
5
+ before_install: gem install bundler -v 1.13.6
data/CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at wxz138@case.edu. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [http://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: http://contributor-covenant.org
74
+ [version]: http://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in zhSieve.gemspec
4
+ gemspec
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2016 Wei Zhu
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,41 @@
1
+ # zhSieve
2
+ A Ruby-based Zhihu content crawler
3
+
4
+ [![license](https://img.shields.io/github/license/mashape/apistatus.svg)](https://github.com/gwzz/zhSieve/blob/master/LICENSE)
5
+ [![Gem Version](https://badge.fury.io/rb/zhSieve.svg)](https://badge.fury.io/rb/zhSieve)
6
+ ###
7
+ Status: work in progress :construction:
8
+
9
+ ### Prerequisites
10
+ Create a cookies.txt file (Mozilla cookies.txt format) and put it in the root folder.
11
+
12
+ ### Usage
13
+ Crawl a specific answer and output it in Markdown format.
14
+ ```shell
15
+ $ zhSieve answer -q 'Your Question Id' -a 'Your Answer Id'
16
+ ```
17
+ Crawl a specific Zhuanlan article and output it in Markdown format.
18
+ ```shell
19
+ $ zhSieve article -z 'Your Article Id'
20
+ ```
21
+ ### TO-DO List
22
+
23
+ - [ ] Question :rotating_light:
24
+ - [ ] Answer :checkered_flag:
25
+ - Use question_id and answer_id to crawl the page information, and output a Markdown file with ["question title and link", "author's avatar, name, biography", "answer content"].
26
+ - Testing:warning:!
27
+ - [ ] People :construction:
28
+ - [ ] ZhuanLan, contains two components
29
+ - 1. Crawl a single article :checkered_flag:
30
+ - 2. Crawl someone's zhuanlan category :rotating_light:
31
+
32
+
33
+ ## Contributing
34
+
35
+ Bug reports and pull requests are welcome on GitHub at https://github.com/gwzz/zhSieve. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
36
+
37
+
38
+ ## License
39
+
40
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
41
+
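The prerequisite in the README asks for a Mozilla-style cookies.txt. As a rough illustration of that format, the sketch below parses one line of a Netscape/Mozilla cookie file, which stores seven tab-separated fields per cookie (domain, subdomain flag, path, secure flag, expiry as Unix time, name, value). The helper `parse_cookie_line` and the sample cookie are hypothetical and not part of zhSieve.

```ruby
# Hypothetical helper (not part of zhSieve): parse one line of a
# Netscape/Mozilla-style cookies.txt file into a Hash.
COOKIE_FIELDS = %i[domain include_subdomains path secure expires name value].freeze

def parse_cookie_line(line)
  stripped = line.chomp
  # Skip blank lines and comment lines such as "# Netscape HTTP Cookie File".
  return nil if stripped.empty? || stripped.start_with?("#")

  parts = stripped.split("\t", -1)
  return nil unless parts.size == COOKIE_FIELDS.size

  cookie = COOKIE_FIELDS.zip(parts).to_h
  cookie[:include_subdomains] = cookie[:include_subdomains] == "TRUE"
  cookie[:secure] = cookie[:secure] == "TRUE"
  cookie[:expires] = Integer(cookie[:expires])
  cookie
end

# Made-up sample line; real values come from a browser export.
sample = [".zhihu.com", "TRUE", "/", "FALSE", "1500000000",
          "example_name", "example_value"].join("\t")
cookie = parse_cookie_line(sample)
puts cookie[:domain]
puts cookie[:name]
```

A check like this can help confirm that an exported file has the expected shape before handing it to the crawler.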
data/Rakefile ADDED
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
data/bin/console ADDED
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "zhSieve"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start
data/bin/setup ADDED
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
data/exe/zhSieve ADDED
@@ -0,0 +1,3 @@
1
+ #!/usr/bin/env ruby
2
+ require 'zhSieve/cli'
3
+ ZhSieve::CLI.start
data/lib/zhSieve.rb ADDED
@@ -0,0 +1,58 @@
1
+ require "zhSieve/version"
2
+ require "zhSieve/htmlpage"
3
+ require "zhSieve/html2md"
4
+ require 'mechanize'
5
+
6
+ module ZhSieve
7
+ BASE_URL = "https://www.zhihu.com"
8
+ ZL_URI = "https://zhuanlan.zhihu.com/api/posts/"
9
+ def self.crawl_answer(options)
10
+ question_id = "#{options[:question_id]}"
11
+ question_uri = "/question/#{question_id}"
12
+ answer_id = "#{options[:answer_id]}"
13
+ answer_uri = "#answer-#{answer_id}"
14
+ search_uri = "#{BASE_URL}#{question_uri}#{answer_uri}"
15
+ agent = Mechanize.new
16
+ agent.user_agent = 'Chrome/53.0.2785.143'
17
+ agent.max_history = 1
18
+ # Dir.chdir(File.dirname(__FILE__))
19
+ agent.cookie_jar.load_cookiestxt("./cookies.txt")
20
+ search_page = agent.get("#{search_uri}")
21
+ HTMLPage.new(contents: search_page, question_id: question_id, answer_id: answer_id).answerMarkdown
22
+ end
23
+
24
+ def self.crawl_zl_people(options)
25
+
26
+ end
27
+
28
+ def self.crawl_zl_article(options)
29
+ article_id = "#{options[:article_id]}"
30
+ search_uri = "#{ZL_URI}#{article_id}"
31
+ agent = Mechanize.new
32
+ agent.user_agent = 'Chrome/53.0.2785.143'
33
+ agent.max_history = 1
34
+ # Dir.chdir(File.dirname(__FILE__))
35
+ agent.cookie_jar.load_cookiestxt("./cookies.txt")
36
+ search_page = agent.get("#{search_uri}")
37
+ HTMLPage.new(contents: search_page, article_id: article_id).articleMarkdown
38
+ end
39
+ # BASE_URL = "https://www.zhihu.com"
40
+ # question_id = "43830406"
41
+ # question_uri = "/question/#{question_id}"
42
+ # answer_id = "46959669"
43
+ # answer_uri = "#answer-#{answer_id}"
44
+ # PEOPLE_id = "/people/"
45
+ # LIST_URL = "#{BASE_URL}/question/51727516#answer-46414810"
46
+ # HEADERS_HASH = {"User-Agent" => "Chrome/53.0.2785.143"}
47
+
48
+ # arguments = ARGV
49
+
50
+ # agent = Mechanize.new
51
+ # agent.user_agent = 'Chrome/53.0.2785.143'
52
+ # agent.max_history = 1
53
+ # Dir.chdir(File.dirname(__FILE__))
54
+ # agent.cookie_jar.load_cookiestxt("../cookies.txt")
55
+ # # search_page = agent.get(BASE_URL+PEOPLE_URL)
56
+ # search_page = agent.get("https://www.zhihu.com#{question_uri}#{answer_uri}")
57
+ # haha = HTMLPage.new(contents:search_page,question_id:question_id,answer_id:answer_id)
58
+ end
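Both crawl methods in this file build their target URL from the two constants before fetching it. A minimal sketch of that URL construction in isolation, with no network access, is shown here. The module name `ZhSieveUrls` and its method names are hypothetical; the question and answer ids are the ones that appear in the commented-out code above, and the article id is made up.

```ruby
# Sketch only: module and method names are hypothetical, not the gem's API.
module ZhSieveUrls
  BASE_URL = "https://www.zhihu.com"
  ZL_URI = "https://zhuanlan.zhihu.com/api/posts/"

  # Question page URL with a fragment that points at a single answer.
  def self.answer_url(question_id, answer_id)
    "#{BASE_URL}/question/#{question_id}#answer-#{answer_id}"
  end

  # Zhuanlan API endpoint for a single article.
  def self.article_url(article_id)
    "#{ZL_URI}#{article_id}"
  end
end

puts ZhSieveUrls.answer_url("43830406", "46959669")
puts ZhSieveUrls.article_url("12345") # made-up article id
```

Keeping the string building in small pure functions like these would make it testable without a cookie file or a live request.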
data/lib/zhSieve/answer.rb ADDED
@@ -0,0 +1,23 @@
1
+ class Answer
2
+ attr_accessor :author,:bio,:avatar,:author_link,:content,:question,:link
3
+
4
+ def initialize
5
+ @author = 'NA'
6
+ @bio = 'NA'
7
+ @avatar = 'NA'
8
+ @link = 'NA'
9
+ @content = 'NA'
10
+ @question = 'NA'
11
+ @author_link = 'NA'
12
+ end
13
+
14
+ def format_markdown
15
+ res_file = "## " + "[#{@question}](" + @link + ")" + "\n"
16
+ res_file += "### "+@avatar + "[#{@author}](" + @author_link + ")" + "\n"
17
+ res_file += "#### "+@bio + "\n"
18
+ res_file += "*****\n"
19
+ res_file += @content
20
+ res_file
21
+ end
22
+
23
+ end
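The `author` and `author_link` fields of `Answer` are filled elsewhere in the gem by splitting a rendered Markdown link on brackets and parentheses and picking elements by index. As an alternative sketch, the same two values can be pulled out with a named-capture regular expression. The helper `parse_md_link` and the sample author are hypothetical.

```ruby
# Hypothetical helper: extract the text and target of a Markdown link.
MD_LINK = /\[(?<text>[^\]]*)\]\((?<href>[^)]*)\)/

def parse_md_link(markdown)
  match = MD_LINK.match(markdown)
  return nil unless match

  { text: match[:text], href: match[:href] }
end

# Made-up sample; a real value would come from the parsed answer page.
link = parse_md_link("[Example Author](/people/example-author)")
author = link[:text]
author_link = "https://www.zhihu.com" + link[:href]
puts author
puts author_link
```

Returning `nil` for input without a link makes a missing author explicit instead of raising on a `nil` index later.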
data/lib/zhSieve/article.rb ADDED
@@ -0,0 +1,26 @@
1
+ class Article
2
+ attr_accessor :author,:bio,:avatar,:author_link,:content,:link,:title,:title_image,:published_time
3
+
4
+ def initialize
5
+ @author = 'NA'
6
+ @bio = 'NA'
7
+ @avatar = 'NA'
8
+ @link = 'NA'
9
+ @content = 'NA'
10
+ @author_link = 'NA'
11
+ @title = 'NA'
12
+ @title_image = 'NA'
13
+ @published_time = 'NA'
14
+ end
15
+
16
+ def format_markdown
17
+ res_file = "## [#{@title}](#{@link})" + "\n"
18
+ res_file += "### "+@avatar + "[#{@author}](" + @author_link + ")" + "\n"
19
+ res_file += "#### "+ @bio + "\n"
20
+ res_file += "*****\n"
21
+ res_file += @title_image
22
+ res_file += @content
23
+ res_file
24
+ end
25
+
26
+ end
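The `avatar` value that `Article#format_markdown` prints is built by the converter from a URL template in the article JSON, where `{id}` and `{size}` placeholders are substituted. A small sketch of that substitution follows, assuming a payload with only the fields involved. The payload, the host name, and the helper `avatar_markdown` are made up for illustration.

```ruby
require 'json'

# Made-up payload that mimics only the fields read for the avatar.
raw = JSON.generate(
  "author" => {
    "name" => "Example Author",
    "avatar" => {
      "id" => "abc123",
      "template" => "https://pic.example.com/{id}_{size}.jpg"
    }
  }
)

# Hypothetical helper: fill the avatar URL template and wrap it as a
# Markdown image.
def avatar_markdown(raw_json)
  avatar = JSON.parse(raw_json)["author"]["avatar"]
  uri = avatar["template"].gsub("{size}", "s").gsub("{id}", avatar["id"].to_s)
  "![](#{uri})"
end

puts avatar_markdown(raw)
```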
data/lib/zhSieve/cli.rb ADDED
@@ -0,0 +1,25 @@
1
+ require 'thor'
2
+ require 'zhSieve'
3
+
4
+ module ZhSieve
5
+ class CLI < Thor
6
+ desc "version", "Show current version"
7
+ def version
8
+ puts ZhSieve::VERSION
9
+ end
10
+
11
+ desc "answer", "Set the question id [Q], answer id [A]"
12
+ option :q, :required => true
13
+ option :a, :required => true
14
+ def answer
15
+ puts ZhSieve.crawl_answer(question_id:options[:q],answer_id:options[:a])
16
+ end
17
+
18
+ desc "article", "Set the zhuanlan article id [Z]"
19
+ option :z, :required => true
20
+ def article
21
+ puts ZhSieve.crawl_zl_article(article_id:options[:z])
22
+ end
23
+
24
+ end
25
+ end
data/lib/zhSieve/html2md.rb ADDED
@@ -0,0 +1,135 @@
1
+ require 'nokogiri'
2
+ require 'json'
3
+ require_relative 'people'
4
+ require_relative 'answer'
5
+ require_relative 'article'
6
+
7
+ module ZhSieve
8
+ module Converter
9
+ def to_markdown string_contents
10
+ raise NoContents unless string_contents!=nil
11
+ doc = Nokogiri::HTML(string_contents,'UTF-8')
12
+ doc.children.map { |ele| parse_element(ele) }.join
13
+ end
14
+
15
+ def people_to_markdown string_contents
16
+ raise NoContents unless string_contents!=nil
17
+ doc = Nokogiri::HTML(string_contents,'UTF-8')
18
+ search_user = People.new
19
+ # set people info
20
+ end
21
+
22
+ def answer_to_markdown string_contents,question_id,answer_id
23
+ raise NoContents unless string_contents!=nil
24
+ doc = Nokogiri::HTML(string_contents,'UTF-8')
25
+ answer_node = doc.search '[data-aid="'+answer_id.to_s+'"]'
26
+ search_answer = Answer.new
27
+ # search and set question information
28
+ avatar_raw = answer_node.search '[class="zm-list-avatar avatar"]'
29
+ author_raw = answer_node.search '[class="author-link"]'
30
+ bio_raw = answer_node.search '[class="bio"]'
31
+ content_raw = answer_node.search '[class="zm-editable-content clearfix"]'
32
+ search_answer.question = doc.title.strip
33
+ search_answer.link = "https://www.zhihu.com/question/#{question_id}#answer-#{answer_id}"
34
+ search_answer.avatar= parse_element(avatar_raw.first)
35
+ search_answer.bio = parse_element(bio_raw.first)
36
+ search_answer.content = parse_element(content_raw.first)
37
+ author_info = parse_element(author_raw.first).split(/[\[\]\(\)]/)
38
+ search_answer.author = author_info[1]
39
+ search_answer.author_link = "https://www.zhihu.com" + author_info[3]
40
+ markdown_text = search_answer.format_markdown
41
+ end
42
+
43
+ def article_to_markdown string_contents,article_id
44
+ raise NoContents unless string_contents!=nil
45
+ doc_hash = JSON.parse(string_contents)
46
+ search_article = Article.new
47
+ search_article.title = doc_hash["title"]
48
+ search_article.title_image = "![](#{doc_hash["titleImage"]})"
49
+ search_article.published_time = doc_hash['publishedTime']
50
+ search_article.link = "https://zhuanlan.zhihu.com/p/#{article_id}"
51
+ search_article.content = doc_hash["content"]
52
+ search_article.author = doc_hash["author"]["name"]
53
+ search_article.bio = doc_hash["author"]["bio"] || "No Bio!"
54
+ search_article.author_link = doc_hash["author"]["profileUrl"]
55
+ avatar_template = doc_hash["author"]["avatar"]["template"].gsub('{size}','s')
56
+ avatar_id = doc_hash["author"]["avatar"]["id"]
57
+ avatar_uri = avatar_template.gsub('{id}', "#{avatar_id}")
58
+ search_article.avatar = "![](#{avatar_uri})"
59
+ markdown_text = search_article.format_markdown
60
+ end
61
+
62
+ def parse_element(ele)
63
+ if ele.is_a? Nokogiri::XML::Text
64
+ return "#{ele.text}\n"
65
+ else
66
+ if (children = ele.children).count > 0
67
+ return wrap_node(ele, children.map { |child| parse_element(child) }.join)
68
+ else
69
+ return wrap_node(ele,ele.text)
70
+ end
71
+ end
72
+ end
73
+
74
+ # wrap node with markdown
75
+ def wrap_node(node,contents=nil)
76
+ result = ''
77
+ contents.strip! unless contents==nil
78
+ # check whether a custom parser exists for this node
79
+ if respond_to? "parse_#{node.name}"
80
+ return self.send("parse_#{node.name}",node,contents)
81
+ end
82
+ # skip hidden node
83
+ return '' if node['style'] and node['style'] =~ /display:\s*none/
84
+ # default parse
85
+ case node.name.downcase
86
+ when 'i'
87
+ when 'script'
88
+ when 'style'
89
+ when 'li'
90
+ result << "* #{contents}\n"
91
+ when 'blockquote'
92
+ contents.split("\n").each do |part|
93
+ result << ">#{part}\n"
94
+ end
95
+ when 'p'
96
+ result << "\n#{contents}\n"
97
+ when 'strong'
98
+ result << "**#{contents}**\n"
99
+ when 'h1'
100
+ result << "# #{contents}\n"
101
+ when 'h2'
102
+ result << "## #{contents}\n"
103
+ when 'h3'
104
+ result << "### #{contents}\n"
105
+ when 'hr'
106
+ result << "****\n"
107
+ when 'br'
108
+ result << "\n"
109
+ when 'img'
110
+ result << "![#{node['alt']}](#{node['src']})"
111
+ when 'a'
112
+ result << "[#{contents}](#{node['href']})"
113
+ else
114
+ result << contents unless contents == nil
115
+ end
116
+ result
117
+ end
118
+
119
+ # define custom node processor
120
+ def method_missing(name,*args,&block)
121
+ self.class.send :define_method,"parse_#{name}" do |node,contents|
122
+ block.call node,contents
123
+ end
124
+ end
125
+
126
+ def debug
127
+ puts '----------------------------------'
128
+ puts yield
129
+ puts '----------------------------------'
130
+ end
131
+
132
+ end
133
+
134
+ class NoContents < StandardError; end
135
+ end
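The `blockquote` branch of `wrap_node` is meant to prefix each line of the quoted text with `>`. A minimal sketch of that per-line quoting follows, assuming the contents arrive as one newline-separated string. The helper name `quote_lines` is hypothetical, and the sketch emits `> ` with a trailing space.

```ruby
# Hypothetical helper: prefix every line of a block with "> ".
def quote_lines(contents)
  contents.split("\n").map { |part| "> #{part}\n" }.join
end

text = "first line\nsecond line"
puts quote_lines(text)

# In Ruby, single-quoted '\n' is a backslash followed by the letter n,
# so it does not match real line breaks; double-quoted "\n" does.
puts text.split('\n').length
puts text.split("\n").length
```

The two `split` calls at the end show why the quoting style of the separator matters when breaking text into lines.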
data/lib/zhSieve/htmlpage.rb ADDED
@@ -0,0 +1,39 @@
1
+ require_relative 'html2md'
2
+ module ZhSieve
3
+ class HTMLPage
4
+ include ZhSieve::Converter
5
+ attr_accessor :contents,:question_id,:answer_id,:article_id
6
+
7
+ def initialize(options)
8
+ @contents = options[:contents].body
9
+ @question_id = options[:question_id]
10
+ @answer_id = options[:answer_id]
11
+ @article_id = options[:article_id]
12
+ end
13
+
14
+ def peopleMarkdown
15
+ @markdown = people_to_markdown(@contents)
16
+ end
17
+
18
+ def answerMarkdown
19
+ @markdown = answer_to_markdown(@contents,@question_id,@answer_id)
20
+ end
21
+
22
+ def articleMarkdown
23
+ @markdown = article_to_markdown(@contents,@article_id)
24
+ end
25
+
26
+ def to_html
27
+ @html = @contents
28
+ end
29
+
30
+ def markdown
31
+ @markdown ||= markdown!
32
+ end
33
+
34
+ def markdown!
35
+ @markdown = to_markdown(@contents)
36
+ end
37
+
38
+ end
39
+ end
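The `markdown` / `markdown!` pair in `HTMLPage` follows a common Ruby memoization idiom: the plain method returns a cached result, and the bang method forces a fresh conversion. A self-contained sketch of the idiom follows, with a stand-in conversion instead of the real HTML parsing. The class `CachedPage` and its counter are hypothetical.

```ruby
# Sketch with a stand-in converter; this class is hypothetical.
class CachedPage
  attr_reader :conversions

  def initialize(contents)
    @contents = contents
    @conversions = 0
  end

  # Returns the cached result, converting only on first use.
  def markdown
    @markdown ||= markdown!
  end

  # Always reconverts and refreshes the cache.
  def markdown!
    @conversions += 1
    @markdown = @contents.upcase # stand-in for the real conversion
  end
end

page = CachedPage.new("hello")
page.markdown
page.markdown
puts page.conversions
page.markdown!
puts page.conversions
```

One caveat of `||=`: a conversion that legitimately returns `nil` or `false` would be repeated on every call.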
data/lib/zhSieve/people.rb ADDED
@@ -0,0 +1,14 @@
1
+ class People
2
+ attr_accessor :name,:bio,:avatar,:location,:business,:education,:detail
3
+
4
+ def initialize
5
+ @name = 'NA'
6
+ @avatar = 'NA'
7
+ @location = 'NA'
8
+ @business = 'NA'
9
+ @education = 'NA'
10
+ @detail = 'NA'
11
+ @bio = 'NA'
12
+ end
13
+
14
+ end
data/lib/zhSieve/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module ZhSieve
2
+ VERSION = "0.2.0"
3
+ end
data/lib/zhSieve/zhuanlan.rb ADDED
@@ -0,0 +1,12 @@
1
+ class Zhuanlan
2
+ attr_accessor :author,:bio,:avatar,:author_link,:content,:link
3
+
4
+ def initialize
5
+
6
+ end
7
+
8
+ def format_markdown
9
+
10
+ end
11
+
12
+ end
data/zhSieve.gemspec ADDED
@@ -0,0 +1,41 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'zhSieve/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "zhSieve"
8
+ spec.version = ZhSieve::VERSION
9
+ spec.authors = ["Wei Zhu"]
10
+ spec.email = ["zhucyw@gmail.com"]
11
+
12
+ spec.summary = "A ruby based zhihu content crawler."
13
+ spec.description = "A ruby based zhihu content crawler."
14
+ spec.homepage = "https://github.com/gwzz/zhSieve"
15
+ spec.license = "MIT"
16
+
17
+ # Prevent pushing this gem to RubyGems.org. To allow pushes either set the 'allowed_push_host'
18
+ # to allow pushing to a single host or delete this section to allow pushing to any host.
19
+ if spec.respond_to?(:metadata)
20
+ spec.metadata['allowed_push_host'] = "https://rubygems.org"
21
+ else
22
+ raise "RubyGems 2.0 or newer is required to protect against " \
23
+ "public gem pushes."
24
+ end
25
+
26
+ spec.files = `git ls-files -z`.split("\x0").reject do |f|
27
+ f.match(%r{^(test|spec|features)/})
28
+ end
29
+ spec.bindir = "exe"
30
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
31
+ spec.require_paths = ["lib"]
32
+
33
+ spec.add_development_dependency "bundler", "~> 1.13"
34
+ spec.add_development_dependency "rake", "~> 10.0"
35
+ spec.add_development_dependency "rspec", "~> 3.0"
36
+ spec.add_development_dependency "cucumber"
37
+ spec.add_development_dependency "aruba"
38
+ spec.add_dependency "activesupport", "~> 4.2.0"
39
+ spec.add_dependency "thor"
40
+ spec.add_dependency "mechanize"
41
+ end
metadata ADDED
@@ -0,0 +1,179 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: zhSieve
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.2.0
5
+ platform: ruby
6
+ authors:
7
+ - Wei Zhu
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2016-11-04 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: bundler
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '1.13'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '1.13'
27
+ - !ruby/object:Gem::Dependency
28
+ name: rake
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '10.0'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '10.0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: rspec
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '3.0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '3.0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: cucumber
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - ">="
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ - !ruby/object:Gem::Dependency
70
+ name: aruba
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - ">="
74
+ - !ruby/object:Gem::Version
75
+ version: '0'
76
+ type: :development
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - ">="
81
+ - !ruby/object:Gem::Version
82
+ version: '0'
83
+ - !ruby/object:Gem::Dependency
84
+ name: activesupport
85
+ requirement: !ruby/object:Gem::Requirement
86
+ requirements:
87
+ - - "~>"
88
+ - !ruby/object:Gem::Version
89
+ version: 4.2.0
90
+ type: :runtime
91
+ prerelease: false
92
+ version_requirements: !ruby/object:Gem::Requirement
93
+ requirements:
94
+ - - "~>"
95
+ - !ruby/object:Gem::Version
96
+ version: 4.2.0
97
+ - !ruby/object:Gem::Dependency
98
+ name: thor
99
+ requirement: !ruby/object:Gem::Requirement
100
+ requirements:
101
+ - - ">="
102
+ - !ruby/object:Gem::Version
103
+ version: '0'
104
+ type: :runtime
105
+ prerelease: false
106
+ version_requirements: !ruby/object:Gem::Requirement
107
+ requirements:
108
+ - - ">="
109
+ - !ruby/object:Gem::Version
110
+ version: '0'
111
+ - !ruby/object:Gem::Dependency
112
+ name: mechanize
113
+ requirement: !ruby/object:Gem::Requirement
114
+ requirements:
115
+ - - ">="
116
+ - !ruby/object:Gem::Version
117
+ version: '0'
118
+ type: :runtime
119
+ prerelease: false
120
+ version_requirements: !ruby/object:Gem::Requirement
121
+ requirements:
122
+ - - ">="
123
+ - !ruby/object:Gem::Version
124
+ version: '0'
125
+ description: A ruby based zhihu content crawler.
126
+ email:
127
+ - zhucyw@gmail.com
128
+ executables:
129
+ - zhSieve
130
+ extensions: []
131
+ extra_rdoc_files: []
132
+ files:
133
+ - ".gitignore"
134
+ - ".rspec"
135
+ - ".travis.yml"
136
+ - CODE_OF_CONDUCT.md
137
+ - Gemfile
138
+ - LICENSE.txt
139
+ - README.md
140
+ - Rakefile
141
+ - bin/console
142
+ - bin/setup
143
+ - exe/zhSieve
144
+ - lib/zhSieve.rb
145
+ - lib/zhSieve/answer.rb
146
+ - lib/zhSieve/article.rb
147
+ - lib/zhSieve/cli.rb
148
+ - lib/zhSieve/html2md.rb
149
+ - lib/zhSieve/htmlpage.rb
150
+ - lib/zhSieve/people.rb
151
+ - lib/zhSieve/version.rb
152
+ - lib/zhSieve/zhuanlan.rb
153
+ - zhSieve.gemspec
154
+ homepage: https://github.com/gwzz/zhSieve
155
+ licenses:
156
+ - MIT
157
+ metadata:
158
+ allowed_push_host: https://rubygems.org
159
+ post_install_message:
160
+ rdoc_options: []
161
+ require_paths:
162
+ - lib
163
+ required_ruby_version: !ruby/object:Gem::Requirement
164
+ requirements:
165
+ - - ">="
166
+ - !ruby/object:Gem::Version
167
+ version: '0'
168
+ required_rubygems_version: !ruby/object:Gem::Requirement
169
+ requirements:
170
+ - - ">="
171
+ - !ruby/object:Gem::Version
172
+ version: '0'
173
+ requirements: []
174
+ rubyforge_project:
175
+ rubygems_version: 2.2.2
176
+ signing_key:
177
+ specification_version: 4
178
+ summary: A ruby based zhihu content crawler.
179
+ test_files: []