mercury_parser 0.0.1

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 5cf8578d07811493b1de6baad3ddda00253d7185
4
+ data.tar.gz: bc312548d3bb45dc59f7d947df13f903305646f3
5
+ SHA512:
6
+ metadata.gz: 91875c69c130eb3ff1df8fb94a757c55ca3c6115f1c7a0c37430cc5f3200d8e008775bd4d1d68b4fe2129e1f320196320eba71ea60da6bca2d07bc9107b97fdc
7
+ data.tar.gz: f258abab55bb8663208cb7b9f4b4a04cc08dd66a9dc98523ec2924f665d9da2b8d663b39a3c67fe151545843dd555fc611f6b47ae7924865cbb206d4f642f8b6
checksums.yaml.gz.sig ADDED
Binary file
data.tar.gz.sig ADDED
Binary file
data/.gitignore ADDED
@@ -0,0 +1,10 @@
1
+ *.gem
2
+ .bundle
3
+ vendor/bundle
4
+ .ruby-version
5
+ .rbenv-vars
6
+ spec/reports
7
+ Gemfile.lock
8
+ log/*
9
+ .yardoc
10
+ doc/*
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --color
2
+ --order random
data/.travis.yml ADDED
@@ -0,0 +1,12 @@
1
+ bundler_args: --without development
2
+ language: ruby
3
+
4
+ before_install:
5
+ - rvm get head
6
+
7
+ rvm:
8
+ - 2.1.2
9
+ - 2.1.1
10
+ - 2.1.0
11
+ - 2.0.0
12
+ - 1.9.3
data/.yardopts ADDED
@@ -0,0 +1 @@
1
+ --markup markdown
data/Gemfile ADDED
@@ -0,0 +1,17 @@
1
+ source 'https://rubygems.org'
2
+
3
+ gem 'bundler', '~> 1.3'
4
+ gem 'rake'
5
+ gem 'yard'
6
+
7
+ group :development do
8
+ gem 'pry'
9
+ gem 'byebug'
10
+ end
11
+
12
+ group :test do
13
+ gem 'coveralls', require: false
14
+ gem 'rspec', '~> 3.0'
15
+ end
16
+
17
+ gemspec
data/README.md ADDED
@@ -0,0 +1,87 @@
1
+ # Mercury Parser
2
+ A tiny Ruby wrapper for [Mercury's Web Parser](https://mercury.postlight.com/web-parser/)
3
+
4
+ [![Gem Version](https://badge.fury.io/rb/mercury_parser.png)](http://badge.fury.io/rb/mercury_parser)
5
+ [![Code Climate](https://codeclimate.com/github/moisesnarvaez/mercury_parser.png)](https://codeclimate.com/github/moisesnarvaez/mercury_parser)
6
+ [![Dependency Status](https://gemnasium.com/moisesnarvaez/mercury_parser.png)](https://gemnasium.com/moisesnarvaez/mercury_parser)
7
+ [![Build Status](https://travis-ci.org/moisesnarvaez/mercury_parser.png)](https://travis-ci.org/moisesnarvaez/mercury_parser)
8
+
9
+ ## Installation
10
+ Add this line to your application's Gemfile:
11
+
12
+ gem 'mercury_parser'
13
+
14
+ And then execute:
15
+
16
+ bundle install
17
+
18
+ ## Configuration
19
+
20
+ Set the API key:
21
+
22
+ ```ruby
23
+ MercuryParser.api_key = ENV['MERCURY_API_KEY']
24
+ ```
25
+
26
+ Make sure to set `MERCURY_API_KEY` in your environment variables. You can get an API key by contacting Mercury's team directly; more information is available on their [web parser page](https://mercury.postlight.com/web-parser/).
27
+
28
+ Multiple tokens or multithreaded usage:
29
+
30
+ ```ruby
31
+ client = MercuryParser::Client.new(api_key: ENV['MERCURY_API_KEY'])
32
+ ```
33
+
34
+ ## Usage
35
+
36
+ ### Parse
37
+
38
+ Parse a webpage and return its main content:
39
+
40
+ ```ruby
41
+ article = MercuryParser.parse("https://trackchanges.postlight.com/building-awesome-cms-f034344d8ed")
42
+ => #<MercuryParser::Article title="Building Awesome CMS", content="<div><div class=\"section-content\"><div class=\"section-inner sectionLayout--insetColumn\"><figure id=\"1b95\" class=\"graf graf--figure graf-after--h3\"><div class=\"aspectRatioPlaceholder is-locked\"><img class=\"graf-image\" src=\"https://d262ilb51hltx0.cloudfront.net/max/800/1*zo51eqdjJ_XSU0D8Vm8P9A.png\"></div></figure><p id=\"c21b\" class=\"graf graf--p graf-after--figure\"><a href=\"https://github.com/postlight/awesome-cms\" class=\"markup--anchor markup--p-anchor\">Awesome CMS</a> is&#x2026;an awesome list of awesome CMSes. It&#x2019;s on GitHub, so anyone can add to it via a pull request. Here are some notes on how and why it came to be.</p><p id=\"2a96\" class=\"graf graf--p graf-after--h3\">GitHub has a <a href=\"https://help.github.com/articles/search-syntax/\" class=\"markup--anchor markup--p-anchor\">set of powerful commands</a> for narrowing search results. In seeking out modern content management tools, I used queries like this:</p><p id=\"5c79\" class=\"graf graf--p graf-after--p\"><a href=\"https://github.com/search?o=desc&amp;q=cms+OR+%22content+management%22+OR+admin+pushed%3A%3E2016-01-01+stars%3A%3E50&amp;ref=searchresults&amp;s=stars&amp;type=Repositories&amp;utf8=&#x2713;\" class=\"markup--anchor markup--p-anchor\">cms OR &#x201C;content management&#x201D; OR admin pushed:&gt;2016&#x2013;01&#x2013;01 stars:&gt;50</a></p><p id=\"7d38\" class=\"graf graf--p graf-after--p\">Sorting by stars, I worked my way backwards. I was able to quickly spot relevant CMS projects. I also started to notice some trends.</p><ul class=\"postList\"><li id=\"8671\" class=\"graf graf--li graf-after--p\">Modern and popular content management systems are written in PHP, JavaScript, Python, and Ruby. 
There are also a few content management systems written in .NET (C#), but they are much less popular on GitHub.</li><li id=\"a406\" class=\"graf graf--li graf-after--li\">Headless content management systems are gaining popularity. Simply presenting the UI for users to edit content, and relying on the end user to create the user-facing site by ingesting the API. <a href=\"http://getdirectus.com/\" class=\"markup--anchor markup--li-anchor\">Directus</a> and <a href=\"https://www.cloudcms.com/\" class=\"markup--anchor markup--li-anchor\">Cloud CMS</a> are headless CMS options.</li><li id=\"e133\" class=\"graf graf--li graf-after--li\">Static content management systems don&#x2019;t host pages for you. Instead they help generate your CMS, using static files. <a href=\"https://github.com/netlify/netlify-cms\" class=\"markup--anchor markup--li-anchor\">Netlify CMS</a>, <a href=\"https://respondcms.com/\" class=\"markup--anchor markup--li-anchor\">Respond CMS</a>, and <a href=\"https://www.getlektor.com/\" class=\"markup--anchor markup--li-anchor\">Lektor</a> are a few of the options in the static CMS space.</li></ul><p id=\"3bfc\" class=\"graf graf--p graf-after--h3\">I knew the list of all popular content management systems would be huge. I didn&#x2019;t want to put that data into Markdown directly, as it would be difficult to maintain and to augment with extra data (stars on GitHub, last push date, tags, etc).</p><p id=\"4bcb\" class=\"graf graf--p graf-after--p\">Instead, I opted to store the data in <a href=\"https://github.com/toml-lang/toml\" class=\"markup--anchor markup--p-anchor\">TOML</a>, a human-friendly configuration file language. You can view all of the data that powers Awesome CMS in the <a href=\"https://github.com/postlight/awesome-cms/tree/97216ef432963d4dfb2238340e2ebf9a4127fb1e/data\" class=\"markup--anchor markup--p-anchor\">data folder</a>. 
Here&#x2019;s WordPress&#x2019; entry in that file:</p><pre id=\"4771\" class=\"graf graf--pre graf-after--p\">[[cms]]<br>name = &quot;WordPress&quot;<br>description = &quot;WordPress is a free and open-source content management system (CMS) based on PHP and MySQL.&quot;<br>url = &quot;https://wordpress.org&quot;<br>github_repo = &quot;WordPress/WordPress&quot;<br>awesome_repo = &quot;miziomon/awesome-wordpress&quot;<br>language = &quot;php&quot;</pre><p id=\"4703\" class=\"graf graf--p graf-after--pre\">I process this file using JavaScript in <a href=\"https://github.com/postlight/awesome-cms/blob/97216ef432963d4dfb2238340e2ebf9a4127fb1e/scripts/generateReadme.js\" class=\"markup--anchor markup--p-anchor\">generateReadme.js</a>. It handles processing the TOML, fetching information from GitHub, and generating the final README.md file using the <a href=\"https://github.com/postlight/awesome-cms/blob/master/README.md.hbs\" class=\"markup--anchor markup--p-anchor\">Handlebars template</a>. I&#x2019;m scraping GitHub for star counts because GitHub&#x2019;s API only allows for 60 requests an hour for authenticated users. We want to make it as easy as possible for anyone to contribute. 
Requiring users to generate a GitHub authentication token to generate the README wasn&#x2019;t an option.</p><p id=\"73aa\" class=\"graf graf--p graf-after--p\">By storing the data in TOML at generating the README.md using JavaScript, I&#x2019;ve essentially created an incredibly light-weight, GitHub backed, static CMS to power Awesome CMS.</p><figure id=\"7c3e\" class=\"graf graf--figure graf-after--p graf--last\"><div class=\"aspectRatioPlaceholder is-locked\"><img class=\"graf-image\" src=\"https://d262ilb51hltx0.cloudfront.net/max/800/1*Y69yr0JgwOaLzACB0ZXDGw.gif\"></div><figcaption class=\"imageCaption\">I heard you like content management systems</figcaption></figure></div></div></div>", author="Jeremy Mack", date_published="2016-10-03T12:48:58.385Z", lead_image_url="https://d262ilb51hltx0.cloudfront.net/max/1200/1*zo51eqdjJ_XSU0D8Vm8P9A.png", dek=nil, next_page_url=nil, url="https://trackchanges.postlight.com/building-awesome-cms-f034344d8ed", domain="trackchanges.postlight.com", excerpt="Awesome CMS is…an awesome list of awesome CMSes. It’s on GitHub, so anyone can add to it via a pull request.", word_count=397, direction="ltr", total_pages=1, rendered_pages=1>
43
+
44
+ article.title
45
+ article.content
46
+ article.author
47
+ article.date_published
48
+ article.lead_image_url
49
+ article.dek
50
+ article.next_page_url
51
+ article.url
52
+ article.domain
53
+ article.excerpt
54
+ article.word_count
55
+ article.direction
56
+ article.total_pages
57
+ article.rendered_pages
58
+ ```
59
+
60
+ ## Contributing
61
+
62
+ 1. Fork it
63
+ 2. [Create a topic branch](http://learn.github.com/p/branching.html)
64
+ 3. Add specs for your unimplemented modifications
65
+ 4. Run `bundle exec rspec`. If specs pass, return to step 3.
66
+ 5. Implement your modifications
67
+ 6. Run `bundle exec rspec`. If specs fail, return to step 5.
68
+ 7. Commit your changes and push
69
+ 8. [Submit a pull request](http://help.github.com/send-pull-requests/)
70
+
71
+ ## Inspiration
72
+ Based on: [ReadabilityParserGem](https://github.com/phildionne/readability_parser)
73
+
74
+ ## Author
75
+ [Moises Narvaez](http://www.moisesnarvaez.com)
76
+
77
+ ## Copyright
78
+ Copyright (c) 2016 Moises Narvaez
79
+
80
+ ## License
81
+ MIT License
82
+
83
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
84
+
85
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
86
+
87
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/Rakefile ADDED
@@ -0,0 +1,11 @@
1
+ require 'bundler'
2
+ require 'rake'
3
+ require 'bundler/gem_tasks'
4
+ require 'rspec/core/rake_task'
5
+
6
+ task default: :spec
7
+
8
+ desc 'Run all specs'
9
+ RSpec::Core::RakeTask.new(:spec) do |task|
10
+ task.pattern = 'spec/**/*_spec.rb'
11
+ end
data/lib/mercury_parser.rb ADDED
@@ -0,0 +1,25 @@
1
+ require 'mercury_parser/configuration'
2
+ require 'mercury_parser/client'
3
+
4
+ module MercuryParser
5
+ extend Configuration
6
+
7
+ class << self
8
+ # Alias for MercuryParser::Client.new
9
+ #
10
+ # @return [MercuryParser::Client]
11
+ def new(options = {})
12
+ MercuryParser::Client.new(options)
13
+ end
14
+
15
+ # Delegate to MercuryParser::Client
16
+ def method_missing(method, *args, &block)
17
+ return super unless new.respond_to?(method)
18
+ new.send(method, *args, &block)
19
+ end
20
+
21
+ def respond_to?(method, include_private = false)
22
+ new.respond_to?(method, include_private) || super(method, include_private)
23
+ end
24
+ end
25
+ end # MercuryParser
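The module-level delegation in this file can be illustrated with a minimal, self-contained sketch. `DemoClient` and `DemoFacade` are hypothetical names, not part of the gem, and the sketch assumes `respond_to_missing?` (the hook that Ruby's `respond_to?` consults) in place of the direct `respond_to?` override used above.

```ruby
# Hypothetical stand-in for MercuryParser::Client.
class DemoClient
  def parse(url)
    "parsed #{url}"
  end
end

# Hypothetical stand-in for the MercuryParser module.
module DemoFacade
  class << self
    # Build a fresh client for each call, as the gem's `new` does.
    def new
      DemoClient.new
    end

    # Forward unknown methods to a fresh client when it can handle them.
    def method_missing(method, *args, &block)
      client = new
      return super unless client.respond_to?(method)
      client.public_send(method, *args, &block)
    end

    # Keep respond_to? consistent with method_missing.
    def respond_to_missing?(method, include_private = false)
      new.respond_to?(method, include_private) || super
    end
  end
end

puts DemoFacade.parse("https://example.com")
puts DemoFacade.respond_to?(:parse)
```

Because every delegated call builds a new client, module-level calls share no per-client state; only the module configuration is shared.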
data/lib/mercury_parser/api/content.rb ADDED
@@ -0,0 +1,21 @@
1
+ module MercuryParser
2
+ module API
3
+ module Content
4
+
5
+ # Parse a webpage and return its main content
6
+ # Returns a MercuryParser::Article object
7
+ #
8
+ # Optionally pass the ID of an article as `id => "id"` in `options` to return the content for a specific DOM node
9
+ # You can also pass a `max_pages` integer to set the maximum number of pages to parse and combine. Default is 25.
10
+ #
11
+ # @param url [String] The URL of an article to return the content for
12
+ # @return [MercuryParser::Article]
13
+ def parse(url, options = {})
14
+ params = { url: url }
15
+ response = get('', params.merge(options))
16
+
17
+ MercuryParser::Article.new(response)
18
+ end
19
+ end # Content
20
+ end # API
21
+ end
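As a rough illustration of how `parse` combines the URL with extra options, the sketch below builds the query string with Ruby's standard library. `build_query` is a hypothetical helper, not a method of the gem, and whether the remote API honors `max_pages` is taken from the code comment above rather than verified.

```ruby
require 'uri'

# Hypothetical helper mirroring how `parse` combines the URL with options.
def build_query(url, options = {})
  params = { url: url }
  # Options are merged last, so a key in `options` overrides `params`.
  URI.encode_www_form(params.merge(options))
end

puts build_query("https://example.com/post", max_pages: 2)
```

Because of the merge order, passing `url:` inside `options` would replace the positional URL, which may be worth guarding against in callers.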
data/lib/mercury_parser/article.rb ADDED
@@ -0,0 +1,12 @@
1
+ require 'ostruct'
2
+
3
+ module MercuryParser
4
+ class Article < OpenStruct
5
+ # Returns a MercuryParser::Article object
6
+ #
7
+ # @return [MercuryParser::Article]
8
+ def initialize(article)
9
+ super
10
+ end
11
+ end # Article
12
+ end
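Since `Article` is a thin `OpenStruct` subclass, its readers come straight from the keys of the response hash. The sketch below shows that behavior with a hypothetical `DemoArticle` class and a made-up response hash; the field names follow the README example.

```ruby
require 'ostruct'

# Hypothetical stand-in for MercuryParser::Article.
class DemoArticle < OpenStruct; end

# A made-up response hash; field names follow the README example.
response = { "title" => "Building Awesome CMS", "word_count" => 397, "dek" => nil }
article = DemoArticle.new(response)

puts article.title       # reader created from the "title" key
puts article.word_count
p article.dek            # keys with nil values are still defined
p article.missing_field  # unknown attributes return nil rather than raising
```

One consequence of this design is that a misspelled attribute name silently yields `nil` instead of raising `NoMethodError`.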
data/lib/mercury_parser/client.rb ADDED
@@ -0,0 +1,21 @@
1
+ require 'mercury_parser/connection'
2
+ require 'mercury_parser/request'
3
+ require 'mercury_parser/api/content'
4
+ require 'mercury_parser/article'
5
+
6
+ module MercuryParser
7
+ class Client
8
+ attr_accessor *Configuration::VALID_CONFIG_KEYS
9
+
10
+ def initialize(options = {})
11
+ options = MercuryParser.options.merge(options)
12
+ Configuration::VALID_OPTIONS_KEYS.each do |key|
13
+ send("#{key}=", options[key])
14
+ end
15
+ end
16
+
17
+ include MercuryParser::Connection
18
+ include MercuryParser::Request
19
+ include MercuryParser::API::Content
20
+ end # Client
21
+ end
data/lib/mercury_parser/configuration.rb ADDED
@@ -0,0 +1,37 @@
1
+ require 'mercury_parser/version'
2
+
3
+ module MercuryParser
4
+ module Configuration
5
+ VALID_CONNECTION_KEYS = [:api_endpoint, :user_agent].freeze
6
+ VALID_OPTIONS_KEYS = [:api_key].freeze
7
+ VALID_CONFIG_KEYS = VALID_CONNECTION_KEYS + VALID_OPTIONS_KEYS
8
+
9
+ DEFAULT_API_ENDPOINT = "https://mercury.postlight.com/parser"
10
+ DEFAULT_USER_AGENT = "MercuryParser Ruby Gem #{MercuryParser::VERSION}".freeze
11
+ DEFAULT_API_TOKEN = nil
12
+
13
+ attr_accessor *VALID_CONFIG_KEYS
14
+
15
+ def self.extended(base)
16
+ base.reset!
17
+ end
18
+
19
+ # Convenience method to allow configuration options to be set in a block
20
+ def configure
21
+ yield self
22
+ end
23
+
24
+ def options
25
+ Hash[ * VALID_CONFIG_KEYS.map { |key| [key, send(key)] }.flatten ]
26
+ end
27
+
28
+ def reset!
29
+ self.api_endpoint = DEFAULT_API_ENDPOINT
30
+ self.user_agent = DEFAULT_USER_AGENT
31
+
32
+ self.api_key = DEFAULT_API_TOKEN
33
+
34
+ return true
35
+ end
36
+ end # Configuration
37
+ end
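The extend-and-reset pattern used by `Configuration` can be tried in isolation with the sketch below. `DemoConfiguration`, `DemoParser`, and the placeholder endpoint are hypothetical and only mirror the shape of the module above.

```ruby
# Hypothetical module mirroring the shape of MercuryParser::Configuration.
module DemoConfiguration
  KEYS = [:api_endpoint, :api_key].freeze
  DEFAULT_ENDPOINT = "https://example.invalid/parser" # placeholder URL

  attr_accessor(*KEYS)

  # Called when another module extends this one; applies the defaults.
  def self.extended(base)
    base.reset!
  end

  # Yield the extending module so settings can be assigned in a block.
  def configure
    yield self
  end

  # Snapshot of the current settings as a hash.
  def options
    KEYS.map { |key| [key, public_send(key)] }.to_h
  end

  # Restore every setting to its default value.
  def reset!
    self.api_endpoint = DEFAULT_ENDPOINT
    self.api_key = nil
    true
  end
end

module DemoParser
  extend DemoConfiguration
end

DemoParser.configure { |config| config.api_key = "demo-key" }
p DemoParser.options
```

Building the hash with `map` and `to_h` avoids the `flatten` step, which would misbehave if a setting ever held an array value.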
data/lib/mercury_parser/connection.rb ADDED
@@ -0,0 +1,39 @@
1
+ require 'faraday'
2
+ require 'faraday_middleware'
3
+
4
+ module MercuryParser
5
+ module Connection
6
+
7
+ # Instantiate a Faraday::Connection
8
+ # @private
9
+ private
10
+
11
+ # Returns a Faraday::Connection object
12
+ #
13
+ # @return [Faraday::Connection]
14
+ def connection(options = {})
15
+ options = {
16
+ :url => MercuryParser.api_endpoint
17
+ }.merge(options)
18
+
19
+ connection = Faraday.new(options) do |c|
20
+ # encode request params as "www-form-urlencoded"
21
+ c.use Faraday::Request::UrlEncoded
22
+
23
+ c.use FaradayMiddleware::FollowRedirects, limit: 3
24
+
25
+ # raise exceptions on 40x, 50x responses
26
+ c.use Faraday::Response::RaiseError
27
+
28
+ c.response :xml, :content_type => /\bxml$/
29
+ c.response :json, :content_type => /\bjson$/
30
+
31
+ c.adapter Faraday.default_adapter
32
+ end
33
+
34
+ connection.headers[:user_agent] = MercuryParser.user_agent
35
+
36
+ connection
37
+ end
38
+ end # Connection
39
+ end
data/lib/mercury_parser/error.rb ADDED
@@ -0,0 +1,60 @@
1
+ require 'multi_json'
2
+
3
+ module MercuryParser
4
+ class Error < StandardError
5
+
6
+ # Raised when Mercury returns a 4xx or 500 HTTP status code
7
+ class ClientError < Error
8
+
9
+ # Creates a new error from an HTTP environement
10
+ #
11
+ # @param response [Hash]
12
+ # @return [MercuryParser::Error::ClientError]
13
+ def initialize(error = nil)
14
+ parsed_error = parse_error(error)
15
+ http_error = error.response[:status].to_i
16
+
17
+ if ERROR_MAP.has_key?(http_error)
18
+ raise ERROR_MAP[http_error].new(parsed_error[:messages])
19
+ else
20
+ super
21
+ end
22
+ end
23
+
24
+
25
+ private
26
+
27
+ def parse_error(error)
28
+ MultiJson.load(error.response[:body], :symbolize_keys => true)
29
+ end
30
+ end # ClientError
31
+
32
+ class ConfigurationError < MercuryParser::Error; end
33
+
34
+ # Raised when there's an error in Faraday
35
+ class RequestError < MercuryParser::Error; end
36
+
37
+ # Raised when MercuryParser returns a 400 HTTP status code
38
+ class BadRequest < MercuryParser::Error; end
39
+
40
+ # Raised when MercuryParser returns a 401 HTTP status code
41
+ class UnauthorizedRequest < MercuryParser::Error; end
42
+
43
+ # Raised when MercuryParser returns a 403 HTTP status code
44
+ class Forbidden < MercuryParser::Error; end
45
+
46
+ # Raised when MercuryParser returns a 404 HTTP status code
47
+ class NotFound < MercuryParser::Error; end
48
+
49
+ # Raised when MercuryParser returns a 500 HTTP status code
50
+ class InternalServerError < MercuryParser::Error; end
51
+
52
+ ERROR_MAP = {
53
+ 400 => MercuryParser::Error::BadRequest,
54
+ 401 => MercuryParser::Error::UnauthorizedRequest,
55
+ 403 => MercuryParser::Error::Forbidden,
56
+ 404 => MercuryParser::Error::NotFound,
57
+ 500 => MercuryParser::Error::InternalServerError
58
+ }
59
+ end # Error
60
+ end
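The status-to-class lookup in `ERROR_MAP` can be sketched on its own as follows. The `Demo*` classes and `error_for` are hypothetical, and the sketch deliberately uses a small factory method rather than raising from inside `initialize` as the file above does.

```ruby
# Hypothetical error hierarchy, loosely following the gem's naming.
class DemoError < StandardError; end
class DemoBadRequest < DemoError; end
class DemoNotFound < DemoError; end

DEMO_ERROR_MAP = {
  400 => DemoBadRequest,
  404 => DemoNotFound
}.freeze

# Pick the specific class for a status, falling back to the base class.
def error_for(status, message)
  DEMO_ERROR_MAP.fetch(status.to_i, DemoError).new(message)
end

begin
  raise error_for("404", "page not found")
rescue DemoError => e
  puts "#{e.class}: #{e.message}"
end
```

Callers can then rescue the base class to catch every mapped error, or a specific subclass to handle a single status.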
data/lib/mercury_parser/request.rb ADDED
@@ -0,0 +1,38 @@
1
+ require 'mercury_parser/error'
2
+
3
+ module MercuryParser
4
+ module Request
5
+
6
+ # Performs an HTTP GET request
7
+ def get(path, params={})
8
+ request(:get, path, params)
9
+ end
10
+
11
+
12
+ private
13
+
14
+ # Returns a Faraday::Response object
15
+ #
16
+ # @return [Faraday::Response]
17
+ def request(method, path, params = {})
18
+ raise MercuryParser::Error::ConfigurationError.new("Please configure MercuryParser.api_key first") if api_key.nil?
19
+
20
+ connection_options = {}
21
+ begin
22
+ response = connection(connection_options).send(method) do |req|
23
+ req.url(path, params)
24
+ req.headers['Content-Type'] = 'application/json'
25
+ req.headers['x-api-key'] = api_key
26
+ end
27
+ rescue Faraday::Error => error
28
+ if error.is_a?(Faraday::Error::ClientError)
29
+ raise MercuryParser::Error::ClientError.new(error)
30
+ else
31
+ raise MercuryParser::Error::RequestError.new(error)
32
+ end
33
+ end
34
+
35
+ response.body
36
+ end
37
+ end # Request
38
+ end
data/lib/mercury_parser/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module MercuryParser
2
+ VERSION = "0.0.1"
3
+ end
data/mercury_parser.gemspec ADDED
@@ -0,0 +1,30 @@
1
+ # -*- encoding: utf-8 -*-
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+
5
+ require 'mercury_parser/version'
6
+
7
+ Gem::Specification.new do |gem|
8
+ gem.name = "mercury_parser"
9
+ gem.version = MercuryParser::VERSION
10
+ gem.authors = ["Moises Narvaez"]
11
+ gem.email = ["MoisesNarvaez@gmail.com"]
12
+ gem.description = %q{A tiny ruby wrapper for Mercury's content parser api}
13
+ gem.summary = %q{Interact with the article parsing featureset of Mercury. This means grabbing an article's content based on a URL.}
14
+ gem.homepage = "https://github.com/moisesnarvaez/mercury_parser"
15
+ gem.licenses = "MIT"
16
+
17
+ gem.files = `git ls-files`.split($/)
18
+ gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
19
+ gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
20
+ gem.require_paths = ["lib"]
21
+
22
+ gem.cert_chain = ['certs/gem-public_cert.pem']
23
+ gem.signing_key = File.expand_path("~/.gem/gem-private_key.pem") if $0 =~ /gem\z/
24
+
25
+ gem.add_dependency "faraday", "~> 0.9"
26
+ gem.add_dependency "faraday_middleware", "~> 0.9"
27
+ gem.add_dependency "hashie", "~> 3.2"
28
+ gem.add_dependency "multi_xml", "~> 0.5"
29
+ gem.add_dependency "multi_json", "~> 1.10"
30
+ end
data/spec/mercury_parser/api/content_spec.rb ADDED
@@ -0,0 +1,4 @@
1
+ require 'spec_helper'
2
+
3
+ describe MercuryParser::API::Content do
4
+ end
data/spec/mercury_parser/article_spec.rb ADDED
@@ -0,0 +1,4 @@
1
+ require 'spec_helper'
2
+
3
+ describe MercuryParser::Article do
4
+ end
data/spec/mercury_parser/client_spec.rb ADDED
@@ -0,0 +1,62 @@
1
+ require 'spec_helper'
2
+
3
+ describe MercuryParser::Client do
4
+
5
+ after do
6
+ MercuryParser.reset!
7
+ end
8
+
9
+ context "with module configuration" do
10
+ before do
11
+ MercuryParser.configure do |config|
12
+ MercuryParser::Configuration::VALID_CONFIG_KEYS.each do |key|
13
+ config.send("#{key}=", key)
14
+ end
15
+ end
16
+ end
17
+
18
+ it "inherits the module configuration" do
19
+ MercuryParser::Configuration::VALID_CONFIG_KEYS.each do |key|
20
+ expect(MercuryParser.send(:"#{key}")).to eq(key)
21
+ end
22
+ end
23
+ end
24
+
25
+ context "with class configuration" do
26
+ before do
27
+ @configuration = {
28
+ api_key: '1234'
29
+ }
30
+ end
31
+
32
+ it "overrides the module configuration after initialization" do
33
+ MercuryParser.configure do |config|
34
+ @configuration.each do |key, value|
35
+ config.send("#{key}=", value)
36
+ end
37
+ end
38
+
39
+ MercuryParser::Configuration::VALID_OPTIONS_KEYS.each do |key|
40
+ expect(MercuryParser.send(:"#{key}")).to eq(@configuration[key])
41
+ end
42
+ end
43
+ end
44
+
45
+ describe "#connection" do
46
+ it "looks like a Faraday connection" do
47
+ expect(subject.send(:connection)).to respond_to(:run_request)
48
+ end
49
+ end
50
+
51
+ describe "#request" do
52
+ before { MercuryParser.api_key = '1234' }
53
+
54
+ it "catches Faraday connection errors" do
55
+ skip
56
+ end
57
+
58
+ it "catches Mercury Parser API errors" do
59
+ skip
60
+ end
61
+ end
62
+ end
data/spec/mercury_parser/error_spec.rb ADDED
@@ -0,0 +1,7 @@
1
+ require 'spec_helper'
2
+
3
+ describe MercuryParser::Error do
4
+ it "raises the correct error based on api HTTP response code" do
5
+ skip
6
+ end
7
+ end
data/spec/mercury_parser_spec.rb ADDED
@@ -0,0 +1,14 @@
1
+ require 'spec_helper'
2
+
3
+ describe MercuryParser do
4
+
5
+ after do
6
+ MercuryParser.reset!
7
+ end
8
+
9
+ describe "#new" do
10
+ it "is a MercuryParser::Client" do
11
+ expect(MercuryParser.new).to be_a_kind_of(MercuryParser::Client)
12
+ end
13
+ end
14
+ end
data/spec/spec_helper.rb ADDED
@@ -0,0 +1,10 @@
1
+ require 'bundler/setup'
2
+ require 'rspec'
3
+
4
+ require 'mercury_parser'
5
+
6
+ RSpec.configure do |config|
7
+ config.filter_run focus: true
8
+ config.filter_run_excluding skip: true
9
+ config.run_all_when_everything_filtered = true
10
+ end
metadata ADDED
@@ -0,0 +1,166 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: mercury_parser
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.0.1
5
+ platform: ruby
6
+ authors:
7
+ - Moises Narvaez
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain:
11
+ - |
12
+ -----BEGIN CERTIFICATE-----
13
+ MIIDhTCCAm2gAwIBAgIBATANBgkqhkiG9w0BAQUFADBEMRYwFAYDVQQDDA1tb2lz
14
+ ZXNuYXJ2YWV6MRUwEwYKCZImiZPyLGQBGRYFZ21haWwxEzARBgoJkiaJk/IsZAEZ
15
+ FgNjb20wHhcNMTYxMTIxMjIxNzAxWhcNMTcxMTIxMjIxNzAxWjBEMRYwFAYDVQQD
16
+ DA1tb2lzZXNuYXJ2YWV6MRUwEwYKCZImiZPyLGQBGRYFZ21haWwxEzARBgoJkiaJ
17
+ k/IsZAEZFgNjb20wggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDCgs6b
18
+ ZV5GnDnYnY5ia8b3macpMzPabZZk0DAD7VdBr1yaabN8MwzVqfo8NK8DRHyS0gAc
19
+ QWnO/WD0IG41aad3DdlZqLxg6MburvmontSzcvsCsnmdSoqbFMWBiKmiEhIvVHWs
20
+ a2x/7nUUPCiEQ/zoA4xNVhLSPAizF8jgWXIwtWAUWG1gGqmsdy45Ox5tqb/trh7y
21
+ 7QkNZjYy9xGelTPuIOutoD7247+UWFjUyyG/g3wNaEQUVI3RDQRWzOVDKJtHxCo7
22
+ WXbjtw2r3LS/F4MW5M+637hid780yNrIxkiqDs59Lkt51WFEQVnZxoVXtD5dci0I
23
+ PMiTtPYeVA1aq9tFAgMBAAGjgYEwfzAJBgNVHRMEAjAAMAsGA1UdDwQEAwIEsDAd
24
+ BgNVHQ4EFgQUPqahq7ETJWDvF7WBCFXj13ak+/wwIgYDVR0RBBswGYEXbW9pc2Vz
25
+ bmFydmFlekBnbWFpbC5jb20wIgYDVR0SBBswGYEXbW9pc2VzbmFydmFlekBnbWFp
26
+ bC5jb20wDQYJKoZIhvcNAQEFBQADggEBAKzCnByaDkYUyeFpSAaOaoXHymMKjd6S
27
+ dI+ESmQYPzZmOFzUrIeImKvfwNS5AENpxyO1TF/M4LtYtTi5TTu/bcr0tloXYq+Z
28
+ M+nOFlum82y5F4ndXk/mdT+bxKxK/VH9jI47N/eC9aKQCAtTkKISKKhHFBprN0Yx
29
+ LexaioJTNCUIRtR6RUS3vSmXcma1Z19Z6mkHT4W4ljianFiEce/jubJPqNYlQkGZ
30
+ ypccthPoC9Hj/J31ykMMe6GK9Kvjh9J9X/fcV1Zy8vaE1uOa5D1r1PsZeFL7UQwl
31
+ 4KzyFooXeRThwYgBIr55pffGE/pBC+q8diOD3EDZMXL0E2YGnHsH98s=
32
+ -----END CERTIFICATE-----
33
+ date: 2016-11-21 00:00:00.000000000 Z
34
+ dependencies:
35
+ - !ruby/object:Gem::Dependency
36
+ name: faraday
37
+ requirement: !ruby/object:Gem::Requirement
38
+ requirements:
39
+ - - "~>"
40
+ - !ruby/object:Gem::Version
41
+ version: '0.9'
42
+ type: :runtime
43
+ prerelease: false
44
+ version_requirements: !ruby/object:Gem::Requirement
45
+ requirements:
46
+ - - "~>"
47
+ - !ruby/object:Gem::Version
48
+ version: '0.9'
49
+ - !ruby/object:Gem::Dependency
50
+ name: faraday_middleware
51
+ requirement: !ruby/object:Gem::Requirement
52
+ requirements:
53
+ - - "~>"
54
+ - !ruby/object:Gem::Version
55
+ version: '0.9'
56
+ type: :runtime
57
+ prerelease: false
58
+ version_requirements: !ruby/object:Gem::Requirement
59
+ requirements:
60
+ - - "~>"
61
+ - !ruby/object:Gem::Version
62
+ version: '0.9'
63
+ - !ruby/object:Gem::Dependency
64
+ name: hashie
65
+ requirement: !ruby/object:Gem::Requirement
66
+ requirements:
67
+ - - "~>"
68
+ - !ruby/object:Gem::Version
69
+ version: '3.2'
70
+ type: :runtime
71
+ prerelease: false
72
+ version_requirements: !ruby/object:Gem::Requirement
73
+ requirements:
74
+ - - "~>"
75
+ - !ruby/object:Gem::Version
76
+ version: '3.2'
77
+ - !ruby/object:Gem::Dependency
78
+ name: multi_xml
79
+ requirement: !ruby/object:Gem::Requirement
80
+ requirements:
81
+ - - "~>"
82
+ - !ruby/object:Gem::Version
83
+ version: '0.5'
84
+ type: :runtime
85
+ prerelease: false
86
+ version_requirements: !ruby/object:Gem::Requirement
87
+ requirements:
88
+ - - "~>"
89
+ - !ruby/object:Gem::Version
90
+ version: '0.5'
91
+ - !ruby/object:Gem::Dependency
92
+ name: multi_json
93
+ requirement: !ruby/object:Gem::Requirement
94
+ requirements:
95
+ - - "~>"
96
+ - !ruby/object:Gem::Version
97
+ version: '1.10'
98
+ type: :runtime
99
+ prerelease: false
100
+ version_requirements: !ruby/object:Gem::Requirement
101
+ requirements:
102
+ - - "~>"
103
+ - !ruby/object:Gem::Version
104
+ version: '1.10'
105
+ description: A tiny ruby wrapper for Mercury's content parser api
106
+ email:
107
+ - MoisesNarvaez@gmail.com
108
+ executables: []
109
+ extensions: []
110
+ extra_rdoc_files: []
111
+ files:
112
+ - ".gitignore"
113
+ - ".rspec"
114
+ - ".travis.yml"
115
+ - ".yardopts"
116
+ - Gemfile
117
+ - README.md
118
+ - Rakefile
119
+ - lib/mercury_parser.rb
120
+ - lib/mercury_parser/api/content.rb
121
+ - lib/mercury_parser/article.rb
122
+ - lib/mercury_parser/client.rb
123
+ - lib/mercury_parser/configuration.rb
124
+ - lib/mercury_parser/connection.rb
125
+ - lib/mercury_parser/error.rb
126
+ - lib/mercury_parser/request.rb
127
+ - lib/mercury_parser/version.rb
128
+ - mercury_parser.gemspec
129
+ - spec/mercury_parser/api/content_spec.rb
130
+ - spec/mercury_parser/article_spec.rb
131
+ - spec/mercury_parser/client_spec.rb
132
+ - spec/mercury_parser/error_spec.rb
133
+ - spec/mercury_parser_spec.rb
134
+ - spec/spec_helper.rb
135
+ homepage: https://github.com/moisesnarvaez/mercury_parser
136
+ licenses:
137
+ - MIT
138
+ metadata: {}
139
+ post_install_message:
140
+ rdoc_options: []
141
+ require_paths:
142
+ - lib
143
+ required_ruby_version: !ruby/object:Gem::Requirement
144
+ requirements:
145
+ - - ">="
146
+ - !ruby/object:Gem::Version
147
+ version: '0'
148
+ required_rubygems_version: !ruby/object:Gem::Requirement
149
+ requirements:
150
+ - - ">="
151
+ - !ruby/object:Gem::Version
152
+ version: '0'
153
+ requirements: []
154
+ rubyforge_project:
155
+ rubygems_version: 2.6.8
156
+ signing_key:
157
+ specification_version: 4
158
+ summary: Interact with the article parsing featureset of Mercury. This means grabbing
159
+ an article's content based on a URL.
160
+ test_files:
161
+ - spec/mercury_parser/api/content_spec.rb
162
+ - spec/mercury_parser/article_spec.rb
163
+ - spec/mercury_parser/client_spec.rb
164
+ - spec/mercury_parser/error_spec.rb
165
+ - spec/mercury_parser_spec.rb
166
+ - spec/spec_helper.rb
metadata.gz.sig ADDED
Binary file