web-page-parser 1.0.0 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: e1076ec5c2d36f32055c1d8996bd632ae9ec8c41
4
- data.tar.gz: bd448302e1e04cf6d022f747959b0740367e75af
3
+ metadata.gz: 03275ebf096ab1230e14cc43df8d273bfb39a4de
4
+ data.tar.gz: 628fb68509c4352ac1962036cf1e33c0a51e9390
5
5
  SHA512:
6
- metadata.gz: 31f642be9f27c32b59fd2cdf0e1fd19f17fb6f0f55f10d506cae97923bc72ca64b508aa296f2bce5345a6bbbec98d7b8ffd3f7fe92ae80fff914dad29e906c16
7
- data.tar.gz: 330bfa9cf1e96e7c0cd98c51ca1f0d63de85fcfec11d074fdcac5ac538d86dd3e6f7237f39d28c24d32121fe04fd6331caf4c96d674b7f1923c5262b53b89574
6
+ metadata.gz: efb4b1cde955569cb14431a8d3748a348255a408acd15c7975715fd1b56a909489a143d2869187b834c066d6763a13276f7cc8b548112d61ea054fad0b8164ac
7
+ data.tar.gz: 083055dbff7568b6576a64c191d57406a316aaabd60e59f1eb77a2c960ed125899ead48cf2dcb3d894b2955f2d76d7a8494240d94bc3b2990e71b9485a51c62a
Binary file
data.tar.gz.sig CHANGED
Binary file
@@ -1,36 +1,45 @@
1
1
  = Web Page Parser
2
2
 
3
- Web Page Parser is a Ruby library to parse the content out of web
4
- pages, such as BBC News pages. It strips all non-textual stuff out,
5
- leaving the title, publication date and an array of paragraphs. It
6
- makes heavy use of regular expressions, rather than actually parsing
7
- the HTML. This may sound a bit whacky, but BBC News html in particular
8
- has semantic markup *within comments*, which cannot easily be
9
- referenced with standard HTML parsing. Regular expressions are much
10
- faster than full HTML parsing too.
3
+ Web Page Parser is a Ruby library to parse the content out of certain web pages, such as BBC News pages. It strips all non-textual stuff out, leaving the title, publication date and an array of paragraphs.
11
4
 
5
+ Web Page Parser used to make heavy use of regular expressions, rather than actually parsing the HTML. This may sound a bit wacky, but BBC News HTML in particular had semantic markup *within comments*, which could not easily be referenced with standard HTML parsing. But the early wild west days of Web Page Parser (back in 2009!) are over: news web page formatting has improved a lot, and most of the parsers now use standard HTML parsing.
12
6
 
13
- Web Page Parser currently supports BBC News pages and Guardian news
14
- articles but new parsers are planned and can be added easily.
7
+ Web Page Parser currently supports BBC News, Independent, New York Times, Washington Post and Guardian news articles but new parsers are planned and can be added easily.
8
+
9
+ == News Sniffer
10
+
11
+ Web Page Parser is primarily used by the {News Sniffer}[http://www.newssniffer.co.uk] project, which parses and archives news articles to keep track of how they change. This has heavily influenced the design of Web Page Parser.
12
+
13
+ News Sniffer requires that an update to a parser doesn't cause a false change to be detected in the backlog of tracked articles. Web Page Parser caters to this by supporting multiple versions of each parser.
14
+
15
+ So whenever a parser has to be changed, say, to support a new design, or to remove some useless non-textual widget, the existing parser is not touched and a new version is added. The new version will often inherit most of the behaviour of the previous version and just add the new filters or tweaks necessary.
16
+
17
+ So, for example, the BBC often change their design for new articles but their old articles can stay using the same old design. Web Page Parser's BBC parser still supports the older article formats without changing the resulting parsed content at all.
18
+
19
+ Web Page Parser will always use the latest version of each parser by default (using the URL to detect which parser to use), but you can specifically require any particular version. News Sniffer keeps track of which parser version was used for each article so it can ensure it uses the same one from then on.
15
20
 
16
- It is used by the {News Sniffer}[http://www.newssniffer.co.uk]
17
- project, which parses and archives news articles to keep track of how
18
- they change.
19
21
 
20
22
  == Example usage
21
23
 
22
24
  require 'web-page-parser'
23
- require 'open-uri'
24
25
 
25
26
  url = "http://news.bbc.co.uk/1/hi/uk/8041972.stm"
26
- page_data = open(url).read
27
27
 
28
- page = WebPageParser::ParserFactory.parser_for(:url => url, :page => page_data)
28
+ page = WebPageParser::ParserFactory.parser_for(:url => url)
29
29
 
30
30
  puts page.title # MPs hit back over expenses claims
31
31
  puts page.date # 2009-05-09T18:58:59+00:00
32
32
  puts page.content.first # The wife of author Ken Follett and ...
33
33
 
34
+ == Or specify a particular parser
35
+
36
+ url = "http://www.theguardian.com/world/2014/oct/24/kurds-fear-isis-chemical-weapon-kobani"
37
+
38
+ page = WebPageParser::GuardianPageParserV1.new(:url => url)
39
+
40
+ puts page.title # Barack Obama declares Iraq war a success
41
+
42
+
34
43
  == Ruby 1.8 support
35
44
 
36
45
  Installing the Oniguruma gem on Ruby 1.8 will make Web Page Parser run
@@ -42,5 +51,5 @@ Web Page Parser was written by {John Leach}[http://johnleach.co.uk]
42
51
  and is released under the MIT License.
43
52
 
44
53
  The code is available on
45
- {github}[http://github.com/johnl/web-page-parser/tree/master].
54
+ {github}[http://github.com/johnl/web-page-parser].
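The versioning approach described in the README (old parser versions are never touched, new versions inherit and add filters) can be sketched roughly as follows. This is only an illustration under assumptions, not the gem's code: `ExamplePageParserV1`, `ExamplePageParserV2` and the `FILTERS` constant are hypothetical names, and the parsers work on plain arrays of strings rather than fetched pages.

```ruby
# Hypothetical parsers illustrating the "never touch an old version" idea.
class ExamplePageParserV1
  # Patterns stripped from every paragraph by this parser version.
  FILTERS = [/\s*Share this story\s*/].freeze

  def initialize(paragraphs)
    @paragraphs = paragraphs
  end

  def filters
    FILTERS
  end

  # Apply every filter to every paragraph and drop paragraphs left empty.
  def content
    cleaned = @paragraphs.map do |para|
      filters.inject(para) { |text, re| text.gsub(re, '') }.strip
    end
    cleaned.reject(&:empty?)
  end
end

# V2 inherits all of V1's behaviour and only adds one new filter,
# so content already parsed with V1 keeps parsing identically with V1.
class ExamplePageParserV2 < ExamplePageParserV1
  def filters
    super + [/\s*Follow us on Twitter\s*/]
  end
end

paragraphs = ['Some news. Share this story', 'Follow us on Twitter']
v1_content = ExamplePageParserV1.new(paragraphs).content
v2_content = ExamplePageParserV2.new(paragraphs).content
```

An application that records which version parsed an article can keep instantiating that same class later, so a new filter in V2 never shows up as a false change in articles first parsed with V1.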
46
55
 
@@ -12,13 +12,14 @@ module WebPageParser
12
12
  attr_accessor :retrieve_session
13
13
  end
14
14
 
15
- attr_reader :url, :guid
15
+ attr_reader :url
16
16
 
17
17
  # takes a hash of options. The :url option passes the page url, and
18
18
  # the :page option passes the raw html page content for parsing
19
19
  def initialize(options = { })
20
20
  @url = options[:url]
21
21
  @page = options[:page]
22
+ @guid = options[:guid]
22
23
  end
23
24
 
24
25
  # return the page contents, retrieving it from the server if necessary
@@ -46,6 +47,15 @@ module WebPageParser
46
47
  def date
47
48
  end
48
49
 
50
+ def guid_from_url
51
+ end
52
+
53
+ def guid
54
+ return @guid if @guid
55
+ @guid = guid_from_url if url
56
+ @guid
57
+ end
58
+
49
59
  # Return a hash representing the textual content of this web page
50
60
  def hash
51
61
  digest = Digest::MD5.new
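The guid lookup added in this hunk prefers an explicitly supplied `:guid` option and only falls back to `guid_from_url` when a url is present. A standalone sketch of that behaviour, assuming nothing from the gem: `SketchParser` and `SlugParser` are hypothetical classes, and the "last path segment" rule is invented purely for the example.

```ruby
# Hypothetical stand-in for the base parser's guid handling.
class SketchParser
  attr_reader :url

  def initialize(options = {})
    @url = options[:url]
    @guid = options[:guid]
  end

  # Subclasses override this; the base class knows no url scheme,
  # so it returns nil.
  def guid_from_url
  end

  # An explicit guid wins; otherwise derive one from the url, if any.
  def guid
    return @guid if @guid
    @guid = guid_from_url if url
    @guid
  end
end

class SlugParser < SketchParser
  # Assumption for this example only: the last path segment is the guid.
  def guid_from_url
    url.to_s.split('/').last
  end
end

explicit = SlugParser.new(:url => 'http://example.com/a/b', :guid => 'given').guid
derived  = SlugParser.new(:url => 'http://example.com/a/b').guid
missing  = SlugParser.new.guid
```

Because `guid` is now a method rather than an `attr_reader`, a parser built without a url or a guid simply returns nil instead of raising.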
@@ -23,6 +23,7 @@ module WebPageParser
23
23
  c.dns_cache_timeout = 600
24
24
  c.enable_cookies = true
25
25
  c.follow_location = true
26
+ c.max_redirects = 6
26
27
  c.autoreferer = true
27
28
  c.headers["User-Agent"] = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4'
28
29
  c.headers["Accept-encoding"] = 'gzip, deflate'
@@ -0,0 +1,62 @@
1
+ module WebPageParser
2
+ class TheInterceptPageParserFactory < WebPageParser::ParserFactory
3
+ URL_RE = ORegexp.new('firstlook.org/theintercept/[0-9]{4}/[0-9]{2}/[0-9]{2}/[a-z0-9-]+')
4
+ def self.can_parse?(options)
5
+ URL_RE.match(options[:url])
6
+ end
7
+
8
+ def self.create(options = {})
9
+ TheInterceptPageParserV1.new(options)
10
+ end
11
+ end
12
+
13
+ # TheInterceptPageParserV1 parses "The Intercept" web pages using html
14
+ # parsing.
15
+ class TheInterceptPageParserV1 < WebPageParser::BaseParser
16
+ require 'nokogiri'
17
+
18
+ # The Intercept articles have no separate guid in the url, so the
19
+ # matched article url itself serves as the guid
20
+ def guid_from_url
21
+ # take the last match of the article url pattern, if there is one
22
+ url.to_s.scan(/https:\/\/firstlook.org\/theintercept\/[0-9]{4}\/[0-9]{2}\/[0-9]{2}\/[a-z0-9-]+/).last
23
+ end
24
+
25
+ def html_doc
26
+ @html_document ||= Nokogiri::HTML(page)
27
+ end
28
+
29
+ def title
30
+ return @title if @title
31
+ title_meta = html_doc.at_css('meta[property="og:title"]')
32
+ title = nil
33
+ if title_meta
34
+ title = title_meta['content'].to_s.strip
35
+ end
36
+ if title.nil?
37
+ title = html_doc.css('head title').text.strip
38
+ end
39
+ title = title.gsub(/- The Intercept$/,'')
40
+ @title = title.strip
41
+ end
42
+
43
+ def content
44
+ return @content if @content
45
+ story_body = html_doc.css('article div.ti-body p').collect do |p|
46
+ p.text.strip.gsub(160.chr(Encoding::UTF_8), ' ') # convert &nbsp; to actual space
47
+ end
48
+ @content = story_body.select { |p| !p.empty? }
49
+ end
50
+
51
+ def date
52
+ return @date if @date
53
+ if date_meta = html_doc.at_css('meta[property="article:published_time"]')
54
+ date_string = date_meta['content'].scan(/[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\+[0-9]{2}:[0-9]{2}/).first
55
+ @date = DateTime.parse(date_string) rescue nil
56
+ end
57
+ return @date if @date
58
+ # failing that, get it from the url
59
+ @date = DateTime.parse(url.scan(/[0-9]{4}\/[0-9]{2}\/[0-9]{2}/).first.to_s) rescue nil
60
+ end
61
+ end
62
+ end
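The url handling in the parser above (recognising an article url, and falling back to the date embedded in it) can be tried out with the standard library alone. The sketch below makes some assumptions that differ from the gem: it uses a plain `Regexp` with escaped dots instead of `ORegexp`, it builds a `Date` from the captured groups rather than calling `DateTime.parse`, and `ARTICLE_URL_RE`, `article_url?` and `date_from_url` are hypothetical names.

```ruby
require 'date'

# Assumed pattern, adapted from the parser above with the dots escaped
# and the year, month and day captured.
ARTICLE_URL_RE = %r{firstlook\.org/theintercept/([0-9]{4})/([0-9]{2})/([0-9]{2})/[a-z0-9-]+}

# True when the url looks like an Intercept article url.
def article_url?(url)
  !ARTICLE_URL_RE.match(url.to_s).nil?
end

# Returns the date embedded in the url, or nil if there is none
# or if the numbers do not form a valid date.
def date_from_url(url)
  match = ARTICLE_URL_RE.match(url.to_s)
  return nil unless match
  Date.new(match[1].to_i, match[2].to_i, match[3].to_i)
rescue ArgumentError
  nil # e.g. a month of 13 in the url
end

url = 'https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/'
is_article = article_url?(url)
url_date = date_from_url(url)
```

The date taken from the url has only day precision, which is presumably why the parser tries the `article:published_time` meta tag first.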
@@ -14,6 +14,16 @@ share_as :AllPageParsers do
14
14
  content.empty?.should be_true
15
15
  end
16
16
 
17
+ it "should use guid_from_url if available" do
18
+ class GuidTestPageParser < WebPageParser::BaseParser
19
+ def guid_from_url
20
+ "guidfromurl"
21
+ end
22
+ end
23
+ GuidTestPageParser.new.guid.should == nil
24
+ GuidTestPageParser.new(:url => 'someurl').guid.should == 'guidfromurl'
25
+ end
26
+
17
27
  context "when hashing the content" do
18
28
  before :each do
19
29
  @wpp = WebPageParser::BaseParser.new(@valid_options)
@@ -0,0 +1,327 @@
1
+ <!DOCTYPE html>
2
+
3
+ <!--[if lt IE 7]><html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
4
+ <!--[if IE 7]><html class="no-js lt-ie9 lt-ie8"> <![endif]-->
5
+ <!--[if IE 8]><html class="no-js lt-ie9"> <![endif]-->
6
+ <!--[if gt IE 8]><!--><html class="no-js"> <!--<![endif]-->
7
+ <head>
8
+ <meta charset="utf-8" />
9
+
10
+ <meta http-equiv="X-UA-Compatible" content="IE=edge">
11
+ <meta name="description" content="">
12
+ <meta name="viewport" content="width=device-width, initial-scale=1">
13
+
14
+ <link rel="shortcut icon" href="https://prod01-cdn00.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/favicon.png">
15
+
16
+ <script src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/modernizr-2.6.2.min.js'></script>
17
+ <!-- This site is optimized for SEO -->
18
+ <title>Canada, At War For 13 Years, Shocked That &#039;A Terrorist&#039; Attacked Its Soldiers - The Intercept</title>
19
+ <link rel="canonical" href="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/" />
20
+ <meta property="og:locale" content="en_US" />
21
+ <meta property="og:type" content="article" />
22
+ <meta property="og:title" content="Canada, At War For 13 Years, Shocked That &#039;A Terrorist&#039; Attacked Its Soldiers - The Intercept" />
23
+ <meta property="og:description" content="(updated below &#8211; Update II) TORONTO &#8211; In Quebec on Monday, two Canadian soldiers were hit by a car driven by Martin Couture-Rouleau, a 25-year-old Canadian who, as The Globe and Mail reported, &#8220;converted to Islam recently and called himself Ahmad Rouleau.&#8221; One of the soldiers died, as did Couture-Rouleau when he was shot by police upon&gt;&gt;" />
24
+ <meta property="og:url" content="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/" />
25
+ <meta property="og:site_name" content="The Intercept" />
26
+ <meta property="article:section" content="Uncategorized" />
27
+ <meta property="article:published_time" content="&lt;span class=&#039;fltimestamp&#039; data-timestamp=&#039;1413982586&#039;&gt;2014-10-22T08:56:26+00:00&lt;/span&gt;" />
28
+ <meta property="article:modified_time" content="2014-10-26T16:46:02+00:00" />
29
+ <meta property="og:image" content="https://prod01-cdn00.cdn.firstlook.org/wp-uploads/sites/1/2014/10/stephen-harper.jpg" />
30
+ <meta name="twitter:card" content="summary"/>
31
+ <meta name="twitter:site" content="@the_intercept"/>
32
+ <meta name="twitter:domain" content="The Intercept"/>
33
+ <meta name="twitter:creator" content="@the_intercept"/>
34
+ <!-- / Yoast WordPress SEO plugin. -->
35
+
36
+ <link rel="alternate" type="application/rss+xml" title="The Intercept &raquo; Canada, At War For 13 Years, Shocked That &#8216;A Terrorist&#8217; Attacked Its Soldiers Comments Feed" href="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/feed/" />
37
+ <link rel='stylesheet' id='main-css-css' href='https://prod01-cdn00.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/css/all.css?ver=4a4b95118f8708ad1db764f737e1088a' type='text/css' media='' />
38
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-includes/js/jquery/jquery.js?ver=1.11.0'></script>
39
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-includes/js/jquery/jquery-migrate.min.js?ver=1.2.1'></script>
40
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/jquery.dotdotdot.min.js?ver=e7489c03aaea168ba084298955d7fb9a'></script>
41
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/jquery.stellar.js?ver=facdbc0dc5a7eea6bcfabbba807822ed'></script>
42
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/jquery.hoverIntent.js?ver=be188522bc57c3f0821dfb2053609915'></script>
43
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/jquery.sticky-kit.js?ver=b42204c98ed4cf287173bbef20dbd1ce'></script>
44
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/moment.js?ver=2bc343658034d1a7f4d6694fef2659e4'></script>
45
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/flm.js?ver=e0465435eeee4e2c4578c915a2e37991'></script>
46
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/document-loader.js?ver=8a7dc017b1b53ec94d972081643be68a'></script>
47
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.js?ver=abb38c195d86a81dac1a003fc1b74fb2'></script>
48
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.linq.js?ver=ee21e8ca846da516f844ed3c361ab9c4'></script>
49
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.matching.js?ver=e6d719720f9888ee8f389df195a2b74c'></script>
50
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.modules.js?ver=f23e2e2c1b069d62353e8ccb1cbd3c0e'></script>
51
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.object.js?ver=4f91b61f003a57bc4debf273837c70cc'></script>
52
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.params.js?ver=fce94f86c8bf04f62ff4aae18b14fc4e'></script>
53
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.strings.js?ver=7c6646a66a66fe2832355bd35d84689f'></script>
54
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/nav.js?ver=19d2a6ca833bf2079a5f52ed9d413758'></script>
55
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/search.js?ver=502b1ff2ddeb5a21763503dc0fec123e'></script>
56
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/paginator.js?ver=9da6cfd5cd618be638314edf6cd09ca0'></script>
57
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/carousel.js?ver=bb84f70c01affe957d97c873f893419a'></script>
58
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/features.js?ver=bd7ef1410122fd2d2c7bdb6b73104b69'></script>
59
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/article.js?ver=b7b006e222227897d25c2c6f0b58760c'></script>
60
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/share.js?ver=0a9500ca813719b321303852a850bcbc'></script>
61
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/video.js?ver=c4afb1d47e2d65ed275d37005f014503'></script>
62
+ <link rel="EditURI" type="application/rsd+xml" title="RSD" href="https://firstlook.org/theintercept/xmlrpc.php?rsd" />
63
+ <link rel="wlwmanifest" type="application/wlwmanifest+xml" href="https://prod01-cdn02.cdn.firstlook.org/theintercept/wp-includes/wlwmanifest.xml" />
64
+ <meta name="generator" content="WordPress 3.9.2" />
65
+ <link rel='shortlink' href='https://firstlook.org/theintercept/?p=7196' />
66
+ <script type="text/javascript">
67
+ var _paq = _paq || [];
68
+ _paq.push(["trackPageView"]);
69
+ _paq.push(["enableLinkTracking"]);
70
+
71
+ (function() {
72
+ var u="https://prod01-piwik.firstlook.org/";
73
+ var siteID = "1";
74
+ _paq.push(["setTrackerUrl", u+"piwik.php"]);
75
+ _paq.push(["setSiteId", siteID]);
76
+ var d=document, g=d.createElement("script"), s=d.getElementsByTagName("script")[0]; g.type="text/javascript";
77
+ g.defer=true; g.async=true; g.src=u+"piwik.js"; s.parentNode.insertBefore(g,s);
78
+ })();
79
+ </script>
80
+ <!-- Start Fluid Video Embeds Style Tag -->
81
+ <style type="text/css">
82
+ /* Thanks to Web Designer Wall for writing about this technique: http://webdesignerwall.com/tutorials/css-elastic-videos */
83
+ /* And to A List Apart: http://www.alistapart.com/articles/creating-intrinsic-ratios-for-video/ */
84
+ .fve-video-wrapper {
85
+ position: relative;
86
+ overflow: hidden;
87
+ height: 0;
88
+ background-color: transparent;
89
+ padding-bottom: 56.25%; /* This is default, but will be overriden */
90
+ margin: 0.5em 0; /* A bit of margin at the bottom */
91
+ }
92
+ .fve-video-wrapper iframe,
93
+ .fve-video-wrapper object,
94
+ .fve-video-wrapper embed {
95
+ position: absolute;
96
+ display: block;
97
+ top: 0;
98
+ left: 0;
99
+ width: 100%;
100
+ height: 100%;
101
+ }
102
+ .fve-video-wrapper a.hyperlink-image {
103
+ position: relative;
104
+ display: none;
105
+ }
106
+ .fve-video-wrapper a.hyperlink-image img {
107
+ position: relative;
108
+ z-index: 2;
109
+ width: 100%;
110
+ }
111
+ .fve-video-wrapper a.hyperlink-image .fve-play-button {
112
+ position: absolute;
113
+ left: 35%;
114
+ top: 35%;
115
+ right: 35%;
116
+ bottom: 35%;
117
+ z-index: 3;
118
+ background-color: rgba(40, 40, 40, 0.75);
119
+ background-size: 100% 100%;
120
+ border-radius: 10px;
121
+ }
122
+ .fve-video-wrapper a.hyperlink-image:hover .fve-play-button {
123
+ background-color: rgba(0, 0, 0, 0.85);
124
+ }
125
+ /* End of standard styles */
126
+ </style>
127
+ <!-- End Fluid Video Embeds Style Tag -->
128
+ </head>
129
+ <body class="single single-post postid-7196 single-format-standard">
130
+ <header role="banner">
131
+ <div class="grid">
132
+ <a class="logo-link" href="https://firstlook.org/theintercept">
133
+ <img alt="The Intercept" src="/wp-content/themes/the-intercept-v2/images/the-intercept.png" class="ti-image image-logo" />
134
+ </a>
135
+ <nav role="navigation" class="ti-menu menu-main" data-pz-module="nav">
136
+ <div>
137
+ <i class="hamburger"></i>
138
+ <ul id="menu-primary" class="menu"><li id="menu-item-3924" class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3924"><a href="/theintercept/features/">Features</a></li>
139
+ <li id="menu-item-3922" class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3922"><a href="/theintercept/greenwald/">Greenwald</a></li>
140
+ <li id="menu-item-3923" class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3923"><a href="/theintercept/froomkin/">Froomkin</a></li>
141
+ <li id="menu-item-46" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-46"><a href="https://firstlook.org/theintercept/documents/">Documents</a></li>
142
+ <li id="menu-item-47" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-47"><a href="https://firstlook.org/theintercept/staff/">Staff</a></li>
143
+ <li id="menu-item-3925" class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3925"><a href="/theintercept/contact/">Contact</a></li>
144
+ </ul> </div>
145
+ </nav>
146
+ <nav role="navigation" class="ti-menu menu-social">
147
+ <div class="menu-social-container">
148
+ <ul id="menu-social" class="menu"><li id="menu-item-869" class="ti-icon icon-twitter menu-item menu-item-type-custom menu-item-object-custom menu-item-869"><a title="twitter" target="_blank" href="https://twitter.com/the_intercept">Twitter</a></li>
149
+ <li id="menu-item-3921" class="ti-icon icon-facebook menu-item menu-item-type-custom menu-item-object-custom menu-item-3921"><a href="https://facebook.com/theinterceptflm">Facebook</a></li>
150
+ <li id="menu-item-125" class="ti-icon icon-rss menu-item menu-item-type-post_type menu-item-object-page menu-item-125"><a title="rss" href="https://firstlook.org/theintercept/feeds/">RSS feeds</a></li>
151
+ </ul> </div>
152
+ <form role="search" method="get" id="searchform" action="https://firstlook.org/theintercept/" class="ti-form form-search" data-pz-module="search">
153
+ <div class="toggle">
154
+ <input name="s" id="s" placeholder="Search" />
155
+ </div>
156
+ <button><i class="fa fa-search"></i></button>
157
+ </form> </nav>
158
+ </div>
159
+ </header>
160
+ <section role="main" data-pz-module="article">
161
+ <div class="grid">
162
+ <div class="ti-article-page threecol ti-sticky-parent">
163
+ <header>
164
+ <h1 class="title">Canada, At War For 13 Years, Shocked That &#8216;A Terrorist&#8217; Attacked Its Soldiers</h1> <div class="ti-byline">
165
+ <cite>By <span><a href='https://firstlook.org/theintercept/staff/glenn-greenwald/'>Glenn Greenwald</a></span></cite>
166
+ <div class="ti-social">
167
+ <a class='twitter' href='https://twitter.com/@ggreenwald'>@ggreenwald</a> </div>
168
+ <time><span class='fltimestamp' data-timestamp='1413982586'>22 Oct 2014</span></time>
169
+ </div>
170
+ </header>
171
+ <div class="ti-sidebar ti-sticky sidebar-social">
172
+ <aside><h4>Share</h4> <ul data-pz-module="share" class="social" data-track="Share" data-url="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/" data-bitly-ue="http://interc.pt/1otBqax" data-bitly="http%3A%2F%2Finterc.pt%2F1otBqax" data-id="7196">
173
+ <li class="twitter with-icon">
174
+ <a class="twitter" title="Share on Twitter" target="_tw" data-width="500" data-height="250" href="http://twitter.com/share?text=Canada, At War For 13 Years, Shocked That &#8216;A Terrorist&#8217; Attacked Its Soldiers&url=http%3A%2F%2Finterc.pt%2F1otBqax">Twitter</a>
175
+ </li>
176
+ <li class="facebook with-icon">
177
+ <a class="facebook" title="Post on Facebook" target="_fb" data-width="500" data-height="400" href="http://www.facebook.com/sharer/sharer.php?u=http%3A%2F%2Finterc.pt%2F1otBqax">Facebook</a>
178
+ </li>
179
+ <li class="googleplus with-icon">
180
+ <a class="googleplus" title="Post on Google+" target="_gp" data-width="500" data-height="500" href="https://plus.google.com/share?url=http%3A%2F%2Finterc.pt%2F1otBqax">Google</a>
181
+ </li>
182
+ <!--<li class="linkedin with-icon">
183
+ <a class="linkedin" title="Post to LinedIn" target="_li" data-width="500" data-height="515" href="https://www.linkedin.com/cws/share?url=http%3A%2F%2Finterc.pt%2F1otBqax&title=Canada%2C%20At%20War%20For%2013%20Years%2C%20Shocked%20That%20%26%238216%3BA%20Terrorist%26%238217%3B%20Attacked%20Its%20Soldiers">LinkedIn</a>
184
+ </li>-->
185
+ <li class="email with-icon">
186
+ <a class="mail" title="E-mail article" href="mailto:?subject=Canada, At War For 13 Years, Shocked That &#8216;A Terrorist&#8217; Attacked Its Soldiers&body=http%3A%2F%2Finterc.pt%2F1otBqax">Email</a>
187
+ </li>
188
+ <li class="print with-icon">
189
+ <a class="print" title="Print this page" href="#print">Print</a>
190
+ </li>
191
+ </ul>
192
+ </aside> </div>
193
+ <div class="ti-sidebar ti-sticky sidebar-popular">
194
+ <aside><h4>Popular</h4><ul class='ti-popular'><li><a href="https://firstlook.org/theintercept/2014/10/30/inside-story-matt-taibbis-departure-first-look-media/"><img src="https://prod01-cdn01.cdn.firstlook.org/wp-uploads/sites/1/2014/10/matt-taibbi-renaldi-single-popular.jpg" /></a><a href="https://firstlook.org/theintercept/2014/10/30/inside-story-matt-taibbis-departure-first-look-media/" class="excerpt with-image">The Inside Story Of Matt Taibbi&#8217;s Departure From First Look Media</a></li><li><a href="https://firstlook.org/theintercept/2014/10/30/hacking-team/"><img src="https://prod01-cdn00.cdn.firstlook.org/wp-uploads/sites/1/2014/10/laptop-smartphone-single-popular.jpg" /></a><a href="https://firstlook.org/theintercept/2014/10/30/hacking-team/" class="excerpt with-image">Secret Manuals Show the Spyware Sold to Despots and Cops Worldwide</a></li><li><a href="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/"><img src="https://prod01-cdn01.cdn.firstlook.org/wp-uploads/sites/1/2014/10/stephen-harper-single-popular.jpg" /></a><a href="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/" class="excerpt with-image">Canada, At War For 13 Years, Shocked That &#8216;A Terrorist&#8217; Attacked Its Soldiers</a></li><li><a href="https://firstlook.org/theintercept/2014/10/28/smuggling-snowden-secrets/"><img src="https://prod01-cdn03.cdn.firstlook.org/wp-uploads/sites/1/2014/10/micah_snowden_crop_v3-single-popular.jpg" /></a><a href="https://firstlook.org/theintercept/2014/10/28/smuggling-snowden-secrets/" class="excerpt with-image">Ed Snowden Taught Me To Smuggle Secrets Past Incredible Danger. 
Now I Teach You.</a></li><li><a href="https://firstlook.org/theintercept/2014/10/31/block-boat-work-middle/"><img src="https://prod01-cdn00.cdn.firstlook.org/wp-uploads/sites/1/2014/10/14533521-single-popular.jpg" /></a><a href="https://firstlook.org/theintercept/2014/10/31/block-boat-work-middle/" class="excerpt with-image">A Small Band of Activists Is Humiliating an Israeli Shipping Giant</a></li></ul></aside> </div>
195
+
196
+ <article class="ti-article">
197
+ <div class="ti-sidebar sidebar-social for-mobile">
198
+ <aside><h4>Share</h4> <ul data-pz-module="share" class="social" data-track="Share" data-url="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/" data-bitly-ue="http://interc.pt/1otBqax" data-bitly="http%3A%2F%2Finterc.pt%2F1otBqax" data-id="7196">
199
+ <li class="twitter with-icon">
200
+ <a class="twitter" title="Share on Twitter" target="_tw" data-width="500" data-height="250" href="http://twitter.com/share?text=Canada, At War For 13 Years, Shocked That &#8216;A Terrorist&#8217; Attacked Its Soldiers&url=http%3A%2F%2Finterc.pt%2F1otBqax">Twitter</a>
201
+ </li>
202
+ <li class="facebook with-icon">
203
+ <a class="facebook" title="Post on Facebook" target="_fb" data-width="500" data-height="400" href="http://www.facebook.com/sharer/sharer.php?u=http%3A%2F%2Finterc.pt%2F1otBqax">Facebook</a>
204
+ </li>
205
+ <li class="googleplus with-icon">
206
+ <a class="googleplus" title="Post on Google+" target="_gp" data-width="500" data-height="500" href="https://plus.google.com/share?url=http%3A%2F%2Finterc.pt%2F1otBqax">Google</a>
207
+ </li>
208
+ <!--<li class="linkedin with-icon">
209
+ <a class="linkedin" title="Post to LinedIn" target="_li" data-width="500" data-height="515" href="https://www.linkedin.com/cws/share?url=http%3A%2F%2Finterc.pt%2F1otBqax&title=Canada%2C%20At%20War%20For%2013%20Years%2C%20Shocked%20That%20%26%238216%3BA%20Terrorist%26%238217%3B%20Attacked%20Its%20Soldiers">LinkedIn</a>
210
+ </li>-->
211
+ <li class="email with-icon">
212
+ <a class="mail" title="E-mail article" href="mailto:?subject=Canada, At War For 13 Years, Shocked That &#8216;A Terrorist&#8217; Attacked Its Soldiers&body=http%3A%2F%2Finterc.pt%2F1otBqax">Email</a>
213
+ </li>
214
+ <li class="print with-icon">
215
+ <a class="print" title="Print this page" href="#print">Print</a>
216
+ </li>
217
+ </ul>
218
+ </aside> </div>
219
+ <div class="hero">
220
+ <img src="https://prod01-cdn02.cdn.firstlook.org/wp-uploads/sites/1/2014/10/stephen-harper-article-display-b.jpg" alt="Featured photo - Canada, At War For 13 Years, Shocked That &amp;#8216;A Terrorist&amp;#8217; Attacked Its Soldiers" />
221
+ </div>
222
+ <div class="ti-body">
223
+ <p><strong>(updated below &#8211; Update II)</strong></p>
224
+ <p>TORONTO &#8211; In Quebec on Monday, two Canadian soldiers were hit by a car driven by Martin Couture-Rouleau, a 25-year-old Canadian who, as <em>The Globe and Mail</em> <a href="http://www.theglobeandmail.com/news/national/quebec-hit-and-run/article21187200/">reported</a>, &#8220;converted to Islam recently and called himself Ahmad Rouleau.&#8221; One of the soldiers died, as did Couture-Rouleau when he was shot by police upon apprehension after allegedly brandishing a large knife. Police speculated that the incident was deliberate, alleging the driver waited for two hours before hitting the soldiers, one of whom was wearing a uniform. The incident <a href="http://www.theglobeandmail.com/news/politics/two-soldiers-injured-in-quebec-hit-and-run/article21177035/">took place</a> in the parking lot of a shopping mall 30 miles southeast of Montreal, &#8220;a few kilometres from the Collège militaire royal de Saint-Jean, the military academy operated by the Department of National Defence.&#8221;</p>
225
+ <p>The right-wing Canadian government wasted no time in seizing on the incident to promote its fear-mongering agenda over terrorism, which includes <a href="http://calgary.ctvnews.ca/bill-proposed-to-give-csis-tools-to-investigate-track-and-prosecute-potential-terrorists-1.2057025">pending legislation</a> to vest its intelligence agency, CSIS, with more spying and secrecy powers in the name of fighting ISIS. A government spokesperson <a href="http://www.theglobeandmail.com/news/politics/two-soldiers-injured-in-quebec-hit-and-run/article21177035/">asserted</a> &#8220;clear indications&#8221; that the driver “had become radicalized.”</p>
226
+ <p>In a &#8220;clearly prearranged exchange,&#8221; a conservative MP, during parliamentary question time, asked Prime Minister Stephen Harper (pictured above) whether this was considered a &#8220;terrorist attack&#8221;; in reply, the prime minister gravely opined that the incident was &#8220;obviously extremely troubling.” Canada&#8217;s Public Safety Minister Steven Blaney <a href="http://globalnews.ca/news/1625585/canadian-soldier-struck-by-car-in-quebec-has-died/">pronounced</a> the incident &#8220;clearly linked to terrorist ideology,&#8221; while newspapers predictably followed suit, <a href="http://www.thestar.com/news/canada/2014/10/21/soldier_run_down_in_possible_quebec_terror_attack_dies.html">calling</a> it a &#8220;suspected terrorist attack&#8221; <a href="http://globalnews.ca/news/1625585/canadian-soldier-struck-by-car-in-quebec-has-died/">and</a> &#8220;homegrown terrorism.&#8221; CSIS spokesperson Tahera Mufti said &#8220;the event was the violent expression of an extremist ideology promoted by terrorist groups with global followings&#8221; and added: “That something like this would happen in a peaceable Canadian community like Saint-Jean-sur-Richelieu shows the long reach of these ideologies.&#8221;</p>
+ <p>In sum, the national mood and discourse in Canada is virtually identical to what prevails in every Western country whenever <a href="http://www.theguardian.com/commentisfree/2013/may/23/woolwich-attack-terrorism-blowback">an incident like this happens</a>: shock and bewilderment that someone would want to bring violence to such a good and innocent country (&#8220;a peaceable Canadian community like Saint-Jean-sur-Richelieu&#8221;), followed by claims that the incident shows how primitive and savage is the &#8220;terrorist ideology&#8221; of extremist Muslims, followed by rage and demand for still more actions of militarism and freedom-deprivation. There are two points worth making about this:</p>
+ <p><strong>First</strong>, Canada has spent the last 13 years proclaiming itself a nation at war. It <a href="http://www.theglobeandmail.com/globe-debate/editorials/now-that-our-war-in-afghanistan-is-over/article17501889/">actively participated</a> in the invasion and occupation of Afghanistan and was an <a href="http://rabble.ca/columnists/2014/08/poland-torture-hot-seat-canada-next">enthusiastic partner</a> in some of the most <a href="http://www.cbc.ca/news/world/omar-khadr-reattempts-to-sue-canada-for-20m-1.2753689">extremist War on Terror abuses</a> perpetrated <a href="http://www.salon.com/2010/08/11/khadr/">by the U.S.</a> Earlier this month, the Prime Minister <a href="http://news.nationalpost.com/2014/10/03/isis-motion-calls-for-air-strikes-no-troops-in-iraq/">revealed</a>, with the <a href="http://globalnews.ca/news/1595317/majority-of-canadians-back-use-of-fighter-jets-to-strike-isis-in-iraq/">support of a large majority</a> of Canadians, that &#8220;Canada is poised to go to war in Iraq, as [he] announced plans in Parliament [] to send CF-18 fighter jets for up to six months to battle Islamic extremists.&#8221; Just yesterday, Canadian Defence Minister Rob Nicholson <a href="http://www.edmontonsun.com/2014/10/21/fighter-jets-depart-from-cfb-cold-lake-alberta-to-middle-east">flamboyantly appeared</a> at the airfield in Alberta from which the fighter jets left for Iraq and stood tall as he issued the standard Churchillian war rhetoric about the noble fight against evil.</p>
+ <p>It is always stunning when a country that has brought violence and military force to numerous countries <a href="http://www.theguardian.com/commentisfree/2013/may/23/woolwich-attack-terrorism-blowback">acts shocked and bewildered</a> when someone brings <a href="http://www.theguardian.com/commentisfree/2013/apr/16/boston-marathon-explosions-notes-reactions">a tiny fraction of that violence</a> back to that country. Regardless of one&#8217;s views on the justifiability of Canada&#8217;s lengthy military actions, it&#8217;s not the slightest bit surprising or difficult to understand why people who identify with those on the other end of Canadian bombs and bullets would decide to attack the military responsible for that violence.</p>
+ <p>That&#8217;s the nature of war. A country doesn&#8217;t get to run around for years wallowing in war glory, invading, rendering and bombing others, without the risk of having violence brought back to it. Rather than being baffling or shocking, that reaction is completely natural and predictable. The only surprising thing about any of it is that it doesn&#8217;t happen more often.</p>
+ <p>The issue here is not justification (very few people would view attacks on soldiers in a shopping mall parking lot to be justified). The issue is <em>causation</em>. Every time one of these attacks occurs — from 9/11 on down — Western governments pretend that it was just some sort of unprovoked, utterly &#8220;senseless&#8221; act of violence caused by primitive, irrational, savage religious extremism inexplicably aimed at a country innocently minding its own business. They even invent fairy tales to feed to the population to explain why it happens: <a href="http://www.washingtonpost.com/wp-srv/nation/specials/attacked/transcripts/bushaddress_092001.html">they hate us for our freedoms.</a></p>
+ <p>Those fairy tales are pure deceit. Except in the rarest of cases, the violence has clearly identifiable and easy-to-understand causes: namely, anger over the violence that the country&#8217;s government has spent years directing at others. The <a href="http://www.salon.com/2010/06/22/terrorism_22/">statements of those accused by the west of terrorism</a>, and even the <a href="http://www.salon.com/2009/10/20/terrorism_6/">Pentagon&#8217;s own commissioned research</a>, have made conclusively clear what motivates these acts: namely, anger over the violence, abuse and interference by Western countries in that part of the world, with the world&#8217;s Muslims overwhelmingly the targets and victims. The very policies of militarism and civil liberties erosions justified in the name of stopping terrorism are actually what fuels terrorism and ensures its endless continuation.</p>
+ <p>If you want to be a country that spends more than a decade proclaiming itself at war and bringing violence to others, then one should expect that violence will sometimes be directed at you as well. Far from being the by-product of primitive and inscrutable religions, that behavior is the natural reaction of human beings targeted with violence. Anyone who doubts that should review the 13-year orgy of violence the U.S. has unleashed on the world since the 9/11 attack, as well as the decades of violence and interference from the U.S. in that region prior to that.</p>
+ <p><strong>Second</strong>, in what conceivable sense can this incident be called a &#8220;terrorist&#8221; attack? As I have <a href="http://www.salon.com/2010/02/19/terrorism_19/">written</a> <a href="http://www.theguardian.com/commentisfree/2012/dec/16/court-terrorism-morales-gangs-meaningless">many times</a> over the last several years, and as some of the <a href="http://www.salon.com/2010/03/14/terrorism_20/">best scholarship proves</a>, &#8220;terrorism&#8221; is a word utterly devoid of objective or consistent meaning. It is little more than a totally malleable, propagandistic fear-mongering term used by Western governments (<a href="http://www.globalresearch.ca/bashar-al-assad-interview-the-fight-against-terrorists-in-syria/5365613">and non-Western ones</a>) to justify whatever actions they undertake. As Professor Tomis Kapitan wrote in <a href="http://opinionator.blogs.nytimes.com/2014/10/19/the-reign-of-terror/?_php=true&amp;_type=blogs&amp;_r=0">a brilliant essay in <em>The New York Times</em> on Monday</a>: &#8220;Part of the success of this rhetoric traces to the fact that there is no consensus about the meaning of &#8216;terrorism.&#8217;&#8221;</p>
+ <p>But to the extent the term has any common understanding, it includes the deliberate (or wholly reckless) targeting of civilians with violence for political ends. But in this case in Canada, it wasn&#8217;t civilians who were targeted. If one believes the government&#8217;s accounts of the incident, the driver waited two hours until he saw a soldier in uniform. In other words, he seems to have <em>deliberately avoided attacking civilians</em>, and targeted a soldier instead &#8211; a member of a military that is currently fighting a war.</p>
+ <p>Again, the point isn&#8217;t justifiability. There is a compelling argument to make that undeployed soldiers engaged in normal civilian activities at home are not valid targets under the laws of war (although the U.S. and its closest allies use <a href="http://www.theguardian.com/commentisfree/cifamerica/2010/dec/10/al-jazeera-us-integrity-wikileaks">extremely broad</a> and <a href="http://news.nationalpost.com/2014/07/13/gaza-police-chief-survives-israeli-airstrike-on-family-home-but-bombs-kill-18-relatives-including-children/">permissive standards</a> for what constitutes legitimate military targets when it comes to their own violence). The point is that targeting soldiers who are part of a military fighting an active war is completely inconsistent with the common usage of the word &#8220;terrorism,&#8221; and yet it is reflexively applied by government officials and media outlets to this incident in Canada (and others like it <a href="http://www.theguardian.com/commentisfree/2013/may/23/woolwich-attack-terrorism-blowback">in the UK</a> and <a href="http://www.salon.com/2009/11/09/terrorism_7/">the US</a>).</p>
+ <p>That&#8217;s because the most common functional definition of &#8220;terrorism&#8221; in Western discourse is quite clear. At this point, it means little more than: &#8220;violence directed at Westerners by Muslims&#8221; (when not used to mean &#8220;violence by Muslims,&#8221; it usually just means: <a href="http://www.theglobeandmail.com/news/politics/ottawas-new-anti-terrorism-strategy-lists-eco-extremists-as-threats/article533522/">violence the state dislikes</a>). The term &#8220;terrorism&#8221; has become nothing more than a rhetorical weapon for legitimizing all violence by Western countries, and delegitimizing all violence against them, even when the violence called &#8220;terrorism&#8221; is clearly intended as retaliation for Western violence.</p>
+ <p>This is about far more than semantics. It is central to how the west propagandizes its citizenries; the manipulative use of the &#8220;terrorism&#8221; term lies at heart of that. As Professor Kapitan wrote yesterday in <em>The New York Times</em>:</p>
+ <blockquote>
+ <p class="story-body-text">Even when a definition is agreed upon, the rhetoric of “terror” is applied both selectively and inconsistently<strong>. In the mainstream American media, the “terrorist” label is usually reserved for those opposed to the policies of the U.S. and its allies.</strong> By contrast, some acts of violence that constitute terrorism under most definitions are not identified as such — for instance, the massacre of over 2000 Palestinian civilians in the Beirut refugee camps in 1982 or the killings of more than 3000 civilians in Nicaragua by “contra” rebels during the 1980s, or the genocide that took the lives of at least a half million Rwandans in 1994. At the opposite end of the spectrum, some actions that do not qualify as terrorism are labeled as such — that would include attacks by Hamas, Hezbollah or ISIS, for instance, against uniformed soldiers on duty.</p>
+ <p class="story-body-text">Historically, <strong>the rhetoric of terror has been used by those in power not only to sway public opinion, but to direct attention away from their own acts of terror.</strong></p>
+ </blockquote>
+ <p class="story-body-text">At this point, &#8220;terrorism&#8221; is the term that means nothing, but justifies everything. It is long past time that media outlets begin skeptically questioning its usage by political officials rather than mindlessly parroting it.</p>
+ <p class="story-body-text"><em>Photo: AP/The Canadian Press, Adrian Wyld</em></p>
+ <p class="story-body-text"><span style="text-decoration: underline"><strong>UPDATE</strong></span>: Multiple conservative commentators have claimed that this article and my subsequent discussion of it are about this morning&#8217;s <a href="http://www.ecanadanow.com/canada/2014/10/22/police-say-soldier-shot-at-war-memorial-in-ottawa-report/">shooting of a solider in Ottawa</a>. Aside from the fact that what I wrote is expressly about a completely different incident &#8211; one that took place in Quebec on Monday &#8211; this article and my comments were published <strong>before</strong> this morning&#8217;s shooting spree was reported. So unless someone believes I possess powers of clairvoyance, the claim that I was commenting on the Ottawa shooting &#8211; about which virtually nothing is known, including the identity and motive of the shooter(s) &#8211; is obviously false.</p>
+ <p class="story-body-text">Then there&#8217;s also the extremely predictable accusation that I was <em>justifying</em> the attack on the soldiers. I know from prior experience in discussing these questions that no matter how clear you make it that you are writing about <i>causation</i> and not <em>justification</em>, many will still distort what you write to claim you&#8217;ve justified the attack. That&#8217;s true even if one makes as clear as the English language permits that you&#8217;re not writing about justification: &#8220;<strong>The issue here is not justification (very few people would view attacks on soldiers in a shopping mall parking lot to be justified). The issue is </strong><em><strong>causation.&#8221;</strong></em> If there&#8217;s a way to make that any clearer, please let me know.</p>
+ <p class="story-body-text">One more time: the difference between &#8220;causation&#8221; and &#8220;justification&#8221; is so obvious that it should require no explanation. If one observes that someone who smokes four packs of cigarettes a day can expect to develop e<span style="color: #545454">mphysema, that&#8217;s an observation about causation, not a celebration of the person&#8217;s illness. Only a willful desire to distort, or some deep confusion, can account for a failure to process this most basic point.</span></p>
+ <p class="story-body-text"><span style="text-decoration: underline"><strong>UPDATE II</strong></span>: In that <a href="http://opinionator.blogs.nytimes.com/2014/10/19/the-reign-of-terror/?_php=true&amp;_type=blogs&amp;_php=true&amp;_type=blogs&amp;_php=true&amp;_type=blogs&amp;_r=2&amp;">brilliant essay</a> I referenced above, published just three days ago in <em>The New York Times</em>, Professor Tomis Kapitan made this point:</p>
+ <blockquote>
+ <p class="story-body-text">Obviously, to point out the causes and objectives of particular terrorist actions is to imply nothing about their legitimacy — that is an independent matter&#8230;.</p>
+ </blockquote>
+ <p class="story-body-text">That point is so simple and, as he said, &#8220;obvious&#8221; that I have a hard time understanding what could account for some commentators conflating the two other than a willful desire to mislead.</p>
+ </div>
+ <div class="contact">
+ <p>Email the author: <a href='mailto:glenn.greenwald@theintercept.com'>glenn.greenwald@theintercept.com</a></p> </div>
+ </article>
+ </div>
+
+
+
+ <div id="comments" data-track="Comment" href="#comments">
+
+ <div class="comment-count">
+ 627 Discussing
+ </div>
+
+ <nav id="comment-nav-below"><h4 class="section-heading commentnav"><a id="more-comment-link" href="/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/?comments=all#comments">Show comments</a></h4></nav><p class="nocomments">Comments closed.</p>
+ <div class="ti-recommended">
+ <h2>Recommended</h2>
+ <ul>
+ <!-- cids 0 8 ["7552","7744","7761","7303","6956","7242","7294","7252"] --><li class='ti-reccol-left'> <img class='subfeature align-left' src='https://prod01-cdn03.cdn.firstlook.org/wp-uploads/sites/1/2014/10/laptop-smartphone-excerpt-small.jpg' alt='Secret Manuals Show the Spyware Sold to Despots and Cops Worldwide' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/30/hacking-team/' title='Secret Manuals Show the Spyware Sold to Despots and Cops Worldwide'>
+ Secret Manuals Show the Spyware Sold to Despots and Cops Worldwide </a>
+ </h4>
+ </li><li class='ti-reccol-right'> <img class='subfeature align-left' src='https://prod01-cdn02.cdn.firstlook.org/wp-uploads/sites/1/2014/11/catcall-video-excerpt-small.jpg' alt='No, We Don&amp;#8217;t Need a Law Against Catcalling' /> <h4> <a href='https://firstlook.org/theintercept/2014/11/03/we-dont-need-a-law-against-catcalling/' title='No, We Don&#8217;t Need a Law Against Catcalling'>
+ No, We Don&#8217;t Need a Law Against Catcalling </a>
+ </h4>
+ </li><li class='ti-reccol-left'> <img class='subfeature align-left' src='https://prod01-cdn01.cdn.firstlook.org/wp-uploads/sites/1/2014/11/05-technician-guide-p71u-1-excerpt-small.jpg' alt='Hacking Team Responds in Defense of Its Spyware' /> <h4> <a href='https://firstlook.org/theintercept/2014/11/03/hacking-team-responds-defense-spyware/' title='Hacking Team Responds in Defense of Its Spyware'>
+ Hacking Team Responds in Defense of Its Spyware </a>
+ </h4>
+ </li><li class='ti-reccol-right'> <img class='subfeature align-left' src='https://prod01-cdn01.cdn.firstlook.org/wp-uploads/sites/1/2014/10/micah_snowden_crop_v3-excerpt-small.jpg' alt='Ed Snowden Taught Me To Smuggle Secrets Past Incredible Danger. Now I Teach You.' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/28/smuggling-snowden-secrets/' title='Ed Snowden Taught Me To Smuggle Secrets Past Incredible Danger. Now I Teach You.'>
+ Ed Snowden Taught Me To Smuggle Secrets Past Incredible Danger. Now I Teach You. </a>
+ </h4>
+ </li><li class='ti-reccol-left'> <img class='subfeature align-left' src='https://prod01-cdn00.cdn.firstlook.org/wp-uploads/sites/1/2014/10/455730724-excerpt-small.jpg' alt='The FBI Director&amp;#8217;s Evidence Against Encryption Is Pathetic' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/17/draft-two-cases-cited-fbi-dude-dumb-dumb/' title='The FBI Director&#8217;s Evidence Against Encryption Is Pathetic'>
+ The FBI Director&#8217;s Evidence Against Encryption Is Pathetic </a>
+ </h4>
+ </li><li class='ti-reccol-right'> <img class='subfeature align-left' src='https://prod01-cdn03.cdn.firstlook.org/wp-uploads/sites/1/2014/10/AP071002027341-excerpt-small.jpg' alt='Blackwater Founder Remains Free and Rich While His Former Employees Go Down on Murder Charges' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/22/blackwater-guilty-verdicts/' title='Blackwater Founder Remains Free and Rich While His Former Employees Go Down on Murder Charges'>
+ Blackwater Founder Remains Free and Rich While His Former Employees Go Down on Murder Charges </a>
+ </h4>
+ </li><li class='ti-reccol-left'> <img class='subfeature align-left' src='https://prod01-cdn02.cdn.firstlook.org/wp-uploads/sites/1/2014/10/difi-excerpt-small.jpg' alt='Is Obama Stalling Until Republicans Can Bury the CIA Torture Report?' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/23/white-house-waiting-gop-senate-kill-feinsteins-torture-report/' title='Is Obama Stalling Until Republicans Can Bury the CIA Torture Report?'>
+ Is Obama Stalling Until Republicans Can Bury the CIA Torture Report? </a>
+ </h4>
+ </li><li class='ti-reccol-right'> <img class='subfeature align-left' src='https://prod01-cdn02.cdn.firstlook.org/wp-uploads/sites/1/2014/10/AP740808069-excerpt-small.jpg' alt='A Story About Ben Bradlee That’s Not Fucking Charming' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/22/a-ben-bradlee-story-thats-not-fucking-charming/' title='A Story About Ben Bradlee That’s Not Fucking Charming'>
+ A Story About Ben Bradlee That’s Not Fucking Charming </a>
+ </h4>
+ </li> </ul>
+ </div>
+ </div>
+ </section>
+
+ <footer role="banner">
+ <div class="banner-bar">
+ <div class="grid">
+ <cite>&copy; First Look Media. All Rights Reserved</cite>
+ <nav role="navigation" class="ti-menu menu-footer">
+ <ul id="menu-footer" class="menu"><li id="menu-item-100" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-100"><a href="https://firstlook.org/theintercept/about/">About</a></li>
+ <li id="menu-item-839" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-839"><a href="https://firstlook.org/theintercept/terms-use/">Terms of Use</a></li>
+ <li id="menu-item-106" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-106"><a href="https://firstlook.org/theintercept/privacy-policy/">Privacy Policy</a></li>
+ <li id="menu-item-3926" class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3926"><a href="/theintercept/feed/?rss">RSS</a></li>
+ <li id="menu-item-107" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-107"><a href="https://firstlook.org/theintercept/contact/">Contact</a></li>
+ </ul> </nav>
+ <nav role="navigation" class="ti-menu menu-social">
+ <h1>Stay in Touch</h1>
+ <ul id="menu-social-1" class="menu"><li class="ti-icon icon-twitter menu-item menu-item-type-custom menu-item-object-custom menu-item-869"><a title="twitter" target="_blank" href="https://twitter.com/the_intercept">Twitter</a></li>
+ <li class="ti-icon icon-facebook menu-item menu-item-type-custom menu-item-object-custom menu-item-3921"><a href="https://facebook.com/theinterceptflm">Facebook</a></li>
+ <li class="ti-icon icon-rss menu-item menu-item-type-post_type menu-item-object-page menu-item-125"><a title="rss" href="https://firstlook.org/theintercept/feeds/">RSS feeds</a></li>
+ </ul> </nav>
+ <div class="copyright for-mobile">
+ &copy; 2014 First Look Media, Inc. All rights reserved.
+ </div>
+ </div>
+ <a class="to-top" href="#">Back to Top</a>
+ </div>
+ </footer>
+
+ <script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-includes/js/comment-reply.min.js?ver=3.9.2'></script>
+ </body>
+ </html>
@@ -60,6 +60,10 @@ describe IndependentPageParserV1 do
       @pa.content[3].should == 'Many Saudi women have welcomed the freeze of the measure, including Sabria S. Jawhar, a Saudi columnist and assistant professor of applied linguistics at King Saud bin Abdulaziz University for Health Sciences.'
       @pa.content.size.should == 11
     end
+
+    it "should parse the guid" do
+      @pa.guid_from_url.should == '9065486'
+    end
   end
 
   describe "when parsing the syria article" do
@@ -0,0 +1,67 @@
+# -*- coding: utf-8 -*-
+require 'spec_helper'
+include WebPageParser
+
+describe TheInterceptPageParserFactory do
+  before do
+    @valid_urls = [
+      'https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/',
+      'https://firstlook.org/theintercept/2014/10/31/block-boat-work-middle/',
+      'https://firstlook.org/theintercept/2014/10/31/block-boat-work-middle'
+    ]
+    @invalid_urls = [
+      'https://firstlook.org/theintercept/document/2014/10/30/hacking-team-rcs-9-0-changelog/',
+      'https://firstlook.org/theintercept/froomkin/',
+      'https://firstlook.org/theintercept/staff/jeremy-scahill/'
+    ]
+  end
+
+  it "should detect intercept articles from the url" do
+    @valid_urls.each do |url|
+      TheInterceptPageParserFactory.can_parse?(:url => url).should be_true
+    end
+  end
+
+  it "should ignore pages with the wrong url format" do
+    @invalid_urls.each do |url|
+      TheInterceptPageParserFactory.can_parse?(:url => url).should be_nil
+    end
+  end
+
+end
+
+describe TheInterceptPageParserV1 do
+
+  describe 'when parsing the canada at war article' do
+    before do
+      @valid_options = {
+        :url => 'https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/',
+        :page => File.read('spec/fixtures/theintercept/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers.html'),
+        :valid_hash => '6110428997aafee4873fd2cd2dbc6c03'
+      }
+      @pa = TheInterceptPageParserV1.new(@valid_options)
+
+    end
+
+    it 'should parse the title' do
+      @pa.title.should == "Canada, At War For 13 Years, Shocked That 'A Terrorist' Attacked Its Soldiers"
+    end
+
+    it 'should parse the content' do
+      @pa.content[1].should == 'TORONTO – In Quebec on Monday, two Canadian soldiers were hit by a car driven by Martin Couture-Rouleau, a 25-year-old Canadian who, as The Globe and Mail reported, “converted to Islam recently and called himself Ahmad Rouleau.” One of the soldiers died, as did Couture-Rouleau when he was shot by police upon apprehension after allegedly brandishing a large knife. Police speculated that the incident was deliberate, alleging the driver waited for two hours before hitting the soldiers, one of whom was wearing a uniform. The incident took place in the parking lot of a shopping mall 30 miles southeast of Montreal, “a few kilometres from the Collège militaire royal de Saint-Jean, the military academy operated by the Department of National Defence.”'
+      @pa.content[5].should == 'First, Canada has spent the last 13 years proclaiming itself a nation at war. It actively participated in the invasion and occupation of Afghanistan and was an enthusiastic partner in some of the most extremist War on Terror abuses perpetrated by the U.S. Earlier this month, the Prime Minister revealed, with the support of a large majority of Canadians, that “Canada is poised to go to war in Iraq, as [he] announced plans in Parliament [] to send CF-18 fighter jets for up to six months to battle Islamic extremists.” Just yesterday, Canadian Defence Minister Rob Nicholson flamboyantly appeared at the airfield in Alberta from which the fighter jets left for Iraq and stood tall as he issued the standard Churchillian war rhetoric about the noble fight against evil.'
+      @pa.content[16].should == 'Even when a definition is agreed upon, the rhetoric of “terror” is applied both selectively and inconsistently. In the mainstream American media, the “terrorist” label is usually reserved for those opposed to the policies of the U.S. and its allies. By contrast, some acts of violence that constitute terrorism under most definitions are not identified as such — for instance, the massacre of over 2000 Palestinian civilians in the Beirut refugee camps in 1982 or the killings of more than 3000 civilians in Nicaragua by “contra” rebels during the 1980s, or the genocide that took the lives of at least a half million Rwandans in 1994. At the opposite end of the spectrum, some actions that do not qualify as terrorism are labeled as such — that would include attacks by Hamas, Hezbollah or ISIS, for instance, against uniformed soldiers on duty.'
+      @pa.content[23].should == 'UPDATE II: In that brilliant essay I referenced above, published just three days ago in The New York Times, Professor Tomis Kapitan made this point:'
+      @pa.content.last.should == 'That point is so simple and, as he said, “obvious” that I have a hard time understanding what could account for some commentators conflating the two other than a willful desire to mislead.'
+      @pa.content.size.should == 26
+      @pa.hash.should == @valid_options[:valid_hash]
+    end
+
+    it 'should parse the date in UTC' do
+      @pa.date.should == DateTime.parse('22nd October 2014, 08:56:26')
+      @pa.date.zone.should == '+00:00'
+    end
+
+  end
+
+end
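The factory spec above accepts dated article URLs (with or without a trailing slash) and rejects document, column and staff pages. The parser source is not part of this excerpt, so the following is only a minimal sketch of one way such a check could be written. `INTERCEPT_ARTICLE_URL` and `intercept_article_url?` are hypothetical names, and the pattern is an assumption derived solely from the six URLs in the spec.

```ruby
# Hypothetical sketch, not the gem's actual implementation.
# Assumes an article URL is /theintercept/YYYY/MM/DD/<slug> with an optional trailing slash.
INTERCEPT_ARTICLE_URL = %r{\Ahttps://firstlook\.org/theintercept/\d{4}/\d{2}/\d{2}/[a-z0-9-]+/?\z}

# Returns the match offset (truthy) for article URLs and nil otherwise,
# which lines up with the be_true / be_nil expectations in the spec.
def intercept_article_url?(url)
  url =~ INTERCEPT_ARTICLE_URL
end
```

Returning the result of `=~` rather than a strict boolean is a deliberate choice here: the spec checks for `be_nil` on rejected URLs, which a `false` return value would not satisfy.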
@@ -50,6 +50,14 @@ describe WashingtonPostPageParserV1 do
     it 'should parse the content' do
       @pa.content[0].should == 'In a major setback for al-Qaeda’s affiliate in East Africa, the Obama administration said Friday it had confirmed the death of a key Somali militant leader who had been targeted in an airstrike earlier in the week.'
     end
+
+    it 'should get the guid from the url' do
+      @pa.guid_from_url.should == 'fc9fee06-3512-11e4-9e92-0899b306bbea'
+    end
+
+    it 'should return the guid from the url using the guid method' do
+      @pa.guid.should == 'fc9fee06-3512-11e4-9e92-0899b306bbea'
+    end
   end
 
   describe 'when parsing the bust-boom article' do
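The new guid specs expect a UUID for the Washington Post article and a numeric id for the Independent article. How `guid_from_url` is implemented is not shown in this excerpt. As a rough illustration only, the two helpers below extract those two id shapes from a URL string. The helper names, the `UUID_PATTERN` constant and the `example.com` URLs in the usage note are all assumptions made for this sketch.

```ruby
# Hypothetical helpers; names and URL shapes are assumptions for illustration,
# not the gem's actual guid_from_url implementation.
UUID_PATTERN = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/

# Returns the first UUID-shaped token found in the URL, or nil if there is none.
def uuid_guid_from_url(url)
  url[UUID_PATTERN]
end

# Returns the digits of a trailing "-<digits>.html" segment, or nil if absent.
def numeric_guid_from_url(url)
  url[/-(\d+)\.html\z/, 1]
end
```

For example, `uuid_guid_from_url('https://example.com/world/fc9fee06-3512-11e4-9e92-0899b306bbea_story.html')` yields the UUID string, and `numeric_guid_from_url` applied to a URL ending in `-9065486.html` yields `'9065486'`.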
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: web-page-parser
 version: !ruby/object:Gem::Version
-  version: 1.0.0
+  version: 1.1.0
 platform: ruby
 authors:
 - John Leach
@@ -31,7 +31,7 @@ cert_chain:
   MghEyBTNQa+QTUTKQMjYOO3kV+Wuv+iQGaMm/bu2SD+Ov0XUzzAsSfz0ZvrF3fbG
   jdD4CMQtJNDqDiWuUkg=
   -----END CERTIFICATE-----
-date: 2014-10-25 00:00:00.000000000 Z
+date: 2015-01-30 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: htmlentities
@@ -104,8 +104,8 @@ dependencies:
       - !ruby/object:Gem::Version
         version: '0'
 description: A Ruby library to parse the content out of web pages. Currently supports
-  BBC News pages, The Guardian, Independent and New York Times articles. Used by the
-  News Sniffer project. http://www.newssniffer.co.uk
+  BBC News pages, The Guardian, Independent, New York Times and The Intercept articles.
+  Used by the News Sniffer project. http://www.newssniffer.co.uk
 email: john@johnleach.co.uk
 executables: []
 extensions: []
@@ -124,6 +124,7 @@ files:
 - lib/web-page-parser/parsers/independent_page_parser.rb
 - lib/web-page-parser/parsers/new_york_times_page_parser.rb
 - lib/web-page-parser/parsers/test_page_parser.rb
+- lib/web-page-parser/parsers/the_intercept_page_parser.rb
 - lib/web-page-parser/parsers/washingtonpost_page_parser.rb
 - spec/base_parser_spec.rb
 - spec/fixtures/bbc_news/10249066.stm.html
@@ -153,6 +154,7 @@ files:
 - spec/fixtures/new_york_times/khaled-meshal-the-leader-of-hamas-vacates-damascus.html
 - spec/fixtures/new_york_times/show-banned-french-comedian-has-new-one.html
 - spec/fixtures/new_york_times/the-long-run-gingrich-stuck-to-caustic-path-in-ethics-battles.html
+- spec/fixtures/theintercept/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers.html
 - spec/fixtures/washingtonpost/pentagon-confirms-al-shabab-leader-killed.html
 - spec/fixtures/washingtonpost/sgt-bowe-bergdahls-capture-remains-amystery.html
 - spec/fixtures/washingtonpost/will-a-bust-follow-the-boom-in-britain.html
@@ -161,6 +163,7 @@ files:
 - spec/parsers/guardian_page_spec.rb
 - spec/parsers/independent_page_parser_spec.rb
 - spec/parsers/new_york_times_page_parser_spec.rb
+- spec/parsers/the_intercept_page_parser_spec.rb
 - spec/parsers/washingtonpost_page_parser_spec.rb
 - spec/spec.opts
 - spec/spec_helper.rb
@@ -189,42 +192,44 @@ signing_key:
 specification_version: 4
 summary: A parser for various news organisation's web pages
 test_files:
-- spec/fixtures/new_york_times/show-banned-french-comedian-has-new-one.html
-- spec/fixtures/new_york_times/the-long-run-gingrich-stuck-to-caustic-path-in-ethics-battles.html
-- spec/fixtures/new_york_times/khaled-meshal-the-leader-of-hamas-vacates-damascus.html
-- spec/fixtures/guardian/syria-libya-middle-east-unrest-live.html
-- spec/fixtures/guardian/anger-grows-rbs-chiefs-bonus.html
-- spec/fixtures/guardian/anger-grows-rbs-chiefs-bonus-with-explainer.html
-- spec/fixtures/guardian/nhs-patient-data-available-companies-buy.html
-- spec/fixtures/guardian/barack-obama-nicki-minaj-mariah-carey.html
-- spec/fixtures/washingtonpost/sgt-bowe-bergdahls-capture-remains-amystery.html
-- spec/fixtures/washingtonpost/will-a-bust-follow-the-boom-in-britain.html
-- spec/fixtures/washingtonpost/pentagon-confirms-al-shabab-leader-killed.html
-- spec/fixtures/cassette_library/BbcNewsPageParserV4.yml
-- spec/fixtures/independent/innocent-starving-close-to-death-one-victim-of-the-siege-that-shames-syria-9065538.html
-- spec/fixtures/independent/david-cameron-set-for-uturn-over-uk-sanctuary-9077647.html
-- spec/fixtures/independent/belgian-man-who-skipped-100-restaurant-bills-is-killed-9081407.html
-- spec/fixtures/independent/saudi-authorities-stop-textmessage-tracking-of-women-for-now-9065486.html
-- spec/fixtures/bbc_news/20230333.stm.html
 - spec/fixtures/bbc_news/10249066.stm.html
+- spec/fixtures/bbc_news/10341015.stm.html
+- spec/fixtures/bbc_news/11125504.html
+- spec/fixtures/bbc_news/12921632.html
 - spec/fixtures/bbc_news/13293006.html
+- spec/fixtures/bbc_news/19957138.stm.html
+- spec/fixtures/bbc_news/20230333.stm.html
+- spec/fixtures/bbc_news/21528631.html
+- spec/fixtures/bbc_news/6072486.stm.html
 - spec/fixtures/bbc_news/7745137.stm.html
+- spec/fixtures/bbc_news/8011268.stm.html
 - spec/fixtures/bbc_news/8029015.stm.html
-- spec/fixtures/bbc_news/11125504.html
 - spec/fixtures/bbc_news/8040164.stm.html
-- spec/fixtures/bbc_news/21528631.html
-- spec/fixtures/bbc_news/10341015.stm.html
 - spec/fixtures/bbc_news/8063681.stm.html
-- spec/fixtures/bbc_news/19957138.stm.html
-- spec/fixtures/bbc_news/6072486.stm.html
-- spec/fixtures/bbc_news/8011268.stm.html
-- spec/fixtures/bbc_news/12921632.html
-- spec/base_parser_spec.rb
-- spec/parsers/washingtonpost_page_parser_spec.rb
+- spec/fixtures/cassette_library/BbcNewsPageParserV4.yml
+- spec/fixtures/guardian/anger-grows-rbs-chiefs-bonus-with-explainer.html
+- spec/fixtures/guardian/anger-grows-rbs-chiefs-bonus.html
+- spec/fixtures/guardian/barack-obama-nicki-minaj-mariah-carey.html
+- spec/fixtures/guardian/nhs-patient-data-available-companies-buy.html
+- spec/fixtures/guardian/syria-libya-middle-east-unrest-live.html
+- spec/fixtures/independent/belgian-man-who-skipped-100-restaurant-bills-is-killed-9081407.html
+- spec/fixtures/independent/david-cameron-set-for-uturn-over-uk-sanctuary-9077647.html
+- spec/fixtures/independent/innocent-starving-close-to-death-one-victim-of-the-siege-that-shames-syria-9065538.html
+- spec/fixtures/independent/saudi-authorities-stop-textmessage-tracking-of-women-for-now-9065486.html
+- spec/fixtures/new_york_times/khaled-meshal-the-leader-of-hamas-vacates-damascus.html
+- spec/fixtures/new_york_times/show-banned-french-comedian-has-new-one.html
221
+ - spec/fixtures/new_york_times/the-long-run-gingrich-stuck-to-caustic-path-in-ethics-battles.html
222
+ - spec/fixtures/washingtonpost/pentagon-confirms-al-shabab-leader-killed.html
223
+ - spec/fixtures/washingtonpost/sgt-bowe-bergdahls-capture-remains-amystery.html
224
+ - spec/fixtures/washingtonpost/will-a-bust-follow-the-boom-in-britain.html
225
+ - spec/fixtures/theintercept/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers.html
224
226
  - spec/parsers/bbc_news_page_spec.rb
225
227
  - spec/parsers/guardian_page_spec.rb
226
- - spec/parsers/independent_page_parser_spec.rb
227
228
  - spec/parsers/new_york_times_page_parser_spec.rb
229
+ - spec/parsers/independent_page_parser_spec.rb
230
+ - spec/parsers/the_intercept_page_parser_spec.rb
231
+ - spec/parsers/washingtonpost_page_parser_spec.rb
232
+ - spec/parser_factory_spec.rb
228
233
  - spec/spec.opts
229
234
  - spec/spec_helper.rb
230
- - spec/parser_factory_spec.rb
235
+ - spec/base_parser_spec.rb
metadata.gz.sig CHANGED
Binary file