web-page-parser 1.0.0 → 1.1.0
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +4 -1
- data.tar.gz.sig +0 -0
- data/README.rdoc +26 -17
- data/lib/web-page-parser/base_parser.rb +11 -1
- data/lib/web-page-parser/http.rb +1 -0
- data/lib/web-page-parser/parsers/the_intercept_page_parser.rb +62 -0
- data/spec/base_parser_spec.rb +10 -0
- data/spec/fixtures/theintercept/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers.html +327 -0
- data/spec/parsers/independent_page_parser_spec.rb +4 -0
- data/spec/parsers/the_intercept_page_parser_spec.rb +67 -0
- data/spec/parsers/washingtonpost_page_parser_spec.rb +8 -0
- metadata +37 -32
- metadata.gz.sig +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 03275ebf096ab1230e14cc43df8d273bfb39a4de
|
4
|
+
data.tar.gz: 628fb68509c4352ac1962036cf1e33c0a51e9390
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: efb4b1cde955569cb14431a8d3748a348255a408acd15c7975715fd1b56a909489a143d2869187b834c066d6763a13276f7cc8b548112d61ea054fad0b8164ac
|
7
|
+
data.tar.gz: 083055dbff7568b6576a64c191d57406a316aaabd60e59f1eb77a2c960ed125899ead48cf2dcb3d894b2955f2d76d7a8494240d94bc3b2990e71b9485a51c62a
|
checksums.yaml.gz.sig
CHANGED
Binary file
|
data.tar.gz.sig
CHANGED
Binary file
|
data/README.rdoc
CHANGED
@@ -1,36 +1,45 @@
|
|
1
1
|
= Web Page Parser
|
2
2
|
|
3
|
-
Web Page Parser is a Ruby library to parse the content out of web
|
4
|
-
pages, such as BBC News pages. It strips all non-textual stuff out,
|
5
|
-
leaving the title, publication date and an array of paragraphs. It
|
6
|
-
makes heavy use of regular expressions, rather than actually parsing
|
7
|
-
the HTML. This may sound a bit whacky, but BBC News html in particular
|
8
|
-
has semantic markup *within comments*, which cannot easily be
|
9
|
-
referenced with standard HTML parsing. Regular expressions are much
|
10
|
-
faster than full HTML parsing too.
|
3
|
+
Web Page Parser is a Ruby library to parse the content out of certain web pages, such as BBC News pages. It strips all non-textual stuff out, leaving the title, publication date and an array of paragraphs.
|
11
4
|
|
5
|
+
Web Page Parser used to make heavy use of regular expressions, rather than actually parsing the HTML. This may sound a bit whacky, but BBC News HTML in particular had semantic markup *within comments*, which could not easily be referenced with standard HTML parsing. But the early wild west days of using Web Page Parser (back in 2009!) are over: news web page formatting has improved a lot, and most of the parsers now use standard HTML parsing.
|
12
6
|
|
13
|
-
Web Page Parser currently supports BBC News
|
14
|
-
|
7
|
+
Web Page Parser currently supports BBC News, Independent, New York Times, Washington Post and Guardian news articles but new parsers are planned and can be added easily.
|
8
|
+
|
9
|
+
== News Sniffer
|
10
|
+
|
11
|
+
Web Page Parser is primarily used by the {News Sniffer}[http://www.newssniffer.co.uk] project, which parses and archives news articles to keep track of how they change. This has heavily influenced the design of Web Page Parser.
|
12
|
+
|
13
|
+
News Sniffer requires that an update to a parser doesn't cause a false change to be detected in the backlog of tracked articles. Web Page Parser caters to this by supporting multiple versions of each parser.
|
14
|
+
|
15
|
+
So whenever a parser has to be changed, say, to support a new design, or to remove some useless non-textual widget, the existing parser is not touched and a new version is added. The new version will often inherit most of the behaviour of the previous version and just add the new filters or tweaks necessary.
|
16
|
+
|
17
|
+
So, for example, the BBC often change their design for new articles but their old articles can stay using the same old design. Web Page Parser's BBC parser still supports the older article formats without changing the resulting parsed content at all.
|
18
|
+
|
19
|
+
Web Page Parser will always use the latest version of each parser by default (using the url to detect which parser to use), but you can specifically require any particular version. News Sniffer keeps track of which parser version was used for each article so it can ensure it uses the same one from then on.
|
15
20
|
|
16
|
-
It is used by the {News Sniffer}[http://www.newssniffer.co.uk]
|
17
|
-
project, which parses and archives news articles to keep track of how
|
18
|
-
they change.
|
19
21
|
|
20
22
|
== Example usage
|
21
23
|
|
22
24
|
require 'web-page-parser'
|
23
|
-
require 'open-uri'
|
24
25
|
|
25
26
|
url = "http://news.bbc.co.uk/1/hi/uk/8041972.stm"
|
26
|
-
page_data = open(url).read
|
27
27
|
|
28
|
-
page = WebPageParser::ParserFactory.parser_for(:url => url
|
28
|
+
page = WebPageParser::ParserFactory.parser_for(:url => url)
|
29
29
|
|
30
30
|
puts page.title # MPs hit back over expenses claims
|
31
31
|
puts page.date # 2009-05-09T18:58:59+00:00
|
32
32
|
puts page.content.first # The wife of author Ken Follett and ...
|
33
33
|
|
34
|
+
== Or specify a particular parser
|
35
|
+
|
36
|
+
url = "http://www.theguardian.com/world/2014/oct/24/kurds-fear-isis-chemical-weapon-kobani"
|
37
|
+
|
38
|
+
page = WebPageParser::GuardianPageParserV1.new(:url => url)
|
39
|
+
|
40
|
+
puts page.title # Barack Obama declares Iraq war a success
|
41
|
+
|
42
|
+
|
34
43
|
== Ruby 1.8 support
|
35
44
|
|
36
45
|
Installing the Oniguruma gem on Ruby 1.8 will make Web Page Parser run
|
@@ -42,5 +51,5 @@ Web Page Parser was written by {John Leach}[http://johnleach.co.uk]
|
|
42
51
|
and is released under the MIT License.
|
43
52
|
|
44
53
|
The code is available on
|
45
|
-
{github}[http://github.com/johnl/web-page-parser
|
54
|
+
{github}[http://github.com/johnl/web-page-parser].
|
46
55
|
|
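The versioning scheme described in the README above can be sketched roughly as follows. This is only an illustration: the class names `ExamplePageParserV1` and `ExamplePageParserV2` and the `filter` hook are hypothetical and are not part of the gem's API.

```ruby
# Hypothetical sketch of parser versioning; none of these class names
# exist in web-page-parser.
class ExamplePageParserV1
  def initialize(paragraphs)
    @paragraphs = paragraphs
  end

  # V1 behaviour: strip whitespace and drop empty paragraphs.
  def content
    @content ||= filter(@paragraphs.map(&:strip).reject(&:empty?))
  end

  # Hook that later versions can override to add extra filtering.
  def filter(paragraphs)
    paragraphs
  end
end

# V2 inherits everything from V1 and only adds one new filter. V1 itself
# is left untouched, so articles tracked with V1 keep the same output.
class ExamplePageParserV2 < ExamplePageParserV1
  def filter(paragraphs)
    super.reject { |p| p.start_with?("Share this") }
  end
end

raw = ["  First paragraph. ", "", "Share this article", "Second paragraph."]
v1_content = ExamplePageParserV1.new(raw).content
v2_content = ExamplePageParserV2.new(raw).content
```

Because the old class is never edited, re-parsing an archived article with the version recorded for it cannot produce a spurious change.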
data/lib/web-page-parser/base_parser.rb
CHANGED
@@ -12,13 +12,14 @@ module WebPageParser
|
|
12
12
|
attr_accessor :retrieve_session
|
13
13
|
end
|
14
14
|
|
15
|
-
attr_reader :url
|
15
|
+
attr_reader :url
|
16
16
|
|
17
17
|
# takes a hash of options. The :url option passes the page url, and
|
18
18
|
# the :page option passes the raw html page content for parsing
|
19
19
|
def initialize(options = { })
|
20
20
|
@url = options[:url]
|
21
21
|
@page = options[:page]
|
22
|
+
@guid = options[:guid]
|
22
23
|
end
|
23
24
|
|
24
25
|
# return the page contents, retrieving it from the server if necessary
|
@@ -46,6 +47,15 @@ module WebPageParser
|
|
46
47
|
def date
|
47
48
|
end
|
48
49
|
|
50
|
+
def guid_from_url
|
51
|
+
end
|
52
|
+
|
53
|
+
def guid
|
54
|
+
return @guid if @guid
|
55
|
+
@guid = guid_from_url if url
|
56
|
+
@guid
|
57
|
+
end
|
58
|
+
|
49
59
|
# Return a hash representing the textual content of this web page
|
50
60
|
def hash
|
51
61
|
digest = Digest::MD5.new
|
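The guid lookup added to the base parser above can be exercised in isolation. The sketch below copies that logic into a stand-in class; `SketchBaseParser` and `SketchNumericGuidParser` are hypothetical names, and the digit-based `guid_from_url` is an assumption made for the example rather than any real parser's rule.

```ruby
# Minimal stand-in for the guid logic added to BaseParser in this diff.
class SketchBaseParser
  attr_reader :url

  def initialize(options = {})
    @url = options[:url]
    @guid = options[:guid]
  end

  # Subclasses override this; the base implementation returns nil.
  def guid_from_url
  end

  # An explicit :guid wins; otherwise derive one from the url, if any.
  def guid
    return @guid if @guid
    @guid = guid_from_url if url
    @guid
  end
end

# Hypothetical subclass: treats the last run of digits in the url as the guid.
class SketchNumericGuidParser < SketchBaseParser
  def guid_from_url
    url.to_s.scan(/\d+/).last
  end
end

from_url = SketchNumericGuidParser.new(:url => "http://news.bbc.co.uk/1/hi/uk/8041972.stm").guid
explicit = SketchNumericGuidParser.new(:url => "http://example.com/123", :guid => "abc").guid
no_url   = SketchNumericGuidParser.new.guid
```

The three calls cover the three paths through `guid`: derived from the url, supplied explicitly, and absent altogether.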
data/lib/web-page-parser/http.rb
CHANGED
@@ -23,6 +23,7 @@ module WebPageParser
|
|
23
23
|
c.dns_cache_timeout = 600
|
24
24
|
c.enable_cookies = true
|
25
25
|
c.follow_location = true
|
26
|
+
c.max_redirects = 6
|
26
27
|
c.autoreferer = true
|
27
28
|
c.headers["User-Agent"] = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4'
|
28
29
|
c.headers["Accept-encoding"] = 'gzip, deflate'
|
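The `max_redirects` setting added above caps how many redirects a request may follow. The effect of such a cap can be sketched in plain Ruby without any HTTP library. Everything here is hypothetical: `follow`, `TooManyRedirects` and the `fetch` lambda are invented for the illustration and are not how the gem or its HTTP client implement the limit.

```ruby
# Hypothetical redirect follower; `fetch` stands in for a real HTTP call
# and returns [status, location_or_body].
class TooManyRedirects < StandardError; end

def follow(url, fetch, max_redirects = 6)
  redirects = 0
  loop do
    status, value = fetch.call(url)
    # Anything that is not a 3xx response ends the chain.
    return value unless (300..399).cover?(status)
    redirects += 1
    raise TooManyRedirects, url if redirects > max_redirects
    url = value
  end
end

# A two-page redirect loop never resolves, so the cap must trip.
looping = lambda { |u| [302, u == "/a" ? "/b" : "/a"] }
loop_result = begin
  follow("/a", looping)
rescue TooManyRedirects
  :gave_up
end

# A single redirect resolves normally.
pages = { "/old" => [301, "/new"], "/new" => [200, "article body"] }
ok_result = follow("/old", lambda { |u| pages.fetch(u) })
```

Without a cap, the looping case would never return, which is the failure mode a redirect limit guards against.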
data/lib/web-page-parser/parsers/the_intercept_page_parser.rb
ADDED
@@ -0,0 +1,62 @@
|
|
1
|
+
module WebPageParser
|
2
|
+
class TheInterceptPageParserFactory < WebPageParser::ParserFactory
|
3
|
+
URL_RE = ORegexp.new('firstlook.org/theintercept/[0-9]{4}/[0-9]{2}/[0-9]{2}/[a-z0-9-]+')
|
4
|
+
def self.can_parse?(options)
|
5
|
+
URL_RE.match(options[:url])
|
6
|
+
end
|
7
|
+
|
8
|
+
def self.create(options = {})
|
9
|
+
TheInterceptPageParserV1.new(options)
|
10
|
+
end
|
11
|
+
end
|
12
|
+
|
13
|
+
# TheInterceptPageParserV1 parses "The Intercept" web pages using html
|
14
|
+
# parsing.
|
15
|
+
class TheInterceptPageParserV1 < WebPageParser::BaseParser
|
16
|
+
require 'nokogiri'
|
17
|
+
|
18
|
+
# The Intercept articles have no separate id in the url, so the
|
19
|
+
# article url itself is used as the guid
|
20
|
+
def guid_from_url
|
21
|
+
# use the matched article url as the guid, if there is one
|
22
|
+
url.to_s.scan(/https:\/\/firstlook.org\/theintercept\/[0-9]{4}\/[0-9]{2}\/[0-9]{2}\/[a-z0-9-]+/).last
|
23
|
+
end
|
24
|
+
|
25
|
+
def html_doc
|
26
|
+
@html_document ||= Nokogiri::HTML(page)
|
27
|
+
end
|
28
|
+
|
29
|
+
def title
|
30
|
+
return @title if @title
|
31
|
+
title_meta = html_doc.at_css('meta[property="og:title"]')
|
32
|
+
title = nil
|
33
|
+
if title_meta
|
34
|
+
title = title_meta['content'].to_s.strip
|
35
|
+
end
|
36
|
+
if title.nil?
|
37
|
+
title = html_doc.css('head title').text.strip
|
38
|
+
end
|
39
|
+
title = title.gsub(/- The Intercept$/,'')
|
40
|
+
@title = title.strip
|
41
|
+
end
|
42
|
+
|
43
|
+
def content
|
44
|
+
return @content if @content
|
45
|
+
story_body = html_doc.css('article div.ti-body p').collect do |p|
|
46
|
+
p.text.strip.gsub(160.chr(Encoding::UTF_8), ' ') # convert to actual space
|
47
|
+
end
|
48
|
+
@content = story_body.select { |p| !p.empty? }
|
49
|
+
end
|
50
|
+
|
51
|
+
def date
|
52
|
+
return @date if @date
|
53
|
+
if date_meta = html_doc.at_css('meta[property="article:published_time"]')
|
54
|
+
date_string = date_meta['content'].scan(/[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\+[0-9]{2}:[0-9]{2}/).first
|
55
|
+
@date = DateTime.parse(date_string) rescue nil
|
56
|
+
end
|
57
|
+
return @date if @date
|
58
|
+
# failing that, get it from the url
|
59
|
+
@date = DateTime.parse(url.scan(/[0-9]{4}\/[0-9]{2}\/[0-9]{2}/).first.to_s) rescue nil
|
60
|
+
end
|
61
|
+
end
|
62
|
+
end
|
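The url matching and date handling in the parser above can be tried with the standard library alone. This is a sketch under assumptions: `INTERCEPT_URL` and `date_from_url` are hypothetical names, the pattern is adapted from `URL_RE` as a plain `Regexp` with the dot escaped (the gem uses `ORegexp`), and the sample strings are taken from the test fixture in this release.

```ruby
require 'date'

# Adapted from URL_RE in the diff, as a plain Ruby Regexp with the dot escaped.
INTERCEPT_URL = %r{firstlook\.org/theintercept/[0-9]{4}/[0-9]{2}/[0-9]{2}/[a-z0-9-]+}

# Fallback for when no usable published_time meta tag is found:
# read the date out of the /YYYY/MM/DD/ part of the url.
def date_from_url(url)
  DateTime.parse(url.to_s.scan(%r{[0-9]{4}/[0-9]{2}/[0-9]{2}}).first.to_s)
rescue ArgumentError
  nil
end

url = "https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/"
matched = !INTERCEPT_URL.match(url).nil?
fallback_date = date_from_url(url)

# The fixture's published_time value wraps the timestamp in a <span>,
# so the timestamp is picked out with a scan before parsing.
meta_content = "<span class='fltimestamp' data-timestamp='1413982586'>2014-10-22T08:56:26+00:00</span>"
stamp = meta_content.scan(/[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\+[0-9]{2}:[0-9]{2}/).first
published = DateTime.parse(stamp)
```

Scanning for the timestamp rather than parsing the attribute directly is what lets the parser cope with the markup embedded in the meta tag.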
data/spec/base_parser_spec.rb
CHANGED
@@ -14,6 +14,16 @@ share_as :AllPageParsers do
|
|
14
14
|
content.empty?.should be_true
|
15
15
|
end
|
16
16
|
|
17
|
+
it "should use guid_from_url if available" do
|
18
|
+
class GuidTestPageParser < WebPageParser::BaseParser
|
19
|
+
def guid_from_url
|
20
|
+
"guidfromurl"
|
21
|
+
end
|
22
|
+
end
|
23
|
+
GuidTestPageParser.new.guid.should == nil
|
24
|
+
GuidTestPageParser.new(:url => 'someurl').guid.should == 'guidfromurl'
|
25
|
+
end
|
26
|
+
|
17
27
|
context "when hashing the content" do
|
18
28
|
before :each do
|
19
29
|
@wpp = WebPageParser::BaseParser.new(@valid_options)
|
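data/spec/fixtures/theintercept/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers.html
ADDED
@@ -0,0 +1,327 @@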
|
|
1
|
+
<!DOCTYPE html>
|
2
|
+
|
3
|
+
<!--[if lt IE 7]><html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
|
4
|
+
<!--[if IE 7]><html class="no-js lt-ie9 lt-ie8"> <![endif]-->
|
5
|
+
<!--[if IE 8]><html class="no-js lt-ie9"> <![endif]-->
|
6
|
+
<!--[if gt IE 8]><!--><html class="no-js"> <!--<![endif]-->
|
7
|
+
<head>
|
8
|
+
<meta charset="utf-8" />
|
9
|
+
|
10
|
+
<meta http-equiv="X-UA-Compatible" content="IE=edge">
|
11
|
+
<meta name="description" content="">
|
12
|
+
<meta name="viewport" content="width=device-width, initial-scale=1">
|
13
|
+
|
14
|
+
<link rel="shortcut icon" href="https://prod01-cdn00.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/favicon.png">
|
15
|
+
|
16
|
+
<script src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/modernizr-2.6.2.min.js'></script>
|
17
|
+
<!-- This site is optimized for SEO -->
|
18
|
+
<title>Canada, At War For 13 Years, Shocked That 'A Terrorist' Attacked Its Soldiers - The Intercept</title>
|
19
|
+
<link rel="canonical" href="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/" />
|
20
|
+
<meta property="og:locale" content="en_US" />
|
21
|
+
<meta property="og:type" content="article" />
|
22
|
+
<meta property="og:title" content="Canada, At War For 13 Years, Shocked That 'A Terrorist' Attacked Its Soldiers - The Intercept" />
|
23
|
+
<meta property="og:description" content="(updated below – Update II) TORONTO – In Quebec on Monday, two Canadian soldiers were hit by a car driven by Martin Couture-Rouleau, a 25-year-old Canadian who, as The Globe and Mail reported, “converted to Islam recently and called himself Ahmad Rouleau.” One of the soldiers died, as did Couture-Rouleau when he was shot by police upon>>" />
|
24
|
+
<meta property="og:url" content="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/" />
|
25
|
+
<meta property="og:site_name" content="The Intercept" />
|
26
|
+
<meta property="article:section" content="Uncategorized" />
|
27
|
+
<meta property="article:published_time" content="<span class='fltimestamp' data-timestamp='1413982586'>2014-10-22T08:56:26+00:00</span>" />
|
28
|
+
<meta property="article:modified_time" content="2014-10-26T16:46:02+00:00" />
|
29
|
+
<meta property="og:image" content="https://prod01-cdn00.cdn.firstlook.org/wp-uploads/sites/1/2014/10/stephen-harper.jpg" />
|
30
|
+
<meta name="twitter:card" content="summary"/>
|
31
|
+
<meta name="twitter:site" content="@the_intercept"/>
|
32
|
+
<meta name="twitter:domain" content="The Intercept"/>
|
33
|
+
<meta name="twitter:creator" content="@the_intercept"/>
|
34
|
+
<!-- / Yoast WordPress SEO plugin. -->
|
35
|
+
|
36
|
+
<link rel="alternate" type="application/rss+xml" title="The Intercept » Canada, At War For 13 Years, Shocked That ‘A Terrorist’ Attacked Its Soldiers Comments Feed" href="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/feed/" />
|
37
|
+
<link rel='stylesheet' id='main-css-css' href='https://prod01-cdn00.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/css/all.css?ver=4a4b95118f8708ad1db764f737e1088a' type='text/css' media='' />
|
38
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-includes/js/jquery/jquery.js?ver=1.11.0'></script>
|
39
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-includes/js/jquery/jquery-migrate.min.js?ver=1.2.1'></script>
|
40
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/jquery.dotdotdot.min.js?ver=e7489c03aaea168ba084298955d7fb9a'></script>
|
41
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/jquery.stellar.js?ver=facdbc0dc5a7eea6bcfabbba807822ed'></script>
|
42
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/jquery.hoverIntent.js?ver=be188522bc57c3f0821dfb2053609915'></script>
|
43
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/jquery.sticky-kit.js?ver=b42204c98ed4cf287173bbef20dbd1ce'></script>
|
44
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/moment.js?ver=2bc343658034d1a7f4d6694fef2659e4'></script>
|
45
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/flm.js?ver=e0465435eeee4e2c4578c915a2e37991'></script>
|
46
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/document-loader.js?ver=8a7dc017b1b53ec94d972081643be68a'></script>
|
47
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.js?ver=abb38c195d86a81dac1a003fc1b74fb2'></script>
|
48
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.linq.js?ver=ee21e8ca846da516f844ed3c361ab9c4'></script>
|
49
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.matching.js?ver=e6d719720f9888ee8f389df195a2b74c'></script>
|
50
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.modules.js?ver=f23e2e2c1b069d62353e8ccb1cbd3c0e'></script>
|
51
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.object.js?ver=4f91b61f003a57bc4debf273837c70cc'></script>
|
52
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.params.js?ver=fce94f86c8bf04f62ff4aae18b14fc4e'></script>
|
53
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/pez/pez.strings.js?ver=7c6646a66a66fe2832355bd35d84689f'></script>
|
54
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/nav.js?ver=19d2a6ca833bf2079a5f52ed9d413758'></script>
|
55
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/search.js?ver=502b1ff2ddeb5a21763503dc0fec123e'></script>
|
56
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/paginator.js?ver=9da6cfd5cd618be638314edf6cd09ca0'></script>
|
57
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/carousel.js?ver=bb84f70c01affe957d97c873f893419a'></script>
|
58
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/features.js?ver=bd7ef1410122fd2d2c7bdb6b73104b69'></script>
|
59
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/article.js?ver=b7b006e222227897d25c2c6f0b58760c'></script>
|
60
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/share.js?ver=0a9500ca813719b321303852a850bcbc'></script>
|
61
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-content/themes/the-intercept-v2/js/mods/video.js?ver=c4afb1d47e2d65ed275d37005f014503'></script>
|
62
|
+
<link rel="EditURI" type="application/rsd+xml" title="RSD" href="https://firstlook.org/theintercept/xmlrpc.php?rsd" />
|
63
|
+
<link rel="wlwmanifest" type="application/wlwmanifest+xml" href="https://prod01-cdn02.cdn.firstlook.org/theintercept/wp-includes/wlwmanifest.xml" />
|
64
|
+
<meta name="generator" content="WordPress 3.9.2" />
|
65
|
+
<link rel='shortlink' href='https://firstlook.org/theintercept/?p=7196' />
|
66
|
+
<script type="text/javascript">
|
67
|
+
var _paq = _paq || [];
|
68
|
+
_paq.push(["trackPageView"]);
|
69
|
+
_paq.push(["enableLinkTracking"]);
|
70
|
+
|
71
|
+
(function() {
|
72
|
+
var u="https://prod01-piwik.firstlook.org/";
|
73
|
+
var siteID = "1";
|
74
|
+
_paq.push(["setTrackerUrl", u+"piwik.php"]);
|
75
|
+
_paq.push(["setSiteId", siteID]);
|
76
|
+
var d=document, g=d.createElement("script"), s=d.getElementsByTagName("script")[0]; g.type="text/javascript";
|
77
|
+
g.defer=true; g.async=true; g.src=u+"piwik.js"; s.parentNode.insertBefore(g,s);
|
78
|
+
})();
|
79
|
+
</script>
|
80
|
+
<!-- Start Fluid Video Embeds Style Tag -->
|
81
|
+
<style type="text/css">
|
82
|
+
/* Thanks to Web Designer Wall for writing about this technique: http://webdesignerwall.com/tutorials/css-elastic-videos */
|
83
|
+
/* And to A List Apart: http://www.alistapart.com/articles/creating-intrinsic-ratios-for-video/ */
|
84
|
+
.fve-video-wrapper {
|
85
|
+
position: relative;
|
86
|
+
overflow: hidden;
|
87
|
+
height: 0;
|
88
|
+
background-color: transparent;
|
89
|
+
padding-bottom: 56.25%; /* This is default, but will be overriden */
|
90
|
+
margin: 0.5em 0; /* A bit of margin at the bottom */
|
91
|
+
}
|
92
|
+
.fve-video-wrapper iframe,
|
93
|
+
.fve-video-wrapper object,
|
94
|
+
.fve-video-wrapper embed {
|
95
|
+
position: absolute;
|
96
|
+
display: block;
|
97
|
+
top: 0;
|
98
|
+
left: 0;
|
99
|
+
width: 100%;
|
100
|
+
height: 100%;
|
101
|
+
}
|
102
|
+
.fve-video-wrapper a.hyperlink-image {
|
103
|
+
position: relative;
|
104
|
+
display: none;
|
105
|
+
}
|
106
|
+
.fve-video-wrapper a.hyperlink-image img {
|
107
|
+
position: relative;
|
108
|
+
z-index: 2;
|
109
|
+
width: 100%;
|
110
|
+
}
|
111
|
+
.fve-video-wrapper a.hyperlink-image .fve-play-button {
|
112
|
+
position: absolute;
|
113
|
+
left: 35%;
|
114
|
+
top: 35%;
|
115
|
+
right: 35%;
|
116
|
+
bottom: 35%;
|
117
|
+
z-index: 3;
|
118
|
+
background-color: rgba(40, 40, 40, 0.75);
|
119
|
+
background-size: 100% 100%;
|
120
|
+
border-radius: 10px;
|
121
|
+
}
|
122
|
+
.fve-video-wrapper a.hyperlink-image:hover .fve-play-button {
|
123
|
+
background-color: rgba(0, 0, 0, 0.85);
|
124
|
+
}
|
125
|
+
/* End of standard styles */
|
126
|
+
</style>
|
127
|
+
<!-- End Fluid Video Embeds Style Tag -->
|
128
|
+
</head>
|
129
|
+
<body class="single single-post postid-7196 single-format-standard">
|
130
|
+
<header role="banner">
|
131
|
+
<div class="grid">
|
132
|
+
<a class="logo-link" href="https://firstlook.org/theintercept">
|
133
|
+
<img alt="The Intercept" src="/wp-content/themes/the-intercept-v2/images/the-intercept.png" class="ti-image image-logo" />
|
134
|
+
</a>
|
135
|
+
<nav role="navigation" class="ti-menu menu-main" data-pz-module="nav">
|
136
|
+
<div>
|
137
|
+
<i class="hamburger"></i>
|
138
|
+
<ul id="menu-primary" class="menu"><li id="menu-item-3924" class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3924"><a href="/theintercept/features/">Features</a></li>
|
139
|
+
<li id="menu-item-3922" class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3922"><a href="/theintercept/greenwald/">Greenwald</a></li>
|
140
|
+
<li id="menu-item-3923" class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3923"><a href="/theintercept/froomkin/">Froomkin</a></li>
|
141
|
+
<li id="menu-item-46" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-46"><a href="https://firstlook.org/theintercept/documents/">Documents</a></li>
|
142
|
+
<li id="menu-item-47" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-47"><a href="https://firstlook.org/theintercept/staff/">Staff</a></li>
|
143
|
+
<li id="menu-item-3925" class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3925"><a href="/theintercept/contact/">Contact</a></li>
|
144
|
+
</ul> </div>
|
145
|
+
</nav>
|
146
|
+
<nav role="navigation" class="ti-menu menu-social">
|
147
|
+
<div class="menu-social-container">
|
148
|
+
<ul id="menu-social" class="menu"><li id="menu-item-869" class="ti-icon icon-twitter menu-item menu-item-type-custom menu-item-object-custom menu-item-869"><a title="twitter" target="_blank" href="https://twitter.com/the_intercept">Twitter</a></li>
|
149
|
+
<li id="menu-item-3921" class="ti-icon icon-facebook menu-item menu-item-type-custom menu-item-object-custom menu-item-3921"><a href="https://facebook.com/theinterceptflm">Facebook</a></li>
|
150
|
+
<li id="menu-item-125" class="ti-icon icon-rss menu-item menu-item-type-post_type menu-item-object-page menu-item-125"><a title="rss" href="https://firstlook.org/theintercept/feeds/">RSS feeds</a></li>
|
151
|
+
</ul> </div>
|
152
|
+
<form role="search" method="get" id="searchform" action="https://firstlook.org/theintercept/" class="ti-form form-search" data-pz-module="search">
|
153
|
+
<div class="toggle">
|
154
|
+
<input name="s" id="s" placeholder="Search" />
|
155
|
+
</div>
|
156
|
+
<button><i class="fa fa-search"></i></button>
|
157
|
+
</form> </nav>
|
158
|
+
</div>
|
159
|
+
</header>
|
160
|
+
<section role="main" data-pz-module="article">
|
161
|
+
<div class="grid">
|
162
|
+
<div class="ti-article-page threecol ti-sticky-parent">
|
163
|
+
<header>
|
164
|
+
<h1 class="title">Canada, At War For 13 Years, Shocked That ‘A Terrorist’ Attacked Its Soldiers</h1> <div class="ti-byline">
|
165
|
+
<cite>By <span><a href='https://firstlook.org/theintercept/staff/glenn-greenwald/'>Glenn Greenwald</a></span></cite>
|
166
|
+
<div class="ti-social">
|
167
|
+
<a class='twitter' href='https://twitter.com/@ggreenwald'>@ggreenwald</a> </div>
|
168
|
+
<time><span class='fltimestamp' data-timestamp='1413982586'>22 Oct 2014</span></time>
|
169
|
+
</div>
|
170
|
+
</header>
|
171
|
+
<div class="ti-sidebar ti-sticky sidebar-social">
|
172
|
+
<aside><h4>Share</h4> <ul data-pz-module="share" class="social" data-track="Share" data-url="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/" data-bitly-ue="http://interc.pt/1otBqax" data-bitly="http%3A%2F%2Finterc.pt%2F1otBqax" data-id="7196">
|
173
|
+
<li class="twitter with-icon">
|
174
|
+
<a class="twitter" title="Share on Twitter" target="_tw" data-width="500" data-height="250" href="http://twitter.com/share?text=Canada, At War For 13 Years, Shocked That ‘A Terrorist’ Attacked Its Soldiers&url=http%3A%2F%2Finterc.pt%2F1otBqax">Twitter</a>
|
175
|
+
</li>
|
176
|
+
<li class="facebook with-icon">
|
177
|
+
<a class="facebook" title="Post on Facebook" target="_fb" data-width="500" data-height="400" href="http://www.facebook.com/sharer/sharer.php?u=http%3A%2F%2Finterc.pt%2F1otBqax">Facebook</a>
|
178
|
+
</li>
|
179
|
+
<li class="googleplus with-icon">
|
180
|
+
<a class="googleplus" title="Post on Google+" target="_gp" data-width="500" data-height="500" href="https://plus.google.com/share?url=http%3A%2F%2Finterc.pt%2F1otBqax">Google</a>
|
181
|
+
</li>
|
182
|
+
<!--<li class="linkedin with-icon">
|
183
|
+
<a class="linkedin" title="Post to LinedIn" target="_li" data-width="500" data-height="515" href="https://www.linkedin.com/cws/share?url=http%3A%2F%2Finterc.pt%2F1otBqax&title=Canada%2C%20At%20War%20For%2013%20Years%2C%20Shocked%20That%20%26%238216%3BA%20Terrorist%26%238217%3B%20Attacked%20Its%20Soldiers">LinkedIn</a>
|
184
|
+
</li>-->
|
185
|
+
<li class="email with-icon">
|
186
|
+
<a class="mail" title="E-mail article" href="mailto:?subject=Canada, At War For 13 Years, Shocked That ‘A Terrorist’ Attacked Its Soldiers&body=http%3A%2F%2Finterc.pt%2F1otBqax">Email</a>
|
187
|
+
</li>
|
188
|
+
<li class="print with-icon">
|
189
|
+
<a class="print" title="Print this page" href="#print">Print</a>
|
190
|
+
</li>
|
191
|
+
</ul>
|
192
|
+
</aside> </div>
|
193
|
+
<div class="ti-sidebar ti-sticky sidebar-popular">
|
194
|
+
<aside><h4>Popular</h4><ul class='ti-popular'><li><a href="https://firstlook.org/theintercept/2014/10/30/inside-story-matt-taibbis-departure-first-look-media/"><img src="https://prod01-cdn01.cdn.firstlook.org/wp-uploads/sites/1/2014/10/matt-taibbi-renaldi-single-popular.jpg" /></a><a href="https://firstlook.org/theintercept/2014/10/30/inside-story-matt-taibbis-departure-first-look-media/" class="excerpt with-image">The Inside Story Of Matt Taibbi’s Departure From First Look Media</a></li><li><a href="https://firstlook.org/theintercept/2014/10/30/hacking-team/"><img src="https://prod01-cdn00.cdn.firstlook.org/wp-uploads/sites/1/2014/10/laptop-smartphone-single-popular.jpg" /></a><a href="https://firstlook.org/theintercept/2014/10/30/hacking-team/" class="excerpt with-image">Secret Manuals Show the Spyware Sold to Despots and Cops Worldwide</a></li><li><a href="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/"><img src="https://prod01-cdn01.cdn.firstlook.org/wp-uploads/sites/1/2014/10/stephen-harper-single-popular.jpg" /></a><a href="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/" class="excerpt with-image">Canada, At War For 13 Years, Shocked That ‘A Terrorist’ Attacked Its Soldiers</a></li><li><a href="https://firstlook.org/theintercept/2014/10/28/smuggling-snowden-secrets/"><img src="https://prod01-cdn03.cdn.firstlook.org/wp-uploads/sites/1/2014/10/micah_snowden_crop_v3-single-popular.jpg" /></a><a href="https://firstlook.org/theintercept/2014/10/28/smuggling-snowden-secrets/" class="excerpt with-image">Ed Snowden Taught Me To Smuggle Secrets Past Incredible Danger. 
Now I Teach You.</a></li><li><a href="https://firstlook.org/theintercept/2014/10/31/block-boat-work-middle/"><img src="https://prod01-cdn00.cdn.firstlook.org/wp-uploads/sites/1/2014/10/14533521-single-popular.jpg" /></a><a href="https://firstlook.org/theintercept/2014/10/31/block-boat-work-middle/" class="excerpt with-image">A Small Band of Activists Is Humiliating an Israeli Shipping Giant</a></li></ul></aside> </div>
|
195
|
+
|
196
|
+
<article class="ti-article">
|
197
|
+
<div class="ti-sidebar sidebar-social for-mobile">
|
198
|
+
<aside><h4>Share</h4> <ul data-pz-module="share" class="social" data-track="Share" data-url="https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/" data-bitly-ue="http://interc.pt/1otBqax" data-bitly="http%3A%2F%2Finterc.pt%2F1otBqax" data-id="7196">
|
199
|
+
<li class="twitter with-icon">
|
200
|
+
<a class="twitter" title="Share on Twitter" target="_tw" data-width="500" data-height="250" href="http://twitter.com/share?text=Canada, At War For 13 Years, Shocked That ‘A Terrorist’ Attacked Its Soldiers&url=http%3A%2F%2Finterc.pt%2F1otBqax">Twitter</a>
|
201
|
+
</li>
|
202
|
+
<li class="facebook with-icon">
|
203
|
+
<a class="facebook" title="Post on Facebook" target="_fb" data-width="500" data-height="400" href="http://www.facebook.com/sharer/sharer.php?u=http%3A%2F%2Finterc.pt%2F1otBqax">Facebook</a>
|
204
|
+
</li>
|
205
|
+
<li class="googleplus with-icon">
|
206
|
+
<a class="googleplus" title="Post on Google+" target="_gp" data-width="500" data-height="500" href="https://plus.google.com/share?url=http%3A%2F%2Finterc.pt%2F1otBqax">Google</a>
|
207
|
+
</li>
|
208
|
+
<!--<li class="linkedin with-icon">
|
209
|
+
<a class="linkedin" title="Post to LinedIn" target="_li" data-width="500" data-height="515" href="https://www.linkedin.com/cws/share?url=http%3A%2F%2Finterc.pt%2F1otBqax&title=Canada%2C%20At%20War%20For%2013%20Years%2C%20Shocked%20That%20%26%238216%3BA%20Terrorist%26%238217%3B%20Attacked%20Its%20Soldiers">LinkedIn</a>
|
210
|
+
</li>-->
|
211
|
+
<li class="email with-icon">
|
212
|
+
<a class="mail" title="E-mail article" href="mailto:?subject=Canada, At War For 13 Years, Shocked That ‘A Terrorist’ Attacked Its Soldiers&body=http%3A%2F%2Finterc.pt%2F1otBqax">Email</a>
|
213
|
+
</li>
|
214
|
+
<li class="print with-icon">
|
215
|
+
<a class="print" title="Print this page" href="#print">Print</a>
|
216
|
+
</li>
|
217
|
+
</ul>
|
218
|
+
</aside> </div>
|
219
|
+
<div class="hero">
|
220
|
+
<img src="https://prod01-cdn02.cdn.firstlook.org/wp-uploads/sites/1/2014/10/stephen-harper-article-display-b.jpg" alt="Featured photo - Canada, At War For 13 Years, Shocked That &#8216;A Terrorist&#8217; Attacked Its Soldiers" />
|
221
|
+
</div>
|
222
|
+
<div class="ti-body">
|
223
|
+
<p><strong>(updated below – Update II)</strong></p>
|
224
|
+
<p>TORONTO – In Quebec on Monday, two Canadian soldiers were hit by a car driven by Martin Couture-Rouleau, a 25-year-old Canadian who, as <em>The Globe and Mail</em> <a href="http://www.theglobeandmail.com/news/national/quebec-hit-and-run/article21187200/">reported</a>, “converted to Islam recently and called himself Ahmad Rouleau.” One of the soldiers died, as did Couture-Rouleau when he was shot by police upon apprehension after allegedly brandishing a large knife. Police speculated that the incident was deliberate, alleging the driver waited for two hours before hitting the soldiers, one of whom was wearing a uniform. The incident <a href="http://www.theglobeandmail.com/news/politics/two-soldiers-injured-in-quebec-hit-and-run/article21177035/">took place</a> in the parking lot of a shopping mall 30 miles southeast of Montreal, “a few kilometres from the Collège militaire royal de Saint-Jean, the military academy operated by the Department of National Defence.”</p>
|
225
|
+
<p>The right-wing Canadian government wasted no time in seizing on the incident to promote its fear-mongering agenda over terrorism, which includes <a href="http://calgary.ctvnews.ca/bill-proposed-to-give-csis-tools-to-investigate-track-and-prosecute-potential-terrorists-1.2057025">pending legislation</a> to vest its intelligence agency, CSIS, with more spying and secrecy powers in the name of fighting ISIS. A government spokesperson <a href="http://www.theglobeandmail.com/news/politics/two-soldiers-injured-in-quebec-hit-and-run/article21177035/">asserted</a> “clear indications” that the driver “had become radicalized.”</p>
|
226
|
+
<p>In a “clearly prearranged exchange,” a conservative MP, during parliamentary question time, asked Prime Minister Stephen Harper (pictured above) whether this was considered a “terrorist attack”; in reply, the prime minister gravely opined that the incident was “obviously extremely troubling.” Canada’s Public Safety Minister Steven Blaney <a href="http://globalnews.ca/news/1625585/canadian-soldier-struck-by-car-in-quebec-has-died/">pronounced</a> the incident “clearly linked to terrorist ideology,” while newspapers predictably followed suit, <a href="http://www.thestar.com/news/canada/2014/10/21/soldier_run_down_in_possible_quebec_terror_attack_dies.html">calling</a> it a “suspected terrorist attack” <a href="http://globalnews.ca/news/1625585/canadian-soldier-struck-by-car-in-quebec-has-died/">and</a> “homegrown terrorism.” CSIS spokesperson Tahera Mufti said “the event was the violent expression of an extremist ideology promoted by terrorist groups with global followings” and added: “That something like this would happen in a peaceable Canadian community like Saint-Jean-sur-Richelieu shows the long reach of these ideologies.”</p>
|
227
|
+
<p>In sum, the national mood and discourse in Canada is virtually identical to what prevails in every Western country whenever <a href="http://www.theguardian.com/commentisfree/2013/may/23/woolwich-attack-terrorism-blowback">an incident like this happens</a>: shock and bewilderment that someone would want to bring violence to such a good and innocent country (“a peaceable Canadian community like Saint-Jean-sur-Richelieu”), followed by claims that the incident shows how primitive and savage is the “terrorist ideology” of extremist Muslims, followed by rage and demand for still more actions of militarism and freedom-deprivation. There are two points worth making about this:</p>
|
228
|
+
<p><strong>First</strong>, Canada has spent the last 13 years proclaiming itself a nation at war. It <a href="http://www.theglobeandmail.com/globe-debate/editorials/now-that-our-war-in-afghanistan-is-over/article17501889/">actively participated</a> in the invasion and occupation of Afghanistan and was an <a href="http://rabble.ca/columnists/2014/08/poland-torture-hot-seat-canada-next">enthusiastic partner</a> in some of the most <a href="http://www.cbc.ca/news/world/omar-khadr-reattempts-to-sue-canada-for-20m-1.2753689">extremist War on Terror abuses</a> perpetrated <a href="http://www.salon.com/2010/08/11/khadr/">by the U.S.</a> Earlier this month, the Prime Minister <a href="http://news.nationalpost.com/2014/10/03/isis-motion-calls-for-air-strikes-no-troops-in-iraq/">revealed</a>, with the <a href="http://globalnews.ca/news/1595317/majority-of-canadians-back-use-of-fighter-jets-to-strike-isis-in-iraq/">support of a large majority</a> of Canadians, that “Canada is poised to go to war in Iraq, as [he] announced plans in Parliament [] to send CF-18 fighter jets for up to six months to battle Islamic extremists.” Just yesterday, Canadian Defence Minister Rob Nicholson <a href="http://www.edmontonsun.com/2014/10/21/fighter-jets-depart-from-cfb-cold-lake-alberta-to-middle-east">flamboyantly appeared</a> at the airfield in Alberta from which the fighter jets left for Iraq and stood tall as he issued the standard Churchillian war rhetoric about the noble fight against evil.</p>
|
229
|
+
<p>It is always stunning when a country that has brought violence and military force to numerous countries <a href="http://www.theguardian.com/commentisfree/2013/may/23/woolwich-attack-terrorism-blowback">acts shocked and bewildered</a> when someone brings <a href="http://www.theguardian.com/commentisfree/2013/apr/16/boston-marathon-explosions-notes-reactions">a tiny fraction of that violence</a> back to that country. Regardless of one’s views on the justifiability of Canada’s lengthy military actions, it’s not the slightest bit surprising or difficult to understand why people who identify with those on the other end of Canadian bombs and bullets would decide to attack the military responsible for that violence.</p>
|
230
|
+
<p>That’s the nature of war. A country doesn’t get to run around for years wallowing in war glory, invading, rendering and bombing others, without the risk of having violence brought back to it. Rather than being baffling or shocking, that reaction is completely natural and predictable. The only surprising thing about any of it is that it doesn’t happen more often.</p>
|
231
|
+
<p>The issue here is not justification (very few people would view attacks on soldiers in a shopping mall parking lot to be justified). The issue is <em>causation</em>. Every time one of these attacks occurs — from 9/11 on down — Western governments pretend that it was just some sort of unprovoked, utterly “senseless” act of violence caused by primitive, irrational, savage religious extremism inexplicably aimed at a country innocently minding its own business. They even invent fairy tales to feed to the population to explain why it happens: <a href="http://www.washingtonpost.com/wp-srv/nation/specials/attacked/transcripts/bushaddress_092001.html">they hate us for our freedoms.</a></p>
|
232
|
+
<p>Those fairy tales are pure deceit. Except in the rarest of cases, the violence has clearly identifiable and easy-to-understand causes: namely, anger over the violence that the country’s government has spent years directing at others. The <a href="http://www.salon.com/2010/06/22/terrorism_22/">statements of those accused by the west of terrorism</a>, and even the <a href="http://www.salon.com/2009/10/20/terrorism_6/">Pentagon’s own commissioned research</a>, have made conclusively clear what motivates these acts: namely, anger over the violence, abuse and interference by Western countries in that part of the world, with the world’s Muslims overwhelmingly the targets and victims. The very policies of militarism and civil liberties erosions justified in the name of stopping terrorism are actually what fuels terrorism and ensures its endless continuation.</p>
|
233
|
+
<p>If you want to be a country that spends more than a decade proclaiming itself at war and bringing violence to others, then one should expect that violence will sometimes be directed at you as well. Far from being the by-product of primitive and inscrutable religions, that behavior is the natural reaction of human beings targeted with violence. Anyone who doubts that should review the 13-year orgy of violence the U.S. has unleashed on the world since the 9/11 attack, as well as the decades of violence and interference from the U.S. in that region prior to that.</p>
|
234
|
+
<p><strong>Second</strong>, in what conceivable sense can this incident be called a “terrorist” attack? As I have <a href="http://www.salon.com/2010/02/19/terrorism_19/">written</a> <a href="http://www.theguardian.com/commentisfree/2012/dec/16/court-terrorism-morales-gangs-meaningless">many times</a> over the last several years, and as some of the <a href="http://www.salon.com/2010/03/14/terrorism_20/">best scholarship proves</a>, “terrorism” is a word utterly devoid of objective or consistent meaning. It is little more than a totally malleable, propagandistic fear-mongering term used by Western governments (<a href="http://www.globalresearch.ca/bashar-al-assad-interview-the-fight-against-terrorists-in-syria/5365613">and non-Western ones</a>) to justify whatever actions they undertake. As Professor Tomis Kapitan wrote in <a href="http://opinionator.blogs.nytimes.com/2014/10/19/the-reign-of-terror/?_php=true&_type=blogs&_r=0">a brilliant essay in <em>The New York Times</em> on Monday</a>: “Part of the success of this rhetoric traces to the fact that there is no consensus about the meaning of ‘terrorism.’”</p>
|
235
|
+
<p>But to the extent the term has any common understanding, it includes the deliberate (or wholly reckless) targeting of civilians with violence for political ends. But in this case in Canada, it wasn’t civilians who were targeted. If one believes the government’s accounts of the incident, the driver waited two hours until he saw a soldier in uniform. In other words, he seems to have <em>deliberately avoided attacking civilians</em>, and targeted a soldier instead – a member of a military that is currently fighting a war.</p>
|
236
|
+
<p>Again, the point isn’t justifiability. There is a compelling argument to make that undeployed soldiers engaged in normal civilian activities at home are not valid targets under the laws of war (although the U.S. and its closest allies use <a href="http://www.theguardian.com/commentisfree/cifamerica/2010/dec/10/al-jazeera-us-integrity-wikileaks">extremely broad</a> and <a href="http://news.nationalpost.com/2014/07/13/gaza-police-chief-survives-israeli-airstrike-on-family-home-but-bombs-kill-18-relatives-including-children/">permissive standards</a> for what constitutes legitimate military targets when it comes to their own violence). The point is that targeting soldiers who are part of a military fighting an active war is completely inconsistent with the common usage of the word “terrorism,” and yet it is reflexively applied by government officials and media outlets to this incident in Canada (and others like it <a href="http://www.theguardian.com/commentisfree/2013/may/23/woolwich-attack-terrorism-blowback">in the UK</a> and <a href="http://www.salon.com/2009/11/09/terrorism_7/">the US</a>).</p>
|
237
|
+
<p>That’s because the most common functional definition of “terrorism” in Western discourse is quite clear. At this point, it means little more than: “violence directed at Westerners by Muslims” (when not used to mean “violence by Muslims,” it usually just means: <a href="http://www.theglobeandmail.com/news/politics/ottawas-new-anti-terrorism-strategy-lists-eco-extremists-as-threats/article533522/">violence the state dislikes</a>). The term “terrorism” has become nothing more than a rhetorical weapon for legitimizing all violence by Western countries, and delegitimizing all violence against them, even when the violence called “terrorism” is clearly intended as retaliation for Western violence.</p>
|
238
|
+
<p>This is about far more than semantics. It is central to how the west propagandizes its citizenries; the manipulative use of the “terrorism” term lies at heart of that. As Professor Kapitan wrote yesterday in <em>The New York Times</em>:</p>
|
239
|
+
<blockquote>
|
240
|
+
<p class="story-body-text">Even when a definition is agreed upon, the rhetoric of “terror” is applied both selectively and inconsistently<strong>. In the mainstream American media, the “terrorist” label is usually reserved for those opposed to the policies of the U.S. and its allies.</strong> By contrast, some acts of violence that constitute terrorism under most definitions are not identified as such — for instance, the massacre of over 2000 Palestinian civilians in the Beirut refugee camps in 1982 or the killings of more than 3000 civilians in Nicaragua by “contra” rebels during the 1980s, or the genocide that took the lives of at least a half million Rwandans in 1994. At the opposite end of the spectrum, some actions that do not qualify as terrorism are labeled as such — that would include attacks by Hamas, Hezbollah or ISIS, for instance, against uniformed soldiers on duty.</p>
|
241
|
+
<p class="story-body-text">Historically, <strong>the rhetoric of terror has been used by those in power not only to sway public opinion, but to direct attention away from their own acts of terror.</strong></p>
|
242
|
+
</blockquote>
|
243
|
+
<p class="story-body-text">At this point, “terrorism” is the term that means nothing, but justifies everything. It is long past time that media outlets begin skeptically questioning its usage by political officials rather than mindlessly parroting it.</p>
|
244
|
+
<p class="story-body-text"><em>Photo: AP/The Canadian Press, Adrian Wyld</em></p>
|
245
|
+
<p class="story-body-text"><span style="text-decoration: underline"><strong>UPDATE</strong></span>: Multiple conservative commentators have claimed that this article and my subsequent discussion of it are about this morning’s <a href="http://www.ecanadanow.com/canada/2014/10/22/police-say-soldier-shot-at-war-memorial-in-ottawa-report/">shooting of a solider in Ottawa</a>. Aside from the fact that what I wrote is expressly about a completely different incident – one that took place in Quebec on Monday – this article and my comments were published <strong>before</strong> this morning’s shooting spree was reported. So unless someone believes I possess powers of clairvoyance, the claim that I was commenting on the Ottawa shooting – about which virtually nothing is known, including the identity and motive of the shooter(s) – is obviously false.</p>
|
246
|
+
<p class="story-body-text">Then there’s also the extremely predictable accusation that I was <em>justifying</em> the attack on the soldiers. I know from prior experience in discussing these questions that no matter how clear you make it that you are writing about <i>causation</i> and not <em>justification</em>, many will still distort what you write to claim you’ve justified the attack. That’s true even if one makes as clear as the English language permits that you’re not writing about justification: “<strong>The issue here is not justification (very few people would view attacks on soldiers in a shopping mall parking lot to be justified). The issue is </strong><em><strong>causation.”</strong></em> If there’s a way to make that any clearer, please let me know.</p>
|
247
|
+
<p class="story-body-text">One more time: the difference between “causation” and “justification” is so obvious that it should require no explanation. If one observes that someone who smokes four packs of cigarettes a day can expect to develop e<span style="color: #545454">mphysema, that’s an observation about causation, not a celebration of the person’s illness. Only a willful desire to distort, or some deep confusion, can account for a failure to process this most basic point.</span></p>
|
248
|
+
<p class="story-body-text"><span style="text-decoration: underline"><strong>UPDATE II</strong></span>: In that <a href="http://opinionator.blogs.nytimes.com/2014/10/19/the-reign-of-terror/?_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&_r=2&">brilliant essay</a> I referenced above, published just three days ago in <em>The New York Times</em>, Professor Tomis Kapitan made this point:</p>
|
249
|
+
<blockquote>
|
250
|
+
<p class="story-body-text">Obviously, to point out the causes and objectives of particular terrorist actions is to imply nothing about their legitimacy — that is an independent matter….</p>
|
251
|
+
</blockquote>
|
252
|
+
<p class="story-body-text">That point is so simple and, as he said, “obvious” that I have a hard time understanding what could account for some commentators conflating the two other than a willful desire to mislead.</p>
|
253
|
+
</div>
|
254
|
+
<div class="contact">
|
255
|
+
<p>Email the author: <a href='mailto:glenn.greenwald@theintercept.com'>glenn.greenwald@theintercept.com</a></p> </div>
|
256
|
+
</article>
|
257
|
+
</div>
|
258
|
+
|
259
|
+
|
260
|
+
|
261
|
+
<div id="comments" data-track="Comment" href="#comments">
|
262
|
+
|
263
|
+
<div class="comment-count">
|
264
|
+
627 Discussing
|
265
|
+
</div>
|
266
|
+
|
267
|
+
<nav id="comment-nav-below"><h4 class="section-heading commentnav"><a id="more-comment-link" href="/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/?comments=all#comments">Show comments</a></h4></nav><p class="nocomments">Comments closed.</p>
|
268
|
+
<div class="ti-recommended">
|
269
|
+
<h2>Recommended</h2>
|
270
|
+
<ul>
|
271
|
+
<!-- cids 0 8 ["7552","7744","7761","7303","6956","7242","7294","7252"] --><li class='ti-reccol-left'> <img class='subfeature align-left' src='https://prod01-cdn03.cdn.firstlook.org/wp-uploads/sites/1/2014/10/laptop-smartphone-excerpt-small.jpg' alt='Secret Manuals Show the Spyware Sold to Despots and Cops Worldwide' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/30/hacking-team/' title='Secret Manuals Show the Spyware Sold to Despots and Cops Worldwide'>
|
272
|
+
Secret Manuals Show the Spyware Sold to Despots and Cops Worldwide </a>
|
273
|
+
</h4>
|
274
|
+
</li><li class='ti-reccol-right'> <img class='subfeature align-left' src='https://prod01-cdn02.cdn.firstlook.org/wp-uploads/sites/1/2014/11/catcall-video-excerpt-small.jpg' alt='No, We Don&#8217;t Need a Law Against Catcalling' /> <h4> <a href='https://firstlook.org/theintercept/2014/11/03/we-dont-need-a-law-against-catcalling/' title='No, We Don’t Need a Law Against Catcalling'>
|
275
|
+
No, We Don’t Need a Law Against Catcalling </a>
|
276
|
+
</h4>
|
277
|
+
</li><li class='ti-reccol-left'> <img class='subfeature align-left' src='https://prod01-cdn01.cdn.firstlook.org/wp-uploads/sites/1/2014/11/05-technician-guide-p71u-1-excerpt-small.jpg' alt='Hacking Team Responds in Defense of Its Spyware' /> <h4> <a href='https://firstlook.org/theintercept/2014/11/03/hacking-team-responds-defense-spyware/' title='Hacking Team Responds in Defense of Its Spyware'>
|
278
|
+
Hacking Team Responds in Defense of Its Spyware </a>
|
279
|
+
</h4>
|
280
|
+
</li><li class='ti-reccol-right'> <img class='subfeature align-left' src='https://prod01-cdn01.cdn.firstlook.org/wp-uploads/sites/1/2014/10/micah_snowden_crop_v3-excerpt-small.jpg' alt='Ed Snowden Taught Me To Smuggle Secrets Past Incredible Danger. Now I Teach You.' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/28/smuggling-snowden-secrets/' title='Ed Snowden Taught Me To Smuggle Secrets Past Incredible Danger. Now I Teach You.'>
|
281
|
+
Ed Snowden Taught Me To Smuggle Secrets Past Incredible Danger. Now I Teach You. </a>
|
282
|
+
</h4>
|
283
|
+
</li><li class='ti-reccol-left'> <img class='subfeature align-left' src='https://prod01-cdn00.cdn.firstlook.org/wp-uploads/sites/1/2014/10/455730724-excerpt-small.jpg' alt='The FBI Director&#8217;s Evidence Against Encryption Is Pathetic' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/17/draft-two-cases-cited-fbi-dude-dumb-dumb/' title='The FBI Director’s Evidence Against Encryption Is Pathetic'>
|
284
|
+
The FBI Director’s Evidence Against Encryption Is Pathetic </a>
|
285
|
+
</h4>
|
286
|
+
</li><li class='ti-reccol-right'> <img class='subfeature align-left' src='https://prod01-cdn03.cdn.firstlook.org/wp-uploads/sites/1/2014/10/AP071002027341-excerpt-small.jpg' alt='Blackwater Founder Remains Free and Rich While His Former Employees Go Down on Murder Charges' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/22/blackwater-guilty-verdicts/' title='Blackwater Founder Remains Free and Rich While His Former Employees Go Down on Murder Charges'>
|
287
|
+
Blackwater Founder Remains Free and Rich While His Former Employees Go Down on Murder Charges </a>
|
288
|
+
</h4>
|
289
|
+
</li><li class='ti-reccol-left'> <img class='subfeature align-left' src='https://prod01-cdn02.cdn.firstlook.org/wp-uploads/sites/1/2014/10/difi-excerpt-small.jpg' alt='Is Obama Stalling Until Republicans Can Bury the CIA Torture Report?' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/23/white-house-waiting-gop-senate-kill-feinsteins-torture-report/' title='Is Obama Stalling Until Republicans Can Bury the CIA Torture Report?'>
|
290
|
+
Is Obama Stalling Until Republicans Can Bury the CIA Torture Report? </a>
|
291
|
+
</h4>
|
292
|
+
</li><li class='ti-reccol-right'> <img class='subfeature align-left' src='https://prod01-cdn02.cdn.firstlook.org/wp-uploads/sites/1/2014/10/AP740808069-excerpt-small.jpg' alt='A Story About Ben Bradlee That’s Not Fucking Charming' /> <h4> <a href='https://firstlook.org/theintercept/2014/10/22/a-ben-bradlee-story-thats-not-fucking-charming/' title='A Story About Ben Bradlee That’s Not Fucking Charming'>
|
293
|
+
A Story About Ben Bradlee That’s Not Fucking Charming </a>
|
294
|
+
</h4>
|
295
|
+
</li> </ul>
|
296
|
+
</div>
|
297
|
+
</div>
|
298
|
+
</section>
|
299
|
+
|
300
|
+
<footer role="banner">
|
301
|
+
<div class="banner-bar">
|
302
|
+
<div class="grid">
|
303
|
+
<cite>© First Look Media. All Rights Reserved</cite>
|
304
|
+
<nav role="navigation" class="ti-menu menu-footer">
|
305
|
+
<ul id="menu-footer" class="menu"><li id="menu-item-100" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-100"><a href="https://firstlook.org/theintercept/about/">About</a></li>
|
306
|
+
<li id="menu-item-839" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-839"><a href="https://firstlook.org/theintercept/terms-use/">Terms of Use</a></li>
|
307
|
+
<li id="menu-item-106" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-106"><a href="https://firstlook.org/theintercept/privacy-policy/">Privacy Policy</a></li>
|
308
|
+
<li id="menu-item-3926" class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3926"><a href="/theintercept/feed/?rss">RSS</a></li>
|
309
|
+
<li id="menu-item-107" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-107"><a href="https://firstlook.org/theintercept/contact/">Contact</a></li>
|
310
|
+
</ul> </nav>
|
311
|
+
<nav role="navigation" class="ti-menu menu-social">
|
312
|
+
<h1>Stay in Touch</h1>
|
313
|
+
<ul id="menu-social-1" class="menu"><li class="ti-icon icon-twitter menu-item menu-item-type-custom menu-item-object-custom menu-item-869"><a title="twitter" target="_blank" href="https://twitter.com/the_intercept">Twitter</a></li>
|
314
|
+
<li class="ti-icon icon-facebook menu-item menu-item-type-custom menu-item-object-custom menu-item-3921"><a href="https://facebook.com/theinterceptflm">Facebook</a></li>
|
315
|
+
<li class="ti-icon icon-rss menu-item menu-item-type-post_type menu-item-object-page menu-item-125"><a title="rss" href="https://firstlook.org/theintercept/feeds/">RSS feeds</a></li>
|
316
|
+
</ul> </nav>
|
317
|
+
<div class="copyright for-mobile">
|
318
|
+
© 2014 First Look Media, Inc. All rights reserved.
|
319
|
+
</div>
|
320
|
+
</div>
|
321
|
+
<a class="to-top" href="#">Back to Top</a>
|
322
|
+
</div>
|
323
|
+
</footer>
|
324
|
+
|
325
|
+
<script type='text/javascript' src='https://prod01-cdn01.cdn.firstlook.org/theintercept/wp-includes/js/comment-reply.min.js?ver=3.9.2'></script>
|
326
|
+
</body>
|
327
|
+
</html>
|
@@ -60,6 +60,10 @@ describe IndependentPageParserV1 do
|
|
60
60
|
@pa.content[3].should == 'Many Saudi women have welcomed the freeze of the measure, including Sabria S. Jawhar, a Saudi columnist and assistant professor of applied linguistics at King Saud bin Abdulaziz University for Health Sciences.'
|
61
61
|
@pa.content.size.should == 11
|
62
62
|
end
|
63
|
+
|
64
|
+
it "should parse the guid" do
|
65
|
+
@pa.guid_from_url.should == '9065486'
|
66
|
+
end
|
63
67
|
end
|
64
68
|
|
65
69
|
describe "when parsing the syria article" do
|
@@ -0,0 +1,67 @@
|
|
1
|
+
# -*- coding: utf-8 -*-
|
2
|
+
require 'spec_helper'
|
3
|
+
include WebPageParser
|
4
|
+
|
5
|
+
describe TheInterceptPageParserFactory do
|
6
|
+
before do
|
7
|
+
@valid_urls = [
|
8
|
+
'https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/',
|
9
|
+
'https://firstlook.org/theintercept/2014/10/31/block-boat-work-middle/',
|
10
|
+
'https://firstlook.org/theintercept/2014/10/31/block-boat-work-middle'
|
11
|
+
]
|
12
|
+
@invalid_urls = [
|
13
|
+
'https://firstlook.org/theintercept/document/2014/10/30/hacking-team-rcs-9-0-changelog/',
|
14
|
+
'https://firstlook.org/theintercept/froomkin/',
|
15
|
+
'https://firstlook.org/theintercept/staff/jeremy-scahill/'
|
16
|
+
]
|
17
|
+
end
|
18
|
+
|
19
|
+
it "should detect intercept articles from the url" do
|
20
|
+
@valid_urls.each do |url|
|
21
|
+
TheInterceptPageParserFactory.can_parse?(:url => url).should be_true
|
22
|
+
end
|
23
|
+
end
|
24
|
+
|
25
|
+
it "should ignore pages with the wrong url format" do
|
26
|
+
@invalid_urls.each do |url|
|
27
|
+
TheInterceptPageParserFactory.can_parse?(:url => url).should be_nil
|
28
|
+
end
|
29
|
+
end
|
30
|
+
|
31
|
+
end
|
32
|
+
|
33
|
+
describe TheInterceptPageParserV1 do
|
34
|
+
|
35
|
+
describe 'when parsing the canada at war article' do
|
36
|
+
before do
|
37
|
+
@valid_options = {
|
38
|
+
:url => 'https://firstlook.org/theintercept/2014/10/22/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers/',
|
39
|
+
:page => File.read('spec/fixtures/theintercept/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers.html'),
|
40
|
+
:valid_hash => '6110428997aafee4873fd2cd2dbc6c03'
|
41
|
+
}
|
42
|
+
@pa = TheInterceptPageParserV1.new(@valid_options)
|
43
|
+
|
44
|
+
end
|
45
|
+
|
46
|
+
it 'should parse the title' do
|
47
|
+
@pa.title.should == "Canada, At War For 13 Years, Shocked That 'A Terrorist' Attacked Its Soldiers"
|
48
|
+
end
|
49
|
+
|
50
|
+
it 'should parse the content' do
|
51
|
+
@pa.content[1].should == 'TORONTO – In Quebec on Monday, two Canadian soldiers were hit by a car driven by Martin Couture-Rouleau, a 25-year-old Canadian who, as The Globe and Mail reported, “converted to Islam recently and called himself Ahmad Rouleau.” One of the soldiers died, as did Couture-Rouleau when he was shot by police upon apprehension after allegedly brandishing a large knife. Police speculated that the incident was deliberate, alleging the driver waited for two hours before hitting the soldiers, one of whom was wearing a uniform. The incident took place in the parking lot of a shopping mall 30 miles southeast of Montreal, “a few kilometres from the Collège militaire royal de Saint-Jean, the military academy operated by the Department of National Defence.”'
|
52
|
+
@pa.content[5].should == 'First, Canada has spent the last 13 years proclaiming itself a nation at war. It actively participated in the invasion and occupation of Afghanistan and was an enthusiastic partner in some of the most extremist War on Terror abuses perpetrated by the U.S. Earlier this month, the Prime Minister revealed, with the support of a large majority of Canadians, that “Canada is poised to go to war in Iraq, as [he] announced plans in Parliament [] to send CF-18 fighter jets for up to six months to battle Islamic extremists.” Just yesterday, Canadian Defence Minister Rob Nicholson flamboyantly appeared at the airfield in Alberta from which the fighter jets left for Iraq and stood tall as he issued the standard Churchillian war rhetoric about the noble fight against evil.'
|
53
|
+
@pa.content[16].should == 'Even when a definition is agreed upon, the rhetoric of “terror” is applied both selectively and inconsistently. In the mainstream American media, the “terrorist” label is usually reserved for those opposed to the policies of the U.S. and its allies. By contrast, some acts of violence that constitute terrorism under most definitions are not identified as such — for instance, the massacre of over 2000 Palestinian civilians in the Beirut refugee camps in 1982 or the killings of more than 3000 civilians in Nicaragua by “contra” rebels during the 1980s, or the genocide that took the lives of at least a half million Rwandans in 1994. At the opposite end of the spectrum, some actions that do not qualify as terrorism are labeled as such — that would include attacks by Hamas, Hezbollah or ISIS, for instance, against uniformed soldiers on duty.'
|
54
|
+
@pa.content[23].should == 'UPDATE II: In that brilliant essay I referenced above, published just three days ago in The New York Times, Professor Tomis Kapitan made this point:'
|
55
|
+
@pa.content.last.should == 'That point is so simple and, as he said, “obvious” that I have a hard time understanding what could account for some commentators conflating the two other than a willful desire to mislead.'
|
56
|
+
@pa.content.size.should == 26
|
57
|
+
@pa.hash.should == @valid_options[:valid_hash]
|
58
|
+
end
|
59
|
+
|
60
|
+
it 'should parse the date in UTC' do
|
61
|
+
@pa.date.should == DateTime.parse('22nd October 2014, 08:56:26')
|
62
|
+
@pa.date.zone.should == '+00:00'
|
63
|
+
end
|
64
|
+
|
65
|
+
end
|
66
|
+
|
67
|
+
end
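The factory spec above only pins down behaviour: dated article URLs are accepted, while document, columnist and staff pages are rejected. A minimal sketch of a URL check that would satisfy those six example URLs follows. The constant `INTERCEPT_ARTICLE_URL` and the helper `intercept_article_url?` are hypothetical names inferred from the spec's examples; they are not taken from the gem's `the_intercept_page_parser.rb`, whose actual pattern may differ.

```ruby
# Hypothetical sketch: this pattern is an assumption inferred from the URLs
# listed in the spec, not the gem's actual implementation.
# It accepts /theintercept/YYYY/MM/DD/slug with an optional trailing slash.
INTERCEPT_ARTICLE_URL =
  %r{\Ahttps://firstlook\.org/theintercept/\d{4}/\d{2}/\d{2}/[a-z0-9-]+/?\z}

# Returns true for article-style URLs and nil otherwise, mirroring the
# be_true / be_nil expectations used in the spec.
def intercept_article_url?(url)
  INTERCEPT_ARTICLE_URL.match(url.to_s) ? true : nil
end

puts intercept_article_url?('https://firstlook.org/theintercept/2014/10/31/block-boat-work-middle').inspect
puts intercept_article_url?('https://firstlook.org/theintercept/staff/jeremy-scahill/').inspect
```

Anchoring the pattern with `\A` and `\z` is a deliberate choice in this sketch: it keeps a URL such as `/theintercept/document/2014/10/30/...` from matching on its dated tail.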
|
@@ -50,6 +50,14 @@ describe WashingtonPostPageParserV1 do
|
|
50
50
|
it 'should parse the content' do
|
51
51
|
@pa.content[0].should == 'In a major setback for al-Qaeda’s affiliate in East Africa, the Obama administration said Friday it had confirmed the death of a key Somali militant leader who had been targeted in an airstrike earlier in the week.'
|
52
52
|
end
|
53
|
+
|
54
|
+
it 'should get the guid from the url' do
|
55
|
+
@pa.guid_from_url.should == 'fc9fee06-3512-11e4-9e92-0899b306bbea'
|
56
|
+
end
|
57
|
+
|
58
|
+
it 'should return the guid from the url using the guid method' do
|
59
|
+
@pa.guid.should == 'fc9fee06-3512-11e4-9e92-0899b306bbea'
|
60
|
+
end
|
53
61
|
end
|
54
62
|
|
55
63
|
describe 'when parsing the bust-boom article' do
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: web-page-parser
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 1.
|
4
|
+
version: 1.1.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- John Leach
|
@@ -31,7 +31,7 @@ cert_chain:
|
|
31
31
|
MghEyBTNQa+QTUTKQMjYOO3kV+Wuv+iQGaMm/bu2SD+Ov0XUzzAsSfz0ZvrF3fbG
|
32
32
|
jdD4CMQtJNDqDiWuUkg=
|
33
33
|
-----END CERTIFICATE-----
|
34
|
-
date:
|
34
|
+
date: 2015-01-30 00:00:00.000000000 Z
|
35
35
|
dependencies:
|
36
36
|
- !ruby/object:Gem::Dependency
|
37
37
|
name: htmlentities
|
@@ -104,8 +104,8 @@ dependencies:
|
|
104
104
|
- !ruby/object:Gem::Version
|
105
105
|
version: '0'
|
106
106
|
description: A Ruby library to parse the content out of web pages. Currently supports
|
107
|
-
BBC News pages, The Guardian, Independent
|
108
|
-
News Sniffer project. http://www.newssniffer.co.uk
|
107
|
+
BBC News pages, The Guardian, Independent, New York Times and The Intercept articles.
|
108
|
+
Used by the News Sniffer project. http://www.newssniffer.co.uk
|
109
109
|
email: john@johnleach.co.uk
|
110
110
|
executables: []
|
111
111
|
extensions: []
|
@@ -124,6 +124,7 @@ files:
|
|
124
124
|
- lib/web-page-parser/parsers/independent_page_parser.rb
|
125
125
|
- lib/web-page-parser/parsers/new_york_times_page_parser.rb
|
126
126
|
- lib/web-page-parser/parsers/test_page_parser.rb
|
127
|
+
- lib/web-page-parser/parsers/the_intercept_page_parser.rb
|
127
128
|
- lib/web-page-parser/parsers/washingtonpost_page_parser.rb
|
128
129
|
- spec/base_parser_spec.rb
|
129
130
|
- spec/fixtures/bbc_news/10249066.stm.html
|
@@ -153,6 +154,7 @@ files:
 - spec/fixtures/new_york_times/khaled-meshal-the-leader-of-hamas-vacates-damascus.html
 - spec/fixtures/new_york_times/show-banned-french-comedian-has-new-one.html
 - spec/fixtures/new_york_times/the-long-run-gingrich-stuck-to-caustic-path-in-ethics-battles.html
+- spec/fixtures/theintercept/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers.html
 - spec/fixtures/washingtonpost/pentagon-confirms-al-shabab-leader-killed.html
 - spec/fixtures/washingtonpost/sgt-bowe-bergdahls-capture-remains-amystery.html
 - spec/fixtures/washingtonpost/will-a-bust-follow-the-boom-in-britain.html
@@ -161,6 +163,7 @@ files:
 - spec/parsers/guardian_page_spec.rb
 - spec/parsers/independent_page_parser_spec.rb
 - spec/parsers/new_york_times_page_parser_spec.rb
+- spec/parsers/the_intercept_page_parser_spec.rb
 - spec/parsers/washingtonpost_page_parser_spec.rb
 - spec/spec.opts
 - spec/spec_helper.rb
@@ -189,42 +192,44 @@ signing_key:
 specification_version: 4
 summary: A parser for various news organisation's web pages
 test_files:
-- spec/fixtures/new_york_times/show-banned-french-comedian-has-new-one.html
-- spec/fixtures/new_york_times/the-long-run-gingrich-stuck-to-caustic-path-in-ethics-battles.html
-- spec/fixtures/new_york_times/khaled-meshal-the-leader-of-hamas-vacates-damascus.html
-- spec/fixtures/guardian/syria-libya-middle-east-unrest-live.html
-- spec/fixtures/guardian/anger-grows-rbs-chiefs-bonus.html
-- spec/fixtures/guardian/anger-grows-rbs-chiefs-bonus-with-explainer.html
-- spec/fixtures/guardian/nhs-patient-data-available-companies-buy.html
-- spec/fixtures/guardian/barack-obama-nicki-minaj-mariah-carey.html
-- spec/fixtures/washingtonpost/sgt-bowe-bergdahls-capture-remains-amystery.html
-- spec/fixtures/washingtonpost/will-a-bust-follow-the-boom-in-britain.html
-- spec/fixtures/washingtonpost/pentagon-confirms-al-shabab-leader-killed.html
-- spec/fixtures/cassette_library/BbcNewsPageParserV4.yml
-- spec/fixtures/independent/innocent-starving-close-to-death-one-victim-of-the-siege-that-shames-syria-9065538.html
-- spec/fixtures/independent/david-cameron-set-for-uturn-over-uk-sanctuary-9077647.html
-- spec/fixtures/independent/belgian-man-who-skipped-100-restaurant-bills-is-killed-9081407.html
-- spec/fixtures/independent/saudi-authorities-stop-textmessage-tracking-of-women-for-now-9065486.html
-- spec/fixtures/bbc_news/20230333.stm.html
 - spec/fixtures/bbc_news/10249066.stm.html
+- spec/fixtures/bbc_news/10341015.stm.html
+- spec/fixtures/bbc_news/11125504.html
+- spec/fixtures/bbc_news/12921632.html
 - spec/fixtures/bbc_news/13293006.html
+- spec/fixtures/bbc_news/19957138.stm.html
+- spec/fixtures/bbc_news/20230333.stm.html
+- spec/fixtures/bbc_news/21528631.html
+- spec/fixtures/bbc_news/6072486.stm.html
 - spec/fixtures/bbc_news/7745137.stm.html
+- spec/fixtures/bbc_news/8011268.stm.html
 - spec/fixtures/bbc_news/8029015.stm.html
-- spec/fixtures/bbc_news/11125504.html
 - spec/fixtures/bbc_news/8040164.stm.html
-- spec/fixtures/bbc_news/21528631.html
-- spec/fixtures/bbc_news/10341015.stm.html
 - spec/fixtures/bbc_news/8063681.stm.html
-- spec/fixtures/
-- spec/fixtures/
-- spec/fixtures/
-- spec/fixtures/
-- spec/
-- spec/
+- spec/fixtures/cassette_library/BbcNewsPageParserV4.yml
+- spec/fixtures/guardian/anger-grows-rbs-chiefs-bonus-with-explainer.html
+- spec/fixtures/guardian/anger-grows-rbs-chiefs-bonus.html
+- spec/fixtures/guardian/barack-obama-nicki-minaj-mariah-carey.html
+- spec/fixtures/guardian/nhs-patient-data-available-companies-buy.html
+- spec/fixtures/guardian/syria-libya-middle-east-unrest-live.html
+- spec/fixtures/independent/belgian-man-who-skipped-100-restaurant-bills-is-killed-9081407.html
+- spec/fixtures/independent/david-cameron-set-for-uturn-over-uk-sanctuary-9077647.html
+- spec/fixtures/independent/innocent-starving-close-to-death-one-victim-of-the-siege-that-shames-syria-9065538.html
+- spec/fixtures/independent/saudi-authorities-stop-textmessage-tracking-of-women-for-now-9065486.html
+- spec/fixtures/new_york_times/khaled-meshal-the-leader-of-hamas-vacates-damascus.html
+- spec/fixtures/new_york_times/show-banned-french-comedian-has-new-one.html
+- spec/fixtures/new_york_times/the-long-run-gingrich-stuck-to-caustic-path-in-ethics-battles.html
+- spec/fixtures/washingtonpost/pentagon-confirms-al-shabab-leader-killed.html
+- spec/fixtures/washingtonpost/sgt-bowe-bergdahls-capture-remains-amystery.html
+- spec/fixtures/washingtonpost/will-a-bust-follow-the-boom-in-britain.html
+- spec/fixtures/theintercept/canada-proclaiming-war-12-years-shocked-someone-attacked-soldiers.html
 - spec/parsers/bbc_news_page_spec.rb
 - spec/parsers/guardian_page_spec.rb
-- spec/parsers/independent_page_parser_spec.rb
 - spec/parsers/new_york_times_page_parser_spec.rb
+- spec/parsers/independent_page_parser_spec.rb
+- spec/parsers/the_intercept_page_parser_spec.rb
+- spec/parsers/washingtonpost_page_parser_spec.rb
+- spec/parser_factory_spec.rb
 - spec/spec.opts
 - spec/spec_helper.rb
-- spec/
+- spec/base_parser_spec.rb
metadata.gz.sig
CHANGED
Binary file