replacer_bot 0.0.8 → 0.0.9

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: b1e8d3be70c7dfae8451a8a7e6655bb74bb545a9
-   data.tar.gz: 062f05a23f866b7eb168f95160b7dc5d07a4632d
+   metadata.gz: 19abcdfabf6c0bf9b60737242db4f0b2b81cfd8a
+   data.tar.gz: 71c81bee80f3775d423de810c51f2473c9ae3139
  SHA512:
-   metadata.gz: 830a1b44dfc123d1ea276ef70deecd5f80276dd9900f43bc6d6c3d7024f87387b0b2acedd26197b028ba8d83509438bb354366bd555219fcf97aa5cf4725c04a
-   data.tar.gz: 1147ca7c0ec8a6dc588f68535d4fcd0d59f0d41d6844b87197a7e07a6df7a5a3b716a77be82506c4a3dd35b7b0c5e0443d9c4ac759f27d6644a0d16bbdc9c113
+   metadata.gz: fbe3eb39215c8e41c236cc10132af163def7147cdaaf20854244f258e0d2dcdef2e08fdda0cde87968c2069d57917c8abfc20c48ae90442853055287065e5eac
+   data.tar.gz: 278f691140d7b6824153d6da70daa72d901358b0fe533e92c20f3cd177b9b4e6e33f1d27ad300903b731f666e3fdb2af79a042eba072b5eed737e33d8ea18d11
data/README.md CHANGED
@@ -13,6 +13,7 @@ Twitter bot that:
  * Searches Twitter for a phrase
  * Search-and-replaces phrases in the tweets
  * Tweets them
+ * Makes a note of the last tweet found so it knows where to start from next time
 
  ## Installation
 
@@ -34,6 +35,7 @@ The default config is [here](https://github.com/pikesley/replacer_bot/blob/maste
  - david cameron: "Satan's Little Helper"
  - cameron: Satan
  save_file: /Users/sam/.replacer_bot/last.tweet
+ seen_tweets: /Users/sam/.replacer_bot/seen.tweets
 
  Notes:
 
@@ -67,3 +69,20 @@ There's also
  ➔ replacer dry_run
 
  which does the search and shows what it would have tweeted, without actually tweeting anything
+
+ ## Reducing the noise
+
+ It turns out that a lot of Twitter is people (or bots) retweeting the same stuff with minimal changes, like adding extra hashtags or using a different URL shortener (I don't really understand how this even happens, but whatever). (Actually, I wonder how much of Twitter is just bots yelling at each other in the void. But I digress.) This makes a crude 'search for this phrase' bot _extremely_ noisy, so I have come up with some Opinions based on some very crude Reckons. Things that will make the bot consider tweets to be 'the same' as tweets we've seen before, and therefore ignorable, are:
+
+ * They match save for any URLs they contain being different
+ * They match save for different hashtags at the _start_ and _end_ of the tweet (hashtags in the body of the tweet appear to be more relevant, based on my Reckons)
+
+ The above reduced the noise a bit, but not enough to make a substantial difference. So I came up with this:
+
+ * If there is an overlap of 4 consecutive words between this tweet and one we've seen before, we ignore it
+
+ The 4 words thing is tunable in `config.yml`:
+
+ similarity_weighting: 4
+
+ but 4 seems about right for my current use case; it will clearly depend on the popularity of your search term
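
The three rules in the new README section can be sketched in a few lines of Ruby. This is only an illustration of the idea, not the gem's code: the module name `NoiseSketch`, its method names, the simple URL pattern and the choice to compare case-insensitively are all assumptions made for the example. Only the default of 4 mirrors the `similarity_weighting` setting described above.

```ruby
# Hypothetical sketch of the "same tweet" heuristics; none of these names come from the gem.
module NoiseSketch
  # Deliberately crude URL pattern (an assumption, good enough for a sketch)
  URL = %r{https?://\S+}

  def self.hashtag?(word)
    word.start_with?('#')
  end

  # Drop URLs, then drop hashtags from the start and end only.
  # Hashtags in the body of the tweet are kept.
  def self.normalise(text)
    words = text.gsub(URL, ' ').split.map(&:downcase)
    words.shift while words.any? && hashtag?(words.first)
    words.pop while words.any? && hashtag?(words.last)
    words
  end

  # Two tweets count as 'the same' if they match after normalising,
  # or if they share a run of `weighting` consecutive words.
  def self.similar?(seen, candidate, weighting: 4)
    first = normalise(seen)
    second = normalise(candidate)
    return true if first == second

    (first.each_cons(weighting).to_a & second.each_cons(weighting).to_a).any?
  end
end

seen = 'Open data is the new oil http://t.co/abc #opendata'
NoiseSketch.similar?(seen, '#news Open data is the new oil http://bit.ly/xyz #data') # => true
NoiseSketch.similar?(seen, 'Completely unrelated tweet about open data')             # => false
```

Raising `weighting` makes the check stricter (fewer tweets are treated as duplicates), which is why a popular search term might want a different value from a quiet one.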
data/lib/replacer_bot.rb CHANGED
@@ -4,6 +4,7 @@ require 'uri'
  require 'singleton'
  require 'thor'
  require 'yaml'
+ require 'htmlentities'
 
  require 'replacer_bot/version'
  require 'replacer_bot/replacer'
@@ -7,6 +7,11 @@ module ReplacerBot
      word[0] == '#'
    end
 
+   def self.encode_entities string
+     coder = HTMLEntities.new
+     coder.decode string
+   end
+
    def self.last_tweet
      begin
        Marshal.load File.read Config.instance.config.save_file
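
The context lines above show `last_tweet` reading the save file back with `Marshal.load`. A minimal round-trip sketch of that idea follows. The module `SaveFileSketch`, its method names and the fallback value of `0` are assumptions for illustration; the diff does not show how the gem writes the file or what it does when the file is missing.

```ruby
# Hypothetical sketch: persist the last-seen tweet ID between runs with Marshal.
module SaveFileSketch
  # Write the ID in Marshal's binary format
  def self.save(path, id)
    File.binwrite(path, Marshal.dump(id))
  end

  # Read it back, falling back to a default on the first run (assumed behaviour)
  def self.last_id(path, default: 0)
    Marshal.load(File.binread(path))
  rescue Errno::ENOENT
    default
  end
end
```

Marshal data is only safe to load from files you wrote yourself, which is the case for a local save file like this one.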
@@ -15,7 +15,7 @@ module ReplacerBot
      end
 
      def tweets
-       search.map { |r| ReplacerBot.truncate ReplacerBot.replace string: r.text }
+       search.map { |r| ReplacerBot.truncate ReplacerBot.encode_entities ReplacerBot.replace string: r.text }
      end
 
      def tweet dry_run: false, chatty: false
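
The changed line chains three calls without parentheses, so it reads right to left: replace first, then `encode_entities` (which, despite its name, calls `decode` and so turns `&amp;` back into `&`), then truncate. The sketch below spells out that order with explicit parentheses. Everything in it is a stand-in: `PipelineSketch`, the 140-character limit and the single hard-coded replacement are assumptions, and the standard library's `CGI.unescapeHTML` is swapped in for the `htmlentities` gem so the example runs without extra dependencies.

```ruby
require 'cgi'

# Hypothetical stand-ins for the three ReplacerBot helpers used in `tweets`.
module PipelineSketch
  LIMIT = 140 # assumed tweet length limit

  def self.replace(string:)
    string.gsub('Open Data', 'Taylor Swift')
  end

  # Stand-in for encode_entities: decodes entities such as &amp;
  def self.decode(string)
    CGI.unescapeHTML(string)
  end

  def self.truncate(string)
    string.length > LIMIT ? string[0, LIMIT] : string
  end

  # Same order as the changed line: replace, then decode, then truncate
  def self.process(text)
    truncate(decode(replace(string: text)))
  end
end

PipelineSketch.process('Open Data &amp; friends') # => "Taylor Swift & friends"
```

Decoding before truncating means an entity counts as one character towards the limit and cannot be cut in half.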
@@ -1,3 +1,3 @@
  module ReplacerBot
-   VERSION = "0.0.8"
+   VERSION = "0.0.9"
  end
data/replacer_bot.gemspec CHANGED
@@ -21,6 +21,7 @@ Gem::Specification.new do |spec|
    spec.add_dependency 'twitter', '~> 5.14'
    spec.add_dependency 'dotenv', '~> 2.0'
    spec.add_dependency 'thor', '~> 0.19'
+   spec.add_dependency 'htmlentities', '~> 4.3'
 
    spec.add_development_dependency 'bundler', '~> 1.7'
    spec.add_development_dependency 'rake', '~> 10.0'
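
For readers unfamiliar with the `~>` operator used for the new `htmlentities` dependency, the following sketch checks a few version numbers against `'~> 4.3'` using RubyGems' own classes. The helper name `allowed?` and the sample versions are made up for the example.

```ruby
require 'rubygems'

# The constraint added to the gemspec for htmlentities
HTMLENTITIES_REQUIREMENT = Gem::Requirement.new('~> 4.3')

# Hypothetical helper: does this version satisfy the constraint?
def allowed?(version)
  HTMLENTITIES_REQUIREMENT.satisfied_by?(Gem::Version.new(version))
end

allowed?('4.2.9') # => false (too old)
allowed?('4.3.4') # => true
allowed?('4.9.1') # => true  (any 4.x from 4.3 upwards)
allowed?('5.0.0') # => false (next major version is excluded)
```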
@@ -5,6 +5,17 @@ module ReplacerBot
      FileUtils.rm_f Config.instance.config.seen_tweets
    end
 
+   it 'recognises a hashtag' do
+     expect(ReplacerBot.is_hashtag '#hashtag').to eq true
+     expect(ReplacerBot.is_hashtag 'not_hashtag').to eq false
+   end
+
+   it 'encodes HTML entities' do
+     expect(ReplacerBot.encode_entities 'This has a &amp; in it').to eq 'This has a & in it'
+     expect(ReplacerBot.encode_entities 'This has no entities that need coding & ting').
+       to eq 'This has no entities that need coding & ting'
+   end
+
    context 'URLs' do
      it 'URL-encodes a search term' do
        expect(ReplacerBot.encode term: 'open data').to eq '%22open%20data%22'
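
The last expectation above implies that the search term is wrapped in double quotes and then percent-encoded. One way to get the same output with the standard library is sketched below. `SearchTermSketch` is a hypothetical name and this is not necessarily how `ReplacerBot.encode` is implemented; the diff does not show its body.

```ruby
require 'erb'

# Hypothetical sketch: quote the phrase, then percent-encode it for a search URL.
module SearchTermSketch
  def self.encode(term:)
    ERB::Util.url_encode(%("#{term}"))
  end
end

SearchTermSketch.encode(term: 'open data') # => "%22open%20data%22"
```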
@@ -88,11 +99,6 @@ module ReplacerBot
          expect(ReplacerBot.replace string: 'This is an Open Data tweet').to eq 'This is a Taylor Swift tweet'
          expect(ReplacerBot.replace string: 'This is an Open Data tweet about an #opendata story').to eq 'This is a Taylor Swift tweet about a #TaylorSwift story'
        end
-
-       it 'recognises a hashtag' do
-         expect(ReplacerBot.is_hashtag '#hashtag').to eq true
-         expect(ReplacerBot.is_hashtag 'not_hashtag').to eq false
-       end
      end
    end
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: replacer_bot
  version: !ruby/object:Gem::Version
-   version: 0.0.8
+   version: 0.0.9
  platform: ruby
  authors:
  - pikesley
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2015-08-18 00:00:00.000000000 Z
+ date: 2015-08-19 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: twitter
@@ -52,6 +52,20 @@ dependencies:
      - - "~>"
        - !ruby/object:Gem::Version
          version: '0.19'
+ - !ruby/object:Gem::Dependency
+   name: htmlentities
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '4.3'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '4.3'
  - !ruby/object:Gem::Dependency
    name: bundler
    requirement: !ruby/object:Gem::Requirement