stenographer 0.1.0
- data/.gitignore +17 -0
- data/Gemfile +4 -0
- data/LICENSE.txt +22 -0
- data/README.md +50 -0
- data/Rakefile +1 -0
- data/config/swearwords.txt +1 -0
- data/lib/stenographer.rb +8 -0
- data/lib/stenographer/conversation.rb +166 -0
- data/lib/stenographer/message.rb +32 -0
- data/lib/stenographer/transcript.rb +31 -0
- data/lib/stenographer/version.rb +3 -0
- data/lib/stenographer/word.rb +20 -0
- data/stenographer.gemspec +25 -0
- metadata +70 -0
data/.gitignore
ADDED
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,22 @@
+Copyright (c) 2012 Matthew Werner
+
+MIT License
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,50 @@
+# Stenographer
+
+Stenographer is a gem that helps you dig through your chat history, currently limited to Adium chat logs.
+
+## Installation
+
+Add this line to your application's Gemfile:
+
+    gem 'stenographer'
+
+And then execute:
+
+    $ bundle
+
+Or install it yourself as:
+
+    $ gem install stenographer
+
+## Usage
+
+To get started with stenographer, you just have to call
+
+    transcript = Transcript.new
+    transcript.read_back
+
+This will go to the default Adium log directory and read back all your chat conversations. You can also pass the constructor a specific directory to limit the conversations Transcript pulls in.
+
+To search for a specific string within your chat history, just pass your queried string:
+
+    transcript.read_back(query: 'cool')
+
+This will return every instance of 'cool' in your history. Sometimes it's handy to understand what it was you thought was so cool. Including some context helps:
+
+    transcript.read_back(query: 'cool', context: true)
+
+Some queries yield more interesting results than others:
+
+    transcript.read_back(query: 'http')
+
+## Upcoming
+
+There are quite a few more things I'd like to do with this gem. Making it easier to specify which exact conversation you're meaning to pull up is high on the list.
+
+## Contributing
+
+1. Fork it
+2. Create your feature branch (`git checkout -b my-new-feature`)
+3. Commit your changes (`git commit -am 'Add some feature'`)
+4. Push to the branch (`git push origin my-new-feature`)
+5. Create new Pull Request
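The `query` and `context` options described in the README can be pictured with a small stand-alone sketch. It assumes, following `Conversation#range` later in this diff, that context means up to five lines on either side of the first and last match. The `select_lines` helper and the sample lines are hypothetical and not part of the gem.

```ruby
# Hypothetical helper, not part of stenographer: pick the lines to show
# for a query, optionally padded with surrounding context.
CONTEXT_LINES = 5 # same fixed padding that Conversation#range uses

def select_lines(lines, query, context: false)
  # Indexes of every line that contains the query string.
  hits = lines.each_index.select { |i| lines[i].include?(query) }
  return [] if hits.empty?
  # Without context, only the matching lines are returned.
  return hits.map { |i| lines[i] } unless context

  # With context, return one window from before the first match to
  # after the last match, clamped to the bounds of the array.
  first = [hits.min - CONTEXT_LINES, 0].max
  last  = [hits.max + CONTEXT_LINES, lines.length - 1].min
  lines[first..last]
end

lines = (1..20).map { |n| n == 10 ? "line #{n}: that is cool" : "line #{n}: hello" }

puts select_lines(lines, 'cool')
puts select_lines(lines, 'cool', context: true).length
```

Note that the window spans from the first match to the last, so two matches far apart in a long conversation pull in everything between them.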
data/Rakefile
ADDED
@@ -0,0 +1 @@
+require "bundler/gem_tasks"
data/config/swearwords.txt
ADDED
@@ -0,0 +1 @@
fuck shit asshole cunt fag fuk fck fcuk assfuck assfucker fucker motherfucker asscock asshead asslicker asslick assnigger asssucker bastard bitch bitchtits bitches bitch brotherfucker bullshit bumblefuck buttfucka fucka buttfucker buttfucka fagbag fagfucker faggit faggot faggotcock fagtard fatass fuckoff fuckstick fucktard fuckwad fuckwit dick dickfuck dickhead dickjuice dickmilk doochbag douchebag douche dickweed dyke dumbass dumass fuckboy fuckbag gayass gayfuck gaylord gaytard nigga niggers niglet paki piss prick pussy poontang poonany porchmonkey porch monkey poon queer queerbait queerhole queef renob rimjob ruski sandnigger nigger schlong shitass shitbag shitbagger shitbreath chinc carpetmuncher chink choad clit clitface clusterfuck cockass cockbite cockface skank skeet skullfuck slut slutbag splooge twatlips twat twats twatwaffle vaj vajayjay va-j-j wank wankjob wetback whore whorebag whoreface
data/lib/stenographer.rb
ADDED
data/lib/stenographer/conversation.rb
ADDED
@@ -0,0 +1,166 @@
+module Stenographer
+  class Conversation
+
+    attr_accessor :id, :members, :messages, :created_at
+
+    def initialize(id, noko)
+      @id = id
+      @messages = []
+      @members = []
+      @history = []
+      @lines_to_print = 0
+      @created_at = nil
+
+      noko.css('message').each_with_index do |message, i|
+        name = (message.attributes['alias'] || message.attributes['sender']).value
+        @created_at ||= message.attributes['time'].value
+        body = message.children.inner_text
+
+        @members << name if !@members.include?(name)
+        @messages << Message.new(i, name, body)
+      end
+    end
+
+    # Print out a conversation
+    #
+    # Example:
+    # >> conversation.read_back
+    # => Jack Johnson, John Jackson @ 2012-11-05T20:48:50-08:00 Lines: 0..-1
+    # John Jackson It's time someone had the courage to stand up and say: I'm against those things that everybody hates.
+    # Jack Johnson Now, I respect my opponent. I think he's a good man. But quite frankly, I agree with everything he just said.
+    # John Jackson I say your three cent titanium tax goes too far.
+    # Jack Johnson And I say your three cent titanium tax doesn't go too far enough.
+    #
+    # Arguments:
+    # opts: (Hash)
+    # [query] only return lines that match query
+    # [context] used with query, returns surrounding lines to match
+    def read_back(opts={})
+      raise "read_back only accepts an options hash" if !opts.is_a?(Hash)
+      query = opts[:query]
+      return if query && !include?(query)
+
+      id_range = range(opts)
+      puts header(id_range)
+      messages[id_range].each do |message|
+        if query && message.include?(query)
+          message.print(highlighted: true)
+        else
+          message.print unless query && opts[:context].nil?
+        end
+      end
+      puts
+    end
+
+    def report
+      distribution = word_distribution(top: 10)
+      max_length = distribution.keys.map(&:length).max + 2
+      max_length = 15 if max_length < 15
+      code = "%-#{max_length}s %0s"
+
+      puts "\n=> Conversation Report <" + ('=' * 21)
+      puts printf(code, 'Members: ', members.join(', '))
+      puts printf(code, 'Messages: ', messages_count)
+      puts printf(code, 'Words: ', word_count)
+      puts printf(code, 'Created at: ', created_at)
+      puts "\n=> Word Usage"
+      distribution.each_pair do |word, count|
+        puts printf(code, "#{word}:", count)
+      end
+
+      puts ('=' * 45) + "\n"
+    end
+
+    def range(opts={})
+      min_id = 0
+      max_id = -1
+      context = opts[:context].nil? ? 0 : 5
+
+      if opts[:query]
+        occurrences = []
+        messages.each do |m|
+          occurrences << m.id if m.include?(opts[:query])
+        end
+
+        if occurrences.any?
+          min_id = occurrences.min - (context)
+          max_id = occurrences.max + (context)
+        else
+          min_id = -1
+          max_id = 0
+        end
+      end
+
+      min_id = 0 if min_id < 0
+      max_id = -1 if max_id > messages.length
+
+      return (opts[:min_id] || min_id)..(opts[:max_id] || max_id)
+    end
+
+    def include?(word)
+      return true if word.nil?
+
+      messages.collect{|m| m.include?(word)}.include?(true)
+    end
+
+    def header(range)
+      "\n=> #{members.join(', ')} @ #{created_at} Lines: #{range}"
+    end
+
+    def messages_count
+      messages.length
+    end
+
+    def word_count
+      messages.collect{|m| m.words.length}.inject(0, :+)
+    end
+
+    def word_distribution(opts={})
+      cache = {}
+      messages.each do |message|
+        message.explode.each do |word|
+          cache[word.text] = 0 if cache[word.text].nil?
+
+          cache[word.text] += 1
+        end
+      end
+
+      cache = cuss_words(cache) unless opts[:cuss].nil?
+      cache = top_words(opts[:top], cache) unless opts[:top].nil?
+
+      cache
+    end
+
+    def cuss_words(cache=word_distribution)
+      cache.reject!{|key| Word::SWEARWORDS.include?(key) }
+    end
+
+    def top_words(limit=10, cache=word_distribution)
+      limit = limit == true ? (+1.0/0.0) : limit
+
+      sorted = {}
+      cache.each_pair do |word, count|
+        sorted[count] = [] if sorted[count].nil?
+
+        sorted[count] << word
+      end
+
+      top = {}; i = 0
+      sorted.keys.sort{|x,y| y <=> x }.each do |key|
+        break if i > limit
+
+        sorted[key].each do |word|
+          top[word] = key; i += 1
+
+          break if i > limit
+        end
+      end
+
+      return top
+    end
+
+    def to_s
+      "Conversation <members: #{members.join(', ')}, messages: #{messages.length}>"
+    end
+  end
+end
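Two details in the word-counting code above look worth a second glance. `top_words` increments `i` before testing `i > limit`, so it appears to return up to `limit + 1` words. `cuss_words` relies on `Hash#reject!`, which returns `nil` when nothing was removed, so `word_distribution(cuss: true)` could end up with a `nil` cache. Below is a minimal sketch of one alternative, assuming the intent is "the N most frequent words". `count_words` and `top_n` are hypothetical names, not the gem's API.

```ruby
# Hypothetical helpers, not part of stenographer.

# Count how often each whitespace-separated word occurs across all bodies.
def count_words(bodies)
  bodies.flat_map(&:split).each_with_object(Hash.new(0)) do |word, counts|
    counts[word] += 1
  end
end

# Return at most `limit` words as a word => count hash, most frequent first.
# Ties are broken alphabetically, which is an arbitrary choice for this sketch.
def top_n(counts, limit = 10)
  counts.sort_by { |word, count| [-count, word] }.first(limit).to_h
end

bodies = ['the tax goes too far', 'the tax does not go too far enough']
p top_n(count_words(bodies), 3)
```

For filtering, the non-destructive `Hash#reject` or `Hash#select` always returns a hash, which avoids the `nil` case.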
data/lib/stenographer/message.rb
ADDED
@@ -0,0 +1,32 @@
+module Stenographer
+  class Message
+
+    attr_accessor :id, :name, :body, :words
+
+    def initialize(id, name, body)
+      @id = id
+      @name = name
+      @body = body
+      @words = Word.split_message(self)
+    end
+
+    def include?(query=nil)
+      return true if query.nil?
+
+      body.include?(query)
+    end
+
+    def explode
+      Word.split_message(self)
+    end
+
+    def print(opts={})
+      code = opts[:highlighted] ? ">> %-17s %0s" : "%-20s %0s"
+      puts printf(code, name, body)
+    end
+
+    def to_s
+      "Message <name: #{name}, body: #{body}>"
+    end
+  end
+end
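`Message#print` calls `puts printf(...)`. `printf` writes the formatted text itself and returns `nil`, so on Ruby 1.9 and later the outer `puts` only contributes the trailing newline. Here is a sketch of the same two-column layout built with `format`, which returns the string instead of printing it. `format_line` is a hypothetical name, and the column widths are copied from the code above.

```ruby
# Hypothetical helper, not part of stenographer: build the output line as a
# string so it can be printed, logged, or compared in a test.
def format_line(name, body, highlighted: false)
  # Highlighted lines spend three columns on the ">> " marker, so the name
  # column shrinks from 20 to 17 and the body still starts in the same place.
  template = highlighted ? '>> %-17s %s' : '%-20s %s'
  format(template, name, body)
end

puts format_line('John Jackson', 'I say your three cent titanium tax goes too far.')
puts format_line('Jack Johnson', 'And I say it does not go too far enough.', highlighted: true)
```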
data/lib/stenographer/transcript.rb
ADDED
@@ -0,0 +1,31 @@
+module Stenographer
+  class Transcript
+
+    attr_accessor :conversations
+
+    def initialize(dir=nil)
+      dir ||= "/Users/#{`whoami`.strip}/Library/Application\ Support/Adium\ 2.0/Users/default/Logs/**/*.xml"
+      @conversations = []
+
+      Dir.glob(dir).each_with_index do |file, i|
+        doc = Nokogiri.XML(File.open(File.expand_path(file), 'rb'))
+        conversation = Conversation.new(i, doc)
+        next unless conversation.messages.any?
+
+        @conversations << conversation
+      end
+    end
+
+    def read_back(opts={})
+      conversations.each do |conversation|
+        conversation.read_back(opts)
+      end
+
+      self
+    end
+
+    def to_s
+      "Transcript <conversations: #{conversations.length}>"
+    end
+  end
+end
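The default glob above is built from the output of `whoami` and a hard-coded `/Users/` prefix. The sketch below shows an alternative that takes the home directory from Ruby instead, assuming the same Adium 2.0 log layout. `default_log_glob` is a hypothetical name, not something the gem defines.

```ruby
# Hypothetical helper, not part of stenographer.
# Builds the glob for Adium's XML logs without shelling out to `whoami`.
# The home directory defaults to Dir.home but can be passed in explicitly.
def default_log_glob(home = Dir.home)
  File.join(home, 'Library', 'Application Support', 'Adium 2.0',
            'Users', 'default', 'Logs', '**', '*.xml')
end

puts default_log_glob('/Users/example')
```

The result can be handed to `Dir.glob` just like the string in `Transcript#initialize`.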
data/lib/stenographer/word.rb
ADDED
@@ -0,0 +1,20 @@
+module Stenographer
+  class Word
+
+    SWEARWORDS = File.read('./config/swearwords.txt').split(' ')
+
+    attr_accessor :text
+
+    def initialize(text)
+      @text = text
+    end
+
+    def self.split_message(message)
+      words = message.body.split(" ")
+      words = words.map(&:strip)
+
+      words.collect{|text| new(text)}
+    end
+
+  end
+end
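`File.read('./config/swearwords.txt')` is resolved against the process's working directory, so the constant would presumably only load when Ruby is started from the gem's root. The sketch below resolves the path relative to the source file instead, assuming the `lib/stenographer/word.rb` and `config/swearwords.txt` layout from the file list. `swearwords_path` is a hypothetical helper.

```ruby
# Hypothetical helper, not part of stenographer.
# Given the absolute path of lib/stenographer/word.rb, locate
# config/swearwords.txt at the gem root, independent of Dir.pwd.
def swearwords_path(word_rb_path)
  # Two levels up from lib/stenographer/ is the gem root.
  File.expand_path('../../config/swearwords.txt', File.dirname(word_rb_path))
end

puts swearwords_path('/gems/stenographer-0.1.0/lib/stenographer/word.rb')
```

Inside `word.rb` itself the argument would be `__FILE__`.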
data/stenographer.gemspec
ADDED
@@ -0,0 +1,25 @@
+# -*- encoding: utf-8 -*-
+lib = File.expand_path('../lib', __FILE__)
+$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+require 'stenographer/version'
+
+Gem::Specification.new do |gem|
+  gem.name = "stenographer"
+  gem.version = Stenographer::VERSION
+  gem.platform = Gem::Platform::RUBY
+  gem.authors = ["Matthew Werner"]
+  gem.email = ["mttwrnr@gmail.com"]
+  gem.description = %q{Remember what you've said}
+  gem.summary = %q{Stenographer is a gem that helps you dig through your chat history}
+  gem.homepage = "https://github.com/mwerner/stenographer"
+  gem.license = 'MIT'
+
+  gem.add_runtime_dependency 'nokogiri'
+
+  gem.required_rubygems_version = ">= 1.3.6"
+
+  gem.files = `git ls-files`.split($/)
+  gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
+  gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
+  gem.require_paths = ["lib"]
+end
metadata
ADDED
@@ -0,0 +1,70 @@
+--- !ruby/object:Gem::Specification
+name: stenographer
+version: !ruby/object:Gem::Version
+  version: 0.1.0
+  prerelease:
+platform: ruby
+authors:
+- Matthew Werner
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2012-11-23 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: nokogiri
+  requirement: &70360847748040 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: *70360847748040
+description: Remember what you've said
+email:
+- mttwrnr@gmail.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- .gitignore
+- Gemfile
+- LICENSE.txt
+- README.md
+- Rakefile
+- config/swearwords.txt
+- lib/stenographer.rb
+- lib/stenographer/conversation.rb
+- lib/stenographer/message.rb
+- lib/stenographer/transcript.rb
+- lib/stenographer/version.rb
+- lib/stenographer/word.rb
+- stenographer.gemspec
+homepage: https://github.com/mwerner/stenographer
+licenses:
+- MIT
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: 1.3.6
+requirements: []
+rubyforge_project:
+rubygems_version: 1.8.15
+signing_key:
+specification_version: 3
+summary: Stenographer is a gem that helps you dig through your chat history
+test_files: []