sortah 0.5.0
- data/.gitignore +8 -0
- data/.rvmrc +2 -0
- data/.travis.yml +3 -0
- data/CONTRIBUTION.md +23 -0
- data/FUTURE_PLANS.md +73 -0
- data/Gemfile +10 -0
- data/KNOWN_BUGS.md +12 -0
- data/LICENSE +27 -0
- data/README.md +216 -0
- data/Rakefile +14 -0
- data/TUTORIAL.md +43 -0
- data/bin/sortah +33 -0
- data/lib/sortah.rb +10 -0
- data/lib/sortah/cleanroom.rb +47 -0
- data/lib/sortah/components.rb +3 -0
- data/lib/sortah/components/destination.rb +41 -0
- data/lib/sortah/components/lens.rb +49 -0
- data/lib/sortah/components/router.rb +13 -0
- data/lib/sortah/email.rb +31 -0
- data/lib/sortah/errors.rb +15 -0
- data/lib/sortah/handler.rb +46 -0
- data/lib/sortah/parser.rb +57 -0
- data/lib/sortah/patches.rb +7 -0
- data/lib/sortah/util/component.rb +28 -0
- data/lib/sortah/util/component_collection.rb +22 -0
- data/lib/sortah/version.rb +3 -0
- data/sortah.gemspec +31 -0
- data/spec/bin_spec.rb +54 -0
- data/spec/destination_spec.rb +42 -0
- data/spec/email_spec.rb +13 -0
- data/spec/fixtures/rc +8 -0
- data/spec/parser_spec.rb +270 -0
- data/spec/semantic_spec.rb +310 -0
- data/spec/sortah_handler_spec.rb +21 -0
- data/spec/spec_helper.rb +3 -0
- metadata +117 -0
data/.gitignore
ADDED
data/.rvmrc
ADDED
data/.travis.yml
ADDED
data/CONTRIBUTION.md
ADDED
@@ -0,0 +1,23 @@
#Version rules

Version 1.0 comes when I'm pretty sure there are no major outstanding bugs from
release. After that, SemVer applies.

Let me (jfredett) control the Version, please don't submit a version change in
any pull request. This is merely out of concern for sanity.

#Contribution

Easy peasy, fork, pull request. Don't submit something with broken tests in it.

If you find a bug, filing an issue is cool, fixing it is cooler. I will love you
forever if you fix it.

##Contributing sortah definitions:

For now, there is no "sortah contrib" repo, if there are many people interested
in sharing some common set of sortah definitions, then one will be created.

For now, if you have something cool, make a page in the wiki.
data/FUTURE_PLANS.md
ADDED
@@ -0,0 +1,73 @@
#Ideas

- proxy destinations

Difficulty: middling

To allow for non-filesystem storage locations (say, redis, elasticsearch, w/e),
it would be nice to have "pseudo" destinations, like:

    destination :redis do
      RedisHandler.acquire_conn do |redis|
        redis.put email.key, email.to_s
      end
    end

    lens :key do
      email.key = SHA1_of(email)
    end

    router :root, :lenses => [:key] do
      send_to :redis
    end

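`SHA1_of` is not defined anywhere in this plan. As a minimal sketch, assuming it means "the SHA-1 hex digest of the raw message text", the key could be derived with Ruby's standard `Digest::SHA1`. The `sha1_of` helper and the in-memory Hash standing in for Redis are both hypothetical.

```ruby
require 'digest/sha1'

# Hypothetical helper: derive a stable key from the raw text of a message.
# Assumption: "SHA1_of(email)" means the SHA-1 hex digest of the full message.
def sha1_of(raw_email)
  Digest::SHA1.hexdigest(raw_email)
end

# Stand-in for a key/value store such as Redis (a plain in-memory Hash here).
store = {}

raw = "From: someone@example.com\nSubject: hi\n\nhello there\n"
key = sha1_of(raw)
store[key] = raw

puts key
puts store.key?(key)
```

Because the key depends only on the message text, delivering the same message twice would overwrite the same entry rather than create a duplicate.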
- multi-target `send_to`

Difficulty: easy

It would be nice to say:

    router do
      send_to [:foo, :bar, :baz]
    end

and have a copy be sent to each destination. Optionally, it could take a
parameter, "linked", so that the others would only be soft-links to the
canonical one (specified by `:link`), eg:

    router do
      send_to [:foo, :bar, :baz], :link => :foo
    end

In the above, :bar and :baz would be softlinks to :foo

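The soft-link variant could be sketched on the filesystem roughly as follows. This is only an illustration of the idea, not sortah code: `deliver_linked` and its arguments are hypothetical, and it assumes a platform where `File.symlink` is available.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper, not part of sortah: write one canonical copy of a
# message and soft-link it into every other target directory.
def deliver_linked(maildir, filename, body, targets, link)
  canonical = File.join(maildir, link.to_s, filename)
  FileUtils.mkdir_p(File.dirname(canonical))
  File.write(canonical, body)

  (targets - [link]).each do |target|
    dir = File.join(maildir, target.to_s)
    FileUtils.mkdir_p(dir)
    File.symlink(canonical, File.join(dir, filename))
  end
  canonical
end

maildir = Dir.mktmpdir("sortah-sketch")
deliver_linked(maildir, "msg-1", "hello", [:foo, :bar, :baz], :foo)

puts File.symlink?(File.join(maildir, "bar", "msg-1"))
puts File.read(File.join(maildir, "baz", "msg-1"))
```

Only `foo/msg-1` is a real file here; the other two entries resolve to it, so the message body is stored once.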
- contrib

Difficulty: easy
Prereq: Having a community.

Set up an easy-to-use 'contrib' repo for community sortah libraries.

- getmail integration

Difficulty: easy | hard

This comes in two flavors, the easy flavor is to just provide a wrapper
for defining a getmailrc. Perhaps making it easy to define a gmail rc and
also allowing for safe password injections (eg, not storing the password
plaintext in the rc file).

The harder version would be to just implement getmail as part of sortah.
This would allow for fine-grained control over how sortah gets fired, allowing
for better concurrency.
data/Gemfile
ADDED
data/KNOWN_BUGS.md
ADDED
@@ -0,0 +1,12 @@
doing

    sortah.sort(email)
    sortah.metadata#blahblahblah

does not behave as expected (it doesn't preserve metadata to the next line).
This has something to do with how I pull the data in the Kernel patch. It works
if you do

    sortah.sort(email).metadata#blahblah
data/LICENSE
ADDED
@@ -0,0 +1,27 @@
Copyright (c) 2011, Joseph Fredette
All rights reserved.

Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list
of conditions and the following disclaimer.

Redistributions in binary form must reproduce the above copyright notice, this
list of conditions and the following disclaimer in the documentation and/or
other materials provided with the distribution.

Neither the name of "sortah" nor the names of its contributors may be used to
endorse or promote products derived from this software without specific prior
written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README.md
ADDED
@@ -0,0 +1,216 @@
#sortah

##For sortin' your friggin' mail.

--------------------------------------------------------------------------------

[![Build Status](https://secure.travis-ci.org/jfredett/sortah.png)](http://travis-ci.org/jfredett/sortah)

--------------------------------------------------------------------------------

Sortah sorts mail. It provides a ruby [EDSL](# "Embedded DSL") for manipulating
email objects. The DSL allows the definition of three principal components:

- Destinations

A destination takes in an Email object, and returns a system path. This is where
the email object passed into it will be saved. Ex:

    destination :spam, "spam/"
    destination :ham, "/"

These are -- in essence -- simple declarations of the structure of your sorting
system. They may take one of several forms. The examples above are relative to
the mail directory, and will transparently manage organization south of that.
They can also be absolute paths, eg:

    destination :devnull, :abs => "/dev/null"

which is regarded as an absolutely qualified path. It may also alias another
path:

    destination :tldr, :devnull

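To make the three forms concrete, here is a toy resolver in plain Ruby. It is a sketch of the idea only, not how sortah resolves destinations: `MAILDIR`, `DESTINATIONS` and `resolve` are all made-up names, and the base path is an assumption.

```ruby
# Toy resolver, not sortah's implementation. MAILDIR is an assumed base path.
MAILDIR = "/home/user/Mail"

DESTINATIONS = {
  :spam    => "spam/",                  # relative to the mail directory
  :devnull => { :abs => "/dev/null" },  # absolute path
  :tldr    => :devnull                  # alias for another destination
}

def resolve(name)
  spec = DESTINATIONS.fetch(name)
  case spec
  when Symbol then resolve(spec)        # alias: follow it to the named destination
  when Hash   then spec.fetch(:abs)     # absolute: use the path as given
  else File.join(MAILDIR, spec)         # relative: anchor under the mail directory
  end
end

puts resolve(:spam)
puts resolve(:tldr)
```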
- Lenses

Lenses are functions which produce a value given an input email. This value is
interpreted as metadata to be used by the routers. Ex:

    lens :spam_value do
      x = 0
      email.text.each_line do |line|
        x += (line =~ /extension/) ? 1 : 0
      end
      x
    end

    lens :word_count do
      email.text.split.size
    end

Lenses can also depend on other lenses.

    lens :spam_ratio, :lenses => [:spam_value, :word_count] do
      # to_f avoids integer division, which would round most ratios down to 0
      email.spam_value.to_f / email.word_count
    end

You may specify the `pass_through` option to cause the lens to not set any
metadata. This is useful in two cases: updating old metadata, and interaction
(typically creational) with other services. Eg:

    lens :example_update, :pass_through => true do
      email.spam_value = 1000000 if email.sender == "annoying_guy0022493@hotmail.com"
    end

    lens :example_interaction, :pass_through => true, :lenses => [:spam_value] do
      next unless email.spam_value >= 1000000
      HTTParty.post "http://spamblacklist.net/spammer/new", :body => email.sender
    end

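One way to picture lens dependencies is as a small recursive evaluator: run a lens's dependencies first, then the lens itself, and store the result as metadata unless the lens is `pass_through`. The following is a toy model under that assumption, with the email modelled as a plain Hash. None of these names (`Lens`, `LENSES`, `apply_lens`) come from sortah.

```ruby
# Toy model of lenses and their dependencies; not sortah's implementation.
Lens = Struct.new(:name, :deps, :pass_through, :block)
LENSES = {}

# Register a lens under a name, with optional :lenses and :pass_through options.
def lens(name, opts = {}, &block)
  LENSES[name] = Lens.new(name, opts.fetch(:lenses, []),
                          opts.fetch(:pass_through, false), block)
end

# Run a lens after its dependencies. The result is stored as metadata on the
# email unless the lens is marked pass_through.
def apply_lens(name, email)
  return if email.key?(name)
  found = LENSES.fetch(name)
  found.deps.each { |dep| apply_lens(dep, email) }
  value = found.block.call(email)
  email[name] = value unless found.pass_through
end

lens(:spam_value) { |email| email[:text].each_line.count { |line| line =~ /extension/ } }
lens(:word_count) { |email| email[:text].split.size }
lens(:spam_ratio, :lenses => [:spam_value, :word_count]) do |email|
  email[:spam_value].to_f / email[:word_count]
end

email = { :text => "buy this extension now\nbest extension ever" }
apply_lens(:spam_ratio, email)
puts email[:spam_ratio]
```

Asking for `:spam_ratio` alone is enough; `:spam_value` and `:word_count` get filled in along the way.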
- Routers

This is the core of the language. A router is an object which produces either a
router object, or a destination. If it produces a destination, then the email is
delivered to that destination. If it produces another router, then the email is
passed along to the router produced. A router also 'depends' on lenses. These
lenses get applied when the router is called. There is one router which is
special, the "root" router, this is the first router which gets called. To
declare it, simply declare a router without a name. Ex:

    router :spam_filter, :lenses => [:spam_value] do
      send_to :ham if email.spam_value < 10
      send_to :spam
    end

    router :root, :lenses => [:word_count] do
      send_to :tldr if email.word_count > 100
      send_to :spam_filter
    end

`send_to` will first search for a destination with the given name; if it cannot
find one, it will search for the corresponding router. It also acts as
`return` -- halting execution of the block when it is called. This is
implemented via an exception, which means it _may_ cause performance issues on
things like the JVM, YMMV.

When defining a root router with lenses, you must specify ":root" as the title.

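The "acts as `return`" behaviour can be imitated in a few lines of plain Ruby. This is a sketch of the general technique (raise inside the block, rescue in the caller), not sortah's actual code. `Delivered`, `run_router` and this `send_to` are stand-ins written for the example.

```ruby
# Toy model of "send_to acts as return"; not sortah's actual code.
# Hypothetical exception used purely for control flow.
class Delivered < StandardError
  attr_reader :target

  def initialize(target)
    @target = target
    super("deliver to #{target}")
  end
end

# Raising unwinds out of the router block immediately, so statements after
# the first send_to that fires never run.
def send_to(target)
  raise Delivered.new(target)
end

# Run a router block and report which target it chose (nil if none).
def run_router(email)
  yield email
  nil
rescue Delivered => e
  e.target
end

spam_filter = lambda do |email|
  send_to :ham if email[:spam_value] < 10
  send_to :spam
end

puts run_router({ :spam_value => 3 }, &spam_filter)
puts run_router({ :spam_value => 50 }, &spam_filter)
```

Ruby's `catch`/`throw` pair offers the same early exit without constructing an exception object, which is one option if the exception cost turns out to matter.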
## Common problems, and how to solve them:

### Problem: Adding a mail to an external service, and then saving it.

As a user of sortah, you want to set up filters to save all email from the
address "searchable@somewhere.net" to the folder "foobar/", as well as register
it with the external service "RubberBandSearch".

### Solution

    destination :foobar, "foobar/"

    lens :search_index, :pass_through => true do
      #code to register the email in RubberBandSearch
      email.indexed = true
    end

    router :index_in_rubberband, :lenses => [:search_index] do
      send_to :foobar
    end

    router :root, :lenses => [:spam?] do
      send_to :devnull if email.spam?
      send_to :index_in_rubberband
    end

Here we've used a `pass_through` lens to do the actual indexing, and the router
is left as more of a proxy to call the lens.

### Problem

As a user of sortah, you want to maintain a whitelist of people who should have
their own folders, and you want those people to be subsorted in some arbitrarily
deep parent folders, eg:

    family/
      mom/
      dad/
      uncle_timmy/
    coworkers/
      pointy_hair/
      dilbert/
      old_coworkers/
        jim/
    personal/
      wife/
      friends/
        bob/
        mike/
        jack/

etc. Further, you'd like to only maintain the above file (or something like it), and
not have to write new sortah code every time you move jobs or make new friends.[1]

[1] Ideally, this code would maintain a directory structure for you. But as of right
now, sortah has no aspirations to do such a thing. Each edit which _moves_ entries
in the yaml definition file will simply create new folders; it is up to the author
of that yaml file to keep the directory coherent with the yaml file.

### Solution

First, define a yaml file like the following:

    personal:
      - name: wife
        sender:
          - pretty-lady-who-feeds-me@scary.com
    family:
      - name: mom
        sender:
          - mom@hotmail.com
          - mom@gmail.com
      - name: dad
        sender:
          - dad@work.org
    nested:
      - name: example
        reply-to: some_list@place.com
      - deeper-nesting:
        - name: deeper-nested-example
          reply-to: somewhere_else@overtherainbow.biz.co.uk
    #...

This yaml file will represent the directory structure, as well as provide information
about how to determine whether the email is from that person or not.

Next, you could define a class `Contact`, which could be built with the following methods:

    class Contact
      # ... contains a definition for 'path' -- which is built from the yaml file.

      def destination
        destination name, path
      end

      def wants?(email)
        search_fields.any? { |f,v| email[f] =~ /#{v}/ }
      end

      def search_fields
        #these are the key/value pairs from the YAML file which are of the form:
        #  email-field: content_string
        #eg:
        #  sender: 'me@place.net'
        #  reply-to: 'mailing-list@majordomo.com'
        #etc
      end
      # ...
    end

All of this code could be bound up in a router, eg:

    router :contacts do
      contacts = Contact.load_from_file('contacts.yml')
      contacts.select { |c| c.wants?(email) }.first.destination
    end

Much of this is left to pseudocode, but you can see how being able to use pure-ruby
allows for complex routes to be expressed simply.
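For a slightly more concrete picture, here is one runnable take on the `Contact` idea using only Ruby's standard `yaml` library. It is a sketch under simplifying assumptions, not the class sortah ships: it handles a single level of folders, models the email as a plain Hash, and matches by substring. The `Contact.load` method and the path layout are made up for this example.

```ruby
require 'yaml'

# Hypothetical stand-in for the Contact class sketched above. Assumptions:
# one folder level only, and the email is modelled as a plain Hash.
class Contact
  attr_reader :name, :path, :search_fields

  def initialize(name, path, search_fields)
    @name = name
    @path = path
    @search_fields = search_fields
  end

  # Build contacts from YAML text of the form folder => [entries].
  def self.load(yaml_text)
    YAML.load(yaml_text).flat_map do |folder, entries|
      entries.map do |entry|
        fields = entry.reject { |key, _| key == "name" }
        new(entry["name"], File.join(folder, entry["name"], ""), fields)
      end
    end
  end

  # True when any configured header contains one of its listed values.
  def wants?(email)
    search_fields.any? do |field, values|
      Array(values).any? { |v| email[field].to_s.include?(v) }
    end
  end
end

contacts_yaml = <<-EOS
family:
  - name: mom
    sender:
      - mom@hotmail.com
      - mom@gmail.com
  - name: dad
    sender: dad@work.org
EOS

contacts = Contact.load(contacts_yaml)
match = contacts.find { |c| c.wants?("sender" => "mom@gmail.com") }
puts match.path
```

A router could then call `send_to` with whatever destination corresponds to `match.path`, falling back to a default when `find` returns `nil`.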
data/Rakefile
ADDED
data/TUTORIAL.md
ADDED
@@ -0,0 +1,43 @@
#How to use Sortah

First, I recommend the following setup. In your `$HOME` directory, create a
directory called '.sortah/', and beneath that, create a file called 'rc', then
execute the command

    ln -s .sortah/rc .sortahrc

from your `$HOME` directory to link the `.sortahrc` file to the "real" rc file.
Next, I would initialize a git repo (or whatever VCS you prefer) in the
`.sortah` directory, and add your rc to it.

Next, create an rvmrc file in the `.sortah` directory with a gemset of 'sortahrc'
(or whatever you prefer), this is where you will marshal all your dependencies
for sortah -- at the moment, it's only going to be one. You can use bundler for
this if you like, but if you just need vanilla sortah, it may be worth just
using `gem` to install sortah and eliminate the bundler overhead, YMMV.

Now that we've done that, we can wire up our getmailrc to point to sortah, as
follows:

    [destination]
    type = MDA_external
    path = $HOME/.sortah/sortah.sh
    arguments = ("--log-errors", )

Where `path` should point to the `sortah.sh` script in your `.sortah` directory.
Next, we need to create that starter script, `sortah.sh`, which should look like:

    #!/bin/sh
    rvm 1.9.2@sortahrc exec sortah "$@"

then run

    chmod a+x ~/.sortah/sortah.sh

This wraps the sortah executable so that we can always call it in the context of
the sortahrc gemset -- if you install this directly to your system, then this
shouldn't be necessary.

Once you've done this, getmail should automatically use sortah to sort your
email. Now you just need to write your sortah definitions in the `~/.sortah/rc`
file!