sortah 0.5.0
- data/.gitignore +8 -0
- data/.rvmrc +2 -0
- data/.travis.yml +3 -0
- data/CONTRIBUTION.md +23 -0
- data/FUTURE_PLANS.md +73 -0
- data/Gemfile +10 -0
- data/KNOWN_BUGS.md +12 -0
- data/LICENSE +27 -0
- data/README.md +216 -0
- data/Rakefile +14 -0
- data/TUTORIAL.md +43 -0
- data/bin/sortah +33 -0
- data/lib/sortah.rb +10 -0
- data/lib/sortah/cleanroom.rb +47 -0
- data/lib/sortah/components.rb +3 -0
- data/lib/sortah/components/destination.rb +41 -0
- data/lib/sortah/components/lens.rb +49 -0
- data/lib/sortah/components/router.rb +13 -0
- data/lib/sortah/email.rb +31 -0
- data/lib/sortah/errors.rb +15 -0
- data/lib/sortah/handler.rb +46 -0
- data/lib/sortah/parser.rb +57 -0
- data/lib/sortah/patches.rb +7 -0
- data/lib/sortah/util/component.rb +28 -0
- data/lib/sortah/util/component_collection.rb +22 -0
- data/lib/sortah/version.rb +3 -0
- data/sortah.gemspec +31 -0
- data/spec/bin_spec.rb +54 -0
- data/spec/destination_spec.rb +42 -0
- data/spec/email_spec.rb +13 -0
- data/spec/fixtures/rc +8 -0
- data/spec/parser_spec.rb +270 -0
- data/spec/semantic_spec.rb +310 -0
- data/spec/sortah_handler_spec.rb +21 -0
- data/spec/spec_helper.rb +3 -0
- metadata +117 -0
data/.gitignore
ADDED
data/.rvmrc
ADDED
data/.travis.yml
ADDED
data/CONTRIBUTION.md
ADDED
@@ -0,0 +1,23 @@
|
|
1
|
+
#Version rules
|
2
|
+
|
3
|
+
Version 1.0 comes when I'm pretty sure there are no major outstanding bugs from
|
4
|
+
release. After that, SemVer applies.
|
5
|
+
|
6
|
+
Let me (jfredett) control the version; please don't submit a version change in
|
7
|
+
any pull request. This is merely out of concern for sanity.
|
8
|
+
|
9
|
+
#Contribution
|
10
|
+
|
11
|
+
Easy peasy, fork, pull request. Don't submit something with broken tests in it.
|
12
|
+
|
13
|
+
If you find a bug, filing an issue is cool, fixing it is cooler. I will love you
|
14
|
+
forever if you fix it.
|
15
|
+
|
16
|
+
##Contributing sortah definitions:
|
17
|
+
|
18
|
+
For now, there is no "sortah contrib" repo; if there are many people interested
|
19
|
+
in sharing some common set of sortah definitions, then one will be created.
|
20
|
+
|
21
|
+
For now, if you have something cool, make a page in the wiki.
|
22
|
+
|
23
|
+
|
data/FUTURE_PLANS.md
ADDED
@@ -0,0 +1,73 @@
|
|
1
|
+
#Ideas
|
2
|
+
|
3
|
+
- proxy destinations
|
4
|
+
|
5
|
+
Difficulty: middling
|
6
|
+
|
7
|
+
To allow for non-filesystem storage locations (say, redis, elasticsearch, w/e),
|
8
|
+
it would be nice to have "pseudo" destinations, like:
|
9
|
+
|
10
|
+
destination :redis do
|
11
|
+
RedisHandler.acquire_conn do |redis|
|
12
|
+
redis.put email.key, email.to_s
|
13
|
+
end
|
14
|
+
end
|
15
|
+
|
16
|
+
lens :key do
|
17
|
+
email.key = SHA1_of(email)
|
18
|
+
end
|
19
|
+
|
20
|
+
router :root, :lenses => [:key] do
|
21
|
+
send_to :redis
|
22
|
+
end
|
23
|
+
|
24
|
+
- multi-target `send_to`
|
25
|
+
|
26
|
+
Difficulty: easy
|
27
|
+
|
28
|
+
It would be nice to say:
|
29
|
+
|
30
|
+
router do
|
31
|
+
send_to [:foo, :bar, :baz]
|
32
|
+
end
|
33
|
+
|
34
|
+
and have a copy be sent to each destination, optionally, it could take a
|
35
|
+
parameter, "linked", so that the others would only be soft-links to the
|
36
|
+
canonical one (specified by link), eg:
|
37
|
+
|
38
|
+
router do
|
39
|
+
send_to [:foo, :bar, :baz], :link => :foo
|
40
|
+
end
|
41
|
+
|
42
|
+
In the above, :bar and :baz would be softlinks to :foo
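The linked variant of the multi-target `send_to` idea can be sketched outside of sortah with plain files. This is only an illustration under stated assumptions: `deliver_linked`, the directory map, and the file name are hypothetical and are not part of sortah's API.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper, not part of sortah: writes one canonical copy of the
# message and soft-links it into every other destination directory.
def deliver_linked(message, filename, dirs, link:)
  canonical_dir = dirs.fetch(link) # KeyError if :link names an unknown destination
  FileUtils.mkdir_p(canonical_dir)
  canonical = File.join(canonical_dir, filename)
  File.write(canonical, message)

  dirs.each do |name, dir|
    next if name == link # the canonical destination already holds the real file
    FileUtils.mkdir_p(dir)
    File.symlink(canonical, File.join(dir, filename))
  end
  canonical
end

# Assumed layout: three destinations under a throwaway mail directory.
MAILDIR = Dir.mktmpdir("sortah-sketch")
DIRS = {
  :foo => File.join(MAILDIR, "foo"),
  :bar => File.join(MAILDIR, "bar"),
  :baz => File.join(MAILDIR, "baz")
}
CANONICAL = deliver_linked("Subject: hi\n\nhello", "1.eml", DIRS, :link => :foo)
```

Only `:foo` holds a real file here; `:bar` and `:baz` hold soft links to it, so reading the message through any of the three paths gives the same content.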
|
43
|
+
|
44
|
+
- contrib
|
45
|
+
|
46
|
+
Difficulty: easy
|
47
|
+
Prereq: Having a community.
|
48
|
+
|
49
|
+
Set up an easy-to-use 'contrib' repo for community sortah libraries.
|
50
|
+
|
51
|
+
- getmail integration
|
52
|
+
|
53
|
+
Difficulty: easy | hard
|
54
|
+
|
55
|
+
This comes in two flavors, the easy flavor is to just provide a wrapper
|
56
|
+
for defining a getmailrc. Perhaps making it easy to define a gmail rc and
|
57
|
+
also allowing for safe password injections (eg, not storing the password
|
58
|
+
plaintext in the rc file).
|
59
|
+
|
60
|
+
The harder version would be to just implement getmail as part of sortah.
|
61
|
+
This would allow for fine-grained control over how sortah gets fired, allowing
|
62
|
+
for better concurrency.
|
63
|
+
|
64
|
+
|
65
|
+
|
66
|
+
|
67
|
+
|
68
|
+
|
69
|
+
|
70
|
+
|
71
|
+
|
72
|
+
|
73
|
+
|
data/Gemfile
ADDED
data/KNOWN_BUGS.md
ADDED
@@ -0,0 +1,12 @@
|
|
1
|
+
doing
|
2
|
+
|
3
|
+
sortah.sort(email)
|
4
|
+
sortah.metadata#blahblahblah
|
5
|
+
|
6
|
+
does not behave as expected (it doesn't preserve metadata to the next line)
|
7
|
+
this has something to do with how I pull the data in the Kernel patch. it works
|
8
|
+
if you do
|
9
|
+
|
10
|
+
sortah.sort(email).metadata#blahblah
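One way this symptom can arise, sketched with toy classes: if the accessor builds a fresh object on every call, state set by one call is invisible to the next, while a chained call keeps working because it stays on the same object. This is an assumption about the cause, not a diagnosis of the actual Kernel patch; `ToyHandler`, `fresh_handler`, and `shared_handler` are hypothetical names.

```ruby
# Toy stand-in for sortah's handler; the real class is not shown here.
class ToyHandler
  attr_reader :metadata

  def initialize
    @metadata = {}
  end

  def sort(email)
    @metadata[:last_sorted] = email
    self # returning self is what makes the chained form work
  end
end

# Assumed failure mode: the accessor builds a new object on every call.
def fresh_handler
  ToyHandler.new
end

# One possible fix: memoize, so every call sees the same object.
def shared_handler
  @shared_handler ||= ToyHandler.new
end

fresh_handler.sort("mail")
LOST    = fresh_handler.metadata              # metadata of a different, empty object
CHAINED = fresh_handler.sort("mail").metadata # same object, so the metadata survives
shared_handler.sort("mail")
KEPT    = shared_handler.metadata             # memoized object keeps its metadata
```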
|
11
|
+
|
12
|
+
|
data/LICENSE
ADDED
@@ -0,0 +1,27 @@
|
|
1
|
+
Copyright (c) 2011, Joseph Fredette
|
2
|
+
All rights reserved.
|
3
|
+
|
4
|
+
Redistribution and use in source and binary forms, with or without modification,
|
5
|
+
are permitted provided that the following conditions are met:
|
6
|
+
|
7
|
+
Redistributions of source code must retain the above copyright notice, this list
|
8
|
+
of conditions and the following disclaimer.
|
9
|
+
|
10
|
+
Redistributions in binary form must reproduce the above copyright notice, this
|
11
|
+
list of conditions and the following disclaimer in the documentation and/or
|
12
|
+
other materials provided with the distribution.
|
13
|
+
|
14
|
+
Neither the name of "sortah" nor the names of its contributors may be used to
|
15
|
+
endorse or promote products derived from this software without specific prior
|
16
|
+
written permission.
|
17
|
+
|
18
|
+
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
|
19
|
+
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
20
|
+
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
|
21
|
+
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
|
22
|
+
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
|
23
|
+
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
|
24
|
+
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
|
25
|
+
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
26
|
+
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
27
|
+
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
data/README.md
ADDED
@@ -0,0 +1,216 @@
|
|
1
|
+
#sortah
|
2
|
+
|
3
|
+
##For sortin' your friggin' mail.
|
4
|
+
|
5
|
+
--------------------------------------------------------------------------------
|
6
|
+
|
7
|
+
[](http://travis-ci.org/jfredett/sortah)
|
8
|
+
|
9
|
+
--------------------------------------------------------------------------------
|
10
|
+
|
11
|
+
Sortah sorts mail. It provides a Ruby [EDSL](# Embedded DSL) for manipulating
|
12
|
+
email objects. The DSL allows the definition of three principal components:
|
13
|
+
|
14
|
+
- Destinations
|
15
|
+
|
16
|
+
A destination takes in an Email object, and returns a system path. This is where
|
17
|
+
the email object passed into it will be saved. Ex:
|
18
|
+
|
19
|
+
destination :spam, "spam/"
|
20
|
+
destination :ham, "/"
|
21
|
+
|
22
|
+
These are -- in essence -- simple declarations of the structure of your sorting
|
23
|
+
system. They may take one of several forms. The examples above are relative to
|
24
|
+
the mail directory, and will transparently manage organization south of that.
|
25
|
+
They can also be absolute paths, eg:
|
26
|
+
|
27
|
+
destination :devnull, :abs => "/dev/null"
|
28
|
+
|
29
|
+
which is regarded as an absolutely qualified path. It may also alias another
|
30
|
+
path:
|
31
|
+
|
32
|
+
destination :tldr, :devnull
|
33
|
+
|
34
|
+
- Lenses
|
35
|
+
|
36
|
+
Lenses are functions which produce a value given an input email. This value is
|
37
|
+
interpreted as metadata to be used by the routers. Ex:
|
38
|
+
|
39
|
+
lens :spam_value do
|
40
|
+
x = 0
|
41
|
+
email.text.each_line do |line|
|
42
|
+
x += (line =~ /extension/) ? 1 : 0
|
43
|
+
end
|
44
|
+
x
|
45
|
+
end
|
46
|
+
|
47
|
+
lens :word_count do
|
48
|
+
email.text.split.size
|
49
|
+
end
|
50
|
+
|
51
|
+
lenses can also depend on other lenses.
|
52
|
+
|
53
|
+
lens :spam_ratio, :lenses => [:spam_value, :word_count] do
|
54
|
+
email.spam_value.to_f / email.word_count
|
55
|
+
end
|
56
|
+
|
57
|
+
You may specify the `pass_through` option to cause the lens to not set any
|
58
|
+
metadata. This is useful in two cases, updating old metadata, and interaction
|
59
|
+
(typically creational) with other services. Eg:
|
60
|
+
|
61
|
+
lens :example_update, :pass_through => true do
|
62
|
+
email.spam_value = 1000000 if email.sender == "annoying_guy0022493@hotmail.com"
|
63
|
+
end
|
64
|
+
|
65
|
+
lens :example_interaction, :pass_through => true, :lenses => [:spam_value] do
|
66
|
+
next unless email.spam_value >= 1000000
|
67
|
+
HTTParty.post "http://spamblacklist.net/spammer/new", :body => email.sender
|
68
|
+
end
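How dependent lenses and `pass_through` fit together can be sketched with a toy lens table in plain Ruby. This is not sortah's implementation: `toy_lens`, `run_lens`, and the hash-based email are hypothetical stand-ins, and the sketch assumes a lens's dependencies run before the lens itself.

```ruby
# Toy lens table: name => { :lenses => [...], :pass_through => bool, :block => proc }.
LENSES = {}

def toy_lens(name, opts = {}, &block)
  LENSES[name] = { :lenses => opts.fetch(:lenses, []),
                   :pass_through => opts.fetch(:pass_through, false),
                   :block => block }
end

# Runs a lens after its dependencies and stores the result under the lens
# name, unless the lens is pass_through. `seen` stops a lens from running
# twice within one call.
def run_lens(name, email, seen = {})
  return if seen[name]
  seen[name] = true
  lens = LENSES.fetch(name)
  lens[:lenses].each { |dep| run_lens(dep, email, seen) }
  value = lens[:block].call(email)
  email[name] = value unless lens[:pass_through]
end

toy_lens(:spam_value) { |email| email[:text].each_line.count { |line| line =~ /extension/ } }
toy_lens(:word_count) { |email| email[:text].split.size }
toy_lens(:spam_ratio, :lenses => [:spam_value, :word_count]) do |email|
  email[:spam_value].to_f / email[:word_count] # to_f avoids integer division
end
toy_lens(:flag_sender, :pass_through => true, :lenses => [:spam_value]) do |email|
  email[:spam_value] = 1_000_000 if email[:sender] == "annoying_guy0022493@hotmail.com"
end

EMAIL = { :sender => "annoying_guy0022493@hotmail.com",
          :text => "extension here\nextension now" }
run_lens(:spam_ratio, EMAIL)
run_lens(:flag_sender, EMAIL)
```

The `pass_through` lens updates `:spam_value` in place but leaves no `:flag_sender` entry behind, which is the distinction the option draws.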
|
69
|
+
|
70
|
+
- Routers
|
71
|
+
|
72
|
+
This is the core of the language, a router is an object which produces either a
|
73
|
+
router object, or a destination. If it produces a destination, then the email is
|
74
|
+
delivered to that destination. If it produces another router, then the email is
|
75
|
+
passed along to the router produced. A router also 'depends' on lenses. These
|
76
|
+
lenses get applied when the router is called. There is one router which is
|
77
|
+
special, the "root" router, this is the first router which gets called. To
|
78
|
+
declare it, simply declare a router without a name. Ex:
|
79
|
+
|
80
|
+
router :spam_filter, :lenses => [:spam_value] do
|
81
|
+
send_to :ham if email.spam_value < 10
|
82
|
+
send_to :spam
|
83
|
+
end
|
84
|
+
|
85
|
+
router :root, :lenses => [:word_count] do
|
86
|
+
send_to :tldr if email.word_count > 100
|
87
|
+
send_to :spam_filter
|
88
|
+
end
|
89
|
+
|
90
|
+
`send_to` will first search for a destination with the given name, if it cannot
|
91
|
+
find one, it will search for the corresponding router. It also acts as
|
92
|
+
`return` -- halting execution of the block when it is called. This is
|
93
|
+
implemented via an exception, which means it _may_ cause performance issues on
|
94
|
+
things like the JVM, YMMV.
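The "`send_to` acts as `return`" behaviour can be sketched with an exception that unwinds out of the router block. This is a toy model, not sortah's source: `Halt`, `ToyRouterContext`, `route`, and the hash-based email are assumed names, and the destination and router tables simply mirror the examples above.

```ruby
# Raised by the toy send_to to unwind out of a router block.
class Halt < StandardError
  attr_reader :target

  def initialize(target)
    @target = target
    super("routed to #{target}")
  end
end

# Router blocks are evaluated against this object, so `email` and `send_to`
# resolve to the methods below.
class ToyRouterContext
  attr_reader :email

  def initialize(email)
    @email = email
  end

  def send_to(target)
    raise Halt.new(target) # nothing after send_to in the block runs
  end
end

DESTINATIONS = { :ham => "/", :spam => "spam/", :tldr => "/dev/null" }
ROUTERS = {
  :root => proc do
    send_to :tldr if email[:word_count] > 100
    send_to :spam_filter
  end,
  :spam_filter => proc do
    send_to :ham if email[:spam_value] < 10
    send_to :spam
  end
}

# Follows routers until a name resolves to a destination. Destinations are
# looked up first, then routers, as described above.
def route(email, name = :root)
  loop do
    return DESTINATIONS[name] if DESTINATIONS.key?(name)
    begin
      ToyRouterContext.new(email).instance_eval(&ROUTERS.fetch(name))
      raise "router #{name} never called send_to"
    rescue Halt => halt
      name = halt.target
    end
  end
end
```

With this sketch, `route({ :word_count => 5, :spam_value => 3 })` walks `:root`, then `:spam_filter`, and ends at the `:ham` path. Ruby's `throw`/`catch` is another way to get the same non-local exit without building an exception object.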
|
95
|
+
|
96
|
+
When defining a root router with lenses, you must specify ":root" as the name.
|
97
|
+
|
98
|
+
## Common problems, and how to solve them:
|
99
|
+
|
100
|
+
### Problem: Adding a mail to an external service, and then saving it.
|
101
|
+
|
102
|
+
As a user of sortah, you want to set up filters to save all email from the
|
103
|
+
address "searchable@somewhere.net" to the folder "foobar/", as well as register
|
104
|
+
it with the external service "RubberBandSearch".
|
105
|
+
|
106
|
+
### Solution
|
107
|
+
|
108
|
+
destination :foobar, "foobar/"
|
109
|
+
|
110
|
+
lens :search_index , :pass_through => true do
|
111
|
+
#code to register the email in RubberBandSearch
|
112
|
+
email.indexed = true
|
113
|
+
end
|
114
|
+
|
115
|
+
router :index_in_rubberband, :lenses => [:search_index] do
|
116
|
+
send_to :foobar
|
117
|
+
end
|
118
|
+
|
119
|
+
router :root, :lenses => [:spam?] do
|
120
|
+
send_to :devnull if email.spam?
|
121
|
+
send_to :index_in_rubberband
|
122
|
+
end
|
123
|
+
|
124
|
+
Here we've used a `pass_through` lens to do the actual indexing, and the router
|
125
|
+
is left as more of a proxy to call the lens.
|
126
|
+
|
127
|
+
### Problem
|
128
|
+
|
129
|
+
As a user of sortah, you want to maintain a whitelist of people who should have
|
130
|
+
their own folders, and you want those people to be subsorted in some arbitrarily
|
131
|
+
deep parent folders, eg:
|
132
|
+
|
133
|
+
family/
|
134
|
+
mom/
|
135
|
+
dad/
|
136
|
+
uncle_timmy/
|
137
|
+
coworkers/
|
138
|
+
pointy_hair/
|
139
|
+
dilbert/
|
140
|
+
old_coworkers/
|
141
|
+
jim/
|
142
|
+
personal/
|
143
|
+
wife/
|
144
|
+
friends/
|
145
|
+
bob/
|
146
|
+
mike/
|
147
|
+
jack/
|
148
|
+
|
149
|
+
etc. Further, you'd like to only maintain the above file (or something like it), and
|
150
|
+
not have to write new sortah code every time you move jobs or make new friends.[1]
|
151
|
+
|
152
|
+
[1] Ideally, this code would maintain a directory structure for you. But as of right
|
153
|
+
now, sortah has no aspirations to do such a thing. Each edit which _moves_ files
|
154
|
+
in the yaml definition file will simply create new folders, it is up to the author
|
155
|
+
of that yaml file to keep the directory coherent with the yaml file.
|
156
|
+
|
157
|
+
### Solution
|
158
|
+
|
159
|
+
First, define a yaml file like the following:
|
160
|
+
|
161
|
+
personal:
|
162
|
+
- name: wife
|
163
|
+
sender:
|
164
|
+
- pretty-lady-who-feeds-me@scary.com
|
165
|
+
family:
|
166
|
+
- name: mom
|
167
|
+
sender:
|
168
|
+
- mom@hotmail.com
|
169
|
+
- mom@gmail.com
|
170
|
+
- name: dad
|
171
|
+
sender:
|
172
|
+
- dad@work.org
|
173
|
+
nested:
|
174
|
+
- name: example
|
175
|
+
reply-to: some_list@place.com
|
176
|
+
- deeper-nesting:
|
177
|
+
- name: deeper-nested-example
|
178
|
+
- reply-to: somewhere_else@overtherainbow.biz.co.uk
|
179
|
+
#...
|
180
|
+
|
181
|
+
This yaml file will represent the directory structure, as well as provide information
|
182
|
+
about how to determine whether the email is from that person or not.
|
183
|
+
|
184
|
+
Next, you could define a class `Contact`, which could be built with the following methods:
|
185
|
+
|
186
|
+
class Contact
|
187
|
+
# ... contains a definition for 'path' -- which is built from the yaml file.
|
188
|
+
|
189
|
+
def destination
|
190
|
+
destination name, path
|
191
|
+
end
|
192
|
+
|
193
|
+
def wants?(email)
|
194
|
+
search_fields.any? { |f,v| email[f] =~ /#{v}/ }
|
195
|
+
end
|
196
|
+
|
197
|
+
def search_fields
|
198
|
+
#these are the key/value pairs from the YAML file which are of the form:
|
199
|
+
# email-field: content_string
|
200
|
+
#eg:
|
201
|
+
# sender: 'me@place.net'
|
202
|
+
# reply-to: 'mailing-list@majordomo.com'
|
203
|
+
#etc
|
204
|
+
end
|
205
|
+
# ...
|
206
|
+
end
|
207
|
+
|
208
|
+
All of this code could be bound up in a router, eg:
|
209
|
+
|
210
|
+
router :contacts do
|
211
|
+
contacts = Contact.load_from_file('contacts.yml')
|
212
|
+
contacts.select { |c| c.wants?(email) }.first.destination
|
213
|
+
end
|
214
|
+
|
215
|
+
Much of this is left to pseudocode, but you can see how being able to use pure-ruby
|
216
|
+
allows for complex routes to be expressed simply.
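The `Contact` idea above can be filled in a little further with Ruby's standard `yaml` library. What follows is a sketch under simplifying assumptions, not the author's implementation: it uses a flat folder-to-contacts layout (the deeper nesting shown earlier is not handled), treats every key other than `name` as a header field, matches by substring rather than regex, and represents the email as a plain hash. `Contact.load` and `CONTACTS_YAML` are hypothetical names.

```ruby
require "yaml"

# Simplified, assumed layout: folder => list of contacts, each with a name
# plus header-field => address(es) pairs.
CONTACTS_YAML = <<~YAML
  family:
    - name: mom
      sender:
        - mom@hotmail.com
        - mom@gmail.com
  lists:
    - name: example
      reply-to: some_list@place.com
YAML

class Contact
  attr_reader :name, :path, :search_fields

  def initialize(folder, entry)
    @name = entry.fetch("name")
    @path = File.join(folder, @name) + "/"
    # Every key except "name" is treated as a header field to match against.
    @search_fields = entry.reject { |key, _| key == "name" }
  end

  # Builds one Contact per entry, remembering which folder it came from.
  def self.load(yaml_text)
    YAML.safe_load(yaml_text).flat_map do |folder, entries|
      entries.map { |entry| new(folder, entry) }
    end
  end

  # True if any configured field of the email contains one of its addresses.
  def wants?(email)
    search_fields.any? do |field, patterns|
      Array(patterns).any? { |pattern| email[field].to_s.include?(pattern) }
    end
  end
end

CONTACTS = Contact.load(CONTACTS_YAML)
MATCH = CONTACTS.find { |c| c.wants?("sender" => "mom@gmail.com") }
```

Inside a router, the matching contact's `path` (here `family/mom/`) is what would be handed on as the destination. How that hand-off looks in sortah itself is left open here.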
|
data/Rakefile
ADDED
data/TUTORIAL.md
ADDED
@@ -0,0 +1,43 @@
|
|
1
|
+
#How to use Sortah
|
2
|
+
|
3
|
+
First, I recommend the following setup. In your `$HOME` directory, create a directory
|
4
|
+
called '.sortah/', and beneath that, create a file called 'rc', then execute the command
|
5
|
+
|
6
|
+
ln -s .sortah/rc .sortahrc
|
7
|
+
|
8
|
+
from your `$HOME` directory to link the `.sortahrc` file to the "real" rc file.
|
9
|
+
Next, I would initialize a git repo (or whatever VCS you prefer) in the
|
10
|
+
`.sortah` directory, and add your rc to it.
|
11
|
+
|
12
|
+
Next, create an rvmrc file in the `.sortah` directory with a gemset of 'sortahrc'
|
13
|
+
(or whatever you prefer); this is where you will marshal all your dependencies
|
14
|
+
for sortah -- at the moment, it's only going to be one. You can use bundler for
|
15
|
+
this if you like, but if you just need vanilla sortah, it may be worth just
|
16
|
+
using `gem` to install sortah and eliminate the bundler overhead, YMMV.
|
17
|
+
|
18
|
+
Now that we've done that, we can wire up our getmailrc to point to sortah, as
|
19
|
+
follows:
|
20
|
+
|
21
|
+
[destination]
|
22
|
+
type = MDA_external
|
23
|
+
path = $HOME/.sortah/sortah.sh
|
24
|
+
arguments = ("--log-errors", )
|
25
|
+
|
26
|
+
Here `path` should point to `sortah.sh` inside your `.sortah` directory. Next, we need to create
|
27
|
+
the starter script, `sortah.sh`, this should look like:
|
28
|
+
|
29
|
+
#!/bin/sh
|
30
|
+
rvm 1.9.2@sortahrc exec sortah "$@"
|
31
|
+
|
32
|
+
then run
|
33
|
+
|
34
|
+
chmod a+x ~/.sortah/sortah.sh
|
35
|
+
|
36
|
+
This wraps the sortah executable so that we can always call it in the context of
|
37
|
+
the sortahrc gemset -- if you install this directly to your system, then this
|
38
|
+
shouldn't be necessary.
|
39
|
+
|
40
|
+
Once you've done this, getmail should automatically use sortah to sort your
|
41
|
+
email, now you just need to write your sortah definitions in the `~/.sortah/rc`
|
42
|
+
file!
|
43
|
+
|