beso 0.1.0 → 0.2.0
- data/README.md +106 -2
- data/lib/beso/version.rb +1 -1
- data/lib/tasks/beso.rake +10 -1
- metadata +2 -2
data/README.md
CHANGED
@@ -1,6 +1,6 @@
|
|
1
1
|
# Beso
|
2
2
|
|
3
|
-
|
3
|
+
Sync your historical events to KISSmetrics via CSV.
|
4
4
|
|
5
5
|
## Installation
|
6
6
|
|
@@ -16,9 +16,113 @@ Or install it yourself as:
|
|
16
16
|
|
17
17
|
$ gem install beso
|
18
18
|
|
19
|
+
Next, create an initializer for **beso**. There, you can set up your S3 bucket information and define your
|
20
|
+
serialization jobs:
|
21
|
+
|
22
|
+
``` rb
|
23
|
+
# config/initializers/beso.rb
|
24
|
+
Beso.configure do |config|
|
25
|
+
|
26
|
+
# First, set up your S3 credentials:
|
27
|
+
|
28
|
+
config.access_key = '[your AWS access key]'
|
29
|
+
config.secret_key = '[your AWS secret key]'
|
30
|
+
config.bucket_name = 'beso' # recommended, but you can really call this anything
|
31
|
+
|
32
|
+
# Then, define some jobs:
|
33
|
+
|
34
|
+
config.job :message_delivered, :table => :messages do
|
35
|
+
identity { |message| message.user.id }
|
36
|
+
timestamp :created_at
|
37
|
+
prop( :message_id ) { |message| message.id }
|
38
|
+
end
|
39
|
+
|
40
|
+
config.job :signed_up, :table => :users do
|
41
|
+
identity { |user| user.id }
|
42
|
+
timestamp :created_at
|
43
|
+
prop( :age ){ |user| user.age }
|
44
|
+
end
|
45
|
+
end
|
46
|
+
```
|
47
|
+
|
19
48
|
## Usage
|
20
49
|
|
21
|
-
|
50
|
+
### Defining Jobs
|
51
|
+
|
52
|
+
KISSmetrics events have three properties that *must* be defined:
|
53
|
+
|
54
|
+
- Identity
|
55
|
+
- Timestamp
|
56
|
+
- Event
|
57
|
+
|
58
|
+
The **Identity** field is some sort of identifier for your user. Even if your job
|
59
|
+
is working on another table, you should probably have a way to tie the event back
|
60
|
+
to the user who caused it. Here, you can provide one of three things:
|
61
|
+
|
62
|
+
- A proc that should receive the record and return the identity value
|
63
|
+
- A symbol that will get passed to `record.send`
|
64
|
+
- A literal (you'll probably want one of the other two options)
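The three accepted forms can be illustrated with a small, self-contained sketch. `resolve_value` is a hypothetical helper written for this example and is not taken from beso's source; it only shows how a proc, a symbol, and a literal would each turn into a value for a given record.

```ruby
# Hypothetical helper (not beso's implementation): turns a proc, a symbol,
# or a literal into a concrete value for one record.
def resolve_value(record, value)
  case value
  when Proc   then value.call(record)  # the proc receives the record
  when Symbol then record.send(value)  # the symbol is sent to the record
  else value                           # any other object is used as-is
  end
end

# Stand-in records built from Structs instead of database rows.
user    = Struct.new(:id).new(7)
message = Struct.new(:id, :user).new(42, user)

puts resolve_value(message, ->(m) { m.user.id })  # proc form
puts resolve_value(message, :id)                  # symbol form
puts resolve_value(message, 'anonymous')          # literal form
```

The literal form returns the same value for every record, which is why it is rarely the right choice for an identity.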
|
65
|
+
|
66
|
+
The **Timestamp** field is slightly different in that it should always be part of
|
67
|
+
the table you are querying, not the user. This symbol will get sent to each record,
|
68
|
+
but will also be used in determining the query for the job.
|
69
|
+
|
70
|
+
The **Event** name is inferred from the name of your job. It will be provided and
|
71
|
+
formatted for you.
|
72
|
+
|
73
|
+
On top of this, you can specify up to **ten** custom properties. Like `identity`,
|
74
|
+
you can pass either a proc, a symbol, or a literal:
|
75
|
+
|
76
|
+
``` rb
|
77
|
+
config.job :signed_up, :table => :users do
|
78
|
+
identity :id
|
79
|
+
timestamp :created_at
|
80
|
+
prop( :age ){ |user| user.age }
|
81
|
+
prop( :new_user, true )
|
82
|
+
end
|
83
|
+
```
|
84
|
+
|
85
|
+
### Using the rake task
|
86
|
+
|
87
|
+
By requiring `beso`, you get the `beso:run` rake task. This task will do the following:
|
88
|
+
|
89
|
+
- Connect to your S3 bucket
|
90
|
+
- Pull down 'beso.yml' if it exists
|
91
|
+
|
92
|
+
> `beso.yml` contains the timestamp of the last record queried for each job.
|
93
|
+
> If it doesn't exist, it will be created after the first run.
|
94
|
+
|
95
|
+
- Iterate over the jobs defined in the initializer you set up
|
96
|
+
- Create a CSV representation of all records newer than the timestamp found in `beso.yml`
|
97
|
+
- Upload each CSV to your S3 bucket with the event name and timestamp
|
98
|
+
- Update `beso.yml` with the latest timestamp for each job
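The bookkeeping in the steps above can be sketched without S3 or a database. Everything here is an assumption made for illustration: `export_since` is a hypothetical function, the `state` hash stands in for the parsed `beso.yml`, the records are plain hashes, and the CSV column names are placeholders rather than the format beso actually writes.

```ruby
require 'csv'
require 'time'
require 'yaml'

# Hypothetical sketch: select records newer than the stored timestamp,
# serialize them to CSV, and return the updated state.
def export_since(records, state, event)
  since = Time.parse(state.fetch(event))
  fresh = records.select { |r| r[:created_at] > since }
                 .sort_by { |r| r[:created_at] }

  csv = CSV.generate do |out|
    out << %w[Identity Timestamp Event] # placeholder column names
    fresh.each { |r| out << [r[:user_id], r[:created_at].to_i, event] }
  end

  # Remember the newest timestamp so the next run skips these rows.
  newest = fresh.empty? ? state[event] : fresh.last[:created_at].utc.iso8601
  [csv, state.merge(event => newest)]
end

state   = YAML.load("signed_up: '2012-04-01T00:00:00Z'\n")
records = [
  { user_id: 1, created_at: Time.utc(2012, 3, 31) }, # older, skipped
  { user_id: 2, created_at: Time.utc(2012, 4, 2) }   # newer, exported
]

csv, state = export_since(records, state, 'signed_up')
puts csv
puts state.to_yaml
```

Storing only the newest exported timestamp per job keeps the state file small and makes each run incremental.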
|
99
|
+
|
100
|
+
The rake task is designed to be used via cron. For the moment, KISSmetrics will only process
|
101
|
+
one CSV file per hour, so this task should run at an interval, in hours,
|
102
|
+
equal to the number of jobs you have defined. For example, if you have defined 4 jobs, this
|
103
|
+
task should run once every 4 hours.
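As a rough illustration of that rule, the interval can be derived from the number of jobs. `cron_line` is a hypothetical helper and the command string is a placeholder; neither comes from beso. Note that a cron step such as `*/5` restarts at midnight, so the last gap of the day is shorter when the job count does not divide 24.

```ruby
# Hypothetical helper: builds a crontab entry that runs once every
# `job_count` hours, following the one-CSV-per-hour rule described above.
def cron_line(job_count, command)
  hours = [job_count, 1].max # never schedule more often than hourly
  "0 */#{hours} * * * #{command}"
end

puts cron_line(4, 'rake beso:run')
```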
|
104
|
+
|
105
|
+
The rake task also accepts two options that you can set via environment variables.
|
106
|
+
|
107
|
+
`BESO_PREFIX` will change the prefix of the CSV filenames that get uploaded to S3. The default
|
108
|
+
is 'beso', so it is recommended you use that when telling KISSmetrics what your filename
|
109
|
+
pattern is. You can then adjust the prefix if you would like to upload CSVs that you don't
|
110
|
+
want KISSmetrics to recognize.
|
111
|
+
|
112
|
+
`BESO_ORIGIN` will change the behavior of the task when there is no previous timestamp
|
113
|
+
defined for a job in `beso.yml`.
|
114
|
+
|
115
|
+
> By default, the task will use the last timestamp in your table (which effectively
|
116
|
+
> means the first run of this task will do nothing). This is because KISSmetrics
|
117
|
+
> charges you for every event you log through their system, so you probably don't
|
118
|
+
> want to upload 8 months' worth of events straight away.
|
119
|
+
|
120
|
+
This option will accept two values to alter the behavior:
|
121
|
+
|
122
|
+
- `now` will set the first-run timestamp to the current time, so the first run will not create any events.
|
123
|
+
- `first` will set the first run timestamp to the first timestamp in each table. Use this with
|
124
|
+
`BESO_PREFIX` if you want to dump an entire table's worth of events to S3 without having
|
125
|
+
KISSmetrics process them.
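The choice of starting point can be tried in isolation with a minimal sketch. `origin_for` is a hypothetical function that mirrors the `case` statement in the 0.2.0 rake task; the job object is a Struct stand-in that exposes only `first_timestamp` and `last_timestamp`.

```ruby
# Hypothetical sketch of the BESO_ORIGIN decision: picks the timestamp a
# job starts from when beso.yml has no entry for it yet.
def origin_for(job, origin, now: Time.now)
  case origin
  when 'first' then job.first_timestamp # export the whole table
  when 'now'   then now                 # export nothing on the first run
  else job.last_timestamp               # default: start from the newest row
  end
end

# Stand-in for a job; a real job would read these values from its table.
job = Struct.new(:first_timestamp, :last_timestamp)
            .new(Time.utc(2011, 8, 1), Time.utc(2012, 4, 13))

puts origin_for(job, ENV['BESO_ORIGIN'])
```

Running this with `BESO_ORIGIN=first` set in the environment prints the earliest timestamp, and leaving the variable unset prints the latest one.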
|
22
126
|
|
23
127
|
## Contributing
|
24
128
|
|
data/lib/beso/version.rb
CHANGED
data/lib/tasks/beso.rake
CHANGED
@@ -19,7 +19,16 @@ namespace :beso do
|
|
19
19
|
|
20
20
|
Beso.jobs.each do |job|
|
21
21
|
|
22
|
-
config[ job.event ] ||=
|
22
|
+
config[ job.event ] ||= begin
|
23
|
+
case ENV[ 'BESO_ORIGIN' ]
|
24
|
+
when 'first'
|
25
|
+
job.first_timestamp
|
26
|
+
when 'now'
|
27
|
+
Time.now
|
28
|
+
else
|
29
|
+
job.last_timestamp
|
30
|
+
end
|
31
|
+
end
|
23
32
|
|
24
33
|
puts "==> Processing job: #{job.event.inspect} since #{config[ job.event ]}"
|
25
34
|
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: beso
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.2.0
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -9,7 +9,7 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date: 2012-04-
|
12
|
+
date: 2012-04-13 00:00:00.000000000Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: rails
|