beso 0.1.0 → 0.2.0

Files changed (4)
  1. data/README.md +106 -2
  2. data/lib/beso/version.rb +1 -1
  3. data/lib/tasks/beso.rake +10 -1
  4. metadata +2 -2
data/README.md CHANGED
@@ -1,6 +1,6 @@
 # Beso
 
-TODO: Write a gem description
+Sync your historical events to KISSmetrics via CSV.
 
 ## Installation
 
@@ -16,9 +16,113 @@ Or install it yourself as:
 
     $ gem install beso
 
+Next, create an initializer for **beso**. There, you can set up your S3 bucket information and define your
+serialization jobs:
+
+``` rb
+# config/initializers/beso.rb
+Beso.configure do |config|
+
+  # First, set up your S3 credentials:
+
+  config.access_key = '[your AWS access key]'
+  config.secret_key = '[your AWS secret key]'
+  config.bucket_name = 'beso' # recommended, but you can really call this anything
+
+  # Then, define some jobs:
+
+  config.job :message_delivered, :table => :messages do
+    identity { |message| message.user.id }
+    timestamp :created_at
+    prop( :message_id ) { |message| message.id }
+  end
+
+  config.job :signed_up, :table => :users do
+    identity { |user| user.id }
+    timestamp :created_at
+    prop( :age ){ |user| user.age }
+  end
+end
+```
+
 ## Usage
 
-TODO: Write usage instructions here
+### Defining Jobs
+
+KISSmetrics events have three properties that *must* be defined:
+
+- Identity
+- Timestamp
+- Event
+
+The **Identity** field is some sort of identifier for your user. Even if your job
+is working on another table, you should probably have a way to tie the event back
+to the user who caused it. Here, you can provide one of three things:
+
+- A proc that should receive the record and return the identity value
+- A symbol that will get passed to `record.send`
+- A literal (You'll probably want to do one of the other two options)
+
+The **Timestamp** field is slightly different in that it should always be part of
+the table you are querying, not the user. This symbol will get sent to each record,
+but will also be used in determining the query for the job.
+
+The **Event** name is inferred by the name of your job. It will be provided and
+formatted for you.
+
+On top of this, you can specify up to **ten** custom properties. Like `identity`,
+you can pass either a proc, a symbol, or a literal:
+
+``` rb
+config.job :signed_up, :table => :users do
+  identity :id
+  timestamp :created_at
+  prop( :age ){ |user| user.age }
+  prop( :new_user, true )
+end
+```
+
+### Using the rake task
+
+By requiring `beso`, you get the `beso:run` rake task. This task will do the following:
+
+- Connect to your S3 bucket
+- Pull down 'beso.yml' if it exists
+
+> `beso.yml` contains the timestamp of the last record queried for each job.
+> If it doesn't exist, it will be created after the first run.
+
+- Iterate over the jobs defined in the initializer you set up
+- Create a CSV representation of all records newer than the timestamp found in `beso.yml`
+- Upload each CSV to your S3 bucket with the event name and timestamp
+- Update `beso.yml` with the latest timestamp for each job
+
+The rake task is designed to be used via cron. For the moment, KISSmetrics will only process
+one CSV file per hour, so it makes sense that this task should be run at an interval of hours
+equal to the number of jobs you have defined. For example, if you have defined 4 jobs, this
+task should run once every 4 hours.
+
+The rake task also accepts two options that you can set via environment variables.
+
+`BESO_PREFIX` will change the prefix of the CSV filenames that get uploaded to S3. The default
+is 'beso', so it is recommended you use that when telling KISSmetrics what your filename
+pattern is. You can then adjust the prefix if you would like to upload CSV's that you don't
+want KISSmetrics to recognize.
+
+`BESO_ORIGIN` will change the behavior of the task when there is no previous timestamp
+defined for a job in `beso.yml`.
+
+> By default, the task will use the last timestamp in your table (which effectively
+> means the first run of this task will do nothing). This is because KISSmetrics
+> charges you for every event you log through their system, so you probably don't
+> want to upload 8 months worth of events straight away.
+
+This option will accept two values to alter the behavior:
+
+- `now` will set the first run timestamp to now, which will obviously not create any events.
+- `first` will set the first run timestamp to the first timestamp in each table. Use this with
+`BESO_PREFIX` if you want to dump an entire table's worth of events to S3 without having
+KISSmetrics process them.
 
 ## Contributing
 
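Note on the README hunk above: it states that `identity` and `prop` each accept a proc, a symbol, or a literal. The diff does not show how beso resolves these, so the following is only a minimal sketch of one way such a value could be evaluated against a record. The `resolve_value` helper and the `User` struct are hypothetical names for illustration, not part of beso's API.

```ruby
# Hypothetical helper, not taken from beso's source: resolves a job
# attribute that may be given as a proc, a symbol, or a literal.
def resolve_value(value, record)
  case value
  when Proc   then value.call(record)  # a proc receives the record
  when Symbol then record.send(value)  # a symbol is sent to the record
  else value                           # anything else is used as-is
  end
end

# Stand-in record type for the example (an assumption, not an ActiveRecord model).
User = Struct.new(:id, :age)
user = User.new(42, 30)

identity = resolve_value(:id, user)                # symbol form
age      = resolve_value(->(u) { u.age }, user)    # proc form
new_user = resolve_value(true, user)               # literal form
```

Under these assumptions, the three forms in the README's second example (`identity :id`, `prop( :age ){ ... }`, `prop( :new_user, true )`) would each map onto one branch of the `case`.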
data/lib/beso/version.rb CHANGED
@@ -1,3 +1,3 @@
 module Beso
-  VERSION = "0.1.0"
+  VERSION = "0.2.0"
 end
data/lib/tasks/beso.rake CHANGED
@@ -19,7 +19,16 @@ namespace :beso do
 
     Beso.jobs.each do |job|
 
-      config[ job.event ] ||= job.first_timestamp
+      config[ job.event ] ||= begin
+        case ENV[ 'BESO_ORIGIN' ]
+        when 'first'
+          job.first_timestamp
+        when 'now'
+          Time.now
+        else
+          job.last_timestamp
+        end
+      end
 
       puts "==> Processing job: #{job.event.inspect} since #{config[ job.event ]}"
 
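Note on the rake hunk above: the default origin changes from `job.first_timestamp` to `job.last_timestamp`, with `BESO_ORIGIN` selecting the alternatives. The selection logic can be exercised in isolation with a sketch like the one below. `FakeJob` and `origin_timestamp` are hypothetical names introduced here for illustration; only `first_timestamp`, `last_timestamp`, and the `'first'`/`'now'` values come from the diff.

```ruby
# Hypothetical stand-in for a beso job; it models only the two
# timestamp readers that the rake task calls.
FakeJob = Struct.new(:first_timestamp, :last_timestamp)

# Mirrors the case statement in the hunk: choose the starting timestamp
# for a job that has no entry in beso.yml yet. `now` is injectable so
# the result is deterministic when testing.
def origin_timestamp(job, origin, now = Time.now)
  case origin
  when 'first' then job.first_timestamp
  when 'now'   then now
  else job.last_timestamp
  end
end

job = FakeJob.new(Time.utc(2011, 8, 1), Time.utc(2012, 4, 12))

backfill = origin_timestamp(job, 'first')  # start from the oldest record
default  = origin_timestamp(job, nil)      # unset variable: newest record
runtime  = origin_timestamp(job, ENV['BESO_ORIGIN'])  # what the task would read
```

Because the `else` branch catches every other value, an unset or misspelled `BESO_ORIGIN` falls back to the last timestamp, which matches the README's description of the default.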
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: beso
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 0.2.0
 prerelease:
 platform: ruby
 authors:
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2012-04-12 00:00:00.000000000Z
+date: 2012-04-13 00:00:00.000000000Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rails