beso 0.1.0 → 0.2.0

Files changed (4)
  1. data/README.md +106 -2
  2. data/lib/beso/version.rb +1 -1
  3. data/lib/tasks/beso.rake +10 -1
  4. metadata +2 -2
data/README.md CHANGED
@@ -1,6 +1,6 @@
  # Beso

- TODO: Write a gem description
+ Sync your historical events to KISSmetrics via CSV.

  ## Installation

@@ -16,9 +16,113 @@ Or install it yourself as:

  $ gem install beso

+ Next, create an initializer for **beso**. There, you can set up your S3 bucket information and define your
+ serialization jobs:
+
+ ``` rb
+ # config/initializers/beso.rb
+ Beso.configure do |config|
+
+   # First, set up your S3 credentials:
+
+   config.access_key = '[your AWS access key]'
+   config.secret_key = '[your AWS secret key]'
+   config.bucket_name = 'beso' # recommended, but you can really call this anything
+
+   # Then, define some jobs:
+
+   config.job :message_delivered, :table => :messages do
+     identity { |message| message.user.id }
+     timestamp :created_at
+     prop( :message_id ) { |message| message.id }
+   end
+
+   config.job :signed_up, :table => :users do
+     identity { |user| user.id }
+     timestamp :created_at
+     prop( :age ){ |user| user.age }
+   end
+ end
+ ```
+
  ## Usage

- TODO: Write usage instructions here
+ ### Defining Jobs
+
+ KISSmetrics events have three properties that *must* be defined:
+
+ - Identity
+ - Timestamp
+ - Event
+
+ The **Identity** field is some sort of identifier for your user. Even if your job
+ is working on another table, you should probably have a way to tie the event back
+ to the user who caused it. Here, you can provide one of three things:
+
+ - A proc that should receive the record and return the identity value
+ - A symbol that will get passed to `record.send`
+ - A literal (You'll probably want to do one of the other two options)
+
+ The **Timestamp** field is slightly different in that it should always be part of
+ the table you are querying, not the user. This symbol will get sent to each record,
+ but will also be used in determining the query for the job.
+
+ The **Event** name is inferred by the name of your job. It will be provided and
+ formatted for you.
+
+ On top of this, you can specify up to **ten** custom properties. Like `identity`,
+ you can pass either a proc, a symbol, or a literal:
+
+ ``` rb
+ config.job :signed_up, :table => :users do
+   identity :id
+   timestamp :created_at
+   prop( :age ){ |user| user.age }
+   prop( :new_user, true )
+ end
+ ```
+
+ ### Using the rake task
+
+ By requiring `beso`, you get the `beso:run` rake task. This task will do the following:
+
+ - Connect to your S3 bucket
+ - Pull down 'beso.yml' if it exists
+
+ > `beso.yml` contains the timestamp of the last record queried for each job.
+ > If it doesn't exist, it will be created after the first run.
+
+ - Iterate over the jobs defined in the initializer you set up
+ - Create a CSV representation of all records newer than the timestamp found in `beso.yml`
+ - Upload each CSV to your S3 bucket with the event name and timestamp
+ - Update `beso.yml` with the latest timestamp for each job
+
+ The rake task is designed to be used via cron. For the moment, KISSmetrics will only process
+ one CSV file per hour, so it makes sense that this task should be run at an interval of hours
+ equal to the number of jobs you have defined. For example, if you have defined 4 jobs, this
+ task should run once every 4 hours.
+
+ The rake task also accepts two options that you can set via environment variables.
+
+ `BESO_PREFIX` will change the prefix of the CSV filenames that get uploaded to S3. The default
+ is 'beso', so it is recommended you use that when telling KISSmetrics what your filename
+ pattern is. You can then adjust the prefix if you would like to upload CSV's that you don't
+ want KISSmetrics to recognize.
+
+ `BESO_ORIGIN` will change the behavior of the task when there is no previous timestamp
+ defined for a job in `beso.yml`.
+
+ > By default, the task will use the last timestamp in your table (which effectively
+ > means the first run of this task will do nothing). This is because KISSmetrics
+ > charges you for every event you log through their system, so you probably don't
+ > want to upload 8 months worth of events straight away.
+
+ This option will accept two values to alter the behavior:
+
+ - `now` will set the first run timestamp to now, which will obviously not create any events.
+ - `first` will set the first run timestamp to the first timestamp in each table. Use this with
+ `BESO_PREFIX` if you want to dump an entire table's worth of events to S3 without having
+ KISSmetrics process them.

  ## Contributing

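Note on the README hunk above: the new text says `identity` and `prop` each accept a proc, a symbol, or a literal. The snippet below is a minimal sketch of how such a value might be resolved against a record. `resolve_field` and the `Struct`-based record are hypothetical names used for illustration only; they are not taken from the beso source, which may handle this differently.

```ruby
# Hypothetical helper (not from the beso source): resolves a job field that
# was declared as a proc, a symbol, or a literal, as the README describes.
def resolve_field(value, record)
  if value.respond_to?(:call)
    value.call(record)   # proc: receives the record and returns the value
  elsif value.is_a?(Symbol)
    record.send(value)   # symbol: passed to record.send
  else
    value                # literal: used as-is for every record
  end
end

# Stand-in for a row from the users table (an assumption for this example).
user_class = Struct.new(:id, :age)
user = user_class.new(42, 30)

by_proc    = resolve_field(->(u) { u.age }, user)  # like prop( :age ){ |user| user.age }
by_symbol  = resolve_field(:id, user)              # like identity :id
by_literal = resolve_field(true, user)             # like prop( :new_user, true )
```

Checking for `call` first keeps lambdas, procs, and method objects on the same path, while a plain symbol falls through to `send`.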
data/lib/beso/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Beso
-   VERSION = "0.1.0"
+   VERSION = "0.2.0"
  end
data/lib/tasks/beso.rake CHANGED
@@ -19,7 +19,16 @@ namespace :beso do

  Beso.jobs.each do |job|

-   config[ job.event ] ||= job.first_timestamp
+   config[ job.event ] ||= begin
+     case ENV[ 'BESO_ORIGIN' ]
+     when 'first'
+       job.first_timestamp
+     when 'now'
+       Time.now
+     else
+       job.last_timestamp
+     end
+   end

    puts "==> Processing job: #{job.event.inspect} since #{config[ job.event ]}"

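Note on the rake hunk above: in 0.1.0 a job with no stored timestamp started from `job.first_timestamp`; in 0.2.0 the default becomes `job.last_timestamp`, with `BESO_ORIGIN` selecting the alternatives. The sketch below isolates that selection so the three branches can be exercised without S3 or a database. `starting_timestamp`, the injected `env` and `now` arguments, and the `Struct` job are assumptions made for this example, not part of beso.

```ruby
# Mirrors the case statement in the diff. `env` and `now` are injected here
# (an assumption, for testability) instead of reading ENV and Time.now.
def starting_timestamp(job, env, now)
  case env['BESO_ORIGIN']
  when 'first' then job.first_timestamp
  when 'now'   then now
  else              job.last_timestamp   # new default in 0.2.0
  end
end

# Hypothetical stand-in for a beso job; only the readers used above exist.
job_class = Struct.new(:event, :first_timestamp, :last_timestamp)
first = Time.utc(2011, 8, 1)
last  = Time.utc(2012, 4, 13)
now   = Time.utc(2012, 4, 14)
job   = job_class.new(:signed_up, first, last)   # placeholder event key

# First run: nothing stored yet, so BESO_ORIGIN decides the starting point.
config = {}
config[job.event] ||= starting_timestamp(job, { 'BESO_ORIGIN' => 'first' }, now)
from_first = config[job.event]

# Later run: the stored timestamp wins because of `||=`, whatever the variable says.
config[job.event] ||= starting_timestamp(job, { 'BESO_ORIGIN' => 'now' }, now)
unchanged = config[job.event]

default_origin = starting_timestamp(job, {}, now)
now_origin     = starting_timestamp(job, { 'BESO_ORIGIN' => 'now' }, now)
```

Because the assignment uses `||=`, `BESO_ORIGIN` only matters for jobs that have no entry in `beso.yml` yet.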
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: beso
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.2.0
  prerelease:
  platform: ruby
  authors:
  authors:
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2012-04-12 00:00:00.000000000Z
+ date: 2012-04-13 00:00:00.000000000Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rails