conveyor 0.2.0 → 0.2.1

data.tar.gz.sig CHANGED
Binary file
@@ -1,3 +1,6 @@
+ == 0.2.1 / 2008-02-29
+ * added get by timestamp and rewind to timestamp (for groups, too)
+ 
  == 0.2.0 / 2008-02-26
  
  * switched from using Mongrel to Thin. this adds a dependency on thin, which depends on Event Machine
@@ -5,8 +5,8 @@ README.txt
  Rakefile
  bin/conveyor
  bin/conveyor-upgrade
- docs/file-formats.mkd
- docs/protocol.mkd
+ docs/file-formats.rdoc
+ docs/protocol.rdoc
  lib/conveyor.rb
  lib/conveyor/base_channel.rb
  lib/conveyor/channel.rb
data/Rakefile CHANGED
@@ -14,4 +14,4 @@ Hoe.new('conveyor', Conveyor::VERSION) do |p|
  p.extra_deps << ['daemons']
  end
  
- # vim: syntax=Ruby
+ # vim: syntax=Ruby
@@ -1,15 +1,16 @@
- = FILE FORMATS
- == DATA FILES
+ = File Formats
+ == Data Files
  
- id time offset length hash
- content
- ...
+ id timestamp offset length hash flags
+ content
+ ...
  
  contrived example:
  
- 1213124 2008-01-05T13:35:32 1234 11 asdfasdfasdfasdfasdfasdfasdfa
- foo bar bam
+ q01w 3mp0a6g ya 11 asdfasdfasdfasdfasdfasdfasdfa 0
+ foo bar bam
  
+ * all integers encoded in base-36
  * space separated line of metadata followed by content
  * delimiter might be useful for sanity checking, but the hash could probably suffice for ensuring that the offset was calculated and persisted properly. We should look at what ARC does here.
  * offset is to beginning of metadata line
@@ -17,15 +18,15 @@ foo bar bam
  
  === INDEX FILES
  
- id time offset length hash file
+ id time offset length hash flags file
  
  contrived example:
  
- 1213124 2008-01-05T13:35:32 1234 11 asdfasdfasdfasdfasdfasdfasdfa 1
+ q01w 3mp0a6g ya 11 asdfasdfasdfasdfasdfasdfasdfa 0 1
  
  notes:
- * 1 is the filename
- * assuming a lucene-style directory of datafiles + ToC/index
+ * '1' is the filename
+ * lucene-style directory of datafiles + ToC/index
  * given that the files are written sequentially we can avoid writing every entry to the index file (as long as you write the first and last entry to the index). At most this means you have to read n entries, where n is the gap between index entries. Given that most clients will have persistent connections and be reading sequentially, we can do some clever things on the server side to make this really efficient (basically meaning you'll only have to pay that penalty on the first item you read).
  
  == LOG FILES
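The base-36 fields in the new format decode with Ruby's `String#to_i(36)`. A quick sketch against the contrived example above (field names taken from the metadata layout in this doc):

```ruby
# Decode the metadata line from the contrived data-file example above.
# Field layout per the doc: id timestamp offset length hash flags.
line = 'q01w 3mp0a6g ya 11 asdfasdfasdfasdfasdfasdfasdfa 0'
id, timestamp, offset, length, hash, flags = line.split(' ')

puts id.to_i(36)     # => 1213124, the id from the old decimal example
puts offset.to_i(36) # => 1234, the offset from the old decimal example
```

Note that the id and offset round-trip exactly to the values in the pre-0.2.1 decimal example, which is how the two contrived examples line up.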
@@ -0,0 +1,63 @@
+ = Conveyor protocol
+ 
+ == Create a channel
+ 
+ [Request] PUT /channels/{channel name}
+ 
+ [Response] success: 201, failure: ?
+ 
+ The simple explanation is that to create a channel you do a PUT operation on the url you want for the channel, which must conform to /channels/\A[a-zA-Z0-9\-]+\Z. In other words, the channel name may only have letters numbers and dashes.
+ 
+ == Post to a channel
+ 
+ [Request] POST /channels/{channel name}, body is the message
+ [Response] success: 202, failure: ?
+ 
+ A post to a channel URL with the message in the body.
+ 
+ == Get from channel
+ 
+ === Get by id
+ 
+ [Request] GET /channels/{channel name}/{id}
+ [Response] success: 200, failure: 404
+ 
+ === Get by timestamp
+ 
+ NOT IMPLEMENTED YET!
+ 
+ [Request] GET /channels/{channel name}?after={timestamp}
+ [Response] success: 200
+ 
+ Will return the first entry *after* that timestamp.
+ 
+ === Get Next (Queue-like semantics)
+ 
+ [Request] GET /channels/{channel name}?next
+ [Response] success: 200
+ 
+ If this is called for the first time, it will return the first item in the channel. Otherwise it will return the next item.
+ 
+ === Get Next by Group (Multi-consumer queue)
+ 
+ [Request] GET /channels/{channel name}?next&group={group name}
+ [Response] success: 200
+ 
+ If this is called for the first time, it will return the first item in the channel. Otherwise it will return the next item.
+ 
+ === Rewinding to id
+ 
+ [Request] POST /channels/{channel name}?rewind_id={id}
+ [Response] success: 200
+ 
+ === Get next n
+ 
+ [Request] GET /channels/{channel name}?next&n={n}
+ [Response] JSON array of objects, which have the keys 'id', 'hash' and 'data'
+ 
+ 
+ === Get next n for group
+ 
+ [Request] GET /channels/{channel name}?next&n={n}&group={group}
+ [Response] JSON array of objects, which have the keys 'id', 'hash' and 'data'
+ 
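The read operations in the new protocol doc differ only in their query parameters. A sketch of how a client might assemble those paths; `read_path` is a hypothetical helper for illustration, not part of the Conveyor API:

```ruby
# Build GET paths for the read operations described in the protocol doc.
# read_path is a made-up helper name; only the URL shapes come from the doc.
def read_path(channel, opts = {})
  params = []
  params << 'next'                  if opts[:next]
  params << "after=#{opts[:after]}" if opts[:after]
  params << "n=#{opts[:n]}"         if opts[:n]
  params << "group=#{opts[:group]}" if opts[:group]
  path = "/channels/#{channel}"
  params.empty? ? path : "#{path}?#{params.join('&')}"
end

read_path('logs', :next => true)        # => "/channels/logs?next"
read_path('logs', :after => 1204329600) # => "/channels/logs?after=1204329600"
read_path('logs', :next => true, :n => 10, :group => 'g1')
# => "/channels/logs?next&n=10&group=g1"
```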
@@ -1,4 +1,4 @@
  module Conveyor
- VERSION = '0.2.0'
+ VERSION = '0.2.1'
  QUALITY = 'alpha'
  end
@@ -65,7 +65,7 @@ module Conveyor
  end
  end
  
- def commit data, time = nil
+ def commit data, time=nil
  l = nil
  gzip = data.length >= 256
  if gzip
@@ -105,7 +105,7 @@ module Conveyor
  end
  end
  
- def get id, stream = false
+ def get id, stream=false
  return nil unless id <= @last_id && id > 0
  i = @index[id-1]
  headers, content, compressed_content, g = nil
@@ -127,6 +127,12 @@ module Conveyor
  end
  end
  
+ def get_nearest_after_timestamp timestamp, stream=false
+ # i = binary search to find nearest item at or after timestamp
+ i = nearest_after(timestamp)
+ get(i) if i
+ end
+ 
  def self.parse_headers str, index_file=false
  pattern = '\A([a-z\d]+) ([a-z\d]+) ([a-z\d]+) ([a-z\d]+) ([a-f0-9]+) ([a-z\d]+)'
  pattern += ' (\d+)' if index_file
@@ -196,5 +202,24 @@ module Conveyor
  File.join(@directory, 'version')
  end
  
+ def nearest_after(timestamp)
+ low = 0
+ high = @index.length
+ while low < high
+ mid = (low + high) / 2
+ if (@index[mid][:time].to_i > timestamp)
+ high = mid - 1
+ elsif (@index[mid][:time].to_i < timestamp)
+ low = mid + 1
+ else
+ return mid
+ end
+ end
+ if timestamp <= @index[mid][:time].to_i
+ @index[mid][:id]
+ else
+ nil
+ end
+ end
  end
  end
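The new `nearest_after` is a lower-bound binary search over the (sorted, append-only) index timestamps. A self-contained sketch of the same technique over a plain sorted array, independent of the channel index structure:

```ruby
# Return the index of the first element >= target in a sorted array,
# or nil if every element is smaller -- the same "nearest at-or-after"
# search that nearest_after performs over index entry timestamps.
def lower_bound(sorted, target)
  low, high = 0, sorted.length
  while low < high
    mid = (low + high) / 2
    if sorted[mid] < target
      low = mid + 1    # everything up to mid is too early
    else
      high = mid       # mid is a candidate; keep searching left
    end
  end
  low < sorted.length ? low : nil
end

times = [10, 20, 20, 30]
lower_bound(times, 20) # => 1, the first of the duplicate 20s
lower_bound(times, 25) # => 3, nearest entry after 25 is 30
lower_bound(times, 99) # => nil, nothing at or after 99
```

Duplicate timestamps are possible in a channel (several posts in the same second), so converging on the leftmost match, as above, keeps reads deterministic.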
@@ -97,7 +97,7 @@ module Conveyor
  end
  
  def rewind *opts
- opts = opts.first
+ opts = opts.inject{|h, m| m.merge(h)}
  if opts.key?(:id)
  if opts.key?(:group)
  group_iterator_lock(opts[:group]) do
@@ -112,6 +112,12 @@ module Conveyor
  @iterator_file.write("#{@iterator.to_s(36)}\n")
  end
  end
+ elsif opts.key?(:time)
+ if opts.key?(:group)
+ rewind :id => nearest_after(opts[:time]), :group => opts[:group]
+ else
+ rewind :id => nearest_after(opts[:time])
+ end
  end
  end
  
@@ -45,12 +45,21 @@ module Conveyor
  end
  end
  
- def rewind id, group=nil
- if group
- @conn.post("/channels/#{@channel}?rewind_id=#{id}&group=#{group}", nil)
- else
- @conn.post("/channels/#{@channel}?rewind_id=#{id}", nil)
+ def rewind *opts
+ opts = opts.inject{|h,m| m.merge(h)}
+ if opts.key?(:id) && opts.key?(:group)
+ @conn.post("/channels/#{@channel}?rewind_id=#{opts[:id]}&group=#{opts[:group]}", nil)
+ elsif opts.key?(:id)
+ @conn.post("/channels/#{@channel}?rewind_id=#{opts[:id]}", nil)
+ elsif opts.key?(:group) && opts.key?(:time)
+ @conn.post("/channels/#{@channel}?rewind_time=#{opts[:time].to_i}&group=#{opts[:group]}", nil)
+ elsif opts.key?(:time)
+ @conn.post("/channels/#{@channel}?rewind_time=#{opts[:time].to_i}", nil)
  end
  end
+ 
+ def get_nearest_after_timestamp timestamp
+ @conn.get("/channels/#{@channel}?after=#{timestamp.to_i}").body
+ end
  end
  end
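Both the channel's and the client's `rewind` now take `*opts` and collapse the splatted hashes with `inject{|h, m| m.merge(h)}`, which folds any number of option hashes into one, with earlier hashes winning on key conflicts (since `m.merge(h)` lets `h`, the accumulated hash, override). A minimal illustration of that idiom, with a made-up method name:

```ruby
# Fold splatted option hashes into one, earlier hashes taking precedence,
# mirroring what the rewritten rewind methods do with their opts.
def collect_opts(*opts)
  opts.inject { |h, m| m.merge(h) }
end

collect_opts(:id => 1, :group => 'g')            # => {:id => 1, :group => 'g'}
collect_opts({:id => 1}, {:id => 9, :time => 0}) # => {:id => 1, :time => 0}
```

In the typical call, `rewind :id => 1, :group => 'g'`, Ruby collapses the trailing key-value pairs into a single hash anyway, so `inject` just returns it; the merge only matters when callers pass several hashes.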
@@ -68,6 +68,14 @@ module Conveyor
  @channels[m.captures[0]].rewind(:id => params['rewind_id']).to_i # TODO make sure this is an integer
  [200, {}, "iterator rewound to #{params['rewind_id']}"]
  end
+ elsif params.key?('rewind_time')
+ if params['group']
+ @channels[m.captures[0]].rewind(:time => params['rewind_time'].to_i, :group => params['group']).to_i # TODO make sure this is an integer
+ [200, {}, "iterator rewound to #{params['rewind_id']}"]
+ else
+ @channels[m.captures[0]].rewind(:time => params['rewind_time'].to_i) # TODO make sure this is an integer
+ [200, {}, "iterator rewound to #{params['rewind_time']}"]
+ end
  else
  if env.key?('HTTP_DATE') && d = Time.parse(env['HTTP_DATE'])
  id = @channels[m.captures[0]].post(env['rack.input'].read)
@@ -111,6 +119,8 @@ module Conveyor
  headers, content = @channels[m.captures[0]].get_next
  end
  end
+ elsif params.key? 'after'
+ headers, content = @channels[m.captures[0]].get_nearest_after_timestamp(params['after'].to_i)
  else
  return [200, {}, @channels[m.captures[0]].status.to_json]
  end
@@ -162,7 +162,7 @@ class TestConveyorChannel < Test::Unit::TestCase
  d = Channel.new('/tmp/bar')
  assert_equal 'foo', d.get_next[1]
  end
- 
+ 
  def test_group_rewind
  FileUtils.rm_r('/tmp/bar') rescue nil
  c = Channel.new('/tmp/bar')
@@ -176,7 +176,7 @@ class TestConveyorChannel < Test::Unit::TestCase
  d = Channel.new('/tmp/bar')
  assert_equal 'foo', d.get_next_by_group('bar')[1]
  end
- 
+ 
  def test_valid_name
  assert BaseChannel.valid_channel_name?(('a'..'z').to_a.join)
  assert BaseChannel.valid_channel_name?(('A'..'Z').to_a.join)
@@ -237,7 +237,7 @@ class TestConveyorChannel < Test::Unit::TestCase
  end
  assert_equal [], c.get_next_n_by_group(10, 'bar')
  end
- 
+ 
  def test_delete
  chan = 'test_delete'
  FileUtils.rm_r "/tmp/#{chan}" rescue nil
@@ -250,5 +250,52 @@ class TestConveyorChannel < Test::Unit::TestCase
  d = Channel.new("/tmp/#{chan}")
  assert_equal nil, d.get(1)
  end
+ 
+ def test_get_by_timestamp
+ chan = 'test_get_by_timestamp'
+ FileUtils.rm_r "/tmp/#{chan}" rescue nil
+ c = Channel.new("/tmp/#{chan}")
+ 
+ 10.times{|i| c.post(i.to_s)}
+ assert_equal '0', c.get_nearest_after_timestamp(0)[1]
+ assert_equal nil, c.get_nearest_after_timestamp(2**32)
+ 
+ t0 = Time.now.to_i
+ 10.times{|i| c.post((10 + i).to_s)}
+ assert_equal '9', c.get_nearest_after_timestamp(t0)[1]
+ end
+ 
+ def test_rewind_to_timestamp
+ chan = 'test_rewind_to_timestamp'
+ FileUtils.rm_r "/tmp/#{chan}" rescue nil
+ c = Channel.new("/tmp/#{chan}")
+ 
+ 10.times{|i| c.post(i.to_s)}
+ 10.times{|i| assert_equal i.to_s, c.get_next[1]}
+ 
+ c.rewind :time => 0
+ 10.times{|i| assert_equal i.to_s, c.get_next[1]}
+ 
+ t0 = Time.now.to_i + 1
+ c.rewind :time => t0
+ assert_equal nil, c.get_next
+ end
+ 
+ def test_rewind_group_to_timestamp
+ chan = 'test_rewind_group_to_timestamp'
+ FileUtils.rm_r "/tmp/#{chan}" rescue nil
+ c = Channel.new("/tmp/#{chan}")
+ 
+ group = 'foo'
+ 10.times{|i| c.post(i.to_s)}
+ 10.times{|i| assert_equal i.to_s, c.get_next_by_group(group)[1]}
+ 
+ c.rewind :time => 0, :group => group
+ 10.times{|i| assert_equal i.to_s, c.get_next_by_group(group)[1]}
+ 
+ t0 = Time.now.to_i + 1
+ c.rewind :time => t0, :group => group
+ assert_equal nil, c.get_next_by_group(group)
+ end
  
  end
@@ -49,7 +49,8 @@ class TestConveyorServer < Test::Unit::TestCase
  "WUSYY2dCBdDdZEiGWtyfC5yGKVMgDhzBhyNLwcefxa49fED1Sf05f8MlgXOBx6n5I6Ae2Wy3Mds",
  "uAlUDvngWqDl3PaRVl1i9RcwDIvJlNp6yMy9RQgVsucwNvKaSOQlJMarWItKy8zT2ON08ElKkZ2aQJlb45Z8FwfE0xh8sA",
  "NxWmEBmJp0uiNRhyxa26frQjfFaNERmZbConrytNQKnHfilFsZWAo0Qy8eVKgq", "ajq3i5ksiBovQYfvj",
- "yY3vhjeq","2IDeF0ccG8tRZIZSekz6fUii29"]
+ "yY3vhjeq","2IDeF0ccG8tRZIZSekz6fUii29"
+ ]
  
  data.each do |d|
  req = h.post("/channels/#{chan}", d, {'Content-Type' => 'application/octet-stream', 'Date' => Time.now.to_s})
@@ -62,7 +63,7 @@ class TestConveyorServer < Test::Unit::TestCase
  end
  end
  end
- 
+ 
  def test_invalid_channel
  Net::HTTP.start('localhost', 8011) do |h|
  req = h.put('/channels/|', '', {'Content-Type' => 'application/octet-stream'})
@@ -167,9 +168,9 @@ class TestConveyorServer < Test::Unit::TestCase
  c.post 'foo'
  
  assert_equal 'foo', c.get_next('bar')
- c.rewind(1, 'bar')
+ c.rewind(:id => 1, :group => 'bar')
  assert_equal 'foo', c.get_next('bar')
- c.rewind(1, 'bar')
+ c.rewind(:id => 1, :group => 'bar')
  end
  
  def test_get_next_by_group
@@ -275,5 +276,48 @@ class TestConveyorServer < Test::Unit::TestCase
  assert_equal 'foo', c.get_next
  end
  
+ def test_get_by_timestamp
+ chan = 'test_get_by_timestamp'
+ c = Client.new('localhost', chan)
+ 
+ 10.times{|i| c.post(i.to_s)}
+ assert_equal '0', c.get_nearest_after_timestamp(0)
+ assert_equal '', c.get_nearest_after_timestamp(2**32)
+ 
+ t0 = Time.now.to_i
+ 10.times{|i| c.post((10 + i).to_s)}
+ assert_equal '9', c.get_nearest_after_timestamp(t0)
+ end
+ 
+ def test_rewind_to_timestamp
+ chan = 'test_rewind_to_timestamp'
+ c = Client.new('localhost', chan)
+ 
+ 10.times{|i| c.post(i.to_s)}
+ 10.times{|i| assert_equal i.to_s, c.get_next}
+ 
+ c.rewind :time => 0
+ 10.times{|i| assert_equal i.to_s, c.get_next}
+ 
+ t0 = Time.now.to_i + 1
+ c.rewind :time => t0
+ assert_equal '', c.get_next
+ end
+ 
+ def test_rewind_group_to_timestamp
+ chan = 'test_rewind_group_to_timestamp'
+ group = 'foo'
+ c = Client.new('localhost', chan)
+ 
+ 10.times{|i| c.post(i.to_s)}
+ 10.times{|i| assert_equal i.to_s, c.get_next(group)}
+ 
+ c.rewind :time => 0, :group => group
+ 10.times{|i| assert_equal i.to_s, c.get_next(group)}
+ 
+ t0 = Time.now.to_i + 1
+ c.rewind :time => t0, :group => group
+ assert_equal '', c.get_next(group)
+ end
  end
  
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: conveyor
  version: !ruby/object:Gem::Version
- version: 0.2.0
+ version: 0.2.1
  platform: ruby
  authors:
  - Ryan King
@@ -30,7 +30,7 @@ cert_chain:
  Zls3y84CmyAEGg==
  -----END CERTIFICATE-----
  
- date: 2008-02-26 00:00:00 -08:00
+ date: 2008-02-29 00:00:00 -08:00
  default_executable:
  dependencies:
  - !ruby/object:Gem::Dependency
@@ -88,8 +88,8 @@ files:
  - Rakefile
  - bin/conveyor
  - bin/conveyor-upgrade
- - docs/file-formats.mkd
- - docs/protocol.mkd
+ - docs/file-formats.rdoc
+ - docs/protocol.rdoc
  - lib/conveyor.rb
  - lib/conveyor/base_channel.rb
  - lib/conveyor/channel.rb
metadata.gz.sig CHANGED
Binary file
@@ -1,89 +0,0 @@
- # Conveyor protocol #
- 
- ## Create a channel ##
- 
- Request
- : PUT /channels/{channel name}
- 
- Response
- : success: 201, failure: ?
- 
- The simple explanation is that to create a channel you do a PUT operation on the url you want for the channel, which must conform to /channels/\A[a-zA-Z0-9\-]+\Z. In other words, the channel name may only have letters numbers and dashes.
- 
- ## Post to a channel ##
- Request
- : POST /channels/{channel name}
- : body is the message
- 
- Response
- : success: 202, failure: ?
- 
- A post to a channel URL with the message in the body.
- 
- ## Get from channel ##
- 
- ### Get by id ###
- 
- Request
- : GET /channels/{channel name}/{id}
- 
- Response
- : success: 200, failure: 404
- 
- ### Get by datetime ###
- 
- NOT IMPLEMENTED YET!
- 
- Request
- : GET /channels/{channel name}?at={ISO datetime like 2008-01-11T17:53:59}
- 
- Response
- : success: 200
- 
- Will return the first entry *after* that datetime.
- 
- ### Get Next (Queue-like semantics) ###
- 
- Request
- : GET /channels/{channel name}?next
- 
- Response
- : success: 200
- 
- If this is called for the first time, it will return the first item in the channel. Otherwise it will return the next item.
- 
- ### Get Next by Group (Multi-consumer queue) ###
- 
- Request
- : GET /channels/{channel name}?next&group={group name}
- 
- Response
- : success: 200
- 
- If this is called for the first time, it will return the first item in the channel. Otherwise it will return the next item.
- 
- ### Rewinding to id ###
- 
- Request
- : POST /channels/{channel name}?rewind_id={id}
- 
- Response
- : success: 200
- 
- ### Get next n ###
- 
- Request
- : GET /channels/{channel name}?next&n={n}
- 
- Response
- : JSON array of objects, which have the keys 'id', 'hash' and 'data'
- 
- 
- ### Get next n for group ###
- 
- Request
- : GET /channels/{channel name}?next&n={n}&group={group}
- 
- Response
- : JSON array of objects, which have the keys 'id', 'hash' and 'data'
- 