em-pg-client 0.2.1 → 0.3.0

@@ -0,0 +1 @@
1
+ lib/pg/em/client/*.rb lib/pg/em/*.rb lib/pg/*.rb - BENCHMARKS.md LICENSE HISTORY.md
@@ -1,26 +1,30 @@
1
- == Benchmarks
1
+ Benchmarks
2
+ ----------
2
3
 
3
- I've done some benchmark tests[link:benchmarks/em_pg.rb] to compare fully async and blocking em-pg drivers.
4
+ I've done some benchmark {file:benchmarks/em_pg.rb tests} to compare fully async and blocking em-pg drivers.
4
5
 
5
6
  The goal of the test is simply to retrieve (~80000) rows from table with a lot of text data, in chunks, using parallel connections.
6
7
  The parallel method uses synchrony for simplicity.
7
8
 
8
- * +single+ is (eventmachine-less) job for retrieving a whole data table in
9
+ * `single` is (eventmachine-less) job for retrieving a whole data table in
9
10
  one simple query "select * from resources"
10
- * +parallel+ chunk_row_count / concurrency] uses em-pg-client for retrieving
11
- result in chunks by +chunk_row_count+ rows and using +concurrency+ parallel
11
+ * `parallel` chunk_row_count / concurrency uses em-pg-client for retrieving
12
+ result in chunks by `chunk_row_count` rows and using `concurrency` parallel
12
13
  connections
13
- * +blocking+ chunk_row_count / concurrency is similiar to +parallel+ except
14
+ * `blocking` chunk_row_count / concurrency is similar to `parallel` except
14
15
  that it uses special patched version of library that uses blocking
15
16
  PGConnection methods
16
17
 
17
- == Environment
18
+ Environment
19
+ -----------
18
20
 
19
21
  The machine used for test is Linux CentOS 2.6.18-194.32.1.el5xen #1 SMP with Quad Core Xeon X3360 @ 2.83GHz, 4GB RAM.
20
22
  Postgres version used: 9.0.3.
21
23
 
22
- == The results:
24
+ The results:
25
+ ------------
23
26
 
27
+ ```
24
28
  >> benchmark 1000
25
29
  user system total real
26
30
  single: 80.970000 0.350000 81.320000 (205.592592)
@@ -34,10 +38,11 @@ Postgres version used: 9.0.3.
34
38
  blocking 5000/5: 79.930000 1.810000 81.740000 (223.342432)
35
39
  blocking 2000/10: 76.990000 2.820000 79.810000 (225.347169)
36
40
  blocking 1000/20: 78.790000 3.230000 82.020000 (225.949107)
41
+ ```
37
42
 
38
43
  As we can see the gain from using asynchronous pg client while
39
- using +parallel+ queries is noticeable (up to ~30%).
44
+ using `parallel` queries is noticeable (up to ~30%).
40
45
 
41
- The +blocking+ client however doesn't gain much from parallel execution.
46
+ The `blocking` client however doesn't gain much from parallel execution.
42
47
  This was expected because it freezes eventmachine until the whole
43
48
  dataset is consumed by the client.
@@ -1,20 +1,43 @@
1
+ 0.3.0
2
+
3
+ - dedicated asynchronous connection pool
4
+ - works on windows (with ruby 2.0+): uses PGConn#socket_io object instead of
5
+ #socket file descriptor
6
+ - socket watch handler is not being detached between command calls
7
+ - no more separate em and em-synchrony client
8
+ - api changes: async_exec and async_query commands are now fiber-synchronized
9
+ - api changes: other async_* methods are removed or deprecated
10
+ - api changes: asynchronous methods renamed to *_defer
11
+ - transaction() helper method that can be called recursively
12
+ - requirements updated: eventmachine ~> 1.0.0, pg >= 0.17.0, ruby >= 1.9.2
13
+ - spec: more auto re-connect test cases
14
+ - spec: more tests for connection establishing
15
+ - comply with pg: do not close the client on connection failure
16
+ - comply with pg: asynchronous connect_timeout falls back to the environment variable
17
+ - fix: auto re-connect raises an error if the failed connection had unfinished
18
+ transaction state
19
+ - yardoc docs
20
+
1
21
  0.2.1
22
+
2
23
  - support for pg >= 0.14 native PG::Result#check
3
24
  - support for pg >= 0.14 native PG::Connection#set_default_encoding
4
25
  - fix: connection option Hash argument was modified by Client.new and Client.async_connect
5
26
 
6
27
  0.2.0
28
+
7
29
  - disabled async_autoreconnect by default unless on_autoreconnect is set
8
30
  - async_connect sets #internal_encoding to Encoding.default_internal
9
31
  - fix: finish connection on async connect_timeout
10
32
  - nice errors generated on missing dependencies
11
33
  - blocking #reset() should clear async_command_aborted flag
12
34
  - less calls to #is_busy in Watcher#notify_readable
13
- - #async_describe_portal + specs
14
- - #async_describe_prepared + specs
35
+ - async_describe_portal() + specs
36
+ - async_describe_prepared() + specs
15
37
 
16
38
  0.2.0.pre.3
17
- - #status() returns CONNECTION_BAD for connections with expired query
39
+
40
+ - status() returns CONNECTION_BAD for connections with expired query
18
41
  - spec: query timeout expiration
19
42
  - non-blocking result processing for multiple data query statements sent at once
20
43
  - refine code in em-synchrony/pg
@@ -23,20 +46,23 @@
23
46
  - spec: autoreconnect
24
47
 
25
48
  0.2.0.pre.2
49
+
26
50
  - errors from consume_input fails deferrable
27
- - query_timeout now measures only network response timeout,
51
+ - query_timeout now measures only network response timeout,
28
52
  so it's not fired for large datasets
29
53
 
30
54
  0.2.0.pre.1
55
+
31
56
  - added query_timeout feature for async query commands
32
57
  - added connect_timeout property for async connect/reset
33
58
  - fix: async_autoreconnect for tcp/ip connections
34
59
  - fix: async_* does not raise errors; errors handled by deferrable
35
60
  - rework async_autoreconnect in fully async manner
36
- - added async_connect() and #async_reset()
61
+ - added async_connect() and async_reset()
37
62
  - API change: on_reconnect -> on_autoreconnect
38
63
 
39
64
  0.1.1
65
+
40
66
  - added on_reconnect callback
41
67
  - docs updated
42
68
  - added development dependency for eventmachine >= 1.0.0.beta.1
@@ -44,4 +70,5 @@
44
70
  - added error checking to eventmachine specs
45
71
 
46
72
  0.1.0
73
+
47
74
  - first release
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ Copyright (c) 2013 Rafal Michalski (rafal at yeondir dot com)
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
@@ -0,0 +1,392 @@
1
+ em-pg-client
2
+ ============
3
+
4
+ Author: Rafał Michalski (rafal at yeondir dot com)
5
+
6
+ * http://github.com/royaltm/ruby-em-pg-client
7
+
8
+ Description
9
+ -----------
10
+
11
+ __em-pg-client__ is the Ruby EventMachine driver interface to the
12
+ PostgreSQL RDBMS. It is based on [ruby-pg](https://bitbucket.org/ged/ruby-pg).
13
+
14
+ __em-pg-client__ provides {PG::EM::Client} class which inherits
15
+ [PG::Connection](http://deveiate.org/code/pg/PG/Connection.html).
16
+ You can work with {PG::EM::Client} almost the same way you would work
17
+ with PG::Connection.
18
+
19
+ The real difference begins when you turn the EventMachine reactor on.
20
+
21
+ ```ruby
22
+ require 'pg/em'
23
+
24
+ pg = PG::EM::Client.new dbname: 'test'
25
+
26
+ # no async
27
+ pg.query('select * from foo') do |result|
28
+   puts Array(result).inspect
29
+ end
30
+
31
+ # asynchronous
32
+ EM.run do
33
+   Fiber.new do
34
+     pg.query('select * from foo') do |result|
35
+       puts Array(result).inspect
36
+     end
37
+     EM.stop
38
+   end.resume
39
+ end
40
+
41
+ # asynchronous + deferrable
42
+ EM.run do
43
+   df = pg.query_defer('select * from foo')
44
+   df.callback { |result|
45
+     puts Array(result).inspect
46
+     EM.stop
47
+   }
48
+   df.errback {|ex|
49
+     raise ex
50
+   }
51
+   puts "sent"
52
+ end
53
+ ```
54
+
55
+ Features
56
+ --------
57
+
58
+ * Non-blocking / fully asynchronous processing with EventMachine.
59
+ * Event reactor auto-detecting, asynchronous fiber-synchronized command methods
60
+ (the same code can be used regardless of the EventMachine reactor state)
61
+ * Asynchronous EM-style (deferrable returning) command methods.
62
+ * Fully asynchronous automatic re-connects on connection failures
63
+ (e.g.: RDBMS restarts, network failures).
64
+ * Minimal changes to [PG::Connection](http://deveiate.org/code/pg/PG/Connection.html) API.
65
+ * Configurable timeouts (connect or execute) of asynchronous processing.
66
+ * Dedicated connection pool with dynamic size, supporting asynchronous
67
+ processing and transactions.
68
+ * [Sequel Adapter](https://github.com/fl00r/em-pg-sequel) by Peter Yanovich.
69
+ * Works on windows (requires ruby 2.0) (issue #7).
70
+
71
+ Requirements
72
+ ------------
73
+
74
+ * ruby >= 1.9.2 (tested: 2.1.0, 2.0.0-p353, 1.9.3-p374, 1.9.2-p320)
75
+ * https://bitbucket.org/ged/ruby-pg >= 0.17.0
76
+ * [PostgreSQL](http://www.postgresql.org/ftp/source/) RDBMS >= 8.3
77
+ * http://rubyeventmachine.com >= 1.0.0
78
+ * [EM-Synchrony](https://github.com/igrigorik/em-synchrony)
79
+ (optional - not needed for any of the client functionality,
80
+ just wrap your code in a fiber)
81
+
82
+ Install
83
+ -------
84
+
85
+ ```
86
+ $ [sudo] gem install em-pg-client
87
+ ```
88
+
89
+ #### Gemfile
90
+
91
+ ```ruby
92
+ gem "em-pg-client", "~> 0.3.0"
93
+ ```
94
+
95
+ #### Github
96
+
97
+ ```
98
+ git clone git://github.com/royaltm/ruby-em-pg-client.git
99
+ ```
100
+
101
+ Usage
102
+ -----
103
+
104
+ ### PG::Connection commands adapted to the EventMachine
105
+
106
+ #### Asynchronous, the EventMachine style:
107
+
108
+ * `Client.connect_defer` (singleton method)
109
+ * `reset_defer`
110
+ * `exec_defer` (alias: `query_defer`)
111
+ * `prepare_defer`
112
+ * `exec_prepared_defer`
113
+ * `describe_prepared_defer`
114
+ * `describe_portal_defer`
115
+
116
+ For arguments of these methods consult their original (without the `_defer`
117
+ suffix) counterparts in the
118
+ [PG::Connection](http://deveiate.org/code/pg/PG/Connection.html) manual.
119
+
120
+ Use `callback` with a block on the returned deferrable object to receive the
121
+ result. In case of `connect_defer` and `reset_defer` the result is an instance
122
+ of the {PG::EM::Client}. The received client is in connected state and ready
123
+ for the queries. Otherwise an instance of the
124
+ [PG::Result](http://deveiate.org/code/pg/PG/Result.html) is received. You may
125
+ `clear` the obtained result object or leave it to `gc`.
126
+
127
+ To detect an error in the executed command call `errback` on the deferrable
128
+ with a block. You should expect an instance of the raised `Exception`
129
+ (usually PG::Error) as the block argument.
130
+
131
+ #### Reactor sensing methods, EM-Synchrony style:
132
+
133
+ * `Client.new` (singleton, alias: `connect`, `open`, `setdb`, `setdblogin`)
134
+ * `reset`
135
+ * `exec` (alias: `query`, `async_exec`, `async_query`)
136
+ * `prepare`
137
+ * `exec_prepared`
138
+ * `describe_prepared`
139
+ * `describe_portal`
140
+
141
+ The above methods call `*_defer` counterparts of themselves and `yield`
142
+ from the current fiber while awaiting the result. The PG::Result instance
143
+ (or PG::EM::Client for `new`) is then returned to the caller.
144
+ If a code block is given, it will be passed the result as an argument.
145
+ In that case the value of the block is returned instead and the result is
146
+ being cleared (or in case of `new` - client is being closed) after block
147
+ terminates.
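
As a rough illustration of the mechanism described above, the following plain-Ruby sketch wraps a deferrable-returning call in a fiber-synchronized one. `ToyDeferrable`, `toy_query_defer`, `toy_query`, `PENDING` and `run_pending` are hypothetical names invented for this sketch; they stand in for EventMachine and the gem and are not part of its API.

```ruby
require 'fiber'

# Toy stand-in for an EventMachine-style deferrable (hypothetical class).
class ToyDeferrable
  def callback(&blk)
    @callback = blk
    self
  end

  def errback(&blk)
    @errback = blk
    self
  end

  def succeed(value)
    @callback.call(value) if @callback
  end

  def fail(error)
    @errback.call(error) if @errback
  end
end

# Toy "reactor": a queue of pending completions, drained by run_pending.
PENDING = []

# Hypothetical *_defer method: returns at once, completes on a later tick.
def toy_query_defer(sql)
  df = ToyDeferrable.new
  step = lambda do
    if sql.start_with?('select')
      df.succeed("rows for: #{sql}")
    else
      df.fail(ArgumentError.new("syntax error: #{sql}"))
    end
  end
  PENDING << step
  df
end

# Fiber-synchronized wrapper: suspend the calling fiber until the
# deferrable fires, then return the result or raise the error.
def toy_query(sql)
  fiber = Fiber.current
  df = toy_query_defer(sql)
  df.callback { |result| fiber.resume(result) }
  df.errback  { |error|  fiber.resume(error) }
  outcome = Fiber.yield
  raise outcome if outcome.is_a?(Exception)
  block_given? ? yield(outcome) : outcome
end

def run_pending
  PENDING.shift.call until PENDING.empty?
end

LOG = []
Fiber.new do
  LOG << toy_query('select 1')
  LOG << toy_query('select 2') { |res| res.upcase }
  begin
    toy_query('smellect 1')
  rescue ArgumentError => e
    LOG << "error: #{e.message}"
  end
end.resume
run_pending
```

Each `toy_query` call suspends only the calling fiber; the toy reactor loop keeps running and resumes the fiber once the result (or the error) arrives, which is why the calling code can be written in a sequential style.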
148
+
149
+ These methods check if EventMachine's reactor is running and the current fiber
150
+ is not a root fiber. Otherwise the parent (thread-blocking) PG::Connection
151
+ methods are being called.
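
The decision described in the paragraph above can be sketched as a two-part check. This is a minimal model, assuming the root fiber is captured once at load time; `ToyReactor`, `ROOT_FIBER` and `dispatch_mode` are hypothetical names for this sketch, with `ToyReactor.running` standing in for a reactor-state query such as `EM.reactor_running?`.

```ruby
require 'fiber'

# Fiber that loaded this file; treated as the "root" fiber (an assumption
# of this sketch).
ROOT_FIBER = Fiber.current

# Toy reactor flag standing in for the real reactor state (hypothetical).
module ToyReactor
  class << self
    attr_accessor :running
  end
  self.running = false
end

# Pick the code path: fiber-synchronized only when the reactor runs
# and the caller is not the root fiber, blocking otherwise.
def dispatch_mode
  if ToyReactor.running && !Fiber.current.equal?(ROOT_FIBER)
    :fiber_synchronized
  else
    :blocking
  end
end

MODES = []
MODES << dispatch_mode                       # reactor stopped, root fiber
ToyReactor.running = true
MODES << dispatch_mode                       # reactor running, still root fiber
Fiber.new { MODES << dispatch_mode }.resume  # reactor running, inside a fiber
```

Only the third call satisfies both conditions, so only that one would take the asynchronous path in this model.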
152
+
153
+ You can call asynchronous, fiber aware and blocking methods without finishing
154
+ the connection. You only need to start/stop EventMachine in between the
155
+ asynchronous calls.
156
+
157
+ Although the [em-synchrony](https://github.com/igrigorik/em-synchrony/)
158
+ provides a very nice set of tools for the untangled EventMachine, you don't
159
+ really require it to fully benefit from the PG::EM::Client. Just wrap your
160
+ asynchronous code in a fiber:
161
+
162
+     Fiber.new { ... }.resume
163
+
164
+ #### Special options
165
+
166
+ There are four special connection options and one of them is a standard `pg`
167
+ option used by the async methods. You may pass them as one of the __hash__
168
+ options to {PG::EM::Client.new} or {PG::EM::Client.connect_defer} or simply
169
+ use the accessor methods to change them on the fly.
170
+
171
+ The options are:
172
+
173
+ * `connect_timeout`
174
+ * `query_timeout`
175
+ * `async_autoreconnect`
176
+ * `on_autoreconnect`
177
+
178
+ Only `connect_timeout` is a standard `libpq` option, although changing it with
179
+ the accessor method affects asynchronous functions only.
180
+ See {PG::EM::Client} for more details.
181
+
182
+ #### Handling errors
183
+
184
+ Exactly like in `pg`:
185
+
186
+ ```ruby
187
+ EM.synchrony do
188
+   begin
189
+     pg.query('smellect 1')
190
+   rescue => e
191
+     puts "error: #{e.inspect}"
192
+   end
193
+   EM.stop
194
+ end
195
+ ```
196
+
197
+ with *_defer methods:
198
+
199
+ ```ruby
200
+ EM.run do
201
+   pg.query_defer('smellect 1') do |ret|
202
+     if ret.is_a?(Exception)
203
+       puts "PSQL error: #{ret.inspect}"
204
+     end
205
+   end
206
+ end
207
+ ```
208
+
209
+ or
210
+
211
+ ```ruby
212
+ EM.run do
213
+   pg.query_defer('smellect 1').callback do |ret|
214
+     puts "do something with #{ret}"
215
+   end.errback do |err|
216
+     puts "PSQL error: #{err.inspect}"
217
+   end
218
+ end
219
+ ```
220
+
221
+ ### Auto re-connecting in asynchronous mode
222
+
223
+ Connection reset is done in a non-blocking manner using `reset_defer` internally.
224
+
225
+ ```ruby
226
+ EM.run do
227
+   Fiber.new do
228
+     pg = PG::EM::Client.new async_autoreconnect: true
229
+
230
+     try_query = lambda do
231
+       pg.query('select * from foo') do |result|
232
+         puts Array(result).inspect
233
+       end
234
+     end
235
+
236
+     try_query.call
237
+     system 'pg_ctl stop -m fast'
238
+     system 'pg_ctl start -w'
239
+     try_query.call
240
+
241
+     EM.stop
242
+   end.resume
243
+ end
244
+ ```
245
+
246
+ To enable this feature call:
247
+
248
+ ```ruby
249
+ pg.async_autoreconnect = true
250
+ ```
251
+
252
+ Additionally the `on_autoreconnect` callback may be set on the connection.
253
+ It is invoked after a successful connection restart, just before the
254
+ pending command is sent again to the server.
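
The ordering described above (reset, then the callback, then the pending command again) can be modelled without a database. The sketch below is a toy under stated assumptions: `ToyClient`, `ToyConnectionBad`, `drop!` and the two-argument hook signature `(client, error)` are illustrative choices made for this sketch, not the gem's implementation.

```ruby
# Raised by the toy client when the link is down (hypothetical error class).
class ToyConnectionBad < StandardError; end

class ToyClient
  attr_accessor :async_autoreconnect, :on_autoreconnect
  attr_reader :log

  def initialize
    @connected = true
    @log = []
  end

  # Simulate a server restart dropping the connection.
  def drop!
    @connected = false
  end

  def reset
    @connected = true
    @log << 'reset'
  end

  def query(sql)
    send_query(sql)
  rescue ToyConnectionBad => e
    raise unless async_autoreconnect
    reset
    # The hook runs after the successful reset, before the command is re-sent.
    on_autoreconnect.call(self, e) if on_autoreconnect
    send_query(sql)
  end

  private

  def send_query(sql)
    raise ToyConnectionBad, 'connection lost' unless @connected
    @log << "sent: #{sql}"
    "result of #{sql}"
  end
end

CLIENT = ToyClient.new
CLIENT.async_autoreconnect = true
CLIENT.on_autoreconnect = proc { |client, _error| client.log << 'hook: restore session' }
CLIENT.query('select 1')
CLIENT.drop!
RESULT = CLIENT.query('select 2')
```

A hook of this kind is a natural place to restore per-session state (for example prepared statements) that a fresh connection no longer has.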
255
+
256
+ ### Connection Pool
257
+
258
+ Forever alone? Not anymore! There is a dedicated {PG::EM::ConnectionPool}
259
+ class with dynamic pool for both types of asynchronous commands (deferral
260
+ and fiber-synchronized).
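
To give a feel for what a dynamically sized, fiber-aware pool has to do, here is a minimal plain-Ruby sketch. It is not the code of {PG::EM::ConnectionPool}; `ToyPool`, `hold` and the string "connections" are hypothetical, and the real class works with deferrables and real clients.

```ruby
require 'fiber'

# Toy pool: creates connections lazily up to max_size and parks
# fibers that ask for one while all of them are in use.
class ToyPool
  attr_reader :created

  def initialize(max_size)
    @max_size = max_size
    @available = []
    @waiting = []
    @created = 0
  end

  # Borrow a connection for the duration of the block.
  def hold
    conn = acquire
    yield conn
  ensure
    release(conn) if conn
  end

  private

  def acquire
    if (conn = @available.pop)
      conn
    elsif @created < @max_size
      @created += 1
      "conn-#{@created}"
    else
      @waiting << Fiber.current
      Fiber.yield # resumed by release with a freed connection
    end
  end

  def release(conn)
    if (fiber = @waiting.shift)
      fiber.resume(conn) # hand the connection straight to a waiter
    else
      @available << conn
    end
  end
end

POOL = ToyPool.new(1)
USED = []
a = Fiber.new { POOL.hold { |c| USED << [:a, c]; Fiber.yield } }
b = Fiber.new { POOL.hold { |c| USED << [:b, c] } }
a.resume # a takes conn-1, then suspends as if waiting on I/O
b.resume # pool exhausted: b parks in the waiting queue
a.resume # a finishes; release hands conn-1 over to b
```

With a limit of one, the second fiber never gets a connection of its own; it simply waits until the first one gives its connection back.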
261
+
262
+ It also provides a #transaction method which locks the in-transaction
263
+ connection to the calling fiber and allows executing commands
264
+ on the same connection within a transaction block. The transactions may
265
+ be nested. See also docs for the {PG::EM::Client#transaction} method.
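
One plausible way to make such a helper safe to nest is to let only the outermost call talk to the server. The sketch below assumes exactly that (a depth counter, no savepoints); `ToyConnection` is a hypothetical class that merely records the SQL it would send, so this is an illustration and not the gem's actual method.

```ruby
# Toy connection that records the SQL it would send (hypothetical class).
class ToyConnection
  attr_reader :sent

  def initialize
    @sent = []
    @depth = 0
  end

  def exec(sql)
    @sent << sql
  end

  # Only the outermost call issues BEGIN / COMMIT / ROLLBACK;
  # nested calls just run their block on the same connection.
  def transaction
    outermost = @depth.zero?
    exec('BEGIN') if outermost
    @depth += 1
    begin
      result = yield self
    rescue
      exec('ROLLBACK') if outermost
      raise
    else
      exec('COMMIT') if outermost
      result
    ensure
      @depth -= 1
    end
  end
end

COMMITTED = ToyConnection.new
COMMITTED.transaction do |c|
  c.exec('insert into foo values (1)')
  c.transaction { |inner| inner.exec('insert into foo values (2)') }
end

ROLLED_BACK = ToyConnection.new
begin
  ROLLED_BACK.transaction do |c|
    c.transaction { raise 'boom' }
  end
rescue RuntimeError
  # The error propagates after a single ROLLBACK from the outermost block.
end
```

In this model an error raised at any nesting level undoes the whole transaction, because there is only one real transaction on the connection.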
266
+
267
+ #### Parallel async queries
268
+
269
+ ```ruby
270
+ require 'pg/em/connection_pool'
271
+ require 'em-synchrony'
272
+
273
+ EM.synchrony do
274
+   pg = PG::EM::ConnectionPool.new(size: 2, dbname: 'test')
275
+
276
+   multi = EM::Synchrony::Multi.new
277
+   multi.add :foo, pg.query_defer('select pg_sleep(1)')
278
+   multi.add :bar, pg.query_defer('select pg_sleep(1)')
279
+
280
+   start = Time.now
281
+   res = multi.perform
282
+   # around 1 sec.
283
+   puts Time.now - start
284
+
285
+   EM.stop
286
+ end
287
+ ```
288
+
289
+ #### Fiber Concurrency
290
+
291
+ ```ruby
292
+ require 'pg/em/connection_pool'
293
+ require 'em-synchrony'
294
+ require "em-synchrony/fiber_iterator"
295
+
296
+ EM.synchrony do
297
+   concurrency = 5
298
+   queries = (1..10).map {|i| "select pg_sleep(1); select #{i}" }
299
+
300
+   pg = PG::EM::ConnectionPool.new(size: concurrency, dbname: 'test')
301
+
302
+   start = Time.now
303
+   EM::Synchrony::FiberIterator.new(queries, concurrency).each do |query|
304
+     pg.query(query) do |result|
305
+       puts "recv: #{result.getvalue(0,0)}"
306
+     end
307
+   end
308
+   # around 2 secs.
309
+   puts Time.now - start
310
+
311
+   EM.stop
312
+ end
313
+ ```
314
+
315
+ API Changes
316
+ -----------
317
+
318
+ ### 0.2.x -> 0.3.x
319
+
320
+ There is a substantial difference in the API between this and the previous
321
+ releases. The idea behind it was to make this implementation as
322
+ compatible as possible with the threaded `pg` interface.
323
+ E.g. the `#async_exec` is now an alias to `#exec`.
324
+
325
+ The other reason was to get rid of the ugly em / em-synchrony duality.
326
+
327
+ * There is no separate em-synchrony client version anymore.
328
+ * The methods returning Deferrable now have the `*_defer` suffix.
329
+ * The `#async_exec` and `#async_query` (in <= 0.2 they were deferrable methods)
330
+ are now aliases to `#exec`.
331
+ * The command methods `#exec`, `#query`, `#exec_*`, `#describe_*` are now
332
+ em-synchrony style methods (fiber-synchronized).
333
+ * The following methods were removed:
334
+
335
+ - `#async_prepare`,
336
+ - `#async_exec_prepared`,
337
+ - `#async_describe_prepared`,
338
+ - `#async_describe_portal`
339
+
340
+ as their names were confusing due to the unfortunate `#async_exec`.
341
+
342
+ * The `async_connect` and `#async_reset` are renamed to `connect_defer` and `#reset_defer`
343
+ respectively.
344
+
345
+ ### 0.1.x -> 0.2.x
346
+
347
+ * `on_reconnect` renamed to more accurate `on_autoreconnect`
348
+ (well, it's not used by PG::EM::Client#reset call).
349
+ * `async_autoreconnect` is `false` by default if `on_autoreconnect`
350
+ is __not__ specified as initialization option.
351
+
352
+ Bugs/Limitations
353
+ ----------------
354
+
355
+ * no async support for: COPY commands (`get_copy_data`, `put_copy_data`),
356
+ `wait_for_notify`
357
+ * currently no ActiveRecord support (you are welcome to contribute).
358
+
359
+ TODO:
360
+ -----
361
+
362
+ * implement streaming results (Postgres >= 9.2)
363
+ * implement EM adapted version of `get_copy_data`, `put_copy_data`,
364
+ `wait_for_notify` and `transaction`
365
+ * ORM (ActiveRecord and maybe Datamapper) support as separate projects
366
+ * present more benchmarks
367
+
368
+ More Info
369
+ ---------
370
+
371
+ This implementation makes use of non-blocking:
372
+ [PGConn#is_busy](http://deveiate.org/code/pg/PG/Connection.html#method-i-is_busy) and
373
+ [PGConn#consume_input](http://deveiate.org/code/pg/PG/Connection.html#method-i-consume_input) methods.
374
+ Depending on the size of queried results and the concurrency level, the gain
375
+ in overall speed and responsiveness of your application might actually be quite large.
376
+ See {file:BENCHMARKS.md BENCHMARKING}.
377
+
378
+ Thanks
379
+ ------
380
+
381
+ The greetz go to:
382
+
383
+ * [Authors](https://bitbucket.org/ged/ruby-pg/wiki/Home#!copying) of __pg__
384
+ driver (especially for its async-api)
385
+ * Francis Cianfrocca for great reactor framework
386
+ [EventMachine](https://github.com/eventmachine/eventmachine)
387
+ * Ilya Grigorik [igrigorik](https://github.com/igrigorik) for
388
+ [untangling EM with Fibers](http://www.igvita.com/2010/03/22/untangling-evented-code-with-ruby-fibers/)
389
+ * Peter Yanovich [fl00r](https://github.com/fl00r) for the
390
+ [em-pg-sequel](https://github.com/fl00r/em-pg-sequel)
391
+ * Andrew Rudenko [prepor](https://github.com/prepor) for the implicit idea
392
+ of the re-usable watcher from his [em-pg](https://github.com/prepor/em-pg).