timescaledb 0.2.3 → 0.2.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 0367f44853d1cd4845a905e4692691baca381b739f9fb35f6a2aa471f350c946
4
- data.tar.gz: b6bd9df57b80f6570f341f4842949092b367da03c5e9ba2edae6df6826288729
3
+ metadata.gz: f5cb8809c9bfe578eca1fc5fd3e5d6f93a3ff7eaa9a82f6f343d6258aa591817
4
+ data.tar.gz: 790728e08291b9df8539e08c37639bf81e8669df4e9321bc494cb27b25a21e81
5
5
  SHA512:
6
- metadata.gz: 4e02cd458d020baeaa3658c20f6b502970f36772ef10f62162ad1028ad3ef7ab36943909815d3d6d04776d6cbbd8047f4705bfacbcc9315b89adaf516e54365c
7
- data.tar.gz: 83f647f7814cf797155f599a3403e6641ea09edf6334f49f65279bed90c290686554b533eab852345367a0c76cf5f9a88d318d419fa46c9534aebb28a8922858
6
+ metadata.gz: 61deae5c4a9884e56595771fbbd5de843bc4b06b395bd119495193c5de5a9eb66087453d749d940223e79bfccd5703def5c829f9737385ebd414460782ef5340
7
+ data.tar.gz: fbd19b66f8d35af8995c25335bd001b92218acf30032533bd5972dff1bfed76105053c99c680ae705d67b6c87c8149c7426daa49de4913a58cc8d2b62327e442
data/.ruby-version CHANGED
@@ -1 +1 @@
1
- 2.7.1
1
+ 3.1.2
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- timescaledb (0.2.3)
4
+ timescaledb (0.2.5)
5
5
  activerecord
6
6
  activesupport
7
7
  pg (~> 1.2)
data/Gemfile.scenic.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- timescaledb (0.1.5)
4
+ timescaledb (0.2.3)
5
5
  activerecord
6
6
  activesupport
7
7
  pg (~> 1.2)
@@ -58,7 +58,7 @@ GEM
58
58
  racc (~> 1.4)
59
59
  nokogiri (1.12.5-x86_64-darwin)
60
60
  racc (~> 1.4)
61
- pg (1.3.0)
61
+ pg (1.4.4)
62
62
  pry (0.14.1)
63
63
  coderay (~> 1.1)
64
64
  method_source (~> 1.0)
data/docs/index.md CHANGED
@@ -40,6 +40,17 @@ The [all_in_one](https://github.com/jonatas/timescaledb/tree/master/examples/all
40
40
 
41
41
  The [ranking](https://github.com/jonatas/timescaledb/tree/master/examples/ranking) example shows how to configure a Rails app and navigate all the features available.
42
42
 
43
+
44
+ ## Toolkit examples
45
+
46
+ There are also examples in the [toolkit-demo](https://github.com/jonatas/timescaledb/tree/master/examples/toolkit-demo) folder that can help you to
47
+ understand how to properly use the toolkit functions.
48
+
49
+ * [ohlc](https://github.com/jonatas/timescaledb/tree/master/examples/toolkit-demo/ohlc.rb) is a function that summarizes data by Open, High, Low, and Close values and makes histograms available for grouping the data; very useful for financial analysis.
50
+ * While building the [LTTB tutorial](https://jonatas.github.io/timescaledb/toolkit_lttb_tutorial/) I created the [lttb](https://github.com/jonatas/timescaledb/tree/master/examples/toolkit-demo/lttb) example, a simple chart that downsamples data with the Largest Triangle Three Buckets algorithm. A [zoomable](https://github.com/jonatas/timescaledb/tree/master/examples/toolkit-demo/lttb-zoom) version, which lets you navigate the data and zoom in while keeping the same data resolution, is also available.
51
+ * A small example showing how to process [volatility](https://github.com/jonatas/timescaledb/blob/master/examples/toolkit-demo/compare_volatility.rb) is also a good way to get familiar with the pipeline functions. A benchmark implementing the same calculation in Ruby is included, so you can check how it compares to the SQL implementation.
52
+
53
+
43
54
  ## Extra resources
44
55
 
45
56
  If you need extra help, please join the fantastic [timescale community](https://www.timescale.com/community)
data/docs/migrations.md CHANGED
@@ -67,3 +67,10 @@ options = {
67
67
  create_continuous_aggregate('ohlc_1m', query, **options)
68
68
  ```
69
69
 
70
+ If you need more details, please check this [blog post][1].
71
+
72
+ If you're interested in candlesticks and need to get the OHLC values, take a look
73
+ at the [toolkit ohlc](/toolkit_ohlc) function, which does the same but through a
74
+ function that can reuse candlesticks from smaller timeframes.
75
+
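+ For a quick taste, a minimal sketch, assuming a `Tick` model configured with
+ `acts_as_time_vector` as in that tutorial:
+
+ ```ruby
+ Tick.ohlc(timeframe: '1m')
+ ```
+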
76
+ [1]: https://ideia.me/timescale-continuous-aggregates-with-ruby
data/docs/toolkit.md CHANGED
@@ -93,7 +93,7 @@ Now, let's add the model `app/models/measurement.rb`:
93
93
 
94
94
  ```ruby
95
95
  class Measurement < ActiveRecord::Base
96
- self.primary_key = 'device_id'
96
+ self.primary_key = nil
97
97
 
98
98
  acts_as_hypertable time_column: "ts"
99
99
  end
@@ -168,12 +168,15 @@ Measurement
168
168
  The final query for the example above looks like this:
169
169
 
170
170
  ```sql
171
- SELECT device_id, sum(abs_delta) as volatility
171
+ SELECT device_id, SUM(abs_delta) AS volatility
172
172
  FROM (
173
173
  SELECT device_id,
174
- abs(val - lag(val) OVER (PARTITION BY device_id ORDER BY ts)) as abs_delta
174
+ ABS(
175
+ val - LAG(val) OVER (
176
+ PARTITION BY device_id ORDER BY ts)
177
+ ) AS abs_delta
175
178
  FROM "measurements"
176
- ) as calc_delta
179
+ ) AS calc_delta
177
180
  GROUP BY device_id
178
181
  ```
179
182
 
@@ -182,8 +185,14 @@ let's reproduce the same example using the toolkit pipelines:
182
185
 
183
186
  ```ruby
184
187
  Measurement
185
- .select("device_id, timevector(ts, val) -> sort() -> delta() -> abs() -> sum() as volatility")
186
- .group("device_id")
188
+ .select(<<-SQL).group("device_id")
189
+ device_id,
190
+ timevector(ts, val)
191
+ -> sort()
192
+ -> delta()
193
+ -> abs()
194
+ -> sum() as volatility
195
+ SQL
187
196
  ```
188
197
 
189
198
  As you can see, it's much easier to read and digest the example. Now, let's take
@@ -198,7 +207,7 @@ here to allow us to not repeat the parameters of the `timevector(ts, val)` call.
198
207
 
199
208
  ```ruby
200
209
  class Measurement < ActiveRecord::Base
201
- self.primary_key = 'device_id'
210
+ self.primary_key = nil
202
211
 
203
212
  acts_as_hypertable time_column: "ts"
204
213
 
@@ -224,8 +233,14 @@ class Measurement < ActiveRecord::Base
224
233
  time_column: "ts"
225
234
 
226
235
  scope :volatility, -> do
227
- select("device_id, timevector(#{time_column}, #{value_column}) -> sort() -> delta() -> abs() -> sum() as volatility")
228
- .group("device_id")
236
+ select(<<-SQL).group("device_id")
237
+ device_id,
238
+ timevector(#{time_column}, #{value_column})
239
+ -> sort()
240
+ -> delta()
241
+ -> abs()
242
+ -> sum() as volatility
243
+ SQL
229
244
  end
230
245
  end
231
246
  ```
@@ -248,7 +263,12 @@ class Measurement < ActiveRecord::Base
248
263
 
249
264
  scope :volatility, -> (columns=segment_by_column) do
250
265
  _scope = select([*columns,
251
- "timevector(#{time_column}, #{value_column}) -> sort() -> delta() -> abs() -> sum() as volatility"
266
+ "timevector(#{time_column},
267
+ #{value_column})
268
+ -> sort()
269
+ -> delta()
270
+ -> abs()
271
+ -> sum() as volatility"
252
272
  ].join(", "))
253
273
  _scope = _scope.group(columns) if columns
254
274
  _scope
@@ -361,7 +381,7 @@ Now, let's measure compare the time to process the volatility:
361
381
  ```ruby
362
382
  Benchmark.bm do |x|
363
383
  x.report("ruby") { pp Measurement.volatility_by_device_id }
364
- x.report("sql") { pp Measurement.volatility("device_id").map(&:attributes) }
384
+ x.report("sql") { pp Measurement.volatility("device_id").map(&:attributes) }
365
385
  end
366
386
  # user system total real
367
387
  # ruby 0.612439 0.061890 0.674329 ( 0.727590)
@@ -379,10 +399,103 @@ records over the wires. Now, moving to a remote host look the numbers:
379
399
  Now, using a remote connection between different regions,
380
400
  it looks even ~500 times slower than SQL.
381
401
 
382
- user system total real
383
- ruby 0.716321 0.041640 0.757961 ( 6.388881)
384
- sql 0.001156 0.000177 0.001333 ( 0.161270)
402
+ user system total real
403
+ ruby 0.716321 0.041640 0.757961 ( 6.388881)
404
+ sql 0.001156 0.000177 0.001333 ( 0.161270)
385
405
 
406
+ Let’s recap what’s time consuming here. The `find_all` approach is just not optimized for
407
+ this task and consumes most of the time here: it fetches
408
+ the data and converts every row into an ActiveRecord model, which carries thousands of methods.
409
+
410
+ It’s very comfortable, but here we only need the raw attributes.
411
+
412
+ Let’s optimize it by plucking an array of values grouped by device.
413
+
414
+ ```ruby
415
+ class Measurement < ActiveRecord::Base
416
+ # ...
417
+ scope :values_from_devices, -> {
418
+ ordered_values = select(:val, :device_id).order(:ts)
419
+ Hash[
420
+ from(ordered_values)
421
+ .group(:device_id)
422
+ .pluck("device_id, array_agg(val)")
423
+ ]
424
+ }
425
+ end
426
+ ```
427
+
428
+ Now, let's create a method for processing volatility.
429
+
430
+ ```ruby
431
+ class Volatility
432
+ def self.process(values)
433
+ previous = nil
434
+ deltas = values.map do |value|
435
+ if previous
436
+ delta = (value - previous).abs
437
+ volatility = delta
438
+ end
439
+ previous = value
440
+ volatility
441
+ end
442
+ #deltas => [nil, 1, 1]
443
+ deltas.shift
444
+ volatility = deltas.sum
445
+ end
446
+ def self.process_values(map)
447
+ map.transform_values(&method(:process))
448
+ end
449
+ end
450
+ ```
451
+
452
+ Now, let's change the benchmark to expose the time for fetching and processing:
453
+
454
+
455
+ ```ruby
456
+ volatilities = nil
457
+
458
+ ActiveRecord::Base.logger = nil
459
+ Benchmark.bm do |x|
460
+ x.report("ruby") { Measurement.volatility_ruby }
461
+ x.report("sql") { Measurement.volatility_sql.map(&:attributes) }
462
+ x.report("fetch") { volatilities = Measurement.values_from_devices }
463
+ x.report("process") { Volatility.process_values(volatilities) }
464
+ end
465
+ ```
466
+
467
+ Checking the results:
468
+
469
+ user system total real
470
+ ruby 0.683654 0.036558 0.720212 ( 0.743942)
471
+ sql 0.000876 0.000096 0.000972 ( 0.054234)
472
+ fetch 0.078045 0.003221 0.081266 ( 0.116693)
473
+ process 0.067643 0.006473 0.074116 ( 0.074122)
474
+
475
+ Much better! Now we can see only about a 200ms difference in real time, which means ~36% more.
476
+
477
+
478
+ If we break down the SQL part a bit more, we can run an `EXPLAIN ANALYSE` on the fetch query:
479
+
480
+ ```sql
481
+ EXPLAIN ANALYSE
482
+ SELECT device_id, array_agg(val)
483
+ FROM (
484
+ SELECT val, device_id
485
+ FROM measurements
486
+ ORDER BY ts ASC
487
+ ) subquery
488
+ GROUP BY device_id;
489
+ ```
490
+
491
+ We can check the execution time to make it clear how much time is necessary
492
+ just for the database processing, isolating the network and the ActiveRecord layer.
493
+
494
+     Planning Time: 17.761 ms
495
+     Execution Time: 36.302 ms
496
+
497
+ So, of the **116ms** needed to fetch the data, only **54ms** was spent in the DB
498
+ and the remaining **62ms** was consumed by network + ORM.
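+ To wrap up, here is a hedged sketch of how the pieces above fit together,
+ assuming the `Measurement` model and the `Volatility` class defined earlier
+ (device names and values are illustrative):
+
+ ```ruby
+ # Fetch: a hash of device_id => values ordered by time
+ volatilities = Measurement.values_from_devices
+ # => {"device-1" => [10.0, 11.0, 13.0], ...}
+
+ # Process: sum of absolute deltas per device, computed in Ruby
+ Volatility.process_values(volatilities)
+ # => {"device-1" => 3.0, ...}  # |11 - 10| + |13 - 11| = 3.0
+ ```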
386
499
 
387
500
  [1]: https://github.com/timescale/timescaledb-toolkit
388
501
  [2]: https://timescale.com
@@ -0,0 +1,315 @@
1
+ # OHLC / Candlesticks
2
+
3
+ Candlesticks are a popular tool in technical analysis, used by traders to determine potential market movements.
4
+
5
+ The toolkit also allows you to compute candlesticks with the [ohlc][1] function.
6
+
7
+ Candlesticks are a type of price chart that displays the high, low, open, and close prices of a security for a specific period. They can be useful because they can provide information about market trends and reversals. For example, if you see that the stock has been trading in a range for a while, it may be worth considering buying or selling when the price moves outside of this range. Additionally, candlesticks can be used in conjunction with other technical indicators to make trading decisions.
8
+
9
+
10
+ Let's start by defining a table that stores trades from financial market data,
11
+ and then we can calculate the candlesticks with the TimescaleDB Toolkit.
12
+
13
+ ## Migration
14
+
15
+ The `ticks` table is a hypertable that partitions the data into one week
16
+ intervals and compresses chunks older than a month to save storage.
17
+
18
+ ```ruby
19
+ hypertable_options = {
20
+ time_column: 'time',
21
+ chunk_time_interval: '1 week',
22
+ compress_segmentby: 'symbol',
23
+ compress_orderby: 'time',
24
+ compression_interval: '1 month'
25
+ }
26
+ create_table :ticks, hypertable: hypertable_options, id: false do |t|
27
+ t.column :time, 'timestamp with time zone'
28
+ t.string :symbol
29
+ t.decimal :price
30
+ t.integer :volume
31
+ end
32
+ ```
33
+
34
+ In the previous code block, we assume it goes inside a Rails migration or you
35
+ can embed such code into a `ActiveRecord::Base.connection.instance_exec` block.
36
+
37
+ ## Defining the model
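+ For instance, a minimal sketch of running it outside a Rails migration,
+ mirroring the `examples/toolkit-demo/ohlc.rb` script (the connection URL is
+ just a placeholder):
+
+ ```ruby
+ require 'timescaledb'
+
+ ActiveRecord::Base.establish_connection('postgres://user:pass@localhost:5432/ticks')
+ ActiveRecord::Base.connection.instance_exec do
+   # paste the create_table block from above here
+ end
+ ```
+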
38
+
39
+ As we don't need a primary key for the table, let's set it to nil. The
40
+ `acts_as_hypertable` macro will give us several useful scopes that wrap
41
+ some of the TimescaleDB features.
42
+
43
+ The `acts_as_time_vector` macro allows us to set the default columns used
44
+ to calculate the data.
45
+
46
+
47
+ ```ruby
48
+ class Tick < ActiveRecord::Base
49
+ self.primary_key = nil
50
+ acts_as_hypertable time_column: :time
51
+ acts_as_time_vector value_column: :price, segment_by: :symbol
52
+ end
53
+ ```
54
+
55
+ The candlestick will split the timeframe by the `time_column` and use the `price` as the default value to process the candlestick. It will also segment the candles by `symbol`.
56
+
57
+ If you need to generate some data for your table, please check [this post][2].
58
+
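+ Alternatively, the [examples/toolkit-demo][5] script seeds random ticks with a
+ single SQL statement; a minimal sketch:
+
+ ```ruby
+ ActiveRecord::Base.connection.execute(<<~SQL)
+   INSERT INTO ticks
+   SELECT time, 'SYMBOL', 1 + (random()*30)::int, 100*(random()*10)::int
+   FROM generate_series(TIMESTAMP '2022-01-01 00:00:00',
+                        TIMESTAMP '2022-02-01 00:01:00',
+                        INTERVAL '1 second') AS time;
+ SQL
+ ```
+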
59
+ ## The `ohlc` scope
60
+
61
+ When the `acts_as_time_vector` method is used in the model, it will inject
62
+ several toolkit-backed scopes that give easy access to functions like
63
+ `ohlc`.
64
+
65
+ The `ohlc` scope accepts a few parameters that inherit the
66
+ configuration from the `acts_as_time_vector` call declared previously.
67
+
68
+ The simplest query is:
69
+
70
+ ```ruby
71
+ Tick.ohlc(timeframe: '1m')
72
+ ```
73
+
74
+ It will generate the following SQL:
75
+
76
+ ```sql
77
+ SELECT symbol,
78
+ "time",
79
+ toolkit_experimental.open(ohlc),
80
+ toolkit_experimental.high(ohlc),
81
+ toolkit_experimental.low(ohlc),
82
+ toolkit_experimental.close(ohlc),
83
+ toolkit_experimental.open_time(ohlc),
84
+ toolkit_experimental.high_time(ohlc),
85
+ toolkit_experimental.low_time(ohlc),
86
+ toolkit_experimental.close_time(ohlc)
87
+ FROM (
88
+ SELECT time_bucket('1m', time) as time,
89
+ "ticks"."symbol",
90
+ toolkit_experimental.ohlc(time, price)
91
+ FROM "ticks" GROUP BY 1, 2 ORDER BY 1)
92
+ AS ohlc
93
+ ```
94
+
95
+ The timeframe argument can also be omitted; the default is `1 hour`.
96
+
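+ For example, a small sketch assuming the `Tick` model above:
+
+ ```ruby
+ Tick.ohlc                    # defaults to 1 hour buckets
+ Tick.ohlc(timeframe: '1m')   # one-minute candlesticks
+ ```
+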
97
+ You can also combine other scopes to filter the data before computing the candlesticks:
98
+
99
+ ```ruby
100
+ Tick.yesterday
101
+ .where(symbol: "APPL")
102
+ .ohlc(timeframe: '1m')
103
+ ```
104
+
105
+ The `yesterday` scope is automatically included by the `acts_as_hypertable` macro, and it can be combined with other `where` clauses.
106
+
107
+ ## Continuous aggregates
108
+
109
+ If you would like to continuously aggregate the candlesticks into a materialized
110
+ view, you can use continuous aggregates.
111
+
112
+ The next example shows how to create a continuous aggregate of 1 minute
113
+ candlesticks:
114
+
115
+ ```ruby
116
+ options = {
117
+ with_data: false,
118
+ refresh_policies: {
119
+ start_offset: "INTERVAL '1 month'",
120
+ end_offset: "INTERVAL '1 minute'",
121
+ schedule_interval: "INTERVAL '1 minute'"
122
+ }
123
+ }
124
+ create_continuous_aggregate('ohlc_1m', Tick.ohlc(timeframe: '1m'), **options)
125
+ ```
126
+
127
+
128
+ Note that the `create_continuous_aggregate` calls the `to_sql` method in case
129
+ the second parameter is not a string.
130
+
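+ In other words, the call above is roughly equivalent to passing the SQL string
+ yourself (a hedged sketch):
+
+ ```ruby
+ create_continuous_aggregate('ohlc_1m', Tick.ohlc(timeframe: '1m').to_sql, **options)
+ ```
+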
131
+ ## Rollup
132
+
133
+ The rollup allows you to combine ohlc structures from smaller timeframes
134
+ into bigger timeframes without needing to reprocess all the data.
135
+
136
+ With this feature, you can roll up the ohlc multiple times, saving processing
137
+ on the server and making it easier to manage candlesticks from different time intervals.
138
+
139
+ In the previous example, we used the `.ohlc` scope, which already returns the
140
+ attributes for the chosen timeframe. In the SQL command it calls the
141
+ `open`, `high`, `low`, `close` functions that can access the values behind the
142
+ ohlcsummary type.
143
+
144
+ To merge the ohlc values we need to roll up the `ohlcsummary` to a bigger timeframe and
145
+ only unpack the values at the very end, when we want to read them as attributes.
146
+
147
+ Let's rebuild the structure:
148
+
149
+ ```ruby
150
+ execute "CREATE VIEW ohlc_1h AS #{ Ohlc1m.rollup(timeframe: '1 hour').to_sql}"
151
+ execute "CREATE VIEW ohlc_1d AS #{ Ohlc1h.rollup(timeframe: '1 day').to_sql}"
152
+ ```
153
+
154
+ ## Defining models for views
155
+
156
+ Note that the previous code refers to `Ohlc1m` and `Ohlc1h` as two classes that
157
+ are not defined yet. They will basically be read-only ActiveRecord models that
158
+ allow building scopes on top of the views.
159
+
160
+ Ohlc for one minute:
161
+ ```ruby
162
+ class Ohlc1m < ActiveRecord::Base
163
+ self.table_name = 'ohlc_1m'
164
+ include Ohlc
165
+ end
166
+ ```
167
+
168
+ Ohlc for one hour is pretty much the same:
169
+ ```ruby
170
+ class Ohlc1h < ActiveRecord::Base
171
+ self.table_name = 'ohlc_1h'
172
+ include Ohlc
173
+ end
174
+ ```
175
+
176
+ We'll also have `Ohlc` as a shared concern that helps you reuse
177
+ queries across the different views.
178
+
179
+ ```ruby
180
+ module Ohlc
181
+ extend ActiveSupport::Concern
182
+
183
+ included do
184
+ scope :rollup, -> (timeframe: '1h') do
185
+ select("symbol, time_bucket('#{timeframe}', time) as time,
186
+ toolkit_experimental.rollup(ohlc) as ohlc")
187
+ .group(1,2)
188
+ end
189
+
190
+ scope :attributes, -> do
191
+ select("symbol, time,
192
+ toolkit_experimental.open(ohlc),
193
+ toolkit_experimental.high(ohlc),
194
+ toolkit_experimental.low(ohlc),
195
+ toolkit_experimental.close(ohlc),
196
+ toolkit_experimental.open_time(ohlc),
197
+ toolkit_experimental.high_time(ohlc),
198
+ toolkit_experimental.low_time(ohlc),
199
+ toolkit_experimental.close_time(ohlc)")
200
+ end
201
+
202
+ # Following the attributes scope, we can define accessors in the
203
+ # model to populate from the previous scope to make it similar
204
+ # to a regular model structure.
205
+ attribute :time, :time
206
+ attribute :symbol, :string
207
+
208
+ %w[open high low close].each do |name|
209
+ attribute name, :decimal
210
+ attribute "#{name}_time", :time
211
+ end
212
+
213
+ def readonly?
214
+ true
215
+ end
216
+ end
217
+ end
218
+ ```
219
+
220
+ The `rollup` scope is the one used to regroup the data into bigger timeframes,
221
+ and the `attributes` scope gives access to the attributes of the [OpenHighLowClose][3]
222
+ type.
223
+
224
+ In this way, the views become just shortcuts, and complex SQL can also be built
225
+ by nesting the model scopes. For example, to roll up from a minute to a month,
226
+ you can do:
227
+
228
+ ```ruby
229
+ Ohlc1m.attributes.from(
230
+ Ohlc1m.rollup(timeframe: '1 month')
231
+ )
232
+ ```
233
+
234
+ Soon the continuous aggregates will [support nested aggregates][4] and you'll be
235
+ able to define the materialized views with steps like this:
236
+
237
+
238
+ ```ruby
239
+ Ohlc1m.attributes.from(
240
+ Ohlc1m.rollup(timeframe: '1 month').from(
241
+ Ohlc1m.rollup(timeframe: '1 week').from(
242
+ Ohlc1m.rollup(timeframe: '1 day').from(
243
+ Ohlc1m.rollup(timeframe: '1 hour')
244
+ )
245
+ )
246
+ )
247
+ )
248
+ ```
249
+
250
+ For now, composing the subqueries like this will probably be less efficient and unnecessary.
251
+ But the foundation is already here to help you in future analysis. Just to make
252
+ it clear, here is the SQL generated from the previous code:
253
+
254
+ ```sql
255
+ SELECT symbol,
256
+ time,
257
+ toolkit_experimental.open(ohlc),
258
+ toolkit_experimental.high(ohlc),
259
+ toolkit_experimental.low(ohlc),
260
+ toolkit_experimental.close(ohlc),
261
+ toolkit_experimental.open_time(ohlc),
262
+ toolkit_experimental.high_time(ohlc),
263
+ toolkit_experimental.low_time(ohlc),
264
+ toolkit_experimental.close_time(ohlc)
265
+ FROM (
266
+ SELECT symbol,
267
+ time_bucket('1 month', time) as time,
268
+ toolkit_experimental.rollup(ohlc) as ohlc
269
+ FROM (
270
+ SELECT symbol,
271
+ time_bucket('1 week', time) as time,
272
+ toolkit_experimental.rollup(ohlc) as ohlc
273
+ FROM (
274
+ SELECT symbol,
275
+ time_bucket('1 day', time) as time,
276
+ toolkit_experimental.rollup(ohlc) as ohlc
277
+ FROM (
278
+ SELECT symbol,
279
+ time_bucket('1 hour', time) as time,
280
+ toolkit_experimental.rollup(ohlc) as ohlc
281
+ FROM "ohlc_1m"
282
+ GROUP BY 1, 2
283
+ ) subquery
284
+ GROUP BY 1, 2
285
+ ) subquery
286
+ GROUP BY 1, 2
287
+ ) subquery
288
+ GROUP BY 1, 2
289
+ ) subquery
290
+ ```
291
+
292
+ You can also define more scopes that will be useful depending on what you are
293
+ working on. Example:
294
+
295
+ ```ruby
296
+ scope :yesterday, -> { where("DATE(#{time_column}) = ?", Date.yesterday.in_time_zone.to_date) }
297
+ ```
298
+
299
+ And then, just combine the scopes:
300
+
301
+ ```ruby
302
+ Ohlc1m.yesterday.attributes
303
+ ```
304
+ I hope you find this tutorial interesting. You can also check the
305
+ `ohlc.rb` file in the [examples/toolkit-demo][5] folder.
306
+
307
+ If you have any questions or concerns, feel free to reach out to me ([@jonatasdp][7]) in the [Timescale community][6] or tag timescaledb in your Stack Overflow question.
308
+
309
+ [1]: https://docs.timescale.com/api/latest/hyperfunctions/financial-analysis/ohlc/
310
+ [2]: https://ideia.me/timescale-continuous-aggregates-with-ruby
311
+ [3]: https://github.com/timescale/timescaledb-toolkit/blob/cbbca7b2e69968e585c845924e7ed7aff1cea20a/extension/src/ohlc.rs#L20-L24
312
+ [4]: https://github.com/timescale/timescaledb/pull/4668
313
+ [5]: https://github.com/jonatas/timescaledb/tree/master/examples/toolkit-demo
314
+ [6]: https://timescale.com/community
315
+ [7]: https://twitter.com/jonatasdp
@@ -1,6 +1,14 @@
1
- require 'bundler/setup'
2
- require 'timescaledb'
1
+ # ruby compare_volatility.rb postgres://user:pass@host:port/db_name
2
+ require 'bundler/inline' #require only what you need
3
3
 
4
+ gemfile(true) do
5
+ gem 'timescaledb', path: '../..'
6
+ gem 'pry'
7
+ end
8
+
9
+ # TODO: get the volatility using the window function with plain postgresql
10
+
11
+ ActiveRecord::Base.establish_connection ARGV.last
4
12
 
5
13
  # Compare volatility processing in Ruby vs SQL.
6
14
  class Measurement < ActiveRecord::Base
@@ -25,9 +33,36 @@ class Measurement < ActiveRecord::Base
25
33
  end
26
34
  volatility
27
35
  }
36
+ scope :values_from_devices, -> {
37
+ ordered_values = select(:val, :device_id).order(:ts)
38
+ Hash[
39
+ from(ordered_values)
40
+ .group(:device_id)
41
+ .pluck("device_id, array_agg(val)")
42
+ ]
43
+ }
44
+ end
45
+
46
+ class Volatility
47
+ def self.process(values)
48
+ previous = nil
49
+ deltas = values.map do |value|
50
+ if previous
51
+ delta = (value - previous).abs
52
+ volatility = delta
53
+ end
54
+ previous = value
55
+ volatility
56
+ end
57
+ #deltas => [nil, 1, 1]
58
+ deltas.shift
59
+ volatility = deltas.sum
60
+ end
61
+ def self.process_values(map)
62
+ map.transform_values(&method(:process))
63
+ end
28
64
  end
29
65
 
30
- ActiveRecord::Base.establish_connection ENV["PG_URI"]
31
66
  ActiveRecord::Base.connection.add_toolkit_to_search_path!
32
67
 
33
68
 
@@ -58,7 +93,12 @@ if Measurement.count.zero?
58
93
  SQL
59
94
  end
60
95
 
96
+
97
+ volatilities = nil
98
+ #ActiveRecord::Base.logger = nil
61
99
  Benchmark.bm do |x|
62
- x.report("ruby") { Measurement.volatility_ruby }
63
100
  x.report("sql") { Measurement.volatility_sql.map(&:attributes) }
101
+ x.report("ruby") { Measurement.volatility_ruby }
102
+ x.report("fetch") { volatilities = Measurement.values_from_devices }
103
+ x.report("process") { Volatility.process_values(volatilities) }
64
104
  end
@@ -0,0 +1,175 @@
1
+ # ruby ohlc.rb postgres://user:pass@host:port/db_name
2
+ # @see https://jonatas.github.io/timescaledb/ohlc_tutorial
3
+
4
+ require 'bundler/inline' #require only what you need
5
+
6
+ gemfile(true) do
7
+ gem 'timescaledb', path: '../..'
8
+ gem 'pry'
9
+ end
10
+
11
+ ActiveRecord::Base.establish_connection ARGV.last
12
+
13
+ # Compare ohlc processing in Ruby vs SQL.
14
+ class Tick < ActiveRecord::Base
15
+ acts_as_hypertable time_column: "time"
16
+ acts_as_time_vector segment_by: "symbol", value_column: "price"
17
+ end
18
+ require "active_support/concern"
19
+
20
+ module Ohlc
21
+ extend ActiveSupport::Concern
22
+
23
+ included do
24
+ %w[open high low close].each do |name|
25
+ attribute name, :decimal
26
+ attribute "#{name}_time", :time
27
+ end
28
+
29
+
30
+ scope :attributes, -> do
31
+ select("symbol, time,
32
+ toolkit_experimental.open(ohlc),
33
+ toolkit_experimental.high(ohlc),
34
+ toolkit_experimental.low(ohlc),
35
+ toolkit_experimental.close(ohlc),
36
+ toolkit_experimental.open_time(ohlc),
37
+ toolkit_experimental.high_time(ohlc),
38
+ toolkit_experimental.low_time(ohlc),
39
+ toolkit_experimental.close_time(ohlc)")
40
+ end
41
+
42
+ scope :rollup, -> (timeframe: '1h') do
43
+ select("symbol, time_bucket('#{timeframe}', time) as time,
44
+ toolkit_experimental.rollup(ohlc) as ohlc")
45
+ .group(1,2)
46
+ end
47
+
48
+ def readonly?
49
+ true
50
+ end
51
+ end
52
+
53
+ class_methods do
54
+ end
55
+ end
56
+
57
+ class Ohlc1m < ActiveRecord::Base
58
+ self.table_name = 'ohlc_1m'
59
+ include Ohlc
60
+ end
61
+
62
+ class Ohlc1h < ActiveRecord::Base
63
+ self.table_name = 'ohlc_1h'
64
+ include Ohlc
65
+ end
66
+
67
+ class Ohlc1d < ActiveRecord::Base
68
+ self.table_name = 'ohlc_1d'
69
+ include Ohlc
70
+ end
71
+ =begin
72
+ scope :ohlc_ruby, -> (
73
+ timeframe: 1.hour,
74
+ segment_by: segment_by_column,
75
+ time: time_column,
76
+ value: value_column) {
77
+ ohlcs = Hash.new() {|hash, key| hash[key] = [] }
78
+
79
+ key = tick.send(segment_by)
80
+ candlestick = ohlcs[key].last
81
+ if candlestick.nil? || candlestick.time + timeframe > tick.time
82
+ ohlcs[key] << Candlestick.new(time $, price)
83
+ end
84
+ find_all do |tick|
85
+ symbol = tick.symbol
86
+
87
+ if previous[symbol]
88
+ delta = (tick.price - previous[symbol]).abs
89
+ volatility[symbol] += delta
90
+ end
91
+ previous[symbol] = tick.price
92
+ end
93
+ volatility
94
+ }
95
+ =end
96
+
97
+ ActiveRecord::Base.connection.add_toolkit_to_search_path!
98
+
99
+
100
+ ActiveRecord::Base.connection.instance_exec do
101
+ ActiveRecord::Base.logger = Logger.new(STDOUT)
102
+
103
+ unless Tick.table_exists?
104
+ hypertable_options = {
105
+ time_column: 'time',
106
+ chunk_time_interval: '1 week',
107
+ compress_segmentby: 'symbol',
108
+ compress_orderby: 'time',
109
+ compression_interval: '1 month'
110
+ }
111
+ create_table :ticks, hypertable: hypertable_options, id: false do |t|
112
+ t.column :time , 'timestamp with time zone'
113
+ t.string :symbol
114
+ t.decimal :price
115
+ t.integer :volume
116
+ end
117
+
118
+ options = {
119
+ with_data: false,
120
+ refresh_policies: {
121
+ start_offset: "INTERVAL '1 month'",
122
+ end_offset: "INTERVAL '1 minute'",
123
+ schedule_interval: "INTERVAL '1 minute'"
124
+ }
125
+ }
126
+ create_continuous_aggregate('ohlc_1m', Tick._ohlc(timeframe: '1m'), **options)
127
+
128
+ execute "CREATE VIEW ohlc_1h AS #{ Ohlc1m.rollup(timeframe: '1 hour').to_sql}"
129
+ execute "CREATE VIEW ohlc_1d AS #{ Ohlc1h.rollup(timeframe: '1 day').to_sql}"
130
+ end
131
+ end
132
+
133
+ if Tick.count.zero?
134
+ ActiveRecord::Base.connection.execute(<<~SQL)
135
+ INSERT INTO ticks
136
+ SELECT time, 'SYMBOL', 1 + (random()*30)::int, 100*(random()*10)::int
137
+ FROM generate_series(TIMESTAMP '2022-01-01 00:00:00',
138
+ TIMESTAMP '2022-02-01 00:01:00',
139
+ INTERVAL '1 second') AS time;
140
+ SQL
141
+ end
142
+
143
+
144
+ # Fetch attributes
145
+ Ohlc1m.attributes
146
+
147
+ # Rollup demo
148
+
149
+ # Attributes from rollup
150
+ Ohlc1m.attributes.from(Ohlc1m.rollup(timeframe: '1 day'))
151
+
152
+
153
+ # Nesting several levels
154
+ Ohlc1m.attributes.from(
155
+ Ohlc1m.rollup(timeframe: '1 week').from(
156
+ Ohlc1m.rollup(timeframe: '1 day')
157
+ )
158
+ )
159
+ Ohlc1m.attributes.from(
160
+ Ohlc1m.rollup(timeframe: '1 month').from(
161
+ Ohlc1m.rollup(timeframe: '1 week').from(
162
+ Ohlc1m.rollup(timeframe: '1 day')
163
+ )
164
+ )
165
+ )
166
+
167
+ Pry.start
168
+
169
+ =begin
170
+ TODO: implement the ohlc_ruby
171
+ Benchmark.bm do |x|
172
+ x.report("ruby") { Tick.ohlc_ruby }
173
+ x.report("sql") { Tick.ohlc.map(&:attributes) }
174
+ end
175
+ =end
@@ -4,7 +4,12 @@ require 'active_record/connection_adapters/postgresql_adapter'
4
4
  module Timescaledb
5
5
  # Migration helpers can help you to setup hypertables by default.
6
6
  module MigrationHelpers
7
- # create_table can receive `hypertable` argument
7
+ # `create_table` accepts a `hypertable` argument with options for creating
8
+ # a TimescaleDB hypertable.
9
+ #
10
+ # See https://docs.timescale.com/api/latest/hypertable/create_hypertable/#optional-arguments
11
+ # for additional options supported by the plugin.
12
+ #
8
13
  # @example
9
14
  # options = {
10
15
  # time_column: 'created_at',
@@ -27,15 +32,29 @@ module Timescaledb
27
32
  # Setup hypertable from options
28
33
  # @see create_table with the hypertable options.
29
34
  def create_hypertable(table_name,
30
- time_column: 'created_at',
31
- chunk_time_interval: '1 week',
32
- compress_segmentby: nil,
33
- compress_orderby: 'created_at',
34
- compression_interval: nil
35
- )
35
+ time_column: 'created_at',
36
+ chunk_time_interval: '1 week',
37
+ compress_segmentby: nil,
38
+ compress_orderby: 'created_at',
39
+ compression_interval: nil,
40
+ partition_column: nil,
41
+ number_partitions: nil,
42
+ **hypertable_options)
36
43
 
37
44
  ActiveRecord::Base.logger = Logger.new(STDOUT)
38
- execute "SELECT create_hypertable('#{table_name}', '#{time_column}', chunk_time_interval => INTERVAL '#{chunk_time_interval}')"
45
+
46
+ options = ["chunk_time_interval => INTERVAL '#{chunk_time_interval}'"]
47
+ options += hypertable_options.map { |k, v| "#{k} => #{quote(v)}" }
48
+
49
+ arguments = [
50
+ quote(table_name),
51
+ quote(time_column),
52
+ (quote(partition_column) if partition_column),
53
+ (number_partitions if partition_column),
54
+ *options
55
+ ]
56
+
57
+ execute "SELECT create_hypertable(#{arguments.compact.join(', ')})"
39
58
 
40
59
  if compress_segmentby
41
60
  execute <<~SQL
@@ -80,7 +99,7 @@ module Timescaledb
80
99
  WITH #{"NO" unless options[:with_data]} DATA;
81
100
  SQL
82
101
 
83
- create_continuous_aggregate_policy(table_name, options[:refresh_policies] || {})
102
+ create_continuous_aggregate_policy(table_name, **(options[:refresh_policies] || {}))
84
103
  end
85
104
 
86
105
 
@@ -6,15 +6,11 @@ module Timescaledb
6
6
  def tables(stream)
7
7
  super # This will call #table for each table in the database
8
8
  views(stream) unless defined?(Scenic) # Don't call this twice if we're using Scenic
9
- end
10
9
 
11
- def table(table_name, stream)
12
- super(table_name, stream)
13
- if Timescaledb::Hypertable.table_exists? &&
14
- (hypertable = Timescaledb::Hypertable.find_by(hypertable_name: table_name))
15
- timescale_hypertable(hypertable, stream)
16
- timescale_retention_policy(hypertable, stream)
17
- end
10
+ return unless Timescaledb::Hypertable.table_exists?
11
+
12
+ timescale_hypertables(stream)
13
+ timescale_retention_policies(stream)
18
14
  end
19
15
 
20
16
  def views(stream)
@@ -24,23 +20,42 @@ module Timescaledb
24
20
  super if defined?(super)
25
21
  end
26
22
 
23
+ def timescale_hypertables(stream)
24
+ sorted_hypertables.each do |hypertable|
25
+ timescale_hypertable(hypertable, stream)
26
+ end
27
+ end
28
+
29
+ def timescale_retention_policies(stream)
30
+ if sorted_hypertables.any? { |hypertable| hypertable.jobs.exists?(proc_name: "policy_retention") }
31
+ stream.puts # Insert a blank line above the retention policies, for readability
32
+ end
33
+
34
+ sorted_hypertables.each do |hypertable|
35
+ timescale_retention_policy(hypertable, stream)
36
+ end
37
+ end
38
+
27
39
  private
28
40
 
29
41
  def timescale_hypertable(hypertable, stream)
30
- dim = hypertable.dimensions.first
31
- extra_settings = {
32
- time_column: "#{dim.column_name}",
33
- chunk_time_interval: "#{dim.time_interval.inspect}"
34
- }.merge(timescale_compression_settings_for(hypertable)).map {|k, v| %Q[#{k}: "#{v}"]}.join(", ")
35
-
36
- stream.puts %Q[ create_hypertable "#{hypertable.hypertable_name}", #{extra_settings}]
37
- stream.puts
42
+ time = hypertable.main_dimension
43
+
44
+ options = {
45
+ time_column: time.column_name,
46
+ chunk_time_interval: time.time_interval.inspect,
47
+ **timescale_compression_settings_for(hypertable),
48
+ **timescale_space_partition_for(hypertable),
49
+ **timescale_index_options_for(hypertable)
50
+ }
51
+
52
+ options = options.map { |k, v| "#{k}: #{v.to_json}" }.join(", ")
53
+ stream.puts %Q[ create_hypertable "#{hypertable.hypertable_name}", #{options}]
38
54
  end
39
55
 
40
56
  def timescale_retention_policy(hypertable, stream)
41
57
  hypertable.jobs.where(proc_name: "policy_retention").each do |job|
42
58
  stream.puts %Q[ create_retention_policy "#{job.hypertable_name}", interval: "#{job.config["drop_after"]}"]
43
- stream.puts
44
59
  end
45
60
  end
46
61
 
@@ -60,6 +75,22 @@ module Timescaledb
60
75
  compression_settings
61
76
  end
62
77
 
78
+ def timescale_space_partition_for(hypertable)
79
+ return {} unless hypertable.dimensions.length > 1
80
+
81
+ space = hypertable.dimensions.last
82
+ {partition_column: space.column_name, number_partitions: space.num_partitions}
83
+ end
84
+
85
+ def timescale_index_options_for(hypertable)
86
+ time = hypertable.main_dimension
87
+ if @connection.indexes(hypertable.hypertable_name).any? { |i| i.columns == [time.column_name] }
88
+ {}
89
+ else
90
+ {create_default_indexes: false}
91
+ end
92
+ end
93
+
63
94
  def timescale_continuous_aggregates(stream)
64
95
  Timescaledb::ContinuousAggregates.all.each do |aggregate|
65
96
  opts = if (refresh_policy = aggregate.jobs.refresh_continuous_aggregate.first)
@@ -85,6 +116,10 @@ module Timescaledb
85
116
 
86
117
  "INTERVAL '#{value}'"
87
118
  end
119
+
120
+ def sorted_hypertables
121
+ @sorted_hypertables ||= Timescaledb::Hypertable.order(:hypertable_name).to_a
122
+ end
88
123
  end
89
124
  end
90
125
 
@@ -13,8 +13,9 @@ module Timescaledb
13
13
  end
14
14
 
15
15
  def time_column
16
- respond_to?(:time_column) && super || time_vector_options[:time_column]
16
+ respond_to?(:time_column) && super || time_vector_options[:time_column]
17
17
  end
18
+
18
19
  def segment_by_column
19
20
  time_vector_options[:segment_by]
20
21
  end
@@ -25,8 +26,7 @@ module Timescaledb
25
26
  scope :volatility, -> (segment_by: segment_by_column) do
26
27
  select([*segment_by,
27
28
  "timevector(#{time_column}, #{value_column}) -> sort() -> delta() -> abs() -> sum() as volatility"
28
- ].join(", "))
29
- .group(segment_by)
29
+ ].join(", ")).group(segment_by)
30
30
  end
31
31
 
32
32
  scope :time_weight, -> (segment_by: segment_by_column) do
@@ -40,8 +40,7 @@ module Timescaledb
40
40
  lttb_query = <<~SQL
41
41
  WITH x AS ( #{select(*segment_by, time_column, value_column).to_sql})
42
42
  SELECT #{"x.#{segment_by}," if segment_by}
43
- (lttb( x.#{time_column}, x.#{value_column}, #{threshold})
44
- -> toolkit_experimental.unnest()).*
43
+ (lttb( x.#{time_column}, x.#{value_column}, #{threshold}) -> unnest()).*
45
44
  FROM x
46
45
  #{"GROUP BY device_id" if segment_by}
47
46
  SQL
@@ -58,6 +57,38 @@ module Timescaledb
58
57
  downsampled.map{|e|[ e[time_column],e[value_column]]}
59
58
  end
60
59
  end
60
+
61
+
62
+ scope :_ohlc, -> (timeframe: '1h',
63
+ segment_by: segment_by_column,
64
+ time: time_column,
65
+ value: value_column) do
66
+
67
+ select( "time_bucket('#{timeframe}', #{time}) as #{time}",
68
+ *segment_by,
69
+ "toolkit_experimental.ohlc(#{time}, #{value})")
70
+ .order(1)
71
+ .group(*(segment_by ? [1,2] : 1))
72
+ end
73
+
74
+ scope :ohlc, -> (timeframe: '1h',
75
+ segment_by: segment_by_column,
76
+ time: time_column,
77
+ value: value_column) do
78
+
79
+ raw = _ohlc(timeframe: timeframe, segment_by: segment_by, time: time, value: value)
80
+ unscoped
81
+ .from("(#{raw.to_sql}) AS ohlc")
82
+ .select(*segment_by, time,
83
+ "toolkit_experimental.open(ohlc),
84
+ toolkit_experimental.high(ohlc),
85
+ toolkit_experimental.low(ohlc),
86
+ toolkit_experimental.close(ohlc),
87
+ toolkit_experimental.open_time(ohlc),
88
+ toolkit_experimental.high_time(ohlc),
89
+ toolkit_experimental.low_time(ohlc),
90
+ toolkit_experimental.close_time(ohlc)")
91
+ end
61
92
  end
62
93
  end
63
94
  end
@@ -1,3 +1,3 @@
1
1
  module Timescaledb
2
- VERSION = '0.2.3'
2
+ VERSION = '0.2.5'
3
3
  end
data/mkdocs.yml CHANGED
@@ -29,5 +29,6 @@ nav:
29
29
  - Toolkit Integration: toolkit.md
30
30
  - Toolkit LTTB Tutorial: toolkit_lttb_tutorial.md
31
31
  - Zooming with High Resolution: toolkit_lttb_zoom.md
32
+ - Toolkit OHLC: toolkit_ohlc.md
32
33
  - Command Line: command_line.md
33
34
  - Videos: videos.md
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: timescaledb
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.3
4
+ version: 0.2.5
5
5
  platform: ruby
6
6
  authors:
7
7
  - Jônatas Davi Paganini
8
- autorequire:
8
+ autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2022-10-23 00:00:00.000000000 Z
11
+ date: 2022-12-19 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: pg
@@ -171,6 +171,7 @@ files:
171
171
  - docs/toolkit.md
172
172
  - docs/toolkit_lttb_tutorial.md
173
173
  - docs/toolkit_lttb_zoom.md
174
+ - docs/toolkit_ohlc.md
174
175
  - docs/videos.md
175
176
  - examples/all_in_one/all_in_one.rb
176
177
  - examples/all_in_one/benchmark_comparison.rb
@@ -234,6 +235,7 @@ files:
234
235
  - examples/toolkit-demo/lttb/lttb_sinatra.rb
235
236
  - examples/toolkit-demo/lttb/lttb_test.rb
236
237
  - examples/toolkit-demo/lttb/views/index.erb
238
+ - examples/toolkit-demo/ohlc.rb
237
239
  - lib/timescaledb.rb
238
240
  - lib/timescaledb/acts_as_hypertable.rb
239
241
  - lib/timescaledb/acts_as_hypertable/core.rb
@@ -262,7 +264,7 @@ licenses:
262
264
  metadata:
263
265
  allowed_push_host: https://rubygems.org
264
266
  homepage_uri: https://github.com/jonatas/timescaledb
265
- post_install_message:
267
+ post_install_message:
266
268
  rdoc_options: []
267
269
  require_paths:
268
270
  - lib
@@ -277,8 +279,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
277
279
  - !ruby/object:Gem::Version
278
280
  version: '0'
279
281
  requirements: []
280
- rubygems_version: 3.1.2
281
- signing_key:
282
+ rubygems_version: 3.3.7
283
+ signing_key:
282
284
  specification_version: 4
283
285
  summary: TimescaleDB helpers for Ruby ecosystem.
284
286
  test_files: []