timescaledb 0.2.2 → 0.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: b622fffc9a920a95e1c0615df32fa13e6b60d21198ebd39a6c46d27a66e11df4
4
- data.tar.gz: 9da077e24cb64120e1d235e05234299f9c2fb7cbbbd63e50a68a13bad3a23898
3
+ metadata.gz: c5f8ebd4460e965fbf9a35600630c05d476b1a4b674192cd925a3d5f948a64a1
4
+ data.tar.gz: d697ab124689a8c4f1ffc5b809a7cecd2ac07bbce84c7b9ca539dc5ae67068c9
5
5
  SHA512:
6
- metadata.gz: b4c098c20e3a99f5c8798f5ed2e29c095b5982b55defc5612128a33bf840c01d7eb457df2a5cd83fc41243be9b48184a8bc9b585800d2d7daa28085793263246
7
- data.tar.gz: b1420ac494f3a1ed6ebbd4ebd3701a9a3d33b122bbf3d6dd4b15c94159ce5299d24d0f08c68ef612b6758859b388811415593d4512fc39bfbff26876ec82efea
6
+ metadata.gz: 1070bf2f732137006d81790ac1c4b467f733edd3bb724e8773d3c9f6ecb3b5c1dc32c3da638f351589fa944a48c1b8f7a037da6a896beed1557e4fb34e2a8442
7
+ data.tar.gz: f5bc47e8c0022d079189e7ad68e2214da6760c4f420bca3c83950137dd7026985fb9239924c241b28f261546a860c744a32db154548bfc0f8871302d35109ff7
data/.ruby-version CHANGED
@@ -1 +1 @@
1
- 2.7.1
1
+ 3.1.2
data/Fastfile ADDED
@@ -0,0 +1,17 @@
1
+
2
+ # Use `fast .version_up` to rewrite the version file
3
+ Fast.shortcut :version_up do
4
+ rewrite_file('(casgn nil VERSION (str _)', 'lib/timescaledb/version.rb') do |node|
5
+ target = node.children.last.loc.expression
6
+ pieces = target.source.split('.').map(&:to_i)
7
+ pieces.reverse.each_with_index do |fragment, i|
8
+ if fragment < 9
9
+ pieces[-(i + 1)] = fragment + 1
10
+ break
11
+ else
12
+ pieces[-(i + 1)] = 0
13
+ end
14
+ end
15
+ replace(target, "'#{pieces.join('.')}'")
16
+ end
17
+ end
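For context, this shortcut rewrites the constant in `lib/timescaledb/version.rb`, incrementing the last version fragment (a fragment at 9 rolls over to 0 and carries to the next one). A sketch of the effect, with illustrative version numbers:

```ruby
# lib/timescaledb/version.rb before running `fast .version_up`
module Timescaledb
  VERSION = '0.2.3'
end

# ...and after the shortcut runs, the last fragment is bumped:
module Timescaledb
  VERSION = '0.2.4'
end
```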
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- timescaledb (0.2.1)
4
+ timescaledb (0.2.4)
5
5
  activerecord
6
6
  activesupport
7
7
  pg (~> 1.2)
@@ -33,7 +33,7 @@ GEM
33
33
  concurrent-ruby (~> 1.0)
34
34
  method_source (1.0.0)
35
35
  minitest (5.14.4)
36
- pg (1.3.1)
36
+ pg (1.4.4)
37
37
  pry (0.14.1)
38
38
  coderay (~> 1.1)
39
39
  method_source (~> 1.0)
data/Gemfile.scenic.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- timescaledb (0.1.5)
4
+ timescaledb (0.2.3)
5
5
  activerecord
6
6
  activesupport
7
7
  pg (~> 1.2)
@@ -58,7 +58,7 @@ GEM
58
58
  racc (~> 1.4)
59
59
  nokogiri (1.12.5-x86_64-darwin)
60
60
  racc (~> 1.4)
61
- pg (1.3.0)
61
+ pg (1.4.4)
62
62
  pry (0.14.1)
63
63
  coderay (~> 1.1)
64
64
  method_source (~> 1.0)
data/docs/index.md CHANGED
@@ -40,6 +40,17 @@ The [all_in_one](https://github.com/jonatas/timescaledb/tree/master/examples/all
40
40
 
41
41
  The [ranking](https://github.com/jonatas/timescaledb/tree/master/examples/ranking) example shows how to configure a Rails app and navigate all the features available.
42
42
 
43
+
44
+ ## Toolkit examples
45
+
46
+ There are also examples in the [toolkit-demo](https://github.com/jonatas/timescaledb/tree/master/examples/toolkit-demo) folder that can help you to
47
+ understand how to properly use the toolkit functions.
48
+
49
+ * [ohlc](https://github.com/jonatas/timescaledb/tree/master/examples/toolkit-demo/ohlc.rb) demonstrates the `ohlc` function, which groups data into Open, High, Low, and Close values and makes histograms available to group the data. It is very useful for financial analysis.
50
+ * While building the [LTTB tutorial](https://jonatas.github.io/timescaledb/toolkit_lttb_tutorial/) I created the [lttb](https://github.com/jonatas/timescaledb/tree/master/examples/toolkit-demo/lttb) example, a simple charting demo using the Largest Triangle Three Buckets algorithm. A [zoomable](https://github.com/jonatas/timescaledb/tree/master/examples/toolkit-demo/lttb-zoom) version, which allows you to navigate the data and zoom in while keeping the same data resolution, is also available.
51
+ * A small example showing how to process [volatility](https://github.com/jonatas/timescaledb/blob/master/examples/toolkit-demo/compare_volatility.rb) is also a good way to get familiar with the pipeline functions. A benchmark implementing the same logic in Ruby is included so you can check how it compares to the SQL implementation.
52
+
53
+
43
54
  ## Extra resources
44
55
 
45
56
  If you need extra help, please join the fantastic [timescale community](https://www.timescale.com/community)
data/docs/migrations.md CHANGED
@@ -67,3 +67,10 @@ options = {
67
67
  create_continuous_aggregate('ohlc_1m', query, **options)
68
68
  ```
69
69
 
70
+ If you need more details, please check this [blog post][1].
71
+
72
+ If you're interested in candlesticks and need to get the OHLC values, take a look
73
+ at the [toolkit ohlc](/toolkit_ohlc) function that does the same, but through a
74
+ function that can reuse candlesticks from smaller timeframes.
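As a sketch of that reuse (the `Ohlc1m` model and the `rollup` scope come from the OHLC tutorial referenced above), a 1-hour view can be defined on top of the 1-minute continuous aggregate without reprocessing the raw rows:

```ruby
# Inside a migration: build hourly candlesticks by rolling up the 1-minute view.
execute "CREATE VIEW ohlc_1h AS #{Ohlc1m.rollup(timeframe: '1 hour').to_sql}"
```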
75
+
76
+ [1]: https://ideia.me/timescale-continuous-aggregates-with-ruby
data/docs/toolkit.md CHANGED
@@ -93,7 +93,7 @@ Now, let's add the model `app/models/measurement.rb`:
93
93
 
94
94
  ```ruby
95
95
  class Measurement < ActiveRecord::Base
96
- self.primary_key = 'device_id'
96
+ self.primary_key = nil
97
97
 
98
98
  acts_as_hypertable time_column: "ts"
99
99
  end
@@ -168,12 +168,15 @@ Measurement
168
168
  The final query for the example above looks like this:
169
169
 
170
170
  ```sql
171
- SELECT device_id, sum(abs_delta) as volatility
171
+ SELECT device_id, SUM(abs_delta) AS volatility
172
172
  FROM (
173
173
  SELECT device_id,
174
- abs(val - lag(val) OVER (PARTITION BY device_id ORDER BY ts)) as abs_delta
174
+ ABS(
175
+ val - LAG(val) OVER (
176
+ PARTITION BY device_id ORDER BY ts)
177
+ ) AS abs_delta
175
178
  FROM "measurements"
176
- ) as calc_delta
179
+ ) AS calc_delta
177
180
  GROUP BY device_id
178
181
  ```
179
182
 
@@ -182,8 +185,14 @@ let's reproduce the same example using the toolkit pipelines:
182
185
 
183
186
  ```ruby
184
187
  Measurement
185
- .select("device_id, timevector(ts, val) -> sort() -> delta() -> abs() -> sum() as volatility")
186
- .group("device_id")
188
+ .select(<<-SQL).group("device_id")
189
+ device_id,
190
+ timevector(ts, val)
191
+ -> sort()
192
+ -> delta()
193
+ -> abs()
194
+ -> sum() as volatility
195
+ SQL
187
196
  ```
188
197
 
189
198
  As you can see, it's much easier to read and digest the example. Now, let's take
@@ -198,7 +207,7 @@ here to allow us to not repeat the parameters of the `timevector(ts, val)` call.
198
207
 
199
208
  ```ruby
200
209
  class Measurement < ActiveRecord::Base
201
- self.primary_key = 'device_id'
210
+ self.primary_key = nil
202
211
 
203
212
  acts_as_hypertable time_column: "ts"
204
213
 
@@ -224,8 +233,14 @@ class Measurement < ActiveRecord::Base
224
233
  time_column: "ts"
225
234
 
226
235
  scope :volatility, -> do
227
- select("device_id, timevector(#{time_column}, #{value_column}) -> sort() -> delta() -> abs() -> sum() as volatility")
228
- .group("device_id")
236
+ select(<<-SQL).group("device_id")
237
+ device_id,
238
+ timevector(#{time_column}, #{value_column})
239
+ -> sort()
240
+ -> delta()
241
+ -> abs()
242
+ -> sum() as volatility
243
+ SQL
229
244
  end
230
245
  end
231
246
  ```
@@ -248,7 +263,12 @@ class Measurement < ActiveRecord::Base
248
263
 
249
264
  scope :volatility, -> (columns=segment_by_column) do
250
265
  _scope = select([*columns,
251
- "timevector(#{time_column}, #{value_column}) -> sort() -> delta() -> abs() -> sum() as volatility"
266
+ "timevector(#{time_column},
267
+ #{value_column})
268
+ -> sort()
269
+ -> delta()
270
+ -> abs()
271
+ -> sum() as volatility"
252
272
  ].join(", "))
253
273
  _scope = _scope.group(columns) if columns
254
274
  _scope
@@ -361,7 +381,7 @@ Now, let's measure compare the time to process the volatility:
361
381
  ```ruby
362
382
  Benchmark.bm do |x|
363
383
  x.report("ruby") { pp Measurement.volatility_by_device_id }
364
- x.report("sql") { pp Measurement.volatility("device_id").map(&:attributes) }
384
+ x.report("sql") { pp Measurement.volatility("device_id").map(&:attributes) }
365
385
  end
366
386
  # user system total real
367
387
  # ruby 0.612439 0.061890 0.674329 ( 0.727590)
@@ -379,10 +399,103 @@ records over the wires. Now, moving to a remote host look the numbers:
379
399
  Now, using a remote connection between different regions,
380
400
  it looks even ~500 times slower than SQL.
381
401
 
382
- user system total real
383
- ruby 0.716321 0.041640 0.757961 ( 6.388881)
384
- sql 0.001156 0.000177 0.001333 ( 0.161270)
402
+ user system total real
403
+ ruby 0.716321 0.041640 0.757961 ( 6.388881)
404
+ sql 0.001156 0.000177 0.001333 ( 0.161270)
385
405
 
406
+ Let's recap what's time consuming here. The `find_all` call is not optimized for
407
+ this use case and consumes most of the time: it fetches all the rows and
408
+ converts every one of them into an ActiveRecord model, which carries thousands of methods.
409
+
410
+ That's very convenient, but we only need the attributes for this calculation.
411
+
412
+ Let’s optimize it by plucking an array of values grouped by device.
413
+
414
+ ```ruby
415
+ class Measurement < ActiveRecord::Base
416
+ # ...
417
+ scope :values_from_devices, -> {
418
+ ordered_values = select(:val, :device_id).order(:ts)
419
+ Hash[
420
+ from(ordered_values)
421
+ .group(:device_id)
422
+ .pluck("device_id, array_agg(val)")
423
+ ]
424
+ }
425
+ end
426
+ ```
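As a rough illustration of the shape of the result (the device ids and numbers below are invented), the scope returns a hash keyed by device with the time-ordered values:

```ruby
Measurement.values_from_devices
# => { 1 => [10.0, 10.2, 9.8, ...],
#      2 => [20.1, 20.3, 20.0, ...] }
```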
427
+
428
+ Now, let's create a method for processing volatility.
429
+
430
+ ```ruby
431
+ class Volatility
432
+ def self.process(values)
433
+ previous = nil
434
+ deltas = values.map do |value|
435
+ if previous
436
+ delta = (value - previous).abs
437
+ volatility = delta
438
+ end
439
+ previous = value
440
+ volatility
441
+ end
442
+ #deltas => [nil, 1, 1]
443
+ deltas.shift
444
+ volatility = deltas.sum
445
+ end
446
+ def self.process_values(map)
447
+ map.transform_values(&method(:process))
448
+ end
449
+ end
450
+ ```
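To make the processing step concrete, here is a tiny usage sketch with an invented input hash:

```ruby
# Values already grouped per device and ordered by time (made-up numbers).
sample = { 1 => [10.0, 11.0, 9.5], 2 => [3.0, 3.0, 4.0] }

Volatility.process(sample[1])     # => 2.5 (|11.0 - 10.0| + |9.5 - 11.0|)
Volatility.process_values(sample) # => {1 => 2.5, 2 => 1.0}
```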
451
+
452
+ Now, let's change the benchmark to expose the time for fetching and processing:
453
+
454
+
455
+ ```ruby
456
+ volatilities = nil
457
+
458
+ ActiveRecord::Base.logger = nil
459
+ Benchmark.bm do |x|
460
+ x.report("ruby") { Measurement.volatility_ruby }
461
+ x.report("sql") { Measurement.volatility_sql.map(&:attributes) }
462
+ x.report("fetch") { volatilities = Measurement.values_from_devices }
463
+ x.report("process") { Volatility.process_values(volatilities) }
464
+ end
465
+ ```
466
+
467
+ Checking the results:
468
+
469
+ user system total real
470
+ ruby 0.683654 0.036558 0.720212 ( 0.743942)
471
+ sql 0.000876 0.000096 0.000972 ( 0.054234)
472
+ fetch 0.078045 0.003221 0.081266 ( 0.116693)
473
+ process 0.067643 0.006473 0.074116 ( 0.074122)
474
+
475
+ Much better! Now the whole Ruby path (fetch + process) takes only about 200ms of real time, and the processing step alone is only ~36% slower than the SQL query.
476
+
477
+
478
+ If we try to break the SQL part down a bit more, we can run an `EXPLAIN ANALYSE` on the fetch query:
479
+
480
+ ```sql
481
+ EXPLAIN ANALYSE
482
+ SELECT device_id, array_agg(val)
483
+ FROM (
484
+ SELECT val, device_id
485
+ FROM measurements
486
+ ORDER BY ts ASC
487
+ ) subquery
488
+ GROUP BY device_id;
489
+ ```
490
+
491
+ We can check the execution time and make it clear how much time is necessary
492
+ just for the database processing, isolating the network and the ActiveRecord layer.
493
+
494
+     Planning Time: 17.761 ms
495
+     Execution Time: 36.302 ms
496
+
497
+ So, it means that of the **116ms** spent fetching the data, only **54ms** was used by the DB
498
+ and the remaining **62ms** was consumed by the network + ORM.
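If you prefer to reproduce that measurement from Ruby instead of psql, a minimal sketch (reusing the grouped query from the `values_from_devices` scope above) could look like this:

```ruby
# Build the same grouped fetch query and ask PostgreSQL for its plan and timings.
ordered_values = Measurement.select(:val, :device_id).order(:ts)
query = Measurement.from(ordered_values)
                   .group(:device_id)
                   .select("device_id, array_agg(val)")
                   .to_sql

plan = ActiveRecord::Base.connection.execute("EXPLAIN ANALYZE #{query}")
plan.each { |row| puts row["QUERY PLAN"] }
```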
386
499
 
387
500
  [1]: https://github.com/timescale/timescaledb-toolkit
388
501
  [2]: https://timescale.com
@@ -0,0 +1,315 @@
1
+ # OHLC / Candlesticks
2
+
3
+ Candlesticks are a popular tool in technical analysis, used by traders to determine potential market movements.
4
+
5
+ The toolkit also allows you to compute candlesticks with the [ohlc][1] function.
6
+
7
+ Candlesticks are a type of price chart that displays the high, low, open, and close prices of a security for a specific period. They can be useful because they can provide information about market trends and reversals. For example, if you see that the stock has been trading in a range for a while, it may be worth considering buying or selling when the price moves outside of this range. Additionally, candlesticks can be used in conjunction with other technical indicators to make trading decisions.
8
+
9
+
10
+ Let's start defining a table that stores the trades from financial market data
11
+ and then we can calculate the candlesticks with the Timescaledb Toolkit.
12
+
13
+ ## Migration
14
+
15
+ The `ticks` table is a hypertable that will partition the data into one-week
16
+ intervals and compress chunks older than a month to save storage.
17
+
18
+ ```ruby
19
+ hypertable_options = {
20
+ time_column: 'time',
21
+ chunk_time_interval: '1 week',
22
+ compress_segmentby: 'symbol',
23
+ compress_orderby: 'time',
24
+ compression_interval: '1 month'
25
+ }
26
+ create_table :ticks, hypertable: hypertable_options, id: false do |t|
27
+ t.column :time, 'timestamp with time zone'
28
+ t.string :symbol
29
+ t.decimal :price
30
+ t.integer :volume
31
+ end
32
+ ```
33
+
34
+ In the previous code block, we assume it goes inside a Rails migration or you
35
+ can embed such code into an `ActiveRecord::Base.connection.instance_exec` block.
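For reference, a standalone sketch outside a Rails migration, mirroring the demo script shipped with this release (`PG_URI` here is just a placeholder for your connection string):

```ruby
require 'timescaledb'

ActiveRecord::Base.establish_connection(ENV['PG_URI'])

ActiveRecord::Base.connection.instance_exec do
  hypertable_options = {
    time_column: 'time',
    chunk_time_interval: '1 week',
    compress_segmentby: 'symbol',
    compress_orderby: 'time',
    compression_interval: '1 month'
  }
  create_table :ticks, hypertable: hypertable_options, id: false do |t|
    t.column :time, 'timestamp with time zone'
    t.string :symbol
    t.decimal :price
    t.integer :volume
  end
end
```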
36
+
37
+ ## Defining the model
38
+
39
+ As we don't need a primary key for the table, let's set it to nil. The
40
+ `acts_as_hypertable` macro will give us several useful scopes that wrap
41
+ some of the TimescaleDB features.
42
+
43
+ The `acts_as_time_vector` macro allows us to set the default columns used
44
+ in the calculations.
45
+
46
+
47
+ ```ruby
48
+ class Tick < ActiveRecord::Base
49
+ self.primary_key = nil
50
+ acts_as_hypertable time_column: :time
51
+ acts_as_time_vector value_column: :price, segment_by: :symbol
52
+ end
53
+ ```
54
+
55
+ The candlestick will bucket the data by the `time_column`, use `price` as the default value to build the candlestick, and segment the candles by `symbol`.
56
+
57
+ If you need to generate some data for your table, please check [this post][2].
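As a quick alternative, a small seeding sketch, matching the `generate_series` insert used in `examples/toolkit-demo/ohlc.rb` from this release, creates one synthetic tick per second for a single symbol:

```ruby
# Seed the ticks table with random prices and volumes for one symbol.
ActiveRecord::Base.connection.execute(<<~SQL)
  INSERT INTO ticks
  SELECT time, 'SYMBOL', 1 + (random()*30)::int, 100*(random()*10)::int
  FROM generate_series(TIMESTAMP '2022-01-01 00:00:00',
                       TIMESTAMP '2022-02-01 00:01:00',
                       INTERVAL '1 second') AS time;
SQL
```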
58
+
59
+ ## The `ohlc` scope
60
+
61
+ When the `acts_as_time_vector` method is used in the model, it will inject
62
+ several scopes from the toolkit that give easy access to functions like
63
+ `ohlc`.
64
+
65
+ The `ohlc` scope accepts a few parameters that inherit the
66
+ configuration from the `acts_as_time_vector` declaration above.
67
+
68
+ The simplest query is:
69
+
70
+ ```ruby
71
+ Tick.ohlc(timeframe: '1m')
72
+ ```
73
+
74
+ It will generate the following SQL:
75
+
76
+ ```sql
77
+ SELECT symbol,
78
+ "time",
79
+ toolkit_experimental.open(ohlc),
80
+ toolkit_experimental.high(ohlc),
81
+ toolkit_experimental.low(ohlc),
82
+ toolkit_experimental.close(ohlc),
83
+ toolkit_experimental.open_time(ohlc),
84
+ toolkit_experimental.high_time(ohlc),
85
+ toolkit_experimental.low_time(ohlc),
86
+ toolkit_experimental.close_time(ohlc)
87
+ FROM (
88
+ SELECT time_bucket('1m', time) as time,
89
+ "ticks"."symbol",
90
+ toolkit_experimental.ohlc(time, price)
91
+ FROM "ticks" GROUP BY 1, 2 ORDER BY 1)
92
+ AS ohlc
93
+ ```
94
+
95
+ The timeframe argument can also be omitted; the default is `1 hour`.
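So, assuming the defaults described above, the two calls below are expected to produce the same candlesticks:

```ruby
Tick.ohlc                      # uses the default timeframe (1 hour)
Tick.ohlc(timeframe: '1 hour') # explicit, equivalent call
```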
96
+
97
+ You can also combine other scopes to filter data before you get the data from the candlestick:
98
+
99
+ ```ruby
100
+ Tick.yesterday
101
+ .where(symbol: "APPL")
102
+ .ohlc(timeframe: '1m')
103
+ ```
104
+
105
+ The `yesterday` scope is automatically included by the `acts_as_hypertable` macro, and it will be combined with the other where clauses.
106
+
107
+ ## Continuous aggregates
108
+
109
+ If you would like to continuously aggregate the candlesticks into a materialized
110
+ view, you can use continuous aggregates for it.
111
+
112
+ The next example shows how to create a continuous aggregate of 1-minute
113
+ candlesticks:
114
+
115
+ ```ruby
116
+ options = {
117
+ with_data: false,
118
+ refresh_policies: {
119
+ start_offset: "INTERVAL '1 month'",
120
+ end_offset: "INTERVAL '1 minute'",
121
+ schedule_interval: "INTERVAL '1 minute'"
122
+ }
123
+ }
124
+ create_continuous_aggregate('ohlc_1m', Tick.ohlc(timeframe: '1m'), **options)
125
+ ```
126
+
127
+
128
+ Note that the `create_continuous_aggregate` calls the `to_sql` method in case
129
+ the second parameter is not a string.
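In other words, you can pass either the relation or its SQL string; as a sketch, both forms below should define the same view:

```ruby
# Passing the relation: the helper calls #to_sql internally.
create_continuous_aggregate('ohlc_1m', Tick.ohlc(timeframe: '1m'), **options)

# Passing the SQL string directly is equivalent.
create_continuous_aggregate('ohlc_1m', Tick.ohlc(timeframe: '1m').to_sql, **options)
```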
130
+
131
+ ## Rollup
132
+
133
+ The rollup allows you to combine ohlc structures from smaller timeframes
134
+ to bigger timeframes without needing to reprocess all the data.
135
+
136
+ With this feature, you can roll up the ohlc multiple times, saving processing
137
+ on the server and making it easier to manage candlesticks from different time intervals.
138
+
139
+ In the previous example, we used the `.ohlc` function that already returns the
140
+ attributes for the chosen timeframe. In the SQL command it calls the
141
+ `open`, `high`, `low`, `close` functions that can access the values behind the
142
+ ohlcsummary type.
143
+
144
+ To merge the ohlc values, we need to roll up the `ohlcsummary` to a bigger timeframe and
145
+ only extract the values at the very end, when we want to read them as attributes.
146
+
147
+ Let's rebuild the structure:
148
+
149
+ ```ruby
150
+ execute "CREATE VIEW ohlc_1h AS #{ Ohlc1m.rollup(timeframe: '1 hour').to_sql}"
151
+ execute "CREATE VIEW ohlc_1d AS #{ Ohlc1h.rollup(timeframe: '1 day').to_sql}"
152
+ ```
153
+
154
+ ## Defining models for views
155
+
156
+ Note that the previous code refers to `Ohlc1m` and `Ohlc1h` as two classes that
157
+ are not defined yet. They will basically be ActiveRecord readonly models to
158
+ allow building scopes from them.
159
+
160
+ Ohlc for one minute:
161
+ ```ruby
162
+ class Ohlc1m < ActiveRecord::Base
163
+ self.table_name = 'ohlc_1m'
164
+ include Ohlc
165
+ end
166
+ ```
167
+
168
+ Ohlc for one hour is pretty much the same:
169
+ ```ruby
170
+ class Ohlc1h < ActiveRecord::Base
171
+ self.table_name = 'ohlc_1h'
172
+ include Ohlc
173
+ end
174
+ ```
175
+
176
+ We'll also have `Ohlc` as a shared concern that helps you reuse
177
+ queries in different views.
178
+
179
+ ```ruby
180
+ module Ohlc
181
+ extend ActiveSupport::Concern
182
+
183
+ included do
184
+ scope :rollup, -> (timeframe: '1h') do
185
+ select("symbol, time_bucket('#{timeframe}', time) as time,
186
+ toolkit_experimental.rollup(ohlc) as ohlc")
187
+ .group(1,2)
188
+ end
189
+
190
+ scope :attributes, -> do
191
+ select("symbol, time,
192
+ toolkit_experimental.open(ohlc),
193
+ toolkit_experimental.high(ohlc),
194
+ toolkit_experimental.low(ohlc),
195
+ toolkit_experimental.close(ohlc),
196
+ toolkit_experimental.open_time(ohlc),
197
+ toolkit_experimental.high_time(ohlc),
198
+ toolkit_experimental.low_time(ohlc),
199
+ toolkit_experimental.close_time(ohlc)")
200
+ end
201
+
202
+ # Following the attributes scope, we can define accessors in the
203
+ # model to populate from the previous scope to make it similar
204
+ # to a regular model structure.
205
+ attribute :time, :time
206
+ attribute :symbol, :string
207
+
208
+ %w[open high low close].each do |name|
209
+ attribute name, :decimal
210
+ attribute "#{name}_time", :time
211
+ end
212
+
213
+ def readonly?
214
+ true
215
+ end
216
+ end
217
+ end
218
+ ```
219
+
220
+ The `rollup` scope is the one used to regroup the data into bigger timeframes,
221
+ and the `attributes` scope gives access to the attributes of the [OpenHighLowClose][3]
222
+ type.
223
+
224
+ In this way, the views become just shortcuts, and complex SQL can also be built by
225
+ simply nesting the model scopes. For example, to roll up from a minute to a month,
226
+ you can do:
227
+
228
+ ```ruby
229
+ Ohlc1m.attributes.from(
230
+ Ohlc1m.rollup(timeframe: '1 month')
231
+ )
232
+ ```
233
+
234
+ Soon the continuous aggregates will [support nested aggregates][4] and you'll be
235
+ able to define the materialized views with steps like this:
236
+
237
+
238
+ ```ruby
239
+ Ohlc1m.attributes.from(
240
+ Ohlc1m.rollup(timeframe: '1 month').from(
241
+ Ohlc1m.rollup(timeframe: '1 week').from(
242
+ Ohlc1m.rollup(timeframe: '1 day').from(
243
+ Ohlc1m.rollup(timeframe: '1 hour')
244
+ )
245
+ )
246
+ )
247
+ )
248
+ ```
249
+
250
+ For now, composing the subqueries like this will probably be less efficient and unnecessary.
251
+ But the foundation is already here to help you in future analysis. Just to make
252
+ it clear, here is the SQL generated from the previous code:
253
+
254
+ ```sql
255
+ SELECT symbol,
256
+ time,
257
+ toolkit_experimental.open(ohlc),
258
+ toolkit_experimental.high(ohlc),
259
+ toolkit_experimental.low(ohlc),
260
+ toolkit_experimental.close(ohlc),
261
+ toolkit_experimental.open_time(ohlc),
262
+ toolkit_experimental.high_time(ohlc),
263
+ toolkit_experimental.low_time(ohlc),
264
+ toolkit_experimental.close_time(ohlc)
265
+ FROM (
266
+ SELECT symbol,
267
+ time_bucket('1 month', time) as time,
268
+ toolkit_experimental.rollup(ohlc) as ohlc
269
+ FROM (
270
+ SELECT symbol,
271
+ time_bucket('1 week', time) as time,
272
+ toolkit_experimental.rollup(ohlc) as ohlc
273
+ FROM (
274
+ SELECT symbol,
275
+ time_bucket('1 day', time) as time,
276
+ toolkit_experimental.rollup(ohlc) as ohlc
277
+ FROM (
278
+ SELECT symbol,
279
+ time_bucket('1 hour', time) as time,
280
+ toolkit_experimental.rollup(ohlc) as ohlc
281
+ FROM "ohlc_1m"
282
+ GROUP BY 1, 2
283
+ ) subquery
284
+ GROUP BY 1, 2
285
+ ) subquery
286
+ GROUP BY 1, 2
287
+ ) subquery
288
+ GROUP BY 1, 2
289
+ ) subquery
290
+ ```
291
+
292
+ You can also define more scopes that will be useful depending on what you are
293
+ working on. Example:
294
+
295
+ ```ruby
296
+ scope :yesterday, -> { where("DATE(#{time_column}) = ?", Date.yesterday.in_time_zone.to_date) }
297
+ ```
298
+
299
+ And then, just combine the scopes:
300
+
301
+ ```ruby
302
+ Ohlc1m.yesterday.attributes
303
+ ```
304
+ I hope you find this tutorial interesting. You can also check the
305
+ `ohlc.rb` file in the [examples/toolkit-demo][5] folder.
306
+
307
+ If you have any questions or concerns, feel free to reach out to me ([@jonatasdp][7]) in the [Timescale community][6] or tag timescaledb in your StackOverflow question.
308
+
309
+ [1]: https://docs.timescale.com/api/latest/hyperfunctions/financial-analysis/ohlc/
310
+ [2]: https://ideia.me/timescale-continuous-aggregates-with-ruby
311
+ [3]: https://github.com/timescale/timescaledb-toolkit/blob/cbbca7b2e69968e585c845924e7ed7aff1cea20a/extension/src/ohlc.rs#L20-L24
312
+ [4]: https://github.com/timescale/timescaledb/pull/4668
313
+ [5]: https://github.com/jonatas/timescaledb/tree/master/examples/toolkit-demo
314
+ [6]: https://timescale.com/community
315
+ [7]: https://twitter.com/jonatasdp
@@ -1,6 +1,14 @@
1
- require 'bundler/setup'
2
- require 'timescaledb'
1
+ # ruby compare_volatility.rb postgres://user:pass@host:port/db_name
2
+ require 'bundler/inline' #require only what you need
3
3
 
4
+ gemfile(true) do
5
+ gem 'timescaledb', path: '../..'
6
+ gem 'pry'
7
+ end
8
+
9
+ # TODO: get the volatility using the window function with plain postgresql
10
+
11
+ ActiveRecord::Base.establish_connection ARGV.last
4
12
 
5
13
  # Compare volatility processing in Ruby vs SQL.
6
14
  class Measurement < ActiveRecord::Base
@@ -25,9 +33,36 @@ class Measurement < ActiveRecord::Base
25
33
  end
26
34
  volatility
27
35
  }
36
+ scope :values_from_devices, -> {
37
+ ordered_values = select(:val, :device_id).order(:ts)
38
+ Hash[
39
+ from(ordered_values)
40
+ .group(:device_id)
41
+ .pluck("device_id, array_agg(val)")
42
+ ]
43
+ }
44
+ end
45
+
46
+ class Volatility
47
+ def self.process(values)
48
+ previous = nil
49
+ deltas = values.map do |value|
50
+ if previous
51
+ delta = (value - previous).abs
52
+ volatility = delta
53
+ end
54
+ previous = value
55
+ volatility
56
+ end
57
+ #deltas => [nil, 1, 1]
58
+ deltas.shift
59
+ volatility = deltas.sum
60
+ end
61
+ def self.process_values(map)
62
+ map.transform_values(&method(:process))
63
+ end
28
64
  end
29
65
 
30
- ActiveRecord::Base.establish_connection ENV["PG_URI"]
31
66
  ActiveRecord::Base.connection.add_toolkit_to_search_path!
32
67
 
33
68
 
@@ -58,7 +93,12 @@ if Measurement.count.zero?
58
93
  SQL
59
94
  end
60
95
 
96
+
97
+ volatilities = nil
98
+ #ActiveRecord::Base.logger = nil
61
99
  Benchmark.bm do |x|
62
- x.report("ruby") { Measurement.volatility_ruby }
63
100
  x.report("sql") { Measurement.volatility_sql.map(&:attributes) }
101
+ x.report("ruby") { Measurement.volatility_ruby }
102
+ x.report("fetch") { volatilities = Measurement.values_from_devices }
103
+ x.report("process") { Volatility.process_values(volatilities) }
64
104
  end
@@ -0,0 +1,175 @@
1
+ # ruby ohlc.rb postgres://user:pass@host:port/db_name
2
+ # @see https://jonatas.github.io/timescaledb/ohlc_tutorial
3
+
4
+ require 'bundler/inline' #require only what you need
5
+
6
+ gemfile(true) do
7
+ gem 'timescaledb', path: '../..'
8
+ gem 'pry'
9
+ end
10
+
11
+ ActiveRecord::Base.establish_connection ARGV.last
12
+
13
+ # Compare ohlc processing in Ruby vs SQL.
14
+ class Tick < ActiveRecord::Base
15
+ acts_as_hypertable time_column: "time"
16
+ acts_as_time_vector segment_by: "symbol", value_column: "price"
17
+ end
18
+ require "active_support/concern"
19
+
20
+ module Ohlc
21
+ extend ActiveSupport::Concern
22
+
23
+ included do
24
+ %w[open high low close].each do |name|
25
+ attribute name, :decimal
26
+ attribute "#{name}_time", :time
27
+ end
28
+
29
+
30
+ scope :attributes, -> do
31
+ select("symbol, time,
32
+ toolkit_experimental.open(ohlc),
33
+ toolkit_experimental.high(ohlc),
34
+ toolkit_experimental.low(ohlc),
35
+ toolkit_experimental.close(ohlc),
36
+ toolkit_experimental.open_time(ohlc),
37
+ toolkit_experimental.high_time(ohlc),
38
+ toolkit_experimental.low_time(ohlc),
39
+ toolkit_experimental.close_time(ohlc)")
40
+ end
41
+
42
+ scope :rollup, -> (timeframe: '1h') do
43
+ select("symbol, time_bucket('#{timeframe}', time) as time,
44
+ toolkit_experimental.rollup(ohlc) as ohlc")
45
+ .group(1,2)
46
+ end
47
+
48
+ def readonly?
49
+ true
50
+ end
51
+ end
52
+
53
+ class_methods do
54
+ end
55
+ end
56
+
57
+ class Ohlc1m < ActiveRecord::Base
58
+ self.table_name = 'ohlc_1m'
59
+ include Ohlc
60
+ end
61
+
62
+ class Ohlc1h < ActiveRecord::Base
63
+ self.table_name = 'ohlc_1h'
64
+ include Ohlc
65
+ end
66
+
67
+ class Ohlc1d < ActiveRecord::Base
68
+ self.table_name = 'ohlc_1d'
69
+ include Ohlc
70
+ end
71
+ =begin
72
+ scope :ohlc_ruby, -> (
73
+ timeframe: 1.hour,
74
+ segment_by: segment_by_column,
75
+ time: time_column,
76
+ value: value_column) {
77
+ ohlcs = Hash.new() {|hash, key| hash[key] = [] }
78
+
79
+ key = tick.send(segment_by)
80
+ candlestick = ohlcs[key].last
81
+ if candlestick.nil? || candlestick.time + timeframe > tick.time
82
+ ohlcs[key] << Candlestick.new(time $, price)
83
+ end
84
+ find_all do |tick|
85
+ symbol = tick.symbol
86
+
87
+ if previous[symbol]
88
+ delta = (tick.price - previous[symbol]).abs
89
+ volatility[symbol] += delta
90
+ end
91
+ previous[symbol] = tick.price
92
+ end
93
+ volatility
94
+ }
95
+ =end
96
+
97
+ ActiveRecord::Base.connection.add_toolkit_to_search_path!
98
+
99
+
100
+ ActiveRecord::Base.connection.instance_exec do
101
+ ActiveRecord::Base.logger = Logger.new(STDOUT)
102
+
103
+ unless Tick.table_exists?
104
+ hypertable_options = {
105
+ time_column: 'time',
106
+ chunk_time_interval: '1 week',
107
+ compress_segmentby: 'symbol',
108
+ compress_orderby: 'time',
109
+ compression_interval: '1 month'
110
+ }
111
+ create_table :ticks, hypertable: hypertable_options, id: false do |t|
112
+ t.column :time , 'timestamp with time zone'
113
+ t.string :symbol
114
+ t.decimal :price
115
+ t.integer :volume
116
+ end
117
+
118
+ options = {
119
+ with_data: false,
120
+ refresh_policies: {
121
+ start_offset: "INTERVAL '1 month'",
122
+ end_offset: "INTERVAL '1 minute'",
123
+ schedule_interval: "INTERVAL '1 minute'"
124
+ }
125
+ }
126
+ create_continuous_aggregate('ohlc_1m', Tick._ohlc(timeframe: '1m'), **options)
127
+
128
+ execute "CREATE VIEW ohlc_1h AS #{ Ohlc1m.rollup(timeframe: '1 hour').to_sql}"
129
+ execute "CREATE VIEW ohlc_1d AS #{ Ohlc1h.rollup(timeframe: '1 day').to_sql}"
130
+ end
131
+ end
132
+
133
+ if Tick.count.zero?
134
+ ActiveRecord::Base.connection.execute(<<~SQL)
135
+ INSERT INTO ticks
136
+ SELECT time, 'SYMBOL', 1 + (random()*30)::int, 100*(random()*10)::int
137
+ FROM generate_series(TIMESTAMP '2022-01-01 00:00:00',
138
+ TIMESTAMP '2022-02-01 00:01:00',
139
+ INTERVAL '1 second') AS time;
140
+ SQL
141
+ end
142
+
143
+
144
+ # Fetch attributes
145
+ Ohlc1m.attributes
146
+
147
+ # Rollup demo
148
+
149
+ # Attributes from rollup
150
+ Ohlc1m.attributes.from(Ohlc1m.rollup(timeframe: '1 day'))
151
+
152
+
153
+ # Nesting several levels
154
+ Ohlc1m.attributes.from(
155
+ Ohlc1m.rollup(timeframe: '1 week').from(
156
+ Ohlc1m.rollup(timeframe: '1 day')
157
+ )
158
+ )
159
+ Ohlc1m.attributes.from(
160
+ Ohlc1m.rollup(timeframe: '1 month').from(
161
+ Ohlc1m.rollup(timeframe: '1 week').from(
162
+ Ohlc1m.rollup(timeframe: '1 day')
163
+ )
164
+ )
165
+ )
166
+
167
+ Pry.start
168
+
169
+ =begin
170
+ TODO: implement the ohlc_ruby
171
+ Benchmark.bm do |x|
172
+ x.report("ruby") { Tick.ohlc_ruby }
173
+ x.report("sql") { Tick.ohlc.map(&:attributes) }
174
+ end
175
+ =end
@@ -1,7 +1,7 @@
1
1
  module Timescaledb
2
2
  class Dimension < ActiveRecord::Base
3
3
  self.table_name = "timescaledb_information.dimensions"
4
- attribute :time_interval, :interval
4
+ # attribute :time_interval, :interval
5
5
  end
6
6
  Dimensions = Dimension
7
7
  end
@@ -3,6 +3,7 @@ module Timescaledb
3
3
  self.table_name = "timescaledb_information.job_stats"
4
4
 
5
5
  belongs_to :job
6
+ # attribute :last_run_duration, :interval
6
7
 
7
8
  scope :success, -> { where(last_run_status: "Success") }
8
9
  scope :scheduled, -> { where(job_status: "Scheduled") }
@@ -80,7 +80,7 @@ module Timescaledb
80
80
  WITH #{"NO" unless options[:with_data]} DATA;
81
81
  SQL
82
82
 
83
- create_continuous_aggregate_policy(table_name, options[:refresh_policies] || {})
83
+ create_continuous_aggregate_policy(table_name, **(options[:refresh_policies] || {}))
84
84
  end
85
85
 
86
86
 
@@ -6,15 +6,11 @@ module Timescaledb
6
6
  def tables(stream)
7
7
  super # This will call #table for each table in the database
8
8
  views(stream) unless defined?(Scenic) # Don't call this twice if we're using Scenic
9
- end
10
9
 
11
- def table(table_name, stream)
12
- super(table_name, stream)
13
- if Timescaledb::Hypertable.table_exists? &&
14
- (hypertable = Timescaledb::Hypertable.find_by(hypertable_name: table_name))
15
- timescale_hypertable(hypertable, stream)
16
- timescale_retention_policy(hypertable, stream)
17
- end
10
+ return unless Timescaledb::Hypertable.table_exists?
11
+
12
+ timescale_hypertables(stream)
13
+ timescale_retention_policies(stream)
18
14
  end
19
15
 
20
16
  def views(stream)
@@ -24,23 +20,37 @@ module Timescaledb
24
20
  super if defined?(super)
25
21
  end
26
22
 
23
+ def timescale_hypertables(stream)
24
+ stream.puts # Insert a blank line above the hypertable definitions, for readability
25
+
26
+ sorted_hypertables.each do |hypertable|
27
+ timescale_hypertable(hypertable, stream)
28
+ end
29
+ end
30
+
31
+ def timescale_retention_policies(stream)
32
+ stream.puts # Insert a blank line above the retention policies, for readability
33
+
34
+ sorted_hypertables.each do |hypertable|
35
+ timescale_retention_policy(hypertable, stream)
36
+ end
37
+ end
38
+
27
39
  private
28
40
 
29
41
  def timescale_hypertable(hypertable, stream)
30
- dim = hypertable.dimensions
42
+ dim = hypertable.main_dimension
31
43
  extra_settings = {
32
44
  time_column: "#{dim.column_name}",
33
45
  chunk_time_interval: "#{dim.time_interval.inspect}"
34
46
  }.merge(timescale_compression_settings_for(hypertable)).map {|k, v| %Q[#{k}: "#{v}"]}.join(", ")
35
47
 
36
48
  stream.puts %Q[ create_hypertable "#{hypertable.hypertable_name}", #{extra_settings}]
37
- stream.puts
38
49
  end
39
50
 
40
51
  def timescale_retention_policy(hypertable, stream)
41
52
  hypertable.jobs.where(proc_name: "policy_retention").each do |job|
42
53
  stream.puts %Q[ create_retention_policy "#{job.hypertable_name}", interval: "#{job.config["drop_after"]}"]
43
- stream.puts
44
54
  end
45
55
  end
46
56
 
@@ -85,6 +95,9 @@ module Timescaledb
85
95
 
86
96
  "INTERVAL '#{value}'"
87
97
  end
98
+ def sorted_hypertables
99
+ @sorted_hypertables ||= Timescaledb::Hypertable.order(:hypertable_name).to_a
100
+ end
88
101
  end
89
102
  end
90
103
 
@@ -13,8 +13,9 @@ module Timescaledb
13
13
  end
14
14
 
15
15
  def time_column
16
- respond_to?(:time_column) && super || time_vector_options[:time_column]
16
+ respond_to?(:time_column) && super || time_vector_options[:time_column]
17
17
  end
18
+
18
19
  def segment_by_column
19
20
  time_vector_options[:segment_by]
20
21
  end
@@ -25,8 +26,7 @@ module Timescaledb
25
26
  scope :volatility, -> (segment_by: segment_by_column) do
26
27
  select([*segment_by,
27
28
  "timevector(#{time_column}, #{value_column}) -> sort() -> delta() -> abs() -> sum() as volatility"
28
- ].join(", "))
29
- .group(segment_by)
29
+ ].join(", ")).group(segment_by)
30
30
  end
31
31
 
32
32
  scope :time_weight, -> (segment_by: segment_by_column) do
@@ -40,8 +40,7 @@ module Timescaledb
40
40
  lttb_query = <<~SQL
41
41
  WITH x AS ( #{select(*segment_by, time_column, value_column).to_sql})
42
42
  SELECT #{"x.#{segment_by}," if segment_by}
43
- (lttb( x.#{time_column}, x.#{value_column}, #{threshold})
44
- -> toolkit_experimental.unnest()).*
43
+ (lttb( x.#{time_column}, x.#{value_column}, #{threshold}) -> unnest()).*
45
44
  FROM x
46
45
  #{"GROUP BY device_id" if segment_by}
47
46
  SQL
@@ -58,6 +57,38 @@ module Timescaledb
58
57
  downsampled.map{|e|[ e[time_column],e[value_column]]}
59
58
  end
60
59
  end
60
+
61
+
62
+ scope :_ohlc, -> (timeframe: '1h',
63
+ segment_by: segment_by_column,
64
+ time: time_column,
65
+ value: value_column) do
66
+
67
+ select( "time_bucket('#{timeframe}', #{time}) as #{time}",
68
+ *segment_by,
69
+ "toolkit_experimental.ohlc(#{time}, #{value})")
70
+ .order(1)
71
+ .group(*(segment_by ? [1,2] : 1))
72
+ end
73
+
74
+ scope :ohlc, -> (timeframe: '1h',
75
+ segment_by: segment_by_column,
76
+ time: time_column,
77
+ value: value_column) do
78
+
79
+ raw = _ohlc(timeframe: timeframe, segment_by: segment_by, time: time, value: value)
80
+ unscoped
81
+ .from("(#{raw.to_sql}) AS ohlc")
82
+ .select(*segment_by, time,
83
+ "toolkit_experimental.open(ohlc),
84
+ toolkit_experimental.high(ohlc),
85
+ toolkit_experimental.low(ohlc),
86
+ toolkit_experimental.close(ohlc),
87
+ toolkit_experimental.open_time(ohlc),
88
+ toolkit_experimental.high_time(ohlc),
89
+ toolkit_experimental.low_time(ohlc),
90
+ toolkit_experimental.close_time(ohlc)")
91
+ end
61
92
  end
62
93
  end
63
94
  end
@@ -1,3 +1,3 @@
1
1
  module Timescaledb
2
- VERSION = '0.2.2'
2
+ VERSION = '0.2.4'
3
3
  end
data/mkdocs.yml CHANGED
@@ -29,5 +29,6 @@ nav:
29
29
  - Toolkit Integration: toolkit.md
30
30
  - Toolkit LTTB Tutorial: toolkit_lttb_tutorial.md
31
31
  - Zooming with High Resolution: toolkit_lttb_zoom.md
32
+ - Toolkit OHLC: toolkit_ohlc.md
32
33
  - Command Line: command_line.md
33
34
  - Videos: videos.md
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: timescaledb
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.2
4
+ version: 0.2.4
5
5
  platform: ruby
6
6
  authors:
7
7
  - Jônatas Davi Paganini
8
- autorequire:
8
+ autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2022-10-14 00:00:00.000000000 Z
11
+ date: 2022-12-16 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: pg
@@ -150,6 +150,7 @@ files:
150
150
  - ".tool-versions"
151
151
  - ".travis.yml"
152
152
  - CODE_OF_CONDUCT.md
153
+ - Fastfile
153
154
  - Gemfile
154
155
  - Gemfile.lock
155
156
  - Gemfile.scenic
@@ -170,6 +171,7 @@ files:
170
171
  - docs/toolkit.md
171
172
  - docs/toolkit_lttb_tutorial.md
172
173
  - docs/toolkit_lttb_zoom.md
174
+ - docs/toolkit_ohlc.md
173
175
  - docs/videos.md
174
176
  - examples/all_in_one/all_in_one.rb
175
177
  - examples/all_in_one/benchmark_comparison.rb
@@ -233,6 +235,7 @@ files:
233
235
  - examples/toolkit-demo/lttb/lttb_sinatra.rb
234
236
  - examples/toolkit-demo/lttb/lttb_test.rb
235
237
  - examples/toolkit-demo/lttb/views/index.erb
238
+ - examples/toolkit-demo/ohlc.rb
236
239
  - lib/timescaledb.rb
237
240
  - lib/timescaledb/acts_as_hypertable.rb
238
241
  - lib/timescaledb/acts_as_hypertable/core.rb
@@ -261,7 +264,7 @@ licenses:
261
264
  metadata:
262
265
  allowed_push_host: https://rubygems.org
263
266
  homepage_uri: https://github.com/jonatas/timescaledb
264
- post_install_message:
267
+ post_install_message:
265
268
  rdoc_options: []
266
269
  require_paths:
267
270
  - lib
@@ -276,8 +279,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
276
279
  - !ruby/object:Gem::Version
277
280
  version: '0'
278
281
  requirements: []
279
- rubygems_version: 3.1.2
280
- signing_key:
282
+ rubygems_version: 3.3.7
283
+ signing_key:
281
284
  specification_version: 4
282
285
  summary: TimescaleDB helpers for Ruby ecosystem.
283
286
  test_files: []