red_amber 0.1.3 → 0.1.6
- checksums.yaml +4 -4
- data/.rubocop.yml +31 -7
- data/CHANGELOG.md +214 -10
- data/Gemfile +4 -0
- data/README.md +117 -342
- data/benchmark/csv_load_penguins.yml +15 -0
- data/benchmark/drop_nil.yml +11 -0
- data/doc/DataFrame.md +854 -0
- data/doc/Vector.md +449 -0
- data/doc/image/arrow_table_new.png +0 -0
- data/doc/image/dataframe/assign.png +0 -0
- data/doc/image/dataframe/drop.png +0 -0
- data/doc/image/dataframe/pick.png +0 -0
- data/doc/image/dataframe/remove.png +0 -0
- data/doc/image/dataframe/rename.png +0 -0
- data/doc/image/dataframe/slice.png +0 -0
- data/doc/image/dataframe_model.png +0 -0
- data/doc/image/example_in_red_arrow.png +0 -0
- data/doc/image/tdr.png +0 -0
- data/doc/image/tdr_and_table.png +0 -0
- data/doc/image/tidy_data_in_TDR.png +0 -0
- data/doc/image/vector/binary_element_wise.png +0 -0
- data/doc/image/vector/unary_aggregation.png +0 -0
- data/doc/image/vector/unary_aggregation_w_option.png +0 -0
- data/doc/image/vector/unary_element_wise.png +0 -0
- data/doc/tdr.md +56 -0
- data/doc/tdr_ja.md +56 -0
- data/lib/red-amber.rb +27 -0
- data/lib/red_amber/data_frame.rb +91 -37
- data/lib/red_amber/{data_frame_output.rb → data_frame_displayable.rb} +49 -41
- data/lib/red_amber/data_frame_indexable.rb +38 -0
- data/lib/red_amber/data_frame_observation_operation.rb +11 -0
- data/lib/red_amber/data_frame_selectable.rb +155 -48
- data/lib/red_amber/data_frame_variable_operation.rb +137 -0
- data/lib/red_amber/helper.rb +61 -0
- data/lib/red_amber/vector.rb +69 -16
- data/lib/red_amber/vector_functions.rb +80 -45
- data/lib/red_amber/vector_selectable.rb +124 -0
- data/lib/red_amber/vector_updatable.rb +104 -0
- data/lib/red_amber/version.rb +1 -1
- data/lib/red_amber.rb +1 -16
- data/red_amber.gemspec +3 -6
- metadata +38 -9
data/README.md
CHANGED
@@ -1,20 +1,29 @@
|
|
1
1
|
# RedAmber
|
2
2
|
|
3
|
-
A simple dataframe library for Ruby (experimental)
|
3
|
+
A simple dataframe library for Ruby (experimental).
|
4
4
|
|
5
5
|
- Powered by [Red Arrow](https://github.com/apache/arrow/tree/master/ruby/red-arrow)
|
6
|
-
-
|
6
|
+
- Inspired by the dataframe library [Rover-df](https://github.com/ankane/rover)
|
7
7
|
|
8
8
|
## Requirements
|
9
9
|
|
10
10
|
```ruby
|
11
|
-
gem 'red-arrow', '>=
|
12
|
-
gem 'red-parquet', '>=
|
11
|
+
gem 'red-arrow', '>= 8.0.0'
|
12
|
+
gem 'red-parquet', '>= 8.0.0' # if you use IO from/to parquet
|
13
13
|
gem 'rover-df', '~> 0.3.0' # if you use IO from/to Rover::DataFrame
|
14
14
|
```
|
15
15
|
|
16
16
|
## Installation
|
17
17
|
|
18
|
+
Install requirements before you install Red Amber.
|
19
|
+
|
20
|
+
- Apache Arrow GLib (>= 8.0.0)
|
21
|
+
- Apache Parquet GLib (>= 8.0.0)
|
22
|
+
|
23
|
+
See [Apache Arrow install document](https://arrow.apache.org/install/).
|
24
|
+
|
25
|
+
Minimum installation example for the latest Ubuntu is in the ['Prepare the Apache Arrow' section in ci test](https://github.com/heronshoes/red_amber/blob/master/.github/workflows/test.yml) of Red Amber.
|
26
|
+
|
18
27
|
Add this line to your Gemfile:
|
19
28
|
|
20
29
|
```ruby
|
@@ -23,134 +32,66 @@ gem 'red_amber'
|
|
23
32
|
|
24
33
|
And then execute:
|
25
34
|
|
26
|
-
|
35
|
+
```shell
|
36
|
+
bundle install
|
37
|
+
```
|
27
38
|
|
28
39
|
Or install it yourself as:
|
29
40
|
|
30
|
-
|
31
|
-
|
32
|
-
|
33
|
-
|
34
|
-
### Constructors and saving
|
35
|
-
|
36
|
-
- [x] `new` from a columnar Hash
|
37
|
-
- `RedAmber::DataFrame.new(x: [1, 2, 3])`
|
38
|
-
|
39
|
-
- [x] `new` from a schema (by Hash) and rows (by Array)
|
40
|
-
- `RedAmber::DataFrame.new({:x=>:uint8}, [[1], [2], [3]])`
|
41
|
-
|
42
|
-
- [x] `new` from an Arrow::Table
|
43
|
-
- `RedAmber::DataFrame.new(Arrow::Table.new(x: [1, 2, 3]))`
|
44
|
-
|
45
|
-
- [x] `new` from a Rover::DataFrame
|
46
|
-
- `RedAmber::DataFrame.new(Rover::DataFrame.new(x: [1, 2, 3]))`
|
47
|
-
|
48
|
-
- [x] `load` (class method)
|
49
|
-
|
50
|
-
- [x] from a [`.arrow`, `.arrows`, `.csv`, `.csv.gz`, `.tsv`] file
|
51
|
-
- `RedAmber::DataFrame.load("test/entity/with_header.csv")`
|
52
|
-
|
53
|
-
- [x] from a string buffer
|
54
|
-
|
55
|
-
- [x] from a URI
|
56
|
-
- `RedAmber::DataFrame.load(URI("https://github.com/heronshoes/red_amber/blob/master/test/entity/with_header.csv"))`
|
57
|
-
|
58
|
-
- [x] from a Parquet file
|
59
|
-
|
60
|
-
`red-parquet` gem is required.
|
61
|
-
|
62
|
-
```ruby
|
63
|
-
require 'parquet'
|
64
|
-
dataframe = RedAmber::DataFrame.load("file.parquet")
|
65
|
-
```
|
66
|
-
|
67
|
-
- [x] `save` (instance method)
|
68
|
-
|
69
|
-
- [x] to a [`.arrow`, `.arrows`, `.csv`, `.csv.gz`, `.tsv`] file
|
70
|
-
|
71
|
-
- [x] to a string buffer
|
72
|
-
|
73
|
-
- [x] to a URI
|
74
|
-
|
75
|
-
- [x] to a Parquet file
|
76
|
-
|
77
|
-
`red-parquet` gem is required.
|
78
|
-
|
79
|
-
```ruby
|
80
|
-
require 'parquet'
|
81
|
-
dataframe.save("file.parquet")
|
82
|
-
```
|
83
|
-
|
84
|
-
### Properties
|
85
|
-
|
86
|
-
- [x] `table`
|
87
|
-
|
88
|
-
Reader of Arrow::Table object inside.
|
89
|
-
|
90
|
-
- [x] `n_rows`, `nrow`, `size`, `length`
|
91
|
-
|
92
|
-
Returns num of rows (data size).
|
93
|
-
|
94
|
-
- [x] `n_columns`, `ncol`, `width`
|
95
|
-
|
96
|
-
Returns num of columns (num of vectors).
|
97
|
-
|
98
|
-
- [x] `shape`
|
99
|
-
|
100
|
-
Returns shape in an Array[n_rows, n_cols].
|
101
|
-
|
102
|
-
- [x] `column_names`, `keys`
|
103
|
-
|
104
|
-
Returns num of column names by an Array.
|
105
|
-
|
106
|
-
- [x] `types`
|
107
|
-
|
108
|
-
Returns types of columns by an Array of Symbols.
|
109
|
-
|
110
|
-
- [x] `data_types`
|
111
|
-
|
112
|
-
Returns types of columns by an Array of `Arrow::DataType`.
|
113
|
-
|
114
|
-
- [x] `vectors`
|
115
|
-
|
116
|
-
Returns an Array of Vectors.
|
117
|
-
|
118
|
-
- [x] `to_h`
|
119
|
-
|
120
|
-
Returns column-oriented data in a Hash.
|
121
|
-
|
122
|
-
- [x] `to_a`, `raw_records`
|
123
|
-
|
124
|
-
Returns an array of row-oriented data without header. If you need a column-oriented full array, use `.to_h.to_a`
|
125
|
-
|
126
|
-
- [x] `schema`
|
127
|
-
|
128
|
-
Returns column name and data type in a Hash.
|
41
|
+
```shell
|
42
|
+
gem install red_amber
|
43
|
+
```
|
129
44
|
|
130
|
-
|
131
|
-
|
132
|
-
- [x] `empty?`
|
45
|
+
(From v0.1.6)
|
133
46
|
|
134
|
-
|
47
|
+
RedAmber uses TDR mode for `#inspect` and `#to_iruby` by default. If you prefer Table mode, please set the environment variable
|
48
|
+
`RED_AMBER_OUTPUT_MODE` to `"table"`. See the [TDR section](#TDR) for details.
|
135
49
|
|
136
|
-
|
50
|
+
## `RedAmber::DataFrame`
|
137
51
|
|
138
|
-
-
|
52
|
+
Represents a set of data in a 2D shape. The underlying entity is a Red Arrow Table object.
|
139
53
|
|
140
|
-
|
54
|
+
```ruby
|
55
|
+
require 'red_amber' # require 'red-amber' is also OK.
|
56
|
+
require 'datasets-arrow'
|
141
57
|
|
142
|
-
|
58
|
+
arrow = Datasets::Penguins.new.to_arrow
|
59
|
+
penguins = RedAmber::DataFrame.new(arrow)
|
60
|
+
penguins.table
|
143
61
|
|
144
|
-
|
62
|
+
# =>
|
63
|
+
#<Arrow::Table:0x111271098 ptr=0x7f9118b3e0b0>
|
64
|
+
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
|
65
|
+
0 Adelie Torgersen 39.100000 18.700000 181 3750 male 2007
|
66
|
+
1 Adelie Torgersen 39.500000 17.400000 186 3800 female 2007
|
67
|
+
2 Adelie Torgersen 40.300000 18.000000 195 3250 female 2007
|
68
|
+
3 Adelie Torgersen (null) (null) (null) (null) (null) 2007
|
69
|
+
4 Adelie Torgersen 36.700000 19.300000 193 3450 female 2007
|
70
|
+
5 Adelie Torgersen 39.300000 20.600000 190 3650 male 2007
|
71
|
+
6 Adelie Torgersen 38.900000 17.800000 181 3625 female 2007
|
72
|
+
7 Adelie Torgersen 39.200000 19.600000 195 4675 male 2007
|
73
|
+
8 Adelie Torgersen 34.100000 18.100000 193 3475 (null) 2007
|
74
|
+
9 Adelie Torgersen 42.000000 20.200000 190 4250 (null) 2007
|
75
|
+
...
|
76
|
+
334 Gentoo Biscoe 46.200000 14.100000 217 4375 female 2009
|
77
|
+
335 Gentoo Biscoe 55.100000 16.000000 230 5850 male 2009
|
78
|
+
336 Gentoo Biscoe 44.500000 15.700000 217 4875 (null) 2009
|
79
|
+
337 Gentoo Biscoe 48.800000 16.200000 222 6000 male 2009
|
80
|
+
338 Gentoo Biscoe 47.200000 13.700000 214 4925 female 2009
|
81
|
+
339 Gentoo Biscoe (null) (null) (null) (null) (null) 2009
|
82
|
+
340 Gentoo Biscoe 46.800000 14.300000 215 4850 female 2009
|
83
|
+
341 Gentoo Biscoe 50.400000 15.700000 222 5750 male 2009
|
84
|
+
342 Gentoo Biscoe 45.200000 14.800000 212 5200 female 2009
|
85
|
+
343 Gentoo Biscoe 49.900000 16.100000 213 5400 male 2009
|
86
|
+
```
|
145
87
|
|
146
|
-
|
88
|
+
By default, RedAmber displays a DataFrame in a compact transposed style. This unfamiliar style (TDR) is designed for
|
89
|
+
exploratory data processing. It keeps Vectors as row vectors, shows keys and types at a glance, shows levels
|
90
|
+
for the 'factor-like' variables and shows the number of abnormal values like NaN and nil.
|
147
91
|
|
148
92
|
```ruby
|
149
|
-
|
150
|
-
require 'datasets-arrow'
|
93
|
+
penguins
|
151
94
|
|
152
|
-
penguins = Datasets::Penguins.new.to_arrow
|
153
|
-
RedAmber::DataFrame.new(penguins)
|
154
95
|
# =>
|
155
96
|
RedAmber::DataFrame : 344 x 8 Vectors
|
156
97
|
Vectors : 5 numeric, 3 strings
|
@@ -165,257 +106,91 @@ Vectors : 5 numeric, 3 strings
|
|
165
106
|
8 :year uint16 3 {2007=>110, 2008=>114, 2009=>120}
|
166
107
|
```
|
167
108
|
|
168
|
-
|
169
|
-
|
170
|
-
|
171
|
-
### Selecting
|
109
|
+
### DataFrame model
|
110
|
+
![dataframe model of RedAmber](doc/image/dataframe_model.png)
|
172
111
|
|
173
|
-
|
174
|
-
- Key in a Symbol: `df[:symbol]`
|
175
|
-
- Key in a String: `df["string"]`
|
176
|
-
- Keys in an Array: `df[:symbol1, "string", :symbol2]`
|
177
|
-
- Keys in indeces: `df[df.keys[0]`, `df[df.keys[1,2]]`, `df[df.keys[1..]]`
|
178
|
-
- Keys in a Range:
|
179
|
-
A end-less Range can be used to represent keys.
|
112
|
+
For example, `DataFrame#pick` accepts keys as an argument and returns a sub DataFrame.
|
180
113
|
|
181
114
|
```ruby
|
182
|
-
|
183
|
-
df = RedAmber::DataFrame.new(hash)
|
184
|
-
df[:b..:c, "a"]
|
115
|
+
df = penguins.pick(:body_mass_g)
|
185
116
|
# =>
|
186
|
-
RedAmber::DataFrame :
|
187
|
-
|
188
|
-
# key
|
189
|
-
1 :
|
190
|
-
2 :c double 3 [1.0, 2.0, 3.0]
|
191
|
-
3 :a uint8 3 [1, 2, 3]
|
117
|
+
#<RedAmber::DataFrame : 344 x 1 Vector, 0x000000000000fa14>
|
118
|
+
Vector : 1 numeric
|
119
|
+
# key type level data_preview
|
120
|
+
1 :body_mass_g int64 95 [3750, 3800, 3250, nil, 3450, ... ], 2 nils
|
192
121
|
```
|
193
122
|
|
194
|
-
|
195
|
-
- Select a row by index: `df[0]`
|
196
|
-
- Select rows by indeces in a Range: `df[1..2]`
|
197
|
-
- Select rows by indeces in an Array: `df[1, 2]`
|
198
|
-
- Mixed case: `df[2, 0..]`
|
199
|
-
|
200
|
-
- [x] Select rows from top or bottom
|
201
|
-
|
202
|
-
`head(n=5)`, `tail(n=5)`, `first(n=1)`, `last(n=1)`
|
203
|
-
|
204
|
-
- [ ] slice
|
205
|
-
|
206
|
-
### Updating
|
207
|
-
|
208
|
-
- [ ] Add a new column
|
209
|
-
|
210
|
-
- [ ] Update a single element
|
211
|
-
|
212
|
-
- [ ] Update multiple elements
|
213
|
-
|
214
|
-
- [ ] Update all elements
|
215
|
-
|
216
|
-
- [ ] Update elements matching a condition
|
123
|
+
`DataFrame#assign` creates new variables (columns in the table).
|
217
124
|
|
218
|
-
|
219
|
-
|
220
|
-
|
221
|
-
|
222
|
-
|
223
|
-
|
224
|
-
|
225
|
-
|
226
|
-
|
227
|
-
|
228
|
-
### Treat na data
|
229
|
-
|
230
|
-
- [ ] Drop na (NaN, nil)
|
231
|
-
|
232
|
-
- [ ] Replace na with value
|
233
|
-
|
234
|
-
- [ ] Interpolate na with convolution array
|
235
|
-
|
236
|
-
### Combining DataFrames
|
237
|
-
|
238
|
-
- [ ] Add rows
|
239
|
-
|
240
|
-
- [ ] Add columns
|
125
|
+
```ruby
|
126
|
+
df.assign(:body_mass_kg => df[:body_mass_g] / 1000.0)
|
127
|
+
# =>
|
128
|
+
#<RedAmber::DataFrame : 344 x 2 Vectors, 0x000000000000fa28>
|
129
|
+
Vectors : 2 numeric
|
130
|
+
# key type level data_preview
|
131
|
+
1 :body_mass_g int64 95 [3750, 3800, 3250, nil, 3450, ... ], 2 nils
|
132
|
+
2 :body_mass_kg double 95 [3.75, 3.8, 3.25, nil, 3.45, ... ], 2 nils
|
133
|
+
```
|
241
134
|
|
242
|
-
|
135
|
+
DataFrame manipulating methods like `pick`, `drop`, `slice`, `remove`, `rename` and `assign` accept a block.
|
243
136
|
|
244
|
-
|
137
|
+
This is an example that eliminates observations (rows in the table) containing nil.
|
245
138
|
|
246
|
-
|
139
|
+
```ruby
|
140
|
+
# remove all observations that contain nil
|
141
|
+
nil_removed = penguins.remove { vectors.map(&:is_nil).reduce(&:|) }
|
142
|
+
nil_removed.tdr
|
143
|
+
# =>
|
144
|
+
RedAmber::DataFrame : 342 x 8 Vectors
|
145
|
+
Vectors : 5 numeric, 3 strings
|
146
|
+
# key type level data_preview
|
147
|
+
1 :species string 3 {"Adelie"=>151, "Chinstrap"=>68, "Gentoo"=>123}
|
148
|
+
2 :island string 3 {"Torgersen"=>51, "Biscoe"=>167, "Dream"=>124}
|
149
|
+
3 :bill_length_mm double 164 [39.1, 39.5, 40.3, 36.7, 39.3, ... ]
|
150
|
+
4 :bill_depth_mm double 80 [18.7, 17.4, 18.0, 19.3, 20.6, ... ]
|
151
|
+
5 :flipper_length_mm int64 55 [181, 186, 195, 193, 190, ... ]
|
152
|
+
6 :body_mass_g int64 94 [3750, 3800, 3250, 3450, 3650, ... ]
|
153
|
+
7 :sex string 3 {"male"=>168, "female"=>165, ""=>9}
|
154
|
+
8 :year int64 3 {2007=>109, 2008=>114, 2009=>119}
|
155
|
+
```
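
The block passed to `remove` above combines the per-vector nil masks with `|` into a single per-row mask. A minimal sketch of that idea in plain Ruby, using Arrays in place of RedAmber Vectors (the `columns` Hash and the variable names are illustrative only and are not part of the library):

```ruby
# Plain-Ruby illustration only; RedAmber performs this on Arrow-backed Vectors.
columns = { a: [1, nil, 3], b: ['x', 'y', nil] }

# One boolean mask per column: true where the element is nil.
nil_masks = columns.values.map { |col| col.map(&:nil?) }

# Element-wise OR across all masks gives one flag per row.
row_has_nil = nil_masks.reduce { |acc, mask| acc.zip(mask).map { |x, y| x | y } }

# Keep only the rows whose flag is false.
kept = columns.transform_values do |col|
  col.reject.with_index { |_, i| row_has_nil[i] }
end
```

A row is dropped as soon as any one column has nil at that position, which is why the masks are combined with OR rather than AND.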
|
247
156
|
|
248
|
-
|
157
|
+
This frequently needed task can be done much more simply.
|
249
158
|
|
250
|
-
|
159
|
+
```ruby
|
160
|
+
penguins.remove_nil # => same result as above
|
161
|
+
```
|
251
162
|
|
252
|
-
|
163
|
+
See [DataFrame.md](doc/DataFrame.md) for details.
|
253
164
|
|
254
165
|
|
255
166
|
## `RedAmber::Vector`
|
256
|
-
### Constructor
|
257
|
-
|
258
|
-
- [x] Create from a column in a DataFrame
|
259
|
-
|
260
|
-
- [x] New from an Array
|
261
|
-
|
262
|
-
### Properties
|
263
|
-
|
264
|
-
- [x] `to_s`
|
265
|
-
|
266
|
-
- [x] `values`, `to_a`, `entries`
|
267
167
|
|
268
|
-
|
168
|
+
Class `RedAmber::Vector` represents a series of data in the DataFrame.
|
269
169
|
|
270
|
-
|
271
|
-
|
272
|
-
|
273
|
-
|
274
|
-
|
275
|
-
|
276
|
-
- [ ] `chunked?`
|
277
|
-
|
278
|
-
- [ ] `n_chunks`
|
279
|
-
|
280
|
-
- [ ] `each_chunk`
|
281
|
-
|
282
|
-
- [x] `tally`
|
283
|
-
|
284
|
-
- [x] `n_nils`, `n_nans`
|
285
|
-
|
286
|
-
- `n_nulls` is an alias of `n_nils`
|
287
|
-
|
288
|
-
- [x] `inspect(limit: 80)`
|
170
|
+
```ruby
|
171
|
+
penguins[:bill_length_mm]
|
172
|
+
# =>
|
173
|
+
#<RedAmber::Vector(:double, size=344):0x000000000000f8fc>
|
174
|
+
[39.1, 39.5, 40.3, nil, 36.7, 39.3, 38.9, 39.2, 34.1, 42.0, 37.8, 37.8, 41.1, ... ]
|
175
|
+
```
|
289
176
|
|
290
|
-
|
177
|
+
Vectors accept some [functional methods from Arrow](https://arrow.apache.org/docs/cpp/compute.html).
|
291
178
|
|
292
|
-
|
293
|
-
#### Unary aggregations: vector.func => scalar
|
179
|
+
See [Vector.md](doc/Vector.md) for details.
|
294
180
|
|
295
|
-
|
296
|
-
| ----------- | --- | --- | --- | --- | --- |
|
297
|
-
| ✓ `all` | ✓ | | | ✓ ScalarAggregate| |
|
298
|
-
| ✓ `any` | ✓ | | | ✓ ScalarAggregate| |
|
299
|
-
| ✓ `approximate_median`| |✓| | ✓ ScalarAggregate| alias `median`|
|
300
|
-
| ✓ `count` | ✓ | ✓ | ✓ | ✓ Count | |
|
301
|
-
| ✓ `count_distinct`| ✓ | ✓ | ✓ | ✓ Count |alias `count_uniq`|
|
302
|
-
|[ ]`index` | [ ] | [ ] | [ ] |[ ] Index | |
|
303
|
-
| ✓ `max` | ✓ | ✓ | ✓ | ✓ ScalarAggregate| |
|
304
|
-
| ✓ `mean` | ✓ | ✓ | | ✓ ScalarAggregate| |
|
305
|
-
| ✓ `min` | ✓ | ✓ | ✓ | ✓ ScalarAggregate| |
|
306
|
-
|[ ]`min_max` | [ ] | [ ] | [ ] |[ ] ScalarAggregate| |
|
307
|
-
|[ ]`mode` | | [ ] | |[ ] Mode | |
|
308
|
-
| ✓ `product` | ✓ | ✓ | | ✓ ScalarAggregate| |
|
309
|
-
|[ ]`quantile`| | [ ] | |[ ] Quantile| |
|
310
|
-
|[ ]`stddev` | | ✓ | |[ ] Variance| |
|
311
|
-
| ✓ `sum` | ✓ | ✓ | | ✓ ScalarAggregate| |
|
312
|
-
|[ ]`tdigest` | | [ ] | |[ ] TDigest | |
|
313
|
-
|[ ]`variance`| | ✓ | |[ ] Variance| |
|
181
|
+
## TDR
|
314
182
|
|
183
|
+
I named the data frame representation style in the model above TDR (Transposed DataFrame Representation).
|
315
184
|
|
316
|
-
|
317
|
-
|
185
|
+
This library can be used in both TDR mode and the usual Table mode.
|
186
|
+
If you set the environment variable `RED_AMBER_OUTPUT_MODE` to `"table"`, `inspect` and `to_iruby` produce Table-style output. Any other value, including nil, produces TDR-style output.
|
318
187
|
|
188
|
+
You can switch the mode in Ruby like this.
|
319
189
|
```ruby
|
320
|
-
|
321
|
-
#=>
|
322
|
-
#<RedAmber::Vector(:double, size=6):0x000000000000f910>
|
323
|
-
[1.0, NaN, -Infinity, Infinity, nil, 0.0]
|
324
|
-
|
325
|
-
double.count #=> 5
|
326
|
-
double.count(opts: {mode: :only_valid}) #=> 5, default
|
327
|
-
double.count(opts: {mode: :only_null}) #=> 1
|
328
|
-
double.count(opts: {mode: :all}) #=> 6
|
329
|
-
|
330
|
-
boolean = RedAmber::Vector.new([true, true, nil])
|
331
|
-
#=>
|
332
|
-
#<RedAmber::Vector(:boolean, size=3):0x000000000000f924>
|
333
|
-
[true, true, nil]
|
334
|
-
|
335
|
-
boolean.all #=> true
|
336
|
-
boolean.all(opts: {skip_nulls: true}) #=> true
|
337
|
-
boolean.all(opts: {skip_nulls: false}) #=> false
|
190
|
+
ENV['RED_AMBER_OUTPUT_MODE'] = 'table' # => Table mode
|
338
191
|
```
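
The rule described above can be summarized as a small predicate. This is only a sketch of the documented behavior: `output_mode` is a hypothetical helper, not a RedAmber method, and the exact string comparison against `"table"` is an assumption about how the value is matched.

```ruby
# Hypothetical helper (not part of RedAmber) that restates the documented rule:
# "table" selects Table mode; any other value, including nil, selects TDR.
def output_mode(env = ENV)
  env['RED_AMBER_OUTPUT_MODE'] == 'table' ? :table : :tdr
end

output_mode('RED_AMBER_OUTPUT_MODE' => 'table') # Table mode
output_mode({})                                 # unset, so TDR mode
```

Passing the environment as an argument keeps the sketch testable without mutating the real `ENV`.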
|
339
192
|
|
340
|
-
|
341
|
-
|
342
|
-
| Method |Boolean|Numeric|String|Options|Remarks|
|
343
|
-
| ------------ | --- | --- | --- | --- | ----- |
|
344
|
-
| ✓ `-@` | | ✓ | | |as `-vector`|
|
345
|
-
| ✓ `negate` | | ✓ | | |`-@` |
|
346
|
-
| ✓ `abs` | | ✓ | | | |
|
347
|
-
|[ ]`acos` | | [ ] | | | |
|
348
|
-
|[ ]`asin` | | [ ] | | | |
|
349
|
-
| ✓ `atan` | | ✓ | | | |
|
350
|
-
| ✓ `bit_wise_not`| | (✓) | | |integer only|
|
351
|
-
|[ ]`ceil` | | ✓ | | | |
|
352
|
-
| ✓ `cos` | | ✓ | | | |
|
353
|
-
|[ ]`floor` | | ✓ | | | |
|
354
|
-
| ✓ `invert` | ✓ | | | |`!`, alias `not`|
|
355
|
-
|[ ]`ln` | | [ ] | | | |
|
356
|
-
|[ ]`log10` | | [ ] | | | |
|
357
|
-
|[ ]`log1p` | | [ ] | | | |
|
358
|
-
|[ ]`log2` | | [ ] | | | |
|
359
|
-
|[ ]`round` | | [ ] | |[ ] Round| |
|
360
|
-
|[ ]`round_to_multiple`| | [ ] | |[ ] RoundToMultiple| |
|
361
|
-
| ✓ `sign` | | ✓ | | | |
|
362
|
-
| ✓ `sin` | | ✓ | | | |
|
363
|
-
| ✓ `tan` | | ✓ | | | |
|
364
|
-
|[ ]`trunc` | | ✓ | | | |
|
365
|
-
|
366
|
-
#### Binary element-wise: vector.func(vector) => vector
|
367
|
-
|
368
|
-
| Method |Boolean|Numeric|String|Options|Remarks|
|
369
|
-
| ----------------- | --- | --- | --- | --- | ----- |
|
370
|
-
| ✓ `add` | | ✓ | | | `+` |
|
371
|
-
| ✓ `atan2` | | ✓ | | | |
|
372
|
-
| ✓ `and_kleene` | ✓ | | | | `&` |
|
373
|
-
| ✓ `and_org ` | ✓ | | | |`and` in Red Arrow|
|
374
|
-
| ✓ `and_not` | ✓ | | | | |
|
375
|
-
| ✓ `and_not_kleene`| ✓ | | | | |
|
376
|
-
| ✓ `bit_wise_and` | | (✓) | | |integer only|
|
377
|
-
| ✓ `bit_wise_or` | | (✓) | | |integer only|
|
378
|
-
| ✓ `bit_wise_xor` | | (✓) | | |integer only|
|
379
|
-
| ✓ `divide` | | ✓ | | | `/` |
|
380
|
-
| ✓ `equal` | ✓ | ✓ | ✓ | |`==`, alias `eq`|
|
381
|
-
| ✓ `greater` | ✓ | ✓ | ✓ | |`>`, alias `gt`|
|
382
|
-
| ✓ `greater_equal` | ✓ | ✓ | ✓ | |`>=`, alias `ge`|
|
383
|
-
| ✓ `is_finite` | | ✓ | | | |
|
384
|
-
| ✓ `is_inf` | | ✓ | | | |
|
385
|
-
| ✓ `is_na` | ✓ | ✓ | ✓ | | |
|
386
|
-
| ✓ `is_nan` | | ✓ | | | |
|
387
|
-
|[ ]`is_nil` | ✓ | ✓ | ✓ |[ ] Null|alias `is_null`|
|
388
|
-
| ✓ `is_valid` | ✓ | ✓ | ✓ | | |
|
389
|
-
| ✓ `less` | ✓ | ✓ | ✓ | |`<`, alias `lt`|
|
390
|
-
| ✓ `less_equal` | ✓ | ✓ | ✓ | |`<=`, alias `le`|
|
391
|
-
|[ ]`logb` | | [ ] | | | |
|
392
|
-
|[ ]`mod` | | [ ] | | | `%` |
|
393
|
-
| ✓ `multiply` | | ✓ | | | `*` |
|
394
|
-
| ✓ `not_equal` | ✓ | ✓ | ✓ | |`!=`, alias `ne`|
|
395
|
-
| ✓ `or_kleene` | ✓ | | | | `\|` |
|
396
|
-
| ✓ `or_org` | ✓ | | | |`or` in Red Arrow|
|
397
|
-
| ✓ `power` | | ✓ | | | `**` |
|
398
|
-
| ✓ `subtract` | | ✓ | | | `-` |
|
399
|
-
| ✓ `shift_left` | | (✓) | | |`<<`, integer only|
|
400
|
-
| ✓ `shift_right` | | (✓) | | |`>>`, integer only|
|
401
|
-
| ✓ `xor` | ✓ | | | | `^` |
|
402
|
-
|
403
|
-
##### (Not impremented)
|
404
|
-
- [ ] sort, sort_index
|
405
|
-
- [ ] argmin, argmax
|
406
|
-
- [ ] (array functions)
|
407
|
-
- [ ] (strings functions)
|
408
|
-
- [ ] (temporal functions)
|
409
|
-
- [ ] (conditional functions)
|
410
|
-
- [ ] (index functions)
|
411
|
-
- [ ] (other functions)
|
412
|
-
|
413
|
-
### Coerce (not impremented)
|
414
|
-
|
415
|
-
### Updating (not impremented)
|
416
|
-
|
417
|
-
### DSL in a block for faster calculation ?
|
418
|
-
|
193
|
+
For more detailed information about TDR, see [TDR.md](doc/tdr.md).
|
419
194
|
|
420
195
|
## Development
|
421
196
|
|
@@ -0,0 +1,15 @@
|
|
1
|
+
prelude: |
|
2
|
+
require 'datasets-arrow'
|
3
|
+
require 'rover'
|
4
|
+
require 'red_amber'
|
5
|
+
|
6
|
+
penguins_csv = 'benchmark/cache/penguins.csv'
|
7
|
+
|
8
|
+
unless File.exist?(penguins_csv)
|
9
|
+
arrow = Datasets::Penguins.new.to_arrow
|
10
|
+
RedAmber::DataFrame.new(arrow).save(penguins_csv)
|
11
|
+
end
|
12
|
+
|
13
|
+
benchmark:
|
14
|
+
'penguins by Rover': Rover.read_csv(penguins_csv)
|
15
|
+
'penguins by RedAmber': RedAmber::DataFrame.load(penguins_csv)
|