red_amber 0.1.3 → 0.1.6
- checksums.yaml +4 -4
- data/.rubocop.yml +31 -7
- data/CHANGELOG.md +214 -10
- data/Gemfile +4 -0
- data/README.md +117 -342
- data/benchmark/csv_load_penguins.yml +15 -0
- data/benchmark/drop_nil.yml +11 -0
- data/doc/DataFrame.md +854 -0
- data/doc/Vector.md +449 -0
- data/doc/image/arrow_table_new.png +0 -0
- data/doc/image/dataframe/assign.png +0 -0
- data/doc/image/dataframe/drop.png +0 -0
- data/doc/image/dataframe/pick.png +0 -0
- data/doc/image/dataframe/remove.png +0 -0
- data/doc/image/dataframe/rename.png +0 -0
- data/doc/image/dataframe/slice.png +0 -0
- data/doc/image/dataframe_model.png +0 -0
- data/doc/image/example_in_red_arrow.png +0 -0
- data/doc/image/tdr.png +0 -0
- data/doc/image/tdr_and_table.png +0 -0
- data/doc/image/tidy_data_in_TDR.png +0 -0
- data/doc/image/vector/binary_element_wise.png +0 -0
- data/doc/image/vector/unary_aggregation.png +0 -0
- data/doc/image/vector/unary_aggregation_w_option.png +0 -0
- data/doc/image/vector/unary_element_wise.png +0 -0
- data/doc/tdr.md +56 -0
- data/doc/tdr_ja.md +56 -0
- data/lib/red-amber.rb +27 -0
- data/lib/red_amber/data_frame.rb +91 -37
- data/lib/red_amber/{data_frame_output.rb → data_frame_displayable.rb} +49 -41
- data/lib/red_amber/data_frame_indexable.rb +38 -0
- data/lib/red_amber/data_frame_observation_operation.rb +11 -0
- data/lib/red_amber/data_frame_selectable.rb +155 -48
- data/lib/red_amber/data_frame_variable_operation.rb +137 -0
- data/lib/red_amber/helper.rb +61 -0
- data/lib/red_amber/vector.rb +69 -16
- data/lib/red_amber/vector_functions.rb +80 -45
- data/lib/red_amber/vector_selectable.rb +124 -0
- data/lib/red_amber/vector_updatable.rb +104 -0
- data/lib/red_amber/version.rb +1 -1
- data/lib/red_amber.rb +1 -16
- data/red_amber.gemspec +3 -6
- metadata +38 -9
data/README.md
CHANGED
@@ -1,20 +1,29 @@
|
|
1
1
|
# RedAmber
|
2
2
|
|
3
|
-
A simple dataframe library for Ruby (experimental)
|
3
|
+
A simple dataframe library for Ruby (experimental).
|
4
4
|
|
5
5
|
- Powered by [Red Arrow](https://github.com/apache/arrow/tree/master/ruby/red-arrow)
|
6
|
-
-
|
6
|
+
- Inspired by the dataframe library [Rover-df](https://github.com/ankane/rover)
|
7
7
|
|
8
8
|
## Requirements
|
9
9
|
|
10
10
|
```ruby
|
11
|
-
gem 'red-arrow', '>=
|
12
|
-
gem 'red-parquet', '>=
|
11
|
+
gem 'red-arrow', '>= 8.0.0'
|
12
|
+
gem 'red-parquet', '>= 8.0.0' # if you use IO from/to parquet
|
13
13
|
gem 'rover-df', '~> 0.3.0' # if you use IO from/to Rover::DataFrame
|
14
14
|
```
|
15
15
|
|
16
16
|
## Installation
|
17
17
|
|
18
|
+
Install requirements before you install Red Amber.
|
19
|
+
|
20
|
+
- Apache Arrow GLib (>= 8.0.0)
|
21
|
+
- Apache Parquet GLib (>= 8.0.0)
|
22
|
+
|
23
|
+
See [Apache Arrow install document](https://arrow.apache.org/install/).
|
24
|
+
|
25
|
+
A minimal installation example for the latest Ubuntu is in the ['Prepare the Apache Arrow' section in ci test](https://github.com/heronshoes/red_amber/blob/master/.github/workflows/test.yml) of Red Amber.
|
26
|
+
|
18
27
|
Add this line to your Gemfile:
|
19
28
|
|
20
29
|
```ruby
|
@@ -23,134 +32,66 @@ gem 'red_amber'
|
|
23
32
|
|
24
33
|
And then execute:
|
25
34
|
|
26
|
-
|
35
|
+
```shell
|
36
|
+
bundle install
|
37
|
+
```
|
27
38
|
|
28
39
|
Or install it yourself as:
|
29
40
|
|
30
|
-
|
31
|
-
|
32
|
-
|
33
|
-
|
34
|
-
### Constructors and saving
|
35
|
-
|
36
|
-
- [x] `new` from a columnar Hash
|
37
|
-
- `RedAmber::DataFrame.new(x: [1, 2, 3])`
|
38
|
-
|
39
|
-
- [x] `new` from a schema (by Hash) and rows (by Array)
|
40
|
-
- `RedAmber::DataFrame.new({:x=>:uint8}, [[1], [2], [3]])`
|
41
|
-
|
42
|
-
- [x] `new` from an Arrow::Table
|
43
|
-
- `RedAmber::DataFrame.new(Arrow::Table.new(x: [1, 2, 3]))`
|
44
|
-
|
45
|
-
- [x] `new` from a Rover::DataFrame
|
46
|
-
- `RedAmber::DataFrame.new(Rover::DataFrame.new(x: [1, 2, 3]))`
|
47
|
-
|
48
|
-
- [x] `load` (class method)
|
49
|
-
|
50
|
-
- [x] from a [`.arrow`, `.arrows`, `.csv`, `.csv.gz`, `.tsv`] file
|
51
|
-
- `RedAmber::DataFrame.load("test/entity/with_header.csv")`
|
52
|
-
|
53
|
-
- [x] from a string buffer
|
54
|
-
|
55
|
-
- [x] from a URI
|
56
|
-
- `RedAmber::DataFrame.load(URI("https://github.com/heronshoes/red_amber/blob/master/test/entity/with_header.csv"))`
|
57
|
-
|
58
|
-
- [x] from a Parquet file
|
59
|
-
|
60
|
-
`red-parquet` gem is required.
|
61
|
-
|
62
|
-
```ruby
|
63
|
-
require 'parquet'
|
64
|
-
dataframe = RedAmber::DataFrame.load("file.parquet")
|
65
|
-
```
|
66
|
-
|
67
|
-
- [x] `save` (instance method)
|
68
|
-
|
69
|
-
- [x] to a [`.arrow`, `.arrows`, `.csv`, `.csv.gz`, `.tsv`] file
|
70
|
-
|
71
|
-
- [x] to a string buffer
|
72
|
-
|
73
|
-
- [x] to a URI
|
74
|
-
|
75
|
-
- [x] to a Parquet file
|
76
|
-
|
77
|
-
`red-parquet` gem is required.
|
78
|
-
|
79
|
-
```ruby
|
80
|
-
require 'parquet'
|
81
|
-
dataframe.save("file.parquet")
|
82
|
-
```
|
83
|
-
|
84
|
-
### Properties
|
85
|
-
|
86
|
-
- [x] `table`
|
87
|
-
|
88
|
-
Reader of Arrow::Table object inside.
|
89
|
-
|
90
|
-
- [x] `n_rows`, `nrow`, `size`, `length`
|
91
|
-
|
92
|
-
Returns num of rows (data size).
|
93
|
-
|
94
|
-
- [x] `n_columns`, `ncol`, `width`
|
95
|
-
|
96
|
-
Returns num of columns (num of vectors).
|
97
|
-
|
98
|
-
- [x] `shape`
|
99
|
-
|
100
|
-
Returns shape in an Array[n_rows, n_cols].
|
101
|
-
|
102
|
-
- [x] `column_names`, `keys`
|
103
|
-
|
104
|
-
Returns num of column names by an Array.
|
105
|
-
|
106
|
-
- [x] `types`
|
107
|
-
|
108
|
-
Returns types of columns by an Array of Symbols.
|
109
|
-
|
110
|
-
- [x] `data_types`
|
111
|
-
|
112
|
-
Returns types of columns by an Array of `Arrow::DataType`.
|
113
|
-
|
114
|
-
- [x] `vectors`
|
115
|
-
|
116
|
-
Returns an Array of Vectors.
|
117
|
-
|
118
|
-
- [x] `to_h`
|
119
|
-
|
120
|
-
Returns column-oriented data in a Hash.
|
121
|
-
|
122
|
-
- [x] `to_a`, `raw_records`
|
123
|
-
|
124
|
-
Returns an array of row-oriented data without header. If you need a column-oriented full array, use `.to_h.to_a`
|
125
|
-
|
126
|
-
- [x] `schema`
|
127
|
-
|
128
|
-
Returns column name and data type in a Hash.
|
41
|
+
```shell
|
42
|
+
gem install red_amber
|
43
|
+
```
|
129
44
|
|
130
|
-
|
131
|
-
|
132
|
-
- [x] `empty?`
|
45
|
+
(From v0.1.6)
|
133
46
|
|
134
|
-
|
47
|
+
RedAmber uses TDR mode for `#inspect` and `#to_iruby` by default. If you prefer Table mode, please set the environment variable
|
48
|
+
`RED_AMBER_OUTPUT_MODE` to `"table"`. See [TDR section](#TDR) for details.
|
135
49
|
|
136
|
-
|
50
|
+
## `RedAmber::DataFrame`
|
137
51
|
|
138
|
-
-
|
52
|
+
Represents a set of data in a 2D shape. The underlying entity is a Red Arrow Table object.
|
139
53
|
|
140
|
-
|
54
|
+
```ruby
|
55
|
+
require 'red_amber' # require 'red-amber' is also OK.
|
56
|
+
require 'datasets-arrow'
|
141
57
|
|
142
|
-
|
58
|
+
arrow = Datasets::Penguins.new.to_arrow
|
59
|
+
penguins = RedAmber::DataFrame.new(arrow)
|
60
|
+
penguins.table
|
143
61
|
|
144
|
-
|
62
|
+
# =>
|
63
|
+
#<Arrow::Table:0x111271098 ptr=0x7f9118b3e0b0>
|
64
|
+
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
|
65
|
+
0 Adelie Torgersen 39.100000 18.700000 181 3750 male 2007
|
66
|
+
1 Adelie Torgersen 39.500000 17.400000 186 3800 female 2007
|
67
|
+
2 Adelie Torgersen 40.300000 18.000000 195 3250 female 2007
|
68
|
+
3 Adelie Torgersen (null) (null) (null) (null) (null) 2007
|
69
|
+
4 Adelie Torgersen 36.700000 19.300000 193 3450 female 2007
|
70
|
+
5 Adelie Torgersen 39.300000 20.600000 190 3650 male 2007
|
71
|
+
6 Adelie Torgersen 38.900000 17.800000 181 3625 female 2007
|
72
|
+
7 Adelie Torgersen 39.200000 19.600000 195 4675 male 2007
|
73
|
+
8 Adelie Torgersen 34.100000 18.100000 193 3475 (null) 2007
|
74
|
+
9 Adelie Torgersen 42.000000 20.200000 190 4250 (null) 2007
|
75
|
+
...
|
76
|
+
334 Gentoo Biscoe 46.200000 14.100000 217 4375 female 2009
|
77
|
+
335 Gentoo Biscoe 55.100000 16.000000 230 5850 male 2009
|
78
|
+
336 Gentoo Biscoe 44.500000 15.700000 217 4875 (null) 2009
|
79
|
+
337 Gentoo Biscoe 48.800000 16.200000 222 6000 male 2009
|
80
|
+
338 Gentoo Biscoe 47.200000 13.700000 214 4925 female 2009
|
81
|
+
339 Gentoo Biscoe (null) (null) (null) (null) (null) 2009
|
82
|
+
340 Gentoo Biscoe 46.800000 14.300000 215 4850 female 2009
|
83
|
+
341 Gentoo Biscoe 50.400000 15.700000 222 5750 male 2009
|
84
|
+
342 Gentoo Biscoe 45.200000 14.800000 212 5200 female 2009
|
85
|
+
343 Gentoo Biscoe 49.900000 16.100000 213 5400 male 2009
|
86
|
+
```
|
145
87
|
|
146
|
-
|
88
|
+
By default, RedAmber displays a DataFrame in a compact transposed style. This unfamiliar style (TDR) is designed for
|
89
|
+
exploratory data processing. It keeps Vectors as row vectors, shows keys and types at a glance, shows levels
|
90
|
+
for the 'factor-like' variables and shows the number of abnormal values like NaN and nil.
|
147
91
|
|
148
92
|
```ruby
|
149
|
-
|
150
|
-
require 'datasets-arrow'
|
93
|
+
penguins
|
151
94
|
|
152
|
-
penguins = Datasets::Penguins.new.to_arrow
|
153
|
-
RedAmber::DataFrame.new(penguins)
|
154
95
|
# =>
|
155
96
|
RedAmber::DataFrame : 344 x 8 Vectors
|
156
97
|
Vectors : 5 numeric, 3 strings
|
@@ -165,257 +106,91 @@ Vectors : 5 numeric, 3 strings
|
|
165
106
|
8 :year uint16 3 {2007=>110, 2008=>114, 2009=>120}
|
166
107
|
```
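The TDR layout described above can be approximated in plain Ruby to show what each column of the summary means. This is only a sketch under stated assumptions, not RedAmber's implementation: the helper name `tdr_lines`, the tally threshold of 5 levels, and the preview length are all hypothetical, and the `type` column is omitted because plain Ruby Arrays carry no Arrow type.

```ruby
# Plain-Ruby sketch of a TDR-style summary; not RedAmber's implementation.
# `tdr_lines`, the tally threshold and the preview length are assumptions.
def tdr_lines(columns, tally_level: 5, preview: 5)
  columns.each_with_index.map do |(key, values), index|
    # "level" is the number of distinct values (nil counts as one level).
    level = values.uniq.size
    n_nils = values.count(&:nil?)
    data =
      if level <= tally_level
        # Factor-like variable: show a tally of each level.
        values.tally.inspect
      else
        # Otherwise show the first few values, then the nil count if any.
        shown = values.first(preview).map(&:inspect).join(', ')
        text = "[#{shown}, ... ]"
        n_nils.positive? ? "#{text}, #{n_nils} nils" : text
      end
    format('%d :%s %d %s', index + 1, key, level, data)
  end
end

sample = {
  year: [2007, 2007, 2008, 2009],
  body_mass_g: [3750, 3800, 3250, nil, 3450, 3650, 3625],
}
lines = tdr_lines(sample)
puts lines
```

Each variable becomes one line, which is why a wide table stays readable: adding columns adds lines rather than widening the output.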
|
167
108
|
|
168
|
-
|
169
|
-
|
170
|
-
|
171
|
-
### Selecting
|
109
|
+
### DataFrame model
|
110
|
+

|
172
111
|
|
173
|
-
|
174
|
-
- Key in a Symbol: `df[:symbol]`
|
175
|
-
- Key in a String: `df["string"]`
|
176
|
-
- Keys in an Array: `df[:symbol1, "string", :symbol2]`
|
177
|
-
- Keys in indeces: `df[df.keys[0]`, `df[df.keys[1,2]]`, `df[df.keys[1..]]`
|
178
|
-
- Keys in a Range:
|
179
|
-
A end-less Range can be used to represent keys.
|
112
|
+
For example, `DataFrame#pick` accepts keys as an argument and returns a sub DataFrame.
|
180
113
|
|
181
114
|
```ruby
|
182
|
-
|
183
|
-
df = RedAmber::DataFrame.new(hash)
|
184
|
-
df[:b..:c, "a"]
|
115
|
+
df = penguins.pick(:body_mass_g)
|
185
116
|
# =>
|
186
|
-
RedAmber::DataFrame :
|
187
|
-
|
188
|
-
# key
|
189
|
-
1 :
|
190
|
-
2 :c double 3 [1.0, 2.0, 3.0]
|
191
|
-
3 :a uint8 3 [1, 2, 3]
|
117
|
+
#<RedAmber::DataFrame : 344 x 1 Vector, 0x000000000000fa14>
|
118
|
+
Vector : 1 numeric
|
119
|
+
# key type level data_preview
|
120
|
+
1 :body_mass_g int64 95 [3750, 3800, 3250, nil, 3450, ... ], 2 nils
|
192
121
|
```
|
193
122
|
|
194
|
-
|
195
|
-
- Select a row by index: `df[0]`
|
196
|
-
- Select rows by indeces in a Range: `df[1..2]`
|
197
|
-
- Select rows by indeces in an Array: `df[1, 2]`
|
198
|
-
- Mixed case: `df[2, 0..]`
|
199
|
-
|
200
|
-
- [x] Select rows from top or bottom
|
201
|
-
|
202
|
-
`head(n=5)`, `tail(n=5)`, `first(n=1)`, `last(n=1)`
|
203
|
-
|
204
|
-
- [ ] slice
|
205
|
-
|
206
|
-
### Updating
|
207
|
-
|
208
|
-
- [ ] Add a new column
|
209
|
-
|
210
|
-
- [ ] Update a single element
|
211
|
-
|
212
|
-
- [ ] Update multiple elements
|
213
|
-
|
214
|
-
- [ ] Update all elements
|
215
|
-
|
216
|
-
- [ ] Update elements matching a condition
|
123
|
+
`DataFrame#assign` creates new variables (columns in the table).
|
217
124
|
|
218
|
-
|
219
|
-
|
220
|
-
|
221
|
-
|
222
|
-
|
223
|
-
|
224
|
-
|
225
|
-
|
226
|
-
|
227
|
-
|
228
|
-
### Treat na data
|
229
|
-
|
230
|
-
- [ ] Drop na (NaN, nil)
|
231
|
-
|
232
|
-
- [ ] Replace na with value
|
233
|
-
|
234
|
-
- [ ] Interpolate na with convolution array
|
235
|
-
|
236
|
-
### Combining DataFrames
|
237
|
-
|
238
|
-
- [ ] Add rows
|
239
|
-
|
240
|
-
- [ ] Add columns
|
125
|
+
```ruby
|
126
|
+
df.assign(:body_mass_kg => df[:body_mass_g] / 1000.0)
|
127
|
+
# =>
|
128
|
+
#<RedAmber::DataFrame : 344 x 2 Vectors, 0x000000000000fa28>
|
129
|
+
Vectors : 2 numeric
|
130
|
+
# key type level data_preview
|
131
|
+
1 :body_mass_g int64 95 [3750, 3800, 3250, nil, 3450, ... ], 2 nils
|
132
|
+
2 :body_mass_kg double 95 [3.75, 3.8, 3.25, nil, 3.45, ... ], 2 nils
|
133
|
+
```
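The semantics of `pick` and `assign` shown above can be sketched with a plain columnar Hash. This is an illustration only and does not use RedAmber: `Hash#slice` stands in for `pick`, `Hash#merge` stands in for `assign`, and the sample values are made up to mirror the penguins data.

```ruby
# Plain-Ruby sketch: a columnar Hash stands in for a DataFrame.
table = {
  body_mass_g: [3750, 3800, nil],
  year: [2007, 2007, 2007],
}

# Like `pick`: keep only the named variables, returning a new Hash.
picked = table.slice(:body_mass_g)

# Like `assign`: derive a new variable without mutating the original.
# nil stays nil, as in the data_preview above.
body_mass_kg = picked[:body_mass_g].map { |g| g && g / 1000.0 }
assigned = picked.merge(body_mass_kg: body_mass_kg)

p assigned
```

Both steps return new objects and leave `table` untouched, which matches the way the README chains `pick` and `assign` without reassigning `penguins`.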
|
241
134
|
|
242
|
-
|
135
|
+
DataFrame manipulating methods like `pick`, `drop`, `slice`, `remove`, `rename` and `assign` accept a block.
|
243
136
|
|
244
|
-
|
137
|
+
This is an example that eliminates observations (rows in the table) containing nil.
|
245
138
|
|
246
|
-
|
139
|
+
```ruby
|
140
|
+
# remove all observations containing nil
|
141
|
+
nil_removed = penguins.remove { vectors.map(&:is_nil).reduce(&:|) }
|
142
|
+
nil_removed.tdr
|
143
|
+
# =>
|
144
|
+
RedAmber::DataFrame : 342 x 8 Vectors
|
145
|
+
Vectors : 5 numeric, 3 strings
|
146
|
+
# key type level data_preview
|
147
|
+
1 :species string 3 {"Adelie"=>151, "Chinstrap"=>68, "Gentoo"=>123}
|
148
|
+
2 :island string 3 {"Torgersen"=>51, "Biscoe"=>167, "Dream"=>124}
|
149
|
+
3 :bill_length_mm double 164 [39.1, 39.5, 40.3, 36.7, 39.3, ... ]
|
150
|
+
4 :bill_depth_mm double 80 [18.7, 17.4, 18.0, 19.3, 20.6, ... ]
|
151
|
+
5 :flipper_length_mm int64 55 [181, 186, 195, 193, 190, ... ]
|
152
|
+
6 :body_mass_g int64 94 [3750, 3800, 3250, 3450, 3650, ... ]
|
153
|
+
7 :sex string 3 {"male"=>168, "female"=>165, ""=>9}
|
154
|
+
8 :year int64 3 {2007=>109, 2008=>114, 2009=>119}
|
155
|
+
```
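The block passed to `remove` builds one boolean mask per vector and combines them with an element-wise OR. The same logic can be sketched in plain Ruby, assuming ordinary Arrays in place of Vectors; the sample data and variable names below are hypothetical.

```ruby
# Plain-Ruby sketch (no RedAmber): columns are ordinary Arrays here.
columns = {
  species: %w[Adelie Adelie Gentoo Gentoo],
  body_mass_g: [3750, nil, 4375, nil],
  sex: ['male', nil, nil, 'female'],
}

# One boolean mask per column: true where the value is nil.
nil_masks = columns.values.map { |values| values.map(&:nil?) }

# Element-wise OR across the masks: true if any column is nil in that row.
any_nil = nil_masks.reduce { |acc, mask| acc.zip(mask).map { |a, b| a | b } }

# Drop the rows whose combined mask is true.
cleaned = columns.transform_values do |values|
  values.reject.with_index { |_, i| any_nil[i] }
end

p cleaned
```

Only the first row survives in this sample, because every other row has a nil in at least one column.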
|
247
156
|
|
248
|
-
|
157
|
+
This frequently needed task can be done much more simply.
|
249
158
|
|
250
|
-
|
159
|
+
```ruby
|
160
|
+
penguins.remove_nil # => same result as above
|
161
|
+
```
|
251
162
|
|
252
|
-
|
163
|
+
See [DataFrame.md](doc/DataFrame.md) for details.
|
253
164
|
|
254
165
|
|
255
166
|
## `RedAmber::Vector`
|
256
|
-
### Constructor
|
257
|
-
|
258
|
-
- [x] Create from a column in a DataFrame
|
259
|
-
|
260
|
-
- [x] New from an Array
|
261
|
-
|
262
|
-
### Properties
|
263
|
-
|
264
|
-
- [x] `to_s`
|
265
|
-
|
266
|
-
- [x] `values`, `to_a`, `entries`
|
267
167
|
|
268
|
-
|
168
|
+
Class `RedAmber::Vector` represents a series of data in the DataFrame.
|
269
169
|
|
270
|
-
|
271
|
-
|
272
|
-
|
273
|
-
|
274
|
-
|
275
|
-
|
276
|
-
- [ ] `chunked?`
|
277
|
-
|
278
|
-
- [ ] `n_chunks`
|
279
|
-
|
280
|
-
- [ ] `each_chunk`
|
281
|
-
|
282
|
-
- [x] `tally`
|
283
|
-
|
284
|
-
- [x] `n_nils`, `n_nans`
|
285
|
-
|
286
|
-
- `n_nulls` is an alias of `n_nils`
|
287
|
-
|
288
|
-
- [x] `inspect(limit: 80)`
|
170
|
+
```ruby
|
171
|
+
penguins[:bill_length_mm]
|
172
|
+
# =>
|
173
|
+
#<RedAmber::Vector(:double, size=344):0x000000000000f8fc>
|
174
|
+
[39.1, 39.5, 40.3, nil, 36.7, 39.3, 38.9, 39.2, 34.1, 42.0, 37.8, 37.8, 41.1, ... ]
|
175
|
+
```
|
289
176
|
|
290
|
-
|
177
|
+
Vectors accept some [functional methods from Arrow](https://arrow.apache.org/docs/cpp/compute.html).
|
291
178
|
|
292
|
-
|
293
|
-
#### Unary aggregations: vector.func => scalar
|
179
|
+
See [Vector.md](doc/Vector.md) for details.
|
294
180
|
|
295
|
-
|
296
|
-
| ----------- | --- | --- | --- | --- | --- |
|
297
|
-
| ✓ `all` | ✓ | | | ✓ ScalarAggregate| |
|
298
|
-
| ✓ `any` | ✓ | | | ✓ ScalarAggregate| |
|
299
|
-
| ✓ `approximate_median`| |✓| | ✓ ScalarAggregate| alias `median`|
|
300
|
-
| ✓ `count` | ✓ | ✓ | ✓ | ✓ Count | |
|
301
|
-
| ✓ `count_distinct`| ✓ | ✓ | ✓ | ✓ Count |alias `count_uniq`|
|
302
|
-
|[ ]`index` | [ ] | [ ] | [ ] |[ ] Index | |
|
303
|
-
| ✓ `max` | ✓ | ✓ | ✓ | ✓ ScalarAggregate| |
|
304
|
-
| ✓ `mean` | ✓ | ✓ | | ✓ ScalarAggregate| |
|
305
|
-
| ✓ `min` | ✓ | ✓ | ✓ | ✓ ScalarAggregate| |
|
306
|
-
|[ ]`min_max` | [ ] | [ ] | [ ] |[ ] ScalarAggregate| |
|
307
|
-
|[ ]`mode` | | [ ] | |[ ] Mode | |
|
308
|
-
| ✓ `product` | ✓ | ✓ | | ✓ ScalarAggregate| |
|
309
|
-
|[ ]`quantile`| | [ ] | |[ ] Quantile| |
|
310
|
-
|[ ]`stddev` | | ✓ | |[ ] Variance| |
|
311
|
-
| ✓ `sum` | ✓ | ✓ | | ✓ ScalarAggregate| |
|
312
|
-
|[ ]`tdigest` | | [ ] | |[ ] TDigest | |
|
313
|
-
|[ ]`variance`| | ✓ | |[ ] Variance| |
|
181
|
+
## TDR
|
314
182
|
|
183
|
+
I named the data frame representation style in the model above TDR (Transposed DataFrame Representation).
|
315
184
|
|
316
|
-
|
317
|
-
|
185
|
+
This library can be used in either TDR mode or the usual Table mode.
|
186
|
+
If you set the environment variable `RED_AMBER_OUTPUT_MODE` to `"table"`, `inspect` and `to_iruby` produce output in Table mode. Any other value, including nil, produces TDR style output.
|
318
187
|
|
188
|
+
You can switch the mode in Ruby like this.
|
319
189
|
```ruby
|
320
|
-
|
321
|
-
#=>
|
322
|
-
#<RedAmber::Vector(:double, size=6):0x000000000000f910>
|
323
|
-
[1.0, NaN, -Infinity, Infinity, nil, 0.0]
|
324
|
-
|
325
|
-
double.count #=> 5
|
326
|
-
double.count(opts: {mode: :only_valid}) #=> 5, default
|
327
|
-
double.count(opts: {mode: :only_null}) #=> 1
|
328
|
-
double.count(opts: {mode: :all}) #=> 6
|
329
|
-
|
330
|
-
boolean = RedAmber::Vector.new([true, true, nil])
|
331
|
-
#=>
|
332
|
-
#<RedAmber::Vector(:boolean, size=3):0x000000000000f924>
|
333
|
-
[true, true, nil]
|
334
|
-
|
335
|
-
boolean.all #=> true
|
336
|
-
boolean.all(opts: {skip_nulls: true}) #=> true
|
337
|
-
boolean.all(opts: {skip_nulls: false}) #=> false
|
190
|
+
ENV['RED_AMBER_OUTPUT_MODE'] = 'table' # => Table mode
|
338
191
|
```
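The rule stated above (`"table"` selects Table mode, any other value including nil selects TDR) could be expressed as a small lookup. This is a sketch only: `output_mode` is a hypothetical helper, not part of RedAmber, and exact, case-sensitive matching of `"table"` is an assumption.

```ruby
# Hypothetical helper illustrating the documented rule; not RedAmber's code.
# Accepts any Hash-like object so it can be exercised without touching ENV.
def output_mode(env = ENV)
  # Assumption: only the exact string "table" selects Table mode.
  env['RED_AMBER_OUTPUT_MODE'] == 'table' ? :table : :tdr
end

p output_mode({})                                   # variable unset
p output_mode('RED_AMBER_OUTPUT_MODE' => 'table')   # Table mode requested
p output_mode('RED_AMBER_OUTPUT_MODE' => 'other')   # anything else
```

Passing the environment in as an argument keeps the example free of side effects on the real process environment.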
|
339
192
|
|
340
|
-
|
341
|
-
|
342
|
-
| Method |Boolean|Numeric|String|Options|Remarks|
|
343
|
-
| ------------ | --- | --- | --- | --- | ----- |
|
344
|
-
| ✓ `-@` | | ✓ | | |as `-vector`|
|
345
|
-
| ✓ `negate` | | ✓ | | |`-@` |
|
346
|
-
| ✓ `abs` | | ✓ | | | |
|
347
|
-
|[ ]`acos` | | [ ] | | | |
|
348
|
-
|[ ]`asin` | | [ ] | | | |
|
349
|
-
| ✓ `atan` | | ✓ | | | |
|
350
|
-
| ✓ `bit_wise_not`| | (✓) | | |integer only|
|
351
|
-
|[ ]`ceil` | | ✓ | | | |
|
352
|
-
| ✓ `cos` | | ✓ | | | |
|
353
|
-
|[ ]`floor` | | ✓ | | | |
|
354
|
-
| ✓ `invert` | ✓ | | | |`!`, alias `not`|
|
355
|
-
|[ ]`ln` | | [ ] | | | |
|
356
|
-
|[ ]`log10` | | [ ] | | | |
|
357
|
-
|[ ]`log1p` | | [ ] | | | |
|
358
|
-
|[ ]`log2` | | [ ] | | | |
|
359
|
-
|[ ]`round` | | [ ] | |[ ] Round| |
|
360
|
-
|[ ]`round_to_multiple`| | [ ] | |[ ] RoundToMultiple| |
|
361
|
-
| ✓ `sign` | | ✓ | | | |
|
362
|
-
| ✓ `sin` | | ✓ | | | |
|
363
|
-
| ✓ `tan` | | ✓ | | | |
|
364
|
-
|[ ]`trunc` | | ✓ | | | |
|
365
|
-
|
366
|
-
#### Binary element-wise: vector.func(vector) => vector
|
367
|
-
|
368
|
-
| Method |Boolean|Numeric|String|Options|Remarks|
|
369
|
-
| ----------------- | --- | --- | --- | --- | ----- |
|
370
|
-
| ✓ `add` | | ✓ | | | `+` |
|
371
|
-
| ✓ `atan2` | | ✓ | | | |
|
372
|
-
| ✓ `and_kleene` | ✓ | | | | `&` |
|
373
|
-
| ✓ `and_org ` | ✓ | | | |`and` in Red Arrow|
|
374
|
-
| ✓ `and_not` | ✓ | | | | |
|
375
|
-
| ✓ `and_not_kleene`| ✓ | | | | |
|
376
|
-
| ✓ `bit_wise_and` | | (✓) | | |integer only|
|
377
|
-
| ✓ `bit_wise_or` | | (✓) | | |integer only|
|
378
|
-
| ✓ `bit_wise_xor` | | (✓) | | |integer only|
|
379
|
-
| ✓ `divide` | | ✓ | | | `/` |
|
380
|
-
| ✓ `equal` | ✓ | ✓ | ✓ | |`==`, alias `eq`|
|
381
|
-
| ✓ `greater` | ✓ | ✓ | ✓ | |`>`, alias `gt`|
|
382
|
-
| ✓ `greater_equal` | ✓ | ✓ | ✓ | |`>=`, alias `ge`|
|
383
|
-
| ✓ `is_finite` | | ✓ | | | |
|
384
|
-
| ✓ `is_inf` | | ✓ | | | |
|
385
|
-
| ✓ `is_na` | ✓ | ✓ | ✓ | | |
|
386
|
-
| ✓ `is_nan` | | ✓ | | | |
|
387
|
-
|[ ]`is_nil` | ✓ | ✓ | ✓ |[ ] Null|alias `is_null`|
|
388
|
-
| ✓ `is_valid` | ✓ | ✓ | ✓ | | |
|
389
|
-
| ✓ `less` | ✓ | ✓ | ✓ | |`<`, alias `lt`|
|
390
|
-
| ✓ `less_equal` | ✓ | ✓ | ✓ | |`<=`, alias `le`|
|
391
|
-
|[ ]`logb` | | [ ] | | | |
|
392
|
-
|[ ]`mod` | | [ ] | | | `%` |
|
393
|
-
| ✓ `multiply` | | ✓ | | | `*` |
|
394
|
-
| ✓ `not_equal` | ✓ | ✓ | ✓ | |`!=`, alias `ne`|
|
395
|
-
| ✓ `or_kleene` | ✓ | | | | `\|` |
|
396
|
-
| ✓ `or_org` | ✓ | | | |`or` in Red Arrow|
|
397
|
-
| ✓ `power` | | ✓ | | | `**` |
|
398
|
-
| ✓ `subtract` | | ✓ | | | `-` |
|
399
|
-
| ✓ `shift_left` | | (✓) | | |`<<`, integer only|
|
400
|
-
| ✓ `shift_right` | | (✓) | | |`>>`, integer only|
|
401
|
-
| ✓ `xor` | ✓ | | | | `^` |
|
402
|
-
|
403
|
-
##### (Not impremented)
|
404
|
-
- [ ] sort, sort_index
|
405
|
-
- [ ] argmin, argmax
|
406
|
-
- [ ] (array functions)
|
407
|
-
- [ ] (strings functions)
|
408
|
-
- [ ] (temporal functions)
|
409
|
-
- [ ] (conditional functions)
|
410
|
-
- [ ] (index functions)
|
411
|
-
- [ ] (other functions)
|
412
|
-
|
413
|
-
### Coerce (not impremented)
|
414
|
-
|
415
|
-
### Updating (not impremented)
|
416
|
-
|
417
|
-
### DSL in a block for faster calculation ?
|
418
|
-
|
193
|
+
For more detailed information about TDR, see [TDR.md](doc/tdr.md).
|
419
194
|
|
420
195
|
## Development
|
421
196
|
|
data/benchmark/csv_load_penguins.yml
ADDED
@@ -0,0 +1,15 @@
|
|
1
|
+
prelude: |
|
2
|
+
require 'datasets-arrow'
|
3
|
+
require 'rover'
|
4
|
+
require 'red_amber'
|
5
|
+
|
6
|
+
penguins_csv = 'benchmark/cache/penguins.csv'
|
7
|
+
|
8
|
+
unless File.exist?(penguins_csv)
|
9
|
+
arrow = Datasets::Penguins.new.to_arrow
|
10
|
+
RedAmber::DataFrame.new(arrow).save(penguins_csv)
|
11
|
+
end
|
12
|
+
|
13
|
+
benchmark:
|
14
|
+
'penguins by Rover': Rover.read_csv(penguins_csv)
|
15
|
+
'penguins by RedAmber': RedAmber::DataFrame.load(penguins_csv)
|