red_amber 0.1.3 → 0.1.4
- checksums.yaml +4 -4
- data/.rubocop.yml +9 -4
- data/CHANGELOG.md +60 -8
- data/README.md +41 -349
- data/doc/DataFrame.md +690 -0
- data/doc/Vector.md +195 -0
- data/doc/image/TDR_operations.pdf +0 -0
- data/doc/image/arrow_table_new.png +0 -0
- data/doc/image/dataframe/assign.png +0 -0
- data/doc/image/dataframe/drop.png +0 -0
- data/doc/image/dataframe/pick.png +0 -0
- data/doc/image/dataframe/remove.png +0 -0
- data/doc/image/dataframe/rename.png +0 -0
- data/doc/image/dataframe/slice.png +0 -0
- data/doc/image/dataframe_model.png +0 -0
- data/doc/image/example_in_red_arrow.png +0 -0
- data/doc/image/tdr.png +0 -0
- data/doc/image/tdr_and_table.png +0 -0
- data/doc/image/tidy_data_in_TDR.png +0 -0
- data/doc/image/vector/binary_element_wise.png +0 -0
- data/doc/image/vector/unary_aggregation.png +0 -0
- data/doc/image/vector/unary_aggregation_w_option.png +0 -0
- data/doc/image/vector/unary_element_wise.png +0 -0
- data/doc/tdr.md +53 -0
- data/doc/tdr_ja.md +53 -0
- data/lib/red_amber/data_frame.rb +22 -15
- data/lib/red_amber/{data_frame_output.rb → data_frame_displayable.rb} +44 -37
- data/lib/red_amber/data_frame_helper.rb +64 -0
- data/lib/red_amber/data_frame_observation_operation.rb +72 -0
- data/lib/red_amber/data_frame_selectable.rb +21 -43
- data/lib/red_amber/data_frame_variable_operation.rb +133 -0
- data/lib/red_amber/vector_functions.rb +54 -29
- data/lib/red_amber/version.rb +1 -1
- data/lib/red_amber.rb +4 -1
- metadata +27 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 6ceace9db54b82c03ccf00fcd1b7bf2af57d94ea4e54183dc6af1da47e21ef00
|
4
|
+
data.tar.gz: f30578dcec45fd5efec9219c6438fd0108a0690b1cd69b1c398dffacd38aeba1
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: ee26fd212d0cb0758bc4611c5b43b302fe5c1b958239b5a9ac81ee09e936bdded733a719507e24e5434c33fc5d7ece43c973dd66d51413f23cc435ea0bd7570c
|
7
|
+
data.tar.gz: 674f56a11ddf906f608ecf7d7c852bec654a749e9052092553d19be967072d5acec95a096fbecc60ffd4b33fad3f4322354d93fade67230078fff15b6b7398dd
|
data/.rubocop.yml
CHANGED
@@ -55,7 +55,10 @@ Layout/LineLength:
|
|
55
55
|
Metrics/AbcSize:
|
56
56
|
Max: 23
|
57
57
|
Exclude:
|
58
|
-
- 'lib/red_amber/
|
58
|
+
- 'lib/red_amber/data_frame_displayable.rb' # Max: 55
|
59
|
+
- 'lib/red_amber/data_frame_selectable.rb' # Max: 27
|
60
|
+
- 'lib/red_amber/data_frame_observation_operation.rb' # Max: 29
|
61
|
+
- 'lib/red_amber/data_frame_variable_operation.rb' # Max: 26
|
59
62
|
|
60
63
|
# Max: 25
|
61
64
|
Metrics/BlockLength:
|
@@ -71,13 +74,15 @@ Metrics/ClassLength:
|
|
71
74
|
|
72
75
|
# Max: 7
|
73
76
|
Metrics/CyclomaticComplexity:
|
74
|
-
Max:
|
77
|
+
Max: 12
|
75
78
|
|
76
79
|
# Max: 10
|
77
80
|
Metrics/MethodLength:
|
78
81
|
Max: 18
|
79
82
|
Exclude:
|
80
|
-
- 'lib/red_amber/
|
83
|
+
- 'lib/red_amber/data_frame_displayable.rb' # Max: 33
|
84
|
+
- 'lib/red_amber/data_frame_observation_operation.rb' # Max: 21
|
85
|
+
- 'lib/red_amber/data_frame_variable_operation.rb' # Max: 20
|
81
86
|
|
82
87
|
# Max: 100
|
83
88
|
Metrics/ModuleLength:
|
@@ -87,7 +92,7 @@ Metrics/ModuleLength:
|
|
87
92
|
|
88
93
|
# Max: 8
|
89
94
|
Metrics/PerceivedComplexity:
|
90
|
-
Max:
|
95
|
+
Max: 13
|
91
96
|
|
92
97
|
# Necessary to define is_na
|
93
98
|
Naming/PredicateName:
|
data/CHANGELOG.md
CHANGED
@@ -1,18 +1,70 @@
|
|
1
|
-
##
|
1
|
+
## - Unreleased
|
2
2
|
|
3
|
-
-
|
4
|
-
- Feedback to Red Arrow
|
5
|
-
- Separate documents
|
3
|
+
- Feedback something to Red Arrow
|
6
4
|
|
7
5
|
- `DataFrame`
|
8
|
-
- Introduce
|
9
|
-
- Introduce
|
10
|
-
-
|
6
|
+
- Introduce `group_by`
|
7
|
+
- Introduce `summarize`
|
8
|
+
- Introduce `summary` or `describe`
|
9
|
+
- Improve dataframe obs. manipulation methods to accept a float as an index (#10)
|
10
|
+
- More performant
|
11
11
|
|
12
12
|
- `Vector`
|
13
|
-
- Add NaN support for functions
|
14
13
|
- Support more functions
|
15
14
|
|
15
|
+
- Document
|
16
|
+
- YARD support
|
17
|
+
|
18
|
+
## [0.1.4] - 2022-05-29 (experimental)
|
19
|
+
|
20
|
+
- Bug fixes
|
21
|
+
- Fix missing support for scalar argument (#1)
|
22
|
+
- Fix type name of boolean in DF#types to be same as Vector#type (#6, #7)
|
23
|
+
- Fix zero picking to return empty DataFrame (#8)
|
24
|
+
- Fix code for the case where both args and a block are given (#8)
|
25
|
+
|
26
|
+
- New features and improvements
|
27
|
+
- `DataFrame`
|
28
|
+
- Refine module name `Displayable`
|
29
|
+
- Rename nrow/ncol methods to `size`/`n_keys` to align with TDR concept (#4)
|
30
|
+
- Keep `n_row`/`n_col` for compatibility
|
31
|
+
- Rename `ls` method to `tdr` (#4)
|
32
|
+
- Add limit option to `tdr`
|
33
|
+
- Shorten option name (#11)
|
34
|
+
- Introduce `pick` method to create sub DataFrame (#8)
|
35
|
+
- Add boolean support (#8)
|
36
|
+
- Refactor `pick` (#9)
|
37
|
+
- Introduce `drop` method to create sub DataFrame (#8)
|
38
|
+
- Add boolean support (#8)
|
39
|
+
- Refactor `drop` (#9)
|
40
|
+
- Add boolean array support for `[]` (#9)
|
41
|
+
- Add `indexes`/`indices` to use with selecting observations (#9)
|
42
|
+
- Introduce `slice` method to create sub DataFrame (#8)
|
43
|
+
- Refactor `slice` (#9)
|
44
|
+
- Introduce `remove` method to create sub DataFrame (#9)
|
45
|
+
- Introduce `rename` method to create sub DataFrame (#14)
|
46
|
+
- Introduce `assign` method to create sub DataFrame (#14)
|
47
|
+
- Improve to call block by instance_eval (#13)
|
48
|
+
|
49
|
+
- `Vector`
|
50
|
+
- Refine `find(function)`
|
51
|
+
- Add `min_max` method (#2)
|
52
|
+
- Add `std`/`sd` method (ddof=0 version: `stddev`) (#2)
|
53
|
+
- Add `var` method (ddof=0 version: `variance`) (#2)
|
54
|
+
- Add `VectorFunctions.arrow_doc(func_name)` (temporarily)
|
55
|
+
|
56
|
+
- Documentation
|
57
|
+
- Show code in README
|
58
|
+
- Change row/column names for **TDR** concept (#4)
|
59
|
+
- Add documents about **TDR** concept (#4)
|
60
|
+
- Add example about TDR (#4)
|
61
|
+
- Separate README to create DataFrame and Vector documents (#12)
|
62
|
+
- Add DataFrame model concept image to README (#12)
|
63
|
+
|
64
|
+
- GitHub site
|
65
|
+
- Switched to use merge on GitHub (not to push merged master) (#1)
|
66
|
+
- Create lifetime issue #3 to show the goal of this project (#3)
|
67
|
+
|
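The `std`/`var` entries in the 0.1.4 changelog name `stddev` and `variance` as the ddof=0 versions, which suggests that `std` and `var` use ddof=1; that reading is an assumption, not something the diff states. The difference can be sketched in plain Ruby. The helpers `variance` and `std` below are hypothetical and are not red_amber's implementation.

```ruby
# Hypothetical helpers (not part of red_amber): variance and standard
# deviation with a configurable delta degrees of freedom (ddof).
# nil entries are skipped before computing.
def variance(values, ddof: 1)
  xs = values.compact.map(&:to_f)
  n = xs.size
  raise ArgumentError, 'need more than ddof values' if n <= ddof

  mean = xs.sum / n
  # Divide the sum of squared deviations by (n - ddof).
  xs.sum { |x| (x - mean)**2 } / (n - ddof)
end

def std(values, ddof: 1)
  Math.sqrt(variance(values, ddof: ddof))
end

data = [2, 4, 4, 4, 5, 5, 7, 9]
puts variance(data, ddof: 0) # population variance (divides by n)
puts variance(data, ddof: 1) # sample variance (divides by n - 1)
puts std(data, ddof: 0)      # population standard deviation
```

With ddof=0 the divisor is the number of values; with ddof=1 it is one less, which gives the unbiased sample estimate.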
16
68
|
## [0.1.3] - 2022-05-15 (experimental)
|
17
69
|
|
18
70
|
- Bug fixes
|
data/README.md
CHANGED
@@ -23,134 +23,26 @@ gem 'red_amber'
|
|
23
23
|
|
24
24
|
And then execute:
|
25
25
|
|
26
|
-
|
26
|
+
```shell
|
27
|
+
bundle install
|
28
|
+
```
|
27
29
|
|
28
30
|
Or install it yourself as:
|
29
31
|
|
30
|
-
|
32
|
+
```shell
|
33
|
+
gem install red_amber
|
34
|
+
```
|
31
35
|
|
32
36
|
## `RedAmber::DataFrame`
|
33
37
|
|
34
|
-
|
35
|
-
|
36
|
-
- [x] `new` from a columnar Hash
|
37
|
-
- `RedAmber::DataFrame.new(x: [1, 2, 3])`
|
38
|
-
|
39
|
-
- [x] `new` from a schema (by Hash) and rows (by Array)
|
40
|
-
- `RedAmber::DataFrame.new({:x=>:uint8}, [[1], [2], [3]])`
|
41
|
-
|
42
|
-
- [x] `new` from an Arrow::Table
|
43
|
-
- `RedAmber::DataFrame.new(Arrow::Table.new(x: [1, 2, 3]))`
|
44
|
-
|
45
|
-
- [x] `new` from a Rover::DataFrame
|
46
|
-
- `RedAmber::DataFrame.new(Rover::DataFrame.new(x: [1, 2, 3]))`
|
47
|
-
|
48
|
-
- [x] `load` (class method)
|
49
|
-
|
50
|
-
- [x] from a [`.arrow`, `.arrows`, `.csv`, `.csv.gz`, `.tsv`] file
|
51
|
-
- `RedAmber::DataFrame.load("test/entity/with_header.csv")`
|
52
|
-
|
53
|
-
- [x] from a string buffer
|
54
|
-
|
55
|
-
- [x] from a URI
|
56
|
-
- `RedAmber::DataFrame.load(URI("https://github.com/heronshoes/red_amber/blob/master/test/entity/with_header.csv"))`
|
57
|
-
|
58
|
-
- [x] from a Parquet file
|
59
|
-
|
60
|
-
`red-parquet` gem is required.
|
61
|
-
|
62
|
-
```ruby
|
63
|
-
require 'parquet'
|
64
|
-
dataframe = RedAmber::DataFrame.load("file.parquet")
|
65
|
-
```
|
66
|
-
|
67
|
-
- [x] `save` (instance method)
|
68
|
-
|
69
|
-
- [x] to a [`.arrow`, `.arrows`, `.csv`, `.csv.gz`, `.tsv`] file
|
70
|
-
|
71
|
-
- [x] to a string buffer
|
72
|
-
|
73
|
-
- [x] to a URI
|
74
|
-
|
75
|
-
- [x] to a Parquet file
|
76
|
-
|
77
|
-
`red-parquet` gem is required.
|
78
|
-
|
79
|
-
```ruby
|
80
|
-
require 'parquet'
|
81
|
-
dataframe.save("file.parquet")
|
82
|
-
```
|
83
|
-
|
84
|
-
### Properties
|
85
|
-
|
86
|
-
- [x] `table`
|
87
|
-
|
88
|
-
Reader of Arrow::Table object inside.
|
89
|
-
|
90
|
-
- [x] `n_rows`, `nrow`, `size`, `length`
|
91
|
-
|
92
|
-
Returns num of rows (data size).
|
93
|
-
|
94
|
-
- [x] `n_columns`, `ncol`, `width`
|
95
|
-
|
96
|
-
Returns num of columns (num of vectors).
|
97
|
-
|
98
|
-
- [x] `shape`
|
99
|
-
|
100
|
-
Returns shape in an Array[n_rows, n_cols].
|
101
|
-
|
102
|
-
- [x] `column_names`, `keys`
|
103
|
-
|
104
|
-
Returns num of column names by an Array.
|
105
|
-
|
106
|
-
- [x] `types`
|
107
|
-
|
108
|
-
Returns types of columns by an Array of Symbols.
|
109
|
-
|
110
|
-
- [x] `data_types`
|
111
|
-
|
112
|
-
Returns types of columns by an Array of `Arrow::DataType`.
|
113
|
-
|
114
|
-
- [x] `vectors`
|
115
|
-
|
116
|
-
Returns an Array of Vectors.
|
117
|
-
|
118
|
-
- [x] `to_h`
|
119
|
-
|
120
|
-
Returns column-oriented data in a Hash.
|
121
|
-
|
122
|
-
- [x] `to_a`, `raw_records`
|
123
|
-
|
124
|
-
Returns an array of row-oriented data without header. If you need a column-oriented full array, use `.to_h.to_a`
|
125
|
-
|
126
|
-
- [x] `schema`
|
127
|
-
|
128
|
-
Returns column name and data type in a Hash.
|
129
|
-
|
130
|
-
- [x] `==`
|
131
|
-
|
132
|
-
- [x] `empty?`
|
133
|
-
|
134
|
-
### Output
|
135
|
-
|
136
|
-
- [x] `to_s`
|
137
|
-
|
138
|
-
- [ ] summary, describe
|
139
|
-
|
140
|
-
- [x] `to_rover`
|
141
|
-
|
142
|
-
Returns a `Rover::DataFrame`.
|
143
|
-
|
144
|
-
- [x] `inspect(tally_level: 5, max_element: 5)`
|
145
|
-
|
146
|
-
Shows some information about self in a transposed style.
|
38
|
+
Represents a set of data in a 2D shape.
|
147
39
|
|
148
40
|
```ruby
|
149
41
|
require 'red_amber'
|
150
42
|
require 'datasets-arrow'
|
151
43
|
|
152
44
|
penguins = Datasets::Penguins.new.to_arrow
|
153
|
-
RedAmber::DataFrame.new(penguins)
|
45
|
+
puts RedAmber::DataFrame.new(penguins).tdr
|
154
46
|
# =>
|
155
47
|
RedAmber::DataFrame : 344 x 8 Vectors
|
156
48
|
Vectors : 5 numeric, 3 strings
|
@@ -165,257 +57,57 @@ Vectors : 5 numeric, 3 strings
|
|
165
57
|
8 :year uint16 3 {2007=>110, 2008=>114, 2009=>120}
|
166
58
|
```
|
167
59
|
|
168
|
-
|
169
|
-
|
60
|
+
### DataFrame model
|
61
|
+

|
170
62
|
|
171
|
-
|
172
|
-
|
173
|
-
- [x] Select columns by `[]` as `[key]`, `[keys]`, `[keys[index]]`
|
174
|
-
- Key in a Symbol: `df[:symbol]`
|
175
|
-
- Key in a String: `df["string"]`
|
176
|
-
- Keys in an Array: `df[:symbol1, "string", :symbol2]`
|
177
|
-
- Keys in indices: `df[df.keys[0]]`, `df[df.keys[1,2]]`, `df[df.keys[1..]]`
|
178
|
-
- Keys in a Range:
|
179
|
-
An endless Range can be used to represent keys.
|
63
|
+
For example, `DataFrame#pick` accepts keys as an argument and returns a sub DataFrame.
|
180
64
|
|
181
65
|
```ruby
|
182
|
-
|
183
|
-
df = RedAmber::DataFrame.new(hash)
|
184
|
-
df[:b..:c, "a"]
|
66
|
+
df = penguins.pick(:body_mass_g)
|
185
67
|
# =>
|
186
|
-
RedAmber::DataFrame :
|
187
|
-
|
188
|
-
# key
|
189
|
-
1 :
|
190
|
-
2 :c double 3 [1.0, 2.0, 3.0]
|
191
|
-
3 :a uint8 3 [1, 2, 3]
|
68
|
+
#<RedAmber::DataFrame : 344 x 1 Vector, 0x000000000000fa14>
|
69
|
+
Vector : 1 numeric
|
70
|
+
# key type level data_preview
|
71
|
+
1 :body_mass_g int64 95 [3750, 3800, 3250, nil, 3450, ... ], 2 nils
|
192
72
|
```
|
193
73
|
|
194
|
-
|
195
|
-
- Select a row by index: `df[0]`
|
196
|
-
- Select rows by indices in a Range: `df[1..2]`
|
197
|
-
- Select rows by indices in an Array: `df[1, 2]`
|
198
|
-
- Mixed case: `df[2, 0..]`
|
199
|
-
|
200
|
-
- [x] Select rows from top or bottom
|
201
|
-
|
202
|
-
`head(n=5)`, `tail(n=5)`, `first(n=1)`, `last(n=1)`
|
74
|
+
`DataFrame#assign` can accept a block and create new variables.
|
203
75
|
|
204
|
-
|
205
|
-
|
206
|
-
|
207
|
-
|
208
|
-
|
209
|
-
|
210
|
-
|
211
|
-
|
212
|
-
|
213
|
-
|
214
|
-
|
215
|
-
|
216
|
-
- [ ] Update elements matching a condition
|
217
|
-
|
218
|
-
- [ ] Clamp
|
219
|
-
|
220
|
-
- [ ] Delete columns
|
221
|
-
|
222
|
-
- [ ] Rename a column
|
223
|
-
|
224
|
-
- [ ] Sort rows
|
225
|
-
|
226
|
-
- [ ] Clear data
|
227
|
-
|
228
|
-
### Treat na data
|
229
|
-
|
230
|
-
- [ ] Drop na (NaN, nil)
|
231
|
-
|
232
|
-
- [ ] Replace na with value
|
233
|
-
|
234
|
-
- [ ] Interpolate na with convolution array
|
235
|
-
|
236
|
-
### Combining DataFrames
|
237
|
-
|
238
|
-
- [ ] Add rows
|
239
|
-
|
240
|
-
- [ ] Add columns
|
241
|
-
|
242
|
-
- [ ] Inner join
|
243
|
-
|
244
|
-
- [ ] Left join
|
245
|
-
|
246
|
-
### Encoding
|
247
|
-
|
248
|
-
- [ ] One-hot encoding
|
76
|
+
```ruby
|
77
|
+
df.assign do
|
78
|
+
{:body_mass_kg => penguins[:body_mass_g] / 1000.0}
|
79
|
+
end
|
80
|
+
# =>
|
81
|
+
#<RedAmber::DataFrame : 344 x 2 Vectors, 0x000000000000fa28>
|
82
|
+
Vectors : 2 numeric
|
83
|
+
# key type level data_preview
|
84
|
+
1 :body_mass_g int64 95 [3750, 3800, 3250, nil, 3450, ... ], 2 nils
|
85
|
+
2 :body_mass_kg double 95 [3.75, 3.8, 3.25, nil, 3.45, ... ], 2 nils
|
86
|
+
```
|
249
87
|
|
250
|
-
|
88
|
+
Other DataFrame manipulation methods such as `pick`, `drop`, `slice`, `remove` and `rename` also accept a block.
|
251
89
|
|
252
|
-
|
90
|
+
See [DataFrame.md](doc/DataFrame.md) for details.
|
253
91
|
|
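The changelog for this release mentions that blocks are now called through `instance_eval` (#13). A minimal sketch of that pattern in plain Ruby follows; `TinyFrame` is a hypothetical stand-in for a data frame and does not reflect red_amber's actual code.

```ruby
# Hypothetical stand-in for a data frame; not red_amber's implementation.
class TinyFrame
  attr_reader :columns

  def initialize(columns)
    @columns = columns
  end

  def [](key)
    @columns.fetch(key)
  end

  # Evaluates the block with the frame as self, so the block can call
  # frame methods directly. The block must return a Hash of new columns.
  # A new frame is returned; the receiver is left unchanged.
  def assign(&block)
    TinyFrame.new(@columns.merge(instance_eval(&block)))
  end
end

frame = TinyFrame.new(body_mass_g: [3750, 3800, 3250])
result = frame.assign do
  { body_mass_kg: self[:body_mass_g].map { |g| g / 1000.0 } }
end
p result.columns
```

Because the block runs with the frame as `self`, it does not need a reference to the frame from the enclosing scope. The trade-off is that instance variables and private methods of the caller are not visible inside the block.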
254
92
|
|
255
93
|
## `RedAmber::Vector`
|
256
|
-
### Constructor
|
257
|
-
|
258
|
-
- [x] Create from a column in a DataFrame
|
259
|
-
|
260
|
-
- [x] New from an Array
|
261
|
-
|
262
|
-
### Properties
|
263
|
-
|
264
|
-
- [x] `to_s`
|
265
|
-
|
266
|
-
- [x] `values`, `to_a`, `entries`
|
267
|
-
|
268
|
-
- [x] `size`, `length`, `n_rows`, `nrow`
|
269
|
-
|
270
|
-
- [x] `type`
|
271
|
-
|
272
|
-
- [x] `data_type`
|
273
|
-
|
274
|
-
- [ ] `each`
|
275
|
-
|
276
|
-
- [ ] `chunked?`
|
277
|
-
|
278
|
-
- [ ] `n_chunks`
|
279
94
|
|
280
|
-
|
281
|
-
|
282
|
-
- [x] `tally`
|
283
|
-
|
284
|
-
- [x] `n_nils`, `n_nans`
|
285
|
-
|
286
|
-
- `n_nulls` is an alias of `n_nils`
|
287
|
-
|
288
|
-
- [x] `inspect(limit: 80)`
|
289
|
-
|
290
|
-
- `limit` sets size limit to display long array.
|
291
|
-
|
292
|
-
### Functions
|
293
|
-
#### Unary aggregations: vector.func => scalar
|
294
|
-
|
295
|
-
| Method |Boolean|Numeric|String|Options|Remarks|
|
296
|
-
| ----------- | --- | --- | --- | --- | --- |
|
297
|
-
| ✓ `all` | ✓ | | | ✓ ScalarAggregate| |
|
298
|
-
| ✓ `any` | ✓ | | | ✓ ScalarAggregate| |
|
299
|
-
| ✓ `approximate_median`| |✓| | ✓ ScalarAggregate| alias `median`|
|
300
|
-
| ✓ `count` | ✓ | ✓ | ✓ | ✓ Count | |
|
301
|
-
| ✓ `count_distinct`| ✓ | ✓ | ✓ | ✓ Count |alias `count_uniq`|
|
302
|
-
|[ ]`index` | [ ] | [ ] | [ ] |[ ] Index | |
|
303
|
-
| ✓ `max` | ✓ | ✓ | ✓ | ✓ ScalarAggregate| |
|
304
|
-
| ✓ `mean` | ✓ | ✓ | | ✓ ScalarAggregate| |
|
305
|
-
| ✓ `min` | ✓ | ✓ | ✓ | ✓ ScalarAggregate| |
|
306
|
-
|[ ]`min_max` | [ ] | [ ] | [ ] |[ ] ScalarAggregate| |
|
307
|
-
|[ ]`mode` | | [ ] | |[ ] Mode | |
|
308
|
-
| ✓ `product` | ✓ | ✓ | | ✓ ScalarAggregate| |
|
309
|
-
|[ ]`quantile`| | [ ] | |[ ] Quantile| |
|
310
|
-
|[ ]`stddev` | | ✓ | |[ ] Variance| |
|
311
|
-
| ✓ `sum` | ✓ | ✓ | | ✓ ScalarAggregate| |
|
312
|
-
|[ ]`tdigest` | | [ ] | |[ ] TDigest | |
|
313
|
-
|[ ]`variance`| | ✓ | |[ ] Variance| |
|
314
|
-
|
315
|
-
|
316
|
-
Options can be used as follows.
|
317
|
-
See the [document of C++ function](https://arrow.apache.org/docs/cpp/compute.html) for detail.
|
95
|
+
Class `RedAmber::Vector` represents a series of data in the DataFrame.
|
318
96
|
|
319
97
|
```ruby
|
320
|
-
|
321
|
-
|
322
|
-
#<RedAmber::Vector(:
|
323
|
-
[
|
324
|
-
|
325
|
-
double.count #=> 5
|
326
|
-
double.count(opts: {mode: :only_valid}) #=> 5, default
|
327
|
-
double.count(opts: {mode: :only_null}) #=> 1
|
328
|
-
double.count(opts: {mode: :all}) #=> 6
|
329
|
-
|
330
|
-
boolean = RedAmber::Vector.new([true, true, nil])
|
331
|
-
#=>
|
332
|
-
#<RedAmber::Vector(:boolean, size=3):0x000000000000f924>
|
333
|
-
[true, true, nil]
|
334
|
-
|
335
|
-
boolean.all #=> true
|
336
|
-
boolean.all(opts: {skip_nulls: true}) #=> true
|
337
|
-
boolean.all(opts: {skip_nulls: false}) #=> false
|
98
|
+
penguins[:species]
|
99
|
+
# =>
|
100
|
+
#<RedAmber::Vector(:string, size=344):0x000000000000f8e8>
|
101
|
+
["Adelie", "Adelie", "Adelie", "Adelie", "Adelie", "Adelie", "Adelie", "Adelie", ... ]
|
338
102
|
```
|
339
103
|
|
340
|
-
|
341
|
-
|
342
|
-
|
343
|
-
|
344
|
-
|
345
|
-
| ✓ `negate` | | ✓ | | |`-@` |
|
346
|
-
| ✓ `abs` | | ✓ | | | |
|
347
|
-
|[ ]`acos` | | [ ] | | | |
|
348
|
-
|[ ]`asin` | | [ ] | | | |
|
349
|
-
| ✓ `atan` | | ✓ | | | |
|
350
|
-
| ✓ `bit_wise_not`| | (✓) | | |integer only|
|
351
|
-
|[ ]`ceil` | | ✓ | | | |
|
352
|
-
| ✓ `cos` | | ✓ | | | |
|
353
|
-
|[ ]`floor` | | ✓ | | | |
|
354
|
-
| ✓ `invert` | ✓ | | | |`!`, alias `not`|
|
355
|
-
|[ ]`ln` | | [ ] | | | |
|
356
|
-
|[ ]`log10` | | [ ] | | | |
|
357
|
-
|[ ]`log1p` | | [ ] | | | |
|
358
|
-
|[ ]`log2` | | [ ] | | | |
|
359
|
-
|[ ]`round` | | [ ] | |[ ] Round| |
|
360
|
-
|[ ]`round_to_multiple`| | [ ] | |[ ] RoundToMultiple| |
|
361
|
-
| ✓ `sign` | | ✓ | | | |
|
362
|
-
| ✓ `sin` | | ✓ | | | |
|
363
|
-
| ✓ `tan` | | ✓ | | | |
|
364
|
-
|[ ]`trunc` | | ✓ | | | |
|
365
|
-
|
366
|
-
#### Binary element-wise: vector.func(vector) => vector
|
367
|
-
|
368
|
-
| Method |Boolean|Numeric|String|Options|Remarks|
|
369
|
-
| ----------------- | --- | --- | --- | --- | ----- |
|
370
|
-
| ✓ `add` | | ✓ | | | `+` |
|
371
|
-
| ✓ `atan2` | | ✓ | | | |
|
372
|
-
| ✓ `and_kleene` | ✓ | | | | `&` |
|
373
|
-
| ✓ `and_org ` | ✓ | | | |`and` in Red Arrow|
|
374
|
-
| ✓ `and_not` | ✓ | | | | |
|
375
|
-
| ✓ `and_not_kleene`| ✓ | | | | |
|
376
|
-
| ✓ `bit_wise_and` | | (✓) | | |integer only|
|
377
|
-
| ✓ `bit_wise_or` | | (✓) | | |integer only|
|
378
|
-
| ✓ `bit_wise_xor` | | (✓) | | |integer only|
|
379
|
-
| ✓ `divide` | | ✓ | | | `/` |
|
380
|
-
| ✓ `equal` | ✓ | ✓ | ✓ | |`==`, alias `eq`|
|
381
|
-
| ✓ `greater` | ✓ | ✓ | ✓ | |`>`, alias `gt`|
|
382
|
-
| ✓ `greater_equal` | ✓ | ✓ | ✓ | |`>=`, alias `ge`|
|
383
|
-
| ✓ `is_finite` | | ✓ | | | |
|
384
|
-
| ✓ `is_inf` | | ✓ | | | |
|
385
|
-
| ✓ `is_na` | ✓ | ✓ | ✓ | | |
|
386
|
-
| ✓ `is_nan` | | ✓ | | | |
|
387
|
-
|[ ]`is_nil` | ✓ | ✓ | ✓ |[ ] Null|alias `is_null`|
|
388
|
-
| ✓ `is_valid` | ✓ | ✓ | ✓ | | |
|
389
|
-
| ✓ `less` | ✓ | ✓ | ✓ | |`<`, alias `lt`|
|
390
|
-
| ✓ `less_equal` | ✓ | ✓ | ✓ | |`<=`, alias `le`|
|
391
|
-
|[ ]`logb` | | [ ] | | | |
|
392
|
-
|[ ]`mod` | | [ ] | | | `%` |
|
393
|
-
| ✓ `multiply` | | ✓ | | | `*` |
|
394
|
-
| ✓ `not_equal` | ✓ | ✓ | ✓ | |`!=`, alias `ne`|
|
395
|
-
| ✓ `or_kleene` | ✓ | | | | `\|` |
|
396
|
-
| ✓ `or_org` | ✓ | | | |`or` in Red Arrow|
|
397
|
-
| ✓ `power` | | ✓ | | | `**` |
|
398
|
-
| ✓ `subtract` | | ✓ | | | `-` |
|
399
|
-
| ✓ `shift_left` | | (✓) | | |`<<`, integer only|
|
400
|
-
| ✓ `shift_right` | | (✓) | | |`>>`, integer only|
|
401
|
-
| ✓ `xor` | ✓ | | | | `^` |
|
402
|
-
|
403
|
-
##### (Not implemented)
|
404
|
-
- [ ] sort, sort_index
|
405
|
-
- [ ] argmin, argmax
|
406
|
-
- [ ] (array functions)
|
407
|
-
- [ ] (strings functions)
|
408
|
-
- [ ] (temporal functions)
|
409
|
-
- [ ] (conditional functions)
|
410
|
-
- [ ] (index functions)
|
411
|
-
- [ ] (other functions)
|
412
|
-
|
413
|
-
### Coerce (not implemented)
|
414
|
-
|
415
|
-
### Updating (not implemented)
|
416
|
-
|
417
|
-
### DSL in a block for faster calculation ?
|
104
|
+
Vectors accept some [functional methods from Arrow](https://arrow.apache.org/docs/cpp/compute.html).
|
105
|
+
|
106
|
+
See [Vector.md](doc/Vector.md) for details.
|
107
|
+
|
108
|
+
## TDR concept
|
418
109
|
|
110
|
+
I named the data frame representation style in the model above TDR (Transposed DataFrame Representation). See [TDR.md](doc/tdr.md) for details.
|
419
111
|
|
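The transposed style can be illustrated with a small plain-Ruby sketch that prints one row per variable instead of one row per observation. The helper `transposed_rows` is hypothetical and is not how `tdr` is implemented; it also assumes that `level` in the output above means the number of distinct values, which matches the `:year` row.

```ruby
# Hypothetical sketch of a transposed summary; not red_amber's `tdr` code.
# Each output row describes one variable:
# [position, key, number of distinct values, preview of the data].
def transposed_rows(columns, limit: 5)
  columns.each_with_index.map do |(key, values), i|
    preview = values.first(limit).inspect
    # Mark the preview as truncated when values were left out.
    preview = preview.sub(/\]\z/, ', ... ]') if values.size > limit
    [i + 1, key.inspect, values.uniq.size, preview]
  end
end

table = {
  species: %w[Adelie Adelie Gentoo Chinstrap],
  year: [2007, 2008, 2008, 2009]
}

puts "#{table.values.first.size} x #{table.size} Vectors"
transposed_rows(table, limit: 3).each { |row| puts row.join(' ') }
```

Listing variables vertically keeps the summary readable even when a data set has many variables, since each one takes a single line regardless of how many observations there are.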
420
112
|
## Development
|
421
113
|
|