red_amber 0.2.0 → 0.2.1
- checksums.yaml +4 -4
- data/.rubocop.yml +2 -0
- data/CHANGELOG.md +58 -0
- data/README.md +38 -24
- data/doc/DataFrame.md +212 -80
- data/doc/Vector.md +7 -18
- data/doc/examples_of_red_amber.ipynb +2720 -524
- data/lib/red_amber/data_frame.rb +23 -4
- data/lib/red_amber/data_frame_displayable.rb +3 -3
- data/lib/red_amber/data_frame_reshaping.rb +10 -10
- data/lib/red_amber/data_frame_selectable.rb +53 -9
- data/lib/red_amber/data_frame_variable_operation.rb +44 -13
- data/lib/red_amber/vector.rb +1 -1
- data/lib/red_amber/vector_functions.rb +21 -24
- data/lib/red_amber/vector_selectable.rb +9 -8
- data/lib/red_amber/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: d239a3fa90e5796fb695f8d3c4995d0a2178ea7c8c2789bed157e688902585cb
|
4
|
+
data.tar.gz: 968c02294d24a3dabaa6e5128be0bcfad713e131df15850ac0ceb64c2883dcd0
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: d1c5ffd9650dd8c9e825514cd7e2ff4914690bd731ac262fca6cc17e56c1e312679689351a05fb741dccfb59377214706a8bf6ca6fe3237ca46fb623ae1b9f10
|
7
|
+
data.tar.gz: f37c4aff9170cd5105737a9d2b3d827051254dcca6968b697f5ed3a70e1b2c3cb14303e88a9c342870d1447450a538e445d6f3d37de53591d3f6d13b87aebc16
|
data/.rubocop.yml
CHANGED
@@ -94,6 +94,8 @@ Metrics/MethodLength:
|
|
94
94
|
Max: 30
|
95
95
|
Exclude:
|
96
96
|
- 'lib/red_amber/data_frame_displayable.rb' # Max: 33
|
97
|
+
- 'lib/red_amber/data_frame_selectable.rb' # Max: 38
|
98
|
+
- 'lib/red_amber/data_frame_variable_operation.rb' # Max: 35
|
97
99
|
|
98
100
|
# Max: 100
|
99
101
|
Metrics/ModuleLength:
|
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,61 @@
|
|
1
|
+
## [0.2.1] - 2022-09-07
|
2
|
+
|
3
|
+
- Bug fixes
|
4
|
+
|
5
|
+
- Fix `Vector#each` with block (#66)
|
6
|
+
`Vector#each` will return value of each element with block.
|
7
|
+
|
8
|
+
- Fix table format at size == 9 (#67)
|
9
|
+
|
10
|
+
- Fix to support Vector in `DataFrame#assign` (#77)
|
11
|
+
|
12
|
+
- Add `assert_delta` functionality for `assert_with_NaN` (#78)
|
13
|
+
|
14
|
+
- Fix Vector#is_in when self is chunked (#79)
|
15
|
+
|
16
|
+
- Fix Array type error (uint/int) (#79)
|
17
|
+
|
18
|
+
- New features and improvements
|
19
|
+
|
20
|
+
- Refine `DataFrame#indices` method (#67)
|
21
|
+
|
22
|
+
- Update DataFrame reshaping methods (#73)
|
23
|
+
|
24
|
+
- Change default option value of DataFrame reshaping
|
25
|
+
|
26
|
+
- Change the order of import_cars example
|
27
|
+
|
28
|
+
- Add `DataFrame#method_missing` to get column vector by method (#75)
|
29
|
+
|
30
|
+
- Add `DataFrame#method_missing` to get column (#75)
|
31
|
+
|
32
|
+
- Accept both args and block in `DataFrame#assign` (#75)
|
33
|
+
|
34
|
+
- Accept indices in `DataFrame#pick` and `DataFrame#drop` (#76)
|
35
|
+
|
36
|
+
- Add `DataFrame#slice_by` method (#77)
|
37
|
+
|
38
|
+
- Add new Vector functions (#78)
|
39
|
+
|
40
|
+
- Add inverse trigonometric function for Vector
|
41
|
+
- `acos`
|
42
|
+
- `asin`
|
43
|
+
|
44
|
+
- Add logarithmic function for Vector
|
45
|
+
- `ln`
|
46
|
+
- `log10`
|
47
|
+
- `log1p`
|
48
|
+
- `log2`
|
49
|
+
|
50
|
+
- Add binary function `Vector#logb`
|
51
|
+
|
52
|
+
- Docker image and Jupyter Notebook (Thanks to @mrkn)
|
53
|
+
- Add link to RubyData in README
|
54
|
+
- Add link to interactive README by Binder
|
55
|
+
|
56
|
+
- Update Jupyter Notebook `71 examples of RedAmber`
|
57
|
+
|
58
|
+
|
1
59
|
## [0.2.0] - 2022-08-15
|
2
60
|
|
3
61
|
- Bump version up to 0.2.0
|
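The `DataFrame#method_missing` entries in the changelog above (#75) describe reading a column by calling its key as a method. A minimal plain-Ruby sketch of that pattern follows. `TinyFrame` and its internals are hypothetical stand-ins used only for illustration; this is not RedAmber's implementation.

```ruby
# Hypothetical stand-in for a data frame; not RedAmber's actual code.
class TinyFrame
  def initialize(columns)
    @columns = columns # Hash of Symbol => Array
  end

  def keys
    @columns.keys
  end

  def [](key)
    @columns.fetch(key)
  end

  # Treat an unknown zero-argument method call as a column lookup.
  def method_missing(name, *args, &block)
    return self[name] if args.empty? && block.nil? && @columns.key?(name)

    super
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @columns.key?(name) || super
  end
end

frame = TinyFrame.new(species: %w[Adelie Gentoo], body_mass_g: [3750, 5400])
p frame.body_mass_g           # same data as frame[:body_mass_g]
p frame.respond_to?(:species) # true, via respond_to_missing?
```

Defining `respond_to_missing?` alongside `method_missing` is the usual Ruby convention, so that introspection agrees with what the object actually answers to. Names that are not column keys still fall through to `super` and raise `NoMethodError`.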
data/README.md
CHANGED
@@ -53,9 +53,20 @@ Or install it yourself as:
|
|
53
53
|
gem install red_amber
|
54
54
|
```
|
55
55
|
|
56
|
+
## Docker image and Jupyter Notebook
|
57
|
+
|
58
|
+
[RubyData Docker Stacks](https://github.com/RubyData/docker-stacks) is available as a ready-to-run Docker image containing Jupyter and useful data tools as well as RedAmber (Thanks to @mrkn).
|
59
|
+
|
60
|
+
Also you can try the contents of this README interactively by [Binder](https://mybinder.org/v2/gh/RubyData/docker-stacks/master?filepath=red-amber.ipynb).
|
61
|
+
[](https://mybinder.org/v2/gh/RubyData/docker-stacks/master?filepath=red-amber.ipynb)
|
62
|
+
|
63
|
+
|
64
|
+
|
56
65
|
## `RedAmber::DataFrame`
|
57
66
|
|
58
|
-
|
67
|
+
It represents a set of data in 2D-shape. The entity is a Red Arrow's Table object.
|
68
|
+
|
69
|
+

|
59
70
|
|
60
71
|
```ruby
|
61
72
|
require 'red_amber' # require 'red-amber' is also OK.
|
@@ -79,18 +90,15 @@ penguins = RedAmber::DataFrame.new(arrow)
|
|
79
90
|
344 Gentoo Biscoe 49.9 16.1 213 ... 2009
|
80
91
|
```
|
81
92
|
|
82
|
-
|
83
|
-

|
84
|
-
|
85
|
-
For example, `DataFrame#pick` accepts keys as an argument and returns a sub DataFrame.
|
93
|
+
For example, `DataFrame#pick` accepts keys as arguments and returns a sub DataFrame.
|
86
94
|
|
87
95
|

|
88
96
|
|
89
97
|
```ruby
|
90
98
|
penguins.keys
|
91
99
|
# =>
|
92
|
-
[:species,
|
93
|
-
:island,
|
100
|
+
[:species,
|
101
|
+
:island,
|
94
102
|
:bill_length_mm,
|
95
103
|
:bill_depth_mm,
|
96
104
|
:flipper_length_mm,
|
@@ -102,17 +110,17 @@ df = penguins.pick(:species, :island, :body_mass_g)
|
|
102
110
|
df
|
103
111
|
|
104
112
|
# =>
|
105
|
-
#<RedAmber::DataFrame : 344 x 3 Vectors, 0x000000000003cc1c>
|
106
|
-
species island body_mass_g
|
107
|
-
<string> <string> <uint16>
|
108
|
-
1 Adelie Torgersen 3750
|
109
|
-
2 Adelie Torgersen 3800
|
110
|
-
3 Adelie Torgersen 3250
|
111
|
-
4 Adelie Torgersen (nil)
|
112
|
-
5 Adelie Torgersen 3450
|
113
|
-
: : : :
|
114
|
-
342 Gentoo Biscoe 5750
|
115
|
-
343 Gentoo Biscoe 5200
|
113
|
+
#<RedAmber::DataFrame : 344 x 3 Vectors, 0x000000000003cc1c>
|
114
|
+
species island body_mass_g
|
115
|
+
<string> <string> <uint16>
|
116
|
+
1 Adelie Torgersen 3750
|
117
|
+
2 Adelie Torgersen 3800
|
118
|
+
3 Adelie Torgersen 3250
|
119
|
+
4 Adelie Torgersen (nil)
|
120
|
+
5 Adelie Torgersen 3450
|
121
|
+
: : : :
|
122
|
+
342 Gentoo Biscoe 5750
|
123
|
+
343 Gentoo Biscoe 5200
|
116
124
|
344 Gentoo Biscoe 5400
|
117
125
|
```
|
118
126
|
|
@@ -120,7 +128,7 @@ df
|
|
120
128
|
|
121
129
|

|
122
130
|
|
123
|
-
You can specify by keys or a boolean array
|
131
|
+
You can specify by keys or a boolean array of same size as n_keys.
|
124
132
|
|
125
133
|
```ruby
|
126
134
|
# Same as df.drop(:species, :island)
|
@@ -214,7 +222,13 @@ penguins.remove(penguins[:bill_length_mm] < 40)
|
|
214
222
|
|
215
223
|
DataFrame manipulating methods like `pick`, `drop`, `slice`, `remove`, `rename` and `assign` accept a block.
|
216
224
|
|
217
|
-
|
225
|
+
Previous example is also OK with a block.
|
226
|
+
|
227
|
+
```ruby
|
228
|
+
penguins.remove { bill_length_mm < 40 }
|
229
|
+
```
|
230
|
+
|
231
|
+
Next example is an usage of block to update a column.
|
218
232
|
|
219
233
|
```ruby
|
220
234
|
df = RedAmber::DataFrame.new(
|
@@ -312,8 +326,8 @@ starwars
|
|
312
326
|
86 86 Captain Phasma (nil) (nil) unknown unknown unknown ... NA
|
313
327
|
87 87 Padmé Amidala 165 45.0 brown light brown ... Human
|
314
328
|
|
315
|
-
|
316
|
-
|
329
|
+
starwars.group(:species) { [count(:species), mean(:height, :mass)] }
|
330
|
+
.slice { count > 1 }
|
317
331
|
|
318
332
|
# =>
|
319
333
|
#<RedAmber::DataFrame : 9 x 4 Vectors, 0x000000000006e848>
|
@@ -324,7 +338,7 @@ grouped.slice { v(:count) > 1 }
|
|
324
338
|
3 Wookiee 2 231.0 124.0
|
325
339
|
4 Gungan 3 208.7 74.0
|
326
340
|
5 NA 4 181.3 48.0
|
327
|
-
|
341
|
+
6 Zabrak 2 173.0 80.0
|
328
342
|
7 Twi'lek 2 179.0 55.0
|
329
343
|
8 Mirialan 2 168.0 53.1
|
330
344
|
9 Kaminoan 2 221.0 88.0
|
@@ -374,7 +388,7 @@ See [Vector.md](doc/Vector.md) for details.
|
|
374
388
|
|
375
389
|
## Jupyter notebook
|
376
390
|
|
377
|
-
[
|
391
|
+
[71 Examples of Red Amber](doc/examples_of_red_amber.ipynb) shows more examples in jupyter notebook.
|
378
392
|
|
379
393
|
## Development
|
380
394
|
|
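The README changes above state that `pick` and `drop` accept either keys or a boolean array with one entry per key. A rough plain-Ruby sketch of that selection rule is shown here, using a Hash of arrays as a hypothetical stand-in for a DataFrame. The helper names `resolve_keys`, `pick_columns` and `drop_columns` are invented for this illustration and are not part of RedAmber.

```ruby
# Hypothetical helpers over a Hash of Symbol => Array; not RedAmber's API.

# Turn either a list of keys or a boolean mask into a list of keys.
def resolve_keys(table, selectors)
  mask = !selectors.empty? && selectors.all? { |s| s == true || s == false }
  return selectors unless mask

  unless selectors.size == table.size
    raise ArgumentError,
          "mask size #{selectors.size} does not match number of keys #{table.size}"
  end

  # Keep the keys whose mask entry is true, preserving column order.
  table.keys.select.with_index { |_key, i| selectors[i] }
end

# Return a new Hash containing only the selected columns.
def pick_columns(table, *selectors)
  table.slice(*resolve_keys(table, selectors.flatten))
end

# Return a new Hash without the selected columns.
def drop_columns(table, *selectors)
  dropped = resolve_keys(table, selectors.flatten)
  table.reject { |key, _values| dropped.include?(key) }
end

table = {
  species: %w[Adelie Gentoo],
  island: %w[Torgersen Biscoe],
  body_mass_g: [3750, 5400]
}

p pick_columns(table, :species, :body_mass_g).keys
p drop_columns(table, [true, true, false]).keys # same as dropping :species and :island
```

Checking the mask length up front mirrors the "same size as n_keys" requirement in the README text; a shorter or longer mask raises instead of silently selecting the wrong columns.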