red_amber 0.1.7 → 0.1.8

Sign up to get free protection for your applications and to get access to all the features.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 88bdd603d8daec1a95c0277ef68857f84346ad7cf95d0ba23a306e6b70567c29
4
- data.tar.gz: 40add80cbaa5183ca0e93eadcdcd1fead37015cac1cb2360660002c0b1878255
3
+ metadata.gz: 3853e70f378cac65013a3bcfc51a2d55cb70cc494f3f3b70675bed944cc15b49
4
+ data.tar.gz: 3c65999cf978f1edf8c2c7fcce9a0ccb192d4da051f34fa0bf3f66ddc178eb1c
5
5
  SHA512:
6
- metadata.gz: d043eea51117ecc48bdc52fa951e24d2618f273eb289a30f5bbb182e1a891763cdd35f6a7c6764f6e0061bddeaaa86b2374de1dc2b48f25a5b6b05c9af83a0e3
7
- data.tar.gz: cdbba19750bf71fe99e55bf6c46cb4522018f43563d7a93fdc375987f9388234e4f7e833297fdb6b8dd5a41b5a1bfdbf287ea47663f5f8a90facb56a4c63daef
6
+ metadata.gz: fac66ba0bf5955cfe0d21a51b90ec16407182b9053e9b586dfe9f8e2526de4e90efecdd8eba1e8b3c99b12fc44544c82fb2f6af4b666b97876a64a6ee4deedf1
7
+ data.tar.gz: 1a4cc526ce9f097438f2b7d018552a4cd6aaa2d900012297cd1777c4b9e39063cc2988af91c138e93f291a56175aefb6a6b00c211f9b9c5bd38d75d6bc40acb9
data/.rubocop.yml CHANGED
@@ -43,6 +43,11 @@ Lint/BinaryOperatorWithIdenticalOperands:
43
43
  Exclude:
44
44
  - 'test/test_vector_function.rb'
45
45
 
46
+ # Need for test with empty block
47
+ Lint/EmptyBlock:
48
+ Exclude:
49
+ - 'test/test_group.rb'
50
+
46
51
  # Max: 120
47
52
  Layout/LineLength:
48
53
  Max: 118
@@ -78,9 +83,10 @@ Metrics/ClassLength:
78
83
  Metrics/CyclomaticComplexity:
79
84
  Max: 12
80
85
  Exclude:
86
+ - 'lib/red_amber/data_frame_displayable.rb' # Max: 18
81
87
  - 'lib/red_amber/data_frame_selectable.rb' # Max: 14
88
+ - 'lib/red_amber/vector_selectable.rb' # Max: 13
82
89
  - 'lib/red_amber/vector_updatable.rb' # Max: 14
83
- - 'lib/red_amber/data_frame_displayable.rb' # Max: 18
84
90
 
85
91
  # Max: 10
86
92
  Metrics/MethodLength:
data/.rubocop_todo.yml CHANGED
@@ -1,15 +1,2 @@
1
- # This configuration was generated by
2
- # `rubocop --auto-gen-config`
3
- # on 2022-05-08 02:37:36 UTC using RuboCop version 1.27.0.
4
- # The point is for the user to remove these configuration records
5
- # one by one as the offenses are removed from the code base.
6
- # Note that changes in the inspected code, or installation of new
7
- # versions of RuboCop, may require this file to be generated again.
8
-
9
- # Offense count: 1
10
- # This cop supports unsafe auto-correction (--auto-correct-all).
11
- # Configuration parameters: EnforcedStyle.
12
- # SupportedStyles: forbid_for_all_comparison_operators, forbid_for_equality_operators_only, require_for_all_comparison_operators, require_for_equality_operators_only
13
- Style/YodaCondition:
14
- Exclude:
15
- - 'lib/red_amber/data_frame.rb'
1
+ # We will use cops to detect bugs in an early stage
2
+ # Feel free to use .rubocop_todo.yml by --auto-gen-config
data/.yardopts ADDED
@@ -0,0 +1 @@
1
+ --output-dir doc/yard
data/CHANGELOG.md CHANGED
@@ -2,6 +2,41 @@
2
2
 
3
3
  - Supports Arrow 9.0.0
4
4
 
5
+ ## [0.1.8] - 2022-08-04 (experimental)
6
+
7
+ - Bug fixes
8
+
9
+ - Fix unnamed column in table formatter (#52)
10
+ - Fix DataFrame#key?, DataFrame#key_index when @keys.nil? (#52)
11
+ - Align order of replacer in Vector#replace (#53, resolved #38)
12
+
13
+ - New features and improvements
14
+
15
+ - Refine DataFrame.new for empty arguments (#50)
16
+ - Delete .rubocop_todo.yml for not to use yoda condition (#50)
17
+
18
+ - Refine Group (#52, resolved #28)
19
+ - Refine Group methods creation
20
+ - Make group key at first(left)
21
+ - Show only one group count when same counts
22
+ - Add block acceptability for group
23
+ - Rename empty key to :unnamed in DataFrame.new
24
+ - Rename Group#aggregated_by to #summarize (#54)
25
+
26
+ - Add Vector#shift (#51)
27
+
28
+ - Vector#[] accepts Range as an argument (#51)
29
+
30
+ - Update documents
31
+
32
+ - Add support for yard (#54)
33
+
34
+ - Renew jupyter notebook '53 examples' (#54)
35
+
36
+ - Add more examples and images in README (#52)
37
+ - Add document of group manipulations in README (#52)
38
+ - Renew DF#group document in DataFrame.md (#52)
39
+
5
40
  ## [0.1.7] - 2022-07-15 (experimental)
6
41
 
7
42
  - Bug fixes
data/Gemfile CHANGED
@@ -18,6 +18,7 @@ group :test do
18
18
  gem 'iruby'
19
19
  gem 'test-unit'
20
20
  gem 'webrick'
21
+ gem 'yard'
21
22
 
22
23
  gem 'benchmark_driver'
23
24
  gem 'red-datasets'
data/README.md CHANGED
@@ -56,7 +56,7 @@ require 'red_amber' # require 'red-amber' is also OK.
56
56
  require 'datasets-arrow'
57
57
 
58
58
  arrow = Datasets::Penguins.new.to_arrow
59
- RedAmber::DataFrame.new(arrow)
59
+ penguins = RedAmber::DataFrame.new(arrow)
60
60
 
61
61
  # =>
62
62
  #<RedAmber::DataFrame : 344 x 8 Vectors, 0x0000000000013790>
@@ -78,28 +78,71 @@ RedAmber::DataFrame.new(arrow)
78
78
 
79
79
  For example, `DataFrame#pick` accepts keys as an argument and returns a sub DataFrame.
80
80
 
81
+ ![pick method image](doc/image/dataframe/pick.png)
82
+
81
83
  ```ruby
82
- df = penguins.pick(:body_mass_g)
84
+ penguins.keys
85
+ # =>
86
+ [:species,
87
+ :island,
88
+ :bill_length_mm,
89
+ :bill_depth_mm,
90
+ :flipper_length_mm,
91
+ :body_mass_g,
92
+ :sex,
93
+ :year]
94
+
95
+ df = penguins.pick(:species, :island, :body_mass_g)
83
96
  df
84
97
 
85
98
  # =>
86
- #<RedAmber::DataFrame : 344 x 1 Vector, 0x0000000000015cc0>
87
- body_mass_g
88
- <uint16>
89
- 1 3750
90
- 2 3800
91
- 3 3250
92
- 4 (nil)
93
- 5 3450
94
- : :
95
- 342 5750
96
- 343 5200
99
+ #<RedAmber::DataFrame : 344 x 3 Vectors, 0x000000000003cc1c>
100
+ species island body_mass_g
101
+ <string> <string> <uint16>
102
+ 1 Adelie Torgersen 3750
103
+ 2 Adelie Torgersen 3800
104
+ 3 Adelie Torgersen 3250
105
+ 4 Adelie Torgersen (nil)
106
+ 5 Adelie Torgersen 3450
107
+ : : : :
108
+ 342 Gentoo Biscoe 5750
109
+ 343 Gentoo Biscoe 5200
110
+ 344 Gentoo Biscoe 5400
111
+ ```
112
+
113
+ `DataFrame#drop` drops some columns to create a remainer DataFrame.
114
+
115
+ ![drop method image](doc/image/dataframe/drop.png)
116
+
117
+ You can specify by keys or a boolean array (same size as n_keys).
118
+
119
+ ```ruby
120
+ # Same as df.drop(:species, :island)
121
+ df = df.drop(true, true, false)
122
+
123
+ # =>
124
+ #<RedAmber::DataFrame : 344 x 1 Vector, 0x0000000000048760>
125
+ body_mass_g
126
+ <uint16>
127
+ 1 3750
128
+ 2 3800
129
+ 3 3250
130
+ 4 (nil)
131
+ 5 3450
132
+ : :
133
+ 342 5750
134
+ 343 5200
97
135
  344 5400
98
136
  ```
99
137
 
138
+ Arrow data is immutable, so these methods always return an new object.
139
+
100
140
  `DataFrame#assign` creates new variables (column in the table).
101
141
 
142
+ ![assign method image](doc/image/dataframe/assign.png)
143
+
102
144
  ```ruby
145
+ # New column is created because ':body_mass_kg' is a new key.
103
146
  df.assign(:body_mass_kg => df[:body_mass_g] / 1000.0)
104
147
 
105
148
  # =>
@@ -117,12 +160,97 @@ df.assign(:body_mass_kg => df[:body_mass_g] / 1000.0)
117
160
  344 5400 5.4
118
161
  ```
119
162
 
163
+ `DataFrame#slice` selects rows (observations) to create a sub DataFrame.
164
+
165
+ ![slice method image](doc/image/dataframe/slice.png)
166
+
167
+ ```ruby
168
+ # returns 5 rows at the start and 5 rows from the end
169
+ penguins.slice(0...5, -5..-1)
170
+
171
+ # =>
172
+ #<RedAmber::DataFrame : 10 x 8 Vectors, 0x0000000000042be4>
173
+ species island bill_length_mm bill_depth_mm flipper_length_mm ... year
174
+ <string> <string> <double> <double> <uint8> ... <uint16>
175
+ 1 Adelie Torgersen 39.1 18.7 181 ... 2007
176
+ 2 Adelie Torgersen 39.5 17.4 186 ... 2007
177
+ 3 Adelie Torgersen 40.3 18.0 195 ... 2007
178
+ 4 Adelie Torgersen (nil) (nil) (nil) ... 2007
179
+ 5 Adelie Torgersen 36.7 19.3 193 ... 2007
180
+ : : : : : : ... :
181
+ 8 Gentoo Biscoe 50.4 15.7 222 ... 2009
182
+ 9 Gentoo Biscoe 45.2 14.8 212 ... 2009
183
+ 10 Gentoo Biscoe 49.9 16.1 213 ... 2009
184
+ ```
185
+
186
+ `DataFrame#remove` rejects rows (observations) to create a remainer DataFrame.
187
+
188
+ ![remove method image](doc/image/dataframe/remove.png)
189
+
190
+ ```ruby
191
+ # penguins[:bill_length_mm] < 40 returns a boolean Vector
192
+ penguins.remove(penguins[:bill_length_mm] < 40)
193
+
194
+ # =>
195
+ #<RedAmber::DataFrame : 244 x 8 Vectors, 0x000000000007d6f4>
196
+ species island bill_length_mm bill_depth_mm flipper_length_mm ... year
197
+ <string> <string> <double> <double> <uint8> ... <uint16>
198
+ 1 Adelie Torgersen 40.3 18.0 195 ... 2007
199
+ 2 Adelie Torgersen (nil) (nil) (nil) ... 2007
200
+ 3 Adelie Torgersen 42.0 20.2 190 ... 2007
201
+ 4 Adelie Torgersen 41.1 17.6 182 ... 2007
202
+ 5 Adelie Torgersen 42.5 20.7 197 ... 2007
203
+ : : : : : : ... :
204
+ 242 Gentoo Biscoe 50.4 15.7 222 ... 2009
205
+ 243 Gentoo Biscoe 45.2 14.8 212 ... 2009
206
+ 244 Gentoo Biscoe 49.9 16.1 213 ... 2009
207
+ ```
208
+
120
209
  DataFrame manipulating methods like `pick`, `drop`, `slice`, `remove`, `rename` and `assign` accept a block.
121
210
 
122
- This is an exaple to eliminate observations (row in the table) containing nil.
211
+ This example is usage of block to update numeric columns.
123
212
 
124
213
  ```ruby
125
- # remove all observation contains nil
214
+ df = RedAmber::DataFrame.new(
215
+ integer: [0, 1, 2, 3, nil],
216
+ float: [0.0, 1.1, 2.2, Float::NAN, nil],
217
+ string: ['A', 'B', 'C', 'D', nil],
218
+ boolean: [true, false, true, false, nil])
219
+ df
220
+
221
+ # =>
222
+ #<RedAmber::DataFrame : 5 x 4 Vectors, 0x000000000003131c>
223
+ integer float string boolean
224
+ <uint8> <double> <string> <boolean>
225
+ 1 0 0.0 A true
226
+ 2 1 1.1 B false
227
+ 3 2 2.2 C true
228
+ 4 3 NaN D false
229
+ 5 (nil) (nil) (nil) (nil)
230
+
231
+ df.assign do
232
+ vectors.each_with_object({}) do |v, h|
233
+ h[v.key] = -v if v.numeric?
234
+ end
235
+ end
236
+
237
+ # =>
238
+ #<RedAmber::DataFrame : 5 x 4 Vectors, 0x000000000009a1b4>
239
+ integer float string boolean
240
+ <uint8> <double> <string> <boolean>
241
+ 1 0 -0.0 A true
242
+ 2 255 -1.1 B false
243
+ 3 254 -2.2 C true
244
+ 4 253 NaN D false
245
+ 5 (nil) (nil) (nil) (nil)
246
+ ```
247
+
248
+ Negate (-@) method of unsigned integer Vector returns complement.
249
+
250
+ Next example is to eliminate observations (row in the table) containing nil.
251
+
252
+ ```ruby
253
+ # remove all observations containing nil
126
254
  nil_removed = penguins.remove { vectors.map(&:is_nil).reduce(&:|) }
127
255
  nil_removed.tdr
128
256
  # =>
@@ -145,12 +273,51 @@ For this frequently needed task, we can do it much simpler.
145
273
  penguins.remove_nil # => same result as above
146
274
  ```
147
275
 
276
+ `DataFrame#group` method can be used for the grouping tasks.
277
+
278
+ ```ruby
279
+ starwars = RedAmber::DataFrame.load(URI("https://vincentarelbundock.github.io/Rdatasets/csv/dplyr/starwars.csv"))
280
+ starwars
281
+
282
+ # =>
283
+ #<RedAmber::DataFrame : 87 x 12 Vectors, 0x000000000000607c>
284
+ unnamed1 name height mass hair_color skin_color eye_color ... species
285
+ <int64> <string> <int64> <double> <string> <string> <string> ... <string>
286
+ 1 1 Luke Skywalker 172 77.0 blond fair blue ... Human
287
+ 2 2 C-3PO 167 75.0 NA gold yellow ... Droid
288
+ 3 3 R2-D2 96 32.0 NA white, blue red ... Droid
289
+ 4 4 Darth Vader 202 136.0 none white yellow ... Human
290
+ 5 5 Leia Organa 150 49.0 brown light brown ... Human
291
+ : : : : : : : : ... :
292
+ 85 85 BB8 (nil) (nil) none none black ... Droid
293
+ 86 86 Captain Phasma (nil) (nil) unknown unknown unknown ... NA
294
+ 87 87 Padmé Amidala 165 45.0 brown light brown ... Human
295
+
296
+ grouped = starwars.group(:species) { [count(:species), mean(:height, :mass)] }
297
+ grouped.slice { v(:count) > 1 }
298
+
299
+ # =>
300
+ #<RedAmber::DataFrame : 9 x 4 Vectors, 0x000000000006e848>
301
+ species count mean(height) mean(mass)
302
+ <string> <int64> <double> <double>
303
+ 1 Human 35 176.6 82.8
304
+ 2 Droid 6 131.2 69.8
305
+ 3 Wookiee 2 231.0 124.0
306
+ 4 Gungan 3 208.7 74.0
307
+ 5 NA 4 181.3 48.0
308
+ : : : : :
309
+ 7 Twi'lek 2 179.0 55.0
310
+ 8 Mirialan 2 168.0 53.1
311
+ 9 Kaminoan 2 221.0 88.0
312
+ ```
313
+
148
314
  See [DataFrame.md](doc/DataFrame.md) for details.
149
315
 
150
316
 
151
317
  ## `RedAmber::Vector`
152
318
 
153
319
  Class `RedAmber::Vector` represents a series of data in the DataFrame.
320
+ Method `RedAmber::DataFrame#[key]` returns a Vector with the key `key`.
154
321
 
155
322
  ```ruby
156
323
  penguins[:bill_length_mm]
@@ -161,11 +328,34 @@ penguins[:bill_length_mm]
161
328
 
162
329
  Vectors accepts some [functional methods from Arrow](https://arrow.apache.org/docs/cpp/compute.html).
163
330
 
331
+ This is an element-wise comparison and returns a boolean Vector of same size.
332
+
333
+ ![unary element-wise](doc/image/vector/unary_element_wise.png)
334
+
335
+ ```ruby
336
+ penguins[:bill_length_mm] < 40
337
+
338
+ # =>
339
+ #<RedAmber::Vector(:boolean, size=344):0x000000000007e7ac>
340
+ [true, true, false, nil, true, true, true, true, true, false, true, true, false, ... ]
341
+ ```
342
+
343
+ Next example returns aggregated result.
344
+
345
+ ![unary aggregation](doc/image/vector/unary_aggregation.png)
346
+
347
+ ```ruby
348
+ penguins[:bill_length_mm].mean
349
+ 43.92192982456141
350
+ # =>
351
+
352
+ ```
353
+
164
354
  See [Vector.md](doc/Vector.md) for details.
165
355
 
166
356
  ## Jupyter notebook
167
357
 
168
- [47 Examples of Red Amber](doc/47_examples_of_red_amber.ipynb)
358
+ [53 Examples of Red Amber](doc/examples_of_red_amber.ipynb)
169
359
 
170
360
  ## Development
171
361
 
data/doc/DataFrame.md CHANGED
@@ -860,16 +860,10 @@ penguins.to_rover
860
860
 
861
861
  ## Grouping
862
862
 
863
- ### `group(aggregating_keys)`
864
-
865
- (
866
- This API will change in the future version. Especcially I want to change:
867
- - Order of the column of the result (aggregation_keys should be the first)
868
- - DataFrame#group will accept a block (heronshoes/red_amber #28)
869
- )
863
+ ### `group(group_keys)`
870
864
 
871
865
  `group` creates a class `Group` object. `Group` accepts functions below as a method.
872
- Method accepts options as `summary_keys`.
866
+ Method accepts options as `group_keys`.
873
867
 
874
868
  Available functions are:
875
869
 
@@ -889,8 +883,8 @@ penguins.to_rover
889
883
  - [ ] tdigest
890
884
  - ✓ variance
891
885
 
892
- For the each group of `aggregation_keys`, the aggregation `function` is applied and returns a new dataframe with aggregated keys according to `summary_keys`.
893
- Aggregated key name is `function(summary_key)` style.
886
+ For the each group of `group_keys`, the aggregation `function` is applied and returns a new dataframe with aggregated keys according to `summary_keys`.
887
+ Summary key names are provided by `function(summary_keys)` style.
894
888
 
895
889
  This is an example of grouping of famous STARWARS dataset.
896
890
 
@@ -900,18 +894,18 @@ penguins.to_rover
900
894
  starwars
901
895
 
902
896
  # =>
903
- #<RedAmber::DataFrame : 87 x 12 Vectors, 0x00000000000773bc>
904
- species name height mass hair_color skin_color eye_color ... homeworld
905
- <string> <string> <int64> <double> <string> <string> <string> ... <string>
906
- Human 1 Luke Skywalker 172 77.0 blond fair blue ... Tatooine
907
- Droid 2 C-3PO 167 75.0 NA gold yellow ... Tatooine
908
- Droid 3 R2-D2 96 32.0 NA white, blue red ... Naboo
909
- Human 4 Darth Vader 202 136.0 none white yellow ... Tatooine
910
- Human 5 Leia Organa 150 49.0 brown light brown ... Alderaan
911
- : : : : : : : : ... :
912
- Droid 85 BB8 (nil) (nil) none none black ... NA
913
- NA 86 Captain Phasma (nil) (nil) unknown unknown unknown ... NA
914
- Human 87 Padmé Amidala 165 45.0 brown light brown ... Naboo
897
+ #<RedAmber::DataFrame : 87 x 12 Vectors, 0x0000000000005a50>
898
+ unnamed1 name height mass hair_color skin_color eye_color ... species
899
+ <int64> <string> <int64> <double> <string> <string> <string> ... <string>
900
+ 1 1 Luke Skywalker 172 77.0 blond fair blue ... Human
901
+ 2 2 C-3PO 167 75.0 NA gold yellow ... Droid
902
+ 3 3 R2-D2 96 32.0 NA white, blue red ... Droid
903
+ 4 4 Darth Vader 202 136.0 none white yellow ... Human
904
+ 5 5 Leia Organa 150 49.0 brown light brown ... Human
905
+ : : : : : : : : ... :
906
+ 85 85 BB8 (nil) (nil) none none black ... Droid
907
+ 86 86 Captain Phasma (nil) (nil) unknown unknown unknown ... NA
908
+ 87 87 Padmé Amidala 165 45.0 brown light brown ... Human
915
909
 
916
910
  starwars.tdr(12)
917
911
 
@@ -919,7 +913,7 @@ penguins.to_rover
919
913
  RedAmber::DataFrame : 87 x 12 Vectors
920
914
  Vectors : 4 numeric, 8 strings
921
915
  # key type level data_preview
922
- 1 :"" int64 87 [1, 2, 3, 4, 5, ... ]
916
+ 1 :unnamed1 int64 87 [1, 2, 3, 4, 5, ... ]
923
917
  2 :name string 87 ["Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia Organa", ... ]
924
918
  3 :height int64 46 [172, 167, 96, 202, 150, ... ], 6 nils
925
919
  4 :mass double 39 [77.0, 75.0, 32.0, 136.0, 49.0, ... ], 28 nils
@@ -933,74 +927,70 @@ penguins.to_rover
933
927
  12 :species string 38 ["Human", "Droid", "Droid", "Human", "Human", ... ]
934
928
  ```
935
929
 
936
- We can aggregate for `:species` and calculate the mean of `:mass` and `:height`.
930
+ We can group by `:species` and calculate the count.
937
931
 
938
932
  ```ruby
939
- grouped = starwars.group(:species).mean(:mass, :height)
940
- grouped
933
+ starwars.group(:species).count(:species)
941
934
 
942
935
  # =>
943
- #<RedAmber::DataFrame : 38 x 3 Vectors, 0x000000000008e620>
944
- mean(mass) mean(height) species
945
- <double> <double> <string>
946
- 1 82.8 176.6 Human
947
- 2 69.8 131.2 Droid
948
- 3 124.0 231.0 Wookiee
949
- 4 74.0 173.0 Rodian
950
- 5 1358.0 175.0 Hutt
951
- : : : :
952
- 36 159.0 216.0 Kaleesh
953
- 37 80.0 206.0 Pau'an
954
- 38 80.0 188.0 Kel Dor
936
+ #<RedAmber::DataFrame : 38 x 2 Vectors, 0x000000000001d6f0>
937
+ species count
938
+ <string> <int64>
939
+ 1 Human 35
940
+ 2 Droid 6
941
+ 3 Wookiee 2
942
+ 4 Rodian 1
943
+ 5 Hutt 1
944
+ : : :
945
+ 36 Kaleesh 1
946
+ 37 Pau'an 1
947
+ 38 Kel Dor 1
955
948
  ```
956
949
 
957
- Select rows for count > 1.
958
-
950
+ We can also calculate the mean of `:mass` and `:height` together.
951
+
959
952
  ```ruby
960
- count = starwars.group(:species).count(:species)[:'count(species)'] # => Vector
961
- grouped = grouped.slice(count > 1)
953
+ grouped = starwars.group(:species) { [count(:species), mean(:height, :mass)] }
962
954
 
963
955
  # =>
964
- #<RedAmber::DataFrame : 9 x 3 Vectors, 0x0000000000098260>
965
- mean(mass) mean(height) species
966
- <double> <double> <string>
967
- 1 82.8 176.6 Human
968
- 2 69.8 131.2 Droid
969
- 3 124.0 231.0 Wookiee
970
- 4 74.0 208.7 Gungan
971
- 5 48.0 181.3 NA
972
- : : : :
973
- 7 55.0 179.0 Twi'lek
974
- 8 53.1 168.0 Mirialan
975
- 9 88.0 221.0 Kaminoan
956
+ #<RedAmber::DataFrame : 38 x 4 Vectors, 0x00000000000407cc>
957
+ species count mean(height) mean(mass)
958
+ <string> <int64> <double> <double>
959
+ 1 Human 35 176.6 82.8
960
+ 2 Droid 6 131.2 69.8
961
+ 3 Wookiee 2 231.0 124.0
962
+ 4 Rodian 1 173.0 74.0
963
+ 5 Hutt 1 175.0 1358.0
964
+ : : : : :
965
+ 36 Kaleesh 1 216.0 159.0
966
+ 37 Pau'an 1 206.0 80.0
967
+ 38 Kel Dor 1 188.0 80.0
976
968
  ```
977
969
 
978
- Assemble the result and change the order of columns.
979
-
980
- ```ruby
981
- grouped.assign(count: count[count > 1]).pick { [2,3,0,1].map{ |i| keys[i] } }
970
+ Select rows for count > 1.
982
971
 
972
+ ```ruby
973
+ grouped.slice(grouped[:count] > 1)
974
+
983
975
  # =>
984
- #<RedAmber::DataFrame : 9 x 4 Vectors, 0x0000000000141838>
985
- species count mean(mass) mean(height)
986
- <string> <uint8> <double> <double>
987
- 1 Human 35 82.8 176.6
988
- 2 Droid 6 69.8 131.2
989
- 3 Wookiee 2 124.0 231.0
990
- 4 Gungan 3 74.0 208.7
991
- 5 NA 4 48.0 181.3
992
- : : : : :
993
- 7 Twi'lek 2 55.0 179.0
994
- 8 Mirialan 2 53.1 168.0
995
- 9 Kaminoan 2 88.0 221.0
976
+ #<RedAmber::DataFrame : 9 x 4 Vectors, 0x000000000004c270>
977
+ species count mean(height) mean(mass)
978
+ <string> <int64> <double> <double>
979
+ 1 Human 35 176.6 82.8
980
+ 2 Droid 6 131.2 69.8
981
+ 3 Wookiee 2 231.0 124.0
982
+ 4 Gungan 3 208.7 74.0
983
+ 5 NA 4 181.3 48.0
984
+ : : : : :
985
+ 7 Twi'lek 2 179.0 55.0
986
+ 8 Mirialan 2 168.0 53.1
987
+ 9 Kaminoan 2 221.0 88.0
996
988
  ```
997
989
 
998
990
  ## Combining DataFrames
999
991
 
1000
992
  - [ ] Combining rows to a dataframe
1001
993
 
1002
- - [ ] Add vars
1003
-
1004
994
  - [ ] Inner join
1005
995
 
1006
996
  - [ ] Left join
@@ -1009,6 +999,6 @@ penguins.to_rover
1009
999
 
1010
1000
  - [ ] One-hot encoding
1011
1001
 
1012
- ## Iteration (not impremented)
1002
+ ## Iteration
1013
1003
 
1014
1004
  - [ ] each_rows
data/doc/Vector.md CHANGED
@@ -500,3 +500,28 @@ vector.is_in(1, -1)
500
500
  #<RedAmber::Vector(:boolean, size=3):0x000000000000f320>
501
501
  [true, false, true]
502
502
  ```
503
+
504
+ ### `shift(amount = 1, fill: nil)`
505
+
506
+ Shift vector's values by specified `amount`. Shifted space is filled by value `fill`.
507
+
508
+ ```ruby
509
+ vector = RedAmber::Vector.new([1, 2, 3, 4, 5])
510
+ vector.shift
511
+
512
+ # =>
513
+ #<RedAmber::Vector(:uint8, size=5):0x00000000000072d8>
514
+ [nil, 1, 2, 3, 4]
515
+
516
+ vector.shift(-2)
517
+
518
+ # =>
519
+ #<RedAmber::Vector(:uint8, size=5):0x0000000000009970>
520
+ [3, 4, 5, nil, nil]
521
+
522
+ vector.shift(fill: Float::NAN)
523
+
524
+ # =>
525
+ #<RedAmber::Vector(:double, size=5):0x0000000000011d3c>
526
+ [NaN, 1.0, 2.0, 3.0, 4.0]
527
+ ```