RubyGems - red_amber - Versions diffs - 0.1.8 → 0.2.0 - Mend

red_amber 0.1.8 → 0.2.0


Files changed (19)

checksums.yaml +4 -4
data/.rubocop.yml +3 -1
data/CHANGELOG.md +71 -2
data/Gemfile +1 -1
data/README.md +58 -33
data/doc/DataFrame.md +196 -55
data/doc/Vector.md +5 -1
data/doc/examples_of_red_amber.ipynb +1677 -348
data/lib/red_amber/data_frame.rb +92 -15
data/lib/red_amber/data_frame_displayable.rb +25 -10
data/lib/red_amber/data_frame_reshaping.rb +85 -0
data/lib/red_amber/data_frame_variable_operation.rb +89 -40
data/lib/red_amber/group.rb +5 -1
data/lib/red_amber/vector_functions.rb +46 -1
data/lib/red_amber/vector_selectable.rb +1 -1
data/lib/red_amber/version.rb +1 -1
data/lib/red_amber.rb +1 -1
data/red_amber.gemspec +1 -1
metadata +5 -4

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 3853e70f378cac65013a3bcfc51a2d55cb70cc494f3f3b70675bed944cc15b49
-  data.tar.gz: 3c65999cf978f1edf8c2c7fcce9a0ccb192d4da051f34fa0bf3f66ddc178eb1c
+  metadata.gz: 73459d02c921fcb0fcb742760e8c882b5491fa5316a79b9016233a516ada013e
+  data.tar.gz: ac25e808c5e5d4c13bb1877659550bba532cb5778371e39dfa1f3b9e5a91a4f8
 SHA512:
-  metadata.gz: fac66ba0bf5955cfe0d21a51b90ec16407182b9053e9b586dfe9f8e2526de4e90efecdd8eba1e8b3c99b12fc44544c82fb2f6af4b666b97876a64a6ee4deedf1
-  data.tar.gz: 1a4cc526ce9f097438f2b7d018552a4cd6aaa2d900012297cd1777c4b9e39063cc2988af91c138e93f291a56175aefb6a6b00c211f9b9c5bd38d75d6bc40acb9
+  metadata.gz: 1bfa4200d440c338f496fe282816634d6a833e30e17edc87a2cf5ec63866e2bbbaf8796916f1b052ea66482c54a038bbf1445258c2526691e42c2b47be2c39c5
+  data.tar.gz: e324e480e6086f7017de58201783c857825b79d0b2e2c8fa2636089cd1c5531e22905a3c0d860f26b833eb6add6ed6017497632bd1ea8fcb932c2d2233b11812

data/.rubocop.yml CHANGED

@@ -61,6 +61,7 @@ Metrics/AbcSize:
   Max: 30
   Exclude:
     - 'lib/red_amber/data_frame_displayable.rb' # Max: 55
+    - 'lib/red_amber/data_frame_reshaping.rb' # Max 40.91
     - 'lib/red_amber/data_frame_selectable.rb' # Max: 51
     - 'lib/red_amber/vector_updatable.rb' # Max: 36
     - 'lib/red_amber/vector_selectable.rb' # Max: 33
@@ -98,9 +99,10 @@ Metrics/MethodLength:
 Metrics/ModuleLength:
   Max: 100
   Exclude:
+    - 'lib/red_amber/data_frame_displayable.rb' # Max: 132
     - 'lib/red_amber/data_frame_selectable.rb' # Max: 141
+    - 'lib/red_amber/data_frame_variable_operation.rb' # Max: 110
     - 'lib/red_amber/vector_functions.rb' # Max: 114
-    - 'lib/red_amber/data_frame_displayable.rb' # Max: 132
 # Max: 8
 Metrics/PerceivedComplexity:

data/CHANGELOG.md CHANGED

@@ -1,6 +1,75 @@
-## [0.1.9] - Unreleased
+## [0.2.0] - 2022-08-15
-- Supports Arrow 9.0.0
+- Bump version up to 0.2.0
+- Bug fixes
+  - Fix order of multiple group keys (#55)
+    Only 1 group key comes to left. Other keys remain in right.
+  - Remove optional `require` for rover (#55)
+    Fix DataFrame.new for argument with Rover::DataFrame.
+  - Fix occasional failure in CI (#59)
+    Sometimes the CI test fails. I added -dev dependency
+    in Arrow install by apt, not doing in bundler.
+  - Fix calling :take in V#[] (#56)
+    Fixed to call Arrow function :take instead of :array_take in Vector#take_by_vector. This will prevent the error below
+    when called with Arrow::ChunkedArray.
+  - Raise error renaming non existing key (#61)
+    Raise an error when the specified key does not exist.
+  - Fix DataFrame#rename #assign by array (#65)
+- New features and improvements
+  - Support Arrow 9.0.0
+    - Upgrade to Arrow 9.0.0 (#59)
+    - Add Vector#quantile method (#59)
+      Arrow::QuantileOptions is supported in Arrow GLib 9.0.0 (ARROW-16623, Thanks!)
+    - Add Vector#quantiles (#62)
+    - Add DataFrame#each_row (#56)
+      - Returns Enumerator if block is not given.
+      - Change DataFrame#each_row to return a Hash {key => row} (#63)
+  - Refactor to use pattern match in overloaded parameter parsing (#61)
+    - Refine DataFrame.new to use pattern match
+    - Use pattern match in DataFrame#assign
+    - Use pattern match in DataFrame#rename
+  - Accept Array for renamer/assigner in #rename/#assign (#61)
+    - Accept assigner by Arrays in DataFrame#assign
+    - Accept renamer pairs by Arrays in DataFrame#rename
+    - Add DataFrame#assign_left method
+  - Add summary/describe (#62)
+    - Introduce DataFrame#summary(#describe)
+  - Introduce reshaping methods for DataFrame (#64)
+    - Introduce DataFrame#transpose method
+    - Introduce DataFrame#to_long method
+    - Introduce DataFrame#to_wide method
+  - Others
+    - Add alias sort_index for array_sort_indices (#59)
+    - Enable :width option in DataFrame#to_s (#62)
+    - Add options to DataFrame#format_table (#62)
+  - Update Documents
+    - Add Yard doc for some methods
+    - Update Jupyter notebook '61 Examples of Red Amber' (#65)
 ## [0.1.8] - 2022-08-04 (experimental)
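
The changelog entries about pattern matching, Array renamers, and the error on missing keys can be illustrated with a small plain-Ruby sketch. `normalize_renamer` is a hypothetical helper written for this note, not a red_amber method, and it raises a plain `ArgumentError` as a stand-in for the library's own error class. It assumes Ruby 2.7 or later for `case ... in`.

```ruby
# Hypothetical helper (not red_amber's implementation): accept a renamer as
# either a Hash or an Array of [existing_key, new_key] pairs.
def normalize_renamer(renamer, existing_keys)
  pairs =
    case renamer
    in Hash => hash
      hash.to_a
    in Array => array if array.all? { |pair| pair.is_a?(Array) && pair.size == 2 }
      array
    else
      raise ArgumentError, "renamer must be a Hash or an Array of pairs: #{renamer.inspect}"
    end

  # Symbol and String keys are treated as the same key, so convert both sides.
  pairs.to_h do |from, to|
    from = from.to_sym
    raise ArgumentError, "key not found: #{from.inspect}" unless existing_keys.include?(from)

    [from, to.to_sym]
  end
end

normalize_renamer({ 'name' => :first_name }, %i[name age])
normalize_renamer([[:age, :age_in_years]], %i[name age])
```

Both calls return a Hash of `{existing_key => new_key}`, so the rest of a rename routine only has to handle one shape.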

data/Gemfile CHANGED

@@ -7,7 +7,7 @@ gemspec
 group :test do
   gem 'rake'
-  gem 'red-parquet', '>= 8.0.0'
+  gem 'red-parquet', '>= 9.0.0'
   gem 'rover-df', '~> 0.3.0'
   gem 'rubocop'

data/README.md CHANGED

@@ -3,17 +3,23 @@
 [![Gem Version](https://badge.fury.io/rb/red_amber.svg)](https://badge.fury.io/rb/red_amber)
 [![Ruby](https://github.com/heronshoes/red_amber/actions/workflows/test.yml/badge.svg)](https://github.com/heronshoes/red_amber/actions/workflows/test.yml)
-A simple dataframe library for Ruby (experimental).
+A simple dataframe library for Ruby.
 - Powered by [Red Arrow](https://github.com/apache/arrow/tree/master/ruby/red-arrow) [![Gitter Chat](https://badges.gitter.im/red-data-tools/en.svg)](https://gitter.im/red-data-tools/en)
 - Inspired by the dataframe library [Rover-df](https://github.com/ankane/rover)
 ## Requirements
+Supported Ruby version is >= 2.7.
+Since v0.2.0, this library uses pattern matching, which is an experimental feature in Ruby 2.7. It is usable, but a warning message will be shown in 2.7.
+I recommend Ruby 3 for performance.
 ```ruby
-gem 'red-arrow',   '>= 8.0.0'
+# Libraries required
+gem 'red-arrow',   '>= 9.0.0'
-gem 'red-parquet', '>= 8.0.0' # Optional, if you use IO from/to parquet
+gem 'red-parquet', '>= 9.0.0' # Optional, if you use IO from/to parquet
 gem 'rover-df',    '~> 0.3.0' # Optional, if you use IO from/to Rover::DataFrame
 ```
@@ -21,9 +27,9 @@ gem 'rover-df',    '~> 0.3.0' # Optional, if you use IO from/to Rover::DataFrame
 Install requirements before you install Red Amber.
-- Apache Arrow GLib (>= 8.0.0)
+- Apache Arrow GLib (>= 9.0.0)
-- Apache Parquet GLib (>= 8.0.0)  # If you use IO from/to parquet
+- Apache Parquet GLib (>= 9.0.0)  # If you use IO from/to parquet
   See [Apache Arrow install document](https://arrow.apache.org/install/).
@@ -122,22 +128,22 @@ df = df.drop(true, true, false)
 # =>
 #<RedAmber::DataFrame : 344 x 1 Vector, 0x0000000000048760>
-    body_mass_g
-       <uint16>
-  1        3750
-  2        3800
-  3        3250
-  4       (nil)
-  5        3450
-  :           :
-342        5750
-343        5200
+    body_mass_g
+       <uint16>
+  1        3750
+  2        3800
+  3        3250
+  4       (nil)
+  5        3450
+  :           :
+342        5750
+343        5200
 344        5400
 ```
 Arrow data is immutable, so these methods always return an new object.
-`DataFrame#assign` creates new variables (column in the table).
+`DataFrame#assign` creates new columns or updates existing columns.
 ![assign method image](doc/image/dataframe/assign.png)
@@ -208,7 +214,7 @@ penguins.remove(penguins[:bill_length_mm] < 40)
 DataFrame manipulating methods like `pick`, `drop`, `slice`, `remove`, `rename` and `assign` accept a block.
-This example is usage of block to update numeric columns.
+This example uses a block to update a column.
 ```ruby
 df = RedAmber::DataFrame.new(
@@ -229,30 +235,28 @@ df
 5   (nil)    (nil) (nil)    (nil)
 df.assign do
-  vectors.each_with_object({}) do |v, h|
-    h[v.key] = -v if v.numeric?
-  end
+  vectors.select(&:float?).map { |v| [v.key, -v] }
+  # => returns [[:float, [-0.0, -1.1, -2.2, NAN, nil]]]
 end
 # =>
-#<RedAmber::DataFrame : 5 x 4 Vectors, 0x000000000009a1b4>
-  integer    float string   boolean
-  <uint8> <double> <string> <boolean>
-1       0     -0.0 A        true
-2     255     -1.1 B        false
-3     254     -2.2 C        true
-4     253      NaN D        false
-5   (nil)    (nil) (nil)    (nil)
+#<RedAmber::DataFrame : 5 x 3 Vectors, 0x00000000000e270c>
+    index    float string
+  <uint8> <double> <string>
+1       0     -0.0 A
+2       1     -1.1 B
+3       2     -2.2 C
+4       3      NaN D
+5   (nil)    (nil) (nil)
 ```
-Negate (-@) method of unsigned integer Vector returns complement.
-Next example is to eliminate observations (row in the table) containing nil.
+The next example eliminates rows containing nil.
 ```ruby
 # remove all observations containing nil
 nil_removed = penguins.remove { vectors.map(&:is_nil).reduce(&:|) }
 nil_removed.tdr
 # =>
 RedAmber::DataFrame : 342 x 8 Vectors
 Vectors : 5 numeric, 3 strings
@@ -273,6 +277,21 @@ For this frequently needed task, we can do it much simpler.
 penguins.remove_nil # => same result as above
 ```
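
The idea behind `vectors.map(&:is_nil).reduce(&:|)` can be sketched without Arrow: build one boolean mask per column, OR the masks element-wise, and drop the rows where the combined mask is true. The sketch below uses a Hash of plain Arrays as a stand-in for a DataFrame, and `remove_nil_rows` is a hypothetical name, not a red_amber method.

```ruby
# Plain-Ruby sketch (assumption: columns is a Hash of { key => Array },
# all Arrays having the same length).
def remove_nil_rows(columns)
  # One mask per column: true where the value is nil.
  nil_masks = columns.values.map { |values| values.map(&:nil?) }

  # Element-wise OR of the masks: true where any column is nil in that row.
  row_has_nil = nil_masks.reduce { |acc, mask| acc.zip(mask).map { |a, b| a | b } }

  # Keep only the rows whose combined mask entry is false.
  columns.transform_values do |values|
    values.reject.with_index { |_, i| row_has_nil[i] }
  end
end

remove_nil_rows(
  integer: [0, nil, 2, 3],
  float: [0.0, 1.1, nil, 3.3],
  string: %w[A B C D]
)
```

Rows 2 and 3 each contain a nil in one column, so only the first and last rows survive.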
+`DataFrame#summary` shows summary statistics in a DataFrame.
+```ruby
+puts penguins.summary.to_s(width: 82)
+# =>
+  variables            count     mean      std      min      25%   median      75%      max
+  <dictionary>      <uint16> <double> <double> <double> <double> <double> <double> <double>
+1 bill_length_mm         342    43.92     5.46     32.1    39.23    44.38     48.5     59.6
+2 bill_depth_mm          342    17.15     1.97     13.1     15.6    17.32     18.7     21.5
+3 flipper_length_mm      342   200.92    14.06    172.0    190.0    197.0    213.0    231.0
+4 body_mass_g            342  4201.75   801.95   2700.0   3550.0   4031.5   4750.0   6300.0
+5 year                   344  2008.03     0.82   2007.0   2007.0   2008.0   2009.0   2009.0
+```
 `DataFrame#group` method can be used for the grouping tasks.
 ```ruby
@@ -311,7 +330,7 @@ grouped.slice { v(:count) > 1 }
 9 Kaminoan       2        221.0       88.0
 ```
-See [DataFrame.md](doc/DataFrame.md) for details.
+See [DataFrame.md](doc/DataFrame.md) for other examples and details.
 ## `RedAmber::Vector`
@@ -355,7 +374,7 @@ See [Vector.md](doc/Vector.md) for details.
 ## Jupyter notebook
-[53 Examples of Red Amber](doc/examples_of_red_amber.ipynb)
+[61 Examples of Red Amber](doc/examples_of_red_amber.ipynb) shows more examples in jupyter notebook.
 ## Development
@@ -366,6 +385,12 @@ bundle install
 bundle exec rake test
 ```
+I would appreciate it if you could help to improve this project. Here are a few ways you can help:
+- [Report bugs or suggest new features](https://github.com/heronshoes/red_amber/issues)
+- Fix bugs and [submit pull requests](https://github.com/heronshoes/red_amber/pulls)
+- Write, clarify, or fix documentation
 ## License
 The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).

data/doc/DataFrame.md CHANGED

@@ -167,6 +167,11 @@ Class `RedAmber::DataFrame` represents 2D-data. A `DataFrame` consists with:
   If you need a column-oriented full array, use `.to_h.to_a`
+### `each_row`
+  Yield each row in a `{ key => row}` Hash.
+  Returns Enumerator if block is not given.
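
A minimal plain-Ruby sketch of the `each_row` behaviour described here: each row is yielded as a `{ key => value }` Hash, and an Enumerator is returned when no block is given. The `columns` Hash of Arrays is an assumed stand-in for a DataFrame, and the free-standing `each_row` method is written for illustration only.

```ruby
# Assumption: columns is a Hash of { key => Array } with equal-length Arrays.
def each_row(columns)
  # Without a block, hand back an Enumerator over the same rows.
  return enum_for(:each_row, columns) unless block_given?

  size = columns.empty? ? 0 : columns.values.first.size
  size.times do |i|
    # Build the i-th row as { key => value }.
    yield columns.transform_values { |values| values[i] }
  end
end

columns = { name: %w[Yasuko Rui Hinata], age: [68, 49, 28] }
each_row(columns) { |row| puts row.inspect }
rows = each_row(columns).to_a # Enumerator form
```

Returning an Enumerator lets callers chain methods such as `map` or `select` over rows without passing a block up front.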
 ### `schema`
 - Returns column name and data type in a Hash.
@@ -202,7 +207,22 @@ puts penguins.to_s
 `inspect` uses `to_s` output and also shows shape and object_id.
-### `summary`, `describe` (not implemented)
+### `summary`, `describe`
+`DataFrame#summary` or `DataFrame#describe` shows summary statistics in a DataFrame.
+```ruby
+puts penguins.summary.to_s(width: 82) # needs more width to show all stats in this example
+# =>
+  variables            count     mean      std      min      25%   median      75%      max
+  <dictionary>      <uint16> <double> <double> <double> <double> <double> <double> <double>
+1 bill_length_mm         342    43.92     5.46     32.1    39.23    44.38     48.5     59.6
+2 bill_depth_mm          342    17.15     1.97     13.1     15.6    17.32     18.7     21.5
+3 flipper_length_mm      342   200.92    14.06    172.0    190.0    197.0    213.0    231.0
+4 body_mass_g            342  4201.75   801.95   2700.0   3550.0   4031.5   4750.0   6300.0
+5 year                   344  2008.03     0.82   2007.0   2007.0   2008.0   2009.0   2009.0
+```
 ### `to_rover`
@@ -704,7 +724,7 @@ penguins.to_rover
 - Key pairs as arguments
-    `rename(key_pairs)` accepts key_pairs as arguments. key_pairs should be a Hash of `{existing_key => new_key}`.
+    `rename(key_pairs)` accepts key_pairs as arguments. key_pairs should be a Hash of `{existing_key => new_key}` or an Array of Arrays like `[[existing_key, new_key], ... ]`.
     ```ruby
     df = RedAmber::DataFrame.new( 'name' => %w[Yasuko Rui Hinata], 'age' => [68, 49, 28] )
@@ -721,7 +741,11 @@ penguins.to_rover
 - Key pairs by a block
-    `rename {block}` is also acceptable. We can't use both arguments and a block at a same time. The block should return key_pairs as a Hash of `{existing_key => new_key}`. Block is called in the context of self.
+    `rename {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return key_pairs as a Hash of `{existing_key => new_key}` or an Array of Arrays like `[[existing_key, new_key], ... ]`. The block is called in the context of self.
+- Not existing keys
+    If the specified `existing_key` does not exist, a `DataFrameArgumentError` is raised.
 - Key type
@@ -729,16 +753,16 @@ penguins.to_rover
 ### `assign`
-  Assign new or updated variables (columns) and create a updated DataFrame.
+  Assign new or updated columns (variables) and create an updated DataFrame.
-  - Variables with new keys will append new variables at bottom (right in the table).
+  - Variables with new keys will append new columns from the right.
   - Variables with existing keys will update corresponding vectors.
     ![assign method image](doc/../image/dataframe/assign.png)
 - Variables as arguments
-    `assign(key_pairs)` accepts pairs of key and values as arguments. key_pairs should be a Hash of `{key => array}` or `{key => Vector}`.
+    `assign(key_pairs)` accepts pairs of key and values as parameters. `key_pairs` should be a Hash of `{key => array_like}` or an Array of Arrays like `[[key, array_like], ... ]`. `array_like` is either `Vector`, `Array` or `Arrow::Array`.
     ```ruby
     df = RedAmber::DataFrame.new(
@@ -769,7 +793,7 @@ penguins.to_rover
 - Key pairs by a block
-    `assign {block}` is also acceptable. We can't use both arguments and a block at a same time. The block should return pairs of key and values as a Hash of `{key => array}` or `{key => Vector}`. Block is called in the context of self.
+    `assign {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return pairs of key and values as a Hash of `{key => array_like}` or an Array of Arrays like `[[key, array_like], ... ]`. `array_like` is either `Vector`, `Array` or `Arrow::Array`. The block is called in the context of self.
     ```ruby
     df = RedAmber::DataFrame.new(
@@ -788,29 +812,27 @@ penguins.to_rover
     4       3      NaN D
     5   (nil)    (nil) (nil)
-    # update numeric variables
+    # update :float
+    # assigner by an Array
     df.assign do
-      assigner = {}
-      vectors.each_with_index do |v, i|
-        assigner[keys[i]] = v * -1 if v.numeric?
-      end
-      assigner
+      vectors.select(&:float?)
+             .map { |v| [v.key, -v] }
     end
     # =>
-    #<RedAmber::DataFrame : 5 x 3 Vectors, 0x000000000006e000>
-       index    float string
-      <int8> <double> <string>
-    1      0     -0.0 A
-    2     -1     -1.1 B
-    3     -2     -2.2 C
-    4     -3      NaN D
-    5  (nil)    (nil) (nil)
-    # Or it ’s shorter like this:
+    #<RedAmber::DataFrame : 5 x 3 Vectors, 0x00000000000dfffc>
+        index    float string
+      <uint8> <double> <string>
+    1       0     -0.0 A
+    2       1     -1.1 B
+    3       2     -2.2 C
+    4       3      NaN D
+    5   (nil)    (nil) (nil)
+    # Or we can use assigner by a Hash
     df.assign do
-      variables.select.with_object({}) do |(key, vector), assigner|
-        assigner[key] = vector * -1 if vector.numeric?
+      vectors.select.with_object({}) do |v, assigner|
+        assigner[v.key] = -v if v.float?
       end
     end
@@ -821,6 +843,28 @@ penguins.to_rover
   Symbol key and String key are considered as the same key.
+- Empty assignment
+  If assigner is empty or nil, returns self.
+- Append from left
+  `assign_left` method accepts the same parameters and block as `assign`, but appends new columns from the left side.
+  ```ruby
+  df.assign_left(new_index: [1, 2, 3, 4, 5])
+  # =>
+  #<RedAmber::DataFrame : 5 x 4 Vectors, 0x000000000001787c>
+    new_index   index    float string
+      <uint8> <uint8> <double> <string>
+  1         1       0      0.0 A
+  2         2       1      1.1 B
+  3         3       2      2.2 C
+  4         4       3      NaN D
+  5         5   (nil)    (nil) (nil)
+  ```
 ## Updating
 ### `sort`
@@ -933,17 +977,17 @@ penguins.to_rover
   starwars.group(:species).count(:species)
   # =>
-  #<RedAmber::DataFrame : 38 x 2 Vectors, 0x000000000001d6f0>
-     species    count
-     <string> <int64>
-   1 Human         35
-   2 Droid          6
-   3 Wookiee        2
-   4 Rodian         1
-   5 Hutt           1
-   : :              :
-  36 Kaleesh        1
-  37 Pau'an         1
+  #<RedAmber::DataFrame : 38 x 2 Vectors, 0x000000000001d6f0>
+     species    count
+     <string> <int64>
+   1 Human         35
+   2 Droid          6
+   3 Wookiee        2
+   4 Rodian         1
+   5 Hutt           1
+   : :              :
+  36 Kaleesh        1
+  37 Pau'an         1
   38 Kel Dor        1
   ```
@@ -953,17 +997,17 @@ penguins.to_rover
   grouped = starwars.group(:species) { [count(:species), mean(:height, :mass)] }
   # =>
-  #<RedAmber::DataFrame : 38 x 4 Vectors, 0x00000000000407cc>
-     species    count mean(height) mean(mass)
-     <string> <int64>     <double>   <double>
-   1 Human         35        176.6       82.8
-   2 Droid          6        131.2       69.8
-   3 Wookiee        2        231.0      124.0
-   4 Rodian         1        173.0       74.0
-   5 Hutt           1        175.0     1358.0
-   : :              :            :          :
-  36 Kaleesh        1        216.0      159.0
-  37 Pau'an         1        206.0       80.0
+  #<RedAmber::DataFrame : 38 x 4 Vectors, 0x00000000000407cc>
+     species    count mean(height) mean(mass)
+     <string> <int64>     <double>   <double>
+   1 Human         35        176.6       82.8
+   2 Droid          6        131.2       69.8
+   3 Wookiee        2        231.0      124.0
+   4 Rodian         1        173.0       74.0
+   5 Hutt           1        175.0     1358.0
+   : :              :            :          :
+  36 Kaleesh        1        216.0      159.0
+  37 Pau'an         1        206.0       80.0
   38 Kel Dor        1        188.0       80.0
   ```
@@ -987,18 +1031,115 @@ penguins.to_rover
   9 Kaminoan       2        221.0       88.0
   ```
-## Combining DataFrames
+## Reshape
-- [ ] Combining rows to a dataframe
+### `transpose`
-- [ ] Inner join
+  Creates transposed DataFrame for wide type dataframe.
-- [ ] Left join
+  ```ruby
+  import_cars = RedAmber::DataFrame.load('test/entity/import_cars.tsv')
-## Encoding
+  # =>
+  #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000d520>
+       Year    Audi     BMW BMW_MINI Mercedes-Benz      VW
+    <int64> <int64> <int64>  <int64>       <int64> <int64>
+  1    2021   22535   35905    18211         51722   35215
+  2    2020   22304   35712    20196         57041   36576
+  3    2019   24222   46814    23813         66553   46794
+  4    2018   26473   50982    25984         67554   51961
+  5    2017   28336   52527    25427         68221   49040
-- [ ] One-hot encoding
+  import_cars.transpose
-## Iteration
+  # =>
+  #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000ef74>
+    name              2021     2020     2019     2018     2017
+    <dictionary>  <uint16> <uint16> <uint32> <uint32> <uint32>
+  1 Audi             22535    22304    24222    26473    28336
+  2 BMW              35905    35712    46814    50982    52527
+  3 BMW_MINI         18211    20196    23813    25984    25427
+  4 Mercedes-Benz    51722    57041    66553    67554    68221
+  5 VW               35215    36576    46794    51961    49040
+  ```
+  The leftmost column is created from the original keys. The key of that
+  column is 'name'.
+### `to_long(*keep_keys)`
+  Creates a 'long' DataFrame.
+  - Parameter `keep_keys` specifies the key names to keep.
+  ```ruby
+  import_cars.to_long(:Year)
+  # =>
+  #<RedAmber::DataFrame : 25 x 3 Vectors, 0x0000000000012750>
+         Year name             value
+     <uint16> <dictionary>  <uint32>
+   1     2021 Audi             22535
+   2     2021 BMW              35905
+   3     2021 BMW_MINI         18211
+   4     2021 Mercedes-Benz    51722
+   5     2021 VW               35215
+   :        : :                    :
+  23     2017 BMW_MINI         25427
+  24     2017 Mercedes-Benz    68221
+  25     2017 VW               49040
+  ```
+  - Option `:name` : key of the column which comes **from key names**.
+  - Option `:value` : key of the column which comes **from values**.
+  ```ruby
+  import_cars.to_long(:Year, name: :Manufacturer, value: :Num_of_imported)
+  # =>
+  #<RedAmber::DataFrame : 25 x 3 Vectors, 0x0000000000017700>
+         Year Manufacturer  Num_of_imported
+     <uint16> <dictionary>         <uint32>
+   1     2021 Audi                    22535
+   2     2021 BMW                     35905
+   3     2021 BMW_MINI                18211
+   4     2021 Mercedes-Benz           51722
+   5     2021 VW                      35215
+   :        : :                           :
+  23     2017 BMW_MINI                25427
+  24     2017 Mercedes-Benz           68221
+  25     2017 VW                      49040
+  ```
-- [ ] each_rows
+### `to_wide`
+  Creates a 'wide' DataFrame.
+  - Option `:name` : key of the column which will be expanded **to key name**.
+  - Option `:value` : key of the column which will be expanded **to values**.
+  ```ruby
+  import_cars.to_long(:Year).to_wide
+  # import_cars.to_long(:Year).to_wide(name: :name, value: :value)
+  # is also OK
+  # =>
+  #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000f0f0>
+        Year     Audi      BMW BMW_MINI Mercedes-Benz       VW
+    <uint16> <uint16> <uint16> <uint16>      <uint32> <uint16>
+  1     2021    22535    35905    18211         51722    35215
+  2     2020    22304    35712    20196         57041    36576
+  3     2019    24222    46814    23813         66553    46794
+  4     2018    26473    50982    25984         67554    51961
+  5     2017    28336    52527    25427         68221    49040
+  ```
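
The long/wide round trip can be sketched in plain Ruby on an Array of row Hashes. This is only an illustration of the reshaping idea under that assumed representation. It is not how red_amber implements `to_long` and `to_wide`, which work on Arrow-backed columns. The `:name` and `:value` keyword defaults mirror the options documented above.

```ruby
# Wide -> long: every non-kept key becomes a (name, value) pair in its own row.
def to_long(rows, *keep_keys, name: :name, value: :value)
  rows.flat_map do |row|
    kept = row.slice(*keep_keys)
    row.reject { |key, _| keep_keys.include?(key) }
       .map { |key, val| kept.merge(name => key, value => val) }
  end
end

# Long -> wide: rows sharing the same kept columns are merged back into one row.
def to_wide(rows, name: :name, value: :value)
  rows.group_by { |row| row.reject { |key, _| [name, value].include?(key) } }
      .map do |kept, group|
        group.each_with_object(kept.dup) { |row, wide| wide[row[name]] = row[value] }
      end
end

wide = [
  { Year: 2021, Audi: 22_535, BMW: 35_905 },
  { Year: 2020, Audi: 22_304, BMW: 35_712 }
]
long = to_long(wide, :Year)
round_trip = to_wide(long)
```

With these two rows taken from the `import_cars` table, `long` holds four rows of `Year`, `name`, `value`, and `round_trip` equals `wide` again.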
+## Combine
+- [ ] Combining dataframes
+- [ ] Join
+## Encoding
+- [ ] One-hot encoding

data/doc/Vector.md CHANGED

@@ -145,7 +145,7 @@ array[booleans]
 | ✓ `min_max` |  ✓  |  ✓  |  ✓  | ✓ ScalarAggregate|     |
 |[ ]`mode`    |     | [ ] |     |[ ] Mode    |     |
 | ✓ `product` |  ✓  |  ✓  |     | ✓ ScalarAggregate|     |
-|[ ]`quantile`|     | [ ] |     |[ ] Quantile|     |
+| ✓ `quantile`|     |  ✓  |     | ✓ Quantile|Specify probability in (0..1) by a parameter (default=0.5)|
 | ✓ `sd    `  |     |  ✓  |     |          |ddof: 1 at `stddev`|
 | ✓ `stddev`  |     |  ✓  |     | ✓ Variance|ddof: 0 by default|
 | ✓ `sum`     |  ✓  |  ✓  |     | ✓ ScalarAggregate|     |
@@ -303,6 +303,10 @@ double.round(n_digits: -1)
   Returns index of specified element.
+### `quantiles(probs = [1.0, 0.75, 0.5, 0.25, 0.0], interpolation: :linear, skip_nils: true, min_count: 0)`
+  Returns quantiles for specified probabilities in a DataFrame.
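
As a rough illustration of what `quantile` and `quantiles` compute with `interpolation: :linear` and `skip_nils: true`, here is a plain-Ruby sketch. It assumes the common definition where the target position is `prob * (n - 1)` in the sorted data, and it returns an Array of Hashes rather than a DataFrame. The method names mirror the documented ones, but the bodies are illustrative only and do not call Arrow.

```ruby
# Linear-interpolation quantile over an Array, ignoring nils.
def quantile(values, prob = 0.5)
  raise ArgumentError, 'prob must be in 0..1' unless (0..1).cover?(prob)

  sorted = values.compact.sort
  return nil if sorted.empty?

  position = prob * (sorted.size - 1)
  lower = position.floor
  upper = position.ceil
  # Interpolate between the two neighbouring sorted values.
  sorted[lower] + (sorted[upper] - sorted[lower]) * (position - lower)
end

# One result per probability, in the same order as the default probs above.
def quantiles(values, probs = [1.0, 0.75, 0.5, 0.25, 0.0])
  probs.map { |prob| { prob: prob, quantile: quantile(values, prob) } }
end

quantile([1, 2, 3, 4, nil]) # median of the four non-nil values
quantiles([1, 2, 3, 4, 5])
```

For `[1, 2, 3, 4, nil]` the median falls halfway between 2 and 3, so the first call gives 2.5.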
 ### `sort_indexes`, `sort_indices`, `array_sort_indices`
 ### [ ] `sort`, `sort_by`