RubyGems - red_amber - Versions diffs - 0.1.7 → 0.2.1 - Mend

red_amber 0.1.7 → 0.2.1


Files changed (25)

checksums.yaml +4 -4
data/.rubocop.yml +12 -2
data/.rubocop_todo.yml +2 -15
data/.yardopts +1 -0
data/CHANGELOG.md +164 -2
data/Gemfile +2 -1
data/README.md +246 -17
data/doc/DataFrame.md +392 -129
data/doc/Vector.md +37 -19
data/doc/examples_of_red_amber.ipynb +8979 -0
data/lib/red_amber/data_frame.rb +138 -24
data/lib/red_amber/data_frame_displayable.rb +35 -18
data/lib/red_amber/data_frame_reshaping.rb +85 -0
data/lib/red_amber/data_frame_selectable.rb +53 -9
data/lib/red_amber/data_frame_variable_operation.rb +130 -50
data/lib/red_amber/group.rb +29 -27
data/lib/red_amber/vector.rb +1 -1
data/lib/red_amber/vector_functions.rb +65 -23
data/lib/red_amber/vector_selectable.rb +12 -9
data/lib/red_amber/vector_updatable.rb +22 -1
data/lib/red_amber/version.rb +1 -1
data/lib/red_amber.rb +1 -1
data/red_amber.gemspec +1 -1
metadata +7 -5
data/doc/47_examples_of_red_amber.ipynb +0 -4872

data/doc/DataFrame.md CHANGED

@@ -155,7 +155,25 @@ Class `RedAmber::DataFrame` represents 2D-data. A `DataFrame` consists with:
 ### `indices`, `indexes`
-- Returns all indexes in an Array.
+- Returns indexes in an Array.
+  Accepts an optional `start` value, which is used as the first index.
+  ```ruby
+  df = RedAmber::DataFrame.new(x: [1, 2, 3, 4, 5])
+  df.indices
+  # =>
+  [0, 1, 2, 3, 4]
+  df.indices(1)
+  # =>
+  [1, 2, 3, 4, 5]
+  df.indices(:a)
+  # =>
+  [:a, :b, :c, :d, :e]
+  ```
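The `start` behaviour shown above can be pictured in plain Ruby as repeated `succ` calls. This is a minimal sketch that does not need the gem; `index_labels` is a hypothetical helper, not RedAmber's implementation.

```ruby
# Hypothetical helper: build `size` labels beginning at `start`,
# advancing with #succ so that Integers and Symbols both work.
def index_labels(size, start = 0)
  labels = []
  current = start
  size.times do
    labels << current
    current = current.succ
  end
  labels
end

p index_labels(5)      # [0, 1, 2, 3, 4]
p index_labels(5, 1)   # [1, 2, 3, 4, 5]
p index_labels(5, :a)  # [:a, :b, :c, :d, :e]
```

Any object that responds to `succ` can serve as `start` in this sketch.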
 ### `to_h`
@@ -167,6 +185,11 @@ Class `RedAmber::DataFrame` represents 2D-data. A `DataFrame` consists with:
   If you need a column-oriented full array, use `.to_h.to_a`
+### `each_row`
+  Yields each row as a `{ key => row }` Hash.
+  Returns an Enumerator if no block is given.
 ### `schema`
 - Returns column name and data type in a Hash.
@@ -202,7 +225,22 @@ puts penguins.to_s
 `inspect` uses `to_s` output and also shows shape and object_id.
-### `summary`, `describe` (not implemented)
+### `summary`, `describe`
+`DataFrame#summary` or `DataFrame#describe` shows summary statistics in a DataFrame.
+```ruby
+puts penguins.summary.to_s(width: 82) # needs more width to show all stats in this example
+# =>
+  variables            count     mean      std      min      25%   median      75%      max
+  <dictionary>      <uint16> <double> <double> <double> <double> <double> <double> <double>
+1 bill_length_mm         342    43.92     5.46     32.1    39.23    44.38     48.5     59.6
+2 bill_depth_mm          342    17.15     1.97     13.1     15.6    17.32     18.7     21.5
+3 flipper_length_mm      342   200.92    14.06    172.0    190.0    197.0    213.0    231.0
+4 body_mass_g            342  4201.75   801.95   2700.0   3550.0   4031.5   4750.0   6300.0
+5 year                   344  2008.03     0.82   2007.0   2007.0   2008.0   2009.0   2009.0
+```
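To make the columns of the summary table concrete, here is a plain-Ruby sketch of a few of the statistics for a single column. `column_summary` is a hypothetical helper; it skips nils (which would explain a count of 342 in a 344-row frame) and assumes the unbiased (n - 1) standard deviation. Quartiles and the median are left out because their interpolation rule is not stated in the text.

```ruby
# Hypothetical helper: summary statistics for one column, skipping nils.
# Assumption: the standard deviation uses the unbiased (n - 1) form,
# so at least two non-nil values are needed for a meaningful result.
def column_summary(values)
  data = values.compact
  count = data.size
  mean = data.sum.to_f / count
  variance = data.sum { |v| (v - mean)**2 } / (count - 1)
  { count: count, mean: mean, std: Math.sqrt(variance), min: data.min, max: data.max }
end

stats = column_summary([1.0, 2.0, 3.0, 4.0, nil])
p stats[:count] # 4
p stats[:mean]  # 2.5
```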
 ### `to_rover`
@@ -352,13 +390,13 @@ penguins.to_rover
 ### `pick  ` - pick up variables by key label -
-  Pick up some variables (columns) to create a sub DataFrame.
+  Pick up some columns (variables) to create a sub DataFrame.
   ![pick method image](doc/../image/dataframe/pick.png)
 - Keys as arguments
-  `pick(keys)` accepts keys as arguments in an Array.
+  `pick(keys)` accepts keys as arguments in an Array or a Range.
     ```ruby
     penguins.pick(:species, :bill_length_mm)
@@ -378,9 +416,31 @@ penguins.to_rover
     344 Gentoo             49.9
     ```
-- Booleans as a argument
+- Indices as arguments
+  `pick(indices)` accepts indices as arguments. Indices should be Integers, Floats or Ranges of Integers.
+    ```ruby
+    penguins.pick(0..2, -1)
+    # =>
+    #<RedAmber::DataFrame : 344 x 4 Vectors, 0x0000000000055ce4>
+        species  island    bill_length_mm     year
+        <string> <string>        <double> <uint16>
+      1 Adelie   Torgersen           39.1     2007
+      2 Adelie   Torgersen           39.5     2007
+      3 Adelie   Torgersen           40.3     2007
+      4 Adelie   Torgersen          (nil)     2007
+      5 Adelie   Torgersen           36.7     2007
+      : :        :                      :        :
+    342 Gentoo   Biscoe              50.4     2009
+    343 Gentoo   Biscoe              45.2     2009
+    344 Gentoo   Biscoe              49.9     2009
+    ```
+- Booleans as arguments
-  `pick(booleans)` accepts booleans as a argument in an Array. Booleans must be same length as `n_keys`.
+  `pick(booleans)` accepts booleans as arguments in an Array. Booleans must be same length as `n_keys`.
     ```ruby
     penguins.pick(penguins.types.map { |type| type == :string })
@@ -400,9 +460,9 @@ penguins.to_rover
     344 Gentoo   Biscoe    male
     ```
- - Keys or booleans by a block
+- Keys or booleans by a block
-    `pick {block}` is also acceptable. We can't use both arguments and a block at a same time. The block should return keys, or a boolean Array with a same length as `n_keys`. Block is called in the context of self.
+    `pick {block}` is also acceptable. Arguments and a block can't be used at the same time. The block should return keys, indices or a boolean Array with the same length as `n_keys`. The block is called in the context of self.
     ```ruby
     penguins.pick { keys.map { |key| key.end_with?('mm') } }
@@ -424,21 +484,25 @@ penguins.to_rover
 ### `drop  ` - pick and drop -
-  Drop some variables (columns) to create a remainer DataFrame.
+  Drop some columns (variables) to create a remainer DataFrame.
   ![drop method image](doc/../image/dataframe/drop.png)
 - Keys as arguments
-  `drop(keys)` accepts keys as arguments in an Array.
+  `drop(keys)` accepts keys as arguments in an Array or a Range.
+- Indices as arguments
+  `drop(indices)` accepts indices as arguments. Indices should be Integers, Floats or Ranges of Integers.
-- Booleans as a argument
+- Booleans as arguments
-  `drop(booleans)` accepts booleans as a argument in an Array. Booleans must be same length as `n_keys`.
+  `drop(booleans)` accepts booleans as an argument in an Array. Booleans must be same length as `n_keys`.
 - Keys or booleans by a block
-  `drop {block}` is also acceptable. We can't use both arguments and a block at a same time. The block should return keys, or a boolean Array with a same length as `n_keys`. Block is called in the context of self.
+  `drop {block}` is also acceptable. Arguments and a block can't be used at the same time. The block should return keys, indices or a boolean Array with the same length as `n_keys`. The block is called in the context of self.
 - Notice for nil
@@ -473,9 +537,20 @@ penguins.to_rover
   [1, 2, 3]
   ```
+  A simple key name can be used as a method of the DataFrame if the key name is acceptable as a method name.
+  It returns the same Vector as `[]`.
+  ```ruby
+  df.a
+  # =>
+  #<RedAmber::Vector(:uint8, size=3):0x000000000000f258>
+  [1, 2, 3]
+  ```
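One way to picture keys doubling as methods is Ruby's `method_missing` hook. The sketch below is an illustration under that assumption, not RedAmber's actual mechanism; `TinyFrame` is a hypothetical stand-in that stores plain Arrays instead of Vectors.

```ruby
# TinyFrame is a hypothetical stand-in for a DataFrame, holding { key => Array }.
class TinyFrame
  def initialize(columns)
    @columns = columns.transform_keys(&:to_sym)
  end

  # Explicit lookup; Symbol and String keys are treated as the same key.
  def [](key)
    @columns.fetch(key.to_sym)
  end

  # Expose each key as a reader method when it is called without arguments.
  def method_missing(name, *args)
    return self[name] if args.empty? && @columns.key?(name)

    super
  end

  def respond_to_missing?(name, include_private = false)
    @columns.key?(name) || super
  end
end

df = TinyFrame.new(a: [1, 2, 3], 'b' => %w[A B C])
p df.a               # [1, 2, 3], the same object df[:a] returns
p df.respond_to?(:b) # true
```

In this sketch a key that collides with an existing method name (for example `class`) is only reachable through `[]`, which mirrors the "acceptable as a method name" caveat.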
 ### `slice  `  - to cut vertically is slice -
-  Slice and select observations (rows) to create a sub DataFrame.
+  Slice and select rows (observations) to create a sub DataFrame.
   ![slice method image](doc/../image/dataframe/slice.png)
@@ -506,7 +581,7 @@ penguins.to_rover
 - Booleans as an argument
-  `slice(booleans)` accepts booleans as a argument in an Array, a Vector or an Arrow::BooleanArray . Booleans must be same length as `size`.
+  `slice(booleans)` accepts booleans as an argument in an Array, a Vector or an Arrow::BooleanArray . Booleans must be same length as `size`.
     ```ruby
     vector = penguins[:bill_length_mm]
@@ -583,7 +658,7 @@ penguins.to_rover
 ### `remove`
-  Slice and reject observations (rows) to create a remainer DataFrame.
+  Slice and reject rows (observations) to create a remainer DataFrame.
   ![remove method image](doc/../image/dataframe/remove.png)
@@ -612,7 +687,7 @@ penguins.to_rover
 - Booleans as an argument
-  `remove(booleans)` accepts booleans as a argument in an Array, a Vector or an Arrow::BooleanArray . Booleans must be same length as `size`.
+  `remove(booleans)` accepts booleans as an argument in an Array, a Vector or an Arrow::BooleanArray . Booleans must be same length as `size`.
     ```ruby
     # remove all observation contains nil
@@ -640,10 +715,12 @@ penguins.to_rover
     ```ruby
     penguins.remove do
-      vector = self[:bill_length_mm]
-      min = vector.mean - vector.std
-      max = vector.mean + vector.std
-      vector.to_a.map { |e| (min..max).include? e }
+      # We will use another style shown in slice
+      # self.bill_length_mm returns Vector
+      mean = bill_length_mm.mean
+      min = mean - bill_length_mm.std
+      max = mean + bill_length_mm.std
+      bill_length_mm.to_a.map { |e| (min..max).include? e }
     end
     # =>
@@ -660,6 +737,7 @@ penguins.to_rover
     139 Gentoo   Biscoe              50.4          15.7               222 ...     2009
     140 Gentoo   Biscoe              49.9          16.1               213 ...     2009
     ```
 - Notice for nil
   - When `remove` used with booleans, nil in booleans is treated as false. This behavior is aligned with Ruby's `nil#!`.
@@ -704,7 +782,7 @@ penguins.to_rover
 - Key pairs as arguments
-    `rename(key_pairs)` accepts key_pairs as arguments. key_pairs should be a Hash of `{existing_key => new_key}`.
+    `rename(key_pairs)` accepts key_pairs as arguments. key_pairs should be a Hash of `{existing_key => new_key}` or an Array of Arrays like `[[existing_key, new_key], ... ]`.
     ```ruby
     df = RedAmber::DataFrame.new( 'name' => %w[Yasuko Rui Hinata], 'age' => [68, 49, 28] )
@@ -721,7 +799,11 @@ penguins.to_rover
 - Key pairs by a block
-    `rename {block}` is also acceptable. We can't use both arguments and a block at a same time. The block should return key_pairs as a Hash of `{existing_key => new_key}`. Block is called in the context of self.
+    `rename {block}` is also acceptable. Arguments and a block can't be used at the same time. The block should return key_pairs as a Hash of `{existing_key => new_key}` or an Array of Arrays like `[[existing_key, new_key], ... ]`. The block is called in the context of self.
+- Not existing keys
+    If the specified `existing_key` does not exist, a `DataFrameArgumentError` is raised.
 - Key type
@@ -729,16 +811,16 @@ penguins.to_rover
 ### `assign`
-  Assign new or updated variables (columns) and create a updated DataFrame.
+  Assign new or updated columns (variables) and create an updated DataFrame.
-  - Variables with new keys will append new variables at bottom (right in the table).
+  - Variables with new keys will append new columns from the right.
   - Variables with existing keys will update corresponding vectors.
     ![assign method image](doc/../image/dataframe/assign.png)
 - Variables as arguments
-    `assign(key_pairs)` accepts pairs of key and values as arguments. key_pairs should be a Hash of `{key => array}` or `{key => Vector}`.
+    `assign(key_pairs)` accepts pairs of key and values as parameters. `key_pairs` should be a Hash of `{key => array_like}` or an Array of Arrays like `[[key, array_like], ... ]`. `array_like` is either `Vector`, `Array` or `Arrow::Array`.
     ```ruby
     df = RedAmber::DataFrame.new(
@@ -748,15 +830,19 @@ penguins.to_rover
     # =>
     #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000062804>
-      name         age
-      <string> <uint8>
-    1 Yasuko        68
-    2 Rui           49
+      name         age
+      <string> <uint8>
+    1 Yasuko        68
+    2 Rui           49
     3 Hinata        28
     # update :age and add :brother
-    assigner = { age: [97, 78, 57], brother: ['Santa', nil, 'Momotaro'] }
-    df.assign(assigner)
+    df.assign do
+      {
+        age: age + 29,
+        brother: ['Santa', nil, 'Momotaro']
+      }
+    end
     # =>
     #<RedAmber::DataFrame : 3 x 3 Vectors, 0x00000000000658b0>
@@ -769,13 +855,14 @@ penguins.to_rover
 - Key pairs by a block
-    `assign {block}` is also acceptable. We can't use both arguments and a block at a same time. The block should return pairs of key and values as a Hash of `{key => array}` or `{key => Vector}`. Block is called in the context of self.
+    `assign {block}` is also acceptable. Arguments and a block can't be used at the same time. The block should return pairs of key and values as a Hash of `{key => array_like}` or an Array of Arrays like `[[key, array_like], ... ]`. `array_like` is either `Vector`, `Array` or `Arrow::Array`. The block is called in the context of self.
     ```ruby
     df = RedAmber::DataFrame.new(
       index: [0, 1, 2, 3, nil],
       float: [0.0, 1.1,  2.2, Float::NAN, nil],
-      string: ['A', 'B', 'C', 'D', nil])
+      string: ['A', 'B', 'C', 'D', nil]
+    )
     df
     # =>
@@ -788,29 +875,27 @@ penguins.to_rover
     4       3      NaN D
     5   (nil)    (nil) (nil)
-    # update numeric variables
+    # update :float
+    # assigner by an Array
     df.assign do
-      assigner = {}
-      vectors.each_with_index do |v, i|
-        assigner[keys[i]] = v * -1 if v.numeric?
-      end
-      assigner
+      vectors.select(&:float?)
+             .map { |v| [v.key, -v] }
     end
     # =>
-    #<RedAmber::DataFrame : 5 x 3 Vectors, 0x000000000006e000>
-       index    float string
-      <int8> <double> <string>
-    1      0     -0.0 A
-    2     -1     -1.1 B
-    3     -2     -2.2 C
-    4     -3      NaN D
-    5  (nil)    (nil) (nil)
-    # Or it ’s shorter like this:
+    #<RedAmber::DataFrame : 5 x 3 Vectors, 0x00000000000dfffc>
+        index    float string
+      <uint8> <double> <string>
+    1       0     -0.0 A
+    2       1     -1.1 B
+    3       2     -2.2 C
+    4       3      NaN D
+    5   (nil)    (nil) (nil)
+    # Or we can use assigner by a Hash
     df.assign do
-      variables.select.with_object({}) do |(key, vector), assigner|
-        assigner[key] = vector * -1 if vector.numeric?
+      vectors.select.with_object({}) do |v, assigner|
+        assigner[v.key] = -v if v.float?
       end
     end
@@ -821,6 +906,96 @@ penguins.to_rover
   Symbol key and String key are considered as the same key.
+- Empty assignment
+  If assigner is empty or nil, returns self.
+- Append from left
+  The `assign_left` method accepts the same parameters and block as `assign`, but appends new columns from the left side.
+  ```ruby
+  df.assign_left(new_index: df.indices(1))
+  # =>
+  #<RedAmber::DataFrame : 5 x 4 Vectors, 0x000000000001787c>
+    new_index   index    float string
+      <uint8> <uint8> <double> <string>
+  1         1       0      0.0 A
+  2         2       1      1.1 B
+  3         3       2      2.2 C
+  4         4       3      NaN D
+  5         5   (nil)    (nil) (nil)
+  ```
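The difference between appending from the right and from the left can be sketched with an ordered Ruby Hash of `{ key => Array }`. Both functions below are hypothetical, gem-free illustrations. How `assign_left` treats keys that already exist is not stated in the text, so this sketch simply updates them in place; returning the input unchanged for an empty or nil assigner follows the "Empty assignment" note.

```ruby
# Hypothetical helper: update existing keys in place, append new keys on the right.
# `assigner` may be a Hash or an Array of [key, values] pairs.
def assign(columns, assigner)
  return columns if assigner.nil? || assigner.empty?

  columns.merge(assigner.to_h)
end

# Hypothetical helper: same inputs, but new keys are placed on the left.
# Assumption: existing keys keep their position and only get new values.
def assign_left(columns, assigner)
  return columns if assigner.nil? || assigner.empty?

  pairs = assigner.to_h
  new_pairs = pairs.reject { |key, _| columns.key?(key) }
  new_pairs.merge(columns).merge(pairs)
end

df = { index: [0, 1, 2], float: [0.0, 1.1, 2.2] }
p assign(df, [[:string, %w[A B C]]]).keys         # [:index, :float, :string]
p assign_left(df, { new_index: [1, 2, 3] }).keys  # [:new_index, :index, :float]
```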
+### `slice_by(key, keep_key: false) { block }`
+`slice_by` accepts a key and a block to select rows.
+(Since 0.2.1)
+  ```ruby
+  df = RedAmber::DataFrame.new(
+    index: [0, 1, 2, 3, nil],
+    float: [0.0, 1.1,  2.2, Float::NAN, nil],
+    string: ['A', 'B', 'C', 'D', nil]
+  )
+  df
+  # =>
+  #<RedAmber::DataFrame : 5 x 3 Vectors, 0x0000000000069e60>
+      index    float string
+    <uint8> <double> <string>
+  1       0      0.0 A
+  2       1      1.1 B
+  3       2      2.2 C
+  4       3      NaN D
+  5   (nil)    (nil) (nil)
+  df.slice_by(:string) { ["A", "C"] }
+  # =>
+  #<RedAmber::DataFrame : 2 x 2 Vectors, 0x000000000001b1ac>
+      index    float
+    <uint8> <double>
+  1       0      0.0
+  2       2      2.2
+  ```
+This behaves the same as:
+  ```ruby
+  df.slice { [string.index("A"), string.index("C")] }.drop(:string)
+  ```
+`slice_by` also accepts a Range.
+  ```ruby
+  df.slice_by(:string) { "A".."C" }
+  # =>
+  #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000069668>
+      index    float
+    <uint8> <double>
+  1       0      0.0
+  2       1      1.1
+  3       2      2.2
+  ```
+When the option `keep_key: true` is used, the column `key` will be preserved.
+  ```ruby
+  df.slice_by(:string, keep_key: true) { "A".."C" }
+  # =>
+  #<RedAmber::DataFrame : 3 x 3 Vectors, 0x0000000000073c44>
+      index    float string
+    <uint8> <double> <string>
+  1       0      0.0 A
+  2       1      1.1 B
+  3       2      2.2 C
+  ```
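The `slice_by` examples above can be read as "look up row positions by label in one column, slice every column at those positions, then drop the label column unless asked to keep it". The following gem-free sketch follows that reading; the free-standing `slice_by` function, its Hash-of-Arrays input, and the positional `selector` argument (in place of a block) are all assumptions made for illustration.

```ruby
# Hypothetical helper working on { key => Array } instead of a DataFrame.
# `selector` is an Array of labels or an inclusive Range of labels.
def slice_by(columns, key, selector, keep_key: false)
  labels = columns.fetch(key)
  positions =
    if selector.is_a?(Range)
      # Assumption: a Range means "from the row labelled begin to the row labelled end".
      (labels.index(selector.begin)..labels.index(selector.end)).to_a
    else
      selector.map { |label| labels.index(label) }
    end
  sliced = columns.transform_values { |values| values.values_at(*positions) }
  sliced.delete(key) unless keep_key
  sliced
end

df = {
  index: [0, 1, 2, 3, nil],
  float: [0.0, 1.1, 2.2, Float::NAN, nil],
  string: ['A', 'B', 'C', 'D', nil]
}
p slice_by(df, :string, %w[A C])                  # {index: [0, 2], float: [0.0, 2.2]}
p slice_by(df, :string, 'A'..'C', keep_key: true) # keeps the :string column
```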
 ## Updating
 ### `sort`
@@ -830,11 +1005,11 @@ penguins.to_rover
     - "-key" denotes descending order
   ```ruby
-  df = RedAmber::DataFrame.new({
+  df = RedAmber::DataFrame.new(
         index:  [1, 1, 0, nil, 0],
         string: ['C', 'B', nil, 'A', 'B'],
         bool:   [nil, true, false, true, false],
-      })
+      )
   df.sort(:index, '-bool')
   # =>
@@ -860,16 +1035,10 @@ penguins.to_rover
 ## Grouping
-### `group(aggregating_keys)`
-  (
-    This API will change in the future version. Especcially I want to change:
-      - Order of the column of the result (aggregation_keys should be the first)
-      - DataFrame#group will accept a block (heronshoes/red_amber #28)
-  )
+### `group(group_keys)`
   `group` creates a class `Group` object. `Group` accepts functions below as a method.
-  Method accepts options as `summary_keys`.
+  Method accepts options as `group_keys`.
   Available functions are:
@@ -889,8 +1058,8 @@ penguins.to_rover
   - [ ] tdigest
   - ✓ variance
-  For the each group of `aggregation_keys`, the aggregation `function` is applied and returns a new dataframe with aggregated keys according to `summary_keys`.
-  Aggregated key name is `function(summary_key)` style.
+  For each group of `group_keys`, the aggregation `function` is applied, and a new DataFrame is returned with aggregated keys according to `summary_keys`.
+  Summary key names are provided in the `function(summary_keys)` style.
   This is an example of grouping of famous STARWARS dataset.
@@ -900,18 +1069,18 @@ penguins.to_rover
   starwars
   # =>
-  #<RedAmber::DataFrame : 87 x 12 Vectors, 0x00000000000773bc>
-  species     name            height     mass hair_color skin_color  eye_color ... homeworld
-  <string>    <string>       <int64> <double> <string>   <string>    <string>  ... <string>
-  Human     1 Luke Skywalker     172     77.0 blond      fair        blue      ... Tatooine
-  Droid     2 C-3PO              167     75.0 NA         gold        yellow    ... Tatooine
-  Droid     3 R2-D2               96     32.0 NA         white, blue red       ... Naboo
-  Human     4 Darth Vader        202    136.0 none       white       yellow    ... Tatooine
-  Human     5 Leia Organa        150     49.0 brown      light       brown     ... Alderaan
-  :         : :                    :        : :          :           :         ... :
-  Droid    85 BB8              (nil)    (nil) none       none        black     ... NA
-  NA       86 Captain Phasma   (nil)    (nil) unknown    unknown     unknown   ... NA
-  Human    87 Padmé Amidala      165     45.0 brown      light       brown     ... Naboo
+  #<RedAmber::DataFrame : 87 x 12 Vectors, 0x0000000000005a50>
+     unnamed1 name            height     mass hair_color skin_color  eye_color ... species
+      <int64> <string>       <int64> <double> <string>   <string>    <string>  ... <string>
+   1        1 Luke Skywalker     172     77.0 blond      fair        blue      ... Human
+   2        2 C-3PO              167     75.0 NA         gold        yellow    ... Droid
+   3        3 R2-D2               96     32.0 NA         white, blue red       ... Droid
+   4        4 Darth Vader        202    136.0 none       white       yellow    ... Human
+   5        5 Leia Organa        150     49.0 brown      light       brown     ... Human
+   :        : :                    :        : :          :           :         ... :
+  85       85 BB8              (nil)    (nil) none       none        black     ... Droid
+  86       86 Captain Phasma   (nil)    (nil) unknown    unknown     unknown   ... NA
+  87       87 Padmé Amidala      165     45.0 brown      light       brown     ... Human
   starwars.tdr(12)
@@ -919,7 +1088,7 @@ penguins.to_rover
   RedAmber::DataFrame : 87 x 12 Vectors
   Vectors : 4 numeric, 8 strings
   #  key         type   level data_preview
-  1  :""         int64     87 [1, 2, 3, 4, 5, ... ]
+  1  :unnamed1   int64     87 [1, 2, 3, 4, 5, ... ]
   2  :name       string    87 ["Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia Organa", ... ]
   3  :height     int64     46 [172, 167, 96, 202, 150, ... ], 6 nils
   4  :mass       double    39 [77.0, 75.0, 32.0, 136.0, 49.0, ... ], 28 nils
@@ -933,82 +1102,176 @@ penguins.to_rover
   12 :species    string    38 ["Human", "Droid", "Droid", "Human", "Human", ... ]
   ```
-  We can aggregate for `:species` and calculate the mean of `:mass` and `:height`.
+  We can group by `:species` and calculate the count.
+  ```ruby
+  starwars.group(:species).count(:species)
+  # =>
+  #<RedAmber::DataFrame : 38 x 2 Vectors, 0x000000000001d6f0>
+     species    count
+     <string> <int64>
+   1 Human         35
+   2 Droid          6
+   3 Wookiee        2
+   4 Rodian         1
+   5 Hutt           1
+   : :              :
+  36 Kaleesh        1
+  37 Pau'an         1
+  38 Kel Dor        1
+  ```
+  We can also calculate the mean of `:mass` and `:height` together.
   ```ruby
-  grouped = starwars.group(:species).mean(:mass, :height)
-  grouped
+  grouped = starwars.group(:species) { [count(:species), mean(:height, :mass)] }
   # =>
-  #<RedAmber::DataFrame : 38 x 3 Vectors, 0x000000000008e620>
-     mean(mass) mean(height) species
-       <double>     <double> <string>
-   1       82.8        176.6 Human
-   2       69.8        131.2 Droid
-   3      124.0        231.0 Wookiee
-   4       74.0        173.0 Rodian
-   5     1358.0        175.0 Hutt
-   :          :            : :
-  36      159.0        216.0 Kaleesh
-  37       80.0        206.0 Pau'an
-  38       80.0        188.0 Kel Dor
+  #<RedAmber::DataFrame : 38 x 4 Vectors, 0x00000000000407cc>
+     species    count mean(height) mean(mass)
+     <string> <int64>     <double>   <double>
+   1 Human         35        176.6       82.8
+   2 Droid          6        131.2       69.8
+   3 Wookiee        2        231.0      124.0
+   4 Rodian         1        173.0       74.0
+   5 Hutt           1        175.0     1358.0
+   : :              :            :          :
+  36 Kaleesh        1        216.0      159.0
+  37 Pau'an         1        206.0       80.0
+  38 Kel Dor        1        188.0       80.0
   ```
   Select rows for count > 1.
   ```ruby
-  count = starwars.group(:species).count(:species)[:'count(species)'] # => Vector
-  grouped = grouped.slice(count > 1)
+  grouped.slice(grouped[:count] > 1)
   # =>
-  #<RedAmber::DataFrame : 9 x 3 Vectors, 0x0000000000098260>
-    mean(mass) mean(height) species
-      <double>     <double> <string>
-  1       82.8        176.6 Human
-  2       69.8        131.2 Droid
-  3      124.0        231.0 Wookiee
-  4       74.0        208.7 Gungan
-  5       48.0        181.3 NA
-  :          :            : :
-  7       55.0        179.0 Twi'lek
-  8       53.1        168.0 Mirialan
-  9       88.0        221.0 Kaminoan
+  #<RedAmber::DataFrame : 9 x 4 Vectors, 0x000000000004c270>
+    species    count mean(height) mean(mass)
+    <string> <int64>     <double>   <double>
+  1 Human         35        176.6       82.8
+  2 Droid          6        131.2       69.8
+  3 Wookiee        2        231.0      124.0
+  4 Gungan         3        208.7       74.0
+  5 NA             4        181.3       48.0
+  : :              :            :          :
+  7 Twi'lek        2        179.0       55.0
+  8 Mirialan       2        168.0       53.1
+  9 Kaminoan       2        221.0       88.0
   ```
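The group, aggregate and filter steps above can be mirrored in plain Ruby with `Enumerable#group_by`. This is a sketch for intuition only: `group_summary` and `mean_of` are hypothetical helpers, the rows are the first five records from the starwars preview, and `count` here counts rows per group, which may differ from how the library counts nils.

```ruby
# Hypothetical helper: mean of the non-nil values, or nil when there are none.
def mean_of(values)
  data = values.compact
  data.empty? ? nil : data.sum.to_f / data.size
end

# Hypothetical helper: one summary Hash per species, in order of first appearance.
def group_summary(rows)
  rows.group_by { |row| row[:species] }.map do |species, members|
    {
      species: species,
      count: members.size,
      'mean(height)': mean_of(members.map { |m| m[:height] }),
      'mean(mass)': mean_of(members.map { |m| m[:mass] })
    }
  end
end

rows = [
  { species: 'Human', height: 172, mass: 77.0 },
  { species: 'Droid', height: 167, mass: 75.0 },
  { species: 'Droid', height: 96,  mass: 32.0 },
  { species: 'Human', height: 202, mass: 136.0 },
  { species: 'Human', height: 150, mass: 49.0 }
]
summary = group_summary(rows)
# Keep only groups with more than one member, like slice(grouped[:count] > 1).
p summary.select { |row| row[:count] > 1 }
```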
-  Assemble the result and change the order of columns.
+## Reshape
+### `transpose`
+  Creates a transposed DataFrame from a wide (messy) DataFrame.
   ```ruby
-  grouped.assign(count: count[count > 1]).pick { [2,3,0,1].map{ |i| keys[i] } }
+  import_cars = RedAmber::DataFrame.load('test/entity/import_cars.tsv')
+  # =>
+  #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000d520>
+       Year    Audi     BMW BMW_MINI Mercedes-Benz      VW
+    <int64> <int64> <int64>  <int64>       <int64> <int64>
+  1    2017   28336   52527    25427         68221   49040
+  2    2018   26473   50982    25984         67554   51961
+  3    2019   24222   46814    23813         66553   46794
+  4    2020   22304   35712    20196         57041   36576
+  5    2021   22535   35905    18211         51722   35215
+  import_cars.transpose(:Manufacturer)
+  # =>
+  #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000ef74>
+    Manufacturer      2017     2018     2019     2020     2021
+    <dictionary>  <uint32> <uint32> <uint32> <uint16> <uint16>
+  1 Audi             28336    26473    24222    22304    22535
+  2 BMW              52527    50982    46814    35712    35905
+  3 BMW_MINI         25427    25984    23813    20196    18211
+  4 Mercedes-Benz    68221    67554    66553    57041    51722
+  5 VW               49040    51961    46794    36576    35215
+  ```
+  The leftmost column is created from the original keys. The key of that column is
+  set by the parameter `:name`. If `:name` is not specified, `:N` is used for the key.
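The transposition above can be sketched on a Hash of `{ key => Array }`: the values of one column become the new keys, and the remaining original keys become the leftmost column. `transpose_frame` is a hypothetical helper, and taking the first column as the source of new keys is an assumption inferred from the example output.

```ruby
# Hypothetical helper: transpose { key => Array } columns.
# `key` names the column whose values become the new keys (assumed: the first one);
# `name` is the key given to the new leftmost column of original keys.
def transpose_frame(columns, key: columns.keys.first, name: :N)
  labels = columns.fetch(key)
  rest = columns.reject { |k, _| k == key }
  result = { name => rest.keys }
  labels.each_with_index do |label, i|
    result[label] = rest.values.map { |values| values[i] }
  end
  result
end

cars = {
  Year: [2017, 2018],
  Audi: [28336, 26473],
  BMW: [52527, 50982]
}
p transpose_frame(cars, name: :Manufacturer)
# {Manufacturer: [:Audi, :BMW], 2017 => [28336, 52527], 2018 => [26473, 50982]}
```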
+### `to_long(*keep_keys)`
+  Creates a 'long' (tidy) DataFrame from a 'wide' DataFrame.
+  - Parameter `keep_keys` specifies the key names to keep.
+  ```ruby
+  import_cars.to_long(:Year)
   # =>
-  #<RedAmber::DataFrame : 9 x 4 Vectors, 0x0000000000141838>
-    species    count mean(mass) mean(height)
-    <string> <uint8>   <double>     <double>
-  1 Human         35       82.8        176.6
-  2 Droid          6       69.8        131.2
-  3 Wookiee        2      124.0        231.0
-  4 Gungan         3       74.0        208.7
-  5 NA             4       48.0        181.3
-  : :              :          :            :
-  7 Twi'lek        2       55.0        179.0
-  8 Mirialan       2       53.1        168.0
-  9 Kaminoan       2       88.0        221.0
+  #<RedAmber::DataFrame : 25 x 3 Vectors, 0x0000000000012750>
+         Year N                    V
+     <uint16> <dictionary>  <uint32>
+   1     2017 Audi             28336
+   2     2017 BMW              52527
+   3     2017 BMW_MINI         25427
+   4     2017 Mercedes-Benz    68221
+   5     2017 VW               49040
+   :        : :                    :
+  23     2021 BMW_MINI         18211
+  24     2021 Mercedes-Benz    51722
+  25     2021 VW               35215
   ```
-## Combining DataFrames
+  - Option `:name` is the key of the column which came **from key names**.
+  - Option `:value` is the key of the column which came **from values**.
-- [ ] Combining rows to a dataframe
+  ```ruby
+  import_cars.to_long(:Year, name: :Manufacturer, value: :Num_of_imported)
-- [ ] Add vars
+  # =>
+  #<RedAmber::DataFrame : 25 x 3 Vectors, 0x0000000000017700>
+         Year Manufacturer  Num_of_imported
+     <uint16> <dictionary>         <uint32>
+   1     2017 Audi                    28336
+   2     2017 BMW                     52527
+   3     2017 BMW_MINI                25427
+   4     2017 Mercedes-Benz           68221
+   5     2017 VW                      49040
+   :        : :                           :
+  23     2021 BMW_MINI                18211
+  24     2021 Mercedes-Benz           51722
+  25     2021 VW                      35215
+  ```
-- [ ] Inner join
+### `to_wide`
-- [ ] Left join
+  Creates a 'wide' (messy) DataFrame from a 'long' DataFrame.
-## Encoding
+  - Option `:name` is the key of the column which will be expanded **to key names**.
+  - Option `:value` is the key of the column which will be expanded **to values**.
-- [ ] One-hot encoding
+  ```ruby
+  import_cars.to_long(:Year).to_wide
+  # import_cars.to_long(:Year).to_wide(name: :N, value: :V)
+  # is also OK
-## Iteration (not impremented)
+  # =>
+  #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000f0f0>
+        Year     Audi      BMW BMW_MINI Mercedes-Benz       VW
+    <uint16> <uint16> <uint16> <uint16>      <uint32> <uint16>
+  1     2017    28336    52527    25427         68221    49040
+  2     2018    26473    50982    25984         67554    51961
+  3     2019    24222    46814    23813         66553    46794
+  4     2020    22304    35712    20196         57041    36576
+  5     2021    22535    35905    18211         51722    35215
+  # == import_cars
+  ```
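The round trip between wide and long layouts can be sketched without the gem on a Hash of `{ key => Array }`. Both functions are hypothetical illustrations that borrow the option names `name:` and `value:` and their defaults `:N` and `:V` from the text; type conversions such as `<dictionary>` are not modelled.

```ruby
# Hypothetical helper: melt every column except `keep_keys` into name/value pairs,
# one output row per (input row, melted column), in row-major order.
def to_long(columns, *keep_keys, name: :N, value: :V)
  melt_keys = columns.keys - keep_keys
  long = (keep_keys + [name, value]).to_h { |k| [k, []] }
  columns.values.first.size.times do |i|
    melt_keys.each do |melt_key|
      keep_keys.each { |k| long[k] << columns[k][i] }
      long[name] << melt_key
      long[value] << columns[melt_key][i]
    end
  end
  long
end

# Hypothetical helper: the inverse, spreading the `name` column out into new keys.
def to_wide(long, name: :N, value: :V)
  keep_keys = long.keys - [name, value]
  rows = {} # kept values of one row => { new_key => cell }
  long[name].each_index do |i|
    identity = keep_keys.map { |k| long[k][i] }
    (rows[identity] ||= {})[long[name][i]] = long[value][i]
  end
  wide = {}
  keep_keys.each_with_index { |k, j| wide[k] = rows.keys.map { |id| id[j] } }
  long[name].uniq.each { |key| wide[key] = rows.values.map { |cells| cells[key] } }
  wide
end

cars = { Year: [2017, 2018], Audi: [28336, 26473], BMW: [52527, 50982] }
long = to_long(cars, :Year)
p long                   # {Year: [2017, 2017, 2018, 2018], N: [:Audi, :BMW, :Audi, :BMW], V: [...]}
p to_wide(long) == cars  # true
```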
+## Combine
+- [ ] Combining dataframes
-- [ ] each_rows
+- [ ] Join
+## Encoding
+- [ ] One-hot encoding