RubyGems - red_amber - Versions diffs - 0.2.1 → 0.2.3 - Mend

red_amber 0.2.1 → 0.2.3

Files changed (58)

checksums.yaml +4 -4
data/.rubocop.yml +15 -0
data/CHANGELOG.md +170 -20
data/Gemfile +4 -2
data/README.md +121 -302
data/benchmark/basic.yml +79 -0
data/benchmark/combine.yml +63 -0
data/benchmark/drop_nil.yml +15 -3
data/benchmark/group.yml +33 -0
data/benchmark/reshape.yml +27 -0
data/benchmark/{csv_load_penguins.yml → rover/csv_load_penguins.yml} +3 -3
data/benchmark/rover/flights.yml +23 -0
data/benchmark/rover/penguins.yml +23 -0
data/benchmark/rover/planes.yml +23 -0
data/benchmark/rover/weather.yml +23 -0
data/doc/DataFrame.md +611 -318
data/doc/Vector.md +31 -36
data/doc/image/basic_verbs.png +0 -0
data/doc/image/dataframe/assign.png +0 -0
data/doc/image/dataframe/assign_operation.png +0 -0
data/doc/image/dataframe/drop.png +0 -0
data/doc/image/dataframe/join.png +0 -0
data/doc/image/dataframe/pick.png +0 -0
data/doc/image/dataframe/pick_operation.png +0 -0
data/doc/image/dataframe/remove.png +0 -0
data/doc/image/dataframe/rename.png +0 -0
data/doc/image/dataframe/rename_operation.png +0 -0
data/doc/image/dataframe/reshaping_DataFrames.png +0 -0
data/doc/image/dataframe/set_and_bind.png +0 -0
data/doc/image/dataframe/slice.png +0 -0
data/doc/image/dataframe/slice_operation.png +0 -0
data/doc/image/dataframe_model.png +0 -0
data/doc/image/group_operation.png +0 -0
data/doc/image/replace-if_then.png +0 -0
data/doc/image/reshaping_dataframe.png +0 -0
data/doc/image/screenshot.png +0 -0
data/doc/image/vector/binary_element_wise.png +0 -0
data/doc/image/vector/unary_aggregation.png +0 -0
data/doc/image/vector/unary_aggregation_w_option.png +0 -0
data/doc/image/vector/unary_element_wise.png +0 -0
data/lib/red_amber/data_frame.rb +16 -42
data/lib/red_amber/data_frame_combinable.rb +283 -0
data/lib/red_amber/data_frame_displayable.rb +58 -3
data/lib/red_amber/data_frame_loadsave.rb +36 -0
data/lib/red_amber/data_frame_reshaping.rb +8 -6
data/lib/red_amber/data_frame_selectable.rb +9 -9
data/lib/red_amber/data_frame_variable_operation.rb +27 -21
data/lib/red_amber/group.rb +100 -17
data/lib/red_amber/helper.rb +20 -30
data/lib/red_amber/vector.rb +56 -30
data/lib/red_amber/vector_functions.rb +0 -8
data/lib/red_amber/vector_selectable.rb +9 -1
data/lib/red_amber/vector_updatable.rb +61 -63
data/lib/red_amber/version.rb +1 -1
data/lib/red_amber.rb +2 -0
data/red_amber.gemspec +1 -1
metadata +32 -11
data/doc/examples_of_red_amber.ipynb +0 -8979

data/doc/DataFrame.md CHANGED

@@ -5,7 +5,8 @@ Class `RedAmber::DataFrame` represents 2D-data. A `DataFrame` consists with:
 - A label is attached to `Vector`. We call it `key`.
 - A `Vector` and associated `key` is grouped as a `variable`.
 - `variable`s with same vector length are aligned and arranged to be a `DataFrame`.
-- Each `Vector` in a `DataFrame` contains a set of relating data at same position. We call it `observation`.
+  - Each `key` in a `DataFrame` must be unique.
+- Each `Vector` in a `DataFrame` contains a set of relating data at same position. We call it `record` or `observation`.
 ![dataframe model image](doc/../image/dataframe_model.png)
@@ -14,30 +15,38 @@ Class `RedAmber::DataFrame` represents 2D-data. A `DataFrame` consists with:
 ### `new` from a Hash
   ```ruby
-  RedAmber::DataFrame.new(x: [1, 2, 3])
+  df = RedAmber::DataFrame.new(x: [1, 2, 3], y: %w[A B C])
   ```
 ### `new` from a schema (by Hash) and data (by Array)
   ```ruby
-  RedAmber::DataFrame.new({:x=>:uint8}, [[1], [2], [3]])
+  RedAmber::DataFrame.new({x: :uint8, y: :string}, [[1, "A"], [2, "B"], [3, "C"]])
   ```
 ### `new` from an Arrow::Table
   ```ruby
-  table = Arrow::Table.new(x: [1, 2, 3])
+  table = Arrow::Table.new(x: [1, 2, 3], y: %w[A B C])
   RedAmber::DataFrame.new(table)
   ```
+### `new` from an Object which responds to `to_arrow`
+  ```ruby
+  require "datasets-arrow"
+  dataset = Datasets::Penguins.new
+  RedAmber::DataFrame.new(dataset)
+  ```
 ### `new` from a Rover::DataFrame
   ```ruby
   require 'rover'
-  rover = Rover::DataFrame.new(x: [1, 2, 3])
+  rover = Rover::DataFrame.new(x: [1, 2, 3], y: %w[A B C])
   RedAmber::DataFrame.new(rover)
   ```
@@ -63,7 +72,7 @@ Class `RedAmber::DataFrame` represents 2D-data. A `DataFrame` consists with:
   ```ruby
   require 'parquet'
-  dataframe = RedAmber::DataFrame.load("file.parquet")
+  df = RedAmber::DataFrame.load("file.parquet")
   ```
 ### `save` (instance method)
@@ -79,20 +88,20 @@ Class `RedAmber::DataFrame` represents 2D-data. A `DataFrame` consists with:
   ```ruby
   require 'parquet'
-  dataframe.save("file.parquet")
+  df.save("file.parquet")
   ```
 ## Properties
 ### `table`, `to_arrow`
-- Reader of Arrow::Table object inside.
+- Returns Arrow::Table object in the DataFrame.
-### `size`, `n_obs`, `n_rows`
+### `size`, `n_records`, `n_obs`, `n_rows`
-- Returns size of Vector (num of observations).
-### `n_keys`, `n_vars`, `n_cols`,
+- Returns size of Vector (num of records).
+### `n_keys`, `n_variables`, `n_vars`, `n_cols`,
 - Returns num of keys (num of variables).
@@ -130,16 +139,7 @@ Class `RedAmber::DataFrame` represents 2D-data. A `DataFrame` consists with:
 - Returns key names in an Array.
-  When we use it with vectors, Vector#key is useful to get the key inside of DataFrame.
-  ```ruby
-    # update numeric variables, another solution
-    df.assign do
-      vectors.each_with_object({}) do |vector, assigner|
-        assigner[vector.key] = vector * -1 if vector.numeric?
-      end
-    end
-  ```
+  Each key must be unique in the DataFrame.
 ### `types`
@@ -153,9 +153,20 @@ Class `RedAmber::DataFrame` represents 2D-data. A `DataFrame` consists with:
 - Returns an Array of Vectors.
+  When we use it, Vector#key is useful to get the key in the DataFrame.
+  ```ruby
+    # update numeric variables, another solution
+    df.assign do
+      vectors.each_with_object({}) do |vector, assigner|
+        assigner[vector.key] = vector * -1 if vector.numeric?
+      end
+    end
+  ```
 ### `indices`, `indexes`
-- Returns indexes in an Array.
+- Returns indexes in a Vector.
   Accepts an option `start` as the first of indexes.
   ```ruby
@@ -163,15 +174,19 @@ Class `RedAmber::DataFrame` represents 2D-data. A `DataFrame` consists with:
   df.indices
   # =>
+  #<RedAmber::Vector(:uint8, size=5):0x0000000000013ed4>
   [0, 1, 2, 3, 4]
   df.indices(1)
   # =>
+  #<RedAmber::Vector(:uint8, size=5):0x0000000000018fd8>
   [1, 2, 3, 4, 5]
   df.indices(:a)
   # =>
+  #<RedAmber::Vector(:dictionary, size=5):0x000000000001bd50>
   [:a, :b, :c, :d, :e]
   ```
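
Note on the hunk above: the three documented `indices` outputs are consistent with counting up from `start` one step at a time. The plain-Ruby sketch below reproduces them without the gem. `indices_sketch` is a hypothetical helper, it returns an Array rather than a `RedAmber::Vector`, and stepping with `succ` is an assumption about the behaviour, not necessarily how red_amber implements it.

```ruby
# Hypothetical helper, not part of red_amber: builds `size` labels
# starting at `start` by calling #succ repeatedly (assumed behaviour).
def indices_sketch(size, start = 0)
  labels = []
  current = start
  size.times do
    labels << current
    current = current.succ # Integer#succ and Symbol#succ both exist
  end
  labels
end

p indices_sketch(5)      # [0, 1, 2, 3, 4]
p indices_sketch(5, 1)   # [1, 2, 3, 4, 5]
p indices_sketch(5, :a)  # [:a, :b, :c, :d, :e]
```

The default of `0` matches the zero-based row labels that the display hunks later in this diff switch to.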
@@ -210,15 +225,15 @@ puts penguins.to_s
 # =>
     species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
     <string> <string>        <double>      <double>           <uint8> ... <uint16>
-  1 Adelie   Torgersen           39.1          18.7               181 ...     2007
-  2 Adelie   Torgersen           39.5          17.4               186 ...     2007
-  3 Adelie   Torgersen           40.3          18.0               195 ...     2007
-  4 Adelie   Torgersen          (nil)         (nil)             (nil) ...     2007
-  5 Adelie   Torgersen           36.7          19.3               193 ...     2007
+  0 Adelie   Torgersen           39.1          18.7               181 ...     2007
+  1 Adelie   Torgersen           39.5          17.4               186 ...     2007
+  2 Adelie   Torgersen           40.3          18.0               195 ...     2007
+  3 Adelie   Torgersen          (nil)         (nil)             (nil) ...     2007
+  4 Adelie   Torgersen           36.7          19.3               193 ...     2007
   : :        :                      :             :                 : ...        :
-342 Gentoo   Biscoe              50.4          15.7               222 ...     2009
-343 Gentoo   Biscoe              45.2          14.8               212 ...     2009
-344 Gentoo   Biscoe              49.9          16.1               213 ...     2009
+341 Gentoo   Biscoe              50.4          15.7               222 ...     2009
+342 Gentoo   Biscoe              45.2          14.8               212 ...     2009
+343 Gentoo   Biscoe              49.9          16.1               213 ...     2009
 ```
 ### `inspect`
@@ -235,11 +250,11 @@ puts penguins.summary.to_s(width: 82) # needs more width to show all stats in th
 # =>
   variables            count     mean      std      min      25%   median      75%      max
   <dictionary>      <uint16> <double> <double> <double> <double> <double> <double> <double>
-1 bill_length_mm         342    43.92     5.46     32.1    39.23    44.38     48.5     59.6
-2 bill_depth_mm          342    17.15     1.97     13.1     15.6    17.32     18.7     21.5
-3 flipper_length_mm      342   200.92    14.06    172.0    190.0    197.0    213.0    231.0
-4 body_mass_g            342  4201.75   801.95   2700.0   3550.0   4031.5   4750.0   6300.0
-5 year                   344  2008.03     0.82   2007.0   2007.0   2008.0   2009.0   2009.0
+0 bill_length_mm         342    43.92     5.46     32.1    39.23    44.38     48.5     59.6
+1 bill_depth_mm          342    17.15     1.97     13.1     15.6    17.32     18.7     21.5
+2 flipper_length_mm      342   200.92    14.06    172.0    190.0    197.0    213.0    231.0
+3 body_mass_g            342  4201.75   801.95   2700.0   3550.0   4031.5   4750.0   6300.0
+4 year                   344  2008.03     0.82   2007.0   2007.0   2008.0   2009.0   2009.0
 ```
 ### `to_rover`
@@ -265,26 +280,29 @@ penguins.to_rover
   require 'red_amber'
   require 'datasets-arrow'
-  penguins = Datasets::Penguins.new.to_arrow
-  RedAmber::DataFrame.new(penguins).tdr
+  dataset = Datasets::Penguins.new
+  # (From 0.2.2) responsible to the object which has `to_arrow` method.
+  # If older, it should be `dataset.to_arrow` in the parentheses.
+  RedAmber::DataFrame.new(dataset).tdr
   # =>
   RedAmber::DataFrame : 344 x 8 Vectors
   Vectors : 5 numeric, 3 strings
   # key                type   level data_preview
-  1 :species           string     3 {"Adelie"=>152, "Chinstrap"=>68, "Gentoo"=>124}
-  2 :island            string     3 {"Torgersen"=>52, "Biscoe"=>168, "Dream"=>124}
-  3 :bill_length_mm    double   165 [39.1, 39.5, 40.3, nil, 36.7, ... ], 2 nils
-  4 :bill_depth_mm     double    81 [18.7, 17.4, 18.0, nil, 19.3, ... ], 2 nils
-  5 :flipper_length_mm uint8     56 [181, 186, 195, nil, 193, ... ], 2 nils
-  6 :body_mass_g       uint16    95 [3750, 3800, 3250, nil, 3450, ... ], 2 nils
-  7 :sex               string     3 {"male"=>168, "female"=>165, nil=>11}
-  8 :year              uint16     3 {2007=>110, 2008=>114, 2009=>120}
+  0 :species           string     3 {"Adelie"=>152, "Chinstrap"=>68, "Gentoo"=>124}
+  1 :island            string     3 {"Torgersen"=>52, "Biscoe"=>168, "Dream"=>124}
+  2 :bill_length_mm    double   165 [39.1, 39.5, 40.3, nil, 36.7, ... ], 2 nils
+  3 :bill_depth_mm     double    81 [18.7, 17.4, 18.0, nil, 19.3, ... ], 2 nils
+  4 :flipper_length_mm uint8     56 [181, 186, 195, nil, 193, ... ], 2 nils
+  5 :body_mass_g       uint16    95 [3750, 3800, 3250, nil, 3450, ... ], 2 nils
+  6 :sex               string     3 {"male"=>168, "female"=>165, nil=>11}
+  7 :year              uint16     3 {2007=>110, 2008=>114, 2009=>120}
   ```
+  Options:
   - limit: limit of variables to show. Default value is 10.
-  - tally: max level to use tally mode.
-  - elements: max num of element to show values in each observations.
+  - tally: max level to use tally mode. Default value is 5.
+  - elements: max num of element to show values in each records. Default value is 5.
 ## Selecting
@@ -294,13 +312,13 @@ penguins.to_rover
 - Keys in an Array: `df[:symbol1, "string", :symbol2]`
 - Keys by indeces: `df[df.keys[0]`, `df[df.keys[1,2]]`, `df[df.keys[1..]]`
-  Key indeces can be used via `keys[i]` because numbers are used to select observations (rows).
+  Key indeces should be used via `keys[i]` because numbers are used to select records (rows). See next section.
 - Keys by a Range:
-  If keys are able to represent by Range, it can be included in the arguments. See a example below.
+  If keys are able to represent by a Range, it can be included in the arguments. See a example below.
-- You can exchange the order of variables (columns).
+- You can also exchange the order of variables (columns).
   ```ruby
   hash = {a: [1, 2, 3], b: %w[A B C], c: [1.0, 2, 3]}
@@ -311,12 +329,12 @@ penguins.to_rover
   #<RedAmber::DataFrame : 3 x 3 Vectors, 0x00000000000328fc>
     b               c       a
     <string> <double> <uint8>
-  1 A             1.0       1
-  2 B             2.0       2
-  3 C             3.0       3
+  0 A             1.0       1
+  1 B             2.0       2
+  2 C             3.0       3
   ```
-  If `#[]` represents single variable (column), it returns a Vector object.
+  If `#[]` represents a single variable (column), it returns a Vector object.
   ```ruby
   df[:a]
@@ -325,6 +343,7 @@ penguins.to_rover
   #<RedAmber::Vector(:uint8, size=3):0x000000000000f140>
   [1, 2, 3]
   ```
   Or `#v` method also returns a Vector for a key.
   ```ruby
@@ -335,18 +354,19 @@ penguins.to_rover
   [1, 2, 3]
   ```
-  This may be useful to use in a block of DataFrame manipulation verbs. We can write `v(:a)` rather than `self[:a]` or `df[:a]`
+  This method may be useful to use in a block of DataFrame manipulation verbs. We can write `v(:a)` rather than `self[:a]` or `df[:a]`
-### Select observations (rows in a table) by `[]` as `[index]`, `[range]`, `[array]`
+### Select records (rows in a table) by `[]` as `[index]`, `[range]`, `[array]`
-- Select a obs. by index: `df[0]`
-- Select obs. by indeces in a Range: `df[1..2]`
+- Select a record by index: `df[0]`
-  An end-less or a begin-less Range can be used to represent indeces.
+- Select records by indeces in an Array: `df[1, 2]`
-- Select obs. by indeces in an Array: `df[1, 2]`
+- Select records by indeces in a Range: `df[1..2]`
-- You can use float indices.
+  An end-less or a begin-less Range can be used to represent indeces.
+- You can use indices in Float.
 - Mixed case: `df[2, 0..]`
@@ -359,15 +379,15 @@ penguins.to_rover
   #<RedAmber::DataFrame : 4 x 3 Vectors, 0x0000000000033270>
           a b               c
     <uint8> <string> <double>
-  1       3 C             3.0
-  2       1 A             1.0
-  3       2 B             2.0
-  4       3 C             3.0
+  0       3 C             3.0
+  1       1 A             1.0
+  2       2 B             2.0
+  3       3 C             3.0
   ```
-- Select obs. by a boolean Array or a boolean RedAmber::Vector at same size as self.
+- Select records by a boolean Array or a boolean RedAmber::Vector at same size as self.
-  It returns a sub dataframe with observations at boolean is true.
+  It returns a sub dataframe with records at boolean is true.
     ```ruby
     # with the same dataframe `df` above
@@ -382,15 +402,15 @@ penguins.to_rover
     1       1 A             1.0
     ```
-### Select rows from top or from bottom
+### Select records (rows) from top or from bottom
   `head(n=5)`, `tail(n=5)`, `first(n=1)`, `last(n=1)`
 ## Sub DataFrame manipulations
-### `pick  ` - pick up variables by key label -
+### `pick  ` - pick up variables -
-  Pick up some columns (variables) to create a sub DataFrame.
+  Pick up some variables (columns) to create a sub DataFrame.
   ![pick method image](doc/../image/dataframe/pick.png)
@@ -405,15 +425,15 @@ penguins.to_rover
     #<RedAmber::DataFrame : 344 x 2 Vectors, 0x0000000000035ebc>
         species  bill_length_mm
         <string>       <double>
-      1 Adelie             39.1
-      2 Adelie             39.5
-      3 Adelie             40.3
-      4 Adelie            (nil)
-      5 Adelie             36.7
+      0 Adelie             39.1
+      1 Adelie             39.5
+      2 Adelie             40.3
+      3 Adelie            (nil)
+      4 Adelie             36.7
       : :                     :
-    342 Gentoo             50.4
-    343 Gentoo             45.2
-    344 Gentoo             49.9
+    341 Gentoo             50.4
+    342 Gentoo             45.2
+    343 Gentoo             49.9
     ```
 - Indices as arguments
@@ -427,15 +447,15 @@ penguins.to_rover
     #<RedAmber::DataFrame : 344 x 4 Vectors, 0x0000000000055ce4>
         species  island    bill_length_mm     year
         <string> <string>        <double> <uint16>
-      1 Adelie   Torgersen           39.1     2007
-      2 Adelie   Torgersen           39.5     2007
-      3 Adelie   Torgersen           40.3     2007
-      4 Adelie   Torgersen          (nil)     2007
-      5 Adelie   Torgersen           36.7     2007
+      0 Adelie   Torgersen           39.1     2007
+      1 Adelie   Torgersen           39.5     2007
+      2 Adelie   Torgersen           40.3     2007
+      3 Adelie   Torgersen          (nil)     2007
+      4 Adelie   Torgersen           36.7     2007
       : :        :                      :        :
-    342 Gentoo   Biscoe              50.4     2009
-    343 Gentoo   Biscoe              45.2     2009
-    344 Gentoo   Biscoe              49.9     2009
+    341 Gentoo   Biscoe              50.4     2009
+    342 Gentoo   Biscoe              45.2     2009
+    343 Gentoo   Biscoe              49.9     2009
     ```
 - Booleans as arguments
@@ -443,21 +463,21 @@ penguins.to_rover
   `pick(booleans)` accepts booleans as arguments in an Array. Booleans must be same length as `n_keys`.
     ```ruby
-    penguins.pick(penguins.types.map { |type| type == :string })
+    penguins.pick(penguins.vectors.map(&:string?))
     # =>
     #<RedAmber::DataFrame : 344 x 3 Vectors, 0x00000000000387ac>
         species  island    sex
         <string> <string>  <string>
-      1 Adelie   Torgersen male
+      0 Adelie   Torgersen male
+      1 Adelie   Torgersen female
       2 Adelie   Torgersen female
-      3 Adelie   Torgersen female
-      4 Adelie   Torgersen (nil)
-      5 Adelie   Torgersen female
+      3 Adelie   Torgersen (nil)
+      4 Adelie   Torgersen female
       : :        :         :
-    342 Gentoo   Biscoe    male
-    343 Gentoo   Biscoe    female
-    344 Gentoo   Biscoe    male
+    341 Gentoo   Biscoe    male
+    342 Gentoo   Biscoe    female
+    343 Gentoo   Biscoe    male
     ```
 - Keys or booleans by a block
@@ -471,20 +491,20 @@ penguins.to_rover
     #<RedAmber::DataFrame : 344 x 3 Vectors, 0x000000000003dd4c>
         bill_length_mm bill_depth_mm flipper_length_mm
               <double>      <double>           <uint8>
-      1           39.1          18.7               181
-      2           39.5          17.4               186
-      3           40.3          18.0               195
-      4          (nil)         (nil)             (nil)
-      5           36.7          19.3               193
+      0           39.1          18.7               181
+      1           39.5          17.4               186
+      2           40.3          18.0               195
+      3          (nil)         (nil)             (nil)
+      4           36.7          19.3               193
       :              :             :                 :
-    342           50.4          15.7               222
-    343           45.2          14.8               212
-    344           49.9          16.1               213
+    341           50.4          15.7               222
+    342           45.2          14.8               212
+    343           49.9          16.1               213
     ```
-### `drop  ` - pick and drop -
+### `drop  ` - counterpart of pick -
-  Drop some columns (variables) to create a remainer DataFrame.
+  Drop some variables (columns) to create a remainer DataFrame.
   ![drop method image](doc/../image/dataframe/drop.png)
@@ -526,9 +546,9 @@ penguins.to_rover
   #<RedAmber::DataFrame : 3 x 1 Vector, 0x000000000003f4bc>
           a
     <uint8>
-  1       1
-  2       2
-  3       3
+  0       1
+  1       2
+  2       3
   df[:a]
@@ -548,9 +568,9 @@ penguins.to_rover
   [1, 2, 3]
   ```
-### `slice  `  - to cut vertically is slice -
+### `slice  `  - slice and select records -
-  Slice and select rows (observations) to create a sub DataFrame.
+  Slice and select records (rows) to create a sub DataFrame.
   ![slice method image](doc/../image/dataframe/slice.png)
@@ -561,22 +581,22 @@ penguins.to_rover
     Negative index from the tail like Ruby's Array is also acceptable.
     ```ruby
-    # returns 5 obs. at start and 5 obs. from end
+    # returns 5 records at start and 5 records from end
     penguins.slice(0...5, -5..-1)
     # =>
     #<RedAmber::DataFrame : 10 x 8 Vectors, 0x0000000000042be4>
-       species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
-       <string> <string>        <double>      <double>           <uint8> ... <uint16>
-     1 Adelie   Torgersen           39.1          18.7               181 ...     2007
-     2 Adelie   Torgersen           39.5          17.4               186 ...     2007
-     3 Adelie   Torgersen           40.3          18.0               195 ...     2007
-     4 Adelie   Torgersen          (nil)         (nil)             (nil) ...     2007
-     5 Adelie   Torgersen           36.7          19.3               193 ...     2007
-     : :        :                      :             :                 : ...        :
-     8 Gentoo   Biscoe              50.4          15.7               222 ...     2009
-     9 Gentoo   Biscoe              45.2          14.8               212 ...     2009
-    10 Gentoo   Biscoe              49.9          16.1               213 ...     2009
+      species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
+      <string> <string>        <double>      <double>           <uint8> ... <uint16>
+    0 Adelie   Torgersen           39.1          18.7               181 ...     2007
+    1 Adelie   Torgersen           39.5          17.4               186 ...     2007
+    2 Adelie   Torgersen           40.3          18.0               195 ...     2007
+    3 Adelie   Torgersen          (nil)         (nil)             (nil) ...     2007
+    4 Adelie   Torgersen           36.7          19.3               193 ...     2007
+    : :        :                      :             :                 : ...        :
+    7 Gentoo   Biscoe              50.4          15.7               222 ...     2009
+    8 Gentoo   Biscoe              45.2          14.8               212 ...     2009
+    9 Gentoo   Biscoe              49.9          16.1               213 ...     2009
     ```
 - Booleans as an argument
@@ -591,15 +611,15 @@ penguins.to_rover
     #<RedAmber::DataFrame : 242 x 8 Vectors, 0x0000000000043d3c>
         species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
         <string> <string>        <double>      <double>           <uint8> ... <uint16>
-      1 Adelie   Torgersen           40.3          18.0               195 ...     2007
-      2 Adelie   Torgersen           42.0          20.2               190 ...     2007
-      3 Adelie   Torgersen           41.1          17.6               182 ...     2007
-      4 Adelie   Torgersen           42.5          20.7               197 ...     2007
-      5 Adelie   Torgersen           46.0          21.5               194 ...     2007
+      0 Adelie   Torgersen           40.3          18.0               195 ...     2007
+      1 Adelie   Torgersen           42.0          20.2               190 ...     2007
+      2 Adelie   Torgersen           41.1          17.6               182 ...     2007
+      3 Adelie   Torgersen           42.5          20.7               197 ...     2007
+      4 Adelie   Torgersen           46.0          21.5               194 ...     2007
       : :        :                      :             :                 : ...        :
-    240 Gentoo   Biscoe              50.4          15.7               222 ...     2009
-    241 Gentoo   Biscoe              45.2          14.8               212 ...     2009
-    242 Gentoo   Biscoe              49.9          16.1               213 ...     2009
+    239 Gentoo   Biscoe              50.4          15.7               222 ...     2009
+    240 Gentoo   Biscoe              45.2          14.8               212 ...     2009
+    241 Gentoo   Biscoe              49.9          16.1               213 ...     2009
     ```
 - Indices or booleans by a block
@@ -619,15 +639,15 @@ penguins.to_rover
     #<RedAmber::DataFrame : 204 x 8 Vectors, 0x0000000000047a40>
         species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
         <string> <string>        <double>      <double>           <uint8> ... <uint16>
-      1 Adelie   Torgersen           39.1          18.7               181 ...     2007
-      2 Adelie   Torgersen           39.5          17.4               186 ...     2007
-      3 Adelie   Torgersen           40.3          18.0               195 ...     2007
-      4 Adelie   Torgersen           39.3          20.6               190 ...     2007
-      5 Adelie   Torgersen           38.9          17.8               181 ...     2007
+      0 Adelie   Torgersen           39.1          18.7               181 ...     2007
+      1 Adelie   Torgersen           39.5          17.4               186 ...     2007
+      2 Adelie   Torgersen           40.3          18.0               195 ...     2007
+      3 Adelie   Torgersen           39.3          20.6               190 ...     2007
+      4 Adelie   Torgersen           38.9          17.8               181 ...     2007
       : :        :                      :             :                 : ...        :
-    202 Gentoo   Biscoe              47.2          13.7               214 ...     2009
-    203 Gentoo   Biscoe              46.8          14.3               215 ...     2009
-    204 Gentoo   Biscoe              45.2          14.8               212 ...     2009
+    201 Gentoo   Biscoe              47.2          13.7               214 ...     2009
+    202 Gentoo   Biscoe              46.8          14.3               215 ...     2009
+    203 Gentoo   Biscoe              45.2          14.8               212 ...     2009
     ```
 - Notice: nil option
@@ -656,9 +676,9 @@ penguins.to_rover
     0	1	A	  1.000000
     ```
-### `remove`
+### `remove` - counterpart of slice -
-  Slice and reject rows (observations) to create a remainer DataFrame.
+  Slice and reject records (rows) to create a remainer DataFrame.
   ![remove method image](doc/../image/dataframe/remove.png)
@@ -667,22 +687,22 @@ penguins.to_rover
     `remove(indeces)` accepts indeces as arguments. Indeces should be an Integer or a Range of Integer.
     ```ruby
-    # returns 6th to 339th obs.
+    # returns 6th to 339th records
     penguins.remove(0...5, -5..-1)
     # =>
     #<RedAmber::DataFrame : 334 x 8 Vectors, 0x00000000000487c4>
         species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
         <string> <string>        <double>      <double>           <uint8> ... <uint16>
-      1 Adelie   Torgersen           39.3          20.6               190 ...     2007
-      2 Adelie   Torgersen           38.9          17.8               181 ...     2007
-      3 Adelie   Torgersen           39.2          19.6               195 ...     2007
-      4 Adelie   Torgersen           34.1          18.1               193 ...     2007
-      5 Adelie   Torgersen           42.0          20.2               190 ...     2007
+      0 Adelie   Torgersen           39.3          20.6               190 ...     2007
+      1 Adelie   Torgersen           38.9          17.8               181 ...     2007
+      2 Adelie   Torgersen           39.2          19.6               195 ...     2007
+      3 Adelie   Torgersen           34.1          18.1               193 ...     2007
+      4 Adelie   Torgersen           42.0          20.2               190 ...     2007
       : :        :                      :             :                 : ...        :
-    332 Gentoo   Biscoe              44.5          15.7               217 ...     2009
-    333 Gentoo   Biscoe              48.8          16.2               222 ...     2009
-    334 Gentoo   Biscoe              47.2          13.7               214 ...     2009
+    331 Gentoo   Biscoe              44.5          15.7               217 ...     2009
+    332 Gentoo   Biscoe              48.8          16.2               222 ...     2009
+    333 Gentoo   Biscoe              47.2          13.7               214 ...     2009
     ```
 - Booleans as an argument
@@ -690,7 +710,7 @@ penguins.to_rover
   `remove(booleans)` accepts booleans as an argument in an Array, a Vector or an Arrow::BooleanArray . Booleans must be same length as `size`.
     ```ruby
-    # remove all observation contains nil
+    # remove all records contains nil
     removed = penguins.remove { vectors.map(&:is_nil).reduce(&:|) }
     removed
@@ -698,15 +718,15 @@ penguins.to_rover
     #<RedAmber::DataFrame : 333 x 8 Vectors, 0x0000000000049fac>
         species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
         <string> <string>        <double>      <double>           <uint8> ... <uint16>
-      1 Adelie   Torgersen           39.1          18.7               181 ...     2007
-      2 Adelie   Torgersen           39.5          17.4               186 ...     2007
-      3 Adelie   Torgersen           40.3          18.0               195 ...     2007
-      4 Adelie   Torgersen           36.7          19.3               193 ...     2007
-      5 Adelie   Torgersen           39.3          20.6               190 ...     2007
+      0 Adelie   Torgersen           39.1          18.7               181 ...     2007
+      1 Adelie   Torgersen           39.5          17.4               186 ...     2007
+      2 Adelie   Torgersen           40.3          18.0               195 ...     2007
+      3 Adelie   Torgersen           36.7          19.3               193 ...     2007
+      4 Adelie   Torgersen           39.3          20.6               190 ...     2007
       : :        :                      :             :                 : ...        :
-    331 Gentoo   Biscoe              50.4          15.7               222 ...     2009
-    332 Gentoo   Biscoe              45.2          14.8               212 ...     2009
-    333 Gentoo   Biscoe              49.9          16.1               213 ...     2009
+    330 Gentoo   Biscoe              50.4          15.7               222 ...     2009
+    331 Gentoo   Biscoe              45.2          14.8               212 ...     2009
+    332 Gentoo   Biscoe              49.9          16.1               213 ...     2009
     ```
 - Indices or booleans by a block
@@ -727,15 +747,15 @@ penguins.to_rover
     #<RedAmber::DataFrame : 140 x 8 Vectors, 0x000000000004de40>
         species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
         <string> <string>        <double>      <double>           <uint8> ... <uint16>
-      1 Adelie   Torgersen          (nil)         (nil)             (nil) ...     2007
-      2 Adelie   Torgersen           36.7          19.3               193 ...     2007
-      3 Adelie   Torgersen           34.1          18.1               193 ...     2007
-      4 Adelie   Torgersen           37.8          17.1               186 ...     2007
-      5 Adelie   Torgersen           37.8          17.3               180 ...     2007
+      0 Adelie   Torgersen          (nil)         (nil)             (nil) ...     2007
+      1 Adelie   Torgersen           36.7          19.3               193 ...     2007
+      2 Adelie   Torgersen           34.1          18.1               193 ...     2007
+      3 Adelie   Torgersen           37.8          17.1               186 ...     2007
+      4 Adelie   Torgersen           37.8          17.3               180 ...     2007
       : :        :                      :             :                 : ...        :
-    138 Gentoo   Biscoe             (nil)         (nil)             (nil) ...     2009
-    139 Gentoo   Biscoe              50.4          15.7               222 ...     2009
-    140 Gentoo   Biscoe              49.9          16.1               213 ...     2009
+    137 Gentoo   Biscoe             (nil)         (nil)             (nil) ...     2009
+    138 Gentoo   Biscoe              50.4          15.7               222 ...     2009
+    139 Gentoo   Biscoe              49.9          16.1               213 ...     2009
     ```
 - Notice for nil
@@ -770,13 +790,13 @@ penguins.to_rover
     #<RedAmber::DataFrame : 2 x 3 Vectors, 0x000000000005df98>
             a b               c
       <uint8> <string> <double>
-    1       1 A             1.0
-    2   (nil) C             3.0
+    0       1 A             1.0
+    1   (nil) C             3.0
     ```
 ### `rename`
-  Rename keys (column names) to create a updated DataFrame.
+  Rename keys (variable/column names) to create an updated DataFrame.
   ![rename method image](doc/../image/dataframe/rename.png)
@@ -792,9 +812,9 @@ penguins.to_rover
     #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000060838>
       name     age_in_1993
       <string>     <uint8>
-    1 Yasuko            68
-    2 Rui               49
-    3 Hinata            28
+    0 Yasuko            68
+    1 Rui               49
+    2 Hinata            28
     ```
 - Key pairs by a block
@@ -811,7 +831,7 @@ penguins.to_rover
 ### `assign`
-  Assign new or updated columns (variables) and create a updated DataFrame.
+  Assign new or updated variables (columns) and create an updated DataFrame.
   - Variables with new keys will append new columns from the right.
   - Variables with existing keys will update corresponding vectors.
@@ -832,9 +852,9 @@ penguins.to_rover
     #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000062804>
       name         age
       <string> <uint8>
-    1 Yasuko        68
-    2 Rui           49
-    3 Hinata        28
+    0 Yasuko        68
+    1 Rui           49
+    2 Hinata        28
     # update :age and add :brother
     df.assign do
@@ -848,9 +868,9 @@ penguins.to_rover
     #<RedAmber::DataFrame : 3 x 3 Vectors, 0x00000000000658b0>
       name         age brother
       <string> <uint8> <string>
-    1 Yasuko        97 Santa
-    2 Rui           78 (nil)
-    3 Hinata        57 Momotaro
+    0 Yasuko        97 Santa
+    1 Rui           78 (nil)
+    2 Hinata        57 Momotaro
     ```
 - Key pairs by a block
@@ -869,11 +889,11 @@ penguins.to_rover
     #<RedAmber::DataFrame : 5 x 3 Vectors, 0x0000000000069e60>
         index    float string
       <uint8> <double> <string>
-    1       0      0.0 A
-    2       1      1.1 B
-    3       2      2.2 C
-    4       3      NaN D
-    5   (nil)    (nil) (nil)
+    0       0      0.0 A
+    1       1      1.1 B
+    2       2      2.2 C
+    3       3      NaN D
+    4   (nil)    (nil) (nil)
     # update :float
     # assigner by an Array
@@ -886,11 +906,11 @@ penguins.to_rover
     #<RedAmber::DataFrame : 5 x 3 Vectors, 0x00000000000dfffc>
         index    float string
       <uint8> <double> <string>
-    1       0     -0.0 A
-    2       1     -1.1 B
-    3       2     -2.2 C
-    4       3      NaN D
-    5   (nil)    (nil) (nil)
+    0       0     -0.0 A
+    1       1     -1.1 B
+    2       2     -2.2 C
+    3       3      NaN D
+    4   (nil)    (nil) (nil)
     # Or we can use assigner by a Hash
     df.assign do
@@ -921,11 +941,11 @@ penguins.to_rover
   #<RedAmber::DataFrame : 5 x 4 Vectors, 0x000000000001787c>
     new_index   index    float string
       <uint8> <uint8> <double> <string>
-  1         1       0      0.0 A
-  2         2       1      1.1 B
-  3         3       2      2.2 C
-  4         4       3      NaN D
-  5         5   (nil)    (nil) (nil)
+  0         1       0      0.0 A
+  1         2       1      1.1 B
+  2         3       2      2.2 C
+  3         4       3      NaN D
+  4         5   (nil)    (nil) (nil)
   ```
 ### `slice_by(key, keep_key: false) { block }`
@@ -946,11 +966,11 @@ penguins.to_rover
   #<RedAmber::DataFrame : 5 x 3 Vectors, 0x0000000000069e60>
       index    float string
     <uint8> <double> <string>
-  1       0      0.0 A
-  2       1      1.1 B
-  3       2      2.2 C
-  4       3      NaN D
-  5   (nil)    (nil) (nil)
+  0       0      0.0 A
+  1       1      1.1 B
+  2       2      2.2 C
+  3       3      NaN D
+  4   (nil)    (nil) (nil)
   df.slice_by(:string) { ["A", "C"] }
@@ -958,8 +978,8 @@ penguins.to_rover
   #<RedAmber::DataFrame : 2 x 2 Vectors, 0x000000000001b1ac>
       index    float
     <uint8> <double>
-  1       0      0.0
-  2       2      2.2
+  0       0      0.0
+  1       2      2.2
   ```
 It is the same behavior as:
@@ -977,9 +997,9 @@ It is the same behavior as;
   #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000069668>
       index    float
     <uint8> <double>
-  1       0      0.0
-  2       1      1.1
-  3       2      2.2
+  0       0      0.0
+  1       1      1.1
+  2       2      2.2
   ```
 When the option `keep_key: true` is used, the column `key` is preserved.
@@ -991,16 +1011,16 @@ When the option `keep_key: true` used, the column `key` will be preserved.
   #<RedAmber::DataFrame : 3 x 3 Vectors, 0x0000000000073c44>
       index    float string
     <uint8> <double> <string>
-  1       0      0.0 A
-  2       1      1.1 B
-  3       2      2.2 C
+  0       0      0.0 A
+  1       1      1.1 B
+  2       2      2.2 C
   ```
 ## Updating
 ### `sort`
-  `sort` accepts parameters as sort_keys thanks to the amazing Red Arrow feature。
+  `sort` accepts parameters as sort_keys, thanks to a Red Arrow feature.
     - :key, "key" or "+key" denotes ascending order
     - "-key" denotes descending order
@@ -1016,11 +1036,11 @@ When the option `keep_key: true` used, the column `key` will be preserved.
   #<RedAmber::DataFrame : 5 x 3 Vectors, 0x000000000009b03c>
       index string   bool
     <uint8> <string> <boolean>
-  1       0 (nil)    false
-  2       0 B        false
-  3       1 B        true
-  4       1 C        (nil)
-  5   (nil) A        true
+  0       0 (nil)    false
+  1       0 B        false
+  2       1 B        true
+  3       1 C        (nil)
+  4   (nil) A        true
   ```
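The key notation above can be modeled in plain Ruby. This is a minimal sketch of the convention only (`:key`, `"key"` or `"+key"` for ascending, `"-key"` for descending); the helper name `parse_sort_key` is hypothetical and is not part of RedAmber or Red Arrow.

```ruby
# Hypothetical helper: split a sort key into [column, direction].
# It only illustrates the "+key" / "-key" notation; it does not sort anything.
def parse_sort_key(key)
  text = key.to_s
  if text.start_with?("-")
    [text.delete_prefix("-").to_sym, :descending]
  else
    # A leading "+" is optional for ascending order.
    [text.delete_prefix("+").to_sym, :ascending]
  end
end

p parse_sort_key(:index)
p parse_sort_key("+string")
p parse_sort_key("-bool")
```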
 - [ ] Clamp
@@ -1031,13 +1051,13 @@ When the option `keep_key: true` used, the column `key` will be preserved.
 ### `remove_nil`
-  Remove any observations containing nil.
+  Remove any records containing nil.
 ## Grouping
 ### `group(group_keys)`
-  `group` creates a class `Group` object. `Group` accepts functions below as a method.
+  `group` creates an instance of class `Group`. `Group` accepts the functions below as methods.
   Method accepts options as `group_keys`.
   Available functions are:
@@ -1064,23 +1084,22 @@ When the option `keep_key: true` used, the column `key` will be preserved.
   This is an example of grouping the famous STARWARS dataset.
   ```ruby
-  starwars =
-    RedAmber::DataFrame.load(URI("https://vincentarelbundock.github.io/Rdatasets/csv/dplyr/starwars.csv"))
-  starwars
+  uri = URI("https://vincentarelbundock.github.io/Rdatasets/csv/dplyr/starwars.csv")
+  starwars = RedAmber::DataFrame.load(uri)
   # =>
   #<RedAmber::DataFrame : 87 x 12 Vectors, 0x0000000000005a50>
      unnamed1 name            height     mass hair_color skin_color  eye_color ... species
       <int64> <string>       <int64> <double> <string>   <string>    <string>  ... <string>
-   1        1 Luke Skywalker     172     77.0 blond      fair        blue      ... Human
-   2        2 C-3PO              167     75.0 NA         gold        yellow    ... Droid
-   3        3 R2-D2               96     32.0 NA         white, blue red       ... Droid
-   4        4 Darth Vader        202    136.0 none       white       yellow    ... Human
-   5        5 Leia Organa        150     49.0 brown      light       brown     ... Human
+   0        1 Luke Skywalker     172     77.0 blond      fair        blue      ... Human
+   1        2 C-3PO              167     75.0 NA         gold        yellow    ... Droid
+   2        3 R2-D2               96     32.0 NA         white, blue red       ... Droid
+   3        4 Darth Vader        202    136.0 none       white       yellow    ... Human
+   4        5 Leia Organa        150     49.0 brown      light       brown     ... Human
    :        : :                    :        : :          :           :         ... :
-  85       85 BB8              (nil)    (nil) none       none        black     ... Droid
-  86       86 Captain Phasma   (nil)    (nil) unknown    unknown     unknown   ... NA
-  87       87 Padmé Amidala      165     45.0 brown      light       brown     ... Human
+  84       85 BB8              (nil)    (nil) none       none        black     ... Droid
+  85       86 Captain Phasma   (nil)    (nil) unknown    unknown     unknown   ... NA
+  86       87 Padmé Amidala      165     45.0 brown      light       brown     ... Human
   starwars.tdr(12)
@@ -1088,58 +1107,60 @@ When the option `keep_key: true` used, the column `key` will be preserved.
   RedAmber::DataFrame : 87 x 12 Vectors
   Vectors : 4 numeric, 8 strings
   #  key         type   level data_preview
-  1  :unnamed1   int64     87 [1, 2, 3, 4, 5, ... ]
-  2  :name       string    87 ["Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia Organa", ... ]
-  3  :height     int64     46 [172, 167, 96, 202, 150, ... ], 6 nils
-  4  :mass       double    39 [77.0, 75.0, 32.0, 136.0, 49.0, ... ], 28 nils
-  5  :hair_color string    13 ["blond", "NA", "NA", "none", "brown", ... ]
-  6  :skin_color string    31 ["fair", "gold", "white, blue", "white", "light", ... ]
-  7  :eye_color  string    15 ["blue", "yellow", "red", "yellow", "brown", ... ]
-  8  :birth_year double    37 [19.0, 112.0, 33.0, 41.9, 19.0, ... ], 44 nils
-  9  :sex        string     5 {"male"=>60, "none"=>6, "female"=>16, "hermaphroditic"=>1, "NA"=>4}
-  10 :gender     string     3 {"masculine"=>66, "feminine"=>17, "NA"=>4}
-  11 :homeworld  string    49 ["Tatooine", "Tatooine", "Naboo", "Tatooine", "Alderaan", ... ]
-  12 :species    string    38 ["Human", "Droid", "Droid", "Human", "Human", ... ]
+  0  :unnamed1   int64     87 [1, 2, 3, 4, 5, ... ]
+  1  :name       string    87 ["Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia Organa", ... ]
+  2  :height     int64     46 [172, 167, 96, 202, 150, ... ], 6 nils
+  3  :mass       double    39 [77.0, 75.0, 32.0, 136.0, 49.0, ... ], 28 nils
+  4  :hair_color string    13 ["blond", "NA", "NA", "none", "brown", ... ]
+  5  :skin_color string    31 ["fair", "gold", "white, blue", "white", "light", ... ]
+  6  :eye_color  string    15 ["blue", "yellow", "red", "yellow", "brown", ... ]
+  7  :birth_year double    37 [19.0, 112.0, 33.0, 41.9, 19.0, ... ], 44 nils
+  8  :sex        string     5 {"male"=>60, "none"=>6, "female"=>16, "hermaphroditic"=>1, "NA"=>4}
+  9  :gender     string     3 {"masculine"=>66, "feminine"=>17, "NA"=>4}
+  10 :homeworld  string    49 ["Tatooine", "Tatooine", "Naboo", "Tatooine", "Alderaan", ... ]
+  11 :species    string    38 ["Human", "Droid", "Droid", "Human", "Human", ... ]
   ```
   We can group by `:species` and calculate the count.
   ```ruby
-  starwars.group(:species).count(:species)
+  starwars.remove { species == "NA" }
+          .group(:species).count(:species)
   # =>
-  #<RedAmber::DataFrame : 38 x 2 Vectors, 0x000000000001d6f0>
+  #<RedAmber::DataFrame : 37 x 2 Vectors, 0x000000000000ffa0>
      species    count
      <string> <int64>
-   1 Human         35
-   2 Droid          6
-   3 Wookiee        2
-   4 Rodian         1
-   5 Hutt           1
+   0 Human         35
+   1 Droid          6
+   2 Wookiee        2
+   3 Rodian         1
+   4 Hutt           1
    : :              :
-  36 Kaleesh        1
-  37 Pau'an         1
-  38 Kel Dor        1
+  34 Kaleesh        1
+  35 Pau'an         1
+  36 Kel Dor        1
   ```
   We can also calculate the mean of `:mass` and `:height` together.
   ```ruby
-  grouped = starwars.group(:species) { [count(:species), mean(:height, :mass)] }
+  grouped = starwars.remove { species == "NA" }
+                    .group(:species) { [count(:species), mean(:height, :mass)] }
   # =>
-  #<RedAmber::DataFrame : 38 x 4 Vectors, 0x00000000000407cc>
-     specie  s    count mean(height) mean(mass)
-     <strin  g> <int64>     <double>   <double>
-   1 Human           35        176.6       82.8
-   2 Droid            6        131.2       69.8
-   3 Wookie  e        2        231.0      124.0
-   4 Rodian           1        173.0       74.0
-   5 Hutt             1        175.0     1358.0
-   : :                :            :          :
-  36 Kalees  h        1        216.0      159.0
-  37 Pau'an           1        206.0       80.0
-  38 Kel Dor        1        188.0       80.0
+  #<RedAmber::DataFrame : 37 x 4 Vectors, 0x000000000000fff0>
+     species    count mean(height) mean(mass)
+     <string> <int64>     <double>   <double>
+   0 Human         35       176.65      82.78
+   1 Droid          6        131.2      69.75
+   2 Wookiee        2        231.0      124.0
+   3 Rodian         1        173.0       74.0
+   4 Hutt           1        175.0     1358.0
+   : :              :            :          :
+  34 Kaleesh        1        216.0      159.0
+  35 Pau'an         1        206.0       80.0
+  36 Kel Dor        1        188.0       80.0
   ```
   Select rows for count > 1.
@@ -1148,22 +1169,23 @@ When the option `keep_key: true` used, the column `key` will be preserved.
   grouped.slice(grouped[:count] > 1)
   # =>
-  #<RedAmber::DataFrame : 9 x 4 Vectors, 0x000000000004c270>
+  #<RedAmber::DataFrame : 8 x 4 Vectors, 0x000000000001002c>
     species    count mean(height) mean(mass)
     <string> <int64>     <double>   <double>
-  1 Human         35        176.6       82.8
-  2 Droid          6        131.2       69.8
-  3 Wookiee        2        231.0      124.0
-  4 Gungan         3        208.7       74.0
-  5 NA             4        181.3       48.0
-  : :              :            :          :
-  7 Twi'lek        2        179.0       55.0
-  8 Mirialan       2        168.0       53.1
-  9 Kaminoan       2        221.0       88.0
+  0 Human         35       176.65      82.78
+  1 Droid          6        131.2      69.75
+  2 Wookiee        2        231.0      124.0
+  3 Gungan         3       208.67       74.0
+  4 Zabrak         2        173.0       80.0
+  5 Twi'lek        2        179.0       55.0
+  6 Mirialan       2        168.0       53.1
+  7 Kaminoan       2        221.0       88.0
   ```
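For readers who want to see what the grouped count and mean compute, here is a plain-Ruby sketch over the first five rows of the dataset shown above. It does not use RedAmber; the helper `mean` and the assumption that nil values are skipped when averaging are illustrative choices, not confirmed behavior of the library.

```ruby
# First five records from the starwars preview, as plain Hashes.
rows = [
  { name: "Luke Skywalker", species: "Human", height: 172, mass: 77.0 },
  { name: "C-3PO",          species: "Droid", height: 167, mass: 75.0 },
  { name: "R2-D2",          species: "Droid", height: 96,  mass: 32.0 },
  { name: "Darth Vader",    species: "Human", height: 202, mass: 136.0 },
  { name: "Leia Organa",    species: "Human", height: 150, mass: 49.0 }
]

# Hypothetical helper: mean that ignores nil (assumed behavior for this sketch).
def mean(values)
  present = values.compact
  return nil if present.empty?

  present.sum.to_f / present.size
end

# One summary row per species: count plus the mean of height and mass.
summary = rows.group_by { |row| row[:species] }.map do |species, members|
  {
    species: species,
    count: members.size,
    mean_height: mean(members.map { |m| m[:height] }),
    mean_mass: mean(members.map { |m| m[:mass] })
  }
end

# Keep only groups with more than one member, like the slice above.
multi = summary.select { |group| group[:count] > 1 }
p multi
```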
 ## Reshape
+![dataframe reshaping image](doc/../image/reshaping_dataframe.png)
 ### `transpose`
   Creates a transposed DataFrame from a wide (messy) DataFrame.
@@ -1175,30 +1197,31 @@ When the option `keep_key: true` used, the column `key` will be preserved.
   #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000d520>
        Year    Audi     BMW BMW_MINI Mercedes-Benz      VW
     <int64> <int64> <int64>  <int64>       <int64> <int64>
-  1    2017   28336   52527    25427         68221   49040
-  2    2018   26473   50982    25984         67554   51961
-  3    2019   24222   46814    23813         66553   46794
-  4    2020   22304   35712    20196         57041   36576
-  5    2021   22535   35905    18211         51722   35215
-  import_cars.transpose(:Manufacturer)
+  0    2017   28336   52527    25427         68221   49040
+  1    2018   26473   50982    25984         67554   51961
+  2    2019   24222   46814    23813         66553   46794
+  3    2020   22304   35712    20196         57041   36576
+  4    2021   22535   35905    18211         51722   35215
+  import_cars.transpose(name: :Manufacturer)
   # =>
-  #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000ef74>
+  #<RedAmber::DataFrame : 5 x 6 Vectors, 0x0000000000010a2c>
     Manufacturer      2017     2018     2019     2020     2021
-    <dictionary>  <uint32> <uint32> <uint32> <uint16> <uint16>
-  1 Audi             28336    26473    24222    22304    22535
-  2 BMW              52527    50982    46814    35712    35905
-  3 BMW_MINI         25427    25984    23813    20196    18211
-  4 Mercedes-Benz    68221    67554    66553    57041    51722
-  5 VW               49040    51961    46794    36576    35215
+    <string>      <uint32> <uint32> <uint32> <uint16> <uint16>
+  0 Audi             28336    26473    24222    22304    22535
+  1 BMW              52527    50982    46814    35712    35905
+  2 BMW_MINI         25427    25984    23813    20196    18211
+  3 Mercedes-Benz    68221    67554    66553    57041    51722
+  4 VW               49040    51961    46794    36576    35215
   ```
   The leftmost column is created by original keys. Key name of the column is
-  named by parameter `:name`. If `:name` is not specified, `:N` is used for the key.
+  named by parameter `:name`. If `:name` is not specified, `:NAME` is used for the key.
 ### `to_long(*keep_keys)`
-  Creates a 'long' (tidy) DataFrame from a 'wide' DataFrame.
+  Creates a 'long' (possibly tidy) DataFrame from a 'wide' DataFrame.
   - Parameter `keep_keys` specifies the key names to keep.
@@ -1206,47 +1229,51 @@ When the option `keep_key: true` used, the column `key` will be preserved.
   import_cars.to_long(:Year)
   # =>
-  #<RedAmber::DataFrame : 25 x 3 Vectors, 0x0000000000012750>
-         Year N                    V
-     <uint16> <dictionary>  <uint32>
-   1     2017 Audi             28336
-   2     2017 BMW              52527
-   3     2017 BMW_MINI         25427
-   4     2017 Mercedes-Benz    68221
-   5     2017 VW               49040
+  #<RedAmber::DataFrame : 25 x 3 Vectors, 0x0000000000011864>
+         Year NAME             VALUE
+     <uint16> <string>      <uint32>
+   0     2017 Audi             28336
+   1     2017 BMW              52527
+   2     2017 BMW_MINI         25427
+   3     2017 Mercedes-Benz    68221
+   4     2017 VW               49040
    :        : :                    :
-  23     2021 BMW_MINI         18211
-  24     2021 Mercedes-Benz    51722
-  25     2021 VW               35215
+  22     2021 BMW_MINI         18211
+  23     2021 Mercedes-Benz    51722
+  24     2021 VW               35215
   ```
   - Option `:name` is the key of the column which came **from key names**.
+    The default value is `:NAME` if it is not specified.
   - Option `:value` is the key of the column which came **from values**.
+    The default value is `:VALUE` if it is not specified.
   ```ruby
   import_cars.to_long(:Year, name: :Manufacturer, value: :Num_of_imported)
   # =>
-  #<RedAmber::DataFrame : 25 x 3 Vectors, 0x0000000000017700>
+  #<RedAmber::DataFrame : 25 x 3 Vectors, 0x000000000001359c>
          Year Manufacturer  Num_of_imported
-     <uint16> <dictionary>         <uint32>
-   1     2017 Audi                    28336
-   2     2017 BMW                     52527
-   3     2017 BMW_MINI                25427
-   4     2017 Mercedes-Benz           68221
-   5     2017 VW                      49040
+     <uint16> <string>             <uint32>
+   0     2017 Audi                    28336
+   1     2017 BMW                     52527
+   2     2017 BMW_MINI                25427
+   3     2017 Mercedes-Benz           68221
+   4     2017 VW                      49040
    :        : :                           :
-  23     2021 BMW_MINI                18211
-  24     2021 Mercedes-Benz           51722
-  25     2021 VW                      35215
+  22     2021 BMW_MINI                18211
+  23     2021 Mercedes-Benz           51722
+  24     2021 VW                      35215
   ```
 ### `to_wide`
-  Creates a 'wide' (messy) DataFrame from a 'long' DataFrame.
+  Creates a 'wide' (possibly messy) DataFrame from a 'long' DataFrame.
   - Option `:name` is the key of the column which will be expanded **to key names**.
+    The default value is `:NAME` if it is not specified.
   - Option `:value` is the key of the column which will be expanded **to values**.
+    The default value is `:VALUE` if it is not specified.
   ```ruby
   import_cars.to_long(:Year).to_wide
@@ -1257,20 +1284,286 @@ When the option `keep_key: true` used, the column `key` will be preserved.
   #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000f0f0>
         Year     Audi      BMW BMW_MINI Mercedes-Benz       VW
     <uint16> <uint16> <uint16> <uint16>      <uint32> <uint16>
-  1     2017    28336    52527    25427         68221    49040
-  2     2018    26473    50982    25984         67554    51961
-  3     2019    24222    46814    23813         66553    46794
-  4     2020    22304    35712    20196         57041    36576
-  5     2021    22535    35905    18211         51722    35215
-  # == import_cars
+  0     2017    28336    52527    25427         68221    49040
+  1     2018    26473    50982    25984         67554    51961
+  2     2019    24222    46814    23813         66553    46794
+  3     2020    22304    35712    20196         57041    36576
+  4     2021    22535    35905    18211         51722    35215
   ```
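The round trip between the two shapes can be sketched in plain Ruby. The code below is an illustration on a two-year, two-manufacturer subset of `import_cars`; it does not call RedAmber, and the helpers `to_long_rows` and `to_wide_rows` are hypothetical names that only mirror the `:NAME` / `:VALUE` defaults described above.

```ruby
# Wide table as an Array of row Hashes (subset of import_cars).
wide = [
  { Year: 2017, Audi: 28336, BMW: 52527 },
  { Year: 2018, Audi: 26473, BMW: 50982 }
]

# Hypothetical helper: emit one long row per (kept key, other key) pair.
def to_long_rows(rows, keep, name: :NAME, value: :VALUE)
  rows.flat_map do |row|
    row.reject { |key, _| key == keep }.map do |key, val|
      { keep => row[keep], name => key.to_s, value => val }
    end
  end
end

# Hypothetical helper: the inverse, regrouping long rows by the kept key.
def to_wide_rows(rows, keep, name: :NAME, value: :VALUE)
  rows.group_by { |row| row[keep] }.map do |kept, group|
    group.each_with_object({ keep => kept }) do |row, out|
      out[row[name].to_sym] = row[value]
    end
  end
end

long = to_long_rows(wide, :Year)
p long.first
p to_wide_rows(long, :Year) == wide
```

Passing `name:` and `value:` to both helpers renames the two generated columns, in the same spirit as the `to_long(:Year, name: :Manufacturer, value: :Num_of_imported)` example.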
 ## Combine
-- [ ] Combining dataframes
+### `join`
+![dataframe joining image](doc/../image/dataframe/join.png)
+  Use one of the specific `*_join` methods below.
+  - `other` is a DataFrame or an Arrow::Table.
+  - `join_keys` are the keys shared by self and other on which records are matched.
+  - If `join_keys` are empty, common keys in self and other are chosen (natural join).
+  - If there are more common keys than `join_keys`, the duplicated keys are renamed by `suffix`.
+  ```ruby
+  df = DataFrame.new(
+    KEY: %w[A B C],
+    X1: [1, 2, 3]
+  )
+  #=>
+  #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000012a70>
+    KEY           X1
+    <string> <uint8>
+  0 A              1
+  1 B              2
+  2 C              3
+  other = DataFrame.new(
+    KEY: %w[A B D],
+    X2: [true, false, nil]
+  )
+  #=>
+  #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000017034>
+    KEY      X2
+    <string> <boolean>
+  0 A        true
+  1 B        false
+  2 D        (nil)
+  ```
+#### Mutating joins
+##### `inner_join(other, join_keys = nil, suffix: '.1')`
+  Join data, leaving only the matching records.
+  ```ruby
+  df.inner_join(other, :KEY)
+  #=>
+  #<RedAmber::DataFrame : 2 x 3 Vectors, 0x000000000001e2bc>
+    KEY           X1 X2
+    <string> <uint8> <boolean>
+  0 A              1 true
+  1 B              2 false
+  ```
+##### `full_join(other, join_keys = nil, suffix: '.1')`
+  Join data, leaving all records.
+  ```ruby
+  df.full_join(other, :KEY)
+  #=>
+  #<RedAmber::DataFrame : 4 x 3 Vectors, 0x0000000000029fcc>
+    KEY           X1 X2
+    <string> <uint8> <boolean>
+  0 A              1 true
+  1 B              2 false
+  2 C              3 (nil)
+  3 D          (nil) (nil)
+  ```
-- [ ] Join
+##### `left_join(other, join_keys = nil, suffix: '.1')`
+  Join matching values to self from other.
+  ```ruby
+  df.left_join(other, :KEY)
+  #=>
+  #<RedAmber::DataFrame : 3 x 3 Vectors, 0x0000000000029fcc>
+    KEY           X1 X2
+    <string> <uint8> <boolean>
+  0 A              1 true
+  1 B              2 false
+  2 C              3 (nil)
+  ```
+##### `right_join(other, join_keys = nil, suffix: '.1')`
+  Join matching values from self to other.
+  ```ruby
+  df.right_join(other, :KEY)
+  #=>
+  #<RedAmber::DataFrame : 3 x 3 Vectors, 0x0000000000029fcc>
+    KEY           X1 X2
+    <string> <uint8> <boolean>
+  0 A              1 true
+  1 B              2 false
+  2 D          (nil) (nil)
+  ```
+#### Filtering joins
+##### `semi_join(other, join_keys = nil, suffix: '.1')`
+  Return records of self that have a match in other.
+  ```ruby
+  df.semi_join(other, :KEY)
+  #=>
+  #<RedAmber::DataFrame : 2 x 2 Vectors, 0x0000000000029fcc>
+    KEY           X1
+    <string> <uint8>
+  0 A              1
+  1 B              2
+  ```
+##### `anti_join(other, join_keys = nil, suffix: '.1')`
+  Return records of self that do not have a match in other.
+  ```ruby
+  df.anti_join(other, :KEY)
+  #=>
+  #<RedAmber::DataFrame : 1 x 2 Vectors, 0x0000000000029fcc>
+    KEY           X1
+    <string> <uint8>
+  0 C              3
+  ```
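The difference between the mutating and filtering joins can be checked with a plain-Ruby model of the same `df` and `other` data. This sketch does not use RedAmber and says nothing about how the library implements joins; it assumes the join key is unique on the right-hand side, as it is in the example.

```ruby
# Rows of df and other from the example above, as plain Hashes.
df_rows = [
  { KEY: "A", X1: 1 },
  { KEY: "B", X1: 2 },
  { KEY: "C", X1: 3 }
]
other_rows = [
  { KEY: "A", X2: true },
  { KEY: "B", X2: false },
  { KEY: "D", X2: nil }
]

# Index the right-hand rows by the join key (unique in this example).
other_by_key = other_rows.to_h { |row| [row[:KEY], row] }

# Inner join: keep only rows whose key appears on both sides.
inner = df_rows.filter_map do |row|
  match = other_by_key[row[:KEY]]
  row.merge(match) if match
end

# Left join: keep every row of self; X2 stays nil when there is no match.
left = df_rows.map do |row|
  row.merge(X2: nil).merge(other_by_key.fetch(row[:KEY], {}))
end

# Semi join: rows of self that have a match in other (no columns added).
semi = df_rows.select { |row| other_by_key.key?(row[:KEY]) }

# Anti join: rows of self that have no match in other.
anti = df_rows.reject { |row| other_by_key.key?(row[:KEY]) }

p inner
p left
p semi
p anti
```

Mutating joins (`inner`, `left`) add the `X2` column, while filtering joins (`semi`, `anti`) only choose which rows of self to keep.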
+## Set operations
+![dataframe set and binding image](doc/../image/dataframe/set_and_bind.png)
+  Keys in self and other must be the same in set operations.
+  ```ruby
+  df = DataFrame.new(
+    KEY1: %w[A B C],
+    KEY2: [1, 2, 3]
+  )
+  #=>
+  #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000012a70>
+    KEY1        KEY2
+    <string> <uint8>
+  0 A              1
+  1 B              2
+  2 C              3
+  other = DataFrame.new(
+    KEY1: %w[A B D],
+    KEY2: [1, 4, 5]
+  )
+  #=>
+  #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000017034>
+    KEY1        KEY2
+    <string> <uint8>
+  0 A              1
+  1 B              4
+  2 D              5
+  ```
+##### `intersect(other)`
+  Select records appearing in both self and other.
+  ```ruby
+  df.intersect(other)
+  #=>
+  #<RedAmber::DataFrame : 1 x 2 Vectors, 0x0000000000029fcc>
+    KEY1        KEY2
+    <string> <uint8>
+  0 A              1
+  ```
+##### `union(other)`
+  Select records appearing in self or other.
+  ```ruby
+  df.union(other)
+  #=>
+  #<RedAmber::DataFrame : 5 x 2 Vectors, 0x0000000000029fcc>
+    KEY1        KEY2
+    <string> <uint8>
+  0 A              1
+  1 B              2
+  2 C              3
+  3 B              4
+  4 D              5
+  ```
+##### `difference(other)`
+  Select records appearing in self but not in other.
+  It has an alias `setdiff`.
+  ```ruby
+  df.difference(other)
+  #=>
+  #<RedAmber::DataFrame : 2 x 2 Vectors, 0x0000000000029fcc>
+    KEY1        KEY2
+    <string> <uint8>
+  0 B              2
+  1 C              3
+  ```
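The three set operations behave like Ruby's own Array set operators applied to whole records. The sketch below models each record of `df` and `other` as a `[KEY1, KEY2]` pair; it is an analogy in plain Ruby, not RedAmber code.

```ruby
# Records of df and other from the example above, as [KEY1, KEY2] pairs.
df_rows    = [["A", 1], ["B", 2], ["C", 3]]
other_rows = [["A", 1], ["B", 4], ["D", 5]]

intersect  = df_rows & other_rows # records present in both
union      = df_rows | other_rows # records present in either, duplicates removed
difference = df_rows - other_rows # records present in self only

p intersect
p union
p difference
```

Note that `["B", 2]` and `["B", 4]` are different records, so both survive in the union and `["B", 2]` remains in the difference.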
+## Binding
+### `concatenate(other)`
+  Concatenate another DataFrame or Table onto the bottom of self. The shape and data type of other must be the same as self.
+  The alias is `concat`.
+  An array of DataFrames or Tables is also acceptable as other.
+  ```ruby
+  df
+  #=>
+  #<RedAmber::DataFrame : 2 x 2 Vectors, 0x0000000000022cb8>
+          x y
+    <uint8> <string>
+  0       1 A
+  1       2 B
+  other
+  #=>
+  #<RedAmber::DataFrame : 2 x 2 Vectors, 0x000000000001f6d0>
+          x y
+    <uint8> <string>
+  0       3 C
+  1       4 D
+  df.concatenate(other)
+  #=>
+  #<RedAmber::DataFrame : 4 x 2 Vectors, 0x0000000000022574>
+          x y
+    <uint8> <string>
+  0       1 A
+  1       2 B
+  2       3 C
+  3       4 D
+  ```
+### `merge(other)`
+  Merge another DataFrame or Table to the right of self, side by side. In the example below, other has the same number of rows as self and none of its keys appear in self.
+  ```ruby
+  df
+  #=>
+  #<RedAmber::DataFrame : 2 x 2 Vectors, 0x0000000000009150>
+          x       y
+    <uint8> <uint8>
+  0       1       3
+  1       2       4
+  other
+  #=>
+  #<RedAmber::DataFrame : 2 x 2 Vectors, 0x0000000000008a0c>
+    a        b
+    <string> <string>
+  0 A        C
+  1 B        D
+  df.merge(other)
+  #=>
+  #<RedAmber::DataFrame : 2 x 4 Vectors, 0x000000000000cb70>
+          x       y a        b
+    <uint8> <uint8> <string> <string>
+  0       1       3 A        C
+  1       2       4 B        D
+  ```
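The two binding directions can be contrasted with a column-oriented model in plain Ruby, where a table is a Hash of key to Array of values. The helper names `concatenate_columns` and `merge_columns` are hypothetical, and the checks they perform (matching keys, unique keys, equal row counts) are assumptions made for this sketch rather than documented RedAmber behavior.

```ruby
# Hypothetical helper: stack `bottom` under `top` (row-wise binding).
def concatenate_columns(top, bottom)
  raise ArgumentError, "keys must match" unless top.keys == bottom.keys

  top.to_h { |key, values| [key, values + bottom[key]] }
end

# Hypothetical helper: place `right` beside `left` (column-wise binding).
def merge_columns(left, right)
  raise ArgumentError, "keys must be unique" unless (left.keys & right.keys).empty?

  sizes = (left.values + right.values).map(&:size).uniq
  raise ArgumentError, "row counts must match" if sizes.size > 1

  left.merge(right)
end

stacked = concatenate_columns({ x: [1, 2], y: %w[A B] }, { x: [3, 4], y: %w[C D] })
beside  = merge_columns({ x: [1, 2], y: [3, 4] }, { a: %w[A B], b: %w[C D] })

p stacked
p beside
```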
 ## Encoding