red_amber 0.5.1 → 0.5.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: b921353dbfcaf634a2e026f541caaf914125482c956fa05886f6a542f6ac2e35
4
- data.tar.gz: 737fae720227e8e3ef36c2c3142bdb7096b051fd51363a2978079e766081d320
3
+ metadata.gz: 7adbafa00c63857e5010442d39564d631ff4d7cac88bcc5dfdb520213f9c1606
4
+ data.tar.gz: 1dad23038eb977b2db44b2a09f648863646cdcb34d21d53e5ac5b582f9a00648
5
5
  SHA512:
6
- metadata.gz: 8f45b7c45725b2da1d9459edaa3ae59a2355e771055c281eb612356703b727315b64ed6963ce02493e6c8ef94194e848fc19b1cf4bbd0620b44ba2e24c1953b0
7
- data.tar.gz: ca9f134d488168f0d2396f55b118fa7db78d2f6969286896ba2139d891a8a09f8a7fde69141c2c226212748c16f93ece4737f65465f9ea0616e500177806c314
6
+ metadata.gz: 59a7154c1c18f9020628524bd66fc4c82913fadea2b413d47231c39980a62f1883397215a494cfedbb8b2e258cea4a7310ffda72d8c2030859c0bc2e132d8d92
7
+ data.tar.gz: 41b4ac8629511e16269d7642d4d129a1260b2de22b00335bbe2c2867a3d65c3ce7d9d4940cb474efa33d1839436e7a941f4c0c5e429f8f74eb1a309c68533a91
@@ -29,7 +29,7 @@ RUN set -e; \
29
29
  libzmq3-dev
30
30
 
31
31
  # Install Apache Arrow
32
- ARG APACHE_ARROW_VERSION=12.0.1-1
32
+ ARG APACHE_ARROW_VERSION=13.0.0-1
33
33
  ARG arrow_deb_tmp=/tmp/apache-arrow-apt-source-latest.deb
34
34
  ARG arrow_apt_source=https://apache.jfrog.io/artifactory/arrow/ubuntu/pool/jammy/main/a/apache-arrow-apt-source/apache-arrow-apt-source_${APACHE_ARROW_VERSION}_all.deb
35
35
  RUN set -e; \
@@ -1,6 +1,10 @@
1
1
  #!/usr/bin/env bash
2
2
  set -e
3
3
 
4
+ # Explicitly set ownership of /workspaces to vscode:vscode
5
+ # Because recent runner has uid=1001(runner), gid=999(docker)
6
+ sudo chown -R $(id -un):$(id -un) /workspaces
7
+
4
8
  # Install language and set timezone
5
9
  # You should change here if you use another
6
10
  sudo apt-get update
data/CHANGELOG.md CHANGED
@@ -1,3 +1,20 @@
1
+ ## [0.5.2] - 2023-09-01
2
+
3
+ Support Apache Arrow 13.0.0.
4
+ This version is also compatible with Arrow 12.0.0.
5
+
6
+ - Breaking change
7
+
8
+ - Bug fixes
9
+ - Fix bundle install issue by installing libyaml-devel (#280)
10
+ - Fix ownership in devcontainer ci (#280)
11
+
12
+ - New features and improvements
13
+ - Support Arrow 13.0.0 (#280)
14
+
15
+ - Documentation and Example
16
+ - Add dataframe_comparison_ja (#281)
17
+
1
18
  ## [0.5.1] - 2023-08-18
2
19
 
3
20
  Docker environment is replaced by Dev Container,
data/Gemfile CHANGED
@@ -7,7 +7,7 @@ gemspec
7
7
  group :test do
8
8
  gem 'rake'
9
9
 
10
- gem 'red-parquet', '~> 12.0.0'
10
+ gem 'red-parquet', '>= 12.0.0'
11
11
  gem 'rover-df', '~> 0.3.0'
12
12
 
13
13
  gem 'rubocop'
data/README.ja.md CHANGED
@@ -29,10 +29,10 @@ Rubyistのためのデータフレームライブラリ.
29
29
 
30
30
  ### ライブラリ
31
31
  ```ruby
32
- gem 'red-arrow', '~> 12.0.0' # お使いの環境に合わせた Apache Arrow が必要です
33
-                  # 下記のインストールを参照してください
32
+ gem 'red-arrow', '>= 12.0.0' # お使いの環境に合わせた Apache Arrow が必要です
33
+ # 下記のインストールを参照してください
34
34
  gem 'red-arrow-numo-narray' # 必要に応じて。Numo::NArray との連携またはランダムサンプリングが必要な場合。
35
- gem 'red-parquet', '~> 12.0.0' # 必要に応じて。Parquet の入出力が必要な場合。
35
+ gem 'red-parquet', '>= 12.0.0' # 必要に応じて。Parquet の入出力が必要な場合。
36
36
  gem 'red-datasets-arrow' # 必要に応じて。Red Datasets を利用する場合。
37
37
  gem 'red-arrow-activerecord' # 必要に応じて。Active Record とのデータ交換が必要な場合。
38
38
  gem 'rover-df', # 必要に応じて。Rover::DataFrame に対する入出力が必要な場合。
@@ -42,9 +42,9 @@ gem 'rover-df', # 必要に応じて。Rover::DataFrame に対す
42
42
 
43
43
  RedAmberをインストールする前に、下記のライブラリのインストールが必要です。
44
44
 
45
- - Apache Arrow (~> 12.0.0)
46
- - Apache Arrow GLib (~> 12.0.0)
47
- - Apache Parquet GLib (~> 12.0.0) # Parquetの入出力が必要な場合。
45
+ - Apache Arrow (>= 12.0.0)
46
+ - Apache Arrow GLib (>= 12.0.0)
47
+ - Apache Parquet GLib (>= 12.0.0) # Parquetの入出力が必要な場合。
48
48
 
49
49
  環境ごとの詳しいインストール方法は、 [Apache Arrow install document](https://arrow.apache.org/install/) を参照してください。
50
50
 
@@ -63,7 +63,7 @@ RedAmberをインストールする前に、下記のライブラリのインス
63
63
 
64
64
  ```
65
65
  sudo dnf update
66
- sudo dnf -y install gcc-c++ libarrow-devel libarrow-glib-devel ruby-devel
66
+ sudo dnf -y install gcc-c++ libarrow-devel libarrow-glib-devel ruby-devel libyaml-devel
67
67
  ```
68
68
 
69
69
  - macOS の場合は、Homebrewを使用する:
@@ -75,10 +75,10 @@ RedAmberをインストールする前に、下記のライブラリのインス
75
75
  Apache Arrowがインストールできたら、下記の行をGemfileに追加してください:
76
76
 
77
77
  ```ruby
78
- gem 'red-arrow', '~> 12.0.0'
78
+ gem 'red-arrow', '>= 12.0.0'
79
79
  gem 'red_amber'
80
80
  gem 'red-arrow-numo-narray' # 必要に応じて。Numo::NArray との連携またはランダムサンプリングが必要な場合。
81
- gem 'red-parquet', '~> 12.0.0' # 必要に応じて。Parquetの入出力が必要な場合。
81
+ gem 'red-parquet', '>= 12.0.0' # 必要に応じて。Parquetの入出力が必要な場合。
82
82
  gem 'red-datasets-arrow' # 必要に応じて。Red Datasets を利用する場合。
83
83
  gem 'red-arrow-activerecord' # 必要に応じて。Active Record とのデータ交換が必要な場合。
84
84
  gem 'rover-df', # 必要に応じて。Rover::DataFrameに対する入出力が必要な場合。
@@ -110,7 +110,7 @@ Jupyter Notebookの環境を含めた他の多くのデータ処理用のライ
110
110
  RedAmberの基本的な機能をPython
111
111
  [pandas](https://pandas.pydata.org/) や
112
112
  R [Tidyverse](https://www.tidyverse.org/) や
113
- Julia [Dataframes](https://dataframes.juliadata.org/stable/) と比較した表は [DataFrame_Comparison.md](doc/DataFrame_Comparison.md) にあります(Thanks to Benson Muite).
113
+ Julia [DataFrames](https://dataframes.juliadata.org/stable/) と比較した表は [DataFrame_Comparison_ja.md](doc/DataFrame_Comparison_ja.md) にあります(Thanks to Benson Muite).
114
114
 
115
115
  ## `RedAmber`のデータフレーム
116
116
 
data/README.md CHANGED
@@ -29,10 +29,10 @@ Supported Ruby version is >= 3.0.
29
29
 
30
30
  ### Required libraries
31
31
  ```ruby
32
- gem 'red-arrow', '~> 12.0.0' # Requires Apache Arrow (see installation below).
32
+ gem 'red-arrow', '>= 12.0.0' # Requires Apache Arrow (see installation below).
33
33
  gem 'red-arrow-numo-narray' # Optional, recommended if you use inputs from Numo::NArray,
34
34
  # or use random sampling feature.
35
- gem 'red-parquet', '~> 12.0.0' # Optional, if you use IO from/to parquet.
35
+ gem 'red-parquet', '>= 12.0.0' # Optional, if you use IO from/to parquet.
36
36
  gem 'red-datasets-arrow' # Optional, if you use Red Datasets.
37
37
  gem 'red-arrow-activerecord' # Optional, if you use Active Record.
38
38
  gem 'rover-df', # Optional, if you use IO from/to Rover::DataFrame.
@@ -42,9 +42,9 @@ gem 'rover-df', # Optional, if you use IO from/to Rover::DataFram
42
42
 
43
43
  Install requirements before you install RedAmber.
44
44
 
45
- - Apache Arrow (~> 12.0.0)
46
- - Apache Arrow GLib (~> 12.0.0)
47
- - Apache Parquet GLib (~> 12.0.0) # If you use IO from/to parquet
45
+ - Apache Arrow (>= 12.0.0)
46
+ - Apache Arrow GLib (>= 12.0.0)
47
+ - Apache Parquet GLib (>= 12.0.0) # If you use IO from/to parquet
48
48
 
49
49
  See [Apache Arrow install document](https://arrow.apache.org/install/).
50
50
 
@@ -63,7 +63,7 @@ See [Apache Arrow install document](https://arrow.apache.org/install/).
63
63
 
64
64
  ```
65
65
  sudo dnf update
66
- sudo dnf -y install gcc-c++ libarrow-devel libarrow-glib-devel ruby-devel
66
+ sudo dnf -y install gcc-c++ libarrow-devel libarrow-glib-devel ruby-devel libyaml-devel
67
67
  ```
68
68
 
69
69
  - On macOS, using Homebrew:
@@ -75,11 +75,11 @@ See [Apache Arrow install document](https://arrow.apache.org/install/).
75
75
  If you prepared Apache Arrow, add these lines to your Gemfile:
76
76
 
77
77
  ```ruby
78
- gem 'red-arrow', '~> 12.0.0'
78
+ gem 'red-arrow', '>= 12.0.0'
79
79
  gem 'red_amber'
80
80
  gem 'red-arrow-numo-narray' # Optional, recommended if you use inputs from Numo::NArray
81
81
  # or use random sampling feature.
82
- gem 'red-parquet', '~> 12.0.0' # Optional, if you use IO from/to parquet
82
+ gem 'red-parquet', '>= 12.0.0' # Optional, if you use IO from/to parquet
83
83
  gem 'red-datasets-arrow' # Optional, recommended if you use Red Datasets
84
84
  gem 'red-arrow-activerecord' # Optional, if you use Active Record
85
85
  gem 'rover-df', # Optional, if you use IO from/to Rover::DataFrame.
@@ -1,9 +1,8 @@
1
1
  # Comparison of DataFrames
2
2
 
3
- Compare basic features of RedAmber with Python
4
- [pandas](https://pandas.pydata.org/),
5
- R [Tidyverse](https://www.tidyverse.org/) and
6
- Julia [Dataframes](https://dataframes.juliadata.org/stable/).
3
+ Compare basic features of RedAmber with [Python pandas](https://pandas.pydata.org/),
4
+ [R Tidyverse](https://www.tidyverse.org/) and
5
+ [Julia DataFrames](https://dataframes.juliadata.org/stable/).
7
6
 
8
7
  ## Select columns (variables)
9
8
 
@@ -51,15 +50,12 @@ Julia [Dataframes](https://dataframes.juliadata.org/stable/).
51
50
  |--- |--- |--- |--- |--- |
52
51
  | Combine additional columns | merge, bind_cols | dplyr::bind_cols | concat | combine |
53
52
  | Combine additional rows | concatenate, concat, bind_rows | dplyr::bind_rows | concat | transform |
54
- | Join right to left, leaving only the matching rows| join, inner_join | dplyr::inner_join | merge | innerjoin |
55
- | Join right to left, leaving all rows | join, full_join, outer_join | dplyr::full_join | merge | outerjoin |
56
- | Join matching values to left from right | join, left_join | dplyr::left_join | merge | leftjoin |
57
- | Join matching values from left to right | join, right_join | dplyr::right_join | merge | rightjoin |
58
- | Return rows of left that have a match in right | join, semi_join | dplyr::semi_join | [isin] | semijoin |
59
- | Return rows of left that do not have a match in right | join, anti_join | dplyr::anti_join | [isin] | antijoin |
53
+ | Join right to left, leaving only the matching rows| inner_join, join | dplyr::inner_join | merge | innerjoin |
54
+ | Join right to left, leaving all rows | full_join, outer_join, join | dplyr::full_join | merge | outerjoin |
55
+ | Join matching values to left from right | left_join, join | dplyr::left_join | merge | leftjoin |
56
+ | Join matching values from left to right | right_join, join | dplyr::right_join | merge | rightjoin |
57
+ | Return rows of left that have a match in right | semi_join, join | dplyr::semi_join | [isin] | semijoin |
58
+ | Return rows of left that do not have a match in right | anti_join, join | dplyr::anti_join | [isin] | antijoin |
60
59
  | Collect rows that appear in left or right | union | dplyr::union | merge | |
61
60
  | Collect rows that appear in both left and right | intersect | dplyr::intersect | merge | |
62
61
  | Collect rows that appear in left but not right | difference, setdiff | dplyr::setdiff | merge | |
63
-
64
-
65
-
@@ -0,0 +1,61 @@
1
+ # DataFrames 操作メソッドの比較
2
+
3
+ RedAmberの基本的な操作メソッドを [Python pandas](https://pandas.pydata.org/),
4
+ [R Tidyverse](https://www.tidyverse.org/),
5
+ [Julia DataFrames](https://dataframes.juliadata.org/stable/) と比較します。
6
+
7
+ ## 列 (variables) を選択する
8
+
9
+ | 機能 | RedAmber | Tidyverse (R) | pandas | DataFrames.jl |
10
+ |--- |--- |--- |--- |--- |
11
+ | 列を選択して dataframe で返す | pick, drop, [] | dplyr::select, dplyr::select_if | [], loc[], iloc[], drop, select_dtypes | [], select |
12
+ | 列を選択して vector で返す | [], v | dplyr::pull, [, x] | [], loc[], iloc[] | [!, :x] |
13
+ | 列の順番を入れ替えた dataframeを返す | pick, [] | relocate | [], reindex, loc[], iloc[] | select,transform |
14
+
15
+ ## 行 (records, observations) を選択する
16
+
17
+ | 機能 | RedAmber | Tidyverse (R) | pandas | DataFrames.jl |
18
+ |--- |--- |--- |--- |--- |
19
+ | 論理値に従って行を選択して dataframe で返す | slice, filter, remove, [] | dplyr::filter | [], filter, query, loc[] | filter |
20
+ | インデックスで行を選択して dataframe で返す | slice, remove, [] | dplyr::slice | iloc[], drop | subset |
21
+ | 行の順番を入れ替えた dataframeを返す | slice, [] | dplyr::filter, dplyr::slice | reindex, loc[], iloc[] | permute |
22
+
23
+ ## 列を更新する / 新しい列を作る
24
+
25
+ |機能 | RedAmber | Tidyverse (R) | pandas | DataFrames.jl |
26
+ |--- |--- |--- |--- |--- |
27
+ | 既存の列の内容を変更する | assign | dplyr::mutate | assign, []= | mapcols |
28
+ | 新しい列を作成する | assign, assign_left | dplyr::mutate | apply | insertcols,.+ |
29
+ | 新しい列を作成し、残りは捨てる | new | transmute | (dfply:)transmute | transform,insertcols,mapcols |
30
+ | 列の名前を変更する | rename | dplyr::rename, dplyr::rename_with, purrr::set_names | rename, set_axis | rename |
31
+ | dataframe をソートする | sort | dplyr::arrange | sort_values | sort |
32
+
33
+ ## dataframe を変形する
34
+
35
+ | 機能 | RedAmber | Tidyverse (R) | pandas | DataFrames.jl |
36
+ |--- |--- |--- |--- |--- |
37
+ | 列を行に積む (long dataframe にする) | to_long | tidyr::pivot_longer | melt | stack |
38
+ | 行を列に集める (wide dataframe にする) | to_wide | tidyr::pivot_wider | pivot | unstack |
39
+ | wide dataframe を転置する | transpose | transpose, t | transpose, T | permutedims |
40
+
41
+ ## グループ化
42
+
43
+ | 機能 | RedAmber | Tidyverse | pandas | DataFrames.jl |
44
+ |--- |--- |--- |--- |--- |
45
+ |グループ化する | group, group.summarize | dplyr::group_by %>% dplyr::summarise | groupby.agg | combine,groupby |
46
+
47
+ ## dataframes または tables を結合する
48
+
49
+ | 機能 | RedAmber | Tidyverse | pandas | DataFrames.jl |
50
+ |--- |--- |--- |--- |--- |
51
+ | 列として連結する (横方向に連結する) | merge, bind_cols | dplyr::bind_cols | concat | combine |
52
+ | 行として連結する (縦方向に連結する) | concatenate, concat, bind_rows | dplyr::bind_rows | concat | transform |
53
+ | 一致した行だけを連結する (内部結合) | inner_join, join | dplyr::inner_join | merge | innerjoin |
54
+ | 全ての行を残して連結する (外部結合) | full_join, outer_join, join | dplyr::full_join | merge | outerjoin |
55
+ | 左の一致した値を残して連結する (左外部結合) | left_join, join | dplyr::left_join | merge | leftjoin |
56
+ | 右の一致した値を残して連結する (右外部結合) | right_join, join | dplyr::right_join | merge | rightjoin |
57
+ | 左の行のうち、右と一致したものを返す | semi_join, join | dplyr::semi_join | [isin] | semijoin |
58
+ | 左の行のうち、右と一致しなかったものを返す | anti_join, join | dplyr::anti_join | [isin] | antijoin |
59
+ | 左か右のどちらかに現れる行を返す | union | dplyr::union | merge | |
60
+ | 左と右の両方に現れる行を返す | intersect | dplyr::intersect | merge | |
61
+ | 左にはあるが右にはない行を返す | difference, setdiff | dplyr::setdiff | merge | |
@@ -13,7 +13,7 @@ format:
13
13
  colorlinks: true
14
14
  ---
15
15
 
16
- For RedAmber Version 0.5.1-HEAD and Arrow version 12.0.1 .
16
+ For RedAmber versions 0.5.1 and 0.5.2 with Arrow versions 12.0.1 and 13.0.0.
17
17
 
18
18
  ## 1. Install
19
19
 
@@ -21,9 +21,9 @@ Install requirements before you install RedAmber.
21
21
 
22
22
  - Ruby (>= 3.0)
23
23
 
24
- - Apache Arrow (~> 12.0.0)
25
- - Apache Arrow GLib (~> 12.0.0)
26
- - Apache Parquet GLib (~> 12.0.0) # if you need IO from/to Parquet resource.
24
+ - Apache Arrow (>= 12.0.0)
25
+ - Apache Arrow GLib (>= 12.0.0)
26
+ - Apache Parquet GLib (>= 12.0.0) # if you need IO from/to Parquet resource.
27
27
 
28
28
  See [Apache Arrow install document](https://arrow.apache.org/install/).
29
29
 
@@ -40,7 +40,7 @@ Install requirements before you install RedAmber.
40
40
  - On Fedora 38 (Rawhide):
41
41
  ```shell
42
42
  sudo dnf update
43
- sudo dnf -y install gcc-c++ libarrow-devel libarrow-glib-devel ruby-devel
43
+ sudo dnf -y install gcc-c++ libarrow-devel libarrow-glib-devel ruby-devel libyaml-devel
44
44
 
45
45
 
46
46
  - On macOS, you can install Apache Arrow C++ library using Homebrew:
@@ -56,11 +56,11 @@ Install requirements before you install RedAmber.
56
56
  If you prepared Apache Arrow, add these lines to your Gemfile:
57
57
 
58
58
  ```ruby
59
- gem 'red-arrow', '~> 12.0.0'
59
+ gem 'red-arrow', '>= 12.0.0'
60
60
  gem 'red_amber'
61
61
  gem 'red-arrow-numo-narray' # Optional, recommended if you use inputs from Numo::NArray
62
62
  # or use random sampling feature.
63
- gem 'red-parquet', '~> 12.0.0' # Optional, if you use IO from/to parquet
63
+ gem 'red-parquet', '>= 12.0.0' # Optional, if you use IO from/to parquet
64
64
  gem 'red-datasets-arrow' # Optional, recommended if you use Red Datasets
65
65
  gem 'red-arrow-activerecord' # Optional, if you use Active Record
66
66
  gem 'rover-df', # Optional, if you use IO from/to Rover::DataFrame.
@@ -2,5 +2,5 @@
2
2
 
3
3
  module RedAmber
4
4
  # Library version
5
- VERSION = '0.5.1'
5
+ VERSION = '0.5.2'
6
6
  end
data/red_amber.gemspec CHANGED
@@ -31,7 +31,7 @@ Gem::Specification.new do |spec|
31
31
  spec.executables = spec.files.grep(%r{\Aexe/}) { |f| File.basename(f) }
32
32
  spec.require_paths = ['lib']
33
33
 
34
- spec.add_dependency 'red-arrow', '~> 12.0.0'
34
+ spec.add_dependency 'red-arrow', '>= 12.0.0'
35
35
 
36
36
  # Development dependency has gone to the Gemfile (rubygems/bundler#7237)
37
37
 
metadata CHANGED
@@ -1,27 +1,27 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: red_amber
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.5.1
4
+ version: 0.5.2
5
5
  platform: ruby
6
6
  authors:
7
7
  - Hirokazu SUZUKI (heronshoes)
8
- autorequire:
8
+ autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2023-08-17 00:00:00.000000000 Z
11
+ date: 2023-08-31 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: red-arrow
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
- - - "~>"
17
+ - - ">="
18
18
  - !ruby/object:Gem::Version
19
19
  version: 12.0.0
20
20
  type: :runtime
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
- - - "~>"
24
+ - - ">="
25
25
  - !ruby/object:Gem::Version
26
26
  version: 12.0.0
27
27
  description: RedAmber is a data frame library inspired by Rover-df and powered by
@@ -59,6 +59,7 @@ files:
59
59
  - doc/CODE_OF_CONDUCT.md
60
60
  - doc/DataFrame.md
61
61
  - doc/DataFrame_Comparison.md
62
+ - doc/DataFrame_Comparison_ja.md
62
63
  - doc/Dev_Containers.ja.md
63
64
  - doc/Dev_Containers.md
64
65
  - doc/SubFrames.md
@@ -132,7 +133,7 @@ metadata:
132
133
  source_code_uri: https://github.com/red-data-tools/red_amber
133
134
  changelog_uri: https://github.com/red-data-tools/red_amber/blob/main/CHANGELOG.md
134
135
  rubygems_mfa_required: 'true'
135
- post_install_message:
136
+ post_install_message:
136
137
  rdoc_options: []
137
138
  require_paths:
138
139
  - lib
@@ -147,8 +148,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
147
148
  - !ruby/object:Gem::Version
148
149
  version: '0'
149
150
  requirements: []
150
- rubygems_version: 3.4.10
151
- signing_key:
151
+ rubygems_version: 3.4.12
152
+ signing_key:
152
153
  specification_version: 4
153
154
  summary: A data frame library for Rubyists
154
155
  test_files: []
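
Note on the dependency change: the main functional change in this release is the relaxation of the `red-arrow` and `red-parquet` constraints from `'~> 12.0.0'` to `'>= 12.0.0'`. The following is a minimal sketch of what that means in practice, using RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The candidate version list and the helper name `accepted_versions` are assumptions chosen for illustration; they are not part of red_amber.

```ruby
require 'rubygems'

# The constraint used up to 0.5.1: pessimistic, i.e. >= 12.0.0 and < 12.1.0.
OLD_REQUIREMENT = Gem::Requirement.new('~> 12.0.0')
# The constraint used from 0.5.2: a lower bound only, no upper bound.
NEW_REQUIREMENT = Gem::Requirement.new('>= 12.0.0')

# Illustrative candidates (assumption): 12.0.1 and 13.0.0 appear in this diff,
# 11.0.0 and 12.0.0 are added to show the lower bound.
CANDIDATES = %w[11.0.0 12.0.0 12.0.1 13.0.0].map { |v| Gem::Version.new(v) }

# Hypothetical helper: return the candidate versions a requirement accepts.
def accepted_versions(requirement, versions)
  versions.select { |v| requirement.satisfied_by?(v) }.map(&:to_s)
end

OLD_ACCEPTED = accepted_versions(OLD_REQUIREMENT, CANDIDATES)
NEW_ACCEPTED = accepted_versions(NEW_REQUIREMENT, CANDIDATES)

puts "~> 12.0.0 accepts: #{OLD_ACCEPTED.join(', ')}"
puts ">= 12.0.0 accepts: #{NEW_ACCEPTED.join(', ')}"
```

Under the old constraint, Arrow 13.0.0 is rejected, which is why the gemspec had to change to support it. The new constraint has no upper bound, so a future major Arrow release would also satisfy it; projects that need a tested combination may prefer to pin a narrower range in their own Gemfile.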