daru 0.1.1 → 0.1.2
- checksums.yaml +4 -4
- data/.travis.yml +1 -5
- data/CONTRIBUTING.md +2 -11
- data/History.md +18 -0
- data/README.md +109 -11
- data/daru.gemspec +11 -6
- data/images/README.md +5 -0
- data/images/con0.png +0 -0
- data/images/con1.png +0 -0
- data/images/init0.png +0 -0
- data/images/init1.png +0 -0
- data/images/man0.png +0 -0
- data/images/man1.png +0 -0
- data/images/man2.png +0 -0
- data/images/man3.png +0 -0
- data/images/man4.png +0 -0
- data/images/man5.png +0 -0
- data/images/man6.png +0 -0
- data/images/plot0.png +0 -0
- data/lib/daru.rb +5 -2
- data/lib/daru/core/group_by.rb +45 -45
- data/lib/daru/core/merge.rb +59 -1
- data/lib/daru/dataframe.rb +255 -226
- data/lib/daru/exceptions.rb +2 -0
- data/lib/daru/io/io.rb +41 -19
- data/lib/daru/io/sql_data_source.rb +116 -0
- data/lib/daru/vector.rb +124 -104
- data/lib/daru/version.rb +1 -1
- data/spec/core/group_by_spec.rb +12 -2
- data/spec/core/merge_spec.rb +14 -1
- data/spec/dataframe_spec.rb +189 -158
- data/spec/io/io_spec.rb +80 -2
- data/spec/io/sql_data_source_spec.rb +67 -0
- data/spec/spec_helper.rb +4 -2
- data/spec/support/database_helper.rb +30 -0
- data/spec/vector_spec.rb +45 -46
- metadata +104 -16
- data/.build.sh +0 -14
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: ed2a3e2a4cd9fce8d95af6aac9c3db532eed444f
|
4
|
+
data.tar.gz: 90ca6a62ee824d20f72a9f6689c03f27d7667168
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: e6f3345ef4372e1c45a3d80c0cc61c2b4c72e4c810cfb183f30bfd9285a09639ea39cd0a3597fc63551d7f72398d8d83af4424855018e8a5b2a99274b46625cd
|
7
|
+
data.tar.gz: 65d262b1deec54680a5fdcfecda3530c9fb9450dbd280c18833b655521418ed340f311253221dfd4e018577b063f9d3638d0d600a108c3b98ec5a7cd2dfe98ec
|
data/.travis.yml
CHANGED
@@ -1,13 +1,11 @@
|
|
1
1
|
language:
|
2
2
|
ruby
|
3
3
|
|
4
|
-
env:
|
5
|
-
- CPLUS_INCLUDE_PATH=/usr/include/atlas C_INCLUDE_PATH=/usr/include/atlas
|
6
|
-
|
7
4
|
rvm:
|
8
5
|
- '2.0'
|
9
6
|
- '2.1'
|
10
7
|
- '2.2'
|
8
|
+
- '2.3.0'
|
11
9
|
|
12
10
|
matrix:
|
13
11
|
fast_finish:
|
@@ -17,11 +15,9 @@ script: "bundle exec rspec"
|
|
17
15
|
|
18
16
|
install:
|
19
17
|
- gem install bundler
|
20
|
-
- ./.build.sh
|
21
18
|
- bundle install
|
22
19
|
|
23
20
|
before_install:
|
24
21
|
- sudo apt-get update -qq
|
25
|
-
- sudo apt-get install -qq libatlas-base-dev
|
26
22
|
- sudo apt-get install -y libgsl0-dev r-base r-base-dev
|
27
23
|
- sudo Rscript -e "install.packages(c('Rserve','irr'),,'http://cran.us.r-project.org')"
|
data/CONTRIBUTING.md
CHANGED
@@ -2,25 +2,16 @@
|
|
2
2
|
|
3
3
|
## Installing daru development dependencies
|
4
4
|
|
5
|
-
|
6
|
-
|
7
|
-
Keep in mind that either nmatrix or rb-gsl are NOT NECESSARY for using daru. They are just required for an optional speed up and for running the test suite.
|
5
|
+
Either nmatrix or rb-gsl are NOT NECESSARY for using daru. They are just required for an optional speed up and for running the test suite.
|
8
6
|
|
9
7
|
To install dependencies, execute the following commands:
|
10
8
|
|
11
|
-
`export CPLUS_INCLUDE_PATH=/usr/include/atlas`
|
12
|
-
`export C_INCLUDE_PATH=/usr/include/atlas`
|
13
9
|
`sudo apt-get update -qq`
|
14
|
-
`sudo apt-get install -qq libatlas-base-dev`
|
15
|
-
`sudo apt-get --purge remove liblapack-dev liblapack3 liblapack3gf`
|
16
10
|
`sudo apt-get install -y libgsl0-dev r-base r-base-dev`
|
17
11
|
`sudo Rscript -e "install.packages(c('Rserve','irr'),,'http://cran.us.r-project.org')"`
|
18
12
|
|
19
|
-
Then execute the [.build.sh script](https://github.com/v0dro/daru/blob/master/.build.sh) to clone and install the latest nmatrix system:
|
20
|
-
|
21
|
-
`./.build.sh`
|
22
13
|
|
23
|
-
Then
|
14
|
+
Then install remaining dependencies:
|
24
15
|
|
25
16
|
`bundle install`
|
26
17
|
|
data/History.md
CHANGED
@@ -1,3 +1,21 @@
|
|
1
|
+
# 0.1.2
|
2
|
+
|
3
|
+
* Enhancements
|
4
|
+
- New method `DataFrame.from_activerecord` for importing data sets from ActiveRecord. (by @mrkn)
|
5
|
+
- Better importing of data from SQL databases by extracting that functionality into a separate class called `Daru::IO::SqlDataSource` (by @mrkn).
|
6
|
+
- Faster algorithm for performing inner joins by using the bloomfilter-rb gem. Available only for MRI. (by Peter Tung)
|
7
|
+
- Added exception `SizeError` (by Peter Tung).
|
8
|
+
- Removed outdated dependencies and build scripts, updated existing dependencies.
|
9
|
+
- Ability to sort a Daru::Vector with nils present (by @gnilrets)
|
10
|
+
|
11
|
+
* Fixes
|
12
|
+
- Fix column creation for `DataFrame.from_sql` (by @dansbits).
|
13
|
+
- group_by can now be performed on DataFrames with nils (@gnilrets).
|
14
|
+
- Bug fix for DataFrame Vectors not duplicating when calling `DataFrame#dup` (by @gnilrets).
|
15
|
+
- Bug fix when concatenating DataFrames (by @gnilrets)
|
16
|
+
- Handling improper arguments to `Daru::Vector#[]` (by @lokeshh)
|
17
|
+
- Resolve narray conflict by using the latest nmatrix require methods (by @lokeshh)
|
18
|
+
|
1
19
|
# 0.1.1
|
2
20
|
|
3
21
|
* Enhancements
|
data/README.md
CHANGED
@@ -1,18 +1,13 @@
|
|
1
|
-
daru
|
2
|
-
====
|
3
|
-
|
4
|
-
Data Analysis in RUby
|
1
|
+
# daru - Data Analysis in RUby
|
5
2
|
|
6
3
|
[](http://badge.fury.io/rb/daru)
|
7
4
|
[](https://travis-ci.org/v0dro/daru)
|
8
5
|
|
9
6
|
## Introduction
|
10
7
|
|
11
|
-
daru (Data Analysis in RUby) is a library for storage, analysis, manipulation and visualization of data.
|
12
|
-
|
13
|
-
daru is inspired by pandas, a very mature solution in Python.
|
8
|
+
daru (Data Analysis in RUby) is a library for storage, analysis, manipulation and visualization of data in Ruby.
|
14
9
|
|
15
|
-
Written in pure Ruby
|
10
|
+
daru makes it easy and intuitive to process data predominantly through 2 data structures: `Daru::DataFrame` and `Daru::Vector`. Written in pure Ruby so should work with all ruby implementations. Tested with MRI 2.0, 2.1, 2.2 and 2.3.
|
16
11
|
|
17
12
|
## Features
|
18
13
|
|
@@ -28,7 +23,7 @@ Written in pure Ruby so should work with all ruby implementations. Tested with M
|
|
28
23
|
* Optional speed and space optimization on MRI with [NMatrix](https://github.com/SciRuby/nmatrix) and GSL.
|
29
24
|
* Easy splitting, aggregation and grouping of data.
|
30
25
|
* Quickly reducing data with pivot tables for quick data summary.
|
31
|
-
* Import and export data from and to Excel, CSV, SQL Databases and plain text files.
|
26
|
+
* Import and export data from and to Excel, CSV, SQL Databases, ActiveRecord and plain text files.
|
32
27
|
|
33
28
|
## Notebooks
|
34
29
|
|
@@ -64,6 +59,111 @@ Written in pure Ruby so should work with all ruby implementations. Tested with M
|
|
64
59
|
* [Analysis of Time Series in daru](http://v0dro.github.io/blog/2015/07/31/analysis-of-time-series-in-daru/)
|
65
60
|
* [Date Offsets in Daru](http://v0dro.github.io/blog/2015/07/27/date-offsets-in-daru/)
|
66
61
|
|
62
|
+
## Basic Usage
|
63
|
+
|
64
|
+
daru exposes two major data structures: `DataFrame` and `Vector`. The `Vector` is a basic 1-D structure corresponding to a labelled Array, while the `DataFrame` - daru's primary data structure - is a 2-D spreadsheet-like structure for manipulating and storing data sets.
|
65
|
+
|
66
|
+
Basic DataFrame initialization.
|
67
|
+
|
68
|
+
``` ruby
|
69
|
+
data_frame = Daru::DataFrame.new(
|
70
|
+
{
|
71
|
+
'Beer' => ['Kingfisher', 'Snow', 'Bud Light', 'Tiger Beer', 'Budweiser'],
|
72
|
+
'Gallons sold' => [500, 400, 450, 200, 250]
|
73
|
+
},
|
74
|
+
index: ['India', 'China', 'USA', 'Malaysia', 'Canada']
|
75
|
+
)
|
76
|
+
data_frame
|
77
|
+
```
|
78
|
+

|
79
|
+
|
80
|
+
|
81
|
+
Load data from CSV files.
|
82
|
+
``` ruby
|
83
|
+
df = Daru::DataFrame.from_csv('TradeoffData.csv')
|
84
|
+
```
|
85
|
+

|
86
|
+
|
87
|
+
*Basic Data Manipulation*
|
88
|
+
|
89
|
+
Selecting rows.
|
90
|
+
``` ruby
|
91
|
+
data_frame.row['USA']
|
92
|
+
```
|
93
|
+

|
94
|
+
|
95
|
+
Selecting columns.
|
96
|
+
``` ruby
|
97
|
+
data_frame['Beer']
|
98
|
+
```
|
99
|
+

|
100
|
+
|
101
|
+
A range of rows.
|
102
|
+
``` ruby
|
103
|
+
data_frame.row['India'..'USA']
|
104
|
+
```
|
105
|
+

|
106
|
+
|
107
|
+
The first 2 rows.
|
108
|
+
``` ruby
|
109
|
+
data_frame.first(2)
|
110
|
+
```
|
111
|
+

|
112
|
+
|
113
|
+
The last 2 rows.
|
114
|
+
``` ruby
|
115
|
+
data_frame.last(2)
|
116
|
+
```
|
117
|
+

|
118
|
+
|
119
|
+
Adding a new column.
|
120
|
+
``` ruby
|
121
|
+
data_frame['Gallons produced'] = [550, 500, 600, 210, 240]
|
122
|
+
```
|
123
|
+

|
124
|
+
|
125
|
+
Creating a new column based on data in other columns.
|
126
|
+
``` ruby
|
127
|
+
data_frame['Demand supply gap'] = data_frame['Gallons produced'] - data_frame['Gallons sold']
|
128
|
+
```
|
129
|
+

|
130
|
+
|
131
|
+
*Condition based selection*
|
132
|
+
|
133
|
+
Selecting countries based on the number of gallons sold in each. We use a syntax similar to that defined by [Arel](https://github.com/rails/arel), i.e. by using the `where` clause.
|
134
|
+
``` ruby
|
135
|
+
data_frame.where(data_frame['Gallons sold'].lt(300))
|
136
|
+
```
|
137
|
+

|
138
|
+
|
139
|
+
You can pass a combination of boolean operations into the `#where` method and it should work fine:
|
140
|
+
``` ruby
|
141
|
+
data_frame.where(
|
142
|
+
data_frame['Beer']
|
143
|
+
.in(['Snow', 'Kingfisher','Tiger Beer'])
|
144
|
+
.and(
|
145
|
+
data_frame['Gallons produced'].gt(520).or(data_frame['Gallons produced'].lt(250))
|
146
|
+
)
|
147
|
+
)
|
148
|
+
```
|
149
|
+

|
150
|
+
|
151
|
+
*Plotting*
|
152
|
+
|
153
|
+
Daru supports plotting of interactive graphs with [nyaplot](). You can easily create a plot with the `#plot` method. Here we plot the gallons sold on the Y axis and name of the brand on the X axis in a bar graph.
|
154
|
+
``` ruby
|
155
|
+
data_frame.plot type: :bar, x: 'Beer', y: 'Gallons sold' do |plot, diagram|
|
156
|
+
plot.x_label "Beer"
|
157
|
+
plot.y_label "Gallons Sold"
|
158
|
+
plot.yrange [0,600]
|
159
|
+
plot.width 500
|
160
|
+
plot.height 400
|
161
|
+
end
|
162
|
+
```
|
163
|
+

|
164
|
+
|
165
|
+
In addition to nyaplot, daru also supports plotting out of the box with [gnuplotrb](https://github.com/SciRuby/gnuplotrb).
|
166
|
+
|
67
167
|
## Documentation
|
68
168
|
|
69
169
|
Docs can be found [here](https://rubygems.org/gems/daru).
|
@@ -71,8 +171,6 @@ Docs can be found [here](https://rubygems.org/gems/daru).
|
|
71
171
|
## Roadmap
|
72
172
|
|
73
173
|
* Enable creation of DataFrame by only specifying an NMatrix/MDArray in initialize. Vector naming happens automatically (alphabetic) or is specified in an Array.
|
74
|
-
* Basic Data manipulation and analysis operations:
|
75
|
-
- DF concat
|
76
174
|
* Assignment of a column to a single number should set the entire column to that number.
|
77
175
|
* Multiple column assignment with []=
|
78
176
|
* Multiple value assignment for vectors with []=.
|
data/daru.gemspec
CHANGED
@@ -34,7 +34,7 @@ Thank you for installing daru!
|
|
34
34
|
oOOOOOo
|
35
35
|
,| oO
|
36
36
|
//| |
|
37
|
-
|
37
|
+
\\\\| |
|
38
38
|
`| |
|
39
39
|
`-----`
|
40
40
|
|
@@ -50,17 +50,22 @@ Cheers!
|
|
50
50
|
EOF
|
51
51
|
|
52
52
|
spec.add_runtime_dependency 'reportbuilder', '~> 1.4'
|
53
|
-
spec.add_runtime_dependency 'spreadsheet', '~> 1.
|
53
|
+
spec.add_runtime_dependency 'spreadsheet', '~> 1.1.1'
|
54
54
|
|
55
55
|
spec.add_development_dependency 'bundler', '~> 1.10'
|
56
|
-
spec.add_development_dependency 'rake'
|
56
|
+
spec.add_development_dependency 'rake', '~>10.5'
|
57
57
|
spec.add_development_dependency 'pry', '~> 0.10'
|
58
58
|
spec.add_development_dependency 'pry-byebug'
|
59
59
|
spec.add_development_dependency 'rserve-client', '~> 0.3'
|
60
|
-
spec.add_development_dependency 'rspec'
|
60
|
+
spec.add_development_dependency 'rspec', '~> 3.4'
|
61
61
|
spec.add_development_dependency 'awesome_print'
|
62
62
|
spec.add_development_dependency 'nyaplot', '~> 0.1.5'
|
63
|
-
spec.add_development_dependency 'nmatrix', '~> 0.1
|
63
|
+
spec.add_development_dependency 'nmatrix', '~> 0.2.1'
|
64
64
|
spec.add_development_dependency 'distribution', '~> 0.7'
|
65
65
|
spec.add_development_dependency 'rb-gsl', '~>1.16'
|
66
|
-
|
66
|
+
spec.add_development_dependency 'bloomfilter-rb', '~> 2.1'
|
67
|
+
spec.add_development_dependency 'dbd-sqlite3'
|
68
|
+
spec.add_development_dependency 'dbi'
|
69
|
+
spec.add_development_dependency 'activerecord', '~> 4.0'
|
70
|
+
spec.add_development_dependency 'sqlite3'
|
71
|
+
end
|
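Note on the tightened version constraints: the gemspec hunk above pins several dependencies with the pessimistic operator (`~>`). The sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` to show which versions such a constraint admits. The helper name `allowed?` and the candidate version numbers are illustrative assumptions, not part of the gem.

```ruby
require 'rubygems'

# Hypothetical helper: does `version` satisfy the gemspec-style `constraint`?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# '~> 1.1.1' (spreadsheet) admits patch releases of 1.1 only.
spreadsheet_patch = allowed?('~> 1.1.1', '1.1.9')
spreadsheet_minor = allowed?('~> 1.1.1', '1.2.0')

# '~> 3.4' (rspec) admits any 3.x from 3.4 upward, but not 4.0.
rspec_minor = allowed?('~> 3.4', '3.9.0')
rspec_major = allowed?('~> 3.4', '4.0.0')

# '~>10.5' (rake) is written without a space in the gemspec and parses the same way.
rake_minor = allowed?('~>10.5', '10.5.0')
rake_major = allowed?('~>10.5', '11.0')

puts [spreadsheet_patch, spreadsheet_minor, rspec_minor, rspec_major, rake_minor, rake_major].inspect
```

A three-segment constraint such as `~> 1.1.1` locks the minor version, while a two-segment one such as `~> 3.4` locks only the major version.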
data/images/README.md
ADDED
data/images/con0.png
ADDED
Binary file
|
data/images/con1.png
ADDED
Binary file
|
data/images/init0.png
ADDED
Binary file
|
data/images/init1.png
ADDED
Binary file
|
data/images/man0.png
ADDED
Binary file
|
data/images/man1.png
ADDED
Binary file
|
data/images/man2.png
ADDED
Binary file
|
data/images/man3.png
ADDED
Binary file
|
data/images/man4.png
ADDED
Binary file
|
data/images/man5.png
ADDED
Binary file
|
data/images/man6.png
ADDED
Binary file
|
data/images/plot0.png
ADDED
Binary file
|
data/lib/daru.rb
CHANGED
@@ -38,10 +38,12 @@ module Daru
|
|
38
38
|
attr_accessor :lazy_update
|
39
39
|
|
40
40
|
def create_has_library(library)
|
41
|
-
|
42
|
-
|
41
|
+
lib_underscore = library.to_s.gsub(/-/, '_')
|
42
|
+
define_singleton_method("has_#{lib_underscore}?") do
|
43
|
+
cv = "@@#{lib_underscore}"
|
43
44
|
unless class_variable_defined? cv
|
44
45
|
begin
|
46
|
+
library = 'nmatrix/nmatrix' if library == :nmatrix
|
45
47
|
require library.to_s
|
46
48
|
class_variable_set(cv, true)
|
47
49
|
rescue LoadError
|
@@ -56,6 +58,7 @@ module Daru
|
|
56
58
|
create_has_library :gsl
|
57
59
|
create_has_library :nmatrix
|
58
60
|
create_has_library :nyaplot
|
61
|
+
create_has_library :'bloomfilter-rb'
|
59
62
|
end
|
60
63
|
|
61
64
|
autoload :Spreadsheet, 'spreadsheet'
|
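Note on `create_has_library`: the hunk above converts hyphens in the library name to underscores before building the `has_...?` predicate, so that `:'bloomfilter-rb'` yields a valid method name. The following is a minimal standalone sketch of that pattern, not the gem's code. The module name `LibraryProbe` and the probed library names are hypothetical, and it caches the result in a module-level instance variable where the diff uses a class variable. The diff's `nmatrix` to `nmatrix/nmatrix` remapping is omitted.

```ruby
# Hypothetical stand-in for the Daru module; only the detection helper is sketched.
module LibraryProbe
  class << self
    # Defines a `has_<library>?` predicate that tries to require the library once
    # and caches whether that succeeded.
    def create_has_library(library)
      # Hyphens are not valid in method or variable names, so normalise them.
      lib_underscore = library.to_s.tr('-', '_')

      define_singleton_method("has_#{lib_underscore}?") do
        ivar = "@#{lib_underscore}" # assumption: instance variable instead of @@class variable
        unless instance_variable_defined?(ivar)
          begin
            require library.to_s
            instance_variable_set(ivar, true)
          rescue LoadError
            instance_variable_set(ivar, false)
          end
        end
        instance_variable_get(ivar)
      end
    end
  end

  create_has_library :set                # ships with Ruby's standard library
  create_has_library :'no-such-lib-xyz'  # hypothetical name that should not be installed
end

puts LibraryProbe.has_set?
puts LibraryProbe.has_no_such_lib_xyz?
```

Because the result is cached after the first call, repeated checks do not retry the `require`.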
data/lib/daru/core/group_by.rb
CHANGED
@@ -18,7 +18,7 @@ module Daru
|
|
18
18
|
@context = context
|
19
19
|
vectors = names.map { |vec| context[vec].to_a }
|
20
20
|
tuples = vectors[0].zip(*vectors[1..-1])
|
21
|
-
keys = tuples.uniq.sort
|
21
|
+
keys = tuples.uniq.sort { |a,b| a && b ? a.compact <=> b.compact : a ? 1 : -1 }
|
22
22
|
|
23
23
|
keys.each do |key|
|
24
24
|
@groups[key] = all_indices_for(tuples, key)
|
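Note on the new sort block: a plain `sort` on the group keys raises as soon as a `nil` has to be compared with a non-nil value, which is presumably why grouping on data with nils failed before. The sketch below runs the comparator from the hunk above on plain Ruby arrays. The sample tuples are made up for illustration and no daru classes are involved.

```ruby
# Hypothetical sample tuples; in the diff they come from zipping the grouped vectors.
tuples = [['foo', 'one'], ['bar', nil], ['bar', 'one'], ['foo', 'one'], [nil, 'two']]

# A plain sort fails once a nil must be compared with a String:
# Array#<=> returns nil for such a pair and Array#sort raises ArgumentError.
plain_sort_failed =
  begin
    [['bar', 'one'], ['bar', nil]].sort
    false
  rescue ArgumentError
    true
  end

# Comparator from the 0.1.2 change: nils are dropped from each tuple before comparing.
keys = tuples.uniq.sort { |a, b| a && b ? a.compact <=> b.compact : a ? 1 : -1 }

puts plain_sort_failed
puts keys.inspect
```

One consequence of compacting is that a tuple such as `[nil, 'two']` is ordered as if it were `['two']`, so nils affect only the comparison, not the stored key. Tuples that still mix incomparable types after compacting would continue to raise.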
@@ -28,7 +28,7 @@ module Daru
|
|
28
28
|
|
29
29
|
# Get a Daru::Vector of the size of each group.
|
30
30
|
def size
|
31
|
-
index =
|
31
|
+
index =
|
32
32
|
if multi_indexed_grouping?
|
33
33
|
Daru::MultiIndex.from_tuples @groups.keys
|
34
34
|
else
|
@@ -59,15 +59,15 @@ module Daru
|
|
59
59
|
# d: [11 ,22 ,33 ,44 ,55 ,66 ,77 ,88]
|
60
60
|
# })
|
61
61
|
# df.group_by([:a, :b]).head(1)
|
62
|
-
# # =>
|
62
|
+
# # =>
|
63
63
|
# # #<Daru::DataFrame:82745170 @name = d7003f75-5eb9-4967-9303-c08dd9160224 @size = 6>
|
64
|
-
# # a b c d
|
65
|
-
# # 1 bar one 2 22
|
66
|
-
# # 3 bar three 1 44
|
67
|
-
# # 5 bar two 6 66
|
68
|
-
# # 0 foo one 1 11
|
69
|
-
# # 7 foo three 8 88
|
70
|
-
# # 2 foo two 3 33
|
64
|
+
# # a b c d
|
65
|
+
# # 1 bar one 2 22
|
66
|
+
# # 3 bar three 1 44
|
67
|
+
# # 5 bar two 6 66
|
68
|
+
# # 0 foo one 1 11
|
69
|
+
# # 7 foo three 8 88
|
70
|
+
# # 2 foo two 3 33
|
71
71
|
def head quantity=5
|
72
72
|
select_groups_from :first, quantity
|
73
73
|
end
|
@@ -82,14 +82,14 @@ module Daru
|
|
82
82
|
# d: [11 ,22 ,33 ,44 ,55 ,66 ,77 ,88]
|
83
83
|
# })
|
84
84
|
# # df.group_by([:a, :b]).tail(1)
|
85
|
-
# # =>
|
85
|
+
# # =>
|
86
86
|
# # #<Daru::DataFrame:82378270 @name = 0623db46-5425-41bd-a843-99baac3d1d9a @size = 6>
|
87
|
-
# # a b c d
|
88
|
-
# # 1 bar one 2 22
|
89
|
-
# # 3 bar three 1 44
|
90
|
-
# # 5 bar two 6 66
|
91
|
-
# # 6 foo one 3 77
|
92
|
-
# # 7 foo three 8 88
|
87
|
+
# # a b c d
|
88
|
+
# # 1 bar one 2 22
|
89
|
+
# # 3 bar three 1 44
|
90
|
+
# # 5 bar two 6 66
|
91
|
+
# # 6 foo one 3 77
|
92
|
+
# # 7 foo three 8 88
|
93
93
|
# # 4 foo two 3 55
|
94
94
|
def tail quantity=5
|
95
95
|
select_groups_from :last, quantity
|
@@ -103,15 +103,15 @@ module Daru
|
|
103
103
|
# c: [1 ,2 ,3 ,1 ,3 ,6 ,3 ,8],
|
104
104
|
# d: [11 ,22 ,33 ,44 ,55 ,66 ,77 ,88]
|
105
105
|
# df.group_by([:a, :b]).mean
|
106
|
-
# # =>
|
106
|
+
# # =>
|
107
107
|
# # #<Daru::DataFrame:81097450 @name = 0c32983f-3e06-451f-a9c9-051cadfe7371 @size = 6>
|
108
|
-
# # c d
|
109
|
-
# # ["bar", "one"] 2 22
|
110
|
-
# # ["bar", "three"] 1 44
|
111
|
-
# # ["bar", "two"] 6 66
|
112
|
-
# # ["foo", "one"] 2.0 44.0
|
113
|
-
# # ["foo", "three"] 8 88
|
114
|
-
# # ["foo", "two"] 3.0 44.0
|
108
|
+
# # c d
|
109
|
+
# # ["bar", "one"] 2 22
|
110
|
+
# # ["bar", "three"] 1 44
|
111
|
+
# # ["bar", "two"] 6 66
|
112
|
+
# # ["foo", "one"] 2.0 44.0
|
113
|
+
# # ["foo", "three"] 8 88
|
114
|
+
# # ["foo", "two"] 3.0 44.0
|
115
115
|
def mean
|
116
116
|
apply_method :numeric, :mean
|
117
117
|
end
|
@@ -128,28 +128,28 @@ module Daru
|
|
128
128
|
|
129
129
|
# Count groups, excludes missing values.
|
130
130
|
# @example Using count
|
131
|
-
# df = Daru::DataFrame.new({
|
132
|
-
# a: %w{foo bar foo bar foo bar foo foo},
|
133
|
-
# b: %w{one one two three two two one three},
|
134
|
-
# c: [1 ,2 ,3 ,1 ,3 ,6 ,3 ,8],
|
135
|
-
# d: [11 ,22 ,33 ,44 ,55 ,66 ,77 ,88]
|
136
|
-
# })
|
131
|
+
# df = Daru::DataFrame.new({
|
132
|
+
# a: %w{foo bar foo bar foo bar foo foo},
|
133
|
+
# b: %w{one one two three two two one three},
|
134
|
+
# c: [1 ,2 ,3 ,1 ,3 ,6 ,3 ,8],
|
135
|
+
# d: [11 ,22 ,33 ,44 ,55 ,66 ,77 ,88]
|
136
|
+
# })
|
137
137
|
# df.group_by([:a, :b]).count
|
138
|
-
# # =>
|
138
|
+
# # =>
|
139
139
|
# # #<Daru::DataFrame:76900210 @name = 7b9cf55d-17f8-48c7-b03a-2586c6e5ec5a @size = 6>
|
140
|
-
# # c d
|
141
|
-
# # ["bar", "one"] 1 1
|
142
|
-
# # ["bar", "two"] 1 1
|
143
|
-
# # ["bar", "three"] 1 1
|
144
|
-
# # ["foo", "one"] 2 2
|
145
|
-
# # ["foo", "three"] 1 1
|
146
|
-
# # ["foo", "two"] 2 2
|
140
|
+
# # c d
|
141
|
+
# # ["bar", "one"] 1 1
|
142
|
+
# # ["bar", "two"] 1 1
|
143
|
+
# # ["bar", "three"] 1 1
|
144
|
+
# # ["foo", "one"] 2 2
|
145
|
+
# # ["foo", "three"] 1 1
|
146
|
+
# # ["foo", "two"] 2 2
|
147
147
|
def count
|
148
148
|
width = @non_group_vectors.size
|
149
149
|
Daru::DataFrame.new([size]*width, order: @non_group_vectors)
|
150
150
|
end
|
151
151
|
|
152
|
-
# Calculate sample standard deviation of numeric vector groups, excluding
|
152
|
+
# Calculate sample standard deviation of numeric vector groups, excluding
|
153
153
|
# missing values.
|
154
154
|
def std
|
155
155
|
apply_method :numeric, :std
|
@@ -177,9 +177,9 @@ module Daru
|
|
177
177
|
# d: [11 ,22 ,33 ,44 ,55 ,66 ,77 ,88]
|
178
178
|
# })
|
179
179
|
# df.group_by([:a, :b]).get_group ['bar','two']
|
180
|
-
# #=>
|
180
|
+
# #=>
|
181
181
|
# ##<Daru::DataFrame:83258980 @name = 687ee3f6-8874-4899-97fa-9b31d84fa1d5 @size = 1>
|
182
|
-
# # a b c d
|
182
|
+
# # a b c d
|
183
183
|
# # 5 bar two 6 66
|
184
184
|
def get_group group
|
185
185
|
indexes = @groups[group]
|
@@ -198,7 +198,7 @@ module Daru
|
|
198
198
|
rows, index: @context.index[indexes], order: @context.vectors)
|
199
199
|
end
|
200
200
|
|
201
|
-
private
|
201
|
+
private
|
202
202
|
|
203
203
|
def select_groups_from method, quantity
|
204
204
|
selection = @context
|
@@ -227,7 +227,7 @@ module Daru
|
|
227
227
|
slice = vec[*indexes]
|
228
228
|
single_row << (slice.is_a?(Numeric) ? slice : slice.send(method))
|
229
229
|
end
|
230
|
-
end
|
230
|
+
end
|
231
231
|
|
232
232
|
rows << single_row
|
233
233
|
end
|
@@ -260,4 +260,4 @@ module Daru
|
|
260
260
|
end
|
261
261
|
end
|
262
262
|
end
|
263
|
-
end
|
263
|
+
end
|