daru 0.2.0 → 0.2.1
- checksums.yaml +4 -4
- data/History.md +15 -0
- data/README.md +5 -3
- data/daru.gemspec +0 -23
- data/lib/daru.rb +0 -10
- data/lib/daru/core/group_by.rb +57 -46
- data/lib/daru/core/merge.rb +12 -3
- data/lib/daru/dataframe.rb +75 -67
- data/lib/daru/index/multi_index.rb +19 -5
- data/lib/daru/io/csv/converters.rb +3 -0
- data/lib/daru/io/io.rb +12 -5
- data/lib/daru/vector.rb +25 -0
- data/lib/daru/version.rb +1 -1
- data/spec/core/group_by_spec.rb +75 -21
- data/spec/dataframe_spec.rb +43 -3
- data/spec/fixtures/string_converter_test.csv +5 -0
- data/spec/index/multi_index_spec.rb +10 -2
- data/spec/io/io_spec.rb +10 -0
- data/spec/vector_spec.rb +16 -0
- metadata +6 -23
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz:
- data.tar.gz:
+ metadata.gz: 87e4e2869fe6411e3eece92bb5dc24d48f890774
+ data.tar.gz: e711d0db1d57f51f31ccb7fb54078a6bdbcc4ff5
  SHA512:
- metadata.gz:
- data.tar.gz:
+ metadata.gz: afdb295d0d01542ba9f439cf5f7959d7f2a3b9e47de6047ecf7719548ef760e657c0dfe753ed16ee1da65e071bb5a182aaf03ee83c9de6075d54149753b9c346
+ data.tar.gz: e0c4ace661d9f1cb7e8040d424bb004a0b650a9605037d1aff258258bbac40a3c158e5f5b8a2a5c6a28070cf55566a0729ee9b77c8114d40d4d18cf9d26e69c3
data/History.md
CHANGED
@@ -1,3 +1,18 @@
+ # 0.2.1 (02 July 2018)
+
+ * Minor Enhancements
+ - Allow pasing singular Symbol to CSV converters option (@takkanm)
+ - Support calling GroupBy#each_group w/o blocks (@hibariya)
+ - Refactor grouping and aggregation (@paisible-wanderer)
+ - Add String Converter to Daru::IO::CSV::CONVERTERS (@takkanm)
+ - Fix annoying missing libraries warning
+ - Remove post-install message (nice yet useless)
+
+ * Fixes
+ - Fix group_by for DataFrame with single row (@baarkerlounger)
+ - `#rolling_fillna!` bugfixes on `Daru::Vector` and `Daru::DataFrame` (@mhammiche)
+ - Fixes `#include?` on multiindex (@rohitner)
+
  # 0.2.0 (31 October 2017)
  * Major Enhancements
  - Add `DataFrame#which` query DSL (experimental! @rainchen)
data/README.md
CHANGED
@@ -3,12 +3,13 @@
|
|
3
3
|
[![Gem Version](https://badge.fury.io/rb/daru.svg)](http://badge.fury.io/rb/daru)
|
4
4
|
[![Build Status](https://travis-ci.org/SciRuby/daru.svg?branch=master)](https://travis-ci.org/SciRuby/daru)
|
5
5
|
[![Gitter](https://badges.gitter.im/v0dro/daru.svg)](https://gitter.im/v0dro/daru?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge)
|
6
|
+
[![Open Source Helpers](https://www.codetriage.com/sciruby/daru/badges/users.svg)](https://www.codetriage.com/sciruby/daru)
|
6
7
|
|
7
8
|
## Introduction
|
8
9
|
|
9
10
|
daru (Data Analysis in RUby) is a library for storage, analysis, manipulation and visualization of data in Ruby.
|
10
11
|
|
11
|
-
daru makes it easy and intuitive to process data predominantly through 2 data structures: `Daru::DataFrame` and `Daru::Vector`. Written in pure Ruby works with all ruby implementations. Tested with MRI 2.0, 2.1, 2.2 and 2.
|
12
|
+
daru makes it easy and intuitive to process data predominantly through 2 data structures: `Daru::DataFrame` and `Daru::Vector`. Written in pure Ruby works with all ruby implementations. Tested with MRI 2.0, 2.1, 2.2, 2.3, and 2.4.
|
12
13
|
|
13
14
|
## Features
|
14
15
|
|
@@ -73,6 +74,7 @@ $ gem install daru
|
|
73
74
|
* [Data Analysis in RUby: Basic data manipulation and plotting](http://v0dro.github.io/blog/2014/11/25/data-analysis-in-ruby-basic-data-manipulation-and-plotting/)
|
74
75
|
* [Data Analysis in RUby: Splitting, sorting, aggregating data and data types](http://v0dro.github.io/blog/2015/02/24/data-analysis-in-ruby-part-2/)
|
75
76
|
* [Finding and Combining data in daru](http://v0dro.github.io/blog/2015/08/03/finding-and-combining-data-in-daru/)
|
77
|
+
* [Introduction to analyzing datasets with daru library](http://gafur.me/2018/02/05/analysing-datasets-with-daru-library.html)
|
76
78
|
|
77
79
|
### Time series
|
78
80
|
|
@@ -192,13 +194,13 @@ In addition to nyaplot, daru also supports plotting out of the box with [gnuplot
|
|
192
194
|
|
193
195
|
## Documentation
|
194
196
|
|
195
|
-
Docs can be found [here](
|
197
|
+
Docs can be found [here](http://www.rubydoc.info/gems/daru).
|
196
198
|
|
197
199
|
## Contributing
|
198
200
|
|
199
201
|
Pick a feature from the Roadmap or the issue tracker or think of your own and send me a Pull Request!
|
200
202
|
|
201
|
-
For details see [CONTRIBUTING](https://github.com/
|
203
|
+
For details see [CONTRIBUTING](https://github.com/SciRuby/daru/blob/master/CONTRIBUTING.md).
|
202
204
|
|
203
205
|
## Acknowledgements
|
204
206
|
|
data/daru.gemspec
CHANGED
@@ -27,29 +27,6 @@ Gem::Specification.new do |spec|
|
|
27
27
|
spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
|
28
28
|
spec.require_paths = ["lib"]
|
29
29
|
|
30
|
-
spec.post_install_message = <<-EOF
|
31
|
-
*************************************************************************
|
32
|
-
Thank you for installing daru!
|
33
|
-
|
34
|
-
oOOOOOo
|
35
|
-
,| oO
|
36
|
-
//| |
|
37
|
-
\\\\| |
|
38
|
-
`| |
|
39
|
-
`-----`
|
40
|
-
|
41
|
-
|
42
|
-
Hope you love daru! For enhanced interactivity and better visualizations,
|
43
|
-
consider using gnuplotrb and nyaplot with iruby. For statistics use the
|
44
|
-
statsample family.
|
45
|
-
|
46
|
-
Read the README for interesting use cases and examples.
|
47
|
-
|
48
|
-
Cheers!
|
49
|
-
*************************************************************************
|
50
|
-
EOF
|
51
|
-
|
52
|
-
|
53
30
|
spec.add_runtime_dependency 'backports'
|
54
31
|
|
55
32
|
# it is required by NMatrix, yet we want to specify clearly which minimal version is OK
|
data/lib/daru.rb
CHANGED
@@ -86,16 +86,6 @@ module Daru
  create_has_library :gruff
  end

- {'spreadsheet' => '~>1.1.1', 'mechanize' => '~>2.7.5'}.each do |name, version|
-   begin
-     gem name, version
-     require name
-   rescue LoadError
-     Daru.error "\nInstall the #{name} gem version #{version} for using"\
-     " #{name} functions."
-   end
- end
-
  autoload :CSV, 'csv'
  require 'matrix'
  require 'forwardable'
data/lib/daru/core/group_by.rb
CHANGED
@@ -1,11 +1,64 @@
|
|
1
1
|
module Daru
|
2
2
|
module Core
|
3
3
|
class GroupBy
|
4
|
+
class << self
|
5
|
+
def get_positions_group_map_on(indexes_with_positions, sort: false)
|
6
|
+
group_map = {}
|
7
|
+
|
8
|
+
indexes_with_positions.each do |idx, position|
|
9
|
+
(group_map[idx] ||= []) << position
|
10
|
+
end
|
11
|
+
|
12
|
+
if sort # TODO: maybe add a more "stable" sorting option?
|
13
|
+
sorted_keys = group_map.keys.sort(&Daru::Core::GroupBy::TUPLE_SORTER)
|
14
|
+
group_map = sorted_keys.map { |k| [k, group_map[k]] }.to_h
|
15
|
+
end
|
16
|
+
|
17
|
+
group_map
|
18
|
+
end
|
19
|
+
|
20
|
+
def get_positions_group_for_aggregation(multi_index, level=-1)
|
21
|
+
raise unless multi_index.is_a?(Daru::MultiIndex)
|
22
|
+
|
23
|
+
new_index = multi_index.dup
|
24
|
+
new_index.remove_layer(level) # TODO: recheck code of Daru::MultiIndex#remove_layer
|
25
|
+
|
26
|
+
get_positions_group_map_on(new_index.each_with_index)
|
27
|
+
end
|
28
|
+
|
29
|
+
def get_positions_group_map_for_df(df, group_by_keys, sort: true)
|
30
|
+
indexes_with_positions = df[*group_by_keys].to_df.each_row.map(&:to_a).each_with_index
|
31
|
+
|
32
|
+
get_positions_group_map_on(indexes_with_positions, sort: sort)
|
33
|
+
end
|
34
|
+
|
35
|
+
def group_map_from_positions_to_indexes(positions_group_map, index)
|
36
|
+
positions_group_map.map { |k, positions| [k, positions.map { |pos| index.at(pos) }] }.to_h
|
37
|
+
end
|
38
|
+
|
39
|
+
def df_from_group_map(df, group_map, remaining_vectors, from_position: true)
|
40
|
+
return nil if group_map == {}
|
41
|
+
|
42
|
+
new_index = group_map.flat_map { |group, values| values.map { |val| group + [val] } }
|
43
|
+
new_index = Daru::MultiIndex.from_tuples(new_index)
|
44
|
+
|
45
|
+
return Daru::DataFrame.new({}, index: new_index) if remaining_vectors == []
|
46
|
+
|
47
|
+
new_rows_order = group_map.values.flatten
|
48
|
+
new_df = df[*remaining_vectors].to_df.get_sub_dataframe(new_rows_order, by_position: from_position)
|
49
|
+
new_df.index = new_index
|
50
|
+
|
51
|
+
new_df
|
52
|
+
end
|
53
|
+
end
|
54
|
+
|
4
55
|
attr_reader :groups, :df
|
5
56
|
|
6
57
|
# Iterate over each group created by group_by. A DataFrame is yielded in
|
7
58
|
# block.
|
8
59
|
def each_group
|
60
|
+
return to_enum(:each_group) unless block_given?
|
61
|
+
|
9
62
|
groups.keys.each do |k|
|
10
63
|
yield get_group(k)
|
11
64
|
end
|
@@ -22,11 +75,8 @@ module Daru
|
|
22
75
|
end
|
23
76
|
|
24
77
|
def initialize context, names
|
25
|
-
@groups = {}
|
26
78
|
@non_group_vectors = context.vectors.to_a - names
|
27
79
|
@context = context
|
28
|
-
vectors = names.map { |vec| context[vec].to_a }
|
29
|
-
tuples = vectors[0].zip(*vectors[1..-1])
|
30
80
|
# FIXME: It feels like we don't want to sort here. Ruby's #group_by
|
31
81
|
# never sorts:
|
32
82
|
#
|
@@ -34,7 +84,10 @@ module Daru
|
|
34
84
|
# # => {4=>["test"], 2=>["me"], 6=>["please"]}
|
35
85
|
#
|
36
86
|
# - zverok, 2016-09-12
|
37
|
-
|
87
|
+
positions_groups = GroupBy.get_positions_group_map_for_df(@context, names, sort: true)
|
88
|
+
|
89
|
+
@groups = GroupBy.group_map_from_positions_to_indexes(positions_groups, @context.index)
|
90
|
+
@df = GroupBy.df_from_group_map(@context, positions_groups, @non_group_vectors)
|
38
91
|
end
|
39
92
|
|
40
93
|
# Get a Daru::Vector of the size of each group.
|
@@ -282,26 +335,11 @@ module Daru
|
|
282
335
|
# Ram Hyderabad,Mumbai
|
283
336
|
#
|
284
337
|
def aggregate(options={})
|
285
|
-
@df.index = @df.index.remove_layer(@df.index.levels.size - 1)
|
286
338
|
@df.aggregate(options)
|
287
339
|
end
|
288
340
|
|
289
341
|
private
|
290
342
|
|
291
|
-
def init_groups_df tuples, names
|
292
|
-
multi_index_tuples = []
|
293
|
-
keys = tuples.uniq.sort(&TUPLE_SORTER)
|
294
|
-
keys.each do |key|
|
295
|
-
indices = all_indices_for(tuples, key)
|
296
|
-
@groups[key] = indices
|
297
|
-
indices.each do |indice|
|
298
|
-
multi_index_tuples << key + [indice]
|
299
|
-
end
|
300
|
-
end
|
301
|
-
@groups.freeze
|
302
|
-
@df = resultant_context(multi_index_tuples, names) unless multi_index_tuples.empty?
|
303
|
-
end
|
304
|
-
|
305
343
|
def select_groups_from method, quantity
|
306
344
|
selection = @context
|
307
345
|
rows, indexes = [], []
|
@@ -342,33 +380,6 @@ module Daru
|
|
342
380
|
end
|
343
381
|
end
|
344
382
|
|
345
|
-
def resultant_context(multi_index_tuples, names)
|
346
|
-
multi_index = Daru::MultiIndex.from_tuples(multi_index_tuples)
|
347
|
-
context_tmp = @context.dup.delete_vectors(*names)
|
348
|
-
rows_tuples = context_tmp.access_row_tuples_by_indexs(
|
349
|
-
*@groups.values.flatten!
|
350
|
-
)
|
351
|
-
context_new = Daru::DataFrame.rows(rows_tuples, index: multi_index)
|
352
|
-
context_new.vectors = context_tmp.vectors
|
353
|
-
context_new
|
354
|
-
end
|
355
|
-
|
356
|
-
def all_indices_for arry, element
|
357
|
-
found, index, indexes = -1, -1, []
|
358
|
-
while found
|
359
|
-
found = arry[index+1..-1].index(element)
|
360
|
-
if found
|
361
|
-
index = index + found + 1
|
362
|
-
indexes << index
|
363
|
-
end
|
364
|
-
end
|
365
|
-
if indexes.count == 1
|
366
|
-
[@context.index.at(*indexes)]
|
367
|
-
else
|
368
|
-
@context.index.at(*indexes).to_a
|
369
|
-
end
|
370
|
-
end
|
371
|
-
|
372
383
|
def multi_indexed_grouping?
|
373
384
|
return false unless @groups.keys[0]
|
374
385
|
@groups.keys[0].size > 1
|
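The refactored grouping in this file first builds a map from each group key to the row positions that hold it, and only then sorts the keys. Below is a minimal standalone sketch of that idea in plain Ruby, without daru. `positions_group_map` is a hypothetical name, and the plain `sort` stands in for `TUPLE_SORTER`; it assumes keys that are mutually comparable and contain no nil.

```ruby
# Hypothetical helper mirroring the idea behind get_positions_group_map_on:
# map each group key (an array of values) to the positions where it occurs.
def positions_group_map(keys_with_positions, sort: false)
  group_map = {}

  keys_with_positions.each do |key, position|
    (group_map[key] ||= []) << position
  end

  # Assumption of this sketch: keys are comparable with <=> and contain no nil.
  group_map = group_map.sort.to_h if sort

  group_map
end

rows = [%w[foo one], %w[bar two], %w[foo one], %w[bar one]]
groups = positions_group_map(rows.each_with_index, sort: true)
puts groups.inspect
```

Working with positions rather than index labels means the grouping pass does not need to consult the DataFrame index; labels can be looked up afterwards, as `group_map_from_positions_to_indexes` does in the diff.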
data/lib/daru/core/merge.rb
CHANGED
@@ -17,17 +17,17 @@ module Daru
|
|
17
17
|
end
|
18
18
|
end
|
19
19
|
|
20
|
-
def initialize left_df, right_df, opts={}
|
20
|
+
def initialize left_df, right_df, opts={} # rubocop:disable Metrics/AbcSize -- quick-fix for issue #171
|
21
21
|
init_opts(opts)
|
22
22
|
validate_on!(left_df, right_df)
|
23
23
|
key_sanitizer = ->(h) { sanitize_merge_keys(h.values_at(*on)) }
|
24
24
|
|
25
25
|
@left = df_to_a(left_df)
|
26
|
-
@left.
|
26
|
+
@left.sort! { |a, b| safe_compare(a.values_at(*on), b.values_at(*on)) }
|
27
27
|
@left_key_values = @left.map(&key_sanitizer)
|
28
28
|
|
29
29
|
@right = df_to_a(right_df)
|
30
|
-
@right.
|
30
|
+
@right.sort! { |a, b| safe_compare(a.values_at(*on), b.values_at(*on)) }
|
31
31
|
@right_key_values = @right.map(&key_sanitizer)
|
32
32
|
|
33
33
|
@left_keys, @right_keys = merge_keys(left_df, right_df, on)
|
@@ -246,6 +246,15 @@ module Daru
|
|
246
246
|
raise ArgumentError, "Both dataframes expected to have #{on.inspect} field"
|
247
247
|
end
|
248
248
|
end
|
249
|
+
|
250
|
+
def safe_compare(left_array, right_array)
|
251
|
+
left_array.zip(right_array).map { |l, r|
|
252
|
+
next 0 if l.nil? && r.nil?
|
253
|
+
next 1 if r.nil?
|
254
|
+
next -1 if l.nil?
|
255
|
+
l <=> r
|
256
|
+
}.reject(&:zero?).first || 0
|
257
|
+
end
|
249
258
|
end
|
250
259
|
|
251
260
|
module Merge
|
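The `safe_compare` helper added above lets the merge sort rows whose key columns contain nil, which a plain `<=>` on arrays cannot order. The sketch below copies that comparison into a standalone script to show its effect; the sample rows and the `:id` key are made up for illustration, and it assumes the non-nil values are comparable with each other.

```ruby
# Standalone copy of the comparison shown in the diff: compare two key
# arrays element by element, ordering nil before any non-nil value.
def safe_compare(left_array, right_array)
  left_array.zip(right_array).map { |l, r|
    next 0 if l.nil? && r.nil?
    next 1 if r.nil?
    next -1 if l.nil?
    l <=> r
  }.reject(&:zero?).first || 0
end

# Illustrative data only: hashes standing in for DataFrame rows.
rows = [{ id: 2, name: 'b' }, { id: nil, name: 'x' }, { id: 1, name: 'a' }]
sorted = rows.sort { |a, b| safe_compare(a.values_at(:id), b.values_at(:id)) }
puts sorted.map { |row| row[:id].inspect }.join(', ')
```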
data/lib/daru/dataframe.rb
CHANGED
@@ -549,6 +549,20 @@ module Daru
|
|
549
549
|
Daru::Accessors::DataFrameByRow.new(self)
|
550
550
|
end
|
551
551
|
|
552
|
+
# Extract a dataframe given row indexes or positions
|
553
|
+
# @param keys [Array] can be positions (if by_position is true) or indexes (if by_position if false)
|
554
|
+
# @return [Daru::Dataframe]
|
555
|
+
def get_sub_dataframe(keys, by_position: true)
|
556
|
+
return Daru::DataFrame.new({}) if keys == []
|
557
|
+
|
558
|
+
keys = @index.pos(*keys) unless by_position
|
559
|
+
|
560
|
+
sub_df = row_at(*keys)
|
561
|
+
sub_df = sub_df.to_df.transpose if sub_df.is_a?(Daru::Vector)
|
562
|
+
|
563
|
+
sub_df
|
564
|
+
end
|
565
|
+
|
552
566
|
# Duplicate the DataFrame entirely.
|
553
567
|
#
|
554
568
|
# == Arguments
|
@@ -698,6 +712,7 @@ module Daru
|
|
698
712
|
#
|
699
713
|
def rolling_fillna!(direction=:forward)
|
700
714
|
@data.each { |vec| vec.rolling_fillna!(direction) }
|
715
|
+
self
|
701
716
|
end
|
702
717
|
|
703
718
|
def rolling_fillna(direction=:forward)
|
@@ -990,6 +1005,17 @@ module Daru
|
|
990
1005
|
self
|
991
1006
|
end
|
992
1007
|
|
1008
|
+
def apply_method(method, keys: nil, by_position: true)
|
1009
|
+
df = keys ? get_sub_dataframe(keys, by_position: by_position) : self
|
1010
|
+
|
1011
|
+
case method
|
1012
|
+
when Symbol then df.send(method)
|
1013
|
+
when Proc then method.call(df)
|
1014
|
+
else raise
|
1015
|
+
end
|
1016
|
+
end
|
1017
|
+
alias :apply_method_on_sub_df :apply_method
|
1018
|
+
|
993
1019
|
# Retrieves a Daru::Vector, based on the result of calculation
|
994
1020
|
# performed on each row.
|
995
1021
|
def collect_rows &block
|
@@ -1450,11 +1476,10 @@ module Daru
|
|
1450
1476
|
# # ["foo", "two", 3]=>[2, 4]}
|
1451
1477
|
def group_by *vectors
|
1452
1478
|
vectors.flatten!
|
1453
|
-
|
1454
|
-
|
1455
|
-
|
1456
|
-
|
1457
|
-
}
|
1479
|
+
missing = vectors - @vectors.to_a
|
1480
|
+
unless missing.empty?
|
1481
|
+
raise(ArgumentError, "Vector(s) missing: #{missing.join(', ')}")
|
1482
|
+
end
|
1458
1483
|
|
1459
1484
|
vectors = [@vectors.first] if vectors.empty?
|
1460
1485
|
|
@@ -2249,22 +2274,6 @@ module Daru
|
|
2249
2274
|
end
|
2250
2275
|
end
|
2251
2276
|
|
2252
|
-
# returns array of row tuples at given index(s)
|
2253
|
-
def access_row_tuples_by_indexs *indexes
|
2254
|
-
positions = @index.pos(*indexes)
|
2255
|
-
|
2256
|
-
return populate_row_for(positions) if positions.is_a? Numeric
|
2257
|
-
|
2258
|
-
res = []
|
2259
|
-
new_rows = @data.map { |vec| vec[*indexes] }
|
2260
|
-
indexes.each do |index|
|
2261
|
-
tuples = []
|
2262
|
-
new_rows.map { |row| tuples += [row[index]] }
|
2263
|
-
res << tuples
|
2264
|
-
end
|
2265
|
-
res
|
2266
|
-
end
|
2267
|
-
|
2268
2277
|
# Function to use for aggregating the data.
|
2269
2278
|
#
|
2270
2279
|
# @param options [Hash] options for column, you want in resultant dataframe
|
@@ -2282,7 +2291,7 @@ module Daru
|
|
2282
2291
|
# 3 d 17
|
2283
2292
|
# 4 e 1
|
2284
2293
|
#
|
2285
|
-
# df.aggregate(num_100_times: ->(df) { df.num*100 })
|
2294
|
+
# df.aggregate(num_100_times: ->(df) { (df.num*100).first })
|
2286
2295
|
# => #<Daru::DataFrame(5x1)>
|
2287
2296
|
# num_100_ti
|
2288
2297
|
# 0 5200
|
@@ -2312,41 +2321,26 @@ module Daru
|
|
2312
2321
|
#
|
2313
2322
|
# Note: `GroupBy` class `aggregate` method uses this `aggregate` method
|
2314
2323
|
# internally.
|
2315
|
-
def aggregate(options={})
|
2316
|
-
|
2317
|
-
Daru::DataFrame.new(
|
2318
|
-
colmn_value, index: index_tuples, order: options.keys
|
2319
|
-
)
|
2320
|
-
end
|
2324
|
+
def aggregate(options={}, multi_index_level=-1)
|
2325
|
+
positions_tuples, new_index = group_index_for_aggregation(@index, multi_index_level)
|
2321
2326
|
|
2322
|
-
|
2327
|
+
colmn_value = aggregate_by_positions_tuples(options, positions_tuples)
|
2323
2328
|
|
2324
|
-
|
2325
|
-
# lambda), on the column.
|
2326
|
-
def apply_method_on_colmns colmn, index_tuples, method
|
2327
|
-
rows = []
|
2328
|
-
index_tuples.each do |indexes|
|
2329
|
-
# If single element then also make it vector.
|
2330
|
-
slice = Daru::Vector.new(Array(self[colmn][*indexes]))
|
2331
|
-
case method
|
2332
|
-
when Symbol
|
2333
|
-
rows << (slice.is_a?(Daru::Vector) ? slice.send(method) : slice)
|
2334
|
-
when Proc
|
2335
|
-
rows << method.call(slice)
|
2336
|
-
end
|
2337
|
-
end
|
2338
|
-
rows
|
2329
|
+
Daru::DataFrame.new(colmn_value, index: new_index, order: options.keys)
|
2339
2330
|
end
|
2340
2331
|
|
2341
|
-
|
2342
|
-
|
2343
|
-
|
2344
|
-
|
2345
|
-
|
2346
|
-
|
2347
|
-
|
2332
|
+
# Is faster than using group_by followed by aggregate (because it doesn't generate an intermediary dataframe)
|
2333
|
+
def group_by_and_aggregate(*group_by_keys, **aggregation_map)
|
2334
|
+
positions_groups = Daru::Core::GroupBy.get_positions_group_map_for_df(self, group_by_keys.flatten, sort: true)
|
2335
|
+
|
2336
|
+
new_index = Daru::MultiIndex.from_tuples(positions_groups.keys).coerce_index
|
2337
|
+
colmn_value = aggregate_by_positions_tuples(aggregation_map, positions_groups.values)
|
2338
|
+
|
2339
|
+
Daru::DataFrame.new(colmn_value, index: new_index, order: aggregation_map.keys)
|
2348
2340
|
end
|
2349
2341
|
|
2342
|
+
private
|
2343
|
+
|
2350
2344
|
def headers
|
2351
2345
|
Daru::Index.new(Array(index.name) + @vectors.to_a)
|
2352
2346
|
end
|
@@ -2910,27 +2904,41 @@ module Daru
|
|
2910
2904
|
end
|
2911
2905
|
|
2912
2906
|
def update_data source, vectors
|
2913
|
-
@data = @vectors.each_with_index.map do |_vec,idx|
|
2907
|
+
@data = @vectors.each_with_index.map do |_vec, idx|
|
2914
2908
|
Daru::Vector.new(source[idx], index: @index, name: vectors[idx])
|
2915
2909
|
end
|
2916
2910
|
end
|
2917
2911
|
|
2918
|
-
def
|
2919
|
-
|
2920
|
-
|
2921
|
-
|
2922
|
-
|
2923
|
-
|
2924
|
-
|
2925
|
-
|
2926
|
-
|
2927
|
-
|
2928
|
-
|
2929
|
-
|
2930
|
-
|
2931
|
-
end
|
2912
|
+
def aggregate_by_positions_tuples(options, positions_tuples)
|
2913
|
+
options.map do |vect, method|
|
2914
|
+
if @vectors.include?(vect)
|
2915
|
+
vect = self[vect]
|
2916
|
+
|
2917
|
+
positions_tuples.map do |positions|
|
2918
|
+
vect.apply_method_on_sub_vector(method, keys: positions)
|
2919
|
+
end
|
2920
|
+
else
|
2921
|
+
positions_tuples.map do |positions|
|
2922
|
+
apply_method_on_sub_df(method, keys: positions)
|
2923
|
+
end
|
2924
|
+
end
|
2932
2925
|
end
|
2933
|
-
|
2926
|
+
end
|
2927
|
+
|
2928
|
+
def group_index_for_aggregation(index, multi_index_level=-1)
|
2929
|
+
case index
|
2930
|
+
when Daru::MultiIndex
|
2931
|
+
groups = Daru::Core::GroupBy.get_positions_group_for_aggregation(index, multi_index_level)
|
2932
|
+
new_index, pos_tuples = groups.keys, groups.values
|
2933
|
+
|
2934
|
+
new_index = Daru::MultiIndex.from_tuples(new_index).coerce_index
|
2935
|
+
when Daru::Index, Daru::CategoricalIndex
|
2936
|
+
new_index = Array(index).uniq
|
2937
|
+
pos_tuples = new_index.map { |idx| [*index.pos(idx)] }
|
2938
|
+
else raise
|
2939
|
+
end
|
2940
|
+
|
2941
|
+
[pos_tuples, new_index]
|
2934
2942
|
end
|
2935
2943
|
|
2936
2944
|
# coerce ranges, integers and array in appropriate ways
|
data/lib/daru/index/multi_index.rb
CHANGED
@@ -244,8 +244,21 @@ module Daru
|
|
244
244
|
@labels.delete_at(layer_index)
|
245
245
|
@name.delete_at(layer_index) unless @name.nil?
|
246
246
|
|
247
|
-
|
248
|
-
|
247
|
+
coerce_index
|
248
|
+
end
|
249
|
+
|
250
|
+
def coerce_index
|
251
|
+
if @levels.size == 1
|
252
|
+
elements = to_a.flatten
|
253
|
+
|
254
|
+
if elements.uniq.length == elements.length
|
255
|
+
Daru::Index.new(elements)
|
256
|
+
else
|
257
|
+
Daru::CategoricalIndex.new(elements)
|
258
|
+
end
|
259
|
+
else
|
260
|
+
self
|
261
|
+
end
|
249
262
|
end
|
250
263
|
|
251
264
|
# Array `name` must have same length as levels and labels.
|
@@ -272,7 +285,7 @@ module Daru
|
|
272
285
|
end
|
273
286
|
|
274
287
|
def dup
|
275
|
-
MultiIndex.new levels: levels.dup, labels: labels
|
288
|
+
MultiIndex.new levels: levels.dup, labels: labels.dup, name: (@name.nil? ? nil : @name.dup)
|
276
289
|
end
|
277
290
|
|
278
291
|
def drop_left_level by=1
|
@@ -293,8 +306,9 @@ module Daru
|
|
293
306
|
|
294
307
|
def include? tuple
|
295
308
|
return false unless tuple.is_a? Enumerable
|
296
|
-
tuple.flatten.
|
297
|
-
|
309
|
+
@labels[0...tuple.flatten.size]
|
310
|
+
.transpose
|
311
|
+
.include?(tuple.flatten.each_with_index.map { |e, i| @levels[i][e] })
|
298
312
|
end
|
299
313
|
|
300
314
|
def size
|
data/lib/daru/io/io.rb
CHANGED
@@ -34,11 +34,12 @@ module Daru
|
|
34
34
|
end
|
35
35
|
end
|
36
36
|
|
37
|
-
module IO
|
37
|
+
module IO # rubocop:disable Metrics/ModuleLength
|
38
38
|
class << self
|
39
39
|
# Functions for loading/writing Excel files.
|
40
40
|
|
41
41
|
def from_excel path, opts={}
|
42
|
+
optional_gem 'spreadsheet', '~>1.1.1'
|
42
43
|
opts = {
|
43
44
|
worksheet_id: 0
|
44
45
|
}.merge opts
|
@@ -185,19 +186,25 @@ module Daru
|
|
185
186
|
end
|
186
187
|
|
187
188
|
def from_html path, opts
|
189
|
+
optional_gem 'mechanize', '~>2.7.5'
|
188
190
|
page = Mechanize.new.get(path)
|
189
191
|
page.search('table').map { |table| html_parse_table table }
|
190
192
|
.keep_if { |table| html_search table, opts[:match] }
|
191
193
|
.compact
|
192
194
|
.map { |table| html_decide_values table, opts }
|
193
195
|
.map { |table| html_table_to_dataframe table }
|
194
|
-
rescue LoadError
|
195
|
-
raise 'Install the mechanize gem version 2.7.5 with `gem install mechanize`,'\
|
196
|
-
' for using the from_html function.'
|
197
196
|
end
|
198
197
|
|
199
198
|
private
|
200
199
|
|
200
|
+
def optional_gem(name, version)
|
201
|
+
gem name, version
|
202
|
+
require name
|
203
|
+
rescue LoadError
|
204
|
+
Daru.error "\nInstall the #{name} gem version #{version} for using"\
|
205
|
+
" #{name} functions."
|
206
|
+
end
|
207
|
+
|
201
208
|
DARU_OPT_KEYS = %i[clone order index name].freeze
|
202
209
|
|
203
210
|
def from_csv_prepare_opts opts
|
@@ -214,7 +221,7 @@ module Daru
|
|
214
221
|
end
|
215
222
|
|
216
223
|
def from_csv_prepare_converters(converters)
|
217
|
-
converters.flat_map do |c|
|
224
|
+
Array(converters).flat_map do |c|
|
218
225
|
if ::CSV::Converters[c]
|
219
226
|
::CSV::Converters[c]
|
220
227
|
elsif Daru::IO::CSV::CONVERTERS[c]
|
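The `Array(converters)` change above is what allows a single Symbol to be passed where an array of converters was expected. A small plain-Ruby sketch of the coercion follows; `CONVERTERS` and `prepare_converters` here are hypothetical stand-ins, not daru's actual tables or method.

```ruby
# Hypothetical registry standing in for a converter lookup table.
CONVERTERS = {
  upcase: ->(value) { value.to_s.upcase },
  strip:  ->(value) { value.to_s.strip }
}.freeze

# Kernel#Array wraps a single object in an array, leaves arrays untouched
# and turns nil into [], so callers may pass one name, many, or none.
def prepare_converters(converters)
  Array(converters).map { |name| CONVERTERS.fetch(name) }
end

puts prepare_converters(:upcase).length
puts prepare_converters(%i[strip upcase]).length
```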
data/lib/daru/vector.rb
CHANGED
@@ -122,6 +122,17 @@ module Daru
|
|
122
122
|
self
|
123
123
|
end
|
124
124
|
|
125
|
+
def apply_method(method, keys: nil, by_position: true)
|
126
|
+
vect = keys ? get_sub_vector(keys, by_position: by_position) : self
|
127
|
+
|
128
|
+
case method
|
129
|
+
when Symbol then vect.send(method)
|
130
|
+
when Proc then method.call(vect)
|
131
|
+
else raise
|
132
|
+
end
|
133
|
+
end
|
134
|
+
alias :apply_method_on_sub_vector :apply_method
|
135
|
+
|
125
136
|
# The name of the Daru::Vector. String.
|
126
137
|
attr_reader :name
|
127
138
|
# The row index. Can be either Daru::Index or Daru::MultiIndex.
|
@@ -790,6 +801,7 @@ module Daru
|
|
790
801
|
self[idx] = last_valid_value
|
791
802
|
end
|
792
803
|
end
|
804
|
+
self
|
793
805
|
end
|
794
806
|
|
795
807
|
# Non-destructive version of rolling_fillna!
|
@@ -870,6 +882,19 @@ module Daru
|
|
870
882
|
@index.include? index
|
871
883
|
end
|
872
884
|
|
885
|
+
# @param keys [Array] can be positions (if by_position is true) or indexes (if by_position if false)
|
886
|
+
# @return [Daru::Vector]
|
887
|
+
def get_sub_vector(keys, by_position: true)
|
888
|
+
return Daru::Vector.new([]) if keys == []
|
889
|
+
|
890
|
+
keys = @index.pos(*keys) unless by_position
|
891
|
+
|
892
|
+
sub_vect = at(*keys)
|
893
|
+
sub_vect = Daru::Vector.new([sub_vect]) unless sub_vect.is_a?(Daru::Vector)
|
894
|
+
|
895
|
+
sub_vect
|
896
|
+
end
|
897
|
+
|
873
898
|
# @return [Daru::DataFrame] the vector as a single-vector dataframe
|
874
899
|
def to_df
|
875
900
|
Daru::DataFrame.new({@name => @data}, name: @name, index: @index)
|
data/lib/daru/version.rb
CHANGED
data/spec/core/group_by_spec.rb
CHANGED
@@ -201,6 +201,22 @@ describe Daru::Core::GroupBy do
|
|
201
201
|
end
|
202
202
|
end
|
203
203
|
|
204
|
+
context '#each_group without block' do
|
205
|
+
it 'enumerates groups' do
|
206
|
+
enum = @dl_group.each_group
|
207
|
+
|
208
|
+
expect(enum.count).to eq 6
|
209
|
+
expect(enum).to all be_a(Daru::DataFrame)
|
210
|
+
expect(enum.to_a.last).to eq(Daru::DataFrame.new({
|
211
|
+
a: ['foo', 'foo'],
|
212
|
+
b: ['two', 'two'],
|
213
|
+
c: [3, 3],
|
214
|
+
d: [33, 55]
|
215
|
+
}, index: [2, 4]
|
216
|
+
))
|
217
|
+
end
|
218
|
+
end
|
219
|
+
|
204
220
|
context '#first' do
|
205
221
|
it 'gets the first row from each group' do
|
206
222
|
expect(@dl_group.first).to eq(Daru::DataFrame.new({
|
@@ -223,10 +239,6 @@ describe Daru::Core::GroupBy do
|
|
223
239
|
end
|
224
240
|
end
|
225
241
|
|
226
|
-
context "#aggregate" do
|
227
|
-
pending
|
228
|
-
end
|
229
|
-
|
230
242
|
context "#mean" do
|
231
243
|
it "computes mean of the numeric columns of a single layer group" do
|
232
244
|
expect(@sl_group.mean).to eq(Daru::DataFrame.new({
|
@@ -498,23 +510,6 @@ describe Daru::Core::GroupBy do
|
|
498
510
|
}
|
499
511
|
end
|
500
512
|
|
501
|
-
context 'group and aggregate sum for two vectors' do
|
502
|
-
subject {
|
503
|
-
dataframe.group_by([:employee, :month]).aggregate(salary: :sum) }
|
504
|
-
|
505
|
-
it { is_expected.to eq Daru::DataFrame.new({
|
506
|
-
salary: [600, 500, 1200, 1000, 600, 700]},
|
507
|
-
index: Daru::MultiIndex.from_tuples([
|
508
|
-
['Jane', 'July'],
|
509
|
-
['Jane', 'June'],
|
510
|
-
['John', 'July'],
|
511
|
-
['John', 'June'],
|
512
|
-
['Mark', 'July'],
|
513
|
-
['Mark', 'June']
|
514
|
-
])
|
515
|
-
)}
|
516
|
-
end
|
517
|
-
|
518
513
|
context 'group and aggregate sum and lambda function for vectors' do
|
519
514
|
subject { dataframe.group_by([:employee]).aggregate(
|
520
515
|
salary: :sum,
|
@@ -592,5 +587,64 @@ describe Daru::Core::GroupBy do
|
|
592
587
|
)
|
593
588
|
end
|
594
589
|
end
|
590
|
+
|
591
|
+
let(:spending_df) {
|
592
|
+
Daru::DataFrame.rows([
|
593
|
+
[2010, 'dev', 50, 1],
|
594
|
+
[2010, 'dev', 150, 1],
|
595
|
+
[2010, 'dev', 200, 1],
|
596
|
+
[2011, 'dev', 50, 1],
|
597
|
+
[2012, 'dev', 150, 1],
|
598
|
+
|
599
|
+
[2011, 'office', 300, 1],
|
600
|
+
|
601
|
+
[2010, 'market', 50, 1],
|
602
|
+
[2011, 'market', 500, 1],
|
603
|
+
[2012, 'market', 500, 1],
|
604
|
+
[2012, 'market', 300, 1],
|
605
|
+
|
606
|
+
[2012, 'R&D', 10, 1],],
|
607
|
+
order: [:year, :category, :spending, :nb_spending])
|
608
|
+
}
|
609
|
+
let(:multi_index_year_category) {
|
610
|
+
Daru::MultiIndex.from_tuples([
|
611
|
+
[2010, "dev"], [2010, "market"],
|
612
|
+
[2011, "dev"], [2011, "market"], [2011, "office"],
|
613
|
+
[2012, "R&D"], [2012, "dev"], [2012, "market"]])
|
614
|
+
}
|
615
|
+
|
616
|
+
context 'group_by and aggregate on multiple elements' do
|
617
|
+
it 'does aggregate' do
|
618
|
+
expect(spending_df.group_by([:year, :category]).aggregate(spending: :sum)).to eq(
|
619
|
+
Daru::DataFrame.new({spending: [400, 50, 50, 500, 300, 10, 150, 800]}, index: multi_index_year_category))
|
620
|
+
end
|
621
|
+
|
622
|
+
it 'works as older methods' do
|
623
|
+
newer_way = spending_df.group_by([:year, :category]).aggregate(spending: :sum, nb_spending: :sum)
|
624
|
+
older_way = spending_df.group_by([:year, :category]).sum
|
625
|
+
expect(newer_way).to eq(older_way)
|
626
|
+
end
|
627
|
+
|
628
|
+
context 'can aggregate on MultiIndex' do
|
629
|
+
let(:multi_indexed_aggregated_df) { spending_df.group_by([:year, :category]).aggregate(spending: :sum) }
|
630
|
+
let(:index_year) { Daru::Index.new([2010, 2011, 2012]) }
|
631
|
+
let(:index_category) { Daru::Index.new(["dev", "market", "office", "R&D"]) }
|
632
|
+
|
633
|
+
it 'aggregates by default on the last layer of MultiIndex' do
|
634
|
+
expect(multi_indexed_aggregated_df.aggregate(spending: :sum)).to eq(
|
635
|
+
Daru::DataFrame.new({spending: [450, 850, 960]}, index: index_year))
|
636
|
+
end
|
637
|
+
|
638
|
+
it 'can aggregate on the first layer of MultiIndex' do
|
639
|
+
expect(multi_indexed_aggregated_df.aggregate({spending: :sum},0)).to eq(
|
640
|
+
Daru::DataFrame.new({spending: [600, 1350, 300, 10]}, index: index_category))
|
641
|
+
end
|
642
|
+
|
643
|
+
it 'does coercion: when one layer is remaining, MultiIndex is coerced in Index that does not aggregate anymore' do
|
644
|
+
df_with_simple_index = multi_indexed_aggregated_df.aggregate(spending: :sum)
|
645
|
+
expect(df_with_simple_index.aggregate(spending: :sum)).to eq(df_with_simple_index)
|
646
|
+
end
|
647
|
+
end
|
648
|
+
end
|
595
649
|
end
|
596
650
|
end
|
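The `#each_group without block` spec above relies on the common Ruby idiom of returning an Enumerator when no block is given. Here is a minimal sketch of that idiom outside daru; `TinyGroups` is a hypothetical class that yields plain arrays instead of DataFrames.

```ruby
# Hypothetical stand-in for a grouped collection.
class TinyGroups
  def initialize(groups)
    @groups = groups
  end

  # With a block: yield each group. Without: return an Enumerator over them.
  def each_group
    return to_enum(:each_group) unless block_given?

    @groups.each_key { |key| yield @groups[key] }
  end
end

enum = TinyGroups.new(a: [1, 2], b: [3]).each_group
puts enum.count
puts enum.to_a.last.inspect
```

Returning an Enumerator makes the usual `Enumerable` methods (`count`, `to_a`, `map`) available on the groups, which is what the spec exercises.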
data/spec/dataframe_spec.rb
CHANGED
@@ -1858,7 +1858,7 @@ describe Daru::DataFrame do
|
|
1858
1858
|
|
1859
1859
|
context 'rolling_fillna! forwards' do
|
1860
1860
|
before { subject.rolling_fillna!(:forward) }
|
1861
|
-
it {
|
1861
|
+
it { expect(subject.rolling_fillna!(:forward)).to eq(subject) }
|
1862
1862
|
its(:'a.to_a') { is_expected.to eq [1, 2, 3, 3, 3, 3, 1, 7] }
|
1863
1863
|
its(:'b.to_a') { is_expected.to eq [:a, :b, :b, :b, :b, 3, 5, 5] }
|
1864
1864
|
its(:'c.to_a') { is_expected.to eq ['a', 'a', 3, 4, 3, 5, 5, 7] }
|
@@ -1866,7 +1866,7 @@ describe Daru::DataFrame do
|
|
1866
1866
|
|
1867
1867
|
context 'rolling_fillna! backwards' do
|
1868
1868
|
before { subject.rolling_fillna!(:backward) }
|
1869
|
-
it {
|
1869
|
+
it { expect(subject.rolling_fillna!(:backward)).to eq(subject) }
|
1870
1870
|
its(:'a.to_a') { is_expected.to eq [1, 2, 3, 1, 1, 1, 1, 7] }
|
1871
1871
|
its(:'b.to_a') { is_expected.to eq [:a, :b, 3, 3, 3, 3, 5, 0] }
|
1872
1872
|
its(:'c.to_a') { is_expected.to eq ['a', 3, 3, 4, 3, 5, 7, 7] }
|
@@ -3266,6 +3266,18 @@ describe Daru::DataFrame do
|
|
3266
3266
|
end
|
3267
3267
|
end
|
3268
3268
|
|
3269
|
+
context "group_by" do
|
3270
|
+
context "on a single row DataFrame" do
|
3271
|
+
let(:df){ Daru::DataFrame.new(city: %w[Kyiv], year: [2015], value: [1]) }
|
3272
|
+
it "returns a groupby object" do
|
3273
|
+
expect(df.group_by([:city])).to be_a(Daru::Core::GroupBy)
|
3274
|
+
end
|
3275
|
+
it "has the correct index" do
|
3276
|
+
expect(df.group_by([:city]).groups).to eq({["Kyiv"]=>[0]})
|
3277
|
+
end
|
3278
|
+
end
|
3279
|
+
end
|
3280
|
+
|
3269
3281
|
context "#vector_sum" do
|
3270
3282
|
before do
|
3271
3283
|
a1 = Daru::Vector.new [1, 2, 3, 4, 5, nil, nil]
|
@@ -4032,7 +4044,7 @@ describe Daru::DataFrame do
|
|
4032
4044
|
Daru::DataFrame.new({num: [52,12,07,17,01]}, index: cat_idx) }
|
4033
4045
|
|
4034
4046
|
it 'lambda function on particular column' do
|
4035
|
-
expect(df.aggregate(num_100_times: ->(df) { df.num*100 })).to eq(
|
4047
|
+
expect(df.aggregate(num_100_times: ->(df) { (df.num*100).first })).to eq(
|
4036
4048
|
Daru::DataFrame.new(num_100_times: [5200, 1200, 700, 1700, 100])
|
4037
4049
|
)
|
4038
4050
|
end
|
@@ -4043,6 +4055,34 @@ describe Daru::DataFrame do
|
|
4043
4055
|
end
|
4044
4056
|
end
|
4045
4057
|
|
4058
|
+
context '#group_by_and_aggregate' do
|
4059
|
+
let(:spending_df) {
|
4060
|
+
Daru::DataFrame.rows([
|
4061
|
+
[2010, 'dev', 50, 1],
|
4062
|
+
[2010, 'dev', 150, 1],
|
4063
|
+
[2010, 'dev', 200, 1],
|
4064
|
+
[2011, 'dev', 50, 1],
|
4065
|
+
[2012, 'dev', 150, 1],
|
4066
|
+
|
4067
|
+
[2011, 'office', 300, 1],
|
4068
|
+
|
4069
|
+
[2010, 'market', 50, 1],
|
4070
|
+
[2011, 'market', 500, 1],
|
4071
|
+
[2012, 'market', 500, 1],
|
4072
|
+
[2012, 'market', 300, 1],
|
4073
|
+
|
4074
|
+
[2012, 'R&D', 10, 1],],
|
4075
|
+
order: [:year, :category, :spending, :nb_spending])
|
4076
|
+
}
|
4077
|
+
|
4078
|
+
it 'works as group_by + aggregate' do
|
4079
|
+
expect(spending_df.group_by_and_aggregate(:year, {spending: :sum})).to eq(
|
4080
|
+
spending_df.group_by(:year).aggregate(spending: :sum))
|
4081
|
+
expect(spending_df.group_by_and_aggregate([:year, :category], spending: :sum, nb_spending: :size)).to eq(
|
4082
|
+
spending_df.group_by([:year, :category]).aggregate(spending: :sum, nb_spending: :size))
|
4083
|
+
end
|
4084
|
+
end
|
4085
|
+
|
4046
4086
|
context '#create_sql' do
|
4047
4087
|
let(:df) { Daru::DataFrame.new({
|
4048
4088
|
a: [1,2,3],
|
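The `#group_by_and_aggregate` spec above asserts that the combined call equals `group_by` followed by `aggregate`. As a rough illustration of what that grouping and summing amounts to, here is a plain-Ruby sketch over the same rows as the `spending_df` fixture. It does not call daru, and `group_and_sum` is a hypothetical helper name introduced only for this example.

```ruby
# Plain-Ruby illustration only; this is not daru's implementation.
# Rows copied from the spending_df fixture: [year, category, spending, nb_spending]
ROWS = [
  [2010, 'dev', 50, 1],
  [2010, 'dev', 150, 1],
  [2010, 'dev', 200, 1],
  [2011, 'dev', 50, 1],
  [2012, 'dev', 150, 1],
  [2011, 'office', 300, 1],
  [2010, 'market', 50, 1],
  [2011, 'market', 500, 1],
  [2012, 'market', 500, 1],
  [2012, 'market', 300, 1],
  [2012, 'R&D', 10, 1]
].freeze

# Hypothetical helper: group rows by the value at key_index,
# then sum the values at value_index within each group.
def group_and_sum(rows, key_index, value_index)
  rows.group_by { |row| row[key_index] }
      .transform_values { |group| group.sum { |row| row[value_index] } }
end

spending_by_year = group_and_sum(ROWS, 0, 2)
spending_by_year.each { |year, total| puts "#{year}: #{total}" }
```

The same helper also handles a single-row input, which is the case the `group_by` fix in this release targets.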
@@ -0,0 +1,5 @@
|
|
1
|
+
ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,Beat,District,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
|
2
|
+
8517337,094652,03/12/2012 02:00:00 PM,027XX S HAMLIN AVE,1152,DECEPTIVE PRACTICE,ILLEGAL USE CASH CARD,ATM (AUTOMATIC TELLER MACHINE),false,true,1031,010,22,30,11,1151482,1885517,2012,02/04/2016 06:33:39 AM,41.841738053,-87.719605942,"(41.841738053, -87.719605942)"
|
3
|
+
8517338,194241,03/06/2012 10:49:00 PM,102XX S VERNON AVE,0917,MOTOR VEHICLE THEFT,"CYCLE, SCOOTER, BIKE W-VIN",STREET,false,false,0511,005,9,49,07,1181052,1837191,2012,02/04/2016 06:33:39 AM,41.708495677,-87.612580474,"(41.708495677, -87.612580474)"
|
4
|
+
8517339,194563,02/01/2012 08:15:00 AM,003XX W 108TH ST,0460,BATTERY,SIMPLE,"SCHOOL, PRIVATE, BUILDING",false,false,0513,005,34,49,08B,1176016,1833309,2012,02/04/2016 06:33:39 AM,41.6979571,-87.631138505,"(41.6979571, -87.631138505)"
|
5
|
+
8517340,194531,03/12/2012 05:50:00 PM,089XX S CARPENTER ST,0560,ASSAULT,SIMPLE,STREET,false,false,2222,022,21,73,08A,1170886,1845421,2012,02/04/2016 06:33:39 AM,41.731307475,-87.649569675,"(41.731307475, -87.649569675)"
|
@@ -202,8 +202,16 @@ describe Daru::MultiIndex do
|
|
202
202
|
expect(@multi_mi.include?([:a, :one])).to eq(true)
|
203
203
|
end
|
204
204
|
|
205
|
-
it "checks for non-existence of
|
206
|
-
expect(@multi_mi.include?([:
|
205
|
+
it "checks for non-existence of completely specified tuple" do
|
206
|
+
expect(@multi_mi.include?([:b, :two, :foo])).to eq(false)
|
207
|
+
end
|
208
|
+
|
209
|
+
it "checks for non-existence of a top layer incomplete tuple" do
|
210
|
+
expect(@multi_mi.include?([:d])).to eq(false)
|
211
|
+
end
|
212
|
+
|
213
|
+
it "checks for non-existence of a middle layer incomplete tuple" do
|
214
|
+
expect(@multi_mi.include?([:c, :three])).to eq(false)
|
207
215
|
end
|
208
216
|
end
|
209
217
|
|
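The MultiIndex specs above distinguish fully specified tuples from incomplete ones that name only the leading levels. One way to picture that behaviour is prefix matching, sketched below in plain Ruby. The tuples are made up for illustration because the contents of `@multi_mi` are not shown in this diff, and `tuple_included?` is a hypothetical helper, not a daru method.

```ruby
# Plain-Ruby illustration only; this is not how Daru::MultiIndex is implemented.
# Hypothetical tuples, since @multi_mi's contents do not appear in this diff.
TUPLES = [
  [:a, :one, :bar],
  [:a, :two, :baz],
  [:b, :one, :foo],
  [:c, :two, :bar]
].freeze

# Treat the query as a prefix: it is "included" when at least one
# stored tuple starts with exactly those leading elements.
def tuple_included?(tuples, query)
  tuples.any? { |tuple| tuple.first(query.size) == query }
end

puts tuple_included?(TUPLES, [:a, :one])        # incomplete tuple that exists
puts tuple_included?(TUPLES, [:b, :two, :foo])  # complete tuple that does not exist
puts tuple_included?(TUPLES, [:d])              # unknown top level
puts tuple_included?(TUPLES, [:c, :three])      # known top level, unknown middle level
```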
data/spec/io/io_spec.rb
CHANGED
@@ -51,6 +51,16 @@ describe Daru::IO do
|
|
51
51
|
expect(df['Domestic'].to_a).to all be_boolean
|
52
52
|
end
|
53
53
|
|
54
|
+
it "uses the custom string converter correctly" do
|
55
|
+
df = Daru::DataFrame.from_csv 'spec/fixtures/string_converter_test.csv', converters: [:string]
|
56
|
+
expect(df['Case Number'].to_a.all? {|x| String === x }).to be_truthy
|
57
|
+
end
|
58
|
+
|
59
|
+
it "allow symbol to converters option" do
|
60
|
+
df = Daru::DataFrame.from_csv 'spec/fixtures/boolean_converter_test.csv', converters: :boolean
|
61
|
+
expect(df['Domestic'].to_a).to all be_boolean
|
62
|
+
end
|
63
|
+
|
54
64
|
it "checks for equal parsing of local CSV files and remote CSV files" do
|
55
65
|
%w[matrix_test repeated_fields scientific_notation sales-funnel].each do |file|
|
56
66
|
df_local = Daru::DataFrame.from_csv("spec/fixtures/#{file}.csv")
|
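The two new specs above pass `converters:` once as an array (`[:string]`) and once as a bare symbol (`:boolean`). A common way to accept both shapes is to normalise the option with `Kernel#Array`. The sketch below shows that idea in plain Ruby under stated assumptions: `EXAMPLE_CONVERTERS`, `resolve_converters` and `convert_row` are hypothetical names, and the lambdas are stand-ins rather than the ones in `Daru::IO::CSV::CONVERTERS`.

```ruby
# Plain-Ruby illustration only; the registry and helpers are hypothetical
# and are not copied from daru.
EXAMPLE_CONVERTERS = {
  boolean: lambda { |field|
    case field.to_s.downcase
    when 'true'  then true
    when 'false' then false
    else field
    end
  },
  string: ->(field) { field.to_s }
}.freeze

# Array(:boolean) => [:boolean] and Array([:string]) => [:string],
# so a single symbol and an array of symbols are handled the same way.
def resolve_converters(option)
  Array(option).map { |name| EXAMPLE_CONVERTERS.fetch(name) }
end

# Apply each resolved converter, in order, to every field of a row.
def convert_row(row, option)
  resolve_converters(option).reduce(row) do |fields, converter|
    fields.map { |field| converter.call(field) }
  end
end

p convert_row(%w[false true 094652], :boolean)
p convert_row([94_652, 'BATTERY'], [:string])
```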
data/spec/vector_spec.rb
CHANGED
@@ -1808,6 +1808,22 @@ describe Daru::Vector do
|
|
1808
1808
|
end
|
1809
1809
|
end
|
1810
1810
|
|
1811
|
+
context '#rolling_fillna' do
|
1812
|
+
subject do
|
1813
|
+
Daru::Vector.new(
|
1814
|
+
[Float::NAN, 2, 1, 4, nil, Float::NAN, 3, nil, Float::NAN]
|
1815
|
+
)
|
1816
|
+
end
|
1817
|
+
|
1818
|
+
context 'rolling_fillna forwards' do
|
1819
|
+
it { expect(subject.rolling_fillna(:forward).to_a).to eq [0, 2, 1, 4, 4, 4, 3, 3, 3] }
|
1820
|
+
end
|
1821
|
+
|
1822
|
+
context 'rolling_fillna backwards' do
|
1823
|
+
it { expect(subject.rolling_fillna(direction: :backward).to_a).to eq [2, 2, 1, 4, 3, 3, 3, 0, 0] }
|
1824
|
+
end
|
1825
|
+
end
|
1826
|
+
|
1811
1827
|
context "#type" do
|
1812
1828
|
before(:each) do
|
1813
1829
|
@numeric = Daru::Vector.new([1,2,3,4,5])
|
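The `#rolling_fillna` spec above expects missing values (`nil` and `Float::NAN`) to take the most recent valid value in the chosen direction, and to become `0` when no valid value has been seen yet. A minimal plain-Ruby sketch of that rule follows. It operates on ordinary arrays rather than `Daru::Vector`, and `missing?` and `rolling_fill` are hypothetical helper names.

```ruby
# Plain-Ruby illustration only; this is not daru's implementation.

# A value counts as missing when it is nil or a Float NaN.
def missing?(value)
  value.nil? || (value.respond_to?(:nan?) && value.nan?)
end

# Replace each missing value with the last valid value seen while walking
# in the given direction; use 0 when nothing valid has been seen yet.
def rolling_fill(values, direction = :forward)
  ordered = direction == :backward ? values.reverse : values
  last_valid = 0
  filled = ordered.map do |value|
    if missing?(value)
      last_valid
    else
      last_valid = value
    end
  end
  direction == :backward ? filled.reverse : filled
end

data = [Float::NAN, 2, 1, 4, nil, Float::NAN, 3, nil, Float::NAN]
p rolling_fill(data, :forward)
p rolling_fill(data, :backward)
```

Walking the reversed array for `:backward` keeps a single code path for both directions, at the cost of two extra array copies.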
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: daru
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.2.
|
4
|
+
version: 0.2.1
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Sameer Deshmukh
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2018-07-02 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: backports
|
@@ -532,6 +532,7 @@ files:
|
|
532
532
|
- spec/fixtures/repeated_fields.csv
|
533
533
|
- spec/fixtures/sales-funnel.csv
|
534
534
|
- spec/fixtures/scientific_notation.csv
|
535
|
+
- spec/fixtures/string_converter_test.csv
|
535
536
|
- spec/fixtures/strings.dat
|
536
537
|
- spec/fixtures/test_xls.xls
|
537
538
|
- spec/fixtures/url_test.txt~
|
@@ -569,26 +570,7 @@ homepage: http://github.com/v0dro/daru
|
|
569
570
|
licenses:
|
570
571
|
- BSD-2
|
571
572
|
metadata: {}
|
572
|
-
post_install_message:
|
573
|
-
*************************************************************************
|
574
|
-
Thank you for installing daru!
|
575
|
-
|
576
|
-
oOOOOOo
|
577
|
-
,| oO
|
578
|
-
//| |
|
579
|
-
\\| |
|
580
|
-
`| |
|
581
|
-
`-----`
|
582
|
-
|
583
|
-
|
584
|
-
Hope you love daru! For enhanced interactivity and better visualizations,
|
585
|
-
consider using gnuplotrb and nyaplot with iruby. For statistics use the
|
586
|
-
statsample family.
|
587
|
-
|
588
|
-
Read the README for interesting use cases and examples.
|
589
|
-
|
590
|
-
Cheers!
|
591
|
-
*************************************************************************
|
573
|
+
post_install_message:
|
592
574
|
rdoc_options: []
|
593
575
|
require_paths:
|
594
576
|
- lib
|
@@ -604,7 +586,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
604
586
|
version: '0'
|
605
587
|
requirements: []
|
606
588
|
rubyforge_project:
|
607
|
-
rubygems_version: 2.6.
|
589
|
+
rubygems_version: 2.6.14
|
608
590
|
signing_key:
|
609
591
|
specification_version: 4
|
610
592
|
summary: Data Analysis in RUby
|
@@ -638,6 +620,7 @@ test_files:
|
|
638
620
|
- spec/fixtures/repeated_fields.csv
|
639
621
|
- spec/fixtures/sales-funnel.csv
|
640
622
|
- spec/fixtures/scientific_notation.csv
|
623
|
+
- spec/fixtures/string_converter_test.csv
|
641
624
|
- spec/fixtures/strings.dat
|
642
625
|
- spec/fixtures/test_xls.xls
|
643
626
|
- spec/fixtures/url_test.txt~
|