daru 0.2.0 → 0.2.1
- checksums.yaml +4 -4
- data/History.md +15 -0
- data/README.md +5 -3
- data/daru.gemspec +0 -23
- data/lib/daru.rb +0 -10
- data/lib/daru/core/group_by.rb +57 -46
- data/lib/daru/core/merge.rb +12 -3
- data/lib/daru/dataframe.rb +75 -67
- data/lib/daru/index/multi_index.rb +19 -5
- data/lib/daru/io/csv/converters.rb +3 -0
- data/lib/daru/io/io.rb +12 -5
- data/lib/daru/vector.rb +25 -0
- data/lib/daru/version.rb +1 -1
- data/spec/core/group_by_spec.rb +75 -21
- data/spec/dataframe_spec.rb +43 -3
- data/spec/fixtures/string_converter_test.csv +5 -0
- data/spec/index/multi_index_spec.rb +10 -2
- data/spec/io/io_spec.rb +10 -0
- data/spec/vector_spec.rb +16 -0
- metadata +6 -23
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 87e4e2869fe6411e3eece92bb5dc24d48f890774
|
4
|
+
data.tar.gz: e711d0db1d57f51f31ccb7fb54078a6bdbcc4ff5
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: afdb295d0d01542ba9f439cf5f7959d7f2a3b9e47de6047ecf7719548ef760e657c0dfe753ed16ee1da65e071bb5a182aaf03ee83c9de6075d54149753b9c346
|
7
|
+
data.tar.gz: e0c4ace661d9f1cb7e8040d424bb004a0b650a9605037d1aff258258bbac40a3c158e5f5b8a2a5c6a28070cf55566a0729ee9b77c8114d40d4d18cf9d26e69c3
|
data/History.md
CHANGED
@@ -1,3 +1,18 @@
|
|
1
|
+
# 0.2.1 (02 July 2018)
|
2
|
+
|
3
|
+
* Minor Enhancements
|
4
|
+
- Allow pasing singular Symbol to CSV converters option (@takkanm)
|
5
|
+
- Support calling GroupBy#each_group w/o blocks (@hibariya)
|
6
|
+
- Refactor grouping and aggregation (@paisible-wanderer)
|
7
|
+
- Add String Converter to Daru::IO::CSV::CONVERTERS (@takkanm)
|
8
|
+
- Fix annoying missing libraries warning
|
9
|
+
- Remove post-install message (nice yet useless)
|
10
|
+
|
11
|
+
* Fixes
|
12
|
+
- Fix group_by for DataFrame with single row (@baarkerlounger)
|
13
|
+
- `#rolling_fillna!` bugfixes on `Daru::Vector` and `Daru::DataFrame` (@mhammiche)
|
14
|
+
- Fixes `#include?` on multiindex (@rohitner)
|
15
|
+
|
1
16
|
# 0.2.0 (31 October 2017)
|
2
17
|
* Major Enhancements
|
3
18
|
- Add `DataFrame#which` query DSL (experimental! @rainchen)
|
data/README.md
CHANGED
@@ -3,12 +3,13 @@
|
|
3
3
|
[](http://badge.fury.io/rb/daru)
|
4
4
|
[](https://travis-ci.org/SciRuby/daru)
|
5
5
|
[](https://gitter.im/v0dro/daru?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge)
|
6
|
+
[](https://www.codetriage.com/sciruby/daru)
|
6
7
|
|
7
8
|
## Introduction
|
8
9
|
|
9
10
|
daru (Data Analysis in RUby) is a library for storage, analysis, manipulation and visualization of data in Ruby.
|
10
11
|
|
11
|
-
daru makes it easy and intuitive to process data predominantly through 2 data structures: `Daru::DataFrame` and `Daru::Vector`. Written in pure Ruby works with all ruby implementations. Tested with MRI 2.0, 2.1, 2.2 and 2.
|
12
|
+
daru makes it easy and intuitive to process data predominantly through 2 data structures: `Daru::DataFrame` and `Daru::Vector`. Written in pure Ruby works with all ruby implementations. Tested with MRI 2.0, 2.1, 2.2, 2.3, and 2.4.
|
12
13
|
|
13
14
|
## Features
|
14
15
|
|
@@ -73,6 +74,7 @@ $ gem install daru
|
|
73
74
|
* [Data Analysis in RUby: Basic data manipulation and plotting](http://v0dro.github.io/blog/2014/11/25/data-analysis-in-ruby-basic-data-manipulation-and-plotting/)
|
74
75
|
* [Data Analysis in RUby: Splitting, sorting, aggregating data and data types](http://v0dro.github.io/blog/2015/02/24/data-analysis-in-ruby-part-2/)
|
75
76
|
* [Finding and Combining data in daru](http://v0dro.github.io/blog/2015/08/03/finding-and-combining-data-in-daru/)
|
77
|
+
* [Introduction to analyzing datasets with daru library](http://gafur.me/2018/02/05/analysing-datasets-with-daru-library.html)
|
76
78
|
|
77
79
|
### Time series
|
78
80
|
|
@@ -192,13 +194,13 @@ In addition to nyaplot, daru also supports plotting out of the box with [gnuplot
|
|
192
194
|
|
193
195
|
## Documentation
|
194
196
|
|
195
|
-
Docs can be found [here](
|
197
|
+
Docs can be found [here](http://www.rubydoc.info/gems/daru).
|
196
198
|
|
197
199
|
## Contributing
|
198
200
|
|
199
201
|
Pick a feature from the Roadmap or the issue tracker or think of your own and send me a Pull Request!
|
200
202
|
|
201
|
-
For details see [CONTRIBUTING](https://github.com/
|
203
|
+
For details see [CONTRIBUTING](https://github.com/SciRuby/daru/blob/master/CONTRIBUTING.md).
|
202
204
|
|
203
205
|
## Acknowledgements
|
204
206
|
|
data/daru.gemspec
CHANGED
@@ -27,29 +27,6 @@ Gem::Specification.new do |spec|
|
|
27
27
|
spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
|
28
28
|
spec.require_paths = ["lib"]
|
29
29
|
|
30
|
-
spec.post_install_message = <<-EOF
|
31
|
-
*************************************************************************
|
32
|
-
Thank you for installing daru!
|
33
|
-
|
34
|
-
oOOOOOo
|
35
|
-
,| oO
|
36
|
-
//| |
|
37
|
-
\\\\| |
|
38
|
-
`| |
|
39
|
-
`-----`
|
40
|
-
|
41
|
-
|
42
|
-
Hope you love daru! For enhanced interactivity and better visualizations,
|
43
|
-
consider using gnuplotrb and nyaplot with iruby. For statistics use the
|
44
|
-
statsample family.
|
45
|
-
|
46
|
-
Read the README for interesting use cases and examples.
|
47
|
-
|
48
|
-
Cheers!
|
49
|
-
*************************************************************************
|
50
|
-
EOF
|
51
|
-
|
52
|
-
|
53
30
|
spec.add_runtime_dependency 'backports'
|
54
31
|
|
55
32
|
# it is required by NMatrix, yet we want to specify clearly which minimal version is OK
|
data/lib/daru.rb
CHANGED
@@ -86,16 +86,6 @@ module Daru
|
|
86
86
|
create_has_library :gruff
|
87
87
|
end
|
88
88
|
|
89
|
-
{'spreadsheet' => '~>1.1.1', 'mechanize' => '~>2.7.5'}.each do |name, version|
|
90
|
-
begin
|
91
|
-
gem name, version
|
92
|
-
require name
|
93
|
-
rescue LoadError
|
94
|
-
Daru.error "\nInstall the #{name} gem version #{version} for using"\
|
95
|
-
" #{name} functions."
|
96
|
-
end
|
97
|
-
end
|
98
|
-
|
99
89
|
autoload :CSV, 'csv'
|
100
90
|
require 'matrix'
|
101
91
|
require 'forwardable'
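Note on the hunk above: the eager `spreadsheet`/`mechanize` loading loop is removed from `daru.rb`, and the io.rb diff adds an `optional_gem` helper that is called only inside `from_excel` and `from_html`. A minimal sketch of that lazy-loading idea follows. The name `optional_require` is hypothetical, and the sketch skips the `gem name, version` activation step that the real helper performs.

```ruby
# Hypothetical helper, not daru's API: try to load a library only when a
# feature needs it, and report instead of failing at load time.
def optional_require(name)
  require name
  true
rescue LoadError
  warn "Install the #{name} gem to use #{name} functions."
  false
end

# Loading happens at the call site, so users who never touch the
# feature never see the warning.
json_available    = optional_require('json')
missing_available = optional_require('surely_missing_library_for_this_sketch')
```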
|
data/lib/daru/core/group_by.rb
CHANGED
@@ -1,11 +1,64 @@
|
|
1
1
|
module Daru
|
2
2
|
module Core
|
3
3
|
class GroupBy
|
4
|
+
class << self
|
5
|
+
def get_positions_group_map_on(indexes_with_positions, sort: false)
|
6
|
+
group_map = {}
|
7
|
+
|
8
|
+
indexes_with_positions.each do |idx, position|
|
9
|
+
(group_map[idx] ||= []) << position
|
10
|
+
end
|
11
|
+
|
12
|
+
if sort # TODO: maybe add a more "stable" sorting option?
|
13
|
+
sorted_keys = group_map.keys.sort(&Daru::Core::GroupBy::TUPLE_SORTER)
|
14
|
+
group_map = sorted_keys.map { |k| [k, group_map[k]] }.to_h
|
15
|
+
end
|
16
|
+
|
17
|
+
group_map
|
18
|
+
end
|
19
|
+
|
20
|
+
def get_positions_group_for_aggregation(multi_index, level=-1)
|
21
|
+
raise unless multi_index.is_a?(Daru::MultiIndex)
|
22
|
+
|
23
|
+
new_index = multi_index.dup
|
24
|
+
new_index.remove_layer(level) # TODO: recheck code of Daru::MultiIndex#remove_layer
|
25
|
+
|
26
|
+
get_positions_group_map_on(new_index.each_with_index)
|
27
|
+
end
|
28
|
+
|
29
|
+
def get_positions_group_map_for_df(df, group_by_keys, sort: true)
|
30
|
+
indexes_with_positions = df[*group_by_keys].to_df.each_row.map(&:to_a).each_with_index
|
31
|
+
|
32
|
+
get_positions_group_map_on(indexes_with_positions, sort: sort)
|
33
|
+
end
|
34
|
+
|
35
|
+
def group_map_from_positions_to_indexes(positions_group_map, index)
|
36
|
+
positions_group_map.map { |k, positions| [k, positions.map { |pos| index.at(pos) }] }.to_h
|
37
|
+
end
|
38
|
+
|
39
|
+
def df_from_group_map(df, group_map, remaining_vectors, from_position: true)
|
40
|
+
return nil if group_map == {}
|
41
|
+
|
42
|
+
new_index = group_map.flat_map { |group, values| values.map { |val| group + [val] } }
|
43
|
+
new_index = Daru::MultiIndex.from_tuples(new_index)
|
44
|
+
|
45
|
+
return Daru::DataFrame.new({}, index: new_index) if remaining_vectors == []
|
46
|
+
|
47
|
+
new_rows_order = group_map.values.flatten
|
48
|
+
new_df = df[*remaining_vectors].to_df.get_sub_dataframe(new_rows_order, by_position: from_position)
|
49
|
+
new_df.index = new_index
|
50
|
+
|
51
|
+
new_df
|
52
|
+
end
|
53
|
+
end
|
54
|
+
|
4
55
|
attr_reader :groups, :df
|
5
56
|
|
6
57
|
# Iterate over each group created by group_by. A DataFrame is yielded in
|
7
58
|
# block.
|
8
59
|
def each_group
|
60
|
+
return to_enum(:each_group) unless block_given?
|
61
|
+
|
9
62
|
groups.keys.each do |k|
|
10
63
|
yield get_group(k)
|
11
64
|
end
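Note on the hunk above: `each_group` now returns an enumerator when called without a block, via `to_enum(:each_group)`. The sketch below shows the same pattern on a stand-in class. `TinyGroups` is hypothetical and only mimics the shape of the method, not daru's `GroupBy`.

```ruby
# Hypothetical stand-in for a grouped collection.
class TinyGroups
  def initialize(groups)
    @groups = groups
  end

  # Yields each group's members; without a block, returns an Enumerator
  # so callers can chain map, count, to_a and so on.
  def each_group
    return to_enum(:each_group) unless block_given?

    @groups.each_key { |key| yield @groups[key] }
  end
end

groups = TinyGroups.new({ 'a' => [0, 2], 'b' => [1] })
enum   = groups.each_group
sizes  = enum.map(&:size)
```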
|
@@ -22,11 +75,8 @@ module Daru
|
|
22
75
|
end
|
23
76
|
|
24
77
|
def initialize context, names
|
25
|
-
@groups = {}
|
26
78
|
@non_group_vectors = context.vectors.to_a - names
|
27
79
|
@context = context
|
28
|
-
vectors = names.map { |vec| context[vec].to_a }
|
29
|
-
tuples = vectors[0].zip(*vectors[1..-1])
|
30
80
|
# FIXME: It feels like we don't want to sort here. Ruby's #group_by
|
31
81
|
# never sorts:
|
32
82
|
#
|
@@ -34,7 +84,10 @@ module Daru
|
|
34
84
|
# # => {4=>["test"], 2=>["me"], 6=>["please"]}
|
35
85
|
#
|
36
86
|
# - zverok, 2016-09-12
|
37
|
-
|
87
|
+
positions_groups = GroupBy.get_positions_group_map_for_df(@context, names, sort: true)
|
88
|
+
|
89
|
+
@groups = GroupBy.group_map_from_positions_to_indexes(positions_groups, @context.index)
|
90
|
+
@df = GroupBy.df_from_group_map(@context, positions_groups, @non_group_vectors)
|
38
91
|
end
|
39
92
|
|
40
93
|
# Get a Daru::Vector of the size of each group.
|
@@ -282,26 +335,11 @@ module Daru
|
|
282
335
|
# Ram Hyderabad,Mumbai
|
283
336
|
#
|
284
337
|
def aggregate(options={})
|
285
|
-
@df.index = @df.index.remove_layer(@df.index.levels.size - 1)
|
286
338
|
@df.aggregate(options)
|
287
339
|
end
|
288
340
|
|
289
341
|
private
|
290
342
|
|
291
|
-
def init_groups_df tuples, names
|
292
|
-
multi_index_tuples = []
|
293
|
-
keys = tuples.uniq.sort(&TUPLE_SORTER)
|
294
|
-
keys.each do |key|
|
295
|
-
indices = all_indices_for(tuples, key)
|
296
|
-
@groups[key] = indices
|
297
|
-
indices.each do |indice|
|
298
|
-
multi_index_tuples << key + [indice]
|
299
|
-
end
|
300
|
-
end
|
301
|
-
@groups.freeze
|
302
|
-
@df = resultant_context(multi_index_tuples, names) unless multi_index_tuples.empty?
|
303
|
-
end
|
304
|
-
|
305
343
|
def select_groups_from method, quantity
|
306
344
|
selection = @context
|
307
345
|
rows, indexes = [], []
|
@@ -342,33 +380,6 @@ module Daru
|
|
342
380
|
end
|
343
381
|
end
|
344
382
|
|
345
|
-
def resultant_context(multi_index_tuples, names)
|
346
|
-
multi_index = Daru::MultiIndex.from_tuples(multi_index_tuples)
|
347
|
-
context_tmp = @context.dup.delete_vectors(*names)
|
348
|
-
rows_tuples = context_tmp.access_row_tuples_by_indexs(
|
349
|
-
*@groups.values.flatten!
|
350
|
-
)
|
351
|
-
context_new = Daru::DataFrame.rows(rows_tuples, index: multi_index)
|
352
|
-
context_new.vectors = context_tmp.vectors
|
353
|
-
context_new
|
354
|
-
end
|
355
|
-
|
356
|
-
def all_indices_for arry, element
|
357
|
-
found, index, indexes = -1, -1, []
|
358
|
-
while found
|
359
|
-
found = arry[index+1..-1].index(element)
|
360
|
-
if found
|
361
|
-
index = index + found + 1
|
362
|
-
indexes << index
|
363
|
-
end
|
364
|
-
end
|
365
|
-
if indexes.count == 1
|
366
|
-
[@context.index.at(*indexes)]
|
367
|
-
else
|
368
|
-
@context.index.at(*indexes).to_a
|
369
|
-
end
|
370
|
-
end
|
371
|
-
|
372
383
|
def multi_indexed_grouping?
|
373
384
|
return false unless @groups.keys[0]
|
374
385
|
@groups.keys[0].size > 1
|
data/lib/daru/core/merge.rb
CHANGED
@@ -17,17 +17,17 @@ module Daru
|
|
17
17
|
end
|
18
18
|
end
|
19
19
|
|
20
|
-
def initialize left_df, right_df, opts={}
|
20
|
+
def initialize left_df, right_df, opts={} # rubocop:disable Metrics/AbcSize -- quick-fix for issue #171
|
21
21
|
init_opts(opts)
|
22
22
|
validate_on!(left_df, right_df)
|
23
23
|
key_sanitizer = ->(h) { sanitize_merge_keys(h.values_at(*on)) }
|
24
24
|
|
25
25
|
@left = df_to_a(left_df)
|
26
|
-
@left.
|
26
|
+
@left.sort! { |a, b| safe_compare(a.values_at(*on), b.values_at(*on)) }
|
27
27
|
@left_key_values = @left.map(&key_sanitizer)
|
28
28
|
|
29
29
|
@right = df_to_a(right_df)
|
30
|
-
@right.
|
30
|
+
@right.sort! { |a, b| safe_compare(a.values_at(*on), b.values_at(*on)) }
|
31
31
|
@right_key_values = @right.map(&key_sanitizer)
|
32
32
|
|
33
33
|
@left_keys, @right_keys = merge_keys(left_df, right_df, on)
|
@@ -246,6 +246,15 @@ module Daru
|
|
246
246
|
raise ArgumentError, "Both dataframes expected to have #{on.inspect} field"
|
247
247
|
end
|
248
248
|
end
|
249
|
+
|
250
|
+
def safe_compare(left_array, right_array)
|
251
|
+
left_array.zip(right_array).map { |l, r|
|
252
|
+
next 0 if l.nil? && r.nil?
|
253
|
+
next 1 if r.nil?
|
254
|
+
next -1 if l.nil?
|
255
|
+
l <=> r
|
256
|
+
}.reject(&:zero?).first || 0
|
257
|
+
end
|
249
258
|
end
|
250
259
|
|
251
260
|
module Merge
|
data/lib/daru/dataframe.rb
CHANGED
@@ -549,6 +549,20 @@ module Daru
|
|
549
549
|
Daru::Accessors::DataFrameByRow.new(self)
|
550
550
|
end
|
551
551
|
|
552
|
+
# Extract a dataframe given row indexes or positions
|
553
|
+
# @param keys [Array] can be positions (if by_position is true) or indexes (if by_position if false)
|
554
|
+
# @return [Daru::Dataframe]
|
555
|
+
def get_sub_dataframe(keys, by_position: true)
|
556
|
+
return Daru::DataFrame.new({}) if keys == []
|
557
|
+
|
558
|
+
keys = @index.pos(*keys) unless by_position
|
559
|
+
|
560
|
+
sub_df = row_at(*keys)
|
561
|
+
sub_df = sub_df.to_df.transpose if sub_df.is_a?(Daru::Vector)
|
562
|
+
|
563
|
+
sub_df
|
564
|
+
end
|
565
|
+
|
552
566
|
# Duplicate the DataFrame entirely.
|
553
567
|
#
|
554
568
|
# == Arguments
|
@@ -698,6 +712,7 @@ module Daru
|
|
698
712
|
#
|
699
713
|
def rolling_fillna!(direction=:forward)
|
700
714
|
@data.each { |vec| vec.rolling_fillna!(direction) }
|
715
|
+
self
|
701
716
|
end
|
702
717
|
|
703
718
|
def rolling_fillna(direction=:forward)
|
@@ -990,6 +1005,17 @@ module Daru
|
|
990
1005
|
self
|
991
1006
|
end
|
992
1007
|
|
1008
|
+
def apply_method(method, keys: nil, by_position: true)
|
1009
|
+
df = keys ? get_sub_dataframe(keys, by_position: by_position) : self
|
1010
|
+
|
1011
|
+
case method
|
1012
|
+
when Symbol then df.send(method)
|
1013
|
+
when Proc then method.call(df)
|
1014
|
+
else raise
|
1015
|
+
end
|
1016
|
+
end
|
1017
|
+
alias :apply_method_on_sub_df :apply_method
|
1018
|
+
|
993
1019
|
# Retrieves a Daru::Vector, based on the result of calculation
|
994
1020
|
# performed on each row.
|
995
1021
|
def collect_rows &block
|
@@ -1450,11 +1476,10 @@ module Daru
|
|
1450
1476
|
# # ["foo", "two", 3]=>[2, 4]}
|
1451
1477
|
def group_by *vectors
|
1452
1478
|
vectors.flatten!
|
1453
|
-
|
1454
|
-
|
1455
|
-
|
1456
|
-
|
1457
|
-
}
|
1479
|
+
missing = vectors - @vectors.to_a
|
1480
|
+
unless missing.empty?
|
1481
|
+
raise(ArgumentError, "Vector(s) missing: #{missing.join(', ')}")
|
1482
|
+
end
|
1458
1483
|
|
1459
1484
|
vectors = [@vectors.first] if vectors.empty?
|
1460
1485
|
|
@@ -2249,22 +2274,6 @@ module Daru
|
|
2249
2274
|
end
|
2250
2275
|
end
|
2251
2276
|
|
2252
|
-
# returns array of row tuples at given index(s)
|
2253
|
-
def access_row_tuples_by_indexs *indexes
|
2254
|
-
positions = @index.pos(*indexes)
|
2255
|
-
|
2256
|
-
return populate_row_for(positions) if positions.is_a? Numeric
|
2257
|
-
|
2258
|
-
res = []
|
2259
|
-
new_rows = @data.map { |vec| vec[*indexes] }
|
2260
|
-
indexes.each do |index|
|
2261
|
-
tuples = []
|
2262
|
-
new_rows.map { |row| tuples += [row[index]] }
|
2263
|
-
res << tuples
|
2264
|
-
end
|
2265
|
-
res
|
2266
|
-
end
|
2267
|
-
|
2268
2277
|
# Function to use for aggregating the data.
|
2269
2278
|
#
|
2270
2279
|
# @param options [Hash] options for column, you want in resultant dataframe
|
@@ -2282,7 +2291,7 @@ module Daru
|
|
2282
2291
|
# 3 d 17
|
2283
2292
|
# 4 e 1
|
2284
2293
|
#
|
2285
|
-
# df.aggregate(num_100_times: ->(df) { df.num*100 })
|
2294
|
+
# df.aggregate(num_100_times: ->(df) { (df.num*100).first })
|
2286
2295
|
# => #<Daru::DataFrame(5x1)>
|
2287
2296
|
# num_100_ti
|
2288
2297
|
# 0 5200
|
@@ -2312,41 +2321,26 @@ module Daru
|
|
2312
2321
|
#
|
2313
2322
|
# Note: `GroupBy` class `aggregate` method uses this `aggregate` method
|
2314
2323
|
# internally.
|
2315
|
-
def aggregate(options={})
|
2316
|
-
|
2317
|
-
Daru::DataFrame.new(
|
2318
|
-
colmn_value, index: index_tuples, order: options.keys
|
2319
|
-
)
|
2320
|
-
end
|
2324
|
+
def aggregate(options={}, multi_index_level=-1)
|
2325
|
+
positions_tuples, new_index = group_index_for_aggregation(@index, multi_index_level)
|
2321
2326
|
|
2322
|
-
|
2327
|
+
colmn_value = aggregate_by_positions_tuples(options, positions_tuples)
|
2323
2328
|
|
2324
|
-
|
2325
|
-
# lambda), on the column.
|
2326
|
-
def apply_method_on_colmns colmn, index_tuples, method
|
2327
|
-
rows = []
|
2328
|
-
index_tuples.each do |indexes|
|
2329
|
-
# If single element then also make it vector.
|
2330
|
-
slice = Daru::Vector.new(Array(self[colmn][*indexes]))
|
2331
|
-
case method
|
2332
|
-
when Symbol
|
2333
|
-
rows << (slice.is_a?(Daru::Vector) ? slice.send(method) : slice)
|
2334
|
-
when Proc
|
2335
|
-
rows << method.call(slice)
|
2336
|
-
end
|
2337
|
-
end
|
2338
|
-
rows
|
2329
|
+
Daru::DataFrame.new(colmn_value, index: new_index, order: options.keys)
|
2339
2330
|
end
|
2340
2331
|
|
2341
|
-
|
2342
|
-
|
2343
|
-
|
2344
|
-
|
2345
|
-
|
2346
|
-
|
2347
|
-
|
2332
|
+
# Is faster than using group_by followed by aggregate (because it doesn't generate an intermediary dataframe)
|
2333
|
+
def group_by_and_aggregate(*group_by_keys, **aggregation_map)
|
2334
|
+
positions_groups = Daru::Core::GroupBy.get_positions_group_map_for_df(self, group_by_keys.flatten, sort: true)
|
2335
|
+
|
2336
|
+
new_index = Daru::MultiIndex.from_tuples(positions_groups.keys).coerce_index
|
2337
|
+
colmn_value = aggregate_by_positions_tuples(aggregation_map, positions_groups.values)
|
2338
|
+
|
2339
|
+
Daru::DataFrame.new(colmn_value, index: new_index, order: aggregation_map.keys)
|
2348
2340
|
end
|
2349
2341
|
|
2342
|
+
private
|
2343
|
+
|
2350
2344
|
def headers
|
2351
2345
|
Daru::Index.new(Array(index.name) + @vectors.to_a)
|
2352
2346
|
end
|
@@ -2910,27 +2904,41 @@ module Daru
|
|
2910
2904
|
end
|
2911
2905
|
|
2912
2906
|
def update_data source, vectors
|
2913
|
-
@data = @vectors.each_with_index.map do |_vec,idx|
|
2907
|
+
@data = @vectors.each_with_index.map do |_vec, idx|
|
2914
2908
|
Daru::Vector.new(source[idx], index: @index, name: vectors[idx])
|
2915
2909
|
end
|
2916
2910
|
end
|
2917
2911
|
|
2918
|
-
def
|
2919
|
-
|
2920
|
-
|
2921
|
-
|
2922
|
-
|
2923
|
-
|
2924
|
-
|
2925
|
-
|
2926
|
-
|
2927
|
-
|
2928
|
-
|
2929
|
-
|
2930
|
-
|
2931
|
-
end
|
2912
|
+
def aggregate_by_positions_tuples(options, positions_tuples)
|
2913
|
+
options.map do |vect, method|
|
2914
|
+
if @vectors.include?(vect)
|
2915
|
+
vect = self[vect]
|
2916
|
+
|
2917
|
+
positions_tuples.map do |positions|
|
2918
|
+
vect.apply_method_on_sub_vector(method, keys: positions)
|
2919
|
+
end
|
2920
|
+
else
|
2921
|
+
positions_tuples.map do |positions|
|
2922
|
+
apply_method_on_sub_df(method, keys: positions)
|
2923
|
+
end
|
2924
|
+
end
|
2932
2925
|
end
|
2933
|
-
|
2926
|
+
end
|
2927
|
+
|
2928
|
+
def group_index_for_aggregation(index, multi_index_level=-1)
|
2929
|
+
case index
|
2930
|
+
when Daru::MultiIndex
|
2931
|
+
groups = Daru::Core::GroupBy.get_positions_group_for_aggregation(index, multi_index_level)
|
2932
|
+
new_index, pos_tuples = groups.keys, groups.values
|
2933
|
+
|
2934
|
+
new_index = Daru::MultiIndex.from_tuples(new_index).coerce_index
|
2935
|
+
when Daru::Index, Daru::CategoricalIndex
|
2936
|
+
new_index = Array(index).uniq
|
2937
|
+
pos_tuples = new_index.map { |idx| [*index.pos(idx)] }
|
2938
|
+
else raise
|
2939
|
+
end
|
2940
|
+
|
2941
|
+
[pos_tuples, new_index]
|
2934
2942
|
end
|
2935
2943
|
|
2936
2944
|
# coerce ranges, integers and array in appropriate ways
|
|
data/lib/daru/index/multi_index.rb
CHANGED
@@ -244,8 +244,21 @@ module Daru
|
|
244
244
|
@labels.delete_at(layer_index)
|
245
245
|
@name.delete_at(layer_index) unless @name.nil?
|
246
246
|
|
247
|
-
|
248
|
-
|
247
|
+
coerce_index
|
248
|
+
end
|
249
|
+
|
250
|
+
def coerce_index
|
251
|
+
if @levels.size == 1
|
252
|
+
elements = to_a.flatten
|
253
|
+
|
254
|
+
if elements.uniq.length == elements.length
|
255
|
+
Daru::Index.new(elements)
|
256
|
+
else
|
257
|
+
Daru::CategoricalIndex.new(elements)
|
258
|
+
end
|
259
|
+
else
|
260
|
+
self
|
261
|
+
end
|
249
262
|
end
|
250
263
|
|
251
264
|
# Array `name` must have same length as levels and labels.
|
@@ -272,7 +285,7 @@ module Daru
|
|
272
285
|
end
|
273
286
|
|
274
287
|
def dup
|
275
|
-
MultiIndex.new levels: levels.dup, labels: labels
|
288
|
+
MultiIndex.new levels: levels.dup, labels: labels.dup, name: (@name.nil? ? nil : @name.dup)
|
276
289
|
end
|
277
290
|
|
278
291
|
def drop_left_level by=1
|
@@ -293,8 +306,9 @@ module Daru
|
|
293
306
|
|
294
307
|
def include? tuple
|
295
308
|
return false unless tuple.is_a? Enumerable
|
296
|
-
tuple.flatten.
|
297
|
-
|
309
|
+
@labels[0...tuple.flatten.size]
|
310
|
+
.transpose
|
311
|
+
.include?(tuple.flatten.each_with_index.map { |e, i| @levels[i][e] })
|
298
312
|
end
|
299
313
|
|
300
314
|
def size
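Note on `coerce_index`: once a MultiIndex is down to one level, the method above turns it into a `Daru::Index` when all elements are unique and into a `Daru::CategoricalIndex` otherwise. The sketch below mirrors only that decision. `coerce_kind` is a hypothetical helper that returns a symbol naming the outcome instead of building daru objects.

```ruby
# Decides which kind of index a list of tuples would coerce to.
def coerce_kind(tuples)
  # More than one level: stays a multi-index.
  return :multi_index unless tuples.all? { |tuple| tuple.size == 1 }

  elements = tuples.flatten
  elements.uniq.length == elements.length ? :index : :categorical_index
end

unique_years   = coerce_kind([[2010], [2011], [2012]])
repeated_years = coerce_kind([[2010], [2010], [2011]])
two_levels     = coerce_kind([[2010, 'dev'], [2011, 'dev']])
```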
|
data/lib/daru/io/io.rb
CHANGED
@@ -34,11 +34,12 @@ module Daru
|
|
34
34
|
end
|
35
35
|
end
|
36
36
|
|
37
|
-
module IO
|
37
|
+
module IO # rubocop:disable Metrics/ModuleLength
|
38
38
|
class << self
|
39
39
|
# Functions for loading/writing Excel files.
|
40
40
|
|
41
41
|
def from_excel path, opts={}
|
42
|
+
optional_gem 'spreadsheet', '~>1.1.1'
|
42
43
|
opts = {
|
43
44
|
worksheet_id: 0
|
44
45
|
}.merge opts
|
@@ -185,19 +186,25 @@ module Daru
|
|
185
186
|
end
|
186
187
|
|
187
188
|
def from_html path, opts
|
189
|
+
optional_gem 'mechanize', '~>2.7.5'
|
188
190
|
page = Mechanize.new.get(path)
|
189
191
|
page.search('table').map { |table| html_parse_table table }
|
190
192
|
.keep_if { |table| html_search table, opts[:match] }
|
191
193
|
.compact
|
192
194
|
.map { |table| html_decide_values table, opts }
|
193
195
|
.map { |table| html_table_to_dataframe table }
|
194
|
-
rescue LoadError
|
195
|
-
raise 'Install the mechanize gem version 2.7.5 with `gem install mechanize`,'\
|
196
|
-
' for using the from_html function.'
|
197
196
|
end
|
198
197
|
|
199
198
|
private
|
200
199
|
|
200
|
+
def optional_gem(name, version)
|
201
|
+
gem name, version
|
202
|
+
require name
|
203
|
+
rescue LoadError
|
204
|
+
Daru.error "\nInstall the #{name} gem version #{version} for using"\
|
205
|
+
" #{name} functions."
|
206
|
+
end
|
207
|
+
|
201
208
|
DARU_OPT_KEYS = %i[clone order index name].freeze
|
202
209
|
|
203
210
|
def from_csv_prepare_opts opts
|
@@ -214,7 +221,7 @@ module Daru
|
|
214
221
|
end
|
215
222
|
|
216
223
|
def from_csv_prepare_converters(converters)
|
217
|
-
converters.flat_map do |c|
|
224
|
+
Array(converters).flat_map do |c|
|
218
225
|
if ::CSV::Converters[c]
|
219
226
|
::CSV::Converters[c]
|
220
227
|
elsif Daru::IO::CSV::CONVERTERS[c]
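Note on `Array(converters)`: wrapping the option in `Kernel#Array` is what allows a single Symbol to be passed where a list was required before. A small sketch of that coercion follows. `normalize_converters` is a hypothetical name used only for illustration.

```ruby
# Accepts a single converter name, a list of names, or nil,
# and always returns an array that can be iterated with flat_map.
def normalize_converters(converters)
  Array(converters)
end

single = normalize_converters(:string)
list   = normalize_converters(%i[boolean string])
none   = normalize_converters(nil)
# single => [:string], list => [:boolean, :string], none => []
```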
|
data/lib/daru/vector.rb
CHANGED
@@ -122,6 +122,17 @@ module Daru
|
|
122
122
|
self
|
123
123
|
end
|
124
124
|
|
125
|
+
def apply_method(method, keys: nil, by_position: true)
|
126
|
+
vect = keys ? get_sub_vector(keys, by_position: by_position) : self
|
127
|
+
|
128
|
+
case method
|
129
|
+
when Symbol then vect.send(method)
|
130
|
+
when Proc then method.call(vect)
|
131
|
+
else raise
|
132
|
+
end
|
133
|
+
end
|
134
|
+
alias :apply_method_on_sub_vector :apply_method
|
135
|
+
|
125
136
|
# The name of the Daru::Vector. String.
|
126
137
|
attr_reader :name
|
127
138
|
# The row index. Can be either Daru::Index or Daru::MultiIndex.
|
@@ -790,6 +801,7 @@ module Daru
|
|
790
801
|
self[idx] = last_valid_value
|
791
802
|
end
|
792
803
|
end
|
804
|
+
self
|
793
805
|
end
|
794
806
|
|
795
807
|
# Non-destructive version of rolling_fillna!
|
@@ -870,6 +882,19 @@ module Daru
|
|
870
882
|
@index.include? index
|
871
883
|
end
|
872
884
|
|
885
|
+
# @param keys [Array] can be positions (if by_position is true) or indexes (if by_position if false)
|
886
|
+
# @return [Daru::Vector]
|
887
|
+
def get_sub_vector(keys, by_position: true)
|
888
|
+
return Daru::Vector.new([]) if keys == []
|
889
|
+
|
890
|
+
keys = @index.pos(*keys) unless by_position
|
891
|
+
|
892
|
+
sub_vect = at(*keys)
|
893
|
+
sub_vect = Daru::Vector.new([sub_vect]) unless sub_vect.is_a?(Daru::Vector)
|
894
|
+
|
895
|
+
sub_vect
|
896
|
+
end
|
897
|
+
|
873
898
|
# @return [Daru::DataFrame] the vector as a single-vector dataframe
|
874
899
|
def to_df
|
875
900
|
Daru::DataFrame.new({@name => @data}, name: @name, index: @index)
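Note on `rolling_fillna!`: the one-line fix above makes the method return `self`, so the call can be chained and its result compared with the receiver, as the updated dataframe spec does. The sketch below shows a forward/backward fill that returns `self`. `TinySeries` is hypothetical, and filling a leading gap with `0` is an assumption suggested by the spec expectations, not a statement about daru's internals.

```ruby
# Hypothetical stand-in for a vector with missing values.
class TinySeries
  attr_reader :data

  def initialize(data)
    @data = data
  end

  # Replaces each nil with the last non-nil value seen in the given
  # direction, then returns self so calls can be chained.
  def rolling_fillna!(direction = :forward)
    positions = (0...@data.size).to_a
    positions.reverse! if direction == :backward

    last_valid = 0 # assumed fallback when no valid value has been seen yet
    positions.each do |i|
      if @data[i].nil?
        @data[i] = last_valid
      else
        last_valid = @data[i]
      end
    end

    self
  end
end

series  = TinySeries.new([1, nil, 3, nil])
forward = series.rolling_fillna!(:forward)
back    = TinySeries.new([nil, 2, nil]).rolling_fillna!(:backward).data
```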
|
data/lib/daru/version.rb
CHANGED
data/spec/core/group_by_spec.rb
CHANGED
@@ -201,6 +201,22 @@ describe Daru::Core::GroupBy do
|
|
201
201
|
end
|
202
202
|
end
|
203
203
|
|
204
|
+
context '#each_group without block' do
|
205
|
+
it 'enumerates groups' do
|
206
|
+
enum = @dl_group.each_group
|
207
|
+
|
208
|
+
expect(enum.count).to eq 6
|
209
|
+
expect(enum).to all be_a(Daru::DataFrame)
|
210
|
+
expect(enum.to_a.last).to eq(Daru::DataFrame.new({
|
211
|
+
a: ['foo', 'foo'],
|
212
|
+
b: ['two', 'two'],
|
213
|
+
c: [3, 3],
|
214
|
+
d: [33, 55]
|
215
|
+
}, index: [2, 4]
|
216
|
+
))
|
217
|
+
end
|
218
|
+
end
|
219
|
+
|
204
220
|
context '#first' do
|
205
221
|
it 'gets the first row from each group' do
|
206
222
|
expect(@dl_group.first).to eq(Daru::DataFrame.new({
|
@@ -223,10 +239,6 @@ describe Daru::Core::GroupBy do
|
|
223
239
|
end
|
224
240
|
end
|
225
241
|
|
226
|
-
context "#aggregate" do
|
227
|
-
pending
|
228
|
-
end
|
229
|
-
|
230
242
|
context "#mean" do
|
231
243
|
it "computes mean of the numeric columns of a single layer group" do
|
232
244
|
expect(@sl_group.mean).to eq(Daru::DataFrame.new({
|
@@ -498,23 +510,6 @@ describe Daru::Core::GroupBy do
|
|
498
510
|
}
|
499
511
|
end
|
500
512
|
|
501
|
-
context 'group and aggregate sum for two vectors' do
|
502
|
-
subject {
|
503
|
-
dataframe.group_by([:employee, :month]).aggregate(salary: :sum) }
|
504
|
-
|
505
|
-
it { is_expected.to eq Daru::DataFrame.new({
|
506
|
-
salary: [600, 500, 1200, 1000, 600, 700]},
|
507
|
-
index: Daru::MultiIndex.from_tuples([
|
508
|
-
['Jane', 'July'],
|
509
|
-
['Jane', 'June'],
|
510
|
-
['John', 'July'],
|
511
|
-
['John', 'June'],
|
512
|
-
['Mark', 'July'],
|
513
|
-
['Mark', 'June']
|
514
|
-
])
|
515
|
-
)}
|
516
|
-
end
|
517
|
-
|
518
513
|
context 'group and aggregate sum and lambda function for vectors' do
|
519
514
|
subject { dataframe.group_by([:employee]).aggregate(
|
520
515
|
salary: :sum,
|
@@ -592,5 +587,64 @@ describe Daru::Core::GroupBy do
|
|
592
587
|
)
|
593
588
|
end
|
594
589
|
end
|
590
|
+
|
591
|
+
let(:spending_df) {
|
592
|
+
Daru::DataFrame.rows([
|
593
|
+
[2010, 'dev', 50, 1],
|
594
|
+
[2010, 'dev', 150, 1],
|
595
|
+
[2010, 'dev', 200, 1],
|
596
|
+
[2011, 'dev', 50, 1],
|
597
|
+
[2012, 'dev', 150, 1],
|
598
|
+
|
599
|
+
[2011, 'office', 300, 1],
|
600
|
+
|
601
|
+
[2010, 'market', 50, 1],
|
602
|
+
[2011, 'market', 500, 1],
|
603
|
+
[2012, 'market', 500, 1],
|
604
|
+
[2012, 'market', 300, 1],
|
605
|
+
|
606
|
+
[2012, 'R&D', 10, 1],],
|
607
|
+
order: [:year, :category, :spending, :nb_spending])
|
608
|
+
}
|
609
|
+
let(:multi_index_year_category) {
|
610
|
+
Daru::MultiIndex.from_tuples([
|
611
|
+
[2010, "dev"], [2010, "market"],
|
612
|
+
[2011, "dev"], [2011, "market"], [2011, "office"],
|
613
|
+
[2012, "R&D"], [2012, "dev"], [2012, "market"]])
|
614
|
+
}
|
615
|
+
|
616
|
+
context 'group_by and aggregate on multiple elements' do
|
617
|
+
it 'does aggregate' do
|
618
|
+
expect(spending_df.group_by([:year, :category]).aggregate(spending: :sum)).to eq(
|
619
|
+
Daru::DataFrame.new({spending: [400, 50, 50, 500, 300, 10, 150, 800]}, index: multi_index_year_category))
|
620
|
+
end
|
621
|
+
|
622
|
+
it 'works as older methods' do
|
623
|
+
newer_way = spending_df.group_by([:year, :category]).aggregate(spending: :sum, nb_spending: :sum)
|
624
|
+
older_way = spending_df.group_by([:year, :category]).sum
|
625
|
+
expect(newer_way).to eq(older_way)
|
626
|
+
end
|
627
|
+
|
628
|
+
context 'can aggregate on MultiIndex' do
|
629
|
+
let(:multi_indexed_aggregated_df) { spending_df.group_by([:year, :category]).aggregate(spending: :sum) }
|
630
|
+
let(:index_year) { Daru::Index.new([2010, 2011, 2012]) }
|
631
|
+
let(:index_category) { Daru::Index.new(["dev", "market", "office", "R&D"]) }
|
632
|
+
|
633
|
+
it 'aggregates by default on the last layer of MultiIndex' do
|
634
|
+
expect(multi_indexed_aggregated_df.aggregate(spending: :sum)).to eq(
|
635
|
+
Daru::DataFrame.new({spending: [450, 850, 960]}, index: index_year))
|
636
|
+
end
|
637
|
+
|
638
|
+
it 'can aggregate on the first layer of MultiIndex' do
|
639
|
+
expect(multi_indexed_aggregated_df.aggregate({spending: :sum},0)).to eq(
|
640
|
+
Daru::DataFrame.new({spending: [600, 1350, 300, 10]}, index: index_category))
|
641
|
+
end
|
642
|
+
|
643
|
+
it 'does coercion: when one layer is remaining, MultiIndex is coerced in Index that does not aggregate anymore' do
|
644
|
+
df_with_simple_index = multi_indexed_aggregated_df.aggregate(spending: :sum)
|
645
|
+
expect(df_with_simple_index.aggregate(spending: :sum)).to eq(df_with_simple_index)
|
646
|
+
end
|
647
|
+
end
|
648
|
+
end
|
595
649
|
end
|
596
650
|
end
|
data/spec/dataframe_spec.rb
CHANGED
@@ -1858,7 +1858,7 @@ describe Daru::DataFrame do
|
|
1858
1858
|
|
1859
1859
|
context 'rolling_fillna! forwards' do
|
1860
1860
|
before { subject.rolling_fillna!(:forward) }
|
1861
|
-
it {
|
1861
|
+
it { expect(subject.rolling_fillna!(:forward)).to eq(subject) }
|
1862
1862
|
its(:'a.to_a') { is_expected.to eq [1, 2, 3, 3, 3, 3, 1, 7] }
|
1863
1863
|
its(:'b.to_a') { is_expected.to eq [:a, :b, :b, :b, :b, 3, 5, 5] }
|
1864
1864
|
its(:'c.to_a') { is_expected.to eq ['a', 'a', 3, 4, 3, 5, 5, 7] }
|
@@ -1866,7 +1866,7 @@ describe Daru::DataFrame do
|
|
1866
1866
|
|
1867
1867
|
context 'rolling_fillna! backwards' do
|
1868
1868
|
before { subject.rolling_fillna!(:backward) }
|
1869
|
-
it {
|
1869
|
+
it { expect(subject.rolling_fillna!(:backward)).to eq(subject) }
|
1870
1870
|
its(:'a.to_a') { is_expected.to eq [1, 2, 3, 1, 1, 1, 1, 7] }
|
1871
1871
|
its(:'b.to_a') { is_expected.to eq [:a, :b, 3, 3, 3, 3, 5, 0] }
|
1872
1872
|
its(:'c.to_a') { is_expected.to eq ['a', 3, 3, 4, 3, 5, 7, 7] }
|
@@ -3266,6 +3266,18 @@ describe Daru::DataFrame do
|
|
3266
3266
|
end
|
3267
3267
|
end
|
3268
3268
|
|
3269
|
+
context "group_by" do
|
3270
|
+
context "on a single row DataFrame" do
|
3271
|
+
let(:df){ Daru::DataFrame.new(city: %w[Kyiv], year: [2015], value: [1]) }
|
3272
|
+
it "returns a groupby object" do
|
3273
|
+
expect(df.group_by([:city])).to be_a(Daru::Core::GroupBy)
|
3274
|
+
end
|
3275
|
+
it "has the correct index" do
|
3276
|
+
expect(df.group_by([:city]).groups).to eq({["Kyiv"]=>[0]})
|
3277
|
+
end
|
3278
|
+
end
|
3279
|
+
end
|
3280
|
+
|
3269
3281
|
context "#vector_sum" do
|
3270
3282
|
before do
|
3271
3283
|
a1 = Daru::Vector.new [1, 2, 3, 4, 5, nil, nil]
|
@@ -4032,7 +4044,7 @@ describe Daru::DataFrame do
|
|
4032
4044
|
Daru::DataFrame.new({num: [52,12,07,17,01]}, index: cat_idx) }
|
4033
4045
|
|
4034
4046
|
it 'lambda function on particular column' do
|
4035
|
-
expect(df.aggregate(num_100_times: ->(df) { df.num*100 })).to eq(
|
4047
|
+
expect(df.aggregate(num_100_times: ->(df) { (df.num*100).first })).to eq(
|
4036
4048
|
Daru::DataFrame.new(num_100_times: [5200, 1200, 700, 1700, 100])
|
4037
4049
|
)
|
4038
4050
|
end
|
@@ -4043,6 +4055,34 @@ describe Daru::DataFrame do
|
|
4043
4055
|
end
|
4044
4056
|
end
|
4045
4057
|
|
4058
|
+
context '#group_by_and_aggregate' do
|
4059
|
+
let(:spending_df) {
|
4060
|
+
Daru::DataFrame.rows([
|
4061
|
+
[2010, 'dev', 50, 1],
|
4062
|
+
[2010, 'dev', 150, 1],
|
4063
|
+
[2010, 'dev', 200, 1],
|
4064
|
+
[2011, 'dev', 50, 1],
|
4065
|
+
[2012, 'dev', 150, 1],
|
4066
|
+
|
4067
|
+
[2011, 'office', 300, 1],
|
4068
|
+
|
4069
|
+
[2010, 'market', 50, 1],
|
4070
|
+
[2011, 'market', 500, 1],
|
4071
|
+
[2012, 'market', 500, 1],
|
4072
|
+
[2012, 'market', 300, 1],
|
4073
|
+
|
4074
|
+
[2012, 'R&D', 10, 1],],
|
4075
|
+
order: [:year, :category, :spending, :nb_spending])
|
4076
|
+
}
|
4077
|
+
|
4078
|
+
it 'works as group_by + aggregate' do
|
4079
|
+
expect(spending_df.group_by_and_aggregate(:year, {spending: :sum})).to eq(
|
4080
|
+
spending_df.group_by(:year).aggregate(spending: :sum))
|
4081
|
+
expect(spending_df.group_by_and_aggregate([:year, :category], spending: :sum, nb_spending: :size)).to eq(
|
4082
|
+
spending_df.group_by([:year, :category]).aggregate(spending: :sum, nb_spending: :size))
|
4083
|
+
end
|
4084
|
+
end
|
4085
|
+
|
4046
4086
|
context '#create_sql' do
|
4047
4087
|
let(:df) { Daru::DataFrame.new({
|
4048
4088
|
a: [1,2,3],
|
data/spec/fixtures/string_converter_test.csv
ADDED
@@ -0,0 +1,5 @@
|
|
1
|
+
ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,Beat,District,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
|
2
|
+
8517337,094652,03/12/2012 02:00:00 PM,027XX S HAMLIN AVE,1152,DECEPTIVE PRACTICE,ILLEGAL USE CASH CARD,ATM (AUTOMATIC TELLER MACHINE),false,true,1031,010,22,30,11,1151482,1885517,2012,02/04/2016 06:33:39 AM,41.841738053,-87.719605942,"(41.841738053, -87.719605942)"
|
3
|
+
8517338,194241,03/06/2012 10:49:00 PM,102XX S VERNON AVE,0917,MOTOR VEHICLE THEFT,"CYCLE, SCOOTER, BIKE W-VIN",STREET,false,false,0511,005,9,49,07,1181052,1837191,2012,02/04/2016 06:33:39 AM,41.708495677,-87.612580474,"(41.708495677, -87.612580474)"
|
4
|
+
8517339,194563,02/01/2012 08:15:00 AM,003XX W 108TH ST,0460,BATTERY,SIMPLE,"SCHOOL, PRIVATE, BUILDING",false,false,0513,005,34,49,08B,1176016,1833309,2012,02/04/2016 06:33:39 AM,41.6979571,-87.631138505,"(41.6979571, -87.631138505)"
|
5
|
+
8517340,194531,03/12/2012 05:50:00 PM,089XX S CARPENTER ST,0560,ASSAULT,SIMPLE,STREET,false,false,2222,022,21,73,08A,1170886,1845421,2012,02/04/2016 06:33:39 AM,41.731307475,-87.649569675,"(41.731307475, -87.649569675)"
|
data/spec/index/multi_index_spec.rb
CHANGED
@@ -202,8 +202,16 @@ describe Daru::MultiIndex do
|
|
202
202
|
expect(@multi_mi.include?([:a, :one])).to eq(true)
|
203
203
|
end
|
204
204
|
|
205
|
-
it "checks for non-existence of
|
206
|
-
expect(@multi_mi.include?([:
|
205
|
+
it "checks for non-existence of completely specified tuple" do
|
206
|
+
expect(@multi_mi.include?([:b, :two, :foo])).to eq(false)
|
207
|
+
end
|
208
|
+
|
209
|
+
it "checks for non-existence of a top layer incomplete tuple" do
|
210
|
+
expect(@multi_mi.include?([:d])).to eq(false)
|
211
|
+
end
|
212
|
+
|
213
|
+
it "checks for non-existence of a middle layer incomplete tuple" do
|
214
|
+
expect(@multi_mi.include?([:c, :three])).to eq(false)
|
207
215
|
end
|
208
216
|
end
|
209
217
|
|
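Note on the spec changes above: the new `include?` cases cover a fully specified tuple and incomplete tuples at the top and middle layers. The following is a minimal plain-Ruby sketch of prefix-style membership, not daru's implementation. The `TUPLES` data and the `prefix_include?` helper are hypothetical and exist only for illustration.

```ruby
# Hypothetical tuples standing in for a three-level index
# (this is not the fixture used by the spec).
TUPLES = [
  [:a, :one, :bar],
  [:a, :two, :baz],
  [:b, :two, :bar],
  [:c, :one, :foo]
].freeze

# True when `query` equals the leading entries of at least one tuple,
# so both complete and incomplete tuples can be checked the same way.
def prefix_include?(tuples, query)
  tuples.any? { |tuple| tuple.first(query.size) == query }
end

puts prefix_include?(TUPLES, [:a, :one])        # incomplete tuple that exists
puts prefix_include?(TUPLES, [:b, :two, :foo])  # complete tuple that does not exist
puts prefix_include?(TUPLES, [:d])              # unknown top layer
puts prefix_include?(TUPLES, [:c, :three])      # unknown middle layer
```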
data/spec/io/io_spec.rb
CHANGED
@@ -51,6 +51,16 @@ describe Daru::IO do
|
|
51
51
|
expect(df['Domestic'].to_a).to all be_boolean
|
52
52
|
end
|
53
53
|
|
54
|
+
it "uses the custom string converter correctly" do
|
55
|
+
df = Daru::DataFrame.from_csv 'spec/fixtures/string_converter_test.csv', converters: [:string]
|
56
|
+
expect(df['Case Number'].to_a.all? {|x| String === x }).to be_truthy
|
57
|
+
end
|
58
|
+
|
59
|
+
it "allow symbol to converters option" do
|
60
|
+
df = Daru::DataFrame.from_csv 'spec/fixtures/boolean_converter_test.csv', converters: :boolean
|
61
|
+
expect(df['Domestic'].to_a).to all be_boolean
|
62
|
+
end
|
63
|
+
|
54
64
|
it "checks for equal parsing of local CSV files and remote CSV files" do
|
55
65
|
%w[matrix_test repeated_fields scientific_notation sales-funnel].each do |file|
|
56
66
|
df_local = Daru::DataFrame.from_csv("spec/fixtures/#{file}.csv")
|
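Note on the spec changes above: the two new examples pass `converters:` once as an Array (`[:string]`) and once as a single Symbol (`:boolean`). One common way to accept both shapes is to normalize the option with `Array()`. The sketch below assumes that approach and is not taken from daru's source. The `CONVERTERS` table and the `apply_converters` helper are hypothetical.

```ruby
# Hypothetical converter table; the lambdas are illustrative only.
CONVERTERS = {
  boolean: ->(field) { { 'true' => true, 'false' => false }.fetch(field, field) },
  string:  ->(field) { field.to_s }
}.freeze

# Array() wraps a lone Symbol and leaves an Array untouched,
# so callers may pass either `:boolean` or `[:boolean]`.
def apply_converters(field, converters)
  Array(converters).reduce(field) do |value, name|
    CONVERTERS.fetch(name).call(value)
  end
end

p apply_converters('true', :boolean)
p apply_converters('false', [:boolean])
p apply_converters('094652', [:string])
```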
data/spec/vector_spec.rb
CHANGED
@@ -1808,6 +1808,22 @@ describe Daru::Vector do
|
|
1808
1808
|
end
|
1809
1809
|
end
|
1810
1810
|
|
1811
|
+
context '#rolling_fillna' do
|
1812
|
+
subject do
|
1813
|
+
Daru::Vector.new(
|
1814
|
+
[Float::NAN, 2, 1, 4, nil, Float::NAN, 3, nil, Float::NAN]
|
1815
|
+
)
|
1816
|
+
end
|
1817
|
+
|
1818
|
+
context 'rolling_fillna forwards' do
|
1819
|
+
it { expect(subject.rolling_fillna(:forward).to_a).to eq [0, 2, 1, 4, 4, 4, 3, 3, 3] }
|
1820
|
+
end
|
1821
|
+
|
1822
|
+
context 'rolling_fillna backwards' do
|
1823
|
+
it { expect(subject.rolling_fillna(direction: :backward).to_a).to eq [2, 2, 1, 4, 3, 3, 3, 0, 0] }
|
1824
|
+
end
|
1825
|
+
end
|
1826
|
+
|
1811
1827
|
context "#type" do
|
1812
1828
|
before(:each) do
|
1813
1829
|
@numeric = Daru::Vector.new([1,2,3,4,5])
|
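Note on the spec changes above: the expected arrays for `rolling_fillna` show missing values (`nil` and `Float::NAN`) taking the nearest preceding value when filling forward, or the nearest following value when filling backward, with `0` used where no such value exists. The following plain-Ruby sketch reproduces that behavior for illustration. It is not daru's implementation, and the `missing?` and `rolling_fill` helpers are hypothetical names.

```ruby
# Treat nil and Float::NAN as missing; Integers do not define #nan?,
# hence the respond_to? guard.
def missing?(value)
  value.nil? || (value.respond_to?(:nan?) && value.nan?)
end

# Carry the last seen value over missing entries. Entries with no
# earlier value (in the chosen direction) become 0.
def rolling_fill(values, direction = :forward)
  ordered = direction == :backward ? values.reverse : values
  last = 0
  filled = ordered.map do |value|
    missing?(value) ? last : (last = value)
  end
  direction == :backward ? filled.reverse : filled
end

data = [Float::NAN, 2, 1, 4, nil, Float::NAN, 3, nil, Float::NAN]
p rolling_fill(data, :forward)   # [0, 2, 1, 4, 4, 4, 3, 3, 3]
p rolling_fill(data, :backward)  # [2, 2, 1, 4, 3, 3, 3, 0, 0]
```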
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: daru
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.2.
|
4
|
+
version: 0.2.1
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Sameer Deshmukh
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2018-07-02 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: backports
|
@@ -532,6 +532,7 @@ files:
|
|
532
532
|
- spec/fixtures/repeated_fields.csv
|
533
533
|
- spec/fixtures/sales-funnel.csv
|
534
534
|
- spec/fixtures/scientific_notation.csv
|
535
|
+
- spec/fixtures/string_converter_test.csv
|
535
536
|
- spec/fixtures/strings.dat
|
536
537
|
- spec/fixtures/test_xls.xls
|
537
538
|
- spec/fixtures/url_test.txt~
|
@@ -569,26 +570,7 @@ homepage: http://github.com/v0dro/daru
|
|
569
570
|
licenses:
|
570
571
|
- BSD-2
|
571
572
|
metadata: {}
|
572
|
-
post_install_message:
|
573
|
-
*************************************************************************
|
574
|
-
Thank you for installing daru!
|
575
|
-
|
576
|
-
oOOOOOo
|
577
|
-
,| oO
|
578
|
-
//| |
|
579
|
-
\\| |
|
580
|
-
`| |
|
581
|
-
`-----`
|
582
|
-
|
583
|
-
|
584
|
-
Hope you love daru! For enhanced interactivity and better visualizations,
|
585
|
-
consider using gnuplotrb and nyaplot with iruby. For statistics use the
|
586
|
-
statsample family.
|
587
|
-
|
588
|
-
Read the README for interesting use cases and examples.
|
589
|
-
|
590
|
-
Cheers!
|
591
|
-
*************************************************************************
|
573
|
+
post_install_message:
|
592
574
|
rdoc_options: []
|
593
575
|
require_paths:
|
594
576
|
- lib
|
@@ -604,7 +586,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
604
586
|
version: '0'
|
605
587
|
requirements: []
|
606
588
|
rubyforge_project:
|
607
|
-
rubygems_version: 2.6.
|
589
|
+
rubygems_version: 2.6.14
|
608
590
|
signing_key:
|
609
591
|
specification_version: 4
|
610
592
|
summary: Data Analysis in RUby
|
@@ -638,6 +620,7 @@ test_files:
|
|
638
620
|
- spec/fixtures/repeated_fields.csv
|
639
621
|
- spec/fixtures/sales-funnel.csv
|
640
622
|
- spec/fixtures/scientific_notation.csv
|
623
|
+
- spec/fixtures/string_converter_test.csv
|
641
624
|
- spec/fixtures/strings.dat
|
642
625
|
- spec/fixtures/test_xls.xls
|
643
626
|
- spec/fixtures/url_test.txt~
|