RubyGems - daru - Versions diffs - 0.0.5 → 0.1.0 - Mend

daru 0.0.5 → 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (48)

checksums.yaml +4 -4
data/.build.sh +14 -0
data/.travis.yml +26 -4
data/CONTRIBUTING.md +31 -0
data/Gemfile +1 -2
data/{History.txt → History.md} +110 -44
data/README.md +21 -288
data/Rakefile +1 -0
data/daru.gemspec +12 -8
data/lib/daru.rb +36 -1
data/lib/daru/accessors/array_wrapper.rb +8 -3
data/lib/daru/accessors/gsl_wrapper.rb +113 -0
data/lib/daru/accessors/nmatrix_wrapper.rb +6 -17
data/lib/daru/core/group_by.rb +0 -1
data/lib/daru/dataframe.rb +1192 -83
data/lib/daru/extensions/rserve.rb +21 -0
data/lib/daru/index.rb +14 -0
data/lib/daru/io/io.rb +170 -8
data/lib/daru/maths/arithmetic/dataframe.rb +4 -3
data/lib/daru/maths/arithmetic/vector.rb +4 -4
data/lib/daru/maths/statistics/dataframe.rb +48 -27
data/lib/daru/maths/statistics/vector.rb +215 -33
data/lib/daru/monkeys.rb +53 -7
data/lib/daru/multi_index.rb +21 -4
data/lib/daru/plotting/dataframe.rb +83 -25
data/lib/daru/plotting/vector.rb +9 -10
data/lib/daru/vector.rb +596 -61
data/lib/daru/version.rb +3 -0
data/spec/accessors/wrappers_spec.rb +51 -0
data/spec/core/group_by_spec.rb +0 -2
data/spec/daru_spec.rb +58 -0
data/spec/dataframe_spec.rb +768 -73
data/spec/extensions/rserve_spec.rb +52 -0
data/spec/fixtures/bank2.dat +200 -0
data/spec/fixtures/repeated_fields.csv +7 -0
data/spec/fixtures/scientific_notation.csv +4 -0
data/spec/fixtures/test_xls.xls +0 -0
data/spec/io/io_spec.rb +161 -24
data/spec/math/arithmetic/dataframe_spec.rb +26 -7
data/spec/math/arithmetic/vector_spec.rb +8 -0
data/spec/math/statistics/dataframe_spec.rb +16 -1
data/spec/math/statistics/vector_spec.rb +215 -47
data/spec/spec_helper.rb +21 -2
data/spec/vector_spec.rb +368 -12
metadata +99 -16
data/lib/version.rb +0 -3
data/notebooks/grouping_splitting_pivots.ipynb +0 -529
data/notebooks/intro_with_music_data_.ipynb +0 -303

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: fd2dec0795f15ca1e45bdad5238fb7dbe33e1089
-  data.tar.gz: 634ff6e6b533cad019893a6e248706c824933e1d
+  metadata.gz: 6e48778067b94afc9f1060d7d6d4212029b421f2
+  data.tar.gz: 5d0ed9cc2fcf70562e0fcf2767c593e1f8fbfa54
 SHA512:
-  metadata.gz: 2c4aed326afacb2fe2324dd720e302564ab973b7fe69e17daf8f4902fecf7a2bbe34a26b0681dc42eaef14bd511a439a2717a115a7f577f700212d0d605d6dee
-  data.tar.gz: be1bc452b188d233a6c668a008ed9f9e4cd77cf9b24a574559bf27c8c28ab34b0c23d51cc4321ab49c1416a53b0b74571afca698cdf2106d407f744204191362
+  metadata.gz: 778ad55b592865e08388eac0001cdbce6bc01f58fa77ed8e2a2b72e44d8a54fc2d289f241352affd98b6424326416634ef729889163175f9eb64c83e471fb7e2
+  data.tar.gz: c498252daf63597adc0255810d3eb7b60c102ef086117bf451de821faaa8e196933570b06f88e6415b51c1ef6ea2ee6f60afce18b65691483741158954c73d0b

data/.build.sh ADDED

@@ -0,0 +1,14 @@
+#!/bin/bash
+git clone https://github.com/SciRuby/nmatrix.git
+cd nmatrix
+gem build nmatrix.gemspec
+gem install nmatrix-0.1.0.gem
+cd ..
+rm -rf nmatrix
+git clone https://github.com/v0dro/gsl-nmatrix
+cd gsl-nmatrix
+gem build gsl-nmatrix.gemspec
+gem install gsl-nmatrix-1.17.gem
+cd ..
+rm -rf gsl-nmatrix

data/.travis.yml CHANGED

@@ -1,5 +1,27 @@
-language: ruby
+language:
+  ruby
+env:
+  - CPLUS_INCLUDE_PATH=/usr/include/atlas C_INCLUDE_PATH=/usr/include/atlas
 rvm:
-  - 1.9.3
-  - 2.0.0
-  - 2.1.1
+  - '2.0'
+  - '2.1'
+  - '2.2'
+matrix:
+  fast_finish:
+    true
+script: "bundle exec rspec"
+install:
+  - gem install bundler
+  - ./.build.sh
+  - bundle install
+before_install:
+  - sudo apt-get update -qq
+  - sudo apt-get install -qq libatlas-base-dev
+  - sudo apt-get install -y libgsl0-dev r-base r-base-dev
+  - sudo Rscript -e "install.packages(c('Rserve','irr'),,'http://cran.us.r-project.org')"

data/CONTRIBUTING.md CHANGED

@@ -0,0 +1,31 @@
+# Contributing guide
+## Installing daru development dependencies
+If you want to run the full rspec suite, you will need the latest unreleased nmatrix and gsl-nmatrix ruby gems. They will released upstream soon but please follow this procedure for now.
+Keep in mind that either nmatrix OR gsl-nmatrix are NOT NECESSARY for using daru. They are just required for an optional speed up.
+To install dependencies, execute the following commands:
+  `export CPLUS_INCLUDE_PATH=/usr/include/atlas`
+  `export C_INCLUDE_PATH=/usr/include/atlas`
+  `sudo apt-get update -qq`
+  `sudo apt-get install -qq libatlas-base-dev`
+  `sudo apt-get --purge remove liblapack-dev liblapack3 liblapack3gf`
+  `sudo apt-get install -y libgsl0-dev r-base r-base-dev`
+  `sudo Rscript -e "install.packages(c('Rserve','irr'),,'http://cran.us.r-project.org')"`
+Then execute the .build.sh script to clone and install the latest nmatrix and gsl-nmatrix on your system:
+  `./.build.sh`
+Then finally install remaining dependencies:
+  `bundle install`
+And run the test suite (should be all green with pending tests):
+  `bundle exec rspec`
+If you have problems installing nmatrix, please consult the [nmatrix installation wiki](https://github.com/SciRuby/nmatrix/wiki/Installation) or the [mailing list](https://groups.google.com/forum/#!forum/sciruby-dev).

data/Gemfile CHANGED

@@ -1,3 +1,2 @@
 source 'https://rubygems.org'
-gemspec
+gemspec

data/{History.txt → History.md} RENAMED

@@ -1,52 +1,74 @@
-== 0.0.1
-* Added classes for DataFrame and Vector alongwith some super-basic functions to get off the ground
-== 0.0.2
-* Added iterators for dataframe and vector alongwith printing functions (to_html) to interface properly with iRuby notebook.
-== 0.0.2.1
-* Fixed bugs with previous code and more iterators
-== 0.0.2.2
-* Added test cases and multiple column access through the [] operator on DataFrames
-== 0.0.2.3
-* Added #filter\_rows and #delete_row to DataFrame and changed #row to return a row containing a Hash of column name and value.
-* Vector objects passed into a DataFrame are now duplicated so that any changes dont affect the original vector.
-* Added an optional opts argument to DataFrame.
-* Sending more fields than vectors in DataFrame will cause addition of nil vectors.
-* Init a DataFrame without having to convert explicitly to vectors.
-== 0.0.2.4
-* Initialize dataframe from an array which looks like [{a: 10, b: 20}, {a: 11, b: 12}]. Works for parsed JSON.
-* Over-riding vectors in DataFrame will still preserve order.
-* Any re-assignment of rows in #each_row and #each_row_with_index will reflect in the DataFrame.
-* Added #to_a and #to_json to DataFrame.
+# 0.1.0
-== 0.0.3
-* This release is a complete rewrite of the entire gem to accomodate index values.
+* Fixes
+    - Update documentation and fix it in other places.
+    - Fix Vector#sum_of_squares and #ranked.
+    - Fixed some tests that were giving RSpec warnings
+    - Fixed a bug where nyaplot not being present would raise a warning.
+    - Fixed a bug in DataFrame row assignment.
+* Enhancements
+    - Wrote a proper .travis.yml
+    - Added optional GSL dependency gsl-nmatrix
+    - Added Marshalling and unMarshalling capabilities to Vector, Index and DataFrame.
+    - Added new method Daru::IO.load for loading data from files by marshalling.
+    - Lots of documentation and new notebooks.
+    - Added data loading and writing from and to CSV, Excel, plain text and SQL databases.
+    - Daru::DataFrame and Vector have now completely replaced Statsample::Dataset and Vector.
+    - Vector
+        - #center
+        - #standardize
+        - #vector_percentile
+        - Added a new wrapper class Daru::Accessors::GSLWrapper for wrapping around GSL::Vector, which works similarly to NMatrixWrapper or ArrayWrapper.
+        - Added a host of statistical methods to GSLWrapper in Daru::Accessors::GSLStatistics that call the relevant GSL::Vector functions for super-fast C level computations.
+        - More stats functions - #vector_standardized_compute, #vector_centered_compute, #sample_with_replacement, #sample_without_replacement
+        - #only_valid for creating a Vector with only non-nil data.
+        - #only_missing for creating a Vector of only missing data.
+        - #only_numeric to create Vector of only numerical data.
+        - Ported many Statsample::Vector stat methods to Daru::Vector. These are: #percentile, #factors, etc.
+        - Added .new_with_size for creating vectors by specifying a size for the
+        vector and a block for generating values.
+        - Added Vector#verify, #recode! and #recode.
+        - Added #save, #jackknife and #bootstrap.
+        - Added #missing_values= that will allow setting values for treating data as 'missing'.
+        - Added #split_by_separator, #split_by_separator_freq and #splitted.
+        - Added #reset_index!
+        - Added #any? and #all?
+        - Added #db_type for guessing the type of SQL type contained in the vector.
+        - Added and tested plotting support for histogram and box plot.
+    - DataFrame
+        - #dup_only_valid
+        - #clone, #clone_only_valid, #clone_structure
+        - #[]= does not clone the vector if it has the same index as the DataFrame.
+        - Added a :clone option to initialize that will not clone Daru::Vectors passed into the constructor.
+        - Added #save.
+        - Added #only_numerics.
+        - Added better iterators and changed some behaviour of previous ones to make them more ruby-like. New iterators are #map, #map!, #each, #recode and #collect.
+        - Added #vector_sum and #vector_mean.
+        - Added #to_gsl to convert to GSL::Matrix.
+        - Added #has_missing_data? and #missing_values_rows.
+        - Added #compute and #verify.
+        - Added .crosstab_by_assignation to generate data frame from row, column and value vectors.
+        - Added #filter_vector.
+        - Added #standardize and added argument option to #dup.
+        - Added #any? and #all? for vector and row axis.
+        - Better creation of empty data frames.
+        - Added #merge, #one_to_many, #add_vectors_by_split_recode
+        - Added constant SPLIT_TOKEN and methods #add_vectors_by_split, .[], #summary.
+        - Added #bootstrap.
+        - Added a #filter method to wrap around #filter_vectors and #filter_rows.
+        - Greatly improved plotting function.
+    - Added a lazy update feature that will allow users to delay updating the missing positions index until the last possible moment.
+    - Added interoperaility with rserve client which makes it possible to change daru data to R data and perform computation there.
+* Changes
+    - Changes Vector#nil_positions to Vector#missing_positions so that future changes for accomodating different values for missing data can be made easily.
+    - Changed History.txt to History.md
-== 0.0.3.1
-* Added aritmetic methods for vector aritmetic by taking the index of values into account.
-== 0.0.4
-* Added wrappers for Array, NMatrix and MDArray such that the external implementation is completely transparent of the data type being used internally.
-* Added statistics methods for vectors for ArrayWrapper. These are compatible with statsample methods.
-* Added plotting functions for DataFrame and Vector using Nyaplot.
-* Create a DataFrame by specifying the rows with the ".rows" class method.
-* Create a Vector from a Hash.
-* Call a Vector element by specfying the index name as a method call (method_missing logic).
-* Retrive multiple rows of a DataFrame by specfying a Range or an Array with multiple index names.
-* #head and #tail for DataFrame.
-* #uniq for Vector.
-* #max for Vector can return a Vector object with the index set to the index of the max value.
-* Tonnes of documentation for most methods.
-== 0.0.5
+# 0.0.5
 * Easy accessors for some methods
 * Faster CSV loading.
-* Changed vector #is\_valid? to #exists?
+* Changed vector #is_valid? to #exists?
 * Revamped dtype specifiers for Vector. Now specify :array/:nmatrix for changing underlying data implementation. Specigfy nm\_dtype for specifying the data type of the NMatrix object.
 * #sort for Vector. Quick sort algorithm with preservation of original indexes.
 * Removed #re\_index and #to\_index from Daru::Index.
@@ -75,4 +97,48 @@
 * Added #describe to DataFrame for producing multiple statistics data of numerical vectors in one shot.
 * Monkey patched Ruby Matrix to include #elementwise_division.
 * Added #covariance to calculate the covariance between numbers of a DataFrame and #correlation to calculate correlation.
-* Enumerators return Enumerator objects if there is no block.
+* Enumerators return Enumerator objects if there is no block.
+# 0.0.4
+* Added wrappers for Array, NMatrix and MDArray such that the external implementation is completely transparent of the data type being used internally.
+* Added statistics methods for vectors for ArrayWrapper. These are compatible with statsample methods.
+* Added plotting functions for DataFrame and Vector using Nyaplot.
+* Create a DataFrame by specifying the rows with the ".rows" class method.
+* Create a Vector from a Hash.
+* Call a Vector element by specfying the index name as a method call (method_missing logic).
+* Retrive multiple rows of a DataFrame by specfying a Range or an Array with multiple index names.
+* #head and #tail for DataFrame.
+* #uniq for Vector.
+* #max for Vector can return a Vector object with the index set to the index of the max value.
+* Tonnes of documentation for most methods.
+# 0.0.3.1
+* Added aritmetic methods for vector aritmetic by taking the index of values into account.
+# 0.0.3
+* This release is a complete rewrite of the entire gem to accomodate index values.
+# 0.0.2.4
+* Initialize dataframe from an array which looks like [{a: 10, b: 20}, {a: 11, b: 12}]. Works for parsed JSON.
+* Over-riding vectors in DataFrame will still preserve order.
+* Any re-assignment of rows in #each_row and #each_row_with_index will reflect in the DataFrame.
+* Added #to_a and #to_json to DataFrame.
+# 0.0.2.3
+* Added #filter\_rows and #delete_row to DataFrame and changed #row to return a row containing a Hash of column name and value.
+* Vector objects passed into a DataFrame are now duplicated so that any changes dont affect the original vector.
+* Added an optional opts argument to DataFrame.
+* Sending more fields than vectors in DataFrame will cause addition of nil vectors.
+* Init a DataFrame without having to convert explicitly to vectors.
+# 0.0.2.2
+* Added test cases and multiple column access through the [] operator on DataFrames
+# 0.0.2.1
+* Fixed bugs with previous code and more iterators
+# 0.0.2
+* Added iterators for dataframe and vector alongwith printing functions (to_html) to interface properly with iRuby notebook.
+# 0.0.1
+* Added classes for DataFrame and Vector alongwith some super-basic functions to get off the ground
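
Note on the 0.1.0 changelog entries `#center` and `#standardize`: the diff does not show their implementation, so the following is only a plain-Ruby sketch of what centering and standardizing a numeric series usually mean. The helper names (`mean`, `center`, `sample_sd`, `standardize`) are hypothetical and are not daru's API, and the use of the sample standard deviation (dividing by n - 1) is an assumption.

```ruby
# Plain-Ruby illustration only; these helpers are hypothetical, not daru methods.

# Arithmetic mean of a list of numbers.
def mean(values)
  values.sum.to_f / values.size
end

# Centering: subtract the mean from every value.
def center(values)
  m = mean(values)
  values.map { |v| v - m }
end

# Sample standard deviation (n - 1 in the denominator; an assumption here).
def sample_sd(values)
  m = mean(values)
  Math.sqrt(values.sum { |v| (v - m)**2 } / (values.size - 1))
end

# Standardizing: center the values, then divide by the standard deviation.
def standardize(values)
  m = mean(values)
  sd = sample_sd(values)
  values.map { |v| (v - m) / sd }
end

data = [2, 4, 4, 4, 5, 5, 7, 9]
centered = center(data)
standardized = standardize(data)
```

After centering, the values sum to zero; after standardizing, they also have a standard deviation of one, which puts series measured in different units on a common scale.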

data/README.md CHANGED

@@ -4,33 +4,45 @@ daru
 Data Analysis in RUby
 [![Gem Version](https://badge.fury.io/rb/daru.svg)](http://badge.fury.io/rb/daru)
+[![Build Status](https://travis-ci.org/v0dro/daru.svg)](https://travis-ci.org/v0dro/daru)
 ## Introduction
 daru (Data Analysis in RUby) is a library for storage, analysis, manipulation and visualization of data.
-daru is inspired by `Statsample::Dataset` and pandas, a very mature solution in Python.
+daru is inspired by pandas, a very mature solution in Python.
-Written in pure Ruby so should work with all ruby implementations.
+Written in pure Ruby so should work with all ruby implementations. Tested with MRI 2.0, 2.1, 2.2.
 ## Features
 * Data structures:
     - Vector - A basic 1-D vector.
-    - DataFrame - A 2-D table-like structure which is internally composed of named `Vectors`.
-* Compatible with [IRuby notebook](https://github.com/minad/iruby) and [statsample](https://github.com/clbustos/statsample).
+    - DataFrame - A 2-D spreadsheet-like structure for manipulating and storing data sets. This is daru's primary data structure.
+* Compatible with [IRuby notebook](https://github.com/SciRuby/iruby) and [statsample](https://github.com/SciRuby/statsample).
 * Singly and hierarchially indexed data structures.
 * Flexible and intuitive API for manipulation and analysis of data.
 * Easy plotting, statistics and arithmetic.
 * Plentiful iterators.
-* Optional speed and space optimization on MRI with [NMatrix](https://github.com/SciRuby/nmatrix).
+* Optional speed and space optimization on MRI with [NMatrix](https://github.com/SciRuby/nmatrix) and GSL.
 * Easy splitting, aggregation and grouping of data.
 * Quickly reducing data with pivot tables for quick data summary.
+* Import and exports dataset from and to Excel, CSV, Databases and plain text files.
 ## Notebooks
-* [Analysis and plotting of a data set comprising of music listening habits of a last.fm user](http://nbviewer.ipython.org/github/v0dro/daru/blob/master/notebooks/intro_with_music_data_.ipynb)
-* [Basic splitting, grouping and aggregating of data](http://nbviewer.ipython.org/github/v0dro/daru/blob/master/notebooks/grouping_splitting_pivots.ipynb)
+### Usage
+* [Basic Creation of Vectors and DataFrame](http://nbviewer.ipython.org/github/SciRuby/sciruby-notebooks/blob/master/Data%20Analysis/Creation%20of%20Vector%20and%20DataFrame.ipynb)
+* [Detailed Usage of Daru::Vector](http://nbviewer.ipython.org/github/SciRuby/sciruby-notebooks/blob/master/Data%20Analysis/Usage%20of%20Vector.ipynb)
+* [Detailed Usage of Daru::DataFrame](http://nbviewer.ipython.org/github/SciRuby/sciruby-notebooks/blob/master/Data%20Analysis/Usage%20of%20DataFrame.ipynb)
+* [Visualizing Data With Daru::DataFrame](http://nbviewer.ipython.org/github/SciRuby/sciruby-notebooks/blob/master/Visualization/Visualizing%20data%20with%20daru%20DataFrame.ipynb)
+* [Grouping, Splitting and Pivoting Data](http://nbviewer.ipython.org/github/SciRuby/sciruby-notebooks/blob/master/Data%20Analysis/Grouping%2C%20Splitting%20and%20Pivoting.ipynb)
+### Case Studies
+* [Logistic Regression Analysis with daru and statsample-glm](http://nbviewer.ipython.org/github/SciRuby/sciruby-notebooks/blob/master/Data%20Analysis/Logistic%20Regression%20with%20daru%20and%20statsample-glm.ipynb)
+* [Finding and Plotting most heard artists from a Last.fm dataset](http://nbviewer.ipython.org/github/SciRuby/sciruby-notebooks/blob/master/Data%20Analysis/Finding%20and%20plotting%20the%20most%20heard%20artists%20on%20last%20fm.ipynb)
 ## Blog Posts
@@ -41,295 +53,18 @@ Written in pure Ruby so should work with all ruby implementations.
 Docs can be found [here](https://rubygems.org/gems/daru).
-## Basic Usage
-#### Initialization of DataFrame
-A basic DataFrame can be initialized like this:
-```ruby
-df = Daru::DataFrame.new({b: [11,12,13,14,15], a: [1,2,3,4,5]}, order: [:a, :b], index: [:one, :two, :three, :four, :five])
-df
-# =>
-# # <Daru::DataFrame:87274040 @name = 7308c587-4073-4e7d-b3ca-3679d1dcc946 # @size = 5>
-#           a     b
-#   one     1    11
-#   two     2    12
-# three     3    13
-#  four     4    14
-#  five     5    15
-```
-Daru will automatically align the vectors correctly according to the specified index and then create the DataFrame. Thus, elements having the same index will show up in the same row. The indexes will be arranged alphabetically if vectors with unaligned indexes are supplied.
-The vectors of the DataFrame will be arranged according to the array specified in the (optional) second argument. Otherwise the vectors are ordered alphabetically.
-```ruby
-df = Daru::DataFrame.new({
-    b: [11,12,13,14,15].dv(:b, [:two, :one, :four, :five, :three]),
-    a:      [1,2,3,4,5].dv(:a, [:two,:one,:three, :four, :five])
-  }, order: [:a, :b]
-)
-df
-# =>
-# #<Daru::DataFrame:87363700 @name = 75ba0a14-8291-48ac-ac30-35017e4d6c5f # @size = 5>
-#           a     b
-#  five     5    14
-#  four     4    13
-#   one     2    12
-# three     3    15
-#   two     1    11
-```
-If an index for the DataFrame is supplied (third argument), then the indexes of the individual vectors will be matched to the DataFrame index. If any of the indexes do not match, nils will be inserted instead:
-```ruby
-df = Daru::DataFrame.new({
-    b: [11]                .dv(nil, [:one]),
-    a: [1,2,3]             .dv(nil, [:one, :two, :three]),
-    c: [11,22,33,44,55]    .dv(nil, [:one, :two, :three, :four, :five]),
-    d: [49,69,89,99,108,44].dv(nil, [:one, :two, :three, :four, :five, :six])
-  }, order: [:a, :b, :c, :d], index: [:one, :two, :three, :four, :five, :six])
-df
-# =>
-# #<Daru::DataFrame:87523270 @name = bda4eb68-afdd-4404-9981-708edab14201  #@size = 6>
-#           a     b     c     d
-#   one     1    11    11    49
-#   two     2   nil    22    69
-# three     3   nil    33    89
-#  four   nil   nil    44    99
-#  five   nil   nil    55   108
-#   six   nil   nil   nil    44
-```
-If some of the supplied vectors do not contain certain indexes that are contained in other vectors, they are added to those vectors and the correspoding elements are set to `nil`.
-```ruby
-df = Daru::DataFrame.new({
-  b: [11,12,13,14,15].dv(:b, [:two, :one, :four, :five, :three]),
-  a: [1,2,3]         .dv(:a, [:two,:one,:three])
-}, order: [:a, :b])
-df
-#  =>
-# #<Daru::DataFrame:87612510 @name = 1e904c15-e095-4dce-bfdf-c07ee4d6e4a4 # @size = 5>
-#           a     b
-#  five   nil    14
-#  four   nil    13
-#   one     2    12
-# three     3    15
-#   two     1    11
-```
-#### Initialization of Vector
-The `Vector` data structure is also named and indexed. It accepts arguments name, source, index (in that order).
-In the simplest case it can be constructed like this:
-```ruby
-dv = Daru::Vector.new [1,2,3,4,5], name: ravan, index: [:ek, :don, :teen, :char, :pach]
-dv
-#  =>
-# #<Daru::Vector:87630270 @name = ravan @size = 5 >
-#     ravan
-#   ek    1
-#  don    2
-# teen    3
-# char    4
-# pach    5
-```
-Initializing a vector with indexes will insert nils in places where elements dont exist:
-```ruby
-dv = Daru::Vector.new [1,2,3], name: yoga, index: [0,1,2,3,4]
-dv
-#  =>
-# #<Daru::Vector:87890840 @name = yoga @size = 5 >
-#   y
-# 0 1
-# 1 2
-# 2 3
-# 3 nil
-# 4 nil
-```
-#### Basic Selection Operations
-Initialize a dataframe:
-```ruby
-df = Daru::DataFrame.new({
-  b: [11,12,13,14,15].dv(:b, [:two, :one, :four, :five, :three]),
-  a: [1,2,3,4,5].dv(:a, [:two,:one,:three, :four, :five])
-}, order: [:a, :b])
-#  =>
-# #<Daru::DataFrame:87455010 @name = b3d14e23-98c2-4741-a563-92e8f1fd0f13 # @size = 5>
-#           a     b
-#  five     5    14
-#  four     4    13
-#   one     2    12
-# three     3    15
-#   two     1    11
-```
-Select a row from a DataFrame:
-```ruby
-df.row[:one]
-#  =>
-# #<Daru::Vector:87432070 @name = one @size = 2 >
-#    one
-#  a  2
-#  b 12
-```
-A row or a vector is returned as a `Daru::Vector` object, so any manipulations supported by `Daru::Vector` can be performed on the chosen row as well.
-Select multiple rows with a Range and get a DataFrame in return:
-``` ruby
-df.row[1..3] # OR df.row[:four..:three]
-# =>
-#<Daru::DataFrame:85361520 @name = d6582f66-5a55-473e-ba57-cb2ba974da6a @size #= 3>
-#                    a          b
-#      four          4         13
-#       one          2         12
-#     three          3         15
-```
-Select a single vector:
-```ruby
-df.vector[:a] # or simply df.a
-#  =>
-# #<Daru::Vector:87454270 @name = a @size = 5 >
-#           a
-#  five     5
-#  four     4
-#   one     2
-# three     3
-#   two     1
-```
-Select multiple vectors and return a DataFrame in the specified order:
-```ruby
-df.vector[:b, :a]
-#  =>
-# #<Daru::DataFrame:87835960 @name = e80902cc-cff9-4b23-9eca-5da36ebc88a8 #   @size = 5>
-#           b     a
-#  five    14     5
-#  four    13     4
-#   one    12     2
-# three    15     3
-#   two    11     1
-```
-Keep/remove row according to a specified condition:
-```ruby
-df = df.filter_rows do |row|
-  row[:a] == 5
-end
-df
-#  =>
-# #<Daru::DataFrame:87455010 @name = b3d14e23-98c2-4741-a563-92e8f1fd0f13 # @size = 1>
-#         a    b
-# five    5   14
-```
-The same can be applied to vectors using `filter_vectors`.
-To change the values of a row/vector while iterating through the DataFrame, use `map_rows` or `map_vectors`:
-```ruby
-df.map_rows do |row|
-  row = row * row
-end
-df
-#  =>
-# #<Daru::DataFrame:86826830 @name = b092ca5b-7b83-4dbe-a469-124f7f25a568 # @size = 5>
-#           a     b
-#  five    25   196
-#  four    16   169
-#   one     4   144
-# three     9   225
-#   two     1   121
-```
-#### Basic Maths Operations
-Performing a binary arithmetic operation on two `Daru::Vector` objects will return a `Vector` object in which the operation will be performed on elements of the same index.
-```ruby
-dv1 = Daru::Vector.new [1,2,3,4], name: :boozy, index: [:a, :b, :c, :d]
-dv2 = Daru::Vector.new [1,2,3,4], name: :mayer, index: [:e, :f, :b, :d]
-dv1 * dv2
-# #<Daru::Vector:80924700 @name = boozy @size = 2 >
-#         boozy
-#      b      6
-#      d     16
-```
-Arithmetic operators applied on a single Numeric will perform the operation with that number against the entire vector.
-Same applies to DataFrame as well.
-#### Splitting and aggregation of data
-`Daru::DataFrame` provides the `#group_by` method to split or aggregate data. Its very similar to SQL GROUP BY. Check the [blog post]() for details.
-You can also generate Excel-style pivot tables with `#pivot_table`.
-#### Plotting
-daru uses [Nyaplot](https://github.com/domitry/nyaplot) for plotting and an example of this can be found in the [notebook](http://nbviewer.ipython.org/github/v0dro/daru/blob/master/notebooks/intro_with_music_data_.ipynb) or [blog post](http://v0dro.github.io/blog/2014/11/25/data-analysis-in-ruby-basic-data-manipulation-and-plotting/).
-Head over to the tutorials and notebooks listed above for more examples.
-#### Working with missing data
-Missing data is an integral part of any data analysis operation and [this blog post](http://v0dro.github.io/blog/2015/02/24/data-analysis-in-ruby-part-2/) provides details on dealing with missing data.
 ## Roadmap
 * Automate testing for both MRI and JRuby.
 * Enable creation of DataFrame by only specifying an NMatrix/MDArray in initialize. Vector naming happens automatically (alphabetic) or is specified in an Array.
-* Destructive map iterators for DataFrame.
 * Completely test all functionality for MDArray.
 * Basic Data manipulation and analysis operations:
-    - Different kinds of join operations
-    - Dataframe/vector merge (left, right, inner, outer)
-    - Verification of data in a vector
     - DF concat
 * Option to express a DataFrame as an NMatrix or MDArray so as to use more efficient storage techniques.
 * Assignment of a column to a single number should set the entire column to that number.
 * == between daru_vector and string/number.
 * Multiple column assignment with []=
 * Multiple value assignment for vectors with []=.
-* Load DataFrame from multiple sources (excel, SQL, etc.).
-* Deletion of elements from Vector should only modify the index and leave the vector as it is so that compacting is not needed and things are faster.
 * #find\_max function which will evaluate a block and return the row for the value of the block is max.
 * Function to check if a value of a row/vector is within a specified range.
 * Create a new vector in map_rows if any of the already present rows dont match the one assigned in the block.
@@ -338,19 +73,17 @@ Missing data is an integral part of any data analysis operation and [this blog p
 * Cumulative sum.
 * Time series support.
 * Calculate percentage change.
-* Working with missing data - drop\_missing\_data, dropping rows with missing data.
 * Have some sample data sets for users to play around with. Should be able to load these from the code itself.
 * Sorting with missing data present.
-* Make vectors aware of the data frame that they are a part of.
 * re_index should re establish previous index values in the newly supplied index.
-* Reset index.
 ## Contributing
-Pick a feature from the Roadmap above or think of your own and send me a Pull Request!
+Pick a feature from the Roadmap or the issue tracker or think of your own and send me a Pull Request!
 ## Acknowledgements
+* Google and the Ruby Science Foundation for the Google Summer of Code 2015 grant for further developing daru and integrating it with other ruby gems.
 * Thank you [last.fm](http://www.last.fm/) for making user data accessible to the public.
 Copyright (c) 2015, Sameer Deshmukh
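
Note on the removed "Basic Maths Operations" README text: it states that a binary operation on two vectors is applied to elements sharing the same index. A minimal plain-Ruby sketch of that alignment rule, using Hashes in place of `Daru::Vector` (the helper `aligned_multiply` is hypothetical, not part of daru), with the same data as the removed README example:

```ruby
# Plain-Ruby illustration only; Hashes stand in for indexed vectors.

# Multiply two index => value mappings, keeping only the indexes
# present in both, as the removed README example describes.
def aligned_multiply(left, right)
  (left.keys & right.keys).each_with_object({}) do |key, result|
    result[key] = left[key] * right[key]
  end
end

dv1 = { a: 1, b: 2, c: 3, d: 4 }
dv2 = { e: 1, f: 2, b: 3, d: 4 }

product = aligned_multiply(dv1, dv2)
puts product.inspect
```

Only `:b` and `:d` appear in both inputs, so the result holds `b: 6` and `d: 16`, matching the output shown in the removed README example.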