RubyGems - dreader - Versions diffs - 1.2.0 → 1.2.1 - Mend

dreader 1.2.0 → 1.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

checksums.yaml +4 -4
data/CHANGELOG.org +7 -12
data/README.org +54 -52
data/dreader.gemspec +2 -2
data/examples/age/ages.txt +10 -0
data/examples/template/birthdays.xlsx +0 -0
data/lib/dreader/column.rb +2 -0
data/lib/dreader/engine.rb +52 -88
data/lib/dreader/options.rb +10 -0
data/lib/dreader/util.rb +21 -0
data/lib/dreader/version.rb +3 -1
data/lib/dreader.rb +2 -1
metadata +7 -16
data/examples/age_csv/Birthdays-TabSeparated.csv +0 -13
data/examples/age_csv/Birthdays.csv +0 -13
data/examples/age_csv/age.rb +0 -55
data/examples/age_noext/Birthdays +0 -0
data/examples/age_noext/Birthdays-xlsx +0 -0
data/examples/age_noext/Birthdays-xlsx-with-wrong-extension.xls +0 -0
data/examples/age_noext/age.rb +0 -73
data/examples/wikipedia_us_cities/us_cities_reject.rb +0 -77

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 3c30be2fe49c6c8ce20d4930c75f1279ac1a92a099f609b1266b14dc61c7cf3c
-  data.tar.gz: 58c735a67c45ef11a180bc6f17892ba912656d346320da1a716caa69661f4695
+  metadata.gz: 520ff1f682a1b747037ccc7fb1aa13d0619dad6118690552897471cbaf53a580
+  data.tar.gz: 10feef9edfc5511527aecbbcc297ecdb2a05b3f2f8f0268038bc41a55974bf17
 SHA512:
-  metadata.gz: 7847892dbcf648432a9c51867fd70e1260e956e82ea7cbbad93f92882dd867be6f36f67a0edfbb972e16e12c1364efe92a54bd03d31801770da4b28fac725350
-  data.tar.gz: a717955a2eaa0c406d6fb140daf9cd084c11e0d4710289de34b7757b5fd4f4e920ded4fd1450306e3e42a42278c983e078ff8e53248fe0f5b7a93d09fb8a9d40
+  metadata.gz: 80e982f26d152b30ff25d57180d97139f97d40423ab623754d7c514a42f5e69fbd06f7f882fe85034880f7a3dd4a48e89c63b362dedd5f9dd2e955c4125036f0
+  data.tar.gz: 0df7a5c61ce2a4f72f076fdfa44a2487bd22d1f1b86862aba4cf6b3f858eb5d632a4237cd85be50f262726cf2001d98a6842c1bda157cab8046b93b60e2f7061

data/CHANGELOG.org CHANGED

@@ -1,19 +1,14 @@
 #+TITLE: Changelog
-* Version 1.2.0 - <2023-11-02 Thu>
-** reject declaration
+* Version 1.2.1 - <2025-08-26 Tue>
-   - A new reject declaration allows to reject some lines.  reject takes as
-     input a row and can predicate over columns and virtual columns.  When
-     true, the corresponding line is discarded.
+- Back to Github
+- Fixes
+* Version 1.2.0 - <2023-12-29 Fri>
-* Version 1.1.2 - <2023-10-31 Tue>
-** Fixes an issue with the :extension option
-   - Fixes a bug related to =:extension= and adds a working example, to test
-     the feature
-   - Changes the extension from a string to a symbol. No initial dot required
-     any longer
+** Adds support for type in columns
+** (Developer) Removes Rubocop Warning
 * Version 1.1.1 - <2023-10-16 Mon>
 ** Adds option :extension

data/README.org CHANGED

@@ -137,8 +137,7 @@ To write an import function with Dreader:
   and check parsed data
 - Add virtual columns, that is, columns computed from other values
   in the row
-- Specify what lines you want to reject, if any
-- Specify how to transform lines. This is where you do the actual work
+- Specify how to map line. This is where you do the actual work
   (for instance, if you process a file line by line) or put together data for
   processing after the file has been fully read --- see the next step.
@@ -166,13 +165,12 @@ Require =dreader= and declare a class which extends =Dreader::Engine=:
   end
 #+END_EXAMPLE
-Specify parsing option in the class, using the following syntax:
+In the class specify parsing option, using the following syntax:
 #+BEGIN_EXAMPLE ruby
   options do
     filename 'example.ods'
-    # this optional. Use it when the file does not have an extension
-    extension :ods
+    extension ".ods"
     sheet 'Sheet 1'
@@ -192,10 +190,10 @@ where:
   to supply a filename when loading the file (see =read=, below).  *Use
   =.tsv= for tab-separated files.*
 - (optional) =extension= overrides or specify the extension of =filename=.
-  Takes as input a symbol (e.g., =:xlsx=).
-  Notice that **value of this option is not appended to filename** (see =read=
-  below).  Filename must thus be a valid reference to a file in the file
-  system. This option is useful in one of these two circumstances:
+  Takes as input the extension preceded by a "." (e.g., ".xlsx").  Notice that
+  **value of this option is not appended to filename** (see =read= below).
+  Filename must thus be a valid reference to a file in the file system. This
+  option is useful in one of these two circumstances:
   1. When =filename= has no extension
   2. When you want to override the extension of the filename, e.g., to force
      reading a "file.csv" as a tab separated file
@@ -205,6 +203,10 @@ where:
   will rely on =roo= to determine the last row.  This is useful for
   those files in which you only want to process some of the content or
   contain "garbage" after the records.
+- (optional) =date_format= specifies the date format, using the notation
+  understood by =strptime=.  It is used only when the column declaration
+  contains a type specification (e.g., the column declaration of one or more
+  columns is in the form =[<column>, :date]=
 - (optional) =sheet= is the sheet name or number to read from. If not
   specified, the first (default) sheet is used
 - (optional) =debug= specifies that we are debugging
@@ -244,18 +246,43 @@ There are two notations:
 The reference to a column can either be a letter or a number. First column
 is ='A'= or =1=.
-The =column= declaration can contain Ruby blocks:
+Optionally, the reference to the column can be an array.  In this case, the
+first element of the array is the reference to the column and the second
+argument being its type, that is, any of =:integer, :float,
+:big_decimal, :date=:
-- one or more =check_raw= block check raw data as read from the input
-  file. They can be used, for instance, to verify presence of a value in the
-  input file.  *Check must return true if there are no errors; any other
-  value (e.g. an array of messages) is considered an error.*
+#+begin_example ruby
+  # First notation, colref is put in the block
+  column({ name: ['A', :date] })
+#+end_example
+The effect of this declaration is introducing a =process= directive which
+takes care of converting the input into the declared type.  That is, the
+notation above is a shortcut for:
+#+begin_example ruby
+  # First notation, colref is put in the block
+  column({ name: 'A' } do
+    process { |value| Date.strptime(value, <the value of the option date_format>) }
+  end
+#+end_example
+The =column= declaration can contain various Ruby blocks:
+- one or more =check_raw= block.  The =check_raw= blocks are run in sequence,
+  to check data as read from the input file. They can be used, for instance,
+  to verify presence of a value in the input file.
+  *Check must return true if there are no errors: any other value (e.g. an
+  array of messages) is considered an error.*
 - =process= can be used to transform data into something closer to the input
   data required for the importing (e.g., it can be used for downcase or
   strip a string)
-- one or more =check= block perform a check on the =process=ed data, to check
-  for errors. They can be used, for instance, to check that a model built with
-  =process= is valid.  *Check must return true if there are no errors.*
+- one or more =check= block. The =check= blocks are run in sequence on the
+  processed data (that is the output of =process=, to check for errors. They
+  can be used, for instance, to check that a model built with =process= is
+  valid.
+  *Check must return true if there are no errors: any other value
+  (e.g. an array of messages) is considered an error.*
 #+begin_example
   column({ name: 'A' }) do
@@ -266,9 +293,9 @@ The =column= declaration can contain Ruby blocks:
 #+end_example
 #+begin_quote
-  *If you declare more than a check block of the same type per column, use a
-  unique symbol to distinguish the blocks or the error messages will be
-  overwritten*.
+ If you declare more than a check block of the same type per column, use a
+ unique symbol to distinguish the blocks or the error messages will be
+ overwritten.
 #+end_quote
 #+begin_example
@@ -399,10 +426,6 @@ See [[file:examples/wikipedia_us_cities/us_cities_bulk_declare.rb][us_cities_bul
   hash from the code block.
 #+END_NOTES
-The data read from each row of our input data is stored in a hash. The hash
-uses column names as the primary key and stores the values in the =:value=
-key.
 *** Add virtual columns
 Sometimes it is convenient to aggregate or otherwise manipulate the data
@@ -431,22 +454,6 @@ Virtual columns are, of course, available to the =mapping= directive
 (see below).
-*** Specify which lines to reject
-You can reject some lines using the =reject= declaration, which is applied row
-by row, can predicate over columns and virtual columns, and has to return a
-Boolean value.
-All lines returning a truish value will be be rejected, that is, not stored in
-the =@table= variable (and, consequently, passed to the mapping function).
-For instance, the following declaration rejects all lines in which the
-population column is higher than =3_000_000=:
-#+begin_src ruby
-  reject { |row| row[:population][:value] > 3_000_000 }
-#+end_src
 *** Specify how to process each line
 The =mapping= directive specifies what to do with each line read.  The
@@ -462,9 +469,10 @@ value of column =:age= and prints them to standard output
   end
 #+END_EXAMPLE
-To invoke the =mapping= declaration on a file, use the =mappings= method,
-which invokes =map= to each row and it stores in the =@table= variable
-whatever value mapping returns.
+The data read from each row of our input data is stored in a hash. The hash
+uses column names as the primary key and stores the values in the =:value=
+key.
 *** Process data
@@ -484,8 +492,8 @@ A typical scenario works as follows:
   # examples:
   # i.read
   # i.read filename: "example.ods"
-  # i.read filename: "example.ods", extension: :ods
-  # i.read filename: "example", extension: :ods
+  # i.read filename: "example.ods", extension: ".ods"
+  # i.read filename: "example", extension: ".ods"
   # (the line above opens the file "example" as an Open Document Spreasdheet)
   i.read
@@ -520,13 +528,7 @@ A typical scenario works as follows:
 (Optionally: check again for errors.)
 5. Add your own code to process the data returned after =mappings=, which you
-   can assign to a variable (e.g., =returned_data = i.mappings=) or access
-   with =i.table= or =i.data= (synonyms).
-#+begin_quote
-Notice that =mappings= does a side effect and invoking the mapping twice in a
-row won't work: you need to reload the file first.
-#+end_quote
+   can access with =i.table= or =i.data= (synonyms).
 Look in the examples directory for further details and a couple of working
 examples.
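
Review note on the =extension= change documented in this README diff: the option is now a dotted string, and the =open_spreadsheet= hunk in =engine.rb= falls back to =File.extname(filename)= when the option is absent. The following is a minimal sketch of that lookup under those assumptions. The helper name =resolve_extension= is hypothetical and is not part of dreader.

```ruby
# Hypothetical helper (not part of dreader). It mirrors the lookup
# `options[:extension] || File.extname(filename)` shown on the 1.2.1 side
# of the engine.rb hunk.
def resolve_extension(options)
  options[:extension] || File.extname(options[:filename])
end

resolve_extension(filename: "example.ods")                 # ".ods"
resolve_extension(filename: "example", extension: ".ods")  # ".ods"
resolve_extension(filename: "example")                     # "" (no extension)
resolve_extension(filename: "Birthdays.CSV")               # ".CSV" (case is kept)
```

As the hunk is written, the result is no longer downcased, so a file named =Birthdays.CSV= would not match the =".csv"= branch unless =extension: ".csv"= is passed explicitly.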

data/dreader.gemspec CHANGED

@@ -9,7 +9,7 @@ Gem::Specification.new do |spec|
   spec.authors       = ["Adolfo Villafiorita"]
   spec.email         = ["adolfo@shair.tech"]
-  spec.summary       = %q{Process and import data from cvs and spreadsheets}
+  spec.summary       = %q{Porcelain on top of Roo for declarative importing of CSV and spreadheet files}
   spec.description   = %q{Use this gem to specify the structure of some tabular data
 you want to process.  The input data can be in CSV, LibreOffice, and Excel.  Each row
 can then be passed to a block of code you define.
@@ -19,7 +19,7 @@ Rails application, but the gem can used in any Ruby application.
 The gem should be relatively easy to use, despite its name. (Dread
 stands for *d*ata *r*eader)}
-  spec.homepage      = "https://redmine.shair.tech/projects/dreader"
+  spec.homepage      = "https://https://github.com/avillafiorita/dreader"
   spec.license       = "MIT"
   spec.files         = `git ls-files -z`.split("\x0").reject do |f|

data/examples/age/ages.txt ADDED

@@ -0,0 +1,10 @@
+Forest Whitaker 61
+Daniel Day-Lewis 65
+Sean Penn 62
+Jeff Bridges 74
+Colin Firth 62
+Jean Dujardin 50
+Daniel Day-Lewis 65
+Matthew McConaughey 54
+Eddie Redmayne 40
+Leonardo DiCaprio 49

data/examples/template/birthdays.xlsx ADDED

Binary file

data/lib/dreader/column.rb CHANGED

@@ -1,3 +1,5 @@
+# frozen_string_literal:true
 module Dreader
   # service class to implement the column DSL language
   class Column

data/lib/dreader/engine.rb CHANGED

@@ -1,3 +1,5 @@
+# frozen_string_literal:true
 require "roo"
 require "logger"
 require "fast_excel"
@@ -10,6 +12,8 @@ module Dreader
   #
   # This is where the real stuff begins
   #
+  # TODO: FIX Metric?
+  # rubocop:disable Module/ModuleLength
   module Engine
     # the options we passed
     attr_accessor :declared_options
@@ -21,9 +25,7 @@ module Dreader
     attr_accessor :declared_virtual_columns
     # the mapping rules
     attr_accessor :declared_mapping
-    # the declared filter
-    attr_accessor :declared_reject
     # the data we read
     attr_reader :table
@@ -48,13 +50,13 @@ module Dreader
       @declared_columns ||= []
-      if name.instance_of?(Hash)
-        @declared_columns << column.to_hash.merge(
-          { name: name.keys.first, colref: name.values.first }
-        )
-      else
-        @declared_columns << column.to_hash.merge({ name: name })
-      end
+      @declared_columns << (
+        if name.instance_of?(Hash)
+          columns(name, &block)
+        else
+          column.to_hash.merge({ name: })
+        end
+      )
     end
     # define a DSL for multiple column specification (bulk_declare)
@@ -62,7 +64,7 @@ module Dreader
     # - hash is a hash in the form { symbolic_name: colref }
     #
     # i.bulk_declare {name: "B", age: "C"} is equivalent to:
-    #
+    #
     # i.column :name do
     #   colref "B"
     # end
@@ -91,9 +93,18 @@ module Dreader
     #   end
     # end
     def columns(hash, &block)
-      hash.each_key do |key|
+      hash.each do |key, value|
         column = Column.new
-        column.colref hash[key]
+        if value.instance_of?(Array)
+          column.colref value[0]
+          column.process do |string|
+            Util.convert(string, value[1], @declared_options)
+          end
+        else
+          column.colref value
+        end
         column.instance_eval(&block) if block
         @declared_columns ||= []
@@ -114,15 +125,10 @@ module Dreader
     # they are defined
     def virtual_column(name, &block)
       column = Column.new
-      column.instance_eval &block
+      column.instance_eval(&block)
       @declared_virtual_columns ||= []
-      @declared_virtual_columns << column.to_hash.merge({ name: name })
-    end
-    # define a filter, which skips some rows
-    def reject(&block)
-      @declared_reject = block
+      @declared_virtual_columns << column.to_hash.merge({ name: })
     end
     # define what we do with each line we read
@@ -194,13 +200,8 @@ module Dreader
         # this has side-effects on r
         virtual_columns_on(r) if options[:virtual] || options[:mapping]
-        # check whether the filter would ignore this line
-        # notice that we need to invoke compact to avoid nil being added
-        # to the table
-        next if !options[:ignore_reject] && reject?(r)
         options[:mapping] ? mappings_on(r) : r
-      end.compact
+      end
     end
     # TODO: PASS A ROW (and not row_number and sheet)
@@ -227,10 +228,10 @@ module Dreader
         coord = coord(row_number, colspec[:colref], cell)
         begin
           processed = colspec[:process] ? colspec[:process].call(cell) : cell
-          @logger.debug "[dreader] #{colname} process #{coord} yields '#{processed}' (#{processed.class})"
+          @logger.debug "[dreader] '#{colname}' process @ #{coord} yields '#{processed}' (#{processed.class})"
           r[colname][:value] = processed
         rescue => e
-          @logger.error "[dreader] #{colname} process #{coord} raises an exception"
+          @logger.error "[dreader] '#{colname}' process @ #{coord} raises an exception"
           raise e
         end
@@ -280,11 +281,10 @@ module Dreader
     # Compute virtual columns for, with side effect on row
     def virtual_columns_on(row)
-      @declared_virtual_columns ||= []
       @declared_virtual_columns.each do |virtualcol|
         colname = virtualcol[:name]
         row[colname] = { virtual: true }
         check_data(virtualcol[:checks_raw], row, colname, full_row: true)
         begin
@@ -304,36 +304,13 @@ module Dreader
       end
     end
-    # check whether a line has to be rejected
-    def reject?(row)
-      rejected = @declared_reject&.call(row)
-      if rejected
-        @logger.debug "[dreader] row rejected by reject declaration #{row}"
-      end
-    end
-    # apply the mapping code to the @table.  Notice that we do a side effect
-    # on @table and, hence, invoking the mapping twice won't work (you need to
-    # reload first).
-    #
-    # the mapping is applied only if it defined and it returns the output of
-    # the mapping.
-    #
-    # notice also that we do a side-effect on @table.  This is to make the
-    # behavior of
-    #
-    #   i.load mapping: true
-    #   i.table
+    # apply the mapping code to the array it makes sense to invoke it only
+    # once.
     #
-    # and
-    #
-    #   i = load;
-    #   i.mappings
-    #   i.table
-    #
-    # the same
+    # the mapping is applied only if it defined and it uses map, so that
+    # it can be used functionally
     def mappings
-      @table = @table.map { |row| mappings_on(row) }
+      @table.map { |row| mappings_on(row) }
     end
     def mappings_on(row)
@@ -431,49 +408,36 @@ module Dreader
     private
-    # list of keys we support in options. We remove them when reading
-    # the CSV file
-    OPTION_KEYS = %i[
-      filename extension sheet first_row last_row
-      logger logger_level
-      debug
-    ]
     def open_spreadsheet(options)
       filename = options[:filename]
-      # use the extension option or make ".CSV" into :csv
-      extension = options[:extension] || File.extname(filename).downcase[1..-1]&.to_sym
-      # TODO: MAKE DEBUG AND LOGGER INTO REAL CLASS VARIABLES OR MAKE LOCAL AND/OR FUNCTIONS
-      @debug = @declared_options.merge(options)[:debug] == true
-      if @debug
-        @logger = options[:logger] || Logger.new($stdout)
-        @logger.debug "[dreader open_spreadsheet] filename: #{filename}"
-        @logger.debug "[dreader open_spreadsheet] extension: #{extension}"
-      end
+      ext = options[:extension] || File.extname(filename)
-      case extension
-      when :csv
-        csv_options = @declared_options.except(*OPTION_KEYS)
+      case ext
+      when ".csv"
+        csv_options = @declared_options.except(*Options::NON_CSV_KEYS)
         Roo::CSV.new(filename, csv_options:)
-      when :tsv
-        csv_options = @declared_options.except(*OPTION_KEYS).merge({ col_sep: "\t" })
+      when ".tsv"
+        csv_options = @declared_options.except(*Options::NON_CSV_KEYS).merge({ col_sep: "\t" })
         Roo::CSV.new(filename, csv_options:)
-      when :ods, :xls, :xlsx
-        Roo::Spreadsheet.open(filename, extension:)
+      when ".ods"
+        Roo::OpenOffice.new(filename)
+      when ".xls"
+        Roo::Excel.new(filename)
+      when ".xlsx"
+        Roo::Excelx.new(filename)
       else
-        raise "Unknown extension: #{ext}. Use the :extension option."
+        raise "Unknown extension: #{ext}"
       end
     end
     def colref_to_i(colref)
       return colref if colref.instance_of?(Integer)
       value = 0
       power = 1
       colref.to_s.reverse.split("").map do |char|
-        value = value + power * (1 + char.ord - 'A'.ord)
-        power = power * 26
+        value += power * (1 + char.ord - 'A'.ord)
+        power *= power
       end
       value - 1
     end
@@ -496,7 +460,7 @@ module Dreader
     #
     # - debug :: a boolean
     def check_data(check_spec, hash, colname, full_row: false)
-      check_spec.each do |error_message, check_function|
+      (check_spec || []).each do |error_message, check_function|
         # here we extract values by distinguishing whether the hash is that of
         # column or that of a row
         if full_row
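
Review note on the =colref_to_i= hunk in this file: the 1.2.0 side multiplies =power= by 26 at each step, while the 1.2.1 side squares it. Because =power= starts at 1, squaring leaves it at 1, so multi-letter references such as ="AA"= would not be weighted by position. The following is a minimal standalone sketch of the base-26 variant, assuming zero-based indexes as in the hunk. The name =colref_to_index= is hypothetical and is not part of dreader.

```ruby
# Hypothetical standalone version of the column-reference conversion
# (not part of dreader). Letters are treated as base-26 digits and the
# result is zero-based: "A" -> 0, "Z" -> 25, "AA" -> 26.
# Integers are returned unchanged, as in the hunk.
def colref_to_index(colref)
  return colref if colref.is_a?(Integer)

  value = 0
  power = 1
  # Walk the letters from least to most significant.
  colref.to_s.reverse.each_char do |char|
    value += power * (1 + char.ord - "A".ord)
    power *= 26 # next letter is worth 26 times more
  end
  value - 1
end

colref_to_index("A")   # 0
colref_to_index("Z")   # 25
colref_to_index("AA")  # 26
colref_to_index("AB")  # 27
```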

data/lib/dreader/options.rb CHANGED

@@ -1,6 +1,16 @@
+# frozen_string_literal:true
 module Dreader
   # service class to implement the options DSL language
   class Options
+    # List of keys we support in options and which are not understood by the
+    # CSV reader
+    #
+    # We remove them when reading the CSV file
+    NON_CSV_KEYS = %i[
+      filename sheet first_row last_row logger logger_level date_format
+    ].freeze
     def initialize
       @attributes = {}
     end
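
Review note on =NON_CSV_KEYS=: the =engine.rb= hunk uses this list with =Hash#except= to strip dreader-specific keys before the remaining options are handed to the CSV reader. The following is a minimal sketch of that filtering. The =declared_options= hash is made up for illustration.

```ruby
# Key list copied from the 1.2.1 side of the options.rb hunk.
NON_CSV_KEYS = %i[
  filename sheet first_row last_row logger logger_level date_format
].freeze

# Hypothetical declared options, for illustration only.
declared_options = {
  filename: "us_cities.tsv",
  first_row: 2,
  date_format: "%Y-%m-%d",
  col_sep: ";"
}

# Hash#except (Ruby 3.0+) returns a copy without the listed keys.
csv_options = declared_options.except(*NON_CSV_KEYS)
# csv_options now holds only the col_sep entry
```

Compared with the 1.2.0 =OPTION_KEYS= list removed from =engine.rb=, this list adds =date_format= and no longer contains =extension= or =debug=, so those two keys would pass through the filter if declared.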

data/lib/dreader/util.rb CHANGED

@@ -1,3 +1,5 @@
+# frozen_string_literal: true
 module Dreader
   # Utilities function to simplify importing data into
   # ActiveRecords
@@ -82,5 +84,24 @@ module Dreader
         error[:row] == row && (col.nil? || error[:col] == col)
       end
     end
+    #
+    # Convert a string to a given type
+    #
+    def self.convert(value, type, options = {})
+      case type
+      when :integer
+        value.to_i
+      when :float
+        value.to_f
+      when :big_decimal
+        BigDecimal(value)
+      when :date
+        date_format = options[:date_format] || "%d/%m/%Y"
+        Date.strptime(value, date_format)
+      else
+        value
+      end
+    end
   end
 end
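
Review note on =Util.convert=: the method added above backs the typed column notation (=[<column>, :date]=) described in the README diff. The following is a minimal standalone copy for experimenting with the conversions outside the gem. The module name =ConvertSketch= is hypothetical, and the explicit =require= lines are an assumption of this sketch, since the hunk does not show any.

```ruby
require "bigdecimal"
require "date"

# Hypothetical standalone copy of the conversion logic (not part of dreader).
module ConvertSketch
  def self.convert(value, type, options = {})
    case type
    when :integer     then value.to_i
    when :float       then value.to_f
    when :big_decimal then BigDecimal(value)
    when :date
      # Same default format as in the hunk: day/month/year.
      Date.strptime(value, options[:date_format] || "%d/%m/%Y")
    else
      value # unknown types are passed through unchanged
    end
  end
end

ConvertSketch.convert("42", :integer)       # 42
ConvertSketch.convert("3.5", :float)        # 3.5
ConvertSketch.convert("12/03/2024", :date)  # 12 March 2024
ConvertSketch.convert("2024-03-12", :date, { date_format: "%Y-%m-%d" })
```

Note that =String#to_i= and =String#to_f= return 0 and 0.0 for non-numeric input rather than raising, so a =check= block on the column may still be worthwhile.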

data/lib/dreader/version.rb CHANGED

@@ -1,3 +1,5 @@
+# frozen_string_literal: true
 module Dreader
-  VERSION = "1.2.0"
+  VERSION = "1.2.1"
 end

data/lib/dreader.rb CHANGED

@@ -1,6 +1,7 @@
+# frozen_string_literal: true
 require "dreader/column"
 require "dreader/engine"
 require "dreader/options"
 require "dreader/util"
 require "dreader/version"

metadata CHANGED

@@ -1,14 +1,13 @@
 --- !ruby/object:Gem::Specification
 name: dreader
 version: !ruby/object:Gem::Version
-  version: 1.2.0
+  version: 1.2.1
 platform: ruby
 authors:
 - Adolfo Villafiorita
-autorequire:
 bindir: exe
 cert_chain: []
-date: 2023-11-02 00:00:00.000000000 Z
+date: 1980-01-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: roo
@@ -108,34 +107,27 @@ files:
 - dreader.gemspec
 - examples/age/Birthdays.ods
 - examples/age/age.rb
-- examples/age_csv/Birthdays-TabSeparated.csv
-- examples/age_csv/Birthdays.csv
-- examples/age_csv/age.rb
-- examples/age_noext/Birthdays
-- examples/age_noext/Birthdays-xlsx
-- examples/age_noext/Birthdays-xlsx-with-wrong-extension.xls
-- examples/age_noext/age.rb
+- examples/age/ages.txt
 - examples/age_with_multiple_checks/Birthdays.ods
 - examples/age_with_multiple_checks/age_with_multiple_checks.rb
 - examples/local_vars/local_vars.rb
+- examples/template/birthdays.xlsx
 - examples/template/template_generation.rb
 - examples/wikipedia_big_us_cities/big_us_cities.rb
 - examples/wikipedia_big_us_cities/cities_by_state.ods
 - examples/wikipedia_us_cities/us_cities.rb
 - examples/wikipedia_us_cities/us_cities.tsv
 - examples/wikipedia_us_cities/us_cities_bulk_declare.rb
-- examples/wikipedia_us_cities/us_cities_reject.rb
 - lib/dreader.rb
 - lib/dreader/column.rb
 - lib/dreader/engine.rb
 - lib/dreader/options.rb
 - lib/dreader/util.rb
 - lib/dreader/version.rb
-homepage: https://redmine.shair.tech/projects/dreader
+homepage: https://https://github.com/avillafiorita/dreader
 licenses:
 - MIT
 metadata: {}
-post_install_message:
 rdoc_options: []
 require_paths:
 - lib
@@ -150,8 +142,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.4.21
-signing_key:
+rubygems_version: 3.6.7
 specification_version: 4
-summary: Process and import data from cvs and spreadsheets
+summary: Porcelain on top of Roo for declarative importing of CSV and spreadheet files
 test_files: []

data/examples/age_csv/Birthdays-TabSeparated.csv DELETED

@@ -1,13 +0,0 @@
-Name	Date of birth
-Forest Whitaker	July 15, 1961
-Daniel Day-Lewis	April 29, 1957
-Sean Penn	August 17, 1960
-Jeff Bridges	December 4, 1949
-Colin Firth	September 10, 1960
-Jean Dujardin	June 19, 1972
-Daniel Day-Lewis	April 29, 1957
-Matthew McConaughey	November 4, 1969
-Eddie Redmayne	January 6, 1982
-Leonardo DiCaprio	November 11, 1974
-Casey Affleck	August 12, 1975
-Gary Oldman	March 21, 1958

data/examples/age_csv/Birthdays.csv DELETED

@@ -1,13 +0,0 @@
-Name,Date of birth
-Forest Whitaker,"July 15, 1961"
-Daniel Day-Lewis,"April 29, 1957"
-Sean Penn,"August 17, 1960"
-Jeff Bridges,"December 4, 1949"
-Colin Firth,"September 10, 1960"
-Jean Dujardin,"June 19, 1972"
-Daniel Day-Lewis,"April 29, 1957"
-Matthew McConaughey,"November 4, 1969"
-Eddie Redmayne,"January 6, 1982"
-Leonardo DiCaprio,"November 11, 1974"
-Casey Affleck,"August 12, 1975"
-Gary Oldman,"March 21, 1958"

data/examples/age_csv/age.rb DELETED

@@ -1,55 +0,0 @@
-require "dreader"
-class Reader
-  extend Dreader::Engine
-  options do
-    first_row 2
-    debug true
-  end
-  column :name do
-    doc "A is the name string"
-    colref 'A'
-  end
-  column :birthdate do
-    doc "Birthdate contains a full date (i.e., including the year)"
-    colref 'B'
-    process do |c|
-      Date.parse(c)
-    end
-  end
-  virtual_column :age do
-    process do |row|
-      birthdate = row[:birthdate][:value]
-      birthday = Date.new(Date.today.year, birthdate.month, birthdate.day)
-      today = Date.today
-      [0, today.year - birthdate.year - (birthday < today ? 1 : 0)].max
-    end
-  end
-  mapping do |row|
-    r = Dreader::Util.simplify(row)
-    puts "#{r[:name]} is #{r[:age]} years old (born on #{r[:birthdate]})"
-  end
-end
-i = Reader
-i.read filename: "Birthdays.csv", mapping: true
-i.read filename: "Birthdays-TabSeparated.csv", extension: :tsv, mapping: true
-#
-# Here we can do further processing on the data
-#
-File.open("ages.txt", "w") do |file|
-  i.table.each do |row|
-    unless row[:row_errors].any?
-      file.puts "#{row[:name][:value]} #{row[:age][:value]}"
-    end
-  end
-end

data/examples/age_noext/Birthdays DELETED

Binary file

data/examples/age_noext/Birthdays-xlsx DELETED

Binary file

data/examples/age_noext/Birthdays-xlsx-with-wrong-extension.xls DELETED

Binary file

data/examples/age_noext/age.rb DELETED

@@ -1,73 +0,0 @@
-require "dreader"
-class Reader
-  extend Dreader::Engine
-  options do
-    first_row 2
-    debug true
-    extension :ods
-  end
-  column :name do
-    doc "A is the name string"
-    colref 'A'
-  end
-  column :birthdate do
-    doc "Birthdate contains a full date (i.e., including the year)"
-    colref 'B'
-    process do |c|
-      Date.parse(c)
-    end
-  end
-  virtual_column :age do
-    process do |row|
-      birthdate = row[:birthdate][:value]
-      birthday = Date.new(Date.today.year, birthdate.month, birthdate.day)
-      today = Date.today
-      [0, today.year - birthdate.year - (birthday < today ? 1 : 0)].max
-    end
-  end
-  mapping do |row|
-    r = Dreader::Util.simplify(row)
-    puts "#{r[:name]} is #{r[:age]} years old (born on #{r[:birthdate]})"
-  end
-end
-puts
-puts "*****************************************************************"
-puts "Reading ODS with no extension, using extension set in the options"
-puts "*****************************************************************"
-puts
-i = Reader
-i.read filename: "Birthdays"
-i.virtual_columns
-i.mappings
-puts
-puts "*****************************************************************"
-puts "Reading XLSX with wrong extension, overriding existing extension"
-puts "*****************************************************************"
-puts
-i = Reader
-i.read filename: "Birthdays-xlsx-with-wrong-extension.xls", extension: :xlsx
-i.virtual_columns
-i.mappings
-puts
-puts "*****************************************************************"
-puts "Reading XLSX with no extension"
-puts "*****************************************************************"
-puts
-i = Reader
-i.read filename: "Birthdays-xlsx", extension: :xlsx
-i.virtual_columns
-i.mappings

data/examples/wikipedia_us_cities/us_cities_reject.rb DELETED

@@ -1,77 +0,0 @@
-require 'dreader'
-# this is the class which will contain all the data we read from the file
-class City
-  [:city, :state, :population, :lat, :lon].each do |var|
-    attr_accessor var
-  end
-  def initialize(hash)
-    hash.each do |k, v|
-      self.send("#{k}=", v)
-    end
-  end
-end
-class Importer
-  extend Dreader::Engine
-  # read from us_cities.tsv, lines from 2 to 10 (included)
-  options do
-    filename "us_cities.tsv"
-    first_row 2
-    last_row  307
-  end
-  # these are the columns for which we only need to specify column and name
-  columns ({city: 2, state: 3, latlon: 11}) do
-    process { |val| val.strip }
-  end
-  # the population column requires more work
-  column :population do |col|
-    col.colref 4
-    # make "3,000" into 3000 (int)
-    col.process { |value| value.gsub(",", "").to_i }
-    # check population is positive
-    col.check { |value| value > 0 }
-  end
-  # reject all cities with more than 3M people
-  reject do |row|
-    row[:population][:value] >= 3_000_000
-  end
-  mapping do |row|
-    # remove all additional information stored in each cell
-    r = Dreader::Util.simplify row
-    # make latlon into the lat, lon fields
-    r[:lat], r[:lon] = r[:latlon].split(" ")
-    # now r contains something like
-    # {lat: ..., lon: ..., city: ..., state: ..., population: ..., latlon: ...}
-    # remove fields which are not understood by the Cities class and
-    # make a new instance
-    cleaned = Dreader::Util.clean r, [:latlon]
-    # you must declare an array cities before calling importer.mapping
-    City.new(cleaned)
-  end
-end
-# load and process
-importer = Importer
-importer.load mapping: true, debug: true
-# output everything to see whether it works
-puts "First ten cities in the US with less than 3M (source Wikipedia)"
-importer.table.each do |city|
-  [:city, :state, :population, :lat, :lon].each do |var|
-    puts "#{var.to_s.capitalize}: #{city.send(var)}"
-  end
-  puts ""
-end