fat_table 0.7.0 → 0.9.0
- checksums.yaml +4 -4
- data/README.org +90 -115
- data/lib/fat_table/column.rb +2 -0
- data/lib/fat_table/convert.rb +0 -13
- data/lib/fat_table/table.rb +147 -94
- data/lib/fat_table/version.rb +1 -1
- data/lib/fat_table.rb +16 -16
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 6511a69dea56479777956084ceae10b33d6769b4e9e8fcbb87d50140fa5475d4
|
4
|
+
data.tar.gz: b41a1e1a578f1c9f4f4f82ac7fba1413343ec75d8febda7ebb1f12737b9fb6b2
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: fe1da822b9d2ef6995dbfe665f570a3474d955892e062cba559162a07ef43013e9c1f649040fa47529c780aec213a9d74c4ac0684217a608960ebedac9479843
|
7
|
+
data.tar.gz: b4ab503c3f992a471bd72a70822455ac6a50a6a72558242841656d514e2c60c9b51bf7a1d052c66a35ecb1ba4cbce30ac6351d30919aa85f3284f0d9cfad1821
|
data/README.org
CHANGED
@@ -445,27 +445,49 @@ or nil. There are only five permissible types for a ~Column~:
|
|
445
445
|
4. *String* (for ruby ~String~ objects), or
|
446
446
|
5. *NilClass* (for the undetermined column type).
|
447
447
|
|
448
|
-
|
449
|
-
having a type of ~NilClass~, that is, their type is as yet
|
450
|
-
a string or object
|
451
|
-
|
452
|
-
|
453
|
-
|
454
|
-
|
448
|
+
By default, when a ~Table~ is constructed from an external source, all
|
449
|
+
~Columns~ start out having a type of ~NilClass~, that is, their type is as yet
|
450
|
+
undetermined. When a string or object is added to a ~Column~ and it can be
|
451
|
+
converted into one of the permissible types, it fixes the type of the column,
|
452
|
+
and all further items added to the ~Column~ must either be ~nil~ (indicating
|
453
|
+
no value) or be capable of being coerced to the column's type. Otherwise,
|
454
|
+
~FatTable~ raises an ~IncompatibleTypeError~ exception.
|
455
|
+
|
456
|
+
*** Type Keywords Arguments
|
457
|
+
All of the table constructors allow you to set the type for a column in
|
458
|
+
advance by adding keyword arguments to the end of the constructor arguments
|
459
|
+
where the keyword is a header symbol and the value is a string designating one
|
460
|
+
of the types. For example, suppose we are constructing a table from a CSV
|
461
|
+
file, and we know that one of the columns is labeled 'Start' and another
|
462
|
+
'Price'. We want to require the items in the 'Start' column to be a valid
|
463
|
+
date and the items in the 'Price' column to be valid numbers:
|
464
|
+
|
465
|
+
#+begin_example
|
466
|
+
FatTable.from_csv_file('data.csv', start: 'date', price: 'num')
|
467
|
+
#+end_example
|
468
|
+
|
469
|
+
The type string can be anything that starts with 'dat', 'num', 'boo', or
|
470
|
+
'str', regardless of case, to designate ~DateTime~, ~Numeric~, ~Boolean~, or
|
471
|
+
~String~ types, respectively. Any other string keeps the type as NilClass,
|
472
|
+
that is, it remains open for automatic typing.
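The prefix rule described above can be sketched in plain Ruby without loading the gem. The helper name `type_for` is hypothetical and only mirrors the rule as the README states it; it is an illustration, not the gem's API.

```ruby
# Hypothetical helper, not part of fat_table: maps a type string to the
# column type name described in the README, using case-insensitive prefixes.
def type_for(spec)
  case spec.to_s
  when /\Adat/i then 'DateTime'
  when /\Anum/i then 'Numeric'
  when /\Aboo/i then 'Boolean'
  when /\Astr/i then 'String'
  else 'NilClass' # unrecognized strings leave the column open for auto-typing
  end
end

puts type_for('date')     # a 'dat' prefix
puts type_for('NUMERIC')  # case is ignored
puts type_for('whatever') # no recognized prefix
```

Matching on a short prefix means 'num', 'number', and 'Numeric' all select the same type, so callers do not need to remember one exact spelling.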
|
455
473
|
|
456
474
|
The strictness of requiring all items to be of the same type can be relaxed by
|
457
|
-
declaring a column to be "tolerant." You can do so
|
458
|
-
|
459
|
-
|
460
|
-
|
461
|
-
|
462
|
-
|
463
|
-
|
464
|
-
|
465
|
-
|
466
|
-
|
467
|
-
|
468
|
-
|
475
|
+
declaring a column to be "tolerant." You can do so by adding a '~' to the end
|
476
|
+
of a keyword type specifier in the table constructor. In the above example,
|
477
|
+
if we wanted to allow strings to be mixed up with the numeric prices, we would
|
478
|
+
use the following:
|
479
|
+
|
480
|
+
#+begin_example
|
481
|
+
FatTable.from_csv_file('data.csv', start: 'date', price: 'num~')
|
482
|
+
#+end_example
|
483
|
+
|
484
|
+
If a Column is tolerant, ~FatTable~ tries to convert new items into the
|
485
|
+
column's specified type, or if the type is still open, to one of ~DateTime~,
|
486
|
+
~Numeric~, or ~Boolean~, and then fixes the column's type. If it cannot do
|
487
|
+
so, it converts the item into a ~String~ but does not raise an
|
488
|
+
~IncompatibleTypeError~ exception. These interloper strings are treated like
|
489
|
+
nils for purposes of sorting and evaluation, but are displayed according to
|
490
|
+
any string formatting on output. See [[*Designating "Tolerant" Columns][Designating "Tolerant" Columns]] below.
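To make the strict and tolerant behaviors concrete, here is a minimal sketch of a numeric column written in plain Ruby. The class `SketchColumn` is hypothetical and is not the gem's `Column` class; it uses Ruby's built-in `TypeError` as a stand-in for `IncompatibleTypeError` and assumes that summing is a fair example of an "evaluation" in which interloper strings count as nils.

```ruby
# Hypothetical stand-in for a Numeric column; not the gem's Column class.
class SketchColumn
  attr_reader :items

  def initialize(tolerant: false)
    @tolerant = tolerant
    @items = []
  end

  # Add an item, coercing it to a number when possible.
  def <<(val)
    item =
      if val.nil?
        nil # nil always means "no value"
      elsif (num = Float(val, exception: false))
        num
      elsif @tolerant
        val.to_s # interloper kept as a String, no error raised
      else
        # TypeError stands in for the gem's IncompatibleTypeError.
        raise TypeError, "cannot convert #{val.inspect} to Numeric"
      end
    @items << item
    self
  end

  # Strings are skipped here, just as nils are.
  def sum
    @items.grep(Numeric).sum
  end
end

col = SketchColumn.new(tolerant: true)
col << '12.5' << 'n/a' << nil << 7.5
p col.items
p col.sum
```

With `tolerant: false`, the same `'n/a'` item raises instead of being stored, which corresponds to the default strict behavior described earlier.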
|
469
491
|
|
470
492
|
Items of input must be either one of the permissible ruby objects or strings. If
|
471
493
|
they are strings, ~FatTable~ attempts to parse them as one of the permissible
|
@@ -626,14 +648,32 @@ columns to be created:
|
|
626
648
|
|
627
649
|
Occasionally, ~FatTable~'s automatic type detection can get in the way and you
|
628
650
|
just want it to treat one or more columns as Strings regardless of their
|
629
|
-
appearance. Think, for example, of zip codes.
|
630
|
-
table is contructed, you can designate a
|
631
|
-
|
632
|
-
just mark it as a forced-string Column.
|
651
|
+
appearance. Think, for example, of zip codes. As mentioned above, when a
|
652
|
+
table is constructed, you can designate a 'String' type for a column by
|
653
|
+
using a keyword parameter.
|
633
654
|
|
634
|
-
#+
|
635
|
-
|
636
|
-
|
655
|
+
#+begin_src ruby :wrap EXAMPLE
|
656
|
+
require 'fat_table'
|
657
|
+
tab = FatTable.new(:a, 'b!', 'C', :d, :zip, zip: 'str')
|
658
|
+
tab << { a: 1, b: 2, c: "<2017-01-21>", d: 'f', e: '', zip: 18552 }
|
659
|
+
tab << { a: 3.14, b: 2.17, c: '[2016-01-21 Thu]', d: 'Y', e: nil }
|
660
|
+
tab << { zip: '01879--7884' }
|
661
|
+
tab << { zip: '66210', b: 'Not a Number' }
|
662
|
+
tab << { zip: '90210' }
|
663
|
+
tab.to_text
|
664
|
+
#+end_src
|
665
|
+
|
666
|
+
#+begin_EXAMPLE
|
667
|
+
+===+===+============+===+=============+===+
|
668
|
+
| A | B | C | D | Zip | E |
|
669
|
+
+---+---+------------+---+-------------+---+
|
670
|
+
| 1 | 2 | 2017-01-21 | F | 18552 | |
|
671
|
+
| 3 | 2 | 2016-01-21 | T | | |
|
672
|
+
| | | | | 01879--7884 | |
|
673
|
+
| | | | | 90210 | |
|
674
|
+
| | | | | | |
|
675
|
+
+===+===+============+===+=============+===+
|
676
|
+
#+end_EXAMPLE
|
637
677
|
|
638
678
|
In addition, at any time after creating a table, you can force the String type
|
639
679
|
on any number of columns with the ~force_string!~ method. When you do so, all
|
@@ -662,84 +702,9 @@ exisiting items in the column are converted to strings with the #to_s method.
|
|
662
702
|
+======+======+============+===+=======+===+
|
663
703
|
#+end_EXAMPLE
|
664
704
|
|
665
|
-
**** Designating "Tolerant" Columns
|
666
|
-
|
667
|
-
Related to the problem just discussed is the problem of reading files in from
|
668
|
-
the wild where a column may get typed as, say Numeric, but then contain
|
669
|
-
something that can't be parsed as a Numeric. ~FatTable~ raises an exception
|
670
|
-
is such cases, and that may be what you want if you can control the input.
|
671
|
-
But, especially when you cannot do so, it can be helpful to designate one or
|
672
|
-
more columns as "tolerant." This means that when a conversion problem occurs,
|
673
|
-
the column item is retained as a string type in a column that is otherwise of
|
674
|
-
one of the types Numeric, DateTime, or Boolean. Those string items are
|
675
|
-
treated as nils for purposes of sorting or evaluation in a ~select~ method.
|
676
|
-
When formatted, they participate in string formatting directive, but not those
|
677
|
-
for other types.
|
678
|
-
|
679
|
-
All of the table construction methods, allow a keyword parameter,
|
680
|
-
~tolerant_columns~, where you can designate what columns should be convert to
|
681
|
-
String type when conversion to the auto-typed column type is not possible.
|
682
|
-
The parameter should be an array of headers, in either string or symbol form,
|
683
|
-
for which this behavior is desired. In addition, it can be set to the special
|
684
|
-
string '*' or symbol ~:*~ to indicate that all the columns should be made
|
685
|
-
tolerant.
|
686
|
-
|
687
|
-
#+begin_src ruby :wrap EXAMPLE
|
688
|
-
require 'fat_table'
|
689
|
-
tab = FatTable.new(:a, 'b', 'C', :d, :zip, tolerant_columns: [:zip])
|
690
|
-
tab << { a: 1, b: 2, c: "<2017-01-21>", d: 'f', e: '', zip: 18552 }
|
691
|
-
tab << { a: 3.14, b: 2.17, c: '[2016-01-21 Thu]', d: 'Y', e: nil }
|
692
|
-
tab << { zip: '01879--7884' }
|
693
|
-
tab << { zip: '66210' }
|
694
|
-
tab << { zip: '90210' }
|
695
|
-
tab.to_text
|
696
|
-
#+end_src
|
697
|
-
|
698
|
-
#+RESULTS:
|
699
|
-
#+begin_EXAMPLE
|
700
|
-
+======+======+============+===+=============+===+
|
701
|
-
| A | B | C | D | Zip | E |
|
702
|
-
+------+------+------------+---+-------------+---+
|
703
|
-
| 1 | 2 | 2017-01-21 | F | 18552 | |
|
704
|
-
| 3.14 | 2.17 | 2016-01-21 | T | | |
|
705
|
-
| | | | | 01879--7884 | |
|
706
|
-
| | | | | 66210 | |
|
707
|
-
| | | | | 90210 | |
|
708
|
-
+======+======+============+===+=============+===+
|
709
|
-
#+end_EXAMPLE
|
710
|
-
|
711
|
-
Another way to designate a column as tolerant is to end a column you want to
|
712
|
-
designate as tolerant with a ~!~. The ~!~ will be stripped from the header,
|
713
|
-
but it will be marked as tolerant.
|
714
|
-
#+begin_src ruby :wrap EXAMPLE
|
715
|
-
require 'fat_table'
|
716
|
-
tab = FatTable.new(:a, 'b!', 'C', :d, :zip!)
|
717
|
-
tab << { a: 1, b: 2, c: "<2017-01-21>", d: 'f', e: '', zip: 18552 }
|
718
|
-
tab << { a: 3.14, b: 2.17, c: '[2016-01-21 Thu]', d: 'Y', e: nil }
|
719
|
-
tab << { zip: '01879--7884' }
|
720
|
-
tab << { zip: '66210', b: 'Not a Number' }
|
721
|
-
tab << { zip: '90210' }
|
722
|
-
tab.to_text
|
723
|
-
#+end_src
|
724
|
-
|
725
|
-
#+RESULTS:
|
726
|
-
#+begin_EXAMPLE
|
727
|
-
+======+==============+============+===+=============+===+
|
728
|
-
| A | B | C | D | Zip | E |
|
729
|
-
+------+--------------+------------+---+-------------+---+
|
730
|
-
| 1 | 2 | 2017-01-21 | F | 18552 | |
|
731
|
-
| 3.14 | 2.17 | 2016-01-21 | T | | |
|
732
|
-
| | | | | 01879--7884 | |
|
733
|
-
| | Not a Number | | | 66210 | |
|
734
|
-
| | | | | 90210 | |
|
735
|
-
+======+==============+============+===+=============+===+
|
736
|
-
#+end_EXAMPLE
|
737
|
-
|
738
705
|
*** From CSV or Org Mode files or strings
|
739
706
|
Tables can also be read from ~.csv~ files or files containing ~org-mode~
|
740
|
-
tables.
|
741
|
-
~tolerant_columns:~ keyword argument or make them all tolerant by designating
|
742
|
-
the pseudo-column ~:*~ as tolerant.
|
707
|
+
tables.
|
743
708
|
|
744
709
|
In the case of org-mode files, ~FatTable~ skips through the file until it finds
|
745
710
|
a line that looks like a table, that is, it begins with any number of spaces
|
@@ -799,10 +764,10 @@ header row, and the headers are converted to symbols as described above.
|
|
799
764
|
|
800
765
|
You can also initialize a table directly from ruby data structures. You can,
|
801
766
|
for example, build a table from an array of arrays. Remember that you can
|
802
|
-
make any column tolerant with a
|
803
|
-
|
767
|
+
make any column tolerant with a keyword argument for the column symbol and
|
768
|
+
ending it with a '~'.
|
804
769
|
|
805
|
-
#+BEGIN_SRC ruby
|
770
|
+
#+BEGIN_SRC ruby
|
806
771
|
aoa = [
|
807
772
|
['Ref', 'Date', 'Code', 'Raw', 'Shares', 'Price', 'Info', 'Bool'],
|
808
773
|
[1, '2013-05-02', 'P', 795_546.20, 795_546.2, 1.1850, 'ENTITY1', 'T'],
|
@@ -816,11 +781,27 @@ them all tolerant by designating the pseudo-column ~:*~ as tolerant.
|
|
816
781
|
[13, '2013-05-29', 'S', 13_459.00, 5659.51, 24.7464, 'ENTITY3', 'T'],
|
817
782
|
[14, '2013-05-29', 'S', 15_700.00, 6601.85, 24.7790, 'ENTITY3', 'F'],
|
818
783
|
[15, '2013-05-29', 'S', 15_900.00, 6685.95, 24.5802, 'ENTITY3', 'T'],
|
819
|
-
[16, '2013-05-30', 'S', 6_679.00, 2808.52, 25.0471, 'ENTITY3', 'T']
|
820
|
-
|
821
|
-
tab = FatTable.from_aoa(aoa)
|
784
|
+
[16, '2013-05-30', 'S', 6_679.00, 2808.52, 25.0471, 'ENTITY3', 'T'] ]
|
785
|
+
|
786
|
+
tab = FatTable.from_aoa(aoa).to_aoa
|
822
787
|
#+END_SRC
|
823
788
|
|
789
|
+
#+RESULTS:
|
790
|
+
| Ref | Date | Code | Raw | Shares | Price | Info | Bool |
|
791
|
+
|-----+------------+------+--------+--------+-------+---------+------|
|
792
|
+
| 1 | 2013-05-02 | P | 795546 | 795546 | 1 | ENTITY1 | T |
|
793
|
+
| 2 | 2013-05-02 | P | 118186 | 118186 | 12 | ENTITY1 | T |
|
794
|
+
| 7 | 2013-05-20 | S | 12000 | 5046 | 28 | ENTITY3 | F |
|
795
|
+
| 8 | 2013-05-20 | S | 85000 | 35743 | 28 | ENTITY3 | T |
|
796
|
+
| 9 | 2013-05-20 | S | 33302 | 14003 | 29 | ENTITY3 | T |
|
797
|
+
| 10 | 2013-05-23 | S | 8000 | 3364 | 27 | ENTITY3 | T |
|
798
|
+
| 11 | 2013-05-23 | S | 23054 | 9694 | 27 | ENTITY3 | F |
|
799
|
+
| 12 | 2013-05-23 | S | 39906 | 16780 | 25 | ENTITY3 | T |
|
800
|
+
| 13 | 2013-05-29 | S | 13459 | 5660 | 25 | ENTITY3 | T |
|
801
|
+
| 14 | 2013-05-29 | S | 15700 | 6602 | 25 | ENTITY3 | F |
|
802
|
+
| 15 | 2013-05-29 | S | 15900 | 6686 | 25 | ENTITY3 | T |
|
803
|
+
| 16 | 2013-05-30 | S | 6679 | 2809 | 25 | ENTITY3 | T |
|
804
|
+
|
824
805
|
Notice that the values can either be ruby objects, such as the Integer ~85_000~,
|
825
806
|
or strings that can be parsed into one of the permissible column types.
|
826
807
|
|
@@ -898,9 +879,7 @@ This example illustrates several things:
|
|
898
879
|
A second ruby data structure that can be used to initialize a ~FatTable~ table
|
899
880
|
is an array of ruby Hashes. Each hash represents a row of the table, and the
|
900
881
|
headers of the table are taken from the keys of the hashes. Accordingly, all
|
901
|
-
the hashes must have the same keys.
|
902
|
-
tolerant with a ~tolerant_columns:~ keyword argument or make them all tolerant
|
903
|
-
by designating the pseudo-column ~:*~ as tolerant.
|
882
|
+
the hashes must have the same keys.
|
904
883
|
|
905
884
|
This same method can in fact take an array of any objects that can be converted
|
906
885
|
to a Hash with the ~#to_h~ method, so you can use an array of your own objects
|
@@ -989,10 +968,6 @@ The ~.connect~ function need only be called once, and the database handle it
|
|
989
968
|
creates will be used for all subsequent ~.from_sql~ calls until ~.connect~ is
|
990
969
|
called again.
|
991
970
|
|
992
|
-
Remember that you can make any column tolerant with a ~tolerant_columns:~
|
993
|
-
keyword argument or make them all tolerant by designating the pseudo-column
|
994
|
-
~:*~ as tolerant.
|
995
|
-
|
996
971
|
*** Marking Groups in Input
|
997
972
|
**** Manually
|
998
973
|
|
data/lib/fat_table/column.rb
CHANGED
data/lib/fat_table/convert.rb
CHANGED
@@ -63,19 +63,6 @@ module FatTable
|
|
63
63
|
when 'String'
|
64
64
|
if val.nil?
|
65
65
|
nil
|
66
|
-
elsif tolerant
|
67
|
-
# Allow String to upgrade to one of Numeric, DateTime, or Boolean if
|
68
|
-
# possible.
|
69
|
-
if (new_val = convert_to_numeric(val))
|
70
|
-
new_val
|
71
|
-
elsif (new_val = convert_to_date_time(val))
|
72
|
-
new_val
|
73
|
-
elsif (new_val = convert_to_boolean(val))
|
74
|
-
new_val
|
75
|
-
else
|
76
|
-
new_val = convert_to_string(val)
|
77
|
-
end
|
78
|
-
new_val
|
79
66
|
else
|
80
67
|
new_val = convert_to_string(val)
|
81
68
|
if new_val.nil?
|
data/lib/fat_table/table.rb
CHANGED
@@ -53,6 +53,11 @@ module FatTable
|
|
53
53
|
class Table
|
54
54
|
# An Array of FatTable::Columns that constitute the table.
|
55
55
|
attr_reader :columns
|
56
|
+
attr_reader :heads
|
57
|
+
|
58
|
+
# Headers of columns that are to be tolerant when they are built.
|
59
|
+
attr_accessor :tolerant_cols
|
60
|
+
attr_reader :omni_typ, :omni_tol
|
56
61
|
|
57
62
|
# Record boundaries set explicitly with mark_boundaries or from reading
|
58
63
|
# hlines from input. When we want to access boundaries, however, we want
|
@@ -62,80 +67,94 @@ module FatTable
|
|
62
67
|
# method call.
|
63
68
|
attr_accessor :explicit_boundaries
|
64
69
|
|
65
|
-
# An Array of FatTable::Columns that should be tolerant.
|
66
|
-
attr_reader :tolerant_columns
|
67
|
-
|
68
70
|
###########################################################################
|
69
71
|
# Constructors
|
70
|
-
|
71
|
-
|
72
|
+
#
|
73
|
+
#
|
72
74
|
# :category: Constructors
|
73
|
-
|
75
|
+
#
|
74
76
|
# Return an empty FatTable::Table object. Specifying headers is optional.
|
75
|
-
#
|
76
|
-
#
|
77
|
-
#
|
78
|
-
#
|
79
|
-
#
|
80
|
-
#
|
81
|
-
#
|
82
|
-
#
|
83
|
-
|
77
|
+
# By default, all columns start out as having an "open" type and get
|
78
|
+
# assigned a type based on their contents. For example, if a column
|
79
|
+
# contains items that can be interpreted as dates, the column gets
|
80
|
+
# assigned a DateTime type. Other types are Numeric, Boolean, and String.
|
81
|
+
# Once a type is assigned to a column, any non-conforming values in that
|
82
|
+
# column raise an IncompatibleType error. If a column is marked
|
83
|
+
# "tolerant", however, the incompatible item is converted to a string and
|
84
|
+
# allowed to remain in the column without raising an error. They count as
|
85
|
+
# nils when calculations are performed on the column and participate only
|
86
|
+
# in string formatting directives on output.
|
87
|
+
#
|
88
|
+
# Rather than have a column's type determined by content, you can also
|
89
|
+
# specify a column type by providing a type hash, where the key is the
|
90
|
+
# header's name and the value is the desired type. In that case, any
|
91
|
+
# incompatible type raises an IncompatibleTypeError unless the column
|
92
|
+
# is also marked tolerant, in which case it gets converted to a string as
|
93
|
+
# discussed above. If the type name in the types hash ends in a '~', it
|
94
|
+
# is treated as specifying the given type but marking it as tolerant as
|
95
|
+
# well. The values in the type hash can be any string or symbol that
|
96
|
+
# starts with 'num', 'dat', 'bool', or 'str' to specify Numeric,
|
97
|
+
# DateTime, Boolean, or String types respectively.
|
98
|
+
def initialize(*heads, **types)
|
99
|
+
@heads = heads.flatten.map(&:as_sym)
|
100
|
+
@types = types
|
84
101
|
@columns = []
|
85
|
-
@
|
86
|
-
|
87
|
-
|
88
|
-
|
89
|
-
|
90
|
-
|
91
|
-
|
92
|
-
|
93
|
-
|
94
|
-
|
95
|
-
|
96
|
-
|
97
|
-
|
98
|
-
|
99
|
-
|
100
|
-
|
101
|
-
|
102
|
+
@tolerant_cols = []
|
103
|
+
# Check for the special 'omni' key
|
104
|
+
@omni_type = 'NilClass'
|
105
|
+
@omni_tol = false
|
106
|
+
if types.keys.map(&:to_s).include?('omni')
|
107
|
+
# All columns not otherwise included in types should have the type and
|
108
|
+
# tolerance of omni.
|
109
|
+
omni_val = (types['omni'] || types[:omni])
|
110
|
+
@omni_type, @omni_tol = Table.typ_tol(omni_val)
|
111
|
+
# Remove omni from types.
|
112
|
+
types.delete(:omni)
|
113
|
+
types.delete('omni')
|
114
|
+
end
|
115
|
+
# heads += types.keys
|
116
|
+
(heads.flatten + types.keys).uniq.each do |h|
|
117
|
+
if types[h]
|
118
|
+
typ, tol = Table.typ_tol(types[h])
|
102
119
|
else
|
103
|
-
|
104
|
-
|
105
|
-
unless heads.empty?
|
106
|
-
heads.each do |h|
|
107
|
-
if h.to_s.end_with?('!') || @tolerant_columns.include?(h)
|
108
|
-
@columns << Column.new(header: h.to_s.sub(/!\s*\z/, ''), type: 'String')
|
109
|
-
else
|
110
|
-
@columns << Column.new(header: h)
|
111
|
-
end
|
120
|
+
typ = @omni_type
|
121
|
+
tol = @omni_tol
|
112
122
|
end
|
123
|
+
@tolerant_cols << h.to_s.as_sym if tol
|
124
|
+
@columns << Column.new(header: h.to_s.sub(/~\s*\z/, ''), type: typ,
|
125
|
+
tolerant: tol)
|
113
126
|
end
|
127
|
+
@explicit_boundaries = []
|
114
128
|
end
|
115
129
|
|
116
130
|
# :category: Constructors
|
117
131
|
|
118
|
-
# Return an
|
119
|
-
#
|
120
|
-
#
|
121
|
-
#
|
122
|
-
|
123
|
-
|
124
|
-
|
125
|
-
|
126
|
-
|
127
|
-
|
128
|
-
|
129
|
-
|
132
|
+
# Return a new table based on this Table but with empty columns named by
|
133
|
+
# the result_cols parameter, by default this Table's columns. If any
|
134
|
+
# of the result_cols have the same name as an existing column, inherit
|
135
|
+
# that column's type and tolerance. Also, set any instance variables that
|
136
|
+
# might have been set by a subclass instance.
|
137
|
+
def empty_dup(result_cols = nil)
|
138
|
+
result_cols ||= heads
|
139
|
+
result_types = types.select { |k,_v| result_cols.include?(k) }
|
140
|
+
result = Table.new(result_cols, **result_types)
|
141
|
+
tolerant_cols.each do |h|
|
142
|
+
result.tolerant_cols << h
|
143
|
+
result.column(h).tolerant = true
|
144
|
+
end
|
145
|
+
(instance_variables - result.instance_variables).each do |v|
|
146
|
+
result.instance_variable_set(v, instance_variable_get(v))
|
147
|
+
end
|
148
|
+
result
|
130
149
|
end
|
131
150
|
|
132
151
|
# :category: Constructors
|
133
152
|
|
134
153
|
# Construct a Table from the contents of a CSV file named +fname+. Headers
|
135
154
|
# will be taken from the first CSV row and converted to symbols.
|
136
|
-
def self.from_csv_file(fname,
|
155
|
+
def self.from_csv_file(fname, **types)
|
137
156
|
File.open(fname, 'r') do |io|
|
138
|
-
from_csv_io(io,
|
157
|
+
from_csv_io(io, **types)
|
139
158
|
end
|
140
159
|
end
|
141
160
|
|
@@ -143,8 +162,8 @@ module FatTable
|
|
143
162
|
|
144
163
|
# Construct a Table from a CSV string +str+, treated in the same manner as
|
145
164
|
# the input from a CSV file in ::from_csv_file.
|
146
|
-
def self.from_csv_string(str,
|
147
|
-
from_csv_io(StringIO.new(str),
|
165
|
+
def self.from_csv_string(str, **types)
|
166
|
+
from_csv_io(StringIO.new(str), **types)
|
148
167
|
end
|
149
168
|
|
150
169
|
# :category: Constructors
|
@@ -153,9 +172,9 @@ module FatTable
|
|
153
172
|
# file named +fname+. Headers are taken from the first row if the second row
|
154
173
|
# is an hrule. Otherwise, synthetic headers of the form +:col_1+, +:col_2+,
|
155
174
|
# etc. are created.
|
156
|
-
def self.from_org_file(fname,
|
175
|
+
def self.from_org_file(fname, **types)
|
157
176
|
File.open(fname, 'r') do |io|
|
158
|
-
from_org_io(io,
|
177
|
+
from_org_io(io, **types)
|
159
178
|
end
|
160
179
|
end
|
161
180
|
|
@@ -163,8 +182,8 @@ module FatTable
|
|
163
182
|
|
164
183
|
# Construct a Table from a string +str+, treated in the same manner as the
|
165
184
|
# contents of an org-mode file in ::from_org_file.
|
166
|
-
def self.from_org_string(str,
|
167
|
-
from_org_io(StringIO.new(str),
|
185
|
+
def self.from_org_string(str, **types)
|
186
|
+
from_org_io(StringIO.new(str), **types)
|
168
187
|
end
|
169
188
|
|
170
189
|
# :category: Constructors
|
@@ -183,8 +202,8 @@ module FatTable
|
|
183
202
|
# :hlines no +) org-mode strips all hrules from the table; otherwise (+
|
184
203
|
# HEADER: :hlines yes +) they are indicated with nil elements in the outer
|
185
204
|
# array.
|
186
|
-
def self.from_aoa(aoa, hlines: false,
|
187
|
-
from_array_of_arrays(aoa, hlines: hlines,
|
205
|
+
def self.from_aoa(aoa, hlines: false, **types)
|
206
|
+
from_array_of_arrays(aoa, hlines: hlines, **types)
|
188
207
|
end
|
189
208
|
|
190
209
|
# :category: Constructors
|
@@ -194,9 +213,9 @@ module FatTable
|
|
194
213
|
# keys, which, when converted to symbols will become the headers for the
|
195
214
|
# Table. If hlines is set true, mark a group boundary whenever a nil, rather
|
196
215
|
# than a hash appears in the outer array.
|
197
|
-
def self.from_aoh(aoh, hlines: false,
|
216
|
+
def self.from_aoh(aoh, hlines: false, **types)
|
198
217
|
if aoh.first.respond_to?(:to_h)
|
199
|
-
from_array_of_hashes(aoh, hlines: hlines,
|
218
|
+
from_array_of_hashes(aoh, hlines: hlines, **types)
|
200
219
|
else
|
201
220
|
raise UserError,
|
202
221
|
"Cannot initialize Table with an array of #{aoh[0].class}"
|
@@ -215,7 +234,7 @@ module FatTable
|
|
215
234
|
|
216
235
|
# Construct a Table by running a SQL +query+ against the database set up
|
217
236
|
# with FatTable.connect, with the rows of the query result as rows.
|
218
|
-
def self.from_sql(query,
|
237
|
+
def self.from_sql(query, **types)
|
219
238
|
msg = 'FatTable.db must be set with FatTable.connect'
|
220
239
|
raise UserError, msg if FatTable.db.nil?
|
221
240
|
|
@@ -232,13 +251,33 @@ module FatTable
|
|
232
251
|
############################################################################
|
233
252
|
|
234
253
|
class << self
|
254
|
+
# Return [typ, tol] based on the type string, str.
|
255
|
+
def typ_tol(str)
|
256
|
+
tol = str ? str.match?(/~\s*\Z/) : false
|
257
|
+
typ =
|
258
|
+
case str
|
259
|
+
when /\A\s*num/i
|
260
|
+
'Numeric'
|
261
|
+
when /\A\s*boo/i
|
262
|
+
'Boolean'
|
263
|
+
when /\A\s*dat/i
|
264
|
+
'DateTime'
|
265
|
+
when /\A\s*str/i
|
266
|
+
'String'
|
267
|
+
else
|
268
|
+
'NilClass'
|
269
|
+
end
|
270
|
+
[typ, tol]
|
271
|
+
end
|
272
|
+
|
235
273
|
private
|
236
274
|
|
237
275
|
# Construct table from an array of hashes or an array of any object that
|
238
276
|
# can respond to #to_h. If an array element is a nil, mark it as a group
|
239
277
|
# boundary in the Table.
|
240
|
-
def from_array_of_hashes(hashes, hlines: false,
|
241
|
-
|
278
|
+
def from_array_of_hashes(hashes, hlines: false, **types)
|
279
|
+
heads = hashes.first.keys
|
280
|
+
result = new(*heads, **types)
|
242
281
|
hashes.each do |hsh|
|
243
282
|
if hsh.nil?
|
244
283
|
unless hlines
|
@@ -266,8 +305,7 @@ module FatTable
|
|
266
305
|
# hlines are stripped from the table, otherwise (:hlines yes) they are
|
267
306
|
# indicated with nil elements in the outer array as expected by this
|
268
307
|
# method when hlines is set true.
|
269
|
-
def from_array_of_arrays(rows, hlines: false,
|
270
|
-
result = new(tolerant_columns: tolerant_columns)
|
308
|
+
def from_array_of_arrays(rows, hlines: false, **types)
|
271
309
|
headers = []
|
272
310
|
if !hlines
|
273
311
|
# Take the first row as headers
|
@@ -286,6 +324,7 @@ module FatTable
|
|
286
324
|
headers = (1..rows[0].size).to_a.map { |k| "col_#{k}".as_sym }
|
287
325
|
first_data_row = 0
|
288
326
|
end
|
327
|
+
result = new(*headers, **types)
|
289
328
|
rows[first_data_row..-1].each do |row|
|
290
329
|
if row.nil?
|
291
330
|
unless hlines
|
@@ -303,8 +342,8 @@ module FatTable
|
|
303
342
|
result
|
304
343
|
end
|
305
344
|
|
306
|
-
def from_csv_io(io,
|
307
|
-
result = new(
|
345
|
+
def from_csv_io(io, **types)
|
346
|
+
result = new(**types)
|
308
347
|
::CSV.new(io, headers: true, header_converters: :symbol,
|
309
348
|
skip_blanks: true).each do |row|
|
310
349
|
result << row.to_h
|
@@ -317,7 +356,7 @@ module FatTable
|
|
317
356
|
# header row must be marked with an hline (i.e, a row that looks like
|
318
357
|
# '|---+--...--|') and groups of rows may be marked with hlines to
|
319
358
|
# indicate group boundaries.
|
320
|
-
def from_org_io(io,
|
359
|
+
def from_org_io(io, **types)
|
321
360
|
table_re = /\A\s*\|/
|
322
361
|
hrule_re = /\A\s*\|[-+]+/
|
323
362
|
rows = []
|
@@ -352,7 +391,7 @@ module FatTable
|
|
352
391
|
rows << line.split('|').map(&:clean)
|
353
392
|
end
|
354
393
|
end
|
355
|
-
from_array_of_arrays(rows, hlines: true,
|
394
|
+
from_array_of_arrays(rows, hlines: true, **types)
|
356
395
|
end
|
357
396
|
end
|
358
397
|
|
@@ -377,12 +416,23 @@ module FatTable
|
|
377
416
|
column(key).type
|
378
417
|
end
|
379
418
|
|
419
|
+
# Return a Hash mapping each header to the type of its Column, with each
|
420
|
+
# type given as a String.
|
421
|
+
def types
|
422
|
+
result = {}
|
423
|
+
headers.each do |h|
|
424
|
+
result[h] = type(h)
|
425
|
+
end
|
426
|
+
result
|
427
|
+
end
|
428
|
+
|
380
429
|
# :category: Attributes
|
381
430
|
|
382
431
|
# Set the column type for Column with the given +key+ as a String type.
|
383
432
|
def force_string!(*keys)
|
384
433
|
keys.each do |h|
|
385
434
|
raise UserError, "force_string!: #{h} not a column in table" unless column(h)
|
435
|
+
|
386
436
|
column(h).force_string!
|
387
437
|
end
|
388
438
|
self
|
@@ -428,7 +478,7 @@ module FatTable
|
|
428
478
|
# :category: Attributes
|
429
479
|
|
430
480
|
# Return a Hash of the Table's Column header symbols to type strings.
|
431
|
-
def
|
481
|
+
def col_types
|
432
482
|
result = {}
|
433
483
|
columns.each do |c|
|
434
484
|
result[c.header] = c.type
|
@@ -445,11 +495,11 @@ module FatTable
|
|
445
495
|
|
446
496
|
# :category: Attributes
|
447
497
|
|
448
|
-
# Return whether the column with the given head
|
498
|
+
# Return whether the column with the given head is supposed to be
|
499
|
+
# tolerant. We can't just look up the Column because it may not be built
|
500
|
+
# yet, as when we do a row-by-row add.
|
449
501
|
def tolerant_col?(h)
|
450
|
-
|
451
|
-
|
452
|
-
tolerant_columns.include?(h)
|
502
|
+
tolerant_cols.include?(h.to_s.as_sym) || self.omni_tol
|
453
503
|
end
|
454
504
|
|
455
505
|
# :category: Attributes
|
@@ -809,6 +859,7 @@ module FatTable
|
|
809
859
|
unless expr.is_a?(String)
|
810
860
|
raise "must call FatTable::Table\#order_with with a single string expression"
|
811
861
|
end
|
862
|
+
|
812
863
|
rev = false
|
813
864
|
if expr.match?(/\s*!\s*\z/)
|
814
865
|
rev = true
|
@@ -921,8 +972,15 @@ module FatTable
|
|
921
972
|
before: before_hook,
|
922
973
|
after: after_hook)
|
923
974
|
# Compute the new Table from this Table
|
924
|
-
|
975
|
+
result_cols =
|
976
|
+
if cols.include?(:omni)
|
977
|
+
(headers + new_cols.keys - [:omni])
|
978
|
+
else
|
979
|
+
(cols + new_cols.keys)
|
980
|
+
end
|
981
|
+
result = empty_dup(result_cols)
|
925
982
|
normalize_boundaries
|
983
|
+
|
926
984
|
rows.each_with_index do |old_row, old_k|
|
927
985
|
# Set the group number in the before hook and run the hook with the
|
928
986
|
# local variables set to the row before the new row is evaluated.
|
@@ -992,15 +1050,6 @@ module FatTable
|
|
992
1050
|
def where(expr)
|
993
1051
|
expr = expr.to_s
|
994
1052
|
result = empty_dup
|
995
|
-
headers.each do |h|
|
996
|
-
col =
|
997
|
-
if tolerant_col?(h)
|
998
|
-
Column.new(header: h, tolerant: true)
|
999
|
-
else
|
1000
|
-
Column.new(header: h)
|
1001
|
-
end
|
1002
|
-
result.add_column(col)
|
1003
|
-
end
|
1004
1053
|
ev = Evaluator.new(ivars: { row: 0, group: 0 })
|
1005
1054
|
rows.each_with_index do |row, k|
|
1006
1055
|
grp = row_index_to_group_index(k)
|
@@ -1058,10 +1107,13 @@ module FatTable
|
|
1058
1107
|
# boundaries of the constituent tables. Preserves and adjusts the group
|
1059
1108
|
# boundaries of the constituent table.
|
1060
1109
|
def union_all(other)
|
1061
|
-
set_operation(
|
1062
|
-
|
1063
|
-
|
1064
|
-
|
1110
|
+
set_operation(
|
1111
|
+
other,
|
1112
|
+
:+,
|
1113
|
+
distinct: false,
|
1114
|
+
add_boundaries: true,
|
1115
|
+
inherit_boundaries: true
|
1116
|
+
)
|
1065
1117
|
end
|
1066
1118
|
|
1067
1119
|
# :category: Operators
|
@@ -1453,7 +1505,8 @@ module FatTable
|
|
1453
1505
|
groups = sorted_tab.rows.group_by do |r|
|
1454
1506
|
group_cols.map { |k| r[k] }
|
1455
1507
|
end
|
1456
|
-
|
1508
|
+
grp_types = types.select { |k, _v| group_cols.include?(k) }
|
1509
|
+
result = Table.new(*group_cols, **grp_types)
|
1457
1510
|
groups.each_pair do |_vals, grp_rows|
|
1458
1511
|
result << row_from_group(grp_rows, group_cols, agg_cols)
|
1459
1512
|
end
|
data/lib/fat_table/version.rb
CHANGED
data/lib/fat_table.rb
CHANGED
@@ -61,22 +61,22 @@ module FatTable
|
|
61
61
|
|
62
62
|
# Return an empty FatTable::Table object. You can use FatTable::Table#add_row
|
63
63
|
# or FatTable::Table#add_column to populate the table with data.
|
64
|
-
def self.new(*args,
|
65
|
-
Table.new(*args,
|
64
|
+
def self.new(*args, **types)
|
65
|
+
Table.new(*args, **types)
|
66
66
|
end
|
67
67
|
|
68
68
|
# Construct a FatTable::Table from the contents of a CSV file given by the
|
69
69
|
# file name +fname+. Headers will be taken from the first row and converted to
|
70
70
|
# symbols.
|
71
|
-
def self.from_csv_file(fname,
|
72
|
-
Table.from_csv_file(fname,
|
71
|
+
def self.from_csv_file(fname, **types)
|
72
|
+
Table.from_csv_file(fname, **types)
|
73
73
|
end
|
74
74
|
|
75
75
|
# Construct a FatTable::Table from the string +str+, treated in the same
|
76
76
|
# manner as if the input were read from a CSV file. Headers will be taken from the
|
77
77
|
# first row and converted to symbols.
|
78
|
-
def self.from_csv_string(str,
|
79
|
-
Table.from_csv_string(str,
|
78
|
+
def self.from_csv_string(str, **types)
|
79
|
+
Table.from_csv_string(str, **types)
|
80
80
|
end
|
81
81
|
|
82
82
|
# Construct a FatTable::Table from the first table found in the Emacs org-mode
|
@@ -84,8 +84,8 @@ module FatTable
|
|
84
84
|
# is an hline. Otherwise, synthetic headers of the form +:col_1+, +:col_2+,
|
85
85
|
# etc. are created. Any other hlines will be treated as marking a boundary in
|
86
86
|
# the table.
|
87
|
-
def self.from_org_file(fname,
|
88
|
-
Table.from_org_file(fname,
|
87
|
+
def self.from_org_file(fname, **types)
|
88
|
+
Table.from_org_file(fname, **types)
|
89
89
|
end
|
90
90
|
|
91
91
|
# Construct a FatTable::Table from the first table found in the string +str+,
|
@@ -93,8 +93,8 @@ module FatTable
|
|
93
93
|
# are taken from the first row if the second row is an hrule. Otherwise,
|
94
94
|
# synthetic headers of the form :col_1, :col_2, etc. are created. Any other
|
95
95
|
# hlines will be treated as marking a boundary in the table.
|
96
|
-
def self.from_org_string(str,
|
97
|
-
Table.from_org_string(str,
|
96
|
+
def self.from_org_string(str, **types)
|
97
|
+
Table.from_org_string(str, **types)
|
98
98
|
end
|
99
99
|
|
100
100
|
# Construct a FatTable::Table from the array of arrays +aoa+. By default, with
|
@@ -108,8 +108,8 @@ module FatTable
|
|
108
108
|
# org-mode code blocks, by default (+:hlines no+) all hlines are stripped from
|
109
109
|
# the table, otherwise (+:hlines yes+) they are indicated with nil elements in
|
110
110
|
# the outer array.
|
111
|
-
def self.from_aoa(aoa, hlines: false,
|
112
|
-
Table.from_aoa(aoa, hlines: hlines,
|
111
|
+
def self.from_aoa(aoa, hlines: false, **types)
|
112
|
+
Table.from_aoa(aoa, hlines: hlines, **types)
|
113
113
|
end
|
114
114
|
|
115
115
|
# Construct a FatTable::Table from the array of hashes +aoh+, which can be an
|
@@ -117,8 +117,8 @@ module FatTable
|
|
117
117
|
# interpret nil separators as marking boundaries in the new Table. All hashes
|
118
118
|
# must have the same keys, which, converted to symbols, become the headers for
|
119
119
|
# the new Table.
|
120
|
-
def self.from_aoh(aoh, hlines: false,
|
121
|
-
Table.from_aoh(aoh, hlines: hlines,
|
120
|
+
def self.from_aoh(aoh, hlines: false, **types)
|
121
|
+
Table.from_aoh(aoh, hlines: hlines, **types)
|
122
122
|
end
|
123
123
|
|
124
124
|
# Construct a FatTable::Table from another FatTable::Table. Inherit any group
|
@@ -130,8 +130,8 @@ module FatTable
|
|
130
130
|
# Construct a Table by running a SQL query against the database set up with
|
131
131
|
# FatTable.connect. Return the Table with the query results as rows and the
|
132
132
|
# headers from the query, converted to symbols, as headers.
|
133
|
-
def self.from_sql(query,
|
134
|
-
Table.from_sql(query,
|
133
|
+
def self.from_sql(query, **types)
|
134
|
+
Table.from_sql(query, **types)
|
135
135
|
end
|
136
136
|
|
137
137
|
########################################################################
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: fat_table
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.9.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Daniel E. Doherty
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2023-
|
11
|
+
date: 2023-05-22 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: bundler
|