fat_table 0.2.6 → 0.3.0
- checksums.yaml +5 -5
- data/.gitignore +3 -0
- data/.rubocop.yml +3 -0
- data/.travis.yml +7 -2
- data/.yardopts +5 -1
- data/README.org +271 -251
- data/README.rdoc +4 -4
- data/TODO.org +7 -0
- data/bin/ft_console +82 -79
- data/fat_table.gemspec +47 -46
- data/lib/fat_table.rb +11 -2
- data/lib/fat_table/column.rb +41 -29
- data/lib/fat_table/db_handle.rb +31 -50
- data/lib/fat_table/evaluator.rb +26 -24
- data/lib/fat_table/formatters/aoa_formatter.rb +5 -6
- data/lib/fat_table/formatters/aoh_formatter.rb +6 -7
- data/lib/fat_table/formatters/formatter.rb +67 -48
- data/lib/fat_table/formatters/latex_formatter.rb +9 -7
- data/lib/fat_table/formatters/org_formatter.rb +8 -9
- data/lib/fat_table/formatters/term_formatter.rb +19 -18
- data/lib/fat_table/formatters/text_formatter.rb +6 -7
- data/lib/fat_table/patches.rb +30 -1
- data/lib/fat_table/table.rb +143 -117
- data/lib/fat_table/version.rb +1 -1
- data/md/README.md +2167 -0
- metadata +78 -84
data/lib/fat_table/version.rb
CHANGED
data/md/README.md
ADDED
@@ -0,0 +1,2167 @@

# Table of Contents

1. [Introduction](#org23d768e)
2. [Installation](#org8d90fdf)
    1. [Prerequisites](#org26d2aee)
    2. [Installing the gem](#orga19109b)
3. [Usage](#org0b5ecd8)
    1. [Quick Start](#org199fc3a)
    2. [A Word About the Examples](#org1e51988)
    3. [Anatomy of a Table](#org7d48b5d)
        1. [Columns](#org4a6c98f)
        2. [Headers](#org37bbf47)
        3. [Groups](#org1c03cc1)
    4. [Constructing Tables](#orgbf0e735)
        1. [Empty Tables](#org80c41f5)
        2. [From CSV or Org Mode files or strings](#org681a599)
        3. [From Arrays of Arrays](#org4f683cf)
        4. [From Arrays of Hashes](#org7980800)
        5. [From SQL queries](#orgdab2ec1)
        6. [Marking Groups in Input](#orgeb97e36)
    5. [Accessing Parts of Tables](#orgf9cb237)
        1. [Rows](#org4453cea)
        2. [Columns](#org8a6dd85)
        3. [Cells](#orgcc87a8b)
        4. [Other table attributes](#org4a41de4)
    6. [Operations on Tables](#org731fd13)
        1. [Example Input Table](#orga96ca08)
        2. [Select](#orga0c49b3)
        3. [Where](#orge185ad7)
        4. [Order\_by](#org57f51d1)
        5. [Group\_by](#org1ee0a85)
        6. [Join](#org6432f26)
        7. [Set Operations](#org7d2857d)
        8. [Uniq (aka Distinct)](#org073a8b5)
        9. [Remove groups with degroup!](#orgd147303)
    7. [Formatting Tables](#org9f4d633)
        1. [Available Formatters](#orgb7b2335)
        2. [Table Locations](#org4db9ae4)
        3. [Formatting Directives](#orgd2128a3)
        4. [Footers Methods](#org947e8a4)
        5. [Formatting Methods](#orgcef241a)
        6. [The `format` and `format_for` methods](#org7b25866)
4. [Development](#org62e325b)
5. [Contributing](#orgf51a2c9)

[![Build Status](https://travis-ci.org/ddoherty03/fat_table.svg?branch=v0.2.7)](https://travis-ci.org/ddoherty03/fat_table)

<a id="org23d768e"></a>

# Introduction

`FatTable` is a gem that treats tables as a data type. It provides methods for
constructing tables from a variety of sources, building them row-by-row,
extracting rows, columns, and cells, and performing aggregate operations on
columns. It also provides a set of SQL-esque methods for manipulating table
objects: `select` for filtering by columns or for creating new columns, `where`
for filtering by rows, `order_by` for sorting rows, `distinct` for eliminating
duplicate rows, `group_by` for aggregating multiple rows into single rows and
applying column aggregate methods to ungrouped columns, a collection of `join`
methods for combining tables, and more.

Furthermore, `FatTable` provides methods for formatting tables and producing
output that targets various output media: text, ANSI terminals, ruby data
structures, LaTeX tables, Emacs org-mode tables, and more. The formatting
methods can specify cell formatting in a way that is uniform across all the
output methods and can also decorate the output with any number of footers,
including group footers. `FatTable` applies formatting directives to the extent
they make sense for the output medium and treats other formatting directives as
no-ops.

`FatTable` can be used to perform operations on data that are naturally best
conceived of as tables, which in my experience is quite often. It can also serve
as a foundation for providing reporting functions where flexibility about the
output medium can be quite useful. Finally, `FatTable` can be used within Emacs
`org-mode` files in code blocks targeting the Ruby language. Org mode tables are
presented to a ruby code block as an array of arrays, so `FatTable` can read
them in with its `.from_aoa` constructor. A `FatTable` table output as an array
of arrays with its `.to_aoa` output function will be rendered in an org-mode
buffer as an org-table, ready for processing by other code blocks.


<a id="org8d90fdf"></a>

# Installation


<a id="org26d2aee"></a>

## Prerequisites

The `fat_table` gem depends on several libraries being available for building,
mostly those concerned with accessing databases. On an Ubuntu system, the
following packages should be installed before you install the `fat_table` gem:

- ruby-dev
- build-essential
- libsqlite3-dev
- libpq-dev
- libmysqlclient-dev


<a id="orga19109b"></a>

## Installing the gem

Add this line to your application’s Gemfile:

    gem 'fat_table'

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install fat_table


<a id="org0b5ecd8"></a>

# Usage


<a id="org199fc3a"></a>

## Quick Start

`FatTable` provides table objects as a data type that can be constructed and
operated on in a number of ways. Here’s a quick example to illustrate the use of
the main features of `FatTable`. See the detailed explanations further on down.

    require 'fat_table'

    data =
      [['Date', 'Code', 'Raw', 'Shares', 'Price', 'Info', 'Ok'],
       ['2013-05-29', 'S', 15_700.00, 6601.85, 24.7790, 'ENTITY3', 'F'],
       ['2013-05-02', 'P', 118_186.40, 118_186.4, 11.8500, 'ENTITY1', 'T'],
       ['2013-05-20', 'S', 12_000.00, 5046.00, 28.2804, 'ENTITY3', 'F'],
       ['2013-05-23', 'S', 8000.00, 3364.00, 27.1083, 'ENTITY3', 'T'],
       ['2013-05-23', 'S', 39_906.00, 16_780.47, 25.1749, 'ENTITY3', 'T'],
       ['2013-05-20', 'S', 85_000.00, 35_742.50, 28.3224, 'ENTITY3', 'T'],
       ['2013-05-02', 'P', 795_546.20, 795_546.2, 1.1850, 'ENTITY1', 'T'],
       ['2013-05-29', 'S', 13_459.00, 5659.51, 24.7464, 'ENTITY3', 'T'],
       ['2013-05-20', 'S', 33_302.00, 14_003.49, 28.6383, 'ENTITY3', 'T'],
       ['2013-05-29', 'S', 15_900.00, 6685.95, 24.5802, 'ENTITY3', 'T'],
       ['2013-05-30', 'S', 6_679.00, 2808.52, 25.0471, 'ENTITY3', 'T'],
       ['2013-05-23', 'S', 23_054.00, 9694.21, 26.8015, 'ENTITY3', 'F']]

    # Build the Table and then perform chained operations on it

    table = FatTable.from_aoa(data) \
              .where('shares > 2000') \
              .order_by(:date, :code) \
              .select(:date, :code, :shares,
                      :price, :ok, ref: '@row') \
              .select(:ref, :date, :code,
                      :shares, :price, :ok)

    # Convert the table to an ASCII text string

    table.to_text do |fmt|
      # Add some table footers
      fmt.avg_footer(:price, :shares)
      fmt.sum_footer(:shares)
      # Add a group footer
      fmt.gfooter('Avg', shares: :avg, price: :avg)
      # Formats for all locations
      fmt.format(ref: 'CB', numeric: 'R', boolean: 'CY')
      # Formats for different "locations" in the table
      fmt.format_for(:header, string: 'CB')
      fmt.format_for(:body, code: 'C', shares: ',0.1', price: '0.4')
      fmt.format_for(:bfirst, price: '$0.4')
      fmt.format_for(:footer, shares: 'B,0.1', price: '$B0.4')
      fmt.format_for(:gfooter, shares: 'B,0.1', price: 'B0.4')
    end

    +=========+============+======+=============+==========+====+
    |   Ref   |    Date    | Code |   Shares    |  Price   | Ok |
    +---------|------------|------|-------------|----------|----+
    |    1    | 2013-05-02 |  P   |   118,186.4 | $11.8500 | Y  |
    |    2    | 2013-05-02 |  P   |   795,546.2 |   1.1850 | Y  |
    +---------|------------|------|-------------|----------|----+
    |   Avg   |            |      |   456,866.3 |   6.5175 |    |
    +---------|------------|------|-------------|----------|----+
    |    3    | 2013-05-20 |  S   |     5,046.0 |  28.2804 | N  |
    |    4    | 2013-05-20 |  S   |    35,742.5 |  28.3224 | Y  |
    |    5    | 2013-05-20 |  S   |    14,003.5 |  28.6383 | Y  |
    +---------|------------|------|-------------|----------|----+
    |   Avg   |            |      |    18,264.0 |  28.4137 |    |
    +---------|------------|------|-------------|----------|----+
    |    6    | 2013-05-23 |  S   |     3,364.0 |  27.1083 | Y  |
    |    7    | 2013-05-23 |  S   |    16,780.5 |  25.1749 | Y  |
    |    8    | 2013-05-23 |  S   |     9,694.2 |  26.8015 | N  |
    +---------|------------|------|-------------|----------|----+
    |   Avg   |            |      |     9,946.2 |  26.3616 |    |
    +---------|------------|------|-------------|----------|----+
    |    9    | 2013-05-29 |  S   |     6,601.9 |  24.7790 | N  |
    |   10    | 2013-05-29 |  S   |     5,659.5 |  24.7464 | Y  |
    |   11    | 2013-05-29 |  S   |     6,686.0 |  24.5802 | Y  |
    +---------|------------|------|-------------|----------|----+
    |   Avg   |            |      |     6,315.8 |  24.7019 |    |
    +---------|------------|------|-------------|----------|----+
    |   12    | 2013-05-30 |  S   |     2,808.5 |  25.0471 | Y  |
    +---------|------------|------|-------------|----------|----+
    |   Avg   |            |      |     2,808.5 |  25.0471 |    |
    +---------|------------|------|-------------|----------|----+
    | Average |            |      |    85,009.9 | $23.0428 |    |
    +---------|------------|------|-------------|----------|----+
    |  Total  |            |      | 1,020,119.1 |          |    |
    +=========+============+======+=============+==========+====+


<a id="org1e51988"></a>

## A Word About the Examples

When you install the `fat_table` gem, you have access to a program `ft_console`
which opens a `pry` session with `fat_table` loaded and the tables used in the
examples in this `README` defined as instance variables so you can experiment
with them. Because they are defined as instance variables, you have to write
`tab1` as `@tab1` in `ft_console`, but otherwise the examples should work.

The examples in this `README` file are executed as code blocks within the
`README.org` file, so they typically end with a call to `.to_aoa`. That causes
the table to be inserted into the file and formatted as a table. With
`ft_console`, you should instead display your tables with `.to_text` or
`.to_term`. These will return a string that you can print to the terminal with
`puts`.

To read in the table used in the Quick Start section above, you might do the
following:

    $ ft_console
    [1] pry(main)> ls
    ActiveSupport::ToJsonWithActiveSupportEncoder#methods: to_json
    self.methods: inspect to_s
    instance variables:
      @aoa @tab1 @tab2 @tab_a @tab_b @tt
      @data @tab1_str @tab2_str @tab_a_str @tab_b_str
    locals: _ __ _dir_ _ex_ _file_ _in_ _out_ _pry_ lib str version
    [2] pry(main)> table = FatTable.from_aoa(@data)
    => #<FatTable::Table:0x0055b40e6cd870
     @boundaries=[],
     @columns=
      [#<FatTable::Column:0x0055b40e6cc948
        @header=:date,
        @items=
         [Wed, 29 May 2013,
          Thu, 02 May 2013,
          Mon, 20 May 2013,
          Thu, 23 May 2013,
          Thu, 23 May 2013,
          Mon, 20 May 2013,
          Thu, 02 May 2013,
          Wed, 29 May 2013,
          Mon, 20 May 2013,
    ...
        @items=["ENTITY3", "ENTITY1", "ENTITY3", "ENTITY3", "ENTITY3", "ENTITY3", "ENTITY1", "ENTITY3", "ENTITY3", "ENTITY3", "ENTITY3", "ENTITY3"],
        @raw_header=:info,
        @type="String">,
       #<FatTable::Column:0x0055b40e6d2668 @header=:ok, @items=[false, true, false, true, true, true, true, true, true, true, true, false], @raw_header=:ok, @type="Boolean">]>
    [3] pry(main)> puts table.to_text
    +============+======+==========+==========+=========+=========+====+
    | Date       | Code | Raw      | Shares   | Price   | Info    | Ok |
    +------------|------|----------|----------|---------|---------|----+
    | 2013-05-29 | S    | 15700.0  | 6601.85  | 24.779  | ENTITY3 | F  |
    | 2013-05-02 | P    | 118186.4 | 118186.4 | 11.85   | ENTITY1 | T  |
    | 2013-05-20 | S    | 12000.0  | 5046.0   | 28.2804 | ENTITY3 | F  |
    | 2013-05-23 | S    | 8000.0   | 3364.0   | 27.1083 | ENTITY3 | T  |
    | 2013-05-23 | S    | 39906.0  | 16780.47 | 25.1749 | ENTITY3 | T  |
    | 2013-05-20 | S    | 85000.0  | 35742.5  | 28.3224 | ENTITY3 | T  |
    | 2013-05-02 | P    | 795546.2 | 795546.2 | 1.185   | ENTITY1 | T  |
    | 2013-05-29 | S    | 13459.0  | 5659.51  | 24.7464 | ENTITY3 | T  |
    | 2013-05-20 | S    | 33302.0  | 14003.49 | 28.6383 | ENTITY3 | T  |
    | 2013-05-29 | S    | 15900.0  | 6685.95  | 24.5802 | ENTITY3 | T  |
    | 2013-05-30 | S    | 6679.0   | 2808.52  | 25.0471 | ENTITY3 | T  |
    | 2013-05-23 | S    | 23054.0  | 9694.21  | 26.8015 | ENTITY3 | F  |
    +============+======+==========+==========+=========+=========+====+
    => nil
    [4] pry(main)>

And if you use `.to_term`, you can see the effect of the color formatting
directives.


<a id="org7d48b5d"></a>

## Anatomy of a Table


<a id="org4a6c98f"></a>

### Columns

`FatTable::Table` objects consist of an array of `FatTable::Column` objects.
Each `Column` has a header, a type, and an array of items, all of the given type
or `nil`. There are only five permissible types for a `Column`:

1. **Boolean** (for holding ruby `TrueClass` and `FalseClass` objects),
2. **DateTime** (for holding ruby `DateTime` or `Date` objects),
3. **Numeric** (for holding ruby `Integer`, `Rational`, or `BigDecimal` objects),
4. **String** (for ruby `String` objects), or
5. **NilClass** (for the undetermined column type).

When a `Table` is constructed from an external source, all `Columns` start out
having a type of `NilClass`, that is, their type is as yet undetermined. When a
string or object of one of the four determined types is added to a `Column`, it
fixes the type of the column and all further items added to the `Column` must
either be `nil` (indicating no value) or be capable of being coerced to the
column’s type. Otherwise, `FatTable` raises an exception.

Items of input must be either one of the permissible ruby objects or strings. If
they are strings, `FatTable` attempts to parse them as one of the permissible
types as follows:

- **Boolean:** the strings `'t'`, `'true'`, `'yes'`, or `'y'`, regardless of
  case, are interpreted as `TrueClass` and the strings `'f'`, `'false'`,
  `'no'`, or `'n'`, regardless of case, are interpreted as `FalseClass`, in
  either case resulting in a Boolean column. Empty strings in a column
  already having a Boolean type are converted to `nil`.
- **DateTime:** strings that contain patterns of `'yyyy-mm-dd'` or `'yyyy/mm/dd'`
  or `'mm-dd-yyyy'` or `'mm/dd/yyyy'` or any of the foregoing with an added
  `'Thh:mm:ss'` or `'Thh:mm'` will be interpreted as a `DateTime` or a `Date`
  (if there are no sub-day time components present). The number of digits in
  the month and day can be one or two, but the year component must be four
  digits. Any time components are valid if they can be properly interpreted
  by `DateTime.parse`. Org mode timestamps (any of the foregoing surrounded
  by square ’`[]`’ or pointy ’`<>`’ brackets), active or inactive, are valid
  input strings for `DateTime` columns. Empty strings in a column already
  having the `DateTime` type are converted to `nil`.
- **Numeric:** all commas `','`, underscores `'_'`, and `'$'` dollar signs (or
  other currency symbol as set by `FatTable.currency_symbol`) are removed from
  the string and if the remaining string can be interpreted as a `Numeric`,
  it will be. It is interpreted as an `Integer` if there are no decimal
  places in the remaining string, as a `Rational` if the string has the form
  ’`<number>:<number>`’ or ’`<number>/<number>`’, or as a `BigDecimal` if
  there is a decimal point in the remaining string. Empty strings in a column
  already having the `Numeric` type are converted to `nil`.
- **String:** if all else fails, `FatTable` applies `#to_s` to the input value
  and treats it as an item of type `String`. Empty strings in a column
  already having the `String` type are kept as empty strings.
- **NilClass:** until the input contains a non-blank string that can be parsed as
  one of the other types, it has this type, meaning that the type is still
  open. A column consisting entirely of blank strings or `nils` will retain
  the `NilClass` type.
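
The parsing rules above can be sketched in plain Ruby. This is a simplified illustration under stated assumptions, not `FatTable`'s own parser: the `guess_type` helper, its constants, and its regular expressions are invented for this example, and it classifies a single string rather than tracking a column's type across rows.

```ruby
# Hypothetical helper for illustration only; not part of the fat_table gem.
BOOL_RE = /\A(?:t|true|yes|y|f|false|no|n)\z/i
DATE_RE = %r{
  \A(?:\d{4}[-/]\d{1,2}[-/]\d{1,2}|\d{1,2}[-/]\d{1,2}[-/]\d{4})  # yyyy-mm-dd or mm-dd-yyyy
  (?:\s[A-Za-z]{3})?                                              # day name, as in org timestamps
  (?:[T\s]\d{1,2}:\d{2}(?::\d{2})?)?                              # optional hh:mm or hh:mm:ss
  \z
}x
NUM_RE = %r{\A[-+]?\d+(?:\.\d*)?\z|\A[-+]?\d+[:/]\d+\z}

def guess_type(str, currency_symbol: '$')
  s = str.to_s.strip
  return 'NilClass' if s.empty?          # blank input leaves the type open
  return 'Boolean' if s.match?(BOOL_RE)

  # Drop the square or pointy brackets of an org-mode timestamp
  bare = s.sub(/\A[\[<]/, '').sub(/[\]>]\z/, '')
  return 'DateTime' if bare.match?(DATE_RE)

  # Remove commas, underscores, and the currency symbol before testing for a number
  digits = s.delete(',_').gsub(currency_symbol, '')
  return 'Numeric' if digits.match?(NUM_RE)

  'String'                               # if all else fails
end

samples = ['T', '[2016-01-21 Thu]', '$1,234.50', '3/4', 'ENTITY3', '']
samples.each { |s| puts "#{s.inspect} -> #{guess_type(s)}" }
```

The checks run in the same order as the list above, so a string such as `'2006-08-09-1-I'`, which starts like a date but has trailing characters, falls through to `String` in this sketch.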


<a id="org37bbf47"></a>

### Headers

Headers for the columns are formed from the input. No two columns in a table can
have the same header. Headers in the input are converted to symbols by

- converting the header to a string with `#to_s`,
- converting any run of blanks to an underscore `_`,
- removing any characters that are not letters, numbers, or underscores, and
- lowercasing all remaining letters.

Thus, a header of `'Date'` becomes `:date`, a header of `'Id Number'` becomes
`:id_number`, etc. When referring to a column in code, you must use the symbol
form of the header.

If no sensible headers can be discerned from the input, headers of the form
`:col_1`, `:col_2`, etc., are synthesized.
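
As a rough sketch of the conversion steps listed above, the following plain-Ruby helpers apply them in order. Both `normalize_header` and `synthesized_headers` are hypothetical names made up for this illustration; they are not methods of the gem, and the gem's own handling of edge cases may differ.

```ruby
# Hypothetical helpers for illustration only; not part of the fat_table gem.
def normalize_header(raw)
  raw.to_s                          # 1. convert to a string
     .gsub(/\s+/, '_')              # 2. each run of blanks becomes one underscore
     .gsub(/[^A-Za-z0-9_]/, '')     # 3. drop anything but letters, numbers, underscores
     .downcase                      # 4. lowercase the remaining letters
     .to_sym
end

# Fallback names of the form :col_1, :col_2, ... for a given number of columns
def synthesized_headers(count)
  (1..count).map { |k| :"col_#{k}" }
end

puts normalize_header('Date').inspect
puts normalize_header('Id Number').inspect
puts synthesized_headers(3).inspect
```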


<a id="org1c03cc1"></a>

### Groups

The rows of a `FatTable` table can be sub-divided into groups, either from
markers in the input or as a result of certain operations. There is only one
level of grouping, so `FatTable` has no concept of sub-groups. Groups can be
shown on output with rules or “hlines” that underline the last row in each
group, and you can decorate the output with group footers that summarize the
columns in each group.


<a id="orgbf0e735"></a>

## Constructing Tables


<a id="org80c41f5"></a>

### Empty Tables

You can create an empty table with `FatTable.new`, and then add rows with the
`<<` operator and a Hash:

    tab = FatTable.new
    tab << { a: 1, b: 2, c: '<2017-01-21>', d: 'f', e: '' }
    tab << { a: 3.14, b: 2.17, c: '[2016-01-21 Thu]', d: 'Y', e: nil }
    tab.to_aoa

After this, the table will have column headers `:a`, `:b`, `:c`, `:d`, and `:e`.
Columns `:a` and `:b` will have type `Numeric`, column `:c` will have type
`DateTime`, and column `:d` will have type `Boolean`. Column `:e` will still
have an open type. Notice that dates in the input can be wrapped in brackets as
in org-mode time stamps.


<a id="org681a599"></a>

### From CSV or Org Mode files or strings

Tables can also be read from `.csv` files or files containing `org-mode` tables.
In the case of org-mode files, `FatTable` skips through the file until it finds
a line that looks like a table, that is, it begins with any number of spaces
followed by `|-`. Only the first table in an `.org` file is read.

For both `.csv` and `.org` files, the first row in the tables is taken as the
header row, and the headers are converted to symbols as described above.

    tab1 = FatTable.from_csv_file('~/data.csv')
    tab2 = FatTable.from_org_file('~/project.org')

    csv_body = <<-EOS
    Ref,Date,Code,RawShares,Shares,Price,Info
    1,2006-05-02,P,5000,5000,8.6000,2006-08-09-1-I
    2,2006-05-03,P,5000,5000,8.4200,2006-08-09-1-I
    3,2006-05-04,P,5000,5000,8.4000,2006-08-09-1-I
    4,2006-05-10,P,8600,8600,8.0200,2006-08-09-1-D
    5,2006-05-12,P,10000,10000,7.2500,2006-08-09-1-D
    6,2006-05-12,P,2000,2000,6.7400,2006-08-09-1-I
    EOS

    tab3 = FatTable.from_csv_string(csv_body)

    org_body = <<-EOS
    .* Smith Transactions
    :PROPERTIES:
    :TABLE_EXPORT_FILE: smith.csv
    :END:

    #+TBLNAME: smith_tab
    | Ref | Date       | Code |     Raw | Shares |    Price | Info    |
    |-----|------------|------|---------|--------|----------|---------|
    |  29 | 2013-05-02 | P    | 795,546 |  2,609 |  1.18500 | ENTITY1 |
    |  30 | 2013-05-02 | P    | 118,186 |    388 | 11.85000 | ENTITY1 |
    |  31 | 2013-05-02 | P    | 340,948 |  1,926 |  1.18500 | ENTITY2 |
    |  32 | 2013-05-02 | P    |  50,651 |    286 | 11.85000 | ENTITY2 |
    |  33 | 2013-05-20 | S    |  12,000 |     32 | 28.28040 | ENTITY3 |
    |  34 | 2013-05-20 | S    |  85,000 |    226 | 28.32240 | ENTITY3 |
    |  35 | 2013-05-20 | S    |  33,302 |     88 | 28.63830 | ENTITY3 |
    |  36 | 2013-05-23 | S    |   8,000 |     21 | 27.10830 | ENTITY3 |
    |  37 | 2013-05-23 | S    |  23,054 |     61 | 26.80150 | ENTITY3 |
    |  38 | 2013-05-23 | S    |  39,906 |    106 | 25.17490 | ENTITY3 |
    |  39 | 2013-05-29 | S    |  13,459 |     36 | 24.74640 | ENTITY3 |
    |  40 | 2013-05-29 | S    |  15,700 |     42 | 24.77900 | ENTITY3 |
    |  41 | 2013-05-29 | S    |  15,900 |     42 | 24.58020 | ENTITY3 |
    |  42 | 2013-05-30 | S    |   6,679 |     18 | 25.04710 | ENTITY3 |

    .* Another Heading
    EOS

    tab4 = FatTable.from_org_string(org_body)


<a id="org4f683cf"></a>

### From Arrays of Arrays

You can also initialize a table directly from ruby data structures. You can, for
example, build a table from an array of arrays:

    aoa = [
      ['Ref', 'Date', 'Code', 'Raw', 'Shares', 'Price', 'Info', 'Bool'],
      [1, '2013-05-02', 'P', 795_546.20, 795_546.2, 1.1850, 'ENTITY1', 'T'],
      [2, '2013-05-02', 'P', 118_186.40, 118_186.4, 11.8500, 'ENTITY1', 'T'],
      [7, '2013-05-20', 'S', 12_000.00, 5046.00, 28.2804, 'ENTITY3', 'F'],
      [8, '2013-05-20', 'S', 85_000.00, 35_742.50, 28.3224, 'ENTITY3', 'T'],
      [9, '2013-05-20', 'S', 33_302.00, 14_003.49, 28.6383, 'ENTITY3', 'T'],
      [10, '2013-05-23', 'S', 8000.00, 3364.00, 27.1083, 'ENTITY3', 'T'],
      [11, '2013-05-23', 'S', 23_054.00, 9694.21, 26.8015, 'ENTITY3', 'F'],
      [12, '2013-05-23', 'S', 39_906.00, 16_780.47, 25.1749, 'ENTITY3', 'T'],
      [13, '2013-05-29', 'S', 13_459.00, 5659.51, 24.7464, 'ENTITY3', 'T'],
      [14, '2013-05-29', 'S', 15_700.00, 6601.85, 24.7790, 'ENTITY3', 'F'],
      [15, '2013-05-29', 'S', 15_900.00, 6685.95, 24.5802, 'ENTITY3', 'T'],
      [16, '2013-05-30', 'S', 6_679.00, 2808.52, 25.0471, 'ENTITY3', 'T']
    ]
    tab = FatTable.from_aoa(aoa)

Notice that the values can either be ruby objects, such as the numeric literal
`85_000.00`, or strings that can be parsed into one of the permissible column
types.

This method of building a table, `.from_aoa`, is particularly useful in dealing
with Emacs org-mode code blocks. Tables in org-mode are passed to code blocks as
arrays of arrays. Likewise, a result of a code block in the form of an array of
arrays is displayed as an org-mode table:

    #+NAME: trades1
    | Ref  | Date       | Code |  Price | G10 | QP10 | Shares |    LP |     QP |   IPLP |   IPQP |
    |------|------------|------|--------|-----|------|--------|-------|--------|--------|--------|
    | T001 | 2016-11-01 | P    | 7.7000 | T   | F    |    100 |    14 |     86 | 0.2453 | 0.1924 |
    | T002 | 2016-11-01 | P    | 7.7500 | T   | F    |    200 |    28 |    172 | 0.2453 | 0.1924 |
    | T003 | 2016-11-01 | P    | 7.5000 | F   | T    |    800 |   112 |    688 | 0.2453 | 0.1924 |
    | T004 | 2016-11-01 | S    | 7.5500 | T   | F    |   6811 |   966 |   5845 | 0.2453 | 0.1924 |
    | T005 | 2016-11-01 | S    | 7.5000 | F   | F    |   4000 |   572 |   3428 | 0.2453 | 0.1924 |
    | T006 | 2016-11-01 | S    | 7.6000 | F   | T    |   1000 |   143 |    857 | 0.2453 | 0.1924 |
    | T007 | 2016-11-01 | S    | 7.6500 | T   | F    |    200 |    28 |    172 | 0.2453 | 0.1924 |
    | T008 | 2016-11-01 | P    | 7.6500 | F   | F    |   2771 |   393 |   2378 | 0.2453 | 0.1924 |
    | T009 | 2016-11-01 | P    | 7.6000 | F   | F    |   9550 |  1363 |   8187 | 0.2453 | 0.1924 |
    | T010 | 2016-11-01 | P    | 7.5500 | F   | T    |   3175 |   451 |   2724 | 0.2453 | 0.1924 |
    | T011 | 2016-11-02 | P    | 7.4250 | T   | F    |    100 |    14 |     86 | 0.2453 | 0.1924 |
    | T012 | 2016-11-02 | P    | 7.5500 | F   | F    |   4700 |   677 |   4023 | 0.2453 | 0.1924 |
    | T013 | 2016-11-02 | P    | 7.3500 | T   | T    |  53100 |  7656 |  45444 | 0.2453 | 0.1924 |
    | T014 | 2016-11-02 | P    | 7.4500 | F   | T    |   5847 |   835 |   5012 | 0.2453 | 0.1924 |
    | T015 | 2016-11-02 | P    | 7.7500 | F   | F    |    500 |    72 |    428 | 0.2453 | 0.1924 |
    | T016 | 2016-11-02 | P    | 8.2500 | T   | T    |    100 |    14 |     86 | 0.2453 | 0.1924 |

    #+HEADER: :colnames no
    :#+BEGIN_SRC ruby :var tab=trades1
      require 'fat_table'
      tab = FatTable.from_aoa(tab).where('shares > 500')
      tab.to_aoa
    :#+END_SRC

    #+RESULTS:
    | Ref  | Date       | Code | Price | G10 | QP10 | Shares |   Lp |    Qp |   Iplp |   Ipqp |
    |------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
    | T003 | 2016-11-01 | P    |   7.5 | F   | T    |    800 |  112 |   688 | 0.2453 | 0.1924 |
    | T004 | 2016-11-01 | S    |  7.55 | T   | F    |   6811 |  966 |  5845 | 0.2453 | 0.1924 |
    | T005 | 2016-11-01 | S    |   7.5 | F   | F    |   4000 |  572 |  3428 | 0.2453 | 0.1924 |
    | T006 | 2016-11-01 | S    |   7.6 | F   | T    |   1000 |  143 |   857 | 0.2453 | 0.1924 |
    | T008 | 2016-11-01 | P    |  7.65 | F   | F    |   2771 |  393 |  2378 | 0.2453 | 0.1924 |
    | T009 | 2016-11-01 | P    |   7.6 | F   | F    |   9550 | 1363 |  8187 | 0.2453 | 0.1924 |
    | T010 | 2016-11-01 | P    |  7.55 | F   | T    |   3175 |  451 |  2724 | 0.2453 | 0.1924 |
    | T012 | 2016-11-02 | P    |  7.55 | F   | F    |   4700 |  677 |  4023 | 0.2453 | 0.1924 |
    | T013 | 2016-11-02 | P    |  7.35 | T   | T    |  53100 | 7656 | 45444 | 0.2453 | 0.1924 |
    | T014 | 2016-11-02 | P    |  7.45 | F   | T    |   5847 |  835 |  5012 | 0.2453 | 0.1924 |

This example illustrates several things:

1. The named org-mode table, `trades1`, can be passed into a ruby code block
   using the `:var tab=trades1` header argument to the code block; that makes
   the variable `tab` available to the code block as an array of arrays, which
   `FatTable` then uses to initialize the table.
2. The code block requires that you set `:colnames no` in the header arguments.
   This suppresses org-mode’s own processing of the header line so that
   `FatTable` can see the headers. Failure to do this will cause an error.
3. The table is subjected to some processing, in this case selecting those rows
   where the number of shares is greater than 500. More on that later.
4. `FatTable` passes back to org-mode an array of arrays using the `.to_aoa`
   method. In an `org-mode` buffer, these are rendered as tables. We’ll often
   apply `.to_aoa` at the end of example blocks to render the results inside
   this `README.org` file. As we’ll see below, this method can also take a block
   to which formatting directives and footers can be attached.
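
To make the data shape in points 1 to 4 concrete, here is a plain-Ruby sketch that does not use `FatTable` at all. It assumes the code block receives the header row as the first inner array (the effect of `:colnames no` described above) and uses a hand-written `select` as a stand-in for `.where('shares > 500')`. The rows are a trimmed subset of `trades1`, and the variable names are chosen only for this illustration.

```ruby
# A trimmed stand-in for what the code block receives as `tab`:
# an array of arrays whose first row holds the headers.
tab = [
  ['Ref', 'Code', 'Shares'],
  ['T001', 'P', 100],
  ['T003', 'P', 800],
  ['T004', 'S', 6811],
  ['T007', 'S', 200]
]

header, *rows = tab
shares_at = header.index('Shares')

# Plain-Ruby equivalent of keeping rows where shares exceed 500
kept = rows.select { |row| row[shares_at] > 500 }

# Returning an array of arrays, header first, is the shape org-mode renders as a table
result = [header] + kept
result.each { |row| puts row.inspect }
```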
|
550
|
+
|
551
|
+
|
552
|
+
<a id="org7980800"></a>
|
553
|
+
|
554
|
+
### From Arrays of Hashes
|
555
|
+
|
556
|
+
A second ruby data structure that can be used to initialize a `FatTable` table
|
557
|
+
is an array of ruby Hashes. Each hash represents a row of the table, and the
|
558
|
+
headers of the table are take from the keys of the hashes. Accordingly, all the
|
559
|
+
hashes should have the same keys.
|
560
|
+
|
561
|
+
This same method can in fact take an array of any objects that can be converted
|
562
|
+
to a Hash with the `#to_h` method, so you can use an array of your own objects
|
563
|
+
to initialize a table, provided that you define a suitable `#to_h` method for
|
564
|
+
the objects’ class.
|
565
|
+
|
566
|
+
aoh = [
|
567
|
+
{ ref: 'T001', date: '2016-11-01', code: 'P', price: '7.7000', shares: 100 },
|
568
|
+
{ ref: 'T002', date: '2016-11-01', code: 'P', price: 7.7500, shares: 200 },
|
569
|
+
{ ref: 'T003', date: '2016-11-01', code: 'P', price: 7.5000, shares: 800 },
|
570
|
+
{ ref: 'T004', date: '2016-11-01', code: 'S', price: 7.5500, shares: 6811 },
|
571
|
+
{ ref: 'T005', date: Date.today, code: 'S', price: 7.5000, shares: 4000 },
|
572
|
+
{ ref: 'T006', date: '2016-11-01', code: 'S', price: 7.6000, shares: 1000 },
|
573
|
+
{ ref: 'T007', date: '2016-11-01', code: 'S', price: 7.6500, shares: 200 },
|
574
|
+
{ ref: 'T008', date: '2016-11-01', code: 'P', price: 7.6500, shares: 2771 },
|
575
|
+
{ ref: 'T009', date: '2016-11-01', code: 'P', price: 7.6000, shares: 9550 },
|
576
|
+
{ ref: 'T010', date: '2016-11-01', code: 'P', price: 7.5500, shares: 3175 },
|
577
|
+
{ ref: 'T011', date: '2016-11-02', code: 'P', price: 7.4250, shares: 100 },
|
578
|
+
{ ref: 'T012', date: '2016-11-02', code: 'P', price: 7.5500, shares: 4700 },
|
579
|
+
{ ref: 'T013', date: '2016-11-02', code: 'P', price: 7.3500, shares: 53100 },
|
580
|
+
{ ref: 'T014', date: '2016-11-02', code: 'P', price: 7.4500, shares: 5847 },
|
581
|
+
{ ref: 'T015', date: '2016-11-02', code: 'P', price: 7.7500, shares: 500 },
|
582
|
+
{ ref: 'T016', date: '2016-11-02', code: 'P', price: 8.2500, shares: 100 }
|
583
|
+
]
|
584
|
+
tab = FatTable.from_aoh(aoh)
|
585
|
+
|
586
|
+
Notice, again, that the values can either be ruby objects, such as `Date.today`,
|
587
|
+
or strings that can be parsed into one of the permissible column types.
|
588
|
+
|
589
|
+
|
590
|
+
<a id="orgdab2ec1"></a>
|
591
|
+
|
592
|
+
### From SQL queries
|
593
|
+
|
594
|
+
Another way to initialize a `FatTable` table is with the results of a SQL query.
|
595
|
+
`FatTable` uses the `sequel` gem to query databases. You must first set the
|
596
|
+
database parameters to be used for the queries.
|
597
|
+
|
598
|
+
# This automatically requires sequel.
|
599
|
+
require 'fat_table'
|
600
|
+
FatTable.connect(driver: 'Pg',
|
601
|
+
database: 'XXX_development',
|
602
|
+
user: 'dtd',
|
603
|
+
password: 'slflpowert',
|
604
|
+
host: 'localhost',
|
605
|
+
socket: '/tmp/.s.PGSQL.5432')
|
606
|
+
tab = FatTable.from_sql('select * from trades;')
|
607
|
+
|
608
|
+
Some of the parameters to the `.connect` function have defaults. The driver
|
609
|
+
defaults to `'Pg'` for postgresql and the socket defaults to
|
610
|
+
`/tmp/.s.PGSQL.5432` if the host is `'localhost'`, which it is by default. If the
|
611
|
+
host is not `'localhost'`, the DSN uses a port rather than a socket and defaults
|
612
|
+
to port `'5432'`. While user and password default to nil, the database parameter
|
613
|
+
is required.
|
614
|
+
|
615
|
+
The `.connect` function need only be called once, and the database handle it
|
616
|
+
creates will be used for all subsequent `.from_sql` calls until `.connect` is
|
617
|
+
called again.
|
618
|
+
|
619
|
+
Alternatively, you can build the `Sequel` connection with `Sequel.connect` or
|
620
|
+
with adapter-specific `Sequel` connection methods and let `FatTable` know to use
|
621
|
+
that connection:
|
622
|
+
|
623
|
+
require 'fat_table'
|
624
|
+
FatTable.db = Sequel.connect('postgres://user:password@localhost/dbname')
|
625
|
+
FatTable.db = Sequel.ado(conn_string: 'Provider=Microsoft.ACE.OLEDB.12.0;Data Source=drive:\path\filename.accdb')
|
626
|
+
|
627
|
+
Consult `Sequel`'s documentation for details on its connection methods.
|
628
|
+
<http://sequel.jeremyevans.net/rdoc/files/doc/opening_databases_rdoc.html>
|
629
|
+
|
630
|
+
|
631
|
+
<a id="orgeb97e36"></a>
|
632
|
+
|
633
|
+
### Marking Groups in Input
|
634
|
+
|
635
|
+
The `.from_aoa` and `.from_aoh` functions take an optional keyword parameter
|
636
|
+
`hlines:` that, if set to `true`, causes them to mark group boundaries in the
|
637
|
+
table wherever a row Array (for `.from_aoa`) or Hash (for `.from_aoh`) is
|
638
|
+
followed by a `nil`. Each boundary means that the rows above it and after the
|
639
|
+
header or prior group boundary all belong to a group. By default `hlines` is
|
640
|
+
false for both functions so neither expects hlines in its input.
|
641
|
+
|
642
|
+
In the case of `.from_aoa`, if `hlines:` is set true, the input must also
|
643
|
+
include a `nil` as the second element of the outer array to indicate that the
|
644
|
+
first row is to be used as headers. Otherwise, it will synthesize headers of
|
645
|
+
the form `:col_1`, `:col_2`, … `:col_n`.
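To make the `nil` markers concrete, here is a small plain-Ruby sketch of an input shaped the way the two paragraphs above describe. The splitting code is only an illustration of how the markers partition the rows; it is not how `FatTable` processes them internally.

```ruby
aoa = [
  %w[Ref Code Shares],   # header row ...
  nil,                   # ... marked as such by the nil that follows it
  ['T001', 'P', 100],
  ['T002', 'P', 200],
  nil,                   # group boundary
  ['T004', 'S', 6811],
  ['T005', 'S', 4000]
]

# Illustration only: split the rows on the nil markers.
# Enumerable#chunk drops elements whose block value is :_separator.
header_rows, *groups = aoa.chunk { |row| row.nil? ? :_separator : true }.map(&:last)
headers = header_rows.first

# With the gem loaded, the same input would be passed as:
#   tab = FatTable.from_aoa(aoa, hlines: true)
```

Here `headers` holds the first row and `groups` holds two groups of two data rows each.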
|
646
|
+
|
647
|
+
In org mode table text passed to `.from_org_file` and `.from_org_string`, you
|
648
|
+
*must* mark the header row by following it with an hline and you *may* mark
|
649
|
+
group boundaries with an hline. In org mode tables, hlines are table rows
|
650
|
+
beginning with something like ’`|---`’. The `.from_org_...` functions always
|
651
|
+
recognize hlines in the input, so they take no `hlines:` keyword parameter.
|
652
|
+
|
653
|
+
|
654
|
+
<a id="orgf9cb237"></a>
|
655
|
+
|
656
|
+
## Accessing Parts of Tables
|
657
|
+
|
658
|
+
|
659
|
+
<a id="org4453cea"></a>
|
660
|
+
|
661
|
+
### Rows
|
662
|
+
|
663
|
+
A `FatTable` table is an Enumerable, yielding each row of the table as a Hash
|
664
|
+
keyed on the header symbols. The method `Table#rows` returns an Array of the
|
665
|
+
rows as Hashes as well.
|
666
|
+
|
667
|
+
You can also use indexing to access a row of the table by number. Using an
|
668
|
+
integer index returns a Hash of the given row. Thus, `tab[20]` returns the 21st
|
669
|
+
data row of the table, while `tab[0]` returns the first row and `tab[-1]` returns
|
670
|
+
the last row.
|
671
|
+
|
672
|
+
|
673
|
+
<a id="org8a6dd85"></a>
|
674
|
+
|
675
|
+
### Columns
|
676
|
+
|
677
|
+
If the index provided to `[]` is a string or a symbol, it returns an Array of
|
678
|
+
the items of the column with that header. Thus, `tab[:ref]` returns an Array of
|
679
|
+
all the items of the table’s `:ref` column.
|
680
|
+
|
681
|
+
|
682
|
+
<a id="orgcc87a8b"></a>
|
683
|
+
|
684
|
+
### Cells
|
685
|
+
|
686
|
+
The two forms of indexing can be combined to access individual cells of the
|
687
|
+
table:
|
688
|
+
|
689
|
+
tab[13] # => Hash of the 14th row
|
690
|
+
tab[:date] # => Array of all Dates in the :date column
|
691
|
+
tab[13][:date] # => The Date in the 14th row
|
692
|
+
tab[:date][13] # => The Date in the 14th row; indexes can be in either order.
|
693
|
+
|
694
|
+
|
695
|
+
<a id="org4a41de4"></a>
|
696
|
+
|
697
|
+
### Other table attributes
|
698
|
+
|
699
|
+
tab.headers # => an Array of the headers in symbol form
|
700
|
+
tab.types # => a Hash mapping headers to column types
|
701
|
+
tab.size # => the number of rows in the table
|
702
|
+
tab.width # => the number of columns in the table
|
703
|
+
tab.empty? # => is the table empty?
|
704
|
+
tab.column?(head) # => does the table have a column with the given header?
|
705
|
+
tab.groups # => return an Array of the table's groups as Arrays of row Hashes.
|
706
|
+
|
707
|
+
|
708
|
+
<a id="org731fd13"></a>
|
709
|
+
|
710
|
+
## Operations on Tables
|
711
|
+
|
712
|
+
Once you have one or more tables, you will likely want to perform operations on
|
713
|
+
them. The operations provided by `FatTable` are the subject of this section.
|
714
|
+
Before getting into the operations, though, there are a couple of issues that
|
715
|
+
cut across all or many of the operations.
|
716
|
+
|
717
|
+
First, tables are by and large immutable objects. Each operation creates a new
|
718
|
+
table without affecting the input tables. The only exception is the `degroup!`
|
719
|
+
operation, which mutates the receiver table by removing its group boundaries.
|
720
|
+
|
721
|
+
Second, because each operation returns a `FatTable::Table` object, the
|
722
|
+
operations are chainable.
|
723
|
+
|
724
|
+
Third, `FatTable::Table` objects can have “groups” of rows within the table.
|
725
|
+
These can be decorated with hlines and group footers on output. Some of these
|
726
|
+
operations result in marking group boundaries in the result table, others remove
|
727
|
+
group boundaries that may have existed in the input table. Operations that
|
728
|
+
either create or remove groups will be noted below.
|
729
|
+
|
730
|
+
Finally, the operations are for the most part patterned on SQL table operations,
|
731
|
+
but when expressions play a role, you write them using ruby syntax rather than
|
732
|
+
SQL.
|
733
|
+
|
734
|
+
|
735
|
+
<a id="orga96ca08"></a>
|
736
|
+
|
737
|
+
### Example Input Table
|
738
|
+
|
739
|
+
For illustration purposes assume that the following tables are read into ruby
|
740
|
+
variables called `tab1` and `tab2`. We have given the tables groups, marked by
|
741
|
+
the hlines below, and included some duplicate rows to illustrate the effect of
|
742
|
+
certain operations on groups and duplicates.
|
743
|
+
|
744
|
+
require 'fat_table'
|
745
|
+
|
746
|
+
tab1_str = <<-EOS
|
747
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | LP | QP | IPLP | IPQP |
|
748
|
+
|------|------------------|------|--------|-----|------|--------|------|-------|--------|--------|
|
749
|
+
| T001 | [2016-11-01 Tue] | P | 7.7000 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
750
|
+
| T002 | [2016-11-01 Tue] | P | 7.7500 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
751
|
+
| T003 | [2016-11-01 Tue] | P | 7.5000 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
752
|
+
| T003 | [2016-11-01 Tue] | P | 7.5000 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
753
|
+
|------|------------------|------|--------|-----|------|--------|------|-------|--------|--------|
|
754
|
+
| T004 | [2016-11-01 Tue] | S | 7.5500 | T | F | 6811 | 966 | 5845 | 0.2453 | 0.1924 |
|
755
|
+
| T005 | [2016-11-01 Tue] | S | 7.5000 | F | F | 4000 | 572 | 3428 | 0.2453 | 0.1924 |
|
756
|
+
| T006 | [2016-11-01 Tue] | S | 7.6000 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
757
|
+
| T006 | [2016-11-01 Tue] | S | 7.6000 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
758
|
+
| T007 | [2016-11-01 Tue] | S | 7.6500 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
759
|
+
| T008 | [2016-11-01 Tue] | P | 7.6500 | F | F | 2771 | 393 | 2378 | 0.2453 | 0.1924 |
|
760
|
+
| T009 | [2016-11-01 Tue] | P | 7.6000 | F | F | 9550 | 1363 | 8187 | 0.2453 | 0.1924 |
|
761
|
+
|------|------------------|------|--------|-----|------|--------|------|-------|--------|--------|
|
762
|
+
| T010 | [2016-11-01 Tue] | P | 7.5500 | F | T | 3175 | 451 | 2724 | 0.2453 | 0.1924 |
|
763
|
+
| T011 | [2016-11-02 Wed] | P | 7.4250 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
764
|
+
| T012 | [2016-11-02 Wed] | P | 7.5500 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
765
|
+
| T012 | [2016-11-02 Wed] | P | 7.5500 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
766
|
+
| T013 | [2016-11-02 Wed] | P | 7.3500 | T | T | 53100 | 7656 | 45444 | 0.2453 | 0.1924 |
|
767
|
+
|------|------------------|------|--------|-----|------|--------|------|-------|--------|--------|
|
768
|
+
| T014 | [2016-11-02 Wed] | P | 7.4500 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
769
|
+
| T015 | [2016-11-02 Wed] | P | 7.7500 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
770
|
+
| T016 | [2016-11-02 Wed] | P | 8.2500 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
771
|
+
EOS
|
772
|
+
|
773
|
+
tab2_str = <<-EOS
|
774
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | LP | QP | IPLP | IPQP |
|
775
|
+
|------|------------------|------|--------|-----|------|--------|-------|------|--------|--------|
|
776
|
+
| T003 | [2016-11-01 Tue] | P | 7.5000 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
777
|
+
| T003 | [2016-11-01 Tue] | P | 7.5000 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
778
|
+
| T017 | [2016-11-01 Tue] | P | 8.3 | F | T | 1801 | 1201 | 600 | 0.2453 | 0.1924 |
|
779
|
+
|------|------------------|------|--------|-----|------|--------|-------|------|--------|--------|
|
780
|
+
| T018 | [2016-11-01 Tue] | S | 7.152 | T | F | 2516 | 2400 | 116 | 0.2453 | 0.1924 |
|
781
|
+
| T018 | [2016-11-01 Tue] | S | 7.152 | T | F | 2516 | 2400 | 116 | 0.2453 | 0.1924 |
|
782
|
+
| T006 | [2016-11-01 Tue] | S | 7.6000 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
783
|
+
| T007 | [2016-11-01 Tue] | S | 7.6500 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
784
|
+
|------|------------------|------|--------|-----|------|--------|-------|------|--------|--------|
|
785
|
+
| T014 | [2016-11-02 Wed] | P | 7.4500 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
786
|
+
| T015 | [2016-11-02 Wed] | P | 7.7500 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
787
|
+
| T015 | [2016-11-02 Wed] | P | 7.7500 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
788
|
+
| T016 | [2016-11-02 Wed] | P | 8.2500 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
789
|
+
|------|------------------|------|--------|-----|------|--------|-------|------|--------|--------|
|
790
|
+
| T019 | [2017-01-15 Sun] | S | 8.75 | T | F | 300 | 175 | 125 | 0.2453 | 0.1924 |
|
791
|
+
| T020 | [2017-01-19 Thu] | S | 8.25 | F | T | 700 | 615 | 85 | 0.2453 | 0.1924 |
|
792
|
+
| T021 | [2017-01-23 Mon] | P | 7.16 | T | T | 12100 | 11050 | 1050 | 0.2453 | 0.1924 |
|
793
|
+
| T021 | [2017-01-23 Mon] | P | 7.16 | T | T | 12100 | 11050 | 1050 | 0.2453 | 0.1924 |
|
794
|
+
EOS
|
795
|
+
|
796
|
+
tab1 = FatTable.from_org_string(tab1_str)
|
797
|
+
tab2 = FatTable.from_org_string(tab2_str)
|
798
|
+
|
799
|
+
|
800
|
+
<a id="orga0c49b3"></a>
|
801
|
+
|
802
|
+
### Select
|
803
|
+
|
804
|
+
With the `select` method, you can select which existing columns should appear in
|
805
|
+
the output table and create new columns in the output table that are a function
|
806
|
+
of existing and new columns.
|
807
|
+
|
808
|
+
1. Selecting Existing Columns
|
809
|
+
|
810
|
+
Here we select three existing columns by simply passing header symbols in the
|
811
|
+
order we want them to appear in the output. Thus, one use of `select` is to
|
812
|
+
filter and permute the order of existing columns. The `select` method preserves
|
813
|
+
any group boundaries present in the input table.
|
814
|
+
|
815
|
+
tab1.select(:price, :ref, :shares).to_aoa
|
816
|
+
|
817
|
+
| Price | Ref | Shares |
|
818
|
+
|-------|------|--------|
|
819
|
+
| 7.7 | T001 | 100 |
|
820
|
+
| 7.75 | T002 | 200 |
|
821
|
+
| 7.5 | T003 | 800 |
|
822
|
+
| 7.5 | T003 | 800 |
|
823
|
+
|-------|------|--------|
|
824
|
+
| 7.55 | T004 | 6811 |
|
825
|
+
| 7.5 | T005 | 4000 |
|
826
|
+
| 7.6 | T006 | 1000 |
|
827
|
+
| 7.6 | T006 | 1000 |
|
828
|
+
| 7.65 | T007 | 200 |
|
829
|
+
| 7.65 | T008 | 2771 |
|
830
|
+
| 7.6 | T009 | 9550 |
|
831
|
+
|-------|------|--------|
|
832
|
+
| 7.55 | T010 | 3175 |
|
833
|
+
| 7.425 | T011 | 100 |
|
834
|
+
| 7.55 | T012 | 4700 |
|
835
|
+
| 7.55 | T012 | 4700 |
|
836
|
+
| 7.35 | T013 | 53100 |
|
837
|
+
|-------|------|--------|
|
838
|
+
| 7.45 | T014 | 5847 |
|
839
|
+
| 7.75 | T015 | 500 |
|
840
|
+
| 8.25 | T016 | 100 |
|
841
|
+
|
842
|
+
2. Adding New Columns
|
843
|
+
|
844
|
+
More interesting is that `select` can take hash-like keyword arguments after the
|
845
|
+
symbol arguments to create new columns in the output as functions of other
|
846
|
+
columns. For each hash-like parameter, the keyword given must be a symbol, which
|
847
|
+
becomes the header for the new column, and the value must be either: (1) a
|
848
|
+
symbol representing an existing column, which has the effect of renaming an
|
849
|
+
existing column, or (2) a string representing a ruby expression for the value of
|
850
|
+
a new column.
|
851
|
+
|
852
|
+
Within the string expression, the names of existing or already-specified columns
|
853
|
+
are available as local variables, as well as the instance variables `@row` and
|
854
|
+
`@group`. So for our example table, the string expressions for new columns have
|
855
|
+
access to local variables `ref`, `date`, `code`, `price`, `g10`, `qp10`,
|
856
|
+
`shares`, `lp`, `qp`, `iplp`, and `ipqp` as well as the instance variables
|
857
|
+
`@row` and `@group`. The local variables are set to the values of the cell in
|
858
|
+
their respective columns for each row in the input table and the instance
|
859
|
+
variables are set to the number of the current row and group respectively.
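One way to picture this evaluation model is the plain-Ruby sketch below. It is an assumption-laden illustration, not `FatTable`'s evaluator: the helper `eval_for_row` is hypothetical, and it simply binds each cell to a local variable and sets `@row` before evaluating the string.

```ruby
# Hypothetical helper: evaluate +expr+ with the row's cells available as
# local variables and @row set to the row number.
def eval_for_row(expr, row, row_num)
  context = Object.new
  context.instance_variable_set(:@row, row_num)
  scope = context.instance_eval { binding }  # binding whose self is context
  row.each { |header, value| scope.local_variable_set(header, value) }
  scope.eval(expr)
end

cells = { ref: 'T003', price: 7.5, shares: 800 }
cost  = eval_for_row('price * shares', cells, 3)
odd   = eval_for_row('@row.odd?', cells, 3)
```

Since the strings are evaluated as ruby code, expressions of this kind should only come from trusted sources.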
|
860
|
+
|
861
|
+
For example, if we want to rename the `:date` column and add a new column to
|
862
|
+
compute the cost of shares, we could do the following:
|
863
|
+
|
864
|
+
tab1.select(:ref, :price, :shares, traded_on: :date, cost: 'price * shares').to_aoa
|
865
|
+
|
866
|
+
| Ref | Price | Shares | Traded On | Cost |
|
867
|
+
|------|-------|--------|------------|----------|
|
868
|
+
| T001 | 7.7 | 100 | 2016-11-01 | 770.0 |
|
869
|
+
| T002 | 7.75 | 200 | 2016-11-01 | 1550.0 |
|
870
|
+
| T003 | 7.5 | 800 | 2016-11-01 | 6000.0 |
|
871
|
+
| T003 | 7.5 | 800 | 2016-11-01 | 6000.0 |
|
872
|
+
|------|-------|--------|------------|----------|
|
873
|
+
| T004 | 7.55 | 6811 | 2016-11-01 | 51423.05 |
|
874
|
+
| T005 | 7.5 | 4000 | 2016-11-01 | 30000.0 |
|
875
|
+
| T006 | 7.6 | 1000 | 2016-11-01 | 7600.0 |
|
876
|
+
| T006 | 7.6 | 1000 | 2016-11-01 | 7600.0 |
|
877
|
+
| T007 | 7.65 | 200 | 2016-11-01 | 1530.0 |
|
878
|
+
| T008 | 7.65 | 2771 | 2016-11-01 | 21198.15 |
|
879
|
+
| T009 | 7.6 | 9550 | 2016-11-01 | 72580.0 |
|
880
|
+
|------|-------|--------|------------|----------|
|
881
|
+
| T010 | 7.55 | 3175 | 2016-11-01 | 23971.25 |
|
882
|
+
| T011 | 7.425 | 100 | 2016-11-02 | 742.5 |
|
883
|
+
| T012 | 7.55 | 4700 | 2016-11-02 | 35485.0 |
|
884
|
+
| T012 | 7.55 | 4700 | 2016-11-02 | 35485.0 |
|
885
|
+
| T013 | 7.35 | 53100 | 2016-11-02 | 390285.0 |
|
886
|
+
|------|-------|--------|------------|----------|
|
887
|
+
| T014 | 7.45 | 5847 | 2016-11-02 | 43560.15 |
|
888
|
+
| T015 | 7.75 | 500 | 2016-11-02 | 3875.0 |
|
889
|
+
| T016 | 8.25 | 100 | 2016-11-02 | 825.0 |
|
890
|
+
|
891
|
+
The parameter ’`traded_on: :date`’ caused the `:date` column of the input table
|
892
|
+
to be renamed `:traded_on`, and the parameter `cost: 'price * shares'` created
|
893
|
+
a new column, `:cost`, as the product of values in the `:price` and `:shares`
|
894
|
+
columns.
|
895
|
+
|
896
|
+
The order of the columns in the result tables is the same as the order of the
|
897
|
+
parameters to the `select` method. So, you can re-order the columns with a
|
898
|
+
second, chained call to `select`:
|
899
|
+
|
900
|
+
tab1.select(:ref, :price, :shares, traded_on: :date, cost: 'price * shares') \
|
901
|
+
.select(:ref, :traded_on, :price, :shares, :cost) \
|
902
|
+
.to_aoa
|
903
|
+
|
904
|
+
| Ref | Traded On | Price | Shares | Cost |
|
905
|
+
|------|------------|-------|--------|----------|
|
906
|
+
| T001 | 2016-11-01 | 7.7 | 100 | 770.0 |
|
907
|
+
| T002 | 2016-11-01 | 7.75 | 200 | 1550.0 |
|
908
|
+
| T003 | 2016-11-01 | 7.5 | 800 | 6000.0 |
|
909
|
+
| T003 | 2016-11-01 | 7.5 | 800 | 6000.0 |
|
910
|
+
|------|------------|-------|--------|----------|
|
911
|
+
| T004 | 2016-11-01 | 7.55 | 6811 | 51423.05 |
|
912
|
+
| T005 | 2016-11-01 | 7.5 | 4000 | 30000.0 |
|
913
|
+
| T006 | 2016-11-01 | 7.6 | 1000 | 7600.0 |
|
914
|
+
| T006 | 2016-11-01 | 7.6 | 1000 | 7600.0 |
|
915
|
+
| T007 | 2016-11-01 | 7.65 | 200 | 1530.0 |
|
916
|
+
| T008 | 2016-11-01 | 7.65 | 2771 | 21198.15 |
|
917
|
+
| T009 | 2016-11-01 | 7.6 | 9550 | 72580.0 |
|
918
|
+
|------|------------|-------|--------|----------|
|
919
|
+
| T010 | 2016-11-01 | 7.55 | 3175 | 23971.25 |
|
920
|
+
| T011 | 2016-11-02 | 7.425 | 100 | 742.5 |
|
921
|
+
| T012 | 2016-11-02 | 7.55 | 4700 | 35485.0 |
|
922
|
+
| T012 | 2016-11-02 | 7.55 | 4700 | 35485.0 |
|
923
|
+
| T013 | 2016-11-02 | 7.35 | 53100 | 390285.0 |
|
924
|
+
|------|------------|-------|--------|----------|
|
925
|
+
| T014 | 2016-11-02 | 7.45 | 5847 | 43560.15 |
|
926
|
+
| T015 | 2016-11-02 | 7.75 | 500 | 3875.0 |
|
927
|
+
| T016 | 2016-11-02 | 8.25 | 100 | 825.0 |
|
928
|
+
|
929
|
+
3. Custom Instance Variables and Hooks
|
930
|
+
|
931
|
+
As the above examples demonstrate, the instance variables `@row` and `@group`
|
932
|
+
are available when evaluating expressions that add new columns. You can also set
|
933
|
+
up your own instance variables for keeping track of things that cross
|
934
|
+
row boundaries, such as running sums.
|
935
|
+
|
936
|
+
To declare instance variables, you can use the `ivars:` hash parameter to
|
937
|
+
`select`. Each key of the hash becomes an instance variable and each value
|
938
|
+
becomes its initial value before any rows are evaluated.
|
939
|
+
|
940
|
+
In addition, you can provide `before_hook:` and `after_hook:` parameters to
|
941
|
+
`select` as strings that are evaluated as ruby expressions before and after each
|
942
|
+
row is processed. You can use these to update instance variables. The values set
|
943
|
+
in the `before_hook:` can be used in expressions for adding new columns by
|
944
|
+
referencing them with the `@` prefix.
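The order of events described above (initial values first, then the hook before each row, then the column expressions) can be sketched in plain Ruby. This is only an illustration of the sequencing, using a hypothetical `ivars` Hash in place of real instance variables and `Rational` prices to keep the arithmetic exact.

```ruby
rows = [
  { ref: 'T001', price: Rational('7.7'),  shares: 100 },
  { ref: 'T002', price: Rational('7.75'), shares: 200 },
  { ref: 'T003', price: Rational('7.5'),  shares: 800 },
  { ref: 'T003', price: Rational('7.5'),  shares: 800 }
]

# Plays the role of ivars: { total_cost: 0 }.
ivars = { total_cost: 0 }

cumulative = rows.map do |row|
  # The "before hook": runs before the row's new columns are computed.
  ivars[:total_cost] += row[:price] * row[:shares]
  # The new column then reads the updated value.
  ivars[:total_cost]
end
```

The resulting running totals for these four rows are 770, 2320, 8320, and 14320.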
|
945
|
+
|
946
|
+
For example, suppose we wanted to not only add a cost column, but a column that
|
947
|
+
shows the cumulative cost after each transaction in our example table. The
|
948
|
+
following example uses the `ivars:` and `before_hook:` parameters to keep track
|
949
|
+
of the running cost of shares, then formats the table.
|
950
|
+
|
951
|
+
tab = tab1.select(:ref, :price, :shares, traded_on: :date, \
|
952
|
+
cost: 'price * shares', cumulative: '@total_cost', \
|
953
|
+
ivars: { total_cost: 0 }, \
|
954
|
+
before_hook: '@total_cost += price * shares')
|
955
|
+
FatTable.to_aoa(tab) do |f|
|
956
|
+
f.format(price: '0.4', shares: '0.0,', cost: '0.2,', cumulative: '0.2,')
|
957
|
+
end
|
958
|
+
|
959
|
+
| Ref | Price | Shares | Traded On | Cost | Cumulative |
|
960
|
+
|------|--------|--------|------------|------------|------------|
|
961
|
+
| T001 | 7.7000 | 100 | 2016-11-01 | 770.00 | 770.00 |
|
962
|
+
| T002 | 7.7500 | 200 | 2016-11-01 | 1,550.00 | 2,320.00 |
|
963
|
+
| T003 | 7.5000 | 800 | 2016-11-01 | 6,000.00 | 8,320.00 |
|
964
|
+
| T003 | 7.5000 | 800 | 2016-11-01 | 6,000.00 | 14,320.00 |
|
965
|
+
|------|--------|--------|------------|------------|------------|
|
966
|
+
| T004 | 7.5500 | 6,811 | 2016-11-01 | 51,423.05 | 65,743.05 |
|
967
|
+
| T005 | 7.5000 | 4,000 | 2016-11-01 | 30,000.00 | 95,743.05 |
|
968
|
+
| T006 | 7.6000 | 1,000 | 2016-11-01 | 7,600.00 | 103,343.05 |
|
969
|
+
| T006 | 7.6000 | 1,000 | 2016-11-01 | 7,600.00 | 110,943.05 |
|
970
|
+
| T007 | 7.6500 | 200 | 2016-11-01 | 1,530.00 | 112,473.05 |
|
971
|
+
| T008 | 7.6500 | 2,771 | 2016-11-01 | 21,198.15 | 133,671.20 |
|
972
|
+
| T009 | 7.6000 | 9,550 | 2016-11-01 | 72,580.00 | 206,251.20 |
|
973
|
+
|------|--------|--------|------------|------------|------------|
|
974
|
+
| T010 | 7.5500 | 3,175 | 2016-11-01 | 23,971.25 | 230,222.45 |
|
975
|
+
| T011 | 7.4250 | 100 | 2016-11-02 | 742.50 | 230,964.95 |
|
976
|
+
| T012 | 7.5500 | 4,700 | 2016-11-02 | 35,485.00 | 266,449.95 |
|
977
|
+
| T012 | 7.5500 | 4,700 | 2016-11-02 | 35,485.00 | 301,934.95 |
|
978
|
+
| T013 | 7.3500 | 53,100 | 2016-11-02 | 390,285.00 | 692,219.95 |
|
979
|
+
|------|--------|--------|------------|------------|------------|
|
980
|
+
| T014 | 7.4500 | 5,847 | 2016-11-02 | 43,560.15 | 735,780.10 |
|
981
|
+
| T015 | 7.7500 | 500 | 2016-11-02 | 3,875.00 | 739,655.10 |
|
982
|
+
| T016 | 8.2500 | 100 | 2016-11-02 | 825.00 | 740,480.10 |
|
983
|
+
|
984
|
+
4. Argument Order and Boundaries
|
985
|
+
|
986
|
+
Notice that `select` can take any number of arguments but all the symbol
|
987
|
+
arguments must come first followed by all the hash-like keyword arguments,
|
988
|
+
including the special arguments for instance variables and hooks.
|
989
|
+
|
990
|
+
As the example illustrates, `.select` transmits any group boundaries in its
|
991
|
+
input table to the result table.
|
992
|
+
|
993
|
+
|
994
|
+
<a id="orge185ad7"></a>
|
995
|
+
|
996
|
+
### Where
|
997
|
+
|
998
|
+
You can filter the rows of the result table with the `.where` method. It takes a
|
999
|
+
single string expression as an argument which is evaluated in a manner similar
|
1000
|
+
to `.select` in which the values of the cells in each column are available as
|
1001
|
+
local variables and the instance variables `@row` and `@group` are available for
|
1002
|
+
testing. The expression is evaluated for each row, and if the expression
|
1003
|
+
evaluates to a truthy value, the row is included in the output, otherwise it is
|
1004
|
+
not. The `.where` method obliterates any group boundaries in the input, so the
|
1005
|
+
output table has only a single group.
|
1006
|
+
|
1007
|
+
Here we select only those even-numbered rows where either of the two boolean
|
1008
|
+
fields is true:
|
1009
|
+
|
1010
|
+
tab1.where('@row.even? && (g10 || qp10)') \
|
1011
|
+
.to_aoa
|
1012
|
+
|
1013
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1014
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1015
|
+
| T002 | 2016-11-01 | P | 7.75 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1016
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1017
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1018
|
+
| T010 | 2016-11-01 | P | 7.55 | F | T | 3175 | 451 | 2724 | 0.2453 | 0.1924 |
|
1019
|
+
| T013 | 2016-11-02 | P | 7.35 | T | T | 53100 | 7656 | 45444 | 0.2453 | 0.1924 |
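The same filter can be cross-checked in plain Ruby on the first few rows of the example table. This sketch assumes, consistent with the output above, that `@row` counts rows starting from 1; it does not use `FatTable` itself.

```ruby
rows = [
  { ref: 'T001', g10: true,  qp10: false },
  { ref: 'T002', g10: true,  qp10: false },
  { ref: 'T003', g10: false, qp10: true },
  { ref: 'T003', g10: false, qp10: true },
  { ref: 'T004', g10: true,  qp10: false },
  { ref: 'T005', g10: false, qp10: false }
]

# Number the rows from 1 and keep the even-numbered ones where either flag is set.
kept = rows.each.with_index(1)
           .select { |row, n| n.even? && (row[:g10] || row[:qp10]) }
           .map(&:first)
refs = kept.map { |row| row[:ref] }
```

Rows 2 and 4 survive, so `refs` holds `T002` and `T003`, matching the first two rows of the output above.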
|
1020
|
+
|
1021
|
+
|
1022
|
+
<a id="org57f51d1"></a>
|
1023
|
+
|
1024
|
+
### Order\_by
|
1025
|
+
|
1026
|
+
You can sort a table on any number of columns with `order_by`. The `order_by`
|
1027
|
+
method takes any number of symbol arguments for the columns to sort on. If you
|
1028
|
+
specify more than one column, the sort is performed on the first column, then
|
1029
|
+
all columns that are equal with respect to the first column are sorted by the
|
1030
|
+
second column, and so on. All columns of the input table are included in the
|
1031
|
+
output.
|
1032
|
+
|
1033
|
+
Let’s sort our table first by `:code`, then by `:date`.
|
1034
|
+
|
1035
|
+
tab1.order_by(:code, :date) \
|
1036
|
+
.to_aoa
|
1037
|
+
|
1038
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1039
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1040
|
+
| T001 | 2016-11-01 | P | 7.7 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1041
|
+
| T002 | 2016-11-01 | P | 7.75 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1042
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1043
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1044
|
+
| T008 | 2016-11-01 | P | 7.65 | F | F | 2771 | 393 | 2378 | 0.2453 | 0.1924 |
|
1045
|
+
| T009 | 2016-11-01 | P | 7.6 | F | F | 9550 | 1363 | 8187 | 0.2453 | 0.1924 |
|
1046
|
+
| T010 | 2016-11-01 | P | 7.55 | F | T | 3175 | 451 | 2724 | 0.2453 | 0.1924 |
|
1047
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1048
|
+
| T011 | 2016-11-02 | P | 7.425 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1049
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1050
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1051
|
+
| T013 | 2016-11-02 | P | 7.35 | T | T | 53100 | 7656 | 45444 | 0.2453 | 0.1924 |
|
1052
|
+
| T014 | 2016-11-02 | P | 7.45 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
1053
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1054
|
+
| T016 | 2016-11-02 | P | 8.25 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1055
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1056
|
+
| T004 | 2016-11-01 | S | 7.55 | T | F | 6811 | 966 | 5845 | 0.2453 | 0.1924 |
|
1057
|
+
| T005 | 2016-11-01 | S | 7.5 | F | F | 4000 | 572 | 3428 | 0.2453 | 0.1924 |
|
1058
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1059
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1060
|
+
| T007 | 2016-11-01 | S | 7.65 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1061
|
+
|
1062
|
+
The interesting thing about `order_by` is that, while it ignores groups in its
|
1063
|
+
input, it adds group boundaries in the output table at those rows where the sort
|
1064
|
+
keys change. Thus, in each group, `:code` and `:date` are the same, and when
|
1065
|
+
either changes, `order_by` inserts a group boundary.
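The boundary rule can be sketched in plain Ruby with `sort_by` and `slice_when`. This illustrates the behavior described above on a handful of rows; it is not taken from `FatTable`'s source.

```ruby
rows = [
  { ref: 'T004', date: '2016-11-01', code: 'S' },
  { ref: 'T011', date: '2016-11-02', code: 'P' },
  { ref: 'T001', date: '2016-11-01', code: 'P' },
  { ref: 'T012', date: '2016-11-02', code: 'P' },
  { ref: 'T002', date: '2016-11-01', code: 'P' }
]

# The sort key mirrors order_by(:code, :date).
key = ->(row) { [row[:code], row[:date]] }

sorted = rows.sort_by(&key)
# Start a new group wherever the sort key changes between adjacent rows.
groups = sorted.slice_when { |a, b| key.call(a) != key.call(b) }.to_a
```

This yields three groups: `P` on 2016-11-01, `P` on 2016-11-02, and `S` on 2016-11-01.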
|
1066
|
+
|
1067
|
+
|
1068
|
+
<a id="org1ee0a85"></a>
|
1069
|
+
|
1070
|
+
### Group\_by
|
1071
|
+
|
1072
|
+
Like `order_by`, `group_by` takes a set of parameters of column header symbols,
|
1073
|
+
the “grouping parameters”, by which to sort the table into a set of groups that
|
1074
|
+
are equal with respect to values in those columns. In addition, those parameters
|
1075
|
+
can be followed by a series of hash-like parameters, the “aggregating
|
1076
|
+
parameters”, that indicate how any of the remaining, non-group columns are to be
|
1077
|
+
aggregated into a single value. The output table has one row for each group for
|
1078
|
+
which the grouping parameters are equal containing those columns and an
|
1079
|
+
aggregate column for each of the aggregating parameters.
|
1080
|
+
|
1081
|
+
For example, let’s summarize our example table by `:code` and `:date` again,
|
1082
|
+
and determine total shares, average price, and a few other features of each
|
1083
|
+
group:
|
1084
|
+
|
1085
|
+
tab1.group_by(:code, :date, price: :avg,
|
1086
|
+
shares: :sum, lp: :sum, qp: :sum,
|
1087
|
+
qp10: :all?) \
|
1088
|
+
.to_aoa { |f| f.format(avg_price: '0.5R') }
|
1089
|
+
|
1090
|
+
| Code | Date | Avg Price | Sum Shares | Sum Lp | Sum Qp | All QP10 |
|
1091
|
+
|------|------------|-----------|------------|--------|--------|----------|
|
1092
|
+
| P | 2016-11-01 | 7.60714 | 17396 | 2473 | 14923 | F |
|
1093
|
+
| P | 2016-11-02 | 7.61786 | 69047 | 9945 | 59102 | F |
|
1094
|
+
| S | 2016-11-01 | 7.58000 | 13011 | 1852 | 11159 | F |
|
1095
|
+
|
1096
|
+
After the grouping column parameters, `:code` and `:date`, there are several
|
1097
|
+
hash-like “aggregating” parameters where the key is the column to aggregate and
|
1098
|
+
the value is a symbol for one of several aggregating methods that
|
1099
|
+
`FatTable::Column` objects understand. For example, the `:avg` method is applied
|
1100
|
+
to the `:price` column so that the output shows the average price in each group.
|
1101
|
+
The `:shares`, `:lp`, and `:qp` columns are summed, and the `:all?` aggregate is
|
1102
|
+
applied to one of the boolean fields, that is, it is `true` if all of the values
|
1103
|
+
in that column are `true`. The column names in the output of the aggregated
|
1104
|
+
columns have the name of the aggregating method prepended to the column name.
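For comparison, the grouping and aggregation can be approximated in plain Ruby. This sketch uses the five `S` rows of the example table plus a single `P` row, and hand-picked output names that imitate the prepended-method naming described above. It illustrates the semantics only and does not use `FatTable`.

```ruby
rows = [
  { code: 'S', date: '2016-11-01', price: 7.55, shares: 6811, qp10: false },
  { code: 'S', date: '2016-11-01', price: 7.5,  shares: 4000, qp10: false },
  { code: 'S', date: '2016-11-01', price: 7.6,  shares: 1000, qp10: true },
  { code: 'S', date: '2016-11-01', price: 7.6,  shares: 1000, qp10: true },
  { code: 'S', date: '2016-11-01', price: 7.65, shares: 200,  qp10: false },
  { code: 'P', date: '2016-11-02', price: 8.25, shares: 100,  qp10: true }
]

# One output row per distinct (code, date) pair.
summary = rows.group_by { |r| [r[:code], r[:date]] }.map do |(code, date), members|
  prices = members.map { |r| r[:price] }
  {
    code: code,
    date: date,
    avg_price: (prices.sum / prices.size).round(5),
    sum_shares: members.sum { |r| r[:shares] },
    all_qp10: members.all? { |r| r[:qp10] }
  }
end
```

The `S` row of `summary` agrees with the `S` line of the output table above: an average price of 7.58, 13011 shares, and `false` for the `all?` aggregate.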
|
1105
|
+
|
1106
|
+
Here is a list of all the aggregate methods available. If the description
|
1107
|
+
restricts the aggregate to particular column types, applying it to other types
|
1108
|
+
will raise an exception.
|
1109
|
+
|
1110
|
+
- **`first`:** the first non-nil item in the column,
|
1111
|
+
- **`last`:** the last non-nil item in the column,
|
1112
|
+
- **`rng`:** form a string of the form `"#{first}..#{last}"` to show the range of
|
1113
|
+
values in the column,
|
1114
|
+
- **`sum`:** for `Numeric` and `String` columns, apply ’+’ to all the non-nil
|
1115
|
+
values,
|
1116
|
+
- **`count`:** the number of non-nil values in the column,
|
1117
|
+
- **`min`:** for `Numeric`, `String`, and `DateTime` columns, return the smallest
|
1118
|
+
non-nil value in the column,
|
1119
|
+
- **`max`:** for `Numeric`, `String`, and `DateTime` columns, return the largest
|
1120
|
+
non-nil value in the column,
|
1121
|
+
- **`avg`:** for `Numeric` and `DateTime` columns, return the arithmetic mean of
|
1122
|
+
the non-nil values in the column; with respect to `Date` or `DateTime`
|
1123
|
+
objects, each is converted to a numeric Julian date, the average is
|
1124
|
+
calculated, and the result converted back to a `Date` or `DateTime` object,
|
1125
|
+
- **`var`:** for `Numeric` and `DateTime` columns, compute the sample variance of
|
1126
|
+
the non-nil values in the column, dates are converted to Julian date
|
1127
|
+
numbers as for the `:avg` aggregate,
|
1128
|
+
- **`pvar`:** for `Numeric` and `DateTime` columns, compute the population
|
1129
|
+
variance of the non-nil values in the column, dates are converted to Julian
|
1130
|
+
date numbers as for the `:avg` aggregate,
|
1131
|
+
- **`dev`:** for `Numeric` and `DateTime` columns, compute the sample standard
|
1132
|
+
deviation of the non-nil values in the column, dates are converted to
|
1133
|
+
Julian date numbers as for the `:avg` aggregate,
|
1134
|
+
- **`pdev`:** for `Numeric` and `DateTime` columns, compute the population
|
1135
|
+
standard deviation of the non-nil values in the column, dates are converted
|
1136
|
+
to Julian date numbers as for the `:avg` aggregate,
|
1137
|
+
- **`all?`:** for `Boolean` columns only, return true if all of the non-nil values
|
1138
|
+
in the column are true,
|
1139
|
+
- **`any?`:** for `Boolean` columns only, return true if any non-nil value in the
|
1140
|
+
column is true,
|
1141
|
+
- **`none?`:** for `Boolean` columns only, return true if no non-nil value in the
|
1142
|
+
column is true,
|
1143
|
+
- **`one?`:** for `Boolean` columns only, return true if exactly one non-nil value
|
1144
|
+
in the column is true,
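
The arithmetic behind a few of these aggregates can be sketched in plain Ruby. This sketch does not use `FatTable` itself: the helper names `sample_var`, `pop_var`, and `avg_date` are hypothetical, and it only illustrates the difference between the sample and population variance and the Julian-date averaging described for `avg`, skipping nils as the list states.

```ruby
require 'date'

# Hypothetical helpers illustrating the arithmetic; not FatTable's code.

# Sample variance: sum of squared deviations divided by (n - 1).
def sample_var(values)
  vals = values.compact
  mean = vals.sum.to_f / vals.size
  vals.sum { |v| (v - mean)**2 } / (vals.size - 1)
end

# Population variance: sum of squared deviations divided by n.
def pop_var(values)
  vals = values.compact
  mean = vals.sum.to_f / vals.size
  vals.sum { |v| (v - mean)**2 } / vals.size
end

# Average of dates: convert each to a Julian day number, average the
# numbers, and convert the (rounded) result back to a Date.
def avg_date(dates)
  jds = dates.compact.map(&:jd)
  Date.jd((jds.sum.to_f / jds.size).round)
end

nums = [2, 4, nil, 4, 4, 5, 5, 7, 9]
puts pop_var(nums)             # population variance (like pvar)
puts Math.sqrt(pop_var(nums))  # population standard deviation (like pdev)
puts sample_var(nums)          # sample variance (like var)
puts avg_date([Date.new(2016, 11, 1), nil, Date.new(2016, 11, 3)])
```

The sample variants divide by one less than the number of non-nil values, so they are always somewhat larger than their population counterparts on the same data.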
|
1145
|
+
|
1146
|
+
Perhaps surprisingly, the `group_by` method ignores any groups in its input and
|
1147
|
+
results in no group boundaries in the output since each group formed by the
|
1148
|
+
implicit `order_by` on the grouping columns is collapsed into a single row.
|
1149
|
+
|
1150
|
+
|
1151
|
+
<a id="org6432f26"></a>
|
1152
|
+
|
1153
|
+
### Join
|
1154
|
+
|
1155
|
+
1. Join Types
|
1156
|
+
|
1157
|
+
So far, all the operations have operated on a single table. `FatTable` provides
|
1158
|
+
several `join` methods for combining two tables, each of which takes as
|
1159
|
+
parameters (1) a second table and (2) except in the case of `cross_join`, zero
|
1160
|
+
or more “join expressions”. In the descriptions below, `T1` is the table on
|
1161
|
+
which the method is called, `T2` is the table supplied as the first parameter
|
1162
|
+
`other`, and `R1` and `R2` are rows in their respective tables being considered
|
1163
|
+
for inclusion in the joined output table.
|
1164
|
+
|
1165
|
+
- **`join(other, *jexps)`:** Performs an “inner join” on the tables. For each row
|
1166
|
+
`R1` of `T1`, the joined table has a row for each row in `T2` that
|
1167
|
+
satisfies the join condition with `R1`.
|
1168
|
+
|
1169
|
+
- **`left_join(other, *jexps)`:** First, an inner join is performed. Then, for
|
1170
|
+
each row in `T1` that does not satisfy the join condition with any row in
|
1171
|
+
`T2`, a joined row is added with null values in columns of `T2`. Thus, the
|
1172
|
+
joined table always has at least one row for each row in `T1`.
|
1173
|
+
|
1174
|
+
- **`right_join(other, *jexps)`:** First, an inner join is performed. Then, for
|
1175
|
+
each row in `T2` that does not satisfy the join condition with any row in
|
1176
|
+
`T1`, a joined row is added with null values in columns of `T1`. This is
|
1177
|
+
the converse of a left join: the result table will always have a row for
|
1178
|
+
each row in `T2`.
|
1179
|
+
|
1180
|
+
- **`full_join(other, *jexps)`:** First, an inner join is performed. Then, for
|
1181
|
+
each row in `T1` that does not satisfy the join condition with any row in
|
1182
|
+
`T2`, a joined row is added with null values in columns of `T2`. Also, for
|
1183
|
+
each row of `T2` that does not satisfy the join condition with any row in
|
1184
|
+
`T1`, a joined row with null values in the columns of `T1` is added.
|
1185
|
+
|
1186
|
+
- **`cross_join(other)`:** For every possible combination of rows from `T1` and
|
1187
|
+
`T2` (i.e., a Cartesian product), the joined table will contain a row
|
1188
|
+
consisting of all columns in `T1` followed by all columns in `T2`. If the
|
1189
|
+
tables have `N` and `M` rows respectively, the joined table will have `N *
|
1190
|
+
M` rows.
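
The four condition-based join types above can be modeled in a few lines of plain Ruby. This is only a sketch of the semantics, not `FatTable`'s implementation: the method `join_rows`, the lambda `on_emp`, and the use of hashes with distinct keys as rows are all assumptions made for illustration.

```ruby
# Hypothetical plain-Ruby model of the join types described above.
# Rows are hashes with distinct keys; the block is the join condition.
def join_rows(t1, t2, kind)
  nils1 = t1.first.keys.map { |k| [k, nil] }.to_h
  nils2 = t2.first.keys.map { |k| [k, nil] }.to_h
  out = []
  matched2 = []
  t1.each do |r1|
    hits = t2.select { |r2| yield(r1, r2) }
    matched2.concat(hits)
    # Every join type includes the rows that satisfy the condition.
    hits.each { |r2| out << r1.merge(r2) }
    # Left and full joins keep unmatched T1 rows, padded with nils.
    out << r1.merge(nils2) if hits.empty? && %i[left full].include?(kind)
  end
  if %i[right full].include?(kind)
    # Right and full joins keep unmatched T2 rows, padded with nils.
    t2.each { |r2| out << nils1.merge(r2) unless matched2.include?(r2) }
  end
  out
end

t1 = [{ id: 1, name: 'Paul' }, { id: 3, name: 'Teddy' }, { id: 2, name: 'Allen' }]
t2 = [{ dept: 'IT Billing', emp_id: 1 }, { dept: 'Engineering', emp_id: 2 },
      { dept: 'Finance', emp_id: 7 }]
on_emp = ->(r1, r2) { r1[:id] == r2[:emp_id] }

%i[inner left right full].each do |kind|
  puts "#{kind}: #{join_rows(t1, t2, kind, &on_emp).size} rows"
end
```

With this sample data the inner join keeps only the two matching pairs, the left and right joins each add one nil-padded row, and the full join adds both.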
|
1191
|
+
|
1192
|
+
2. Join Expressions
|
1193
|
+
|
1194
|
+
For each of the join types, if no join expressions are given, the tables will be
|
1195
|
+
joined on columns having the same column header in both tables, and the join
|
1196
|
+
condition is satisfied when all the values in those columns are equal. If the
|
1197
|
+
join type is an inner join, this is a so-called “natural” join.
|
1198
|
+
|
1199
|
+
If the join expressions are one or more symbols, the join condition requires
|
1200
|
+
that the values of both tables are equal for all columns named by the symbols. A
|
1201
|
+
column that appears in both tables can be given without modification and will be
|
1202
|
+
assumed to require equality on that column. If an unmodified symbol is not a
|
1203
|
+
name that appears in both tables, an exception will be raised. Column names that
|
1204
|
+
are unique to the first table must have a `_a` appended to the column name and
|
1205
|
+
column names that are unique to the other table must have a `_b` appended to the
|
1206
|
+
column name. These disambiguated column names must come in pairs, one for the
|
1207
|
+
first table and one for the second, and they will imply a join condition that
|
1208
|
+
the values in the paired columns must be equal. Several such symbol expressions will
|
1209
|
+
require that all such implied pairs are equal in order for the join condition to
|
1210
|
+
be met.
|
1211
|
+
|
1212
|
+
Finally, a join expression can be a string that contains an arbitrary Ruby
|
1213
|
+
expression that will be evaluated for truthiness. Within the string, *all*
|
1214
|
+
column names must be disambiguated with the `_a` or `_b` modifiers whether they
|
1215
|
+
are common to both tables or not. As with `select` and `where` methods, the
|
1216
|
+
names of the columns in both tables (albeit disambiguated) are available as
|
1217
|
+
local variables within the expression, but the instance variables `@row` and
|
1218
|
+
`@group` are not.
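
One way to picture how a string join expression sees the disambiguated column names as local variables is the following plain-Ruby sketch. It is an assumption about the general technique, not a description of `FatTable`'s evaluator: the method `join_condition_met?` is hypothetical and relies only on Ruby's standard `Binding#local_variable_set` and `Binding#eval`.

```ruby
# Hypothetical sketch: evaluate a join-expression string with every column
# bound as a local variable carrying an _a or _b suffix. Not FatTable's code.
def join_condition_met?(expr, row_a, row_b)
  bnd = binding
  row_a.each { |col, val| bnd.local_variable_set("#{col}_a".to_sym, val) }
  row_b.each { |col, val| bnd.local_variable_set("#{col}_b".to_sym, val) }
  # The expression is only tested for truthiness.
  bnd.eval(expr) ? true : false
end

r1 = { id: 1, name: 'Paul' }
r2 = { id: 1, dept: 'IT Billing', emp_id: 1 }
r3 = { id: 3, dept: 'Finance', emp_id: 7 }

puts join_condition_met?('id_a == emp_id_b', r1, r2)
puts join_condition_met?('id_a == emp_id_b', r1, r3)
```

Because the string is evaluated as Ruby code, only trusted expressions should be passed to anything built this way.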
|
1219
|
+
|
1220
|
+
3. Join Examples
|
1221
|
+
|
1222
|
+
The following examples are taken from the [PostgreSQL tutorial](https://www.tutorialspoint.com/postgresql/postgresql_using_joins.htm), with some slight
|
1223
|
+
modifications. The examples will use the following two tables, which are also
|
1224
|
+
available in `ft_console` as `@tab_a` and `@tab_b`:
|
1225
|
+
|
1226
|
+
require 'fat_table'
|
1227
|
+
|
1228
|
+
tab_a_str = <<-EOS
|
1229
|
+
| Id | Name | Age | Address | Salary | Join Date |
|
1230
|
+
|----|-------|-----|------------|--------|------------|
|
1231
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 |
|
1232
|
+
| 3 | Teddy | 23 | Norway | 20000 | 2007-12-13 |
|
1233
|
+
| 4 | Mark | 25 | Rich-Mond | 65000 | 2007-12-13 |
|
1234
|
+
| 5 | David | 27 | Texas | 85000 | 2007-12-13 |
|
1235
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 |
|
1236
|
+
| 8 | Paul | 24 | Houston | 20000 | 2005-07-13 |
|
1237
|
+
| 9 | James | 44 | Norway | 5000 | 2005-07-13 |
|
1238
|
+
| 10 | James | 45 | Texas | 5000 | |
|
1239
|
+
EOS
|
1240
|
+
|
1241
|
+
tab_b_str = <<-EOS
|
1242
|
+
| Id | Dept | Emp Id |
|
1243
|
+
|----|-------------|--------|
|
1244
|
+
| 1 | IT Billing | 1 |
|
1245
|
+
| 2 | Engineering | 2 |
|
1246
|
+
| 3 | Finance | 7 |
|
1247
|
+
EOS
|
1248
|
+
|
1249
|
+
tab_a = FatTable.from_org_string(tab_a_str)
|
1250
|
+
tab_b = FatTable.from_org_string(tab_b_str)
|
1251
|
+
|
1252
|
+
1. Inner Joins
|
1253
|
+
|
1254
|
+
With no join expression arguments, the tables are joined when their sole common
|
1255
|
+
field, `:id`, is equal in both tables. The result is the natural join of the
|
1256
|
+
two tables.
|
1257
|
+
|
1258
|
+
tab_a.join(tab_b).to_aoa
|
1259
|
+
|
1260
|
+
| Id | Name | Age | Address | Salary | Join Date | Dept | Emp Id |
|
1261
|
+
|----|-------|-----|------------|--------|------------|-------------|--------|
|
1262
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 | IT Billing | 1 |
|
1263
|
+
| 3 | Teddy | 23 | Norway | 20000 | 2007-12-13 | Finance | 7 |
|
1264
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 | Engineering | 2 |
|
1265
|
+
|
1266
|
+
But the natural join joined employee IDs in the first table and department IDs
|
1267
|
+
in the second table. To correct this, we need to explicitly state the columns we
|
1268
|
+
want to join on in each table by disambiguating them with `_a` and `_b`
|
1269
|
+
suffixes:
|
1270
|
+
|
1271
|
+
tab_a.join(tab_b, :id_a, :emp_id_b).to_aoa
|
1272
|
+
|
1273
|
+
| Id | Name | Age | Address | Salary | Join Date | Id B | Dept |
|
1274
|
+
|----|-------|-----|------------|--------|------------|------|-------------|
|
1275
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 | 1 | IT Billing |
|
1276
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 | 2 | Engineering |
|
1277
|
+
|
1278
|
+
Instead of using the disambiguated column names as symbols, we could also use a
|
1279
|
+
string containing a Ruby expression. Within the expression, the column names
|
1280
|
+
should be treated as local variables:
|
1281
|
+
|
1282
|
+
tab_a.join(tab_b, 'id_a == emp_id_b').to_aoa
|
1283
|
+
|
1284
|
+
| Id | Name | Age | Address | Salary | Join Date | Id B | Dept | Emp Id |
|
1285
|
+
|----|-------|-----|------------|--------|------------|------|-------------|--------|
|
1286
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 | 1 | IT Billing | 1 |
|
1287
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 | 2 | Engineering | 2 |
|
1288
|
+
|
1289
|
+
2. Left and Right Joins
|
1290
|
+
|
1291
|
+
In a left join, all the rows of `tab_a` are included in the output, augmented by
|
1292
|
+
the matching columns of `tab_b` and augmented with nils where there is no match:
|
1293
|
+
|
1294
|
+
tab_a.left_join(tab_b, 'id_a == emp_id_b').to_aoa
|
1295
|
+
|
1296
|
+
| Id | Name | Age | Address | Salary | Join Date | Id B | Dept | Emp Id |
|
1297
|
+
|----|-------|-----|------------|--------|------------|------|-------------|--------|
|
1298
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 | 1 | IT Billing | 1 |
|
1299
|
+
| 3 | Teddy | 23 | Norway | 20000 | 2007-12-13 | | | |
|
1300
|
+
| 4 | Mark | 25 | Rich-Mond | 65000 | 2007-12-13 | | | |
|
1301
|
+
| 5 | David | 27 | Texas | 85000 | 2007-12-13 | | | |
|
1302
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 | 2 | Engineering | 2 |
|
1303
|
+
| 8 | Paul | 24 | Houston | 20000 | 2005-07-13 | | | |
|
1304
|
+
| 9 | James | 44 | Norway | 5000 | 2005-07-13 | | | |
|
1305
|
+
| 10 | James | 45 | Texas | 5000 | | | | |
|
1306
|
+
|
1307
|
+
In a right join, all the rows of `tab_b` are included in the output, augmented
|
1308
|
+
by the matching columns of `tab_a` and augmented with nils where there is no
|
1309
|
+
match:
|
1310
|
+
|
1311
|
+
tab_a.right_join(tab_b, 'id_a == emp_id_b').to_aoa
|
1312
|
+
|
1313
|
+
| Id | Name | Age | Address | Salary | Join Date | Id B | Dept | Emp Id |
|
1314
|
+
|----|-------|-----|------------|--------|------------|------|-------------|--------|
|
1315
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 | 1 | IT Billing | 1 |
|
1316
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 | 2 | Engineering | 2 |
|
1317
|
+
| | | | | | | 3 | Finance | 7 |
|
1318
|
+
|
1319
|
+
3. Full Join
|
1320
|
+
|
1321
|
+
A full join combines the effects of a left join and a right join. All the rows
|
1322
|
+
from both tables are included in the output augmented by columns of the other
|
1323
|
+
table where the join expression is satisfied and augmented with nils otherwise.
|
1324
|
+
|
1325
|
+
tab_a.full_join(tab_b, 'id_a == emp_id_b').to_aoa
|
1326
|
+
|
1327
|
+
| Id | Name | Age | Address | Salary | Join Date | Id B | Dept | Emp Id |
|
1328
|
+
|----|-------|-----|------------|--------|------------|------|-------------|--------|
|
1329
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 | 1 | IT Billing | 1 |
|
1330
|
+
| 3 | Teddy | 23 | Norway | 20000 | 2007-12-13 | | | |
|
1331
|
+
| 4 | Mark | 25 | Rich-Mond | 65000 | 2007-12-13 | | | |
|
1332
|
+
| 5 | David | 27 | Texas | 85000 | 2007-12-13 | | | |
|
1333
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 | 2 | Engineering | 2 |
|
1334
|
+
| 8 | Paul | 24 | Houston | 20000 | 2005-07-13 | | | |
|
1335
|
+
| 9 | James | 44 | Norway | 5000 | 2005-07-13 | | | |
|
1336
|
+
| 10 | James | 45 | Texas | 5000 | | | | |
|
1337
|
+
| | | | | | | 3 | Finance | 7 |
|
1338
|
+
|
1339
|
+
4. Cross Join
|
1340
|
+
|
1341
|
+
Finally, a cross join outputs every row of `tab_a` augmented with every row of
|
1342
|
+
`tab_b`, in other words, the Cartesian product of the two tables. If `tab_a` has
|
1343
|
+
`N` rows and `tab_b` has `M` rows, the output table will have `N * M` rows.
|
1344
|
+
|
1345
|
+
tab_a.cross_join(tab_b).to_aoa
|
1346
|
+
|
1347
|
+
| Id | Name | Age | Address | Salary | Join Date | Id B | Dept | Emp Id |
|
1348
|
+
|----|-------|-----|------------|--------|------------|------|-------------|--------|
|
1349
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 | 1 | IT Billing | 1 |
|
1350
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 | 2 | Engineering | 2 |
|
1351
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 | 3 | Finance | 7 |
|
1352
|
+
| 3 | Teddy | 23 | Norway | 20000 | 2007-12-13 | 1 | IT Billing | 1 |
|
1353
|
+
| 3 | Teddy | 23 | Norway | 20000 | 2007-12-13 | 2 | Engineering | 2 |
|
1354
|
+
| 3 | Teddy | 23 | Norway | 20000 | 2007-12-13 | 3 | Finance | 7 |
|
1355
|
+
| 4 | Mark | 25 | Rich-Mond | 65000 | 2007-12-13 | 1 | IT Billing | 1 |
|
1356
|
+
| 4 | Mark | 25 | Rich-Mond | 65000 | 2007-12-13 | 2 | Engineering | 2 |
|
1357
|
+
| 4 | Mark | 25 | Rich-Mond | 65000 | 2007-12-13 | 3 | Finance | 7 |
|
1358
|
+
| 5 | David | 27 | Texas | 85000 | 2007-12-13 | 1 | IT Billing | 1 |
|
1359
|
+
| 5 | David | 27 | Texas | 85000 | 2007-12-13 | 2 | Engineering | 2 |
|
1360
|
+
| 5 | David | 27 | Texas | 85000 | 2007-12-13 | 3 | Finance | 7 |
|
1361
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 | 1 | IT Billing | 1 |
|
1362
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 | 2 | Engineering | 2 |
|
1363
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 | 3 | Finance | 7 |
|
1364
|
+
| 8 | Paul | 24 | Houston | 20000 | 2005-07-13 | 1 | IT Billing | 1 |
|
1365
|
+
| 8 | Paul | 24 | Houston | 20000 | 2005-07-13 | 2 | Engineering | 2 |
|
1366
|
+
| 8 | Paul | 24 | Houston | 20000 | 2005-07-13 | 3 | Finance | 7 |
|
1367
|
+
| 9 | James | 44 | Norway | 5000 | 2005-07-13 | 1 | IT Billing | 1 |
|
1368
|
+
| 9 | James | 44 | Norway | 5000 | 2005-07-13 | 2 | Engineering | 2 |
|
1369
|
+
| 9 | James | 44 | Norway | 5000 | 2005-07-13 | 3 | Finance | 7 |
|
1370
|
+
| 10 | James | 45 | Texas | 5000 | | 1 | IT Billing | 1 |
|
1371
|
+
| 10 | James | 45 | Texas | 5000 | | 2 | Engineering | 2 |
|
1372
|
+
| 10 | James | 45 | Texas | 5000 | | 3 | Finance | 7 |
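
The row count and row order of the cross join above can be checked with Ruby's built-in `Array#product`. This is a plain-Ruby illustration of a Cartesian product using only the `Id` column of `tab_a` and the `Dept` column of `tab_b`; it does not call `FatTable`, and the variable names are chosen here for illustration.

```ruby
# Plain-Ruby illustration of a Cartesian product; not FatTable's code.
ids_a   = [1, 3, 4, 5, 2, 8, 9, 10]                # Id column of tab_a
depts_b = ['IT Billing', 'Engineering', 'Finance'] # Dept column of tab_b

# Each element of ids_a is paired with every element of depts_b in turn.
pairs = ids_a.product(depts_b)

puts pairs.size   # N * M = 8 * 3
p pairs.first(3)  # the first row of tab_a paired with each row of tab_b
```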
|
1373
|
+
|
1374
|
+
|
1375
|
+
<a id="org7d2857d"></a>
|
1376
|
+
|
1377
|
+
### Set Operations
|
1378
|
+
|
1379
|
+
`FatTable` can perform several set operations on tables. In order for two tables
|
1380
|
+
to be used this way, they must have the same number of columns with the same
|
1381
|
+
types or an exception will be raised. We’ll call two tables that qualify for
|
1382
|
+
combining with set operations “set-compatible.”
|
1383
|
+
|
1384
|
+
We’ll use the following two set-compatible tables in the examples. They each
|
1385
|
+
have some duplicates and some group boundaries so you can see the effect of the
|
1386
|
+
set operations on duplicates and groups.
|
1387
|
+
|
1388
|
+
tab1.to_aoa
|
1389
|
+
|
1390
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1391
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1392
|
+
| T001 | 2016-11-01 | P | 7.7 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1393
|
+
| T002 | 2016-11-01 | P | 7.75 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1394
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1395
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1396
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1397
|
+
| T004 | 2016-11-01 | S | 7.55 | T | F | 6811 | 966 | 5845 | 0.2453 | 0.1924 |
|
1398
|
+
| T005 | 2016-11-01 | S | 7.5 | F | F | 4000 | 572 | 3428 | 0.2453 | 0.1924 |
|
1399
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1400
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1401
|
+
| T007 | 2016-11-01 | S | 7.65 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1402
|
+
| T008 | 2016-11-01 | P | 7.65 | F | F | 2771 | 393 | 2378 | 0.2453 | 0.1924 |
|
1403
|
+
| T009 | 2016-11-01 | P | 7.6 | F | F | 9550 | 1363 | 8187 | 0.2453 | 0.1924 |
|
1404
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1405
|
+
| T010 | 2016-11-01 | P | 7.55 | F | T | 3175 | 451 | 2724 | 0.2453 | 0.1924 |
|
1406
|
+
| T011 | 2016-11-02 | P | 7.425 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1407
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1408
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1409
|
+
| T013 | 2016-11-02 | P | 7.35 | T | T | 53100 | 7656 | 45444 | 0.2453 | 0.1924 |
|
1410
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1411
|
+
| T014 | 2016-11-02 | P | 7.45 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
1412
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1413
|
+
| T016 | 2016-11-02 | P | 8.25 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1414
|
+
|
1415
|
+
tab2.to_aoa
|
1416
|
+
|
1417
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1418
|
+
|------|------------|------|-------|-----|------|--------|-------|------|--------|--------|
|
1419
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1420
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1421
|
+
| T017 | 2016-11-01 | P | 8.3 | F | T | 1801 | 1201 | 600 | 0.2453 | 0.1924 |
|
1422
|
+
|------|------------|------|-------|-----|------|--------|-------|------|--------|--------|
|
1423
|
+
| T018 | 2016-11-01 | S | 7.152 | T | F | 2516 | 2400 | 116 | 0.2453 | 0.1924 |
|
1424
|
+
| T018 | 2016-11-01 | S | 7.152 | T | F | 2516 | 2400 | 116 | 0.2453 | 0.1924 |
|
1425
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1426
|
+
| T007 | 2016-11-01 | S | 7.65 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1427
|
+
|------|------------|------|-------|-----|------|--------|-------|------|--------|--------|
|
1428
|
+
| T014 | 2016-11-02 | P | 7.45 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
1429
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1430
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1431
|
+
| T016 | 2016-11-02 | P | 8.25 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1432
|
+
|------|------------|------|-------|-----|------|--------|-------|------|--------|--------|
|
1433
|
+
| T019 | 2017-01-15 | S | 8.75 | T | F | 300 | 175 | 125 | 0.2453 | 0.1924 |
|
1434
|
+
| T020 | 2017-01-19 | S | 8.25 | F | T | 700 | 615 | 85 | 0.2453 | 0.1924 |
|
1435
|
+
| T021 | 2017-01-23 | P | 7.16 | T | T | 12100 | 11050 | 1050 | 0.2453 | 0.1924 |
|
1436
|
+
| T021 | 2017-01-23 | P | 7.16 | T | T | 12100 | 11050 | 1050 | 0.2453 | 0.1924 |
|
1437
|
+
|
1438
|
+
1. Unions
|
1439
|
+
|
1440
|
+
Two tables that are set-compatible can be combined with the `union` or
|
1441
|
+
`union_all` methods so that the rows of both tables appear in the output. In the
|
1442
|
+
output table, the headers of the receiver table are used. You can use `select`
|
1443
|
+
to change or re-order the headers if you prefer. The `union` method eliminates
|
1444
|
+
duplicate rows in the result table; the `union_all` method does not.
|
1445
|
+
|
1446
|
+
Any group boundaries in the input tables are destroyed by `union` but are
|
1447
|
+
preserved by `union_all`. In addition, `union_all` (but not `union`) adds a
|
1448
|
+
group boundary between the rows of the two input tables.
|
1449
|
+
|
1450
|
+
tab1.union(tab2).to_aoa
|
1451
|
+
|
1452
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1453
|
+
|------|------------|------|-------|-----|------|--------|-------|-------|--------|--------|
|
1454
|
+
| T001 | 2016-11-01 | P | 7.7 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1455
|
+
| T002 | 2016-11-01 | P | 7.75 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1456
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1457
|
+
| T004 | 2016-11-01 | S | 7.55 | T | F | 6811 | 966 | 5845 | 0.2453 | 0.1924 |
|
1458
|
+
| T005 | 2016-11-01 | S | 7.5 | F | F | 4000 | 572 | 3428 | 0.2453 | 0.1924 |
|
1459
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1460
|
+
| T007 | 2016-11-01 | S | 7.65 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1461
|
+
| T008 | 2016-11-01 | P | 7.65 | F | F | 2771 | 393 | 2378 | 0.2453 | 0.1924 |
|
1462
|
+
| T009 | 2016-11-01 | P | 7.6 | F | F | 9550 | 1363 | 8187 | 0.2453 | 0.1924 |
|
1463
|
+
| T010 | 2016-11-01 | P | 7.55 | F | T | 3175 | 451 | 2724 | 0.2453 | 0.1924 |
|
1464
|
+
| T011 | 2016-11-02 | P | 7.425 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1465
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1466
|
+
| T013 | 2016-11-02 | P | 7.35 | T | T | 53100 | 7656 | 45444 | 0.2453 | 0.1924 |
|
1467
|
+
| T014 | 2016-11-02 | P | 7.45 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
1468
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1469
|
+
| T016 | 2016-11-02 | P | 8.25 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1470
|
+
| T017 | 2016-11-01 | P | 8.3 | F | T | 1801 | 1201 | 600 | 0.2453 | 0.1924 |
|
1471
|
+
| T018 | 2016-11-01 | S | 7.152 | T | F | 2516 | 2400 | 116 | 0.2453 | 0.1924 |
|
1472
|
+
| T019 | 2017-01-15 | S | 8.75 | T | F | 300 | 175 | 125 | 0.2453 | 0.1924 |
|
1473
|
+
| T020 | 2017-01-19 | S | 8.25 | F | T | 700 | 615 | 85 | 0.2453 | 0.1924 |
|
1474
|
+
| T021 | 2017-01-23 | P | 7.16 | T | T | 12100 | 11050 | 1050 | 0.2453 | 0.1924 |
|
1475
|
+
|
1476
|
+
tab1.union_all(tab2).to_aoa
|
1477
|
+
|
1478
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1479
|
+
|------|------------|------|-------|-----|------|--------|-------|-------|--------|--------|
|
1480
|
+
| T001 | 2016-11-01 | P | 7.7 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1481
|
+
| T002 | 2016-11-01 | P | 7.75 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1482
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1483
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1484
|
+
|------|------------|------|-------|-----|------|--------|-------|-------|--------|--------|
|
1485
|
+
| T004 | 2016-11-01 | S | 7.55 | T | F | 6811 | 966 | 5845 | 0.2453 | 0.1924 |
|
1486
|
+
| T005 | 2016-11-01 | S | 7.5 | F | F | 4000 | 572 | 3428 | 0.2453 | 0.1924 |
|
1487
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1488
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1489
|
+
| T007 | 2016-11-01 | S | 7.65 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1490
|
+
| T008 | 2016-11-01 | P | 7.65 | F | F | 2771 | 393 | 2378 | 0.2453 | 0.1924 |
|
1491
|
+
| T009 | 2016-11-01 | P | 7.6 | F | F | 9550 | 1363 | 8187 | 0.2453 | 0.1924 |
|
1492
|
+
|------|------------|------|-------|-----|------|--------|-------|-------|--------|--------|
|
1493
|
+
| T010 | 2016-11-01 | P | 7.55 | F | T | 3175 | 451 | 2724 | 0.2453 | 0.1924 |
|
1494
|
+
| T011 | 2016-11-02 | P | 7.425 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1495
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1496
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1497
|
+
| T013 | 2016-11-02 | P | 7.35 | T | T | 53100 | 7656 | 45444 | 0.2453 | 0.1924 |
|
1498
|
+
|------|------------|------|-------|-----|------|--------|-------|-------|--------|--------|
|
1499
|
+
| T014 | 2016-11-02 | P | 7.45 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
1500
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1501
|
+
| T016 | 2016-11-02 | P | 8.25 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1502
|
+
|------|------------|------|-------|-----|------|--------|-------|-------|--------|--------|
|
1503
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1504
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1505
|
+
| T017 | 2016-11-01 | P | 8.3 | F | T | 1801 | 1201 | 600 | 0.2453 | 0.1924 |
|
1506
|
+
|------|------------|------|-------|-----|------|--------|-------|-------|--------|--------|
|
1507
|
+
| T018 | 2016-11-01 | S | 7.152 | T | F | 2516 | 2400 | 116 | 0.2453 | 0.1924 |
|
1508
|
+
| T018 | 2016-11-01 | S | 7.152 | T | F | 2516 | 2400 | 116 | 0.2453 | 0.1924 |
|
1509
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1510
|
+
| T007 | 2016-11-01 | S | 7.65 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1511
|
+
|------|------------|------|-------|-----|------|--------|-------|-------|--------|--------|
|
1512
|
+
| T014 | 2016-11-02 | P | 7.45 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
1513
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1514
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1515
|
+
| T016 | 2016-11-02 | P | 8.25 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1516
|
+
|------|------------|------|-------|-----|------|--------|-------|-------|--------|--------|
|
1517
|
+
| T019 | 2017-01-15 | S | 8.75 | T | F | 300 | 175 | 125 | 0.2453 | 0.1924 |
|
1518
|
+
| T020 | 2017-01-19 | S | 8.25 | F | T | 700 | 615 | 85 | 0.2453 | 0.1924 |
|
1519
|
+
| T021 | 2017-01-23 | P | 7.16 | T | T | 12100 | 11050 | 1050 | 0.2453 | 0.1924 |
|
1520
|
+
| T021 | 2017-01-23 | P | 7.16 | T | T | 12100 | 11050 | 1050 | 0.2453 | 0.1924 |
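
The difference in how `union` and `union_all` treat duplicates resembles the difference between Ruby's `Array#|` and `Array#+`. The sketch below is only an analogy on a handful of `Ref` values, with variable names invented here; it does not call `FatTable` and does not model group boundaries.

```ruby
# Plain-Ruby analogy for union vs. union_all; not FatTable's code.
refs1 = %w[T001 T003 T003 T014]
refs2 = %w[T003 T014 T017 T017]

union_like     = refs1 | refs2  # duplicates removed, first-seen order kept
union_all_like = refs1 + refs2  # simple concatenation, duplicates kept

p union_like      # => ["T001", "T003", "T014", "T017"]
p union_all_like.size
```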
|
1521
|
+
|
1522
|
+
2. Intersections
|
1523
|
+
|
1524
|
+
The `intersect` method returns a table having only rows common to both tables,
|
1525
|
+
eliminating any duplicate rows in the result.
|
1526
|
+
|
1527
|
+
tab1.intersect(tab2).to_aoa
|
1528
|
+
|
1529
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1530
|
+
|------|------------|------|-------|-----|------|--------|-----|------|--------|--------|
|
1531
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1532
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1533
|
+
| T007 | 2016-11-01 | S | 7.65 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1534
|
+
| T014 | 2016-11-02 | P | 7.45 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
1535
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1536
|
+
| T016 | 2016-11-02 | P | 8.25 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1537
|
+
|
1538
|
+
With `intersect_all`, all the rows of the first table, including duplicates, are
|
1539
|
+
included in the result if they also occur in the second table. However,
|
1540
|
+
duplicates in the second table do not appear.
|
1541
|
+
|
1542
|
+
tab1.intersect_all(tab2).to_aoa
|
1543
|
+
|
1544
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1545
|
+
|------|------------|------|-------|-----|------|--------|-----|------|--------|--------|
|
1546
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1547
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1548
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1549
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1550
|
+
| T007 | 2016-11-01 | S | 7.65 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1551
|
+
| T014 | 2016-11-02 | P | 7.45 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
1552
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1553
|
+
| T016 | 2016-11-02 | P | 8.25 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1554
|
+
|
1555
|
+
As a result, it makes a difference which table is the receiver of the
|
1556
|
+
`intersect_all` method call and which is the argument. In other words, order of
|
1557
|
+
operands matters.
|
1558
|
+
|
1559
|
+
tab2.intersect_all(tab1).to_aoa
|
1560
|
+
|
1561
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1562
|
+
|------|------------|------|-------|-----|------|--------|-----|------|--------|--------|
|
1563
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1564
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1565
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1566
|
+
| T007 | 2016-11-01 | S | 7.65 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1567
|
+
| T014 | 2016-11-02 | P | 7.45 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
1568
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1569
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1570
|
+
| T016 | 2016-11-02 | P | 8.25 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
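
The asymmetry of `intersect_all` can be reproduced on plain arrays. The sketch below is a model of the behavior described above, not `FatTable`'s implementation: it uses a subset of the `Ref` values from the two example tables, and the variable names are invented for illustration.

```ruby
# Plain-Ruby model of intersect vs. intersect_all; not FatTable's code.
refs1 = %w[T003 T003 T006 T006 T007 T015]
refs2 = %w[T003 T003 T006 T007 T015 T015 T018]

# Like intersect: common elements, duplicates eliminated.
common = refs1 & refs2

# Like intersect_all: keep every receiver row that occurs in the argument,
# so only the receiver's duplicates survive.
all_1_2 = refs1.select { |r| refs2.include?(r) }
all_2_1 = refs2.select { |r| refs1.include?(r) }

p common   # => ["T003", "T006", "T007", "T015"]
p all_1_2  # T006 appears twice, T015 once
p all_2_1  # T006 appears once, T015 twice
```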
|
1571
|
+
|
1572
|
+
3. Differences with Except
|
1573
|
+
|
1574
|
+
You can use the `except` method to delete from a table any rows that occur in
|
1575
|
+
another table, that is, compute the set difference between the tables.
|
1576
|
+
|
1577
|
+
tab1.except(tab2).to_aoa
|
1578
|
+
|
1579
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1580
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1581
|
+
| T001 | 2016-11-01 | P | 7.7 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1582
|
+
| T002 | 2016-11-01 | P | 7.75 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1583
|
+
| T004 | 2016-11-01 | S | 7.55 | T | F | 6811 | 966 | 5845 | 0.2453 | 0.1924 |
|
1584
|
+
| T005 | 2016-11-01 | S | 7.5 | F | F | 4000 | 572 | 3428 | 0.2453 | 0.1924 |
|
1585
|
+
| T008 | 2016-11-01 | P | 7.65 | F | F | 2771 | 393 | 2378 | 0.2453 | 0.1924 |
|
1586
|
+
| T009 | 2016-11-01 | P | 7.6 | F | F | 9550 | 1363 | 8187 | 0.2453 | 0.1924 |
|
1587
|
+
| T010 | 2016-11-01 | P | 7.55 | F | T | 3175 | 451 | 2724 | 0.2453 | 0.1924 |
|
1588
|
+
| T011 | 2016-11-02 | P | 7.425 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1589
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1590
|
+
| T013 | 2016-11-02 | P | 7.35 | T | T | 53100 | 7656 | 45444 | 0.2453 | 0.1924 |
|
1591
|
+
|
1592
|
+
Like subtraction, though, the order of operands matters with set difference
|
1593
|
+
computed by `except`.
|
1594
|
+
|
1595
|
+
tab2.except(tab1).to_aoa
|
1596
|
+
|
1597
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1598
|
+
|------|------------|------|-------|-----|------|--------|-------|------|--------|--------|
|
1599
|
+
| T017 | 2016-11-01 | P | 8.3 | F | T | 1801 | 1201 | 600 | 0.2453 | 0.1924 |
|
1600
|
+
| T018 | 2016-11-01 | S | 7.152 | T | F | 2516 | 2400 | 116 | 0.2453 | 0.1924 |
|
1601
|
+
| T019 | 2017-01-15 | S | 8.75 | T | F | 300 | 175 | 125 | 0.2453 | 0.1924 |
|
1602
|
+
| T020 | 2017-01-19 | S | 8.25 | F | T | 700 | 615 | 85 | 0.2453 | 0.1924 |
|
1603
|
+
| T021 | 2017-01-23 | P | 7.16 | T | T | 12100 | 11050 | 1050 | 0.2453 | 0.1924 |
|
1604
|
+
|
1605
|
+
As with `intersect_all`, `except_all` includes any duplicates in the first,
|
1606
|
+
receiver table, but not those in the second, argument table.
|
1607
|
+
|
1608
|
+
tab1.except_all(tab2).to_aoa
|
1609
|
+
|
1610
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1611
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1612
|
+
| T001 | 2016-11-01 | P | 7.7 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1613
|
+
| T002 | 2016-11-01 | P | 7.75 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1614
|
+
| T004 | 2016-11-01 | S | 7.55 | T | F | 6811 | 966 | 5845 | 0.2453 | 0.1924 |
|
1615
|
+
| T005 | 2016-11-01 | S | 7.5 | F | F | 4000 | 572 | 3428 | 0.2453 | 0.1924 |
|
1616
|
+
| T008 | 2016-11-01 | P | 7.65 | F | F | 2771 | 393 | 2378 | 0.2453 | 0.1924 |
|
1617
|
+
| T009 | 2016-11-01 | P | 7.6 | F | F | 9550 | 1363 | 8187 | 0.2453 | 0.1924 |
|
1618
|
+
| T010 | 2016-11-01 | P | 7.55 | F | T | 3175 | 451 | 2724 | 0.2453 | 0.1924 |
|
1619
|
+
| T011 | 2016-11-02 | P | 7.425 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1620
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1621
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1622
|
+
| T013 | 2016-11-02 | P | 7.35 | T | T | 53100 | 7656 | 45444 | 0.2453 | 0.1924 |
|
1623
|
+
|
1624
|
+
And, of course, the order of operands matters here as well.
|
1625
|
+
|
1626
|
+
tab2.except_all(tab1).to_aoa
|
1627
|
+
|
1628
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1629
|
+
|------|------------|------|-------|-----|------|--------|-------|------|--------|--------|
|
1630
|
+
| T017 | 2016-11-01 | P | 8.3 | F | T | 1801 | 1201 | 600 | 0.2453 | 0.1924 |
|
1631
|
+
| T018 | 2016-11-01 | S | 7.152 | T | F | 2516 | 2400 | 116 | 0.2453 | 0.1924 |
|
1632
|
+
| T018 | 2016-11-01 | S | 7.152 | T | F | 2516 | 2400 | 116 | 0.2453 | 0.1924 |
|
1633
|
+
| T019 | 2017-01-15 | S | 8.75 | T | F | 300 | 175 | 125 | 0.2453 | 0.1924 |
|
1634
|
+
| T020 | 2017-01-19 | S | 8.25 | F | T | 700 | 615 | 85 | 0.2453 | 0.1924 |
|
1635
|
+
| T021 | 2017-01-23 | P | 7.16 | T | T | 12100 | 11050 | 1050 | 0.2453 | 0.1924 |
|
1636
|
+
| T021 | 2017-01-23 | P | 7.16 | T | T | 12100 | 11050 | 1050 | 0.2453 | 0.1924 |
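The difference between `except` and `except_all` can be modeled with plain Ruby arrays. This is only a sketch of the behavior described above, not the gem's implementation; the helper names and the rows are made up for illustration and are not the tables used in this README.

```ruby
# Rows modeled as plain arrays (hypothetical sample data).
SKETCH_RECEIVER = [
  ['T001', 100],
  ['T003', 800],
  ['T003', 800],  # duplicate that also appears in the argument
  ['T012', 4700],
  ['T012', 4700]  # duplicate that appears only in the receiver
].freeze

SKETCH_ARGUMENT = [
  ['T003', 800],
  ['T017', 1801]
].freeze

# Keep every receiver row that is absent from the argument, duplicates included.
def sketch_except_all(receiver, argument)
  receiver - argument
end

# Same difference, but with duplicate rows collapsed.
def sketch_except(receiver, argument)
  sketch_except_all(receiver, argument).uniq
end

p sketch_except_all(SKETCH_RECEIVER, SKETCH_ARGUMENT)
p sketch_except(SKETCH_RECEIVER, SKETCH_ARGUMENT)
p sketch_except(SKETCH_ARGUMENT, SKETCH_RECEIVER)
```

Swapping the operands in the last call yields only the row unique to the argument, which is the order-dependence the text describes.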
|
1637
|
+
|
1638
|
+
|
1639
|
+
<a id="org073a8b5"></a>
|
1640
|
+
|
1641
|
+
### Uniq (aka Distinct)
|
1642
|
+
|
1643
|
+
The `uniq` method takes no arguments and simply removes any duplicate rows from
|
1644
|
+
the input table. The `distinct` method is an alias for `uniq`. Any groups in
|
1645
|
+
the input table are lost.
|
1646
|
+
|
1647
|
+
tab1.uniq.to_aoa
|
1648
|
+
|
1649
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1650
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1651
|
+
| T001 | 2016-11-01 | P | 7.7 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1652
|
+
| T002 | 2016-11-01 | P | 7.75 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1653
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1654
|
+
| T004 | 2016-11-01 | S | 7.55 | T | F | 6811 | 966 | 5845 | 0.2453 | 0.1924 |
|
1655
|
+
| T005 | 2016-11-01 | S | 7.5 | F | F | 4000 | 572 | 3428 | 0.2453 | 0.1924 |
|
1656
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1657
|
+
| T007 | 2016-11-01 | S | 7.65 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1658
|
+
| T008 | 2016-11-01 | P | 7.65 | F | F | 2771 | 393 | 2378 | 0.2453 | 0.1924 |
|
1659
|
+
| T009 | 2016-11-01 | P | 7.6 | F | F | 9550 | 1363 | 8187 | 0.2453 | 0.1924 |
|
1660
|
+
| T010 | 2016-11-01 | P | 7.55 | F | T | 3175 | 451 | 2724 | 0.2453 | 0.1924 |
|
1661
|
+
| T011 | 2016-11-02 | P | 7.425 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1662
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1663
|
+
| T013 | 2016-11-02 | P | 7.35 | T | T | 53100 | 7656 | 45444 | 0.2453 | 0.1924 |
|
1664
|
+
| T014 | 2016-11-02 | P | 7.45 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
1665
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1666
|
+
| T016 | 2016-11-02 | P | 8.25 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1667
|
+
|
1668
|
+
|
1669
|
+
<a id="orgd147303"></a>
|
1670
|
+
|
1671
|
+
### Remove groups with degroup!
|
1672
|
+
|
1673
|
+
Finally, it is sometimes helpful to remove any group boundaries from a table.
|
1674
|
+
You can do this with `.degroup!`, which is the only operation that mutates its
|
1675
|
+
receiver table by removing its groups.
|
1676
|
+
|
1677
|
+
tab1.degroup!.to_aoa
|
1678
|
+
|
1679
|
+
| Ref | Date | Code | Price | G10 | QP10 | Shares | Lp | Qp | Iplp | Ipqp |
|
1680
|
+
|------|------------|------|-------|-----|------|--------|------|-------|--------|--------|
|
1681
|
+
| T001 | 2016-11-01 | P | 7.7 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1682
|
+
| T002 | 2016-11-01 | P | 7.75 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1683
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1684
|
+
| T003 | 2016-11-01 | P | 7.5 | F | T | 800 | 112 | 688 | 0.2453 | 0.1924 |
|
1685
|
+
| T004 | 2016-11-01 | S | 7.55 | T | F | 6811 | 966 | 5845 | 0.2453 | 0.1924 |
|
1686
|
+
| T005 | 2016-11-01 | S | 7.5 | F | F | 4000 | 572 | 3428 | 0.2453 | 0.1924 |
|
1687
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1688
|
+
| T006 | 2016-11-01 | S | 7.6 | F | T | 1000 | 143 | 857 | 0.2453 | 0.1924 |
|
1689
|
+
| T007 | 2016-11-01 | S | 7.65 | T | F | 200 | 28 | 172 | 0.2453 | 0.1924 |
|
1690
|
+
| T008 | 2016-11-01 | P | 7.65 | F | F | 2771 | 393 | 2378 | 0.2453 | 0.1924 |
|
1691
|
+
| T009 | 2016-11-01 | P | 7.6 | F | F | 9550 | 1363 | 8187 | 0.2453 | 0.1924 |
|
1692
|
+
| T010 | 2016-11-01 | P | 7.55 | F | T | 3175 | 451 | 2724 | 0.2453 | 0.1924 |
|
1693
|
+
| T011 | 2016-11-02 | P | 7.425 | T | F | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1694
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1695
|
+
| T012 | 2016-11-02 | P | 7.55 | F | F | 4700 | 677 | 4023 | 0.2453 | 0.1924 |
|
1696
|
+
| T013 | 2016-11-02 | P | 7.35 | T | T | 53100 | 7656 | 45444 | 0.2453 | 0.1924 |
|
1697
|
+
| T014 | 2016-11-02 | P | 7.45 | F | T | 5847 | 835 | 5012 | 0.2453 | 0.1924 |
|
1698
|
+
| T015 | 2016-11-02 | P | 7.75 | F | F | 500 | 72 | 428 | 0.2453 | 0.1924 |
|
1699
|
+
| T016 | 2016-11-02 | P | 8.25 | T | T | 100 | 14 | 86 | 0.2453 | 0.1924 |
|
1700
|
+
|
1701
|
+
|
1702
|
+
<a id="org9f4d633"></a>
|
1703
|
+
|
1704
|
+
## Formatting Tables
|
1705
|
+
|
1706
|
+
Besides creating and operating on tables, you may want to display the resulting
|
1707
|
+
table. `FatTable` seeks to provide a set of formatting directives that are the
|
1708
|
+
most common across many output media. It provides directives for alignment, for
|
1709
|
+
color, for adding currency symbols and grouping commas to numbers, for padding
|
1710
|
+
numbers, and for formatting dates and booleans.
|
1711
|
+
|
1712
|
+
In addition, you can add any number of footers to a table, which appear at the
|
1713
|
+
end of the table, and any number of group footers, which appear after each group
|
1714
|
+
in the table. These can be formatted independently of the table body.
|
1715
|
+
|
1716
|
+
If the target output medium does not support a formatting directive or the
|
1717
|
+
directive does not make sense, it is simply ignored. For example, you can output
|
1718
|
+
an `org-mode` table as a String, and since `org-mode` does not support colors,
|
1719
|
+
any color directives are ignored. Some of the output targets are not strings,
|
1720
|
+
but ruby data structures, and for them, things such as alignment are irrelevant.
|
1721
|
+
|
1722
|
+
|
1723
|
+
<a id="orgb7b2335"></a>
|
1724
|
+
|
1725
|
+
### Available Formatters
|
1726
|
+
|
1727
|
+
`FatTable` supports the following output targets for its tables:
|
1728
|
+
|
1729
|
+
- **Text:** form the table with ASCII characters,
|
1730
|
+
- **Org:** form the table with ASCII characters but in the form used by Emacs
|
1731
|
+
org-mode for constructing tables,
|
1732
|
+
- **Term:** form the table with ANSI terminal codes and unicode characters,
|
1733
|
+
possibly including colored text and cell backgrounds,
|
1734
|
+
- **LaTeX:** form the table as input for LaTeX’s longtable environment,
|
1735
|
+
- **Aoh:** output the table as a ruby data structure, building the table as an
|
1736
|
+
array of hashes, and
|
1737
|
+
- **Aoa:** output the table as a ruby data structure, building the table as an
|
1738
|
+
array of arrays.
|
1739
|
+
|
1740
|
+
These are all implemented by classes that inherit from the `FatTable::Formatter`
|
1741
|
+
class by defining about a dozen methods that get called at various places during
|
1742
|
+
the construction of the output table. The idea is that more output targets can be
|
1743
|
+
defined by adding additional classes.
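The subclassing idea can be sketched in plain Ruby. The class and hook names below (`MiniFormatter`, `PipeFormatter`, `header_line`, `rule_line`, `row_line`) are invented for this illustration and are not the hooks that `FatTable::Formatter` actually defines; the point is only that the base class drives the traversal and subclasses override small hooks.

```ruby
# Hypothetical base class: walks the table and calls overridable hooks.
class MiniFormatter
  def initialize(headers, rows)
    @headers = headers
    @rows = rows
  end

  # Build the whole output by calling the hooks in a fixed order.
  def output
    out = +''
    out << header_line(@headers)
    out << rule_line(@headers)
    @rows.each { |row| out << row_line(row) }
    out
  end

  private

  # Default hooks: space-separated cells and no rule under the header.
  def header_line(cells)
    row_line(cells)
  end

  def rule_line(_cells)
    ''
  end

  def row_line(cells)
    cells.join(' ') + "\n"
  end
end

# A new output target only overrides the hooks it cares about.
class PipeFormatter < MiniFormatter
  private

  def rule_line(cells)
    '|' + cells.map { |c| '-' * (c.to_s.length + 2) }.join('+') + "|\n"
  end

  def row_line(cells)
    '| ' + cells.join(' | ') + " |\n"
  end
end

puts PipeFormatter.new(%w[Id Name], [[1, 'Paul']]).output
```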
|
1744
|
+
|
1745
|
+
|
1746
|
+
<a id="org4db9ae4"></a>
|
1747
|
+
|
1748
|
+
### Table Locations
|
1749
|
+
|
1750
|
+
In the formatting methods, the table is divided into several “locations” for
|
1751
|
+
which separate formatting directives may be given. These locations are
|
1752
|
+
identified with the following symbols:
|
1753
|
+
|
1754
|
+
- **:header:** the first row of the output table containing the headers,
|
1755
|
+
- **:footer:** all rows of the table’s footers,
|
1756
|
+
- **:gfooter:** all rows of the table’s group footers,
|
1757
|
+
- **:body:** all the data rows of the table, that is, those that are not part
|
1758
|
+
of the header, footers, or gfooters,
|
1759
|
+
- **:bfirst:** the first row of the table’s body, and
|
1760
|
+
- **:gfirst:** the first row in each group in the table’s body.
|
1761
|
+
|
1762
|
+
|
1763
|
+
<a id="orgd2128a3"></a>
|
1764
|
+
|
1765
|
+
### Formatting Directives
|
1766
|
+
|
1767
|
+
The formatting methods explained below all take formatting
|
1768
|
+
directives as strings in which letters and other characters signify what
|
1769
|
+
formatting applies. For example, we may apply the formatting directive `'R,$'`
|
1770
|
+
to numbers in a certain part of the table. Each of those characters, and in
|
1771
|
+
some cases a whole substring, is a single directive. They can appear in any
|
1772
|
+
order, so `'$R,'` and `',$R'` are equivalent.
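To see why order does not matter, here is a small sketch of a directive scanner. It is a hypothetical mini-parser that recognizes only a handful of the directives described in this section, not the parser the gem uses; `SKETCH_DIRECTIVE_RE` and `parse_directives` are names made up for the example.

```ruby
require 'set'

# Matches c[...] color directives, m.n digit directives, and a few
# single-character directives. Anything else is ignored by this sketch.
SKETCH_DIRECTIVE_RE = /c\[[^\]]*\]|\d+\.\d+|[uUtRLC,$]/.freeze

# Return the directives found in the string as an unordered Set.
def parse_directives(str)
  Set.new(str.scan(SKETCH_DIRECTIVE_RE))
end

p parse_directives('R,$') == parse_directives(',$R')
p parse_directives('tCc[red.yellow]').to_a
```

Because the result is a set of directives rather than a sequence, any permutation of the same directive string produces the same settings.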
|
1773
|
+
|
1774
|
+
Here is a list of all the formatting directives that apply to each cell type:
|
1775
|
+
|
1776
|
+
1. String
|
1777
|
+
|
1778
|
+
For a string element, the following instructions are valid. Note that these can
|
1779
|
+
also be applied to all the other cell types as well since they are all converted
|
1780
|
+
to a string in forming the output.
|
1781
|
+
|
1782
|
+
- **u:** convert the element to all lowercase,
|
1783
|
+
- **U:** convert the element to all uppercase,
|
1784
|
+
- **t:** title case the element, that is, upcase the initial letter in
|
1785
|
+
each word and lower case the other letters
|
1786
|
+
- **B ~B:** make the element bold, or turn off bold
|
1787
|
+
- **I ~I:** make the element italic, or turn off italic
|
1788
|
+
- **R:** align the element on the right of the column
|
1789
|
+
- **L:** align the element on the left of the column
|
1790
|
+
- **C:** align the element in the center of the column
|
1791
|
+
- **c[color]:** render the element in the given color; the color can have
|
1792
|
+
the form fgcolor, fgcolor.bgcolor, or .bgcolor, to set the
|
1793
|
+
foreground or background colors respectively, and each of those can
|
1794
|
+
be an ANSI or X11 color name in addition to the special color,
|
1795
|
+
’none’, which keeps the terminal’s default color.
|
1796
|
+
- **\_ ~\_:** underline the element, or turn off underline
|
1797
|
+
- **\* ~\*:** cause the element to blink, or turn off blink
|
1798
|
+
|
1799
|
+
For example, the directive `'tCc[red.yellow]'` would title-case the element,
|
1800
|
+
center it, and color it red on a yellow background. The directives that are
|
1801
|
+
boolean have negating forms so that, for example, if bold is turned on for all
|
1802
|
+
columns of a given type, it can be countermanded in formatting directives for
|
1803
|
+
particular columns.
|
1804
|
+
|
1805
|
+
2. Numeric
|
1806
|
+
|
1807
|
+
For a numeric element, all the instructions valid for string are available, in
|
1808
|
+
addition to the following:
|
1809
|
+
|
1810
|
+
- **, ~,:** insert grouping commas, or do not insert grouping commas,
|
1811
|
+
- **$ ~$:** format the number as currency according to the locale, or not,
|
1812
|
+
- **m.n:** include at least m digits before the decimal point, padding on
|
1813
|
+
the left with zeroes as needed, and round the number to n
|
1814
|
+
decimal places and include n digits after the decimal point,
|
1815
|
+
padding on the right with zeroes as needed,
|
1816
|
+
- **H:** convert the number (assumed to be in units of seconds) to `HH:MM:SS.ss`
|
1817
|
+
form. So a column that is the result of subtracting two :datetime forms
|
1818
|
+
will result in a :numeric expressed as seconds and can be displayed in
|
1819
|
+
hours, minutes, and seconds with this formatting instruction.
|
1820
|
+
|
1821
|
+
For example, the directive `'R5.0c[blue]'` would right-align the numeric
|
1822
|
+
element, pad it on the left with zeros, and color it blue.
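The effect of the `m.n` and `,` directives can be approximated in a few lines of plain Ruby. This is a sketch of the described behavior only; `format_numeric` and its keyword parameters are hypothetical names, and the gem's own handling (including currency and locale) is not reproduced here.

```ruby
# pre_digits  ~ the m in m.n: minimum digits before the decimal point
# post_digits ~ the n in m.n: digits kept after the decimal point
# commas      ~ the ',' directive: insert grouping commas
def format_numeric(value, pre_digits: 0, post_digits: 0, commas: false)
  sign = value.negative? ? '-' : ''
  # format rounds to post_digits places and pads on the right with zeros.
  whole, frac = format("%.#{post_digits}f", value.abs).split('.')
  # Pad on the left with zeros up to pre_digits.
  whole = whole.rjust(pre_digits, '0')
  # Group digits in threes from the right.
  whole = whole.reverse.scan(/\d{1,3}/).join(',').reverse if commas
  [sign + whole, frac].compact.join('.')
end

puts format_numeric(1, pre_digits: 3)
puts format_numeric(20_000, commas: true)
puts format_numeric(3.14159, post_digits: 2)
```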
|
1823
|
+
|
1824
|
+
3. DateTime
|
1825
|
+
|
1826
|
+
For a `DateTime`, all the instructions valid for string are available, in
|
1827
|
+
addition to the following:
|
1828
|
+
|
1829
|
+
- **d[fmt]:** apply the format to a `Date` or a `DateTime` that is a whole day,
|
1830
|
+
that is that has no or zero hour, minute, and second components, where fmt
|
1831
|
+
is a valid format string for `Date#strftime`, otherwise, the datetime will
|
1832
|
+
be formatted as an ISO 8601 string, YYYY-MM-DD.
|
1833
|
+
- **D[fmt]:** apply the format to a datetime that has at least a non-zero hour
|
1834
|
+
component where fmt is a valid format string for Date#strftime, otherwise,
|
1835
|
+
the datetime will be formatted as an ISO 8601 string, YYYY-MM-DD.
|
1836
|
+
|
1837
|
+
For example, `'c[pink]d[%b %-d, %Y]C'`, would format a date element like ’Sep
|
1838
|
+
22, 1957’, center it, and color it pink.
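The choice between the `d[fmt]` and `D[fmt]` formats can be sketched with Ruby's standard `Date#strftime`. The helper `format_date_cell` and its keyword names are assumptions made for this example; only `strftime` itself is standard library behavior.

```ruby
require 'date'

# Use day_fmt for whole days (a Date, or a DateTime with zero time
# components) and time_fmt for anything with a time of day.
def format_date_cell(value, day_fmt:, time_fmt:)
  whole_day = !value.is_a?(DateTime) ||
              (value.hour.zero? && value.minute.zero? && value.second.zero?)
  value.strftime(whole_day ? day_fmt : time_fmt)
end

puts format_date_cell(Date.new(1957, 9, 22),
                      day_fmt: '%b %-d, %Y', time_fmt: '%Y-%m-%d %H:%M')
puts format_date_cell(DateTime.new(1957, 9, 22, 14, 30, 0),
                      day_fmt: '%b %-d, %Y', time_fmt: '%Y-%m-%d %H:%M')
```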
|
1839
|
+
|
1840
|
+
4. Boolean
|
1841
|
+
|
1842
|
+
For a boolean cell, all the instructions valid for string are available, in
|
1843
|
+
addition to the following:
|
1844
|
+
|
1845
|
+
- **Y:** print true as ’`Y`’ and false as ’`N`’,
|
1846
|
+
- **T:** print true as ’`T`’ and false as ’`F`’,
|
1847
|
+
- **X:** print true as ’`X`’ and false as an empty string ’’,
|
1848
|
+
- **b[xxx,yyy]:** print true as the string given as `xxx` and false as the string
|
1849
|
+
given as `yyy`,
|
1850
|
+
- **c[tcolor,fcolor]:** color a true element with `tcolor` and a false element
|
1851
|
+
with `fcolor`. Each of the colors may be specified in the same manner as
|
1852
|
+
colors for strings described above.
|
1853
|
+
|
1854
|
+
For example, the directive ’`b[Yeppers,Nope]c[green.pink,red.pink]`’ would
|
1855
|
+
render a true boolean as ’`Yeppers`’ colored green on pink and render a false
|
1856
|
+
boolean as ’`Nope`’ colored red on pink. See [Yeppers](https://www.youtube.com/watch?v=oLdFFD8II8U) for additional information.
|
1857
|
+
|
1858
|
+
5. NilClass
|
1859
|
+
|
1860
|
+
By default, `nil` elements are rendered as blank cells, but you can make them
|
1861
|
+
visible with the following, and in that case, all the formatting instructions
|
1862
|
+
valid for strings are also available:
|
1863
|
+
|
1864
|
+
- **n[niltext]:** render a `nil` item with the given niltext.
|
1865
|
+
|
1866
|
+
For example, you might want to use `'n[-]Cc[purple]'` to make nils visible as a
|
1867
|
+
centered purple hyphen.
|
1868
|
+
|
1869
|
+
|
1870
|
+
<a id="org947e8a4"></a>
|
1871
|
+
|
1872
|
+
### Footers Methods
|
1873
|
+
|
1874
|
+
You can call the `footer` and `gfooter` methods on `Formatter` objects to add
|
1875
|
+
footers and group footers. Their signatures are:
|
1876
|
+
|
1877
|
+
- **`footer(label, *sum_cols, **agg_cols)`:** where `label` is a label to be
|
1878
|
+
placed in the first cell of the footer (unless that column is named as one
|
1879
|
+
of the `sum_cols` or `agg_cols`, in which case the label is ignored),
|
1880
|
+
`*sum_cols` are zero or more symbols for columns to be summed, and
|
1881
|
+
`**agg_cols` is zero or more hash-like parameters with a column symbol as a
|
1882
|
+
key and a symbol for an aggregate method as the value. This causes a
|
1883
|
+
table-wide footer to be added at the bottom of the table applying the
|
1884
|
+
`:sum` aggregate to the `sum_cols` and the named aggregate method to the
|
1885
|
+
`agg_cols`. A table can have any number of footers attached, and they will
|
1886
|
+
appear at the bottom of the output table in the order they are given.
|
1887
|
+
|
1888
|
+
- **`gfooter(label, *sum_cols, **agg_cols)`:** where the parameters have the same
|
1889
|
+
meaning as for the `footer` method, but result in a footer for each group
|
1890
|
+
in the table rather than the table as a whole. These will appear in the
|
1891
|
+
output table just below each group.
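What a footer row contains can be modeled in plain Ruby, using the same sample figures as the `tab_a` table shown elsewhere in this README. The helper `footer_row` is hypothetical and is not the gem's implementation; the aggregate names other than `:sum` are assumed here for illustration.

```ruby
SKETCH_HEADERS = %i[id name age salary].freeze
SKETCH_ROWS = [
  { id: 1,  name: 'Paul',  age: 32, salary: 20_000 },
  { id: 3,  name: 'Teddy', age: 23, salary: 20_000 },
  { id: 4,  name: 'Mark',  age: 25, salary: 65_000 },
  { id: 5,  name: 'David', age: 27, salary: 85_000 },
  { id: 2,  name: 'Allen', age: 25, salary: nil },
  { id: 8,  name: 'Paul',  age: 24, salary: 20_000 },
  { id: 9,  name: 'James', age: 44, salary: 5_000 },
  { id: 10, name: 'James', age: 45, salary: 5_000 }
].freeze

# Build one footer row: the label goes in the first column, sum_cols are
# summed, and agg_cols maps a column to the aggregate to apply to it.
def footer_row(rows, headers, label, *sum_cols, **agg_cols)
  aggs = sum_cols.to_h { |col| [col, :sum] }.merge(agg_cols)
  row = headers.to_h { |h| [h, nil] }
  row[headers.first] = label
  aggs.each do |col, agg|
    vals = rows.map { |r| r[col] }.compact # nils are skipped
    row[col] = case agg
               when :sum then vals.sum
               when :min then vals.min
               when :max then vals.max
               when :avg then vals.sum.to_f / vals.size
               end
  end
  row
end

p footer_row(SKETCH_ROWS, SKETCH_HEADERS, 'Total', :salary, :age)
p footer_row(SKETCH_ROWS, SKETCH_HEADERS, 'Oldest', age: :max)
```

If the first column is itself aggregated, the aggregate overwrites the label in this sketch, matching the rule stated above that the label is then ignored.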
|
1892
|
+
|
1893
|
+
There are also a number of convenience methods for adding common footers:
|
1894
|
+
|
1895
|
+
- **`sum_footer(*cols)`:** Add a footer summing the given columns with the label
|
1896
|
+
’Total’.
|
1897
|
+
- **`sum_gfooter(*cols)`:** Add a group footer summing the given columns with the
|
1898
|
+
label ’Group Total’.
|
1899
|
+
- **`avg_footer(*cols)`:** Add a footer averaging the given columns with the label
|
1900
|
+
’Average’.
|
1901
|
+
- **`avg_gfooter(*cols)`:** Add a group footer averaging the given columns with the label
|
1902
|
+
’Group Average’.
|
1903
|
+
- **`min_footer(*cols)`:** Add a footer showing the minimum for the given columns
|
1904
|
+
with the label ’Minimum’.
|
1905
|
+
- **`min_gfooter(*cols)`:** Add a group footer showing the minimum for the given
|
1906
|
+
columns with the label ’Group Minimum’.
|
1907
|
+
- **`max_footer(*cols)`:** Add a footer showing the maximum for the given columns
|
1908
|
+
with the label ’Maximum’.
|
1909
|
+
- **`max_gfooter(*cols)`:** Add a group footer showing the maximum for the given
|
1910
|
+
columns with the label ’Group Maximum’.
|
1911
|
+
|
1912
|
+
|
1913
|
+
<a id="orgcef241a"></a>
|
1914
|
+
|
1915
|
+
### Formatting Methods
|
1916
|
+
|
1917
|
+
You can call methods on `Formatter` objects to specify formatting directives
|
1918
|
+
for specific columns or types. There are two methods for doing so, `format_for`
|
1919
|
+
and `format`.
|
1920
|
+
|
1921
|
+
1. Instantiating a Formatter
|
1922
|
+
|
1923
|
+
There are several ways to invoke the formatting methods on a table. First, you
|
1924
|
+
can instantiate a `XXXFormatter` object and feed it a table as a parameter.
|
1925
|
+
There is a Formatter subclass for each target output medium, for example,
|
1926
|
+
`AoaFormatter` will produce a ruby array of arrays. You can then call the
|
1927
|
+
`output` method on the `XXXFormatter`.
|
1928
|
+
|
1929
|
+
FatTable::AoaFormatter.new(tab_a).output
|
1930
|
+
|
1931
|
+
| Id | Name | Age | Address | Salary | Join Date |
|
1932
|
+
|----|-------|-----|------------|--------|------------|
|
1933
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 |
|
1934
|
+
| 3 | Teddy | 23 | Norway | 20000 | 2007-12-13 |
|
1935
|
+
| 4 | Mark | 25 | Rich-Mond | 65000 | 2007-12-13 |
|
1936
|
+
| 5 | David | 27 | Texas | 85000 | 2007-12-13 |
|
1937
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 |
|
1938
|
+
| 8 | Paul | 24 | Houston | 20000 | 2005-07-13 |
|
1939
|
+
| 9 | James | 44 | Norway | 5000 | 2005-07-13 |
|
1940
|
+
| 10 | James | 45 | Texas | 5000 | |
|
1941
|
+
|
1942
|
+
The `XXXFormatter.new` method yields the new instance to any block given, and
|
1943
|
+
you can call methods on it to affect the formatting of the output:
|
1944
|
+
|
1945
|
+
FatTable::AoaFormatter.new(tab_a) do |f|
|
1946
|
+
f.format(numeric: '0.0,R', id: '3.0C')
|
1947
|
+
end.output
|
1948
|
+
|
1949
|
+
| Id | Name | Age | Address | Salary | Join Date |
|
1950
|
+
|-----|-------|-----|------------|--------|------------|
|
1951
|
+
| 001 | Paul | 32 | California | 20,000 | 2001-07-13 |
|
1952
|
+
| 003 | Teddy | 23 | Norway | 20,000 | 2007-12-13 |
|
1953
|
+
| 004 | Mark | 25 | Rich-Mond | 65,000 | 2007-12-13 |
|
1954
|
+
| 005 | David | 27 | Texas | 85,000 | 2007-12-13 |
|
1955
|
+
| 002 | Allen | 25 | Texas | | 2005-07-13 |
|
1956
|
+
| 008 | Paul | 24 | Houston | 20,000 | 2005-07-13 |
|
1957
|
+
| 009 | James | 44 | Norway | 5,000 | 2005-07-13 |
|
1958
|
+
| 010 | James | 45 | Texas | 5,000 | |
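The "yield the new instance to a block" pattern used above is ordinary Ruby and can be sketched without the gem. `SketchFormatter` and `pad_ids` are invented names for this illustration; only the shape of the constructor mirrors what the text describes.

```ruby
class SketchFormatter
  def initialize(rows)
    @rows = rows
    @pad_width = 0
    # Hand the new instance to the caller's block, if one was given.
    yield self if block_given?
  end

  # A stand-in for a formatting method: zero-pad the first column.
  def pad_ids(width)
    @pad_width = width
    self
  end

  def output
    @rows.map { |id, name| [id.to_s.rjust(@pad_width, '0'), name] }
  end
end

plain = SketchFormatter.new([[1, 'Paul'], [3, 'Teddy']]).output
padded = SketchFormatter.new([[1, 'Paul'], [3, 'Teddy']]) do |f|
  f.pad_ids(3)
end.output

p plain
p padded
```

Because `new` returns the configured instance, `output` can be chained directly onto the end of the block.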
|
1959
|
+
|
1960
|
+
2. `FatTable` module-level method calls
|
1961
|
+
|
1962
|
+
The `FatTable` module provides a set of methods of the form `to_aoa`, `to_text`,
|
1963
|
+
etc., to access a `Formatter` without having to create an instance yourself.
|
1964
|
+
Without a block, they apply the default formatting to the table and call the
|
1965
|
+
`.output` method automatically:
|
1966
|
+
|
1967
|
+
FatTable.to_aoa(tab_a)
|
1968
|
+
|
1969
|
+
| Id | Name | Age | Address | Salary | Join Date |
|
1970
|
+
|----|-------|-----|------------|--------|------------|
|
1971
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 |
|
1972
|
+
| 3 | Teddy | 23 | Norway | 20000 | 2007-12-13 |
|
1973
|
+
| 4 | Mark | 25 | Rich-Mond | 65000 | 2007-12-13 |
|
1974
|
+
| 5 | David | 27 | Texas | 85000 | 2007-12-13 |
|
1975
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 |
|
1976
|
+
| 8 | Paul | 24 | Houston | 20000 | 2005-07-13 |
|
1977
|
+
| 9 | James | 44 | Norway | 5000 | 2005-07-13 |
|
1978
|
+
| 10 | James | 45 | Texas | 5000 | |
|
1979
|
+
|
1980
|
+
With a block, these methods yield a `Formatter` instance on which you can call
|
1981
|
+
formatting and footer methods. The `.output` method is called on the `Formatter`
|
1982
|
+
automatically after the block:
|
1983
|
+
|
1984
|
+
FatTable.to_aoa(tab_a) do |f|
|
1985
|
+
f.format(numeric: '0.0,R', id: '3.0C')
|
1986
|
+
end
|
1987
|
+
|
1988
|
+
| Id | Name | Age | Address | Salary | Join Date |
|
1989
|
+
|-----|-------|-----|------------|--------|------------|
|
1990
|
+
| 001 | Paul | 32 | California | 20,000 | 2001-07-13 |
|
1991
|
+
| 003 | Teddy | 23 | Norway | 20,000 | 2007-12-13 |
|
1992
|
+
| 004 | Mark | 25 | Rich-Mond | 65,000 | 2007-12-13 |
|
1993
|
+
| 005 | David | 27 | Texas | 85,000 | 2007-12-13 |
|
1994
|
+
| 002 | Allen | 25 | Texas | | 2005-07-13 |
|
1995
|
+
| 008 | Paul | 24 | Houston | 20,000 | 2005-07-13 |
|
1996
|
+
| 009 | James | 44 | Norway | 5,000 | 2005-07-13 |
|
1997
|
+
| 010 | James | 45 | Texas | 5,000 | |
|
1998
|
+
|
1999
|
+
3. Calling methods on Table objects
|
2000
|
+
|
2001
|
+
Finally, you can call methods such as `to_aoa`, `to_text`, etc., directly on a
|
2002
|
+
Table:
|
2003
|
+
|
2004
|
+
tab_a.to_aoa
|
2005
|
+
|
2006
|
+
| Id | Name | Age | Address | Salary | Join Date |
|
2007
|
+
|----|-------|-----|------------|--------|------------|
|
2008
|
+
| 1 | Paul | 32 | California | 20000 | 2001-07-13 |
|
2009
|
+
| 3 | Teddy | 23 | Norway | 20000 | 2007-12-13 |
|
2010
|
+
| 4 | Mark | 25 | Rich-Mond | 65000 | 2007-12-13 |
|
2011
|
+
| 5 | David | 27 | Texas | 85000 | 2007-12-13 |
|
2012
|
+
| 2 | Allen | 25 | Texas | | 2005-07-13 |
|
2013
|
+
| 8 | Paul | 24 | Houston | 20000 | 2005-07-13 |
|
2014
|
+
| 9 | James | 44 | Norway | 5000 | 2005-07-13 |
|
2015
|
+
| 10 | James | 45 | Texas | 5000 | |
|
2016
|
+
|
2017
|
+
And you can supply a block to them as well to specify formatting or footers:
|
2018
|
+
|
2019
|
+
tab_a.to_aoa do |f|
|
2020
|
+
f.format(numeric: '0.0,R', id: '3.0C')
|
2021
|
+
f.sum_footer(:salary, :age)
|
2022
|
+
end
|
2023
|
+
|
2024
|
+
| Id | Name | Age | Address | Salary | Join Date |
|
2025
|
+
|-------|-------|-----|------------|---------|------------|
|
2026
|
+
| 001 | Paul | 32 | California | 20,000 | 2001-07-13 |
|
2027
|
+
| 003 | Teddy | 23 | Norway | 20,000 | 2007-12-13 |
|
2028
|
+
| 004 | Mark | 25 | Rich-Mond | 65,000 | 2007-12-13 |
|
2029
|
+
| 005 | David | 27 | Texas | 85,000 | 2007-12-13 |
|
2030
|
+
| 002 | Allen | 25 | Texas | | 2005-07-13 |
|
2031
|
+
| 008 | Paul | 24 | Houston | 20,000 | 2005-07-13 |
|
2032
|
+
| 009 | James | 44 | Norway | 5,000 | 2005-07-13 |
|
2033
|
+
| 010 | James | 45 | Texas | 5,000 | |
|
2034
|
+
|-------|-------|-----|------------|---------|------------|
|
2035
|
+
| Total | | 245 | | 220,000 | |
|
2036
|
+
|
2037
|
+
|
2038
|
+
<a id="org7b25866"></a>
|
2039
|
+
|
2040
|
+
### The `format` and `format_for` methods
|
2041
|
+
|
2042
|
+
Formatters take only two kinds of methods, those that attach footers to a
|
2043
|
+
table, which are discussed above, and those that specify
|
2044
|
+
formatting for table cells, which are the subject of this section.
|
2045
|
+
|
2046
|
+
To set formatting directives for all locations in a table at once, use the
|
2047
|
+
`format` method; to set formatting directives for a particular location in the
|
2048
|
+
table, use the `format_for` method, giving the location as the first parameter.
|
2049
|
+
|
2050
|
+
Other than that first parameter, the two methods take the same types of
|
2051
|
+
parameters. The remaining parameters are hash-like parameters that use either a
|
2052
|
+
column name or a type as the key and a string with the formatting directives to
|
2053
|
+
apply as the value. The following example says to set the formatting for all
|
2054
|
+
locations in the table and to format all numeric fields as strings that are
|
2055
|
+
rounded to whole numbers (the ’0.0’ part), that are right-aligned (the ’R’
|
2056
|
+
part), and have grouping commas inserted (the ’,’ part). But the `:id` column is
|
2057
|
+
numeric, and the second parameter overrides the formatting for numerics in
|
2058
|
+
general and calls for the `:id` column to be padded to three digits with zeros
|
2059
|
+
on the left (the ’3.0’ part) and to be centered (the ’C’ part).
|
2060
|
+
|
2061
|
+
tab_a.to_aoa do |f|
|
2062
|
+
f.format(numeric: '0.0,R', id: '3.0C')
|
2063
|
+
end
|
2064
|
+
|
2065
|
+
| Id | Name | Age | Address | Salary | Join Date |
|
2066
|
+
|-----|-------|-----|------------|--------|------------|
|
2067
|
+
| 001 | Paul | 32 | California | 20,000 | 2001-07-13 |
|
2068
|
+
| 003 | Teddy | 23 | Norway | 20,000 | 2007-12-13 |
|
2069
|
+
| 004 | Mark | 25 | Rich-Mond | 65,000 | 2007-12-13 |
|
2070
|
+
| 005 | David | 27 | Texas | 85,000 | 2007-12-13 |
|
2071
|
+
| 002 | Allen | 25 | Texas | | 2005-07-13 |
|
2072
|
+
| 008 | Paul | 24 | Houston | 20,000 | 2005-07-13 |
|
2073
|
+
| 009 | James | 44 | Norway | 5,000 | 2005-07-13 |
|
2074
|
+
| 010 | James | 45 | Texas | 5,000 | |
|
2075
|
+
|
2076
|
+
The `numeric:` directive affected the `:age` and `:salary` columns and the `id:`
|
2077
|
+
directive affected only the `:id` column. All the other cells in the table had
|
2078
|
+
the default formatting applied.
|
2079
|
+
|
2080
|
+
1. Location priority
|
2081
|
+
|
2082
|
+
Formatting for any given cell depends on its location in the table. The
|
2083
|
+
`format_for` method takes a location to which its formatting directives are
|
2084
|
+
restricted as the first argument. It can be one of the following:
|
2085
|
+
|
2086
|
+
- **`:header`:** directives apply only to the header row, that is, the first row, of
|
2087
|
+
the output table,
|
2088
|
+
|
2089
|
+
- **`:footer`:** directives apply to all the footer rows of the output table,
|
2090
|
+
regardless of how many there are,
|
2091
|
+
|
2092
|
+
- **`:gfooter`:** directives apply to all group footer rows of the output table,
|
2093
|
+
regardless of how many there are,
|
2094
|
+
|
2095
|
+
- **`:body`:** directives apply to all rows in the body of the table unless the
|
2096
|
+
row is the first row in the table or in a group and separate directives for
|
2097
|
+
those have been given, in which case those directives apply,
|
2098
|
+
|
2099
|
+
- **`:gfirst`:** directives apply to the first row in each group in the body of
|
2100
|
+
the table, unless the row is also the first row in the table as a whole, in
|
2101
|
+
which case the `:bfirst` directives apply,
|
2102
|
+
|
2103
|
+
- **`:bfirst`:** directives apply to the first row in the body of the table.
|
2104
|
+
|
2105
|
+
If you give directives for `:body`, they are copied to `:bfirst` and `:gfirst`
|
2106
|
+
as well and can be overridden by directives for those locations.
|
2107
|
+
|
2108
|
+
Directives given to the `format` method apply the directives to all locations in
|
2109
|
+
the table, but they can be overridden by more specific directives given in a
|
2110
|
+
`format_for` directive.
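The location rules for body rows can be summarized in a short sketch. Both helpers are hypothetical and only model the priorities stated above (`:bfirst` over `:gfirst` over `:body`, with `:body` directives used where nothing more specific was given); they are not taken from the gem's source.

```ruby
# Decide which location governs a body row. group_starts holds the
# indexes of rows that begin a group.
def body_location(row_index, group_starts)
  return :bfirst if row_index.zero?
  return :gfirst if group_starts.include?(row_index)

  :body
end

# Look up the directives for a row, falling back to the :body directives
# when none were given for the more specific location.
def directives_for_row(row_index, group_starts, formats)
  location = body_location(row_index, group_starts)
  formats.fetch(location) { formats[:body] }
end

formats = { body: 'R', bfirst: 'B' }
p directives_for_row(0, [0, 3], formats)
p directives_for_row(3, [0, 3], formats)
p directives_for_row(4, [0, 3], formats)
```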
|
2111
|
+
|
2112
|
+
2. Type and Column priority
|
2113
|
+
|
2114
|
+
A directive based on type applies to all columns having that type unless
|
2115
|
+
overridden by a directive specific to a named column; a directive based on a
|
2116
|
+
column name applies only to cells in that column.
|
2117
|
+
|
2118
|
+
However, there is a twist. Since the end result of formatting is to convert all
|
2119
|
+
columns to strings, the formatting directives for the `:string` type apply to
|
2120
|
+
all columns. Likewise, since all columns may contain nils, the `nil:` type
|
2121
|
+
applies to nils in all columns regardless of the column’s type.
|
2122
|
+
|
2123
|
+
require 'fat_table'
|
2124
|
+
tab_a.to_text do |f|
|
2125
|
+
f.format(string: 'R', id: '3.0C', salary: 'n[N/A]')
|
2126
|
+
end
|
2127
|
+
|
2128
|
+
+=====+=======+=====+============+========+============+
|
2129
|
+
| Id | Name | Age | Address | Salary | Join Date |
|
2130
|
+
+-----|-------|-----|------------|--------|------------+
|
2131
|
+
| 001 | Paul | 32 | California | 20000 | 2001-07-13 |
|
2132
|
+
| 003 | Teddy | 23 | Norway | 20000 | 2007-12-13 |
|
2133
|
+
| 004 | Mark | 25 | Rich-Mond | 65000 | 2007-12-13 |
|
2134
|
+
| 005 | David | 27 | Texas | 85000 | 2007-12-13 |
|
2135
|
+
| 002 | Allen | 25 | Texas | N/A | 2005-07-13 |
|
2136
|
+
| 008 | Paul | 24 | Houston | 20000 | 2005-07-13 |
|
2137
|
+
| 009 | James | 44 | Norway | 5000 | 2005-07-13 |
|
2138
|
+
| 010 | James | 45 | Texas | 5000 | |
|
2139
|
+
+=====+=======+=====+============+========+============+
|
2140
|
+
|
2141
|
+
The `string: 'R'` directive causes all the cells to be right-aligned except
|
2142
|
+
`:id` which specifies centering for the `:id` column only. The `n[N/A]`
|
2143
|
+
directive specifies how nils are displayed in the numeric column, `:salary`,
|
2144
|
+
but not for other nils, such as in the last row of the `:join_date` column.
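One plausible reading of this precedence is a layered merge: settings for the `:string` type first, then the column's own type, then the named column. The sketch below models that reading with already-parsed settings hashes; `settings_for` and the setting keys are assumptions made for illustration, and how the gem merges directives internally is not shown in this README.

```ruby
# Later layers override earlier ones: string type, then the column's
# type, then the column itself.
def settings_for(column, type, formats)
  [formats[:string], formats[type], formats[column]].compact.reduce({}, :merge)
end

formats = {
  string: { align: :right },
  numeric: { commas: true },
  id: { align: :center, pre_digits: 3 }
}

p settings_for(:id, :numeric, formats)
p settings_for(:salary, :numeric, formats)
p settings_for(:name, :string, formats)
```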
|
2145
|
+
|
2146
|
+
|
2147
|
+
<a id="org62e325b"></a>
|
2148
|
+
|
2149
|
+
# Development
|
2150
|
+
|
2151
|
+
After checking out the repo, run `bin/setup` to install dependencies. Then, run
|
2152
|
+
`rake spec` to run the tests. You can also run `bin/console` for an interactive
|
2153
|
+
prompt that will allow you to experiment.
|
2154
|
+
|
2155
|
+
To install this gem onto your local machine, run `bundle exec rake install`. To
|
2156
|
+
release a new version, update the version number in `version.rb`, and then run
|
2157
|
+
`bundle exec rake release`, which will create a git tag for the version, push
|
2158
|
+
git commits and tags, and push the `.gem` file to
|
2159
|
+
[rubygems.org](<https://rubygems.org>).
|
2160
|
+
|
2161
|
+
|
2162
|
+
<a id="orgf51a2c9"></a>
|
2163
|
+
|
2164
|
+
# Contributing
|
2165
|
+
|
2166
|
+
Bug reports and pull requests are welcome on GitHub at
|
2167
|
+
<https://github.com/ddoherty03/fat_table>.
|