spreadbase 0.1.2 → 0.3.0

data/README.md CHANGED
@@ -1,8 +1,17 @@
1
+ [![CI](https://github.com/saveriomiroddi/spreadbase/actions/workflows/ci.yml/badge.svg)](https://github.com/saveriomiroddi/spreadbase/actions/workflows/ci.yml)
2
+
1
3
  SpreadBase!!
2
4
  ============
3
5
 
4
6
  ... because Excel IS a database.
5
7
 
8
+ Status
9
+ ------
10
+
11
+ The library itself is stable, and can be regularly used.
12
+
13
+ I plan to add features on request, but if nobody asks for them, I will update the project very infrequently.
14
+
6
15
  What is SpreadBase©?
7
16
  --------------------
8
17
 
@@ -53,6 +62,27 @@ Append a row:
53
62
 
54
63
  table_2.append_row( [ 'Fabrizio F.' ] )
55
64
 
65
+ Read a column, or a range of columns:
66
+
67
+ table.column( 0 )
68
+
69
+ # [ 'Dish', 'Roasted 6502', '8080, with an 8-bit bus', '65000 with side dishes of Copper and Blitter' ]
70
+
71
+ table.column( 0 .. 1 )
72
+
73
+ # [ [ 'Dish', 'Roasted 6502', '8080, with an 8-bit bus', '65000 with side dishes of Copper and Blitter' ],
74
+ # [ 'Price', 38.911, 8, 512.0 ] ]
75
+
76
+ Read a row, or a range of rows:
77
+
78
+ table.row( 1 )
79
+
80
+ # [ 'Roasted 6502', 38.911 ]
81
+
82
+ table.row( 1 .. 2 )
83
+
84
+ # [ [ 'Roasted 6502', 38.911 ], [ '8080, with an 8-bit bus', 8 ] ]
85
+
56
86
  Read a cell:
57
87
 
58
88
  price_8080 = document.tables[ 0 ][ 1, 2 ]
@@ -101,20 +131,17 @@ Save the document:
101
131
 
102
132
  document.save
103
133
 
134
+ Enjoy many other APIs.
135
+
104
136
  Notes
105
137
  -----
106
138
 
107
139
  - Numbers are decoded to Fixnum or Float, depending on the existence of the fractional part.
108
140
  Alternatively, numbers with a fractional part can be decoded as Bigdecimal, using the option:
109
141
 
110
- `SpreadBase::Document.new( "Random numbers für alle!.ods", :floats_as_bigdecimal => true )`
111
-
112
- - The archives are always encoded in UTF-8. In Ruby 1.8.7, input strings are assumed to be UTF-8; if not, it's possible to open a document as:
113
-
114
- `SpreadBase::Document.new( "Today's menu.ods", :force_18_strings_encoding => '<encoding>' )`
142
+ `SpreadBase::Document.new( "Random numbers für alle!.ods", floats_as_bigdecimal: true )`
115
143
 
116
- in order to override the input encoding.
117
- - The gem has been tested on Ruby 1.8.7 and 1.9.3-p125, on Linux and Mac OS X.
144
+ - The gem is tested on all the supported Ruby versions (see [Build](https://github.com/saveriomiroddi/spreadbase/actions/workflows/ci.yml)), and used mainly on Linux.
118
145
  - The column widths are retained (decoding/encoding), but at the current version, they're not [officially] accessible via any API.
119
146
 
120
147
  Currently unsupported features
@@ -123,17 +150,6 @@ Currently unsupported features
123
150
  - Styles; Date and and [Date]Times are formatted as, respectively, '%Y-%m-%d' and '%Y-%m-%d %H:%M:%S %z'
124
151
  - Percentage data type - they're handled using their float value (e.g. 50% = 0.5)
125
152
 
126
- Supporting SpreadBase
127
- ---------------------
128
-
129
- If you find SpreadBase useful for any reason, I invite you to join Kiva.org, using this invitation:
130
-
131
- http://www.kiva.org/invitedby/saveriomiroddi
132
-
133
- it will cost you **nothing** (zero/0 €/£/$), it will take three to five minutes of your time, and you will have actively done something for economically disadvantaged countries.
134
-
135
- If you want to do more, in addition to accepting the invitation, you can donate to my Paypal account (saverio.pub2 \<a-hat!\> gmail.com) - I will publish your donation and use the entire amount for making loans using the mentioned website.
136
-
137
153
  Roadmap/Todo
138
154
  ------------
139
155
 
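A minimal sketch of the number-decoding rule from the README notes above, mirroring the `float`/`percentage` branch of the decoder that appears later in this diff. `decode_number` is a hypothetical helper name used only for illustration and is not part of the gem's API. On current Ruby versions the integer result is an `Integer`, since the `Fixnum` class named in the note was merged into it.

```ruby
require 'bigdecimal'

# Hypothetical helper (not in the gem): decides the Ruby type from the
# presence of a fractional part in the stored string.
def decode_number(float_string, floats_as_bigdecimal: false)
  if float_string.include?('.')
    # BigDecimal() is the Kernel conversion method; BigDecimal.new is not
    # available in recent versions of the bigdecimal library.
    floats_as_bigdecimal ? BigDecimal(float_string) : float_string.to_f
  else
    float_string.to_i
  end
end

whole   = decode_number('8')                                  # Integer
float   = decode_number('38.911')                             # Float
decimal = decode_number('38.911', floats_as_bigdecimal: true) # BigDecimal
```

The keyword argument stands in for the `floats_as_bigdecimal: true` option that the README passes to `SpreadBase::Document.new`.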
data/Rakefile ADDED
@@ -0,0 +1,5 @@
1
+ require 'rspec/core/rake_task'
2
+
3
+ RSpec::Core::RakeTask.new(:spec)
4
+
5
+ task default: :spec
data/docs/STRUCTURE.md ADDED
@@ -0,0 +1,3 @@
1
+ - the base directory contains the abstract concept classes (document, table, etc.)
2
+ - `codecs` contains the codec classes, for encoding/decoding
3
+ - `helpers` contains functionality of generic nature
data/docs/TESTING.md ADDED
@@ -0,0 +1,11 @@
1
+ Files
2
+ =====
3
+
4
+ - the `elements` suites test the abstract concept classes (document, table, etc.)
5
+ - the `codecs` suites have functional testing (encode -> decode cycle) of each code, plus UTs for codec-specific functionalities
6
+ - `spec_helpers.rb` are a few constants and methods useful for testing
7
+
8
+ Methodology
9
+ ===========
10
+
11
+ The general workflow is to write specific UTs, then extend the functional test(s), then test on files generated with Libreoffice (from Libreoffice-built to spreadbase, from spreadbase-built to Libreoffice, and from Libreoffice-build to spreadbase to Libreoffice), using the `utils` scripts.
data/lib/spreadbase.rb CHANGED
@@ -1,33 +1,12 @@
1
- # encoding: UTF-8
1
+ require_relative 'spreadbase/helpers/helpers'
2
2
 
3
- =begin
4
- Copyright 2012 Saverio Miroddi saverio.pub2 <a-hat!> gmail.com
3
+ require_relative 'spreadbase/document'
4
+ require_relative 'spreadbase/table'
5
+ require_relative 'spreadbase/cell'
5
6
 
6
- This file is part of SpreadBase.
7
-
8
- SpreadBase is free software: you can redistribute it and/or modify it under the
9
- terms of the GNU Lesser General Public License as published by the Free Software
10
- Foundation, either version 3 of the License, or (at your option) any later
11
- version.
12
-
13
- SpreadBase is distributed in the hope that it will be useful, but WITHOUT ANY
14
- WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
15
- PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
16
-
17
- You should have received a copy of the GNU Lesser General Public License along
18
- with SpreadBase. If not, see <http://www.gnu.org/licenses/>.
19
- =end
20
-
21
- require 'rubygems'
22
-
23
- require File.expand_path( '../spreadbase/helpers/helpers', __FILE__ )
24
-
25
- require File.expand_path( '../spreadbase/document', __FILE__ )
26
- require File.expand_path( '../spreadbase/table', __FILE__ )
27
-
28
- require File.expand_path( '../spreadbase/codecs/open_document_12_modules/encoding', __FILE__ )
29
- require File.expand_path( '../spreadbase/codecs/open_document_12_modules/decoding', __FILE__ )
30
- require File.expand_path( '../spreadbase/codecs/open_document_12', __FILE__ )
7
+ require_relative 'spreadbase/codecs/open_document_12_modules/encoding'
8
+ require_relative 'spreadbase/codecs/open_document_12_modules/decoding'
9
+ require_relative 'spreadbase/codecs/open_document_12'
31
10
 
32
11
  # = Spreadbase
33
12
  #
@@ -0,0 +1,19 @@
1
+ module SpreadBase # :nodoc:
2
+
3
+ # Represents the abstraction of a cell; values and their types are merged into a single entity.
4
+ #
5
+ class Cell
6
+
7
+ attr_accessor :value
8
+
9
+ def initialize(value)
10
+ @value = value
11
+ end
12
+
13
+ def ==(other)
14
+ other.is_a?(Cell) && @value == other.value
15
+ end
16
+
17
+ end
18
+
19
+ end
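A short illustration of the equality semantics that the new `Cell` class defines. The class body is copied from the hunk above so the snippet runs on its own; the variables `a` and `b` are illustrative only.

```ruby
module SpreadBase
  class Cell
    attr_accessor :value

    def initialize(value)
      @value = value
    end

    # Two cells are equal when both are Cell instances wrapping equal values.
    def ==(other)
      other.is_a?(Cell) && @value == other.value
    end
  end
end

a = SpreadBase::Cell.new(38.911)
b = SpreadBase::Cell.new(38.911)

same_value   = (a == b)       # true: both wrap 38.911
versus_plain = (a == 38.911)  # false: a bare Float is not a Cell
same_object  = a.equal?(b)    # false: they remain distinct objects
```

Because `==` checks `is_a?(Cell)` first, comparing a cell against a raw value returns `false` rather than raising.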
@@ -1,24 +1,4 @@
1
- # encoding: UTF-8
2
-
3
- =begin
4
- Copyright 2012 Saverio Miroddi saverio.pub2 <a-hat!> gmail.com
5
-
6
- This file is part of SpreadBase.
7
-
8
- SpreadBase is free software: you can redistribute it and/or modify it under the
9
- terms of the GNU Lesser General Public License as published by the Free Software
10
- Foundation, either version 3 of the License, or (at your option) any later
11
- version.
12
-
13
- SpreadBase is distributed in the hope that it will be useful, but WITHOUT ANY
14
- WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
15
- PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
16
-
17
- You should have received a copy of the GNU Lesser General Public License along
18
- with SpreadBase. If not, see <http://www.gnu.org/licenses/>.
19
- =end
20
-
21
- require 'zipruby'
1
+ require 'zip'
22
2
  require 'rexml/document'
23
3
 
24
4
  module SpreadBase # :nodoc:
@@ -49,20 +29,17 @@ module SpreadBase # :nodoc:
49
29
  #
50
30
  # _options_:
51
31
  #
52
- # +force_18_strings_encoding+:: ('UTF-8') on ruby 1.8, when converting to UTF-8, assume the strings are using the specified format.
53
32
  # +prettify+:: (false )prettifies the content.xml to be human readable.
54
33
  #
55
34
  # _returns_ the archive as binary string.
56
35
  #
57
- def encode_to_archive( el_document, options={} )
58
- document_buffer = encode_to_content_xml( el_document, options )
36
+ def encode_to_archive(el_document, options={})
37
+ document_buffer = encode_to_content_xml(el_document, options)
59
38
  zip_buffer = ''
60
39
 
61
- Zip::Archive.open_buffer( zip_buffer, Zip::CREATE ) do | zip_file |
62
- zip_file.add_dir( 'META-INF' )
63
-
64
- zip_file.add_buffer( 'META-INF/manifest.xml', MANIFEST_XML );
65
- zip_file.add_buffer( 'content.xml', document_buffer );
40
+ Zip::File.open_buffer(zip_buffer) do | zip_file |
41
+ zip_file.get_output_stream('META-INF/manifest.xml') { |f| f << MANIFEST_XML }
42
+ zip_file.get_output_stream('content.xml') { |f| f << document_buffer }
66
43
  end
67
44
 
68
45
  zip_buffer
@@ -81,12 +58,11 @@ module SpreadBase # :nodoc:
81
58
  #
82
59
  # _returns_ the SpreadBase::Document instance.
83
60
  #
84
- def decode_archive( zip_buffer, options={} )
85
- content_xml_data = Zip::Archive.open_buffer( zip_buffer ) do | zip_file |
86
- zip_file.fopen( 'content.xml' ) { | file | file.read }
87
- end
61
+ def decode_archive(zip_buffer, options={})
62
+ io = StringIO.new(zip_buffer)
63
+ content_xml_data = Zip::File.new(io, false, true).read('content.xml')
88
64
 
89
- decode_content_xml( content_xml_data, options )
65
+ decode_content_xml(content_xml_data, options)
90
66
  end
91
67
 
92
68
  # Utility method; encodes the Document to the content.xml format.
@@ -97,18 +73,17 @@ module SpreadBase # :nodoc:
97
73
  #
98
74
  # _options_:
99
75
  #
100
- # +force_18_strings_encoding+:: ('UTF-8') on ruby 1.8, when converting to UTF-8, assume the strings are using the specified format.
101
76
  # +prettify+:: (false ) prettifies the content.xml to be human readable.
102
77
  #
103
78
  # _returns_ content.xml as string.
104
79
  #--
105
80
  # "utility" is a fancy name for testing/utils helper.
106
81
  #
107
- def encode_to_content_xml( el_document, options={} )
108
- prettify = options[ :prettify ]
82
+ def encode_to_content_xml(el_document, options={})
83
+ prettify = options[:prettify]
109
84
 
110
- document_xml_root = encode_to_document_node( el_document, options )
111
- document_buffer = prettify ? pretty_xml( document_xml_root ) : document_xml_root.to_s
85
+ document_xml_root = encode_to_document_node(el_document)
86
+ document_buffer = prettify ? pretty_xml(document_xml_root) : document_xml_root.to_s
112
87
 
113
88
  document_buffer
114
89
  end
@@ -123,20 +98,20 @@ module SpreadBase # :nodoc:
123
98
  #--
124
99
  # "utility" is a fancy name for testing/utils helper.
125
100
  #
126
- def decode_content_xml( content_xml_data, options={} )
127
- root_node = REXML::Document.new( content_xml_data )
101
+ def decode_content_xml(content_xml_data, options={})
102
+ root_node = REXML::Document.new(content_xml_data)
128
103
 
129
- decode_document_node( root_node, options )
104
+ decode_document_node(root_node, options)
130
105
  end
131
106
 
132
107
  private
133
108
 
134
- def pretty_xml( document )
109
+ def pretty_xml(document)
135
110
  buffer = ""
136
111
 
137
112
  xml_formatter = REXML::Formatters::Pretty.new
138
113
  xml_formatter.compact = true
139
- xml_formatter.write( document, buffer )
114
+ xml_formatter.write(document, buffer)
140
115
 
141
116
  buffer
142
117
  end
@@ -1,23 +1,3 @@
1
- # encoding: UTF-8
2
-
3
- =begin
4
- Copyright 2012 Saverio Miroddi saverio.pub2 <a-hat!> gmail.com
5
-
6
- This file is part of SpreadBase.
7
-
8
- SpreadBase is free software: you can redistribute it and/or modify it under the
9
- terms of the GNU Lesser General Public License as published by the Free Software
10
- Foundation, either version 3 of the License, or (at your option) any later
11
- version.
12
-
13
- SpreadBase is distributed in the hope that it will be useful, but WITHOUT ANY
14
- WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
15
- PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
16
-
17
- You should have received a copy of the GNU Lesser General Public License along
18
- with SpreadBase. If not, see <http://www.gnu.org/licenses/>.
19
- =end
20
-
21
1
  require 'date'
22
2
  require 'bigdecimal'
23
3
 
@@ -37,32 +17,32 @@ module SpreadBase # :nodoc:
37
17
 
38
18
  # Returns a Document instance.
39
19
  #
40
- def decode_document_node( root_node, options={} )
20
+ def decode_document_node(root_node, options={})
41
21
  document = Document.new
42
22
 
43
- style_nodes = root_node.elements.to_a( '//office:document-content/office:automatic-styles/style:style' )
44
- table_nodes = root_node.elements.to_a( '//office:document-content/office:body/office:spreadsheet/table:table' )
23
+ style_nodes = root_node.elements.to_a('//office:document-content/office:automatic-styles/style:style')
24
+ table_nodes = root_node.elements.to_a('//office:document-content/office:body/office:spreadsheet/table:table')
45
25
 
46
- document.column_width_styles = decode_column_width_styles( style_nodes )
26
+ document.column_width_styles = decode_column_width_styles(style_nodes)
47
27
 
48
- document.tables = table_nodes.map { | node | decode_table_node( node, options ) }
28
+ document.tables = table_nodes.map { | node | decode_table_node(node, options) }
49
29
 
50
30
  document
51
31
  end
52
32
 
53
33
  # Currently it has only the purpose of decoding the column widths (for this reason it has a different naming convention).
54
34
  #
55
- def decode_column_width_styles( style_nodes )
56
- style_nodes.inject( {} ) do | column_width_styles, style_node |
57
- column_node = style_node.elements[ 'style:table-column-properties' ]
35
+ def decode_column_width_styles(style_nodes)
36
+ style_nodes.inject({}) do | column_width_styles, style_node |
37
+ column_node = style_node.elements['style:table-column-properties']
58
38
 
59
39
  if column_node
60
- column_width = column_node.attributes[ 'style:column-width' ]
40
+ column_width = column_node.attributes['style:column-width']
61
41
 
62
42
  if column_width
63
- style_name = style_node.attributes[ 'style:name' ]
43
+ style_name = style_node.attributes['style:name']
64
44
 
65
- column_width_styles[ style_name] = column_width
45
+ column_width_styles[style_name] = column_width
66
46
  end
67
47
  end
68
48
 
@@ -70,98 +50,123 @@ module SpreadBase # :nodoc:
70
50
  end
71
51
  end
72
52
 
73
- def decode_table_node( table_node, options )
74
- table = Table.new( table_node.attributes[ 'table:name' ] )
53
+ def decode_table_node(table_node, options)
54
+ table = Table.new(table_node.attributes['table:name'])
75
55
 
76
- column_nodes = table_node.elements.to_a( 'table:table-column' )
77
- row_nodes = table_node.elements.to_a( 'table:table-row' )
56
+ column_nodes = table_node.elements.to_a('table:table-column')
57
+ row_nodes = table_node.elements.to_a('table:table-row')
78
58
 
79
59
  # A single column/row can represent multiple columns (table:number-(columns|rows)-repeated)
80
60
  #
81
- table.column_width_styles = column_nodes.inject( [] ) { | current_styles, node | current_styles + decode_column_width_style( node ) }
82
- table.data = row_nodes.inject( [] ) { | current_rows, node | current_rows + decode_row_node( node, options ) }
61
+ table.column_width_styles = column_nodes.inject([]) { | current_styles, node | current_styles + decode_column_width_style(node) }
62
+ table.data = decode_row_nodes(row_nodes, options)
83
63
 
84
64
  table
85
65
  end
86
66
 
87
- def decode_column_width_style( column_node )
88
- repetitions = ( column_node.attributes[ 'table:number-columns-repeated' ] || '1' ).to_i
89
- style_name = column_node.attributes[ 'table:style-name' ]
67
+ def decode_column_width_style(column_node)
68
+ repetitions = (column_node.attributes['table:number-columns-repeated'] || '1').to_i
69
+ style_name = column_node.attributes['table:style-name']
90
70
 
91
71
  # WATCH OUT! See module note
92
72
  #
93
- make_array_from_repetitions( style_name, repetitions )
73
+ make_array_from_repetitions(style_name, repetitions)
94
74
  end
95
75
 
96
- def decode_row_node( row_node, options )
97
- repetitions = ( row_node.attributes[ 'table:number-rows-repeated' ] || '1' ).to_i
98
- cell_nodes = row_node.elements.to_a( 'table:table-cell' )
76
+ def decode_row_nodes(row_nodes, options)
77
+ rows = []
78
+ row_nodes.inject(0) do |size, node|
79
+ row, repetitions = decode_row_node(node, options)
80
+ row.empty? || append_row(rows, size, row, repetitions)
81
+ size + repetitions
82
+ end
83
+ rows
84
+ end
99
85
 
100
- # Watch out the :flatten; a single cell can represent multiple cells (table:number-columns-repeated)
101
- #
102
- values = cell_nodes.map { | node | decode_cell_node( node, options ) }.flatten
86
+ def decode_row_node(row_node, options)
87
+ repetitions = (row_node.attributes['table:number-rows-repeated'] || '1').to_i
88
+ cell_nodes = row_node.elements.to_a('table:table-cell')
103
89
 
104
- make_array_from_repetitions( values, repetitions )
90
+ [decode_cell_nodes(cell_nodes, options), repetitions]
105
91
  end
106
92
 
107
- def decode_cell_node( cell_node, options )
108
- floats_as_bigdecimal = options[ :floats_as_bigdecimal ]
93
+ def append_row(rows, size, row, repetitions)
94
+ (size - rows.size).times { rows << [] }
95
+ rows.concat(make_array_from_repetitions(row, repetitions))
96
+ end
109
97
 
110
- value_type = cell_node.attributes[ 'office:value-type' ]
98
+ def decode_cell_nodes(cell_nodes, options)
99
+ cells = []
100
+ cell_nodes.inject(0) do |size, node|
101
+ cell, repetitions = decode_cell_node(node, options)
102
+ cell.nil? || append_cell(cells, size, cell, repetitions)
103
+ size + repetitions
104
+ end
105
+ cells
106
+ end
111
107
 
112
- value = \
113
- case value_type
114
- when 'string'
115
- value_node = cell_node.elements[ 'text:p' ]
108
+ def decode_cell_node(cell_node, options)
109
+ [
110
+ decode_cell_value(cell_node, options),
111
+ (cell_node.attributes['table:number-columns-repeated'] || '1').to_i
112
+ ]
113
+ end
116
114
 
117
- value_node.text
118
- when 'date'
119
- date_string = cell_node.attributes[ 'office:date-value' ]
115
+ def append_cell(cells, size, cell, repetitions)
116
+ cells[size - 1] = nil if size != cells.size
117
+ cells.concat(make_array_from_repetitions(cell, repetitions))
118
+ end
120
119
 
121
- if date_string =~ /T/
122
- DateTime.strptime( date_string, '%Y-%m-%dT%H:%M:%S' )
123
- else
124
- Date.strptime( date_string, '%Y-%m-%d' )
125
- end
126
- when 'float', 'percentage'
127
- float_string = cell_node.attributes[ 'office:value' ]
128
-
129
- if float_string.include?( '.' )
130
- if floats_as_bigdecimal
131
- BigDecimal.new( float_string )
132
- else
133
- float_string.to_f
134
- end
135
- else
136
- float_string.to_i
137
- end
138
- when 'boolean'
139
- boolean_string = cell_node.attributes[ 'office:boolean-value' ]
140
-
141
- case boolean_string
142
- when 'true'
143
- true
144
- when 'false'
145
- false
120
+ def decode_cell_value(cell_node, options)
121
+ floats_as_bigdecimal = options[:floats_as_bigdecimal]
122
+
123
+ value_type = cell_node.attributes['office:value-type']
124
+
125
+ case value_type
126
+ when 'string'
127
+ value_node = cell_node.elements['text:p']
128
+
129
+ value_node.text
130
+ when 'date'
131
+ date_string = cell_node.attributes['office:date-value']
132
+
133
+ if date_string =~ /T/
134
+ DateTime.strptime(date_string, '%Y-%m-%dT%H:%M:%S')
135
+ else
136
+ Date.strptime(date_string, '%Y-%m-%d')
137
+ end
138
+ when 'float', 'percentage'
139
+ float_string = cell_node.attributes['office:value']
140
+
141
+ if float_string.include?('.')
142
+ if floats_as_bigdecimal
143
+ BigDecimal(float_string)
146
144
  else
147
- raise "Invalid boolean value: #{ boolean_string }"
145
+ float_string.to_f
148
146
  end
149
- when nil
150
- nil
151
147
  else
152
- raise "Unrecognized value type found in a cell: #{ value_type }"
148
+ float_string.to_i
153
149
  end
154
-
155
- repetitions = ( cell_node.attributes[ 'table:number-columns-repeated' ] || '1' ).to_i
156
-
157
- make_array_from_repetitions( value, repetitions )
150
+ when 'boolean'
151
+ boolean_string = cell_node.attributes['office:boolean-value']
152
+
153
+ case boolean_string
154
+ when 'true'
155
+ true
156
+ when 'false'
157
+ false
158
+ else
159
+ raise "Invalid boolean value: #{ boolean_string }"
160
+ end
161
+ when nil
162
+ nil
163
+ else
164
+ raise "Unrecognized value type found in a cell: #{ value_type }"
165
+ end
158
166
  end
159
167
 
160
168
  end
161
169
 
162
- private
163
-
164
-
165
170
  end
166
171
 
167
172
  end
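The new `decode_cell_nodes`/`append_cell` pair above skips empty cells and only pads a gap with `nil` once a later non-empty cell shows up, so a trailing run of repeated empty cells is never expanded. The standalone sketch below reproduces that logic under two assumptions: each XML cell node is replaced by a plain `[value, repetitions]` pair, and `make_array_from_repetitions` simply returns `repetitions` copies of the value (its body is not part of this diff). `expand_cells` is a hypothetical name.

```ruby
# Assumed behavior of the library helper; the real body is not shown in the diff.
def make_array_from_repetitions(value, repetitions)
  Array.new(repetitions, value)
end

# nodes: array of [value, repetitions] pairs; a nil value means an empty cell.
def expand_cells(nodes)
  cells = []

  # `size` tracks the logical width consumed so far, including skipped cells.
  nodes.inject(0) do |size, (value, repetitions)|
    unless value.nil?
      # Assigning past the end fills the gap left by skipped cells with nils.
      cells[size - 1] = nil if size != cells.size
      cells.concat(make_array_from_repetitions(value, repetitions))
    end

    size + repetitions
  end

  cells
end

row = expand_cells([['a', 1], [nil, 2], ['b', 2], [nil, 1000]])
# row == ['a', nil, nil, 'b', 'b']
```

The 1000 trailing empty cells contribute nothing to the result, while the two empty cells between `'a'` and `'b'` are kept as `nil` so column positions stay aligned. `decode_row_nodes`/`append_row` apply the same idea to rows, padding gaps with empty arrays instead of `nil`.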