spreadbase 0.1.2 → 0.3.0

data/README.md CHANGED
@@ -1,8 +1,17 @@
1
+ [![CI](https://github.com/saveriomiroddi/spreadbase/actions/workflows/ci.yml/badge.svg)](https://github.com/saveriomiroddi/spreadbase/actions/workflows/ci.yml)
2
+
1
3
  SpreadBase!!
2
4
  ============
3
5
 
4
6
  ... because Excel IS a database.
5
7
 
8
+ Status
9
+ ------
10
+
11
+ The library itself is stable and suitable for regular use.
12
+
13
+ I plan to add features on request, but if nobody asks for them, I will update the project very infrequently.
14
+
6
15
  What is SpreadBase©?
7
16
  --------------------
8
17
 
@@ -53,6 +62,27 @@ Append a row:
53
62
 
54
63
  table_2.append_row( [ 'Fabrizio F.' ] )
55
64
 
65
+ Read a column, or a range of columns:
66
+
67
+ table.column( 0 )
68
+
69
+ # [ 'Dish', 'Roasted 6502', '8080, with an 8-bit bus', '65000 with side dishes of Copper and Blitter' ]
70
+
71
+ table.column( 0 .. 1 )
72
+
73
+ # [ [ 'Dish', 'Roasted 6502', '8080, with an 8-bit bus', '65000 with side dishes of Copper and Blitter' ],
74
+ # [ 'Price', 38.911, 8, 512.0 ] ]
75
+
76
+ Read a row, or a range of rows:
77
+
78
+ table.row( 1 )
79
+
80
+ # [ 'Roasted 6502', 38.911 ]
81
+
82
+ table.row( 1 .. 2 )
83
+
84
+ # [ [ 'Roasted 6502', 38.911 ], [ '8080, with an 8-bit bus', 8 ] ]
85
+
56
86
  Read a cell:
57
87
 
58
88
  price_8080 = document.tables[ 0 ][ 1, 2 ]
@@ -101,20 +131,17 @@ Save the document:
101
131
 
102
132
  document.save
103
133
 
134
+ Enjoy many other APIs.
135
+
104
136
  Notes
105
137
  -----
106
138
 
107
139
  - Numbers are decoded to Integer or Float, depending on whether they have a fractional part.
108
140
  Alternatively, numbers with a fractional part can be decoded as BigDecimal, using the option:
109
141
 
110
- `SpreadBase::Document.new( "Random numbers für alle!.ods", :floats_as_bigdecimal => true )`
111
-
112
- - The archives are always encoded in UTF-8. In Ruby 1.8.7, input strings are assumed to be UTF-8; if not, it's possible to open a document as:
113
-
114
- `SpreadBase::Document.new( "Today's menu.ods", :force_18_strings_encoding => '<encoding>' )`
142
+ `SpreadBase::Document.new( "Random numbers für alle!.ods", floats_as_bigdecimal: true )`
115
143
 
116
- in order to override the input encoding.
117
- - The gem has been tested on Ruby 1.8.7 and 1.9.3-p125, on Linux and Mac OS X.
144
+ - The gem is tested on all the supported Ruby versions (see [Build](https://github.com/saveriomiroddi/spreadbase/actions/workflows/ci.yml)), and used mainly on Linux.
118
145
  - The column widths are retained (decoding/encoding), but at the current version, they're not [officially] accessible via any API.
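The number-decoding rule from the first note can be sketched in isolation. This is a minimal illustration, not the gem's code path: `decode_number` is a hypothetical helper name, and it assumes the raw value arrives as a string, as the `office:value` attribute does in the decoder.

```ruby
require 'bigdecimal'

# Hypothetical helper (not part of SpreadBase's API): a value with a fractional
# part becomes a Float (or a BigDecimal on request); anything else becomes an integer.
def decode_number(float_string, floats_as_bigdecimal: false)
  if float_string.include?('.')
    floats_as_bigdecimal ? BigDecimal(float_string) : float_string.to_f
  else
    float_string.to_i
  end
end

p decode_number('8').class                                  # Integer
p decode_number('38.911').class                             # Float
p decode_number('38.911', floats_as_bigdecimal: true).class # BigDecimal
```

The check is purely textual (presence of a `.`), so a value stored as `8.0` would come back as a Float rather than an integer.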
119
146
 
120
147
  Currently unsupported features
@@ -123,17 +150,6 @@ Currently unsupported features
123
150
  - Styles; Date and [Date]Times are formatted as, respectively, '%Y-%m-%d' and '%Y-%m-%d %H:%M:%S %z'
124
151
  - Percentage data type - they're handled using their float value (e.g. 50% = 0.5)
125
152
 
126
- Supporting SpreadBase
127
- ---------------------
128
-
129
- If you find SpreadBase useful for any reason, I invite you to join Kiva.org, using this invitation:
130
-
131
- http://www.kiva.org/invitedby/saveriomiroddi
132
-
133
- it will cost you **nothing** (zero/0 €/£/$), it will take three to five minutes of your time, and you will have actively done something for economically disadvantaged countries.
134
-
135
- If you want to do more, in addition to accepting the invitation, you can donate to my Paypal account (saverio.pub2 \<a-hat!\> gmail.com) - I will publish your donation and use the entire amount for making loans using the mentioned website.
136
-
137
153
  Roadmap/Todo
138
154
  ------------
139
155
 
data/Rakefile ADDED
@@ -0,0 +1,5 @@
1
+ require 'rspec/core/rake_task'
2
+
3
+ RSpec::Core::RakeTask.new(:spec)
4
+
5
+ task default: :spec
data/docs/STRUCTURE.md ADDED
@@ -0,0 +1,3 @@
1
+ - the base directory contains the abstract concept classes (document, table, etc.)
2
+ - `codecs` contains the codec classes, for encoding/decoding
3
+ - `helpers` contains functionality of generic nature
data/docs/TESTING.md ADDED
@@ -0,0 +1,11 @@
1
+ Files
2
+ =====
3
+
4
+ - the `elements` suites test the abstract concept classes (document, table, etc.)
5
+ - the `codecs` suites have functional testing (encode -> decode cycle) of each codec, plus UTs for codec-specific functionalities
6
+ - `spec_helpers.rb` contains a few constants and methods useful for testing
7
+
8
+ Methodology
9
+ ===========
10
+
11
+ The general workflow is to write specific UTs, then extend the functional test(s), then test on files generated with Libreoffice (from Libreoffice-built to spreadbase, from spreadbase-built to Libreoffice, and from Libreoffice-built to spreadbase to Libreoffice), using the `utils` scripts.
data/lib/spreadbase.rb CHANGED
@@ -1,33 +1,12 @@
1
- # encoding: UTF-8
1
+ require_relative 'spreadbase/helpers/helpers'
2
2
 
3
- =begin
4
- Copyright 2012 Saverio Miroddi saverio.pub2 <a-hat!> gmail.com
3
+ require_relative 'spreadbase/document'
4
+ require_relative 'spreadbase/table'
5
+ require_relative 'spreadbase/cell'
5
6
 
6
- This file is part of SpreadBase.
7
-
8
- SpreadBase is free software: you can redistribute it and/or modify it under the
9
- terms of the GNU Lesser General Public License as published by the Free Software
10
- Foundation, either version 3 of the License, or (at your option) any later
11
- version.
12
-
13
- SpreadBase is distributed in the hope that it will be useful, but WITHOUT ANY
14
- WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
15
- PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
16
-
17
- You should have received a copy of the GNU Lesser General Public License along
18
- with SpreadBase. If not, see <http://www.gnu.org/licenses/>.
19
- =end
20
-
21
- require 'rubygems'
22
-
23
- require File.expand_path( '../spreadbase/helpers/helpers', __FILE__ )
24
-
25
- require File.expand_path( '../spreadbase/document', __FILE__ )
26
- require File.expand_path( '../spreadbase/table', __FILE__ )
27
-
28
- require File.expand_path( '../spreadbase/codecs/open_document_12_modules/encoding', __FILE__ )
29
- require File.expand_path( '../spreadbase/codecs/open_document_12_modules/decoding', __FILE__ )
30
- require File.expand_path( '../spreadbase/codecs/open_document_12', __FILE__ )
7
+ require_relative 'spreadbase/codecs/open_document_12_modules/encoding'
8
+ require_relative 'spreadbase/codecs/open_document_12_modules/decoding'
9
+ require_relative 'spreadbase/codecs/open_document_12'
31
10
 
32
11
  # = Spreadbase
33
12
  #
@@ -0,0 +1,19 @@
1
+ module SpreadBase # :nodoc:
2
+
3
+ # Represents the abstraction of a cell; values and their types are merged into a single entity.
4
+ #
5
+ class Cell
6
+
7
+ attr_accessor :value
8
+
9
+ def initialize(value)
10
+ @value = value
11
+ end
12
+
13
+ def ==(other)
14
+ other.is_a?(Cell) && @value == other.value
15
+ end
16
+
17
+ end
18
+
19
+ end
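A short sketch of how the `Cell#==` definition in this hunk behaves. The class body is copied from the diff so the snippet runs on its own; the sample values are illustrative only.

```ruby
module SpreadBase
  # Same definition as in the diff, repeated so the example is self-contained.
  class Cell
    attr_accessor :value

    def initialize(value)
      @value = value
    end

    def ==(other)
      other.is_a?(Cell) && @value == other.value
    end
  end
end

a = SpreadBase::Cell.new(38.911)
b = SpreadBase::Cell.new(38.911)

puts(a == b)       # true: both are cells and the wrapped values match
puts(a == 38.911)  # false: the right-hand side is not a Cell
```

Because of the `is_a?(Cell)` guard, a cell never compares equal to a bare value, so callers need to compare `cell.value` when they want the raw content.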
@@ -1,24 +1,4 @@
1
- # encoding: UTF-8
2
-
3
- =begin
4
- Copyright 2012 Saverio Miroddi saverio.pub2 <a-hat!> gmail.com
5
-
6
- This file is part of SpreadBase.
7
-
8
- SpreadBase is free software: you can redistribute it and/or modify it under the
9
- terms of the GNU Lesser General Public License as published by the Free Software
10
- Foundation, either version 3 of the License, or (at your option) any later
11
- version.
12
-
13
- SpreadBase is distributed in the hope that it will be useful, but WITHOUT ANY
14
- WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
15
- PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
16
-
17
- You should have received a copy of the GNU Lesser General Public License along
18
- with SpreadBase. If not, see <http://www.gnu.org/licenses/>.
19
- =end
20
-
21
- require 'zipruby'
1
+ require 'zip'
22
2
  require 'rexml/document'
23
3
 
24
4
  module SpreadBase # :nodoc:
@@ -49,20 +29,17 @@ module SpreadBase # :nodoc:
49
29
  #
50
30
  # _options_:
51
31
  #
52
- # +force_18_strings_encoding+:: ('UTF-8') on ruby 1.8, when converting to UTF-8, assume the strings are using the specified format.
53
32
  # +prettify+:: (false) prettifies the content.xml to be human readable.
54
33
  #
55
34
  # _returns_ the archive as binary string.
56
35
  #
57
- def encode_to_archive( el_document, options={} )
58
- document_buffer = encode_to_content_xml( el_document, options )
36
+ def encode_to_archive(el_document, options={})
37
+ document_buffer = encode_to_content_xml(el_document, options)
59
38
  zip_buffer = ''
60
39
 
61
- Zip::Archive.open_buffer( zip_buffer, Zip::CREATE ) do | zip_file |
62
- zip_file.add_dir( 'META-INF' )
63
-
64
- zip_file.add_buffer( 'META-INF/manifest.xml', MANIFEST_XML );
65
- zip_file.add_buffer( 'content.xml', document_buffer );
40
+ Zip::File.open_buffer(zip_buffer) do | zip_file |
41
+ zip_file.get_output_stream('META-INF/manifest.xml') { |f| f << MANIFEST_XML }
42
+ zip_file.get_output_stream('content.xml') { |f| f << document_buffer }
66
43
  end
67
44
 
68
45
  zip_buffer
@@ -81,12 +58,11 @@ module SpreadBase # :nodoc:
81
58
  #
82
59
  # _returns_ the SpreadBase::Document instance.
83
60
  #
84
- def decode_archive( zip_buffer, options={} )
85
- content_xml_data = Zip::Archive.open_buffer( zip_buffer ) do | zip_file |
86
- zip_file.fopen( 'content.xml' ) { | file | file.read }
87
- end
61
+ def decode_archive(zip_buffer, options={})
62
+ io = StringIO.new(zip_buffer)
63
+ content_xml_data = Zip::File.new(io, false, true).read('content.xml')
88
64
 
89
- decode_content_xml( content_xml_data, options )
65
+ decode_content_xml(content_xml_data, options)
90
66
  end
91
67
 
92
68
  # Utility method; encodes the Document to the content.xml format.
@@ -97,18 +73,17 @@ module SpreadBase # :nodoc:
97
73
  #
98
74
  # _options_:
99
75
  #
100
- # +force_18_strings_encoding+:: ('UTF-8') on ruby 1.8, when converting to UTF-8, assume the strings are using the specified format.
101
76
  # +prettify+:: (false) prettifies the content.xml to be human readable.
102
77
  #
103
78
  # _returns_ content.xml as string.
104
79
  #--
105
80
  # "utility" is a fancy name for testing/utils helper.
106
81
  #
107
- def encode_to_content_xml( el_document, options={} )
108
- prettify = options[ :prettify ]
82
+ def encode_to_content_xml(el_document, options={})
83
+ prettify = options[:prettify]
109
84
 
110
- document_xml_root = encode_to_document_node( el_document, options )
111
- document_buffer = prettify ? pretty_xml( document_xml_root ) : document_xml_root.to_s
85
+ document_xml_root = encode_to_document_node(el_document)
86
+ document_buffer = prettify ? pretty_xml(document_xml_root) : document_xml_root.to_s
112
87
 
113
88
  document_buffer
114
89
  end
@@ -123,20 +98,20 @@ module SpreadBase # :nodoc:
123
98
  #--
124
99
  # "utility" is a fancy name for testing/utils helper.
125
100
  #
126
- def decode_content_xml( content_xml_data, options={} )
127
- root_node = REXML::Document.new( content_xml_data )
101
+ def decode_content_xml(content_xml_data, options={})
102
+ root_node = REXML::Document.new(content_xml_data)
128
103
 
129
- decode_document_node( root_node, options )
104
+ decode_document_node(root_node, options)
130
105
  end
131
106
 
132
107
  private
133
108
 
134
- def pretty_xml( document )
109
+ def pretty_xml(document)
135
110
  buffer = ""
136
111
 
137
112
  xml_formatter = REXML::Formatters::Pretty.new
138
113
  xml_formatter.compact = true
139
- xml_formatter.write( document, buffer )
114
+ xml_formatter.write(document, buffer)
140
115
 
141
116
  buffer
142
117
  end
@@ -1,23 +1,3 @@
1
- # encoding: UTF-8
2
-
3
- =begin
4
- Copyright 2012 Saverio Miroddi saverio.pub2 <a-hat!> gmail.com
5
-
6
- This file is part of SpreadBase.
7
-
8
- SpreadBase is free software: you can redistribute it and/or modify it under the
9
- terms of the GNU Lesser General Public License as published by the Free Software
10
- Foundation, either version 3 of the License, or (at your option) any later
11
- version.
12
-
13
- SpreadBase is distributed in the hope that it will be useful, but WITHOUT ANY
14
- WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
15
- PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
16
-
17
- You should have received a copy of the GNU Lesser General Public License along
18
- with SpreadBase. If not, see <http://www.gnu.org/licenses/>.
19
- =end
20
-
21
1
  require 'date'
22
2
  require 'bigdecimal'
23
3
 
@@ -37,32 +17,32 @@ module SpreadBase # :nodoc:
37
17
 
38
18
  # Returns a Document instance.
39
19
  #
40
- def decode_document_node( root_node, options={} )
20
+ def decode_document_node(root_node, options={})
41
21
  document = Document.new
42
22
 
43
- style_nodes = root_node.elements.to_a( '//office:document-content/office:automatic-styles/style:style' )
44
- table_nodes = root_node.elements.to_a( '//office:document-content/office:body/office:spreadsheet/table:table' )
23
+ style_nodes = root_node.elements.to_a('//office:document-content/office:automatic-styles/style:style')
24
+ table_nodes = root_node.elements.to_a('//office:document-content/office:body/office:spreadsheet/table:table')
45
25
 
46
- document.column_width_styles = decode_column_width_styles( style_nodes )
26
+ document.column_width_styles = decode_column_width_styles(style_nodes)
47
27
 
48
- document.tables = table_nodes.map { | node | decode_table_node( node, options ) }
28
+ document.tables = table_nodes.map { | node | decode_table_node(node, options) }
49
29
 
50
30
  document
51
31
  end
52
32
 
53
33
  # Currently it has only the purpose of decoding the column widths (for this reason it has a different naming convention).
54
34
  #
55
- def decode_column_width_styles( style_nodes )
56
- style_nodes.inject( {} ) do | column_width_styles, style_node |
57
- column_node = style_node.elements[ 'style:table-column-properties' ]
35
+ def decode_column_width_styles(style_nodes)
36
+ style_nodes.inject({}) do | column_width_styles, style_node |
37
+ column_node = style_node.elements['style:table-column-properties']
58
38
 
59
39
  if column_node
60
- column_width = column_node.attributes[ 'style:column-width' ]
40
+ column_width = column_node.attributes['style:column-width']
61
41
 
62
42
  if column_width
63
- style_name = style_node.attributes[ 'style:name' ]
43
+ style_name = style_node.attributes['style:name']
64
44
 
65
- column_width_styles[ style_name] = column_width
45
+ column_width_styles[style_name] = column_width
66
46
  end
67
47
  end
68
48
 
@@ -70,98 +50,123 @@ module SpreadBase # :nodoc:
70
50
  end
71
51
  end
72
52
 
73
- def decode_table_node( table_node, options )
74
- table = Table.new( table_node.attributes[ 'table:name' ] )
53
+ def decode_table_node(table_node, options)
54
+ table = Table.new(table_node.attributes['table:name'])
75
55
 
76
- column_nodes = table_node.elements.to_a( 'table:table-column' )
77
- row_nodes = table_node.elements.to_a( 'table:table-row' )
56
+ column_nodes = table_node.elements.to_a('table:table-column')
57
+ row_nodes = table_node.elements.to_a('table:table-row')
78
58
 
79
59
  # A single column/row can represent multiple columns (table:number-(columns|rows)-repeated)
80
60
  #
81
- table.column_width_styles = column_nodes.inject( [] ) { | current_styles, node | current_styles + decode_column_width_style( node ) }
82
- table.data = row_nodes.inject( [] ) { | current_rows, node | current_rows + decode_row_node( node, options ) }
61
+ table.column_width_styles = column_nodes.inject([]) { | current_styles, node | current_styles + decode_column_width_style(node) }
62
+ table.data = decode_row_nodes(row_nodes, options)
83
63
 
84
64
  table
85
65
  end
86
66
 
87
- def decode_column_width_style( column_node )
88
- repetitions = ( column_node.attributes[ 'table:number-columns-repeated' ] || '1' ).to_i
89
- style_name = column_node.attributes[ 'table:style-name' ]
67
+ def decode_column_width_style(column_node)
68
+ repetitions = (column_node.attributes['table:number-columns-repeated'] || '1').to_i
69
+ style_name = column_node.attributes['table:style-name']
90
70
 
91
71
  # WATCH OUT! See module note
92
72
  #
93
- make_array_from_repetitions( style_name, repetitions )
73
+ make_array_from_repetitions(style_name, repetitions)
94
74
  end
95
75
 
96
- def decode_row_node( row_node, options )
97
- repetitions = ( row_node.attributes[ 'table:number-rows-repeated' ] || '1' ).to_i
98
- cell_nodes = row_node.elements.to_a( 'table:table-cell' )
76
+ def decode_row_nodes(row_nodes, options)
77
+ rows = []
78
+ row_nodes.inject(0) do |size, node|
79
+ row, repetitions = decode_row_node(node, options)
80
+ row.empty? || append_row(rows, size, row, repetitions)
81
+ size + repetitions
82
+ end
83
+ rows
84
+ end
99
85
 
100
- # Watch out the :flatten; a single cell can represent multiple cells (table:number-columns-repeated)
101
- #
102
- values = cell_nodes.map { | node | decode_cell_node( node, options ) }.flatten
86
+ def decode_row_node(row_node, options)
87
+ repetitions = (row_node.attributes['table:number-rows-repeated'] || '1').to_i
88
+ cell_nodes = row_node.elements.to_a('table:table-cell')
103
89
 
104
- make_array_from_repetitions( values, repetitions )
90
+ [decode_cell_nodes(cell_nodes, options), repetitions]
105
91
  end
106
92
 
107
- def decode_cell_node( cell_node, options )
108
- floats_as_bigdecimal = options[ :floats_as_bigdecimal ]
93
+ def append_row(rows, size, row, repetitions)
94
+ (size - rows.size).times { rows << [] }
95
+ rows.concat(make_array_from_repetitions(row, repetitions))
96
+ end
109
97
 
110
- value_type = cell_node.attributes[ 'office:value-type' ]
98
+ def decode_cell_nodes(cell_nodes, options)
99
+ cells = []
100
+ cell_nodes.inject(0) do |size, node|
101
+ cell, repetitions = decode_cell_node(node, options)
102
+ cell.nil? || append_cell(cells, size, cell, repetitions)
103
+ size + repetitions
104
+ end
105
+ cells
106
+ end
111
107
 
112
- value = \
113
- case value_type
114
- when 'string'
115
- value_node = cell_node.elements[ 'text:p' ]
108
+ def decode_cell_node(cell_node, options)
109
+ [
110
+ decode_cell_value(cell_node, options),
111
+ (cell_node.attributes['table:number-columns-repeated'] || '1').to_i
112
+ ]
113
+ end
116
114
 
117
- value_node.text
118
- when 'date'
119
- date_string = cell_node.attributes[ 'office:date-value' ]
115
+ def append_cell(cells, size, cell, repetitions)
116
+ cells[size - 1] = nil if size != cells.size
117
+ cells.concat(make_array_from_repetitions(cell, repetitions))
118
+ end
120
119
 
121
- if date_string =~ /T/
122
- DateTime.strptime( date_string, '%Y-%m-%dT%H:%M:%S' )
123
- else
124
- Date.strptime( date_string, '%Y-%m-%d' )
125
- end
126
- when 'float', 'percentage'
127
- float_string = cell_node.attributes[ 'office:value' ]
128
-
129
- if float_string.include?( '.' )
130
- if floats_as_bigdecimal
131
- BigDecimal.new( float_string )
132
- else
133
- float_string.to_f
134
- end
135
- else
136
- float_string.to_i
137
- end
138
- when 'boolean'
139
- boolean_string = cell_node.attributes[ 'office:boolean-value' ]
140
-
141
- case boolean_string
142
- when 'true'
143
- true
144
- when 'false'
145
- false
120
+ def decode_cell_value(cell_node, options)
121
+ floats_as_bigdecimal = options[:floats_as_bigdecimal]
122
+
123
+ value_type = cell_node.attributes['office:value-type']
124
+
125
+ case value_type
126
+ when 'string'
127
+ value_node = cell_node.elements['text:p']
128
+
129
+ value_node.text
130
+ when 'date'
131
+ date_string = cell_node.attributes['office:date-value']
132
+
133
+ if date_string =~ /T/
134
+ DateTime.strptime(date_string, '%Y-%m-%dT%H:%M:%S')
135
+ else
136
+ Date.strptime(date_string, '%Y-%m-%d')
137
+ end
138
+ when 'float', 'percentage'
139
+ float_string = cell_node.attributes['office:value']
140
+
141
+ if float_string.include?('.')
142
+ if floats_as_bigdecimal
143
+ BigDecimal(float_string)
146
144
  else
147
- raise "Invalid boolean value: #{ boolean_string }"
145
+ float_string.to_f
148
146
  end
149
- when nil
150
- nil
151
147
  else
152
- raise "Unrecognized value type found in a cell: #{ value_type }"
148
+ float_string.to_i
153
149
  end
154
-
155
- repetitions = ( cell_node.attributes[ 'table:number-columns-repeated' ] || '1' ).to_i
156
-
157
- make_array_from_repetitions( value, repetitions )
150
+ when 'boolean'
151
+ boolean_string = cell_node.attributes['office:boolean-value']
152
+
153
+ case boolean_string
154
+ when 'true'
155
+ true
156
+ when 'false'
157
+ false
158
+ else
159
+ raise "Invalid boolean value: #{ boolean_string }"
160
+ end
161
+ when nil
162
+ nil
163
+ else
164
+ raise "Unrecognized value type found in a cell: #{ value_type }"
165
+ end
158
166
  end
159
167
 
160
168
  end
161
169
 
162
- private
163
-
164
-
165
170
  end
166
171
 
167
172
  end
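The new `decode_cell_nodes`/`append_cell` pair skips empty cells and pads the gap only when a later non-empty cell appears, so trailing empty cells never reach the result. The sketch below replays that logic on plain `[value, repetitions]` pairs instead of XML nodes. `make_array_from_repetitions` lives in the helpers file, which this diff does not show, so the one-line version here is an assumption; `expand_cells` and its input format are hypothetical names used for illustration.

```ruby
# Assumed stand-in for the helper of the same name (its real body is not in the diff).
def make_array_from_repetitions(value, repetitions)
  Array.new(repetitions, value)
end

# Same shape as append_cell in the diff: pad skipped positions with nil, then append.
def append_cell(cells, size, cell, repetitions)
  cells[size - 1] = nil if size != cells.size
  cells.concat(make_array_from_repetitions(cell, repetitions))
end

# Hypothetical driver: takes already-decoded [value, repetitions] pairs.
def expand_cells(pairs)
  cells = []
  pairs.inject(0) do |size, (cell, repetitions)|
    # Empty cells only advance the running position; nothing is stored for them.
    cell.nil? || append_cell(cells, size, cell, repetitions)
    size + repetitions
  end
  cells
end

p expand_cells([['a', 1], [nil, 2], ['b', 2], [nil, 1000]])
# prints ["a", nil, nil, "b", "b"]
```

The trailing `[nil, 1000]` pair contributes nothing to the output. That is presumably the point of the change: a spreadsheet row padded with a large repeated empty cell no longer expands into a thousand `nil` entries.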