creek 1.1.2 → 2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +5 -13
- data/LICENSE.txt +1 -1
- data/README.md +117 -0
- data/creek.gemspec +4 -3
- data/lib/creek.rb +3 -1
- data/lib/creek/book.rb +19 -2
- data/lib/creek/drawing.rb +109 -0
- data/lib/creek/sheet.rb +54 -2
- data/lib/creek/styles/converter.rb +1 -3
- data/lib/creek/utils.rb +16 -0
- data/lib/creek/version.rb +1 -1
- data/spec/drawing_spec.rb +52 -0
- data/spec/fixtures/sample-with-images.xlsx +0 -0
- data/spec/fixtures/sample.xlsx +0 -0
- data/spec/shared_string_spec.rb +6 -6
- data/spec/sheet_spec.rb +85 -0
- data/spec/styles/converter_spec.rb +1 -1
- data/spec/styles/style_types_spec.rb +3 -3
- data/spec/test_spec.rb +38 -23
- metadata +46 -24
- data/README.rdoc +0 -76
checksums.yaml
CHANGED
@@ -1,15 +1,7 @@
 ---
-
-  metadata.gz:
-
-  data.tar.gz: !binary |-
-    NWE3YmRhNGI5NTkwMDgzNDFiMDJkMmYzYzI1NjNiYjY2MDE0NmQ0Yw==
+SHA1:
+  metadata.gz: 6b7d68233b036517f99988f30405a4508c3b4892
+  data.tar.gz: a34206d988e501a0324598bcb2de856957de1ec5
 SHA512:
-  metadata.gz:
-
-    NjI5MTkxMTYyNTRhMDhkODkxOGExY2E1MTVlMmZkODEwMTM4M2NlYjgyODI2
-    MDcxMjFmMzNmZDA2ZWU3MWM3OWJmNWRhMzAyYzgxM2E0NzdkODk=
-  data.tar.gz: !binary |-
-    ZmYxNDFiYzU0MDM1Y2FlMjgxZmI4OGI2OTRiYTQzMTI2OGRhMGYxNjlhNTU0
-    MzFjNTY5MjA5M2ZjY2MyODA3MDljYzYwZTNhNmZjZTAzMGQ4ZDFmMTNjNzU1
-    MTU5Y2I1M2Q3ZGU3YWE1Y2VjZDVhMDY3NTVlYzc1NmEyNmZlODU=
+  metadata.gz: 3b75245668526340173ae7be11e0f5f81a92c5f0da6383e2961fd877fef49a7ec8809e7e97202b11cc978db8e2260fc8c3f43c7ad41da285e3d922dc705e4f79
+  data.tar.gz: 639e94e90af3d78d1af3ef6a07e08f79c79f21accf9d1f197976022112ee03c9d12284f0a78d674a946f33bf2d2091ed6c614a4d2c49c53329faed9ba5a4b514
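Note on the checksum format change: the removed values were hex digests wrapped in Base64 under a `!binary` tag, while the added values are plain hex strings. The sketch below, using only core Ruby and the standard `digest` library, decodes the one complete old value shown in this diff and shows the expected lengths of the new-style digests. The `payload` string is a made-up stand-in, not a real gem archive.

```ruby
require 'digest'

# Old-style value, copied from the removed line of the checksums.yaml hunk.
old_value = 'NWE3YmRhNGI5NTkwMDgzNDFiMDJkMmYzYzI1NjNiYjY2MDE0NmQ0Yw=='

# 'm' is the Base64 directive of String#unpack1 (core Ruby, no extra gem).
old_hex = old_value.unpack1('m')
puts old_hex # => 5a7bda4b959008341b02d2f3c2563bb660146d4c

# New-style values are stored directly as hex strings.
payload = 'hypothetical archive bytes' # assumption: illustrative input only
sha1    = Digest::SHA1.hexdigest(payload)
sha512  = Digest::SHA512.hexdigest(payload)

puts sha1.length   # => 40
puts sha512.length # => 128
```

The decoded old value has the same 40-character shape as the new `SHA1` entries, which suggests only the serialization changed, not the hashing scheme.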
data/LICENSE.txt
CHANGED
data/README.md
ADDED
@@ -0,0 +1,117 @@
+# Creek - Stream parser for large Excel (xlsx and xlsm) files.
+
+Creek is a Ruby gem that provides a fast, simple and efficient method of parsing large Excel (xlsx and xlsm) files.
+
+
+## Installation
+
+Creek can be used from the command line or as part of a Ruby web framework. To install the gem from the terminal, run the following command:
+
+```
+gem install creek
+```
+
+To use it in Rails, add this line to your Gemfile:
+
+```ruby
+gem 'creek'
+```
+
+## Basic Usage
+Creek can parse an Excel file by looping through the rows enumerator:
+
+```ruby
+require 'creek'
+creek = Creek::Book.new 'spec/fixtures/sample.xlsx'
+sheet = creek.sheets[0]
+
+sheet.rows.each do |row|
+  puts row # => {"A1"=>"Content 1", "B1"=>nil, "C1"=>nil, "D1"=>"Content 3"}
+end
+
+sheet.rows_with_meta_data.each do |row|
+  puts row # => {"collapsed"=>"false", "customFormat"=>"false", "customHeight"=>"true", "hidden"=>"false", "ht"=>"12.1", "outlineLevel"=>"0", "r"=>"1", "cells"=>{"A1"=>"Content 1", "B1"=>nil, "C1"=>nil, "D1"=>"Content 3"}}
+end
+
+sheet.state # => 'visible'
+sheet.name # => 'Sheet1'
+sheet.rid # => 'rId2'
+```
+
+## Filename considerations
+By default, Creek will ensure that the file extension is either *.xlsx or *.xlsm, but this check can be circumvented as needed:
+
+```ruby
+path = 'sample-as-zip.zip'
+Creek::Book.new path, :check_file_extension => false
+```
+
+By default, the Rails [file_field_tag](http://api.rubyonrails.org/classes/ActionView/Helpers/FormTagHelper.html#method-i-file_field_tag) uploads to a temporary location and stores the original filename with the StringIO object. (See [this section](http://guides.rubyonrails.org/form_helpers.html#uploading-files) of the Rails Guides for more information.)
+
+Creek can parse this directly without the need for file upload gems such as Carrierwave or Paperclip by passing the original filename as an option:
+
+```ruby
+# Import endpoint in Rails controller
+def import
+  file = params[:file]
+  Creek::Book.new file.path, check_file_extension: false
+end
+```
+
+## Parsing images
+Creek does not parse images by default. If you want to parse the images,
+call the `with_images` method before iterating over rows to preload image information. If you don't call this method, Creek will not return images anywhere.
+
+Cells with images will be an array of Pathname objects.
+If an image is spread across multiple cells, the same Pathname object will be returned for each cell.
+
+```ruby
+sheet.with_images.rows.each do |row|
+  puts row # => {"A1"=>[#<Pathname:/var/folders/ck/l64nmm3d4k75pvxr03ndk1tm0000gn/T/creek__drawing20161101-53599-274q0vimage1.jpeg>], "B2"=>"Fluffy"}
+end
+```
+
+Images for a specific cell can be obtained with the `images_at` method:
+
+```ruby
+puts sheet.images_at('A1') # => [#<Pathname:/var/folders/ck/l64nmm3d4k75pvxr03ndk1tm0000gn/T/creek__drawing20161101-53599-274q0vimage1.jpeg>]
+
+# no images in a cell
+puts sheet.images_at('C1') # => nil
+```
+
+Creek will most likely return nil for a cell with images if there is no other text cell in that row - you can use the *images_at* method to retrieve the images in that cell.
+
+## Remote files
+
+```ruby
+remote_url = 'http://dev-builds.libreoffice.org/tmp/test.xlsx'
+Creek::Book.new remote_url, remote: true
+```
+
+## Contributing
+
+Contributions are welcome. You can fork the repository, add your code changes to the forked branch, ensure all existing unit tests pass, create new unit tests which cover your changes and finally create a pull request.
+
+After forking and then cloning the repository locally, install Bundler and then use it
+to install the development gem dependencies:
+
+```
+gem install bundler
+bundle install
+```
+
+Once this is complete, you should be able to run the test suite:
+
+```
+rake
+```
+
+## Bug Reporting
+
+Please use the [Issues](https://github.com/pythonicrubyist/creek/issues) page to report bugs or suggest new enhancements.
+
+
+## License
+
+Creek has been published under [MIT License](https://github.com/pythonicrubyist/creek/blob/master/LICENSE.txt)
data/creek.gemspec
CHANGED
@@ -18,13 +18,14 @@ Gem::Specification.new do |spec|
   spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
   spec.require_paths = ["lib"]
 
-  spec.required_ruby_version = '>=
+  spec.required_ruby_version = '>= 2.0.0'
 
   spec.add_development_dependency "bundler", "~> 1.3"
   spec.add_development_dependency "rake"
-  spec.add_development_dependency 'rspec', '~>
+  spec.add_development_dependency 'rspec', '~> 3.6.0'
   spec.add_development_dependency 'pry'
 
-  spec.add_dependency 'nokogiri', '~> 1.
+  spec.add_dependency 'nokogiri', '~> 1.7.0'
   spec.add_dependency 'rubyzip', '>= 1.0.0'
+  spec.add_dependency 'httparty', '~> 0.15.5'
 end
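Note on the version constraints: the added lines use the pessimistic operator (`~>`) with three-segment versions, which allows patch releases but not the next minor release. A small sketch with RubyGems' own `Gem::Requirement` and `Gem::Version` classes shows the effect; the helper name `allowed?` is made up for illustration.

```ruby
require 'rubygems'

# Hypothetical helper: does `version` satisfy the gemspec-style `constraint`?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts allowed?('~> 1.7.0', '1.7.2')   # => true  (patch release of 1.7)
puts allowed?('~> 1.7.0', '1.8.0')   # => false (next minor is excluded)
puts allowed?('~> 0.15.5', '0.15.4') # => false (below the lower bound)
puts allowed?('>= 1.0.0', '2.3.0')   # => true  (open-ended lower bound)
```

In other words, `'~> 1.7.0'` behaves like `'>= 1.7.0', '< 1.8'`, so consumers on a newer nokogiri minor series would need a later creek release or a relaxed constraint.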
data/lib/creek.rb
CHANGED
@@ -1,9 +1,11 @@
-require
+require 'creek/version'
 require 'creek/book'
 require 'creek/styles/constants'
 require 'creek/styles/style_types'
 require 'creek/styles/converter'
+require 'creek/utils'
 require 'creek/styles'
+require 'creek/drawing'
 require 'creek/sheet'
 require 'creek/shared_strings'
 
data/lib/creek/book.rb
CHANGED
@@ -1,6 +1,7 @@
 require 'zip/filesystem'
 require 'nokogiri'
 require 'date'
+require 'httparty'
 
 module Creek
 
@@ -19,16 +20,32 @@ module Creek
         extension = File.extname(options[:original_filename] || path).downcase
         raise 'Not a valid file format.' unless (['.xlsx', '.xlsm'].include? extension)
       end
-
+      if options[:remote]
+        zipfile = Tempfile.new("file")
+        zipfile.binmode
+        zipfile.write(HTTParty.get(path).body)
+        zipfile.close
+        path = zipfile.path
+      end
+      @files = Zip::File.open(path)
       @shared_strings = SharedStrings.new(self)
     end
 
     def sheets
       doc = @files.file.open "xl/workbook.xml"
       xml = Nokogiri::XML::Document.parse doc
+      namespaces = xml.namespaces
+
+      cssPrefix = ''
+      namespaces.each do |namespace|
+        if namespace[1] == 'http://schemas.openxmlformats.org/spreadsheetml/2006/main' && namespace[0] != 'xmlns' then
+          cssPrefix = namespace[0].split(':')[1]+'|'
+        end
+      end
+
       rels_doc = @files.file.open "xl/_rels/workbook.xml.rels"
       rels = Nokogiri::XML::Document.parse(rels_doc).css("Relationship")
-      @sheets = xml.css('sheet').map do |sheet|
+      @sheets = xml.css(cssPrefix+'sheet').map do |sheet|
         sheetfile = rels.find { |el| sheet.attr("r:id") == el.attr("Id") }.attr("Target")
         Sheet.new(self, sheet.attr("name"), sheet.attr("sheetid"), sheet.attr("state"), sheet.attr("visible"), sheet.attr("r:id"), sheetfile)
       end
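Note on the namespace handling in `sheets`: the added loop picks a CSS prefix when the SpreadsheetML namespace is declared with an explicit prefix instead of as the default namespace. The sketch below reproduces that selection logic in plain Ruby. It assumes `namespaces` is a Hash of declaration name to URI, as Nokogiri's `Document#namespaces` returns; the method name `css_prefix_for` and the sample hashes are made up for illustration.

```ruby
MAIN_NS = 'http://schemas.openxmlformats.org/spreadsheetml/2006/main'
RELS_NS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships'

# Hypothetical stand-alone version of the prefix loop from the hunk above.
def css_prefix_for(namespaces)
  prefix = ''
  namespaces.each do |name, uri|
    # A default declaration ("xmlns") needs no prefix in the CSS selector.
    next unless uri == MAIN_NS && name != 'xmlns'
    prefix = name.split(':')[1] + '|' # "xmlns:x" -> "x|"
  end
  prefix
end

default_ns  = { 'xmlns' => MAIN_NS, 'xmlns:r' => RELS_NS }
prefixed_ns = { 'xmlns:x' => MAIN_NS, 'xmlns:r' => RELS_NS }

puts css_prefix_for(default_ns).inspect     # => ""
puts css_prefix_for(prefixed_ns) + 'sheet'  # => x|sheet
```

With a default namespace the selector stays `sheet`, which matches the pre-2.0 behaviour; only prefixed workbooks get the `x|sheet` form.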
data/lib/creek/drawing.rb
ADDED
@@ -0,0 +1,109 @@
+require 'pathname'
+
+module Creek
+  class Creek::Drawing
+    include Creek::Utils
+
+    COLUMNS = ('A'..'AZ').to_a
+
+    def initialize(book, drawing_filepath)
+      @book = book
+      @drawing_filepath = drawing_filepath
+      @drawings = []
+      @drawings_rels = []
+      @images_pathnames = Hash.new { |hash, key| hash[key] = [] }
+
+      if file_exist?(@drawing_filepath)
+        load_drawings_and_rels
+        load_images_pathnames_by_cells if has_images?
+      end
+    end
+
+    ##
+    # Returns false if there are no images in the drawing file or the drawing file does not exist, true otherwise.
+    def has_images?
+      @has_images ||= !@drawings.empty?
+    end
+
+    ##
+    # Extracts images from excel to tmpdir for a cell, if the images are not already extracted (multiple calls or same image file in multiple cells).
+    # Returns array of images as Pathname objects or nil.
+    def images_at(cell_name)
+      coordinate = calc_coordinate(cell_name)
+      pathnames_at_coordinate = @images_pathnames[coordinate]
+      return if pathnames_at_coordinate.empty?
+
+      pathnames_at_coordinate.map do |image_pathname|
+        if image_pathname.exist?
+          image_pathname
+        else
+          excel_image_path = "xl/media#{image_pathname.to_path.split(tmpdir).last}"
+          IO.copy_stream(@book.files.file.open(excel_image_path), image_pathname.to_path)
+          image_pathname
+        end
+      end
+    end
+
+    private
+
+    ##
+    # Transforms cell name to [row, col], e.g. A1 => [0, 0], B3 => [2, 1]
+    # Rows and cols start with 0.
+    def calc_coordinate(cell_name)
+      col = COLUMNS.index(cell_name.slice /[A-Z]+/)
+      row = (cell_name.slice /\d+/).to_i - 1 # rows in drawings start with 0
+      [row, col]
+    end
+
+    ##
+    # Creates/loads temporary directory for extracting images from excel
+    def tmpdir
+      @tmpdir ||= ::Dir.mktmpdir('creek__drawing')
+    end
+
+    ##
+    # Parses drawing and drawing's relationships xmls.
+    # Drawing xml contains relationships ID's and coordinates (row, col).
+    # Drawing relationships xml contains images' locations.
+    def load_drawings_and_rels
+      @drawings = parse_xml(@drawing_filepath).css('xdr|twoCellAnchor')
+      drawing_rels_filepath = expand_to_rels_path(@drawing_filepath)
+      @drawings_rels = parse_xml(drawing_rels_filepath).css('Relationships')
+    end
+
+    ##
+    # Iterates through the drawings and saves images' paths as Pathname objects to a hash with [row, col] keys.
+    # As multiple images can be located in a single cell, hash values are array of Pathname objects.
+    # One image can be spread across multiple cells (defined with from-row/to-row/from-col/to-col attributes) - same Pathname object is associated to each row-col combination for the range.
+    def load_images_pathnames_by_cells
+      image_selector = 'xdr:pic/xdr:blipFill/a:blip'.freeze
+      row_from_selector = 'xdr:from/xdr:row'.freeze
+      row_to_selector = 'xdr:to/xdr:row'.freeze
+      col_from_selector = 'xdr:from/xdr:col'.freeze
+      col_to_selector = 'xdr:to/xdr:col'.freeze
+
+      @drawings.xpath('//xdr:twoCellAnchor').each do |drawing|
+        embed = drawing.xpath(image_selector).first.attributes['embed']
+        next if embed.nil?
+
+        rid = embed.value
+        path = Pathname.new("#{tmpdir}/#{extract_drawing_path(rid).slice(/[^\/]*$/)}")
+
+        row_from = drawing.xpath(row_from_selector).text.to_i
+        col_from = drawing.xpath(col_from_selector).text.to_i
+        row_to = drawing.xpath(row_to_selector).text.to_i
+        col_to = drawing.xpath(col_to_selector).text.to_i
+
+        (col_from..col_to).each do |col|
+          (row_from..row_to).each do |row|
+            @images_pathnames[[row, col]].push(path)
+          end
+        end
+      end
+    end
+
+    def extract_drawing_path(rid)
+      @drawings_rels.css("Relationship[@Id=#{rid}]").first.attributes['Target'].value
+    end
+  end
+end
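Note on the coordinate scheme in `Creek::Drawing`: cell names are mapped to zero-based `[row, col]` pairs, and each image is registered under every cell its anchor covers. The sketch below reproduces both steps in plain Ruby without Nokogiri. The method name `register` and the sample anchor values are made up for illustration; `COLUMNS` and `calc_coordinate` follow the added file.

```ruby
COLUMNS = ('A'..'AZ').to_a # same 52-entry lookup table as in drawing.rb

# "B3" -> [2, 1]: zero-based [row, col].
def calc_coordinate(cell_name)
  col = COLUMNS.index(cell_name[/[A-Z]+/])
  row = cell_name[/\d+/].to_i - 1
  [row, col]
end

# Hypothetical helper: store one image path for every cell in the anchor range.
def register(images, path, row_from:, row_to:, col_from:, col_to:)
  (col_from..col_to).each do |col|
    (row_from..row_to).each do |row|
      images[[row, col]].push(path)
    end
  end
end

images = Hash.new { |hash, key| hash[key] = [] }
# Assumed anchor: an image spanning rows 4-5 of column A (zero-based rows 3-4).
register(images, 'image1.jpeg', row_from: 3, row_to: 4, col_from: 0, col_to: 0)

puts calc_coordinate('A1').inspect          # => [0, 0]
puts calc_coordinate('B3').inspect          # => [2, 1]
puts images[calc_coordinate('A4')].inspect  # => ["image1.jpeg"]
puts images[calc_coordinate('A5')].inspect  # => ["image1.jpeg"]
puts calc_coordinate('BA1').inspect         # => [0, nil]
```

The last line illustrates a consequence of the fixed lookup table: columns beyond `AZ` have no index, so a lookup for such a cell would find no images.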
data/lib/creek/sheet.rb
CHANGED
@@ -3,6 +3,7 @@ require 'nokogiri'
 
 module Creek
   class Creek::Sheet
+    include Creek::Utils
 
     attr_reader :book,
                 :name,
@@ -21,6 +22,28 @@ module Creek
       @rid = rid
       @state = state
       @sheetfile = sheetfile
+      @images_present = false
+    end
+
+    ##
+    # Preloads images info (coordinates and paths) from related drawing.xml and drawing rels.
+    # Must be called before #rows method if you want to have images included.
+    # Returns self so you can chain the calls (sheet.with_images.rows).
+    def with_images
+      @drawingfile = extract_drawing_filepath
+      if @drawingfile
+        @drawing = Creek::Drawing.new(@book, @drawingfile.sub('..', 'xl'))
+        @images_present = @drawing.has_images?
+      end
+      self
+    end
+
+    ##
+    # Extracts images for a cell to a temporary folder.
+    # Returns array of Pathnames for the cell.
+    # Returns nil if images are not found for the cell or images were not preloaded with #with_images.
+    def images_at(cell)
+      @drawing.images_at(cell) if @images_present
     end
 
     ##
@@ -43,7 +66,7 @@ module Creek
     # Returns a hash per row that includes the cell ids and values.
     # Empty cells will be also included in the hash with a nil value.
     def rows_generator include_meta_data=false
-      path = "xl/#{@sheetfile}"
+      path = if @sheetfile.start_with? "/xl/" or @sheetfile.start_with? "xl/" then @sheetfile else "xl/#{@sheetfile}" end
       if @book.files.file.exist?(path)
         # SAX parsing, Each element in the stream comes through as two events:
         # one to open the element and one to close it.
@@ -62,6 +85,14 @@ module Creek
                 y << (include_meta_data ? row : cells) if node.self_closing?
               elsif (node.name.eql? 'row') and (node.node_type.eql? closer)
                 processed_cells = fill_in_empty_cells(cells, row['r'], cell)
+
+                if @images_present
+                  processed_cells.each do |cell_name, cell_value|
+                    next unless cell_value.nil?
+                    processed_cells[cell_name] = images_at(cell_name)
+                  end
+                end
+
                 row['cells'] = processed_cells
                 y << (include_meta_data ? row : processed_cells)
               elsif (node.name.eql? 'c') and (node.node_type.eql? opener)
@@ -72,6 +103,10 @@ module Creek
                 unless cell.nil?
                   cells[cell] = convert(node.inner_xml, cell_type, cell_style_idx)
                 end
+              elsif (node.name.eql? 't') and (node.node_type.eql? opener)
+                unless cell.nil?
+                  cells[cell] = convert(node.inner_xml, cell_type, cell_style_idx)
+                end
               end
             end
           end
@@ -108,5 +143,22 @@ module Creek
 
       new_cells
     end
+
+    ##
+    # Find drawing filepath for the current sheet.
+    # Sheet xml contains drawing relationship ID.
+    # Sheet relationships xml contains drawing file's location.
+    def extract_drawing_filepath
+      # Read drawing relationship ID from the sheet.
+      sheet_filepath = "xl/#{@sheetfile}"
+      drawing = parse_xml(sheet_filepath).css('drawing').first
+      return if drawing.nil?
+
+      drawing_rid = drawing.attributes['id'].value
+
+      # Read sheet rels to find drawing file's location.
+      sheet_rels_filepath = expand_to_rels_path(sheet_filepath)
+      parse_xml(sheet_rels_filepath).css("Relationship[@Id='#{drawing_rid}']").first.attributes['Target'].value
+    end
   end
-end
+end
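Note on two of the `sheet.rb` changes: the sheet path is now used as-is when it already starts with `xl/` or `/xl/`, and nil cells are filled from the image lookup when images were preloaded. The sketch below restates both in plain Ruby. The method names `sheet_path` and `fill_images` are made up for illustration, and a plain Hash stands in for `images_at`.

```ruby
# Mirrors the new conditional in rows_generator.
def sheet_path(sheetfile)
  if sheetfile.start_with?('/xl/', 'xl/')
    sheetfile
  else
    "xl/#{sheetfile}"
  end
end

# Hypothetical helper: replace nil cells with the lookup result
# (cells without an image stay nil, non-nil cells are left untouched).
def fill_images(cells, lookup)
  cells.each_with_object({}) do |(name, value), out|
    out[name] = value.nil? ? lookup[name] : value
  end
end

puts sheet_path('worksheets/sheet1.xml')     # => xl/worksheets/sheet1.xml
puts sheet_path('xl/worksheets/sheet1.xml')  # => xl/worksheets/sheet1.xml
puts sheet_path('/xl/worksheets/sheet1.xml') # => /xl/worksheets/sheet1.xml

cells  = { 'A2' => nil, 'B2' => 'Fluffy', 'C2' => nil }
lookup = { 'A2' => ['image1.jpeg'] } # assumption: stand-in for images_at
filled = fill_images(cells, lookup)
puts filled.inspect
```

Because only nil cells are consulted, a cell that holds both text and an image keeps its text in `rows`; the README in this diff points to `images_at` for such cases.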
data/lib/creek/utils.rb
ADDED
@@ -0,0 +1,16 @@
+module Creek
+  module Utils
+    def expand_to_rels_path(filepath)
+      filepath.sub(/(\/[^\/]+$)/, '/_rels\1.rels')
+    end
+
+    def file_exist?(path)
+      @book.files.file.exist?(path)
+    end
+
+    def parse_xml(xml_path)
+      doc = @book.files.file.open(xml_path)
+      Nokogiri::XML::Document.parse(doc)
+    end
+  end
+end
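Note on `expand_to_rels_path`: the substitution inserts a `_rels` directory before the last path segment and appends `.rels`, which is how a part's relationships file is located inside the xlsx archive. The method body below is copied from the added file; the sample paths are illustrative.

```ruby
def expand_to_rels_path(filepath)
  # \1 is the last "/segment"; it is re-emitted between "/_rels" and ".rels".
  filepath.sub(/(\/[^\/]+$)/, '/_rels\1.rels')
end

puts expand_to_rels_path('xl/worksheets/sheet1.xml')
# => xl/worksheets/_rels/sheet1.xml.rels
puts expand_to_rels_path('xl/drawings/drawing1.xml')
# => xl/drawings/_rels/drawing1.xml.rels
puts expand_to_rels_path('workbook.xml')
# => workbook.xml (no slash, so the pattern does not match)
```

The last case shows that a bare filename is returned unchanged, so callers are expected to pass archive paths that contain at least one directory.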
data/lib/creek/version.rb
CHANGED
data/spec/drawing_spec.rb
ADDED
@@ -0,0 +1,52 @@
+require './spec/spec_helper'
+
+describe 'drawing' do
+  let(:book) { Creek::Book.new('spec/fixtures/sample-with-images.xlsx') }
+  let(:book_no_images) { Creek::Book.new('spec/fixtures/sample.xlsx') }
+  let(:drawingfile) { 'xl/drawings/drawing1.xml' }
+  let(:drawing) { Creek::Drawing.new(book, drawingfile) }
+  let(:drawing_without_images) { Creek::Drawing.new(book_no_images, drawingfile) }
+
+  describe '#has_images?' do
+    it 'has' do
+      expect(drawing.has_images?).to eq(true)
+    end
+
+    it 'does not have' do
+      expect(drawing_without_images.has_images?).to eq(false)
+    end
+  end
+
+  describe '#images_at' do
+    it 'returns images pathnames at cell' do
+      image = drawing.images_at('A2')[0]
+      expect(image.class).to eq(Pathname)
+      expect(image.exist?).to eq(true)
+      expect(image.to_path).to match(/.+creek__drawing.+\.jpeg$/)
+    end
+
+    context 'when no images in cell' do
+      it 'returns nil' do
+        images = drawing.images_at('B2')
+        expect(images).to eq(nil)
+      end
+    end
+
+    context 'when more images in one cell' do
+      it 'returns all images at cell' do
+        images = drawing.images_at('A10')
+        expect(images.size).to eq(2)
+        expect(images.all?(&:exist?)).to eq(true)
+      end
+    end
+
+    context 'when same image across multiple cells' do
+      it 'returns same image for each cell' do
+        image1 = drawing.images_at('A4')[0]
+        image2 = drawing.images_at('A5')[0]
+        expect(image1.class).to eq(Pathname)
+        expect(image1).to eq(image2)
+      end
+    end
+  end
+end
data/spec/fixtures/sample-with-images.xlsx
ADDED
Binary file
data/spec/fixtures/sample.xlsx
CHANGED
Binary file
data/spec/shared_string_spec.rb
CHANGED
@@ -7,12 +7,12 @@ describe 'shared strings' do
     doc = Nokogiri::XML(shared_strings_xml_file)
     dictionary = Creek::SharedStrings.parse_shared_string_from_document(doc)
 
-    dictionary.keys.size.
-    dictionary[0].
-    dictionary[1].
-    dictionary[2].
-    dictionary[3].
-    dictionary[4].
+    expect(dictionary.keys.size).to eq(5)
+    expect(dictionary[0]).to eq('Cell A1')
+    expect(dictionary[1]).to eq('Cell B1')
+    expect(dictionary[2]).to eq('My Cell')
+    expect(dictionary[3]).to eq('Cell A2')
+    expect(dictionary[4]).to eq('Cell B2')
   end
 
 end
data/spec/sheet_spec.rb
ADDED
@@ -0,0 +1,85 @@
+require './spec/spec_helper'
+
+describe 'sheet' do
+  let(:book_with_images) { Creek::Book.new('spec/fixtures/sample-with-images.xlsx') }
+  let(:book_no_images) { Creek::Book.new('spec/fixtures/sample.xlsx') }
+  let(:sheetfile) { 'worksheets/sheet1.xml' }
+  let(:sheet_with_images) { Creek::Sheet.new(book_with_images, 'Sheet 1', 1, '', '', '1', sheetfile) }
+  let(:sheet_no_images) { Creek::Sheet.new(book_no_images, 'Sheet 1', 1, '', '', '1', sheetfile) }
+
+  def load_cell(rows, cell_name)
+    cell = rows.find { |row| !row[cell_name].nil? }
+    cell[cell_name] if cell
+  end
+
+  describe '#rows' do
+    context 'with excel with images' do
+      context 'with images preloading' do
+        let(:rows) { sheet_with_images.with_images.rows.map { |r| r } }
+
+        it 'parses single image in a cell' do
+          expect(load_cell(rows, 'A2').size).to eq(1)
+        end
+
+        it 'returns nil for cells without images' do
+          expect(load_cell(rows, 'A3')).to eq(nil)
+          expect(load_cell(rows, 'A7')).to eq(nil)
+          expect(load_cell(rows, 'A9')).to eq(nil)
+        end
+
+        it 'returns nil for merged cell within empty row' do
+          expect(load_cell(rows, 'A5')).to eq(nil)
+        end
+
+        it 'returns nil for image in a cell with empty row' do
+          expect(load_cell(rows, 'A8')).to eq(nil)
+        end
+
+        it 'returns images for merged cells' do
+          expect(load_cell(rows, 'A4').size).to eq(1)
+          expect(load_cell(rows, 'A6').size).to eq(1)
+        end
+
+        it 'returns multiple images' do
+          expect(load_cell(rows, 'A10').size).to eq(2)
+        end
+      end
+
+      it 'ignores images' do
+        rows = sheet_with_images.rows.map { |r| r }
+        expect(load_cell(rows, 'A2')).to eq(nil)
+        expect(load_cell(rows, 'A3')).to eq(nil)
+        expect(load_cell(rows, 'A4')).to eq(nil)
+      end
+    end
+
+    context 'with excel without images' do
+      it 'does not break on with_images' do
+        rows = sheet_no_images.with_images.rows.map { |r| r }
+        expect(load_cell(rows, 'A10')).to eq(0.15)
+      end
+    end
+  end
+
+  describe '#images_at' do
+    it 'returns images for merged cell' do
+      image = sheet_with_images.with_images.images_at('A5')[0]
+      expect(image.class).to eq(Pathname)
+    end
+
+    it 'returns images for empty row' do
+      image = sheet_with_images.with_images.images_at('A8')[0]
+      expect(image.class).to eq(Pathname)
+    end
+
+    it 'returns nil for empty cell' do
+      image = sheet_with_images.with_images.images_at('B3')
+      expect(image).to eq(nil)
+    end
+
+    it 'returns nil for empty cell without preloading images' do
+      image = sheet_with_images.images_at('B3')
+      expect(image).to eq(nil)
+    end
+  end
+end
data/spec/styles/style_types_spec.rb
CHANGED
@@ -7,9 +7,9 @@ describe Creek::Styles::StyleTypes do
       xml_file = File.open('spec/fixtures/styles/first.xml')
       doc = Nokogiri::XML(xml_file)
       res = Creek::Styles::StyleTypes.new(doc).call
-      res.size.
-      res[3].
-      res.
+      expect(res.size).to eq(8)
+      expect(res[3]).to eq(:date_time)
+      expect(res).to eq([:unsupported, :unsupported, :unsupported, :date_time, :unsupported, :unsupported, :unsupported, :unsupported])
     end
   end
 end
data/spec/test_spec.rb
CHANGED
@@ -2,25 +2,38 @@ require './spec/spec_helper'
 
 describe 'Creek trying to parsing an invalid file.' do
   it 'Fail to open a legacy xls file.' do
-
+    expect { Creek::Book.new 'spec/fixtures/invalid.xls' }
+      .to raise_error 'Not a valid file format.'
   end
 
   it 'Ignore file extensions on request.' do
     path = 'spec/fixtures/sample-as-zip.zip'
-
+    expect { Creek::Book.new path, check_file_extension: false }
+      .not_to raise_error
   end
 
   it 'Check file extension when requested.' do
-
-
+    expect { Creek::Book.new 'spec/fixtures/invalid.xls', check_file_extension: true }
+      .to raise_error 'Not a valid file format.'
+  end
+
+  it 'Fail to open remote file' do
+    expect { Creek::Book.new 'http://dev-builds.libreoffice.org/tmp/test.xlsx' }
+      .to raise_error(Zip::Error, /not found/)
+  end
+
+  it 'Opens remote file with remote flag' do
+    expect { Creek::Book.new 'http://dev-builds.libreoffice.org/tmp/test.xlsx', remote: true }
+      .not_to raise_error
   end
 
   it 'Check file extension of original_filename if passed.' do
     path = 'spec/fixtures/temp_string_io_file_path_with_no_extension'
-
-
+    expect { Creek::Book.new path, :original_filename => 'invalid.xls' }
+      .to raise_error 'Not a valid file format.'
+    expect { Creek::Book.new path, :original_filename => 'valid.xlsx' }
+      .not_to raise_error
   end
-
 end
 
 describe 'Creek parsing a sample XLSX file' do
@@ -32,7 +45,8 @@ describe 'Creek parsing a sample XLSX file' do
       {'A4'=>'Content 7', 'B4'=>'Content 8', 'C4'=>'Content 9', 'D4'=>'Content 10', 'E4'=>'Content 11', 'F4'=>'Content 12'},
       {'A5'=>nil, 'B5'=>nil, 'C5'=>nil, 'D5'=>nil, 'E5'=>nil, 'F5'=>nil, 'G5'=>nil, 'H5'=>nil, 'I5'=>nil, 'J5'=>nil, 'K5'=>nil, 'L5'=>nil, 'M5'=>nil, 'N5'=>nil, 'O5'=>nil, 'P5'=>nil, 'Q5'=>nil, 'R5'=>nil, 'S5'=>nil, 'T5'=>nil, 'U5'=>nil, 'V5'=>nil, 'W5'=>nil, 'X5'=>nil, 'Y5'=>nil, 'Z5'=>'Z Content', 'AA5'=>nil, 'AB5'=>nil, 'AC5'=>nil, 'AD5'=>nil, 'AE5'=>nil, 'AF5'=>nil, 'AG5'=>nil, 'AH5'=>nil, 'AI5'=>nil, 'AJ5'=>nil, 'AK5'=>nil, 'AL5'=>nil, 'AM5'=>nil, 'AN5'=>nil, 'AO5'=>nil, 'AP5'=>nil, 'AQ5'=>nil, 'AR5'=>nil, 'AS5'=>nil, 'AT5'=>nil, 'AU5'=>nil, 'AV5'=>nil, 'AW5'=>nil, 'AX5'=>nil, 'AY5'=>nil, 'AZ5'=>'Content 13'},
       {'A6'=>'1', 'B6'=>'2', 'C6'=>'3'}, {'A7'=>'Content 15', 'B7'=>'Content 16', 'C7'=>'Content 18', 'D7'=>'Content 19'},
-      {'A8'=>nil, 'B8'=>'Content 20', 'C8'=>nil, 'D8'=>nil, 'E8'=>nil, 'F8'=>'Content 21'}
+      {'A8'=>nil, 'B8'=>'Content 20', 'C8'=>nil, 'D8'=>nil, 'E8'=>nil, 'F8'=>'Content 21'},
+      {'A10' => 0.15, 'B10' => 0.15}]
   end
 
   after(:all) do
@@ -40,15 +54,15 @@ describe 'Creek parsing a sample XLSX file' do
   end
 
   it 'open an XLSX file successfully.' do
-    @creek.
+    expect(@creek).not_to be_nil
   end
 
   it 'find sheets successfully.' do
-    @creek.sheets.count.
+    expect(@creek.sheets.count).to eq(1)
     sheet = @creek.sheets.first
-    sheet.state.
-    sheet.name.
-    sheet.rid.
+    expect(sheet.state).to eql nil
+    expect(sheet.name).to eql 'Sheet1'
+    expect(sheet.rid).to eql 'rId1'
   end
 
   it 'Parse rows with empty cells successfully.' do
@@ -59,15 +73,16 @@ describe 'Creek parsing a sample XLSX file' do
       row_count += 1
     end
 
-    rows[0].
-    rows[1].
-    rows[2].
-    rows[3].
-    rows[4].
-    rows[5].
-    rows[6].
-    rows[7].
-
+    expect(rows[0]).to eq(@expected_rows[0])
+    expect(rows[1]).to eq(@expected_rows[1])
+    expect(rows[2]).to eq(@expected_rows[2])
+    expect(rows[3]).to eq(@expected_rows[3])
+    expect(rows[4]).to eq(@expected_rows[4])
+    expect(rows[5]).to eq(@expected_rows[5])
+    expect(rows[6]).to eq(@expected_rows[6])
+    expect(rows[7]).to eq(@expected_rows[7])
+    expect(rows[8]).to eq(@expected_rows[8])
+    expect(row_count).to eq(9)
   end
 
   it 'Parse rows with empty cells and meta data successfully.' do
@@ -77,6 +92,6 @@ describe 'Creek parsing a sample XLSX file' do
       rows << row
       row_count += 1
     end
-    rows.map{|r| r['cells']}.
+    expect(rows.map{|r| r['cells']}).to eq(@expected_rows)
   end
 end
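Note on the two remote-file specs: without `remote: true` the URL is handed to the zip reader as a local path, and with the flag the response body is first spooled to a temporary file. The sketch below shows that spooling step in isolation using only the standard `tempfile` library. The method name `spool_to_tempfile` and the fake body are made up for illustration; in `book.rb` the bytes come from `HTTParty.get(path).body`, and no network access happens here.

```ruby
require 'tempfile'

# Hypothetical helper: write raw bytes to a temp file and return the handle.
def spool_to_tempfile(body)
  file = Tempfile.new('file')
  file.binmode     # write raw bytes, no newline or encoding translation
  file.write(body)
  file.close       # flush to disk; the file stays until unlinked
  file
end

body = "PK\x03\x04fake-zip-bytes".b # assumption: placeholder, not a real xlsx
tmp = spool_to_tempfile(body)
round_trip = File.binread(tmp.path)
puts round_trip == body # => true
tmp.unlink
```

Returning the `Tempfile` object rather than only its path keeps a reference alive, which matters because Ruby may remove the underlying file once the object is garbage-collected.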
metadata
CHANGED
@@ -1,99 +1,113 @@
 --- !ruby/object:Gem::Specification
 name: creek
 version: !ruby/object:Gem::Version
-  version:
+  version: '2.0'
 platform: ruby
 authors:
 - pythonicrubyist
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2017-06-14 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - ~>
+    - - "~>"
       - !ruby/object:Gem::Version
         version: '1.3'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - - ~>
+    - - "~>"
       - !ruby/object:Gem::Version
         version: '1.3'
 - !ruby/object:Gem::Dependency
   name: rake
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
 - !ruby/object:Gem::Dependency
   name: rspec
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - ~>
+    - - "~>"
       - !ruby/object:Gem::Version
-        version:
+        version: 3.6.0
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - - ~>
+    - - "~>"
       - !ruby/object:Gem::Version
-        version:
+        version: 3.6.0
 - !ruby/object:Gem::Dependency
   name: pry
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
 - !ruby/object:Gem::Dependency
   name: nokogiri
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - ~>
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.
+        version: 1.7.0
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - - ~>
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.
+        version: 1.7.0
 - !ruby/object:Gem::Dependency
   name: rubyzip
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: 1.0.0
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: 1.0.0
+- !ruby/object:Gem::Dependency
+  name: httparty
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 0.15.5
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 0.15.5
 description: A Ruby gem that streams and parses large Excel(xlsx and xlsm) files fast
   and efficiently.
 email:
@@ -102,29 +116,34 @@ executables: []
 extensions: []
 extra_rdoc_files: []
 files:
-- .gitignore
+- ".gitignore"
 - Gemfile
 - LICENSE.txt
-- README.
+- README.md
 - Rakefile
 - creek.gemspec
 - lib/creek.rb
 - lib/creek/book.rb
+- lib/creek/drawing.rb
 - lib/creek/shared_strings.rb
 - lib/creek/sheet.rb
|
115
130
|
- lib/creek/styles.rb
|
116
131
|
- lib/creek/styles/constants.rb
|
117
132
|
- lib/creek/styles/converter.rb
|
118
133
|
- lib/creek/styles/style_types.rb
|
134
|
+
- lib/creek/utils.rb
|
119
135
|
- lib/creek/version.rb
|
136
|
+
- spec/drawing_spec.rb
|
120
137
|
- spec/fixtures/invalid.xls
|
121
138
|
- spec/fixtures/sample-as-zip.zip
|
139
|
+
- spec/fixtures/sample-with-images.xlsx
|
122
140
|
- spec/fixtures/sample.xlsx
|
123
141
|
- spec/fixtures/sheets/sheet1.xml
|
124
142
|
- spec/fixtures/sst.xml
|
125
143
|
- spec/fixtures/styles/first.xml
|
126
144
|
- spec/fixtures/temp_string_io_file_path_with_no_extension
|
127
145
|
- spec/shared_string_spec.rb
|
146
|
+
- spec/sheet_spec.rb
|
128
147
|
- spec/spec_helper.rb
|
129
148
|
- spec/styles/converter_spec.rb
|
130
149
|
- spec/styles/style_types_spec.rb
|
@@ -139,29 +158,32 @@ require_paths:
|
|
139
158
|
- lib
|
140
159
|
required_ruby_version: !ruby/object:Gem::Requirement
|
141
160
|
requirements:
|
142
|
-
- -
|
161
|
+
- - ">="
|
143
162
|
- !ruby/object:Gem::Version
|
144
|
-
version:
|
163
|
+
version: 2.0.0
|
145
164
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
146
165
|
requirements:
|
147
|
-
- -
|
166
|
+
- - ">="
|
148
167
|
- !ruby/object:Gem::Version
|
149
168
|
version: '0'
|
150
169
|
requirements: []
|
151
170
|
rubyforge_project:
|
152
|
-
rubygems_version: 2.
|
171
|
+
rubygems_version: 2.6.12
|
153
172
|
signing_key:
|
154
173
|
specification_version: 4
|
155
174
|
summary: A Ruby gem for parsing large Excel(xlsx and xlsm) files.
|
156
175
|
test_files:
|
176
|
+
- spec/drawing_spec.rb
|
157
177
|
- spec/fixtures/invalid.xls
|
158
178
|
- spec/fixtures/sample-as-zip.zip
|
179
|
+
- spec/fixtures/sample-with-images.xlsx
|
159
180
|
- spec/fixtures/sample.xlsx
|
160
181
|
- spec/fixtures/sheets/sheet1.xml
|
161
182
|
- spec/fixtures/sst.xml
|
162
183
|
- spec/fixtures/styles/first.xml
|
163
184
|
- spec/fixtures/temp_string_io_file_path_with_no_extension
|
164
185
|
- spec/shared_string_spec.rb
|
186
|
+
- spec/sheet_spec.rb
|
165
187
|
- spec/spec_helper.rb
|
166
188
|
- spec/styles/converter_spec.rb
|
167
189
|
- spec/styles/style_types_spec.rb
|
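Note on the version constraints in the metadata diff above: the new metadata quotes its operators (`"~>"`, `">="`) and pins `nokogiri` to `~> 1.7.0`, `httparty` to `~> 0.15.5`, and the required Ruby version to `>= 2.0.0`. The following is a minimal sketch of how RubyGems evaluates those constraints, using only `Gem::Requirement` and `Gem::Version` from RubyGems itself. The constant names and the `allowed?` helper are illustrative and do not appear in the gem.

```ruby
require "rubygems"

# Constraints copied from the 2.0 metadata shown in the diff above.
# The constant names are hypothetical, chosen for this sketch only.
NOKOGIRI_REQ = Gem::Requirement.new("~> 1.7.0")
HTTPARTY_REQ = Gem::Requirement.new("~> 0.15.5")
RUBY_REQ     = Gem::Requirement.new(">= 2.0.0")

# Hypothetical helper: true when the version string satisfies the requirement.
def allowed?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

# "~> 1.7.0" accepts patch releases of 1.7 but not 1.8.
puts allowed?(NOKOGIRI_REQ, "1.7.2")   # patch release within 1.7
puts allowed?(NOKOGIRI_REQ, "1.8.0")   # next minor release

# "~> 0.15.5" accepts 0.15.5 and later 0.15.x releases, but not 0.16.
puts allowed?(HTTPARTY_REQ, "0.15.7")
puts allowed?(HTTPARTY_REQ, "0.16.0")

# ">= 2.0.0" is an open-ended lower bound.
puts allowed?(RUBY_REQ, "1.9.3")
puts allowed?(RUBY_REQ, "2.4.1")
```

With a three-segment version, the pessimistic operator `~>` only lets the last segment vary, which is why the `nokogiri` and `httparty` pins are tighter than the `~> 1.3` constraint on `bundler`.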
data/README.rdoc
DELETED
@@ -1,76 +0,0 @@
|
|
1
|
-
= Creek -- Stream parser for large Excel(xlsx and xlsm) files.
|
2
|
-
|
3
|
-
Creek is a Ruby gem that provide a fast, simple and efficient method of parsing large Excel(xlsx and xlsm) files.
|
4
|
-
|
5
|
-
|
6
|
-
== Installation
|
7
|
-
|
8
|
-
Creek can be used from the command line or as part of a Ruby web framework. To install the gem using terminal, run the following command:
|
9
|
-
|
10
|
-
gem install creek
|
11
|
-
|
12
|
-
To use it in Rails, add this line to your Gemfile:
|
13
|
-
|
14
|
-
gem "creek"
|
15
|
-
|
16
|
-
|
17
|
-
== Basic Usage
|
18
|
-
Creek can simply parse an Excel file by looping through the rows enumerator:
|
19
|
-
|
20
|
-
require 'creek'
|
21
|
-
creek = Creek::Book.new "specs/fixtures/sample.xlsx"
|
22
|
-
sheet= creek.sheets[0]
|
23
|
-
|
24
|
-
sheet.rows.each do |row|
|
25
|
-
puts row # => {"A1"=>"Content 1", "B1"=>nil, C1"=>nil, "D1"=>"Content 3"}
|
26
|
-
end
|
27
|
-
|
28
|
-
|
29
|
-
sheet.rows_with_meta_data.each do |row|
|
30
|
-
puts row # => {"collapsed"=>"false", "customFormat"=>"false", "customHeight"=>"true", "hidden"=>"false", "ht"=>"12.1", "outlineLevel"=>"0", "r"=>"1", "cells"=>{"A1"=>"Content 1", "B1"=>nil, C1"=>nil, "D1"=>"Content 3"}}
|
31
|
-
end
|
32
|
-
|
33
|
-
|
34
|
-
sheet.state # => 'visible'
|
35
|
-
sheet.name # => 'Sheet1'
|
36
|
-
sheet.rid # => 'rId2'
|
37
|
-
|
38
|
-
== Filename considerations
|
39
|
-
By default, Creek will ensure that the file extension is either *.xlsx or *.xlsm, but this check can be circumvented as needed:
|
40
|
-
|
41
|
-
path = 'sample-as-zip.zip'
|
42
|
-
Creek::Book.new path, :check_file_extension => false
|
43
|
-
|
44
|
-
By default, the Rails {file_field_tag}[http://api.rubyonrails.org/classes/ActionView/Helpers/FormTagHelper.html#method-i-file_field_tag] uploads to a temporary location and stores the original filename with the StringIO object. (See {this section}[http://guides.rubyonrails.org/form_helpers.html#uploading-files] of the Rails Guides for more information.)
|
45
|
-
|
46
|
-
Creek can parse this directly without the need for file upload gems such as Carrierwave or Paperclip by passing the original filename as an option:
|
47
|
-
|
48
|
-
# Import endpoint in Rails controller
|
49
|
-
def import
|
50
|
-
file = params[:file]
|
51
|
-
Creek::Book.new file.path, check_file_extension: false
|
52
|
-
end
|
53
|
-
|
54
|
-
== Contributing
|
55
|
-
|
56
|
-
Contributions are welcomed. You can fork a repository, add your code changes to the forked branch, ensure all existing unit tests pass, create new unit tests cover your new changes and finally create a pull request.
|
57
|
-
|
58
|
-
After forking and then cloning the repository locally, install Bundler and then use it
|
59
|
-
to install the development gem dependecies:
|
60
|
-
|
61
|
-
gem install bundler
|
62
|
-
bundle install
|
63
|
-
|
64
|
-
Once this is complete, you should be able to run the test suite:
|
65
|
-
|
66
|
-
rake
|
67
|
-
|
68
|
-
|
69
|
-
== Bug Reporting
|
70
|
-
|
71
|
-
Please use the {Issues}[https://github.com/pythonicrubyist/creek/issues] page to report bugs or suggest new enhancements.
|
72
|
-
|
73
|
-
|
74
|
-
== License
|
75
|
-
|
76
|
-
Creek has been published under {MIT License}[https://github.com/pythonicrubyist/creek/blob/master/LICENSE.txt]
|
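Note on the "Filename considerations" section of the deleted README: it states that Creek checks for an `.xlsx` or `.xlsm` extension by default and that `check_file_extension: false` skips the check. The sketch below illustrates that kind of check in plain Ruby. It is not Creek's implementation: the `spreadsheet_extension?` helper, the `ALLOWED_EXTENSIONS` constant, and the case-insensitive comparison are all assumptions made for illustration.

```ruby
# Extensions named in the README text; the constant itself is hypothetical.
ALLOWED_EXTENSIONS = %w[.xlsx .xlsm].freeze

# Hypothetical helper mirroring the documented option name.
# Returns true when the check is disabled or the extension is allowed.
def spreadsheet_extension?(path, check_file_extension: true)
  return true unless check_file_extension

  # Assumption: compare case-insensitively so "REPORT.XLSX" also passes.
  ALLOWED_EXTENSIONS.include?(File.extname(path).downcase)
end

puts spreadsheet_extension?("sample.xlsx")
puts spreadsheet_extension?("sample-as-zip.zip")
puts spreadsheet_extension?("sample-as-zip.zip", check_file_extension: false)
```

The last call corresponds to the README's upload example, where a temporary file path may carry no useful extension and the caller opts out of the check.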