peregrin 1.1.1
- data/MIT-LICENSE +20 -0
- data/README.md +148 -0
- data/bin/peregrin +6 -0
- data/lib/formats/epub.rb +553 -0
- data/lib/formats/ochook.rb +113 -0
- data/lib/formats/zhook.rb +394 -0
- data/lib/peregrin/book.rb +87 -0
- data/lib/peregrin/chapter.rb +31 -0
- data/lib/peregrin/component.rb +12 -0
- data/lib/peregrin/componentizer.rb +118 -0
- data/lib/peregrin/outliner.rb +204 -0
- data/lib/peregrin/property.rb +16 -0
- data/lib/peregrin/resource.rb +24 -0
- data/lib/peregrin/version.rb +5 -0
- data/lib/peregrin/zip_patch.rb +11 -0
- data/lib/peregrin.rb +139 -0
- data/test/conversion_test.rb +80 -0
- data/test/formats/epub_test.rb +159 -0
- data/test/formats/ochook_test.rb +104 -0
- data/test/formats/zhook_test.rb +219 -0
- data/test/test_helper.rb +16 -0
- data/test/utils/componentizer_test.rb +78 -0
- data/test/utils/outliner_test.rb +49 -0
- metadata +135 -0
data/MIT-LICENSE
ADDED
@@ -0,0 +1,20 @@
Copyright (c) 2010 Joseph Pearson

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,148 @@
# Peregrin

A library for inspecting Zhooks, Ochooks and EPUB ebooks, and converting
between them.

Invented by [Inventive Labs](http://inventivelabs.com.au). Released under the
MIT license.

More info: http://ochook.org/peregrin


## Requirements

Ruby, at least 1.8.x.

You must have ImageMagick installed — specifically, you must have the 'convert'
utility provided by ImageMagick somewhere in your PATH.

Required Ruby gems:

* zipruby
* nokogiri
* mime-types


## Peregrin from the command-line

You can use Peregrin to inspect a Zhook, Ochook or EPUB file from the
command-line. It will perform very basic validation of the file and
output an analysis.

    $ peregrin strunk.epub
    [EPUB]

    Cover
    images/cover.png

    Components [10]
    cover.xml
    title.xml
    about.xml
    main0.xml
    main1.xml
    main2.xml
    main3.xml
    main4.xml
    main5.xml
    main6.xml

    Resources [2]
    css/main.css
    images/cover.png

    Chapters
    - Title
    - About
    - Chapter 1 - Introductory
    - Chapter 2 - Elementary Rules of Usage
    - Chapter 3 - Elementary Principles of Composition
    - Chapter 4 - A Few Matters of Form
    - Chapter 5 - Words and Expressions Commonly Misused
    - Chapter 6 - Words Commonly Misspelled

    Properties [5]
    title: The Elements of Style
    identifier: urn:uuid:6f82990c-9394-11df-920d-001cc0a62c0b
    language: en
    creator: William Strunk Jr.
    subject: Non-Fiction

Note that file type detection is quite naive — it just uses the path extension,
and if the extension is not .zhook or .epub, it assumes the path is an
Ochook directory.
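
The extension-based detection described here can be sketched in a few lines of Ruby. This is only an illustration of the rule as stated, not Peregrin's actual code: the method name `guess_format` and the returned symbols are assumptions made for the example.

```ruby
# Hypothetical helper, not part of Peregrin's API: guesses a format
# from the path extension alone, following the rule stated in the README.
def guess_format(path)
  case File.extname(path)
  when '.zhook' then :zhook
  when '.epub'  then :epub
  else :ochook # any other path is assumed to be an Ochook directory
  end
end

puts guess_format('strunk.epub')   # prints "epub"
puts guess_format('strunk.zhook')  # prints "zhook"
puts guess_format('books/strunk')  # prints "ochook"
```

Because nothing but the extension is checked, a misnamed file or an arbitrary directory would be treated as whatever its name suggests, so the validation step still matters.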

You can also use Peregrin to convert from one format to another. Just provide
two paths to the utility; it will convert from the first to the second.

    $ peregrin strunk.epub strunk.zhook
    [Zhook]
    Cover
    cover.png

    Components [1]
    index.html

    Resources [2]
    css/main.css
    cover.png

    Chapters
    - Title
    - About
    - Chapter 1 - Introductory
    - Chapter 2 - Elementary Rules of Usage
    - Chapter 3 - Elementary Principles of Composition
    - Chapter 4 - A Few Matters of Form
    - Chapter 5 - Words and Expressions Commonly Misused
    - Chapter 6 - Words Commonly Misspelled

    Properties [5]
    title: The Elements of Style
    identifier: urn:uuid:6f82990c-9394-11df-920d-001cc0a62c0b
    language: en
    creator: William Strunk Jr.
    subject: Non-Fiction


## Library usage

The three formats are represented in the Peregrin::Epub, Peregrin::Zhook and
Peregrin::Ochook classes. Each format class responds to the following methods:

* validate(path)
* read(path) - creates an instance of the class from the path
* new(book) - creates an instance of the class from a Peregrin::Book

Each instance of a format class responds to the following methods:

* write(path)
* to\_book(options) - returns a Peregrin::Book object

Here's what a conversion routine might look like:

    zhook = Peregrin::Zhook.read('foo.zhook')
    epub = Peregrin::Epub.new(zhook.to_book(:componentize => true))
    epub.write('foo.epub')

## Peregrin::Book

Between the three supported formats, there is an abstracted concept of "book"
data, which holds the following information:

* components - an array of Components that make up the linear content
* chapters - an array of Chapters (with title, src and children)
* properties - an array of Property metadata tuples (key/value + attributes)
* resources - an array of Resources contained in the ebook, other than components
* cover - the Resource that should be used as the cover of the ebook
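
To make the shape of this data concrete, here is a rough sketch using plain Ruby Structs. The names `BookSketch`, `ChapterSketch` and `PropertySketch` are assumptions for illustration, and components, resources and the cover are simplified to path strings. The sample values come from the command-line output earlier in this README; the gem's own Peregrin::Book class may be organised differently.

```ruby
# Illustrative stand-ins only; these are not the classes shipped in the gem.
ChapterSketch  = Struct.new(:title, :src, :children)
PropertySketch = Struct.new(:key, :value, :attributes)
BookSketch     = Struct.new(:components, :chapters, :properties, :resources, :cover)

book = BookSketch.new(
  ['cover.xml', 'title.xml'],                                  # components: the linear content
  [ChapterSketch.new('Title', 'title.xml', [])],               # chapters: title, src, children
  [PropertySketch.new('title', 'The Elements of Style', {})],  # properties: key/value + attributes
  ['css/main.css', 'images/cover.png'],                        # resources other than components
  'images/cover.png'                                           # cover resource
)

puts book.chapters.first.title    # prints "Title"
puts book.properties.first.value  # prints "The Elements of Style"
```

Keeping the cover as its own field, separate from the resources array, mirrors the list above, where the cover is named independently of the other resources.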

There will probably be some changes to the shape of this data over the
development of Peregrin, to ensure that the Book interchange object retains all
relevant information about an ebook without lossiness. But for the moment,
it's being kept as simple as possible.


## Peregrin?

All this rhyming on "ook" put me in mind of the Took family. There is no
deeper meaning.