peregrin 1.1.1
- data/MIT-LICENSE +20 -0
- data/README.md +148 -0
- data/bin/peregrin +6 -0
- data/lib/formats/epub.rb +553 -0
- data/lib/formats/ochook.rb +113 -0
- data/lib/formats/zhook.rb +394 -0
- data/lib/peregrin/book.rb +87 -0
- data/lib/peregrin/chapter.rb +31 -0
- data/lib/peregrin/component.rb +12 -0
- data/lib/peregrin/componentizer.rb +118 -0
- data/lib/peregrin/outliner.rb +204 -0
- data/lib/peregrin/property.rb +16 -0
- data/lib/peregrin/resource.rb +24 -0
- data/lib/peregrin/version.rb +5 -0
- data/lib/peregrin/zip_patch.rb +11 -0
- data/lib/peregrin.rb +139 -0
- data/test/conversion_test.rb +80 -0
- data/test/formats/epub_test.rb +159 -0
- data/test/formats/ochook_test.rb +104 -0
- data/test/formats/zhook_test.rb +219 -0
- data/test/test_helper.rb +16 -0
- data/test/utils/componentizer_test.rb +78 -0
- data/test/utils/outliner_test.rb +49 -0
- metadata +135 -0
data/MIT-LICENSE
ADDED
@@ -0,0 +1,20 @@
Copyright (c) 2010 Joseph Pearson

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,148 @@
# Peregrin

A library for inspecting Zhooks, Ochooks and EPUB ebooks, and converting
between them.

Invented by [Inventive Labs](http://inventivelabs.com.au). Released under the
MIT license.

More info: http://ochook.org/peregrin


## Requirements

Ruby, at least 1.8.x.

You must have ImageMagick installed; specifically, you must have the 'convert'
utility provided by ImageMagick somewhere in your PATH.

Required Ruby gems:

* zipruby
* nokogiri
* mime-types
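Because Peregrin needs the 'convert' utility to be reachable through PATH, it can help to check for it before running a conversion. The following is a minimal sketch of such a check for Unix-like systems. `find_executable_on_path` is a hypothetical helper written for this example; it is not part of Peregrin.

```ruby
# Hypothetical helper, not part of Peregrin.
# Returns the full path of the first executable file called `name` found in
# the directories listed in `path_var`, or nil if there is none.
def find_executable_on_path(name, path_var = ENV['PATH'])
  path_var.to_s.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Warn early if ImageMagick's 'convert' cannot be found.
if find_executable_on_path('convert').nil?
  warn "ImageMagick's 'convert' utility was not found in PATH"
end
```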


## Peregrin from the command-line

You can use Peregrin to inspect a Zhook, Ochook or EPUB file from the
command-line. It will perform very basic validation of the file and
output an analysis.

    $ peregrin strunk.epub
    [EPUB]

    Cover
    images/cover.png

    Components [10]
    cover.xml
    title.xml
    about.xml
    main0.xml
    main1.xml
    main2.xml
    main3.xml
    main4.xml
    main5.xml
    main6.xml

    Resources [2]
    css/main.css
    images/cover.png

    Chapters
    - Title
    - About
    - Chapter 1 - Introductory
    - Chapter 2 - Elementary Rules of Usage
    - Chapter 3 - Elementary Principles of Composition
    - Chapter 4 - A Few Matters of Form
    - Chapter 5 - Words and Expressions Commonly Misused
    - Chapter 6 - Words Commonly Misspelled

    Properties [5]
    title: The Elements of Style
    identifier: urn:uuid:6f82990c-9394-11df-920d-001cc0a62c0b
    language: en
    creator: William Strunk Jr.
    subject: Non-Fiction

Note that file type detection is quite naive: it just uses the path extension,
and if the extension is not .zhook or .epub, it assumes the path is an
Ochook directory.
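The extension rule described above can be sketched in a few lines. This is an illustration of the rule as stated in this README, not Peregrin's actual code. `guess_format` and the symbols it returns are names made up for the example.

```ruby
# Hypothetical helper: picks a format name from the path extension alone.
# It never opens the file, so the result is only a guess.
def guess_format(path)
  case File.extname(path)
  when '.zhook' then :zhook
  when '.epub'  then :epub
  else :ochook # any other path is assumed to be an Ochook directory
  end
end

guess_format('strunk.epub')   # => :epub
guess_format('strunk.zhook')  # => :zhook
guess_format('strunk')        # => :ochook
guess_format('notes.txt')     # => :ochook
```

Under this rule a path such as `notes.txt` is treated as an Ochook directory, which is why the detection is called naive.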

You can also use Peregrin to convert from one format to another. Just provide
two paths to the utility; it will convert from the first to the second.

    $ peregrin strunk.epub strunk.zhook
    [Zhook]
    Cover
    cover.png

    Components [1]
    index.html

    Resources [2]
    css/main.css
    cover.png

    Chapters
    - Title
    - About
    - Chapter 1 - Introductory
    - Chapter 2 - Elementary Rules of Usage
    - Chapter 3 - Elementary Principles of Composition
    - Chapter 4 - A Few Matters of Form
    - Chapter 5 - Words and Expressions Commonly Misused
    - Chapter 6 - Words Commonly Misspelled

    Properties [5]
    title: The Elements of Style
    identifier: urn:uuid:6f82990c-9394-11df-920d-001cc0a62c0b
    language: en
    creator: William Strunk Jr.
    subject: Non-Fiction


## Library usage

The three formats are represented in the Peregrin::Epub, Peregrin::Zhook and
Peregrin::Ochook classes. Each format class responds to the following methods:

* validate(path)
* read(path) - creates an instance of the class from the path
* new(book) - creates an instance of the class from a Peregrin::Book

Each instance of a format class responds to the following methods:

* write(path)
* to_book(options) - returns a Peregrin::Book object

Here's what a conversion routine might look like:

    zhook = Peregrin::Zhook.read('foo.zhook')
    epub = Peregrin::Epub.new(zhook.to_book(:componentize => true))
    epub.write('foo.epub')
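Since every format class responds to the same methods, the routine above can be generalised to any pair of formats. The sketch below uses only the methods listed in this section. `convert` is a hypothetical helper, not something Peregrin provides. It also assumes that `validate` raises on invalid input, which this README does not state.

```ruby
# Hypothetical helper built only from the methods listed above.
# source_class and target_class are format classes such as Peregrin::Zhook
# or Peregrin::Epub; options are passed straight through to to_book.
def convert(source_class, source_path, target_class, target_path, options = {})
  # Assumption: validate raises if the source is not acceptable.
  source_class.validate(source_path)
  book = source_class.read(source_path).to_book(options)
  target = target_class.new(book)
  target.write(target_path)
  target
end

# Usage, assuming the peregrin gem is installed:
#   require 'peregrin'
#   convert(Peregrin::Zhook, 'foo.zhook', Peregrin::Epub, 'foo.epub',
#           :componentize => true)
```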

## Peregrin::Book

Between the three supported formats, there is an abstracted concept of "book"
data, which holds the following information:

* components - an array of Components that make up the linear content
* chapters - an array of Chapters (with title, src and children)
* properties - an array of Property metadata tuples (key/value + attributes)
* resources - an array of Resources contained in the ebook, other than components
* cover - the Resource that should be used as the cover of the ebook

There will probably be some changes to the shape of this data over the
development of Peregrin, to ensure that the Book interchange object retains all
relevant information about an ebook without lossiness. But for the moment,
it's being kept as simple as possible.
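To make the shape of this data concrete, here is a rough sketch that models the fields listed above with plain Structs and prints a nested chapter outline. These are stand-ins, not Peregrin's own classes. The `BookSketch` module, the `src` field on `Resource`, the `outline` helper and all sample values are assumptions made for the example.

```ruby
# Stand-in types that mirror the fields named in this README.
# They are NOT the gem's classes.
module BookSketch
  Chapter  = Struct.new(:title, :src, :children)
  Property = Struct.new(:key, :value, :attributes)
  Resource = Struct.new(:src) # field name is an assumption
  Book     = Struct.new(:components, :chapters, :properties, :resources, :cover)

  # Flattens nested chapters into indented "- Title" lines.
  def self.outline(chapters, depth = 0)
    chapters.flat_map do |chapter|
      line = "#{'  ' * depth}- #{chapter.title}"
      [line] + outline(chapter.children || [], depth + 1)
    end
  end
end

# Made-up sample data, loosely based on the command-line output above.
book = BookSketch::Book.new(
  [], # components left empty in this sketch
  [
    BookSketch::Chapter.new('Title', 'title.xml', []),
    BookSketch::Chapter.new('Chapter 1 - Introductory', 'main0.xml',
                            [BookSketch::Chapter.new('Section 1', 'main0.xml#s1', [])])
  ],
  [BookSketch::Property.new('title', 'The Elements of Style', {})],
  [BookSketch::Resource.new('css/main.css')],
  BookSketch::Resource.new('images/cover.png')
)

puts BookSketch.outline(book.chapters)
```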


## Peregrin?

All this rhyming on "ook" put me in mind of the Took family. There is no
deeper meaning.