pseudohikiparser 0.0.0.4.develop → 0.0.0.5.develop
- data/LICENSE +23 -0
- data/README.md +203 -0
- data/bin/pseudohiki2html.rb +28 -11
- data/lib/htmlelement.rb +1 -3
- data/lib/pseudohiki/blockparser.rb +5 -7
- data/lib/pseudohiki/inlineparser.rb +5 -11
- data/lib/pseudohiki/treestack.rb +1 -1
- data/lib/pseudohiki/version.rb +1 -1
- data/test/test_plaintextformat.rb +10 -0
- metadata +6 -4
data/LICENSE
ADDED
@@ -0,0 +1,23 @@
|
|
1
|
+
Copyright (c) 2011, HASHIMOTO Naoki
|
2
|
+
All rights reserved.
|
3
|
+
|
4
|
+
Redistribution and use in source and binary forms, with or without modification,
|
5
|
+
are permitted provided that the following conditions are met:
|
6
|
+
|
7
|
+
* Redistributions of source code must retain the above copyright notice, this
|
8
|
+
list of conditions and the following disclaimer.
|
9
|
+
|
10
|
+
* Redistributions in binary form must reproduce the above copyright notice, this
|
11
|
+
list of conditions and the following disclaimer in the documentation and/or
|
12
|
+
other materials provided with the distribution.
|
13
|
+
|
14
|
+
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
|
15
|
+
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
16
|
+
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
|
17
|
+
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
|
18
|
+
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
|
19
|
+
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
|
20
|
+
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
|
21
|
+
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
22
|
+
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
23
|
+
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
data/README.md
ADDED
@@ -0,0 +1,203 @@
|
|
1
|
+
PseudoHikiParser
|
2
|
+
================
|
3
|
+
|
4
|
+
PseudoHikiParser is a converter of texts written in a [Hiki](http://hikiwiki.org/en/) like notation, into html or other formats.
|
5
|
+
|
6
|
+
Currently, only a limited range of notations can be converted into HTML4 or XHTML1.0.
|
7
|
+
|
8
|
+
I am writing this tool with following objectives in mind,
|
9
|
+
|
10
|
+
* provide some additional features that do not exist in the original Hiki notation
|
11
|
+
* make the notation more line oriented
|
12
|
+
* allow to assign ids to elements such as headings
|
13
|
+
* support several formats other than HTML
|
14
|
+
* The visitor pattern is adopted for the implementation, so you only have to add a visitor class to support a certain format.
|
15
|
+
|
16
|
+
And, it would not be compatible with the original Hiki notation.
|
17
|
+
|
18
|
+
## License
|
19
|
+
|
20
|
+
BSD 2-Clause License
|
21
|
+
|
22
|
+
## Installation
|
23
|
+
|
24
|
+
```
|
25
|
+
gem install pseudohikiparser --pre
|
26
|
+
```
|
27
|
+
|
28
|
+
|
29
|
+
## Usage
|
30
|
+
|
31
|
+
### Samples
|
32
|
+
|
33
|
+
[A sample text](https://github.com/nico-hn/PseudoHikiParser/blob/develop/samples/wikipage.txt) in Hiki notation and [a result of conversion](http://htmlpreview.github.com/?https://github.com/nico-hn/PseudoHikiParser/blob/develop/samples/wikipage.html), and [another result of conversion](http://htmlpreview.github.com/?https://github.com/nico-hn/PseudoHikiParser/blob/develop/samples/wikipage_with_toc.html)
|
34
|
+
|
35
|
+
You will find those samples in [develop branch](https://github.com/nico-hn/PseudoHikiParser/tree/develop/samples).
|
36
|
+
|
37
|
+
|
38
|
+
### pseudohiki2html.rb
|
39
|
+
|
40
|
+
After the installation of PseudoHikiParser, you could use a command, _pseudohiki2html.rb_.
|
41
|
+
|
42
|
+
_Please note that pseudohiki2html.rb is currently provided as a showcase of PseudoHikiParser, and the options will be continuously changed at this stage of development._
|
43
|
+
|
44
|
+
Typing the following lines at the command prompt:
|
45
|
+
|
46
|
+
```
|
47
|
+
pseudohiki2html.rb <<TEXT
|
48
|
+
!! The first heading
|
49
|
+
The first paragraph
|
50
|
+
TEXT
|
51
|
+
```
|
52
|
+
will return the following result to stdout:
|
53
|
+
|
54
|
+
```html
|
55
|
+
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
|
56
|
+
"http://www.w3.org/TR/html4/loose.dtd">
|
57
|
+
<html lang="en">
|
58
|
+
<head>
|
59
|
+
<meta content="en" http-equiv="Content-Language">
|
60
|
+
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
|
61
|
+
<meta content="text/javascript" http-equiv="Content-Script-Type">
|
62
|
+
<title>-</title>
|
63
|
+
<link href="default.css" rel="stylesheet" type="text/css">
|
64
|
+
</head>
|
65
|
+
<body>
|
66
|
+
<div class="section h2">
|
67
|
+
<h2> The first heading
|
68
|
+
</h2>
|
69
|
+
<p>
|
70
|
+
The first paragraph
|
71
|
+
</p>
|
72
|
+
<!-- end of section h2 -->
|
73
|
+
</div>
|
74
|
+
</body>
|
75
|
+
</html>
|
76
|
+
```
|
77
|
+
And if you specify a file name with `--output` option:
|
78
|
+
|
79
|
+
```
|
80
|
+
pseudohiki2html.rb --output first_example.html <<TEXT
|
81
|
+
!! The first heading
|
82
|
+
The first paragraph
|
83
|
+
TEXT
|
84
|
+
```
|
85
|
+
the result will be saved in first_example.html.
|
86
|
+
|
87
|
+
For more options, please try `pseudohiki2html.rb --help`
|
88
|
+
|
89
|
+
### module PseudoHiki
|
90
|
+
|
91
|
+
If you save the lines below as a ruby script and execute it:
|
92
|
+
|
93
|
+
```
|
94
|
+
#!/usr/bin/env ruby
|
95
|
+
|
96
|
+
require 'pseudohikiparser'
|
97
|
+
|
98
|
+
plain = <<TEXT
|
99
|
+
!! The first heading
|
100
|
+
The first paragraph
|
101
|
+
TEXT
|
102
|
+
|
103
|
+
tree = PseudoHiki::BlockParser.parse(plain.lines.to_a)
|
104
|
+
html = PseudoHiki::HtmlFormat.format(tree)
|
105
|
+
puts html
|
106
|
+
```
|
107
|
+
you will get the following output:
|
108
|
+
|
109
|
+
```
|
110
|
+
<div class="section h2">
|
111
|
+
<h2> The first heading
|
112
|
+
</h2>
|
113
|
+
<p>
|
114
|
+
The first paragraph
|
115
|
+
</p>
|
116
|
+
<!-- end of section h2 -->
|
117
|
+
</div>
|
118
|
+
```
|
119
|
+
|
120
|
+
Other than PseudoHiki::HtmlFormat, you can choose PseudoHiki::XhtmlFormat, PseudoHiki::Xhtml5Format, PseudoHiki::PlainTextFormat.
|
121
|
+
|
122
|
+
## Development status of features from the original [Hiki notation](http://hikiwiki.org/en/TextFormattingRules.html)
|
123
|
+
|
124
|
+
* Paragraphs - Usable
|
125
|
+
* Links
|
126
|
+
* WikiNames - Not supported (and would never be)
|
127
|
+
* Linking to other Wiki pages - Not supported as well
|
128
|
+
* Linking to an arbitrary URL - Maybe usable
|
129
|
+
* Preformatted text - Usable
|
130
|
+
* Text decoration - Partly supported
|
131
|
+
* Currently, there is no means of escaping tags for inline decorations.
|
132
|
+
* The notation with backquote tags(``) for inline literals is not supported.
|
133
|
+
* Headings - Usable
|
134
|
+
* Horizontal lines - Usable
|
135
|
+
* Lists - Usable
|
136
|
+
* Quotations - Usable
|
137
|
+
* Definitions - Usable
|
138
|
+
* Tables - Usable
|
139
|
+
* Comments - Usable
|
140
|
+
* Plugins - Not supported (and will not be compatible with the original one)
|
141
|
+
|
142
|
+
## Additional Features
|
143
|
+
### Already Implemented
|
144
|
+
#### Assigning ids
|
145
|
+
If you add [name_of_id], just after the marks that denote heading or list type items, it becomes the id attribute of resulting html elements. Below is an example.
|
146
|
+
|
147
|
+
```
|
148
|
+
!![heading_id]heading
|
149
|
+
|
150
|
+
*[list_id]list
|
151
|
+
```
|
152
|
+
will be rendered as
|
153
|
+
|
154
|
+
```html
|
155
|
+
<div class="section h2">
|
156
|
+
<h2 id="HEADING_ID">heading
|
157
|
+
</h2>
|
158
|
+
<ul>
|
159
|
+
<li id="LIST_ID">list
|
160
|
+
</li>
|
161
|
+
</ul>
|
162
|
+
<!-- end of section h2 -->
|
163
|
+
</div>
|
164
|
+
```
|
165
|
+
|
166
|
+
### Partly Implemented
|
167
|
+
#### A visitor that removes markups and returns plain texts
|
168
|
+
The visitor, [PlainTextFormat](https://github.com/nico-hn/PseudoHikiParser/blob/develop/lib/pseudohiki/plaintextformat.rb) is currently available only in the [develop branch](https://github.com/nico-hn/PseudoHikiParser/tree/develop). Below are examples
|
169
|
+
|
170
|
+
```
|
171
|
+
:tel:03-xxxx-xxxx
|
172
|
+
::03-yyyy-yyyy
|
173
|
+
:fax:03-xxxx-xxxx
|
174
|
+
```
|
175
|
+
will be rendered as
|
176
|
+
|
177
|
+
```
|
178
|
+
tel: 03-xxxx-xxxx
|
179
|
+
03-yyyy-yyyy
|
180
|
+
fax: 03-xxxx-xxxx
|
181
|
+
```
|
182
|
+
|
183
|
+
And
|
184
|
+
|
185
|
+
```
|
186
|
+
||cell 1-1||>>cell 1-2,3,4||cell 1-5
|
187
|
+
||cell 2-1||^>cell 2-2,3 3-2,3||cell 2-4||cell 2-5
|
188
|
+
||cell 3-1||cell 3-4||cell 3-5
|
189
|
+
||cell 4-1||cell 4-2||cell 4-3||cell 4-4||cell 4-5
|
190
|
+
```
|
191
|
+
will be rendered as
|
192
|
+
|
193
|
+
```
|
194
|
+
cell 1-1 cell 1-2,3,4 == == cell 1-5
|
195
|
+
cell 2-1 cell 2-2,3 3-2,3 == cell 2-4 cell 2-5
|
196
|
+
cell 3-1 || || cell 3-4 cell 3-5
|
197
|
+
cell 4-1 cell 4-2 cell 4-3 cell 4-4 cell 4-5
|
198
|
+
```
|
199
|
+
#### A visitor for HTML5
|
200
|
+
The visitor, [Xhtml5Format](https://github.com/nico-hn/PseudoHikiParser/blob/develop/lib/pseudohiki/htmlformat.rb#L225) is currently available only in the [develop branch](https://github.com/nico-hn/PseudoHikiParser/tree/develop).
|
201
|
+
|
202
|
+
|
203
|
+
### Not Implemented Yet
|
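The README added above says the visitor pattern is used so that supporting a new format only requires a new visitor class. A minimal sketch of that idea follows. All class and method names here (`HeadingNode`, `ParagraphNode`, `HtmlVisitor`, `PlainTextVisitor`, `render`) are hypothetical and are not PseudoHikiParser's actual classes.

```ruby
# Hypothetical tree nodes; these are NOT PseudoHikiParser's real classes.
class HeadingNode
  attr_reader :text

  def initialize(text)
    @text = text
  end

  # Double dispatch: the node tells the visitor which method to call.
  def accept(visitor)
    visitor.visit_heading(self)
  end
end

class ParagraphNode
  attr_reader :text

  def initialize(text)
    @text = text
  end

  def accept(visitor)
    visitor.visit_paragraph(self)
  end
end

# One visitor class per output format.
class HtmlVisitor
  def visit_heading(node)
    "<h2>#{node.text}</h2>"
  end

  def visit_paragraph(node)
    "<p>#{node.text}</p>"
  end
end

class PlainTextVisitor
  def visit_heading(node)
    node.text
  end

  def visit_paragraph(node)
    node.text
  end
end

# Walks a flat list of nodes and joins whatever the visitor returns.
def render(tree, visitor)
  tree.map {|node| node.accept(visitor) }.join("\n")
end

TREE = [HeadingNode.new("The first heading"),
        ParagraphNode.new("The first paragraph")]
```

In this sketch the node classes never change when a format is added. Only a new visitor with the same `visit_*` methods is needed.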
data/bin/pseudohiki2html.rb
CHANGED
@@ -22,7 +22,8 @@ OPTIONS = {
|
|
22
22
|
:template => nil,
|
23
23
|
:output => nil,
|
24
24
|
:force => false,
|
25
|
-
:toc => nil
|
25
|
+
:toc => nil,
|
26
|
+
:split_main_heading => false
|
26
27
|
}
|
27
28
|
|
28
29
|
ENCODING_REGEXP = {
|
@@ -37,7 +38,7 @@ HTML_VERSIONS = %w(html4 xhtml1 html5)
|
|
37
38
|
FILE_HEADER_PAT = /^(\xef\xbb\xbf)?\/\//
|
38
39
|
WRITTEN_OPTION_PAT = {}
|
39
40
|
OPTIONS.keys.each {|opt| WRITTEN_OPTION_PAT[opt] = /^(\xef\xbb\xbf)?\/\/#{opt}:\s*(.*)$/ }
|
40
|
-
HEADING_WITH_ID_PAT = /^(!{2,3})\[([A-Za-z][0-9A-Za-z_\-.:]*)\]
|
41
|
+
HEADING_WITH_ID_PAT = /^(!{2,3})\[([A-Za-z][0-9A-Za-z_\-.:]*)\]\s*/o
|
41
42
|
|
42
43
|
PlainFormat = PlainTextFormat.create
|
43
44
|
|
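The hunk above appends `\s*` to `HEADING_WITH_ID_PAT`, so whitespace after the closing bracket is consumed by the match. A small sketch of how the pattern splits a heading line is given below. The helper `parse_heading_with_id` is hypothetical; the upcasing of the id mirrors the `m[2].upcase` context line in `create_table_of_contents`.

```ruby
# Same pattern as the added line in the hunk.
HEADING_WITH_ID_PAT = /^(!{2,3})\[([A-Za-z][0-9A-Za-z_\-.:]*)\]\s*/o

# Hypothetical helper: returns [depth, id, title], or nil when the line
# is not a level-2 or level-3 heading with an id.
def parse_heading_with_id(line)
  m = HEADING_WITH_ID_PAT.match(line)
  return nil unless m
  # post_match is the text after the id and any whitespace following it.
  [m[1].length, m[2].upcase, m.post_match.chomp]
end
```

With the trailing `\s*`, `"!![heading_id] heading"` and `"!![heading_id]heading"` yield the same title text.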
@@ -46,7 +47,12 @@ class InputManager
|
|
46
47
|
@formatter ||= OPTIONS.html_template.new
|
47
48
|
end
|
48
49
|
|
50
|
+
def to_plain(line)
|
51
|
+
PlainFormat.format(BlockParser.parse(line.lines.to_a)).to_s.chomp
|
52
|
+
end
|
53
|
+
|
49
54
|
def create_table_of_contents(lines)
|
55
|
+
return "" unless OPTIONS[:toc]
|
50
56
|
toc_lines = lines.grep(HEADING_WITH_ID_PAT).map do |line|
|
51
57
|
m = HEADING_WITH_ID_PAT.match(line)
|
52
58
|
heading_depth, id = m[1].length, m[2].upcase
|
@@ -55,7 +61,15 @@ class InputManager
|
|
55
61
|
OPTIONS.formatter.format(BlockParser.parse(toc_lines))
|
56
62
|
end
|
57
63
|
|
58
|
-
def
|
64
|
+
def split_main_heading(input_lines)
|
65
|
+
return "" unless OPTIONS[:split_main_heading]
|
66
|
+
h1_pos = input_lines.find_index {|line| /^![^!]/o =~ line }
|
67
|
+
return "" unless h1_pos
|
68
|
+
tree = BlockParser.parse([input_lines.delete_at(h1_pos)])
|
69
|
+
OPTIONS.formatter.format(tree)
|
70
|
+
end
|
71
|
+
|
72
|
+
def create_main(toc, body, h1)
|
59
73
|
return nil unless OPTIONS[:toc]
|
60
74
|
toc_container = formatter.create_element("section").tap do |element|
|
61
75
|
element["id"] = "toc"
|
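The new `split_main_heading` method locates the first h1 line with `/^![^!]/`, removes it from the input with `delete_at`, and formats it separately. The line-handling part can be sketched without the parser as below. `extract_main_heading` is a hypothetical stand-in that returns the raw line instead of formatted HTML.

```ruby
# Hypothetical stand-in for the script's split_main_heading: it returns the
# raw h1 line (or nil) so that no parser or formatter is required.
def extract_main_heading(input_lines)
  # A single "!" followed by any character other than "!" marks an h1.
  h1_pos = input_lines.find_index {|line| /^![^!]/ =~ line }
  return nil unless h1_pos
  # delete_at mutates input_lines and returns the removed line.
  input_lines.delete_at(h1_pos)
end
```

Because `delete_at` mutates the array, the heading is no longer present when the remaining lines are later passed to the body formatter, which is why `compose_html` calls `split_main_heading` first.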
@@ -68,6 +82,7 @@ class InputManager
|
|
68
82
|
end
|
69
83
|
main = formatter.create_element("section").tap do |element|
|
70
84
|
element["id"] = "main"
|
85
|
+
element.push h1 unless h1.empty?
|
71
86
|
element.push toc_container
|
72
87
|
element.push contents_container
|
73
88
|
end
|
@@ -88,11 +103,12 @@ class InputManager
|
|
88
103
|
end
|
89
104
|
|
90
105
|
def compose_html(input_lines)
|
106
|
+
h1 = split_main_heading(input_lines)
|
91
107
|
css = OPTIONS[:css]
|
92
108
|
toc = create_table_of_contents(input_lines)
|
93
109
|
body = compose_body(input_lines)
|
94
110
|
title = OPTIONS.title
|
95
|
-
main = create_main(toc,body)
|
111
|
+
main = create_main(toc,body, h1)
|
96
112
|
|
97
113
|
if OPTIONS[:template]
|
98
114
|
erb = ERB.new(OPTIONS.read_template_file)
|
@@ -107,10 +123,6 @@ class InputManager
|
|
107
123
|
end
|
108
124
|
end
|
109
125
|
|
110
|
-
def to_plain(line)
|
111
|
-
PlainFormat.format(BlockParser.parse(line.lines.to_a)).to_s.chomp
|
112
|
-
end
|
113
|
-
|
114
126
|
def win32?
|
115
127
|
true if RUBY_PLATFORM =~ /win/i
|
116
128
|
end
|
@@ -228,7 +240,7 @@ end
|
|
228
240
|
OptionParser.new("** Convert texts written in a Hiki-like notation into HTML **
|
229
241
|
USAGE: #{File.basename(__FILE__)} [options]") do |opt|
|
230
242
|
opt.on("-h [html_version]", "--html_version [=html_version]",
|
231
|
-
"HTML version to be used. Choose html4 or
|
243
|
+
"HTML version to be used. Choose html4, xhtml1 or html5 (default: #{OPTIONS[:html_version]})") do |version|
|
232
244
|
OPTIONS.set_html_version(version)
|
233
245
|
end
|
234
246
|
|
@@ -254,7 +266,7 @@ USAGE: #{File.basename(__FILE__)} [options]") do |opt|
|
|
254
266
|
end
|
255
267
|
|
256
268
|
opt.on("-C [path_to_css_file]", "--embed-css [=path_to_css_file]",
|
257
|
-
"Set the path to a css file to
|
269
|
+
"Set the path to a css file to embed (default: not to embed)") do |path_to_css_file|
|
258
270
|
OPTIONS[:embed_css] = path_to_css_file
|
259
271
|
end
|
260
272
|
|
@@ -284,6 +296,11 @@ USAGE: #{File.basename(__FILE__)} [options]") do |opt|
|
|
284
296
|
OPTIONS[:toc] = toc_title
|
285
297
|
end
|
286
298
|
|
299
|
+
opt.on("-s", "--split-main-heading",
|
300
|
+
"Split the first h1 element") do |should_be_split|
|
301
|
+
OPTIONS[:split_main_heading] = should_be_split
|
302
|
+
end
|
303
|
+
|
287
304
|
opt.parse!
|
288
305
|
end
|
289
306
|
|
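The `-s` / `--split-main-heading` switch added above takes no argument, so `OptionParser` passes `true` to its block when the flag is present. A self-contained sketch is shown below. The wrapper `parse_split_option` and its local `options` hash are hypothetical; the real script stores the value in its global `OPTIONS` object.

```ruby
require 'optparse'

# Hypothetical wrapper for illustration only.
def parse_split_option(argv)
  options = { :split_main_heading => false }
  OptionParser.new do |opt|
    opt.on("-s", "--split-main-heading",
           "Split the first h1 element") do |should_be_split|
      # For a switch without an argument, the block receives true.
      options[:split_main_heading] = should_be_split
    end
    opt.parse!(argv)
  end
  options
end
```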
@@ -304,7 +321,7 @@ when 1
|
|
304
321
|
OPTIONS.read_input_filename(ARGV[0])
|
305
322
|
end
|
306
323
|
|
307
|
-
input_lines = ARGF.
|
324
|
+
input_lines = ARGF.readlines
|
308
325
|
|
309
326
|
OPTIONS.set_options_from_input_file(input_lines)
|
310
327
|
OPTIONS.default_title = OPTIONS.input_file_basename
|
data/lib/htmlelement.rb
CHANGED
data/lib/pseudohiki/blockparser.rb
CHANGED
@@ -311,14 +311,12 @@ module PseudoHiki
|
|
311
311
|
@stack.current_node.breakable?(breaker)
|
312
312
|
end
|
313
313
|
|
314
|
+
def in_link_tag?(preceding_str)
|
315
|
+
preceding_str[-2,2] == "[[" or preceding_str[-1,1] == "|"
|
316
|
+
end
|
317
|
+
|
314
318
|
def tagfy_link(line)
|
315
|
-
line.gsub(URI_RE)
|
316
|
-
unless ($`)[-2,2] == "[[" or ($`)[-1,1] == "|"
|
317
|
-
"[[#{url}]]"
|
318
|
-
else
|
319
|
-
url
|
320
|
-
end
|
321
|
-
end
|
319
|
+
line.gsub(URI_RE) {|url| in_link_tag?($`) ? url : "[[#{url}]]" }
|
322
320
|
end
|
323
321
|
|
324
322
|
def select_leaf_type(line)
|
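The refactored `tagfy_link` wraps bare URLs in `[[...]]` unless the text just before the match shows the URL is already inside a link tag. The sketch below reproduces that logic in isolation. `URL_PAT` is a simplified, hypothetical pattern, since the library's `URI_RE` is not shown in this diff.

```ruby
# Simplified, hypothetical URL pattern; NOT the library's URI_RE.
URL_PAT = /https?:\/\/[^\s\]|]+/

# True when the text before the URL ends with "[[" or "|",
# i.e. the URL is already part of a link tag.
def in_link_tag?(preceding_str)
  preceding_str[-2, 2] == "[[" or preceding_str[-1, 1] == "|"
end

def tagfy_link(line)
  # $` holds the text preceding the current match inside the gsub block.
  line.gsub(URL_PAT) {|url| in_link_tag?($`) ? url : "[[#{url}]]" }
end
```

Extracting the predicate into `in_link_tag?` is what lets the block collapse to a single ternary expression.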
data/lib/pseudohiki/inlineparser.rb
CHANGED
@@ -142,21 +142,15 @@ module PseudoHiki
|
|
142
142
|
end
|
143
143
|
|
144
144
|
def push(token)
|
145
|
-
|
146
|
-
|
147
|
-
else
|
148
|
-
super(token)
|
149
|
-
end
|
145
|
+
return super(token) unless self.empty?
|
146
|
+
super(parse_first_token(token))
|
150
147
|
end
|
151
148
|
end
|
152
149
|
|
153
150
|
def treated_as_node_end(token)
|
154
|
-
|
155
|
-
|
156
|
-
|
157
|
-
end
|
158
|
-
|
159
|
-
super(token)
|
151
|
+
return super(token) unless token == TableSep
|
152
|
+
self.pop
|
153
|
+
self.push TableCellNode.new
|
160
154
|
end
|
161
155
|
|
162
156
|
def parse
|
data/lib/pseudohiki/treestack.rb
CHANGED
data/lib/pseudohiki/version.rb
CHANGED
data/test/test_plaintextformat.rb
CHANGED
@@ -64,6 +64,16 @@ TEXT
|
|
64
64
|
@verbose_formatter.format(tree).to_s)
|
65
65
|
end
|
66
66
|
|
67
|
+
def test_link_url2
|
68
|
+
text = <<TEXT
|
69
|
+
!![develepment_status] Development status of features from the original [[Hiki notation|http://hikiwiki.org/en/TextFormattingRules.html]]
|
70
|
+
TEXT
|
71
|
+
tree = BlockParser.parse(text.lines.to_a)
|
72
|
+
assert_equal(" Development status of features from the original Hiki notation\n", @formatter.format(tree).to_s)
|
73
|
+
assert_equal(" Development status of features from the original Hiki notation (http://hikiwiki.org/en/TextFormattingRules.html)\n",
|
74
|
+
@verbose_formatter.format(tree).to_s)
|
75
|
+
end
|
76
|
+
|
67
77
|
def test_link_image
|
68
78
|
text = <<TEXT
|
69
79
|
A test string with an [[image|image.jpg]] is here.
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: pseudohikiparser
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0.0.
|
4
|
+
version: 0.0.0.5.develop
|
5
5
|
prerelease: 8
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -9,7 +9,7 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date: 2013-
|
12
|
+
date: 2013-10-19 00:00:00.000000000 Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: bundler
|
@@ -52,6 +52,8 @@ executables:
|
|
52
52
|
extensions: []
|
53
53
|
extra_rdoc_files: []
|
54
54
|
files:
|
55
|
+
- README.md
|
56
|
+
- LICENSE
|
55
57
|
- lib/pseudohikiparser.rb
|
56
58
|
- lib/pseudohiki/treestack.rb
|
57
59
|
- lib/pseudohiki/inlineparser.rb
|
@@ -71,9 +73,9 @@ files:
|
|
71
73
|
- test/test_htmlformat.rb
|
72
74
|
- test/test_htmlplugin.rb
|
73
75
|
- bin/pseudohiki2html.rb
|
74
|
-
homepage: https://github.com/
|
76
|
+
homepage: https://github.com/nico-hn/PseudoHikiParser/wiki
|
75
77
|
licenses:
|
76
|
-
-
|
78
|
+
- BSD 2-Clause license
|
77
79
|
post_install_message:
|
78
80
|
rdoc_options: []
|
79
81
|
require_paths:
|