pseudohikiparser 0.0.0.4.develop → 0.0.0.5.develop
- data/LICENSE +23 -0
- data/README.md +203 -0
- data/bin/pseudohiki2html.rb +28 -11
- data/lib/htmlelement.rb +1 -3
- data/lib/pseudohiki/blockparser.rb +5 -7
- data/lib/pseudohiki/inlineparser.rb +5 -11
- data/lib/pseudohiki/treestack.rb +1 -1
- data/lib/pseudohiki/version.rb +1 -1
- data/test/test_plaintextformat.rb +10 -0
- metadata +6 -4
data/LICENSE
ADDED
@@ -0,0 +1,23 @@
|
|
1
|
+
Copyright (c) 2011, HASHIMOTO Naoki
|
2
|
+
All rights reserved.
|
3
|
+
|
4
|
+
Redistribution and use in source and binary forms, with or without modification,
|
5
|
+
are permitted provided that the following conditions are met:
|
6
|
+
|
7
|
+
* Redistributions of source code must retain the above copyright notice, this
|
8
|
+
list of conditions and the following disclaimer.
|
9
|
+
|
10
|
+
* Redistributions in binary form must reproduce the above copyright notice, this
|
11
|
+
list of conditions and the following disclaimer in the documentation and/or
|
12
|
+
other materials provided with the distribution.
|
13
|
+
|
14
|
+
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
|
15
|
+
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
16
|
+
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
|
17
|
+
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
|
18
|
+
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
|
19
|
+
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
|
20
|
+
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
|
21
|
+
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
22
|
+
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
23
|
+
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
data/README.md
ADDED
@@ -0,0 +1,203 @@
|
|
1
|
+
PseudoHikiParser
|
2
|
+
================
|
3
|
+
|
4
|
+
PseudoHikiParser converts texts written in a [Hiki](http://hikiwiki.org/en/)-like notation into HTML or other formats.
|
5
|
+
|
6
|
+
Currently, only a limited range of notations can be converted into HTML4 or XHTML1.0.
|
7
|
+
|
8
|
+
I am writing this tool with the following objectives in mind:
|
9
|
+
|
10
|
+
* provide some additional features that do not exist in the original Hiki notation
|
11
|
+
* make the notation more line oriented
|
12
|
+
* allow ids to be assigned to elements such as headings
|
13
|
+
* support several formats other than HTML
|
14
|
+
* The visitor pattern is adopted for the implementation, so you only have to add a visitor class to support a certain format.
|
15
|
+
|
16
|
+
As a result, the notation will not be compatible with the original Hiki notation.
|
17
|
+
|
18
|
+
## License
|
19
|
+
|
20
|
+
BSD 2-Clause License
|
21
|
+
|
22
|
+
## Installation
|
23
|
+
|
24
|
+
```
|
25
|
+
gem install pseudohikiparser --pre
|
26
|
+
```
|
27
|
+
|
28
|
+
|
29
|
+
## Usage
|
30
|
+
|
31
|
+
### Samples
|
32
|
+
|
33
|
+
[A sample text](https://github.com/nico-hn/PseudoHikiParser/blob/develop/samples/wikipage.txt) in Hiki notation and [a result of conversion](http://htmlpreview.github.com/?https://github.com/nico-hn/PseudoHikiParser/blob/develop/samples/wikipage.html), and [another result of conversion](http://htmlpreview.github.com/?https://github.com/nico-hn/PseudoHikiParser/blob/develop/samples/wikipage_with_toc.html)
|
34
|
+
|
35
|
+
You will find those samples in the [develop branch](https://github.com/nico-hn/PseudoHikiParser/tree/develop/samples).
|
36
|
+
|
37
|
+
|
38
|
+
### pseudohiki2html.rb
|
39
|
+
|
40
|
+
After installing PseudoHikiParser, you can use the command _pseudohiki2html.rb_.
|
41
|
+
|
42
|
+
_Please note that pseudohiki2html.rb is currently provided as a showcase of PseudoHikiParser, and its options are likely to keep changing at this stage of development._
|
43
|
+
|
44
|
+
Typing the following lines at the command prompt:
|
45
|
+
|
46
|
+
```
|
47
|
+
pseudohiki2html.rb <<TEXT
|
48
|
+
!! The first heading
|
49
|
+
The first paragraph
|
50
|
+
TEXT
|
51
|
+
```
|
52
|
+
will return the following result to stdout:
|
53
|
+
|
54
|
+
```html
|
55
|
+
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
|
56
|
+
"http://www.w3.org/TR/html4/loose.dtd">
|
57
|
+
<html lang="en">
|
58
|
+
<head>
|
59
|
+
<meta content="en" http-equiv="Content-Language">
|
60
|
+
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
|
61
|
+
<meta content="text/javascript" http-equiv="Content-Script-Type">
|
62
|
+
<title>-</title>
|
63
|
+
<link href="default.css" rel="stylesheet" type="text/css">
|
64
|
+
</head>
|
65
|
+
<body>
|
66
|
+
<div class="section h2">
|
67
|
+
<h2> The first heading
|
68
|
+
</h2>
|
69
|
+
<p>
|
70
|
+
The first paragraph
|
71
|
+
</p>
|
72
|
+
<!-- end of section h2 -->
|
73
|
+
</div>
|
74
|
+
</body>
|
75
|
+
</html>
|
76
|
+
```
|
77
|
+
And if you specify a file name with the `--output` option:
|
78
|
+
|
79
|
+
```
|
80
|
+
pseudohiki2html.rb --output first_example.html <<TEXT
|
81
|
+
!! The first heading
|
82
|
+
The first paragraph
|
83
|
+
TEXT
|
84
|
+
```
|
85
|
+
the result will be saved in first_example.html.
|
86
|
+
|
87
|
+
For more options, please try `pseudohiki2html.rb --help`.
|
88
|
+
|
89
|
+
### module PseudoHiki
|
90
|
+
|
91
|
+
If you save the lines below as a ruby script and execute it:
|
92
|
+
|
93
|
+
```
|
94
|
+
#!/usr/bin/env ruby
|
95
|
+
|
96
|
+
require 'pseudohikiparser'
|
97
|
+
|
98
|
+
plain = <<TEXT
|
99
|
+
!! The first heading
|
100
|
+
The first paragraph
|
101
|
+
TEXT
|
102
|
+
|
103
|
+
tree = PseudoHiki::BlockParser.parse(plain.lines.to_a)
|
104
|
+
html = PseudoHiki::HtmlFormat.format(tree)
|
105
|
+
puts html
|
106
|
+
```
|
107
|
+
you will get the following output:
|
108
|
+
|
109
|
+
```
|
110
|
+
<div class="section h2">
|
111
|
+
<h2> The first heading
|
112
|
+
</h2>
|
113
|
+
<p>
|
114
|
+
The first paragraph
|
115
|
+
</p>
|
116
|
+
<!-- end of section h2 -->
|
117
|
+
</div>
|
118
|
+
```
|
119
|
+
|
120
|
+
Other than PseudoHiki::HtmlFormat, you can choose PseudoHiki::XhtmlFormat, PseudoHiki::Xhtml5Format, or PseudoHiki::PlainTextFormat.
|
121
|
+
|
122
|
+
## Development status of features from the original [Hiki notation](http://hikiwiki.org/en/TextFormattingRules.html)
|
123
|
+
|
124
|
+
* Paragraphs - Usable
|
125
|
+
* Links
|
126
|
+
* WikiNames - Not supported (and would never be)
|
127
|
+
* Linking to other Wiki pages - Not supported as well
|
128
|
+
* Linking to an arbitrary URL - Maybe usable
|
129
|
+
* Preformatted text - Usable
|
130
|
+
* Text decoration - Partly supported
|
131
|
+
* Currently, there is no means of escaping tags for inline decorations.
|
132
|
+
* The notation with backquote tags (``) for inline literals is not supported.
|
133
|
+
* Headings - Usable
|
134
|
+
* Horizontal lines - Usable
|
135
|
+
* Lists - Usable
|
136
|
+
* Quotations - Usable
|
137
|
+
* Definitions - Usable
|
138
|
+
* Tables - Usable
|
139
|
+
* Comments - Usable
|
140
|
+
* Plugins - Not supported (and will not be compatible with the original one)
|
141
|
+
|
142
|
+
## Additional Features
|
143
|
+
### Already Implemented
|
144
|
+
#### Assigning ids
|
145
|
+
If you add [name_of_id] just after the marks that denote a heading or a list item, it becomes the id attribute (in upper case) of the resulting HTML element. Below is an example.
|
146
|
+
|
147
|
+
```
|
148
|
+
!![heading_id]heading
|
149
|
+
|
150
|
+
*[list_id]list
|
151
|
+
```
|
152
|
+
will be rendered as
|
153
|
+
|
154
|
+
```html
|
155
|
+
<div class="section h2">
|
156
|
+
<h2 id="HEADING_ID">heading
|
157
|
+
</h2>
|
158
|
+
<ul>
|
159
|
+
<li id="LIST_ID">list
|
160
|
+
</li>
|
161
|
+
</ul>
|
162
|
+
<!-- end of section h2 -->
|
163
|
+
</div>
|
164
|
+
```
|
165
|
+
|
166
|
+
### Partly Implemented
|
167
|
+
#### A visitor that removes markups and returns plain texts
|
168
|
+
The visitor, [PlainTextFormat](https://github.com/nico-hn/PseudoHikiParser/blob/develop/lib/pseudohiki/plaintextformat.rb), is currently available only in the [develop branch](https://github.com/nico-hn/PseudoHikiParser/tree/develop). Below are examples:
|
169
|
+
|
170
|
+
```
|
171
|
+
:tel:03-xxxx-xxxx
|
172
|
+
::03-yyyy-yyyy
|
173
|
+
:fax:03-xxxx-xxxx
|
174
|
+
```
|
175
|
+
will be rendered as
|
176
|
+
|
177
|
+
```
|
178
|
+
tel: 03-xxxx-xxxx
|
179
|
+
03-yyyy-yyyy
|
180
|
+
fax: 03-xxxx-xxxx
|
181
|
+
```
|
182
|
+
|
183
|
+
And
|
184
|
+
|
185
|
+
```
|
186
|
+
||cell 1-1||>>cell 1-2,3,4||cell 1-5
|
187
|
+
||cell 2-1||^>cell 2-2,3 3-2,3||cell 2-4||cell 2-5
|
188
|
+
||cell 3-1||cell 3-4||cell 3-5
|
189
|
+
||cell 4-1||cell 4-2||cell 4-3||cell 4-4||cell 4-5
|
190
|
+
```
|
191
|
+
will be rendered as
|
192
|
+
|
193
|
+
```
|
194
|
+
cell 1-1 cell 1-2,3,4 == == cell 1-5
|
195
|
+
cell 2-1 cell 2-2,3 3-2,3 == cell 2-4 cell 2-5
|
196
|
+
cell 3-1 || || cell 3-4 cell 3-5
|
197
|
+
cell 4-1 cell 4-2 cell 4-3 cell 4-4 cell 4-5
|
198
|
+
```
|
199
|
+
#### A visitor for HTML5
|
200
|
+
The visitor, [Xhtml5Format](https://github.com/nico-hn/PseudoHikiParser/blob/develop/lib/pseudohiki/htmlformat.rb#L225) is currently available only in the [develop branch](https://github.com/nico-hn/PseudoHikiParser/tree/develop).
|
201
|
+
|
202
|
+
|
203
|
+
### Not Implemented Yet
|
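Note on the README above: it says that the visitor pattern is adopted, so that supporting a new output format only requires adding a visitor class. The sketch below illustrates that idea in isolation. All class and method names in it (`HeadingNode`, `ParagraphNode`, `HtmlVisitor`, `PlainVisitor`, `render`) are hypothetical and are not the gem's actual classes; it only shows why a new format needs no change to the node classes.

```ruby
# Hypothetical node classes for illustration only; NOT PseudoHikiParser's API.
class HeadingNode
  attr_reader :text
  def initialize(text); @text = text; end
  # Each node dispatches to the visitor method for its own type.
  def accept(visitor); visitor.visit_heading(self); end
end

class ParagraphNode
  attr_reader :text
  def initialize(text); @text = text; end
  def accept(visitor); visitor.visit_paragraph(self); end
end

# One visitor per output format (both are hypothetical).
class HtmlVisitor
  def visit_heading(node); "<h2>#{node.text}</h2>"; end
  def visit_paragraph(node); "<p>#{node.text}</p>"; end
end

class PlainVisitor
  def visit_heading(node); node.text; end
  def visit_paragraph(node); node.text; end
end

# Walk a flat list of nodes with the chosen visitor.
def render(tree, visitor)
  tree.map {|node| node.accept(visitor) }.join("\n")
end

tree = [HeadingNode.new("The first heading"),
        ParagraphNode.new("The first paragraph")]
puts render(tree, HtmlVisitor.new)
puts render(tree, PlainVisitor.new)
```

Adding a third format in this sketch would mean writing one more visitor class with the same two methods, which is the property the README describes.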
data/bin/pseudohiki2html.rb
CHANGED
@@ -22,7 +22,8 @@ OPTIONS = {
|
|
22
22
|
:template => nil,
|
23
23
|
:output => nil,
|
24
24
|
:force => false,
|
25
|
-
:toc => nil
|
25
|
+
:toc => nil,
|
26
|
+
:split_main_heading => false
|
26
27
|
}
|
27
28
|
|
28
29
|
ENCODING_REGEXP = {
|
@@ -37,7 +38,7 @@ HTML_VERSIONS = %w(html4 xhtml1 html5)
|
|
37
38
|
FILE_HEADER_PAT = /^(\xef\xbb\xbf)?\/\//
|
38
39
|
WRITTEN_OPTION_PAT = {}
|
39
40
|
OPTIONS.keys.each {|opt| WRITTEN_OPTION_PAT[opt] = /^(\xef\xbb\xbf)?\/\/#{opt}:\s*(.*)$/ }
|
40
|
-
HEADING_WITH_ID_PAT = /^(!{2,3})\[([A-Za-z][0-9A-Za-z_\-.:]*)\]
|
41
|
+
HEADING_WITH_ID_PAT = /^(!{2,3})\[([A-Za-z][0-9A-Za-z_\-.:]*)\]\s*/o
|
41
42
|
|
42
43
|
PlainFormat = PlainTextFormat.create
|
43
44
|
|
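Note on the hunk above: the new `HEADING_WITH_ID_PAT` appends `\s*` to the pattern, presumably so that whitespace between the `[id]` and the heading text is consumed together with the marker. The sketch below copies the added pattern and applies it to sample lines. The helper `heading_info` is hypothetical and only illustrates what the capture groups contain.

```ruby
# Pattern copied from the added line in the hunk above.
HEADING_WITH_ID_PAT = /^(!{2,3})\[([A-Za-z][0-9A-Za-z_\-.:]*)\]\s*/o

# Hypothetical helper (not part of the gem): returns the heading depth,
# the upper-cased id and the remaining title, or nil if the line has no id.
def heading_info(line)
  m = HEADING_WITH_ID_PAT.match(line)
  return nil unless m
  { :depth => m[1].length, :id => m[2].upcase, :title => m.post_match.chomp }
end

p heading_info("!![heading_id] heading\n")
p heading_info("! a heading without an id\n")
```

Because the trailing `\s*` is part of the match, `post_match` starts at the first character of the title rather than at the space after the closing bracket.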
@@ -46,7 +47,12 @@ class InputManager
|
|
46
47
|
@formatter ||= OPTIONS.html_template.new
|
47
48
|
end
|
48
49
|
|
50
|
+
def to_plain(line)
|
51
|
+
PlainFormat.format(BlockParser.parse(line.lines.to_a)).to_s.chomp
|
52
|
+
end
|
53
|
+
|
49
54
|
def create_table_of_contents(lines)
|
55
|
+
return "" unless OPTIONS[:toc]
|
50
56
|
toc_lines = lines.grep(HEADING_WITH_ID_PAT).map do |line|
|
51
57
|
m = HEADING_WITH_ID_PAT.match(line)
|
52
58
|
heading_depth, id = m[1].length, m[2].upcase
|
@@ -55,7 +61,15 @@ class InputManager
|
|
55
61
|
OPTIONS.formatter.format(BlockParser.parse(toc_lines))
|
56
62
|
end
|
57
63
|
|
58
|
-
def
|
64
|
+
def split_main_heading(input_lines)
|
65
|
+
return "" unless OPTIONS[:split_main_heading]
|
66
|
+
h1_pos = input_lines.find_index {|line| /^![^!]/o =~ line }
|
67
|
+
return "" unless h1_pos
|
68
|
+
tree = BlockParser.parse([input_lines.delete_at(h1_pos)])
|
69
|
+
OPTIONS.formatter.format(tree)
|
70
|
+
end
|
71
|
+
|
72
|
+
def create_main(toc, body, h1)
|
59
73
|
return nil unless OPTIONS[:toc]
|
60
74
|
toc_container = formatter.create_element("section").tap do |element|
|
61
75
|
element["id"] = "toc"
|
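Note on `split_main_heading` above: it looks for the first line that starts with a single `!` (an h1 in this notation) and removes it from `input_lines` with `delete_at`. The standalone sketch below reproduces only that lookup and removal, without the parser or formatter. The name `extract_main_heading` is hypothetical.

```ruby
# Standalone sketch of the lookup used in the hunk above.
# Returns the first h1 line (a line starting with exactly one "!")
# and removes it from input_lines; returns nil if there is none.
def extract_main_heading(input_lines)
  h1_pos = input_lines.find_index {|line| /^![^!]/ =~ line }
  return nil unless h1_pos
  input_lines.delete_at(h1_pos) # mutates the caller's array
end

lines = ["!Title\n", "!!Section\n", "paragraph\n"]
h1 = extract_main_heading(lines)
p h1
p lines
```

Since `delete_at` mutates the array in place, the order of calls matters: in `compose_html`, `split_main_heading` runs before `create_table_of_contents` and `compose_body`, so those two no longer see the h1 line.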
@@ -68,6 +82,7 @@ class InputManager
|
|
68
82
|
end
|
69
83
|
main = formatter.create_element("section").tap do |element|
|
70
84
|
element["id"] = "main"
|
85
|
+
element.push h1 unless h1.empty?
|
71
86
|
element.push toc_container
|
72
87
|
element.push contents_container
|
73
88
|
end
|
@@ -88,11 +103,12 @@ class InputManager
|
|
88
103
|
end
|
89
104
|
|
90
105
|
def compose_html(input_lines)
|
106
|
+
h1 = split_main_heading(input_lines)
|
91
107
|
css = OPTIONS[:css]
|
92
108
|
toc = create_table_of_contents(input_lines)
|
93
109
|
body = compose_body(input_lines)
|
94
110
|
title = OPTIONS.title
|
95
|
-
main = create_main(toc,body)
|
111
|
+
main = create_main(toc,body, h1)
|
96
112
|
|
97
113
|
if OPTIONS[:template]
|
98
114
|
erb = ERB.new(OPTIONS.read_template_file)
|
@@ -107,10 +123,6 @@ class InputManager
|
|
107
123
|
end
|
108
124
|
end
|
109
125
|
|
110
|
-
def to_plain(line)
|
111
|
-
PlainFormat.format(BlockParser.parse(line.lines.to_a)).to_s.chomp
|
112
|
-
end
|
113
|
-
|
114
126
|
def win32?
|
115
127
|
true if RUBY_PLATFORM =~ /win/i
|
116
128
|
end
|
@@ -228,7 +240,7 @@ end
|
|
228
240
|
OptionParser.new("** Convert texts written in a Hiki-like notation into HTML **
|
229
241
|
USAGE: #{File.basename(__FILE__)} [options]") do |opt|
|
230
242
|
opt.on("-h [html_version]", "--html_version [=html_version]",
|
231
|
-
"HTML version to be used. Choose html4 or
|
243
|
+
"HTML version to be used. Choose html4, xhtml1 or html5 (default: #{OPTIONS[:html_version]})") do |version|
|
232
244
|
OPTIONS.set_html_version(version)
|
233
245
|
end
|
234
246
|
|
@@ -254,7 +266,7 @@ USAGE: #{File.basename(__FILE__)} [options]") do |opt|
|
|
254
266
|
end
|
255
267
|
|
256
268
|
opt.on("-C [path_to_css_file]", "--embed-css [=path_to_css_file]",
|
257
|
-
"Set the path to a css file to
|
269
|
+
"Set the path to a css file to embed (default: not to embed)") do |path_to_css_file|
|
258
270
|
OPTIONS[:embed_css] = path_to_css_file
|
259
271
|
end
|
260
272
|
|
@@ -284,6 +296,11 @@ USAGE: #{File.basename(__FILE__)} [options]") do |opt|
|
|
284
296
|
OPTIONS[:toc] = toc_title
|
285
297
|
end
|
286
298
|
|
299
|
+
opt.on("-s", "--split-main-heading",
|
300
|
+
"Split the first h1 element") do |should_be_split|
|
301
|
+
OPTIONS[:split_main_heading] = should_be_split
|
302
|
+
end
|
303
|
+
|
287
304
|
opt.parse!
|
288
305
|
end
|
289
306
|
|
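Note on the new `-s` / `--split-main-heading` option above: it is declared without an argument, so `OptionParser` passes `true` to the block when the switch is present. The sketch below isolates that behaviour with Ruby's standard `optparse` library. The wrapper `parse_options` and the local `options` hash are hypothetical stand-ins for the script's `OPTIONS` object.

```ruby
require 'optparse'

# Hypothetical wrapper around the switch added in the hunk above.
def parse_options(argv)
  options = { :split_main_heading => false } # same default as in OPTIONS
  OptionParser.new do |opt|
    opt.on("-s", "--split-main-heading",
           "Split the first h1 element") do |should_be_split|
      # For a switch without an argument, should_be_split is true.
      options[:split_main_heading] = should_be_split
    end
    opt.parse!(argv)
  end
  options
end

p parse_options(["-s"])
p parse_options([])
```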
@@ -304,7 +321,7 @@ when 1
|
|
304
321
|
OPTIONS.read_input_filename(ARGV[0])
|
305
322
|
end
|
306
323
|
|
307
|
-
input_lines = ARGF.
|
324
|
+
input_lines = ARGF.readlines
|
308
325
|
|
309
326
|
OPTIONS.set_options_from_input_file(input_lines)
|
310
327
|
OPTIONS.default_title = OPTIONS.input_file_basename
|
data/lib/htmlelement.rb
CHANGED
data/lib/pseudohiki/blockparser.rb
CHANGED
@@ -311,14 +311,12 @@ module PseudoHiki
|
|
311
311
|
@stack.current_node.breakable?(breaker)
|
312
312
|
end
|
313
313
|
|
314
|
+
def in_link_tag?(preceding_str)
|
315
|
+
preceding_str[-2,2] == "[[" or preceding_str[-1,1] == "|"
|
316
|
+
end
|
317
|
+
|
314
318
|
def tagfy_link(line)
|
315
|
-
line.gsub(URI_RE)
|
316
|
-
unless ($`)[-2,2] == "[[" or ($`)[-1,1] == "|"
|
317
|
-
"[[#{url}]]"
|
318
|
-
else
|
319
|
-
url
|
320
|
-
end
|
321
|
-
end
|
319
|
+
line.gsub(URI_RE) {|url| in_link_tag?($`) ? url : "[[#{url}]]" }
|
322
320
|
end
|
323
321
|
|
324
322
|
def select_leaf_type(line)
|
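Note on the `tagfy_link` refactoring above: the new one-liner wraps a bare URL in `[[...]]` unless the text just before the match already ends with `[[` or `|`, that is, unless the URL is already inside a link tag. The sketch below runs the two added methods on their own. `URI_RE_SKETCH` is a hypothetical, simplified stand-in for the gem's `URI_RE`, whose definition is not shown in this diff.

```ruby
# Hypothetical, simplified stand-in for the gem's URI_RE (not shown in the diff).
URI_RE_SKETCH = %r{https?://[^\s\]|]+}

# Copied from the added lines: true if the text preceding the URL
# ends with "[[" or "|", i.e. the URL is already inside a link tag.
def in_link_tag?(preceding_str)
  preceding_str[-2, 2] == "[[" or preceding_str[-1, 1] == "|"
end

# Same shape as the added one-liner; $` is the text before the current match.
def tagfy_link(line)
  line.gsub(URI_RE_SKETCH) {|url| in_link_tag?($`) ? url : "[[#{url}]]" }
end

puts tagfy_link("see http://example.com now")
puts tagfy_link("[[http://example.com]]")
puts tagfy_link("[[site|http://example.com]]")
```

Only the first line is changed by `tagfy_link` here; the other two already contain link tags and are returned as they are.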
data/lib/pseudohiki/inlineparser.rb
CHANGED
@@ -142,21 +142,15 @@ module PseudoHiki
|
|
142
142
|
end
|
143
143
|
|
144
144
|
def push(token)
|
145
|
-
|
146
|
-
|
147
|
-
else
|
148
|
-
super(token)
|
149
|
-
end
|
145
|
+
return super(token) unless self.empty?
|
146
|
+
super(parse_first_token(token))
|
150
147
|
end
|
151
148
|
end
|
152
149
|
|
153
150
|
def treated_as_node_end(token)
|
154
|
-
|
155
|
-
|
156
|
-
|
157
|
-
end
|
158
|
-
|
159
|
-
super(token)
|
151
|
+
return super(token) unless token == TableSep
|
152
|
+
self.pop
|
153
|
+
self.push TableCellNode.new
|
160
154
|
end
|
161
155
|
|
162
156
|
def parse
|
data/lib/pseudohiki/treestack.rb
CHANGED
data/lib/pseudohiki/version.rb
CHANGED
data/test/test_plaintextformat.rb
CHANGED
@@ -64,6 +64,16 @@ TEXT
|
|
64
64
|
@verbose_formatter.format(tree).to_s)
|
65
65
|
end
|
66
66
|
|
67
|
+
def test_link_url2
|
68
|
+
text = <<TEXT
|
69
|
+
!![develepment_status] Development status of features from the original [[Hiki notation|http://hikiwiki.org/en/TextFormattingRules.html]]
|
70
|
+
TEXT
|
71
|
+
tree = BlockParser.parse(text.lines.to_a)
|
72
|
+
assert_equal(" Development status of features from the original Hiki notation\n", @formatter.format(tree).to_s)
|
73
|
+
assert_equal(" Development status of features from the original Hiki notation (http://hikiwiki.org/en/TextFormattingRules.html)\n",
|
74
|
+
@verbose_formatter.format(tree).to_s)
|
75
|
+
end
|
76
|
+
|
67
77
|
def test_link_image
|
68
78
|
text = <<TEXT
|
69
79
|
A test string with an [[image|image.jpg]] is here.
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: pseudohikiparser
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0.0.
|
4
|
+
version: 0.0.0.5.develop
|
5
5
|
prerelease: 8
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -9,7 +9,7 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date: 2013-
|
12
|
+
date: 2013-10-19 00:00:00.000000000 Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: bundler
|
@@ -52,6 +52,8 @@ executables:
|
|
52
52
|
extensions: []
|
53
53
|
extra_rdoc_files: []
|
54
54
|
files:
|
55
|
+
- README.md
|
56
|
+
- LICENSE
|
55
57
|
- lib/pseudohikiparser.rb
|
56
58
|
- lib/pseudohiki/treestack.rb
|
57
59
|
- lib/pseudohiki/inlineparser.rb
|
@@ -71,9 +73,9 @@ files:
|
|
71
73
|
- test/test_htmlformat.rb
|
72
74
|
- test/test_htmlplugin.rb
|
73
75
|
- bin/pseudohiki2html.rb
|
74
|
-
homepage: https://github.com/
|
76
|
+
homepage: https://github.com/nico-hn/PseudoHikiParser/wiki
|
75
77
|
licenses:
|
76
|
-
-
|
78
|
+
- BSD 2-Clause license
|
77
79
|
post_install_message:
|
78
80
|
rdoc_options: []
|
79
81
|
require_paths:
|