pseudohikiparser 0.0.0.4.develop → 0.0.0.5.develop

data/LICENSE ADDED
@@ -0,0 +1,23 @@
1
+ Copyright (c) 2011, HASHIMOTO Naoki
2
+ All rights reserved.
3
+
4
+ Redistribution and use in source and binary forms, with or without modification,
5
+ are permitted provided that the following conditions are met:
6
+
7
+ * Redistributions of source code must retain the above copyright notice, this
8
+ list of conditions and the following disclaimer.
9
+
10
+ * Redistributions in binary form must reproduce the above copyright notice, this
11
+ list of conditions and the following disclaimer in the documentation and/or
12
+ other materials provided with the distribution.
13
+
14
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
15
+ ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
16
+ WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
17
+ DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
18
+ ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
19
+ (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
20
+ LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
21
+ ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
22
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
23
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README.md ADDED
@@ -0,0 +1,203 @@
1
+ PseudoHikiParser
2
+ ================
3
+
4
+ PseudoHikiParser converts texts written in a [Hiki](http://hikiwiki.org/en/)-like notation into HTML or other formats.
5
+
6
+ Currently, only a limited range of notations can be converted into HTML4 or XHTML1.0.
7
+
8
+ I am writing this tool with the following objectives in mind:
9
+
10
+ * provide some additional features that do not exist in the original Hiki notation
11
+ * make the notation more line oriented
12
+ * allow ids to be assigned to elements such as headings
13
+ * support several formats other than HTML
14
+ * The visitor pattern is adopted for the implementation, so you only have to add a visitor class to support a certain format.
15
+
16
+ Note that it will not be compatible with the original Hiki notation.
17
+
18
+ ## License
19
+
20
+ BSD 2-Clause License
21
+
22
+ ## Installation
23
+
24
+ ```
25
+ gem install pseudohikiparser --pre
26
+ ```
27
+
28
+
29
+ ## Usage
30
+
31
+ ### Samples
32
+
33
+ [A sample text](https://github.com/nico-hn/PseudoHikiParser/blob/develop/samples/wikipage.txt) in Hiki notation and [a result of conversion](http://htmlpreview.github.com/?https://github.com/nico-hn/PseudoHikiParser/blob/develop/samples/wikipage.html), and [another result of conversion](http://htmlpreview.github.com/?https://github.com/nico-hn/PseudoHikiParser/blob/develop/samples/wikipage_with_toc.html)
34
+
35
+ You will find those samples in [develop branch](https://github.com/nico-hn/PseudoHikiParser/tree/develop/samples).
36
+
37
+
38
+ ### pseudohiki2html.rb
39
+
40
+ After installing PseudoHikiParser, you can use the command _pseudohiki2html.rb_.
41
+
42
+ _Please note that pseudohiki2html.rb is currently provided as a showcase of PseudoHikiParser, and its options are likely to keep changing at this stage of development._
43
+
44
+ Typing the following lines at the command prompt:
45
+
46
+ ```
47
+ pseudohiki2html.rb <<TEXT
48
+ !! The first heading
49
+ The first paragraph
50
+ TEXT
51
+ ```
52
+ will return the following result to stdout:
53
+
54
+ ```html
55
+ <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
56
+ "http://www.w3.org/TR/html4/loose.dtd">
57
+ <html lang="en">
58
+ <head>
59
+ <meta content="en" http-equiv="Content-Language">
60
+ <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
61
+ <meta content="text/javascript" http-equiv="Content-Script-Type">
62
+ <title>-</title>
63
+ <link href="default.css" rel="stylesheet" type="text/css">
64
+ </head>
65
+ <body>
66
+ <div class="section h2">
67
+ <h2> The first heading
68
+ </h2>
69
+ <p>
70
+ The first paragraph
71
+ </p>
72
+ <!-- end of section h2 -->
73
+ </div>
74
+ </body>
75
+ </html>
76
+ ```
77
+ And if you specify a file name with the `--output` option:
78
+
79
+ ```
80
+ pseudohiki2html.rb --output first_example.html <<TEXT
81
+ !! The first heading
82
+ The first paragraph
83
+ TEXT
84
+ ```
85
+ the result will be saved in first_example.html.
86
+
87
+ For more options, please try `pseudohiki2html.rb --help`
88
+
89
+ ### module PseudoHiki
90
+
91
+ If you save the lines below as a ruby script and execute it:
92
+
93
+ ```
94
+ #!/usr/bin/env ruby
95
+
96
+ require 'pseudohikiparser'
97
+
98
+ plain = <<TEXT
99
+ !! The first heading
100
+ The first paragraph
101
+ TEXT
102
+
103
+ tree = PseudoHiki::BlockParser.parse(plain.lines.to_a)
104
+ html = PseudoHiki::HtmlFormat.format(tree)
105
+ puts html
106
+ ```
107
+ you will get the following output:
108
+
109
+ ```
110
+ <div class="section h2">
111
+ <h2> The first heading
112
+ </h2>
113
+ <p>
114
+ The first paragraph
115
+ </p>
116
+ <!-- end of section h2 -->
117
+ </div>
118
+ ```
119
+
120
+ Besides PseudoHiki::HtmlFormat, you can choose PseudoHiki::XhtmlFormat, PseudoHiki::Xhtml5Format, or PseudoHiki::PlainTextFormat.
121
+
122
+ ## Development status of features from the original [Hiki notation](http://hikiwiki.org/en/TextFormattingRules.html)
123
+
124
+ * Paragraphs - Usable
125
+ * Links
126
+ * WikiNames - Not supported (and would never be)
127
+ * Linking to other Wiki pages - Not supported as well
128
+ * Linking to an arbitrary URL - Maybe usable
129
+ * Preformatted text - Usable
130
+ * Text decoration - Partly supported
131
+ * Currently, there is no means of escaping tags for inline decorations.
132
+ * The notation with backquote tags(``) for inline literals is not supported.
133
+ * Headings - Usable
134
+ * Horizontal lines - Usable
135
+ * Lists - Usable
136
+ * Quotations - Usable
137
+ * Definitions - Usable
138
+ * Tables - Usable
139
+ * Comments - Usable
140
+ * Plugins - Not supported (and will not be compatible with the original one)
141
+
142
+ ## Additional Features
143
+ ### Already Implemented
144
+ #### Assigning ids
145
+ If you add [name_of_id] just after the marks that denote a heading or a list item, it becomes the id attribute of the resulting HTML element, converted to upper case. Below is an example.
146
+
147
+ ```
148
+ !![heading_id]heading
149
+
150
+ *[list_id]list
151
+ ```
152
+ will be rendered as
153
+
154
+ ```html
155
+ <div class="section h2">
156
+ <h2 id="HEADING_ID">heading
157
+ </h2>
158
+ <ul>
159
+ <li id="LIST_ID">list
160
+ </li>
161
+ </ul>
162
+ <!-- end of section h2 -->
163
+ </div>
164
+ ```
165
+
166
+ ### Partly Implemented
167
+ #### A visitor that removes markups and returns plain texts
168
+ The visitor [PlainTextFormat](https://github.com/nico-hn/PseudoHikiParser/blob/develop/lib/pseudohiki/plaintextformat.rb) is currently available only in the [develop branch](https://github.com/nico-hn/PseudoHikiParser/tree/develop). Below are examples:
169
+
170
+ ```
171
+ :tel:03-xxxx-xxxx
172
+ ::03-yyyy-yyyy
173
+ :fax:03-xxxx-xxxx
174
+ ```
175
+ will be rendered as
176
+
177
+ ```
178
+ tel: 03-xxxx-xxxx
179
+ 03-yyyy-yyyy
180
+ fax: 03-xxxx-xxxx
181
+ ```
182
+
183
+ And
184
+
185
+ ```
186
+ ||cell 1-1||>>cell 1-2,3,4||cell 1-5
187
+ ||cell 2-1||^>cell 2-2,3 3-2,3||cell 2-4||cell 2-5
188
+ ||cell 3-1||cell 3-4||cell 3-5
189
+ ||cell 4-1||cell 4-2||cell 4-3||cell 4-4||cell 4-5
190
+ ```
191
+ will be rendered as
192
+
193
+ ```
194
+ cell 1-1 cell 1-2,3,4 == == cell 1-5
195
+ cell 2-1 cell 2-2,3 3-2,3 == cell 2-4 cell 2-5
196
+ cell 3-1 || || cell 3-4 cell 3-5
197
+ cell 4-1 cell 4-2 cell 4-3 cell 4-4 cell 4-5
198
+ ```
199
+ #### A visitor for HTML5
200
+ The visitor [Xhtml5Format](https://github.com/nico-hn/PseudoHikiParser/blob/develop/lib/pseudohiki/htmlformat.rb#L225) is currently available only in the [develop branch](https://github.com/nico-hn/PseudoHikiParser/tree/develop).
201
+
202
+
203
+ ### Not Implemented Yet
data/bin/pseudohiki2html.rb CHANGED
@@ -22,7 +22,8 @@ OPTIONS = {
22
22
  :template => nil,
23
23
  :output => nil,
24
24
  :force => false,
25
- :toc => nil
25
+ :toc => nil,
26
+ :split_main_heading => false
26
27
  }
27
28
 
28
29
  ENCODING_REGEXP = {
@@ -37,7 +38,7 @@ HTML_VERSIONS = %w(html4 xhtml1 html5)
37
38
  FILE_HEADER_PAT = /^(\xef\xbb\xbf)?\/\//
38
39
  WRITTEN_OPTION_PAT = {}
39
40
  OPTIONS.keys.each {|opt| WRITTEN_OPTION_PAT[opt] = /^(\xef\xbb\xbf)?\/\/#{opt}:\s*(.*)$/ }
40
- HEADING_WITH_ID_PAT = /^(!{2,3})\[([A-Za-z][0-9A-Za-z_\-.:]*)\]/o
41
+ HEADING_WITH_ID_PAT = /^(!{2,3})\[([A-Za-z][0-9A-Za-z_\-.:]*)\]\s*/o
41
42
 
42
43
  PlainFormat = PlainTextFormat.create
43
44
 
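The effect of the trailing `\s*` added to `HEADING_WITH_ID_PAT` can be checked in isolation. This is a minimal sketch: the constant names `OLD_PAT` and `NEW_PAT` and the sample line are illustration-only assumptions, and how the script uses the text after the match is not shown in this hunk.

```ruby
# Old and new patterns copied from the hunk above. The /o flag is omitted
# because it has no effect on a pattern without interpolation.
OLD_PAT = /^(!{2,3})\[([A-Za-z][0-9A-Za-z_\-.:]*)\]/
NEW_PAT = /^(!{2,3})\[([A-Za-z][0-9A-Za-z_\-.:]*)\]\s*/

line = "!![intro] Introduction\n"

old_m = OLD_PAT.match(line)
new_m = NEW_PAT.match(line)

# Both patterns capture the heading depth and the id in the same way.
depth, id = new_m[1].length, new_m[2].upcase

# Only the text left after the match differs: the new pattern also
# consumes the whitespace that follows the id.
old_rest = old_m.post_match
new_rest = new_m.post_match
```

With the new pattern, whitespace between `]` and the heading text no longer ends up at the start of the remaining text.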
@@ -46,7 +47,12 @@ class InputManager
46
47
  @formatter ||= OPTIONS.html_template.new
47
48
  end
48
49
 
50
+ def to_plain(line)
51
+ PlainFormat.format(BlockParser.parse(line.lines.to_a)).to_s.chomp
52
+ end
53
+
49
54
  def create_table_of_contents(lines)
55
+ return "" unless OPTIONS[:toc]
50
56
  toc_lines = lines.grep(HEADING_WITH_ID_PAT).map do |line|
51
57
  m = HEADING_WITH_ID_PAT.match(line)
52
58
  heading_depth, id = m[1].length, m[2].upcase
@@ -55,7 +61,15 @@ class InputManager
55
61
  OPTIONS.formatter.format(BlockParser.parse(toc_lines))
56
62
  end
57
63
 
58
- def create_main(toc, body)
64
+ def split_main_heading(input_lines)
65
+ return "" unless OPTIONS[:split_main_heading]
66
+ h1_pos = input_lines.find_index {|line| /^![^!]/o =~ line }
67
+ return "" unless h1_pos
68
+ tree = BlockParser.parse([input_lines.delete_at(h1_pos)])
69
+ OPTIONS.formatter.format(tree)
70
+ end
71
+
72
+ def create_main(toc, body, h1)
59
73
  return nil unless OPTIONS[:toc]
60
74
  toc_container = formatter.create_element("section").tap do |element|
61
75
  element["id"] = "toc"
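The h1 extraction in `split_main_heading` relies on `find_index` plus `delete_at`, which removes the heading from the caller's array. This is a stand-alone sketch of that step only: the method name `extract_main_heading` is hypothetical, and it returns the raw line rather than formatted HTML so that it runs without the gem.

```ruby
# Same pattern as in the hunk above: a single "!" not followed by another "!".
H1_PAT = /^![^!]/

# Hypothetical helper: returns the first h1 line (or nil) and removes it
# from input_lines in place.
def extract_main_heading(input_lines)
  h1_pos = input_lines.find_index { |line| H1_PAT =~ line }
  return nil unless h1_pos
  input_lines.delete_at(h1_pos)
end

lines = ["!Title\n", "!!Section\n", "paragraph\n"]
h1 = extract_main_heading(lines)
```

Because the array is mutated, code that later reads the same `input_lines` no longer sees the h1 line. In `compose_html` this appears to be the intent, since `split_main_heading` is called before the table of contents and the body are built.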
@@ -68,6 +82,7 @@ class InputManager
68
82
  end
69
83
  main = formatter.create_element("section").tap do |element|
70
84
  element["id"] = "main"
85
+ element.push h1 unless h1.empty?
71
86
  element.push toc_container
72
87
  element.push contents_container
73
88
  end
@@ -88,11 +103,12 @@ class InputManager
88
103
  end
89
104
 
90
105
  def compose_html(input_lines)
106
+ h1 = split_main_heading(input_lines)
91
107
  css = OPTIONS[:css]
92
108
  toc = create_table_of_contents(input_lines)
93
109
  body = compose_body(input_lines)
94
110
  title = OPTIONS.title
95
- main = create_main(toc,body)
111
+ main = create_main(toc,body, h1)
96
112
 
97
113
  if OPTIONS[:template]
98
114
  erb = ERB.new(OPTIONS.read_template_file)
@@ -107,10 +123,6 @@ class InputManager
107
123
  end
108
124
  end
109
125
 
110
- def to_plain(line)
111
- PlainFormat.format(BlockParser.parse(line.lines.to_a)).to_s.chomp
112
- end
113
-
114
126
  def win32?
115
127
  true if RUBY_PLATFORM =~ /win/i
116
128
  end
@@ -228,7 +240,7 @@ end
228
240
  OptionParser.new("** Convert texts written in a Hiki-like notation into HTML **
229
241
  USAGE: #{File.basename(__FILE__)} [options]") do |opt|
230
242
  opt.on("-h [html_version]", "--html_version [=html_version]",
231
- "HTML version to be used. Choose html4 or xhtml1 (default: #{OPTIONS[:html_version]})") do |version|
243
+ "HTML version to be used. Choose html4, xhtml1 or html5 (default: #{OPTIONS[:html_version]})") do |version|
232
244
  OPTIONS.set_html_version(version)
233
245
  end
234
246
 
@@ -254,7 +266,7 @@ USAGE: #{File.basename(__FILE__)} [options]") do |opt|
254
266
  end
255
267
 
256
268
  opt.on("-C [path_to_css_file]", "--embed-css [=path_to_css_file]",
257
- "Set the path to a css file to be used (default: not to embed)") do |path_to_css_file|
269
+ "Set the path to a css file to embed (default: not to embed)") do |path_to_css_file|
258
270
  OPTIONS[:embed_css] = path_to_css_file
259
271
  end
260
272
 
@@ -284,6 +296,11 @@ USAGE: #{File.basename(__FILE__)} [options]") do |opt|
284
296
  OPTIONS[:toc] = toc_title
285
297
  end
286
298
 
299
+ opt.on("-s", "--split-main-heading",
300
+ "Split the first h1 element") do |should_be_split|
301
+ OPTIONS[:split_main_heading] = should_be_split
302
+ end
303
+
287
304
  opt.parse!
288
305
  end
289
306
 
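A switch declared without an argument placeholder, like `--split-main-heading` above, passes `true` to its block when it is present. This is a minimal sketch using only the standard `optparse` library; the local `options` hash and the `argv` array stand in for the script's `OPTIONS` and `ARGV` and are assumptions for illustration.

```ruby
require 'optparse'

# Stand-in for the script's OPTIONS hash (assumption for illustration).
options = { :split_main_heading => false }

parser = OptionParser.new do |opt|
  opt.on("-s", "--split-main-heading",
         "Split the first h1 element") do |should_be_split|
    # For an argument-less switch the block receives true.
    options[:split_main_heading] = should_be_split
  end
end

argv = ["-s"]
parser.parse!(argv)  # parse! removes the options it recognizes from argv
```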
@@ -304,7 +321,7 @@ when 1
304
321
  OPTIONS.read_input_filename(ARGV[0])
305
322
  end
306
323
 
307
- input_lines = ARGF.lines.to_a
324
+ input_lines = ARGF.readlines
308
325
 
309
326
  OPTIONS.set_options_from_input_file(input_lines)
310
327
  OPTIONS.default_title = OPTIONS.input_file_basename
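`readlines` already returns an Array with the line terminators kept, so the `to_a` call from the previous version is no longer needed. A minimal sketch, assuming a `StringIO` as a stand-in for `ARGF` so that it runs without stdin or file arguments:

```ruby
require 'stringio'

# StringIO stands in for ARGF here (an assumption for illustration);
# both respond to readlines.
input = StringIO.new("!! The first heading\nThe first paragraph\n")

# Each element keeps its trailing newline, as the block parser expects lines.
input_lines = input.readlines
```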
data/lib/htmlelement.rb CHANGED
@@ -4,9 +4,7 @@ require 'kconv'
4
4
 
5
5
  class HtmlElement
6
6
  class Children < Array
7
- def to_s
8
- self.join
9
- end
7
+ alias to_s join
10
8
  end
11
9
 
12
10
  module CHARSET
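The `alias to_s join` form gives the same result as the removed three-line method when called without arguments. A stand-alone sketch, with `Children` defined at top level rather than inside `HtmlElement` for brevity:

```ruby
# Same definition as in the hunk above, outside of HtmlElement.
class Children < Array
  alias to_s join
end

children = Children.new
children.push "<p>", "text", "</p>"

joined = children.to_s          # concatenates the elements with no separator
interpolated = "#{children}"    # string interpolation goes through to_s as well
```

One difference from the old definition is that the aliased `to_s` also accepts `join`'s optional separator argument, although nothing in this diff relies on it.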
@@ -311,14 +311,12 @@ module PseudoHiki
311
311
  @stack.current_node.breakable?(breaker)
312
312
  end
313
313
 
314
+ def in_link_tag?(preceding_str)
315
+ preceding_str[-2,2] == "[[" or preceding_str[-1,1] == "|"
316
+ end
317
+
314
318
  def tagfy_link(line)
315
- line.gsub(URI_RE) do |url|
316
- unless ($`)[-2,2] == "[[" or ($`)[-1,1] == "|"
317
- "[[#{url}]]"
318
- else
319
- url
320
- end
321
- end
319
+ line.gsub(URI_RE) {|url| in_link_tag?($`) ? url : "[[#{url}]]" }
322
320
  end
323
321
 
324
322
  def select_leaf_type(line)
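The refactored `tagfy_link` can be exercised on its own. In this sketch the two methods are copied from the hunk above, but `URI_RE` is a simplified stand-in: the gem's actual pattern is not shown in this diff, so the one below is an assumption.

```ruby
# Simplified stand-in for the gem's URI_RE (assumption: http/https only).
URI_RE = %r{https?://[A-Za-z0-9/?:@&=+$,\-_.!~*'()#%]+}

# True when the URL is already inside [[...]] or follows the "|" of [[label|url]].
def in_link_tag?(preceding_str)
  preceding_str[-2, 2] == "[[" or preceding_str[-1, 1] == "|"
end

def tagfy_link(line)
  # $` holds the text preceding the current match inside the gsub block.
  line.gsub(URI_RE) { |url| in_link_tag?($`) ? url : "[[#{url}]]" }
end

bare    = tagfy_link("see http://example.com/ now")
tagged  = tagfy_link("[[http://example.com/]]")
labeled = tagfy_link("[[label|http://example.com/]]")
```

Only the bare URL is wrapped. The two forms that are already link tags pass through unchanged.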
@@ -142,21 +142,15 @@ module PseudoHiki
142
142
  end
143
143
 
144
144
  def push(token)
145
- if self.empty?
146
- super(parse_first_token(token))
147
- else
148
- super(token)
149
- end
145
+ return super(token) unless self.empty?
146
+ super(parse_first_token(token))
150
147
  end
151
148
  end
152
149
 
153
150
  def treated_as_node_end(token)
154
- if token == TableSep
155
- self.pop
156
- return (self.push TableCellNode.new)
157
- end
158
-
159
- super(token)
151
+ return super(token) unless token == TableSep
152
+ self.pop
153
+ self.push TableCellNode.new
160
154
  end
161
155
 
162
156
  def parse
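The guard-clause form of `push` behaves the same as the `if`/`else` version it replaces. This is a hypothetical stand-alone sketch: the class name `FirstTokenList` and the upcasing `parse_first_token` are illustration-only assumptions, not the gem's code.

```ruby
# Hypothetical Array subclass that treats only its first token specially.
class FirstTokenList < Array
  # Placeholder transformation (assumption); the gem's version is not shown here.
  def parse_first_token(token)
    token.upcase
  end

  def push(token)
    # Later tokens are pushed as they are; only the first one is parsed.
    return super(token) unless self.empty?
    super(parse_first_token(token))
  end
end

list = FirstTokenList.new
list.push("a")
list.push("b")
```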
@@ -1,7 +1,6 @@
1
1
  #!/usr/bin/env ruby
2
2
 
3
3
  class TreeStack
4
-
5
4
  class NotLeafError < Exception; end
6
5
 
7
6
  module Mergeable; end
@@ -59,6 +58,7 @@ class TreeStack
59
58
  nil
60
59
  end
61
60
  end
61
+
62
62
  attr_reader :node_end, :last_leaf
63
63
 
64
64
  def initialize(root_node=Node.new)
@@ -1,3 +1,3 @@
1
1
  module PseudoHiki
2
- VERSION = "0.0.0.4.develop"
2
+ VERSION = "0.0.0.5.develop"
3
3
  end
@@ -64,6 +64,16 @@ TEXT
64
64
  @verbose_formatter.format(tree).to_s)
65
65
  end
66
66
 
67
+ def test_link_url2
68
+ text = <<TEXT
69
+ !![develepment_status] Development status of features from the original [[Hiki notation|http://hikiwiki.org/en/TextFormattingRules.html]]
70
+ TEXT
71
+ tree = BlockParser.parse(text.lines.to_a)
72
+ assert_equal(" Development status of features from the original Hiki notation\n", @formatter.format(tree).to_s)
73
+ assert_equal(" Development status of features from the original Hiki notation (http://hikiwiki.org/en/TextFormattingRules.html)\n",
74
+ @verbose_formatter.format(tree).to_s)
75
+ end
76
+
67
77
  def test_link_image
68
78
  text = <<TEXT
69
79
  A test string with an [[image|image.jpg]] is here.
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: pseudohikiparser
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.0.4.develop
4
+ version: 0.0.0.5.develop
5
5
  prerelease: 8
6
6
  platform: ruby
7
7
  authors:
@@ -9,7 +9,7 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2013-09-10 00:00:00.000000000 Z
12
+ date: 2013-10-19 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: bundler
@@ -52,6 +52,8 @@ executables:
52
52
  extensions: []
53
53
  extra_rdoc_files: []
54
54
  files:
55
+ - README.md
56
+ - LICENSE
55
57
  - lib/pseudohikiparser.rb
56
58
  - lib/pseudohiki/treestack.rb
57
59
  - lib/pseudohiki/inlineparser.rb
@@ -71,9 +73,9 @@ files:
71
73
  - test/test_htmlformat.rb
72
74
  - test/test_htmlplugin.rb
73
75
  - bin/pseudohiki2html.rb
74
- homepage: https://github.com/hashimoto-naoki/PseudoHikiParser/wiki
76
+ homepage: https://github.com/nico-hn/PseudoHikiParser/wiki
75
77
  licenses:
76
- - Not decided yet
78
+ - BSD 2-Clause license
77
79
  post_install_message:
78
80
  rdoc_options: []
79
81
  require_paths: