html2md 0.1.1 → 0.1.2

data/README.md CHANGED
@@ -31,7 +31,7 @@ This gem is built with Travis-ci.org. http://travis-ci.org/#!/pmorton/html2md
31
31
 
32
32
  Compatibility
33
33
  ==============
34
- Currently not compatiable with jruby, mainly because I am too lazy to fix the build issues. Compatiablity for jruby will be added in the near future.
34
+ 1.9 Compat + Jruby Support. Currently working through 1.8 support
35
35
 
36
36
 
37
37
  Contributing
data/Rakefile CHANGED
@@ -14,6 +14,6 @@ end
14
14
 
15
15
  desc "Test"
16
16
  task :t, [] => [] do |taks,args|
17
- t = Html2Md.new(open("http://loremipsum.net/about.html").read)
17
+ t = Html2Md.new(File.read('./test.html'))
18
18
  puts t.parse
19
19
  end
data/features/assets/test.html ADDED
@@ -0,0 +1,44 @@
1
+ <hr><h1>Header 1</h1>
2
+
3
+ <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse sed pulvinar metus. Integer a ligula dolor. Maecenas malesuada nibh ac nulla tempus pulvinar at at neque. Integer non eleifend neque. Donec sed sapien nunc. Mauris imperdiet rutrum est in rhoncus. Pellentesque sollicitudin dapibus sapien vitae aliquam. Donec mollis dui at turpis tristique volutpat. Sed nec magna eget lectus convallis volutpat quis at odio. Fusce quis massa leo. Cras eget nisl erat. Aliquam aliquet consectetur risus, a venenatis dolor eleifend interdum. Fusce vel risus velit, non convallis enim.</p>
4
+
5
+ <h2>Header 2</h2>
6
+
7
+ <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse sed pulvinar metus. Integer a ligula dolor. Maecenas malesuada nibh ac nulla tempus pulvinar at at neque. Integer non eleifend neque. Donec sed sapien nunc. Mauris imperdiet rutrum est in rhoncus. Pellentesque sollicitudin dapibus sapien vitae aliquam. Donec mollis dui at turpis tristique volutpat. Sed nec magna eget lectus convallis volutpat quis at odio. Fusce quis massa leo. Cras eget nisl erat. Aliquam aliquet consectetur risus, a venenatis dolor eleifend interdum. Fusce vel risus velit, non convallis enim.</p>
8
+
9
+ <h3>Header 3</h3>
10
+
11
+ <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse sed pulvinar metus. Integer a ligula dolor. Maecenas malesuada nibh ac nulla tempus pulvinar at at neque. Integer non eleifend neque. Donec sed sapien nunc. Mauris imperdiet rutrum est in rhoncus. Pellentesque sollicitudin dapibus sapien vitae aliquam. Donec mollis dui at turpis tristique volutpat. Sed nec magna eget lectus convallis volutpat quis at odio. Fusce quis massa leo. Cras eget nisl erat. Aliquam aliquet consectetur risus, a venenatis dolor eleifend interdum. Fusce vel risus velit, non convallis enim.</p>
12
+
13
+ <p>Un-Ordered List 1</p>
14
+
15
+ <ul>
16
+ <li>Item 1</li>
17
+ <li>Item 2</li>
18
+ <li>Item 3</li>
19
+ </ul><p>Ordered List 1</p>
20
+
21
+ <ol>
22
+ <li>Item 1</li>
23
+ <li>Item 2</li>
24
+ <li>Item 3</li>
25
+ </ol><p>Nested List</p>
26
+
27
+ <ul>
28
+ <li>Un-Ordered Item 1</li>
29
+ <li>Un-Ordered Item 2
30
+
31
+ <ol>
32
+ <li>Ordered Item 1</li>
33
+ <li>Ordered Item 2
34
+
35
+ <ul>
36
+ <li>Un-Ordered Item 1</li>
37
+ </ul>
38
+ </li>
39
+ </ol>
40
+ </li>
41
+ <li>Un-Ordered Item 3</li>
42
+ </ul><p><strong>Strong</strong></p>
43
+
44
+ <p><em>Emphasis</em></p>
data/features/assets/test.md ADDED
@@ -0,0 +1,41 @@
1
+ ********
2
+
3
+ Header 1
4
+ ========
5
+
6
+ Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse sed pulvinar metus. Integer a ligula dolor. Maecenas malesuada nibh ac nulla tempus pulvinar at at neque. Integer non eleifend neque. Donec sed sapien nunc. Mauris imperdiet rutrum est in rhoncus. Pellentesque sollicitudin dapibus sapien vitae aliquam. Donec mollis dui at turpis tristique volutpat. Sed nec magna eget lectus convallis volutpat quis at odio. Fusce quis massa leo. Cras eget nisl erat. Aliquam aliquet consectetur risus, a venenatis dolor eleifend interdum. Fusce vel risus velit, non convallis enim.
7
+
8
+ Header 2
9
+ --------
10
+
11
+ Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse sed pulvinar metus. Integer a ligula dolor. Maecenas malesuada nibh ac nulla tempus pulvinar at at neque. Integer non eleifend neque. Donec sed sapien nunc. Mauris imperdiet rutrum est in rhoncus. Pellentesque sollicitudin dapibus sapien vitae aliquam. Donec mollis dui at turpis tristique volutpat. Sed nec magna eget lectus convallis volutpat quis at odio. Fusce quis massa leo. Cras eget nisl erat. Aliquam aliquet consectetur risus, a venenatis dolor eleifend interdum. Fusce vel risus velit, non convallis enim.
12
+
13
+ ### Header 3
14
+
15
+ Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse sed pulvinar metus. Integer a ligula dolor. Maecenas malesuada nibh ac nulla tempus pulvinar at at neque. Integer non eleifend neque. Donec sed sapien nunc. Mauris imperdiet rutrum est in rhoncus. Pellentesque sollicitudin dapibus sapien vitae aliquam. Donec mollis dui at turpis tristique volutpat. Sed nec magna eget lectus convallis volutpat quis at odio. Fusce quis massa leo. Cras eget nisl erat. Aliquam aliquet consectetur risus, a venenatis dolor eleifend interdum. Fusce vel risus velit, non convallis enim.
16
+
17
+ Un-Ordered List 1
18
+
19
+ - Item 1
20
+ - Item 2
21
+ - Item 3
22
+
23
+ Ordered List 1
24
+
25
+ 1. Item 1
26
+ 2. Item 2
27
+ 3. Item 3
28
+
29
+ Nested List
30
+
31
+ - Un-Ordered Item 1
32
+ - Un-Ordered Item 2
33
+ 1. Ordered Item 1
34
+ 2. Ordered Item 2
35
+ - Un-Ordered Item 1
36
+ - Un-Ordered Item 3
37
+
38
+ **Strong**
39
+
40
+ _Emphasis_
41
+
data/features/markdown.feature CHANGED
@@ -4,7 +4,7 @@ Feature: Markdown
4
4
  Scenario: Create a H Rule (HR) element
5
5
  * HTML <hr/>
6
6
  * I say parse
7
- * The markdown should be (\n* * * * *\n)
7
+ * The markdown should be (********\n)
8
8
 
9
9
  Scenario: Create a hard break (BR) element
10
10
  * HTML <br/>
@@ -49,7 +49,7 @@ Feature: Markdown
49
49
  Scenario: Complex List
50
50
  * HTML <ul><li>First</li><li> <ol><li>First<ul><li>First</li><li>Second</li></ul></li><li>Second</li> </ol>Second</li><ul>
51
51
  * I say parse
52
- * The markdown should be (\n - First\n - \n 1. First\n - First\n - Second\n\n 2. Second\nSecond\n\n)
52
+ * The markdown should be (\n - First\n - \n 1. First\n - First\n - Second\n 2. Second\nSecond\n\n)
53
53
 
54
54
  Scenario: Emphasis (em) element
55
55
  * HTML <em>Emphasis</em>
@@ -77,21 +77,26 @@ Feature: Markdown
77
77
  * The markdown should be (This is in a span)
78
78
 
79
79
  Scenario: Character data should not have new lines
80
- * HTML This is character data \n
80
+ * HTML <p>This is character data \n\n\n\n</p>
81
81
  * I say parse
82
82
  * The markdown should be (This is character data \n\n)
83
83
 
84
84
  Scenario: First level headers
85
85
  * HTML <h1>This is a H1 Element</h1>
86
86
  * I say parse
87
- * The markdown should be (\nThis is a H1 Element\n====================\n)
87
+ * The markdown should be (\nThis is a H1 Element\n====================\n\n)
88
88
 
89
89
  Scenario: Second level headers
90
90
  * HTML <h2>This is a H2 Element</h2>
91
91
  * I say parse
92
- * The markdown should be (\nThis is a H2 Element\n--------------------\n)
92
+ * The markdown should be (\nThis is a H2 Element\n--------------------\n\n)
93
93
 
94
94
  Scenario: Third level headers
95
95
  * HTML <h3>This is a H3 Element</h3>
96
96
  * I say parse
97
- * The markdown should be (\n### This is a H3 Element\n)
97
+ * The markdown should be (\n### This is a H3 Element\n\n)
98
+
99
+ Scenario: Full File Conversion
100
+ * File (./features/assets/test.html)
101
+ * I say parse
102
+ * The mardown should be equal to (./features/assets/test.md)
data/features/step_definitions/markdown_steps.rb CHANGED
@@ -15,10 +15,18 @@ Given /HTML (.*)/ do |n|
15
15
  @html2md.source = n.gsub("\\n", "\n")
16
16
  end
17
17
 
18
+ Given /File \((.*)\)/ do |n|
19
+ @html2md.source = File.read(n)
20
+ end
21
+
18
22
  When /I say parse/ do
19
23
  @result = @html2md.parse
20
24
  end
21
25
 
22
26
  Then /The markdown should be \((.*)\)/ do |result|
23
27
  @result.should == result.gsub("\\n", "\n")
28
+ end
29
+
30
+ Then /The mardown should be equal to \((.*)\)/ do |file|
31
+ @result.gsub("\\n","\n").should == File.read(file)
24
32
  end
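
The step definitions above turn the two-character sequence `\n` written in a feature file into a real newline before comparing it with the parser output. A minimal sketch of that unescaping follows; the helper name `unescape_newlines` is hypothetical and not part of the gem.

```ruby
# Hypothetical helper (not part of html2md): mirrors the
# gsub("\\n", "\n") call used in the step definitions.
def unescape_newlines(step_text)
  # "\\n" is a backslash followed by the letter n; "\n" is a real newline.
  step_text.gsub("\\n", "\n")
end

# Single quotes keep the backslash literal, as it would arrive from a
# feature file.
expected = unescape_newlines('\nThis is a H1 Element\n====================\n\n')
puts expected.inspect
```

Because the pattern is a plain string rather than a regular expression, only the literal backslash-n pair is replaced; real newlines already present in the text are left untouched.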
data/lib/html2md/VERSION.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  class Html2Md
2
- VERSION = "0.1.1"
3
- end
2
+ VERSION = "0.1.2"
3
+ end
data/lib/html2md/document.rb CHANGED
@@ -11,7 +11,6 @@ class Html2Md
11
11
  @markdown = ''
12
12
  @last_href = nil
13
13
  @allowed_tags = ['tr','td','th','table']
14
- @current_list = -1
15
14
  @list_tree = []
16
15
  @last_cdata_length = 0
17
16
 
@@ -34,9 +33,9 @@ class Html2Md
34
33
  end
35
34
 
36
35
  def start_element name, attributes = []
36
+ #@markdown << name
37
37
  start_name = "start_#{name}".to_sym
38
38
  both_name = "start_and_end_#{name}".to_sym
39
-
40
39
  if self.respond_to?(both_name)
41
40
  self.send( both_name, attributes )
42
41
  elsif self.respond_to?(start_name)
@@ -48,9 +47,9 @@ class Html2Md
48
47
  end
49
48
 
50
49
  def end_element name, attributes = []
50
+ #@markdown << name
51
51
  end_name = "end_#{name}".to_sym
52
52
  both_name = "start_and_end_#{name}".to_sym
53
-
54
53
  if self.respond_to?(both_name)
55
54
  self.send( both_name, attributes )
56
55
  elsif self.respond_to?(end_name)
@@ -61,7 +60,7 @@ class Html2Md
61
60
  end
62
61
 
63
62
  def start_hr(attributes)
64
- @markdown << "\n* * * * *\n"
63
+ @markdown << "********\n"
65
64
  end
66
65
 
67
66
  def end_hr(attributes)
@@ -89,7 +88,7 @@ class Html2Md
89
88
  end
90
89
 
91
90
  def end_p(attributes)
92
- @markdown << "\n\n"
91
+ @markdown << "\n\n" unless @list_tree[-1]
93
92
  end
94
93
 
95
94
  def start_h1(attributes)
@@ -101,7 +100,7 @@ class Html2Md
101
100
  @last_cdata_length.times do
102
101
  @markdown << "="
103
102
  end
104
- @markdown << "\n"
103
+ @markdown << "\n\n"
105
104
  end
106
105
 
107
106
  def start_h2(attributes)
@@ -113,7 +112,7 @@ class Html2Md
113
112
  @last_cdata_length.times do
114
113
  @markdown << "-"
115
114
  end
116
- @markdown << "\n"
115
+ @markdown << "\n\n"
117
116
  end
118
117
 
119
118
  def start_h3(attributes)
@@ -121,7 +120,7 @@ class Html2Md
121
120
  end
122
121
 
123
122
  def end_h3(attributes)
124
- @markdown << "\n"
123
+ @markdown << "\n\n"
125
124
  end
126
125
 
127
126
  def start_a(attributes)
@@ -157,21 +156,23 @@ class Html2Md
157
156
  end
158
157
 
159
158
  def start_ul(attributes)
159
+ @markdown << "\n" #if @list_tree[-1]
160
160
  @list_tree.push( { :type => :ul, :current_element => 0 } )
161
- @markdown << "\n"
162
161
  end
163
162
 
164
163
  def end_ul(attributes)
165
164
  @list_tree.pop
165
+ @markdown << "\n" unless @list_tree[-1]
166
166
  end
167
167
 
168
168
  def start_ol(attributes)
169
+ @markdown << "\n"# if @list_tree[-1]
169
170
  @list_tree.push( { :type => :ol, :current_element => 0 } )
170
- @markdown << "\n"
171
171
  end
172
172
 
173
173
  def end_ol(attributes)
174
174
  @list_tree.pop
175
+ @markdown << "\n" unless @list_tree[-1]
175
176
  end
176
177
 
177
178
  def start_li(attributes)
@@ -192,18 +193,22 @@ class Html2Md
192
193
  end
193
194
 
194
195
  def end_li(attributes)
195
- @markdown << "\n"
196
+ @markdown << "\n" if @markdown[-1] != "\n" and @markdown[-1] != 10
196
197
  end
197
198
 
198
199
  def characters c
199
200
  @last_cdata_length = c.chomp.length
200
201
  if @list_tree[-1]
201
- @markdown << c.chomp.lstrip.rstrip
202
+ @markdown << c.gsub(/\n(\s*)?/,"").lstrip
202
203
  else
203
- @markdown << c.chomp
204
+ @markdown << c.gsub(/\n(\s*)?/,"")
204
205
  end
205
206
  end
206
207
 
208
+ def end_document
209
+ @markdown.gsub!(/\n{2,}/,"\n\n")
210
+ end
211
+
207
212
 
208
213
  end
209
214
  end
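
The whitespace changes in the hunks above rest on three small string operations: stripping newlines from character data, guarding against a doubled newline at the end of a list item, and collapsing blank-line runs when the document ends. The sketch below isolates them as standalone functions so they can be tried outside the parser. The function names are hypothetical, and the note on Ruby 1.8 is an assumption about why the diff compares against `10`.

```ruby
# Sketch only: these helpers reuse the expressions from the diff but are
# not the gem's actual methods.

# As in the new `characters` callback: drop each newline together with any
# whitespace that immediately follows it.
def strip_newlines(text)
  text.gsub(/\n(\s*)?/, "")
end

# As in the new `end_li` guard: true when the buffer already ends in a
# newline. The comparison with 10 presumably covers Ruby 1.8, where
# String#[] with an integer index returns a character code.
def ends_with_newline?(buffer)
  buffer[-1] == "\n" || buffer[-1] == 10
end

# As in the new `end_document` callback: collapse runs of two or more
# newlines into exactly one blank line.
def collapse_blank_lines(markdown)
  markdown.gsub(/\n{2,}/, "\n\n")
end

puts strip_newlines("This is character data \n\n\n\n").inspect
puts ends_with_newline?(" - Item 1\n")
puts collapse_blank_lines("Header\n======\n\n\n\nBody\n").inspect
```

Doing the blank-line collapse once at the end lets the individual element callbacks append `"\n\n"` freely without tracking what the previous callback emitted.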
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: html2md
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.1
4
+ version: 0.1.2
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,7 +9,7 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2012-03-18 00:00:00.000000000 Z
12
+ date: 2012-03-23 00:00:00.000000000 Z
13
13
  dependencies: []
14
14
  description: ! ' Converts Basic HTML to markdown
15
15
 
@@ -25,6 +25,8 @@ files:
25
25
  - lib/html2md/document.rb
26
26
  - lib/html2md/VERSION.rb
27
27
  - lib/html2md.rb
28
+ - features/assets/test.html
29
+ - features/assets/test.md
28
30
  - features/markdown.feature
29
31
  - features/step_definitions/markdown_steps.rb
30
32
  homepage: http://github.com/pmorton/html2md