mort666-pricetag 0.1.0

data/.document ADDED
@@ -0,0 +1,5 @@
1
+ lib/**/*.rb
2
+ bin/*
3
+ -
4
+ features/**/*.feature
5
+ LICENSE.txt
data/Gemfile ADDED
@@ -0,0 +1,10 @@
1
+ source :gemcutter
2
+
3
+ gem 'nokogiri', "~> 1.4.4"
4
+
5
+ group :development do
6
+ gem "shoulda", ">= 0"
7
+ gem "bundler", "~> 1.0.0"
8
+ gem "jeweler", "~> 1.5.1"
9
+ gem "rcov", ">= 0"
10
+ end
data/Gemfile.lock ADDED
@@ -0,0 +1,22 @@
1
+ GEM
2
+ remote: http://rubygems.org/
3
+ specs:
4
+ git (1.2.5)
5
+ jeweler (1.5.1)
6
+ bundler (~> 1.0.0)
7
+ git (>= 1.2.5)
8
+ rake
9
+ nokogiri (1.4.4)
10
+ rake (0.8.7)
11
+ rcov (0.9.9)
12
+ shoulda (2.11.3)
13
+
14
+ PLATFORMS
15
+ ruby
16
+
17
+ DEPENDENCIES
18
+ bundler (~> 1.0.0)
19
+ jeweler (~> 1.5.1)
20
+ nokogiri (~> 1.4.4)
21
+ rcov
22
+ shoulda
data/README.md ADDED
@@ -0,0 +1,82 @@
1
+ # PriceTag
2
+ ## Convert HTML into your favorite lightweight markup language
3
+
4
+ Who doesn't love themselves some light markup languages? As somebody writing with one _as I type this very sentence_, I am certainly a big fan.
5
+
6
+ Sure, the whole point of [Markdown][1], [Textile][2], and the like is to generate HTML, but what if you want to go the other way? To undo the horrible mistake of shackling your document in oppressive angled brackets, or perhaps share some love with markup that never knew the sweet embrace of concise, humanist syntax?
7
+
8
+ With PriceTag, such transformative experiences are just a simple line of code away:
9
+
10
+ html = <<-EOF
11
+ <h1>Lorem Ipsum</h1>
12
+ <p>Lorem ipsum dolor sit amet.</p>
13
+ EOF
14
+
15
+ PriceTag.html_to_markdown(html)
16
+
17
+ Or if you're of the [Textile][2] persuasion:
18
+
19
+ PriceTag.html_to_textile(html)
20
+
21
+
22
+ ## Usage
23
+
24
+ You can customize aspects of a document's output by passing optional parameters to the `html_to_*` methods:
25
+
26
+ PriceTag.html_to_markdown(html, :heading_style => :setext, :link_style => :inline)
27
+
28
+ - `link_style` : markup style for `a` tags (default: `reference`)
29
+ - `reference` -- link URLs are referenced as footnote
30
+ - `inline` -- link URLs are displayed inline with element
31
+
32
+ There are language-specific options as well:
33
+
34
+ ### Markdown
35
+
36
+ - `heading_style` : markup style for `h1`...`h6` tags (default: `atx`)
37
+ - `atx` -- Octothorps (#) next to heading
38
+ - `setext` -- Equal signs or dashes (=/-) underneath headings (falls back to atx style for h3 through h6)
39
+
40
+ ### Textile
41
+
42
+ - `ignore_styles` : Textile allows you to set attributes (`id`, `class`, `lang`, and CSS Styles) for inline elements. When set to `true`, this information will not be included in the output (default: `true`)
43
+
44
+ ## Dependencies
45
+
46
+ - Nokogiri 1.4+
47
+
48
+ ## Note on Patches/Pull Requests
49
+
50
+ 1. Fork the project.
51
+ 2. Make your feature addition or bug fix.
52
+ 3. Add tests for it. This is important so I don’t break it in a future version unintentionally.
53
+ 4. Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
54
+ 5. Send me a pull request. Bonus points for topic branches.
55
+
56
+ ## License
57
+
58
+ PriceTag is licensed under the MIT License:
59
+
60
+ Copyright (c) 2010 Mattt Thompson (http://mattt.me/)
61
+
62
+ Permission is hereby granted, free of charge, to any person obtaining a copy
63
+ of this software and associated documentation files (the "Software"), to deal
64
+ in the Software without restriction, including without limitation the rights
65
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
66
+ copies of the Software, and to permit persons to whom the Software is
67
+ furnished to do so, subject to the following conditions:
68
+
69
+ The above copyright notice and this permission notice shall be included in
70
+ all copies or substantial portions of the Software.
71
+
72
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
73
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
74
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
75
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
76
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
77
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
78
+ THE SOFTWARE.
79
+
80
+
81
+ [1]: http://daringfireball.net/projects/markdown/ "Markdown, by John Gruber"
82
+ [2]: http://textile.thresholdstate.com/ "Textile, by Dean Allen"
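The README's `heading_style` option can be illustrated without the gem itself. The following is a minimal standalone sketch, assuming a hypothetical helper named `render_heading` that is not part of PriceTag; it only mirrors the documented behavior that setext underlines exist for `h1` and `h2` and deeper headings fall back to atx.

```ruby
# Hypothetical helper (not part of PriceTag): renders a Markdown heading
# in either of the two styles the README describes.
def render_heading(level, text, style = :atx)
  # Setext only defines underlines for levels 1 and 2; deeper levels use atx.
  style = :atx unless [1, 2].include?(level)

  case style
  when :setext
    underline = (level == 1 ? "=" : "-") * text.length
    "#{text}\n#{underline}"
  else
    "#{'#' * level} #{text}"
  end
end

puts render_heading(1, "Lorem Ipsum", :setext)
puts render_heading(3, "Dolor Sit", :setext)
```

The second call requests setext for a level-3 heading, so the sketch falls back to the atx form.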
data/Rakefile ADDED
@@ -0,0 +1,50 @@
1
+ require 'rubygems'
2
+ require 'bundler'
3
+ begin
4
+ Bundler.setup(:default, :development)
5
+ rescue Bundler::BundlerError => e
6
+ $stderr.puts e.message
7
+ $stderr.puts "Run `bundle install` to install missing gems"
8
+ exit e.status_code
9
+ end
10
+ require 'rake'
11
+
12
+ require 'jeweler'
13
+ Jeweler::Tasks.new do |gem|
14
+ gem.name = "pricetag"
15
+ gem.homepage = "http://github.com/mattt/pricetag"
16
+ gem.license = "MIT"
17
+ gem.summary = "Convert HTML into your favorite lightweight markup language"
18
+ gem.description = "PriceTag converts HTML documents into light markup languages. Currently supports Markdown and Textile."
19
+ gem.email = "m@mattt.me"
20
+ gem.authors = ["Mattt Thompson"]
21
+
22
+ gem.add_development_dependency 'nokogiri', '> 1.4'
23
+ end
24
+ Jeweler::RubygemsDotOrgTasks.new
25
+
26
+ require 'rake/testtask'
27
+ Rake::TestTask.new(:test) do |test|
28
+ test.libs << 'lib' << 'test'
29
+ test.pattern = 'test/**/test_*.rb'
30
+ test.verbose = true
31
+ end
32
+
33
+ require 'rcov/rcovtask'
34
+ Rcov::RcovTask.new do |test|
35
+ test.libs << 'test'
36
+ test.pattern = 'test/**/test_*.rb'
37
+ test.verbose = true
38
+ end
39
+
40
+ task :default => :test
41
+
42
+ require 'rake/rdoctask'
43
+ Rake::RDocTask.new do |rdoc|
44
+ version = File.exist?('VERSION') ? File.read('VERSION') : ""
45
+
46
+ rdoc.rdoc_dir = 'rdoc'
47
+ rdoc.title = "pricetag #{version}"
48
+ rdoc.rdoc_files.include('README*')
49
+ rdoc.rdoc_files.include('lib/**/*.rb')
50
+ end
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.1.1
data/lib/pricetag.rb ADDED
@@ -0,0 +1,9 @@
1
+ require 'nokogiri'
2
+ require 'cgi'
3
+
4
+ module PriceTag
5
+ end
6
+
7
+ require 'pricetag/document'
8
+ require 'pricetag/processors'
9
+ require 'pricetag/version'
data/lib/pricetag/document.rb ADDED
@@ -0,0 +1,19 @@
1
+ module PriceTag
2
+ class Document
3
+ def initialize(html)
4
+ @xml = Nokogiri::XML("<root>" + html + "</root>") do |config|
5
+ config.strict.noent
6
+ end
7
+
8
+ @references = []
9
+ @xml.search("a").each do |a|
10
+ @references << [a['href'], a['title']]
11
+ a['data-reference-number'] = @references.length.to_s
12
+ end
13
+
14
+ @xml.search("li").each do |li|
15
+ li['data-position'] = (li.parent.search("li").index(li) + 1).to_s
16
+ end
17
+ end
18
+ end
19
+ end
data/lib/pricetag/processors.rb ADDED
@@ -0,0 +1,3 @@
1
+ require 'pricetag/processors/base'
2
+ require 'pricetag/processors/markdown'
3
+ require 'pricetag/processors/textile'
data/lib/pricetag/processors/base.rb ADDED
@@ -0,0 +1,30 @@
1
+ module PriceTag
2
+ module Processors
3
+ module Base
4
+ def process!(node, options = {})
5
+ return if node.text?
6
+
7
+ node.children.each do |child|
8
+ process!(child, options)
9
+ end
10
+
11
+ text = text_for_node(node, options)
12
+ node.replace(node.document.create_text_node(text))
13
+ end
14
+
15
+ private
16
+
17
+ def level_for_heading(node)
18
+ node.name.match(/[123456]/)[0].to_i rescue 0
19
+ end
20
+
21
+ def indentation_level_for_list_item(node)
22
+ node.ancestors.select{|t| ["ol", "ul"].include?(t.name)}.length
23
+ end
24
+
25
+ def indentation_level_for_blockquote(node)
26
+ node.ancestors.select{|t| t.name == "blockquote"}.length
27
+ end
28
+ end
29
+ end
30
+ end
data/lib/pricetag/processors/markdown.rb ADDED
@@ -0,0 +1,124 @@
1
+ module PriceTag
2
+ module Processors
3
+ module Markdown
4
+ extend Processors::Base
5
+
6
+ PriceTag.module_eval do
7
+ class << self
8
+ def html_to_markdown(html, options = {})
9
+ Document.new(html).to_markdown(options)
10
+ end
11
+ end
12
+ end
13
+
14
+ PriceTag::Document.class_eval do
15
+ include PriceTag::Processors
16
+
17
+ def to_markdown(options = {})
18
+ options[:heading_style] ||= :atx
19
+ options[:link_style] ||= :reference
20
+
21
+ @markdown ||= @xml.dup
22
+ @markdown.search("root > *").each do |node|
23
+ Markdown::process!(node, options)
24
+ end
25
+
26
+ output = CGI.unescapeHTML(@markdown.root.inner_html)
27
+ output += Markdown::text_for_references(@references) if options[:link_style] == :reference
28
+
29
+ return output
30
+ end
31
+ end
32
+
33
+ class << self
34
+ def text_for_references(references)
35
+ output = "\n\n"
36
+ references.each_with_index do |reference, i|
37
+ href, title = reference
38
+ output << "[#{i+1}]: #{href}"
39
+ output << " \"#{title}\"" if title and title != ""
40
+ output << "\n"
41
+ end
42
+
43
+ return output
44
+ end
45
+
46
+ private
47
+
48
+ def text_for_node(node, options = {})
49
+ case tag = node.name.to_sym
50
+ when :h1, :h2, :h3, :h4, :h5, :h6
51
+ style = options[:heading_style]
52
+ style = :atx unless [:h1, :h2].include?(tag)
53
+
54
+ case style
55
+ when :setext
56
+ separator = (tag == :h1 ? "=" : "-")
57
+ "#{node.text}\n#{separator * node.text.length}"
58
+ when :atx
59
+ octothorps = "#" * level_for_heading(node)
60
+ "#{octothorps} #{node.text}"
61
+ end
62
+ when :li
63
+ indentation = "\t" * (indentation_level_for_list_item(node) - 1)
64
+ if node.parent.name == "ol"
65
+ position = node['data-position']
66
+ "#{indentation}#{position}. #{node.text}"
67
+ else
68
+ "#{indentation}* #{node.text}"
69
+ end
70
+ when :blockquote
71
+ nesting = ">" * (indentation_level_for_blockquote(node) + 1)
72
+ nesting + " " + node.text
73
+ when :a
74
+ style = options[:link_style]
75
+
76
+ case style
77
+ when :inline
78
+ if node.text == node['href']
79
+ "<#{node.text}>"
80
+ elsif node['href'].match(/mailto:(.+)/)
81
+ "<#{$1}>"
82
+ elsif node['title']
83
+ "[#{node.text.strip}](#{node['href']} \"#{node['title']}\")"
84
+ else
85
+ "[#{node.text.strip}](#{node['href']})"
86
+ end
87
+ when :reference
88
+ "[#{node.text.strip}][#{node['data-reference-number']}]"
89
+ end
90
+ when :img
91
+ if node['title']
92
+ "![#{node['alt']}](#{node['src']} \"#{node['title']}\")"
93
+ else
94
+ "![#{node['alt']}](#{node['src']})"
95
+ end
96
+ when :em
97
+ "_#{node.text}_"
98
+ when :strong
99
+ "**#{node.text}**"
100
+ when :pre
101
+ node.text.match(/^\t/) ? node.text : node.to_s
102
+ when :code
103
+ case node.parent.name.to_sym
104
+ when :pre
105
+ node.text.split(/\n/).collect{|line| "\t\t" + line}.join("\n")
106
+ else
107
+ "`#{node.text}`"
108
+ end
109
+ when :tt
110
+ "`#{node.text}`"
111
+ when :br
112
+ " \n"
113
+ when :hr
114
+ "- - -"
115
+ when :p, :span, :ul, :ol
116
+ node.text
117
+ else
118
+ CGI.unescapeHTML(node.to_s)
119
+ end
120
+ end
121
+ end
122
+ end
123
+ end
124
+ end
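The reference-style footnotes that `text_for_references` appends can be tried in isolation. This is a standalone sketch under the assumption that references arrive as `[href, title]` pairs, as `Document#initialize` collects them; the method name `reference_footnotes` is hypothetical and the sketch does not call into the gem.

```ruby
# Hypothetical standalone sketch (not the gem's own method): builds the
# numbered Markdown link definitions for a list of [href, title] pairs.
def reference_footnotes(references)
  lines = references.each_with_index.map do |(href, title), i|
    line = "[#{i + 1}]: #{href}"
    # Markdown link titles are optional and must be quoted when present.
    line += " \"#{title}\"" unless title.nil? || title.empty?
    line
  end
  lines.join("\n")
end

puts reference_footnotes([
  ["http://daringfireball.net/projects/markdown/", "Markdown, by John Gruber"],
  ["http://textile.thresholdstate.com/", nil]
])
```

Numbering starts at 1 so that each definition lines up with the `data-reference-number` attribute assigned to the corresponding `a` element.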
data/lib/pricetag/processors/textile.rb ADDED
@@ -0,0 +1,124 @@
1
+ module PriceTag
2
+ module Processors
3
+ module Textile
4
+ extend Processors::Base
5
+
6
+ PriceTag.module_eval do
7
+ class << self
8
+ def html_to_textile(html, options = {})
9
+ Document.new(html).to_textile(options)
10
+ end
11
+ end
12
+ end
13
+
14
+ PriceTag::Document.class_eval do
15
+ include PriceTag::Processors
16
+
17
+ def to_textile(options = {})
18
+ options[:link_style] ||= :inline
19
+ options[:ignore_styles] = true unless options.key?(:ignore_styles)
20
+
21
+ @textile ||= @xml.dup
22
+ @textile.search("root > *").each do |node|
23
+ Textile::process!(node, options)
24
+ end
25
+
26
+ output = CGI.unescapeHTML(@textile.root.inner_html)
27
+ output += Textile::text_for_references(@references) if options[:link_style] == :reference
28
+
29
+ return output
30
+ end
31
+ end
32
+
33
+ INLINE_TAG_SYMBOLS = {
34
+ :span => "%",
35
+ :em => "_",
36
+ :i => "__",
37
+ :strong => "*",
38
+ :b => "**",
39
+ :cite => "??",
40
+ :del => "-",
41
+ :ins => "+",
42
+ :sup => "^",
43
+ :sub => "~"
44
+ }
45
+
46
+ class << self
47
+ def text_for_references(references)
48
+ output = "\n\n"
49
+ references.each_with_index do |reference, i|
50
+ href, title = reference
51
+ output << "[#{i+1}]#{href}\n"
52
+ end
53
+
54
+ return output
55
+ end
56
+
57
+ private
58
+
59
+ def text_for_node(node, options = {})
60
+ case tag = node.name.to_sym
61
+ when :h1, :h2, :h3, :h4, :h5, :h6
62
+ "#{node.name}. #{node.text}"
63
+ when :li
64
+ "#{(node.parent.name == "ol" ? "#" : "*") * indentation_level_for_list_item(node)} #{node.text}"
65
+ when :blockquote
66
+ "bq. #{node.text}"
67
+ when :a
68
+ style = options[:link_style]
69
+
70
+ case style
71
+ when :inline
72
+ if node.text.match(/^\!/)
73
+ "#{node.text.strip}:#{node['href']}"
74
+ else
75
+ "\"#{node.text.strip}\":#{node['href']}"
76
+ end
77
+ end
78
+ when :img
79
+ if node['title']
80
+ "!#{node['src']}(#{node['title']})!"
81
+ else
82
+ "!#{node['src']}!"
83
+ end
84
+ when :span, :em, :i, :strong, :b, :cite, :del, :ins, :sup, :sub
85
+ attributes = inline_attributes_for_node(node, options)
86
+ symbol = INLINE_TAG_SYMBOLS[tag]
87
+ "#{symbol}#{attributes}#{node.text}#{symbol}"
88
+ when :acronym
89
+ "#{node.text}(#{node['title']})"
90
+ when :code
91
+ node.parent.name.to_sym == :pre ? node.to_s : "@#{node.text}@"
92
+ when :tt
93
+ "@#{node.text}@"
94
+ when :p, :ul, :ol
95
+ node.text
96
+ else
97
+ CGI.unescapeHTML(node.to_s)
98
+ end
99
+ end
100
+
101
+ private
102
+
103
+ def inline_attributes_for_node(node, options = {})
104
+ return nil if options[:ignore_styles] || node.attributes.empty?
105
+
106
+ attributes = ""
107
+
108
+ if node.attributes["id"] and node.attributes["class"]
109
+ attributes << "(#{node.attributes["class"]}##{node.attributes["id"]})"
110
+ elsif node.attributes["id"]
111
+ attributes << "(##{node.attributes["id"]})"
112
+ elsif node.attributes["class"]
113
+ attributes << "(#{node.attributes["class"]})"
114
+ end
115
+
116
+ attributes << "[#{node.attributes["lang"]}]" if node.attributes["lang"]
117
+ attributes << "{#{node.attributes["style"]}}" if node.attributes["style"]
118
+
119
+ return attributes
120
+ end
121
+ end
122
+ end
123
+ end
124
+ end
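The Textile attribute prefix built by `inline_attributes_for_node` follows the pattern `(class#id)[lang]{style}`. The sketch below reproduces that pattern over a plain Hash instead of a Nokogiri node, so it runs without the gem; the method name `textile_attributes` and the Hash input are assumptions made for illustration.

```ruby
# Hypothetical standalone sketch: formats the Textile attribute prefix
# from a plain Hash of attribute names to values.
def textile_attributes(attrs)
  out = +""
  id = attrs["id"]
  klass = attrs["class"]

  # Class and id share one parenthesized group: (class#id), (#id), or (class).
  if id && klass
    out << "(#{klass}##{id})"
  elsif id
    out << "(##{id})"
  elsif klass
    out << "(#{klass})"
  end

  out << "[#{attrs['lang']}]" if attrs["lang"]
  out << "{#{attrs['style']}}" if attrs["style"]
  out
end

# An emphasized span carrying a class and a language attribute.
puts "_#{textile_attributes('class' => 'note', 'lang' => 'en')}Lorem ipsum_"
```

Every branch reads from the input Hash, which is why the `id`-only and `class`-only cases produce output here.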
data/lib/pricetag/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module PriceTag
2
+ Version = VERSION = '0.1.0'
3
+ end
data/test/helper.rb ADDED
@@ -0,0 +1,18 @@
1
+ require 'rubygems'
2
+ require 'bundler'
3
+ begin
4
+ Bundler.setup(:default, :development)
5
+ rescue Bundler::BundlerError => e
6
+ $stderr.puts e.message
7
+ $stderr.puts "Run `bundle install` to install missing gems"
8
+ exit e.status_code
9
+ end
10
+ require 'test/unit'
11
+ require 'shoulda'
12
+
13
+ $LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
14
+ $LOAD_PATH.unshift(File.dirname(__FILE__))
15
+ require 'pricetag'
16
+
17
+ class Test::Unit::TestCase
18
+ end
metadata ADDED
@@ -0,0 +1,155 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: mort666-pricetag
3
+ version: !ruby/object:Gem::Version
4
+ hash: 27
5
+ prerelease:
6
+ segments:
7
+ - 0
8
+ - 1
9
+ - 0
10
+ version: 0.1.0
11
+ platform: ruby
12
+ authors:
13
+ - Mattt Thompson
14
+ - Stephen Kapp
15
+ autorequire:
16
+ bindir: bin
17
+ cert_chain: []
18
+
19
+ date: 2010-11-29 00:00:00 +00:00
20
+ default_executable:
21
+ dependencies:
22
+ - !ruby/object:Gem::Dependency
23
+ name: shoulda
24
+ prerelease: false
25
+ requirement: &id001 !ruby/object:Gem::Requirement
26
+ none: false
27
+ requirements:
28
+ - - ">="
29
+ - !ruby/object:Gem::Version
30
+ hash: 3
31
+ segments:
32
+ - 0
33
+ version: "0"
34
+ type: :development
35
+ version_requirements: *id001
36
+ - !ruby/object:Gem::Dependency
37
+ name: bundler
38
+ prerelease: false
39
+ requirement: &id002 !ruby/object:Gem::Requirement
40
+ none: false
41
+ requirements:
42
+ - - ~>
43
+ - !ruby/object:Gem::Version
44
+ hash: 23
45
+ segments:
46
+ - 1
47
+ - 0
48
+ - 0
49
+ version: 1.0.0
50
+ type: :development
51
+ version_requirements: *id002
52
+ - !ruby/object:Gem::Dependency
53
+ name: jeweler
54
+ prerelease: false
55
+ requirement: &id003 !ruby/object:Gem::Requirement
56
+ none: false
57
+ requirements:
58
+ - - ~>
59
+ - !ruby/object:Gem::Version
60
+ hash: 1
61
+ segments:
62
+ - 1
63
+ - 5
64
+ - 1
65
+ version: 1.5.1
66
+ type: :development
67
+ version_requirements: *id003
68
+ - !ruby/object:Gem::Dependency
69
+ name: rcov
70
+ prerelease: false
71
+ requirement: &id004 !ruby/object:Gem::Requirement
72
+ none: false
73
+ requirements:
74
+ - - ">="
75
+ - !ruby/object:Gem::Version
76
+ hash: 3
77
+ segments:
78
+ - 0
79
+ version: "0"
80
+ type: :development
81
+ version_requirements: *id004
82
+ - !ruby/object:Gem::Dependency
83
+ name: nokogiri
84
+ prerelease: false
85
+ requirement: &id005 !ruby/object:Gem::Requirement
86
+ none: false
87
+ requirements:
88
+ - - ">"
89
+ - !ruby/object:Gem::Version
90
+ hash: 7
91
+ segments:
92
+ - 1
93
+ - 4
94
+ version: "1.4"
95
+ type: :development
96
+ version_requirements: *id005
97
+ description: PriceTag converts HTML documents into light markup languages. Currently supports Markdown and Textile.
98
+ email: m@mattt.me
99
+ executables: []
100
+
101
+ extensions: []
102
+
103
+ extra_rdoc_files:
104
+ - README.md
105
+ files:
106
+ - .document
107
+ - Gemfile
108
+ - Gemfile.lock
109
+ - README.md
110
+ - Rakefile
111
+ - VERSION
112
+ - lib/pricetag.rb
113
+ - lib/pricetag/document.rb
114
+ - lib/pricetag/processors.rb
115
+ - lib/pricetag/processors/base.rb
116
+ - lib/pricetag/processors/markdown.rb
117
+ - lib/pricetag/processors/textile.rb
118
+ - lib/pricetag/version.rb
119
+ - test/helper.rb
120
+ has_rdoc: true
121
+ homepage: http://github.com/mort666/pricetag
122
+ licenses:
123
+ - MIT
124
+ post_install_message:
125
+ rdoc_options: []
126
+
127
+ require_paths:
128
+ - lib
129
+ required_ruby_version: !ruby/object:Gem::Requirement
130
+ none: false
131
+ requirements:
132
+ - - ">="
133
+ - !ruby/object:Gem::Version
134
+ hash: 3
135
+ segments:
136
+ - 0
137
+ version: "0"
138
+ required_rubygems_version: !ruby/object:Gem::Requirement
139
+ none: false
140
+ requirements:
141
+ - - ">="
142
+ - !ruby/object:Gem::Version
143
+ hash: 3
144
+ segments:
145
+ - 0
146
+ version: "0"
147
+ requirements: []
148
+
149
+ rubyforge_project:
150
+ rubygems_version: 1.6.2
151
+ signing_key:
152
+ specification_version: 3
153
+ summary: Convert HTML into your favorite lightweight markup language
154
+ test_files:
155
+ - test/helper.rb