rubypants-unicode 0.2.0

Files changed (6)
  1. data/README.md +119 -0
  2. data/Rakefile +56 -0
  3. data/install.rb +9 -0
  4. data/rubypants.rb +491 -0
  5. data/test_rubypants.rb +164 -0
  6. metadata +73 -0
data/README.md ADDED
@@ -0,0 +1,119 @@
1
+ # RubyPants Unicode — SmartyPants ported to Ruby
2
+
3
+ Switched to Unicode output (UTF-8) instead of HTML entities by Chris Chapman
4
+ Copyright © 2012 Chris Chapman
5
+
6
+ Ported by Christian Neukirchen <mailto:chneukirchen@gmail.com>
7
+ Copyright © 2004 Christian Neukirchen
8
+
9
+ Incorporates ideas, comments and documentation by Chad Miller
10
+ Copyright © 2004 Chad Miller
11
+
12
+ Original SmartyPants by John Gruber
13
+ Copyright © 2003 John Gruber
14
+
15
+
16
+ ## RubyPants
17
+
18
+ RubyPants is a Ruby port of the smart-quotes library SmartyPants.
19
+
20
+ The original “SmartyPants” is a free web publishing plug-in for
21
+ Movable Type, Blosxom, and BBEdit that easily translates plain ASCII
22
+ punctuation characters into “smart” typographic punctuation HTML
23
+ entities.
24
+
25
+ See rubypants.rb for more information.
26
+
27
+
28
+ ## Incompatibilities
29
+
30
+ RubyPants uses a different API than SmartyPants; it is compatible with
31
+ RedCloth and BlueCloth. Usually, you call RubyPants like this:
32
+
33
+ ```ruby
34
+ nicehtml = RubyPants.new(uglyhtml, options).to_html
35
+ ```
36
+
37
+ where `options` is an Array of Integers and/or Symbols. If you don’t
38
+ pass any options, RubyPants uses `[2]` as the default.
39
+
40
+ **Note:** This is incompatible with SmartyPants, which uses `[1]`
41
+ as its default.
42
+
43
+ The exact meaning of numbers and symbols is documented at RubyPants#new.
44
+
45
+
46
+ ## SmartyPants license:
47
+
48
+ Copyright © 2003 John Gruber
49
+ (http://daringfireball.net)
50
+ All rights reserved.
51
+
52
+ Redistribution and use in source and binary forms, with or without
53
+ modification, are permitted provided that the following conditions
54
+ are met:
55
+
56
+ * Redistributions of source code must retain the above copyright
57
+ notice, this list of conditions and the following disclaimer.
58
+
59
+ * Redistributions in binary form must reproduce the above copyright
60
+ notice, this list of conditions and the following disclaimer in
61
+ the documentation and/or other materials provided with the
62
+ distribution.
63
+
64
+ * Neither the name “SmartyPants” nor the names of its contributors
65
+ may be used to endorse or promote products derived from this
66
+ software without specific prior written permission.
67
+
68
+ This software is provided by the copyright holders and contributors
69
+ “as is” and any express or implied warranties, including, but not
70
+ limited to, the implied warranties of merchantability and fitness
71
+ for a particular purpose are disclaimed. In no event shall the
72
+ copyright owner or contributors be liable for any direct, indirect,
73
+ incidental, special, exemplary, or consequential damages (including,
74
+ but not limited to, procurement of substitute goods or services;
75
+ loss of use, data, or profits; or business interruption) however
76
+ caused and on any theory of liability, whether in contract, strict
77
+ liability, or tort (including negligence or otherwise) arising in
78
+ any way out of the use of this software, even if advised of the
79
+ possibility of such damage.
80
+
81
+
82
+ ## RubyPants license
83
+
84
+ Copyright © 2004 Christian Neukirchen
85
+
86
+ RubyPants is a derivative work of SmartyPants and smartypants.py.
87
+
88
+ Redistribution and use in source and binary forms, with or without
89
+ modification, are permitted provided that the following conditions
90
+ are met:
91
+
92
+ * Redistributions of source code must retain the above copyright
93
+ notice, this list of conditions and the following disclaimer.
94
+
95
+ * Redistributions in binary form must reproduce the above copyright
96
+ notice, this list of conditions and the following disclaimer in
97
+ the documentation and/or other materials provided with the
98
+ distribution.
99
+
100
+ This software is provided by the copyright holders and contributors
101
+ “as is” and any express or implied warranties, including, but not
102
+ limited to, the implied warranties of merchantability and fitness
103
+ for a particular purpose are disclaimed. In no event shall the
104
+ copyright owner or contributors be liable for any direct, indirect,
105
+ incidental, special, exemplary, or consequential damages (including,
106
+ but not limited to, procurement of substitute goods or services;
107
+ loss of use, data, or profits; or business interruption) however
108
+ caused and on any theory of liability, whether in contract, strict
109
+ liability, or tort (including negligence or otherwise) arising in
110
+ any way out of the use of this software, even if advised of the
111
+ possibility of such damage.
112
+
113
+
114
+ ## Links
115
+
116
+ - John Gruber:: http://daringfireball.net
117
+ - SmartyPants:: http://daringfireball.net/projects/smartypants
118
+ - Chad Miller:: http://web.chad.org
119
+ - Christian Neukirchen:: http://kronavita.de/chris
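The README's note about the default of `[2]` versus SmartyPants' `[1]` is easiest to see on dashes. The following is a minimal, self-contained sketch that mirrors the three `gsub` chains in rubypants.rb for the numeric options 1 to 3. `dashes_for_mode`, `EN_DASH` and `EM_DASH` are hypothetical names used only for illustration; the gem itself is called through `RubyPants.new(text, options).to_html`.

```ruby
# Sketch only: mirrors the dash handling selected by options 1, 2 and 3.
# dashes_for_mode is a hypothetical helper, not part of the gem.
EN_DASH = "\u2013"
EM_DASH = "\u2014"

def dashes_for_mode(str, mode)
  case mode
  when 1 then str.gsub("--", EM_DASH)                       # em-dash shortcut only
  when 2 then str.gsub("---", EM_DASH).gsub("--", EN_DASH)  # old school (RubyPants default)
  when 3 then str.gsub("---", EN_DASH).gsub("--", EM_DASH)  # inverted old school
  else str                                                  # any other value: left untouched here
  end
end

puts dashes_for_mode("foo--bar", 2)  # en-dash between the words
puts dashes_for_mode("foo--bar", 1)  # em-dash between the words
```

In other words, text written with `--` as an em-dash shortcut for SmartyPants would come out with en-dashes under the RubyPants default unless `1` or `3` is passed explicitly.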
data/Rakefile ADDED
@@ -0,0 +1,56 @@
1
+ # Rakefile for rubypants -*-ruby-*-
2
+ require 'rake/rdoctask'
3
+ require 'rake/gempackagetask'
4
+
5
+
6
+ desc "Run all the tests"
7
+ task :default => [:test]
8
+
9
+ desc "Do predistribution stuff"
10
+ task :predist => [:doc]
11
+
12
+
13
+ desc "Run all the tests"
14
+ task :test do
15
+ ruby 'test_rubypants.rb'
16
+ end
17
+
18
+ desc "Make an archive as .tar.gz"
19
+ task :dist => :test do
20
+ system "darcs dist -d rubypants#{get_darcs_tree_version}"
21
+ end
22
+
23
+
24
+ desc "Generate RDoc documentation"
25
+ Rake::RDocTask.new(:doc) do |rdoc|
26
+ rdoc.options << '--line-numbers --inline-source --all'
27
+ rdoc.rdoc_files.include 'README.md'
28
+ rdoc.rdoc_files.include 'rubypants.rb'
29
+ end
30
+
31
+
32
+ # Helper to retrieve the "revision number" of the darcs tree.
33
+ def get_darcs_tree_version
34
+ return "" unless File.directory? "_darcs"
35
+
36
+ changes = `darcs changes`
37
+ count = 0
38
+ tag = "0.0"
39
+
40
+ changes.each_line("\n\n") { |change|
41
+ head, title, desc = change.split("\n", 3)
42
+
43
+ if title =~ /^ \*/
44
+ # Normal change.
45
+ count += 1
46
+ elsif title =~ /tagged (.*)/
47
+ # Tag. We look for these.
48
+ tag = $1
49
+ break
50
+ else
51
+ warn "Unparsable change: #{change}"
52
+ end
53
+ }
54
+
55
+ "-" + tag + "." + count.to_s
56
+ end
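The counting logic in `get_darcs_tree_version` can be tried without a darcs checkout by feeding it the log text directly. This is a rough sketch under stated assumptions: `darcs_tree_version` is a hypothetical pure-function variant, `SAMPLE_LOG` is invented input shaped like the records the Rakefile expects (a header line, then a title line, records separated by blank lines), and the title pattern is loosened to accept any leading whitespace. Real `darcs changes` output may differ.

```ruby
# Sketch only: same idea as get_darcs_tree_version, but taking the log
# text as an argument. Names and sample data are assumptions.
def darcs_tree_version(changes)
  count = 0
  tag = "0.0"

  changes.split(/\n{2,}/).each do |change|
    _head, title, _desc = change.split("\n", 3)

    case title
    when /\A\s*\*/
      count += 1                    # ordinary change since the last tag
    when /tagged (.*)/
      tag = Regexp.last_match(1)    # most recent tag: stop counting here
      break
    else
      warn "Unparsable change: #{change}"
    end
  end

  "-#{tag}.#{count}"
end

# Invented sample log, newest record first.
SAMPLE_LOG = [
  "Mon Jan  5 12:00:00 UTC 2004  dev@example.org",
  "  * fix quote handling",
  "",
  "Sun Jan  4 12:00:00 UTC 2004  dev@example.org",
  "  * add tests",
  "",
  "Sat Jan  3 12:00:00 UTC 2004  dev@example.org",
  "  tagged 0.2",
  ""
].join("\n")

puts darcs_tree_version(SAMPLE_LOG)
```

The resulting suffix is the last tag followed by the number of changes recorded after it, which is what the `:dist` task appends to the archive name.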
data/install.rb ADDED
@@ -0,0 +1,9 @@
1
+ # Install RubyPants.
2
+
3
+ require "rbconfig"
4
+ require "fileutils"
5
+
6
+ source = "rubypants.rb"
7
+ dest = File.join(RbConfig::CONFIG["sitelibdir"], source)
8
+
9
+ FileUtils.install(source, dest, :mode => 0644, :verbose => true)
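To preview where install.rb would put the file without copying anything, `FileUtils.install` can be run with `noop: true`. The sketch below is an illustration rather than part of the gem; it reads the site library directory from `RbConfig::CONFIG`, the constant provided by `rbconfig`.

```ruby
require "rbconfig"
require "fileutils"

source = "rubypants.rb"
dest   = File.join(RbConfig::CONFIG["sitelibdir"], source)

# noop: true makes this a dry run: with verbose: true the equivalent
# install command is reported, but no file is read or written.
FileUtils.install(source, dest, mode: 0644, verbose: true, noop: true)
```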
data/rubypants.rb ADDED
@@ -0,0 +1,491 @@
1
+ # encoding: utf-8
2
+ #
3
+ # = RubyPants -- SmartyPants ported to Ruby
4
+ #
5
+ # Ported by Christian Neukirchen <mailto:chneukirchen@gmail.com>
6
+ # Copyright (C) 2004 Christian Neukirchen
7
+ #
8
+ # Incorporates ideas, comments and documentation by Chad Miller
9
+ # Copyright (C) 2004 Chad Miller
10
+ #
11
+ # Original SmartyPants by John Gruber
12
+ # Copyright (C) 2003 John Gruber
13
+ #
14
+
15
+ #
16
+ # = RubyPants -- SmartyPants ported to Ruby
17
+ #
18
+ # == Synopsis
19
+ #
20
+ # RubyPants is a Ruby port of the smart-quotes library SmartyPants.
21
+ #
22
+ # The original "SmartyPants" is a free web publishing plug-in for
23
+ # Movable Type, Blosxom, and BBEdit that easily translates plain ASCII
24
+ # punctuation characters into "smart" typographic punctuation HTML
25
+ # entities.
26
+ #
27
+ #
28
+ # == Description
29
+ #
30
+ # RubyPants can perform the following transformations:
31
+ #
32
+ # * Straight quotes (<tt>"</tt> and <tt>'</tt>) into "curly" quote
33
+ # characters
34
+ # * Backticks-style quotes (<tt>``like this''</tt>) into "curly" quote
35
+ # characters
36
+ # * Dashes (<tt>--</tt> and <tt>---</tt>) into en- and em-dash
37
+ # characters
38
+ # * Three consecutive dots (<tt>...</tt> or <tt>. . .</tt>) into an
39
+ # ellipsis character
40
+ #
41
+ # This means you can write, edit, and save your posts using plain old
42
+ # ASCII straight quotes, plain dashes, and plain dots, but your
43
+ # published posts (and final HTML output) will appear with smart
44
+ # quotes, em-dashes, and proper ellipses.
45
+ #
46
+ # RubyPants does not modify characters within <tt><pre></tt>,
47
+ # <tt><code></tt>, <tt><kbd></tt>, <tt><math></tt> or
48
+ # <tt><script></tt> tag blocks. Typically, these tags are used to
49
+ # display text where smart quotes and other "smart punctuation" would
50
+ # not be appropriate, such as source code or example markup.
51
+ #
52
+ #
53
+ # == Backslash Escapes
54
+ #
55
+ # If you need to use literal straight quotes (or plain hyphens and
56
+ # periods), RubyPants accepts the following backslash escape sequences
57
+ # to force non-smart punctuation. It does so by transforming the
58
+ # escape sequence into a decimal-encoded HTML entity:
59
+ #
60
+ # \\ \" \' \. \- \`
61
+ #
62
+ # This is useful, for example, when you want to use straight quotes as
63
+ # foot and inch marks: 6'2" tall; a 17" iMac. (Use <tt>6\'2\"</tt>
64
+ # and <tt>17\"</tt>, respectively.)
65
+ #
66
+ #
67
+ # == Algorithmic Shortcomings
68
+ #
69
+ # One situation in which quotes will get curled the wrong way is when
70
+ # apostrophes are used at the start of leading contractions. For
71
+ # example:
72
+ #
73
+ # 'Twas the night before Christmas.
74
+ #
75
+ # In the case above, RubyPants will turn the apostrophe into an
76
+ # opening single-quote, when in fact it should be a closing one. I
77
+ # don't think this problem can be solved in the general case--every
78
+ # word processor I've tried gets this wrong as well. In such cases,
79
+ # it's best to insert the proper closing single-quote character
80
+ # ("<tt>’</tt>") by hand.
81
+ #
82
+ #
83
+ # == Bugs
84
+ #
85
+ # To file bug reports or feature requests (except see above) please
86
+ # send email to: mailto:chneukirchen@gmail.com
87
+ #
88
+ # If the bug involves quotes being curled the wrong way, please send
89
+ # example text to illustrate.
90
+ #
91
+ #
92
+ # == Authors
93
+ #
94
+ # John Gruber did all of the hard work of writing this software in
95
+ # Perl for Movable Type and almost all of this useful documentation.
96
+ # Chad Miller ported it to Python to use with Pyblosxom.
97
+ #
98
+ # Christian Neukirchen provided the Ruby port, as a general-purpose
99
+ # library that follows the *Cloth API.
100
+ #
101
+ #
102
+ # == Copyright and License
103
+ #
104
+ # === SmartyPants license:
105
+ #
106
+ # Copyright (c) 2003 John Gruber
107
+ # (http://daringfireball.net)
108
+ # All rights reserved.
109
+ #
110
+ # Redistribution and use in source and binary forms, with or without
111
+ # modification, are permitted provided that the following conditions
112
+ # are met:
113
+ #
114
+ # * Redistributions of source code must retain the above copyright
115
+ # notice, this list of conditions and the following disclaimer.
116
+ #
117
+ # * Redistributions in binary form must reproduce the above copyright
118
+ # notice, this list of conditions and the following disclaimer in
119
+ # the documentation and/or other materials provided with the
120
+ # distribution.
121
+ #
122
+ # * Neither the name "SmartyPants" nor the names of its contributors
123
+ # may be used to endorse or promote products derived from this
124
+ # software without specific prior written permission.
125
+ #
126
+ # This software is provided by the copyright holders and contributors
127
+ # "as is" and any express or implied warranties, including, but not
128
+ # limited to, the implied warranties of merchantability and fitness
129
+ # for a particular purpose are disclaimed. In no event shall the
130
+ # copyright owner or contributors be liable for any direct, indirect,
131
+ # incidental, special, exemplary, or consequential damages (including,
132
+ # but not limited to, procurement of substitute goods or services;
133
+ # loss of use, data, or profits; or business interruption) however
134
+ # caused and on any theory of liability, whether in contract, strict
135
+ # liability, or tort (including negligence or otherwise) arising in
136
+ # any way out of the use of this software, even if advised of the
137
+ # possibility of such damage.
138
+ #
139
+ # === RubyPants license
140
+ #
141
+ # RubyPants is a derivative work of SmartyPants and smartypants.py.
142
+ #
143
+ # Redistribution and use in source and binary forms, with or without
144
+ # modification, are permitted provided that the following conditions
145
+ # are met:
146
+ #
147
+ # * Redistributions of source code must retain the above copyright
148
+ # notice, this list of conditions and the following disclaimer.
149
+ #
150
+ # * Redistributions in binary form must reproduce the above copyright
151
+ # notice, this list of conditions and the following disclaimer in
152
+ # the documentation and/or other materials provided with the
153
+ # distribution.
154
+ #
155
+ # This software is provided by the copyright holders and contributors
156
+ # "as is" and any express or implied warranties, including, but not
157
+ # limited to, the implied warranties of merchantability and fitness
158
+ # for a particular purpose are disclaimed. In no event shall the
159
+ # copyright owner or contributors be liable for any direct, indirect,
160
+ # incidental, special, exemplary, or consequential damages (including,
161
+ # but not limited to, procurement of substitute goods or services;
162
+ # loss of use, data, or profits; or business interruption) however
163
+ # caused and on any theory of liability, whether in contract, strict
164
+ # liability, or tort (including negligence or otherwise) arising in
165
+ # any way out of the use of this software, even if advised of the
166
+ # possibility of such damage.
167
+ #
168
+ #
169
+ # == Links
170
+ #
171
+ # John Gruber:: http://daringfireball.net
172
+ # SmartyPants:: http://daringfireball.net/projects/smartypants
173
+ #
174
+ # Chad Miller:: http://web.chad.org
175
+ #
176
+ # Christian Neukirchen:: http://kronavita.de/chris
177
+ #
178
+
179
+
180
+ class RubyPants < String
181
+ VERSION = "0.2"
182
+
183
+ # Create a new RubyPants instance with the text in +string+.
184
+ #
185
+ # Allowed elements in the options array:
186
+ #
187
+ # 0 :: do nothing
188
+ # 1 :: enable all, using only em-dash shortcuts
189
+ # 2 :: enable all, using old school en- and em-dash shortcuts (*default*)
190
+ # 3 :: enable all, using inverted old school en and em-dash shortcuts
191
+ # -1 :: stupefy (translate HTML entities to their ASCII-counterparts)
192
+ #
193
+ # If you don't like any of these defaults, you can pass symbols to change
194
+ # RubyPants' behavior:
195
+ #
196
+ # <tt>:quotes</tt> :: quotes
197
+ # <tt>:backticks</tt> :: backtick quotes (``double'' only)
198
+ # <tt>:allbackticks</tt> :: backtick quotes (``double'' and `single')
199
+ # <tt>:dashes</tt> :: dashes
200
+ # <tt>:oldschool</tt> :: old school dashes
201
+ # <tt>:inverted</tt> :: inverted old school dashes
202
+ # <tt>:ellipses</tt> :: ellipses
203
+ # <tt>:convertquotes</tt> :: convert <tt>&quot;</tt> entities to
204
+ # <tt>"</tt> for Dreamweaver users
205
+ # <tt>:stupefy</tt> :: translate RubyPants HTML entities
206
+ # to their ASCII counterparts.
207
+ #
208
+ def initialize(string, options=[2])
209
+ super string
210
+ @options = [*options]
211
+ end
212
+
213
+ # Apply SmartyPants transformations.
214
+ def to_html
215
+ do_quotes = do_backticks = do_dashes = do_ellipses = do_stupefy = nil
216
+ convert_quotes = false
217
+
218
+ if @options.include? 0
219
+ # Do nothing.
220
+ return self
221
+ elsif @options.include? 1
222
+ # Do everything, turn all options on.
223
+ do_quotes = do_backticks = do_ellipses = true
224
+ do_dashes = :normal
225
+ elsif @options.include? 2
226
+ # Do everything, turn all options on, use old school dash shorthand.
227
+ do_quotes = do_backticks = do_ellipses = true
228
+ do_dashes = :oldschool
229
+ elsif @options.include? 3
230
+ # Do everything, turn all options on, use inverted old school
231
+ # dash shorthand.
232
+ do_quotes = do_backticks = do_ellipses = true
233
+ do_dashes = :inverted
234
+ elsif @options.include?(-1)
235
+ do_stupefy = true
236
+ else
237
+ do_quotes = @options.include? :quotes
238
+ do_backticks = @options.include? :backticks
239
+ do_backticks = :both if @options.include? :allbackticks
240
+ do_dashes = :normal if @options.include? :dashes
241
+ do_dashes = :oldschool if @options.include? :oldschool
242
+ do_dashes = :inverted if @options.include? :inverted
243
+ do_ellipses = @options.include? :ellipses
244
+ convert_quotes = @options.include? :convertquotes
245
+ do_stupefy = @options.include? :stupefy
246
+ end
247
+
248
+ # Parse the HTML
249
+ tokens = tokenize
250
+
251
+ # Keep track of when we're inside <pre> or <code> tags.
252
+ in_pre = false
253
+
254
+ # The result is accumulated here.
255
+ result = ""
256
+
257
+ # This is a cheat, used to get some context for one-character
258
+ # tokens that consist of just a quote char. What we do is remember
259
+ # the last character of the previous text token, to use as context
260
+ # to curl single-character quote tokens correctly.
261
+ prev_token_last_char = nil
262
+
263
+ tokens.each { |token|
264
+ if token.first == :tag
265
+ result << token[1]
266
+ if token[1] =~ %r!<(/?)(?:pre|code|kbd|script|math)[\s>]!
267
+ in_pre = ($1 != "/") # Opening or closing tag?
268
+ end
269
+ else
270
+ t = token[1]
271
+
272
+ # Remember last char of this token before processing.
273
+ last_char = t[-1].chr
274
+
275
+ unless in_pre
276
+ t = process_escapes t
277
+
278
+ t.gsub!(/&quot;/, '"') if convert_quotes
279
+
280
+ if do_dashes
281
+ t = educate_dashes t if do_dashes == :normal
282
+ t = educate_dashes_oldschool t if do_dashes == :oldschool
283
+ t = educate_dashes_inverted t if do_dashes == :inverted
284
+ end
285
+
286
+ t = educate_ellipses t if do_ellipses
287
+
288
+ # Note: backticks need to be processed before quotes.
289
+ if do_backticks
290
+ t = educate_backticks t
291
+ t = educate_single_backticks t if do_backticks == :both
292
+ end
293
+
294
+ if do_quotes
295
+ if t == "'"
296
+ # Special case: single-character ' token
297
+ if prev_token_last_char =~ /\S/
298
+ t = "’"
299
+ else
300
+ t = "‘"
301
+ end
302
+ elsif t == '"'
303
+ # Special case: single-character " token
304
+ if prev_token_last_char =~ /\S/
305
+ t = "”"
306
+ else
307
+ t = "“"
308
+ end
309
+ else
310
+ # Normal case:
311
+ t = educate_quotes t
312
+ end
313
+ end
314
+
315
+ t = stupefy_entities t if do_stupefy
316
+ end
317
+
318
+ prev_token_last_char = last_char
319
+ result << t
320
+ end
321
+ }
322
+
323
+ # Done
324
+ result
325
+ end
326
+
327
+ protected
328
+
329
+ # Return the string after processing the following backslash
330
+ # escape sequences. This is useful if you want to force a "dumb" quote
331
+ # or other character to appear.
332
+ #
333
+ # Escaped are:
334
+ # \\ \" \' \. \- \`
335
+ #
336
+ def process_escapes(str)
337
+ str.gsub('\\\\', '&#92;').
338
+ gsub('\"', '&#34;').
339
+ gsub("\\\'", '&#39;').
340
+ gsub('\.', '&#46;').
341
+ gsub('\-', '&#45;').
342
+ gsub('\`', '&#96;')
343
+ end
344
+
345
+ # The string, with each instance of "<tt>--</tt>" translated to an
346
+ # em-dash character.
347
+ #
348
+ def educate_dashes(str)
349
+ str.gsub(/--/, '—')
350
+ end
351
+
352
+ # The string, with each instance of "<tt>--</tt>" translated to an
353
+ # en-dash character, and each "<tt>---</tt>" translated to an
354
+ # em-dash character.
355
+ #
356
+ def educate_dashes_oldschool(str)
357
+ str.gsub(/---/, '—').gsub(/--/, '–')
358
+ end
359
+
360
+ # Return the string, with each instance of "<tt>--</tt>" translated
361
+ # to an em-dash character, and each "<tt>---</tt>" translated to
362
+ # an en-dash character. Two reasons why: First, unlike the en- and
363
+ # em-dash syntax supported by +educate_dashes_oldschool+, it's
364
+ # compatible with existing entries written before SmartyPants 1.1,
365
+ # back when "<tt>--</tt>" was only used for em-dashes. Second,
366
+ # em-dashes are more common than en-dashes, and so it sort of makes
367
+ # sense that the shortcut should be shorter to type. (Thanks to
368
+ # Aaron Swartz for the idea.)
369
+ #
370
+ def educate_dashes_inverted(str)
371
+ str.gsub(/---/, '–').gsub(/--/, '—')
372
+ end
373
+
374
+ # Return the string, with each instance of "<tt>...</tt>" translated
375
+ # to an ellipsis character. Also converts the case where there are
376
+ # spaces between the dots.
377
+ #
378
+ def educate_ellipses(str)
379
+ str.gsub('...', '…').gsub('. . .', '…')
380
+ end
381
+
382
+ # Return the string, with "<tt>``backticks''</tt>"-style double quotes
383
+ # translated into curly quote characters.
384
+ #
385
+ def educate_backticks(str)
386
+ str.gsub("``", '“').gsub("''", '”')
387
+ end
388
+
389
+ # Return the string, with "<tt>`backticks'</tt>"-style single quotes
390
+ # translated into curly quote characters.
391
+ #
392
+ def educate_single_backticks(str)
393
+ str.gsub("`", '‘').gsub("'", '’')
394
+ end
395
+
396
+ # Return the string, with "educated" curly quote characters.
397
+ #
398
+ def educate_quotes(str)
399
+ punct_class = '[!"#\$\%\'()*+,\-.\/:;<=>?\@\[\\\\\]\^_`{|}~]'
400
+
401
+ str = str.dup
402
+
403
+ # Special case if the very first character is a quote followed by
404
+ # punctuation at a non-word-break. Close the quotes by brute
405
+ # force:
406
+ str.gsub!(/^'(?=#{punct_class}\B)/, '’')
407
+ str.gsub!(/^"(?=#{punct_class}\B)/, '”')
408
+
409
+ # Special case for double sets of quotes, e.g.:
410
+ # <p>He said, "'Quoted' words in a larger quote."</p>
411
+ str.gsub!(/"'(?=\w)/, '“‘')
412
+ str.gsub!(/'"(?=\w)/, '‘“')
413
+
414
+ # Special case for decade abbreviations (the '80s):
415
+ str.gsub!(/'(?=\d\ds)/, '’')
416
+
417
+ close_class = %![^\ \t\r\n\\[\{\(\-]!
418
+ dec_dashes = '–|—'
419
+
420
+ # Get most opening single quotes:
421
+ str.gsub!(/(\s|&nbsp;|--|&[mn]dash;|#{dec_dashes}|&#x201[34];)'(?=\w)/,
422
+ '\1‘')
423
+ # Single closing quotes:
424
+ str.gsub!(/(#{close_class})'/, '\1’')
425
+ str.gsub!(/'(\s|s\b|$)/, '’\1')
426
+ # Any remaining single quotes should be opening ones:
427
+ str.gsub!(/'/, '‘')
428
+
429
+ # Get most opening double quotes:
430
+ str.gsub!(/(\s|&nbsp;|--|&[mn]dash;|#{dec_dashes}|&#x201[34];)"(?=\w)/,
431
+ '\1“')
432
+ # Double closing quotes:
433
+ str.gsub!(/(#{close_class})"/, '\1”')
434
+ str.gsub!(/"(\s|s\b|$)/, '”\1')
435
+ # Any remaining quotes should be opening ones:
436
+ str.gsub!(/"/, '“')
437
+
438
+ str
439
+ end
440
+
441
+ # Return the string, with each RubyPants typographic character translated to
442
+ # its ASCII counterpart.
443
+ #
444
+ # Note: This is not reversible (but exactly the same as in SmartyPants)
445
+ #
446
+ def stupefy_entities(str)
447
+ str.
448
+ gsub(/–/, '-'). # en-dash
449
+ gsub(/—/, '--'). # em-dash
450
+
451
+ gsub(/‘/, "'"). # open single quote
452
+ gsub(/’/, "'"). # close single quote
453
+
454
+ gsub(/“/, '"'). # open double quote
455
+ gsub(/”/, '"'). # close double quote
456
+
457
+ gsub(/…/, '...') # ellipsis
458
+ end
459
+
460
+ # Return an array of the tokens comprising the string. Each token is
461
+ # either a tag (possibly with nested tags contained therein, such
462
+ # as <tt><a href="<MTFoo>"></tt>), or a run of text between
463
+ # tags. Each element of the array is a two-element array; the first
464
+ # is either :tag or :text; the second is the actual value.
465
+ #
466
+ # Based on the <tt>_tokenize()</tt> subroutine from Brad Choate's
467
+ # MTRegex plugin. <http://www.bradchoate.com/past/mtregex.php>
468
+ #
469
+ # This is actually the easier variant using tag_soup, as used by
470
+ # Chad Miller in the Python port of SmartyPants.
471
+ #
472
+ def tokenize
473
+ tag_soup = /([^<]*)(<[^>]*>)/
474
+
475
+ tokens = []
476
+
477
+ prev_end = 0
478
+ scan(tag_soup) {
479
+ tokens << [:text, $1] if $1 != ""
480
+ tokens << [:tag, $2]
481
+
482
+ prev_end = $~.end(0)
483
+ }
484
+
485
+ if prev_end < size
486
+ tokens << [:text, self[prev_end..-1]]
487
+ end
488
+
489
+ tokens
490
+ end
491
+ end
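The `tokenize` method above is what lets `to_html` skip tags and track `<pre>`/`<code>` blocks. As a minimal sketch of the same tag-soup approach, the function below takes the string as an argument instead of subclassing `String`; `tokenize_html` and `TAG_SOUP` are hypothetical names for illustration, and the regular expression is the one shown in the diff.

```ruby
# Sketch only: standalone version of the tag-soup tokenizer.
TAG_SOUP = /([^<]*)(<[^>]*>)/

def tokenize_html(html)
  tokens = []
  prev_end = 0

  html.scan(TAG_SOUP) do |text, tag|
    tokens << [:text, text] unless text.empty?  # text before the tag, if any
    tokens << [:tag, tag]
    prev_end = Regexp.last_match.end(0)         # where this match stopped
  end

  # Whatever follows the last tag is trailing text.
  tokens << [:text, html[prev_end..-1]] if prev_end < html.size
  tokens
end

p tokenize_html("foo<b>'</b> bar")
```

Note how the lone `'` ends up as its own text token. That is the situation the `prev_token_last_char` bookkeeping in `to_html` handles: the quote has no context inside its own token, so the last character of the previous text token decides which way it curls.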
data/test_rubypants.rb ADDED
@@ -0,0 +1,164 @@
1
+ # encoding: utf-8
2
+
3
+ require 'test/unit'
4
+ require './rubypants'
5
+
6
+ # Test EVERYTHING against SmartyPants.pl output!
7
+
8
+
9
+ class TestRubyPants < Test::Unit::TestCase
10
+ def assert_rp_equal(str, orig, options=[2])
11
+ assert_equal orig, RubyPants.new(str, options).to_html
12
+ end
13
+
14
+ def assert_verbatim(str)
15
+ assert_rp_equal str, str
16
+ end
17
+
18
+ def test_verbatim
19
+ assert_verbatim "foo!"
20
+ assert_verbatim "<div>This is HTML</div>"
21
+ assert_verbatim "<div>This is HTML with <crap </div> tags>"
22
+ assert_verbatim <<EOF
23
+ multiline
24
+
25
+ <b>html</b>
26
+
27
+ code
28
+
29
+ EOF
30
+ end
31
+
32
+ def test_quotes
33
+ assert_rp_equal '"A first example"', '“A first example”'
34
+ assert_rp_equal '"A first "nested" example"',
35
+ '“A first “nested” example”'
36
+
37
+ assert_rp_equal '".', '”.'
38
+ assert_rp_equal '"a', '“a'
39
+
40
+ assert_rp_equal "'.", '’.'
41
+ assert_rp_equal "'a", '‘a'
42
+
43
+ assert_rp_equal %{<p>He said, "'Quoted' words in a larger quote."</p>},
44
+ "<p>He said, “‘Quoted’ words in a larger quote.”</p>"
45
+
46
+ assert_rp_equal %{"I like the 70's"}, '“I like the 70’s”'
47
+ assert_rp_equal %{"I like the '70s"}, '“I like the ’70s”'
48
+ assert_rp_equal %{"I like the '70!"}, '“I like the ‘70!”'
49
+
50
+ assert_rp_equal 'pre"post', 'pre”post'
51
+ assert_rp_equal 'pre "post', 'pre “post'
52
+ assert_rp_equal 'pre&nbsp;"post', 'pre&nbsp;“post'
53
+ assert_rp_equal 'pre--"post', 'pre–“post'
54
+ assert_rp_equal 'pre--"!', 'pre–”!'
55
+
56
+ assert_rp_equal "pre'post", 'pre’post'
57
+ assert_rp_equal "pre 'post", 'pre ‘post'
58
+ assert_rp_equal "pre&nbsp;'post", 'pre&nbsp;‘post'
59
+ assert_rp_equal "pre--'post", 'pre–‘post'
60
+ assert_rp_equal "pre--'!", 'pre–’!'
61
+
62
+ assert_rp_equal "<b>'</b>", "<b>‘</b>"
63
+ assert_rp_equal "foo<b>'</b>", "foo<b>’</b>"
64
+
65
+ assert_rp_equal '<b>"</b>', "<b>“</b>"
66
+ assert_rp_equal 'foo<b>"</b>', "foo<b>”</b>"
67
+ end
68
+
69
+ def test_dashes
70
+ assert_rp_equal "foo--bar", 'foo—bar', 1
71
+ assert_rp_equal "foo---bar", 'foo—-bar', 1
72
+ assert_rp_equal "foo----bar", 'foo——bar', 1
73
+ assert_rp_equal "foo-----bar", 'foo——-bar', 1
74
+ assert_rp_equal "--foo--bar--quux--",
75
+ '—foo—bar—quux—', 1
76
+
77
+ assert_rp_equal "foo--bar", 'foo–bar', 2
78
+ assert_rp_equal "foo---bar", 'foo—bar', 2
79
+ assert_rp_equal "foo----bar", 'foo—-bar', 2
80
+ assert_rp_equal "foo-----bar", 'foo—–bar', 2
81
+ assert_rp_equal "--foo--bar--quux--",
82
+ '–foo–bar–quux–', 2
83
+
84
+ assert_rp_equal "foo--bar", 'foo—bar', 3
85
+ assert_rp_equal "foo---bar", 'foo–bar', 3
86
+ assert_rp_equal "foo----bar", 'foo–-bar', 3
87
+ assert_rp_equal "foo-----bar", 'foo–—bar', 3
88
+ assert_rp_equal "--foo--bar--quux--",
89
+ '—foo—bar—quux—', 3
90
+ end
91
+
92
+ def test_ellipses
93
+ assert_rp_equal "foo..bar", 'foo..bar'
94
+ assert_rp_equal "foo...bar", 'foo…bar'
95
+ assert_rp_equal "foo....bar", 'foo….bar'
96
+
97
+ # Nasty ones
98
+ assert_rp_equal "foo. . ..bar", 'foo….bar'
99
+ assert_rp_equal "foo. . ...bar", 'foo. . …bar'
100
+ assert_rp_equal "foo. . ....bar", 'foo. . ….bar'
101
+ end
102
+
103
+ def test_backticks
104
+ assert_rp_equal "pre``post", 'pre“post'
105
+ assert_rp_equal "pre ``post", 'pre “post'
106
+ assert_rp_equal "pre&nbsp;``post", 'pre&nbsp;“post'
107
+ assert_rp_equal "pre--``post", 'pre–“post'
108
+ assert_rp_equal "pre--``!", 'pre–“!'
109
+
110
+ assert_rp_equal "pre''post", 'pre”post'
111
+ assert_rp_equal "pre ''post", 'pre ”post'
112
+ assert_rp_equal "pre&nbsp;''post", 'pre&nbsp;”post'
113
+ assert_rp_equal "pre--''post", 'pre–”post'
114
+ assert_rp_equal "pre--''!", 'pre–”!'
115
+ end
116
+
117
+ def test_single_backticks
118
+ o = [:oldschool, :allbackticks]
119
+
120
+ assert_rp_equal "`foo'", "‘foo’", o
121
+
122
+ assert_rp_equal "pre`post", 'pre‘post', o
123
+ assert_rp_equal "pre `post", 'pre ‘post', o
124
+ assert_rp_equal "pre&nbsp;`post", 'pre&nbsp;‘post', o
125
+ assert_rp_equal "pre--`post", 'pre–‘post', o
126
+ assert_rp_equal "pre--`!", 'pre–‘!', o
127
+
128
+ assert_rp_equal "pre'post", 'pre’post', o
129
+ assert_rp_equal "pre 'post", 'pre ’post', o
130
+ assert_rp_equal "pre&nbsp;'post", 'pre&nbsp;’post', o
131
+ assert_rp_equal "pre--'post", 'pre–’post', o
132
+ assert_rp_equal "pre--'!", 'pre–’!', o
133
+ end
134
+
135
+ def test_stupefy
136
+ o = [:stupefy]
137
+
138
+ assert_rp_equal "<p>He said, “‘Quoted’ words " +
139
+ "in a larger quote.”</p>",
140
+ %{<p>He said, "'Quoted' words in a larger quote."</p>}, o
141
+
142
+ assert_rp_equal "– — ‘’ “” …",
143
+ %{- -- '' "" ...}, o
144
+
145
+ assert_rp_equal %{- -- '' "" ...}, %{- -- '' "" ...}, o
146
+ end
147
+
148
+ def test_process_escapes
149
+ assert_rp_equal %q{foo\bar}, "foo\\bar"
150
+ assert_rp_equal %q{foo\\\bar}, "foo&#92;bar"
151
+ assert_rp_equal %q{foo\\\\\bar}, "foo&#92;\\bar"
152
+ assert_rp_equal %q{foo\...bar}, "foo&#46;..bar"
153
+ assert_rp_equal %q{foo\.\.\.bar}, "foo&#46;&#46;&#46;bar"
154
+
155
+ assert_rp_equal %q{foo\'bar}, "foo&#39;bar"
156
+ assert_rp_equal %q{foo\"bar}, "foo&#34;bar"
157
+ assert_rp_equal %q{foo\-bar}, "foo&#45;bar"
158
+ assert_rp_equal %q{foo\`bar}, "foo&#96;bar"
159
+
160
+ assert_rp_equal %q{foo\#bar}, "foo\\#bar"
161
+ assert_rp_equal %q{foo\*bar}, "foo\\*bar"
162
+ assert_rp_equal %q{foo\&bar}, "foo\\&bar"
163
+ end
164
+ end
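The "nasty" cases in `test_ellipses` follow from the order of the two substitutions in `educate_ellipses`: unspaced `...` is replaced first, and only then the spaced `. . .` form. A small standalone sketch makes this visible; `ellipses_sketch` and `ELLIPSIS` are hypothetical names, and the body is the same two-step `gsub` chain as in rubypants.rb.

```ruby
# Sketch only: the two-pass ellipsis substitution, unspaced form first.
ELLIPSIS = "\u2026"

def ellipses_sketch(str)
  str.gsub("...", ELLIPSIS).gsub(". . .", ELLIPSIS)
end

# Inputs taken from test_ellipses.
["foo...bar", "foo. . ..bar", "foo. . ...bar"].each do |input|
  puts "#{input.inspect} -> #{ellipses_sketch(input).inspect}"
end
```

In `"foo. . ...bar"` the first pass consumes the trailing three dots, so the second pass no longer finds a complete `. . .` and the leading `. . ` is left as it is.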
metadata ADDED
@@ -0,0 +1,73 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: rubypants-unicode
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.2.0
5
+ prerelease:
6
+ platform: ruby
7
+ authors:
8
+ - Chris Chapman
9
+ - Christian Neukirchen
10
+ autorequire:
11
+ bindir: bin
12
+ cert_chain: []
13
+ date: 2012-07-30 00:00:00.000000000 Z
14
+ dependencies: []
15
+ description: ! 'RubyPants-Unicode is a Ruby port of the smart-quotes library SmartyPants
16
+ that outputs
17
+
18
+ unicode characters (UTF-8) instead of HTML entities.
19
+
20
+
21
+ The original "SmartyPants" is a free web publishing plug-in for
22
+
23
+ Movable Type, Blosxom, and BBEdit that easily translates plain ASCII
24
+
25
+ punctuation characters into "smart" typographic punctuation HTML
26
+
27
+ entities.
28
+
29
+ '
30
+ email: chris.chapman@aggiemail.usu.edu
31
+ executables: []
32
+ extensions: []
33
+ extra_rdoc_files:
34
+ - README.md
35
+ files:
36
+ - install.rb
37
+ - rubypants.rb
38
+ - test_rubypants.rb
39
+ - README.md
40
+ - Rakefile
41
+ homepage: https://github.com/cdchapman/rubypants-unicode
42
+ licenses: []
43
+ post_install_message:
44
+ rdoc_options:
45
+ - --main
46
+ - README.md
47
+ - --line-numbers
48
+ - --inline-source
49
+ - --all
50
+ - --exclude
51
+ - test_rubypants.rb
52
+ require_paths:
53
+ - .
54
+ required_ruby_version: !ruby/object:Gem::Requirement
55
+ none: false
56
+ requirements:
57
+ - - ! '>='
58
+ - !ruby/object:Gem::Version
59
+ version: '0'
60
+ required_rubygems_version: !ruby/object:Gem::Requirement
61
+ none: false
62
+ requirements:
63
+ - - ! '>='
64
+ - !ruby/object:Gem::Version
65
+ version: '0'
66
+ requirements: []
67
+ rubyforge_project:
68
+ rubygems_version: 1.8.24
69
+ signing_key:
70
+ specification_version: 3
71
+ summary: RubyPants-Unicode is a Ruby port of the smart-quotes library SmartyPants.
72
+ test_files:
73
+ - test_rubypants.rb