babosa 0.3.6 → 0.3.7

data/README.md CHANGED
@@ -1,8 +1,7 @@
1
1
  # Babosa
2
2
 
3
- Babosa is a library for creating human-friendly identifiers. Its primary
4
- intended purpose is for creating URL slugs, but can also be useful for
5
- normalizing and sanitizing data.
3
+ Babosa is a library for creating human-friendly identifiers, aka "slugs". It can
4
+ also be useful for normalizing and sanitizing data.
6
5
 
7
6
  It is an extraction and improvement of the string code from
8
7
  [FriendlyId](http://github.com/norman/friendly_id). I have released this as a
@@ -11,20 +10,34 @@ FriendlyId.
11
10
 
12
11
  ## Features / Usage
13
12
 
14
- ### ASCII transliteration
13
+ ### Transliterate UTF-8 characters to ASCII
15
14
 
16
15
  "Gölcük, Turkey".to_slug.transliterate.to_s #=> "Golcuk, Turkey"
17
16
 
18
- ### Per-locale transliteration
17
+ ### Locale sensitive transliteration, with support for many languages
19
18
 
20
19
  "Jürgen Müller".to_slug.transliterate.to_s #=> "Jurgen Muller"
21
20
  "Jürgen Müller".to_slug.transliterate(:german).to_s #=> "Juergen Mueller"
22
21
 
23
- Many European languages using both Roman and Cyrillic alphabets are supported.
24
- I'll gladly accept contributions from fluent speakers to support more
25
- languages.
22
+ Currently supported languages include:
26
23
 
27
- ### Non-ASCII removal
24
+ * Bulgarian
25
+ * Danish
26
+ * German
27
+ * Greek
28
+ * Macedonian
29
+ * Norwegian
30
+ * Romanian
31
+ * Russian
32
+ * Serbian
33
+ * Spanish
34
+ * Swedish
35
+ * Ukrainian
36
+
37
+
38
+ I'll gladly accept contributions from fluent speakers to support more languages.
39
+
40
+ ### Strip non-ASCII characters
28
41
 
29
42
  "Gölcük, Turkey".to_slug.to_ascii.to_s #=> "Glck, Turkey"
30
43
 
@@ -49,8 +62,46 @@ whose length is limited by bytes rather than UTF-8 characters.
49
62
 
50
63
  ### Other stuff
51
64
 
52
- Babosa can also generate strings for Ruby method names. (Yes, Ruby 1.9 can use UTF-8 chars
53
- in method names, but you may not want to):
65
+ #### Using Babosa With FriendlyId 4
66
+
67
+ require "babosa"
68
+
69
+ class Person < ActiveRecord::Base
70
+ friendly_id :name, use: :slugged
71
+
72
+ def normalize_friendly_id(input)
73
+ input.to_s.to_slug.normalize(transliterations: :russian).to_s
74
+ end
75
+ end
76
+
77
+ #### Pedantic UTF-8 support
78
+
79
+ Babosa goes out of its way to handle [nasty Unicode issues you might never think
80
+ you would have](https://github.com/norman/enc/blob/master/equivalence.rb) by
81
+ checking, sanitizing and normalizing your string input.
82
+
83
+ It will automatically use whatever Unicode library you have loaded before
84
+ Babosa, or fall back to a simple built-in library. Supported
85
+ Unicode libraries include:
86
+
87
+ * Java (only JRuby of course)
88
+ * Active Support
89
+ * [Unicode](https://github.com/blackwinter/unicode)
90
+ * Built-in
91
+
92
+ This built-in module is much faster than Active Support but much slower than
93
+ Java or Unicode. It can only do **very** naive Unicode composition to ensure
94
+ that, for example, "é" will always be composed to a single codepoint rather than
95
+ an "e" and a "´" - making it safe to use as a hash key.
96
+
97
+ But seriously - save yourself the headache and install a real Unicode library.
98
+ If you are using Babosa with a language that uses the Cyrillic alphabet, Babosa
99
+ requires either Unicode, Active Support or Java.
100
+
101
+ #### Ruby Method Names
102
+
103
+ Babosa can also generate strings for Ruby method names. (Yes, Ruby 1.9 can use
104
+ UTF-8 chars in method names, but you may not want to):
54
105
 
55
106
 
56
107
  "this is a method".to_slug.to_ruby_method! #=> this_is_a_method
@@ -59,9 +110,10 @@ in method names, but you may not want to):
59
110
  # You can also disallow trailing punctuation chars
60
111
  "über cool stuff!".to_slug.to_ruby_method(false) #=> uber_cool_stuff
61
112
 
113
+ #### Easy to Extend
62
114
 
63
- You can easily add custom transliterators for your language with very little code,
64
- for example here's the transliterator for German:
115
+ You can add custom transliterators for your language with very little code. For
116
+ example, here's the transliterator for German:
65
117
 
66
118
  # encoding: utf-8
67
119
  module Babosa
@@ -100,44 +152,67 @@ And a spec (you can use this as a template):
100
152
  end
101
153
 
102
154
 
103
- ### UTF-8 support
155
+ ### Rails 3
156
+
157
+ Some of Babosa's functionality was added to Active Support 3.
158
+
159
+ Babosa now differs from ActiveSupport primarily in that it supports non-Latin
160
+ strings by default, and has per-locale ASCII transliterations already baked-in.
161
+ If you are considering using Babosa with Rails 3, you may want to first take a
162
+ look at Active Support's
163
+ [transliterate](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000565)
164
+ and
165
+ [parameterize](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000566)
166
+ to see if they suit your needs.
104
167
 
105
- Babosa has no hard dependencies, but if you have either the Unicode or
106
- ActiveSupport gems installed and required prior to requiring "babosa", these
107
- will be used to perform upcasing and downcasing on UTF-8 strings. On JRuby 1.5
108
- and above, Java's native Unicode support will be used instead. Unless you're on
109
- JRuby, which already has excellent support for Unicode via Java's Standard
110
- Library, I recommend using the Unicode gem because it's the fastest Ruby Unicode
111
- library available.
168
+ ### Babosa vs. Stringex
112
169
 
113
- If none of these libraries are available, Babosa falls back to a simple module
114
- which **only** supports Latin characters.
170
+ Babosa provides much of the functionality provided by the
171
+ [Stringex](https://github.com/rsl/stringex) gem, but in the subjective opinion
172
+ of the author, is for most use cases a much better choice.
115
173
 
116
- This default module is fast and can do very naive Unicode composition to ensure
117
- that, for example, "é" will always be composed to a single codepoint rather than
118
- an "e" and a "´" - making it safe to use as a hash key. But seriously - save
119
- yourself the headache and install a real Unicode library.
174
+ #### Fewer Features
120
175
 
121
- If you are using Babosa with a language that uses the Cyrillic alphabet, Babosa
122
- requires either Unicode, Active Support or Java.
176
+ Stringex offers functionality for storing slugs in an Active Record model, like
177
+ a simple version of [FriendlyId](http://github.com/norman/friendly_id), in
178
+ addition to string processing. Babosa only does string processing.
123
179
 
180
+ #### Less Aggressive Unicode Transliteration
124
181
 
125
- ### Rails 3
182
+ Stringex uses an aggressive Unicode to ASCII mapping which outputs gibberish for
183
+ almost anything but Western European languages. Babosa supports only languages
184
+ for which fluent speakers have provided transliterations, to ensure that the
185
+ output makes sense to users.
126
186
 
127
- Some of Babosa's functionality is already present in Active Support/Rails 3.
187
+ #### Better Locale Support
188
+
189
+ Recent versions of Stringex support locale-specific transliterations, but
190
+ include no built-in support for any languages. Babosa works out of the box for
191
+ most European languages and is easy to extend if your language is not supported.
192
+
193
+ #### Unicode Support
194
+
195
+ Stringex does no Unicode normalization or validation before transliterating
196
+ strings, so if you pass in strings with encoding errors or with different
197
+ Unicode normalizations, you'll get unpredictable results.
198
+
199
+ #### No Locale Assumptions
200
+
201
+ Babosa avoids making assumptions about locales like Stringex does, so it doesn't
202
+ offer transliterations like this out of the box:
203
+
204
+ "$12 worth of Ruby power".to_url => "12-dollars-worth-of-ruby-power"
205
+
206
+ This is because the symbol "$" is used by many Latin American countries for the
207
+ peso. Stringex does this in many places, for example, transliterating all Han
208
+ characters into Pinyin, effectively treating Japanese text as if it were
209
+ Chinese.
128
210
 
129
- Babosa differs from ActiveSupport primarily in that it supports non-Latin
130
- strings by default, and has per-locale ASCII transliterations already baked-in.
131
- If you are considering using Babosa with Rails 3, you should first take a look
132
- at Active Support's
133
- [transliterate](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000565)
134
- and
135
- [parameterize](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000566)
136
- because it may already do what you need.
137
211
 
138
212
  ### More info
139
213
 
140
- Please see the [API docs](http://norman.github.com/babosa) and source code for more info.
214
+ Please see the [API docs](http://norman.github.com/babosa) and source code for
215
+ more info.
141
216
 
142
217
  ## Getting it
143
218
 
@@ -147,12 +222,13 @@ Babosa can be installed via Rubygems:
147
222
 
148
223
  You can get the source code from its [Github repository](http://github.com/norman/babosa).
149
224
 
150
- Babosa is tested to be compatible with Ruby 1.8.6-1.9.2, JRuby 1.4-1.5, and
151
- Rubinius 1.0.x. It's probably compatible with other Rubies as well.
225
+ Babosa is tested to be compatible with Ruby 1.8.6-1.9.2, JRuby 1.4+, and
226
+ Rubinius 1.0+. It's probably compatible with other Rubies as well.
152
227
 
153
228
  ## Reporting bugs
154
229
 
155
- Please use Babosa's [Github issue tracker](http://github.com/norman/babosa/issues).
230
+ Please use Babosa's [Github issue
231
+ tracker](http://github.com/norman/babosa/issues).
156
232
 
157
233
 
158
234
  ## Misc
@@ -165,6 +241,13 @@ Please use Babosa's [Github issue tracker](http://github.com/norman/babosa/issue
165
241
 
166
242
  ## Contributors
167
243
 
244
+ Many thanks to the following people for their help:
245
+
246
+ * [Philip Arndt](https://github.com/parndt) - Bugfixes
247
+ * [Jonas Forsberg](https://github.com/himynameisjonas) - Swedish support
248
+ * [Jaroslav Kalistsuk](https://github.com/jarosan) - Greek support
249
+ * [Steven Heidel](https://github.com/stevenheidel) - Bugfixes
250
+ * [Edgars Beigarts](https://github.com/ebeigarts) - Support for multiple transliterators
168
251
  * [Tiberiu C. Turbureanu](https://gitorious.org/~tct) - Romanian support
169
252
  * [Kim Joar Bekkelund](https://github.com/kjbekkelund) - Norwegian support
170
253
  * [Alexey Shkolnikov](https://github.com/grlm) - Russian support
@@ -175,6 +258,7 @@ Please use Babosa's [Github issue tracker](http://github.com/norman/babosa/issue
175
258
 
176
259
  ## Changelog
177
260
 
261
+ * 0.3.7 - Fix compatibility with Ruby 1.8.7. Add Swedish support.
178
262
  * 0.3.6 - Allow multiple transliterators. Add Greek support.
179
263
  * 0.3.5 - Don't strip underscores from identifiers.
180
264
  * 0.3.4 - Add Romanian support.
@@ -190,7 +274,7 @@ Please use Babosa's [Github issue tracker](http://github.com/norman/babosa/issue
190
274
 
191
275
  ## Copyright
192
276
 
193
- Copyright (c) 2010-2011 Norman Clarke
277
+ Copyright (c) 2010-2012 Norman Clarke
194
278
 
195
279
  Permission is hereby granted, free of charge, to any person obtaining a copy of
196
280
  this software and associated documentation files (the "Software"), to deal in
@@ -113,6 +113,7 @@ module Babosa
113
113
  # @param *args <Symbol>
114
114
  # @return String
115
115
  def transliterate!(*kinds)
116
+ kinds.compact!
116
117
  kinds = [:latin] if kinds.empty?
117
118
  kinds.each do |kind|
118
119
  transliterator = Transliterator.get(kind).instance
@@ -148,8 +149,12 @@ module Babosa
148
149
  else
149
150
  options = default_normalize_options.merge(options || {})
150
151
  end
151
- if options[:transliterate]
152
- transliterate!(*options[:transliterations])
152
+ if translit_option = options[:transliterate]
153
+ if translit_option != true
154
+ transliterate!(*translit_option)
155
+ else
156
+ transliterate!(*options[:transliterations])
157
+ end
153
158
  end
154
159
  to_ascii! if options[:to_ascii]
155
160
  clean!
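The new branch in `normalize!` lets `:transliterate` carry either `true` or a locale (or list of locales). A minimal standalone sketch of that dispatch follows. It does not load Babosa; `requested_transliterators` is a hypothetical helper that only reports which transliterators would be requested, assuming the `compact!` and `[:latin]` fallback shown in the `transliterate!` hunk.

```ruby
# Hypothetical helper, not part of Babosa: mirrors the option handling in
# the hunk above and returns the list of transliterator kinds it implies.
def requested_transliterators(options)
  translit_option = options[:transliterate]
  # A nil or false :transliterate means no transliteration at all.
  return [] unless translit_option

  # true defers to :transliterations; anything else is used directly.
  kinds = translit_option == true ? options[:transliterations] : translit_option

  # Like the splat plus compact! in transliterate!: accept a single symbol,
  # an array, or nil, and drop nil entries.
  kinds = Array(kinds).compact

  # transliterate! falls back to :latin when nothing was requested.
  kinds.empty? ? [:latin] : kinds
end

p requested_transliterators(:transliterate => :german)
p requested_transliterators(:transliterate => true,
                            :transliterations => [:russian, :latin])
```

Passing `:transliterate => true` with no `:transliterations` key yields `[:latin]` in this sketch, which matches the default applied by `transliterate!` when its argument list is empty.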
@@ -17,6 +17,7 @@ module Babosa
17
17
  autoload :Russian, "babosa/transliterator/russian"
18
18
  autoload :Serbian, "babosa/transliterator/serbian"
19
19
  autoload :Spanish, "babosa/transliterator/spanish"
20
+ autoload :Swedish, "babosa/transliterator/swedish"
20
21
  autoload :Ukrainian, "babosa/transliterator/ukrainian"
21
22
  autoload :Greek, "babosa/transliterator/greek"
22
23
 
@@ -0,0 +1,16 @@
1
+ # encoding: utf-8
2
+ module Babosa
3
+ module Transliterator
4
+ class Swedish < Latin
5
+ APPROXIMATIONS = {
6
+ "å" => "aa",
7
+ "ä" => "ae",
8
+ "ö" => "oe",
9
+ "Å" => "Aa",
10
+ "Ä" => "Ae",
11
+ "Ö" => "Oe"
12
+ }
13
+ end
14
+ end
15
+ end
16
+
@@ -1,5 +1,5 @@
1
1
  module Babosa
2
2
  module Version
3
- STRING = "0.3.6"
3
+ STRING = "0.3.7"
4
4
  end
5
5
  end
@@ -54,6 +54,11 @@ describe Babosa::Identifier do
54
54
  end
55
55
 
56
56
  describe "#normalize" do
57
+
58
+ it "should allow passing locale as key for :transliterate" do
59
+ "ö".to_slug.clean.normalize(:transliterate => :german).should eql("oe")
60
+ end
61
+
57
62
  it "should replace whitespace with dashes" do
58
63
  "a b".to_slug.clean.normalize.should eql("a-b")
59
64
  end
@@ -0,0 +1,18 @@
1
+ # encoding: utf-8
2
+ require File.expand_path("../../spec_helper", __FILE__)
3
+
4
+ describe Babosa::Transliterator::Swedish do
5
+
6
+ let(:t) { described_class.instance }
7
+ it_behaves_like "a latin transliterator"
8
+
9
+ it "should transliterate various characters" do
10
+ examples = {
11
+ "Räksmörgås" => "Raeksmoergaas",
12
+ "Öre" => "Oere",
13
+ "Åre" => "Aare",
14
+ "Älskar" => "Aelskar"
15
+ }
16
+ examples.each {|k, v| t.transliterate(k).should eql(v)}
17
+ end
18
+ end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: babosa
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.3.6
4
+ version: 0.3.7
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,11 +9,11 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2012-01-07 00:00:00.000000000 Z
12
+ date: 2012-03-12 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: activesupport
16
- requirement: &70212217120220 !ruby/object:Gem::Requirement
16
+ requirement: &70194416980920 !ruby/object:Gem::Requirement
17
17
  none: false
18
18
  requirements:
19
19
  - - ! '>='
@@ -21,10 +21,10 @@ dependencies:
21
21
  version: 2.3.0
22
22
  type: :development
23
23
  prerelease: false
24
- version_requirements: *70212217120220
24
+ version_requirements: *70194416980920
25
25
  - !ruby/object:Gem::Dependency
26
26
  name: rspec
27
- requirement: &70212217118700 !ruby/object:Gem::Requirement
27
+ requirement: &70194416979940 !ruby/object:Gem::Requirement
28
28
  none: false
29
29
  requirements:
30
30
  - - ~>
@@ -32,10 +32,10 @@ dependencies:
32
32
  version: 2.5.0
33
33
  type: :development
34
34
  prerelease: false
35
- version_requirements: *70212217118700
35
+ version_requirements: *70194416979940
36
36
  - !ruby/object:Gem::Dependency
37
37
  name: simplecov
38
- requirement: &70212217117820 !ruby/object:Gem::Requirement
38
+ requirement: &70194416979120 !ruby/object:Gem::Requirement
39
39
  none: false
40
40
  requirements:
41
41
  - - ! '>='
@@ -43,7 +43,7 @@ dependencies:
43
43
  version: '0'
44
44
  type: :development
45
45
  prerelease: false
46
- version_requirements: *70212217117820
46
+ version_requirements: *70194416979120
47
47
  description: ! " A library for creating slugs. Babosa an extraction and improvement
48
48
  of the\n string code from FriendlyId, intended to help developers create similar\n
49
49
  \ libraries or plugins.\n"
@@ -66,6 +66,7 @@ files:
66
66
  - lib/babosa/transliterator/russian.rb
67
67
  - lib/babosa/transliterator/serbian.rb
68
68
  - lib/babosa/transliterator/spanish.rb
69
+ - lib/babosa/transliterator/swedish.rb
69
70
  - lib/babosa/transliterator/ukrainian.rb
70
71
  - lib/babosa/utf8/active_support_proxy.rb
71
72
  - lib/babosa/utf8/dumb_proxy.rb
@@ -92,6 +93,7 @@ files:
92
93
  - spec/transliterators/russian_spec.rb
93
94
  - spec/transliterators/serbian_spec.rb
94
95
  - spec/transliterators/spanish_spec.rb
96
+ - spec/transliterators/swedish_spec.rb
95
97
  - spec/transliterators/ukrainian_spec.rb
96
98
  - spec/utf8_proxy_spec.rb
97
99
  - .gemtest
@@ -115,7 +117,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
115
117
  version: '0'
116
118
  requirements: []
117
119
  rubyforge_project: ! '[none]'
118
- rubygems_version: 1.8.10
120
+ rubygems_version: 1.8.11
119
121
  signing_key:
120
122
  specification_version: 3
121
123
  summary: A library for creating slugs.