babosa 0.3.6 → 0.3.7
- data/README.md +128 -44
- data/lib/babosa/identifier.rb +7 -2
- data/lib/babosa/transliterator/base.rb +1 -0
- data/lib/babosa/transliterator/swedish.rb +16 -0
- data/lib/babosa/version.rb +1 -1
- data/spec/babosa_spec.rb +5 -0
- data/spec/transliterators/swedish_spec.rb +18 -0
- metadata +11 -9
data/README.md
CHANGED
@@ -1,8 +1,7 @@
 # Babosa
 
-Babosa is a library for creating human-friendly identifiers.
-
-normalizing and sanitizing data.
+Babosa is a library for creating human-friendly identifiers, aka "slugs". It can
+also be useful for normalizing and sanitizing data.
 
 It is an extraction and improvement of the string code from
 [FriendlyId](http://github.com/norman/friendly_id). I have released this as a
@@ -11,20 +10,34 @@ FriendlyId.
 
 ## Features / Usage
 
-### ASCII
+### Transliterate UTF-8 characters to ASCII
 
     "Gölcük, Turkey".to_slug.transliterate.to_s #=> "Golcuk, Turkey"
 
-###
+### Locale sensitive transliteration, with support for many languages
 
     "Jürgen Müller".to_slug.transliterate.to_s #=> "Jurgen Muller"
     "Jürgen Müller".to_slug.transliterate(:german).to_s #=> "Juergen Mueller"
 
-
-I'll gladly accept contributions from fluent speakers to support more
-languages.
+Currently supported languages include:
 
-
+* Bulgarian
+* Danish
+* German
+* Greek
+* Macedonian
+* Norwegian
+* Romanian
+* Russian
+* Serbian
+* Spanish
+* Swedish
+* Ukrainian
+
+
+I'll gladly accept contributions from fluent speakers to support more languages.
+
+### Strip non-ASCII characters
 
     "Gölcük, Turkey".to_slug.to_ascii.to_s #=> "Glck, Turkey"
 
@@ -49,8 +62,46 @@ whose length is limited by bytes rather than UTF-8 characters.
 
 ### Other stuff
 
-
-
+#### Using Babosa With FriendlyId 4
+
+    require "babosa"
+
+    class Person < ActiveRecord::Base
+      friendly_id :name, use: :slugged
+
+      def normalize_friendly_id(input)
+        input.to_s.to_slug.normalize(transliterations: :russian).to_s
+      end
+    end
+
+#### Pedantic UTF-8 support
+
+Babosa goes out of its way to handle [nasty Unicode issues you might never think
+you would have](https://github.com/norman/enc/blob/master/equivalence.rb) by
+checking, sanitizing and normalizing your string input.
+
+It will automatically use whatever Unicode library you have loaded before
+Babosa, or fall back to a simple built-in library. Supported
+Unicode libraries include:
+
+* Java (only JRuby of course)
+* Active Support
+* [Unicode](https://github.com/blackwinter/unicode)
+* Built-in
+
+This built-in module is much faster than Active Support but much slower than
+Java or Unicode. It can only do **very** naive Unicode composition to ensure
+that, for example, "é" will always be composed to a single codepoint rather than
+an "e" and a "´" - making it safe to use as a hash key.
+
+But seriously - save yourself the headache and install a real Unicode library.
+If you are using Babosa with a language that uses the Cyrillic alphabet, Babosa
+requires either Unicode, Active Support or Java.
+
+#### Ruby Method Names
+
+Babosa can also generate strings for Ruby method names. (Yes, Ruby 1.9 can use
+UTF-8 chars in method names, but you may not want to):
 
 
     "this is a method".to_slug.to_ruby_method! #=> this_is_a_method
@@ -59,9 +110,10 @@ in method names, but you may not want to):
     # You can also disallow trailing punctuation chars
     "über cool stuff!".to_slug.to_ruby_method(false) #=> uber_cool_stuff
 
+#### Easy to Extend
 
-You can
-
+You can add custom transliterators for your language with very little code. For
+example here's the transliterator for German:
 
     # encoding: utf-8
     module Babosa
@@ -100,44 +152,67 @@ And a spec (you can use this as a template):
     end
 
 
-###
+### Rails 3
+
+Some of Babosa's functionality was added to Active Support 3.
+
+Babosa now differs from ActiveSupport primarily in that it supports non-Latin
+strings by default, and has per-locale ASCII transliterations already baked-in.
+If you are considering using Babosa with Rails 3, you may want to first take a
+look at Active Support's
+[transliterate](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000565)
+and
+[parameterize](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000566)
+to see if they suit your needs.
 
-Babosa
-ActiveSupport gems installed and required prior to requiring "babosa", these
-will be used to perform upcasing and downcasing on UTF-8 strings. On JRuby 1.5
-and above, Java's native Unicode support will be used instead. Unless you're on
-JRuby, which already has excellent support for Unicode via Java's Standard
-Library, I recommend using the Unicode gem because it's the fastest Ruby Unicode
-library available.
+### Babosa vs. Stringex
 
-
-
+Babosa provides much of the functionality provided by the
+[Stringex](https://github.com/rsl/stringex) gem, but in the subjective opinion
+of the author, is for most use cases a much better choice.
 
-
-that, for example, "é" will always be composed to a single codepoint rather than
-an "e" and a "´" - making it safe to use as a hash key. But seriously - save
-yourself the headache and install a real Unicode library.
+#### Fewer Features
 
-
-
+Stringex offers functionality for storing slugs in an Active Record model, like
+a simple version of [FriendlyId](http://github.com/norman/friendly_id), in
+addition to string processing. Babosa only does string processing.
 
+#### Less Aggressive Unicode Transliteration
 
-
+Stringex uses an agressive Unicode to ASCII mapping which outputs gibberish for
+almost anything but Western European langages. Babosa supports only languages
+for which fluent speakers have provided transliterations, to ensure that the
+output makes sense to users.
 
-
+#### Better Locale Support
+
+Recent versions of Stringex support locale-specific transliterations, but
+include no built-in support for any languages. Babosa works out of the box for
+most European languages and is easy to extend if your language is not supported.
+
+#### Unicode Support
+
+Stringex does no Unicode normalization or validation before transliterating
+strings, so if you pass in strings with encoding errors or with different
+Unicode normalizations, you'll get unpredictable results.
+
+#### No Locale Assumptions
+
+Babosa avoids making assumptions about locales like Stringex does, so it doesn't
+offer transliterations like this out of the box:
+
+    "$12 worth of Ruby power".to_url => "12-dollars-worth-of-ruby-power"
+
+This is because the symbol "$" is used by many Latin American countries for the
+peso. Stringex does this in many places, for example, transliterating all Han
+characters into Pinyin, effectively treating Japanese text as if it were
+Chinese.
 
-Babosa differs from ActiveSupport primarily in that it supports non-Latin
-strings by default, and has per-locale ASCII transliterations already baked-in.
-If you are considering using Babosa with Rails 3, you should first take a look
-at Active Support's
-[transliterate](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000565)
-and
-[parameterize](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000566)
-because it may already do what you need.
 
 ### More info
 
-Please see the [API docs](http://norman.github.com/babosa) and source code for
+Please see the [API docs](http://norman.github.com/babosa) and source code for
+more info.
 
 ## Getting it
 
@@ -147,12 +222,13 @@ Babosa can be installed via Rubygems:
 
 You can get the source code from its [Github repository](http://github.com/norman/babosa).
 
-Babosa is tested to be compatible with Ruby 1.8.6-1.9.2, JRuby 1.4
-Rubinius 1.0
+Babosa is tested to be compatible with Ruby 1.8.6-1.9.2, JRuby 1.4+, and
+Rubinius 1.0+ It's probably compatible with other Rubies as well.
 
 ## Reporting bugs
 
-Please use Babosa's [Github issue
+Please use Babosa's [Github issue
+tracker](http://github.com/norman/babosa/issues).
 
 
 ## Misc
@@ -165,6 +241,13 @@ Please use Babosa's [Github issue tracker](http://github.com/norman/babosa/issue
 
 ## Contributors
 
+Many thanks to the following people for their help:
+
+* [Philip Arndt](https://github.com/parndt) - Bugfixes
+* [Jonas Forsberg](https://github.com/himynameisjonas) - Swedish support
+* [Jaroslav Kalistsuk](https://github.com/jarosan) - Greek support
+* [Steven Heidel](https://github.com/stevenheidel) - Bugfixes
+* [Edgars Beigarts](https://github.com/ebeigarts) - Support for multiple transliterators
 * [Tiberiu C. Turbureanu](https://gitorious.org/~tct) - Romanian support
 * [Kim Joar Bekkelund](https://github.com/kjbekkelund) - Norwegian support
 * [Alexey Shkolnikov](https://github.com/grlm) - Russian support
@@ -175,6 +258,7 @@ Please use Babosa's [Github issue tracker](http://github.com/norman/babosa/issue
 
 ## Changelog
 
+* 0.3.7 - Fix compatibility with Ruby 1.8.7. Add Swedish support.
 * 0.3.6 - Allow multiple transliterators. Add Greek support.
 * 0.3.5 - Don't strip underscores from identifiers.
 * 0.3.4 - Add Romanian support.
@@ -190,7 +274,7 @@ Please use Babosa's [Github issue tracker](http://github.com/norman/babosa/issue
 
 ## Copyright
 
-Copyright (c) 2010-
+Copyright (c) 2010-2012 Norman Clarke
 
 Permission is hereby granted, free of charge, to any person obtaining a copy of
 this software and associated documentation files (the "Software"), to deal in
data/lib/babosa/identifier.rb
CHANGED
@@ -113,6 +113,7 @@ module Babosa
     # @param *args <Symbol>
     # @return String
     def transliterate!(*kinds)
+      kinds.compact!
       kinds = [:latin] if kinds.empty?
       kinds.each do |kind|
         transliterator = Transliterator.get(kind).instance
@@ -148,8 +149,12 @@ module Babosa
       else
         options = default_normalize_options.merge(options || {})
       end
-      if options[:transliterate]
-
+      if translit_option = options[:transliterate]
+        if translit_option != true
+          transliterate!(*translit_option)
+        else
+          transliterate!(*options[:transliterations])
+        end
       end
       to_ascii! if options[:to_ascii]
       clean!
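The `normalize!` hunk above lets `:transliterate` carry a locale (or a list of locales) instead of only `true`, in which case the older `:transliterations` option is still consulted. The following is a minimal sketch of that dispatch, not Babosa's own code: `transliterator_kinds` is a hypothetical helper that returns the selected kinds rather than transliterating a string, and it folds in the `compact` and `:latin` default from `transliterate!`.

```ruby
# Hypothetical stand-in for the option handling shown in the diff.
# It only decides WHICH transliterators would run; it does not run them.
def transliterator_kinds(options)
  translit_option = options[:transliterate]
  return [] unless translit_option # transliteration switched off

  kinds =
    if translit_option != true
      Array(translit_option)            # e.g. :german or [:german, :greek]
    else
      Array(options[:transliterations]) # pre-0.3.7 style option
    end

  kinds = kinds.compact             # mirrors the new kinds.compact! call
  kinds.empty? ? [:latin] : kinds   # mirrors the existing :latin default
end

p transliterator_kinds(:transliterate => :german)
p transliterator_kinds(:transliterate => true, :transliterations => :russian)
p transliterator_kinds(:transliterate => true)
p transliterator_kinds(:transliterate => false)
```

Under these assumptions the four calls select `[:german]`, `[:russian]`, `[:latin]` and `[]` respectively, which matches the new spec passing `:transliterate => :german`.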
data/lib/babosa/transliterator/base.rb
CHANGED
@@ -17,6 +17,7 @@ module Babosa
     autoload :Russian, "babosa/transliterator/russian"
     autoload :Serbian, "babosa/transliterator/serbian"
     autoload :Spanish, "babosa/transliterator/spanish"
+    autoload :Swedish, "babosa/transliterator/swedish"
     autoload :Ukrainian, "babosa/transliterator/ukrainian"
     autoload :Greek, "babosa/transliterator/greek"
 
data/lib/babosa/version.rb
CHANGED
data/spec/babosa_spec.rb
CHANGED
@@ -54,6 +54,11 @@ describe Babosa::Identifier do
   end
 
   describe "#normalize" do
+
+    it "should allow passing locale as key for :transliterate" do
+      "ö".to_slug.clean.normalize(:transliterate => :german).should eql("oe")
+    end
+
     it "should replace whitespace with dashes" do
       "a b".to_slug.clean.normalize.should eql("a-b")
     end
data/spec/transliterators/swedish_spec.rb
ADDED
@@ -0,0 +1,18 @@
+# encoding: utf-8
+require File.expand_path("../../spec_helper", __FILE__)
+
+describe Babosa::Transliterator::Swedish do
+
+  let(:t) { described_class.instance }
+  it_behaves_like "a latin transliterator"
+
+  it "should transliterate various characters" do
+    examples = {
+      "Räksmörgås" => "Raeksmoergaas",
+      "Öre" => "Oere",
+      "Åre" => "Aare",
+      "Älskar" => "Aelskar"
+    }
+    examples.each {|k, v| t.transliterate(k).should eql(v)}
+  end
+end
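The body of `swedish.rb` is not shown in this diff, so the mapping below is only inferred from the spec examples above. It is a standalone sketch: `SWEDISH_APPROXIMATIONS` and `transliterate_swedish` are hypothetical names, and the real `Babosa::Transliterator::Swedish` may also handle other characters.

```ruby
# encoding: utf-8

# Assumed mapping, reverse-engineered from the spec examples only.
SWEDISH_APPROXIMATIONS = {
  "å" => "aa", "ä" => "ae", "ö" => "oe",
  "Å" => "Aa", "Ä" => "Ae", "Ö" => "Oe"
}.freeze

# Replace each mapped character; everything else passes through untouched.
# Assumes precomposed (NFC) input, e.g. "ä" as a single codepoint.
def transliterate_swedish(string)
  string.gsub(/[åäöÅÄÖ]/) { |char| SWEDISH_APPROXIMATIONS[char] }
end

puts transliterate_swedish("Räksmörgås") # prints "Raeksmoergaas"
```

Unlike this sketch, the spec also runs the shared "a latin transliterator" examples, so the real class presumably builds on the Latin transliterator rather than leaving other characters unchanged.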
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: babosa
 version: !ruby/object:Gem::Version
-  version: 0.3.
+  version: 0.3.7
   prerelease:
 platform: ruby
 authors:
@@ -9,11 +9,11 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2012-
+date: 2012-03-12 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activesupport
-  requirement: &
+  requirement: &70194416980920 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -21,10 +21,10 @@ dependencies:
         version: 2.3.0
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *70194416980920
 - !ruby/object:Gem::Dependency
   name: rspec
-  requirement: &
+  requirement: &70194416979940 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ~>
@@ -32,10 +32,10 @@ dependencies:
         version: 2.5.0
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *70194416979940
 - !ruby/object:Gem::Dependency
   name: simplecov
-  requirement: &
+  requirement: &70194416979120 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -43,7 +43,7 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *70194416979120
 description: ! " A library for creating slugs. Babosa an extraction and improvement
   of the\n string code from FriendlyId, intended to help developers create similar\n
   \ libraries or plugins.\n"
@@ -66,6 +66,7 @@ files:
 - lib/babosa/transliterator/russian.rb
 - lib/babosa/transliterator/serbian.rb
 - lib/babosa/transliterator/spanish.rb
+- lib/babosa/transliterator/swedish.rb
 - lib/babosa/transliterator/ukrainian.rb
 - lib/babosa/utf8/active_support_proxy.rb
 - lib/babosa/utf8/dumb_proxy.rb
@@ -92,6 +93,7 @@ files:
 - spec/transliterators/russian_spec.rb
 - spec/transliterators/serbian_spec.rb
 - spec/transliterators/spanish_spec.rb
+- spec/transliterators/swedish_spec.rb
 - spec/transliterators/ukrainian_spec.rb
 - spec/utf8_proxy_spec.rb
 - .gemtest
@@ -115,7 +117,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project: ! '[none]'
-rubygems_version: 1.8.
+rubygems_version: 1.8.11
 signing_key:
 specification_version: 3
 summary: A library for creating slugs.