babosa 0.3.6 → 0.3.7
- data/README.md +128 -44
- data/lib/babosa/identifier.rb +7 -2
- data/lib/babosa/transliterator/base.rb +1 -0
- data/lib/babosa/transliterator/swedish.rb +16 -0
- data/lib/babosa/version.rb +1 -1
- data/spec/babosa_spec.rb +5 -0
- data/spec/transliterators/swedish_spec.rb +18 -0
- metadata +11 -9
data/README.md
CHANGED
@@ -1,8 +1,7 @@
 # Babosa
 
-Babosa is a library for creating human-friendly identifiers.
-
-normalizing and sanitizing data.
+Babosa is a library for creating human-friendly identifiers, aka "slugs". It can
+also be useful for normalizing and sanitizing data.
 
 It is an extraction and improvement of the string code from
 [FriendlyId](http://github.com/norman/friendly_id). I have released this as a
@@ -11,20 +10,34 @@ FriendlyId.
 
 ## Features / Usage
 
-### ASCII
+### Transliterate UTF-8 characters to ASCII
 
     "Gölcük, Turkey".to_slug.transliterate.to_s #=> "Golcuk, Turkey"
 
-###
+### Locale sensitive transliteration, with support for many languages
 
     "Jürgen Müller".to_slug.transliterate.to_s #=> "Jurgen Muller"
     "Jürgen Müller".to_slug.transliterate(:german).to_s #=> "Juergen Mueller"
 
-
-I'll gladly accept contributions from fluent speakers to support more
-languages.
+Currently supported languages include:
 
-
+* Bulgarian
+* Danish
+* German
+* Greek
+* Macedonian
+* Norwegian
+* Romanian
+* Russian
+* Serbian
+* Spanish
+* Swedish
+* Ukrainian
+
+
+I'll gladly accept contributions from fluent speakers to support more languages.
+
+### Strip non-ASCII characters
 
     "Gölcük, Turkey".to_slug.to_ascii.to_s #=> "Glck, Turkey"
 
@@ -49,8 +62,46 @@ whose length is limited by bytes rather than UTF-8 characters.
 
 ### Other stuff
 
-
-
+#### Using Babosa With FriendlyId 4
+
+    require "babosa"
+
+    class Person < ActiveRecord::Base
+      friendly_id :name, use: :slugged
+
+      def normalize_friendly_id(input)
+        input.to_s.to_slug.normalize(transliterations: :russian).to_s
+      end
+    end
+
+#### Pedantic UTF-8 support
+
+Babosa goes out of its way to handle [nasty Unicode issues you might never think
+you would have](https://github.com/norman/enc/blob/master/equivalence.rb) by
+checking, sanitizing and normalizing your string input.
+
+It will automatically use whatever Unicode library you have loaded before
+Babosa, or fall back to a simple built-in library. Supported
+Unicode libraries include:
+
+* Java (only JRuby of course)
+* Active Support
+* [Unicode](https://github.com/blackwinter/unicode)
+* Built-in
+
+This built-in module is much faster than Active Support but much slower than
+Java or Unicode. It can only do **very** naive Unicode composition to ensure
+that, for example, "é" will always be composed to a single codepoint rather than
+an "e" and a "´" - making it safe to use as a hash key.
+
+But seriously - save yourself the headache and install a real Unicode library.
+If you are using Babosa with a language that uses the Cyrillic alphabet, Babosa
+requires either Unicode, Active Support or Java.
+
+#### Ruby Method Names
+
+Babosa can also generate strings for Ruby method names. (Yes, Ruby 1.9 can use
+UTF-8 chars in method names, but you may not want to):
 
 
     "this is a method".to_slug.to_ruby_method! #=> this_is_a_method
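The "Pedantic UTF-8 support" text added in this hunk says that "é" should always be composed to a single codepoint so it is safe to use as a hash key. The sketch below illustrates that problem only; it is not Babosa code. It assumes Ruby 2.2 or later and uses the standard `String#unicode_normalize` in place of Babosa's own Unicode proxies, and the names `COMPOSED`, `DECOMPOSED` and `canonical_key` are made up for the illustration.

```ruby
# Illustration only: plain Ruby (2.2+), not Babosa's implementation.
COMPOSED   = "\u00e9"   # "é" stored as a single codepoint
DECOMPOSED = "e\u0301"  # "e" followed by a combining acute accent

# Hypothetical helper: bring a string into composed (NFC) form before
# it is used as a hash key.
def canonical_key(string)
  string.unicode_normalize(:nfc)
end

visits = { canonical_key(COMPOSED) => 1 }

# The raw decomposed string looks identical on screen but is a different key.
puts visits.key?(DECOMPOSED)
# After normalization both spellings resolve to the same key.
puts visits.key?(canonical_key(DECOMPOSED))
```

Both strings render the same way, which is why the mismatch is easy to miss without a normalization step.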
@@ -59,9 +110,10 @@ in method names, but you may not want to):
     # You can also disallow trailing punctuation chars
     "über cool stuff!".to_slug.to_ruby_method(false) #=> uber_cool_stuff
 
+#### Easy to Extend
 
-You can
-
+You can add custom transliterators for your language with very little code. For
+example here's the transliterator for German:
 
     # encoding: utf-8
     module Babosa
@@ -100,44 +152,67 @@ And a spec (you can use this as a template):
     end
 
 
-###
+### Rails 3
+
+Some of Babosa's functionality was added to Active Support 3.
+
+Babosa now differs from ActiveSupport primarily in that it supports non-Latin
+strings by default, and has per-locale ASCII transliterations already baked-in.
+If you are considering using Babosa with Rails 3, you may want to first take a
+look at Active Support's
+[transliterate](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000565)
+and
+[parameterize](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000566)
+to see if they suit your needs.
 
-Babosa
-ActiveSupport gems installed and required prior to requiring "babosa", these
-will be used to perform upcasing and downcasing on UTF-8 strings. On JRuby 1.5
-and above, Java's native Unicode support will be used instead. Unless you're on
-JRuby, which already has excellent support for Unicode via Java's Standard
-Library, I recommend using the Unicode gem because it's the fastest Ruby Unicode
-library available.
+### Babosa vs. Stringex
 
-
-
+Babosa provides much of the functionality provided by the
+[Stringex](https://github.com/rsl/stringex) gem, but in the subjective opinion
+of the author, is for most use cases a much better choice.
 
-
-that, for example, "é" will always be composed to a single codepoint rather than
-an "e" and a "´" - making it safe to use as a hash key. But seriously - save
-yourself the headache and install a real Unicode library.
+#### Fewer Features
 
-
-
+Stringex offers functionality for storing slugs in an Active Record model, like
+a simple version of [FriendlyId](http://github.com/norman/friendly_id), in
+addition to string processing. Babosa only does string processing.
 
+#### Less Aggressive Unicode Transliteration
 
-
+Stringex uses an agressive Unicode to ASCII mapping which outputs gibberish for
+almost anything but Western European langages. Babosa supports only languages
+for which fluent speakers have provided transliterations, to ensure that the
+output makes sense to users.
 
-
+#### Better Locale Support
+
+Recent versions of Stringex support locale-specific transliterations, but
+include no built-in support for any languages. Babosa works out of the box for
+most European languages and is easy to extend if your language is not supported.
+
+#### Unicode Support
+
+Stringex does no Unicode normalization or validation before transliterating
+strings, so if you pass in strings with encoding errors or with different
+Unicode normalizations, you'll get unpredictable results.
+
+#### No Locale Assumptions
+
+Babosa avoids making assumptions about locales like Stringex does, so it doesn't
+offer transliterations like this out of the box:
+
+    "$12 worth of Ruby power".to_url => "12-dollars-worth-of-ruby-power"
+
+This is because the symbol "$" is used by many Latin American countries for the
+peso. Stringex does this in many places, for example, transliterating all Han
+characters into Pinyin, effectively treating Japanese text as if it were
+Chinese.
 
-Babosa differs from ActiveSupport primarily in that it supports non-Latin
-strings by default, and has per-locale ASCII transliterations already baked-in.
-If you are considering using Babosa with Rails 3, you should first take a look
-at Active Support's
-[transliterate](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000565)
-and
-[parameterize](http://edgeapi.rubyonrails.org/classes/ActiveSupport/Inflector.html#M000566)
-because it may already do what you need.
 
 ### More info
 
-Please see the [API docs](http://norman.github.com/babosa) and source code for
+Please see the [API docs](http://norman.github.com/babosa) and source code for
+more info.
 
 ## Getting it
 
@@ -147,12 +222,13 @@ Babosa can be installed via Rubygems:
 
 You can get the source code from its [Github repository](http://github.com/norman/babosa).
 
-Babosa is tested to be compatible with Ruby 1.8.6-1.9.2, JRuby 1.4
-Rubinius 1.0
+Babosa is tested to be compatible with Ruby 1.8.6-1.9.2, JRuby 1.4+, and
+Rubinius 1.0+ It's probably compatible with other Rubies as well.
 
 ## Reporting bugs
 
-Please use Babosa's [Github issue
+Please use Babosa's [Github issue
+tracker](http://github.com/norman/babosa/issues).
 
 
 ## Misc
@@ -165,6 +241,13 @@ Please use Babosa's [Github issue tracker](http://github.com/norman/babosa/issue
 
 ## Contributors
 
+Many thanks to the following people for their help:
+
+* [Philip Arndt](https://github.com/parndt) - Bugfixes
+* [Jonas Forsberg](https://github.com/himynameisjonas) - Swedish support
+* [Jaroslav Kalistsuk](https://github.com/jarosan) - Greek support
+* [Steven Heidel](https://github.com/stevenheidel) - Bugfixes
+* [Edgars Beigarts](https://github.com/ebeigarts) - Support for multiple transliterators
 * [Tiberiu C. Turbureanu](https://gitorious.org/~tct) - Romanian support
 * [Kim Joar Bekkelund](https://github.com/kjbekkelund) - Norwegian support
 * [Alexey Shkolnikov](https://github.com/grlm) - Russian support
@@ -175,6 +258,7 @@ Please use Babosa's [Github issue tracker](http://github.com/norman/babosa/issue
 
 ## Changelog
 
+* 0.3.7 - Fix compatibility with Ruby 1.8.7. Add Swedish support.
 * 0.3.6 - Allow multiple transliterators. Add Greek support.
 * 0.3.5 - Don't strip underscores from identifiers.
 * 0.3.4 - Add Romanian support.
@@ -190,7 +274,7 @@ Please use Babosa's [Github issue tracker](http://github.com/norman/babosa/issue
 
 ## Copyright
 
-Copyright (c) 2010-
+Copyright (c) 2010-2012 Norman Clarke
 
 Permission is hereby granted, free of charge, to any person obtaining a copy of
 this software and associated documentation files (the "Software"), to deal in
data/lib/babosa/identifier.rb
CHANGED
@@ -113,6 +113,7 @@ module Babosa
     # @param *args <Symbol>
     # @return String
     def transliterate!(*kinds)
+      kinds.compact!
       kinds = [:latin] if kinds.empty?
       kinds.each do |kind|
         transliterator = Transliterator.get(kind).instance
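The added `kinds.compact!` line means a `nil` argument no longer reaches `Transliterator.get`. The following is a minimal standalone sketch of just that argument handling, assuming nothing beyond the three lines visible in the hunk. `resolve_kinds` is a hypothetical name and is not part of Babosa.

```ruby
# Hypothetical helper mirroring the first lines of transliterate! in the hunk.
def resolve_kinds(*kinds)
  kinds.compact!                    # drop nil entries, e.g. from transliterate!(nil)
  kinds = [:latin] if kinds.empty?  # fall back to the default transliterator
  kinds
end

p resolve_kinds                 # no arguments
p resolve_kinds(nil)            # nil is discarded, default applies
p resolve_kinds(:german, nil)   # explicit kinds survive
```

Because `*kinds` always builds a fresh array, calling the destructive `compact!` does not modify anything owned by the caller.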
@@ -148,8 +149,12 @@ module Babosa
       else
         options = default_normalize_options.merge(options || {})
       end
-      if options[:transliterate]
-
+      if translit_option = options[:transliterate]
+        if translit_option != true
+          transliterate!(*translit_option)
+        else
+          transliterate!(*options[:transliterations])
+        end
       end
       to_ascii! if options[:to_ascii]
       clean!
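With this change, `:transliterate` can carry the transliterator name itself, while `true` keeps the old behaviour of reading `:transliterations`. The sketch below isolates that branching as a pure function, so it can be read without the rest of `normalize!`. `kinds_for` is a hypothetical name, and `Array(...)` stands in for the splat used in the real method.

```ruby
# Hypothetical helper: which transliterator kinds would the new branch
# hand to transliterate! for a given options hash?
def kinds_for(options)
  translit_option = options[:transliterate]
  return nil unless translit_option         # falsy: no transliteration at all

  if translit_option != true
    Array(translit_option)                  # e.g. :german or [:russian, :latin]
  else
    Array(options[:transliterations])       # legacy key; nil becomes []
  end
end

p kinds_for(:transliterate => :german)
p kinds_for(:transliterate => true, :transliterations => [:russian, :latin])
p kinds_for(:transliterate => true)
p kinds_for(:transliterate => false)
```

An empty result in the `true` case corresponds to `transliterate!` being called with no kinds, which the previous hunk resolves to `:latin`.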
data/lib/babosa/transliterator/base.rb
CHANGED
@@ -17,6 +17,7 @@ module Babosa
     autoload :Russian, "babosa/transliterator/russian"
     autoload :Serbian, "babosa/transliterator/serbian"
     autoload :Spanish, "babosa/transliterator/spanish"
+    autoload :Swedish, "babosa/transliterator/swedish"
     autoload :Ukrainian, "babosa/transliterator/ukrainian"
     autoload :Greek, "babosa/transliterator/greek"
 
data/lib/babosa/version.rb
CHANGED
data/spec/babosa_spec.rb
CHANGED
@@ -54,6 +54,11 @@ describe Babosa::Identifier do
   end
 
   describe "#normalize" do
+
+    it "should allow passing locale as key for :transliterate" do
+      "ö".to_slug.clean.normalize(:transliterate => :german).should eql("oe")
+    end
+
     it "should replace whitespace with dashes" do
       "a b".to_slug.clean.normalize.should eql("a-b")
     end
data/spec/transliterators/swedish_spec.rb
ADDED
@@ -0,0 +1,18 @@
+# encoding: utf-8
+require File.expand_path("../../spec_helper", __FILE__)
+
+describe Babosa::Transliterator::Swedish do
+
+  let(:t) { described_class.instance }
+  it_behaves_like "a latin transliterator"
+
+  it "should transliterate various characters" do
+    examples = {
+      "Räksmörgås" => "Raeksmoergaas",
+      "Öre" => "Oere",
+      "Åre" => "Aare",
+      "Älskar" => "Aelskar"
+    }
+    examples.each {|k, v| t.transliterate(k).should eql(v)}
+  end
+end
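The body of `swedish.rb` does not appear in this view, but the spec pins down the expected mapping. As a rough standalone sketch, the same four examples can be satisfied with a plain lookup table. This is not the gem's implementation: `SWEDISH_APPROXIMATIONS` and `transliterate_swedish` are hypothetical names, and the table covers only the six characters the spec exercises.

```ruby
# Hypothetical stand-in for the Swedish transliterator, derived only from
# the examples in the spec above.
SWEDISH_APPROXIMATIONS = {
  "å" => "aa", "ä" => "ae", "ö" => "oe",
  "Å" => "Aa", "Ä" => "Ae", "Ö" => "Oe"
}.freeze

def transliterate_swedish(string)
  # String#gsub accepts a hash: each match is replaced by its mapped value.
  string.gsub(/[åäöÅÄÖ]/, SWEDISH_APPROXIMATIONS)
end

puts transliterate_swedish("Räksmörgås")
```

The real transliterator also has to pass the shared "a latin transliterator" examples, which this sketch does not attempt.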
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: babosa
 version: !ruby/object:Gem::Version
-  version: 0.3.
+  version: 0.3.7
   prerelease:
 platform: ruby
 authors:
@@ -9,11 +9,11 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2012-
+date: 2012-03-12 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activesupport
-  requirement: &
+  requirement: &70194416980920 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -21,10 +21,10 @@ dependencies:
         version: 2.3.0
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *70194416980920
 - !ruby/object:Gem::Dependency
   name: rspec
-  requirement: &
+  requirement: &70194416979940 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ~>
@@ -32,10 +32,10 @@ dependencies:
         version: 2.5.0
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *70194416979940
 - !ruby/object:Gem::Dependency
   name: simplecov
-  requirement: &
+  requirement: &70194416979120 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -43,7 +43,7 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements: *
+  version_requirements: *70194416979120
 description: ! " A library for creating slugs. Babosa an extraction and improvement
   of the\n string code from FriendlyId, intended to help developers create similar\n
   \ libraries or plugins.\n"
@@ -66,6 +66,7 @@ files:
 - lib/babosa/transliterator/russian.rb
 - lib/babosa/transliterator/serbian.rb
 - lib/babosa/transliterator/spanish.rb
+- lib/babosa/transliterator/swedish.rb
 - lib/babosa/transliterator/ukrainian.rb
 - lib/babosa/utf8/active_support_proxy.rb
 - lib/babosa/utf8/dumb_proxy.rb
@@ -92,6 +93,7 @@ files:
 - spec/transliterators/russian_spec.rb
 - spec/transliterators/serbian_spec.rb
 - spec/transliterators/spanish_spec.rb
+- spec/transliterators/swedish_spec.rb
 - spec/transliterators/ukrainian_spec.rb
 - spec/utf8_proxy_spec.rb
 - .gemtest
@@ -115,7 +117,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project: ! '[none]'
-rubygems_version: 1.8.
+rubygems_version: 1.8.11
 signing_key:
 specification_version: 3
 summary: A library for creating slugs.