RubyGems - babosa - Versions diffs - 1.0.4 → 2.0.0.beta - Mend

babosa 1.0.4 → 2.0.0.beta


Files changed (53)

checksums.yaml +4 -4
data/Changelog.md +12 -0
data/README.md +80 -117
data/Rakefile +9 -8
data/lib/babosa.rb +2 -4
data/lib/babosa/identifier.rb +82 -121
data/lib/babosa/transliterator/base.rb +57 -56
data/lib/babosa/transliterator/bulgarian.rb +3 -2
data/lib/babosa/transliterator/cyrillic.rb +5 -5
data/lib/babosa/transliterator/danish.rb +3 -3
data/lib/babosa/transliterator/german.rb +3 -2
data/lib/babosa/transliterator/greek.rb +4 -3
data/lib/babosa/transliterator/hindi.rb +3 -2
data/lib/babosa/transliterator/latin.rb +5 -5
data/lib/babosa/transliterator/macedonian.rb +3 -2
data/lib/babosa/transliterator/norwegian.rb +3 -3
data/lib/babosa/transliterator/romanian.rb +3 -2
data/lib/babosa/transliterator/russian.rb +3 -2
data/lib/babosa/transliterator/serbian.rb +29 -27
data/lib/babosa/transliterator/spanish.rb +2 -2
data/lib/babosa/transliterator/swedish.rb +3 -3
data/lib/babosa/transliterator/turkish.rb +8 -8
data/lib/babosa/transliterator/ukrainian.rb +5 -4
data/lib/babosa/transliterator/vietnamese.rb +4 -3
data/lib/babosa/version.rb +3 -1
data/spec/{babosa_spec.rb → identifier_spec.rb} +9 -10
data/spec/spec_helper.rb +6 -6
data/spec/transliterators/base_spec.rb +5 -6
data/spec/transliterators/bulgarian_spec.rb +4 -5
data/spec/transliterators/danish_spec.rb +5 -6
data/spec/transliterators/german_spec.rb +4 -5
data/spec/transliterators/greek_spec.rb +7 -7
data/spec/transliterators/hindi_spec.rb +7 -7
data/spec/transliterators/latin_spec.rb +3 -4
data/spec/transliterators/macedonian_spec.rb +3 -4
data/spec/transliterators/norwegian_spec.rb +4 -4
data/spec/transliterators/polish_spec.rb +3 -5
data/spec/transliterators/romanian_spec.rb +5 -6
data/spec/transliterators/russian_spec.rb +3 -4
data/spec/transliterators/serbian_spec.rb +6 -7
data/spec/transliterators/spanish_spec.rb +4 -5
data/spec/transliterators/swedish_spec.rb +7 -7
data/spec/transliterators/turkish_spec.rb +24 -24
data/spec/transliterators/ukrainian_spec.rb +74 -75
data/spec/transliterators/vietnamese_spec.rb +10 -10
metadata +17 -38
data/lib/babosa/utf8/active_support_proxy.rb +0 -38
data/lib/babosa/utf8/dumb_proxy.rb +0 -49
data/lib/babosa/utf8/java_proxy.rb +0 -22
data/lib/babosa/utf8/mappings.rb +0 -193
data/lib/babosa/utf8/proxy.rb +0 -125
data/lib/babosa/utf8/unicode_proxy.rb +0 -23
data/spec/utf8_proxy_spec.rb +0 -52

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 7878565d1bfb436b7110d42e81dff1eb589f86e10e0919ded4b2de695784fae3
-  data.tar.gz: f6f3e7cc2b4876a940ec66fa1895f9cd3390526c62cebc888110494b72d77fe5
+  metadata.gz: ef1346a05f1b3a1104af8a095b7829eaa853427927335c7905d41d5ab47b2e9c
+  data.tar.gz: 54e543a250c7eff0c9613bea3f22d2e79317707f60cbb364e6e68979237c4c53
 SHA512:
-  metadata.gz: 6ea0ff964d688d9ca29710da13207c12d49509c13d72f8198a33d38141f7f6ad39b5792f5937a94a44b3976dcd68b24233c11b3418dc9d100a170caa799404ed
-  data.tar.gz: 1b37fd01a1907f244e112171da5dd942105181b860e6dc05ede3e1798bfdf1e50e8543b69757109dc9bd7dfddecb6ba2fbec2b59f38843223052c559f085490e
+  metadata.gz: d4f6732579088cda9d4514b4d1dddd32f2cd933f2818ed4c74bf0f00168ccc3b20e6ce220bc1c473a0453d29ef50de256689ed17e1f809b42dcb8956377c0e76
+  data.tar.gz: 797887db1d626a92b28249883f2dcc55e3b92d9f423aa23052ba74cc45856a15ace2b6c823c983b6e6baa0907cfc3d467b9a52e05cd9de9d9fcf37c779bd0647

data/Changelog.md CHANGED

@@ -1,5 +1,17 @@
 # Babosa Changelog
+## 2.0.0
+This release contains no important changes. I had a week off from work and
+decided to refactor the code.  However there are some small breaking changes so
+I have released it as 2.0.0.
+* Refactor internals for simplicity
+* Use built-in Ruby UTF-8 support in places of other gems.
+* Drop support for Ruby < 2.5.0.
+* `Babosa::Identifier#word_chars` no longer removes dashes
+* `Babosa::Identifier#to_ruby_method` default argument `allow_bangs` is now a keyword argument
 ## 1.0.4
 * Fix nil being cast to frozen string (https://github.com/norman/babosa/pull/52)
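
The `allow_bangs` entry above is the change most likely to break existing callers. The sketch below shows what the move from a positional to a keyword argument means at a call site. `to_ruby_method` here is a hypothetical stand-in that only mirrors the 2.0.0 signature; its body is an assumption made for illustration and is not Babosa's implementation.

```ruby
# Hypothetical stand-in: same keyword signature as the 2.0.0 changelog entry,
# but a simplified body that is NOT taken from Babosa.
def to_ruby_method(name, allow_bangs: true)
  stripped = name.downcase.gsub(/[^a-z0-9 !?]/, "").strip
  stripped = stripped.delete("!?") unless allow_bangs
  stripped.tr(" ", "_")
end

puts to_ruby_method("cool stuff!")                      # keyword default applies
puts to_ruby_method("cool stuff!", allow_bangs: false)  # 2.0.0 call style

begin
  to_ruby_method("cool stuff!", false)                  # 1.0.4 call style
rescue ArgumentError => e
  # A positional value is no longer accepted where a keyword is expected.
  puts "ArgumentError: #{e.message}"
end
```

Callers that passed `false` positionally would need to switch to `allow_bangs: false` when upgrading.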

data/README.md CHANGED

@@ -15,12 +15,16 @@ FriendlyId.
 ### Transliterate UTF-8 characters to ASCII
-    "Gölcük, Turkey".to_slug.transliterate.to_s #=> "Golcuk, Turkey"
+```ruby
+"Gölcük, Turkey".to_slug.transliterate.to_s #=> "Golcuk, Turkey"
+```
 ### Locale sensitive transliteration, with support for many languages
-    "Jürgen Müller".to_slug.transliterate.to_s           #=> "Jurgen Muller"
-    "Jürgen Müller".to_slug.transliterate(:german).to_s  #=> "Juergen Mueller"
+```ruby
+"Jürgen Müller".to_slug.transliterate.to_s           #=> "Jurgen Muller"
+"Jürgen Müller".to_slug.transliterate(:german).to_s  #=> "Juergen Mueller"
+```
 Currently supported languages include:
@@ -28,6 +32,7 @@ Currently supported languages include:
 * Danish
 * German
 * Greek
+* Hindi
 * Macedonian
 * Norwegian
 * Romanian
@@ -35,124 +40,125 @@ Currently supported languages include:
 * Serbian
 * Spanish
 * Swedish
+* Turkish
 * Ukrainian
+* Vietnamese
+Additionally there are generic transliterators for transliterating from the
+Cyrillic alphabet and Latin alphabet with diacritics. The Latin transliterator
+can be used, for example, with Czech. There is also a transliterator named
+"Hindi" which may be sufficient for other Indic languages using Devanagari, but
+I do not know enough to say whether the transliterations would make sense.
 I'll gladly accept contributions from fluent speakers to support more languages.
 ### Strip non-ASCII characters
-    "Gölcük, Turkey".to_slug.to_ascii.to_s #=> "Glck, Turkey"
+```ruby
+"Gölcük, Turkey".to_slug.to_ascii.to_s #=> "Glck, Turkey"
+```
 ### Truncate by characters
-    "üüü".to_slug.truncate(2).to_s #=> "üü"
+```ruby
+"üüü".to_slug.truncate(2).to_s #=> "üü"
+```
 ### Truncate by bytes
 This can be useful to ensure the generated slug will fit in a database column
 whose length is limited by bytes rather than UTF-8 characters.
-    "üüü".to_slug.truncate_bytes(2).to_s #=> "ü"
+```ruby
+"üüü".to_slug.truncate_bytes(2).to_s #=> "ü"
+```
 ### Remove punctuation chars
-    "this is, um, **really** cool, huh?".to_slug.word_chars.to_s #=> "this is um really cool huh"
+```ruby
+"this is, um, **really** cool, huh?".to_slug.word_chars.to_s #=> "this is um really cool huh"
+```
 ### All-in-one
-    "Gölcük, Turkey".to_slug.normalize.to_s #=> "golcuk-turkey"
+```ruby
+"Gölcük, Turkey".to_slug.normalize.to_s #=> "golcuk-turkey"
+```
 ### Other stuff
-#### Using Babosa With FriendlyId 4
-    require "babosa"
+#### Using Babosa With FriendlyId 4+
-    class Person < ActiveRecord::Base
-      friendly_id :name, use: :slugged
-      def normalize_friendly_id(input)
-        input.to_s.to_slug.normalize(transliterations: :russian).to_s
-      end
-    end
+```ruby
+require "babosa"
-#### Pedantic UTF-8 support
+class Person < ActiveRecord::Base
+  friendly_id :name, use: :slugged
-Babosa goes out of its way to handle [nasty Unicode issues you might never think
-you would have](https://github.com/norman/enc/blob/master/equivalence.rb) by
-checking, sanitizing and normalizing your string input.
+  def normalize_friendly_id(input)
+    input.to_s.to_slug.normalize(transliterations: :russian).to_s
+  end
+end
+```
-It will automatically use whatever Unicode library you have loaded before
-Babosa, or fall back to a simple built-in library. Supported
-Unicode libraries include:
+#### UTF-8 support
-* Java (only on JRuby of course)
-* Active Support
-* [Unicode](https://github.com/blackwinter/unicode)
-* Built-in
-This built-in module is much faster than Active Support but much slower than
-Java or Unicode. It can only do **very** naive Unicode composition to ensure
-that, for example, "é" will always be composed to a single codepoint rather than
-an "e" and a "´" - making it safe to use as a hash key.
-But seriously - save yourself the headache and install a real Unicode library.
-If you are using Babosa with a language that uses the Cyrillic alphabet, Babosa
-requires either Unicode, Active Support or Java.
+Babosa normalizes all input strings [to NFC](https://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms).
 #### Ruby Method Names
-Babosa can also generate strings for Ruby method names. (Yes, Ruby 1.9 can use
+Babosa can generate strings for Ruby method names. (Yes, Ruby 1.9+ can use
 UTF-8 chars in method names, but you may not want to):
-    "this is a method".to_slug.to_ruby_method! #=> this_is_a_method
-    "über cool stuff!".to_slug.to_ruby_method! #=> uber_cool_stuff!
+```ruby
+"this is a method".to_slug.to_ruby_method! #=> this_is_a_method
+"über cool stuff!".to_slug.to_ruby_method! #=> uber_cool_stuff!
-    # You can also disallow trailing punctuation chars
-    "über cool stuff!".to_slug.to_ruby_method(false) #=> uber_cool_stuff
+# You can also disallow trailing punctuation chars
+"über cool stuff!".to_slug.to_ruby_method(allow_bangs: false) #=> uber_cool_stuff
+```
 #### Easy to Extend
 You can add custom transliterators for your language with very little code. For
 example here's the transliterator for German:
-    # encoding: utf-8
-    module Babosa
-      module Transliterator
-        class German < Latin
-          APPROXIMATIONS = {
-            "ä" => "ae",
-            "ö" => "oe",
-            "ü" => "ue",
-            "Ä" => "Ae",
-            "Ö" => "Oe",
-            "Ü" => "Ue"
-          }
-        end
-      end
+```ruby
+module Babosa
+  module Transliterator
+    class German < Latin
+      APPROXIMATIONS = {
+        "ä" => "ae",
+        "ö" => "oe",
+        "ü" => "ue",
+        "Ä" => "Ae",
+        "Ö" => "Oe",
+        "Ü" => "Ue"
+      }
     end
+  end
+end
+```
 And a spec (you can use this as a template):
-    # encoding: utf-8
-    require File.expand_path("../../spec_helper", __FILE__)
-    describe Babosa::Transliterator::German do
+```ruby
+require "spec_helper"
-      let(:t) { described_class.instance }
-      it_behaves_like "a latin transliterator"
+describe Babosa::Transliterator::German do
+  let(:t) { described_class.instance }
+  it_behaves_like "a latin transliterator"
-      it "should transliterate Eszett" do
-        t.transliterate("ß").should eql("ss")
-      end
-      it "should transliterate vowels with umlauts" do
-        t.transliterate("üöä").should eql("ueoeae")
-      end
-    end
+  it "should transliterate Eszett" do
+    t.transliterate("ß").should eql("ss")
+  end
+  it "should transliterate vowels with umlauts" do
+    t.transliterate("üöä").should eql("ueoeae")
+  end
+end
+```
 ### Rails 3.x and higher
@@ -167,46 +173,6 @@ and
 [parameterize](http://api.rubyonrails.org/classes/ActiveSupport/Inflector.html#method-i-parameterize)
 to see if they suit your needs.
-### Babosa vs. Stringex
-Babosa provides much of the functionality provided by the
-[Stringex](https://github.com/rsl/stringex) gem, but in the subjective opinion
-of the author, is for most use cases a better choice.
-#### Fewer Features
-Stringex offers functionality for storing slugs in an Active Record model, like
-a simple version of [FriendlyId](http://github.com/norman/friendly_id), in
-addition to string processing. Babosa only does string processing.
-#### Less Aggressive Unicode Transliteration
-Stringex uses an agressive Unicode to ASCII mapping which outputs gibberish for
-almost anything but Western European langages and Mandarin Chinese. Babosa
-supports only languages for which fluent speakers have provided
-transliterations, to ensure that the output makes sense to users.
-#### Unicode Support
-Stringex does no Unicode normalization or validation before transliterating
-strings, so if you pass in strings with encoding errors or with different
-Unicode normalizations, you'll get unpredictable results.
-#### No Locale Assumptions
-Babosa avoids making assumptions about locales like Stringex does, so it doesn't
-offer transliterations like this out of the box:
-    "$12 worth of Ruby power".to_url => "12-dollars-worth-of-ruby-power"
-This is because the symbol "$" is used in many Latin American countries for the
-peso. Stringex does this in many places, for example, transliterating all Han
-characters into Pinyin, effectively treating Japanese text as if it were
-Mandarin Chinese.
-### More info
 Please see the [API docs](http://rubydoc.info/github/norman/babosa/master/frames) and source code for
 more info.
@@ -218,9 +184,6 @@ Babosa can be installed via Rubygems:
 You can get the source code from its [Github repository](http://github.com/norman/babosa).
-Babosa is tested to be compatible with Ruby 2.x, JRuby 1.7+, and
-Rubinius 2.x It's probably compatible with other Rubies as well.
 ## Reporting bugs
 Please use Babosa's [Github issue
@@ -229,7 +192,7 @@ tracker](http://github.com/norman/babosa/issues).
 ## Misc
-"Babosa" means slug in Spanish.
+"Babosa" means "slug" in Spanish.
 ## Author
@@ -258,7 +221,7 @@ Many thanks to the following people for their help:
 ## Copyright
-Copyright (c) 2010-2013 Norman Clarke
+Copyright (c) 2010-2020 Norman Clarke
 Permission is hereby granted, free of charge, to any person obtaining a copy of
 this software and associated documentation files (the "Software"), to deal in

data/Rakefile CHANGED

@@ -1,10 +1,13 @@
+# frozen_string_literal: true
 require "rubygems"
 require "rake/testtask"
 require "rake/clean"
 require "rubygems/package_task"
+require "rubocop/rake_task"
-task :default => :spec
-task :test    => :spec
+task default: [:rubocop, :spec]
+task test: :spec
 CLEAN << "pkg" << "doc" << "coverage" << ".yardoc"
@@ -14,6 +17,7 @@ begin
     t.options = ["--output-dir=doc"]
   end
 rescue LoadError
+  puts "Yard not present"
 end
 begin
@@ -23,12 +27,9 @@ begin
     Rake::Task["spec"].execute
   end
 rescue LoadError
+  puts "SimpleCov not present"
 end
-gemspec = File.expand_path("../babosa.gemspec", __FILE__)
-if File.exist? gemspec
-  Gem::PackageTask.new(eval(File.read(gemspec))) { |pkg| }
-end
-require 'rspec/core/rake_task'
+require "rspec/core/rake_task"
 RSpec::Core::RakeTask.new(:spec)
+RuboCop::RakeTask.new

data/lib/babosa.rb CHANGED

@@ -1,7 +1,6 @@
+# frozen_string_literal: true
 module Babosa
-  def self.jruby15?
-    JRUBY_VERSION >= "1.5" rescue false
-  end
 end
 class String
@@ -12,5 +11,4 @@ class String
 end
 require "babosa/transliterator/base"
-require "babosa/utf8/proxy"
 require "babosa/identifier"

data/lib/babosa/identifier.rb CHANGED

@@ -1,16 +1,6 @@
-# encoding: utf-8
-module Babosa
-  # Codepoints for characters that will be deleted by +#word_chars!+.
-  STRIPPABLE = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 14, 15, 16, 17, 18, 19,
-    20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 33, 34, 35, 36, 37, 38, 39,
-    40, 41, 42, 43, 44, 45, 46, 47, 58, 59, 60, 61, 62, 63, 64, 91, 92, 93, 94,
-    96, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136,
-    137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151,
-    152, 153, 154, 155, 156, 157, 158, 159, 161, 162, 163, 164, 165, 166, 167,
-    168, 169, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 182, 183, 184,
-    185, 187, 188, 189, 190, 191, 215, 247, 8203, 8204, 8205, 8239, 65279]
+# frozen_string_literal: true
+module Babosa
   # This class provides some string-manipulation methods specific to slugs.
   #
   # Note that this class includes many "bang methods" such as {#clean!} and
@@ -20,7 +10,7 @@ module Babosa
   # it is generated dynamically.
   #
   # All of the bang methods return an instance of String, while the bangless
-  # versions return an instance of Babosa::Identifier, so that calls to methods
+  # versions return an instance of {Babosa::Identifier}, so that calls to methods
   # specific to this class can be chained:
   #
   #   string = Identifier.new("hello world")
@@ -29,71 +19,47 @@ module Babosa
   #
   # @see http://www.utf8-chartable.de/unicode-utf8-table.pl?utf8=dec Unicode character table
   class Identifier
     Error = Class.new(StandardError)
     attr_reader :wrapped_string
     alias to_s wrapped_string
-    @@utf8_proxy = if Babosa.jruby15?
-      UTF8::JavaProxy
-    elsif defined? Unicode::VERSION
-      UTF8::UnicodeProxy
-    elsif defined? ActiveSupport
-      UTF8::ActiveSupportProxy
-    else
-      UTF8::DumbProxy
-    end
-    # Return the proxy used for UTF-8 support.
-    # @see Babosa::UTF8::Proxy
-    def self.utf8_proxy
-      @@utf8_proxy
-    end
-    # Set a proxy object used for UTF-8 support.
-    # @see Babosa::UTF8::Proxy
-    def self.utf8_proxy=(obj)
-      @@utf8_proxy = obj
-    end
     def method_missing(symbol, *args, &block)
       @wrapped_string.__send__(symbol, *args, &block)
     end
+    def respond_to_missing?(name, include_all)
+      @wrapped_string.respond_to?(name, include_all)
+    end
     # @param string [#to_s] The string to use as the basis of the Identifier.
     def initialize(string)
-      @wrapped_string = string.to_s
+      @wrapped_string = string.to_s.dup
       tidy_bytes!
       normalize_utf8!
     end
-    def ==(value)
-      @wrapped_string.to_s == value.to_s
-    end
-    def eql?(value)
-      @wrapped_string == value
+    def ==(other)
+      to_s == other.to_s
     end
-    def empty?
-      # included to make this class :respond_to? :empty for compatibility with Active Support's
-      # #blank?
-      @wrapped_string.empty?
+    def eql?(other)
+      self == other
     end
-    # Approximate an ASCII string. This works only for Western strings using
-    # characters that are Roman-alphabet characters + diacritics. Non-letter
-    # characters are left unmodified.
+    # Approximate an ASCII string. This works only for strings using characters
+    # that are Roman-alphabet characters + diacritics. Non-letter characters
+    # are left unmodified.
     #
-    #   string = Identifier.new "Łódź
+    #   string = Identifier.new "Łódź, Poland"
     #   string.transliterate                 # => "Lodz, Poland"
     #   string = Identifier.new "日本"
     #   string.transliterate                 # => "日本"
     #
-    # You can pass any key(s) from +Characters.approximations+ as arguments. This allows
-    # for contextual approximations. Various languages are supported, you can see which ones
-    # by looking at the source of {Babosa::Transliterator::Base}.
+    # You can pass the names of any transliterator class as arguments. This
+    # allows for contextual approximations. Various languages are supported,
+    # you can see which ones by looking at the source of
+    # {Babosa::Transliterator::Base}.
     #
     #   string = Identifier.new "Jürgen Müller"
     #   string.transliterate                 # => "Jurgen Muller"
@@ -111,7 +77,7 @@ module Babosa
     # to remove non-ASCII characters such as "¡" and "¿", use {#to_ascii!}:
     #
     #   string.transliterate!(:spanish)       # => "¡Feliz anio!"
-    #   string.transliterate!                 # => "¡Feliz anio!"
+    #   string.to_ascii!                      # => "Feliz anio!"
     #
     # @param *args <Symbol>
     # @return String
@@ -122,40 +88,50 @@ module Babosa
         transliterator = Transliterator.get(kind).instance
         @wrapped_string = transliterator.transliterate(@wrapped_string)
       end
-      @wrapped_string
+      to_s
     end
     # Converts dashes to spaces, removes leading and trailing spaces, and
     # replaces multiple whitespace characters with a single space.
+    #
     # @return String
     def clean!
-      @wrapped_string = @wrapped_string.gsub("-", " ").squeeze(" ").strip
+      gsub!(/[- ]+/, " ")
+      strip!
+      to_s
     end
     # Remove any non-word characters. For this library's purposes, this means
-    # anything other than letters, numbers, spaces, newlines and linefeeds.
+    # anything other than letters, numbers, spaces, underscores, dashes,
+    # newlines, and linefeeds.
+    #
     # @return String
     def word_chars!
-      @wrapped_string = (unpack("U*") - Babosa::STRIPPABLE).pack("U*")
+      # `^\p{letter}` = Any non-Unicode letter
+      # `&&` = add the following character class
+      # `[^ _\n\r]` = Anything other than space, underscore, newline or linefeed
+      gsub!(/[[^\p{letter}]&&[^ _\-\n\r]]/, "")
+      to_s
     end
     # Normalize the string for use as a URL slug. Note that in this context,
     # +normalize+ means, strip, remove non-letters/numbers, downcasing,
     # truncating to 255 bytes and converting whitespace to dashes.
-    # @param Options
+    #
+    # @param options [Hash]
     # @return String
-    def normalize!(options = nil)
-      options = default_normalize_options.merge(options || {})
+    def normalize!(options = {})
+      options = default_normalize_options.merge(options)
-      if translit_option = options[:transliterate]
-        if translit_option != true
-          transliterate!(*translit_option)
+      if options[:transliterate]
+        option = options[:transliterate]
+        if option != true
+          transliterate!(*option)
         else
           transliterate!(*options[:transliterations])
         end
       end
       to_ascii! if options[:to_ascii]
-      clean!
       word_chars!
       clean!
       downcase!
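
The new `word_chars!` replaces the codepoint list with one regular expression built on a character-class intersection (`&&`). The sketch below copies that expression from the added line in the hunk above and applies it outside the library. `strip_non_word_chars` is a hypothetical helper, not a Babosa method, and the sample inputs are assumptions chosen to show punctuation being removed while dashes and underscores survive.

```ruby
# Regex copied from the 2.0.0.beta side of the diff:
#   [^\p{letter}]  any character that is not a Unicode letter
#   &&             intersected with
#   [^ _\-\n\r]    any character other than space, underscore, dash, LF, CR
WORD_CHARS_FILTER = /[[^\p{letter}]&&[^ _\-\n\r]]/

# Hypothetical helper (not part of Babosa): apply the filter to a copy.
def strip_non_word_chars(text)
  text.gsub(WORD_CHARS_FILTER, "")
end

puts strip_non_word_chars("this is, um, **really** cool, huh?")
# prints "this is um really cool huh"
puts strip_non_word_chars("well-known_value!")
# prints "well-known_value" (the dash is kept, matching the changelog entry)
```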
@@ -164,105 +140,90 @@ module Babosa
     end
     # Normalize a string so that it can safely be used as a Ruby method name.
-    def to_ruby_method!(allow_bangs = true)
-      leader, trailer = @wrapped_string.strip.scan(/\A(.+)(.)\z/).flatten
-      leader          = leader.to_s.dup
-      trailer         = trailer.to_s.dup
-      if allow_bangs
-        trailer.downcase!
-        trailer.gsub!(/[^a-z0-9!=\\?]/, '')
-      else
-        trailer.downcase!
-        trailer.gsub!(/[^a-z0-9]/, '')
-      end
-      id = leader.to_identifier
-      id.transliterate!
-      id.to_ascii!
-      id.clean!
-      id.word_chars!
-      id.clean!
-      @wrapped_string = id.to_s + trailer
-      if @wrapped_string == ""
-        raise Error, "Input generates impossible Ruby method name"
-      end
+    #
+    # @param allow_bangs [Boolean]
+    # @return String
+    def to_ruby_method!(allow_bangs: true)
+      last_char = self[-1]
+      transliterate!
+      to_ascii!
+      word_chars!
+      clean!
+      @wrapped_string += last_char if allow_bangs && ["!", "?"].include?(last_char)
+      raise Error, "Input generates impossible Ruby method name" if self == ""
       with_separators!("_")
     end
     # Delete any non-ascii characters.
+    #
     # @return String
     def to_ascii!
-      @wrapped_string = @wrapped_string.gsub(/[^\x00-\x7f]/u, '')
+      gsub!(/[^\x00-\x7f]/u, "")
+      to_s
     end
     # Truncate the string to +max+ characters.
+    #
     # @example
     #   "üéøá".to_identifier.truncate(3) #=> "üéø"
+    #
+    # @param max [Integer] The maximum number of characters.
     # @return String
     def truncate!(max)
-      @wrapped_string = unpack("U*")[0...max].pack("U*")
+      @wrapped_string = slice(0, max)
     end
     # Truncate the string to +max+ bytes. This can be useful for ensuring that
     # a UTF-8 string will always fit into a database column with a certain max
     # byte length. The resulting string may be less than +max+ if the string must
     # be truncated at a multibyte character boundary.
+    #
     # @example
     #   "üéøá".to_identifier.truncate_bytes(3) #=> "ü"
+    #
+    # @param max [Integer] The maximum number of bytes.
     # @return String
     def truncate_bytes!(max)
-      return @wrapped_string if @wrapped_string.bytesize <= max
-      curr = 0
-      new = []
-      unpack("U*").each do |char|
-        break if curr > max
-        char = [char].pack("U")
-        curr += char.bytesize
-        if curr <= max
-          new << char
-        end
-      end
-      @wrapped_string = new.join
+      truncate!(max)
+      chop! until bytesize <= max
     end
     # Replaces whitespace with dashes ("-").
+    #
+    # @param char [String] the separator character to use.
     # @return String
     def with_separators!(char = "-")
-      @wrapped_string = @wrapped_string.gsub(/\s/u, char)
-    end
-    # Perform UTF-8 sensitive upcasing.
-    # @return String
-    def upcase!
-      @wrapped_string = @@utf8_proxy.upcase(@wrapped_string)
-    end
-    # Perform UTF-8 sensitive downcasing.
-    # @return String
-    def downcase!
-      @wrapped_string = @@utf8_proxy.downcase(@wrapped_string)
+      gsub!(/\s/u, char)
+      to_s
     end
     # Perform Unicode composition on the wrapped string.
+    #
     # @return String
     def normalize_utf8!
-      @wrapped_string = @@utf8_proxy.normalize_utf8(@wrapped_string)
+      unicode_normalize!(:nfc)
+      to_s
     end
     # Attempt to convert characters encoded using CP1252 and IS0-8859-1 to
     # UTF-8.
     # @return String
     def tidy_bytes!
-      @wrapped_string = @@utf8_proxy.tidy_bytes(@wrapped_string)
+      scrub! do |bad|
+        bad.encode(Encoding::UTF_8, Encoding::Windows_1252, invalid: :replace, undef: :replace)
+      end
+      to_s
     end
     %w[transliterate clean downcase word_chars normalize normalize_utf8
-      tidy_bytes to_ascii to_ruby_method truncate truncate_bytes upcase
-      with_separators].each do |method|
-      class_eval(<<-EOM, __FILE__, __LINE__ + 1)
+       tidy_bytes to_ascii to_ruby_method truncate truncate_bytes upcase
+       with_separators].each do |method|
+      class_eval(<<-METHOD, __FILE__, __LINE__ + 1)
         def #{method}(*args)
           send_to_new_instance(:#{method}!, *args)
         end
-      EOM
+      METHOD
     end
     def to_identifier
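
The rewritten `truncate_bytes!` in the hunk above drops the manual codepoint loop in favor of two steps: cut to `max` characters, then remove trailing characters until the byte size fits. A standalone sketch of the same idea follows. `truncate_to_bytes` is a hypothetical helper that works on plain strings, and it assumes a non-negative `max`.

```ruby
# Hypothetical helper (not part of Babosa) mirroring the two steps in the diff.
def truncate_to_bytes(text, max)
  # Every character is at least one byte, so a string that fits in `max`
  # bytes can never hold more than `max` characters. Cutting by characters
  # first is therefore safe and bounds the loop below.
  result = text[0, max]
  # Remove whole characters from the end so no multibyte character is split.
  result.chop! until result.bytesize <= max
  result
end

puts truncate_to_bytes("üéøá", 3) # prints "ü" (2 bytes; "üé" would need 4)
puts truncate_to_bytes("üüü", 2)  # prints "ü"
puts truncate_to_bytes("abc", 10) # prints "abc"
```

As the source comment notes, the result can be shorter than `max` bytes when the limit falls inside a multibyte character.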
@@ -271,7 +232,7 @@ module Babosa
     # The default options for {#normalize!}. Override to set your own defaults.
     def default_normalize_options
-      {:transliterate => true, :max_length => 255, :separator => "-"}
+      {transliterate: :latin, max_length: 255, separator: "-"}
     end
     alias approximate_ascii transliterate