RubyGems - babosa - Versions diffs - 1.0.4 → 2.0.0 - Mend

babosa 1.0.4 → 2.0.0

Files changed (56)

checksums.yaml +4 -4
checksums.yaml.gz.sig +0 -0
data.tar.gz.sig +0 -0
data/Changelog.md +12 -0
data/README.md +81 -119
data/Rakefile +9 -8
data/lib/babosa.rb +2 -4
data/lib/babosa/identifier.rb +104 -129
data/lib/babosa/transliterator/base.rb +57 -56
data/lib/babosa/transliterator/bulgarian.rb +3 -2
data/lib/babosa/transliterator/cyrillic.rb +5 -5
data/lib/babosa/transliterator/danish.rb +3 -3
data/lib/babosa/transliterator/german.rb +3 -2
data/lib/babosa/transliterator/greek.rb +4 -3
data/lib/babosa/transliterator/hindi.rb +3 -2
data/lib/babosa/transliterator/latin.rb +5 -5
data/lib/babosa/transliterator/macedonian.rb +3 -2
data/lib/babosa/transliterator/norwegian.rb +3 -3
data/lib/babosa/transliterator/romanian.rb +3 -2
data/lib/babosa/transliterator/russian.rb +3 -2
data/lib/babosa/transliterator/serbian.rb +29 -27
data/lib/babosa/transliterator/spanish.rb +2 -2
data/lib/babosa/transliterator/swedish.rb +3 -3
data/lib/babosa/transliterator/turkish.rb +8 -8
data/lib/babosa/transliterator/ukrainian.rb +5 -4
data/lib/babosa/transliterator/vietnamese.rb +4 -3
data/lib/babosa/version.rb +3 -1
data/spec/{babosa_spec.rb → identifier_spec.rb} +13 -14
data/spec/spec_helper.rb +6 -6
data/spec/transliterators/base_spec.rb +5 -6
data/spec/transliterators/bulgarian_spec.rb +4 -5
data/spec/transliterators/danish_spec.rb +5 -6
data/spec/transliterators/german_spec.rb +4 -5
data/spec/transliterators/greek_spec.rb +7 -7
data/spec/transliterators/hindi_spec.rb +7 -7
data/spec/transliterators/latin_spec.rb +3 -4
data/spec/transliterators/macedonian_spec.rb +3 -4
data/spec/transliterators/norwegian_spec.rb +4 -4
data/spec/transliterators/polish_spec.rb +3 -5
data/spec/transliterators/romanian_spec.rb +5 -6
data/spec/transliterators/russian_spec.rb +3 -4
data/spec/transliterators/serbian_spec.rb +6 -7
data/spec/transliterators/spanish_spec.rb +4 -5
data/spec/transliterators/swedish_spec.rb +7 -7
data/spec/transliterators/turkish_spec.rb +24 -24
data/spec/transliterators/ukrainian_spec.rb +74 -75
data/spec/transliterators/vietnamese_spec.rb +10 -10
metadata +44 -38
metadata.gz.sig +2 -0
data/lib/babosa/utf8/active_support_proxy.rb +0 -38
data/lib/babosa/utf8/dumb_proxy.rb +0 -49
data/lib/babosa/utf8/java_proxy.rb +0 -22
data/lib/babosa/utf8/mappings.rb +0 -193
data/lib/babosa/utf8/proxy.rb +0 -125
data/lib/babosa/utf8/unicode_proxy.rb +0 -23
data/spec/utf8_proxy_spec.rb +0 -52

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 7878565d1bfb436b7110d42e81dff1eb589f86e10e0919ded4b2de695784fae3
-  data.tar.gz: f6f3e7cc2b4876a940ec66fa1895f9cd3390526c62cebc888110494b72d77fe5
+  metadata.gz: 84fa74468828d44925314cc1bcf6d297aa3a94ebfd993bca491686c92c936667
+  data.tar.gz: 618a52e9d52878fe8bc5933879115d32dd31b212a9cd2178e78bb7320d186c43
 SHA512:
-  metadata.gz: 6ea0ff964d688d9ca29710da13207c12d49509c13d72f8198a33d38141f7f6ad39b5792f5937a94a44b3976dcd68b24233c11b3418dc9d100a170caa799404ed
-  data.tar.gz: 1b37fd01a1907f244e112171da5dd942105181b860e6dc05ede3e1798bfdf1e50e8543b69757109dc9bd7dfddecb6ba2fbec2b59f38843223052c559f085490e
+  metadata.gz: d8d86867859c5af0f8f80631b5d7aefe14cf87d61ec5908fa08d68fa38cc2e287c77c2c59c41377eb41e6524ba723ed511d1d09e5837259aad4a1ced7162c25a
+  data.tar.gz: 005dbd796a469420ddc905e2fcf4261703f3777ab29000102039e52b907d51d51c3ab69f11779bfc3e76ed19c24fad997733d008c7d08f50dc1446fa230b7e3d

checksums.yaml.gz.sig ADDED

Binary file

data.tar.gz.sig ADDED

Binary file

data/Changelog.md CHANGED

@@ -1,5 +1,17 @@
 # Babosa Changelog
+## 2.0.0
+This release contains no important changes. I had a week off from work and
+decided to refactor the code.  However there are some small breaking changes so
+I have released it as 2.0.0.
+* Refactor internals for simplicity
+* Use built-in Ruby UTF-8 support in places of other gems.
+* Drop support for Ruby < 2.5.0.
+* `Babosa::Identifier#word_chars` no longer removes dashes
+* `Babosa::Identifier#to_ruby_method` default argument `allow_bangs` is now a keyword argument
 ## 1.0.4
 * Fix nil being cast to frozen string (https://github.com/norman/babosa/pull/52)
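
The `word_chars` entry in the 2.0.0 changelog can be tried without installing the gem. The following is a minimal sketch that applies the character class from the 2.0.0 `identifier.rb` diff to a plain `String`. The constant `NON_WORD_CHARS` and the helper `strip_non_word_chars` are hypothetical names used for illustration; they are not part of Babosa's API.

```ruby
# Hypothetical helper (not part of Babosa). It reuses the character class that
# the 2.0.0 diff shows inside Identifier#word_chars!.
#
# The class is an intersection: a character is removed only if it is NOT a
# Unicode letter AND is NOT one of space, digit, underscore, dash, newline or
# carriage return.
NON_WORD_CHARS = /[[^\p{letter}]&&[^ \d_\-\n\r]]/

def strip_non_word_chars(string)
  string.gsub(NON_WORD_CHARS, "")
end

# Punctuation is dropped, but the dash in "cool-ish" is kept.
puts strip_non_word_chars("this is, um, **really** cool-ish, huh?")
```

In 1.0.4 the dash (codepoint 45) was part of the `STRIPPABLE` list, so callers that relied on `word_chars` to remove dashes may need an explicit extra step after upgrading.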

data/README.md CHANGED

@@ -1,7 +1,6 @@
 # Babosa
-[![Build Status](https://travis-ci.org/norman/babosa.png?branch=master)](https://travis-ci.org/norman/babosa)
+[![Build Status](https://github.com/norman/babosa/actions/workflows/main.yml/badge.svg)](https://github.com/norman/babosa/actions)
 Babosa is a library for creating human-friendly identifiers, aka "slugs". It can
 also be useful for normalizing and sanitizing data.
@@ -15,12 +14,16 @@ FriendlyId.
 ### Transliterate UTF-8 characters to ASCII
-    "Gölcük, Turkey".to_slug.transliterate.to_s #=> "Golcuk, Turkey"
+```ruby
+"Gölcük, Turkey".to_slug.transliterate.to_s #=> "Golcuk, Turkey"
+```
 ### Locale sensitive transliteration, with support for many languages
-    "Jürgen Müller".to_slug.transliterate.to_s           #=> "Jurgen Muller"
-    "Jürgen Müller".to_slug.transliterate(:german).to_s  #=> "Juergen Mueller"
+```ruby
+"Jürgen Müller".to_slug.transliterate.to_s           #=> "Jurgen Muller"
+"Jürgen Müller".to_slug.transliterate(:german).to_s  #=> "Juergen Mueller"
+```
 Currently supported languages include:
@@ -28,6 +31,7 @@ Currently supported languages include:
 * Danish
 * German
 * Greek
+* Hindi
 * Macedonian
 * Norwegian
 * Romanian
@@ -35,124 +39,125 @@ Currently supported languages include:
 * Serbian
 * Spanish
 * Swedish
+* Turkish
 * Ukrainian
+* Vietnamese
+Additionally there are generic transliterators for transliterating from the
+Cyrillic alphabet and Latin alphabet with diacritics. The Latin transliterator
+can be used, for example, with Czech. There is also a transliterator named
+"Hindi" which may be sufficient for other Indic languages using Devanagari, but
+I do not know enough to say whether the transliterations would make sense.
 I'll gladly accept contributions from fluent speakers to support more languages.
 ### Strip non-ASCII characters
-    "Gölcük, Turkey".to_slug.to_ascii.to_s #=> "Glck, Turkey"
+```ruby
+"Gölcük, Turkey".to_slug.to_ascii.to_s #=> "Glck, Turkey"
+```
 ### Truncate by characters
-    "üüü".to_slug.truncate(2).to_s #=> "üü"
+```ruby
+"üüü".to_slug.truncate(2).to_s #=> "üü"
+```
 ### Truncate by bytes
 This can be useful to ensure the generated slug will fit in a database column
 whose length is limited by bytes rather than UTF-8 characters.
-    "üüü".to_slug.truncate_bytes(2).to_s #=> "ü"
+```ruby
+"üüü".to_slug.truncate_bytes(2).to_s #=> "ü"
+```
 ### Remove punctuation chars
-    "this is, um, **really** cool, huh?".to_slug.word_chars.to_s #=> "this is um really cool huh"
+```ruby
+"this is, um, **really** cool, huh?".to_slug.word_chars.to_s #=> "this is um really cool huh"
+```
 ### All-in-one
-    "Gölcük, Turkey".to_slug.normalize.to_s #=> "golcuk-turkey"
+```ruby
+"Gölcük, Turkey".to_slug.normalize.to_s #=> "golcuk-turkey"
+```
 ### Other stuff
-#### Using Babosa With FriendlyId 4
-    require "babosa"
-    class Person < ActiveRecord::Base
-      friendly_id :name, use: :slugged
-      def normalize_friendly_id(input)
-        input.to_s.to_slug.normalize(transliterations: :russian).to_s
-      end
-    end
-#### Pedantic UTF-8 support
+#### Using Babosa With FriendlyId 4+
-Babosa goes out of its way to handle [nasty Unicode issues you might never think
-you would have](https://github.com/norman/enc/blob/master/equivalence.rb) by
-checking, sanitizing and normalizing your string input.
+```ruby
+require "babosa"
-It will automatically use whatever Unicode library you have loaded before
-Babosa, or fall back to a simple built-in library. Supported
-Unicode libraries include:
+class Person < ActiveRecord::Base
+  friendly_id :name, use: :slugged
-* Java (only on JRuby of course)
-* Active Support
-* [Unicode](https://github.com/blackwinter/unicode)
-* Built-in
+  def normalize_friendly_id(input)
+    input.to_s.to_slug.normalize(transliterations: :russian).to_s
+  end
+end
+```
-This built-in module is much faster than Active Support but much slower than
-Java or Unicode. It can only do **very** naive Unicode composition to ensure
-that, for example, "é" will always be composed to a single codepoint rather than
-an "e" and a "´" - making it safe to use as a hash key.
+#### UTF-8 support
-But seriously - save yourself the headache and install a real Unicode library.
-If you are using Babosa with a language that uses the Cyrillic alphabet, Babosa
-requires either Unicode, Active Support or Java.
+Babosa normalizes all input strings [to NFC](https://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms).
 #### Ruby Method Names
-Babosa can also generate strings for Ruby method names. (Yes, Ruby 1.9 can use
+Babosa can generate strings for Ruby method names. (Yes, Ruby 1.9+ can use
 UTF-8 chars in method names, but you may not want to):
-    "this is a method".to_slug.to_ruby_method! #=> this_is_a_method
-    "über cool stuff!".to_slug.to_ruby_method! #=> uber_cool_stuff!
+```ruby
+"this is a method".to_slug.to_ruby_method! #=> this_is_a_method
+"über cool stuff!".to_slug.to_ruby_method! #=> uber_cool_stuff!
-    # You can also disallow trailing punctuation chars
-    "über cool stuff!".to_slug.to_ruby_method(false) #=> uber_cool_stuff
+# You can also disallow trailing punctuation chars
+"über cool stuff!".to_slug.to_ruby_method(allow_bangs: false) #=> uber_cool_stuff
+```
 #### Easy to Extend
 You can add custom transliterators for your language with very little code. For
 example here's the transliterator for German:
-    # encoding: utf-8
-    module Babosa
-      module Transliterator
-        class German < Latin
-          APPROXIMATIONS = {
-            "ä" => "ae",
-            "ö" => "oe",
-            "ü" => "ue",
-            "Ä" => "Ae",
-            "Ö" => "Oe",
-            "Ü" => "Ue"
-          }
-        end
-      end
+```ruby
+module Babosa
+  module Transliterator
+    class German < Latin
+      APPROXIMATIONS = {
+        "ä" => "ae",
+        "ö" => "oe",
+        "ü" => "ue",
+        "Ä" => "Ae",
+        "Ö" => "Oe",
+        "Ü" => "Ue"
+      }
     end
+  end
+end
+```
 And a spec (you can use this as a template):
-    # encoding: utf-8
-    require File.expand_path("../../spec_helper", __FILE__)
+```ruby
+require "spec_helper"
-    describe Babosa::Transliterator::German do
+describe Babosa::Transliterator::German do
+  let(:t) { described_class.instance }
+  it_behaves_like "a latin transliterator"
-      let(:t) { described_class.instance }
-      it_behaves_like "a latin transliterator"
-      it "should transliterate Eszett" do
-        t.transliterate("ß").should eql("ss")
-      end
-      it "should transliterate vowels with umlauts" do
-        t.transliterate("üöä").should eql("ueoeae")
-      end
-    end
+  it "should transliterate Eszett" do
+    t.transliterate("ß").should eql("ss")
+  end
+  it "should transliterate vowels with umlauts" do
+    t.transliterate("üöä").should eql("ueoeae")
+  end
+end
+```
 ### Rails 3.x and higher
@@ -167,46 +172,6 @@ and
 [parameterize](http://api.rubyonrails.org/classes/ActiveSupport/Inflector.html#method-i-parameterize)
 to see if they suit your needs.
-### Babosa vs. Stringex
-Babosa provides much of the functionality provided by the
-[Stringex](https://github.com/rsl/stringex) gem, but in the subjective opinion
-of the author, is for most use cases a better choice.
-#### Fewer Features
-Stringex offers functionality for storing slugs in an Active Record model, like
-a simple version of [FriendlyId](http://github.com/norman/friendly_id), in
-addition to string processing. Babosa only does string processing.
-#### Less Aggressive Unicode Transliteration
-Stringex uses an agressive Unicode to ASCII mapping which outputs gibberish for
-almost anything but Western European langages and Mandarin Chinese. Babosa
-supports only languages for which fluent speakers have provided
-transliterations, to ensure that the output makes sense to users.
-#### Unicode Support
-Stringex does no Unicode normalization or validation before transliterating
-strings, so if you pass in strings with encoding errors or with different
-Unicode normalizations, you'll get unpredictable results.
-#### No Locale Assumptions
-Babosa avoids making assumptions about locales like Stringex does, so it doesn't
-offer transliterations like this out of the box:
-    "$12 worth of Ruby power".to_url => "12-dollars-worth-of-ruby-power"
-This is because the symbol "$" is used in many Latin American countries for the
-peso. Stringex does this in many places, for example, transliterating all Han
-characters into Pinyin, effectively treating Japanese text as if it were
-Mandarin Chinese.
-### More info
 Please see the [API docs](http://rubydoc.info/github/norman/babosa/master/frames) and source code for
 more info.
@@ -218,9 +183,6 @@ Babosa can be installed via Rubygems:
 You can get the source code from its [Github repository](http://github.com/norman/babosa).
-Babosa is tested to be compatible with Ruby 2.x, JRuby 1.7+, and
-Rubinius 2.x It's probably compatible with other Rubies as well.
 ## Reporting bugs
 Please use Babosa's [Github issue
@@ -229,7 +191,7 @@ tracker](http://github.com/norman/babosa/issues).
 ## Misc
-"Babosa" means slug in Spanish.
+"Babosa" means "slug" in Spanish.
 ## Author
@@ -258,7 +220,7 @@ Many thanks to the following people for their help:
 ## Copyright
-Copyright (c) 2010-2013 Norman Clarke
+Copyright (c) 2010-2020 Norman Clarke
 Permission is hereby granted, free of charge, to any person obtaining a copy of
 this software and associated documentation files (the "Software"), to deal in
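
The new README states that Babosa normalizes all input strings to NFC. As a rough illustration of what that means, the sketch below uses only Ruby's built-in `String#unicode_normalize`; the helper name `nfc` is an assumption made for this example and does not come from the gem.

```ruby
# Hypothetical helper (not part of Babosa): compose a string to NFC using
# Ruby's built-in Unicode normalization.
def nfc(string)
  string.unicode_normalize(:nfc)
end

decomposed = "e\u0301" # "e" followed by U+0301 COMBINING ACUTE ACCENT
composed   = nfc(decomposed)

puts decomposed.length      # two codepoints before normalization
puts composed.length        # one codepoint after normalization
puts composed == "\u00E9"   # same as the precomposed "é"
```

Because both spellings collapse to the same codepoint sequence, a normalized string can be compared or used as a hash key without the two forms being treated as different values.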

data/Rakefile CHANGED

@@ -1,10 +1,13 @@
+# frozen_string_literal: true
 require "rubygems"
 require "rake/testtask"
 require "rake/clean"
 require "rubygems/package_task"
+require "rubocop/rake_task"
-task :default => :spec
-task :test    => :spec
+task default: [:rubocop, :spec]
+task test: :spec
 CLEAN << "pkg" << "doc" << "coverage" << ".yardoc"
@@ -14,6 +17,7 @@ begin
     t.options = ["--output-dir=doc"]
   end
 rescue LoadError
+  puts "Yard not present"
 end
 begin
@@ -23,12 +27,9 @@ begin
     Rake::Task["spec"].execute
   end
 rescue LoadError
+  puts "SimpleCov not present"
 end
-gemspec = File.expand_path("../babosa.gemspec", __FILE__)
-if File.exist? gemspec
-  Gem::PackageTask.new(eval(File.read(gemspec))) { |pkg| }
-end
-require 'rspec/core/rake_task'
+require "rspec/core/rake_task"
 RSpec::Core::RakeTask.new(:spec)
+RuboCop::RakeTask.new

data/lib/babosa.rb CHANGED

@@ -1,7 +1,6 @@
+# frozen_string_literal: true
 module Babosa
-  def self.jruby15?
-    JRUBY_VERSION >= "1.5" rescue false
-  end
 end
 class String
@@ -12,5 +11,4 @@ class String
 end
 require "babosa/transliterator/base"
-require "babosa/utf8/proxy"
 require "babosa/identifier"
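
With `babosa/utf8/proxy` no longer required, the `identifier.rb` diff shows `tidy_bytes!` relying on Ruby's own `String#scrub` and `String#encode`. The sketch below reproduces that idea on a plain `String`, assuming stray bytes are Windows-1252; the standalone method `tidy_bytes` is a hypothetical stand-in, not the gem's method.

```ruby
# Hypothetical standalone version (not part of Babosa). Each invalid byte
# sequence found by String#scrub is reinterpreted as Windows-1252 and
# transcoded to UTF-8. Bytes that Windows-1252 does not define are replaced
# with the Unicode replacement character.
def tidy_bytes(string)
  string.scrub do |bad|
    bad.encode(Encoding::UTF_8, Encoding::Windows_1252, invalid: :replace, undef: :replace)
  end
end

broken = "caf\xE9" # 0xE9 is "é" in Windows-1252 but invalid on its own in UTF-8
puts broken.valid_encoding?
puts tidy_bytes(broken)
puts tidy_bytes(broken).valid_encoding?
```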

data/lib/babosa/identifier.rb CHANGED

@@ -1,16 +1,6 @@
-# encoding: utf-8
-module Babosa
-  # Codepoints for characters that will be deleted by +#word_chars!+.
-  STRIPPABLE = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 14, 15, 16, 17, 18, 19,
-    20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 33, 34, 35, 36, 37, 38, 39,
-    40, 41, 42, 43, 44, 45, 46, 47, 58, 59, 60, 61, 62, 63, 64, 91, 92, 93, 94,
-    96, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136,
-    137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151,
-    152, 153, 154, 155, 156, 157, 158, 159, 161, 162, 163, 164, 165, 166, 167,
-    168, 169, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 182, 183, 184,
-    185, 187, 188, 189, 190, 191, 215, 247, 8203, 8204, 8205, 8239, 65279]
+# frozen_string_literal: true
+module Babosa
   # This class provides some string-manipulation methods specific to slugs.
   #
   # Note that this class includes many "bang methods" such as {#clean!} and
@@ -20,7 +10,7 @@ module Babosa
   # it is generated dynamically.
   #
   # All of the bang methods return an instance of String, while the bangless
-  # versions return an instance of Babosa::Identifier, so that calls to methods
+  # versions return an instance of {Babosa::Identifier}, so that calls to methods
   # specific to this class can be chained:
   #
   #   string = Identifier.new("hello world")
@@ -29,71 +19,47 @@ module Babosa
   #
   # @see http://www.utf8-chartable.de/unicode-utf8-table.pl?utf8=dec Unicode character table
   class Identifier
     Error = Class.new(StandardError)
     attr_reader :wrapped_string
     alias to_s wrapped_string
-    @@utf8_proxy = if Babosa.jruby15?
-      UTF8::JavaProxy
-    elsif defined? Unicode::VERSION
-      UTF8::UnicodeProxy
-    elsif defined? ActiveSupport
-      UTF8::ActiveSupportProxy
-    else
-      UTF8::DumbProxy
-    end
-    # Return the proxy used for UTF-8 support.
-    # @see Babosa::UTF8::Proxy
-    def self.utf8_proxy
-      @@utf8_proxy
-    end
-    # Set a proxy object used for UTF-8 support.
-    # @see Babosa::UTF8::Proxy
-    def self.utf8_proxy=(obj)
-      @@utf8_proxy = obj
-    end
     def method_missing(symbol, *args, &block)
       @wrapped_string.__send__(symbol, *args, &block)
     end
+    def respond_to_missing?(name, include_all)
+      @wrapped_string.respond_to?(name, include_all)
+    end
     # @param string [#to_s] The string to use as the basis of the Identifier.
     def initialize(string)
-      @wrapped_string = string.to_s
+      @wrapped_string = string.to_s.dup
       tidy_bytes!
       normalize_utf8!
     end
-    def ==(value)
-      @wrapped_string.to_s == value.to_s
+    def ==(other)
+      to_s == other.to_s
     end
-    def eql?(value)
-      @wrapped_string == value
+    def eql?(other)
+      self == other
     end
-    def empty?
-      # included to make this class :respond_to? :empty for compatibility with Active Support's
-      # #blank?
-      @wrapped_string.empty?
-    end
-    # Approximate an ASCII string. This works only for Western strings using
-    # characters that are Roman-alphabet characters + diacritics. Non-letter
-    # characters are left unmodified.
+    # Approximate an ASCII string. This works only for strings using characters
+    # that are Roman-alphabet characters + diacritics. Non-letter characters
+    # are left unmodified.
     #
-    #   string = Identifier.new "Łódź
+    #   string = Identifier.new "Łódź, Poland"
     #   string.transliterate                 # => "Lodz, Poland"
     #   string = Identifier.new "日本"
     #   string.transliterate                 # => "日本"
     #
-    # You can pass any key(s) from +Characters.approximations+ as arguments. This allows
-    # for contextual approximations. Various languages are supported, you can see which ones
-    # by looking at the source of {Babosa::Transliterator::Base}.
+    # You can pass the names of any transliterator class as arguments. This
+    # allows for contextual approximations. Various languages are supported,
+    # you can see which ones by looking at the source of
+    # {Babosa::Transliterator::Base}.
     #
     #   string = Identifier.new "Jürgen Müller"
     #   string.transliterate                 # => "Jurgen Muller"
@@ -111,7 +77,7 @@ module Babosa
     # to remove non-ASCII characters such as "¡" and "¿", use {#to_ascii!}:
     #
     #   string.transliterate!(:spanish)       # => "¡Feliz anio!"
-    #   string.transliterate!                 # => "¡Feliz anio!"
+    #   string.to_ascii!                      # => "Feliz anio!"
     #
     # @param *args <Symbol>
     # @return String
@@ -122,40 +88,50 @@ module Babosa
         transliterator = Transliterator.get(kind).instance
         @wrapped_string = transliterator.transliterate(@wrapped_string)
       end
-      @wrapped_string
+      to_s
     end
     # Converts dashes to spaces, removes leading and trailing spaces, and
     # replaces multiple whitespace characters with a single space.
+    #
     # @return String
     def clean!
-      @wrapped_string = @wrapped_string.gsub("-", " ").squeeze(" ").strip
+      gsub!(/[- ]+/, " ")
+      strip!
+      to_s
     end
     # Remove any non-word characters. For this library's purposes, this means
-    # anything other than letters, numbers, spaces, newlines and linefeeds.
+    # anything other than letters, numbers, spaces, underscores, dashes,
+    # newlines, and linefeeds.
+    #
     # @return String
     def word_chars!
-      @wrapped_string = (unpack("U*") - Babosa::STRIPPABLE).pack("U*")
+      # `^\p{letter}` = Any non-Unicode letter
+      # `&&` = add the following character class
+      # `[^ _\n\r]` = Anything other than space, underscore, newline or linefeed
+      gsub!(/[[^\p{letter}]&&[^ \d_\-\n\r]]/, "")
+      to_s
     end
     # Normalize the string for use as a URL slug. Note that in this context,
     # +normalize+ means, strip, remove non-letters/numbers, downcasing,
     # truncating to 255 bytes and converting whitespace to dashes.
-    # @param Options
+    #
+    # @param options [Hash]
     # @return String
-    def normalize!(options = nil)
-      options = default_normalize_options.merge(options || {})
+    def normalize!(options = {})
+      options = default_normalize_options.merge(options)
-      if translit_option = options[:transliterate]
-        if translit_option != true
-          transliterate!(*translit_option)
-        else
+      if options[:transliterate]
+        option = options[:transliterate]
+        if option == true
           transliterate!(*options[:transliterations])
+        else
+          transliterate!(*option)
         end
       end
       to_ascii! if options[:to_ascii]
-      clean!
       word_chars!
       clean!
       downcase!
@@ -164,105 +140,103 @@ module Babosa
     end
     # Normalize a string so that it can safely be used as a Ruby method name.
-    def to_ruby_method!(allow_bangs = true)
-      leader, trailer = @wrapped_string.strip.scan(/\A(.+)(.)\z/).flatten
-      leader          = leader.to_s.dup
-      trailer         = trailer.to_s.dup
-      if allow_bangs
-        trailer.downcase!
-        trailer.gsub!(/[^a-z0-9!=\\?]/, '')
-      else
-        trailer.downcase!
-        trailer.gsub!(/[^a-z0-9]/, '')
-      end
-      id = leader.to_identifier
-      id.transliterate!
-      id.to_ascii!
-      id.clean!
-      id.word_chars!
-      id.clean!
-      @wrapped_string = id.to_s + trailer
-      if @wrapped_string == ""
-        raise Error, "Input generates impossible Ruby method name"
-      end
+    #
+    # @param allow_bangs [Boolean]
+    # @return String
+    def to_ruby_method!(allow_bangs: true)
+      last_char = self[-1]
+      transliterate!
+      to_ascii!
+      word_chars!
+      strip_leading_digits!
+      clean!
+      @wrapped_string += last_char if allow_bangs && ["!", "?"].include?(last_char)
+      raise Error, "Input generates impossible Ruby method name" if self == ""
       with_separators!("_")
     end
     # Delete any non-ascii characters.
+    #
     # @return String
     def to_ascii!
-      @wrapped_string = @wrapped_string.gsub(/[^\x00-\x7f]/u, '')
+      gsub!(/[^\x00-\x7f]/u, "")
+      to_s
     end
     # Truncate the string to +max+ characters.
+    #
     # @example
     #   "üéøá".to_identifier.truncate(3) #=> "üéø"
+    #
+    # @param max [Integer] The maximum number of characters.
     # @return String
     def truncate!(max)
-      @wrapped_string = unpack("U*")[0...max].pack("U*")
+      @wrapped_string = slice(0, max)
     end
     # Truncate the string to +max+ bytes. This can be useful for ensuring that
     # a UTF-8 string will always fit into a database column with a certain max
     # byte length. The resulting string may be less than +max+ if the string must
     # be truncated at a multibyte character boundary.
+    #
     # @example
     #   "üéøá".to_identifier.truncate_bytes(3) #=> "ü"
+    #
+    # @param max [Integer] The maximum number of bytes.
     # @return String
     def truncate_bytes!(max)
-      return @wrapped_string if @wrapped_string.bytesize <= max
-      curr = 0
-      new = []
-      unpack("U*").each do |char|
-        break if curr > max
-        char = [char].pack("U")
-        curr += char.bytesize
-        if curr <= max
-          new << char
-        end
-      end
-      @wrapped_string = new.join
+      truncate!(max)
+      chop! until bytesize <= max
     end
     # Replaces whitespace with dashes ("-").
+    #
+    # @param char [String] the separator character to use.
     # @return String
     def with_separators!(char = "-")
-      @wrapped_string = @wrapped_string.gsub(/\s/u, char)
-    end
-    # Perform UTF-8 sensitive upcasing.
-    # @return String
-    def upcase!
-      @wrapped_string = @@utf8_proxy.upcase(@wrapped_string)
+      gsub!(/\s/u, char)
+      to_s
     end
-    # Perform UTF-8 sensitive downcasing.
+    # Perform Unicode composition on the wrapped string.
+    #
     # @return String
-    def downcase!
-      @wrapped_string = @@utf8_proxy.downcase(@wrapped_string)
+    def normalize_utf8!
+      unicode_normalize!(:nfc)
+      to_s
     end
-    # Perform Unicode composition on the wrapped string.
+    # Strip any leading digits.
+    #
     # @return String
-    def normalize_utf8!
-      @wrapped_string = @@utf8_proxy.normalize_utf8(@wrapped_string)
+    def strip_leading_digits!
+      gsub!(/^\d+/, "")
+      to_s
     end
     # Attempt to convert characters encoded using CP1252 and IS0-8859-1 to
     # UTF-8.
     # @return String
     def tidy_bytes!
-      @wrapped_string = @@utf8_proxy.tidy_bytes(@wrapped_string)
+      scrub! do |bad|
+        bad.encode(Encoding::UTF_8, Encoding::Windows_1252, invalid: :replace, undef: :replace)
+      end
+      to_s
     end
-    %w[transliterate clean downcase word_chars normalize normalize_utf8
-      tidy_bytes to_ascii to_ruby_method truncate truncate_bytes upcase
-      with_separators].each do |method|
-      class_eval(<<-EOM, __FILE__, __LINE__ + 1)
+    %w[clean downcase normalize normalize_utf8 strip_leading_digits
+       tidy_bytes to_ascii transliterate truncate truncate_bytes upcase
+       with_separators word_chars].each do |method|
+      class_eval(<<-METHOD, __FILE__, __LINE__ + 1)
         def #{method}(*args)
-          send_to_new_instance(:#{method}!, *args)
+          with_new_instance { |id| id.send(:#{method}!, *args) }
         end
-      EOM
+      METHOD
+    end
+    def to_ruby_method(allow_bangs: true)
+      with_new_instance { |id| id.to_ruby_method!(allow_bangs: allow_bangs) }
     end
     def to_identifier
@@ -271,7 +245,7 @@ module Babosa
     # The default options for {#normalize!}. Override to set your own defaults.
     def default_normalize_options
-      {:transliterate => true, :max_length => 255, :separator => "-"}
+      {transliterate: :latin, max_length: 255, separator: "-"}
     end
     alias approximate_ascii transliterate
@@ -282,12 +256,13 @@ module Babosa
     private
-    # Used as the basis of the bangless methods.
-    def send_to_new_instance(*args)
-      id = Identifier.allocate
-      id.instance_variable_set :@wrapped_string, to_s
-      id.send(*args)
-      id
+    # Used as the basis of the non-mutating (bangless) methods.
+    def with_new_instance
+      Identifier.allocate.tap do |id|
+        id.instance_variable_set :@wrapped_string, to_s
+        yield id
+      end
     end
   end
 end
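
The rewritten `truncate_bytes!` in the diff above replaces the manual codepoint loop with two steps: truncate to `max` characters, then chop characters until the byte size fits. A minimal standalone sketch of that approach follows, assuming a non-negative `max`; the free function `truncate_bytes` is a hypothetical name and is not the gem's method.

```ruby
# Hypothetical standalone version (not part of Babosa); assumes max >= 0.
#
# Every UTF-8 character occupies at least one byte, so the first `max`
# characters are an upper bound on what can fit in `max` bytes. Whole
# characters are then removed from the end until the byte size fits, which
# means a multibyte character is never split.
def truncate_bytes(string, max)
  result = string[0, max].dup
  result.chop! until result.bytesize <= max
  result
end

puts truncate_bytes("üéøá", 3)          # each character here is 2 bytes
puts truncate_bytes("üéøá", 3).bytesize
```

The result may be shorter than `max` bytes when the limit falls inside a multibyte character, which matches the behavior described in the method's comment in the diff.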