byk 0.6.0 → 1.0.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +5 -0
- data/README.md +123 -51
- data/exe/byk +51 -0
- data/ext/byk/byk.c +261 -182
- data/lib/byk/version.rb +1 -1
- data/spec/byk_spec.rb +97 -40
- metadata +25 -8
    
        checksums.yaml
    CHANGED
    
    | @@ -1,7 +1,7 @@ | |
| 1 1 | 
             
            ---
         | 
| 2 2 | 
             
            SHA1:
         | 
| 3 | 
            -
              metadata.gz:  | 
| 4 | 
            -
              data.tar.gz:  | 
| 3 | 
            +
              metadata.gz: cc996c9d9dc81f884e02cc1dd760eeb57b6545fc
         | 
| 4 | 
            +
              data.tar.gz: de07860c2cb41bcb39b299fee4500fd2bf01db73
         | 
| 5 5 | 
             
            SHA512:
         | 
| 6 | 
            -
              metadata.gz:  | 
| 7 | 
            -
              data.tar.gz:  | 
| 6 | 
            +
              metadata.gz: 16e97855924c380b205e2e651fdcde391785fe051c2971d948f801ff4260eb691dc4c3304ac17b3083fc0a2469f26d134c9622f74b058f6950d5fd8dfaf62383
         | 
| 7 | 
            +
              data.tar.gz: c85659aaaccbc5e1db30305b52e2f4955de160dcb7a617ae564877619fe5f36d852ea7c17228f301e639aae3d4793133baa5d644faeffc83942d6b179bef53e9
         | 
    
        data/CHANGELOG.md
    CHANGED
    
    
    
        data/README.md
    CHANGED
    
    | @@ -4,39 +4,85 @@ Byk | |
| 4 4 | 
             
            [](https://rubygems.org/gems/byk)
         | 
| 5 5 | 
             
            [](https://travis-ci.org/topalovic/byk)
         | 
| 6 6 |  | 
| 7 | 
            -
            Ruby gem for fast transliteration of Serbian Cyrillic  | 
| 8 | 
            -
            <br />
         | 
| 9 | 
            -
            <sub>Inspired by @dejan's
         | 
| 10 | 
            -
            [nice little gem](https://github.com/dejan/srbovanje),
         | 
| 11 | 
            -
            this one comes with a C-optimized twist</sub>
         | 
| 7 | 
            +
            Ruby gem for fast transliteration of Serbian Cyrillic ↔ Latin
         | 
| 12 8 |  | 
| 13 9 | 
             
            
         | 
| 14 10 |  | 
| 15 11 |  | 
| 16 12 | 
             
            ## Installation
         | 
| 17 13 |  | 
| 18 | 
            -
             | 
| 14 | 
            +
            Byk can be used as a standalone console utility or as a `String`
         | 
| 15 | 
            +
            extension in your Ruby programs. It has zero dependencies beyond
         | 
| 16 | 
            +
            vanilla Ruby and the toolchain for building native gems <sup>1</sup>.
         | 
| 17 | 
            +
             | 
| 18 | 
            +
            You can install it directly:
         | 
| 19 | 
            +
             | 
| 20 | 
            +
            ```ruby
         | 
| 21 | 
            +
            $ gem install byk
         | 
| 22 | 
            +
            ```
         | 
| 23 | 
            +
             | 
| 24 | 
            +
            or add it as a dependency in your application's Gemfile:
         | 
| 19 25 |  | 
| 20 26 | 
             
            ```ruby
         | 
| 21 27 | 
             
            gem "byk"
         | 
| 22 28 | 
             
            ```
         | 
| 23 29 |  | 
| 24 | 
            -
             | 
| 30 | 
            +
            <sub><sup>1</sup> For Windows, you might want to check out
         | 
| 31 | 
            +
            [DevKit](https://github.com/oneclick/rubyinstaller/wiki/Development-Kit)</sub>
         | 
| 32 | 
            +
             | 
| 33 | 
            +
             | 
| 34 | 
            +
            ## Usage
         | 
| 35 | 
            +
             | 
| 36 | 
            +
            ### As a standalone utility
         | 
| 37 | 
            +
             | 
| 38 | 
            +
            Here's the help banner with all the available options:
         | 
| 25 39 |  | 
| 26 40 | 
             
            ```
         | 
| 27 | 
            -
             | 
| 41 | 
            +
            usage: byk [options] [files]
         | 
| 42 | 
            +
             | 
| 43 | 
            +
            options:
         | 
| 44 | 
            +
              -c, --cyrillic       convert input to Cyrillic (default)
         | 
| 45 | 
            +
              -l, --latin          convert input to Latin
         | 
| 46 | 
            +
              -a, --ascii          convert input to "ASCII Latin"
         | 
| 47 | 
            +
              -v, --version        show version
         | 
| 28 48 | 
             
            ```
         | 
| 29 49 |  | 
| 30 | 
            -
             | 
| 50 | 
            +
            Translation goes to stdout so you can redirect it or pipe it as you
         | 
| 51 | 
            +
            see fit. Let's take a look at some common scenarios.
         | 
| 31 52 |  | 
| 53 | 
            +
            To translate files to Cyrillic:
         | 
| 54 | 
            +
            ```sh
         | 
| 55 | 
            +
            $ byk in1.txt in2.txt > out.txt
         | 
| 32 56 | 
             
            ```
         | 
| 33 | 
            -
             | 
| 57 | 
            +
             | 
| 58 | 
            +
            To translate files to Latin and search for a phrase:
         | 
| 59 | 
            +
            ```sh
         | 
| 60 | 
            +
            $ byk -l file.txt | grep stvar
         | 
| 34 61 | 
             
            ```
         | 
| 35 62 |  | 
| 63 | 
            +
            Ad hoc conversion:
         | 
| 64 | 
            +
            ```sh
         | 
| 65 | 
            +
            $ echo "Вук Стефановић Караџић" | byk -a
         | 
| 66 | 
            +
            Vuk Stefanovic Karadzic
         | 
| 67 | 
            +
            ```
         | 
| 36 68 |  | 
| 37 | 
            -
             | 
| 69 | 
            +
            or simply omit args and type away:
         | 
| 70 | 
            +
            ```sh
         | 
| 71 | 
            +
            $ byk
         | 
| 72 | 
            +
            a u ruke Mandušića Vuka
         | 
| 73 | 
            +
            biće svaka puška ubojita!
         | 
| 74 | 
            +
            ^D
         | 
| 75 | 
            +
            а у руке Мандушића Вука
         | 
| 76 | 
            +
            биће свака пушка убојита!
         | 
| 77 | 
            +
            ```
         | 
| 38 78 |  | 
| 39 | 
            -
             | 
| 79 | 
            +
            `^D` being <kbd>ctrl</kbd> <kbd>d</kbd>.
         | 
| 80 | 
            +
             | 
| 81 | 
            +
             | 
| 82 | 
            +
            ### As a `String` extension
         | 
| 83 | 
            +
             | 
| 84 | 
            +
            Unless you're using Bundler, make sure to require the gem in your
         | 
| 85 | 
            +
            initializer:
         | 
| 40 86 |  | 
| 41 87 | 
             
            ```ruby
         | 
| 42 88 | 
             
            require "byk"
         | 
| @@ -45,22 +91,23 @@ require "byk" | |
| 45 91 | 
             
            This will extend `String` with a couple of simple methods:
         | 
| 46 92 |  | 
| 47 93 | 
             
            ```ruby
         | 
| 48 | 
            -
            " | 
| 49 | 
            -
            "Шеширџија". | 
| 50 | 
            -
            " | 
| 94 | 
            +
            "Šeširdžija".to_cyrillic    # => "Шеширџија"
         | 
| 95 | 
            +
            "Шеширџија".to_latin        # => "Šeširdžija"
         | 
| 96 | 
            +
            "Шеширџија".to_ascii_latin  # => "Sesirdzija"
         | 
| 51 97 | 
             
            ```
         | 
| 52 98 |  | 
| 53 | 
            -
             | 
| 99 | 
            +
            These do not modify the receiver. For that, there's a destructive
         | 
| 100 | 
            +
            variant of each:
         | 
| 54 101 |  | 
| 55 102 | 
             
            ```ruby
         | 
| 56 | 
            -
            text = " | 
| 57 | 
            -
            text. | 
| 58 | 
            -
            text | 
| 59 | 
            -
            text.to_ascii_latin!  # => " | 
| 60 | 
            -
            text                  # => " | 
| 103 | 
            +
            text = "Šeširdžija"
         | 
| 104 | 
            +
            text.to_cyrillic!     # => "Шеширџија"
         | 
| 105 | 
            +
            text.to_latin!        # => "Šeširdžija"
         | 
| 106 | 
            +
            text.to_ascii_latin!  # => "Sesirdzija"
         | 
| 107 | 
            +
            text                  # => "Sesirdzija"
         | 
| 61 108 | 
             
            ```
         | 
| 62 109 |  | 
| 63 | 
            -
            Note that  | 
| 110 | 
            +
            Note that both latinization methods observe
         | 
| 64 111 | 
             
            [digraph capitalization rules](http://sr.wikipedia.org/wiki/Гајица#.D0.94.D0.B8.D0.B3.D1.80.D0.B0.D1.84.D0.B8):
         | 
| 65 112 |  | 
| 66 113 | 
             
            ```ruby
         | 
| @@ -68,63 +115,88 @@ Note that these methods take into account the | |
| 68 115 | 
             
            "ĐORĐE Đorđević".to_ascii_latin  # => "DJORDJE Djordjevic"
         | 
| 69 116 | 
             
            ```
         | 
| 70 117 |  | 
| 71 | 
            -
             | 
| 72 | 
            -
            require | 
| 118 | 
            +
             | 
| 119 | 
            +
            ### Safe require
         | 
| 120 | 
            +
             | 
| 121 | 
            +
            If you prefer not to monkey patch `String`, you can do a "safe"
         | 
| 122 | 
            +
            require in your Gemfile:
         | 
| 123 | 
            +
             | 
| 73 124 |  | 
| 74 125 | 
             
            ```ruby
         | 
| 75 | 
            -
            require "byk/safe"
         | 
| 126 | 
            +
            gem "byk", :require => "byk/safe"
         | 
| 76 127 | 
             
            ```
         | 
| 77 128 |  | 
| 78 | 
            -
             | 
| 129 | 
            +
            or initializer:
         | 
| 79 130 |  | 
| 80 131 | 
             
            ```ruby
         | 
| 81 | 
            -
             | 
| 82 | 
            -
            Byk.to_latin(text)   # => "Vuk"
         | 
| 83 | 
            -
            text                 # => "Byk"
         | 
| 84 | 
            -
            Byk.to_latin!(text)  # => "Vuk"
         | 
| 85 | 
            -
            text                 # => "Vuk"
         | 
| 132 | 
            +
            require "byk/safe"
         | 
| 86 133 | 
             
            ```
         | 
| 87 134 |  | 
| 135 | 
            +
            Then, you should rely on module methods:
         | 
| 88 136 |  | 
| 89 | 
            -
             | 
| 137 | 
            +
            ```ruby
         | 
| 138 | 
            +
            text = "Жвазбука"
         | 
| 90 139 |  | 
| 91 | 
            -
             | 
| 140 | 
            +
            Byk.to_latin(text)   # => "Žvazbuka"
         | 
| 141 | 
            +
            text                 # => "Жвазбука"
         | 
| 142 | 
            +
             | 
| 143 | 
            +
            Byk.to_latin!(text)  # => "Žvazbuka"
         | 
| 144 | 
            +
            text                 # => "Žvazbuka"
         | 
| 92 145 |  | 
| 146 | 
            +
            # etc.
         | 
| 93 147 | 
             
            ```
         | 
| 94 | 
            -
             | 
| 95 | 
            -
             | 
| 148 | 
            +
             | 
| 149 | 
            +
             | 
| 150 | 
            +
            ## How fast is "fast" transliteration?
         | 
| 151 | 
            +
             | 
| 152 | 
            +
            Here's a quick test:
         | 
| 153 | 
            +
             | 
| 154 | 
            +
            ```sh
         | 
| 155 | 
            +
            $ wget https://sr.wikipedia.org/ -O sample
         | 
| 156 | 
            +
            $ du -h sample
         | 
| 157 | 
            +
            128K
         | 
| 158 | 
            +
             | 
| 159 | 
            +
            $ time byk -l sample > /dev/null
         | 
| 160 | 
            +
            0.08s user 0.04s system 96% cpu 0.126 total
         | 
| 96 161 | 
             
            ```
         | 
| 97 162 |  | 
| 163 | 
            +
            Let's up the ante:
         | 
| 164 | 
            +
             | 
| 165 | 
            +
            ```sh
         | 
| 166 | 
            +
            $ for i in {1..800}; do cat sample; done > big
         | 
| 167 | 
            +
            $ du -h big
         | 
| 168 | 
            +
            97M
         | 
| 169 | 
            +
             | 
| 170 | 
            +
            $ time byk -l big > /dev/null
         | 
| 171 | 
            +
            1.71s user 0.13s system 99% cpu 1.846 total
         | 
| 172 | 
            +
            ```
         | 
| 98 173 |  | 
| 99 | 
            -
             | 
| 174 | 
            +
            So, ~100MB in under 2s. Fast enough, I suppose. You can expect it to
         | 
| 175 | 
            +
            scale linearly.
         | 
| 100 176 |  | 
| 101 | 
            -
             | 
| 102 | 
            -
             | 
| 103 | 
            -
             | 
| 177 | 
            +
            Compared to the pure Ruby implementation, it is about
         | 
| 178 | 
            +
            [10-30x faster](benchmark), depending on the input composition and the
         | 
| 179 | 
            +
            transliteration method applied.
         | 
| 104 180 |  | 
| 105 181 |  | 
| 106 | 
            -
            ##  | 
| 182 | 
            +
            ## Testing
         | 
| 107 183 |  | 
| 108 | 
            -
             | 
| 109 | 
            -
            projects, e.g. sites supporting dual script content. Remember,
         | 
| 110 | 
            -
            `Benchmark` is your friend.
         | 
| 184 | 
            +
            To test the gem, clone the repo and run:
         | 
| 111 185 |  | 
| 112 | 
            -
             | 
| 113 | 
            -
             | 
| 114 | 
            -
             | 
| 186 | 
            +
            ```
         | 
| 187 | 
            +
            $ bundle && bundle exec rake
         | 
| 188 | 
            +
            ```
         | 
| 115 189 |  | 
| 116 190 |  | 
| 117 191 | 
             
            ## Compatibility
         | 
| 118 192 |  | 
| 119 | 
            -
            Byk is supported under MRI  | 
| 193 | 
            +
            Byk is supported under MRI 1.9.2+. I might try my hand in writing a
         | 
| 194 | 
            +
            JRuby extension in a future release.
         | 
| 120 195 |  | 
| 121 | 
            -
            I don't plan to support 1.8.7 or older due to substantial C API
         | 
| 122 | 
            -
            changes between 1.8 and 1.9. It doesn't build under Rubinius
         | 
| 123 | 
            -
            currently, but I intend to support it in future releases.
         | 
| 124 196 |  | 
| 125 197 |  | 
| 126 198 | 
             
            ## License
         | 
| 127 199 |  | 
| 128 | 
            -
            This gem is released under the [MIT License]( | 
| 200 | 
            +
            This gem is released under the [MIT License](LICENSE).
         | 
| 129 201 |  | 
| 130 202 | 
             
            Уздравље!
         | 
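
The README above says both latinization methods observe digraph capitalization rules. The following is a minimal pure-Ruby sketch of one way to read that rule, not the gem's C implementation. It assumes Ruby 2.4+ (for Unicode-aware `upcase`/`downcase`). The names `CYR_TO_LAT`, `upper?`, `lower?` and `latinize_sketch` are hypothetical. The assumed rule is that an uppercase Љ, Њ or Џ becomes an all-caps digraph when the next character is uppercase, or when it ends a word whose previous character is uppercase, and is title-cased otherwise.

```ruby
# Pure-Ruby sketch of Cyrillic -> Latin with digraph capitalization.
# Hypothetical names; this is NOT how the gem's C extension is written.

CYR_TO_LAT = {
  "а" => "a", "б" => "b", "в" => "v", "г" => "g", "д" => "d", "ђ" => "đ",
  "е" => "e", "ж" => "ž", "з" => "z", "и" => "i", "ј" => "j", "к" => "k",
  "л" => "l", "љ" => "lj", "м" => "m", "н" => "n", "њ" => "nj", "о" => "o",
  "п" => "p", "р" => "r", "с" => "s", "т" => "t", "ћ" => "ć", "у" => "u",
  "ф" => "f", "х" => "h", "ц" => "c", "ч" => "č", "џ" => "dž", "ш" => "š"
}.freeze

# True when ch is a cased character in its uppercase form.
def upper?(ch)
  !ch.nil? && ch != ch.downcase
end

# True when ch is a cased character in its lowercase form.
def lower?(ch)
  !ch.nil? && ch != ch.upcase
end

def latinize_sketch(text)
  chars = text.chars
  out = chars.each_with_index.map do |ch, i|
    latin = CYR_TO_LAT[ch.downcase]
    next ch if latin.nil?                   # not Serbian Cyrillic: pass through
    next latin unless upper?(ch)            # lowercase letter
    next latin.upcase if latin.length == 1  # uppercase, single Latin letter

    # Uppercase digraph: decide between "LJ" and "Lj" from the neighbours.
    nxt = chars[i + 1]
    all_caps =
      if upper?(nxt) then true
      elsif lower?(nxt) then false
      else i > 0 && upper?(chars[i - 1])    # word end: look at previous char
      end
    all_caps ? latin.upcase : latin.capitalize
  end
  out.join
end

puts latinize_sketch("ЉУБАВ Љубав")  # prints "LJUBAV Ljubav"
puts latinize_sketch("Шеширџија")    # prints "Šeširdžija"
```

Looking at the neighbours rather than the whole word keeps the pass single and linear, which is the same scaling behaviour the README claims for the gem.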
    
        data/exe/byk
    ADDED
    
    | @@ -0,0 +1,51 @@ | |
| 1 | 
            +
            #!/usr/bin/env ruby
         | 
| 2 | 
            +
             | 
| 3 | 
            +
            require "byk/safe"
         | 
| 4 | 
            +
            require "optparse"
         | 
| 5 | 
            +
             | 
| 6 | 
            +
            trap "SIGINT" do
         | 
| 7 | 
            +
              exit 130
         | 
| 8 | 
            +
            end
         | 
| 9 | 
            +
             | 
| 10 | 
            +
            method_name = :to_cyrillic
         | 
| 11 | 
            +
             | 
| 12 | 
            +
            opts = OptionParser.new do |opt|
         | 
| 13 | 
            +
              opt.banner = "usage: byk [options] [files]"
         | 
| 14 | 
            +
              opt.summary_width = 20
         | 
| 15 | 
            +
             | 
| 16 | 
            +
              opt.separator ""
         | 
| 17 | 
            +
              opt.separator "options:"
         | 
| 18 | 
            +
             | 
| 19 | 
            +
              opt.on("-c", "--cyrillic", "convert input to Cyrillic (default)") do
         | 
| 20 | 
            +
                method_name = :to_cyrillic
         | 
| 21 | 
            +
              end
         | 
| 22 | 
            +
             | 
| 23 | 
            +
              opt.on("-l", "--latin", "convert input to Latin") do
         | 
| 24 | 
            +
                method_name = :to_latin
         | 
| 25 | 
            +
              end
         | 
| 26 | 
            +
             | 
| 27 | 
            +
              opt.on("-a", "--ascii", 'convert input to "ASCII Latin"') do
         | 
| 28 | 
            +
                method_name = :to_ascii_latin
         | 
| 29 | 
            +
              end
         | 
| 30 | 
            +
             | 
| 31 | 
            +
              opt.on_tail("-v", "--version", "show version") do
         | 
| 32 | 
            +
                puts Byk::VERSION
         | 
| 33 | 
            +
                exit
         | 
| 34 | 
            +
              end
         | 
| 35 | 
            +
            end
         | 
| 36 | 
            +
             | 
| 37 | 
            +
            begin
         | 
| 38 | 
            +
              opts.parse!
         | 
| 39 | 
            +
            rescue OptionParser::InvalidOption => e
         | 
| 40 | 
            +
              puts e
         | 
| 41 | 
            +
              puts
         | 
| 42 | 
            +
              puts opts
         | 
| 43 | 
            +
              exit 1
         | 
| 44 | 
            +
            end
         | 
| 45 | 
            +
             | 
| 46 | 
            +
            begin
         | 
| 47 | 
            +
              puts Byk.send(method_name, ARGF.read)
         | 
| 48 | 
            +
            rescue => e
         | 
| 49 | 
            +
              puts e
         | 
| 50 | 
            +
              exit 1
         | 
| 51 | 
            +
            end
         | 
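
The added `exe/byk` script selects a `Byk` method from the command-line flags and then applies it to the input. A minimal sketch of just the flag-to-method step is given here so it can be tried without the gem installed. `pick_method` is a hypothetical helper, not part of the gem, and it parses an explicit array rather than the real `ARGV`.

```ruby
require "optparse"

# Hypothetical helper: maps the flags used by exe/byk to a method name
# and returns it together with the remaining (non-option) arguments.
def pick_method(argv)
  method_name = :to_cyrillic  # same default as the script

  parser = OptionParser.new do |opt|
    opt.on("-c", "--cyrillic") { method_name = :to_cyrillic }
    opt.on("-l", "--latin")    { method_name = :to_latin }
    opt.on("-a", "--ascii")    { method_name = :to_ascii_latin }
  end

  # parse (without "!") leaves argv untouched and returns the leftovers.
  files = parser.parse(argv)
  [method_name, files]
end

p pick_method(["-l", "file.txt"])  # prints [:to_latin, ["file.txt"]]
```

When several conversion flags are given, the last one wins, because each handler simply overwrites `method_name`.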
    
        data/ext/byk/byk.c
    CHANGED
    
    | @@ -3,103 +3,225 @@ | |
| 3 3 |  | 
| 4 4 | 
             
            #define STR_ENC_GET(str) rb_enc_from_index(ENCODING_GET(str))
         | 
| 5 5 |  | 
| 6 | 
            -
             | 
| 7 | 
            -
             | 
| 8 | 
            -
             | 
| 6 | 
            +
            static inline void
         | 
| 7 | 
            +
            _str_cat_char(VALUE str, unsigned c, rb_encoding *enc)
         | 
| 8 | 
            +
            {
         | 
| 9 | 
            +
                char s[16];
         | 
| 10 | 
            +
                int n = rb_enc_codelen(c, enc);
         | 
| 11 | 
            +
                rb_enc_mbcput(c, s, enc);
         | 
| 12 | 
            +
                rb_str_buf_cat(str, s, n);
         | 
| 13 | 
            +
            }
         | 
| 9 14 |  | 
| 10 15 | 
             
            enum {
         | 
| 11 | 
            -
                LAT_CAP_TJ =  | 
| 12 | 
            -
                 | 
| 13 | 
            -
                 | 
| 14 | 
            -
                 | 
| 15 | 
            -
                 | 
| 16 | 
            -
                 | 
| 17 | 
            -
                 | 
| 18 | 
            -
                 | 
| 19 | 
            -
                 | 
| 20 | 
            -
                 | 
| 21 | 
            -
                 | 
| 22 | 
            -
                 | 
| 23 | 
            -
                 | 
| 24 | 
            -
                 | 
| 25 | 
            -
                 | 
| 26 | 
            -
                 | 
| 27 | 
            -
                CYR_CAP_A,
         | 
| 28 | 
            -
                CYR_CAP_ZH = 0x416,
         | 
| 29 | 
            -
                CYR_CAP_C  = 0x426,
         | 
| 30 | 
            -
                CYR_CAP_CH,
         | 
| 31 | 
            -
                CYR_CAP_SH,
         | 
| 32 | 
            -
                CYR_A  = 0x430,
         | 
| 33 | 
            -
                CYR_ZH = 0x436,
         | 
| 34 | 
            -
                CYR_C  = 0x446,
         | 
| 35 | 
            -
                CYR_CH,
         | 
| 36 | 
            -
                CYR_SH,
         | 
| 37 | 
            -
                CYR_DJ = 0x452,
         | 
| 38 | 
            -
                CYR_J  = 0x458,
         | 
| 39 | 
            -
                CYR_LJ,
         | 
| 40 | 
            -
                CYR_NJ,
         | 
| 41 | 
            -
                CYR_TJ,
         | 
| 42 | 
            -
                CYR_DZ = 0x45f
         | 
| 16 | 
            +
                LAT_CAP_TJ=262, LAT_TJ, LAT_CAP_CH=268, LAT_CH,
         | 
| 17 | 
            +
                LAT_CAP_DJ=272, LAT_DJ, LAT_CAP_SH=352, LAT_SH,
         | 
| 18 | 
            +
                LAT_CAP_ZH=381, LAT_ZH, CYR_CAP_DJ=1026, CYR_CAP_J=1032,
         | 
| 19 | 
            +
                CYR_CAP_LJ, CYR_CAP_NJ, CYR_CAP_TJ, CYR_CAP_DZ=1039,
         | 
| 20 | 
            +
                CYR_CAP_A, CYR_CAP_B, CYR_CAP_V, CYR_CAP_G,
         | 
| 21 | 
            +
                CYR_CAP_D, CYR_CAP_E, CYR_CAP_ZH, CYR_CAP_Z,
         | 
| 22 | 
            +
                CYR_CAP_I, CYR_CAP_K=1050, CYR_CAP_L, CYR_CAP_M,
         | 
| 23 | 
            +
                CYR_CAP_N, CYR_CAP_O, CYR_CAP_P, CYR_CAP_R,
         | 
| 24 | 
            +
                CYR_CAP_S, CYR_CAP_T, CYR_CAP_U, CYR_CAP_F,
         | 
| 25 | 
            +
                CYR_CAP_H, CYR_CAP_C, CYR_CAP_CH, CYR_CAP_SH,
         | 
| 26 | 
            +
                CYR_A=1072, CYR_B, CYR_V, CYR_G, CYR_D,
         | 
| 27 | 
            +
                CYR_E, CYR_ZH, CYR_Z, CYR_I, CYR_K=1082,
         | 
| 28 | 
            +
                CYR_L, CYR_M, CYR_N, CYR_O, CYR_P,
         | 
| 29 | 
            +
                CYR_R, CYR_S, CYR_T, CYR_U, CYR_F,
         | 
| 30 | 
            +
                CYR_H, CYR_C, CYR_CH, CYR_SH, CYR_DJ=1106,
         | 
| 31 | 
            +
                CYR_J=1112, CYR_LJ, CYR_NJ, CYR_TJ, CYR_DZ=1119
         | 
| 43 32 | 
             
            };
         | 
| 44 33 |  | 
| 45 | 
            -
            static inline unsigned | 
| 46 | 
            -
             | 
| 34 | 
            +
            static inline unsigned
         | 
| 35 | 
            +
            is_cap(unsigned codepoint)
         | 
| 47 36 | 
             
            {
         | 
| 48 | 
            -
                 | 
| 37 | 
            +
                if (codepoint >= 65 && codepoint <= 90) return 1;
         | 
| 38 | 
            +
                if (codepoint >= CYR_CAP_DJ && codepoint <= CYR_CAP_SH) return 1;
         | 
| 39 | 
            +
             | 
| 40 | 
            +
                switch(codepoint) {
         | 
| 41 | 
            +
                case LAT_CAP_TJ:
         | 
| 42 | 
            +
                case LAT_CAP_CH:
         | 
| 43 | 
            +
                case LAT_CAP_DJ:
         | 
| 44 | 
            +
                case LAT_CAP_SH:
         | 
| 45 | 
            +
                case LAT_CAP_ZH:
         | 
| 46 | 
            +
                    return 1;
         | 
| 47 | 
            +
                default:
         | 
| 48 | 
            +
                    return 0;
         | 
| 49 | 
            +
                }
         | 
| 49 50 | 
             
            }
         | 
| 50 51 |  | 
| 51 | 
            -
            static inline unsigned | 
| 52 | 
            -
             | 
| 52 | 
            +
            static inline unsigned
         | 
| 53 | 
            +
            is_digraph(unsigned codepoint)
         | 
| 53 54 | 
             
            {
         | 
| 54 | 
            -
                 | 
| 55 | 
            -
             | 
| 56 | 
            -
             | 
| 57 | 
            -
             | 
| 58 | 
            -
             | 
| 59 | 
            -
             | 
| 60 | 
            -
             | 
| 55 | 
            +
                switch(codepoint) {
         | 
| 56 | 
            +
                case CYR_LJ:
         | 
| 57 | 
            +
                case CYR_NJ:
         | 
| 58 | 
            +
                case CYR_DZ:
         | 
| 59 | 
            +
                case CYR_CAP_LJ:
         | 
| 60 | 
            +
                case CYR_CAP_NJ:
         | 
| 61 | 
            +
                case CYR_CAP_DZ:
         | 
| 62 | 
            +
                    return 1;
         | 
| 63 | 
            +
                default:
         | 
| 64 | 
            +
                    return 0;
         | 
| 65 | 
            +
                }
         | 
| 61 66 | 
             
            }
         | 
| 62 67 |  | 
| 63 | 
            -
            static  | 
| 64 | 
            -
             | 
| 68 | 
            +
            static unsigned
         | 
| 69 | 
            +
            digraph_to_cyr(unsigned codepoint, unsigned codepoint2, unsigned capitalize, unsigned *next_out)
         | 
| 65 70 | 
             
            {
         | 
| 66 | 
            -
                 | 
| 67 | 
            -
                     | 
| 68 | 
            -
                     | 
| 71 | 
            +
                static unsigned CYR_MAP[] = {
         | 
| 72 | 
            +
                    CYR_A, CYR_B, CYR_C, CYR_D, CYR_E, CYR_F,
         | 
| 73 | 
            +
                    CYR_G, CYR_H, CYR_I, CYR_J, CYR_K, CYR_L,
         | 
| 74 | 
            +
                    CYR_M, CYR_N, CYR_O, CYR_P, 0, CYR_R,
         | 
| 75 | 
            +
                    CYR_S, CYR_T, CYR_U, CYR_V, 0, 0, 0, CYR_Z
         | 
| 76 | 
            +
                };
         | 
| 77 | 
            +
             | 
| 78 | 
            +
                static unsigned CYR_CAPS_MAP[] = {
         | 
| 79 | 
            +
                    CYR_CAP_A, CYR_CAP_B, CYR_CAP_C, CYR_CAP_D, CYR_CAP_E, CYR_CAP_F,
         | 
| 80 | 
            +
                    CYR_CAP_G, CYR_CAP_H, CYR_CAP_I, CYR_CAP_J, CYR_CAP_K, CYR_CAP_L,
         | 
| 81 | 
            +
                    CYR_CAP_M, CYR_CAP_N, CYR_CAP_O, CYR_CAP_P, 0, CYR_CAP_R,
         | 
| 82 | 
            +
                    CYR_CAP_S, CYR_CAP_T, CYR_CAP_U, CYR_CAP_V, 0, 0, 0, CYR_CAP_Z
         | 
| 83 | 
            +
                };
         | 
| 84 | 
            +
             | 
| 85 | 
            +
                if (codepoint2 == LAT_CAP_ZH || codepoint2 == LAT_ZH) {
         | 
| 86 | 
            +
                    switch (codepoint) {
         | 
| 87 | 
            +
                    case 'd': return CYR_DZ;
         | 
| 88 | 
            +
                    case 'D': return CYR_CAP_DZ;
         | 
| 89 | 
            +
                    }
         | 
| 90 | 
            +
                }
         | 
| 91 | 
            +
             | 
| 92 | 
            +
                if (codepoint2 == 'j' || codepoint2 == 'J') {
         | 
| 93 | 
            +
                    switch (codepoint) {
         | 
| 94 | 
            +
                    case 'l': return CYR_LJ;
         | 
| 95 | 
            +
                    case 'n': return CYR_NJ;
         | 
| 96 | 
            +
                    case 'L': return CYR_CAP_LJ;
         | 
| 97 | 
            +
                    case 'N': return CYR_CAP_NJ;
         | 
| 98 | 
            +
                    }
         | 
| 99 | 
            +
                }
         | 
| 100 | 
            +
             | 
| 101 | 
            +
                if (codepoint >= 'a' && codepoint <= 'z') return CYR_MAP[codepoint - 'a'];
         | 
| 102 | 
            +
                if (codepoint >= 'A' && codepoint <= 'Z') return CYR_CAPS_MAP[codepoint - 'A'];
         | 
| 103 | 
            +
             | 
| 104 | 
            +
                switch (codepoint) {
         | 
| 105 | 
            +
                case LAT_CH: return CYR_CH;
         | 
| 106 | 
            +
                case LAT_DJ: return CYR_DJ;
         | 
| 107 | 
            +
                case LAT_SH: return CYR_SH;
         | 
| 108 | 
            +
                case LAT_TJ: return CYR_TJ;
         | 
| 109 | 
            +
                case LAT_ZH: return CYR_ZH;
         | 
| 110 | 
            +
                case LAT_CAP_CH: return CYR_CAP_CH;
         | 
| 111 | 
            +
                case LAT_CAP_DJ: return CYR_CAP_DJ;
         | 
| 112 | 
            +
                case LAT_CAP_SH: return CYR_CAP_SH;
         | 
| 113 | 
            +
                case LAT_CAP_TJ: return CYR_CAP_TJ;
         | 
| 114 | 
            +
                case LAT_CAP_ZH: return CYR_CAP_ZH;
         | 
| 115 | 
            +
                }
         | 
| 116 | 
            +
             | 
| 117 | 
            +
                return 0;
         | 
| 69 118 | 
             
            }
         | 
| 70 119 |  | 
| 71 | 
            -
            static  | 
| 72 | 
            -
             | 
| 120 | 
            +
            static unsigned
         | 
| 121 | 
            +
            digraph_to_latin(unsigned codepoint, unsigned codepoint2, unsigned capitalize, unsigned *next_out)
         | 
| 73 122 | 
             
            {
         | 
| 74 | 
            -
                char  | 
| 75 | 
            -
             | 
| 76 | 
            -
             | 
| 77 | 
            -
                 | 
| 123 | 
            +
                static char LAT_MAP[] = {
         | 
| 124 | 
            +
                    'a', 'b', 'v', 'g', 'd', 'e', 0, 'z', 'i', 0, 'k', 'l',
         | 
| 125 | 
            +
                    'm', 'n', 'o', 'p', 'r', 's', 't', 'u', 'f', 'h', 'c'
         | 
| 126 | 
            +
                };
         | 
| 127 | 
            +
             | 
| 128 | 
            +
                static char LAT_CAPS_MAP[] = {
         | 
| 129 | 
            +
                    'A', 'B', 'V', 'G', 'D', 'E', 0, 'Z', 'I', 0, 'K', 'L',
         | 
| 130 | 
            +
                    'M', 'N', 'O', 'P', 'R', 'S', 'T', 'U', 'F', 'H', 'C'
         | 
| 131 | 
            +
                };
         | 
| 132 | 
            +
             | 
| 133 | 
            +
                if (codepoint < CYR_CAP_DJ || codepoint > CYR_DZ) return 0;
         | 
| 134 | 
            +
             | 
| 135 | 
            +
                switch (codepoint) {
         | 
| 136 | 
            +
                case CYR_ZH: return LAT_ZH;
         | 
| 137 | 
            +
                case CYR_CAP_ZH: return LAT_CAP_ZH;
         | 
| 138 | 
            +
                }
         | 
| 139 | 
            +
             | 
| 140 | 
            +
                if (codepoint >= CYR_A && codepoint <= CYR_C)
         | 
| 141 | 
            +
                    return LAT_MAP[codepoint - CYR_A];
         | 
| 142 | 
            +
             | 
| 143 | 
            +
                if (codepoint >= CYR_CAP_A && codepoint <= CYR_CAP_C)
         | 
| 144 | 
            +
                    return LAT_CAPS_MAP[codepoint - CYR_CAP_A];
         | 
| 145 | 
            +
             | 
| 146 | 
            +
                if (codepoint >= CYR_A) {
         | 
| 147 | 
            +
                    switch (codepoint) {
         | 
| 148 | 
            +
                    case CYR_J:  return 'j';
         | 
| 149 | 
            +
                    case CYR_TJ: return LAT_TJ;
         | 
| 150 | 
            +
                    case CYR_CH: return LAT_CH;
         | 
| 151 | 
            +
                    case CYR_SH: return LAT_SH;
         | 
| 152 | 
            +
                    case CYR_DJ: return LAT_DJ;
         | 
| 153 | 
            +
                    case CYR_LJ: *next_out = 'j'; return 'l';
         | 
| 154 | 
            +
                    case CYR_NJ: *next_out = 'j'; return 'n';
         | 
| 155 | 
            +
                    case CYR_DZ: *next_out = LAT_ZH; return 'd';
         | 
| 156 | 
            +
                    }
         | 
| 157 | 
            +
                }
         | 
| 158 | 
            +
                else {
         | 
| 159 | 
            +
                    switch (codepoint) {
         | 
| 160 | 
            +
                    case CYR_CAP_J:  return 'J';
         | 
| 161 | 
            +
                    case CYR_CAP_TJ: return LAT_CAP_TJ;
         | 
| 162 | 
            +
                    case CYR_CAP_CH: return LAT_CAP_CH;
         | 
| 163 | 
            +
                    case CYR_CAP_SH: return LAT_CAP_SH;
         | 
| 164 | 
            +
                    case CYR_CAP_DJ: return LAT_CAP_DJ;
         | 
| 165 | 
            +
                    case CYR_CAP_LJ: *next_out = (capitalize || is_cap(codepoint2)) ? 'J' : 'j'; return 'L';
         | 
| 166 | 
            +
                    case CYR_CAP_NJ: *next_out = (capitalize || is_cap(codepoint2)) ? 'J' : 'j'; return 'N';
         | 
| 167 | 
            +
                    case CYR_CAP_DZ: *next_out = (capitalize || is_cap(codepoint2)) ? LAT_CAP_ZH : LAT_ZH; return 'D';
         | 
| 168 | 
            +
                    }
         | 
| 169 | 
            +
                }
         | 
| 170 | 
            +
             | 
| 171 | 
            +
                return 0;
         | 
| 172 | 
            +
            }
         | 
| 173 | 
            +
             | 
| 174 | 
            +
            static unsigned
         | 
| 175 | 
            +
            digraph_to_ascii(unsigned codepoint, unsigned codepoint2, unsigned capitalize, unsigned *next_out)
         | 
| 176 | 
            +
            {
         | 
| 177 | 
            +
                switch (codepoint) {
         | 
| 178 | 
            +
                case LAT_TJ:
         | 
| 179 | 
            +
                case LAT_CH:
         | 
| 180 | 
            +
                case CYR_TJ:
         | 
| 181 | 
            +
                case CYR_CH: return 'c';
         | 
| 182 | 
            +
                case LAT_SH:
         | 
| 183 | 
            +
                case CYR_SH: return 's';
         | 
| 184 | 
            +
                case LAT_ZH:
         | 
| 185 | 
            +
                case CYR_ZH: return 'z';
         | 
| 186 | 
            +
                case LAT_DJ:
         | 
| 187 | 
            +
                case CYR_DJ: *next_out = 'j'; return 'd';
         | 
| 188 | 
            +
                case LAT_CAP_TJ:
         | 
| 189 | 
            +
                case LAT_CAP_CH:
         | 
| 190 | 
            +
                case CYR_CAP_TJ:
         | 
| 191 | 
            +
                case CYR_CAP_CH: return 'C';
         | 
| 192 | 
            +
                case LAT_CAP_SH:
         | 
| 193 | 
            +
                case CYR_CAP_SH: return 'S';
         | 
| 194 | 
            +
                case LAT_CAP_ZH:
         | 
| 195 | 
            +
                case CYR_CAP_ZH: return 'Z';
         | 
| 196 | 
            +
                case LAT_CAP_DJ:
         | 
| 197 | 
            +
                case CYR_CAP_DJ:
         | 
| 198 | 
            +
                    *next_out = (capitalize || is_cap(codepoint2)) ? 'J' : 'j'; return 'D';
         | 
| 199 | 
            +
                case CYR_DZ:
         | 
| 200 | 
            +
                    *next_out = (capitalize || is_cap(codepoint2)) ? 'Z' : 'z'; return 'd';
         | 
| 201 | 
            +
                case CYR_CAP_DZ:
         | 
| 202 | 
            +
                    *next_out = (capitalize || is_cap(codepoint2)) ? 'Z' : 'z'; return 'D';
         | 
| 203 | 
            +
                default:
         | 
| 204 | 
            +
                    return digraph_to_latin(codepoint, codepoint2, capitalize, next_out);
         | 
| 205 | 
            +
                }
         | 
| 78 206 | 
             
            }
         | 
| 79 207 |  | 
| 80 208 | 
             
            static VALUE
         | 
| 81 | 
            -
             | 
| 209 | 
            +
            str_to_srb(VALUE str, int strategy, int bang)
         | 
| 82 210 | 
             
            {
         | 
| 83 211 | 
             
                VALUE dest;
         | 
| 84 | 
            -
                 | 
| 212 | 
            +
                rb_encoding *enc;
         | 
| 213 | 
            +
             | 
| 85 214 | 
             
                int len, next_len;
         | 
| 86 | 
            -
                 | 
| 87 | 
            -
                int force_upper = 0;
         | 
| 215 | 
            +
                unsigned in, in2, out, out2, seen_cap = 0;
         | 
| 88 216 | 
             
                char *pos, *end, *seq_start = 0;
         | 
| 89 | 
            -
                char cyr;
         | 
| 90 | 
            -
                unsigned int codepoint = 0;
         | 
| 91 | 
            -
                unsigned int next_codepoint = 0;
         | 
| 92 | 
            -
                rb_encoding *enc;
         | 
| 93 217 |  | 
| 94 | 
            -
                 | 
| 95 | 
            -
                    'a', 'b', 'v', 'g', 'd', 'e', '\0', 'z', 'i', '\0', 'k',
         | 
| 96 | 
            -
                    'l', 'm', 'n', 'o', 'p', 'r', 's', 't', 'u', 'f', 'h', 'c'
         | 
| 97 | 
            -
                };
         | 
| 218 | 
            +
                unsigned (*method)(unsigned, unsigned, unsigned, unsigned*);
         | 
| 98 219 |  | 
| 99 | 
            -
                 | 
| 100 | 
            -
             | 
| 101 | 
            -
             | 
| 102 | 
            -
                 | 
| 220 | 
            +
                switch(strategy) {
         | 
| 221 | 
            +
                case 0:  method = &digraph_to_cyr;   break;
         | 
| 222 | 
            +
                case 1:  method = &digraph_to_latin; break;
         | 
| 223 | 
            +
                default: method = &digraph_to_ascii;
         | 
| 224 | 
            +
                }
         | 
| 103 225 |  | 
| 104 226 | 
             
                StringValue(str);
         | 
| 105 227 | 
             
                pos = RSTRING_PTR(str);
         | 
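The capital-digraph branches of `digraph_to_latin` and `digraph_to_ascii` in this hunk pick the case of the second output letter from context: it is upper case when the `capitalize` flag is set or when `is_cap(codepoint2)` holds for the following character, and lower case otherwise. The pure-Ruby sketch below illustrates that rule only; it is not the gem's implementation. `latinize_capital_digraphs`, `CAPITAL_DIGRAPHS`, and `cap?` are hypothetical names, and `cap?` uses Ruby's Unicode case mapping as a stand-in for the C helper `is_cap`, whose definition is not shown in this hunk.

```ruby
# Hypothetical sketch of the capitalization rule, not the gem's code.
# Each capital Cyrillic digraph letter maps to [first, second] Latin letters.
CAPITAL_DIGRAPHS = {
  "Љ" => ["L", "j"],
  "Њ" => ["N", "j"],
  "Џ" => ["D", "ž"]
}.freeze

# Assumption: a character counts as a capital if downcasing changes it.
def cap?(ch)
  !ch.nil? && ch != ch.downcase
end

# Expands only the three capital digraph letters; every other character
# is passed through unchanged.
def latinize_capital_digraphs(text)
  chars = text.chars
  chars.each_with_index.map do |ch, i|
    pair = CAPITAL_DIGRAPHS[ch]
    next ch unless pair

    prev_cap = i > 0 && cap?(chars[i - 1])  # plays the role of `capitalize`
    next_cap = cap?(chars[i + 1])           # plays the role of is_cap(codepoint2)
    pair[0] + (prev_cap || next_cap ? pair[1].upcase : pair[1])
  end.join
end

puts latinize_capital_digraphs("Његош")
puts latinize_capital_digraphs("ПАЖЊУ")
```

With this rule a capital digraph at the start of a lower-case word keeps a lower-case second letter, while the same letter inside an all-caps word is expanded to two capitals.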
| @@ -107,123 +229,50 @@ str_to_latin(VALUE str, int ascii, int bang) | |
| 107 229 |  | 
| 108 230 | 
             
                end = RSTRING_END(str);
         | 
| 109 231 | 
             
                enc = STR_ENC_GET(str);
         | 
| 110 | 
            -
                 | 
| 111 | 
            -
                dest = rb_str_buf_new(dest_len);
         | 
| 232 | 
            +
                dest = rb_str_buf_new(RSTRING_LEN(str) + 30);
         | 
| 112 233 | 
             
                rb_enc_associate(dest, enc);
         | 
| 113 234 |  | 
| 114 | 
            -
                 | 
| 235 | 
            +
                in = rb_enc_codepoint_len(pos, end, &len, enc);
         | 
| 115 236 |  | 
| 116 237 | 
             
                while (pos < end) {
         | 
| 117 | 
            -
                     | 
| 118 | 
            -
                        next_codepoint = rb_enc_codepoint_len(pos + len, end, &next_len, enc);
         | 
| 119 | 
            -
                    }
         | 
| 238 | 
            +
                    in2 = out2 = 0;
         | 
| 120 239 |  | 
| 121 | 
            -
                     | 
| 122 | 
            -
             | 
| 123 | 
            -
                        if (seq_start) {
         | 
| 124 | 
            -
                            rb_str_buf_cat(dest, seq_start, pos - seq_start);
         | 
| 125 | 
            -
                            seq_start = 0;
         | 
| 126 | 
            -
                        }
         | 
| 240 | 
            +
                    if (pos + len < end)
         | 
| 241 | 
            +
                        in2 = rb_enc_codepoint_len(pos + len, end, &next_len, enc);
         | 
| 127 242 |  | 
| 128 | 
            -
             | 
| 129 | 
            -
                        case LAT_TJ:
         | 
| 130 | 
            -
                        case LAT_CH:     rb_str_buf_cat(dest, "c",  1); break;
         | 
| 131 | 
            -
                        case LAT_DJ:     rb_str_buf_cat(dest, "dj", 2); break;
         | 
| 132 | 
            -
                        case LAT_SH:     rb_str_buf_cat(dest, "s",  1); break;
         | 
| 133 | 
            -
                        case LAT_ZH:     rb_str_buf_cat(dest, "z",  1); break;
         | 
| 134 | 
            -
                        case LAT_CAP_TJ:
         | 
| 135 | 
            -
                        case LAT_CAP_CH: rb_str_buf_cat(dest, "C",  1); break;
         | 
| 136 | 
            -
                        case LAT_CAP_SH: rb_str_buf_cat(dest, "S",  1); break;
         | 
| 137 | 
            -
                        case LAT_CAP_ZH: rb_str_buf_cat(dest, "Z",  1); break;
         | 
| 138 | 
            -
                        case LAT_CAP_DJ:
         | 
| 139 | 
            -
                            (seen_upper || is_upper(next_codepoint))
         | 
| 140 | 
            -
                                ? rb_str_buf_cat(dest, "DJ", 2)
         | 
| 141 | 
            -
                                : rb_str_buf_cat(dest, "Dj", 2);
         | 
| 142 | 
            -
                            break;
         | 
| 143 | 
            -
                        default:
         | 
| 144 | 
            -
                            rb_str_buf_cat(dest, pos, len);
         | 
| 145 | 
            -
                        }
         | 
| 146 | 
            -
                    }
         | 
| 243 | 
            +
                    out = (*method)(in, in2, seen_cap, &out2);
         | 
| 147 244 |  | 
| 148 | 
            -
                     | 
| 149 | 
            -
             | 
| 245 | 
            +
                    if (out) {
         | 
| 246 | 
            +
                        /* flush previous untranslatable sequence */
         | 
| 150 247 | 
             
                        if (seq_start) {
         | 
| 151 248 | 
             
                            rb_str_buf_cat(dest, seq_start, pos - seq_start);
         | 
| 152 249 | 
             
                            seq_start = 0;
         | 
| 153 250 | 
             
                        }
         | 
| 154 251 |  | 
| 155 | 
            -
                         | 
| 156 | 
            -
             | 
| 157 | 
            -
                                cyr = CYR_MAP[codepoint - CYR_A];
         | 
| 158 | 
            -
                                cyr ? rb_str_buf_cat(dest, &cyr, 1)
         | 
| 159 | 
            -
                                    : rb_str_buf_cat(dest, pos, len);
         | 
| 160 | 
            -
                            }
         | 
| 161 | 
            -
                            else {
         | 
| 162 | 
            -
                                switch (codepoint) {
         | 
| 163 | 
            -
                                case CYR_J:  rb_str_buf_cat(dest, "j",  1); break;
         | 
| 164 | 
            -
                                case CYR_LJ: rb_str_buf_cat(dest, "lj", 2); break;
         | 
| 165 | 
            -
                                case CYR_NJ: rb_str_buf_cat(dest, "nj", 2); break;
         | 
| 166 | 
            -
                                case CYR_DJ: STR_CAT_COND_ASCII(ascii, dest, "dj", LAT_DJ, 2, enc); break;
         | 
| 167 | 
            -
                                case CYR_TJ: STR_CAT_COND_ASCII(ascii, dest, "c",  LAT_TJ, 1, enc); break;
         | 
| 168 | 
            -
                                case CYR_CH: STR_CAT_COND_ASCII(ascii, dest, "c",  LAT_CH, 1, enc); break;
         | 
| 169 | 
            -
                                case CYR_SH: STR_CAT_COND_ASCII(ascii, dest, "s",  LAT_SH, 1, enc); break;
         | 
| 170 | 
            -
                                case CYR_ZH: STR_CAT_COND_ASCII(ascii, dest, "z",  LAT_ZH, 1, enc); break;
         | 
| 171 | 
            -
                                case CYR_DZ:
         | 
| 172 | 
            -
                                    rb_str_buf_cat(dest, "d", 1);
         | 
| 173 | 
            -
                                    STR_CAT_COND_ASCII(ascii, dest, "z", LAT_ZH, 1, enc);
         | 
| 174 | 
            -
                                    break;
         | 
| 175 | 
            -
                                default:
         | 
| 176 | 
            -
                                    rb_str_buf_cat(dest, pos, len);
         | 
| 177 | 
            -
                                }
         | 
| 178 | 
            -
                            }
         | 
| 179 | 
            -
                        }
         | 
| 180 | 
            -
                        else {
         | 
| 181 | 
            -
                            if (maps_directly(codepoint)) {
         | 
| 182 | 
            -
                                cyr = CYR_CAPS_MAP[codepoint - CYR_CAP_A];
         | 
| 183 | 
            -
                                cyr ? rb_str_buf_cat(dest, &cyr, 1)
         | 
| 184 | 
            -
                                    : rb_str_buf_cat(dest, pos, len);
         | 
| 185 | 
            -
                            }
         | 
| 186 | 
            -
                            else {
         | 
| 187 | 
            -
                                force_upper = seen_upper || is_upper(next_codepoint);
         | 
| 188 | 
            -
             | 
| 189 | 
            -
                                switch (codepoint) {
         | 
| 190 | 
            -
                                case CYR_CAP_J:  rb_str_buf_cat(dest, "J", 1); break;
         | 
| 191 | 
            -
                                case CYR_CAP_LJ: rb_str_buf_cat(dest, (force_upper ? "LJ" : "Lj"), 2); break;
         | 
| 192 | 
            -
                                case CYR_CAP_NJ: rb_str_buf_cat(dest, (force_upper ? "NJ" : "Nj"), 2); break;
         | 
| 193 | 
            -
                                case CYR_CAP_TJ: STR_CAT_COND_ASCII(ascii, dest, "C", LAT_CAP_TJ, 1, enc); break;
         | 
| 194 | 
            -
                                case CYR_CAP_CH: STR_CAT_COND_ASCII(ascii, dest, "C", LAT_CAP_CH, 1, enc); break;
         | 
| 195 | 
            -
                                case CYR_CAP_SH: STR_CAT_COND_ASCII(ascii, dest, "S", LAT_CAP_SH, 1, enc); break;
         | 
| 196 | 
            -
                                case CYR_CAP_ZH: STR_CAT_COND_ASCII(ascii, dest, "Z", LAT_CAP_ZH, 1, enc); break;
         | 
| 197 | 
            -
                                case CYR_CAP_DJ: STR_CAT_COND_ASCII(ascii, dest, (force_upper ? "DJ" : "Dj"), LAT_CAP_DJ, 2, enc); break;
         | 
| 198 | 
            -
                                case CYR_CAP_DZ:
         | 
| 199 | 
            -
                                    rb_str_buf_cat(dest, "D", 1);
         | 
| 200 | 
            -
                                    force_upper ? STR_CAT_COND_ASCII(ascii, dest, "Z", LAT_CAP_ZH, 1, enc)
         | 
| 201 | 
            -
                                                : STR_CAT_COND_ASCII(ascii, dest, "z", LAT_ZH, 1, enc);
         | 
| 202 | 
            -
                                    break;
         | 
| 203 | 
            -
                                default:
         | 
| 204 | 
            -
                                    rb_str_buf_cat(dest, pos, len);
         | 
| 205 | 
            -
                                }
         | 
| 206 | 
            -
                            }
         | 
| 207 | 
            -
                        }
         | 
| 252 | 
            +
                        _str_cat_char(dest, out, enc);
         | 
| 253 | 
            +
                        if (out2) _str_cat_char(dest, out2, enc);
         | 
| 208 254 | 
             
                    }
         | 
| 209 | 
            -
                    else {
         | 
| 210 | 
            -
                        /*  | 
| 211 | 
            -
                         | 
| 255 | 
            +
                    else if (!seq_start) {
         | 
| 256 | 
            +
                        /* mark the beginning of an untranslatable sequence */
         | 
| 257 | 
            +
                        seq_start = pos;
         | 
| 258 | 
            +
                    }
         | 
| 259 | 
            +
             | 
| 260 | 
            +
                    /* for cyrillic output, skip the second half of an input digraph */
         | 
| 261 | 
            +
                    if (strategy == 0 && is_digraph(out)) {
         | 
| 262 | 
            +
                        pos += next_len;
         | 
| 263 | 
            +
                        if (pos + len < end)
         | 
| 264 | 
            +
                            in2 = rb_enc_codepoint_len(pos + len, end, &next_len, enc);
         | 
| 212 265 | 
             
                    }
         | 
| 213 266 |  | 
| 214 | 
            -
                     | 
| 267 | 
            +
                    seen_cap = is_cap(in);
         | 
| 215 268 |  | 
| 216 269 | 
             
                    pos += len;
         | 
| 217 270 | 
             
                    len = next_len;
         | 
| 218 | 
            -
             | 
| 219 | 
            -
                    codepoint = next_codepoint;
         | 
| 220 | 
            -
                    next_codepoint = 0;
         | 
| 271 | 
            +
                    in = in2;
         | 
| 221 272 | 
             
                }
         | 
| 222 273 |  | 
| 223 | 
            -
                /*  | 
| 224 | 
            -
                if (seq_start)  | 
| 225 | 
            -
                    rb_str_buf_cat(dest, seq_start, pos - seq_start);
         | 
| 226 | 
            -
                }
         | 
| 274 | 
            +
                /* flush final sequence */
         | 
| 275 | 
            +
                if (seq_start) rb_str_buf_cat(dest, seq_start, pos - seq_start);
         | 
| 227 276 |  | 
| 228 277 | 
             
                if (bang) {
         | 
| 229 278 | 
             
                    rb_str_shared_replace(str, dest);
         | 
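The rewritten loop in this hunk does not copy untranslatable characters one at a time. It records where such a run begins (`seq_start`) and appends the whole run in one call once a translatable character is reached, with a final flush after the loop. A minimal Ruby sketch of that buffering pattern follows, assuming a plain hash as the translation table; `translate_with_runs` and `table` are hypothetical names and the sketch leaves out the digraph handling.

```ruby
# Hypothetical sketch of the run-buffering pattern, not the gem's code.
# `table` maps single characters to their replacements.
def translate_with_runs(text, table)
  out = +""
  run_start = nil  # index where the current untranslatable run began

  text.each_char.with_index do |ch, i|
    mapped = table[ch]
    if mapped
      # flush previous untranslatable sequence
      if run_start
        out << text[run_start...i]
        run_start = nil
      end
      out << mapped
    elsif run_start.nil?
      # mark the beginning of an untranslatable sequence
      run_start = i
    end
  end

  # flush final sequence
  out << text[run_start..-1] if run_start
  out
end

puts translate_with_runs("xx а yy б", { "а" => "a", "б" => "b" })
```

The benefit of this shape is that long stretches of foreign text cost one append each rather than one append per character.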
| @@ -237,7 +286,35 @@ str_to_latin(VALUE str, int ascii, int bang) | |
| 237 286 | 
             
            }
         | 
| 238 287 |  | 
| 239 288 | 
             
            /**
         | 
| 240 | 
            -
             * Returns a copy of <i>str</i> with  | 
| 289 | 
            +
             * Returns a copy of <i>str</i> with Latin characters transliterated
         | 
| 290 | 
            +
             * into Serbian Cyrillic.
         | 
| 291 | 
            +
             *
         | 
| 292 | 
            +
             * @overload to_cyrillic(str)
         | 
| 293 | 
            +
             *   @param  [String] str text to be transliterated
         | 
| 294 | 
            +
             *   @return [String] transliterated text
         | 
| 295 | 
            +
             */
         | 
| 296 | 
            +
            static VALUE
         | 
| 297 | 
            +
            rb_str_to_cyrillic(VALUE self, VALUE str)
         | 
| 298 | 
            +
            {
         | 
| 299 | 
            +
                return str_to_srb(str, 0, 0);
         | 
| 300 | 
            +
            }
         | 
| 301 | 
            +
             | 
| 302 | 
            +
            /**
         | 
| 303 | 
            +
             * Performs transliteration of <code>Byk.to_cyrillic</code> in place,
         | 
| 304 | 
            +
             * returning <i>str</i>, whether any changes were made or not.
         | 
| 305 | 
            +
             *
         | 
| 306 | 
            +
             * @overload to_cyrillic!(str)
         | 
| 307 | 
            +
             *   @param  [String] str text to be transliterated
         | 
| 308 | 
            +
             *   @return [String] transliterated text
         | 
| 309 | 
            +
             */
         | 
| 310 | 
            +
            static VALUE
         | 
| 311 | 
            +
            rb_str_to_cyrillic_bang(VALUE self, VALUE str)
         | 
| 312 | 
            +
            {
         | 
| 313 | 
            +
                return str_to_srb(str, 0, 1);
         | 
| 314 | 
            +
            }
         | 
| 315 | 
            +
             | 
| 316 | 
            +
            /**
         | 
| 317 | 
            +
             * Returns a copy of <i>str</i> with Serbian Cyrillic characters
         | 
| 241 318 | 
             
             * transliterated into Latin.
         | 
| 242 319 | 
             
             *
         | 
| 243 320 | 
             
             * @overload to_latin(str)
         | 
| @@ -247,12 +324,12 @@ str_to_latin(VALUE str, int ascii, int bang) | |
| 247 324 | 
             
            static VALUE
         | 
| 248 325 | 
             
            rb_str_to_latin(VALUE self, VALUE str)
         | 
| 249 326 | 
             
            {
         | 
| 250 | 
            -
                return  | 
| 327 | 
            +
                return str_to_srb(str, 1, 0);
         | 
| 251 328 | 
             
            }
         | 
| 252 329 |  | 
| 253 330 | 
             
            /**
         | 
| 254 | 
            -
             * Performs  | 
| 255 | 
            -
             * returning <i>str</i>, whether changes were made or not.
         | 
| 331 | 
            +
             * Performs transliteration of <code>Byk.to_latin</code> in place,
         | 
| 332 | 
            +
             * returning <i>str</i>, whether any changes were made or not.
         | 
| 256 333 | 
             
             *
         | 
| 257 334 | 
             
             * @overload to_latin!(str)
         | 
| 258 335 | 
             
             *   @param  [String] str text to be transliterated
         | 
| @@ -261,12 +338,12 @@ rb_str_to_latin(VALUE self, VALUE str) | |
| 261 338 | 
             
            static VALUE
         | 
| 262 339 | 
             
            rb_str_to_latin_bang(VALUE self, VALUE str)
         | 
| 263 340 | 
             
            {
         | 
| 264 | 
            -
                return  | 
| 341 | 
            +
                return str_to_srb(str, 1, 1);
         | 
| 265 342 | 
             
            }
         | 
| 266 343 |  | 
| 267 344 | 
             
            /**
         | 
| 268 | 
            -
             * Returns a copy of <i>str</i> with  | 
| 269 | 
            -
             *  | 
| 345 | 
            +
             * Returns a copy of <i>str</i> with Serbian characters transliterated
         | 
| 346 | 
            +
             * into ASCII Latin.
         | 
| 270 347 | 
             
             *
         | 
| 271 348 | 
             
             * @overload to_ascii_latin(str)
         | 
| 272 349 | 
             
             *   @param  [String] str text to be transliterated
         | 
| @@ -275,12 +352,12 @@ rb_str_to_latin_bang(VALUE self, VALUE str) | |
| 275 352 | 
             
            static VALUE
         | 
| 276 353 | 
             
            rb_str_to_ascii_latin(VALUE self, VALUE str)
         | 
| 277 354 | 
             
            {
         | 
| 278 | 
            -
                return  | 
| 355 | 
            +
                return str_to_srb(str, 2, 0);
         | 
| 279 356 | 
             
            }
         | 
| 280 357 |  | 
| 281 358 | 
             
            /**
         | 
| 282 | 
            -
             * Performs  | 
| 283 | 
            -
             * place, returning <i>str</i>, whether changes were made or not.
         | 
| 359 | 
            +
             * Performs transliteration of <code>Byk.to_ascii_latin</code> in
         | 
| 360 | 
            +
             * place, returning <i>str</i>, whether any changes were made or not.
         | 
| 284 361 | 
             
             *
         | 
| 285 362 | 
             
             * @overload to_ascii_latin!(str)
         | 
| 286 363 | 
             
             *   @param  [String] str text to be transliterated
         | 
| @@ -289,12 +366,14 @@ rb_str_to_ascii_latin(VALUE self, VALUE str) | |
| 289 366 | 
             
            static VALUE
         | 
| 290 367 | 
             
            rb_str_to_ascii_latin_bang(VALUE self, VALUE str)
         | 
| 291 368 | 
             
            {
         | 
| 292 | 
            -
                return  | 
| 369 | 
            +
                return str_to_srb(str, 2, 1);
         | 
| 293 370 | 
             
            }
         | 
| 294 371 |  | 
| 295 372 | 
             
            void Init_byk_native(void)
         | 
| 296 373 | 
             
            {
         | 
| 297 374 | 
             
                VALUE Byk = rb_define_module("Byk");
         | 
| 375 | 
            +
                rb_define_singleton_method(Byk, "to_cyrillic", rb_str_to_cyrillic, 1);
         | 
| 376 | 
            +
                rb_define_singleton_method(Byk, "to_cyrillic!", rb_str_to_cyrillic_bang, 1);
         | 
| 298 377 | 
             
                rb_define_singleton_method(Byk, "to_latin", rb_str_to_latin, 1);
         | 
| 299 378 | 
             
                rb_define_singleton_method(Byk, "to_latin!", rb_str_to_latin_bang, 1);
         | 
| 300 379 | 
             
                rb_define_singleton_method(Byk, "to_ascii_latin", rb_str_to_ascii_latin, 1);
         | 
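For Cyrillic output, the loop in `str_to_srb` skips the second half of an input digraph, so two Latin letters are consumed for one Cyrillic letter, and the spec in this diff expects `"lJ"`, `"nJ"`, and `"dŽ"` to become `"љ"`, `"њ"`, and `"џ"`. The Ruby sketch below shows one way such a two-character lookahead can work. It is an illustration under assumptions, not the gem's algorithm: `cyrillize_digraphs` and `LATIN_DIGRAPHS` are hypothetical names, only the three digraphs are handled, and words in which these letter pairs are not digraphs are ignored.

```ruby
# Hypothetical sketch of digraph lookahead, not the gem's code.
LATIN_DIGRAPHS = { "lj" => "љ", "nj" => "њ", "dž" => "џ" }.freeze

# Replaces Latin digraphs with single Cyrillic letters; all other
# characters are passed through unchanged.
def cyrillize_digraphs(text)
  chars = text.chars
  out = +""
  i = 0
  while i < chars.length
    pair = chars[i, 2].join           # current character plus lookahead
    cyr = LATIN_DIGRAPHS[pair.downcase]
    if cyr
      # Assumption: the case of the first letter decides the output case.
      out << (chars[i] == chars[i].downcase ? cyr : cyr.upcase)
      i += 2                          # skip the second half of the digraph
    else
      out << chars[i]
      i += 1
    end
  end
  out
end

puts cyrillize_digraphs("lJ nJ dŽ")
```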
    
        data/lib/byk/version.rb
    CHANGED
    
    
    
        data/spec/byk_spec.rb
    CHANGED
    
    | @@ -1,5 +1,4 @@ | |
| 1 1 | 
             
            # coding: utf-8
         | 
| 2 | 
            -
             | 
| 3 2 | 
             
            require "spec_helper"
         | 
| 4 3 |  | 
| 5 4 | 
             
            describe Byk do
         | 
| @@ -24,70 +23,114 @@ describe Byk do | |
| 24 23 | 
             
                let(:non_serbian_cyrillic) { non_serbian_cyrillic_coderange.join }
         | 
| 25 24 |  | 
| 26 25 | 
             
                let(:ascii) { "The quick brown fox jumps over the lazy dog." }
         | 
| 27 | 
            -
                let(:other) { "संस्कृतम् | 
| 26 | 
            +
                let(:other) { "संस्कृतम्" }
         | 
| 28 27 |  | 
| 29 | 
            -
                let(:mixed) { "संस्कृतम्  | 
| 30 | 
            -
                let(: | 
| 31 | 
            -
                let(: | 
| 28 | 
            +
                let(:mixed) { "संस्कृतम् илити Sanskrit, obrati ПАЖЊУ." }
         | 
| 29 | 
            +
                let(:mixed_cyrillic) { "संस्कृतम् илити Санскрит, обрати ПАЖЊУ." }
         | 
| 30 | 
            +
                let(:mixed_latin) { "संस्कृतम् iliti Sanskrit, obrati PAŽNJU." }
         | 
| 31 | 
            +
                let(:mixed_ascii_latin) { "संस्कृतम् iliti Sanskrit, obrati PAZNJU." }
         | 
| 32 32 |  | 
| 33 | 
            -
                it "doesn't  | 
| 33 | 
            +
                it "doesn't translate an empty string" do
         | 
| 34 34 | 
             
                  expect(Byk.send(method, "")).to eq ""
         | 
| 35 35 | 
             
                end
         | 
| 36 36 |  | 
| 37 | 
            -
                it "doesn't  | 
| 38 | 
            -
                  expect(Byk.send(method,  | 
| 37 | 
            +
                it "doesn't translate foreign coderanges" do
         | 
| 38 | 
            +
                  expect(Byk.send(method, other)).to eq other
         | 
| 39 39 | 
             
                end
         | 
| 40 | 
            +
              end
         | 
| 40 41 |  | 
| 41 | 
            -
             | 
| 42 | 
            +
              shared_examples :cyrillization_method do |method|
         | 
| 43 | 
            +
                include_examples :base, method
         | 
| 44 | 
            +
             | 
| 45 | 
            +
                let(:edge_cases) do
         | 
| 46 | 
            +
                  [
         | 
| 47 | 
            +
                    ["lJ", "љ"],
         | 
| 48 | 
            +
                    ["nJ", "њ"],
         | 
| 49 | 
            +
                    ["dŽ", "џ"]
         | 
| 50 | 
            +
                  ]
         | 
| 51 | 
            +
                end
         | 
| 52 | 
            +
             | 
| 53 | 
            +
                it "doesn't translate Cyrillic" do
         | 
| 54 | 
            +
                  expect(Byk.send(method, pangram)).to eq pangram
         | 
| 55 | 
            +
                end
         | 
| 56 | 
            +
             | 
| 57 | 
            +
                it "doesn't translate non-Serbian Cyrillic" do
         | 
| 42 58 | 
             
                  expect(Byk.send(method, non_serbian_cyrillic)).to eq non_serbian_cyrillic
         | 
| 43 59 | 
             
                end
         | 
| 44 60 |  | 
| 45 | 
            -
                it " | 
| 46 | 
            -
                  expect(Byk.send(method,  | 
| 61 | 
            +
                it "translates Latin to Cyrillic" do
         | 
| 62 | 
            +
                  expect(Byk.send(method, pangram_latin)).to eq pangram
         | 
| 63 | 
            +
                end
         | 
| 64 | 
            +
             | 
| 65 | 
            +
                it "translates Latin caps to Cyrillic caps" do
         | 
| 66 | 
            +
                  expect(Byk.send(method, pangram_latin_caps)).to eq pangram_caps
         | 
| 67 | 
            +
                end
         | 
| 68 | 
            +
             | 
| 69 | 
            +
                it "translates mixed text properly" do
         | 
| 70 | 
            +
                  expect(Byk.send(method, mixed)).to eq mixed_cyrillic
         | 
| 71 | 
            +
                end
         | 
| 72 | 
            +
             | 
| 73 | 
            +
                it "translates edge cases properly" do
         | 
| 74 | 
            +
                  edge_cases.each do |input, output|
         | 
| 75 | 
            +
                    expect(Byk.send(method, input)).to eq output
         | 
| 76 | 
            +
                  end
         | 
| 77 | 
            +
                end
         | 
| 78 | 
            +
             | 
| 79 | 
            +
                it "translates ABECEDA to AZBUKA" do
         | 
| 80 | 
            +
                  expect(Byk::ABECEDA.map { |l| l.dup.send(:to_cyrillic) }).to match_array(Byk::AZBUKA)
         | 
| 81 | 
            +
                end
         | 
| 82 | 
            +
             | 
| 83 | 
            +
                it "translates ABECEDA_CAPS to AZBUKA_CAPS" do
         | 
| 84 | 
            +
                  expect(Byk::ABECEDA_CAPS.map { |l| l.dup.send(:to_cyrillic) }).to match_array(Byk::AZBUKA_CAPS)
         | 
| 47 85 | 
             
                end
         | 
| 48 86 | 
             
              end
         | 
| 49 87 |  | 
| 50 88 | 
             
              shared_examples :latinization_method do |method|
         | 
| 51 89 | 
             
                include_examples :base, method
         | 
| 52 90 |  | 
| 53 | 
            -
                let(:edge_cases)  | 
| 91 | 
            +
                let(:edge_cases) do
         | 
| 54 92 | 
             
                  [
         | 
| 55 | 
            -
                    ["Њ", "Nj"],
         | 
| 56 | 
            -
                    ["Љ", "Lj"],
         | 
| 57 | 
            -
                    ["Џ", "Dž"],
         | 
| 58 | 
            -
                    ["ЊЊ", "NJNJ"],
         | 
| 59 93 | 
             
                    ["ЉЉ", "LJLJ"],
         | 
| 94 | 
            +
                    ["ЊЊ", "NJNJ"],
         | 
| 60 95 | 
             
                    ["ЏЏ", "DŽDŽ"]
         | 
| 61 96 | 
             
                  ]
         | 
| 62 | 
            -
                 | 
| 97 | 
            +
                end
         | 
| 63 98 |  | 
| 64 | 
            -
                it "doesn't  | 
| 99 | 
            +
                it "doesn't translate ASCII" do
         | 
| 100 | 
            +
                  expect(Byk.send(method, ascii)).to eq ascii
         | 
| 101 | 
            +
                end
         | 
| 102 | 
            +
             | 
| 103 | 
            +
                it "doesn't translate Latin" do
         | 
| 65 104 | 
             
                  expect(Byk.send(method, pangram_latin)).to eq pangram_latin
         | 
| 66 105 | 
             
                end
         | 
| 67 106 |  | 
| 68 | 
            -
                it " | 
| 107 | 
            +
                it "doesn't translate non-Serbian Cyrillic" do
         | 
| 108 | 
            +
                  expect(Byk.send(method, non_serbian_cyrillic)).to eq non_serbian_cyrillic
         | 
| 109 | 
            +
                end
         | 
| 110 | 
            +
             | 
| 111 | 
            +
                it "translates Cyrillic to Latin" do
         | 
| 69 112 | 
             
                  expect(Byk.send(method, pangram)).to eq pangram_latin
         | 
| 70 113 | 
             
                end
         | 
| 71 114 |  | 
| 72 | 
            -
                it " | 
| 115 | 
            +
                it "translates Cyrillic caps to Latin caps" do
         | 
| 73 116 | 
             
                  expect(Byk.send(method, pangram_caps)).to eq pangram_latin_caps
         | 
| 74 117 | 
             
                end
         | 
| 75 118 |  | 
| 76 | 
            -
                it " | 
| 119 | 
            +
                it "translates mixed text properly" do
         | 
| 77 120 | 
             
                  expect(Byk.send(method, mixed)).to eq mixed_latin
         | 
| 78 121 | 
             
                end
         | 
| 79 122 |  | 
| 80 | 
            -
                it " | 
| 123 | 
            +
                it "translates edge cases properly" do
         | 
| 81 124 | 
             
                  edge_cases.each do |input, output|
         | 
| 82 125 | 
             
                    expect(Byk.send(method, input)).to eq output
         | 
| 83 126 | 
             
                  end
         | 
| 84 127 | 
             
                end
         | 
| 85 128 |  | 
| 86 | 
            -
                it " | 
| 129 | 
            +
                it "translates AZBUKA to ABECEDA" do
         | 
| 87 130 | 
             
                  expect(Byk::AZBUKA.map { |l| l.dup.send(method) }).to match_array(Byk::ABECEDA)
         | 
| 88 131 | 
             
                end
         | 
| 89 132 |  | 
| 90 | 
            -
                it " | 
| 133 | 
            +
                it "translates AZBUKA_CAPS to ABECEDA_CAPS" do
         | 
| 91 134 | 
             
                  expect(Byk::AZBUKA_CAPS.map { |l| l.dup.send(method) }).to match_array(Byk::ABECEDA_CAPS)
         | 
| 92 135 | 
             
                end
         | 
| 93 136 | 
             
              end
         | 
| @@ -95,7 +138,7 @@ describe Byk do | |
| 95 138 | 
             
              shared_examples :ascii_latinization_method do |method|
         | 
| 96 139 | 
             
                include_examples :base, method
         | 
| 97 140 |  | 
| 98 | 
            -
                let(:edge_cases)  | 
| 141 | 
            +
                let(:edge_cases) do
         | 
| 99 142 | 
             
                  [
         | 
| 100 143 | 
             
                    ["Њ", "Nj"],
         | 
| 101 144 | 
             
                    ["Љ", "Lj"],
         | 
| @@ -107,32 +150,36 @@ describe Byk do | |
| 107 150 | 
             
                    ["ЏЏ", "DZDZ"],
         | 
| 108 151 | 
             
                    ["ЂЂ", "DJDJ"],
         | 
| 109 152 | 
             
                    ["ĐĐ", "DJDJ"],
         | 
| 110 | 
            -
                    ["ЂУРАЂ  | 
| 111 | 
            -
                    ["ĐURAĐ  | 
| 153 | 
            +
                    ["ЂУРАЂ Ђурђевић", "DJURADJ Djurdjevic"],
         | 
| 154 | 
            +
                    ["ĐURAĐ Đurđević", "DJURADJ Djurdjevic"]
         | 
| 112 155 | 
             
                  ]
         | 
| 113 | 
            -
                }
         | 
| 114 | 
            -
             | 
| 115 | 
            -
                it "converts Cyrillic to ASCII Latin" do
         | 
| 116 | 
            -
                  expect(Byk.send(method, pangram)).to eq pangram_ascii_latin
         | 
| 117 156 | 
             
                end
         | 
| 118 157 |  | 
| 119 | 
            -
                it " | 
| 120 | 
            -
                  expect(Byk.send(method,  | 
| 158 | 
            +
                it "doesn't translate ASCII" do
         | 
| 159 | 
            +
                  expect(Byk.send(method, ascii)).to eq ascii
         | 
| 121 160 | 
             
                end
         | 
| 122 161 |  | 
| 123 | 
            -
                it " | 
| 162 | 
            +
                it "translates Latin to ASCII Latin" do
         | 
| 124 163 | 
             
                  expect(Byk.send(method, pangram_latin)).to eq pangram_ascii_latin
         | 
| 125 164 | 
             
                end
         | 
| 126 165 |  | 
| 127 | 
            -
                it " | 
| 166 | 
            +
                it "translates Latin caps to ASCII Latin caps" do
         | 
| 128 167 | 
             
                  expect(Byk.send(method, pangram_latin_caps)).to eq pangram_ascii_latin_caps
         | 
| 129 168 | 
             
                end
         | 
| 130 169 |  | 
| 131 | 
            -
                it " | 
| 170 | 
            +
                it "translates Cyrillic to ASCII Latin" do
         | 
| 171 | 
            +
                  expect(Byk.send(method, pangram)).to eq pangram_ascii_latin
         | 
| 172 | 
            +
                end
         | 
| 173 | 
            +
             | 
| 174 | 
            +
                it "translates Cyrillic caps to ASCII Latin caps" do
         | 
| 175 | 
            +
                  expect(Byk.send(method, pangram_caps)).to eq pangram_ascii_latin_caps
         | 
| 176 | 
            +
                end
         | 
| 177 | 
            +
             | 
| 178 | 
            +
                it "translates mixed text properly" do
         | 
| 132 179 | 
             
                  expect(Byk.send(method, mixed)).to eq mixed_ascii_latin
         | 
| 133 180 | 
             
                end
         | 
| 134 181 |  | 
| 135 | 
            -
                it " | 
| 182 | 
            +
                it "translates edge cases properly" do
         | 
| 136 183 | 
             
                  edge_cases.each do |input, output|
         | 
| 137 184 | 
             
                    expect(Byk.send(method, input)).to eq output
         | 
| 138 185 | 
             
                  end
         | 
| @@ -141,18 +188,28 @@ describe Byk do | |
| 141 188 |  | 
| 142 189 | 
             
              shared_examples :non_destructive_method do |method|
         | 
| 143 190 | 
             
                it "doesn't modify the arg" do
         | 
| 144 | 
            -
                  str = " | 
| 191 | 
            +
                  str = "ЖŽ"
         | 
| 145 192 | 
             
                  expect { Byk.send(method, str) }.to_not change { str }
         | 
| 146 193 | 
             
                end
         | 
| 147 194 | 
             
              end
         | 
| 148 195 |  | 
| 149 196 | 
             
              shared_examples :destructive_method do |method|
         | 
| 150 197 | 
             
                it "modifies the arg" do
         | 
| 151 | 
            -
                  str = " | 
| 198 | 
            +
                  str = "ЖŽ"
         | 
| 152 199 | 
             
                  expect { Byk.send(method, str) }.to change { str }
         | 
| 153 200 | 
             
                end
         | 
| 154 201 | 
             
              end
         | 
| 155 202 |  | 
| 203 | 
            +
              describe ".to_cyrillic" do
         | 
| 204 | 
            +
                it_behaves_like :cyrillization_method, :to_cyrillic
         | 
| 205 | 
            +
                it_behaves_like :non_destructive_method, :to_cyrillic
         | 
| 206 | 
            +
              end
         | 
| 207 | 
            +
             | 
| 208 | 
            +
              describe ".to_cyrillic!" do
         | 
| 209 | 
            +
                it_behaves_like :cyrillization_method, :to_cyrillic!
         | 
| 210 | 
            +
                it_behaves_like :destructive_method, :to_cyrillic!
         | 
| 211 | 
            +
              end
         | 
| 212 | 
            +
             | 
| 156 213 | 
             
              describe ".to_latin" do
         | 
| 157 214 | 
             
                it_behaves_like :latinization_method, :to_latin
         | 
| 158 215 | 
             
                it_behaves_like :non_destructive_method, :to_latin
         | 
| @@ -176,7 +233,7 @@ end | |
| 176 233 |  | 
| 177 234 | 
             
            describe String do
         | 
| 178 235 | 
             
              it "responds to Byk methods" do
         | 
| 179 | 
            -
                Byk. | 
| 236 | 
            +
                Byk.singleton_methods.each do |method|
         | 
| 180 237 | 
             
                  expect("").to respond_to(method)
         | 
| 181 238 | 
             
                end
         | 
| 182 239 | 
             
              end
         | 
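The latinization edge cases in the spec above (for example `ЊЊ` → `NJNJ`, `ЏЏ` → `DŽDŽ`) depend on how the casing of the digraphs lj, nj and dž is chosen. The following is a minimal pure-Ruby sketch of one rule that satisfies those expectations. It is not the gem's C implementation: the `LatinSketch` module, its `to_latin` method, and the neighbour-based casing rule are assumptions made for illustration, and it relies on Unicode-aware `String#upcase`/`#downcase` (Ruby 2.4 or later).

```ruby
# Hypothetical illustration only; this is NOT how byk's C extension is written.
module LatinSketch
  AZBUKA  = %w[а б в г д ђ е ж з и ј к л љ м н њ о п р с т ћ у ф х ц ч џ ш].freeze
  ABECEDA = %w[a b v g d đ e ž z i j k l lj m n nj o p r s t ć u f h c č dž š].freeze
  LOWER   = AZBUKA.zip(ABECEDA).to_h.freeze

  # True when the character has a distinct lowercase form, i.e. it is a capital.
  def self.upper?(ch)
    !ch.nil? && ch != ch.downcase
  end

  def self.to_latin(text)
    chars = text.chars
    chars.each_with_index.map do |ch, i|
      latin = LOWER[ch.downcase]
      next ch if latin.nil?                    # not Serbian Cyrillic: pass through
      next latin unless upper?(ch)             # lowercase letter maps directly
      next latin.upcase if latin.length == 1   # plain capital letter

      # Capital digraph: assumed rule is to use full caps ("NJ") when an
      # adjacent character is a capital, and title case ("Nj") otherwise.
      prev = i.zero? ? nil : chars[i - 1]
      if upper?(chars[i + 1]) || upper?(prev)
        latin.upcase
      else
        latin.capitalize
      end
    end.join
  end
end

puts LatinSketch.to_latin("Љубав")
puts LatinSketch.to_latin("ЊЊ")
```

Characters outside the 30-letter Serbian alphabet, including Latin text and non-Serbian Cyrillic, are returned unchanged, which mirrors the "doesn't translate" examples in the spec.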
    
        metadata
    CHANGED
    
    | @@ -1,15 +1,29 @@ | |
| 1 1 | 
             
            --- !ruby/object:Gem::Specification
         | 
| 2 2 | 
             
            name: byk
         | 
| 3 3 | 
             
            version: !ruby/object:Gem::Version
         | 
| 4 | 
            -
              version: 0. | 
| 4 | 
            +
              version: 1.0.0
         | 
| 5 5 | 
             
            platform: ruby
         | 
| 6 6 | 
             
            authors:
         | 
| 7 7 | 
             
            - Nikola Topalović
         | 
| 8 8 | 
             
            autorequire: 
         | 
| 9 | 
            -
            bindir:  | 
| 9 | 
            +
            bindir: exe
         | 
| 10 10 | 
             
            cert_chain: []
         | 
| 11 | 
            -
            date:  | 
| 11 | 
            +
            date: 2016-04-09 00:00:00.000000000 Z
         | 
| 12 12 | 
             
            dependencies:
         | 
| 13 | 
            +
            - !ruby/object:Gem::Dependency
         | 
| 14 | 
            +
              name: rake
         | 
| 15 | 
            +
              requirement: !ruby/object:Gem::Requirement
         | 
| 16 | 
            +
                requirements:
         | 
| 17 | 
            +
                - - "~>"
         | 
| 18 | 
            +
                  - !ruby/object:Gem::Version
         | 
| 19 | 
            +
                    version: '10.5'
         | 
| 20 | 
            +
              type: :development
         | 
| 21 | 
            +
              prerelease: false
         | 
| 22 | 
            +
              version_requirements: !ruby/object:Gem::Requirement
         | 
| 23 | 
            +
                requirements:
         | 
| 24 | 
            +
                - - "~>"
         | 
| 25 | 
            +
                  - !ruby/object:Gem::Version
         | 
| 26 | 
            +
                    version: '10.5'
         | 
| 13 27 | 
             
            - !ruby/object:Gem::Dependency
         | 
| 14 28 | 
             
              name: rake-compiler
         | 
| 15 29 | 
             
              requirement: !ruby/object:Gem::Requirement
         | 
| @@ -38,10 +52,11 @@ dependencies: | |
| 38 52 | 
             
                - - "~>"
         | 
| 39 53 | 
             
                  - !ruby/object:Gem::Version
         | 
| 40 54 | 
             
                    version: '3.2'
         | 
| 41 | 
            -
            description:  | 
| 42 | 
            -
               | 
| 55 | 
            +
            description: Fast transliteration of Serbian Cyrillic to Latin and back. Brzo preslovljavanje
         | 
| 56 | 
            +
              ćirilice u latinicu i obratno.
         | 
| 43 57 | 
             
            email: nikola.topalovic@gmail.com
         | 
| 44 | 
            -
            executables: | 
| 58 | 
            +
            executables:
         | 
| 59 | 
            +
            - byk
         | 
| 45 60 | 
             
            extensions:
         | 
| 46 61 | 
             
            - ext/byk/extconf.rb
         | 
| 47 62 | 
             
            extra_rdoc_files: []
         | 
| @@ -49,6 +64,7 @@ files: | |
| 49 64 | 
             
            - CHANGELOG.md
         | 
| 50 65 | 
             
            - LICENSE
         | 
| 51 66 | 
             
            - README.md
         | 
| 67 | 
            +
            - exe/byk
         | 
| 52 68 | 
             
            - ext/byk/byk.c
         | 
| 53 69 | 
             
            - ext/byk/extconf.rb
         | 
| 54 70 | 
             
            - lib/byk.rb
         | 
| @@ -76,9 +92,10 @@ required_rubygems_version: !ruby/object:Gem::Requirement | |
| 76 92 | 
             
                  version: '0'
         | 
| 77 93 | 
             
            requirements: []
         | 
| 78 94 | 
             
            rubyforge_project: 
         | 
| 79 | 
            -
            rubygems_version: 2. | 
| 95 | 
            +
            rubygems_version: 2.5.1
         | 
| 80 96 | 
             
            signing_key: 
         | 
| 81 97 | 
             
            specification_version: 4
         | 
| 82 | 
            -
            summary: Fast transliteration of Serbian Cyrillic  | 
| 98 | 
            +
            summary: Fast transliteration of Serbian Cyrillic to Latin and back. Brzo preslovljavanje
         | 
| 99 | 
            +
              ćirilice u latinicu i obratno.
         | 
| 83 100 | 
             
            test_files:
         | 
| 84 101 | 
             
            - spec/byk_spec.rb
         |