surname-transliterator 0.4.2 → 0.4.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 3b89068379dd610abf4b713b922fb2a10859f0bdbd9212bd62df1cf78a6de22e
- data.tar.gz: 96f02a045635b271baa5b95dd5d731f0165d1634de1c4be5e7d2d74fb9e88c16
+ metadata.gz: 5a9264f8289114d3007f04b0b8897bd50801bf10852143d67bbb60e2d10c921b
+ data.tar.gz: f73f11d4ba8f9f59cc512b626f24fd400afe90a00e4e77c869b5beceabe12d93
  SHA512:
- metadata.gz: a4767ed01232c158e6090e04744ea0156d2c87191622b7b1e626eac916285f4529f0281f3cb374978cc7f757c5c358713323fe39c4c11f40c812bdc504dffcda
- data.tar.gz: c68d19e198eccc518e86998d057e45b7ffc35034de5f71c303d7677940acdfcebac0e5d227af35e6099c04b32c1068e93d115cc095853b353b5900fbe6e0a750
+ metadata.gz: e1e7b8cd5dc6e9e0d7a0d1b73c59bd02689e44db5985aabbbf83cd94732732103f6c1b7c4a22e97b3cdc26f40bbd25488a49745bd88d74c15f68319d68b16502
+ data.tar.gz: 1edc0770b0a4035c8b76c35a36a46628f41eddf5d59ca7822cc4f0b413c1bfa5e2b96d06e62bc4ab07c3746250808e054841cd05441dafa06cbe7df30e79391f
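
The new digests can be checked against a locally unpacked copy of the gem. The sketch below is one way to do that with Ruby's standard `digest` library; the helper name `sha256_matches?` and the local file name `data.tar.gz` are assumptions for illustration and are not part of the gem.

```ruby
require 'digest'

# Hypothetical helper (not part of surname-transliterator): compare the
# SHA256 digest of a file on disk with an expected hex string.
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.strip.downcase
end

# SHA256 of data.tar.gz for 0.4.4, copied from the checksums.yaml hunk above.
EXPECTED_DATA_SHA256 =
  'f73f11d4ba8f9f59cc512b626f24fd400afe90a00e4e77c869b5beceabe12d93'

# Assumed local path; only checked when the file is actually present.
if File.exist?('data.tar.gz')
  puts sha256_matches?('data.tar.gz', EXPECTED_DATA_SHA256)
end
```

A `.gem` file is a tar archive, so fetching the package (for example with `gem fetch surname-transliterator -v 0.4.4`) and unpacking it with `tar -xf` should yield the `data.tar.gz` and `metadata.gz` files these digests refer to.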
data/CHANGELOG.md CHANGED
@@ -5,6 +5,20 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [0.4.4] - 2025-01-01
+
+ ### Fixed
+ - Updated test expectations for improved Jankauskas mapping.
+
+ ## [0.4.3] - 2025-01-01
+
+ ### Added
+ - Improved lithuanian_to_polish mappings: 'auskas' → 'owski' for better genealogical transformations (e.g., Jankauskas → Jankowski).
+
+ ### Fixed
+ - Corrected README examples to match actual gem outputs (including digraph handling).
+ - Updated installation instructions.
+
  ## [0.4.0] - 2025-01-01
 
  ### Added
data/README.md CHANGED
@@ -3,13 +3,13 @@
  A Ruby gem for cross-language surname transliteration and transformation, based on genealogical rules. Supports transliteration (removing diacritics/Cyrillic) and polonization/de-polonization endings between languages like Polish-Lithuanian, Polish-Russian, Czech, etc. Extensible for more pairs. Useful for reducing false positives in genealogical matching.
 
  Features:
- - Transliterate Polish surnames to Lithuanian script (remove diacritics).
- - Basic de-polonization (reverse common polonized endings to Lithuanian).
+ - Transliterate surnames (remove diacritics/Cyrillic, handle Polish digraphs like sz/č/cz/rz).
+ - Transform endings between languages (polonization/de-polonization based on genealogical rules).
+ - Generate W/V interchange variants for better genealogical matching.
+ - Support for Polish ↔ Lithuanian, Polish ↔ Russian transformations (asymmetric).
 
  ## Installation
 
- TODO: Replace `UPDATE_WITH_YOUR_GEM_NAME_IMMEDIATELY_AFTER_RELEASE_TO_RUBYGEMS_ORG` with your gem name right after releasing it to RubyGems.org. Please do not do it earlier due to security reasons. Alternatively, replace this section with instructions to install your gem from git if you don't plan to release to RubyGems.org.
-
  Install the gem and add to the application's Gemfile by executing:
 
  ```bash
@@ -29,23 +29,23 @@ require 'surname/transliterator'
 
  # Convenience methods
  polish_to_lith = Surname::Transliterator.polish_to_lithuanian("Łukasiewicz")
- # => ["Lukasiewicz"] (only transliterated, no transformation)
+ # => ["Lukasiewič"] (transliterated with Polish digraphs)
 
  polish_to_lith2 = Surname::Transliterator.polish_to_lithuanian("Antonowicz")
- # => ["Antonowicz", "Antanavicius"] (transliterated + transformed)
+ # => ["Antonavičius", "Antonowič"] (transliterated + transformed)
 
  lith_to_polish = Surname::Transliterator.lithuanian_to_polish("Jankauskas")
- # => ["Jankauskas", "Jankowski"]
+ # => ["Jankowski", "Jankauskas"] (transformed + transliterated)
 
  polish_to_russian = Surname::Transliterator.polish_to_russian("Kowalski")
  # => ["Kowalski", "Kowalskii"]
 
  russian_to_polish = Surname::Transliterator.russian_to_polish("Иванов")
- # => ["Ivanov", "Ivanov"]
+ # => ["Ivanov"]
 
- # General cross-language normalization
- variants = Surname::Transliterator.normalize_surname("Antonowicz", 'polish', 'lithuanian')
- # => ["Antonowicz", "Antanavicius"]
+ # General cross-language normalization (includes W/V interchange for genealogical matching)
+ variants = Surname::Transliterator.normalize_surname("Wiszniewski", 'polish', 'lithuanian')
+ # => ["Wišnievskis", "Wišnievskas", "Wišniewski", "Višnievskis", "Višnievskas", "Višniewski"] (transformed + W/V)
 
  # Just transliterate (remove diacritics/Cyrillic)
  clean_polish = Surname::Transliterator.transliterate("Świętochowski", 'polish')
@@ -55,6 +55,43 @@ clean_russian = Surname::Transliterator.transliterate("Иванов", 'russian')
  # => "Ivanov"
  ```
 
+ ## Important Notes
+
+ - **Asymmetric Transformations**: Translations between languages are not symmetric due to historical genealogical adaptations. For example, Polish -owicz may become Lithuanian -avičius, but reversing it doesn't always restore -owicz exactly. Use `polish_to_lithuanian` and `lithuanian_to_polish` as separate methods with their own mappings.
+
+ ## Supported Languages and Pairs
+
+ The gem supports transliteration and transformation for the following languages:
+
+ - **Polish**: Full transliteration (diacritics + digraphs like sz/č/cz/rz).
+ - **Lithuanian**: Full transliteration.
+ - **Russian**: Full transliteration.
+ - **Czech**: Basic transliteration.
+
+ ### Supported Language Pairs for Transformations
+
+ | From ↓ / To → | Polish | Lithuanian | Russian |
+ |---------------|--------|------------|---------|
+ | **Polish** | - | ✅ (polish_to_lithuanian) | ✅ (polish_to_russian) |
+ | **Lithuanian**| ✅ (lithuanian_to_polish) | - | - |
+ | **Russian** | ✅ (russian_to_polish) | - | - |
+
+ Note: Transformations are asymmetric (see below). Add more pairs by editing `POLONIZATION_MAPPINGS`.
+
+ ## Transformation Matrix Examples
+
+ Below is a matrix showing example transformations between languages (not symmetric):
+
+ | From → To | Polish → Lithuanian | Lithuanian → Polish |
+ |--------------------|---------------------|---------------------|
+ | Antonowicz | Antonavičius, Antonowič | - |
+ | Jankauskas | - | Jankowski, Jankauskas |
+ | Kowalski | Kovalskis | - |
+ | Wiśniewski | Višnievskis, Višnievskas | - |
+ | Dombrovskis | - | Dombrowski |
+
+ This illustrates why separate methods are needed for each direction.
+
  ## Adding New Languages
 
  Edit `DIACRITIC_MAPPINGS` and `POLONIZATION_MAPPINGS` in the code to add support for more languages/pairs.
@@ -65,6 +102,14 @@ After checking out the repo, run `bin/setup` to install dependencies. You can al
 
  To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
 
+ ## TODO
+
+ - Add polonization mappings for Czech surnames.
+ - Extend support for more language pairs (e.g., Lithuanian ↔ Russian).
+ - Improve W/V interchange logic for other languages.
+ - Add more genealogical sources for mapping validation.
+ - Consider adding fuzzy matching or Soundex for better approximate matches.
+
  ## Contributing
 
- Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/surname-transliterator.
+ Bug reports and pull requests are welcome on GitHub at https://github.com/justi-blue/surname-transliterator.
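
The README hunks above mention W/V interchange variants but do not show how they are produced. Judging only from the `normalize_surname("Wiszniewski", ...)` example, the initial letter appears to be swapped while an inner "w" is left alone (compare "Wišniewski" and "Višniewski"). The sketch below reproduces that behaviour under this assumption; `with_wv_variants` is a hypothetical helper, not the gem's API.

```ruby
# Hypothetical helper (not part of surname-transliterator): for every name
# that starts with W or V, also emit the same name with the other initial.
# Assumption drawn from the README example: only the first letter is swapped.
def with_wv_variants(names)
  swapped = names.filter_map do |name|
    case name[0]
    when 'W' then "V#{name[1..]}"
    when 'V' then "W#{name[1..]}"
    end
  end
  # Originals first, swapped forms after, duplicates removed.
  (names + swapped).uniq
end

variants = with_wv_variants(%w[Wišnievskis Wišnievskas Wišniewski])
puts variants.inspect
# => ["Wišnievskis", "Wišnievskas", "Wišniewski", "Višnievskis", "Višnievskas", "Višniewski"]
```

Names that start with any other letter pass through unchanged, so the helper can be applied to a whole result list without filtering first.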
data/lib/surname/transliterator/version.rb CHANGED
@@ -2,6 +2,6 @@
 
  module Surname
  module Transliterator
- VERSION = '0.4.2'
+ VERSION = '0.4.4'
  end
  end
data/lib/surname/transliterator.rb CHANGED
@@ -108,6 +108,7 @@ module Surname
  'skas' => ['ski'],
  'ckis' => ['cki'],
  'ckas' => ['cki'],
+ 'auskas' => ['owski'], # e.g., Jankauskas → Jankowski
  'onis' => ['owicz'], # e.g., Jonas → Janowicz
  'aitis' => ['owicz'] # rarer, e.g., Kazlauskas variations
  },
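
The added `'auskas'` key overlaps with the existing `'skas'` key: "Jankauskas" ends with both. The diff does not show how the gem chooses between overlapping endings, so the sketch below assumes a longest-match rule purely for illustration. The constant and method names are hypothetical, and the table is a subset of the entries in the hunk above.

```ruby
# Hypothetical subset of the Lithuanian → Polish ending table shown above.
LITHUANIAN_TO_POLISH_ENDINGS = {
  'skas'   => ['ski'],
  'ckis'   => ['cki'],
  'ckas'   => ['cki'],
  'auskas' => ['owski']
}.freeze

# Assumed rule (not confirmed by the diff): replace the longest matching
# ending. Returns an empty array when no ending matches.
def polonize_ending(surname, endings = LITHUANIAN_TO_POLISH_ENDINGS)
  suffix = endings.keys
                  .select { |ending| surname.end_with?(ending) }
                  .max_by(&:length)
  return [] unless suffix

  stem = surname[0...-suffix.length]
  endings[suffix].map { |replacement| stem + replacement }
end

puts polonize_ending('Jankauskas').inspect
# => ["Jankowski"]

# Without the new key, only 'skas' matches and the result differs.
old_table = LITHUANIAN_TO_POLISH_ENDINGS.reject { |ending, _| ending == 'auskas' }
puts polonize_ending('Jankauskas', old_table).inspect
# => ["Jankauski"]
```

Under this assumption the new entry is what turns "Jankauski" into the "Jankowski" form listed in the 0.4.3 changelog; a first-match rule over the hash order would behave differently.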
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: surname-transliterator
  version: !ruby/object:Gem::Version
- version: 0.4.2
+ version: 0.4.4
  platform: ruby
  authors:
  - Justyna Wojtczak
@@ -26,12 +26,12 @@ files:
  - lib/surname/transliterator.rb
  - lib/surname/transliterator/version.rb
  - sig/surname/transliterator.rbs
- homepage: https://github.com/justi-blue/surname-transliterator
+ homepage: https://github.com/justine84/surname-transliterator
  licenses:
  - MIT
  metadata:
- source_code_uri: https://github.com/justi-blue/surname-transliterator/tree/main
- homepage_uri: https://github.com/justi-blue/surname-transliterator
+ source_code_uri: https://github.com/justine84/surname-transliterator/tree/main
+ homepage_uri: https://github.com/justine84/surname-transliterator
  changelog_uri: https://github.com/justi-blue/surname-transliterator/blob/main/CHANGELOG.md
  rubygems_mfa_required: 'true'
  rdoc_options: []