obfuscator-rb 0.3.1

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 631dc474ce83763052076cc0859f85012e0e79a623d4394ffc1485152d92ae41
4
+ data.tar.gz: 0b6f3963a2f67f29f4f84ea226f181331a77802bfdab22046b802c5ea9df82ab
5
+ SHA512:
6
+ metadata.gz: 694433e45c631fc851840d63af7f8dceff5d77da72b28a0818a3cbb5001c9456e55c45c10d3597e53b6b55e7a6eb29bc1aada13edad2ded2f53a85bf906458fc
7
+ data.tar.gz: d7bcd0e7b17e725c782a92a9ed865c7f78dc03e9b7afd8d10afe3725cf4a46080839e3abcd5fe756fd18853b74a7db96f572ec3e483ae10b5f67eaa1c8e9dbcc
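The digests listed above can be compared against files extracted from a downloaded `.gem` archive. The following is a minimal sketch, assuming `metadata.gz` and `data.tar.gz` have already been unpacked to disk; the helper name `digest_matches?` is hypothetical and is not part of RubyGems.

```ruby
# frozen_string_literal: true

require 'digest'

# Hypothetical helper: true when the file's SHA-256 digest equals the
# expected hex string (the comparison ignores letter case).
def digest_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end
```

A call such as `digest_matches?('data.tar.gz', '0b6f3963...')` would take the full 64-character value from the `SHA256` section; `Digest::SHA512` can be substituted for the longer digests.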
data/.rubocop.yml ADDED
@@ -0,0 +1,55 @@
1
+ AllCops:
2
+ TargetRubyVersion: 3.3
3
+ NewCops: enable
4
+ Exclude:
5
+ - "bin/**/*"
6
+ - ".git/**/*"
7
+ - ".bundle/**/*"
8
+ - ".vscode/**/*"
9
+ - ".ruby-lsp/**/*"
10
+ - "vendor/**/*"
11
+ SuggestExtensions: false
12
+
13
+ require:
14
+ - rubocop-minitest
15
+ - rubocop-performance
16
+
17
+ Naming/FileName:
18
+ Exclude:
19
+ - 'lib/obfuscator-rb.rb'
20
+
21
+ Style/StringLiterals:
22
+ Enabled: true
23
+ EnforcedStyle: single_quotes
24
+
25
+ Style/StringLiteralsInInterpolation:
26
+ EnforcedStyle: double_quotes
27
+
28
+ Layout/ArgumentAlignment:
29
+ EnforcedStyle: with_first_argument
30
+
31
+ Layout/HashAlignment:
32
+ EnforcedHashRocketStyle: table
33
+ EnforcedColonStyle: table
34
+
35
+ Metrics/ClassLength:
36
+ Max: 140
37
+ Exclude:
38
+ - "test/*"
39
+
40
+ Metrics/BlockLength:
41
+ Max: 50
42
+
43
+ Metrics/MethodLength:
44
+ Max: 60
45
+
46
+ Metrics/AbcSize:
47
+ Max: 40
48
+ Exclude:
49
+ - "test/*"
50
+
51
+ Metrics/PerceivedComplexity:
52
+ Max: 35
53
+
54
+ Metrics/CyclomaticComplexity:
55
+ Max: 30
data/CHANGELOG.md ADDED
@@ -0,0 +1,32 @@
1
+ # Changelog
2
+ All notable changes to this project will be documented in this file.
3
+
4
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
5
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
6
+
7
+ ## [0.3.1] - 2025-02-07
8
+ ### Fixed
9
+ - Restored proper deterministic behavior for sequential obfuscation calls when using seed
10
+ - Fixed language detection for capitalized words
11
+ - Added test coverage for sequential determinism with seeds
12
+
13
+ ## [0.3.0] - 2025-02-07
14
+ ### Changed
15
+ - Renamed gem from 'obfuscator' to 'obfuscator-rb' for RubyGems.org publication
16
+ - Restructured main entry point for better gem compatibility
17
+ - Updated documentation to reflect new gem name
18
+
19
+ ## [0.2.0] - 2025-02-06
20
+ ### Added
21
+ - DateObfuscator class for handling date obfuscation
22
+ - Support for various date formats (EU, ISO, Russian)
23
+ - Configurable date constraints (year range, month/weekday preservation)
24
+ - Random number generation, array and range sampling helper methods
25
+
26
+ ### Changed
27
+ - Refactored internal RNG handling for better maintainability and consistency
28
+ - Fixed seed handling to ensure proper reproducibility of results
29
+
30
+ ## [0.1.0] - 2025-02-03
31
+ ### Added
32
+ - Initial release
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2025 ad
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,225 @@
1
+ # Obfuscator
2
+
3
+ [Русский](#русский) | [English](#english)
4
+
5
+ ## Русский
6
+
7
+ Ruby-гем для обфускации текста. Сохраняет структуру, заменяя содержимое бессмысленными словами, сохраняющими при этом
8
+ естественный вид исходного текста. Поддерживает русский и английский языки.
9
+
10
+ ### Установка
11
+
12
+ Добавьте эту строку в Gemfile вашего приложения:
13
+
14
+ ```ruby
15
+ gem 'obfuscator-rb', git: 'https://hub.mos.ru/ad/obfuscator.git'
16
+ ```
17
+
18
+ И выполните:
19
+
20
+ ```bash
21
+ $ bundle install
22
+ ```
23
+
24
+ Или установите самостоятельно:
25
+
26
+ ```bash
27
+ $ gem install obfuscator-rb
28
+ ```
29
+
30
+ ### Возможности
31
+
32
+ - Сохраняет структуру текста (пунктуация, пробелы, регистр)
33
+ - Сохраняет длину слов (если не включена натурализация)
34
+ - Поддерживает несколько режимов обфускации
35
+ - Опциональная натурализация текста по некоторым простым правилам
36
+ - Необратимая трансформация текста
37
+ - Детерминированный вывод при использовании сида
38
+ - Полная поддержка UTF-8
39
+ - Обеспечена обработка ошибок определённых типов данных
40
+
41
+ ### Использование
42
+
43
+ ```ruby
44
+ require 'obfuscator-rb'
45
+
46
+ # Базовое использование (режим :direct по умолчанию)
47
+ obfuscator = Obfuscator::Multilang.new
48
+ text = "Hello, Привет! This is a TEST текст."
49
+ result = obfuscator.obfuscate(text)
50
+ # => Каждое слово обфусцируется с использованием исходного алфавита
51
+ # => "Idise, Кющэшэ! Izib oq g MUGU дипяд."
52
+
53
+ # Смешанный режим с обоими алфавитами
54
+ obfuscator = Obfuscator::Multilang.new(mode: :mixed)
55
+ result = obfuscator.obfuscate(text)
56
+ # => Слова могут содержать и латинские, и кириллические символы
57
+ # => "Fаyef, Фeфeгю! Muci лi r HЫЛO ицижё."
58
+
59
+ # С натурализацией по простым правилам для более естественного вывода
60
+ obfuscator = Obfuscator::Multilang.new(mode: :mixed, naturalize: true)
61
+ result = obfuscator.obfuscate(text)
62
+ # => Вывод обрабатывается для более естественного вида
63
+ # => "Ohеsion, Wорыой! Наvы мe л ЛУНI yeзing."
64
+
65
+ # С сидом для воспроизводимых результатов
66
+ obfuscator = Obfuscator::Multilang.new(seed: 12345)
67
+ result = obfuscator.obfuscate(text)
68
+ # => Одинаковый ввод + одинаковый сид = одинаковый вывод
69
+ # => "Cumic, Фяцёне! Okac ub h POWO щюзёс."
70
+ ```
71
+
72
+ ### Доступные режимы
73
+
74
+ - `:direct` (по умолчанию) - сохраняет исходный язык для каждого слова
75
+ - `:eng_to_eng` - только английский в английский
76
+ - `:rus_to_rus` - только русский в русский
77
+ - `:swapped` - английский в русский и наоборот
78
+ - `:mixed` - использует оба алфавита (просто ради прикола)
79
+
80
+ ### Обработка входных данных
81
+
82
+ Обфускатор обрабатывает различные типы данных:
83
+
84
+ - `nil` → возвращает nil
85
+ - Числа → возвращаются без изменений
86
+ - Объекты с методом `to_s` → обрабатываются нормально
87
+ - Объекты без базовых методов Ruby → вызывают `InputError`
88
+ - Неверные кодировки → вызывают `EncodingError`
89
+
90
+ #### Типы ошибок
91
+
92
+ - `Obfuscator::Error` - Базовый класс ошибок гема
93
+ - `Obfuscator::InputError` - Возникает при неверном типе входных данных
94
+ - `Obfuscator::EncodingError` - Возникает при проблемах с кодировкой
95
+
96
+ #### Пример использования с обработкой ошибок
97
+
98
+ ```ruby
99
+
100
+ begin
101
+ obfuscator.obfuscate(текст)
102
+ rescue Obfuscator::InputError => e
103
+ # Обработка неверного типа входных данных
104
+ puts "Неверный тип данных: #{e.message}"
105
+ rescue Obfuscator::EncodingError => e
106
+ # Обработка проблем с кодировкой
107
+ puts "Ошибка кодировки: #{e.message}"
108
+ rescue Obfuscator::Error => e
109
+ # Обработка прочих ошибок обфускации
110
+ puts "Ошибка обфускации: #{e.message}"
111
+ end
112
+ ```
113
+
114
+ ## English
115
+
116
+ A Ruby gem for text obfuscation that preserves text structure while replacing content with meaningless but
117
+ natural-looking words. Supports both English and Russian languages.
118
+
119
+ ### Installation
120
+
121
+ Add this line to your application's Gemfile:
122
+
123
+ ```ruby
124
+ gem 'obfuscator-rb', git: 'https://hub.mos.ru/ad/obfuscator.git'
125
+ ```
126
+
127
+ And then execute:
128
+
129
+ ```bash
130
+ $ bundle install
131
+ ```
132
+
133
+ Or install it yourself as:
134
+
135
+ ```bash
136
+ $ gem install obfuscator-rb
137
+ ```
138
+
139
+ ### Features
140
+
141
+ - Preserves text structure (punctuation, spacing, capitalization)
142
+ - Maintains word lengths (unless naturalization is enabled)
143
+ - Supports multiple obfuscation modes
144
+ - Optional text naturalization according to some basic rules
145
+ - Irreversible transformation
146
+ - Deterministic output with seeds
147
+ - Full UTF-8 support
148
+ - Comprehensive error handling with specific error types
149
+
150
+ ### Usage
151
+
152
+ ```ruby
153
+ require 'obfuscator-rb'
154
+
155
+ # Basic usage (default :direct mode)
156
+ obfuscator = Obfuscator::Multilang.new
157
+ text = "Hello, Привет! This is a TEST текст."
158
+ result = obfuscator.obfuscate(text)
159
+ # => Each word is obfuscated using its source alphabet
160
+ # => "Idise, Кющэшэ! Izib oq g MUGU дипяд."
161
+
162
+ # Mixed mode with both alphabets
163
+ obfuscator = Obfuscator::Multilang.new(mode: :mixed)
164
+ result = obfuscator.obfuscate(text)
165
+ # => Words may contain both Latin and Cyrillic characters
166
+ # => "Fаyef, Фeфeгю! Muci лi r HЫЛO ицижё."
167
+
168
+ # With basic naturalization for more natural-looking output
169
+ obfuscator = Obfuscator::Multilang.new(mode: :mixed, naturalize: true)
170
+ result = obfuscator.obfuscate(text)
171
+ # => Output is processed to look more natural
172
+ # => "Ohеsion, Wорыой! Наvы мe л ЛУНI yeзing."
173
+
174
+ # With seed for reproducible results
175
+ obfuscator = Obfuscator::Multilang.new(seed: 12345)
176
+ result = obfuscator.obfuscate(text)
177
+ # => Same input + same seed = same output
178
+ # => "Cumic, Фяцёне! Okac ub h POWO щюзёс."
179
+ ```
180
+
181
+ ### Available Modes
182
+
183
+ - `:direct` (default) - preserves source language for each word
184
+ - `:eng_to_eng` - English to English only
185
+ - `:rus_to_rus` - Russian to Russian only
186
+ - `:swapped` - English to Russian and vice versa
187
+ - `:mixed` - uses both alphabets (just for fun)
188
+
189
+ ### Input Handling
190
+
191
+ The obfuscator handles various input types:
192
+
193
+ - `nil` → returns nil
194
+ - Numbers → returns unchanged
195
+ - Objects responding to `to_s` → processes normally
196
+ - Objects without basic Ruby methods → raises `InputError`
197
+ - Invalid encodings → raises `EncodingError`
198
+
199
+ #### Error Types
200
+
201
+ - `Obfuscator::Error` - Base error class for the gem
202
+ - `Obfuscator::InputError` - Raised for invalid input types
203
+ - `Obfuscator::EncodingError` - Raised for encoding-related issues
204
+
205
+ #### Example Usage with Error Handling
206
+
207
+ ```ruby
208
+
209
+ begin
210
+ obfuscator.obfuscate(text)
211
+ rescue Obfuscator::InputError => e
212
+ # Handle invalid input types
213
+ puts "Invalid input: #{e.message}"
214
+ rescue Obfuscator::EncodingError => e
215
+ # Handle encoding issues
216
+ puts "Encoding error: #{e.message}"
217
+ rescue Obfuscator::Error => e
218
+ # Handle other obfuscation errors
219
+ puts "Obfuscation failed: #{e.message}"
220
+ end
221
+ ```
222
+
223
+ ## License
224
+
225
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
data/Rakefile ADDED
@@ -0,0 +1,12 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'bundler/gem_tasks'
4
+ require 'minitest/test_task'
5
+
6
+ Minitest::TestTask.create
7
+
8
+ require 'rubocop/rake_task'
9
+
10
+ RuboCop::RakeTask.new
11
+
12
+ task default: %i[test rubocop]
data/lib/obfuscator/constants.rb ADDED
@@ -0,0 +1,33 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Obfuscator
4
+ module Constants
5
+ ENGLISH_CONSONANTS = %w[b c d f g h j k l m n p q r s t v w x y z].freeze
6
+ ENGLISH_VOWELS = %w[a e i o u].freeze
7
+
8
+ RUSSIAN_CONSONANTS = %w[б в г д ж з к л м н п р с т ф х ц ч ш щ].freeze
9
+ RUSSIAN_VOWELS = %w[а е ё и о у ы э ю я].freeze
10
+
11
+ # Impossible combinations in both languages
12
+ IMPOSSIBLE_COMBINATIONS = %w[
13
+ щщ щц щч щж щш щх
14
+ жщ жж жц жч
15
+ цщ цж цч
16
+ чщ чж чц
17
+ th щ th ж th ц th ч
18
+ wa щ wa ж wa ц wa ч
19
+ ].freeze
20
+
21
+ # Typical Russian word endings
22
+ RUSSIAN_ENDINGS = %w[
23
+ ый ой ая ое ые ий ь
24
+ ость ение ство ация
25
+ ].freeze
26
+
27
+ # Typical English word endings
28
+ ENGLISH_ENDINGS = %w[
29
+ ing ed ly tion sion
30
+ ment ness ful less
31
+ ].freeze
32
+ end
33
+ end
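A note on the `%w[]` literal used for `IMPOSSIBLE_COMBINATIONS`: `%w` splits on whitespace, so an entry written as `th щ` produces two separate elements rather than one string. The sketch below only demonstrates this Ruby behavior on a shortened list; the constant name `SAMPLE_COMBINATIONS` is hypothetical, and the source does not say whether the split is intended.

```ruby
# frozen_string_literal: true

# A shortened list in the same style as IMPOSSIBLE_COMBINATIONS.
SAMPLE_COMBINATIONS = %w[
  щщ щц
  th щ th ж
].freeze

puts SAMPLE_COMBINATIONS.length            # prints 6: "th" and "щ" are separate elements
puts SAMPLE_COMBINATIONS.include?('th щ')  # prints false
puts SAMPLE_COMBINATIONS.include?('щ')     # prints true
```

Because the single letter `щ` ends up as its own element, any check of the form `pair.include?(combo)` would match every pair that contains `щ`.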
data/lib/obfuscator/date_obfuscator.rb ADDED
@@ -0,0 +1,122 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'date'
4
+ require_relative 'internal/rng'
5
+
6
+ module Obfuscator
7
+ # Class for obfuscating dates while preserving their format and optionally some properties.
8
+ #
9
+ # Supports various date formats through presets or custom format strings.
10
+ # Can preserve certain date characteristics (month, weekday) and respect year constraints.
11
+ # All generated dates are valid - for example, it won't generate February 31st.
12
+ #
13
+ # @example Basic usage with preset format
14
+ # obfuscator = DateObfuscator.new
15
+ # obfuscator.obfuscate('2023-12-31') # => "2025-07-15"
16
+ #
17
+ # @example With custom format string
18
+ # obfuscator = DateObfuscator.new(format: '%Y-%m-%d')
19
+ # obfuscator.obfuscate('2023-12-31') # => "2025-07-15"
20
+ #
21
+ # @example With constraints
22
+ # obfuscator = DateObfuscator.new(
23
+ # constraints: {
24
+ # min_year: 2020, # Minimum year to generate
25
+ # max_year: 2025, # Maximum year to generate
26
+ # preserve_month: true, # Keep the same month
27
+ # preserve_weekday: true # Keep the same day of week
28
+ # }
29
+ # )
30
+ #
31
+ # @example With seed for reproducible results
32
+ # obfuscator = DateObfuscator.new(seed: 12345)
33
+ # obfuscator.obfuscate('2023-12-31') # => Same result for same seed
34
+ #
35
+ # Available preset formats:
36
+ # - :eu => '%d.%m.%Y' # 31.12.2023
37
+ # - :eu_short => '%d.%m.%y' # 31.12.23
38
+ # - :rus => '%d.%m.%Y' # 31.12.2023
39
+ # - :rus_short => '%d.%m.%y' # 31.12.23
40
+ # - :iso => '%Y-%m-%d' # 2023-12-31
41
+ #
42
+ # @param format [Symbol, String] Preset format name or custom format string (default: :iso)
43
+ # @param seed [Integer, nil] Optional seed for reproducible results
44
+ # @param constraints [Hash] Optional constraints for date generation
45
+ # @option constraints [Integer] :min_year Minimum year to generate (default: 2000)
46
+ # @option constraints [Integer] :max_year Maximum year to generate (default: 2030)
47
+ # @option constraints [Boolean] :preserve_month Keep the same month (default: false)
48
+ # @option constraints [Boolean] :preserve_weekday Keep the same day of week (default: false)
49
+ #
50
+ # @raise [Error] If date string is invalid or doesn't match the format
51
+ class DateObfuscator
52
+ include Internal::RNG
53
+
54
+ PRESET_FORMATS = {
55
+ eu: '%d.%m.%Y', # 31.12.2023
56
+ eu_short: '%d.%m.%y', # 31.12.23
57
+ rus: '%d.%m.%Y', # 31.12.2023
58
+ rus_short: '%d.%m.%y', # 31.12.23
59
+ iso: '%Y-%m-%d' # 2023-12-31
60
+ }.freeze
61
+
62
+ def initialize(format: :iso, seed: nil, constraints: {})
63
+ setup_rng(seed)
64
+ @format = PRESET_FORMATS[format] || format
65
+ @constraints = default_constraints.merge(constraints)
66
+ end
67
+
68
+ def obfuscate(date_string)
69
+ return date_string if date_string.nil? || date_string.empty?
70
+
71
+ begin
72
+ date = ::Date.strptime(date_string, @format)
73
+ obfuscated_date = generate_date(date)
74
+ obfuscated_date.strftime(@format)
75
+ rescue ArgumentError => e
76
+ raise Error, "Invalid date or format: #{e.message}"
77
+ end
78
+ end
79
+
80
+ private
81
+
82
+ def default_constraints
83
+ {
84
+ min_year: 2000,
85
+ max_year: 2030,
86
+ preserve_month: false,
87
+ preserve_weekday: false
88
+ }
89
+ end
90
+
91
+ def generate_date(original_date)
92
+ year = random_year
93
+ month = @constraints[:preserve_month] ? original_date.month : random_month
94
+ day = random_day(year, month, original_date)
95
+
96
+ ::Date.new(year, month, day)
97
+ end
98
+
99
+ def random_year
100
+ random_range(@constraints[:min_year]..@constraints[:max_year])
101
+ end
102
+
103
+ def random_month
104
+ random_range(1..12)
105
+ end
106
+
107
+ def random_day(year, month, original_date)
108
+ days_in_month = ::Date.new(year, month, -1).day
109
+
110
+ if @constraints[:preserve_weekday]
111
+ # Find a day that falls on the same weekday
112
+ target_weekday = original_date.wday
113
+ possible_days = (1..days_in_month).select do |d|
114
+ ::Date.new(year, month, d).wday == target_weekday
115
+ end
116
+ random_sample(possible_days)
117
+ else
118
+ random_range(1..days_in_month)
119
+ end
120
+ end
121
+ end
122
+ end
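The weekday-preserving day selection in `random_day` can be tried in isolation. This is a minimal sketch using only the standard `date` library; the helper name `same_weekday_days` is hypothetical and does not exist in the gem.

```ruby
# frozen_string_literal: true

require 'date'

# Returns every day of the given month that falls on the same weekday
# as `original`. Day -1 in Date.new means the last day of the month.
def same_weekday_days(year, month, original)
  days_in_month = Date.new(year, month, -1).day
  (1..days_in_month).select { |d| Date.new(year, month, d).wday == original.wday }
end

original = Date.new(2023, 12, 31) # a Sunday
candidates = same_weekday_days(2024, 2, original)
p candidates # => [4, 11, 18, 25]

# A seeded generator picks one of the candidates reproducibly.
day = candidates.sample(random: Random.new(12_345))
puts Date.new(2024, 2, day).strftime('%Y-%m-%d')
```

Every weekday occurs at least four times in any month, so the candidate list is never empty and the final `Date.new` call always receives a valid day.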
data/lib/obfuscator/internal/rng.rb ADDED
@@ -0,0 +1,47 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Obfuscator
4
+ module Internal
5
+ # Internal module providing Random Number Generation functionality.
6
+ # This module is intended for internal use only and shouldn't be used directly by gem users.
7
+ #
8
+ # Provides consistent random number generation across the gem's classes,
9
+ # with optional seed support for reproducible results.
10
+ #
11
+ # @api private
12
+ #
13
+ # Usage:
14
+ # include Internal::RNG
15
+ #
16
+ # def initialize(seed = nil)
17
+ # setup_rng(seed)
18
+ # end
19
+ #
20
+ # private
21
+ #
22
+ # def some_method
23
+ # random_sample(some_array) # For array sampling
24
+ # random_probability # For random float between 0 and 1
25
+ # random_range(some_range) # For range sampling
26
+ # end
27
+ module RNG
28
+ private
29
+
30
+ def setup_rng(seed = nil)
31
+ @rng = seed.nil? ? Random.new : Random.new(seed)
32
+ end
33
+
34
+ def random_sample(array)
35
+ array.sample(random: @rng)
36
+ end
37
+
38
+ def random_probability
39
+ @rng.rand
40
+ end
41
+
42
+ def random_range(range)
43
+ @rng.rand(range)
44
+ end
45
+ end
46
+ end
47
+ end
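The reproducibility that `setup_rng` relies on comes from Ruby's `Random` class: two generators created with the same seed produce the same sequence. The following is a minimal sketch with hypothetical helper names (`sample_run`, `sample_year`) that are not part of the gem.

```ruby
# frozen_string_literal: true

LETTERS = %w[a e i o u].freeze

# Builds a fresh generator from `seed` and draws `count` letters from it.
def sample_run(seed, count)
  rng = Random.new(seed)
  Array.new(count) { LETTERS.sample(random: rng) }
end

# Random#rand accepts a Range and returns a value inside it.
def sample_year(seed, range = 2000..2030)
  Random.new(seed).rand(range)
end

puts sample_run(12_345, 5) == sample_run(12_345, 5) # prints true
puts((2000..2030).cover?(sample_year(12_345)))      # prints true
```

Re-creating the generator from the stored seed before each run, as `sample_run` does here, is what makes repeated calls return identical results.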
data/lib/obfuscator/multilang.rb ADDED
@@ -0,0 +1,220 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative 'constants'
4
+ require_relative 'internal/rng'
5
+ require_relative 'naturalizer'
6
+
7
+ module Obfuscator
8
+ # A class responsible for obfuscating text in Russian and English languages.
9
+ #
10
+ # This class provides various modes for obfuscating text while preserving the original
11
+ # text structure, whitespace, punctuation, and capitalization. The obfuscation can be
12
+ # performed in several modes and optionally naturalized to produce more readable output.
13
+ #
14
+ # Available modes:
15
+ # - MODE_DIRECT (default): Preserves language, replacing words with same-language random words
16
+ # - MODE_ENG_TO_ENG: Only obfuscates English words, leaves Russian untouched
17
+ # - MODE_RUS_TO_RUS: Only obfuscates Russian words, leaves English untouched
18
+ # - MODE_SWAPPED: Swaps languages (English→Russian and Russian→English)
19
+ # - MODE_MIXED: Generates words containing both English and Russian characters
20
+ #
21
+ # @example Basic usage
22
+ # obfuscator = Multilang.new
23
+ # obfuscator.obfuscate("Hello world!") # => "Kites mefal!"
24
+ #
25
+ # @example Using swapped mode with naturalization
26
+ # obfuscator = Multilang.new(mode: :swapped, naturalize: true)
27
+ # obfuscator.obfuscate("Hello мир!") # => Latin words come back in Cyrillic letters and vice versa (output is random)
28
+ #
29
+ # @param mode [Symbol] The obfuscation mode to use (default: MODE_DIRECT)
30
+ # @param seed [Integer, nil] Optional seed for reproducible results
31
+ # @param naturalize [Boolean] Whether to naturalize the output (default: false)
32
+ #
33
+ # @raise [InputError] If input doesn't respond to :to_s
34
+ # @raise [EncodingError] If input has invalid encoding
35
+ # @raise [Error] If obfuscation fails for any other reason
36
+ class Multilang
37
+ include Constants
38
+ include Internal::RNG
39
+
40
+ MODE_DIRECT = :direct # 1:1 obfuscation, the default
41
+ MODE_ENG_TO_ENG = :eng_to_eng # eng → eng, rus left untouched
42
+ MODE_RUS_TO_RUS = :rus_to_rus # rus → rus, eng left untouched
43
+ MODE_SWAPPED = :swapped # eng→rus and rus→eng
44
+ MODE_MIXED = :mixed # eng/rus → eng+rus mix just for fun
45
+
46
+ def initialize(mode: MODE_DIRECT, seed: nil, naturalize: false)
47
+ @mode = mode
48
+ @seed = seed # Store the seed
49
+ setup_rng(seed)
50
+ @naturalizer = Naturalizer.new(seed) if naturalize
51
+ end
52
+
53
+ def obfuscate(input)
54
+ # Reset RNG state before each obfuscation if seed was provided
55
+ setup_rng(@seed) if @seed
56
+
57
+ raise InputError, 'Input must respond to :to_s' unless input.respond_to?(:to_s)
58
+ return input if input.nil? || input.is_a?(Numeric)
59
+
60
+ text = input.to_s
61
+
62
+ # Ensure UTF-8 encoding
63
+ begin
64
+ text = text.encode('UTF-8') unless text.encoding == Encoding::UTF_8
65
+ rescue Encoding::InvalidByteSequenceError, Encoding::UndefinedConversionError => e
66
+ raise EncodingError, "Encoding error: #{e.message}"
67
+ end
68
+
69
+ # Split preserving all whitespace and punctuation
70
+ begin
71
+ tokens = text.split(/(\s+|[[:punct:]])/)
72
+ tokens.map do |token|
73
+ if token.match?(/\s+|[[:punct:]]/)
74
+ token # Preserve whitespace and punctuation
75
+ else
76
+ process_word(token)
77
+ end
78
+ end.join
79
+ rescue ArgumentError => e
80
+ raise EncodingError, "Encoding error: #{e.message}" if e.message.include?('invalid byte sequence')
81
+
82
+ raise Error, "Obfuscation error: #{e.message}"
83
+ rescue StandardError => e
84
+ raise Error, "Obfuscation error: #{e.message}"
85
+ end
86
+ rescue NoMethodError => e
87
+ raise InputError, "Input must be a Ruby object with basic methods: #{e.message}"
88
+ end
89
+
90
+ private
91
+
92
+ def process_word(word)
93
+ return word if word.empty?
94
+
95
+ begin
96
+ source_lang = detect_language(word)
97
+ return word if source_lang == :unknown
98
+
99
+ result = case @mode
100
+ when MODE_ENG_TO_ENG
101
+ source_lang == :english ? obfuscate_word(word, :english) : word
102
+ when MODE_RUS_TO_RUS
103
+ source_lang == :russian ? obfuscate_word(word, :russian) : word
104
+ when MODE_SWAPPED
105
+ target_lang = source_lang == :english ? :russian : :english
106
+ obfuscate_word(word, target_lang)
107
+ when MODE_MIXED
108
+ obfuscate_mixed_word(word)
109
+ when MODE_DIRECT
110
+ obfuscate_word(word, source_lang)
111
+ else
112
+ word
113
+ end
114
+
115
+ @naturalizer ? @naturalizer.naturalize(result) : result
116
+ rescue StandardError => e
117
+ raise Error, "Word processing error for '#{word}': #{e.message}"
118
+ end
119
+ end
120
+
121
+ def detect_language(word)
122
+ first_char = word[0]
123
+ return :russian if first_char.match?(/[а-яёА-ЯЁ]/)
124
+ return :english if first_char.match?(/[a-zA-Z]/)
125
+
126
+ :unknown
127
+ end
128
+
129
+ def obfuscate_word(word, target_lang)
130
+ # Store capitalization pattern
131
+ caps_pattern = word.chars.map { |char| char.match?(/[A-ZА-ЯЁ]/) }
132
+
133
+ # Generate new word
134
+ new_word = case target_lang
135
+ when :english
136
+ generate_english_word(word.length)
137
+ when :russian
138
+ generate_russian_word(word.length)
139
+ end
140
+
141
+ # Apply capitalization pattern
142
+ apply_caps_pattern(new_word, caps_pattern)
143
+ end
144
+
145
+ def obfuscate_mixed_word(word)
146
+ caps_pattern = word.chars.map { |char| char.match?(/[A-ZА-ЯЁ]/) }
147
+ new_word = generate_mixed_word(word.length)
148
+ apply_caps_pattern(new_word, caps_pattern)
149
+ end
150
+
151
+ def generate_english_word(length)
152
+ generate_word(length, ENGLISH_CONSONANTS, ENGLISH_VOWELS, 0.4)
153
+ end
154
+
155
+ def generate_russian_word(length)
156
+ generate_word(length, RUSSIAN_CONSONANTS, RUSSIAN_VOWELS, 0.25)
157
+ end
158
+
159
+ def generate_word(length, consonants, vowels, vowel_start_prob)
160
+ return '' if length.zero?
161
+
162
+ result = ''
163
+ is_vowel = random_probability < vowel_start_prob
164
+
165
+ while result.length < length
166
+ chars = is_vowel ? vowels : consonants
167
+ result += random_sample(chars)
168
+ is_vowel = !is_vowel
169
+ end
170
+
171
+ result[0...length]
172
+ end
173
+
174
+ def generate_mixed_word(length)
175
+ return '' if length.zero?
176
+
177
+ is_vowel = random_probability < 0.25
178
+
179
+ result = ''
180
+ while result.length < length
181
+ # 50/50 chance of Russian or English
182
+ use_russian = random_probability < 0.5
183
+
184
+ # Disabled alternative: 25% / 40% chance of a vowel for Russian / English
185
+ # is_vowel = @rng.rand < (use_russian ? 0.25 : 0.4)
186
+
187
+ char = if is_vowel
188
+ if use_russian
189
+ random_sample(RUSSIAN_VOWELS)
190
+ else
191
+ random_sample(ENGLISH_VOWELS)
192
+ end
193
+ elsif use_russian
194
+ random_sample(RUSSIAN_CONSONANTS)
195
+ else
196
+ random_sample(ENGLISH_CONSONANTS)
197
+ end
198
+
199
+ result += char
200
+ is_vowel = !is_vowel
201
+ end
202
+
203
+ result[0...length]
204
+ end
205
+
206
+ def apply_caps_pattern(word, pattern)
207
+ word.chars.map.with_index do |c, i|
208
+ pattern[i] ? c.upcase : c.downcase
209
+ end.join
210
+ end
211
+
212
+ def random_sample(array)
213
+ array.sample(random: @rng)
214
+ end
215
+
216
+ def random_probability
217
+ @rng.rand
218
+ end
219
+ end
220
+ end
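The three core steps of `Multilang#obfuscate` (splitting while keeping separators, generating an alternating consonant/vowel word, and re-applying the capitalization pattern) can be sketched on their own. This is a simplified, English-only illustration under stated assumptions: the helper names `tokenize`, `pseudo_word`, and `copy_case` are hypothetical, and the sketch is not the gem's implementation.

```ruby
# frozen_string_literal: true

VOWELS = %w[a e i o u].freeze
CONSONANTS = %w[b c d f g h j k l m n p q r s t v w x y z].freeze

# The capture group keeps separators (whitespace runs, punctuation) in the result.
def tokenize(text)
  text.split(/(\s+|[[:punct:]])/)
end

# Alternates consonants and vowels; the first letter is a vowel with
# probability `vowel_start_prob`.
def pseudo_word(length, rng, vowel_start_prob = 0.4)
  vowel_turn = rng.rand < vowel_start_prob
  Array.new(length) do
    pool = vowel_turn ? VOWELS : CONSONANTS
    vowel_turn = !vowel_turn
    pool.sample(random: rng)
  end.join
end

# Copies the upper/lower-case pattern of `source` onto `replacement`.
def copy_case(source, replacement)
  replacement.chars.each_with_index.map do |char, i|
    source[i]&.match?(/[A-ZА-ЯЁ]/) ? char.upcase : char.downcase
  end.join
end

rng = Random.new(12_345)
text = 'Hello, TEST world!'
result = tokenize(text).map do |token|
  token.match?(/\A[a-zA-Z]+\z/) ? copy_case(token, pseudo_word(token.length, rng)) : token
end.join
puts result
```

The printed string keeps the punctuation, spacing, word lengths, and capitalization of the input, while the letters themselves depend on the seed.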
data/lib/obfuscator/naturalizer.rb ADDED
@@ -0,0 +1,174 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative 'constants'
4
+ require_relative 'internal/rng'
5
+
6
+ module Obfuscator
7
+ # A class responsible for naturalizing words by applying linguistic rules to make them
8
+ # more readable and natural-looking while preserving their structure.
9
+ #
10
+ # The naturalizer applies several rules to improve readability:
11
+ # 1. No soft/hard signs (ь/ъ) after Latin letters
12
+ # 2. No щ after w/th combinations
13
+ # 3. No й after consonants
14
+ # 4. No triple consonants (inserts appropriate vowel)
15
+ # 5. Handles impossible letter combinations
16
+ # 6. No double vowels
17
+ # 7. Special handling for ё, ю, я after consonants
18
+ # 8. Applies appropriate language-specific endings for longer words
19
+ #
20
+ # @example Basic usage
21
+ # naturalizer = Naturalizer.new
22
+ # naturalizer.naturalize("Thщит") # => "Thкит"
23
+ #
24
+ # @example With seed for reproducible results
25
+ # naturalizer = Naturalizer.new(12345)
26
+ # naturalizer.naturalize("Thщит") # => Same result for same seed
27
+ #
28
+ # @param seed [Integer, nil] Optional seed for reproducible results
29
+ #
30
+ # @see Multilang For the main obfuscation class that uses this naturalizer
31
+ class Naturalizer
32
+ include Constants
33
+ include Internal::RNG
34
+
35
+ def initialize(seed = nil)
36
+ setup_rng(seed)
37
+ end
38
+
39
+ # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/MethodLength,Metrics/PerceivedComplexity
40
+ def naturalize(word)
41
+ return word unless word.respond_to?(:to_s)
42
+ return word if word.length < 2
43
+
44
+ begin
45
+ chars = word.chars
46
+ result = []
47
+
48
+ chars.each_with_index do |char, i|
49
+ next_char = chars[i + 1]
50
+
51
+ if next_char.nil?
52
+ result << char
53
+ next
54
+ end
55
+
56
+ # Rule 1: No ь/ъ after Latin letters
57
+ soft_hard_signs = %w[ь ъ]
58
+ if latin?(char) && soft_hard_signs.include?(next_char)
59
+ chars[i + 1] = random_sample(RUSSIAN_CONSONANTS.reject { |c| soft_hard_signs.include?(c) })
60
+ end
61
+
62
+ # Rule 2: No щ after w/th
63
+ if (char == 'w' || (i.positive? && chars[i - 1] == 't' && char == 'h')) && next_char == 'щ'
64
+ chars[i + 1] = random_sample(RUSSIAN_CONSONANTS - ['щ'])
65
+ end
66
+
67
+ # Rule 3: No й after consonants
68
+ chars[i + 1] = random_sample(RUSSIAN_CONSONANTS - ['й']) if consonant?(char) && next_char == 'й'
69
+
70
+ # Rule 4: No triple consonants
71
+ if i < chars.length - 2 &&
72
+ consonant?(char) &&
73
+ consonant?(next_char) &&
74
+ consonant?(chars[i + 2])
75
+ chars[i + 1] = if cyrillic?(next_char)
76
+ random_sample(RUSSIAN_VOWELS)
77
+ else
78
+ random_sample(ENGLISH_VOWELS)
79
+ end
80
+ end
81
+
82
+ # Rule 5: Handle impossible combinations
83
+ current_pair = char + next_char
84
+ if IMPOSSIBLE_COMBINATIONS.any? { |combo| current_pair.include?(combo) }
85
+ chars[i + 1] = if cyrillic?(next_char)
86
+ random_sample(RUSSIAN_CONSONANTS)
87
+ else
88
+ random_sample(ENGLISH_CONSONANTS)
89
+ end
90
+ end
91
+
92
+ # Rule 6: No double vowels
93
+ if vowel?(char) && vowel?(next_char)
94
+ chars[i + 1] = if cyrillic?(next_char)
95
+ random_sample(RUSSIAN_CONSONANTS)
96
+ else
97
+ random_sample(ENGLISH_CONSONANTS)
98
+ end
99
+ end
100
+
101
+ # Rule 7: Handle ё, ю, я after consonants
102
+ # This rule is a special case of Rule 5
103
+ soft_vowels = %w[ё ю я]
104
+ if consonant?(char) && soft_vowels.include?(next_char)
105
+ chars[i + 1] = random_sample(RUSSIAN_VOWELS - soft_vowels)
106
+ end
107
+
108
+ result << char
109
+ rescue StandardError => e
110
+ raise Error, "Naturalization error for '#{word}': #{e.message}"
111
+ end
112
+ end
113
+
114
+ # Rule 8: Apply appropriate ending if word is long enough
115
+ final_word = result.join
116
+ if final_word.length > 4
117
+ if mostly_russian?(final_word)
118
+ apply_russian_ending(final_word)
119
+ elsif mostly_english?(final_word)
120
+ apply_english_ending(final_word)
121
+ else
122
+ final_word
123
+ end
124
+ else
125
+ final_word
126
+ end
127
+ end
128
+ # rubocop:enable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/MethodLength,Metrics/PerceivedComplexity
129
+
130
+ private
131
+
132
+ def latin?(char)
133
+ char.match?(/[a-zA-Z]/)
134
+ end
135
+
136
+ def cyrillic?(char)
137
+ char.match?(/[а-яёА-ЯЁ]/)
138
+ end
139
+
140
+ def consonant?(char)
141
+ down_char = char.downcase
142
+ ENGLISH_CONSONANTS.include?(down_char) || RUSSIAN_CONSONANTS.include?(down_char)
143
+ end
144
+
145
+ def vowel?(char)
146
+ down_char = char.downcase
147
+ ENGLISH_VOWELS.include?(down_char) || RUSSIAN_VOWELS.include?(down_char)
148
+ end
149
+
150
+ def mostly_russian?(word)
151
+ russian_chars = word.chars.count { |c| cyrillic?(c) }
152
+ russian_chars > word.length / 2
153
+ end
154
+
155
+ def mostly_english?(word)
156
+ english_chars = word.chars.count { |c| latin?(c) }
157
+ english_chars > word.length / 2
158
+ end
159
+
160
+ def apply_russian_ending(word)
161
+ return word if word.length < 4
162
+
163
+ base = word[0...-2]
164
+ base + random_sample(RUSSIAN_ENDINGS)
165
+ end
166
+
167
+ def apply_english_ending(word)
168
+ return word if word.length < 4
169
+
170
+ base = word[0...-2]
171
+ base + random_sample(ENGLISH_ENDINGS)
172
+ end
173
+ end
174
+ end
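Rule 8 chooses an ending by a majority check that uses integer division, so a word split evenly between the two alphabets receives no ending. The following is a minimal sketch of that check; the helper name `dominant_script` is hypothetical and combines what the source does in `mostly_russian?` and `mostly_english?`.

```ruby
# frozen_string_literal: true

# Returns :russian or :english when more than half of the characters belong
# to that alphabet, otherwise :mixed. `length / 2` is integer division.
def dominant_script(word)
  half = word.length / 2
  cyrillic = word.chars.count { |c| c.match?(/[а-яёА-ЯЁ]/) }
  latin    = word.chars.count { |c| c.match?(/[a-zA-Z]/) }
  return :russian if cyrillic > half
  return :english if latin > half

  :mixed
end

p dominant_script('привет') # => :russian
p dominant_script('window') # => :english
p dominant_script('приvet') # => :mixed (3 of 6 is not more than half)
```

For odd lengths the threshold rounds down: a five-letter word needs only three letters from one alphabet to count as belonging to it.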
data/lib/obfuscator/version.rb ADDED
@@ -0,0 +1,5 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Obfuscator
4
+ VERSION = '0.3.1'
5
+ end
data/lib/obfuscator-rb.rb ADDED
@@ -0,0 +1,50 @@
1
+ # frozen_string_literal: true
2
+
3
+ # Obfuscator is a text obfuscation library that preserves text structure while replacing content
4
+ # with meaningless but natural-looking words. It supports both English and Russian languages.
5
+ #
6
+ # The gem provides two main obfuscators:
7
+ # - {Multilang} for text obfuscation with multiple language support
8
+ # - {DateObfuscator} for date obfuscation with format preservation
9
+ #
10
+ # @example Basic text obfuscation
11
+ # require 'obfuscator-rb'
12
+ #
13
+ # obfuscator = Obfuscator::Multilang.new
14
+ # obfuscator.obfuscate("Hello, World!") # => "Kites, Mefal!"
15
+ #
16
+ # @example Date obfuscation
17
+ # date_obf = Obfuscator::DateObfuscator.new
18
+ # date_obf.obfuscate("2023-12-31") # => "2025-07-15"
19
+ #
20
+ # Error handling is provided through specific error classes:
21
+ # - {Error} Base error class for the gem
22
+ # - {InputError} Raised for invalid input types
23
+ # - {EncodingError} Raised for encoding-related issues
24
+ #
25
+ # @see Multilang For text obfuscation functionality
26
+ # @see DateObfuscator For date obfuscation functionality
27
+ # @see Internal::RNG For random number generation utilities
28
+
29
+ module Obfuscator
30
+ class Error < StandardError; end
31
+ class EncodingError < Error; end
32
+ class InputError < Error; end
33
+ end
34
+
35
+ require_relative 'obfuscator/version'
36
+ require_relative 'obfuscator/constants'
37
+ require_relative 'obfuscator/internal/rng'
38
+ require_relative 'obfuscator/naturalizer'
39
+ require_relative 'obfuscator/multilang'
40
+ require_relative 'obfuscator/date_obfuscator'
41
+
42
+ # Usage example:
43
+ if __FILE__ == $PROGRAM_NAME
44
+ obfuscator = Obfuscator::Multilang.new(seed: 12_345)
45
+ original_text = 'Hello, Мир! This is a TEST текст.'
46
+ obfuscated = obfuscator.obfuscate(original_text)
47
+
48
+ puts "Original: #{original_text}"
49
+ puts "Obfuscated: #{obfuscated}" # 'Cumic, Фяц! Piwi ok c UBOH ричуг.'
50
+ end
metadata ADDED
@@ -0,0 +1,67 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: obfuscator-rb
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.3.1
5
+ platform: ruby
6
+ authors:
7
+ - Aleksandr Dryzhuk
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2025-02-19 00:00:00.000000000 Z
12
+ dependencies: []
13
+ description: |
14
+ A Ruby gem for text obfuscation that preserves text structure while replacing content
15
+ with meaningless but natural-looking words. Supports both English and Russian languages,
16
+ with various obfuscation modes and optional text naturalization.
17
+
18
+ Гем для обфускации текста, сохраняющий его структуру и естественный вид, но заменяющий при этом содержимое
19
+ бессмысленными словами. Поддерживает английский и русский языки, различные режимы обфускации и опциональную
20
+ натурализацию текста.
21
+ email:
22
+ - dev@ad-it.pro
23
+ executables: []
24
+ extensions: []
25
+ extra_rdoc_files: []
26
+ files:
27
+ - ".rubocop.yml"
28
+ - CHANGELOG.md
29
+ - LICENSE.txt
30
+ - README.md
31
+ - Rakefile
32
+ - lib/obfuscator-rb.rb
33
+ - lib/obfuscator/constants.rb
34
+ - lib/obfuscator/date_obfuscator.rb
35
+ - lib/obfuscator/internal/rng.rb
36
+ - lib/obfuscator/multilang.rb
37
+ - lib/obfuscator/naturalizer.rb
38
+ - lib/obfuscator/version.rb
39
+ homepage: https://hub.mos.ru/ad/obfuscator
40
+ licenses:
41
+ - MIT
42
+ metadata:
43
+ rubygems_mfa_required: 'true'
44
+ homepage_uri: https://hub.mos.ru/ad/obfuscator
45
+ source_code_uri: https://hub.mos.ru/ad/obfuscator
46
+ changelog_uri: https://hub.mos.ru/ad/obfuscator/blob/master/CHANGELOG.md
47
+ post_install_message:
48
+ rdoc_options: []
49
+ require_paths:
50
+ - lib
51
+ required_ruby_version: !ruby/object:Gem::Requirement
52
+ requirements:
53
+ - - ">="
54
+ - !ruby/object:Gem::Version
55
+ version: 3.3.0
56
+ required_rubygems_version: !ruby/object:Gem::Requirement
57
+ requirements:
58
+ - - ">="
59
+ - !ruby/object:Gem::Version
60
+ version: '0'
61
+ requirements: []
62
+ rubygems_version: 3.5.22
63
+ signing_key:
64
+ specification_version: 4
65
+ summary: Text obfuscator that preserves structure while working with both English
66
+ and Russian languages
67
+ test_files: []