string_to_number 0.2.0 → 0.3.0
- checksums.yaml +4 -4
- data/.github/workflows/ci.yml +81 -0
- data/.github/workflows/release.yml +62 -0
- data/.rubocop.yml +110 -0
- data/CLAUDE.md +23 -85
- data/Gemfile +9 -0
- data/Gemfile.lock +32 -1
- data/README.md +53 -163
- data/Rakefile +5 -1
- data/SECURITY.md +25 -0
- data/benchmark.rb +41 -40
- data/docs/ARCHITECTURE.md +131 -0
- data/docs/demo.gif +0 -0
- data/lib/string_to_number/parser.rb +49 -79
- data/lib/string_to_number/to_number.rb +21 -22
- data/lib/string_to_number/version.rb +3 -1
- data/lib/string_to_number.rb +9 -7
- data/microbenchmark.rb +81 -80
- data/performance_comparison.rb +34 -35
- data/profile.rb +44 -45
- data/string_to_number.gemspec +5 -6
- metadata +15 -51
- data/.travis.yml +0 -5
- /data/{LICENSE.txt → LICENSE} +0 -0
data/README.md
CHANGED
|
@@ -1,54 +1,53 @@
|
|
|
1
1
|
# StringToNumber
|
|
2
2
|
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
|
|
3
|
+
<p align="center">
|
|
4
|
+
<a href="https://badge.fury.io/rb/string_to_number"><img src="https://badge.fury.io/rb/string_to_number.svg" alt="Gem Version"></a>
|
|
5
|
+
<a href="https://github.com/FabienPiette/string_to_number/actions"><img src="https://github.com/FabienPiette/string_to_number/workflows/CI/badge.svg" alt="CI"></a>
|
|
6
|
+
<a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License: MIT"></a>
|
|
7
|
+
</p>
|
|
6
8
|
|
|
7
|
-
|
|
9
|
+
Convert French written numbers into their numeric equivalents in Ruby.
|
|
8
10
|
|
|
9
|
-
|
|
11
|
+
<p align="center">
|
|
12
|
+
<img src="docs/demo.gif" alt="goscribe demo" width="800">
|
|
13
|
+
</p>
|
|
10
14
|
|
|
11
|
-
|
|
12
|
-
- **Thread-Safe**: Concurrent access support with proper locking mechanisms
|
|
13
|
-
- **Comprehensive**: Handles complex French number formats including:
|
|
14
|
-
- Basic numbers (zéro, un, deux...)
|
|
15
|
-
- Compound numbers (vingt et un, quatre-vingt-quatorze...)
|
|
16
|
-
- Large numbers (millions, milliards, billions...)
|
|
17
|
-
- Special cases (quatre-vingts, soixante-dix...)
|
|
18
|
-
- **Memory Efficient**: LRU cache with configurable limits
|
|
19
|
-
- **Backward Compatible**: Maintains compatibility with original implementation
|
|
15
|
+
## Quick Start
|
|
20
16
|
|
|
21
|
-
|
|
17
|
+
```ruby
|
|
18
|
+
gem 'string_to_number' # Add to your Gemfile, then: bundle install
|
|
19
|
+
```
|
|
22
20
|
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
| Short | 0.5ms | 0.035ms | **14x** |
|
|
26
|
-
| Medium | 2.1ms | 0.045ms | **47x** |
|
|
27
|
-
| Long | 23ms | 0.05ms | **460x** |
|
|
21
|
+
```ruby
|
|
22
|
+
require 'string_to_number'
|
|
28
23
|
|
|
29
|
-
|
|
24
|
+
StringToNumber.in_numbers('vingt et un') #=> 21
|
|
25
|
+
StringToNumber.in_numbers('mille deux cent trente-quatre') #=> 1234
|
|
26
|
+
StringToNumber.in_numbers('trois milliards') #=> 3_000_000_000
|
|
27
|
+
```
|
|
30
28
|
|
|
31
|
-
|
|
29
|
+
## Features
|
|
32
30
|
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
31
|
+
- **Fast** — 14-460x faster than naive recursive parsing, via pre-compiled patterns and LRU caching
|
|
32
|
+
- **Complete** — handles all standard French number words from `zéro` to `billions`, including compound forms (`quatre-vingt-quatorze`, `soixante-dix`)
|
|
33
|
+
- **Thread-safe** — concurrent access with mutex-protected caches; >2M conversions/sec under contention
|
|
34
|
+
- **Zero dependencies** — pure Ruby, no external gems required
|
|
35
|
+
|
|
36
|
+
## Install
|
|
36
37
|
|
|
37
|
-
|
|
38
|
+
**Prerequisites:** Ruby 2.7+
|
|
38
39
|
|
|
39
40
|
```bash
|
|
40
|
-
|
|
41
|
+
gem install string_to_number
|
|
41
42
|
```
|
|
42
43
|
|
|
43
|
-
Or
|
|
44
|
+
Or in your Gemfile:
|
|
44
45
|
|
|
45
|
-
```
|
|
46
|
-
|
|
46
|
+
```ruby
|
|
47
|
+
gem 'string_to_number'
|
|
47
48
|
```
|
|
48
49
|
|
|
49
|
-
##
|
|
50
|
-
|
|
51
|
-
### Basic Usage
|
|
50
|
+
## Usage
|
|
52
51
|
|
|
53
52
|
```ruby
|
|
54
53
|
require 'string_to_number'
|
|
@@ -59,153 +58,44 @@ StringToNumber.in_numbers('quinze') #=> 15
|
|
|
59
58
|
StringToNumber.in_numbers('cent') #=> 100
|
|
60
59
|
|
|
61
60
|
# Compound numbers
|
|
62
|
-
StringToNumber.in_numbers('vingt et un')
|
|
63
|
-
StringToNumber.in_numbers('quatre-vingt-quatorze')
|
|
64
|
-
StringToNumber.in_numbers('
|
|
61
|
+
StringToNumber.in_numbers('vingt et un') #=> 21
|
|
62
|
+
StringToNumber.in_numbers('quatre-vingt-quatorze') #=> 94
|
|
63
|
+
StringToNumber.in_numbers('neuf mille neuf cent quatre-vingt-dix-neuf') #=> 9999
|
|
65
64
|
|
|
66
65
|
# Large numbers
|
|
67
|
-
StringToNumber.in_numbers('
|
|
68
|
-
StringToNumber.in_numbers('
|
|
69
|
-
StringToNumber.in_numbers('trois milliards') #=> 3_000_000_000
|
|
70
|
-
|
|
71
|
-
# Complex expressions
|
|
72
|
-
StringToNumber.in_numbers('neuf mille neuf cent quatre-vingt-dix-neuf') #=> 9999
|
|
73
|
-
StringToNumber.in_numbers('deux millions trois cent mille') #=> 2_300_000
|
|
66
|
+
StringToNumber.in_numbers('un million') #=> 1_000_000
|
|
67
|
+
StringToNumber.in_numbers('deux millions trois cent mille') #=> 2_300_000
|
|
74
68
|
```
|
|
75
69
|
|
|
76
|
-
###
|
|
70
|
+
### Validation and cache management
|
|
77
71
|
|
|
78
72
|
```ruby
|
|
79
|
-
# Validation
|
|
80
73
|
StringToNumber.valid_french_number?('vingt et un') #=> true
|
|
81
74
|
StringToNumber.valid_french_number?('hello world') #=> false
|
|
82
75
|
|
|
83
|
-
#
|
|
84
|
-
StringToNumber.clear_caches! #
|
|
85
|
-
stats = StringToNumber.cache_stats
|
|
86
|
-
puts "Cache hit ratio: #{stats[:cache_hit_ratio]}"
|
|
87
|
-
|
|
88
|
-
# Backward compatibility mode
|
|
89
|
-
StringToNumber.in_numbers('cent', use_optimized: false) #=> 100
|
|
76
|
+
StringToNumber.cache_stats # inspect cache hit ratios
|
|
77
|
+
StringToNumber.clear_caches! # free cached data
|
|
90
78
|
```
|
|
91
79
|
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
| Range | Examples |
|
|
95
|
-
|-------|----------|
|
|
96
|
-
| 0-19 | zéro, un, deux, trois, quatre, cinq, six, sept, huit, neuf, dix, onze, douze, treize, quatorze, quinze, seize, dix-sept, dix-huit, dix-neuf |
|
|
97
|
-
| 20-99 | vingt, trente, quarante, cinquante, soixante, soixante-dix, quatre-vingts, quatre-vingt-dix |
|
|
98
|
-
| 100+ | cent, mille, million, milliard, billion |
|
|
99
|
-
| Compounds | vingt et un, quatre-vingt-quatorze, deux mille trois |
|
|
100
|
-
|
|
101
|
-
## ⚡ Performance Tips
|
|
102
|
-
|
|
103
|
-
1. **Reuse conversions**: The gem automatically caches results for better performance
|
|
104
|
-
2. **Batch processing**: Use the optimized parser (default) for better throughput
|
|
105
|
-
3. **Memory management**: Call `clear_caches!` periodically if processing many unique inputs
|
|
106
|
-
4. **Thread safety**: The gem is thread-safe and can be used in concurrent environments
|
|
107
|
-
|
|
108
|
-
## 🧪 Development
|
|
109
|
-
|
|
110
|
-
After checking out the repo, run `bin/setup` to install dependencies:
|
|
111
|
-
|
|
112
|
-
```bash
|
|
113
|
-
$ git clone https://github.com/FabienPiette/string_to_number.git
|
|
114
|
-
$ cd string_to_number
|
|
115
|
-
$ bin/setup
|
|
116
|
-
```
|
|
117
|
-
|
|
118
|
-
### Running Tests
|
|
119
|
-
|
|
120
|
-
```bash
|
|
121
|
-
# Run all tests
|
|
122
|
-
$ rake spec
|
|
123
|
-
|
|
124
|
-
# Run performance tests
|
|
125
|
-
$ ruby benchmark.rb
|
|
126
|
-
|
|
127
|
-
# Run specific test files
|
|
128
|
-
$ rspec spec/string_to_number_spec.rb
|
|
129
|
-
```
|
|
130
|
-
|
|
131
|
-
### Performance Benchmarking
|
|
132
|
-
|
|
133
|
-
```bash
|
|
134
|
-
# Compare implementations
|
|
135
|
-
$ ruby performance_comparison.rb
|
|
136
|
-
|
|
137
|
-
# Detailed micro-benchmarks
|
|
138
|
-
$ ruby microbenchmark.rb
|
|
139
|
-
```
|
|
140
|
-
|
|
141
|
-
### Interactive Console
|
|
142
|
-
|
|
143
|
-
```bash
|
|
144
|
-
$ bin/console
|
|
145
|
-
# => Interactive prompt for experimentation
|
|
146
|
-
```
|
|
147
|
-
|
|
148
|
-
## 🏗️ Architecture
|
|
149
|
-
|
|
150
|
-
The gem uses a dual-architecture approach:
|
|
151
|
-
|
|
152
|
-
- **Optimized Parser** (`StringToNumber::Parser`): High-performance implementation with caching
|
|
153
|
-
- **Original Implementation** (`StringToNumber::ToNumber`): Reference implementation for compatibility
|
|
154
|
-
|
|
155
|
-
Key performance optimizations:
|
|
156
|
-
- Pre-compiled regex patterns
|
|
157
|
-
- LRU caching with thread-safe access
|
|
158
|
-
- Memoized parser instances
|
|
159
|
-
- Zero-allocation number matching
|
|
160
|
-
|
|
161
|
-
## 🤝 Contributing
|
|
162
|
-
|
|
163
|
-
Bug reports and pull requests are welcome on GitHub at https://github.com/FabienPiette/string_to_number.
|
|
164
|
-
|
|
165
|
-
### Development Process
|
|
166
|
-
|
|
167
|
-
1. Fork the repository
|
|
168
|
-
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
|
|
169
|
-
3. Write tests for your changes
|
|
170
|
-
4. Ensure all tests pass (`rake spec`)
|
|
171
|
-
5. Run performance tests to avoid regressions
|
|
172
|
-
6. Commit your changes (`git commit -am 'Add amazing feature'`)
|
|
173
|
-
7. Push to the branch (`git push origin feature/amazing-feature`)
|
|
174
|
-
8. Open a Pull Request
|
|
175
|
-
|
|
176
|
-
### Code of Conduct
|
|
177
|
-
|
|
178
|
-
This project is intended to be a safe, welcoming space for collaboration. Contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
|
|
179
|
-
|
|
180
|
-
## 📋 Requirements
|
|
181
|
-
|
|
182
|
-
- Ruby 2.5 or higher
|
|
183
|
-
- No external dependencies (uses only Ruby standard library)
|
|
184
|
-
|
|
185
|
-
## 🐛 Troubleshooting
|
|
186
|
-
|
|
187
|
-
### Common Issues
|
|
188
|
-
|
|
189
|
-
**Q: Numbers aren't parsing correctly**
|
|
190
|
-
A: Ensure your input uses proper French number words. Use `valid_french_number?` to validate input.
|
|
80
|
+
For the full API, see the [source documentation](lib/string_to_number.rb).
|
|
191
81
|
|
|
192
|
-
|
|
193
|
-
A: Make sure you're using the default optimized parser. Check cache statistics with `cache_stats`.
|
|
82
|
+
## Known Issues
|
|
194
83
|
|
|
195
|
-
|
|
196
|
-
|
|
84
|
+
- Input must be French number words only — mixed text (e.g. `"il y a vingt personnes"`) is not supported
|
|
85
|
+
- Regional Belgian/Swiss variants (`septante`, `nonante`) are recognized, but coverage may be incomplete
|
|
197
86
|
|
|
198
|
-
##
|
|
87
|
+
## Contributing
|
|
199
88
|
|
|
200
|
-
|
|
89
|
+
Bug reports and pull requests are welcome on [GitHub](https://github.com/FabienPiette/string_to_number).
|
|
201
90
|
|
|
202
|
-
##
|
|
91
|
+
## Acknowledgments
|
|
203
92
|
|
|
204
|
-
|
|
93
|
+
Created by [Fabien Piette](https://github.com/FabienPiette). Thanks to all [contributors](https://github.com/FabienPiette/string_to_number/graphs/contributors).
|
|
205
94
|
|
|
206
|
-
|
|
95
|
+
<p align="center">
|
|
96
|
+
<a href="https://buymeacoffee.com/fabienpiette" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" height="60"></a>
|
|
97
|
+
</p>
|
|
207
98
|
|
|
208
|
-
|
|
209
|
-
- Performance optimizations and enhancements
|
|
210
|
-
- Community contributors and testers
|
|
99
|
+
## License
|
|
211
100
|
|
|
101
|
+
[MIT](LICENSE)
|
data/Rakefile
CHANGED
|
@@ -1,9 +1,13 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
1
3
|
require 'bundler/gem_tasks'
|
|
2
4
|
require 'rspec/core/rake_task'
|
|
5
|
+
require 'rubocop/rake_task'
|
|
3
6
|
|
|
4
7
|
RSpec::Core::RakeTask.new(:spec)
|
|
8
|
+
RuboCop::RakeTask.new
|
|
5
9
|
|
|
6
|
-
task default:
|
|
10
|
+
task default: %i[rubocop spec]
|
|
7
11
|
|
|
8
12
|
task :env, [:env] do |_t, _args|
|
|
9
13
|
require 'string_to_number'
|
data/SECURITY.md
ADDED
|
@@ -0,0 +1,25 @@
|
|
|
1
|
+
# Security Policy
|
|
2
|
+
|
|
3
|
+
## Supported Versions
|
|
4
|
+
|
|
5
|
+
| Version | Supported |
|
|
6
|
+
|---------|--------------------|
|
|
7
|
+
| 0.2.x | :white_check_mark: |
|
|
8
|
+
| < 0.2 | :x: |
|
|
9
|
+
|
|
10
|
+
## Reporting a Vulnerability
|
|
11
|
+
|
|
12
|
+
If you discover a security vulnerability in this gem, please report it responsibly.
|
|
13
|
+
|
|
14
|
+
**Do not open a public GitHub issue.**
|
|
15
|
+
|
|
16
|
+
Instead, please use [GitHub's private vulnerability reporting](https://github.com/FabienPiette/string_to_number/security/advisories/new) to submit your report. You can also email the maintainer directly at the address listed in the gemspec.
|
|
17
|
+
|
|
18
|
+
Please include:
|
|
19
|
+
|
|
20
|
+
- A description of the vulnerability
|
|
21
|
+
- Steps to reproduce
|
|
22
|
+
- Affected versions
|
|
23
|
+
- Any potential impact
|
|
24
|
+
|
|
25
|
+
You should expect an initial response within 7 days. If the vulnerability is accepted, a fix will be released as a patch version and the advisory will be published after the fix is available.
|
data/benchmark.rb
CHANGED
|
@@ -10,8 +10,8 @@ require 'benchmark'
|
|
|
10
10
|
class StringToNumberBenchmark
|
|
11
11
|
# Test data organized by complexity
|
|
12
12
|
TEST_CASES = {
|
|
13
|
-
simple: [
|
|
14
|
-
|
|
13
|
+
simple: %w[
|
|
14
|
+
un vingt cent mille
|
|
15
15
|
],
|
|
16
16
|
medium: [
|
|
17
17
|
'vingt et un', 'deux cent cinquante', 'mille deux cent'
|
|
@@ -20,20 +20,20 @@ class StringToNumberBenchmark
|
|
|
20
20
|
'trois milliards cinq cents millions',
|
|
21
21
|
'soixante-quinze million trois cent quarante six mille sept cent quatre-vingt-dix neuf'
|
|
22
22
|
],
|
|
23
|
-
edge_cases: [
|
|
24
|
-
|
|
23
|
+
edge_cases: %w[
|
|
24
|
+
VINGT une septante quatre-vingts
|
|
25
25
|
]
|
|
26
26
|
}.freeze
|
|
27
27
|
|
|
28
28
|
def self.run_benchmark
|
|
29
|
-
puts
|
|
30
|
-
puts
|
|
29
|
+
puts 'StringToNumber Performance Benchmark'
|
|
30
|
+
puts '=' * 50
|
|
31
31
|
puts "Ruby version: #{RUBY_VERSION}"
|
|
32
32
|
puts "Platform: #{RUBY_PLATFORM}"
|
|
33
33
|
puts
|
|
34
34
|
|
|
35
35
|
# Warm up
|
|
36
|
-
puts
|
|
36
|
+
puts 'Warming up...'
|
|
37
37
|
TEST_CASES.values.flatten.each { |text| StringToNumber.in_numbers(text) }
|
|
38
38
|
puts
|
|
39
39
|
|
|
@@ -41,7 +41,7 @@ class StringToNumberBenchmark
|
|
|
41
41
|
|
|
42
42
|
TEST_CASES.each do |category, test_cases|
|
|
43
43
|
puts "#{category.to_s.capitalize} Numbers:"
|
|
44
|
-
puts
|
|
44
|
+
puts '-' * 30
|
|
45
45
|
|
|
46
46
|
results = benchmark_category(test_cases)
|
|
47
47
|
total_results[category] = results
|
|
@@ -53,46 +53,44 @@ class StringToNumberBenchmark
|
|
|
53
53
|
puts
|
|
54
54
|
|
|
55
55
|
# Show individual case performance for complex numbers
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
avg_ms = (individual_time / 1000) * 1000
|
|
63
|
-
puts " #{index + 1}. #{avg_ms.round(4)}ms - '#{text[0..50]}#{text.length > 50 ? '...' : ''}'"
|
|
56
|
+
next unless category == :complex
|
|
57
|
+
|
|
58
|
+
puts 'Individual case breakdown:'
|
|
59
|
+
test_cases.each_with_index do |text, index|
|
|
60
|
+
individual_time = Benchmark.realtime do
|
|
61
|
+
1000.times { StringToNumber.in_numbers(text) }
|
|
64
62
|
end
|
|
65
|
-
|
|
63
|
+
avg_ms = (individual_time / 1000) * 1000
|
|
64
|
+
puts " #{index + 1}. #{avg_ms.round(4)}ms - '#{text[0..50]}#{'...' if text.length > 50}'"
|
|
66
65
|
end
|
|
66
|
+
puts
|
|
67
67
|
end
|
|
68
68
|
|
|
69
69
|
# Summary
|
|
70
|
-
puts
|
|
71
|
-
puts
|
|
72
|
-
puts
|
|
70
|
+
puts '=' * 50
|
|
71
|
+
puts 'PERFORMANCE SUMMARY'
|
|
72
|
+
puts '=' * 50
|
|
73
73
|
|
|
74
74
|
total_results.each do |category, results|
|
|
75
75
|
status = case results[:avg_time_ms]
|
|
76
|
-
when 0..0.1 then
|
|
77
|
-
when 0.1..0.5 then
|
|
78
|
-
when 0.5..1.0 then
|
|
79
|
-
else
|
|
76
|
+
when 0..0.1 then '🟢 Excellent'
|
|
77
|
+
when 0.1..0.5 then '🟡 Good'
|
|
78
|
+
when 0.5..1.0 then '🟠 Acceptable'
|
|
79
|
+
else '🔴 Needs optimization'
|
|
80
80
|
end
|
|
81
|
-
|
|
81
|
+
|
|
82
82
|
puts "#{category.to_s.capitalize.ljust(12)} #{status.ljust(15)} #{results[:avg_time_ms].round(4)}ms avg"
|
|
83
83
|
end
|
|
84
84
|
|
|
85
85
|
puts
|
|
86
|
-
puts
|
|
86
|
+
puts 'Memory efficiency test...'
|
|
87
87
|
test_memory_usage
|
|
88
88
|
|
|
89
89
|
puts
|
|
90
|
-
puts
|
|
90
|
+
puts 'Scalability test...'
|
|
91
91
|
test_scalability
|
|
92
92
|
end
|
|
93
93
|
|
|
94
|
-
private
|
|
95
|
-
|
|
96
94
|
def self.benchmark_category(test_cases, iterations = 2000)
|
|
97
95
|
total_time = Benchmark.realtime do
|
|
98
96
|
test_cases.each do |text|
|
|
@@ -130,7 +128,7 @@ class StringToNumberBenchmark
|
|
|
130
128
|
|
|
131
129
|
puts "Object creation: #{object_growth} new objects (#{object_growth > 1000 ? '🔴 High' : '🟢 Low'})"
|
|
132
130
|
else
|
|
133
|
-
puts
|
|
131
|
+
puts 'Memory tracking not available on this platform'
|
|
134
132
|
end
|
|
135
133
|
end
|
|
136
134
|
|
|
@@ -138,27 +136,32 @@ class StringToNumberBenchmark
|
|
|
138
136
|
# Test how performance scales with input complexity
|
|
139
137
|
inputs = [
|
|
140
138
|
'un', # 2 chars
|
|
141
|
-
'vingt et un', # 11 chars
|
|
139
|
+
'vingt et un', # 11 chars
|
|
142
140
|
'mille deux cent trente-quatre', # 29 chars
|
|
143
141
|
'trois milliards cinq cents millions deux cent mille et une' # 58 chars
|
|
144
142
|
]
|
|
145
143
|
|
|
146
|
-
puts
|
|
147
|
-
|
|
144
|
+
puts 'Input length vs. performance:'
|
|
145
|
+
|
|
148
146
|
results = inputs.map do |input|
|
|
149
147
|
time = Benchmark.realtime do
|
|
150
148
|
1000.times { StringToNumber.in_numbers(input) }
|
|
151
149
|
end
|
|
152
150
|
avg_ms = (time / 1000) * 1000
|
|
153
|
-
|
|
151
|
+
|
|
154
152
|
{ length: input.length, time: avg_ms, input: input }
|
|
155
153
|
end
|
|
156
154
|
|
|
157
155
|
results.each do |result|
|
|
158
156
|
complexity_ratio = result[:time] / results.first[:time]
|
|
159
|
-
status = complexity_ratio < 5
|
|
160
|
-
|
|
161
|
-
|
|
157
|
+
status = if complexity_ratio < 5
|
|
158
|
+
'🟢'
|
|
159
|
+
else
|
|
160
|
+
complexity_ratio < 10 ? '🟡' : '🔴'
|
|
161
|
+
end
|
|
162
|
+
|
|
163
|
+
puts " #{result[:length].to_s.rjust(2)} chars: #{result[:time].round(4)}ms #{status} " \
|
|
164
|
+
"(#{complexity_ratio.round(1)}x baseline)"
|
|
162
165
|
end
|
|
163
166
|
|
|
164
167
|
# Check if performance degrades reasonably
|
|
@@ -172,6 +175,4 @@ class StringToNumberBenchmark
|
|
|
172
175
|
end
|
|
173
176
|
|
|
174
177
|
# Run the benchmark
|
|
175
|
-
if __FILE__ == $
|
|
176
|
-
StringToNumberBenchmark.run_benchmark
|
|
177
|
-
end
|
|
178
|
+
StringToNumberBenchmark.run_benchmark if __FILE__ == $PROGRAM_NAME
|
data/docs/ARCHITECTURE.md
ADDED
|
@@ -0,0 +1,131 @@
|
|
|
1
|
+
# Architecture
|
|
2
|
+
|
|
3
|
+
This document describes the high-level architecture of StringToNumber.
|
|
4
|
+
If you want to familiarize yourself with the codebase, you are in the
|
|
5
|
+
right place.
|
|
6
|
+
|
|
7
|
+
## Bird's Eye View
|
|
8
|
+
|
|
9
|
+
StringToNumber converts French written numbers into Ruby integers. A
|
|
10
|
+
string like `"deux millions trois cent mille"` goes in, and `2_300_000`
|
|
11
|
+
comes out.
|
|
12
|
+
|
|
13
|
+
The conversion pipeline is:
|
|
14
|
+
|
|
15
|
+
1. **Normalize** — downcase and strip whitespace
|
|
16
|
+
2. **Cache lookup** — return immediately if this string was converted before
|
|
17
|
+
3. **Parse** — recursively decompose the French phrase into factor/multiplier
|
|
18
|
+
pairs (e.g. `cinq` × `cent` = 500), then sum the parts
|
|
19
|
+
4. **Cache store** — save the result for future lookups
|
|
20
|
+
|
|
21
|
+
The parsing relies on two data tables: direct word-to-value mappings
|
|
22
|
+
(`WORD_VALUES`) and power-of-ten multipliers (`MULTIPLIERS`). French
|
|
23
|
+
number grammar has irregular patterns — especially the `quatre-vingt`
|
|
24
|
+
(4×20) family — that require dedicated regex handling.
|
|
25
|
+
|
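The four pipeline steps described in this section can be sketched in a few lines of Ruby. This is a toy illustration under stated assumptions, not the gem's code and not part of the diffed file: the `TOY_*` tables and `toy_*` method names are hypothetical, the tables hold only a handful of words, the cache is a plain hash, and the irregular `quatre-vingt` family is deliberately not handled.

```ruby
# frozen_string_literal: true

# Hypothetical miniature tables; the gem's real tables are much larger.
TOY_WORD_VALUES = {
  'un' => 1, 'deux' => 2, 'trois' => 3, 'cinq' => 5, 'vingt' => 20
}.freeze
# Multiplier words mapped to their power-of-ten exponent.
TOY_MULTIPLIERS = { 'cent' => 2, 'mille' => 3, 'million' => 6 }.freeze
TOY_CACHE = {} # plain hash standing in for the conversion cache

# Exponent for a multiplier word (singular or plural), or nil.
def toy_exponent(token)
  TOY_MULTIPLIERS[token] || TOY_MULTIPLIERS[token.chomp('s')]
end

# Recursively split on the largest multiplier:
# value = factor * 10**exponent + remainder.
def toy_parse(tokens)
  return 0 if tokens.empty?

  index = tokens.each_index.select { |i| toy_exponent(tokens[i]) }
                .max_by { |i| toy_exponent(tokens[i]) }
  # No multiplier left: sum the plain word values.
  return tokens.sum { |t| TOY_WORD_VALUES.fetch(t) } if index.nil?

  factor = index.zero? ? 1 : toy_parse(tokens[0...index])
  factor * (10**toy_exponent(tokens[index])) + toy_parse(tokens[(index + 1)..])
end

def toy_in_numbers(text)
  key = text.downcase.strip                     # 1. normalize
  return TOY_CACHE[key] if TOY_CACHE.key?(key)  # 2. cache lookup

  tokens = key.split(/[\s-]+/) - ['et']
  TOY_CACHE[key] = toy_parse(tokens)            # 3. parse, 4. cache store
end
```

With these toy tables, `toy_in_numbers('cinq cent')` evaluates `5 * 10**2`, matching the `cinq` × `cent` = 500 example in the text.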
|
26
|
+
## Code Map
|
|
27
|
+
|
|
28
|
+
### `lib/string_to_number.rb`
|
|
29
|
+
|
|
30
|
+
Public API module. Exposes three class methods: `in_numbers`,
|
|
31
|
+
`clear_caches!`, and `cache_stats`. Also provides `valid_french_number?`
|
|
32
|
+
for input validation.
|
|
33
|
+
|
|
34
|
+
This file is the only entry point consumers interact with. It delegates
|
|
35
|
+
to either `Parser` or `ToNumber` based on the `use_optimized` flag
|
|
36
|
+
(default: `true`).
|
|
37
|
+
|
|
38
|
+
Key methods: `StringToNumber.in_numbers`, `StringToNumber.valid_french_number?`.
|
|
39
|
+
|
|
40
|
+
### `lib/string_to_number/parser.rb`
|
|
41
|
+
|
|
42
|
+
High-performance parser. Owns all caching logic (LRU conversion cache,
|
|
43
|
+
instance cache) and thread-safety (two mutexes: `@cache_mutex` for
|
|
44
|
+
conversions, `@instance_mutex` for parser instances).
|
|
45
|
+
|
|
46
|
+
The parsing algorithm mirrors `ToNumber`'s recursive extraction but
|
|
47
|
+
operates on pre-compiled regex patterns (`MULTIPLIER_PATTERN`,
|
|
48
|
+
`QUATRE_VINGT_PATTERN`) instead of building them per call.
|
|
49
|
+
|
|
50
|
+
Key types: `Parser.convert` (class-level entry point),
|
|
51
|
+
`Parser#parse_optimized` and `Parser#extract_optimized` (recursive core).
|
|
52
|
+
|
|
53
|
+
**Architecture Invariant:** `Parser` imports data tables from `ToNumber`
|
|
54
|
+
(`WORD_VALUES = ToNumber::EXCEPTIONS`) but never calls `ToNumber` methods.
|
|
55
|
+
The dependency is data-only.
|
|
56
|
+
|
|
57
|
+
### `lib/string_to_number/to_number.rb`
|
|
58
|
+
|
|
59
|
+
Original (legacy) implementation. Owns the canonical data tables:
|
|
60
|
+
`EXCEPTIONS` (word-to-value map for 0–90 plus regional variants) and
|
|
61
|
+
`POWERS_OF_TEN` (multiplier words to exponent values, up to `googol`).
|
|
62
|
+
|
|
63
|
+
Uses the same recursive `extract`/`match` algorithm as `Parser` but
|
|
64
|
+
rebuilds regex patterns on every instantiation and has no caching.
|
|
65
|
+
|
|
66
|
+
Key types: `ToNumber::EXCEPTIONS`, `ToNumber::POWERS_OF_TEN`,
|
|
67
|
+
`ToNumber#to_number`.
|
|
68
|
+
|
|
69
|
+
**Architecture Invariant:** `ToNumber` has no knowledge of `Parser`.
|
|
70
|
+
It must remain independently functional for backward compatibility.
|
|
71
|
+
|
|
72
|
+
### `lib/string_to_number/version.rb`
|
|
73
|
+
|
|
74
|
+
Single constant `StringToNumber::VERSION`. Updated before gem releases.
|
|
75
|
+
|
|
76
|
+
### `spec/`
|
|
77
|
+
|
|
78
|
+
RSpec test suites. `string_to_number_spec.rb` covers correctness across
|
|
79
|
+
number ranges (0–9, 10–19, 20–29, ..., millions). `performance_spec.rb`
|
|
80
|
+
validates throughput thresholds.
|
|
81
|
+
|
|
82
|
+
### `benchmark.rb`, `microbenchmark.rb`, `profile.rb`, `performance_comparison.rb`
|
|
83
|
+
|
|
84
|
+
Standalone scripts for measuring and profiling performance. Not part of
|
|
85
|
+
the gem distribution (excluded by gemspec).
|
|
86
|
+
|
|
87
|
+
## Invariants
|
|
88
|
+
|
|
89
|
+
`Parser` depends on `ToNumber` for data constants only, never for
|
|
90
|
+
parsing logic. This keeps the legacy implementation independently
|
|
91
|
+
testable while the optimized path reuses proven word mappings.
|
|
92
|
+
|
|
93
|
+
All shared mutable state lives in `Parser`'s class-level instance
|
|
94
|
+
variables and is accessed exclusively through `@cache_mutex` or
|
|
95
|
+
`@instance_mutex`. No other module holds mutable state.
|
|
96
|
+
|
|
97
|
+
The `EXCEPTIONS` and `POWERS_OF_TEN` hashes in `ToNumber` are frozen.
|
|
98
|
+
Any new word mapping must be added there — `Parser` inherits changes
|
|
99
|
+
automatically.
|
|
100
|
+
|
|
101
|
+
Load order matters: `to_number.rb` must be required before `parser.rb`
|
|
102
|
+
because `Parser` references `ToNumber::EXCEPTIONS` and
|
|
103
|
+
`ToNumber::POWERS_OF_TEN` at class body evaluation time.
|
|
104
|
+
|
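The load-order and frozen-table invariants can be illustrated with two stand-in classes. This is a sketch only, not part of the diffed file: `ToyToNumber` and `ToyParser` are hypothetical names, and the table holds just two entries.

```ruby
# frozen_string_literal: true

# Stand-in for the class that owns the canonical, frozen data table.
class ToyToNumber
  EXCEPTIONS = { 'un' => 1, 'septante' => 70 }.freeze
end

# Stand-in for the parser. The constant below is resolved while the class
# body is being evaluated, so ToyToNumber must already be defined here.
class ToyParser
  WORD_VALUES = ToyToNumber::EXCEPTIONS
end

# Both constants point at the same frozen object, so the dependency is
# data-only and new mappings must be added where the table is defined.
same_object = ToyParser::WORD_VALUES.equal?(ToyToNumber::EXCEPTIONS)

# Writing through the alias fails because the shared hash is frozen.
write_rejected =
  begin
    ToyParser::WORD_VALUES['huitante'] = 80
    false
  rescue FrozenError
    true
  end
```

Swapping the order of the two class definitions would raise `NameError` at load time, which is the failure the load-order rule guards against.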
|
105
|
+
## Cross-Cutting Concerns
|
|
106
|
+
|
|
107
|
+
**Caching.** Two layers in `Parser`: an LRU conversion cache (string →
|
|
108
|
+
integer, capped at 1000 entries) and an instance cache (string → Parser
|
|
109
|
+
object). Both are thread-safe. Call `StringToNumber.clear_caches!` to
|
|
110
|
+
reset.
|
|
111
|
+
|
|
112
|
+
**Thread safety.** Achieved through two separate mutexes to reduce
|
|
113
|
+
contention — one for conversion results, one for parser instances.
|
|
114
|
+
`ToNumber` is not thread-safe but is stateless per call, so concurrent
|
|
115
|
+
use is safe in practice.
|
|
116
|
+
|
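A mutex-protected LRU cache of the kind described here could look roughly like the following. This is a minimal sketch, not part of the diffed file and not the gem's implementation: `ToyLruCache` is a hypothetical class, and it relies on Ruby hashes preserving insertion order so that the first entry is always the least recently used.

```ruby
# frozen_string_literal: true

# Hypothetical LRU cache guarded by a single mutex.
class ToyLruCache
  def initialize(max_size)
    @max_size = max_size
    @store = {}
    @mutex = Mutex.new
  end

  # Returns the cached value for key; on a miss, computes it with the
  # block, stores it, and evicts the oldest entries beyond max_size.
  def fetch(key)
    @mutex.synchronize do
      if @store.key?(key)
        # Delete and re-insert to mark the key as most recently used.
        @store[key] = @store.delete(key)
        return @store[key]
      end
    end

    value = yield(key) # computed outside the lock to keep it short

    @mutex.synchronize do
      @store.delete(key)
      @store[key] = value
      @store.shift while @store.size > @max_size
    end
    value
  end

  def size
    @mutex.synchronize { @store.size }
  end
end
```

Computing the value outside the lock means two threads may occasionally compute the same key at once; for a pure conversion that is harmless duplicate work, and it keeps the critical section small.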
|
117
|
+
**Testing.** RSpec with two suites: correctness (`spec/string_to_number_spec.rb`)
|
|
118
|
+
and performance (`spec/performance_spec.rb`). Run both via `rake spec`
|
|
119
|
+
or individually with `bundle exec rspec <file>`.
|
|
120
|
+
|
|
121
|
+
## A Typical Change
|
|
122
|
+
|
|
123
|
+
**Adding a new French number word** (e.g., a regional variant):
|
|
124
|
+
|
|
125
|
+
1. Add the word-to-value mapping in `ToNumber::EXCEPTIONS` or
|
|
126
|
+
`ToNumber::POWERS_OF_TEN` in `lib/string_to_number/to_number.rb`
|
|
127
|
+
2. Add test cases in `spec/string_to_number_spec.rb`
|
|
128
|
+
3. Run `rake spec` — `Parser` picks up the new mapping automatically
|
|
129
|
+
since it references `ToNumber`'s constants
|
|
130
|
+
|
|
131
|
+
No changes to `parser.rb` or `string_to_number.rb` are needed.
|
data/docs/demo.gif
ADDED
|
Binary file
|