string_to_number 0.2.1 → 0.3.0

This diff shows the contents of publicly released package versions as they appear in their public registry. It is provided for informational purposes only.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: a97e2a69e72adda0aeedace8ab5a9ddae63c1a4741846d6d405a0ef767bdaaaa
-   data.tar.gz: 78553f4c606a717c8bab939e24b18ddd023a8962b902d2fea75b03498b2e3e0a
+   metadata.gz: 336f985b755b77d9518187442a09e117175a12bcc8658abd46a98f0180efd5e3
+   data.tar.gz: acd54764a6df32c72067457f22de32f28a57a3c30c571ca608aba48bd9aa1f63
  SHA512:
-   metadata.gz: 1042d45f537afc86b43883d29a3ef03ffd3ddc365a37c85843bd78fa54c4d35178c140e32c025d7406681dc31d77c06b7f7e526e4255fef1629a468a7ddf1958
-   data.tar.gz: 1f6b2b1e8dc6c17bf72d0ce759da2fe31dd9771188ae6087b23774b06e55f9678ee8bcc049c611ff8b0a68d9b99c8b3cb3088e28f342ae1f0d842cd618cf81da
+   metadata.gz: 8ea31d03afc64fa5400ac6609748bf14e15916bdcc27a605e0e399d170b0db7f2be09b82c46ca43fc43eb5e2bb6ae71e4276eebc97e1b6356f43d230ef320c9a
+   data.tar.gz: 8b962962941a065def976540572e47534fe82ac270f2a5fd9121e640f82fc66aefa7270bedd068cdf02c32f66fd8307a013fe9b4382f6d8466f25f5a78c87e51
@@ -2,9 +2,7 @@ name: CI
 
  on:
    push:
-     branches: [ master, main ]
    pull_request:
-     branches: [ master, main ]
 
  jobs:
    test:
@@ -0,0 +1,62 @@
+ name: Release
+
+ on:
+   push:
+     tags:
+       - 'v*'
+   workflow_dispatch:
+     inputs:
+       tag:
+         description: 'Release tag (e.g., v1.0.0)'
+         required: true
+
+ jobs:
+   test:
+     runs-on: ubuntu-latest
+     strategy:
+       matrix:
+         ruby-version: ['2.7', '3.0', '3.1', '3.2', '3.3']
+
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Set up Ruby ${{ matrix.ruby-version }}
+         uses: ruby/setup-ruby@v1
+         with:
+           ruby-version: ${{ matrix.ruby-version }}
+           bundler-cache: true
+
+       - name: Run tests
+         run: bundle exec rake spec
+
+   publish:
+     needs: test
+     runs-on: ubuntu-latest
+     environment: release
+     permissions:
+       contents: write
+       id-token: write
+
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Set up Ruby
+         uses: ruby/setup-ruby@v1
+         with:
+           ruby-version: '3.3'
+           bundler-cache: true
+
+       - name: Build gem
+         run: gem build string_to_number.gemspec
+
+       - name: Configure RubyGems credentials
+         uses: rubygems/configure-rubygems-credentials@v1.0.0
+
+       - name: Publish to RubyGems
+         run: gem push string_to_number-*.gem
+
+       - name: Create GitHub Release
+         env:
+           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+           TAG: ${{ github.ref_name }}
+         run: gh release create "$TAG" --generate-notes
data/CLAUDE.md CHANGED
@@ -4,100 +4,38 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
4
4
 
5
5
  ## Project Overview
6
6
 
7
- This is a Ruby gem that converts French words into numbers. The gem provides a single public method `StringToNumber.in_numbers(french_string)` that parses French number words and returns their numeric equivalent.
7
+ Ruby gem that converts French written numbers into integers. Single public entry point: `StringToNumber.in_numbers(french_string)`.
8
8
 
9
- ## Core Architecture
10
-
11
- - **Main module**: `StringToNumber` in `lib/string_to_number.rb` - provides the public API
12
- - **Optimized parser**: `StringToNumber::Parser` in `lib/string_to_number/parser.rb` - high-performance implementation
13
- - **Original implementation**: `StringToNumber::ToNumber` in `lib/string_to_number/to_number.rb` - legacy compatibility
14
- - **Version**: `StringToNumber::VERSION` in `lib/string_to_number/version.rb`
15
-
16
- ### Parser Architecture
17
-
18
- The optimized parser uses:
19
- - **WORD_VALUES**: Direct French word to number mappings (0-90, including regional variants)
20
- - **MULTIPLIERS**: Power-of-ten multipliers (cent=2, mille=3, million=6, etc.)
21
- - **Pre-compiled regex patterns**: Eliminate compilation overhead
22
- - **Multi-level caching**: Instance cache + LRU conversion cache
23
- - **Thread-safe design**: Concurrent access with mutex protection
24
-
25
- The algorithm maintains the proven recursive parsing logic from the original while adding:
26
- - Memoization for repeated conversions
27
- - Instance caching to reduce initialization costs
28
- - Optimized string operations and hash lookups
29
-
30
- ## Common Development Commands
9
+ ## Commands
31
10
 
32
11
  ```bash
33
- # Install dependencies
34
- bundle install
35
-
36
- # Run tests
37
- rake spec
38
-
39
- # Run specific test
40
- bundle exec rspec spec/string_to_number_spec.rb
41
-
42
- # Start interactive console
43
- rake console
44
- # or
45
- bundle exec irb -I lib -r string_to_number
46
-
47
- # Install gem locally
48
- bundle exec rake install
49
-
50
- # Release new version (updates version.rb, creates git tag, pushes to rubygems)
51
- bundle exec rake release
12
+ bundle install # Install dependencies
13
+ rake spec # Run all tests (correctness + performance)
14
+ bundle exec rspec spec/string_to_number_spec.rb # Correctness tests only
15
+ bundle exec rspec spec/performance_spec.rb # Performance tests only
16
+ bundle exec rspec spec/string_to_number_spec.rb -e "vingt" # Run single test by name
17
+ rake rubocop # Lint (or: bundle exec rubocop)
18
+ rake # Default: rubocop + spec
19
+ rake console # Interactive REPL with gem loaded
52
20
  ```
53
21
 
54
- ## Testing
55
-
56
- Uses RSpec with comprehensive test coverage for French number parsing from 0 to millions. Tests are organized by number ranges (0-9, 10-19, 20-29, etc.) and include complex multi-word numbers.
57
-
58
- ### Performance Testing
59
-
60
- Performance tests are available to measure and monitor the implementation's efficiency:
22
+ ## Architecture
61
23
 
62
- ```bash
63
- # Run comprehensive performance test suite
64
- bundle exec rspec spec/performance_spec.rb
65
-
66
- # Run standalone benchmark script
67
- ruby -I lib benchmark.rb
24
+ Three files under `lib/` matter:
68
25
 
69
- # Run micro-benchmarks to identify bottlenecks
70
- ruby -I lib microbenchmark.rb
71
-
72
- # Run profiling analysis
73
- ruby -I lib profile.rb
74
- ```
26
+ - `lib/string_to_number.rb` public API module. Delegates to `Parser` (default) or `ToNumber` (legacy, via `use_optimized: false`).
27
+ - `lib/string_to_number/parser.rb` — optimized implementation. Owns all caching (LRU + instance cache) and thread-safety (two mutexes). Imports data tables from `ToNumber` but never calls its methods.
28
+ - `lib/string_to_number/to_number.rb` — legacy implementation. Owns the canonical data tables: `EXCEPTIONS` (word→value) and `POWERS_OF_TEN` (multiplier→exponent). Must be loaded before `parser.rb`.
75
29
 
76
- **Performance Characteristics (Optimized Implementation):**
77
- - Simple numbers (0-100): ~0.001ms average, 800,000+ conversions/sec
78
- - Medium complexity (100-1000): ~0.001ms average, 780,000+ conversions/sec
79
- - Complex numbers (1000+): ~0.002ms average, 690,000+ conversions/sec
80
- - Exceptional scalability: minimal performance degradation with input length
81
- - Memory efficient: zero object creation during operation
82
- - Intelligent caching: repeated conversions benefit from memoization
30
+ **Key constraint:** New word mappings go in `ToNumber::EXCEPTIONS` or `ToNumber::POWERS_OF_TEN` — `Parser` inherits them automatically. Don't duplicate data between the two implementations.
83
31
 
84
- **Performance Improvements:**
85
- - **14-460x faster** than original implementation across all test cases
86
- - **Excellent scalability**: 1.3x degradation vs 43x in original
87
- - **Pre-compiled regex patterns** eliminate compilation overhead
88
- - **Instance caching** reduces initialization costs
89
- - **Memoization** speeds up repeated conversions
90
- - **Thread-safe** with concurrent performance >2M conversions/sec
32
+ Both parsers use the same recursive algorithm: decompose input into `factor × multiplier` pairs via regex, with special-case handling for `quatre-vingt` forms.
91
33
 
92
- **Usage Options:**
93
- ```ruby
94
- # Use optimized implementation (default)
95
- StringToNumber.in_numbers('vingt et un')
34
+ See `docs/ARCHITECTURE.md` for the full code map and invariants.
96
35
 
97
- # Use original implementation for compatibility
98
- StringToNumber.in_numbers('vingt et un', use_optimized: false)
36
+ ## Style
99
37
 
100
- # Cache management
101
- StringToNumber.clear_caches!
102
- StringToNumber.cache_stats
103
- ```
38
+ - RuboCop enforced (`rake rubocop`). Config in `.rubocop.yml`.
39
+ - Single quotes for strings. `frozen_string_literal: true` in every file.
40
+ - Max line length: 120 (specs exempt).
41
+ - Target Ruby: 2.7+.
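
Reviewer note on the "Key constraint" in the new CLAUDE.md: the data-only dependency it describes can be pictured with a minimal sketch. The module names `SketchLegacy` and `SketchParser` and the table contents are hypothetical stand-ins for `ToNumber` and `Parser`; only the aliasing idea is taken from the diff.

```ruby
# Hypothetical stand-in for ToNumber: owns the canonical, frozen table.
module SketchLegacy
  EXCEPTIONS = { 'un' => 1, 'vingt' => 20, 'septante' => 70 }.freeze
end

# Hypothetical stand-in for Parser: aliases the table instead of copying it.
# SketchLegacy must already be defined when this body is evaluated,
# which is why load order matters.
module SketchParser
  WORD_VALUES = SketchLegacy::EXCEPTIONS
end

# Both constants point at the same frozen Hash object.
puts SketchParser::WORD_VALUES.equal?(SketchLegacy::EXCEPTIONS)
puts SketchParser::WORD_VALUES['septante']
```

Because the second constant is an alias rather than a copy, a word added to the owning table is visible through both names without touching the second module.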
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- string_to_number (0.2.1)
4
+ string_to_number (0.3.0)
5
5
 
6
6
  GEM
7
7
  remote: https://rubygems.org/
data/README.md CHANGED
@@ -1,58 +1,53 @@
1
- <div align="center">
2
- <img src="logo.png" alt="StringToNumber Logo" width="200" height="200">
3
-
4
- # StringToNumber
5
-
6
- [![Gem Version](https://badge.fury.io/rb/string_to_number.svg)](https://badge.fury.io/rb/string_to_number)
7
- [![Ruby](https://github.com/FabienPiette/string_to_number/workflows/Ruby/badge.svg)](https://github.com/FabienPiette/string_to_number/actions)
8
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
9
- </div>
10
-
11
- A high-performance Ruby gem for converting French written numbers into their numeric equivalents. Features intelligent caching, thread-safe operations, and support for complex French number formats.
12
-
13
- ## ✨ Features
14
-
15
- - **High Performance**: Up to 460x faster than naive implementations with intelligent caching
16
- - **Thread-Safe**: Concurrent access support with proper locking mechanisms
17
- - **Comprehensive**: Handles complex French number formats including:
18
- - Basic numbers (zéro, un, deux...)
19
- - Compound numbers (vingt et un, quatre-vingt-quatorze...)
20
- - Large numbers (millions, milliards, billions...)
21
- - Special cases (quatre-vingts, soixante-dix...)
22
- - **Memory Efficient**: LRU cache with configurable limits
23
- - **Backward Compatible**: Maintains compatibility with original implementation
24
-
25
- ## 🚀 Performance
26
-
27
- | Input Size | Original | Optimized | Improvement |
28
- |------------|----------|-----------|-------------|
29
- | Short | 0.5ms | 0.035ms | **14x** |
30
- | Medium | 2.1ms | 0.045ms | **47x** |
31
- | Long | 23ms | 0.05ms | **460x** |
32
-
33
- ## 📦 Installation
34
-
35
- Add this line to your application's Gemfile:
1
+ # StringToNumber
2
+
3
+ <p align="center">
4
+ <a href="https://badge.fury.io/rb/string_to_number"><img src="https://badge.fury.io/rb/string_to_number.svg" alt="Gem Version"></a>
5
+ <a href="https://github.com/FabienPiette/string_to_number/actions"><img src="https://github.com/FabienPiette/string_to_number/workflows/CI/badge.svg" alt="CI"></a>
6
+ <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License: MIT"></a>
7
+ </p>
8
+
9
+ Convert French written numbers into their numeric equivalents in Ruby.
10
+
11
+ <p align="center">
12
+ <img src="docs/demo.gif" alt="goscribe demo" width="800">
13
+ </p>
14
+
15
+ ## Quick Start
36
16
 
37
17
  ```ruby
38
- gem 'string_to_number'
18
+ gem 'string_to_number' # Add to your Gemfile, then: bundle install
39
19
  ```
40
20
 
41
- And then execute:
21
+ ```ruby
22
+ require 'string_to_number'
42
23
 
43
- ```bash
44
- $ bundle install
24
+ StringToNumber.in_numbers('vingt et un') #=> 21
25
+ StringToNumber.in_numbers('mille deux cent trente-quatre') #=> 1234
26
+ StringToNumber.in_numbers('trois milliards') #=> 3_000_000_000
45
27
  ```
46
28
 
47
- Or install it yourself as:
29
+ ## Features
30
+
31
+ - **Fast** — 14-460x faster than naive recursive parsing, via pre-compiled patterns and LRU caching
32
+ - **Complete** — handles all standard French number words from `zéro` to `billions`, including compound forms (`quatre-vingt-quatorze`, `soixante-dix`)
33
+ - **Thread-safe** — concurrent access with mutex-protected caches; >2M conversions/sec under contention
34
+ - **Zero dependencies** — pure Ruby, no external gems required
35
+
36
+ ## Install
37
+
38
+ **Prerequisites:** Ruby 2.7+
48
39
 
49
40
  ```bash
50
- $ gem install string_to_number
41
+ gem install string_to_number
51
42
  ```
52
43
 
53
- ## 🔧 Usage
44
+ Or in your Gemfile:
45
+
46
+ ```ruby
47
+ gem 'string_to_number'
48
+ ```
54
49
 
55
- ### Basic Usage
50
+ ## Usage
56
51
 
57
52
  ```ruby
58
53
  require 'string_to_number'
@@ -63,153 +58,44 @@ StringToNumber.in_numbers('quinze') #=> 15
63
58
  StringToNumber.in_numbers('cent') #=> 100
64
59
 
65
60
  # Compound numbers
66
- StringToNumber.in_numbers('vingt et un') #=> 21
67
- StringToNumber.in_numbers('quatre-vingt-quatorze') #=> 94
68
- StringToNumber.in_numbers('soixante-dix') #=> 70
61
+ StringToNumber.in_numbers('vingt et un') #=> 21
62
+ StringToNumber.in_numbers('quatre-vingt-quatorze') #=> 94
63
+ StringToNumber.in_numbers('neuf mille neuf cent quatre-vingt-dix-neuf') #=> 9999
69
64
 
70
65
  # Large numbers
71
- StringToNumber.in_numbers('mille deux cent trente-quatre') #=> 1234
72
- StringToNumber.in_numbers('un million') #=> 1_000_000
73
- StringToNumber.in_numbers('trois milliards') #=> 3_000_000_000
74
-
75
- # Complex expressions
76
- StringToNumber.in_numbers('neuf mille neuf cent quatre-vingt-dix-neuf') #=> 9999
77
- StringToNumber.in_numbers('deux millions trois cent mille') #=> 2_300_000
66
+ StringToNumber.in_numbers('un million') #=> 1_000_000
67
+ StringToNumber.in_numbers('deux millions trois cent mille') #=> 2_300_000
78
68
  ```
79
69
 
80
- ### Advanced Features
70
+ ### Validation and cache management
81
71
 
82
72
  ```ruby
83
- # Validation
84
73
  StringToNumber.valid_french_number?('vingt et un') #=> true
85
74
  StringToNumber.valid_french_number?('hello world') #=> false
86
75
 
87
- # Cache management
88
- StringToNumber.clear_caches! # Clear all internal caches
89
- stats = StringToNumber.cache_stats
90
- puts "Cache hit ratio: #{stats[:cache_hit_ratio]}"
91
-
92
- # Backward compatibility mode
93
- StringToNumber.in_numbers('cent', use_optimized: false) #=> 100
94
- ```
95
-
96
- ### Supported Number Formats
97
-
98
- | Range | Examples |
99
- |-------|----------|
100
- | 0-19 | zéro, un, deux, trois, quatre, cinq, six, sept, huit, neuf, dix, onze, douze, treize, quatorze, quinze, seize, dix-sept, dix-huit, dix-neuf |
101
- | 20-99 | vingt, trente, quarante, cinquante, soixante, soixante-dix, quatre-vingts, quatre-vingt-dix |
102
- | 100+ | cent, mille, million, milliard, billion |
103
- | Compounds | vingt et un, quatre-vingt-quatorze, deux mille trois |
104
-
105
- ## ⚡ Performance Tips
106
-
107
- 1. **Reuse conversions**: The gem automatically caches results for better performance
108
- 2. **Batch processing**: Use the optimized parser (default) for better throughput
109
- 3. **Memory management**: Call `clear_caches!` periodically if processing many unique inputs
110
- 4. **Thread safety**: The gem is thread-safe and can be used in concurrent environments
111
-
112
- ## 🧪 Development
113
-
114
- After checking out the repo, run `bin/setup` to install dependencies:
115
-
116
- ```bash
117
- $ git clone https://github.com/FabienPiette/string_to_number.git
118
- $ cd string_to_number
119
- $ bin/setup
76
+ StringToNumber.cache_stats # inspect cache hit ratios
77
+ StringToNumber.clear_caches! # free cached data
120
78
  ```
121
79
 
122
- ### Running Tests
123
-
124
- ```bash
125
- # Run all tests
126
- $ rake spec
127
-
128
- # Run performance tests
129
- $ ruby benchmark.rb
130
-
131
- # Run specific test files
132
- $ rspec spec/string_to_number_spec.rb
133
- ```
134
-
135
- ### Performance Benchmarking
136
-
137
- ```bash
138
- # Compare implementations
139
- $ ruby performance_comparison.rb
140
-
141
- # Detailed micro-benchmarks
142
- $ ruby microbenchmark.rb
143
- ```
144
-
145
- ### Interactive Console
146
-
147
- ```bash
148
- $ bin/console
149
- # => Interactive prompt for experimentation
150
- ```
151
-
152
- ## 🏗️ Architecture
153
-
154
- The gem uses a dual-architecture approach:
155
-
156
- - **Optimized Parser** (`StringToNumber::Parser`): High-performance implementation with caching
157
- - **Original Implementation** (`StringToNumber::ToNumber`): Reference implementation for compatibility
158
-
159
- Key performance optimizations:
160
- - Pre-compiled regex patterns
161
- - LRU caching with thread-safe access
162
- - Memoized parser instances
163
- - Zero-allocation number matching
164
-
165
- ## 🤝 Contributing
166
-
167
- Bug reports and pull requests are welcome on GitHub at https://github.com/FabienPiette/string_to_number.
168
-
169
- ### Development Process
170
-
171
- 1. Fork the repository
172
- 2. Create your feature branch (`git checkout -b feature/amazing-feature`)
173
- 3. Write tests for your changes
174
- 4. Ensure all tests pass (`rake spec`)
175
- 5. Run performance tests to avoid regressions
176
- 6. Commit your changes (`git commit -am 'Add amazing feature'`)
177
- 7. Push to the branch (`git push origin feature/amazing-feature`)
178
- 8. Open a Pull Request
179
-
180
- ### Code of Conduct
181
-
182
- This project is intended to be a safe, welcoming space for collaboration. Contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
183
-
184
- ## 📋 Requirements
185
-
186
- - Ruby 2.5 or higher
187
- - No external dependencies (uses only Ruby standard library)
188
-
189
- ## 🐛 Troubleshooting
190
-
191
- ### Common Issues
192
-
193
- **Q: Numbers aren't parsing correctly**
194
- A: Ensure your input uses proper French number words. Use `valid_french_number?` to validate input.
80
+ For the full API, see the [source documentation](lib/string_to_number.rb).
195
81
 
196
- **Q: Performance seems slow**
197
- A: Make sure you're using the default optimized parser. Check cache statistics with `cache_stats`.
82
+ ## Known Issues
198
83
 
199
- **Q: Memory usage is high**
200
- A: Call `clear_caches!` periodically if processing many unique number strings.
84
+ - Input must be French number words only — mixed text (e.g. `"il y a vingt personnes"`) is not supported
85
+ - Regional Belgian/Swiss variants (`septante`, `nonante`) are recognized, but coverage may be incomplete
201
86
 
202
- ## 📝 Changelog
87
+ ## Contributing
203
88
 
204
- See [CHANGELOG.md](CHANGELOG.md) for version history and updates.
89
+ Bug reports and pull requests are welcome on [GitHub](https://github.com/FabienPiette/string_to_number).
205
90
 
206
- ## 📄 License
91
+ ## Acknowledgments
207
92
 
208
- The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
93
+ Created by [Fabien Piette](https://github.com/FabienPiette). Thanks to all [contributors](https://github.com/FabienPiette/string_to_number/graphs/contributors).
209
94
 
210
- ## 🙏 Acknowledgments
95
+ <p align="center">
96
+ <a href="https://buymeacoffee.com/fabienpiette" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" height="60"></a>
97
+ </p>
211
98
 
212
- - Original implementation by [Fabien Piette](https://github.com/FabienPiette)
213
- - Performance optimizations and enhancements
214
- - Community contributors and testers
99
+ ## License
215
100
 
101
+ [MIT](LICENSE)
data/SECURITY.md ADDED
@@ -0,0 +1,25 @@
1
+ # Security Policy
2
+
3
+ ## Supported Versions
4
+
5
+ | Version | Supported |
6
+ |---------|--------------------|
7
+ | 0.2.x | :white_check_mark: |
8
+ | < 0.2 | :x: |
9
+
10
+ ## Reporting a Vulnerability
11
+
12
+ If you discover a security vulnerability in this gem, please report it responsibly.
13
+
14
+ **Do not open a public GitHub issue.**
15
+
16
+ Instead, please use [GitHub's private vulnerability reporting](https://github.com/FabienPiette/string_to_number/security/advisories/new) to submit your report. You can also email the maintainer directly at the address listed in the gemspec.
17
+
18
+ Please include:
19
+
20
+ - A description of the vulnerability
21
+ - Steps to reproduce
22
+ - Affected versions
23
+ - Any potential impact
24
+
25
+ You should expect an initial response within 7 days. If the vulnerability is accepted, a fix will be released as a patch version and the advisory will be published after the fix is available.
@@ -0,0 +1,131 @@
1
+ # Architecture
2
+
3
+ This document describes the high-level architecture of StringToNumber.
4
+ If you want to familiarize yourself with the codebase, you are in the
5
+ right place.
6
+
7
+ ## Bird's Eye View
8
+
9
+ StringToNumber converts French written numbers into Ruby integers. A
10
+ string like `"deux millions trois cent mille"` goes in, and `2_300_000`
11
+ comes out.
12
+
13
+ The conversion pipeline is:
14
+
15
+ 1. **Normalize** — downcase and strip whitespace
16
+ 2. **Cache lookup** — return immediately if this string was converted before
17
+ 3. **Parse** — recursively decompose the French phrase into factor/multiplier
18
+ pairs (e.g. `cinq` × `cent` = 500), then sum the parts
19
+ 4. **Cache store** — save the result for future lookups
20
+
21
+ The parsing relies on two data tables: direct word-to-value mappings
22
+ (`WORD_VALUES`) and power-of-ten multipliers (`MULTIPLIERS`). French
23
+ number grammar has irregular patterns — especially the `quatre-vingt`
24
+ (4×20) family — that require dedicated regex handling.
25
+
26
+ ## Code Map
27
+
28
+ ### `lib/string_to_number.rb`
29
+
30
+ Public API module. Exposes three class methods: `in_numbers`,
31
+ `clear_caches!`, and `cache_stats`. Also provides `valid_french_number?`
32
+ for input validation.
33
+
34
+ This file is the only entry point consumers interact with. It delegates
35
+ to either `Parser` or `ToNumber` based on the `use_optimized` flag
36
+ (default: `true`).
37
+
38
+ Key methods: `StringToNumber.in_numbers`, `StringToNumber.valid_french_number?`.
39
+
40
+ ### `lib/string_to_number/parser.rb`
41
+
42
+ High-performance parser. Owns all caching logic (LRU conversion cache,
43
+ instance cache) and thread-safety (two mutexes: `@cache_mutex` for
44
+ conversions, `@instance_mutex` for parser instances).
45
+
46
+ The parsing algorithm mirrors `ToNumber`'s recursive extraction but
47
+ operates on pre-compiled regex patterns (`MULTIPLIER_PATTERN`,
48
+ `QUATRE_VINGT_PATTERN`) instead of building them per call.
49
+
50
+ Key types: `Parser.convert` (class-level entry point),
51
+ `Parser#parse_optimized` and `Parser#extract_optimized` (recursive core).
52
+
53
+ **Architecture Invariant:** `Parser` imports data tables from `ToNumber`
54
+ (`WORD_VALUES = ToNumber::EXCEPTIONS`) but never calls `ToNumber` methods.
55
+ The dependency is data-only.
56
+
57
+ ### `lib/string_to_number/to_number.rb`
58
+
59
+ Original (legacy) implementation. Owns the canonical data tables:
60
+ `EXCEPTIONS` (word-to-value map for 0–90 plus regional variants) and
61
+ `POWERS_OF_TEN` (multiplier words to exponent values, up to `googol`).
62
+
63
+ Uses the same recursive `extract`/`match` algorithm as `Parser` but
64
+ rebuilds regex patterns on every instantiation and has no caching.
65
+
66
+ Key types: `ToNumber::EXCEPTIONS`, `ToNumber::POWERS_OF_TEN`,
67
+ `ToNumber#to_number`.
68
+
69
+ **Architecture Invariant:** `ToNumber` has no knowledge of `Parser`.
70
+ It must remain independently functional for backward compatibility.
71
+
72
+ ### `lib/string_to_number/version.rb`
73
+
74
+ Single constant `StringToNumber::VERSION`. Updated before gem releases.
75
+
76
+ ### `spec/`
77
+
78
+ RSpec test suites. `string_to_number_spec.rb` covers correctness across
79
+ number ranges (0–9, 10–19, 20–29, ..., millions). `performance_spec.rb`
80
+ validates throughput thresholds.
81
+
82
+ ### `benchmark.rb`, `microbenchmark.rb`, `profile.rb`, `performance_comparison.rb`
83
+
84
+ Standalone scripts for measuring and profiling performance. Not part of
85
+ the gem distribution (excluded by gemspec).
86
+
87
+ ## Invariants
88
+
89
+ `Parser` depends on `ToNumber` for data constants only, never for
90
+ parsing logic. This keeps the legacy implementation independently
91
+ testable while the optimized path reuses proven word mappings.
92
+
93
+ All shared mutable state lives in `Parser`'s class-level instance
94
+ variables and is accessed exclusively through `@cache_mutex` or
95
+ `@instance_mutex`. No other module holds mutable state.
96
+
97
+ The `EXCEPTIONS` and `POWERS_OF_TEN` hashes in `ToNumber` are frozen.
98
+ Any new word mapping must be added there — `Parser` inherits changes
99
+ automatically.
100
+
101
+ Load order matters: `to_number.rb` must be required before `parser.rb`
102
+ because `Parser` references `ToNumber::EXCEPTIONS` and
103
+ `ToNumber::POWERS_OF_TEN` at class body evaluation time.
104
+
105
+ ## Cross-Cutting Concerns
106
+
107
+ **Caching.** Two layers in `Parser`: an LRU conversion cache (string →
108
+ integer, capped at 1000 entries) and an instance cache (string → Parser
109
+ object). Both are thread-safe. Call `StringToNumber.clear_caches!` to
110
+ reset.
111
+
112
+ **Thread safety.** Achieved through two separate mutexes to reduce
113
+ contention — one for conversion results, one for parser instances.
114
+ `ToNumber` is not thread-safe but is stateless per call, so concurrent
115
+ use is safe in practice.
116
+
117
+ **Testing.** RSpec with two suites: correctness (`spec/string_to_number_spec.rb`)
118
+ and performance (`spec/performance_spec.rb`). Run both via `rake spec`
119
+ or individually with `bundle exec rspec <file>`.
120
+
121
+ ## A Typical Change
122
+
123
+ **Adding a new French number word** (e.g., a regional variant):
124
+
125
+ 1. Add the word-to-value mapping in `ToNumber::EXCEPTIONS` or
126
+ `ToNumber::POWERS_OF_TEN` in `lib/string_to_number/to_number.rb`
127
+ 2. Add test cases in `spec/string_to_number_spec.rb`
128
+ 3. Run `rake spec` — `Parser` picks up the new mapping automatically
129
+ since it references `ToNumber`'s constants
130
+
131
+ No changes to `parser.rb` or `string_to_number.rb` are needed.
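
Reviewer note on step 3 of the pipeline described in this file: the "factor × multiplier, then sum" idea can be sketched as below. This is not the gem's algorithm; it swaps the recursive regex extraction for a plain token loop, and the tables `SKETCH_WORDS` and `SKETCH_POWERS` and the method `sketch_in_numbers` are hypothetical and heavily reduced.

```ruby
# Hypothetical, reduced tables; the gem's real ones live in ToNumber.
SKETCH_WORDS = { 'un' => 1, 'deux' => 2, 'trois' => 3, 'cinq' => 5 }.freeze
# Multiplier word => power of ten.
SKETCH_POWERS = { 'cent' => 2, 'mille' => 3, 'million' => 6, 'millions' => 6 }.freeze

# Token-loop illustration of "factor x multiplier, then sum".
def sketch_in_numbers(text)
  total = 0
  current = 0
  text.downcase.strip.split(/\s+/).each do |word|
    if SKETCH_WORDS.key?(word)
      current += SKETCH_WORDS[word]
    elsif SKETCH_POWERS.key?(word)
      # A bare multiplier ("mille") has an implicit factor of 1.
      factor = current.zero? ? 1 : current
      scaled = factor * (10**SKETCH_POWERS[word])
      if SKETCH_POWERS[word] < 3
        current = scaled # "cent" stays inside the current group
      else
        total += scaled  # "mille" and above close the group
        current = 0
      end
    else
      raise ArgumentError, "unknown word: #{word}"
    end
  end
  total + current
end

puts sketch_in_numbers('cinq cent')                      # 500
puts sketch_in_numbers('deux millions trois cent mille') # 2300000
```

The sketch ignores hyphenated forms, `et`, and the `quatre-vingt` family, all of which the real parsers handle through dedicated patterns.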
data/docs/demo.gif ADDED
Binary file
@@ -21,21 +21,21 @@ module StringToNumber
      MULTIPLIERS = StringToNumber::ToNumber::POWERS_OF_TEN.freeze
 
      # Pre-compiled regex patterns for optimal performance
-     MULTIPLIER_KEYS = MULTIPLIERS.keys.reject { |k| %w[un dix].include?(k) }
+     MULTIPLIER_KEYS = MULTIPLIERS.keys
+                                  .reject { |k| %w[un dix].include?(k) }
                                   .sort_by(&:length).reverse.freeze
      MULTIPLIER_PATTERN = /(?<f>.*?)\s?(?<m>#{MULTIPLIER_KEYS.join('|')})/.freeze
-     QUATRE_VINGT_PATTERN = /(quatre(-|\s)vingt(s?)((-|\s)dix)?)((-|\s)?)(\w*)/.freeze
+     QUATRE_VINGT_PATTERN = /(?<base>quatre[-\s]vingt(?:s?)(?:[-\s]dix)?)(?:[-\s]?)(?<suffix>\w*)/.freeze
 
      # Cache configuration
      MAX_CACHE_SIZE = 1000
      private_constant :MAX_CACHE_SIZE
 
-     # Thread-safe class-level caches
-     @conversion_cache = {}
-     @cache_access_order = []
-     @instance_cache = {}
+     # Thread-safe LRU cache using Hash insertion order (Ruby 1.9+)
+     @cache = {}
+     @cache_hits = 0
+     @cache_lookups = 0
      @cache_mutex = Mutex.new
-     @instance_mutex = Mutex.new
 
      class << self
        # Convert French text to number using cached parser instance
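
Reviewer note on the `QUATRE_VINGT_PATTERN` change in this hunk: the switch from numbered groups (`m[1]`, `m[8]`) to named captures can be tried in isolation. The pattern below is copied from the `+` line; the constant name `QV_PATTERN` and the helper `split_quatre_vingt` are hypothetical and exist only for this sketch.

```ruby
# Pattern copied verbatim from the `+` line of the hunk.
QV_PATTERN = /(?<base>quatre[-\s]vingt(?:s?)(?:[-\s]dix)?)(?:[-\s]?)(?<suffix>\w*)/.freeze

# Hypothetical helper: returns the two named captures, or nil when the
# text contains no quatre-vingt form.
def split_quatre_vingt(text)
  m = QV_PATTERN.match(text)
  return nil unless m

  { base: m[:base], suffix: m[:suffix] }
end

p split_quatre_vingt('quatre-vingt-quatorze') # base "quatre-vingt", suffix "quatorze"
p split_quatre_vingt('quatre-vingt-dix-neuf') # base "quatre-vingt-dix", suffix "neuf"
p split_quatre_vingt('douze')                 # nil
```

With named groups, adding or removing a non-capturing group no longer shifts the index of the suffix, which is what made `m[8]` fragile.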
@@ -49,28 +49,34 @@ module StringToNumber
          normalized = normalize_text(text)
          return 0 if normalized.empty?
 
-         # Check conversion cache first
-         cached_result = get_cached_conversion(normalized)
-         return cached_result if cached_result
+         @cache_mutex.synchronize do
+           @cache_lookups += 1
+
+           if @cache.key?(normalized)
+             @cache_hits += 1
+             # Delete and reinsert to move to end (most recently used)
+             value = @cache.delete(normalized)
+             @cache[normalized] = value
+             return value
+           end
+         end
 
-         # Get or create parser instance and convert
-         parser = get_cached_instance(normalized)
-         result = parser.parse_optimized(normalized)
+         result = new(normalized).parse_optimized(normalized)
+
+         @cache_mutex.synchronize do
+           @cache.delete(@cache.first[0]) if @cache.size >= MAX_CACHE_SIZE
+           @cache[normalized] = result
+         end
 
-         # Cache the result
-         cache_conversion(normalized, result)
          result
        end
 
        # Clear all caches
        def clear_caches!
          @cache_mutex.synchronize do
-           @conversion_cache.clear
-           @cache_access_order.clear
-         end
-
-         @instance_mutex.synchronize do
-           @instance_cache.clear
+           @cache.clear
+           @cache_hits = 0
+           @cache_lookups = 0
          end
        end
 
@@ -78,10 +84,9 @@ module StringToNumber
        def cache_stats
          @cache_mutex.synchronize do
            {
-             conversion_cache_size: @conversion_cache.size,
+             conversion_cache_size: @cache.size,
              conversion_cache_limit: MAX_CACHE_SIZE,
-             instance_cache_size: @instance_cache.size,
-             cache_hit_ratio: calculate_hit_ratio
+             cache_hit_ratio: @cache_lookups.zero? ? 0.0 : @cache_hits.to_f / @cache_lookups
            }
          end
        end
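
Reviewer note on the new `cache_hit_ratio` expression: it divides hits by lookups and guards the empty case. A standalone sketch, where the method name `hit_ratio` is hypothetical and only the expression itself comes from the `+` line:

```ruby
# Hypothetical helper wrapping the expression from the `+` line.
def hit_ratio(hits, lookups)
  lookups.zero? ? 0.0 : hits.to_f / lookups
end

puts hit_ratio(0, 0) # 0.0, no lookups yet so the division is skipped
puts hit_ratio(3, 4) # 0.75
```

The `to_f` call matters: without it, integer division would report `0` for every ratio below 1.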
@@ -95,43 +100,6 @@ module StringToNumber
        def normalize_text(text)
          text.to_s.downcase.strip
        end
-
-       def get_cached_conversion(normalized_text)
-         @cache_mutex.synchronize do
-           if @conversion_cache.key?(normalized_text)
-             # Update LRU order
-             @cache_access_order.delete(normalized_text)
-             @cache_access_order.push(normalized_text)
-             return @conversion_cache[normalized_text]
-           end
-         end
-         nil
-       end
-
-       def cache_conversion(normalized_text, result)
-         @cache_mutex.synchronize do
-           # LRU eviction
-           if @conversion_cache.size >= MAX_CACHE_SIZE
-             oldest = @cache_access_order.shift
-             @conversion_cache.delete(oldest)
-           end
-
-           @conversion_cache[normalized_text] = result
-           @cache_access_order.push(normalized_text)
-         end
-       end
-
-       def get_cached_instance(normalized_text)
-         @instance_mutex.synchronize do
-           @instance_cache[normalized_text] ||= new(normalized_text)
-         end
-       end
-
-       def calculate_hit_ratio
-         return 0.0 if @cache_access_order.empty?
-
-         @conversion_cache.size.to_f / @cache_access_order.size
-       end
      end
 
      # Initialize parser with normalized text
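
Reviewer note on the helpers removed in this hunk: the separate access-order array is replaced by relying on Ruby's Hash insertion order. A minimal sketch of that technique follows, assuming a hypothetical class `TinyLru` with a limit of at least 1; none of these names are part of the gem. Unlike the code in the diff, the sketch also deletes an existing key before the size check so that re-storing a key never evicts another entry.

```ruby
# Hypothetical LRU cache built on Hash insertion order; not the gem's API.
class TinyLru
  def initialize(limit)
    @limit = limit # assumed to be >= 1
    @store = {}
    @mutex = Mutex.new
  end

  # Returns the cached value and marks it most recently used, or nil.
  def fetch(key)
    @mutex.synchronize do
      return nil unless @store.key?(key)

      # Delete and reinsert so the key moves to the end of the Hash.
      value = @store.delete(key)
      @store[key] = value
    end
  end

  def store(key, value)
    @mutex.synchronize do
      @store.delete(key)
      # The first key in insertion order is the least recently used.
      @store.delete(@store.first[0]) if @store.size >= @limit
      @store[key] = value
    end
  end

  def keys
    @mutex.synchronize { @store.keys }
  end
end

cache = TinyLru.new(2)
cache.store('un', 1)
cache.store('deux', 2)
cache.fetch('un')       # 'un' becomes most recently used
cache.store('trois', 3) # evicts 'deux'
p cache.keys            # ["un", "trois"]
```

Keeping the order inside the Hash removes the linear `Array#delete` that the old access-order list needed on every hit.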
@@ -153,7 +121,7 @@ module StringToNumber
        return WORD_VALUES[text] if WORD_VALUES.key?(text)
 
        # Use the proven extraction algorithm from the original implementation
-       extract_optimized(text, MULTIPLIER_KEYS.join('|'))
+       extract_optimized(text)
      end
 
      private
@@ -161,7 +129,7 @@ module StringToNumber
      # Optimized version of the original extract method
      # This maintains the exact logic of the working implementation
      # but with performance improvements
-     def extract_optimized(sentence, keys, detail: false)
+     def extract_optimized(sentence, detail: false)
        return 0 if sentence.nil? || sentence.empty?
 
        # Direct lookup
@@ -179,7 +147,7 @@ module StringToNumber
 
          # Handle compound numbers
          if higher_multiple_exists?(result[:m], sentence)
-           details = extract_optimized(sentence, keys, detail: true)
+           details = extract_optimized(sentence, detail: true)
            factor = (factor * multiple_of_ten) + details[:factor]
            multiple_of_ten = details[:multiple_of_ten]
            sentence = details[:sentence]
@@ -194,17 +162,17 @@ module StringToNumber
194
162
  }
195
163
  end
196
164
 
197
- extract_optimized(sentence, keys) + (factor * multiple_of_ten)
165
+ extract_optimized(sentence) + (factor * multiple_of_ten)
198
166
 
199
167
  # Quatre-vingt special handling
200
168
  elsif (m = QUATRE_VINGT_PATTERN.match(sentence))
201
- normalize_str = m[1].tr(' ', '-')
169
+ normalize_str = m[:base].tr(' ', '-')
202
170
  normalize_str = normalize_str[0...-1] if normalize_str[-1] == 's'
203
171
 
204
172
  sentence = sentence.gsub(m[0], '')
205
173
 
206
- extract_optimized(sentence, keys) +
207
- WORD_VALUES[normalize_str] + (WORD_VALUES[m[8]] || 0)
174
+ extract_optimized(sentence) +
175
+ WORD_VALUES[normalize_str] + (WORD_VALUES[m[:suffix]] || 0)
208
176
  else
209
177
  match_optimized(sentence)
210
178
  end
@@ -199,12 +199,11 @@ module StringToNumber
199
199
  # - "quatre-vingt" / "quatre vingts" (with/without 's')
200
200
  # - "quatre-vingt-dix" / "quatre vingts dix" (90)
201
201
  # - Space vs hyphen variations
202
- elsif (m = /(quatre(-|\s)vingt(s?)((-|\s)dix)?)((-|\s)?)(\w*)/.match(sentence))
202
+ elsif (m = /(?<base>quatre[-\s]vingt(?:s?)(?:[-\s]dix)?)(?:[-\s]?)(?<suffix>\w*)/.match(sentence))
203
203
  # Normalize spacing to hyphens for consistent lookup
204
- normalize_str = m[1].tr(' ', '-')
204
+ normalize_str = m[:base].tr(' ', '-')
205
205
 
206
206
  # Remove trailing 's' from "quatre-vingts" if present
207
- # Bug fix: use [-1] instead of [length] for last character
208
207
  normalize_str = normalize_str[0...-1] if normalize_str[-1] == 's'
209
208
 
210
209
  # Remove the matched portion from sentence
@@ -213,7 +212,7 @@ module StringToNumber
213
212
  # Return sum of: remaining sentence + normalized quatre-vingt value + any suffix
214
213
  # Example: "quatre-vingt-cinq" -> EXCEPTIONS["quatre-vingt"] + EXCEPTIONS["cinq"]
215
214
  extract(sentence, keys) +
216
- EXCEPTIONS[normalize_str] + (EXCEPTIONS[m[8]] || 0)
215
+ EXCEPTIONS[normalize_str] + (EXCEPTIONS[m[:suffix]] || 0)
217
216
  else
218
217
  # Fallback: use match() method for simple word combinations
219
218
  match(sentence)
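Note on the hunk above: the regex change swaps positional captures (`m[1]`, `m[8]`) for named ones (`m[:base]`, `m[:suffix]`) and turns the separator groups into non-capturing groups. A minimal sketch of that equivalence follows. The `VALUES` table and the `quatre_vingt_value` helper are hypothetical stand-ins for illustration, not the gem's `EXCEPTIONS` table or its `extract` method.

```ruby
# Hypothetical lookup table for illustration only; the gem's real table is larger.
VALUES = { 'quatre-vingt' => 80, 'quatre-vingt-dix' => 90, 'cinq' => 5 }.freeze

# Pattern as it appears on the removed line (suffix is positional group 8).
OLD_PATTERN = /(quatre(-|\s)vingt(s?)((-|\s)dix)?)((-|\s)?)(\w*)/

# Pattern as it appears on the added line (named groups, non-capturing separators).
NEW_PATTERN = /(?<base>quatre[-\s]vingt(?:s?)(?:[-\s]dix)?)(?:[-\s]?)(?<suffix>\w*)/

# Hypothetical helper mirroring the steps shown in the diff:
# normalize spaces to hyphens, drop a trailing 's', then sum base and suffix.
def quatre_vingt_value(text)
  m = NEW_PATTERN.match(text)
  return nil unless m

  base = m[:base].tr(' ', '-')
  base = base[0...-1] if base[-1] == 's'
  VALUES.fetch(base, 0) + (VALUES[m[:suffix]] || 0)
end

old_match = OLD_PATTERN.match('quatre-vingt-cinq')
new_match = NEW_PATTERN.match('quatre-vingt-cinq')
puts old_match[8] == new_match[:suffix]
puts quatre_vingt_value('quatre-vingt-cinq')
```

With named groups, adding or removing a group elsewhere in the pattern no longer shifts the index of the suffix, which is presumably the motivation for the change.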
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module StringToNumber
4
- VERSION = '0.2.1'
4
+ VERSION = '0.3.0'
5
5
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: string_to_number
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.1
4
+ version: 0.3.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Fabien Piette
8
- autorequire:
8
+ autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2025-06-24 00:00:00.000000000 Z
11
+ date: 2026-02-13 00:00:00.000000000 Z
12
12
  dependencies: []
13
13
  description: A ruby gem to convert French words into numbers.
14
14
  email:
@@ -18,26 +18,28 @@ extensions: []
18
18
  extra_rdoc_files: []
19
19
  files:
20
20
  - ".github/workflows/ci.yml"
21
+ - ".github/workflows/release.yml"
21
22
  - ".gitignore"
22
23
  - ".rspec"
23
24
  - ".rubocop.yml"
24
25
  - ".tool-versions"
25
- - ".travis.yml"
26
26
  - CLAUDE.md
27
27
  - CODE_OF_CONDUCT.md
28
28
  - Gemfile
29
29
  - Gemfile.lock
30
- - LICENSE.txt
30
+ - LICENSE
31
31
  - README.md
32
32
  - Rakefile
33
+ - SECURITY.md
33
34
  - benchmark.rb
34
35
  - bin/console
35
36
  - bin/setup
37
+ - docs/ARCHITECTURE.md
38
+ - docs/demo.gif
36
39
  - lib/string_to_number.rb
37
40
  - lib/string_to_number/parser.rb
38
41
  - lib/string_to_number/to_number.rb
39
42
  - lib/string_to_number/version.rb
40
- - logo.png
41
43
  - microbenchmark.rb
42
44
  - performance_comparison.rb
43
45
  - profile.rb
@@ -48,7 +50,7 @@ licenses:
48
50
  metadata:
49
51
  allowed_push_host: https://rubygems.org
50
52
  rubygems_mfa_required: 'true'
51
- post_install_message:
53
+ post_install_message:
52
54
  rdoc_options: []
53
55
  require_paths:
54
56
  - lib
@@ -63,8 +65,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
63
65
  - !ruby/object:Gem::Version
64
66
  version: '0'
65
67
  requirements: []
66
- rubygems_version: 3.1.6
67
- signing_key:
68
+ rubygems_version: 3.5.22
69
+ signing_key:
68
70
  specification_version: 4
69
71
  summary: A ruby gem to convert French words into numbers.
70
72
  test_files: []
data/.travis.yml DELETED
@@ -1,5 +0,0 @@
1
- sudo: false
2
- language: ruby
3
- rvm:
4
- - 2.3.3
5
- before_install: gem install bundler -v 1.13.6
data/logo.png DELETED
Binary file
File without changes