sec_id 5.0.0 → 5.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +24 -0
- data/README.md +119 -9
- data/lib/sec_id/base.rb +46 -0
- data/lib/sec_id/cei.rb +7 -0
- data/lib/sec_id/cfi.rb +19 -1
- data/lib/sec_id/concerns/normalizable.rb +18 -0
- data/lib/sec_id/cusip.rb +14 -0
- data/lib/sec_id/errors.rb +7 -0
- data/lib/sec_id/figi.rb +10 -0
- data/lib/sec_id/fisn.rb +5 -0
- data/lib/sec_id/iban.rb +22 -0
- data/lib/sec_id/isin.rb +14 -0
- data/lib/sec_id/lei.rb +10 -0
- data/lib/sec_id/occ.rb +10 -0
- data/lib/sec_id/scanner.rb +144 -0
- data/lib/sec_id/sedol.rb +3 -0
- data/lib/sec_id/valoren.rb +7 -0
- data/lib/sec_id/version.rb +1 -1
- data/lib/sec_id.rb +83 -28
- data/sec_id.gemspec +4 -4
- metadata +7 -5
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: bc458a32d8a5cb8fc4db2b1ab4c635a6c88e4160d38948e4ce779e90b039bd3c
|
|
4
|
+
data.tar.gz: 298298a51ca424aef20c814ed620e35d32bbfe9bf285fe1f6ee6f44bac163980
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 78650675337ce8a03970e4b1227713aaa0adafb8c18da35466099add8fc39389c8b02583602c18f9aeb59a340ee6b9db9f42ef7662d96284dd0ed93336a50a6a
|
|
7
|
+
data.tar.gz: 27d54f707d2ec471218a15e6f9dadc2d78d3f46985084f1e18be921b83691f4fe5cd4d82d793ae9cbb34c3e55df30e2966aa834b5942606dfa81d5e3d38e4799
|
data/CHANGELOG.md
CHANGED
|
@@ -8,6 +8,30 @@ and [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/).
|
|
|
8
8
|
|
|
9
9
|
## [Unreleased]
|
|
10
10
|
|
|
11
|
+
## [5.2.0] - 2026-02-24
|
|
12
|
+
|
|
13
|
+
### Added
|
|
14
|
+
|
|
15
|
+
- `SecID.scan` and `SecID.extract` methods for finding identifiers in freeform text — returns `Scanner::Match` objects (`Data.define(:type, :raw, :range, :identifier)`) with the validated identifier instance; supports `types:` filtering, hyphenated identifiers, and compound patterns (OCC with spaces, FISN with slashes)
|
|
16
|
+
- `SecID.explain` method for debugging identifier detection — returns per-type validation results showing exactly why each type matched or rejected the input
|
|
17
|
+
- `on_ambiguous:` option for `SecID.parse` and `SecID.parse!` — `:first` (default, existing behavior), `:raise` (raises `AmbiguousMatchError`), `:all` (returns array of all matching instances)
|
|
18
|
+
- `SecID::AmbiguousMatchError` exception class for ambiguous identifier detection
|
|
19
|
+
- `#as_json` method on all identifier types (delegates to `#to_h`) and on `Errors` (delegates to `#details`) for JSON serialization compatibility
|
|
20
|
+
- `SecID::IBAN.supported_countries` class method returning sorted array of all supported country codes
|
|
21
|
+
- `SecID::CFI.categories` class method returning the categories hash
|
|
22
|
+
- `SecID::CFI.groups_for(category_code)` class method returning groups hash for a given category
|
|
23
|
+
|
|
24
|
+
|
|
25
|
+
## [5.1.0] - 2026-02-19
|
|
26
|
+
|
|
27
|
+
### Added
|
|
28
|
+
|
|
29
|
+
- `#==`, `#eql?`, and `#hash` methods on all identifier types — two instances of the same type with the same normalized form are equal and usable as Hash keys / in Sets
|
|
30
|
+
- `#to_h` method on all identifier types for consistent hash serialization — returns `{ type:, full_id:, normalized:, valid:, components: }` with type-specific component hashes (e.g. ISIN: `country_code`, `nsin`, `check_digit`)
|
|
31
|
+
- `#to_pretty_s` and `.to_pretty_s` display formatting methods on all identifier types, returning a human-readable string or `nil` for invalid input — with type-specific formats for IBAN (4-char groups), LEI (4-char groups), ISIN (CC + NSIN + CD), CUSIP (cusip6 + issue + CD), FIGI (prefix+G + random + CD), OCC (space-separated components), and Valoren (thousands grouping)
|
|
32
|
+
- Lookup service integration guides and runnable examples for OpenFIGI, SEC EDGAR, GLEIF, and Eurex APIs (`docs/guides/`, `examples/`)
|
|
33
|
+
- GitHub community standards files: Code of Conduct, Contributing guide, Security policy, issue templates, and PR template
|
|
34
|
+
|
|
11
35
|
## [5.0.0] - 2026-02-17
|
|
12
36
|
|
|
13
37
|
### Added
|
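Note on the 5.2.0 `on_ambiguous:` entry: the three modes can be pictured with a small stand-alone sketch. `resolve_matches` and the local `AmbiguousMatchError` class are hypothetical names used only for illustration; the gem's actual implementation is not visible in this part of the diff and may differ.

```ruby
# Hypothetical stand-in for SecID::AmbiguousMatchError (illustration only).
class AmbiguousMatchError < StandardError; end

# Picks a result from a list of candidate matches according to the mode.
# `matches` is assumed to be ordered by detection priority.
def resolve_matches(matches, on_ambiguous: :first)
  case on_ambiguous
  when :first then matches.first
  when :all   then matches
  when :raise
    if matches.size > 1
      raise AmbiguousMatchError, "input matches #{matches.size} types: #{matches.join(', ')}"
    end

    matches.first
  else
    raise ArgumentError, "unknown on_ambiguous mode: #{on_ambiguous.inspect}"
  end
end

candidates = %i[wkn valoren cik] # the types the README lists for '514000'
resolve_matches(candidates)                      # => :wkn
resolve_matches(candidates, on_ambiguous: :all)  # => [:wkn, :valoren, :cik]
resolve_matches([:isin], on_ambiguous: :raise)   # => :isin
```

With `:raise`, an unambiguous input still returns its single match, which is consistent with the README example for `US5949181045`.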
data/README.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# SecID [](https://rubygems.org/gems/sec_id) [](https://app.codecov.io/gh/svyatov/sec_id) [](https://github.com/svyatov/sec_id/actions?query=workflow%3ACI)
|
|
2
2
|
|
|
3
|
-
>
|
|
3
|
+
> A Ruby toolkit for securities identifiers — validate, parse, normalize, detect, and convert.
|
|
4
4
|
|
|
5
5
|
## Table of Contents
|
|
6
6
|
|
|
@@ -8,6 +8,8 @@
|
|
|
8
8
|
- [Installation](#installation)
|
|
9
9
|
- [Supported Standards and Usage](#supported-standards-and-usage)
|
|
10
10
|
- [Metadata Registry](#metadata-registry) - enumerate, filter, look up, and detect identifier types
|
|
11
|
+
- [Text Scanning](#text-scanning) - find identifiers in freeform text
|
|
12
|
+
- [Debugging Detection](#debugging-detection) - understand why strings match or don't
|
|
11
13
|
- [Structured Validation](#structured-validation) - detailed error codes and messages
|
|
12
14
|
- [ISIN](#isin) - International Securities Identification Number
|
|
13
15
|
- [CUSIP](#cusip) - Committee on Uniform Securities Identification Procedures
|
|
@@ -22,6 +24,7 @@
|
|
|
22
24
|
- [Valoren](#valoren) - Swiss Security Number
|
|
23
25
|
- [CFI](#cfi) - Classification of Financial Instruments
|
|
24
26
|
- [FISN](#fisn) - Financial Instrument Short Name
|
|
27
|
+
- [Lookup Service Integration](#lookup-service-integration)
|
|
25
28
|
- [Development](#development)
|
|
26
29
|
- [Contributing](#contributing)
|
|
27
30
|
- [Changelog](#changelog)
|
|
@@ -37,7 +40,7 @@ Ruby 3.2+ is required.
|
|
|
37
40
|
Add this line to your application's Gemfile:
|
|
38
41
|
|
|
39
42
|
```ruby
|
|
40
|
-
gem 'sec_id', '~> 5.
|
|
43
|
+
gem 'sec_id', '~> 5.2'
|
|
41
44
|
```
|
|
42
45
|
|
|
43
46
|
And then execute:
|
|
@@ -58,10 +61,39 @@ gem install sec_id
|
|
|
58
61
|
|
|
59
62
|
All identifier classes provide `valid?`, `errors`, `validate`, `validate!` methods at both class and instance levels.
|
|
60
63
|
|
|
61
|
-
**All identifiers** support normalization:
|
|
64
|
+
**All identifiers** support normalization and display formatting:
|
|
62
65
|
- `.normalize(id)` - strips separators, upcases, validates, and returns the canonical string
|
|
63
66
|
- `#normalized` / `#normalize` - returns the canonical string for a valid instance
|
|
64
67
|
- `#normalize!` - mutates `full_id` to canonical form, returns `self`
|
|
68
|
+
- `#to_pretty_s` / `.to_pretty_s(id)` - returns a human-readable formatted string, or `nil` for invalid input
|
|
69
|
+
|
|
70
|
+
**All identifiers** support hash serialization:
|
|
71
|
+
- `#to_h` - returns a hash with `:type`, `:full_id`, `:normalized`, `:valid`, and `:components` keys
|
|
72
|
+
- `#as_json` - same as `#to_h`, for JSON serialization compatibility (Rails, `JSON.generate`, etc.)
|
|
73
|
+
|
|
74
|
+
```ruby
|
|
75
|
+
SecID::ISIN.new('US5949181045').to_h
|
|
76
|
+
# => { type: :isin, full_id: 'US5949181045', normalized: 'US5949181045',
|
|
77
|
+
# valid: true, components: { country_code: 'US', nsin: '594918104', check_digit: 5 } }
|
|
78
|
+
|
|
79
|
+
SecID::ISIN.new('INVALID').to_h
|
|
80
|
+
# => { type: :isin, full_id: 'INVALID', normalized: nil,
|
|
81
|
+
# valid: false, components: { country_code: nil, nsin: nil, check_digit: nil } }
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
**All identifiers** support value equality — two instances of the same type with the same normalized form are equal:
|
|
85
|
+
|
|
86
|
+
```ruby
|
|
87
|
+
a = SecID::ISIN.new('US5949181045')
|
|
88
|
+
b = SecID::ISIN.new('us 5949 1810 45')
|
|
89
|
+
|
|
90
|
+
a == b # => true
|
|
91
|
+
a.eql?(b) # => true
|
|
92
|
+
|
|
93
|
+
# Works as Hash keys and in Sets
|
|
94
|
+
{ a => 'MSFT' }[b] # => 'MSFT'
|
|
95
|
+
Set.new([a, b]).size # => 1
|
|
96
|
+
```
|
|
65
97
|
|
|
66
98
|
**Check-digit based identifiers** (ISIN, CUSIP, CEI, SEDOL, FIGI, LEI, IBAN) also provide:
|
|
67
99
|
- `restore` / `.restore` - returns the full identifier string with correct check-digit (no mutation)
|
|
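Note on the `#to_h` / `#as_json` additions in this hunk: a minimal sketch of the delegation, assuming a stand-in class. `SketchIdentifier` is hypothetical and only copies the documented hash shape; it does not validate anything.

```ruby
require 'json'

# Hypothetical minimal identifier (not the gem's class) that mirrors the
# documented #to_h keys and the #as_json delegation.
class SketchIdentifier
  def initialize(full_id)
    @full_id = full_id
  end

  def to_h
    { type: :isin, full_id: @full_id, normalized: @full_id, valid: true, components: {} }
  end

  # Accepts and ignores any arguments, since some serializers pass options.
  def as_json(*)
    to_h
  end
end

id = SketchIdentifier.new('US5949181045')
puts JSON.generate(id.as_json)
```

Passing the result of `as_json` to `JSON.generate` explicitly keeps the sketch independent of any framework; symbol keys and symbol values come out as JSON strings.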
@@ -115,6 +147,56 @@ SecID.parse('594918104', types: [:cusip]) # => #<SecID::CUSIP>
|
|
|
115
147
|
# Bang version raises on failure
|
|
116
148
|
SecID.parse!('US5949181045') # => #<SecID::ISIN>
|
|
117
149
|
SecID.parse!('unknown') # raises SecID::InvalidFormatError
|
|
150
|
+
|
|
151
|
+
# Handle ambiguous matches
|
|
152
|
+
SecID.parse('514000', on_ambiguous: :first) # => #<SecID::WKN> (default)
|
|
153
|
+
SecID.parse('514000', on_ambiguous: :raise) # raises SecID::AmbiguousMatchError
|
|
154
|
+
SecID.parse('514000', on_ambiguous: :all) # => [#<SecID::WKN>, #<SecID::Valoren>, #<SecID::CIK>]
|
|
155
|
+
SecID.parse('US5949181045', on_ambiguous: :raise) # => #<SecID::ISIN> (unambiguous, no error)
|
|
156
|
+
```
|
|
157
|
+
|
|
158
|
+
### Text Scanning
|
|
159
|
+
|
|
160
|
+
Find identifiers embedded in freeform text:
|
|
161
|
+
|
|
162
|
+
```ruby
|
|
163
|
+
# Extract all identifiers from text
|
|
164
|
+
matches = SecID.extract('Portfolio: US5949181045, 594918104, B0YBKJ7')
|
|
165
|
+
matches.map(&:type) # => [:isin, :cusip, :sedol]
|
|
166
|
+
matches.first.raw # => "US5949181045"
|
|
167
|
+
matches.first.range # => 11...23
|
|
168
|
+
matches.first.identifier.country_code # => "US"
|
|
169
|
+
|
|
170
|
+
# Lazy scanning with Enumerator
|
|
171
|
+
SecID.scan('Buy US5949181045 now').each { |m| puts m.type }
|
|
172
|
+
|
|
173
|
+
# Filter by types
|
|
174
|
+
SecID.extract('514000', types: [:valoren]) # => only Valoren matches
|
|
175
|
+
|
|
176
|
+
# Handles hyphenated identifiers
|
|
177
|
+
match = SecID.extract('ID: US-5949-1810-45').first
|
|
178
|
+
match.raw # => "US-5949-1810-45"
|
|
179
|
+
match.identifier.normalized # => "US5949181045"
|
|
180
|
+
```
|
|
181
|
+
|
|
182
|
+
> **Known limitations:** Format-only types (CIK, Valoren, WKN, CFI) can false-positive on
|
|
183
|
+
> common numbers and short words in prose — use the `types:` filter to restrict scanning when
|
|
184
|
+
> this is a concern. Identifiers prefixed with special characters (e.g. `#US5949181045`) may be
|
|
185
|
+
> consumed as a single token by CUSIP's `*@#` character class and fail validation, preventing
|
|
186
|
+
> the embedded identifier from being found.
|
|
187
|
+
|
|
188
|
+
### Debugging Detection
|
|
189
|
+
|
|
190
|
+
Understand why a string matches or doesn't match specific identifier types:
|
|
191
|
+
|
|
192
|
+
```ruby
|
|
193
|
+
result = SecID.explain('US5949181040')
|
|
194
|
+
isin = result[:candidates].find { |c| c[:type] == :isin }
|
|
195
|
+
isin[:valid] # => false
|
|
196
|
+
isin[:errors].first[:error] # => :invalid_check_digit
|
|
197
|
+
|
|
198
|
+
# Filter to specific types
|
|
199
|
+
SecID.explain('US5949181045', types: %i[isin cusip])
|
|
118
200
|
```
|
|
119
201
|
|
|
120
202
|
### Structured Validation
|
|
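Note on `matches.first.range` in the Text Scanning example: the range is end-exclusive and indexes into the original text. The sketch below reproduces the `11...23` value with plain Ruby. The `isin_shape` pattern is a simplified assumption for illustration, not the gem's candidate regex, and it performs no check-digit validation.

```ruby
text = 'Portfolio: US5949181045, 594918104, B0YBKJ7'

# Simplified ISIN-shaped pattern (an assumption for this sketch only).
isin_shape = /\b[A-Z]{2}[A-Z0-9]{9}\d\b/

m = isin_shape.match(text)
range = m.begin(0)...m.end(0) # end-exclusive range built from match offsets

range       # => 11...23
text[range] # => "US5949181045"
```

Because the range refers to the raw text, slicing with it returns the token exactly as written, including any hyphens in a hyphenated identifier.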
@@ -192,6 +274,7 @@ isin.valid? # => true
|
|
|
192
274
|
isin.restore # => 'US5949181045'
|
|
193
275
|
isin.restore! # => #<SecID::ISIN> (mutates instance)
|
|
194
276
|
isin.calculate_check_digit # => 5
|
|
277
|
+
isin.to_pretty_s # => 'US 594918104 5'
|
|
195
278
|
isin.to_cusip # => #<SecID::CUSIP>
|
|
196
279
|
isin.nsin_type # => :cusip
|
|
197
280
|
isin.to_nsin # => #<SecID::CUSIP>
|
|
@@ -236,6 +319,7 @@ cusip.valid? # => true
|
|
|
236
319
|
cusip.restore # => '594918104'
|
|
237
320
|
cusip.restore! # => #<SecID::CUSIP> (mutates instance)
|
|
238
321
|
cusip.calculate_check_digit # => 4
|
|
322
|
+
cusip.to_pretty_s # => '594918 10 4'
|
|
239
323
|
cusip.to_isin('US') # => #<SecID::ISIN>
|
|
240
324
|
cusip.cins? # => false
|
|
241
325
|
```
|
|
@@ -308,6 +392,7 @@ figi.valid? # => true
|
|
|
308
392
|
figi.restore # => 'BBG000DMBXR2'
|
|
309
393
|
figi.restore! # => #<SecID::FIGI> (mutates instance)
|
|
310
394
|
figi.calculate_check_digit # => 2
|
|
395
|
+
figi.to_pretty_s # => 'BBG 000DMBXR 2'
|
|
311
396
|
```
|
|
312
397
|
|
|
313
398
|
### LEI
|
|
@@ -332,6 +417,7 @@ lei.valid? # => true
|
|
|
332
417
|
lei.restore # => '5493006MHB84DD0ZWV18'
|
|
333
418
|
lei.restore! # => #<SecID::LEI> (mutates instance)
|
|
334
419
|
lei.calculate_check_digit # => 18
|
|
420
|
+
lei.to_pretty_s # => '5493 006M HB84 DD0Z WV18'
|
|
335
421
|
```
|
|
336
422
|
|
|
337
423
|
### IBAN
|
|
@@ -358,10 +444,16 @@ iban.restore # => 'DE89370400440532013000'
|
|
|
358
444
|
iban.restore! # => #<SecID::IBAN> (mutates instance)
|
|
359
445
|
iban.calculate_check_digit # => 89
|
|
360
446
|
iban.known_country? # => true
|
|
447
|
+
iban.to_pretty_s # => 'DE89 3704 0044 0532 0130 00'
|
|
361
448
|
```
|
|
362
449
|
|
|
363
450
|
Full BBAN structural validation is supported for EU/EEA countries. Other countries have length-only validation.
|
|
364
451
|
|
|
452
|
+
```ruby
|
|
453
|
+
# List all supported countries
|
|
454
|
+
SecID::IBAN.supported_countries # => ["AD", "AE", "AT", "BE", "BG", "CH", ...]
|
|
455
|
+
```
|
|
456
|
+
|
|
365
457
|
### CIK
|
|
366
458
|
|
|
367
459
|
> [Central Index Key](https://en.wikipedia.org/wiki/Central_Index_Key) - a 10-digit number used by the SEC to identify corporations and individuals who have filed disclosures.
|
|
@@ -412,6 +504,7 @@ occ.full_id # => 'X 250620C00050000'
|
|
|
412
504
|
occ.valid? # => true
|
|
413
505
|
occ.normalize! # => #<SecID::OCC> (mutates full_id, returns self)
|
|
414
506
|
occ.full_id # => 'X 250620C00050000'
|
|
507
|
+
occ.to_pretty_s # => 'X 250620 C 00050000'
|
|
415
508
|
```
|
|
416
509
|
|
|
417
510
|
### WKN
|
|
@@ -454,6 +547,7 @@ valoren.identifier # => '3886335'
|
|
|
454
547
|
valoren.valid? # => true
|
|
455
548
|
valoren.normalized # => '003886335'
|
|
456
549
|
valoren.normalize! # => #<SecID::Valoren> (mutates full_id, returns self)
|
|
550
|
+
valoren.to_pretty_s # => '3 886 335'
|
|
457
551
|
valoren.to_isin # => #<SecID::ISIN> (CH ISIN by default)
|
|
458
552
|
valoren.to_isin('LI') # => #<SecID::ISIN> (LI ISIN)
|
|
459
553
|
```
|
|
@@ -486,6 +580,12 @@ cfi.registered? # => true
|
|
|
486
580
|
|
|
487
581
|
CFI validates the category code (position 1) against 14 valid values and the group code (position 2) against valid values for that category. Attribute positions 3-6 accept any letter A-Z, with X meaning "not applicable".
|
|
488
582
|
|
|
583
|
+
```ruby
|
|
584
|
+
# Introspect valid codes
|
|
585
|
+
SecID::CFI.categories # => { "E" => :equity, "C" => :collective_investment_vehicles, ... }
|
|
586
|
+
SecID::CFI.groups_for('E') # => { "S" => :common_shares, "P" => :preferred_shares, ... }
|
|
587
|
+
```
|
|
588
|
+
|
|
489
589
|
### FISN
|
|
490
590
|
|
|
491
591
|
> [Financial Instrument Short Name](https://en.wikipedia.org/wiki/ISO_18774) - a human-readable short name for financial instruments per ISO 18774.
|
|
@@ -506,6 +606,19 @@ fisn.to_s # => 'APPLE INC/SH'
|
|
|
506
606
|
|
|
507
607
|
FISN format: `Issuer Name/Abbreviated Instrument Description` with issuer (1-15 chars) and description (1-19 chars) separated by a forward slash. Character set: uppercase A-Z, digits 0-9, and space.
|
|
508
608
|
|
|
609
|
+
## Lookup Service Integration
|
|
610
|
+
|
|
611
|
+
SecID validates identifiers but does not include HTTP clients. The [`docs/guides/`](docs/guides/) directory provides integration patterns for external lookup services using only stdlib (`net/http`, `json`):
|
|
612
|
+
|
|
613
|
+
| Guide | Service | Identifier |
|
|
614
|
+
|-------|---------|------------|
|
|
615
|
+
| [OpenFIGI](docs/guides/openfigi.md) | [OpenFIGI API](https://www.openfigi.com/api) | FIGI |
|
|
616
|
+
| [SEC EDGAR](docs/guides/sec-edgar.md) | [SEC EDGAR](https://www.sec.gov/edgar/sec-api-documentation) | CIK |
|
|
617
|
+
| [GLEIF](docs/guides/gleif.md) | [GLEIF API](https://www.gleif.org/en/lei-data/gleif-api) | LEI |
|
|
618
|
+
| [Eurex](docs/guides/eurex.md) | [Eurex Reference Data](https://www.eurex.com/ex-en/data/free-reference-data-api) | ISIN |
|
|
619
|
+
|
|
620
|
+
Each guide includes a complete adapter class and a [runnable example](examples/).
|
|
621
|
+
|
|
509
622
|
## Development
|
|
510
623
|
|
|
511
624
|
After checking out the repo, run `bin/setup` to install dependencies.
|
|
@@ -516,12 +629,9 @@ To install this gem onto your local machine, run `bundle exec rake install`.
|
|
|
516
629
|
|
|
517
630
|
## Contributing
|
|
518
631
|
|
|
519
|
-
|
|
520
|
-
|
|
521
|
-
|
|
522
|
-
4. Commit using [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) format (`git commit -m 'feat: add some feature'`)
|
|
523
|
-
5. Push to the branch (`git push origin my-new-feature`)
|
|
524
|
-
6. Create a new Pull Request
|
|
632
|
+
Bug reports and pull requests are welcome on [GitHub](https://github.com/svyatov/sec_id). See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, code style, and PR guidelines.
|
|
633
|
+
|
|
634
|
+
This project follows the [Contributor Covenant Code of Conduct](CODE_OF_CONDUCT.md).
|
|
525
635
|
|
|
526
636
|
## Changelog
|
|
527
637
|
|
data/lib/sec_id/base.rb
CHANGED
|
@@ -52,6 +52,7 @@ module SecID
|
|
|
52
52
|
# @api private
|
|
53
53
|
def self.inherited(subclass)
|
|
54
54
|
super
|
|
55
|
+
# Skip anonymous classes and classes outside the SecID namespace (e.g. in tests)
|
|
55
56
|
SecID.__send__(:register_identifier, subclass) if subclass.name&.start_with?('SecID::')
|
|
56
57
|
end
|
|
57
58
|
|
|
@@ -63,8 +64,53 @@ module SecID
|
|
|
63
64
|
raise NotImplementedError
|
|
64
65
|
end
|
|
65
66
|
|
|
67
|
+
# @param other [Object]
|
|
68
|
+
# @return [Boolean]
|
|
69
|
+
def ==(other)
|
|
70
|
+
other.class == self.class && comparison_id == other.comparison_id
|
|
71
|
+
end
|
|
72
|
+
|
|
73
|
+
alias eql? ==
|
|
74
|
+
|
|
75
|
+
# @return [Integer]
|
|
76
|
+
def hash
|
|
77
|
+
[self.class, comparison_id].hash
|
|
78
|
+
end
|
|
79
|
+
|
|
80
|
+
# Returns a hash representation of this identifier for serialization.
|
|
81
|
+
#
|
|
82
|
+
# @return [Hash] hash with :type, :full_id, :normalized, :valid, and :components keys
|
|
83
|
+
def to_h
|
|
84
|
+
{
|
|
85
|
+
type: self.class.short_name.downcase.to_sym,
|
|
86
|
+
full_id: full_id,
|
|
87
|
+
normalized: valid? ? normalized : nil,
|
|
88
|
+
valid: valid?,
|
|
89
|
+
components: components
|
|
90
|
+
}
|
|
91
|
+
end
|
|
92
|
+
|
|
93
|
+
# Returns a JSON-compatible hash representation.
|
|
94
|
+
#
|
|
95
|
+
# @return [Hash]
|
|
96
|
+
def as_json(*)
|
|
97
|
+
to_h
|
|
98
|
+
end
|
|
99
|
+
|
|
100
|
+
protected
|
|
101
|
+
|
|
102
|
+
# @return [String]
|
|
103
|
+
def comparison_id
|
|
104
|
+
valid? ? normalized : full_id
|
|
105
|
+
end
|
|
106
|
+
|
|
66
107
|
private
|
|
67
108
|
|
|
109
|
+
# @return [Hash]
|
|
110
|
+
def components
|
|
111
|
+
{}
|
|
112
|
+
end
|
|
113
|
+
|
|
68
114
|
# @param sec_id_number [String, #to_s] the identifier to parse
|
|
69
115
|
# @return [MatchData, Hash] the regex match data or empty hash if no match
|
|
70
116
|
def parse(sec_id_number)
|
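Note on the `#==` / `#eql?` / `#hash` additions in base.rb: a stand-alone sketch of the same equality contract. `SketchId` and its simplified `normalized` method are hypothetical; only the structure of `==`, `eql?`, `hash`, and the protected `comparison_id` follows the diff above.

```ruby
require 'set'

# Hypothetical stand-in class; only the equality contract mirrors the diff.
class SketchId
  attr_reader :full_id

  def initialize(full_id)
    @full_id = full_id
  end

  # Simplified normalization for the sketch: drop whitespace and hyphens, upcase.
  def normalized
    full_id.gsub(/[\s-]/, '').upcase
  end

  def ==(other)
    other.class == self.class && comparison_id == other.comparison_id
  end
  alias eql? ==

  # Must agree with eql? so that equal objects land in the same Hash bucket.
  def hash
    [self.class, comparison_id].hash
  end

  protected

  # Protected so #== can call it on `other` without exposing it publicly.
  def comparison_id
    normalized
  end
end

a = SketchId.new('US5949181045')
b = SketchId.new('us 5949 1810 45')
a == b               # => true
{ a => 'MSFT' }[b]   # => "MSFT"
Set.new([a, b]).size # => 1
```

Including the class in both the comparison and the hash keeps two identifiers of different types from comparing equal even when their normalized strings happen to match.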
data/lib/sec_id/cei.rb
CHANGED
|
@@ -46,6 +46,13 @@ module SecID
|
|
|
46
46
|
@check_digit = cei_parts[:check_digit]&.to_i
|
|
47
47
|
end
|
|
48
48
|
|
|
49
|
+
private
|
|
50
|
+
|
|
51
|
+
# @return [Hash]
|
|
52
|
+
def components = { prefix:, numeric:, entity_id:, check_digit: }
|
|
53
|
+
|
|
54
|
+
public
|
|
55
|
+
|
|
49
56
|
# @return [Integer] the calculated check digit (0-9)
|
|
50
57
|
# @raise [InvalidFormatError] if the CEI format is invalid
|
|
51
58
|
def calculate_check_digit
|
data/lib/sec_id/cfi.rb
CHANGED
|
@@ -164,7 +164,22 @@ module SecID
|
|
|
164
164
|
'C' => :combined_instruments,
|
|
165
165
|
'M' => :miscellaneous
|
|
166
166
|
}
|
|
167
|
-
}.freeze
|
|
167
|
+
}.each_value(&:freeze).freeze
|
|
168
|
+
|
|
169
|
+
# Returns the category codes hash.
|
|
170
|
+
#
|
|
171
|
+
# @return [Hash{String => Symbol}]
|
|
172
|
+
def self.categories
|
|
173
|
+
CATEGORIES
|
|
174
|
+
end
|
|
175
|
+
|
|
176
|
+
# Returns the groups hash for a given category code.
|
|
177
|
+
#
|
|
178
|
+
# @param category_code [String] single-letter category code
|
|
179
|
+
# @return [Hash{String => Symbol}, nil]
|
|
180
|
+
def self.groups_for(category_code)
|
|
181
|
+
GROUPS[category_code.to_s.upcase]
|
|
182
|
+
end
|
|
168
183
|
|
|
169
184
|
# @return [String, nil] the category code (position 1)
|
|
170
185
|
attr_reader :category_code
|
|
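Note on the change from `.freeze` to `.each_value(&:freeze).freeze` above: freezing only the outer hash leaves the nested hashes mutable, which matters once `groups_for` hands them to callers. A small sketch with a hypothetical two-level table (`GROUPS_SKETCH` and its entries are illustrative, not the gem's full data):

```ruby
# Hypothetical two-level table shaped like a category => groups lookup.
GROUPS_SKETCH = {
  'E' => { 'S' => :common_shares, 'P' => :preferred_shares },
  'M' => { 'C' => :combined_instruments, 'M' => :miscellaneous }
}.each_value(&:freeze).freeze # each_value returns the hash itself

GROUPS_SKETCH.frozen?      # => true
GROUPS_SKETCH['E'].frozen? # => true

begin
  GROUPS_SKETCH['E']['X'] = :oops
rescue FrozenError => e
  puts "inner hash rejected the write: #{e.class}"
end
```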
@@ -299,6 +314,9 @@ module SecID
|
|
|
299
314
|
|
|
300
315
|
private
|
|
301
316
|
|
|
317
|
+
# @return [Hash]
|
|
318
|
+
def components = { category_code:, group_code:, attr1:, attr2:, attr3:, attr4: }
|
|
319
|
+
|
|
302
320
|
# @return [Boolean]
|
|
303
321
|
def valid_format?
|
|
304
322
|
super && valid_category? && valid_group?
|
data/lib/sec_id/concerns/normalizable.rb
CHANGED
|
@@ -23,6 +23,15 @@ module SecID
|
|
|
23
23
|
cleaned = id.to_s.strip.gsub(self::SEPARATORS, '')
|
|
24
24
|
new(cleaned.upcase).normalized
|
|
25
25
|
end
|
|
26
|
+
|
|
27
|
+
# Returns a human-readable formatted string, or nil if invalid.
|
|
28
|
+
#
|
|
29
|
+
# @param id [String, #to_s] the identifier to format
|
|
30
|
+
# @return [String, nil]
|
|
31
|
+
def to_pretty_s(id)
|
|
32
|
+
cleaned = id.to_s.strip.gsub(self::SEPARATORS, '')
|
|
33
|
+
new(cleaned.upcase).to_pretty_s
|
|
34
|
+
end
|
|
26
35
|
end
|
|
27
36
|
|
|
28
37
|
# Returns the canonical normalized form of this identifier.
|
|
@@ -48,6 +57,15 @@ module SecID
|
|
|
48
57
|
self
|
|
49
58
|
end
|
|
50
59
|
|
|
60
|
+
# Returns a human-readable formatted string, or nil if invalid.
|
|
61
|
+
#
|
|
62
|
+
# @return [String, nil]
|
|
63
|
+
def to_pretty_s
|
|
64
|
+
return nil unless valid?
|
|
65
|
+
|
|
66
|
+
to_s
|
|
67
|
+
end
|
|
68
|
+
|
|
51
69
|
# @return [String]
|
|
52
70
|
def to_s
|
|
53
71
|
identifier.to_s
|
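Note on the class-level `to_pretty_s(id)` above: it reuses the same clean-up steps as `normalize` before building an instance. The sketch below isolates those steps. `SKETCH_SEPARATORS` is an assumed value; each identifier class defines its own `SEPARATORS` constant, which is not shown in this diff.

```ruby
# Assumed separator set for this sketch (whitespace and hyphens).
SKETCH_SEPARATORS = /[\s-]/

# Mirrors the clean-up chain in the diff: to_s, strip, remove separators, upcase.
def clean_identifier(id)
  id.to_s.strip.gsub(SKETCH_SEPARATORS, '').upcase
end

clean_identifier(' us 5949-1810 45 ') # => "US5949181045"
clean_identifier(nil)                 # => ""
```

Calling `to_s` first means `nil` and non-string input degrade to an empty or stringified value instead of raising, which fits the documented "returns `nil` for invalid input" behaviour once validation runs.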
data/lib/sec_id/cusip.rb
CHANGED
|
@@ -45,6 +45,13 @@ module SecID
|
|
|
45
45
|
@check_digit = cusip_parts[:check_digit]&.to_i
|
|
46
46
|
end
|
|
47
47
|
|
|
48
|
+
# @return [String, nil]
|
|
49
|
+
def to_pretty_s
|
|
50
|
+
return nil unless valid?
|
|
51
|
+
|
|
52
|
+
"#{cusip6} #{issue} #{check_digit}"
|
|
53
|
+
end
|
|
54
|
+
|
|
48
55
|
# @return [Integer] the calculated check digit (0-9)
|
|
49
56
|
# @raise [InvalidFormatError] if the CUSIP format is invalid
|
|
50
57
|
def calculate_check_digit
|
|
@@ -63,6 +70,13 @@ module SecID
|
|
|
63
70
|
ISIN.new(country_code + restore).restore!
|
|
64
71
|
end
|
|
65
72
|
|
|
73
|
+
private
|
|
74
|
+
|
|
75
|
+
# @return [Hash]
|
|
76
|
+
def components = { cusip6:, issue:, check_digit: }
|
|
77
|
+
|
|
78
|
+
public
|
|
79
|
+
|
|
66
80
|
# @return [Boolean] true if first character is a letter (CINS identifier)
|
|
67
81
|
def cins?
|
|
68
82
|
cusip6[0] < '0' || cusip6[0] > '9'
|
data/lib/sec_id/errors.rb
CHANGED
data/lib/sec_id/figi.rb
CHANGED
|
@@ -50,6 +50,13 @@ module SecID
|
|
|
50
50
|
@check_digit = figi_parts[:check_digit]&.to_i
|
|
51
51
|
end
|
|
52
52
|
|
|
53
|
+
# @return [String, nil]
|
|
54
|
+
def to_pretty_s
|
|
55
|
+
return nil unless valid?
|
|
56
|
+
|
|
57
|
+
"#{prefix}G #{random_part} #{check_digit}"
|
|
58
|
+
end
|
|
59
|
+
|
|
53
60
|
# @return [Integer] the calculated check digit (0-9)
|
|
54
61
|
# @raise [InvalidFormatError] if the FIGI format is invalid
|
|
55
62
|
def calculate_check_digit
|
|
@@ -59,6 +66,9 @@ module SecID
|
|
|
59
66
|
|
|
60
67
|
private
|
|
61
68
|
|
|
69
|
+
# @return [Hash]
|
|
70
|
+
def components = { prefix:, random_part:, check_digit: }
|
|
71
|
+
|
|
62
72
|
# @return [Boolean]
|
|
63
73
|
def valid_format?
|
|
64
74
|
!identifier.nil? && !RESTRICTED_PREFIXES.include?(prefix)
|
data/lib/sec_id/fisn.rb
CHANGED
data/lib/sec_id/iban.rb
CHANGED
|
@@ -33,6 +33,13 @@ module SecID
|
|
|
33
33
|
(?<rest>[A-Z0-9]{13,32})
|
|
34
34
|
\z/x
|
|
35
35
|
|
|
36
|
+
# Returns sorted array of all supported country codes.
|
|
37
|
+
#
|
|
38
|
+
# @return [Array<String>]
|
|
39
|
+
def self.supported_countries
|
|
40
|
+
@supported_countries ||= (COUNTRY_RULES.keys + LENGTH_ONLY_COUNTRIES.keys).sort.freeze
|
|
41
|
+
end
|
|
42
|
+
|
|
36
43
|
# @return [String, nil] the ISO 3166-1 alpha-2 country code
|
|
37
44
|
attr_reader :country_code
|
|
38
45
|
|
|
@@ -106,8 +113,23 @@ module SecID
|
|
|
106
113
|
"#{country_code}#{check_digit.to_s.rjust(2, '0')}#{bban}"
|
|
107
114
|
end
|
|
108
115
|
|
|
116
|
+
# @return [String, nil]
|
|
117
|
+
def to_pretty_s
|
|
118
|
+
to_s.scan(/.{1,4}/).join(' ') if valid?
|
|
119
|
+
end
|
|
120
|
+
|
|
109
121
|
private
|
|
110
122
|
|
|
123
|
+
# @return [Hash]
|
|
124
|
+
def components
|
|
125
|
+
hash = { country_code:, bban:, check_digit: }
|
|
126
|
+
hash[:bank_code] = bank_code if bank_code
|
|
127
|
+
hash[:branch_code] = branch_code if branch_code
|
|
128
|
+
hash[:account_number] = account_number if account_number
|
|
129
|
+
hash[:national_check] = national_check if national_check
|
|
130
|
+
hash
|
|
131
|
+
end
|
|
132
|
+
|
|
111
133
|
# @return [Integer]
|
|
112
134
|
def check_digit_width
|
|
113
135
|
2
|
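Note on `to_s.scan(/.{1,4}/).join(' ')` in `IBAN#to_pretty_s` (the LEI version uses the same expression): the pattern takes up to four characters at a time, so a trailing group may be shorter. A plain-Ruby sketch using the IBAN from the README:

```ruby
iban = 'DE89370400440532013000'

# Split into chunks of up to four characters; the last chunk may be shorter.
pretty = iban.scan(/.{1,4}/).join(' ')
pretty # => "DE89 3704 0044 0532 0130 00"

# Removing the spaces restores the compact form.
pretty.delete(' ') == iban # => true
```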
data/lib/sec_id/isin.rb
CHANGED
|
@@ -76,6 +76,13 @@ module SecID
|
|
|
76
76
|
@check_digit = isin_parts[:check_digit]&.to_i
|
|
77
77
|
end
|
|
78
78
|
|
|
79
|
+
# @return [String, nil]
|
|
80
|
+
def to_pretty_s
|
|
81
|
+
return nil unless valid?
|
|
82
|
+
|
|
83
|
+
"#{country_code} #{nsin} #{check_digit}"
|
|
84
|
+
end
|
|
85
|
+
|
|
79
86
|
# @return [Integer] the calculated check digit (0-9)
|
|
80
87
|
# @raise [InvalidFormatError] if the ISIN format is invalid
|
|
81
88
|
def calculate_check_digit
|
|
@@ -135,6 +142,13 @@ module SecID
|
|
|
135
142
|
Valoren.new(nsin)
|
|
136
143
|
end
|
|
137
144
|
|
|
145
|
+
private
|
|
146
|
+
|
|
147
|
+
# @return [Hash]
|
|
148
|
+
def components = { country_code:, nsin:, check_digit: }
|
|
149
|
+
|
|
150
|
+
public
|
|
151
|
+
|
|
138
152
|
# Returns the type of NSIN embedded in this ISIN.
|
|
139
153
|
#
|
|
140
154
|
# @return [Symbol] :cusip, :sedol, :wkn, :valoren, or :generic
|
data/lib/sec_id/lei.rb
CHANGED
|
@@ -50,6 +50,13 @@ module SecID
|
|
|
50
50
|
@check_digit = lei_parts[:check_digit]&.to_i
|
|
51
51
|
end
|
|
52
52
|
|
|
53
|
+
# @return [String, nil]
|
|
54
|
+
def to_pretty_s
|
|
55
|
+
return nil unless valid?
|
|
56
|
+
|
|
57
|
+
to_s.scan(/.{1,4}/).join(' ')
|
|
58
|
+
end
|
|
59
|
+
|
|
53
60
|
# @return [Integer] the calculated 2-digit check digit (1-98)
|
|
54
61
|
# @raise [InvalidFormatError] if the LEI format is invalid
|
|
55
62
|
def calculate_check_digit
|
|
@@ -59,6 +66,9 @@ module SecID
|
|
|
59
66
|
|
|
60
67
|
private
|
|
61
68
|
|
|
69
|
+
# @return [Hash]
|
|
70
|
+
def components = { lou_id:, reserved:, entity_id:, check_digit: }
|
|
71
|
+
|
|
62
72
|
# @return [Integer]
|
|
63
73
|
def check_digit_width
|
|
64
74
|
2
|
data/lib/sec_id/occ.rb
CHANGED
|
@@ -138,8 +138,18 @@ module SecID
|
|
|
138
138
|
full_id
|
|
139
139
|
end
|
|
140
140
|
|
|
141
|
+
# @return [String, nil]
|
|
142
|
+
def to_pretty_s
|
|
143
|
+
return nil unless valid?
|
|
144
|
+
|
|
145
|
+
"#{underlying} #{date_str} #{type} #{strike_mills}"
|
|
146
|
+
end
|
|
147
|
+
|
|
141
148
|
private
|
|
142
149
|
|
|
150
|
+
# @return [Hash]
|
|
151
|
+
def components = { underlying:, date_str:, type:, strike_mills: }
|
|
152
|
+
|
|
143
153
|
# @return [Array<Symbol>]
|
|
144
154
|
def error_codes
|
|
145
155
|
return detect_errors unless valid_format?
|
data/lib/sec_id/scanner.rb
ADDED
|
@@ -0,0 +1,144 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module SecID
|
|
4
|
+
# Immutable value object representing a matched identifier found in text.
|
|
5
|
+
Match = Data.define(:type, :raw, :range, :identifier)
|
|
6
|
+
|
|
7
|
+
# Finds securities identifiers in freeform text using regex candidate extraction,
|
|
8
|
+
# length/charset pre-filtering, and cursor-based overlap prevention.
|
|
9
|
+
#
|
|
10
|
+
# @api private
|
|
11
|
+
class Scanner
|
|
12
|
+
# Composite regex for candidate extraction.
|
|
13
|
+
#
|
|
14
|
+
# Three named groups tried left-to-right via alternation:
|
|
15
|
+
# - fisn: contains `/` (unique FISN delimiter)
|
|
16
|
+
# - occ: contains structural spaces + date/type pattern
|
|
17
|
+
# - simple: common alphanumeric tokens (covers all other types)
|
|
18
|
+
CANDIDATE_RE = %r{
|
|
19
|
+
(?<![A-Za-z0-9*@\#/.$])
|
|
20
|
+
(?:
|
|
21
|
+
(?<fisn>[A-Za-z0-9](?:[A-Za-z0-9 ]{0,33}[A-Za-z0-9])?/[A-Za-z0-9](?:[A-Za-z0-9 ]{0,33}[A-Za-z0-9])?)
|
|
22
|
+
|
|
|
23
|
+
(?<occ>[A-Za-z]{1,6}\ {1,5}\d{6}[CcPp]\d{8})
|
|
24
|
+
|
|
|
25
|
+
(?<simple>[A-Za-z0-9*@\#](?:[A-Za-z0-9*@\#-]{0,40}[A-Za-z0-9*@\#])?)
|
|
26
|
+
)
|
|
27
|
+
(?![A-Za-z0-9*@\#.])
|
|
28
|
+
}x
|
|
29
|
+
|
|
30
|
+
# @param identifier_list [Array<Class>] registered identifier classes
|
|
31
|
+
def initialize(identifier_list)
|
|
32
|
+
@classes = identifier_list.dup.freeze
|
|
33
|
+
precompute
|
|
34
|
+
end
|
|
35
|
+
|
|
36
|
+
# Scans text for identifiers, yielding or returning matches.
|
|
37
|
+
#
|
|
38
|
+
# @param text [String, nil] the text to scan
|
|
39
|
+
# @param classes [Array<Class>, nil] restrict to specific classes
|
|
40
|
+
# @return [Enumerator<Match>] if no block given
|
|
41
|
+
# @yieldparam match [Match]
|
|
42
|
+
def call(text, classes: nil, &block)
|
|
43
|
+
return enum_for(:call, text, classes: classes) unless block
|
|
44
|
+
|
|
45
|
+
input = text.to_s
|
|
46
|
+
return if input.empty?
|
|
47
|
+
|
|
48
|
+
scan_text(input, classes || @classes, &block)
|
|
49
|
+
end
|
|
50
|
+
|
|
51
|
+
private
|
|
52
|
+
|
|
53
|
+
# @return [void]
|
|
54
|
+
def precompute # rubocop:disable Metrics/AbcSize
|
|
55
|
+
build_key_table
|
|
56
|
+
build_priority_table
|
|
57
|
+
@fisn_classes = @classes.select { |k| k.short_name == 'FISN' }
|
|
58
|
+
@occ_classes = @classes.select { |k| k.short_name == 'OCC' }
|
|
59
|
+
@simple_classes = @classes - @fisn_classes - @occ_classes
|
|
60
|
+
@candidates_by_length = Hash.new { |h, k| h[k] = [] }
|
|
61
|
+
@classes.each do |klass|
|
|
62
|
+
id_length = klass::ID_LENGTH
|
|
63
|
+
lengths = id_length.is_a?(Range) ? id_length : [id_length]
|
|
64
|
+
lengths.each { |len| @candidates_by_length[len] << klass }
|
|
65
|
+
end
|
|
66
|
+
@candidates_by_length.each_value(&:freeze)
|
|
67
|
+
end
|
|
68
|
+
|
|
69
|
+
# @return [void]
|
|
70
|
+
def build_key_table
|
|
71
|
+
@key_for = {}
|
|
72
|
+
@classes.each { |klass| @key_for[klass] = klass.short_name.downcase.to_sym }
|
|
73
|
+
@key_for.freeze
|
|
74
|
+
end
|
|
75
|
+
|
|
76
|
+
# @return [void]
|
|
77
|
+
def build_priority_table
|
|
78
|
+
@priority_for = {}
|
|
79
|
+
@classes.each_with_index do |klass, index|
|
|
80
|
+
check_digit_rank = klass.has_check_digit? ? 0 : 1
|
|
81
|
+
id_length = klass::ID_LENGTH
|
|
82
|
+
range_size = id_length.is_a?(Range) ? id_length.size : 1
|
|
83
|
+
@priority_for[klass] = [check_digit_rank, range_size, index].freeze
|
|
84
|
+
end
|
|
85
|
+
@priority_for.freeze
|
|
86
|
+
end
|
|
87
|
+
|
|
88
|
+
# @param input [String]
|
|
89
|
+
# @param target_classes [Array<Class>]
|
|
90
|
+
# @return [void]
|
|
91
|
+
def scan_text(input, target_classes)
|
|
92
|
+
pos = 0
|
|
93
|
+
while pos < input.length
|
|
94
|
+
match_data = CANDIDATE_RE.match(input, pos)
|
|
95
|
+
break unless match_data
|
|
96
|
+
|
|
97
|
+
result = identify_candidate(match_data, target_classes)
|
|
98
|
+
if result
|
|
99
|
+
yield result
|
|
100
|
+
pos = match_data.end(0)
|
|
101
|
+
else
|
|
102
|
+
pos = match_data.begin(0) + 1
|
|
103
|
+
end
|
|
104
|
+
end
|
|
105
|
+
end
|
|
106
|
+
|
|
107
|
+
# @param match_data [MatchData]
|
|
108
|
+
# @param target_classes [Array<Class>]
|
|
109
|
+
# @return [Match, nil]
|
|
110
|
+
def identify_candidate(match_data, target_classes)
|
|
111
|
+
raw = match_data[0]
|
|
112
|
+
start_pos = match_data.begin(0)
|
|
113
|
+
|
|
114
|
+
if match_data[:fisn]
|
|
115
|
+
try_classes(raw, raw.upcase, start_pos, target_classes & @fisn_classes)
|
|
116
|
+
elsif match_data[:occ]
|
|
117
|
+
try_classes(raw, raw.upcase, start_pos, target_classes & @occ_classes)
|
|
118
|
+
else
|
|
119
|
+
cleaned = raw.gsub('-', '').upcase
|
|
120
|
+
try_classes(raw, cleaned, start_pos, target_classes & @simple_classes)
|
|
121
|
+
end
|
|
122
|
+
end
|
|
123
|
+
|
|
124
|
+
# @return [Match, nil]
|
|
125
|
+
def try_classes(raw, cleaned, start_pos, classes)
|
|
126
|
+
best = best_match(cleaned, classes)
|
|
127
|
+
return unless best
|
|
128
|
+
|
|
129
|
+
end_pos = start_pos + raw.length
|
|
130
|
+
Match.new(type: @key_for[best], raw: raw, range: start_pos...end_pos, identifier: best.new(cleaned))
|
|
131
|
+
end
|
|
132
|
+
|
|
133
|
+
# @return [Class, nil]
|
|
134
|
+
def best_match(cleaned, classes)
|
|
135
|
+
return if classes.empty?
|
|
136
|
+
|
|
137
|
+
candidates = (@candidates_by_length[cleaned.length] || []) & classes
|
|
138
|
+
return if candidates.empty?
|
|
139
|
+
|
|
140
|
+
validated = candidates.select { |k| cleaned.match?(k::VALID_CHARS_REGEX) && k.valid?(cleaned) }
|
|
141
|
+
validated.min_by { |k| @priority_for[k] }
|
|
142
|
+
end
|
|
143
|
+
end
|
|
144
|
+
end
|
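`best_match` picks its winner with `min_by` over the frozen `[check_digit_rank, range_size, index]` triples built in `build_priority_table`. Ruby compares arrays element by element, so a type with a check digit beats one without, a fixed-length type beats a variable-length one, and registration order breaks any remaining tie. Here is a small sketch of that ordering, using a hypothetical `ToyType` struct and made-up type names rather than the gem's classes:

```ruby
# Hypothetical stand-in for an identifier class: name, check-digit flag, ID length.
ToyType = Struct.new(:name, :check_digit, :id_length)

TOY_TYPES = [
  ToyType.new(:loose, false, 5..9),  # no check digit, variable length
  ToyType.new(:plain, false, 6),     # no check digit, fixed length
  ToyType.new(:ranged, true, 6..8),  # check digit, variable length
  ToyType.new(:strict, true, 6)      # check digit, fixed length
].freeze

# Same shape as the triples in build_priority_table: lower values sort first.
def toy_priority(type, index)
  check_digit_rank = type.check_digit ? 0 : 1
  range_size = type.id_length.is_a?(Range) ? type.id_length.size : 1
  [check_digit_rank, range_size, index]
end

TOY_PRIORITY = TOY_TYPES.each_with_index.map do |type, index|
  [type.name, toy_priority(type, index)]
end.to_h

# Pretend every type validated the same 6-character input.
winner = TOY_PRIORITY.keys.min_by { |name| TOY_PRIORITY[name] }
ranking = TOY_PRIORITY.keys.sort_by { |name| TOY_PRIORITY[name] }

winner   # => :strict
ranking  # => [:strict, :ranged, :plain, :loose]
```

Note that `:loose` is registered first yet ranks last: the index only matters once the first two elements are equal.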
data/lib/sec_id/sedol.rb
CHANGED
data/lib/sec_id/valoren.rb
CHANGED
|
@@ -40,6 +40,13 @@ module SecID
|
|
|
40
40
|
@identifier = valoren_parts[:identifier]
|
|
41
41
|
end
|
|
42
42
|
|
|
43
|
+
# @return [String, nil]
|
|
44
|
+
def to_pretty_s
|
|
45
|
+
return nil unless valid?
|
|
46
|
+
|
|
47
|
+
identifier.reverse.scan(/.{1,3}/).join(' ').reverse
|
|
48
|
+
end
|
|
49
|
+
|
|
43
50
|
# @param country_code [String] the ISO 3166-1 alpha-2 country code (default: 'CH')
|
|
44
51
|
# @return [ISIN] a new ISIN instance with calculated check digit
|
|
45
52
|
# @raise [InvalidFormatError] if the country code is not CH or LI
|
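`to_pretty_s` groups the Valoren digits in threes from the right by reversing the string, chunking it with `scan(/.{1,3}/)`, joining with spaces, and reversing the result. A standalone sketch of that trick follows; the helper name `group_from_right` and the sample digit strings are chosen only for illustration.

```ruby
# Groups a digit string in threes, counting from the right-hand side.
def group_from_right(digits)
  digits.reverse.scan(/.{1,3}/).join(' ').reverse
end

group_from_right('3886335')   # => "3 886 335"
group_from_right('12345678')  # => "12 345 678"
group_from_right('123')       # => "123"
```

Reversing first means the short, leftover group ends up at the front of the final string, which is where a thousands-style grouping expects it.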
data/lib/sec_id/version.rb
CHANGED
data/lib/sec_id.rb
CHANGED
|
@@ -15,6 +15,9 @@ module SecID
|
|
|
15
15
|
# Raised for type-specific structural errors (invalid prefix, category, group, BBAN, or date).
|
|
16
16
|
class InvalidStructureError < Error; end
|
|
17
17
|
|
|
18
|
+
# Raised when multiple identifier types match and on_ambiguous: :raise is used.
|
|
19
|
+
class AmbiguousMatchError < Error; end
|
|
20
|
+
|
|
18
21
|
class << self
|
|
19
22
|
# Looks up an identifier class by its symbol key.
|
|
20
23
|
#
|
|
@@ -54,45 +57,85 @@ module SecID
|
|
|
54
57
|
types.any? { |key| self[key].valid?(str) }
|
|
55
58
|
end
|
|
56
59
|
|
|
57
|
-
#
|
|
58
|
-
#
|
|
60
|
+
# @param text [String, nil] the text to scan
|
|
61
|
+
# @param types [Array<Symbol>, nil] restrict to specific types
|
|
62
|
+
# @return [Array<Match>]
|
|
63
|
+
# @raise [ArgumentError] if any key in types is unknown
|
|
64
|
+
def extract(text, types: nil)
|
|
65
|
+
scan(text, types: types).to_a
|
|
66
|
+
end
|
|
67
|
+
|
|
68
|
+
# @param text [String, nil] the text to scan
|
|
69
|
+
# @param types [Array<Symbol>, nil] restrict to specific types
|
|
70
|
+
# @return [Enumerator<Match>] if no block given
|
|
71
|
+
# @yieldparam match [Match]
|
|
72
|
+
# @raise [ArgumentError] if any key in types is unknown
|
|
73
|
+
def scan(text, types: nil, &)
|
|
74
|
+
classes = types&.map { |key| self[key] }
|
|
75
|
+
scanner.call(text, classes: classes, &)
|
|
76
|
+
end
|
|
77
|
+
|
|
78
|
+
# @param str [String, nil] the identifier string to explain
|
|
79
|
+
# @param types [Array<Symbol>, nil] restrict to specific types
|
|
80
|
+
# @return [Hash] hash with :input and :candidates keys
|
|
81
|
+
def explain(str, types: nil)
|
|
82
|
+
input = str.to_s.strip
|
|
83
|
+
target_keys = types || identifier_list.map { |k| k.short_name.downcase.to_sym }
|
|
84
|
+
candidates = target_keys.map do |key|
|
|
85
|
+
instance = self[key].new(input)
|
|
86
|
+
{ type: key, valid: instance.valid?, errors: instance.errors.details }
|
|
87
|
+
end
|
|
88
|
+
{ input: input, candidates: candidates }
|
|
89
|
+
end
|
|
90
|
+
|
|
59
91
|
# @param str [String, nil] the identifier string to parse
|
|
60
92
|
# @param types [Array<Symbol>, nil] restrict to specific types (e.g. [:isin, :cusip])
|
|
61
|
-
# @
|
|
62
|
-
# @
|
|
63
|
-
|
|
64
|
-
|
|
93
|
+
# @param on_ambiguous [:first, :raise, :all] how to handle multiple matches
|
|
94
|
+
# @return [SecID::Base, nil, Array<SecID::Base>] depends on on_ambiguous mode
|
|
95
|
+
# @raise [AmbiguousMatchError] when on_ambiguous: :raise and multiple types match
|
|
96
|
+
def parse(str, types: nil, on_ambiguous: :first)
|
|
97
|
+
case on_ambiguous
|
|
98
|
+
when :first then types.nil? ? parse_any(str) : parse_from(str, types)
|
|
99
|
+
when :raise then parse_strict(str, types)
|
|
100
|
+
when :all then parse_all(str, types)
|
|
101
|
+
else raise ArgumentError, "Unknown on_ambiguous mode: #{on_ambiguous.inspect}"
|
|
102
|
+
end
|
|
65
103
|
end
|
|
66
104
|
|
|
67
|
-
# Parses a string into the most specific matching identifier instance, raising on failure.
|
|
68
|
-
#
|
|
69
105
|
# @param str [String, nil] the identifier string to parse
|
|
70
106
|
# @param types [Array<Symbol>, nil] restrict to specific types (e.g. [:isin, :cusip])
|
|
71
|
-
# @
|
|
107
|
+
# @param on_ambiguous [:first, :raise, :all] how to handle multiple matches
|
|
108
|
+
# @return [SecID::Base, Array<SecID::Base>] depends on on_ambiguous mode
|
|
72
109
|
# @raise [InvalidFormatError] if no matching identifier type is found
|
|
73
|
-
# @raise [
|
|
74
|
-
def parse!(str, types: nil)
|
|
75
|
-
parse(str, types: types
|
|
110
|
+
# @raise [AmbiguousMatchError] when on_ambiguous: :raise and multiple types match
|
|
111
|
+
def parse!(str, types: nil, on_ambiguous: :first)
|
|
112
|
+
result = parse(str, types: types, on_ambiguous: on_ambiguous)
|
|
113
|
+
|
|
114
|
+
if on_ambiguous == :all
|
|
115
|
+
raise(InvalidFormatError, parse_error_message(str, types)) if result.empty?
|
|
116
|
+
|
|
117
|
+
return result
|
|
118
|
+
end
|
|
119
|
+
|
|
120
|
+
result || raise(InvalidFormatError, parse_error_message(str, types))
|
|
76
121
|
end
|
|
77
122
|
|
|
78
123
|
private
|
|
79
124
|
|
|
80
|
-
# @param klass [Class] the identifier class to register
|
|
81
125
|
# @return [void]
|
|
82
126
|
def register_identifier(klass)
|
|
83
127
|
key = klass.name.split('::').last.downcase.to_sym
|
|
84
128
|
identifier_map[key] = klass
|
|
85
129
|
identifier_list << klass
|
|
86
130
|
@detector = nil
|
|
131
|
+
@scanner = nil
|
|
87
132
|
end
|
|
88
133
|
|
|
89
|
-
# @return [SecID::Base, nil]
|
|
90
134
|
def parse_any(str)
|
|
91
135
|
key = detect(str).first
|
|
92
136
|
key && self[key].new(str)
|
|
93
137
|
end
|
|
94
138
|
|
|
95
|
-
# @return [SecID::Base, nil]
|
|
96
139
|
def parse_from(str, types)
|
|
97
140
|
types.each do |key|
|
|
98
141
|
instance = self[key].new(str)
|
|
@@ -101,26 +144,37 @@ module SecID
|
|
|
101
144
|
nil
|
|
102
145
|
end
|
|
103
146
|
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
147
|
+
def parse_strict(str, types)
|
|
148
|
+
candidates = resolve_candidates(str, types)
|
|
149
|
+
raise AmbiguousMatchError, ambiguous_message(str, candidates) if candidates.size > 1
|
|
150
|
+
|
|
151
|
+
candidates.first && self[candidates.first].new(str)
|
|
108
152
|
end
|
|
109
153
|
|
|
110
|
-
|
|
111
|
-
|
|
112
|
-
@detector ||= Detector.new(identifier_list)
|
|
154
|
+
def parse_all(str, types)
|
|
155
|
+
resolve_candidates(str, types).map { |key| self[key].new(str) }
|
|
113
156
|
end
|
|
114
157
|
|
|
115
|
-
# @return [
|
|
116
|
-
def
|
|
117
|
-
|
|
158
|
+
# @return [Array<Symbol>]
|
|
159
|
+
def resolve_candidates(str, types)
|
|
160
|
+
types ? types.select { |key| self[key].valid?(str) } : detect(str)
|
|
118
161
|
end
|
|
119
162
|
|
|
120
|
-
# @return [
|
|
121
|
-
def
|
|
122
|
-
|
|
163
|
+
# @return [String]
|
|
164
|
+
def ambiguous_message(str, candidates)
|
|
165
|
+
"Ambiguous identifier #{str.to_s.strip.inspect}: matches #{candidates.inspect}"
|
|
166
|
+
end
|
|
167
|
+
|
|
168
|
+
# @return [String]
|
|
169
|
+
def parse_error_message(str, types)
|
|
170
|
+
base = "No matching identifier type found for #{str.to_s.strip.inspect}"
|
|
171
|
+
types ? "#{base} among #{types.inspect}" : base
|
|
123
172
|
end
|
|
173
|
+
|
|
174
|
+
def detector = @detector ||= Detector.new(identifier_list)
|
|
175
|
+
def scanner = @scanner ||= Scanner.new(identifier_list)
|
|
176
|
+
def identifier_map = @identifier_map ||= {}
|
|
177
|
+
def identifier_list = @identifier_list ||= []
|
|
124
178
|
end
|
|
125
179
|
end
|
|
126
180
|
|
|
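The three `on_ambiguous` modes differ only in what happens once the list of matching types is known: `:first` takes the head, `:raise` refuses when more than one type matches, and `:all` returns everything. The sketch below mirrors that dispatch with toy validators so it can run without the gem. `TOY_VALIDATORS`, `toy_candidates`, `toy_parse`, and `ToyAmbiguousMatchError` are hypothetical names, and the sketch returns type keys where the gem returns identifier instances.

```ruby
class ToyAmbiguousMatchError < StandardError; end

# Hypothetical validators keyed by type, checked in registration order.
TOY_VALIDATORS = {
  alnum6: ->(str) { str.match?(/\A[A-Z0-9]{6}\z/) },
  digits: ->(str) { str.match?(/\A\d+\z/) }
}.freeze

# Counterpart of resolve_candidates: every type key that accepts the string.
def toy_candidates(str, types)
  keys = types || TOY_VALIDATORS.keys
  keys.select { |key| TOY_VALIDATORS.fetch(key).call(str) }
end

def toy_parse(str, types: nil, on_ambiguous: :first)
  candidates = toy_candidates(str, types)

  case on_ambiguous
  when :first then candidates.first
  when :all then candidates
  when :raise
    if candidates.size > 1
      raise ToyAmbiguousMatchError, "Ambiguous identifier #{str.inspect}: matches #{candidates.inspect}"
    end

    candidates.first
  else raise ArgumentError, "Unknown on_ambiguous mode: #{on_ambiguous.inspect}"
  end
end

toy_parse('123456')                       # => :alnum6
toy_parse('123456', on_ambiguous: :all)   # => [:alnum6, :digits]
toy_parse('ABC123', on_ambiguous: :raise) # => :alnum6
toy_parse('??')                           # => nil
```

Calling `toy_parse('123456', on_ambiguous: :raise)` raises `ToyAmbiguousMatchError`, since both toy types accept the input. Passing `types:` narrows the candidate set before the mode is applied, so a restricted call can succeed where an unrestricted one would be ambiguous.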
@@ -131,6 +185,7 @@ require 'sec_id/concerns/validatable'
|
|
|
131
185
|
require 'sec_id/concerns/checkable'
|
|
132
186
|
require 'sec_id/base'
|
|
133
187
|
require 'sec_id/detector'
|
|
188
|
+
require 'sec_id/scanner'
|
|
134
189
|
require 'sec_id/isin'
|
|
135
190
|
require 'sec_id/cusip'
|
|
136
191
|
require 'sec_id/sedol'
|
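In the `sec_id.rb` changes above, `detector` and `scanner` are memoized with `||=`, and `register_identifier` sets both `@detector` and `@scanner` back to `nil`, so a newly registered class is picked up the next time either helper is built. A minimal sketch of that memoize-and-invalidate pattern, using a hypothetical `ToyRegistry` class that is not part of the gem:

```ruby
class ToyRegistry
  def initialize
    @items = []
  end

  def register(item)
    @items << item
    @lookup = nil # drop the cached table so the next reader rebuilds it
  end

  # Built lazily on first use, then reused until register clears it.
  # (The gem writes such helpers as one-line endless methods.)
  def lookup
    @lookup ||= @items.each_with_index.to_h
  end
end

registry = ToyRegistry.new
registry.register(:isin)
cached = registry.lookup           # => {:isin=>0}
cached.equal?(registry.lookup)     # => true, same object is reused
registry.register(:cusip)
registry.lookup                    # => {:isin=>0, :cusip=>1}
```

The trade-off is that every registration discards the derived object, which is cheap as long as registrations happen at load time rather than during scanning.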
data/sec_id.gemspec
CHANGED
|
@@ -10,10 +10,10 @@ Gem::Specification.new do |spec|
|
|
|
10
10
|
spec.authors = ['Leonid Svyatov']
|
|
11
11
|
spec.email = ['leonid@svyatov.ru']
|
|
12
12
|
|
|
13
|
-
spec.summary = '
|
|
14
|
-
spec.description = 'Validate,
|
|
15
|
-
'
|
|
16
|
-
'and FISN
|
|
13
|
+
spec.summary = 'A Ruby toolkit for securities identifiers — validate, parse, normalize, detect, and convert.'
|
|
14
|
+
spec.description = 'Validate, normalize, parse, and convert securities identifiers. Auto-detect identifier ' \
|
|
15
|
+
'type from any string. Calculate and restore check digits. Supports ISIN, CUSIP, CEI, ' \
|
|
16
|
+
'SEDOL, FIGI, LEI, IBAN, CIK, OCC, WKN, Valoren, CFI, and FISN.'
|
|
17
17
|
spec.homepage = 'https://github.com/svyatov/sec_id'
|
|
18
18
|
spec.license = 'MIT'
|
|
19
19
|
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: sec_id
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 5.
|
|
4
|
+
version: 5.2.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Leonid Svyatov
|
|
@@ -9,9 +9,9 @@ bindir: bin
|
|
|
9
9
|
cert_chain: []
|
|
10
10
|
date: 1980-01-02 00:00:00.000000000 Z
|
|
11
11
|
dependencies: []
|
|
12
|
-
description: Validate,
|
|
13
|
-
|
|
14
|
-
CFI, and FISN
|
|
12
|
+
description: Validate, normalize, parse, and convert securities identifiers. Auto-detect
|
|
13
|
+
identifier type from any string. Calculate and restore check digits. Supports ISIN,
|
|
14
|
+
CUSIP, CEI, SEDOL, FIGI, LEI, IBAN, CIK, OCC, WKN, Valoren, CFI, and FISN.
|
|
15
15
|
email:
|
|
16
16
|
- leonid@svyatov.ru
|
|
17
17
|
executables: []
|
|
@@ -41,6 +41,7 @@ files:
|
|
|
41
41
|
- lib/sec_id/isin.rb
|
|
42
42
|
- lib/sec_id/lei.rb
|
|
43
43
|
- lib/sec_id/occ.rb
|
|
44
|
+
- lib/sec_id/scanner.rb
|
|
44
45
|
- lib/sec_id/sedol.rb
|
|
45
46
|
- lib/sec_id/valoren.rb
|
|
46
47
|
- lib/sec_id/version.rb
|
|
@@ -70,5 +71,6 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
|
70
71
|
requirements: []
|
|
71
72
|
rubygems_version: 4.0.6
|
|
72
73
|
specification_version: 4
|
|
73
|
-
summary:
|
|
74
|
+
summary: A Ruby toolkit for securities identifiers — validate, parse, normalize, detect,
|
|
75
|
+
and convert.
|
|
74
76
|
test_files: []
|