accept_language 2.1.0 → 2.2.0
- checksums.yaml +4 -4
- data/README.md +119 -31
- data/lib/accept_language/matcher.rb +323 -33
- data/lib/accept_language/parser.rb +397 -27
- data/lib/accept_language.rb +252 -12
- metadata +10 -20
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: a993b9e4d4792701b09a650afb27011ff9a94ba104362a8c542d01ee389ca5e9
|
|
4
|
+
data.tar.gz: 129990017c1827e87e95847d8f8f42fb8c85b2d5b8146da5e6aeecb6ac7853ea
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 2cf1e95c98cf16c78b33f7db1e666ce834f62d436a1fc84ffee28df36dfbe41a1595710cc539e7cdbc63431c13d51b55065fdcf037a01114cc639451de33498d
|
|
7
|
+
data.tar.gz: 63c161793225af35b1c5f73364dee830ba1429a2cc02e431c3422da2cf9e60edb76b9f601535f9cdb3afb982ff9d28b5f3c148a2fc8bad8320deba65115b1d23
|
data/README.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# AcceptLanguage
|
|
2
2
|
|
|
3
|
-
A lightweight, thread-safe Ruby library for parsing `Accept-Language` HTTP
|
|
3
|
+
A lightweight, thread-safe Ruby library for parsing the `Accept-Language` HTTP header as defined in [RFC 2616](https://tools.ietf.org/html/rfc2616#section-14.4), with full support for [BCP 47](https://tools.ietf.org/html/bcp47) language tags.
|
|
4
4
|
|
|
5
5
|
[](https://github.com/cyril/accept_language.rb/tags)
|
|
6
6
|
[](https://rubydoc.info/github/cyril/accept_language.rb/main)
|
|
@@ -8,14 +8,6 @@ A lightweight, thread-safe Ruby library for parsing `Accept-Language` HTTP heade
|
|
|
8
8
|

|
|
9
9
|
[](https://github.com/cyril/accept_language.rb/raw/main/LICENSE.md)
|
|
10
10
|
|
|
11
|
-
## Features
|
|
12
|
-
|
|
13
|
-
- Thread-safe
|
|
14
|
-
- No framework dependencies
|
|
15
|
-
- Case-insensitive matching
|
|
16
|
-
- BCP 47 language tag support
|
|
17
|
-
- Wildcard and exclusion handling
|
|
18
|
-
|
|
19
11
|
## Installation
|
|
20
12
|
|
|
21
13
|
```ruby
|
|
@@ -25,51 +17,145 @@ gem "accept_language"
|
|
|
25
17
|
## Usage
|
|
26
18
|
|
|
27
19
|
```ruby
|
|
28
|
-
AcceptLanguage.parse("en-GB, en;q=0.
|
|
29
|
-
# => :
|
|
20
|
+
AcceptLanguage.parse("da, en-GB;q=0.8, en;q=0.7").match(:en, :da)
|
|
21
|
+
# => :da
|
|
30
22
|
```
|
|
31
23
|
|
|
24
|
+
## Behavior
|
|
25
|
+
|
|
32
26
|
### Quality values
|
|
33
27
|
|
|
34
|
-
Quality values (q-values)
|
|
28
|
+
Quality values (q-values) express relative preference, ranging from `0` (unacceptable) to `1` (most preferred). When omitted, the default is `1`.
|
|
35
29
|
|
|
36
30
|
```ruby
|
|
37
31
|
parser = AcceptLanguage.parse("da, en-GB;q=0.8, en;q=0.7")
|
|
38
32
|
|
|
39
|
-
parser.match(:en, :da) # => :da
|
|
40
|
-
parser.match(:en, :"en-GB") # => :"en-GB"
|
|
41
|
-
parser.match(:
|
|
33
|
+
parser.match(:en, :da) # => :da (q=1 > q=0.8)
|
|
34
|
+
parser.match(:en, :"en-GB") # => :"en-GB" (q=0.8 > q=0.7)
|
|
35
|
+
parser.match(:ja) # => nil (no match)
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
Per RFC 2616 Section 3.9, valid q-values have at most three decimal places: `0`, `0.7`, `0.85`, `1.000`. Invalid q-values are ignored.
|
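The q-value rule above can be sketched as a small validator. This is a minimal illustration, not the gem's parser: `QVALUE_PATTERN` and `qvalue_to_integer` are hypothetical names, and returning `nil` for an invalid q-value is an assumption about how "ignored" might be modelled. The 0..1000 integer scale mirrors the range that the matcher documentation in this diff says the parser produces.

```ruby
# Hypothetical helper, not part of the accept_language gem.
# Grammar from RFC 2616 Section 3.9:
#   qvalue = ( "0" [ "." 0*3DIGIT ] ) | ( "1" [ "." 0*3("0") ] )
QVALUE_PATTERN = /\A(?:0(?:\.\d{0,3})?|1(?:\.0{0,3})?)\z/

# Returns the q-value scaled to an Integer in 0..1000,
# or nil when the raw string is not a valid q-value.
def qvalue_to_integer(raw)
  return 1000 if raw.nil? # an omitted q-value defaults to 1
  return nil unless QVALUE_PATTERN.match?(raw)

  integer_part, fraction = raw.split(".", 2)
  # Pad the fraction to three digits so "0.7" becomes "0700" => 700.
  (integer_part + fraction.to_s.ljust(3, "0")).to_i
end

qvalue_to_integer("0.7")    # => 700
qvalue_to_integer("0.85")   # => 850
qvalue_to_integer("1.000")  # => 1000
qvalue_to_integer("0.1234") # => nil (four decimal places)
qvalue_to_integer("1.5")    # => nil (greater than 1)
```

Working on integers avoids floating-point comparisons when q-values are later sorted.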
|
39
|
+
|
|
40
|
+
### Identical quality values
|
|
41
|
+
|
|
42
|
+
When multiple languages share the same q-value, the order of declaration in the header determines priority—the first declared language is preferred:
|
|
43
|
+
|
|
44
|
+
```ruby
|
|
45
|
+
AcceptLanguage.parse("en;q=0.8, fr;q=0.8").match(:en, :fr)
|
|
46
|
+
# => :en (declared first)
|
|
47
|
+
|
|
48
|
+
AcceptLanguage.parse("fr;q=0.8, en;q=0.8").match(:en, :fr)
|
|
49
|
+
# => :fr (declared first)
|
|
50
|
+
```
|
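One way to get this "first declared wins" ordering is to make the declaration index part of the sort key. The sketch below is an assumption-laden illustration, not the gem's code: `order_by_preference` is a hypothetical name, and q-values are assumed to be integers on a 0..1000 scale. Ruby's documentation does not guarantee that `sort_by` is stable, so the explicit index keeps the result deterministic.

```ruby
# Hypothetical helper, not part of the accept_language gem.
# ranges: [tag, quality] pairs in header declaration order.
def order_by_preference(ranges)
  ranges
    .each_with_index
    # Sort by descending quality, then by declaration position.
    .sort_by { |(_tag, quality), index| [-quality, index] }
    .map { |(tag, _quality), _index| tag }
end

order_by_preference([["en", 800], ["fr", 800], ["da", 1000]])
# => ["da", "en", "fr"]

order_by_preference([["fr", 800], ["en", 800]])
# => ["fr", "en"]
```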
|
51
|
+
|
|
52
|
+
### Prefix matching
|
|
53
|
+
|
|
54
|
+
Per RFC 2616 Section 14.4, a language-range matches any language-tag that exactly equals the range or begins with the range followed by `-`:
|
|
55
|
+
|
|
56
|
+
```ruby
|
|
57
|
+
AcceptLanguage.parse("zh").match(:"zh-TW")
|
|
58
|
+
# => :"zh-TW" ("zh" matches "zh-TW")
|
|
59
|
+
|
|
60
|
+
AcceptLanguage.parse("zh-TW").match(:zh)
|
|
61
|
+
# => nil ("zh-TW" does not match "zh")
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
Note that prefix matching follows hyphen boundaries—`zh` does not match `zhx`:
|
|
65
|
+
|
|
66
|
+
```ruby
|
|
67
|
+
AcceptLanguage.parse("zh").match(:zhx)
|
|
68
|
+
# => nil ("zhx" is a different language code)
|
|
69
|
+
```
|
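The hyphen-boundary rule can be sketched in a few lines of plain Ruby. `range_matches?` is a hypothetical standalone helper written for illustration; it downcases both sides for simplicity, which is not necessarily how the gem compares tags.

```ruby
# Hypothetical helper illustrating the RFC 2616 Section 14.4 rule:
# a range matches a tag when they are equal, or when the tag starts
# with the range immediately followed by "-".
def range_matches?(range, tag)
  range = range.downcase
  tag = tag.downcase
  tag == range || tag.start_with?("#{range}-")
end

range_matches?("zh", "zh-TW")  # => true  (prefix followed by "-")
range_matches?("zh-TW", "zh")  # => false (range is longer than tag)
range_matches?("zh", "zhx")    # => false (no hyphen boundary)
```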
|
70
|
+
|
|
71
|
+
### Wildcards
|
|
72
|
+
|
|
73
|
+
The wildcard `*` matches any language not matched by another range:
|
|
74
|
+
|
|
75
|
+
```ruby
|
|
76
|
+
AcceptLanguage.parse("de, *;q=0.5").match(:ja)
|
|
77
|
+
# => :ja (matched by wildcard)
|
|
78
|
+
|
|
79
|
+
AcceptLanguage.parse("de, *;q=0.5").match(:de, :ja)
|
|
80
|
+
# => :de (explicit match preferred over wildcard)
|
|
42
81
|
```
|
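The wildcard rule amounts to "pick the first available tag that no explicit range already claims". A minimal sketch, assuming string tags and using the hypothetical helpers `range_matches?` and `wildcard_pick` (neither is part of the gem's public API):

```ruby
# Hypothetical helpers, for illustration only.
def range_matches?(range, tag)
  tag.casecmp?(range) || tag.downcase.start_with?("#{range.downcase}-")
end

# Returns the first available tag not matched by any explicit range,
# or nil when every available tag is already claimed.
def wildcard_pick(explicit_ranges, available)
  available.find do |tag|
    explicit_ranges.none? { |range| range_matches?(range, tag) }
  end
end

wildcard_pick(["de"], ["de", "ja"]) # => "ja"
wildcard_pick(["de"], ["de-AT"])    # => nil
```

In a full matcher the explicit ranges would be tried first in quality order, so this step only runs once the wildcard's own position in the preference list is reached.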
|
43
82
|
|
|
44
|
-
###
|
|
83
|
+
### Exclusions
|
|
84
|
+
|
|
85
|
+
A q-value of `0` explicitly excludes a language:
|
|
86
|
+
|
|
87
|
+
```ruby
|
|
88
|
+
AcceptLanguage.parse("*, en;q=0").match(:en)
|
|
89
|
+
# => nil (English excluded)
|
|
90
|
+
|
|
91
|
+
AcceptLanguage.parse("*, en;q=0").match(:ja)
|
|
92
|
+
# => :ja (matched by wildcard)
|
|
93
|
+
```
|
|
45
94
|
|
|
46
|
-
|
|
95
|
+
Exclusions apply to prefix matches:
|
|
47
96
|
|
|
48
97
|
```ruby
|
|
49
|
-
AcceptLanguage.parse("
|
|
50
|
-
|
|
98
|
+
AcceptLanguage.parse("*, en;q=0").match(:"en-GB")
|
|
99
|
+
# => nil (en-GB excluded via "en" prefix)
|
|
51
100
|
```
|
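Exclusion by prefix can be pictured as a filtering pass over the available tags before any matching happens. The sketch below is an illustration under assumptions: `excluded?` and `drop_excluded` are hypothetical names, tags are plain strings, and the excluded ranges are held in a `Set`.

```ruby
require "set"

# Hypothetical helpers, not part of the accept_language gem.
# A tag is excluded when it equals an excluded range or extends it
# at a hyphen boundary.
def excluded?(excluded_ranges, tag)
  excluded_ranges.any? do |range|
    tag.casecmp?(range) || tag.downcase.start_with?("#{range.downcase}-")
  end
end

def drop_excluded(excluded_ranges, available)
  available.reject { |tag| excluded?(excluded_ranges, tag) }
end

drop_excluded(Set["en"], ["en", "en-GB", "ja", "eng"])
# => ["ja", "eng"]
```

Note that `eng` survives: it shares leading characters with `en` but not a subtag boundary.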
|
52
101
|
|
|
53
|
-
###
|
|
102
|
+
### Case insensitivity
|
|
54
103
|
|
|
55
|
-
|
|
104
|
+
Matching is case-insensitive per RFC 2616, but the original case of available language tags is preserved:
|
|
56
105
|
|
|
57
106
|
```ruby
|
|
58
|
-
AcceptLanguage.parse("
|
|
59
|
-
|
|
60
|
-
|
|
107
|
+
AcceptLanguage.parse("EN-GB").match(:"en-gb")
|
|
108
|
+
# => :"en-gb"
|
|
109
|
+
|
|
110
|
+
AcceptLanguage.parse("en-gb").match(:"EN-GB")
|
|
111
|
+
# => :"EN-GB"
|
|
61
112
|
```
|
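Case preservation falls out naturally if the comparison is done on normalized copies while the original element is what gets returned. A minimal sketch, with `pick_preserving_case` as a hypothetical name and Symbol inputs assumed as in the README examples:

```ruby
# Hypothetical helper, for illustration only.
# Compares downcased copies but returns the caller's original Symbol.
def pick_preserving_case(range, available)
  normalized = range.downcase
  available.find do |tag|
    candidate = tag.to_s.downcase
    candidate == normalized || candidate.start_with?("#{normalized}-")
  end
end

pick_preserving_case("EN-GB", [:"en-gb"])    # => :"en-gb"
pick_preserving_case("en", [:"EN-GB", :fr])  # => :"EN-GB"
```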
|
62
113
|
|
|
63
|
-
###
|
|
114
|
+
### BCP 47 language tags
|
|
64
115
|
|
|
65
|
-
|
|
116
|
+
Full support for [BCP 47](https://tools.ietf.org/html/bcp47) language tags:
|
|
66
117
|
|
|
67
118
|
```ruby
|
|
68
|
-
|
|
69
|
-
AcceptLanguage.parse("
|
|
119
|
+
# Script subtags
|
|
120
|
+
AcceptLanguage.parse("zh-Hant").match(:"zh-Hant-TW", :"zh-Hans-CN")
|
|
121
|
+
# => :"zh-Hant-TW"
|
|
122
|
+
|
|
123
|
+
# Variant subtags
|
|
124
|
+
AcceptLanguage.parse("de-1996, de;q=0.9").match(:"de-CH-1996", :"de-CH")
|
|
125
|
+
# => :"de-CH-1996"
|
|
126
|
+
```
|
|
127
|
+
|
|
128
|
+
## Integration examples
|
|
129
|
+
|
|
130
|
+
### Rack
|
|
131
|
+
|
|
132
|
+
```ruby
|
|
133
|
+
# config.ru
|
|
134
|
+
class LocaleMiddleware
|
|
135
|
+
def initialize(app, available_locales:, default_locale:)
|
|
136
|
+
@app = app
|
|
137
|
+
@available_locales = available_locales
|
|
138
|
+
@default_locale = default_locale
|
|
139
|
+
end
|
|
140
|
+
|
|
141
|
+
def call(env)
|
|
142
|
+
locale = detect_locale(env) || @default_locale
|
|
143
|
+
env["rack.locale"] = locale
|
|
144
|
+
@app.call(env)
|
|
145
|
+
end
|
|
146
|
+
|
|
147
|
+
private
|
|
148
|
+
|
|
149
|
+
def detect_locale(env)
|
|
150
|
+
header = env["HTTP_ACCEPT_LANGUAGE"]
|
|
151
|
+
return unless header
|
|
152
|
+
|
|
153
|
+
AcceptLanguage.parse(header).match(*@available_locales)
|
|
154
|
+
end
|
|
155
|
+
end
|
|
70
156
|
```
|
|
71
157
|
|
|
72
|
-
|
|
158
|
+
### Ruby on Rails
|
|
73
159
|
|
|
74
160
|
```ruby
|
|
75
161
|
# app/controllers/application_controller.rb
|
|
@@ -100,13 +186,15 @@ end
|
|
|
100
186
|
|
|
101
187
|
## Documentation
|
|
102
188
|
|
|
103
|
-
- [API
|
|
189
|
+
- [API documentation](https://rubydoc.info/github/cyril/accept_language.rb/main)
|
|
190
|
+
- [RFC 2616 Section 14.4](https://tools.ietf.org/html/rfc2616#section-14.4)
|
|
191
|
+
- [BCP 47](https://tools.ietf.org/html/bcp47)
|
|
104
192
|
- [Language negotiation with Ruby](https://dev.to/cyri_/language-negotiation-with-ruby-5166)
|
|
105
193
|
- [Rubyで言語ネゴシエーション](https://qiita.com/cyril/items/45dc233edb7be9d614e7)
|
|
106
194
|
|
|
107
195
|
## Versioning
|
|
108
196
|
|
|
109
|
-
This library follows [Semantic Versioning 2.0
|
|
197
|
+
This library follows [Semantic Versioning 2.0](https://semver.org/).
|
|
110
198
|
|
|
111
199
|
## License
|
|
112
200
|
|
|
data/lib/accept_language/matcher.rb
CHANGED
|
@@ -1,57 +1,280 @@
|
|
|
1
1
|
# frozen_string_literal: true
|
|
2
2
|
|
|
3
3
|
module AcceptLanguage
|
|
4
|
-
#
|
|
5
|
-
#
|
|
6
|
-
#
|
|
4
|
+
# = Language Preference Matcher
|
|
5
|
+
#
|
|
6
|
+
# Matcher implements the language matching algorithm defined in RFC 2616
|
|
7
|
+
# Section 14.4. It takes parsed language preferences (from {Parser}) and
|
|
8
|
+
# determines the optimal language choice from a set of available languages.
|
|
9
|
+
#
|
|
10
|
+
# == Overview
|
|
11
|
+
#
|
|
12
|
+
# The matching process balances multiple factors:
|
|
13
|
+
#
|
|
14
|
+
# 1. **Quality values**: Higher q-values indicate stronger user preference
|
|
15
|
+
# 2. **Declaration order**: Tie-breaker when q-values are equal
|
|
16
|
+
# 3. **Prefix matching**: Allows +en+ to match +en-US+, +en-GB+, etc.
|
|
17
|
+
# 4. **Wildcards**: The +*+ range matches any otherwise unmatched language
|
|
18
|
+
# 5. **Exclusions**: Languages with +q=0+ are explicitly unacceptable
|
|
19
|
+
#
|
|
20
|
+
# == RFC 2616 Section 14.4 Compliance
|
|
21
|
+
#
|
|
22
|
+
# This implementation follows the Accept-Language matching rules:
|
|
23
|
+
#
|
|
24
|
+
# > A language-range matches a language-tag if it exactly equals the tag,
|
|
25
|
+
# > or if it exactly equals a prefix of the tag such that the first tag
|
|
26
|
+
# > character following the prefix is "-".
|
|
27
|
+
#
|
|
28
|
+
# This means:
|
|
29
|
+
# - +en+ matches +en+, +en-US+, +en-GB+, +en-Latn-US+
|
|
30
|
+
# - +en-US+ matches only +en-US+ (not +en+ or +en-GB+)
|
|
31
|
+
# - +en+ does NOT match +eng+ (no hyphen boundary)
|
|
32
|
+
#
|
|
33
|
+
# == Quality Value Semantics
|
|
34
|
+
#
|
|
35
|
+
# Quality values have specific meanings per RFC 2616:
|
|
36
|
+
#
|
|
37
|
+
# - +q=1+ (or omitted): Most preferred
|
|
38
|
+
# - +0 < q < 1+: Acceptable with relative preference
|
|
39
|
+
# - +q=0+: Explicitly NOT acceptable
|
|
40
|
+
#
|
|
41
|
+
# The +q=0+ case is special: it doesn't just indicate low preference, it
|
|
42
|
+
# completely excludes the language from consideration. This is used with
|
|
43
|
+
# wildcards to express "any language except X":
|
|
44
|
+
#
|
|
45
|
+
# Accept-Language: *, en;q=0
|
|
46
|
+
#
|
|
47
|
+
# == Wildcard Behavior
|
|
48
|
+
#
|
|
49
|
+
# The wildcard +*+ matches any language not explicitly matched by another
|
|
50
|
+
# language-range. When processing a wildcard:
|
|
51
|
+
#
|
|
52
|
+
# 1. Collect all explicitly listed language tags (excluding the wildcard)
|
|
53
|
+
# 2. Find available languages that don't match any explicit tag
|
|
54
|
+
# 3. Return the first such language
|
|
55
|
+
#
|
|
56
|
+
# This ensures explicit preferences always take priority over the wildcard.
|
|
57
|
+
#
|
|
58
|
+
# == Internal Design
|
|
59
|
+
#
|
|
60
|
+
# The Matcher separates languages into two categories during initialization:
|
|
61
|
+
#
|
|
62
|
+
# - **preferred_langtags**: Languages with q > 0, sorted by descending quality
|
|
63
|
+
# - **excluded_langtags**: Languages with q = 0 (explicitly unacceptable)
|
|
64
|
+
#
|
|
65
|
+
# This separation optimizes the matching algorithm by allowing quick
|
|
66
|
+
# filtering of excluded languages before attempting matches.
|
|
67
|
+
#
|
|
68
|
+
# == Thread Safety
|
|
69
|
+
#
|
|
70
|
+
# Matcher instances are immutable after initialization. Both +preferred_langtags+
|
|
71
|
+
# and +excluded_langtags+ are frozen, making instances safe for concurrent use.
|
|
7
72
|
#
|
|
8
73
|
# @api private
|
|
9
|
-
# @note This class is
|
|
74
|
+
# @note This class is used internally by {Parser#match} and should not be
|
|
75
|
+
# instantiated directly. Use {AcceptLanguage.parse} followed by
|
|
76
|
+
# {Parser#match} instead.
|
|
77
|
+
#
|
|
78
|
+
# @example Internal usage (via Parser)
|
|
79
|
+
# # Don't do this:
|
|
80
|
+
# matcher = AcceptLanguage::Matcher.new("en" => 1000, "fr" => 800)
|
|
81
|
+
#
|
|
82
|
+
# # Do this instead:
|
|
83
|
+
# AcceptLanguage.parse("en, fr;q=0.8").match(:en, :fr)
|
|
84
|
+
#
|
|
85
|
+
# @see Parser#match
|
|
86
|
+
# @see https://tools.ietf.org/html/rfc2616#section-14.4 RFC 2616 Section 14.4
|
|
10
87
|
class Matcher
|
|
88
|
+
# The hyphen character used as a subtag delimiter in BCP 47 language tags.
|
|
89
|
+
#
|
|
90
|
+
# Per RFC 2616 Section 14.4, prefix matching must respect hyphen boundaries.
|
|
91
|
+
# A language-range matches a language-tag only if the character immediately
|
|
92
|
+
# following the prefix is a hyphen.
|
|
93
|
+
#
|
|
94
|
+
# @api private
|
|
95
|
+
# @return [String] "-"
|
|
96
|
+
HYPHEN = "-"
|
|
97
|
+
|
|
98
|
+
# Error message raised when an available language tag is not a Symbol.
|
|
99
|
+
#
|
|
100
|
+
# This guards against accidental non-Symbol values in the available languages
|
|
101
|
+
# array, which would cause unexpected behavior during matching.
|
|
102
|
+
#
|
|
103
|
+
# @api private
|
|
104
|
+
# @return [String]
|
|
105
|
+
LANGTAG_TYPE_ERROR = "Language tag must be a Symbol"
|
|
106
|
+
|
|
107
|
+
# The wildcard character that matches any language not explicitly listed.
|
|
108
|
+
#
|
|
109
|
+
# Per RFC 2616 Section 14.4, the wildcard has special semantics:
|
|
110
|
+
# - It matches any language not matched by other ranges
|
|
111
|
+
# - +*;q=0+ makes all unlisted languages unacceptable
|
|
112
|
+
# - It has lower effective priority than explicit language tags
|
|
113
|
+
#
|
|
11
114
|
# @api private
|
|
115
|
+
# @return [String] "*"
|
|
12
116
|
WILDCARD = "*"
|
|
13
117
|
|
|
118
|
+
# Language tags explicitly marked as unacceptable (+q=0+).
|
|
119
|
+
#
|
|
120
|
+
# These tags are filtered out from available languages before any
|
|
121
|
+
# matching occurs. Exclusions apply via prefix matching, so excluding
|
|
122
|
+
# +en+ also excludes +en-US+, +en-GB+, etc.
|
|
123
|
+
#
|
|
124
|
+
# @note The wildcard +*+ is never added to this set, even when +*;q=0+
|
|
125
|
+
# is specified. Wildcard exclusion is handled implicitly: when +*;q=0+
|
|
126
|
+
# and no other languages have +q > 0+, the preferred_langtags list is
|
|
127
|
+
# empty, resulting in no matches.
|
|
128
|
+
#
|
|
14
129
|
# @api private
|
|
15
|
-
|
|
130
|
+
# @return [Set<String>] downcased language tags with q=0
|
|
131
|
+
#
|
|
132
|
+
# @example
|
|
133
|
+
# # For "*, en;q=0, de;q=0"
|
|
134
|
+
# matcher.excluded_langtags
|
|
135
|
+
# # => #<Set: {"en", "de"}>
|
|
136
|
+
attr_reader :excluded_langtags
|
|
16
137
|
|
|
138
|
+
# Language tags sorted by preference (descending quality value).
|
|
139
|
+
#
|
|
140
|
+
# This array contains only tags with +q > 0+, ordered from most preferred
|
|
141
|
+
# to least preferred. When quality values are equal, the original
|
|
142
|
+
# declaration order from the Accept-Language header is preserved.
|
|
143
|
+
#
|
|
144
|
+
# The stable sort guarantee ensures deterministic matching: given the
|
|
145
|
+
# same header and available languages, the result is always the same.
|
|
146
|
+
#
|
|
17
147
|
# @api private
|
|
148
|
+
# @return [Array<String>] downcased language tags, highest quality first
|
|
149
|
+
#
|
|
150
|
+
# @example
|
|
151
|
+
# # For "fr;q=0.8, en, de;q=0.9"
|
|
152
|
+
# # Sorted: en (q=1), de (q=0.9), fr (q=0.8)
|
|
153
|
+
# matcher.preferred_langtags
|
|
154
|
+
# # => ["en", "de", "fr"]
|
|
155
|
+
attr_reader :preferred_langtags
|
|
156
|
+
|
|
157
|
+
# Creates a new Matcher instance from parsed language preferences.
|
|
158
|
+
#
|
|
159
|
+
# The initialization process:
|
|
160
|
+
#
|
|
161
|
+
# 1. Separates excluded tags (+q=0+) from preferred tags (+q > 0+)
|
|
162
|
+
# 2. Sorts preferred tags by descending quality value
|
|
163
|
+
# 3. Preserves original order for tags with equal quality (stable sort)
|
|
164
|
+
#
|
|
165
|
+
# == Exclusion Rules
|
|
166
|
+
#
|
|
167
|
+
# Only specific language tags with +q=0+ are added to the exclusion set.
|
|
168
|
+
# The wildcard +*+ is explicitly NOT added even when +*;q=0+ is present,
|
|
169
|
+
# because:
|
|
170
|
+
#
|
|
171
|
+
# - Adding +*+ to exclusions would break prefix matching logic
|
|
172
|
+
# - +*;q=0+ semantics are: "no unlisted language is acceptable"
|
|
173
|
+
# - This is achieved by having an empty preferred_langtags (no wildcards)
|
|
174
|
+
#
|
|
175
|
+
# == Stable Sorting
|
|
176
|
+
#
|
|
177
|
+
# Ruby's +sort_by+ is stable since Ruby 2.0, meaning elements with equal
|
|
178
|
+
# sort keys maintain their relative order. This ensures that when multiple
|
|
179
|
+
# languages have the same quality value, the first one declared in the
|
|
180
|
+
# Accept-Language header wins.
|
|
181
|
+
#
|
|
182
|
+
# @api private
|
|
183
|
+
# @param languages_range [Hash{String => Integer}] language tags mapped to
|
|
184
|
+
# quality values (0-1000), as produced by {Parser}
|
|
185
|
+
#
|
|
186
|
+
# @example
|
|
187
|
+
# Matcher.new("en" => 1000, "fr" => 800, "de" => 0)
|
|
188
|
+
# # preferred_langtags: ["en", "fr"]
|
|
189
|
+
# # excluded_langtags: #<Set: {"de"}>
|
|
18
190
|
def initialize(**languages_range)
|
|
19
191
|
@excluded_langtags = ::Set[]
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
192
|
+
|
|
193
|
+
languages_range.each do |langtag, quality|
|
|
194
|
+
next unless quality.zero? && !wildcard?(langtag)
|
|
195
|
+
|
|
196
|
+
# Exclude specific language tags, but NOT the wildcard.
|
|
197
|
+
# When "*;q=0" is specified, all non-listed languages become
|
|
198
|
+
# unacceptable implicitly (they won't match any preferred_langtags).
|
|
199
|
+
# Adding "*" to excluded_langtags would break prefix_match? logic.
|
|
200
|
+
@excluded_langtags << langtag
|
|
29
201
|
end
|
|
30
202
|
|
|
31
|
-
|
|
203
|
+
# Sort by descending quality. Ruby's sort_by is stable, so languages
|
|
204
|
+
# with identical quality values preserve their original order from
|
|
205
|
+
# the Accept-Language header (first declared = higher priority).
|
|
206
|
+
@preferred_langtags = languages_range
|
|
207
|
+
.reject { |_, quality| quality.zero? }
|
|
208
|
+
.sort_by { |_, quality| -quality }
|
|
209
|
+
.map(&:first)
|
|
32
210
|
end
|
|
33
211
|
|
|
212
|
+
# Finds the best matching language from the available options.
|
|
213
|
+
#
|
|
214
|
+
# == Algorithm
|
|
215
|
+
#
|
|
216
|
+
# 1. **Filter**: Remove available languages that match any excluded tag
|
|
217
|
+
# 2. **Match**: For each preferred tag (in quality order):
|
|
218
|
+
# - If it's a wildcard, return the first available language not
|
|
219
|
+
# matching any other preferred tag
|
|
220
|
+
# - Otherwise, return the first available language that matches
|
|
221
|
+
# via exact match or prefix match
|
|
222
|
+
# 3. **Result**: Return the first match found, or +nil+ if none
|
|
223
|
+
#
|
|
224
|
+
# == Return Value
|
|
225
|
+
#
|
|
226
|
+
# The returned value preserves the exact form (case) of the matched
|
|
227
|
+
# element from +available_langtags+. This is important for direct use
|
|
228
|
+
# with APIs like +I18n.locale=+ that may be case-sensitive.
|
|
229
|
+
#
|
|
34
230
|
# @api private
|
|
231
|
+
# @param available_langtags [Array<Symbol>] languages to match against
|
|
232
|
+
# @return [Symbol, nil] the best matching language, or +nil+
|
|
233
|
+
# @raise [TypeError] if any available language tag is not a Symbol
|
|
234
|
+
#
|
|
235
|
+
# @example Basic matching
|
|
236
|
+
# matcher = Matcher.new("en" => 1000, "fr" => 800)
|
|
237
|
+
# matcher.call(:en, :fr, :de)
|
|
238
|
+
# # => :en
|
|
239
|
+
#
|
|
240
|
+
# @example Prefix matching
|
|
241
|
+
# matcher = Matcher.new("en" => 1000)
|
|
242
|
+
# matcher.call(:"en-US", :"en-GB")
|
|
243
|
+
# # => :"en-US"
|
|
244
|
+
#
|
|
245
|
+
# @example With exclusion
|
|
246
|
+
# matcher = Matcher.new("*" => 500, "en" => 0)
|
|
247
|
+
# matcher.call(:en, :fr)
|
|
248
|
+
# # => :fr
|
|
35
249
|
def call(*available_langtags)
|
|
36
|
-
raise ::ArgumentError, "Language tags cannot be nil" if available_langtags.any?(&:nil?)
|
|
37
|
-
|
|
38
250
|
filtered_tags = drop_unacceptable(*available_langtags)
|
|
39
|
-
return
|
|
251
|
+
return if filtered_tags.empty?
|
|
40
252
|
|
|
41
253
|
find_best_match(filtered_tags)
|
|
42
254
|
end
|
|
43
255
|
|
|
44
256
|
private
|
|
45
257
|
|
|
258
|
+
# Iterates through preferred languages to find the first match.
|
|
259
|
+
#
|
|
260
|
+
# @param available_langtags [Set<String>] pre-filtered available tags
|
|
261
|
+
# @return [Symbol, nil] the matched tag or nil
|
|
46
262
|
def find_best_match(available_langtags)
|
|
47
263
|
preferred_langtags.each do |preferred_tag|
|
|
48
264
|
match = match_langtag(preferred_tag, available_langtags)
|
|
49
|
-
return match
|
|
265
|
+
return :"#{match}" unless match.nil?
|
|
50
266
|
end
|
|
51
267
|
|
|
52
268
|
nil
|
|
53
269
|
end
|
|
54
270
|
|
|
271
|
+
# Attempts to match a single preferred tag against available languages.
|
|
272
|
+
#
|
|
273
|
+
# Handles both wildcard and specific language tags differently.
|
|
274
|
+
#
|
|
275
|
+
# @param preferred_tag [String] the preferred language tag to match
|
|
276
|
+
# @param available_langtags [Set<String>] available tags to search
|
|
277
|
+
# @return [String, nil] the matched tag or nil
|
|
55
278
|
def match_langtag(preferred_tag, available_langtags)
|
|
56
279
|
if wildcard?(preferred_tag)
|
|
57
280
|
any_other_langtag(*available_langtags)
|
|
@@ -60,38 +283,105 @@ module AcceptLanguage
|
|
|
60
283
|
end
|
|
61
284
|
end
|
|
62
285
|
|
|
286
|
+
# Finds an available language that matches via exact or prefix match.
|
|
287
|
+
#
|
|
288
|
+
# @param preferred_tag [String] the preferred tag (downcased)
|
|
289
|
+
# @param available_langtags [Set<String>] available tags
|
|
290
|
+
# @return [String, nil] the first matching tag or nil
|
|
63
291
|
def find_matching_tag(preferred_tag, available_langtags)
|
|
64
|
-
|
|
65
|
-
available_langtags.find { |tag| tag.match?(pattern) }
|
|
292
|
+
available_langtags.find { |tag| prefix_match?(preferred_tag, tag) }
|
|
66
293
|
end
|
|
67
294
|
|
|
295
|
+
# Finds an available language for wildcard matching.
|
|
296
|
+
#
|
|
297
|
+
# Returns the first available language that doesn't match any explicitly
|
|
298
|
+
# listed preferred language tag. This implements the RFC 2616 semantics
|
|
299
|
+
# where +*+ matches "any language not matched by another range".
|
|
300
|
+
#
|
|
301
|
+
# @param available_langtags [Array<String>] available tags
|
|
302
|
+
# @return [String, nil] the first non-matching tag or nil
|
|
68
303
|
def any_other_langtag(*available_langtags)
|
|
304
|
+
langtags = preferred_langtags - [WILDCARD]
|
|
305
|
+
|
|
69
306
|
available_langtags.find do |available_langtag|
|
|
70
|
-
langtags
|
|
71
|
-
langtags.none? do |tag|
|
|
72
|
-
pattern = /\A#{::Regexp.escape(tag)}/i
|
|
73
|
-
available_langtag.match?(pattern)
|
|
74
|
-
end
|
|
307
|
+
langtags.none? { |tag| prefix_match?(tag, available_langtag) }
|
|
75
308
|
end
|
|
76
309
|
end
|
|
77
310
|
|
|
311
|
+
# Removes explicitly excluded languages from the available set.
|
|
312
|
+
#
|
|
313
|
+
# Uses prefix matching for exclusions, so excluding +en+ also excludes
|
|
314
|
+
# +en-US+, +en-GB+, etc.
|
|
315
|
+
#
|
|
316
|
+
# @param available_langtags [Array<Symbol>] all available tags
|
|
317
|
+
# @return [Set<String>] tags not matching any exclusion
|
|
318
|
+
# @raise [TypeError] if any tag is not a Symbol
|
|
78
319
|
def drop_unacceptable(*available_langtags)
|
|
79
|
-
available_langtags.
|
|
80
|
-
|
|
320
|
+
available_langtags.each_with_object(::Set[]) do |available_langtag, langtags|
|
|
321
|
+
raise ::TypeError, LANGTAG_TYPE_ERROR unless available_langtag.is_a?(::Symbol)
|
|
81
322
|
|
|
82
|
-
|
|
323
|
+
available_langtag = "#{available_langtag}"
|
|
324
|
+
langtags << available_langtag unless unacceptable?(available_langtag)
|
|
83
325
|
end
|
|
84
326
|
end
|
|
85
327
|
|
|
328
|
+
# Checks if a language tag is explicitly excluded.
|
|
329
|
+
#
|
|
330
|
+
# @param langtag [String] the tag to check (as string)
|
|
331
|
+
# @return [Boolean] true if the tag matches any exclusion
|
|
86
332
|
def unacceptable?(langtag)
|
|
87
|
-
excluded_langtags.any?
|
|
88
|
-
pattern = /\A#{::Regexp.escape(excluded_tag)}/i
|
|
89
|
-
langtag.match?(pattern)
|
|
90
|
-
end
|
|
333
|
+
excluded_langtags.any? { |excluded_tag| prefix_match?(excluded_tag, langtag) }
|
|
91
334
|
end
|
|
92
335
|
|
|
336
|
+
# Checks if a value is the wildcard character.
|
|
337
|
+
#
|
|
338
|
+
# @param value [String] the value to check
|
|
339
|
+
# @return [Boolean] true if the value is "*"
|
|
93
340
|
def wildcard?(value)
|
|
94
341
|
value.eql?(WILDCARD)
|
|
95
342
|
end
|
|
343
|
+
|
|
344
|
+
# Implements RFC 2616 Section 14.4 prefix matching rule.
|
|
345
|
+
#
|
|
346
|
+
# From the specification:
|
|
347
|
+
#
|
|
348
|
+
# > A language-range matches a language-tag if it exactly equals the tag,
|
|
349
|
+
# > or if it exactly equals a prefix of the tag such that the first tag
|
|
350
|
+
# > character following the prefix is "-".
|
|
351
|
+
#
|
|
352
|
+
# This rule ensures that language ranges match at subtag boundaries:
|
|
353
|
+
#
|
|
354
|
+
# - +en+ matches +en+ (exact)
|
|
355
|
+
# - +en+ matches +en-US+ (prefix + hyphen)
|
|
356
|
+
# - +en+ does NOT match +eng+ (no hyphen after prefix)
|
|
357
|
+
# - +en-US+ does NOT match +en+ (prefix is longer than tag)
|
|
358
|
+
#
|
|
359
|
+
# Matching is case-insensitive per RFC 2616, using +casecmp?+ for
|
|
360
|
+
# efficient comparison without allocating new strings.
|
|
361
|
+
#
|
|
362
|
+
# @param prefix [String] the language-range to match (downcased)
|
|
363
|
+
# @param tag [String] the language-tag to test (any case)
|
|
364
|
+
# @return [Boolean] true if prefix matches tag per RFC 2616 rules
|
|
365
|
+
#
|
|
366
|
+
# @example Exact matches
|
|
367
|
+
# prefix_match?("en", "en") # => true
|
|
368
|
+
# prefix_match?("en", "EN") # => true
|
|
369
|
+
# prefix_match?("en-us", "en-US") # => true
|
|
370
|
+
#
|
|
371
|
+
# @example Prefix matches
|
|
372
|
+
# prefix_match?("en", "en-us") # => true
|
|
373
|
+
# prefix_match?("en", "en-GB") # => true
|
|
374
|
+
# prefix_match?("zh", "zh-Hant-TW") # => true
|
|
375
|
+
#
|
|
376
|
+
# @example Non-matches
|
|
377
|
+
# prefix_match?("en-us", "en") # => false (prefix longer than tag)
|
|
378
|
+
# prefix_match?("en", "eng") # => false (no hyphen boundary)
|
|
379
|
+
# prefix_match?("en", "fr") # => false (different language)
|
|
380
|
+
def prefix_match?(prefix, tag)
|
|
381
|
+
return true if tag.casecmp?(prefix)
|
|
382
|
+
return false if tag.length <= prefix.length
|
|
383
|
+
|
|
384
|
+
tag[0, prefix.length].casecmp?(prefix) && tag[prefix.length] == HYPHEN
|
|
385
|
+
end
|
|
96
386
|
end
|
|
97
387
|
end
|
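The filter, match, and wildcard steps described in the comments above can be condensed into one standalone sketch. This is not the gem's `Matcher` class: `negotiate` and `range_matches?` are hypothetical names, preferences are assumed to be a Hash of downcased ranges to integer q-values (0..1000), and ties are broken with an explicit index rather than relying on sort stability.

```ruby
# Hypothetical re-implementation for illustration only.
def range_matches?(range, tag)
  tag.casecmp?(range) || tag.downcase.start_with?("#{range.downcase}-")
end

# preferences: Hash{String => Integer}, in header declaration order.
# available:   Array<Symbol> of tags the application can serve.
def negotiate(preferences, available)
  # q=0 ranges are exclusions; the wildcard is never treated as one.
  excluded = preferences.select { |range, q| q.zero? && range != "*" }.keys

  # Remaining ranges, highest quality first, declaration order on ties.
  preferred = preferences
    .reject { |_range, q| q.zero? }
    .each_with_index
    .sort_by { |(_range, q), index| [-q, index] }
    .map { |(range, _q), _index| range }
  explicit = preferred - ["*"]

  # Step 1: filter out anything an exclusion covers.
  candidates = available.map(&:to_s).reject do |tag|
    excluded.any? { |range| range_matches?(range, tag) }
  end

  # Step 2: walk the preferences and return the first match.
  preferred.each do |range|
    match =
      if range == "*"
        candidates.find { |tag| explicit.none? { |other| range_matches?(other, tag) } }
      else
        candidates.find { |tag| range_matches?(range, tag) }
      end
    return match.to_sym if match
  end

  nil
end

negotiate({ "da" => 1000, "en-gb" => 800, "en" => 700 }, [:en, :da]) # => :da
negotiate({ "*" => 1000, "en" => 0 }, [:en, :fr])                    # => :fr
negotiate({ "*" => 1000, "en" => 0 }, [:"en-GB"])                    # => nil
```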