vers 1.0.0

@@ -0,0 +1,1009 @@
1
+ ======================================================
2
+ vers: a mostly universal version range specifier
3
+ ======================================================
4
+
5
+ This specification is a new syntax for dependency and vulnerable version ranges.
6
+
7
+
8
+ Context
9
+ --------
10
+
11
+ Software package version ranges and version constraints are essential:
12
+
13
+ - When resolving the dependencies of a package to express which subset of the
14
+ versions are supported. For instance a dependency or requirement statement
15
+ such as "I require package foo, version 2.0 or later versions" defines a
16
+ range of acceptable foo versions.
17
+
18
+ - When stating that a known vulnerability or bug affects a range of package
19
+ versions. For instance a security advisory such as "vulnerability 123 affects
20
+ package bar, version 3.1 and version 4.2 but not version 5" defines a range of
21
+ vulnerable "bar" package versions.
22
+
23
+ Version ranges can be replaced by a list enumerating all the versions of
24
+ interest. But in practice, all the versions may not yet exist when defining an
25
+ open version range such as "v2.0 or later".
26
+
27
+ Therefore, a version range is a necessary, compact and practical way to
28
+ reference multiple versions rather than listing all the versions.
29
+
30
+
31
+ Problem
32
+ --------
33
+
34
+ Several version range notations exist and have evolved separately to serve the
35
+ specific needs of each package ecosystem, vulnerability databases and tools.
36
+
37
+ There is no (mostly) universal notation for version ranges and there is no
38
+ universal way to compare two versions, even though the concepts that exist in
39
+ most version range notations are similar.
40
+
41
+ Each package type or ecosystem may define their own ranges notation and version
42
+ comparison semantics for dependencies. And for security advisories, the lack of
43
+ a portable and compact notation for vulnerable package version ranges means that
44
+ these ranges may be either ambiguous or hard to compute and may be best replaced
45
+ by complete enumerations of all impacted versions, such as in the `NVD CPE Match
46
+ feed <https://nvd.nist.gov/vuln/data-feeds#cpeMatch>`_.
47
+
48
+ Because of this, expressing and resolving a version range is often a complex or
49
+ error-prone task.
50
+
51
+ In particular, the need for a common notation for version ranges has emerged based on the
52
+ usage of Package URLs referencing vulnerable package version ranges such as in
53
+ vulnerability databases like `VulnerableCode
54
+ <https://github.com/nexB/vulnerablecode/>`_.
55
+
56
+ To better understand the problem, here are some of the many notations and
57
+ conventions in use:
58
+
59
+ - ``semver`` https://semver.org/ is a popular specification to structure version
60
+ strings, but does not provide a way to express version ranges.
61
+
62
+ - RubyGems strongly suggests using ``semver`` for versions but does not enforce it.
63
+ As a result, some gems use semver while several popular packages do not use
64
+ strict semver. RubyGems uses its own notation for version ranges which
65
+ looks like the ``node-semver`` notation with some subtle differences.
66
+ See https://guides.rubygems.org/patterns/#semantic-versioning
67
+
68
+ - ``node-semver`` ranges are used in npm at https://github.com/npm/node-semver#ranges
69
+ with range semantics that are specific to ``semver`` versions and npm.
70
+
71
+ - The Dart pub versioning scheme is similar to ``node-semver`` and the documentation
72
+ at https://dart.dev/tools/pub/versioning provides comprehensive coverage of
73
+ the topic of versioning. Version resolution uses its own algorithm.
74
+
75
+ - Python uses its own version and version ranges notation with notable
76
+ peculiarities on how pre-release and post-release suffixes are used
77
+ https://www.python.org/dev/peps/pep-0440/
78
+
79
+ - Debian and Ubuntu use their own notation and are remarkable for their use of
80
+ ``epochs`` to disambiguate versions.
81
+ https://www.debian.org/doc/debian-policy/ch-relationships.html
82
+
83
+ - RPM distros use their own range notation and use epochs like Debian.
84
+ https://rpm-software-management.github.io/rpm/manual/dependencies.html
85
+
86
+ - Perl CPAN defines its own version range notation similar to this specification
87
+ and uses two-segment versions. https://metacpan.org/pod/CPAN::Meta::Spec#Version-Ranges
88
+
89
+ - Apache Maven and NuGet use a similar math interval notation using brackets
90
+ https://en.wikipedia.org/wiki/Interval_(mathematics)
91
+
92
+ - Apache Maven http://maven.apache.org/enforcer/enforcer-rules/versionRanges.html
93
+ - NuGet https://docs.microsoft.com/en-us/nuget/concepts/package-versioning#version-ranges
94
+
95
+ - Gradle uses the Apache Maven notation with some extensions
96
+ https://docs.gradle.org/current/userguide/single_versions.html
97
+
98
+ - Gentoo and Alpine Linux use comparison operators similar to this specification:
99
+ - Gentoo https://wiki.gentoo.org/wiki/Version_specifier
100
+ - Alpine Linux https://gitlab.alpinelinux.org/alpine/apk-tools/-/blob/master/src/version.c
101
+
102
+ - Arch Linux https://wiki.archlinux.org/title/PKGBUILD#Dependencies uses its
103
+ own simplified notation for its PKGBUILD depends array and uses a modified
104
+ RPM version comparison.
105
+
106
+ - Go modules https://golang.org/ref/mod#versions use ``semver`` versions with
107
+ specific version resolution algorithms.
108
+
109
+ - Haskell Package Versioning Policy https://pvp.haskell.org/ provides a notation
110
+ similar to this specification based on a modified semver with extra notations
111
+ such as star and caret.
112
+
113
+ - The NVD https://nvd.nist.gov/vuln/data-feeds#cpeMatch defines CPE ranges as
114
+ lists of version start and end either including or excluding the start or end
115
+ version. And also provides a concrete enumeration of the available ranges as
116
+ a daily feed.
117
+
118
+ - Version 5 of the CVE JSON data format at
119
+ https://github.com/CVEProject/cve-schema/blob/master/schema/v5.0/CVE_JSON_5.0.schema#L303
120
+ defines version ranges with a starting version, a versionType, and an upper
121
+ limit for the version range as lessThan or lessThanOrEqual; or an enumeration
122
+ of versions. The versionType is defined as ``"The version numbering system
123
+ used for specifying the range. This defines the exact semantics of the
124
+ comparison (less-than) operation on versions, which is required to understand
125
+ the range itself"``. A "versionType" resembles closely the Package URL package
126
+ "type".
127
+
128
+ - The OSSF OSV schema https://ossf.github.io/osv-schema/ defines vulnerable
129
+ ranges with version events using "introduced", "fixed" and "limit" fields and
130
+ an optional enumeration of all the versions in these ranges, except for
131
+ semver-based versions. A range may be ecosystem-specific based on a provided
132
+ package "ecosystem" value that resembles closely the Package URL package
133
+ "type".
134
+
135
+
136
+ The way two versions are compared as equal, lesser or greater is a closely
137
+ related topic:
138
+
139
+ - Each package ecosystem may have evolved its own peculiar version string
140
+ conventions, semantics and comparison procedure.
141
+
142
+ - For instance, ``semver`` is a prominent specification in this domain but this
143
+ is just one of the many ways to structure a version string.
144
+
145
+ - Debian, RPM, PyPI, RubyGems, and Composer have their own subtly different
146
+ approach to determining whether two versions compare as equal, greater
147
+ or lesser.
148
+
149
+
150
+ Solution
151
+ ---------
152
+
153
+ A solution to the many version range syntaxes is to design a new simplified
154
+ notation to unify them all with:
155
+
156
+ - a mostly universal and minimalist, compact notation to express version ranges
157
+ from many different package types and ecosystems.
158
+
159
+ - the package type-specific definitions to normalize existing range expressions
160
+ in this common notation.
161
+
162
+ - the designation of which algorithm or procedure to use when comparing two
163
+ versions such that it is possible to resolve if a version is within or
164
+ outside of a version range.
165
+
166
+ We call this solution "version range specifier" or "vers" and it is described
167
+ in this document.
168
+
169
+
170
+ Version range specifier
171
+ ------------------------
172
+
173
+ A version range specifier (aka. "vers") is a URI string using the ``vers``
174
+ URI-scheme with this syntax::
175
+
176
+ vers:<versioning-scheme>/<version-constraint>|<version-constraint>|...
177
+
178
+ For example, to define a set of versions that contains either version ``1.2.3``,
179
+ or any versions greater than or equal to ``2.0.0`` but less than ``5.0.0`` using
180
+ the ``node-semver`` versioning scheme used with the ``npm`` Package URL type,
181
+ the version range specifier will be::
182
+
183
+ vers:npm/1.2.3|>=2.0.0|<5.0.0
184
+
185
+ ``vers`` is the URI-scheme and is an acronym for "VErsion Range Specifier". It
186
+ has been selected because it is short, obviously about version and available
187
+ for a future formal URI-scheme registration at IANA.
188
+
189
+ The pipe "|" is used as a simple separator between ``<version-constraint>``.
190
+ Each ``<version-constraint>`` in this pipe-separated list contains a comparator
191
+ and a version::
192
+
193
+ <comparator><version>
194
+
195
+ The ``<version-constraint>`` items in this list are signposts in the version timeline of
196
+ a package that specify version intervals.
197
+
198
+ A ``<version>`` satisfies a version range specifier if it is contained within
199
+ any of the intervals defined by these ``<version-constraint>``.
200
+
201
+
202
+ Using version range specifiers
203
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
204
+
205
+ The primary usage of ``vers`` is to test if a version is within a range.
206
+
207
+ A version is within a version range if it falls in any of the intervals defined
208
+ by a range. Otherwise, the version is outside of the version range.
209
+
210
+ Some important usages derived from this include:
211
+
212
+ - **Resolving a version range specifier to a list of concrete versions.**
213
+ In this case, the input is one or more known versions of a package. Each
214
+ version is then tested to check if it lies within or outside the range. For
215
+ example, given a vulnerability and the ``vers`` describing the vulnerable
216
+ versions of a package, this process is used to determine if an existing
217
+ package version is vulnerable.
218
+
219
+ - **Selecting one of several versions that are within a range.**
220
+ In this case, given several versions that are within a range and several
221
+ packages that express package dependencies qualified by a version range,
222
+ a package management tool will determine and select the set of package
223
+ versions that satisfy all the version range constraints of all dependencies.
224
+ This usually requires deploying heuristics and algorithms (possibly complex
225
+ such as SAT solvers) that are ecosystem- and tool-specific and outside of the
226
+ scope for this specification; yet ``vers`` could be used in tandem with
227
+ ``purl`` to provide an input to this dependency resolution process.
228
+
229
+
230
+ Examples
231
+ ~~~~~~~~~
232
+
233
+ A single version in an npm package dependency:
234
+
235
+ - originally seen as a dependency on version "1.2.3" in a package.json manifest
236
+ - the version range spec is: ``vers:npm/1.2.3``
237
+
238
+
239
+ A list of versions, enumerated:
240
+
241
+ - ``vers:pypi/0.0.0|0.0.1|0.0.2|0.0.3|1.0|2.0pre1``
242
+
243
+
244
+ A complex statement about a vulnerability in a "maven" package that affects
245
+ multiple branches each with their own fixed versions at
246
+ https://repo1.maven.org/maven2/org/apache/tomee/apache-tomee/
247
+ Note how the constraints are sorted:
248
+
249
+
250
+ - "affects Apache TomEE 8.0.0-M1 - 8.0.1, Apache TomEE 7.1.0 - 7.1.2,
251
+ Apache TomEE 7.0.0-M1 - 7.0.7, Apache TomEE 1.0.0-beta1 - 1.7.5."
252
+
253
+ - a normalized version range spec is:
254
+ ``vers:maven/>=1.0.0-beta1|<=1.7.5|>=7.0.0-M1|<=7.0.7|>=7.1.0|<=7.1.2|>=8.0.0-M1|<=8.0.1``
255
+
256
+ - alternatively, four ``vers`` express the same range, using one ``vers`` for
257
+ each vulnerable "branch":
258
+ - ``vers:tomee/>=1.0.0-beta1|<=1.7.5``
259
+ - ``vers:tomee/>=7.0.0-M1|<=7.0.7``
260
+ - ``vers:tomee/>=7.1.0|<=7.1.2``
261
+ - ``vers:tomee/>=8.0.0-M1|<=8.0.1``
262
+
263
+ Converting the RubyGems custom syntax for a dependency on a gem. Note how the
264
+ pessimistic version constraint is expanded:
265
+
266
+ - ``'library', '~> 2.2.0', '!= 2.2.1'``
267
+ - the version range spec is: ``vers:gem/>=2.2.0|!=2.2.1|<2.3.0``
268
+
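The RubyGems conversion above can be sketched in Python. This is only an illustration under stated assumptions: versions are purely numeric and dot-separated, and ``version_key``, ``expand_pessimistic`` and ``gem_to_vers`` are hypothetical helper names, not part of RubyGems or of any ``vers`` library.

```python
def version_key(version: str) -> tuple[int, ...]:
    # Assumption: purely numeric, dot-separated versions such as "2.2.1".
    return tuple(int(part) for part in version.split("."))


def expand_pessimistic(version: str) -> list[tuple[str, str]]:
    """Expand a RubyGems '~> version' into a lower and an upper constraint."""
    segments = list(version_key(version))
    if len(segments) == 1:
        # '~> 2' allows anything from 2 up to, but excluding, 3.
        upper = [segments[0] + 1]
    else:
        # Drop the last segment and bump the one before it.
        upper = segments[:-2] + [segments[-2] + 1]
    # Pad with zeros so the upper bound has as many segments as the input.
    upper += [0] * (len(segments) - len(upper))
    return [(">=", version), ("<", ".".join(str(s) for s in upper))]


def gem_to_vers(requirements: list[str]) -> str:
    """Convert requirements such as '~> 2.2.0' or '!= 2.2.1' to a vers string."""
    constraints = []
    for requirement in requirements:
        operator, version = requirement.split()
        if operator == "~>":
            constraints.extend(expand_pessimistic(version))
        elif operator == "=":
            # Equality is written as a bare version in vers.
            constraints.append(("", version))
        else:
            constraints.append((operator, version))
    # Canonical form: constraints sorted by version.
    constraints.sort(key=lambda constraint: version_key(constraint[1]))
    return "vers:gem/" + "|".join(op + version for op, version in constraints)


print(gem_to_vers(["~> 2.2.0", "!= 2.2.1"]))  # vers:gem/>=2.2.0|!=2.2.1|<2.3.0
```

Pre-release segments and the full RubyGems comparison rules are deliberately out of scope for this sketch.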
269
+
270
+ URI scheme
271
+ ~~~~~~~~~~
272
+
273
+ The ``vers`` URI scheme is an acronym for "VErsion Range Specifier".
274
+ It has been selected because it is short, obviously about version and available
275
+ for a future formal registration for this URI-scheme at the IANA registry.
276
+
277
+ The URI scheme is followed by a colon ":".
278
+
279
+
280
+ ``<versioning-scheme>``
281
+ ~~~~~~~~~~~~~~~~~~~~~~~
282
+
283
+ The ``<versioning-scheme>`` (such as ``npm``, ``deb``, etc.) determines:
284
+
285
+ - the specific notation and conventions used for a version string encoded in
286
+ this scheme. Versioning schemes often specify a version segments separator and
287
+ the meaning of each version segment, such as [major.minor.patch] in semver.
288
+
289
+ - how two versions are compared as greater or lesser to determine if a version
290
+ is within or outside a range.
291
+
292
+ - how a versioning scheme-specific range notation can be transformed into the
293
+ ``vers`` simplified notation defined here.
294
+
295
+ By convention the versioning scheme **should** be the same as the ``Package URL``
296
+ package type for a given package ecosystem. It is OK to have other schemes
297
+ beyond the purl type. A scheme could be specific to a single package name.
298
+
299
+ The ``<versioning-scheme>`` is followed by a slash "/".
300
+
301
+
302
+ ``<version-constraint>``
303
+ ~~~~~~~~~~~~~~~~~~~~~~~~
304
+
305
+ After the ``<versioning-scheme>`` and "/" there are one or more
306
+ ``<version-constraint>`` separated by a pipe "|". The pipe "|" has no special
307
+ meaning beside being a separator.
308
+
309
+ Each ``<version-constraint>`` of this list is either a single ``<version>`` as
310
+ in ``1.2.3`` for example or the combination of a ``<comparator>`` and a ``<version>`` as in
311
+ ``>=2.0.0`` using this syntax::
312
+
313
+ <comparator><version>
314
+
315
+ A single version means that a version equal to this version satisfies the
316
+ range spec. Equality is based on the equality of two normalized version strings
317
+ according to their versioning scheme. For most schemes, this is a simple string
318
+ equality. But schemes can specify normalization and rules for equality such as
319
+ ``pypi`` with PEP440.
320
+
321
+
322
+ The special star "*" comparator matches any version. It must be used **alone**
323
+ exclusive of any other constraint and must not be followed by a version. For
324
+ example "vers:deb/\*" represents all the versions of a Debian package. This
325
+ includes past, current and possible future versions.
326
+
327
+
328
+ Otherwise, the ``<comparator>`` is one of these comparison operators:
329
+
330
+ - "!=": Version exclusion or inequality comparator. This means a version must
331
+ not be equal to the provided version that must be excluded from the range.
332
+ For example: "!=1.2.3" means that version "1.2.3" is excluded.
333
+
334
+ - "<", "<=": Lesser than or lesser-or-equal version comparators point to all
335
+ versions less than, or less than or equal to, the provided version.
336
+ For example "<=1.2.3" means less than or equal to "1.2.3".
337
+
338
+ - ">", ">=": Greater than or greater-or-equal version comparators point to
339
+ all versions greater than, or greater than or equal to, the provided version.
340
+ For example ">=1.2.3" means greater than or equal to "1.2.3".
341
+
342
+
343
+ The ``<versioning-scheme>`` defines:
344
+
345
+ - how to compare two version strings using these comparators, and
346
+
347
+ - the structure of a version string such as "1.2.3" if any. For instance, the
348
+ ``semver`` specification for version numbers defines a version as composed
349
+ primarily of three dot-separated numeric segments named major, minor and patch.
350
+
351
+
352
+
353
+ Normalized, canonical representation and validation
354
+ -----------------------------------------------------
355
+
356
+ The construction and validation rules are designed such that a ``vers`` is
357
+ easier for humans to read and understand and straightforward for tools to process,
358
+ attempting to avoid the creation of empty or impossible version ranges.
359
+
360
+ - Spaces are not significant and removed in a canonical form. For example
361
+ "<1.2.3|>=2.0" and " < 1.2. 3 | > = 2 . 0" are equivalent.
362
+
363
+ - A version range specifier contains only printable ASCII letters, digits and
364
+ punctuation.
365
+
366
+ - The URI scheme and versioning scheme are always lowercase as in ``vers:npm``.
367
+
368
+ - The versions are case-sensitive, and a versioning scheme may specify its own
369
+ case sensitivity.
370
+
371
+ - If a ``version`` in a ``<version-constraint>`` contains separator or
372
+ comparator characters (i.e. ``><=!*|``), it must be quoted using the URL
373
+ quoting rules. This should be rare in practice.
374
+
375
+ The list of ``<version-constraint>s`` of a range are signposts in the version
376
+ timeline of a package. With these few and simple validation rules, we can avoid
377
+ the creation of most empty or impossible version ranges:
378
+
379
+ - **Constraints are sorted by version**. The canonical ordering is the versions
380
+ order. The ordering of ``<version-constraint>`` is not significant otherwise
381
+ but this sort order is needed when checking if a version is contained in a range.
382
+
383
+ - **Versions are unique**. Each ``version`` must be unique in a range and can
384
+ occur only once in any ``<version-constraint>`` of a range specifier,
385
+ irrespective of its comparators. Tools must report an error for duplicated
386
+ versions.
387
+
388
+ - **There is only one star**: "*" must only occur once and alone in a range,
389
+ without any other constraint or version.
390
+
391
+ Starting from a de-duplicated and sorted list of constraints, these extra rules
392
+ apply to the comparators of any two contiguous constraints to be valid:
393
+
394
+ - A "!=" constraint can be followed by a constraint using any comparator, i.e.,
395
+ any of "=", "!=", ">", ">=", "<", "<=" as comparator (or no constraint).
396
+
397
+ Ignoring all constraints with "!=" comparators:
398
+
399
+ - A "=" constraint must be followed only by a constraint with one of "=", ">",
400
+ ">=" as comparator (or no constraint).
401
+
402
+ And ignoring all constraints with "=" or "!=" comparators, the sequence of
403
+ constraint comparators must be an alternation of greater and lesser comparators:
404
+
405
+ - "<" and "<=" must be followed by one of ">", ">=" (or no constraint).
406
+ - ">" and ">=" must be followed by one of "<", "<=" (or no constraint).
407
+
408
+ Tools must report an error for such invalid ranges.
409
+
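The validation rules above can be sketched as a small Python check. This is an illustration, not a reference implementation: it assumes constraints are already parsed into ``(comparator, version)`` tuples with "=" made explicit, and it uses a hypothetical ``version_key`` that only understands dot-separated integers. A real tool would use the comparison procedure of the versioning scheme instead.

```python
def version_key(version):
    # Assumption: dot-separated integer versions; real schemes differ.
    return tuple(int(part) for part in version.split("."))


def validate_constraints(constraints):
    """Raise ValueError if a list of (comparator, version) tuples is invalid."""
    comparators = [comparator for comparator, _ in constraints]

    # There is only one star, and it must be alone.
    if "*" in comparators:
        if len(constraints) != 1:
            raise ValueError("'*' must be the only constraint")
        return

    # Constraints are sorted by version and each version is unique.
    keys = [version_key(version) for _, version in constraints]
    if keys != sorted(keys):
        raise ValueError("constraints are not sorted by version")
    if len(set(keys)) != len(keys):
        raise ValueError("duplicated version")

    # Ignoring "!=": "=" may only be followed by "=", ">" or ">=".
    without_unequal = [c for c in comparators if c != "!="]
    for current, following in zip(without_unequal, without_unequal[1:]):
        if current == "=" and following not in ("=", ">", ">="):
            raise ValueError(f"'=' cannot be followed by {following!r}")

    # Ignoring "=" and "!=": lesser and greater comparators must alternate.
    bounds = [c for c in without_unequal if c != "="]
    for current, following in zip(bounds, bounds[1:]):
        # The first character ("<" or ">") gives the direction.
        if current[0] == following[0]:
            raise ValueError(f"{current!r} cannot be followed by {following!r}")
```

For example, ``[(">=", "2.2.0"), ("!=", "2.2.1"), ("<", "2.3.0")]`` passes, while ``[("<", "1.0"), ("<=", "2.0")]`` raises because two lesser comparators are contiguous.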
410
+
411
+ Parsing and validating version range specifiers
412
+ -------------------------------------------------
413
+
414
+ To parse a version range specifier string:
415
+
416
+ - Remove all spaces and tabs.
417
+ - Start from left, and split once on colon ":".
418
+ - The left hand side is the URI-scheme that must be lowercase.
419
+ - Tools must validate that the URI-scheme value is ``vers``.
420
+ - The right hand side is the specifier.
421
+
422
+ - Split the specifier from left once on a slash "/".
423
+
424
+ - The left hand side is the <versioning-scheme> that must be lowercase.
425
+ Tools should validate that the <versioning-scheme> is a known scheme.
426
+
427
+ - The right hand side is a list of one or more constraints.
428
+ Tools must validate that this constraints string is not empty ignoring spaces.
429
+
430
+ - If the constraints string is equal to "*", the ``<version-constraint>`` is "*".
431
+ Parsing is done and no further processing is needed for this ``vers``. A tool
432
+ should report an error if there are extra characters beyond "*".
433
+
434
+ - Strip leading and trailing pipes "|" from the constraints string.
435
+ - Split the constraints on pipe "|". The result is a list of ``<version-constraint>``.
436
+ Consecutive pipes must be treated as one and leading and trailing pipes ignored.
437
+
438
+ - For each ``<version-constraint>``:
439
+ - Determine if the ``<version-constraint>`` starts with one of these comparators:
440
+
441
+ - If it starts with ">=", then the comparator is ">=".
442
+ - If it starts with "<=", then the comparator is "<=".
443
+ - If it starts with "!=", then the comparator is "!=".
444
+ - If it starts with "<", then the comparator is "<".
445
+ - If it starts with ">", then the comparator is ">".
446
+
447
+ - Remove the comparator from ``<version-constraint>`` string start. The
448
+ remaining string is the version.
449
+
450
+ - Otherwise the version is the full ``<version-constraint>`` string (which implies
451
+ an equality comparator of "=")
452
+
453
+ - Tools should validate and report an error if the version is empty.
454
+
455
+ - If the version contains a percent "%" character, apply URL quoting rules
456
+ to unquote this string.
457
+
458
+ - Append the parsed (comparator, version) to the constraints list.
459
+
460
+ Finally:
461
+
462
+ - The results are the ``<versioning-scheme>`` and the list of ``<comparator, version>``
463
+ constraints.
464
+
465
+ Tools may optionally validate and simplify the list of ``<comparator, version>``
466
+ constraints once parsing is complete:
467
+
468
+ - Sort and validate the list of constraints.
469
+ - Simplify the list of constraints.
470
+
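The parsing steps above can be sketched in Python as follows. The function name ``parse_vers`` and the choice to return "=" and "*" as explicit comparators are assumptions of this sketch. It does not check that the versioning scheme is a known one, and it leaves sorting and simplification to a later step.

```python
from urllib.parse import unquote

# Two-character comparators are tested first so that ">=" is not read as ">".
COMPARATORS = (">=", "<=", "!=", "<", ">")


def parse_vers(text):
    """Parse a vers string into (versioning_scheme, [(comparator, version), ...])."""
    # Spaces and tabs are not significant.
    text = text.replace(" ", "").replace("\t", "")

    scheme, colon, specifier = text.partition(":")
    if not colon or scheme != "vers":
        raise ValueError("the URI scheme must be 'vers'")

    versioning_scheme, slash, constraints_text = specifier.partition("/")
    if not slash or not versioning_scheme:
        raise ValueError("missing versioning scheme")
    if versioning_scheme != versioning_scheme.lower():
        raise ValueError("the versioning scheme must be lowercase")
    if not constraints_text:
        raise ValueError("empty constraints")

    if constraints_text == "*":
        return versioning_scheme, [("*", "")]
    if "*" in constraints_text:
        # A literal star inside a version would have to be URL-quoted.
        raise ValueError("'*' must be used alone")

    constraints = []
    for item in constraints_text.split("|"):
        if not item:
            # Leading, trailing and consecutive pipes are ignored.
            continue
        for comparator in COMPARATORS:
            if item.startswith(comparator):
                version = item[len(comparator):]
                break
        else:
            # A bare version implies an equality comparator.
            comparator, version = "=", item
        if not version:
            raise ValueError(f"empty version in constraint {item!r}")
        if "%" in version:
            version = unquote(version)
        constraints.append((comparator, version))

    if not constraints:
        raise ValueError("empty constraints")
    return versioning_scheme, constraints


print(parse_vers("vers:npm/1.2.3|>=2.0.0|<5.0.0"))
# ('npm', [('=', '1.2.3'), ('>=', '2.0.0'), ('<', '5.0.0')])
```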
471
+
472
+ Version constraints simplification
473
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
474
+
475
+ Tools can simplify a list of ``<version-constraint>`` using this approach:
476
+
477
+ These pairs of contiguous constraints with these comparators are valid:
478
+
479
+ - != followed by anything
480
+ - =, <, or <= followed by =, !=, >, or >=
481
+ - >, or >= followed by !=, <, or <=
482
+
483
+ These pairs of contiguous constraints with these comparators are redundant and
484
+ invalid (ignoring any != since they can show up anywhere):
485
+
486
+ - =, < or <= followed by < or <=: this is the same as < or <=
487
+ - > or >= followed by =, > or >=: this is the same as > or >=
488
+
489
+
490
+ A procedure to remove redundant constraints can be:
491
+
492
+ - Start from a list of constraints of comparator and version, sorted by version
493
+ and where each version occurs only once in any constraint.
494
+
495
+ - If the constraints list contains a single constraint (star, equal or anything)
496
+ return this list and simplification is finished.
497
+
498
+ - Split the constraints list in two sub lists:
499
+
500
+ - a list of "unequal constraints" where the comparator is "!="
501
+ - a remainder list of "constraints" where the comparator is not "!="
502
+
503
+ - If the remainder list of "constraints" is empty, return the "unequal constraints"
504
+ list and simplification is finished.
505
+
506
+ - Iterate over the constraints list, considering the current and next contiguous
507
+ constraints, and the previous constraint (e.g., before current) if it exists:
508
+
509
+ - If current comparator is ">" or ">=" and next comparator is "=", ">" or ">=",
510
+ discard next constraint
511
+
512
+ - If current comparator is "=", "<" or "<=" and next comparator is "<" or "<=",
513
+ discard current constraint. Previous constraint becomes current if it exists.
514
+
515
+ - If there is a previous constraint:
516
+
517
+ - If previous comparator is ">" or ">=" and current comparator is "=", ">" or ">=",
518
+ discard current constraint
519
+
520
+ - If previous comparator is "=", "<" or "<=" and current comparator is "<" or "<=",
521
+ discard previous constraint
522
+
523
+ - Concatenate the "unequal constraints" list and the filtered "constraints" list
524
+ - Sort by version and return.
525
+
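One way to sketch this simplification in Python is with a single pass that keeps a stack of retained constraints. This is a variant of the procedure above rather than a line-by-line transcription, and it assumes a sorted, de-duplicated input of ``(comparator, version)`` tuples. The ``version_key`` helper is hypothetical and only handles dot-separated integers.

```python
def version_key(version):
    # Assumption: dot-separated integer versions; real schemes differ.
    return tuple(int(part) for part in version.split("."))


def simplify(constraints):
    """Drop redundant constraints from a sorted list of (comparator, version)."""
    if len(constraints) <= 1:
        return list(constraints)

    unequal = [c for c in constraints if c[0] == "!="]
    remainder = [c for c in constraints if c[0] != "!="]
    if not remainder:
        return unequal

    kept = []
    for constraint in remainder:
        comparator = constraint[0]
        # ">" or ">=" followed by "=", ">" or ">=": the later one is redundant.
        if kept and kept[-1][0] in (">", ">=") and comparator in ("=", ">", ">="):
            continue
        # "=", "<" or "<=" followed by "<" or "<=": the earlier ones are redundant.
        if comparator in ("<", "<="):
            while kept and kept[-1][0] in ("=", "<", "<="):
                kept.pop()
        kept.append(constraint)

    # Put the "!=" constraints back in version order.
    return sorted(unequal + kept, key=lambda c: version_key(c[1]))


print(simplify([(">=", "1.0"), (">", "2.0"), ("<", "3.0"), ("<=", "4.0")]))
# [('>=', '1.0'), ('<=', '4.0')]
```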
526
+
527
+ Checking if a version is contained within a range
528
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
529
+
530
+ To check if a "tested version" is contained within a version range:
531
+
532
+ - Start from a parsed version range specifier with:
533
+
534
+ - a versioning scheme
535
+ - a list of constraints of comparator and version, sorted by version
536
+ and where each version occurs only once in any constraint.
537
+
538
+ - If the constraint list contains only one item and the comparator is "*",
539
+ then the "tested version" is IN the range. Check is finished.
540
+
541
+ - Select the version equality and comparison procedures suitable for this
542
+ versioning scheme and use these for all version comparisons performed below.
543
+
544
+ - If the "tested version" is equal to any of the constraint versions
545
+ where the constraint comparator is for equality (any of "=", "<=", or ">=")
546
+ then the "tested version" is in the range. Check is finished.
547
+
548
+ - If the "tested version" is equal to any of the constraint versions where
549
+ the constraint comparator is "!=" then the "tested version" is NOT in the
550
+ range. Check is finished.
551
+
552
+ - Split the constraint list in two sub lists:
553
+
554
+ - a first list where the comparator is "=" or "!="
555
+ - a second list where the comparator is neither "=" nor "!="
556
+
557
+ - Iterate over the current and next contiguous constraints pairs (aka. pairwise)
558
+ in the second list.
559
+
560
+ - For each current and next constraint:
561
+
562
+ - If this is the first iteration and current comparator is "<" or "<="
563
+ and the "tested version" is less than the current version
564
+ then the "tested version" is IN the range. Check is finished.
565
+
566
+ - If this is the last iteration and next comparator is ">" or ">="
567
+ and the "tested version" is greater than the next version
568
+ then the "tested version" is IN the range. Check is finished.
569
+
570
+ - If current comparator is ">" or ">=" and next comparator is "<" or "<="
571
+ and the "tested version" is greater than the current version
572
+ and the "tested version" is less than the next version
573
+ then the "tested version" is IN the range. Check is finished.
574
+
575
+ - If current comparator is "<" or "<=" and next comparator is ">" or ">="
576
+ then these versions are outside the range. Continue to the next iteration.
577
+
578
+ - Reaching here without having finished the check before means that the
579
+ "tested version" is NOT in the range.
580
+
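The containment check above can be sketched in Python as follows. Two assumptions are worth stating. First, the hypothetical ``version_key`` helper compares dot-separated integers only, so "2.0" and "2.0.0" are treated as different versions. Second, the open-ended checks are done on the first and last bound directly instead of inside the pairwise loop, so that a range with a single bound such as ``>=2.0.0`` is also handled.

```python
def version_key(version):
    # Assumption: dot-separated integer versions; real schemes differ.
    return tuple(int(part) for part in version.split("."))


def contains(constraints, tested_version):
    """Return True if tested_version is within the sorted, unique constraints."""
    if len(constraints) == 1 and constraints[0][0] == "*":
        return True

    tested = version_key(tested_version)

    # Equality and exclusion checks. Versions are unique in a range, so at
    # most one constraint can carry the tested version.
    for comparator, version in constraints:
        if version_key(version) == tested:
            if comparator in ("=", "<=", ">="):
                return True
            if comparator == "!=":
                return False

    # Keep only the constraints that bound an interval.
    bounds = [
        (comparator, version_key(version))
        for comparator, version in constraints
        if comparator not in ("=", "!=")
    ]
    if not bounds:
        return False

    # Interval open on the left: everything below the first "<" or "<=".
    first_comparator, first_version = bounds[0]
    if first_comparator in ("<", "<=") and tested < first_version:
        return True

    # Interval open on the right: everything above the last ">" or ">=".
    last_comparator, last_version = bounds[-1]
    if last_comparator in (">", ">=") and tested > last_version:
        return True

    # Closed intervals: a greater bound followed by a lesser bound.
    for (current, low), (following, high) in zip(bounds, bounds[1:]):
        if current in (">", ">=") and following in ("<", "<=") and low < tested < high:
            return True

    return False


npm_range = [("=", "1.2.3"), (">=", "2.0.0"), ("<", "5.0.0")]
print(contains(npm_range, "3.1.0"))  # True
print(contains(npm_range, "5.0.0"))  # False
```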
581
+
582
+ Notes and caveats
583
+ ~~~~~~~~~~~~~~~~~~~
584
+
585
+ - Comparing versions from two different versioning schemes is an error. Even
586
+ though there may be some similarities between the ``semver`` version of an npm
587
+ and the ``deb`` version of its Debian packaging, the way versions are compared
588
+ is specific to each versioning scheme and may be different. Tools should report
589
+ an error in this case.
590
+
591
+ - All references to sorting or ordering of version constraints mean sorting
592
+ by version. And sorting by versions always implies using the versioning
593
+ scheme-specified version comparison and ordering.
594
+
595
+
596
+ Some of the known versioning schemes
597
+ ----------------------------------------
598
+
599
+ These are a few known versioning schemes for some common Package URL
600
+ `types` (aka. ``ecosystem``).
601
+
602
+ - **deb**: Debian and Ubuntu https://www.debian.org/doc/debian-policy/ch-relationships.html
603
+ Debian uses these comparators: <<, <=, =, >= and >>.
604
+
605
+ - **rpm**: RPM distros https://rpm-software-management.github.io/rpm/manual/dependencies.html
606
+ A simplified rpmvercmp version comparison routine is used by Arch Linux Pacman.
607
+
608
+ - **gem**: RubyGems https://guides.rubygems.org/patterns/#semantic-versioning
609
+ which is similar to ``node-semver`` for its syntax, but does not use semver
610
+ versions.
611
+
612
+ - **npm**: npm uses node-semver which is based on semver with its own range
613
+ notation https://github.com/npm/node-semver#ranges
614
+ A similar but different scheme is used by Rust
615
+ https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html
616
+ and several other package types may use ``node-semver``-like ranges. But most
617
+ of these related schemes are not strictly the same as what is implemented in
618
+ ``node-semver``. For instance PHP ``composer`` may need its own scheme as this
619
+ is not strictly ``node-semver``.
620
+
621
+ - **composer**: PHP https://getcomposer.org/doc/articles/versions.md
622
+
623
+ - **pypi**: Python https://www.python.org/dev/peps/pep-0440/
624
+
625
+ - **cpan**: Perl https://perlmaven.com/how-to-compare-version-numbers-in-perl-and-for-cpan-modules
626
+
627
+ - **golang**: Go modules https://golang.org/ref/mod#versions use ``semver`` versions
628
+ with a specific minimum version resolution algorithm.
629
+
630
+ - **maven**: Apache Maven supports a math interval notation which is rarely seen
631
+ in practice http://maven.apache.org/enforcer/enforcer-rules/versionRanges.html
632
+
633
+ - **nuget**: NuGet https://docs.microsoft.com/en-us/nuget/concepts/package-versioning#version-ranges
634
+ Note that Apache Maven and NuGet are following a similar approach with a
635
+ math-derived intervals syntax as in https://en.wikipedia.org/wiki/Interval_(mathematics)
636
+
637
+ - **gentoo**: Gentoo https://wiki.gentoo.org/wiki/Version_specifier
638
+
639
+ - **alpine**: Alpine Linux https://gitlab.alpinelinux.org/alpine/apk-tools/-/blob/master/src/version.c
640
+ which is using Gentoo-like conventions.
641
+
642
+
643
+ These are generic schemes, to use sparingly for special cases:
644
+
645
+ - **generic**: a generic version comparison algorithm (which will be specified
646
+ later, likely based on a split on any wholly alpha or wholly numeric segments
647
+ and dealing with digit and string comparisons, like is done in libversion)
648
+
649
+ - **none**: a generic versioning scheme for a range containing no version.
650
+ ``vers:none/*`` is the only valid vers form for this scheme.
651
+
652
+ - **all**: a generic versioning scheme for a range containing all versions.
653
+ ``vers:all/*`` is the only valid vers form for this scheme.
654
+
655
+ - **intdot**: a generic versioning scheme that allows version components to be
656
+ specified as integers separated by dots, e.g. ``10.234.5.12``. Versions
657
+ specified in this scheme consist of ASCII digits only, formatted with only
658
+ non-negative integers, and ignoring leading zeros. Interpretation of the
659
+ version should stop at the first character that is not a digit or a dot.
660
+
661
+ - **semver**: a generic scheme that uses the same syntax as ``semver``. It follows the MAJOR.MINOR.PATCH format and is defined in the Semantic Versioning Specification 2.0.0, see https://semver.org/spec/v2.0.0.html.
662
+
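As an illustration of the ``intdot`` scheme described above, a comparison key could be sketched in Python like this. The helper name ``intdot_key`` is hypothetical. Skipping empty segments and comparing versions of different lengths as plain tuples are assumptions of this sketch, since the description above does not settle those cases.

```python
import re


def intdot_key(version):
    """Build a sortable key for an intdot version such as '10.234.5.12'."""
    # Stop at the first character that is not an ASCII digit or a dot.
    prefix = re.match(r"[0-9.]*", version).group(0)
    # int() drops leading zeros; empty segments (e.g. from '1..2') are skipped.
    return tuple(int(part) for part in prefix.split(".") if part)


print(intdot_key("10.234.5.12"))                 # (10, 234, 5, 12)
print(intdot_key("1.02.3-beta"))                 # (1, 2, 3)
print(intdot_key("1.9") < intdot_key("1.10"))    # True
```

Comparing the resulting tuples gives numeric rather than lexical ordering, which is why "1.9" sorts before "1.10".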
663
+ A separate document will provide details for each versioning scheme and:
664
+
665
+ - how to convert its native range notation to the ``vers`` notation and back.
666
+ - how to compare and sort two versions in a range.
667
+
668
+ This versioning schemes document will also explain how to convert CVE and OSV
669
+ ranges to ``vers``.
670
+
671
+
672
+ TODO: add Rust, composer and archlinux, nginx, tomcat, apache.
673
+
674
+
675
+ Implementations
676
+ -----------------------
677
+
678
+ - Python: https://github.com/nexB/univers
679
+ - Java: https://github.com/nscuro/versatile
680
+ - Yours!
681
+
682
+
683
+
684
+ Related efforts and alternatives
685
+ ------------------------------------
686
+
687
+ - CUDF defines a generic range notation similar to Debian and integer version
688
+ numbers from the sequence of versions for universal dependencies resolution
689
+ https://www.mancoosi.org/cudf/primer/
690
+
691
+ - OSV is an "Open source vulnerability DB and triage service." It defines
692
+ vulnerable version range semantics using a minimal set of comparators for use
693
+ with package "ecosystem" and version range "type".
694
+ https://github.com/google/osv
695
+
696
+ - libversion is a library for general purpose version comparison using a
697
+ unified procedure designed to work with many package types.
698
+ https://github.com/repology/libversion
699
+
700
+ - unified-range is a library for uniform version ranges based on the Maven
701
+ version range spec. It supports Apache Maven and npm ranges.
702
+ https://github.com/snyk/unified-range
703
+
704
+ - dephell specifier is a library to parse and evaluate version ranges and
705
+ "work with version specifiers (can parse PEP-440, SemVer, Ruby, NPM, Maven)"
706
+ https://github.com/dephell/dephell_specifier
707
+
708
+
709
+ Why not reuse existing version range notations?
710
+ -----------------------------------------------------
711
+
712
+ Most existing version range notations are tied to a specific version string
713
+ syntax and are therefore not readily applicable to other contexts. For example,
714
+ the use of elements such as tilde and caret ranges in RubyGems, npm or Dart
715
+ notations implies that a certain structure exists in the version string (semver
716
+ or semver-like). The inclusion of these additional comparators is a result of
717
+ the history and evolution in a given package ecosystem to address specific needs.
718
+
719
+ In practice, the unified and reduced set of comparators and syntax defined for
720
+ ``vers`` has been designed such that all these notations can be converted to a
721
+ ``vers`` and back from a ``vers`` to the original notation.
722
+
723
+ In contrast, this would not be possible with existing notations. For instance,
724
+ the Python notation may not work with npm semver versions and vice versa.
725
+
726
+ There are likely to be a few rare cases where round tripping from and to
727
+ ``vers`` may not be possible. In all other cases, round tripping to and from
728
+ ``vers`` should produce equivalent results, even if not strictly the same original
729
+ strings.
730
+
731
+ Another issue with existing version range notations is that they are primarily
732
+ designed for dependencies and not for vulnerable ranges. In particular, a
733
+ vulnerability may exist for multiple "version branches" of a given package such
734
+ as with Django 2.x and 3.x. Several version range notations have difficulty
735
+ expressing these because typically all the version constraints must be satisfied.
736
+ In contrast, a vulnerability can affect multiple disjoint version ranges of a
737
+ package and any version satisfying these constraints would be vulnerable: it
738
+ may not be possible to express this with a notation designed exclusively for
739
+ dependency version resolution.
740
+
741
+ Finally, one of the goals of this spec is to be a compact yet obvious Package
742
+ URL companion for version ranges. Several existing and closely related notations
743
+ designed for vulnerable ranges are verbose specifications designed for use
744
+ in APIs with larger JSON documents.
745
+
746
+
747
+ Why not use the OSV Ranges?
748
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
749
+
750
+ See:
751
+
752
+ - https://ossf.github.io/osv-schema/
753
+
754
+ ``vers`` and the OSSF OSV schema vulnerable ranges are equivalent and ``vers``
755
+ provides a compact range notation while OSV provides more verbose JSON notation.
756
+
757
+ ``vers`` borrows the design from and was informed by the OSV schema spec and its
758
+ authors.
759
+
760
+ OSV uses a minimalist set of only three comparators:
761
+
762
+ - "=" to enumerate versions,
763
+ - ">=" for the version that introduced a vulnerability, and
764
+ - "<" for the version that fixed a vulnerability.
765
+
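+ As a rough illustration of how these comparators line up with ``vers``, the
+ Python sketch below turns OSV-style range events into a ``vers`` string. The
+ function name ``osv_events_to_vers`` is hypothetical, the versioning scheme is
+ passed in by the caller because the ecosystem mapping is not defined here, and
+ only the "introduced" and "fixed" events listed above are handled. Treating
+ an "introduced" value of "0" as "no lower bound" follows the OSV schema
+ convention and is an assumption of this sketch.
+
```python
def osv_events_to_vers(scheme: str, events: list) -> str:
    """Convert OSV "introduced"/"fixed" events to a vers string (sketch only)."""
    constraints = []
    for event in events:
        if "introduced" in event:
            # ">=" marks the version that introduced the vulnerability.
            # "0" means "from the first version", so no lower bound is emitted.
            if event["introduced"] != "0":
                constraints.append(">=" + event["introduced"])
        elif "fixed" in event:
            # "<" marks the version that fixed the vulnerability.
            constraints.append("<" + event["fixed"])
        else:
            raise ValueError(f"unsupported OSV event in this sketch: {event!r}")
    # With no bound at all, every version is affected: use the "*" constraint.
    return f"vers:{scheme}/" + ("|".join(constraints) or "*")
```
+
+ For example, an event list with "introduced" 2.0 and "fixed" 2.2.5 becomes
+ ``>=2.0|<2.2.5`` under the scheme chosen by the caller.
+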
766
+ OSV Ranges support neither ">" nor "!=" comparators making it difficult to
767
+ express some ranges that must exclude a version. This may not be an issue for
768
+ most vulnerable ranges yet:
769
+
770
+ - this makes it difficult or impossible to precisely express certain dependency
771
+ and vulnerable ranges when a version must be excluded and the set of existing
772
+ versions is not yet known,
773
+
774
+ - this makes some ranges more verbose such as with the CVE v5 API ranges
775
+ notation that can include their upper limit and would need two constraints.
776
+
777
+ Another high-level difference between the two specifications is the set of
778
+ codes used to qualify a range: the OSV package "ecosystem" value closely resembles
779
+ the Package URL package "type" used in ``vers``. This spec will provide a strict
780
+ mapping between the OSV ecosystem and the ``vers`` versioning schemes values.
781
+
782
+
783
+ Why not use the CVE v5 API Ranges?
784
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
785
+
786
+ See:
787
+
788
+ - https://github.com/CVEProject/cve-schema/blob/master/schema/v5.0/CVE_JSON_5.0_schema.json#L303
789
+ - https://github.com/CVEProject/cve-schema/blob/master/schema/v5.0/CVE_JSON_5.0_schema.json#L123
790
+
791
+ Version 5 of the CVE JSON data format defines version ranges with a
792
+ starting version, a versionType, and an upper limit for the version range as
793
+ lessThan or lessThanOrEqual or as an enumeration of versions. The possible values
793
+ of versionType and of the package collectionURL are only indicative and are left
794
+ out of the CVE specification. They seem strictly equivalent to the ``vers``
795
+ versioning scheme on the one hand and to the Package URL "type" on the other hand.
797
+
798
+ The semantics and expressiveness of each range are similar and ``vers`` provides
799
+ a compact notation rather than a more verbose JSON notation. ``vers`` supports
800
+ strictly the conversion of any CVE v5 range to its notation and further
801
+ provides a concrete list of well known versioning schemes. ``vers`` design was
802
+ informed by the CVE v5 API schema spec and its authors.
803
+
804
+ When CVE v5 becomes active, this spec will provide a strict mapping between the
805
+ CVE ``versionType`` and the ``vers`` versioning schemes values. Furthermore, this
806
+ spec and the Package URL "types" should be updated accordingly to provide
807
+ a mapping with the upcoming CVE ``collectionURL`` that will be effectively used.
808
+
809
+ There is one issue with CVE v5: it introduces a new trailing "*" notation that
810
+ does not exist in most version range notations and may not be computable
811
+ easily in many cases. The description of the "lessThan" property is:
812
+
813
+ The non-inclusive upper limit of the range. This is the least version NOT
814
+ in the range. The usual version syntax is expanded to allow a pattern to end
815
+ in an asterisk `(*)`, indicating an arbitrarily large number in the version
816
+ ordering. For example, `{version: 1.0 lessThan: 1.*}` would describe the
817
+ entire 1.X branch for most range kinds, and `{version: 2.0, lessThan: *}`
818
+ describes all versions starting at 2.0, including 3.0, 5.1, and so on.
819
+
820
+ The conversion to ``vers`` range should be:
821
+
822
+ - with a version 1.0 and `"lessThan": "*"`, the ``vers`` equivalent is: ``>=1.0``.
823
+
824
+ - with a version 1.0 and `"lessThan": "2.*"`, the ``vers`` equivalent can be
825
+ computed for ``semver`` versions as ``>=1.0|<2`` but this is not accurate in
826
+ general as versioning schemes have different rules. For instance, a pre-release may be
827
+ treated in some cases as part of the v1. branch and in other cases as part
828
+ of the v2. branch. It is not clear if with "2.*" the CVE v5 spec means:
829
+
830
+ - ``<2``
831
+ - or something that excludes any version string that starts with ``2.``
832
+
833
+ And in this case, with the expression `"lessThan": "2.*"` using a ``semver``
834
+ version, it is not clear if ``2.0.0-alpha`` is "lessThan"; semver sorts it
835
+ before ``2.0`` and after ``1.0``, i.e., in ``semver`` ``2.0.0-alpha`` is
836
+ "less than" ``2``.
837
+
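+ The conversion rules discussed above can be sketched in Python as follows.
+ The function name ``cve5_range_to_vers`` is hypothetical and the default
+ ``semver`` scheme is an assumption. Because the meaning of a partial wildcard
+ such as ``2.*`` is unclear, this sketch refuses to convert it rather than guess.
+ Writing an exact version without a comparator is also an assumption about the
+ ``vers`` syntax.
+
```python
def cve5_range_to_vers(entry: dict, scheme: str = "semver") -> str:
    """Convert one CVE v5 version entry to a vers string (sketch only)."""
    start = entry["version"]
    for key, comparator in (("lessThan", "<"), ("lessThanOrEqual", "<=")):
        if key not in entry:
            continue
        upper = entry[key]
        if upper == "*":
            # A lone "*" is an arbitrarily large version: no upper bound.
            return f"vers:{scheme}/>={start}"
        if upper.endswith("*"):
            # "2.*" and similar patterns are ambiguous, see the text above.
            raise ValueError(f"ambiguous wildcard upper limit: {upper!r}")
        return f"vers:{scheme}/>={start}|{comparator}{upper}"
    # No upper limit field: the entry enumerates a single version.
    # Assumption: an exact version is written without a comparator.
    return f"vers:{scheme}/{start}"
```
+
+ A caller that knows the versioning scheme well enough could replace the
+ ``ValueError`` branch with scheme-specific logic, for instance ``<2`` for
+ ``semver`` if pre-releases of 2.0.0 are meant to be included in the range.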
838
+
839
+ Why not use the NVD CPE Ranges?
840
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
841
+
842
+ See:
843
+
844
+ - https://nvd.nist.gov/vuln/vulnerability-detail-pages#divRange
845
+ - https://nvd.nist.gov/developers/vulnerabilities#divResponse
846
+ - https://csrc.nist.gov/schema/nvd/feed/1.1/nvd_cve_feed_json_1.1.schema
847
+
848
+ The version ranges notation defined in the JSON schema of the CVE API payload
849
+ uses these four fields: ``versionStartIncluding``, ``versionStartExcluding``,
850
+ ``versionEndIncluding`` and ``versionEndExcluding``. For example::
851
+
852
+ "versionStartIncluding": "7.3.0",
853
+ "versionEndExcluding": "7.3.31",
854
+ "versionStartExcluding" : "9.0.0",
855
+ "versionEndIncluding" : "9.0.46",
856
+
857
+ In addition to these ranges, the NVD publishes a list of concrete CPE with
858
+ versions resolved for a range with daily updates at
859
+ https://nvd.nist.gov/vuln/data-feeds#cpeMatch
860
+
861
+ Note that the NVD CVE configuration is a complex specification that goes well
862
+ beyond version ranges and is used to match comprehensive configurations across
863
+ multiple products and version ranges. The focus of ``vers`` is exclusively versions.
864
+
865
+ In contrast with ``vers`` compact notation, the NVD JSON notation is more
866
+ verbose, yet ``vers`` supports strictly the conversion of any CPE range.
867
+
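+ The four NVD fields map one to one onto ``vers`` comparators. The Python
+ sketch below shows this mapping; the function name ``nvd_matches_to_vers`` is
+ hypothetical, the ``generic`` scheme is an assumption since the NVD data does
+ not name a versioning scheme, and the constraints are emitted in input order
+ without the sorting and validation a real implementation would need.
+
```python
# Each NVD field corresponds to exactly one vers comparator.
NVD_FIELD_TO_COMPARATOR = {
    "versionStartIncluding": ">=",
    "versionStartExcluding": ">",
    "versionEndIncluding": "<=",
    "versionEndExcluding": "<",
}


def nvd_matches_to_vers(matches: list, scheme: str = "generic") -> str:
    """Convert a list of NVD match entries to one vers string (sketch only)."""
    constraints = []
    for match in matches:
        # Start fields come first in the mapping, so each range reads low to high.
        for field, comparator in NVD_FIELD_TO_COMPARATOR.items():
            if field in match:
                constraints.append(comparator + match[field])
    return f"vers:{scheme}/" + "|".join(constraints)
```
+
+ With the two ranges from the example above, this yields
+ ``vers:generic/>=7.3.0|<7.3.31|>9.0.0|<=9.0.46``.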
868
+
869
+ Why not use node-semver ranges?
870
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
871
+
872
+ See:
873
+
874
+ - https://github.com/npm/node-semver#ranges
875
+
876
+ The node-semver spec is similar to this spec but much more complex. A range is
877
+ an OR of ANDs of constraints, with a few practical issues:
878
+
879
+ - A space means "AND", therefore white spaces are significant. Having
880
+ significant white spaces in a string makes normalization more complicated and
881
+ may be a source of confusion if you remove the spaces from the string.
882
+ ``vers`` avoids the ambiguity of spaces by ignoring them.
883
+
884
+ - The advanced range syntax has grown to be rather complex using hyphen ranges,
885
+ stars ranges, carets and tilde constructs that are all tied to the JavaScript
886
+ and npm ways of handling versions in their ecosystem and are bound furthermore
887
+ to the semver semantics and its npm implementation. These are not readily
888
+ reusable elsewhere. The multiple comparators and modifiers make the notation
889
+ grammar more complex to parse and process for a machine and harder to read for
890
+ humans.
891
+
892
+ Notations that are directly derived from node-semver as used in Rust and PHP
893
+ Composer have the same issues.
894
+
895
+
896
+ Why not use Python PEP-0440 ranges?
897
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
898
+
899
+ See:
900
+
901
+ - https://www.python.org/dev/peps/pep-0440/#version-specifiers
902
+
903
+ The Python pep-0440 "Version Identification and Dependency Specification"
904
+ provides a comprehensive specification for Python package versioning and a
905
+ notation for "version specifiers" to express the version constraints of
906
+ dependencies.
907
+
908
+ This specification is similar to this ``vers`` spec, with more operators and
909
+ aspects specific to the versions used only in the Python ecosystem.
910
+
911
+ - In particular pep-0440 uses tilde, triple equal and wildcard star operators
912
+ that are specific to how two Python versions are compared.
913
+
914
+ - The comma separator between constraints is a logical "AND" rather than an
915
+ "OR". The "OR" does not exist in the syntax making some version ranges
916
+ harder to express, in particular for vulnerabilities that may affect several
917
+ exact versions or ranges for multiple parallel release branches. Ranges such as
918
+ "Django 1.2 or later, or Django 2.2 or later or Django 3.2 or later" are
919
+ difficult to express without an "OR" logic.
920
+
921
+
922
+ Why not use RubyGems requirements notation?
923
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
924
+
925
+ See:
926
+
927
+ - https://guides.rubygems.org/patterns/#declaring-dependencies
928
+
929
+ The RubyGems specification suggests but does not enforce using semver. It uses
930
+ operators similar to the ``node-semver`` spec with the difference of the "~>"
931
+ aka. pessimistic operator vs. a plain "~" tilde used in node-semver. This
932
+ operator implies some semver-like versioning, yet gem versions are not strictly
933
+ semver. This makes the notation complex to implement and impractical to reuse
934
+ in places that do not use the same Ruby-specific semver-like semantics.
935
+
936
+
937
+ Why not use fewer comparators with only =, >= and <?
938
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
939
+
940
+ For instance, the OSV schema adopts a reduced set of only three comparators:
941
+
942
+ - "=" is implied when used to enumerate vulnerable versions
943
+ - ">=" (greater or equal) is for the version that introduces a vulnerability
944
+ - "<" (less than) is for the version that fixes a vulnerability
945
+
946
+ This approach is simpler and works well for most vulnerable ranges but it faces
947
+ limitations when converting from other notations:
948
+
949
+ - ">" cannot be converted reliably to ">=" unless you know all the versions and
950
+ these will never change.
951
+
952
+ - "<=" cannot be converted reliably to "<" unless you know all the versions and
953
+ these will never change.
954
+
955
+ - "!=" cannot be converted reliably: there is no ">" comparator to create an
956
+ unequal equivalent of "><"; and a combo of ">=" and "<" is not equivalent
957
+ to inequality unless you know all the versions and these will never change.
958
+
959
+
960
+ Why not use richer comparators such as tilde, caret and star?
961
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
962
+
963
+ Some existing notations such as used with npm, gem, python, or composer
964
+ provide syntactic shorthand such as:
965
+
966
+ - a "pessimistic operator" using tilde, ~> or =~ as in "~1.3" or "~>1.2.3"
967
+ - a caret ^ prefix as in "^ 1.2"
968
+ - using a star in a version segment as in "1.2.*"
969
+ - dash-separated ranges as in "1.2 - 1.4"
970
+ - arbitrary string equality such as "===1.2"
971
+
972
+ Most of these notations can be converted without loss to the ``vers`` notation.
973
+ Furthermore these notations typically assume a well defined version string
974
+ structure specific to their package ecosystem and are not reusable in another
975
+ ecosystem that would not use the exact same version conventions.
976
+
977
+ For instance, the tilde and caret notations demand that you can reliably
978
+ infer the next version (aka. "bump") from a given version; this is possible
979
+ only if the versioning scheme supports this operation reliably for all its
980
+ accepted versions.
981
+
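+ To make the "bump" requirement concrete, here is a minimal Python sketch that
+ expands npm-style caret and tilde shorthands into plain ``vers`` constraints.
+ The function names are hypothetical, only full ``MAJOR.MINOR.PATCH`` versions
+ without pre-release or build parts are accepted, and the upper bound is
+ simplified: node-semver itself uses a bound that also excludes pre-releases of
+ the next version.
+
```python
def _parse_semver(version: str) -> tuple:
    """Split a plain MAJOR.MINOR.PATCH string; raises ValueError otherwise."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def _format(parts: tuple) -> str:
    return ".".join(str(part) for part in parts)


def caret_to_vers(version: str) -> str:
    """Expand "^version": bump the leftmost non-zero component."""
    major, minor, patch = _parse_semver(version)
    if major > 0:
        upper = (major + 1, 0, 0)
    elif minor > 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)
    return f"vers:npm/>={version}|<{_format(upper)}"


def tilde_to_vers(version: str) -> str:
    """Expand "~version": bump the minor component."""
    major, minor, _patch = _parse_semver(version)
    return f"vers:npm/>={version}|<{_format((major, minor + 1, 0))}"
```
+
+ The expansion works only because the next version can be computed from the
+ three integer components; a scheme without such a structure offers nothing to
+ bump.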
982
+
983
+ Why not use mathematical interval notation for ranges?
984
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
985
+
986
+ Apache Maven and NuGet use a mathematical interval notation with comma-separated
987
+ "[", "]", "(" and ")" to declare version ranges.
988
+
989
+ All other known range notations use the more common ">", "<", and "=" as
990
+ comparators. ``vers`` adopts this familiar approach.
991
+
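+ Interval notation translates mechanically to comparators, as the following
+ Python sketch suggests. The function name ``maven_range_to_vers`` is
+ hypothetical, the input is assumed to be well formed, and no sorting or
+ validation of the resulting constraints is attempted. Writing an exact
+ version without a comparator is an assumption about the ``vers`` syntax.
+
```python
import re

# One bracketed interval such as "[1.0,2.0)" or "(,1.0]".
_INTERVAL = re.compile(r"([\[\(])([^\[\]\(\)]*)([\]\)])")


def maven_range_to_vers(spec: str) -> str:
    """Convert a Maven-style interval range to a vers string (sketch only)."""
    constraints = []
    for opening, body, closing in _INTERVAL.findall(spec):
        if "," not in body:
            # "[1.0]" pins an exact version: equality, written bare.
            constraints.append(body.strip())
            continue
        lower, upper = (part.strip() for part in body.split(",", 1))
        if lower:
            # "[" includes the lower bound, "(" excludes it.
            constraints.append((">=" if opening == "[" else ">") + lower)
        if upper:
            # "]" includes the upper bound, ")" excludes it.
            constraints.append(("<=" if closing == "]" else "<") + upper)
    return "vers:maven/" + "|".join(constraints)
```
+
+ For instance ``[1.0,2.0)`` becomes ``vers:maven/>=1.0|<2.0`` and the two-part
+ range ``(,1.0],[1.2,)`` becomes ``vers:maven/<=1.0|>=1.2``.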
992
+
993
+ References
994
+ ---------------------
995
+
996
+
997
+ Here are some of the discussions that led to the creation of this specification:
998
+
999
+ - https://github.com/package-url/purl-spec/issues/66
1000
+ - https://github.com/package-url/purl-spec/issues/84
1001
+ - https://github.com/package-url/purl-spec/pull/93
1002
+ - https://github.com/nexB/vulnerablecode/issues/119
1003
+ - https://github.com/nexB/vulnerablecode/issues/140
1004
+ - https://github.com/nexB/univers/pull/11
1005
+
1006
+ License
1007
+ ---------------------
1008
+
1009
+ This document is licensed under the MIT license.