linkify-it-rb 1.2.0 → 2.0.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 82b2fb7b546639550e88ace3601ef00e6e250947
4
- data.tar.gz: c858ce560f4f15ed72d70612ba6db5d136ced5a3
3
+ metadata.gz: 176768c4d108c26b13260d1e62c1cb30cd0491f5
4
+ data.tar.gz: a47bd97acdf25582ce67d65551a1d94678d56aa3
5
5
  SHA512:
6
- metadata.gz: 17a3be5c92263a111057cfa856eb69feb4ead4b44cad7ba61aecba117d5fda188e42b0a545094be724aeba58dc00d2a0d260db7e3a00004f40da5d78a37c6eec
7
- data.tar.gz: 00860971169d330cecebd36dfd2d1dff8b34aee1f75049240d27e1b8ddae0d1cf48d22f63debaab8e877f44f4cc64789d9a51262329c9b66cae01c46a0194da6
6
+ metadata.gz: 1cbcc4a1e42a24df990484e927b1049a81ff87a8bd0a53f43d32f9d0687ec2bcfb2ec81778e1fbd4a33e11c9e41796e4787d84b8f3f55461ed853dfe4b40c05e
7
+ data.tar.gz: 5b25b74826e53b4257fdc2b4ea1b0422659d44c9ec69e877ee37b396d9597654db71811d2593193552785cdb8a71258485d4a57a6447b2036cbf324386d0380f
data/README.md CHANGED
@@ -1,11 +1,16 @@
1
1
  # linkify-it-rb
2
2
 
3
3
  [![Gem Version](https://badge.fury.io/rb/linkify-it-rb.svg)](http://badge.fury.io/rb/linkify-it-rb)
4
-
5
- Links recognition library with full unicode support. Focused on high quality link pattern detection in plain text. For use with both Ruby and RubyMotion.
4
+ [![Build Status](https://travis-ci.org/digitalmoksha/linkify-it-rb.svg?branch=master)](https://travis-ci.org/digitalmoksha/linkify-it-rb)
6
5
 
7
6
  This gem is a port of the [linkify-it javascript package](https://github.com/markdown-it/linkify-it) by Vitaly Puzrin, that is used for the [markdown-it](https://github.com/markdown-it/markdown-it) package.
8
7
 
8
+ _Currently synced with linkify-it 2.0.3_
9
+
10
+ ---
11
+
12
+ Links recognition library with full unicode support. Focused on high quality link pattern detection in plain text. For use with both Ruby and RubyMotion.
13
+
9
14
  __[Javascript Demo](http://markdown-it.github.io/linkify-it/)__
10
15
 
11
16
  Features:
@@ -46,8 +51,8 @@ Usage examples
46
51
  ```ruby
47
52
  linkify = Linkify.new
48
53
 
49
- # add unoffocial `.onion` domain.
50
- linkify.tlds('.onion', true) # Add unofficial `.onion` domain
54
+ # Reload full tlds list & add unofficial `.onion` domain.
55
+ linkify.tlds('onion', true) # Add unofficial `.onion` domain
51
56
  linkify.add('git:', 'http:') # Add `git:` protocol as "alias"
52
57
  linkify.add('ftp:', nil) # Disable `ftp:` protocol
53
58
  linkify.set({fuzzyIP: true}) # Enable IPs in fuzzy links (without schema)
@@ -59,7 +64,7 @@ linkify.match('Site github.com!'))
59
64
  => [#<Linkify::Match @schema="", @index=5, @lastIndex=15, @raw="github.com", @text="github.com", @url="github.com">]
60
65
  ```
61
66
 
62
- ##### Exmple 2. Add twitter mentions handler
67
+ ##### Example 2. Add twitter mentions handler
63
68
 
64
69
  ```ruby
65
70
  linkify.add('@', {
@@ -96,7 +101,7 @@ By default understands:
96
101
  `schemas` is a Hash, where each key/value describes protocol/rule:
97
102
 
98
103
  - __key__ - link prefix (usually, protocol name with `:` at the end, `skype:`
99
- for example). `linkify-it-rb` makes shure that prefix is not preceeded with
104
+ for example). `linkify-it-rb` makes sure that prefix is not preceded with
100
105
  alphanumeric char.
101
106
  - __value__ - rule to check tail after link prefix
102
107
  - _String_ - just alias to existing rule
@@ -108,10 +113,11 @@ By default understands:
108
113
 
109
114
  `options`:
110
115
 
111
- - __fuzzyLink__ - recognige URL-s without `http(s):` prefix. Default `true`.
116
+ - __fuzzyLink__ - recognize URL-s without `http(s)://` head. Default `true`.
112
117
  - __fuzzyIP__ - allow IPs in fuzzy links above. Can conflict with some texts
113
118
  like version numbers. Default `false`.
114
- - __fuzzyEmail__ - recognize emails without `mailto:` prefix.
119
+ - __fuzzyEmail__ - recognize emails without `mailto:` prefix. Default `true`.
120
+ - __---__ - set `true` to terminate link with `---` (if it's considered as long dash).
115
121
 
116
122
 
117
123
  ### .test(text)
@@ -149,16 +155,14 @@ Each match has:
149
155
 
150
156
  ### .tlds(list[, keepOld])
151
157
 
152
- Load (or merge) new tlds list. These are used for fuzzy links (without prefix)
158
+ Load (or merge) new tlds list. These are needed for fuzzy links (without schema)
153
159
  to avoid false positives. By default this algorithm uses:
154
160
 
155
- - hostname with any 2-letter root zones are ok.
156
- - biz|com|edu|gov|net|org|pro|web|xxx|aero|asia|coop|info|museum|name|shop|рф
157
- are ok.
161
+ - 2-letter root zones are ok.
162
+ - biz|com|edu|gov|net|org|pro|web|xxx|aero|asia|coop|info|museum|name|shop|рф are ok.
158
163
  - encoded (`xn--...`) root zones are ok.
159
164
 
160
- If list is replaced, then exact match for 2-chars root zones will be checked.
161
-
165
+ If that's not enough, you can reload defaults with more detailed zones list.
162
166
 
163
167
  ### .add(schema, definition)
164
168
 
@@ -1,18 +1,18 @@
1
- # encoding: utf-8
2
-
3
1
  if defined?(Motion::Project::Config)
4
-
2
+
5
3
  lib_dir_path = File.dirname(File.expand_path(__FILE__))
6
4
  Motion::Project::App.setup do |app|
7
- app.files.unshift(Dir.glob(File.join(lib_dir_path, "linkify-it-rb/**/*.rb")))
5
+ app.files.unshift(Dir.glob(File.join(lib_dir_path, 'linkify-it-rb/**/*.rb')))
6
+
7
+ app.files_dependencies File.join(lib_dir_path, 'linkify-it-rb/index.rb') => File.join(lib_dir_path, 'linkify-it-rb/re.rb')
8
8
  end
9
-
9
+
10
10
  require 'uc.micro-rb'
11
11
 
12
12
  else
13
-
13
+
14
14
  require 'uc.micro-rb'
15
15
  require 'linkify-it-rb/re'
16
16
  require 'linkify-it-rb/index'
17
-
17
+
18
18
  end
@@ -1,15 +1,15 @@
1
1
  class Linkify
2
2
  include ::LinkifyRe
3
-
3
+
4
4
  attr_accessor :__index__, :__last_index__, :__text_cache__, :__schema__, :__compiled__
5
5
  attr_accessor :re, :bypass_normalizer
6
-
6
+
7
7
  # RE pattern for 2-character tlds (autogenerated by ./support/tlds_2char_gen.js)
8
8
  TLDS_2CH_SRC_RE = 'a[cdefgilmnoqrstuwxz]|b[abdefghijmnorstvwyz]|c[acdfghiklmnoruvwxyz]|d[ejkmoz]|e[cegrstu]|f[ijkmor]|g[abdefghilmnpqrstuwy]|h[kmnrtu]|i[delmnoqrst]|j[emop]|k[eghimnprwyz]|l[abcikrstuvy]|m[acdeghklmnopqrstuvwxyz]|n[acefgilopruz]|om|p[aefghklmnrstwy]|qa|r[eosuw]|s[abcdeghijklmnortuvxyz]|t[cdfghjklmnortvwz]|u[agksyz]|v[aceginu]|w[fs]|y[et]|z[amw]'
9
9
 
10
10
  # DON'T try to make PRs with changes. Extend TLDs with LinkifyIt.tlds() instead
11
11
  TLDS_DEFAULT = 'biz|com|edu|gov|net|org|pro|web|xxx|aero|asia|coop|info|museum|name|shop|рф'.split('|')
12
-
12
+
13
13
  DEFAULT_OPTIONS = {
14
14
  fuzzyLink: true,
15
15
  fuzzyEmail: true,
@@ -23,7 +23,7 @@ class Linkify
23
23
 
24
24
  if (!obj.re[:http])
25
25
  # compile lazily, because "host"-containing variables can change on tlds update.
26
- obj.re[:http] = Regexp.new('^\\/\\/' + LinkifyRe::SRC_AUTH + LinkifyRe::SRC_HOST_PORT_STRICT + LinkifyRe::SRC_PATH, 'i')
26
+ obj.re[:http] = Regexp.new('^\\/\\/' + obj.re[:src_auth] + obj.re[:src_host_port_strict] + obj.re[:src_path], 'i')
27
27
  end
28
28
  if obj.re[:http] =~ tail
29
29
  return tail.match(obj.re[:http])[0].length
@@ -38,13 +38,24 @@ class Linkify
38
38
  tail = text.slice(pos..-1)
39
39
 
40
40
  if (!obj.re[:no_http])
41
- # compile lazily, becayse "host"-containing variables can change on tlds update.
42
- obj.re[:no_http] = Regexp.new('^' + LinkifyRe::SRC_AUTH + LinkifyRe::SRC_HOST_PORT_STRICT + LinkifyRe::SRC_PATH, 'i')
41
+ # compile lazily, because "host"-containing variables can change on tlds update.
42
+ obj.re[:no_http] = Regexp.new(
43
+ '^' +
44
+ obj.re[:src_auth] +
45
+ # Don't allow single-level domains, because of false positives like '//test'
46
+ # with code comments
47
+ '(?:localhost|(?:(?:' + obj.re[:src_domain] + ')\\.)+' + obj.re[:src_domain_root] + ')' +
48
+ obj.re[:src_port] +
49
+ obj.re[:src_host_terminator] +
50
+ obj.re[:src_path],
51
+ 'i'
52
+ )
43
53
  end
44
54
 
45
55
  if (obj.re[:no_http] =~ tail)
46
- # should not be `://`, that protects from errors in protocol name
56
+ # should not be `://` & `///`, that protects from errors in protocol name
47
57
  return 0 if (pos >= 3 && text[pos - 3] == ':')
58
+ return 0 if (pos >= 3 && text[pos - 3] == '/')
48
59
  return tail.match(obj.re[:no_http])[0].length
49
60
  end
50
61
  return 0
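Note on the hunk above: the schemaless (`//`) validator is tightened in two ways. The host must now be `localhost` or contain at least one dot, and a match is rejected when the character before `//` is `:` or `/`. The following is a minimal standalone sketch of that logic, not the gem's code: `DOMAIN_SKETCH`, `NO_HTTP_SKETCH` and `schemaless_length` are hypothetical names, and the domain pattern is a simplified ASCII stand-in for `src_domain` / `src_domain_root`.

```ruby
# Simplified ASCII-only label (assumption); the gem's real pattern is Unicode-aware.
DOMAIN_SKETCH = '[a-z0-9]+(?:-[a-z0-9]+)*'

# Require `localhost` or at least one dot, so `//test` in a code comment is ignored.
NO_HTTP_SKETCH = Regexp.new(
  '\\A(?:localhost|(?:' + DOMAIN_SKETCH + '\\.)+' + DOMAIN_SKETCH + ')',
  Regexp::IGNORECASE
)

# `pos` is the index just past the `//` prefix.
# Returns the length of the matched host, or 0 when there is no link.
def schemaless_length(text, pos)
  match = NO_HTTP_SKETCH.match(text[pos..-1])
  return 0 unless match
  # Reject `://` (broken protocol name) and `///`.
  return 0 if pos >= 3 && [':', '/'].include?(text[pos - 3])
  match[0].length
end

puts schemaless_length('see //example.com', 6)   # 11
puts schemaless_length('//test', 2)              # 0
puts schemaless_length('http://example.com', 7)  # 0
puts schemaless_length('x///example.com', 4)     # 0
```

The sketch stops at the host; the real expression also appends the port, host terminator and path sources.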
@@ -55,7 +66,7 @@ class Linkify
55
66
  tail = text.slice(pos..-1)
56
67
 
57
68
  if (!obj.re[:mailto])
58
- obj.re[:mailto] = Regexp.new('^' + LinkifyRe::SRC_EMAIL_NAME + '@' + LinkifyRe::SRC_HOST_STRICT, 'i')
69
+ obj.re[:mailto] = Regexp.new('^' + obj.re[:src_email_name] + '@' + obj.re[:src_host_strict], 'i')
59
70
  end
60
71
  if (obj.re[:mailto] =~ tail)
61
72
  return tail.match(obj.re[:mailto])[0].length
@@ -104,18 +115,21 @@ class Linkify
104
115
  #
105
116
  #------------------------------------------------------------------------------
106
117
  def compile
107
- @re = { src_xn: LinkifyRe::SRC_XN }
118
+ @re = build_re(@__opts__)
108
119
 
109
120
  # Define dynamic patterns
110
121
  tlds = @__tlds__.dup
122
+
123
+ onCompile
124
+
111
125
  tlds.push(TLDS_2CH_SRC_RE) if (!@__tlds_replaced__)
112
126
  tlds.push(@re[:src_xn])
113
127
 
114
- @re[:src_tlds] = tlds.join('|')
115
- @re[:email_fuzzy] = Regexp.new(LinkifyRe::TPL_EMAIL_FUZZY.gsub('%TLDS%', @re[:src_tlds]), true)
116
- @re[:link_fuzzy] = Regexp.new(LinkifyRe::TPL_LINK_FUZZY.gsub('%TLDS%', @re[:src_tlds]), true)
117
- @re[:link_no_ip_fuzzy] = Regexp.new(LinkifyRe::TPL_LINK_NO_IP_FUZZY.gsub('%TLDS%', @re[:src_tlds]), true)
118
- @re[:host_fuzzy_test] = Regexp.new(LinkifyRe::TPL_HOST_FUZZY_TEST.gsub('%TLDS%', @re[:src_tlds]), true)
128
+ @re[:src_tlds] = tlds.join('|')
129
+ @re[:email_fuzzy] = Regexp.new(@re[:tpl_email_fuzzy].gsub('%TLDS%', @re[:src_tlds]), true)
130
+ @re[:link_fuzzy] = Regexp.new(@re[:tpl_link_fuzzy].gsub('%TLDS%', @re[:src_tlds]), true)
131
+ @re[:link_no_ip_fuzzy] = Regexp.new(@re[:tpl_link_no_ip_fuzzy].gsub('%TLDS%', @re[:src_tlds]), true)
132
+ @re[:host_fuzzy_test] = Regexp.new(@re[:tpl_host_fuzzy_test].gsub('%TLDS%', @re[:src_tlds]), true)
119
133
 
120
134
  #
121
135
  # Compile each schema
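Note on the hunk above: `compile` joins the TLD list into one alternation and substitutes it for the `%TLDS%` placeholder in each fuzzy template, which now comes from the per-instance `@re` table instead of `LinkifyRe` constants. A rough sketch of that substitution step, assuming a much simpler template than the gem's (`TPL_HOST_SKETCH` and `compile_host_re` are hypothetical names):

```ruby
# Hypothetical, heavily simplified stand-in for the gem's fuzzy host template.
TPL_HOST_SKETCH = '\\A[a-z0-9-]+\\.(?:%TLDS%)\\z'

# Join the TLD list into one alternation and substitute it into the template.
# The block form of gsub inserts the text literally, so backslashes in a
# TLD pattern are not interpreted as back-references.
def compile_host_re(tlds)
  alternation = tlds.join('|')
  Regexp.new(TPL_HOST_SKETCH.gsub('%TLDS%') { alternation }, Regexp::IGNORECASE)
end

puts compile_host_re(%w[com org]).match?('example.onion')        # false
puts compile_host_re(%w[com org onion]).match?('example.onion')  # true
```

Because the expression is rebuilt from the list each time, adding a zone such as `onion` only takes effect after a recompile.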
@@ -190,8 +204,8 @@ class Linkify
190
204
  slist = @__compiled__.select {|name, val| name.length > 0 && !val.nil? }.keys.map {|str| escapeRE(str)}.join('|')
191
205
 
192
206
  # (?!_) cause 1.5x slowdown
193
- @re[:schema_test] = Regexp.new('(^|(?!_)(?:>|' + LinkifyRe::SRC_Z_P_CC + '))(' + slist + ')', 'i')
194
- @re[:schema_search] = Regexp.new('(^|(?!_)(?:>|' + LinkifyRe::SRC_Z_P_CC + '))(' + slist + ')', 'ig')
207
+ @re[:schema_test] = Regexp.new('(^|(?!_)(?:[><\uff5c]|' + @re[:src_XPCc] + '))(' + slist + ')', 'i')
208
+ @re[:schema_search] = Regexp.new('(^|(?!_)(?:[><\uff5c]|' + @re[:src_XPCc] + '))(' + slist + ')', 'ig')
195
209
 
196
210
  @re[:pretest] = Regexp.new(
197
211
  '(' + @re[:schema_test].source + ')|' +
@@ -203,12 +217,12 @@ class Linkify
203
217
 
204
218
  resetScanCache
205
219
  end
206
-
220
+
207
221
  # Match result. Single element of array, returned by [[LinkifyIt#match]]
208
222
  #------------------------------------------------------------------------------
209
223
  class Match
210
224
  attr_accessor :schema, :index, :lastIndex, :raw, :text, :url
211
-
225
+
212
226
  def initialize(obj, shift)
213
227
  start = obj.__index__
214
228
  endt = obj.__last_index__
@@ -288,11 +302,14 @@ class Linkify
288
302
  #
289
303
  #------------------------------------------------------------------------------
290
304
  def initialize(schemas = {}, options = {})
305
+ schemas = {} unless schemas
306
+
307
+ # not needed
291
308
  # if (!(this instanceof LinkifyIt)) {
292
309
  # return new LinkifyIt(schemas, options);
293
310
  # }
294
311
 
295
- #--- not needed, if you want to pass options, then must also pass schemas
312
+ # not needed, if you want to pass options, then must also pass schemas
296
313
  # if options.empty?
297
314
  # if (isOptionsObj(schemas)) {
298
315
  # options = schemas;
@@ -321,7 +338,6 @@ class Linkify
321
338
  compile
322
339
  end
323
340
 
324
-
325
341
  # chainable
326
342
  # LinkifyIt#add(schema, definition)
327
343
  # - schema (String): rule name (fixed pattern prefix)
@@ -356,7 +372,7 @@ class Linkify
356
372
  @__index__ = -1
357
373
 
358
374
  return false if (!text.length)
359
-
375
+
360
376
  # try to scan for link with schema - that's the most simple rule
361
377
  if @re[:schema_test] =~ text
362
378
  re = @re[:schema_search]
@@ -449,7 +465,7 @@ class Linkify
449
465
  # LinkifyIt#match(text) -> Array|null
450
466
  #
451
467
  # Returns array of found link descriptions or `null` on fail. We strongly suggest
452
- # to use [[LinkifyIt#test]] first, for best speed.
468
+ # recommend to use [[LinkifyIt#test]] first, for best speed.
453
469
  #
454
470
  # ##### Result match description
455
471
  #
@@ -527,7 +543,7 @@ class Linkify
527
543
  #------------------------------------------------------------------------------
528
544
  def normalize(match)
529
545
  return if @bypass_normalizer
530
-
546
+
531
547
  # Do minimal possible changes by default. Need to collect feedback prior
532
548
  # to move forward https://github.com/markdown-it/linkify-it/issues/1
533
549
 
@@ -538,4 +554,11 @@ class Linkify
538
554
  end
539
555
  end
540
556
 
557
+ # LinkifyIt#onCompile()
558
+ #
559
+ # Override to modify basic RegExp-s.
560
+ #------------------------------------------------------------------------------
561
+ def onCompile
562
+ end
563
+
541
564
  end
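Note on the `onCompile` hook added above: it is an empty method that `compile` calls right after building the source table, so a subclass can adjust the sources before the regular expressions are compiled. The toy classes below only illustrate that template-method pattern. `SketchLinkifier`, `GitAwareLinkifier`, `on_compile` and the `:src_scheme` key are hypothetical and do not exist in the gem.

```ruby
# Toy stand-in for the compile flow: build sources, call the hook, compile.
class SketchLinkifier
  attr_reader :re

  def initialize
    compile
  end

  def compile
    @re = { src_scheme: 'https?' } # 1. default source table
    on_compile                     # 2. hook: subclasses may edit @re here
    # 3. compile the final expression from the (possibly edited) sources
    @re[:link] = Regexp.new('\\A(?:' + @re[:src_scheme] + ')://\\S+')
  end

  # Empty by default, like the gem's `onCompile`.
  def on_compile; end
end

# A subclass that widens the scheme list before compilation.
class GitAwareLinkifier < SketchLinkifier
  def on_compile
    @re[:src_scheme] += '|git'
  end
end

puts SketchLinkifier.new.re[:link].match?('git://host/repo')    # false
puts GitAwareLinkifier.new.re[:link].match?('git://host/repo')  # true
```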
@@ -1,57 +1,40 @@
1
1
  module LinkifyRe
2
-
2
+
3
3
  # Use direct extract instead of `regenerate` to reduce size
4
4
  SRC_ANY = UCMicro::Properties::Any::REGEX.source
5
5
  SRC_CC = UCMicro::Categories::Cc::REGEX.source
6
6
  SRC_Z = UCMicro::Categories::Z::REGEX.source
7
7
  SRC_P = UCMicro::Categories::P::REGEX.source
8
8
 
9
- # \p{\Z\P\Cc} (white spaces + control + punctuation)
9
+ # \p{\Z\P\Cc\Cf} (white spaces + control + format + punctuation)
10
10
  SRC_Z_P_CC = [ SRC_Z, SRC_P, SRC_CC ].join('|')
11
11
 
12
12
  # \p{\Z\Cc} (white spaces + control)
13
13
  SRC_Z_CC = [ SRC_Z, SRC_CC ].join('|')
14
14
 
15
+ # Experimental. List of chars, completely prohibited in links
16
+ # because can separate it from other part of text
17
+ TEXT_SEPARATORS = '[><\uff5c]'
18
+
15
19
  # All possible word characters (everything without punctuation, spaces & controls)
16
20
  # Defined via punctuation & spaces to save space
17
21
  # Should be something like \p{\L\N\S\M} (\w but without `_`)
18
- SRC_PSEUDO_LETTER = '(?:(?!' + SRC_Z_P_CC + ')' + SRC_ANY + ')'
22
+ SRC_PSEUDO_LETTER = '(?:(?!' + TEXT_SEPARATORS + '|' + SRC_Z_P_CC + ')' + SRC_ANY + ')'
19
23
  # The same as above but without [0-9]
20
- SRC_PSEUDO_LETTER_NON_D = '(?:(?![0-9]|' + SRC_Z_P_CC + ')' + SRC_ANY + ')'
24
+ # SRC_PSEUDO_LETTER_NON_D = '(?:(?![0-9]|' + SRC_Z_P_CC + ')' + SRC_ANY + ')'
21
25
 
22
26
  #------------------------------------------------------------------------------
23
27
 
24
28
  SRC_IP4 = '(?:(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
25
- SRC_AUTH = '(?:(?:(?!' + SRC_Z_CC + ').)+@)?'
29
+
30
+ # Prohibit any of "@/[]()" in user/pass to avoid wrong domain fetch.
31
+ SRC_AUTH = '(?:(?:(?!' + SRC_Z_CC + '|[@/\\[\\]()]).)+@)?'
26
32
 
27
33
  SRC_PORT = '(?::(?:6(?:[0-4]\\d{3}|5(?:[0-4]\\d{2}|5(?:[0-2]\\d|3[0-5])))|[1-5]?\\d{1,4}))?'
28
34
 
29
- SRC_HOST_TERMINATOR = '(?=$|' + SRC_Z_P_CC + ')(?!-|_|:\\d|\\.-|\\.(?!$|' + SRC_Z_P_CC + '))'
35
+ SRC_HOST_TERMINATOR = '(?=$|' + TEXT_SEPARATORS + '|' + SRC_Z_P_CC + ')(?!-|_|:\\d|\\.-|\\.(?!$|' + SRC_Z_P_CC + '))'
30
36
 
31
- SRC_PATH =
32
- '(?:' +
33
- '[/?#]' +
34
- '(?:' +
35
- '(?!' + SRC_Z_CC + '|[()\\[\\]{}.,"\'?!\\-]).|' +
36
- '\\[(?:(?!' + SRC_Z_CC + '|\\]).)*\\]|' +
37
- '\\((?:(?!' + SRC_Z_CC + '|[)]).)*\\)|' +
38
- '\\{(?:(?!' + SRC_Z_CC + '|[}]).)*\\}|' +
39
- '\\"(?:(?!' + SRC_Z_CC + '|["]).)+\\"|' +
40
- "\\'(?:(?!" + SRC_Z_CC + "|[']).)+\\'|" +
41
- "\\'(?=" + SRC_PSEUDO_LETTER + ').|' + # allow `I'm_king` if no pair found
42
- '\\.{2,3}[a-zA-Z0-9%/]|' + # github has ... in commit range links. Restrict to
43
- # - english
44
- # - percent-encoded
45
- # - parts of file path
46
- # until more examples found.
47
- '\\.(?!' + SRC_Z_CC + '|[.]).|' +
48
- '\\-(?!--(?:[^-]|$))(?:-*)|' + # `---` => long dash, terminate
49
- '\\,(?!' + SRC_Z_CC + ').|' + # allow `,,,` in paths
50
- '\\!(?!' + SRC_Z_CC + '|[!]).|' +
51
- '\\?(?!' + SRC_Z_CC + '|[?]).' +
52
- ')+' +
53
- '|\\/' +
54
- ')?'
37
+ # moved SRC_PATH into re_src_path
55
38
 
56
39
  SRC_EMAIL_NAME = '[\\-;:&=\\+\\$,\\"\\.a-zA-Z0-9_]+'
57
40
  SRC_XN = 'xn--[a-z0-9\\-]{1,59}'
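Note on the `SRC_AUTH` change above: excluding `@/[]()` from the user/pass part keeps a later `@` in the path from being read as the end of the credentials. The sketch below shows the effect with simplified sources. `\s` stands in for the gem's `SRC_Z_CC`, and `OLD_AUTH_SKETCH`, `NEW_AUTH_SKETCH`, `HOST_SKETCH` and `host_after_auth` are hypothetical names.

```ruby
# Simplified versions of the old and new auth sources (assumption: \s replaces SRC_Z_CC).
OLD_AUTH_SKETCH = '(?:(?:(?!\\s).)+@)?'
NEW_AUTH_SKETCH = '(?:(?:(?![\\s@/\\[\\]()]).)+@)?'

# Simplified ASCII host, captured so it can be inspected.
HOST_SKETCH = '([a-z0-9.-]+)'

# Returns the host that follows the optional `user:pass@` part.
def host_after_auth(auth_src, tail)
  Regexp.new('\\A' + auth_src + HOST_SKETCH, Regexp::IGNORECASE).match(tail)[1]
end

tail = 'a.com/foo@b.com' # text after `http://`

puts host_after_auth(OLD_AUTH_SKETCH, tail)  # b.com (path swallowed as credentials)
puts host_after_auth(NEW_AUTH_SKETCH, tail)  # a.com
```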
@@ -59,15 +42,15 @@ module LinkifyRe
59
42
  # More to read about domain names
60
43
  # http://serverfault.com/questions/638260/
61
44
 
62
- SRC_DOMAIN_ROOT =
63
- # Can't have digits and dashes
45
+ SRC_DOMAIN_ROOT =
46
+ # Allow letters & digits (http://test1)
64
47
  '(?:' +
65
48
  SRC_XN +
66
49
  '|' +
67
- SRC_PSEUDO_LETTER_NON_D + '{1,63}' +
50
+ SRC_PSEUDO_LETTER + '{1,63}' +
68
51
  ')'
69
52
 
70
- SRC_DOMAIN =
53
+ SRC_DOMAIN =
71
54
  '(?:' +
72
55
  SRC_XN +
73
56
  '|' +
@@ -79,14 +62,15 @@ module LinkifyRe
79
62
  '(?:' + SRC_PSEUDO_LETTER + '(?:-(?!-)|' + SRC_PSEUDO_LETTER + '){0,61}' + SRC_PSEUDO_LETTER + ')' +
80
63
  ')'
81
64
 
82
- SRC_HOST =
65
+ SRC_HOST =
83
66
  '(?:' +
84
- SRC_IP4 +
85
- '|' +
86
- '(?:(?:(?:' + SRC_DOMAIN + ')\\.)*' + SRC_DOMAIN_ROOT + ')' +
67
+ # Don't need IP check, because digits are already allowed in normal domain names
68
+ # SRC_IP4 +
69
+ # '|' +
70
+ '(?:(?:(?:' + SRC_DOMAIN + ')\\.)*' + SRC_DOMAIN + ')' +
87
71
  ')'
88
72
 
89
- TPL_HOST_FUZZY =
73
+ TPL_HOST_FUZZY =
90
74
  '(?:' +
91
75
  SRC_IP4 +
92
76
  '|' +
@@ -96,27 +80,98 @@ module LinkifyRe
96
80
  TPL_HOST_NO_IP_FUZZY =
97
81
  '(?:(?:(?:' + SRC_DOMAIN + ')\\.)+(?:%TLDS%))'
98
82
 
99
- SRC_HOST_STRICT = SRC_HOST + SRC_HOST_TERMINATOR
100
- TPL_HOST_FUZZY_STRICT = TPL_HOST_FUZZY + SRC_HOST_TERMINATOR
101
- SRC_HOST_PORT_STRICT = SRC_HOST + SRC_PORT + SRC_HOST_TERMINATOR
102
- TPL_HOST_PORT_FUZZY_STRICT = TPL_HOST_FUZZY + SRC_PORT + SRC_HOST_TERMINATOR
103
- TPL_HOST_PORT_NO_IP_FUZZY_STRICT = TPL_HOST_NO_IP_FUZZY + SRC_PORT + SRC_HOST_TERMINATOR
104
-
83
+ SRC_HOST_STRICT = SRC_HOST + SRC_HOST_TERMINATOR
84
+ TPL_HOST_FUZZY_STRICT = TPL_HOST_FUZZY + SRC_HOST_TERMINATOR
85
+ SRC_HOST_PORT_STRICT = SRC_HOST + SRC_PORT + SRC_HOST_TERMINATOR
86
+ TPL_HOST_PORT_FUZZY_STRICT = TPL_HOST_FUZZY + SRC_PORT + SRC_HOST_TERMINATOR
87
+ TPL_HOST_PORT_NO_IP_FUZZY_STRICT = TPL_HOST_NO_IP_FUZZY + SRC_PORT + SRC_HOST_TERMINATOR
88
+
105
89
  #------------------------------------------------------------------------------
106
90
  # Main rules
107
91
 
108
92
  # Rude test fuzzy links by host, for quick deny
109
- TPL_HOST_FUZZY_TEST = 'localhost|\\.\\d{1,3}\\.|(?:\\.(?:%TLDS%)(?:' + SRC_Z_P_CC + '|$))'
110
- TPL_EMAIL_FUZZY = '(^|>|' + SRC_Z_CC + ')(' + SRC_EMAIL_NAME + '@' + TPL_HOST_FUZZY_STRICT + ')'
111
- TPL_LINK_FUZZY =
112
- # Fuzzy link can't be prepended with .:/\- and non punctuation.
113
- # but can start with > (markdown blockquote)
114
- '(^|(?![.:/\\-_@])(?:[$+<=>^`|]|' + SRC_Z_P_CC + '))' +
115
- '((?![$+<=>^`|])' + TPL_HOST_PORT_FUZZY_STRICT + SRC_PATH + ')'
116
-
117
- TPL_LINK_NO_IP_FUZZY =
118
- # Fuzzy link can't be prepended with .:/\- and non punctuation.
119
- # but can start with > (markdown blockquote)
120
- '(^|(?![.:/\\-_@])(?:[$+<=>^`|]|' + SRC_Z_P_CC + '))' +
121
- '((?![$+<=>^`|])' + TPL_HOST_PORT_NO_IP_FUZZY_STRICT + SRC_PATH + ')'
93
+ TPL_HOST_FUZZY_TEST = 'localhost|www\\.|\\.\\d{1,3}\\.|(?:\\.(?:%TLDS%)(?:' + SRC_Z_P_CC + '|>|$))'
94
+ TPL_EMAIL_FUZZY = '(^|' + TEXT_SEPARATORS + '|\\(|' +SRC_Z_CC + ')(' + SRC_EMAIL_NAME + '@' + TPL_HOST_FUZZY_STRICT + ')'
95
+
96
+ # moved TPL_LINK_FUZZY and TPL_LINK_NO_IP_FUZZY into build_re
97
+
98
+ #------------------------------------------------------------------------------
99
+ def build_re(opts)
100
+ re = {
101
+ src_Any: SRC_ANY,
102
+ src_Cc: SRC_CC,
103
+ src_Z: SRC_Z,
104
+ src_P: SRC_P,
105
+ src_XPCc: SRC_Z_P_CC,
106
+ src_ZCc: SRC_Z_CC,
107
+ src_pseudo_letter: SRC_PSEUDO_LETTER,
108
+ src_ip4: SRC_IP4,
109
+ src_auth: SRC_AUTH,
110
+ src_port: SRC_PORT,
111
+ src_host_terminator: SRC_HOST_TERMINATOR,
112
+ src_path: re_src_path(opts),
113
+ src_email_name: SRC_EMAIL_NAME,
114
+ src_xn: SRC_XN,
115
+ src_domain_root: SRC_DOMAIN_ROOT,
116
+ src_domain: SRC_DOMAIN,
117
+ src_host: SRC_HOST,
118
+
119
+ tpl_host_fuzzy: TPL_HOST_FUZZY,
120
+ tpl_host_no_ip_fuzzy: TPL_HOST_NO_IP_FUZZY,
121
+ src_host_strict: SRC_HOST_STRICT,
122
+ tpl_host_fuzzy_strict: TPL_HOST_FUZZY_STRICT,
123
+ src_host_port_strict: SRC_HOST_PORT_STRICT,
124
+ tpl_host_port_fuzzy_strict: TPL_HOST_PORT_FUZZY_STRICT,
125
+ tpl_host_port_no_ip_fuzzy_strict: TPL_HOST_PORT_NO_IP_FUZZY_STRICT,
126
+
127
+ tpl_host_fuzzy_test: TPL_HOST_FUZZY_TEST,
128
+ tpl_email_fuzzy: TPL_EMAIL_FUZZY
129
+ }
130
+
131
+ # Fuzzy link can't be prepended with .:/\- and non punctuation.
132
+ # but can start with > (markdown blockquote)
133
+ re[:tpl_link_fuzzy] =
134
+ '(^|(?![.:/\\-_@])(?:[$+<=>^`|\uff5c]|' + SRC_Z_P_CC + '))' +
135
+ '((?![$+<=>^`|\uff5c])' + TPL_HOST_PORT_FUZZY_STRICT + re[:src_path] + ')'
136
+
137
+ # Fuzzy link can't be prepended with .:/\- and non punctuation.
138
+ # but can start with > (markdown blockquote)
139
+ re[:tpl_link_no_ip_fuzzy] =
140
+ '(^|(?![.:/\\-_@])(?:[$+<=>^`|\uff5c]|' + SRC_Z_P_CC + '))' +
141
+ '((?![$+<=>^`|\uff5c])' + TPL_HOST_PORT_NO_IP_FUZZY_STRICT + re[:src_path] + ')'
142
+
143
+ return re
144
+ end
145
+
146
+ #------------------------------------------------------------------------------
147
+ def re_src_path(opts = nil)
148
+ '(?:' +
149
+ '[/?#]' +
150
+ '(?:' +
151
+ '(?!' + SRC_Z_CC + '|' + TEXT_SEPARATORS + '|[()\\[\\]{}.,"\'?!\\-]).|' +
152
+ '\\[(?:(?!' + SRC_Z_CC + '|\\]).)*\\]|' +
153
+ '\\((?:(?!' + SRC_Z_CC + '|[)]).)*\\)|' +
154
+ '\\{(?:(?!' + SRC_Z_CC + '|[}]).)*\\}|' +
155
+ '\\"(?:(?!' + SRC_Z_CC + '|["]).)+\\"|' +
156
+ "\\'(?:(?!" + SRC_Z_CC + "|[']).)+\\'|" +
157
+ "\\'(?=" + SRC_PSEUDO_LETTER + '|[-]).|' + # allow `I'm_king` if no pair found
158
+ '\\.{2,3}[a-zA-Z0-9%/]|' + # github has ... in commit range links. Restrict to
159
+ # - english
160
+ # - percent-encoded
161
+ # - parts of file path
162
+ # until more examples found.
163
+ '\\.(?!' + SRC_Z_CC + '|[.]).|' +
164
+ (opts && opts[:'---'] ?
165
+ '\\-(?!--(?:[^-]|$))(?:-*)|' # `---` => long dash, terminate
166
+ :
167
+ '\\-+|'
168
+ ) +
169
+ '\\,(?!' + SRC_Z_CC + ').|' + # allow `,,,` in paths
170
+ '\\!(?!' + SRC_Z_CC + '|[!]).|' +
171
+ '\\?(?!' + SRC_Z_CC + '|[?]).' +
172
+ ')+' +
173
+ '|\\/' +
174
+ ')?'
175
+ end
176
+
122
177
  end
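Note on `re_src_path` above: the `:'---'` option only switches the dash alternative of the path expression. With the option set, a run of exactly three dashes ends the link; without it, any run of dashes stays in the path. The sketch below isolates that switch in a reduced path matcher. `sketch_path_re` is a hypothetical helper and the character class is a simplification of the real path rules.

```ruby
# Build a reduced path matcher; `long_dash` mirrors the `:'---'` option.
def sketch_path_re(long_dash)
  dash = if long_dash
           '\\-(?!--(?:[^-]|$))(?:-*)' # a run of exactly three dashes is not consumed
         else
           '\\-+'                      # any run of dashes is part of the path
         end
  Regexp.new('\\A(?:[a-z0-9/]|' + dash + ')+', Regexp::IGNORECASE)
end

puts sketch_path_re(false).match('/foo---bar')[0]  # /foo---bar
puts sketch_path_re(true).match('/foo---bar')[0]   # /foo
puts sketch_path_re(true).match('/foo--bar')[0]    # /foo--bar
```

This matches the behavior exercised by the new spec: the default keeps `foo---bar` whole, and `{ '---': true }` cuts the link at `foo`.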
@@ -1,5 +1,5 @@
1
1
  module LinkifyIt
2
2
 
3
- VERSION = '1.2.0'
3
+ VERSION = '2.0.3'
4
4
 
5
5
  end
@@ -255,4 +255,23 @@ describe 'API' do
255
255
  expect(l.match('1.1.1.1.')[0].text).to eq '1.1.1.1'
256
256
  end
257
257
 
258
+ #------------------------------------------------------------------------------
259
+ it 'should not hang in fuzzy mode with sequences of astrals' do
260
+ l = Linkify.new
261
+
262
+ l.set({ fuzzyLink: true })
263
+
264
+ expect(l.match('😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡😡 .com')).to eq []
265
+ end
266
+
267
+ #------------------------------------------------------------------------------
268
+ it 'should accept `---` if enabled' do
269
+ l = Linkify.new
270
+
271
+ expect(l.match('http://e.com/foo---bar')[0].text).to eq 'http://e.com/foo---bar'
272
+
273
+ l = Linkify.new(nil, { '---': true })
274
+
275
+ expect(l.match('http://e.com/foo---bar')[0].text).to eq 'http://e.com/foo'
276
+ end
258
277
  end
@@ -1,2 +1,2 @@
1
- require 'byebug'
1
+ require 'pry-byebug'
2
2
  require 'linkify-it-rb'
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: linkify-it-rb
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.2.0
4
+ version: 2.0.3
5
5
  platform: ruby
6
6
  authors:
7
7
  - Brett Walker
@@ -9,7 +9,7 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2016-11-14 00:00:00.000000000 Z
12
+ date: 2018-04-02 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: uc.micro-rb
@@ -25,6 +25,20 @@ dependencies:
25
25
  - - "~>"
26
26
  - !ruby/object:Gem::Version
27
27
  version: '1.0'
28
+ - !ruby/object:Gem::Dependency
29
+ name: bacon-expect
30
+ requirement: !ruby/object:Gem::Requirement
31
+ requirements:
32
+ - - "~>"
33
+ - !ruby/object:Gem::Version
34
+ version: '1.0'
35
+ type: :development
36
+ prerelease: false
37
+ version_requirements: !ruby/object:Gem::Requirement
38
+ requirements:
39
+ - - "~>"
40
+ - !ruby/object:Gem::Version
41
+ version: '1.0'
28
42
  description: Ruby version of linkify-it for motion-markdown-it, for Ruby and RubyMotion
29
43
  email: github@digitalmoksha.com
30
44
  executables: []
@@ -58,10 +72,10 @@ required_rubygems_version: !ruby/object:Gem::Requirement
58
72
  version: '0'
59
73
  requirements: []
60
74
  rubyforge_project:
61
- rubygems_version: 2.4.5
75
+ rubygems_version: 2.6.8
62
76
  signing_key:
63
77
  specification_version: 4
64
78
  summary: linkify-it for motion-markdown-it in Ruby
65
79
  test_files:
66
- - spec/linkify-it-rb/test_spec.rb
67
80
  - spec/spec_helper.rb
81
+ - spec/linkify-it-rb/test_spec.rb