uri_scanner 0.0.1
- checksums.yaml +7 -0
- data/.gitignore +14 -0
- data/.rspec +2 -0
- data/Gemfile +6 -0
- data/LICENSE.txt +22 -0
- data/README.md +73 -0
- data/Rakefile +35 -0
- data/example/parse.rb +32 -0
- data/example/scanner.rb +14 -0
- data/lib/machines/ip_addr.rl +60 -0
- data/lib/machines/ruby_actions.rl +49 -0
- data/lib/machines/sip_uri.rl +52 -0
- data/lib/machines/uri.rl +92 -0
- data/lib/uri_scanner.rb +24 -0
- data/lib/uri_scanner/ip_address.rb +532 -0
- data/lib/uri_scanner/ip_address.rl +27 -0
- data/lib/uri_scanner/uri_parser.rb +10539 -0
- data/lib/uri_scanner/uri_parser.rl +44 -0
- data/lib/uri_scanner/uri_scanner.rb +1007 -0
- data/lib/uri_scanner/uri_scanner.rl +45 -0
- data/lib/uri_scanner/version.rb +3 -0
- data/spec/ip_addr_spec.rb +64 -0
- data/spec/scanner_spec.rb +40 -0
- data/spec/spec_helper.rb +96 -0
- data/spec/uri_scanner_spec.rb +43 -0
- data/spec/uri_spec.rb +185 -0
- data/spec/url.txt +156 -0
- data/uri_scanner.gemspec +23 -0
- metadata +106 -0
data/spec/url.txt
ADDED
@@ -0,0 +1,156 @@
+Uniform Resource Locators (URL) Tim Berners-Lee
+draft-ietf-uri-url-03.{ps,txt} URI working Group
+Expires 21 September 1994 21 March 1994
+
+
+Uniform Resource Locators (URL)
+
+A Syntax for the Expression of
+Access Information of Objects on the Network
+
+
+ABOUT THIS DOCUMENT
+
+This document specifies a Uniform Resource Locator (URL), the
+syntax and semantics of formalized information for location and
+access of resources on the Internet.
+
+This document was written by the URI working group of the Internet
+Engineering Task Force. Comments may be addressed to the editor,
+Tim Berners-Lee <timbl@info.cern.ch>, or to the URI-WG
+<uri@bunyip.com>. Discussions of the group are archived at
+
+<http://www.acl.lanl.gov/URI/archive/uri-archive.index.html>
+
+This document is bound by the Requirements Specification in
+preparation.
+
+The work is derived from concepts introduced by the World-Wide Web
+global information initiative, whose use of such objects dates
+from 1990 and is described in "Universal Resource identifeirs for
+the World-Wide Web", RFCXXX .
+
+This document is available in hypertext form, with links to
+background information, as:
+
+<http://info.cern.ch/hypertext/WWW/Addressing/URL/Overview.html>
+
+.
+
+Example
+
+Yes, Jim, I found it under <ftp://info.cern.ch/pub/www/doc> but
+you can probably pick it up from <ftp://ds.internic.net/rfc>.
+
+
+
+REFERENCES
+
+Alberti, R., et.al. (1991)
+"Notes on the Internet Gopher Protocol"
+University of Minnesota, December 1991,
+<ftp://boombox.micro.umn.edu/pub/gopher/gopher_protocol> . See also
+<gopher://gopher.micro.umn.edu/00/InformationAbout Gopher/About Gopher>
+
+Berners-Lee, T ., (1991)
+"Hypertext Transfer Protocol (HTTP)" , CERN,
+December 1991, as updated from time to time,
+<ftp://info.cern.ch/pub/www/doc/http-spec.txt
+>
+
+Crocker "Standard for ARPA Internet Text Messages" .
+David H. Crocker, RFC822,
+
+Davis, F, et al., (1990)
+"WAIS Interface Protocol: Prototype
+Functional Specification", Thinking Machines
+Corporation, April 23, 1990
+"ftp://quake.think.com/pub/ais/doc/protspec.txt"
+
+International Standards Organization, (1991)
+Information and Documentation - Search and
+Retrieve Application Protocol Specification
+for open Systems Interconnection, ISO-10163
+
+Horton (1987) M. Horton, R. Adams, "Standard for
+interchange of USENET messages", Internet RFC
+1036 , 12/01/1987.
+
+Huitema, C., (1991) "Naming: strategies and techniques",
+Computer Networks and ISDN Systems 23 (1991)
+107-110.
+
+
+
+Berners-Lee 18
+
+RFC XXXX Uniform Resource Locators (URL) March 21 1994
+
+Kahle, Brewster, (1991)
+"Document Identifiers, or International
+Standard Book Numbers for the Electronic
+Age",
+<ftp://quake.think.com/pub/wais/doc/doc-ids.txt>
+
+Kantor, B., and Lapsley, P., (1986)
+"A proposed standard for the stream-based
+transmission of news" , Internet RFC-977,
+February 1986.
+<ftp://ds.internic.net/rfc/rfc977.txt>
+
+Kunze, 1994 J. Kunze, Requirements for URLs, to be
+published.
+
+Lynch, C., Coallition for Networked Information: (1991)
+"Workshop on ID and Reference Structures for
+Networked Information", November 1991. See
+<wais://quake.think.com/wais-discussion-archives?lynch>
+
+Mockapetris, P., (1987)
+"Domain names + concepts and facilities",
+RFC-1034, USC-ISI, November 1987,
+<ftp://ds.internic.net/rfc/rfc1034.txt>
+
+Neuman, B. Clifford, (1992)
+"Prospero: A Tool for Organizing Internet
+Resources", Electronic Networking: Research,
+Applications and Policy, Vol 1 No 2, Meckler
+Westport CT USA. See also
+<ftp://prospero.isi.edu/pub/prospero/oir.ps>
+
+Postel, J. and Reynolds, J. (1985)
+"File Transfer Protocol (FTP)", Internet
+RFC-959, October 1985.
+<ftp://ds.internic.net/rfc/rfc959.txt>
+
+Sollins 1994 K. Sollins and L. Masinter, Requiremnets for
+URNs, to be published.
+
+Yeong, W., (1991a) "Towards Networked Information Retrieval",
+Technical report 91-06-25-01, June 1991,
+Performance Systems International, Inc.
+<ftp://uu.psi.com/wp/nir.txt>
+
+Yeong, W., (1991b), "Representing Public Archives in the
+Directory", Internet Draft, November 1991,
+now expired.
+
+.
+
+
+Berners-Lee 19
+
+RFC XXXX Uniform Resource Locators (URL) March 21 1994
+
+EDITOR'S ADDRESS
+
+Tim Berners-Lee
+Address: World-Wide Web project
+CERN,
+1211 Geneva 23,
+Switzerland
+
+Telephone: +41 (22)767 3755
+Fax: +41 (22)767 7155
+Email: timbl@info.cern.ch
+
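Reviewer note: spec/url.txt reads like sample input for the scanner, since most of its URIs appear inside angle brackets in running text. As a rough illustration of what extracting them involves, here is a minimal sketch that uses a plain regular expression instead of the gem's Ragel-generated machine. The `ANGLE_URI` pattern and the `extract_angle_uris` helper are hypothetical names introduced for this note and are not part of uri_scanner.

```ruby
# Hypothetical helper, NOT the uri_scanner API: collect URIs that are
# written as <scheme://...> inside free text.
ANGLE_URI = %r{<([a-z][a-z0-9+.\-]*://[^<>\s]+)>}i

def extract_angle_uris(text)
  # scan returns one single-element array per match because the pattern
  # has one capture group; flatten turns that into a flat list of strings.
  text.scan(ANGLE_URI).flatten
end

# Two lines copied from the "Example" paragraph of the fixture.
sample = <<~TEXT
  Yes, Jim, I found it under <ftp://info.cern.ch/pub/www/doc> but
  you can probably pick it up from <ftp://ds.internic.net/rfc>.
TEXT

p extract_angle_uris(sample)
# => ["ftp://info.cern.ch/pub/www/doc", "ftp://ds.internic.net/rfc"]
```

This pattern is deliberately naive. It skips the mail addresses such as `<timbl@info.cern.ch>`, and it would also miss the fixture's harder cases: the gopher URI that contains spaces, the URI whose closing `>` sits on the next line, and the one wrapped in double quotes.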
data/uri_scanner.gemspec
ADDED
@@ -0,0 +1,23 @@
+# coding: utf-8
+lib = File.expand_path('../lib', __FILE__)
+$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+require 'uri_scanner/version'
+
+Gem::Specification.new do |spec|
+  spec.name = "uri_scanner"
+  spec.version = URIScanner::VERSION
+  spec.authors = ["Stas Kobzar"]
+  spec.email = ["stas@modulis.ca"]
+  spec.summary = %q{URI Parser with Ragel}
+  spec.description = %q{Parsing URI and tokenize URI segments. Scan input text and extract URIs to array. Based on Ragel FSM compiler.}
+  spec.homepage = "https://github.com/staskobzar/uri_scanner"
+  spec.license = "MIT"
+
+  spec.files = `git ls-files -z`.split("\x0")
+  spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
+  spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
+  spec.require_paths = ["lib"]
+
+  spec.add_development_dependency "bundler", "~> 1.7"
+  spec.add_development_dependency "rake", "~> 10.5"
+end
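Reviewer note: the two `~>` constraints in the gemspec are pessimistic version constraints. The sketch below shows what they admit, using RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The candidate versions are arbitrary examples picked to show the boundaries, not versions the gem is known to have been tested against, and `allowed?` is a hypothetical helper for this note.

```ruby
require "rubygems"

# Constraints copied from the gemspec's development dependencies.
bundler_req = Gem::Requirement.new("~> 1.7")   # means >= 1.7 and < 2.0
rake_req    = Gem::Requirement.new("~> 10.5")  # means >= 10.5 and < 11.0

# Hypothetical helper: does the version string satisfy the requirement?
def allowed?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

puts allowed?(bundler_req, "1.7.0")   # true
puts allowed?(bundler_req, "1.17.3")  # true  (still below 2.0)
puts allowed?(bundler_req, "2.0.0")   # false (next major version)
puts allowed?(rake_req, "10.4.2")     # false (below 10.5)
puts allowed?(rake_req, "10.5.0")     # true
puts allowed?(rake_req, "11.0.0")     # false
```

Because both constraints stop at the next major version, a development setup with Bundler 2.x or rake 11 and later would not satisfy this gemspec as written.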
metadata
ADDED
@@ -0,0 +1,106 @@
+--- !ruby/object:Gem::Specification
+name: uri_scanner
+version: !ruby/object:Gem::Version
+  version: 0.0.1
+platform: ruby
+authors:
+- Stas Kobzar
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2016-02-06 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.7'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.7'
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '10.5'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '10.5'
+description: Parsing URI and tokenize URI segments. Scan input text and extract URIs
+  to array. Based on Ragel FSM compiler.
+email:
+- stas@modulis.ca
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- ".gitignore"
+- ".rspec"
+- Gemfile
+- LICENSE.txt
+- README.md
+- Rakefile
+- example/parse.rb
+- example/scanner.rb
+- lib/machines/ip_addr.rl
+- lib/machines/ruby_actions.rl
+- lib/machines/sip_uri.rl
+- lib/machines/uri.rl
+- lib/uri_scanner.rb
+- lib/uri_scanner/ip_address.rb
+- lib/uri_scanner/ip_address.rl
+- lib/uri_scanner/uri_parser.rb
+- lib/uri_scanner/uri_parser.rl
+- lib/uri_scanner/uri_scanner.rb
+- lib/uri_scanner/uri_scanner.rl
+- lib/uri_scanner/version.rb
+- spec/ip_addr_spec.rb
+- spec/scanner_spec.rb
+- spec/spec_helper.rb
+- spec/uri_scanner_spec.rb
+- spec/uri_spec.rb
+- spec/url.txt
+- uri_scanner.gemspec
+homepage: https://github.com/staskobzar/uri_scanner
+licenses:
+- MIT
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubyforge_project:
+rubygems_version: 2.4.1
+signing_key:
+specification_version: 4
+summary: URI Parser with Ragel
+test_files:
+- spec/ip_addr_spec.rb
+- spec/scanner_spec.rb
+- spec/spec_helper.rb
+- spec/uri_scanner_spec.rb
+- spec/uri_spec.rb
+- spec/url.txt