uri_scanner 0.0.1
- checksums.yaml +7 -0
- data/.gitignore +14 -0
- data/.rspec +2 -0
- data/Gemfile +6 -0
- data/LICENSE.txt +22 -0
- data/README.md +73 -0
- data/Rakefile +35 -0
- data/example/parse.rb +32 -0
- data/example/scanner.rb +14 -0
- data/lib/machines/ip_addr.rl +60 -0
- data/lib/machines/ruby_actions.rl +49 -0
- data/lib/machines/sip_uri.rl +52 -0
- data/lib/machines/uri.rl +92 -0
- data/lib/uri_scanner.rb +24 -0
- data/lib/uri_scanner/ip_address.rb +532 -0
- data/lib/uri_scanner/ip_address.rl +27 -0
- data/lib/uri_scanner/uri_parser.rb +10539 -0
- data/lib/uri_scanner/uri_parser.rl +44 -0
- data/lib/uri_scanner/uri_scanner.rb +1007 -0
- data/lib/uri_scanner/uri_scanner.rl +45 -0
- data/lib/uri_scanner/version.rb +3 -0
- data/spec/ip_addr_spec.rb +64 -0
- data/spec/scanner_spec.rb +40 -0
- data/spec/spec_helper.rb +96 -0
- data/spec/uri_scanner_spec.rb +43 -0
- data/spec/uri_spec.rb +185 -0
- data/spec/url.txt +156 -0
- data/uri_scanner.gemspec +23 -0
- metadata +106 -0
data/spec/url.txt
ADDED
@@ -0,0 +1,156 @@
+Uniform Resource Locators (URL) Tim Berners-Lee
+draft-ietf-uri-url-03.{ps,txt} URI working Group
+Expires 21 September 1994 21 March 1994
+
+
+Uniform Resource Locators (URL)
+
+A Syntax for the Expression of
+Access Information of Objects on the Network
+
+
+ABOUT THIS DOCUMENT
+
+This document specifies a Uniform Resource Locator (URL), the
+syntax and semantics of formalized information for location and
+access of resources on the Internet.
+
+This document was written by the URI working group of the Internet
+Engineering Task Force. Comments may be addressed to the editor,
+Tim Berners-Lee <timbl@info.cern.ch>, or to the URI-WG
+<uri@bunyip.com>. Discussions of the group are archived at
+
+<http://www.acl.lanl.gov/URI/archive/uri-archive.index.html>
+
+This document is bound by the Requirements Specification in
+preparation.
+
+The work is derived from concepts introduced by the World-Wide Web
+global information initiative, whose use of such objects dates
+from 1990 and is described in "Universal Resource identifeirs for
+the World-Wide Web", RFCXXX .
+
+This document is available in hypertext form, with links to
+background information, as:
+
+<http://info.cern.ch/hypertext/WWW/Addressing/URL/Overview.html>
+
+.
+
+Example
+
+Yes, Jim, I found it under <ftp://info.cern.ch/pub/www/doc> but
+you can probably pick it up from <ftp://ds.internic.net/rfc>.
+
+
+
+REFERENCES
+
+Alberti, R., et.al. (1991)
+"Notes on the Internet Gopher Protocol"
+University of Minnesota, December 1991,
+<ftp://boombox.micro.umn.edu/pub/gopher/gopher_protocol> . See also
+<gopher://gopher.micro.umn.edu/00/InformationAbout Gopher/About Gopher>
+
+Berners-Lee, T ., (1991)
+"Hypertext Transfer Protocol (HTTP)" , CERN,
+December 1991, as updated from time to time,
+<ftp://info.cern.ch/pub/www/doc/http-spec.txt
+>
+
+Crocker "Standard for ARPA Internet Text Messages" .
+David H. Crocker, RFC822,
+
+Davis, F, et al., (1990)
+"WAIS Interface Protocol: Prototype
+Functional Specification", Thinking Machines
+Corporation, April 23, 1990
+"ftp://quake.think.com/pub/ais/doc/protspec.txt"
+
+International Standards Organization, (1991)
+Information and Documentation - Search and
+Retrieve Application Protocol Specification
+for open Systems Interconnection, ISO-10163
+
+Horton (1987) M. Horton, R. Adams, "Standard for
+interchange of USENET messages", Internet RFC
+1036 , 12/01/1987.
+
+Huitema, C., (1991) "Naming: strategies and techniques",
+Computer Networks and ISDN Systems 23 (1991)
+107-110.
+
+
+
+Berners-Lee 18
+
+RFC XXXX Uniform Resource Locators (URL) March 21 1994
+
+Kahle, Brewster, (1991)
+"Document Identifiers, or International
+Standard Book Numbers for the Electronic
+Age",
+<ftp://quake.think.com/pub/wais/doc/doc-ids.txt>
+
+Kantor, B., and Lapsley, P., (1986)
+"A proposed standard for the stream-based
+transmission of news" , Internet RFC-977,
+February 1986.
+<ftp://ds.internic.net/rfc/rfc977.txt>
+
+Kunze, 1994 J. Kunze, Requirements for URLs, to be
+published.
+
+Lynch, C., Coallition for Networked Information: (1991)
+"Workshop on ID and Reference Structures for
+Networked Information", November 1991. See
+<wais://quake.think.com/wais-discussion-archives?lynch>
+
+Mockapetris, P., (1987)
+"Domain names + concepts and facilities",
+RFC-1034, USC-ISI, November 1987,
+<ftp://ds.internic.net/rfc/rfc1034.txt>
+
+Neuman, B. Clifford, (1992)
+"Prospero: A Tool for Organizing Internet
+Resources", Electronic Networking: Research,
+Applications and Policy, Vol 1 No 2, Meckler
+Westport CT USA. See also
+<ftp://prospero.isi.edu/pub/prospero/oir.ps>
+
+Postel, J. and Reynolds, J. (1985)
+"File Transfer Protocol (FTP)", Internet
+RFC-959, October 1985.
+<ftp://ds.internic.net/rfc/rfc959.txt>
+
+Sollins 1994 K. Sollins and L. Masinter, Requiremnets for
+URNs, to be published.
+
+Yeong, W., (1991a) "Towards Networked Information Retrieval",
+Technical report 91-06-25-01, June 1991,
+Performance Systems International, Inc.
+<ftp://uu.psi.com/wp/nir.txt>
+
+Yeong, W., (1991b), "Representing Public Archives in the
+Directory", Internet Draft, November 1991,
+now expired.
+
+.
+
+
+Berners-Lee 19
+
+RFC XXXX Uniform Resource Locators (URL) March 21 1994
+
+EDITOR'S ADDRESS
+
+Tim Berners-Lee
+Address: World-Wide Web project
+CERN,
+1211 Geneva 23,
+Switzerland
+
+Telephone: +41 (22)767 3755
+Fax: +41 (22)767 7155
+Email: timbl@info.cern.ch
+
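The fixture above writes most of its URLs between angle brackets, which makes it a convenient input for a URI scanner. As a rough illustration of what "scan input text and extract URIs to array" means, here is a minimal sketch that uses a plain regular expression. It is not the gem's Ragel-generated scanner and does not use the gem's API; `URI_IN_BRACKETS` and `extract_bracketed_uris` are hypothetical names introduced only for this example.

```ruby
# Hypothetical helper, NOT part of uri_scanner: collects URIs written as
# <scheme://...> on a single line, with no whitespace inside the brackets.
URI_IN_BRACKETS = %r{<([a-z][a-z0-9+.\-]*://[^<>\s]+)>}i

def extract_bracketed_uris(text)
  # String#scan with one capture group yields one-element arrays; flatten them.
  text.scan(URI_IN_BRACKETS).flatten
end

# Sample lines taken from the fixture text.
sample = <<~TEXT
  Yes, Jim, I found it under <ftp://info.cern.ch/pub/www/doc> but
  you can probably pick it up from <ftp://ds.internic.net/rfc>.
  Comments may be addressed to the URI-WG <uri@bunyip.com>.
TEXT

uris = extract_bracketed_uris(sample)
puts uris
```

The bracketed e-mail address is skipped because it has no `://` after the leading name. This sketch also skips the fixture entries where the closing `>` falls on the next line or the URL contains spaces; a real scanner has to decide how to treat those cases.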
data/uri_scanner.gemspec
ADDED
@@ -0,0 +1,23 @@
+# coding: utf-8
+lib = File.expand_path('../lib', __FILE__)
+$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+require 'uri_scanner/version'
+
+Gem::Specification.new do |spec|
+  spec.name = "uri_scanner"
+  spec.version = URIScanner::VERSION
+  spec.authors = ["Stas Kobzar"]
+  spec.email = ["stas@modulis.ca"]
+  spec.summary = %q{URI Parser with Ragel}
+  spec.description = %q{Parsing URI and tokenize URI segments. Scan input text and extract URIs to array. Based on Ragel FSM compiler.}
+  spec.homepage = "https://github.com/staskobzar/uri_scanner"
+  spec.license = "MIT"
+
+  spec.files = `git ls-files -z`.split("\x0")
+  spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
+  spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
+  spec.require_paths = ["lib"]
+
+  spec.add_development_dependency "bundler", "~> 1.7"
+  spec.add_development_dependency "rake", "~> 10.5"
+end
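The two development dependencies in the gemspec use the pessimistic operator `~>`. As a small sketch of what those constraints accept, the snippet below checks a few version strings with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The `CONSTRAINTS` table and the `allowed?` helper are illustrative names, not part of the gem, and the version strings tested are arbitrary examples.

```ruby
require "rubygems"

# Constraints copied from the gemspec's development dependencies.
CONSTRAINTS = {
  "bundler" => Gem::Requirement.new("~> 1.7"),
  "rake"    => Gem::Requirement.new("~> 10.5")
}.freeze

# Illustrative helper (not part of the gem): does `version` satisfy the
# constraint recorded for `gem_name`?
def allowed?(gem_name, version)
  CONSTRAINTS.fetch(gem_name).satisfied_by?(Gem::Version.new(version))
end

checks = [
  ["bundler", "1.7.0"],
  ["bundler", "1.17.3"],
  ["bundler", "2.0.0"],
  ["rake", "10.5.0"],
  ["rake", "11.0.0"]
]

checks.each do |name, version|
  puts "#{name} #{version}: #{allowed?(name, version)}"
end
```

With a two-segment version such as `~> 1.7`, the last segment may grow but the major version may not, so any 1.x release from 1.7 onward is accepted and 2.0 is rejected. The same rule keeps `rake` within the 10.x series starting at 10.5.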
metadata
ADDED
@@ -0,0 +1,106 @@
+--- !ruby/object:Gem::Specification
+name: uri_scanner
+version: !ruby/object:Gem::Version
+  version: 0.0.1
+platform: ruby
+authors:
+- Stas Kobzar
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2016-02-06 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.7'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.7'
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '10.5'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '10.5'
+description: Parsing URI and tokenize URI segments. Scan input text and extract URIs
+  to array. Based on Ragel FSM compiler.
+email:
+- stas@modulis.ca
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- ".gitignore"
+- ".rspec"
+- Gemfile
+- LICENSE.txt
+- README.md
+- Rakefile
+- example/parse.rb
+- example/scanner.rb
+- lib/machines/ip_addr.rl
+- lib/machines/ruby_actions.rl
+- lib/machines/sip_uri.rl
+- lib/machines/uri.rl
+- lib/uri_scanner.rb
+- lib/uri_scanner/ip_address.rb
+- lib/uri_scanner/ip_address.rl
+- lib/uri_scanner/uri_parser.rb
+- lib/uri_scanner/uri_parser.rl
+- lib/uri_scanner/uri_scanner.rb
+- lib/uri_scanner/uri_scanner.rl
+- lib/uri_scanner/version.rb
+- spec/ip_addr_spec.rb
+- spec/scanner_spec.rb
+- spec/spec_helper.rb
+- spec/uri_scanner_spec.rb
+- spec/uri_spec.rb
+- spec/url.txt
+- uri_scanner.gemspec
+homepage: https://github.com/staskobzar/uri_scanner
+licenses:
+- MIT
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubyforge_project:
+rubygems_version: 2.4.1
+signing_key:
+specification_version: 4
+summary: URI Parser with Ragel
+test_files:
+- spec/ip_addr_spec.rb
+- spec/scanner_spec.rb
+- spec/spec_helper.rb
+- spec/uri_scanner_spec.rb
+- spec/uri_spec.rb
+- spec/url.txt