h1p 0.2 → 0.5
- checksums.yaml +4 -4
- data/CHANGELOG.md +12 -1
- data/Gemfile.lock +1 -1
- data/README.md +61 -15
- data/Rakefile +1 -1
- data/benchmarks/bm_http1_parser.rb +1 -1
- data/benchmarks/pipelined.rb +101 -0
- data/examples/callable.rb +1 -1
- data/examples/http_server.rb +2 -2
- data/ext/h1p/h1p.c +525 -235
- data/ext/h1p/h1p.h +5 -0
- data/ext/h1p/limits.rb +7 -6
- data/lib/h1p/version.rb +1 -1
- data/lib/h1p.rb +16 -10
- data/test/run.rb +5 -0
- data/test/test_h1p_client.rb +532 -0
- data/test/{test_h1p.rb → test_h1p_server.rb} +91 -36
- metadata +7 -4
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 53aadd6f6c4ae112ff844b68e8325fe334d51dd0a09bb11cc54d017190e97677
+  data.tar.gz: e9f6d504813d74c050a46b4e973e9fbc227dd790c02395a0073b62f43a7393da
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3ac98c2d7e702f8cf5f9052e78e9abe486ee3ba0814c4c16fe222acdccfb5338ac9d5c9b4a5d1da2178fca65472693c94a4997cc86de05db70f13c90f1bc767e
+  data.tar.gz: 73ce78e3435eb40f3e6d46ee8eef803284e8da99f32d610dbd723c3661548a74ed454ae70e1ec02dcfbdb70aa346059fb7713eab19de3e6e6e65a4ca859926c7
data/CHANGELOG.md
CHANGED
@@ -1,4 +1,15 @@
-## 0.
+## 0.5 2022-03-19
+
+- Implement `Parser#splice_body_to` (#3)
+
+## 0.4 2022-02-28
+
+- Rename `__parser_read_method__` to `__read_method__`
+
+## 0.3 2022-02-03
+
+- Add support for parsing HTTP responses (#1)
+- Put state directly in parser struct
 
 ## 0.2 2021-08-20
 
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
@@ -1,7 +1,7 @@
 # H1P - a blocking HTTP/1 parser for Ruby
 
 [](http://rubygems.org/gems/h1p)
-[](https://github.com/digital-fabric/h1p/actions?query=workflow%3ATests)
 [](https://github.com/digital-fabric/h1p/blob/master/LICENSE)
 
 H1P is a blocking/synchronous HTTP/1 parser for Ruby with a simple and intuitive
@@ -23,8 +23,11 @@ The H1P was originally written as part of
 - Simple, blocking/synchronous API
 - Zero dependencies
 - Transport-agnostic
+- Parses both HTTP request and HTTP response
 - Support for chunked encoding
 - Support for both `LF` and `CRLF` line breaks
+- Support for **splicing** request/response bodies (when used with
+[Polyphony](https://github.com/digital-fabric/polyphony))
 - Track total incoming traffic
 
 ## Installing
@@ -41,15 +44,21 @@ You can then run `bundle install` to install it. Otherwise, just run `gem instal
 
 ## Usage
 
-Start by creating an instance of H1P::Parser
+Start by creating an instance of `H1P::Parser`, passing a connection instance and the parsing mode:
 
 ```ruby
 require 'h1p'
 
-parser = H1P::Parser.new(conn)
+parser = H1P::Parser.new(conn, :server)
 ```
 
-
+In order to parse HTTP responses, change the mode to `:client`:
+
+```ruby
+parser = H1P::Parser.new(conn, :client)
+```
+
+To read the next message from the connection, call `#parse_headers`:
 
 ```ruby
 loop do
@@ -65,13 +74,21 @@ headers. In case the client has closed the connection, `#parse_headers` will
 return `nil` (see the guard clause above).
 
 In addition to the header keys and values, the resulting hash also contains the
-following "pseudo-headers":
+following "pseudo-headers" (in server mode):
 
 - `:method`: the HTTP method (in upper case)
 - `:path`: the request target
 - `:protocol`: the protocol used (either `'http/1.0'` or `'http/1.1'`)
 - `:rx`: the total bytes read by the parser
 
+In client mode, the following pseudo-headers will be present:
+
+- `:protocol`: the protocol used (either `'http/1.0'` or `'http/1.1'`)
+- `:status': the HTTP status as an integer
+- `:status_message`: the HTTP status message
+- `:rx`: the total bytes read by the parser
+
+
 The header keys are always lower-cased. Consider the following HTTP request:
 
 ```
@@ -101,24 +118,24 @@ where the value is an array containing the corresponding values. For example,
 multiple `Cookie` headers will appear in the hash as a single `"cookie"` entry,
 e.g. `{ "cookie" => ['a=1', 'b=2'] }`
 
-### Handling of invalid
+### Handling of invalid message
 
-When an invalid
-exception. An incoming
-has been encountered at any point in parsing the
+When an invalid message is encountered, the parser will raise a `H1P::Error`
+exception. An incoming message may be considered invalid if an invalid character
+has been encountered at any point in parsing the message, or if any of the
 tokens have an invalid length. You can consult the limits used by the parser
 [here](https://github.com/digital-fabric/h1p/blob/main/ext/h1p/limits.rb).
 
-### Reading the
+### Reading the message body
 
-To read the
+To read the message body use `#read_body`:
 
 ```ruby
 # read entire body
 body = parser.read_body
 ```
 
-The H1P parser knows how to read both
+The H1P parser knows how to read both message bodies with a specified
 `Content-Length` and request bodies in chunked encoding. The method call will
 return when the entire body has been read. If the body is incomplete or has
 invalid formatting, the parser will raise a `H1P::Error` exception.
@@ -146,12 +163,42 @@ end
 The `#read_body` and `#read_body_chunk` methods will return `nil` if no body is
 expected (based on the received headers).
 
+## Splicing request/response bodies
+
+> Splicing of request/response bodies is available only on Linux, and works only
+> with [Polyphony](https://github.com/digital-fabric/polyphony).
+
+H1P also lets you [splice](https://man7.org/linux/man-pages/man2/splice.2.html)
+request or response bodies directly to a pipe. This is particularly useful for
+uploading or downloading large files, as the data does not need to be loaded
+into Ruby strings. In fact, the data will stay almost entirely in kernel
+buffers, which means any data copying is reduced to the absolute minimum.
+
+The following example sends a request, then splices the response body to a file:
+
+```ruby
+require 'polyphony'
+require 'h1p'
+
+socket = TCPSocket.new('example.com', 80)
+socket << "GET /bigfile HTTP/1.1\r\nHost: example.com\r\n\r\n"
+
+parser = H1P::Parser.new(socket, :client)
+headers = parser.parse_headers
+
+pipe = Polyphony.pipe
+File.open('bigfile', 'w+') do |f|
+  spin { parser.splice_body_to(pipe) }
+  f.splice_from(pipe)
+end
+```
+
 ## Parsing from arbitrary transports
 
 The H1P parser was built to read from any arbitrary transport or source, as long
 as they conform to one of two alternative interfaces:
 
-- An object implementing a `
+- An object implementing a `__read_method__` method, which returns any of
 the following values:
 
 - `:stock_readpartial` - to be used for instances of `IO`, `Socket`,
@@ -211,8 +258,7 @@ performance.
 Here are some of the features and enhancements planned for H1P:
 
 - Add conformance and security tests
-- Add ability to
-- Add ability to splice the request body into an arbitrary fd
+- Add ability to splice the message body into an arbitrary fd
 (Polyphony-specific)
 - Improve performance
 
data/Rakefile
CHANGED
data/benchmarks/pipelined.rb
ADDED
@@ -0,0 +1,101 @@
+# frozen_string_literal: true
+
+HTTP_REQUEST = "GET /foo HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\nUser-Agent: foobar\r\n\r\n" +
+  "GET /bar HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\nUser-Agent: foobar\r\n\r\n"
+
+def measure_time_and_allocs
+  4.times { GC.start }
+  GC.disable
+
+  t0 = Time.now
+  a0 = object_count
+  yield
+  t1 = Time.now
+  a1 = object_count
+  [t1 - t0, a1 - a0]
+ensure
+  GC.enable
+end
+
+def object_count
+  count = ObjectSpace.count_objects
+  count[:TOTAL] - count[:FREE]
+end
+
+def benchmark_other_http1_parser(iterations)
+  STDOUT << "http_parser.rb: "
+  require 'http_parser.rb'
+
+  i, o = IO.pipe
+  parser = Http::Parser.new
+  done = false
+  queue = nil
+  rx = 0
+  req_count = 0
+  parser.on_headers_complete = proc do |h|
+    h[':method'] = parser.http_method
+    h[':path'] = parser.request_url
+    h[':rx'] = rx
+    queue << h
+  end
+  parser.on_message_complete = proc { done = true }
+
+  writer = Thread.new do
+    iterations.times { o << HTTP_REQUEST }
+    o.close
+  end
+
+  elapsed, allocated = measure_time_and_allocs do
+    queue = []
+    done = false
+    rx = 0
+    loop do
+      data = i.readpartial(4096) rescue nil
+      break unless data
+
+      rx += data.bytesize
+      parser << data
+      while (req = queue.shift)
+        req_count += 1
+      end
+    end
+  end
+  puts(format('count: %d, elapsed: %f, allocated: %d (%f/req), rate: %f ips', req_count, elapsed, allocated, allocated.to_f / iterations, iterations / elapsed))
+end
+
+def benchmark_h1p_parser(iterations)
+  STDOUT << "H1P parser: "
+  require_relative '../lib/h1p'
+  i, o = IO.pipe
+  parser = H1P::Parser.new(i)
+  req_count = 0
+
+  writer = Thread.new do
+    iterations.times { o << HTTP_REQUEST }
+    o.close
+  end
+
+  elapsed, allocated = measure_time_and_allocs do
+    while (headers = parser.parse_headers)
+      req_count += 1
+    end
+  end
+  puts(format('count: %d, elapsed: %f, allocated: %d (%f/req), rate: %f ips', req_count, elapsed, allocated, allocated.to_f / iterations, iterations / elapsed))
+end
+
+def fork_benchmark(method, iterations)
+  pid = fork do
+    send(method, iterations)
+  rescue Exception => e
+    p e
+    p e.backtrace
+    exit!
+  end
+  Process.wait(pid)
+end
+
+x = 100000
+fork_benchmark(:benchmark_other_http1_parser, x)
+fork_benchmark(:benchmark_h1p_parser, x)
+
+# benchmark_h1p_parser(x)
data/examples/callable.rb
CHANGED
data/examples/http_server.rb
CHANGED
@@ -10,13 +10,13 @@ trap('SIGINT') { exit! }
 
 def handle_client(conn)
   Thread.new do
-    parser = H1P::Parser.new(conn)
+    parser = H1P::Parser.new(conn, :server)
     loop do
       headers = parser.parse_headers
       break unless headers
 
       req_body = parser.read_body
-
+
       p headers: headers
       p body: req_body
 