pcap_tools 0.0.5 → 0.0.6

data/.gitignore ADDED
@@ -0,0 +1,3 @@
1
+ *.gem
2
+ *.pcap
3
+ *.xml
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org/'
2
+
3
+ gemspec
4
+
data/Gemfile.lock ADDED
@@ -0,0 +1,24 @@
1
+ PATH
2
+ remote: .
3
+ specs:
4
+ pcap_tools (0.0.6)
5
+ nokogiri (= 1.6.0)
6
+ popen4 (= 0.1.2)
7
+
8
+ GEM
9
+ remote: https://rubygems.org/
10
+ specs:
11
+ Platform (0.4.0)
12
+ mini_portile (0.5.2)
13
+ nokogiri (1.6.0)
14
+ mini_portile (~> 0.5.0)
15
+ open4 (1.3.0)
16
+ popen4 (0.1.2)
17
+ Platform (>= 0.4.0)
18
+ open4 (>= 0.4.0)
19
+
20
+ PLATFORMS
21
+ ruby
22
+
23
+ DEPENDENCIES
24
+ pcap_tools!
data/README.markdown CHANGED
@@ -1,56 +1,107 @@
1
1
  # What is it ?
2
2
 
3
- It's a ruby library to help tcpdump file processing : do some offline analysis on tcpdump files.
3
+ PCapTools is a ruby library to process pcap file from [wireshark](http://www.wireshark.org/) or [tcpdump](http://www.tcpdump.org/).
4
+
5
+ PCapTools uses [tshark](http://www.wireshark.org/docs/man-pages/tshark.html) to process the pcap file, and run some analysis on it. Tshark is bundled with [wireshark](http://www.wireshark.org/download.html).
6
+
7
+ There are two ways to use PCapTools
8
+
9
+ * as a command line tools
10
+ * as a ruby library
11
+
12
+ ## As a command line tools
4
13
 
5
14
  Main functionnalities :
6
15
 
7
16
  * Rebuild tcp streams
8
- * Extract and parse http request
17
+ * Extract and parse http requests
18
+
19
+ ### Install
20
+
21
+ Ensure `tshark` is installed. If not, check your wireshark install.
22
+
23
+ tshark -v
24
+
25
+ Install pcap_tools
26
+
27
+ gem install pcap_tools
28
+
29
+ ### Command line options
30
+
31
+ pcap_tools --help
32
+
33
+ Usage: pcap_tools_http [options] pcap_files
34
+ --no-body Do not display body
35
+ --tshark_path Path to tshark executable
36
+ --one-tcp-stream [index] Display only one tcp stream
37
+ --mode [MODE] Parsing mode : http, tcp, frame, tcp_count. Default http
38
+
39
+ ### Typical use
40
+
41
+ tcpdump -w out.pcap -s0 port 80
42
+ pcap_tools out.pcap
43
+
44
+
45
+ ## As a ruby library
9
46
 
10
- # How use it
47
+ ### Install
11
48
 
12
- ## Make a tcpdump
49
+ Ensure `tshark` is installed. If not, check your wireshark install.
13
50
 
14
- * `tcpdump -w out.pcap -s 4096 <filter>`
15
- * Get the output file out.pcap
51
+ tshark -v
16
52
 
17
- Please adjust the 4096 value, to the max packet size to capture.
53
+ Declare dependency to `pcap_tools` in your Gemfile.
18
54
 
19
- ## Write a ruby script
55
+ gem 'pcap_tools'
20
56
 
21
- require 'pcap_tools'
57
+ ### How to use it
22
58
 
23
- # Load tcpdump file
24
- capture = PCAPRUB::Pcap.open_offline('out.pcap')
59
+ The best example is the [pcap_tools command line script](https://github.com/bpaquet/pcap_tools/blob/master/bin/pcap_tools).
25
60
 
26
- ## Available functions
61
+ Pcap_tools is an event processor : `tshark` returns an XML Flow, which is parsed with a SAX processor. Each packet is processed by a TCP processor, which build streams. Each streams is processed by an HTTP processor, which rebuild HTTP request / response.
62
+
63
+ #### Loading files
64
+
65
+ PcapTools::Loader::load_file(f, {:tshark => OPTIONS[:tshark_path]}) do |index, packet|
66
+ end
67
+
68
+ Each packet is a ruby object containing the main attributes found in the packet, extracted with tshark.
69
+
70
+ You have to use a packet processor to process this packet. The main one is `PcapTools::TcpProcessor`.
27
71
 
28
72
  ### Extract tcp streams
29
73
 
30
- This function rebuild tcp streams from an array of pcap capture object.
74
+ processor = PcapTools::TcpProcessor.new
75
+ PcapTools::Loader::load_file(f, {:tshark => OPTIONS[:tshark_path]}) do |index, packet|
76
+ processor.inject index, packet
77
+ end
31
78
 
32
- tcp_streams = PcapTools::extract_tcp_streams(captures)
79
+ The [TCPProcessor](https://github.com/bpaquet/pcap_tools/blob/master/lib/pcap_tools/packet_processors/tcp.rb) rebuild streams from IP raw packets. To use the streams, you have to add some streams processors.
33
80
 
34
- `tcp_streams` is an array of hash, each hash has tree keys :
81
+ processor.add_stream_processor PcapTools::TcpStreamRebuilder.new
35
82
 
36
- * `:type` : `:in` or `:out`, if the packet was sent or received
37
- * `:time` : timestamp of packet
38
- * `:data` : payload of packet
83
+ The [TcpStreamRebuilder](https://github.com/bpaquet/pcap_tools/blob/master/lib/pcap_tools/stream_processors/http.rb) reassembles contiguous packet, for example the packets containing a big HTTP Response.
84
+
85
+ processor.add_stream_processor PcapTools::HttpExtractor.new
39
86
 
40
- Remarks :
87
+ The [HttpExtractor](https://github.com/bpaquet/pcap_tools/blob/master/lib/pcap_tools/stream_processors/rebuilder.rb) build HTTP request and response from streams.
41
88
 
42
- * Packets are in the rigth ordere
43
- * Packets are not merged (eg an http response can be splitted on serval consecutive packets,
44
- with the same type `:in` or `:out`).
45
- To reassemble packet of the same type, please use `stream.rebuild_packets`
89
+ Please read the code to build your own stream processor. Do not be afraid, it's easy :)
46
90
 
47
- ### Extract http calls
91
+ Note : the TCPProcessor is not able to rebuild tcp stream which do not start in the pcap file. For example, if you launch tcpdump after a long running Oracle DB connection, TCPProcessor will not show the Oracle DB connection.
48
92
 
49
- This function extract http calls from a tcp stream, returned from the `extract_tcp_streams` function.
93
+ ### Data format
50
94
 
51
- http_calls = PcapTools::extract_http_calls(stream)
95
+ Streams objects
52
96
 
53
- `http_calls` is an array of `http_call`.
97
+ A `tcp_streams` is an array of hash, each hash has some keys :
98
+
99
+ * `:type` : `:in` or `:out`, if the packet was sent or received
100
+ * `:time` : timestamp of the packet.
101
+ * `:data` : payload of packet.
102
+ * `:size` : payload size.
103
+
104
+ HTTP Objects
54
105
 
55
106
  A `http_call` is an array of two objects :
56
107
 
@@ -76,15 +127,3 @@ The request and response object have some new attributes
76
127
  For the response object body, the following "Content-Encoding" type are honored :
77
128
 
78
129
  * gzip
79
-
80
- ### Extract http calls from captures
81
-
82
- The two in one : extract http calls from an array of captures objects
83
-
84
- http_calls = PcapTools::extract_http_calls_from_captures(captures)
85
-
86
- ### Load multiple files
87
-
88
- Load multiple pcap files, in time order. Useful when you use `tcpdump -C 5 -W 100000`, to split captured data into pieces of 5M
89
-
90
- captures = PcapTools::load_mutliple_files '*pcap*'
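Review note on the README's "build your own stream processor" advice: judging from the `TcpProcessor#inject` and `finalize` code in this diff, a stream processor needs only two methods. `process_stream` receives a hash of the form `{:index => ..., :data => [...]}` and returns a stream to continue the chain (or `nil` to stop it), and `finalize` is called once at the end. The sketch below is a minimal illustration under that reading. `ByteCounter`, `run_chain` and the sample stream are hypothetical names invented here, and `run_chain` only imitates the chain loop so the example runs without the gem or tshark.

```ruby
# Hypothetical stream processor: sums payload bytes per direction.
class ByteCounter
  attr_reader :totals

  def initialize
    @totals = { :in => 0, :out => 0 }
  end

  # Receives {:index => Integer, :data => [packet hashes]}.
  # Returns the stream so the next processor in the chain can use it.
  def process_stream(stream)
    stream[:data].each do |packet|
      @totals[packet[:type]] += packet[:size]
    end
    stream
  end

  # Called once after every capture file has been processed.
  def finalize
    puts "Bytes out : #{@totals[:out]}, bytes in : #{@totals[:in]}"
  end
end

# Stand-in for the chain loop in TcpProcessor#inject (assumption: it mirrors
# only the "pass the result along, stop on nil" behaviour).
def run_chain(processors, stream)
  current = stream
  processors.each do |p|
    current = p.process_stream(current)
    break unless current
  end
  current
end

# Hand-written sample stream; real streams are built by TcpProcessor.
sample = {
  :index => 0,
  :data => [
    { :type => :out, :size => 5, :data => "GET /" },
    { :type => :in,  :size => 3, :data => "200" },
  ],
}

counter = ByteCounter.new
result = run_chain([counter], sample)
counter.finalize
```

With the real library, such an object would presumably be registered through `processor.add_stream_processor ByteCounter.new`, as the command line script does for its own printers.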
data/bin/pcap_tools CHANGED
@@ -3,7 +3,7 @@
3
3
  require 'pcap_tools'
4
4
  require 'optparse'
5
5
 
6
- options = {
6
+ OPTIONS = {
7
7
  :mode => :http,
8
8
  }
9
9
 
@@ -11,60 +11,133 @@ OptionParser.new do |opts|
11
11
  opts.banner = "Usage: pcap_tools_http [options] pcap_files"
12
12
 
13
13
  opts.on("--no-body", "Do not display body") do
14
- options[:no_body] = true
14
+ OPTIONS[:no_body] = true
15
15
  end
16
16
 
17
- opts.on("--mode [MODE]", [:http, :tcp], "parsing mode") do |m|
18
- options[:mode] = m
17
+ opts.on("--tshark_path", "Path to tshark executable") do |x|
18
+ OPTIONS[:tshark_path] = x
19
19
  end
20
20
 
21
- end.parse!
22
-
23
- data = ARGV.map{|f| puts "Loading #{f}"; PcapTools::Parser::load_file(f)}
21
+ opts.on("--one-tcp-stream [index]", Integer, "Display only one tcp stream") do |x|
22
+ OPTIONS[:one_tcp_stream] = x
23
+ end
24
24
 
25
- tcps = PcapTools::extract_tcp_streams(data)
25
+ opts.on("--mode [MODE]", [:http, :tcp, :frame, :tcp_count], "Parsing mode : http, tcp, frame, tcp_count. Default http") do |m|
26
+ OPTIONS[:mode] = m
27
+ end
26
28
 
27
- puts "Tcp streams extracted : #{tcps.size}"
28
- puts "Parsing mode : #{options[:mode]}"
29
- puts
29
+ end.parse!
30
30
 
31
31
  def format_time t
32
32
  "#{t} #{t.nsec / 1000}"
33
33
  end
34
34
 
35
- if options[:mode] == :http
36
- tcps.each do |tcp|
37
- PcapTools::extract_http_calls(tcp).each do |req, resp|
38
- puts ">>>> #{req["pcap-src"]}:#{req["pcap-src-port"]} > #{req["pcap-dst"]}:#{req["pcap-dst-port"]} #{format_time req.time}"
39
- puts "#{req.method} #{req.path}"
40
- req.each_capitalized_name.reject{|x| x =~ /^Pcap/ }.each do |x|
41
- puts "#{x}: #{req[x]}"
35
+ puts "Mode : #{OPTIONS[:mode]}"
36
+
37
+ processor = nil
38
+
39
+ if OPTIONS[:mode] == :frame
40
+
41
+ processor = PcapTools::FrameProcessor.new
42
+
43
+ else
44
+
45
+ class TcpCounter
46
+
47
+ def initialize
48
+ @counter = 0
49
+ end
50
+
51
+ def process_stream stream
52
+ @counter += 1
53
+ stream
54
+ end
55
+
56
+ def finalize
57
+ puts "Number of TCP Streams : #{@counter}"
58
+ end
59
+
60
+ end
61
+
62
+
63
+ processor = PcapTools::TcpProcessor.new
64
+ processor.add_stream_processor TcpCounter.new
65
+ processor.add_stream_processor PcapTools::TcpOneStreamFilter.new OPTIONS[:one_tcp_stream]
66
+
67
+ if OPTIONS[:mode] == :tcp
68
+
69
+ class TcpPrinter
70
+
71
+ def process_stream stream
72
+ puts "<<<< new connection >>>> [ Wirershark stream index #{stream[:index]} ]"
73
+ stream[:data].each do |packet|
74
+ type = packet[:type] == :out ? ">>>>" : "<<<<<"
75
+ puts "#{type} #{packet[:from]}:#{packet[:from_port]} > #{packet[:to]}:#{packet[:to_port]}, size #{packet[:data].size} #{format_time packet[:time]}"
76
+ puts packet[:data] unless OPTIONS[:no_body]
77
+ puts
78
+ end
79
+ end
80
+
81
+ def finalize
82
+ end
83
+
84
+ end
85
+
86
+ processor.add_stream_processor TcpPrinter.new
87
+
88
+ end
89
+
90
+ if OPTIONS[:mode] == :http
91
+
92
+ class HttpPrinter
93
+
94
+ def initialize
95
+ @counter = 0
42
96
  end
43
- puts
44
- puts req.body unless options[:no_body]
45
- puts "<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< #{format_time resp.time}"
46
- if resp
47
- puts "#{resp.code} #{resp.message}"
48
- resp.each_capitalized_name.reject{|x| x =~ /^Pcap/ }.each do |x|
49
- puts "#{x}: #{resp[x]}"
97
+
98
+ def process_stream stream
99
+ stream.each do |index, req, resp|
100
+ @counter += 1
101
+ puts ">>>> #{req["pcap-src"]}:#{req["pcap-src-port"]} > #{req["pcap-dst"]}:#{req["pcap-dst-port"]} #{format_time req.time} [ Wirershark stream index #{index} ]"
102
+ puts "#{req.method} #{req.path}"
103
+ req.each_capitalized_name.reject{|x| x =~ /^Pcap/ }.each do |x|
104
+ puts "#{x}: #{req[x]}"
105
+ end
106
+ puts
107
+ puts req.body unless OPTIONS[:no_body]
108
+ if resp
109
+ puts "<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< #{format_time resp.time}"
110
+ puts "#{resp.code} #{resp.message}"
111
+ resp.each_capitalized_name.reject{|x| x =~ /^Pcap/ }.each do |x|
112
+ puts "#{x}: #{resp[x]}"
113
+ end
114
+ puts
115
+ puts resp.body unless OPTIONS[:no_body]
116
+ else
117
+ puts "No response found"
118
+ end
119
+ puts
50
120
  end
51
- puts
52
- puts resp.body unless options[:no_body]
53
- else
54
- puts "No response in pcap file"
55
121
  end
56
- puts
122
+
123
+ def finalize
124
+ puts "Number of HTTP Request / response : #{@counter}"
125
+ end
126
+
57
127
  end
128
+
129
+ processor.add_stream_processor PcapTools::TcpStreamRebuilder.new
130
+ processor.add_stream_processor PcapTools::HttpExtractor.new
131
+ processor.add_stream_processor HttpPrinter.new
132
+
58
133
  end
134
+
59
135
  end
60
136
 
61
- if options[:mode] == :tcp
62
- tcps.each do |tcp|
63
- tcp.each do |packet|
64
- type = packet[:type] == :out ? ">>>>" : "<<<<<"
65
- puts "#{type} #{packet[:from]}:#{packet[:from_port]} > #{packet[:to]}:#{packet[:to_port]}, size #{packet[:data].size} #{format_time packet[:time]}"
66
- puts packet[:data]
67
- puts
68
- end
137
+ ARGV.each do |f|
138
+ PcapTools::Loader::load_file(f, {:tshark => OPTIONS[:tshark_path]}) do |index, packet|
139
+ processor.inject index, packet
69
140
  end
70
- end
141
+ end
142
+
143
+ processor.finalize
data/lib/pcap_tools.rb CHANGED
@@ -1,189 +1,14 @@
1
- require 'rubygems'
2
1
  require 'net/http'
2
+ require 'bindata'
3
3
  require 'zlib'
4
4
 
5
- require File.join(File.dirname(__FILE__), 'pcap_parser')
5
+ require_relative 'pcap_tools/loader'
6
6
 
7
- module Net
7
+ require_relative 'pcap_tools/patches/http.rb'
8
8
 
9
- class HTTPRequest
10
- attr_accessor :time
11
- end
9
+ require_relative 'pcap_tools/packet_processors/frame'
10
+ require_relative 'pcap_tools/packet_processors/tcp'
12
11
 
13
- class HTTPResponse
14
- attr_accessor :time
15
-
16
- def body= body
17
- @body = body
18
- @read = true
19
- end
20
-
21
- end
22
-
23
- end
24
-
25
- module PcapTools
26
-
27
- class TcpStream < Array
28
-
29
- def insert_tcp sym, packet
30
- data = packet.payload
31
- return if data.size == 0
32
- timestamp = Time.at(packet.parent.parent.parent.ts_sec, packet.parent.parent.parent.ts_usec)
33
- self << {:type => sym, :data => data, :from => packet.parent.src_addr, :to => packet.parent.dst_addr, :from_port => packet.src_port, :to_port => packet.dst_port, :time => timestamp}
34
- end
35
-
36
- def rebuild_packets
37
- out = TcpStream.new
38
- current = nil
39
- self.each do |packet|
40
- if current
41
- if packet[:type] == current[:type]
42
- current[:data] += packet[:data]
43
- else
44
- out << current
45
- current = packet.clone
46
- end
47
- else
48
- current = packet.clone
49
- end
50
- end
51
- out << current if current
52
- out
53
- end
54
-
55
- end
56
-
57
- def extract_http_calls_from_captures captures
58
- calls = []
59
- extract_tcp_streams(captures).each do |tcp|
60
- calls.concat(extract_http_calls(tcp))
61
- end
62
- calls
63
- end
64
-
65
- module_function :extract_http_calls_from_captures
66
-
67
- def extract_tcp_streams captures
68
- packets = []
69
- captures.each do |capture|
70
- capture.each do |packet|
71
- packets << packet
72
- end
73
- end
74
-
75
- streams = []
76
- packets.each_with_index do |packet, k|
77
- if packet.respond_to?(:type) && packet.type == "TCP" && packet.syn == 1 && packet.ack == 0
78
- kk = k
79
- tcp = TcpStream.new
80
- while kk < packets.size
81
- packet2 = packets[kk]
82
- if packet2.respond_to?(:type) && packet.type == "TCP"
83
- if packet.dst_port == packet2.dst_port && packet.src_port == packet2.src_port
84
- tcp.insert_tcp :out, packet2
85
- break if packet.fin == 1 || packet2.fin == 1
86
- end
87
- if packet.dst_port == packet2.src_port && packet.src_port == packet2.dst_port
88
- tcp.insert_tcp :in, packet2
89
- break if packet.fin == 1 || packet2.fin == 1
90
- end
91
- end
92
- kk += 1
93
- end
94
- streams << tcp
95
- end
96
- end
97
- streams
98
- end
99
-
100
- module_function :extract_tcp_streams
101
-
102
- def extract_http_calls stream
103
- rebuilded = stream.rebuild_packets
104
- calls = []
105
- data_out = ""
106
- data_in = nil
107
- k = 0
108
- while k < rebuilded.size
109
- begin
110
- req = HttpParser::parse_request(rebuilded[k])
111
- resp = k + 1 < rebuilded.size ? HttpParser::parse_response(rebuilded[k + 1]) : nil
112
- calls << [req, resp]
113
- rescue Exception => e
114
- warn "Unable to parse http call : #{e}"
115
- end
116
- k += 2
117
- end
118
- calls
119
- end
120
-
121
- module_function :extract_http_calls
122
-
123
- module HttpParser
124
-
125
- def parse_request stream
126
- headers, body = split_headers(stream[:data])
127
- line0 = headers.shift
128
- m = /(\S+)\s+(\S+)\s+(\S+)/.match(line0) or raise "Unable to parse first line of http request #{line0}"
129
- clazz = {'POST' => Net::HTTP::Post, 'HEAD' => Net::HTTP::Head, 'GET' => Net::HTTP::Get, 'PUT' => Net::HTTP::Put}[m[1]] or raise "Unknown http request type #{m[1]}"
130
- req = clazz.new m[2]
131
- req['Pcap-Src'] = stream[:from]
132
- req['Pcap-Src-Port'] = stream[:from_port]
133
- req['Pcap-Dst'] = stream[:to]
134
- req['Pcap-Dst-Port'] = stream[:to_port]
135
- req.time = stream[:time]
136
- req.body = body
137
- req['user-agent'] = nil
138
- req['accept'] = nil
139
- add_headers req, headers
140
- req.body.size == req['Content-Length'].to_i or raise "Wrong content-length for http request, header say #{req['Content-Length'].chomp}, found #{req.body.size}"
141
- req
142
- end
143
-
144
- module_function :parse_request
145
-
146
- def parse_response stream
147
- headers, body = split_headers(stream[:data])
148
- line0 = headers.shift
149
- m = /^(\S+)\s+(\S+)\s+(.*)$/.match(line0) or raise "Unable to parse first line of http response #{line0}"
150
- resp = Net::HTTPResponse.send(:response_class, m[2]).new(m[1], m[2], m[3])
151
- resp.time = stream[:time]
152
- add_headers resp, headers
153
- if resp.chunked?
154
- resp.body = read_chunked("\r\n" + body)
155
- else
156
- resp.body = body
157
- resp.body.size == resp['Content-Length'].to_i or raise "Wrong content-length for http response, header say #{resp['Content-Length'].chomp}, found #{resp.body.size}"
158
- end
159
- resp.body = Zlib::GzipReader.new(StringIO.new(resp.body)).read if resp['Content-Encoding'] == 'gzip'
160
- resp
161
- end
162
-
163
- module_function :parse_response
164
-
165
- private
166
-
167
- def self.add_headers o, headers
168
- headers.each do |line|
169
- m = /\A([^:]+):\s*/.match(line) or raise "Unable to parse line #{line}"
170
- o[m[1]] = m.post_match
171
- end
172
- end
173
-
174
- def self.split_headers str
175
- index = str.index("\r\n\r\n")
176
- return str[0 .. index].split("\r\n"), str[index + 4 .. -1]
177
- end
178
-
179
- def self.read_chunked str
180
- return "" if str == "\r\n"
181
- m = /\r\n([0-9a-fA-F]+)\r\n/.match(str) or raise "Unable to read chunked body in #{str.split("\r\n")[0]}"
182
- len = m[1].hex
183
- return "" if len == 0
184
- m.post_match[0..len - 1] + read_chunked(m.post_match[len .. -1])
185
- end
186
-
187
- end
188
-
189
- end
12
+ require_relative 'pcap_tools/stream_processors/one_stream_filter'
13
+ require_relative 'pcap_tools/stream_processors/rebuilder'
14
+ require_relative 'pcap_tools/stream_processors/http'
data/lib/pcap_tools/loader.rb ADDED
@@ -0,0 +1,118 @@
1
+ require 'popen4'
2
+ require 'ox'
3
+ require 'fileutils'
4
+
5
+ module PcapTools
6
+
7
+ module Loader
8
+
9
+ class MyParser < ::Ox::Sax
10
+
11
+ def initialize block
12
+ @current_packet_index = 0
13
+ @current_packet = nil
14
+ @current_processing = nil
15
+ @current_proto_name = nil
16
+ @current_field_name = nil
17
+ @block = block
18
+ end
19
+
20
+ def attr name, value
21
+ if @current_processing == :proto && name == :name
22
+ @current_proto_name = value
23
+ @current_packet[:protos] << value
24
+ elsif @current_processing == :field && name == :name
25
+ @current_field_name = value
26
+ # p @current_field_name
27
+ elsif name == :show
28
+ if @current_proto_name == "geninfo" && @current_field_name == "timestamp"
29
+ @current_packet[:time] = Time.parse value
30
+ elsif @current_proto_name == "ip" && @current_field_name == "ip.src"
31
+ @current_packet[:from] = value
32
+ elsif @current_proto_name == "ip" && @current_field_name == "ip.dst"
33
+ @current_packet[:to] = value
34
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.len"
35
+ @current_packet[:size] = value.to_i
36
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.stream"
37
+ @current_packet[:stream] = value.to_i
38
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.srcport"
39
+ @current_packet[:from_port] = value.to_i
40
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.dstport"
41
+ @current_packet[:to_port] = value.to_i
42
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.flags.fin"
43
+ @current_packet[:tcp_flags][:fin] = value == "1"
44
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.flags.reset"
45
+ @current_packet[:tcp_flags][:rst] = value == "1"
46
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.flags.ack"
47
+ @current_packet[:tcp_flags][:ack] = value == "1"
48
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.flags.syn"
49
+ @current_packet[:tcp_flags][:syn] = value == "1"
50
+ end
51
+ elsif name == :value
52
+ if @current_proto_name == "fake-field-wrapper" && @current_field_name == "data"
53
+ @current_packet[:data] = [value].pack("H*")
54
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.segment_data"
55
+ @current_packet[:data] = [value].pack("H*")
56
+ end
57
+ end
58
+ end
59
+
60
+ def start_element name, attrs = []
61
+ case name
62
+ when :packet
63
+ @current_packet = {
64
+ :tcp_flags => {},
65
+ :packet_index => @current_packet_index + 1,
66
+ :protos => [],
67
+ }
68
+ when :proto
69
+ @current_processing = :proto
70
+ when :field
71
+ @current_processing = :field
72
+ when :pdml
73
+ else
74
+ raise "Unknown element [#{name}]"
75
+ end
76
+ end
77
+
78
+
79
+ def end_element name
80
+ if name == :packet
81
+ # p @current_packet
82
+ if @current_packet[:protos].include? "malformed"
83
+ $stderr.puts "Malformed packet #{@current_packet_index}"
84
+ return
85
+ end
86
+ raise "No data found in packet #{@current_packet_index}, protocols found #{@current_packet[:protos]}" if @current_packet[:data].nil? && @current_packet[:size] > 0
87
+ @current_packet.delete :protos
88
+ @block.call @current_packet_index, @current_packet
89
+ @current_packet_index += 1
90
+ end
91
+ end
92
+
93
+ end
94
+
95
+ def self.load_file f, options = {}, &block
96
+ tshark_executable = options[:tshark] || "tshark"
97
+ accepted_protocols = ["geninfo", "tcp", "ip", "eth", "sll", "frame"]
98
+ accepted_protocols += options[:accepted_protocols] if options[:accepted_protocols]
99
+ profile_name = "pcap_tools"
100
+ profile_dir = "#{ENV['HOME']}/.wireshark/profiles/#{profile_name}"
101
+ unless File.exist? "#{profile_dir}/disabled_protos"
102
+ status = POpen4::popen4("#{tshark_executable} -G protocols") do |stdout, stderr, stdin, pid|
103
+ list = stdout.read.split("\n").map { |x| x.split(" ").last }.reject { |x| accepted_protocols.include? x }
104
+ FileUtils.mkdir_p profile_dir
105
+ File.open("#{profile_dir}/disabled_protos", "w") { |io| io.write(list.join("\n") + "\n") }
106
+ end
107
+ raise "Tshark execution error when listing protocols" unless status.exitstatus == 0
108
+ end
109
+ status = POpen4::popen4("#{tshark_executable} -n -C #{profile_name} -T pdml -r #{f}") do |stdout, stderr, stdin, pid|
110
+ Ox.sax_parse(MyParser.new(block), stdout)
111
+ $stderr.puts stderr.read
112
+ end
113
+ raise "Tshark execution error with file #{f}" unless status.exitstatus == 0
114
+ end
115
+
116
+ end
117
+
118
+ end
data/lib/pcap_tools/packet_processors/frame.rb ADDED
@@ -0,0 +1,19 @@
1
+ module PcapTools
2
+
3
+ class FrameProcessor
4
+
5
+ def initialize
6
+ @counter = 0
7
+ end
8
+
9
+ def inject index, packet
10
+ @counter += 1
11
+ end
12
+
13
+ def finalize
14
+ puts "Number of frames : #{@counter}"
15
+ end
16
+
17
+ end
18
+
19
+ end
data/lib/pcap_tools/packet_processors/tcp.rb ADDED
@@ -0,0 +1,51 @@
1
+ require 'time'
2
+
3
+ module PcapTools
4
+
5
+ class TcpProcessor
6
+
7
+ def initialize
8
+ @streams = {}
9
+ @stream_processors = []
10
+ end
11
+
12
+ def add_stream_processor processor
13
+ @stream_processors << processor
14
+ end
15
+
16
+ def inject index, packet
17
+ stream_index = packet[:stream]
18
+ if stream_index
19
+ if packet[:tcp_flags][:syn] && packet[:tcp_flags][:ack] === false
20
+ @streams[stream_index] = {
21
+ :first => packet,
22
+ :data => [],
23
+ }
24
+ elsif packet[:tcp_flags][:fin] || packet[:tcp_flags][:rst]
25
+ if @streams[stream_index]
26
+ current = {:index => stream_index, :data => @streams[stream_index][:data]}
27
+ @stream_processors.each do |p|
28
+ current = p.process_stream current
29
+ break unless current
30
+ end
31
+ @streams.delete stream_index
32
+ end
33
+ else
34
+ if @streams[stream_index]
35
+ packet[:type] = (packet[:from] == @streams[stream_index][:first][:from] && packet[:from_port] == @streams[stream_index][:first][:from_port]) ? :out : :in
36
+ packet.delete :tcp_flags
37
+ @streams[stream_index][:data] << packet if packet[:size] > 0
38
+ end
39
+ end
40
+ end
41
+ end
42
+
43
+ def finalize
44
+ @stream_processors.each do |p|
45
+ p.finalize
46
+ end
47
+ end
48
+
49
+ end
50
+
51
+ end
data/lib/pcap_tools/patches/http.rb ADDED
@@ -0,0 +1,17 @@
1
+ module Net
2
+
3
+ class HTTPRequest
4
+ attr_accessor :time
5
+ end
6
+
7
+ class HTTPResponse
8
+ attr_accessor :time
9
+
10
+ def body=(body)
11
+ @body = body
12
+ @read = true
13
+ end
14
+
15
+ end
16
+
17
+ end
data/lib/pcap_tools/stream_processors/http.rb ADDED
@@ -0,0 +1,99 @@
1
+ module PcapTools
2
+
3
+ class HttpExtractor
4
+
5
+ def process_stream stream
6
+ calls = []
7
+ k = 0
8
+ while k < stream[:data].size
9
+ begin
10
+ req = parse_request(stream[:data][k])
11
+ resp = k + 1 < stream[:data].size ? parse_response(stream[:data][k + 1]) : nil
12
+ calls << [stream[:index], req, resp]
13
+ rescue Exception => e
14
+ warn "Unable to parse http call in stream #{stream[:index]} : #{e}"
15
+ end
16
+ k += 2
17
+ end
18
+ calls
19
+ end
20
+
21
+ def finalize
22
+ end
23
+
24
+ private
25
+
26
+ def parse_request stream
27
+ headers, body = split_headers(stream[:data])
28
+ line0 = headers.shift
29
+ m = /(\S+)\s+(\S+)\s+(\S+)/.match(line0) or raise "Unable to parse first line of http request #{line0}"
30
+ clazz = {
31
+ 'POST' => Net::HTTP::Post,
32
+ 'HEAD' => Net::HTTP::Head,
33
+ 'GET' => Net::HTTP::Get,
34
+ 'PUT' => Net::HTTP::Put
35
+ }[m[1]] or raise "Unknown http request type [#{m[1]}]"
36
+ req = clazz.new m[2]
37
+ req['Pcap-Src'] = stream[:from]
38
+ req['Pcap-Src-Port'] = stream[:from_port]
39
+ req['Pcap-Dst'] = stream[:to]
40
+ req['Pcap-Dst-Port'] = stream[:to_port]
41
+ req.time = stream[:time]
42
+ req.body = body
43
+ req['user-agent'] = nil
44
+ req['accept'] = nil
45
+ add_headers req, headers
46
+ if req['Content-Length']
47
+ req.body.size == req['Content-Length'].to_i or raise "Wrong content-length for http request, header say [#{req['Content-Length'].chomp}], found #{req.body.size}"
48
+ end
49
+ req
50
+ end
51
+
52
+ def parse_response stream
53
+ headers, body = split_headers(stream[:data])
54
+ line0 = headers.shift
55
+ m = /^(\S+)\s+(\S+)\s+(.*)$/.match(line0) or raise "Unable to parse first line of http response [#{line0}]"
56
+ resp = Net::HTTPResponse.send(:response_class, m[2]).new(m[1], m[2], m[3])
57
+ resp.time = stream[:time]
58
+ add_headers resp, headers
59
+ if resp.chunked?
60
+ resp.body = read_chunked("\r\n" + body)
61
+ else
62
+ resp.body = body
63
+ if resp['Content-Length']
64
+ resp.body.size == resp['Content-Length'].to_i or raise "Wrong content-length for http response, header say [#{resp['Content-Length'].chomp}], found #{resp.body.size}"
65
+ end
66
+ end
67
+ begin
68
+ resp.body = Zlib::GzipReader.new(StringIO.new(resp.body)).read if resp['Content-Encoding'] == 'gzip'
69
+ rescue Zlib::GzipFile::Error
70
+ warn "Response body is not in gzip: [#{resp.body}]"
71
+ end
72
+ resp
73
+ end
74
+
75
+ def add_headers o, headers
76
+ headers.each do |line|
77
+ m = /\A([^:]+):\s*/.match(line) or raise "Unable to parse header line [#{line}]"
78
+ o[m[1]] = m.post_match
79
+ end
80
+ end
81
+
82
+ def split_headers str
83
+ index = str.index("\r\n\r\n")
84
+ return str[0 .. index].split("\r\n"), str[index + 4 .. -1]
85
+ end
86
+
87
+ def read_chunked str
88
+ if str.nil? || (str == "\r\n")
89
+ return ''
90
+ end
91
+ m = /\r\n([0-9a-fA-F]+)\r\n/.match(str) or raise "Unable to read chunked body in #{str.split("\r\n")[0]}"
92
+ len = m[1].hex
93
+ return '' if len == 0
94
+ m.post_match[0..len - 1] + read_chunked(m.post_match[len .. -1])
95
+ end
96
+
97
+ end
98
+
99
+ end
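Review note on `read_chunked`: the helper decodes an HTTP chunked body recursively, and the caller prepends `"\r\n"` so that every chunk-size line is preceded by a CRLF. The standalone copy below reproduces the same logic outside the class so it can be traced on a small body. The sample body and the shortened error message are illustrative assumptions, not taken from the gem.

```ruby
# Standalone copy of the chunked-body decoding logic, for tracing only.
def read_chunked(str)
  return '' if str.nil? || str == "\r\n"
  # A chunk header is a hexadecimal length framed by CRLF on both sides.
  m = /\r\n([0-9a-fA-F]+)\r\n/.match(str) or raise "Unable to read chunked body"
  len = m[1].hex
  # A zero-length chunk terminates the body.
  return '' if len == 0
  # Take len bytes of payload, then decode the remainder recursively.
  m.post_match[0..len - 1] + read_chunked(m.post_match[len..-1])
end

# Two chunks ("Wiki", "pedia") followed by the terminating zero-length chunk.
body = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
decoded = read_chunked("\r\n" + body)
puts decoded
```

Because the recursion depth grows with the number of chunks, a response split into very many small chunks could in principle exhaust the stack; an iterative loop would avoid that.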
data/lib/pcap_tools/stream_processors/one_stream_filter.rb ADDED
@@ -0,0 +1,19 @@
1
+ module PcapTools
2
+
3
+ class TcpOneStreamFilter
4
+
5
+ def initialize target
6
+ @target = target
7
+ end
8
+
9
+ def process_stream stream
10
+ return nil if @target && stream[:index] != @target
11
+ stream
12
+ end
13
+
14
+ def finalize
15
+ end
16
+
17
+ end
18
+
19
+ end
data/lib/pcap_tools/stream_processors/rebuilder.rb ADDED
@@ -0,0 +1,33 @@
1
+ module PcapTools
2
+
3
+ class TcpStreamRebuilder
4
+
5
+ def process_stream stream
6
+ out = []
7
+ current = nil
8
+ stream[:data].each do |packet|
9
+ if current
10
+ if packet[:type] == current[:type]
11
+ current[:times] << {:offset => current[:size], :time => packet[:time]}
12
+ current[:data] += packet[:data]
13
+ current[:size] += packet[:size]
14
+ else
15
+ out << current
16
+ current = packet.clone
17
+ current[:times] = [{:offset => 0, :time => packet[:time]}]
18
+ end
19
+ else
20
+ current = packet.clone
21
+ current[:times] = [{:offset => 0, :time => packet[:time]}]
22
+ end
23
+ end
24
+ out << current if current
25
+ {:index => stream[:index], :data => out}
26
+ end
27
+
28
+ def finalize
29
+ end
30
+
31
+ end
32
+
33
+ end
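Review note on `TcpStreamRebuilder`: consecutive packets with the same `:type` are merged into one entry, and `:times` records the byte offset at which each original packet started. The sketch below restates that rule as a free function and applies it to a hand-written stream. `merge_same_direction` and the sample packets are hypothetical, and integer timestamps stand in for `Time` objects.

```ruby
# Merging rule restated as a free function (illustration only).
def merge_same_direction(packets)
  out = []
  current = nil
  packets.each do |packet|
    if current && packet[:type] == current[:type]
      # Same direction: remember where this packet starts, then append it.
      current[:times] << { :offset => current[:size], :time => packet[:time] }
      current[:data] += packet[:data]
      current[:size] += packet[:size]
    else
      # Direction changed (or first packet): flush and start a new entry.
      out << current if current
      current = packet.clone
      current[:times] = [{ :offset => 0, :time => packet[:time] }]
    end
  end
  out << current if current
  out
end

# One request packet followed by a response split across two packets.
packets = [
  { :type => :out, :time => 1, :data => "GET / HTTP/1.1\r\n\r\n", :size => 18 },
  { :type => :in,  :time => 2, :data => "HTTP/1.1 200 OK\r\n",    :size => 17 },
  { :type => :in,  :time => 3, :data => "\r\nhello",              :size => 7 },
]

merged = merge_same_direction(packets)
merged.each { |m| puts "#{m[:type]} size=#{m[:size]} times=#{m[:times].inspect}" }
```

The two `:in` packets collapse into a single 24-byte entry whose second `:times` element has offset 17, which is what lets a later processor such as the HTTP extractor see a complete response.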
data/pcap_tools.gemspec CHANGED
@@ -2,7 +2,7 @@ require 'rake'
2
2
 
3
3
  Gem::Specification.new do |s|
4
4
  s.name = 'pcap_tools'
5
- s.version = '0.0.5'
5
+ s.version = '0.0.6'
6
6
  s.authors = ['Bertrand Paquet']
7
7
  s.email = 'bertrand.paquet@gmail.com'
8
8
  s.summary = 'Tools for extracting data from pcap files'
@@ -11,5 +11,6 @@ Gem::Specification.new do |s|
11
11
  s.files = `git ls-files`.split($/)
12
12
  s.license = 'BSD'
13
13
 
14
- s.add_runtime_dependency('bindata', '>= 1.6.0')
14
+ s.add_dependency('popen4', '0.1.2')
15
+ s.add_dependency('ox', '2.0.11')
15
16
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: pcap_tools
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.5
4
+ version: 0.0.6
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,24 +9,40 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2013-11-22 00:00:00.000000000 Z
12
+ date: 2013-12-08 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
- name: bindata
15
+ name: popen4
16
16
  requirement: !ruby/object:Gem::Requirement
17
17
  none: false
18
18
  requirements:
19
- - - ! '>='
19
+ - - '='
20
20
  - !ruby/object:Gem::Version
21
- version: 1.6.0
21
+ version: 0.1.2
22
22
  type: :runtime
23
23
  prerelease: false
24
24
  version_requirements: !ruby/object:Gem::Requirement
25
25
  none: false
26
26
  requirements:
27
- - - ! '>='
27
+ - - '='
28
28
  - !ruby/object:Gem::Version
29
- version: 1.6.0
29
+ version: 0.1.2
30
+ - !ruby/object:Gem::Dependency
31
+ name: ox
32
+ requirement: !ruby/object:Gem::Requirement
33
+ none: false
34
+ requirements:
35
+ - - '='
36
+ - !ruby/object:Gem::Version
37
+ version: 2.0.11
38
+ type: :runtime
39
+ prerelease: false
40
+ version_requirements: !ruby/object:Gem::Requirement
41
+ none: false
42
+ requirements:
43
+ - - '='
44
+ - !ruby/object:Gem::Version
45
+ version: 2.0.11
30
46
  description:
31
47
  email: bertrand.paquet@gmail.com
32
48
  executables:
@@ -34,10 +50,19 @@ executables:
34
50
  extensions: []
35
51
  extra_rdoc_files: []
36
52
  files:
53
+ - .gitignore
54
+ - Gemfile
55
+ - Gemfile.lock
37
56
  - README.markdown
38
57
  - bin/pcap_tools
39
- - lib/pcap_parser.rb
40
58
  - lib/pcap_tools.rb
59
+ - lib/pcap_tools/loader.rb
60
+ - lib/pcap_tools/packet_processors/frame.rb
61
+ - lib/pcap_tools/packet_processors/tcp.rb
62
+ - lib/pcap_tools/patches/http.rb
63
+ - lib/pcap_tools/stream_processors/http.rb
64
+ - lib/pcap_tools/stream_processors/one_stream_filter.rb
65
+ - lib/pcap_tools/stream_processors/rebuilder.rb
41
66
  - pcap_tools.gemspec
42
67
  homepage: https://github.com/bpaquet/pcap_tools
43
68
  licenses:
@@ -60,9 +85,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
60
85
  version: '0'
61
86
  requirements: []
62
87
  rubyforge_project:
63
- rubygems_version: 1.8.24
88
+ rubygems_version: 1.8.23
64
89
  signing_key:
65
90
  specification_version: 3
66
91
  summary: Tools for extracting data from pcap files
67
92
  test_files: []
68
- has_rdoc:
data/lib/pcap_parser.rb DELETED
@@ -1,218 +0,0 @@
1
- require 'bindata'
2
-
3
- module PcapTools
4
-
5
- module Parser
6
-
7
- module HasParent
8
-
9
- attr_accessor :parent
10
-
11
- end
12
-
13
- class PcapFile < BinData::Record
14
- endian :little
15
-
16
- struct :header do
17
- uint32 :magic
18
- uint16 :major
19
- uint16 :minor
20
- int32 :this_zone
21
- uint32 :sig_figs
22
- uint32 :snaplen
23
- uint32 :linktype
24
- end
25
-
26
- array :packets, :read_until => :eof do
27
- uint32 :ts_sec
28
- uint32 :ts_usec
29
- uint32 :incl_len
30
- uint32 :orig_len
31
- string :data, :length => :incl_len
32
- end
33
-
34
- end
35
-
36
- # Present IP addresses in a human readable way
37
- class IPAddr < BinData::Primitive
38
- array :octets, :type => :uint8, :initial_length => 4
39
-
40
- def set(val)
41
- ints = val.split(/\./).collect { |int| int.to_i }
42
- self.octets = ints
43
- end
44
-
45
- def get
46
- self.octets.collect { |octet| "%d" % octet }.join(".")
47
- end
48
- end
49
-
50
- # TCP Protocol Data Unit
51
- class TCP_PDU < BinData::Record
52
- mandatory_parameter :packet_length
53
-
54
- endian :big
55
-
56
- uint16 :src_port
57
- uint16 :dst_port
58
- uint32 :seq
59
- uint32 :ack_seq
60
- bit4 :doff
61
- bit4 :res1
62
- bit2 :res2
63
- bit1 :urg
64
- bit1 :ack
65
- bit1 :psh
66
- bit1 :rst
67
- bit1 :syn
68
- bit1 :fin
69
- uint16 :window
70
- uint16 :checksum
71
- uint16 :urg_ptr
72
- string :options, :read_length => :options_length_in_bytes
73
- string :payload, :read_length => lambda { packet_length - payload.rel_offset }
74
-
75
- def options_length_in_bytes
76
- (doff - 5 ) * 4
77
- end
78
-
79
- def type
80
- "TCP"
81
- end
82
-
83
- include HasParent
84
-
85
- end
86
-
87
- # UDP Protocol Data Unit
88
- class UDP_PDU < BinData::Record
89
- mandatory_parameter :packet_length
90
-
91
- endian :big
92
-
93
- uint16 :src_port
94
- uint16 :dst_port
95
- uint16 :len
96
- uint16 :checksum
97
- string :payload, :read_length => lambda { packet_length - payload.rel_offset }
98
-
99
- def type
100
- "UDP"
101
- end
102
-
103
- include HasParent
104
- end
105
-
106
- # IP Protocol Data Unit
107
- class IP_PDU < BinData::Record
108
- endian :big
109
-
110
- bit4 :version, :asserted_value => 4
111
- bit4 :header_length
112
- uint8 :tos
113
- uint16 :total_length
114
- uint16 :ident
115
- bit3 :flags
116
- bit13 :frag_offset
117
- uint8 :ttl
118
- uint8 :protocol
119
- uint16 :checksum
120
- ip_addr :src_addr
121
- ip_addr :dst_addr
122
- string :options, :read_length => :options_length_in_bytes
123
- choice :payload, :selection => :protocol do
124
- tcp_pdu 6, :packet_length => :payload_length_in_bytes
125
- udp_pdu 17, :packet_length => :payload_length_in_bytes
126
- string :default, :read_length => :payload_length_in_bytes
127
- end
128
-
129
- def header_length_in_bytes
130
- header_length * 4
131
- end
132
-
133
- def options_length_in_bytes
134
- header_length_in_bytes - options.rel_offset
135
- end
136
-
137
- def payload_length_in_bytes
138
- total_length - header_length_in_bytes
139
- end
140
-
141
- def type
142
- "IP"
143
- end
144
-
145
- include HasParent
146
- end
147
-
148
- class MacAddr < BinData::Primitive
149
- array :octets, :type => :uint8, :initial_length => 6
150
-
151
- def set(val)
152
- ints = val.split(/\./).collect { |int| int.to_i }
153
- self.octets = ints
154
- end
155
-
156
- def get
157
- self.octets.collect { |octet| "%02x" % octet }.join(":")
158
- end
159
- end
160
-
161
- IPV4 = 0x0800
162
- class Ethernet < BinData::Record
163
- endian :big
164
-
165
- mac_addr :dst
166
- mac_addr :src
167
- uint16 :protocol
168
- choice :payload, :selection => :protocol do
169
- ip_pdu IPV4
170
- rest :default
171
- end
172
-
173
- include HasParent
174
- end
175
-
176
- class LinuxCookedCapture < BinData::Record
177
- endian :big
178
-
179
- uint16 :type
180
- uint16 :address_type
181
- uint16 :address_len
182
- array :octets, :type => :uint8, :initial_length => 8
183
- uint16 :protocol
184
- choice :payload, :selection => :protocol do
185
- ip_pdu IPV4
186
- rest :default
187
- end
188
-
189
- include HasParent
190
- end
191
-
192
- def load_file f
193
- packets = []
194
- File.open(f, 'rb') do |io|
195
- content = PcapFile.read(io)
196
- raise 'Wrong endianess' unless content.header.magic.to_i.to_s(16) == "a1b2c3d4"
197
- content.packets.each do |original_packet|
198
- packet = case content.header.linktype
199
- when 113 then LinuxCookedCapture.read(original_packet.data)
200
- when 1 then Ethernet.read(original_packet.data)
201
- else raise "Unknown network #{content.header.linktype}"
202
- end
203
- packet.parent = original_packet
204
- while packet.respond_to?(:payload) && packet.payload.is_a?(BinData::Choice)
205
- packet.payload.parent = packet
206
- packet = packet.payload
207
- end
208
- packets << packet
209
- end
210
- end
211
- packets
212
- end
213
-
214
- module_function :load_file
215
-
216
- end
217
-
218
- end
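Note on the removed parser: `load_file` read the pcap global header as little-endian only, raised `'Wrong endianess'` when the magic was not `a1b2c3d4`, and then dispatched on `linktype` (1 for Ethernet, 113 for Linux cooked capture). The sketch below shows the same header check with plain `String#unpack` instead of `bindata`. It is an illustration only, not code from either gem version: `read_pcap_header` and `PCAP_MAGIC` are hypothetical names, and accepting big-endian captures is an assumption that goes beyond what the deleted code did.

```ruby
# Hypothetical helper, not part of pcap_tools: decode the 24-byte pcap
# global header, using the same field layout as the removed PcapFile record.
PCAP_MAGIC = 0xa1b2c3d4

def read_pcap_header(bytes)
  raise ArgumentError, 'pcap global header needs 24 bytes' if bytes.bytesize < 24

  # The magic number tells us which byte order the capture was written in.
  # "V"/"v" are little-endian, "N"/"n" are big-endian; "l<"/"l>" are the
  # signed 32-bit variants used for this_zone.
  if bytes.unpack('V').first == PCAP_MAGIC
    byte_order = :little
    format = 'Vvvl<VVV'
  elsif bytes.unpack('N').first == PCAP_MAGIC
    byte_order = :big
    format = 'Nnnl>NNN'
  else
    raise ArgumentError, 'not a pcap file (unexpected magic number)'
  end

  magic, major, minor, this_zone, sig_figs, snaplen, linktype =
    bytes.byteslice(0, 24).unpack(format)

  {
    :byte_order => byte_order,
    :magic      => magic,
    :major      => major,
    :minor      => minor,
    :this_zone  => this_zone,
    :sig_figs   => sig_figs,
    :snaplen    => snaplen,
    :linktype   => linktype
  }
end
```

A caller would pass the first 24 bytes of a capture, for example `read_pcap_header(File.binread(path, 24))`, and then branch on `:linktype` the way the removed `case` statement did.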