pcap_tools 0.0.5 → 0.0.6

data/.gitignore ADDED
@@ -0,0 +1,3 @@
1
+ *.gem
2
+ *.pcap
3
+ *.xml
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org/'
2
+
3
+ gemspec
4
+
data/Gemfile.lock ADDED
@@ -0,0 +1,24 @@
1
+ PATH
2
+ remote: .
3
+ specs:
4
+ pcap_tools (0.0.6)
5
+ nokogiri (= 1.6.0)
6
+ popen4 (= 0.1.2)
7
+
8
+ GEM
9
+ remote: https://rubygems.org/
10
+ specs:
11
+ Platform (0.4.0)
12
+ mini_portile (0.5.2)
13
+ nokogiri (1.6.0)
14
+ mini_portile (~> 0.5.0)
15
+ open4 (1.3.0)
16
+ popen4 (0.1.2)
17
+ Platform (>= 0.4.0)
18
+ open4 (>= 0.4.0)
19
+
20
+ PLATFORMS
21
+ ruby
22
+
23
+ DEPENDENCIES
24
+ pcap_tools!
data/README.markdown CHANGED
@@ -1,56 +1,107 @@
1
1
  # What is it ?
2
2
 
3
- It's a ruby library to help tcpdump file processing : do some offline analysis on tcpdump files.
3
+ PCapTools is a Ruby library for processing pcap files produced by [wireshark](http://www.wireshark.org/) or [tcpdump](http://www.tcpdump.org/).
4
+
5
+ PCapTools uses [tshark](http://www.wireshark.org/docs/man-pages/tshark.html) to read the pcap file, then runs some analysis on the decoded packets. Tshark is bundled with [wireshark](http://www.wireshark.org/download.html).
6
+
7
+ There are two ways to use PCapTools
8
+
9
+ * as a command line tool
10
+ * as a ruby library
11
+
12
+ ## As a command line tool
4
13
 
5
14
  Main functionalities :
6
15
 
7
16
  * Rebuild tcp streams
8
- * Extract and parse http request
17
+ * Extract and parse http requests
18
+
19
+ ### Install
20
+
21
+ Ensure `tshark` is installed. If not, check your wireshark install.
22
+
23
+ tshark -v
24
+
25
+ Install pcap_tools
26
+
27
+ gem install pcap_tools
28
+
29
+ ### Command line options
30
+
31
+ pcap_tools --help
32
+
33
+ Usage: pcap_tools_http [options] pcap_files
34
+ --no-body Do not display body
35
+ --tshark_path Path to tshark executable
36
+ --one-tcp-stream [index] Display only one tcp stream
37
+ --mode [MODE] Parsing mode : http, tcp, frame, tcp_count. Default http
38
+
39
+ ### Typical use
40
+
41
+ tcpdump -w out.pcap -s0 port 80
42
+ pcap_tools out.pcap
43
+
44
+
45
+ ## As a ruby library
9
46
 
10
- # How use it
47
+ ### Install
11
48
 
12
- ## Make a tcpdump
49
+ Ensure `tshark` is installed. If not, check your wireshark install.
13
50
 
14
- * `tcpdump -w out.pcap -s 4096 <filter>`
15
- * Get the output file out.pcap
51
+ tshark -v
16
52
 
17
- Please adjust the 4096 value, to the max packet size to capture.
53
+ Declare a dependency on `pcap_tools` in your Gemfile.
18
54
 
19
- ## Write a ruby script
55
+ gem 'pcap_tools'
20
56
 
21
- require 'pcap_tools'
57
+ ### How to use it
22
58
 
23
- # Load tcpdump file
24
- capture = PCAPRUB::Pcap.open_offline('out.pcap')
59
+ The best example is the [pcap_tools command line script](https://github.com/bpaquet/pcap_tools/blob/master/bin/pcap_tools).
25
60
 
26
- ## Available functions
61
+ Pcap_tools is an event processor : `tshark` emits an XML (PDML) stream, which is parsed with a SAX parser. Each packet is handed to a TCP processor, which builds streams. Each stream is then handed to an HTTP processor, which rebuilds HTTP requests and responses.
62
+
63
+ #### Loading files
64
+
65
+ PcapTools::Loader::load_file(f, {:tshark => OPTIONS[:tshark_path]}) do |index, packet|
66
+ end
67
+
68
+ Each packet is a Ruby hash containing the main attributes of the packet as extracted by tshark, for example `:from`, `:to`, `:from_port`, `:to_port`, `:time`, `:size` and `:data`.
69
+
70
+ You have to use a packet processor to process this packet. The main one is `PcapTools::TcpProcessor`.
27
71
 
28
72
  ### Extract tcp streams
29
73
 
30
- This function rebuild tcp streams from an array of pcap capture object.
74
+ processor = PcapTools::TcpProcessor.new
75
+ PcapTools::Loader::load_file(f, {:tshark => OPTIONS[:tshark_path]}) do |index, packet|
76
+ processor.inject index, packet
77
+ end
31
78
 
32
- tcp_streams = PcapTools::extract_tcp_streams(captures)
79
+ The [TcpProcessor](https://github.com/bpaquet/pcap_tools/blob/master/lib/pcap_tools/packet_processors/tcp.rb) rebuilds streams from raw IP packets. To use the streams, you have to add one or more stream processors.
33
80
 
34
- `tcp_streams` is an array of hash, each hash has tree keys :
81
+ processor.add_stream_processor PcapTools::TcpStreamRebuilder.new
35
82
 
36
- * `:type` : `:in` or `:out`, if the packet was sent or received
37
- * `:time` : timestamp of packet
38
- * `:data` : payload of packet
83
+ The [TcpStreamRebuilder](https://github.com/bpaquet/pcap_tools/blob/master/lib/pcap_tools/stream_processors/rebuilder.rb) reassembles contiguous packets sent in the same direction, for example the packets carrying a large HTTP response.
84
+
85
+ processor.add_stream_processor PcapTools::HttpExtractor.new
39
86
 
40
- Remarks :
87
+ The [HttpExtractor](https://github.com/bpaquet/pcap_tools/blob/master/lib/pcap_tools/stream_processors/http.rb) builds HTTP requests and responses from streams.
41
88
 
42
- * Packets are in the rigth ordere
43
- * Packets are not merged (eg an http response can be splitted on serval consecutive packets,
44
- with the same type `:in` or `:out`).
45
- To reassemble packet of the same type, please use `stream.rebuild_packets`
89
+ Please read the code to build your own stream processor. Do not be afraid, it's easy :)
46
90
 
47
- ### Extract http calls
91
+ Note : the TcpProcessor cannot rebuild TCP streams that do not start within the pcap file. For example, if you launch tcpdump after a long-running Oracle DB connection has been opened, TcpProcessor will not show that connection.
48
92
 
49
- This function extract http calls from a tcp stream, returned from the `extract_tcp_streams` function.
93
+ ### Data format
50
94
 
51
- http_calls = PcapTools::extract_http_calls(stream)
95
+ Stream objects
52
96
 
53
- `http_calls` is an array of `http_call`.
97
+ `tcp_streams` is an array of hashes; each hash has the following keys :
98
+
99
+ * `:type` : `:out` if the packet was sent by the side that opened the connection, `:in` otherwise
100
+ * `:time` : timestamp of the packet.
101
+ * `:data` : payload of packet.
102
+ * `:size` : payload size.
103
+
104
+ HTTP Objects
54
105
 
55
106
  A `http_call` is an array of two objects :
56
107
 
@@ -76,15 +127,3 @@ The request and response object have some new attributes
76
127
  For the response object body, the following "Content-Encoding" type are honored :
77
128
 
78
129
  * gzip
79
-
80
- ### Extract http calls from captures
81
-
82
- The two in one : extract http calls from an array of captures objects
83
-
84
- http_calls = PcapTools::extract_http_calls_from_captures(captures)
85
-
86
- ### Load multiple files
87
-
88
- Load multiple pcap files, in time order. Useful when you use `tcpdump -C 5 -W 100000`, to split captured data into pieces of 5M
89
-
90
- captures = PcapTools::load_mutliple_files '*pcap*'
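Note on the README above: it describes stream processors but leaves writing one to the reader. The following is a minimal sketch, assuming only what `lib/pcap_tools/packet_processors/tcp.rb` in this diff shows: a stream processor responds to `process_stream(stream)`, returning the stream to pass it down the chain or `nil` to stop, and to `finalize`. `SizeSummary` and `run_chain` are hypothetical names used for illustration; `run_chain` only imitates the dispatch loop of `TcpProcessor#inject` so that the example runs without the gem or tshark.

```ruby
# Hypothetical stream processor: sums the payload size of each stream.
class SizeSummary
  attr_reader :totals

  def initialize
    @totals = {}
  end

  # stream is a hash with :index and :data (an array of packet hashes).
  def process_stream(stream)
    @totals[stream[:index]] = stream[:data].inject(0) { |sum, packet| sum + packet[:size] }
    stream # return the stream so the next processor in the chain receives it
  end

  def finalize
    @totals.each { |index, bytes| puts "stream #{index}: #{bytes} bytes" }
  end
end

# Imitation of the dispatch loop: each processor receives the previous
# processor's result, and the chain stops at the first nil.
def run_chain(processors, stream)
  current = stream
  processors.each do |processor|
    current = processor.process_stream(current)
    break unless current
  end
  current
end

summary = SizeSummary.new
run_chain([summary], {:index => 3, :data => [{:size => 10}, {:size => 5}]})
summary.finalize
```

With the gem installed, an object of this shape would presumably be registered with `processor.add_stream_processor SizeSummary.new`, as the README does for the built-in processors.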
data/bin/pcap_tools CHANGED
@@ -3,7 +3,7 @@
3
3
  require 'pcap_tools'
4
4
  require 'optparse'
5
5
 
6
- options = {
6
+ OPTIONS = {
7
7
  :mode => :http,
8
8
  }
9
9
 
@@ -11,60 +11,133 @@ OptionParser.new do |opts|
11
11
  opts.banner = "Usage: pcap_tools_http [options] pcap_files"
12
12
 
13
13
  opts.on("--no-body", "Do not display body") do
14
- options[:no_body] = true
14
+ OPTIONS[:no_body] = true
15
15
  end
16
16
 
17
- opts.on("--mode [MODE]", [:http, :tcp], "parsing mode") do |m|
18
- options[:mode] = m
17
+ opts.on("--tshark_path", "Path to tshark executable") do |x|
18
+ OPTIONS[:tshark_path] = x
19
19
  end
20
20
 
21
- end.parse!
22
-
23
- data = ARGV.map{|f| puts "Loading #{f}"; PcapTools::Parser::load_file(f)}
21
+ opts.on("--one-tcp-stream [index]", Integer, "Display only one tcp stream") do |x|
22
+ OPTIONS[:one_tcp_stream] = x
23
+ end
24
24
 
25
- tcps = PcapTools::extract_tcp_streams(data)
25
+ opts.on("--mode [MODE]", [:http, :tcp, :frame, :tcp_count], "Parsing mode : http, tcp, frame, tcp_count. Default http") do |m|
26
+ OPTIONS[:mode] = m
27
+ end
26
28
 
27
- puts "Tcp streams extracted : #{tcps.size}"
28
- puts "Parsing mode : #{options[:mode]}"
29
- puts
29
+ end.parse!
30
30
 
31
31
  def format_time t
32
32
  "#{t} #{t.nsec / 1000}"
33
33
  end
34
34
 
35
- if options[:mode] == :http
36
- tcps.each do |tcp|
37
- PcapTools::extract_http_calls(tcp).each do |req, resp|
38
- puts ">>>> #{req["pcap-src"]}:#{req["pcap-src-port"]} > #{req["pcap-dst"]}:#{req["pcap-dst-port"]} #{format_time req.time}"
39
- puts "#{req.method} #{req.path}"
40
- req.each_capitalized_name.reject{|x| x =~ /^Pcap/ }.each do |x|
41
- puts "#{x}: #{req[x]}"
35
+ puts "Mode : #{OPTIONS[:mode]}"
36
+
37
+ processor = nil
38
+
39
+ if OPTIONS[:mode] == :frame
40
+
41
+ processor = PcapTools::FrameProcessor.new
42
+
43
+ else
44
+
45
+ class TcpCounter
46
+
47
+ def initialize
48
+ @counter = 0
49
+ end
50
+
51
+ def process_stream stream
52
+ @counter += 1
53
+ stream
54
+ end
55
+
56
+ def finalize
57
+ puts "Number of TCP Streams : #{@counter}"
58
+ end
59
+
60
+ end
61
+
62
+
63
+ processor = PcapTools::TcpProcessor.new
64
+ processor.add_stream_processor TcpCounter.new
65
+ processor.add_stream_processor PcapTools::TcpOneStreamFilter.new OPTIONS[:one_tcp_stream]
66
+
67
+ if OPTIONS[:mode] == :tcp
68
+
69
+ class TcpPrinter
70
+
71
+ def process_stream stream
72
+ puts "<<<< new connection >>>> [ Wireshark stream index #{stream[:index]} ]"
73
+ stream[:data].each do |packet|
74
+ type = packet[:type] == :out ? ">>>>" : "<<<<<"
75
+ puts "#{type} #{packet[:from]}:#{packet[:from_port]} > #{packet[:to]}:#{packet[:to_port]}, size #{packet[:data].size} #{format_time packet[:time]}"
76
+ puts packet[:data] unless OPTIONS[:no_body]
77
+ puts
78
+ end
79
+ end
80
+
81
+ def finalize
82
+ end
83
+
84
+ end
85
+
86
+ processor.add_stream_processor TcpPrinter.new
87
+
88
+ end
89
+
90
+ if OPTIONS[:mode] == :http
91
+
92
+ class HttpPrinter
93
+
94
+ def initialize
95
+ @counter = 0
42
96
  end
43
- puts
44
- puts req.body unless options[:no_body]
45
- puts "<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< #{format_time resp.time}"
46
- if resp
47
- puts "#{resp.code} #{resp.message}"
48
- resp.each_capitalized_name.reject{|x| x =~ /^Pcap/ }.each do |x|
49
- puts "#{x}: #{resp[x]}"
97
+
98
+ def process_stream stream
99
+ stream.each do |index, req, resp|
100
+ @counter += 1
101
+ puts ">>>> #{req["pcap-src"]}:#{req["pcap-src-port"]} > #{req["pcap-dst"]}:#{req["pcap-dst-port"]} #{format_time req.time} [ Wireshark stream index #{index} ]"
102
+ puts "#{req.method} #{req.path}"
103
+ req.each_capitalized_name.reject{|x| x =~ /^Pcap/ }.each do |x|
104
+ puts "#{x}: #{req[x]}"
105
+ end
106
+ puts
107
+ puts req.body unless OPTIONS[:no_body]
108
+ if resp
109
+ puts "<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< #{format_time resp.time}"
110
+ puts "#{resp.code} #{resp.message}"
111
+ resp.each_capitalized_name.reject{|x| x =~ /^Pcap/ }.each do |x|
112
+ puts "#{x}: #{resp[x]}"
113
+ end
114
+ puts
115
+ puts resp.body unless OPTIONS[:no_body]
116
+ else
117
+ puts "No response found"
118
+ end
119
+ puts
50
120
  end
51
- puts
52
- puts resp.body unless options[:no_body]
53
- else
54
- puts "No response in pcap file"
55
121
  end
56
- puts
122
+
123
+ def finalize
124
+ puts "Number of HTTP Request / response : #{@counter}"
125
+ end
126
+
57
127
  end
128
+
129
+ processor.add_stream_processor PcapTools::TcpStreamRebuilder.new
130
+ processor.add_stream_processor PcapTools::HttpExtractor.new
131
+ processor.add_stream_processor HttpPrinter.new
132
+
58
133
  end
134
+
59
135
  end
60
136
 
61
- if options[:mode] == :tcp
62
- tcps.each do |tcp|
63
- tcp.each do |packet|
64
- type = packet[:type] == :out ? ">>>>" : "<<<<<"
65
- puts "#{type} #{packet[:from]}:#{packet[:from_port]} > #{packet[:to]}:#{packet[:to_port]}, size #{packet[:data].size} #{format_time packet[:time]}"
66
- puts packet[:data]
67
- puts
68
- end
137
+ ARGV.each do |f|
138
+ PcapTools::Loader::load_file(f, {:tshark => OPTIONS[:tshark_path]}) do |index, packet|
139
+ processor.inject index, packet
69
140
  end
70
- end
141
+ end
142
+
143
+ processor.finalize
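Note on the option parsing in this file: Ruby's `OptionParser` only hands a value to a switch's block when the switch definition names an argument, as `--mode [MODE]` does. A switch declared without one, which is how `--tshark_path` appears in this hunk, yields `true`, and the path stays among the remaining arguments. A minimal sketch of the difference, where `parse_switch` is a hypothetical helper and `/opt/wireshark/bin/tshark` is only an example path:

```ruby
require 'optparse'

# Hypothetical helper: define a single switch from `definition`, parse `argv`,
# and return the value handed to the block plus the arguments left over.
def parse_switch(definition, argv)
  value = nil
  rest = OptionParser.new do |opts|
    opts.on(definition, "Path to tshark executable") { |x| value = x }
  end.parse(argv)
  [value, rest]
end

argv = %w[--tshark_path /opt/wireshark/bin/tshark out.pcap]

# Definition with an argument placeholder: the block receives the path.
p parse_switch("--tshark_path PATH", argv)

# Definition without a placeholder: the block receives true.
p parse_switch("--tshark_path", argv)
```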
data/lib/pcap_tools.rb CHANGED
@@ -1,189 +1,14 @@
1
- require 'rubygems'
2
1
  require 'net/http'
2
+ require 'bindata'
3
3
  require 'zlib'
4
4
 
5
- require File.join(File.dirname(__FILE__), 'pcap_parser')
5
+ require_relative 'pcap_tools/loader'
6
6
 
7
- module Net
7
+ require_relative 'pcap_tools/patches/http.rb'
8
8
 
9
- class HTTPRequest
10
- attr_accessor :time
11
- end
9
+ require_relative 'pcap_tools/packet_processors/frame'
10
+ require_relative 'pcap_tools/packet_processors/tcp'
12
11
 
13
- class HTTPResponse
14
- attr_accessor :time
15
-
16
- def body= body
17
- @body = body
18
- @read = true
19
- end
20
-
21
- end
22
-
23
- end
24
-
25
- module PcapTools
26
-
27
- class TcpStream < Array
28
-
29
- def insert_tcp sym, packet
30
- data = packet.payload
31
- return if data.size == 0
32
- timestamp = Time.at(packet.parent.parent.parent.ts_sec, packet.parent.parent.parent.ts_usec)
33
- self << {:type => sym, :data => data, :from => packet.parent.src_addr, :to => packet.parent.dst_addr, :from_port => packet.src_port, :to_port => packet.dst_port, :time => timestamp}
34
- end
35
-
36
- def rebuild_packets
37
- out = TcpStream.new
38
- current = nil
39
- self.each do |packet|
40
- if current
41
- if packet[:type] == current[:type]
42
- current[:data] += packet[:data]
43
- else
44
- out << current
45
- current = packet.clone
46
- end
47
- else
48
- current = packet.clone
49
- end
50
- end
51
- out << current if current
52
- out
53
- end
54
-
55
- end
56
-
57
- def extract_http_calls_from_captures captures
58
- calls = []
59
- extract_tcp_streams(captures).each do |tcp|
60
- calls.concat(extract_http_calls(tcp))
61
- end
62
- calls
63
- end
64
-
65
- module_function :extract_http_calls_from_captures
66
-
67
- def extract_tcp_streams captures
68
- packets = []
69
- captures.each do |capture|
70
- capture.each do |packet|
71
- packets << packet
72
- end
73
- end
74
-
75
- streams = []
76
- packets.each_with_index do |packet, k|
77
- if packet.respond_to?(:type) && packet.type == "TCP" && packet.syn == 1 && packet.ack == 0
78
- kk = k
79
- tcp = TcpStream.new
80
- while kk < packets.size
81
- packet2 = packets[kk]
82
- if packet2.respond_to?(:type) && packet.type == "TCP"
83
- if packet.dst_port == packet2.dst_port && packet.src_port == packet2.src_port
84
- tcp.insert_tcp :out, packet2
85
- break if packet.fin == 1 || packet2.fin == 1
86
- end
87
- if packet.dst_port == packet2.src_port && packet.src_port == packet2.dst_port
88
- tcp.insert_tcp :in, packet2
89
- break if packet.fin == 1 || packet2.fin == 1
90
- end
91
- end
92
- kk += 1
93
- end
94
- streams << tcp
95
- end
96
- end
97
- streams
98
- end
99
-
100
- module_function :extract_tcp_streams
101
-
102
- def extract_http_calls stream
103
- rebuilded = stream.rebuild_packets
104
- calls = []
105
- data_out = ""
106
- data_in = nil
107
- k = 0
108
- while k < rebuilded.size
109
- begin
110
- req = HttpParser::parse_request(rebuilded[k])
111
- resp = k + 1 < rebuilded.size ? HttpParser::parse_response(rebuilded[k + 1]) : nil
112
- calls << [req, resp]
113
- rescue Exception => e
114
- warn "Unable to parse http call : #{e}"
115
- end
116
- k += 2
117
- end
118
- calls
119
- end
120
-
121
- module_function :extract_http_calls
122
-
123
- module HttpParser
124
-
125
- def parse_request stream
126
- headers, body = split_headers(stream[:data])
127
- line0 = headers.shift
128
- m = /(\S+)\s+(\S+)\s+(\S+)/.match(line0) or raise "Unable to parse first line of http request #{line0}"
129
- clazz = {'POST' => Net::HTTP::Post, 'HEAD' => Net::HTTP::Head, 'GET' => Net::HTTP::Get, 'PUT' => Net::HTTP::Put}[m[1]] or raise "Unknown http request type #{m[1]}"
130
- req = clazz.new m[2]
131
- req['Pcap-Src'] = stream[:from]
132
- req['Pcap-Src-Port'] = stream[:from_port]
133
- req['Pcap-Dst'] = stream[:to]
134
- req['Pcap-Dst-Port'] = stream[:to_port]
135
- req.time = stream[:time]
136
- req.body = body
137
- req['user-agent'] = nil
138
- req['accept'] = nil
139
- add_headers req, headers
140
- req.body.size == req['Content-Length'].to_i or raise "Wrong content-length for http request, header say #{req['Content-Length'].chomp}, found #{req.body.size}"
141
- req
142
- end
143
-
144
- module_function :parse_request
145
-
146
- def parse_response stream
147
- headers, body = split_headers(stream[:data])
148
- line0 = headers.shift
149
- m = /^(\S+)\s+(\S+)\s+(.*)$/.match(line0) or raise "Unable to parse first line of http response #{line0}"
150
- resp = Net::HTTPResponse.send(:response_class, m[2]).new(m[1], m[2], m[3])
151
- resp.time = stream[:time]
152
- add_headers resp, headers
153
- if resp.chunked?
154
- resp.body = read_chunked("\r\n" + body)
155
- else
156
- resp.body = body
157
- resp.body.size == resp['Content-Length'].to_i or raise "Wrong content-length for http response, header say #{resp['Content-Length'].chomp}, found #{resp.body.size}"
158
- end
159
- resp.body = Zlib::GzipReader.new(StringIO.new(resp.body)).read if resp['Content-Encoding'] == 'gzip'
160
- resp
161
- end
162
-
163
- module_function :parse_response
164
-
165
- private
166
-
167
- def self.add_headers o, headers
168
- headers.each do |line|
169
- m = /\A([^:]+):\s*/.match(line) or raise "Unable to parse line #{line}"
170
- o[m[1]] = m.post_match
171
- end
172
- end
173
-
174
- def self.split_headers str
175
- index = str.index("\r\n\r\n")
176
- return str[0 .. index].split("\r\n"), str[index + 4 .. -1]
177
- end
178
-
179
- def self.read_chunked str
180
- return "" if str == "\r\n"
181
- m = /\r\n([0-9a-fA-F]+)\r\n/.match(str) or raise "Unable to read chunked body in #{str.split("\r\n")[0]}"
182
- len = m[1].hex
183
- return "" if len == 0
184
- m.post_match[0..len - 1] + read_chunked(m.post_match[len .. -1])
185
- end
186
-
187
- end
188
-
189
- end
12
+ require_relative 'pcap_tools/stream_processors/one_stream_filter'
13
+ require_relative 'pcap_tools/stream_processors/rebuilder'
14
+ require_relative 'pcap_tools/stream_processors/http'
data/lib/pcap_tools/loader.rb ADDED
@@ -0,0 +1,118 @@
1
+ require 'popen4'
2
+ require 'ox'
3
+ require 'fileutils'
4
+
5
+ module PcapTools
6
+
7
+ module Loader
8
+
9
+ class MyParser < ::Ox::Sax
10
+
11
+ def initialize block
12
+ @current_packet_index = 0
13
+ @current_packet = nil
14
+ @current_processing = nil
15
+ @current_proto_name = nil
16
+ @current_field_name = nil
17
+ @block = block
18
+ end
19
+
20
+ def attr name, value
21
+ if @current_processing == :proto && name == :name
22
+ @current_proto_name = value
23
+ @current_packet[:protos] << value
24
+ elsif @current_processing == :field && name == :name
25
+ @current_field_name = value
26
+ # p @current_field_name
27
+ elsif name == :show
28
+ if @current_proto_name == "geninfo" && @current_field_name == "timestamp"
29
+ @current_packet[:time] = Time.parse value
30
+ elsif @current_proto_name == "ip" && @current_field_name == "ip.src"
31
+ @current_packet[:from] = value
32
+ elsif @current_proto_name == "ip" && @current_field_name == "ip.dst"
33
+ @current_packet[:to] = value
34
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.len"
35
+ @current_packet[:size] = value.to_i
36
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.stream"
37
+ @current_packet[:stream] = value.to_i
38
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.srcport"
39
+ @current_packet[:from_port] = value.to_i
40
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.dstport"
41
+ @current_packet[:to_port] = value.to_i
42
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.flags.fin"
43
+ @current_packet[:tcp_flags][:fin] = value == "1"
44
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.flags.reset"
45
+ @current_packet[:tcp_flags][:rst] = value == "1"
46
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.flags.ack"
47
+ @current_packet[:tcp_flags][:ack] = value == "1"
48
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.flags.syn"
49
+ @current_packet[:tcp_flags][:syn] = value == "1"
50
+ end
51
+ elsif name == :value
52
+ if @current_proto_name == "fake-field-wrapper" && @current_field_name == "data"
53
+ @current_packet[:data] = [value].pack("H*")
54
+ elsif @current_proto_name == "tcp" && @current_field_name == "tcp.segment_data"
55
+ @current_packet[:data] = [value].pack("H*")
56
+ end
57
+ end
58
+ end
59
+
60
+ def start_element name, attrs = []
61
+ case name
62
+ when :packet
63
+ @current_packet = {
64
+ :tcp_flags => {},
65
+ :packet_index => @current_packet_index + 1,
66
+ :protos => [],
67
+ }
68
+ when :proto
69
+ @current_processing = :proto
70
+ when :field
71
+ @current_processing = :field
72
+ when :pdml
73
+ else
74
+ raise "Unknown element [#{name}]"
75
+ end
76
+ end
77
+
78
+
79
+ def end_element name
80
+ if name == :packet
81
+ # p @current_packet
82
+ if @current_packet[:protos].include? "malformed"
83
+ $stderr.puts "Malformed packet #{@current_packet_index}"
84
+ return
85
+ end
86
+ raise "No data found in packet #{@current_packet_index}, protocols found #{@current_packet[:protos]}" if @current_packet[:data].nil? && @current_packet[:size] > 0
87
+ @current_packet.delete :protos
88
+ @block.call @current_packet_index, @current_packet
89
+ @current_packet_index += 1
90
+ end
91
+ end
92
+
93
+ end
94
+
95
+ def self.load_file f, options = {}, &block
96
+ tshark_executable = options[:tshark] || "tshark"
97
+ accepted_protocols = ["geninfo", "tcp", "ip", "eth", "sll", "frame"]
98
+ accepted_protocols += options[:accepted_protocols] if options[:accepted_protocols]
99
+ profile_name = "pcap_tools"
100
+ profile_dir = "#{ENV['HOME']}/.wireshark/profiles/#{profile_name}"
101
+ unless File.exist? "#{profile_dir}/disabled_protos"
102
+ status = POpen4::popen4("#{tshark_executable} -G protocols") do |stdout, stderr, stdin, pid|
103
+ list = stdout.read.split("\n").map { |x| x.split(" ").last }.reject { |x| accepted_protocols.include? x }
104
+ FileUtils.mkdir_p profile_dir
105
+ File.open("#{profile_dir}/disabled_protos", "w") { |io| io.write(list.join("\n") + "\n") }
106
+ end
107
+ raise "Tshark execution error when listing protocols" unless status.exitstatus == 0
108
+ end
109
+ status = POpen4::popen4("#{tshark_executable} -n -C #{profile_name} -T pdml -r #{f}") do |stdout, stderr, stdin, pid|
110
+ Ox.sax_parse(MyParser.new(block), stdout)
111
+ $stderr.puts stderr.read
112
+ end
113
+ raise "Tshark execution error with file #{f}" unless status.exitstatus == 0
114
+ end
115
+
116
+ end
117
+
118
+ end
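Note on the loader above: tshark's PDML output carries payload bytes as hexadecimal text in the `value` attribute, and the parser turns that text back into raw bytes with `[value].pack("H*")`. A minimal sketch of that decoding step in isolation; `decode_hex_payload` is a hypothetical helper name and the sample value is made up for illustration:

```ruby
# Convert a hex string, as found in a PDML value attribute, into raw bytes.
def decode_hex_payload(hex)
  [hex].pack("H*")
end

sample = "474554202f20485454502f312e31"
payload = decode_hex_payload(sample)
puts payload

# unpack("H*") is the inverse operation and restores the hex text.
puts payload.unpack("H*").first == sample
```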
data/lib/pcap_tools/packet_processors/frame.rb ADDED
@@ -0,0 +1,19 @@
1
+ module PcapTools
2
+
3
+ class FrameProcessor
4
+
5
+ def initialize
6
+ @counter = 0
7
+ end
8
+
9
+ def inject index, packet
10
+ @counter += 1
11
+ end
12
+
13
+ def finalize
14
+ puts "Number of frames : #{@counter}"
15
+ end
16
+
17
+ end
18
+
19
+ end
data/lib/pcap_tools/packet_processors/tcp.rb ADDED
@@ -0,0 +1,51 @@
1
+ require 'time'
2
+
3
+ module PcapTools
4
+
5
+ class TcpProcessor
6
+
7
+ def initialize
8
+ @streams = {}
9
+ @stream_processors = []
10
+ end
11
+
12
+ def add_stream_processor processor
13
+ @stream_processors << processor
14
+ end
15
+
16
+ def inject index, packet
17
+ stream_index = packet[:stream]
18
+ if stream_index
19
+ if packet[:tcp_flags][:syn] && packet[:tcp_flags][:ack] === false
20
+ @streams[stream_index] = {
21
+ :first => packet,
22
+ :data => [],
23
+ }
24
+ elsif packet[:tcp_flags][:fin] || packet[:tcp_flags][:rst]
25
+ if @streams[stream_index]
26
+ current = {:index => stream_index, :data => @streams[stream_index][:data]}
27
+ @stream_processors.each do |p|
28
+ current = p.process_stream current
29
+ break unless current
30
+ end
31
+ @streams.delete stream_index
32
+ end
33
+ else
34
+ if @streams[stream_index]
35
+ packet[:type] = (packet[:from] == @streams[stream_index][:first][:from] && packet[:from_port] == @streams[stream_index][:first][:from_port]) ? :out : :in
36
+ packet.delete :tcp_flags
37
+ @streams[stream_index][:data] << packet if packet[:size] > 0
38
+ end
39
+ end
40
+ end
41
+ end
42
+
43
+ def finalize
44
+ @stream_processors.each do |p|
45
+ p.finalize
46
+ end
47
+ end
48
+
49
+ end
50
+
51
+ end
data/lib/pcap_tools/patches/http.rb ADDED
@@ -0,0 +1,17 @@
1
+ module Net
2
+
3
+ class HTTPRequest
4
+ attr_accessor :time
5
+ end
6
+
7
+ class HTTPResponse
8
+ attr_accessor :time
9
+
10
+ def body=(body)
11
+ @body = body
12
+ @read = true
13
+ end
14
+
15
+ end
16
+
17
+ end
data/lib/pcap_tools/stream_processors/http.rb ADDED
@@ -0,0 +1,99 @@
1
+ module PcapTools
2
+
3
+ class HttpExtractor
4
+
5
+ def process_stream stream
6
+ calls = []
7
+ k = 0
8
+ while k < stream[:data].size
9
+ begin
10
+ req = parse_request(stream[:data][k])
11
+ resp = k + 1 < stream[:data].size ? parse_response(stream[:data][k + 1]) : nil
12
+ calls << [stream[:index], req, resp]
13
+ rescue Exception => e
14
+ warn "Unable to parse http call in stream #{stream[:index]} : #{e}"
15
+ end
16
+ k += 2
17
+ end
18
+ calls
19
+ end
20
+
21
+ def finalize
22
+ end
23
+
24
+ private
25
+
26
+ def parse_request stream
27
+ headers, body = split_headers(stream[:data])
28
+ line0 = headers.shift
29
+ m = /(\S+)\s+(\S+)\s+(\S+)/.match(line0) or raise "Unable to parse first line of http request #{line0}"
30
+ clazz = {
31
+ 'POST' => Net::HTTP::Post,
32
+ 'HEAD' => Net::HTTP::Head,
33
+ 'GET' => Net::HTTP::Get,
34
+ 'PUT' => Net::HTTP::Put
35
+ }[m[1]] or raise "Unknown http request type [#{m[1]}]"
36
+ req = clazz.new m[2]
37
+ req['Pcap-Src'] = stream[:from]
38
+ req['Pcap-Src-Port'] = stream[:from_port]
39
+ req['Pcap-Dst'] = stream[:to]
40
+ req['Pcap-Dst-Port'] = stream[:to_port]
41
+ req.time = stream[:time]
42
+ req.body = body
43
+ req['user-agent'] = nil
44
+ req['accept'] = nil
45
+ add_headers req, headers
46
+ if req['Content-Length']
47
+ req.body.size == req['Content-Length'].to_i or raise "Wrong content-length for http request, header say [#{req['Content-Length'].chomp}], found #{req.body.size}"
48
+ end
49
+ req
50
+ end
51
+
52
+ def parse_response stream
53
+ headers, body = split_headers(stream[:data])
54
+ line0 = headers.shift
55
+ m = /^(\S+)\s+(\S+)\s+(.*)$/.match(line0) or raise "Unable to parse first line of http response [#{line0}]"
56
+ resp = Net::HTTPResponse.send(:response_class, m[2]).new(m[1], m[2], m[3])
57
+ resp.time = stream[:time]
58
+ add_headers resp, headers
59
+ if resp.chunked?
60
+ resp.body = read_chunked("\r\n" + body)
61
+ else
62
+ resp.body = body
63
+ if resp['Content-Length']
64
+ resp.body.size == resp['Content-Length'].to_i or raise "Wrong content-length for http response, header say [#{resp['Content-Length'].chomp}], found #{resp.body.size}"
65
+ end
66
+ end
67
+ begin
68
+ resp.body = Zlib::GzipReader.new(StringIO.new(resp.body)).read if resp['Content-Encoding'] == 'gzip'
69
+ rescue Zlib::GzipFile::Error
70
+ warn "Response body is not in gzip: [#{resp.body}]"
71
+ end
72
+ resp
73
+ end
74
+
75
+ def add_headers o, headers
76
+ headers.each do |line|
77
+ m = /\A([^:]+):\s*/.match(line) or raise "Unable to parse header line [#{line}]"
78
+ o[m[1]] = m.post_match
79
+ end
80
+ end
81
+
82
+ def split_headers str
83
+ index = str.index("\r\n\r\n")
84
+ return str[0 .. index].split("\r\n"), str[index + 4 .. -1]
85
+ end
86
+
87
+ def read_chunked str
88
+ if str.nil? || (str == "\r\n")
89
+ return ''
90
+ end
91
+ m = /\r\n([0-9a-fA-F]+)\r\n/.match(str) or raise "Unable to read chunked body in #{str.split("\r\n")[0]}"
92
+ len = m[1].hex
93
+ return '' if len == 0
94
+ m.post_match[0..len - 1] + read_chunked(m.post_match[len .. -1])
95
+ end
96
+
97
+ end
98
+
99
+ end
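Note on `read_chunked` above: it decodes HTTP chunked transfer encoding recursively, by matching each chunk-size line with a regular expression. For comparison, here is a minimal iterative sketch of the same decoding, assuming a complete, well-formed body. `decode_chunked` is a hypothetical name, not part of the gem, and trailers after the last chunk are ignored.

```ruby
require 'stringio'

# Decode a chunked HTTP body: each chunk is "<hex size>\r\n<data>\r\n",
# and a chunk of size 0 ends the body.
def decode_chunked(body)
  io = StringIO.new(body)
  out = String.new
  loop do
    size_line = io.gets("\r\n") or raise "Truncated chunked body"
    # Drop any chunk extension after ';' and read the size as hexadecimal.
    size = size_line.strip.split(";").first.to_s.hex
    break if size == 0
    chunk = io.read(size)
    raise "Truncated chunk" unless chunk && chunk.bytesize == size
    out << chunk
    io.read(2) # skip the CRLF that terminates the chunk data
  end
  out
end

puts decode_chunked("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
```

The iterative form avoids one level of recursion per chunk, which may matter for responses made of many small chunks.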
data/lib/pcap_tools/stream_processors/one_stream_filter.rb ADDED
@@ -0,0 +1,19 @@
1
+ module PcapTools
2
+
3
+ class TcpOneStreamFilter
4
+
5
+ def initialize target
6
+ @target = target
7
+ end
8
+
9
+ def process_stream stream
10
+ return nil if @target && stream[:index] != @target
11
+ stream
12
+ end
13
+
14
+ def finalize
15
+ end
16
+
17
+ end
18
+
19
+ end
data/lib/pcap_tools/stream_processors/rebuilder.rb ADDED
@@ -0,0 +1,33 @@
1
+ module PcapTools
2
+
3
+ class TcpStreamRebuilder
4
+
5
+ def process_stream stream
6
+ out = []
7
+ current = nil
8
+ stream[:data].each do |packet|
9
+ if current
10
+ if packet[:type] == current[:type]
11
+ current[:times] << {:offset => current[:size], :time => packet[:time]}
12
+ current[:data] += packet[:data]
13
+ current[:size] += packet[:size]
14
+ else
15
+ out << current
16
+ current = packet.clone
17
+ current[:times] = [{:offset => 0, :time => packet[:time]}]
18
+ end
19
+ else
20
+ current = packet.clone
21
+ current[:times] = [{:offset => 0, :time => packet[:time]}]
22
+ end
23
+ end
24
+ out << current if current
25
+ {:index => stream[:index], :data => out}
26
+ end
27
+
28
+ def finalize
29
+ end
30
+
31
+ end
32
+
33
+ end
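Note on `TcpStreamRebuilder` above: it merges consecutive packets that travel in the same direction into a single entry. The core idea can be sketched with `Enumerable#chunk_while` (Ruby 2.3 or later). This is an illustration, not the gem's implementation: `merge_runs` is a hypothetical name, and the sketch drops the `:times` offsets and the addressing keys that the real class keeps.

```ruby
# Merge consecutive packets of the same :type into one entry,
# concatenating their payloads and adding up their sizes.
def merge_runs(packets)
  packets.chunk_while { |a, b| a[:type] == b[:type] }.map do |run|
    {
      :type => run.first[:type],
      :time => run.first[:time],
      :data => run.map { |packet| packet[:data] }.join,
      :size => run.inject(0) { |sum, packet| sum + packet[:size] },
    }
  end
end

packets = [
  {:type => :out, :time => 1, :data => "GET ", :size => 4},
  {:type => :out, :time => 2, :data => "/ HTTP/1.1", :size => 10},
  {:type => :in, :time => 3, :data => "HTTP/1.1 200 OK", :size => 15},
]
merge_runs(packets).each { |entry| puts "#{entry[:type]} #{entry[:size]} #{entry[:data]}" }
```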
data/pcap_tools.gemspec CHANGED
@@ -2,7 +2,7 @@ require 'rake'
2
2
 
3
3
  Gem::Specification.new do |s|
4
4
  s.name = 'pcap_tools'
5
- s.version = '0.0.5'
5
+ s.version = '0.0.6'
6
6
  s.authors = ['Bertrand Paquet']
7
7
  s.email = 'bertrand.paquet@gmail.com'
8
8
  s.summary = 'Tools for extracting data from pcap files'
@@ -11,5 +11,6 @@ Gem::Specification.new do |s|
11
11
  s.files = `git ls-files`.split($/)
12
12
  s.license = 'BSD'
13
13
 
14
- s.add_runtime_dependency('bindata', '>= 1.6.0')
14
+ s.add_dependency('popen4', '0.1.2')
15
+ s.add_dependency('ox', '2.0.11')
15
16
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: pcap_tools
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.5
4
+ version: 0.0.6
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,24 +9,40 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2013-11-22 00:00:00.000000000 Z
12
+ date: 2013-12-08 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
- name: bindata
15
+ name: popen4
16
16
  requirement: !ruby/object:Gem::Requirement
17
17
  none: false
18
18
  requirements:
19
- - - ! '>='
19
+ - - '='
20
20
  - !ruby/object:Gem::Version
21
- version: 1.6.0
21
+ version: 0.1.2
22
22
  type: :runtime
23
23
  prerelease: false
24
24
  version_requirements: !ruby/object:Gem::Requirement
25
25
  none: false
26
26
  requirements:
27
- - - ! '>='
27
+ - - '='
28
28
  - !ruby/object:Gem::Version
29
- version: 1.6.0
29
+ version: 0.1.2
30
+ - !ruby/object:Gem::Dependency
31
+ name: ox
32
+ requirement: !ruby/object:Gem::Requirement
33
+ none: false
34
+ requirements:
35
+ - - '='
36
+ - !ruby/object:Gem::Version
37
+ version: 2.0.11
38
+ type: :runtime
39
+ prerelease: false
40
+ version_requirements: !ruby/object:Gem::Requirement
41
+ none: false
42
+ requirements:
43
+ - - '='
44
+ - !ruby/object:Gem::Version
45
+ version: 2.0.11
30
46
  description:
31
47
  email: bertrand.paquet@gmail.com
32
48
  executables:
@@ -34,10 +50,19 @@ executables:
34
50
  extensions: []
35
51
  extra_rdoc_files: []
36
52
  files:
53
+ - .gitignore
54
+ - Gemfile
55
+ - Gemfile.lock
37
56
  - README.markdown
38
57
  - bin/pcap_tools
39
- - lib/pcap_parser.rb
40
58
  - lib/pcap_tools.rb
59
+ - lib/pcap_tools/loader.rb
60
+ - lib/pcap_tools/packet_processors/frame.rb
61
+ - lib/pcap_tools/packet_processors/tcp.rb
62
+ - lib/pcap_tools/patches/http.rb
63
+ - lib/pcap_tools/stream_processors/http.rb
64
+ - lib/pcap_tools/stream_processors/one_stream_filter.rb
65
+ - lib/pcap_tools/stream_processors/rebuilder.rb
41
66
  - pcap_tools.gemspec
42
67
  homepage: https://github.com/bpaquet/pcap_tools
43
68
  licenses:
@@ -60,9 +85,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
60
85
  version: '0'
61
86
  requirements: []
62
87
  rubyforge_project:
63
- rubygems_version: 1.8.24
88
+ rubygems_version: 1.8.23
64
89
  signing_key:
65
90
  specification_version: 3
66
91
  summary: Tools for extracting data from pcap files
67
92
  test_files: []
68
- has_rdoc:
data/lib/pcap_parser.rb DELETED
@@ -1,218 +0,0 @@
1
- require 'bindata'
2
-
3
- module PcapTools
4
-
5
- module Parser
6
-
7
- module HasParent
8
-
9
- attr_accessor :parent
10
-
11
- end
12
-
13
- class PcapFile < BinData::Record
14
- endian :little
15
-
16
- struct :header do
17
- uint32 :magic
18
- uint16 :major
19
- uint16 :minor
20
- int32 :this_zone
21
- uint32 :sig_figs
22
- uint32 :snaplen
23
- uint32 :linktype
24
- end
25
-
26
- array :packets, :read_until => :eof do
27
- uint32 :ts_sec
28
- uint32 :ts_usec
29
- uint32 :incl_len
30
- uint32 :orig_len
31
- string :data, :length => :incl_len
32
- end
33
-
34
- end
35
-
36
- # Present IP addresses in a human readable way
37
- class IPAddr < BinData::Primitive
38
- array :octets, :type => :uint8, :initial_length => 4
39
-
40
- def set(val)
41
- ints = val.split(/\./).collect { |int| int.to_i }
42
- self.octets = ints
43
- end
44
-
45
- def get
46
- self.octets.collect { |octet| "%d" % octet }.join(".")
47
- end
48
- end
49
-
50
- # TCP Protocol Data Unit
51
- class TCP_PDU < BinData::Record
52
- mandatory_parameter :packet_length
53
-
54
- endian :big
55
-
56
- uint16 :src_port
57
- uint16 :dst_port
58
- uint32 :seq
59
- uint32 :ack_seq
60
- bit4 :doff
61
- bit4 :res1
62
- bit2 :res2
63
- bit1 :urg
64
- bit1 :ack
65
- bit1 :psh
66
- bit1 :rst
67
- bit1 :syn
68
- bit1 :fin
69
- uint16 :window
70
- uint16 :checksum
71
- uint16 :urg_ptr
72
- string :options, :read_length => :options_length_in_bytes
73
- string :payload, :read_length => lambda { packet_length - payload.rel_offset }
74
-
75
- def options_length_in_bytes
76
- (doff - 5 ) * 4
77
- end
78
-
79
- def type
80
- "TCP"
81
- end
82
-
83
- include HasParent
84
-
85
- end
86
-
87
- # UDP Protocol Data Unit
88
- class UDP_PDU < BinData::Record
89
- mandatory_parameter :packet_length
90
-
91
- endian :big
92
-
93
- uint16 :src_port
94
- uint16 :dst_port
95
- uint16 :len
96
- uint16 :checksum
97
- string :payload, :read_length => lambda { packet_length - payload.rel_offset }
98
-
99
- def type
100
- "UDP"
101
- end
102
-
103
- include HasParent
104
- end
105
-
106
- # IP Protocol Data Unit
107
- class IP_PDU < BinData::Record
108
- endian :big
109
-
110
- bit4 :version, :asserted_value => 4
111
- bit4 :header_length
112
- uint8 :tos
113
- uint16 :total_length
114
- uint16 :ident
115
- bit3 :flags
116
- bit13 :frag_offset
117
- uint8 :ttl
118
- uint8 :protocol
119
- uint16 :checksum
120
- ip_addr :src_addr
121
- ip_addr :dst_addr
122
- string :options, :read_length => :options_length_in_bytes
123
- choice :payload, :selection => :protocol do
124
- tcp_pdu 6, :packet_length => :payload_length_in_bytes
125
- udp_pdu 17, :packet_length => :payload_length_in_bytes
126
- string :default, :read_length => :payload_length_in_bytes
127
- end
128
-
129
- def header_length_in_bytes
130
- header_length * 4
131
- end
132
-
133
- def options_length_in_bytes
134
- header_length_in_bytes - options.rel_offset
135
- end
136
-
137
- def payload_length_in_bytes
138
- total_length - header_length_in_bytes
139
- end
140
-
141
- def type
142
- "IP"
143
- end
144
-
145
- include HasParent
146
- end
147
-
148
- class MacAddr < BinData::Primitive
149
- array :octets, :type => :uint8, :initial_length => 6
150
-
151
- def set(val)
152
- ints = val.split(/\./).collect { |int| int.to_i }
153
- self.octets = ints
154
- end
155
-
156
- def get
157
- self.octets.collect { |octet| "%02x" % octet }.join(":")
158
- end
159
- end
160
-
161
- IPV4 = 0x0800
162
- class Ethernet < BinData::Record
163
- endian :big
164
-
165
- mac_addr :dst
166
- mac_addr :src
167
- uint16 :protocol
168
- choice :payload, :selection => :protocol do
169
- ip_pdu IPV4
170
- rest :default
171
- end
172
-
173
- include HasParent
174
- end
175
-
176
- class LinuxCookedCapture < BinData::Record
177
- endian :big
178
-
179
- uint16 :type
180
- uint16 :address_type
181
- uint16 :address_len
182
- array :octets, :type => :uint8, :initial_length => 8
183
- uint16 :protocol
184
- choice :payload, :selection => :protocol do
185
- ip_pdu IPV4
186
- rest :default
187
- end
188
-
189
- include HasParent
190
- end
191
-
192
- def load_file f
193
- packets = []
194
- File.open(f, 'rb') do |io|
195
- content = PcapFile.read(io)
196
- raise 'Wrong endianess' unless content.header.magic.to_i.to_s(16) == "a1b2c3d4"
197
- content.packets.each do |original_packet|
198
- packet = case content.header.linktype
199
- when 113 then LinuxCookedCapture.read(original_packet.data)
200
- when 1 then Ethernet.read(original_packet.data)
201
- else raise "Unknown network #{content.header.linktype}"
202
- end
203
- packet.parent = original_packet
204
- while packet.respond_to?(:payload) && packet.payload.is_a?(BinData::Choice)
205
- packet.payload.parent = packet
206
- packet = packet.payload
207
- end
208
- packets << packet
209
- end
210
- end
211
- packets
212
- end
213
-
214
- module_function :load_file
215
-
216
- end
217
-
218
- end
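The deleted `PcapFile` record above described the classic pcap layout with BinData: a 24-byte little-endian global header followed by records of a 16-byte header plus `incl_len` bytes of data. As a rough illustration of that same layout, here is a minimal sketch using only the Ruby standard library. The helper names `read_pcap_header` and `read_pcap_packets` are hypothetical and are not part of pcap_tools; only the field names, the magic value `a1b2c3d4`, and the little-endian assumption are taken from the removed file.

```ruby
require 'stringio'

# Magic number checked by the removed load_file method.
PCAP_MAGIC = 0xa1b2c3d4

# Hypothetical helper: read the 24-byte global header.
# Field names mirror the removed PcapFile#header struct.
def read_pcap_header(io)
  raw = io.read(24)
  raise 'Truncated pcap header' if raw.nil? || raw.bytesize < 24
  # V = uint32 LE, v = uint16 LE, l< = int32 LE
  magic, major, minor, this_zone, sig_figs, snaplen, linktype =
    raw.unpack('Vvvl<VVV')
  raise 'Unexpected magic number (wrong endianness?)' unless magic == PCAP_MAGIC
  { major: major, minor: minor, this_zone: this_zone,
    sig_figs: sig_figs, snaplen: snaplen, linktype: linktype }
end

# Hypothetical helper: read every packet record that follows the header.
def read_pcap_packets(io)
  packets = []
  until io.eof?
    raw = io.read(16)
    raise 'Truncated packet header' if raw.nil? || raw.bytesize < 16
    ts_sec, ts_usec, incl_len, orig_len = raw.unpack('VVVV')
    data = io.read(incl_len) || ''
    raise 'Truncated packet data' if data.bytesize < incl_len
    packets << { ts_sec: ts_sec, ts_usec: ts_usec,
                 incl_len: incl_len, orig_len: orig_len, data: data }
  end
  packets
end

# Usage on a synthetic in-memory capture (linktype 1 = Ethernet).
sample = [PCAP_MAGIC, 2, 4, 0, 0, 65_535, 1].pack('Vvvl<VVV') +
         [1, 500, 3, 3].pack('VVVV') + 'abc'.b
io = StringIO.new(sample)
pcap_header = read_pcap_header(io)
pcap_packets = read_pcap_packets(io)
puts "linktype=#{pcap_header[:linktype]} packets=#{pcap_packets.size}"
```

Like the removed code, this sketch rejects files whose magic number does not read as `a1b2c3d4` in little-endian order. It does not decode the link layer, IP, or TCP headers that the deleted `Ethernet`, `IP_PDU`, and `TCP_PDU` classes handled.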