pcap_tools 0.0.2

data/README.markdown ADDED
@@ -0,0 +1,90 @@
1
+ # What is it?
2
+
3
+ It's a Ruby library for processing tcpdump capture files: it supports offline analysis of captures written by tcpdump.
4
+
5
+ Main features:
6
+
7
+ * Rebuild TCP streams
8
+ * Extract and parse HTTP requests
9
+
10
+ # How to use it
11
+
12
+ ## Make a tcpdump
13
+
14
+ * `tcpdump -w out.pcap -s 4096 <filter>`
15
+ * Get the output file out.pcap
16
+
17
+ Adjust the 4096 value (the `-s` snapshot length) to the maximum packet size you want to capture.
18
+
19
+ ## Write a ruby script
20
+
21
+ require 'pcap_tools'
22
+
23
+ # Load tcpdump file
24
+ captures = [PacketFu::PcapFile.file_to_array('out.pcap')]
25
+
26
+ ## Available functions
27
+
28
+ ### Extract tcp streams
29
+
30
+ This function rebuilds TCP streams from an array of pcap capture objects.
31
+
32
+ tcp_streams = PcapTools::extract_tcp_streams(captures)
33
+
34
+ `tcp_streams` is an array of streams; each stream is an array of hashes, one per packet. Each hash has the following keys:
35
+
36
+ * `:type` : `:in` if the packet was received, `:out` if it was sent
37
+ * `:time` : timestamp of packet
38
+ * `:data` : payload of packet
39
+
40
+ Remarks :
41
+
42
+ * Packets are in the right order
43
+ * Packets are not merged (e.g. an HTTP response can be split across several consecutive packets,
44
+ with the same type `:in` or `:out`).
45
+ To reassemble consecutive packets of the same type, use `stream.rebuild_packets`
46
+
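The merging step described in the remarks above can be sketched with plain hashes. This is an illustration only: `merge_segments` and the sample data are hypothetical stand-ins, not the library's `rebuild_packets`, and the sketch assumes each segment carries just `:type` and `:data`.

```ruby
# Hypothetical stand-in for a TCP stream: a plain array of segment hashes.
segments = [
  { type: :out, data: "GET / HTTP/1.1\r\n" },
  { type: :out, data: "Host: example.org\r\n\r\n" },
  { type: :in,  data: "HTTP/1.1 200 OK\r\n" },
  { type: :in,  data: "Content-Length: 0\r\n\r\n" }
]

# Merge each run of consecutive segments that share the same :type.
# The input array and its hashes are left untouched.
def merge_segments(segments)
  segments.chunk_while { |a, b| a[:type] == b[:type] }.map do |run|
    run.first.merge(data: run.map { |s| s[:data] }.join)
  end
end

merged = merge_segments(segments)
merged.each { |s| puts "#{s[:type]}: #{s[:data].bytesize} bytes" }
```

After merging, a request and its response each occupy one entry, which is the shape an HTTP parser would want to consume.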
47
+ ### Extract http calls
48
+
49
+ This function extracts HTTP calls from a TCP stream returned by the `extract_tcp_streams` function.
50
+
51
+ http_calls = PcapTools::extract_http_calls(stream)
52
+
53
+ `http_calls` is an array of `http_call`.
54
+
55
+ An `http_call` is an array of two objects:
56
+
57
+ * The HTTP request, an instance of `Net::HTTPRequest`, e.g. `Net::HTTP::Get` or `Net::HTTP::Post`. You can use this object
58
+ like any HTTP request from the [standard library `net/http`](http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/index.html)
59
+ * `req.path` : get the request path
60
+ * `req['User-Agent']` : get the User-Agent
61
+ * `req.body` : get the request body
62
+ * ...
63
+ * The HTTP response, an instance of `Net::HTTPResponse`, e.g. `Net::HTTPOK` or `Net::HTTPMovedPermanently`. You can use this object
64
+ like any HTTP response from the [standard library `net/http`](http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/index.html)
65
+ * `resp.code` : get the http return code
66
+ * `resp['User-Agent']` : get the User-Agent
67
+ * `resp.body` : get the response body
68
+ * ...
69
+
70
+ The response can be `nil` if there is no response in the tcp stream.
71
+
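Since the response half of a pair may be `nil`, callers will probably want to guard before reading it. A minimal sketch, using hand-built `net/http` objects in place of a parsed capture; `describe_call` is a hypothetical helper and not part of pcap_tools.

```ruby
require 'net/http'

# Stand-in data shaped like the documented result: [request, response] pairs.
# These objects are built by hand for illustration, not parsed from a capture.
req = Net::HTTP::Get.new('/index.html')
req['User-Agent'] = 'curl/7.0'
http_calls = [[req, nil]]

# Summarize one call, tolerating a missing response.
def describe_call(request, response)
  status = response ? response.code : 'no response captured'
  "#{request.method} #{request.path} (#{request['User-Agent']}) -> #{status}"
end

http_calls.each { |request, response| puts describe_call(request, response) }
```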
72
+ The request and response objects have an additional attribute:
73
+
74
+ * `req.time` / `resp.time` : the time at which the request or response was captured
75
+
76
+ For the response body, the following `Content-Encoding` values are honored:
77
+
78
+ * gzip
79
+
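How a gzip-encoded body can be decoded with the standard library is sketched below. `decode_body` is a hypothetical helper name, and the compressed input is produced in place with `Zlib.gzip` rather than taken from a capture.

```ruby
require 'zlib'
require 'stringio'

# Return the body as-is unless the Content-Encoding header says gzip,
# in which case inflate it through a GzipReader over an in-memory IO.
def decode_body(body, content_encoding)
  return body unless content_encoding == 'gzip'
  Zlib::GzipReader.new(StringIO.new(body)).read
end

compressed = Zlib.gzip('hello pcap') # stand-in for a captured gzip body
puts decode_body(compressed, 'gzip')
puts decode_body('plain text', nil)
```

Other encodings pass through unchanged in this sketch, which matches the single encoding listed above.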
80
+ ### Extract http calls from captures
81
+
82
+ Both steps in one call: extract HTTP calls from an array of capture objects.
83
+
84
+ http_calls = PcapTools::extract_http_calls_from_captures(captures)
85
+
86
+ ### Load multiple files
87
+
88
+ Load multiple pcap files, ordered by modification time. Useful when you run `tcpdump -C 5 -W 100000` to split the captured data into files of about 5 MB each.
89
+
90
+ captures = PcapTools::load_mutliple_files '*pcap*'
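The time ordering can be illustrated without any capture data. The sketch below only lists matching files, oldest first. `files_in_time_order` and the file names are hypothetical, and the sketch assumes modification time is a good proxy for capture order.

```ruby
require 'tmpdir'
require 'fileutils'

# List files matching a glob pattern, oldest first (by modification time).
def files_in_time_order(pattern)
  Dir.glob(pattern).sort_by { |path| File.mtime(path) }
end

Dir.mktmpdir do |dir|
  # Create three empty files whose mtimes disagree with their name order.
  now = Time.now
  { 'out.pcap2' => now - 30, 'out.pcap0' => now - 20, 'out.pcap1' => now - 10 }.each do |name, mtime|
    path = File.join(dir, name)
    FileUtils.touch(path)
    File.utime(mtime, mtime, path)
  end

  ordered = files_in_time_order(File.join(dir, '*pcap*'))
  puts ordered.map { |path| File.basename(path) }.inspect
end
```

Sorting by name alone would not be reliable here, which is presumably why the library sorts on modification time.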
data/bin/pcap_tools_http ADDED
@@ -0,0 +1,37 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require 'pcap_tools'
4
+ require 'optparse'
5
+
6
+ options = {}
7
+ OptionParser.new do |opts|
8
+ opts.banner = "Usage: pcap_tools_http [options] pcap_files"
9
+
10
+ opts.on("--no-body", "Do not display body") do
11
+ options[:no_body] = true
12
+ end
13
+ end.parse!
14
+
15
+ data = ARGV.map{|f| PacketFu::PcapFile.file_to_array(f)}
16
+
17
+ tcps = PcapTools::extract_tcp_streams(data)
18
+
19
+ tcps.each do |tcp|
20
+ PcapTools::extract_http_calls(tcp).each do |req, resp|
21
+ puts ">>>> #{req["pcap-src"]}:#{req["pcap-src-port"]} > #{req["pcap-dst"]}:#{req["pcap-dst-port"]}"
22
+ puts "#{req.method} #{req.path}"
23
+ req.each_capitalized_name.reject{|x| x =~ /^Pcap/ }.each do |x|
24
+ puts "#{x}: #{req[x]}"
25
+ end
26
+ puts
27
+ puts req.body unless options[:no_body]
28
+ puts "<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< #{resp.time}"
29
+ puts "#{resp.code} #{resp.message}"
30
+ resp.each_capitalized_name.reject{|x| x =~ /^Pcap/ }.each do |x|
31
+ puts "#{x}: #{resp[x]}"
32
+ end
33
+ puts
34
+ puts resp.body unless options[:no_body]
35
+ puts
36
+ end
37
+ end
data/lib/pcap_tools.rb ADDED
@@ -0,0 +1,191 @@
1
+ require 'rubygems'
2
+ require 'packetfu'
3
+ require 'net/http'
4
+ require 'zlib'
5
+
6
+ module Net
7
+
8
+ class HTTPRequest
9
+ attr_accessor :time
10
+ end
11
+
12
+ class HTTPResponse
13
+ attr_accessor :time
14
+
15
+ def body= body
16
+ @body = body
17
+ @read = true
18
+ end
19
+
20
+ end
21
+
22
+ end
23
+
24
+ module PcapTools
25
+
26
+ class TcpStream < Array
27
+
28
+ def insert_tcp sym, packet
29
+ data = packet.payload
30
+ return if data.size == 0
31
+ self << {:type => sym, :data => data, :from => packet.ip_saddr, :to => packet.ip_daddr, :from_port => packet.tcp_src, :to_port => packet.tcp_dst}
32
+ end
33
+
34
+ def rebuild_packets
35
+ out = TcpStream.new
36
+ current = nil
37
+ self.each do |packet|
38
+ if current
39
+ if packet[:type] == current[:type]
40
+ current[:data] += packet[:data]
41
+ else
42
+ out << current
43
+ current = packet.clone
44
+ end
45
+ else
46
+ current = packet.clone
47
+ end
48
+ end
49
+ out << current if current
50
+ out
51
+ end
52
+
53
+ end
54
+
55
+ def load_mutliple_files dir
56
+ Dir.glob(dir).sort_by{|file| File.mtime(file)}.map{|file| PacketFu::PcapFile.file_to_array(file)}
57
+ end
58
+
59
+ module_function :load_mutliple_files
60
+
61
+ def extract_http_calls_from_captures captures
62
+ calls = []
63
+ extract_tcp_streams(captures).each do |tcp|
64
+ calls.concat(extract_http_calls(tcp))
65
+ end
66
+ calls
67
+ end
68
+
69
+ module_function :extract_http_calls_from_captures
70
+
71
+ def extract_tcp_streams captures
72
+ packets = []
73
+ captures.each do |capture|
74
+ capture.each do |packet|
75
+ packets << PacketFu::Packet.parse(packet)
76
+ end
77
+ end
78
+
79
+ streams = []
80
+ packets.each_with_index do |packet, k|
81
+ if packet.is_a?(PacketFu::TCPPacket) && packet.tcp_flags.syn == 1 && packet.tcp_flags.ack == 0
82
+ kk = k
83
+ tcp = TcpStream.new
84
+ while kk < packets.size
85
+ packet2 = packets[kk]
86
+ if packet2.is_a?(PacketFu::TCPPacket)
87
+ if packet.tcp_dst == packet2.tcp_dst && packet.tcp_src == packet2.tcp_src
88
+ tcp.insert_tcp :out, packet2
89
+ break if packet.tcp_flags.fin == 1 || packet2.tcp_flags.fin == 1
90
+ end
91
+ if packet.tcp_dst == packet2.tcp_src && packet.tcp_src == packet2.tcp_dst
92
+ tcp.insert_tcp :in, packet2
93
+ break if packet.tcp_flags.fin == 1 || packet2.tcp_flags.fin == 1
94
+ end
95
+ end
96
+ kk += 1
97
+ end
98
+ streams << tcp
99
+ end
100
+ end
101
+ streams
102
+ end
103
+
104
+ module_function :extract_tcp_streams
105
+
106
+ def extract_http_calls stream
107
+ rebuilded = stream.rebuild_packets
108
+ calls = []
109
+ data_out = ""
110
+ data_in = nil
111
+ k = 0
112
+ while k < rebuilded.size
113
+ begin
114
+ req = HttpParser::parse_request(rebuilded[k])
115
+ resp = k + 1 < rebuilded.size ? HttpParser::parse_response(rebuilded[k + 1]) : nil
116
+ calls << [req, resp]
117
+ rescue StandardError => e
118
+ warn "Unable to parse http call : #{e}"
119
+ end
120
+ k += 2
121
+ end
122
+ calls
123
+ end
124
+
125
+ module_function :extract_http_calls
126
+
127
+ module HttpParser
128
+
129
+ def parse_request stream
130
+ headers, body = split_headers(stream[:data])
131
+ line0 = headers.shift
132
+ m = /(\S+)\s+(\S+)\s+(\S+)/.match(line0) or raise "Unable to parse first line of http request #{line0}"
133
+ clazz = {'POST' => Net::HTTP::Post, 'GET' => Net::HTTP::Get, 'PUT' => Net::HTTP::Put}[m[1]] or raise "Unknown http request type #{m[1]}"
134
+ req = clazz.new m[2]
135
+ req['Pcap-Src'] = stream[:from]
136
+ req['Pcap-Src-Port'] = stream[:from_port]
137
+ req['Pcap-Dst'] = stream[:to]
138
+ req['Pcap-Dst-Port'] = stream[:to_port]
139
+ req.time = stream[:time]
140
+ req.body = body
141
+ add_headers req, headers
142
+ req.body.size == req['Content-Length'].to_i or raise "Wrong content-length for http request, header say #{req['Content-Length'].chomp}, found #{req.body.size}"
143
+ req
144
+ end
145
+
146
+ module_function :parse_request
147
+
148
+ def parse_response stream
149
+ headers, body = split_headers(stream[:data])
150
+ line0 = headers.shift
151
+ m = /^(\S+)\s+(\S+)\s+(.*)$/.match(line0) or raise "Unable to parse first line of http response #{line0}"
152
+ resp = Net::HTTPResponse.send(:response_class, m[2]).new(m[1], m[2], m[3])
153
+ resp.time = stream[:time]
154
+ add_headers resp, headers
155
+ if resp.chunked?
156
+ resp.body = read_chunked("\r\n" + body)
157
+ else
158
+ resp.body = body
159
+ resp.body.size == resp['Content-Length'].to_i or raise "Wrong content-length for http response, header say #{resp['Content-Length'].chomp}, found #{resp.body.size}"
160
+ end
161
+ resp.body = Zlib::GzipReader.new(StringIO.new(resp.body)).read if resp['Content-Encoding'] == 'gzip'
162
+ resp
163
+ end
164
+
165
+ module_function :parse_response
166
+
167
+ private
168
+
169
+ def self.add_headers o, headers
170
+ headers.each do |line|
171
+ m = /\A([^:]+):\s*/.match(line) or raise "Unable to parse line #{line}"
172
+ o[m[1]] = m.post_match
173
+ end
174
+ end
175
+
176
+ def self.split_headers str
177
+ index = str.index("\r\n\r\n") or raise "Unable to find end of http headers"
178
+ return str[0 ... index].split("\r\n"), str[index + 4 .. -1]
179
+ end
180
+
181
+ def self.read_chunked str
182
+ return "" if str == "\r\n"
183
+ m = /\r\n([0-9a-fA-F]+)\r\n/.match(str) or raise "Unable to read chunked body in #{str.split("\r\n")[0]}"
184
+ len = m[1].hex
185
+ return "" if len == 0
186
+ m.post_match[0..len - 1] + read_chunked(m.post_match[len .. -1])
187
+ end
188
+
189
+ end
190
+
191
+ end
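For comparison with the recursive `read_chunked` helper in the file above, here is an iterative sketch of chunked transfer decoding. `decode_chunked` is a hypothetical name and is not the library's method. The sketch assumes the body starts directly at the first chunk-size line, and it ignores chunk extensions and trailers.

```ruby
# Decode an HTTP/1.1 chunked body: "<hex size>\r\n<data>\r\n" repeated,
# terminated by a zero-size chunk.
def decode_chunked(body)
  body = body.b          # work on bytes so sizes and offsets agree
  out = String.new       # empty binary string used as the accumulator
  pos = 0
  loop do
    line_end = body.index("\r\n", pos) or raise ArgumentError, 'missing chunk size line'
    size = body[pos...line_end][/\A\h+/] or raise ArgumentError, 'invalid chunk size'
    len = size.hex
    break if len.zero?   # the zero-size chunk ends the body
    start = line_end + 2
    chunk = body[start, len]
    raise ArgumentError, 'truncated chunk' if chunk.nil? || chunk.bytesize < len
    out << chunk
    pos = start + len + 2 # skip the chunk data and its trailing CRLF
  end
  out
end

puts decode_chunked("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
```

A loop avoids deep recursion on bodies made of many small chunks, and reading the size at a known offset avoids matching stray hex lines inside the payload.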
data/pcap_tools.gemspec ADDED
@@ -0,0 +1,15 @@
1
+ require 'rake'
2
+
3
+ Gem::Specification.new do |s|
4
+ s.name = 'pcap_tools'
5
+ s.version = '0.0.2'
6
+ s.authors = ['Bertrand Paquet']
7
+ s.email = 'bertrand.paquet@gmail.com'
8
+ s.summary = 'Tools for extracting data from pcap files'
9
+ s.homepage = 'https://github.com/bpaquet/pcap_tools'
10
+ s.executables << 'pcap_tools_http'
11
+ s.files = `git ls-files`.split($/)
12
+ s.license = 'BSD'
13
+
14
+ s.add_development_dependency('packetfu', '>= 1.1.9')
15
+ end
metadata ADDED
@@ -0,0 +1,66 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: pcap_tools
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.0.2
5
+ prerelease:
6
+ platform: ruby
7
+ authors:
8
+ - Bertrand Paquet
9
+ autorequire:
10
+ bindir: bin
11
+ cert_chain: []
12
+ date: 2013-09-25 00:00:00.000000000 Z
13
+ dependencies:
14
+ - !ruby/object:Gem::Dependency
15
+ name: packetfu
16
+ requirement: !ruby/object:Gem::Requirement
17
+ none: false
18
+ requirements:
19
+ - - ! '>='
20
+ - !ruby/object:Gem::Version
21
+ version: 1.1.9
22
+ type: :development
23
+ prerelease: false
24
+ version_requirements: !ruby/object:Gem::Requirement
25
+ none: false
26
+ requirements:
27
+ - - ! '>='
28
+ - !ruby/object:Gem::Version
29
+ version: 1.1.9
30
+ description:
31
+ email: bertrand.paquet@gmail.com
32
+ executables:
33
+ - pcap_tools_http
34
+ extensions: []
35
+ extra_rdoc_files: []
36
+ files:
37
+ - README.markdown
38
+ - bin/pcap_tools_http
39
+ - lib/pcap_tools.rb
40
+ - pcap_tools.gemspec
41
+ homepage: https://github.com/bpaquet/pcap_tools
42
+ licenses:
43
+ - BSD
44
+ post_install_message:
45
+ rdoc_options: []
46
+ require_paths:
47
+ - lib
48
+ required_ruby_version: !ruby/object:Gem::Requirement
49
+ none: false
50
+ requirements:
51
+ - - ! '>='
52
+ - !ruby/object:Gem::Version
53
+ version: '0'
54
+ required_rubygems_version: !ruby/object:Gem::Requirement
55
+ none: false
56
+ requirements:
57
+ - - ! '>='
58
+ - !ruby/object:Gem::Version
59
+ version: '0'
60
+ requirements: []
61
+ rubyforge_project:
62
+ rubygems_version: 1.8.24
63
+ signing_key:
64
+ specification_version: 3
65
+ summary: Tools for extracting data from pcap files
66
+ test_files: []