dap 0.0.10 → 0.0.11

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 8b61714c9b3553759bb7c726aad187acbf2adddf
4
- data.tar.gz: e5f0ea147aa9a6f9fbcbb09854221b3d247eb467
3
+ metadata.gz: e8a027b00cf9225a71e0a599704a788c23200f2c
4
+ data.tar.gz: d5bf2864eb1ac5290a19bef4de11e76a3da5dcac
5
5
  SHA512:
6
- metadata.gz: 53c0c68b19babf428a673063963122bb57f1e185a8205423f29c5e66ee9b4e2e3f2cfa5e905741a2bcf742358b2c5ddaef689edb862cb152a68ba453c7e8f592
7
- data.tar.gz: fa2601b3d4bcaa99f0e3c4087e101e80f423a42228b95d53325c9bf31355aa8b186c3801f2a20b06a8cc1f4791ac6dfd1bc0b631fc829d4afa7e97b211fad25f
6
+ metadata.gz: 7e8aab164ce8b3be600320d2f9b32fe3da1ef76137efb6af3608bce035667b5b7422325a724dd232e105eabca7a47bdf81897a0ec2a6959bcffa45620f6c8154
7
+ data.tar.gz: 4ceecd71b00e4fef27dd33434ef9e55cb0f96ed38310fc93fafee9c1a2e60949525cb302bb0722fef69147732cf6b922ec391cecf2cb6b9331e83c9211a6d114
data/README.md CHANGED
@@ -9,33 +9,62 @@ DAP reads data using an input plugin, transforms it through a series of filters,
9
9
 
10
10
  DAP was written to process terabyte-sized public scan datasets, such as those provided by https://scans.io/. Although DAP isn't particularly fast, it can be used across multiple cores (and machines) by splitting the input source and wrapping the execution with GNU Parallel.
11
11
 
12
- ## Prerequisites
13
12
 
14
- DAP depends on GeoIP (http://dev.maxmind.com/geoip/legacy/downloadable/) to be able to append geographic metadata to analyzed datasets. At least on Ubuntu, the libgeoip-dev package provides this capability.
13
+ ## Installation
15
14
 
16
- ## Usage
15
+ ### Prerequisites
17
16
 
18
- See [Samples](https://github.com/rapid7/dap/tree/master/samples)
17
+ DAP requires Ruby, and is best suited for systems with a relatively current version,
18
+ preferably one installed and managed by
19
+ [`rbenv`](https://github.com/rbenv/rbenv) or [`rvm`](https://rvm.io/). Using
20
+ system managed/installed Rubies is possible but fraught with peril.
19
21
 
20
- ### Quick Setup for GeoIP Lookups
22
+ DAP depends on [Maxmind's geoip database](http://dev.maxmind.com/geoip/legacy/downloadable/) to be able to append geographic metadata to analyzed datasets. If you intend on using this capability, run the following as `root`:
21
23
 
24
+ ```bash
25
+ mkdir -p /var/lib/geoip && cd /var/lib/geoip && wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz && gunzip GeoLiteCity.dat.gz && mv GeoLiteCity.dat geoip.dat
22
26
  ```
23
- $ git clone https://github.com/rapid7/dap.git
24
- $ cd dap
25
- $ gem install bundler
26
- $ bundle install
27
- $ sudo bash
28
- # mkdir -p /var/lib/geoip && cd /var/lib/geoip && wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz && gunzip GeoLiteCity.dat.gz && mv GeoLiteCity.dat geoip.dat
27
+
28
+ ### Ubuntu
29
+
30
+ ```bash
31
+ sudo apt-get install libgeoip-dev
32
+ gem install dap
29
33
  ```
30
34
 
35
+ ### OS X
36
+
37
+ ```bash
38
+ brew update
39
+ brew install geoip
40
+ gem install dap
41
+ ```
42
+
43
+ ## Usage
44
+
45
+ In its simplest form, DAP takes input, applies zero or more filters which modify the input, and then outputs the result. The input, filters and output are separated by plus signs (`+`). As seen from `dap -h`:
46
+
47
+ ```
48
+ Usage: dap [input] + [filter] + [output]
49
+ --inputs
50
+ --outputs
51
+ --filters
52
+ ```
53
+
54
+ To see which input/output formats are supported and what filters are available, run `dap --inputs`, `dap --outputs` or `dap --filters`, respectively.
55
+
56
+ This example reads as input a single IP address from `STDIN` in line form, applies geo-ip transformations as a filter on that line, and then returns the output as JSON:
57
+
31
58
  ```
32
59
  $ echo 8.8.8.8 | bin/dap + lines + geo_ip line + json
33
60
  {"line":"8.8.8.8","line.country_code":"US","line.country_code3":"USA","line.country_name":"United States","line.latitude":"38.0","line.longitude":"-97.0"}
34
61
  ```
35
62
 
36
- Where dap gets fun is doing transforms, like just grabbing the country code:
63
+ This example does the same, but only outputs the geo-ip country code:
64
+
37
65
  ```
38
66
  $ echo 8.8.8.8 | bin/dap + lines + geo_ip line + select line.country_code3 + lines
39
67
  USA
40
68
  ```
41
69
 
70
+ There are also several examples of how to use DAP along with sample datasets [here](samples).
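As a rough illustration of the `[input] + [filter] + [output]` syntax described in the README above, the following Ruby sketch splits an argument list into stages on `+`. The `split_stages` helper is hypothetical and is not DAP's actual argument parser; it also assumes that an empty stage (such as the one before the leading `+` in the README examples) is simply dropped.

```ruby
# Hypothetical helper: split a dap-style argument list into stages on "+".
# Illustration only; DAP's real command-line handling may differ.
def split_stages(argv)
  stages = [[]]
  argv.each do |arg|
    if arg == '+'
      stages << []          # a plus sign starts a new stage
    else
      stages.last << arg    # anything else belongs to the current stage
    end
  end
  stages.reject(&:empty?)   # assumption: empty stages are ignored
end

# Arguments as they would arrive for: dap + lines + geo_ip line + json
stages = split_stages(%w[+ lines + geo_ip line + json])

# First stage is the input, last is the output, everything between is a filter.
input, *filters, output = stages
puts "input:   #{input.inspect}"
puts "filters: #{filters.inspect}"
puts "output:  #{output.inspect}"
```

Under these assumptions, the first stage names the input plugin, the last names the output plugin, and each stage in between is a filter name followed by its arguments.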
data/bin/dap CHANGED
@@ -113,6 +113,7 @@ while true
113
113
  data = inp.read_record
114
114
  break if data == Dap::Input::Error::EOF
115
115
  next if data == Dap::Input::Error::Empty
116
+ next if data == Dap::Input::Error::InvalidFormat
116
117
 
117
118
  docs = [ data ]
118
119
 
@@ -4,6 +4,8 @@ module Filter
4
4
  require 'htmlentities'
5
5
  require 'shellwords'
6
6
  require 'uri'
7
+ require 'zlib'
8
+ require 'stringio'
7
9
 
8
10
  # Dirty element extractor, works around memory issues with Nokogiri
9
11
  module HTMLGhetto
@@ -191,9 +193,17 @@ class FilterDecodeHTTPReply
191
193
  # Some buggy systems exclude the header entirely
192
194
  body ||= head
193
195
 
196
+ if save["http_raw_headers"]["content-encoding"] == "gzip"
197
+ begin
198
+ gunzip = Zlib::GzipReader.new(StringIO.new(body))
199
+ body = gunzip.read.encode('UTF-8', :invalid=>:replace, :replace=>'?')
200
+ gunzip.close()
201
+ rescue
202
+ end
203
+ end
194
204
  save["http_body"] = body
195
205
 
196
- if body =~ /<title>([^>]+)</min
206
+ if body =~ /<title>([^>]+)</mi
197
207
  save["http_title"] = $1.strip
198
208
  end
199
209
 
data/lib/dap/input.rb CHANGED
@@ -2,22 +2,23 @@ module Dap
2
2
  module Input
3
3
 
4
4
  require 'oj'
5
-
5
+
6
6
  #
7
7
  # Error codes for failed reads
8
- #
8
+ #
9
9
  module Error
10
10
  EOF = :eof
11
11
  Empty = :empty
12
+ InvalidFormat = :invalid
12
13
  end
13
14
 
14
15
  module FileSource
15
-
16
+
16
17
  attr_accessor :fd
17
18
 
18
19
  def open(file_name)
19
20
  close
20
- self.fd = ['-', 'stdin', nil].include?(file_name) ?
21
+ self.fd = ['-', 'stdin', nil].include?(file_name) ?
21
22
  $stdin : ::File.open(file_name, "rb")
22
23
  end
23
24
 
@@ -31,7 +32,7 @@ module Input
31
32
  # Line Input
32
33
  #
33
34
  class InputLines
34
-
35
+
35
36
  include FileSource
36
37
 
37
38
  def initialize(args)
@@ -50,17 +51,23 @@ module Input
50
51
  # JSON Input (line-delimited records)
51
52
  #
52
53
  class InputJSON
53
-
54
+
54
55
  include FileSource
55
56
 
56
57
  def initialize(args)
57
58
  self.open(args.first)
58
59
  end
59
60
 
60
- def read_record
61
+ def read_record
61
62
  line = self.fd.readline rescue nil
62
63
  return Error::EOF unless line
63
- json = Oj.load(line.strip) rescue nil
64
+ begin
65
+ json = Oj.load(line.strip)
66
+ rescue
67
+ $stderr.puts "\nRecord is not valid JSON and will be skipped."
68
+ $stderr.puts line
69
+ return Error::InvalidFormat
70
+ end
64
71
  return Error::Empty unless json
65
72
  json
66
73
  end
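The interaction between the new `InvalidFormat` code and the read loop in `bin/dap` can be sketched in isolation. This is a minimal sketch, not the gem's code: it swaps in Ruby's standard `json` library for `Oj`, uses bare symbols in place of the `Dap::Input::Error` constants, and the `read_all` helper is hypothetical.

```ruby
require 'json'
require 'stringio'

# Read one line-delimited JSON record from io.
# Returns :eof, :empty or :invalid as sentinels (stand-ins for Dap::Input::Error).
def read_record(io)
  line = io.gets
  return :eof unless line

  text = line.strip
  return :empty if text.empty?

  begin
    json = JSON.parse(text)   # stdlib JSON used here instead of Oj
  rescue JSON::ParserError
    warn "Record is not valid JSON and will be skipped: #{text}"
    return :invalid
  end
  json.nil? ? :empty : json
end

# Hypothetical driver loop: stop at EOF, skip empty and invalid records.
def read_all(io)
  records = []
  loop do
    data = read_record(io)
    break if data == :eof
    next if data == :empty || data == :invalid
    records << data
  end
  records
end

input = StringIO.new(%({"a":1}\n\nnot json\n{"b":2}\n))
p read_all(input)
```

Returning a distinct sentinel for malformed input lets the caller skip the record without confusing it with a blank line, while the warning on standard error keeps the skipped record visible.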
@@ -166,6 +166,12 @@ class LDAP
166
166
  return [result_type, results]
167
167
  end
168
168
 
169
+ unless data.value && data.value.length > 1
170
+ result_type = 'Error'
171
+ results['errorMessage'] = 'parse_message: Invalid LDAP response (Empty Sequence)'
172
+ return [result_type, results]
173
+ end
174
+
169
175
  if data.value[1].tag == 4
170
176
  # SearchResultEntry found..
171
177
  result_type = 'SearchResultEntry'
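The guard added in this hunk protects the later `data.value[1]` access. A small standalone sketch of the same check, assuming only Ruby's bundled `openssl` library, is shown here; `guard_sequence` and its `'OK'` result are hypothetical placeholders, not part of DAP.

```ruby
require 'openssl'

# Hypothetical stand-in for the start of parse_message: reject sequences
# that do not carry at least two elements before indexing into them.
def guard_sequence(data)
  unless data.value && data.value.length > 1
    return ['Error', {
      'errorMessage' => 'parse_message: Invalid LDAP response (Empty Sequence)'
    }]
  end
  ['OK', {}] # placeholder result; the real method continues parsing here
end

empty = OpenSSL::ASN1::Sequence.new([])
two   = OpenSSL::ASN1::Sequence.new([
  OpenSSL::ASN1::Integer.new(7),
  OpenSSL::ASN1::OctetString.new('payload')
])

p guard_sequence(empty)
p guard_sequence(two)
```

Without such a check, indexing element 1 of an empty sequence yields `nil`, and calling `.tag` on it raises `NoMethodError`.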
data/lib/dap/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module Dap
2
- VERSION = "0.0.10"
2
+ VERSION = "0.0.11"
3
3
  end
@@ -0,0 +1,53 @@
1
+ require 'zlib'
2
+
3
+ describe Dap::Filter::FilterDecodeHTTPReply do
4
+ describe '.decode' do
5
+
6
+ let(:filter) { described_class.new(['data']) }
7
+
8
+
9
+ context 'decoding non-HTTP response' do
10
+ let(:decode) { filter.decode("This\r\nis\r\nnot\r\nHTTP\r\n\r\n") }
11
+ it 'returns an empty hash' do
12
+ expect(decode).to eq({})
13
+ end
14
+ end
15
+
16
+ context 'decoding uncompressed response' do
17
+ let(:decode) { filter.decode("HTTP/1.0 200 OK\r\nHeader1: value1\r\n\r\nstuff") }
18
+
19
+ it 'correctly sets status code' do
20
+ expect(decode['http_code']).to eq(200)
21
+ end
22
+
23
+ it 'correctly sets status message' do
24
+ expect(decode['http_message']).to eq('OK')
25
+ end
26
+
27
+ it 'correctly sets body' do
28
+ expect(decode['http_body']).to eq('stuff')
29
+ end
30
+
31
+ it 'correctly extracts header(s)' do
32
+ expect(decode['http_raw_headers']).to eq({'header1' => 'value1'})
33
+ end
34
+ end
35
+
36
+ context 'decoding gzip compressed response' do
37
+ let(:body) {
38
+ io = StringIO.new
39
+ io.set_encoding('ASCII-8BIT')
40
+ gz = Zlib::GzipWriter.new(io)
41
+ gz.write('stuff')
42
+ gz.close
43
+ io.string
44
+ }
45
+ let(:decode) { filter.decode("HTTP/1.0 200 OK\r\nContent-encoding: gzip\r\n\r\n#{body}") }
46
+
47
+ it 'correctly decompresses body' do
48
+ expect(decode['http_body']).to eq('stuff')
49
+ end
50
+ end
51
+
52
+ end
53
+ end
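The gzip handling exercised by the spec above can also be tried outside of RSpec. The following is a minimal sketch under stated assumptions: `maybe_gunzip` is a hypothetical helper that mirrors the filter's approach (decompress when the content encoding is `gzip`, keep the original body if decompression fails), and it rescues `Zlib::Error` specifically rather than using a bare `rescue`.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: return the decompressed body when content_encoding
# is "gzip", otherwise (or on any zlib failure) return the body unchanged.
def maybe_gunzip(body, content_encoding)
  return body unless content_encoding == 'gzip'

  gz = Zlib::GzipReader.new(StringIO.new(body))
  begin
    # Same re-encoding call as in the filter.
    gz.read.encode('UTF-8', invalid: :replace, replace: '?')
  ensure
    gz.close
  end
rescue Zlib::Error
  body # not valid gzip data: fall back to the raw body
end

compressed = Zlib.gzip('stuff')
puts maybe_gunzip(compressed, 'gzip')
puts maybe_gunzip('plain text', nil)
puts maybe_gunzip('not really gzip', 'gzip')
```

Falling back to the raw body keeps a mislabeled or truncated response from aborting the whole record, which matches the filter's choice to swallow decompression errors.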
@@ -57,6 +57,8 @@ describe Dap::Proto::LDAP do
57
57
 
58
58
  data = original.pack('H*')
59
59
 
60
+ excessive_len = ['308480010000000000000000'].pack('H*')
61
+
60
62
  entry = ['3030020107642b040030273025040b6f626a656374436c6173'\
61
63
  '7331160403746f70040f4f70656e4c444150726f6f74445345']
62
64
 
@@ -90,6 +92,31 @@ describe Dap::Proto::LDAP do
90
92
  expect(split_messages.class).to eq(::Array)
91
93
  end
92
94
  end
95
+
96
+ context 'testing message length greater than total data length' do
97
+ let(:split_messages) { subject.split_messages(excessive_len) }
98
+ it 'returns Array as expected' do
99
+ expect(split_messages.class).to eq(::Array)
100
+ end
101
+
102
+ it 'returns empty Array as expected' do
103
+ expect(split_messages).to eq([])
104
+ end
105
+ end
106
+
107
+ context 'testing empty ASN.1 Sequence' do
108
+ hex = ['308400000000']
109
+ empty_seq = hex.pack('H*')
110
+
111
+ let(:split_messages) { subject.split_messages(empty_seq) }
112
+ it 'returns Array as expected' do
113
+ expect(split_messages.class).to eq(::Array)
114
+ end
115
+
116
+ it 'returns empty Array as expected' do
117
+ expect(split_messages).to eq([])
118
+ end
119
+ end
93
120
  end
94
121
 
95
122
  describe '.parse_ldapresult' do
@@ -209,6 +236,24 @@ describe Dap::Proto::LDAP do
209
236
  end
210
237
  end
211
238
 
239
+ context 'testing empty ASN.1 Sequence' do
240
+
241
+ data = OpenSSL::ASN1::Sequence.new([])
242
+
243
+ let(:parse_message) { subject.parse_message(data) }
244
+ it 'returns Array as expected' do
245
+ expect(parse_message.class).to eq(::Array)
246
+ end
247
+
248
+ it 'returns error value as expected' do
249
+ test_val = ['Error', {
250
+ 'errorMessage' =>
251
+ 'parse_message: Invalid LDAP response (Empty Sequence)'
252
+ }]
253
+ expect(parse_message).to eq(test_val)
254
+ end
255
+ end
256
+
212
257
  end
213
258
 
214
259
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: dap
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.10
4
+ version: 0.0.11
5
5
  platform: ruby
6
6
  authors:
7
7
  - Rapid7 Research
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2016-08-02 00:00:00.000000000 Z
11
+ date: 2016-08-22 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: rspec
@@ -219,6 +219,7 @@ files:
219
219
  - samples/ssl_certs_org.sh
220
220
  - samples/udp-netbios.csv.bz2
221
221
  - samples/udp-netbios.sh
222
+ - spec/dap/filter/http_filter_spec.rb
222
223
  - spec/dap/filter/ldap_filter_spec.rb
223
224
  - spec/dap/proto/ipmi_spec.rb
224
225
  - spec/dap/proto/ldap_proto_spec.rb
@@ -253,6 +254,7 @@ signing_key:
253
254
  specification_version: 4
254
255
  summary: 'DAP: The Data Analysis Pipeline'
255
256
  test_files:
257
+ - spec/dap/filter/http_filter_spec.rb
256
258
  - spec/dap/filter/ldap_filter_spec.rb
257
259
  - spec/dap/proto/ipmi_spec.rb
258
260
  - spec/dap/proto/ldap_proto_spec.rb