dap 0.0.10 → 0.0.11

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 8b61714c9b3553759bb7c726aad187acbf2adddf
- data.tar.gz: e5f0ea147aa9a6f9fbcbb09854221b3d247eb467
+ metadata.gz: e8a027b00cf9225a71e0a599704a788c23200f2c
+ data.tar.gz: d5bf2864eb1ac5290a19bef4de11e76a3da5dcac
  SHA512:
- metadata.gz: 53c0c68b19babf428a673063963122bb57f1e185a8205423f29c5e66ee9b4e2e3f2cfa5e905741a2bcf742358b2c5ddaef689edb862cb152a68ba453c7e8f592
- data.tar.gz: fa2601b3d4bcaa99f0e3c4087e101e80f423a42228b95d53325c9bf31355aa8b186c3801f2a20b06a8cc1f4791ac6dfd1bc0b631fc829d4afa7e97b211fad25f
+ metadata.gz: 7e8aab164ce8b3be600320d2f9b32fe3da1ef76137efb6af3608bce035667b5b7422325a724dd232e105eabca7a47bdf81897a0ec2a6959bcffa45620f6c8154
+ data.tar.gz: 4ceecd71b00e4fef27dd33434ef9e55cb0f96ed38310fc93fafee9c1a2e60949525cb302bb0722fef69147732cf6b922ec391cecf2cb6b9331e83c9211a6d114
data/README.md CHANGED
@@ -9,33 +9,62 @@ DAP reads data using an input plugin, transforms it through a series of filters,
 
  DAP was written to process terabyte-sized public scan datasets, such as those provided by https://scans.io/. Although DAP isn't particularly fast, it can be used across multiple cores (and machines) by splitting the input source and wrapping the execution with GNU Parallel.
 
- ## Prerequisites
 
- DAP depends on GeoIP (http://dev.maxmind.com/geoip/legacy/downloadable/) to be able to append geographic metadata to analyzed datasets. At least on Ubuntu, the libgeoip-dev package provides this capability.
+ ## Installation
 
- ## Usage
+ ### Prerequisites
 
- See [Samples](https://github.com/rapid7/dap/tree/master/samples)
+ DAP requires Ruby, and is best suited for systems with a relatively current version,
+ preferably one installed and managed by
+ [`rbenv`](https://github.com/rbenv/rbenv) or [`rvm`](https://rvm.io/). Using
+ system managed/installed Rubies is possible but fraught with peril.
 
- ### Quick Setup for GeoIP Lookups
+ DAP depends on [Maxmind's geoip database](http://dev.maxmind.com/geoip/legacy/downloadable/) to be able to append geographic metadata to analyzed datasets. If you intend on using this capability, run the following as `root`:
 
+ ```bash
+ mkdir -p /var/lib/geoip && cd /var/lib/geoip && wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz && gunzip GeoLiteCity.dat.gz && mv GeoLiteCity.dat geoip.dat
  ```
- $ git clone https://github.com/rapid7/dap.git
- $ cd dap
- $ gem install bundler
- $ bundle install
- $ sudo bash
- # mkdir -p /var/lib/geoip && cd /var/lib/geoip && wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz && gunzip GeoLiteCity.dat.gz && mv GeoLiteCity.dat geoip.dat
+
+ ### Ubuntu
+
+ ```bash
+ sudo apt-get install libgeoip-dev
+ gem install dap
  ```
 
+ ### OS X
+
+ ```bash
+ brew update
+ brew install geoip
+ gem install dap
+ ```
+
+ ## Usage
+
+ In its simplest form, DAP takes input, applies zero or more filters which modify the input, and then outputs the result. The input, filters and output are separated by plus signs (`+`). As seen from `dap -h`:
+
+ ```
+ Uage: dap [input] + [filter] + [output]
+ --inputs
+ --outputs
+ --filters
+ ```
+
+ To see which input/output formats are supported and what filters are available, run `dap --inputs`,`dap --outputs` or `dap --filters`, respectively.
+
+ This example reads as input a single IP address from `STDIN` in line form, applies geo-ip transofmrations as a filter on that line, and then returns the output as JSON:
+
  ```
  $ echo 8.8.8.8 | bin/dap + lines + geo_ip line + json
  {"line":"8.8.8.8","line.country_code":"US","line.country_code3":"USA","line.country_name":"United States","line.latitude":"38.0","line.longitude":"-97.0"}
  ```
 
- Where dap gets fun is doing transforms, like just grabbing the country code:
+ This example does the same, but only outputs the geo-ip country code:
+
  ```
  $ echo 8.8.8.8 | bin/dap + lines + geo_ip line + select line.country_code3 + lines
  USA
  ```
 
+ There are also several examples of how to use DAP along with sample datasets [here](samples).
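
Review note: the new Usage text describes a command line whose stages are separated by `+`. The sketch below shows one way such an argument list could be split into stages. `split_stages` is a hypothetical helper written for illustration only; it is not taken from DAP, whose real argument parsing may differ.

```ruby
# Hypothetical helper (not DAP's parser): split an argument list into
# stages, treating each "+" as a separator between stages.
def split_stages(argv)
  stages = [[]]
  argv.each do |arg|
    if arg == '+'
      stages << []          # a plus sign starts a new stage
    else
      stages.last << arg    # anything else belongs to the current stage
    end
  end
  stages.reject(&:empty?)   # drop empty stages, e.g. before a leading "+"
end

# Arguments as they appear in the README example after `bin/dap`.
p split_stages(%w[+ lines + geo_ip line + json])
```

Each resulting stage is a name followed by its own arguments, which matches the `geo_ip line` form used in the README examples.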
data/bin/dap CHANGED
@@ -113,6 +113,7 @@ while true
  data = inp.read_record
  break if data == Dap::Input::Error::EOF
  next if data == Dap::Input::Error::Empty
+ next if data == Dap::Input::Error::InvalidFormat
 
  docs = [ data ]
 
@@ -4,6 +4,8 @@ module Filter
  require 'htmlentities'
  require 'shellwords'
  require 'uri'
+ require 'zlib'
+ require 'stringio'
 
  # Dirty element extractor, works around memory issues with Nokogiri
  module HTMLGhetto
@@ -191,9 +193,17 @@ class FilterDecodeHTTPReply
  # Some buggy systems exclude the header entirely
  body ||= head
 
+ if save["http_raw_headers"]["content-encoding"] == "gzip"
+ begin
+ gunzip = Zlib::GzipReader.new(StringIO.new(body))
+ body = gunzip.read.encode('UTF-8', :invalid=>:replace, :replace=>'?')
+ gunzip.close()
+ rescue
+ end
+ end
  save["http_body"] = body
 
- if body =~ /<title>([^>]+)</min
+ if body =~ /<title>([^>]+)</mi
  save["http_title"] = $1.strip
  end
 
data/lib/dap/input.rb CHANGED
@@ -2,22 +2,23 @@ module Dap
  module Input
 
  require 'oj'
-
+
  #
  # Error codes for failed reads
- #
+ #
  module Error
  EOF = :eof
  Empty = :empty
+ InvalidFormat = :invalid
  end
 
  module FileSource
-
+
  attr_accessor :fd
 
  def open(file_name)
  close
- self.fd = ['-', 'stdin', nil].include?(file_name) ?
+ self.fd = ['-', 'stdin', nil].include?(file_name) ?
  $stdin : ::File.open(file_name, "rb")
  end
 
@@ -31,7 +32,7 @@ module Input
  # Line Input
  #
  class InputLines
-
+
  include FileSource
 
  def initialize(args)
@@ -50,17 +51,23 @@ module Input
  # JSON Input (line-delimited records)
  #
  class InputJSON
-
+
  include FileSource
 
  def initialize(args)
  self.open(args.first)
  end
 
- def read_record
+ def read_record
  line = self.fd.readline rescue nil
  return Error::EOF unless line
- json = Oj.load(line.strip) rescue nil
+ begin
+ json = Oj.load(line.strip)
+ rescue
+ $stderr.puts "\nRecord is not valid JSON and will be skipped."
+ $stderr.puts line
+ return Error::InvalidFormat
+ end
  return Error::Empty unless json
  json
  end
@@ -166,6 +166,12 @@ class LDAP
  return [result_type, results]
  end
 
+ unless data.value && data.value.length > 1
+ result_type = 'Error'
+ results['errorMessage'] = 'parse_message: Invalid LDAP response (Empty Sequence)'
+ return [result_type, results]
+ end
+
  if data.value[1].tag == 4
  # SearchResultEntry found..
  result_type = 'SearchResultEntry'
data/lib/dap/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Dap
- VERSION = "0.0.10"
+ VERSION = "0.0.11"
  end
@@ -0,0 +1,53 @@
+ require 'zlib'
+
+ describe Dap::Filter::FilterDecodeHTTPReply do
+ describe '.decode' do
+
+ let(:filter) { described_class.new(['data']) }
+
+
+ context 'decoding non-HTTP response' do
+ let(:decode) { filter.decode("This\r\nis\r\nnot\r\nHTTP\r\n\r\n") }
+ it 'returns an empty hash' do
+ expect(decode).to eq({})
+ end
+ end
+
+ context 'decoding uncompressed response' do
+ let(:decode) { filter.decode("HTTP/1.0 200 OK\r\nHeader1: value1\r\n\r\nstuff") }
+
+ it 'correctly sets status code' do
+ expect(decode['http_code']).to eq(200)
+ end
+
+ it 'correctly sets status message' do
+ expect(decode['http_message']).to eq('OK')
+ end
+
+ it 'correctly sets body' do
+ expect(decode['http_body']).to eq('stuff')
+ end
+
+ it 'correct extracts header(s)' do
+ expect(decode['http_raw_headers']).to eq({'header1' => 'value1'})
+ end
+ end
+
+ context 'decoding gzip compressed response' do
+ let(:body) {
+ io = StringIO.new
+ io.set_encoding('ASCII-8BIT')
+ gz = Zlib::GzipWriter.new(io)
+ gz.write('stuff')
+ gz.close
+ io.string
+ }
+ let(:decode) { filter.decode("HTTP/1.0 200 OK\r\nContent-encoding: gzip\r\n\r\n#{body}") }
+
+ it 'correctly decompresses body' do
+ expect(decode['http_body']).to eq('stuff')
+ end
+ end
+
+ end
+ end
@@ -57,6 +57,8 @@ describe Dap::Proto::LDAP do
 
  data = original.pack('H*')
 
+ excessive_len = ['308480010000000000000000'].pack('H*')
+
  entry = ['3030020107642b040030273025040b6f626a656374436c6173'\
  '7331160403746f70040f4f70656e4c444150726f6f74445345']
 
@@ -90,6 +92,31 @@ describe Dap::Proto::LDAP do
  expect(split_messages.class).to eq(::Array)
  end
  end
+
+ context 'testing message length greater than total data length' do
+ let(:split_messages) { subject.split_messages(excessive_len) }
+ it 'returns Array as expected' do
+ expect(split_messages.class).to eq(::Array)
+ end
+
+ it 'returns empty Array as expected' do
+ expect(split_messages).to eq([])
+ end
+ end
+
+ context 'testing empty ASN.1 Sequence' do
+ hex = ['308400000000']
+ empty_seq = hex.pack('H*')
+
+ let(:split_messages) { subject.split_messages(empty_seq) }
+ it 'returns Array as expected' do
+ expect(split_messages.class).to eq(::Array)
+ end
+
+ it 'returns empty Array as expected' do
+ expect(split_messages).to eq([])
+ end
+ end
  end
 
  describe '.parse_ldapresult' do
@@ -209,6 +236,24 @@ describe Dap::Proto::LDAP do
  end
  end
 
+ context 'testing empty ASN.1 Sequence' do
+
+ data = OpenSSL::ASN1::Sequence.new([])
+
+ let(:parse_message) { subject.parse_message(data) }
+ it 'returns Array as expected' do
+ expect(parse_message.class).to eq(::Array)
+ end
+
+ it 'returns error value as expected' do
+ test_val = ['Error', {
+ 'errorMessage' =>
+ 'parse_message: Invalid LDAP response (Empty Sequence)'
+ }]
+ expect(parse_message).to eq(test_val)
+ end
+ end
+
  end
 
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: dap
  version: !ruby/object:Gem::Version
- version: 0.0.10
+ version: 0.0.11
  platform: ruby
  authors:
  - Rapid7 Research
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2016-08-02 00:00:00.000000000 Z
+ date: 2016-08-22 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: rspec
@@ -219,6 +219,7 @@ files:
  - samples/ssl_certs_org.sh
  - samples/udp-netbios.csv.bz2
  - samples/udp-netbios.sh
+ - spec/dap/filter/http_filter_spec.rb
  - spec/dap/filter/ldap_filter_spec.rb
  - spec/dap/proto/ipmi_spec.rb
  - spec/dap/proto/ldap_proto_spec.rb
@@ -253,6 +254,7 @@ signing_key:
  specification_version: 4
  summary: 'DAP: The Data Analysis Pipeline'
  test_files:
+ - spec/dap/filter/http_filter_spec.rb
  - spec/dap/filter/ldap_filter_spec.rb
  - spec/dap/proto/ipmi_spec.rb
  - spec/dap/proto/ldap_proto_spec.rb