RubyGems - opendns-dnsdb - Versions diffs - 0.1.0 - Mend

opendns-dnsdb 0.1.0


Files changed (47)

checksums.yaml +7 -0
data/.gitignore +18 -0
data/.rspec +4 -0
data/Gemfile +8 -0
data/LICENSE +20 -0
data/README.md +64 -0
data/Rakefile +6 -0
data/THANKS +1 -0
data/docs/Makefile +177 -0
data/docs/_themes/LICENSE +45 -0
data/docs/_themes/README.rst +25 -0
data/docs/_themes/flask_theme_support.py +86 -0
data/docs/_themes/kr/layout.html +32 -0
data/docs/_themes/kr/relations.html +19 -0
data/docs/_themes/kr/static/flasky.css_t +469 -0
data/docs/_themes/kr/static/small_flask.css +70 -0
data/docs/_themes/kr/theme.conf +7 -0
data/docs/_themes/kr_small/layout.html +22 -0
data/docs/_themes/kr_small/static/flasky.css_t +287 -0
data/docs/_themes/kr_small/theme.conf +10 -0
data/docs/conf.py +261 -0
data/docs/index.rst +101 -0
data/docs/make.bat +242 -0
data/docs/operations/by_ip.rst +229 -0
data/docs/operations/by_name.rst +256 -0
data/docs/operations/label.rst +217 -0
data/docs/operations/related.rst +127 -0
data/docs/operations/traffic.rst +126 -0
data/lib/opendns-dnsdb.rb +5 -0
data/lib/opendns-dnsdb/dnsdb.rb +58 -0
data/lib/opendns-dnsdb/dnsdb/by_ip.rb +69 -0
data/lib/opendns-dnsdb/dnsdb/by_name.rb +93 -0
data/lib/opendns-dnsdb/dnsdb/label.rb +105 -0
data/lib/opendns-dnsdb/dnsdb/related.rb +92 -0
data/lib/opendns-dnsdb/dnsdb/response.rb +41 -0
data/lib/opendns-dnsdb/dnsdb/rrutils.rb +11 -0
data/lib/opendns-dnsdb/dnsdb/siphash.rb +94 -0
data/lib/opendns-dnsdb/dnsdb/traffic.rb +80 -0
data/lib/opendns-dnsdb/version.rb +5 -0
data/opendns-dnsdb.gemspec +20 -0
data/spec/by_ip_spec.rb +54 -0
data/spec/by_name_spec.rb +88 -0
data/spec/label_spec.rb +88 -0
data/spec/related_spec.rb +92 -0
data/spec/spec_helper.rb +5 -0
data/spec/traffic_spec.rb +36 -0
metadata +123 -0

data/docs/operations/related.rst ADDED

@@ -0,0 +1,127 @@
+Related names
+=============
+Related names are names that have been frequently observed shortly
+before or after a reference name.
+This has proven to be very useful to discover command and control
+domains used by malware when only a few of them were previously known.
+This is also useful to investigate an infection chain.
+Internally, multiple complementary matching algorithms are used, but
+this client library takes care of aggregating and normalizing the
+results.
+Getting the list of related names
+---------------------------------
+Related names for a single name can be looked up, as well as for
+a vector of names:
+.. code-block:: ruby
+    db.related_names('www.github.com')
+    db.related_names(['www.github.com', 'www.mozilla.org'])
+These functions return a ``Response::Distinct`` object, if a single
+name was used as a starting point, or a ``Response::HashByName`` if a
+vector was provided.
+The maximum number of results can be specified:
+.. code-block:: ruby
+    db.related_names('www.skyrock.com', max_names: 50)
+An optional block can also be given.
+This block is a filter: it will be given each (name, score) as an
+argument, and only names for which the return value of this block is
+not ``false``/``nil`` will be kept.
+For example, this only retrieves names matching a given regular
+expression:
+.. code-block:: ruby
+    db.related_names('www.skyrock.com') { |name| name.match(/^miss-/) }
+And this only retrieves names whose score is more than 0.1:
+.. code-block:: ruby
+    db.related_names('www.skyrock.com') { |name, score| score > 0.1 }
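The filter semantics described above can be illustrated without contacting the service. The following is only a sketch under stated assumptions: `filter_related` and the `scored` hash are hypothetical stand-ins, not part of the library, and the names and scores are made up.

```ruby
# Hypothetical data: name => score pairs, as a related-names lookup might yield.
scored = {
  'miss-example.skyrock.com' => 0.42,
  'static.skyrock.com'       => 0.05,
  'miss-other.skyrock.com'   => 0.08
}

# Hypothetical helper: keep only the names for which the block returns
# a value other than false/nil. Without a block, every name is kept.
def filter_related(scored)
  return scored.keys unless block_given?
  scored.select { |name, score| yield(name, score) }.keys
end

# A block may ignore the score and look at the name only.
by_regex = filter_related(scored) { |name| name.match(/^miss-/) }

# Or it may use both arguments.
by_score = filter_related(scored) { |_name, score| score > 0.1 }
```

A block that declares a single parameter simply ignores the score, which is why both block shapes shown in the documentation are accepted.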
+Getting the list of related names, with scores
+----------------------------------------------
+In addition to a list of names, a "score" can be returned for each
+name found. This score is in the [0.0, 1.0] range, 1.0 meaning that a
+name is likely to be closely related to the reference name, 0.0
+meaning that these have not been observed together very frequently.
+Related names for a single name can be looked up, as well as for
+a vector of names:
+.. code-block:: ruby
+    db.related_names_with_score('www.github.com')
+    db.related_names_with_score(['www.github.com', 'www.mozilla.org'])
+These functions return a ``Response::HashByName``.
+An optional filter can be provided:
+.. code-block:: ruby
+    db.related_names_with_score('www.skyrock.com') do |name|
+      name.match /^miss-/
+    end
+Getting a set of distinct related names for a list of names
+-----------------------------------------------------------
+Given a list of names, this returns a set of names related to these.
+.. code-block:: ruby
+    db.distinct_related_names(['www.github.com', 'www.github.io'])
+This returns a ``Response::Distinct`` object.
+The maximum number of results can be specified:
+.. code-block:: ruby
+    db.distinct_related_names(['www.github.com', 'www.github.io'],
+                              max_results: 250)
+By default, only direct neighbors of the given names are returned, but
+deep traversal is also fully supported.
+This returns names related to those provided in the vector, as well as
+names related to these newly found names, and so on, up to the given
+depth:
+.. code-block:: ruby
+    db.distinct_related_names(['www.github.com', 'www.github.io'],
+                              max_results: 250,
+                              max_depth: 3)
+Since a deep traversal can return a lot of results, some not being of
+interest, a filter can be provided. This filter will be automatically applied
+after each iteration:
+.. code-block:: ruby
+    db.distinct_related_names(['www.github.com', 'www.github.io'],
+                              max_results: 250,
+                              max_depth: 3) do |name, score|
+      name.match(/^com-/) && score > 0.1
+    end
+A single name can also be given instead of a vector. This is
+equivalent to ``related_names`` when a deep traversal is not performed.
+This function returns a ``Response::Distinct`` object.
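One way to picture a deep traversal with a per-iteration filter is a breadth-first walk. The sketch below is not the library's implementation: `traverse` and the in-memory `graph` are hypothetical, the option names are borrowed from the examples above, and excluding the starting names from the result is an assumption.

```ruby
require 'set'

# Hypothetical breadth-first traversal over an in-memory graph of the form
# name => { related_name => score }. Each iteration looks up the names found
# by the previous one, applies the optional filter, and stops at max_depth
# or as soon as max_results names have been collected.
def traverse(graph, seeds, max_results: 250, max_depth: 1)
  seen = Set.new(seeds)   # assumption: starting names are not returned
  found = []
  frontier = seeds
  max_depth.times do
    next_frontier = []
    frontier.each do |name|
      graph.fetch(name, {}).each do |related, score|
        next if seen.include?(related)
        next if block_given? && !yield(related, score)
        seen << related
        found << related
        next_frontier << related
        return found if found.length >= max_results
      end
    end
    break if next_frontier.empty?
    frontier = next_frontier
  end
  found
end

# Made-up data standing in for the remote service.
graph = {
  'a.example' => { 'b.example' => 0.9, 'c.example' => 0.05 },
  'b.example' => { 'd.example' => 0.5 },
  'd.example' => { 'e.example' => 0.7 }
}

direct   = traverse(graph, ['a.example'])
deep     = traverse(graph, ['a.example'], max_depth: 3)
filtered = traverse(graph, ['a.example'], max_depth: 3) { |_name, score| score > 0.1 }
```

Because the filter runs at every iteration, a name rejected early is never expanded, which keeps the traversal from wandering into uninteresting parts of the graph.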

data/docs/operations/traffic.rst ADDED

@@ -0,0 +1,126 @@
+DNS traffic
+===========
+The number of DNS queries observed for a name over a time period can
+be retrieved.
+This is especially useful to see if a domain is popular, and to spot
+anomalies in its traffic.
+Getting the number of queries observed for a name
+-------------------------------------------------
+The ``daily_traffic_by_name`` method returns a vector with the number
+of queries observed for each day, within a time period.
+By default, the time period starts 7 days before the current day, and
+ends at the current day, a day starting at 00:00 UTC.
+.. code-block:: ruby
+    db.daily_traffic_by_name('www.github.com')
+The output is a ``Response::TimeSeries`` object:
+::
+    [
+        [0] 6152525,
+        [1] 4756714,
+        [2] 4670300,
+        [3] 5954983,
+        [4] 6140915,
+        [5] 6040669,
+        [6] 5529869
+    ]
+This method accepts several options:
+- ``start``: a ``Date`` object representing the lower bound of the time interval
+- ``end``: a ``Date`` object representing the upper bound of the time interval
+- ``days_back``: if ``start`` is not provided, this represents the number of days to go back in time.
+Here are some examples featuring these options:
+.. code-block:: ruby
+    db.daily_traffic_by_name('www.github.com', end: Date.today - 2, days_back: 10)
+    db.daily_traffic_by_name('www.github.com', start: Date.today - 10)
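How the three options might combine can be sketched as follows. `resolve_window` is a hypothetical helper, not a library method, and it assumes that `days_back` is counted back from the end of the interval; the documentation does not state this explicitly, nor whether the bounds are inclusive.

```ruby
require 'date'

# Hypothetical helper: turn the options into a [first_day, last_day] pair.
# Assumptions: `end` defaults to today, `days_back` defaults to 7 and is
# counted back from the end of the interval, `start` overrides `days_back`.
def resolve_window(today, options = {})
  last_day  = options[:end] || today
  days_back = options[:days_back] || 7
  first_day = options[:start] || (last_day - days_back)
  [first_day, last_day]
end

today = Date.new(2014, 1, 15)

default_window = resolve_window(today)
shifted_window = resolve_window(today, end: today - 2, days_back: 10)
explicit_start = resolve_window(today, start: today - 10)
```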
+The traffic for multiple domains can be looked up, provided that a
+vector is given instead of a single name. In that case, the output is
+a ``Response::HashByName`` object.
+.. code-block:: ruby
+    db.daily_traffic_by_name(['www.github.com', 'www.github.io'])
+For example, the following snippet compares the median number of
+queries for a set of domains:
+.. code-block:: ruby
+    ts = db.daily_traffic_by_name(['www.github.com', 'www.github.io'])
+    ts.merge(ts) { |_name, series| series.median.to_i }
+::
+    {
+        "www.github.com" => 5954983,
+         "www.github.io" => 528002
+    }
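For readers without the ``descriptive_statistics`` module at hand, the median of a time series can be computed with plain Ruby. This is a minimal sketch: the `median` helper is hypothetical, and the sample values are the seven daily counts shown earlier for ``www.github.com``.

```ruby
# Hypothetical helper: median of a numeric array.
def median(series)
  sorted = series.sort
  mid = sorted.length / 2
  # Odd length: middle element. Even length: mean of the two middle elements.
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

github = [6152525, 4756714, 4670300, 5954983, 6140915, 6040669, 5529869]
github_median = median(github)
```

With these seven values the middle element after sorting is 5954983, which agrees with the output listed above.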
+Anomaly detection in traffic
+----------------------------
+A benign web site tends to have comparable traffic every day. Sudden
+spikes or drops in traffic usually indicate a major event (incident,
+unusual volume of sent email), or some suspicious activity.
+Domain names used as C&C typically receive very little traffic, and
+suddenly get a spike of traffic for a short period of time. The same
+can be observed with compromised hosts acting as intermediaries.
+After having retrieved the traffic for a name, computing the relative
+standard deviation is a simple and efficient way to detect anomalies.
+To do so, the library includes the ``descriptive_statistics`` module
+and implements a ``relative_standard_deviation`` method. This method
+can work on the time series of a single domain, as well as on a set
+of multiple time series.
+.. code-block:: ruby
+    ts = db.daily_traffic_by_name(['skyrock.com', 'github.com', 'ooctmxmgwigqt.info'])
+    ap db.relative_standard_deviation(ts)
+This outputs either a ``Response::TimeSeries`` or a ``Response::HashByName`` object:
+::
+    {
+               "skyrock.com" => 2.4300100908269657,
+                "github.com" => 10.628632305278618,
+        "ooctmxmgwigqt.info" => 244.18566965045403
+    }
+In this example, we can clearly spot a domain name whose traffic
+doesn't follow what we usually observe for a benign domain.
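The relative standard deviation is the standard deviation expressed as a percentage of the mean. The sketch below shows the arithmetic only; `rsd` is a hypothetical helper, and the use of the population standard deviation is an assumption, since the library may follow a different convention.

```ruby
# Hypothetical helper: relative standard deviation, in percent.
# Assumption: population standard deviation (divide by n, not n - 1).
def rsd(series)
  mean = series.sum.to_f / series.length
  return 0.0 if mean.zero?
  variance = series.sum { |v| (v - mean)**2 } / series.length
  100.0 * Math.sqrt(variance) / mean
end

steady = rsd([10, 10, 10, 10])            # flat traffic
mixed  = rsd([2, 4, 4, 4, 5, 5, 7, 9])    # mean 5, standard deviation 2
spiky  = rsd([0, 0, 0, 100])              # a single burst of traffic
```

Flat traffic yields a value near zero, while a series dominated by one burst yields a value well above 100, which is the pattern visible in the output above.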
+High-pass filter
+----------------
+Domains receiving little traffic frequently receive more noise
+(bots, internal traffic) than queries sent by actual users.
+A simple high-pass filter sets all entries of a time series below
+a cutoff value to 0. This is provided by the ``high_pass_filter`` method:
+.. code-block:: ruby
+    ts = db.high_pass_filter(ts, cutoff: 5.0)
+This method works on the time series of a single domain, as well as on
+a set of multiple time series. The result is either a
+``Response::TimeSeries`` or a ``Response::HashByName`` object.
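The effect of such a filter can be sketched in a few lines. `high_pass` is a hypothetical stand-in for the library method; it assumes that values equal to the cutoff are kept and that plain arrays and hashes stand in for the response objects.

```ruby
# Hypothetical helper: zero out every entry strictly below the cutoff.
# Accepts a single series (Array) or a set of series keyed by name (Hash).
def high_pass(data, cutoff:)
  if data.is_a?(Hash)
    data.transform_values { |series| high_pass(series, cutoff: cutoff) }
  else
    data.map { |value| value < cutoff ? 0 : value }
  end
end

single = high_pass([1, 5, 10, 4.9], cutoff: 5.0)

sample = { 'a.example' => [3, 8], 'b.example' => [0, 2] }
multiple = high_pass(sample, cutoff: 5.0)
```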

data/lib/opendns-dnsdb.rb ADDED

@@ -0,0 +1,5 @@
+require_relative 'opendns-dnsdb/version'
+require_relative 'opendns-dnsdb/dnsdb'
+module OpenDNS
+end

data/lib/opendns-dnsdb/dnsdb.rb ADDED

@@ -0,0 +1,58 @@
+require 'date'
+require 'ethon'
+require 'hashie'
+require 'multi_json'
+require_relative 'dnsdb/response'
+require_relative 'dnsdb/by_ip'
+require_relative 'dnsdb/by_name'
+require_relative 'dnsdb/label'
+require_relative 'dnsdb/related'
+require_relative 'dnsdb/traffic'
+module OpenDNS
+  class DNSDB
+    include OpenDNS::DNSDB::Response
+    include OpenDNS::DNSDB::ByIP
+    include OpenDNS::DNSDB::ByName
+    include OpenDNS::DNSDB::Label
+    include OpenDNS::DNSDB::Related
+    include OpenDNS::DNSDB::Traffic
+    DEFAULT_TIMEOUT = 15
+    DEFAULT_MAXCONNECTS = 10
+    SGRAPH_API_BASE_URL = 'https://sgraph.umbrella.com'
+    attr_reader :timeout
+    attr_reader :sslcert
+    attr_reader :sslcerttype
+    attr_reader :sslcertpasswd
+    attr_reader :maxconnects
+    def initialize(params = { })
+      raise UsageError, 'Missing certificate file' unless params[:sslcert]
+      @sslcert = params[:sslcert]
+      @timeout = DEFAULT_TIMEOUT
+      @timeout = params[:timeout].to_f if params[:timeout]
+      @maxconnects = DEFAULT_MAXCONNECTS
+      @maxconnects = params[:maxconnects].to_i if params[:maxconnects]
+      @sslcerttype = params[:sslcerttype] || 'p12'
+      @sslcertpasswd = params[:sslcertpasswd] || ''
+      @options = {
+        followlocation: true,
+        timeout: @timeout,
+        sslcert: @sslcert,
+        sslcerttype: @sslcerttype,
+        sslcertpasswd: @sslcertpasswd
+      }
+    end
+    def query_handler(endpoint, method = :get, options = { })
+      url = SGRAPH_API_BASE_URL + endpoint
+      query = Ethon::Easy.new(@options)
+      query.http_request(url, method, options)
+      query
+    end
+  end
+end

data/lib/opendns-dnsdb/dnsdb/by_ip.rb ADDED

@@ -0,0 +1,69 @@
+require_relative 'rrutils'
+module OpenDNS
+  class DNSDB
+    module ByIP
+      include OpenDNS::DNSDB::RRUtils
+      def rr_only_for_ips(responses)
+        responses_is_hash = responses.kind_of?(Hash)
+        responses = { a: responses } unless responses_is_hash
+        responses.each_pair do |key, history|
+          responses[key] = Response::Distinct.new(history.collect do |rr|
+            rr.rr
+          end.flatten.uniq)
+        end
+        responses = responses.values.first unless responses_is_hash
+        responses
+      end
+      def history_by_ip(ips, type)
+        ips_is_array = ips.kind_of?(Enumerable)
+        ips = [ ips ] unless ips_is_array
+        multi = Ethon::Multi.new
+        queries = { }
+        ips.each do |ip|
+          next if queries[ip]
+          url = "/dnsdb/ip/#{type}/#{ip}.json"
+          query = query_handler(url)
+          multi.add(query)
+          queries[ip] = query
+        end
+        multi.perform
+        responses = { }
+        queries.each_pair do |ip, query|
+          obj = MultiJson.load(query.response_body)
+          responses[ip] = Response::Raw.new(obj).rrs
+        end
+        responses = Response::HashByIP[responses]
+        responses = responses.values.first unless ips_is_array
+        responses
+      end
+      def names_history_by_nameserver_ip(ips)
+        history_by_ip(ips, 'ns')
+      end
+      def names_by_nameserver_ip(ips)
+        rr_only_for_ips(names_history_by_nameserver_ip(ips))
+      end
+      def distinct_names_by_nameserver_ip(ips)
+        distinct_rrs(names_by_nameserver_ip(ips))
+      end
+      def names_history_by_ip(ips)
+        history_by_ip(ips, 'a')
+      end
+      def names_by_ip(ips)
+        rr_only_for_ips(names_history_by_ip(ips))
+      end
+      def distinct_names_by_ip(ips)
+        distinct_rrs(names_by_ip(ips))
+      end
+    end
+  end
+end
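`history_by_ip` accepts either a single value or a collection: it wraps a single value in an array, skips duplicates, and unwraps the result at the end. The pattern can be shown in isolation; in this sketch `lookup_all` and the `table` hash are hypothetical and replace the HTTP round trip.

```ruby
# Hypothetical helper illustrating the scalar-or-collection pattern.
# `table` stands in for the remote service.
def lookup_all(keys, table)
  is_collection = keys.kind_of?(Enumerable)   # a String is not Enumerable
  keys = [keys] unless is_collection
  results = {}
  keys.each do |key|
    next if results.key?(key)                 # query each key only once
    results[key] = table.fetch(key, [])
  end
  # A single key yields its value; a collection yields a hash by key.
  is_collection ? results : results.values.first
end

table = { '192.0.2.1' => ['a.example', 'b.example'] }

one  = lookup_all('192.0.2.1', table)
many = lookup_all(['192.0.2.1', '192.0.2.1', '192.0.2.2'], table)
```

The caller therefore gets back the same shape it passed in, which is why the documentation describes different return types for a single name and for a vector.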

data/lib/opendns-dnsdb/dnsdb/by_name.rb ADDED

@@ -0,0 +1,93 @@
+require_relative 'rrutils'
+module OpenDNS
+  class DNSDB
+    module ByName
+      include OpenDNS::DNSDB::RRUtils
+      def rr_only_for_names(responses)
+        responses_is_hash = responses.kind_of?(Hash)
+        responses = { a: responses } unless responses_is_hash
+        responses.each_pair do |key, history|
+          responses[key] = Response::Distinct.new(history.collect do |hrecord|
+            hrecord.rrs.collect { |rr| rr.rr }
+          end.flatten.uniq)
+        end
+        responses = responses.values.first unless responses_is_hash
+        responses
+      end
+      def history_by_name(names, type)
+        names_is_array = names.kind_of?(Enumerable)
+        names = [ names ] unless names_is_array
+        multi = Ethon::Multi.new
+        queries = { }
+        names.each do |name|
+          next if queries[name]
+          url = "/dnsdb/name/#{type}/#{name}.json"
+          query = query_handler(url)
+          multi.add(query)
+          queries[name] = query
+        end
+        multi.perform
+        responses = { }
+        queries.each_pair do |name, query|
+          obj = MultiJson.load(query.response_body)
+          responses[name] = Response::Raw.new(obj).rrs_tf
+        end
+        responses = Response::HashByName[responses]
+        responses = responses.values.first unless names_is_array
+        responses
+      end
+      def nameservers_ips_history_by_name(names)
+        history_by_name(names, 'ns')
+      end
+      def nameservers_ips_by_name(names)
+        rr_only_for_names(nameservers_ips_history_by_name(names))
+      end
+      def distinct_nameservers_ips_by_name(names)
+        Response::Distinct.new(distinct_rrs(nameservers_ips_by_name(names)))
+      end
+      def ips_history_by_name(names)
+        history_by_name(names, 'a')
+      end
+      def ips_by_name(names)
+        rr_only_for_names(ips_history_by_name(names))
+      end
+      def distinct_ips_by_name(names)
+        distinct_rrs(ips_by_name(names))
+      end
+      def mxs_history_by_name(names)
+        history_by_name(names, 'mx')
+      end
+      def mxs_by_name(names)
+        rr_only_for_names(mxs_history_by_name(names))
+      end
+      def distinct_mxs_by_name(names)
+        distinct_rrs(mxs_by_name(names))
+      end
+      def cnames_history_by_name(names)
+        history_by_name(names, 'cname')
+      end
+      def cnames_by_name(names)
+        rr_only_for_names(cnames_history_by_name(names))
+      end
+      def distinct_cnames_by_name(names)
+        distinct_rrs(cnames_by_name(names))
+      end
+    end
+  end
+end
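`rr_only_for_names` reduces a history, where each period carries its own list of records, to a flat list of distinct values. The sketch below reproduces only that reduction; the two `Struct` types and the addresses are hypothetical placeholders for the library's response objects.

```ruby
# Hypothetical stand-ins for the response objects: a history is a list of
# periods, and each period holds a list of resource records.
resource = Struct.new(:rr)
period   = Struct.new(:rrs)

history = [
  period.new([resource.new('192.0.2.1'), resource.new('192.0.2.2')]),
  period.new([resource.new('192.0.2.2'), resource.new('192.0.2.3')])
]

# Collect the values of every period, then flatten and deduplicate.
distinct = history.collect do |hrecord|
  hrecord.rrs.collect { |rr| rr.rr }
end.flatten.uniq
```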