thanthus 0.2.3

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: f7ea5694e8673e5182d6f701d2e2aa246f9736bdd61abcf496713d6647c4a03c
4
+ data.tar.gz: 963ea97d58469d033524f336f474b5ff31f1092fa65deee064e638155b36246b
5
+ SHA512:
6
+ metadata.gz: 5f932d46868fab1463e8dc76758b5b7d62d42deebb5dea02119f9bfb474dbb3a0476978fda2c883e6c1ce6e8a8fbb209cc6fca3e86254acf53818cb2884a65ef
7
+ data.tar.gz: b059fd8ae45d1b9c28d6945dbeed360bb4f1f5c4435d2adc957a4075766ebbd2ab4d9cf6fece3ca74fff28ef4969a9ce7d1dd2d21d606158cbaa32548498fbb3
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2019 Thomas Pasquier
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
@@ -0,0 +1,289 @@
1
+ # Xanthus: Automated Reproducible Data Generation for Evaluating Intrusion Detection Systems
2
+
3
+ > I added simple Windows support. Because I finished this work while studying at THU, I named this version `thanthus`.
4
+
5
+ Fairly evaluating and comparing the efficacy of different intrusion detection systems (IDS) requires that experimental data
6
+ be generated by a similar mechanism and/or shared across these systems.
7
+ The reality, unfortunately, is that there exist few public repositories (e.g., DARPA 1998/1999/2000, KDD Cup99, DARPA TC Engagement 3)
8
+ containing experimental data captured solely for the purpose of security analysis.
9
+ Among those public data repositories, most are outdated because a tremendous amount of manual labor is almost always
10
+ necessary to capture the data (e.g., the DARPA TC program involves a number of teams from
11
+ across academia and industry and spans many years).
12
+ Consequently, some newly-developed systems, in order to be able to compare against older systems,
13
+ are evaluated using data that is a decade or two older than the systems themselves
14
+ (and usually and unsurprisingly exhibit good results).
15
+ Given that there is a perpetual arms race between the defenders and the offenders in the realm of cyber security
16
+ and that new cyber-threats are manufactured every day,
17
+ a successful defence against a decade-old exploit is hardly an achievement.
18
+
19
+ Many existing systems, acknowledging this fact and ready to showcase their detection capability,
20
+ design their own experiments and produce their own dataset as a result.
21
+ Although the experiments are sometimes carefully described in their associated publications (e.g., in academic projects),
22
+ such a dataset suffers from the following drawbacks:
23
+
24
+ - In the cases where the dataset is made public, later systems can consume only a subset of the dataset for analysis.
25
+ Therefore, if they require e.g., additional features from the dataset in the analysis, they must rerun the experiments
26
+ to capture the data themselves again, instead of simply re-using the available dataset.
27
+ Moreover, some systems publish only a pre-processed dataset, which usually eliminates information from the original,
28
+ raw dataset that is not relevant to their analysis, even though such information may be relevant for other systems.
29
+
30
+ - When a raw dataset is made public, it provides later systems with richer information content.
31
+ However, the underlying systems that capture the raw dataset (e.g., audit systems) are also constantly evolving,
32
+ generating finer-grained, more accurate information or
33
+ offering a completely different perspective through which one understands system behavior (e.g., provenance systems).
34
+ Security systems that take advantage of such advancement in the underlying systems
35
+ may very well find even the raw data provided by previous systems insufficient.
36
+
37
+ - If later systems must resort to reproducing the dataset themselves as a result of the reasons listed above,
38
+ they need to rely on descriptions provided by previous systems to ensure high-fidelity experiment replay.
39
+ Even if we assume that previous systems provide sufficiently detailed descriptions to understand the experiment
40
+ (which certainly is not always the case),
41
+ there still exist a number of challenges.
42
+
43
+ - The experiment must be conducted using the exact software involved with matching versions.
44
+ In many cases, security experts have since identified and patched vulnerabilities in the exploitable software
45
+ used in security-related experiments, and thus the software itself usually has been updated to a newer version.
46
+ Downgrading the target software and its dependencies is therefore necessary to reproduce the experiment. This
47
+ sometimes cannot be automatically configured through existing package management systems and requires significant
48
+ manual configuration.
49
+
50
+ - Some vulnerabilities may affect only a particular version of the operating system. This requirement no doubt
51
+ further complicates the experimental setup and demands additional engineering effort.
52
+
53
+ - Other controllable factors may be omitted in the description that may or may not affect the final results of the
54
+ experiment. For example, background activities may have been included in the dataset but were not discussed in detail.
55
+
56
+ Before we go into any detail about using **Xanthus** for automated, reproducible data generation for security analysis,
57
+ we describe a pipeline in which we create a dataset for a *specific* attack in a push-button fashion. **Xanthus** is
58
+ a higher-level abstracted framework that generates such a pipeline for *any* attack that existing or future IDS intend to
59
+ evaluate.
60
+
61
+ ## Primer to Xanthus: A Specific Pipeline
62
+
63
+ We introduce a specific pipeline that automates data capture for a particular attack.
64
+ In this pipeline, we deploy virtual machines (VM), set up a virtual environment that recreates the attack scenario,
65
+ and run the attack, while capturing data from a whole-system provenance capture system.
66
+ Code is publicly available online at [GitHub](https://github.com/crimson-unicorn/demo/tree/master/wget).
67
+ Please refer to the code while reading the rest of this section.
68
+
69
+ ### Prerequisites
70
+
71
+ We assume that you understand the following terms and concepts.
72
+ If not, click on the item that you do not understand to read more about it:
73
+
74
+ * [Virtual machines](https://en.wikipedia.org/wiki/Virtual_machine)
75
+ * [Makefile](https://en.wikipedia.org/wiki/Makefile)
76
+ * [VirtualBox](https://www.virtualbox.org/manual/ch01.html)
77
+ * [Vagrant](https://www.vagrantup.com/intro/index.html), [Vagrantfile](https://www.vagrantup.com/docs/vagrantfile/)
78
+ and [provisioning](https://www.vagrantup.com/docs/provisioning/index.html)
79
+ * [CamFlow](http://camflow.org)
80
+
81
+ You may want to understand the following terms and concepts if you want to fully understand the attack
82
+ that we will describe in the next section:
83
+
84
+ * [Trojan software](https://en.wikipedia.org/wiki/Advanced_persistent_threat)
85
+ and [reverse shell](https://resources.infosecinstitute.com/icmp-reverse-shell/#gref)
86
+
87
+ ### A Brief Attack Description
88
+
89
+ You can better understand the pipeline once you know the attack that we would like to reproduce automatically.
90
+ The attacker aims to invade a victim machine through a vulnerable (or exploitable) `wget`.
91
+ The attacker sets up a malicious (or compromised) `HTTP` server that redirects any requests to a malicious `FTP` server
92
+ that contains a `Debian` package with a Trojan backdoor.
93
+ The package appears to be the same as its legitimate version and may even work the same way,
94
+ but the moment the package is installed on the victim machine, it will initiate a reverse TCP connection to the attacker
95
+ who is listening for connections and create a reverse shell that allows the attacker to infiltrate the victim machine.
96
+
97
+ When the victim machine attempts to download the benign package from the `HTTP` server using `wget`,
98
+ `wget` allows arbitrary remote file upload to the host system.
99
+ That is, instead of fetching the intended benign package, `wget` follows the redirection from the `HTTP` server and downloads
100
+ the malicious one.
101
+ The user is unaware of such behavior and installs the package through the package manager `dpkg`.
102
+ The installed Trojan software establishes a connection to the attacker and the attack succeeds.
103
+
104
+ ### Software Involved
105
+
106
+ * `wget` v1.17 or older
107
+ * Any `Debian` package with a Trojan backdoor. The `Debian` package must be installable (both benign and malicious versions).
108
+ * Functioning `HTTP` and `FTP` servers
109
+ * `dpkg` package manager
110
+ * `CamFlow` whole-system provenance capture system
111
+
112
+ ### Execution Platform
113
+
114
+ As expected, a `Debian` package can run only on a `Debian`-based operating system. This particular pipeline is run on
115
+ `Ubuntu 18.04` (both the client and the server).
116
+
117
+ ### The Pipeline
118
+
119
+ #### Installation
120
+
121
+ To run this pipeline, you need to install at least the following items:
122
+
123
+ * `Vagrant`
124
+ * Oracle `VirtualBox`
125
+
126
+ #### Usage
127
+
128
+ If you `git clone` the entire repository from [GitHub](https://github.com/crimson-unicorn/demo/), `cd` into the `wget` directory.
129
+ We assume this directory is your working directory.
130
+
131
+ We provide a `Makefile` that runs our attack scenario many times. If you want to run it once only,
132
+ modify this line: `[ $${cnt} -lt 25 ]` to `[ $${cnt} -lt 1 ]` in the `Makefile`.
133
+ (In `Xanthus`, we would be able to configure this easily without actually modifying the code.)
134
+
135
+ If you are running on `Mac`:
136
+ ```
137
+ make test_mac
138
+ ```
139
+ On `Linux`, you would run:
140
+ ```
141
+ make test_linux
142
+ ```
143
+ We do *not* support the `Windows` operating system for now.
144
+ You will find the output data file in the `data/` directory.
145
+
146
+ #### Behind the Scenes
147
+
148
+ This pipeline seems to be very user-friendly. So, one might ask, why do we bother to design and implement `Xanthus`?
149
+ The truth is, we have done a lot of heavy-lifting for you behind the scenes. Let's take a closer look.
150
+
151
+ The `Makefile` you run starts the `vagrant` process, which boots up two virtual machines, one `server` and one
152
+ `client` (now, take a look into `Vagrantfile`).
153
+
154
+ The `server` machine is provisioned by `provision/server.sh` script.
155
+ It configures an `FTP` and an `HTTP` server and puts the malicious `Debian` package in the `FTP` server.
156
+ Of course, the user must provide the pipeline with the package.
157
+ We build the package ourselves in [Kali Linux](https://en.wikipedia.org/wiki/Kali_Linux)
158
+ with [TheFatRat](https://github.com/Screetsec/TheFatRat). You are free to use any tools at your disposal.
159
+ We also put the benign one in the `HTTP` server to trick the user into downloading it.
160
+
161
+ The `client` machine involves more operations.
162
+ First, unlike the `server` machine that simply uses an `Ubuntu 18.04` base operating system
163
+ (as seen in `server.vm.box = "bento/ubuntu-18.04"`),
164
+ the `client` machine uses our customized `VirtualBox` box called `michaelh/ubuncam`.
165
+ This box is built with the following specifications:
166
+
167
+ * It is built upon the original `Ubuntu 18.04` base box from `Vagrant`.
168
+ * It is installed with `CamFlow` as its provenance-capture system.
169
+ * It downgrades `wget` to the desired version (`v1.17`) that contains the vulnerability.
170
+ * It can install `Debian` packages in the experiment.
171
+
172
+ Note that it is always desirable to package such a box and upload it to the `VagrantCloud` so that we can
173
+ configure once and reuse many times.
174
+ One can always use a base box and configure the above specifications on-the-fly,
175
+ but it is not guaranteed that the configuration would work in the distant future.
176
+ For example, the link to download an older version of `wget` may expire without notice.
177
+ `Xanthus` allows users to either provide a customized virtual box or configure a base box through provisioning.
178
+ If an online configuration is provided, `Xanthus` would automatically generate a customized box for the user
179
+ to prevent future re-configuration or possible failure in future configuration.
180
+
181
+ The `client` machine runs the script in `provision/attack`.
182
+ The user must provide such a script.
183
+ In our case, we automatically generate attack scripts using `wget-attack-script-gen.py`.
184
+ `Xanthus` allows users to provide logic to generate scripts or simply provide scripts to run during the experiment.
185
+
186
+ ### Installation
187
+
188
+ Add this line to your application's Gemfile:
189
+
190
+ ```ruby
191
+ gem 'xanthus'
192
+ ```
193
+
194
+ And then execute:
195
+
196
+ $ bundle
197
+
198
+ Or install it yourself as:
199
+
200
+ $ gem install xanthus
201
+
202
+ ### Usage
203
+
204
+ ```
205
+ xanthus version | return Xanthus version number.
206
+ xanthus dependencies | installation instructions for system dependencies.
207
+ xanthus init <project name> | initialize a new project.
208
+ xanthus run | run .xanthus file in the current folder.
209
+ ```
210
+
211
+ ### Development
212
+
213
+ To add more features in `Xanthus`,
214
+ clone this repository
215
+ ```
216
+ git clone https://github.com/tfjmp/xanthus
217
+ cd xanthus
218
+ ```
219
+ and build the gem by running
220
+ ```
221
+ gem build xanthus
222
+ ```
223
+ To install this gem locally on your machine, you can also run
224
+ ```
225
+ gem install xanthus
226
+ ```
227
+ After you add a new feature (and test it yourself), you can release a new version of `Xanthus`.
228
+ First, please update the version number in `lib/xanthus/version.rb`, tag the repository `git tag -a x.x.x -m 'x.x.x'`, and push the tag `git push --tags`.
229
+ Then you can run
230
+ ```
231
+ gem push xanthus-x.x.x.gem
232
+ ```
233
+ This last step publishes the gem at [https://rubygems.org/gems/xanthus](https://rubygems.org/gems/xanthus).
234
+
235
+ ### Contribution
236
+
237
+ We welcome bug reports and pull requests on GitHub at https://github.com/[USERNAME]/xanthus.
238
+
239
+ ### License
240
+
241
+ This gem is available as an open source project under the [MIT License](https://opensource.org/licenses/MIT).
242
+
243
+ ### Issues and Solutions with VirtualBox
244
+ VirtualBox Guest Additions is not as well designed as we may hope. If you encounter the following error:
245
+ ```
246
+ Vagrant was unable to mount VirtualBox shared folders. This is usually
247
+ because the filesystem "vboxsf" is not available. This filesystem is
248
+ made available via the VirtualBox Guest Additions and kernel module.
249
+ Please verify that these guest additions are properly installed in the
250
+ guest. This is not a bug in Vagrant and is usually caused by a faulty
251
+ Vagrant box. For context, the command attempted was:
252
+
253
+ mount -t vboxsf -o uid=900,gid=900 vagrant /vagrant
254
+
255
+ The error output from the command was:
256
+
257
+ /sbin/mount.vboxsf: mounting failed with the error: No such device
258
+ ```
259
+ It is most likely caused by incompatible Guest Additions versions between the VM and the host. Even though the script might have stopped, the VM is still booted. You can `vagrant ssh` into the VM and manually run the following two commands:
260
+ ```
261
+ sudo apt-get -y install dkms build-essential linux-headers-$(uname -r) virtualbox-guest-additions-iso
262
+ sudo /opt/VBoxGuestAdditions*/init/vboxadd setup
263
+ ```
264
+ After this, you may encounter this error:
265
+ ```
266
+ ...
267
+ ==> default: Machine booted and ready!
268
+ [default] GuestAdditions seems to be installed (6.0.20) correctly, but not running.
269
+ bash: line 4: setup: command not found
270
+ ==> default: Checking for guest additions in VM...
271
+ The following SSH command responded with a non-zero exit status.
272
+ Vagrant assumes that this means the command failed!
273
+
274
+ setup
275
+
276
+ Stdout from the command:
277
+
278
+
279
+
280
+ Stderr from the command:
281
+
282
+ bash: line 4: setup: command not found
283
+ ```
284
+ Please add the following to the Vagrant script (the `Vagrantfile`):
285
+ ```
286
+ if Vagrant.has_plugin?("vagrant-vbguest")
287
+ config.vbguest.auto_update = false
288
+ end
289
+ ```
@@ -0,0 +1,32 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "xanthus"
4
+
5
+ instruction = ARGV[0]
6
+ param1 = ARGV[1]
7
+
8
+ if instruction == 'version'
9
+ Xanthus.version
10
+ elsif instruction == 'init' && !param1.nil?
11
+ Xanthus::Init.init param1
12
+ elsif instruction == 'run'
13
+ xanthus_file = !param1.nil? ? param1 : '.xanthus'
14
+ load("./#{xanthus_file}")
15
+ elsif instruction == 'help'
16
+ puts 'xanthus version | return Xanthus version number.'
17
+ puts 'xanthus dependencies | installation instructions for system dependencies.'
18
+ puts 'xanthus init <project name> | initialize a new project.'
19
+ puts 'xanthus run [xanthus file] | run in the current folder. If not specified, we will try to run .xanthus .'
20
+ elsif instruction == 'dependencies'
21
+ puts 'You need to install the following software on your system for Xanthus to run:'
22
+ puts 'git (see https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)'
23
+ puts 'git lfs (see https://help.github.com/en/articles/installing-git-large-file-storage)'
24
+ puts 'virtualbox (see https://www.virtualbox.org/wiki/Downloads)'
25
+ puts 'vagrant (see https://www.vagrantup.com/docs/installation/)'
26
+ else
27
+ # the same to `xanthus help`
28
+ puts 'xanthus version | return Xanthus version number.'
29
+ puts 'xanthus dependencies | installation instructions for system dependencies.'
30
+ puts 'xanthus init <project name> | initialize a new project.'
31
+ puts 'xanthus run [xanthus file] | run in the current folder. If not specified, we will try to run .xanthus .'
32
+ end
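The command-line script above prints the same four help lines in two branches. A minimal sketch of one way to keep that text in a single place is shown below. The names `HELP_LINES`, `usage`, and `dispatch` are hypothetical and are not part of the gem; the sketch only decides which action to take and does not call into `Xanthus` itself.

```ruby
# Hypothetical helpers for illustration; not part of the gem.
HELP_LINES = [
  'xanthus version | return Xanthus version number.',
  'xanthus dependencies | installation instructions for system dependencies.',
  'xanthus init <project name> | initialize a new project.',
  'xanthus run [xanthus file] | run in the current folder. If not specified, we will try to run .xanthus .'
].freeze

# Returns the usage text as a single string.
def usage
  HELP_LINES.join("\n")
end

# Maps an instruction to a symbol naming the action to take.
# Anything unrecognised, including 'help' and an 'init' without a
# project name, falls through to :help, as in the script above.
def dispatch(instruction, param1 = nil)
  case instruction
  when 'version', 'dependencies', 'run' then instruction.to_sym
  when 'init' then param1.nil? ? :help : :init
  else :help
  end
end

puts usage if dispatch('help') == :help
```

With this shape, the `help` branch and the fallback branch share one definition of the usage text, so the two cannot drift apart.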
@@ -0,0 +1,16 @@
1
+ def os_family
2
+ case RUBY_PLATFORM
3
+ when /ix/i, /ux/i, /gnu/i,
4
+ /sysv/i, /solaris/i,
5
+ /sunos/i, /bsd/i, /darwin/i # macOS: "darwin" would otherwise match /win/i below
6
+ "unix"
7
+ when /win/i, /ming/i
8
+ "windows"
9
+ else
10
+ "others"
11
+ end
12
+ end
13
+
14
+ def sys_script_ext
15
+ os_family == 'unix' ? 'sh' : 'cmd'
16
+ end
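The platform check above matches substrings in order, so a macOS platform string such as `x86_64-darwin19` needs care: it contains `win`. The sketch below is a hypothetical variant (`os_family_for` and `script_ext_for` are names introduced here, not gem functions) that takes the platform string as an argument so the classification can be exercised without changing `RUBY_PLATFORM`. It assumes narrower Windows patterns (`mswin`, `mingw`, `cygwin`) than the original.

```ruby
# Hypothetical variant of os_family that takes the platform string as an
# argument. "darwin" is checked in the unix branch, before any Windows pattern.
def os_family_for(platform)
  case platform
  when /darwin/i, /linux/i, /bsd/i, /solaris/i, /sunos/i, /sysv/i, /gnu/i
    'unix'
  when /mswin/i, /mingw/i, /cygwin/i
    'windows'
  else
    'others'
  end
end

# Script extension for each family; anything that is not unix gets 'cmd',
# matching the behaviour of sys_script_ext above.
def script_ext_for(platform)
  os_family_for(platform) == 'unix' ? 'sh' : 'cmd'
end

%w[x86_64-linux x86_64-darwin19 x64-mingw32 java].each do |platform|
  puts "#{platform}: #{os_family_for(platform)} (.#{script_ext_for(platform)})"
end
```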
@@ -0,0 +1,15 @@
1
+ require "xanthus/version"
2
+ require "xanthus/init"
3
+ require "xanthus/script"
4
+ require "xanthus/virtual_machine"
5
+ require "xanthus/job"
6
+ require "xanthus/default"
7
+ require "xanthus/repository"
8
+ require "xanthus/github"
9
+ require "xanthus/dataverse"
10
+ require "xanthus/configuration"
11
+
12
+ module Xanthus
13
+ class Error < StandardError; end
14
+ # Your code goes here...
15
+ end
@@ -0,0 +1,94 @@
1
+ module Xanthus
2
+ class Configuration
3
+ attr_accessor :name
4
+ attr_accessor :authors
5
+ attr_accessor :affiliation
6
+ attr_accessor :email
7
+ attr_accessor :description
8
+ attr_accessor :seed
9
+ attr_accessor :params
10
+ attr_accessor :vms
11
+ attr_accessor :scripts
12
+ attr_accessor :jobs
13
+ attr_accessor :github_conf
14
+ attr_accessor :dataverse_conf
15
+
16
+ def initialize
17
+ @params = Hash.new
18
+ @vms = Hash.new
19
+ @scripts = Hash.new
20
+ @jobs = Hash.new
21
+ end
22
+
23
+ def vm name
24
+ vm = VirtualMachine.new
25
+ yield(vm)
26
+ vm.name = name
27
+ @vms[name] = vm
28
+ end
29
+
30
+ def script name
31
+ @scripts[name] = yield
32
+ end
33
+
34
+ def job name
35
+ v = Job.new
36
+ yield(v)
37
+ v.name = name
38
+ @jobs[name] = v
39
+ end
40
+
41
+ def github
42
+ github = GitHub.new
43
+ yield(github)
44
+ @github_conf = github
45
+ end
46
+
47
+ def dataverse
48
+ dataverse = Dataverse.new
49
+ yield(dataverse)
50
+ @dataverse_conf = dataverse
51
+ end
52
+
53
+ def to_readme_md
54
+ %Q{
55
+ # #{@name}
56
+
57
+ authors: #{@authors}
58
+ affiliation: #{@affiliation}
59
+ email: #{@email}
60
+
61
+ seed: #{@seed}
62
+
63
+ ## Description
64
+
65
+ #{@description}
66
+ }
67
+ end
68
+ end
69
+
70
+ def self.configure
71
+ config = Configuration.new
72
+ yield(config)
73
+ puts "Running experiment #{config.name} with seed #{config.seed}."
74
+ srand config.seed
75
+ config.vms.each do |k, v|
76
+ v.generate_box config
77
+ end
78
+
79
+ # initializing storage backends
80
+ config.github_conf.init(config) unless config.github_conf.nil?
81
+ config.dataverse_conf.init(config) unless config.dataverse_conf.nil?
82
+
83
+ # executing jobs
84
+ config.jobs.each do |name,job|
85
+ job.iterations.times do |i|
86
+ job.execute config, i
87
+ end
88
+ end
89
+
90
+ # finalizing storage backends
91
+ config.github_conf.tag unless config.github_conf.nil?
92
+ config.github_conf.clean unless config.github_conf.nil?
93
+ end
94
+ end
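The `configure` method above follows a common block-based pattern: build a configuration object, yield it for setup, seed the random number generator, and then run every job for its number of iterations. The sketch below reproduces only that control flow with stand-in classes. `SketchJob`, `SketchConfiguration`, `sketch_configure`, and the seed value are hypothetical; the real `Job` and `VirtualMachine` classes are not shown in this diff, so nothing here should be read as their actual interface.

```ruby
# Stand-in classes for illustration only; the real Xanthus classes
# carry more state than shown here.
class SketchJob
  attr_accessor :name, :iterations

  def initialize
    @iterations = 1
  end
end

class SketchConfiguration
  attr_accessor :name, :seed
  attr_reader :jobs

  def initialize
    @jobs = {}
  end

  # Same shape as Configuration#job: build, yield for setup, register by name.
  def job(name)
    j = SketchJob.new
    yield(j)
    j.name = name
    @jobs[name] = j
  end
end

# Mirrors the order of operations in Xanthus.configure: collect the
# configuration, seed the RNG, then run each job for its iterations.
# Returns [job name, iteration index, random draw] triples.
def sketch_configure
  config = SketchConfiguration.new
  yield(config)
  srand config.seed
  runs = []
  config.jobs.each do |name, job|
    job.iterations.times { |i| runs << [name, i, rand(1000)] }
  end
  runs
end

# The seed below is an arbitrary, hypothetical value.
setup = lambda do |config|
  config.name = 'demo'
  config.seed = 43670
  config.job(:attack) { |job| job.iterations = 3 }
end

first = sketch_configure(&setup)
second = sketch_configure(&setup)
puts first == second # same seed, so both runs see the same draws
```

Seeding once per experiment is what lets two runs of the same configuration make the same pseudo-random choices, which is the property the `seed` attribute exists to provide.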