hirsute 0.1.0

@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   data.tar.gz: d700fb375ef73a088c8570f51f22ebab0f6c126e
+   metadata.gz: 6d4be2b6029c43c577cca8bcc958f42a489ec96a
+ SHA512:
+   data.tar.gz: f1fcf9ccd33fae57312302103ff9b75c9150db8ac2bee8f05b1f00f88b610f2491e67e55636ed6eb6a08bad3fd0d8448241b75e097a0b53ce6644724f6763a06
+   metadata.gz: 59189a4d7d722bb6c92b2c46d8f66a512ceb61a4a9502a76238a6af7ecaf129108e73a855304a3b58ba0ed94fbcd2281899b34282f912a73d2d6f1ccb2d3bd6b
@@ -0,0 +1,7 @@
+ Copyright (c) 2012 Derrick Schneider
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,105 @@
+ Hirsute
+ =======
+ Hirsute is a Ruby-based domain-specific language for generating plausible fake data. You might need fake data for any of the following reasons:
+
+ * demoing to a potential customer and providing a realistic experience
+ * building a "real" database for testing (versus the often messy, inaccurate databases in dev systems)
+ * building a database for load testing before a launch
+
+ In Hirsute, you define a template for what an object might look like, and then generate however many copies you need. Then you can work with those collections as needed. There is <a href="https://github.com/derricks/hirsute/blob/master/manual.md">a full manual</a>, but here is a quick example to give the flavor. Say you're building a system where you have a bunch of users, and you want to build in a "friend" concept that allows each user to have up to 10 friends. You want to generate a random sample of data, but you think most users will only have 2-4 friends.
+
+ The relevant Hirsute script might look like this:
+
+ <code><pre>
+ # define a user template whose id is an incrementing counter starting at 1 and whose email address follows the pattern testuser1@gmail.com, testuser2@yahoo.com, testuser3@aol.com, and so forth
+
+ storage :mysql
+
+ a('user') {
+   has :id => counter,
+     :email => combination(
+       "testuser",counter,"@",one_of(['gmail','aol','yahoo']),".com")
+   is_stored_in "users"
+ }
+
+ # make 1000 users
+ users = user * 1000
+
+ # define a friendship object that maps two users together. The user ids are placeholder literals so the fields exist now and can be filled in later
+ a('friendship') {
+   has :user1 => 1,
+     :user2 => 1
+   is_stored_in "friendship"
+ }
+ friendships = collection_of friendship
+
+ # for each user, pick an appropriate number of friends and create the friendship objects
+ foreach user do |cur_user|
+   # figure out a number of friends this user might have. Pass in a histogram to steer the probability the way we want
+
+   # while you can pass in an array of probabilities, you can also define the histogram more intuitively
+   friend_dist = <<-HIST
+ num_friends
+ 0 **
+ 1 **********
+ 2 ******************************
+ 3 ******************************
+ 4 ********************
+ 5 *
+ 6 *
+ 7 *
+ 8 *
+ 9 **
+ 10 *
+   HIST
+   # the first argument is the options to draw from, the second argument (optional) is a histogram representing distribution of probabilities
+   num_friends = pick_from([0,1,2,3,4,5,6,7,8,9,10], friend_dist)
+
+   # since this is Ruby, you can write plain Ruby wherever you need it
+   (0...num_friends).each do |idx|
+     # grab a random user that isn't this one
+     friend = any(user) {|friend| friend.id != cur_user.id}
+
+     new_friendship = friendship.make # because there's only one collection holding these, it's added automatically
+     new_friendship.user1 = friend.id
+     new_friendship.user2 = cur_user.id
+   end
+ end
+
+ # and now write them all out to files
+ finish users
+ finish friendships
+ </pre></code>
+
+ This will create files that have data such as this:
+ <code><pre>
+ INSERT INTO users ('email','id') VALUES ('testuser1@yahoo.com',1);
+ INSERT INTO users ('email','id') VALUES ('testuser2@yahoo.com',2);
+ INSERT INTO users ('email','id') VALUES ('testuser3@aol.com',3);
+ INSERT INTO users ('email','id') VALUES ('testuser4@yahoo.com',4);
+ INSERT INTO users ('email','id') VALUES ('testuser5@yahoo.com',5);
+ </pre></code>
+
+ and
+
+ <code><pre>
+ INSERT INTO friendship ('user1','user2') VALUES (624,1);
+ INSERT INTO friendship ('user1','user2') VALUES (808,1);
+ INSERT INTO friendship ('user1','user2') VALUES (81,1);
+ INSERT INTO friendship ('user1','user2') VALUES (15,2);
+ </pre></code>
+
+ To run this script, cd to the hirsute directory and run
+ <code><pre>
+ ruby lib/hirsute.rb samples/readme.hrs
+ </pre></code>
+
+ Roadmap
+ -------
+ Hirsute is still early in development, but my goal is to continue adding output formats (currently only mysql and csv are supported) and generators as well as continuing to allow for more declarative syntax that would make the data generation more flexible, terse, and intuitive.
+
+ I also don't think it yet meets one of my needs, which is to generate the data for a multimillion-user system. So far, it does everything in memory, which will obviously cause problems for large data sets.
+
+ Why Hirsute?
+ ------------
+ The name was a joke with a friend. When I wanted something like this, I asked him, since he's up on many open-source projects. He said he didn't know of something like this, so I replied, "No, no. You're supposed to say 'Look at Hirsute' or something like that."
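The README example steers `pick_from` with a histogram drawn in asterisks. The implementation behind it (`random_item_with_histogram`) is not part of this diff, so the following is only a minimal sketch of how such a weighted pick could work. The helper names `parse_histogram` and `weighted_pick` are hypothetical and are not Hirsute's API.

```ruby
# Hypothetical helpers for illustration; not part of Hirsute.

# Turn a text histogram into one weight per row by counting asterisks.
# Rows without any asterisk (such as a header line) are skipped, so this
# sketch assumes every real row has at least one asterisk.
def parse_histogram(text)
  text.lines.map { |line| line.count('*') }.reject(&:zero?)
end

# Pick one option with probability proportional to its weight.
def weighted_pick(options, weights, rng = Random.new)
  unless options.length == weights.length
    raise ArgumentError, 'options and weights must have the same length'
  end

  target = rng.rand(weights.sum) # integer in 0...total weight
  options.zip(weights).each do |option, weight|
    return option if target < weight
    target -= weight
  end
end

friend_dist = <<~HIST
  num_friends
  0  **
  1  **********
  2  ******************************
HIST

weights = parse_histogram(friend_dist)
num_friends = weighted_pick([0, 1, 2], weights)
```

Walking the cumulative weights keeps the pick linear in the number of options, which is plenty for a handful of histogram rows.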
@@ -0,0 +1,14 @@
+ #!/usr/bin/env ruby
+ require 'fileutils'
+
+ cur_wd = FileUtils.pwd()
+
+ # convert args to absolute paths
+ ARGV.map! {|item| File.expand_path(item)}
+
+ # Absolute path to the directory containing this script, e.g. /home/user/bin
+ curdir = File.expand_path(File.dirname(__FILE__))
+ FileUtils.cd(File.join([curdir,".."]))
+ load(File.join([curdir,"..","lib","hirsute.rb"]))
+
+ FileUtils.cd(cur_wd)
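The launcher converts its arguments to absolute paths before changing directory. A small sketch of why that order matters, using temporary directories; the `absolutize` helper and the file name `script.hrs` are made up for the illustration.

```ruby
require 'tmpdir'

# Hypothetical helper mirroring the ARGV.map! call in the launcher.
def absolutize(args)
  args.map { |arg| File.expand_path(arg) }
end

relative_found = nil
absolute_found = nil

Dir.mktmpdir do |start_dir|
  Dir.mktmpdir do |other_dir|
    Dir.chdir(start_dir) do
      File.write('script.hrs', '# empty script')
      paths = absolutize(['script.hrs']) # resolved against start_dir

      Dir.chdir(other_dir) do
        # The relative name no longer resolves, but the absolute path does.
        relative_found = File.exist?('script.hrs')
        absolute_found = File.exist?(paths.first)
      end
    end
  end
end
```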
@@ -0,0 +1,110 @@
+ # defines the basic functions used by the hirsute language, allowing a hirsute file to be loaded in
+ # usage:
+ #   ruby hirsute.rb <filename>
+ # if filename is not specified, you can use this in irb to define language items
+
+ require('lib/hirsute_utils.rb')
+ require('lib/hirsute_template.rb')
+ require('lib/hirsute_collection.rb')
+ require('lib/hirsute_fixed.rb')
+ require('lib/hirsute_output.rb')
+
+ # store the absolute path of the file (if present) to ensure we don't lose track during chdirs
+ ABS_HRS_FILE = File::expand_path(ARGV[0]) if ARGV[0]
+
+ Dir::chdir(File::dirname(__FILE__) + "/..")
+
+
+ include Hirsute::Support
+
+ @outputters = {:mysql => Hirsute::MySQLOutputter,
+                :csv => Hirsute::CSVOutputter}
+
+ @objToTemplates = Hash.new
+
+ def storage(storage_system)
+   @storage = storage_system
+ end
+
+ def storage_options(options = Hash.new)
+   @storage_options = options
+ end
+
+ # you can use an or a as your definition
+ def an(objName,&block)
+   a(objName,&block)
+ end
+
+ # This method defines a Template with an identifier of objName. It is the basic method for defining
+ # dummy objects
+ def a(objName, &block)
+
+   template = make_template(objName,&block)
+
+   @objToTemplates[objName] = template
+
+   # this allows the client to do something like user * 5
+   # define_method objName -> template
+   self.class.send(:define_method,objName.to_sym) {template}
+
+
+   template
+ end
+
+ # iterates over every object of the specified type across any collection holding that type
+ # usage: foreach(user) {|cur_user| ...} or foreach user do |cur_user| ... end
+ # if you only want to iterate over one collection, use that collection's each method
+ def foreach(objTemplate)
+   # find every collection that has registered for this type of object (in the call you get the template)
+   colls = Hirsute::Collection.collections_holding_object(objTemplate.templateName)
+   colls.each {|coll| coll.each {|item| yield item if block_given?}}
+ end
+
+ # return any object in any collection that meets the criteria passed in a block
+ def any(objTemplate,&block)
+   every(objTemplate,&block).choice
+ end
+
+ # return every object of a given type, combined from any collections that might include that type. If a block is passed, only elements returning true from the
+ # block will be returned
+ def every(objTemplate)
+   results = Array.new
+   foreach(objTemplate) do |item|
+     results << item if !block_given?
+     results << item if block_given? && (yield item)
+   end
+   results
+ end
+
+ # tells Hirsute to output the given collection to the given storage system (or to generate the files necessary for that)
+ # if no storage symbol is passed in, this will use the default set with the storage command
+ def finish(collection,storageSymbol = nil)
+   raise "No storage defined. Use 'storage <symbol>' to define a storage type" if @storage.nil? && storageSymbol.nil?
+
+   if storageSymbol.nil?
+     @outputters[@storage].new(collection,@storage_options).output
+   else
+     @outputters[storageSymbol].new(collection,@storage_options).output
+   end
+ end
+
+ # returns an empty collection
+ def collection(objectName)
+   Hirsute::Collection.new(objectName)
+ end
+
+ # returns an empty collection of the specified template type
+ def collection_of(template)
+   collection template.templateName
+ end
+
+ # pick an item from an array based on an optional histogram
+ def pick_from(options,histogram = nil)
+   random_item_with_histogram(options,histogram)
+ end
+
+
+ if ARGV[0]
+   Dir::chdir(File::dirname(ABS_HRS_FILE))
+   load ABS_HRS_FILE
+ end
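The `a` method registers a template and then defines a method named after it, which is what lets a script write `user * 1000`. Here is a self-contained sketch of that technique. The `Template`, `MiniDsl`, and `Script` names are hypothetical stand-ins, and the `*` operator here only returns labelled strings rather than real generated objects.

```ruby
# Hypothetical stand-in for Hirsute's template class.
Template = Struct.new(:name) do
  # template * n returns n placeholder objects (labelled strings in this sketch)
  def *(count)
    Array.new(count) { |index| "#{name}-#{index + 1}" }
  end
end

module MiniDsl
  # Create a template and expose it through a method named after it.
  def a(name)
    template = Template.new(name)
    self.class.send(:define_method, name.to_sym) { template }
    template
  end
  alias an a
end

class Script
  include MiniDsl
end

script = Script.new
script.a('user')
users = script.user * 3
```

Because the method is defined on the class, every instance of `Script` sees the template once it has been declared.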
@@ -0,0 +1,67 @@
+ # defines a Collection interface for Hirsute::Fixed objects
+ # why not just an array? because eventually this might need to deal with objects in a text file for memory purposes, but I want to provide a consistent interface
+ # in the short-term though, just wrap an array
+ require('lib/hirsute_utils.rb')
+ require('lib/hirsute_template.rb')
+
+ module Hirsute
+
+   class Collection
+     include Enumerable
+     # class-level instance variable mapping each object name to the collections that hold objects of that type
+     @object_names_to_collections = Hash.new
+
+     # class methods
+     # return a list of all collections holding objects of the specified type (empty if none are registered)
+     def self.collections_holding_object(object_name)
+       @object_names_to_collections[object_name] || []
+     end
+
+     def self.registerCollectionForObject(collection,objectName)
+       if @object_names_to_collections[objectName]
+         @object_names_to_collections[objectName] << collection
+       else
+         @object_names_to_collections[objectName] = [collection]
+       end
+     end
+
+     include Enumerable
+     include Support
+
+     attr_reader :object_name
+
+     def initialize(objectName)
+       @object_name = objectName # defines the object type kept in this collection
+       Hirsute::Collection.registerCollectionForObject(self,objectName)
+       @collection = Array.new
+     end
+
+     def each(&block)
+       @collection.each(&block)
+     end
+
+     def <<(element)
+       # allows for deferred definition of type
+       if element.kind_of? Hirsute::Template
+         self.<<(element.make(false))
+         return
+       end
+
+       if element.kind_of? Hirsute::Collection
+         element.each {|item| self << item}
+         return
+       end
+
+       raise "Only objects of type #{@object_name} can be stored in this collection" if element.class != class_for_name(@object_name)
+
+       @collection << element
+     end
+
+     def length; @collection.length; end;
+
+     # so that collections can be used with the one_of generator
+     def choice
+       @collection.choice
+     end
+   end
+ end
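The collection class combines two ideas: an `Enumerable` wrapper around an array, and a class-level registry that lets `foreach` find every collection holding a given object type. A minimal sketch of both follows; `MiniCollection` and its method names are hypothetical, and the type check performed by the real `<<` is left out.

```ruby
class MiniCollection
  include Enumerable

  # Class-level instance variable: object name => collections holding that type.
  @registry = Hash.new { |hash, key| hash[key] = [] }

  class << self
    # Collections registered for a type, or an empty array if there are none.
    def holding(object_name)
      @registry.fetch(object_name, [])
    end

    def register(collection, object_name)
      @registry[object_name] << collection
    end
  end

  attr_reader :object_name

  def initialize(object_name)
    @object_name = object_name
    @items = []
    self.class.register(self, object_name)
  end

  # Enumerable builds map, select, to_a and friends on top of each.
  def each(&block)
    @items.each(&block)
  end

  def <<(item)
    @items << item
    self
  end
end

users_a = MiniCollection.new('user') << 'alice'
users_b = MiniCollection.new('user') << 'bob'
all_users = MiniCollection.holding('user').flat_map(&:to_a)
```

Returning an empty array for unknown types means callers can iterate without a nil check.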
@@ -0,0 +1,17 @@
+ module Hirsute
+   # defines the interface for a Constraint object that can correct a field that would otherwise break a constraint
+   class Constraint
+     def correct(field_value)
+     end
+   end
+
+   # this just appends an incrementing counter to the field value
+   class UniqueConstraint < Constraint
+     def initialize; @counter = 1; end
+     def correct(field_value)
+       ret_val = field_value.to_s + @counter.to_s
+       @counter = @counter + 1
+       ret_val
+     end
+   end
+ end
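An instance variable assigned directly in a class body belongs to the class object, so instance methods see `nil` for it. A counter used by `correct` therefore has to be set per instance, for example in `initialize`. A minimal sketch, with the hypothetical class name `CounterSuffix`:

```ruby
# Hypothetical illustration of a uniqueness-correcting constraint.
class CounterSuffix
  def initialize
    @counter = 1 # per-instance state, set when the object is created
  end

  # Append the current counter to the value, then advance the counter.
  def correct(field_value)
    corrected = "#{field_value}#{@counter}"
    @counter += 1
    corrected
  end
end

suffix = CounterSuffix.new
first = suffix.correct('bob')
second = suffix.correct('bob')
```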
@@ -0,0 +1,17 @@
+ # this represents a fixed object created from a template
+ module Hirsute
+   class Fixed
+
+     attr_accessor :fields
+
+     # though Hirsute can use set and get within itself, set also configures attribute accessors
+     # for the specified field, thus allowing more user-friendly management of Hirsute objects
+     def set(field,value)
+       set_method = (field.to_s + "=").to_sym
+       m = self.method(set_method)
+       m.call value
+     end
+
+     def get(field); self.method(field.to_sym).call; end
+   end
+ end
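The comment on `set` mentions per-field accessors, but the code that creates them is not in this file. The sketch below assumes they are created lazily on the object's singleton class the first time a field is set; that is an assumption for illustration, and `Record` is a hypothetical class, not Hirsute's `Fixed`.

```ruby
# Hypothetical illustration of dynamic field accessors.
class Record
  # Store a value, creating reader and writer methods for the field on first use.
  def set(field, value)
    singleton_class.send(:attr_accessor, field) unless respond_to?("#{field}=")
    public_send("#{field}=", value)
  end

  # Read a field through its generated reader.
  def get(field)
    public_send(field)
  end
end

record = Record.new
record.set(:email, 'a@example.com')
direct = record.email
via_get = record.get(:email)
```

Defining the accessors on the singleton class keeps one record's fields from leaking onto every other instance.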
@@ -0,0 +1,140 @@
+ # represents a generator used to derive a field value. An empty object, save for a generate method that is here largely for
+ # documentation. Instances of this class are set up within the Template object
+
+ require('lib/hirsute_utils.rb')
+
+ module Hirsute
+   class Generator
+     include Hirsute::Support
+
+     def initialize(block)
+       @finalizer = block
+     end
+
+     # do the actual work of generating a value. takes the fixed object being made as an argument
+     def generate(onObj)
+       result = _generate(onObj)
+
+       # if a generator returns a generator, keep going down the chain
+       while result.kind_of? Generator
+         result = result.generate(onObj)
+       end
+
+       # if it's a range, grab the array from the range and choose one item randomly
+       if result.kind_of? Range
+         ary = Hirsute::Support.get_range_array(result)
+         result = ary.choice
+       end
+
+       finish(result,onObj)
+     end
+
+     def _generate(onObj)
+     end
+
+     private
+     def finish(value,onObj)
+       # run the finalizer block, if any, in the context of the object being made
+       if @finalizer
+         onObj.instance_exec value, &@finalizer
+       else
+         value
+       end
+     end
+   end
+
+   # in this case, the definition is fixed, so no need for dynamic construction
+   class CompoundGenerator < Generator
+
+     def initialize(generators,block)
+       @generators = generators
+       super(block)
+     end
+
+     # return the joined response of each embedded generator
+     def _generate(onObj)
+       ret_val = ""
+       @generators.each {|gen| ret_val = ret_val + gen.generate(onObj).to_s}
+       ret_val
+     end
+   end
+
+   # convenience class for literal values (especially strings and nil)
+   class LiteralGenerator < Generator
+     def initialize(value,block)
+       @value = value
+       super(block)
+     end
+
+     def _generate(onObj)
+       @value
+     end
+   end
+
+   class ReadFromFileGenerator < Generator
+
+     public
+     def initialize(file_name,algorithm,block)
+       @file_name = file_name
+       @file = File.open @file_name
+       @algorithm = algorithm
+       super(block)
+     end
+
+     def _generate(onObj)
+       if @algorithm == :markov
+         advance_count = rand(100)
+         read_line_at(advance_count)
+       elsif @algorithm == :linear
+         read_line_at(1)
+       else
+         raise "Unknown read_from_file algorithm: " + @algorithm.to_s
+       end
+
+     end
+
+     private
+     def reset_file
+       @file.close
+       @file = File.open(@file_name)
+     end
+
+     # advances line_count lines from current location (resetting the file if necessary)
+     # and returns the relevant line
+     def read_line_at(line_count)
+       line = ""
+       (0...line_count).each do |idx|
+         line = @file.gets
+
+         while line && line.strip == ""
+           line = @file.gets
+         end
+
+         if line.nil? # reached the end of the file
+           reset_file
+           line = read_line_at(1)
+         end
+       end
+       line.chomp
+     end
+   end
+
+   # generators of this type are dependent on some field in the final object already being set
+   # this base class records those fields and exposes them as an array via dependency_fields
+   class DependentGenerator < Generator
+     def initialize(dependencyFields,block)
+       @dependencyFields = dependencyFields
+       super(block)
+     end
+
+     def dependency_fields
+       if @dependencyFields.kind_of? Array
+         @dependencyFields
+       else
+         @dependencyFields = [@dependencyFields]
+         dependency_fields
+       end
+     end
+
+   end
+ end
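`Generator#generate` does three things in order: it follows generators that return other generators, it picks a random element when the result is a range, and it runs an optional finalizer block in the context of the object being built. A compact sketch of that pipeline follows. `MiniGenerator` and `Person` are hypothetical names, and `Array#sample` is used here in place of the older `Array#choice` that the gem calls.

```ruby
# Hypothetical illustration of the resolve-then-finalize pipeline.
class MiniGenerator
  def initialize(source = nil, &finalizer)
    @source = source
    @finalizer = finalizer
  end

  def generate(on_obj)
    result = raw_value(on_obj)
    # Follow the chain while a generator hands back another generator.
    result = result.generate(on_obj) while result.is_a?(MiniGenerator)
    # A range stands for "any one of these values".
    result = result.to_a.sample if result.is_a?(Range)
    # Run the finalizer with self set to the object being built.
    @finalizer ? on_obj.instance_exec(result, &@finalizer) : result
  end

  private

  # Subclasses would override this; the base version returns the stored source.
  def raw_value(_on_obj)
    @source
  end
end

Person = Struct.new(:name)

inner = MiniGenerator.new(1..6)
outer = MiniGenerator.new(inner) { |roll| "#{name} rolled #{roll}" }
message = outer.generate(Person.new('Ada'))
```

Because the finalizer runs through `instance_exec`, it can call `name` directly, as if it were a method on the object being generated.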