dbd 0.0.1 → 0.0.2

@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+ metadata.gz: 6c1d692a3ce886178a32b2dd624c2a69b4f380ed
+ data.tar.gz: 2ec48ccfd39bb4bab04e626eabc002b0b7c18105
+ SHA512:
+ metadata.gz: 1fc7a48c2f7243febf3d8f8e34a6acb658dfc321ee81bae513fbed3635c9a07289f2b3a93a6f2f2c8fa745e367b6581f2afc965d96a3613b901dcb46096c13d9
+ data.tar.gz: 0130fe663c07bf96352c36ab6e90a6d2bae8cd0731625c71ff2bb49f96c9870ec5daf9dea680aac4946f9044669e0a792e1f452955a49e4f4dd31ef6aed74d3e
@@ -0,0 +1,13 @@
+ 0.0.1 (12 May 2013)
+ =====
+
+ * Initial release showing the basic Fact format
+ * a description with features and rationale
+
+ 0.0.2 (22 May 2013)
+ =====
+
+ * Better TimeStamp management (better randomness)
+ * Adding a (Provenance)Resource directly to a Graph with <<
+ * Simplification and cleaner implementations
+ * Adding Fact to a Resource now sets (provenance_)subject
data/README.md CHANGED
@@ -17,7 +17,7 @@ This is facts based data store, inspired by [RDF] concepts, but adding a log bas
  * 1 data source has _all_ my data : never more loose stuff :-)
  * facts can be invalidated (and replaced) later on
  * Privacy
- * a "hard delete" is possible: all downstream readers of the fact stream
+ * a "hard delete" is possible: all downstream readers of the fact stream
  must remove this fact and replace the back-up
  * since one single back-up file suffices, replacing the *single* back-up
  file will actually remove the hard deleted fact(s) for good
@@ -74,23 +74,22 @@ Open Source [MIT]
74
74
 
75
75
  graph = Dbd::Graph.new
76
76
 
77
- provenance.each {|provenance_fact| graph << provenance_fact}
78
- nobel_peace_2012.each {|fact| graph << fact}
77
+ graph << provenance
78
+ graph << nobel_peace_2012
79
79
 
80
80
  puts graph.to_CSV
81
81
 
82
82
  results in
83
83
 
84
-
85
84
  $ ruby test.rb
86
- "611dbc31-6961-4a86-9259-4a2700add783","2013-05-12 21:50:19 UTC","","98b7bb17-9921-4d52-a08a-39667c2abb4c","provenance:context","public"
87
- "79e9c0e7-b6fd-4735-817c-8c21c97c9575","2013-05-12 21:50:19 UTC","","98b7bb17-9921-4d52-a08a-39667c2abb4c","dcterms:creator","@peter_v"
88
- "7d143a50-8a63-4637-8ab8-c2aa7fc6e12e","2013-05-12 21:50:19 UTC","","98b7bb17-9921-4d52-a08a-39667c2abb4c","provenance:created_at","2013-05-12 21:50:19 UTC"
89
- "fd121b00-7934-4e22-81c8-8e810760c686","2013-05-12 21:50:19 UTC","98b7bb17-9921-4d52-a08a-39667c2abb4c","477a2e10-5e34-434d-8fc1-969277f61f9f","base:nobelPeacePriceWinner","2012"
90
- "2d852fe1-911f-497d-9485-6c24a6000fbb","2013-05-12 21:50:19 UTC","98b7bb17-9921-4d52-a08a-39667c2abb4c","477a2e10-5e34-434d-8fc1-969277f61f9f","rdfs:label","EU"
91
- "ab00b092-65a3-47c0-b10b-837cb0a5ad81","2013-05-12 21:50:19 UTC","98b7bb17-9921-4d52-a08a-39667c2abb4c","477a2e10-5e34-434d-8fc1-969277f61f9f","rdfs:comment","European Union"
92
- "a8d6b34b-6e02-4a5e-8529-4785f090866a","2013-05-12 21:50:19 UTC","98b7bb17-9921-4d52-a08a-39667c2abb4c","477a2e10-5e34-434d-8fc1-969277f61f9f","base:story","A long period of peace,
93
- that is a ""bliss""."
85
+ "aaf11676-d016-4e74-a502-2db042ea8c67","2013-05-22 21:30:08.830374243 UTC","","3fe37986-0c00-45fb-a574-ed2d0374b3fc","provenance:context","public"
86
+ "1fd25f59-b838-4872-a290-4857e783a12c","2013-05-22 21:30:08.830416859 UTC","","3fe37986-0c00-45fb-a574-ed2d0374b3fc","dcterms:creator","@peter_v"
87
+ "f118e66a-aa96-4523-ae47-4f9ceff11916","2013-05-22 21:30:08.830434360 UTC","","3fe37986-0c00-45fb-a574-ed2d0374b3fc","provenance:created_at","2013-05-22 21:30:08 UTC"
88
+ "c2d29b70-7135-4434-829b-f0640475aeb5","2013-05-22 21:30:08.830450090 UTC","3fe37986-0c00-45fb-a574-ed2d0374b3fc","f628f608-27c3-4eb6-a687-cb121f793a4d","base:nobelPeacePriceWinner","2012"
89
+ "9c8048a5-b9e3-459c-9e82-6ae195ce22e6","2013-05-22 21:30:08.830465012 UTC","3fe37986-0c00-45fb-a574-ed2d0374b3fc","f628f608-27c3-4eb6-a687-cb121f793a4d","rdfs:label","EU"
90
+ "5e06472d-2146-4933-a31c-90873fa9ed26","2013-05-22 21:30:08.830478065 UTC","3fe37986-0c00-45fb-a574-ed2d0374b3fc","f628f608-27c3-4eb6-a687-cb121f793a4d","rdfs:comment","European Union"
91
+ "d984e5c3-8acd-4c50-b40f-4cacf9f8f5c7","2013-05-22 21:30:08.830489061 UTC","3fe37986-0c00-45fb-a574-ed2d0374b3fc","f628f608-27c3-4eb6-a687-cb121f793a4d","base:story","A long period of peace,
92
+ that is a ""bliss""."
94
93
 
95
94
  [RDF]: http://www.w3.org/RDF/
96
95
  [Rationale]: http://github.com/petervandenabeele/dbd/blob/master/docs/rationale.md
@@ -26,5 +26,5 @@ Gem::Specification.new do |spec|
26
26
  spec.add_development_dependency 'yard'
27
27
  spec.add_runtime_dependency 'neography'
28
28
  spec.add_runtime_dependency 'rdf', '~> 1.0.6'
29
- spec.add_runtime_dependency 'ruby_peter_v'
29
+ spec.add_runtime_dependency 'ruby_peter_v', '>= 0.0.4'
30
30
  end
@@ -33,6 +33,9 @@ I can build and store a group of resources with provenance

  * change arguments for (Provenance)Fact to an options hash

- * add a store method on Graph
- * that will store a (Provenance)Resource instance
- * this will now set the time_stamps (enforcing the strictly monotic order)
+ * add a << method on Graph
+ * that stores a fact
+ * this will now set the time_stamp (enforcing the strictly monotonic order)
+ * the time_stamp is a set_once property
+ (a soft form of immutable behavior that does not require the creation of
+ new objects (garbage collection cost))
@@ -0,0 +1,23 @@
+ 010_time_class
+
+ As a client
+ I can add facts to a fact stream and the time_stamp is set
+
+ * change valid? to errors
+ * returns array of error messages (errors.empty? indicates valid?)
+
+ * add a performance bench mark to the test set (this will also
+ act as a collision test for the randomization approaches).
+
+ * upgrade ruby 1.9.3 and 2.0.0
+
+ * use a TimeStamp class
+
+ * the to_s of this class shows the time_stamp to ns precision
+
+ * the TimeStamp.new accepts a larger_than: option that
+ works out the offsetting for being strictly larger than the
+ newest_time_stamp in a Fact collection in a graph
+
+ * the time_stamp class adds some random offset time
+ (e.g. between 1 and 999 ns) to a new time_stamp
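
The TimeStamp behaviour listed above (ns precision in to_s, a larger_than: option, a random offset of 1 to 999 ns) can be illustrated with a minimal sketch. This is not the gem's Dbd::TimeStamp: the class name SketchTimeStamp, the integer-nanosecond representation, and the now: parameter are assumptions made only for this example.

```ruby
# Hypothetical sketch; NOT the gem's actual Dbd::TimeStamp implementation.
class SketchTimeStamp
  include Comparable

  NS_PER_S = 1_000_000_000

  # Integer nanoseconds since the Unix epoch (an assumed representation).
  attr_reader :ns

  # larger_than: optional SketchTimeStamp that the new value must exceed.
  # now: injectable clock value, added here only to make the sketch testable.
  def initialize(larger_than: nil, now: Time.now.utc)
    candidate = now.to_i * NS_PER_S + now.nsec
    # If the clock has not moved past the newest known time stamp,
    # start from that time stamp instead of from the clock.
    candidate = larger_than.ns if larger_than && candidate <= larger_than.ns
    # A random offset of 1..999 ns keeps the result strictly larger and
    # reduces the chance of collisions between independent sources.
    @ns = candidate + rand(1..999)
  end

  def <=>(other)
    ns <=> other.ns
  end

  # Renders the time stamp with nanosecond precision.
  def to_s
    seconds, nanoseconds = ns.divmod(NS_PER_S)
    Time.at(seconds).utc.strftime('%Y-%m-%d %H:%M:%S') +
      format('.%09d UTC', nanoseconds)
  end
end

first  = SketchTimeStamp.new
second = SketchTimeStamp.new(larger_than: first)
puts first
puts second
```

Because the offset is always at least 1 ns, the second time stamp is strictly larger than the first even when the wall clock reports the same value twice.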
@@ -0,0 +1,10 @@
+ 011_store_resource_in_graph.txt
+
+ * allow setting the subject and provenance_subject of a fact with set_once
+ (a soft form of immutable behavior that does not require the creation of
+ new objects (garbage collection cost) and may clean up the way too complex
+ check_or_set_subject_and_provenance behavior in Resource)
+
+ * add a << method for a resource on Graph
+ * that will store a (Provenance)Resource instance
+ * this will now set the time_stamps (enforcing the strictly monotonic order)
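
The set_once idea described above can be sketched as follows. The real set_once helper comes from the ruby_peter_v gem and its exact semantics are not shown in this diff, so the module below is an assumption: it accepts a first assignment, tolerates re-assigning an equal value, and raises on any other change. The names SketchSetOnce, AlreadySetError and SetOnceFact are hypothetical.

```ruby
# Hypothetical sketch of a set_once helper; the real one lives in ruby_peter_v
# and may differ (for example in the error class it raises).
module SketchSetOnce
  class AlreadySetError < StandardError; end

  # Sets @attribute when it is still nil; re-setting the same value is a no-op;
  # any attempt to change an already set value raises.
  def set_once(attribute, value)
    ivar = "@#{attribute}"
    current = instance_variable_get(ivar)
    if current.nil?
      instance_variable_set(ivar, value)
    elsif current != value
      raise AlreadySetError, "#{attribute} is already set to #{current.inspect}"
    end
  end
end

class SetOnceFact
  include SketchSetOnce

  attr_reader :subject

  # The object itself is modified, so no duplicate has to be allocated.
  def subject=(subject)
    set_once(:subject, subject)
  end
end

fact = SetOnceFact.new
fact.subject = 'subject-1' # first assignment succeeds
fact.subject = 'subject-1' # same value again is accepted
begin
  fact.subject = 'subject-2'
rescue SketchSetOnce::AlreadySetError => e
  puts e.message
end
```

Under these assumptions a Resource can simply assign its own subject to every added fact: a fact without a subject receives it, a fact with the same subject passes unchanged, and a fact with a different subject is rejected.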
@@ -1,4 +1,4 @@
1
- 010_provenance_fact_predicates_from_provenance_ontology
1
+ 011_provenance_fact_predicates_from_provenance_ontology
2
2
 
3
3
  As a client
4
4
  I can read the predicates for a provenance_fact from provenance ontology
@@ -26,7 +26,17 @@ nobel_peace_2012 << fact_EU_story
26
26
 
27
27
  graph = Dbd::Graph.new
28
28
 
29
- provenance.each {|provenance_fact| graph << provenance_fact}
30
- nobel_peace_2012.each {|fact| graph << fact}
29
+ graph << provenance
30
+ graph << nobel_peace_2012
31
31
 
32
32
  puts graph.to_CSV
33
+
34
+ # "9f868d99-af27-4d83-86ae-ea5f4a1fa654","2013-05-22 21:25:48.136527770 UTC","","a6e028dd-a340-49ce-b3f8-2f158e257a87","provenance:context","public"
35
+ # "28496b40-1891-4bd0-9ee1-0c6c2a878cc1","2013-05-22 21:25:48.136596276 UTC","","a6e028dd-a340-49ce-b3f8-2f158e257a87","dcterms:creator","@peter_v"
36
+ # "98b9dd72-3473-4500-814d-d955eec2c5ee","2013-05-22 21:25:48.136621174 UTC","","a6e028dd-a340-49ce-b3f8-2f158e257a87","provenance:created_at","2013-05-22 21:25:40 UTC"
37
+ # "a0482b46-414d-40df-b436-41142728fda6","2013-05-22 21:25:55.367834295 UTC","a6e028dd-a340-49ce-b3f8-2f158e257a87","cd66aece-0b21-4e3e-8286-4191efb3aea1","base:nobelPeacePriceWinner","2012"
38
+ # "c1af381d-800f-4279-a2cc-ccccf31f5134","2013-05-22 21:25:55.367891996 UTC","a6e028dd-a340-49ce-b3f8-2f158e257a87","cd66aece-0b21-4e3e-8286-4191efb3aea1","rdfs:label","EU"
39
+ # "ac08843a-baae-49ea-a725-81b7e199e8f9","2013-05-22 21:25:55.367910018 UTC","a6e028dd-a340-49ce-b3f8-2f158e257a87","cd66aece-0b21-4e3e-8286-4191efb3aea1","rdfs:comment","European Union"
40
+ # "6e91fa40-daa8-45d1-916e-f9b243d01f2c","2013-05-22 21:25:55.367928936 UTC","a6e028dd-a340-49ce-b3f8-2f158e257a87","cd66aece-0b21-4e3e-8286-4191efb3aea1","base:story","A long period of peace,
41
+ # that is a ""bliss""."
42
+
data/lib/dbd.rb CHANGED
@@ -3,6 +3,7 @@ require 'ruby_peter_v'
3
3
  require 'dbd/version'
4
4
 
5
5
  require 'dbd/errors'
6
+ require 'dbd/time_stamp'
6
7
  require 'dbd/fact'
7
8
  require 'dbd/provenance_fact'
8
9
  require 'dbd/resource'
@@ -7,7 +7,7 @@ module Dbd
7
7
  ##
8
8
  # Basic Fact of knowledge.
9
9
  #
10
- # The database is built as an ordered sequence of facts, the "fact stream".
10
+ # The database is built as an ordered sequence of facts, a "fact stream".
11
11
  #
12
12
  # This is somewhat similar to a "triple" in the RDF (Resource Description
13
13
  # Framework) concept, but with different and extended functionality.
@@ -15,22 +15,22 @@ module Dbd
15
15
  # Each basic fact has:
16
16
  # * a unique and invariant *id* (a uuid)
17
17
  #
18
- # To allow referencing back to it (e.g. to invalidate it later in the fact stream).
18
+ # To allow referencing back to it (e.g. to invalidate it later in a fact stream).
19
19
  #
20
20
  # * a *time_stamp* (time with nanosecond granularity)
21
21
  #
22
- # To allow verifying that the order in the fact stream is correct.
22
+ # To allow verifying that the order in a fact stream is correct.
23
23
  #
24
24
  # A time_stamp does not need to represent the exact time of the
25
25
  # creation of the fact, but it has to increase in strictly monotonic
26
- # order in the fact stream.
26
+ # order in a fact stream.
27
27
  #
28
28
  # * a *provenance_subject* (a uuid)
29
29
  #
30
30
  # The subject of the ProvenanceResource (a set of ProvenanceFacts with
31
31
  # the same subject) about this fact. Each Fact, points *back* to a
32
32
  # ProvenanceResource (the ProvenanceResource must have been fully
33
- # defined, earlier in the fact stream).
33
+ # defined, earlier in a fact stream).
34
34
  #
35
35
  # * a *subject* (a uuid)
36
36
  #
@@ -72,9 +72,35 @@ module Dbd
72
72
  attr_reader attribute
73
73
  end
74
74
 
75
+ ##
76
+ # A set_once setter for time_stamp.
77
+ #
78
+ # This implements a "form" of immutable behavior. The value can
79
+ # be set once (possibly after creation of the object), but can
80
+ # never be changed after that.
75
81
  def time_stamp=(time_stamp)
76
- raise RuntimeError if @time_stamp
77
- @time_stamp = time_stamp
82
+ raise ArgumentError unless time_stamp.is_a?(TimeStamp)
83
+ set_once(:time_stamp, time_stamp)
84
+ end
85
+
86
+ ##
87
+ # A set_once setter for subject.
88
+ #
89
+ # This implements a "form" of immutable behavior. The value can
90
+ # be set once (possibly after creation of the object), but can
91
+ # never be changed after that.
92
+ def subject=(subject)
93
+ set_once(:subject, subject)
94
+ end
95
+
96
+ ##
97
+ # A set_once setter for provenance_subject.
98
+ #
99
+ # This implements a "form" of immutable behavior. The value can
100
+ # be set once (possibly after creation of the object), but can
101
+ # never be changed after that.
102
+ def provenance_subject=(provenance_subject)
103
+ set_once(:provenance_subject, provenance_subject)
78
104
  end
79
105
 
80
106
  ##
@@ -128,15 +154,18 @@ module Dbd
128
154
  end
129
155
 
130
156
  ##
131
- # Checks if a fact is valid for storing in the graph.
157
+ # Checks if a fact has errors for storing in the graph.
132
158
  #
133
- # @return [#true?] not nil if valid
134
- def valid?
135
- # id not validated, is set automatically
136
- # predicate not validated, is validated in initialize
137
- # object not validated, is validated in initialize
138
- provenance_subject_valid?(provenance_subject) &&
139
- subject
159
+ # @return [Array] an Array of error messages
160
+ def errors
161
+ # * id not validated, is set automatically upon creation
162
+ # * time_stamp not validated, is set automatically later
163
+ # * predicate not validated, is validated in initialize
164
+ # * object not validated, is validated in initialize
165
+ [].tap do |a|
166
+ a << provenance_subject_error(provenance_subject)
167
+ a << "Subject is missing" unless subject
168
+ end.compact
140
169
  end
141
170
 
142
171
  ##
@@ -147,35 +176,9 @@ module Dbd
147
176
  # This is how the difference is encoded between Fact and
148
177
  # ProvenanceFact in the fact stream.
149
178
  # @param [#nil?] provenance_subject
150
- # Return [Boolean]
151
- def provenance_subject_valid?(provenance_subject)
152
- provenance_subject
153
- end
154
-
155
- ##
156
- # Builds duplicate with the subject set.
157
- #
158
- # @param [Subject] subject_arg
159
- # @return [Fact] the duplicate fact
160
- def dup_with_subject(subject_arg)
161
- self.class.new(
162
- provenance_subject: provenance_subject,
163
- subject: subject_arg, # from arg
164
- predicate: predicate,
165
- object: object)
166
- end
167
-
168
- ##
169
- # Builds duplicate with the provenance_subject set.
170
- #
171
- # @param [Subject] provenance_subject_arg
172
- # @return [Fact] the duplicate fact
173
- def dup_with_provenance_subject(provenance_subject_arg)
174
- self.class.new(
175
- provenance_subject: provenance_subject_arg, # from arg
176
- subject: subject,
177
- predicate: predicate,
178
- object: object)
179
+ # Return [nil, String] nil or an error message
180
+ def provenance_subject_error(provenance_subject)
181
+ "Provenance subject is missing" unless provenance_subject
179
182
  end
180
183
 
181
184
  end
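
The move from valid? to errors shown in the hunks above can be tried out in isolation with a small sketch. The classes ValidatedFact and ValidatedProvenanceFact are hypothetical stand-ins, not the gem's Fact and ProvenanceFact; only the error messages and the tap/compact pattern are taken from the diff.

```ruby
# Hypothetical stand-ins for Dbd::Fact and Dbd::ProvenanceFact.
class ValidatedFact
  attr_reader :provenance_subject, :subject

  def initialize(provenance_subject: nil, subject: nil)
    @provenance_subject = provenance_subject
    @subject = subject
  end

  # Collects error messages; an empty array means the fact can be stored.
  def errors
    [].tap do |a|
      a << provenance_subject_error(provenance_subject) # may append nil
      a << 'Subject is missing' unless subject
    end.compact # drops the nil entries
  end

  private

  # A base fact must point back to a provenance resource.
  def provenance_subject_error(provenance_subject)
    'Provenance subject is missing' unless provenance_subject
  end
end

class ValidatedProvenanceFact < ValidatedFact
  private

  # A provenance fact must NOT have a provenance_subject.
  def provenance_subject_error(provenance_subject)
    'Provenance subject should not be present in Provenance Fact' if provenance_subject
  end
end

puts ValidatedFact.new.errors.join(', ')
```

Returning messages instead of a boolean lets the caller build a descriptive exception, which is what the new `raise FactError, "#{fact.errors.join(', ')}."` line in the collection relies on.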
@@ -29,7 +29,7 @@ module Dbd
29
29
  #
30
30
  # @return [self] for chaining
31
31
  #
32
- # Validates that added fact is valid.
32
+ # Validates that added fact is valid (has no errors).
33
33
  #
34
34
  # Validates that added fact is newer.
35
35
  #
@@ -41,8 +41,7 @@ module Dbd
41
41
  #
42
42
  # Mark the fact in the list of used provenance_subjects (for [A]).
43
43
  def <<(fact)
44
- # TODO Add a more descriptive Exception message
45
- raise FactError unless fact.valid?
44
+ raise FactError, "#{fact.errors.join(', ')}." unless fact.errors.empty?
46
45
  raise OutOfOrderError if (self.newest_time_stamp && fact.time_stamp <= self.newest_time_stamp)
47
46
  raise OutOfOrderError if (@used_provenance_subjects[fact.subject])
48
47
  index = Helpers::OrderedSetCollection.add_and_return_index(fact, @internal_collection)
@@ -9,9 +9,15 @@ module Dbd
9
9
 
10
10
  include Fact::Collection
11
11
 
12
- def <<(fact)
13
- enforce_strictly_monotonic_time(fact)
14
- super(fact)
12
+ ##
13
+ # Add a Fact or Resource to the Graph.
14
+ #
15
+ # This will add a time_stamp to the Facts.
16
+ def <<(fact_or_resource)
17
+ Array(fact_or_resource).each do |fact|
18
+ enforce_strictly_monotonic_time(fact)
19
+ super(fact)
20
+ end
15
21
  end
16
22
 
17
23
  ##
@@ -29,18 +35,11 @@ module Dbd
29
35
  private
30
36
 
31
37
  ##
32
- # The system mmust enforce that the time_stamps are strictly monotonic.
33
- #
34
- # This has been detected because on Java (JRuby) the the Wall time has
35
- # a resolution of only 1 ms so sometimes, the exact same value for
36
- # Time.now was reported.
38
+ # Setting a strictly monotonically increasing time_stamp (if not yet set).
39
+ # The time_stamp also has some randomness (1 .. 999 ns) to reduce the
40
+ # chance of collisions when merging fact streams from different sources.
37
41
  def enforce_strictly_monotonic_time(fact)
38
- new_time = Time.now.utc
39
- newest_time_stamp = newest_time_stamp()
40
- if newest_time_stamp && new_time <= newest_time_stamp
41
- new_time = newest_time_stamp + 0.000_000_002 # Add approx. 2 nanoseconds
42
- end
43
- fact.time_stamp = new_time
42
+ fact.time_stamp = TimeStamp.new(larger_than: newest_time_stamp) unless fact.time_stamp
44
43
  end
45
44
 
46
45
  end
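
The new Graph#<< relies on Kernel#Array to treat a single fact and a whole resource uniformly. A minimal sketch of that dispatch, assuming the resource exposes to_a (here through Enumerable) and the fact does not; PlainFact, SketchResource and SketchGraph are hypothetical names, and returning self is a choice made for this sketch only:

```ruby
# Hypothetical stand-ins; only the Array(...) dispatch mirrors the diff.
class PlainFact
  attr_reader :label

  def initialize(label)
    @label = label
  end
end

class SketchResource
  include Enumerable # provides to_a, which Kernel#Array uses

  def initialize(*facts)
    @facts = facts
  end

  def each(&block)
    @facts.each(&block)
  end
end

class SketchGraph
  attr_reader :facts

  def initialize
    @facts = []
  end

  # Array(x) returns x.to_a for a collection and [x] for an object that
  # responds to neither to_ary nor to_a, so one loop covers both cases.
  def <<(fact_or_resource)
    Array(fact_or_resource).each { |fact| @facts << fact }
    self # returned for chaining in this sketch
  end
end

graph = SketchGraph.new
graph << PlainFact.new('single')
graph << SketchResource.new(PlainFact.new('a'), PlainFact.new('b'))
puts graph.facts.map(&:label).join(', ')
```

One caveat of this approach: an object that happens to define to_a for another purpose (a Struct, for instance) would be split into its members by Array(), so it only works when single facts do not respond to to_a.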
@@ -12,7 +12,7 @@ module Dbd
12
12
  # usage of provenance_subject is not recursive on this level (this
13
13
  # allows efficient single pass loading in an underlying database).
14
14
  #
15
- # In the serialisation of the fact stream, the presence or absence of a
15
+ # In the serialisation of a fact stream, the presence or absence of a
16
16
  # provenance_subject marks the difference between a (base) Fact and a
17
17
  # ProvenanceFact.
18
18
  #
@@ -47,9 +47,9 @@ module Dbd
47
47
  #
48
48
  # Here, in the derived ProvenanceFact, it must not be present.
49
49
  # @param [#nil?] provenance_subject
50
- # Return [Boolean]
51
- def provenance_subject_valid?(provenance_subject)
52
- provenance_subject.nil?
50
+ # Return [nil, String] nil if valid, an error message if not
51
+ def provenance_subject_error(provenance_subject)
52
+ "Provenance subject should not be present in Provenance Fact" if provenance_subject
53
53
  end
54
54
 
55
55
  ##
@@ -44,10 +44,11 @@ module Dbd
44
44
  ##
45
45
  # Check provenance_subject, which should be nil here
46
46
  # @param [ProvenanceFact] provenance_fact
47
- # @return [ProvenanceFact] with validated nil on provenance_subject
48
- def check_or_set_provenance(provenance_fact)
49
- raise ProvenanceError if provenance_fact.provenance_subject
50
- provenance_fact
47
+ def set_provenance!(provenance_fact)
48
+ if provenance_fact.provenance_subject
49
+ raise ProvenanceError,
50
+ "trying to set provenance_subject to #{provenance_fact.provenance_subject}"
51
+ end
51
52
  end
52
53
 
53
54
  end
@@ -19,14 +19,22 @@ module Dbd
19
19
  # Resources that are associated with it.
20
20
  #
21
21
  # During build-up of a Fact, the subject and the provenance_subject
22
- # can be nil. These will then be set in a local duplicate when the
23
- # Fact is added (with '<<') to a resource.
22
+ # can be nil. These will then be set when the Fact is added
23
+ # (with '<<') to a resource.
24
24
  class Resource
25
25
 
26
26
  include Helpers::OrderedSetCollection
27
27
 
28
28
  attr_reader :subject
29
29
 
30
+ ##
31
+ # Getter for provenance_subject.
32
+ #
33
+ # Will be overridden in the ProvenanceResource subclass.
34
+ def provenance_subject
35
+ @provenance_subject
36
+ end
37
+
30
38
  ##
31
39
  # @return [Fact::Subject] a new (random) Resource subject
32
40
  def self.new_subject
@@ -56,57 +64,28 @@ module Dbd
56
64
  ##
57
65
  # Add a fact.
58
66
  #
59
- # * if it has no subject, the subject is set in a duplicate fact
67
+ # * if it has no subject, the subject is set (this modifies the fact !)
60
68
  # * if it has the same subject as the resource, added unchanged.
61
69
  # * if it has a different subject, a SubjectError is raised.
62
- # * if it has no provenance_subject, the provenance_subject is set in a duplicate fact
70
+ # * if it has no provenance_subject, the provenance_subject is set (this modifies the fact !)
63
71
  # * if it has the same provenance_subject as the resource, added unchanged.
64
72
  # * if it has a different provenance_subject, a ProvenanceError is raised.
65
73
  def <<(fact)
66
74
  # TODO: check the type of the fact (Fact)
67
- super(check_or_set_subject_and_provenance(fact))
68
- end
69
-
70
- ##
71
- # Getter for provenance_subject.
72
- #
73
- # Will be overridden in the ProvenanceResource subclass.
74
- def provenance_subject
75
- @provenance_subject
75
+ set_subject!(fact)
76
+ set_provenance!(fact)
77
+ super(fact)
76
78
  end
77
79
 
78
80
  private
79
81
 
80
- def check_or_set_subject_and_provenance(element)
81
- with_subject = check_or_set_subject(element)
82
- check_or_set_provenance(with_subject)
83
- end
84
-
85
- def check_or_set_subject(element)
86
- if element.subject
87
- if element.subject == @subject
88
- return element
89
- else
90
- raise SubjectError,
91
- "self.subject is #{subject} and element.subject is #{element.subject}"
92
- end
93
- else
94
- element.dup_with_subject(@subject)
95
- end
82
+ def set_subject!(fact)
83
+ fact.subject = subject
96
84
  end
97
85
 
98
86
  # this will be overridden in the ProvenanceResource subclass
99
- def check_or_set_provenance(element)
100
- if element.provenance_subject
101
- if element.provenance_subject == @provenance_subject
102
- return element
103
- else
104
- raise ProvenanceError,
105
- "self.provenance_subject is #{provenance_subject} and element.provenance_subject is #{element.provenance_subject}"
106
- end
107
- else
108
- element.dup_with_provenance_subject(@provenance_subject)
109
- end
87
+ def set_provenance!(fact)
88
+ fact.provenance_subject = provenance_subject
110
89
  end
111
90
 
112
91
  def validate_provenance_subject