dbd 0.0.1 → 0.0.2

@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+ metadata.gz: 6c1d692a3ce886178a32b2dd624c2a69b4f380ed
+ data.tar.gz: 2ec48ccfd39bb4bab04e626eabc002b0b7c18105
+ SHA512:
+ metadata.gz: 1fc7a48c2f7243febf3d8f8e34a6acb658dfc321ee81bae513fbed3635c9a07289f2b3a93a6f2f2c8fa745e367b6581f2afc965d96a3613b901dcb46096c13d9
+ data.tar.gz: 0130fe663c07bf96352c36ab6e90a6d2bae8cd0731625c71ff2bb49f96c9870ec5daf9dea680aac4946f9044669e0a792e1f452955a49e4f4dd31ef6aed74d3e
@@ -0,0 +1,13 @@
+ 0.0.1 (12 May 2013)
+ =====
+
+ * Initial release showing the basic Fact format
+ * a description with features and rationale
+
+ 0.0.2 (22 May 2013)
+ =====
+
+ * Better TimeStamp management (better randomness)
+ * Adding a (Provenance)Resource directly to a Graph with <<
+ * Simplification and cleaner implementations
+ * Adding Fact to a Resource now sets (provenance_)subject
data/README.md CHANGED
@@ -17,7 +17,7 @@ This is facts based data store, inspired by [RDF] concepts, but adding a log bas
  * 1 data source has _all_ my data : never more loose stuff :-)
  * facts can be invalidated (and replaced) later on
  * Privacy
- * a "hard delete" is possible: all downstream readers of the fact stream
+ * a "hard delete" is possible: all downstream readers of the fact stream
  must remove this fact and replace the back-up
  * since one single back-up file suffices, replacing the *single* back-up
  file will actually remove the hard deleted fact(s) for good
@@ -74,23 +74,22 @@ Open Source [MIT]
 
  graph = Dbd::Graph.new
 
- provenance.each {|provenance_fact| graph << provenance_fact}
- nobel_peace_2012.each {|fact| graph << fact}
+ graph << provenance
+ graph << nobel_peace_2012
 
  puts graph.to_CSV
 
  results in
 
-
  $ ruby test.rb
- "611dbc31-6961-4a86-9259-4a2700add783","2013-05-12 21:50:19 UTC","","98b7bb17-9921-4d52-a08a-39667c2abb4c","provenance:context","public"
- "79e9c0e7-b6fd-4735-817c-8c21c97c9575","2013-05-12 21:50:19 UTC","","98b7bb17-9921-4d52-a08a-39667c2abb4c","dcterms:creator","@peter_v"
- "7d143a50-8a63-4637-8ab8-c2aa7fc6e12e","2013-05-12 21:50:19 UTC","","98b7bb17-9921-4d52-a08a-39667c2abb4c","provenance:created_at","2013-05-12 21:50:19 UTC"
- "fd121b00-7934-4e22-81c8-8e810760c686","2013-05-12 21:50:19 UTC","98b7bb17-9921-4d52-a08a-39667c2abb4c","477a2e10-5e34-434d-8fc1-969277f61f9f","base:nobelPeacePriceWinner","2012"
- "2d852fe1-911f-497d-9485-6c24a6000fbb","2013-05-12 21:50:19 UTC","98b7bb17-9921-4d52-a08a-39667c2abb4c","477a2e10-5e34-434d-8fc1-969277f61f9f","rdfs:label","EU"
- "ab00b092-65a3-47c0-b10b-837cb0a5ad81","2013-05-12 21:50:19 UTC","98b7bb17-9921-4d52-a08a-39667c2abb4c","477a2e10-5e34-434d-8fc1-969277f61f9f","rdfs:comment","European Union"
- "a8d6b34b-6e02-4a5e-8529-4785f090866a","2013-05-12 21:50:19 UTC","98b7bb17-9921-4d52-a08a-39667c2abb4c","477a2e10-5e34-434d-8fc1-969277f61f9f","base:story","A long period of peace,
- that is a ""bliss""."
+ "aaf11676-d016-4e74-a502-2db042ea8c67","2013-05-22 21:30:08.830374243 UTC","","3fe37986-0c00-45fb-a574-ed2d0374b3fc","provenance:context","public"
+ "1fd25f59-b838-4872-a290-4857e783a12c","2013-05-22 21:30:08.830416859 UTC","","3fe37986-0c00-45fb-a574-ed2d0374b3fc","dcterms:creator","@peter_v"
+ "f118e66a-aa96-4523-ae47-4f9ceff11916","2013-05-22 21:30:08.830434360 UTC","","3fe37986-0c00-45fb-a574-ed2d0374b3fc","provenance:created_at","2013-05-22 21:30:08 UTC"
+ "c2d29b70-7135-4434-829b-f0640475aeb5","2013-05-22 21:30:08.830450090 UTC","3fe37986-0c00-45fb-a574-ed2d0374b3fc","f628f608-27c3-4eb6-a687-cb121f793a4d","base:nobelPeacePriceWinner","2012"
+ "9c8048a5-b9e3-459c-9e82-6ae195ce22e6","2013-05-22 21:30:08.830465012 UTC","3fe37986-0c00-45fb-a574-ed2d0374b3fc","f628f608-27c3-4eb6-a687-cb121f793a4d","rdfs:label","EU"
+ "5e06472d-2146-4933-a31c-90873fa9ed26","2013-05-22 21:30:08.830478065 UTC","3fe37986-0c00-45fb-a574-ed2d0374b3fc","f628f608-27c3-4eb6-a687-cb121f793a4d","rdfs:comment","European Union"
+ "d984e5c3-8acd-4c50-b40f-4cacf9f8f5c7","2013-05-22 21:30:08.830489061 UTC","3fe37986-0c00-45fb-a574-ed2d0374b3fc","f628f608-27c3-4eb6-a687-cb121f793a4d","base:story","A long period of peace,
+ that is a ""bliss""."
 
  [RDF]: http://www.w3.org/RDF/
  [Rationale]: http://github.com/petervandenabeele/dbd/blob/master/docs/rationale.md
@@ -26,5 +26,5 @@ Gem::Specification.new do |spec|
  spec.add_development_dependency 'yard'
  spec.add_runtime_dependency 'neography'
  spec.add_runtime_dependency 'rdf', '~> 1.0.6'
- spec.add_runtime_dependency 'ruby_peter_v'
+ spec.add_runtime_dependency 'ruby_peter_v', '>= 0.0.4'
  end
@@ -33,6 +33,9 @@ I can build and store a group of resources with provenance
 
  * change arguments for (Provenance)Fact to an options hash
 
- * add a store method on Graph
- * that will store a (Provenance)Resource instance
- * this will now set the time_stamps (enforcing the strictly monotic order)
+ * add a << method on Graph
+ * that stores a fact
+ * this will now set the time_stamp (enforcing the strictly monotic order)
+ * the time_stamp is a set_once property
+ (a soft form of immutable behavior that does not require the creation of
+ new objects (garbage collection cost).
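
The set_once behavior named in these notes can be sketched as follows. This is a hypothetical illustration, not the helper shipped in the ruby_peter_v gem: the module name `SketchSetOnce`, the class `SetOnceFact`, and the error class `AlreadySetError` are invented here, and tolerating a repeated assignment of the same value is an assumption inferred from the Resource#<< comments elsewhere in this diff.

```ruby
# Hypothetical sketch of set_once semantics; the real helper comes from the
# ruby_peter_v gem and may differ in details such as the error class.
module SketchSetOnce
  class AlreadySetError < StandardError; end

  # Store value in @attribute if it is still nil, accept a repeated
  # assignment of the same value, and raise when a different value was
  # stored earlier. No new object is created to "change" the attribute.
  def set_once(attribute, value)
    ivar = "@#{attribute}"
    current = instance_variable_get(ivar)
    if current.nil?
      instance_variable_set(ivar, value)
    elsif current != value
      raise AlreadySetError, "#{attribute} was already set to #{current.inspect}"
    end
    value
  end
end

# Minimal stand-in for a fact with a set_once subject.
class SetOnceFact
  include SketchSetOnce

  attr_reader :subject

  def subject=(subject)
    set_once(:subject, subject)
  end
end

fact = SetOnceFact.new
fact.subject = 'abc'
fact.subject = 'abc'      # same value: accepted
begin
  fact.subject = 'xyz'    # different value: rejected
rescue SketchSetOnce::AlreadySetError => e
  puts e.message
end
```

Compared with building a duplicate object that carries the new value, this keeps a single instance alive, which is the garbage collection argument made in the note above.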
@@ -0,0 +1,23 @@
+ 010_time_class
+
+ As a client
+ I can add facts to a fact stream and the time_stamp is set
+
+ * change valid? to errors
+ * returns array of error messages (errors.empty? indicates valid?)
+
+ * add a performance bench mark to the test set (this will also
+ act as a collision test for the randomization approaches).
+
+ * upgrade ruby 1.9.3 and 2.0.0
+
+ * use a TimeStamp class
+
+ * the to_s of this class shows the time_stamp to ns precision
+
+ * the TimeStamp.new accepts a larger_than: option that
+ works out the offsetting for being strictly larger than the
+ newest_time_stamp in a Fact collection in a graph
+
+ * the time_stamp class adds some random offset time
+ (e.g. between 1 and 999 ns) to a new time_stamp
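
The TimeStamp behavior listed above (nanosecond `to_s`, a `larger_than:` option, a random 1 to 999 ns offset) might look roughly like the sketch below. It is a hypothetical stand-in and not the gem's `Dbd::TimeStamp`: the class name `SketchTimeStamp`, the extra `now:` parameter (added only to make the example testable), and the use of `Rational` offsets are assumptions.

```ruby
# Hypothetical sketch; not the gem's Dbd::TimeStamp implementation.
class SketchTimeStamp
  include Comparable

  NS = 1_000_000_000

  attr_reader :time

  # larger_than: an optional SketchTimeStamp that the new one must exceed.
  # now:         the base time (a parameter only so the sketch is testable).
  def initialize(larger_than: nil, now: Time.now.utc)
    offset = Rational(rand(1..999), NS)   # random offset of 1..999 ns
    candidate = now + offset
    # If the clock did not advance far enough, start from the newest
    # time stamp instead, so the result is strictly larger.
    if larger_than && candidate <= larger_than.time
      candidate = larger_than.time + offset
    end
    @time = candidate
  end

  def <=>(other)
    time <=> other.time
  end

  # Show the time stamp with nanosecond precision.
  def to_s
    time.strftime('%Y-%m-%d %H:%M:%S.%N UTC')
  end
end

first  = SketchTimeStamp.new
second = SketchTimeStamp.new(larger_than: first)
puts second > first
puts second
```

Using a `Rational` offset avoids the rounding that a Float offset such as `0.000_000_002` can introduce at nanosecond scale.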
@@ -0,0 +1,10 @@
+ 011_store_resource_in_graph.txt
+
+ * allow setting the subject and provenance_subject of a fact with set_once
+ (a soft form of immutable behavior that does not require the creation of
+ new objects (garbage collection cost) and may clean up the way too complex
+ check_or_set_subject_and_provenance behavior in Resource)
+
+ * add a << method for a resource on Graph
+ * that will store a (Provenance)Resource instance
+ * this will now set the time_stamps (enforcing the strictly monotic order)
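
One way to let a single `<<` accept either one fact or a whole resource is Ruby's `Kernel#Array`, which wraps a plain object in a one-element array and calls `to_a` on a collection. The sketch below uses hypothetical stand-in classes (`SketchFact`, `SketchResource`, `SketchGraph`) and an integer counter in place of a real time stamp; none of these are the gem's classes.

```ruby
# Hypothetical stand-ins; the gem's Fact, Resource and Graph are richer.
class SketchFact
  attr_reader :predicate, :object
  attr_accessor :time_stamp

  def initialize(predicate, object)
    @predicate = predicate
    @object = object
  end
end

class SketchResource
  def initialize
    @facts = []
  end

  def <<(fact)
    @facts << fact
    self
  end

  # Array(resource) calls to_a, which yields the contained facts.
  def to_a
    @facts.dup
  end
end

class SketchGraph
  attr_reader :facts

  def initialize
    @facts = []
    @counter = 0 # stand-in for a strictly increasing time stamp
  end

  # Accepts a single fact or anything that converts to an array of facts.
  def <<(fact_or_resource)
    Array(fact_or_resource).each do |fact|
      fact.time_stamp ||= (@counter += 1)
      @facts << fact
    end
    self
  end
end

resource = SketchResource.new
resource << SketchFact.new('rdfs:label', 'EU')
resource << SketchFact.new('rdfs:comment', 'European Union')

graph = SketchGraph.new
graph << SketchFact.new('provenance:context', 'public') # a single fact
graph << resource                                        # a whole resource
puts graph.facts.map(&:time_stamp).inspect               # [1, 2, 3]
```

A caveat of this approach: `Array()` also unpacks any argument that responds to `to_a` (a `Struct`, for example), so the fact class itself must not define `to_a`.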
@@ -1,4 +1,4 @@
- 010_provenance_fact_predicates_from_provenance_ontology
+ 011_provenance_fact_predicates_from_provenance_ontology
 
  As a client
  I can read the predicates for a provenance_fact from provenance ontology
@@ -26,7 +26,17 @@ nobel_peace_2012 << fact_EU_story
 
  graph = Dbd::Graph.new
 
- provenance.each {|provenance_fact| graph << provenance_fact}
- nobel_peace_2012.each {|fact| graph << fact}
+ graph << provenance
+ graph << nobel_peace_2012
 
  puts graph.to_CSV
+
+ # "9f868d99-af27-4d83-86ae-ea5f4a1fa654","2013-05-22 21:25:48.136527770 UTC","","a6e028dd-a340-49ce-b3f8-2f158e257a87","provenance:context","public"
+ # "28496b40-1891-4bd0-9ee1-0c6c2a878cc1","2013-05-22 21:25:48.136596276 UTC","","a6e028dd-a340-49ce-b3f8-2f158e257a87","dcterms:creator","@peter_v"
+ # "98b9dd72-3473-4500-814d-d955eec2c5ee","2013-05-22 21:25:48.136621174 UTC","","a6e028dd-a340-49ce-b3f8-2f158e257a87","provenance:created_at","2013-05-22 21:25:40 UTC"
+ # "a0482b46-414d-40df-b436-41142728fda6","2013-05-22 21:25:55.367834295 UTC","a6e028dd-a340-49ce-b3f8-2f158e257a87","cd66aece-0b21-4e3e-8286-4191efb3aea1","base:nobelPeacePriceWinner","2012"
+ # "c1af381d-800f-4279-a2cc-ccccf31f5134","2013-05-22 21:25:55.367891996 UTC","a6e028dd-a340-49ce-b3f8-2f158e257a87","cd66aece-0b21-4e3e-8286-4191efb3aea1","rdfs:label","EU"
+ # "ac08843a-baae-49ea-a725-81b7e199e8f9","2013-05-22 21:25:55.367910018 UTC","a6e028dd-a340-49ce-b3f8-2f158e257a87","cd66aece-0b21-4e3e-8286-4191efb3aea1","rdfs:comment","European Union"
+ # "6e91fa40-daa8-45d1-916e-f9b243d01f2c","2013-05-22 21:25:55.367928936 UTC","a6e028dd-a340-49ce-b3f8-2f158e257a87","cd66aece-0b21-4e3e-8286-4191efb3aea1","base:story","A long period of peace,
+ # that is a ""bliss""."
+
data/lib/dbd.rb CHANGED
@@ -3,6 +3,7 @@ require 'ruby_peter_v'
  require 'dbd/version'
 
  require 'dbd/errors'
+ require 'dbd/time_stamp'
  require 'dbd/fact'
  require 'dbd/provenance_fact'
  require 'dbd/resource'
@@ -7,7 +7,7 @@ module Dbd
  ##
  # Basic Fact of knowledge.
  #
- # The database is built as an ordered sequence of facts, the "fact stream".
+ # The database is built as an ordered sequence of facts, a "fact stream".
  #
  # This is somewhat similar to a "triple" in the RDF (Resource Description
  # Framework) concept, but with different and extended functionality.
@@ -15,22 +15,22 @@ module Dbd
  # Each basic fact has:
  # * a unique and invariant *id* (a uuid)
  #
- # To allow referencing back to it (e.g. to invalidate it later in the fact stream).
+ # To allow referencing back to it (e.g. to invalidate it later in a fact stream).
  #
  # * a *time_stamp* (time with nanosecond granularity)
  #
- # To allow verifying that the order in the fact stream is correct.
+ # To allow verifying that the order in a fact stream is correct.
  #
  # A time_stamp does not need to represent the exact time of the
  # creation of the fact, but it has to increase in strictly monotic
- # order in the fact stream.
+ # order in a fact stream.
  #
  # * a *provenance_subject* (a uuid)
  #
  # The subject of the ProvenanceResource (a set of ProvenanceFacts with
  # the same subject) about this fact. Each Fact, points *back* to a
  # ProvenanceResource (the ProvenanceResource must have been fully
- # defined, earlier in the fact stream).
+ # defined, earlier in a fact stream).
  #
  # * a *subject* (a uuid)
  #
@@ -72,9 +72,35 @@ module Dbd
  attr_reader attribute
  end
 
+ ##
+ # A set_once setter for time_stamp.
+ #
+ # This implements a "form" of immutable behavior. The value can
+ # be set once (possibly after creation the object), but can
+ # never be changed after that.
  def time_stamp=(time_stamp)
- raise RuntimeError if @time_stamp
- @time_stamp = time_stamp
+ raise ArgumentError unless time_stamp.is_a?(TimeStamp)
+ set_once(:time_stamp, time_stamp)
+ end
+
+ ##
+ # A set_once setter for subject.
+ #
+ # This implements a "form" of immutable behavior. The value can
+ # be set once (possibly after creation the object), but can
+ # never be changed after that.
+ def subject=(subject)
+ set_once(:subject, subject)
+ end
+
+ ##
+ # A set_once setter for provenance_subject.
+ #
+ # This implements a "form" of immutable behavior. The value can
+ # be set once (possibly after creation the object), but can
+ # never be changed after that.
+ def provenance_subject=(provenance_subject)
+ set_once(:provenance_subject, provenance_subject)
  end
 
  ##
@@ -128,15 +154,18 @@ module Dbd
  end
 
  ##
- # Checks if a fact is valid for storing in the graph.
+ # Checks if a fact has errors for storing in the graph.
  #
- # @return [#true?] not nil if valid
- def valid?
- # id not validated, is set automatically
- # predicate not validated, is validated in initialize
- # object not validated, is validated in initialize
- provenance_subject_valid?(provenance_subject) &&
- subject
+ # @return [Array] an Array of error messages
+ def errors
+ # * id not validated, is set automatically upon creation
+ # * time_stamp not validated, is set automatically later
+ # * predicate not validated, is validated in initialize
+ # * object not validated, is validated in initialize
+ [].tap do |a|
+ a << provenance_subject_error(provenance_subject)
+ a << "Subject is missing" unless subject
+ end.compact
  end
 
  ##
@@ -147,35 +176,9 @@ module Dbd
  # This is how the difference is encoded between Fact and
  # ProvenanceFact in the fact stream.
  # @param [#nil?] provenance_subject
- # Return [Boolean]
- def provenance_subject_valid?(provenance_subject)
- provenance_subject
- end
-
- ##
- # Builds duplicate with the subject set.
- #
- # @param [Subject] subject_arg
- # @return [Fact] the duplicate fact
- def dup_with_subject(subject_arg)
- self.class.new(
- provenance_subject: provenance_subject,
- subject: subject_arg, # from arg
- predicate: predicate,
- object: object)
- end
-
- ##
- # Builds duplicate with the provenance_subject set.
- #
- # @param [Subject] provenance_subject_arg
- # @return [Fact] the duplicate fact
- def dup_with_provenance_subject(provenance_subject_arg)
- self.class.new(
- provenance_subject: provenance_subject_arg, # from arg
- subject: subject,
- predicate: predicate,
- object: object)
+ # Return [nil, String] nil or an error message
+ def provenance_subject_error(provenance_subject)
+ "Provenance subject is missing" unless provenance_subject
  end
 
  end
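
The move from `valid?` to `errors` in the hunks above can be illustrated with a small self-contained sketch. The classes `ErrorsFact`, `ErrorsProvenanceFact` and `SketchFactError` are hypothetical stand-ins; only the error strings and the `[].tap { ... }.compact` pattern are taken from the diff.

```ruby
# Hypothetical stand-ins that mirror the errors pattern shown in the diff.
class SketchFactError < StandardError; end

class ErrorsFact
  attr_reader :subject, :provenance_subject

  def initialize(subject: nil, provenance_subject: nil)
    @subject = subject
    @provenance_subject = provenance_subject
  end

  # Collect all error messages; nil entries (no error) are dropped by compact.
  def errors
    [].tap do |a|
      a << provenance_subject_error(provenance_subject)
      a << 'Subject is missing' unless subject
    end.compact
  end

  # A base fact must point back to a provenance resource.
  def provenance_subject_error(provenance_subject)
    'Provenance subject is missing' unless provenance_subject
  end
end

class ErrorsProvenanceFact < ErrorsFact
  # A provenance fact must not have a provenance_subject itself.
  def provenance_subject_error(provenance_subject)
    'Provenance subject should not be present in Provenance Fact' if provenance_subject
  end
end

# A caller can now report every problem at once instead of a bare failure.
def check!(fact)
  raise SketchFactError, "#{fact.errors.join(', ')}." unless fact.errors.empty?
  fact
end

begin
  check!(ErrorsFact.new)
rescue SketchFactError => e
  puts e.message
end
```

Returning messages rather than a boolean is what lets the collection's `<<` raise a descriptive FactError, replacing the TODO that the earlier version carried.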
@@ -29,7 +29,7 @@ module Dbd
  #
  # @return [self] for chaining
  #
- # Validates that added fact is valid.
+ # Validates that added fact is valid (has no errors).
  #
  # Validates that added fact is newer.
  #
@@ -41,8 +41,7 @@ module Dbd
  #
  # Mark the fact in the list of used provenance_subjects (for [A]).
  def <<(fact)
- # TODO Add a more descriptive Exception message
- raise FactError unless fact.valid?
+ raise FactError, "#{fact.errors.join(', ')}." unless fact.errors.empty?
  raise OutOfOrderError if (self.newest_time_stamp && fact.time_stamp <= self.newest_time_stamp)
  raise OutOfOrderError if (@used_provenance_subjects[fact.subject])
  index = Helpers::OrderedSetCollection.add_and_return_index(fact, @internal_collection)
@@ -9,9 +9,15 @@ module Dbd
 
  include Fact::Collection
 
- def <<(fact)
- enforce_strictly_monotonic_time(fact)
- super(fact)
+ ##
+ # Add a Fact or Resource to the Graph.
+ #
+ # This will add a time_stamp to the Facts.
+ def <<(fact_or_resource)
+ Array(fact_or_resource).each do |fact|
+ enforce_strictly_monotonic_time(fact)
+ super(fact)
+ end
  end
 
  ##
@@ -29,18 +35,11 @@ module Dbd
  private
 
  ##
- # The system mmust enforce that the time_stamps are strictly monotonic.
- #
- # This has been detected because on Java (JRuby) the the Wall time has
- # a resolution of only 1 ms so sometimes, the exact same value for
- # Time.now was reported.
+ # Setting a strictly monotonically increasing time_stamp (if not yet set).
+ # The time_stamp also has some randomness (1 .. 999 ns) to reduce the
+ # chance on collisions when merging fact streams from different sources.
  def enforce_strictly_monotonic_time(fact)
- new_time = Time.now.utc
- newest_time_stamp = newest_time_stamp()
- if newest_time_stamp && new_time <= newest_time_stamp
- new_time = newest_time_stamp + 0.000_000_002 # Add approx. 2 nanoseconds
- end
- fact.time_stamp = new_time
+ fact.time_stamp = TimeStamp.new(larger_than: newest_time_stamp) unless fact.time_stamp
  end
 
  end
@@ -12,7 +12,7 @@ module Dbd
  # usage of provenance_subject is not recursive on this level (this
  # allows efficient single pass loading in an underlying database).
  #
- # In the serialisation of the fact stream, the presence or absence of a
+ # In the serialisation of a fact stream, the presence or absence of a
  # provenance_subject marks the difference between a (base) Fact and a
  # ProvenanceFact.
  #
@@ -47,9 +47,9 @@ module Dbd
  #
  # Here, in the derived ProvenanceFact, it must not be present.
  # @param [#nil?] provenance_subject
- # Return [Boolean]
- def provenance_subject_valid?(provenance_subject)
- provenance_subject.nil?
+ # Return [nil, String] nil if valid, an error message if not
+ def provenance_subject_error(provenance_subject)
+ "Provenance subject should not be present in Provenance Fact" if provenance_subject
  end
 
  ##
@@ -44,10 +44,11 @@ module Dbd
  ##
  # Check provenance_subject, which should be nil here
  # @param [ProvenanceFact] provenance_fact
- # @return [ProvenanceFact] with validated nil on provenance_subject
- def check_or_set_provenance(provenance_fact)
- raise ProvenanceError if provenance_fact.provenance_subject
- provenance_fact
+ def set_provenance!(provenance_fact)
+ if provenance_fact.provenance_subject
+ raise ProvenanceError,
+ "trying to set provenance_subject to#{provenance_fact.provenance_subject}"
+ end
  end
 
  end
@@ -19,14 +19,22 @@ module Dbd
  # Resources that are associated with it.
  #
  # During build-up of a Fact, the subject and the provenance_subject
- # can be nil. These will then be set in a local duplicate when the
- # Fact is added (with '<<') to a resource.
+ # can be nil. These will then be set when the Fact is added
+ # (with '<<') to a resource.
  class Resource
 
  include Helpers::OrderedSetCollection
 
  attr_reader :subject
 
+ ##
+ # Getter for provenance_subject.
+ #
+ # Will be overridden in the ProvenanceResource subclass.
+ def provenance_subject
+ @provenance_subject
+ end
+
  ##
  # @return [Fact::Subject] a new (random) Resource subject
  def self.new_subject
@@ -56,57 +64,28 @@ module Dbd
  ##
  # Add a fact.
  #
- # * if it has no subject, the subject is set in a duplicate fact
+ # * if it has no subject, the subject is set (this modifies the fact !)
  # * if is has the same subject as the resource, added unchanged.
  # * if it has a different subject, a SubjectError is raised.
- # * if it has no provenance_subject, the provenance_subject is set in a duplicate fact
+ # * if it has no provenance_subject, the provenance_subject is set (this modifies the fact !)
  # * if is has the same provenance_subject as the resource, added unchanged.
  # * if it has a different provenance_subject, a ProvenanceError is raised.
  def <<(fact)
  # TODO: check the type of the fact (Fact)
- super(check_or_set_subject_and_provenance(fact))
- end
-
- ##
- # Getter for provenance_subject.
- #
- # Will be overridden in the ProvenanceResource subclass.
- def provenance_subject
- @provenance_subject
+ set_subject!(fact)
+ set_provenance!(fact)
+ super(fact)
  end
 
  private
 
- def check_or_set_subject_and_provenance(element)
- with_subject = check_or_set_subject(element)
- check_or_set_provenance(with_subject)
- end
-
- def check_or_set_subject(element)
- if element.subject
- if element.subject == @subject
- return element
- else
- raise SubjectError,
- "self.subject is #{subject} and element.subject is #{element.subject}"
- end
- else
- element.dup_with_subject(@subject)
- end
+ def set_subject!(fact)
+ fact.subject = subject
  end
 
  # this will be overriden in the ProvenanceResource sub_class
- def check_or_set_provenance(element)
- if element.provenance_subject
- if element.provenance_subject == @provenance_subject
- return element
- else
- raise ProvenanceError,
- "self.provenance_subject is #{provenance_subject} and element.provenance_subject is #{element.provenance_subject}"
- end
- else
- element.dup_with_provenance_subject(@provenance_subject)
- end
+ def set_provenance!(fact)
+ fact.provenance_subject = provenance_subject
  end
 
  def validate_provenance_subject