declarative_policy 1.0.0 → 1.1.0
- checksums.yaml +4 -4
- data/.gitignore +2 -0
- data/.gitlab-ci.yml +59 -16
- data/.rubocop.yml +4 -1
- data/CHANGELOG.md +8 -0
- data/CONTRIBUTING.md +41 -0
- data/Gemfile +7 -8
- data/Gemfile.lock +37 -20
- data/LICENSE.txt +4 -1
- data/README.md +6 -4
- data/benchmarks/repeated_invocation.rb +37 -0
- data/declarative_policy.gemspec +1 -1
- data/doc/caching.md +299 -1
- data/doc/defining-policies.md +29 -3
- data/doc/optimization.md +277 -0
- data/lib/declarative_policy/base.rb +60 -28
- data/lib/declarative_policy/cache.rb +1 -1
- data/lib/declarative_policy/condition.rb +4 -2
- data/lib/declarative_policy/configuration.rb +7 -1
- data/lib/declarative_policy/rule.rb +5 -5
- data/lib/declarative_policy/runner.rb +58 -26
- data/lib/declarative_policy/version.rb +1 -1
- data/lib/declarative_policy.rb +30 -40
- metadata +11 -7
data/doc/caching.md
CHANGED
@@ -1,4 +1,302 @@
|
|
1
1
|
# Caching
|
2
2
|
|
3
|
-
|
3
|
+
This library deals with making observations about the state of
|
4
|
+
a system (usually performing I/O, such as making a database query),
|
5
|
+
and combining these facts into logical propositions.
|
4
6
|
|
7
|
+
In order to make this performant, the library transparently caches repeated
|
8
|
+
observations of conditions. Understanding how caching works is useful for
|
9
|
+
designing good policies and using them effectively.
|
10
|
+
|
11
|
+
## What is cached?
|
12
|
+
|
13
|
+
If a policy is instantiated with a cache, then the following things will be
|
14
|
+
stored in it:
|
15
|
+
|
16
|
+
- Policy instances (there will only ever be one policy per `user/subject` pair
|
17
|
+
for the lifetime of the cache).
|
18
|
+
- Condition results
|
19
|
+
|
20
|
+
The correctness of these cached values depends on the correctness of the
|
21
|
+
cache-keys. We assume the objects in your domain have a `#id` method that
|
22
|
+
fully captures the notion of object identity. See [Cache keys](#cache-keys) for
|
23
|
+
details. All cache keys begin with `"/dp/"`.
|
24
|
+
|
25
|
+
Policies themselves cache the results of the abilities they compute.
|
26
|
+
|
27
|
+
Policies distinguish between facts based on the type of the fact:
|
28
|
+
|
29
|
+
- Boolean facts: implemented with `condition`.
|
30
|
+
- Abilities: implemented with `rule` blocks.
|
31
|
+
- Non-boolean facts: implemented by policy instance methods.
|
32
|
+
|
33
|
+
For example, consider a policy for countries:
|
34
|
+
|
35
|
+
```ruby
|
36
|
+
class CountryPolicy < DeclarativePolicy::Base
|
37
|
+
condition(:citizen) { @user.citizen_of?(country.country_code) }
|
38
|
+
condition(:eu_citizen, scope: :user) { @user.citizen_of?(*Unions::EU) }
|
39
|
+
condition(:eu_member, scope: :subject) { Unions::EU.include?(country.country_code) }
|
40
|
+
|
41
|
+
condition(:has_visa_waiver) { country.visa_waivers.any? { |c| @user.citizen_of?(c) } }
|
42
|
+
condition(:permanent_resident) { visa_category == :permanent }
|
43
|
+
condition(:has_work_visa) { visa_category == :work }
|
44
|
+
condition(:has_current_visa) { has_visa_waiver? || current_visa.present? }
|
45
|
+
condition(:has_business_visa) { has_visa_waiver? || has_work_visa? || visa_category == :business }
|
46
|
+
|
47
|
+
condition(:full_rights, score: 20) { citizen? || permanent_resident? }
|
48
|
+
condition(:banned) { country.banned_list.include?(@user) }
|
49
|
+
|
50
|
+
rule { eu_member & eu_citizen }.enable :freedom_of_movement
|
51
|
+
rule { full_rights | can?(:freedom_of_movement) }.enable :settle
|
52
|
+
rule { can?(:settle) | has_current_visa }.enable :enter_country
|
53
|
+
rule { can?(:settle) | has_business_visa }.enable :attend_meetings
|
54
|
+
rule { can?(:settle) | has_work_visa }.enable :work
|
55
|
+
rule { citizen }.enable :vote
|
56
|
+
rule { ~citizen & ~permanent_resident }.enable :apply_for_visa
|
57
|
+
rule { banned }.prevent :enter_country, :apply_for_visa
|
58
|
+
|
59
|
+
def current_visa
|
60
|
+
return @current_visa if defined?(@current_visa)
|
61
|
+
|
62
|
+
@current_visa = country.active_visas.find_by(applicant: @user)
|
63
|
+
end
|
64
|
+
|
65
|
+
def visa_category
|
66
|
+
current_visa&.category
|
67
|
+
end
|
68
|
+
|
69
|
+
def country
|
70
|
+
@subject
|
71
|
+
end
|
72
|
+
end
|
73
|
+
```
|
74
|
+
|
75
|
+
This is a reasonably realistic policy - there are a few pieces of state (the
|
76
|
+
country, the list of visa waiver agreements, the list of citizenships the user
|
77
|
+
holds, the kind of visa the user has, if they have one, the current list of
|
78
|
+
banned users), and these are combined to determine a range of abilities (whether
|
79
|
+
one can visit or live in or vote in a certain country). Importantly, these
|
80
|
+
pieces of information are re-used between abilities - the citizenship status is
|
81
|
+
relevant to all abilities, whereas the banned list is only considered on entry
|
82
|
+
and when applying for a new visa.
|
83
|
+
|
84
|
+
If we imagine that some of these operations are reasonably expensive (fetching
|
85
|
+
the current visa status, or checking the banned list, for example), then it
|
86
|
+
follows that we really care about avoiding re-computation of these facts. In the
|
87
|
+
policy above we can see a few strategies that are taken to avoid this:
|
88
|
+
|
89
|
+
- Conditions are re-used liberally.
|
90
|
+
- Non-boolean facts are cached at the policy level.
|
91
|
+
|
92
|
+
## Re-using conditions
|
93
|
+
|
94
|
+
Rules can and should re-use conditions as much as possible. Condition
|
95
|
+
observations are cached automatically, so referring to the same condition in
|
96
|
+
multiple rules is encouraged. Conditions can also refer to other conditions by
|
97
|
+
using the predicate methods that are created for them (see `full_rights`, which
|
98
|
+
refers to the `:citizen` condition as `citizen?`).
|
99
|
+
|
100
|
+
Note that referring to conditions inside other conditions can be DRY, but it
|
101
|
+
limits the ability of the library to optimize the steps (see
|
102
|
+
[optimization](./optimization.md)). For example in the `:has_current_visa`
|
103
|
+
condition, the sub-conditions will always be tested in the order
|
104
|
+
`has_visa_waiver` then `current_visa.present?`. It is recommended not to rely
|
105
|
+
heavily on this kind of abstraction.
|
106
|
+
|
107
|
+
## Re-using rules
|
108
|
+
|
109
|
+
Entire rule-sets can be re-used with `can?`. This is a form of logical
|
110
|
+
implication where a previous conclusion can be used in a further rule. Examples
|
111
|
+
of this here are `can?(:settle)` and `can?(:freedom_of_movement)`. This can
|
112
|
+
prevent having to repeat long groups of conditions in rule definitions. This
|
113
|
+
abstraction is transparent to the optimizer.
|
114
|
+
|
115
|
+
## Non-boolean values must be managed manually
|
116
|
+
|
117
|
+
The condition `has_current_visa` and the more specific
|
118
|
+
`has_{work,business}_visa` all refer to the same piece of state - the
|
119
|
+
`#current_visa`. Since this is not a boolean (but is here a database record with
|
120
|
+
a `#category` attribute), this cannot be a condition, but must be managed by the
|
121
|
+
policy itself.
|
122
|
+
|
123
|
+
The best approach here is to use normal Ruby methods and instance variables for
|
124
|
+
such values. The policy instances themselves are cached, so that any two
|
125
|
+
invocations of `DeclarativePolicy.policy_for(user, object)` with identical
|
126
|
+
`user` and `object` arguments will always return the same policy object. This
|
127
|
+
means instance variables stored on the policy will be available for the lifetime
|
128
|
+
of the cache.
|
129
|
+
|
130
|
+
Methods can be used for the usual reasons of clarity (such as referring to the
|
131
|
+
`@subject` as `country`) and brevity (such as `visa_category`).
|
132
|
+
|
133
|
+
## Cache lifetime
|
134
|
+
|
135
|
+
The cache is provided by the user of the library, passing it to the
|
136
|
+
`.policy_for` method. For example:
|
137
|
+
|
138
|
+
```ruby
|
139
|
+
DeclarativePolicy.policy_for(user, country, cache: some_cache_value)
|
140
|
+
```
|
141
|
+
|
142
|
+
The object only needs to implement the following methods:
|
143
|
+
|
144
|
+
- `cache[key: String] -> Boolean?`: Fetch the cached value
|
145
|
+
- `cache.key?(key: String) -> Boolean`: Test if the key is cached
|
146
|
+
- `cache[key: String] = Boolean`: Cache a value
|
147
|
+
|
148
|
+
A plain `Hash` will work just fine, but so will a wrapper around a
|
149
|
+
[`Concurrent::Map`](https://ruby-concurrency.github.io/concurrent-ruby/1.1.4/Concurrent/Map.html),
|
150
|
+
or even a map that delegates to Redis with a TTL for each key, so long as the
|
151
|
+
object supports these methods. Keys are never deleted by the library, and values
|
152
|
+
are only computed if the key is not cached, so it is up to the application code
|
153
|
+
to determine the life-time of each key.
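
As a rough illustration of the interface described above, here is a minimal sketch of a cache with per-key expiry. The class name `ExpiringCache`, the `ttl:` and `clock:` parameters, and the in-memory `Hash` storage are all assumptions made for this example; they are not part of the library.

```ruby
# Hypothetical wrapper: a Hash-backed cache whose entries expire after `ttl` seconds.
class ExpiringCache
  def initialize(ttl:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl
    @clock = clock
    @entries = {} # key => [value, expires_at]
  end

  # Fetch the cached value, or nil if the key is absent or expired.
  def [](key)
    key?(key) ? @entries[key].first : nil
  end

  # Test whether a live (non-expired) entry exists, dropping stale ones.
  def key?(key)
    entry = @entries[key]
    return false unless entry
    return true if @clock.call < entry.last

    @entries.delete(key)
    false
  end

  # Store a value and stamp it with an expiry time.
  def []=(key, value)
    @entries[key] = [value, @clock.call + @ttl]
  end

  # Only needed if the application calls `DeclarativePolicy.invalidate`.
  def delete(key)
    @entries.delete(key)
  end
end
```

An instance could then be passed as the `cache:` argument to `.policy_for`. Note that policy instances are stored in the same cache, so they would expire along with the condition results.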
|
154
|
+
|
155
|
+
Clearly, cache-invalidation is a hard problem. At GitLab we share a single cache
|
156
|
+
object for each request - so any single request can freely request a permission
|
157
|
+
check multiple times (or even compute related abilities, such as
|
158
|
+
`:enter_country` and `:settle`) and know that no work is duplicated. This
|
159
|
+
allows developers to reason declaratively, and add permission checks where
|
160
|
+
needed, without worrying about performance.
|
161
|
+
|
162
|
+
## Cache sharing: scopes
|
163
|
+
|
164
|
+
Not all conditions are equally specific. The condition `citizen` refers to
|
165
|
+
both the user and the country, and so can only be used when checking both the
|
166
|
+
user and the country. We say that this is the `normal` scope.
|
167
|
+
|
168
|
+
This is not always true however. Sometimes a condition refers only to the user.
|
169
|
+
For example, above we have two conditions: `eu_citizen` and `eu_member`:
|
170
|
+
|
171
|
+
```ruby
|
172
|
+
condition(:eu_citizen, scope: :user) { @user.citizen_of?(*Unions::EU) }
|
173
|
+
condition(:eu_member, scope: :subject) { Unions::EU.include?(country.country_code) }
|
174
|
+
```
|
175
|
+
|
176
|
+
`eu_citizen` refers only to the user, and `eu_member` refers only to the
|
177
|
+
country.
|
178
|
+
|
179
|
+
If we have a user that wants to enter multiple countries on a grand European
|
180
|
+
tour, we could check this with:
|
181
|
+
|
182
|
+
```ruby
|
183
|
+
itinerary.countries.all? { |c| DeclarativePolicy.policy_for(user, c).allowed?(:enter_country) }
|
184
|
+
```
|
185
|
+
|
186
|
+
If `eu_citizen` were declared with the `normal` scope, then this would have a lot of cache
|
187
|
+
misses. By using the `:user` scope on `eu_citizen`, we only check EU citizenship
|
188
|
+
once.
|
189
|
+
|
190
|
+
Similarly for `eu_member`, if a team of football players want to visit a
|
191
|
+
country, then we could check this with:
|
192
|
+
|
193
|
+
```ruby
|
194
|
+
team.players.all? { |user| DeclarativePolicy.policy_for(user, country).allowed?(:enter_country) }
|
195
|
+
```
|
196
|
+
|
197
|
+
Again, by declaring `eu_member` as having the `:subject` scope, this ensures we
|
198
|
+
only check EU membership once, not once for each football player.
|
199
|
+
|
200
|
+
The last scope is `:global`, used when the condition is universally true:
|
201
|
+
|
202
|
+
```ruby
|
203
|
+
condition(:earth_destroyed_by_meteor, scope: :global) { !Planet::Earth.exists? }
|
204
|
+
|
205
|
+
rule { earth_destroyed_by_meteor }.prevent_all
|
206
|
+
```
|
207
|
+
|
208
|
+
In this case, it doesn't matter who the user is or even where they are going:
|
209
|
+
the condition will be computed once (per cache lifetime) for all combinations.
|
210
|
+
|
211
|
+
Because of the implications for sharing, the scope determines the
|
212
|
+
[`#score`](https://gitlab.com/gitlab-org/declarative-policy/blob/2ab9dbdf44fb37beb8d0f7c131742d47ae9ef5d0/lib/declarative_policy/condition.rb#L58-77) of
|
213
|
+
the condition (if not provided explicitly). The intention is to prefer values we
|
214
|
+
are more likely (all other things being equal) to re-use:
|
215
|
+
|
216
|
+
- Conditions we have already cached get a score of `0`.
|
217
|
+
- Conditions that are in the `:global` scope get a score of `2`.
|
218
|
+
- Conditions that are in the `:user` or `:subject` scopes get a score of `8`.
|
219
|
+
- Conditions that are in the `:normal` scope get a score of `16`.
|
220
|
+
|
221
|
+
Bear helper-methods in mind when defining scopes. While the instance level cache
|
222
|
+
for non-boolean values would not be shared, as long as the derived condition is
|
223
|
+
shared (for example by being in the `:user` scope, rather than the `:normal`
|
224
|
+
scope), helper-methods will also benefit from improved cache hits.
|
225
|
+
|
226
|
+
### Preferred scope
|
227
|
+
|
228
|
+
In the example situations above (a single user visiting many countries, or a
|
229
|
+
football team visiting one country), we know which is more likely to be useful,
|
230
|
+
the `:subject` or the `:user` scope. We can inform the optimizer of this
|
231
|
+
by setting `DeclarativePolicy.preferred_scope`.
|
232
|
+
|
233
|
+
To do this, check the abilities within a block bounded
|
234
|
+
by [`DeclarativePolicy.with_preferred_scope`](https://gitlab.com/gitlab-org/declarative-policy/blob/481c322a74f76c325d3ccab7f2f3cc2773e8168b/lib/declarative_policy/preferred_scope.rb#L7-13).
|
235
|
+
For example:
|
236
|
+
|
237
|
+
```ruby
|
238
|
+
cache = {}
|
239
|
+
|
240
|
+
# preferring to run user-scoped conditions
|
241
|
+
DeclarativePolicy.with_preferred_scope(:user) do
|
242
|
+
itinerary.countries.all? do |c|
|
243
|
+
DeclarativePolicy.policy_for(user, c, cache: cache).allowed?(:enter_country)
|
244
|
+
end
|
245
|
+
end
|
246
|
+
|
247
|
+
# preferring to run subject-scoped conditions
|
248
|
+
DeclarativePolicy.with_preferred_scope(:subject) do
|
249
|
+
team.players.all? do |player|
|
250
|
+
DeclarativePolicy.policy_for(player, country, cache: cache).allowed?(:enter_country)
|
251
|
+
end
|
252
|
+
end
|
253
|
+
|
254
|
+
```
|
255
|
+
|
256
|
+
When we set `preferred_scope`, this reduces the default score for conditions in
|
257
|
+
that scope, so that they are more likely to be executed first. Instead of `8`,
|
258
|
+
they are given a default score of `4`.
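
The default scores listed in this section can be summarized in a small function. This is only a sketch of the numbers stated in this document, not the library's implementation; the method name `default_score` and its keyword arguments are hypothetical.

```ruby
# Illustrative only: mirrors the default scores described in this document.
def default_score(scope:, cached: false, preferred_scope: nil)
  return 0 if cached # cached results are free to re-use

  case scope
  when :global then 2
  when :user, :subject then scope == preferred_scope ? 4 : 8
  when :normal then 16
  else raise ArgumentError, "unknown scope: #{scope.inspect}"
  end
end
```

Under these assumptions, a `:user`-scoped condition is scheduled ahead of a `:subject`-scoped one only while `:user` is the preferred scope.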
|
259
|
+
|
260
|
+
## Cache keys
|
261
|
+
|
262
|
+
In order for an object to be cached, it should be able to identify itself
|
263
|
+
with a suitable cache key. A good cache key will identify an object, without
|
264
|
+
containing irrelevant information - a database `#id` is perfect, and this
|
265
|
+
library defaults to calling an `#id` method on objects, falling back to
|
266
|
+
`object_id`.
|
267
|
+
|
268
|
+
Relying on `object_id` is not recommended since otherwise equivalent objects
|
269
|
+
have different `object_id` values, so keys derived from `object_id` will miss the cache more often than necessary. All
|
270
|
+
policy subjects should implement `#id` for this reason. `ActiveRecord` models
|
271
|
+
with an `id` primary ID attribute do not need any extra configuration.
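
To see why `#id` makes a better key than `object_id`, consider two in-memory copies of the same record. The `Country` struct and its values below are hypothetical stand-ins for a domain object.

```ruby
# Hypothetical domain object; only `#id` matters for cache keys.
Country = Struct.new(:id, :country_code)

first_load  = Country.new(42, "NZ")
second_load = Country.new(42, "NZ")

# Both copies describe the same country, so `#id` agrees...
same_id = first_load.id == second_load.id
# ...but they are distinct Ruby objects, so `object_id` differs.
same_object = first_load.object_id == second_load.object_id
```

A key built from `#id` lets the second copy re-use results cached for the first, while a key built from `object_id` does not.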
|
272
|
+
|
273
|
+
Please see: [`DeclarativePolicy::Cache`](https://gitlab.com/gitlab-org/declarative-policy/blob/master/lib/declarative_policy/cache.rb).
|
274
|
+
|
275
|
+
## Cache invalidation
|
276
|
+
|
277
|
+
Generally, cache invalidation is best avoided. It is very hard to get right, and
|
278
|
+
relying on it opens you up to subtle but pernicious bugs that are hard to
|
279
|
+
reproduce and debug.
|
280
|
+
|
281
|
+
The best strategy is to run all permission checks upfront, before mutating any
|
282
|
+
state that might change a permission computation. For instance, if you want to
|
283
|
+
make a user an administrator, then check for permission **before** assigning
|
284
|
+
administrator privileges.
|
285
|
+
|
286
|
+
However, it isn't always possible to avoid needing to mark certain parts of the
|
287
|
+
cached state as dirty (in need of re-computation). If this is needed, then you
|
288
|
+
can call the `DeclarativePolicy.invalidate(cache, keys)` method. This takes an
|
289
|
+
enumerable of dirty keys, and:
|
290
|
+
|
291
|
+
- removes the cached condition results from the cache
|
292
|
+
- marks the abilities that depend on those conditions as dirty, and in need of
|
293
|
+
re-computation.
|
294
|
+
|
295
|
+
The responsibility for determining which cache-keys are dirty falls on the
|
296
|
+
client. You could, for example, do this by observing which keys are added to the
|
297
|
+
cache (knowing that condition keys all start with `"/dp/condition/"`), or by
|
298
|
+
scanning the cache for keys that match a heuristic.
|
299
|
+
|
300
|
+
This method is the only place where the `#delete` method is called on the cache.
|
301
|
+
If you do not call `.invalidate`, there is no need for the cache to implement
|
302
|
+
`#delete`.
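
One way to apply the prefix heuristic mentioned above is to scan a `Hash`-backed cache for condition keys. This is a sketch only: the key suffixes below are invented for illustration, and only the `"/dp/"` and `"/dp/condition/"` prefixes come from this document.

```ruby
# Hypothetical cache contents; the key suffixes are made up for illustration.
cache = {
  "/dp/condition/example-a" => true,
  "/dp/condition/example-b" => false,
  "/dp/other/example-c" => :placeholder
}

# Select every key that belongs to a cached condition result.
dirty_keys = cache.keys.select { |key| key.start_with?("/dp/condition/") }

# With the library loaded, these keys would then be handed over:
# DeclarativePolicy.invalidate(cache, dirty_keys)
```

Invalidating every condition key is a blunt instrument. A real application would usually narrow the selection to the conditions affected by the state it just changed.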
|
data/doc/defining-policies.md
CHANGED
@@ -74,7 +74,7 @@ condition(:owns) { @subject.owner == @user }
|
|
74
74
|
condition(:has_access_to) { @subject.owner.trusts?(@user) }
|
75
75
|
condition(:old_enough_to_drive) { @user.age >= laws.minimum_age }
|
76
76
|
condition(:has_driving_license) { @user.driving_license&.valid? }
|
77
|
-
condition(:intoxicated, score: 5) { @user.blood_alcohol
|
77
|
+
condition(:intoxicated, score: 5) { @user.blood_alcohol > laws.max_blood_alcohol }
|
78
78
|
condition(:has_access_to, score: 3) { @subject.owner.trusts?(@user) }
|
79
79
|
```
|
80
80
|
|
@@ -108,8 +108,7 @@ Rules are conclusions we can draw based on the facts:
|
|
108
108
|
rule { owns }.enable :drive_vehicle
|
109
109
|
rule { has_access_to }.enable :drive_vehicle
|
110
110
|
rule { ~old_enough_to_drive }.prevent :drive_vehicle
|
111
|
-
rule { intoxicated }.prevent :drive_vehicle
|
112
|
-
rule { ~has_driving_license }.prevent :drive_vehicle
|
111
|
+
rule { intoxicated | ~has_driving_license }.prevent :drive_vehicle
|
113
112
|
```
|
114
113
|
|
115
114
|
Rules are combined such that each ability must be enabled at least once, and not
|
@@ -130,6 +129,33 @@ access the `@user` or `@subject`, or any methods on the policy instance. You
|
|
130
129
|
should not perform I/O in a rule. They exist solely to define the logical rules
|
131
130
|
of implication and combination between conditions.
|
132
131
|
|
132
|
+
The available operations inside a rule block are:
|
133
|
+
|
134
|
+
- Bare words to refer to conditions in the policy, or on any delegate.
|
135
|
+
For example `owns`. This is equivalent to `cond(:owns)`, but as a matter of
|
136
|
+
general style, bare words are preferred.
|
137
|
+
- `~` to negate any rule. For example `~owns`, or `~(intoxicated | banned)`.
|
138
|
+
- `&` or `all?` to combine rules such that all must succeed. For example:
|
139
|
+
`old_enough_to_drive & has_driving_license` or `all?(old_enough_to_drive, has_driving_license)`.
|
140
|
+
- `|` or `any?` to combine rules such that one must succeed. For example:
|
141
|
+
`intoxicated | banned` or `any?(intoxicated, banned)`.
|
142
|
+
- `can?` to refer to the result of evaluating an ability. For example,
|
143
|
+
`can?(:sell_vehicle)`.
|
144
|
+
- `delegate(:delegate_name, :condition_name)` to refer to a specific
|
145
|
+
condition on a named delegate. Use of this is rare, but can be used to
|
146
|
+
handle overrides. For example if a vehicle policy defines a delegate as
|
147
|
+
`delegate :registration`, then we could refer to that
|
148
|
+
as `rule { delegate(:registration, :valid) }`.
|
149
|
+
|
150
|
+
Note: Be careful not to confuse `DeclarativePolicy::Base.condition` with
|
151
|
+
`DeclarativePolicy::RuleDSL#cond`.
|
152
|
+
|
153
|
+
- `condition` constructs a condition from a name and a block. For example:
|
154
|
+
`condition(:adult) { @subject.age >= country.age_of_majority }`.
|
155
|
+
- `cond` constructs a rule which refers to a condition by name. For example:
|
156
|
+
`rule { cond(:adult) }.enable :vote`. Use of `cond` is rare - it is nicer to
|
157
|
+
use the bare word form: `rule { adult }.enable :vote`.
|
158
|
+
|
133
159
|
### Complex conditions
|
134
160
|
|
135
161
|
Conditions may be combined in the rule blocks:
|
data/doc/optimization.md
ADDED
@@ -0,0 +1,277 @@
|
|
1
|
+
# Optimization
|
2
|
+
|
3
|
+
This library cares a lot about performance, and includes features that
|
4
|
+
aim to limit the impact of permission checks on an application. In particular,
|
5
|
+
effort is made to ensure that repeated checks of the same permission are
|
6
|
+
efficient, aiming to eliminate repeated computation and unnecessary I/O.
|
7
|
+
|
8
|
+
The key observation: permission checks generally involve some facts
|
9
|
+
about the real world, and this involves (relatively expensive) I/O to compute.
|
10
|
+
These facts are then combined in some way to generate a judgment. Not all facts
|
11
|
+
are necessary to know in order to determine a judgment. The main aims of the
|
12
|
+
library:
|
13
|
+
|
14
|
+
- Avoid unnecessary work.
|
15
|
+
- If we must do work, do the least work possible.
|
16
|
+
|
17
|
+
The library enables you to define both how to compute these facts
|
18
|
+
(conditions), and how to combine them (rules), but the library is entirely
|
19
|
+
responsible for the scheduling of when to compute each fact.
|
20
|
+
|
21
|
+
## Making truth
|
22
|
+
|
23
|
+
This library is essentially a build-system for truth - you can think of it as
|
24
|
+
similar to [`make`](https://www.gnu.org/software/make/), but:
|
25
|
+
|
26
|
+
- Instead of `targets` there are `abilities`.
|
27
|
+
- Instead of `files`, we produce `boolean` values.
|
28
|
+
|
29
|
+
We have no notion of freshness - uncached conditions are always re-computed, but
|
30
|
+
just like `make`, we try to do the least work possible in order to evaluate the
|
31
|
+
given ability.
|
32
|
+
|
33
|
+
For the interested, this corresponds to
|
34
|
+
[`memo`](https://hackage.haskell.org/package/build-1.0/docs/src/Build.System.html#memo) in
|
35
|
+
the taxonomy of build systems (although the scheduler here is somewhat smarter
|
36
|
+
about the relative order of dependencies).
|
37
|
+
|
38
|
+
## Optimization is reducing computation of expensive I/O
|
39
|
+
|
40
|
+
In the context of this library, optimization refers to ways we can:
|
41
|
+
|
42
|
+
- Expose the smallest possible units of I/O to the scheduler.
|
43
|
+
- Never run a computation twice.
|
44
|
+
- Indicate to the scheduler which computations should be run first.
|
45
|
+
|
46
|
+
For example, if a policy defines the following rule:
|
47
|
+
|
48
|
+
```ruby
|
49
|
+
rule { fact_a & fact_b }.enable :some_ability
|
50
|
+
```
|
51
|
+
|
52
|
+
The core of the matter: if we know in advance that `fact_a == false`, then we do not need to compute
|
53
|
+
`fact_b`. Conversely, if we know in advance that `fact_b == false`, then we do
|
54
|
+
not need to run `fact_a`. The same goes for `fact_a | fact_b`: once either fact is known to be true, the other need not be computed.
|
55
|
+
|
56
|
+
In this case:
|
57
|
+
|
58
|
+
- The smallest possible units of I/O are `fact_a` and `fact_b`, and the library
|
59
|
+
is aware of them.
|
60
|
+
- The library uses the [cache](./caching.md) to avoid running a condition more
|
61
|
+
than once.
|
62
|
+
- It does not matter which order we run these conditions in - the scheduler is
|
63
|
+
free to re-order them if it thinks that `fact_b` is somehow more efficient to
|
64
|
+
compute than `fact_a`.
|
65
|
+
|
66
|
+
## The scheduling logic
|
67
|
+
|
68
|
+
The problem each permission check seeks to solve is determining the truth value
|
69
|
+
of a proposition of the form:
|
70
|
+
|
71
|
+
```pseudo
|
72
|
+
any? enabling-conditions && not (any? preventing-conditions)
|
73
|
+
```
|
74
|
+
|
75
|
+
If `[a, b, c]` are enabling conditions, and `[x, y, z]` are preventing
|
76
|
+
conditions, then this could be expressed as:
|
77
|
+
|
78
|
+
```ruby
|
79
|
+
(a | b | c) & ~x & ~y & ~z
|
80
|
+
```
|
81
|
+
|
82
|
+
But the [scheduler](../lib/declarative_policy/runner.rb) represents this
|
83
|
+
as a flat list of rules - conditions and their outcomes:
|
84
|
+
|
85
|
+
```pseudo
|
86
|
+
[
|
87
|
+
(a, :enable),
|
88
|
+
(b, :enable),
|
89
|
+
(c, :enable),
|
90
|
+
(x, :prevent),
|
91
|
+
(y, :prevent),
|
92
|
+
(z, :prevent)
|
93
|
+
]
|
94
|
+
```
|
95
|
+
|
96
|
+
They aren't necessarily run in this order, however. Instead, we try to order
|
97
|
+
the list to minimize unnecessary work.
|
98
|
+
|
99
|
+
The
|
100
|
+
[logic](https://gitlab.com/gitlab-org/declarative-policy/blob/659ac0525773a76cf8712d47b3c2dadd03b758c9/lib/declarative_policy/runner.rb#L80-112)
|
101
|
+
to process this list is (in pseudo-code):
|
102
|
+
|
103
|
+
```pseudo
|
104
|
+
while any-enable-rule-remains?(rules)
|
105
|
+
rule := pop-cheapest-remaining-rule(rules)
|
106
|
+
fact := observe-io-and-update-cache rule.condition
|
107
|
+
|
108
|
+
if fact and rule.prevents?
|
109
|
+
return prevented
|
110
|
+
else if fact and rule.enables?
|
111
|
+
skip-all-other-enabling-rules!
|
112
|
+
enabled? := true
|
113
|
+
|
114
|
+
if enabled?
|
115
|
+
return enabled
|
116
|
+
else
|
117
|
+
return prevented
|
118
|
+
```
|
119
|
+
|
120
|
+
The process for ordering rules is that each condition has a score, and we prefer
|
121
|
+
the rules with the lowest `score`. Cached values have a score of `0`. Composite
|
122
|
+
conditions (such as `a | b | c`) have a score that is the sum of the scores of
|
123
|
+
their components.
|
124
|
+
|
125
|
+
The evaluation of one rule results in updating the cache, so other rules might
|
126
|
+
become cheaper, during policy evaluation. To take this into account, we re-score
|
127
|
+
the set of rules on each iteration of the main loop.
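
The scheduling loop described above can be sketched in plain Ruby. This is a simplified model, not the library's runner: the `Rule` struct, the `allowed?` method, and the use of the rule name as the cache key are assumptions made for this example.

```ruby
# Hypothetical rule shape: a name, an effect (:enable or :prevent),
# a base score, and a callable that observes the condition.
Rule = Struct.new(:name, :effect, :base_score, :check)

def allowed?(rules, cache = {})
  remaining = rules.dup
  enabled = false

  until remaining.empty?
    # Nothing can enable the ability any more, so stop early.
    break unless enabled || remaining.any? { |r| r.effect == :enable }

    # Re-score on every pass: anything already cached costs nothing.
    index = remaining.each_index.min_by do |i|
      cache.key?(remaining[i].name) ? 0 : remaining[i].base_score
    end
    rule = remaining.delete_at(index)

    # Observe the condition at most once per cache lifetime.
    cache[rule.name] = rule.check.call unless cache.key?(rule.name)
    next unless cache[rule.name]

    return false if rule.effect == :prevent

    enabled = true
    remaining.reject! { |r| r.effect == :enable } # one enabling rule is enough
  end

  enabled
end
```

In this model a successful prevent rule ends the check immediately, and a successful enable rule removes the other enabling rules, so only the prevent rules are left to run.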
|
128
|
+
|
129
|
+
## Consequences for the policy-writer
|
130
|
+
|
131
|
+
While interesting in its own right, this has some practical consequences for the
|
132
|
+
policy writer:
|
133
|
+
|
134
|
+
### Flat is better than nested
|
135
|
+
|
136
|
+
The scheduler can do a better job of arranging work into the smallest possible
|
137
|
+
chunks if the definitions are as flat as possible, meaning this:
|
138
|
+
|
139
|
+
```ruby
|
140
|
+
rule { condition_a }.enable :some_ability
|
141
|
+
rule { condition_b }.prevent :some_ability
|
142
|
+
```
|
143
|
+
|
144
|
+
is easier to optimize than:
|
145
|
+
|
146
|
+
```ruby
|
147
|
+
rule { condition_a & ~condition_b }.enable :some_ability
|
148
|
+
```
|
149
|
+
|
150
|
+
We do attempt to flatten and de-nest logical expressions, but it is not always
|
151
|
+
possible to raise all expressions to the top level. All things being
|
152
|
+
equal, we recommend using the declarative style.
|
153
|
+
|
154
|
+
#### An example of sub-optimal scheduling
|
155
|
+
|
156
|
+
The scheduler is only able to re-order conditions that can be flattened out to
|
157
|
+
the top level. For example, given the following definition:
|
158
|
+
|
159
|
+
```ruby
|
160
|
+
condition(:a, score: 1) { ... }
|
161
|
+
condition(:b, score: 2) { ... }
|
162
|
+
condition(:c, score: 3) { ... }
|
163
|
+
|
164
|
+
rule { a & c }.enable :some_ability
|
165
|
+
rule { b & c }.enable :some_ability
|
166
|
+
```
|
167
|
+
|
168
|
+
The conditions are evaluated in the following order:
|
169
|
+
|
170
|
+
- `a & c` (score = 4):
|
171
|
+
- `a` (score = 1)
|
172
|
+
- `c` (score = 3)
|
173
|
+
- `b & c` (score = 2):
|
174
|
+
- `c` (score = 0 [cached])
|
175
|
+
- `b` (score = 2)
|
176
|
+
|
177
|
+
If instead this were three top level rules:
|
178
|
+
|
179
|
+
```ruby
|
180
|
+
rule { a }.enable :some_ability
|
181
|
+
rule { b }.enable :some_ability
|
182
|
+
rule { ~c }.prevent :some_ability
|
183
|
+
```
|
184
|
+
|
185
|
+
Then this would be evaluated as:
|
186
|
+
|
187
|
+
- `a` (score = 1)
|
188
|
+
- `b` (score = 2)
|
189
|
+
- `c` (score = 3)
|
190
|
+
|
191
|
+
If `a` and `b` fail, then `c` is never evaluated, saving the most
|
192
|
+
expensive call.
|
193
|
+
|
194
|
+
The total evaluated costs for each arrangement are:
|
195
|
+
|
196
|
+
| Failing conditions | Nested cost | Flat cost |
|
197
|
+
|--------------------|-----------------|---------------|
|
198
|
+
| none | 4 `(a, c)` | 4 `(a, c)` |
|
199
|
+
| all | 3 `(a, b)` | 3 `(a, b)` |
|
200
|
+
| `a` | 6 `(a, b, c)` | 6 `(a, b, c)` |
|
201
|
+
| `b` | 4 `(a, c)` | 4 `(a, c)` |
|
202
|
+
| `c` | 4 `(a, c, c=0)` | 4 `(a, c)` |
|
203
|
+
| `a` and `b` | 4 `(a, c, c=0)` | 3 `(a, b)` |
|
204
|
+
| `a` and `c` | 6 `(a, b, c)` | 6 `(a, b, c)` |
|
205
|
+
| `b` and `c` | 4 `(a, c, c=0)` | 4 `(a, c)` |
|
206
|
+
|
207
|
+
While the overall costs for all arrangements are very similar,
|
208
|
+
the flat representation is strictly superior, and does not even need to
|
209
|
+
rely on the cache for this behavior.
|
210
|
+
|
211
|
+
### Getting the scope right matters
|
212
|
+
|
213
|
+
By default, the outcome of each rule is cached against a key like
|
214
|
+
`(rule.condition.key, user.key, subject.key)`. (For more information, read
|
215
|
+
[caching](./caching.md).) This makes sense for some things like:
|
216
|
+
|
217
|
+
```ruby
|
218
|
+
condition(:owns_vehicle) { @user == @subject.owner }
|
219
|
+
```
|
220
|
+
|
221
|
+
In this case, the result depends on both the `@user` and the `@subject`. Not all
|
222
|
+
conditions are like that, though! The following condition only refers to the
|
223
|
+
subject:
|
224
|
+
|
225
|
+
```ruby
|
226
|
+
condition(:roadworthy) { @subject.warrant_of_fitness.current? }
|
227
|
+
```
|
228
|
+
|
229
|
+
If we cached this against `(user_a, car_a)` and then tested it
|
230
|
+
against `(user_b, car_a)` it would not match, and we would have to re-compute
|
231
|
+
the condition, even though the road-worthiness of a vehicle does not depend on
|
232
|
+
the driver. See [caching](./caching.md) for more discussion on scopes.
|
233
|
+
|
234
|
+
Because more general conditions are more sharable, all things being equal, it is
|
235
|
+
better to evaluate a condition that might be shared later, rather than one that
|
236
|
+
is less likely to be shared. For this reason, when we sort the rules,
|
237
|
+
we prefer ones with more general scopes to more specific ones.
|
238
|
+
|
239
|
+
### Getting the score right matters
|
240
|
+
|
241
|
+
Each condition has a `score`, which is an abstract weight. By default this is
|
242
|
+
determined by the scope.
|
243
|
+
|
244
|
+
However, if you know that a condition is very expensive to run, then it makes sense
|
245
|
+
to give it a higher score, meaning it's only evaluated if we really need
|
246
|
+
to. On the other hand, if a condition is very likely to be determinative, then
|
247
|
+
giving it a lower score would ensure we test it first.
|
248
|
+
|
249
|
+
For example, take two conditions, one which queries the local DB, and one
|
250
|
+
which makes an external API call. If they are otherwise equivalent, calling
|
251
|
+
the database one first is likely to be more efficient, as it might save us needing
|
252
|
+
to make the external API call. Conditions that are
|
253
|
+
[pure](https://en.wikipedia.org/wiki/Pure_function) can even be given a value of
|
254
|
+
`0`, as no I/O is required to compute them.
|
255
|
+
|
256
|
+
```ruby
|
257
|
+
condition(:local_db) { @subject.related_object.present? }
|
258
|
+
condition(:pure, score: 0) { @subject.some_attribute? }
|
259
|
+
condition(:external_api, score: API_SCORE) { ExternalService.get(@subject.id).ok? }
|
260
|
+
|
261
|
+
# these are run in the order: pure, local_db, external_api
|
262
|
+
rule { external_api & pure & local_db }.enable :some_ability
|
263
|
+
```
|
264
|
+
|
265
|
+
The other consideration is the likelihood that a condition is determinative. For
|
266
|
+
example, if `condition_a` is true 80% of the time, and `condition_b` is true
|
267
|
+
20% of the time, then we should prefer to run `condition_a` if these conditions
|
268
|
+
enable an ability (because 80% of the time we don't need to run `condition_b`).
|
269
|
+
But if they prevent an ability, then we would prefer to run `condition_b` first,
|
270
|
+
because again, 80% of the time we can skip `condition_a`. This consideration is
|
271
|
+
more subtle. It requires knowing both the distribution of the condition, and
|
272
|
+
the consequence of its outcome, but this can be used to further optimize the
|
273
|
+
order of evaluation by marking some conditions as more likely to affect the
|
274
|
+
outcome.
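
For the enabling case, the 80%/20% figures above can be turned into expected costs. The sketch below assumes both conditions cost one unit to run and that the second is only run when the first is false; the method name `expected_cost_enable` and the hash layout are hypothetical.

```ruby
# Expected cost of checking two enabling conditions in order, where the
# second is only run if the first turns out to be false.
def expected_cost_enable(first, second)
  first[:cost] + (1 - first[:p_true]) * second[:cost]
end

condition_a = { cost: 1.0, p_true: 0.8 }
condition_b = { cost: 1.0, p_true: 0.2 }

a_first = expected_cost_enable(condition_a, condition_b) # about 1.2 units
b_first = expected_cost_enable(condition_b, condition_a) # about 1.8 units
```

Under these assumptions, running `condition_a` first is cheaper on average, which is the ordering a lower score on `condition_a` would encourage.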
|
275
|
+
|
276
|
+
All things being equal, we prefer to run prevent rules, because they have this
|
277
|
+
property - they are more likely to save extra work.
|