prune_ar 0.1.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 38a4f3eb4a02e9aabc7c1a72dc4a7324ead32471c141d09c28a5273cad6420c0
4
+ data.tar.gz: '08188305b0aefbefb79a056faefd2f27463a107a6b7c1e01bb626820962ab3ff'
5
+ SHA512:
6
+ metadata.gz: cc89a9cf90a24df6575f6a95e27c56d2c47cb50a7d1be529f866df7476837a63346e074b2e0d9205447fc8e7c1e415455885d7367b1c36f51207aea905227d44
7
+ data.tar.gz: d1b0797185d44305ea7c7ae02747ae458a20676f2ae75529b1bae60e7f2c33d7cbccc3d3e771c231e8e097e1cdb600eb8a95c318c94553480453d4b7e9a83200
@@ -0,0 +1,12 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [Unreleased] - 2018-12-11
9
+
10
+ ### Added
11
+
12
+ - This is the initial implementation of this gem.
@@ -0,0 +1,89 @@
1
+ # Contributing
2
+
3
+ When contributing to this repository, please first discuss the change you wish to make via issue,
4
+ email, or any other method with the owners of this repository before making a change.
5
+
6
+ Please note that we have a code of conduct; please follow it in all your interactions with the project.
7
+
8
+ ## Pull Request Process
9
+
10
+ 1. Ensure any install or build dependencies are removed before the end of the layer when doing a
11
+ build.
12
+ 2. Update the README.md with details of changes to the interface, this includes new environment
13
+ variables, exposed ports, useful file locations and container parameters.
14
+ 3. Increase the version numbers in any examples files and the README.md to the new version that this
15
+ Pull Request would represent. The versioning scheme we use is [SemVer](http://semver.org/).
16
+ 4. You may merge the Pull Request in once you have the sign-off of two other developers, or if you
17
+ do not have permission to do that, you may request the second reviewer to merge it for you.
18
+
19
+ ## Code of Conduct
20
+
21
+ ### Our Pledge
22
+
23
+ In the interest of fostering an open and welcoming environment, we as
24
+ contributors and maintainers pledge to making participation in our project and
25
+ our community a harassment-free experience for everyone, regardless of age, body
26
+ size, disability, ethnicity, gender identity and expression, level of experience,
27
+ nationality, personal appearance, race, religion, or sexual identity and
28
+ orientation.
29
+
30
+ ### Our Standards
31
+
32
+ Examples of behavior that contributes to creating a positive environment
33
+ include:
34
+
35
+ - Using welcoming and inclusive language
36
+ - Being respectful of differing viewpoints and experiences
37
+ - Gracefully accepting constructive criticism
38
+ - Focusing on what is best for the community
39
+ - Showing empathy towards other community members
40
+
41
+ Examples of unacceptable behavior by participants include:
42
+
43
+ - The use of sexualized language or imagery and unwelcome sexual attention or
44
+ advances
45
+ - Trolling, insulting/derogatory comments, and personal or political attacks
46
+ - Public or private harassment
47
+ - Publishing others' private information, such as a physical or electronic
48
+ address, without explicit permission
49
+ - Other conduct which could reasonably be considered inappropriate in a
50
+ professional setting
51
+
52
+ ### Our Responsibilities
53
+
54
+ Project maintainers are responsible for clarifying the standards of acceptable
55
+ behavior and are expected to take appropriate and fair corrective action in
56
+ response to any instances of unacceptable behavior.
57
+
58
+ Project maintainers have the right and responsibility to remove, edit, or
59
+ reject comments, commits, code, wiki edits, issues, and other contributions
60
+ that are not aligned to this Code of Conduct, or to ban temporarily or
61
+ permanently any contributor for other behaviors that they deem inappropriate,
62
+ threatening, offensive, or harmful.
63
+
64
+ ### Scope
65
+
66
+ This Code of Conduct applies both within project spaces and in public spaces
67
+ when an individual is representing the project or its community. Examples of
68
+ representing a project or community include using an official project e-mail
69
+ address, posting via an official social media account, or acting as an appointed
70
+ representative at an online or offline event. Representation of a project may be
71
+ further defined and clarified by project maintainers.
72
+
73
+ ### Enforcement
74
+
75
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
76
+ reported by contacting the project team at devops@contently.com. All
77
+ complaints will be reviewed and investigated and will result in a response that
78
+ is deemed necessary and appropriate to the circumstances. The project team is
79
+ obligated to maintain confidentiality with regard to the reporter of an incident.
80
+ Further details of specific enforcement policies may be posted separately.
81
+
82
+ Project maintainers who do not follow or enforce the Code of Conduct in good
83
+ faith may face temporary or permanent repercussions as determined by other
84
+ members of the project's leadership.
85
+
86
+ ### Attribution
87
+
88
+ This Code of Conduct is adapted from the [Contributor Covenant homepage](http://contributor-covenant.org), version 1.4,
89
+ available at [version](http://contributor-covenant.org/version/1/4)
data/Gemfile ADDED
@@ -0,0 +1,8 @@
1
+ # frozen_string_literal: true
2
+
3
+ source 'https://rubygems.org'
4
+
5
+ git_source(:github) { |repo_name| "https://github.com/#{repo_name}" }
6
+
7
+ # Specify your gem's dependencies in prune_ar.gemspec
8
+ gemspec
@@ -0,0 +1,101 @@
1
+ PATH
2
+ remote: .
3
+ specs:
4
+ prune_ar (0.1.0)
5
+ activerecord
6
+
7
+ GEM
8
+ remote: https://rubygems.org/
9
+ specs:
10
+ activemodel (5.2.2)
11
+ activesupport (= 5.2.2)
12
+ activerecord (5.2.2)
13
+ activemodel (= 5.2.2)
14
+ activesupport (= 5.2.2)
15
+ arel (>= 9.0)
16
+ activesupport (5.2.2)
17
+ concurrent-ruby (~> 1.0, >= 1.0.2)
18
+ i18n (>= 0.7, < 2)
19
+ minitest (~> 5.1)
20
+ tzinfo (~> 1.1)
21
+ arel (9.0.0)
22
+ ast (2.4.0)
23
+ coderay (1.1.2)
24
+ concurrent-ruby (1.1.3)
25
+ database_cleaner (1.7.0)
26
+ diff-lcs (1.3)
27
+ docile (1.3.1)
28
+ i18n (1.1.1)
29
+ concurrent-ruby (~> 1.0)
30
+ jaro_winkler (1.5.1)
31
+ json (2.1.0)
32
+ method_source (0.9.2)
33
+ minitest (5.11.3)
34
+ mysql2 (0.5.2)
35
+ parallel (1.12.1)
36
+ parser (2.5.3.0)
37
+ ast (~> 2.4.0)
38
+ pg (1.1.3)
39
+ powerpack (0.1.2)
40
+ pry (0.12.2)
41
+ coderay (~> 1.1.0)
42
+ method_source (~> 0.9.0)
43
+ rainbow (3.0.0)
44
+ rake (10.5.0)
45
+ rspec (3.8.0)
46
+ rspec-core (~> 3.8.0)
47
+ rspec-expectations (~> 3.8.0)
48
+ rspec-mocks (~> 3.8.0)
49
+ rspec-core (3.8.0)
50
+ rspec-support (~> 3.8.0)
51
+ rspec-expectations (3.8.2)
52
+ diff-lcs (>= 1.2.0, < 2.0)
53
+ rspec-support (~> 3.8.0)
54
+ rspec-mocks (3.8.0)
55
+ diff-lcs (>= 1.2.0, < 2.0)
56
+ rspec-support (~> 3.8.0)
57
+ rspec-support (3.8.0)
58
+ rspec_junit_formatter (0.4.1)
59
+ rspec-core (>= 2, < 4, != 2.12.0)
60
+ rubocop (0.61.1)
61
+ jaro_winkler (~> 1.5.1)
62
+ parallel (~> 1.10)
63
+ parser (>= 2.5, != 2.5.1.1)
64
+ powerpack (~> 0.1)
65
+ rainbow (>= 2.2.2, < 4.0)
66
+ ruby-progressbar (~> 1.7)
67
+ unicode-display_width (~> 1.4.0)
68
+ rubocop-junit_formatter (0.2)
69
+ rubocop (~> 0.49)
70
+ ruby-progressbar (1.10.0)
71
+ simplecov (0.16.1)
72
+ docile (~> 1.1)
73
+ json (>= 1.8, < 3)
74
+ simplecov-html (~> 0.10.0)
75
+ simplecov-html (0.10.2)
76
+ sqlite3 (1.3.13)
77
+ thread_safe (0.3.6)
78
+ tzinfo (1.2.5)
79
+ thread_safe (~> 0.1)
80
+ unicode-display_width (1.4.0)
81
+
82
+ PLATFORMS
83
+ ruby
84
+
85
+ DEPENDENCIES
86
+ bundler (~> 1.17)
87
+ database_cleaner (~> 1.7)
88
+ mysql2 (~> 0.5)
89
+ pg (~> 1.1)
90
+ prune_ar!
91
+ pry (~> 0.12)
92
+ rake (~> 10.0)
93
+ rspec (~> 3.8)
94
+ rspec_junit_formatter (~> 0.4)
95
+ rubocop (~> 0.61)
96
+ rubocop-junit_formatter (~> 0.2)
97
+ simplecov (~> 0.12)
98
+ sqlite3 (~> 1.3)
99
+
100
+ BUNDLED WITH
101
+ 1.17.1
data/LICENSE ADDED
@@ -0,0 +1,7 @@
1
+ Copyright 2018 Contently
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4
+
5
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
6
+
7
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,240 @@
1
+ # prune_ar
2
+
3
+ [![CircleCI](https://circleci.com/gh/contently/prune_ar.svg?style=shield)](https://circleci.com/gh/contently/prune_ar)
4
+
5
+ prune_ar is a gem that prunes database records using passed-in deletion criteria and then prunes all subsequently orphaned records. It uses ActiveRecord's `belongs_to` associations to find orphaned records. Its main intent is to let you delete any set of records you like while making sure that the database is left in a consistent state after the deletion (no orphaned records & no violated foreign key constraints). A side effect of pruning the orphaned records (done for consistency) is that it can effectively be used to prune down the whole database at once by issuing a delete on a top-level table. Contently uses this process to prune its production database down to one that is suitable for use in development (devoid of customer data).
6
+
7
+ The APIs provided are **destructive** to your database, so don't run this in production.
8
+
9
+ ## Getting Started
10
+ ### Prerequisites
11
+ ##### Production
12
+ No non-gem dependencies.
13
+
14
+ ##### Development & testing
15
+ Make sure to have [sqlite3](https://www.sqlite.org/index.html) installed so you can run the tests.
16
+
17
+ ### Support
18
+
19
+ #### Database
20
+
21
+ The gem **should** work with any database, since it uses either generic SQL or `ActiveRecord` abstractions that should be database-agnostic. Sadly, different databases sometimes have very different behaviors and quirks, so things may break on some of them. These are the databases the gem is tested against & the level of testing done against each:
22
+
23
+ Database | Level of testing (confidence of success)
24
+ --- | ---
25
+ PostgreSQL 9.2 | Unit tests
26
+ PostgreSQL 9.3 | Unit tests
27
+ PostgreSQL 9.4 | Unit tests & real world workload
28
+ PostgreSQL 9.5 | Unit tests
29
+ PostgreSQL 9.6 | Unit tests
30
+ PostgreSQL 10 | Unit tests & real world workload
31
+ PostgreSQL 11 | Unit tests
32
+ MySQL 5 | Unit tests
33
+ MySQL 8 | Unit tests
34
+ MariaDB 5 | Unit tests
35
+ MariaDB 10 | Unit tests
36
+ SQLite 3 | Unit tests
37
+
38
+ We have the highest confidence in PostgreSQL, since it is the only database the gem has been run against with a real-world workload (versions 9.4 and 10) in addition to the unit tests. Please let us know your experience (& any issues you face) with this gem when used with your database type.
39
+
40
+ If you use this gem with MySQL (or other databases with similar behavior), you may want to turn off the sanity check (which creates foreign key constraints). From my limited knowledge of MySQL, I believe it creates an index (if one doesn't exist) when a foreign key constraint is created. The gem makes no effort to clean up these indexes when the foreign key constraints are deleted. MySQL may delete them automatically, but I'm not sure.
41
+
42
+ #### ActiveRecord
43
+
44
+ This gem is known to work with ActiveRecord 5.x. The gemspec deliberately does not pin the ActiveRecord version, so you can try the gem with other versions (where it may or may not work).
45
+
46
+ ### Installation & usage
47
+
48
+ For bundler, add this to your `Gemfile`:
49
+
50
+ ```ruby
51
+ gem "prune_ar", "~> 0.1"
52
+ ```
53
+
54
+ followed by
55
+
56
+ ```sh
57
+ bundle install
58
+ ```
59
+
60
+ You can also just install the gem without bundler with
61
+
62
+ ```sh
63
+ gem install prune_ar
64
+ ```
65
+
66
+ #### Usage
67
+
68
+ Basic example:
69
+
70
+ ```ruby
71
+ require 'prune_ar'
72
+ deletion_criteria = { Account => ["accounts.internal = 'f'"] }
73
+ Rails.application.eager_load! # We do this to make sure all models are loaded
74
+ PruneAr::prune_all_models(deletion_criteria: deletion_criteria)
75
+ ```
76
+
77
+ This will delete all external `Account`s & any child records that have an upward dependency chain (unlimited number of hops) to those deleted records. If a table does **not** have an upward dependency chain to the `Account` table, it will remain untouched.
78
+
79
+ One API is provided: `PruneAr::prune_all_models`. `prune_all_models` gathers all models in your application by looking at the descendants of `ActiveRecord::Base`, so it is vital that all your `ActiveRecord` models are loaded before you call it. The models gathered here are the only ones that `PruneAr` will prune orphaned records from. If only a subset of your application's models are loaded (& seen by `PruneAr`), your database could be left in an inconsistent state (with respect to referential integrity) after pruning, or pruning could be blocked by foreign key constraints on tables untracked by `PruneAr`.
80
+
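To illustrate why loading order matters, here is a minimal plain-Ruby sketch. `BaseRecord` and `loaded_models` are hypothetical stand-ins (not part of prune_ar or ActiveRecord); the point is only that a descendant lookup can see nothing but the classes that have already been defined.

```ruby
# Hypothetical stand-in for ActiveRecord::Base, so the sketch runs without Rails.
class BaseRecord; end

# Collect every currently loaded (non-singleton) class that inherits from BaseRecord.
def loaded_models
  ObjectSpace.each_object(Class).select do |klass|
    !klass.singleton_class? && klass < BaseRecord
  end
end

# Nothing has been "loaded" yet, so the lookup finds no models.
before = loaded_models.map(&:name)

# Defining the class plays the role of the model file being loaded.
class Account < BaseRecord; end

# Only now is the model visible to the lookup.
after = loaded_models.map(&:name)
```

In a Rails application, `Rails.application.eager_load!` plays the role of the class definition in this sketch: it makes sure every model file has been evaluated before the descendants are gathered.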
81
+ Here's a brief description of what the main parameters to this API mean:
82
+
83
+ ---
84
+
85
+ ##### :deletion_criteria
86
+ The core pruning criteria that you want to execute (will be executed up front)
87
+ ```ruby
88
+ deletion_criteria: {
89
+ Account => ['accounts.id NOT IN (1, 2)'],
90
+ User => ["users.internal = 'f'", "users.active = 'f'"]
91
+ }
92
+ ```
93
+
94
+ ---
95
+
96
+ ##### :full_delete_models
97
+ Models for which you want to purge all records
98
+ ```ruby
99
+ full_delete_models: [Model1, Model2]
100
+ ```
101
+
102
+ ---
103
+
104
+ ##### :pre_queries_to_run
105
+ Arbitrary SQL statements to execute before pruning
106
+ ```ruby
107
+ pre_queries_to_run: ['UPDATE users SET invited_by_id = NULL WHERE invited_by_id IS NOT NULL']
108
+ ```
109
+
110
+ ---
111
+
112
+ ##### :conjunctive_deletion_criteria
113
+ Pruning criteria you want executed in conjunction with each iteration of orphaned-record pruning (one case where this is useful is when pruning entities that don't have a `belongs_to` chain to the entities we pruned but are instead associated via join tables)
114
+ ```ruby
115
+ conjunctive_deletion_criteria: {
116
+ Image => ['NOT EXISTS (SELECT 1 FROM imagings WHERE imagings.image_id = images.id)']
117
+ }
118
+ ```
119
+
120
+ ---
121
+
122
+ ##### :perform_sanity_check (defaults to true)
123
+ Determines whether `PruneAr` sanity checks its own pruning by setting (& subsequently removing) foreign key constraints for all `belongs_to` relations. This is to prove that referential integrity was maintained.
124
+ ```ruby
125
+ perform_sanity_check: true
126
+ ```
127
+
128
+ ---
129
+
130
+ ##### :logger
131
+ You can provide your own logger to be used for logging any messages that the API logs
132
+ ```ruby
133
+ logger: Logger.new(STDOUT).tap { |l| l.level = Logger::INFO }
134
+ ```
135
+
136
+ ---
137
+
138
+ Here is an example of a rake task that is similar to what Contently uses to prune their database:
139
+
140
+ ```ruby
141
+ desc 'Prune tables using prune_ar'
142
+ task prune_tables: :no_prod_env do
143
+ Rails.application.eager_load!
144
+ PruneAr::prune_all_models(
145
+ deletion_criteria: {
146
+ Account => ['id NOT IN (589, 87)'],
147
+ User => ["email NOT ILIKE '%@company.com'"]
148
+ },
149
+ # This pre-query makes sure that the users that we want to keep (emails like company.com) are not pruned because they were
150
+ # => invited by another user that doesn't have this email (and hence we've deleted their record)
151
+ pre_queries_to_run: ["UPDATE users SET invited_by_id = NULL WHERE invited_by_id IS NOT NULL"],
152
+ # Since images are referenced via a join table, they do not have a direct upward dependency chain to another entity
153
+ # => so we manually prune them using the query below
154
+ conjunctive_deletion_criteria: {
155
+ Image => ['NOT EXISTS (SELECT 1 FROM imagings WHERE imagings.image_id = images.id)']
156
+ },
157
+ full_delete_models: [Comment],
158
+ logger: Logger.new(STDOUT).tap { |l| l.level = Logger::INFO }
159
+ )
160
+ end
161
+ ```
162
+
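The task above depends on a `:no_prod_env` prerequisite whose definition is not shown in this README. As a rough sketch of the kind of guard it might wrap, the helpers below refuse to continue when the environment looks like production. The method names and the choice of `RAILS_ENV`/`RACK_ENV` are assumptions for illustration, not prune_ar's or Contently's actual code.

```ruby
# Hypothetical guard helpers; not part of prune_ar.
# Returns true when either common environment variable says "production".
def production_env?(env = ENV)
  %w[RAILS_ENV RACK_ENV].any? { |key| env[key] == 'production' }
end

# Raises instead of letting a destructive task continue in production.
def abort_in_production!(env = ENV)
  raise 'Refusing to prune: production environment detected' if production_env?(env)
end
```

A `no_prod_env` rake task could then simply call `abort_in_production!` so that any task listing it as a prerequisite stops before touching the database.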
163
+ ## Details
164
+
165
+ The motivation for writing prune_ar came about due to two main things:
166
+
167
+ - wanting a system to specify easy deletion criteria for a few top level tables & have the whole database be pruned accordingly. One use of this is to take a production database, provide a few high level table deletion criteria & end up with a pruned small clean database to be used for development purposes.
168
+ - previous approaches to accomplish the above goal had issues with orphaned records. These orphaned records are fairly harmless on their own, but they create major issues when they prevent you from adding foreign key constraints to the tables that contain them.
169
+
170
+ prune_ar solves both of these. Here is a high level overview of what the algorithm used in prune_ar does:
171
+
172
+ 1. Gather all `belongs_to` associations for the models given.
173
+ 2. Drop all foreign key constraints (if the database supports foreign keys). This is done so prune_ar can delete records without hitting foreign key violation errors.
174
+ 3. Run any queries specified in `pre_queries_to_run`.
175
+ 4. Run the base deletion provided via `deletion_criteria`.
176
+ 5. Truncate tables for `full_delete_models`.
177
+ 6. Prune orphaned records & also delete via `conjunctive_deletion_criteria` alongside.
178
+ 7. Restore original foreign key constraints (if the database supports foreign keys).
179
+
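The orphan-pruning step (step 6) has to repeat until nothing more is deleted, because removing one orphan can orphan its own children. The following is an in-memory illustration of that fixed-point loop only; it is not prune_ar's implementation, and the `prune_orphans` helper, the table layout, and the association triples are all assumptions made for the sketch.

```ruby
# In-memory illustration only; this is not prune_ar's actual implementation.
# Each table is an array of row hashes; each association is a triple of
# [child_table, foreign_key, parent_table], mirroring a belongs_to.
def prune_orphans(tables, associations)
  loop do
    deleted = 0
    associations.each do |child, foreign_key, parent|
      parent_ids = tables[parent].map { |row| row[:id] }
      before = tables[child].size
      # A row is orphaned when its foreign key is set but points at no parent.
      tables[child].reject! do |row|
        !row[foreign_key].nil? && !parent_ids.include?(row[foreign_key])
      end
      deleted += before - tables[child].size
    end
    # Stop once a full pass over all associations deletes nothing.
    break if deleted.zero?
  end
  tables
end

# Account 2 has already been deleted by the base deletion criteria.
tables = {
  accounts: [{ id: 1 }],
  users: [{ id: 10, account_id: 1 }, { id: 11, account_id: 2 }],
  posts: [{ id: 100, user_id: 10 }, { id: 101, user_id: 11 }]
}
associations = [
  [:posts, :user_id, :users],
  [:users, :account_id, :accounts]
]

prune_orphans(tables, associations)
```

In this example the first pass removes user 11 (its account is gone), and only the second pass can remove post 101, which became orphaned as a result. This is why a single pass over the associations is not enough.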
180
+ prune_ar handles polymorphic `belongs_to` associations & is able to prune [HABTM](https://guides.rubyonrails.org/association_basics.html#the-has-and-belongs-to-many-association) tables in addition to simple `belongs_to` associations.
181
+
182
+ ## Development
183
+ ### Installation
184
+ ```sh
185
+ gem install bundler
186
+ git clone https://github.com/contently/prune_ar.git
187
+ cd ./prune_ar
188
+ bundle install
189
+ ```
190
+
191
+ ### Running the tests
192
+ Assuming you've followed the steps outlined above for the Development installation, you can run
193
+ ```sh
194
+ bundle exec rspec
195
+ ```
196
+ to execute all tests.
197
+
198
+ #### Test structure
199
+ Each class has its own `class_name_spec.rb` file in the [spec](spec) directory. The database [schema](spec/support/schema.rb) & [models](spec/support/models.rb) are located in [spec/support](spec/support).
200
+
201
+ ### Coding style
202
+ Mostly standard rubocop guidelines are followed, with a few modifications, as can be seen in [.rubocop.yml](.rubocop.yml).
203
+
204
+ ## Deployment
205
+ In order to deploy a new version of the gem to rubygems:
206
+
207
+ - Bump the version in [version.rb](lib/prune_ar/version.rb) as appropriate according to [SemVer](http://semver.org/).
208
+ - Commit all changes & merge them to the master branch.
209
+ - On latest master (after a `git pull` on master):
210
+
211
+ ```sh
212
+ rake build
213
+ rake release
214
+ ```
215
+
216
+ If all went well, a new version of this gem should be published on rubygems.
217
+
218
+ ## Built With
219
+
220
+ * [bundler](https://bundler.io/) - Dependency management.
221
+ * [activerecord](https://rubygems.org/gems/activerecord) - This gem is based around ActiveRecord models.
222
+ * [rspec](https://rubygems.org/gems/rspec) - Testing framework.
223
+
224
+ ## Contributing
225
+
226
+ Please read [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct, and the process for submitting pull requests to us.
227
+
228
+ ## Versioning
229
+
230
+ We use [SemVer](http://semver.org/) for versioning. For the versions available, see the [tags on this repository](https://github.com/contently/prune_ar/tags).
231
+
232
+ ## Authors
233
+
234
+ * **Anirban Mukhopadhyay** - *Initial work* - [anirbanmu](https://github.com/anirbanmu)
235
+
236
+ See also the list of [contributors](https://github.com/contently/prune_ar/contributors) who participated in this project.
237
+
238
+ ## License
239
+
240
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details