schema-evolution-manager 0.9.24

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: f558b6d464ff5f0df460c0a6cfdc2a9b3eadc062
4
+ data.tar.gz: 59038ba0b4234da0ea505718d1846414fea3bb40
5
+ SHA512:
6
+ metadata.gz: 0d52b58665bbb777cee7eff41648fcbd0b11d1b69528c47f1175a039bb516ffb0854c1ff1b027729661edbbc87402b7514f72fc984ca58f581a9af17a9acb2a0
7
+ data.tar.gz: 972a3132a497d487919031ec4acee6d25be979c73a5ac9727a92cb75ef821d947dacaec835675710909fe3c56929cae357b57d9b227da4a36f8d1cf15b003938
@@ -0,0 +1,334 @@
1
+ # Schema Evolution Manager (sem)
2
+
3
+ ## Intended Audience
4
+
5
+ - Engineers who regularly manage the creation of scripts to update the
6
+ schema in a postgresql database.
7
+
8
+ - Engineers who want to simplify and/or standardize how other team
9
+ members contribute schema changes to a postgresql database.
10
+
11
+ ## Purpose
12
+
13
+ Schema Evolution Manager (sem) makes it very simple for engineers to
14
+ contribute schema changes to a postgresql database, managing the
15
+ schema evolutions as proper source code. Schema changes are deployed
16
+ as gzipped tarballs named with the corresponding git tag.
17
+
18
+ To apply schema changes to a particular database, download a tarball
19
+ and use sem to figure out which scripts have not yet been
20
+ applied, then apply those scripts in chronological order.
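
The selection step described above can be sketched in a few lines of Ruby. This is an illustration only, not sem's actual implementation: `pending_scripts` is a hypothetical helper, and the list of already-applied filenames is assumed to have been read from the database beforehand.

```ruby
# Hypothetical helper (not part of sem): given the script filenames found
# in a tarball and the filenames already recorded as applied, return the
# ones still to run. Timestamped names sort chronologically as strings.
def pending_scripts(available, applied)
  (available - applied).sort
end

available = ["20130319-101500.sql", "20130318-214407.sql", "20130320-090000.sql"]
applied   = ["20130318-214407.sql"]

puts pending_scripts(available, applied)
```

Running the same selection a second time, after the two remaining files are recorded, would return an empty list, which is why re-applying a tarball is harmless.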
21
+
22
+ sem provides well tested, simple tools to manage the process
23
+ of creating and applying schema upgrade scripts to databases in all
24
+ environments.
25
+
26
+ - scripts are automatically named with a timestamp assigned at time
27
+ of creation
28
+
29
+ - all scripts applied to the postgresql database are recorded in
30
+ the table schema_evolution_manager.scripts - making it simple to
31
+ see what has been applied, and when, if needed.
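
As a rough sketch of why timestamp names are convenient, the snippet below builds names with the same `strftime` pattern that the sem-add script uses (`%Y%m%d-%H%M%S`) and shows that plain string sorting puts them in creation order. `script_name` is a hypothetical helper introduced for illustration.

```ruby
# Hypothetical helper: build a script filename from a creation time,
# assuming the '%Y%m%d-%H%M%S' pattern used by sem-add.
def script_name(time)
  "#{time.strftime('%Y%m%d-%H%M%S')}.sql"
end

earlier = script_name(Time.utc(2013, 3, 18, 21, 44, 7))
later   = script_name(Time.utc(2013, 3, 19, 8, 0, 0))

puts earlier                     # 20130318-214407.sql
puts [later, earlier].sort.first # the earlier script sorts first
```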
32
+
33
+ sem contains only tools for managing schema evolutions. The idea is
34
+ that you create one git repository for each of your databases then use
35
+ sem to manage the schema evolution of each database.
36
+
37
+ At Gilt Groupe, we started using sem in early 2012 and have observed
38
+ an increase in the reliability of our production schema deploys across
39
+ dozens of independent postgresql databases.
40
+
41
+ See INSTALLATION and GETTING STARTED for details.
42
+
43
+
44
+ ## Project Goals
45
+
46
+ - Absolutely minimal set of dependencies. We found that anything
47
+ more complex led developers to prefer to manage their own schema
48
+ evolutions. We prefer small sets of scripts that each do one thing
49
+ well.
50
+
51
+ - Committed to true simplicity - features that would add complexity
52
+ are not added. We hope that more advanced features might be built
53
+ on top of schema evolution manager.
54
+
55
+ - Works for ALL applications - schema management is a first class
56
+ task now so any application framework can leverage these
57
+ migration tools.
58
+
59
+ - No rollback. We have found in practice that rolling back schema
60
+ changes is not 100% reliable. Therefore we intentionally do NOT
61
+ support rollback. This is an often debated element of sem,
62
+ and although the design itself could be easily extended to support
63
+ rollback, we currently have no plans to do so.
64
+
65
+ In place of rollback, we prefer to focus on the critical nature of
66
+ schema changes, encouraging peer review and lots of smaller evolutions
67
+ that themselves are relatively harmless.
68
+
69
+ This stems from the idea that we believe schema evolutions are
70
+ fundamentally risky. We believe the best way to manage this risk is
71
+ to:
72
+
73
+ 1. Treat schema evolution changes as normal software releases
74
+ as much as possible
75
+
76
+ 2. Manage schema versions as simple tarballs - artifacts are
77
+ critical to provide 100% reproducibility. This means the exact
78
+ same artifacts can be applied in development then QA and finally
79
+ production environments.
80
+
81
+ 3. Isolate schema changes as their own deploy. This then
82
+ guarantees that every other application itself can be rolled
83
+ back if needed. In practice, we have seen greater risk when
84
+ applications couple code changes with schema changes.
85
+
86
+ This last point bears some more detail. By fundamentally deciding to
87
+ manage and release schema changes independent of application changes:
88
+
89
+ 1. Schema changes are required to be incremental. For example, to
90
+ rename a column takes 4 separate, independent production deploys:
91
+
92
+ a. add new column
93
+ b. deploy changes in application to use old and new column
94
+ c. deploy changes in application to use only new column
95
+ d. remove old column
96
+
97
+ Though at first this may seem more complex, each individual change
98
+ itself is smaller and lower risk.
99
+
100
+ 2. It is worth repeating that all application deploys can now be
101
+ rolled back. This has been a huge win for our teams.
102
+
103
+
104
+ ## Talks
105
+
106
+ First presented at PGDay NYC 2013:
107
+ https://speakerdeck.com/mbryzek/schema-evolutions-at-gilt-groupe
108
+
109
+ ## Dependencies
110
+
111
+ - Ruby: Currently tested against ruby 2.x. 1.8 and 1.9 are also supported.
112
+
113
+ - Postgres: Only tested against 9.x. We minimize use of advanced
114
+ features, so sem should work against the 8.x series. If you try 8.x and
115
+ run into problems, please let us know so we can update.
116
+
117
+ - plpgsql must be available in the database. If needed you can:
118
+
119
+ createlang plpgsql template1
120
+ [http://www.postgresql.org/docs/8.4/static/app-createlang.html]
121
+
122
+ - Git: Designed to use git for history (all versions since 1.7).
123
+
124
+ ## Installation
125
+
126
+ git clone https://github.com/mbryzek/schema-evolution-manager.git
127
+ cd schema-evolution-manager
128
+ git checkout 0.9.24
129
+ ruby ./configure.rb
130
+ sudo ./install.rb
131
+
132
+
133
+ ## Upgrading
134
+
135
+ Upgrading is as simple as following the Installation instructions for
136
+ the new version. Each installation of sem will create a new directory
137
+ for that specific version. When you install the newer version, a new
138
+ directory will be created and symlinks updated to point to the latest
139
+ version.
140
+
141
+
142
+ ## Getting Started
143
+
144
+ ### Initialization
145
+
146
+ git init /tmp/sample
147
+ sem-init --dir /tmp/sample --url postgresql://postgres@localhost/sample
148
+
149
+ ### Writing your first sql script
150
+
151
+ cd /tmp/sample
152
+ echo "create table tmp_table (id integer)" > new.sql
153
+ sem-add ./new.sql
154
+
155
+ ### Applying changes to your local database:
156
+
157
+ cd /tmp/sample
158
+ createdb sample
159
+ sem-apply --url postgresql://postgres@localhost/sample
160
+
161
+ Note that you can also pass in the username, db host, and db name explicitly:
162
+
163
+ sem-apply --host localhost --name sample --user postgres
164
+
165
+ Similarly, for non-standard setups, you can optionally pass in the port:
166
+
167
+ sem-apply --host localhost --port 5433 --name sample --user postgres
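
To make the relationship between the two invocation styles concrete, here is a small sketch that assembles a connection URL from the individual values. `database_url` is a hypothetical helper, not sem's own code, and it assumes the flags map onto the URL components in the usual way (it also relies on keyword arguments, so it needs Ruby 2.1 or later).

```ruby
# Hypothetical helper (not sem's own code): assemble a connection URL
# from the individual --user, --host, --port, and --name values.
def database_url(user:, host:, name:, port: nil)
  authority = port ? "#{host}:#{port}" : host
  "postgresql://#{user}@#{authority}/#{name}"
end

puts database_url(user: "postgres", host: "localhost", name: "sample")
puts database_url(user: "postgres", host: "localhost", name: "sample", port: 5433)
```

The first call prints the same URL used in the `--url` example above; the second adds the port to the host portion.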
168
+
169
+ ### When you are happy with your change, commit:
170
+
171
+ git commit -m "Adding a new tmp table to test sem process" scripts
172
+
173
+ ## Publishing a Release
174
+
175
+ cd /tmp/sample
176
+ sem-dist
177
+
178
+ By default, the sem-dist script will create the next micro git tag,
179
+ and use that tag in the file name.
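
One plausible reading of "next micro git tag" is that the last component of a `major.minor.micro` tag is incremented. The sketch below illustrates that reading only; `next_micro` is a hypothetical helper and sem-dist's real logic may differ.

```ruby
# Hypothetical helper: bump the last (micro) component of a
# major.minor.micro tag. Raises ArgumentError on non-numeric parts.
def next_micro(tag)
  major, minor, micro = tag.split(".").map { |part| Integer(part) }
  [major, minor, micro + 1].join(".")
end

puts next_micro("0.0.1") # 0.0.2
puts next_micro("1.4.9") # 1.4.10
```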
180
+
181
+ If you already have a tag:
182
+
183
+ sem-dist --tag 0.0.2
184
+
185
+ You will now have a single artifact -
186
+ /tmp/sample/dist/sample-0.0.2.tar.gz - that you can manage in your standard
187
+ deploy process.
188
+
189
+
190
+ ## Deploying Schema Changes
191
+
192
+ ### Extract tarball on server
193
+
194
+ scp /tmp/sample/dist/sample-0.0.2.tar.gz <your server>:~/
195
+ ssh <your server>
196
+ tar xfz sample-0.0.2.tar.gz
197
+ cd sample-0.0.2
198
+
199
+ ### Do a dry run
200
+
201
+ sem-apply --url postgresql://postgres@localhost/sample --dry_run
202
+
203
+ You will likely see a number of create table statements (see data model section below). You should also see:
204
+
205
+ [DRY RUN] Applying 20130318-214407.sql
206
+
207
+ which tells you that applying these changes will run that sql script against the sample db.
208
+
209
+
210
+ ### Apply the changes
211
+
212
+ sem-apply --url postgresql://postgres@localhost/sample
213
+
214
+ You will see:
215
+
216
+ Upgrading schema for postgres@localhost/sample
217
+ Applying 20130318-214407.sql
218
+
219
+ Attempt to apply again:
220
+
221
+ sem-apply --url postgresql://postgres@localhost/sample
222
+
223
+ You will see:
224
+
225
+ Upgrading schema for postgres@localhost/sample
226
+ All scripts have been previously applied
227
+
228
+ ### Baselines
229
+
230
+ If you have an existing database, and you want to start using schema
231
+ evolution manager, we support the notion of creating a baseline. The
232
+ sem-baseline script will record that all of the scripts have been
233
+ applied to the database, without actually applying them. From this
234
+ point forward, only new scripts will be applied to the database.
235
+
236
+ sem-baseline --url postgresql://postgres@localhost/sample
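
The difference between applying and baselining can be illustrated with an in-memory stand-in for the scripts table. Everything in this sketch is hypothetical (the class, its methods, and the second filename); it only contrasts the two modes and does not talk to a database.

```ruby
require "set"

# In-memory stand-in for the schema_evolution_manager.scripts table.
# Both methods are hypothetical and exist only to contrast the two modes.
class FakeScriptsTable
  attr_reader :recorded, :executed

  def initialize
    @recorded = Set.new
    @executed = []
  end

  # Apply: run each unrecorded script in sorted order, then record it.
  def apply(filenames)
    filenames.sort.each do |name|
      next if @recorded.include?(name)
      @executed << name
      @recorded << name
    end
  end

  # Baseline: record every script without running any of them.
  def baseline(filenames)
    filenames.each { |name| @recorded << name }
  end
end

table = FakeScriptsTable.new
table.baseline(["20130318-214407.sql"])
table.apply(["20130318-214407.sql", "20130401-120000.sql"])
puts table.executed
```

Only the script added after the baseline is executed; the baselined one is treated as already applied.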
237
+
238
+
239
+ ## Data Model
240
+
241
+ sem will create a new postgresql schema in your database named 'schema_evolution_manager'
242
+
243
+ psql sample
244
+ set search_path to schema_evolution_manager;
245
+ \dt
246
+
247
+ Schema                   | Name              | Type  | Owner
248
+ -------------------------+-------------------+-------+----------
249
+ schema_evolution_manager | bootstrap_scripts | table | postgres
250
+ schema_evolution_manager | scripts           | table | postgres
251
+
252
+ Each of these tables has a column named 'filename' which keeps track
253
+ of the sql files applied to each database.
254
+
255
+ - The scripts table is used for your application.
256
+ - The bootstrap_scripts table is used to manage upgrades to the sem
257
+ application itself.
258
+
259
+ For details on these tables, see scripts/*sql where the tables themselves are defined.
260
+
261
+
262
+ ## PLPGSQL Utilities
263
+
264
+ We've included a copy of the schema conventions we practice at
265
+ [Gilt Groupe](CONVENTIONS.md). There are also a number of utility plpgsql
266
+ functions to help developers apply these conventions in a systematic way.
267
+
268
+ The helpers are defined in
269
+
270
+ scripts/20130318-105456.sql
271
+
272
+ We have found these utilities incredibly useful - and are committed to
273
+ providing only the most relevant, high quality, and extremely clear
274
+ helpers.
275
+
276
+ In CONVENTIONS.md you will find a simple example of these conventions
277
+ and utilities in practice.
278
+
279
+
280
+ ## Command Line Utilities
281
+
282
+ - sem-init: Initialize a git repository for sem support
283
+ - sem-add: Adds a database upgrade script
284
+ - sem-dist: Create a distribution tar.gz file containing schema upgrade scripts
285
+ - sem-apply: Apply any deltas from a distribution tarball to a particular database
+ - sem-baseline: Record all existing scripts as applied to a database without applying them
286
+
287
+
288
+ ## Attributes supported in sql migration scripts
289
+
290
+ Sometimes you may want to adjust the specific options used by SEM when
291
+ applying SQL scripts. Attributes can be specified within each SQL file
292
+ in comments.
293
+
294
+ To specify an attribute, add a comment of the following format
295
+ anywhere in your SQL file (but at the top by convention):
296
+
297
+ -- sem.attribute.[name] = [value]
298
+
299
+ Currently supported attributes:
300
+
301
+ - transaction
302
+
303
+ - single (default): the entire file is applied within a
304
+ transaction (by using the psql command line argument
305
+ --single-transaction)
306
+ - none: Each command in the file will be applied in order. If a
307
+ later command in the file fails, there will be no rollback.
308
+
309
+ Examples:
310
+
311
+ - -- sem.attribute.transaction = none
312
+ - -- sem.attribute.transaction = single
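
As an illustration of the comment format, the sketch below extracts attributes from a SQL script and falls back to the documented default of `single`. The parser, its regular expression, and the sample index statement are assumptions made for this example; sem's own implementation may differ. A statement such as `create index concurrently`, which PostgreSQL does not allow inside a transaction block, is the kind of case where `none` is useful.

```ruby
# Hypothetical parser (sem's own implementation may differ): collect
# "-- sem.attribute.<name> = <value>" comments from a SQL script.
ATTRIBUTE_PATTERN = /\A\s*--\s*sem\.attribute\.(\w+)\s*=\s*(\S+)\s*\z/

def parse_attributes(sql)
  defaults = { "transaction" => "single" } # documented default
  sql.each_line.each_with_object(defaults) do |line, attrs|
    match = ATTRIBUTE_PATTERN.match(line)
    attrs[match[1]] = match[2] if match
  end
end

sql = <<~SQL
  -- sem.attribute.transaction = none
  create index concurrently users_email_idx on users(email);
SQL

puts parse_attributes(sql)["transaction"]
puts parse_attributes("create table tmp_table (id integer);")["transaction"]
```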
313
+
314
+
315
+ ## TODO
316
+
317
+ - Consider offering an option to install via ruby gems
318
+
319
+
320
+ ## License
321
+
322
+ Copyright 2013-2015 Gilt Groupe, Inc.
323
+
324
+ Licensed under the Apache License, Version 2.0 (the "License");
325
+ you may not use this file except in compliance with the License.
326
+ You may obtain a copy of the License at
327
+
328
+ http://www.apache.org/licenses/LICENSE-2.0
329
+
330
+ Unless required by applicable law or agreed to in writing, software
331
+ distributed under the License is distributed on an "AS IS" BASIS,
332
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
333
+ See the License for the specific language governing permissions and
334
+ limitations under the License.
@@ -0,0 +1,46 @@
1
+ #!/usr/bin/env ruby
2
+ # == Adds a database upgrade script to this repository.
3
+ #
4
+ # == Usage
5
+ # sem-add <path>
6
+ #
7
+ # == Example
8
+ # sem-add ./new-script.sql
9
+ #
10
+
11
+ load File.join(File.dirname(__FILE__), 'sem-config')
12
+
13
+ file = ARGV.shift
14
+ if file.to_s.strip == ""
15
+ puts "**** ERROR: Need file path"
16
+ SchemaEvolutionManager::RdocUsage.printAndExit(1)
17
+ end
18
+
19
+ SchemaEvolutionManager::Preconditions.check_state(File.exist?(file), "File[#{file}] could not be found")
20
+ SchemaEvolutionManager::Preconditions.check_state(file.match(/\.sql\z/i), "File[#{file}] must end with .sql")
21
+
22
+ scripts_dir = File.join(`pwd`.strip, "scripts")
23
+ SchemaEvolutionManager::Library.ensure_dir!(scripts_dir)
24
+
25
+ contents = IO.read(file)
26
+ now = Time.now.strftime('%Y%m%d-%H%M%S')
27
+ target = File.join(scripts_dir, "#{now}.sql")
28
+
29
+ padding = 1000
30
+ counter = 0
31
+ while File.exist?(target)
32
+ counter += 1
33
+ if counter >= padding * 10
34
+ raise "ERROR: Cannot add more than #{padding * 10} files / second. Doing so would break the implementation of lexicographically sorted filenames"
35
+ end
36
+ # The .z prefix is here to ensure these files sort AFTER the first
37
+ # .sql file. Maintaining lexicographic sorting is important to
38
+ # support simple command line tools (e.g. ls)
39
+ target = File.join(scripts_dir, "#{now}.z#{padding + counter}.sql")
40
+ end
41
+
42
+ puts "Adding #{target}"
43
+ SchemaEvolutionManager::Library.system_or_error("mv #{file} #{target}")
44
+ SchemaEvolutionManager::Library.system_or_error("git add #{target}")
45
+
46
+ puts "File staged in git. You need to commit and push"
@@ -0,0 +1,41 @@
1
+ #!/usr/bin/env ruby
2
+ # == Applies all pending database upgrade scripts to the specified database. All pending SQL scripts are
3
+ # sorted by the timestamp assigned at the time the script was added
4
+ #
5
+ # == Usage
6
+ # sem-apply --url <database url>
7
+ # or
8
+ # sem-apply --host <database host> --user <db user> --name <db name>
9
+ #
10
+ # == Examples
11
+ # sem-apply --url postgresql://postgres@localhost/sample
12
+ # sem-apply --host localhost --user web --name test
13
+ #
14
+
15
+ load File.join(File.dirname(__FILE__), 'sem-config')
16
+
17
+ args = SchemaEvolutionManager::Args.from_stdin(:optional => %w(url host port name user dry_run))
18
+
19
+ db = SchemaEvolutionManager::Db.from_args(args)
20
+ db.bootstrap!
21
+
22
+ dry_run = args.dry_run.nil? ? false : args.dry_run
23
+
24
+ util = SchemaEvolutionManager::ApplyUtil.new(db, :dry_run => dry_run)
25
+
26
+ puts "Upgrading schema for #{db.url}"
27
+
28
+ begin
29
+ count = util.apply!("./scripts")
30
+ if count == 0
31
+ puts " All scripts have been previously applied"
32
+ end
33
+ rescue SchemaEvolutionManager::ScriptError => e
34
+ puts ""
35
+ puts "ERROR applying script: %s" % e.filename
36
+ puts ""
37
+ puts "If this script has previously been applied to this database, you can record it as having been applied by:"
38
+ puts " " + e.dml
39
+ puts ""
40
+ exit(1)
41
+ end
@@ -0,0 +1,34 @@
1
+ #!/usr/bin/env ruby
2
+ # == Makes sure that sem is bootstrapped, and then adds all existing scripts to the sem-database without applying them.
3
+ # Useful for migration to sem
4
+ #
5
+ # == Usage
6
+ # sem-baseline --url <database url>
7
+ # or
8
+ # sem-baseline --host <database host> --user <db user> --name <db name>
9
+ #
10
+ # == Examples
11
+ # sem-baseline --url postgresql://postgres@localhost/sample
12
+ # sem-baseline --host localhost --user web --name test
13
+ #
14
+
15
+
16
+ load File.join(File.dirname(__FILE__), 'sem-config')
17
+
18
+ args = SchemaEvolutionManager::Args.from_stdin(:optional => %w(url host port name user dry_run))
19
+
20
+ db = SchemaEvolutionManager::Db.from_args(args)
21
+ db.bootstrap!
22
+
23
+ dry_run = args.dry_run.nil? ? false : args.dry_run
24
+
25
+ util = SchemaEvolutionManager::BaselineUtil.new(db, :dry_run => dry_run)
26
+
27
+ puts "Baselining schema for #{db.url}"
28
+
29
+ begin
30
+ count = util.apply!("./scripts")
31
+ if count == 0
32
+ puts " All scripts have been previously applied"
33
+ end
34
+ end