schema-evolution-manager 0.9.24

@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: f558b6d464ff5f0df460c0a6cfdc2a9b3eadc062
+   data.tar.gz: 59038ba0b4234da0ea505718d1846414fea3bb40
+ SHA512:
+   metadata.gz: 0d52b58665bbb777cee7eff41648fcbd0b11d1b69528c47f1175a039bb516ffb0854c1ff1b027729661edbbc87402b7514f72fc984ca58f581a9af17a9acb2a0
+   data.tar.gz: 972a3132a497d487919031ec4acee6d25be979c73a5ac9727a92cb75ef821d947dacaec835675710909fe3c56929cae357b57d9b227da4a36f8d1cf15b003938
@@ -0,0 +1,334 @@
+ # Schema Evolution Manager (sem)
+
+ ## Intended Audience
+
+ - Engineers who regularly manage the creation of scripts to update the
+   schema in a postgresql database.
+
+ - Engineers who want to simplify and/or standardize how other team
+   members contribute schema changes to a postgresql database.
+
+ ## Purpose
+
+ Schema Evolution Manager (sem) makes it very simple for engineers to
+ contribute schema changes to a postgresql database, managing the
+ schema evolutions as proper source code. Schema changes are deployed
+ as gzipped tarballs named with the corresponding git tag.
+
+ To apply schema changes to a particular database, download a tarball
+ and use sem to figure out which scripts have not yet been
+ applied, then apply those scripts in chronological order.
+
+ sem provides well-tested, simple tools to manage the process
+ of creating and applying schema upgrade scripts to databases in all
+ environments.
+
+ - scripts are automatically named with a timestamp assigned at time
+   of creation
+
+ - all scripts applied to the postgresql database are recorded in
+   the table schema_evolution_manager.scripts - making it simple to
+   see what has been applied, and when, if needed.
+
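Because script names begin with a creation timestamp, sorting the names as plain strings gives chronological order. The selection step described above can be sketched as follows. This is a minimal illustration, not sem's actual implementation: the `pending_scripts` helper and the sample filenames are assumptions made for the example.

```ruby
# Hypothetical helper (not part of sem): given the script names shipped in a
# tarball and the names already recorded as applied, return the names still
# to be applied, oldest first.
def pending_scripts(available, applied)
  # Timestamped names (YYYYMMDD-HHMMSS.sql) sort chronologically as strings.
  (available - applied).sort
end

available = ["20130401-090000.sql", "20130318-105456.sql", "20130318-214407.sql"]
applied   = ["20130318-105456.sql"]

pending = pending_scripts(available, applied)
puts pending
```

The already-applied script is skipped and the remaining two are listed oldest first.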
+ sem contains only tools for managing schema evolutions. The idea is
+ that you create one git repository for each of your databases, then use
+ sem to manage the schema evolution of each database.
+
+ At Gilt Groupe, we started using sem in early 2012 and have observed
+ an increase in the reliability of our production schema deploys across
+ dozens of independent postgresql databases.
+
+ See INSTALLATION and GETTING STARTED for details.
+
+
+ ## Project Goals
+
+ - Absolutely minimal set of dependencies. We found that anything
+   more complex led developers to prefer to manage their own schema
+   evolutions. We prefer small sets of scripts that each do one thing
+   well.
+
+ - Committed to true simplicity - features that would add complexity
+   are not added. We hope that more advanced features might be built
+   on top of schema evolution manager.
+
+ - Works for ALL applications - schema management is a first class
+   task now so any application framework can leverage these
+   migration tools.
+
+ - No rollback. We have found in practice that rolling back schema
+   changes is not 100% reliable. Therefore we intentionally do NOT
+   support rollback. This is an often debated element of sem,
+   and although the design itself could be easily extended to support
+   rollback, we currently have no plans to do so.
+
+ In place of rollback, we prefer to keep the focus on the criticality of
+ schema changes, encouraging peer review and lots of smaller evolutions
+ that themselves are relatively harmless.
+
+ This stems from our belief that schema evolutions are
+ fundamentally risky. We believe the best way to manage this risk is
+ to:
+
+   1. Treat schema evolution changes as normal software releases
+      as much as possible
+
+   2. Manage schema versions as simple tarballs - artifacts are
+      critical to provide 100% reproducibility. This means the exact
+      same artifacts can be applied in development then QA and finally
+      production environments.
+
+   3. Isolate schema changes as their own deploy. This then
+      guarantees that every other application itself can be rolled
+      back if needed. In practice, we have seen greater risk when
+      applications couple code changes with schema changes.
+
+ This last point bears some more detail. By fundamentally deciding to
+ manage and release schema changes independent of application changes:
+
+   1. Schema changes are required to be incremental. For example,
+      renaming a column takes 4 separate, independent production deploys:
+
+        a. add new column
+        b. deploy changes in application to use old and new column
+        c. deploy changes in application to use only new column
+        d. remove old column
+
+      Though at first this may seem more complex, each individual change
+      itself is smaller and lower risk.
+
+   2. It is worth repeating that all application deploys can now be
+      rolled back. This has been a huge win for our teams.
+
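The two schema-side steps of the column rename above (adding the new column and removing the old one) would each ship as their own sem script in a separate release. The sketch below only builds the SQL text for those two scripts. The table and column names (`users`, `email`, `email_address`) and the backfill statement are assumptions made for illustration.

```ruby
# Hypothetical table and column names, for illustration only.
TABLE      = "users"
OLD_COLUMN = "email"
NEW_COLUMN = "email_address"

# Schema release 1: add the new column and copy existing values across.
add_new_column = <<-SQL
alter table #{TABLE} add column #{NEW_COLUMN} text;
update #{TABLE} set #{NEW_COLUMN} = #{OLD_COLUMN};
SQL

# Schema release 2: drop the old column.
remove_old_column = <<-SQL
alter table #{TABLE} drop column #{OLD_COLUMN};
SQL

schema_releases = [add_new_column, remove_old_column]
schema_releases.each_with_index do |sql, index|
  puts "-- schema release #{index + 1}"
  puts sql
end
```

Each string would be saved to its own .sql file and registered with sem-add, so that every release carries one small change.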
+
+ ## Talks
+
+ First presented at PGDay NYC 2013:
+ https://speakerdeck.com/mbryzek/schema-evolutions-at-gilt-groupe
+
+ ## Dependencies
+
+ - Ruby: Currently tested against ruby 2.x. 1.8 and 1.9 are also supported.
+
+ - Postgres: Only tested against 9.x. We minimize use of advanced
+   features, so sem should work against the 8.x series. If you try 8.x and
+   run into problems, please let us know so we can update.
+
+ - plpgsql must be available in the database. If needed you can:
+
+       createlang plpgsql template1
+       [http://www.postgresql.org/docs/8.4/static/app-createlang.html]
+
+ - Git: Designed to use git for history (all versions since 1.7).
+
+ ## Installation
+
+     git clone git://github.com/mbryzek/schema-evolution-manager.git
+     cd schema-evolution-manager
+     git checkout 0.9.24
+     ruby ./configure.rb
+     sudo ./install.rb
+
+
+ ## Upgrading
+
+ Upgrading is as simple as following the Installation instructions for
+ the new version. Each installation of sem creates a new directory
+ for that specific version. When you install a newer version, a new
+ directory is created and the symlinks are updated to point to the latest
+ version.
+
+
+ ## Getting Started
+
+ ### Initialization
+
+     git init /tmp/sample
+     sem-init --dir /tmp/sample --url postgresql://postgres@localhost/sample
+
+ ### Writing your first sql script
+
+     cd /tmp/sample
+     echo "create table tmp_table (id integer)" > new.sql
+     sem-add ./new.sql
+
+ ### Applying changes to your local database:
+
+     cd /tmp/sample
+     createdb sample
+     sem-apply --url postgresql://postgres@localhost/sample
+
+ Note that you can also pass in the username, db host, and db name explicitly:
+
+     sem-apply --host localhost --name sample --user postgres
+
+ Similarly, for non-standard setups, you can optionally pass in the port:
+
+     sem-apply --host localhost --port 5433 --name sample --user postgres
+
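The `--url` form and the explicit `--host`, `--port`, `--name`, and `--user` flags carry the same information. A minimal sketch of that correspondence, using Ruby's standard `uri` library, is shown below. The `connection_parts` helper is hypothetical and is not how sem itself parses its arguments.

```ruby
require 'uri'

# Hypothetical helper (not part of sem): split a database URL into the
# pieces that the explicit command line flags take.
def connection_parts(url)
  uri = URI.parse(url)
  {
    :user => uri.user,
    :host => uri.host,
    :port => uri.port,                   # nil when the URL names no port
    :name => uri.path.sub(%r{\A/}, "")   # database name without the leading slash
  }
end

default_port = connection_parts("postgresql://postgres@localhost/sample")
custom_port  = connection_parts("postgresql://postgres@localhost:5433/sample")

puts default_port.inspect
puts custom_port.inspect
```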
+ ### When you are happy with your change, commit:
+
+     git commit -m "Adding a new tmp table to test sem process" scripts
+
+ ## Publishing a Release
+
+     cd /tmp/sample
+     sem-dist
+
+ By default, the sem-dist script will create the next micro git tag,
+ and use that tag in the file name.
+
+ If you already have a tag:
+
+     sem-dist --tag 0.0.2
+
+ You will now have a single artifact -
+ /tmp/sample/dist/sample-0.0.2.tar.gz - that you can manage in your standard
+ deploy process.
+
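To make the tag and file naming above concrete, here is a small sketch of incrementing the micro component of a tag and deriving the tarball name. Both helpers are hypothetical and assume plain three-part numeric tags. sem-dist's real logic may differ.

```ruby
# Hypothetical helpers (not part of sem), assuming tags of the form
# major.minor.micro with purely numeric parts.
def next_micro_tag(tag)
  major, minor, micro = tag.split(".").map { |part| Integer(part, 10) }
  [major, minor, micro + 1].join(".")
end

# Name of the artifact, following the sample-0.0.2.tar.gz pattern above.
def artifact_name(repo_name, tag)
  "#{repo_name}-#{tag}.tar.gz"
end

tag = next_micro_tag("0.0.1")
puts artifact_name("sample", tag)
```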
+
+ ## Deploying Schema Changes
+
+ ### Extract tarball on server
+
+     scp /tmp/sample/dist/sample-0.0.2.tar.gz <your server>:~/
+     ssh <your server>
+     tar xfz sample-0.0.2.tar.gz
+     cd sample-0.0.2
+
+ ### Do a dry run
+
+     sem-apply --url postgresql://postgres@localhost/sample --dry_run
+
+ You will likely see a number of create table statements (see data model section below). You should also see:
+
+     [DRY RUN] Applying 20130318-214407.sql
+
+ which tells you that if you apply these changes, that sql script will be applied to the sample db.
+
+
+ ### Apply the changes
+
+     sem-apply --url postgresql://postgres@localhost/sample
+
+ You will see:
+
+     Upgrading schema for postgres@localhost/sample
+     Applying 20130318-214407.sql
+
+ Attempt to apply again:
+
+     sem-apply --url postgresql://postgres@localhost/sample
+
+ You will see:
+
+     Upgrading schema for postgres@localhost/sample
+     All scripts have been previously applied
+
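The dry run, apply, and re-apply sequence above can be mimicked with an in-memory stand-in for the database. This is only a sketch of the observable behavior: `FakeDatabase` and `apply_pending` are hypothetical names, and no SQL is executed.

```ruby
require 'set'

# In-memory stand-in for the record of applied scripts (hypothetical).
class FakeDatabase
  def initialize
    @applied = Set.new
  end

  def applied?(filename)
    @applied.include?(filename)
  end

  def record!(filename)
    @applied << filename
  end
end

# Returns the number of scripts that were (or, in a dry run, would be) applied.
def apply_pending(db, filenames, dry_run = false)
  pending = filenames.sort.reject { |filename| db.applied?(filename) }
  pending.each do |filename|
    if dry_run
      puts "[DRY RUN] Applying #{filename}"
    else
      puts "Applying #{filename}"
      db.record!(filename)
    end
  end
  pending.size
end

db = FakeDatabase.new
scripts = ["20130318-214407.sql"]

dry    = apply_pending(db, scripts, true)  # reports the script, records nothing
first  = apply_pending(db, scripts)        # applies and records the script
second = apply_pending(db, scripts)        # nothing left to do
puts "All scripts have been previously applied" if second == 0
```

A dry run leaves the record untouched, which is why the first real apply still finds the script pending.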
+ ### Baselines
+
+ If you have an existing database, and you want to start using schema
+ evolution manager, we support the notion of creating a baseline. The
+ sem-baseline script will record that all of the scripts have been
+ applied to the database, without actually applying them. From this
+ point forward, only new scripts will be applied to the database.
+
+     sem-baseline --url postgresql://postgres@localhost/sample
+
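Conceptually, a baseline records each script's filename without running the script. The sketch below builds the kind of statements that would do this. It assumes that inserting only the `filename` column is enough, which may not match the real table definition, and `baseline_statements` is a hypothetical helper.

```ruby
# Hypothetical helper (not part of sem). Assumption: the scripts table can be
# populated by filename alone; the real table may require other columns.
def baseline_statements(filenames)
  filenames.sort.map do |filename|
    escaped = filename.gsub("'", "''")  # double any single quotes for SQL
    "insert into schema_evolution_manager.scripts (filename) values ('#{escaped}');"
  end
end

statements = baseline_statements(["20130318-214407.sql", "20130318-105456.sql"])
puts statements
```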
+
+ ## Data Model
+
+ sem will create a new postgresql schema in your database named 'schema_evolution_manager':
+
+     psql sample
+     set search_path to schema_evolution_manager;
+     \dt
+
+               Schema          |       Name        | Type  |  Owner
+     --------------------------+-------------------+-------+----------
+      schema_evolution_manager | bootstrap_scripts | table | postgres
+      schema_evolution_manager | scripts           | table | postgres
+
+ Each of these tables has a column named 'filename' which keeps track
+ of the sql files applied to each database.
+
+ - The scripts table is used for your application.
+ - The bootstrap_scripts table is used to manage upgrades to the sem
+   application itself.
+
+ For details on these tables, see scripts/*sql where the tables themselves are defined.
+
+
+ ## PLPGSQL Utilities
+
+ We've included a copy of the schema conventions we practice at
+ [Gilt Groupe](CONVENTIONS.md). There are also a number of utility plpgsql
+ functions to help developers apply these conventions in a systematic way.
+
+ The helpers are defined in
+
+     scripts/20130318-105456.sql
+
+ We have found these utilities incredibly useful - and are committed to
+ providing only the most relevant, high-quality, and clear
+ helpers possible.
+
+ In CONVENTIONS.md you will find a simple example of these conventions
+ and utilities in practice.
+
+
+ ## Command Line Utilities
+
+ - sem-init: Initialize a git repository for sem support
+ - sem-add: Adds a database upgrade script
+ - sem-dist: Create a distribution tar.gz file containing schema upgrade scripts
+ - sem-apply: Apply any deltas from a distribution tarball to a particular database
+ - sem-baseline: Record all existing scripts as applied, without running them
+
+
+ ## Attributes supported in sql migration scripts
+
+ Sometimes you may want to adjust the specific options used by sem when
+ applying SQL scripts. Attributes can be specified within each SQL file
+ in comments.
+
+ To specify an attribute, add a comment of the following format
+ anywhere in your SQL file (but at the top by convention):
+
+     -- sem.attribute.[name] = [value]
+
+ Currently supported attributes:
+
+ - transaction
+
+   - single (default): the entire file is applied within a
+     transaction (by using the psql command line argument
+     --single-transaction)
+   - none: Each command in the file will be applied in order. If a
+     later command in the file fails, there will be no rollback.
+
+ Examples:
+
+ - -- sem.attribute.transaction = none
+ - -- sem.attribute.transaction = single
+
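As an illustration of the comment format above, the following sketch extracts attributes from a SQL string and maps the transaction attribute to psql arguments. It is one possible reading of the format, not sem's parser: the regular expression, `parse_attributes`, and `psql_flags` are assumptions made for the example.

```ruby
# Assumed pattern for lines like: -- sem.attribute.transaction = none
ATTRIBUTE_PATTERN = /^\s*--\s*sem\.attribute\.(\w+)\s*=\s*(\S+)\s*$/

# Hypothetical helper: collect every attribute comment in a SQL string.
def parse_attributes(sql)
  sql.each_line.each_with_object({}) do |line, attributes|
    match = ATTRIBUTE_PATTERN.match(line)
    attributes[match[1]] = match[2] if match
  end
end

# Hypothetical helper: 'single' is the documented default.
def psql_flags(attributes)
  if attributes.fetch("transaction", "single") == "single"
    ["--single-transaction"]
  else
    []
  end
end

sql = "-- sem.attribute.transaction = none\ncreate index concurrently on t(id);\n"
attributes = parse_attributes(sql)
puts attributes.inspect
puts psql_flags(attributes).inspect
```

A file with no attribute comment falls back to the single-transaction default.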
+
+ ## TODO
+
+ - Consider offering an option to install via ruby gems
+
+
+ ## License
+
+ Copyright 2013-2015 Gilt Groupe, Inc.
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
@@ -0,0 +1,46 @@
+ #!/usr/bin/env ruby
+ # == Adds a database upgrade script to this repository.
+ #
+ # == Usage
+ # sem-add <path>
+ #
+ # == Example
+ # sem-add ./new-script.sql
+ #
+
+ load File.join(File.dirname(__FILE__), 'sem-config')
+
+ file = ARGV.shift
+ if file.to_s.strip == ""
+   puts "**** ERROR: Need file path"
+   SchemaEvolutionManager::RdocUsage.printAndExit(1)
+ end
+
+ # File.exist? replaces the deprecated File.exists?
+ SchemaEvolutionManager::Preconditions.check_state(File.exist?(file), "File[#{file}] could not be found")
+ # Anchor the match so that only names ending in .sql are accepted
+ SchemaEvolutionManager::Preconditions.check_state(file.match(/\.sql\z/i), "File[#{file}] must end with .sql")
+
+ scripts_dir = File.join(Dir.pwd, "scripts")
+ SchemaEvolutionManager::Library.ensure_dir!(scripts_dir)
+
+ contents = IO.read(file)
+ now = Time.now.strftime('%Y%m%d-%H%M%S')
+ target = File.join(scripts_dir, "#{now}.sql")
+
+ padding = 1000
+ counter = 0
+ while File.exist?(target)
+   counter += 1
+   # Keep the numeric suffix at four digits (1001..9999): a five-digit
+   # suffix such as z10000 would sort before z2000 and break the ordering
+   if padding + counter >= padding * 10
+     raise "ERROR: Cannot add more than #{padding * 9} files / second. Doing so would break the implementation of lexicographically sorted filenames"
+   end
+   # The .z prefix is here to ensure these files sort AFTER the first
+   # .sql file. Maintaining lexicographic sorting is important to
+   # support simple command line tools (e.g. ls)
+   target = File.join(scripts_dir, "#{now}.z#{padding + counter}.sql")
+ end
+
+ puts "Adding #{target}"
+ SchemaEvolutionManager::Library.system_or_error("mv #{file} #{target}")
+ SchemaEvolutionManager::Library.system_or_error("git add #{target}")
+
+ puts "File staged in git. You need to commit and push"
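The collision handling in sem-add can be exercised in isolation. The sketch below restates the naming rule as a pure function over a set of existing names, so it touches neither the filesystem nor git. `next_script_name` is a hypothetical name, and the sketch stops once the suffix would reach five digits, since a five-digit suffix would no longer sort after the four-digit ones.

```ruby
require 'set'

# Hypothetical restatement of sem-add's naming rule as a pure function.
def next_script_name(timestamp, existing, padding = 1000)
  name = "#{timestamp}.sql"
  counter = 0
  while existing.include?(name)
    counter += 1
    # Stop before the suffix grows to five digits, which would break sorting.
    raise "too many scripts for #{timestamp}" if padding + counter >= padding * 10
    # ".z" sorts after ".s", so suffixed names follow the plain .sql name.
    name = "#{timestamp}.z#{padding + counter}.sql"
  end
  name
end

existing = Set.new
names = 3.times.map do
  name = next_script_name("20130318-214407", existing)
  existing << name
  name
end

puts names
```

Names generated within the same second keep their creation order when sorted as plain strings, which is what tools like `ls` rely on.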
@@ -0,0 +1,41 @@
+ #!/usr/bin/env ruby
+ # == Applies all pending database upgrade scripts to the specified database. All pending SQL scripts are
+ # sorted by the timestamp assigned at the time the script was added
+ #
+ # == Usage
+ # sem-apply --url <database url>
+ # or
+ # sem-apply --host <database host> --user <db user> --name <db name>
+ #
+ # == Examples
+ # sem-apply --url postgresql://postgres@localhost/sample
+ # sem-apply --host localhost --user web --name test
+ #
+
+ load File.join(File.dirname(__FILE__), 'sem-config')
+
+ args = SchemaEvolutionManager::Args.from_stdin(:optional => %w(url host port name user dry_run))
+
+ db = SchemaEvolutionManager::Db.from_args(args)
+ db.bootstrap!
+
+ dry_run = args.dry_run.nil? ? false : args.dry_run
+
+ util = SchemaEvolutionManager::ApplyUtil.new(db, :dry_run => dry_run)
+
+ puts "Upgrading schema for #{db.url}"
+
+ begin
+   count = util.apply!("./scripts")
+   if count == 0
+     puts " All scripts have been previously applied"
+   end
+ rescue SchemaEvolutionManager::ScriptError => e
+   puts ""
+   puts "ERROR applying script: %s" % e.filename
+   puts ""
+   puts "If this script has previously been applied to this database, you can record it as having been applied by:"
+   puts " " + e.dml
+   puts ""
+   exit(1)
+ end
@@ -0,0 +1,34 @@
+ #!/usr/bin/env ruby
+ # == Makes sure that sem is bootstrapped, and then adds all existing scripts to the sem-database without applying
+ # Useful for migration to sem
+ #
+ # == Usage
+ # sem-baseline --url <database url>
+ # or
+ # sem-baseline --host <database host> --user <db user> --name <db name>
+ #
+ # == Examples
+ # sem-baseline --url postgresql://postgres@localhost/sample
+ # sem-baseline --host localhost --user web --name test
+ #
+
+
+ load File.join(File.dirname(__FILE__), 'sem-config')
+
+ args = SchemaEvolutionManager::Args.from_stdin(:optional => %w(url host port name user dry_run))
+
+ db = SchemaEvolutionManager::Db.from_args(args)
+ db.bootstrap!
+
+ dry_run = args.dry_run.nil? ? false : args.dry_run
+
+ util = SchemaEvolutionManager::BaselineUtil.new(db, :dry_run => dry_run)
+
+ puts "Baselining schema for #{db.url}"
+
+ begin
+   count = util.apply!("./scripts")
+   if count == 0
+     puts " All scripts have been previously applied"
+   end
+ end