frahugo-s3sync 1.3.8

data/History.txt ADDED
2010-12-16
Version 1.3.7
Install instructions


2010-12-16
Version 1.3.7
Added delegate requirement in HTTPStreaming to fix the error "uninitialized constant S3sync::SimpleDelegator"

2010-11-29
Version 1.3.5
Able to set Content-Encoding with the --gzip option

2010-10-29
Version 1.3.3
Enumerator is used instead of Generator

2010-10-24
Version 1.3.1
Now compatible with ruby 1.9.2
Able to get the key and secret from s3sync command arguments.

=== 0.0.1 2009-08-05

* 1 major enhancement:
  * Initial release


2006-09-29:
Added support for --expires and --cache-control. E.g.:
  --expires="Thu, 01 Dec 2007 16:00:00 GMT"
  --cache-control="no-cache"

Thanks to Charles for pointing out the need for this, and supplying a patch
proving that it would be trivial to add =) Apologies for not including the short
form (-e) for expires; I have a rule that options taking arguments should
use the long form.
----------

2006-10-04
Several minor debug fixes and edge cases.
Fixed a bug where retries didn't rewind the stream to start over.
----------

2006-10-12
Version 1.0.5
Finally figured out and fixed the bug of trying to follow a local symlink-to-directory.
Fixed a really nasty sorting discrepancy that caused problems when files started
with the same name as a directory.
Retry on connection-reset on the S3 side.
Skip files that we can't read instead of dying.
----------

2006-10-12
Version 1.0.6
Some GC voodoo to try and keep a handle on the memory footprint a little better.
There is still room for improvement here.
----------

2006-10-13
Version 1.0.7
Fixed symlink dirs being stored to S3 as real dirs (and failing with 400).
Added a retry catch for the connection timeout error.
(Hopefully) caught a bug that expected every S3 listing to contain results.
----------

2006-10-14
Version 1.0.8
Was testing for file? before symlink? in localnode.stream. This meant that for
symlink files it was trying to shove the real file contents into the symlink
body on S3.
----------

2006-10-14
Version 1.0.9
Woops, I was using "max-entries" for some reason but the proper header is
"max-keys". Not a big deal.
Broke the S3try stuff out into a separate file so I could re-use it for s3cmd.rb.
----------

2006-10-16
Added a couple of debug lines; not even enough to call it a version revision.
----------

2006-10-25
Version 1.0.10
UTF-8 fixes.
Catching a couple more retry-able errors in s3try (instead of aborting the
program).
----------

2006-10-26
Version 1.0.11
Revamped some details of the generators and comparator so that directories are
handled in a more exact and uniform fashion across local and S3.
----------

2006-11-28
Version 1.0.12
Added a couple more error catches to s3try.
----------

2007-01-08
Version 1.0.13
Numerous small changes to slash and path handling, in order to catch several
cases where "root" directory nodes were not being created on S3.
This makes restores work a lot more intuitively in many cases.
----------

2007-01-25
Version 1.0.14
Peter Fales' marker fix.
Also, markers should be decoded into the native charset (because that's what S3
expects to see).
----------

2007-02-19
Version 1.1.0
*WARNING* Lots of path-handling changes. *PLEASE* test safely before you just
swap this in for your working 1.0.x version.

- Added --exclude (and there was much rejoicing).
- Found Yet Another Leading Slash Bug with respect to local nodes. It was always
  "recursing" into the first folder even if there was no trailing slash and -r
  wasn't specified. What it should have done in this case is simply create a node
  for the directory itself, then stop (not check the dir's contents).
- Local node canonicalization was (potentially) stripping the trailing slash,
  which we need in order to make some decisions in the local generator.
- Fixed a problem where it would prepend a "/" to S3 key names even with a blank
  prefix.
- Fixed S3->local when there's no "/" in the source so it doesn't try to create
  a folder with the bucket name.
- Updated s3try and s3_s3sync_mod to allow SSL_CERT_FILE.
----------

2007-02-22
Version 1.1.1
Fixed a dumb regression bug caused by the S3->local bucket name fix in 1.1.0.
----------

2007-02-25
Version 1.1.2
Added --progress.
----------

2007-06-02
Version 1.1.3
IMPORTANT!
Pursuant to http://s3sync.net/forum/index.php?topic=49.0 , the tar.gz now
expands into its own sub-directory named "s3sync" instead of dumping all the
files into the current directory.

In the case of commands of the form:
  s3sync -r somedir somebucket:
the root directory node in S3 was being stored as "somedir/" instead of "somedir",
which caused restores to mess up when you say:
  s3sync -r somebucket: restoredir
The fix to this, by coincidence, actually makes s3fox work even *less* well with
s3sync. I really need to build my own xul+javascript S3 GUI some day.

Also fixed some of the NoMethodError stuff for when --progress is used,
and caught Errno::ETIMEDOUT.
----------

2007-07-12
Version 1.1.4
Added Alastair Brunton's yaml config code.
----------

2007-11-17
Version 1.2.1
Compatibility for S3 API revisions.
When retries are exhausted, emit an error.
Don't ever try to delete the 'root' local dir.
----------

2007-11-20
Version 1.2.2
Handle EU bucket 307 redirects (in s3try.rb).
Added --make-dirs.
----------

2007-11-20
Version 1.2.3
Fixed SSL verification settings that broke with the new S3 API.
----------

2008-01-06
Version 1.2.4
Run from any dir (search "here" for includes).
Search out s3config.yml in some likely places.
Reset the connection (properly) on retry-able non-50x errors.
Fixed a calling-format bug preventing it from working from yml.
Added http proxy support.
----------

2008-05-11
Version 1.2.5
Added option --no-md5.
----------

2008-06-16
Version 1.2.6
Catch connect errors and retry.
----------
data/Manifest.txt ADDED
History.txt
Manifest.txt
PostInstall.txt
README.rdoc
Rakefile
lib/s3sync.rb
script/console
script/destroy
script/generate
test/test_helper.rb
test/test_s3sync.rb
data/PostInstall.txt ADDED
For more information on s3sync, see http://s3sync.rubyforge.org

NOTE: Change this information in PostInstall.txt
You can also delete it if you don't want it.
data/README.rdoc ADDED
== CHANGED from original to be compatible with 1.9.2

* require 'md5'
  Use require "digest/md5" instead.
* Thread.critical
  Thread.critical no longer exists as of 1.9, so it is not used.
* Dir#collect
  In 1.9.2, Dir#collect returns an Enumerator, not an Array.
* Array#to_s
  The result of [1,2].to_s differs from 1.8, so join is used instead of to_s.
* Enumerator is used instead of thread_generator.

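The differences listed above can be illustrated with a minimal snippet (not taken from the s3sync source; just a sketch of the 1.8 vs 1.9 behavior):

```ruby
# Ruby 1.9 dropped the old 'md5' library; digest/md5 provides the same hashing.
require "digest/md5"

puts Digest::MD5.hexdigest("hello")   # hex digest, as the old MD5 class gave in 1.8

# Array#to_s changed in 1.9 ("[1, 2]" instead of "12"), so join is used.
puts [1, 2].join                      # => "12" on both 1.8 and 1.9

# Block-less directory iteration returns an Enumerator in 1.9,
# which replaces the Generator-based lazy iteration used before.
entries = Dir.foreach(".")
puts entries.class                    # => Enumerator
```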
== DESCRIPTION:

Welcome to s3sync.rb
--------------------
Home page, wiki, forum, bug reports, etc.: http://s3sync.net

This is a ruby program that easily transfers directories between a local
directory and an S3 bucket:prefix. It behaves somewhat, but not precisely, like
the rsync program. In particular, it shares rsync's peculiar behavior that
trailing slashes on the source side are meaningful. See the examples below.

One benefit over some other comparable tools is that s3sync goes out of its way
to mirror the directory structure on S3, meaning you don't *need* to use s3sync
later in order to view your files on S3. You can just as easily use an S3
shell, a web browser (if you used the --public-read option), etc. Note that
s3sync is NOT necessarily going to be able to read files you uploaded via some
other tool. This includes things uploaded with the old perl version! For best
results, start fresh!

s3sync runs happily on linux, probably other *ix, and also Windows (except that
the symlink and permissions-management features don't do anything on Windows). If
you get it running somewhere interesting, let me know (see below).

s3sync is free, and license terms are included in all the source files. If you
decide to make it better, or find bugs, please let me know.

The original inspiration for this tool is the perl script of the same name, which
was made by Thorsten von Eicken (and later updated by me). This ruby program
does not share any components or logic with that utility; the only relation is
that it performs a similar task.


Management tasks
----------------
For low-level S3 operations not encapsulated by the sync paradigm, try the
companion utility s3cmd.rb. See README_s3cmd.txt.

About single files
------------------
s3sync lacks the special-case code that would be needed in order to handle a
source/dest that's a single file. This isn't one of the supported use cases, so
don't expect it to work. You can use the companion utility s3cmd.rb for single
gets/puts.


About Directories, the bane of any S3 sync-er
---------------------------------------------
In S3 there's no actual concept of folders, just keys and nodes. So every tool
uses its own proprietary way of storing dir info (my scheme being the best,
naturally), and in general the methods are not compatible.

If you populate S3 by some means *other than* s3sync and then try to use s3sync
to "get" the S3 stuff to a local filesystem, you will want to use the
--make-dirs option. This causes the local dirs to be created even if there is no
s3sync-compatible directory node info stored on the S3 side. In other words,
local folders are conjured into existence whenever they are needed to make the
"get" succeed.

About MD5 hashes
----------------
s3sync's normal operation is to compare the file size and MD5 hash of each item
to decide whether it needs syncing. On the S3 side, these hashes are stored and
returned to us as the "ETag" of each item when the bucket is listed, so it's
very easy. On the local side, the MD5 must be calculated by pushing every byte
in the file through the MD5 algorithm. This is CPU- and IO-intensive!

Thus you can specify the option --no-md5. This will compare the upload time on
S3 to the "last modified" time on the local item, and not do md5 calculations
locally at all. This might cause more transfers than are absolutely necessary,
for example if a file is "touched" to a newer modified date but its contents
didn't change. Conversely, if a file's contents are modified but the date is not
updated, then the sync will pass over it. Lastly, if your clock is very
different from the one on the S3 servers, then you may see unanticipated
behavior.
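The decision described above can be sketched roughly like this (a simplified illustration, not s3sync's actual code; `remote_size` and `remote_etag` stand in for values parsed from the bucket listing):

```ruby
require "digest/md5"

# Decide whether a local file differs from its S3 counterpart by comparing
# size first (cheap), and only then the MD5 hash against the listed ETag.
def needs_sync?(local_path, remote_size, remote_etag, no_md5 = false)
  return true if File.size(local_path) != remote_size
  # With --no-md5, the expensive hash is skipped; the real tool compares
  # the S3 upload time against the local mtime instead, as described above.
  return false if no_md5
  # Digest::MD5.file streams the file rather than reading it into memory.
  Digest::MD5.file(local_path).hexdigest != remote_etag
end
```

Note how the size check short-circuits the hash: the full MD5 pass only happens when the sizes already agree.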


A word on SSL_CERT_DIR:
-----------------------
On my debian install I didn't find any root authority public keys. I installed
some by running this shell archive:
  http://mirbsd.mirsolutions.de/cvs.cgi/src/etc/ssl.certs.shar
(You have to click download, and then run it wherever you want the certs to be
placed.) I do not in any way assert that these certificates are good,
comprehensive, moral, noble, or otherwise correct. But I am using them.

If you don't set up a cert dir and try to use ssl, then you'll 1) get an ugly
warning message slapped down by ruby, and 2) not have any protection AT ALL from
malicious servers posing as s3.amazonaws.com. Seriously... you want to get
this right if you're going to have any sensitive data being tossed around.
--
There is a debian package, ca-certificates; this is what I'm using now:
  apt-get install ca-certificates
and then use:
  SSL_CERT_DIR=/etc/ssl/certs

You used to be able to use just one certificate, but recently AWS has started
using more than one CA.

Getting started:
----------------
Invoke by typing s3sync.rb and you should get a nice usage screen.
Options can be specified in short or long form (except --delete, which has no
short form).

ALWAYS TEST NEW COMMANDS using --dryrun (-n) if you want to see what will be
affected before actually doing it. ESPECIALLY if you use --delete. Otherwise, do
not be surprised if you misplace a '/' or two and end up deleting all your
precious, precious files.

If you use the --public-read (-p) option, items sent to S3 will be ACL'd so that
anonymous web users can download them, given the correct URL. This could be
useful if you intend to publish directories of information for others to see.
For example, I use s3sync to publish itself to its home on S3 via the following
command:
  s3sync.rb -v -p publish/ ServEdge_pub:s3sync
where the files live in a local folder called "publish" and I wish them to be
copied to the URL http://s3.amazonaws.com/ServEdge_pub/s3sync/...

If you use --ssl (-s) then your connections with S3 will be encrypted. Otherwise
your data will be sent in clear form, i.e. easy to intercept by malicious
parties.

If you want to prune items from the destination side which are not found on the
source side, you can use --delete. Always test this with -n first to make sure
the command line you specify is not going to do something terrible to your
cherished and irreplaceable data.


Updates and other discussion:
-----------------------------
The latest version of s3sync should normally be at:
  http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
and the Amazon S3 forums probably have a few threads going on it at any given
time. I may not always see things posted to the threads, so if you want you can
contact me at gbs-s3@10forward.com too.

== FEATURES/PROBLEMS:

* FIX (list of features or problems)

== SYNOPSIS:

Examples:
---------
(using S3 bucket 'mybucket' and prefix 'pre')

Put the local etc directory itself into S3:
  s3sync.rb -r /etc mybucket:pre
(This will yield S3 keys named pre/etc/...)

Put the contents of the local /etc dir into S3, renaming the dir:
  s3sync.rb -r /etc/ mybucket:pre/etcbackup
(This will yield S3 keys named pre/etcbackup/...)

Put the contents of the S3 "directory" etc into a local dir:
  s3sync.rb -r mybucket:pre/etc/ /root/etcrestore
(This will yield local files at /root/etcrestore/...)

Put the contents of the S3 "directory" etc into a local dir named etc:
  s3sync.rb -r mybucket:pre/etc /root
(This will yield local files at /root/etc/...)

Put S3 nodes under the key pre/etc/ into the local dir etcrestore,
**and create local dirs even if the S3 side lacks dir nodes**:
  s3sync.rb -r --make-dirs mybucket:pre/etc/ /root/etcrestore
(This will yield local files at /root/etcrestore/...)

List all the buckets your account owns:
  s3cmd.rb listbuckets

Create a new bucket:
  s3cmd.rb createbucket BucketName

Create a new bucket in the EU:
  s3cmd.rb createbucket BucketName EU

Find out the location constraint of a bucket:
  s3cmd.rb location BucketName

Delete an old bucket you don't want any more:
  s3cmd.rb deletebucket BucketName

Find out what's in a bucket, 10 lines at a time:
  s3cmd.rb list BucketName 10

Only look in a particular prefix:
  s3cmd.rb list BucketName:startsWithThis

Look in the virtual "directory" named foo; this lists sub-"directories" and
keys that are at this level. Note that if you specify a delimiter you must
specify a max before it (until I make the options parsing smarter):
  s3cmd.rb list BucketName:foo/ 10 /

Delete a key:
  s3cmd.rb delete BucketName:AKey

Delete all keys that match (like a combo between list and delete):
  s3cmd.rb deleteall BucketName:SomePrefix

Only pretend you're going to delete all keys that match, but list them:
  s3cmd.rb --dryrun deleteall BucketName:SomePrefix

Delete all keys in a bucket (leaving the bucket):
  s3cmd.rb deleteall BucketName

Get a file from S3 and store it to a local file:
  s3cmd.rb get BucketName:TheFileOnS3.txt ALocalFile.txt

Put a local file up to S3.
Note we don't automatically set the mime type, etc.
NOTE that the order of the arguments doesn't change: S3 stays first!
  s3cmd.rb put BucketName:TheFileOnS3.txt ALocalFile.txt


A note about [headers]
----------------------
For some S3 operations, such as "put", you might want to specify certain headers
to the request, such as Cache-Control, Expires, x-amz-acl, etc. Rather than
supporting a load of separate command-line options for these, I just allow
header specification. So to upload a file with public-read access you could
say:
  s3cmd.rb put MyBucket:TheFile.txt ALocalFile.txt x-amz-acl:public-read

If you don't need to add any particular headers, then you can just ignore this
whole [headers] thing and pretend it's not there. This is somewhat of an
advanced option.


== REQUIREMENTS:

* FIX (list of requirements)

== INSTALL:

  sudo gem install frahugo-s3sync

Or if you use bundler, you can point to the source repo:

  gem 'frahugo-s3sync', :git => 'git://github.com/frahugo/s3sync.git'

Your environment:
-----------------
s3sync needs to know several interesting values to work right. It looks for
them in the following environment variables -or- an s3config.yml file.
In the yml case, the names need to be lowercase (see the example file).
Furthermore, the yml is searched for in the following locations, in order:
  $S3CONF/s3config.yml
  $HOME/.s3conf/s3config.yml
  /etc/s3conf/s3config.yml

Required:
  AWS_ACCESS_KEY_ID
  AWS_SECRET_ACCESS_KEY

If you don't know what these are, then s3sync is probably not the
right tool for you to be starting out with.

Optional:
  AWS_S3_HOST - I don't see why the default would ever be wrong
  HTTP_PROXY_HOST, HTTP_PROXY_PORT, HTTP_PROXY_USER, HTTP_PROXY_PASSWORD - proxy
  SSL_CERT_DIR - where your Cert Authority keys live; for verification
  SSL_CERT_FILE - if you have just one PEM file for CA verification
  S3SYNC_RETRIES - how many HTTP errors to tolerate before exiting
  S3SYNC_WAITONERROR - how many seconds to wait after an http error
  S3SYNC_MIME_TYPES_FILE - where your mime.types file is
  S3SYNC_NATIVE_CHARSET - for example Windows-1252. Defaults to ISO-8859-1.
  AWS_CALLING_FORMAT - defaults to REGULAR
    REGULAR    # http://s3.amazonaws.com/bucket/key
    SUBDOMAIN  # http://bucket.s3.amazonaws.com/key
    VANITY     # http://<vanity_domain>/key

Important: For EU-located buckets you should set the calling format to SUBDOMAIN.
Important: For US buckets with CAPS or other weird traits, set the calling format
to REGULAR.
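To make the three calling formats concrete, here is a hypothetical helper (for illustration only; not part of s3sync's API) that builds the URL each format implies:

```ruby
# Map an AWS_CALLING_FORMAT value to the request-URL shape it produces.
def s3_url(calling_format, bucket, key, vanity_domain = nil)
  case calling_format
  when "REGULAR"   then "http://s3.amazonaws.com/#{bucket}/#{key}"
  when "SUBDOMAIN" then "http://#{bucket}.s3.amazonaws.com/#{key}"
  when "VANITY"    then "http://#{vanity_domain}/#{key}"
  else raise ArgumentError, "unknown calling format: #{calling_format}"
  end
end

puts s3_url("SUBDOMAIN", "mybucket", "pre/etc/hosts")
# => http://mybucket.s3.amazonaws.com/pre/etc/hosts
```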
I use "envdir" from the daemontools package to set up my env
variables easily: http://cr.yp.to/daemontools/envdir.html
For example:
  envdir /root/s3sync/env /root/s3sync/s3sync.rb -etc etc etc
I know there are other similar tools out there as well.

You can also just call it in a shell script where you have exported the vars
first, such as:
  #!/bin/bash
  export AWS_ACCESS_KEY_ID=valueGoesHere
  ...
  s3sync.rb -etc etc etc

But by far the easiest (and newest) way to set this up is to put the name:value
pairs in a file named s3config.yml and let the yaml parser pick them up. There
is an .example file shipped with the tar.gz to show what a yaml file looks like.
Thanks to Alastair Brunton for this addition.

You can also use some combination of the yml file and environment variables, if
you want. Go nuts.

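The lookup order described above could be sketched like this (a simplified illustration with hypothetical helper names; the actual loader in the s3sync source may differ):

```ruby
require "yaml"

# Search the documented locations for s3config.yml, in order:
# $S3CONF, then $HOME/.s3conf, then /etc/s3conf.
def find_s3config
  candidates = []
  candidates << File.join(ENV["S3CONF"], "s3config.yml") if ENV["S3CONF"]
  candidates << File.join(ENV["HOME"], ".s3conf", "s3config.yml") if ENV["HOME"]
  candidates << "/etc/s3conf/s3config.yml"
  candidates.find { |path| File.exist?(path) }
end

# An environment variable wins; otherwise fall back to the yml file,
# where the key names are lowercase as noted above.
def s3_setting(name)
  return ENV[name] if ENV[name]
  path = find_s3config
  path ? YAML.load_file(path)[name.downcase] : nil
end
```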

== LICENSE:

(The MIT License)

Copyright (c) 2009 FIXME full name

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
'Software'), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.