aproxacs-s3sync 1.3.3

data/History.txt ADDED
@@ -0,0 +1,193 @@
+ 2010-10-29
+ Version 1.3.3
+ Enumerator is used instead of Generator.
+
+ 2010-10-24
+ Version 1.3.1
+ Now compatible with ruby 1.9.2
+ The key and secret can now be given as s3sync command arguments.
+
+ === 0.0.1 2009-08-05
+
+ * 1 major enhancement:
+ * Initial release
+
+
+
+ 2006-09-29:
+ Added support for --expires and --cache-control. E.g.:
+ --expires="Thu, 01 Dec 2007 16:00:00 GMT"
+ --cache-control="no-cache"
+
+ Thanks to Charles for pointing out the need for this, and supplying a patch
+ proving that it would be trivial to add =) Apologies for not including the short
+ form (-e) for the expires. I have a rule that options taking arguments should
+ use the long form.
+ ----------
+
+ 2006-10-04
+ Several minor bug fixes and edge cases.
+ Fixed a bug where retries didn't rewind the stream to start over.
+ ----------
+
+ 2006-10-12
+ Version 1.0.5
+ Finally figured out and fixed a bug of trying to follow a local symlink-to-directory.
+ Fixed a really nasty sorting discrepancy that caused problems when files started
+ with the same name as a directory.
+ Retry on connection-reset on the S3 side.
+ Skip files that we can't read instead of dying.
+ ----------
+
+ 2006-10-12
+ Version 1.0.6
+ Some GC voodoo to try and keep a handle on the memory footprint a little better.
+ There is still room for improvement here.
+ ----------
+
+ 2006-10-13
+ Version 1.0.7
+ Fixed symlink dirs being stored to S3 as real dirs (and failing with 400).
+ Added a retry catch for connection timeout error.
+ (Hopefully) caught a bug that expected every S3 listing to contain results.
+ ----------
+
+ 2006-10-14
+ Version 1.0.8
+ Was testing for file? before symlink? in localnode.stream. This meant that for
+ symlink files it was trying to shove the real file contents into the symlink
+ body on s3.
+ ----------
+
+ 2006-10-14
+ Version 1.0.9
+ Woops, I was using "max-entries" for some reason but the proper header is
+ "max-keys". Not a big deal.
+ Broke out the S3try stuff into a separate file so I could re-use it for s3cmd.rb.
+ ----------
+
+ 2006-10-16
+ Added a couple debug lines; not even enough to call it a version revision.
+ ----------
+
+ 2006-10-25
+ Version 1.0.10
+ UTF-8 fixes.
+ Catching a couple more retry-able errors in s3try (instead of aborting the
+ program).
+ ----------
+
+ 2006-10-26
+ Version 1.0.11
+ Revamped some details of the generators and comparator so that directories are
+ handled in a more exact and uniform fashion across local and S3.
+ ----------
+
+ 2006-11-28
+ Version 1.0.12
+ Added a couple more error catches to s3try.
+ ----------
+
+ 2007-01-08
+ Version 1.0.13
+ Numerous small changes to slash and path handling, in order to catch several
+ cases where "root" directory nodes were not being created on S3.
+ This makes restores work a lot more intuitively in many cases.
+ ----------
+
+ 2007-01-25
+ Version 1.0.14
+ Peter Fales' marker fix.
+ Also, markers should be decoded into native charset (because that's what s3
+ expects to see).
+ ----------
+
+ 2007-02-19
+ Version 1.1.0
+ *WARNING* Lots of path-handling changes. *PLEASE* test safely before you just
+ swap this in for your working 1.0.x version.
+
+ - Added --exclude (and there was much rejoicing).
+ - Found Yet Another Leading Slash Bug with respect to local nodes. It was always
+ "recursing" into the first folder even if there was no trailing slash and -r
+ wasn't specified. What it should have done in this case is simply create a node
+ for the directory itself, then stop (not check the dir's contents).
+ - Local node canonicalization was (potentially) stripping the trailing slash,
+ which we need in order to make some decisions in the local generator.
+ - Fixed a problem where it would prepend a "/" to s3 key names even with a blank
+ prefix.
+ - Fixed S3->local when there's no "/" in the source so it doesn't try to create
+ a folder with the bucket name.
+ - Updated s3try and s3_s3sync_mod to allow SSL_CERT_FILE.
+ ----------
+
+ 2007-02-22
+ Version 1.1.1
+ Fixed a dumb regression bug caused by the S3->local bucket name fix in 1.1.0.
+ ----------
+
+ 2007-02-25
+ Version 1.1.2
+ Added --progress.
+ ----------
+
+ 2007-06-02
+ Version 1.1.3
+ IMPORTANT!
+ Pursuant to http://s3sync.net/forum/index.php?topic=49.0 , the tar.gz now
+ expands into its own sub-directory named "s3sync" instead of dumping all the
+ files into the current directory.
+
+ In the case of commands of the form:
+ s3sync -r somedir somebucket:
+ The root directory node in s3 was being stored as "somedir/" instead of "somedir",
+ which caused restores to mess up when you say:
+ s3sync -r somebucket: restoredir
+ The fix to this, by coincidence, actually makes s3fox work even *less* well with
+ s3sync. I really need to build my own xul+javascript s3 GUI some day.
+
+ Also fixed some of the NoMethodError stuff for when --progress is used,
+ and caught Errno::ETIMEDOUT.
+ ----------
+
+ 2007-07-12
+ Version 1.1.4
+ Added Alastair Brunton's yaml config code.
+ ----------
+
+ 2007-11-17
+ Version 1.2.1
+ Compatibility for S3 API revisions.
+ When retries are exhausted, emit an error.
+ Don't ever try to delete the 'root' local dir.
+ ----------
+
+ 2007-11-20
+ Version 1.2.2
+ Handle EU bucket 307 redirects (in s3try.rb).
+ Added --make-dirs.
+ ----------
+
+ 2007-11-20
+ Version 1.2.3
+ Fix SSL verification settings that broke in the new S3 API.
+ ----------
+
+ 2008-01-06
+ Version 1.2.4
+ Run from any dir (search "here" for includes).
+ Search out s3config.yml in some likely places.
+ Reset connection (properly) on retry-able non-50x errors.
+ Fix calling format bug preventing it from working from yml.
+ Added http proxy support.
+ ----------
+
+ 2008-05-11
+ Version 1.2.5
+ Added option --no-md5.
+ ----------
+
+ 2008-06-16
+ Version 1.2.6
+ Catch connect errors and retry.
+ ----------
data/Manifest.txt ADDED
@@ -0,0 +1,11 @@
+ History.txt
+ Manifest.txt
+ PostInstall.txt
+ README.rdoc
+ Rakefile
+ lib/s3sync.rb
+ script/console
+ script/destroy
+ script/generate
+ test/test_helper.rb
+ test/test_s3sync.rb
data/PostInstall.txt ADDED
@@ -0,0 +1,7 @@
+
+ For more information on s3sync, see http://s3sync.rubyforge.org
+
+ NOTE: Change this information in PostInstall.txt
+ You can also delete it if you don't want it.
+
+
data/README.rdoc ADDED
@@ -0,0 +1,325 @@
+
+ == CHANGED from original to be compatible with 1.9.2
+ * require 'md5'
+ require "digest/md5" is used instead.
+ * Thread.critical
+ Thread.critical was removed in 1.9, so it is no longer used.
+ * Dir#collect
+ In 1.9.2, Dir#collect returns an Enumerator rather than an Array.
+ * Array#to_s
+ The result of [1,2].to_s differs from 1.8, so join is used instead of to_s
+ (see the sketch below).
+
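+ A minimal, self-contained sketch of those 1.9-friendly idioms (illustrative
+ only, not code taken from s3sync itself):
+
+   require "digest/md5"   # instead of require 'md5'
+   require "thread"       # Mutex (explicit on 1.8, built in on 1.9)
+
+   # Thread.critical is gone in 1.9; a Mutex gives equivalent exclusion.
+   LOCK = Mutex.new
+   LOCK.synchronize { puts Digest::MD5.hexdigest("hello") }
+
+   # Appending .to_a is harmless on an Array and converts an Enumerator,
+   # so downstream Array-only code keeps working on both 1.8 and 1.9.
+   entries = Dir.new(".").collect { |e| e }.to_a
+
+   # Array#to_s changed in 1.9 ("[1, 2]" instead of "12"), so use join.
+   puts [1, 2].join
+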
+ == DESCRIPTION:
+
+ Welcome to s3sync.rb
+ --------------------
+ Home page, wiki, forum, bug reports, etc: http://s3sync.net
+
+ This is a ruby program that easily transfers directories between a local
+ directory and an S3 bucket:prefix. It behaves somewhat, but not precisely, like
+ the rsync program. In particular, it shares rsync's peculiar behavior that
+ trailing slashes on the source side are meaningful. See examples below.
+
+ One benefit over some other comparable tools is that s3sync goes out of its way
+ to mirror the directory structure on S3. This means you don't *need* to use
+ s3sync later in order to view your files on S3. You can just as easily use an S3
+ shell, a web browser (if you used the --public-read option), etc. Note that
+ s3sync is NOT necessarily going to be able to read files you uploaded via some
+ other tool. This includes things uploaded with the old perl version! For best
+ results, start fresh!
+
+ s3sync runs happily on linux, probably other *ix, and also Windows (except that
+ symlinks and permissions management features don't do anything on Windows). If
+ you get it running somewhere interesting, let me know (see below).
+
+ s3sync is free, and license terms are included in all the source files. If you
+ decide to make it better, or find bugs, please let me know.
+
+ The original inspiration for this tool is the perl script by the same name which
+ was made by Thorsten von Eicken (and later updated by me). This ruby program
+ does not share any components or logic from that utility; the only relation is
+ that it performs a similar task.
+
+
+ Management tasks
+ ----------------
+ For low-level S3 operations not encapsulated by the sync paradigm, try the
+ companion utility s3cmd.rb. See README_s3cmd.txt.
+
+
+ About single files
+ ------------------
+ s3sync lacks the special case code that would be needed in order to handle a
+ source/dest that's a single file. This isn't one of the supported use cases, so
+ don't expect it to work. You can use the companion utility s3cmd.rb for single
+ get/puts.
+
+
+ About Directories, the bane of any S3 sync-er
+ ---------------------------------------------
+ In S3 there's no actual concept of folders, just keys and nodes. So, every tool
+ uses its own proprietary way of storing dir info (my scheme being the best
+ naturally) and in general the methods are not compatible.
+
+ If you populate S3 by some means *other than* s3sync and then try to use s3sync
+ to "get" the S3 stuff to a local filesystem, you will want to use the
+ --make-dirs option. This causes the local dirs to be created even if there is no
+ s3sync-compatible directory node info stored on the S3 side. In other words,
+ local folders are conjured into existence whenever they are needed to make the
+ "get" succeed.
+
+
+ About MD5 hashes
+ ----------------
+ s3sync's normal operation is to compare the file size and MD5 hash of each item
+ to decide whether it needs syncing. On the S3 side, these hashes are stored and
+ returned to us as the "ETag" of each item when the bucket is listed, so it's
+ very easy. On the local side, the MD5 must be calculated by pushing every byte
+ in the file through the MD5 algorithm. This is CPU and IO intensive!
+
+ To avoid that cost you can specify the option --no-md5. This will compare the
+ upload time on S3 to the "last modified" time on the local item, and not do md5
+ calculations locally at all. This might cause more transfers than are absolutely
+ necessary, for example if a file is "touched" to a newer modified date but its
+ contents didn't change. Conversely, if a file's contents are modified but the
+ date is not updated, then the sync will pass over it. Lastly, if your clock is
+ very different from the one on the S3 servers, then you may see unanticipated
+ behavior.
+
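+ As a rough sketch, the decision just described amounts to something like the
+ following (simplified; not s3sync's actual code, and the helper name is made
+ up for illustration):
+
+   require "digest/md5"
+
+   # Does a local file differ from what S3 reports for the same key?
+   # remote_etag is the quoted MD5 hex string S3 returns in bucket listings.
+   def needs_sync?(local_path, remote_size, remote_etag)
+     return true if File.size(local_path) != remote_size
+     local_md5 = Digest::MD5.file(local_path).hexdigest  # reads the whole file
+     local_md5 != remote_etag.delete('"')
+   end
+
+   # With --no-md5 the expensive hash above would be skipped and only
+   # sizes and modification times would be compared.
+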
+
+ A word on SSL_CERT_DIR:
+ -----------------------
+ On my debian install I didn't find any root authority public keys. I installed
+ some by running this shell archive:
+ http://mirbsd.mirsolutions.de/cvs.cgi/src/etc/ssl.certs.shar
+ (You have to click download, and then run it wherever you want the certs to be
+ placed). I do not in any way assert that these certificates are good,
+ comprehensive, moral, noble, or otherwise correct. But I am using them.
+
+ If you don't set up a cert dir, and try to use ssl, then you'll 1) get an ugly
+ warning message slapped down by ruby, and 2) not have any protection AT ALL from
+ malicious servers posing as s3.amazonaws.com. Seriously... you want to get
+ this right if you're going to have any sensitive data being tossed around.
+ --
+ There is a debian package ca-certificates; this is what I'm using now.
+ apt-get install ca-certificates
+ and then use:
+ SSL_CERT_DIR=/etc/ssl/certs
+
+ You used to be able to use just one certificate, but recently AWS has started
+ using more than one CA.
+
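+ For example, with those Debian certificates in place, an encrypted run could
+ look like this (the bucket and prefix names are just placeholders):
+
+   export SSL_CERT_DIR=/etc/ssl/certs
+   s3sync.rb -s -r /etc/ mybucket:pre/etcbackup
+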
+
+ Getting started:
+ ----------------
+ Invoke by typing s3sync.rb and you should get a nice usage screen.
+ Options can be specified in short or long form (except --delete, which has no
+ short form).
+
+ ALWAYS TEST NEW COMMANDS using --dryrun(-n) if you want to see what will be
+ affected before actually doing it. ESPECIALLY if you use --delete. Otherwise, do
+ not be surprised if you misplace a '/' or two and end up deleting all your
+ precious, precious files.
+
+ If you use the --public-read(-p) option, items sent to S3 will be ACL'd so that
+ anonymous web users can download them, given the correct URL. This could be
+ useful if you intend to publish directories of information for others to see.
+ For example, I use s3sync to publish itself to its home on S3 via the command:
+ s3sync.rb -v -p publish/ ServEdge_pub:s3sync
+ The files live in a local folder called "publish" and I wish them to be copied
+ to the URL: http://s3.amazonaws.com/ServEdge_pub/s3sync/...
+ If you use --ssl(-s) then your connections with S3 will be encrypted. Otherwise
+ your data will be sent in clear form, i.e. easy to intercept by malicious
+ parties.
+
+ If you want to prune items from the destination side which are not found on the
+ source side, you can use --delete. Always test this with -n first to make sure
+ the command line you specify is not going to do something terrible to your
+ cherished and irreplaceable data.
+
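+ A cautious way to run a destructive sync is to pair --delete with -n and -v
+ first, and only drop the -n once the output looks right (the bucket and prefix
+ names here are just placeholders):
+
+   s3sync.rb -n -v --delete -r /etc/ mybucket:pre/etcbackup
+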
+
+ Updates and other discussion:
+ -----------------------------
+ The latest version of s3sync should normally be at:
+ http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
+ and the Amazon S3 forums probably have a few threads going on it at any given
+ time. I may not always see things posted to the threads, so if you want you can
+ contact me at gbs-s3@10forward.com too.
+
+
+ == FEATURES/PROBLEMS:
+
+ * FIX (list of features or problems)
+
+ == SYNOPSIS:
+
+ Examples:
+ ---------
+ (using S3 bucket 'mybucket' and prefix 'pre')
+ Put the local etc directory itself into S3:
+ s3sync.rb -r /etc mybucket:pre
+ (This will yield S3 keys named pre/etc/...)
+ Put the contents of the local /etc dir into S3, rename dir:
+ s3sync.rb -r /etc/ mybucket:pre/etcbackup
+ (This will yield S3 keys named pre/etcbackup/...)
+ Put contents of S3 "directory" etc into local dir:
+ s3sync.rb -r mybucket:pre/etc/ /root/etcrestore
+ (This will yield local files at /root/etcrestore/...)
+ Put the contents of S3 "directory" etc into a local dir named etc:
+ s3sync.rb -r mybucket:pre/etc /root
+ (This will yield local files at /root/etc/...)
+ Put S3 nodes under the key pre/etc/ to the local dir etcrestore,
+ **and create local dirs even if S3 side lacks dir nodes**:
+ s3sync.rb -r --make-dirs mybucket:pre/etc/ /root/etcrestore
+ (This will yield local files at /root/etcrestore/...)
+
+ List all the buckets your account owns:
+ s3cmd.rb listbuckets
+
+ Create a new bucket:
+ s3cmd.rb createbucket BucketName
+
+ Create a new bucket in the EU:
+ s3cmd.rb createbucket BucketName EU
+
+ Find out the location constraint of a bucket:
+ s3cmd.rb location BucketName
+
+ Delete an old bucket you don't want any more:
+ s3cmd.rb deletebucket BucketName
+
+ Find out what's in a bucket, 10 lines at a time:
+ s3cmd.rb list BucketName 10
+
+ Only look in a particular prefix:
+ s3cmd.rb list BucketName:startsWithThis
+
+ Look in the virtual "directory" named foo;
+ lists sub-"directories" and keys that are at this level.
+ Note that if you specify a delimiter you must specify a max before it
+ (until I make the options parsing smarter):
+ s3cmd.rb list BucketName:foo/ 10 /
+
+ Delete a key:
+ s3cmd.rb delete BucketName:AKey
+
+ Delete all keys that match (like a combo between list and delete):
+ s3cmd.rb deleteall BucketName:SomePrefix
+
+ Only pretend you're going to delete all keys that match, but list them:
+ s3cmd.rb --dryrun deleteall BucketName:SomePrefix
+
+ Delete all keys in a bucket (leaving the bucket):
+ s3cmd.rb deleteall BucketName
+
+ Get a file from S3 and store it to a local file:
+ s3cmd.rb get BucketName:TheFileOnS3.txt ALocalFile.txt
+
+ Put a local file up to S3:
+ Note we don't automatically set mime type, etc.
+ NOTE that the argument order doesn't change: the S3 side stays first!
+ s3cmd.rb put BucketName:TheFileOnS3.txt ALocalFile.txt
+
+
+ A note about [headers]
+ ----------------------
+ For some S3 operations, such as "put", you might want to specify certain headers
+ to the request, such as Cache-Control, Expires, x-amz-acl, etc. Rather than
+ supporting a load of separate command-line options for these, I just allow
+ header specification. So to upload a file with public-read access you could
+ say:
+ s3cmd.rb put MyBucket:TheFile.txt x-amz-acl:public-read
+
+ If you don't need to add any particular headers then you can just ignore this
+ whole [headers] thing and pretend it's not there. This is somewhat of an
+ advanced option.
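+ For instance, to ask browsers and caches not to hold on to a published file,
+ the same pattern can carry a Cache-Control header:
+   s3cmd.rb put MyBucket:TheFile.txt Cache-Control:no-cache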
+
+
+ == REQUIREMENTS:
+
+ * FIX (list of requirements)
+
+ == INSTALL:
+
+ sudo gem install aproxacs-s3sync
+
+
+ Your environment:
+ -----------------
+ s3sync needs to know several interesting values to work right. It looks for
+ them in the following environment variables -or- an s3config.yml file.
+ In the yml case, the names need to be lowercase (see example file).
+ Furthermore, the yml is searched for in the following locations, in order:
+ $S3CONF/s3config.yml
+ $HOME/.s3conf/s3config.yml
+ /etc/s3conf/s3config.yml
+
+ Required:
+ AWS_ACCESS_KEY_ID
+ AWS_SECRET_ACCESS_KEY
+
+ If you don't know what these are, then s3sync is probably not the
+ right tool for you to be starting out with.
+ Optional:
+ AWS_S3_HOST - I don't see why the default would ever be wrong
+ HTTP_PROXY_HOST,HTTP_PROXY_PORT,HTTP_PROXY_USER,HTTP_PROXY_PASSWORD - proxy
+ SSL_CERT_DIR - Where your Cert Authority keys live; for verification
+ SSL_CERT_FILE - If you have just one PEM file for CA verification
+ S3SYNC_RETRIES - How many HTTP errors to tolerate before exiting
+ S3SYNC_WAITONERROR - How many seconds to wait after an http error
+ S3SYNC_MIME_TYPES_FILE - Where is your mime.types file
+ S3SYNC_NATIVE_CHARSET - For example Windows-1252. Defaults to ISO-8859-1.
+ AWS_CALLING_FORMAT - Defaults to REGULAR
+ REGULAR # http://s3.amazonaws.com/bucket/key
+ SUBDOMAIN # http://bucket.s3.amazonaws.com/key
+ VANITY # http://<vanity_domain>/key
+
+ Important: For EU-located buckets you should set the calling format to SUBDOMAIN.
+ Important: For US buckets whose names contain capital letters or other unusual
+ naming, set the calling format to REGULAR.
+
+ I use "envdir" from the daemontools package to set up my env
+ variables easily: http://cr.yp.to/daemontools/envdir.html
+ For example:
+ envdir /root/s3sync/env /root/s3sync/s3sync.rb -etc etc etc
+ I know there are other similar tools out there as well.
+
+ You can also just call it in a shell script where you have exported the vars
+ first, such as:
+ #!/bin/bash
+ export AWS_ACCESS_KEY_ID=valueGoesHere
+ ...
+ s3sync.rb -etc etc etc
+
+ But by far the easiest (and newest) way to set this up is to put the name:value
+ pairs in a file named s3config.yml and let the yaml parser pick them up. There
+ is an .example file shipped with the tar.gz to show what a yaml file looks like.
+ Thanks to Alastair Brunton for this addition.
+
+ You can also use some combination of .yaml and environment variables, if you
+ want. Go nuts.
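+
+ As a rough illustration, a minimal s3config.yml might hold little more than the
+ two required values, spelled with the lowercase names mentioned above (treat
+ this as a guess and check the shipped .example file for the authoritative
+ format):
+
+   aws_access_key_id: yourAccessKeyIdGoesHere
+   aws_secret_access_key: yourSecretAccessKeyGoesHere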
+
+
+ == LICENSE:
+
+ (The MIT License)
+
+ Copyright (c) 2009 FIXME full name
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ 'Software'), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.