cprobert-s3sync 1.3.6

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: ba1822eb603981cf34b82d2496e16a239f3b432d
4
+ data.tar.gz: b4d1483f9b24e0ac741d9da04d3a84cd26128d49
5
+ SHA512:
6
+ metadata.gz: a5a5e264eeb50e6fd49ce84b5323d656b9190baac88641e92f3596435becc2cc49615b811ed6c1b2f40c05416f072028b94d198bb8bc0cc2d71b3bff70f78607
7
+ data.tar.gz: 0e8a18876ccb5aa56ce72c5ec7c483e413a30b668dc704e1e1c6811556680d30d930e5ff2aa4a9713885a071840be188d84e6587496acdda367f4c6b52df247b
data/History.txt ADDED
@@ -0,0 +1,197 @@
1
+ 2010-11-29
2
+ Version 1.3.5
3
+ able to set Content-Encoding with --gzip option
4
+
5
+ 2010-10-29
6
+ Version 1.3.3
7
+ Enumerator is used instead of Generator
8
+
9
+ 2010-10-24
10
+ Version 1.3.1
11
+ Now compatible with ruby 1.9.2
12
+ Able to get key and secret with s3sync command argument.
13
+
14
+ === 0.0.1 2009-08-05
15
+
16
+ * 1 major enhancement:
17
+ * Initial release
18
+
19
+
20
+
21
+ 2006-09-29:
22
+ Added support for --expires and --cache-control. Eg:
23
+ --expires="Thu, 01 Dec 2007 16:00:00 GMT"
24
+ --cache-control="no-cache"
25
+
26
+ Thanks to Charles for pointing out the need for this, and supplying a patch
27
+ proving that it would be trivial to add =) Apologies for not including the short
28
+ form (-e) for the expires. I have a rule that options taking arguments should
29
+ use the long form.
30
+ ----------
31
+
32
+ 2006-10-04
33
+ Several minor debugs and edge cases.
34
+ Fixed a bug where retries didn't rewind the stream to start over.
35
+ ----------
36
+
37
+ 2006-10-12
38
+ Version 1.0.5
39
+ Finally figured out and fixed bug of trying to follow local symlink-to-directory.
40
+ Fixed a really nasty sorting discrepancy that caused problems when files started
41
+ with the same name as a directory.
42
+ Retry on connection-reset on the S3 side.
43
+ Skip files that we can't read instead of dying.
44
+ ----------
45
+
46
+ 2006-10-12
47
+ Version 1.0.6
48
+ Some GC voodoo to try and keep a handle on the memory footprint a little better.
49
+ There is still room for improvement here.
50
+ ----------
51
+
52
+ 2006-10-13
53
+ Version 1.0.7
54
+ Fixed symlink dirs being stored to S3 as real dirs (and failing with 400)
55
+ Added a retry catch for connection timeout error.
56
+ (Hopefully) caught a bug that expected every S3 listing to contain results
57
+ ----------
58
+
59
+ 2006-10-14
60
+ Version 1.0.8
61
+ Was testing for file? before symlink? in localnode.stream. This meant that for
62
+ symlink files it was trying to shove the real file contents into the symlink
63
+ body on s3.
64
+ ----------
65
+
66
+ 2006-10-14
67
+ Version 1.0.9
68
+ Woops, I was using "max-entries" for some reason but the proper header is
69
+ "max-keys". Not a big deal.
70
+ Broke out the S3try stuff into a separate file so I could re-use it for s3cmd.rb
71
+ ----------
72
+
73
+ 2006-10-16
74
+ Added a couple debug lines; not even enough to call it a version revision.
75
+ ----------
76
+
77
+ 2006-10-25
78
+ Version 1.0.10
79
+ UTF-8 fixes.
80
+ Catching a couple more retry-able errors in s3try (instead of aborting the
81
+ program).
82
+ ----------
83
+
84
+ 2006-10-26
85
+ Version 1.0.11
86
+ Revamped some details of the generators and comparator so that directories are
87
+ handled in a more exact and uniform fashion across local and S3.
88
+ ----------
89
+
90
+ 2006-11-28
91
+ Version 1.0.12
92
+ Added a couple more error catches to s3try.
93
+ ----------
94
+
95
+ 2007-01-08
96
+ Version 1.0.13
97
+ Numerous small changes to slash and path handling, in order to catch several
98
+ cases where "root" directory nodes were not being created on S3.
99
+ This makes restores work a lot more intuitively in many cases.
100
+ ----------
101
+
102
+ 2007-01-25
103
+ Version 1.0.14
104
+ Peter Fales' marker fix.
105
+ Also, markers should be decoded into native charset (because that's what s3
106
+ expects to see).
107
+ ----------
108
+
109
+ 2007-02-19
110
+ Version 1.1.0
111
+ *WARNING* Lots of path-handling changes. *PLEASE* test safely before you just
112
+ swap this in for your working 1.0.x version.
113
+
114
+ - Adding --exclude (and there was much rejoicing).
115
+ - Found Yet Another Leading Slash Bug with respect to local nodes. It was always
116
+ "recursing" into the first folder even if there was no trailing slash and -r
117
+ wasn't specified. What it should have done in this case is simply create a node
118
+ for the directory itself, then stop (not check the dir's contents).
119
+ - Local node canonicalization was (potentially) stripping the trailing slash,
120
+ which we need in order to make some decisions in the local generator.
121
+ - Fixed problem where it would prepend a "/" to s3 key names even with blank
122
+ prefix.
123
+ - Fixed S3->local when there's no "/" in the source so it doesn't try to create
124
+ a folder with the bucket name.
125
+ - Updated s3try and s3_s3sync_mod to allow SSL_CERT_FILE
126
+ ----------
127
+
128
+ 2007-02-22
129
+ Version 1.1.1
130
+ Fixed dumb regression bug caused by the S3->local bucket name fix in 1.1.0
131
+ ----------
132
+
133
+ 2007-02-25
134
+ Version 1.1.2
135
+ Added --progress
136
+ ----------
137
+
138
+ 2007-06-02
139
+ Version 1.1.3
140
+ IMPORTANT!
141
+ Pursuant to http://s3sync.net/forum/index.php?topic=49.0 , the tar.gz now
142
+ expands into its own sub-directory named "s3sync" instead of dumping all the
143
+ files into the current directory.
144
+
145
+ In the case of commands of the form:
146
+ s3sync -r somedir somebucket:
147
+ The root directory node in s3 was being stored as "somedir/" instead of "somedir"
148
+ which caused restores to mess up when you say:
149
+ s3sync -r somebucket: restoredir
150
+ The fix to this, by coincidence, actually makes s3fox work even *less* well with
151
+ s3sync. I really need to build my own xul+javascript s3 GUI some day.
152
+
153
+ Also fixed some of the NoMethodError stuff for when --progress is used
154
+ and caught Errno::ETIMEDOUT
155
+ ----------
156
+
157
+ 2007-07-12
158
+ Version 1.1.4
159
+ Added Alastair Brunton's yaml config code.
160
+ ----------
161
+
162
+ 2007-11-17
163
+ Version 1.2.1
164
+ Compatibility for S3 API revisions.
165
+ When retries are exhausted, emit an error.
166
+ Don't ever try to delete the 'root' local dir.
167
+ ----------
168
+
169
+ 2007-11-20
170
+ Version 1.2.2
171
+ Handle EU bucket 307 redirects (in s3try.rb)
172
+ --make-dirs added
173
+ ----------
174
+
175
+ 2007-11-20
176
+ Version 1.2.3
177
+ Fix SSL verification settings that broke in new S3 API.
178
+ ----------
179
+
180
+ 2008-01-06
181
+ Version 1.2.4
182
+ Run from any dir (search "here" for includes).
183
+ Search out s3config.yml in some likely places.
184
+ Reset connection (properly) on retry-able non-50x errors.
185
+ Fix calling format bug preventing it from working from yml.
186
+ Added http proxy support.
187
+ ----------
188
+
189
+ 2008-05-11
190
+ Version 1.2.5
191
+ Added option --no-md5
192
+ ----------
193
+
194
+ 2008-06-16
195
+ Version 1.2.6
196
+ Catch connect errors and retry.
197
+ ----------
data/Manifest.txt ADDED
@@ -0,0 +1,11 @@
1
+ History.txt
2
+ Manifest.txt
3
+ PostInstall.txt
4
+ README.rdoc
5
+ Rakefile
6
+ lib/s3sync.rb
7
+ script/console
8
+ script/destroy
9
+ script/generate
10
+ test/test_helper.rb
11
+ test/test_s3sync.rb
data/PostInstall.txt ADDED
@@ -0,0 +1,7 @@
1
+
2
+ For more information on s3sync, see http://s3sync.rubyforge.org
3
+
4
+ NOTE: Change this information in PostInstall.txt
5
+ You can also delete it if you don't want it.
6
+
7
+
data/README.rdoc ADDED
@@ -0,0 +1,326 @@
1
+
2
+ == CHANGED from original to be compatible with 1.9.2
3
+ * require 'md5'
4
+ Instead require "digest/md5"
5
+ * Thread.critical
6
+ Thread.critical was removed in 1.9, so it is no longer used
7
+ * Dir#collect
8
+ In 1.9.2 Dir#collect is not Array but Enumerator
9
+ * Array#to_s
10
+ The result of [1,2].to_s differs from 1.8, so join is used instead of to_s
11
+ * use Enumerator instead of thread_generator
12
+
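The compatibility notes above can be illustrated with a minimal sketch. The helper names `md5_hex` and `concat` are hypothetical and are not taken from the s3sync sources; the sketch only shows the 1.9-style replacements for `require 'md5'` and `Array#to_s`.

```ruby
require "digest/md5" # replaces the 1.8-era `require 'md5'`

# Hypothetical helper: hex MD5 digest of a string.
def md5_hex(data)
  Digest::MD5.hexdigest(data)
end

# Hypothetical helper: since 1.9, Array#to_s returns the inspect form,
# so join is the way to concatenate the elements.
def concat(parts)
  parts.join
end

puts md5_hex("hello") # prints 5d41402abc4b2a76b9719d911017c592
puts concat([1, 2])   # prints 12
puts [1, 2].to_s      # prints [1, 2] on 1.9 and later
```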
13
+ == DESCRIPTION:
14
+
15
+ Welcome to s3sync.rb
16
+ --------------------
17
+ Home page, wiki, forum, bug reports, etc: http://s3sync.net
18
+
19
+ This is a ruby program that easily transfers directories between a local
20
+ directory and an S3 bucket:prefix. It behaves somewhat, but not precisely, like
21
+ the rsync program. In particular, it shares rsync's peculiar behavior that
22
+ trailing slashes on the source side are meaningful. See examples below.
23
+
24
+ One benefit over some other comparable tools is that s3sync goes out of its way
25
+ to mirror the directory structure on S3. Meaning you don't *need* to use s3sync
26
+ later in order to view your files on S3. You can just as easily use an S3
27
+ shell, a web browser (if you used the --public-read option), etc. Note that
28
+ s3sync is NOT necessarily going to be able to read files you uploaded via some
29
+ other tool. This includes things uploaded with the old perl version! For best
30
+ results, start fresh!
31
+
32
+ s3sync runs happily on linux, probably other *ix, and also Windows (except that
33
+ symlinks and permissions management features don't do anything on Windows). If
34
+ you get it running somewhere interesting let me know (see below)
35
+
36
+ s3sync is free, and license terms are included in all the source files. If you
37
+ decide to make it better, or find bugs, please let me know.
38
+
39
+ The original inspiration for this tool is the perl script by the same name which
40
+ was made by Thorsten von Eicken (and later updated by me). This ruby program
41
+ does not share any components or logic from that utility; the only relation is
42
+ that it performs a similar task.
43
+
44
+
45
+ Management tasks
46
+ ----------------
47
+ For low-level S3 operations not encapsulated by the sync paradigm, try the
48
+ companion utility s3cmd.rb. See README_s3cmd.txt.
49
+
50
+
51
+ About single files
52
+ ------------------
53
+ s3sync lacks the special case code that would be needed in order to handle a
54
+ source/dest that's a single file. This isn't one of the supported use cases so
55
+ don't expect it to work. You can use the companion utility s3cmd.rb for single
56
+ get/puts.
57
+
58
+
59
+ About Directories, the bane of any S3 sync-er
60
+ ---------------------------------------------
61
+ In S3 there's no actual concept of folders, just keys and nodes. So, every tool
62
+ uses its own proprietary way of storing dir info (my scheme being the best
63
+ naturally) and in general the methods are not compatible.
64
+
65
+ If you populate S3 by some means *other than* s3sync and then try to use s3sync
66
+ to "get" the S3 stuff to a local filesystem, you will want to use the
67
+ --make-dirs option. This causes the local dirs to be created even if there is no
68
+ s3sync-compatible directory node info stored on the S3 side. In other words,
69
+ local folders are conjured into existence whenever they are needed to make the
70
+ "get" succeed.
71
+
72
+
73
+ About MD5 hashes
74
+ ----------------
75
+ s3sync's normal operation is to compare the file size and MD5 hash of each item
76
+ to decide whether it needs syncing. On the S3 side, these hashes are stored and
77
+ returned to us as the "ETag" of each item when the bucket is listed, so it's
78
+ very easy. On the local side, the MD5 must be calculated by pushing every byte
79
+ in the file through the MD5 algorithm. This is CPU and IO intensive!
80
+
81
+ Thus you can specify the option --no-md5. This will compare the upload time on
82
+ S3 to the "last modified" time on the local item, and not do md5 calculations
83
+ locally at all. This might cause more transfers than are absolutely necessary,
84
+ for example if the file is "touched" to a newer modified date but its contents
85
+ didn't change. Conversely if a file's contents are modified but the date is not
86
+ updated, then the sync will pass over it. Lastly, if your clock is very
87
+ different from the one on the S3 servers, then you may see unanticipated
88
+ behavior.
89
+
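The comparison described above might be sketched as follows. This is not s3sync's actual code: `needs_sync?` and its parameters are hypothetical, the remote values stand in for fields from a bucket listing, and stripping double quotes from the ETag is an assumption about how the listing formats it.

```ruby
require "digest/md5"

# Hypothetical helper: decide whether a local file should be transferred.
# remote_size, remote_etag and remote_mtime are assumed to come from an
# S3 bucket listing; none of these names are from the s3sync sources.
def needs_sync?(path, remote_size, remote_etag, remote_mtime: nil, no_md5: false)
  if no_md5
    # Timestamp mode (--no-md5): transfer only when the local file is
    # newer than the upload time recorded on S3.
    File.mtime(path) > remote_mtime
  else
    # Default mode: cheap size check first, then the full MD5 of the file.
    return true unless File.size(path) == remote_size
    Digest::MD5.file(path).hexdigest != remote_etag.delete('"')
  end
end
```

Checking the size before hashing avoids reading the whole file when the sizes already differ, which is where the CPU and IO cost mentioned above comes from.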
90
+
91
+ A word on SSL_CERT_DIR:
92
+ -----------------------
93
+ On my debian install I didn't find any root authority public keys. I installed
94
+ some by running this shell archive:
95
+ http://mirbsd.mirsolutions.de/cvs.cgi/src/etc/ssl.certs.shar
96
+ (You have to click download, and then run it wherever you want the certs to be
97
+ placed). I do not in any way assert that these certificates are good,
98
+ comprehensive, moral, noble, or otherwise correct. But I am using them.
99
+
100
+ If you don't set up a cert dir, and try to use ssl, then you'll 1) get an ugly
101
+ warning message slapped down by ruby, and 2) not have any protection AT ALL from
102
+ malicious servers posing as s3.amazonaws.com. Seriously... you want to get
103
+ this right if you're going to have any sensitive data being tossed around.
104
+ --
105
+ There is a debian package ca-certificates; this is what I'm using now.
106
+ apt-get install ca-certificates
107
+ and then use:
108
+ SSL_CERT_DIR=/etc/ssl/certs
109
+
110
+ You used to be able to use just one certificate, but recently AWS has started
111
+ using more than one CA.
112
+
113
+
114
+ Getting started:
115
+ ----------------
116
+ Invoke by typing s3sync.rb and you should get a nice usage screen.
117
+ Options can be specified in short or long form (except --delete, which has no
118
+ short form)
119
+
120
+ ALWAYS TEST NEW COMMANDS using --dryrun(-n) if you want to see what will be
121
+ affected before actually doing it. ESPECIALLY if you use --delete. Otherwise, do
122
+ not be surprised if you misplace a '/' or two and end up deleting all your
123
+ precious, precious files.
124
+
125
+ If you use the --public-read(-p) option, items sent to S3 will be ACL'd so that
126
+ anonymous web users can download them, given the correct URL. This could be
127
+ useful if you intend to publish directories of information for others to see.
128
+ For example, I use s3sync to publish itself to its home on S3 via the following
129
+ command: s3sync.rb -v -p publish/ ServEdge_pub:s3sync
130
+ where the files live in a local folder called "publish" and I wish them to be
131
+ copied to the URL http://s3.amazonaws.com/ServEdge_pub/s3sync/...
132
+ If you use --ssl(-s) then your connections with S3 will be encrypted. Otherwise
133
+ your data will be sent in clear form, i.e. easy to intercept by malicious parties.
134
+
135
+ If you want to prune items from the destination side which are not found on the
136
+ source side, you can use --delete. Always test this with -n first to make sure
137
+ the command line you specify is not going to do something terrible to your
138
+ cherished and irreplaceable data.
139
+
140
+
141
+ Updates and other discussion:
142
+ -----------------------------
143
+ The latest version of s3sync should normally be at:
144
+ http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
145
+ and the Amazon S3 forums probably have a few threads going on it at any given
146
+ time. I may not always see things posted to the threads, so if you want you can
147
+ contact me at gbs-s3@10forward.com too.
148
+
149
+
150
+ == FEATURES/PROBLEMS:
151
+
152
+ * FIX (list of features or problems)
153
+
154
+ == SYNOPSIS:
155
+
156
+ Examples:
157
+ ---------
158
+ (using S3 bucket 'mybucket' and prefix 'pre')
159
+ Put the local etc directory itself into S3
160
+ s3sync.rb -r /etc mybucket:pre
161
+ (This will yield S3 keys named pre/etc/...)
162
+ Put the contents of the local /etc dir into S3, rename dir:
163
+ s3sync.rb -r /etc/ mybucket:pre/etcbackup
164
+ (This will yield S3 keys named pre/etcbackup/...)
165
+ Put contents of S3 "directory" etc into local dir
166
+ s3sync.rb -r mybucket:pre/etc/ /root/etcrestore
167
+ (This will yield local files at /root/etcrestore/...)
168
+ Put the contents of S3 "directory" etc into a local dir named etc
169
+ s3sync.rb -r mybucket:pre/etc /root
170
+ (This will yield local files at /root/etc/...)
171
+ Put S3 nodes under the key pre/etc/ to the local dir etcrestore
172
+ **and create local dirs even if S3 side lacks dir nodes**
173
+ s3sync.rb -r --make-dirs mybucket:pre/etc/ /root/etcrestore
174
+ (This will yield local files at /root/etcrestore/...)
175
+
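The trailing-slash rule shown in the sync examples above can be modelled in a few lines. This is only a sketch that reproduces the documented examples; `destination_root` is a hypothetical name, and the real path handling in s3sync is more involved. For an S3 source, only the prefix part after the bucket name is passed in.

```ruby
# Hypothetical helper: where the synced tree ends up on the destination.
# A trailing slash on the source means "copy the contents"; without it,
# the source directory itself is recreated under the destination.
def destination_root(source, dest)
  if source.end_with?("/")
    dest
  else
    File.join(dest, File.basename(source))
  end
end

puts destination_root("/etc", "pre")             # prints pre/etc
puts destination_root("/etc/", "pre/etcbackup")  # prints pre/etcbackup
puts destination_root("pre/etc", "/root")        # prints /root/etc
```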
176
+ List all the buckets your account owns:
177
+ s3cmd.rb listbuckets
178
+
179
+ Create a new bucket:
180
+ s3cmd.rb createbucket BucketName
181
+
182
+ Create a new bucket in the EU:
183
+ s3cmd.rb createbucket BucketName EU
184
+
185
+ Find out the location constraint of a bucket:
186
+ s3cmd.rb location BucketName
187
+
188
+ Delete an old bucket you don't want any more:
189
+ s3cmd.rb deletebucket BucketName
190
+
191
+ Find out what's in a bucket, 10 lines at a time:
192
+ s3cmd.rb list BucketName 10
193
+
194
+ Only look in a particular prefix:
195
+ s3cmd.rb list BucketName:startsWithThis
196
+
197
+ Look in the virtual "directory" named foo;
198
+ lists sub-"directories" and keys that are at this level.
199
+ Note that if you specify a delimiter you must specify a max before it.
200
+ (until I make the options parsing smarter)
201
+ s3cmd.rb list BucketName:foo/ 10 /
202
+
203
+ Delete a key:
204
+ s3cmd.rb delete BucketName:AKey
205
+
206
+ Delete all keys that match (like a combo between list and delete):
207
+ s3cmd.rb deleteall BucketName:SomePrefix
208
+
209
+ Only pretend you're going to delete all keys that match, but list them:
210
+ s3cmd.rb --dryrun deleteall BucketName:SomePrefix
211
+
212
+ Delete all keys in a bucket (leaving the bucket):
213
+ s3cmd.rb deleteall BucketName
214
+
215
+ Get a file from S3 and store it to a local file
216
+ s3cmd.rb get BucketName:TheFileOnS3.txt ALocalFile.txt
217
+
218
+ Put a local file up to S3
219
+ Note we don't automatically set mime type, etc.
220
+ NOTE that the order of the options doesn't change. S3 stays first!
221
+ s3cmd.rb put BucketName:TheFileOnS3.txt ALocalFile.txt
222
+
223
+
224
+ A note about [headers]
225
+ ----------------------
226
+ For some S3 operations, such as "put", you might want to specify certain headers
227
+ to the request such as Cache-Control, Expires, x-amz-acl, etc. Rather than
228
+ supporting a load of separate command-line options for these, I just allow
229
+ header specification. So to upload a file with public-read access you could
230
+ say:
231
+ s3cmd.rb put MyBucket:TheFile.txt x-amz-acl:public-read
232
+
233
+ If you don't need to add any particular headers then you can just ignore this
234
+ whole [headers] thing and pretend it's not there. This is somewhat of an
235
+ advanced option.
236
+
237
+
238
+ == REQUIREMENTS:
239
+
240
+ * FIX (list of requirements)
241
+
242
+ == INSTALL:
243
+
244
+ sudo gem install cprobert-s3sync
245
+
246
+
247
+ Your environment:
248
+ -----------------
249
+ s3sync needs to know several interesting values to work right. It looks for
250
+ them in the following environment variables -or- a s3config.yml file.
251
+ In the yml case, the names need to be lowercase (see example file).
252
+ Furthermore, the yml is searched for in the following locations, in order:
253
+ $S3CONF/s3config.yml
254
+ $HOME/.s3conf/s3config.yml
255
+ /etc/s3conf/s3config.yml
256
+
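The search order above could be sketched like this. The function names are hypothetical and not taken from s3sync, and letting an environment variable take precedence over the yml value is an assumption; the README does not say which one wins.

```ruby
require "yaml"

# Candidate config paths in the documented order. Entries whose
# environment variable is unset are skipped.
def config_candidates(env = ENV)
  [
    env["S3CONF"] && File.join(env["S3CONF"], "s3config.yml"),
    env["HOME"] && File.join(env["HOME"], ".s3conf", "s3config.yml"),
    "/etc/s3conf/s3config.yml"
  ].compact
end

# Load the first candidate that exists; return an empty hash if none does.
def load_config(env = ENV)
  path = config_candidates(env).find { |p| File.exist?(p) }
  path ? (YAML.load_file(path) || {}) : {}
end

# Look up one setting: environment variable first (assumed precedence),
# then the lowercase key from the yml file.
def setting(name, config, env = ENV)
  env[name] || config[name.downcase]
end
```

With this sketch, `setting("AWS_ACCESS_KEY_ID", load_config)` would return the environment value if it is exported and otherwise the `aws_access_key_id` entry from the first s3config.yml found.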
257
+ Required:
258
+ AWS_ACCESS_KEY_ID
259
+ AWS_SECRET_ACCESS_KEY
260
+
261
+ If you don't know what these are, then s3sync is probably not the
262
+ right tool for you to be starting out with.
263
+ Optional:
264
+ AWS_S3_HOST - I don't see why the default would ever be wrong
265
+ HTTP_PROXY_HOST,HTTP_PROXY_PORT,HTTP_PROXY_USER,HTTP_PROXY_PASSWORD - proxy
266
+ SSL_CERT_DIR - Where your Cert Authority keys live; for verification
267
+ SSL_CERT_FILE - If you have just one PEM file for CA verification
268
+ S3SYNC_RETRIES - How many HTTP errors to tolerate before exiting
269
+ S3SYNC_WAITONERROR - How many seconds to wait after an http error
270
+ S3SYNC_MIME_TYPES_FILE - Where is your mime.types file
271
+ S3SYNC_NATIVE_CHARSET - For example Windows-1252. Defaults to ISO-8859-1.
272
+ AWS_CALLING_FORMAT - Defaults to REGULAR
273
+ REGULAR # http://s3.amazonaws.com/bucket/key
274
+ SUBDOMAIN # http://bucket.s3.amazonaws.com/key
275
+ VANITY # http://<vanity_domain>/key
276
+
277
+ Important: For EU-located buckets you should set the calling format to SUBDOMAIN
278
+ Important: For US buckets with CAPS or other weird traits set the calling format
279
+ to REGULAR
280
+
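As a rough illustration of the three calling formats listed above, the following sketch builds the URL shapes shown in the comments of that list. `s3_url` and its keyword parameters are hypothetical and are not part of s3sync.

```ruby
# Hypothetical helper: build the request URL for a given calling format.
# The host default and the vanity_domain parameter are illustrative only.
def s3_url(format, bucket, key, host: "s3.amazonaws.com", vanity_domain: nil)
  case format
  when "REGULAR"   then "http://#{host}/#{bucket}/#{key}"
  when "SUBDOMAIN" then "http://#{bucket}.#{host}/#{key}"
  when "VANITY"    then "http://#{vanity_domain}/#{key}"
  else raise ArgumentError, "unknown calling format: #{format}"
  end
end

puts s3_url("REGULAR", "mybucket", "pre/etc")   # prints http://s3.amazonaws.com/mybucket/pre/etc
puts s3_url("SUBDOMAIN", "mybucket", "pre/etc") # prints http://mybucket.s3.amazonaws.com/pre/etc
```

In the SUBDOMAIN form the bucket name becomes part of the hostname, which fits the note above that buckets with capitals or other unusual names should use REGULAR.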
281
+ I use "envdir" from the daemontools package to set up my env
282
+ variables easily: http://cr.yp.to/daemontools/envdir.html
283
+ For example:
284
+ envdir /root/s3sync/env /root/s3sync/s3sync.rb -etc etc etc
285
+ I know there are other similar tools out there as well.
286
+
287
+ You can also just call it in a shell script where you have exported the vars
288
+ first such as:
289
+ #!/bin/bash
290
+ export AWS_ACCESS_KEY_ID=valueGoesHere
291
+ ...
292
+ s3sync.rb -etc etc etc
293
+
294
+ But by far the easiest (and newest) way to set this up is to put the name:value
295
+ pairs in a file named s3config.yml and let the yaml parser pick them up. There
296
+ is a .example file shipped with the tar.gz to show what a yaml file looks like.
297
+ Thanks to Alastair Brunton for this addition.
298
+
299
+ You can also use some combination of .yaml and environment variables, if you
300
+ want. Go nuts.
301
+
302
+
303
+ == LICENSE:
304
+
305
+ (The MIT License)
306
+
307
+ Copyright (c) 2009 FIXME full name
308
+
309
+ Permission is hereby granted, free of charge, to any person obtaining
310
+ a copy of this software and associated documentation files (the
311
+ 'Software'), to deal in the Software without restriction, including
312
+ without limitation the rights to use, copy, modify, merge, publish,
313
+ distribute, sublicense, and/or sell copies of the Software, and to
314
+ permit persons to whom the Software is furnished to do so, subject to
315
+ the following conditions:
316
+
317
+ The above copyright notice and this permission notice shall be
318
+ included in all copies or substantial portions of the Software.
319
+
320
+ THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
321
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
322
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
323
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
324
+ CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
325
+ TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
326
+ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.