s3sync 1.2.5 → 2.0.0

data/README DELETED
@@ -1,401 +0,0 @@
1
- Welcome to s3sync.rb
2
- --------------------
3
- Home page, wiki, forum, bug reports, etc: http://s3sync.net
4
-
5
- This is a ruby program that easily transfers directories between a local
6
- directory and an S3 bucket:prefix. It behaves somewhat, but not precisely, like
7
- the rsync program. In particular, it shares rsync's peculiar behavior that
8
- trailing slashes on the source side are meaningful. See examples below.
9
-
10
- One benefit over some other comparable tools is that s3sync goes out of its way
11
- to mirror the directory structure on S3. Meaning you don't *need* to use s3sync
12
- later in order to view your files on S3. You can just as easily use an S3
13
- shell, a web browser (if you used the --public-read option), etc. Note that
14
- s3sync is NOT necessarily going to be able to read files you uploaded via some
15
- other tool. This includes things uploaded with the old perl version! For best
16
- results, start fresh!
17
-
18
- s3sync runs happily on linux, probably other *ix, and also Windows (except that
19
- symlinks and permissions management features don't do anything on Windows). If
20
- you get it running somewhere interesting, let me know (see below).
21
-
22
- s3sync is free, and license terms are included in all the source files. If you
23
- decide to make it better, or find bugs, please let me know.
24
-
25
- The original inspiration for this tool is the perl script by the same name which
26
- was made by Thorsten von Eicken (and later updated by me). This ruby program
27
- does not share any components or logic from that utility; the only relation is
28
- that it performs a similar task.
29
-
30
-
31
- Examples:
32
- ---------
33
- (using S3 bucket 'mybucket' and prefix 'pre')
34
- Put the local etc directory itself into S3
35
- s3sync.rb -r /etc mybucket:pre
36
- (This will yield S3 keys named pre/etc/...)
37
- Put the contents of the local /etc dir into S3, rename dir:
38
- s3sync.rb -r /etc/ mybucket:pre/etcbackup
39
- (This will yield S3 keys named pre/etcbackup/...)
40
- Put contents of S3 "directory" etc into local dir
41
- s3sync.rb -r mybucket:pre/etc/ /root/etcrestore
42
- (This will yield local files at /root/etcrestore/...)
43
- Put the contents of S3 "directory" etc into a local dir named etc
44
- s3sync.rb -r mybucket:pre/etc /root
45
- (This will yield local files at /root/etc/...)
46
- Put S3 nodes under the key pre/etc/ to the local dir etcrestore
47
- **and create local dirs even if S3 side lacks dir nodes**
48
- s3sync.rb -r --make-dirs mybucket:pre/etc/ /root/etcrestore
49
- (This will yield local files at /root/etcrestore/...)
50
-
51
-
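The trailing-slash rule in the examples above can be sketched in a few lines of Ruby. This is only an illustration of the mapping from a local file to an S3 key; `dest_key` is a hypothetical helper and is not taken from s3sync.rb.

```ruby
# Hypothetical helper (not part of s3sync.rb): maps a local file to an S3 key
# following the trailing-slash rule shown in the examples.
def dest_key(source, prefix, file)
  src_root = source.chomp('/')                    # "/etc/" and "/etc" share a root
  relative = file.delete_prefix(src_root + '/')   # path of the file below that root
  # Without a trailing slash the directory itself is copied, so its name is kept.
  parts = if source.end_with?('/')
            [prefix, relative]
          else
            [prefix, File.basename(src_root), relative]
          end
  parts.reject(&:empty?).join('/')                # a blank prefix adds no leading "/"
end

puts dest_key('/etc',  'pre',           '/etc/hosts')   # pre/etc/hosts
puts dest_key('/etc/', 'pre/etcbackup', '/etc/hosts')   # pre/etcbackup/hosts
```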
52
- Prerequisites:
53
- --------------
54
- You need a functioning Ruby (>=1.8.4) installation, as well as the OpenSSL ruby
55
- library (which may or may not come with your ruby).
56
-
57
- How you get these items working on your system is really not any of my
58
- business, but you might find the following things helpful. If you're using
59
- Windows, the ruby site has a useful "one click installer" (although it takes
60
- more clicks than that, really). On debian (and ubuntu, and other debian-like
61
- things), there are apt packages available for ruby and the OpenSSL lib.
62
-
63
-
64
- Your environment:
65
- -----------------
66
- s3sync needs to know several interesting values to work right. It looks for
67
- them in the following environment variables -or- a s3config.yml file.
68
- In the yml case, the names need to be lowercase (see example file).
69
- Furthermore, the yml is searched for in the following locations, in order:
70
- $S3CONF/s3config.yml
71
- $HOME/.s3conf/s3config.yml
72
- /etc/s3conf/s3config.yml
73
-
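A minimal sketch of the lookup order listed above, assuming the first file that exists is the one used. The helper names `config_candidates` and `load_settings` are hypothetical, and how s3sync combines file values with environment variables is not shown here.

```ruby
require 'yaml'

# Candidate locations, in the order the README lists them.
# `env` defaults to the real environment but can be any Hash-like object.
def config_candidates(env = ENV)
  [
    env['S3CONF'] && File.join(env['S3CONF'], 's3config.yml'),
    env['HOME']   && File.join(env['HOME'], '.s3conf', 's3config.yml'),
    '/etc/s3conf/s3config.yml'
  ].compact
end

# Hypothetical loader: reads the first candidate that exists and upcases the
# (lowercase) yml names so they line up with the environment variable names.
def load_settings(env = ENV)
  path = config_candidates(env).find { |candidate| File.file?(candidate) }
  return {} unless path

  (YAML.load_file(path) || {}).transform_keys { |key| key.to_s.upcase }
end

p config_candidates
```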
74
- Required:
75
- AWS_ACCESS_KEY_ID
76
- AWS_SECRET_ACCESS_KEY
77
-
78
- If you don't know what these are, then s3sync is probably not the
79
- right tool for you to be starting out with.
80
- Optional:
81
- AWS_S3_HOST - I don't see why the default would ever be wrong
82
- HTTP_PROXY_HOST,HTTP_PROXY_PORT,HTTP_PROXY_USER,HTTP_PROXY_PASSWORD - proxy
83
- SSL_CERT_DIR - Where your Cert Authority keys live; for verification
84
- SSL_CERT_FILE - If you have just one PEM file for CA verification
85
- S3SYNC_RETRIES - How many HTTP errors to tolerate before exiting
86
- S3SYNC_WAITONERROR - How many seconds to wait after an http error
87
- S3SYNC_MIME_TYPES_FILE - Where is your mime.types file
88
- S3SYNC_NATIVE_CHARSET - For example Windows-1252. Defaults to ISO-8859-1.
89
- AWS_CALLING_FORMAT - Defaults to REGULAR
90
- REGULAR # http://s3.amazonaws.com/bucket/key
91
- SUBDOMAIN # http://bucket.s3.amazonaws.com/key
92
- VANITY # http://<vanity_domain>/key
93
-
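How S3SYNC_RETRIES and S3SYNC_WAITONERROR might interact can be sketched as a small retry wrapper. This is an assumption-laden illustration: `with_retries` is a hypothetical name, the fallback values 3 and 1 are placeholders rather than s3sync's real defaults, and the error count here is per operation.

```ruby
# Hypothetical retry wrapper; the fallback values are placeholders.
def with_retries(retries: Integer(ENV.fetch('S3SYNC_RETRIES', '3')),
                 wait: Float(ENV.fetch('S3SYNC_WAITONERROR', '1')))
  errors = 0
  begin
    yield
  rescue StandardError => e
    errors += 1
    raise if errors > retries   # tolerance exhausted: let the error escape
    warn "error #{errors}/#{retries}: #{e.message}; waiting #{wait}s"
    sleep wait
    retry                       # run the block again from the top
  end
end
```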
94
- Important: For EU-located buckets you should set the calling format to SUBDOMAIN
95
- Important: For US buckets with CAPS or other weird traits set the calling format
96
- to REGULAR
97
-
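The three calling formats differ only in how the URL is assembled. A rough sketch, using the URL shapes from the list above; `s3_url` is a hypothetical helper, and treating the bucket name as the vanity domain is an assumption.

```ruby
# Hypothetical URL builder for the three calling formats.
def s3_url(format, bucket, key, host = 's3.amazonaws.com')
  case format
  when 'REGULAR'   then "http://#{host}/#{bucket}/#{key}"
  when 'SUBDOMAIN' then "http://#{bucket}.#{host}/#{key}"
  when 'VANITY'    then "http://#{bucket}/#{key}" # assumption: bucket name is the vanity domain
  else raise ArgumentError, "unknown calling format: #{format}"
  end
end

puts s3_url('REGULAR',   'mybucket', 'pre/etc/hosts')
puts s3_url('SUBDOMAIN', 'mybucket', 'pre/etc/hosts')
```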
98
- I use "envdir" from the daemontools package to set up my env
99
- variables easily: http://cr.yp.to/daemontools/envdir.html
100
- For example:
101
- envdir /root/s3sync/env /root/s3sync/s3sync.rb -etc etc etc
102
- I know there are other similar tools out there as well.
103
-
104
- You can also just call it in a shell script where you have exported the vars
105
- first such as:
106
- #!/bin/bash
107
- export AWS_ACCESS_KEY_ID=valueGoesHere
108
- ...
109
- s3sync.rb -etc etc etc
110
-
111
- But by far the easiest (and newest) way to set this up is to put the name:value
112
- pairs in a file named s3config.yml and let the yaml parser pick them up. There
113
- is an .example file shipped with the tar.gz to show what a yaml file looks like.
114
- Thanks to Alastair Brunton for this addition.
115
-
116
- You can also use some combination of .yaml and environment variables, if you
117
- want. Go nuts.
118
-
119
-
120
- Management tasks
121
- ----------------
122
- For low-level S3 operations not encapsulated by the sync paradigm, try the
123
- companion utility s3cmd.rb. See README_s3cmd.txt.
124
-
125
-
126
- About single files
127
- ------------------
128
- s3sync lacks the special case code that would be needed in order to handle a
129
- source/dest that's a single file. This isn't one of the supported use cases so
130
- don't expect it to work. You can use the companion utility s3cmd.rb for single
131
- get/puts.
132
-
133
-
134
- About Directories, the bane of any S3 sync-er
135
- ---------------------------------------------
136
- In S3 there's no actual concept of folders, just keys and nodes. So, every tool
137
- uses its own proprietary way of storing dir info (my scheme being the best
138
- naturally) and in general the methods are not compatible.
139
-
140
- If you populate S3 by some means *other than* s3sync and then try to use s3sync
141
- to "get" the S3 stuff to a local filesystem, you will want to use the
142
- --make-dirs option. This causes the local dirs to be created even if there is no
143
- s3sync-compatible directory node info stored on the S3 side. In other words,
144
- local folders are conjured into existence whenever they are needed to make the
145
- "get" succeed.
146
-
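The effect of --make-dirs can be pictured as "create whatever parent folders the key implies before writing the file". A minimal sketch under that reading; `write_local` is a hypothetical helper, not s3sync's code.

```ruby
require 'fileutils'

# Hypothetical helper: stores one S3 object body under a local root,
# creating parent directories on demand.
def write_local(root, key, body)
  path = File.join(root, key)
  FileUtils.mkdir_p(File.dirname(path))   # no directory node needed on the S3 side
  File.binwrite(path, body)
  path
end
```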
147
-
148
- About MD5 hashes
149
- ----------------
150
- s3sync's normal operation is to compare the file size and MD5 hash of each item
151
- to decide whether it needs syncing. On the S3 side, these hashes are stored and
152
- returned to us as the "ETag" of each item when the bucket is listed, so it's
153
- very easy. On the local side, the MD5 must be calculated by pushing every byte
154
- in the file through the MD5 algorithm. This is CPU and IO intensive!
155
-
156
- Thus you can specify the option --no-md5. This will compare the upload time on
157
- S3 to the "last modified" time on the local item, and not do md5 calculations
158
- locally at all. This might cause more transfers than are absolutely necessary.
159
- For example if the file is "touched" to a newer modified date, but its contents
160
- didn't change. Conversely if a file's contents are modified but the date is not
161
- updated, then the sync will pass over it. Lastly, if your clock is very
162
- different from the one on the S3 servers, then you may see unanticipated
163
- behavior.
164
-
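The two comparison modes described above might look roughly like this for a local-to-S3 sync. `Remote` and `needs_upload?` are hypothetical names, and `Remote` merely stands in for one entry of a bucket listing.

```ruby
require 'digest/md5'

# Stand-in for one item of an S3 bucket listing (hypothetical shape).
Remote = Struct.new(:size, :etag, :last_modified)

def needs_upload?(path, remote, no_md5: false)
  return true if remote.nil?              # nothing on S3 yet

  if no_md5
    # --no-md5: compare timestamps only and never read the file contents.
    File.mtime(path) > remote.last_modified
  else
    return true if File.size(path) != remote.size

    # The ETag arrives wrapped in double quotes; strip them before comparing.
    Digest::MD5.file(path).hexdigest != remote.etag.delete('"')
  end
end
```

The timestamp branch shows why a "touched" file gets re-sent and why a skewed clock can cause surprises.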
165
-
166
- A word on SSL_CERT_DIR:
167
- -----------------------
168
- On my debian install I didn't find any root authority public keys. I installed
169
- some by running this shell archive:
170
- http://mirbsd.mirsolutions.de/cvs.cgi/src/etc/ssl.certs.shar
171
- (You have to click download, and then run it wherever you want the certs to be
172
- placed). I do not in any way assert that these certificates are good,
173
- comprehensive, moral, noble, or otherwise correct. But I am using them.
174
-
175
- If you don't set up a cert dir, and try to use ssl, then you'll 1) get an ugly
176
- warning message slapped down by ruby, and 2) not have any protection AT ALL from
177
- malicious servers posing as s3.amazonaws.com. Seriously... you want to get
178
- this right if you're going to have any sensitive data being tossed around.
179
- --
180
- There is a debian package ca-certificates; this is what I'm using now.
181
- apt-get install ca-certificates
182
- and then use:
183
- SSL_CERT_DIR=/etc/ssl/certs
184
-
185
- You used to be able to use just one certificate, but recently AWS has started
186
- using more than one CA.
187
-
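For reference, this is roughly what certificate verification looks like with Ruby's standard Net::HTTP. It is a sketch, not s3sync's own connection code; `verified_connection` is a hypothetical helper and the /etc/ssl/certs fallback is the Debian path mentioned above.

```ruby
require 'net/http'
require 'openssl'

# Builds a connection object that will verify the server certificate.
# Nothing is sent over the network until a request is made.
def verified_connection(host = 's3.amazonaws.com',
                        ca_dir = ENV.fetch('SSL_CERT_DIR', '/etc/ssl/certs'))
  http = Net::HTTP.new(host, 443)
  http.use_ssl = true
  http.ca_path = ca_dir                          # directory of trusted CA certificates
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER   # refuse servers whose chain does not verify
  http
end
```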
188
-
189
- Getting started:
190
- ----------------
191
- Invoke by typing s3sync.rb and you should get a nice usage screen.
192
- Options can be specified in short or long form (except --delete, which has no
193
- short form)
194
-
195
- ALWAYS TEST NEW COMMANDS using --dryrun(-n) if you want to see what will be
196
- affected before actually doing it. ESPECIALLY if you use --delete. Otherwise, do
197
- not be surprised if you misplace a '/' or two and end up deleting all your
198
- precious, precious files.
199
-
200
- If you use the --public-read(-p) option, items sent to S3 will be ACL'd so that
201
- anonymous web users can download them, given the correct URL. This could be
202
- useful if you intend to publish directories of information for others to see.
203
- For example, I use s3sync to publish itself to its home on S3 via:
204
- s3sync.rb -v -p publish/ ServEdge_pub:s3sync
205
- where the files live in a local folder called "publish" and I wish them to be
206
- copied to the URL: http://s3.amazonaws.com/ServEdge_pub/s3sync/...
207
- If you use --ssl(-s) then your connections with S3 will be encrypted. Otherwise
208
- your data will be sent in clear form, i.e. easy to intercept by malicious parties.
209
-
210
- If you want to prune items from the destination side which are not found on the
211
- source side, you can use --delete. Always test this with -n first to make sure
212
- the command line you specify is not going to do something terrible to your
213
- cherished and irreplaceable data.
214
-
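What --delete combined with --dryrun amounts to can be sketched as a set difference plus a guard. `prune` is a hypothetical helper; the block stands in for the real delete call.

```ruby
# Hypothetical pruning step: anything on the destination that is missing
# from the source is a deletion candidate.
def prune(source_keys, dest_keys, dryrun: true)
  extras = dest_keys - source_keys
  extras.each do |key|
    if dryrun
      puts "would delete #{key}"      # report only, touch nothing
    else
      puts "deleting #{key}"
      yield key if block_given?       # the caller supplies the actual delete
    end
  end
  extras
end

prune(%w[pre/etc/hosts], %w[pre/etc/hosts pre/etc/old.conf])
```

Defaulting `dryrun` to true is a design choice of this sketch: the destructive path has to be asked for explicitly.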
215
-
216
- Updates and other discussion:
217
- -----------------------------
218
- The latest version of s3sync should normally be at:
219
- http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
220
- and the Amazon S3 forums probably have a few threads going on it at any given
221
- time. I may not always see things posted to the threads, so if you want you can
222
- contact me at gbs-s3@10forward.com too.
223
-
224
-
225
- Change Log:
226
- -----------
227
-
228
- 2006-09-29:
229
- Added support for --expires and --cache-control. Eg:
230
- --expires="Sat, 01 Dec 2007 16:00:00 GMT"
231
- --cache-control="no-cache"
232
-
233
- Thanks to Charles for pointing out the need for this, and supplying a patch
234
- proving that it would be trivial to add =) Apologies for not including the short
235
- form (-e) for the expires. I have a rule that options taking arguments should
236
- use the long form.
237
- ----------
238
-
239
- 2006-10-04
240
- Several minor debugs and edge cases.
241
- Fixed a bug where retries didn't rewind the stream to start over.
242
- ----------
243
-
244
- 2006-10-12
245
- Version 1.0.5
246
- Finally figured out and fixed bug of trying to follow local symlink-to-directory.
247
- Fixed a really nasty sorting discrepancy that caused problems when files started
248
- with the same name as a directory.
249
- Retry on connection-reset on the S3 side.
250
- Skip files that we can't read instead of dying.
251
- ----------
252
-
253
- 2006-10-12
254
- Version 1.0.6
255
- Some GC voodoo to try and keep a handle on the memory footprint a little better.
256
- There is still room for improvement here.
257
- ----------
258
-
259
- 2006-10-13
260
- Version 1.0.7
261
- Fixed symlink dirs being stored to S3 as real dirs (and failing with 400)
262
- Added a retry catch for connection timeout error.
263
- (Hopefully) caught a bug that expected every S3 listing to contain results
264
- ----------
265
-
266
- 2006-10-14
267
- Version 1.0.8
268
- Was testing for file? before symlink? in localnode.stream. This meant that for
269
- symlink files it was trying to shove the real file contents into the symlink
270
- body on s3.
271
- ----------
272
-
273
- 2006-10-14
274
- Version 1.0.9
275
- Woops, I was using "max-entries" for some reason but the proper header is
276
- "max-keys". Not a big deal.
277
- Broke out the S3try stuff into a separate file so I could re-use it for s3cmd.rb
278
- ----------
279
-
280
- 2006-10-16
281
- Added a couple debug lines; not even enough to call it a version revision.
282
- ----------
283
-
284
- 2006-10-25
285
- Version 1.0.10
286
- UTF-8 fixes.
287
- Catching a couple more retry-able errors in s3try (instead of aborting the
288
- program).
289
- ----------
290
-
291
- 2006-10-26
292
- Version 1.0.11
293
- Revamped some details of the generators and comparator so that directories are
294
- handled in a more exact and uniform fashion across local and S3.
295
- ----------
296
-
297
- 2006-11-28
298
- Version 1.0.12
299
- Added a couple more error catches to s3try.
300
- ----------
301
-
302
- 2007-01-08
303
- Version 1.0.13
304
- Numerous small changes to slash and path handling, in order to catch several
305
- cases where "root" directory nodes were not being created on S3.
306
- This makes restores work a lot more intuitively in many cases.
307
- ----------
308
-
309
- 2007-01-25
310
- Version 1.0.14
311
- Peter Fales' marker fix.
312
- Also, markers should be decoded into native charset (because that's what s3
313
- expects to see).
314
- ----------
315
-
316
- 2007-02-19
317
- Version 1.1.0
318
- *WARNING* Lots of path-handling changes. *PLEASE* test safely before you just
319
- swap this in for your working 1.0.x version.
320
-
321
- - Adding --exclude (and there was much rejoicing).
322
- - Found Yet Another Leading Slash Bug with respect to local nodes. It was always
323
- "recursing" into the first folder even if there was no trailing slash and -r
324
- wasn't specified. What it should have done in this case is simply create a node
325
- for the directory itself, then stop (not check the dir's contents).
326
- - Local node canonicalization was (potentially) stripping the trailing slash,
327
- which we need in order to make some decisions in the local generator.
328
- - Fixed problem where it would prepend a "/" to s3 key names even with blank
329
- prefix.
330
- - Fixed S3->local when there's no "/" in the source so it doesn't try to create
331
- a folder with the bucket name.
332
- - Updated s3try and s3_s3sync_mod to allow SSL_CERT_FILE
333
- ----------
334
-
335
- 2007-02-22
336
- Version 1.1.1
337
- Fixed dumb regression bug caused by the S3->local bucket name fix in 1.1.0
338
- ----------
339
-
340
- 2007-02-25
341
- Version 1.1.2
342
- Added --progress
343
- ----------
344
-
345
- 2007-06-02
346
- Version 1.1.3
347
- IMPORTANT!
348
- Pursuant to http://s3sync.net/forum/index.php?topic=49.0 , the tar.gz now
349
- expands into its own sub-directory named "s3sync" instead of dumping all the
350
- files into the current directory.
351
-
352
- In the case of commands of the form:
353
- s3sync -r somedir somebucket:
354
- The root directory node in s3 was being stored as "somedir/" instead of "somedir"
355
- which caused restores to mess up when you say:
356
- s3sync -r somebucket: restoredir
357
- The fix to this, by coincidence, actually makes s3fox work even *less* well with
358
- s3sync. I really need to build my own xul+javascript s3 GUI some day.
359
-
360
- Also fixed some of the NoMethodError stuff for when --progress is used
361
- and caught Errno::ETIMEDOUT
362
- ----------
363
-
364
- 2007-07-12
365
- Version 1.1.4
366
- Added Alastair Brunton's yaml config code.
367
- ----------
368
-
369
- 2007-11-17
370
- Version 1.2.1
371
- Compatibility for S3 API revisions.
372
- When retries are exhausted, emit an error.
373
- Don't ever try to delete the 'root' local dir.
374
- ----------
375
-
376
- 2007-11-20
377
- Version 1.2.2
378
- Handle EU bucket 307 redirects (in s3try.rb)
379
- --make-dirs added
380
- ----------
381
-
382
- 2007-11-20
383
- Version 1.2.3
384
- Fix SSL verification settings that broke in new S3 API.
385
- ----------
386
-
387
- 2008-01-06
388
- Version 1.2.4
389
- Run from any dir (search "here" for includes).
390
- Search out s3config.yml in some likely places.
391
- Reset connection (properly) on retry-able non-50x errors.
392
- Fix calling format bug preventing it from working from yml.
393
- Added http proxy support.
394
- ----------
395
-
396
- 2008-05-11
397
- Version 1.2.5
398
- Added option --no-md5
399
- ----------
400
-
401
- FNORD
@@ -1,172 +0,0 @@
1
- Welcome to s3cmd.rb
2
- -------------------
3
- This is a ruby program that wraps S3 operations into a simple command-line tool.
4
- It is inspired by things like rsh3ll, #sh3ll, etc., but shares no code from
5
- them. It's meant as a companion utility to s3sync.rb but could be used on its
6
- own (provided you have read the other readme file and know how to use s3sync in
7
- theory).
8
-
9
- I made this even though lots of other "shell"s exist, because I wanted a
10
- single-operation utility, instead of a shell "environment". This lends itself
11
- more to scripting, etc. Also the delete operation on rsh3ll seems to be borken
12
- at the moment? =(
13
-
14
- Users not yet familiar with s3sync should read about that first, since s3cmd and
15
- s3sync share a tremendous amount of conventions and syntax. Particularly you
16
- have to set up environment variables prior to calling s3cmd, and s3cmd also uses
17
- the "bucket:key" syntax popularized by s3sync. Many of the options are the same
18
- too. Really, go read the other readme first if you haven't used s3sync yet.
19
- Otherwise you will become confused. It's OK, I'll wait.
20
-
21
- ....
22
-
23
- In general, s3sync and s3cmd complement each other. s3sync is useful to perform
24
- serious synchronization operations, and s3cmd allows you to do simple things
25
- such as bucket management, listing, transferring single files, and the like.
26
-
27
- Here is the usage, with examples to follow.
28
-
29
- s3cmd.rb [options] <command> [arg(s)] version 1.0.0
30
- --help -h --verbose -v --dryrun -n
31
- --ssl -s --debug -d
32
-
33
- Commands:
34
- s3cmd.rb listbuckets [headers]
35
- s3cmd.rb createbucket|deletebucket <bucket> [headers]
36
- s3cmd.rb list <bucket>[:prefix] [max/page] [delimiter] [headers]
37
- s3cmd.rb delete <bucket>:key [headers]
38
- s3cmd.rb deleteall <bucket>[:prefix] [headers]
39
- s3cmd.rb get|put <bucket>:key <file> [headers]
40
-
41
-
42
- A note about [headers]
43
- ----------------------
44
- For some S3 operations, such as "put", you might want to specify certain headers
45
- to the request such as Cache-Control, Expires, x-amz-acl, etc. Rather than
46
- supporting a load of separate command-line options for these, I just allow
47
- header specification. So to upload a file with public-read access you could
48
- say:
49
- s3cmd.rb put MyBucket:TheFile.txt ALocalFile.txt x-amz-acl:public-read
50
-
51
- If you don't need to add any particular headers then you can just ignore this
52
- whole [headers] thing and pretend it's not there. This is somewhat of an
53
- advanced option.
54
-
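Turning trailing name:value arguments into a header hash could be done along these lines. This is a sketch only; `parse_headers` is a hypothetical name and s3cmd.rb's own parsing may differ.

```ruby
# Hypothetical parser for trailing "Name:value" arguments
# such as x-amz-acl:public-read.
def parse_headers(args)
  args.each_with_object({}) do |arg, headers|
    name, value = arg.split(':', 2)   # first colon only; values may contain more
    raise ArgumentError, "not a header: #{arg}" if value.nil? || name.empty?

    headers[name] = value.strip
  end
end

p parse_headers(['x-amz-acl:public-read'])
```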
55
-
56
- Examples
57
- --------
58
- List all the buckets your account owns:
59
- s3cmd.rb listbuckets
60
-
61
- Create a new bucket:
62
- s3cmd.rb createbucket BucketName
63
-
64
- Create a new bucket in the EU:
65
- s3cmd.rb createbucket BucketName EU
66
-
67
- Find out the location constraint of a bucket:
68
- s3cmd.rb location BucketName
69
-
70
- Delete an old bucket you don't want any more:
71
- s3cmd.rb deletebucket BucketName
72
-
73
- Find out what's in a bucket, 10 lines at a time:
74
- s3cmd.rb list BucketName 10
75
-
76
- Only look in a particular prefix:
77
- s3cmd.rb list BucketName:startsWithThis
78
-
79
- Look in the virtual "directory" named foo;
80
- lists sub-"directories" and keys that are at this level.
81
- Note that if you specify a delimiter you must specify a max before it.
82
- (until I make the options parsing smarter)
83
- s3cmd.rb list BucketName:foo/ 10 /
84
-
85
- Delete a key:
86
- s3cmd.rb delete BucketName:AKey
87
-
88
- Delete all keys that match (like a combo between list and delete):
89
- s3cmd.rb deleteall BucketName:SomePrefix
90
-
91
- Only pretend you're going to delete all keys that match, but list them:
92
- s3cmd.rb --dryrun deleteall BucketName:SomePrefix
93
-
94
- Delete all keys in a bucket (leaving the bucket):
95
- s3cmd.rb deleteall BucketName
96
-
97
- Get a file from S3 and store it to a local file
98
- s3cmd.rb get BucketName:TheFileOnS3.txt ALocalFile.txt
99
-
100
- Put a local file up to S3
101
- Note we don't automatically set mime type, etc.
102
- NOTE that the order of the options doesn't change. S3 stays first!
103
- s3cmd.rb put BucketName:TheFileOnS3.txt ALocalFile.txt
104
-
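Every example above leans on the bucket:key argument form. One plausible way to split such an argument, assuming the bucket name itself never contains a colon; `split_target` is a hypothetical helper.

```ruby
# Hypothetical splitter for "bucket:key" and bare "bucket" arguments.
def split_target(arg)
  bucket, key = arg.split(':', 2)   # keys may legitimately contain more colons
  raise ArgumentError, 'bucket name is required' if bucket.nil? || bucket.empty?

  [bucket, key.to_s]                # a bare bucket yields an empty key/prefix
end

p split_target('BucketName:foo/')
p split_target('BucketName')
```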
105
-
106
- Change Log:
107
- -----------
108
- 2006-10-14:
109
- Created.
110
- -----------
111
-
112
- 2006-10-16
113
- Version 1.0.1
114
- Force content length to a string value since some ruby versions don't convert it right.
115
- -----------
116
-
117
- 2006-10-25
118
- UTF-8 fixes.
119
- -----------
120
-
121
- 2006-11-28
122
- Version 1.0.3
123
- Added a couple more error catches to s3try.
124
- ----------
125
-
126
- 2007-01-25
127
- Version 1.0.4
128
- Peter Fales' marker fix.
129
- Also, markers should be decoded into native charset (because that's what s3
130
- expects to see).
131
- ----------
132
-
133
- 2007-02-19
134
- - Updated s3try and s3_s3sync_mod to allow SSL_CERT_FILE
135
- ----------
136
-
137
- 2007-02-25
138
- Added --progress
139
- ----------
140
-
141
- 2007-07-12
142
- Version 1.0.6
143
- Added Alastair Brunton's yaml config code.
144
- ----------
145
-
146
- 2007-11-17
147
- Version 1.2.1
148
- Compatibility for S3 API revisions.
149
- When retries are exhausted, emit an error.
150
- ----------
151
-
152
- 2007-11-20
153
- Version 1.2.2
154
- Handle EU bucket 307 redirects (in s3try.rb)
155
- ----------
156
-
157
- 2007-11-20
158
- Version 1.2.3
159
- Fix SSL verification settings that broke in new S3 API.
160
- ----------
161
-
162
- 2008-01-06
163
- Version 1.2.4
164
- Run from any dir (search "here" for includes).
165
- Search out s3config.yml in some likely places.
166
- Reset connection (properly) on retry-able non-50x errors.
167
- Fix calling format bug preventing it from working from yml.
168
- Added http proxy support.
169
- ----------
170
-
171
-
172
- FNORD