thin_out_backups 0.0.1
- checksums.yaml +7 -0
- data/.gitignore +17 -0
- data/.rspec +1 -0
- data/Gemfile +4 -0
- data/License +22 -0
- data/Rakefile +2 -0
- data/Readme.md +43 -0
- data/bin/thin_out_backups +180 -0
- data/lib/thin_out_backups/time_fixes.rb +131 -0
- data/lib/thin_out_backups/version.rb +3 -0
- data/lib/thin_out_backups.rb +285 -0
- data/spec/thin_out_backups_spec.rb +137 -0
- data/thin_out_backups.gemspec +23 -0
- metadata +116 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: 71d687dd4b0eb09b150c37d7d4567d134eb21cc6
+  data.tar.gz: 10a0c8c8eb6fa2a090fff6569720ced62fafa997
+SHA512:
+  metadata.gz: ed27167bae6ab29f0466127064500f964288dce9e71fc7d6836623b871b437368014d0800084b5106131f567d7be4ffdcab994f02ed91d382b2e091ff214da92
+  data.tar.gz: e89661fc8e6f5e2e41226f8f746c573775cb7dae69434a9ad8ba75465a3a3c97b7b269062806a08c828b88b45cb65983a1883422f10bb730f4a46b08d187f1d9
data/.gitignore
ADDED
data/.rspec
ADDED
@@ -0,0 +1 @@
+--color
data/Gemfile
ADDED
data/License
ADDED
@@ -0,0 +1,22 @@
+Copyright (c) 2012 Tyler Rick
+
+MIT License
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/Rakefile
ADDED
data/Readme.md
ADDED
@@ -0,0 +1,43 @@
+# thin_out_backups
+
+Quickly and safely thin out a backups directory that's taking up too much hard disk space!
+
+`thin_out_backups` will keep the specified number of backups in each frequency category (weekly,
+daily, etc.) and delete the rest, keeping the space requirements of your backups directory fairly
+constant over time.
+
+The files that you are thinning out don't have to be backups, but that is probably the most common
+use case.
+
+## Installation
+
+    $ gem install thin_out_backups
+
+## Usage
+
+    $ thin_out_backups --daily 5 --monthly 12 /path/to/backups
+
+
+
+## License
+
+Copyright 2008, 2012 Tyler Rick
+
+Released under the MIT license. See License file.
+
+## Contributing
+
+1. Fork it
+2. Create your feature branch (`git checkout -b my-new-feature`)
+3. Commit your changes (`git commit -am 'Added some feature'`)
+4. Push to the branch (`git push origin my-new-feature`)
+5. Create new Pull Request
+
+## Other names considered
+
+* thin_out_backup_dir
+* sparsen_dir
+* sparsify_dir
+* [rm_extra_copies](https://github.com/TylerRick/command-line/blob/master/bin/rm_extra_copies) (previously published under this name)
+* prune_backups
+* trim_dir
data/bin/thin_out_backups
ADDED
@@ -0,0 +1,180 @@
+#!/usr/bin/env ruby
+# vim: textwidth=150
+
+=begin To Do
+Make it possible to run on a capistrano releases directory to only keep the n latest releases and delete the rest (just like cap deploy:cleanup). So you don't end up with a pile of 200 releases in that directory...
+
+Add --duplicates/--dups option, that when present deletes all duplicate files in a directory
+
+Add --max-size option: keeps deleting until the du -s #{dir} reports that it is beneath that threshold.
+  What order should it delete? Oldest first?
+  After each deletion, checks if it's reached (below) max size yet (just check filesize before deleting and then subtract that from a running total -- faster than doing a du over and over).
+  Before any other deletes, it should search for/delete duplicates. If --duplicates option present, will do the search even if already below max size.
+=end
+
+require 'getoptlong'
+#require 'rubygems'
+#require 'facets'
+require 'thin_out_backups'
+
+def help
+  puts <<End
+Synopsis:
+---------
+
+    thin_out_backups [options] dirs...
+
+Example:
+    thin_out_backups --daily 5 --monthly 12 /path/to/backups
+
+
+thin_out_backups will keep the specified number of time-stamped copies from each "time bucket" and remove all older copies.
+
+So even if you don't actually make any weekly backups, as long as you make daily or hourly backups, you will still *have* weekly backups which
+you can keep. This will let you keep, for example, *all* the weekly backups but only 6 days' worth of daily backups.
+
+When you specify --weekly=3, it means go back for 3 weeks and keep one backup from each week.
+So it will keep the most recent file from the current week, the most recent file from the previous week, etc.
+
+Use a * instead of a number to go back all the way to the oldest [week] represented in that directory and keep 1 copy from each [week] visited
+along the way. Note that it usually only makes sense to do this for the largest time interval being used. (It wouldn't make any sense to say
+--hourly=* --daily=3 because the --daily=3 will cause no files to be kept that wouldn't already be kept due to the --hourly.)
+
+The same file may satisfy more than 1 quota. So if you set them all to 1, they will all be satisfied by the same file: the most recent backup in the
+directory. For that reason, 2 is probably the minimum you'd want to set any of these to.
+
+
+Options:
+--------
+    --help                          Print this message
+    --minutely/--hourly/--daily/--weekly/--monthly/--yearly quota/frequency
+                                    Keep n number of [daily] backups or * to keep all
+    --<custom-interval>=<n>
+    --force                         Don't ask for confirmation before deleting. Use this if you put this command in a crontab and you are very sure you know
+                                    what it will do.
+    --now                           Use a different date/time as now (e.g. '2008-11-12 07:45:00').
+    --get-time-from filename        Takes the time from the filename. It must be present in the filename and formatted in %Y%m%dT%H%M format (a trailing %S is optional).
+    --get-time-from file_system
+                                    Uses the mtime of the file.
+    --no-color                      Don't colorize the output.
+    --last/--latest                 Take the last (latest) candidate in each time interval (default).
+    --first/--earliest              Take the first (earliest) candidate in each time interval (not default).
+
+By default, intervals are aligned to calendar boundaries: days start at the beginning of the day, weeks at the beginning of the week, etc.
+
+To do:
+    --identical                     Compares the md5 digest of each pair of files and removes any files that have the same digest (and therefore probably are identical) -- except for the oldest/newest one.
+    --consec-identical              Same as --identical except only checks and removes consecutive/neighboring files to see if they are identical.
+    --align=<?>
+                                    add option to align the time ranges (days, etc.) from "now" rather than at the beginning of the day, etc. So "day" would actually be "the most
+                                    recent 24-hour period, beginning exactly 24 hours ago and ending right now".
+
+
+Format of quota/frequency
+-------------------------
+
+TODO: allow crontab-style specifiers like --hourly=2/4 to keep 2 4-hourly backups, or
+--hourly=*/12 to keep *all* 12-hourly backups (= keep 2 backups per day).
+
+
+How your files/directories must be organized
+--------------------------------------------
+
+Each directory is expected to contain only one type of backup -- with many "copies" of that backup accumulated over time -- not a mix of backups.
+
+These backups may be either files or directories. (rm -rf will be used to remove them)
+
+Within each directory, this program will take the n most recently created files in each "time bucket" and retain them while pruning/deleting *all*
+other files in the directory.
+
+WARNING: If you have other stuff in that directory that you want to keep, be warned: it WILL be deleted by this script!
+
+TODO: let you specify a glob pattern for files that WILL be pruned; and/or add an --ignore option to specify file patterns to ignore = NOT subject to pruning = not deleted
+
+By default, the timestamp must be in the filename itself, in "%Y%m%dT%H%M" or "%Y%m%dT%H%M%S" format.
+Use --get-time-from file_system to use the timestamp recorded in the file system (mtime) rather than the timestamp from within the filename.
+
+Background
+----------
+
+This is useful when you have a cron job that continuously creates backups and dumps them in a certain directory and you want to keep that directory
+from becoming inordinately large. (Or quickly get it down to a smaller size now that it's grown extremely large and you've run out of room on the disk.)
+
+The idea is that you are more likely to want/need a recent backup than an older backup. Probably because you hope to become aware of whatever problem
+necessitates looking at/using/restoring from the backup very soon after the problem arises.
+
+So you want to keep a higher density of backups from recent times than you do from older times. The high density of recent backups ensures that you
+have a higher likelihood of having a backup from very soon before the data became corrupt/whatever and needed to be restored.
+
+For example, if you have backups for every hour in the last 24 hours, and you discover that something got deleted/corrupted/etc. at 23:15 last night,
+then if you pull the most recent backup, say the 23:00 backup, it will be at most an hour before the time of the problem. So the amount of data that
+is lost is at most 1 hour's worth.
+
+As you go further back in time, however, you are less likely to need any backups from that time. But you may want to keep them around for historical
+or statistical or "just in case there is a subtle problem that we don't find out about until 6 months later and we need to be able to go back and
+determine how it happened and restore a certain chunk of data from that old pre-problem backup."
+
+Well, this tool will help you to keep around SOME old copies, without keeping around as many as you keep of the more recent ones.
+
+End
+  exit 0
+end
+
+
+
+quotas = {}
+
+help if ARGV.empty?
+
+opts = GetoptLong.new(
+  [ '--minutely',            GetoptLong::REQUIRED_ARGUMENT ],
+  [ '--hourly',              GetoptLong::REQUIRED_ARGUMENT ],
+  [ '--daily',               GetoptLong::REQUIRED_ARGUMENT ],
+  [ '--weekly',              GetoptLong::REQUIRED_ARGUMENT ],
+  [ '--monthly',             GetoptLong::REQUIRED_ARGUMENT ],
+  [ '--yearly',              GetoptLong::REQUIRED_ARGUMENT ],
+  [ '--get-time-from',       GetoptLong::REQUIRED_ARGUMENT ],
+  [ '--last', '--latest',    GetoptLong::NO_ARGUMENT ],
+  [ '--first', '--earliest', GetoptLong::NO_ARGUMENT ],
+  [ '--force', '-f',         GetoptLong::NO_ARGUMENT ],
+  [ '--no-color',            GetoptLong::NO_ARGUMENT ],
+  [ '--time-format',         GetoptLong::REQUIRED_ARGUMENT ],
+  [ '--pattern',             GetoptLong::REQUIRED_ARGUMENT ],
+  [ '--exclude',             GetoptLong::REQUIRED_ARGUMENT ],
+  [ '--now',                 GetoptLong::REQUIRED_ARGUMENT ]
+)
+
+opts.each do | opt, arg |
+  case opt
+
+  when '--help'
+    help
+
+  when '--force', '-f'
+    ThinOutBackups::Command.force = true
+
+  when '--no-color'
+    ThinOutBackups::Command.color = false
+
+  when *ThinOutBackups::Command.options.map {|o| "--#{o.to_s.gsub(/_/, '-')}"}
+    name = opt.gsub(/^--/, '').gsub(/-/, '_').to_sym  # '--get-time-from' => :get_time_from
+    ThinOutBackups::Command.send("#{name}=", arg)
+
+  when *ThinOutBackups::Command.allowed_bucket_names.map {|o| "--#{o}"}
+    name = opt.gsub(/^--/, '').to_sym
+    if arg == '*'
+      #quotas[name] = :all
+      quotas[name] = '*'
+    else
+      quotas[name] = arg.to_i
+    end
+  end
+end
+
+dirs = ARGV
+raise "Must specify at least one directory" if dirs.empty?
+dirs.each do |dir|
+  command = ThinOutBackups::Command.new(dir, quotas)
+  command.run
+end
+
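Reviewer note: the option loop above performs two small conversions, from a long option name to a setter on `ThinOutBackups::Command` and from a quota argument to either `'*'` or an integer. The sketch below isolates both as standalone helpers. `option_to_setter` and `parse_quota` are hypothetical names used only for illustration; they are not part of the gem. GetoptLong reports hyphenated names such as `--get-time-from`, so the hyphens have to become underscores before the name can be passed to `send`.

```ruby
# Hypothetical helpers (not part of thin_out_backups) that mirror the two
# conversions done in the option loop.

# '--get-time-from' -> 'get_time_from=' (a setter name usable with send)
def option_to_setter(opt)
  "#{opt.sub(/\A--/, '').tr('-', '_')}="
end

# '*' means "keep one copy from every interval"; anything else must be a
# whole number written in base 10.
def parse_quota(arg)
  return '*' if arg == '*'
  Integer(arg, 10) # raises ArgumentError for input such as 'abc'
end

puts option_to_setter('--get-time-from')
puts parse_quota('12')
```

The script itself uses `arg.to_i`, which silently turns a mistyped quota into 0. `Integer(arg, 10)` is a stricter alternative and is only a suggestion here.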
data/lib/thin_out_backups/time_fixes.rb
ADDED
@@ -0,0 +1,131 @@
+# Until facets gets patched proper, we'll monkey patch it here:
+# TODO: Is this still needed?
+#require '/home/tyler/dev/ruby/facets/lib/core/facets/time/hence'
+class Time
+
+  if defined?(::ActiveSupport)
+
+    alias_method :in, :since
+    alias_method :hence, :since
+
+  else
+
+    # Returns a new Time representing the time
+    # a number of time-units ago.
+    #
+    def ago(number, units=:seconds)
+      time =(
+        case units.to_s.downcase.to_sym
+        when :years
+          set(:year => (year - number))
+        when :months
+          y = ((month - number) / 12).to_i
+          #puts "(#{month} - #{number}) / 12 == #{y}"
+
+          new_month = ((month - number - 1) % 12) + 1
+          y += 1 if new_month > month
+          #puts y
+
+          set(:year => (year - y), :month => new_month)
+        when :weeks
+          self - (number * 604800)
+        when :days
+          self - (number * 86400)
+        when :hours
+          self - (number * 3600)
+        when :minutes
+          self - (number * 60)
+        when :seconds, nil
+          self - number
+        else
+          raise ArgumentError, "unrecognized time units -- #{units}"
+        end
+      )
+      dst_adjustment(time)
+    end
+    #
+    # Returns a new Time representing the time
+    # a number of time-units hence.
+
+    def hence(number, units=:seconds)
+      time =(
+        case units.to_s.downcase.to_sym
+        when :years
+          set( :year=>(year + number) )
+        when :months
+          y = ((month + number - 1) / 12).to_i
+          m = ((month + number - 1) % 12) + 1
+          set(:year => (year + y), :month => m)
+        when :weeks
+          self + (number * 604800)
+        when :days
+          self + (number * 86400)
+        when :hours
+          self + (number * 3600)
+        when :minutes
+          self + (number * 60)
+        when :seconds
+          self + number
+        else
+          raise ArgumentError, "unrecognized time units -- #{units}"
+        end
+      )
+      dst_adjustment(time)
+    end
+
+    alias_method :in, :hence
+    alias_method :since, :hence
+
+    # Adjust DST
+    #
+    # TODO: Can't seem to get this to pass ActiveSupport tests.
+    # Even though it is essentially identical to the
+    # ActiveSupport code (see Time#since in time/calculations.rb).
+    # It handles all but 4 tests.
+    def dst_adjustment(time)
+      self_dst = self.dst? ? 1 : 0
+      time_dst = time.dst? ? 1 : 0
+      seconds = (self - time).abs
+      if (seconds >= 86400 && self_dst != time_dst)
+        time + ((self_dst - time_dst) * 60 * 60)
+      else
+        time
+      end
+    end
+
+  end
+
+end
+
+
+class DateTime
+  def to_time
+    Time.mktime(year, month, day, hour, min, sec)
+  end
+end
+
+class Time
+  def to_s
+    #strftime("%Y%m%dT%H%M%S")
+    strftime("%Y-%m-%d %H:%M")
+  end
+  def to_s_full
+    strftime("%Y-%m-%d %H:%M:%S")
+  end
+
+  # /usr/lib/ruby/gems/1.8/gems/activesupport-2.1.2/lib/active_support/core_ext/time/calculations.rb
+
+  def beginning_of_day
+    change(:hour => 0)
+  end
+  alias :midnight :beginning_of_day
+
+  # Returns a new Time representing the "start" of this week (Sunday, 0:00)
+  def beginning_of_week
+    days_to_sunday = self.wday
+    self.ago(days_to_sunday, :days).midnight
+  end
+  alias :sunday :beginning_of_week
+
+end
+
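Reviewer note: the `:months` branches above step a (year, month) pair backward or forward. One way to do that arithmetic with no separate year adjustment is to count months from year 0 and let `divmod` handle the wrap-around in both directions. The sketch below is standalone; `months_ago` is a hypothetical helper, not part of the gem or of facets, and it works on plain integers rather than `Time` objects. It does not clamp the day of month (for example, March 31 minus one month).

```ruby
# Hypothetical helper: steps a (year, month) pair back by n months.
# A negative n steps forward. month is 1-based (1 = January).
def months_ago(year, month, n)
  # Count months from year 0 so that divmod (floor division) handles
  # wrap-around across any number of year boundaries.
  new_year, month_index = (year * 12 + (month - 1) - n).divmod(12)
  [new_year, month_index + 1]
end

p months_ago(2008, 1, 1)
p months_ago(2008, 3, 5)
```

As far as can be told from the code, the tool itself only ever steps by one unit at a time (`ago(1, unit)` and `hence(1, unit)`). Larger steps matter only if `Time#ago` is called directly with a number of months greater than the current month, where the year adjustment in the patched `ago` appears to come out one year too late.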
data/lib/thin_out_backups/version.rb
ADDED
data/lib/thin_out_backups.rb
ADDED
@@ -0,0 +1,285 @@
+require "thin_out_backups/version"
+
+# Tested by: spec/thin_out_backups_spec.rb
+
+require 'fileutils'
+require 'pathname'
+require 'delegate'
+
+#require 'rubygems'
+require 'facets/time'
+require 'colored'
+require 'quality_extensions/module/attribute_accessors'
+require 'thin_out_backups/time_fixes'
+
+class ThinOutBackups::Command
+  #---------------------------------------------------------------------------------------------------------------------------------------------------
+
+  @@allowed_bucket_names = [:minutely, :hourly, :daily, :weekly, :monthly, :yearly]
+  mattr_reader :allowed_bucket_names
+
+  #---------------------------------------------------------------------------------------------------------------------------------------------------
+  # Options
+  @@options = [:get_time_from, :ignore_files, :verbosity, :time_format, :now, :force, :color]
+  mattr_reader :options
+  mattr_accessor *@@options
+
+  @@get_time_from = :filename
+  def self.get_time_from=(new)
+    @@get_time_from = new.to_sym
+    raise "Unknown value for get_time_from: #{@@get_time_from}" unless [:filename, :file_system].include?(@@get_time_from)
+  end
+
+  @@ignore_files = nil
+  @@verbosity = 1
+  @@force = false
+  @@color = true
+
+  @@now = Time.now
+  def self.now=(new)
+    # Parse the --now argument (e.g. '2008-11-12 07:45:00') rather than storing the raw string
+    @@now = DateTime.strptime(new, '%Y-%m-%d %H:%M:%S').to_time
+    puts "Using alternate now: #{@@now}"
+  end
+
+  @@time_format = /(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})?/
+  @@time_format_parts = [:Y,:m,:d, :H,:M,:S]
+  # TODO: Maybe use something like this for interpreting time, rather than a regexp? DateTime.strptime("27/Nov/2007:15:01:43 -0800", "%d/%b/%Y:%H:%M:%S %z")
+  def self.time_format=(new)
+    # TODO: accept format strings such as 'H:M:S d.m.Y.' and 'Y-m-d H:M:S'
+    # TODO: do error checking
+  end
+
+
+  #---------------------------------------------------------------------------------------------------------------------------------------------------
+
+  class Bucket
+    attr_reader :parent, :name, :quota, :keepers
+    @@quota_format = %r[(\d+|\*)(/\d+)?]
+
+    def initialize(parent, name, quota)
+      @parent = parent
+      @name = name
+      (
+        raise "Invalid quota '#{quota}'" unless quota.is_a?(Integer) || quota =~ @@quota_format
+        @quota = quota
+      )
+      @keepers = []
+    end
+
+    def unit
+      {
+        :minutely => :minutes,
+        :hourly   => :hours,
+        :daily    => :days,
+        :weekly   => :weeks,
+        :monthly  => :months,
+        :yearly   => :years,
+      }[@name.to_sym]
+    end
+
+    def start_time
+      start_time = parent.now.dup
+      if parent.align_at_beginning_of_time_interval
+        beginning_of_interval =
+          case unit
+          when :minutes
+            start_time.change(:sec => 0)
+          when :hours
+            start_time.change(:min => 0)
+          when :days
+            start_time.change(:hour => 0)
+          when :weeks
+            start_time.beginning_of_week
+          when :months
+            start_time.change( :day => 1, :hour => 0)
+          when :years
+            start_time.change(:month => 1, :day => 1, :hour => 0)
+          else
+            raise "unexpected unit #{unit}"
+          end
+        # We actually want to use the *next* interval (in the future) as our start_time because we will be using this as our max and going backwards in time...
+        beginning_of_interval.hence(1, unit)
+      else
+        start_time
+      end
+    end
+
+    def keep(keeper)
+      @keepers << keeper
+    end
+
+    def still_need
+      if keep_all?
+        1 # it has insatiable hunger for keepers that can never be satisfied ... always just 1 more ...
+      else
+        @quota - @keepers.size
+      end
+    end
+
+    def keep_all?; quota.to_s.include?('*') end
+
+    def satisfied?
+      if keep_all?
+        false # it has insatiable hunger for keepers that can never be satisfied
+      else
+        still_need == 0
+      end
+    end
+  end
+
+  class File < DelegateClass(::Pathname)
+    attr_reader :filename, :file
+
+    def initialize(filename)
+      super(Pathname.new(filename))
+    end
+
+    def full_path
+      dirname.to_s + '/' + filename
+    end
+
+    def filename
+      basename.to_s
+    end
+    def to_s
+      filename
+    end
+
+    def time
+      if ThinOutBackups::Command.get_time_from == :filename
+        if filename =~ ThinOutBackups::Command.time_format
+          y,m,d, h,i,s = $1,$2,$3, $4,$5,$6
+          Time.mktime(y,m,d, h,i,s)
+        else
+          nil
+        end
+      elsif ThinOutBackups::Command.get_time_from == :file_system
+        mtime # delegated to the wrapped Pathname
+      else
+        raise "Unknown value for get_time_from: #{ThinOutBackups::Command.get_time_from}"
+      end
+    end
+
+    def has_time?; !!time end
+    def ignored?
+      !has_time? or
+        ThinOutBackups::Command.ignore_files && filename =~ ThinOutBackups::Command.ignore_files
+    end
+  end
+
+  attr_reader :align_at_beginning_of_time_interval, :files_with_times
+  attr_accessor :dir
+
+  def initialize(dir, quotas)
+    @align_at_beginning_of_time_interval = true
+    @dir = dir
+    @buckets = {}
+    @@allowed_bucket_names.each do |name|
+      quota = quotas[name]
+      @buckets[name] = Bucket.new(self, name, quota) unless quota.nil?
+    end
+
+    puts "Processing #{@dir}/*".magenta
+    files = Dir["#{@dir}/*"].map { |filename|
+      file = File.new(filename)
+    }
+    @files_with_times = files.
+      reject {|file| !file.has_time?}.
+      sort { |a, b|
+        a.time <=> b.time
+      }.
+      reverse
+
+  end
+
+  def bucket_remaining(bucket_name, decr = nil)
+    send("#{bucket_name}=", send("#{bucket_name}") - decr) if decr
+    send "#{bucket_name}"
+  end
+
+  def buckets
+    @buckets.values
+  end
+  def bucket(name)
+    @buckets[name] or raise "unknown bucket '#{name}'"
+  end
+
+  def now
+    @@now # honors --now; defaults to the time at which this file was loaded
+  end
+
+  def delete_non_keepers
+    #raise "Didn't find any files to keep?!" unless keepers.any?
+    files_with_times.each do |file|
+      if (buckets = buckets_with_file(file)).any?
+        puts "#{file.full_path}: in buckets: #{buckets.map(&:name).join(', ')}".green
+      else
+        puts "#{file.full_path}: delete".red
+      end
+    end
+
+    if @@force == false
+      print "Continue with deletions? (yes or no) >".magenta
+      response = STDIN.gets
+      (puts "Aborting"; return) unless response.chomp.downcase == 'yes'
+    end
+
+    files_with_times.each do |file|
+      if (buckets = buckets_with_file(file)).any?
+        #
+      else
+        file.rmtree # removes plain files as well as (non-empty) directories
+      end
+    end
+  end
+
+  def buckets_with_file(file)
+    buckets.find_all {|bucket| bucket.keepers.include?(file)}
+  end
+
+  def delete(files)
+    puts "Deleting files: #{files.join(', ')}..."
+  end
+
+  def earliest_file_time
+    files_with_times.last.time
+  end
+
+  def run
+    raise "Must keep at least 1 file from at least one time bucket" if buckets.empty?
+    (puts "Found no files with times! Aborting."; return) if files_with_times.empty?
+
+    # Fill each bucket until its quota is met
+    buckets.each do |bucket|
+      puts "Trying to fill bucket '#{bucket.name}' (quota: #{bucket.quota})...".magenta
+
+      time_max = bucket.start_time
+      time_min = time_max.ago(1, bucket.unit)
+
+      #puts "Earliest_file_time: #{earliest_file_time}"
+      while time_max > earliest_file_time
+        print "Checking range (#{time_min} .. #{time_max})... ".yellow if verbosity >= 1
+        new_keeper = files_with_times.detect {|file|
+          #print "#{file}? "
+          time_min <= file.time &&
+                      file.time < time_max
+        }
+        if new_keeper
+          puts "found keeper #{new_keeper}".green if verbosity >= 1
+          bucket.keep new_keeper
+        else
+          #puts "found no keepers".red if verbosity >= 1
+          puts "" if verbosity >= 1
+        end
+
+        time_max = time_min
+        #puts "Stepping back from #{time_min} by 1 #{bucket.unit} => #{time_min.ago(1, bucket.unit)}"
+        time_min = time_min.ago(1, bucket.unit)
+        (puts 'Filled quota!'.green; break) if bucket.satisfied?
+      end
+    end
+
+    delete_non_keepers
+  end
+end
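Reviewer note: the core of `run` above is a backward walk through time. Starting from an upper bound, it checks one interval at a time, keeps the most recent file in each interval, and stops once the quota is met or the oldest file has been passed. The sketch below is standalone and deliberately simplified, so treat it as an illustration under stated assumptions rather than the gem's implementation. It works on plain `Time` values and uses fixed-length intervals in seconds instead of the calendar-aligned intervals that `Bucket#start_time` computes. `pick_keepers` and its parameters are hypothetical names.

```ruby
# Hypothetical helper (not part of thin_out_backups).
# times:    Array of Time objects, one per backup
# now:      Time used as the (exclusive) upper bound of the newest interval
# interval: interval length in seconds (e.g. 86_400 for a day)
# quota:    number of keepers wanted
def pick_keepers(times, now, interval, quota)
  newest_first = times.sort.reverse
  return [] if newest_first.empty?

  earliest = newest_first.last
  keepers = []
  time_max = now
  while time_max > earliest && keepers.size < quota
    time_min = time_max - interval
    # detect returns the first match, which is the most recent time
    # inside [time_min, time_max) because the list is newest-first.
    keeper = newest_first.detect { |t| time_min <= t && t < time_max }
    keepers << keeper if keeper
    time_max = time_min
  end
  keepers
end

day = 86_400
now = Time.utc(2008, 11, 13)
times = [
  Time.utc(2008, 11, 12, 3, 3),
  Time.utc(2008, 11, 12, 15, 3),
  Time.utc(2008, 11, 11, 3, 3),
  Time.utc(2008, 11, 8, 3, 3)
]
daily = pick_keepers(times, now, day, 3)
puts daily.map { |t| t.strftime('%Y-%m-%d %H:%M') }
```

Like the loop in `run`, this keeps walking past empty intervals until the quota is filled, so a quota of 3 can reach further back than three intervals when some of them contain no backup.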
data/spec/thin_out_backups_spec.rb
ADDED
@@ -0,0 +1,137 @@
+require 'tmpdir'
+#require 'rubygems'
+require 'rspec'
+require 'facets'
+require_relative '../lib/thin_out_backups'
+
+$now = Time.utc(2008,11,12, 7,45,19)
+
+describe Time, "#beginning_of_week" do
+  it "should return a Sunday" do
+    Time.utc(2008,11,12).beginning_of_week.should == Time.utc(2008,11,9)
+  end
+end
+
+describe ThinOutBackups::Command::Bucket, "time interval alignment" do
+
+  def sample_quotas
+    {
+      :minutely => 1,
+      :hourly   => 3,
+      :daily    => 1,
+      :weekly   => '*',
+      :monthly  => 1,
+      :yearly   => '*'
+    }
+  end
+
+  before do
+    @command = ThinOutBackups::Command.new('bogus_dir', sample_quotas)
+    @now = $now
+    @command.stub!(:now).and_return(@now)
+  end
+
+  it "should use the time specified by our test" do
+    @command.now.should == @now
+  end
+
+  it "hour interval should start on the hour, etc." do
+    @command.bucket(:minutely).start_time.should == Time.utc(2008,11,12, 7,46,0)
+    @command.bucket(:hourly).  start_time.should == Time.utc(2008,11,12, 8,0,0)
+    @command.bucket(:daily).   start_time.should == Time.utc(2008,11,13, 0,0,0)
+    @command.bucket(:weekly).  start_time.should == Time.utc(2008,11,16, 0,0,0)
+    @command.bucket(:monthly). start_time.should == Time.utc(2008,12, 1, 0,0,0)
+    @command.bucket(:yearly).  start_time.should == Time.utc(2009, 1, 1, 0,0,0)
+  end
+
+end
+
+
+$command = <<End
+thin_out_backups --force --daily=3 --weekly=3 --monthly=* \
+    --now='#{$now.to_s_full}'\
+    spec/test_dir/db_dumps \
+    spec/test_dir/maildir
+End
+describe ThinOutBackups::Command, "when calling `#{$command}`" do
+  before do
+    Pathname.new("spec/test_dir/").rmtree rescue nil
+
+    dir='spec/test_dir/db_dumps/'
+    system "mkdir -p #{dir}"
+    files = %w[
+      db_dump_2008-08-08T0303.sql
+      db_dump_2008-09-01T0303.sql
+      db_dump_2008-09-10T0303.sql
+      db_dump_2008-10-15T0303.sql
+      db_dump_2008-10-16T0303.sql
+      db_dump_2008-10-17T0303.sql
+      db_dump_2008-10-18T0303.sql
+      db_dump_2008-10-19T0303.sql
+      db_dump_2008-10-20T0303.sql
+      db_dump_2008-10-21T0303.sql
+      db_dump_2008-10-22T0303.sql
+      db_dump_2008-10-23T0303.sql
+      db_dump_2008-10-24T0303.sql
+      db_dump_2008-10-25T0303.sql
+      db_dump_2008-10-26T0303.sql
+      db_dump_2008-10-27T0303.sql
+      db_dump_2008-10-28T0303.sql
+      db_dump_2008-10-29T0303.sql
+      db_dump_2008-10-30T0303.sql
+      db_dump_2008-10-31T0303.sql
+      db_dump_2008-11-01T0303.sql
+      db_dump_2008-11-02T0303.sql
+      db_dump_2008-11-03T0303.sql
+      db_dump_2008-11-04T0303.sql
+      db_dump_2008-11-05T0303.sql
+      db_dump_2008-11-06T0303.sql
+      db_dump_2008-11-07T0303.sql
+      db_dump_2008-11-08T0303.sql
+      db_dump_2008-11-09T0303.sql
+      db_dump_2008-11-10T0303.sql
+      db_dump_2008-11-11T0303.sql
+      db_dump_2008-11-12T0303.sql
+    ]
+    files.each do |file|
+      Dir.getwd
+      #puts %(Dir.getwd=#{(Dir.getwd).inspect})
+      #puts %("touch #{dir}/#{file}"=#{("touch #{dir}/#{file}").inspect})
+      system "touch #{dir}/#{file}"
+    end
+
+    dir='spec/test_dir/maildir/'
+    system "mkdir -p #{dir}"
+    subdirs = %w[
+      2008-11-09T0303
+      2008-11-10T0303
+      2008-11-11T0303
+    ]
+    subdirs.each do |subdir|
+      system "mkdir -p #{dir}/#{subdir}"
+      system "touch #{dir}/#{subdir}/inbox"
+      system "touch #{dir}/#{subdir}/some_other_folder"
+    end
+
+    puts %($command=#{($command).inspect})
+    #system $command
+    # TODO: also capture output of command and check it against expected
+  end
+
+  it "keeps/removes the correct files" do
+    Dir['spec/test_dir/db_dumps/*'].should =~
+      ["spec/test_dir/db_dumps/db_dump_2008-11-12T0303.sql",
+       "spec/test_dir/db_dumps/db_dump_2008-11-08T0303.sql",
+       "spec/test_dir/db_dumps/db_dump_2008-10-31T0303.sql",
+       "spec/test_dir/db_dumps/db_dump_2008-11-10T0303.sql",
+       "spec/test_dir/db_dumps/db_dump_2008-08-08T0303.sql",
+       "spec/test_dir/db_dumps/db_dump_2008-11-01T0303.sql",
+       "spec/test_dir/db_dumps/db_dump_2008-09-10T0303.sql",
+       "spec/test_dir/db_dumps/db_dump_2008-11-11T0303.sql"]
+  end
+
+  after do
+    #Pathname.new("spec/test_dir/").rmtree rescue nil
+  end
+end
+
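Reviewer note: the fixture names in this spec use dashed dates (`2008-11-12T0303`), while the default `@@time_format` pattern in `lib/thin_out_backups.rb` expects compact dates (`20081112T0303`). The standalone sketch below applies the same regular expression to both styles. `time_from_filename` is a hypothetical helper, and it uses `Time.utc` for reproducible results, whereas the gem calls `Time.mktime`, which uses local time.

```ruby
# Same pattern as @@time_format in lib/thin_out_backups.rb; seconds are optional.
TIME_FORMAT = /(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})?/

# Hypothetical helper: returns the Time encoded in the file name,
# or nil if the name contains no timestamp in the expected format.
def time_from_filename(filename)
  match = TIME_FORMAT.match(filename)
  return nil unless match

  # The optional seconds group is nil when absent, so keep nil as nil here.
  year, month, day, hour, min, sec = match.captures.map { |part| part && part.to_i }
  Time.utc(year, month, day, hour, min, sec || 0)
end

p time_from_filename('db_dump_20081112T0303.sql')
p time_from_filename('db_dump_2008-11-12T0303.sql')
```

With this pattern the dashed name yields `nil`, so files named like the fixtures above would presumably be treated as having no timestamp unless a different time format is configured.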
data/thin_out_backups.gemspec
ADDED
@@ -0,0 +1,23 @@
+# -*- encoding: utf-8 -*-
+require File.expand_path('../lib/thin_out_backups/version', __FILE__)
+
+Gem::Specification.new do |gem|
+  gem.authors       = ["Tyler Rick"]
+  gem.email         = ["github.com@tylerrick.com"]
+  gem.summary       = %q{Thin out a directory full of backups, only keeping a specified number from each category (weekly, daily, etc.), and deleting the rest.}
+  gem.description   = gem.summary
+  gem.homepage      = ""
+
+  gem.files         = `git ls-files`.split($\)
+  gem.executables   = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
+  gem.test_files    = gem.files.grep(%r{^(test|spec|features)/})
+  gem.name          = "thin_out_backups"
+  gem.require_paths = ["lib"]
+  gem.version       = ThinOutBackups::Version
+
+  gem.add_dependency 'facets'
+  gem.add_dependency 'colored'
+  gem.add_dependency 'quality_extensions'
+
+  gem.add_development_dependency 'rspec'
+end
metadata
ADDED
@@ -0,0 +1,116 @@
+--- !ruby/object:Gem::Specification
+name: thin_out_backups
+version: !ruby/object:Gem::Version
+  version: 0.0.1
+platform: ruby
+authors:
+- Tyler Rick
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2013-03-27 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: facets
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: colored
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: quality_extensions
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: rspec
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+description: Thin out a directory full of backups, only keeping a specified number
+  from each category (weekly, daily, etc.), and deleting the rest.
+email:
+- github.com@tylerrick.com
+executables:
+- thin_out_backups
+extensions: []
+extra_rdoc_files: []
+files:
+- .gitignore
+- .rspec
+- Gemfile
+- License
+- Rakefile
+- Readme.md
+- bin/thin_out_backups
+- lib/thin_out_backups.rb
+- lib/thin_out_backups/time_fixes.rb
+- lib/thin_out_backups/version.rb
+- spec/thin_out_backups_spec.rb
+- thin_out_backups.gemspec
+homepage: ''
+licenses: []
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubyforge_project:
+rubygems_version: 2.0.0
+signing_key:
+specification_version: 4
+summary: Thin out a directory full of backups, only keeping a specified number from
+  each category (weekly, daily, etc.), and deleting the rest.
+test_files:
+- spec/thin_out_backups_spec.rb
+has_rdoc: