thin_out_backups 0.0.1

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 71d687dd4b0eb09b150c37d7d4567d134eb21cc6
4
+ data.tar.gz: 10a0c8c8eb6fa2a090fff6569720ced62fafa997
5
+ SHA512:
6
+ metadata.gz: ed27167bae6ab29f0466127064500f964288dce9e71fc7d6836623b871b437368014d0800084b5106131f567d7be4ffdcab994f02ed91d382b2e091ff214da92
7
+ data.tar.gz: e89661fc8e6f5e2e41226f8f746c573775cb7dae69434a9ad8ba75465a3a3c97b7b269062806a08c828b88b45cb65983a1883422f10bb730f4a46b08d187f1d9
data/.gitignore ADDED
@@ -0,0 +1,17 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ Gemfile.lock
7
+ InstalledFiles
8
+ _yardoc
9
+ coverage
10
+ doc/
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
data/.rspec ADDED
@@ -0,0 +1 @@
1
+ --color
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in thin_out_backups.gemspec
4
+ gemspec
data/License ADDED
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2012 Tyler Rick
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/Rakefile ADDED
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env rake
2
+ require "bundler/gem_tasks"
data/Readme.md ADDED
@@ -0,0 +1,43 @@
1
+ # thin_out_backups
2
+
3
+ Quickly and safely thin out a backups directory that's taking up too much hard disk space!
4
+
5
+ `thin_out_backups` will keep the specified number of backups in each frequency category (weekly,
6
+ daily, etc.) and delete the rest, keeping the space requirements of your backups directory fairly
7
+ constant over time.
8
+
9
+ The files that you are thinning out don't have to be backups, but that is probably the most common
10
+ use case.
11
+
12
+ ## Installation
13
+
14
+ $ gem install thin_out_backups
15
+
16
+ ## Usage
17
+
18
+ $ thin_out_backups
19
+
20
+
21
+
22
+ ## License
23
+
24
+ Copyright 2008, 2012 Tyler Rick
25
+
26
+ Released under the MIT license. See License file.
27
+
28
+ ## Contributing
29
+
30
+ 1. Fork it
31
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
32
+ 3. Commit your changes (`git commit -am 'Added some feature'`)
33
+ 4. Push to the branch (`git push origin my-new-feature`)
34
+ 5. Create new Pull Request
35
+
36
+ ## Other names considered
37
+
38
+ * thin_out_backup_dir
39
+ * sparsen_dir
40
+ * sparsify_dir
41
+ * [rm_extra_copies](https://github.com/TylerRick/command-line/blob/master/bin/rm_extra_copies) (previously published under this name)
42
+ * prune_backups
43
+ * trim_dir
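The retention idea described in the Readme can be illustrated with a small standalone sketch. It is not the gem's implementation: `keep_latest_per_day` is a hypothetical helper, and it only models a "daily" category by keeping the most recent timestamp from each of the `n` most recent calendar days that contain any backups.

```ruby
# Hypothetical helper (not part of the gem): from a list of Time objects,
# keep the latest one from each of the n most recent days that contain
# at least one timestamp.
def keep_latest_per_day(times, n)
  times.group_by { |t| [t.year, t.month, t.day] } # bucket by calendar day
       .values
       .map(&:max)                                 # latest backup of each day
       .sort
       .last(n)                                    # the n most recent days
end

times = [
  Time.utc(2008, 11, 10, 3, 3),
  Time.utc(2008, 11, 11, 3, 3),
  Time.utc(2008, 11, 11, 15, 3),
  Time.utc(2008, 11, 12, 3, 3),
]

kept    = keep_latest_per_day(times, 2)
deleted = times - kept

puts "keep:   #{kept.map { |t| t.strftime('%Y-%m-%d %H:%M') }.join(', ')}"
puts "delete: #{deleted.map { |t| t.strftime('%Y-%m-%d %H:%M') }.join(', ')}"
```

With a quota of 2, the sketch keeps the 15:03 backup from 11 November and the backup from 12 November; everything else would be a candidate for deletion.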
data/bin/thin_out_backups ADDED
@@ -0,0 +1,180 @@
1
+ #!/usr/bin/env ruby
2
+ # vim: textwidth=150
3
+
4
+ =begin To Do
5
+ Make it possible to run on a capistrano releases directory to only keep the n latest releases and delete the rest (just like cap deploy:cleanup). So you don't end up with a pile of 200 releases in that directory...
6
+
7
+ Add --duplicates/--dups option, that when present deletes all duplicate files in a directory
8
+
9
+ Add --max-size option: keeps deleting until the du -s #{dir} reports that it is beneath that threshold.
10
+ What order should it delete? Oldest first?
11
+ After each deletion, checks if it's reached (below) max size yet (just check filesize before deleting and then subtract that from a running total -- faster than doing a du over and over).
12
+ Before any other deletes, it should search for/delete duplicates. If --duplicates option present, will do the search even if already below max size.
13
+ =end
14
+
15
+ require 'getoptlong'
16
+ #require 'rubygems'
17
+ #require 'facets'
18
+ require 'thin_out_backups'
19
+
20
+ def help
21
+ puts <<End
22
+ Synopsis:
23
+ ---------
24
+
25
+ thin_out_backups [options] dirs...
26
+
27
+ Example:
28
+ thin_out_backups --daily 5 --monthly 12 /path/to/backups
29
+
30
+
31
+ thin_out_backups will keep the specified number of time-stamped copies from each "time bucket" and remove all older copies.
32
+
33
+ So even if you don't actually make any weekly backups, as long as you make daily or hourly backups, you will still *have* weekly backups which
34
+ you can keep. This will let you keep, for example, *all* the weekly backups but only 6 days worth of daily backups.
35
+
36
+ When you specify --weekly=3, it means go back for 3 weeks and keep one backup from each week.
37
+ So it will keep the most recent file from the current week, the most recent file from the previous week, etc.
38
+
39
+ Use a * instead of a number to go back all the way to the oldest [week] represented in that directory and keep 1 copy from each [week] visited
40
+ along the way. Note that it usually only makes sense to do this for the largest time interval being used. (It wouldn't make any sense to say
41
+ --hourly=* --daily=3 because the --daily=3 will cause no files to be kept that wouldn't already be kept due to the --hourly.)
42
+
43
+ The same file may satisfy more than 1 quota. So if you set them all to 1, they will all be satisfied by the same file: the most recent backup in the
44
+ directory. For that reason, 2 is probably the minimum you'd want to set any of these to.
45
+
46
+
47
+ Options:
48
+ --------
49
+ --help Print this message
50
+ --minutely/--hourly/--daily/--weekly/--monthly/--yearly quota/frequency
51
+ Keep n number of [daily] backups or * to keep all
52
+ --<custom-interval>=<n>
53
+ --force Don't ask for confirmation before deleting. Use this if you put this command in a crontab and you are very sure you know
54
+ what it will do.
55
+ --now Use a different date/time as now.
56
+ --get-time-from filename Takes the time from the filename. It must be present in the filename and formatted in %Y%m%dT%H%M format.
57
+ --get-time-from file_system
58
+ Uses the mtime of the file.
59
+ --no-color Don't colorize the output.
60
+ --last/--latest Take the last (latest) candidate in each time interval (default).
61
+ --first/--earliest Take the first (earliest) candidate in each time interval (not default).
62
+
63
+ By default, time intervals are aligned to calendar boundaries: days start at midnight, weeks start on Sunday, etc.
64
+
65
+ To do:
66
+ --identical Compares the md5 digest of each pair of files and removes any files that have the same digest (and therefore probably are identical) -- except for the oldest/newest one.
67
+ --consec-identical Same as --identical except only checks and removes consecutive/neighboring files to see if they are identical.
68
+ --align=<?>
69
+ add option to align the time ranges (days, etc.) from "now" rather than at the beginning of the day, etc. So "day" would actually be "the most
70
+ recent 24-hour period, beginning exactly 24 hours ago and ending right now".
71
+
72
+
73
+ Format of quota/frequency
74
+ -------------------------
75
+
76
+ TODO: allow crontab-style specifiers like --hourly=2/4 to keep 2 4-hourly backups, or
77
+ --hourly=*/12 to keep *all* 12-hourly backups (= keep 2 backups per day).
78
+
79
+
80
+ How your files/directories must be organized
81
+ --------------------------------------------
82
+
83
+ Each directory is expected to contain only one type of backup -- with many "copies" of that backup accumulated over time -- not a mix of backups.
84
+
85
+ These backups may be either files or directories. (rm -rf will be used to remove them)
86
+
87
+ Within each directory, this program will take the n most recently created files in each "time bucket" and retain them while pruning/deleting *all*
88
+ other files in the directory.
89
+
90
+ WARNING: If you have other stuff in that directory that you want to keep, be warned: it WILL be deleted by this script!
91
+
92
+ TODO: let you specify a glob pattern for files that WILL be pruned; and/or add an --ignore option to specify file patterns to ignore = NOT subject to pruning = not delete
93
+
94
+ For the moment, I require that the timestamp be in the filename itself, in "%Y%m%dT%H%M%S" format.
95
+ TODO: add option to use the timestamp recorded in the file system rather than the timestamp from within the filename.
96
+
97
+ Background
98
+ ----------
99
+
100
+ This is useful when you have a cron job that continuously creates backups and dumps them in a certain directory and you want to keep that directory
101
+ from becoming inordinately large. (Or quickly get it down to a smaller size now that it's grown extremely large and you've run out of room on the disk.)
102
+
103
+ The idea is that you are more likely to want/need a recent backup than an older backup. Probably because you hope to become aware of whatever problem
104
+ necessitates looking at/using/restoring from the backup very soon after the problem arises.
105
+
106
+ So you want to keep a higher density of backups from recent times than you do from older times. The high density of recent backups ensures that you
107
+ have a higher likelihood of having a backup from very soon before the data became corrupt/whatever and needed to be restored.
108
+
109
+ For example, if you have backups for every hour in the last 24 hours, and you discover that something got deleted/corrupted/etc. at 23:15 last night,
110
+ then if you pull the most recent backup, say the 23:00 backup, it will be at most an hour before the time of the problem. So the amount of data that
111
+ is lost is at most 1 hour's worth.
112
+
113
+ As you go further back in time, however, you are less likely to need any backups from that time. But you may want to keep them around for historical
114
+ or statistical or "just in case there is a subtle problem that we don't find out about until 6 months later and we need to be able to go back and
115
+ determine how it happened and restore a certain chunk of data from that old pre-problem backup."
116
+
117
+ Well, this tool will help you to keep around SOME old copies, without keeping around as many as you keep of the more recent ones.
118
+
119
+ End
120
+ exit 0
121
+ end
122
+
123
+
124
+
125
+ quotas = {}
126
+
127
+ help if ARGV.empty?
128
+
129
+ opts = GetoptLong.new(
130
+ [ '--minutely', GetoptLong::REQUIRED_ARGUMENT ],
131
+ [ '--hourly', GetoptLong::REQUIRED_ARGUMENT ],
132
+ [ '--daily', GetoptLong::REQUIRED_ARGUMENT ],
133
+ [ '--weekly', GetoptLong::REQUIRED_ARGUMENT ],
134
+ [ '--monthly', GetoptLong::REQUIRED_ARGUMENT ],
135
+ [ '--yearly', GetoptLong::REQUIRED_ARGUMENT ],
136
+ [ '--get-time-from', GetoptLong::REQUIRED_ARGUMENT ],
137
+ [ '--last', '--latest', GetoptLong::NO_ARGUMENT ],
138
+ [ '--first', '--earliest', GetoptLong::NO_ARGUMENT ],
139
+ [ '--force', '-f', GetoptLong::NO_ARGUMENT ],
140
+ [ '--no-color', GetoptLong::NO_ARGUMENT ],
141
+ [ '--time-format', GetoptLong::REQUIRED_ARGUMENT ],
142
+ [ '--pattern', GetoptLong::REQUIRED_ARGUMENT ],
143
+ [ '--exclude', GetoptLong::REQUIRED_ARGUMENT ],
144
+ [ '--now', GetoptLong::REQUIRED_ARGUMENT ], [ '--help', GetoptLong::NO_ARGUMENT ] # --help must be declared or GetoptLong rejects it
145
+ )
146
+
147
+ opts.each do | opt, arg |
148
+ case opt
149
+
150
+ when '--help'
151
+ help
152
+
153
+ when '--force', '-f'
154
+ ThinOutBackups::Command.force = true
155
+
156
+ when '--no-color'
157
+ ThinOutBackups::Command.color = false
158
+
159
+ when *ThinOutBackups::Command.options.map {|o| "--#{o.to_s.gsub(/_/, '-')}"}
160
+ name = opt.gsub(/^--/, '').gsub(/-/, '_').to_sym # '--get-time-from' => :get_time_from
161
+ ThinOutBackups::Command.send("#{name}=", arg)
162
+
163
+ when *ThinOutBackups::Command.allowed_bucket_names.map {|o| "--#{o}"}
164
+ name = opt.gsub(/^--/, '').to_sym
165
+ if arg == '*'
166
+ #quotas[name] = :all
167
+ quotas[name] = '*'
168
+ else
169
+ quotas[name] = arg.to_i
170
+ end
171
+ end
172
+ end
173
+
174
+ dirs = ARGV
175
+ raise "Must specify at least one directory" if dirs.empty?
176
+ dirs.each do |dir|
177
+ command = ThinOutBackups::Command.new(dir, quotas)
178
+ command.run
179
+ end
180
+
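The help text above asks for a timestamp inside each filename. As a rough illustration, the sketch below shows how a backup script might name its files and how such a name could be read back. `backup_name` and `time_from_name` are hypothetical helpers, not part of the gem; the pattern is modelled on the default `@@time_format` regexp in lib/thin_out_backups.rb, and the sketch uses UTC for predictability whereas the gem builds local times with `Time.mktime`.

```ruby
# Hypothetical helper: build a backup filename that carries its own timestamp
# in %Y%m%dT%H%M%S form.
def backup_name(prefix, time, ext)
  "#{prefix}_#{time.strftime('%Y%m%dT%H%M%S')}.#{ext}"
end

# Hypothetical helper: recover the Time from such a filename, or nil when the
# name contains no recognizable timestamp.
def time_from_name(name)
  # Date, a literal "T", hours and minutes; seconds are optional.
  pattern = /(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})?/
  m = pattern.match(name)
  return nil unless m
  Time.utc(*m.captures.map(&:to_i)) # a missing seconds group becomes 0
end

name = backup_name('db_dump', Time.utc(2008, 11, 12, 3, 3, 0), 'sql')
puts name
puts time_from_name(name).inspect
puts time_from_name('db_dump_2008-11-12T0303.sql').inspect
```

Note that a name with dashes in the date, such as the last one, does not match this pattern, so a file named that way would be treated as having no timestamp.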
data/lib/thin_out_backups/time_fixes.rb ADDED
@@ -0,0 +1,131 @@
1
+ # Until facets gets patched proper, we'll monkey patch it here:
2
+ # TODO: Is this still needed?
3
+ #require '/home/tyler/dev/ruby/facets/lib/core/facets/time/hence'
4
+ class Time
5
+
6
+ if defined?(::ActiveSupport)
7
+
8
+ alias_method :in, :since
9
+ alias_method :hence, :since
10
+
11
+ else
12
+
13
+ # Returns a new Time representing the time
14
+ # a number of time-units ago.
15
+ #
16
+ def ago(number, units=:seconds)
17
+ time =(
18
+ case units.to_s.downcase.to_sym
19
+ when :years
20
+ set(:year => (year - number))
21
+ when :months
22
+ # Integer#div floors, so y is negative whenever we cross back over a year boundary.
23
+ y = (month - number - 1).div(12)
24
+
25
+ new_month = ((month - number - 1) % 12) + 1
26
+
27
+ #puts y
28
+
29
+ set(:year => (year + y), :month => new_month)
30
+ when :weeks
31
+ self - (number * 604800)
32
+ when :days
33
+ self - (number * 86400)
34
+ when :hours
35
+ self - (number * 3600)
36
+ when :minutes
37
+ self - (number * 60)
38
+ when :seconds, nil
39
+ self - number
40
+ else
41
+ raise ArgumentError, "unrecognized time units -- #{units}"
42
+ end
43
+ )
44
+ dst_adjustment(time)
45
+ end
46
+ #
47
+ # Returns a new Time representing the time
48
+ # a number of time-units hence.
49
+
50
+ def hence(number, units=:seconds)
51
+ time =(
52
+ case units.to_s.downcase.to_sym
53
+ when :years
54
+ set( :year=>(year + number) )
55
+ when :months
56
+ y = ((month + number - 1) / 12).to_i
57
+ m = ((month + number - 1) % 12) + 1
58
+ set(:year => (year + y), :month => m)
59
+ when :weeks
60
+ self + (number * 604800)
61
+ when :days
62
+ self + (number * 86400)
63
+ when :hours
64
+ self + (number * 3600)
65
+ when :minutes
66
+ self + (number * 60)
67
+ when :seconds
68
+ self + number
69
+ else
70
+ raise ArgumentError, "unrecognized time units -- #{units}"
71
+ end
72
+ )
73
+ dst_adjustment(time)
74
+ end
75
+
76
+ alias_method :in, :hence
77
+ alias_method :since, :hence
78
+
79
+ # Adjust DST
80
+ #
81
+ # TODO: Can't seem to get this to pass ActiveSupport tests.
82
+ # Even though it is essentially identical to the
83
+ # ActiveSupport code (see Time#since in time/calculations.rb).
84
+ # It handles all but 4 tests.
85
+ def dst_adjustment(time)
86
+ self_dst = self.dst? ? 1 : 0
87
+ time_dst = time.dst? ? 1 : 0
88
+ seconds = (self - time).abs
89
+ if (seconds >= 86400 && self_dst != time_dst)
90
+ time + ((self_dst - time_dst) * 60 * 60)
91
+ else
92
+ time
93
+ end
94
+ end
95
+
96
+ end
97
+
98
+ end
99
+
100
+
101
+ class DateTime
102
+ def to_time
103
+ Time.mktime(year, month, day, hour, min, sec)
104
+ end
105
+ end
106
+
107
+ class Time
108
+ def to_s
109
+ #strftime("%Y%m%dT%H%M%S")
110
+ strftime("%Y-%m-%d %H:%M")
111
+ end
112
+ def to_s_full
113
+ strftime("%Y-%m-%d %H:%M:%S")
114
+ end
115
+
116
+ # /usr/lib/ruby/gems/1.8/gems/activesupport-2.1.2/lib/active_support/core_ext/time/calculations.rb
117
+
118
+ def beginning_of_day
119
+ change(:hour => 0)
120
+ end
121
+ alias :midnight :beginning_of_day
122
+
123
+ # Returns a new Time representing the "start" of this week (Sunday, 0:00)
124
+ def beginning_of_week
125
+ days_to_sunday = self.wday
126
+ self.ago(days_to_sunday, :days).midnight
127
+ end
128
+ alias :sunday :beginning_of_week
129
+
130
+ end
131
+
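The month arithmetic in `Time#ago` and `Time#hence` is easy to get wrong around year boundaries. One way to reason about it, sketched here without touching `Time`, is to convert the date to an absolute month index and let floored division do the work. `months_ago` is a hypothetical helper used only for illustration.

```ruby
# Hypothetical helper: step back `number` calendar months from (year, month).
# Months are converted to a zero-based absolute index so that Integer#divmod
# (which floors) handles year boundaries for any number of months.
def months_ago(year, month, number)
  index = year * 12 + (month - 1) - number
  new_year, zero_based_month = index.divmod(12)
  [new_year, zero_based_month + 1]
end

p months_ago(2008, 12, 1)  # => [2008, 11]
p months_ago(2008, 1, 1)   # => [2007, 12]
p months_ago(2008, 3, 5)   # => [2007, 10]
p months_ago(2008, 3, 15)  # => [2006, 12]
```

Passing a negative `number` steps forward instead, which corresponds to the `hence` direction.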
data/lib/thin_out_backups/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module ThinOutBackups
2
+ Version = "0.0.1"
3
+ end
data/lib/thin_out_backups.rb ADDED
@@ -0,0 +1,285 @@
1
+ require "thin_out_backups/version"
2
+
3
+ # Tested by: spec/thin_out_backups_spec.rb
4
+
5
+ require 'fileutils'
6
+ require 'pathname'
7
+ require 'delegate'
8
+
9
+ #require 'rubygems'
10
+ require 'facets/time'
11
+ require 'colored'
12
+ require 'quality_extensions/module/attribute_accessors'
13
+ require 'thin_out_backups/time_fixes'
14
+
15
+ class ThinOutBackups::Command
16
+ #---------------------------------------------------------------------------------------------------------------------------------------------------
17
+
18
+ @@allowed_bucket_names = [:minutely, :hourly, :daily, :weekly, :monthly, :yearly]
19
+ mattr_reader :allowed_bucket_names
20
+
21
+ #---------------------------------------------------------------------------------------------------------------------------------------------------
22
+ # Options
23
+ @@options = [:get_time_from, :ignore_files, :verbosity, :time_format, :now, :force, :color]
24
+ mattr_reader :options
25
+ mattr_accessor *@@options
26
+
27
+ @@get_time_from = :filename
28
+ def self.get_time_from=(new)
29
+ @@get_time_from = new.to_sym
30
+ raise "Unknown value for --get-time-from: #{new}" unless [:filename, :file_system].include?(@@get_time_from)
31
+ end
32
+
33
+ @@ignore_files = nil
34
+ @@verbosity = 1
35
+ @@force = false
36
+ @@color = true
37
+
38
+ @@now = Time.now
39
+ def self.now=(new)
40
+ time = DateTime.strptime(new, '%Y-%m-%d %H:%M:%S').to_time
41
+ @@now = time
42
+ puts "Using alternate now: #{@@now}"
43
+ end
44
+
45
+ @@time_format = /(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})?/
46
+ @@time_format_parts = [:Y,:m,:d, :H,:M,:S]
47
+ # TODO: Maybe use something like this for interpreting time, rather than a regexp? DateTime.strptime("27/Nov/2007:15:01:43 -0800", "%d/%b/%Y:%H:%M:%S %z")
48
+ def self.time_format=(new)
49
+ # TODO: accept format strings such as 'H:M:S d.m.Y.' and 'Y-m-d H:M:S'
50
+ # TODO: do error checking
51
+ end
52
+
53
+
54
+ #---------------------------------------------------------------------------------------------------------------------------------------------------
55
+
56
+ class Bucket
57
+ attr_reader :parent, :name, :quota, :keepers
58
+ @@quota_format = %r[(\d+|\*)(/\d+)?]
59
+
60
+ def initialize(parent, name, quota)
61
+ @parent = parent
62
+ @name = name
63
+ (
64
+ raise "Invalid quota '#{quota}'" unless quota.is_a?(Integer) || quota =~ @@quota_format
65
+ @quota = quota
66
+ )
67
+ @keepers = []
68
+ end
69
+
70
+ def unit
71
+ {
72
+ :minutely => :minutes,
73
+ :hourly => :hours,
74
+ :daily => :days,
75
+ :weekly => :weeks,
76
+ :monthly => :months,
77
+ :yearly => :years,
78
+ }[@name.to_sym]
79
+ end
80
+
81
+ def start_time
82
+ start_time = parent.now.dup
83
+ if parent.align_at_beginning_of_time_interval
84
+ beginning_of_interval =
85
+ case unit
86
+ when :minutes
87
+ start_time.change(:sec => 0)
88
+ when :hours
89
+ start_time.change(:min => 0)
90
+ when :days
91
+ start_time.change(:hour => 0)
92
+ when :weeks
93
+ start_time.beginning_of_week
94
+ when :months
95
+ start_time.change( :day => 1, :hour => 0)
96
+ when :years
97
+ start_time.change(:month => 1, :day => 1, :hour => 0)
98
+ else
99
+ raise "unexpected unit #{unit}"
100
+ end
101
+ # We actually want to use the *next* interval (in the future) as our start_time because we will be using this as our max and going backwards in time...
102
+ beginning_of_interval.hence(1, unit)
103
+ else
104
+ start_time
105
+ end
106
+ end
107
+
108
+ def keep(keeper)
109
+ @keepers << keeper
110
+ end
111
+
112
+ def still_need
113
+ if keep_all?
114
+ 1 # it has insatiable hunger for keepers that can never be satisfied ... always just 1 more ...
115
+ else
116
+ @quota - @keepers.size
117
+ end
118
+ end
119
+
120
+ def keep_all?; quota.to_s.include?('*') end
121
+
122
+ def satisfied?
123
+ if keep_all?
124
+ false # it has insatiable hunger for keepers that can never be satisfied
125
+ else
126
+ still_need == 0
127
+ end
128
+ end
129
+ end
130
+
131
+ class File < DelegateClass(::Pathname)
132
+ attr_reader :filename, :file
133
+
134
+ def initialize(filename)
135
+ super(Pathname.new(filename))
136
+ end
137
+
138
+ def full_path
139
+ dirname.to_s + '/' + filename
140
+ end
141
+
142
+ def filename
143
+ basename.to_s
144
+ end
145
+ def to_s
146
+ filename
147
+ end
148
+
149
+ def time
150
+ if ThinOutBackups::Command.get_time_from == :filename
151
+ if filename =~ ThinOutBackups::Command.time_format
152
+ y,m,d, h,i,s = $1,$2,$3, $4,$5,$6
153
+ Time.mktime(y,m,d, h,i,s)
154
+ else
155
+ nil
156
+ end
157
+ elsif ThinOutBackups::Command.get_time_from == :file_system
158
+ mtime # delegated to Pathname; the `file` reader is never assigned
159
+ else
160
+ raise "Unknown value for #{ThinOutBackups::Command.get_time_from}"
161
+ end
162
+ end
163
+
164
+ def has_time?; !!time end
165
+ def ignored?
166
+ !has_time? or
167
+ ThinOutBackups::Command.ignore_files && filename =~ ThinOutBackups::Command.ignore_files
168
+ end
169
+ end
170
+
171
+ attr_reader :align_at_beginning_of_time_interval, :files_with_times
172
+ attr_accessor :dir
173
+
174
+ def initialize(dir, quotas)
175
+ @align_at_beginning_of_time_interval = true
176
+ @dir = dir
177
+ @buckets = {}
178
+ @@allowed_bucket_names.each do |name|
179
+ quota = quotas[name]
180
+ @buckets[name] = Bucket.new(self, name, quota) unless quota.nil?
181
+ end
182
+
183
+ puts "Processing #{@dir}/*".magenta
184
+ files = Dir["#{@dir}/*"].map { |filename|
185
+ file = File.new(filename)
186
+ }
187
+ @files_with_times = files.
188
+ reject {|file| !file.has_time?}.
189
+ sort { |a, b|
190
+ a.time <=> b.time
191
+ }.
192
+ reverse
193
+
194
+ end
195
+
196
+ def bucket_remaining(bucket_name, decr = nil)
197
+ send("#{bucket_name}=", send("#{bucket_name}") - decr) if decr
198
+ send "#{bucket_name}"
199
+ end
200
+
201
+ def buckets
202
+ @buckets.values
203
+ end
204
+ def bucket(name)
205
+ @buckets[name] or raise "unknown bucket '#{name}'"
206
+ end
207
+
208
+ def now
209
+ @@now # honours --now; defaults to the time the program was loaded
210
+ end
211
+
212
+ def delete_non_keepers
213
+ #raise "Didn't find any files to keep?!" unless keepers.any?
214
+ files_with_times.each do |file|
215
+ if (buckets = buckets_with_file(file)).any?
216
+ puts "#{file.full_path}: in buckets: #{buckets.map(&:name).join(', ')}".green
217
+ else
218
+ puts "#{file.full_path}: delete".red
219
+ end
220
+ end
221
+
222
+ if @@force == false
223
+ print "Continue with deletions? (yes or no) >".magenta
224
+ response = STDIN.gets
225
+ (puts "Aborting"; return) unless response.chomp.downcase == 'yes'
226
+ end
227
+
228
+ files_with_times.each do |file|
229
+ if (buckets = buckets_with_file(file)).any?
230
+ #
231
+ else
232
+ file.rmtree # rm -rf semantics: a backup may be a directory
233
+ end
234
+ end
235
+ end
236
+
237
+ def buckets_with_file(file)
238
+ buckets.find_all {|bucket| bucket.keepers.include?(file)}
239
+ end
240
+
241
+ def delete(files)
242
+ puts "Deleting files: #{files.join(', ')}..."
243
+ end
244
+
245
+ def earliest_file_time
246
+ files_with_times.last.time
247
+ end
248
+
249
+ def run
250
+ raise "Must keep at least 1 file from at least one time bucket" if buckets.empty?
251
+ (puts "Found no files with times! Aborting."; return) if files_with_times.empty?
252
+
253
+ # Fill each bucket until its quota is met
254
+ buckets.each do |bucket|
255
+ puts "Trying to fill bucket '#{bucket.name}' (quota: #{bucket.quota})...".magenta
256
+
257
+ time_max = bucket.start_time
258
+ time_min = time_max.ago(1, bucket.unit)
259
+
260
+ #puts "Earliest_file_time: #{earliest_file_time}"
261
+ while time_max > earliest_file_time
262
+ print "Checking range (#{time_min} .. #{time_max})... ".yellow if verbosity >= 1
263
+ new_keeper = files_with_times.detect {|file|
264
+ #print "#{file}? "
265
+ time_min <= file.time &&
266
+ file.time < time_max
267
+ }
268
+ if new_keeper
269
+ puts "found keeper #{new_keeper}".green if verbosity >= 1
270
+ bucket.keep new_keeper
271
+ else
272
+ #puts "found no keepers".red if verbosity >= 1
273
+ puts "" if verbosity >= 1
274
+ end
275
+
276
+ time_max = time_min
277
+ #puts "Stepping back from #{time_min} by 1 #{bucket.unit} => #{time_min.ago(1, bucket.unit)}"
278
+ time_min = time_min.ago(1, bucket.unit)
279
+ (puts 'Filled quota!'.green; break) if bucket.satisfied?
280
+ end
281
+ end
282
+
283
+ delete_non_keepers
284
+ end
285
+ end
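The bucket-filling loop in `run` can be hard to follow with the logging and colour calls mixed in. The sketch below is a simplified, hypothetical model of a single bucket: it uses fixed-length steps in seconds (no calendar months, no DST adjustment) and plain `Time` objects instead of files. `fill_bucket` is an illustrative name, not a method of the gem.

```ruby
# Hypothetical, simplified model of one bucket: walk backwards from start_time
# in fixed steps and keep the newest timestamp found in each step, until
# `quota` keepers are collected or we have stepped past the oldest timestamp.
def fill_bucket(times, start_time, step_seconds, quota)
  return [] if times.empty?
  newest_first = times.sort.reverse
  oldest = newest_first.last
  keepers = []
  time_max = start_time
  time_min = time_max - step_seconds
  while time_max > oldest && keepers.size < quota
    keeper = newest_first.find { |t| time_min <= t && t < time_max }
    keepers << keeper if keeper
    time_max = time_min
    time_min -= step_seconds
  end
  keepers
end

day   = 86_400
times = (8..12).map { |d| Time.utc(2008, 11, d, 3, 3) } # one backup per day

# As in Bucket#start_time, start from the beginning of the *next* interval.
daily  = fill_bucket(times, Time.utc(2008, 11, 13), day, 3)
weekly = fill_bucket(times, Time.utc(2008, 11, 16), 7 * day, Float::INFINITY) # like --weekly=*

kept    = (daily + weekly).uniq
deleted = times - kept
puts "keep:   #{kept.sort.map { |t| t.strftime('%m-%d') }.join(', ')}"
puts "delete: #{deleted.map { |t| t.strftime('%m-%d') }.join(', ')}"
```

Because the same timestamp may satisfy more than one bucket, the keepers are merged with `uniq` before anything is considered for deletion. In this example only the 9 November backup ends up outside every bucket.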
data/spec/thin_out_backups_spec.rb ADDED
@@ -0,0 +1,137 @@
1
+ require 'tmpdir'
2
+ #require 'rubygems'
3
+ require 'rspec'
4
+ require 'facets'
5
+ require_relative '../lib/thin_out_backups'
6
+
7
+ $now = Time.utc(2008,11,12, 7,45,19)
8
+
9
+ describe Time, "#beginning_of_week" do
10
+ it "should return a Sunday" do
11
+ Time.utc(2008,11,12).beginning_of_week.should == Time.utc(2008,11,9)
12
+ end
13
+ end
14
+
15
+ describe ThinOutBackups::Command::Bucket, "time interval alignment" do
16
+
17
+ def sample_quotas
18
+ {
19
+ :minutely => 1,
20
+ :hourly => 3,
21
+ :daily => 1,
22
+ :weekly => '*',
23
+ :monthly => 1,
24
+ :yearly => '*'
25
+ }
26
+ end
27
+
28
+ before do
29
+ @command = ThinOutBackups::Command.new('bogus_dir', sample_quotas)
30
+ @now = $now
31
+ @command.stub!(:now).and_return(@now)
32
+ end
33
+
34
+ it "should use the time specified by our test" do
35
+ @command.now.should == @now
36
+ end
37
+
38
+ it "hour interval should start on the hour, etc." do
39
+ @command.bucket(:minutely).start_time.should == Time.utc(2008,11,12, 7,46,0)
40
+ @command.bucket(:hourly). start_time.should == Time.utc(2008,11,12, 8,0,0)
41
+ @command.bucket(:daily). start_time.should == Time.utc(2008,11,13, 0,0,0)
42
+ @command.bucket(:weekly). start_time.should == Time.utc(2008,11,16, 0,0,0)
43
+ @command.bucket(:monthly). start_time.should == Time.utc(2008,12, 1, 0,0,0)
44
+ @command.bucket(:yearly). start_time.should == Time.utc(2009, 1, 1, 0,0,0)
45
+ end
46
+
47
+ end
48
+
49
+
50
+ $command = <<End
51
+ thin_out_backups --force --daily=3 --weekly=3 --monthly=* \
52
+ --now='#{$now.to_s_full}'\
53
+ spec/test_dir/db_dumps \
54
+ spec/test_dir/maildir
55
+ End
56
+ describe ThinOutBackups::Command, "when calling `#{$command}`" do
57
+ before do
58
+ Pathname.new("spec/test_dir/").rmtree rescue nil
59
+
60
+ dir='spec/test_dir/db_dumps/'
61
+ system "mkdir -p #{dir}"
62
+ files = %w[
63
+ db_dump_2008-08-08T0303.sql
64
+ db_dump_2008-09-01T0303.sql
65
+ db_dump_2008-09-10T0303.sql
66
+ db_dump_2008-10-15T0303.sql
67
+ db_dump_2008-10-16T0303.sql
68
+ db_dump_2008-10-17T0303.sql
69
+ db_dump_2008-10-18T0303.sql
70
+ db_dump_2008-10-19T0303.sql
71
+ db_dump_2008-10-20T0303.sql
72
+ db_dump_2008-10-21T0303.sql
73
+ db_dump_2008-10-22T0303.sql
74
+ db_dump_2008-10-23T0303.sql
75
+ db_dump_2008-10-24T0303.sql
76
+ db_dump_2008-10-25T0303.sql
77
+ db_dump_2008-10-26T0303.sql
78
+ db_dump_2008-10-27T0303.sql
79
+ db_dump_2008-10-28T0303.sql
80
+ db_dump_2008-10-29T0303.sql
81
+ db_dump_2008-10-30T0303.sql
82
+ db_dump_2008-10-31T0303.sql
83
+ db_dump_2008-11-01T0303.sql
84
+ db_dump_2008-11-02T0303.sql
85
+ db_dump_2008-11-03T0303.sql
86
+ db_dump_2008-11-04T0303.sql
87
+ db_dump_2008-11-05T0303.sql
88
+ db_dump_2008-11-06T0303.sql
89
+ db_dump_2008-11-07T0303.sql
90
+ db_dump_2008-11-08T0303.sql
91
+ db_dump_2008-11-09T0303.sql
92
+ db_dump_2008-11-10T0303.sql
93
+ db_dump_2008-11-11T0303.sql
94
+ db_dump_2008-11-12T0303.sql
95
+ ]
96
+ files.each do |file|
97
+ Dir.getwd
98
+ #puts %(Dir.getwd=#{(Dir.getwd).inspect})
99
+ #puts %("touch #{dir}/#{file}"=#{("touch #{dir}/#{file}").inspect})
100
+ system "touch #{dir}/#{file}"
101
+ end
102
+
103
+ dir='spec/test_dir/maildir/'
104
+ system "mkdir -p #{dir}"
105
+ subdirs = %w[
106
+ 2008-11-09T0303
107
+ 2008-11-10T0303
108
+ 2008-11-11T0303
109
+ ]
110
+ subdirs.each do |subdir|
111
+ system "mkdir -p #{dir}/#{subdir}"
112
+ system "touch #{dir}/#{subdir}/inbox"
113
+ system "touch #{dir}/#{subdir}/some_other_folder"
114
+ end
115
+
116
+ puts %($command=#{($command).inspect})
117
+ #system $command
118
+ # TODO: also capture output of command and check it against expected
119
+ end
120
+
121
+ it "keeps/removes the correct files" do
122
+ Dir['spec/test_dir/db_dumps/*'].should =~
123
+ ["spec/test_dir/db_dumps/db_dump_2008-11-12T0303.sql",
124
+ "spec/test_dir/db_dumps/db_dump_2008-11-08T0303.sql",
125
+ "spec/test_dir/db_dumps/db_dump_2008-10-31T0303.sql",
126
+ "spec/test_dir/db_dumps/db_dump_2008-11-10T0303.sql",
127
+ "spec/test_dir/db_dumps/db_dump_2008-08-08T0303.sql",
128
+ "spec/test_dir/db_dumps/db_dump_2008-11-01T0303.sql",
129
+ "spec/test_dir/db_dumps/db_dump_2008-09-10T0303.sql",
130
+ "spec/test_dir/db_dumps/db_dump_2008-11-11T0303.sql"]
131
+ end
132
+
133
+ after do
134
+ #Pathname.new("spec/test_dir/").rmtree rescue nil
135
+ end
136
+ end
137
+
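The spec above builds its fixtures under spec/test_dir by shelling out to `mkdir` and `touch`. A possible alternative, shown here only as a sketch, is to create the fixtures in a throwaway temporary directory with the standard library. `with_fixture_dir` is a hypothetical helper, and the filenames are illustrative.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical spec helper: create empty fixture files inside a fresh
# temporary directory, yield the directory path, and remove it afterwards.
def with_fixture_dir(names)
  Dir.mktmpdir('thin_out_backups_spec') do |dir|
    names.each { |name| FileUtils.touch(File.join(dir, name)) }
    yield dir
  end
end

names = %w[db_dump_20081110T0303.sql db_dump_20081111T0303.sql]
tmp_path = nil
listing = with_fixture_dir(names) do |dir|
  tmp_path = dir
  Dir.children(dir).sort
end

puts listing.inspect
puts "cleaned up: #{!Dir.exist?(tmp_path)}"
```

`Dir.mktmpdir` with a block removes the directory when the block returns, so no `after` hook is needed for cleanup.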
data/thin_out_backups.gemspec ADDED
@@ -0,0 +1,23 @@
1
+ # -*- encoding: utf-8 -*-
2
+ require File.expand_path('../lib/thin_out_backups/version', __FILE__)
3
+
4
+ Gem::Specification.new do |gem|
5
+ gem.authors = ["Tyler Rick"]
6
+ gem.email = ["github.com@tylerrick.com"]
7
+ gem.summary = %q{Thin out a directory full of backups, only keeping a specified number from each category (weekly, daily, etc.), and deleting the rest.}
8
+ gem.description = gem.summary
9
+ gem.homepage = ""
10
+
11
+ gem.files = `git ls-files`.split($\)
12
+ gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
13
+ gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
14
+ gem.name = "thin_out_backups"
15
+ gem.require_paths = ["lib"]
16
+ gem.version = ThinOutBackups::Version
17
+
18
+ gem.add_dependency 'facets'
19
+ gem.add_dependency 'colored'
20
+ gem.add_dependency 'quality_extensions'
21
+
22
+ gem.add_development_dependency 'rspec'
23
+ end
metadata ADDED
@@ -0,0 +1,116 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: thin_out_backups
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.0.1
5
+ platform: ruby
6
+ authors:
7
+ - Tyler Rick
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2013-03-27 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: facets
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - '>='
18
+ - !ruby/object:Gem::Version
19
+ version: '0'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - '>='
25
+ - !ruby/object:Gem::Version
26
+ version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: colored
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - '>='
32
+ - !ruby/object:Gem::Version
33
+ version: '0'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - '>='
39
+ - !ruby/object:Gem::Version
40
+ version: '0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: quality_extensions
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - '>='
46
+ - !ruby/object:Gem::Version
47
+ version: '0'
48
+ type: :runtime
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - '>='
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: rspec
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - '>='
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - '>='
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ description: Thin out a directory full of backups, only keeping a specified number
70
+ from each category (weekly, daily, etc.), and deleting the rest.
71
+ email:
72
+ - github.com@tylerrick.com
73
+ executables:
74
+ - thin_out_backups
75
+ extensions: []
76
+ extra_rdoc_files: []
77
+ files:
78
+ - .gitignore
79
+ - .rspec
80
+ - Gemfile
81
+ - License
82
+ - Rakefile
83
+ - Readme.md
84
+ - bin/thin_out_backups
85
+ - lib/thin_out_backups.rb
86
+ - lib/thin_out_backups/time_fixes.rb
87
+ - lib/thin_out_backups/version.rb
88
+ - spec/thin_out_backups_spec.rb
89
+ - thin_out_backups.gemspec
90
+ homepage: ''
91
+ licenses: []
92
+ metadata: {}
93
+ post_install_message:
94
+ rdoc_options: []
95
+ require_paths:
96
+ - lib
97
+ required_ruby_version: !ruby/object:Gem::Requirement
98
+ requirements:
99
+ - - '>='
100
+ - !ruby/object:Gem::Version
101
+ version: '0'
102
+ required_rubygems_version: !ruby/object:Gem::Requirement
103
+ requirements:
104
+ - - '>='
105
+ - !ruby/object:Gem::Version
106
+ version: '0'
107
+ requirements: []
108
+ rubyforge_project:
109
+ rubygems_version: 2.0.0
110
+ signing_key:
111
+ specification_version: 4
112
+ summary: Thin out a directory full of backups, only keeping a specified number from
113
+ each category (weekly, daily, etc.), and deleting the rest.
114
+ test_files:
115
+ - spec/thin_out_backups_spec.rb
116
+ has_rdoc: