thin_out_backups 0.0.1

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 71d687dd4b0eb09b150c37d7d4567d134eb21cc6
4
+ data.tar.gz: 10a0c8c8eb6fa2a090fff6569720ced62fafa997
5
+ SHA512:
6
+ metadata.gz: ed27167bae6ab29f0466127064500f964288dce9e71fc7d6836623b871b437368014d0800084b5106131f567d7be4ffdcab994f02ed91d382b2e091ff214da92
7
+ data.tar.gz: e89661fc8e6f5e2e41226f8f746c573775cb7dae69434a9ad8ba75465a3a3c97b7b269062806a08c828b88b45cb65983a1883422f10bb730f4a46b08d187f1d9
data/.gitignore ADDED
@@ -0,0 +1,17 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ Gemfile.lock
7
+ InstalledFiles
8
+ _yardoc
9
+ coverage
10
+ doc/
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
data/.rspec ADDED
@@ -0,0 +1 @@
1
+ --color
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in thin_out_backups.gemspec
4
+ gemspec
data/License ADDED
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2012 Tyler Rick
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/Rakefile ADDED
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env rake
2
+ require "bundler/gem_tasks"
data/Readme.md ADDED
@@ -0,0 +1,43 @@
1
+ # thin_out_backups
2
+
3
+ Quickly and safely thin out a backups directory that's taking up too much hard disk space!
4
+
5
+ `thin_out_backups` will keep the specified number of backups in each frequency category (weekly,
6
+ daily, etc.) and delete the rest, keeping the space requirements of your backups directory fairly
7
+ constant over time.
8
+
9
+ The files that you are thinning out don't have to be backups, but that is probably the most common
10
+ use case.
11
+
12
+ ## Installation
13
+
14
+ $ gem install thin_out_backups
15
+
16
+ ## Usage
17
+
18
+ $ thin_out_backups
19
+
20
+
21
+
22
+ ## License
23
+
24
+ Copyright 2008, 2012 Tyler Rick
25
+
26
+ Released under the MIT license. See License file.
27
+
28
+ ## Contributing
29
+
30
+ 1. Fork it
31
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
32
+ 3. Commit your changes (`git commit -am 'Added some feature'`)
33
+ 4. Push to the branch (`git push origin my-new-feature`)
34
+ 5. Create new Pull Request
35
+
36
+ ## Other names considered
37
+
38
+ * thin_out_backup_dir
39
+ * sparsen_dir
40
+ * sparsify_dir
41
+ * [rm_extra_copies](https://github.com/TylerRick/command-line/blob/master/bin/rm_extra_copies) (previously published under this name)
42
+ * prune_backups
43
+ * trim_dir
data/bin/thin_out_backups ADDED
@@ -0,0 +1,180 @@
1
+ #!/usr/bin/env ruby
2
+ # vim: textwidth=150
3
+
4
+ =begin To Do
5
+ Make it possible to run on a capistrano releases directory to only keep the n latest releases and delete the rest (just like cap deploy:cleanup). So you don't end up with a pile of 200 releases in that directory...
6
+
7
+ Add --duplicates/--dups option, that when present deletes all duplicate files in a directory
8
+
9
+ Add --max-size option: keeps deleting until the du -s #{dir} reports that it is beneath that threshold.
10
+ What order should it delete? Oldest first?
11
+ After each deletion, checks if it's reached (below) max size yet (just check filesize before deleting and then subtract that from a running total -- faster than doing a du over and over).
12
+ Before any other deletes, it should search for/delete duplicates. If --duplicates option present, will do the search even if already below max size.
13
+ =end
14
+
15
+ require 'getoptlong'
16
+ #require 'rubygems'
17
+ #require 'facets'
18
+ require 'thin_out_backups'
19
+
20
+ def help
21
+ puts <<End
22
+ Synopsis:
23
+ ---------
24
+
25
+ thin_out_backups [options] dirs...
26
+
27
+ Example:
28
+ thin_out_backups --daily 5 --monthly 12 /path/to/backups
29
+
30
+
31
+ thin_out_backups will keep the specified number of time-stamped copies from each "time bucket" and remove all older copies.
32
+
33
+ So even if you don't actually make any weekly backups, as long as you make daily or hourly backups, you will still *have* weekly backups which
34
+ you can keep. This will let you keep, for example, *all* the weekly backups but only 6 days worth of daily backups.
35
+
36
+ When you specify --weekly=3, it means go back for 3 weeks and keep one backup from each week.
37
+ So it will keep the most recent file from the current week, the most recent file from the previous week, etc.
38
+
39
+ Use a * instead of a number to go back all the way to the oldest [week] represented in that directory and keep 1 copy from each [week] visited
40
+ along the way. Note that it usually only makes sense to do this for the largest time interval being used. (It wouldn't make any sense to say
41
+ --hourly=* --daily=3 because the --daily=3 will cause no files to be kept that wouldn't already be kept due to the --hourly.)
42
+
43
+ The same file may satisfy more than 1 quota. So if you set them all to 1, they will all be satisfied by the same file: the most recent backup in the
44
+ directory. For that reason, 2 is probably the minimum you'd want to set any of these to.
45
+
46
+
47
+ Options:
48
+ --------
49
+ --help Print this message
50
+ --minutely/--hourly/--daily/--weekly/--monthly/--yearly quota/frequency
51
+ Keep n number of [daily] backups or * to keep all
52
+ --<custom-interval>=<n>
53
+ --force Don't ask for confirmation before deleting. Use this if you put this command in a crontab and you are very sure you know
54
+ what it will do.
55
+ --now Use a different date/time as now.
56
+ --get-time-from filename Takes the time from the filename. It must be present in the filename and formatted in %Y%m%dT%H%M format.
57
+ --get-time-from file_system
58
+ Uses the mtime of the file.
59
+ --no-color Don't colorize the output.
60
+ --last/--latest Take the last (latest) candidate in each time interval (default).
61
+ --first/--earliest Take the first (earliest) candidate in each time interval (not default).
62
+
63
+ By default, intervals are aligned to calendar boundaries: days start at the beginning of the day, weeks at the beginning of the week (Sunday), etc.
64
+
65
+ To do:
66
+ --identical Compares the md5 digest of each pair of files and removes any files that have the same digest (and therefore probably are identical) -- except for the oldest/newest one.
67
+ --consec-identical Same as --identical except only checks and removes consecutive/neighboring files to see if they are identical.
68
+ --align=<?>
69
+ add option to align the time ranges (days, etc.) from "now" rather than at the beginning of the day, etc. So "day" would actually be "the most
70
+ recent 24-hour period, beginning exactly 24 hours ago and ending right now".
71
+
72
+
73
+ Format of quota/frequency
74
+ -------------------------
75
+
76
+ TODO: allow crontab-style specifiers like --hourly=2/4 to keep 2 4-hourly backups, or
77
+ --hourly=*/12 to keep *all* 12-hourly backups (= keep 2 backups per day).
78
+
79
+
80
+ How your files/directories must be organized
81
+ --------------------------------------------
82
+
83
+ Each directory is expected to contain only one type of backup -- with many "copies" of that backup accumulated over time -- not a mix of backups.
84
+
85
+ These backups may be either files or directories. (rm -rf will be used to remove them)
86
+
87
+ Within each directory, this program will take the n most recently created files in each "time bucket" and retain them while pruning/deleting *all*
88
+ other files in the directory.
89
+
90
+ WARNING: If you have other stuff in that directory that you want to keep, be warned: it WILL be deleted by this script!
91
+
92
+ TODO: let you specify a glob pattern for files that WILL be pruned; and/or add an --ignore option to specify file patterns to ignore = NOT subject to pruning = not deleted
93
+
94
+ By default, the timestamp must be in the filename itself, in "%Y%m%dT%H%M%S" format (the seconds are optional).
95
+ Use --get-time-from file_system to use the modification time recorded in the file system instead.
96
+
97
+ Background
98
+ ----------
99
+
100
+ This is useful when you have a cron job that continuously creates backups and dumps them in a certain directory and you want to keep that directory
101
+ from becoming inordinately large. (Or quickly get it down to a smaller size now that it's grown extremely large and you've run out of room on the disk.)
102
+
103
+ The idea is that you are more likely to want/need a recent backup than an older backup. Probably because you hope to become aware of whatever problem
104
+ necessitates looking at/using/restoring from the backup very soon after the problem arises.
105
+
106
+ So you want to keep a higher density of backups from recent times than you do from older times. The high density of recent backups ensures that you
107
+ have a higher likelihood of having a backup from very soon before the data became corrupt/whatever and needed to be restored.
108
+
109
+ For example, if you have backups for every hour in the last 24 hours, and you discover that something got deleted/corrupted/etc. at 23:15 last night,
110
+ then if you pull the most recent backup, say the 23:00 backup, it will be at most an hour before the time of the problem. So the amount of data that
111
+ is lost is at most 1 hour's worth.
112
+
113
+ As you go further back in time, however, you are less likely to need any backups from that time. But you may want to keep them around for historical
114
+ or statistical or "just in case there is a subtle problem that we don't find out about until 6 months later and we need to be able to go back and
115
+ determine how it happened and restore a certain chunk of data from that old pre-problem backup."
116
+
117
+ Well, this tool will help you to keep around SOME old copies, without keeping around as many as you keep of the more recent ones.
118
+
119
+ End
120
+ exit 0
121
+ end
122
+
123
+
124
+
125
+ quotas = {}
126
+
127
+ help if ARGV.empty?
128
+
129
+ opts = GetoptLong.new(
130
+ [ '--help', GetoptLong::NO_ARGUMENT ], [ '--minutely', GetoptLong::REQUIRED_ARGUMENT ],
131
+ [ '--hourly', GetoptLong::REQUIRED_ARGUMENT ],
132
+ [ '--daily', GetoptLong::REQUIRED_ARGUMENT ],
133
+ [ '--weekly', GetoptLong::REQUIRED_ARGUMENT ],
134
+ [ '--monthly', GetoptLong::REQUIRED_ARGUMENT ],
135
+ [ '--yearly', GetoptLong::REQUIRED_ARGUMENT ],
136
+ [ '--get-time-from', GetoptLong::REQUIRED_ARGUMENT ],
137
+ [ '--last', '--latest', GetoptLong::NO_ARGUMENT ],
138
+ [ '--first', '--earliest', GetoptLong::NO_ARGUMENT ],
139
+ [ '--force', '-f', GetoptLong::NO_ARGUMENT ],
140
+ [ '--no-color', GetoptLong::NO_ARGUMENT ],
141
+ [ '--time-format', GetoptLong::REQUIRED_ARGUMENT ],
142
+ [ '--pattern', GetoptLong::REQUIRED_ARGUMENT ],
143
+ [ '--exclude', GetoptLong::REQUIRED_ARGUMENT ],
144
+ [ '--now', GetoptLong::REQUIRED_ARGUMENT ]
145
+ )
146
+
147
+ opts.each do | opt, arg |
148
+ case opt
149
+
150
+ when '--help'
151
+ help
152
+
153
+ when '--force', '-f'
154
+ ThinOutBackups::Command.force = true
155
+
156
+ when '--no-color'
157
+ ThinOutBackups::Command.color = false
158
+
159
+ when *ThinOutBackups::Command.options.map {|o| "--#{o.to_s.gsub(/_/, '-')}"}
160
+ name = opt.gsub(/^--/, '').to_sym
161
+ ThinOutBackups::Command.send("#{name}=", arg)
162
+
163
+ when *ThinOutBackups::Command.allowed_bucket_names.map {|o| "--#{o}"}
164
+ name = opt.gsub(/^--/, '').to_sym
165
+ if arg == '*'
166
+ #quotas[name] = :all
167
+ quotas[name] = '*'
168
+ else
169
+ quotas[name] = arg.to_i
170
+ end
171
+ end
172
+ end
173
+
174
+ dirs = ARGV
175
+ raise "Must specify at least one directory" if dirs.empty?
176
+ dirs.each do |dir|
177
+ command = ThinOutBackups::Command.new(dir, quotas)
178
+ command.run
179
+ end
180
+
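The quota handling in the option loop above can be sketched on its own. This is only an illustration, not part of the gem: `BUCKET_NAMES`, `parse_quota`, and `quotas_from` are hypothetical names, and the sketch uses the strict `Integer()` conversion where the script uses `to_i`, so a non-numeric quota raises instead of silently becoming 0.

```ruby
# Hypothetical stand-in for ThinOutBackups::Command.allowed_bucket_names.
BUCKET_NAMES = [:minutely, :hourly, :daily, :weekly, :monthly, :yearly].freeze

# Turns one command-line quota argument into '*' (keep all) or an Integer.
def parse_quota(arg)
  return '*' if arg == '*'
  Integer(arg, 10) # raises ArgumentError for input such as "abc"
end

# Builds the quotas hash that the script passes to Command.new.
def quotas_from(pairs)
  pairs.each_with_object({}) do |(name, arg), quotas|
    raise ArgumentError, "unknown bucket '#{name}'" unless BUCKET_NAMES.include?(name)
    quotas[name] = parse_quota(arg)
  end
end

# Corresponds to: thin_out_backups --daily 5 --monthly '*' /path/to/backups
p quotas_from({ daily: '5', monthly: '*' })
```

Note that `*` usually has to be quoted on the command line so that the shell does not expand it.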
data/lib/thin_out_backups/time_fixes.rb ADDED
@@ -0,0 +1,131 @@
1
+ # Until facets gets patched proper, we'll monkey patch it here:
2
+ # TODO: Is this still needed?
3
+ #require '/home/tyler/dev/ruby/facets/lib/core/facets/time/hence'
4
+ class Time
5
+
6
+ if defined?(::ActiveSupport)
7
+
8
+ alias_method :in, :since
9
+ alias_method :hence, :since
10
+
11
+ else
12
+
13
+ # Returns a new Time representing the time
14
+ # a number of time-units ago.
15
+ #
16
+ def ago(number, units=:seconds)
17
+ time =(
18
+ case units.to_s.downcase.to_sym
19
+ when :years
20
+ set(:year => (year - number))
21
+ when :months
22
+ # Count whole months since year 0 so that floor division borrows
23
+ # correctly from the year, e.g. February minus 3 months is November of the previous year.
24
+ total_months = (year * 12) + (month - 1) - number
25
+
26
+ new_year = total_months / 12
27
+ new_month = (total_months % 12) + 1
28
+
29
+ set(:year => new_year, :month => new_month)
30
+ when :weeks
31
+ self - (number * 604800)
32
+ when :days
33
+ self - (number * 86400)
34
+ when :hours
35
+ self - (number * 3600)
36
+ when :minutes
37
+ self - (number * 60)
38
+ when :seconds, nil
39
+ self - number
40
+ else
41
+ raise ArgumentError, "unrecognized time units -- #{units}"
42
+ end
43
+ )
44
+ dst_adjustment(time)
45
+ end
46
+ #
47
+ # Returns a new Time representing the time
48
+ # a number of time-units hence.
49
+
50
+ def hence(number, units=:seconds)
51
+ time =(
52
+ case units.to_s.downcase.to_sym
53
+ when :years
54
+ set( :year=>(year + number) )
55
+ when :months
56
+ y = ((month + number - 1) / 12).to_i
57
+ m = ((month + number - 1) % 12) + 1
58
+ set(:year => (year + y), :month => m)
59
+ when :weeks
60
+ self + (number * 604800)
61
+ when :days
62
+ self + (number * 86400)
63
+ when :hours
64
+ self + (number * 3600)
65
+ when :minutes
66
+ self + (number * 60)
67
+ when :seconds
68
+ self + number
69
+ else
70
+ raise ArgumentError, "unrecognized time units -- #{units}"
71
+ end
72
+ )
73
+ dst_adjustment(time)
74
+ end
75
+
76
+ alias_method :in, :hence
77
+ alias_method :since, :hence
78
+
79
+ # Adjust DST
80
+ #
81
+ # TODO: Can't seem to get this to pass ActiveSupport tests.
82
+ # Even though it is essentially identical to the
83
+ # ActiveSupport code (see Time#since in time/calculations.rb).
84
+ # It handles all but 4 tests.
85
+ def dst_adjustment(time)
86
+ self_dst = self.dst? ? 1 : 0
87
+ time_dst = time.dst? ? 1 : 0
88
+ seconds = (self - time).abs
89
+ if (seconds >= 86400 && self_dst != time_dst)
90
+ time + ((self_dst - time_dst) * 60 * 60)
91
+ else
92
+ time
93
+ end
94
+ end
95
+
96
+ end
97
+
98
+ end
99
+
100
+
101
+ class DateTime
102
+ def to_time
103
+ Time.mktime(year, month, day, hour, min, sec)
104
+ end
105
+ end
106
+
107
+ class Time
108
+ def to_s
109
+ #strftime("%Y%m%dT%H%M%S")
110
+ strftime("%Y-%m-%d %H:%M")
111
+ end
112
+ def to_s_full
113
+ strftime("%Y-%m-%d %H:%M:%S")
114
+ end
115
+
116
+ # /usr/lib/ruby/gems/1.8/gems/activesupport-2.1.2/lib/active_support/core_ext/time/calculations.rb
117
+
118
+ def beginning_of_day
119
+ change(:hour => 0)
120
+ end
121
+ alias :midnight :beginning_of_day
122
+
123
+ # Returns a new Time representing the "start" of this week (Sunday, 0:00)
124
+ def beginning_of_week
125
+ days_to_sunday = self.wday
126
+ self.ago(days_to_sunday, :days).midnight
127
+ end
128
+ alias :sunday :beginning_of_week
129
+
130
+ end
131
+
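The calendar arithmetic in this file can also be sketched with plain `Time` objects and no facets dependency. The helpers `week_start` and `months_ago` below are hypothetical names used only for illustration; the sketch works in UTC, so it sidesteps the DST adjustment that the real code attempts, and it does not clamp days that do not exist in the target month.

```ruby
SECONDS_PER_DAY = 86_400

# Midnight (UTC) of the Sunday that starts the week containing `time`.
def week_start(time)
  midnight = Time.utc(time.year, time.month, time.day)
  midnight - (time.wday * SECONDS_PER_DAY)
end

# Same day and time of day, `number` months earlier.
# Counting months from year 0 lets floor division borrow from the year.
def months_ago(time, number)
  total_months = (time.year * 12) + (time.month - 1) - number
  Time.utc(total_months / 12, (total_months % 12) + 1,
           time.day, time.hour, time.min, time.sec)
end

now = Time.utc(2008, 11, 12, 7, 45, 19)
puts week_start(now)                           # start of the current week
puts week_start(now) + (7 * SECONDS_PER_DAY)   # start of the next week
puts months_ago(now, 11)                       # crosses a year boundary
```

The second value is the kind of "next interval" boundary that `Bucket#start_time` uses as its upper limit before walking backwards.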
data/lib/thin_out_backups/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module ThinOutBackups
2
+ Version = "0.0.1"
3
+ end
data/lib/thin_out_backups.rb ADDED
@@ -0,0 +1,285 @@
1
+ require "thin_out_backups/version"
2
+
3
+ # Tested by: spec/thin_out_backups_spec.rb
4
+
5
+ require 'fileutils'
6
+ require 'pathname'
7
+ require 'delegate'
8
+
9
+ #require 'rubygems'
10
+ require 'facets/time'
11
+ require 'colored'
12
+ require 'quality_extensions/module/attribute_accessors'
13
+ require 'thin_out_backups/time_fixes'
14
+
15
+ class ThinOutBackups::Command
16
+ #---------------------------------------------------------------------------------------------------------------------------------------------------
17
+
18
+ @@allowed_bucket_names = [:minutely, :hourly, :daily, :weekly, :monthly, :yearly]
19
+ mattr_reader :allowed_bucket_names
20
+
21
+ #---------------------------------------------------------------------------------------------------------------------------------------------------
22
+ # Options
23
+ @@options = [:get_time_from, :ignore_files, :verbosity, :time_format, :now, :force, :color]
24
+ mattr_reader :options
25
+ mattr_accessor *@@options
26
+
27
+ @@get_time_from = :filename
28
+ def self.get_time_from=(new)
29
+ @@get_time_from = new.to_sym
30
+ raise "Unknown value for get_time_from: #{@@get_time_from}" unless @@get_time_from.in?([:filename, :file_system])
31
+ end
32
+
33
+ @@ignore_files = nil
34
+ @@verbosity = 1
35
+ @@force = false
36
+ @@color = true
37
+
38
+ @@now = Time.now
39
+ def self.now=(new)
40
+ # Parse the given string (for example '2008-11-12 07:45:00') into a Time
41
+ @@now = DateTime.strptime(new, '%Y-%m-%d %H:%M:%S').to_time
42
+ puts "Using alternate now: #{@@now}"
43
+ end
44
+
45
+ @@time_format = /(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})?/
46
+ @@time_format_parts = [:Y,:m,:d, :H,:M,:S]
47
+ # TODO: Maybe use something like this for interpreting time, rather than a regexp? DateTime.strptime("27/Nov/2007:15:01:43 -0800", "%d/%b/%Y:%H:%M:%S %z")
48
+ def self.time_format=(new)
49
+ # TODO: accept format strings such as 'H:M:S d.m.Y.' and 'Y-m-d H:M:S'
50
+ # TODO: do error checking
51
+ end
52
+
53
+
54
+ #---------------------------------------------------------------------------------------------------------------------------------------------------
55
+
56
+ class Bucket
57
+ attr_reader :parent, :name, :quota, :keepers
58
+ @@quota_format = %r[(\d+|\*)(/\d+)?]
59
+
60
+ def initialize(parent, name, quota)
61
+ @parent = parent
62
+ @name = name
63
+ (
64
+ raise "Invalid quota '#{quota}'" unless quota.is_a?(Integer) || quota =~ @@quota_format
65
+ @quota = quota
66
+ )
67
+ @keepers = []
68
+ end
69
+
70
+ def unit
71
+ {
72
+ :minutely => :minutes,
73
+ :hourly => :hours,
74
+ :daily => :days,
75
+ :weekly => :weeks,
76
+ :monthly => :months,
77
+ :yearly => :years,
78
+ }[@name.to_sym]
79
+ end
80
+
81
+ def start_time
82
+ start_time = parent.now.dup
83
+ if parent.align_at_beginning_of_time_interval
84
+ beginning_of_interval =
85
+ case unit
86
+ when :minutes
87
+ start_time.change(:sec => 0)
88
+ when :hours
89
+ start_time.change(:min => 0)
90
+ when :days
91
+ start_time.change(:hour => 0)
92
+ when :weeks
93
+ start_time.beginning_of_week
94
+ when :months
95
+ start_time.change( :day => 1, :hour => 0)
96
+ when :years
97
+ start_time.change(:month => 1, :day => 1, :hour => 0)
98
+ else
99
+ raise "unexpected unit #{unit}"
100
+ end
101
+ # We actually want to use the *next* interval (in the future) as our start_time because we will be using this as our max and going backwards in time...
102
+ beginning_of_interval.hence(1, unit)
103
+ else
104
+ start_time
105
+ end
106
+ end
107
+
108
+ def keep(keeper)
109
+ @keepers << keeper
110
+ end
111
+
112
+ def still_need
113
+ if keep_all?
114
+ 1 # it has insatiable hunger for keepers that can never be satisfied ... always just 1 more ...
115
+ else
116
+ @quota - @keepers.size
117
+ end
118
+ end
119
+
120
+ def keep_all?; quota.to_s.include?('*') end
121
+
122
+ def satisfied?
123
+ if keep_all?
124
+ false # it has insatiable hunger for keepers that can never be satisfied
125
+ else
126
+ still_need == 0
127
+ end
128
+ end
129
+ end
130
+
131
+ class File < DelegateClass(::Pathname)
132
+ attr_reader :filename, :file
133
+
134
+ def initialize(filename)
135
+ super(Pathname.new(filename))
136
+ end
137
+
138
+ def full_path
139
+ dirname.to_s + '/' + filename
140
+ end
141
+
142
+ def filename
143
+ basename.to_s
144
+ end
145
+ def to_s
146
+ filename
147
+ end
148
+
149
+ def time
150
+ if ThinOutBackups::Command.get_time_from == :filename
151
+ if filename =~ ThinOutBackups::Command.time_format
152
+ y,m,d, h,i,s = $1,$2,$3, $4,$5,$6
153
+ Time.mktime(y,m,d, h,i,s)
154
+ else
155
+ nil
156
+ end
157
+ elsif ThinOutBackups::Command.get_time_from == :file_system
158
+ mtime
159
+ else
160
+ raise "Unknown value for get_time_from: #{ThinOutBackups::Command.get_time_from}"
161
+ end
162
+ end
163
+
164
+ def has_time?; !!time end
165
+ def ignored?
166
+ !has_time? or
167
+ ThinOutBackups::Command.ignore_files && filename =~ ThinOutBackups::Command.ignore_files
168
+ end
169
+ end
170
+
171
+ attr_reader :align_at_beginning_of_time_interval, :files_with_times
172
+ attr_accessor :dir
173
+
174
+ def initialize(dir, quotas)
175
+ @align_at_beginning_of_time_interval = true
176
+ @dir = dir
177
+ @buckets = {}
178
+ @@allowed_bucket_names.each do |name|
179
+ quota = quotas[name]
180
+ @buckets[name] = Bucket.new(self, name, quota) unless quota.nil?
181
+ end
182
+
183
+ puts "Processing #{@dir}/*".magenta
184
+ files = Dir["#{@dir}/*"].map { |filename|
185
+ file = File.new(filename)
186
+ }
187
+ @files_with_times = files.
188
+ reject {|file| !file.has_time?}.
189
+ sort { |a, b|
190
+ a.time <=> b.time
191
+ }.
192
+ reverse
193
+
194
+ end
195
+
196
+ def bucket_remaining(bucket_name, decr = nil)
197
+ send("#{bucket_name}=", send("#{bucket_name}") - decr) if decr
198
+ send "#{bucket_name}"
199
+ end
200
+
201
+ def buckets
202
+ @buckets.values
203
+ end
204
+ def bucket(name)
205
+ @buckets[name] or raise "unknown bucket '#{name}'"
206
+ end
207
+
208
+ def now
209
+ Time.now
210
+ end
211
+
212
+ def delete_non_keepers
213
+ #raise "Didn't find any files to keep?!" unless keepers.any?
214
+ files_with_times.each do |file|
215
+ if (buckets = buckets_with_file(file)).any?
216
+ puts "#{file.full_path}: in buckets: #{buckets.map(&:name).join(', ')}".green
217
+ else
218
+ puts "#{file.full_path}: delete".red
219
+ end
220
+ end
221
+
222
+ if @@force == false
223
+ print "Continue with deletions? (yes or no) >".magenta
224
+ response = STDIN.gets
225
+ (puts "Aborting"; return) unless response.chomp.downcase == 'yes'
226
+ end
227
+
228
+ files_with_times.each do |file|
229
+ if (buckets = buckets_with_file(file)).any?
230
+ #
231
+ else
232
+ file.rmtree # removes plain files as well as non-empty directories
233
+ end
234
+ end
235
+ end
236
+
237
+ def buckets_with_file(file)
238
+ buckets.find_all {|bucket| bucket.keepers.include?(file)}
239
+ end
240
+
241
+ def delete(files)
242
+ puts "Deleting files: #{files.join(', ')}..."
243
+ end
244
+
245
+ def earliest_file_time
246
+ files_with_times.last.time
247
+ end
248
+
249
+ def run
250
+ raise "Must keep at least 1 file from at least one time bucket" if buckets.empty?
251
+ (puts "Found no files with times! Aborting."; return) if files_with_times.empty?
252
+
253
+ # Fill each bucket until its quota is met
254
+ buckets.each do |bucket|
255
+ puts "Trying to fill bucket '#{bucket.name}' (quota: #{bucket.quota})...".magenta
256
+
257
+ time_max = bucket.start_time
258
+ time_min = time_max.ago(1, bucket.unit)
259
+
260
+ #puts "Earliest_file_time: #{earliest_file_time}"
261
+ while time_max > earliest_file_time
262
+ print "Checking range (#{time_min} .. #{time_max})... ".yellow if verbosity >= 1
263
+ new_keeper = files_with_times.detect {|file|
264
+ #print "#{file}? "
265
+ time_min <= file.time &&
266
+ file.time < time_max
267
+ }
268
+ if new_keeper
269
+ puts "found keeper #{new_keeper}".green if verbosity >= 1
270
+ bucket.keep new_keeper
271
+ else
272
+ #puts "found no keepers".red if verbosity >= 1
273
+ puts "" if verbosity >= 1
274
+ end
275
+
276
+ time_max = time_min
277
+ #puts "Stepping back from #{time_min} by 1 #{bucket.unit} => #{time_min.ago(1, bucket.unit)}"
278
+ time_min = time_min.ago(1, bucket.unit)
279
+ (puts 'Filled quota!'.green; break) if bucket.satisfied?
280
+ end
281
+ end
282
+
283
+ delete_non_keepers
284
+ end
285
+ end
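The bucket-filling loop in `run` can be sketched in isolation. The version below is a simplified illustration under stated assumptions, not the gem's code: `keepers_for` is a hypothetical name, it works on plain `Time` values rather than files, and it only supports fixed-length intervals given in seconds (so months and years are out of scope).

```ruby
DAY = 86_400

# Walks backwards from start_time one interval at a time and keeps the newest
# timestamp found in each interval, until the quota is met or the oldest
# timestamp has been passed. A quota of '*' means "never satisfied".
def keepers_for(times, start_time, interval, quota)
  sorted = times.sort.reverse # newest first, like files_with_times
  return [] if sorted.empty?

  keepers = []
  time_max = start_time
  while time_max > sorted.last
    time_min = time_max - interval
    found = sorted.find { |t| time_min <= t && t < time_max }
    keepers << found if found
    break if quota != '*' && keepers.size >= quota
    time_max = time_min
  end
  keepers
end

# One backup per day at 03:03, from 2008-11-01 to 2008-11-12.
times = (1..12).map { |day| Time.utc(2008, 11, day, 3, 3) }

daily  = keepers_for(times, Time.utc(2008, 11, 13), DAY, 3)       # --daily=3
weekly = keepers_for(times, Time.utc(2008, 11, 16), 7 * DAY, '*') # --weekly=*

puts daily.map  { |t| t.strftime('%Y-%m-%d') }.inspect
puts weekly.map { |t| t.strftime('%Y-%m-%d') }.inspect
```

The start times passed in are the aligned "next interval" boundaries for a `now` of 2008-11-12 07:45, matching the values checked in the spec. Because the list is sorted newest first, each interval contributes its latest timestamp, which corresponds to the `--last` default described in the help text.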
data/spec/thin_out_backups_spec.rb ADDED
@@ -0,0 +1,137 @@
1
+ require 'tmpdir'
2
+ #require 'rubygems'
3
+ require 'rspec'
4
+ require 'facets'
5
+ require_relative '../lib/thin_out_backups'
6
+
7
+ $now = Time.utc(2008,11,12, 7,45,19)
8
+
9
+ describe Time, "#beginning_of_week" do
10
+ it "should return a Sunday" do
11
+ Time.utc(2008,11,12).beginning_of_week.should == Time.utc(2008,11,9)
12
+ end
13
+ end
14
+
15
+ describe ThinOutBackups::Command::Bucket, "time interval alignment" do
16
+
17
+ def sample_quotas
18
+ {
19
+ :minutely => 1,
20
+ :hourly => 3,
21
+ :daily => 1,
22
+ :weekly => '*',
23
+ :monthly => 1,
24
+ :yearly => '*'
25
+ }
26
+ end
27
+
28
+ before do
29
+ @command = ThinOutBackups::Command.new('bogus_dir', sample_quotas)
30
+ @now = $now
31
+ @command.stub!(:now).and_return(@now)
32
+ end
33
+
34
+ it "should use the time specified by our test" do
35
+ @command.now.should == @now
36
+ end
37
+
38
+ it "hour interval should start on the hour, etc." do
39
+ @command.bucket(:minutely).start_time.should == Time.utc(2008,11,12, 7,46,0)
40
+ @command.bucket(:hourly). start_time.should == Time.utc(2008,11,12, 8,0,0)
41
+ @command.bucket(:daily). start_time.should == Time.utc(2008,11,13, 0,0,0)
42
+ @command.bucket(:weekly). start_time.should == Time.utc(2008,11,16, 0,0,0)
43
+ @command.bucket(:monthly). start_time.should == Time.utc(2008,12, 1, 0,0,0)
44
+ @command.bucket(:yearly). start_time.should == Time.utc(2009, 1, 1, 0,0,0)
45
+ end
46
+
47
+ end
48
+
49
+
50
+ $command = <<End
51
+ thin_out_backups --force --daily=3 --weekly=3 --monthly=* \
52
+ --now='#{$now.to_s_full}'\
53
+ spec/test_dir/db_dumps \
54
+ spec/test_dir/maildir
55
+ End
56
+ describe ThinOutBackups::Command, "when calling `#{$command}`" do
57
+ before do
58
+ Pathname.new("spec/test_dir/").rmtree rescue nil
59
+
60
+ dir='spec/test_dir/db_dumps/'
61
+ system "mkdir -p #{dir}"
62
+ files = %w[
63
+ db_dump_2008-08-08T0303.sql
64
+ db_dump_2008-09-01T0303.sql
65
+ db_dump_2008-09-10T0303.sql
66
+ db_dump_2008-10-15T0303.sql
67
+ db_dump_2008-10-16T0303.sql
68
+ db_dump_2008-10-17T0303.sql
69
+ db_dump_2008-10-18T0303.sql
70
+ db_dump_2008-10-19T0303.sql
71
+ db_dump_2008-10-20T0303.sql
72
+ db_dump_2008-10-21T0303.sql
73
+ db_dump_2008-10-22T0303.sql
74
+ db_dump_2008-10-23T0303.sql
75
+ db_dump_2008-10-24T0303.sql
76
+ db_dump_2008-10-25T0303.sql
77
+ db_dump_2008-10-26T0303.sql
78
+ db_dump_2008-10-27T0303.sql
79
+ db_dump_2008-10-28T0303.sql
80
+ db_dump_2008-10-29T0303.sql
81
+ db_dump_2008-10-30T0303.sql
82
+ db_dump_2008-10-31T0303.sql
83
+ db_dump_2008-11-01T0303.sql
84
+ db_dump_2008-11-02T0303.sql
85
+ db_dump_2008-11-03T0303.sql
86
+ db_dump_2008-11-04T0303.sql
87
+ db_dump_2008-11-05T0303.sql
88
+ db_dump_2008-11-06T0303.sql
89
+ db_dump_2008-11-07T0303.sql
90
+ db_dump_2008-11-08T0303.sql
91
+ db_dump_2008-11-09T0303.sql
92
+ db_dump_2008-11-10T0303.sql
93
+ db_dump_2008-11-11T0303.sql
94
+ db_dump_2008-11-12T0303.sql
95
+ ]
96
+ files.each do |file|
97
+ Dir.getwd
98
+ #puts %(Dir.getwd=#{(Dir.getwd).inspect})
99
+ #puts %("touch #{dir}/#{file}"=#{("touch #{dir}/#{file}").inspect})
100
+ system "touch #{dir}/#{file}"
101
+ end
102
+
103
+ dir='spec/test_dir/maildir/'
104
+ system "mkdir -p #{dir}"
105
+ subdirs = %w[
106
+ 2008-11-09T0303
107
+ 2008-11-10T0303
108
+ 2008-11-11T0303
109
+ ]
110
+ subdirs.each do |subdir|
111
+ system "mkdir -p #{dir}/#{subdir}"
112
+ system "touch #{dir}/#{subdir}/inbox"
113
+ system "touch #{dir}/#{subdir}/some_other_folder"
114
+ end
115
+
116
+ puts %($command=#{($command).inspect})
117
+ #system $command
118
+ # TODO: also capture output of command and check it against expected
119
+ end
120
+
121
+ it "keeps/removes the correct files" do
122
+ Dir['spec/test_dir/db_dumps/*'].should =~
123
+ ["spec/test_dir/db_dumps/db_dump_2008-11-12T0303.sql",
124
+ "spec/test_dir/db_dumps/db_dump_2008-11-08T0303.sql",
125
+ "spec/test_dir/db_dumps/db_dump_2008-10-31T0303.sql",
126
+ "spec/test_dir/db_dumps/db_dump_2008-11-10T0303.sql",
127
+ "spec/test_dir/db_dumps/db_dump_2008-08-08T0303.sql",
128
+ "spec/test_dir/db_dumps/db_dump_2008-11-01T0303.sql",
129
+ "spec/test_dir/db_dumps/db_dump_2008-09-10T0303.sql",
130
+ "spec/test_dir/db_dumps/db_dump_2008-11-11T0303.sql"]
131
+ end
132
+
133
+ after do
134
+ #Pathname.new("spec/test_dir/").rmtree rescue nil
135
+ end
136
+ end
137
+
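How a timestamp is pulled out of a filename can be sketched as follows. The regular expression is copied from `@@time_format` in `lib/thin_out_backups.rb`; the helper name `time_from_filename` is hypothetical, and the sketch builds UTC times where the library uses `Time.mktime` (local time). Judging by that regular expression alone, names with dashes in the date, like the fixtures created in this spec, would not be recognised.

```ruby
# Same pattern as @@time_format: YYYYMMDD, a literal T, HHMM, optional SS.
TIME_FORMAT = /(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})?/

# Returns a Time parsed from the filename, or nil when no timestamp is found.
def time_from_filename(filename)
  match = TIME_FORMAT.match(filename)
  return nil unless match

  year, month, day, hour, min, sec = match.captures.map { |part| part && part.to_i }
  Time.utc(year, month, day, hour, min, sec || 0)
end

p time_from_filename('db_dump_20081112T0303.sql')   # seconds omitted
p time_from_filename('db_dump_20081112T030345.sql') # seconds present
p time_from_filename('db_dump_2008-11-12T0303.sql') # dashed date: no match
```

Files for which no time can be determined are skipped by the command rather than deleted, since only entries in `files_with_times` are considered.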
data/thin_out_backups.gemspec ADDED
@@ -0,0 +1,23 @@
1
+ # -*- encoding: utf-8 -*-
2
+ require File.expand_path('../lib/thin_out_backups/version', __FILE__)
3
+
4
+ Gem::Specification.new do |gem|
5
+ gem.authors = ["Tyler Rick"]
6
+ gem.email = ["github.com@tylerrick.com"]
7
+ gem.summary = %q{Thin out a directory full of backups, only keeping a specified number from each category (weekly, daily, etc.), and deleting the rest.}
8
+ gem.description = gem.summary
9
+ gem.homepage = ""
10
+
11
+ gem.files = `git ls-files`.split($\)
12
+ gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
13
+ gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
14
+ gem.name = "thin_out_backups"
15
+ gem.require_paths = ["lib"]
16
+ gem.version = ThinOutBackups::Version
17
+
18
+ gem.add_dependency 'facets'
19
+ gem.add_dependency 'colored'
20
+ gem.add_dependency 'quality_extensions'
21
+
22
+ gem.add_development_dependency 'rspec'
23
+ end
metadata ADDED
@@ -0,0 +1,116 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: thin_out_backups
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.0.1
5
+ platform: ruby
6
+ authors:
7
+ - Tyler Rick
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2013-03-27 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: facets
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - '>='
18
+ - !ruby/object:Gem::Version
19
+ version: '0'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - '>='
25
+ - !ruby/object:Gem::Version
26
+ version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: colored
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - '>='
32
+ - !ruby/object:Gem::Version
33
+ version: '0'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - '>='
39
+ - !ruby/object:Gem::Version
40
+ version: '0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: quality_extensions
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - '>='
46
+ - !ruby/object:Gem::Version
47
+ version: '0'
48
+ type: :runtime
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - '>='
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: rspec
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - '>='
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - '>='
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ description: Thin out a directory full of backups, only keeping a specified number
70
+ from each category (weekly, daily, etc.), and deleting the rest.
71
+ email:
72
+ - github.com@tylerrick.com
73
+ executables:
74
+ - thin_out_backups
75
+ extensions: []
76
+ extra_rdoc_files: []
77
+ files:
78
+ - .gitignore
79
+ - .rspec
80
+ - Gemfile
81
+ - License
82
+ - Rakefile
83
+ - Readme.md
84
+ - bin/thin_out_backups
85
+ - lib/thin_out_backups.rb
86
+ - lib/thin_out_backups/time_fixes.rb
87
+ - lib/thin_out_backups/version.rb
88
+ - spec/thin_out_backups_spec.rb
89
+ - thin_out_backups.gemspec
90
+ homepage: ''
91
+ licenses: []
92
+ metadata: {}
93
+ post_install_message:
94
+ rdoc_options: []
95
+ require_paths:
96
+ - lib
97
+ required_ruby_version: !ruby/object:Gem::Requirement
98
+ requirements:
99
+ - - '>='
100
+ - !ruby/object:Gem::Version
101
+ version: '0'
102
+ required_rubygems_version: !ruby/object:Gem::Requirement
103
+ requirements:
104
+ - - '>='
105
+ - !ruby/object:Gem::Version
106
+ version: '0'
107
+ requirements: []
108
+ rubyforge_project:
109
+ rubygems_version: 2.0.0
110
+ signing_key:
111
+ specification_version: 4
112
+ summary: Thin out a directory full of backups, only keeping a specified number from
113
+ each category (weekly, daily, etc.), and deleting the rest.
114
+ test_files:
115
+ - spec/thin_out_backups_spec.rb
116
+ has_rdoc: