file_scanner 1.0.3

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 6fb33515de67d99ac734334fb573a11424861444
4
+ data.tar.gz: 7e6742c59f1f0939114ee6ba81d49be590a74447
5
+ SHA512:
6
+ metadata.gz: 639bfe4e4078486bd18fb49762154e8465766187b5be9038e61494c10fa9ff7273f68d1c48abffd89884d81918d0590ebebf43c1c1921d8e59ce5369e6cf3801
7
+ data.tar.gz: c7256af0baef400aaed7b302728cb5f1ccd2df9cb203a02a02a4491bb7d14ca4a4ecd936cbb1f89ccaa77f832b5cf6364c7c4658d4249212e2dc814691d03883
data/.gitignore ADDED
@@ -0,0 +1,9 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
data/.travis.yml ADDED
@@ -0,0 +1,8 @@
1
+ sudo: false
2
+ language: ruby
3
+ rvm:
4
+ - 2.1.2
5
+ - 2.2.2
6
+ - 2.3.0
7
+ - 2.4.0
8
+ before_install: gem install bundler -v 1.15.2
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source "https://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in file_scanner.gemspec
4
+ gemspec
data/README.md ADDED
@@ -0,0 +1,112 @@
1
+ ## Table of Contents
2
+
3
+ * [Scope](#scope)
4
+ * [Motivation](#motivation)
5
+ * [Installation](#installation)
6
+ * [Usage](#usage)
7
+ * [Loader](#loader)
8
+ * [Filters](#filters)
9
+ * [Policies](#policies)
10
+ * [Worker](#worker)
11
+
12
+ ## Scope
13
+ This gem collects a set of file paths matching a wildcard rule, filters them with default or custom filters (access time, size range), and applies a set of custom policies to them.
14
+
15
+ ## Motivation
16
+ This gem helps purge obsolete files or promote relevant ones by calling external services (CDN APIs) and/or performing local file system actions (copy, move, delete, etc.).
17
+
18
+ ## Installation
19
+ Add this line to your application's Gemfile:
20
+ ```ruby
21
+ gem "file_scanner"
22
+ ```
23
+
24
+ And then execute:
25
+ ```shell
26
+ bundle
27
+ ```
28
+
29
+ Or install it yourself as:
30
+ ```shell
31
+ gem install file_scanner
32
+ ```
33
+
34
+ ## Usage
35
+
36
+ ### Loader
37
+ The first step is to create a `Loader` instance by specifying the path to scan and an optional list of extensions:
38
+ ```ruby
39
+ require "file_scanner"
40
+
41
+ loader = FileScanner::Loader.new(path: ENV["HOME"], extensions: %w[html txt])
42
+ ```
43
+
44
+ ### Filters
45
+ The second step is to provide the list of filters: a file is selected only when the `call` method of every filter returns a truthy value.
46
+
47
+ #### Default
48
+ You can rely on existing filters that select files by:
49
+ * checking if the file was last accessed more than *30 days* ago
50
+ * checking if the file size is *at least 100 bytes*
51
+
52
+ You can configure default behaviour by passing different arguments:
53
+ ```ruby
54
+ accessed_a_week_ago = FileScanner::Filters::LastAccess.new(Time.now-7*24*3600)
55
+ one_to_two_mega = FileScanner::Filters::SizeRange.new(min: 1024**2, max: 2*1024**2)
56
+
57
+ filters = []
58
+ filters << accessed_a_week_ago
59
+ filters << one_to_two_mega
60
+ ```
61
+
62
+ #### Custom
63
+ It is convenient to create custom filters by just relying on `Proc` instances that satisfy the `callable` protocol:
64
+ ```ruby
65
+ filters << ->(file) { File.directory?(file) }
66
+ ```
67
+
68
+ ### Policies
69
+ The third step is to create custom policy objects (no defaults exist) to be applied to the list of filtered paths.
70
+ Again, it suffices that the policy responds to the `call` method and accepts an array of paths as an argument:
71
+ ```ruby
72
+ require "fileutils"
73
+
74
+ remove_from_disk = ->(paths) do
75
+ FileUtils.rm_rf(paths)
76
+ end
77
+
78
+ policies = []
79
+ policies << remove_from_disk
80
+ ```
81
+
82
+ ### Worker
83
+ Now that you have all of the collaborators in place, you can create the `Worker` instance:
84
+ ```ruby
85
+ worker = FileScanner::Worker.new(loader: loader, filters: filters, policies: policies)
86
+ worker.call # apply all the specified policies to the filtered file paths
87
+ ```
88
+
89
+ #### Slice of files
90
+ If you are going to scan a large number of files, it is better to work in batches.
91
+ This is exactly why the `Worker` class accepts a `slice` argument: it distributes the work and avoids saturating the resources used by the specified policies:
92
+ ```ruby
93
+ worker = FileScanner::Worker.new(loader: loader, filters: filters, policies: policies, slice: 1000)
94
+ worker.call # call policies by slice of 1000 files
95
+ ```
96
+
97
+ #### Policies by block
98
+ If you prefer to specify the policies as a block that receives each slice of files, you can omit the `policies` argument altogether:
99
+ ```ruby
100
+ worker = FileScanner::Worker.new(loader: loader, filters: filters)
101
+ worker.call do |slice|
102
+ FileUtils.chmod_R(0700, slice)
103
+ end
104
+ ```
105
+
106
+ #### Use a logger
107
+ If you want to trace what the worker is doing (including errors), you can pass a logger to the worker:
108
+ ```ruby
109
+ my_logger = Logger.new("my_file.log")
110
+ worker = FileScanner::Worker.new(loader: loader, filters: filters, logger: my_logger)
111
+ worker.call # will log worker actions to my_file.log
112
+ ```
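The README describes filters and policies only through the `call` protocol. As a minimal sketch of that idea, assuming nothing beyond plain Ruby, the snippet below combines a lambda filter with a class-based one. `select_paths` and `ExtensionFilter` are hypothetical names used for illustration; they are not part of the gem.

```ruby
# Hypothetical helper, not part of the gem: keeps the paths for which
# every filter's #call returns a truthy value.
def select_paths(paths, filters)
  paths.select { |path| filters.all? { |filter| filter.call(path) } }
end

# A filter written as a class; any object responding to #call qualifies.
class ExtensionFilter
  def initialize(ext)
    @ext = ext
  end

  def call(file)
    File.extname(file) == @ext
  end
end

# A filter written as a lambda.
report_only = ->(file) { File.basename(file).start_with?("report") }

paths = %w[/tmp/report_1.txt /tmp/report_2.csv /tmp/notes.txt]
selected = select_paths(paths, [report_only, ExtensionFilter.new(".txt")])

# A policy receives the whole array of selected paths in a single call.
print_paths = ->(batch) { batch.each { |path| puts path } }
print_paths.call(selected)
```

Because both kinds of filter expose the same `call` method, they can be mixed freely in one list.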
data/Rakefile ADDED
@@ -0,0 +1,10 @@
1
+ require "bundler/gem_tasks"
2
+ require "rake/testtask"
3
+
4
+ Rake::TestTask.new(:spec) do |t|
5
+ t.libs << "spec"
6
+ t.libs << "lib"
7
+ t.test_files = FileList["spec/**/*_spec.rb"]
8
+ end
9
+
10
+ task :default => :spec
data/bin/console ADDED
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "file_scanner"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start(__FILE__)
data/bin/setup ADDED
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
data/file_scanner.gemspec ADDED
@@ -0,0 +1,20 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path("../lib", __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require "file_scanner/version"
5
+
6
+ Gem::Specification.new do |s|
7
+ s.name = "file_scanner"
8
+ s.version = FileScanner::VERSION
9
+ s.authors = ["costajob"]
10
+ s.email = ["costajob@gmail.com"]
11
+ s.summary = "A scanner routine that collect file paths basing on specified filters and apply to them a set of custom policies"
12
+ s.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(spec|test|s|features)/}) }
13
+ s.require_paths = ["lib"]
14
+ s.license = "MIT"
15
+ s.required_ruby_version = ">= 2.1.2"
16
+
17
+ s.add_development_dependency "bundler", "~> 1.15"
18
+ s.add_development_dependency "rake", "~> 10.0"
19
+ s.add_development_dependency "minitest", "~> 5.0"
20
+ end
data/lib/file_scanner/filters.rb ADDED
@@ -0,0 +1,31 @@
1
+ module FileScanner
2
+ module Filters
3
+ def self.defaults
4
+ constants.map do |name|
5
+ self.const_get(name).new
6
+ end
7
+ end
8
+
9
+ class LastAccess
10
+ DAY = 3600*24
11
+
12
+ def initialize(atime = Time.now-30*DAY)
13
+ @atime = atime
14
+ end
15
+
16
+ def call(file)
17
+ @atime >= File.atime(file)
18
+ end
19
+ end
20
+
21
+ class SizeRange
22
+ def initialize(min: 100, max: Float::INFINITY)
23
+ @range = min..max
24
+ end
25
+
26
+ def call(file)
27
+ @range === File.size(file)
28
+ end
29
+ end
30
+ end
31
+ end
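Both default filters reduce to a single comparison, so their boundary behaviour is easy to check in isolation. The sketch below mirrors the two `call` bodies as standalone helpers; `size_in_range?` and `accessed_before?` are hypothetical names, and the temporary file exists only for the demonstration.

```ruby
require "tempfile"

# Hypothetical helper mirroring SizeRange#call: an inclusive range
# tested against the file size in bytes.
def size_in_range?(file, min: 100, max: Float::INFINITY)
  (min..max) === File.size(file)
end

# Hypothetical helper mirroring LastAccess#call: true when the file was
# last accessed at or before the given threshold.
def accessed_before?(file, threshold)
  threshold >= File.atime(file)
end

Tempfile.create("sample") do |f|
  f.write("x" * 100)
  f.flush

  # Exactly 100 bytes: both ends of the range are inclusive.
  puts size_in_range?(f.path)
  puts size_in_range?(f.path, min: 101)

  # The file was just created, so a threshold one hour in the past
  # does not select it.
  puts accessed_before?(f.path, Time.now - 3600)
end
```

With the defaults shown in the source, a file is kept when it is at least 100 bytes and was last accessed 30 or more days ago.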
data/lib/file_scanner/loader.rb ADDED
@@ -0,0 +1,21 @@
1
+ module FileScanner
2
+ class Loader
3
+ def initialize(path:, extensions: [])
4
+ @path = path
5
+ @extensions = extensions
6
+ end
7
+
8
+ def call
9
+ Dir.glob(files_path)
10
+ end
11
+
12
+ private def files_path
13
+ File.join(@path, "**", extensions_path)
14
+ end
15
+
16
+ private def extensions_path
17
+ return "*" if @extensions.empty?
18
+ "*.{#{@extensions.join(",")}}"
19
+ end
20
+ end
21
+ end
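The loader builds one recursive glob pattern and hands it to `Dir.glob`. As a rough illustration of what that pattern matches, the sketch below rebuilds it in a standalone helper and runs it against a throwaway directory. `glob_pattern` is a hypothetical name, and the file names are made up for the example.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper mirroring Loader#files_path: builds the recursive
# glob pattern from a base path and an optional list of extensions.
def glob_pattern(path, extensions = [])
  name = extensions.empty? ? "*" : "*.{#{extensions.join(",")}}"
  File.join(path, "**", name)
end

Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, "nested"))
  %w[index.html nested/notes.txt nested/script.rb].each do |rel|
    FileUtils.touch(File.join(dir, rel))
  end

  # Without extensions the pattern also matches directories such as "nested".
  puts Dir.glob(glob_pattern(dir)).sort

  # With extensions, only entries whose names end in .html or .txt are
  # returned, at any depth.
  puts Dir.glob(glob_pattern(dir, %w[html txt])).sort
end
```

Since the unfiltered pattern returns directories as well as files, a policy such as `FileUtils.rm_rf` may receive directory paths unless a filter excludes them.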
data/lib/file_scanner/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module FileScanner
2
+ VERSION = "1.0.3"
3
+ end
data/lib/file_scanner/worker.rb ADDED
@@ -0,0 +1,42 @@
1
+ require "logger"
2
+
3
+ module FileScanner
4
+ class Worker
5
+ attr_reader :filters, :policies
6
+
7
+ def initialize(loader:, filters: Filters::defaults, policies: [], logger: Logger.new(nil), slice: nil)
8
+ @loader = loader
9
+ @filters = filters
10
+ @policies = policies
11
+ @slice = slice.to_i
12
+ @logger = logger
13
+ end
14
+
15
+ def call
16
+ slices.each do |slice|
17
+ yield(slice) if block_given? && policies.empty?
18
+ policies.each do |policy|
19
+ @logger.info { "applying \e[1m#{policy}\e[0m to \e[1m#{slice.size}\e[0m files" }
20
+ policy.call(slice)
21
+ end
22
+ end
23
+ rescue StandardError => e
24
+ @logger.error { e.message }
25
+ raise e
26
+ end
27
+
28
+ private def files
29
+ @files ||= Array(@loader.call).select do |f|
30
+ @filters.all? do |filter|
31
+ @logger.info { "applying \e[1m#{filter.class}\e[0m to \e[1m#{File.basename(f)}\e[0m" }
32
+ filter.call(f)
33
+ end
34
+ end
35
+ end
36
+
37
+ private def slices
38
+ return [files] if @slice.zero?
39
+ files.each_slice(@slice)
40
+ end
41
+ end
42
+ end
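The batching logic in the worker is small enough to try out on its own. The following sketch, assuming only core Ruby, mirrors the private `slices` method in a standalone helper; `batches` is a hypothetical name and the file names are placeholders.

```ruby
# Hypothetical helper mirroring Worker#slices: a nil or zero size yields
# every file in a single batch; otherwise files are grouped in batches
# of at most `size` items.
def batches(files, size = nil)
  size = size.to_i
  return [files] if size.zero?
  files.each_slice(size).to_a
end

files = %w[a.log b.log c.log d.log e.log]

# Each policy is called once per batch, so it never receives more than
# `size` paths at a time.
report = ->(batch) { puts "#{batch.size} file(s): #{batch.join(", ")}" }

batches(files, 2).each { |batch| report.call(batch) }
batches(files).each { |batch| report.call(batch) }
```

One consequence of the early return is that, when no slice size is given and no file passes the filters, the single batch is an empty array, so a policy should tolerate empty input.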
data/lib/file_scanner.rb ADDED
@@ -0,0 +1,4 @@
1
+ require "file_scanner/filters"
2
+ require "file_scanner/loader"
3
+ require "file_scanner/version"
4
+ require "file_scanner/worker"
metadata ADDED
@@ -0,0 +1,100 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: file_scanner
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.3
5
+ platform: ruby
6
+ authors:
7
+ - costajob
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2017-08-08 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: bundler
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '1.15'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '1.15'
27
+ - !ruby/object:Gem::Dependency
28
+ name: rake
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '10.0'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '10.0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: minitest
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '5.0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '5.0'
55
+ description:
56
+ email:
57
+ - costajob@gmail.com
58
+ executables: []
59
+ extensions: []
60
+ extra_rdoc_files: []
61
+ files:
62
+ - ".gitignore"
63
+ - ".travis.yml"
64
+ - Gemfile
65
+ - README.md
66
+ - Rakefile
67
+ - bin/console
68
+ - bin/setup
69
+ - file_scanner.gemspec
70
+ - lib/file_scanner.rb
71
+ - lib/file_scanner/filters.rb
72
+ - lib/file_scanner/loader.rb
73
+ - lib/file_scanner/version.rb
74
+ - lib/file_scanner/worker.rb
75
+ homepage:
76
+ licenses:
77
+ - MIT
78
+ metadata: {}
79
+ post_install_message:
80
+ rdoc_options: []
81
+ require_paths:
82
+ - lib
83
+ required_ruby_version: !ruby/object:Gem::Requirement
84
+ requirements:
85
+ - - ">="
86
+ - !ruby/object:Gem::Version
87
+ version: 2.1.2
88
+ required_rubygems_version: !ruby/object:Gem::Requirement
89
+ requirements:
90
+ - - ">="
91
+ - !ruby/object:Gem::Version
92
+ version: '0'
93
+ requirements: []
94
+ rubyforge_project:
95
+ rubygems_version: 2.6.8
96
+ signing_key:
97
+ specification_version: 4
98
+ summary: A scanner routine that collect file paths basing on specified filters and
99
+ apply to them a set of custom policies
100
+ test_files: []