file_scanner 1.0.3
- checksums.yaml +7 -0
- data/.gitignore +9 -0
- data/.travis.yml +8 -0
- data/Gemfile +4 -0
- data/README.md +112 -0
- data/Rakefile +10 -0
- data/bin/console +14 -0
- data/bin/setup +8 -0
- data/file_scanner.gemspec +20 -0
- data/lib/file_scanner/filters.rb +31 -0
- data/lib/file_scanner/loader.rb +21 -0
- data/lib/file_scanner/version.rb +3 -0
- data/lib/file_scanner/worker.rb +42 -0
- data/lib/file_scanner.rb +4 -0
- metadata +100 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA1:
  metadata.gz: 6fb33515de67d99ac734334fb573a11424861444
  data.tar.gz: 7e6742c59f1f0939114ee6ba81d49be590a74447
SHA512:
  metadata.gz: 639bfe4e4078486bd18fb49762154e8465766187b5be9038e61494c10fa9ff7273f68d1c48abffd89884d81918d0590ebebf43c1c1921d8e59ce5369e6cf3801
  data.tar.gz: c7256af0baef400aaed7b302728cb5f1ccd2df9cb203a02a02a4491bb7d14ca4a4ecd936cbb1f89ccaa77f832b5cf6364c7c4658d4249212e2dc814691d03883
data/.gitignore
ADDED
data/.travis.yml
ADDED
data/Gemfile
ADDED
data/README.md
ADDED
@@ -0,0 +1,112 @@
## Table of Contents

* [Scope](#scope)
* [Motivation](#motivation)
* [Installation](#installation)
* [Usage](#usage)
  * [Loader](#loader)
  * [Filters](#filters)
  * [Policies](#policies)
  * [Worker](#worker)

## Scope
This gem collects a set of file paths matching a wildcard rule, filters them with default or custom filters (last access time, size range) and applies a set of custom policies to them.

## Motivation
This gem is helpful for purging obsolete files or promoting relevant ones, by calling external services (CDN APIs) and/or performing local file system actions (copy, move, delete, etc.).

## Installation
Add this line to your application's Gemfile:
```ruby
gem "file_scanner"
```

And then execute:
```shell
bundle
```

Or install it yourself as:
```shell
gem install file_scanner
```

## Usage

### Loader
The first step is to create a `Loader` instance by specifying the path to scan, with an optional list of extensions:
```ruby
require "file_scanner"

loader = FileScanner::Loader.new(path: ENV["HOME"], extensions: %w[html txt])
```

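To get a feel for which paths such a loader would pick up, here is a minimal standalone sketch of a recursive glob built from a base path and an extensions list. It relies only on `File.join` and `Dir.glob` from the standard library; the `glob_pattern` helper and the temporary directory layout are assumptions made for illustration, not part of the gem's API.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: builds a recursive glob pattern from a base path
# and an optional list of extensions (any file when the list is empty).
def glob_pattern(path, extensions = [])
  wildcard = extensions.empty? ? "*" : "*.{#{extensions.join(",")}}"
  File.join(path, "**", wildcard)
end

matched = Dir.mktmpdir do |root|
  # Sample tree: one file at the top level, two inside a sub-directory.
  FileUtils.mkdir_p(File.join(root, "docs"))
  %w[index.html docs/notes.txt docs/photo.png].each do |name|
    File.write(File.join(root, name), "sample")
  end
  Dir.glob(glob_pattern(root, %w[html txt])).map { |f| File.basename(f) }.sort
end

puts matched.inspect # prints ["index.html", "notes.txt"]
```

Note that `**` also matches the base directory itself, so files at the top level are included together with nested ones.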
### Filters
The second step is to provide the list of filters: only the files for which every filter's `call` method returns a truthy value are selected.

#### Default
You can rely on the existing filters, which select files by:
* checking if the file was last accessed more than *30 days* ago
* checking if the file size is *at least 100 bytes*

You can configure the default behaviour by passing different arguments:
```ruby
accessed_a_week_ago = FileScanner::Filters::LastAccess.new(Time.now-7*24*3600)
one_to_two_mega = FileScanner::Filters::SizeRange.new(min: 1024**2, max: 2*1024**2)

filters = []
filters << accessed_a_week_ago
filters << one_to_two_mega
```

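The two checks above boil down to a time comparison on the last access time and a range membership test on the size. The following is a rough standalone sketch of those comparisons using only the standard library; the temporary file, its size and the cutoff values are assumptions chosen for illustration.

```ruby
require "tempfile"

day = 24 * 3600

file = Tempfile.new("sample")
file.write("x" * 2048)
file.flush

# Pretend the file was last accessed (and modified) 10 days ago.
ten_days_ago = Time.now - 10 * day
File.utime(ten_days_ago, ten_days_ago, file.path)

# Access-time check: a file passes when its atime is not newer than the cutoff.
week_cutoff = Time.now - 7 * day
month_cutoff = Time.now - 30 * day
older_than_week = week_cutoff >= File.atime(file.path)
older_than_month = month_cutoff >= File.atime(file.path)

# Size check: Range#=== tests whether the size in bytes falls within the bounds.
in_size_range = (1024..4096) === File.size(file.path)

puts older_than_week  # true: last access is older than a week
puts older_than_month # false: last access is more recent than 30 days
puts in_size_range    # true: 2048 bytes lies between 1024 and 4096

file.close! # close and remove the temporary file
```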
#### Custom
It is convenient to create custom filters by relying on `Proc` instances, which satisfy the *callable* protocol:
```ruby
filters << ->(file) { File.directory?(file) }
```

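As a quick illustration of how several callables combine, the sketch below keeps a path only when every filter accepts it. The two lambdas and the sample paths are hypothetical and only inspect the path strings, so nothing is read from disk.

```ruby
# Hypothetical custom filters, each one a lambda that receives a file path.
not_hidden = ->(file) { !File.basename(file).start_with?(".") }
has_extension = ->(file) { !File.extname(file).empty? }

filters = [not_hidden, has_extension]
paths = %w[/tmp/report.txt /tmp/.cache /tmp/README]

# A path is selected only when all the filters return a truthy value.
selected = paths.select { |path| filters.all? { |filter| filter.call(path) } }

puts selected.inspect # prints ["/tmp/report.txt"]
```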
### Policies
The third step is to create custom policy objects (no defaults exist) to be applied to the list of filtered paths.
Again, it suffices that the policy responds to the `call` method and accepts an array of paths as an argument:
```ruby
require "fileutils"

remove_from_disk = ->(paths) do
  FileUtils.rm_rf(paths)
end

policies = []
policies << remove_from_disk
```

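A policy does not have to be a lambda: any object exposing `call` fits. Below is a minimal sketch of a class-based policy that copies the received paths into a backup directory; the `BackupPolicy` name and the directory layout are assumptions made for this example.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical policy: copies every received path into a backup directory.
class BackupPolicy
  def initialize(destination)
    @destination = destination
  end

  # Receives an array of paths, as any policy is expected to.
  def call(paths)
    FileUtils.mkdir_p(@destination)
    FileUtils.cp(paths, @destination)
  end
end

copied = Dir.mktmpdir do |root|
  source = File.join(root, "a.txt")
  File.write(source, "hello")
  backup_dir = File.join(root, "backup")

  policies = [BackupPolicy.new(backup_dir)]
  policies.each { |policy| policy.call([source]) }

  # Read the copy back to confirm the policy did its job.
  File.read(File.join(backup_dir, "a.txt"))
end

puts copied # prints hello
```

Keeping the destination in the constructor lets the same class be reused with different targets while `call` keeps the single-argument signature.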
### Worker
Now that you have all of the collaborators in place, you can create the `Worker` instance:
```ruby
worker = FileScanner::Worker.new(loader: loader, filters: filters, policies: policies)
worker.call # apply all the specified policies to the filtered file paths
```

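Put together, the three steps form a small pipeline: load the paths, keep those accepted by every filter, then hand the result to each policy. The sketch below mimics that flow with plain standard-library calls instead of the gem's classes; the file names, the size threshold and the recording policy are all assumptions for illustration.

```ruby
require "tmpdir"

processed = []

Dir.mktmpdir do |root|
  File.write(File.join(root, "small.txt"), "x" * 10)
  File.write(File.join(root, "large.txt"), "x" * 500)

  # Step 1: load the candidate paths.
  paths = Dir.glob(File.join(root, "**", "*.txt"))

  # Step 2: keep only the paths accepted by every filter.
  filters = [->(file) { File.size(file) >= 100 }]
  selected = paths.select { |path| filters.all? { |filter| filter.call(path) } }

  # Step 3: pass the selected paths to each policy (here it just records names).
  policies = [->(files) { processed.concat(files.map { |f| File.basename(f) }) }]
  policies.each { |policy| policy.call(selected) }
end

puts processed.inspect # prints ["large.txt"]
```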
#### Slice of files
If you are going to scan a large number of files, it is better to work in batches.
This is exactly why the `Worker` class accepts a `slice` attribute to distribute the work and avoid saturating the resources used by the specified policies:
```ruby
worker = FileScanner::Worker.new(loader: loader, filters: filters, policies: policies, slice: 1000)
worker.call # call policies by slice of 1000 files
```

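To picture what batching means for a policy, the following sketch splits a list of made-up paths with `Enumerable#each_slice` and records the size of each batch. The paths and the recording lambda are hypothetical; only `each_slice` itself is standard Ruby.

```ruby
# 2500 made-up paths, nothing is read from disk.
paths = (1..2500).map { |i| "/tmp/file_#{i}.txt" }

# Hypothetical policy that only records how many paths each call receives.
batch_sizes = []
record_batch = ->(batch) { batch_sizes << batch.size }

# The policy is invoked once per batch of at most 1000 paths.
paths.each_slice(1000) { |batch| record_batch.call(batch) }

puts batch_sizes.inspect # prints [1000, 1000, 500]
```

The last batch simply holds the remainder, so a policy should not assume that every slice is full.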
#### Policies by block
If you prefer to specify the policies as a block that receives each slice of files, you can omit the `policies` argument altogether:
```ruby
worker = FileScanner::Worker.new(loader: loader, filters: filters)
worker.call do |slice|
  FileUtils.chmod_R(0700, slice)
end
```

#### Use a logger
If you want to trace what the worker is doing (including errors), you can pass a logger to the worker:
```ruby
my_logger = Logger.new("my_file.log")
worker = FileScanner::Worker.new(loader: loader, filters: filters, logger: my_logger)
worker.call # will log worker actions to my_file.log
```
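If you want to check the kind of trace to expect before wiring a real log file, a `Logger` can write to an in-memory `StringIO`. This is only a sketch with the standard `logger` library: the failing policy and the messages are assumptions, and unlike the worker it does not re-raise the error after logging it.

```ruby
require "logger"
require "stringio"

output = StringIO.new
logger = Logger.new(output)

# Hypothetical policy that fails when it receives no paths.
policy = ->(paths) { raise ArgumentError, "nothing to do" if paths.empty? }

begin
  # Block form: the message is built only if the INFO level is enabled.
  logger.info { "applying policy to 0 files" }
  policy.call([])
rescue StandardError => e
  logger.error { e.message }
end

puts output.string
```

The captured output should contain one `INFO` line for the policy being applied and one `ERROR` line carrying the exception message.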
data/Rakefile
ADDED
data/bin/console
ADDED
@@ -0,0 +1,14 @@
#!/usr/bin/env ruby

require "bundler/setup"
require "file_scanner"

# You can add fixtures and/or initialization code here to make experimenting
# with your gem easier. You can also use a different console, if you like.

# (If you use this, don't forget to add pry to your Gemfile!)
# require "pry"
# Pry.start

require "irb"
IRB.start(__FILE__)
data/bin/setup
ADDED
data/file_scanner.gemspec
ADDED
@@ -0,0 +1,20 @@
# coding: utf-8
lib = File.expand_path("../lib", __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require "file_scanner/version"

Gem::Specification.new do |s|
  s.name = "file_scanner"
  s.version = FileScanner::VERSION
  s.authors = ["costajob"]
  s.email = ["costajob@gmail.com"]
  s.summary = "A scanner routine that collect file paths basing on specified filters and apply to them a set of custom policies"
  s.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(spec|test|s|features)/}) }
  s.require_paths = ["lib"]
  s.license = "MIT"
  s.required_ruby_version = ">= 2.1.2"

  s.add_development_dependency "bundler", "~> 1.15"
  s.add_development_dependency "rake", "~> 10.0"
  s.add_development_dependency "minitest", "~> 5.0"
end
data/lib/file_scanner/filters.rb
ADDED
@@ -0,0 +1,31 @@
module FileScanner
  module Filters
    def self.defaults
      constants.map do |name|
        self.const_get(name).new
      end
    end

    class LastAccess
      DAY = 3600*24

      def initialize(atime = Time.now-30*DAY)
        @atime = atime
      end

      def call(file)
        @atime >= File.atime(file)
      end
    end

    class SizeRange
      def initialize(min: 100, max: Float::INFINITY)
        @range = min..max
      end

      def call(file)
        @range === File.size(file)
      end
    end
  end
end
data/lib/file_scanner/loader.rb
ADDED
@@ -0,0 +1,21 @@
module FileScanner
  class Loader
    def initialize(path:, extensions: [])
      @path = path
      @extensions = extensions
    end

    def call
      Dir.glob(files_path)
    end

    private def files_path
      File.join(@path, "**", extensions_path)
    end

    private def extensions_path
      return "*" if @extensions.empty?
      "*.{#{@extensions.join(",")}}"
    end
  end
end
data/lib/file_scanner/version.rb
ADDED
data/lib/file_scanner/worker.rb
ADDED
@@ -0,0 +1,42 @@
require "logger"

module FileScanner
  class Worker
    attr_reader :filters, :policies

    def initialize(loader:, filters: Filters::defaults, policies: [], logger: Logger.new(nil), slice: nil)
      @loader = loader
      @filters = filters
      @policies = policies
      @slice = slice.to_i
      @logger = logger
    end

    def call
      slices.each do |slice|
        yield(slice) if block_given? && policies.empty?
        policies.each do |policy|
          @logger.info { "applying \e[1m#{policy}\e[0m to \e[1m#{slice.size}\e[0m files" }
          policy.call(slice)
        end
      end
    rescue StandardError => e
      @logger.error { e.message }
      raise e
    end

    private def files
      @files ||= Array(@loader.call).select do |f|
        @filters.all? do |filter|
          @logger.info { "applying \e[1m#{filter.class}\e[0m to \e[1m#{File.basename(f)}\e[0m" }
          filter.call(f)
        end
      end
    end

    private def slices
      return [files] if @slice.zero?
      files.each_slice(@slice)
    end
  end
end
data/lib/file_scanner.rb
ADDED
metadata
ADDED
@@ -0,0 +1,100 @@
--- !ruby/object:Gem::Specification
name: file_scanner
version: !ruby/object:Gem::Version
  version: 1.0.3
platform: ruby
authors:
- costajob
autorequire:
bindir: bin
cert_chain: []
date: 2017-08-08 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: bundler
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.15'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.15'
- !ruby/object:Gem::Dependency
  name: rake
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '10.0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '10.0'
- !ruby/object:Gem::Dependency
  name: minitest
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '5.0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '5.0'
description:
email:
- costajob@gmail.com
executables: []
extensions: []
extra_rdoc_files: []
files:
- ".gitignore"
- ".travis.yml"
- Gemfile
- README.md
- Rakefile
- bin/console
- bin/setup
- file_scanner.gemspec
- lib/file_scanner.rb
- lib/file_scanner/filters.rb
- lib/file_scanner/loader.rb
- lib/file_scanner/version.rb
- lib/file_scanner/worker.rb
homepage:
licenses:
- MIT
metadata: {}
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: 2.1.2
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubyforge_project:
rubygems_version: 2.6.8
signing_key:
specification_version: 4
summary: A scanner routine that collect file paths basing on specified filters and
  apply to them a set of custom policies
test_files: []