file_scanner 1.1.0 → 2.0.1
- checksums.yaml +4 -4
- data/README.md +22 -45
- data/file_scanner.gemspec +1 -0
- data/lib/file_scanner/version.rb +1 -1
- data/lib/file_scanner/worker.rb +11 -15
- metadata +16 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 04135b6ae88b32be621d587dfa0ec5a9021062ff
+  data.tar.gz: 87e4d6a63ac8febc2f106b32749d68f4a11848de
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: bfc804f77a62758da9e32c6a472e254a008f2a7de736b965d6a94775b9311c73b833da4ce475bf06eeeefbf128d88467a00e647f1b500e90f555be9cb259125f
+  data.tar.gz: 23418775678248e6f8e84a54c7dab42383af56efb9e88ee9b7fa92e1355d539fe7fa7ef353c91e159c24985bcad61b5f523e6e9bfbe3948436507eac30b73e37
data/README.md
CHANGED
@@ -6,11 +6,14 @@
 * [Usage](#usage)
 * [Loader](#loader)
 * [Filters](#filters)
-
+* [Defaults](#defaults)
+* [Custom](#custom)
 * [Worker](#worker)
+* [Batches](#batches)
+* [Logger](#logger)

 ## Scope
-This gem is aimed to collect a set of file paths starting by a wildcard rule, filter them by any default/custom filters (access time, size range) and apply a set of
+This gem is aimed to collect a set of file paths starting by a wildcard rule, filter them by any default/custom filters (access time, matching name and size range) and apply a set of actions via a block call.

 ## Motivation
 This gem is helpful to purge obsolete files or to promote relevant ones, by calling external services (CDN APIs) and/or local file system actions (copy, move, delete, etc).
@@ -42,25 +45,21 @@ loader = FileScanner::Loader.new(path: ENV["HOME"], extensions: %w[html txt])
 ```

 ### Filters
-The second step is to provide the filters list to select file paths for which the `call` method is truthy
-Selection is done with the `any?` predicate, so also one matching filter will
+The second step is to provide the filters list to select file paths for which the `call` method is *truthy*.
+Selection is done with the `any?` predicate, so also one matching filter will do the selection.

-####
-If you specify no filters the
+#### Defaults
+If you specify no filters the default ones are loaded, selecting files by:
 * checking if file is older than *30 days*
 * checking if file size is within *0KB and 5KB*
 * checking if file *basename matches* the specified *regexp* (if any)

-You can update default behaviours by passing custom arguments:
+You can update default filters behaviours by passing custom arguments:
 ```ruby
 a_week_ago = FileScanner::Filters::LastAccess.new(Time.now-7*24*3600)
 one_two_mb = FileScanner::Filters::SizeRange.new(min: 1024**2, max: 2*1024**2)
 hidden = FileScanner::Filters::MatchingName.new(/^\./)
-
-filters = []
-filters << a_week_ago
-filters << one_two_mb
-filters << hidden
+filters = [a_week_ago, one_two_mb, hidden]
 ```

 #### Custom
@@ -69,51 +68,29 @@ It is convenient to create custom filters by creating `Proc` instances that sati
 filters << ->(file) { File.directory?(file) }
 ```

-### Policies
-The third step is creating custom policies objects (no defaults exist) to be applied to the list of filtered paths.
-Again, it suffice the policy responds to the `call` method and accepts an array of paths as unique argument:
-```ruby
-require "fileutils"
-
-remove_from_disk = ->(paths) do
-  FileUtils.rm_rf(paths)
-end
-
-policies = []
-policies << remove_from_disk
-```
-
 ### Worker
-Now that you have all of the collaborators in place, you can create the `Worker` instance:
-```ruby
-worker = FileScanner::Worker.new(loader: loader, filters: filters, policies: policies)
-worker.call # apply all the specified policies to the filtered file paths
-```
-
-#### Slice of files
-In case you are going to scan a large number of files, it is better to work in batches.
-The `Worker` constructor accepts a `slice` attribute to better distribute loading (no sleep by default, use block syntax):
+Now that you have all of the collaborators in place, you can create the `Worker` instance to performs actions on the filtered paths:
 ```ruby
-worker = FileScanner::Worker.new(loader: loader,
-worker.call
+worker = FileScanner::Worker.new(loader: loader, filters: filters)
+worker.call do |paths|
+  # do whatever you want with the paths list
+end
 ```

-####
-In case you
+#### Batches
+In case you are going to scan a large number of files, it is suggested to work in batches.
+The `Worker` constructor accepts a `slice` attribute to give you a chance to distribute loading:
 ```ruby
-worker = FileScanner::Worker.new(loader: loader)
+worker = FileScanner::Worker.new(loader: loader, slice: 1000)
 worker.call do |slice|
-
-  policy.call
-  sleep 10 # wait 10 seconds before slurping next slice
+  # perform action on a slice of 1000 paths
 end
 ```

-####
+#### Logger
 If you dare to trace what the worker is doing (including errors), you can specify a logger to the worker class:
 ```ruby
 my_logger = Logger.new("my_file.log")
-
 worker = FileScanner::Worker.new(loader: loader, logger: my_logger)
 worker.call do |slice|
   fail "Doh!" # will log error to my_file.log and re-raise exception
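The README change above drops the Policies section in favour of a block passed to `call`. As a rough sketch of how a 1.x-style policy could be reused under that contract, the snippet below applies a list of callables from inside the block. `SketchWorker` is a hypothetical stand-in written only for this illustration (it is not the gem's class, and its slicing logic is an assumption); it mimics the `yield(slice, @logger)` call visible in the `worker.rb` diff.

```ruby
require "logger"

# Hypothetical stand-in for the 2.x worker: it only mimics the
# "yield each slice plus the logger" contract, not the real gem.
class SketchWorker
  def initialize(paths:, slice: nil, logger: Logger.new(nil))
    @paths = paths
    @slice = slice.to_i # nil.to_i is 0, read here as "no slicing"
    @logger = logger
  end

  def call
    slices.each { |slice| yield(slice, @logger) if block_given? }
  end

  private def slices
    return [@paths] if @slice.zero?
    @paths.each_slice(@slice)
  end
end

# A 1.x-style policy: any object responding to `call` with an array of paths.
applied = []
record_paths = ->(paths) { applied << paths }
policies = [record_paths]

worker = SketchWorker.new(paths: %w[a.txt b.txt c.txt], slice: 2)

# A block with one parameter receives the slice and ignores the logger.
worker.call do |paths|
  policies.each { |policy| policy.call(paths) }
end

# A block with two parameters also receives the logger.
seen_loggers = []
worker.call { |_paths, logger| seen_loggers << logger }
```

Because Ruby blocks silently drop extra yielded arguments, the single-parameter form shown in the README keeps working even though the worker yields two values.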
data/file_scanner.gemspec
CHANGED
data/lib/file_scanner/version.rb
CHANGED
data/lib/file_scanner/worker.rb
CHANGED
@@ -2,7 +2,7 @@ require "logger"

 module FileScanner
   class Worker
-    attr_reader :filters
+    attr_reader :filters

     def self.default_logger
       Logger.new(nil).tap do |logger|
@@ -10,23 +10,16 @@ module FileScanner
       end
     end

-    def initialize(loader:,
-                   filters: Filters::defaults, policies: [],
-                   logger: self.class.default_logger, slice: nil)
+    def initialize(loader:, filters: Filters::defaults, logger: self.class.default_logger, slice: nil)
       @loader = loader
       @filters = filters
-      @policies = policies
       @slice = slice.to_i
       @logger = logger
     end

     def call
       slices.each do |slice|
-        yield(slice) if block_given?
-        policies.each do |policy|
-          @logger.info { "applying \e[33m#{policy}\e[0m to #{slice.size} files" }
-          policy.call(slice)
-        end
+        yield(slice, @logger) if block_given?
       end
     rescue StandardError => e
       @logger.error { e.message }
@@ -34,11 +27,14 @@ module FileScanner
     end

     private def files
-
-
-
-
-
+      paths = @loader.call
+      paths.select! { |file| filter(file) } || paths
+    end
+
+    private def filter(file)
+      @filters.any? do |filter|
+        @logger.info { "applying \e[33m#{filter}\e[0m to #{File.basename(file)}" }
+        filter.call(file)
       end
     end

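The new `files` method ends with `paths.select! { ... } || paths`. The fallback is needed because `Array#select!` returns `nil` when the block keeps every element, and `any?` stops at the first filter that returns a truthy value. The sketch below demonstrates both behaviours in plain Ruby; the `filtered` helper and the sample file names are illustrative assumptions, not code from the gem.

```ruby
# Array#select! returns nil when no element was removed...
unchanged = %w[a.log b.log]
result_nil = unchanged.select! { |f| f.end_with?(".log") }

# ...and returns the (mutated) receiver when something was removed.
changed = %w[a.log b.txt]
result_array = changed.select! { |f| f.end_with?(".log") }

# Illustrative helper mirroring the shape of the diffed method:
# fall back to the original array when select! reports "no change".
def filtered(paths, filters)
  paths.select! { |file| filters.any? { |filter| filter.call(file) } } || paths
end

# any? short-circuits: once one filter matches, later ones are not called.
calls = []
first = ->(_file) { calls << :first; true }
second = ->(_file) { calls << :second; true }
kept = filtered(%w[a.log], [first, second])
```

Without the `|| paths` fallback, a scan in which every path matches would hand `nil` to the caller instead of the full list.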
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: file_scanner
 version: !ruby/object:Gem::Version
-  version:
+  version: 2.0.1
 platform: ruby
 authors:
 - costajob
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2017-08-
+date: 2017-08-09 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -52,6 +52,20 @@ dependencies:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '5.0'
+- !ruby/object:Gem::Dependency
+  name: benchmark-ips
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '2.7'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '2.7'
 description:
 email:
 - costajob@gmail.com
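The metadata above records a new development dependency on `benchmark-ips` with the pessimistic constraint `~> 2.7`. The gemspec hunk itself is not shown in this diff, so the following is only a sketch of how such an entry could be declared and what the constraint accepts; the throwaway spec name and version are placeholders, not values from the package.

```ruby
require "rubygems"

# "~> 2.7" accepts 2.7 and later 2.x releases, but not 3.0.
requirement = Gem::Requirement.new("~> 2.7")
accepts_2_9 = requirement.satisfied_by?(Gem::Version.new("2.9.1"))
accepts_3_0 = requirement.satisfied_by?(Gem::Version.new("3.0.0"))

# Placeholder spec, used only to show how a development dependency is
# declared; the real gemspec line is an assumption.
spec = Gem::Specification.new do |s|
  s.name = "sketch"
  s.version = "0.0.0"
  s.add_development_dependency "benchmark-ips", "~> 2.7"
end
dependency = spec.dependencies.first
```

A development dependency like this is serialized with `type: :development`, which matches the entry added in the metadata hunk.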