pdfmd 1.9.1 → 2.0.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +22 -2
- data/README.md +2 -2
- data/TODO.mkd +26 -0
- data/bin/pdfmd +267 -1
- data/lib/pdfmd.rb +242 -634
- data/lib/pdfmd/explain.hiera.md +25 -4
- data/lib/pdfmd/long_desc.pdfmdconfig.txt +40 -0
- data/lib/pdfmd/long_desc.pdfmdedit.txt +166 -0
- data/lib/pdfmd/long_desc.pdfmdexplain.txt +16 -0
- data/lib/pdfmd/long_desc.pdfmdrename.txt +206 -0
- data/lib/pdfmd/long_desc.pdfmdshow.txt +92 -0
- data/lib/pdfmd/long_desc.pdfmdsort.txt +111 -0
- data/lib/pdfmd/long_desc.pdfmdstat.txt +23 -0
- data/lib/pdfmd/pdfmdconfig.rb +30 -0
- data/lib/pdfmd/pdfmdedit.rb +201 -0
- data/lib/pdfmd/pdfmdmethods.rb +125 -0
- data/lib/pdfmd/pdfmdrename.rb +243 -0
- data/lib/pdfmd/pdfmdshow.rb +88 -0
- data/lib/pdfmd/pdfmdsort.rb +115 -0
- data/lib/pdfmd/pdfmdstat.rb +117 -0
- data/lib/{string_extend.rb → pdfmd/string_extend.rb} +0 -0
- data/lib/run.rb +235 -0
- data/pdfmd.gemspec +3 -2
- metadata +23 -11
- data/lib/pdfmd/check.rb +0 -10
- data/lib/pdfmd/config.rb +0 -59
- data/lib/pdfmd/edit.rb +0 -144
- data/lib/pdfmd/rename.rb +0 -295
- data/lib/pdfmd/show.rb +0 -164
- data/lib/pdfmd/sort.rb +0 -199
data/lib/pdfmd/explain.hiera.md
CHANGED
@@ -2,10 +2,12 @@ Information about hiera: https://docs.puppetlabs.com/hiera/1/index.html
 
 Installation:
 
-```
+```
 $ gem install hiera
 ```
 
+This is a list of all available hiera settings.
+
 Configure default settings for pdfmd in hiera:
 
 
@@ -14,12 +16,29 @@ Configure default settings for pdfmd in hiera:
 pdfmd::config:
   default:
     password : secretpassword
+    loglevel : info|warn|error|debug
   sort:
     copy : true
     destination : /data/output
     interactive : true
     log : true
     logfile : /var/log/pdfmd.log
+  rename:
+    allkeywords : true
+    copy : true
+    defaultdoctype: doc
+    keywords : 4
+    outputdir : /data/output/sorted
+    log : true
+    logfile : /var/log/pdfmd.log
+  rename:
+    allkeywords : true
+    copy : true
+    defaultdoctype: doc
+    keywords : 4
+    outputdir : /data/output/sorted
+    log : true
+    logfile : /var/log/pdfmd.log
   rename:
     allkeywords : true
     copy : true
@@ -27,6 +46,8 @@ Configure default settings for pdfmd in hiera:
     keywords : 4
     outputdir : /data/output/sorted
   edit:
-
-
-
+    rename : true
+    opendoc : true
+    pdfviewer : evince
+    log : true
+    logfile : /var/log/pdfmd.log
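The keys in this file act as defaults for the options of each subcommand. The Ruby sketch below shows one way such a lookup could work. It assumes, and the diff does not confirm this, that a command line option wins over the Hiera value and that a built-in default is used last. The names `resolve_setting`, `HIERA_YAML` and `BUILTIN_DEFAULTS` are hypothetical and not part of pdfmd.

```ruby
require 'yaml'

# Hypothetical sample that mirrors the key names shown in explain.hiera.md.
HIERA_YAML = <<~YAML
  pdfmd::config:
    default:
      loglevel: info
    edit:
      rename: true
      opendoc: true
      pdfviewer: evince
      log: true
      logfile: /var/log/pdfmd.log
YAML

# Illustrative built-in fallbacks (assumed, taken from the documented defaults).
BUILTIN_DEFAULTS = { 'log' => true, 'logfile' => './.pdfmd.log' }.freeze

# Resolve one setting: command line option first, then the Hiera section
# of the command, then the built-in default.
def resolve_setting(command, key, cli_options, hiera_config)
  return cli_options[key] if cli_options.key?(key)

  section = hiera_config.fetch(command, {}) || {}
  return section[key] if section.key?(key)

  BUILTIN_DEFAULTS[key]
end

config = YAML.safe_load(HIERA_YAML).fetch('pdfmd::config')

puts resolve_setting('edit', 'logfile', {}, config)
puts resolve_setting('edit', 'logfile', { 'logfile' => '/tmp/pdfmd.log' }, config)
puts resolve_setting('sort', 'logfile', {}, config)
```

The three calls cover the three levels of the assumed precedence: a value taken from Hiera, a value overridden on the command line, and a command without a Hiera section that falls back to the built-in default.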
data/lib/pdfmd/long_desc.pdfmdconfig.txt
ADDED
@@ -0,0 +1,40 @@
+Shows the current default configuration as available in Hiera.
+
+
+
+== Usage
+
+
+Example: `pdfmd config [<command>]`
+
+
+
+== Parameter
+
+
+[<command>]
+
+
+Shows only the default configuration from hiera for the specified command.
+The command parameter is not case-sensitive.
+
+
+Example: `pdfmd config edit`
+
+
+
+--log, -l
+
+
+Enables/Disables logging.
+
+
+Default: true
+
+
+
+--logfile, -p
+
+
+Path to the logfile. Default: `./.pdfmd.log`
+
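The `[<command>]` parameter described above narrows the output to one section and ignores case. A minimal Ruby sketch of that selection follows. The hash contents and the method name `config_for` are hypothetical; only the section and key names are taken from the Hiera example.

```ruby
# Hypothetical parsed Hiera data; only the key names come from the docs.
HIERA_SETTINGS = {
  'default' => { 'loglevel' => 'info' },
  'edit'    => { 'rename' => true, 'pdfviewer' => 'evince' },
  'sort'    => { 'copy' => true, 'destination' => '/data/output' }
}.freeze

# Return every section, or only the section whose name matches +command+
# without regard to case.
def config_for(settings, command = nil)
  return settings if command.nil? || command.empty?

  settings.select { |name, _values| name.casecmp?(command) }
end

p config_for(HIERA_SETTINGS, 'EDIT')
p config_for(HIERA_SETTINGS).keys
```

An unknown command yields an empty hash in this sketch; how pdfmd itself reports that case is not shown in the diff.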
data/lib/pdfmd/long_desc.pdfmdedit.txt
ADDED
@@ -0,0 +1,166 @@
+== General
+
+The command edits the metadata of a PDF document. Multiple tags can be
+specified, or 'all'.
+
+If no value is provided for a meta tag, the command prompts for it
+interactively.
+
+Additionally the file can be renamed at the end according to the new meta
+tags. See `pdfmd help rename` for details.
+
+== Parameter
+
+--log, -l
+
+Enables/Disables the logging. Default: true
+
+
+
+--logfile, -p
+
+Path to the standard logfile. Default: ./.pdfmd.log
+
+
+
+--tag, -t
+
+Name or list of names of meta tag fields to set, separated by commas.
+
+If a value is provided, the current value will be replaced by the new value.
+
+This parameter has no Hiera equivalent and needs to be specified on the command line.
+
+
+
+--rename, -r
+
+Rename the file according to its meta tags after they have been updated.
+
+This parameter is identical to running `pdfmd rename <filename>`.
+
+
+
+== Hiera
+
+--- #YAML
+pdfmd::config:
+  edit:
+    rename : true|false
+    log : true|false
+    logfile : /var/log/pdfmd.log
+    opendoc : true|false
+    pdfviewer : evince
+
+
+opendoc:
+
+If set to true, the command tries to start the PDF viewer specified in 'pdfviewer' and display the PDF document while editing. If values are specified for all tags (e.g. 'author:John Doe'), the PDF viewer is not started; it is started only when user interaction is required.
+
+The viewer process is killed automatically when editing of the document has finished.
+
+There is no equivalent command line parameter for this.
+
+
+
+pdfviewer:
+
+Command to run the PDF viewer on the system. Default: evince
+
+There is no equivalent command line parameter for this.
+
+
+
+== General example:
+
+# Edit tag 'TAG' and set a new value interactively.
+
+$ pdfmd edit -t TAG <filename>
+
+# Edit tag 'Author' and set a new value interactively.
+
+$ pdfmd edit -t author example.pdf
+
+# Edit multiple tags and set new values interactively.
+
+$ pdfmd edit -t tag1,tag2,tag3 <filename>
+
+# Edit multiple tags and set new values in batch mode.
+
+$ pdfmd edit -t tag1:'value1',tag2:'value2' <filename>
+
+
+
+
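The `-t` argument in these examples carries either bare tag names (interactive mode) or `name:value` pairs (batch mode). The Ruby sketch below shows one way to split such an argument. It is not taken from pdfmd: the method name `parse_tag_argument` and the constant `ALL_TAGS` are hypothetical, and the handling of 'all' follows the description in the help text. Values that themselves contain a comma are not supported by this simple split.

```ruby
# Tag names that the keyword 'all' stands for, as listed in the help text.
ALL_TAGS = %w[author title subject createdate keywords].freeze

# Turn "author:Me,title" into { 'author' => 'Me', 'title' => nil }.
# A nil value means the tag has to be asked for interactively.
def parse_tag_argument(argument)
  pairs = argument.split(',').map do |entry|
    # Split on the first colon only, so values such as timestamps keep theirs.
    name, value = entry.split(':', 2)
    [name.strip.downcase, value]
  end

  # 'all' expands to every supported tag without a preset value.
  return ALL_TAGS.map { |tag| [tag, nil] }.to_h if pairs.any? { |name, _| name == 'all' }

  pairs.to_h
end

p parse_tag_argument('tag1,tag2,tag3')
p parse_tag_argument('author:Me,createdate:1970:01:01 01:01:01')
```

The quotes from the shell examples do not appear here because the shell removes them before the program sees the argument.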
+== Multiple Tags
+
+To set multiple tags, list the tags separated by commas.
+
+To set all tags (Author, Title, Subject, CreateDate, Keywords), use the keyword 'all' as the tag name.
+
+# Set tags 'Author', 'Title', 'Subject' in example.pdf interactively.
+
+$ pdfmd edit -t author,title,subject example.pdf
+
+# Set tags 'Author', 'Title', 'Subject', 'CreateDate', 'Keywords' in example.pdf interactively:
+
+$ pdfmd edit -t all example.pdf
+
+# Set tags 'Author', 'CreateDate' in example.pdf in batch mode (non-interactive):
+
+$ pdfmd edit -t author:'Me',createdate:'1970:00:00 01:01:01' example.pdf
+
+$ pdfmd edit -t author:'Me',Createdate:19700000 example.pdf
+
+
+
+== Tag: CreateDate
+
+To make entering dates and times easier and faster, the value entered for the 'CreateDate' field is matched against a set of known formats.
+
+The following formats are identified/matched:
+
+yyyymmdd
+
+yyyymmd
+
+yyyymmddHHMMSS
+
+yyyy-mm-dd HH:MM:SS
+
+yyyy:mm:dd HH:MM:SS
+
+yyyy.mm.dd HH:MM:SS
+
+yyyy-mm-d
+
+yyyy-mm-dd
+
+yyyy.mm.d
+
+yyyy.mm.dd
+
+yyyy:mm:d
+
+yyyy:mm:dd
+
+- If HH:MM:SS or HHMMSS is not provided, those values are automatically set to zero.
+
+- The output format of every timestamp is <yyyy:mm:dd HH:MM:SS>
+
+- When an invalid date is provided, it is rejected and the user is asked to provide a correct date.
+
+
+
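The CreateDate rules above (accepted input formats, zero-filled time, fixed output format, rejection of invalid dates) can be sketched in Ruby as follows. This is an illustration of the documented behaviour, not pdfmd's own code: `normalize_create_date` and `CREATE_DATE_PATTERNS` are hypothetical names, and the sketch returns `nil` for rejected input instead of prompting the user again. It is slightly more permissive than the list, since a separated date with a one-digit day may also carry a time.

```ruby
require 'date'

# One pattern per family of formats from the help text.
CREATE_DATE_PATTERNS = [
  # yyyymmddHHMMSS
  /\A(?<y>\d{4})(?<m>\d{2})(?<d>\d{2})(?<hh>\d{2})(?<mi>\d{2})(?<s>\d{2})\z/,
  # yyyymmdd, yyyymmd
  /\A(?<y>\d{4})(?<m>\d{2})(?<d>\d{1,2})\z/,
  # yyyy-mm-dd, yyyy:mm:dd, yyyy.mm.dd, one-digit day, optional " HH:MM:SS"
  /\A(?<y>\d{4})(?<sep>[-:.])(?<m>\d{2})\k<sep>(?<d>\d{1,2})(?: (?<hh>\d{2}):(?<mi>\d{2}):(?<s>\d{2}))?\z/
].freeze

# Return the timestamp as "yyyy:mm:dd HH:MM:SS", or nil if the input does
# not match a known format or is not a valid calendar date.
def normalize_create_date(input)
  text = input.to_s.strip
  match = CREATE_DATE_PATTERNS.map { |pattern| pattern.match(text) }.compact.first
  return nil unless match

  parts = match.named_captures
  year, month, day = parts.values_at('y', 'm', 'd').map(&:to_i)
  # Missing time parts are nil here, and nil.to_i is 0.
  hour, minute, second = parts.values_at('hh', 'mi', 's').map(&:to_i)
  return nil unless Date.valid_date?(year, month, day)
  return nil unless hour.between?(0, 23) && minute.between?(0, 59) && second.between?(0, 59)

  format('%04d:%02d:%02d %02d:%02d:%02d', year, month, day, hour, minute, second)
end

puts normalize_create_date('19700101')            # 1970:01:01 00:00:00
puts normalize_create_date('1970.01.01 01:02:03') # 1970:01:01 01:02:03
p normalize_create_date('1970-13-01')             # nil
```

Requiring the same separator on both sides of the month (the `\k<sep>` backreference) keeps mixed input such as `1970-01:01` from being accepted.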
+== Rename file
+
+In addition to setting the tags, the current file can be renamed according to
+the new metadata.
+
+# Set tag 'Author' and rename file example.pdf
+
+$ pdfmd edit -t author -r example.pdf
+
+See `pdfmd help rename` for details about renaming.
+
+To enable this feature in Hiera, add the key 'rename' to the section 'edit' with the value 'true'.
data/lib/pdfmd/long_desc.pdfmdrename.txt
ADDED
@@ -0,0 +1,206 @@
+== General
+
+Rename a file according to the meta tags in the document.
+
+== Parameter
+
+--dry-run, -n
+
+Simulate the renaming process and show the result without changing the file. Default: false
+
+
+
+--dryrun, -n
+
+Simulate what would be done without actually doing it. Default: false
+
+
+
+--allkeywords, -a
+
+Use all keywords from the meta information in the file name and ignore the limit. Default: false
+
+
+
+--nrkeywords, -k
+
+Set the number of keywords used in the filename to a new value. Default: 3
+
+
+
+--outputdir, -o
+
+Rename the file and move it to the directory defined in '--outputdir'. Default: ./
+
+
+
+--copy, -c
+
+Copy the file instead of moving it to the new name or destination. Default: false
+
+
+
+--log, -l
+
+Enable logging. Default: true
+
+
+
+--logfile, -p
+
+Define logfile path. Default: ./.pdfmd.log
+
+
+
+== Hiera
+
+--- #YAML
+pdfmd::config:
+  rename:
+    dryrun : true|false
+    allkeywords : true|false
+    outputdir : /tmp
+    nrkeywords : 3
+    copy : true|false
+    log : true|false
+    logfile : /var/log/pdfmd.log
+
+
+
+== Example
+
+# Rename the file according to the metatags
+
+$ pdfmd rename <filename>
+
+# Rename example.pdf according to the metatags
+
+$ pdfmd rename example.pdf
+
+# Simulate renaming example.pdf according to the metatags (dry-run)
+$ pdfmd rename -n example.pdf
+
+
+
+== Hiera
+
+Some Hiera settings cannot be set by a command line parameter.
+
+defaultdoctype: Defines the abbreviation for the default document type. It is used when no other document type can be determined from the metadata field 'title'. Default value is 'doc'.
+
+For details on how to set the parameter, see 'pdfmd explain hiera'.
+
+
+== Rules
+
+The following rules govern how documents are renamed:
+
+Rule 1: All documents have the following filenaming structure:
+
+<yyyymmdd>-<author>-<type>-<additionalInformation>.<extension>
+
+# <yyyymmdd>: Year, month and day identical to the meta information in the document.
+
+# <author>: Author of the document, identical to the meta information
+in the document. Special characters and whitespaces are replaced.
+
+# <type>: Document type, generated from the title field in the metadata of the document. The document type is a three-character abbreviation determined as follows:
+
+con => Contract
+
+inv => Invoice
+
+inf => Information
+
+man => Manual
+
+off => Offer
+
+ord => Order
+
+rpt => Receipt
+
+tic => Ticket
+
+If the document type cannot be determined automatically, it defaults to 'dok'.
+
+This default behavior was introduced with version 1.8.1 and can be overridden in Hiera.
+See `pdfmd explain hiera-keys` for information on how to do this.
+
+# <additionalInformation>: Information generated from the metadata fields
+'title', 'subject' and 'keywords'.
+
+If 'Title' or 'Keywords' contains one of the following keywords, it is replaced with the corresponding abbreviation followed by the specified value:
+
+Contract => con
+
+Invoice => inv
+
+Information => inf
+
+Manual => man
+
+Offer => off
+
+Order => ord
+
+Receipt => rpt
+
+Ticket => tic
+
+
+This mapping can also be overridden by defining the 'keys' hash in Hiera.
+
+Rule 2: The number of keywords used in the filename is defined by the parameter '-k'. See the section of that parameter for more details and the default value.
+
+Rule 3: Keywords matching 'kvi','fak','ord','kdn' are prioritised.
+
+Rule 4: Special characters and whitespaces are replaced:
+
+' ' => '_'
+
+'/' => '_'
+
+Rule 5: The new filename has only lowercase characters.
+
+
+
+== Example (detailed)
+
+# Example PDF with the following meta tags:
+
+Filename : example.pdf
+
+Author : John
+
+Subject : new Product
+
+Title : Presentation
+
+CreateDate : 1970:01:01 01:00:00
+
+Keywords : John Doe, Jane Doe, Mister Doe
+
+
+# Renaming the file
+
+$ pdfmd rename example.pdf
+
+example.pdf => 19700101-john-dok-new_product-john_doe-jane_doe.pdf
+
+
+
+# Simulation to rename the file (no actual change)
+
+$ pdfmd rename -n example.pdf
+
+example.pdf => 19700101-john-dok-new_product-john_doe-jane_doe.pdf
+
+
+
+# Renaming the file with all keywords
+
+$ pdfmd rename -n -a example.pdf
+
+example.pdf => 19700101-john-dok-new_product-john_doe-jane_doe-mister_doe.pdf
+
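The naming rules in this file can be traced with a short Ruby sketch that rebuilds the filenames from the detailed example. It is an illustration under stated assumptions, not the code shipped in `pdfmdrename.rb`: the method names are hypothetical, the extension is fixed to `.pdf`, the subject is assumed to count toward the keyword limit (a guess that happens to reproduce the documented output), and Rule 3 (prioritised keywords) as well as the keyword replacement inside `<additionalInformation>` are left out.

```ruby
# Document type abbreviations from Rule 1.
DOCTYPE_ABBREVIATIONS = {
  'contract' => 'con', 'invoice' => 'inv', 'information' => 'inf',
  'manual' => 'man', 'offer' => 'off', 'order' => 'ord',
  'receipt' => 'rpt', 'ticket' => 'tic'
}.freeze

# Rule 4 and Rule 5: replace spaces and slashes, then lowercase.
def sanitize(text)
  text.strip.gsub(%r{[ /]}, '_').downcase
end

# Derive the document type from the title, falling back to the default.
def doctype_for(title, default_doctype)
  word = DOCTYPE_ABBREVIATIONS.keys.find { |key| title.downcase.include?(key) }
  word ? DOCTYPE_ABBREVIATIONS[word] : default_doctype
end

# Build <yyyymmdd>-<author>-<type>-<additionalInformation>.pdf
def build_filename(meta, limit: 3, all_keywords: false, default_doctype: 'dok')
  date = meta.fetch(:createdate)[0, 10].delete(':')
  # Assumption: the subject comes first and counts toward the limit.
  extras = [meta.fetch(:subject)] + meta.fetch(:keywords).split(',')
  extras = extras.first(limit) unless all_keywords

  parts = [date, sanitize(meta.fetch(:author)), doctype_for(meta.fetch(:title), default_doctype)]
  parts.concat(extras.map { |item| sanitize(item) })
  "#{parts.join('-')}.pdf"
end

meta = {
  author: 'John', subject: 'new Product', title: 'Presentation',
  createdate: '1970:01:01 01:00:00', keywords: 'John Doe, Jane Doe, Mister Doe'
}

puts build_filename(meta)
puts build_filename(meta, all_keywords: true)
```

With the metadata of the detailed example, the first call prints `19700101-john-dok-new_product-john_doe-jane_doe.pdf` and the second adds `-mister_doe`, matching the output shown for `pdfmd rename` and `pdfmd rename -n -a`.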