pdfmd 1.9.1 → 2.0.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +22 -2
- data/README.md +2 -2
- data/TODO.mkd +26 -0
- data/bin/pdfmd +267 -1
- data/lib/pdfmd.rb +242 -634
- data/lib/pdfmd/explain.hiera.md +25 -4
- data/lib/pdfmd/long_desc.pdfmdconfig.txt +40 -0
- data/lib/pdfmd/long_desc.pdfmdedit.txt +166 -0
- data/lib/pdfmd/long_desc.pdfmdexplain.txt +16 -0
- data/lib/pdfmd/long_desc.pdfmdrename.txt +206 -0
- data/lib/pdfmd/long_desc.pdfmdshow.txt +92 -0
- data/lib/pdfmd/long_desc.pdfmdsort.txt +111 -0
- data/lib/pdfmd/long_desc.pdfmdstat.txt +23 -0
- data/lib/pdfmd/pdfmdconfig.rb +30 -0
- data/lib/pdfmd/pdfmdedit.rb +201 -0
- data/lib/pdfmd/pdfmdmethods.rb +125 -0
- data/lib/pdfmd/pdfmdrename.rb +243 -0
- data/lib/pdfmd/pdfmdshow.rb +88 -0
- data/lib/pdfmd/pdfmdsort.rb +115 -0
- data/lib/pdfmd/pdfmdstat.rb +117 -0
- data/lib/{string_extend.rb → pdfmd/string_extend.rb} +0 -0
- data/lib/run.rb +235 -0
- data/pdfmd.gemspec +3 -2
- metadata +23 -11
- data/lib/pdfmd/check.rb +0 -10
- data/lib/pdfmd/config.rb +0 -59
- data/lib/pdfmd/edit.rb +0 -144
- data/lib/pdfmd/rename.rb +0 -295
- data/lib/pdfmd/show.rb +0 -164
- data/lib/pdfmd/sort.rb +0 -199
data/lib/pdfmd/explain.hiera.md
CHANGED
@@ -2,10 +2,12 @@ Information about hiera: https://docs.puppetlabs.com/hiera/1/index.html
|
|
2
2
|
|
3
3
|
Installation:
|
4
4
|
|
5
|
-
```
|
5
|
+
```
|
6
6
|
$ gem install hiera
|
7
7
|
```
|
8
8
|
|
9
|
+
This is a list of all available hiera settings.
|
10
|
+
|
9
11
|
Configure default settings for pdfmd in hiera:
|
10
12
|
|
11
13
|
|
@@ -14,12 +16,29 @@ Configure default settings for pdfmd in hiera:
|
|
14
16
|
pdfmd::config:
|
15
17
|
default:
|
16
18
|
password : secretpassword
|
19
|
+
loglevel : info|warn|error|debug
|
17
20
|
sort:
|
18
21
|
copy : true
|
19
22
|
destination : /data/output
|
20
23
|
interactive : true
|
21
24
|
log : true
|
22
25
|
logfile : /var/log/pdfmd.log
|
26
|
+
rename:
|
27
|
+
allkeywords : true
|
28
|
+
copy : true
|
29
|
+
defaultdoctype: doc
|
30
|
+
keywords : 4
|
31
|
+
outputdir : /data/output/sorted
|
32
|
+
log : true
|
33
|
+
logfile : /var/log/pdfmd.log
|
34
|
+
rename:
|
35
|
+
allkeywords : true
|
36
|
+
copy : true
|
37
|
+
defaultdoctype: doc
|
38
|
+
keywords : 4
|
39
|
+
outputdir : /data/output/sorted
|
40
|
+
log : true
|
41
|
+
logfile : /var/log/pdfmd.log
|
23
42
|
rename:
|
24
43
|
allkeywords : true
|
25
44
|
copy : true
|
@@ -27,6 +46,8 @@ Configure default settings for pdfmd in hiera:
|
|
27
46
|
keywords : 4
|
28
47
|
outputdir : /data/output/sorted
|
29
48
|
edit:
|
30
|
-
|
31
|
-
|
32
|
-
|
49
|
+
rename : true
|
50
|
+
opendoc : true
|
51
|
+
pdfviewer : evince
|
52
|
+
log : true
|
53
|
+
logfile : /var/log/pdfmd.log
|
@@ -0,0 +1,40 @@
|
|
1
|
+
Shows the current default configuration as available in Hiera.
|
2
|
+
|
3
|
+
|
4
|
+
|
5
|
+
== Usage
|
6
|
+
|
7
|
+
|
8
|
+
Example: `pdfmd config [<command>]`
|
9
|
+
|
10
|
+
|
11
|
+
|
12
|
+
== Parameter
|
13
|
+
|
14
|
+
|
15
|
+
[<command>]
|
16
|
+
|
17
|
+
|
18
|
+
Shows only the default configuration from hiera for the specified command.
|
19
|
+
The command parameter is not case-sensitive.
|
20
|
+
|
21
|
+
|
22
|
+
Example: `pdfmd config edit`
|
23
|
+
|
24
|
+
|
25
|
+
|
26
|
+
--log, -l
|
27
|
+
|
28
|
+
|
29
|
+
Enables/Disables logging.
|
30
|
+
|
31
|
+
|
32
|
+
Default: true
|
33
|
+
|
34
|
+
|
35
|
+
|
36
|
+
--logfile, -p
|
37
|
+
|
38
|
+
|
39
|
+
Path to the logfile. Default: `./.pdfmd.log`
|
40
|
+
|
@@ -0,0 +1,166 @@
|
|
1
|
+
== General
|
2
|
+
|
3
|
+
Command will edit the metadata of a PDF document. Multiple values can be
|
4
|
+
specified or 'all'.
|
5
|
+
|
6
|
+
The command will invoke an interactive user input and request the values
|
7
|
+
for the metatag if no value is provided.
|
8
|
+
|
9
|
+
Additionally the file can be renamed at the end according to the new meta
|
10
|
+
tags. See `pdfmd help rename` for details.
|
11
|
+
|
12
|
+
== Parameter
|
13
|
+
|
14
|
+
--log, -l
|
15
|
+
|
16
|
+
Enables/Disables the logging. Default: true
|
17
|
+
|
18
|
+
|
19
|
+
|
20
|
+
--logfile, -p
|
21
|
+
|
22
|
+
Path to the standard logfile. Default: ./.pdfmd.log
|
23
|
+
|
24
|
+
|
25
|
+
|
26
|
+
--tag, -t
|
27
|
+
|
28
|
+
Name, or comma-separated list of names, of the metatag fields to set.
|
29
|
+
|
30
|
+
If a value is provided, the current value will be replaced by the new value.
|
31
|
+
|
32
|
+
This parameter has no equivalent in Hiera and must be specified on the command line.
|
33
|
+
|
34
|
+
|
35
|
+
|
36
|
+
--rename, -r
|
37
|
+
|
38
|
+
Rename file after updating the meta tag information according to the fields.
|
39
|
+
|
40
|
+
This parameter is identical to running `pdfmd rename <filename>`.
|
41
|
+
|
42
|
+
|
43
|
+
|
44
|
+
== Hiera
|
45
|
+
|
46
|
+
--- #YAML
|
47
|
+
pdfmd::config:
|
48
|
+
edit:
|
49
|
+
rename : true|false
|
50
|
+
log : true|false
|
51
|
+
logfile : /var/log/pdfmd.log
|
52
|
+
opendoc : true|false
|
53
|
+
pdfviewer : evince
|
54
|
+
|
55
|
+
|
56
|
+
opendoc:
|
57
|
+
|
58
|
+
If set to true, the command tries to start the PDF viewer specified in 'pdfviewer' and displays the PDF document while editing. The viewer is started only if some user interaction is required; if values are specified for all tags (e.g. 'author:John Doe'), it is not started.
|
59
|
+
|
60
|
+
The viewer process that was started is killed automatically (by its PID) once editing of the document has finished.
|
61
|
+
|
62
|
+
There is no equivalent command line parameter for this.
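The 'opendoc' behaviour described above (start a viewer, edit, then kill the viewer by PID) could look roughly like the following. This is a minimal sketch, not pdfmd's actual code: the helper name `with_viewer` and its arguments are hypothetical, and only Ruby's standard `Process` API is used.

```ruby
# Hypothetical helper, not taken from pdfmd's source.
# Starts the viewer command for `path`, yields the viewer's PID while the
# caller does its interactive work, then terminates and reaps the viewer.
def with_viewer(command, path)
  pid = Process.spawn(*command, path)
  yield pid
ensure
  if pid
    begin
      Process.kill('TERM', pid) # ask the viewer to exit
      Process.wait(pid)         # reap the child so no zombie is left behind
    rescue Errno::ESRCH, Errno::ECHILD
      # The viewer was already gone; nothing left to clean up.
    end
  end
end
```

A call such as `with_viewer(['evince'], 'example.pdf') { |pid| ... }` would keep the viewer open only for the duration of the block. The `ensure` clause also closes the viewer if the editing step raises an error.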
|
63
|
+
|
64
|
+
|
65
|
+
|
66
|
+
pdfviewer:
|
67
|
+
|
68
|
+
Command to run the pdf viewer on the system. Default: evince
|
69
|
+
|
70
|
+
There is no equivalent command line parameter for this.
|
71
|
+
|
72
|
+
|
73
|
+
|
74
|
+
== General example:
|
75
|
+
|
76
|
+
# Edit tag 'TAG' and set a new value interactively.
|
77
|
+
|
78
|
+
$ pdfmd edit -t TAG <filename>
|
79
|
+
|
80
|
+
# Edit tag 'Author' and set a new value interactively.
|
81
|
+
|
82
|
+
$ pdfmd edit -t author example.pdf
|
83
|
+
|
84
|
+
# Edit multiple tags and set new values interactively.
|
85
|
+
|
86
|
+
$ pdfmd edit -t tag1,tag2,tag3 <filename>
|
87
|
+
|
88
|
+
# Edit multiple tags and set new values in batch mode.
|
89
|
+
|
90
|
+
$ pdfmd edit -t tag1:'value1',tag2:'value2' <filename>
|
91
|
+
|
92
|
+
|
93
|
+
|
94
|
+
|
95
|
+
== Multiple Tags
|
96
|
+
|
97
|
+
To set multiple tags, list the tags separated by commas.
|
98
|
+
|
99
|
+
To set all tags (Author, Title, Subject, CreateDate, Keywords), use the keyword 'all' as the tag name.
|
100
|
+
|
101
|
+
# Set tags 'Author', 'Title', 'Subject' in example.pdf interactively.
|
102
|
+
|
103
|
+
$ pdfmd edit -t author,title,subject example.pdf
|
104
|
+
|
105
|
+
# Set tags 'Author', 'Title', 'Subject', 'CreateDate', 'Keywords' in example.pdf interactively:
|
106
|
+
|
107
|
+
$ pdfmd edit -t all example.pdf
|
108
|
+
|
109
|
+
# Set tags 'Author', 'CreateDate' in example.pdf in batch mode (non-interactive):
|
110
|
+
|
111
|
+
pdfmd edit -t author:'Me',createdate:'1970:00:00 01:01:01' example.pdf
|
112
|
+
|
113
|
+
pdfmd edit -t author:'Me',Createdate:19700000 example.pdf
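The `-t` argument forms shown above (plain names, `name:value` pairs, and 'all') could be parsed along these lines. This is only a sketch under stated assumptions: `parse_tag_argument` and `ALL_TAGS` are hypothetical names, not pdfmd's own, and the shell is assumed to have already removed the quotes around values.

```ruby
# Hypothetical helper, not taken from pdfmd's source.
ALL_TAGS = %w[author title subject createdate keywords].freeze

# Returns a hash of tag name => value. A nil value means "ask interactively".
def parse_tag_argument(arg)
  # 'all' expands to every supported tag, each without a preset value.
  return ALL_TAGS.to_h { |name| [name, nil] } if arg.strip.downcase == 'all'

  arg.split(',').each_with_object({}) do |item, tags|
    # Split on the first colon only, so values such as
    # '1970:01:01 01:01:01' keep their own colons.
    name, value = item.split(':', 2)
    value = value&.strip
    tags[name.strip.downcase] = value.nil? || value.empty? ? nil : value
  end
end
```

Because the list is split on commas, this sketch cannot carry a value that itself contains a comma (for example a keyword list); how pdfmd handles that case is not stated here.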
|
114
|
+
|
115
|
+
|
116
|
+
|
117
|
+
== Tag: CreateDate
|
118
|
+
|
119
|
+
To make entering dates and times for the 'CreateDate' field easier and faster, the input is matched against a set of known formats.
|
120
|
+
|
121
|
+
The following formats are identified/matched:
|
122
|
+
|
123
|
+
yyyymmdd
|
124
|
+
|
125
|
+
yyyymmd
|
126
|
+
|
127
|
+
yyyymmddHHMMSS
|
128
|
+
|
129
|
+
yyyy-mm-dd HH:MM:SS
|
130
|
+
|
131
|
+
yyyy:mm:dd HH:MM:SS
|
132
|
+
|
133
|
+
yyyy.mm.dd HH:MM:SS
|
134
|
+
|
135
|
+
yyyy-mm-d
|
136
|
+
|
137
|
+
yyyy-mm-dd
|
138
|
+
|
139
|
+
yyyy.mm.d
|
140
|
+
|
141
|
+
yyyy.mm.dd
|
142
|
+
|
143
|
+
yyyy:mm:d
|
144
|
+
|
145
|
+
yyyy:mm:dd
|
146
|
+
|
147
|
+
- If HH:MM:SS or HHMMSS is not provided, those values are automatically set to zero.
|
148
|
+
|
149
|
+
- The output format of every timestamp is <yyyy:mm:dd HH:MM:SS>
|
150
|
+
|
151
|
+
- When an invalid date is provided, it is rejected and the user is asked to provide a correct date.
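The matching described above can be illustrated with a short sketch. It is an assumption-laden illustration, not pdfmd's implementation: the constants `COMPACT` and `SEPARATED` and the function `normalize_createdate` are hypothetical, and returning `nil` stands in for "reject and ask again".

```ruby
require 'date'

# Hypothetical patterns, not taken from pdfmd's source.
# yyyymmdd, yyyymmd, yyyymmddHHMMSS
COMPACT = /\A(?<y>\d{4})(?<m>\d{2})(?<d>\d{1,2})(?:(?<hh>\d{2})(?<mi>\d{2})(?<ss>\d{2}))?\z/
# yyyy-mm-dd, yyyy.mm.d, yyyy:mm:dd HH:MM:SS, ... (same separator twice)
SEPARATED = /\A(?<y>\d{4})(?<sep>[-:.])(?<m>\d{2})\k<sep>(?<d>\d{1,2})(?: (?<hh>\d{2}):(?<mi>\d{2}):(?<ss>\d{2}))?\z/

# Returns 'yyyy:mm:dd HH:MM:SS', or nil if the input is not a valid date.
def normalize_createdate(input)
  text = input.strip
  match = COMPACT.match(text) || SEPARATED.match(text)
  return nil unless match

  year, month, day = match[:y].to_i, match[:m].to_i, match[:d].to_i
  # A missing time part yields nil captures, and nil.to_i is 0.
  hour, minute, second = [match[:hh], match[:mi], match[:ss]].map(&:to_i)
  return nil unless Date.valid_date?(year, month, day)
  return nil unless hour.between?(0, 23) && minute.between?(0, 59) && second.between?(0, 59)

  format('%04d:%02d:%02d %02d:%02d:%02d', year, month, day, hour, minute, second)
end
```

For instance, `normalize_createdate('1970-01-1')` gives `'1970:01:01 00:00:00'`, while an impossible date such as `'19700230'` gives `nil`.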
|
152
|
+
|
153
|
+
|
154
|
+
|
155
|
+
== Rename file
|
156
|
+
|
157
|
+
In addition to setting the tags the current file can be renamed according to
|
158
|
+
the new metadata.
|
159
|
+
|
160
|
+
# Set tag 'Author' and rename file example.pdf
|
161
|
+
|
162
|
+
$ pdfmd edit -t author -r example.pdf
|
163
|
+
|
164
|
+
See `pdfmd help rename` for details about renaming.
|
165
|
+
|
166
|
+
To enable this feature in hiera add the key 'rename' into the section 'edit' with the value 'true'.
|
@@ -0,0 +1,206 @@
|
|
1
|
+
== General
|
2
|
+
|
3
|
+
Rename a file with the meta tags in the document.
|
4
|
+
|
5
|
+
== Parameter
|
6
|
+
|
7
|
+
--dry-run, -n
|
8
|
+
|
9
|
+
Simulate the renaming process and show the result without changing the file. Default: false
|
10
|
+
|
11
|
+
|
12
|
+
|
13
|
+
--dryrun, -n
|
14
|
+
|
15
|
+
Simulate what would be done without actually doing it. Default: false
|
16
|
+
|
17
|
+
|
18
|
+
|
19
|
+
--allkeywords, -a
|
20
|
+
|
21
|
+
Use all keywords from the meta information in the file name and ignore the limit. Default: false
|
22
|
+
|
23
|
+
|
24
|
+
|
25
|
+
--nrkeywwords, -k
|
26
|
+
|
27
|
+
Set the number of keywords used in the filename to a new value. Default: 3
|
28
|
+
|
29
|
+
|
30
|
+
|
31
|
+
--outputdir, -o
|
32
|
+
|
33
|
+
Rename the file and move it to the directory defined in '--outputdir'. Default: ./
|
34
|
+
|
35
|
+
|
36
|
+
|
37
|
+
--copy, -c
|
38
|
+
|
39
|
+
Copy the file instead of moving it to the new name or destination. Default: false
|
40
|
+
|
41
|
+
|
42
|
+
|
43
|
+
--log, -l
|
44
|
+
|
45
|
+
Enable logging. Default: true
|
46
|
+
|
47
|
+
|
48
|
+
|
49
|
+
--logfile, -p
|
50
|
+
|
51
|
+
Define logfile path. Default: ./.pdfmd.log
|
52
|
+
|
53
|
+
|
54
|
+
|
55
|
+
== Hiera
|
56
|
+
|
57
|
+
--- #YAML
|
58
|
+
pdfmd::config:
|
59
|
+
rename:
|
60
|
+
dryrun : true|false
|
61
|
+
allkeywords : true|false
|
62
|
+
outputdir : /tmp
|
63
|
+
nrkeywords : 3
|
64
|
+
copy : true|false
|
65
|
+
log : true|false
|
66
|
+
logfile : /var/log/pdfmd.log
|
67
|
+
|
68
|
+
|
69
|
+
|
70
|
+
== Example
|
71
|
+
|
72
|
+
# Rename the file according to the metatags
|
73
|
+
|
74
|
+
$ pdfmd rename <filename>
|
75
|
+
|
76
|
+
# Rename example.pdf according to the metatags
|
77
|
+
|
78
|
+
$ pdfmd rename example.pdf
|
79
|
+
|
80
|
+
# Simulate renaming example.pdf according to the metatags (dry-run)
|
81
|
+
$ pdfmd rename -n example.pdf
|
82
|
+
|
83
|
+
|
84
|
+
|
85
|
+
== Hiera
|
86
|
+
|
87
|
+
Some Hiera settings cannot be set by a command line parameter.
|
88
|
+
|
89
|
+
defaultdoctype: Defines the abbreviation for the default document type. It is used when no other document type can be determined from the metadata field 'title'. Default value is 'doc'.
|
90
|
+
|
91
|
+
For details on how to set the parameter, see 'pdfmd explain hiera'.
|
92
|
+
|
93
|
+
|
94
|
+
== Rules
|
95
|
+
|
96
|
+
The following rules determine how documents are renamed:
|
97
|
+
|
98
|
+
Rule 1: All documents have the following filenaming structure:
|
99
|
+
|
100
|
+
<yyyymmdd>-<author>-<type>-<additionalInformation>.<extension>
|
101
|
+
|
102
|
+
# <yyyymmdd>: Year, month and day identical to the meta information in the document.
|
103
|
+
|
104
|
+
# <author>: Author of the document, identical to the meta information
|
105
|
+
in the document. Special characters and whitespaces are replaced.
|
106
|
+
|
107
|
+
# <type>: Document type, generated from the title field in the metadata of the document. The document type is a three-character abbreviation chosen as follows:
|
108
|
+
|
109
|
+
con => Contract
|
110
|
+
|
111
|
+
inv => Invoice
|
112
|
+
|
113
|
+
inf => Information
|
114
|
+
|
115
|
+
man => Manual
|
116
|
+
|
117
|
+
off => Offer
|
118
|
+
|
119
|
+
ord => Order
|
120
|
+
|
121
|
+
rpt => Receipt
|
122
|
+
|
123
|
+
tic => Ticket
|
124
|
+
|
125
|
+
If the document type cannot be determined automatically, it defaults to 'dok'.
|
126
|
+
|
127
|
+
This default behavior was introduced with version 1.8.1 and can be overridden in Hiera.
|
128
|
+
See `pdfmd explain hiera-keys` for information on how to do this.
|
129
|
+
|
130
|
+
# <additionalInformation>: Information generated from the metadata fields
|
131
|
+
'title', 'subject' and 'keywords'.
|
132
|
+
|
133
|
+
If 'Title' or 'Keywords' contains one of the following keywords, they will be replaced with the corresponding abbreviation followed by the specified value:
|
134
|
+
|
135
|
+
Contract => con
|
136
|
+
|
137
|
+
Invoice => inv
|
138
|
+
|
139
|
+
Information => inf
|
140
|
+
|
141
|
+
Manual => man
|
142
|
+
|
143
|
+
Offer => off
|
144
|
+
|
145
|
+
Order => ord
|
146
|
+
|
147
|
+
Receipt => rpt
|
148
|
+
|
149
|
+
Ticket => tic
|
150
|
+
|
151
|
+
|
152
|
+
This setting can also be overridden by defining the 'keys' hash in Hiera.
|
153
|
+
|
154
|
+
Rule 2: The number of keywords used in the filename is defined by the parameter '-k'. See the section of that parameter for more details and the default value.
|
155
|
+
|
156
|
+
Rule 3: Keywords matching 'kvi','fak','ord','kdn' are prioritised.
|
157
|
+
|
158
|
+
Rule 4: Special characters and whitespaces are replaced:
|
159
|
+
|
160
|
+
' ' => '_'
|
161
|
+
|
162
|
+
'/' => '_'
|
163
|
+
|
164
|
+
Rule 5: The new filename has only lowercase characters.
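As a rough illustration of Rules 1, 2, 4 and 5, the filename could be assembled as in the sketch below. All names here (`DOCTYPES`, `slug`, `build_filename`) are hypothetical and not pdfmd's code. Two assumptions are made: the subject counts toward the keyword limit, and the extension is always `.pdf`. Rule 3 (prioritised keywords) and the abbreviation of keywords inside the additional information are left out.

```ruby
# Hypothetical mapping and helpers, not taken from pdfmd's source.
DOCTYPES = {
  'contract' => 'con', 'invoice' => 'inv', 'information' => 'inf',
  'manual' => 'man', 'offer' => 'off', 'order' => 'ord',
  'receipt' => 'rpt', 'ticket' => 'tic'
}.freeze

# Rules 4 and 5: replace ' ' and '/' with '_' and lowercase everything.
def slug(text)
  text.strip.gsub(%r{[ /]}, '_').downcase
end

def build_filename(meta, nrkeywords: 3, allkeywords: false, default_doctype: 'dok')
  # 'yyyy:mm:dd HH:MM:SS' -> 'yyyymmdd'
  date = meta[:createdate][0, 10].delete(':')
  title = meta[:title].downcase
  # Document type from the title, falling back to the default.
  doctype = DOCTYPES.find { |word, _| title.include?(word) }&.last || default_doctype
  # Assumption: the subject is the first item and counts toward the limit.
  extras = [meta[:subject]] + meta[:keywords].split(',')
  extras = extras.first(nrkeywords) unless allkeywords
  ([date, slug(meta[:author]), doctype] + extras.map { |e| slug(e) }).join('-') + '.pdf'
end
```

With author 'John', subject 'new Product', title 'Presentation', create date '1970:01:01 01:00:00' and keywords 'John Doe, Jane Doe, Mister Doe', this sketch yields `19700101-john-dok-new_product-john_doe-jane_doe.pdf`. Passing `allkeywords: true` appends `-mister_doe` before the extension.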
|
165
|
+
|
166
|
+
|
167
|
+
|
168
|
+
== Example (detailed)
|
169
|
+
|
170
|
+
# Example PDF with the following metatags:
|
171
|
+
|
172
|
+
Filename : example.pdf
|
173
|
+
|
174
|
+
Author : John
|
175
|
+
|
176
|
+
Subject : new Product
|
177
|
+
|
178
|
+
Title : Presentation
|
179
|
+
|
180
|
+
CreateDate : 1970:01:01 01:00:00
|
181
|
+
|
182
|
+
Keywords : John Doe, Jane Doe, Mister Doe
|
183
|
+
|
184
|
+
|
185
|
+
# Renaming the file
|
186
|
+
|
187
|
+
$ pdfmd rename example.pdf
|
188
|
+
|
189
|
+
example.pdf => 19700101-john-dok-new_product-john_doe-jane_doe.pdf
|
190
|
+
|
191
|
+
|
192
|
+
|
193
|
+
# Simulation to rename the file (no actual change)
|
194
|
+
|
195
|
+
pdfmd rename -n example.pdf
|
196
|
+
|
197
|
+
example.pdf => 19700101-john-dok-new_product-john_doe-jane_doe.pdf
|
198
|
+
|
199
|
+
|
200
|
+
|
201
|
+
# Renaming the file with all keywords
|
202
|
+
|
203
|
+
$ pdfmd rename -n -a example.pdf
|
204
|
+
|
205
|
+
example.pdf => 19700101-john-dok-new_product-john_doe-jane_doe-mister_doe.pdf
|
206
|
+
|