opex-manifest-generator 1.3.5__py3-none-any.whl → 1.3.7__py3-none-any.whl

@@ -0,0 +1,619 @@
1
+ Metadata-Version: 2.4
2
+ Name: opex_manifest_generator
3
+ Version: 1.3.7
4
+ Summary: An Opex Manifest Generator tool for use with OPEX Files, as designed by Preservica
5
+ Author-email: Christopher Prince <c.pj.prince@gmail.com>
6
+ License-Expression: Apache-2.0
7
+ Project-URL: Homepage, https://github.com/CPJPRINCE/opex_manifest_generator
8
+ Project-URL: Issues, https://github.com/CPJPRINCE/opex_manifest_generator/issues
9
+ Keywords: archiving,archives,digital archiving,opex,Preservica,opex generator
10
+ Classifier: Programming Language :: Python :: 3
11
+ Classifier: Operating System :: OS Independent
12
+ Classifier: Topic :: System :: Archiving
13
+ Description-Content-Type: text/markdown
14
+ License-File: LICENSE.md
15
+ Requires-Dist: auto_reference_generator
16
+ Requires-Dist: pandas
17
+ Requires-Dist: openpyxl
18
+ Requires-Dist: lxml
19
+ Provides-Extra: addex
20
+ Requires-Dist: odfpy; extra == "addex"
21
+ Provides-Extra: dev
22
+ Requires-Dist: pytest>=7.0; extra == "dev"
23
+ Dynamic: license-file
24
+
25
+ # Opex Manifest Generator Tool
26
+
27
+ [![Supported Versions](https://img.shields.io/pypi/pyversions/opex_manifest_generator.svg)](https://pypi.org/project/opex_manifest_generator)
28
+ [![CodeQL](https://github.com/CPJPRINCE/opex_manifest_generator/actions/workflows/codeql.yml/badge.svg)](https://github.com/CPJPRINCE/opex_manifest_generator/actions/workflows/codeql.yml)
29
+
30
+ A small Python program for generating OPEX manifest files, used for the safe transfer of files and metadata into OPEX-compatible systems such as Preservica. The program recurses through a given hierarchy and generates manifests for all folders and, depending on the options selected, files.
31
+
32
+ ## Table of Contents
33
+
34
+ - [Quick Start](#quick-start)
35
+ - [Version & Package Info](#version--package-info)
36
+ - [Why Use This Tool?](#why-use-this-tool)
37
+ - [Additional Features](#additional-features)
38
+ - [Expected Output](#expected-output)
39
+ - [Advanced Usage](#advanced-usage)
40
+ - [Fixity Generation](#fixity-generation)
41
+ - [Continuous Operation](#continuous-operation)
42
+ - [Clearing Opex Files](#clearing-opex-files)
43
+ - [Zipping](#zipping)
44
+ - [Removing Empty Directories](#removing-empty-directories)
45
+ - [Hidden Directories](#hidden-directories)
46
+ - [Auto Reference Usage](#auto-reference-usage)
47
+ - [Input Option](#input-option)
48
+ - [XIP Metadata - Title, Description and Security Tags](#xip-metadata---title-description-and-security-tags)
49
+ - [XIP Metadata - Identifiers](#xip-metadata---identifiers)
50
+ - [Samples](#samples)
51
+ - [Custom Spreadsheets](#custom-spreadsheets)
52
+ - [XML Metadata - Basic Templates](#xml-metadata---basic-templates)
53
+ - [XML Metadata - Quick Notes](#xml-metadata---quick-notes)
54
+ - [XML Metadata Templates - Custom Templates](#xml-metadata-templates---custom-templates)
55
+ - [Input Hashes](#input-hashes)
56
+ - [Removals & Ignore](#removals--ignore)
57
+ - [Options File](#options-file)
58
+ - [Full Options](#full-options)
59
+ - [Future Developments](#future-developments)
60
+ - [Troubleshooting](#troubleshooting)
61
+ - [Developers](#developers)
62
+ - [Contributing](#contributing)
63
+
64
+ ## Quick Start
65
+
66
+ ### Option 1: Using pip (Recommended for Python users / long-term use)
67
+ ```bash
68
+ pip install -U opex_manifest_generator
69
+ opex_generate /path/to/root
70
+ ```
71
+
72
+ ### Option 2: Using Portable Executable (No Python Required)
73
+
74
+ Download the latest portable executable for your platform from [Releases](https://github.com/CPJPRINCE/opex_manifest_generator/releases)
75
+
76
+ Extract and run:
77
+ ```bash
78
+ # Windows
79
+ cd opex_generate\bin
80
+ .\opex_generate.cmd .\path\to\root -fx SHA-256
81
+
82
+ # Linux/macOS
83
+ ./opex_generate /path/to/root -fx SHA-256
84
+ ```
85
+
86
+ On Windows you can also run `install.cmd` with admin privileges to install the tool, so the command can be run without navigating to the bin folder (see Option 1 for usage).
87
+
88
+ ## Version & Package Info
89
+
90
+ **Python Version:**
91
+
92
+ Python Version 3.10+ is recommended. Earlier versions may work but are not tested.
93
+
94
+ **Additional Packages:**
95
+ - auto_reference_generator (required)
96
+ - pandas (required)
97
+ - tqdm (required)
98
+ - openpyxl (required)
99
+ - lxml (required)
100
+ - odfpy (optional - ods export)
101
+
102
+ To install using Python:
103
+
104
+ ```bash
105
+ pip install auto_reference_generator pandas openpyxl odfpy lxml tqdm
106
+ ```
107
+
108
+ If using Python, ensure it is added to your Environment variables.
109
+
110
+ ### Output
111
+
112
+ The tool generates an `.opex` manifest file for each of your directories. Each manifest contains a list of all files and folders in that directory.
113
+
114
+ File manifests may also be generated if using additional options. When file manifests are active, the folder manifest automatically accounts for additional opexes.
115
+
116
+ ## Why Use This Tool?
117
+
118
+ This tool was primarily intended to allow users to undertake larger uploads safely using bulk ingests.
119
+
120
+ It functions with all methods of Opex Ingests. For Preservica this includes:
121
+ - **Opex Incremental Workflow**
122
+ - **PUT Tool**
123
+ - **Starter Drag 'n' Drop**
124
+ - **Manual Ingest**
125
+
126
+ ## Additional Features
127
+
128
+ - **Hash generation (MD5, SHA1, SHA256, SHA512) - for additional security checks.**
129
+ - **Generate multiple algorithm hashes**
130
+ - **Generate hashes for PAX files**
131
+ - **Continuous Operation - allowing closure/crashes of the program to occur and then picking up where you left off**
132
+ - **Opex removal**
133
+ - **Zip functionality**
134
+
135
+ The program also has the [Auto Reference Generator](https://github.com/CPJPRINCE/auto_reference_generator) built in, allowing for:
136
+ - **Automated Reference generation straight to Opex files**
137
+ - **Clearing and logging empty folders**
138
+ - **A Removal mode to delete and log files/folders**
139
+ - **Sorting - alphabetically or 'folders first'**
140
+ - **Keyword assignment - replacing numerals with specified keywords (initials, first letter, JSON map)**
141
+ - **And more! See the github page for details**
142
+
143
+ A key function built on ARG is the `--input` mode, allowing you to use a spreadsheet to assign XIP/XML metadata to your files and folders. Currently this allows:
144
+ - **Assignment of XIP title, description, and security status fields**
145
+ - **Assignment of standard and custom XML metadata templates**
146
+ - **'Drop-in/drop-out' operations, so only needed columns are added**
147
+
148
+ All these options can be combined to create extensive and robust Opex files for file transfers.
149
+
150
+ ## Expected Output
151
+
152
+ At a basic level, using `opex_generate`, the program will only generate folder manifests.
153
+
154
+ ![Opex in Folder](assets/Opex%20Folder.png)
155
+
156
+ Which will contain a simple list of files/folders in that folder:
157
+
158
+ ![Opex Folder Manifest](assets/Opex%20Manifest.png)
159
+
160
+ When using an option that affects files, you will generate individual Opexes for files:
161
+
162
+ ![Opex Files](assets/Opex%20Files.png)
163
+
164
+ These will contain the data about the files (which will vary based on selected options).
165
+
166
+ ![Opex File Manifest](assets/Opex%20File%20Manifest.png)
167
+
168
+ When individual opex files are generated, the folder manifest will include these as **metadata** files.
169
+
170
+ ![Opex Folder Manifest](assets/Opex%20Manifest%20with%20files.png)
171
+
172
+ ## Advanced Usage
173
+
174
+ **Important Notes**
175
+
176
+ - The term `meta` is hard-coded to always be ignored. This is case-sensitive.
177
+ - A `meta` folder is only created when using the `--fixity`, `--remove-empty` or `-rm` options. You can disable it with the `--disable-meta-dir` option, or relocate it with the `-o` option.
178
+
179
+ ### Fixity Generation
180
+
181
+ ```bash
182
+ # Generate with SHA-256 Hash
183
+ opex_generate "/path/to/folder" -fx SHA-256
184
+
185
+ # Generate with MD5 and SHA-256 Hash
186
+ opex_generate "/path/to/folder" -fx MD5 SHA-256
187
+
188
+ # Generate with SHA-1 for PAX - a PAX can be a zip or a folder ending in '.pax'
189
+ opex_generate "/path/to/paxfolders" -fx SHA-1 --pax-fixity
190
+
191
+ # Generate with MD5 and SHA1 for PAX
192
+ opex_generate "/path/to/paxfolders" -fx MD5 SHA-1 --pax-fixity
193
+
194
+ # Using -fx without specifying will default to SHA-1
195
+ ```
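To illustrate what the fixity values represent, here is a minimal Python sketch that computes several digests for one file in a single pass. It is not the tool's own code: the function name `file_fixity` and the name mapping in `ALGORITHMS` are assumptions made for this example, and only the standard-library `hashlib` module is used.

```python
import hashlib
from pathlib import Path

# Assumed mapping from the CLI-style names used above onto hashlib names.
ALGORITHMS = {"MD5": "md5", "SHA-1": "sha1", "SHA-256": "sha256", "SHA-512": "sha512"}


def file_fixity(path, algorithms=("SHA-1",), chunk_size=1024 * 1024):
    """Return {algorithm: hex digest} for one file, reading it in chunks."""
    hashers = {name: hashlib.new(ALGORITHMS[name]) for name in algorithms}
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Calling `file_fixity("/path/to/file", ("MD5", "SHA-256"))` would return both digests while reading the file only once, which matters for the large files mentioned elsewhere in this README.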
196
+
197
+ ### Continuous Operation
198
+
199
+ The program won't override an existing opex when generating a new Opex. If an opex is present it will state:
200
+
201
+ ```
202
+ Avoiding override, Opex exists at: /path/to/opex
203
+ ```
204
+
205
+ This allows for continuous operation, as long generations - particularly if you have large files - can be cancelled at any point, then picked up later. To halt the program, simply press `Ctrl + C` in the console.
206
+
207
+ There is no way to force an override. If you need to rerun a generation, use the `-clr` option.
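The skip-if-present behaviour can be pictured with a short Python sketch that lists the folders still lacking a manifest. This is an illustration only, not the tool's code: the helper name `folders_needing_opex` is hypothetical, and it assumes each folder manifest is stored inside its folder as `<folder name>.opex`.

```python
from pathlib import Path


def folders_needing_opex(root):
    """Yield root and each sub-directory that has no folder manifest yet."""
    root = Path(root)
    subfolders = sorted(p for p in root.rglob("*") if p.is_dir())
    for folder in [root, *subfolders]:
        # Assumed naming convention: <folder>/<folder name>.opex
        if not (folder / f"{folder.name}.opex").exists():
            yield folder
```

Running this before and after an interrupted generation would show which folders a resumed run still has to process, under the naming assumption above.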
208
+
209
+ ### Clearing Opex Files
210
+
211
+ ```bash
212
+ # Will clear existing opexes recursively then end
213
+ opex_generate /path/to/folder -clr
214
+
215
+ # If other options are enabled will clear and rerun generation
216
+ opex_generate /path/to/folder -clr -fx SHA1
217
+ ```
218
+
219
+ ### Zipping
220
+
221
+ ```bash
222
+ # Will zip opex and file into a zip file
223
+ opex_generate /path/to/folder -fx SHA-1 -z
224
+
225
+ # Will zip opex and file and remove the original files
226
+ opex_generate /path/to/folder -fx SHA-1 -z --remove-zipped-files
227
+ ```
228
+
229
+ **Use zipping with caution, repeated use can get quite messy fast.**
230
+
231
+ ### Removing Empty Directories
232
+
233
+ ```bash
234
+ # Remove and generate a text log to the 'meta' folder of removed directories
235
+ opex_generate /path/to/folder --remove-empty
236
+
237
+ # You will be asked to give confirmation that you want to proceed
238
+ ```
239
+
240
+ ### Hidden Directories
241
+
242
+ ```bash
243
+ # By default hidden directories/files are not included. Adding --hidden will include hidden files
244
+ opex_generate /path/to/folder --hidden
245
+ ```
246
+
247
+ ## Auto Reference Usage
248
+
249
+ As mentioned, built into the OMG is the Auto Reference Generator, allowing archival references to be assigned directly to Opexes. By default, codes generated using this method are hard-coded to the identifier `code`.
250
+
251
+ If you want to understand what these References will look like, please see [here](https://github.com/CPJPRINCE/auto_reference_generator?tab=readme-ov-file#structure-of-references).
252
+
253
+ ```bash
254
+ # Will generate a reference code for the hierarchy with the prefix "ARCH"
255
+ opex_generate /path/to/folder -r catalog -p ARCH
256
+
257
+ # Will generate a reference code with prefix "ARCH-1-2-3", suffix "Z" and delimiter "-"
258
+ opex_generate /path/to/folder -r catalog -p "ARCH-1-2-3" -s Z -dlm "-"
259
+
260
+ # Will generate a reference code without a prefix - this will only be the numerals
261
+ opex_generate /path/to/folder -r catalog
262
+
263
+ # Will generate an accession code / 'running number' with the prefix "2026-X"
264
+ opex_generate /path/to/folder -r accession -p 2026-X
265
+
266
+ # Will fill in title, description and security tag data based upon file and folder names and sets to the default security tag 'open'
267
+ opex_generate -r generic /path/to/folder
268
+ ```
269
+
270
+ ## Input Option
271
+
272
+ This program also supports using a spreadsheet as an `input`. This allows metadata to be prefilled and set on ingest. The following XIP Metadata fields can be set:
273
+
274
+ - Title
275
+ - Description
276
+ - Security Status
277
+ - Identifiers
278
+ - SourceID
279
+
280
+ XML metadata is also supported for both default and custom XML templates.
281
+
282
+ ### XIP Metadata - Title, Description and Security Tags
283
+
284
+ To use an input override, you first need to create a spreadsheet folder listing. It's not necessary, but for convenience, I'd recommend using the `auto_ref` tool. Like so:
285
+
286
+ ```bash
287
+ auto_ref -p "ARCH" /path/to/root
288
+ ```
289
+
290
+ The column headers are all 'drop-in/drop-out'. Simply add new columns for the data you'd like to edit. The column headers are case-sensitive and have to match exactly. For reference, they are:
291
+
292
+ ```
293
+ Title
294
+ Description
295
+ Security
296
+ ```
297
+
298
+ These fields would then be filled in with the relevant data. **For Security Tags**, ensure they are an exact match to the tag on your system, which are also case-sensitive.
299
+
300
+ ![ScreenshotXIPColumns](assets/Column%20Headers.png)
301
+
302
+ Once the cells are filled in with the respective data, run a generation using the `-i` option and input the full path to your spreadsheet. Ensure that the `/path/to/root` is the same root as you generated the spreadsheet for.
303
+
304
+ ```bash
305
+ # Will use the 'spreadsheet.xlsx' as an input
306
+ opex_generate -i /path/to/your/spreadsheet.xlsx /path/to/root
307
+
308
+ # These can still be combined with the above options
309
+ opex_generate -i /path/to/your/spreadsheet.xlsx -fx SHA-1 /path/to/root
310
+ ```
311
+
312
+ **To Note:**
313
+ - If you leave blank cells, it will simply skip those details.
314
+ - If you rearrange the hierarchy after your spreadsheet generation, you may receive errors or mismatches due to folders/files being incorrectly looked up. In these cases, you may need to regenerate your list and migrate the data to it.
315
+ - Assignment is not specific to Folders/Files.
316
+
317
+ ### XIP Metadata - Identifiers
318
+
319
+ Identifiers are also supported and can be added to the column header following this convention:
320
+
321
+ ```
322
+ Identifier:Key
323
+ ```
324
+
325
+ The `Key` will determine the identifier name and the cells will contain the value.
326
+
327
+ ![Identifier Screenshot](assets/Identifiers%20Headers.png)
328
+
329
+ You can also use the following column headers:
330
+
331
+ ```
332
+ # Defaults to 'code' key
333
+ - Identifier
334
+ - Archive_Reference
335
+
336
+ # Defaults to 'accref' key
337
+ - Accession_Reference
338
+ ```
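The header convention above can be summarised in a small Python sketch that maps a column header to its identifier key. It is a reading of the rules as documented here, not the tool's implementation; the names `DEFAULT_KEYS` and `identifier_key` are invented for the example.

```python
# Default keys as documented above (assumed to be the unmodified defaults).
DEFAULT_KEYS = {
    "Identifier": "code",
    "Archive_Reference": "code",
    "Accession_Reference": "accref",
}


def identifier_key(header):
    """Return the identifier key for a column header, or None if it is not one."""
    if header in DEFAULT_KEYS:
        return DEFAULT_KEYS[header]
    prefix, sep, key = header.partition(":")
    # Headers are case-sensitive, so only an exact 'Identifier:' prefix counts.
    if prefix == "Identifier" and sep and key:
        return key
    return None
```

Under these rules a column named `Identifier:ISBN` would produce an identifier called `ISBN`, while a column such as `Title` is not treated as an identifier at all.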
339
+
340
+ ### Samples
341
+
342
+ A completed Opex based on this data:
343
+
344
+ ![Sample Spreadsheet input](assets/Spreadsheet%20Input%20Sample.png)
345
+
346
+ Using the command: `opex_generate /home/chris/dev/opex_manifest_generator -i /home/chris/Dev/opex_manifest_generator/meta/opex_manifest_generator_AutoRef.xlsx`
347
+
348
+ Will generate the following for folder manifest:
349
+
350
+ ![Folder Manifest](assets/Folder%20Opex%20Input%20Sample.png)
351
+
352
+ For file manifest:
353
+
354
+ ![File Manifest](assets/File%20Opex%20Input%20Sample.png)
355
+
356
+ ### Custom Spreadsheets
357
+
358
+ The OMG is only dependent on the `FullName` header being present for correct functionality. You can use any spreadsheet as long as the `FullName` header is present and correctly matches the hierarchy. Additional headers can be dropped in/out without interfering.
359
+
360
+ ![FullName Column](assets/FullName%20Column.png)
361
+
362
+ ### XML Metadata - Basic Templates
363
+
364
+ DC, MODS, GDPR, and EAD templates are supported out of the box. The column headers are also 'drop-in/drop-out'.
365
+
366
+ XML Column Headers need to be written as: `ns:tagname` with `ns` being the XML's namespace and `tagname` the tag name.
367
+
368
+ ![XML Headers](assets/XML%20Headers.png)
369
+
370
+ There are two ways to enter the column header: `exactly` or `flatly` (also known as 'nested' vs 'flat' mode). When entering `exact`, you must enter all parents of the tag separated by `/`. Flatly only requires the end tag to be present. In both cases, case-sensitivity matters. `exact` is the default method.
371
+
372
+ If you enter a non-matching header (such as a misspelling), it won't match to the field.
373
+
374
+ ```
375
+ # Exactly:
376
+ mods:recordInfo/mods:recordIdentifier
377
+
378
+ # Flatly:
379
+ mods:recordIdentifier
380
+ ```
381
+
382
+ In both cases, these match to the same `recordIdentifier` field.
383
+
384
+ While using the `flatly` method is easier, if non-unique tags are present, such as `mods:note`, it will match to the first occurrence in the XML, which might not be its intended destination. For complex XMLs, I'd recommend sticking with the `exact` method.
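The difference between the two lookups can be sketched with the standard-library `xml.etree.ElementTree` module. This is only an illustration of exact-path versus first-occurrence matching, not the tool's code: the tiny `SAMPLE` document and the helpers `find_exact` and `find_flat` are made up for the example.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
NS = {"mods": MODS_NS}

# A cut-down, hypothetical MODS-like document with a repeated mods:note tag.
SAMPLE = f"""<mods:mods xmlns:mods="{MODS_NS}">
  <mods:note/>
  <mods:recordInfo>
    <mods:recordIdentifier/>
    <mods:note/>
  </mods:recordInfo>
</mods:mods>"""


def find_exact(root, header):
    """Resolve a header by its full path of parent tags."""
    return root.find(header, NS)


def find_flat(root, header):
    """Resolve a header by tag name alone: the first match in document order wins."""
    prefix, _, tag = header.partition(":")
    return next(root.iter(f"{{{NS[prefix]}}}{tag}"), None)


root = ET.fromstring(SAMPLE)
exact = find_exact(root, "mods:recordInfo/mods:recordIdentifier")
flat = find_flat(root, "mods:recordIdentifier")
nested_note = find_exact(root, "mods:recordInfo/mods:note")
first_note = find_flat(root, "mods:note")
```

Here `exact` and `flat` are the same element, because `recordIdentifier` is unique. `first_note`, however, is the top-level note rather than the one nested under `recordInfo`, which is the mismatch the paragraph above warns about.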
385
+
386
+ Once you have added your headers and data, you can run like so:
387
+
388
+ ```bash
389
+ # Run with flat method
390
+ opex_generate -i "/path/to/your/spreadsheet.xlsx" "/path/to/root/dir" -m flat
391
+
392
+ # Run with exact method
393
+ opex_generate -i "/path/to/your/spreadsheet.xlsx" "/path/to/root/dir" -m exact
394
+ ```
395
+
396
+ ### XML Metadata - Quick Notes
397
+
398
+ - You can use `--print-xmls` and `--convert-xmls` to return XMLs to the console or generate spreadsheet templates.
399
+
400
+ ```bash
401
+ # You can use `--print-xmls` to display the correct header names of your XMLs to the console
402
+ opex_generate /path/to/root --print-xmls
403
+
404
+ # You can also use `--convert-xmls` to create spreadsheets with all the right headers. Will be output to the cwd of your terminal
405
+ opex_generate /path/to/root --convert-xmls
406
+ ```
407
+
408
+ - When you have multiple non-unique tags, such as `mods:note`, you will need to add an index in square brackets, like so: `mods:note[1] mods:note[2] ...` The number should correspond to the order in which they appear in the XML tree.
409
+ - If you use `-m` option without adding any data, a blank XML template will be added to the opex.
410
+ - I've also included sample spreadsheets for DC, MODS, GDPR and EAD templates with the `exact` headers [here](https://github.com/CPJPRINCE/opex_manifest_generator/tree/master/samples/spreads).
411
+
412
+ ### XML Metadata Templates - Custom Templates
413
+
414
+ Any custom XML template that is functioning in your system will work!
415
+
416
+ To use custom XMLs, place your XMLs in a specific folder, then use the `-mdir` option with `/path/to/metadata`. You can also use `--print-xmls` and `--convert-xmls` in conjunction with this to print the header names or generate spreadsheet templates for your custom XMLs.
417
+
418
+ ```bash
419
+ # Will use /path/to/metadata as source for files
420
+ opex_generate /path/to/root -mdir /path/to/metadata
421
+ ```
422
+
423
+ ### Input Hashes
424
+
425
+ If you use the column headers `Hash` and `Algorithm` with hash data, when using the `-fx` option in combination with `-i`, the program will read the hashes from the spreadsheet instead of generating them.
426
+
427
+ ![Hash Screenshot](assets/Hash%20Headers.png)
428
+
429
+ **Does not currently support multiple hashes**
430
+
431
+ ### Removals & Ignore
432
+
433
+ You can set the column header `Removals`, and when the cell is marked TRUE, the specified folder/file will be deleted. To activate, use the option `-rm` and confirm when prompted. A text log will be generated for the deleted files in the `meta` folder.
434
+
435
+ Similarly, you can set the column header `Ignore`, and when the cell is marked `TRUE` it will skip the generation of an Opex for the specified file/folder.
436
+
437
+ ### Options File
438
+
439
+ You can use your own `options.properties` file to change the default column headers and some other defaults. Like so:
440
+
441
+ ```bash
442
+ opex_generate --options-file path/to/options.properties /path/to/root
443
+ ```
444
+
445
+ The default options look like:
446
+
447
+ ```
448
+ [options]
449
+
450
+ INDEX_FIELD = FullName
451
+ TITLE_FIELD = Title
452
+ DESCRIPTION_FIELD = Description
453
+ SECURITY_FIELD = Security
454
+ IDENTIFIER_FIELD = Identifier
455
+ IDENTIFIER_DEFAULT = code
456
+ REMOVAL_FIELD = Removals
457
+ IGNORE_FIELD = Ignore
458
+ SOURCEID_FIELD = SourceID
459
+ HASH_FIELD = Hash
460
+ ALGORITHM_FIELD = Algorithm
461
+
462
+ ACCREF_CODE = accref
463
+ ARCREF_FIELD = Archive_Reference
464
+ ACCREF_FIELD = Accession_Reference
465
+
466
+ METAFOLDER = meta
467
+ FIXITY_SUFFIX = _Fixity
468
+ REMOVALS_SUFFIX = _Removals
469
+ GENERIC_DEFAULT_SECURITY = open
470
+ ```
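If you want to check a custom options file before passing it to the tool, the layout above can be read with Python's standard `configparser`. This is a sketch under assumptions: whether the tool itself uses `configparser` is not stated here, and `load_options` and the shortened `SAMPLE` text are invented for the example.

```python
import configparser

# A shortened copy of the defaults shown above, used as sample input.
SAMPLE = """
[options]
INDEX_FIELD = FullName
TITLE_FIELD = Title
IDENTIFIER_DEFAULT = code
METAFOLDER = meta
"""


def load_options(text):
    """Parse options text into a dict, keeping the upper-case key names."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # configparser lower-cases keys unless told otherwise
    parser.read_string(text)
    return dict(parser["options"])


options = load_options(SAMPLE)
```

A quick look at `options["INDEX_FIELD"]` then confirms which column header the file would configure as the index field.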
471
+
472
+ ## Full Options
473
+
474
+ The below covers the full range of options. Use the `-h` option to show this dialog.
475
+
476
+ <!-- argparse_to_md:opex_manifest_generator:create_parser -->
477
+ Usage:
478
+ ```
479
+ Opex_Manifest_Generator [-h] [-v] [-fx [{SHA-1,MD5,SHA-256,SHA-512} ...]] [--pax-fixity]
480
+ [-z] [--remove-zipped-files] [--remove-empty] [--hidden] [-clr]
481
+ [-opt OPTIONS_FILE] [-i [INPUT]] [-mdir [METADATA_DIR]]
482
+ [-m [{exact,flat}]] [-rm] [--print-xmls] [--convert-xmls]
483
+ [--autoref-options AUTOREF_OPTIONS]
484
+ [-r {catalog,accession,both,generic,catalog-generic,accession-generic,both-generic}]
485
+ [-p PREFIX [PREFIX ...]] [-s [SUFFIX]]
486
+ [--suffix-option {file,directory,both}]
487
+ [--accession-mode [{file,directory,both}]] [-str [START_REF]]
488
+ [-dlm [DELIMITER]] [--sort-by [{folders_first,alphabetical}]]
489
+ [-key [KEYWORDS ...]]
490
+ [-keym [{initialise,firstletters,from_json}]]
491
+ [--keywords-case-sensitivity] [--keywords-retain-order]
492
+ [--keywords-abbreviation-number KEYWORDS_ABBREVIATION_NUMBER [KEYWORDS_ABBREVIATION_NUMBER ...]]
493
+ [--log-level [{DEBUG,INFO,WARNING,ERROR}]]
494
+ [--log-file [LOG_FILE]] [-o [OUTPUT]] [--disable-meta-dir]
495
+ [--disable-all-exports] [--disable-fixity-export]
496
+ [--disable-empty-export] [--disable-removal-export] [-ex]
497
+ [-fmt {xlsx,csv,json,ods,xml}]
498
+ [root]
499
+ ```
500
+ OPEX Manifest Generator for Preservica Uploads
501
+
502
+ Positional arguments:
503
+ - `root`: The root path to generate Opexes for, will recursively traverse all sub-directories.
504
+ Generates an Opex for each folder & (depending on options) file in the directory tree.
505
+
506
+ Optional arguments:
507
+ - `-v`, `--version`: show program's version number and exit
508
+
509
+ Opex Options:
510
+ Options that control the generation of Opex Manifests
511
+
512
+ - `-fx [{SHA-1,MD5,SHA-256,SHA-512} ...]`, `--fixity [{SHA-1,MD5,SHA-256,SHA-512} ...]`: Generates a hash for each file and adds it to the opex.
513
+ Can select one or more algorithms to utilise: {-fx MD5 SHA-1}
514
+ If no algorithm is specified defaults to SHA-1.
515
+
516
+ - `--pax-fixity`: Enables use of PAX fixity generation, in line with Preservica's Recommendation.
517
+ Files / folders ending in .pax or .pax.zip will have the individual files in the folder / zip added to the Opex.
518
+ - `-z`, `--zip`: Set to zip files
519
+ - `--remove-zipped-files`: Set to remove the original files that have been zipped
520
+ - `--remove-empty`: Remove and log empty directories from root. Log will be exported to 'meta' / output folder.
521
+ - `--hidden`: Set whether to include hidden files and folders
522
+ - `-clr`, `--clear-opex`: Clears existing opex files from a directory. If set with no further options will only clear opexes;
523
+ if multiple options are set will clear opexes and then run the program
524
+ - `-opt OPTIONS_FILE`, `--options-file OPTIONS_FILE`: Specify a custom Options file, changing the set presets for column headers (Title,Description,etc)
525
+
526
+ Input Override Options:
527
+ Options that control the Input Override features
528
+
529
+ - `-i [INPUT]`, `--input [INPUT]`: Set to utilise a CSV / XLSX spreadsheet to import data from
530
+ - `-mdir [METADATA_DIR]`, `--metadata-dir [METADATA_DIR]`: Specify the metadata directory to pull XML files from
531
+ - `-m [{exact,flat}]`, `--metadata [{exact,flat}]`: Set whether to include xml metadata fields in the generation of the Opex
532
+ - `-rm`, `--remove`: Set whether to enable removals of files and folders from a directory. ***Currently in testing
533
+ - `--print-xmls`: Prints the elements from your xmls to the console
534
+ - `--convert-xmls`: Convert XML template files in mdir to spreadsheets/csv files
535
+ - `--autoref-options AUTOREF_OPTIONS`: Specify a custom Auto Reference Options file, changing the set presets for Input Override / Auto Reference Generator
536
+
537
+ Auto Reference Generator Options:
538
+ Options that control the Auto Reference Generator features
539
+
540
+ - `-r {catalog,accession,both,generic,catalog-generic,accession-generic,both-generic}`, `--autoref {catalog,accession,both,generic,catalog-generic,accession-generic,both-generic}`: Toggles whether to utilise the auto_reference_generator
541
+ to generate an on the fly Reference listing.
542
+
543
+ There are several options, {catalog} will generate
544
+ an Archival Reference following an ISAD(G) structure.
545
+
546
+ {accession} will create a running number of files.
547
+ {both} will do both at the same time!
548
+ {generic} will populate the title and description fields with the folder/file's name,
549
+ if used in conjunction with one of the above options:
550
+ {catalog-generic, accession-generic, both-generic} it will do both simultaneously.
551
+
552
+ - `-p PREFIX [PREFIX ...]`, `--prefix PREFIX [PREFIX ...]`: Assign a prefix when utilising the --autoref option. The prefix is added before all generated text.
553
+ When utilising the {both} option fill in like: [catalog-prefix, accession-prefix] without square brackets.
554
+
555
+ - `-s [SUFFIX]`, `--suffix [SUFFIX]`: Assign a suffix when utilising the --autoref option. The suffix is added after all generated text.
556
+ - `--suffix-option {file,directory,both}`: Set whether to apply the suffix to files, folders or both when utilising the --autoref option.
557
+ - `--accession-mode [{file,directory,both}]`: Set the mode when utilising the Accession option in autoref.
558
+ file - only adds on files, directory - only adds on folders, both - adds on files and folders
559
+ - `-str [START_REF]`, `--start-ref [START_REF]`: Set a custom Starting reference for the Auto Reference Generator. The generated reference will
560
+ - `-dlm [DELIMITER]`, `--delimiter [DELIMITER]`: Set a custom delimiter for generated references, default is '/'
561
+ - `--sort-by [{folders_first,alphabetical}]`: Set the sorting method: 'folders_first' sorts folders first, then files alphabetically; 'alphabetical' sorts alphabetically (ignoring the folder distinction)
562
+
563
+ Keyword Options:
564
+ Options that control the Keyword features for Auto Reference Generation
565
+
566
+ - `-key [KEYWORDS ...]`, `--keywords [KEYWORDS ...]`: Set to replace reference numbers with given Keywords for folders (only Folders atm). Can be a list of keywords or a JSON file mapping folder names to keywords.
567
+ - `-keym [{initialise,firstletters,from_json}]`, `--keywords-mode [{initialise,firstletters,from_json}]`: Set to alternate keyword mode: 'initialise' will use initials of words; 'firstletters' will use the first letters of the string; 'from_json' will use a JSON file mapping names to keywords
568
+ - `--keywords-case-sensitivity`: Set to make keyword matching case-sensitive. By default keyword matching is case-insensitive
569
+ - `--keywords-retain-order`: Set when using keywords to continue reference numbering. If not used keywords don't 'count' to reference numbering, e.g. if using initials 'Project Alpha' -> 'PA' then the next folder/file will still be '001' not '003'
570
+ - `--keywords-abbreviation-number KEYWORDS_ABBREVIATION_NUMBER [KEYWORDS_ABBREVIATION_NUMBER ...]`: Sets the number of letters to abbreviate to in 'firstletters' mode; does not affect 'initialise' mode.
571
+
572
+ Export Options:
573
+ Options that control various export features
574
+
575
+ - `--log-level [{DEBUG,INFO,WARNING,ERROR}]`: Set the logging level (default: INFO)
576
+ - `--log-file [LOG_FILE]`: Optional path to write logs to a file (default: stdout)
577
+ - `-o [OUTPUT]`, `--output [OUTPUT]`: Sets the output of the meta folder to send any generated files (Remove Empty, Fixity List, Autoref Export) to. Can be used in conjunction with --disable-meta-dir to set output location without generating meta directory.
578
+ - `--disable-meta-dir`: Set whether to disable the creation of a 'meta' directory for generated files,
579
+ default behaviour is to always generate this directory
580
+ - `--disable-all-exports`: Set to prevent all exports (Fixity, Removal, Empty) from being created in the meta directory.
581
+ - `--disable-fixity-export`: Set whether to export the generated fixity list to a text file in the meta directory.
582
+ Enabled by default, disable with this flag.
583
+ - `--disable-empty-export`: Set whether to export the generated empty list to a text file in the meta directory.
584
+ Enabled by default, disable with this flag.
585
+ - `--disable-removal-export`: Set whether to export the generated removals list to a text file in the meta directory.
586
+ Enabled by default, disable with this flag.
587
+ - `-ex`, `--export-autoref`: Set whether to export the generated references to an AutoRef spreadsheet
588
+ - `-fmt {xlsx,csv,json,ods,xml}`, `--output-format {xlsx,csv,json,ods,xml}`: Set whether to export AutoRef Spreadsheet to: xlsx, csv, json, ods or xml format
589
+ <!-- argparse_to_md_end -->
590
+
591
+ ## Future Developments
592
+
593
+ - ~~Customizable Filtering~~ *Added!*
594
+ - ~~Adjust Accession so the different modes can use from Opex~~ *Added!*
595
+ - ~~Add SourceID as option for use with Auto Ref Spreadsheets~~ *Added!*
596
+ - ~~Allow for multiple Identifiers to be added with Auto Ref Spreadsheets. Currently only 1 or 2 identifiers can be added at a time, under "Archive_Reference" or "Accession_Reference". These are also tied to be either "code" or "accref". An Option needs to be added to allow custom setting of identifier~~ *Added!*
597
+ - ~~Add an option/make it a default for Metadata XMLs to be located in a specified directory rather than in the package~~ *Added!*
598
+ - Zipping to conform to PAX - Last on the check list; it technically does...
599
+ - In theory, this tool should be compatible with any system that uses the OPEX standard... But in theory Communism works, in theory...
600
+
601
+ ## Troubleshooting
602
+
603
+ - On Windows, ensure that when you enter the root folder it does not end in a `\`. This is slightly annoying as it adds it by default when tabbing.
604
+ - In the examples above, I've used Linux paths. If you're on Windows, don't forget to change the forward slashes to backslashes `\`.
605
+ - There are a number of helpers when entering options: use SHA1 instead of SHA-1, c for catalog, acc for accession.
606
+
607
+ ## Developers
608
+
609
+ For Developers, you can also use the tool as a module:
610
+
611
+ ```python
612
+ from opex_manifest_generator import OpexManifestGenerator
613
+
614
+ omg = OpexManifestGenerator(root="/path/to/root", algorithm="SHA-256").main()
615
+ ```
616
+
617
+ ## Contributing
618
+
619
+ I welcome further contributions and feedback! If there are any issues please raise them [here](https://github.com/CPJPRINCE/opex_manifest_generator/issues)
@@ -0,0 +1,16 @@
1
+ opex_manifest_generator/__init__.py,sha256=HsSQLRVsUMOzvT1Cqb3K_J_f5jUOTOu1LkKHJCgwOGY,460
2
+ opex_manifest_generator/cli.py,sha256=fq8BRMqoZ9cm9OZ_tdKPbM4OfszoPuxymkFYTtFrLrM,24011
3
+ opex_manifest_generator/common.py,sha256=uCwyca2cppo4-xemF1Evaouxh9D5VJhPfVAuevOAv7s,3055
4
+ opex_manifest_generator/hash.py,sha256=KcVP96J6zaRacFhsyuGC48CqES3JiytYlZe5Kc3aMdQ,2833
5
+ opex_manifest_generator/opex_manifest.py,sha256=rvXcTNFSzX79HbfINmojuPPqY-trdXX5NM6O73IYx-s,55202
6
+ opex_manifest_generator/metadata/DublinCore Template.xml,sha256=csNGXzSH27Whs4BQNuwMZl8nLSdDq7Y_OblTfzeBqWQ,775
7
+ opex_manifest_generator/metadata/EAD Template.xml,sha256=qr_kaBdt4Klb9IzCrgPN8fZwdS614U4fXHvI2sZQ1Ok,2168
8
+ opex_manifest_generator/metadata/GDPR Template.xml,sha256=-lbX2cp8ubqU21grkcrr4y5rCDdam4h53lOV8gYM2wM,476
9
+ opex_manifest_generator/metadata/MODS Template.xml,sha256=j9KE3f6WuuDyvscLKVF01172q58DCaistku18I_oCO8,2636
10
+ opex_manifest_generator/options/options.properties,sha256=X-svlvQ-mM5AkJLprwRnqQlYjRX6qMIoHKZmdKY5JTk,480
11
+ opex_manifest_generator-1.3.7.dist-info/licenses/LICENSE.md,sha256=z8d0m5b2O9McPEK1xHG_dWgUBT6EfBDz6wA0F7xSPTA,11358
12
+ opex_manifest_generator-1.3.7.dist-info/METADATA,sha256=eTj_KQH9j-21i21yGMLqGSkfpsRvgGq9NrXWSS5MgoQ,28845
13
+ opex_manifest_generator-1.3.7.dist-info/WHEEL,sha256=wUyA8OaulRlbfwMtmQsvNngGrxQHAvkKcvRmdizlJi0,92
14
+ opex_manifest_generator-1.3.7.dist-info/entry_points.txt,sha256=6IZhtmfD045LUtJcitYNWzE9hLu_IePjQBm8gan2krw,67
15
+ opex_manifest_generator-1.3.7.dist-info/top_level.txt,sha256=K48eGnaDLVO6YDJdAZLqbeoZvJHBGX25cvYT-i8gWt0,24
16
+ opex_manifest_generator-1.3.7.dist-info/RECORD,,
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ opex_generate = opex_manifest_generator.cli:main
@@ -199,4 +199,4 @@
199
199
  distributed under the License is distributed on an "AS IS" BASIS,
200
200
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
201
201
  See the License for the specific language governing permissions and
202
- limitations under the License.
202
+ limitations under the License.