pynotefile 0.11.3__tar.gz

@@ -0,0 +1,250 @@
1
+ Metadata-Version: 2.4
2
+ Name: pynotefile
3
+ Version: 0.11.3
4
+ Summary: Create associated notefiles (sidecar files)
5
+ Author-email: Justin Winokur <Jwink3101@users.noreply.github.com>
6
+ License: MIT
7
+ Project-URL: Homepage, https://github.com/Jwink3101/notefile
8
+ Requires-Python: >=3.8
9
+ Description-Content-Type: text/markdown
10
+ Requires-Dist: ruamel.yaml
11
+ Provides-Extra: pyyaml
12
+ Requires-Dist: pyyaml; extra == "pyyaml"
13
+
14
+ # Notefile
15
+
16
+ notefile is a tool to quickly and easily manage sidecar metadata files ("notefiles"). Each notefile is a YAML file (with the extension `.notes.yaml`) stored alongside the file it describes.
17
+
18
+ It is not a perfect solution but it does address many main needs as well as concerns I have with alternative tools.
19
+
20
+ Notefile is designed to assist in keeping associated notes and to perform the most basic operations. However, it is not designed to do all possible things. Notes can be modified (in YAML) as needed with other tools including those included here.
21
+
22
+ It is also worth noting that while notefile can be used as a Python module, it is really designed to be primarily a CLI.
23
+
24
+ ## Design & Goals
25
+
26
+ When a note or tag is added, a notefile is created in the same location with the same name plus `.notes.yaml`. The design is a compromise of competing factors of the alternatives.
27
+
28
+ For example, extended attributes are great but they are easily broken and are not always compatible across different operating systems. Other metadata like ID3 or EXIF changes the file itself and are domain-specific.
29
+
30
+ Similarly, single-database solutions (like [TMSU](https://tmsu.org/)) are cleaner but risk damage and are a single point of failure (corruption and recoverability). And it is not as explicit that they are being used on a file.
31
+
32
+ YAML notefiles provide a clear indication of their being a note (or tag) of interest and are cross-platform. Furthermore, by being YAML text-based files, they are not easily corrupted. Also, YAML files are easily read and written by humans.
33
+
34
+ The format is YAML and should not be changed. However, this code does not assume any given fields except:
35
+
36
+ * filesize
37
+ * sha256 (optional)
38
+ * tags
39
+ * notes
40
+
41
+ Any other data can be added and will be preserved across all actions.
42
+
43
+ Notefile primarily grew around files, but it can also attach notes to directories. Directory notes follow the same sidecar model, but the sidecar is placed in the parent directory rather than inside the target directory. For example, a note for `somepath/subdir/` would live at `somepath/subdir.notes.yaml` (or the corresponding hidden/subdir-mode variant).
44
+
45
+ Directory support is newer and less refined than file support. It is intended to be useful, but the coupling between a directory and its note is weaker than the file case and the repair heuristics are correspondingly less exact.
46
+
47
+ ## JSON vs YAML
48
+
49
+ Notefile can write the notes as nicely-formatted YAML or as JSON (which is technically still YAML as YAML is a superset of JSON). The advantage of JSON is that it is *much* faster to read than YAML, but it comes at the cost of being harder to edit manually.
50
+
51
+ The extension will **always** be `.yaml`, since YAML is a superset of JSON and any YAML parser should be able to read JSON.
52
+
53
+ ## Install and Usage
54
+
55
+ ### Install
56
+
57
+ Install from PyPI:
58
+
59
+ $ python -m pip install pynotefile
60
+
61
+ Optional PyYAML backend (LibYAML speedup when available):
62
+
63
+ $ python -m pip install "pynotefile[pyyaml]"
64
+
65
+ This installs the `notefile` command and keeps the Python import name as `notefile`:
66
+
67
+ ```python
68
+ import notefile
69
+ ```
70
+
71
+ Install directly from GitHub:
72
+
73
+ $ python -m pip install git+https://github.com/Jwink3101/notefile.git
74
+
75
+ Optional PyYAML backend (LibYAML speedup when available):
76
+
77
+ $ python -m pip install "git+https://github.com/Jwink3101/notefile.git#egg=pynotefile[pyyaml]"
78
+
79
+ ### Requirements
80
+
81
+ The only *real* requirement is `ruamel.yaml`. However, if you have `pyyaml` ([website](https://pyyaml.org/)) installed, notefile will use that as a faster **read-only** parser (writes still use ruamel.yaml). Even better, if you have [LibYAML](https://pyyaml.org/wiki/LibYAML), it will be about 25x faster for reads.
82
+
83
+ Note: We avoid writing with PyYAML due to known issues. See [PyYAML issue #121](https://github.com/yaml/pyyaml/issues/121).
84
+
85
+ To install LibYAML (based on [these instructions](https://pyyaml.org/wiki/LibYAML)):
86
+
87
+ > Download the source package: http://pyyaml.org/download/libyaml/yaml-0.2.5.tar.gz.
88
+ >
89
+ > To build and install LibYAML, run
90
+ >
91
+ > $ ./configure
92
+ > $ make
93
+ > # make install
94
+
95
+ Then to install pyyaml,
96
+
97
+ $ python -m pip install pyyaml
98
+
99
+ Or via the optional extra:
100
+
101
+ $ python -m pip install "pynotefile[pyyaml]"
102
+
103
+ In my (limited) experience, pyyaml comes with Anaconda but not Miniconda.
104
+
105
+ ### Usage
106
+
107
+ Every command is documented. For example, run
108
+
109
+ $ notefile -h
110
+
111
+ to see a list of commands and universal options and then
112
+
113
+ $ notefile <command> -h
114
+
115
+ for specific options.
116
+
117
+ The most basic command will be
118
+
119
+ $ notefile edit file.ext
120
+
121
+ which will launch `$EDITOR` (or try other global variables) to edit the notes. You can also use
122
+
123
+ $ notefile mod -t mytag file.ext
124
+
125
+ to add tags.
126
+
127
+
128
+ ## Repairs
129
+
130
+ It is possible for the sidecar notefiles to get out of sync with the basefile. The two possible issues are:
131
+
132
+ * **metadata**: The basefile has been modified thereby changing its size, sha256, and mtime
133
+ * **orphaned**: The basefile has been renamed thereby orphaning the notefile
134
+
135
+ The `repair` function can repair either (but *not* both) types of issues. To repair metadata, the notefile is simply updated with the new file.
136
+
137
+ To repair an orphaned notefile, it will search in and below the current directory for the file. It will first compare file sizes and then compare sha256 values. If more than one possible file is the original, it will *not* repair it and instead provide a warning.
138
+
139
+ Directory notes work similarly, but not identically. Directory metadata and orphan repair are intentionally shallow:
140
+
141
+ * directory notes do **not** hash file contents
142
+ * directory notes do **not** recurse through the full tree
143
+ * directory notes track only the immediate children returned by `os.listdir()`
144
+ * orphan repair for directories uses the count of immediate subdirectories, the count of immediate non-note files, and a shallow hash of the sorted immediate child names
145
+
146
+ This makes directory notes much cheaper to track, but also means their repair matching is less exact than for files. File notes remain the more robust and more mature case.
147
+
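The shallow directory signals listed above can be sketched roughly as follows. This is an illustration only, not the package's implementation: `directory_signature` is a hypothetical helper, and the exact hash input (here, the newline-joined sorted child names, including any sidecars) is an assumption.

```python
import hashlib
import os

NOTESEXT = ".notes.yaml"  # same suffix the package uses for sidecars


def directory_signature(dirpath):
    """Return (subdir_count, file_count, name_hash) for one directory level.

    Hypothetical helper: the real field names and hash input may differ.
    """
    names = sorted(os.listdir(dirpath))  # immediate children only, no recursion
    full = [os.path.join(dirpath, name) for name in names]
    ndirs = sum(os.path.isdir(p) for p in full)
    # Count regular files but skip notefile sidecars
    nfiles = sum(os.path.isfile(p) and not p.endswith(NOTESEXT) for p in full)
    # Assumption: hash the newline-joined sorted names; file contents are never read
    name_hash = hashlib.sha256("\n".join(names).encode("utf-8")).hexdigest()
    return ndirs, nfiles, name_hash
```

Because only names and counts are used, editing a file inside the directory leaves the signature unchanged, while renaming a child changes the name hash.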
148
+ ## File Hashes
149
+
150
+ By default, the SHA256 hash is computed. It is *highly* suggested that this be allowed since it greatly increases the integrity of the link between the basefile and the notefile sidecar. However, `--no-hash` can be passed to many of the functions and it will disable hashing.
151
+
152
+ Note that when using `--no-hash`, the file may still be rehashed in subsequent runs without `--no-hash`, depending on the operation.
153
+
154
+ When repairing an orphaned notefile, candidate files are first compared by filesize and then by SHA256. While not foolproof, this *greatly* reduces the number of SHA256 computations to be performed; especially on larger files where it becomes increasingly unlikely to be the exact same size.
155
+
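The size-then-hash matching described above might look something like the sketch below. It is not the package's repair code: `sha256_of` and `orphan_candidates` are hypothetical names, and the `filesize` and `sha256` arguments are assumed to come from the orphaned notefile's stored fields.

```python
import hashlib
import os

NOTESEXT = ".notes.yaml"


def sha256_of(path, blocksize=1 << 20):
    """Hash a file in blocks so large files are not read into memory at once."""
    hasher = hashlib.sha256()
    with open(path, "rb") as fobj:
        for block in iter(lambda: fobj.read(blocksize), b""):
            hasher.update(block)
    return hasher.hexdigest()


def orphan_candidates(root, filesize, sha256):
    """Return files in and below `root` matching the recorded size and hash."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(NOTESEXT):
                continue  # never match a sidecar itself
            path = os.path.join(dirpath, name)
            try:
                # Cheap check first: only hash files whose size already matches
                if os.path.getsize(path) != filesize:
                    continue
                if sha256_of(path) == sha256:
                    matches.append(path)
            except OSError:
                continue  # unreadable file or broken symlink
    return matches
```

A caller following the rule in the Repairs section would repair only when exactly one candidate is returned and warn otherwise.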
156
+ Directory notes use a different kind of hash. Instead of hashing file contents, they use a shallow hash of the sorted immediate child names in the directory. This is used only as a lightweight directory identity signal and should not be thought of as equivalent to a file content hash.
157
+
158
+ ## Hidden and Subdir Notefiles
159
+
160
+ Notes can be hidden and/or in a subdirectory. Consider `file.txt`. When a note is *created* with the following flags, the location of the note is as follows:
161
+
162
+ | Flags | Note Destination | comment |
163
+ |-------------------------|----------------------------------|---------|
164
+ | `--visible --no-subdir` | `file.txt.notes.yaml` | default |
165
+ | `--visible --subdir` | `_notefiles/file.txt.notes.yaml` | |
166
+ | `--hidden --no-subdir` | `.file.txt.notes.yaml` | |
167
+ | `--hidden --subdir` | `.notefiles/file.txt.notes.yaml` | |
168
+
169
+ For a directory target, the same rule applies except the note is stored alongside the directory in its parent directory. For example, the note for `somepath/subdir/` would be one of:
170
+
171
+ | Flags | Note Destination |
172
+ |-------------------------|---------------------------------------------|
173
+ | `--visible --no-subdir` | `somepath/subdir.notes.yaml` |
174
+ | `--visible --subdir` | `somepath/_notefiles/subdir.notes.yaml` |
175
+ | `--hidden --no-subdir` | `somepath/.subdir.notes.yaml` |
176
+ | `--hidden --subdir` | `somepath/.notefiles/subdir.notes.yaml` |
177
+
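The two tables above follow one rule, which can be written down as a short sketch. `note_destination` is a hypothetical helper used only for illustration; it covers where a *new* note would go and does not look for existing notes.

```python
from pathlib import Path

NOTESEXT = ".notes.yaml"


def note_destination(target, hidden=False, subdir=False):
    """Return the sidecar path for a new note on a file or directory target."""
    target = Path(target)  # pathlib drops a trailing slash: 'a/b/' -> 'a/b'
    parent = target.parent  # directory notes live next to the directory
    name = target.name + NOTESEXT
    if subdir:
        # Hidden notes go in '.notefiles', visible ones in '_notefiles'
        return parent / (".notefiles" if hidden else "_notefiles") / name
    return parent / ("." + name if hidden else name)
```

For example, `note_destination("somepath/subdir/", hidden=True)` gives `somepath/.subdir.notes.yaml`, matching the third row of the directory table.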
178
+
179
+ The default is `--visible` and `--no-subdir`, but both can be controlled with environment variables:
180
+
181
+ $ export NOTEFILE_HIDDEN=true
182
+ $ export NOTEFILE_SUBDIR=true
183
+
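Judging by the package's `__init__.py` in this same diff, these variables are compared against the literal string `true` after stripping whitespace and lowercasing, so values such as `1` or `yes` would not enable them. A minimal sketch of that check, using a hypothetical `env_flag` helper:

```python
import os


def env_flag(name, default="false", environ=None):
    """Return True only when the variable is 'true' (case-insensitive, stripped)."""
    environ = os.environ if environ is None else environ
    return environ.get(name, default).strip().lower() == "true"
```

The `environ` parameter is an addition for this sketch so the behavior can be checked without touching the real environment.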
184
+
185
+ Note that the flags *only* apply to creating a *new* note. For example, if a visible note already exists, notefile will always use it, even if `-H` is set.
186
+
187
+ To hide or unhide a note, use `notefile vis hide` or `notefile vis show` on either file(s) or dir(s). These will also use the subdir setting.
188
+
189
+ Changing the visibility of a symlinked referent will cause the symlinked note to be broken. However, by design it will still properly read the note and will be fixed when editing or repairing metadata.
190
+
191
+ Hidden notefiles are more easily orphaned, since it is harder to move both files together. However, not having a directory fill up with notefiles can be helpful.
192
+
193
+ ## Tips
194
+
195
+ ### Scripts
196
+
197
+ Included are some [scripts](scripts/) that may prove useful. As noted before, the goal of `notefile` is to be capable, but it doesn't have to do everything!
198
+
199
+ In those scripts (and the tests), actions are often performed by calling `cli()`. While less efficient, `notefile` is *really* designed with the CLI in mind, so some of the other functions are less robust.
200
+
201
+ ### Tracking History
202
+
203
+ `notefile` does not track the history of notes and instead suggests doing so in git. Notes can be tracked either within an existing git repo or in a repo of their own.
204
+
205
+ If using it on its own, you can tell git to only track notes files with the following in your `.gitignore`:
206
+
207
+ ```git
208
+ # Ignore Everything except directories, so we can recurse into them
209
+ *
210
+ !*/
211
+
212
+ # Allow these
213
+ !*.notes.yaml
214
+ !.gitignore
215
+ ```
216
+
217
+ Alternatively, the `export` command can be used.
218
+
219
+ ## Known Issues
220
+
221
+ These will likely be addressed (roughly in order of priority):
222
+
223
+ - Behavior with hidden files themselves is not consistent. A warning will be issued.
224
+ - Directory support is newer and less refined than file support, especially around repair heuristics and edge cases
225
+
226
+ ## Additional Workflows
227
+
228
+ This tool includes a lot of features but does not include everything. More can be done in Python directly.
229
+
230
+ For example, to search for all notes and perform a test, do:
231
+
232
+ ```python
233
+ import notefile
234
+ for note in notefile.find(return_note=True):
235
+     print(note.data)  # replace with any test on note.data (which is read automatically)
236
+ ```
237
+
238
+ Additional fields can be added (or removed) from `data` and will be saved when `write` is called.
239
+
240
+ Note that notefile does support setting alternative note fields (but not tags) so that may be useful from the CLI.
241
+
242
+ ## Changelog
243
+
244
+ See [Changelog](changelog.md)
245
+
246
+ ## AI/LLM/Coding Agent Disclosure
247
+
248
+ Almost all of the original code was developed by hand by the author. Around version 0.9.0 (which is also when the project switched to numeric versioning), OpenAI Codex was used to improve flow, catch bugs, and patch the code.
249
+
250
+ Major features of safe queries (0.9.0) and directory notes (0.10.0) were developed heavily with Codex with human reviews and confirmation of test cases.
@@ -0,0 +1,248 @@
1
+ __version__ = "0.11.3"
2
+ __author__ = "Justin Winokur"
3
+
4
+ import os
5
+ import sys
6
+
7
+ if sys.version_info < (3, 8):
8
+ # Limited by argarse's extend action
9
+ sys.stderr.write("ERROR: Must use Python >= 3.8\n")
10
+ sys.exit(1)
11
+
12
+ # Env Variables
13
+ HIDDEN = os.environ.get("NOTEFILE_HIDDEN", "false").strip().lower() == "true"
14
+ SUBDIR = os.environ.get("NOTEFILE_SUBDIR", "false").strip().lower() == "true"
15
+
16
+ DEBUG = os.environ.get("NOTEFILE_DEBUG", "false").strip().lower() == "true"
17
+ NOTEFIELD = os.environ.get("NOTEFILE_NOTEFIELD", "notes").strip()
18
+ FORMAT = os.environ.get("NOTEFILE_FORMAT", "yaml").strip().lower()
19
+
20
+ DISABLE_QUERY = os.environ.get("NOTEFILE_DISABLE_QUERY", "false").lower() == "true"
21
+ SAFE_QUERY = os.environ.get("NOTEFILE_SAFE_QUERY", "true").strip().lower() == "true"
22
+
23
+ # Constants
24
+ NOTESEXT = ".notes.yaml"
25
+ NOHASH = "** not computed **"
26
+ DT = 1 # mtime change
27
+
28
+
29
+ def debug(*args, **kwargs):
30
+     """Print a debug message when `NOTEFILE_DEBUG` is enabled."""
31
+     if DEBUG:
32
+         kwargs["file"] = sys.stderr
33
+         print("DEBUG:", *args, **kwargs)
34
+
35
+
36
+ def warn(*args, **kwargs):
37
+     """Print a warning message to standard error."""
38
+     kwargs["file"] = sys.stderr
39
+     print("WARNING:", *args, **kwargs)
40
+
41
+
42
+ from .find import find
43
+ from .notefile import Notefile, get_filenames
44
+
45
+
46
+ def query_help(print_help=True, safe=None):
47
+     """Return or print the query-language help text.
48
+
49
+     Parameters
50
+     ----------
51
+     print_help:
52
+         When true, print the generated help text to stdout.
53
+     safe:
54
+         Override whether the safe-query or unsafe-query help variant is used.
55
+
56
+     Returns
57
+     -------
58
+     str
59
+         The rendered help text.
60
+     """
61
+     if safe is None:
62
+         safe = SAFE_QUERY
63
+
64
+     if safe:
65
+         help = """\
66
+ Queries:
67
+ --------
68
+ Queries are single statements where the last line must evaluate to True or False.
69
+ They are evaluated by a restricted parser (no eval/exec).
70
+
71
+ The following variables are defined:
72
+
73
+ data Dictionary of the note itself.
74
+ notes == data['notes'] or data[<note_field>] if set. The note text.
75
+ tags == data['tags']. Set of tags (note, all lower case).
76
+ text Raw contents (YAML/JSON) of the note.
77
+ filename Path to the file being noted.
78
+ notefile_path Path to the notefile sidecar.
79
+ isdir True if the note target is a directory.
80
+ isfile True if the note target is a file.
81
+
82
+ And it includes the following functions:
83
+
84
+ grep performs a match against 'notes'. Respects the flags:
85
+ '--match-expr-case','--fixed-strings','--full-word' automatically but
86
+ can also be overridden with the respective keyword arguments.
87
+
88
+ g Aliased to grep
89
+
90
+ gall Essentially grep with match_any = False
91
+
92
+ gany Essentially grep with match_any = True
93
+
94
+ tany Returns True if any of the args are in tags: e.g.
95
+ tany('tag1','tag2') <==> any(t in tags for t in ['tag1','tag2'])
96
+
97
+ tall Returns true if all args are in tags: e.g.
98
+ tall('tag1','tag2') <==> all(t in tags for t in ['tag1','tag2'])
99
+
100
+ t aliased to tany
101
+ norm_tags Normalize tags (splits commas, lowercases, strips whitespace)
102
+
103
+ It also includes the `re` module and `ss = shlex.split`. Imports are not supported.
104
+ Attribute access is limited to safe string and dict methods.
105
+
106
+ Additional supported features include but are not limited to:
107
+
108
+ - Safe builtins: any, all, len, set, list, tuple, sorted, min, max, sum
109
+ - Container literals beyond lists (tuples, sets, dicts)
110
+ - Comprehensions (list, set, dict, generator)
111
+ - Subscripts and slices (e.g., x[0], x[1:3])
112
+ - If-expressions (e.g., a if cond else b)
113
+ - Dict method access (get, keys, values, items, copy)
114
+ - String method allowlist (e.g., splitlines, removeprefix, removesuffix)
115
+
116
+ Queries can replace --tag and grep but grep is faster if it can be used since it
117
+ is accelerated by not parsing YAML unless needed.
118
+
119
+ For example, the following return the same thing:
120
+
121
+ $ notefile grep word1 word2
122
+ $ notefile query "grep('word1') or grep('word2')"
123
+ $ notefile query "grep('word1','word2')"
124
+ $ notefile query "grep(ss('word1 word2'))" # can use shlex.split (ss)
125
+
126
+ However, queries can be much more complex. For example:
127
+
128
+ $ notefile query "(grep('word1') or grep('word2')) and not grep('word3')"
129
+
130
+ Limited multi-line support exists. Multiple lines can be delineated by ';'.
131
+ However, the last line must evaluate the query. Example:
132
+
133
+ $ notefile query "tt = ['a','b','c']; all(t in tags for t in tt)"
134
+
135
+ Or even using multiple lines in the shell
136
+
137
+ $ notefile query "tt = ['a','b','c']
138
+ > all(t in tags for t in tt)"
139
+
140
+ Can also pass STDIN with the expression `-` to make quoting a bit less onerous
141
+
142
+ $ notefile query - <<EOF
143
+ > a = t('tag1') and not t('tag2')
144
+ > b = g('expr1') or g('expr2') or not g('expr3')
145
+ > a and b
146
+ > EOF
147
+
148
+ tany and/or tall could also be used:
149
+
150
+ $ notefile query "tall('a','b','c')"
151
+
152
+ Reminder: safe queries are restricted but not fully sandboxed; expensive regex or
153
+ large computations can still be costly.
154
+
155
+ Safe queries are now the default. To enable *unsafe* queries on the CLI, set
156
+ NOTEFILE_SAFE_QUERY=false. Or use the Note.unsafe_query(...) APIs.
157
+ """
158
+     else:
159
+         help = """\
160
+ Queries:
161
+ --------
162
+ Queries are single statements where the last line must evaluate to True or False.
163
+ They are evaluated as Python (with no sandboxing or sanitizing so DO NOT EVALUATE
164
+ UNTRUSTED INPUT). The following variables are defined:
165
+
166
+ note Notefile object including attributes such as 'filename',
167
+ 'destnote','hidden', etc. See notefile.Notefile documention.
168
+ data Dictionary of the note itself (optional convenience).
169
+ notes == data['notes'] or data[<note_field>] if set. The note text.
170
+ tags == data['tags']. Set object of tags (note, all lower case).
171
+ text Raw contents (YAML/JSON) of the note.
172
+ filename Path to the file being noted.
173
+ notefile_path Path to the notefile sidecar.
174
+ isdir True if the note target is a directory.
175
+ isfile True if the note target is a file.
176
+
177
+ And it includes the following functions:
178
+
179
+ grep performs a match against 'notes'. Respects the flags:
180
+ '--match-expr-case','--fixed-strings','--full-word' automatically but
181
+ can also be overridden with the respective keyword arguments.
182
+
183
+ g Aliased to grep
184
+
185
+ gall Essentially grep with match_any = False
186
+
187
+ gany Essentially grep with match_any = True
188
+
189
+ tany Returns True if any of the args are in tags: e.g.
190
+ tany('tag1','tag2') <==> any(t in tags for t in ['tag1','tag2'])
191
+
192
+ tall Returns true if all args are in tags: e.g.
193
+ tall('tag1','tag2') <==> all(t in tags for t in ['tag1','tag2'])
194
+
195
+ t aliased to tany
196
+ norm_tags Normalize tags (splits commas, lowercases, strips whitespace)
197
+
198
+ It also includes the `re` module and `ss = shlex.split`. More cn be imported with
199
+ multiple lines.
200
+
201
+ WARNING: Unsafe queries are deprecated and will be removed in a future release.
202
+ Set NOTEFILE_SAFE_QUERY=true to use safe queries.
203
+
204
+ Queries can replace --tag and grep but grep is faster if it can be used since it
205
+ is accelerated by not parsing YAML unless needed.
206
+
207
+ For example, the following return the same thing:
208
+
209
+ $ notefile grep word1 word2
210
+ $ notefile query "grep('word1') or grep('word2')"
211
+ $ notefile query "grep('word1','word2')"
212
+ $ notefile query "grep(ss('word1 word2'))" # can use shlex.split (ss)
213
+
214
+ However, queries can be much more complex. For example:
215
+
216
+ $ notefile query "(grep('word1') or grep('word2')) and not grep('word3')"
217
+
218
+ Limited multi-line support exists. Multiple lines can be delineated by ';'.
219
+ However, the last line must evaluate the query. Example:
220
+
221
+ $ notefile query "tt = ['a','b','c']; all(t in tags for t in tt)"
222
+
223
+ Or even using multiple lines in the shell
224
+
225
+ $ notefile query "tt = ['a','b','c']
226
+ > all(t in tags for t in tt)"
227
+
228
+ Can also pass STDIN with the expression `-` to make quoting a bit less onerous
229
+
230
+ $ notefile query - <<EOF
231
+ > a = t('tag1') and not t('tag2')
232
+ > b = g('expr1') or g('expr2') or not g('expr3')
233
+ > a and b
234
+ > EOF
235
+
236
+ tany and/or tall could also be used:
237
+
238
+ $ notefile query "tall('a','b','c')"
239
+
240
+ Queries are pretty flexible and give a good bit of control but some actions
241
+ and queries are still better handled directly in Python.
242
+
243
+ Reminder: `unsafe_query` is unsafe for untrusted input. `safe_query` is restricted
244
+ but not fully sandboxed; expensive regex or large computations can still be costly.
245
+ """
246
+     if print_help:
247
+         print(help)
248
+     return help
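The safe-query help text says queries are evaluated by a restricted parser with no eval/exec. The sketch below illustrates that general technique with Python's `ast` module on a tiny subset of the query language (tag checks only). It is an assumption-laden illustration, not the package's parser: `_evaluate` and `tiny_tag_query` are hypothetical names, and the real implementation supports far more syntax.

```python
import ast


def _evaluate(node, names):
    """Evaluate a small allowlist of AST node types; reject everything else."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        if node.id not in names:
            raise ValueError(f"unknown name: {node.id}")
        return names[node.id]
    if isinstance(node, ast.BoolOp):  # 'and' / 'or'
        values = (_evaluate(value, names) for value in node.values)
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return not _evaluate(node.operand, names)
    if (
        isinstance(node, ast.Compare)
        and len(node.ops) == 1
        and isinstance(node.ops[0], (ast.In, ast.NotIn))
    ):
        found = _evaluate(node.left, names) in _evaluate(node.comparators[0], names)
        return found if isinstance(node.ops[0], ast.In) else not found
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) and not node.keywords:
        func = _evaluate(node.func, names)
        if not callable(func):
            raise ValueError(f"not callable: {node.func.id}")
        return func(*(_evaluate(arg, names) for arg in node.args))
    # Attribute access, imports, lambdas, etc. all end up here
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


def tiny_tag_query(expression, tags):
    """Evaluate a single-expression tag query without eval() or exec()."""
    tags = {tag.lower() for tag in tags}
    names = {
        "tags": tags,
        "tany": lambda *wanted: any(tag in tags for tag in wanted),
        "tall": lambda *wanted: all(tag in tags for tag in wanted),
    }
    tree = ast.parse(expression, mode="eval")
    return bool(_evaluate(tree.body, names))
```

For instance, `tiny_tag_query("tall('a','b') and not tany('c')", ["A", "b"])` evaluates the expression against the lowercased tags, while an expression such as `__import__('os').system('true')` is rejected with `ValueError` because attribute calls are not on the allowlist.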
@@ -0,0 +1,4 @@
1
+ from .cli import cli
2
+
3
+ if __name__ == "__main__":
4
+     cli()