kiarina-utils-file 1.0.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- kiarina_utils_file-1.0.0/PKG-INFO +488 -0
- kiarina_utils_file-1.0.0/README.md +453 -0
- kiarina_utils_file-1.0.0/pyproject.toml +52 -0
- kiarina_utils_file-1.0.0/setup.cfg +4 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/__init__.py +62 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_helpers/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_helpers/decode_binary_to_text.py +45 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_helpers/detect_encoding.py +63 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_helpers/get_default_encoding.py +11 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_helpers/is_binary.py +30 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_operations/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_operations/detect_with_charset_normalizer.py +49 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_operations/detect_with_fallback.py +44 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_operations/detect_with_nkf.py +78 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_operations/should_use_nkf.py +98 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_utils/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/_utils/normalize_newlines.py +25 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/py.typed +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/encoding/settings.py +41 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/__init__.py +63 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/_helpers/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/_helpers/detect_extension.py +59 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/_helpers/extract_extension.py +100 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/_operations/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/_operations/detect_with_dictionary.py +36 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/_operations/detect_with_mimetypes.py +23 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/_operations/extract_multi_extension.py +101 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/_utils/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/_utils/clean_url_path.py +39 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/_utils/normalize_extension.py +33 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/_utils/normalize_mime_type.py +28 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/py.typed +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/ext/settings.py +78 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/__init__.py +140 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/helpers/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/helpers/read_file.py +44 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/helpers/write_file.py +21 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/read_binary.py +33 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/read_json_dict.py +32 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/read_json_list.py +32 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/read_text.py +28 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/read_yaml_dict.py +32 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/read_yaml_list.py +32 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/remove_file.py +13 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/write_binary.py +14 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/write_json_dict.py +32 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/write_json_list.py +32 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/write_text.py +14 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/write_yaml_dict.py +25 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_async/utils/write_yaml_list.py +25 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/models/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/models/file_blob.py +238 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/operations/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/operations/read_file.py +73 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/operations/write_file.py +53 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/get_lock_file_path.py +303 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/read_binary.py +90 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/read_json_dict.py +66 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/read_json_list.py +66 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/read_text.py +65 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/read_yaml_dict.py +73 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/read_yaml_list.py +67 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/remove_file.py +61 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/write_binary.py +103 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/write_json_dict.py +66 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/write_json_list.py +66 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/write_text.py +45 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/write_yaml_dict.py +61 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_core/utils/write_yaml_list.py +61 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/helpers/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/helpers/read_file.py +44 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/helpers/write_file.py +21 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/read_binary.py +31 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/read_json_dict.py +30 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/read_json_list.py +30 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/read_text.py +28 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/read_yaml_dict.py +30 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/read_yaml_list.py +30 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/remove_file.py +13 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/write_binary.py +14 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/write_json_dict.py +32 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/write_json_list.py +32 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/write_text.py +14 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/write_yaml_dict.py +25 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/_sync/utils/write_yaml_list.py +25 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/asyncio.py +140 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/py.typed +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/file/settings.py +35 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/__init__.py +97 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/_helpers/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/_helpers/apply_mime_alias.py +27 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/_helpers/create_mime_blob.py +22 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/_helpers/detect_mime_type.py +136 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/_models/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/_models/mime_blob.py +291 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/_operations/__init__.py +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/_operations/detect_with_dictionary.py +65 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/_operations/detect_with_mimetypes.py +20 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/_operations/detect_with_puremagic.py +63 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/py.typed +0 -0
- kiarina_utils_file-1.0.0/src/kiarina/utils/mime/settings.py +50 -0
- kiarina_utils_file-1.0.0/src/kiarina_utils_file.egg-info/PKG-INFO +488 -0
- kiarina_utils_file-1.0.0/src/kiarina_utils_file.egg-info/SOURCES.txt +110 -0
- kiarina_utils_file-1.0.0/src/kiarina_utils_file.egg-info/dependency_links.txt +1 -0
- kiarina_utils_file-1.0.0/src/kiarina_utils_file.egg-info/requires.txt +7 -0
- kiarina_utils_file-1.0.0/src/kiarina_utils_file.egg-info/top_level.txt +1 -0
@@ -0,0 +1,488 @@
Metadata-Version: 2.4
Name: kiarina-utils-file
Version: 1.0.0
Summary: Comprehensive Python library for file I/O operations with automatic encoding detection, MIME type detection, and support for various file formats
Author-email: kiarina <kiarinadawa@gmail.com>
Maintainer-email: kiarina <kiarinadawa@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/kiarina/kiarina-python
Project-URL: Repository, https://github.com/kiarina/kiarina-python
Project-URL: Issues, https://github.com/kiarina/kiarina-python/issues
Project-URL: Changelog, https://github.com/kiarina/kiarina-python/blob/main/packages/kiarina-utils-file/CHANGELOG.md
Project-URL: Documentation, https://github.com/kiarina/kiarina-python/tree/main/packages/kiarina-utils-file#readme
Keywords: file,io,encoding,mime,async,sync,json,yaml,binary,text,detection,blob,atomic,thread-safe
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Filesystems
Classifier: Topic :: Text Processing
Classifier: Topic :: Utilities
Classifier: Typing :: Typed
Requires-Python: >=3.12
Description-Content-Type: text/markdown
Requires-Dist: aiofiles>=24.1.0
Requires-Dist: charset-normalizer>=3.4.3
Requires-Dist: filelock>=3.19.1
Requires-Dist: pydantic>=2.11.7
Requires-Dist: pydantic-settings>=2.10.1
Requires-Dist: pydantic-settings-manager>=2.1.0
Requires-Dist: pyyaml>=6.0.2

# kiarina-utils-file

A comprehensive Python library for file I/O operations with automatic encoding detection, MIME type detection, and support for various file formats.

[Python 3.12+](https://www.python.org/downloads/)
[License: MIT](https://opensource.org/licenses/MIT)

## Features

### 🚀 **Comprehensive File I/O**
- **Multiple file formats**: Text, binary, JSON, YAML
- **Sync & Async support**: Full async/await support for high-performance applications
- **Atomic operations**: Safe file writing with temporary files and locking
- **Thread safety**: File locking mechanisms prevent concurrent access issues

### 🔍 **Smart Detection**
- **Automatic encoding detection**: Smart handling of various text encodings with nkf support
- **MIME type detection**: Automatic content type identification using multiple detection methods
- **Extension handling**: Support for complex multi-part extensions (.tar.gz, .tar.gz.gpg)

### 📦 **Data Containers**
- **FileBlob**: Unified file data container with metadata and path information
- **MIMEBlob**: MIME-typed binary data container with format conversion support
- **Hash-based naming**: Content-addressable file naming using cryptographic hashes
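
Content-addressable naming means the file name is derived from the bytes themselves, so identical content always maps to the same name. A minimal sketch, assuming the name is simply the hex digest followed by the extension (`hashed_file_name_sketch` is a hypothetical helper, and the package's exact format may differ):

```python
import hashlib


def hashed_file_name_sketch(raw_data: bytes, ext: str, algorithm: str = "sha256") -> str:
    """Build a file name from the hex digest of the content plus an extension."""
    digest = hashlib.new(algorithm, raw_data).hexdigest()
    return f"{digest}{ext}"


name_a = hashed_file_name_sketch(b"same bytes", ".txt")
name_b = hashed_file_name_sketch(b"same bytes", ".txt")
# Identical content yields an identical name, which makes deduplication trivial.
assert name_a == name_b
```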

### 🛡️ **Production Ready**
- **Error handling**: Graceful handling of missing files with configurable defaults
- **Performance optimized**: Non-blocking I/O operations and efficient caching
- **Type safety**: Full type hints and comprehensive testing

## Installation

```bash
pip install kiarina-utils-file
```

### Optional Dependencies

For enhanced functionality, install optional dependencies:

```bash
# For MIME type detection from file content
pip install "kiarina-utils-file[mime]"

# Or install with all optional dependencies
pip install "kiarina-utils-file[all]"
```

## Quick Start

### Basic File Operations

```python
import kiarina.utils.file as kf

# Read and write text files with automatic encoding detection
text = kf.read_text("document.txt", default="")
kf.write_text("output.txt", "Hello, World! 🌍")

# Binary file operations
data = kf.read_binary("image.jpg")
if data:
    kf.write_binary("copy.jpg", data)

# JSON operations with type safety
config = kf.read_json_dict("config.json", default={})
kf.write_json_dict("output.json", {"key": "value"})

# YAML operations
settings = kf.read_yaml_dict("settings.yaml", default={})
kf.write_yaml_list("list.yaml", [1, 2, 3])
```

### High-Level FileBlob Operations

```python
import kiarina.utils.file as kf

# Read file with automatic MIME type detection
blob = kf.read_file("document.pdf")
if blob:
    print(f"File: {blob.file_path}")
    print(f"MIME type: {blob.mime_type}")
    print(f"Size: {len(blob.raw_data)} bytes")
    print(f"Extension: {blob.ext}")

# Create and write FileBlob
blob = kf.FileBlob(
    "output.txt",
    mime_type="text/plain",
    raw_text="Hello, World!"
)
kf.write_file(blob)

# Data URL generation for web use
print(blob.raw_base64_url)  # data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==
```
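
For reference, a data URL of the shape printed above can be reproduced with the standard library. This is a sketch of the format only, with `to_data_url` as a hypothetical helper rather than part of the package:

```python
import base64


def to_data_url(mime_type: str, raw_data: bytes) -> str:
    """Encode bytes as a base64 data URL: data:<mime type>;base64,<payload>."""
    encoded = base64.b64encode(raw_data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


print(to_data_url("text/plain", b"Hello, World!"))
# data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==
```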

### Async Operations

```python
import kiarina.utils.file.asyncio as kfa

async def process_files():
    # All operations have async equivalents
    text = await kfa.read_text("large_file.txt")
    await kfa.write_json_dict("result.json", {"processed": True})

    # FileBlob operations
    blob = await kfa.read_file("document.pdf")
    if blob:
        await kfa.write_file(blob, "backup.pdf")
```

### MIME Type and Extension Detection

```python
import kiarina.utils.mime as km
import kiarina.utils.ext as ke

# MIME type detection from content and filename
mime_type = km.detect_mime_type(
    raw_data=file_data,
    file_name_hint="document.pdf"
)

# Extension detection from MIME type
extension = ke.detect_extension("application/json")  # ".json"

# Multi-part extension extraction
extension = ke.extract_extension("archive.tar.gz")  # ".tar.gz"

# Create MIME blob from data
blob = km.create_mime_blob(jpeg_data)
print(f"Detected: {blob.mime_type}")  # "image/jpeg"
```

### Encoding Detection

```python
import kiarina.utils.encoding as kenc

# Automatic encoding detection
with open("mystery_file.txt", "rb") as f:
    raw_data = f.read()

encoding = kenc.detect_encoding(raw_data)
text = kenc.decode_binary_to_text(raw_data)

# Check if data is binary or text
is_binary = kenc.is_binary(raw_data)
```

## Advanced Usage

### Custom Configuration

Configure behavior through environment variables:

```bash
# Encoding detection
export KIARINA_UTILS_ENCODING_USE_NKF=true
export KIARINA_UTILS_ENCODING_DEFAULT_ENCODING=utf-8

# File operations
export KIARINA_UTILS_FILE_LOCK_DIR=/custom/lock/dir
export KIARINA_UTILS_FILE_LOCK_CLEANUP_ENABLED=true

# MIME type detection
export KIARINA_UTILS_MIME_HASH_ALGORITHM=sha256
```

### Error Handling

```python
import json

import kiarina.utils.file as kf

try:
    data = kf.read_json_dict("config.json")
    if data is None:
        print("File not found, using defaults")
        data = {"default": True}
except json.JSONDecodeError as e:
    print(f"Invalid JSON: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```

### Performance Considerations

```python
import asyncio

import kiarina.utils.file as kf
import kiarina.utils.file.asyncio as kfa

# For I/O intensive operations, use async versions
async def process_many_files(file_paths):
    tasks = [kfa.read_file(path) for path in file_paths]
    results = await asyncio.gather(*tasks)
    return [r for r in results if r is not None]

# Use appropriate defaults to avoid None checks
config = kf.read_json_dict("config.json", default={})
# Instead of:
# config = kf.read_json_dict("config.json")
# if config is None:
#     config = {}
```

## API Reference

### File Operations

#### Synchronous API (`kiarina.utils.file`)

**High-level operations:**
- `read_file(path, *, fallback_mime_type="application/octet-stream", default=None) -> FileBlob | None`
- `write_file(file_blob, file_path=None) -> None`

**Text operations:**
- `read_text(path, *, default=None) -> str | None`
- `write_text(path, text) -> None`

**Binary operations:**
- `read_binary(path, *, default=None) -> bytes | None`
- `write_binary(path, data) -> None`

**JSON operations:**
- `read_json_dict(path, *, default=None) -> dict[str, Any] | None`
- `write_json_dict(path, data, *, indent=2, ensure_ascii=False, sort_keys=False) -> None`
- `read_json_list(path, *, default=None) -> list[Any] | None`
- `write_json_list(path, data, *, indent=2, ensure_ascii=False, sort_keys=False) -> None`

**YAML operations:**
- `read_yaml_dict(path, *, default=None) -> dict[str, Any] | None`
- `write_yaml_dict(path, data, *, allow_unicode=True, sort_keys=False) -> None`
- `read_yaml_list(path, *, default=None) -> list[Any] | None`
- `write_yaml_list(path, data, *, allow_unicode=True, sort_keys=False) -> None`

**File management:**
- `remove_file(path) -> None`

#### Asynchronous API (`kiarina.utils.file.asyncio`)

All synchronous functions have async equivalents with the same signatures, but they return `Awaitable` objects and must be called with `await`.

### Data Containers

#### FileBlob

```python
class FileBlob:
    def __init__(self, file_path, mime_blob=None, *, mime_type=None, raw_data=None, raw_text=None): ...

    # Properties
    file_path: str
    mime_blob: MIMEBlob
    mime_type: str
    raw_data: bytes
    raw_text: str
    raw_base64_str: str
    raw_base64_url: str
    hash_string: str
    ext: str
    hashed_file_name: str

    # Methods
    def is_binary(self) -> bool: ...
    def is_text(self) -> bool: ...
    def replace(self, *, file_path=None, mime_blob=None, mime_type=None, raw_data=None, raw_text=None) -> FileBlob: ...
```

#### MIMEBlob

```python
class MIMEBlob:
    def __init__(self, mime_type, raw_data=None, *, raw_text=None): ...

    # Properties
    mime_type: str
    raw_data: bytes
    raw_text: str
    raw_base64_str: str
    raw_base64_url: str
    hash_string: str
    ext: str
    hashed_file_name: str

    # Methods
    def is_binary(self) -> bool: ...
    def is_text(self) -> bool: ...
    def replace(self, *, mime_type=None, raw_data=None, raw_text=None) -> MIMEBlob: ...
```

### Utility Functions

#### MIME Type Detection (`kiarina.utils.mime`)

- `detect_mime_type(*, raw_data=None, stream=None, file_name_hint=None, **kwargs) -> str | None`
- `create_mime_blob(raw_data, *, fallback_mime_type="application/octet-stream") -> MIMEBlob`
- `apply_mime_alias(mime_type, *, mime_aliases=None) -> str`

#### Extension Detection (`kiarina.utils.ext`)

- `detect_extension(mime_type, *, custom_extensions=None, default=None) -> str | None`
- `extract_extension(file_name_hint, *, multi_extensions=None, default=None, **kwargs) -> str | None`

#### Encoding Detection (`kiarina.utils.encoding`)

- `detect_encoding(raw_data, *, use_nkf=None, **kwargs) -> str | None`
- `decode_binary_to_text(raw_data, *, use_nkf=None, **kwargs) -> str`
- `is_binary(raw_data, *, use_nkf=None, **kwargs) -> bool`
- `get_default_encoding() -> str`
- `normalize_newlines(text) -> str`

## Configuration

### Environment Variables

#### Encoding Detection
- `KIARINA_UTILS_ENCODING_USE_NKF`: Enable/disable nkf usage (bool)
- `KIARINA_UTILS_ENCODING_DEFAULT_ENCODING`: Default encoding (default: "utf-8")
- `KIARINA_UTILS_ENCODING_FALLBACK_ENCODINGS`: Comma-separated list of fallback encodings
- `KIARINA_UTILS_ENCODING_MAX_SAMPLE_SIZE`: Maximum bytes to sample for detection (default: 8192)
- `KIARINA_UTILS_ENCODING_CHARSET_NORMALIZER_CONFIDENCE_THRESHOLD`: Confidence threshold (default: 0.6)

#### File Operations
- `KIARINA_UTILS_FILE_LOCK_DIR`: Custom lock directory path
- `KIARINA_UTILS_FILE_LOCK_CLEANUP_ENABLED`: Enable automatic cleanup (default: true)
- `KIARINA_UTILS_FILE_LOCK_MAX_AGE_HOURS`: Maximum age for lock files in hours (default: 24)

#### MIME Type Detection
- `KIARINA_UTILS_MIME_HASH_ALGORITHM`: Hash algorithm for content addressing (default: "sha256")

#### Extension Detection
- `KIARINA_UTILS_EXT_MAX_MULTI_EXTENSION_PARTS`: Maximum parts for multi-extension detection (default: 4)

## Requirements

- **Python**: 3.12 or higher
- **Core dependencies**:
  - `aiofiles>=24.1.0` - Async file operations
  - `charset-normalizer>=3.4.3` - Encoding detection
  - `filelock>=3.19.1` - File locking
  - `pydantic>=2.11.7` - Data validation
  - `pydantic-settings>=2.10.1` - Settings management
  - `pydantic-settings-manager>=2.1.0` - Advanced settings management
  - `pyyaml>=6.0.2` - YAML support

- **Optional dependencies**:
  - `puremagic>=1.30` - Enhanced MIME type detection from file content

## Development

### Prerequisites

- Python 3.12+
- [uv](https://github.com/astral-sh/uv) for dependency management
- [mise](https://mise.jdx.dev/) for task running

### Setup

```bash
# Clone the repository
git clone https://github.com/kiarina/kiarina-python.git
cd kiarina-python

# Setup development environment
mise run setup

# Install dependencies for this package
cd packages/kiarina-utils-file
uv sync --group dev
```

### Running Tests

```bash
# Run all tests
mise run package:test kiarina-utils-file

# Run with coverage
mise run package:test kiarina-utils-file --coverage

# Run specific test files
uv run --group test pytest tests/file/test_kiarina_utils_file_sync.py
uv run --group test pytest tests/file/test_kiarina_utils_file_async.py
```

### Code Quality

```bash
# Format code
mise run package:format kiarina-utils-file

# Run linting
mise run package:lint kiarina-utils-file

# Type checking
mise run package:typecheck kiarina-utils-file

# Run all checks
mise run package kiarina-utils-file
```

## Performance

### Benchmarks

The library is optimized for performance with several key features:

- **Lazy loading**: Properties are computed only when accessed
- **Caching**: Expensive operations like encoding detection are cached
- **Async support**: Non-blocking I/O for high-throughput applications
- **Efficient sampling**: Large files are sampled for encoding/MIME detection
- **Atomic operations**: Safe concurrent file access with minimal overhead
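
Lazy loading combined with caching can be expressed compactly with `functools.cached_property`: the value is computed on first access and stored on the instance afterwards. The class below is a hypothetical illustration of that pattern, not the package's `FileBlob` or `MIMEBlob`.

```python
import hashlib
from functools import cached_property


class LazyBlobSketch:
    def __init__(self, raw_data: bytes) -> None:
        self.raw_data = raw_data
        self.hash_calls = 0  # counts how often the digest is actually computed

    @cached_property
    def hash_string(self) -> str:
        """Computed on first access only; later reads reuse the stored value."""
        self.hash_calls += 1
        return hashlib.sha256(self.raw_data).hexdigest()


blob = LazyBlobSketch(b"payload")
first = blob.hash_string
second = blob.hash_string
# The digest was computed once even though the property was read twice.
assert first == second and blob.hash_calls == 1
```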

### Memory Usage

- **Streaming support**: Large files can be processed without loading entirely into memory
- **Configurable sampling**: Detection algorithms use configurable sample sizes
- **Efficient caching**: Only frequently accessed properties are cached
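
Both ideas, sampling a bounded prefix for detection and streaming a large file in chunks, take only a few lines of standard-library code. The helpers below are hypothetical sketches; the 8192-byte default mirrors the documented `KIARINA_UTILS_ENCODING_MAX_SAMPLE_SIZE` default, while the chunk size is an arbitrary choice for this example.

```python
import hashlib


def read_sample(path: str, max_sample_size: int = 8192) -> bytes:
    """Read at most `max_sample_size` bytes from the start of the file."""
    with open(path, "rb") as f:
        return f.read(max_sample_size)


def hash_file_streaming(path: str, chunk_size: int = 65536) -> str:
    """Hash a file chunk by chunk so memory use stays bounded by `chunk_size`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The sample is enough for encoding or MIME detection, while the streaming digest gives the same result as hashing the whole file at once.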

## License

This project is licensed under the MIT License - see the [LICENSE](../../LICENSE) file for details.

## Contributing

This is a personal project, but contributions are welcome! Please feel free to submit issues or pull requests.

### Guidelines

1. **Code Style**: Follow the existing code style (enforced by ruff)
2. **Testing**: Add tests for new functionality
3. **Documentation**: Update documentation for API changes
4. **Type Hints**: Maintain full type hint coverage

## Related Projects

- [kiarina-python](https://github.com/kiarina/kiarina-python) - The main monorepo containing this package
- [pydantic-settings-manager](https://github.com/kiarina/pydantic-settings-manager) - Configuration management library used by this package

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for a detailed history of changes.

## Support

- **Issues**: [GitHub Issues](https://github.com/kiarina/kiarina-python/issues)
- **Discussions**: [GitHub Discussions](https://github.com/kiarina/kiarina-python/discussions)

---

Made with ❤️ by [kiarina](https://github.com/kiarina)